JFlex Home

  Home  

  Features  

  Download  

  Documentation  

  Mailing List  

  Bugs  

  Contact  

 


SourceForge Logo
project summary
 
JFlex - Frequently Asked Questions

 

  • Can I use my old JLex specifications with JFlex?
    • Yes. You usually can use them unchanged. See section [porting from JLex] of the manual for more information on that topic.
       
  • Where can I get the latest version of JFlex?
  • What platforms does JFlex support?
    • JFlex is written with Sun's JDK 1.1 and produces JDK 1.1 compatible code. It should run on any platform that supports JRE/JDK 1.1 or above.
       
  • Can I use CUP and JFlex together?
  • Can I use the generated code of my JFlex specification commercially?
    • You can use your generated code without restriction. See the file copyright for more information.
       
  • I want my scanner to read from a network byte stream or from interactive stdin. Can I do this with JFlex?
    • This actually depends on the syntax of the input you want to read. The problem is, that for some expressions the scanner needs one character of lookahead to decide which action to execute. For interactive use and network streams this is very inconvenient, because the stream doesn't send an EOF (or any other data) when the user stops typing while the scanner just waits for the next character and doesn't return a symbol. Since version 1.1.1 of JFlex this problem can be avoided because of a little more analysis at generation time. Take a look at these three rules:
      1. a
      2. a*
      3. ";"
      When the scanner has read one a, an additional input character is needed to decide, if this matches rule 1 (just one a) or rule two (when the next character is another a). With input aaa the scanner also has to read one additional character, because it is supposed to return the longest match (so if there comes another a, the match is aaaa and not only aaa). But: When the scanner reads a ";", it does not need an additional character and can immediatly execute the action for rule number 3. This is the case for all rules that are not prefixes of any other rules in the specification and that have a fixed length end (so (a* b) is ok but (a b*) is not).

      For your application this means: if all commands (or whatever units of input you have) are terminated by some delimiter (for instance ";" or LF or "end") reading from a network bytestream or an interactive stream works fine with JFlex.
       

  • How can I let my scanner read its input from a string?
      String myString = "some input";
      Scanner myScanner = new Scanner( new java.io.StringReader(myString) );
      

       
  • Why do standalone scanners have a different standard return type (int instead of Yytoken)?
      That's because int is a predefined type in Java and Yytoken is not. If a scanner generated with %standalone option would have return type Yytoken, you would have to provide this class for every standalone scanner you write. In most cases you don't want to do that, because the scanner wouldn't be really standalone then. The standard Yytoken for non standalone cases stems from JLex and is only kept for compatibility (it's rarely used anyway). If you still really want Yytoken as return type in a standalone scanner, you can always explicitly specify it with %type Yytoken. If you just want to test your scanner scanner and see what it does without a parser attached, use %debug instead of %standalone.
       
  • The expression ![a] seems to match "aa". Is negation broken?
      The semantics of the negation of an expression r is a literal everything not matched by r. The expression [a] matches strings that contain exactly one character (namely "a"). The string "aa" is not matched by this expression, hence it should be matched by ![a].
       
  • Negation doesn't seem to be awfully useful then, does it?
      That depends on what you want to specify, of course. Pure negation is not used very often, but the derived concept "set difference" can be expressed by negation (in combination with "or"), and is occasionally useful. Example: you want to match every one character sequence that is not a letter or digit. The character class syntax doesn't at the moment allow you to write something like [^[:letter:][:digit:]], but you can help yourself with negation:
      LD = [:letter:] | [:digit:]
      NLD = !(![^] | {LD})

      Algebraically this is equivalent to [^] & !{LD} and [^] - {LD}, i.e., everything that is a one-character match ([^]) and not letter nor digit (there doesn't exist an "and" or "minus" operator in JFlex, therefore the double negation form).
       
  • I use %8bit and get an Exception, but I know my platform only uses 8 bit. Is %8bit broken?
      Short answer: not broken, use %unicode. Long answer: Most probably this is an encoding problem. Java uses Unicode internally and converts the bytes it reads from files (or somewhere else) to Unicode first. The 8 bit value of your platform may not be 8 bit anymore when converted to Unicode. On many Windozes for instance Cp1252 (Windows-Latin-1) is used as standard encoding, and there the character "single right quotation mark" has code \x92 but after conversion to Unicode it's \u2019 which is not 8 bit any more. See also the section on Encodings, platforms, and Unicode of the JFlex manual for more information.
       
  • My scanner needs to read a file that is not in my platforms standard encoding, but in encoding XYZ. How?
      Since the scanner reads Java Unicode characters, it is independent of the actual character encoding a file or a string uses. The transformation byte-stream to Java characters for files usually happens in the java.io.InputStreamReader object connected with the input stream. Class java.io.FileReader uses the platforms default encoding automatically. If you would like to explicitly specify another encoding, for instance UTF-8, you could do something like
      Reader r = new InputStreamReader(new FileInputStream(file), "UTF8");
      
      Now you have a Reader r that can be passed to the scanner's constructor in the usual way. For more information on encodings see also Sun's JDK documentation, especially in Guide to Features - Java Platform item Internationalization and there the FAQ and Supported Encodings.
       
most recently modified at 2008-05-27 11:30 UTC by Gerwin Klein