[il-antlr-interest: 34748] Re: [antlr-interest] Empty ifs in Java

Bart Kiers Sat, 05 Nov 2011 08:57:19 -0700

Hi,

On Sat, Nov 5, 2011 at 4:16 PM, Patrick Zimmermann <patr...@zakweb.de> wrote:


> Hi,
>
> thank you a lot.
> Using a lexer rule does in fact solve this problem.
>
> And now I am already on the next:
> stripped down to:
>
> start   :       ('{' 'ab' '}')* '{a}';
>
> using input:
> {ab}{a}
>
> Will not list '{ab' on the input stream in AntlrWorks and thus fails to parse
> the input. I suspect this is another "should be done with the lexer" thing.
>

No, the lexer is already involved: the literals in your parser rule become
implicit lexer rules. It's better to write explicit lexer rules instead of
mixing literals into your parser rules:

ABraced : '{a}';
OBrace  : '{';
CBrace  : '}';
AB      : 'ab';
A       : 'a';

With these rules, when the lexer tokenizes the input "{ab", it sees "{a",
commits to the ABraced rule and expects a "}". It finds a "b" instead, so an
error is emitted.

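To make that failure concrete, here is a small Python toy model. It is not
ANTLR's generated code, only a sketch under the assumption that the lexer
picks a rule from two characters of lookahead and never reconsiders. The
names RULES, LOOKAHEAD, predict, tokenize and LexError are made up for this
illustration.

```python
# Toy model only, NOT ANTLR's generated lexer. Assumption: a rule is chosen
# from a short lookahead window and the lexer never falls back to another rule.
RULES = {
    "ABraced": "{a}",
    "OBrace": "{",
    "CBrace": "}",
    "AB": "ab",
    "A": "a",
}
LOOKAHEAD = 2  # hypothetical window size, chosen for this illustration


class LexError(Exception):
    pass


def predict(text, pos):
    """Return the rule whose first LOOKAHEAD characters best fit the input."""
    window = text[pos:pos + LOOKAHEAD]
    best = None
    for name, literal in RULES.items():
        prefix = literal[:LOOKAHEAD]
        if window.startswith(prefix):
            if best is None or len(prefix) > len(RULES[best][:LOOKAHEAD]):
                best = name
    return best


def tokenize(text):
    tokens = []
    pos = 0
    while pos < len(text):
        name = predict(text, pos)
        if name is None:
            raise LexError(f"no rule for {text[pos]!r} at {pos}")
        literal = RULES[name]
        # Committed to this rule: compare character by character, no fallback.
        for offset, expected in enumerate(literal):
            index = pos + offset
            found = text[index] if index < len(text) else "<EOF>"
            if found != expected:
                raise LexError(
                    f"{name}: expected {expected!r} at {index}, found {found!r}"
                )
        tokens.append((name, literal))
        pos += len(literal)
    return tokens
```

In this model tokenize("{a}") yields a single ABraced token, while
tokenize("{ab}{a}") raises a LexError at the "b", because "{a" already
selected ABraced.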


> I'm currently thinking about whether ANTLR is the right tool for my job:
>
> In many cases the input I have is character-wise context sensitive. I have
> some areas (the free text area) where '(' and ')' have a specific meaning and
> others (the note area) where '(' and ')' are simply normal text. Or whitespace,
> which is important in the text and to be ignored in tags and similar
> constructs.
>
> If I'm not mistaken the lexer runs completely before the parser and constructs
> tokens. Those tokens are then matched by the parser. So if an input would
> match several tokens (e.g. text not containing parentheses) and the "wrong"
> one is chosen by the lexer, the parser is screwed, right?
>

Yes, the parser has no control over what tokens the lexer produces.



> I currently realize that I am forced to use lexer rules for certain constructs
> (like ..) because I need character ranges to define the chars that are allowed
> (Unicode, only certain languages).
>
>
> Do you think ANTLR is the right tool for this job and I'm just not seeing
> the point in how to do it, or should I better use something else? What?
>

You could let the lexer create only single-character tokens and write parser
rules that match a sequence of those tokens (like the `ab` rule below):

start
  :  OBrace ab CBrace OBrace A CBrace EOF
  ;

ab
  :  A B
  ;

OBrace  : '{';
CBrace  : '}';
A       : 'a';
B       : 'b';
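
The same idea as a small Python sketch, in case it helps to see it run. This
is a hand-written toy, not ANTLR output, and it assumes the repeated form from
your original rule, i.e. (OBrace A B CBrace)* OBrace A CBrace EOF. The names
TOKEN_NAMES, lex, parse_start and ParseError are made up for this
illustration.

```python
# Hand-written toy, not ANTLR output. Every character becomes its own token,
# so the lexer never has to choose between overlapping rules.
TOKEN_NAMES = {"{": "OBrace", "}": "CBrace", "a": "A", "b": "B"}


class ParseError(Exception):
    pass


def lex(text):
    """Return one token name per character, followed by EOF."""
    tokens = []
    for index, ch in enumerate(text):
        if ch not in TOKEN_NAMES:
            raise ParseError(f"no token for {ch!r} at {index}")
        tokens.append(TOKEN_NAMES[ch])
    tokens.append("EOF")
    return tokens


def parse_start(tokens):
    """Match (OBrace A B CBrace)* OBrace A CBrace EOF.

    Returns the number of braced "ab" groups that were matched.
    """
    pos = 0
    groups = 0
    # The parser, not the lexer, decides: keep looping while the next four
    # tokens spell a braced "ab" group.
    while tokens[pos:pos + 4] == ["OBrace", "A", "B", "CBrace"]:
        pos += 4
        groups += 1
    if tokens[pos:] != ["OBrace", "A", "CBrace", "EOF"]:
        raise ParseError(f"input does not match at or after token index {pos}")
    return groups
```

Here parse_start(lex("{ab}{a}")) succeeds with one "ab" group, because the
choice between "{ab}" and "{a}" is made from the token sequence rather than
inside the lexer.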



> Thanks so far,
> Patrick


Regards,

Bart.

PS. could you use the list for communication please?
