Sure: with a gated semantic predicate you can check that there is no "PR"
ahead before the lexer starts matching a VALUE-token.

Something like this:

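grammar T;

@lexer::members {
  // Returns true if the upcoming characters in the input match `text`.
  private boolean ahead(String text) {
    for(int i = 0; i < text.length(); i++) {
      if(text.charAt(i) != input.LA(i + 1)) {
        return false;
      }
    }
    return true;
  }
}

message
  :  productionReceipt EOF
  ;

productionReceipt
  :  PR VALUE
  ;

PR : 'PR';

// Only start a VALUE when the input does not begin with "PR" at this point.
VALUE
  :  {!ahead("PR")}?=> ('a'..'z'|'A'..'Z')+
  ;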
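To see the idea without generating a lexer, here is a minimal plain-Java sketch of the same lookahead check. It is not ANTLR code: the class LookaheadDemo and its tokenize method are hypothetical names made up for this illustration, and the sketch assumes the check is made only once, before a VALUE starts, after which letters are consumed greedily.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone demo; not part of the ANTLR runtime.
class LookaheadDemo {

    // Mirrors the grammar's ahead(): true if `text` occurs at index `pos` of `input`.
    static boolean ahead(String input, int pos, String text) {
        for (int i = 0; i < text.length(); i++) {
            int p = pos + i;
            if (p >= input.length() || input.charAt(p) != text.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Emits "PR" when "PR" is ahead, otherwise a VALUE made of a greedy run of letters.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (ahead(input, pos, "PR")) {
                tokens.add("PR");
                pos += 2;
            } else if (isLetter(input.charAt(pos))) {
                int start = pos;
                while (pos < input.length() && isLetter(input.charAt(pos))) {
                    pos++;
                }
                tokens.add("VALUE(" + input.substring(start, pos) + ")");
            } else {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("PRCLINTON")); // prints [PR, VALUE(CLINTON)]
    }
}
```

Note that in this sketch the check only guards the start of a value, so a "PR" that appears in the middle of a run of letters is swallowed into that value.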

On Mon, Oct 31, 2011 at 10:01 PM,
Weiler-Thiessen,David,SASKATOON,Engineering wrote:

> Hi
>
> Yes, I can see how that is happening.
>
> So, in my case, because I have token value pairs, and the values are not
> terminated by something deterministic, I can’t use ANTLR to lex the input
> stream.  Is that correct?
>
> Turns out that the input stream is fix length format, so it can be parsed
> in other ways.  I was just thinking that this might be a problem space that
> ANTLR could address also.
>
> David Weiler-Thiessen
> Nestlé Purina PetCare
> phone: 306-933-0232
> cell: 306-291-9770
>
> This e-mail, its electronic document attachments, and the contents of
> its website linkages may contain confidential information. This information
> is intended solely for use by the individual or entity to whom it is
> addressed. If you have received this information in error, please notify
> the sender immediately and promptly destroy the material and any
> accompanying attachments from your system.
>
> From: Bart Kiers
> Sent: Monday, October 31, 2011 12:09 PM
> To: Weiler-Thiessen,David,SASKATOON,Engineering
> Subject: Re: [antlr-interest] How to Parse a datastream of tokens and
> values
>
> Hi David,
>
> ANTLR's lexer greedily matches characters: the input "PRCLINTON" is being
> tokenized as a single VALUE-token, not as a PR- and VALUE-token.
>
> Regards,
>
> Bart.
>
> On Mon, Oct 31, 2011 at 6:24 PM, Weiler-Thiessen, David, SASKATOON,
> Engineering wrote:
> Hi
> I am trying to parse a string that is a collection of tokens and values.
> For example:
> Where PR is my token, and CLINTON is the value for the token.
> I have started a simple grammar, see below, but it won't parse the sample
> above.
> message              :               productionReceipt
>                ;
> productionReceipt
>                :               PR VALUE
>                ;
> PR           :               'PR'
>                ;
> VALUE  :               ('a'..'z'|'A'..'Z')+
>                ;
> What am I doing wrong?  I get a MisMatchedTokenException in ANTLRWorks.
> David Weiler-Thiessen
> Nestlé Purina PetCare
> phone: 306-933-0232
> cell: 306-291-9770

