Hi Jim, others,

Sorry, but I'd appreciate it if you (or someone else) could answer my question with a bit more detail because I really don't understand you (Jim).
You say `.+` matches forever, but in my example there is a predicate in front of the `.` that causes it _not_ to match forever, as you can see for yourself. The input "aaaBaa" is tokenized into 3 different tokens: "aaa", "B" and "aa", and _not_ into one single token by the rule that has the `.+` and the predicate in it. Your last comment suggests to me that you think "aaaBaa" will be tokenized as a single token (which, again, is not the case).

My question therefore remains the same: why are "aaa" and "aa" from the input "aaaBaa" being tokenized as ANY_EXEPT_B instead of MANY_A, where MANY_A is defined before ANY_EXEPT_B and MANY_A matches exactly the same number of characters as ANY_EXEPT_B does?

To me, it's as if the input "while" would be matched by the ID rule instead of the WHILE rule in:

  WHILE : 'while';
  ID    : 'a'..'z'+;

(which is not the case, of course!)

Regards,

Bart.

On Thu, Oct 27, 2011 at 10:34 PM, Jim Idle <j...@temporal-wave.com> wrote:

> .+ matches forever
>
> Jim
>
> *From:* Bart Kiers [mailto:bki...@gmail.com]
> *Sent:* Thursday, October 27, 2011 12:22 PM
> *To:* Jim Idle
> *Subject:* Re: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
>
> On Thu, Oct 27, 2011 at 8:54 PM, Jim Idle <j...@temporal-wave.com> wrote:
>
> As I said earlier you need more predicates:
>
> Sorry Jim, I did not know you replied to my message below before.
>
> But you also need to not use .+, which essentially match anything anyway once it is triggered.
>
> Err, no, not with a predicate, AFAIK (see the rule ANY_EXEPT_B in my example below which does not match anything).
>
> Try something like this.
>
> fragment KEY : ;
>
> ANY
>   : {!test()}?=> 'KEY')
>   | ({test()}?=> . )
>   ;
>
> But once you take out .+ , then it might just work as it was anyway.
>
> Jim
>
> Thanks for your suggestion, but I know how to make it work.
> My question was more about why, when two rules match the same amount of characters, the rule later defined in the grammar is used to create a token.
>
> Let me give another example grammar:
>
> grammar T;
>
> @parser::members {
>   public static void main(String[] args) throws Exception {
>     TLexer lexer = new TLexer(new ANTLRStringStream("aaaBaa"));
>     TParser parser = new TParser(new CommonTokenStream(lexer));
>     parser.parse();
>   }
> }
>
> @lexer::members {
>   private boolean noBAhead() {
>     return input.LA(1) != 'B';
>   }
> }
>
> parse
>   : (t=. {System.out.printf("\%-15s \%s\n", tokenNames[$t.type], $t.text);})+ EOF
>   ;
>
> MANY_A
>   : 'a'+
>   ;
>
> B
>   : 'B'
>   ;
>
> ANY_EXEPT_B
>   : ({noBAhead()}?=> . )+
>   ;
>
> If you run the TParser class, you will see the following output when parsing "aaaBaa":
>
> ANY_EXEPT_B     aaa
> B               B
> ANY_EXEPT_B     aa
>
> I.e., although the rule MANY_A also matches both "aaa" and "aa", ANY_EXEPT_B matches them where I thought the rule defined first (MANY_A) would match them.
>
> Regards,
>
> Bart.
>
> > -----Original Message-----
> > From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Bart Kiers
> > Sent: Thursday, October 27, 2011 10:56 AM
> > To: antlr-interest@antlr.org interest
> > Subject: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
> >
> > Just a little bump, in case it got buried under some of the newer posts.
> > And in case my previous grammar wasn't entirely clear, the following grammar:
> >
> > grammar T;
> >
> > @lexer::members {
> >   private boolean test() {
> >     return true;
> >   }
> > }
> >
> > parse
> >   : KEY EOF
> >   ;
> >
> > KEY
> >   : 'key'
> >   ;
> >
> > ANY
> >   : ({test()}?=> . )+
> >   ;
> >
> > with the test class:
> >
> > import org.antlr.runtime.*;
> >
> > public class Main {
> >   public static void main(String[] args) throws Exception {
> >     TLexer lexer = new TLexer(new ANTLRStringStream("key"));
> >     TParser parser = new TParser(new CommonTokenStream(lexer));
> >     parser.parse();
> >   }
> > }
> >
> > Produces the following error:
> >
> > line 1:0 mismatched input 'key' expecting KEY
> >
> > In other words, 'key' is being tokenized as ANY instead of KEY.
> > Is this expected behavior or a bug? And if it's expected behavior, could someone point me to the documentation (book) or wiki-link that explains this?
> >
> > Cheers & regards,
> >
> > Bart.
> >
> > -------------------
> >
> > From: Bart Kiers <bki...@gmail.com>
> > Date: Mon, Oct 24, 2011 at 11:46 AM
> > Subject: Rule precedence works differently when using a predicate?
> > To: "antlr-interest@antlr.org interest" <antlr-interest@antlr.org>
> >
> > Hi all,
> >
> > As I understand it, ANTLR's lexer matches rules from top to bottom in the .g grammar file and when two rules match the same number of characters, the one that is defined first has precedence over the later one(s).
> >
> > However, take the following grammar:
> >
> > grammar T;
> >
> > @lexer::members {
> >   private boolean test() {
> >     return true;
> >   }
> > }
> >
> > parse
> >   : (t=. {System.out.println(tokenNames[$t.type] + " :: " + $t.text);})* EOF
> >   ;
> >
> > KEY
> >   : 'key'
> >   ;
> >
> > ANY
> >   : ({test()}?=> . )+
> >   ;
> >
> > And the test class:
> >
> > import org.antlr.runtime.*;
> >
> > public class Main {
> >   public static void main(String[] args) throws Exception {
> >     TLexer lexer = new TLexer(new ANTLRStringStream("key"));
> >     TParser parser = new TParser(new CommonTokenStream(lexer));
> >     parser.parse();
> >   }
> > }
> >
> > I'd expected "KEY :: key" to be printed to the console, however, "ANY :: key" is printed instead.
> > So the last rule is matched, while the KEY rule also matches the same input and is defined before ANY. Why?
> >
> > Kind regards,
> >
> > Bart.
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
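
The tie-breaking rule assumed in the question (the longest match wins; on a tie, the rule defined first wins) can be sketched in plain Java. This is only an illustration under assumptions: the class name FirstRuleWins is made up, the three regular expressions are hand-written stand-ins for the MANY_A, B and ANY_EXEPT_B rules, and java.util.regex is used instead of ANTLR. It is not the code ANTLR generates and it does not reproduce the predicate behaviour reported in the thread; it only shows the output that was expected for "aaaBaa".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstRuleWins {

    // Hypothetical stand-ins for the three lexer rules, in grammar order.
    // LinkedHashMap preserves insertion order, which models "defined first".
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("MANY_A", Pattern.compile("a+"));
        RULES.put("B", Pattern.compile("B"));
        RULES.put("ANY_EXEPT_B", Pattern.compile("[^B]+"));
    }

    // Splits the input into "RULE text" entries using longest match,
    // breaking ties in favour of the rule that was registered first.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at the start of the region.
                // The strict '>' means an earlier rule keeps a tie.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            tokens.add(bestName + " " + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("aaaBaa")) {
            System.out.println(token);
        }
    }
}
```

Under this first-rule-wins policy, "aaa" and "aa" come out as MANY_A rather than ANY_EXEPT_B, which is the result the question expected from the grammar.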