It works like this (same as flex):

1. The longest matching token wins.
2. On equal length, the token defined earliest in the grammar wins.

So, in your case, don't make your TEXT token a repeating regexp, since <<.+>> matches the rest of the line and so beats every shorter token. Match a single character only.

Cheers,

/Per

On Monday, February 28, 2011, Drew Vogel <drewpvo...@gmail.com> wrote:
> I would expect the token definition order to matter, based on my experience
> with similar tools like flex. I must be doing something wrong.
>
> This is the test file I am trying to parse:
> --------------------------------------------------
> >email<
> Enter your email address:
>
> This is my test grammar:
> --------------------------------------------------
> %header%
> GRAMMARTYPE = "LL"
>
> %tokens%
> RCARET = ">"
> LCARET = "<"
> ITEM_NAME = <<[a-zA-Z][a-zA-Z0-9]+>>
> TEXT = <<.+>>
>
> %productions%
> Item = ItemDecl TEXT ;
> ItemDecl = RCARET ITEM_NAME LCARET ;
>
> This is the error I get from grammatica:
> --------------------------------------------------
> java -jar grammatica-1.5.jar Q.grammar --parse test.q
> Parse tree from test.q:
> Error: in test.q: line 1: unexpected token ">email<" <TEXT>, expected ">"
>
> If I remove the TEXT token definition and the reference in the Item
> production, the remaining grammar does properly match the first line and I
> get a parse error at the new line character (as expected). Why does the
> introduction of my TEXT token override those previously-matching tokens, even
> though it is listed last in the %tokens% section?
>
> On Sun, Feb 27, 2011 at 11:49 PM, Oliver Bock <oli...@g7.org> wrote:
> > I had to do a similar thing, but putting the more specific tokens
> > first in %tokens% worked for me. From my grammar:
> >
> > ON = "ON"
> > VARNAME = <<[A-Z@#]([A-Z0-9._$#@]*[A-Z0-9_$#@])?>>
> >
> > The text "ON" could match both these tokens, but for me ON matches,
> > not VARNAME. I suggest you cut your example down into a very simple
> > grammar (like the above).
> >
> > Oliver
> >
> > On 28/02/2011 4:37 PM, Drew Vogel wrote:
> > > If I have two regex tokens A and B and A is a subset
> > > of B, how do I disambiguate them such that A will always be tried
> > > before B? The order they appear in the %tokens% section does not
> > > seem to affect this and I did not see an example of this in the
> > > documentation.
> > >
> > > The parser I am trying to construct is for a template-like
> > > language with commands embedded in text. Thus I have a "text"
> > > token regex <<.+>> to match everything not otherwise
> > > matched as a command, but I only want to match it after all
> > > other token regex patterns have been tried.
> > >
> > > Drew Vogel
> > >
> > > _______________________________________________
> > > Grammatica-users mailing list
> > > Grammatica-users@nongnu.org
> > > http://lists.nongnu.org/mailman/listinfo/grammatica-users
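
For anyone reading this thread later, the two rules from the top of this reply (longest match first, then earliest definition on a tie) can be simulated in a few lines of Python. This is only a sketch of the rule itself, not Grammatica's actual tokenizer: the tokenize helper is hypothetical, and it assumes Python's re syntax is close enough to Grammatica's for these particular patterns. The token definitions are the ones quoted in the thread.

```python
import re


def tokenize(token_defs, text):
    """Split text by longest match; on equal length the earliest definition wins.

    Hypothetical helper for illustration only, not part of Grammatica.
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in token_defs]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            match = regex.match(text, pos)
            # Strict '>' means a later definition never replaces an
            # earlier one that matched the same number of characters.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no token matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


# Token definitions from Drew's grammar, without TEXT.
BASE = [
    ("RCARET", r">"),
    ("LCARET", r"<"),
    ("ITEM_NAME", r"[a-zA-Z][a-zA-Z0-9]+"),
]

# TEXT as a repeating regexp swallows the whole first line.
greedy = tokenize(BASE + [("TEXT", r".+")], ">email<")

# TEXT as a single character never beats a longer or earlier token.
single = tokenize(BASE + [("TEXT", r".")], ">email<")

# Oliver's case: both tokens match "ON" with equal length, so ON wins.
keyword = tokenize(
    [("ON", r"ON"), ("VARNAME", r"[A-Z@#]([A-Z0-9._$#@]*[A-Z0-9_$#@])?")],
    "ON",
)

print(greedy)
print(single)
print(keyword)
```

With <<.+>> the first line comes out as a single TEXT token, which is the same token the Grammatica error message reports. With a single-character TEXT it comes out as RCARET, ITEM_NAME, LCARET. Note that a single-character TEXT also means words in the body text will tokenize as ITEM_NAME (they are longer), so the productions would need to accept both.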