On Wed, Aug 1, 2012 at 7:48 AM, Dmitry Olshansky <dmitry.o...@gmail.com> wrote:
>> Well,
>>
>> - for a lexer lookahead is sometimes useful (the Dragon book cites the
>> FORTRAN grammar, for which keywords are not reserved and so when you
>> encounter IF, you don't know if (!) it's a function call or a 'real'
>> if)
>
> Well while lookahead will help, there are simpler ways. e.g.
> regex ("IF\ZYOUR_LOOKAHEAD");
>
> \Z means that you are capturing text only up to \Z. It's still regular. I
> think Boost regex does support this. We could have supported this cleanly
> but have chosen ECMA-262 standard.
>
> Critical difference is that lookahead can be used like this:
>
> regex("blah(?=lookahead)some_other_stuff_that_lookahead_also_checks");

In the FORTRAN case, you do indeed *need* to re-lex the text after IF with another regex, once you've determined that it's an IF statement and not some moron who used IF as an identifier.

You know, as a first step, I'd be happy to get ctRegex to recognize the \Z marker.
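
To make the FORTRAN case concrete, here is a rough sketch of the "decide with lookahead, then re-lex" idea. It is written with Python's `re` module rather than std.regex, purely as an illustration. The rule it encodes is my own simplification, not real FORTRAN: IF counts as a keyword when it is followed by a parenthesised condition that is not followed by `=`. Nested parentheses are not handled. The names `IF_KEYWORD`, `IDENTIFIER` and `lex_first_token` are hypothetical.

```python
import re

# Assumed, simplified rule: "IF" is a keyword when followed by "( ... )"
# and then something other than "=". The (?=...) lookahead checks that
# context without consuming it, so only "IF" itself is captured.
IF_KEYWORD = re.compile(r"IF(?=\s*\([^()]*\)\s*[^=\s])")

# Fallback: an ordinary identifier (upper-case letters and digits only).
IDENTIFIER = re.compile(r"[A-Z][A-Z0-9]*")


def lex_first_token(line):
    """Return (kind, text, rest) for the first token of a statement."""
    m = IF_KEYWORD.match(line)
    if m:
        # The text inspected by the lookahead stays in `rest`,
        # so it has to be lexed again by whatever rule comes next.
        return ("KEYWORD", m.group(), line[m.end():])
    m = IDENTIFIER.match(line)
    if m:
        return ("IDENT", m.group(), line[m.end():])
    raise ValueError("no token at start of %r" % line)


if __name__ == "__main__":
    for stmt in ("IF (X) GOTO 10", "IF (X) = 1"):
        print(lex_first_token(stmt))
```

In both statements only `IF` is consumed. Everything after it, including the part the lookahead already inspected, is handed back in `rest`, which is the re-lexing step mentioned above.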