On 08-Jul-12 00:50, Roman D. Boiko wrote:
> On Saturday, 7 July 2012 at 20:29:26 UTC, Dmitry Olshansky wrote:
>> And given the backtracking nature of PEGs you'll do your distributed
>> thing many times over or ... spend a lot of RAM to remember not to
>> redo it. I recall lexing takes even more than parsing itself.

> I think that your conclusions are about statistical evidence of PEG
> misuses and poor PEG parser implementations. My point was that there is
> nothing fundamentally worse in having the lexer integrated with the
> parser, but there are performance advantages in having to check fewer
> possible cases when structural information is available (so that
> lexSmth could be called when Smth is expected, thus requiring fewer
> switch branches if a switch is used).
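What Boiko describes above can be sketched in a few lines of Python. This is only an illustration of expectation-driven lexing, not code from any real parser; names like lexSmth become hypothetical helpers such as `lex_number` and `lex_ident`, and the grammar is a made-up `ident '=' number` rule:

```python
# Sketch of expectation-driven lexing: the parser asks for exactly the
# token kind it expects at the current position, so each lexing helper
# tries one pattern instead of switching over every token class.
# All names and the toy grammar are illustrative, not from any real parser.
import re

def lex_number(src, pos):
    """Lex a number starting at pos; return (value, new_pos) or None."""
    m = re.match(r"\d+", src[pos:])
    return (int(m.group()), pos + m.end()) if m else None

def lex_ident(src, pos):
    """Lex an identifier starting at pos; return (name, new_pos) or None."""
    m = re.match(r"[A-Za-z_]\w*", src[pos:])
    return (m.group(), pos + m.end()) if m else None

def parse_assignment(src, pos=0):
    """ident '=' number -- each step calls only the lexer it expects."""
    lhs = lex_ident(src, pos)          # structure says: ident comes first
    if lhs is None:
        return None
    name, pos = lhs
    if pos >= len(src) or src[pos] != "=":
        return None
    rhs = lex_number(src, pos + 1)     # structure says: number must follow
    if rhs is None:
        return None
    value, pos = rhs
    return (name, value), pos
```

For example, `parse_assignment("x=42")` yields `(("x", 42), 4)` without ever consulting a general token switch.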

You may have misunderstood me as well; the point is still the same:
there are two things, notation and implementation. The fact that the
lexer is integrated in the notation, as in PEGs, is unrelated to the
fact that PEGs in their classic definition never use the term "token"
and do backtracking parsing essentially at the character level.
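To make the character-level point concrete, here is a minimal Python sketch of a classic PEG ordered choice with packrat memoization (the "RAM to remember not to redo it" above). The grammar `S <- "interface" / "int"` and all names are illustrative assumptions, not from any real implementation:

```python
# Character-level PEG sketch: "keywords" are just character sequences,
# there is no Token anywhere in the formalism, and ordered choice
# retries alternatives from the same input position. The memo dict is
# the packrat cache that trades RAM for not re-matching.
# Grammar and names are illustrative only.

def make_parser(src):
    memo = {}

    def literal(rule_name, text):
        def parse(pos):
            key = (rule_name, pos)
            if key in memo:                  # packrat hit: no rescan
                return memo[key]
            ok = src.startswith(text, pos)
            result = pos + len(text) if ok else None
            memo[key] = result
            return result
        return parse

    # S <- "interface" / "int"  (ordered choice over raw characters)
    interface = literal("interface", "interface")
    int_ = literal("int", "int")

    def choice(pos):
        r = interface(pos)      # try the longer literal first; on
        if r is not None:       # failure, retry from the same pos
            return r
        return int_(pos)

    return choice, memo
```

On the input `"int x"` the first alternative fails and the second one re-reads the same characters from position 0; a second call to the same rule at the same position would be answered from `memo` instead.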



> As for lexing multiple times, simply use a free list of terminals (aka
> tokens). I still assume that the grammar is properly defined, so that
> there is only one way to split the source into tokens.
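The scheme Boiko suggests above could look like the following Python sketch: since tokenization is assumed unambiguous, the token starting at each offset is computed once and cached, so re-reading it after backtracking is a dictionary lookup. The token pattern and class name are illustrative assumptions:

```python
# Sketch of a per-offset token cache: if the grammar splits the source
# into tokens in only one way, the token at each offset can be lexed
# once and reused whenever the parser backtracks there.
# The toy token pattern (numbers, identifiers, single chars) is
# illustrative only.
import re

TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|.)")

class TokenCache:
    def __init__(self, src):
        self.src = src
        self.cache = {}          # offset -> (text, next_offset)
        self.hits = 0

    def token_at(self, pos):
        """Return (token_text, next_offset) at pos, or None at EOF."""
        if pos in self.cache:
            self.hits += 1       # re-reads after backtracking are free
            return self.cache[pos]
        m = TOKEN_RE.match(self.src, pos)
        tok = (m.group(1), m.end()) if m else None
        self.cache[pos] = tok
        return tok
```

Asking for the token at the same offset twice lexes only once; the second call just bumps `hits`.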


Tokens... there is no such term in use if we talk about a 'pure' PEG.

--
Dmitry Olshansky

