On 26/05/2010 17:27, Attila Szegedi wrote:
On 2010.05.26., at 16:55, Rémi Forax wrote:
On 26/05/2010 16:26, Attila Szegedi wrote:
Yeah, it's just that most tools that go by the name "parser generator" today
are in fact combined lexer/parser generators, and they usually don't allow for a
situation where arbitrary new tokens can be introduced by the text being analyzed (which
is the case with both examples I gave). I was thinking about this (Ruby parsing) some
time ago and concluded that you'd most likely end up with a hand-patched lexer, as I
haven't seen this feature in any of the ready-made solutions I know (there might be some
that I don't know of, naturally).
Attila.
For %Q{...}, the lexer looks up the matching closing delimiter, crops the text
between the two delimiters,
and calls another parser (or the same one at another start terminal) with that
text.
Yeah, but my point is that you can use *any* character after Q to specify the
delimiter, i.e.:
%Qafooa
is equal to
"foo"
so your lexer must be ready to use as a terminating character whatever
character follows immediately after %Q. It's just that it'll have special cases
for some chars - most notably parentheses, brackets, and braces, so a string
starting with %Q{ will not be terminated by { but by }. Most people will use {
and }, but the point is that you are free to use *any* character.
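
To make the rule above concrete, here is a minimal sketch of such a hand-written scanner in Python. The names read_percent_q and PAIRED are made up for illustration and do not come from any existing lexer. It assumes that nested bracket pairs must balance and it ignores backslash escapes, so it is a simplification rather than a full Ruby implementation.

```python
# Hypothetical helper for illustration only; not taken from any real lexer.
# Opening characters that are closed by a different character.
PAIRED = {"(": ")", "[": "]", "{": "}"}

def read_percent_q(text, pos=0):
    """Scan a %Q literal starting at text[pos]; return (contents, next_pos)."""
    if not text.startswith("%Q", pos):
        raise ValueError("expected %Q at position %d" % pos)
    if pos + 2 >= len(text):
        raise ValueError("missing delimiter after %Q")
    opener = text[pos + 2]
    # Any character that is not an opening bracket terminates itself.
    closer = PAIRED.get(opener, opener)
    depth = 1
    i = pos + 3
    while i < len(text):
        ch = text[i]
        if ch == opener and opener != closer:
            depth += 1  # nested opening bracket (assumed to balance)
        elif ch == closer:
            depth -= 1
            if depth == 0:
                return text[pos + 3:i], i + 1
        i += 1
    raise ValueError("unterminated %Q literal")

print(read_percent_q("%Qafooa"))   # delimiter is 'a'
print(read_percent_q("%Q{foo}"))   # delimiter is '{', closed by '}'
```

The closing delimiter is only known once the character after %Q has been read, which is why a fixed token definition in a generated lexer cannot express it directly.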
I see, like \verb!...! in LaTeX.
Easy, if you have a good parser generator that does fast reduce, i.e. one that
doesn't consume the next token
before reducing unless it has to, and that gives you access to the
underlying character buffer.
Tatoo does that: http://gforgeigm.univ-mlv.fr/projects/tatoo
In that case, you only recognize %Q, and drop into a hand-coded method
to do the rest.
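
A rough sketch of that hand-off, written in Python rather than against Tatoo's actual API (which is not shown here): the rule-driven part only matches the %Q prefix, and a hand-coded function then reads the rest straight from the character buffer. TOKEN_RULES, read_string_body and tokenize are hypothetical names, and the token set is an assumption chosen just to make the example run.

```python
import re

# Rule-driven part: ordinary tokens, plus a rule that matches only the
# "%Q" prefix of a string literal. The token set is illustrative.
TOKEN_RULES = re.compile(
    r"(?P<WS>\s+)|(?P<IDENT>[A-Za-z_]\w*)|(?P<OP>[=+])|(?P<PERCENT_Q>%Q)"
)
CLOSERS = {"(": ")", "[": "]", "{": "}"}

def read_string_body(buf, pos):
    """Hand-coded part: buf[pos] is the delimiter chosen by the source text."""
    closer = CLOSERS.get(buf[pos], buf[pos])
    end = buf.index(closer, pos + 1)  # no nesting or escapes in this sketch
    return buf[pos + 1:end], end + 1

def tokenize(buf):
    tokens, pos = [], 0
    while pos < len(buf):
        m = TOKEN_RULES.match(buf, pos)
        if m is None:
            raise ValueError("unexpected character at position %d" % pos)
        pos = m.end()
        if m.lastgroup == "PERCENT_Q":
            # Nothing past "%Q" has been consumed yet, so the hand-coded
            # method sees the raw buffer and decides where the literal ends.
            body, pos = read_string_body(buf, pos)
            tokens.append(("STRING", body))
        elif m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("x = %Qafooa + y"))
```

This only works because the tokenizer never reads ahead of the token it is returning; a lexer that had already tokenized the characters after %Q would have split the literal before the hand-coded method could claim it.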
Attila.
Rémi
--
You received this message because you are subscribed to the Google Groups "JVM
Languages" group.