On Wed, May 26, 2010 at 11:27 AM, Attila Szegedi <[email protected]> wrote:

>
> Yeah, but my point is that you can use *any* character after Q to specify
> the delimiter, i.e.:
>
>  %Qafooa
>
> is equal to
>
>  "foo"
>
> so your lexer must be ready to use as a terminating character whatever
> character follows immediately after %Q. It's just that it'll have special
> cases for some chars - most notably parentheses, brackets, and braces, so a
> string starting with %Q{ will not be terminated by { but by }. Most people
> will use { and }, but the point is that you are free to use *any* character.
>
>
I'm not sure this feature is doable in a sane way in a classic lex/yacc
parser, since the terminating character isn't known until the lexer has read
whatever follows %Q.  I think you can, in classic lex, drop down to a
hand-rolled lexer if you need to, but this is a serious code smell.
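
To make concrete what a hand-rolled routine for this would have to do, here
is a minimal sketch in Python.  It follows the rule as described in the
quoted message (the character after %Q picks the terminator, with
parentheses, brackets, and braces paired up), not any particular Ruby
implementation.  The function name scan_percent_q and the PAIRED table are
made up for illustration, nesting of paired delimiters is an assumption on
my part, and escapes and interpolation are ignored entirely.

```python
# Hypothetical sketch, not taken from any real Ruby lexer.
# Opening delimiters that close with a different character.
PAIRED = {"(": ")", "[": "]", "{": "}"}


def scan_percent_q(src, pos=0):
    """Scan a %Q literal starting at src[pos].

    Returns (body, index_just_past_the_literal).
    Raises ValueError if there is no literal or it is unterminated.
    """
    if not src.startswith("%Q", pos) or pos + 2 >= len(src):
        raise ValueError("no %Q literal at this position")

    opener = src[pos + 2]
    # For ordinary characters the closer is the opener itself.
    closer = PAIRED.get(opener, opener)

    depth = 1          # assumption: paired delimiters may nest
    start = pos + 3    # first character of the string body
    i = start
    while i < len(src):
        ch = src[i]
        if ch == closer:
            depth -= 1
            if depth == 0:
                return src[start:i], i + 1
        elif ch == opener:
            # Only reachable when opener != closer, i.e. a paired delimiter.
            depth += 1
        i += 1

    raise ValueError("unterminated %Q literal")
```

With this, scan_percent_q("%Qafooa") and scan_percent_q("%Q{foo}") both
yield the body "foo".  The point is that the terminator is state carried
from one character to the rest of the token, which is exactly what a fixed
set of lex patterns doesn't give you for free.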

However, this misses the point of my original post.  If you're parsing an
existing language, then you don't get a choice in features.  If you're
parsing Ruby, you can't choose to not implement this feature.  Thus, if lex
and yacc can't handle this feature, you can't use lex and yacc to parse
Ruby.  This is where fancier parsers with more features and greater ability
to handle weird syntaxes become really useful.

If you're creating a language, you have a choice: you can include
hard-to-parse features or not.  The thing to remember is that adding them
has a cost.  Every hard-to-parse feature you add reduces the number of
parsers other people are willing to write for your language, which limits
the language's portability, the number of tools built for it, and so on.
Even worse, every one of these features increases the likelihood that people
who do implement other parsers for your language get it subtly wrong.  Add
enough of these features and there will only ever be one parser for your
language: yours.

This isn't to say that you shouldn't add these features.  It's that you
should be aware of the trade-offs you are making, and make them
deliberately rather than accidentally.  With more powerful and flexible
parser generators, it's much easier to add these sorts of features by
accident and paint yourself into a corner (and it's even more likely if
you're writing the parser by hand and not using a parser generator at
all).

Brian
