>>
>> 1. Berkeley Yacc for Perl - works pretty well, but is kinda limited.
>>
I'm not sure what (if any) practical advantage this would have over
bison. I get the sense it's less well maintained.
>> 2. Parse::RecDescent - very impressive feature set, but a little slow,
>> and has been under-maintained (though it seemed to have improved slightly
>> with several new releases in 2009). It also tends to be hard to debug its
>> errors.
>>
I tried this. It works OK, but it was two or three orders of magnitude
slower than bison/C for me. The debugging tools and the docs are both
pretty good.
>> 3. Parse::Yapp - http://search.cpan.org/dist/Parse-Yapp/ - I tried to use
>> it in https://svn.berlios.de/svnroot/repos/web-cpan/Text-Qantor/ but it
>> gives me an error for what appears to be a valid syntax, and for the life
>> of me I cannot understand why it is.
For people who aren't experts in the field, most of the grammar errors
are completely inscrutable for all of these tools, even after reading
the docs. Start with something simple and make it more complex until it
stops working, then try phrasing it differently. Well, that works for me.
I haven't tried this specific tool.
>>
>> 4. There's a new version of GNU bison with support for multiple language
>> backends. I tried writing a backend for Perl 5, but I gave up on the m4
>> hacking (I think that m4 must die!).
Bison is fast and relatively simple. I'm not sure about using it
directly from Perl, but I wrote a C++ program to parse router configs
and spit out Data::Dumper-style Perl struct output. Very fast, and
_relatively_ easy to add exceptions for a poorly behaved grammar. You
should be able to use this directly via XS or Inline if you care about
more direct integration. bison + C/C++ + valgrind works pretty well for
making a well-behaved parser (without valgrind I always leak a bunch of
memory when writing in more, uh, "hands on" languages).
I did look at yacc but bison has more features. Also bison is consistent
across platforms.
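
To illustrate the Data::Dumper-style output mentioned above, here is a
minimal sketch (in Python rather than C++, for brevity) of turning an
already-parsed nested structure into a Perl data-structure literal. It
is not the actual program: the function names and the sample config
data are made up for the example.

```python
def quote(text):
    """Single-quote a string the way a Perl literal expects."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_perl(value, indent=0):
    """Render nested dict/list/scalar data as a Perl literal (hypothetical helper)."""
    pad = "  " * indent
    inner = "  " * (indent + 1)
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [f"{inner}{quote(str(k))} => {to_perl(v, indent + 1)}"
                 for k, v in value.items()]
        return "{\n" + ",\n".join(items) + "\n" + pad + "}"
    if isinstance(value, (list, tuple)):
        if not value:
            return "[]"
        items = [f"{inner}{to_perl(v, indent + 1)}" for v in value]
        return "[\n" + ",\n".join(items) + "\n" + pad + "]"
    if value is None:
        return "undef"
    if isinstance(value, bool):  # must be checked before int
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    return quote(str(value))


def dump(name, value):
    """Produce a '$NAME = ...;' assignment, similar in shape to Data::Dumper output."""
    return f"${name} = {to_perl(value)};\n"


# Made-up sample data standing in for a parsed router config.
config = {"hostname": "edge1", "interfaces": [{"name": "eth0", "mtu": 1500}]}
print(dump("VAR1", config))
```

The point of this design is that the parser and the Perl side only share
a text format, so the fast parser can stay a separate program.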
>>
>> 5. There's also ANTLR - http://www.antlr.org/ :
>>
>> http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets says:
>>
>> Perl - Early prototyping. Simple lexer is working.
>>
I have used the Python version of ANTLR. I found it to be more difficult
than bison/C (steeper learning curve), and maybe one or two orders of
magnitude slower than bison/C.
The author and the users seem very competent, but if you are like me and
just want to get some parsing done this may be more effort than it's worth.
Cross-language support is excellent between Java and Python; I'm not sure
about Perl. If you are trying to create an AST it will probably work fine.
Beyond that... ?
If you are writing your own grammar this is a really powerful tool. If
you are stuck trying to parse someone else's jive it may not be as useful.
Error messages for the user when something isn't parsable are also
fairly inscrutable, imo.
>> 6. Can I interact with the Parrot Grammar Engine (PGE)? Any input would be
>> useful.
If you look at the Parrot stuff I would be interested to hear how well
that works for you.
There's also the 'roll your own' approach. That's pretty fast and less
difficult than you might expect. Also, depending on what you are parsing,
if a grammar is poorly behaved (possibly because the !...@!@ing vendor
decided to be erratic instead of consistent) it can be brutal to try to
add weird exceptions in a format any of the above tools will recognize
happily.
Another advantage is that when you write your own parser it's easier to
figure out what happens and what to do when you get something unexpected.
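
A minimal sketch of what rolling your own can look like, in Python for
brevity. Everything here is an assumption made for illustration: the
brace-and-semicolon config syntax, the function names, and the "vendor
quirk" (the last statement in a block may omit its ';'). It shows the
two points above: a special case is one extra branch, and you decide
exactly what the error says when something unexpected turns up.

```python
import re

# Tokens are either one of the punctuation characters or a run of other text.
TOKEN = re.compile(r"[{};]|[^\s{};]+")
PUNCT = {"{", "}", ";"}


class ParseError(Exception):
    pass


def tokenize(text):
    """Return a list of (line_number, token) pairs."""
    tokens = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN.finditer(line):
            tokens.append((lineno, match.group()))
    return tokens


def parse(text):
    """Parse the (hypothetical) config syntax into nested dicts."""
    block, _ = parse_block(tokenize(text), 0, top_level=True)
    return block


def parse_block(tokens, pos, top_level=False):
    """Parse statements until the closing '}' (or end of input at top level)."""
    block = {}
    while pos < len(tokens):
        lineno, tok = tokens[pos]
        if tok == "}":
            if top_level:
                raise ParseError(f"line {lineno}: unexpected '}}'")
            return block, pos + 1
        # Collect the words that make up one statement.
        words = []
        while pos < len(tokens) and tokens[pos][1] not in PUNCT:
            words.append(tokens[pos][1])
            pos += 1
        if not words:
            raise ParseError(f"line {lineno}: expected a name before {tok!r}")
        if pos == len(tokens):
            raise ParseError(f"line {lineno}: statement starting with "
                             f"{words[0]!r} is not terminated")
        tok = tokens[pos][1]
        if tok == ";":
            block[words[0]] = " ".join(words[1:])
            pos += 1
        elif tok == "{":
            child, pos = parse_block(tokens, pos + 1)
            block[" ".join(words)] = child
        else:
            # Made-up vendor quirk: ';' omitted before '}'. Accept the
            # statement and let the loop handle the '}' on the next pass.
            block[words[0]] = " ".join(words[1:])
    if not top_level:
        raise ParseError("unexpected end of input: missing '}'")
    return block, pos


sample = """
hostname edge1;
interface eth0 {
    mtu 1500;
    description uplink
}
"""
print(parse(sample))
```

Handling the same quirk in a generated parser usually means reworking
the grammar rules; here it is a single else branch.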
Knowing what I know now, I would tend to opt for bison/C or rolling my
own, depending on how speed-critical the app is. If I were writing my
own language, maybe ANTLR.
If you would like examples for any of the above I'd be happy to share,
please email me off-list.
Austin