Hi
This question first appeared at the PerlMonks website
(http://www.perlmonks.org/index.pl?node_id=218054). It was suggested that I post the
question and answer (from Damian) here, as others might find it of interest.
Best wishes.
Kevin
Q:
Given the following code:
=========================
use strict;
use warnings;
use Parse::RecDescent;
my $grammar = 'startrule: ( "aa" | "a" ) "a"';
my $parser = Parse::RecDescent->new($grammar);
my $text = 'aa';
print defined($parser->startrule($text)) ? "Good!\n" : "Bad!\n";
=========================
the output is "Bad!". The reason is that the first branch of the alternation ("aa")
matched and consumed the whole input, so the following "a" had nothing left to match.
Now, I understand from the P::RD manpage that bottom-up parsers try all possible
matches and select the longest one, so I guess the yacc equivalent would have worked.
However, I still found the behaviour of P::RD a little surprising.
My question is: Is there any way to persuade a top-down parser like P::RD to accept
the above text as valid? I know that, in this simple example, I could easily rewrite
the grammar (either by saying ( "a" | "aa" ) "a" or "aa" "a" | "a" "a"). What I mean is:
Is there any additional feature I have missed which would allow the grammar as is to
parse the text successfully?
To put the question another way, can I get P::RD to behave more like a regex engine?
After all, even an NFA engine would backtrack to try all possible alternatives before
failing :-) (Perhaps parsers just do not backtrack past individual subrules under any
circumstances.)
A:
The answer is that RecDescent parsers do not work that way. They don't backtrack on
failure; they just fail. Of course, there's nothing to prevent a recursive descent
parser from incorporating backtracking too, but RecDescent doesn't.
So, if you need backtracking in part of your grammar, you need to use plain old
regexes there. Sorry.
Damian
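
To illustrate the regex route suggested above, here is a minimal sketch that writes
the same alternation, ( "aa" | "a" ) "a", as a plain Perl regex. The helper name
matches_startrule is hypothetical, and the ^ and $ anchors are an added assumption
(the original grammar does not require the whole input to be consumed).

```perl
use strict;
use warnings;

# Hypothetical helper: the grammar's ( "aa" | "a" ) "a" expressed as a regex.
# The anchors are an assumption added here to force a whole-string match.
sub matches_startrule {
    my ($text) = @_;
    # The regex engine tries "aa" first; if the trailing "a" then fails,
    # it backtracks into the alternation and retries with "a".
    return $text =~ /^(?:aa|a)a$/ ? 1 : 0;
}

for my $text ('aa', 'aaa', 'a') {
    print "$text: ", (matches_startrule($text) ? "Good!" : "Bad!"), "\n";
}
```

Parse::RecDescent accepts regex terminals written between slashes, so a rule such as
startrule: /(?:aa|a)a/ may be one way to apply this inside a grammar; that variant
is not tested here.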