On Sun, Dec 23, 2007 at 06:01:22PM -0800, Stephen Weeks wrote:
> Working on getting parrot to parse the lojban grammar, I found that it
> would hang forever with --target=parse.  Tracked it down to a <ws> rule.
> When I define a custom token ws in the grammar, it parses without
> problem.
> 
> Built a stripped-down test case.
> 
> http://nopaste.snit.ch:8001/paste has the grammar.
> http://pleasedieinafire.net/~tene/jp.tar.bz2 has the grammar, a slightly
> modified version of abc's .pir and Makefile.
> 
> Can anyone else confirm this?  Am I doing something wrong?

The default <ws> rule (as defined by Synopsis 5) requires whitespace only
between two word characters.  The grammar as written (without a custom
<ws> rule) therefore looks for an unbounded number of <space_char> matches,
and since <ws> can match the empty string when not between two word
characters, PGE gets caught in an infinite loop.
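
To illustrate the failure mode, here is a minimal sketch in Python rather
than PGE.  It is not PGE's implementation; match_ws, repeat_naive and
repeat_guarded are hypothetical names, and the \s* pattern is only a
stand-in for a rule that can succeed on the empty string.  The naive loop
carries an artificial iteration limit so the demonstration terminates.

```python
import re

# Hypothetical stand-in for a <ws>-like rule: optional whitespace, so it
# succeeds even when it consumes nothing.
WS = re.compile(r"\s*")


def match_ws(text, pos):
    """Return the end position of a <ws>-like match starting at pos."""
    return WS.match(text, pos).end()


def repeat_naive(rule, text, pos, limit=1000):
    """Apply `rule` repeatedly with no progress check.

    `limit` exists only so this demo terminates; without it the loop
    would spin forever once `rule` starts matching the empty string.
    Returns (final position, iterations, whether the limit was hit).
    """
    n = 0
    while n < limit:
        end = rule(text, pos)
        if end is None:          # rule failed: normal exit
            return pos, n, False
        pos = end
        n += 1
    return pos, n, True


def repeat_guarded(rule, text, pos):
    """Same loop, but stop after a match that consumes no input."""
    n = 0
    while True:
        end = rule(text, pos)
        if end is None:
            break
        n += 1
        if end == pos:           # zero-length match: no further progress possible
            break
        pos = end
    return pos, n


if __name__ == "__main__":
    print(repeat_naive(match_ws, "  ab", 0, limit=50))
    print(repeat_guarded(match_ws, "  ab", 0))
```

The guarded loop stops as soon as an iteration fails to advance the
position, which is one common way for a matcher to handle repeated
zero-length matches.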

This ticket should therefore probably be merged with, or treated as a
duplicate of, RT#37745 (handle repeated zero-length captures in PGE).  In
any case, until RT#37745 is fixed it is worth making sure that grammars
don't end up with repeated zero-length subrules or subpatterns.

Pm
