On Sun, Dec 23, 2007 at 06:01:22PM -0800, Stephen Weeks wrote:
> Working on getting parrot to parse the lojban grammar, I found that it
> would hang forever with --target=parse. Tracked it down to a <ws> rule.
> When I define a custom token ws in the grammar, it parses without
> problem.
>
> Built a stripped-down test case.
>
> http://nopaste.snit.ch:8001/paste has the grammar.
> http://pleasedieinafire.net/~tene/jp.tar.bz2 has the grammar, a slightly
> modified version of abc's .pir and Makefile.
>
> Can anyone else confirm this? Am I doing something wrong?

The default <ws> rule (as defined by Synopsis 5) requires whitespace only between two word characters. Thus the grammar as written (without a custom <ws> rule) looks for an unbounded number of <space_char> matches, and since <ws> can match the empty string when it is not between two word characters, PGE gets caught in an infinite loop.

This ticket should therefore likely be merged with, or treated as a duplicate of, RT#37745 (handle repeated zero-length captures in PGE). At any rate, until #37745 is fixed it's probably worthwhile to make sure grammars don't end up with repeated zero-length subrules or subpatterns.

Pm
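
For anyone following the ticket, here is a rough Python sketch of why a quantified subrule that can match the empty string never terminates, and what a zero-length guard might look like. This is not PGE code and not the fix proposed in RT#37745; the names ws, star, guard and max_iters are all made up for illustration, and max_iters exists only so the unguarded case can be shown without actually hanging.

```python
import re

_SPACES = re.compile(r"\s*")

def is_word(ch):
    # Word character in the \w sense: letter, digit, or underscore.
    return ch.isalnum() or ch == "_"

def ws(text, pos):
    """Hypothetical stand-in for the default <ws>: whitespace is required
    only between two word characters, otherwise it may match empty.
    Returns the end position of the match, or None on failure."""
    end = _SPACES.match(text, pos).end()
    between_words = (0 < pos < len(text)
                     and is_word(text[pos - 1]) and is_word(text[pos]))
    if end == pos and between_words:
        return None
    return end

def star(sub, text, pos, guard, max_iters=1000):
    """Greedy repetition of sub (illustrative only).
    Returns (end_pos, iterations). Without the guard, a zero-length
    match repeats until max_iters; a real engine would loop forever."""
    iters = 0
    while iters < max_iters:
        end = sub(text, pos)
        if end is None:
            break
        iters += 1
        if guard and end == pos:
            # No progress was made, so stop repeating.
            break
        pos = end
    return pos, iters

print(star(ws, "a  ;", 1, guard=False))  # (3, 1000): stuck at position 3
print(star(ws, "a  ;", 1, guard=True))   # (3, 2): stops on the empty match
```

In the unguarded run the first iteration consumes the two spaces, and every later iteration succeeds at position 3 without consuming anything, which is the same shape of loop described above. Until the engine handles this itself, the practical workaround is the one already mentioned: avoid putting a quantifier around anything that can match zero characters.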