On Sat, Dec 18, 2004 at 08:47:42AM -0800, Larry Wall wrote:
> : This test seems to cause an infinite loop
> : (with parrot_2004-12-16_160001)
> : 
> : p6rule_isnt('a--', '^[a?b?]*$', 're_tests 387 (#438)');  # infinite loop
> 
> Detecting failure to progress can be quite tricky, actually.  It's easy
> enough to detect that it *might* be an infinite loop.  But that pattern
> would succeed the string were all a's and b's.  It's not enough to figure
> out that you're at the same position or the same state.  You have to figure
> out that you're at the same position and the same state, and you may well
> have visited different positions in this state, or different states in
> this position.  So a naive solution requires N**2 in time or space.

In PGE I've been thinking this won't be *too* difficult (and I'll fully
admit to the possibility of being naive here).  Our "states" are actually
encoded into the pattern's subroutine code, with our "current state"
being held by Parrot's execution pointer and the various stacks, and 
we're already keeping track of the starting and current position of 
each "substring" being matched by a repeating group.  (Or, if we're 
not, we certainly can keep track of it in a stack of some sort.

So, when we get to the end of the bracketed group, we look to see 
if the current position has changed at all since we started the group, 
and if not we refuse to repeat the group again but just go on to 
whatever other checks need to be made.  If going on to the remaining 
checks causes a match failure, we're just going to backtrack into the 
group's subexpression anyway, which would then start doing things that
change the current pointer we'll be seeing at the end of the group.
This may not always be the most efficient mechanism -- i.e., we could
find ourselves repeating later matches we've already tried, but at
least it shouldn't infinite loop.

But rather than trying to explain it all and debate here whether it'll 
work or not, it's probably quicker for me to just implement that section 
of code and let our tests tell the tale.  I'll do that today/tomorrow and
report back on the results.  But if anyone knows of a case where what
I've discussed isn't likely to work or has caused problems in the past, 
let me know so we can code up a test and/or workaround for it and see
what happens.

Pm

Reply via email to