https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6143

--- Comment #27 from Sidney Markowitz <[email protected]>  2009-07-07 17:06:56 PST ---
I think I see the problem now, but not yet the solution.

The generated code in scanner1.c looks like the following (in your example;
FWIW, it shows up in scanner2.re and scanner2.c when using a newer version of
re2c, but still has the same problem):

 ++YYCURSOR;
 {RET("__SEEK_1R0JFS");}

where RET is #defined to start with

  YYCURSOR = YYMARKER;

Clearly that doesn't make sense, which would be ok for generated code if all it
meant was that the ++YYCURSOR is an extra operation that does nothing. The
problem is that if YYMARKER points to the start of the buffer that was passed
in, the loop that is calling this loops forever waiting for YYCURSOR to be
advanced to the end of the buffer.

  while (cursor < pend) { [... call this function passing it &cursor] }
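
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual generated scanner: scan_once and count_iterations are hypothetical names, YYMARKER is assumed to still point at the start of the input when the match is returned, and an iteration cap is added only so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated scanner function.
 * It mimics "++YYCURSOR;" followed by RET's "YYCURSOR = YYMARKER;". */
static const char *scan_once(const unsigned char **cursor_p,
                             const unsigned char *limit)
{
    const unsigned char *YYCURSOR = *cursor_p;
    const unsigned char *YYMARKER = YYCURSOR; /* assumption: marker never moved */
    const char *rule = NULL;

    if (YYCURSOR < limit) {
        ++YYCURSOR;           /* generated code advances one byte... */
        YYCURSOR = YYMARKER;  /* ...and RET immediately undoes it */
        rule = "__SEEK_1R0JFS";
    }
    *cursor_p = YYCURSOR;     /* caller sees no progress */
    return rule;
}

/* Mirrors the calling loop "while (cursor < pend)".
 * Returns the number of calls made, or -1 if max_iter calls were made
 * without the cursor reaching pend (the real loop has no such cap). */
static int count_iterations(const unsigned char *buf, size_t len, int max_iter)
{
    const unsigned char *cursor = buf;
    const unsigned char *pend = buf + len;
    int n = 0;

    while (cursor < pend) {
        if (n >= max_iter)
            return -1;        /* would have looped forever */
        scan_once(&cursor, pend);
        n++;
    }
    return n;
}
```

Calling count_iterations on any non-empty buffer hits the cap and returns -1, because the cursor handed back to the loop is never past where it started. Without the cap, that is the endless loop.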

I need to think through exactly how YYMARKER and the RET macro are supposed to
make the overlapped backtracking thing work out. As it stands now, any single
character match string will cause this to happen, but some other wrong thing
may happen in some other circumstances because I see at least one other place
where a ++YYCURSOR; comes right before a RET.

Oh, I just got an insight as to why the __SEEK_1R0JFS rule is involved even
though as far as I can tell that rule does not match on just a single NUL
character. I bet that it is because it does contain NUL characters in its match
string and there is also the NUL_IN_BODY rule, and it is the combination of the
two that leads the re2c compiler to produce this code. What I think is going on
is that the backtracking is supposed to let the scanner match on the single NUL
character for NUL_IN_BODY, then backtrack and continue matching __SEEK_1R0JFS
which includes that NUL character.

Does anyone know how the rule2xs scanner is supposed to handle two rules
that match on the same string? So far I haven't seen how that can be done.
