In message <[EMAIL PROTECTED]>, Jeff McNiel writes:
>I see now what is happening: when contains() does not match,
>it sets the currentOffset of the PatternMatcherInput to "past the end".
>
>I suppose that this is the correct behavior as designed - just my
>misunderstanding :-) (suggestion for the docs - nothing is said about
>changing the offset in the case of match failure)
>
>I guess to get the behavior I want, I'll have to save the currentOffset
>and reset it on match failure.
Yes, that's the intended way to do it. However, this point has been
brought up before and it really does appear that it is more useful not
to advance the offset. It would also be more in keeping with reproducing
the behavior of \G and /g which PatternMatcherInput is intended to
duplicate. I just checked 'man perlop' and ran a little test, and contrary
to my memory, \G does not change on failed matches. At the time, it
seemed more consistent and less surprising for
contains(PatternMatcherInput,...) to always advance to where the last match
_attempt_ left off. However, the primary reason an iterator interface was
not chosen was its inflexibility. If contains() fast-forwards you to the
end, it is not as useless as an iterator, but it makes life a bit more
complicated because you have to reset the offsets yourself every single
time.
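
For reference, the save-and-reset workaround might look something like the
sketch below. It is written against java.util.regex so that it runs without
the library. The Input class and both contains methods are hypothetical
stand-ins that only mimic the behavior described above; they are not the
real PatternMatcherInput or PatternMatcher API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetRewindSketch {
    /** Hypothetical stand-in for PatternMatcherInput: text plus a current offset. */
    static final class Input {
        final String text;
        int currentOffset = 0;

        Input(String text) {
            this.text = text;
        }
    }

    /**
     * Mimics the behavior discussed above (an assumption, not library code):
     * a successful match moves the offset to the end of the match, while a
     * failed match moves it to the end of the input.
     */
    static boolean contains(Input in, Pattern p) {
        Matcher m = p.matcher(in.text);
        if (in.currentOffset <= in.text.length() && m.find(in.currentOffset)) {
            in.currentOffset = m.end();
            return true;
        }
        in.currentOffset = in.text.length();
        return false;
    }

    /** The workaround: remember the offset and restore it when the match fails. */
    static boolean containsOrRewind(Input in, Pattern p) {
        int saved = in.currentOffset;
        boolean found = contains(in, p);
        if (!found) {
            in.currentOffset = saved;
        }
        return found;
    }

    public static void main(String[] args) {
        Input in = new Input("abc 123");
        containsOrRewind(in, Pattern.compile("[a-z]+")); // matches "abc"
        containsOrRewind(in, Pattern.compile("[A-Z]+")); // fails, offset restored
        containsOrRewind(in, Pattern.compile("\\d+"));   // matches "123"
        System.out.println(in.currentOffset);
    }
}
```

The wrapper is small, but every caller that wants to try a second pattern
at the same position has to remember to use it.
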
I'm inclined to make the change because it makes lexical analysis and
tokenizer construction easier. I remember having an awkward time writing
a tokenizer because of this a few months ago. The change should not break
most code that rewinds the offsets and will be advertised in both the
javadocs and the CHANGES file. Objections?
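
To illustrate why leaving the offset alone on failure helps a tokenizer,
here is a rough sketch. It uses java.util.regex (region() plus lookingAt())
rather than this library, and the token patterns and the TokenizerSketch
class are made up for the example. The point is only that a failed attempt
leaves the offset where it was, so the next pattern can be tried at the
same position without any rewinding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizerSketch {
    // Hypothetical token classes, tried in order at the current offset.
    // Index 0 is whitespace, which is consumed but not emitted.
    private static final Pattern[] TOKENS = {
        Pattern.compile("\\s+"),
        Pattern.compile("\\d+"),
        Pattern.compile("[A-Za-z_]\\w*"),
        Pattern.compile("[-+*/=()]"),
    };

    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        while (offset < text.length()) {
            boolean matched = false;
            for (int i = 0; i < TOKENS.length && !matched; i++) {
                Matcher m = TOKENS[i].matcher(text).region(offset, text.length());
                if (m.lookingAt()) {
                    if (i != 0) {
                        out.add(m.group());
                    }
                    offset = m.end(); // advance only on success
                    matched = true;
                }
                // On failure the offset is untouched, so the next pattern
                // is tried at the same position.
            }
            if (!matched) {
                throw new IllegalArgumentException(
                    "Unexpected character at offset " + offset);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = 42+y"));
    }
}
```
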
daniel