In <085601c01cc8$2c94f390$[EMAIL PROTECTED]>, "mike mulligan" writes:
:From: Hugo <[EMAIL PROTECTED]>
:Sent: Monday, September 11, 2000 11:59 PM
:
:
:> mike mulligan replied to Peter Heslin:
:> : ... it is greedy in the sense of the forward matching "*" or "+"
:> : constructs.
:> : [snip]
:>
:> This is nothing to do with greediness and everything to do with
:> left-to-rightness. The regexp engine does not look for x* except
:> in those positions where the lookbehind has already matched.
:
:I was trying to understand at what point the lookbehind was attempted, and
:confused myself and posted a bad example.  My apologies to everyone.  Let's
:see if I can make sense of it on a second try.
:
:My question is: if I have the regex  /(?<=[aeiou])X[yz]/  then does Perl: 1.
:scan first for 'X', test the lookbehind, and then test the '[yz]',  or 2.
:scan for 'X[yz]' and then test the lookbehind?

3. The regexp is matched left to right: first the lookbehind, then 'X',
then '[yz]'.
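
As a small illustration of the result (not of the engine internals), the
sketch below runs the pattern from the test further down against the same
string and reports where the match lands. The variable names are just for
this example. Because the lookbehind is zero-width, the matched text starts
at the 'X', not at the vowel before it.

```perl
use strict;
use warnings;

my $str = 'aXuhXvoXyz';

# The lookbehind constrains where the match may start, but it
# contributes nothing to the matched text itself.
my ($start, $end, $matched);
if ($str =~ /(?<=[aeiou])X[yz]/) {
    ($start, $end) = ($-[0], $+[0]);    # offsets of the whole match
    $matched = substr($str, $start, $end - $start);
    printf "matched '%s' at offsets %d..%d\n", $matched, $start, $end;
}
```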

:I am expecting these two alternatives to give the same result, but certain
:test strings might run slower or faster depending on the approach.
:
:Running perl -Dr shows that alternative 1 is used:

Running perl -Dr shows that alternative 3 is used. However, the -Dr output
is muddied by the optimiser, which happens to have chosen the fixed
string 'X' as something worth searching for first. So the optimiser
lets the main matching engine try only those positions that are
immediately followed by an 'X'.

I've annotated the -Dr output below to try and clarify. Note that if
you replace 'X' with '(x|X)', no optimisations take place (other than
a 'minimum length' check) and -Dr will give a much clearer picture of
the flow; again, if you replace 'X[yz]' with '(x|X)y' the optimiser
will now pick 'y' as the most significant thing worth searching for.
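
If your perl was not built with -DDEBUGGING, -Dr will not be available; the
're' pragma gives a similar trace on STDERR. The sketch below is one way to
compare the original pattern with the '(x|X)' variant. The exact output
format, and what the optimiser decides to do with each pattern, varies
between Perl versions, so treat the trace as something to read rather than
something to rely on. The variable names are only for this example.

```perl
use strict;
use warnings;
use re 'debug';    # trace regex compilation and matching on STDERR

my $str = 'aXuhXvoXyz';

# Original pattern: the optimiser can search for the fixed string 'X'.
my $fixed = $str =~ /(?<=[aeiou])X[yz]/ ? 1 : 0;

# Variant from the text above: '(x|X)' in place of 'X'.
my $alt = $str =~ /(?<=[aeiou])(x|X)[yz]/ ? 1 : 0;

print "fixed 'X': ", ($fixed ? "match" : "no match"), "\n";
print "'(x|X)':   ", ($alt   ? "match" : "no match"), "\n";
```

Both patterns match the same text; only the route the engine takes to get
there should differ.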

Hope this helps,

Hugo
---
:qq(aXuhXvoXyz) =~ /(?<=[aeiou])X[yz]/
:
:Guessing start of match, REx `(?<=[aeiou])X[yz]' against `aXuhXvoXyz'...

The optimiser is entered.

:Found anchored substr `X' at offset 1...

This is what the optimiser is looking for.

:Guessed: match at offset 1

This is what the optimiser found.

:Matching REx `(?<=[aeiou])X[yz]' against `XuhXvoXyz'

The real matcher is entered.

:  Setting an EVAL scope, savestack=3
:   1 <a> <XuhXvoXyz>      |  1:  IFMATCH[-1]
:   0 <> <aXuhXvoXyz>      |  3:    ANYOF[aeiou]

Checking lookbehind ...

:   1 <a> <XuhXvoXyz>      | 12:    SUCCEED

Ok.

:                              could match...
:   1 <a> <XuhXvoXyz>      | 14:  EXACT <X>

Checking 'X' ...

:   2 <aX> <uhXvoXyz>      | 16:  ANYOF[yz]

Checking '[yz]' ...

:                            failed...

Failed: try the next position permitted by the optimiser.

:  Setting an EVAL scope, savestack=3
:   4 <aXuh> <XvoXyz>      |  1:  IFMATCH[-1]
:   3 <aXu> <hXvoXyz>      |  3:    ANYOF[aeiou]

Checking lookbehind ...

:                              failed...

Failed.

:                            failed...

Try the next position permitted by the optimiser.

:  Setting an EVAL scope, savestack=3
:   7 <aXuhXvo> <Xyz>      |  1:  IFMATCH[-1]
:   6 <aXuhXv> <oXyz>      |  3:    ANYOF[aeiou]

Checking lookbehind ...

:   7 <aXuhXvo> <Xyz>      | 12:    SUCCEED

Ok.

:                              could match...
:   7 <aXuhXvo> <Xyz>      | 14:  EXACT <X>

Checking 'X' ...

:   8 <aXuhXvoX> <yz>      | 16:  ANYOF[yz]

Checking '[yz]' ...

:   9 <aXuhXvoXy> <z>      | 25:  END
:Match successful!

Match successful.
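
The flow in the annotated trace can be imitated by hand. The sketch below
is only a model of this one trace, not of how the regexp engine is actually
implemented: index() plays the part of the optimiser by finding each 'X',
and the loop body then checks the lookbehind, the 'X' and the '[yz]' in
that order. The names @tried and $match_at are made up for the example.

```perl
use strict;
use warnings;

my $str = 'aXuhXvoXyz';
my @tried;       # offsets the stand-in "optimiser" hands to the matcher
my $match_at;    # offset at which the match finally succeeds

my $from = 0;
while ((my $pos = index($str, 'X', $from)) >= 0) {
    push @tried, $pos;
    $from = $pos + 1;

    # 1. Lookbehind: the character before $pos must be a vowel.
    next unless $pos > 0
        && substr($str, $pos - 1, 1) =~ /^[aeiou]$/;

    # 2. EXACT <X>: true by construction, since index() found an 'X' here.

    # 3. ANYOF[yz]: the character after the 'X'.
    next unless $pos + 1 < length($str)
        && substr($str, $pos + 1, 1) =~ /^[yz]$/;

    $match_at = $pos;
    last;
}

printf "candidates tried: %s\n", join(', ', @tried);
printf "match at offset: %s\n", defined $match_at ? $match_at : 'none';
```

The candidates tried are offsets 1, 4 and 7, the same three positions at
which IFMATCH is entered in the trace above.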
