Re: puzzling '.{1,4}'

tom arnall Mon, 26 Jun 2006 11:40:23 -0700

 On Monday 26 June 2006 10:49 am, Peter Cornelius wrote:
> I think I can parse this regex:
>
> qr/x          #match one 'x'
>       .{1,4}  #followed by no less than one but no more than four of anything
>       \d\d     #followed by 2 digits.
>       x|a      #followed by an 'x' or an 'a'


aren't the alternates (1) everything to the left of '|' and (2) 'a'? thus - to 
take another example - the following:

        $_ = 'abcde';
        print /(abx|c)/;

produces:

        c

not:

        abc
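
to make the point concrete, here is a small sketch of the two readings side by side. the variable names and the grouped pattern are mine, added only for contrast; they are not from Peter's regex:

```perl
use strict;
use warnings;

my $text = 'abcde';

# top-level alternation: the two branches are 'abx' and 'c'
my ($whole) = $text =~ /(abx|c)/;

# grouped alternation: 'ab' followed by either 'x' or 'c'
my ($grouped) = $text =~ /(ab(?:x|c))/;

print "$whole\n";      # c
print "$grouped\n";    # abc
```

so 'abc' only comes out when the alternation is explicitly grouped.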



        
>      /x;
>
> Your string looks like
>
> " x11x  x22x a"
>
> So if we march through the above:
>     match 'x'
>     1-4 of anything followed by 2 digits, so...
>     (x1)1x<-- Second one isn't a digit


>     (x11)x<-- First one isn't a digit
>     (x11x) <-- Space isn't a digit
>     (x11x )x<--Hmm, x isn't a digit either, that's the end of our
> 1-4 matches of anything
>
> If you keep marching through the string like this you'll see that the
> first match is (the part in the parens)
> x11(x x22)x a
>
> I think you would want to either include '0' in the {} or get rid of
> that part altogether from looking at your test data.


eeek! the '1' in {1,4} is the blunderous killer. thanks for helping me see it. 
{0,4} gives the desired result.
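
a quick sketch of the difference, assuming the pattern as reconstructed from Peter's breakdown above (the capturing parens and variable names are mine):

```perl
use strict;
use warnings;

my $target = " x11x  x22x a";

# {1,4} demands at least one character between the 'x' and the two digits
my ($one_to_four) = $target =~ /(x.{1,4}\d\dx|a)/;

# {0,4} lets the two digits follow the 'x' directly
my ($zero_to_four) = $target =~ /(x.{0,4}\d\dx|a)/;

print "[$one_to_four]\n";     # [x  x22x]
print "[$zero_to_four]\n";    # [x11x]
```

with {1,4} the first 'x' can never be followed directly by '11', so the match has to start at the second 'x'.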

do you have any idea why:

        $_ = " x11x  x22x a ";

        $re1 = qr/x.*?\d\dx|a/;
        $re2 = qr/($re1\s)?$re1/;
        ($_) = /($re2)/;
        print $_;

doesn't produce 'x11x' ? (note btw that if you insert '\n' between the first 
two tokens of the target string, the result does become 'x11x'. note also 
that if you drop '|a' from $re1 you also get 'x11x'.)
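
for reference, here are the three cases in one runnable sketch. the variable names are mine, and i'm assuming the '\n' goes directly after the first 'x11x':

```perl
use strict;
use warnings;

my $re1 = qr/x.*?\d\dx|a/;
my $re2 = qr/($re1\s)?$re1/;

# same pattern without the '|a' branch
my $re1_no_a = qr/x.*?\d\dx/;
my $re2_no_a = qr/($re1_no_a\s)?$re1_no_a/;

# list context returns the captures; the first is the whole match
my ($plain)   = " x11x  x22x a "   =~ /($re2)/;
my ($newline) = " x11x\n  x22x a " =~ /($re2)/;
my ($no_a)    = " x11x  x22x a "   =~ /($re2_no_a)/;

print "[$plain]\n";      # [x11x  x22x a]
print "[$newline]\n";    # [x11x]
print "[$no_a]\n";       # [x11x]
```

one way to read the first result: after 'x11x' and one space the second $re1 can't match at the next space, so the engine backtracks and lets '.*?' stretch as far as 'x22x'. the optional group then takes 'x11x  x22x ' and the trailing $re1 matches the 'a'. a '\n' blocks the stretch because '.' doesn't match newline by default.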

i read this example as follows:

        $re1 = qr/
        x               #find an 'x'
        .*?             #find whatever of whatever length
        \d\d            #find two digits
        x               #find an 'x'
        |               #or, instead of all the foregoing,
        a               #find an 'a'
        /x;
        
        $re2 = qr/
        (
        $re1            #find $re1
        \s              #and whitespace
        )?              #or maybe none of the foregoing
        $re1            #find for sure $re1
                        #in sum, find $re1 possibly preceded by $re1+whitespace
        /x;

does this sound right?
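
one thing this reading depends on is that each $re1 keeps its '|a' to itself inside $re2. a qr// object is interpolated as its own group, so that holds here. the sketch below contrasts it with a hypothetical variant (not in my original code) where the same pattern text sits in a plain string and the '|' leaks out to the top level:

```perl
use strict;
use warnings;

my $target = " x11x  x22x a ";

# qr// object: interpolated as a self-contained group, '|a' stays inside
my $re1_obj = qr/x.*?\d\dx|a/;
my $re2_obj = qr/($re1_obj\s)?$re1_obj/;
my ($from_qr) = $target =~ /($re2_obj)/;

# hypothetical variant: plain string, pasted in as raw text, so $re2_str
# is effectively  (x.*?\d\dx|a\s)?x.*?\d\dx|a
my $re1_str = 'x.*?\d\dx|a';
my $re2_str = qr/($re1_str\s)?$re1_str/;
my ($from_str) = $target =~ /($re2_str)/;

print "[$from_qr]\n";     # [x11x  x22x a]
print "[$from_str]\n";    # [x11x]
```

so with qr// the reading "find $re1 possibly preceded by $re1+whitespace" seems to hold as written, with each $re1 meaning "either x...\d\dx or a".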

thanks in advance,

tom arnall





