On Monday 26 June 2006 10:49 am, Peter Cornelius wrote:
> I think I can parse this regex:
>
> qr/x #match one 'x'
> .{1,4} #followed by no less than one but no more than four of anything
> \d\d #followed by 2 digits.
> x|a #followed by an 'x' or an 'a'
aren't the alternates (1) everything to the left of '|' and (2) 'a'? thus - to
take another example - the following:
$_ = 'abcde';
print /(abx|c)/;
produces:
c
not:
abc
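
a minimal sketch of the same point, assuming nothing beyond core perl: '|' separates everything inside the enclosing group, so a smaller group is needed to limit it to single characters. the helper name first_capture is made up for illustration.

```perl
use strict;
use warnings;

# return the text captured by the first group, or undef when nothing matches
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my $sample = 'abcde';

# the alternatives here are 'abx' and 'c'
print first_capture(qr/(abx|c)/, $sample), "\n";        # c

# a non-capturing group limits the alternation to 'x' or 'c'
print first_capture(qr/(ab(?:x|c))/, $sample), "\n";    # abc
```

so to get "an 'x' or an 'a'" at the end of a longer pattern, the pair would have to be written as (?:x|a) or as the character class [xa].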
> /x;
>
> Your string looks like
>
> " x11x x22x a"
>
> So if we march through the above:
> match 'x'
> 1-4 of anything followed by 2 digits, so...
> (x1)1x<-- Second one isn't a digit
> (x11)x<-- First one isn't a digit
> (x11x) <-- Space isn't a digit
> (x11x )x<--Hmm, x isn't a digit either, that's the end of our
> 1-4 matches of anything
>
> If you keep marching through the string like this you'll see that the
> first match is (the part in the parens)
> x11(x x22)x a
>
> I think you would want to either include '0' in the {} or get rid of
> that part altogether from looking at your test data.
eeek! the '1' in {1,4} is the blunderous killer. thanks for helping me see it.
{0,4} gives the desired result.
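
a quick sketch comparing the two quantifiers on the target string quoted above. the helper all_matches and the variable names are made up for illustration, and the regex is written without the /x comments.

```perl
use strict;
use warnings;

# return every match of $re in $text, in order of appearance
sub all_matches {
    my ($re, $text) = @_;
    my @found = $text =~ /($re)/g;
    return @found;
}

my $target = ' x11x x22x a';

my $one_to_four  = qr/x.{1,4}\d\dx|a/;
my $zero_to_four = qr/x.{0,4}\d\dx|a/;

print join(', ', map { "[$_]" } all_matches($one_to_four,  $target)), "\n";  # [x x22x], [a]
print join(', ', map { "[$_]" } all_matches($zero_to_four, $target)), "\n";  # [x11x], [x22x], [a]
```

with {1,4} at least one character has to sit between the first 'x' and the two digits, so 'x11x' can never match on its own.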
do you have any idea why:
$_ = " x11x x22x a ";
$re1 = qr/x.*?\d\dx|a/;
$re2 = qr/($re1\s)?$re1/;
($_) = /($re2)/;
print $_;
doesn't produce 'x11x' ? (note btw that if you insert '\n' between the first
two tokens of the target string, the result does become 'x11x'. note also
that if you drop '|a' from $re1 you also get 'x11x'.)
i read this example as follows:
$re1 = qr/
x #find an 'x'
.*? #find whatever of whatever length
\d\d #find two digits
x #find an 'x'
| #or, instead of all the foregoing,
a #find an 'a'
/x;
$re2 = qr/
(
$re1 #find $re1
\s #and whitespace
)? #or maybe none of the foregoing
$re1 #find for sure $re1
#in sum, find $re1 possibly preceded by $re1+whitespace
/x;
does this sound right?
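
one way to check this reading is to print what each group captured. this is only a sketch: the helper name captures is made up, the first string is the target as it appears above, and the second string (two spaces between the first two tokens) is a hypothetical variant that is not taken from the post.

```perl
use strict;
use warnings;

my $re1 = qr/x.*?\d\dx|a/;
my $re2 = qr/($re1\s)?$re1/;

# return the whole match and the optional group's capture;
# the second value is undef when the optional group was skipped
sub captures {
    my ($text) = @_;
    return $text =~ /($re2)/ ? ($1, $2) : ();
}

for my $text (' x11x x22x a ', ' x11x  x22x a ') {
    my ($whole, $prefix) = captures($text);
    $prefix = '' unless defined $prefix;
    print "whole=[$whole] prefix=[$prefix]\n";
}
# whole=[x11x x22x] prefix=[x11x ]
# whole=[x11x  x22x a] prefix=[x11x  x22x ]
```

in the second case the engine backtracks into the optional group: '.*?' stretches across the first token until it reaches '22x', and the trailing 'a' then satisfies the final $re1.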
thanks in advance,
tom arnall