On Monday 26 June 2006 10:49 am, Peter Cornelius wrote:
> I think I can parse this regex:
>
> qr/x         #match one 'x'
>    .{1,4}    #followed by no less than one but no more than four of anything
>    \d\d      #followed by 2 digits.
>    x|a       #followed by an 'x' or an 'a'

aren't the alternates (1) everything to the left of '|' and (2) 'a'? thus - to
take another example - the following:

    $_ = 'abcde';
    print /(abx|c)/;

produces:

    c

not:

    abc

>    /x;
>
> Your string looks like
>
> " x11x x22x a"
>
> So if we march through the above:
> match 'x'
> 1-4 of anything followed by 2 digits, so...
> (x1)1x    <-- Second one isn't a digit
> (x11)x    <-- First one isn't a digit
> (x11x)    <-- Space isn't a digit
> (x11x )x  <-- Hmm, x isn't a digit either, that's the end of our
>               1-4 matches of anything
>
> If you keep marching through the string like this you'll see that the
> first match is (the part in the parens)
> x11(x x22)x a
>
> I think you would want to either include '0' in the {} or get rid of
> that part altogether from looking at your test data.

eeek! the '1' in {1,4} is the blunderous killer. thanks for helping me see it.
{0,4} gives the desired result.

do you have any idea why:

    $_ = " x11x x22x a ";
    $re1 = qr/x.*?\d\dx|a/;
    $re2 = qr/($re1\s)?$re1/;
    ($_) = /($re2)/;
    print $_;

doesn't produce 'x11x'? (note btw that if you insert '\n' between the first
two tokens of the target string, the result does become 'x11x'. note also
that if you drop '|a' from $re1 you also get 'x11x'.)

i read this example as follows:

    $re1 = qr/
        x       #find an 'x'
        .*?     #find whatever of whatever length
        \d\d    #find two digits
        x       #find an 'x'
        |       #or, instead of all the foregoing,
        a       #find an 'a'
    /x;

    $re2 = qr/
        (
            $re1    #find $re1
            \s      #and whitespace
        )?          #or maybe none of the foregoing
        $re1        #find for sure $re1
        #in sum, find $re1 possibly preceded by $re1+whitespace
    /x;

does this sound right?

thanks in advance,

tom arnall
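
One way to check the reading above is to run the patterns and print what they actually capture. The following is only a minimal sketch, assuming a stock perl 5. The names $alt and first_match() are illustrative helpers and are not from the thread. It relies on two documented behaviours: '|' has the lowest precedence inside a pattern, and a qr// object keeps its own grouping when interpolated, so the '|a' stays inside each copy of $re1.

```perl
use strict;
use warnings;

# Alternation binds loosest: /abx|c/ means "abx" OR "c", not "ab" then (x|c).
my ($alt) = 'abcde' =~ /(abx|c)/;
printf "alternation: '%s'\n", $alt;

# Same patterns as in the question. Each interpolated $re1 is wrapped in
# its own non-capturing group, so '|a' does not leak into $re2.
my $re1 = qr/x.*?\d\dx|a/;
my $re2 = qr/($re1\s)?$re1/;

# Hypothetical helper: return the first substring of $target matched by $re2.
sub first_match {
    my ($target) = @_;
    my ($match) = $target =~ /($re2)/;
    return $match;
}

# First the original target string, then the variant with "\n" inserted
# after the first token.
for my $target (" x11x x22x a ", " x11x\n x22x a ") {
    printf "matched: '%s'\n", first_match($target);
}
```

The '?' after the group is greedy, so the engine first tries to match "$re1 plus whitespace" and only gives the group up if the rest of the pattern cannot match right after it. With the original string the group takes "x11x " and the trailing $re1 then takes "x22x", so the capture is 'x11x x22x'. With the newline inserted, the position after "\n" is a space, the trailing $re1 cannot start there, and the engine falls back to the bare 'x11x'.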