Re: escaping regex to do math on backreferences
Chas. Owens wrote:
> On Sun, Apr 12, 2009 at 21:58, Gunnar Hjalmarsson nore...@gunnar.cc wrote:
>> Chas. Owens wrote:
>>> my @rank = qw/ 2 3 4 5 6 7 8 9 10 J Q K A /;
>>
>> my @rank = qw/A 2 3 4 5 6 7 8 9 10 J Q K A /;
>> --^
> snip
>
> That depends on who you play with.

Ok.

> Also, if you make that change you need to check the for loop as well:
>
> for my $i (0 .. 10) {

Actually no.

$ perl -wle '
@rank = qw/A 2 3 4 5 6 7 8 9 10 J Q K A/;
print map $_."[cdhs]", @rank[10..10+4];
'
Use of uninitialized value $_ in concatenation (.) or string at -e line 3.
J[cdhs]Q[cdhs]K[cdhs]A[cdhs][cdhs]
$

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
Re: escaping regex to do math on backreferences
On Mon, Apr 13, 2009 at 06:12, Gunnar Hjalmarsson nore...@gunnar.cc wrote:
snip
>> Also, if you make that change you need to check the for loop as well:
>>
>> for my $i (0 .. 10) {
>
> Actually no.
>
> $ perl -wle '
> @rank = qw/A 2 3 4 5 6 7 8 9 10 J Q K A/;
> print map $_."[cdhs]", @rank[10..10+4];
> '
> Use of uninitialized value $_ in concatenation (.) or string at -e line 3.
snip

Ahha, my original had the same off-by-one error. How did I miss that?

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
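One way to avoid the off-by-one discussed above is to derive the loop bound from the size of the rank list instead of hard-coding it. The sketch below also wraps the alternation in (?:...) so that ^ and $ apply to every alternative rather than only the first and last. It is only an illustration, not the code anyone in the thread posted: the helper name is_straight() is made up for this example, and it assumes a five-card hand in which an ace-low straight is written with the ace first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ranks in ascending order; the leading A lets A-2-3-4-5 count as a straight.
my @rank = qw/ A 2 3 4 5 6 7 8 9 10 J Q K A /;

# Build one alternative per window of five consecutive ranks.
# The last valid start index is $#rank - 4, so no window runs off the end.
my @hands;
for my $i (0 .. $#rank - 4) {
    push @hands, join '', map { $_ . '[cdhs]' } @rank[$i .. $i + 4];
}

# Group the alternation so the anchors bind to the whole set of
# alternatives, not just to the first and the last one.
my $alternatives = join '|', @hands;
my $straight_re  = qr/^(?:$alternatives)$/;

# is_straight() is a hypothetical helper name used only in this sketch.
sub is_straight {
    my ($hand) = @_;
    return $hand =~ $straight_re ? 1 : 0;
}

for my $hand (qw/ 9d10cJhQsKd 8c10cJhQsKd 10cJhQsKdAs /) {
    printf "%s %s a straight\n", $hand, is_straight($hand) ? 'is' : "isn't";
}
```

Because the bound is computed from $#rank, adding or removing the leading A does not require touching the loop.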
escaping regex to do math on backreferences
Hello everyone, I have a program that needs to find straights in a hand of cards. The hand is a string with no whitespace sorted by the cards' ranks, eg 9d10cJhQsKd. How can I identify if that hand contains a straight with a single regex? Is that even possible? Is there a way to escape the regex and do addition on a backreference to get something like /(\d+)[cdhs]\1+1[cdhs]\1+2/? Thanks, -Andrew
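For the literal question of doing arithmetic on a backreference, Perl's postponed sub-expression, (??{ code }), documented in perlre, may come close. The sketch below is an assumption-laden illustration only: it handles numeric ranks (J, Q, K and A would need a lookup table), it checks a run of three cards to mirror the pattern in the question, and has_numeric_run() is a name invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# (??{ ... }) runs Perl code during the match and uses the result as a
# sub-pattern, so "$1 + 1" becomes the literal number to match next.
my $run_re = qr/(\d+)[cdhs](??{ $1 + 1 })[cdhs](??{ $1 + 2 })[cdhs]/;

# has_numeric_run() is a hypothetical helper used only in this sketch.
sub has_numeric_run {
    my ($hand) = @_;
    return $hand =~ $run_re ? 1 : 0;
}

for my $hand (qw/ 7d8c9h 7d8c10h 9d10c11h /) {
    printf "%-9s %s\n", $hand, has_numeric_run($hand) ? 'run of three' : 'no run';
}
```

Whether this is preferable to building the alternatives up front is a judgment call; the embedded code makes the pattern harder to read and to reuse.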
Re: escaping regex to do math on backreferences
Andrew Fithian wrote: I have a program that needs to find straights in a hand of cards. Only straights? The hand is a string with no whitespace sorted by the cards' ranks, eg 9d10cJhQsKd. How can I identify if that hand contains a straight with a single regex? Why on earth would you want to do that? Is that even possible? I doubt it. I suggest you check out the Games::Poker::HandEvaluator module at CPAN. -- Gunnar Hjalmarsson Email: http://www.gunnar.cc/cgi-bin/contact.pl -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: escaping regex to do math on backreferences
On Sun, Apr 12, 2009 at 18:34, Andrew Fithian afit...@gmail.com wrote:
> Hello everyone,
>
> I have a program that needs to find straights in a hand of cards. The hand is a string with no whitespace sorted by the cards' ranks, eg 9d10cJhQsKd. How can I identify if that hand contains a straight with a single regex? Is that even possible? Is there a way to escape the regex and do addition on a backreference to get something like /(\d+)[cdhs]\1+1[cdhs]\1+2/?
>
> Thanks,
> -Andrew

It is easier to just build it:

#!/usr/bin/perl

use strict;
use warnings;

my @rank = qw/ 2 3 4 5 6 7 8 9 10 J Q K A /;

my @hands;
for my $i (0 .. 9) {
    push @hands, join '', map { $_ . "[cdhs]" } @rank[$i .. $i+4];
}

my $re = join '|', @hands;
$re = qr/^$re$/;

for my $s (qw/ 9d10cJhQsKd 8c10cJhQsKd /) {
    print "$s ", $s =~ /$re/ ? "is" : "isn't", " a straight\n";
}

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
Re: escaping regex to do math on backreferences
Chas. Owens wrote: my @rank = qw/ 2 3 4 5 6 7 8 9 10 J Q K A /; my @rank = qw/A 2 3 4 5 6 7 8 9 10 J Q K A /; --^ -- Gunnar Hjalmarsson Email: http://www.gunnar.cc/cgi-bin/contact.pl -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: escaping regex to do math on backreferences
On Sun, Apr 12, 2009 at 21:58, Gunnar Hjalmarsson nore...@gunnar.cc wrote: Chas. Owens wrote: my @rank = qw/ 2 3 4 5 6 7 8 9 10 J Q K A /; my @rank = qw/A 2 3 4 5 6 7 8 9 10 J Q K A /; --^ snip That depends on who you play with. Also, if you make that change you need to check the for loop as well: for my $i (0 .. 10) { -- Chas. Owens wonkden.net The most important skill a programmer can have is the ability to read. -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
[regexp] Warnings on Backreferences
Hello All, I'm using '-w' like any good hacker, but every time I try to use backreferences in my regexps, I get a warning \1 better written as $1 at I'm confused because, according to perlretut: Although $1 and \1 represent the same thing, care should be taken to use matched variables $1, $2, ... only outside a regexp and backreferences \1, \2, ... only inside a regexp; not doing so may lead to surprising and/or undefined results. The other source of relevant information seems to be here: http://www.perl.com/doc/manual/html/pod/perlre.html#WARNING_on_1_vs_1 However, I'm having trouble understanding if it is referring to using backreferences in general, or to a particular case where using \1 instead of $1 is a bad idea. Here is an example of one of my regexps that produces this warning: $text =~ s!(.*?)(\()(.*?)(\))!a\ href=\\3\ alt=\\3\\1\/a!g; Should I be using $1 and $3 instead of \1 and \3 in this case, and if so, why? Thanks in advance, Adam -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
Adam W wrote:
> Hello All,

Hello,

> I'm using '-w' like any good hacker, but every time I try to use backreferences in my regexps, I get a warning
>
> \1 better written as $1 at
>
> I'm confused because, according to perlretut:
>
> Although $1 and \1 represent the same thing, care should be taken to use matched variables $1, $2, ... only outside a regexp and backreferences \1, \2, ... only inside a regexp; not doing so may lead to surprising and/or undefined results.
>
> The other source of relevant information seems to be here:
> http://www.perl.com/doc/manual/html/pod/perlre.html#WARNING_on_1_vs_1
>
> However, I'm having trouble understanding if it is referring to using backreferences in general, or to a particular case where using \1 instead of $1 is a bad idea.
>
> Here is an example of one of my regexps that produces this warning:
>
> $text =~ s!(.*?)(\()(.*?)(\))!<a\ href=\"\3\" alt=\"\3\">\1<\/a>!g;
>
> Should I be using $1 and $3 instead of \1 and \3 in this case, and if so, why?

The first part of s/// is a regular expression, so you have to use \1, \2, \3, etc. However, the second part of s/// is a double-quoted string, so it is preferred that you use $1, $2, $3, etc., because in a double-quoted string the escape sequences \1, \2, \3, etc. are usually used as octal codes for characters.

$ perl -le'print "\120\145\162\154"'
Perl

BTW, why capture $2 and $4 if you are not using them and why is everything backslashed?

$text =~ s!(.*?)\((.*?)\)!<a href="$2" alt="$2">$1</a>!g;

John
-- 
use Perl;
program
fulfillment
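The distinction can be seen in a small, self-contained sketch. The sample text and variable names are made up for illustration: \1 appears only inside the pattern, $1 only in the replacement, and the last lines show the octal-escape reading of a backslash followed by digits in a double-quoted string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inside the pattern, \1 means "whatever group 1 matched".
# In the replacement (a double-quoted string), $1 is the variable to use.
my $text = 'this is is a test test';
(my $fixed = $text) =~ s/\b(\w+) \1\b/$1/g;
print "$fixed\n";

# In a double-quoted string, \120 is an octal escape for a character,
# not a backreference.
my $octal = "\120\145\162\154";
print "$octal\n";
```

Here the substitution collapses each doubled word into one, and the octal string spells out a familiar four-letter word.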
Re: [regexp] Warnings on Backreferences
John W. Krahn wrote: Adam W wrote: Here is an example of one of my regexps that produces this warning: $text =~ s!(.*?)(\()(.*?)(\))!a\ href=\\3\ alt=\\3\\1\/a!g; BTW, why capture $2 and $4 if you are not using them and why is everything backslashed? Since I'm relatively new to the language, most of my regexps are more explicit than is necessary. As for the backslashing, I was under the impression that spaces needed to be backslashed, but now I know that it is not necessary in this context. $text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; Thanks for the help and the more streamlined regexp. Adam -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
$text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; Thanks for the help and the more streamlined regexp. An even better way (see O'reilley's Perl Best Practices by Damian Conway - buy this book you will write better code) Is to make it extremely readable with xms :) Same exact regex as above: $test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; Just a .02 via an FYI :) -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
JupiterHost.Net wrote: $text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; Thanks for the help and the more streamlined regexp. An even better way (see O'reilley's Perl Best Practices by Damian Conway - buy this book you will write better code) Is to make it extremely readable with xms :) Same exact regex as above: $test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; Just a .02 via an FYI :) That looks pretty cool. Using 'x' allows whitespace use, correct? And 'm' and 's' are ways of telling Perl how to interpret a line, right? Can you tell me what the function of the square-brackets are for regexps? How are they different than regular parens? Thanks, Adam -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
JupiterHost.Net wrote: Adam W wrote: John W. Krahn wrote: $text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; Thanks for the help and the more streamlined regexp. An even better way (see O'reilley's Perl Best Practices by Damian Conway - buy this book you will write better code) Is to make it extremely readable with xms :) Same exact regex as above: $test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; Adding the /s option changes what the regular expression will match which may not be what the OP wants. John -- use Perl; program fulfillment -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
Adam W wrote:
> JupiterHost.Net wrote:
>>>> $text =~ s!(.*?)\((.*?)\)!<a href="$2" alt="$2">$1</a>!g;
>>>
>>> Thanks for the help and the more streamlined regexp.
>>
>> An even better way (see O'Reilly's Perl Best Practices by Damian Conway; buy this book, you will write better code) is to make it extremely readable with xms :)
>>
>> Same exact regex as above:
>>
>> $test =~ s{ (.*?) [(] (.*?) [)] }
>>           {<a href="$2" alt="$2">$1</a>}xmsg;
>>
>> Just a .02 via an FYI :)
>
> That looks pretty cool. Using 'x' allows whitespace use, correct?

Correct.

> And 'm' and 's' are ways of telling Perl how to interpret a line, right?

The /m option defines what the ^ and $ anchors match, but you aren't using those anchors. The /s option defines what . matches, so your regular expression will match something different than before.

> Can you tell me what the function of the square-brackets are for regexps? How are they different than regular parens?

'[' and ']' define a character class, but you don't really need a character class in your example.

perldoc perlre

John
-- 
use Perl;
program
fulfillment
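A short sketch may make the effect of each modifier concrete. The sample strings and variable names are invented for this example; each pair of matches differs only in the modifier or construct being compared.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without /s, . does not match a newline, so .* stops at the end of line one.
my ($no_s) = $text =~ /(.*)/;
# With /s, . also matches newlines, so .* runs to the end of the string.
my ($with_s) = $text =~ /(.*)/s;

# Without /m, ^ matches only at the very start of the string.
my @no_m = $text =~ /^(\w+)/g;
# With /m, ^ also matches right after each embedded newline.
my @with_m = $text =~ /^(\w+)/mg;

# [(] and \( both match one literal parenthesis; /x ignores the spaces.
my ($in_class) = 'call(arg)' =~ / [(] (\w+) [)] /x;
my ($escaped)  = 'call(arg)' =~ / \( (\w+) \) /x;

printf "without /s: %d chars, with /s: %d chars\n", length $no_s, length $with_s;
printf "without /m: %d match(es), with /m: %d match(es)\n", scalar @no_m, scalar @with_m;
print "character class and escape agree\n" if $in_class eq $escaped;
```

So adding /s to a pattern that uses . can change what it matches on multi-line input, while adding /m has no effect unless ^ or $ is present.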
Re: [regexp] Warnings on Backreferences
$test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; Just a .02 via an FYI :) That looks pretty cool. Using 'x' allows whitespace use, correct? Correct. And 'm' and 's' are ways of telling Perl how to interpret a line, right? The /m option defines what the ^ and $ anchors match but you aren't using those anchors. The /s option defines what . matches so your regular expression will match something different than before. Good catch :) Although Best Practice recommends xms all the time so that you get used to writing it/writing for it. Sorry I forgot to mention th . difference Can you tell me what the function of the square-brackets are for regexps? How are they different than regular parens? '[' and ']' define a character class, but you don't really need a character class in your example. Again, just recommending Best Practice :) perldoc perlre -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
John W. Krahn am Dienstag, 7. März 2006 00.12: Adam W wrote: JupiterHost.Net wrote: $text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; [...] Same exact regex as above: $test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; [...] Can you tell me what the function of the square-brackets are for regexps? How are they different than regular parens? '[' and ']' define a character class, but you don't really need a character class in your example. Adam, just to sum up: $test =~ s{ (.*?) \( (.*?) \) } {a href=$2 alt=$2$1/a}xsg; - \( instead of [(]: more readable - no /m modifier : unnecessary without ^/$-anchors - /s : may be appropriate for your html source text :-) Hans -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: [regexp] Warnings on Backreferences
Hans Meier (John Doe) wrote: John W. Krahn am Dienstag, 7. März 2006 00.12: Adam W wrote: JupiterHost.Net wrote: $text =~ s!(.*?)\((.*?)\)!a href=$2 alt=$2$1/a!g; [...] Same exact regex as above: $test =~ s{ (.*?) [(] (.*?) [)] } {a href=$2 alt=$2$1/a}xmsg; [...] Can you tell me what the function of the square-brackets are for regexps? How are they different than regular parens? '[' and ']' define a character class, but you don't really need a character class in your example. just to sum up: $test =~ s{ (.*?) \( (.*?) \) } {a href=$2 alt=$2$1/a}xsg; - \( instead of [(]: more readable And more efficient. John -- use Perl; program fulfillment -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
On Mar 9, 2004, at 9:00 PM, Stuart White wrote: That got me started. I do have a question though about your regex. Good, better to ask and know, I think. Let's see if we can clear it up... This backreference is $1, and matches the team abbreviation. ([A-Z0-9 -]+) This one is $2, and matches the first name: (\w+) Full marks to here. You're right and right. And this one I'd think should be $3, and match Steal or Assist etc., but the results don't say that. (?:Steal|Assist|Block|replaced by) Stuart, I love your questions. ;) You're always missing one tiny piece of knowledge and you always ask them in such a way that I know exactly what it is. Here's the missing piece this time: You see that the above is surrounded by ( ) and you think that means it should capture. The truth is that the above is surrounded by (?: ), which happens to be ( )'s cousin. It groups things together, like ( ), but it does not capture. I had to cluster them so the |s would work, but I didn't need to hang onto the results, so I chose (?: ) over ( ). You could do it just fine with normal parenthesis, if you prefer. If you do, you just have to remember that your third answer is in $4, because $3 is holding some junk. Take your pick. Instead, $3 is this, and matches the second name. (\w+) If backreferences are supposed to be in parentheses, why isn't this (?:Steal|Assist|Block|replaced by) a backreference? I'm hoping this makes sense now. There are only three captures in my regex. (\w+) is the third. Can you see that now? James -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
--- James Edward Gray II [EMAIL PROTECTED] snip And this one I'd think should be $3, and match Steal or Assist etc., but the results don't say that. (?:Steal|Assist|Block|replaced by) Stuart, I love your questions. ;) You're always missing one tiny piece of knowledge and you always ask them in such a way that I know exactly what it is. Here's the missing piece this time: You see that the above is surrounded by ( ) and you think that means it should capture. The truth is that the above is surrounded by (?: ), which happens to be ( )'s cousin. It groups things together, like ( ), but it does not capture. I had to cluster them so the |s would work, but I didn't need to hang onto the results, so I chose (?: ) over ( ). You could do it just fine with normal parenthesis, if you prefer. If you do, you just have to remember that your third answer is in $4, because $3 is holding some junk. Take your pick. Geez, I can't recall them covering (?: ) in my books...D'oh! The part about it grouping and capturing things makes sense, as it's the cousin of ( ). The part about being able to include the |'s doesn't. I found out, without knowing at the time, that the parentheses breakdown with |'s. I didn't know it at the time, but when I put the ORs in the parentheses and ran the program, I just got the command prompt, no output. Your explanation tells me that (?: ) could capture the ORs, and implies that the ( ) could not. --This part makes sense, as i'll just regard it as a rule. But then you go on to say that I could still use it with ( ), but then $3 would contain junk and $4 would contain the name after the (?:Steal|Assist|Block|replaced by). I'm assuming that junk in $3 would be either Assist or Block or Steal or replaced by, is that correct? I ask this because later, perhaps two days from now, perhaps two weeks from now, I'm going to want that information, assuming it is Assist or Block or Steal or replaced by. Do I just put (:? ) within ( )? 
That sortof makes sense, but it also seems, I'm not sure what the right word is, but it doesn't seem right. Lastly, I'm curious about this (:? ) operator. I'm going to look it up, but assuming that perldoc is not going to explain it sufficiently for me, as is often the case, do you mind telling me why it is needed to get the |'s, if that also applies to , and numerical and word comparison operators? Thanks. Instead, $3 is this, and matches the second name. (\w+) If backreferences are supposed to be in parentheses, why isn't this (?:Steal|Assist|Block|replaced by) a backreference? I'm hoping this makes sense now. There are only three captures in my regex. (\w+) is the third. Can you see that now? It does, and I can. Oh, and I appreciate when you not only answer a question with an explanation, but you use an example as well. That's extremely helpful. -stu James __ Do you Yahoo!? Yahoo! Search - Find what youre looking for faster http://search.yahoo.com -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
On Mar 10, 2004, at 10:20 AM, Stuart White wrote:

> Geez, I can't recall them covering (?: ) in my books...D'oh!

It may not have. It's not super common to see it thrown about. Most people just use (...), I would guess.

> The part about it grouping and capturing things makes sense, as it's the cousin of ( ). The part about being able to include the |'s doesn't. I found out, without knowing at the time, that the parentheses breakdown with |'s. I didn't know it at the time, but when I put the ORs in the parentheses and ran the program, I just got the command prompt, no output.

Hmm, this still sounds a little confused. Let's use another example:

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    print "\nLine: $_";
    if (m/\[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by):? (\w+)/) {
        print "\tMatched: \\[([A-Z0-9 -]+)\\] (\\w+).+(?:Steal|Assist|Block|replaced by):? (\\w+)\n";
        print "\t\t\$1 is $1\n\t\t\$2 is $2\n\t\t\$3 is $3\n";
    }
    if (m/\[([A-Z0-9 -]+)\] (\w+).+(Steal|Assist|Block|replaced by):? (\w+)/) {
        print "\tMatched: \\[([A-Z0-9 -]+)\\] (\\w+).+(Steal|Assist|Block|replaced by):? (\\w+)\n";
        print "\t\t\$1 is $1\n\t\t\$2 is $2\n\t\t\$3 is $3\n\t\t\$4 is $4\n";
    }
}

__DATA__
(10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)
(10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST)
(9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK)
(5:35) [SAN] Bowen Substitution replaced by Ginobili

When I run the above, I get:

Line: (10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by):? (\w+)
                $1 is PHX
                $2 is Stoudemire
                $3 is Jackson
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(Steal|Assist|Block|replaced by):? (\w+)
                $1 is PHX
                $2 is Stoudemire
                $3 is Steal
                $4 is Jackson

Line: (10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST)
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN 4-0
                $2 is Jackson
                $3 is Duncan
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN 4-0
                $2 is Jackson
                $3 is Assist
                $4 is Duncan

Line: (9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK)
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN
                $2 is Duncan
                $3 is Stoudemire
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN
                $2 is Duncan
                $3 is Block
                $4 is Stoudemire

Line: (5:35) [SAN] Bowen Substitution replaced by Ginobili
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN
                $2 is Bowen
                $3 is Ginobili
        Matched: \[([A-Z0-9 -]+)\] (\w+).+(Steal|Assist|Block|replaced by):? (\w+)
                $1 is SAN
                $2 is Bowen
                $3 is replaced by
                $4 is Ginobili

Notice that they are nearly identical matches; I just changed the (?:...) to (...) in the second one. They function the same; the variables set by the expression are the only difference. (?:...) doesn't set a variable.

Your other confusion seems to be the | character. You seem to think it's a Perl "or" symbol. Not true. We're inside a regex here, gotta switch thinking. Regex knowledge in, Perl out. | is a regex alternation character, which pretty much means "find this or this", as expected. That's probably why the symbol was chosen; it looks like the or operators of many languages. However, note that isn't significant in a regex.

Now, let's get to why | needs the (?:...) or (...) around it. If they weren't there, my regex would read like this:

Find \[([A-Z0-9 -]+)\] (\w+).+Steal
OR Assist
OR Block
OR replaced by:? (\w+)

Instead, it reads like this:

Find \[([A-Z0-9 -]+)\] (\w+).+
Followed By Steal OR Assist OR Block OR replaced by
Followed By :? (\w+)

As you can see, I need the parentheses to keep the or-ing behavior of | from going too far. Hopefully that makes sense.

You might take a trip back to the regex section of your books, if | is new to you. It's regex 101 and I would be super surprised if it isn't covered.

James
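The scope of | can also be checked directly. In the sketch below, which uses invented sample lines and a hypothetical helper named captured_name(), the ungrouped pattern attaches the capture only to its last alternative, while the grouped one applies it after every alternative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ungrouped: the alternatives are "Steal", "Assist", "Block" and
# "replaced by (\w+)", so only the last branch contains the capture.
my $ungrouped = qr/Steal|Assist|Block|replaced by (\w+)/;

# Grouped: (?:...) limits the alternation, and (\w+) follows every branch.
my $grouped = qr/(?:Steal|Assist|Block|replaced by):? (\w+)/;

# captured_name() is a hypothetical helper used only in this sketch.
# It returns what group 1 captured, or undef if nothing was captured.
sub captured_name {
    my ($re, $line) = @_;
    return $line =~ $re ? $1 : undef;
}

for my $line ('Steal: Jackson (1 ST)', 'Substitution replaced by Ginobili') {
    for my $pair (['ungrouped', $ungrouped], ['grouped', $grouped]) {
        my ($label, $re) = @$pair;
        my $name = captured_name($re, $line);
        printf "%-9s on '%s': %s\n", $label, $line,
            defined $name ? $name : '(no capture)';
    }
}
```

On the Steal line the ungrouped pattern still matches, but it matches the bare word and leaves $1 undefined, which is the "going too far" behavior described above.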
Re: backreferences
--- James Edward Gray II [EMAIL PROTECTED] wrote: On Mar 10, 2004, at 10:20 AM, Stuart White wrote: Geez, I can't recall them covering (?: ) in my books...D'oh! It may not have. It's not super common to see it thrown about. Most people just use (...), I would guess. Ahh, ok. So, like you said, (?: ) is just for grouping things. I can see how that might be useful. Thanks. The part about it grouping and capturing things makes sense, as it's the cousin of ( ). The part about being able to include the |'s doesn't. I found out, without knowing at the time, that the parentheses breakdown with |'s. I didn't know it at the time, but when I put the ORs in the parentheses and ran the program, I just got the command prompt, no output. Hmm. this still sounds a little confused. When I look at your regex, I think now that perhaps it wasn't the ( ) that were written incorrectly by me, but rather, my mistake in not accounting for the digits in the brackets where the Team is, or the .+ instead of just the . in between $2 and $3. this makes sense, because in Beginning Perl, it has quite a few examples of | within ( ), which is why I didn't think it'd be a problem in the first place. Let's us another example: The second regex: if (m/\[([A-Z0-9 -]+)\](\w+).+ (Steal|Assist|Block|replaced by):? (\w+)/) is what I want. This example makes sense too. Notice that they are nearly identical matches, I just changed the (?:...) to (...) in the second one. They function the same, the variables set by the expression is the only difference. (?:...) doesn't set a variable. Got it. Your other confusion seems to be the | character. You seem to think it's a Perl or symbol. Not true. We're inside a regex here, gotta switch thinking. Regex knowledge in, Perl out. | is a regex alternation character, which pretty much means find this or this, as expected. That's probably why the symbol was chosen, looks like the or operators of many languages. However, note that isn't significant in a regex. 
Yup, simple mistake. It's been awhile since I read about | in regex, but I remember now that it is an alternation character. I certainly did get confused in my last post though. Thanks for the clarification. Now, let's get to why | needs the (?:...) or (...) around it. If they weren't there, my regex would read like this: This part I understood. I was confused before because I thought that (...) broke down when | was used, and that to circumvent that, one would use (?:...) instead. You might take a trip back to the regex section of your books, if | is new to you. It's regex 101 and I would be super surprised if it isn't covered. It's covered. I'll be looking at that at lunchtime. Thanks. James __ Do you Yahoo!? Yahoo! Search - Find what youre looking for faster http://search.yahoo.com -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
Stuart White wrote: Geez, I can't recall them covering (?: ) in my books...D'oh! The part about it grouping and capturing things makes sense, as it's the cousin of ( ). The part about being able to include the |'s doesn't. I found out, without knowing at the time, that the parentheses breakdown with |'s. I didn't know it at the time, but when I put the ORs in the parentheses and ran the program, I just got the command prompt, no output. Greetings! E:\d_drive\perlStuffperl -w my $string = 'Yada, yuda, heyho, whuzit'; my $regex = '(Y.{3}).*?(y.{3}).*?(boingo|eekers|heyho).*?(\w*)$'; if ($string =~ /$regex/i) { print $1\n$2\n$3\n$4\n; } ^Z Yada yuda heyho whuzit So the problem may lie elsewhere in the match. Joseph -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
On Mar 8, 2004, at 8:17 PM, Stuart White wrote: (5:35) [SAN] Bowen Substitution replaced by Ginobili -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
Stuart, please start new messages for new topics, instead of replying to existing threads.

On Mar 8, 2004, at 8:17 PM, Stuart White wrote:

> Here's a line:
> (10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)
>
> I want to match PHX Stoudemire Steal: Jackson

See if this example gets you going.

James

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    if (m/\[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by): (\w+)/) {
        print "$1 $2 $3\n";
    }
}

__DATA__
(10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)
(10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST)
(9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK)
(5:35) [SAN] Bowen Substitution replaced by Ginobili
Re: backreferences
Okey doke, I didn't think to think that just changing the subject line would still put the message in the previous thread. My mistake. --- James Edward Gray II [EMAIL PROTECTED] wrote: Stuart, please start new messages for new topics, instead of replying to existing threads. On Mar 8, 2004, at 8:17 PM, Stuart White wrote: Here's a line: (10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST) I want to match PHX Stoudemire Steal: Jackson See if this example gets you going. James #!/usr/bin/perl use strict; use warnings; while (DATA) { if (m/\[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by): (\w+)/) { print $1 $2 $3\n; } } __DATA__ (10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST) (10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST) (9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK) (5:35) [SAN] Bowen Substitution replaced by Ginobili __ Do you Yahoo!? Yahoo! Search - Find what youre looking for faster http://search.yahoo.com -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
Re: backreferences
See if this example gets you going. James #!/usr/bin/perl use strict; use warnings; while (DATA) { if (m/\[([A-Z0-9 -]+)\] (\w+).+(?:Steal|Assist|Block|replaced by): (\w+)/) { print $1 $2 $3\n; } } That got me started. I do have a question though about your regex. This backreference is $1, and matches the team abbreviation. ([A-Z0-9 -]+) This one is $2, and matches the first name: (\w+) And this one I'd think should be $3, and match Steal or Assist etc., but the results don't say that. (?:Steal|Assist|Block|replaced by) Instead, $3 is this, and matches the second name. (\w+) If backreferences are supposed to be in parentheses, why isn't this (?:Steal|Assist|Block|replaced by) a backreference? Thanks. __ Do you Yahoo!? Yahoo! Search - Find what youre looking for faster http://search.yahoo.com -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/ http://learn.perl.org/first-response
backreferences
Here's a line:

(10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)

I want to match PHX Stoudemire Steal: Jackson

These are my patterns:

my $steal = 'Steal:\s';
my $team = '\w{3}';
my $player = '\w+';

This is my regex:

if ($_ =~ /\[($team)\] ($player).($steal)/)

I tried printing $3 and $4, and then $1 and $2, and none of them printed. I'm not sure why. What am I missing?

Eventually, I want to match any of these four lines:

(10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)
(10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST)
(9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK)
(5:35) [SAN] Bowen Substitution replaced by Ginobili

and when I tried to with this:

if ($_ =~ /\[($team)\] ($player).('Assist: '|'Block: '|'Steal: '|'replaced by ')/)

and tried to print any of the backreferences, it didn't work either. Any ideas? Here is my code:

open(STATS, "stats.txt") or die "statfile\n";
my $pattern = "Foul";
my $assist = 'Assist\s';
my $block = 'Block\s';
my $steal = 'Steal:\s';
my $time = '\d+\:\d\d';
my $team = '\w{3}';
my $player = '\w+';
my $foulType = 'Foul\: (.*) ';
my $numFouls = '\w+\s\w+';
my @SAN;
my @PHX;
my %PHX;
my %SAN;

while (<STATS>)
{
    if ($_ =~ /\[($team)\] ($player).($steal)/) #|$block|$steal
    {
        print "this is 1: $1\n";
        print "this is 2: $2\n";
        print "this is 3: $3\n";
        print "this is 4: $4\n";
    }
}
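A few things in the patterns above probably keep them from matching: a single . allows only one character between the player and the event, \w{3} cannot match a team field such as "SAN 4-0", and quote characters inside a regex are matched literally. The following sketch shows one way the pieces might be assembled from qr// fragments. The sample lines are taken from the question; parse_play() and the variable names are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Building blocks compiled with qr//; each keeps its own grouping when
# interpolated, so the alternation in $event cannot leak into its neighbors.
my $team   = qr/[A-Z0-9 -]+/;                      # "PHX" but also "SAN 4-0"
my $player = qr/\w+/;
my $event  = qr/Steal|Assist|Block|replaced by/;   # no quotes inside the regex

# .+ (rather than a single .) skips the play description between the
# first player and the event keyword.
my $play_re = qr/\[($team)\] ($player).+($event):? ($player)/;

# parse_play() is a hypothetical helper used only in this sketch.
# It returns (team, first player, event, second player) or an empty list.
sub parse_play {
    my ($line) = @_;
    return $line =~ $play_re ? ($1, $2, $3, $4) : ();
}

my @lines = (
    '(10:18) [PHX] Stoudemire Turnover: Lost Ball (1 TO) Steal: Jackson (1 ST)',
    '(10:51) [SAN 4-0] Jackson Jump Shot: Made (2 PTS) Assist: Duncan (1 AST)',
    '(9:33) [SAN] Duncan Layup Shot: Missed Block: Stoudemire (2 BLK)',
    '(5:35) [SAN] Bowen Substitution replaced by Ginobili',
);

for my $line (@lines) {
    my @fields = parse_play($line);
    print @fields ? join(' | ', @fields) . "\n" : "no match: $line\n";
}
```

Keeping the event in a capturing group means the kind of play is available later as the third field, at the cost of shifting the second player to the fourth.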
Re: Regular Expressions: Grouping and backreferences...
eric-perl wrote at Tue, 10 Sep 2002 09:32:22 +0200:

> How can I capture all the words that contain 'at' in the string 'A fat cat sat on my hat.'? Any pointers?
>
>     $sentence = 'A fat cat sat on my hat.';
>     $sentence =~ m/(\wat)/;
>
> returns: $1 = 'fat'

As TMTOWTDI, here's a solution without global matching:

    my @at_words = grep /at/, split /\W+/, $sentence;

Greetings,
Janek
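Janek's one-liner can be unpacked into its two steps. This is a small runnable sketch using the sentence from the question; the intermediate `@words` array is added here only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sentence = 'A fat cat sat on my hat.';

# Step 1: split on runs of non-word characters to get the individual words.
my @words = split /\W+/, $sentence;

# Step 2: keep only the words that contain "at" anywhere.
my @at_words = grep { /at/ } @words;

print "@at_words\n";   # fat cat sat hat
```

Because `grep /at/` tests each whole word, it also keeps words where "at" appears at the start or in the middle, such as "attacked".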
Regular Expressions: Grouping and backreferences...
Hello, All:

How can I capture all the words that contain 'at' in the string 'A fat cat sat on my hat.'? Any pointers?

    $sentence = 'A fat cat sat on my hat.';
    $sentence =~ m/(\wat)/;

returns: $1 = 'fat'

--
Eric P.
Sunnyvale, CA
RE: Regular Expressions: Grouping and backreferences...
-----Original Message-----
From: [EMAIL PROTECTED]
Sent: Tuesday, September 10, 2002 12:32 AM
To: Beginners Perl Mailing List
Subject: Regular Expressions: Grouping and backreferences...

Hello, All:

How can I capture all the words that contain 'at' in the string 'A fat cat sat on my hat.'? Any pointers?

    $sentence = 'A fat cat sat on my hat.';
    $sentence =~ m/(\wat)/;

returns: $1 = 'fat'

-----Response Message-----

Your regex matches only the single character before 'at' plus the 'at' itself, and only the first occurrence. The following matches every word containing 'at' and places them in the array @list:

    $sentence = 'A fat cat sat on that hat.';
    @list = $sentence =~ m/(\w*at\w*)/g;
    foreach (@list) { print "$_\n" }
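To see the difference the response describes, here is a short sketch that runs both patterns with /g against the response's sentence. The variable names are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sentence = 'A fat cat sat on that hat.';

# \wat captures exactly one word character followed by "at",
# so "that" is reported as "hat".
my @narrow = $sentence =~ /(\wat)/g;

# \w*at\w* extends the capture to the whole word on both sides.
my @whole = $sentence =~ /(\w*at\w*)/g;

print "narrow: @narrow\n";   # narrow: fat cat sat hat hat
print "whole:  @whole\n";    # whole:  fat cat sat that hat
```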
Re: Regular Expressions: Grouping and backreferences...
On Tue, 2002-09-10 at 03:32, [EMAIL PROTECTED] wrote:

> Hello, All:
>
> How can I capture all the words that contain 'at' in the string 'A fat cat sat on my hat.'? Any pointers?
>
>     $sentence = 'A fat cat sat on my hat.';
>     $sentence =~ m/(\wat)/;
>
> returns: $1 = 'fat'
>
> --
> Eric P.
> Sunnyvale, CA

You were on the right track, but you need to do global matching and define "contains" better. Here is what I would do.

<snip href="perldoc perlop">
The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.
</snip>

<code>
#!/usr/bin/perl

my $str = 'A fat cat sat on my hat and attacked me.';

my @at_words = $str =~ /(\w*at\w*\b)/g;

print "@at_words\n";
</code>

<output>
fat cat sat hat attacked
</output>

--
Today is Pungenday, the 34th day of Bureaucracy in the YOLD 3168
Or not.
Missile Address: 33:48:3.521N 84:23:34.786W
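The perlop excerpt says the behavior of /g depends on context, and the reply shows only the list-context case. As a complement, this is a minimal sketch of the scalar-context form, where each evaluation of the match finds the next occurrence and `pos()` reports where the search stopped. The `word@offset` formatting is an illustration, not something from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = 'A fat cat sat on my hat and attacked me.';
my @found;

# In scalar context, each pass of the loop resumes after the previous match.
while ($str =~ /(\w*at\w*)/g) {
    my $start = pos($str) - length $1;   # offset where this word begins
    push @found, sprintf('%s@%d', $1, $start);
}

print "@found\n";
```

The loop form is useful when each match needs extra work, such as recording its position, before moving on to the next one.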
RE: using search backreferences in a variable [end]
Thanks, Jeff!

- B

> On Mar 28, Bryan R Harris said:
>
> > - Why is the \1, \2, etc. syntax obsolete? The one thing I love about regular expressions is that I can use them within text editors (nedit, textpad, bbedit), but none of these support the $1 notation in the replace string. Why aren't re's consistent between perl and text editors?
>
> \1 and \2 are to be used on the LEFT-HAND side of a regex; on the RIGHT-HAND side, you should use $1 and $2 to remove any ambiguity, because "\1" can ALSO mean chr(1).
>
> > - What does the qr tag do?
>
> qr// makes a Regexp object that is a compiled regular expression.
>
> --
> Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/
> RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
> ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
> <stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
> [ I'm looking for programming work. If you like my work, let me know. ]
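Both of Jeff's answers fit in one short sketch: `qr//` builds a compiled pattern that can be stored in a variable, `\1` is used inside the pattern to mean "the same text group 1 matched", and `$1` is used in the replacement. The doubled-word example is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# \1 inside the pattern: the second word must repeat the first.
my $doubled = qr/\b(\w+) \1\b/;

print ref($doubled), "\n";   # Regexp

# $1 in the replacement: keep a single copy of the repeated word.
(my $fixed = 'this is is a test') =~ s/$doubled/$1/g;

print "$fixed\n";            # this is a test
```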
RE: using search backreferences in a variable
\1 is obsolete. Try $1. You also have to use /ee and write your $replace so that it works with /ee. Read perldoc perlre.

    $match = qr/cat(\d+)/;
    $replace = '$1. "dog"';
    $_ = "cat15";
    s/$match/$replace/gee;
    print $_, "\n";   # prints --> 15dog

-----Original Message-----
From: Bryan R Harris [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 28, 2002 10:46 AM
To: Beginners Perl Mailing List
Subject: using search backreferences in a variable

Why does this segment not work properly?

    $match = "cat(\d+)";
    $replace = "\1dog";
    $_ = "cat15";
    s/$match/$replace/g;
    print $_;   # prints --> \1dog

Any ideas?

TIA.

- B
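To make the role of /ee concrete, this sketch applies the same `$replace` string with no /e, one /e, and two. The three result variables are names invented for the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $match   = qr/cat(\d+)/;
my $replace = '$1 . "dog"';   # Perl source code held in a string

# No /e: $replace is interpolated once, so its text is inserted as-is.
(my $plain = 'cat15') =~ s/$match/$replace/;

# One /e: the expression "$replace" is evaluated, which again yields its text.
(my $once = 'cat15') =~ s/$match/$replace/e;

# Two /e: that text is then evaluated as code, so $1 is finally looked up.
(my $twice = 'cat15') =~ s/$match/$replace/ee;

print "$plain\n";   # $1 . "dog"
print "$once\n";    # $1 . "dog"
print "$twice\n";   # 15dog
```

Since the second /e runs the string as Perl code, this approach is only safe when `$replace` comes from a trusted source.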
Re: using search backreferences in a variable
On Mar 28, Bryan R Harris said:

> Why does this segment not work properly?

You would know if you had warnings turned on.

    perl -we '$match = "cat(\d+)"'

yields the warning

    Unrecognized escape \d passed through at -e line 1.

>     $match = "cat(\d+)";
>     $replace = "\1dog";
>     $_ = "cat15";
>     s/$match/$replace/g;
>     print $_;   # prints --> \1dog

Nope, that prints "cat15". Why? Because "cat(\d+)" is the same as "cat(d+)", because "\d" becomes "d". If you had used single quotes, you would have been OK. And $replace is the string "_dog", where that _ represents an unprintable character -- specifically, character 1 (SOH).

If you had done:

    $match = 'cat(\d+)';
    $replace = '\1dog';
    $_ = "cat15";
    s/$match/$replace/;
    print;

you would get CLOSER, but not entirely there. $match would be correct, but you would get "\1dog" instead of "15dog".

To fix that requires a bit of work. Here are two solutions:

    $match = 'cat(\d+)';
    $replace = '$1dog';   # XXX: you should not use \1 on the right-hand
                          # side of a regex; you should use $1
    $_ = "cat15";
    s/$match/qq(qq($replace))/ee;   # qq() is just a fancy "..."
    print;

What does THAT do? Well, first, each /e modifier means "execute the right-hand side as code". Since there are two /e's, we'll be executing it TWICE. The first time, qq(qq($replace)) returns the text "qq($1dog)". When we execute that, we get "15dog".

The other way might be easier to understand:

    $match = 'cat(\d+)';
    $replace = sub { "$1dog" };
    $_ = "cat15";
    s/$match/$replace->()/e;
    print;

This uses a function reference stored in $replace. What this does is delay the evaluation of $1 until it's needed.

--
Jeff "japhy" Pinyan
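The two points in this reply can be checked with a short script. This is a sketch under a few assumptions: the `no warnings 'misc'` block is added only to silence the warning Jeff quotes, `"${1}dog"` is an explicit spelling of his `"$1dog"`, and the two-cat input string is invented to show /g working with the code-reference approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($dq_match, $dq_replace);
{
    # Deliberately silence "Unrecognized escape \d passed through".
    no warnings 'misc';
    $dq_match   = "cat(\d+)";   # double quotes turn \d into a plain d
    $dq_replace = "\1dog";      # double quotes turn \1 into chr(1)
}
my $match_mangled   = $dq_match   eq 'cat(d+)'        ? 1 : 0;
my $replace_mangled = $dq_replace eq chr(1) . 'dog'   ? 1 : 0;

# Single quotes keep the pattern intact; the sub delays the lookup of $1
# until each substitution actually happens.
my $match   = 'cat(\d+)';
my $replace = sub { "${1}dog" };
(my $result = 'cat15 and cat7') =~ s/$match/$replace->()/ge;

print "$match_mangled $replace_mangled\n";   # 1 1
print "$result\n";                           # 15dog and 7dog
```

Compared with /ee, the code-reference form never evaluates a string as Perl source, so it avoids running arbitrary text as code.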
RE: using search backreferences in a variable
Jeff, David, Nikola -

I obviously haven't been doing this long, but I already love Perl. Thanks for the tips. A couple of quick questions:

- Why is the \1, \2, etc. syntax obsolete? The one thing I love about regular expressions is that I can use them within text editors (nedit, textpad, bbedit), but none of these support the $1 notation in the replace string. Why aren't re's consistent between Perl and text editors?

- What does the qr tag do?

Thanks again.

- Bryan

> \1 is obsolete. Try $1. You also have to use /ee and write your $replace so that it works with /ee. Read perldoc perlre.
>
>     $match = qr/cat(\d+)/;
>     $replace = '$1. "dog"';
>     $_ = "cat15";
>     s/$match/$replace/gee;
>     print $_, "\n";   # prints --> 15dog