Numification of captured match
This has led to surprising results in Pugs's Net::IRC:

    if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
        my $socket = connect($0, $1);
    }

If $1 is a match object here, and connect() assumes Int on its second argument, then it will connect to port 1, as the match object numifies to 1 (indicating a successful match).

I fixed this for 6.2.3 by flattening $0, $1, $2 into plain scalars (for nonquantified matches), and use $/[0] etc to store match objects, but I'm not sure this treatment is right.

Is it really intended that we get into habit of writing this?

    if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
        my $socket = connect(~$0, +$1);
    }

It looks... weird. :)

Thanks,
/Autrijus/
Re: Numification of captured match
On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
> Is it really intended that we get into habit of writing this?
>
>     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
>         my $socket = connect(~$0, +$1);
>     }
>
> It looks... weird. :)

And it would have to be

    if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
        my $socket = connect(~$0, ~$1);
    }

because +$1 still evaluates to 1. (The ~ in front of $0 is probably optional.)

My suggestion is that a match object in numeric context is the same as evaluating its string value in a numeric context. If we need a way to find out the number of match repetitions (what the numeric context was intended to provide), it might be better done with an explicit C<.matchcount> method or something like that.

Pm
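The difference between the draft behaviour and the suggestion above can be sketched in a few lines. This is only a model, written in Python because the draft semantics discussed here are not something that can be run as-is; the class names DraftMatch and ProposedMatch and the helper captures() are hypothetical and do not correspond to anything in Pugs or PGE.

```python
import re


class DraftMatch:
    """Hypothetical model of the draft: a successful match numifies to 1."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        # String context yields the matched substring.
        return self.text

    def __int__(self):
        # Draft semantics: the integer value only signals success.
        return 1


class ProposedMatch(DraftMatch):
    """Hypothetical model of the suggestion: numify the string value."""

    def __int__(self):
        return int(self.text)


def captures(cls, subject):
    # Stand-in for /^(.+)\:(\d+)$/; wraps each capture in the given class.
    m = re.match(r'^(.+):(\d+)$', subject)
    return [cls(group) for group in m.groups()]


draft_host, draft_port = captures(DraftMatch, 'localhost:80')
new_host, new_port = captures(ProposedMatch, 'localhost:80')

print(int(draft_port))  # the surprising port number
print(int(new_port))    # the port the caller meant
```

Under the first model a connect() that coerces its second argument to an integer sees 1; under the second it sees 80, with no explicit ~ or + needed at the call site.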
Re: Numification of captured match
On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
> On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
> > Is it really intended that we get into habit of writing this?
> >
> >     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
> >         my $socket = connect(~$0, +$1);
> >     }
> >
> > It looks... weird. :)
>
> And it would have to be
>
>     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
>         my $socket = connect(~$0, ~$1);
>     }
>
> because +$1 still evaluates to 1.

That's some subtle evil.

> My suggestion is that a match object in numeric context is the
> same as evaluating its string value in a numeric context.

While I agree that this would be the right behavior it still feels special-casey, hackish and wrong.

If, as an optimization, you could tell PGE that you didn't need Match objects and only cared about the string results of your captures, that might be better. For instance,

    if 'localhost:80' ~~ m:s/^(.+)\:(\d+)$/ {
        my $socket = connect($0, $1);
    }

:s for :string (assuming that hasn't already been taken)

> If we need a way to find out the number of match repetitions (what
> the numeric context was intended to provide), it might be better
> done with an explicit C<.matchcount> method or something like that.

Surely that would just be [EMAIL PROTECTED] Or have I crossed the perl[56] streams again?

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
Re: Numification of captured match
On 5/12/05, Jonathan Scott Duff [EMAIL PROTECTED] wrote:
> On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
> > On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
> > > Is it really intended that we get into habit of writing this?
> > >
> > >     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
> > >         my $socket = connect(~$0, +$1);
> > >     }
> > >
> > > It looks... weird. :)
> >
> > And it would have to be
> >
> >     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
> >         my $socket = connect(~$0, ~$1);
> >     }
> >
> > because +$1 still evaluates to 1.
>
> That's some subtle evil.
>
> > My suggestion is that a match object in numeric context is the
> > same as evaluating its string value in a numeric context.
>
> While I agree that this would be the right behavior it still feels
> special-casey, hackish and wrong.
>
> If, as an optimization, you could tell PGE that you didn't need Match
> objects and only cared about the string results of your captures, that
> might be better. For instance,
>
>     if 'localhost:80' ~~ m:s/^(.+)\:(\d+)$/ {
>         my $socket = connect($0, $1);
>     }
>
> :s for :string (assuming that hasn't already been taken)

What about the fact that anything matching (\d+) is going to be an Int and anything matching (.+) is going to be a String, and so forth? There is sufficient information in the regex for P6 to know that $0 should smart-convert into a String and $1 should smart-convert into an Int. Can't we just do that?

Rob
Re: Numification of captured match
On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
: On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
: > Is it really intended that we get into habit of writing this?
: >
: >     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
: >         my $socket = connect(~$0, +$1);
: >     }
: >
: > It looks... weird. :)
:
: And it would have to be
:
:     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
:         my $socket = connect(~$0, ~$1);
:     }
:
: because +$1 still evaluates to 1. (The ~ in front of $0 is
: probably optional.)
:
: My suggestion is that a match object in numeric context is the
: same as evaluating its string value in a numeric context. If
: we need a way to find out the number of match repetitions (what
: the numeric context was intended to provide), it might be better
: done with an explicit C<.matchcount> method or something like that.

I think we already said something like that once some number of months ago. +$1 simply has to be the numeric value of the match. It's not as much of a problem as a Perl 5 programmer might think, since ?$1 is still true even if +$1 is 0.

Anyway, while we could have a method for the .matchcount, +$1[] should work fine too. And maybe even [EMAIL PROTECTED], presuming that a match object can function as an array actually means a match object knows when it's being asked to supply an array reference.

Actually, it's not clear to me offhand why @1 shouldn't mean $1[] and %1 shouldn't mean $1{}.

Larry
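The three separate questions in the message above (did it match, what number is it, how many repetitions) can be modelled as three independent coercions. This is a rough sketch in Python rather than Perl 6, under the assumption that boolean, numeric and array views of a match object are answered separately; MatchModel and its subcaptures attribute are hypothetical names invented for the illustration.

```python
class MatchModel:
    """Hypothetical match object with separate bool, numeric and array views."""

    def __init__(self, text, subcaptures=()):
        self.text = text
        self.subcaptures = list(subcaptures)

    def __bool__(self):
        # Models ?$1: a successful match is true whatever it matched.
        return True

    def __str__(self):
        # Models ~$1: the matched substring.
        return self.text

    def __int__(self):
        # Models +$1: the numeric value of the matched substring.
        return int(self.text)

    def __len__(self):
        # Models +$1[]: the number of elements in the array view.
        return len(self.subcaptures)


zero = MatchModel('0')
repeated = MatchModel('123', ['1', '2', '3'])

print(bool(zero), int(zero))
print(int(repeated), len(repeated))
```

Because truth is answered by its own method, a capture that matched the text 0 still tests as a successful match, which is the point made about Perl 5 habits above.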
Re: Numification of captured match
Larry Wall wrote:
> I think we already said something like that once some number of
> months ago. +$1 simply has to be the numeric value of the match.

Agreed.

> Anyway, while we could have a method for the .matchcount, +$1[]
> should work fine too.

Yep.

> Actually, it's not clear to me offhand why @1 shouldn't mean $1[]
> and %1 shouldn't mean $1{}.

It *does*. According to the recent capture semantics document:

    Note that, outside a rule, C<@1> is simply a shorthand for C<@{$1}>,

and:

    And, of course, outside the rule, C<%1> is a shortcut for C<%{$1}>:

Damian
Re: Numification of captured match
On Fri, May 13, 2005 at 02:00:10PM +1000, Damian Conway wrote:
: > Actually, it's not clear to me offhand why @1 shouldn't mean $1[]
: > and %1 shouldn't mean $1{}.
:
: It *does*. According to the recent capture semantics document:
:
:     Note that, outside a rule, C<@1> is simply a shorthand for C<@{$1}>,
:
: and:
:
:     And, of course, outside the rule, C<%1> is a shortcut for C<%{$1}>:

In that case it's very much less clear to me why it shouldn't mean that. :-)

Larry
Re: Numification of captured match
On Thu, May 12, 2005 at 08:10:42PM -0700, Larry Wall wrote:
> On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
> : My suggestion is that a match object in numeric context is the
> : same as evaluating its string value in a numeric context. If
> : we need a way to find out the number of match repetitions (what
> : the numeric context was intended to provide), it might be better
> : done with an explicit C<.matchcount> method or something like that.
>
> I think we already said something like that once some number of
> months ago. +$1 simply has to be the numeric value of the match.
> It's not as much of a problem as a Perl 5 programmer might think,
> since ?$1 is still true even if +$1 is 0.
>
> Anyway, while we could have a method for the .matchcount, +$1[]
> should work fine too. And maybe even [EMAIL PROTECTED], presuming that
> a match object can function as an array actually means a match object
> knows when it's being asked to supply an array reference.

So the counting idiom in S05 becomes one of:

    $match_count += @{m:g/pattern/};
    $match_count += list m:g/pattern/;
    $match_count += m:g/pattern/.matchcount;
    $match_count += (m:g/pattern/)[];    # maybe ???

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
Re: Numification of captured match
On Thu, May 12, 2005 at 08:10:42PM -0700, Larry Wall wrote:
> On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
> : On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
> : > Is it really intended that we get into habit of writing this?
> : >
> : >     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
> : >         my $socket = connect(~$0, +$1);
> : >     }
> : >
> : > It looks... weird. :)
> :
> : And it would have to be
> :
> :     if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
> :         my $socket = connect(~$0, ~$1);
> :     }
> :
> : because +$1 still evaluates to 1. (The ~ in front of $0 is
> : probably optional.)
> :
> : My suggestion is that a match object in numeric context is the
> : same as evaluating its string value in a numeric context. If
> : we need a way to find out the number of match repetitions (what
> : the numeric context was intended to provide), it might be better
> : done with an explicit C<.matchcount> method or something like that.
>
> I think we already said something like that once some number of
> months ago.

I guess I've been led astray (or downright confused) by the capture specs then, when it says:

    A successful match returns a C<Match> object whose boolean value is
    true, whose integer value is typically 1 (except under the C<:g> or
    C<:x> flags; see L<Capturing from non-singular matches>), whose
    string value is the complete substring that was matched by the
    entire rule, whose array component contains all subpattern (unnamed)
    captures, and whose hash component contains all subrule (named)
    captures.

and later

    If a named scalar alias is applied to a set of non-capturing
    brackets:

        m:w/ $<key>:=[ (<[A-E]>) (\d**{3..6}) (X?) ] /;

    then the corresponding entry in the rule's hash is assigned a
    C<Match> object whose:

    * Boolean value is true,
    * Integer value is 1,
    * String value is the complete substring matched by the contents
      of the square brackets,
    * Array and hash are both empty.

and under the :g option...

    if $text ~~ m:words:globally/ (\S+:) rocks / {
        say "Matched {+$/} different ways";
        say 'Full match context is:';
        say $/;
    }

So, are the Match objects returned from subpattern captures treated differently in numeric context than the Match objects coming from named scalar aliases or the match itself... ?

> It's not as much of a problem as a Perl 5 programmer might think,
> since ?$1 is still true even if +$1 is 0.
>
> Anyway, while we could have a method for the .matchcount, +$1[]
> should work fine too.

With .matchcount I wasn't concerned about the number of repetitions stored in $1 -- I was trying to get at the numeric value that $/ would've returned under the :g option. But in re-reading the draft of the :globally option I see we already have one -- C<$/.matches> in numeric context should supply it for us.

So I'm guessing that we're all in agreement that +$/, +$1, and +$subrule all refer to the numeric value of the string matched, as opposed to what's currently written about their values in the draft...? Or am I still missing the picture entirely?

Pm
Re: Numification of captured match
Patrick surmised:
> So I'm guessing that we're all in agreement that +$/, +$1, and
> +$subrule all refer to the numeric value of the string matched,
> as opposed to what's currently written about their values in the
> draft...?

Yes. The semantics proposed in the draft have proved to be too orthogonal for practical use. ;-)

Damian