Numification of captured match

2005-05-12 Thread Autrijus Tang
This has led to surprising results in Pugs's Net::IRC:

if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
    my $socket = connect($0, $1);
}

If $1 is a match object here, and connect() assumes Int on its second
argument, then it will connect to port 1, as the match object numifies
to 1 (indicating a successful match).
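
To make the failure concrete, here is a small Python model of the behaviour described above. It is only a sketch: the DraftMatch class and this connect() function are hypothetical stand-ins, not Pugs or Net::IRC code. The one thing taken from the message is that a successful match numifies to 1.

```python
import re

class DraftMatch:
    """Hypothetical model of a capture whose numeric value signals success."""
    def __init__(self, text):
        self.text = text

    def __str__(self):
        # String context yields the matched substring.
        return self.text

    def __int__(self):
        # Numeric context yields 1 for a successful match, not the digits.
        return 1

def connect(host, port):
    # Stand-in for a connect() that coerces its second argument to an integer.
    return (str(host), int(port))

m = re.match(r'^(.+):(\d+)$', 'localhost:80')
host, port = DraftMatch(m.group(1)), DraftMatch(m.group(2))
endpoint = connect(host, port)
print(endpoint)
```

In this model the captured text '80' never reaches connect() as a number, which is the surprise reported here.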

I fixed this for 6.2.3 by flattening $0, $1, $2 into plain scalars
(for nonquantified matches), and using $/[0] etc. to store match objects,
but I'm not sure this treatment is right.

Is it really intended that we get into the habit of writing this?

if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
    my $socket = connect(~$0, +$1);
}

It looks... weird. :)

Thanks,
/Autrijus/




Re: Numification of captured match

2005-05-12 Thread Patrick R. Michaud
On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
 Is it really intended that we get into habit of writing this?
 
 if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
   my $socket = connect(~$0, +$1);
 }
 
 It looks... weird. :)

And it would have to be

 if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
     my $socket = connect(~$0, ~$1);
 }

because +$1 still evaluates to 1.  (The ~ in front of $0 is 
probably optional.)

My suggestion is that a match object in numeric context is the
same as evaluating its string value in a numeric context.  If
we need a way to find out the number of match repetitions (what
the numeric context was intended to provide), it might be better
done with an explicit C<.matchcount> method or something like that.
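
A rough Python sketch of this suggestion, under stated assumptions: the Match class and its count parameter are hypothetical, and only the .matchcount name comes from the message. The numeric value is derived from the string value, and the repetition count moves to an explicit method.

```python
class Match:
    """Hypothetical capture object: numeric value comes from the string value."""
    def __init__(self, text, count=1):
        self.text = text
        self._count = count  # assumed bookkeeping for match repetitions

    def __str__(self):
        return self.text

    def __int__(self):
        # Numeric context: evaluate the matched string as a number.
        return int(self.text)

    def matchcount(self):
        # Explicit accessor for what numeric context used to report.
        return self._count

port = Match('80')
print(int(port), port.matchcount())
```

With this split, passing the capture to code that expects an integer yields the matched digits, and the count is still available on request.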

Pm


Re: Numification of captured match

2005-05-12 Thread Jonathan Scott Duff
On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
 On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
  Is it really intended that we get into habit of writing this?
  
  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
  my $socket = connect(~$0, +$1);
  }
  
  It looks... weird. :)
 
 And it would have to be
 
  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
   my $socket = connect(~$0, ~$1);
  }
 
 because +$1 still evaluates to 1.  

That's some subtle evil.

 My suggestion is that a match object in numeric context is the
 same as evaluating its string value in a numeric context.  

While I agree that this would be the right behavior it still feels
special-casey, hackish and wrong.  

If, as an optimization, you could tell PGE that you didn't need Match
objects and only cared about the string results of your captures, that
might be better. For instance,

if 'localhost:80' ~~ m:s/^(.+)\:(\d+)$/ {
    my $socket = connect($0, $1);
}
:s for :string  (assuming that hasn't already been taken)

 If
 we need a way to find out the number of match repetitions (what
 the numeric context was intended to provide), it might be better
 done with an explicit C<.matchcount> method or something like that.

Surely that would just be [EMAIL PROTECTED]  Or have I crossed the perl[56]
streams again?

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: Numification of captured match

2005-05-12 Thread Rob Kinyon
On 5/12/05, Jonathan Scott Duff [EMAIL PROTECTED] wrote:
 On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
  On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
   Is it really intended that we get into habit of writing this?
  
   if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
   my $socket = connect(~$0, +$1);
   }
  
   It looks... weird. :)
 
  And it would have to be
 
   if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
my $socket = connect(~$0, ~$1);
   }
 
  because +$1 still evaluates to 1.
 
 That's some subtle evil.
 
  My suggestion is that a match object in numeric context is the
  same as evaluating its string value in a numeric context.
 
 While I agree that this would be the right behavior it still feels
 special-casey, hackish and wrong.
 
 If, as an optimization, you could tell PGE that you didn't need Match
 objects and only cared about the string results of your captures, that
 might be better. For instance,
 
 if 'localhost:80' ~~ m:s/^(.+)\:(\d+)$/ {
 my $socket = connect($0, $1);
 }
 :s for :string  (assuming that hasn't already been taken)

What about the fact that anything matching (\d+) is going to be an Int
and anything matching (.+) is going to be a String, and so forth?
There is sufficient information in the regex for P6 to know that $0
should smart-convert into a String and $1 should smart-convert into an
Int. Can't we just do that?
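
One way to picture this idea is the Python sketch below. It is an illustration only: typed_captures is a hypothetical helper, and it assumes simple, unnested groups and a pattern with no escaped parentheses. It reads each group's source text and converts the capture to an integer only when the group is written exactly as \d+.

```python
import re

def typed_captures(pattern, text):
    """Match text and convert each capture according to its group's source."""
    # Assumption: groups are simple and unnested, with no escaped parentheses.
    group_sources = re.findall(r'\((?!\?)([^()]*)\)', pattern)
    m = re.match(pattern, text)
    if m is None:
        return None
    # A group written exactly as \d+ captures only decimal digits,
    # so its text can be handed to int(); everything else stays a string.
    return tuple(
        int(value) if source == r'\d+' else value
        for source, value in zip(group_sources, m.groups())
    )

host, port = typed_captures(r'^(.+):(\d+)$', 'localhost:80')
print(host, port)
```

The sketch decides the type from the pattern rather than from the matched text, so a (.+) group that happens to capture digits still comes back as a string.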

Rob


Re: Numification of captured match

2005-05-12 Thread Larry Wall
On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
: On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
:  Is it really intended that we get into habit of writing this?
:  
:  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
:  my $socket = connect(~$0, +$1);
:  }
:  
:  It looks... weird. :)
: 
: And it would have to be
: 
:  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
:   my $socket = connect(~$0, ~$1);
:  }
: 
: because +$1 still evaluates to 1.  (The ~ in front of $0 is 
: probably optional.)
: 
: My suggestion is that a match object in numeric context is the
: same as evaluating its string value in a numeric context.  If
: we need a way to find out the number of match repetitions (what
: the numeric context was intended to provide), it might be better
: done with an explicit C<.matchcount> method or something like that.

I think we already said something like that once some number of
months ago.  +$1 simply has to be the numeric value of the match.
It's not as much of a problem as a Perl 5 programmer might think,
since ?$1 is still true even if +$1 is 0.  Anyway, while we could have
a method for the .matchcount, +$1[] should work fine too.  And maybe
even [EMAIL PROTECTED], presuming that a match object can function as an array
actually means a match object knows when it's being asked to supply
an array reference.
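
The distinction drawn here can be modelled in Python as follows. This is a sketch under assumptions: the Match class and its subcaptures list are hypothetical, with bool() standing in for ?$1, int() for +$1, and len() for the +$1[] count.

```python
class Match:
    """Hypothetical match object with separate boolean, numeric and array views."""
    def __init__(self, text, subcaptures=()):
        self.text = text
        self.subcaptures = list(subcaptures)

    def __bool__(self):
        # Boolean context: the object exists only for a successful match.
        return True

    def __int__(self):
        # Numeric context: the numeric value of the matched string.
        return int(self.text)

    def __len__(self):
        # Array view: how many sub-matches the object holds.
        return len(self.subcaptures)

zero = Match('0')
repeated = Match('123', ['1', '2', '3'])
print(bool(zero), int(zero), len(repeated))
```

Because truth and numeric value are answered separately, a capture of '0' is still a successful match even though its numeric value is 0.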

Actually, it's not clear to me offhand why @1 shouldn't mean $1[]
and %1 shouldn't mean $1{}.

Larry


Re: Numification of captured match

2005-05-12 Thread Damian Conway
Larry Wall wrote:

 I think we already said something like that once some number of
 months ago.  +$1 simply has to be the numeric value of the match.

Agreed.

 Anyway, while we could have
 a method for the .matchcount, +$1[] should work fine too.

Yep.

 Actually, it's not clear to me offhand why @1 shouldn't mean $1[]
 and %1 shouldn't mean $1{}.

It *does*. According to the recent capture semantics document:

    Note that, outside a rule, C<@1> is simply a shorthand for C<@{$1}>,

and:

    And, of course, outside the rule, C<%1> is a shortcut for C<%{$1}>:

Damian


Re: Numification of captured match

2005-05-12 Thread Larry Wall
On Fri, May 13, 2005 at 02:00:10PM +1000, Damian Conway wrote:
: Actually, it's not clear to me offhand why @1 shouldn't mean $1[]
: and %1 shouldn't mean $1{}.
: 
: It *does*. According to the recent capture semantics document:
: 
: Note that, outside a rule, C<@1> is simply a shorthand for C<@{$1}>,
: 
: and:
: 
: And, of course, outside the rule, C<%1> is a shortcut for C<%{$1}>:

In that case it's very much less clear to me why it shouldn't mean that.  :-)

Larry


Re: Numification of captured match

2005-05-12 Thread Jonathan Scott Duff
On Thu, May 12, 2005 at 08:10:42PM -0700, Larry Wall wrote:
 On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
 : My suggestion is that a match object in numeric context is the
 : same as evaluating its string value in a numeric context.  If
 : we need a way to find out the number of match repetitions (what
 : the numeric context was intended to provide), it might be better
 : done with an explicit C<.matchcount> method or something like that.
 
 I think we already said something like that once some number of
 months ago.  +$1 simply has to be the numeric value of the match.
 It's not as much of a problem as a Perl 5 programmer might think,
 since ?$1 is still true even if +$1 is 0.  Anyway, while we could have
 a method for the .matchcount, +$1[] should work fine too.  And maybe
 even [EMAIL PROTECTED], presuming that a match object can function as an array
 actually means a match object knows when it's being asked to supply
 an array reference.

So the counting idiom in S05 becomes one of:

$match_count += @{m:g/pattern/};
$match_count += list m:g/pattern/;
$match_count += m:g/pattern/.matchcount;
$match_count += (m:g/pattern/)[];   # maybe

???
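
For comparison, the first two forms above amount to taking the length of the list of all matches. A minimal Python sketch of that counting idiom follows, using re.findall as a stand-in for a global match. The sample lines and the pattern are made up for illustration.

```python
import re

# Hypothetical input; each string plays the role of one topic value.
lines = ['abc abc', 'xyz', 'abcabcabc']

match_count = 0
for line in lines:
    # The list of all non-overlapping matches, reduced to its length.
    match_count += len(re.findall(r'abc', line))

print(match_count)
```

Here the count comes from the list of matches itself, so nothing depends on how a single match object numifies.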

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: Numification of captured match

2005-05-12 Thread Patrick R. Michaud
On Thu, May 12, 2005 at 08:10:42PM -0700, Larry Wall wrote:
 On Thu, May 12, 2005 at 02:55:36PM -0500, Patrick R. Michaud wrote:
 : On Fri, May 13, 2005 at 03:23:20AM +0800, Autrijus Tang wrote:
 :  Is it really intended that we get into habit of writing this?
 :  
 :  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
 :    my $socket = connect(~$0, +$1);
 :  }
 :  
 :  It looks... weird. :)
 : 
 : And it would have to be
 : 
 :  if 'localhost:80' ~~ /^(.+)\:(\d+)$/ {
 :    my $socket = connect(~$0, ~$1);
 :  }
 : 
 : because +$1 still evaluates to 1.  (The ~ in front of $0 is 
 : probably optional.)
 : 
 : My suggestion is that a match object in numeric context is the
 : same as evaluating its string value in a numeric context.  If
 : we need a way to find out the number of match repetitions (what
 : the numeric context was intended to provide), it might be better
 : done with an explicit C<.matchcount> method or something like that.
 
 I think we already said something like that once some number of
 months ago.  

I guess I've been led astray (or downright confused) by the capture 
specs then, when it says:

A successful match returns a C<Match> object whose boolean value is
true, whose integer value is typically 1 (except under the C<:g> or
C<:x> flags; see L<Capturing from non-singular matches>), whose string
value is the complete substring that was matched by the entire rule,
whose array component contains all subpattern (unnamed) captures, and
whose hash component contains all subrule (named) captures.

and later

If a named scalar alias is applied to a set of non-capturing 
brackets:
m:w/ $key:=[ ([A-E]) (\d**{3..6}) (X?) ] /;
then the corresponding entry in the rule's hash is assigned a 
C<Match> object whose:
* Boolean value is true,
* Integer value is 1,
* String value is the complete substring matched by the 
  contents of the square brackets,
* Array and hash are both empty.

and under the :g option...

 if $text ~~ m:words:globally/ (\S+:) rocks / {
     say "Matched {+$/} different ways";
     say 'Full match context is:';
     say $/;
 }

So, are the Match objects returned from subpattern captures 
treated differently in numeric context than the Match objects
coming from named scalar aliases or the match itself... ?

 It's not as much of a problem as a Perl 5 programmer might think,
 since ?$1 is still true even if +$1 is 0.  Anyway, while we could have
 a method for the .matchcount, +$1[] should work fine too.  

With .matchcount I wasn't concerned about the number of repetitions
stored in $1 -- I was trying to get at the numeric value that $/
would've returned under the :g option.  But in re-reading the draft
of the :globally option I see we already have one --  
C< $/.matches >  in numeric context should supply it for us.

So I'm guessing that we're all in agreement that +$/, +$1, and 
+$subrule all refer to the numeric value of the string matched, 
as opposed to what's currently written about their values in the 
draft...?  Or am I still missing the picture entirely?

Pm


Re: Numification of captured match

2005-05-12 Thread Damian Conway
Patrick surmised:

 So I'm guessing that we're all in agreement that +$/, +$1, and 
 +$subrule all refer to the numeric value of the string matched, 
 as opposed to what's currently written about their values in the 
 draft...?

Yes. The semantics proposed in the draft have proved to be too orthogonal for
practical use. ;-)

Damian