This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1  TITLE

counting matches

=head1 VERSION

        Maintainer: Richard Proctor <[EMAIL PROTECTED]>
        Date: 16 Aug 2000
        Last Modified: 12 Sep 2000
        Mailing List: [EMAIL PROTECTED]
        Number: 110
        Version: 5
        Status: Developing

=head1 ABSTRACT

Provide a simple way of giving a count of matches of a pattern.

=head1 DESCRIPTION

Have you ever wanted to count the number of matches of a patten?  s///g 
returns the number of matches it finds.  m//g just returns 1 for matching.
Counts can be made using s//$&/g but this is wastefull, or by putting some 
counting loop round a m//g.  But this all seams rather messy. 

TomC (and a couple of others) have said that it can also be done as :    
        $count = () = $string =~ /pattern/g;

However many people do not like this construct, here are a couple of quotes:

jhi: Which I find cute as a demonstration of the Perl's context concept,
but ugly as hell from usability viewpoint.  

Bart Lateur: '()=' is not perfect. It is also butt ugly. It is a "dirty hack".

This construct is also likely to be inefficient as perl will have to
build up a list of all the matches, store them somewhere, count them, then
throw them away.

Therefore I would like a way of counting matches.

=head2 Proposal

m//gt (or m//t see below) would be defined to do the match, and return the
count of matches, this leaves all existing uses consistent and unaffected.
/t is suggested for "counT", as /c is already taken.

Relationship of m//t and m//g - there are three possibilities, my original:

m//gt, where /t adds counting to a group match (/t without /g would just
return 0 or 1).  However \G loses its meaning.

The Alternative By Uri :

m//t and m//g are mutually exclusive and m//gt should be regarded as an error.

Hugo:

> I like this too. I'd suggest /t should mean a) return a scalar of
> the number of matches and b) don't set any special variables. Then
> /t without /g would return 0 or 1, but be faster since no extra
> information need be captured (except internally for (.)\1 type
> matching - compile time checks could determine if these are needed,
> though (?{..}) and (??{..}) patterns would require disabling of
> that optimisation). /tg would give a scalar count of the total
> number of matches. \G would retain its meaning.

I think Hugo's wording about the relationship makes the best sense, and
this is the suggested way forward.

=head1 CHANGES

RFC110 V1 - Original posting to perl6-language

RFC110 V2 - Reposted to perl6-language-regex

RFC110 V3 - Added Uri's alternitive m//t

RFC110 V4 - Added notes about $count = () = $string =~ /pattern/g

RFC110 V5 - Added Hugo's wording about /g and /t relationship, suggested this
is the way forward.

Unless any significant discussion takes place this RFC will move to frozen
within a week.

=head1 IMPLENTATION

Hugo:
> Implementation should be fairly straightforward,
> though ensuring that optimisations occurred precisely when they
> are safe would probably involve a few bug-chasing cycles.


=head1 REFERENCES

I brought this up on p5p a couple of years ago, but it was lost in the noise...



Reply via email to