RFC 360 (v1) Allow multiply matched groups in regexes to return a listref of all matches

2000-10-01 Thread Perl6 RFC Librarian

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Allow multiply matched groups in regexes to return a listref of all matches

=head1 VERSION

  Maintainer: Kevin Walker [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 360
  Version: 1
  Status: Developing

=head1 DESCRIPTION

Since the October 1 RFC deadline is nigh, this will be pretty informal.

Suppose you want to parse text that looks like:

 name: John Abajace
 children: Tom, Dick, Harry
 favorite colors: red, green, blue

 name: I. J. Reilly
 children: Jane, Gertrude
 favorite colors: black, white
 
 ...

Currently, this takes two passes:

 while ($text =~ /name:\s*(.*?)\n\s*
children:\s*(.*?)\n\s*
favorite\ colors:\s*(.*?)\n/sigx) {
 # now second pass for $2 ( = "Tom, Dick, Harry") and $3, yielding
 # list of children and favorite colors
 }
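
For concreteness, the two passes might be spelled out roughly as below.
This is only a sketch under a few assumptions: the sample $text, the
@records array and the hash keys are made up for illustration, and the
second pass is done with split, which is just one of several ways to
do it.

```perl
use strict;
use warnings;

# Sample input in the format shown above (illustrative only).
my $text = <<'END';
name: John Abajace
children: Tom, Dick, Harry
favorite colors: red, green, blue

name: I. J. Reilly
children: Jane, Gertrude
favorite colors: black, white
END

my @records;    # hypothetical container for the parsed entries
while ($text =~ /name:\s*(.*?)\n\s*
                 children:\s*(.*?)\n\s*
                 favorite\ colors:\s*(.*?)\n/sigx) {
    # First pass: $2 and $3 are still comma-separated strings.
    my ($name, $children, $colors) = ($1, $2, $3);

    # Second pass: break each string into a list.
    push @records, {
        name     => $name,
        children => [ split /\s*,\s*/, $children ],
        colors   => [ split /\s*,\s*/, $colors ],
    };
}

for my $rec (@records) {
    printf "%s: children=[%s] colors=[%s]\n",
        $rec->{name},
        join(', ', @{ $rec->{children} }),
        join(', ', @{ $rec->{colors} });
}
```

The captures are copied into lexicals before the second pass because any
further successful match would overwrite $1, $2 and $3.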

If we introduce a new construction, (?@ ... ), which means "spit out a
list ref of all matches, not just the last match", then this could be
done in one pass:

 while ($text =~ /name:\s*(.*?)\n\s*
children:\s*(?:(?@[^,\s]+)[, ]*)*\n\s*
favorite\ colors:\s*(?:(?@[^,\s]+)[, ]*)*\n/sigx) {
 # now we have:
 #  $1 = "John Abajace";
 #  $2 = ["Tom", "Dick", "Harry"]
 #  $3 = ["red", "green", "blue"]
 }

Although the above example is contrived, I have very often felt the need
for this feature in real-world projects.

=head1 IMPLEMENTATION

Unknown.

=head1 REFERENCES

None.




RE: RFC 360 (v1) Allow multiply matched groups in regexes to return a listref of all matches

2000-10-01 Thread David Grove

On Sunday, October 01, 2000 1:38 AM, Perl6 RFC Librarian wrote:

Definitely. I think this has been one of the few actual "flaws" in the 
language. People are always trying to

($part1, $somevar) =~ /(.*):(.*)/;

This would be in list context. In scalar context, it could still return the
number of patterns matched in the parentheses, or a 1 or 0 to indicate whether
there was a match at all, which would be less useful. Since it's a common error
(I believe it's even FAQed a few times), this goes well beyond a request for
syntactic sugar and points out a flaw in the language. People expect it to be
there as part of what makes Perl make sense.
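
For comparison, here is a rough sketch of how current Perl 5 spells the
list-context and scalar-context cases mentioned above. The sample $line
and the variables $matched, @names and $count are assumptions made up
for the example; only $part1 and $somevar come from the snippet above.

```perl
use strict;
use warnings;

my $line = "children:Tom, Dick, Harry";   # hypothetical sample input

# List context: a successful match returns the captured groups,
# so the match is assigned to the list rather than bound to it.
my ($part1, $somevar) = $line =~ /(.*):(.*)/;

# Scalar context: the same match yields only true or false.
my $matched = $line =~ /(.*):(.*)/ ? 1 : 0;

# With /g in list context, every match of the group comes back;
# assigning through an empty list turns that into a count.
my @names = $somevar =~ /([^,\s]+)/g;
my $count = () = $somevar =~ /([^,\s]+)/g;

print "$part1 => @names ($count items, matched=$matched)\n";
```

Note that the repeated matches only come back here because the group is
applied with /g on its own; a group repeated inside a larger pattern
still keeps just its last match, which is the gap the RFC is about.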