Author: lwall
Date: 2009-03-19 01:43:53 +0100 (Thu, 19 Mar 2009)
New Revision: 25902

Modified:
   docs/Perl6/Spec/S05-regex.pod
Log:
[S05] define .caps and .chunks methods on match objects


Modified: docs/Perl6/Spec/S05-regex.pod
===================================================================
--- docs/Perl6/Spec/S05-regex.pod       2009-03-18 23:02:41 UTC (rev 25901)
+++ docs/Perl6/Spec/S05-regex.pod       2009-03-19 00:43:53 UTC (rev 25902)
@@ -16,7 +16,7 @@
    Date: 24 Jun 2002
    Last Modified: 18 Mar 2009
    Number: 5
-   Version: 92
+   Version: 93
 
 This document summarizes Apocalypse 5, which is about the new regex
 syntax.  We now try to call them I<regex> rather than "regular
@@ -1705,14 +1705,14 @@
 =item * before C<pattern>
 
 Perform lookahead -- i.e., check if we're at a position where
-C<pattern> matches.  Returns a zero-width Match object on
+C<pattern> matches.  Returns a zero-width C<Match> object on
 success.
 
 =item * after C<pattern>
 
 Perform lookbehind -- i.e., check if the string before the
 current position matches <pattern> (anchored at the end).
-Returns a zero-width Match object on success.
+Returns a zero-width C<Match> object on success.
 
 =item * <?>
 
@@ -2385,7 +2385,7 @@
 
 =item *
 
-A match always returns a Match object, which is also available
+A match always returns a C<Match> object, which is also available
 as C<$/>, which is a contextual lexical declared in the outer
 subroutine that is calling the regex.  (A regex declares its own
 lexical C<$/> variable, which always refers to the most recent
@@ -2547,6 +2547,9 @@
     $/.chars   # $/.to - $/.from
     $/.orig    # the original match string
     $/.Str     # substr($/.orig, $/.from, $/.chars)
+    $/.ast      # the abstract result associated with this node
+    $/.caps     # sequential captures
+    $/.chunks   # sequential tokenization
 
 Within the regex the current match state C<$¢> also provides
 
@@ -2558,6 +2561,18 @@
 
 =item *
 
+As described above, a C<Match> in list context returns its positional
+captures.  However, sometimes you'd rather get a flat list of tokens in
+the order they occur in the text.  The C<.caps> method returns a list
+of every captured item, regardless of how it was otherwise bound into
+named or numbered captures.  The C<.chunks> method returns the captures
+as well as all the interleaved "noise" between the captures. [Conjecture:
+we could also have C<.deepcaps> and C<.deepchunks> that recursively expand
+any capture containing submatches.  Presumably each returned chunk would
+come equipped with some method to discover its "pedigree" in the parse tree.]
+
+=item *
+
 All match attempts--successful or not--against any regex, subrule, or
 subpattern (see below) return an object of class C<Match>. That is:
 
@@ -2566,8 +2581,8 @@
 
 =item *
 
-This returned object is also automatically assigned to the lexical
-C<$/> variable of the current surroundings. That is:
+This returned object is also automatically bound to the lexical
+C<$/> variable of the current surroundings regardless of success. That is:
 
      $str ~~ /pattern/;
      say "Matched" if $/;
@@ -3122,7 +3137,7 @@
         #        |                            |
       mm/ $<key>=[ (<[A..E]>) (\d**3..6) (X?) ] /;
 
-then the corresponding C<< $/<key> >> Match object contains only the string
+then the corresponding C<< $/<key> >> C<Match> object contains only the string
 matched by the non-capturing brackets.
 
 =item *
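
Note on the new C<.caps> and C<.chunks> text above: the difference between the two methods can be modeled outside Perl 6. The following is a minimal Python sketch, not the Perl 6 implementation. It assumes non-nested capture groups, and the helper names `caps` and `chunks`, the `(key, text)` tuple shape, and the "noise" label are all illustrative choices that this revision of the spec does not define.

```python
import re


def _ordered_spans(m):
    """Return (start, end, group_number) for every group that took part
    in the match, sorted by position in the text."""
    spans = [
        (m.start(i), m.end(i), i)
        for i in range(1, m.re.groups + 1)
        if m.start(i) != -1  # -1 means the group did not participate
    ]
    spans.sort()
    return spans


def caps(m):
    """Model of .caps: captured items in the order they occur in the text."""
    return [(i, m.string[s:e]) for s, e, i in _ordered_spans(m)]


def chunks(m):
    """Model of .chunks: the captures plus the uncaptured text between them."""
    out = []
    pos = m.start()
    for s, e, i in _ordered_spans(m):
        if s > pos:
            # Text matched by the pattern but not captured by any group.
            out.append(("noise", m.string[pos:s]))
        out.append((i, m.string[s:e]))
        pos = e
    if pos < m.end():
        out.append(("noise", m.string[pos:m.end()]))
    return out


m = re.search(r"(\w+)\s*=\s*(\d+)", "key = 42")
print(caps(m))
print(chunks(m))
```

In this model, concatenating the text of every chunk reproduces the full matched string, which is what separates "sequential tokenization" from "sequential captures".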
