Author: lwall
Date: 2009-03-19 01:43:53 +0100 (Thu, 19 Mar 2009)
New Revision: 25902
Modified:
docs/Perl6/Spec/S05-regex.pod
Log:
[S05] define .caps and .chunks methods on match objects
Modified: docs/Perl6/Spec/S05-regex.pod
===
--- docs/Perl6/Spec/S05-regex.pod 2009-03-18 23:02:41 UTC (rev 25901)
+++ docs/Perl6/Spec/S05-regex.pod 2009-03-19 00:43:53 UTC (rev 25902)
@@ -16,7 +16,7 @@
Date: 24 Jun 2002
Last Modified: 18 Mar 2009
Number: 5
- Version: 92
+ Version: 93
This document summarizes Apocalypse 5, which is about the new regex
syntax. We now try to call them I<regex> rather than "regular
@@ -1705,14 +1705,14 @@
=item * before C<pattern>
Perform lookahead -- i.e., check if we're at a position where
-C<pattern> matches. Returns a zero-width Match object on
+C<pattern> matches. Returns a zero-width C<Match> object on
success.
=item * after C<pattern>
Perform lookbehind -- i.e., check if the string before the
current position matches (anchored at the end).
-Returns a zero-width Match object on success.
+Returns a zero-width C<Match> object on success.
=item *
@@ -2385,7 +2385,7 @@
=item *
-A match always returns a Match object, which is also available
+A match always returns a C<Match> object, which is also available
as C<$/>, which is a contextual lexical declared in the outer
subroutine that is calling the regex. (A regex declares its own
lexical C<$/> variable, which always refers to the most recent
@@ -2547,6 +2547,9 @@
$/.chars # $/.to - $/.from
$/.orig # the original match string
$/.Str # substr($/.orig, $/.from, $/.chars)
+$/.ast # the abstract result associated with this node
+$/.caps # sequential captures
+$/.chunks # sequential tokenization
Within the regex the current match state C<$¢> also provides
@@ -2558,6 +2561,18 @@
=item *
+As described above, a C<Match> in list context returns its positional
+captures. However, sometimes you'd rather get a flat list of tokens in
+the order they occur in the text. The C<.caps> method returns a list
+of every captured item, regardless of how it was otherwise bound into
+named or numbered captures. The C<.chunks> method returns the captures
+as well as all the interleaved "noise" between the captures. [Conjecture:
+we could also have C<.deepcaps> and C<.deepchunks> that recursively expand
+any capture containing submatches. Presumably each returned chunk would
+come equipped with some method to discover its "pedigree" in the parse tree.]
+
+=item *
+
All match attempts--successful or not--against any regex, subrule, or
subpattern (see below) return an object of class C<Match>. That is:
@@ -2566,8 +2581,8 @@
=item *
-This returned object is also automatically assigned to the lexical
-C<$/> variable of the current surroundings. That is:
+This returned object is also automatically bound to the lexical
+C<$/> variable of the current surroundings regardless of success. That is:
$str ~~ /pattern/;
say "Matched" if $/;
@@ -3122,7 +3137,7 @@
#||
mm/ $=[ (<[A..E]>) (\d**3..6) (X?) ] /;
-then the corresponding C<< $/ >> Match object contains only the string
+then the corresponding C<< $/ >> C<Match> object contains only the string
matched by the non-capturing brackets.
=item *