vrana Mon Jun 13 12:26:28 2005 EDT
Modified files:
/phpdoc/en/reference/pcre pattern.syntax.xml
Log:
Document PCRE 4.3
# However, PCRE 5.0 is in the sources now
http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/pattern.syntax.xml?r1=1.6&r2=1.7&ty=u
Index: phpdoc/en/reference/pcre/pattern.syntax.xml
diff -u phpdoc/en/reference/pcre/pattern.syntax.xml:1.6 phpdoc/en/reference/pcre/pattern.syntax.xml:1.7
--- phpdoc/en/reference/pcre/pattern.syntax.xml:1.6 Sun Dec 19 08:35:34 2004
+++ phpdoc/en/reference/pcre/pattern.syntax.xml Mon Jun 13 12:26:27 2005
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.6 $ -->
+<!-- $Revision: 1.7 $ -->
<!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
<refentry id="reference.pcre.pattern.syntax">
<refnamediv>
@@ -67,7 +67,7 @@
<listitem>
<simpara>
The following Perl escape sequences are not supported:
- \l, \u, \L, \U, \E, \Q. In fact these are implemented by
+ \l, \u, \L, \U. In fact these are implemented by
Perl's general string-handling and are not part of its
pattern matching engine.
</simpara>
@@ -575,6 +575,15 @@
newline that is the last character of the string as well as at the end of
the string, whereas <literal>\z</literal> matches only at the end.
</para>
+
+ <para>
+ Characters between <literal>\Q</literal> and <literal>\E</literal> are
+ taken literally, so metacharacters lose their special meaning. For example,
+ <literal>\w+\Q.$.\E$</literal> matches one or more word characters
+ followed by the literal text <literal>.$.</literal>, anchored at the end
+ of the string.
+ </para>
+
</refsect2>
<refsect2 id="regexp.reference.circudollar">
@@ -924,6 +933,13 @@
setting in one branch does affect subsequent branches, so
the above patterns match "SUNDAY" as well as "Saturday".
</para>
+
+ <para>
+ It is possible to name the subpattern with
+ <literal>(?P&lt;name&gt;pattern)</literal>. The array of matches will then
+ contain the match indexed by that name as well as the match indexed by
+ its number.
+ </para>
</refsect2>
<refsect2 id="regexp.reference.repetition">
@@ -1058,6 +1074,13 @@
default behaviour.
</para>
<para>
+ Quantifiers followed by <literal>+</literal> are "possessive". They eat
+ as many characters as possible and never backtrack to let the rest of the
+ pattern match. Thus <literal>.*abc</literal> matches "aabc" but
+ <literal>.*+abc</literal> doesn't because <literal>.*+</literal> eats the
+ whole string. Possessive quantifiers can speed up processing by avoiding futile backtracking.
+ </para>
+ <para>
When a parenthesized subpattern is quantified with a minimum
repeat count that is greater than 1 or with a limited maximum,
more store is required for the compiled pattern, in
@@ -1556,6 +1579,13 @@
there is no way to give an out-of-memory error from within a
recursion.
</para>
+
+ <para>
+ <literal>(?1)</literal>, <literal>(?2)</literal> and so on recurse into
+ the subpattern with that number. Named subpatterns can be called the
+ same way: <literal>(?P&gt;foo)</literal>.
+ </para>
+
</refsect2>
<refsect2 id="regexp.reference.performances">