nlopess Thu Jul 29 06:15:26 2004 EDT
Modified files:
/phpdoc/en/reference/pcre pattern.modifiers.xml pattern.syntax.xml
reference.xml
Log:
fix IDs: now livedocs correctly links the pattern syntax/modifiers
some WS
http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/pattern.modifiers.xml?r1=1.1&r2=1.2&ty=u
Index: phpdoc/en/reference/pcre/pattern.modifiers.xml
diff -u phpdoc/en/reference/pcre/pattern.modifiers.xml:1.1 phpdoc/en/reference/pcre/pattern.modifiers.xml:1.2
--- phpdoc/en/reference/pcre/pattern.modifiers.xml:1.1 Wed Mar 3 00:06:14 2004
+++ phpdoc/en/reference/pcre/pattern.modifiers.xml Thu Jul 29 06:15:26 2004
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.1 $ -->
+<!-- $Revision: 1.2 $ -->
<!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
- <refentry id="pcre.pattern.modifiers">
+ <refentry id="reference.pcre.pattern.modifiers">
<refnamediv>
<refname>Pattern Modifiers</refname>
<refpurpose>Describes possible modifiers in regex
http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/pattern.syntax.xml?r1=1.1&r2=1.2&ty=u
Index: phpdoc/en/reference/pcre/pattern.syntax.xml
diff -u phpdoc/en/reference/pcre/pattern.syntax.xml:1.1 phpdoc/en/reference/pcre/pattern.syntax.xml:1.2
--- phpdoc/en/reference/pcre/pattern.syntax.xml:1.1 Wed Mar 3 00:06:14 2004
+++ phpdoc/en/reference/pcre/pattern.syntax.xml Thu Jul 29 06:15:26 2004
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.1 $ -->
+<!-- $Revision: 1.2 $ -->
<!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
- <refentry id="pcre.pattern.syntax">
+ <refentry id="reference.pcre.pattern.syntax">
<refnamediv>
<refname>Pattern Syntax</refname>
<refpurpose>Describes PCRE regex syntax</refpurpose>
@@ -121,23 +121,29 @@
</listitem>
<listitem>
<simpara>
- If <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> is set and
- <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> is not
- set, the $ meta-character matches only at the very end of
- the string.
+ If <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+ is set and <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
+ not set, the $ meta-character matches only at the very end of the
+ string.
</simpara>
</listitem>
<listitem>
<simpara>
- If <link linkend="pcre.pattern.modifiers">PCRE_EXTRA</link> is set, a backslash followed by a letter
- with no special meaning is faulted.
+ If <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link> is
+ set, a backslash followed by a letter with no special meaning is
+ faulted.
</simpara>
</listitem>
<listitem>
<simpara>
- If <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link> is set, the greediness of the
- repetition quantifiers is inverted, that is, by default they are
- not greedy, but if followed by a question mark they are.
+ If <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link> is
+ set, the greediness of the repetition quantifiers is inverted,
+ that is, by default they are not greedy, but if followed by a
+ question mark they are.
</simpara>
</listitem>
</orderedlist>
@@ -358,12 +364,12 @@
particular, if you want to match a backslash, you write "\\".
</para>
<para>
- If a pattern is compiled with the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option,
+ If a pattern is compiled with the <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option,
whitespace in the pattern (other than in a character class) and
- characters between a "#" outside a character class and the
- next newline character are ignored. An escaping backslash
- can be used to include a whitespace or "#" character as part
- of the pattern.
+ characters between a "#" outside a character class and the next newline
+ character are ignored. An escaping backslash can be used to include a
+ whitespace or "#" character as part of the pattern.
</para>
<para>
A second use of backslash provides a way of encoding
@@ -731,13 +737,13 @@
circumflex and dollar (described below) in that they only
ever match at the very start and end of the subject string,
whatever options are set. They are not affected by the
- <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> or
- <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> options.
- The difference between <literal>\Z</literal> and
- <literal>\z</literal> is that <literal>\Z</literal>
- matches before a newline that is the
- last character of the string as well as at the end of the
- string, whereas <literal>\z</literal> matches only at the end.
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> or
+ <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+ options. The difference between <literal>\Z</literal> and
+ <literal>\z</literal> is that <literal>\Z</literal> matches before a
+ newline that is the last character of the string as well as at the end of
+ the string, whereas <literal>\z</literal> matches only at the end.
</para>
</refsect2>
@@ -773,28 +779,31 @@
<para>
The meaning of dollar can be changed so that it matches only
at the very end of the string, by setting the
- <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
option at compile or matching time. This
does not affect the \Z assertion.
</para>
<para>
The meanings of the circumflex and dollar characters are
- changed if the <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> option is set. When this is
- the case, they match immediately after and immediately
- before an internal "\n" character, respectively, in addition
- to matching at the start and end of the subject string. For
- example, the pattern /^abc$/ matches the subject string
- "def\nabc" in multiline mode, but not otherwise.
- Consequently, patterns that are anchored in single line mode
- because all branches start with "^" are not anchored in
- multiline mode. The <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link> option is ignored if
- <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> is set.
+ changed if the <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> option
+ is set. When this is the case, they match immediately after and
+ immediately before an internal "\n" character, respectively, in addition
+ to matching at the start and end of the subject string. For example, the
+ pattern /^abc$/ matches the subject string "def\nabc" in multiline mode,
+ but not otherwise. Consequently, patterns that are anchored in single
+ line mode because all branches start with "^" are not anchored in
+ multiline mode. The <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+ option is ignored if <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
+ set.
</para>
<para>
Note that the sequences \A, \Z, and \z can be used to match
the start and end of the subject in both modes, and if all
branches of a pattern start with \A is it always anchored,
- whether <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> is set or not.
+ whether <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is set or not.
</para>
</refsect2>
@@ -803,7 +812,8 @@
<para>
Outside a character class, a dot in the pattern matches any
one character in the subject, including a non-printing
- character, but not (by default) newline. If the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>
+ character, but not (by default) newline. If the <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>
option is set, then dots match newlines as well. The
handling of dot is entirely independent of the handling of
circumflex and dollar, the only relationship being that they
@@ -850,9 +860,10 @@
</para>
<para>
The newline character is never treated in any special way in
- character classes, whatever the setting of the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>
- or <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> options is. A class such as [^a] will
- always match a newline.
+ character classes, whatever the setting of the <link
+ linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>
+ or <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>
+ options is. A class such as [^a] will always match a newline.
</para>
<para>
The minus (hyphen) character can be used to specify a range
@@ -923,10 +934,10 @@
<refsect2 id="regexp.reference.internal-options">
<title>Internal option setting</title>
<para>
- The settings of <link linkend="pcre.pattern.modifiers">PCRE_CASELESS</link>,
- <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link>,
- <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>,
- and <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> can be changed from within the pattern by
+ The settings of <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>,
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>,
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>,
+ and <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> can be changed from within the pattern by
a sequence of Perl option letters enclosed between "(?" and
")". The option letters are
@@ -936,19 +947,19 @@
<tbody>
<row>
<entry><literal>i</literal></entry>
- <entry>for <link linkend="pcre.pattern.modifiers">PCRE_CASELESS</link></entry>
+ <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link></entry>
</row>
<row>
<entry><literal>m</literal></entry>
- <entry>for <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link></entry>
+ <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link></entry>
</row>
<row>
<entry><literal>s</literal></entry>
- <entry>for <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link></entry>
+ <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link></entry>
</row>
<row>
<entry><literal>x</literal></entry>
- <entry>for <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link></entry>
+ <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link></entry>
</row>
</tbody>
</tgroup>
@@ -958,8 +969,8 @@
For example, (?im) sets caseless, multiline matching. It is
also possible to unset these options by preceding the letter
with a hyphen, and a combined setting and unsetting such as
- (?im-sx), which sets <link linkend="pcre.pattern.modifiers">PCRE_CASELESS</link> and <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link> while
- unsetting <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> and <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link>, is also permitted.
+ (?im-sx), which sets <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link> and <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> while
+ unsetting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> and <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>, is also permitted.
If a letter appears both before and after the hyphen, the
option is unset.
</para>
@@ -980,7 +991,7 @@
<para>
which in turn is the same as compiling the pattern abc with
- <link linkend="pcre.pattern.modifiers">PCRE_CASELESS</link> set.
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link> set.
In other words, such "top level" settings apply to the whole
pattern (unless there are other changes inside subpatterns).
If there is more than one setting of the same option at top level,
@@ -995,7 +1006,7 @@
<literal>(a(?i)b)c</literal>
matches abc and aBc and no other strings (assuming
- <link linkend="pcre.pattern.modifiers">PCRE_CASELESS</link> is not used). By this means, options can be
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link> is not used). By this means, options can be
made to have different settings in different parts of the
pattern. Any changes made in one alternative do carry on
into subsequent branches within the same subpattern. For
@@ -1009,8 +1020,8 @@
compile time. There would be some very weird behaviour otherwise.
</para>
<para>
- The PCRE-specific options <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link> and
- <link linkend="pcre.pattern.modifiers">PCRE_EXTRA</link> can
+ The PCRE-specific options <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link> and
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link> can
be changed in the same way as the Perl-compatible options by
using the characters U and X respectively. The (?X) flag
setting is special in that it must always occur earlier in
@@ -1218,7 +1229,7 @@
that is the only way the rest of the pattern matches.
</para>
<para>
- If the <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link> option is set (an option which is not
+ If the <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link> option is set (an option which is not
available in Perl) then the quantifiers are not greedy by
default, but individual ones can be made greedy by following
them with a question mark. In other words, it inverts the
@@ -1231,7 +1242,7 @@
proportion to the size of the minimum or maximum.
</para>
<para>
- If a pattern starts with .* or .{0,} and the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>
+ If a pattern starts with .* or .{0,} and the <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>
option (equivalent to Perl's /s) is set, thus allowing the .
to match newlines, then the pattern is implicitly anchored,
because whatever follows will be tried against every character
@@ -1239,7 +1250,7 @@
retrying the overall match at any position after the first.
PCRE treats such a pattern as though it were preceded by \A.
In cases where it is known that the subject string contains
- no newlines, it is worth setting <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> when the pattern begins with .* in order to
+ no newlines, it is worth setting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> when the pattern begins with .* in order to
obtain this optimization, or
alternatively using ^ to indicate anchoring explicitly.
</para>
@@ -1311,7 +1322,7 @@
following the backslash are taken as part of a potential
back reference number. If the pattern continues with a digit
character, then some delimiter must be used to terminate the
- back reference. If the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option is set, this can
+ back reference. If the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option is set, this can
be whitespace. Otherwise an empty comment can be used.
</para>
<para>
@@ -1603,7 +1614,7 @@
condition is satisfied if the capturing subpattern of that
number has previously matched. Consider the following pattern,
which contains non-significant white space to make it
- more readable (assume the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option) and to
+ more readable (assume the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option) and to
divide it into three parts for ease of discussion:
<literal>( \( )? [^()]+ (?(1) \) )</literal>
@@ -1655,7 +1666,7 @@
comment play no part in the pattern matching at all.
</para>
<para>
- If the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link> option is set, an unescaped # character
+ If the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option is set, an unescaped # character
outside a character class introduces a comment that
continues up to the next newline character in the pattern.
</para>
@@ -1673,7 +1684,7 @@
expressions to recurse (among other things). The special
item (?R) is provided for the specific case of recursion.
This PCRE pattern solves the parentheses problem (assume
- the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link>
+ the <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>
option is set so that white space is
ignored):
@@ -1737,10 +1748,10 @@
regular expressions for efficient performance.
</para>
<para>
- When a pattern begins with .* and the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> option is
+ When a pattern begins with .* and the <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> option is
set, the pattern is implicitly anchored by PCRE, since it
can match only at the start of a subject string. However, if
- <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> is not set, PCRE cannot make this optimization,
+ <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> is not set, PCRE cannot make this optimization,
because the . metacharacter does not then match a newline,
and if the subject string contains newlines, the pattern may
match from the character immediately following one of them
@@ -1756,7 +1767,7 @@
<para>
If you are using such a pattern with subject strings that do
not contain newlines, the best performance is obtained by
- setting <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link>, or starting the pattern with ^.* to
+ setting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>, or starting the pattern with ^.* to
indicate explicit anchoring. That saves PCRE from having to
scan along the subject looking for a newline to restart at.
</para>
http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/reference.xml?r1=1.11&r2=1.12&ty=u
Index: phpdoc/en/reference/pcre/reference.xml
diff -u phpdoc/en/reference/pcre/reference.xml:1.11 phpdoc/en/reference/pcre/reference.xml:1.12
--- phpdoc/en/reference/pcre/reference.xml:1.11 Wed Mar 3 00:06:14 2004
+++ phpdoc/en/reference/pcre/reference.xml Thu Jul 29 06:15:26 2004
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.11 $ -->
+<!-- $Revision: 1.12 $ -->
<reference id="ref.pcre">
<title>Regular Expression Functions (Perl-Compatible)</title>
<titleabbrev>PCRE</titleabbrev>
@@ -15,13 +15,14 @@
the delimiter character has to be used in the expression itself,
it needs to be escaped by backslash. Since PHP 4.0.4, you can also use
Perl-style (), {}, [], and <> matching delimiters.
- See <link linkend="pcre.pattern.syntax">Pattern Syntax</link>
+ See <link linkend="reference.pcre.pattern.syntax">Pattern Syntax</link>
for detailed explanation.
</para>
<para>
The ending delimiter may be followed by various modifiers that
affect the matching.
- See <link linkend="pcre.pattern.modifiers">Pattern Modifiers</link>.
+ See <link linkend="reference.pcre.pattern.modifiers">Pattern
+ Modifiers</link>.
</para>
<para>
PHP also supports regular expressions using a POSIX-extended syntax