[PHP-DOC] cvs: phpdoc /en/reference/pcre/functions pcre.pattern.syntax.xml

Mehdi Achour Mon, 01 Dec 2003 16:47:26 -0800

didou           Mon Dec  1 18:44:09 2003 EDT


  Modified files:              
    /phpdoc/en/reference/pcre/functions pcre.pattern.syntax.xml 
  Log:
  There were forgotten division-marks inside words (Jakub Vrana)

Index: phpdoc/en/reference/pcre/functions/pcre.pattern.syntax.xml
diff -u phpdoc/en/reference/pcre/functions/pcre.pattern.syntax.xml:1.5 phpdoc/en/reference/pcre/functions/pcre.pattern.syntax.xml:1.6
--- phpdoc/en/reference/pcre/functions/pcre.pattern.syntax.xml:1.5      Tue Aug  6 16:04:34 2002
+++ phpdoc/en/reference/pcre/functions/pcre.pattern.syntax.xml  Mon Dec  1 18:44:09 2003
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.5 $ -->
+<!-- $Revision: 1.6 $ -->
 <!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
   <refentry id="pcre.pattern.syntax">
    <refnamediv>
@@ -47,10 +47,10 @@
      </listitem>
      <listitem>
       <simpara>
-     Capturing subpatterns that occur inside negative looka-
-     head assertions are counted, but their entries in the
-     offsets vector are never set. Perl sets its numerical vari-
-     ables from any such patterns that are matched before the
+     Capturing subpatterns that occur inside negative
+     lookahead assertions are counted, but their entries in the
+     offsets vector are never set. Perl sets its numerical
+     variables from any such patterns that are matched before the
      assertion fails to match something (thereby succeeding), but
      only  if  the negative lookahead assertion contains just one
      branch.
@@ -68,8 +68,8 @@
       <simpara>
      The following Perl escape sequences  are  not  supported:
      \l,  \u,  \L,  \U,  \E, \Q. In fact these are implemented by
-     Perl's general string-handling and are not part of its  pat-
-     tern matching engine.
+     Perl's general string-handling and are not part of its
+     pattern matching engine.
       </simpara>
       </listitem>
       <listitem>
@@ -123,7 +123,7 @@
          <simpara>
      If <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>  is set and 
      <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link>  is  not
-     set,  the  $ meta- character matches only at the very end of
+     set,  the  $ meta-character matches only at the very end of
      the string.
          </simpara>
         </listitem>
@@ -135,8 +135,8 @@
         </listitem>
         <listitem>
          <simpara>
-     If <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link>  is set, the greediness of  the  repeti-
-     tion  quantifiers  is inverted, that is, by default they are
+     If <link linkend="pcre.pattern.modifiers">PCRE_UNGREEDY</link>  is set, the greediness of  the
+     repetition  quantifiers  is inverted, that is, by default they are
      not greedy, but if followed by a question mark they are.
          </simpara>
         </listitem>
@@ -152,8 +152,8 @@
      <refsect2 id="regexp.introduction">
       <title>Introduction</title>
       <para>
-     The syntax and semantics of  the  regular  expressions  sup-
-     ported  by PCRE are described below. Regular expressions are
+     The syntax and semantics of  the  regular  expressions
+     supported  by PCRE are described below. Regular expressions are
      also described in the Perl documentation and in a number  of
      other  books,  some  of which have copious examples. Jeffrey
      Friedl's  "Mastering  Regular  Expressions",  published   by
@@ -162,8 +162,8 @@
 
      A regular expression is a pattern that is matched against  a
      subject string from left to right. Most characters stand for
-     themselves in a pattern, and match the corresponding charac-
-     ters in the subject. As a trivial example, the pattern
+     themselves in a pattern, and match the corresponding
+     characters in the subject. As a trivial example, the pattern
        <literal>The quick brown fox</literal>
      matches a portion of a subject string that is  identical  to
      itself.  
@@ -173,9 +173,9 @@
      <title>Meta-characters</title>
      <para>     
      The  power  of  regular  expressions comes from the
-     ability to include alternatives and repetitions in the  pat-
-     tern.  These  are encoded in the pattern by the use of <emphasis>meta</emphasis>-
-     <emphasis>characters</emphasis>, which do not stand for  themselves  but  instead
+     ability to include alternatives and repetitions in the
+     pattern.  These  are encoded in the pattern by the use of 
+     <emphasis>meta-characters</emphasis>, which do not stand for  themselves  but  instead
      are interpreted in some special way.
     </para>
     <para>
@@ -299,8 +299,8 @@
       </variablelist>
 
      Part of a pattern that is in square  brackets is called a
-     "character  class". In a character class the only meta-
-     characters are:
+     "character  class". In a character class the only
+     meta-characters are:
       <variablelist>
        <varlistentry>
         <term><emphasis>\</emphasis></term>
@@ -350,23 +350,23 @@
     </para>
     <para>
      For example, if you want to match a "*" character, you write
-     "\*" in the pattern. This applies whether or not the follow-
-     ing character would otherwise be interpreted as a meta-
-     character, so it is always safe to precede a non-alphanumeric
-     with "\" to specify that it stands for itself.  In  particu-
-     lar, if you want to match a backslash, you write "\\".
+     "\*" in the pattern. This applies whether or not the
+     following character would otherwise be interpreted as a
+     meta-character, so it is always safe to precede a non-alphanumeric
+     with "\" to specify that it stands for itself.  In
+     particular, if you want to match a backslash, you write "\\".
     </para>
     <para>
-     If a pattern is compiled with the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link>  option, whi-
-     tespace in the pattern (other than in a character class) and
+     If a pattern is compiled with the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link>  option,
+     whitespace in the pattern (other than in a character class) and
      characters between a "#" outside a character class  and  the
      next  newline  character  are ignored. An escaping backslash
      can be used to include a whitespace or "#" character as part
      of the pattern.
     </para>
     <para>
-     A second use of backslash provides a way of encoding non-
-     printing characters in patterns in a visible manner. There
+     A second use of backslash provides a way of encoding
+     non-printing characters in patterns in a visible manner. There
      is no restriction on the appearance of non-printing  characters,
      apart from the binary zero that terminates a pattern,
      but when a pattern is being prepared by text editing, it is
@@ -569,8 +569,8 @@
      </variablelist>
     </para>
     <para>
-     Note that octal values of 100 or greater must not be intro-
-     duced by a leading zero, because no more than three octal
+     Note that octal values of 100 or greater must not be
+     introduced by a leading zero, because no more than three octal
      digits are ever read.
     </para>
     <para>
@@ -581,8 +581,8 @@
      class it has a different meaning (see below).
     </para>
     <para>
-     The third use of backslash is for specifying generic charac-
-     ter types:
+     The third use of backslash is for specifying generic
+     character types:
     </para>
     <para>
       <variablelist>
@@ -647,8 +647,8 @@
      Perl "<literal>word</literal>". The definition of letters and digits is  
      controlled by PCRE's character tables, and may vary if locale-specific
      matching is taking place (see  "Locale  support"
-     above). For example, in the "fr" (French) locale, some char-
-     acter codes greater than 128 are used for accented letters,
+     above). For example, in the "fr" (French) locale, some
+     character codes greater than 128 are used for accented letters,
      and these are matched by <literal>\w</literal>.
     </para>
     <para>
@@ -659,8 +659,8 @@
      is no character to match.
     </para>
     <para>
-     The fourth use of backslash is  for  certain  simple  asser-
-     tions. An assertion specifies a condition that has to be met
+     The fourth use of backslash is  for  certain  simple
+     assertions. An assertion specifies a condition that has to be met
      at a particular point in  a match, without consuming any
      characters from the subject string. The use of subpatterns
      for more complicated assertions is described below. The
@@ -752,11 +752,11 @@
      Circumflex need not be the first character of the pattern if
      a number of alternatives are involved, but it should be the
      first thing in each alternative in which it appears  if  the
-     pattern is ever to match that branch. If all possible alter-
-     natives start with a circumflex, that is, if the pattern  is
+     pattern is ever to match that branch. If all possible
+     alternatives start with a circumflex, that is, if the pattern  is
      constrained to match only at the start of the subject, it is
-     said to be an "anchored" pattern. (There are also other con-
-     structs that can cause a pattern to be anchored.)
+     said to be an "anchored" pattern. (There are also other
+     constructs that can cause a pattern to be anchored.)
 
      A dollar character is an assertion which is &true; only if the
      current  matching point is at the end of the subject string,
@@ -779,10 +779,10 @@
      before an internal "\n" character, respectively, in addition
      to matching at the start and end of the subject string.  For
      example,  the  pattern  /^abc$/  matches  the subject string
-     "def\nabc" in multiline  mode,  but  not  otherwise.  Conse-
-     quently,  patterns  that  are  anchored  in single line mode
-     because all branches start with "^" are not anchored in mul-
-     tiline  mode.  The  <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>  option is ignored if
+     "def\nabc" in multiline  mode,  but  not  otherwise.
+     Consequently,  patterns  that  are  anchored  in single line mode
+     because all branches start with "^" are not anchored in
+     multiline  mode.  The  <link linkend="pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>  option is ignored if
      <link linkend="pcre.pattern.modifiers">PCRE_MULTILINE</link>  is set.
 
      Note that the sequences \A, \Z, and \z can be used to  match
@@ -798,9 +798,9 @@
      Outside a character class, a dot in the pattern matches  any
      one  character  in  the  subject,  including  a non-printing
      character, but not (by default) newline.  If the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> 
-     option  is  set,  then dots match newlines as well. The han-
-     dling of dot is entirely independent of the handling of cir-
-     cumflex  and  dollar,  the only relationship being that they
+     option  is  set,  then dots match newlines as well. The
+     handling of dot is entirely independent of the handling of
+     circumflex  and  dollar,  the only relationship being that they
      both involve newline characters.  Dot has no special meaning
      in a character class.
      </literallayout>
@@ -809,25 +809,25 @@
     <refsect2 id="regexp.reference.squarebrackets">
      <title>Square brackets</title>
      <literallayout>
-     An opening square bracket introduces a character class, ter-
-     minated  by  a  closing  square  bracket.  A  closing square
+     An opening square bracket introduces a character class,
+     terminated  by  a  closing  square  bracket.  A  closing square
      bracket on its own is  not  special.  If  a  closing  square
      bracket  is  required as a member of the class, it should be
-     the first data character in the class (after an initial cir-
-     cumflex, if present) or escaped with a backslash.
+     the first data character in the class (after an initial
+     circumflex, if present) or escaped with a backslash.
 
      A character class matches a single character in the subject;
      the  character  must  be in the set of characters defined by
-     the class, unless the first character in the class is a cir-
-     cumflex,  in which case the subject character must not be in
+     the class, unless the first character in the class is a
+     circumflex,  in which case the subject character must not be in
      the set defined by the class. If a  circumflex  is  actually
      required  as  a  member  of  the class, ensure it is not the
      first character, or escape it with a backslash.
 
      For example, the character class [aeiou] matches  any  lower
      case vowel, while [^aeiou] matches any character that is not
-     a lower case vowel. Note that a circumflex is  just  a  con-
-     venient  notation for specifying the characters which are in
+     a lower case vowel. Note that a circumflex is  just  a
+     convenient  notation for specifying the characters which are in
      the class by enumerating those that are not. It  is  not  an
      assertion:  it  still  consumes a character from the subject
      string, and fails if the current pointer is at  the  end  of
@@ -836,8 +836,8 @@
      When caseless matching  is  set,  any  letters  in  a  class
      represent  both their upper case and lower case versions, so
      for example, a caseless [aeiou] matches "A" as well as  "a",
-     and  a caseless [^aeiou] does not match "A", whereas a case-
-     ful version would.
+     and  a caseless [^aeiou] does not match "A", whereas a
+     caseful version would.
 
      The newline character is never treated in any special way in
      character  classes,  whatever the setting of the <link linkend="pcre.pattern.modifiers">PCRE_DOTALL</link> 
@@ -848,17 +848,17 @@
      of  characters  in  a  character  class.  For example, [d-m]
      matches any letter between d and m, inclusive.  If  a  minus
      character  is required in a class, it must be escaped with a
-     backslash or appear in a position where it cannot be  inter-
-     preted as indicating a range, typically as the first or last
+     backslash or appear in a position where it cannot be
+     interpreted as indicating a range, typically as the first or last
      character in the class.
      
      It is not possible to have the literal character "]" as  the
      end  character  of  a  range.  A  pattern such as [W-]46] is
-     interpreted as a class of two characters ("W" and "-")  fol-
-     lowed by a literal string "46]", so it would match "W46]" or
+     interpreted as a class of two characters ("W" and "-")
+     followed by a literal string "46]", so it would match "W46]" or
      "-46]". However, if the "]" is escaped with a  backslash  it
-     is  interpreted  as  the end of range, so [W-\]46] is inter-
-     preted as a single class containing a range followed by  two
+     is  interpreted  as  the end of range, so [W-\]46] is
+     interpreted as a single class containing a range followed by  two
      separate characters. The octal or hexadecimal representation
      of "]" can also be used to end a range.
 
@@ -875,8 +875,8 @@
      appear  in  a  character  class, and add the characters that
      they match to the class. For example, [\dABCDEF] matches any
      hexadecimal  digit.  A  circumflex  can conveniently be used
-     with the upper case character types to specify a  more  res-
-     tricted set of characters than the matching lower case type.
+     with the upper case character types to specify a  more
+     restricted set of characters than the matching lower case type.
      For example, the class [^\W_] matches any letter  or  digit,
      but not underscore.
 
@@ -984,8 +984,8 @@
      which can be nested.  Marking part of a pattern as a subpattern
      does two things:
 
-     1. It localizes a set of alternatives. For example, the pat-
-     tern
+     1. It localizes a set of alternatives. For example, the
+     pattern
 
        cat(aract|erpillar|)
 
@@ -1131,8 +1131,8 @@
 
      does the right thing with the C comments. The meaning of the
      various  quantifiers is not otherwise changed, just the preferred
-     number of matches.  Do not confuse this use of  ques-
-     tion  mark  with  its  use as a quantifier in its own right.
+     number of matches.  Do not confuse this use of
+     question  mark  with  its  use as a quantifier in its own right.
      Because it has two uses, it can sometimes appear doubled, as
      in
 
@@ -1374,8 +1374,8 @@
      <title>Once-only subpatterns</title>
      <literallayout>
      With both maximizing and minimizing repetition,  failure  of
-     what  follows  normally  causes  the repeated item to be re-
-     evaluated to see if a different number of repeats allows the
+     what  follows  normally  causes  the repeated item to be
+     re-evaluated to see if a different number of repeats allows the
      rest  of  the  pattern  to  match. Sometimes it is useful to
      prevent this, either to change the nature of the  match,  or
      to  cause  it fail earlier than it otherwise might, when the
@@ -1401,8 +1401,8 @@
 
      This kind of parenthesis "locks up" the  part of the pattern
      it  contains once it has matched, and a failure further into
-     the pattern is prevented from backtracking  into  it.  Back-
-     tracking  past  it to previous items, however, works as normal.
+     the pattern is prevented from backtracking  into  it.
+     Backtracking  past  it to previous items, however, works as normal.
 
      An alternative description is that a subpattern of this type
      matches  the  string  of  characters that an identical standalone
@@ -1419,8 +1419,8 @@
      This construction can of course contain arbitrarily  complicated
      subpatterns, and it can be nested.
 
-     Once-only subpatterns can be used in conjunction with  look-
-     behind  assertions  to specify efficient matching at the end
+     Once-only subpatterns can be used in conjunction with
+     look-behind  assertions  to specify efficient matching at the end
      of the subject string. Consider a simple pattern such as
 
        abcd$
@@ -1547,8 +1547,8 @@
      comment play no part in the pattern matching at all.
 
      If the <link linkend="pcre.pattern.modifiers">PCRE_EXTENDED</link>  option is set, an unescaped # character
-     outside  a character class introduces a comment that contin-
-     ues up to the next newline character in the pattern.
+     outside  a character class introduces a comment that
+     continues up to the next newline character in the pattern.
      </literallayout>
     </refsect2>
 
@@ -1571,8 +1571,8 @@
        \( ( (?>[^()]+) | (?R) )* \)
 
      First it matches an opening parenthesis. Then it matches any
-     number  of substrings which can either be a sequence of non-
-     parentheses, or a recursive  match  of  the  pattern  itself
+     number  of substrings which can either be a sequence of
+     non-parentheses, or a recursive  match  of  the  pattern  itself
      (i.e. a correctly parenthesized substring). Finally there is
      a closing parenthesis.
