[perl.git] branch blead, updated. v5.13.11-344-g968ee49

Karl Williamson Wed, 30 Mar 2011 19:09:34 -0700

In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/968ee499ff66c6dcd466884030cb185844f2d94f?hp=6a213ac57809619baac5720d0b932a5ca3380294>


- Log -----------------------------------------------------------------
commit 968ee499ff66c6dcd466884030cb185844f2d94f
Author: Karl Williamson <pub...@khwilliamson.com>
Date:   Wed Mar 30 19:03:13 2011 -0600

    Encode::Guess is iffy

M       pod/perluniintro.pod

commit 409a7f61ed25e04f9f9fedbe2d2ac5c95c2d22df
Author: Karl Williamson <pub...@khwilliamson.com>
Date:   Wed Mar 30 20:02:28 2011 -0600

    perlrecharclass: /dual are suffix in 5.14
    
    So there is no need to avoid using the / form for them.

M       pod/perlrecharclass.pod

commit bf7786d465a51a7a258b8f2ffd7989b231885e73
Author: Karl Williamson <pub...@khwilliamson.com>
Date:   Wed Mar 30 19:26:38 2011 -0600

    perlrecharclass: Mention UCD::num()

M       pod/perlrecharclass.pod
-----------------------------------------------------------------------

Summary of changes:
 pod/perlrecharclass.pod |   19 ++++++++++---------
 pod/perluniintro.pod    |    2 +-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index d9eff38..9f27378 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -84,7 +84,7 @@ locale considers decimal digits.  Only when neither a Unicode interpretation
 nor locale prevails does C<\d> match only the digits '0' to '9' alone.
 
 Unicode digits may cause some confusion, and some security issues.  In UTF-8
-strings, unless the C<"a"> regular expression modifier is specified,
+strings, unless the C</a> regular expression modifier is specified,
 C<\d> matches the same characters matched by
 C<\p{General_Category=Decimal_Number}>, or synonymously,
 C<\p{General_Category=Digit}>.  Starting with Unicode version 4.1, this is the
@@ -107,7 +107,8 @@ have different values.  For example, BENGALI DIGIT FOUR (U+09EA) looks
 very much like an ASCII DIGIT EIGHT (U+0038).
 
 It may be useful for security purposes for an application to require that all
-digits in a row be from the same script.   See L<Unicode::UCD/charscript()>.
+digits in a row be from the same script.  This can be checked by using
+L<Unicode::UCD/num()>.
 
 Any character not matched by C<\d> is matched by C<\D>.
 
@@ -187,7 +188,7 @@ vertical whitespace. Furthermore, if the source string is not in UTF-8 format,
 and any locale or EBCDIC code page that is in effect doesn't include them, the
 next line (ASCII-platform C<"\x85">) and the no-break space (ASCII-platform
 C<"\xA0">) characters are not matched by C<\s>, but are by C<\v> and C<\h>
-respectively.  If the C<"a"> modifier is not in effect and the source
+respectively.  If the C</a> modifier is not in effect and the source
 string is in UTF-8 format, both the next line and the no-break space 
 are matched by C<\s>.
 
@@ -231,7 +232,7 @@ page is in effect that changes the C<\s> matching).
 =item [1]
 
 NEXT LINE and NO-BREAK SPACE only match C<\s> if the source string is in
-UTF-8 format and the C<"a"> modifier is not in effect, or if the locale 
+UTF-8 format and the C</a> modifier is not in effect, or if the locale
 or EBCDIC code page in effect includes them.
 
 =back
@@ -564,10 +565,10 @@ and any C<\p> property name can be prefixed with "Is" such as C<\p{IsAlpha}>.)
 Both the C<\p> forms are unaffected by any locale in effect, or whether
 the string is in UTF-8 format or not, or whether the platform is EBCDIC or not.
 In contrast, the POSIX character classes are affected, unless the
-regular expression is compiled with the C<"a"> modifier.  If the C<"a">
+regular expression is compiled with the C</a> modifier.  If the C</a>
 modifier is not in effect, and the source string is in UTF-8 format, the
 POSIX classes behave like their "Full-range" Unicode counterparts.  If
-C<"a"> modifier is in effect; or the source string is not in UTF-8
+C</a> modifier is in effect; or the source string is not in UTF-8
 format, and no locale is in effect, and the platform is not EBCDIC, all
 the POSIX classes behave like their ASCII-range counterparts.
 Otherwise, they behave based on the rules of the locale or EBCDIC code
@@ -715,7 +716,7 @@ the backslash sequences C<\b> and C<\B> are defined in terms of C<\w>
 and C<\W>, they also are affected.)
 
 Starting in Perl 5.14, if the regular expression is compiled with the
-C<"a"> modifier, the behavior doesn't differ regardless of any other
+C</a> modifier, the behavior doesn't differ regardless of any other
 factors.  C<\d> matches the 10 digits 0-9; C<\D> any character but those
 10; C<\s>, exactly the five characters "[ \f\n\r\t]"; C<\w> only the 63
 characters "[A-Za-z0-9_]"; and the C<"[[:posix:]]"> classes only the
@@ -729,7 +730,7 @@ encoded in utf8 (usually as a result of including a literal character
 whose code point is above 255), or if it contains a C<\N{U+...}> or
 C<\N{I<name>}> construct, or (starting in Perl 5.14) if it was compiled
 in the scope of a C<S<use feature "unicode_strings">> pragma and not in
-the scope of a C<S<use locale>> pragma, or has the C<"u"> regular
+the scope of a C<S<use locale>> pragma, or has the C</u> regular
 expression modifier.
 
 Note that one can specify C<"use re '/l'"> for example, for any regular
@@ -743,7 +744,7 @@ affects only ASCII platforms, and only when matching against characters
 whose code points are between 128 and 255 inclusive.  See
 L<perlunicode/The "Unicode Bug">.
 
-For portability reasons, unless the C<"a"> modifier is specified,
+For portability reasons, unless the C</a> modifier is specified,
 it may be better to not use C<\w>, C<\d>, C<\s> or the POSIX character
 classes and use the Unicode properties instead.
 
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 3768b45..6ca2999 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -686,7 +686,7 @@ and the C<length()> function:
 
 How Do I Find Out What Encoding a File Has?
 
-Try L<Encode::Guess>.
+You might try L<Encode::Guess>, but it has a number of limitations.
 
 =item *
 

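The perlrecharclass hunks above describe two behaviors that are easier to see in code: C<\d> matching non-ASCII decimal digits unless the C</a> modifier is given, and Unicode::UCD::num() returning undef for a run of digits that mixes scripts. The following is only a minimal sketch, assuming Perl 5.14 or later (where C</a> and num() are available). The variable names are illustrative and do not come from the commit.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

# BENGALI DIGIT FOUR (U+09EA), the example character used in perlrecharclass.
# A code point above 255 makes this a UTF-8 encoded string.
my $bengali_four = "\x{09EA}";

# Without /a, \d matches any Unicode decimal digit in a UTF-8 string.
my $matches_default = ($bengali_four =~ /^\d$/)  ? 1 : 0;

# With /a, \d matches only the ten ASCII digits 0-9.
my $matches_ascii   = ($bengali_four =~ /^\d$/a) ? 1 : 0;

# num() returns the numeric value of a string of decimal digits that all
# come from one script, and undef otherwise.
my $same_script = num("\x{09EA}\x{09EB}");   # BENGALI DIGIT FOUR, FIVE
my $mixed       = num("1\x{09EA}");          # ASCII digit followed by Bengali

printf "default \\d: %d, /a \\d: %d\n", $matches_default, $matches_ascii;
printf "same script: %s, mixed: %s\n",
    defined $same_script ? $same_script : 'undef',
    defined $mixed       ? $mixed       : 'undef';
```

An application that wants the same-script check without C</a> could treat an undef result from num() as a rejected input. Whether that is strict enough depends on the application.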
--
Perl5 Master Repository