On Tue, May 19, 2009 at 10:21, Alexander Koenig <alexander.koe...@mpi.nl> wrote:
> Chas. Owens wrote on 05/19/2009 04:02 PM:
>
>>> ($a,$n,$x,$y) = $item =~ /(.{5})\.(\d\d?)[-+](\d{1,4})\.(\d{1,4})/;
>> snip
>>
>> As of Perl 5.8, \d no longer matches only [0-9].  It now matches any Unicode
>> character that has the digit property.  This includes characters such
>> as "\x{1815}" (MONGOLIAN DIGIT FIVE).  You must use [0-9] if you mean
>> [0-9], or use the bytes pragma[1] to restore the old meaning of \d (but
>> this breaks all Unicode processing in the scope where you declare it).
>
> Oh, I didn't know that. Thanks for pointing that out.
>
> But in most scenarios \d will still work, right? I mean, how often do
> you actually encounter the Mongolian Digit Five in real life data?

It isn't just "\x{1815}"; it is any Unicode character with the digit
property.  That includes things like "\x{FF15}" (FULLWIDTH DIGIT FIVE),
which looks just like a normal 5 but can't be used in arithmetic.
If you mean [0-9], you should say [0-9]; if you mean something that
looks vaguely like a number to somebody, you should say \d.
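
Here is a minimal sketch of the difference, assuming a Perl recent enough
to treat "\x{FF15}" as a Unicode string.  The helper names
(matches_backslash_d, matches_ascii_digit, add_one) are made up for this
example and are not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# True (1) if the whole string is a single character matched by \d.
sub matches_backslash_d {
    my ($s) = @_;
    return $s =~ /\A\d\z/ ? 1 : 0;
}

# True (1) if the whole string is a single ASCII digit 0-9.
sub matches_ascii_digit {
    my ($s) = @_;
    return $s =~ /\A[0-9]\z/ ? 1 : 0;
}

# Adds 1 to the argument; numeric warnings are silenced so the
# non-ASCII case can be shown without noise on STDERR.
sub add_one {
    my ($s) = @_;
    no warnings 'numeric';
    return $s + 1;
}

my $ascii_five     = "5";
my $fullwidth_five = "\x{FF15}";    # FULLWIDTH DIGIT FIVE

for my $s ($ascii_five, $fullwidth_five) {
    # Print the code point rather than the character itself, so no
    # output encoding layer is needed.
    printf "U+%04X  \\d: %d  [0-9]: %d  +1: %d\n",
        ord($s),
        matches_backslash_d($s),
        matches_ascii_digit($s),
        add_one($s);
}
```

Both strings pass the \d test, but only the ASCII "5" passes [0-9] and
behaves as a number in arithmetic.  If you are on Perl 5.14 or later,
the /a regex modifier is another way to restrict \d to ASCII digits.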

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
