On Tue, May 19, 2009 at 10:21, Alexander Koenig <alexander.koe...@mpi.nl> wrote:
> Chas. Owens wrote on 05/19/2009 04:02 PM:
>
>>> ($a,$n,$x,$y)) = $item =~ /(.{5})\.(\d\d?)[-+](\d{1,4})\.(\d{1,4})/;
>> snip
>>
>> As of Perl 5.8 \d no longer matches [0-9]. It now matches any UNICODE
>> character that has the digit property. This includes characters such
>> as "\x{1815}" (MONGOLIAN DIGIT FIVE). You must use [0-9] if you mean
>> [0-9] or use the bytes pragma[1] to return the old meaning of \d (but
>> this breaks all UNICODE processing in the scope you declare it).
>
> Oh, I didn't know that. Thanks for pointing that out.
>
> But in most scenarios \d will still work, right? I mean, how often do
> you actually encounter the Mongolian Digit Five in real life data?

It isn't just "\x{1815}"; it is any UNICODE character with the digit
property. That includes characters like "\x{FF15}" (FULLWIDTH DIGIT
FIVE), which looks just like a normal number 5, but you can't do math
with it. If you mean [0-9], you should say [0-9]; if you mean something
that looks vaguely like a number to somebody, you should say \d.

--
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
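
To illustrate the difference between \d and [0-9] described above, here is a minimal sketch. The helper names matches_backslash_d and matches_ascii_digit are made up for this example and are not from the original thread. It assumes Perl 5.8 or later with no regex modifiers and no bytes pragma in scope.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
# Each returns 1 if the whole string is a single "digit" by that definition.
sub matches_backslash_d { return $_[0] =~ /\A\d\z/    ? 1 : 0 }
sub matches_ascii_digit { return $_[0] =~ /\A[0-9]\z/ ? 1 : 0 }

my @samples = (
    "5",          # ASCII DIGIT FIVE
    "\x{FF15}",   # FULLWIDTH DIGIT FIVE
    "\x{1815}",   # MONGOLIAN DIGIT FIVE
);

for my $s (@samples) {
    # Print the code point rather than the character itself, so no
    # output encoding layer is needed.
    printf "U+%04X: \\d %s, [0-9] %s\n",
        ord($s),
        (matches_backslash_d($s) ? "matches" : "does not match"),
        (matches_ascii_digit($s) ? "matches" : "does not match");
}

{
    # The fullwidth digit is not numeric as far as Perl arithmetic is
    # concerned; it numifies to 0 (and warns unless that is silenced).
    no warnings 'numeric';
    my $sum = "\x{FF15}" + 1;
    print "\"\\x{FF15}\" + 1 = $sum\n";
}
```

All three samples should satisfy \d, but only the first should satisfy [0-9], which is why input validated with \d can still fail when you later try to do arithmetic on it.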