> The A-Z syntax is really (originally, at least) a shorthand for "all the
> uppercase letters". I won't argue the problems with sorting various sets
> of characters in various locales, but for regexes at least it's not an
> issue, because the point isn't sorting or ordering, it's identifying
> groups. We just need to make sure there's a named group for the different
> languages we know of--things like [[:kanji:]] or [[:hiragana:]] for example.

In Perl that's spelled \p{...} (it works in bleadperl now that I've fixed
a silly typo):

$ ./perl -Ilib -wle 'print "a" if "\x{30a1}" =~ /\p{InKatakana}/'
a
$ grep 30A1 lib/unicode/Unicode.txt
30A1;KATAKANA LETTER SMALL A;Lo;0;L;;;;;N;;;;;
3301;SQUARE ARUHUA;So;0;L;<square> 30A2 30EB 30D5 30A1;;;;N;SQUARED ARUHUA;;;;
3332;SQUARE HUARADDO;So;0;L;<square> 30D5 30A1 30E9 30C3 30C9;;;;N;SQUARED HUARADDO;;;;
FF67;HALFWIDTH KATAKANA LETTER SMALL A;Lo;0;L;<narrow> 30A1;;;;N;;;;;
$ 

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
