On Sun, Jun 22, 2003 at 05:28:03PM -0400, Daniel Yacob wrote:
>
> > For your information:
> > Unicode 4.0 adds two sets of decimal digits. :-)
> >
> > 1946..194F   ; Nd # [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
> > 104A0..104A9 ; Nd # [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
>
> Thanks!  I wasn't aware of these additions.  I gave them a try
> but it appears Perl 5.8.0 was treating Unicode 4.0 chars as invalid.

In what way invalid?

> My GNOME terminal seemed to be converting Osmanya into something else
> also.

Unicode 4.0 came out this spring (about 9 months after Perl 5.8.0),
so I wouldn't be surprised if much software (or data, like fonts)
isn't yet updated for it.

> I'd like to bring up another utf8 issue.  My scripts that work with
> utf8 text always seem to start with:
>
>   use utf8;
>   if ( $] >= 5.007 ) {
>       binmode (STDOUT, ":utf8");
>   }
>
>
> It would be nice if "use utf8" set IO modes for utf8 automagically.
> Perhaps a pragma could be passed such as: use utf8 ':all' (or something),
> that set everything to utf8 that is settable.

And fixing that in Perl 5.8.1 would help Perl 5.8.0 how? :-)

But more seriously, the "use utf8" is "an evolutionary dead end".
The only thing it means these days is "my script is in UTF-8".
For "all the other" things, I think there can't ever be a consensus
for "all those things", since there are so many of such things.
Better be very explicit about the things you want to "UTF-8-ize".

> cheers,
>
> /Daniel

-- 
Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'. It is 'dead'."
-- Jack Cohen
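
As a rough illustration of "being explicit" (a minimal sketch, assuming
Perl 5.8 or later with PerlIO layers; the helper name utf8_bytes_for is
hypothetical and not from the thread), each handle gets its own :utf8
layer instead of relying on "use utf8" to do it:

```perl
use strict;
use warnings;

# "use utf8" only declares that this source file is written in UTF-8.
use utf8;

# Be explicit about each standard handle that should carry UTF-8.
binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");

# Hypothetical helper: push a string through an in-memory handle that
# was opened with an explicit :utf8 layer, and return the encoded bytes.
sub utf8_bytes_for {
    my ($text) = @_;
    my $buffer = '';
    open(my $fh, '>:utf8', \$buffer) or die "open failed: $!";
    print {$fh} $text;
    close($fh) or die "close failed: $!";
    return $buffer;
}

# LIMBU DIGIT ZERO (U+1946), one of the digits added in Unicode 4.0.
my $bytes = utf8_bytes_for("\x{1946}");
printf "U+1946 takes %d bytes in UTF-8\n", length($bytes);
```

If per-handle calls get tedious, the "open" pragma (for example,
use open qw(:std :utf8);) may be closer to what was asked for, since it
sets default layers for the standard handles and for open() calls in
its lexical scope. Whether that suits a given script is a judgment
call; it is still a separate, explicit declaration from "use utf8".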