On Sun, Jun 22, 2003 at 05:28:03PM -0400, Daniel Yacob wrote:
>
> > For your information:
> > Unicode 4.0 adds two sets of decimal digits. :-)
>
> > 1946..194F ; Nd # [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
> > 104A0..104A9 ; Nd # [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
>
> Thanks! I wasn't aware of these additions. I gave them a try
> but it appears Perl 5.8.0 was treating Unicode 4.0 chars as invalid.

In what way invalid?

> My GNOME terminal seemed to be converting Osmanya into something else
> also.
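
Unicode 4.0 came out this spring (about 9 months after Perl 5.8.0), so
I wouldn't be surprised if much software (or data, like fonts) hasn't
been updated for it yet. Perl 5.8.0 carries the Unicode 3.2 tables, so
code points first assigned in 4.0 are unassigned as far as it knows.
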
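
One way to see what a given perl knows is to ask Unicode::UCD directly.
A rough sketch follows; is_known_digit is a made-up helper name, and
the output depends entirely on which Unicode tables your perl was
built with:

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical helper: true if this perl's Unicode tables list $cp
# as a decimal digit (general category Nd).
sub is_known_digit {
    my ($cp) = @_;
    my $info = charinfo($cp);    # undef for unassigned code points
    return (defined($info) && $info->{category} eq 'Nd') ? 1 : 0;
}

# Report the Unicode version of the tables, then probe the two
# digit zeros that Unicode 4.0 added.
printf "Unicode tables: %s\n", Unicode::UCD::UnicodeVersion();
for my $cp (0x1946, 0x104A0) {   # LIMBU and OSMANYA DIGIT ZERO
    printf "U+%04X %s\n", $cp,
        is_known_digit($cp) ? 'is Nd' : 'is not a known digit';
}
```

If the reported version is older than 4.0.0, both code points should
come back as unknown, which would explain them looking "invalid".
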
> I'd like to bring up another utf8 issue. My scripts that work with
> utf8 text always seem to start with:
>
>   use utf8;
>   if ( $] >= 5.007 ) {
>       binmode (STDOUT, ":utf8");
>   }
>
>
> It would be nice if "use utf8" set IO modes for utf8 automagically.
> Perhaps a pragma could be passed such as: use utf8 ':all' (or something),
> that set everything to utf8 that is settable.

And fixing that in Perl 5.8.1 would help Perl 5.8.0 how? :-)

But more seriously, "use utf8" is "an evolutionary dead end".
The only thing it means these days is "my script is in UTF-8".
As for "all the other things", I don't think there can ever be a
consensus on what "everything" should cover, since there are so many
candidates (standard handles, files, sockets, @ARGV, %ENV, ...).
Better to be very explicit about the things you want to "UTF-8-ize".
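
As an illustration of "being explicit", here is a minimal sketch that
assumes Perl 5.8 or later. utf8ize_handles is a made-up helper, not
anything in core, and I use the ':encoding(UTF-8)' layer where your
snippet uses ':utf8':

```perl
use strict;
use warnings;
use utf8;    # says only that this source file is in UTF-8

# Hypothetical helper: put a UTF-8 encoding layer on exactly the
# handles named by the caller. Returns how many were changed.
sub utf8ize_handles {
    my @handles = @_;
    for my $fh (@handles) {
        binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    }
    return scalar @handles;
}

# The standard handles are named one by one.
utf8ize_handles(\*STDOUT, \*STDERR);

# Handles you open yourself get their layer at open() time.
# An in-memory file stands in for a real file here.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open failed: $!";
print {$out} "\x{1946}\x{104A0}";    # LIMBU and OSMANYA DIGIT ZERO
close($out) or die "close failed: $!";

# $buffer now holds the encoded bytes, not characters.
printf "%d bytes written\n", length($buffer);
```

If you really want one line for the common case, the existing
"use open qw(:std :utf8);" pragma covers the standard handles and the
default layers for open() in its lexical scope. It is still an explicit
choice, though, not something "use utf8" does for you.
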
> cheers,
>
> /Daniel
--
Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen