At 10:41 PM 6/7/2001 -0400, Buddha Buck wrote:
>Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>
> > Dan Sugalski <[EMAIL PROTECTED]> writes:
> > >
> > >It does bring up a deeper issue, however. Unicode is, at the moment,
> > >apparently inadequate to represent at least some part of the Asian
> > >languages. Are the encodings currently in use less inadequate? I've been
> > >assuming that an Anything->Unicode translation will be lossless, but this
> > >makes me wonder whether that assumption is correct.
> >
> > One reason perl5.7.1+'s Encode does not do Asian encodings yet is that
> > the tables I have found so far (Mainly Unicode 3.0 based) are lossy.
>
>Er, are the Unicode tables going to be embedded in /usr/bin/perl6?
>That doesn't give me a warm, cozy feeling about Perl-6 support of
>Unicode.

I'd rather they be dynamically linked in. It makes upgrading them easier,
and if we can manage at least semi-modular support for encoded strings, we
could leverage that to present a mostly generic interface for whatever
string encoding people want to write. (I'm not well enough informed to
implement Shift-JIS at the moment, for example, but I'd be happy if someone
else could do it without pain.)
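
A rough sketch of what such a mostly generic interface could look like,
written in Python purely for illustration. Every name here (Codec,
register_codec, get_codec, transcode) is hypothetical, and nothing in this
thread settles the actual design. The point is only that the core knows a
small decode/encode contract, and each encoding is registered from outside:

```python
from typing import Callable, Dict, NamedTuple


class Codec(NamedTuple):
    """Hypothetical contract every pluggable encoding must satisfy."""
    decode: Callable[[bytes], str]   # encoded bytes -> code points
    encode: Callable[[str], bytes]   # code points -> encoded bytes


# The core only holds a name -> Codec table; it ships no encoding tables.
_REGISTRY: Dict[str, Codec] = {}


def register_codec(name: str, codec: Codec) -> None:
    """Called by an encoding module when it is loaded."""
    _REGISTRY[name.lower()] = codec


def get_codec(name: str) -> Codec:
    """Look up a codec by name, failing loudly if none was registered."""
    try:
        return _REGISTRY[name.lower()]
    except KeyError:
        raise LookupError(f"no codec registered for {name!r}") from None


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between two registered encodings via code points."""
    return get_codec(target).encode(get_codec(source).decode(data))


# Stand-ins for separately shipped modules; here they just wrap
# Python's built-in codecs.
register_codec("latin-1", Codec(lambda b: b.decode("latin-1"),
                                lambda s: s.encode("latin-1")))
register_codec("utf-8", Codec(lambda b: b.decode("utf-8"),
                              lambda s: s.encode("utf-8")))
```

Under this assumption, someone who does understand Shift-JIS could supply
it later with one more register_codec call, without touching the core.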

>I think it's great that Perl internals will be able to handle
>arbitrary strings of Unicode characters (using some version of UTF-*),
>but may I suggest that anything that relies on the properties of
>characters (case, conversions, combining, visibility, etc) require
>explicit library support?  We'd lose some things, like normalization,
>but we wouldn't have to carry around huge tables, either.

This is sort of the plan, and I don't think we have to lose normalization 
as part of it either.
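
For comparison only, here is how that split looks in Python, where plain
strings carry code points and the character-property tables sit behind an
explicitly imported library module (unicodedata). This is an illustration
of the general idea, not a statement about how Perl 6 will do it, and the
helper name nfc_equal is made up for the example:

```python
import unicodedata  # property tables live in this module, not in str itself


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

# Without the library the two spellings are simply different strings.
raw_equal = decomposed == precomposed

# With the library loaded, normalization is still available.
normalized_equal = nfc_equal(decomposed, precomposed)

# Other properties (category, combining class) come from the same module.
accent_category = unicodedata.category("\u0301")
accent_combining_class = unicodedata.combining("\u0301")
```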

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
