At 11:18 AM 6/5/2001 -0700, Hong Zhang wrote:

> > Firstly, the JIS standard defines, along with the ordering and
> > enumeration of its characters, their glyph shape. Unicode, on the other
> > hand, does not. This means that as far as Unicode is concerned, there is
> > literally no distinction between two distinct shapes and hence no way to
> > specify which should be used. This becomes particularly emotive when one
> > is, for instance, attempting to represent a person's name - if they have
> > a particular preferred variant character with which they write their
> > name, there is no way to communicate that to the computer, and
> > information is lost.
>
>This is a very common practice, nothing surprising. As you can tell,
>my name is "hong zhang", which has already lost its "chinese tone" and
>"glyph". "hong" has 4 tones, each tone can be any of several
>characters, each character can be one of several glyphs (simplified and
>traditional). However, it does not really matter; you can still call it
>my name.

I dunno. It's one thing to have a word represented with non-native 
characters--loss is expected. It's quite another to have it spelled out in 
an encoding that's supposed to preserve such things and have it not 
actually do that. That'd be like having my name spelled or pronounced 
differently because it was encoded in Unicode instead of ASCII. That's just 
plain wrong.

> > The second objection is again related to character versus glyph issues:
> > since Chinese,
>
>I think this problem =~ locale. For any Unicode character, you cannot
>properly tell its lower case or upper case without considering locale.
>And Unicode does not encode locale.

Yeah, that is a problem. The alternative isn't any better, unfortunately. 
Human languages are a pain. :)

We're going to need case-translation stuff for Perl 6, I think, if lc, uc, 
and their ilk are going to work properly.
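
To make the locale point concrete, here is a rough sketch in Python (not Perl, and not any existing API) of what a locale-aware lowercase might look like. The helper name lower_for and the TURKIC set are made up for illustration, and the only tailoring shown is the well-known Turkish/Azeri dotted versus dotless "i" rule. Real case mapping has more special cases than this.

```python
# Hypothetical helper for illustration only; not a Perl or Python built-in.
# Languages assumed here to use the dotted/dotless "i" distinction.
TURKIC = {"tr", "az"}

def lower_for(text: str, lang: str) -> str:
    """Lowercase text, applying a simplified Turkish/Azeri tailoring."""
    if lang in TURKIC:
        # U+0049 'I' lowercases to U+0131 (dotless i) in these languages,
        # and U+0130 (capital I with dot above) lowercases to plain 'i'.
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    # Everything else falls through to the locale-independent mapping.
    return text.lower()

if __name__ == "__main__":
    print(lower_for("TITLE", "en"))
    print(lower_for("TITLE", "tr"))
```

The same code point, U+0049, comes out as two different lowercase letters depending on a language tag that the string itself does not carry, which is why lc and uc would need the locale from somewhere else.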

> > Finally, there is a historiographical issue; when computers are used to
> > digitise and store historical literature containing archaic characters,
> > specifying the exact variant character becomes an important
> > consideration.
>
>I believe this should be handled by application. This kind of work is needed
>by research. Perl should not care about it.

I think I'd agree there. Different versions of a glyph are more a matter of 
art and handwriting styles, and that's not really something we ought to get 
involved in. The European equivalent would be to have many versions of "A", 
so we could represent the different ways it was drawn in various 
illuminated manuscripts. That seems rather excessive.

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
