Re: Multiple encodings for 1 character

Kenneth Whistler Mon, 08 Jul 2002 14:31:35 -0700

Theodore wrote:

> What is going to be done about the confusion generated from 
> having multiple ways to encode the same character?
> 
> For example, for filenames, OSX will encode an accented Roman 
> letter one way, while for filenames Windows will encode it the 
> other way. This kind of confusion is entirely to be expected if 
> Unicode allows more than one way to encode the same 
> character.


Perhaps a stray newsfeed routed via Alpha Centauri?
This is *very* old news, indeed.

> 
> This means that matching algorithms won't work, because the 
> characters are different!
> 
> Will there be some kind of recommendation of which to avoid? 
> Will the Unicode consortium make a standard to say that one of 
> these encodings is strongly not recommended, and in fact 
> depreciated?

UAX #15: Unicode Normalization Forms

http://www.unicode.org/unicode/reports/tr15/

And it is up to an implementation to specify which normalization
form it uses.
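
As a rough illustration of what normalization buys a matching algorithm,
here is a minimal sketch using the unicodedata module from Python's
standard library. The helper name same_text and the choice of NFC as
the default form are assumptions made for the example, not anything
UAX #15 mandates.

```python
import unicodedata

# Two encodings of the same accented letter:
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form.

    `form` is any form accepted by unicodedata.normalize:
    "NFC", "NFD", "NFKC" or "NFKD".
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A naive code-point comparison treats the two spellings as different...
print(composed == decomposed)

# ...while comparing the normalized strings treats them as equal.
print(same_text(composed, decomposed))
```

Either NFC or NFD would serve for this comparison; what matters is that
both sides are normalized to the same form before they are compared.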

By the way, we don't depreciate Unicode encodings -- we appreciate
them. ;-)

> And what about the OS that uses this encoding? How will the 
> Unicode consortium make the newly-offending OS change its ways?

It isn't offending, and the Unicode Consortium won't.

--Ken
