In a message dated 2001-09-17 4:25:47 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>> How should a UTF-8 application behave if it accidentally receives
>> a CESU-8 surrogate sequence?  How does an application which
>> relies on CESU-8 binary sorting behave if it accidentally receives an
>> UTF-8 4-byte sequence?
>
> Both should error out. In practice, I wonder how common it would be and
> because of this how many people will actually do THAT in their parsers. I
> expect lots of non-compliant parsers.

If Michka is referring to non-compliant CESU-8 parsers, I really wouldn't 
care much because CESU-8 is supposed to live in its own little private world. 
But if people start compromising their UTF-8 parsers to accommodate CESU-8 
"adaptively," it would be a great blow to UTF-8.  It would essentially undo 
all the tightening-up that was accomplished by the Corrigendum, and it would 
revive all the old Bruce Schneier-style skepticism about the "security" of 
Unicode.
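
To make "error out" concrete, here is a minimal sketch in Python of what the two strict behaviors might look like. The function names are hypothetical, invented for illustration, and come from no spec or library. The sketch assumes Python 3, whose built-in "utf-8" codec already rejects encoded surrogates in strict mode, and it only checks the CESU-8 side for bytes that can never occur there (0xF0 and above); it is not a full CESU-8 validator.

```python
def decode_strict_utf8(data: bytes) -> str:
    """Decode UTF-8, refusing CESU-8 style surrogate sequences.

    Python 3's strict "utf-8" codec raises UnicodeDecodeError on
    ED A0..BF xx, the 3-byte form that would encode U+D800..U+DFFF.
    """
    return data.decode("utf-8", errors="strict")


def reject_utf8_four_byte(data: bytes) -> None:
    """Hypothetical CESU-8 side check: refuse UTF-8 4-byte sequences.

    CESU-8 encodes supplementary characters as two 3-byte surrogate
    sequences, so a byte of 0xF0 or above never appears in it.
    """
    for offset, byte in enumerate(data):
        if byte >= 0xF0:
            raise ValueError(
                f"byte 0x{byte:02X} at offset {offset} is not valid in CESU-8"
            )


# U+10000 in each encoding form.
UTF8_FORM = b"\xf0\x90\x80\x80"            # one 4-byte sequence
CESU8_FORM = b"\xed\xa0\x80\xed\xb0\x80"   # surrogate pair D800 DC00

if __name__ == "__main__":
    print(repr(decode_strict_utf8(UTF8_FORM)))
    try:
        decode_strict_utf8(CESU8_FORM)
    except UnicodeDecodeError as exc:
        print("UTF-8 parser rejected CESU-8 input:", exc.reason)
    try:
        reject_utf8_four_byte(UTF8_FORM)
    except ValueError as exc:
        print("CESU-8 check rejected UTF-8 input:", exc)
```

The "adaptive" parser I am worried about is the one that catches the first error and quietly retries with surrogates allowed, instead of reporting it.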

-Doug Ewell
 Fullerton, California
