RE: Encoding for Fun (was Line Separator)

jon Wed, 22 Oct 2003 10:59:29 -0700

> I can't argue with that ... but my strings were always in (32-bit wide) 
> Unicode at "sort-time". I'm not sure exactly how much value there is in a 
> lexicographical sort anyway. I mean, even in Latin-1, surely 'é' should 
> not come after 'z'?


Not always. In particular, there are times when a dependable sort order is 
required, but just what that sort order is isn't important. In those cases it 
can be useful that UTF-8 and UTF-32 will both do a binary sort with equivalent 
results.
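
As a rough illustration of that point, here is a small Python sketch. The 
sample strings are my own, chosen for illustration, and I'm assuming 
big-endian UTF-32 without a BOM so that a byte-wise comparison matches 
numeric order. It compares byte-wise sorts against plain code point order:

```python
# Sample strings chosen for illustration; they are not from the thread.
samples = ["z", "é", "a", "abc", "\uFFFD", "\U00010000"]

# Python 3 compares str values code point by code point.
by_code_point = sorted(samples)

# Byte-wise (binary) sorts of the encoded forms.
# UTF-32 is taken as big-endian, no BOM (an assumption of this sketch).
by_utf8 = sorted(samples, key=lambda s: s.encode("utf-8"))
by_utf32 = sorted(samples, key=lambda s: s.encode("utf-32-be"))
by_utf16 = sorted(samples, key=lambda s: s.encode("utf-16-be"))

print("UTF-8  matches code point order:", by_utf8 == by_code_point)
print("UTF-32 matches code point order:", by_utf32 == by_code_point)
print("UTF-16 matches code point order:", by_utf16 == by_code_point)
```

The UTF-16 line is there only for contrast: U+10000 is encoded with 
surrogates starting at 0xD800, which compare lower than U+FFFD.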

> 
> Of course, UTF-16 doesn't have the binary sort property either.

Nope, though an efficient mechanism to sort UTF-16 in the codepoint order is 
available.
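
One commonly described way to do this is a small fix-up on each 16-bit code 
unit before comparing. I can't say this is the only mechanism, so treat the 
following Python as a sketch of that technique; the function names are 
hypothetical. Surrogate code units are moved above the rest of the BMP, and 
U+E000..U+FFFF are shifted down to fill the gap:

```python
import struct

def utf16_units(s):
    """Return the UTF-16 code units of s as a tuple of integers."""
    data = s.encode("utf-16-be")
    return struct.unpack(f">{len(data) // 2}H", data)

def fixup(unit):
    """Remap one code unit so numeric order follows code point order."""
    if unit >= 0xE000:
        return unit - 0x0800   # U+E000..U+FFFF -> 0xD800..0xF7FF
    if unit >= 0xD800:
        return unit + 0x2000   # surrogates     -> 0xF800..0xFFFF
    return unit                # below U+D800: unchanged

def utf16_code_point_key(s):
    """Sort key: compare UTF-16 code units after the fix-up."""
    return [fixup(u) for u in utf16_units(s)]

# Sample strings chosen for illustration; they are not from the thread.
samples = ["z", "\U00010000", "\uFFFD", "a", "\uE000", "\U0010FFFF"]
fixed = sorted(samples, key=utf16_code_point_key)
raw = sorted(samples, key=utf16_units)

print("fixed-up order matches code point order:", fixed == sorted(samples))
print("raw UTF-16 order matches code point order:", raw == sorted(samples))
```

The fix-up is a couple of comparisons and an add per code unit, so it can be 
applied inside the comparison loop without decoding to code points first.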
