> > "Important: Supplementary code points must be supported for full Unicode
> > support, regardless of the encoding form."
>
> That's the theory. But UTF-16 is strictly less convenient than UTF-32,
> which means that a lot of code working in terms of UTF-16 doesn't bother
> to support supplementary code points.
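
To make concrete what "supporting supplementary code points" involves, here is a minimal Python sketch. It is only an illustration, not taken from any implementation under discussion; the helper names utf16_units and utf32_units are made up for this example, and only the standard str.encode codecs are used. It counts the code units needed for one character outside the BMP and for one combining sequence.

```python
# Illustrative sketch only: helper names are hypothetical, stdlib codecs only.

def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to encode s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

def utf32_units(s: str) -> int:
    """Number of 32-bit code units (one per code point) for s as UTF-32."""
    return len(s.encode("utf-32-le")) // 4

clef = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, a supplementary code point
e_acute = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Supplementary code point: a surrogate pair in UTF-16, one unit in UTF-32.
print(utf16_units(clef), utf32_units(clef))

# Combining sequence: two code units in both encoding forms.
print(utf16_units(e_acute), utf32_units(e_acute))
```

The first case is a special case only in UTF-16; the second needs more than one code unit per user-visible character in either encoding form.
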
From Wikipedia: "Unfortunately using UTF-16 makes characters outside the
Basic Multilingual Plane a special case which increases the risk of
oversights related to their handling. That said, programs that mishandle
surrogate pairs probably also have problems with combining sequences, so
using UTF-32 is unlikely to solve the more general problem of poor handling
of multi-code-unit characters."

> The only advantage of UTF-16 over UTF-32 is memory usage, and data
> exchange with those who already use UTF-16. *Nothing* in UTF-16 is more
> convenient or simpler than UTF-32, it's an additional complexity layer.

"The only advantage of fixnums over bignums is [performance and] memory
usage, and data exchange with those who already use fixnums. *Nothing* in
fixnums is more convenient or simpler than bignums, it's an additional
complexity layer."

> > But I'll tell you what. Find a document, written by someone with
> > substantial Unicode experience, that recommends UTF-32 as the best overall
> > in-memory encoding.

I don't agree with everything you said, but more to the point none of it
related to the question I asked: can you find a single document written by a
Unicode expert that recommends UTF-32? Every such document I can find
recommends UTF-16 as the best overall encoding, with UTF-8 a second choice
(based on expected usage). UTF-32 is always the third choice and it always
has the caveat "if space doesn't matter."

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
