On Wed, 2007-09-26 at 18:46 +0100, Duncan Coutts wrote:
> In message <[EMAIL PROTECTED]> Jonathan Cast <[EMAIL PROTECTED]> writes:
> > On Wed, 2007-09-26 at 09:05 +0200, Johan Tibell wrote:
>
> > > If UTF-16 is what's used by everyone else (how about Java? Python?) I
> > > think that's a strong reason to use it. I don't know Unicode well
> > > enough to say otherwise.
> >
> > I disagree. I realize I'm a dissenter in this regard, but my position
> > is: excellent Unix support first, portability second, excellent support
> > for Win32/MacOS a distant third. That seems to be the opposite of every
> > language's position. Unix absolutely needs UTF-8 for backward
> > compatibility.
>
> I think you're talking about different things, internal vs external
> representations.
>
> Certainly we must support UTF-8 as an external representation. The choice of
> internal representation is independent of that. It could be [Char] or some
> memory efficient packed format in a standard encoding like UTF-8,16,32. The
> choice depends mostly on ease of implementation and performance. Some formats
> are easier/faster to process but there are also conversion costs so in some
> use cases there is a performance benefit to the internal representation being
> the same as the external representation.
>
> So, the obvious choices of internal representation are UTF-8 and UTF-16. UTF-8
> has the advantage of being the same as a common external representation so
> conversion is cheap (only need to validate rather than copy). UTF-8 is more
> compact for western languages but less compact for eastern languages compared
> to UTF-16. UTF-8 is a more complex encoding in the common cases than UTF-16.
> In the common case UTF-16 is effectively fixed width. According to the ICU
> implementors this has speed advantages (probably due to branch prediction and
> smaller code size).
>
> One solution is to do both and benchmark them.
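
The size trade-off described in the quoted text is easy to check by hand. What
follows is only a minimal sketch and not something from the thread: it is
written in Python (one of the languages mentioned above) because its standard
codecs make the byte counts easy to inspect, and the helper name
`encoded_sizes` and the sample strings are made up for illustration. It says
nothing about processing speed, which would still need the benchmark suggested
above.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding.

    Hypothetical helper for illustration only. The "-le" codec variants
    are used so that no byte-order mark is counted.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def utf16_is_fixed_width_for(text):
    """True when every code point fits in one 16-bit unit (no surrogates)."""
    return len(text.encode("utf-16-le")) == 2 * len(text)


# Illustrative samples, not data from the thread.
samples = {
    "ASCII": "Hello, world",
    "Japanese": "\u3053\u3093\u306b\u3061\u306f",  # five hiragana characters
    "outside the BMP": "\U0001D11E",  # musical G clef, a surrogate pair in UTF-16
}

if __name__ == "__main__":
    for name, text in samples.items():
        print(name, encoded_sizes(text), utf16_is_fixed_width_for(text))
```

For the ASCII sample UTF-8 needs 12 bytes against 24 for UTF-16; for the five
hiragana it is 15 against 10. The last sample is the uncommon case where
UTF-16 stops being fixed width: one code point takes two 16-bit units.
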

OK, right.

jcc
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
