Fri, 5 Oct 2001 23:23:50 +1000, Andrew J Bromage <[EMAIL PROTECTED]> writes:
> There is a set of one million (more correctly, 1M) Unicode characters
> which are only accessible using surrogate pairs (i.e. two UTF-16
> codes).  There are currently none of these codes assigned,

This information is out of date. AFAIR about 40000 of them are assigned,
most of them for Chinese (current, not historic).

> So rare, in fact, that the cost of strings taking up twice the
> space that they currently do simply isn't worth the cost.

In Haskell, strings already have high overhead. In GHC a Char# value
(inside a Char object) always takes the same size as a pointer
(32 or 64 bits), no matter how much of it is used.

> It just goes to show that strings are not merely arrays of characters
> like some languages would have you believe.

In Haskell String = [Char]. It's true that Char values don't necessarily
correspond to glyphs, but Strings are composed of Chars.

-- 
 __("<  Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/
  ^^                      SUBSTITUTE SIGNATURE
QRCZAK
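
To make the surrogate-pair point in the message above concrete, here is a minimal sketch, not anything from GHC's libraries. The helper name toUtf16 is hypothetical, and U+20000 (a CJK Extension B ideograph) is chosen only as an example of a character outside the 16-bit range. It shows that such a character is a single Char in a String, yet needs two 16-bit codes in UTF-16.

```haskell
import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical helper: encode one Char as UTF-16 code units.
-- Characters below U+10000 fit in one unit; the rest need a
-- surrogate pair (high unit in D800..DBFF, low unit in DC00..DFFF).
-- No validation is done for Chars that are themselves surrogates.
toUtf16 :: Char -> [Word16]
toUtf16 c
  | n < 0x10000 = [fromIntegral n]
  | otherwise   = [hi, lo]
  where
    n  = ord c
    n' = n - 0x10000                          -- 20-bit offset
    hi = fromIntegral (0xD800 + n' `div` 0x400)  -- top 10 bits
    lo = fromIntegral (0xDC00 + n' `mod` 0x400)  -- bottom 10 bits

main :: IO ()
main = do
  let s = "a\x20000" :: String      -- 'a' followed by U+20000
  print (length s)                  -- 2: one list cell per Char
  print (concatMap toUtf16 s)       -- [97,55360,56320]: three UTF-16 units
```

Since String = [Char], length counts Chars, not UTF-16 codes; the two counts differ only for characters that need surrogate pairs.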