On 11/4/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
[snip]
> : ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not
> : support surrogates, and UTF-16 does support surrogates. This means
> : that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The
> : notion of UCS-2 is deprecated and dead. Unicode 2.0 in 1996 changed
> : its default encoding to UTF-16.
> <snip>
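
To make the surrogate point concrete, here is a rough Python sketch (not anything from ICU or PostgreSQL; the helper names `utf16_code_units` and `is_surrogate` and the two sample characters are just my picks for illustration). It encodes one BMP character and one non-BMP character to UTF-16 and looks at the resulting 16-bit code units:

```python
def utf16_code_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    raw = s.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def is_surrogate(unit):
    """True if a 16-bit code unit lies in the surrogate range D800..DFFF."""
    return 0xD800 <= unit <= 0xDFFF


# Sample characters chosen only for illustration.
bmp = utf16_code_units("\u00e9")          # U+00E9, inside the BMP
non_bmp = utf16_code_units("\U0001D11E")  # U+1D11E, outside the BMP

print([hex(u) for u in bmp])      # a single code unit
print([hex(u) for u in non_bmp])  # two code units forming a surrogate pair
```

The BMP character fits in one code unit, which is all UCS-2 could ever express. The non-BMP character needs a pair of surrogate code units, which UTF-16 allows and UCS-2 does not.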
This means it's fine. ICU's use of UTF-16 will not break our support for all of Unicode.

Conversion to and from UTF-16 isn't cheap, however, if you're doing it all the time. Storing ASCII in UTF-16 is pretty lame, since every character takes two bytes instead of one. Widespread use of UTF-16 also tends to hide bugs in the handling of non-BMP characters, because code that treats one 16-bit unit as one character keeps working until a surrogate pair shows up.

...

I would be somewhat surprised to see a substantial performance difference in working with UTF-16 data over UTF-8, but then again ... they'd know and I wouldn't.

Another lame aspect of using Unicode encodings other than UTF-8 internally is that it's harder to figure out what is text in GDB output and such, which can make debugging more difficult.
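
To put rough numbers on the storage and non-BMP points, a small Python sketch (the sample strings and the `naive_utf16_length` helper are made up for illustration, not taken from any real code base):

```python
# Pure ASCII text: one byte per character in UTF-8, two in UTF-16.
ascii_text = "SELECT relname FROM pg_class"
utf8_size = len(ascii_text.encode("utf-8"))
utf16_size = len(ascii_text.encode("utf-16-le"))  # little-endian, no BOM

print(utf8_size, utf16_size)


def naive_utf16_length(s):
    """Count 16-bit code units, wrongly assuming one unit per character."""
    return len(s.encode("utf-16-le")) // 2


# Three code points, one of them (U+1D11E) outside the BMP.
mixed = "a\U0001D11Eb"

print(len(mixed))                 # number of code points
print(naive_utf16_length(mixed))  # number of UTF-16 code units
```

The naive count agrees with the real character count for any BMP-only string, so the mistake goes unnoticed until a non-BMP character appears.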