On Saturday 2005-09-10 11:43, Peter Eisentraut wrote:
> Marc G. Fournier wrote:
> > Are there any data types that can hold pretty much any type of
> > character? UTF-16 isn't supported (or it's missing from the docs), and
> > UTF-8 doesn't appear to have a big enough range ...
>
> UTF-8 has exactly the same "range" as UTF-16. In any case, the UTF-8
> encoding in PostgreSQL is probably your best choice, unless you want to
> dig into the weirdness that is MULE_INTERNAL.

The 8.1 beta documentation says that UTF-8 in earlier versions of Pg only covered the first 16 bits of Unicode (code points up to U+FFFF). Unfortunately, "pure" Unicode (a fixed-width encoding such as UTF-32) uses 32 bits per character, and (according to my copy of Unicode Demystified) at least 21 bits are needed to represent all the code points available in Unicode 3.x. (I think Unicode is now at 4.x.)

This means that the code space supported by Pg 8.0 is technically too small. It shouldn't matter in practice, though, unless you are working with Chinese (some of the rarer ideographs lie outside the first 16 bits) or a private character set.
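
To put numbers on the 16-bit versus 21-bit point, here is a minimal Python sketch. Python is not part of this thread, the helper names (bits_needed, utf8_len, utf16_units) are mine, and the two sample characters are just illustrations I picked. It only shows how the encodings themselves behave; it says nothing about what any Pg version does with these characters.

```python
# Illustrative sketch only: helper names and sample characters are assumptions,
# not anything taken from PostgreSQL.

# The Unicode code space runs from U+0000 to U+10FFFF.
MAX_CODE_POINT = 0x10FFFF


def bits_needed(code_point: int) -> int:
    """Bits needed to store code_point as an unsigned integer."""
    return max(1, code_point.bit_length())


def utf8_len(ch: str) -> int:
    """Number of bytes ch occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units ch occupies when encoded as UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


bmp_char = "\u4e2d"        # U+4E2D, a common Chinese character within the first 16 bits
supp_char = "\U00020000"   # U+20000, a rarer ideograph beyond the first 16 bits

print("bits for U+10FFFF:", bits_needed(MAX_CODE_POINT))
for ch in (bmp_char, supp_char):
    print(f"U+{ord(ch):04X}: {bits_needed(ord(ch))} bits, "
          f"{utf8_len(ch)} UTF-8 bytes, {utf16_units(ch)} UTF-16 code units")
```

Both encodings can represent the character beyond the first 16 bits: UTF-8 uses a four-byte sequence and UTF-16 uses a surrogate pair. That is the sense in which the two have the same "range"; the 16-bit limit is a property of an implementation, not of UTF-8 itself.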