On Sat, 7 Aug 2004, John Hansen wrote:

> Now, is it really 24 bits tho?
> Afaict, it's really 21 (0 - 10FFFF or 0 - xxx10000 11111111 11111111)

Yes, up to 0x10ffff should be enough. The 24 is not really important; this is all about which UTF-8 strings to accept as input. The strings are stored as UTF-8, and when they are processed inside pg it uses wchar_t, which is 32 bits (on some systems at least). By restricting the UTF-8 input to the Unicode range, we can in the future store each character as 3 bytes if we want.

-- 
/Dennis Björklund
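
To make the restriction concrete, here is a rough sketch in Python of what "only accept UTF-8 that encodes Unicode" could look like. This is not the PostgreSQL code; the names decode_utf8_char, pack3 and unpack3 are made up for illustration. It also rejects overlong forms and surrogates, which goes beyond what is discussed above. It decodes one sequence, rejects anything above 0x10FFFF, and shows that an accepted code point (at most 21 bits) fits in 3 bytes.

```python
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point, needs 21 bits


def decode_utf8_char(data: bytes, pos: int = 0):
    """Decode one UTF-8 sequence starting at data[pos].

    Returns (code_point, sequence_length). Raises ValueError for
    malformed input or for values outside the Unicode range.
    """
    first = data[pos]
    if first < 0x80:
        return first, 1
    elif first & 0xE0 == 0xC0:
        length, cp, min_cp = 2, first & 0x1F, 0x80
    elif first & 0xF0 == 0xE0:
        length, cp, min_cp = 3, first & 0x0F, 0x800
    elif first & 0xF8 == 0xF0:
        length, cp, min_cp = 4, first & 0x07, 0x10000
    else:
        # Stray continuation byte, or an old-style 5/6-byte lead byte.
        raise ValueError("invalid lead byte")

    if pos + length > len(data):
        raise ValueError("truncated sequence")

    for b in data[pos + 1:pos + length]:
        if b & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)

    if cp < min_cp:
        raise ValueError("overlong encoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code point")
    if cp > MAX_CODE_POINT:
        raise ValueError("beyond U+10FFFF")
    return cp, length


def pack3(cp: int) -> bytes:
    """Store an accepted code point in a fixed 3 bytes (big-endian)."""
    return cp.to_bytes(3, "big")


def unpack3(raw: bytes) -> int:
    """Inverse of pack3."""
    return int.from_bytes(raw, "big")
```

For example, the bytes F4 8F BF BF decode to 0x10FFFF and are accepted, while F4 90 80 80 would decode to 0x110000 and are rejected. A 4-byte UTF-8 sequence can carry 21 payload bits, so the check against 0x10FFFF is needed on top of the bit-pattern checks.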