> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
> Sent: Sunday, April 10, 2005 8:18 AM
> To: Christopher Kings-Lynne
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Unicode problems on IRC
>
> Christopher Kings-Lynne wrote:
> > Hey guys,
> >
> > The 'Unicode characters above 0x10000' issue keeps rearing its ugly
> > head in the IRC channel. I propose that it be fixed, even backported...
> >
> > This is John Hansen's most recent patch to fix it:
> >
> > http://archives.postgresql.org/pgsql-patches/2004-11/msg00259.php
> >
> > And from what I can tell it was committed, then reverted because it
> > wasn't a "bug". It was going to go in for 8.1.
> >
> > We on the channel are starting to think that it is in fact a bug.
> > There are people with legitimately utf-8 encoded XML documents
> > that they cannot store in PostgreSQL. Apparently in the distant past,
> > Unicode was limited to 0x10000, but then was extended.
> >
> > Perhaps we can reopen this case...
>
> Uh, I thought we fixed this another way, by not using
> Unicode-aware functions for upper/lower/initcap when the
> locale is "C" or "POSIX".
> That is backpatched to 8.0.X. Does that not fix the problem reported?
No, as Andrew said, what this patch does is allow values > 0xffff and, at the same time, validate the input to make sure it's valid UTF-8.

... John

> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman@candle.pha.pa.us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
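
To make "allow values > 0xffff while still validating the input" concrete, here is a minimal Python sketch of that kind of check. It is not the code from the patch (which lives in PostgreSQL's C source); the function names `utf8_sequence_length` and `is_valid_utf8` are made up for illustration, and the rules are the usual well-formedness rules for UTF-8: sequences of up to 4 bytes, no overlong encodings, no surrogates, nothing above 0x10FFFF.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length implied by a lead byte, or 0 if the
    byte can never start a well-formed UTF-8 sequence."""
    if lead < 0x80:
        return 1              # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2              # 0xC0/0xC1 would always be overlong
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4              # code points above 0xFFFF live here
    return 0                  # stray continuation byte or 0xF5..0xFF


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical validator: accepts code points up to 0x10FFFF and
    rejects truncated, overlong, and surrogate sequences."""
    # Smallest code point that legitimately needs each sequence length.
    min_code_point = (0, 0, 0x80, 0x800, 0x10000)
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        length = utf8_sequence_length(lead)
        if length == 0 or i + length > n:
            return False      # bad lead byte or truncated sequence
        if length == 1:
            i += 1
            continue
        # Payload bits of the lead byte: 5, 4, or 3 bits for 2/3/4 bytes.
        code_point = lead & (0x7F >> length)
        for byte in data[i + 1:i + length]:
            if (byte & 0xC0) != 0x80:
                return False  # not a continuation byte
            code_point = (code_point << 6) | (byte & 0x3F)
        if code_point < min_code_point[length]:
            return False      # overlong encoding
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return False      # out of range or surrogate
        i += length
    return True
```

With a check of this shape, a 4-byte character such as U+1D11E passes, while a malformed sequence such as `b"\xc0\xaf"` is rejected. A validator that only understands sequences of up to 3 bytes would refuse the first case, which is the problem people on IRC keep hitting.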