> Alvaro Herrera wrote:
> > Sent: Sunday, May 08, 2005 2:49 PM
> > To: John Hansen
> > Cc: Tatsuo Ishii; pgman@candle.pha.pa.us;
> > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Patch for collation using ICU
> >
> > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
> > > Tatsuo Ishii wrote:
> > >
> > > > So Japanese(including ASCII)/UNICODE behavior is perfectly correct
> > > > at this moment.
> > >
> > > Right, so you _never_ use accented ascii characters in Japanese?
> > > (like è for example, whose uppercase is È)
> >
> > That isn't ASCII. It's latin1 or some other ASCII extension.
>
> Point taken...
> But...
>
> If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding,
> not UTF-8 (unicode).
> UTF-8 encoded databases are very useful for representing multiple languages
> in the same database,
> but this usefulness vanishes if functions like upper/lower doesn't work
> correctly.
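
For illustration only, the upper/lower point quoted above can be sketched in Python. This is not PostgreSQL code and says nothing about how the backend implements upper(); the helper `ascii_only_upper` is a hypothetical stand-in for any case mapping that only knows about the ASCII letters a-z and is applied byte by byte to UTF-8 data.

```python
# Illustration only: contrasts a Unicode-aware case mapping with a
# byte-wise, ASCII-only mapping applied to UTF-8 encoded text.

def ascii_only_upper(data: bytes) -> bytes:
    """Uppercase only the bytes for a-z; leave every other byte untouched."""
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in data)

text = "caf\u00e8"  # "cafe" with a grave accent on the final e

# Unicode-aware mapping: the accented letter is uppercased too.
unicode_result = text.upper()

# ASCII-only mapping on the UTF-8 bytes: the two bytes that encode the
# accented letter fall outside a-z, so they pass through unchanged.
bytewise_result = ascii_only_upper(text.encode("utf-8")).decode("utf-8")

print(unicode_result)
print(bytewise_result)
```

The ASCII letters are uppercased in both cases; only the accented letter differs, which is the behavior being argued about in this thread.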
I'm just curious if German/French/Spanish mixed text can be sorted
correctly. I think these languages need their own locales even with
UNICODE/ICU.

> So optimizing for 3 languages breaks more than a hundred, that's doesn't seem
> fair!

Why don't you add a GUC variable or some such to control the upper/lower
behavior?
--
Tatsuo Ishii
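
As a rough illustration of the sorting question raised above, the following Python sketch sorts the same four words three ways. It does not use ICU or any PostgreSQL facility. Both key functions are hypothetical toy stand-ins: `fold_accents` approximates rules that treat ñ as an accented n, and `spanish_like_key` approximates the Spanish alphabet, where ñ is a separate letter between n and o. Lowercase input is assumed.

```python
import unicodedata

# Toy Spanish alphabet: n-tilde (U+00F1) is its own letter after n.
SPANISH_ALPHABET = "abcdefghijklmn\u00f1opqrstuvwxyz"

def fold_accents(s: str) -> str:
    """Strip combining marks, so accented letters compare like base letters."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def spanish_like_key(s: str) -> tuple:
    """Rank letters by position in the toy Spanish alphabet."""
    key = []
    for ch in unicodedata.normalize("NFC", s):
        # Keep n-tilde as a distinct letter; fold every other accent.
        letter = ch if ch == "\u00f1" else fold_accents(ch)
        key.append(SPANISH_ALPHABET.find(letter))
    return tuple(key)

words = ["oso", "\u00f1and\u00fa", "nuez", "nadar"]

by_code_point = sorted(words)                    # raw code point order
by_folding = sorted(words, key=fold_accents)     # n-tilde sorts as n
by_spanish = sorted(words, key=spanish_like_key) # n-tilde sorts after n

print(by_code_point)
print(by_folding)
print(by_spanish)
```

The three orders differ for the same input, which is why a single ordering for mixed-language text cannot satisfy every language's own rules at once.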