Re: [HACKERS] Patch for collation using ICU

Tatsuo Ishii Sat, 07 May 2005 22:46:44 -0700

> Alvaro Herrera wrote:
> > Sent: Sunday, May 08, 2005 2:49 PM
> > To: John Hansen
> > Cc: Tatsuo Ishii; [email protected]; 
> > [EMAIL PROTECTED]; [email protected]
> > Subject: Re: [HACKERS] Patch for collation using ICU
> > 
> > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
> > > Tatsuo Ishii wrote:
> > 
> > > > So Japanese (including ASCII)/UNICODE behavior is perfectly correct
> > > > at this moment.
> > > 
> > > Right, so you _never_ use accented ascii characters in Japanese? 
> > > (like � for example, whose uppercase is �)
> > 
> > That isn't ASCII.  It's latin1 or some other ASCII extension.
> 
> Point taken...
> But...
> 
> If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, 
> not UTF-8 (unicode).
> UTF-8 encoded databases are very useful for representing multiple languages 
> in the same database,
> but this usefulness vanishes if functions like upper/lower don't work
> correctly.


I'm just curious whether mixed German/French/Spanish text can be sorted
correctly. I think these languages need their own locales even with
UNICODE/ICU.

> So optimizing for 3 languages breaks more than a hundred; that doesn't seem
> fair!

Why don't you add a GUC variable or some such to control the
upper/lower behavior?
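
To make the idea concrete, here is a rough sketch of such a switch, written
in Python rather than backend C just to show the two behaviors side by side.
The function name and the "ascii_only_case" flag are made up for
illustration; they are not existing PostgreSQL settings, and a real GUC would
of course live in the backend.

```python
# Hypothetical sketch: one flag selects how upper() treats non-ASCII text.
# "ascii_only_case" is an invented name, not an existing PostgreSQL GUC.

def upper_with_setting(text: str, ascii_only_case: bool) -> str:
    if ascii_only_case:
        # Map only a-z to A-Z and leave every other character untouched.
        return "".join(
            chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
            for ch in text
        )
    # Full Unicode case mapping, as provided by Python's str.upper().
    return text.upper()
```

With the flag on, only ASCII letters change case, so accented Latin letters
stay as they are; with it off, they are uppercased as well. Japanese
characters have no case and come out unchanged either way.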
--
Tatsuo Ishii
