Re: [HACKERS] Patch for collation using ICU

Bruce Momjian Sat, 07 May 2005 07:12:42 -0700

Palle Girgensohn wrote:
> 
> --On Saturday, May 07, 2005 23.15.29 +1000 John Hansen <[EMAIL PROTECTED]> 
> wrote:
> 
> > Btw, I had been planning to propose replacing every single one of the
> > built in charset conversion functions with calls to ICU (thus making pg
> > _depend_ on ICU), as this would seem like a cleaner solution than for us
> > to maintain our own conversion tables.
> >
> > ICU also has a fair few conversions that we do not have at present.


That is a much larger issue, similar to our shipping our own timezone
database.  What does it buy us?
        
        o  Do we ship it in our tarball?
        o  Is the license compatible?
        o  Does it remove utils/mb conversions?
        o  Does it allow us to index LIKE (next high char)?
        o  Does it allow us to support multiple encodings in
           a single database more easily?
        o  How does it affect performance?
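
For context on the "next high char" item: the idea, as I read it, is that
LIKE 'abc%' can use an index if it is rewritten as a range, col >= 'abc' AND
col < 'abd'. A minimal sketch follows, assuming plain byte-wise ordering as in
the C locale. The function name next_high_string() is hypothetical and exists
only for illustration; it is not taken from the PostgreSQL source. In other
locales the collation order is not byte order, which is why the question
comes up at all.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): build an exclusive upper bound
 * for a LIKE prefix by incrementing the last byte that is not 0xFF.
 * Assumes byte-wise comparison, i.e. C-locale ordering.
 *
 * Returns 1 and writes the bound to "out" on success.
 * Returns 0 if the prefix does not fit or no upper bound exists
 * (empty prefix, or every byte is 0xFF).
 */
static int
next_high_string(const char *prefix, char *out, size_t outsize)
{
    size_t len = strlen(prefix);

    if (len + 1 > outsize)
        return 0;
    memcpy(out, prefix, len + 1);

    while (len > 0)
    {
        unsigned char c = (unsigned char) out[len - 1];

        if (c < 0xFF)
        {
            out[len - 1] = (char) (c + 1);  /* bump to the next byte value */
            out[len] = '\0';                /* drop any trailing 0xFF bytes */
            return 1;
        }
        len--;                              /* 0xFF cannot be bumped; carry left */
    }
    return 0;
}
```

With the prefix "abc" this yields "abd", so every string that starts with
"abc" compares byte-wise as >= "abc" and < "abd".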

> I just had a similar thought. And why use ICU only for multibyte charsets? 
> If I use LATIN1, I still expect upper('ß') => SS, and I don't get it... 
> Same for the Turkish example.

We assume the native toupper() can handle single-byte character
encodings.  We use towupper() only for wide character sets.
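
To illustrate the two paths, here is a rough sketch and not the actual backend
code. The function names upper_single_byte() and upper_wide() are made up for
this example. Note that both toupper() and towupper() map one character to
exactly one character, so neither can expand a single character into two (as
the 'SS' case above would require), whatever the locale.

```c
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Illustration only: upper-case a single-byte-encoded string in place.
 * toupper() maps one byte to one byte, so the length never changes.
 */
static void
upper_single_byte(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/*
 * Illustration only: upper-case a wide-character string in place.
 * towupper() maps one wchar_t to one wchar_t, again preserving length.
 */
static void
upper_wide(wchar_t *ws)
{
    for (; *ws != L'\0'; ws++)
        *ws = (wchar_t) towupper((wint_t) *ws);
}
```

Both functions depend on the current LC_CTYPE setting for anything outside
ASCII; in the default "C" locale only a-z are changed.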


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

