Re: [HACKERS] Patch for collation using ICU

John Hansen Sun, 08 May 2005 17:09:23 -0700

Tatsuo Ishii wrote:
> Sent: Sunday, May 08, 2005 11:08 PM
> To: John Hansen
> Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Patch for collation using ICU
> 
> > > I don't buy it. If the current conversion tables do the right thing,
> > > why do we need to replace them? Or, if the conversion tables are not
> > > correct, why don't you fix them? I think the rules of character
> > > conversion will not change frequently, especially for LATIN languages.
> > > Thus the maintenance cost is not too high.
> > 
> > I never said we need to, but if we're going to implement ICU, then we
> > might as well go all the way.
> 
> So you admit there's no benefit in using ICU to replace the existing
> conversions?
> 
> Besides the fact that ICU does not support all existing conversions, I
> think ICU has a serious flaw when used for conversion. If I understand
> correctly, ICU uses UNICODE internally to do the conversion. For example,
> to implement SJIS->EUC_JP conversion, ICU first converts SJIS to UNICODE,
> then converts UNICODE to EUC_JP. The problem is that these conversions are
> not round trip (conversion between SJIS/EUC_JP and UNICODE will lose some
> information). Thus SJIS->EUC_JP->SJIS conversion using ICU does not
> preserve the original text.


Could you please send me a sample text as an attachment encoded in SJIS
where this would happen?
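
For reference, the kind of check I have in mind could be sketched roughly as
below. This is only an illustration of the pivot-through-Unicode approach
described above: it uses Python's built-in codecs rather than ICU, so its
mapping tables may differ from ICU's, and the function name
pivot_round_trip is made up for this example.

```python
def pivot_round_trip(data: bytes, src: str = "shift_jis", dst: str = "euc_jp") -> bool:
    """Return True if src -> Unicode -> dst -> Unicode -> src reproduces data.

    Hypothetical helper: it relies on Python's codec tables, not ICU's.
    """
    try:
        text = data.decode(src)                    # src -> Unicode (the pivot)
        converted = text.encode(dst)               # Unicode -> dst
        back = converted.decode(dst).encode(src)   # dst -> Unicode -> src
    except UnicodeError:
        # A character that cannot be mapped in either direction counts as a failure.
        return False
    return back == data


# Hiragana "a" in Shift_JIS; a sample that fails the check would show the problem.
sample = b"\x82\xa0"
print(pivot_round_trip(sample))
```

Any SJIS byte sequence for which this returns False under ICU's tables would
be the sort of sample I am asking for.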

> --
> Tatsuo Ishii
> 
> 

