Tatsuo Ishii wrote:
> Sent: Sunday, May 08, 2005 11:08 PM
> To: John Hansen
> Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED];
> pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Patch for collation using ICU
>
> > > I don't buy it. If current conversion tables does the right thing,
> > > why we need to replace. Or if conversion tables are not correct, why
> > > don't you fix it? I think the rule of character conversion will not
> > > change frequently, especially for LATIN languages. Thus maintaining
> > > cost is not too high.
> >
> > I never said we need to, but if we're going to implement ICU, then we
> > might as well go all the way.
>
> So you admit there's no benefit using ICU for replacing
> existing conversions?
>
> Besides ICU does not support all existing conversions, I
> think ICU has serious flaw for using conversion. If I
> understand correctly, ICU uses UNICODE internally to do the
> conversion. For example, to implement SJIS->EUC_JP conversion,
> ICU first converts SJIS to UNICODE then converts UNICODE to
> EUC_JP. Problem is these conversion is not round trip
> (conversion between SJIS/EUC_JP and UNICODE will lose some
> information). Thus SJIS->EUC_JP->SJIS conversion using ICU
> does not preserve original text.

Could you please send me a sample text as an attachment encoded in SJIS where this would happen?

> --
> Tatsuo Ishii
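
For anyone who wants to test a sample text for the round-trip problem described above, here is a minimal sketch. It is only an illustration: it uses Python's built-in "shift_jis" and "euc_jp" codecs as a stand-in for a Unicode-pivot converter, not ICU or the PostgreSQL conversion tables, so its results may differ from either. The helper names pivot_convert and round_trips are made up for this example.

```python
def pivot_convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from src to dst by going through Unicode (the pivot)."""
    return data.decode(src).encode(dst)


def round_trips(data: bytes, a: str = "shift_jis", b: str = "euc_jp") -> bool:
    """Return True if a -> b -> a reproduces the original bytes exactly.

    A decode or encode failure in either direction counts as "does not
    round trip", since the original text cannot be recovered.
    """
    try:
        there = pivot_convert(data, a, b)
        back = pivot_convert(there, b, a)
    except UnicodeError:
        return False
    return back == data


if __name__ == "__main__":
    import sys

    # Usage: python roundtrip.py sample.sjis
    # (the file name is hypothetical; pass any SJIS-encoded file)
    with open(sys.argv[1], "rb") as f:
        sample = f.read()
    print("round trip preserved" if round_trips(sample) else "round trip lost data")
```

Running this on a file only shows what Python's codecs do with it. A text that fails here is a candidate worth trying against ICU, not proof of how ICU behaves.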