> ICU's pedantic form

The goal for ICU is to be charset neutral and to support all of the conversions that are in modern use. There are a large number of variants of character sets; you can use the one you want. See:
http://oss.software.ibm.com/icu/charset/index.html

Mark

----- Original Message -----
From: "Dan Kogai" <[EMAIL PROTECTED]>
To: "Nick Ing-Simmons" <[EMAIL PROTECTED]>
Cc: "Nick Ing-Simmons" <[EMAIL PROTECTED]>; "SADAHIRO Tomoyuki" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, February 01, 2002 07:46
Subject: Re: ICU's uconv vs Linux iconv and UTF-8

> On 2002.02.02, at 00:37, Nick Ing-Simmons wrote:
> >> Oh, yes.  This is the problem of the original Unicode 2.x map; It is
> >> not ASCII preservative.  I have posted this problem to perl-
> >> [EMAIL PROTECTED] when I first released Jcode.  Several discussions
> >> later, I made Jcode so that it preserves ASCII by default and added
> >> $Jcode::Unicode::PEDANTIC to change the behavior
> >
> > Ah. I take your point. If we used ICU's pedantic form
> > Both UNIX ~/foo and MS C:\Foo get mangled.
>
> EXACTLY!
>
> > The other differences (having looked at diff in yudit) seems to be
> > mapping ¢ (cent), £ (pound), ¬ (not) and one of the longer dashes to
> > different width variants (full width for ICU).
> >
> > I am going off ICU ...
>
> As I addressed to [EMAIL PROTECTED], Yet another problems that
> ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/ is now gone so I don't
> have a practical way to check the mapping.  I want the mapping back!
>
> Dan
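
For anyone following the thread, the mangling of ~/foo and C:\Foo mentioned above can be sketched in a few lines. This is only an illustration, not ICU's or Jcode's actual code. It assumes that the "pedantic" form means the strict JIS X 0201 Roman reading of the single-byte range, where 0x5C is YEN SIGN (U+00A5) and 0x7E is OVERLINE (U+203E) rather than backslash and tilde. The names PEDANTIC_OVERRIDES and decode_single_byte are made up for this example, and Python is used just for brevity.

```python
# Illustrative sketch only. Assumption: the "pedantic" mapping follows
# JIS X 0201 Roman, which differs from ASCII at exactly these two bytes.
PEDANTIC_OVERRIDES = {
    0x5C: "\u00a5",  # YEN SIGN instead of REVERSE SOLIDUS (backslash)
    0x7E: "\u203e",  # OVERLINE instead of TILDE
}


def decode_single_byte(data: bytes, pedantic: bool = False) -> str:
    """Decode bytes in the 0x00-0x7F range (hypothetical helper).

    With pedantic=False every byte is treated as ASCII.
    With pedantic=True the two overrides above are applied.
    """
    out = []
    for b in data:
        if b > 0x7F:
            # Multi-byte and half-width kana ranges are out of scope here.
            raise ValueError(f"byte 0x{b:02X} is outside the range this sketch covers")
        if pedantic and b in PEDANTIC_OVERRIDES:
            out.append(PEDANTIC_OVERRIDES[b])
        else:
            out.append(chr(b))
    return "".join(out)


# Show both readings of the two example paths from the thread.
for path in (b"~/foo", b"C:\\Foo"):
    print(decode_single_byte(path), "->", decode_single_byte(path, pedantic=True))
```

With the ASCII-preserving reading both paths survive a round trip unchanged; with the pedantic reading the tilde and the backslash come out as different characters, which is why file paths break.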