Re: Question on Unicode data files
Mr Zhang is CEO of that company.

Regards,
Jianping.

John Jenkins wrote:

> On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote:
>
> > Is there any connection between this http://www.unihan.com.cn/ site and
> > IRG? What is UniHan Digital Tech Co.? Their website has some rather
> > annoying graphics and windows, but no basic info that i can see ... the
> > bottom buttons don't work at all, no?
>
> I don't know who they are. They're not associated with the IRG that I'm
> aware. I'm checking with Mr. Zhang to see if he's heard of them.
>
> =
> John H. Jenkins
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> http://homepage.mac.com/jenkins/
Re: Question on Unicode data files
On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote:

> Is there any connection between this http://www.unihan.com.cn/ site and
> IRG? What is UniHan Digital Tech Co.? Their website has some rather
> annoying graphics and windows, but no basic info that i can see ... the
> bottom buttons don't work at all, no?
>

I don't know who they are. They're not associated with the IRG that I'm aware. I'm checking with Mr. Zhang to see if he's heard of them.

=
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: Question on Unicode data files
"John H. Jenkins" wrote:
>
> At 7:57 AM -0800 2/26/01, Richard Zhang wrote:
> >Hello, Marco,
> >
> >Unihan is the official site I think. You can visit www.unihan.com.cn for
> >more information about this, if you know Chinese :).

Knowing Chinese is not enough. You and your browser need to know Simplified Chinese (GBK?) ... arguably not Chinese at all ...

> >
> >If you sign up for cooperation with them, you will get full access to their
> >database.

what does "cooperation" mean?

> >
>
> No, Unihan is *NOT* the official site. They are not in any way
> associated with Unicode. The official Unihan database is available
> only from unicode.org.
>

Is there any connection between this http://www.unihan.com.cn/ site and IRG? What is UniHan Digital Tech Co.? Their website has some rather annoying graphics and windows, but no basic info that i can see ... the bottom buttons don't work at all, no?
Re: Question on Unicode data files
At 7:57 AM -0800 2/26/01, Richard Zhang wrote:
>Hello, Marco,
>
>Unihan is the official site I think. You can visit www.unihan.com.cn for
>more information about this, if you know Chinese :).
>
>If you sign up for cooperation with them, you will get full access to their
>database.
>

No, Unihan is *NOT* the official site. They are not in any way associated with Unicode. The official Unihan database is available only from unicode.org.

--
=
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: Question on Unicode data files
Marco asked:

> The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
> on http://www.unicode.org/Public) contains several files with mappings of
> East Asian character sets to/from Unicode.
>
> Are all these sources in sync? If not, which ones is it better to trust?
>
> - UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,

Actually: Unihan.txt, as Marco pointed out in his correction.

> EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
> GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
> JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)
>
> - MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)
>
> - MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT

The two files in MAPPINGS/EASTASIA/ are old and out-of-date.

MAPPINGS/EASTASIA/UNIHAN.TXT is identical to Unihan-2.txt, which can be found under /Public/2.1-Update/

The CJKXREF.TXT is even older.

The current Unihan file is:

/Public/UNIDATA/Unihan.txt

That is the same as /Public/3.0-Update/Unihan-3.txt.

The Unihan file currently under beta review for Unicode 3.1 can be found in the beta directory:

/Public/3.1-Update/

It will be renamed to Unihan-3.1.txt when the beta period is done, and will then also appear in the UNIDATA directory as Unihan.txt.

Everything else under /Public/MAPPINGS/EASTASIA/ constitutes mapping tables to particular code pages, and many of those are also somewhat out of date.

>
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT

> Moreover, directory UNIDATA contains UnicodeData.txt and
> UnicodeData-Latest.txt. They seem to always be identical (same date &
> time, same size).
>
> Which one of them is the official Unicode database, and what is the other
> one for?

UnicodeData.txt is the official version. UnicodeData-Latest.txt is a duplicate placed there because of earlier policy, just in case anyone still had links pointing to "UnicodeData-Latest.txt" for the current version, instead of "UnicodeData.txt", so their links would not break.

To find out about *official* versions of data files, always start from the standard page:

http://www.unicode.org/unicode/standard/standard.html

and follow the link to the "Enumerated Versions" page:

http://www.unicode.org/unicode/standard/versions/enumeratedversions.html

That page always gives you the links to the latest data files, and to the data files for each specific version of the standard.

The MAPPINGS directory is all informative, and is not a part of the official Unicode Character Database at this time.

--Ken

>
> Thanks.
>
> _ Marco
>
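The claim that UnicodeData-Latest.txt is a plain duplicate of UnicodeData.txt can be checked on downloaded copies by comparing content rather than date and size. The following is a minimal sketch, assuming both files have already been saved locally; the helper names `file_digest` and `same_content` are hypothetical and are not part of any Unicode tooling.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first: Path, second: Path) -> bool:
    """Return True when both files hold exactly the same bytes."""
    if first.stat().st_size != second.stat().st_size:
        return False  # files of different size can never match
    return file_digest(first) == file_digest(second)
```

With local copies in the working directory, `same_content(Path("UnicodeData.txt"), Path("UnicodeData-Latest.txt"))` would be expected to return True if the two really are duplicates. Matching timestamps and sizes alone do not guarantee that, which is why the sketch compares digests.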
Re: Question on Unicode data files
Hello, Marco,

Unihan is the official site I think. You can visit www.unihan.com.cn for more information about this, if you know Chinese :).

If you sign up for cooperation with them, you will get full access to their database.

Is this helpful to you?

Best regards,
Richard

- Original Message -
From: "Marco Cimarosti" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Monday, February 26, 2001 6:24 AM
Subject: Question on Unicode data files

> The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
> on http://www.unicode.org/Public) contains several files with mappings of
> East Asian character sets to/from Unicode.
>
> Are all these sources in sync? If not, which ones is it better to trust?
>
> - UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,
> EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
> GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
> JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)
>
> - MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)
>
> - MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT
> (Big-5, CCCII, CNS-86, CNS-92, EACC, GB-0=2312-80, GB-1=12345-90,
> GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, IBM-Japan,
> JIS-0=X-0208-90, JIS-1=X-0212-90, KSC-0=5601-89, KSC-1=5657-1991,
> Pseudo-GB-1=12345-90, Telegraph-PRC, Telegraph-Taiwan, Xerox)
>
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0201.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0208.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0212.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/SHIFTJIS.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/KSC/HANGUL.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/JOHAB.TXT (containing
> KS-X-1001-97=KS-C-5601-92)
> - MAPPINGS/EASTASIA/EASTASIA/KSC/KSC5601.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/KSX1001.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/OLD5601.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/OTHER/BIG5.TXT
> - MAPPINGS/EASTASIA/EASTASIA/OTHER/CNS11643.TXT
>
> Moreover, directory UNIDATA contains UnicodeData.txt and
> UnicodeData-Latest.txt. They seem to always be identical (same date &
> time, same size).
>
> Which one of them is the official Unicode database, and what is the other
> one for?
>
> Thanks.
>
> _ Marco
>
RE: Question on Unicode data files
I wrote:

> - UNIDATA/CJKXREF.TXT ([...]

Errata: I meant UNIDATA/UNIHAN.TXT

Sorry.

_ Marco
Question on Unicode data files
The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped on http://www.unicode.org/Public) contains several files with mappings of East Asian character sets to/from Unicode.

Are all these sources in sync? If not, which ones is it better to trust?

- UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E, EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90, JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)

- MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT (Big-5, CCCII, CNS-86, CNS-92, EACC, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, IBM-Japan, JIS-0=X-0208-90, JIS-1=X-0212-90, KSC-0=5601-89, KSC-1=5657-1991, Pseudo-GB-1=12345-90, Telegraph-PRC, Telegraph-Taiwan, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
- MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT

- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0201.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0208.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0212.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/SHIFTJIS.TXT

- MAPPINGS/EASTASIA/EASTASIA/KSC/HANGUL.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/JOHAB.TXT (containing KS-X-1001-97=KS-C-5601-92)
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSC5601.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSX1001.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/OLD5601.TXT

- MAPPINGS/EASTASIA/EASTASIA/OTHER/BIG5.TXT
- MAPPINGS/EASTASIA/EASTASIA/OTHER/CNS11643.TXT

Moreover, directory UNIDATA contains UnicodeData.txt and UnicodeData-Latest.txt. They seem to always be identical (same date & time, same size).

Which one of them is the official Unicode database, and what is the other one for?

Thanks.

_ Marco
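One way to approach the "are these sources in sync?" question is to load two mapping tables for the same character set and list where they disagree. The following is a rough sketch only. It assumes a simple two-column layout (legacy code, then Unicode code point, both in hexadecimal, with `#` starting a comment); files with a different column layout would need a different parser. The function names `parse_mapping` and `diff_mappings` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def parse_mapping(lines: Iterable[str]) -> Dict[int, int]:
    """Parse rows of the assumed form '0x2121<TAB>0x3000<TAB># comment'."""
    table: Dict[int, int] = {}
    for line in lines:
        body = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not body:
            continue
        fields = body.split()
        if len(fields) < 2:
            continue  # row has no Unicode column, so there is nothing to map
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def diff_mappings(
    a: Dict[int, int], b: Dict[int, int]
) -> Tuple[Dict[int, Tuple[int, int]], Set[int], Set[int]]:
    """Return (conflicting codes, codes only in a, codes only in b)."""
    conflicts = {
        code: (a[code], b[code])
        for code in a.keys() & b.keys()
        if a[code] != b[code]
    }
    return conflicts, set(a.keys() - b.keys()), set(b.keys() - a.keys())
```

For two local files this could be used as `diff_mappings(parse_mapping(open(path_a)), parse_mapping(open(path_b)))`, where `path_a` and `path_b` are placeholders for whichever tables are being compared. An empty result in all three parts would suggest the two tables agree; any conflicts would still need to be judged against the officially versioned data files.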