Re: Question on Unicode data files

2001-02-28 Thread Jianping Yang

Mr Zhang is CEO of that company.

Regards,
Jianping.

John Jenkins wrote:

> On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote:
>
> > Is there any connection between this http://www.unihan.com.cn/ site and
> > IRG? What is UniHan Digital Tech Co.? Their website has some rather
> > annoying graphics and windows, but no basic info that i can see ... the
> > bottom buttons don't work at all, no?
> >
>
> I don't know who they are.  They're not associated with the IRG as far as
> I'm aware.  I'm checking with Mr. Zhang to see if he's heard of them.
>
> =
> John H. Jenkins
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> http://homepage.mac.com/jenkins/





Re: Question on Unicode data files

2001-02-27 Thread John Jenkins


On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote:

> Is there any connection between this http://www.unihan.com.cn/ site and
> IRG? What is UniHan Digital Tech Co.? Their website has some rather
> annoying graphics and windows, but no basic info that i can see ... the
> bottom buttons don't work at all, no?
>

I don't know who they are.  They're not associated with the IRG as far as
I'm aware.  I'm checking with Mr. Zhang to see if he's heard of them.

=
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/



Re: Question on Unicode data files

2001-02-26 Thread Richard Cook

"John H. Jenkins" wrote:
> 
> At 7:57 AM -0800 2/26/01, Richard Zhang wrote:
> >Hello, Marco,
> >
> >Unihan is the official site I think. You can visit www.unihan.com.cn for
> >more information about this, if you know Chinese :).

Knowing Chinese is not enough. You and your browser need to know
Simplified Chinese (GBK?) ... arguably not Chinese at all ...
> >
> >If you sign up for cooperation with them, you will get full access to their
> >database.

what does "cooperation" mean?
> >
> 
> No, Unihan is *NOT* the official site.  They are not in any way
> associated with Unicode.  The official Unihan database is available
> only from unicode.org.
> 

Is there any connection between this http://www.unihan.com.cn/ site and
IRG? What is UniHan Digital Tech Co.? Their website has some rather
annoying graphics and windows, but no basic info that i can see ... the
bottom buttons don't work at all, no?



Re: Question on Unicode data files

2001-02-26 Thread John H. Jenkins

At 7:57 AM -0800 2/26/01, Richard Zhang wrote:
>Hello, Marco,
>
>Unihan is the official site I think. You can visit www.unihan.com.cn for
>more information about this, if you know Chinese :).
>
>If you sign up for cooperation with them, you will get full access to their
>database.
>

No, Unihan is *NOT* the official site.  They are not in any way 
associated with Unicode.  The official Unihan database is available 
only from unicode.org.

-- 
=
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/



Re: Question on Unicode data files

2001-02-26 Thread Kenneth Whistler

Marco asked:

> 
> The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
> on http://www.unicode.org/Public) contains several files with mappings of
> East Asian character sets to/from Unicode.
> 
> Are all these sources in sync? If not, which ones is it better to trust?
> 
> - UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,

Actually:   Unihan.txt, as Marco pointed out in his correction.

> EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
> GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
> JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)
> 
> - MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)
> 
> - MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT

The two files in MAPPINGS/EASTASIA/ are old and out of date.
MAPPINGS/EASTASIA/UNIHAN.TXT is identical to Unihan-2.txt, which
can be found under /Public/2.1-Update/. The CJKXREF.TXT is even
older.

The current Unihan file is:

/Public/UNIDATA/Unihan.txt

That is the same as /Public/3.0-Update/Unihan-3.txt.

The Unihan file currently under beta review for Unicode 3.1 can
be found in the beta directory:

/Public/3.1-Update/

It will be renamed to Unihan-3.1.txt when the beta period is done, and
will then also appear in the UNIDATA directory as Unihan.txt.
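A minimal sketch of reading such a file, assuming each data line has the
tab-separated layout "U+XXXX<TAB>tag<TAB>value" and that lines starting
with '#' are comments. The function name parse_unihan and the sample lines
are hypothetical and only illustrate the assumed layout; check the header
of the actual Unihan.txt before relying on it.

```python
from collections import defaultdict


def parse_unihan(lines):
    """Collect records as {code point: {tag: value}} (assumed layout)."""
    records = defaultdict(dict)
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 2)
        # Ignore anything that does not match the assumed three-field layout.
        if len(parts) != 3 or not parts[0].startswith("U+"):
            continue
        code_point = int(parts[0][2:], 16)
        records[code_point][parts[1]] = parts[2]
    return dict(records)


# Illustrative lines only, not copied from the real file.
sample = [
    "# comment line",
    "U+4E00\tkBigFive\tA440",
    "U+4E00\tkTotalStrokes\t1",
]
print(parse_unihan(sample))
```

To read the real file, pass an open file object instead of the sample
list, for example parse_unihan(open("Unihan.txt", encoding="utf-8")).
The encoding is an assumption as well.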

Everything else under /Public/MAPPINGS/EASTASIA/ consists of
mapping tables to particular code pages, and many of those
are also somewhat out of date.
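Whether two such mapping tables are in sync can be checked mechanically.
The sketch below assumes the simple two-column form (legacy code, then
Unicode code point, both as 0x-prefixed hex, with '#' starting a comment).
Some of the tables have more columns and would need a different column
choice. The names parse_mapping and diff_mappings and the sample rows are
hypothetical.

```python
def parse_mapping(lines):
    """Parse 'legacy<ws>unicode' hex pairs into {legacy: unicode}."""
    table = {}
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # not a mapping row in the assumed layout
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def diff_mappings(a, b):
    """Return (keys only in a, keys only in b, keys mapped differently)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed


# Made-up rows for illustration, not taken from any published table.
old = parse_mapping(["0x2121\t0x3000\t# comment", "0x2122\t0x3001", "0x2123\t0x3002"])
new = parse_mapping(["0x2121\t0x3000", "0x2122\t0xFF0C", "0x2124\t0x30FB"])
only_old, only_new, changed = diff_mappings(old, new)
print([hex(k) for k in only_old], [hex(k) for k in only_new], [hex(k) for k in changed])
```

Three empty lists mean the two tables agree on every legacy code.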

> 
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT

> Moreover, directory UNIDATA contains UnicodeData.txt and
> UnicodeData-Latest.txt. They seem to always be identical (same date &
> time, same size).
> 
> Which one of them is the official Unicode database, and what is the other
> one for?

UnicodeData.txt is the official version. UnicodeData-Latest.txt is a
duplicate placed there because of earlier policy, just in case anyone
still had links pointing to "UnicodeData-Latest.txt" for the current
version, instead of "UnicodeData.txt", so their links would not break.
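Matching date and size only suggest that two files are duplicates.
Comparing content digests confirms it. A minimal sketch, assuming both
files have been downloaded locally; the function names file_digest and
same_content are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have byte-for-byte identical content."""
    return file_digest(path_a) == file_digest(path_b)
```

For example, same_content("UnicodeData.txt", "UnicodeData-Latest.txt")
should return True if the second file really is a plain copy of the first.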

To find out about *official* versions of data files, always start from
the standard page:

http://www.unicode.org/unicode/standard/standard.html

and follow the link to the "Enumerated Versions" page:

http://www.unicode.org/unicode/standard/versions/enumeratedversions.html

That page always gives you the links to the latest data files, and
to the data files for each specific version of the standard. The
MAPPINGS directory is all informative, and is not a part of the
official Unicode Character Database at this time.

--Ken

> 
> Thanks.
> _ Marco
> 




Re: Question on Unicode data files

2001-02-26 Thread Richard Zhang

Hello, Marco,

Unihan is the official site I think. You can visit www.unihan.com.cn for
more information about this, if you know Chinese :).

If you sign up for cooperation with them, you will get full access to their
database.

Is this helpful to you?

Best regards,


Richard

- Original Message -
From: "Marco Cimarosti" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Monday, February 26, 2001 6:24 AM
Subject: Question on Unicode data files


> The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
> on http://www.unicode.org/Public) contains several files with mappings of
> East Asian character sets to/from Unicode.
>
> Are all these sources in sync? If not, which ones is it better to trust?
>
> - UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,
> EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
> GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
> JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)
>
> - MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)
>
> - MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT
> (Big-5, CCCII, CNS-86, CNS-92, EACC, GB-0=2312-80, GB-1=12345-90,
> GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, IBM-Japan,
> JIS-0=X-0208-90, JIS-1=X-0212-90, KSC-0=5601-89, KSC-1=5657-1991,
> Pseudo-GB-1=12345-90, Telegraph-PRC, Telegraph-Taiwan, Xerox)
>
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
> - MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0201.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0208.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0212.TXT
> - MAPPINGS/EASTASIA/EASTASIA/JIS/SHIFTJIS.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/KSC/HANGUL.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/JOHAB.TXT (containing
> KS-X-1001-97=KS-C-5601-92)
> - MAPPINGS/EASTASIA/EASTASIA/KSC/KSC5601.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/KSX1001.TXT
> - MAPPINGS/EASTASIA/EASTASIA/KSC/OLD5601.TXT
>
> - MAPPINGS/EASTASIA/EASTASIA/OTHER/BIG5.TXT
> - MAPPINGS/EASTASIA/EASTASIA/OTHER/CNS11643.TXT
>
> Moreover, directory UNIDATA contains UnicodeData.txt and
> UnicodeData-Latest.txt. They seem to always be identical (same date &
> time, same size).
>
> Which one of them is the official Unicode database, and what is the other
> one for?
>
> Thanks.
> _ Marco
>



RE: Question on Unicode data files

2001-02-26 Thread Marco Cimarosti

I wrote
> - UNIDATA/CJKXREF.TXT ([...]

Erratum: I meant UNIDATA/UNIHAN.TXT

Sorry.

_ Marco



Question on Unicode data files

2001-02-26 Thread Marco Cimarosti

The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
on http://www.unicode.org/Public) contains several files with mappings of
East Asian character sets to/from Unicode.

Are all these sources in sync? If not, which ones is it better to trust?

- UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,
EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing same mappings as above)

- MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT
(Big-5, CCCII, CNS-86, CNS-92, EACC, GB-0=2312-80, GB-1=12345-90,
GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, IBM-Japan,
JIS-0=X-0208-90, JIS-1=X-0212-90, KSC-0=5601-89, KSC-1=5657-1991,
Pseudo-GB-1=12345-90, Telegraph-PRC, Telegraph-Taiwan, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
- MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT

- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0201.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0208.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0212.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/SHIFTJIS.TXT

- MAPPINGS/EASTASIA/EASTASIA/KSC/HANGUL.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/JOHAB.TXT (containing
KS-X-1001-97=KS-C-5601-92)
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSC5601.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSX1001.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/OLD5601.TXT

- MAPPINGS/EASTASIA/EASTASIA/OTHER/BIG5.TXT
- MAPPINGS/EASTASIA/EASTASIA/OTHER/CNS11643.TXT

Moreover, directory UNIDATA contains UnicodeData.txt and
UnicodeData-Latest.txt. They seem to always be identical (same date &
time, same size).

Which one of them is the official Unicode database, and what is the other
one for?

Thanks.
_ Marco