Re: Errors in Unihan data : simplified/traditional variants

2010-11-01 Thread John H. Jenkins

On 2010/10/30, at 8:42 PM, Koxinga wrote:

 My quick parsing program counted 1154 such pairs, where the head 
 character was the same as on the line above. They always seem to be in the 
 order kTraditionalVariant then kSimplifiedVariant, so they can maybe be 
 corrected automatically. It seems to be a very evident mistake, and the 
 correction should be easy. I can help with that; I am just waiting to see 
 whether this is the right place to report problems in Unihan. I also 
 considered http://www.unicode.org/reporting.html ; would that be better?
 

Yes, that would be better.  That way it will be tracked and it's less likely to 
slip through the cracks in my schedule.  For general questions, you can email 
me directly.

=
Hoani H. Tinikini
John H. Jenkins
jenk...@apple.com






Errors in Unihan data : simplified/traditional variants

2010-10-31 Thread Koxinga

Hello,

I recently looked into the traditional-simplified relationships in the 
Unihan database (Unihan_Variants.txt).


I knew it had mistakes and I wanted to help correct some of them, but 
the first thing that stood out and surprised me was the large number of 
lines like:


U+346F  kSimplifiedVariant  U+3454
U+346F  kTraditionalVariant U+3454

which should be (if I didn't mix them up ...)

U+3454  kTraditionalVariant U+346F
U+346F  kSimplifiedVariant  U+3454

My quick parsing program counted 1154 such pairs, where the head 
character was the same as on the line above. They always seem to be in 
the order kTraditionalVariant then kSimplifiedVariant, so they can maybe 
be corrected automatically. It seems to be a very evident mistake, and 
the correction should be easy. I can help with that; I am just waiting 
to see whether this is the right place to report problems in Unihan. I also 
considered http://www.unicode.org/reporting.html ; would that be better?
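
For reference, here is a minimal sketch of the kind of check described 
above. It is not the original parsing program. It assumes the 
whitespace-separated "code point, field, value" layout of 
Unihan_Variants.txt with "#" comment lines, and the function name 
find_suspect_pairs and the sample lines are made up for illustration. 
It only flags code points that list the same target under both 
kSimplifiedVariant and kTraditionalVariant; it does not decide which 
direction is the correct one.

```python
from collections import defaultdict


def find_suspect_pairs(lines):
    """Return (source, target) pairs where the same target appears as both
    the kSimplifiedVariant and the kTraditionalVariant of one source."""
    # source code point -> field name -> set of target code points
    variants = defaultdict(lambda: defaultdict(set))
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # not a "code point, field, value" record
        source, field, value = parts
        if field in ("kSimplifiedVariant", "kTraditionalVariant"):
            # a value may hold several space-separated code points
            variants[source][field].update(value.split())

    suspects = []
    for source in sorted(variants):
        fields = variants[source]
        both = fields["kSimplifiedVariant"] & fields["kTraditionalVariant"]
        suspects.extend((source, target) for target in sorted(both))
    return suspects


# Illustrative sample only; the real input would be Unihan_Variants.txt.
sample = """\
# sample records in the assumed layout
U+346F\tkSimplifiedVariant\tU+3454
U+346F\tkTraditionalVariant\tU+3454
U+4E07\tkTraditionalVariant\tU+842C
""".splitlines()

suspect_pairs = find_suspect_pairs(sample)
print(suspect_pairs)
```

On the real file one would pass the open file object instead of the 
sample, and len(find_suspect_pairs(f)) would give the number of 
suspect pairs.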


I have a lot of other questions and comments on these 
simplified/traditional relationships, but I guess they can wait until 
this problem is resolved; otherwise this email would get too long.


Regards,

Koxinga