Hello,
In the Unihan.txt database, some entries in the kMandarin field list
the same pronunciation more than once. For example:
U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4
U+4E4E kMandarin 1 HU1 HU2 2 HU1
U+4E86 kMandarin 1 LIAO3 2 LE LIAO3
Is there a reason for these duplicates? If so, the format of this
field should be documented better in the file header. If the
duplicates are errors, I can supply a list of them.
Also, what is the meaning of the isolated numbers (the 1, 2, 3 that
appear between the readings)?
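A list of the duplicates could be produced with a short script. The
following is only a minimal sketch in Python, assuming every line has
the shape shown in the examples above (code point, field name, then
whitespace-separated tokens). The function name find_duplicate_readings
is my own, and treating the isolated numbers as something to skip is
also an assumption, since their meaning is not documented.

```python
from collections import Counter


def find_duplicate_readings(line):
    """Return the readings that occur more than once on one kMandarin line."""
    parts = line.split()
    # parts[0] is the code point and parts[1] the field name (assumed layout).
    tokens = parts[2:]
    # Assumption: purely numeric tokens are the undocumented "isolated
    # numbers", not readings, so they are left out of the comparison.
    readings = [t for t in tokens if not t.isdigit()]
    counts = Counter(readings)
    return sorted(r for r, n in counts.items() if n > 1)


samples = [
    "U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4",
    "U+4E4E kMandarin 1 HU1 HU2 2 HU1",
    "U+4E86 kMandarin 1 LIAO3 2 LE LIAO3",
]

for s in samples:
    print(s.split()[0], find_duplicate_readings(s))
```

For the three sample lines this reports LIANG3, HU1 and LIAO3
respectively.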
----------------
Other entries certainly contain errors, for example:
U+5594 kMandarin 1 WO1 2 01
                         ^ this is a zero, not the letter O.
U+4EC0 kMandarin 1 SHI2 2 SHEN2 3 SHI2 SHIU2SHEN2 SHI2
                                       ^^^^ ?? --> shi2 shen2 ??
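Entries like these two could be found mechanically by checking each
token against the expected shape. The sketch below is in Python and
rests on assumptions of mine, not on any documented format: a reading
is taken to be uppercase letters, an optional colon (as in "U:"), and an
optional tone digit from 1 to 5, and an isolated number is taken to be
a positive integer without a leading zero. The name suspicious_tokens
is hypothetical.

```python
import re

# Assumed shape of a reading: letters, optional ":", optional tone digit.
READING = re.compile(r"[A-Z]+:?[1-5]?")
# Assumed shape of an isolated number: positive integer, no leading zero.
INDEX = re.compile(r"[1-9][0-9]*")


def suspicious_tokens(line):
    """Return the tokens that are neither a plausible reading nor a number."""
    tokens = line.split()[2:]  # skip the code point and the field name
    return [
        t for t in tokens
        if not READING.fullmatch(t) and not INDEX.fullmatch(t)
    ]


print(suspicious_tokens("U+5594 kMandarin 1 WO1 2 01"))
print(suspicious_tokens(
    "U+4EC0 kMandarin 1 SHI2 2 SHEN2 3 SHI2 SHIU2SHEN2 SHI2"))
```

Under these assumptions the first line is flagged for the token 01 and
the second for SHIU2SHEN2. The check only tests the shape of each
token, so a well-formed but wrong reading would not be caught.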
Regards,
Pierpaolo Bernardi