On Tuesday, November 14, 2000, at 08:24 AM, Pierpaolo Bernardi wrote:

> In the Unihan.txt database, in the kMandarin field there are entries
> with duplicate pronunciations. For example:
> 
> U+4E21        kMandarin       1 LIANG3 2 LIANG3 3 LIANG4
> U+4E4E        kMandarin       1 HU1 HU2 2 HU1
> U+4E86        kMandarin       1 LIAO3 2 LE LIAO3
> 
> Is there a reason for these duplicates? If this is the case, the
> format of this field should be documented better in the header. If
> these duplications are errors, I can supply a list of them.
> 

That would be very helpful, yes.  

> Also, what's the meaning of the isolated numbers?
> 

The values in this field were taken from dictionaries.  When a dictionary lists 
more than one meaning for a character, one pronunciation is often specific to one 
meaning and another pronunciation to a different meaning.  The numbers index those 
meanings.

Inasmuch as the database doesn't maintain the link between specific definitions and 
pronunciations, the isolated numbers should also be removed.
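
For what it's worth, here is a rough Python sketch of how such a cleanup might 
be scripted.  It assumes each kMandarin value is a whitespace-separated list in 
which a bare number introduces the readings for one dictionary sense, as in the 
three examples quoted above.  The function names are made up for illustration, 
and keeping first-seen order is my own assumption, not anything the database 
specifies.

```python
def parse_kmandarin(value):
    """Group readings under the sense number that precedes them.

    Readings that appear before any number are filed under None.
    """
    groups = {}
    current = None
    for token in value.split():
        if token.isdigit():
            current = int(token)
            groups.setdefault(current, [])
        else:
            groups.setdefault(current, []).append(token)
    return groups


def has_duplicates(value):
    """Return True if any reading occurs more than once in the field."""
    readings = [t for t in value.split() if not t.isdigit()]
    return len(readings) != len(set(readings))


def flatten_readings(value):
    """Drop the sense numbers and repeated readings, keeping first-seen order."""
    seen = []
    for token in value.split():
        if not token.isdigit() and token not in seen:
            seen.append(token)
    return seen


if __name__ == "__main__":
    for field in ("1 LIANG3 2 LIANG3 3 LIANG4",
                  "1 HU1 HU2 2 HU1",
                  "1 LIAO3 2 LE LIAO3"):
        print(field, "->", " ".join(flatten_readings(field)))
```

Running has_duplicates over every kMandarin line would produce the list of 
affected entries; flatten_readings shows what each value would look like once 
the numbers and repeats are gone.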
