I've been looking at the Vietnamese readings given in the Unihan database
recently, and although I don't know Vietnamese, I think there may be something
not quite right with some of them, and so I wondered if anyone on this list who
knows Vietnamese could confirm the validity of the Unihan Vietnamese readings.

Since Unicode 3.2 the Unihan database has included Vietnamese Nôm readings for
164 basic CJK ideographs (from U+66F2 up to U+9C31, which is odd in itself), 122
CJK-A ideographs, and 4,230 CJK-B ideographs. The Vietnamese readings for the
CJK-A and CJK-B ideographs look like phonetic variations on the original Chinese
pronunciations of the ideographs (as would be expected), but none of the
Vietnamese readings for the 164 basic CJK ideographs bear any correspondence
with the Chinese pronunciations for the same ideographs.

I used the excellent Nôm Lookup Tool provided by the Nôm Foundation
(http://www.nomfoundation.org/nomdb/lookup.php) to check the Vietnamese readings
given in the Unihan database, and found that the Nôm readings for a random
sample of CJK-A and CJK-B ideographs exactly matched the readings given in the
Unihan database. On the other hand, none of the readings given by the Nôm Lookup
Tool for basic CJK ideographs (between U+66F2 and U+9C31) matched the readings
given in the Unihan database.

For example, the Unihan database has the following readings for these three
basic CJK ideographs :

U+66F2  kVietnamese     giả <U+0067, U+0069, U+1EA3>
U+66F4  kVietnamese     xâu <U+0078, U+00E2, U+0075>
U+6771  kVietnamese     hốc <U+0068, U+1ED1, U+0063>

On the other hand the Nôm Lookup Tool gives the following readings for the same
ideographs :

U+66F2 = khúc <U+006B, U+0068, U+00FA, U+0063>
U+66F4 = canh <U+0063, U+0061, U+006E, U+0068>
U+6771 = đông <U+0111, U+00F4, U+006E, U+0067>

And looking up the Unihan Vietnamese readings for these three ideographs with
the Nôm Lookup Tool gives the following results :

giả = U+4F3D or U+5047 or U+5056 or U+8005 or U+8D6D
xâu = U+507B or U+641C or U+22D1C or U+22E64 or U+26113
hốc = U+561D or U+21417
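
In case anyone wants to repeat the comparison on a larger sample, here is a rough sketch in Python of the check I did by hand. It assumes the Unihan data is tab-separated as "code point, field name, value", and the reference readings are simply the three I copied from the Nôm Lookup Tool above; the function names are my own and not part of any Unihan tooling.

```python
import unicodedata

# Three kVietnamese lines as quoted above (assumed tab-separated).
SAMPLE_LINES = [
    "U+66F2\tkVietnamese\tgiả",
    "U+66F4\tkVietnamese\txâu",
    "U+6771\tkVietnamese\thốc",
]

# Readings copied by hand from the Nôm Lookup Tool for the same ideographs.
REFERENCE = {
    "U+66F2": "khúc",
    "U+66F4": "canh",
    "U+6771": "đông",
}


def nfc(text):
    """Normalize to NFC so precomposed and decomposed spellings compare equal."""
    return unicodedata.normalize("NFC", text)


def parse_kvietnamese(lines):
    """Return {code point label: [readings]} for the kVietnamese lines."""
    readings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3 or fields[1] != "kVietnamese":
            continue  # skip every other field
        readings[fields[0]] = [nfc(r) for r in fields[2].split()]
    return readings


def code_points(text):
    """Spell out a reading as a list of U+XXXX code points."""
    return ", ".join(f"U+{ord(ch):04X}" for ch in nfc(text))


def find_mismatches(unihan, reference):
    """Return the entries whose reference reading is absent from Unihan."""
    mismatches = {}
    for cp, ref_reading in reference.items():
        found = unihan.get(cp, [])
        if nfc(ref_reading) not in found:
            mismatches[cp] = (found, nfc(ref_reading))
    return mismatches


unihan = parse_kvietnamese(SAMPLE_LINES)
for cp, (found, expected) in find_mismatches(unihan, REFERENCE).items():
    print(f"{cp}: Unihan has {found}, lookup tool has {expected} "
          f"<{code_points(expected)}>")
```

For the three ideographs above this reports a mismatch on every line, which is the discrepancy I am asking about.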

Can anyone tell me whether this discrepancy between the Unihan Vietnamese
readings and the readings given by the Nôm Lookup Tool is due to an error in the
Unihan database or due to my lack of understanding of Vietnamese ?

Andrew
