Hello,

I am comparing radical data for CJK characters from different sources, 
including the Unihan database. According to the Unihan documentation*, the 
kRSUnicode radical should correspond to the kRSKangXi radical, which in turn 
should be based on the Kang Xi dictionary.

Is there any explanation for the following discrepancies? Did I miss any other 
rules or reasoning behind the content of these two fields?

Examples of the discrepancies:

(1) A very common character for "most, maximum".
U+6700  kRSKangXi       73.8
U+6700  kRSUnicode      13.10

(2) A funny character for autumn containing the turtle component.
U+9F9D  kRSKangXi       115.16
U+9F9D  kRSKanWa        115.16
U+9F9D  kRSUnicode      213.5

There are also characters that are not actually included in the Kang Xi 
dictionary**, yet the Unihan data give them both a purported Kang Xi radical 
and a _different_ Unicode radical.

(3) The simplified turtle character (commonly assigned to the traditional 
radical #213):
U+4E80  kRSKangXi       213.0
U+4E80  kRSUnicode      5.10

(4) A character with radical #72/73 at the top. Choosing between the two is 
IMHO arbitrary, but I would not expect the two fields to differ:
U+66FB  kRSKangXi       72.7
U+66FB  kRSUnicode      73.7
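
A comparison like this can be reproduced with a short script. The following is 
only a minimal sketch, assuming input lines in the "U+XXXX field value" form 
shown above (the Unihan files separate the columns with tabs). The names 
parse_radicals and find_mismatches are hypothetical helpers of my own, not part 
of any Unihan tooling. It looks only at the first value of each field and only 
at the leading digits of the radical number.

```python
import re
from collections import defaultdict

# The four examples quoted in this message.
SAMPLE = """\
U+6700  kRSKangXi   73.8
U+6700  kRSUnicode  13.10
U+9F9D  kRSKangXi   115.16
U+9F9D  kRSKanWa    115.16
U+9F9D  kRSUnicode  213.5
U+4E80  kRSKangXi   213.0
U+4E80  kRSUnicode  5.10
U+66FB  kRSKangXi   72.7
U+66FB  kRSUnicode  73.7
"""


def parse_radicals(lines):
    """Map code point -> {field name: radical number} for kRS* fields."""
    radicals = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)  # works for tabs and for spaces
        if len(parts) != 3 or not parts[1].startswith("kRS"):
            continue
        codepoint, field, value = parts
        # Assumption: a field may hold several space-separated values;
        # only the first one is considered here.
        match = re.match(r"\d+", value.split()[0])
        if match:
            radicals[codepoint][field] = int(match.group())
    return radicals


def find_mismatches(radicals, a="kRSKangXi", b="kRSUnicode"):
    """Return sorted (code point, radical in a, radical in b) where they differ."""
    return sorted(
        (cp, fields[a], fields[b])
        for cp, fields in radicals.items()
        if a in fields and b in fields and fields[a] != fields[b]
    )


if __name__ == "__main__":
    for cp, kangxi, unicode_radical in find_mismatches(
        parse_radicals(SAMPLE.splitlines())
    ):
        print(f"{cp}  kRSKangXi={kangxi}  kRSUnicode={unicode_radical}")
```

To run it over real data, one would pass an open Unihan data file to 
parse_radicals instead of SAMPLE.splitlines(); which file holds the kRS* 
fields depends on the Unihan release.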

- - -

[*] <http://www.unicode.org/reports/tr38/tr38-8.html>: "Property: kRSUnicode // 
Description: (...) The first value is intended to reflect the same radical as 
the kRSKangXi field and the stroke count of the glyph used to print the 
character within the Unicode Standard."

[**] The two characters, (3) and (4), are missing from the '89 edition of Kang 
Xi (which should be the same edition as the one used for Unihan), according to 
a search on this site: <http://ctext.org/dictionary.pl>


-- 
Adam Nohejl

