Discrepancies between kTotalStrokes and kRSUnicode in the Unihan database - repost all ascii

2014-09-09 Thread John Armstrong
be gotten by adding the stroke count of the radical in its standalone form to the stroke count of the residual portion. However, it can always be gotten by subtracting the stroke count of the residual portion from the total stroke count of the character. The Unihan database provides the exact data
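
The subtraction described in this snippet can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the function names are hypothetical, the pattern assumes a kRSUnicode value of the form radical number, optional apostrophes, a period, then the residual stroke count, and the sample values are illustrative rather than quoted from the database.

```python
import re

# Assumed value syntax for kRSUnicode: radical number, optional apostrophes
# (marking a variant radical form), ".", then the residual stroke count.
RS_PATTERN = re.compile(r"^(\d{1,3})('*)\.(-?\d{1,2})$")


def parse_rs(value):
    """Split a kRSUnicode-style value into (radical, apostrophe_count, residual)."""
    match = RS_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognised radical-stroke value: {value!r}")
    return int(match.group(1)), len(match.group(2)), int(match.group(3))


def implied_radical_strokes(total_strokes, rs_value):
    """Radical stroke count implied by total strokes minus residual strokes."""
    _radical, _marks, residual = parse_rs(rs_value)
    return total_strokes - residual


# Illustrative values only: a character with 7 total strokes filed under
# Radical #140 with 4 residual strokes implies a 3-stroke radical form.
print(implied_radical_strokes(7, "140.4"))
```

Comparing the implied count with the stroke count of the standalone radical is one way to flag the kind of mismatch the thread discusses.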

Re: Discrepancies between kTotalStrokes and kRSUnicode in the Unihan database

2014-09-09 Thread Andrew West
Hi John, You raise some interesting points, and I hope that one of the people who maintain the Unihan database can address your issues better than I can. I think that the reason why the main CJK block shows the greatest number of mismatches between kTotalStrokes and kRSUnicode is related

Re: Discrepancies between kTotalStrokes and kRSUnicode in the Unihan database - repost all ascii

2014-09-09 Thread Richard COOK
On Sep 9, 2014, at 8:28 AM, Richard COOK rsc...@wenlin.com wrote: On Sep 8, 2014, at 12:03 PM, John Armstrong john.armstrong@gmail.com wrote: Mr. Armstrong, I see that my reply to your message bounced from the main Unicode list, due to length constraints. At any rate, the message did

Re: Discrepancies between kTotalStrokes and kRSUnicode in the Unihan database

2014-09-09 Thread John Armstrong
Thanks for the comments, Andrew. When I wrote up my second example I was not yet thinking of the complicating factor of the coexistence of 3- and 4-stroke variants of Radical #140 'grass' in top position. #140 is not the main radical in the example but it is the radical of the phonetic, and if the

Re: Discrepancies between kTotalStrokes and kRSUnicode in the Unihan database - repost all ascii

2014-09-09 Thread John Armstrong
Thanks for your long and detailed reply, Richard. (The full version came to me directly so I could see it.) It will take me some time to digest it, but since you suggest I submit something to UTC I want to clarify the extent of my knowledge of and ongoing involvement with Han characters. I

Re: Multiple encoding used with Unihan database

2012-04-19 Thread z-test
for the information. You may notice that on some pages (in the web based Unihan database version) you can find terms like fén while on other pages there will be fen2 instead - perhaps the problem I noticed is not a database issue, as you suggest, but an issue in the interface. I may

Multiple encoding used with Unihan database

2012-04-17 Thread z-test
Good morning! I frequently consult the Unihan database to get detailed information about Japanese and Chinese characters, and I have noticed that at least some pages are encoded in more than one encoding, that is to say, although the main encoding is in UTF-8 (as one would expect

Re: Multiple encoding used with Unihan database

2012-04-17 Thread Jim Breen
z-..@shiroha.jp wrote: Subject: Multiple encoding used with Unihan database I frequently consult the Unihan database to get detailed information about Japanese and Chinese characters, and I have noticed that at least some pages are encoded in more than one encoding, that is to say

Unihan database

2012-04-13 Thread Martin Heijdra
Librarians are certainly a group of users using Unihan a lot, to identify encodings for rare characters. Several of them have complained that it is getting more and more difficult for them to use. One issue is that the database itself started to use encodings rather than images; which made it

Re: Unihan database

2012-04-13 Thread John H. Jenkins
Yes, this is very much possible, although I can't predict how soon we'll get it done. Martin Heijdra mheij...@princeton.edu 於 2012年4月13日 上午10:26 寫道: Librarians are certainly a group of users using Unihan a lot, to identify encodings for rare characters. Several of them have complained

Re: [unicode] Unihan database

2012-04-13 Thread suzuki toshiya
I think this comment is related to the current implementation of SimSun. Why don't you try to install the free fonts that can support the missing characters? From the viewpoint of a user over the distant network, the images-by-default is worse in some cases... Regards, mpsuzuki Martin Heijdra

Re: Unihan database

2012-04-13 Thread Julian Bradfield
On 2012-04-13, Martin Heijdra mheij...@princeton.edu wrote: But now they report that the radical-stroke page itself has changed to encodings rather than images; and the radicals are not in the standard fonts. Hence, the search pages (clicking on the number of strokes of the radical) show

RE: [unicode] Unihan database

2012-04-13 Thread Martin Heijdra
PM To: Martin Heijdra Cc: unicode@unicode.org Subject: Re: [unicode] Unihan database I think this comment is related to the current implementation of SimSun. Why don't you try to install the free fonts that can support the missing characters? From the viewpoint of a user over the distant network

License on the Unihan database

2002-11-10 Thread Florian Weimer
Which of the two licenses applies to the Unihan database? The one in the file, or the one in UnicodeCharacterDatabase.html? Does the latter permit modified redistribution?