RE: Term Asian is not used properly on Computers and NET
There are also terms like the West or Western (world, languages, civilization, etc) which have referents that are not completely west of the Greenwich Meridian, whose usage cannot be simply explained or justified by it. Every point can be found west (or east) of the Greenwich Meridian. Not all of them have west or east longitudes, though. YA
Re: Why call kanji/hanji/hanja 'ideographs' when almost none are?
oh, and BTW, Jon, what ~10 are you thinking of? I can't think of any ... Characters like 'above', 'below', 'center' ... depends on what you are willing to accept as 'an idea' and when you start calling it a 'snapshot of an action' like the words for 'music/medicine', 'learn' etc. Apart from that it's a bit pointless to have this argument (yet again). Linguists and sinologists and other -ists have had countless discussions on what to call our Jih. The problem lies not only in the different native terms used throughout the community of usage (Chinese languages, Korean, Japanese, *Vietnamese etc) but also in the fact that it's not a 'pure' set in the sense that they have all been derived via one process. So any term, be it lexigraph, ideograph, logograph, zograph, hieroglyph, glyph etc ff et ad infinitum will be in a way imprecise. The question should be whether there is a point to this discussion? Language isn't rocket science, we use a lot of terms which are highly imprecise or hard to define, but as long as we know what we're referring to, that's the problem solved, isn't it? It might be something for semanticists to discuss what the prototypical Jih is and get embroiled about the lexical decomposition issues and all that stuff ... but when X says 'Ideograph' on this list, we all know what they're talking about; it's not even like it's not PC to say ideograph (unless I've missed something). One is as wrong as the others, let's just pick one and be damned to it. Michael
RE: Why call kanji/hanji/hanja 'ideographs' when almost none are?
Jon, Most Kanji have Kun readings. The fact that they also have On readings is not material. Calling Kanji ideographic is referring to their Kun properties. I find that most foreigners who know nothing about Japanese are completely unaware of On readings and how Kanji are also used as a phonetic alphabet. This does not take away from the fact that ideographic is about as close as you can get in English to Kun readings. Kanji do express ideas independent of pronunciation. Carl -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Jon Babcock Sent: Friday, June 01, 2001 3:17 PM To: [EMAIL PROTECTED] Subject: Why call kanji/hanji/hanja 'ideographs' when almost none are? The Asia/East Asian/CJK thread reminded me of one of my own pet peeves, the use of 'ideograph' to refer to kanji. Perhaps some of the professionals on this list can enlighten me here. I thought that an ideograph meant that the graph stood for an idea, not a sound or a zographic image. Since only a very small percentage of kanji do this ... I can think of only about ten ... why do writers on Unicode lend credence to a fundamental misconception by using this term to refer to the whole lot? In English, wouldn't it be better to say 'han characters' or even just 'kanji', a word which has been in at least one English dictionary now for over twenty years? Jon -- Jon Babcock [EMAIL PROTECTED]
RE: UTF-8S (was: Re: ISO vs Unicode UTF-8)
One more thought on this topic: the issue has to do with comparing the results of sorting two data sources. It would seem to me that there's another issue that has to be taken into consideration here: normalisation. You can't just do a simple sort using raw binary comparison; you have to normalise strings before you compare them, even if the comparison is a binary compare. Why can they not, in the same process, also normalise the order in which the strings would binary sort? Various people (on unicoRe) have already presented efficient algorithms for doing this that would not add significant overhead to the normalisation process. If the response is that the particular Oracle clients requesting this have already ensured that the data sources are in (say) normalization form C, then that is one more indication that this is, in fact, a proprietary solution. If it is to be documented as a UTR (which in practice must make it an officially approved Unicode encoding form), then the UTR should also discuss the motivation, which has to do with comparing the sort results of two data sources, and should point out the need to normalise those data sources -- if the whole point is to make sure people know that there are issues involved in making their comparisons valid, then all of the issues should be pointed out, not just some. I think, though, that putting the two together will really beg the question. And remember, if it isn't just a proprietary solution, we *still* need to deal with the case of two data sources where one is UTF-16 and the other is UTF-8 or UTF-32 (not UTF-8s or UTF-32s). I still haven't heard from the advocates of this proposal how they reconcile that issue. - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: [EMAIL PROTECTED]
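To make the point above concrete, here is a minimal Python sketch of normalising strings and fixing up the binary sort order in one pass. It is not necessarily one of the algorithms presented on unicoRe; it uses the widely described UTF-16 code-unit fix-up (move surrogates above U+E000..U+FFFF) so that comparing 16-bit units gives the same order as comparing UTF-8 bytes or code points. The function names (utf16_units, fixup, code_point_order_key) are hypothetical and chosen only for illustration, and NFC is assumed as the normalisation form.

```python
import unicodedata


def utf16_units(s: str) -> list[int]:
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")  # big-endian, no BOM
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fixup(unit: int) -> int:
    """Remap one UTF-16 code unit so unit order matches code point order.

    Surrogates (D800..DFFF) are moved above the rest of the BMP, and
    E000..FFFF are moved down to fill the gap.
    """
    if unit >= 0xE000:
        return unit - 0x800
    if unit >= 0xD800:
        return unit + 0x2000
    return unit


def code_point_order_key(s: str) -> list[int]:
    """Sort key: NFC-normalise first, then apply the UTF-16 fix-up."""
    nfc = unicodedata.normalize("NFC", s)
    return [fixup(u) for u in utf16_units(nfc)]


bmp_high = "\uFF5E"            # a BMP character above the surrogate range
supplementary = "\U00010000"   # first supplementary character

# Raw UTF-16 binary order puts the supplementary character first ...
raw_utf16_disagrees = utf16_units(supplementary) < utf16_units(bmp_high)
# ... while UTF-8 binary order puts it last.
utf8_order = bmp_high.encode("utf-8") < supplementary.encode("utf-8")
# With the fix-up, the UTF-16 source sorts the same way as the UTF-8 source.
fixed_order = code_point_order_key(bmp_high) < code_point_order_key(supplementary)

# Normalisation makes canonically equivalent strings compare equal.
same_after_nfc = code_point_order_key("e\u0301") == code_point_order_key("\u00e9")
```

With a key like this, a UTF-16 data source can be sorted so that its results line up with a UTF-8 or UTF-32 source that was normalised to the same form and sorted with a plain binary compare.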