RE: Term Asian is not used properly on Computers and NET

2001-06-03 Thread Yves Arrouye

 There are also terms like the West or Western (world, languages,
civilization, etc) which have referents that are not completely west of
the Greenwich Meridian, whose usage cannot be simply explained or
justified by it.

Every point can be found west (or east) of the Greenwich Meridian. Not all
of them have west or east longitudes, though.

YA




Re: Why call kanji/hanji/hanja 'ideographs' when almost none are?

2001-06-03 Thread akerbeltz.alba

 oh, and BTW, Jon, what ~10 are you thinking of? I can't think of any ...

Characters like 'above', 'below', 'center' ... depends on what you are
willing to accept as 'an idea' and when you start calling it a 'snapshot of
an action' like the words for 'music/medicine', 'learn' etc.

Apart from that it's a bit pointless to have this argument (yet again).
Linguists and sinologists and other -ists have had countless discussions on
what to call our Jih. The problem lies not only in the different native
terms used throughout the community of usage (Chinese languages, Korean,
Japanese, *Vietnamese etc) but also in the fact that it's not a 'pure' set
in the sense that they have all been derived via one process. So any term,
be it lexigraph, ideograph, logograph, zograph, hieroglyph, glyph etc ff et
ad infinitum will be in a way imprecise.

The question should be whether there is a point to this discussion? Language
isn't rocket science, we use a lot of terms which are highly imprecise or
hard to define, but as long as we know what we're referring to, that's the
problem solved, isn't it? It might be something for semanticists to discuss
what the prototypical Jih is and get embroiled about the lexical
decomposition issues and all that stuff ... but when X says 'Ideograph' on
this list, we all know what they're talking about; it's not even like it's
not PC to say ideograph (unless I've missed something).
One is as wrong as the others; let's just pick one and be damned to it.

Michael





RE: Why call kanji/hanji/hanja 'ideographs' when almost none are?

2001-06-03 Thread Carl W. Brown

Jon,

Most Kanji have Kun readings.  The fact that they also have On readings
is not material.  Calling Kanji ideographic is referring to their Kun
properties.

I find that most foreigners who know nothing about Japanese are completely
unaware of On readings and how Kanji are also used as a phonetic alphabet.

This does not take away from the fact that ideographic is about as close as
you can get in English to Kun readings.  Kanji do express ideas independent
of pronunciation.

Carl


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Jon Babcock
Sent: Friday, June 01, 2001 3:17 PM
To: [EMAIL PROTECTED]
Subject: Why call kanji/hanji/hanja 'ideographs' when almost none are?


The Asia/East Asian/CJK thread reminded me of one of my own pet peeves,
the use of 'ideograph' to refer to kanji.

 Perhaps some of the professionals on this list can enlighten me here. I
thought that an ideograph meant that the graph stood for an idea, not a
sound or a zographic image. Since only a very small percentage of kanji
do this ... I can think of only about ten ...  why do writers on Unicode
lend credence to a fundamental misconception by using this term to refer
to the whole lot?

In English, wouldn't it be better to say 'han characters' or even just
'kanji' a word which has been in at least one English dictionary now for
over twenty years?

Jon
--
Jon Babcock [EMAIL PROTECTED]












RE: UTF-8S (was: Re: ISO vs Unicode UTF-8)

2001-06-03 Thread Peter_Constable


One more thought on this topic: the issue has to do with comparing the
results of sorting two data sources. It would seem to me that there's
another issue that has to be taken into consideration here: normalisation.
You can't just do a simple sort using raw binary comparison; you have to
normalise strings before you compare them, even if the comparison is a
binary compare. Why can they not in the process also normalise the way that
strings would binary sort? Various people (on unicoRe) have already
presented efficient algorithms for doing this that would not add
significant overhead to the normalisation process.
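
To illustrate the point above, here is a minimal Python sketch that folds both steps into one sort key: it normalises to NFC and then applies a commonly described fix-up to the UTF-16 code units so that a plain binary compare yields code point order. This is not necessarily one of the algorithms posted on unicoRe; the names utf16_units and code_point_order_key are made up for this example.

```python
import unicodedata


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def code_point_order_key(s):
    """Hypothetical helper: normalise to NFC, then remap UTF-16 code units
    so that comparing the resulting lists gives code point order."""
    s = unicodedata.normalize("NFC", s)
    key = []
    for unit in utf16_units(s):
        if unit >= 0xE000:
            unit -= 0x0800   # move U+E000..U+FFFF below the surrogate range
        elif unit >= 0xD800:
            unit += 0x2000   # move surrogates above every BMP code unit
        key.append(unit)
    return key


# Canonically equivalent strings get the same key after normalisation.
same_key = code_point_order_key("e\u0301") == code_point_order_key("\u00e9")

# A supplementary character now sorts after a high BMP character.
ordered = sorted(["\U00010000", "\uFFFD"], key=code_point_order_key)
```

The remapping costs one comparison and one addition per code unit, so under these assumptions it adds little to a pass that already has to visit every code unit for normalisation.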

If the response is that the particular Oracle clients requesting this have
already ensured that the data sources are in (say) normalization
form C, then that is one more indication that this is, in fact, a
proprietary solution. If it is to be documented as a UTR (which in practice
must make it an officially approved Unicode encoding form), then the UTR
should also discuss the motivation, which has to do with comparing the sort
results of two data sources, and should point out the need to normalise
those data sources -- if the whole point is to make sure people know that
there are issues involved in making their comparisons valid, then all of
the issues should be pointed out, not just some. I think, though, that
putting the two together will really beg the question.

And remember, if it isn't just a proprietary solution, we *still* need to
deal with the case of two data sources where one is UTF-16 and the other is
UTF-8 or UTF-32 (not UTF-8s or UTF-32s). I still haven't heard from the
advocates of this proposal how they reconcile that issue.
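
The mismatch is easy to reproduce. The following Python sketch (an illustration only; the two sample characters and the helper name utf16_units are chosen for this example) sorts the same two strings by raw binary order in each standard encoding form. UTF-8 and UTF-32 agree with code point order, while UTF-16 places the supplementary character first because its lead surrogate D800 is below FFFD.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_high = "\uFFFD"          # BMP character above the surrogate range
supplementary = "\U00010000"  # first character beyond the BMP
items = [supplementary, bmp_high]

# Binary order of the encoded forms, expressed as sort keys.
by_utf8 = sorted(items, key=lambda s: s.encode("utf-8"))
by_utf16 = sorted(items, key=utf16_units)
by_utf32 = sorted(items, key=lambda s: s.encode("utf-32-be"))

print([hex(ord(c)) for c in by_utf8])
print([hex(ord(c)) for c in by_utf16])
print([hex(ord(c)) for c in by_utf32])
```

A data source sorted in raw UTF-16 order will therefore disagree with one sorted in raw UTF-8 or UTF-32 order whenever both supplementary characters and characters in U+E000..U+FFFF are present.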



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]