Misha Wolf wrote:
> On 30/08/2001 09:16:21 Marco Cimarosti wrote:
> > Viranga Ratnaike wrote:
> > > Is it ok for Unicode code points to be
> > > encoded/serialized using EUC?
> [...]
> >
> > EUC size simply doesn't fit Unicode.
> >
> [...]
> That is, IMO, quite a misleading reply. It would be more
> helpful to say something like:
>
> Yes, it is OK for Unicode code points to be encoded using
> EUC.
*This* is, IMHO, a *very* misleading statement.
No: it is not OK to encode Unicode in EUC, because this would be technically
impossible, as I explained before and you explain again here:
> Keep in mind, though, that the EUC character repertoire is a lot smaller
> than the Unicode character repertoire. Consequently, many Unicode
> characters cannot be directly encoded using EUC.
In fact, only Unicode characters U+0000 to U+4000 would be representable in
a hypothetical "euc-unicode" encoding (so, for example, no Unified ideographs
would be allowed, as they start at U+4E00).
Moreover, even if such a hybrid were possible, no current application would
recognize or process it, because the only expected forms of Unicode are
UTF-8, UTF-16, and UTF-32 (plus some obsolete or variant forms that are not
worth mentioning here).
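To make those three forms concrete, here is a small sketch of my own in
Python (the codec names are Python's standard ones, and the variable names
are just for illustration). It serializes U+4E00, the first Unified
ideograph, in each form, big-endian and without a byte order mark:

```python
# Sketch: serialize U+4E00 (the first Unified ideograph) in the three
# standard Unicode encoding forms, big-endian and without a BOM.
ch = "\u4e00"

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

for name, data in (("UTF-8", utf8), ("UTF-16BE", utf16), ("UTF-32BE", utf32)):
    # Show each serialization as space-separated hex bytes.
    print(name, " ".join(f"{b:02x}" for b in data))
```

The same code point takes three, two, and four bytes respectively; these
are the byte sequences an application expects when it is told "Unicode".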
> Of course, EUC (EUC-JP in the
> case of Japanese) may cover all the characters you require, in which
> case there is no problem.
What does Unicode have to do with this!? You are talking now about EUC-JP
(a.k.a. EUC-JIS); Viranga was asking about using EUC to serialize Unicode.
> Additionally, if you are thinking of XML (or
> HTML) then you can encode *all* Unicode characters in an EUC-encoded
> document, by employing numeric character references for characters
> outside the EUC character repertoire. Using the same technique, you can
> encode all Unicode characters in an ASCII-encoded document.
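For reference, the technique described in the quoted paragraph can be
sketched in Python roughly as follows. This is only an illustration under
my own assumptions: the helper name `to_markup_bytes` and the sample string
are made up, and `xmlcharrefreplace` is Python's standard codec error
handler, not something mentioned in this thread.

```python
# Sketch: characters missing from the target charset are replaced by
# numeric character references (&#NNNN;) instead of raising an error.
def to_markup_bytes(text: str, charset: str) -> bytes:
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode(charset, errors="xmlcharrefreplace")

# U+65E5 is in the EUC-JP repertoire; U+0905 (DEVANAGARI LETTER A) is not.
sample = "\u65e5\u0905"

euc_doc = to_markup_bytes(sample, "euc_jp")
ascii_doc = to_markup_bytes(sample, "ascii")

print(euc_doc)    # native EUC-JP bytes for U+65E5, a reference for U+0905
print(ascii_doc)  # both characters become references
```

Note that turning the references back into characters is the job of the
XML or HTML parser, not of the EUC or ASCII decoder.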
OK. But what does this have to do with Unicode, JIS, EUC, or anything else
in Viranga's question?
You are not obliged to reply to a question but, if you decide to do so, you
should reply to that question, not to something else.
_ Marco