Karlsson Kent - keka <[EMAIL PROTECTED]> wrote:
> At the level of XML the number of bits is irrelevant.
> The "high and low surrogate" code points are excluded
> from being used as NCRs.  A character (not UTF-16 code
> units) can be referenced by NCRs. See (XML) production 66
> (CharRef) and its well-formedness constraint (and
> production 2 (Char), though they missed to exclude a number
> of other non-character code points in that production).

I know that XML explicitly excludes surrogates. My question is really about
how to encode the non-BMP characters in the new Han unification data that
will become part of 10646 and Unicode in the not-too-distant future: is
this huge block of characters regarded as irrelevant, or has anyone
proposed an encoding that can be used?
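
To make the quoted point concrete, here is a minimal Python sketch of the
distinction between referencing a character and referencing UTF-16 code
units. It is only an illustration under stated assumptions: the function
names (is_xml_char, to_ncr, utf16_units) are made up for this example, the
ranges are those of the XML 1.0 Char production (production 2), and
U+20000 is used simply as a sample code point in plane 2, outside the BMP.

```python
def is_xml_char(cp):
    """True if cp falls in the ranges of XML 1.0 production 2 (Char)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF        # stops just below the surrogates
        or 0xE000 <= cp <= 0xFFFD      # resumes just above the surrogates
        or 0x10000 <= cp <= 0x10FFFF   # the supplementary planes
    )


def to_ncr(cp):
    """Build a hexadecimal numeric character reference for one code point."""
    if not is_xml_char(cp):
        # Covers the surrogate range D800..DFFF, among others.
        raise ValueError("U+%04X is not an XML Char" % cp)
    return "&#x%X;" % cp


def utf16_units(cp):
    """Return the UTF-16 code units for cp, for comparison only."""
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


if __name__ == "__main__":
    sample = 0x20000  # assumed example of a plane-2 code point
    print(to_ncr(sample))
    print([hex(u) for u in utf16_units(sample)])
```

In this sketch the non-BMP character is referenced with a single NCR built
from its code point, while the two surrogate values that UTF-16 would use
for it are each rejected by to_ncr, which matches the well-formedness
constraint on CharRef mentioned in the quoted text.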

