Re: Roundtripping Solved

Doug Ewell Wed, 15 Dec 2004 10:36:47 -0800

Arcane Jill <arcanejill at ramonsky dot com> wrote:

> DEFINITION - "f" is a function which maps an arbitrary octet stream to
> a sequence of Unicode characters, such that (1) any substring which
> happens to be valid UTF-8 is mapped to the sequence of Unicode
> characters which would have been produced by UTF-8, and (2) all
> remaining single octets, xx (with x necessarily such that 0x80 <= xx
> <= 0xFF) are each mapped to the sequence: { U+0C55E3, U+01ED7A,
> U+05FDCB, U+09C351, U+07E168, U+0BBC80, U+107C09, U+0BA458, U+064188,
> U+048375, U+08ACE0, U+031DEF, U+00xx } (I got those numbers from a
> true random number generator).


Reminds me of Masahiko Maedera's "UTF-16X" proposal, which used triples
of code points in the block U+EExxx to represent values above 0x110000,
under the (false) assumption that such a thing was needed.

Of course, Jill's scheme uses non-private-use Unicode scalar values to
achieve what is essentially a private-use function, so this is still
non-conformant.  (A similar scheme that used only code points from the
Plane 0, Plane 15, and Plane 16 PUAs would be fine.)  But I gather that
Lars isn't too worried about being non-conformant, or we wouldn't be
having this thread.
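
For the curious, here is a rough sketch of what such a PUA-only variant
might look like in Python.  The base value 0xF0000 (so that octet xx maps
to U+F00xx in the Plane 15 PUA), the handler name "pua_escape", and the
function names are all my own assumptions for illustration, not part of
Jill's definition or of any standard.  It leans on Python's UTF-8 decoder
to decide which octets count as invalid.

```python
import codecs

# Hypothetical choice: invalid octet xx is mapped to U+F00xx (Plane 15 PUA).
PUA_BASE = 0xF0000


def _escape_invalid(exc):
    """Error handler: map each undecodable octet to one PUA code point."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Resume decoding right after the octets reported as invalid.
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("pua_escape", _escape_invalid)


def f(octets: bytes) -> str:
    """Decode valid UTF-8 normally; escape every other octet into the PUA."""
    return octets.decode("utf-8", errors="pua_escape")


def f_inverse(text: str) -> bytes:
    """Turn escaped code points back into single octets, UTF-8 for the rest."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE + 0x80 <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)
        else:
            out += ch.encode("utf-8")
    return bytes(out)
```

Note that this sketch does not round-trip input that already contains
the valid UTF-8 encoding of U+F0080..U+F00FF: f_inverse would collapse
those characters to single octets.  That collision is presumably what the
long random prefix in Jill's version is meant to make improbable.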

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
