Re: Roundtripping Solved

Marcin 'Qrczak' Kowalczyk Wed, 15 Dec 2004 11:30:35 -0800

Peter Kirk <[EMAIL PROTECTED]> writes:

> Jill, again your solution is ingenious. But would it not work just
> as well for Lars' purposes to use, instead of your string of
> random characters, just ONE reserved code point followed by U+0xx?
> Instead of asking the UTC to allocate a specific code point for this
> (which it probably will not do), he can use either U+FFFE or U+FFFF,
> which "are intended for process internal uses, but are not permitted
> for interchange." Let's call the one non-character chosen INVALID.


Perhaps what is needed is a shift of viewpoint, not a big technical
change.

Don't call it a UTF. Call it escaping. Don't reserve 128 code points.
Use an existing but rarely used code point as a prefix for each
escaped byte embedded among the code points, and escape the escape
character itself if it occurs in the original. Perhaps the character
could be ESC (U+001B) or SUB (U+001A), followed by U+00nn.
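
To make the idea concrete, here is a minimal sketch in Python. It is
only an illustration under assumptions of mine: ESC is the chosen
escape character, the input is meant to be UTF-8, and the function
names decode_escaped and encode_escaped are hypothetical. It relies
on Python's standard "surrogateescape" error handler to find the
undecodable bytes.

```python
ESC = "\x1b"  # assumed escape character; SUB (U+001A) would work the same way


def decode_escaped(raw: bytes) -> str:
    """Decode UTF-8; each undecodable byte 0xNN becomes ESC + U+00NN."""
    # surrogateescape maps every undecodable byte 0xNN to the lone surrogate U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == ESC:
            out.append(ESC + ESC)               # escape the escape
        elif 0xDC80 <= cp <= 0xDCFF:
            out.append(ESC + chr(cp - 0xDC00))  # escaped raw byte
        else:
            out.append(ch)
    return "".join(out)


def encode_escaped(text: str) -> bytes:
    """Inverse of decode_escaped; raises ValueError on a malformed escape."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != ESC:
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling escape character at end of string")
        nxt = text[i + 1]
        if nxt == ESC:
            out.append(0x1B)                    # literal ESC from the original
        elif 0x80 <= ord(nxt) <= 0xFF:
            out.append(ord(nxt))                # restore the raw byte
        else:
            raise ValueError("invalid escape sequence")
        i += 2
    return bytes(out)


raw = b"caf\xe9\x1b.txt"        # invalid UTF-8 byte 0xE9 plus a literal ESC
text = decode_escaped(raw)      # "caf" ESC U+00E9 ESC ESC ".txt"
roundtrip = encode_escaped(text)
```

Only bytes 0x80..0xFF can be invalid in UTF-8, so ESC followed by
U+0080..U+00FF never collides with the doubled ESC.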

Well, a viewpoint shift doesn't solve all problems: it's still
dangerous for interoperability. If the programmer doesn't do anything
special when writing filenames to a file, then instead of an error
indicating that the goal has no natural solution, he gets an escaped
string which will not be understood by other applications which don't
use this convention. If the filename is passed to a part of the
program which doesn't use this convention, it will break there too.
If something cannot be done reliably, it's better to signal the
problem immediately than to hide it and misbehave later.
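
A small Python illustration of the difference, with an example
filename and an escaped form that are my own assumptions (ESC
followed by U+00E9 standing for the byte 0xE9): strict decoding fails
at once, while a component unaware of the convention silently writes
different bytes.

```python
original = b"caf\xe9.txt"   # Latin-1 encoded name, not valid UTF-8

# Strict decoding signals the problem at the point where it occurs.
try:
    original.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Hypothetical escaped form of the same name: ESC + U+00E9 for byte 0xE9.
escaped = "caf\x1b\xe9.txt"

# A component that doesn't know the convention just encodes the string
# as UTF-8, so the bytes it writes no longer match the original name.
naive = escaped.encode("utf-8")
```

Here naive differs from original, and nothing reported an error along
the way.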

-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/
