Re: Coding system robustness?

David Kastrup Sat, 19 Mar 2005 01:14:10 -0800

Kenichi Handa <[EMAIL PROTECTED]> writes:

> In article <[EMAIL PROTECTED]>, Stefan Monnier <[EMAIL PROTECTED]> writes:
>
>>>  I'd like to know whether coding systems in general are supposed to be
>>>  robust, meaning that decoding some random byte string into the coding
>>>  system and reencoding it is guaranteed to deliver the same byte string
>>>  again?
>
>> AFAIK, (encode-coding-string (decode-coding-string STR 'foo) 'foo)
>> should always return STR, otherwise it's a bug.
>> With the introduction of eight-bit-*, this should be true of "all"
>> coding-systems in Emacs-21,
>
> No.  Redundant escape sequences in iso-2022 based coding
> systems are just ignored.  For instance,
>
>   (decode-coding-string "\e(J" 'iso-2022-jp) => ""
>
> And we can't recover "\e(J" on encoding.
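
The identity under discussion can be checked mechanically. What follows is only a minimal sketch: the helper name my-coding-round-trip-p is made up for illustration and is not part of Emacs. Going by the example quoted above, the iso-2022-jp call at the end would be expected to return nil, since the redundant escape is dropped on decoding.

```elisp
(defun my-coding-round-trip-p (bytes coding-system)
  "Return non-nil if BYTES survive decoding and re-encoding unchanged.
BYTES is a unibyte string and CODING-SYSTEM a coding system symbol.
Hypothetical helper, for illustration only."
  (string= bytes
           (encode-coding-string
            (decode-coding-string bytes coding-system)
            coding-system)))

;; The redundant escape sequence from the quoted example.
(my-coding-round-trip-p "\e(J" 'iso-2022-jp)
```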


Ok, making the problem somewhat more confined: if I have a file that
is written _by_ _Emacs_ in some coding system, and then externally I
chop parts of it into pieces (not dropping material) without taking
into account multibyte boundaries, convert these pieces (with
interspersed ASCII) into the original decoding, encode it again to a
unibyte string, properly replace the ASCII-fied pieces with the original
material and decode to the original decoding (phew), I am pretty sure
that I have round-trip behavior, right?
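
A much simplified sketch of that procedure, assuming a single cut and leaving out the ASCII-fied placeholders: split the bytes at an arbitrary position, decode and re-encode each piece in isolation, and compare the rejoined bytes with the original. The function name my-split-round-trip-p is hypothetical. Whether it returns non-nil depends on how the coding system treats a truncated multibyte sequence; the eight-bit-* characters mentioned above are what would have to carry the stray bytes through.

```elisp
(defun my-split-round-trip-p (bytes cut coding-system)
  "Return non-nil if BYTES survive being recoded in two separate pieces.
BYTES is a unibyte string, CUT the index at which it is split, and
CODING-SYSTEM the coding system used for both directions.
Hypothetical helper, for illustration only."
  (let* ((pieces (list (substring bytes 0 cut) (substring bytes cut)))
         ;; Decode each piece on its own, then encode it again.
         (recoded (mapcar (lambda (piece)
                            (encode-coding-string
                             (decode-coding-string piece coding-system)
                             coding-system))
                          pieces)))
    ;; Rejoin the recoded pieces and compare with the original bytes.
    (string= (apply #'concat recoded) bytes)))

;; UTF-8 bytes of "naive" with a diaeresis on the i, cut between the
;; two bytes of that character.
(my-split-round-trip-p "na\303\257ve" 3 'utf-8)
```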

Well, almost.  On escape-based coding systems I don't see in the first
place that one can encode/decode string parts in isolation, so I am
afraid that it is not really feasible to promise anything.  Do the
escapes at least start fresh every line?  I am just being curious
here; there is no actual chance that I am going to support such a
coding system, and I don't see how I sensibly could.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
