Hi :)

On Mon 27 Feb 2017 17:07, Eli Zaretskii <e...@gnu.org> writes:

>> From: Andy Wingo <wi...@pobox.com>
>> Date: Sun, 26 Feb 2017 22:20:31 +0100
>>
>> In Scheme, strings are sequences of characters.  Encoding and decoding
>> is only needed when going to and from bytes.  Guile supports a finite
>> number of encodings, so in general some encoding/decoding will always be
>> needed.  The specific encoding may change over time.
>
> The lesson of Emacs development is that there's a need for
> "characters" that represent raw bytes which cannot be decoded into the
> internal representation, for whatever reasons.  These special
> "characters" need to be representable in strings, among "normal"
> recognizable characters (and thus distinguishable from the latter
> kind), and they need to be converted back to their single-byte form
> when the string is output to the external world.  An implementation of
> text that doesn't include these features will always fail to support
> some important use cases.

Thanks for this note (and upthread).  I didn't know Emacs settled on
this strategy.  It could fit in as a new "conversion strategy" (see
Encoding in the manual).

I think this feature will probably slip for 2.2.0 for lack of time,
though.  When someone does go to look at it, this thread is a useful
resource, or parts of it anyway :)  I especially appreciated the
tradeoffs between surrogates and strange UTF-8 hacks.

Andy
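
For anyone picking this up later, here is a minimal sketch of the
"surrogates" side of that tradeoff.  It is written in Python rather than
Scheme and uses Python's built-in "surrogateescape" error handler, so it
shows neither the Emacs nor any Guile implementation; the helper names
are made up for illustration.  Bytes that cannot be decoded become lone
surrogate code points in the string, stay distinguishable from normal
characters, and turn back into the original single bytes on output.

```python
# Sketch only: Python's "surrogateescape" handler, not Guile or Emacs code.
# The function names below are hypothetical helpers for this example.

def decode_keeping_raw_bytes(data: bytes) -> str:
    """Decode UTF-8; each undecodable byte 0xNN becomes the lone surrogate U+DCNN."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_restoring_raw_bytes(text: str) -> bytes:
    """Encode to UTF-8; each escaped surrogate U+DCNN is written back as byte 0xNN."""
    return text.encode("utf-8", errors="surrogateescape")


def is_escaped_byte(ch: str) -> bool:
    """True for the special "characters" that stand in for raw bytes."""
    return 0xDC80 <= ord(ch) <= 0xDCFF


# Valid UTF-8 ("café") mixed with two bytes that are never valid in UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe ok"

text = decode_keeping_raw_bytes(raw)
escaped = [ch for ch in text if is_escaped_byte(ch)]
round_tripped = encode_restoring_raw_bytes(text)

print(len(text), [hex(ord(ch)) for ch in escaped], round_tripped == raw)
```

The string holds ordinary characters and escaped bytes side by side, and
the round trip is lossless as long as the same handler is used in both
directions.  Encoding such a string with the default strict handler
fails instead, which is one of the costs of the surrogate approach.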