Linas Vepstas <linasveps...@gmail.com> writes:

> On Mon, Jan 30, 2017 at 1:27 PM, David Kastrup <d...@gnu.org> wrote:
>> Marko Rauhamaa <ma...@pacujo.net> writes:
>>> David Kastrup <d...@gnu.org>:
>>>> Marko Rauhamaa <ma...@pacujo.net> writes:
>>>>> Guile's mistake was to move to Unicode strings in the operating system
>>>>> interface.
>>>>
>>>> Emacs uses a UTF-8-based encoding internally [...]
>>>
>>> C uses 8-bit characters. That is a model worth emulating.
>>
>> That's Guile-1.8.  Guile-2 uses either Latin-1 or UCS-4 in its string
>> internals, either Latin-1 or UTF-8 in its string API, and UTF-8 in its
>> string port internals.
>
> Which seems to be a bad decision.  I've got strings, 10 MB long,
> holding Chinese text in UTF-8, and Guile converts these internally to
> UCS-4, which is a complete and total waste of CPU time.  WTF.  It then
> has to convert them back to UTF-8 before passing them to my C++ code
> that actually does stuff with them.

I see this as an interaction problem: Guile 2.0 uses UCS-4 internally,
and your code uses UTF-8.  It could have been the other way around.
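
For concreteness, here is a minimal sketch of that round trip at the C
level, using Guile's public conversion functions scm_from_utf8_stringn
and scm_to_utf8_stringn (the sample string and the surrounding
scaffolding are mine, not from Linas's code):

    #include <libguile.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int
    main (void)
    {
      scm_init_guile ();

      /* Stand-in for a large UTF-8 buffer produced by C++ code.  */
      const char *utf8_in = "\xe4\xbd\xa0\xe5\xa5\xbd";  /* "你好" */

      /* Entering Guile: the bytes are decoded and, since the text is
         not Latin-1, stored as one 32-bit code point per character.  */
      SCM str = scm_from_utf8_stringn (utf8_in, strlen (utf8_in));

      /* Leaving Guile: the wide string is re-encoded as UTF-8 into a
         freshly malloc'd buffer that the caller must free.  */
      size_t len;
      char *utf8_out = scm_to_utf8_stringn (str, &len);

      printf ("%zu bytes round-tripped\n", len);
      free (utf8_out);
      return 0;
    }

Both conversions are full O(n) passes over the text, which is exactly
the cost being objected to above.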

There were discussions about moving to UTF-8 internally in 2.2.  As Mike
explained, that was not really an option in 2.0, mostly due to the
requirement to support O(1) random access to characters.
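
To see why, compare character indexing under the two encodings.  This
toy sketch (function names are mine) shows UTF-32 indexing as a plain
array access versus UTF-8 indexing as a scan from the start:

    #include <stddef.h>
    #include <stdint.h>

    /* UTF-32: character N is one array access -- O(1).  */
    static uint32_t
    ref_utf32 (const uint32_t *s, size_t n)
    {
      return s[n];
    }

    /* UTF-8: only lead bytes (anything but 10xxxxxx) start a
       character, so reaching character N means scanning every byte
       before it -- O(N).  Assumes valid UTF-8 and N in range.  */
    static const char *
    ref_utf8 (const char *s, size_t n)
    {
      while (n > 0)
        {
          s++;
          if (((unsigned char) *s & 0xC0) != 0x80)
            n--;                /* crossed into the next character */
        }
      return s;
    }

Keeping string-ref O(1) on top of a plain UTF-8 representation would
require maintaining some auxiliary index on the side.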

<https://github.com/larcenists/larceny/wiki/StringRepresentations> lists
various options and the tradeoffs involved.
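
As a rough illustration of one point in that design space, here is how
a two-representation scheme like Guile 2.0's can be chosen at string
construction time (a sketch of the idea, not Guile's actual internals):

    #include <stddef.h>
    #include <stdint.h>

    enum rep { NARROW_LATIN1, WIDE_UCS4 };

    /* Scan once when the string is built: if every code point fits in
       a byte, store Latin-1 (1 byte/char); otherwise store UCS-4
       (4 bytes/char).  Either way, string-ref stays O(1).  */
    static enum rep
    pick_rep (const uint32_t *codepoints, size_t n)
    {
      size_t i;
      for (i = 0; i < n; i++)
        if (codepoints[i] > 0xFF)
          return WIDE_UCS4;
      return NARROW_LATIN1;
    }

The price of that choice is the re-encoding at every UTF-8 boundary,
which is what Linas runs into above.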

Ludo’.

