Greetings,
On Mon 06 Sep 2010 18:28, Mike Gran <[email protected]> writes:
> There is a failure case to consider for scm_from_utf8_string. The C
> utf8 string could contain incorrectly encoded data.
scm_to_locale_string has the analogous failure case: the string might
not be encodable in the current locale.
> You could throw the encoding error, or you could replace the
> bad utf8 with U+FFFD or the question mark.
>
> The bytevector's utf8->string always throws encoding-error.
> Maybe that's good enough.
Yeah, maybe so.
> Otherwise, perhaps something like
>
> scm_from_utf8_stringn (str, len, error_or_replace_strategy)
>
> If you didn't mind the overhead of calling the somewhat
> heavyweight scm_{to,from}_stringn, these could be macros
> or inline functions that wrap that.
Ah, I had not seen scm_{to,from}_stringn. Cool! I think
scm_from_utf8_stringn et al. should be proper functions, and their
initial implementations can probably just call scm_{to,from}_stringn.
But we should at least do the straightforward optimization for the
latin1 case.
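To make the error-versus-replace choice discussed above concrete, here
is a minimal standalone sketch in C. It does not use libguile at all:
the names utf8_strategy, decode_one and decode_utf8 are hypothetical,
and emitting one U+FFFD per bad byte is just one possible replacement
policy, not necessarily what Guile does or should do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strategy names; not part of any Guile API. */
enum utf8_strategy { UTF8_ERROR, UTF8_REPLACE };

/* Decode one code point from s, where n >= 1 bytes are available.
   On success store it in *cp and return the bytes consumed (1..4).
   Return 0 if the sequence is malformed or truncated. */
static size_t
decode_one (const unsigned char *s, size_t n, uint32_t *cp)
{
  /* Smallest code point that legitimately needs a given length. */
  static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
  size_t len, i;
  uint32_t c;

  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
  else return 0;                 /* stray continuation or bad lead byte */

  if (n < len) return 0;         /* truncated sequence */
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80) return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }
  if (c < min_for_len[len]) return 0;        /* overlong encoding */
  if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* surrogate */
  if (c > 0x10FFFF) return 0;                /* beyond Unicode range */
  *cp = c;
  return len;
}

/* Decode len bytes of str into out, which holds out_cap code points.
   Returns the number of code points written, or -1 if the input is
   malformed under UTF8_ERROR or if out is too small. */
long
decode_utf8 (const char *str, size_t len, enum utf8_strategy strategy,
             uint32_t *out, size_t out_cap)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0, n = 0;

  while (i < len)
    {
      uint32_t cp;
      size_t used = decode_one (s + i, len - i, &cp);
      if (used == 0)
        {
          if (strategy == UTF8_ERROR)
            return -1;
          cp = 0xFFFD;           /* substitute and skip one bad byte */
          used = 1;
        }
      if (n == out_cap)
        return -1;
      out[n++] = cp;
      i += used;
    }
  return (long) n;
}
```

With the three bytes 'a', 0xFF, 'z' as input, UTF8_ERROR makes
decode_utf8 return -1, while UTF8_REPLACE yields the three code points
U+0061, U+FFFD, U+007A. A real scm_from_utf8_stringn would presumably
signal an encoding error rather than return -1.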
Cheers,
Andy
--
http://wingolog.org/