Hello!

Mike Gran <[email protected]> writes:

> On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
>> You seem to imply that `scm_getc ()' will now return a Unicode
>> codepoint, is that right? What about `scm_c_{read,write} ()', and
>> `scm_{get,put}s ()'?
>>
>
> I vacillate on this, but, I think the most logical approach is to have
> scm_getc return codepoints and to have the rest of those functions
> return strings that could contain wide characters.

Hmm, `scm_c_{read,write} ()' are biased toward binary data, according to
the manual and to their prototype (they take `void *' buffers). So I
would keep them this way.

`scm_puts ()' is more of a concern since it takes a `char *', which the
caller may consider an 8-bit-encoded, null-terminated string. We should
probably deprecate it, and have it treat its argument as an ISO-8859-1
string, transcoding as necessary.
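
To make the transcoding idea concrete, here is a rough sketch of what
widening an ISO-8859-1 `char *' into codepoints could look like.  This
is not libguile code: `latin1_to_codepoints' and its signature are made
up for illustration, and the real thing would presumably write to the
port instead of filling a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libguile: widen a null-terminated
   ISO-8859-1 string into Unicode codepoints.  Each byte maps to the
   codepoint with the same numeric value.  At most CAP codepoints are
   stored in OUT; the return value is the number of codepoints the
   whole string needs, which may exceed CAP.  */
size_t
latin1_to_codepoints (const char *str, uint32_t *out, size_t cap)
{
  size_t n;

  assert (str != NULL);
  assert (out != NULL || cap == 0);

  for (n = 0; str[n] != '\0'; n++)
    if (n < cap)
      /* Go through `unsigned char' so that bytes above 127 do not
         sign-extend on platforms where `char' is signed.  */
      out[n] = (uint32_t) (unsigned char) str[n];

  return n;
}
```

The cast through `unsigned char' is the only subtle part; everything
else follows from codepoints 0 to 255 matching ISO-8859-1.
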
And `scm_gets ()' doesn't exist actually. ;-)

> This is if and only
> if the port has been assigned a character encoding. If it doesn't have
> an associated encoding, ports will be treated as de facto ISO-8859-1,
> where character values between 0 and 255 are stored without any
> interpretation and characters greater than 255 are invalid. (Unicode
> codepoints 0 to 255 are by design the same as ISO-8859-1.)

OK.
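
If I understand correctly, writing a character to a port with no
associated encoding would then boil down to something like the sketch
below.  Again, `latin1_encode' is a hypothetical name and not an
existing libguile function, and how the "invalid" case should be
reported (error, substitution character, etc.) is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libguile: encode CODEPOINT for a
   port that has no associated encoding, i.e., de facto ISO-8859-1.
   Return 0 and store the byte in *BYTE on success; return -1 and
   leave *BYTE untouched when CODEPOINT is greater than 255.  */
int
latin1_encode (uint32_t codepoint, unsigned char *byte)
{
  assert (byte != NULL);

  if (codepoint > 0xFF)
    return -1;                  /* not representable on such a port */

  *byte = (unsigned char) codepoint;   /* stored without interpretation */
  return 0;
}
```
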
Thanks,
Ludo'.