Schanzenbach, Martin wrote on Thu 10-02-2022 at 22:34 [+0000]:
> While I understand the problem GNS defines strings to be UTF-8
> (notwithstanding punycode exceptions).
> You can't have UTF-8 strings with a zero terminator without having it
> mean exactly that: A string termination.
>
> Yes, you can say "but what if it is not a UTF-8 string", but that is
> not really the problem of the GNS spec.
> It normatively defines it as such and the implementation must comply
> (with UTF-8).
> See also https://en.wikipedia.org/wiki/Null-terminated_string section
> in "Character encoding".

I thought that UTF-8 supports encoding \0 (U+0000) characters.
For example, Guile silently encodes \0 and decodes it again:
$ ((@ (rnrs bytevectors) utf8->string) ((@ (rnrs bytevectors) string->utf8)
"foo\x00bar"))
> "foo\x00bar"
and Guile claims it is UTF-8:
Return a newly allocated bytevector that contains the UTF-8, [...]
or UTF-32 [...] encoding of STR. For UTF-16 [...].
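
As a cross-check outside Guile, here is a minimal Python sketch of the same round trip, using only the built-in `str.encode` and `bytes.decode`. The variable names are illustrative, and the `split` line is only an assumption-level imitation of how a reader that treats the first zero byte as a terminator would see the data; it is not taken from GNS or GNUnet code.

```python
# U+0000 is an ordinary code point; UTF-8 encodes it as the single byte 0x00.
text = "foo\x00bar"

encoded = text.encode("utf-8")     # str -> UTF-8 bytes
decoded = encoded.decode("utf-8")  # UTF-8 bytes -> str

print(list(encoded))    # [102, 111, 111, 0, 98, 97, 114]
print(decoded == text)  # True: the NUL survives the round trip

# Imitation of a zero-terminated reader: everything after the first
# 0x00 byte is ignored, so only "foo" is seen.
c_view = encoded.split(b"\x00", 1)[0]
print(c_view)           # b'foo'
```

So the encoding itself round-trips the zero byte; the truncation only appears once the bytes are handed to something that treats 0x00 as an end marker.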
I guess I'll have to submit documentation patches to Guile and perhaps
even the RnRS.
Greetings,
Maxime.
