Dear CHICKEN mailing list,

I encountered a strange issue with string-trim-right and a UTF-8 string:
$ csi -R srfi-13 -p '(string-trim "Zazà")'
Zazà

So far so good!

$ csi -R srfi-13 -p '(string-trim-right "Zazà")'
Zaz�

Oh no, what happened?

$ csi -R utf8 -R srfi-13 -p '(string-trim-right "Zazà")'
Zaz�

Loading utf8 doesn't seem to fix it! But with utf8 loaded, string-length is at least right:

$ csi -R srfi-13 -p '(string-length "Zazà")'
5
$ csi -R utf8 -R srfi-13 -p '(string-length "Zazà")'
4

It took me a while to figure out what was going on. These are the bytes of Zazà:

$ printf 'Zazà' | xxd
00000000: 5a61 7ac3 a0    Zaz..

So it seems that string-trim-right looks only at the last byte, \xa0, which taken on its own is a non-breaking space <https://en.wikipedia.org/wiki/Non-breaking_space>, and drops it. That leaves a dangling \xc3, which is not valid UTF-8. It should be looking at the last UTF-8 code point instead.

I don't know if this is a known bug or if I've come across something undiscovered. I suppose the fix belongs in the utf8 egg.

Thanks!
K.
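
P.S. For anyone who wants to see the effect outside CHICKEN, here is a minimal Python sketch of what I suspect is happening. It is only a model of the behaviour, not the actual srfi-13 or utf8 egg code: the function names and the whitespace set are my own assumptions.

```python
# Assumption: the byte-wise trimmer treats these Latin-1 values as whitespace,
# including 0xA0 (non-breaking space). This set is illustrative only.
LATIN1_WHITESPACE = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0xA0}


def trim_right_bytewise(data: bytes) -> bytes:
    """Hypothetical model of the bug: inspect one byte at a time."""
    end = len(data)
    while end > 0 and data[end - 1] in LATIN1_WHITESPACE:
        end -= 1
    return data[:end]


def trim_right_by_code_point(data: bytes) -> bytes:
    """Hypothetical model of the expected behaviour: decode first, then trim."""
    text = data.decode("utf-8")
    end = len(text)
    while end > 0 and text[end - 1].isspace():
        end -= 1
    return text[:end].encode("utf-8")


raw = "Zaz\u00e0".encode("utf-8")  # the bytes 5a 61 7a c3 a0

# 5 bytes but 4 code points, matching the two string-length results.
print(len(raw), len(raw.decode("utf-8")))

# Byte-wise trimming eats the trailing 0xA0 and leaves a dangling 0xC3.
print(trim_right_bytewise(raw))  # b'Zaz\xc3'

# Trimming by code point leaves the string untouched.
print(trim_right_by_code_point(raw) == raw)  # True
```

Decoding the byte-wise result as UTF-8 with replacement gives "Zaz" followed by U+FFFD, which matches the Zaz� that csi printed.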