Alex Shinn scripsit:

> On Tue, Jul 8, 2014 at 5:58 AM, Mario Domenech Goulart
> <mario.goul...@gmail.com> wrote:
>
> It might help the discussion if we had a list of eggs which
> are known to break on UTF-8 inputs.

Indeed.

> > 1. Have <egg> and <egg>-utf8 variants. Or, more generally, <egg> and
> > <egg>-<encoding> variants. That would turn our coop into a disgusting
> > mess and would be a nightmare to egg authors.

I don't think the extra generality is required. If the egg needs to be
able to correctly handle arbitrary characters, UTF-8 is the appropriate
internal representation. If not, ASCII/Latin-1 is appropriate.

Eggs that do conversion will need to convert between arbitrary encodings
in byte vectors and UTF-8 strings.

So at worst, some eggs might need to be split in two. This is already
the case for SRFI 13 and 14.

> The same approaches also apply to eggs needing the full
> numeric tower, though with UTF-8 there's less chance of
> breakage when mixing eggs which do and don't use the utf8 egg.

I would say that UTF-8 has *more* chance of causing undetected breakage,
because UTF-8 strings have an interpretation as core strings, whereas
bignums, ratnums, compnums etc. don't look like numbers to the core, and
errors will be thrown.

-- 
John Cowan http://www.ccil.org/~cowan co...@ccil.org
Police in many lands are now complaining that local arrestees are insisting
on having their Miranda rights read to them, just like perps in American TV
cop shows. When it's explained to them that they are in a different country,
where those rights do not exist, they become outraged. --Neal Stephenson
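
To make the point about undetected breakage concrete, here is a minimal sketch in Python (not CHICKEN), offered only as an analogy. It assumes Python's `bytes` type as a stand-in for a byte-oriented core string; the sample text and all variable names are invented for the example. Byte-level operations on UTF-8 data run without complaint but report a different length and can cut a character in half, whereas an operation handed a type it does not understand fails at once.

```python
# Analogy only: Python bytes stand in for a byte-oriented core string.
text = "na\u00efve"                 # 5 characters; \u00ef is i-with-diaeresis
raw = text.encode("utf-8")          # 6 bytes: that character takes two bytes

char_len = len(text)                # counts characters
byte_len = len(raw)                 # counts bytes; no error, just a different answer

# Taking the "first three characters" at the byte level splits a character.
prefix = raw[:3]
try:
    prefix.decode("utf-8")
    truncated_ok = True
except UnicodeDecodeError:
    truncated_ok = False            # the damage only surfaces on decoding

# By contrast, an operation given a type it does not recognise fails
# immediately (a loose parallel to core arithmetic rejecting extended numbers).
try:
    len(10 ** 30)
    int_len_failed = False
except TypeError:
    int_len_failed = True
```

The first two operations are the worrying ones: nothing signals that the byte count and the character count disagree until some later step tries to decode the truncated data.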