Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."

Today's Topics:

   1. Re: hGetContents, unicode and linux (Yitzchak Gale)
   2. Re: hGetContents, unicode and linux (Michael Snoyman)

----------------------------------------------------------------------

Message: 1
Date: Sun, 28 Nov 2010 10:35:05 +0200
From: Yitzchak Gale <g...@sefer.org>
Subject: Re: [Haskell-beginners] hGetContents, unicode and linux
To: Michael Snoyman <mich...@snoyman.com>
Cc: beginners@haskell.org
Message-ID: <aanlkti=el7ydzgehkf4_hvt1290jczwqy-5le+y=z...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

I wrote:
>> In any case, you still need to have the correct encoding
>> set on the handles as before.

Michael Snoyman wrote:
> ...it does *not* address invalid byte sequences (AFAIK),
> which can be dealt with using the bytestring/text decoding
> combination.

Well, using the standard interface, you have three choices
on how to handle invalid byte sequences - drop them,
use a replacement character, or throw an exception, with
the third choice being the default. You specify that choice
when you set the encoding. See the documentation for
System.IO for more details.

However, those choices are implemented via GNU iconv,
so on Windows you only have the default behavior.

Also, in certain special situations - like if you need to be able
to specify the replacement character yourself, or if you need
in-band exceptions (e.g. a stream of Either error character),
then the options do seem limited currently.

You might still need to fall back on the old bytestring hack
in those cases. If you find yourself in that situation, it might
be a good idea to push the maintainers of System.IO and
Data.Text to continue to improve support for encodings in the
standard libraries.
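
A minimal sketch of how those three choices might be selected, assuming
that `mkTextEncoding` from System.IO accepts the iconv-style suffixes
`//IGNORE` and `//TRANSLIT` on the platform in use. `readFileWith` is a
hypothetical helper name used only for illustration, not a library
function:

```haskell
import System.IO

-- | Read a whole file using the named encoding (hypothetical helper).
--
-- Encoding names this sketch assumes mkTextEncoding understands:
--   "UTF-8"           raise an exception on invalid bytes (the default)
--   "UTF-8//IGNORE"   drop invalid byte sequences
--   "UTF-8//TRANSLIT" substitute a replacement character
readFileWith :: String -> FilePath -> IO String
readFileWith encName path = do
  enc <- mkTextEncoding encName
  h <- openFile path ReadMode
  hSetEncoding h enc            -- the choice is made when the encoding is set
  s <- hGetContents h
  length s `seq` hClose h       -- force the lazy read before closing the handle
  return s
```

For example, `readFileWith "UTF-8//IGNORE" "input.txt"` would read the
file while dropping invalid sequences, where "input.txt" is a placeholder
path.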

Regards,
Yitz

------------------------------

Message: 2
Date: Sun, 28 Nov 2010 15:53:47 +0200
From: Michael Snoyman <mich...@snoyman.com>
Subject: Re: [Haskell-beginners] hGetContents, unicode and linux
To: g...@sefer.org
Cc: beginners@haskell.org
Message-ID: <aanlktikukqhwydmjlyebud2lhzrsw5m+q18nechbq...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Nov 28, 2010 at 10:35 AM, Yitzchak Gale <g...@sefer.org> wrote:
> I wrote:
>>> In any case, you still need to have the correct encoding
>>> set on the handles as before.
>
> Michael Snoyman wrote:
>> ...it does *not* address invalid byte sequences (AFAIK),
>> which can be dealt with using the bytestring/text decoding
>> combination.
>
> Well, using the standard interface, you have three choices
> on how to handle invalid byte sequences - drop them,
> use a replacement character, or throw an exception, with
> the third choice being the default. You specify that choice
> when you set the encoding. See the documentation for
> System.IO for more details.
>
> However, those choices are implemented via GNU iconv,
> so on Windows you only have the default behavior.
>
> Also, in certain special situations - like if you need to be able
> to specify the replacement character yourself, or if you need
> in-band exceptions (e.g. a stream of Either error character),
> then the options do seem limited currently.
>
> You might still need to fall back on the old bytestring hack
> in those cases. If you find yourself in that situation, it might
> be a good idea to push the maintainers of System.IO and
> Data.Text to continue to improve support for encodings in the
> standard libraries.

I hadn't realized that the standard libraries offered so much
sophistication in their approach to file encodings. I'll have to
look at them more thoroughly.
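
For comparison, the bytestring/text combination discussed in this thread
might look roughly like the following. It is only a sketch, assuming the
`decodeUtf8With`, `decodeUtf8'` and `replace` functions of the text
package; `decodeReplacing` and `decodeChecked` are hypothetical helper
names. Note that `decodeUtf8'` reports failure for the input as a whole,
not per character, so it only approximates the in-band stream of Either
values mentioned above.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With)
import Data.Text.Encoding.Error (UnicodeException, replace)

-- | Decode UTF-8 bytes, substituting a caller-chosen character
-- for invalid input (hypothetical helper).
decodeReplacing :: Char -> B.ByteString -> T.Text
decodeReplacing c = decodeUtf8With (replace c)

-- | Decode UTF-8 bytes, returning the decoding error in-band as a
-- Left value instead of throwing (hypothetical helper).
decodeChecked :: B.ByteString -> Either UnicodeException T.Text
decodeChecked = decodeUtf8'
```

The raw bytes would typically come from `B.readFile`, which reads the
file without applying any handle encoding.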

Michael

------------------------------

End of Beginners Digest, Vol 29, Issue 45
*****************************************