Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  hGetContents, unicode and linux (Yitzchak Gale)
   2. Re:  hGetContents, unicode and linux (Michael Snoyman)


----------------------------------------------------------------------

Message: 1
Date: Sun, 28 Nov 2010 10:35:05 +0200
From: Yitzchak Gale <g...@sefer.org>
Subject: Re: [Haskell-beginners] hGetContents, unicode and linux
To: Michael Snoyman <mich...@snoyman.com>
Cc: beginners@haskell.org
Message-ID:
        <aanlkti=el7ydzgehkf4_hvt1290jczwqy-5le+y=z...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

I wrote:
>> In any case, you still need to have the correct encoding
>> set on the handles as before.

Michael Snoyman wrote:
> ...it does *not* address invalid byte sequences (AFAIK),
> which can be dealt with using the bytestring/text decoding
> combination.

Well, using the standard interface, you have three choices
on how to handle invalid byte sequences - drop them,
use a replacement character, or throw an exception, with
the third choice being the default. You specify that choice
when you set the encoding. See the documentation for
System.IO for more details.
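
For illustration, here is a minimal sketch of selecting that choice when setting the encoding. It assumes the iconv-style suffix notation described in the documentation for mkTextEncoding ("//IGNORE" to drop invalid sequences, "//TRANSLIT" to substitute a replacement character, no suffix to throw an exception). The helper name readFileWith and the file name sample.txt are made up for this example, and which suffixes are honoured may depend on your GHC version and platform.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file using the named encoding.
-- The encoding name may carry an error-handling suffix, for example
--   "UTF-8"           throw an exception on invalid input (the default)
--   "UTF-8//IGNORE"   drop invalid byte sequences
--   "UTF-8//TRANSLIT" substitute a replacement character
readFileWith :: String -> FilePath -> IO String
readFileWith encName path =
  withFile path ReadMode $ \h -> do
    enc <- mkTextEncoding encName
    hSetEncoding h enc
    s <- hGetContents h
    -- hGetContents is lazy, so force the whole string before
    -- withFile closes the handle.
    _ <- evaluate (length s)
    return s

main :: IO ()
main = do
  -- Write three raw bytes: 'a', the invalid UTF-8 byte 0xFF, 'b'.
  out <- openBinaryFile "sample.txt" WriteMode
  hPutStr out ['a', '\xFF', 'b']
  hClose out
  -- Decode the file again, asking for invalid sequences to be dropped.
  readFileWith "UTF-8//IGNORE" "sample.txt" >>= print
```

Passing plain "UTF-8" to the same helper should make the read fail with an exception on this file, which is the default behavior mentioned above.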

However, those choices are implemented via GNU iconv,
so on Windows you only have the default behavior.

Also, in certain special situations - like if you need to be able
to specify the replacement character yourself, or if you need
in-band exceptions (e.g. a stream of Either error character),
then the options do seem limited currently.

You might still need to fall back on the old bytestring hack
in those cases. If you find yourself in that situation, it might
be a good idea to push the maintainers of System.IO and
Data.Text to continue to improve support for encodings in the
standard libraries.
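
As a rough sketch of that fallback, assuming "the bytestring hack" means reading raw bytes and decoding them yourself with the bytestring and text packages (neither is part of base): decodeUtf8' reports a decoding failure in-band as an Either, and decodeUtf8With combined with replace lets you pick the replacement character. The names decodeStrict and decodeReplacing are invented for this example. Note that decodeUtf8' gives one Either for the whole input, not a stream of Either values per character.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With)
import Data.Text.Encoding.Error (UnicodeException, replace)

-- In-band error: a decoding failure is returned as a Left value
-- instead of being thrown as an exception.
decodeStrict :: B.ByteString -> Either UnicodeException T.Text
decodeStrict = decodeUtf8'

-- Caller-chosen replacement character for invalid input.
decodeReplacing :: Char -> B.ByteString -> T.Text
decodeReplacing c = decodeUtf8With (replace c)

main :: IO ()
main = do
  -- 'a', the invalid UTF-8 byte 0xFF, 'b'
  let bytes = B.pack [0x61, 0xFF, 0x62]
  print (decodeStrict bytes)
  print (decodeReplacing '?' bytes)
```

To apply this to a file, read it with B.readFile and pass the result to either function; no handle encoding is involved because the bytes are read undecoded.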

Regards,
Yitz


------------------------------

Message: 2
Date: Sun, 28 Nov 2010 15:53:47 +0200
From: Michael Snoyman <mich...@snoyman.com>
Subject: Re: [Haskell-beginners] hGetContents, unicode and linux
To: g...@sefer.org
Cc: beginners@haskell.org
Message-ID:
        <aanlktikukqhwydmjlyebud2lhzrsw5m+q18nechbq...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Nov 28, 2010 at 10:35 AM, Yitzchak Gale <g...@sefer.org> wrote:
> I wrote:
>>> In any case, you still need to have the correct encoding
>>> set on the handles as before.
>
> Michael Snoyman wrote:
>> ...it does *not* address invalid byte sequences (AFAIK),
>> which can be dealt with using the bytestring/text decoding
>> combination.
>
> Well, using the standard interface, you have three choices
> on how to handle invalid byte sequences - drop them,
> use a replacement character, or throw an exception, with
> the third choice being the default. You specify that choice
> when you set the encoding. See the documentation for
> System.IO for more details.
>
> However, those choices are implemented via GNU iconv,
> so on Windows you only have the default behavior.
>
> Also, in certain special situations - like if you need to be able
> to specify the replacement character yourself, or if you need
> in-band exceptions (e.g. a stream of Either error character),
> then the options do seem limited currently.
>
> You might still need to fall back on the old bytestring hack
> in those cases. If you find yourself in that situation, it might
> be a good idea to push the maintainers of System.IO and
> Data.Text to continue to improve support for encodings in the
> standard libraries.

I hadn't realized that the standard libraries offered so much
sophistication in their approach to file encodings. I'll have to look
at them more thoroughly.

Michael


------------------------------


End of Beginners Digest, Vol 29, Issue 45
*****************************************