Duncan Coutts wrote:
On Tue, 2007-11-27 at 18:38 +0000, Paul Johnson wrote:
Brandon S. Allbery KF8NH wrote:
However, the IO system truncates [characters] to 8 bits.
Should this be considered a bug?
A design problem.
I presume that's because <stdio.h> was defined in the days of
ASCII-only strings, and the functions in System.IO are defined in
terms of <stdio.h>. But does this need to be the case in the future?
When it's phrased as "truncates to 8 bits" it sounds so simple; surely
all we need to do is not truncate to 8 bits, right?
The problem is, what encoding should it pick? UTF-8, UTF-16, UTF-32,
EBCDIC? How would people specify that they really want a binary file?
Whatever we change, it'll break programs that rely on the existing
meanings.
One sensible suggestion many people have made is that H98 file IO should
use the locale encoding and do Unicode/String <-> locale conversion. So
that'd all be text files. Then openBinaryFile would be used for binary
files. Of course then we'd need control over setting the encoding and
what to do on encountering encoding errors.
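For illustration, GHC's System.IO did later gain this kind of control
over handle encodings (hSetEncoding and the utf8 TextEncoding). A
minimal sketch of the suggested design, assuming a modern GHC; the
filename and string are made up:

```haskell
import System.IO

main :: IO ()
main = do
  -- Open a text handle and choose an explicit encoding instead of
  -- relying on the locale default.
  withFile "greeting.txt" WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStrLn h "caf\233"          -- U+00E9, a code point, not a byte
  -- Reading back with the same encoding recovers the original String.
  withFile "greeting.txt" ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetLine h
    print (s == "caf\233")         -- prints True
```

Note that the Char 'é' becomes two bytes on disk under UTF-8, which is
exactly the String <-> locale conversion being proposed.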
Wouldn't it be sensible to stop using the H98 file I/O operations for
binary files altogether? A Char represents a Unicode code point value
and is not the right data type to represent a byte from a binary
stream. Whoever wants binary I/O would use Data.ByteString.* and
Data.Binary.
So you would use System.IO.hPutStr to write a text string, and
Data.ByteString.hPutStr to write a sequence of bytes. A good
implementation of the former could probably be built on top of the
latter.
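The split above can be sketched with the bytestring package (the
filenames are illustrative, not from the original discussion):

```haskell
import System.IO
import qualified Data.ByteString as BS

main :: IO ()
main = do
  -- Text I/O: a String of code points, encoded by the handle.
  writeFile "note.txt" "hello\n"
  -- Binary I/O: a ByteString of raw octets, written verbatim.
  BS.writeFile "blob.bin" (BS.pack [0x00, 0x7F, 0xFF])
  n <- fmap BS.length (BS.readFile "blob.bin")
  print n   -- prints 3: exactly the bytes written, no encoding involved
```

The type distinction does the work: String can only hold code points,
ByteString can only hold octets, so the text/binary choice is made at
compile time rather than by a file mode flag.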
Reinier
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe