simple binary IO proposition.

John Meacham Thu, 31 Aug 2000 20:37:03 -0700
Haskell is a wonderful language and i am tempted to use it everywhere, but i
have had to turn it down for many tasks due to one very simple shortcoming
in the language: there is no way to communicate with the outside world in any
predefined binary format. i am not talking about pickling or persistence or any
of that, just the raw ability to read in binary files in a specified format
without resorting to calling external functions via the FFI or writing code
dependent on unspecified behavior of a compiler (8-bit Chars, for instance).
to remedy this, i propose that this very simple module be added:

module BIO(Byte, hByteFileSize, hByteSeek, hGetByte, hPutByte, 
    hByteGetContents, hByteRead, hByteWrite) where

import IO(Handle, SeekMode)
import Word(Word8) -- this may be defined differently for other compilers..

type Byte = Word8  

hByteFileSize :: Handle -> IO Integer
hByteSeek :: Handle -> SeekMode -> Integer -> IO ()
hGetByte :: Handle -> IO Byte
hPutByte :: Handle -> Byte -> IO ()
hByteGetContents :: Handle -> IO [Byte]
hByteRead :: Handle -> Integer -> IO [Byte]
hByteWrite :: Handle -> [Byte] -> IO ()
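
For illustration, here is a minimal sketch of how these signatures might be
implemented on a present-day GHC. It assumes the bytestring package and the
System.IO module names, neither of which is part of the proposal, and it
assumes the handle was opened in binary mode. The roundTrip helper is
hypothetical and only there to show usage.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO (Handle, IOMode (..), SeekMode, hFileSize, hSeek, withBinaryFile)
import System.IO.Error (eofErrorType, mkIOError)

type Byte = Word8

-- Size of the underlying file in bytes.
hByteFileSize :: Handle -> IO Integer
hByteFileSize = hFileSize

-- Seek to a byte offset; assumes a binary-mode handle.
hByteSeek :: Handle -> SeekMode -> Integer -> IO ()
hByteSeek = hSeek

-- Read one byte, raising an EOF error if none is left.
hGetByte :: Handle -> IO Byte
hGetByte h = do
  bs <- B.hGet h 1
  case B.uncons bs of
    Just (b, _) -> return b
    Nothing -> ioError (mkIOError eofErrorType "hGetByte" (Just h) Nothing)

hPutByte :: Handle -> Byte -> IO ()
hPutByte h b = B.hPut h (B.singleton b)

-- Strict, unlike a lazy hGetContents: reads everything, then closes the handle.
hByteGetContents :: Handle -> IO [Byte]
hByteGetContents h = fmap B.unpack (B.hGetContents h)

-- Read up to n bytes; fewer are returned if EOF is reached first.
hByteRead :: Handle -> Integer -> IO [Byte]
hByteRead h n = fmap B.unpack (B.hGet h (fromInteger n))

hByteWrite :: Handle -> [Byte] -> IO ()
hByteWrite h = B.hPut h . B.pack

-- Hypothetical helper: write bytes to a file, then read them all back.
roundTrip :: FilePath -> [Byte] -> IO [Byte]
roundTrip path bytes = do
  withBinaryFile path WriteMode (\h -> hByteWrite h bytes)
  withBinaryFile path ReadMode $ \h -> do
    size <- hByteFileSize h
    hByteRead h size
```

Going through [Byte] keeps the interface as proposed; a real implementation
would probably want a packed representation for large files.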

i am not saying that we change the Haskell spec or anything, simply that this
module be added to hslibs and be promoted to a 'standard way to be
non-standard' status. too much code out there makes the assumption
that Char == Byte == Octet == 8 bits. we should not be stuck with C's legacy!
defining Byte as a type will help out incredibly. i have seen too many
modules that define type Byte = Char, even the Posix stuff in hslibs does it;
the existence of a 'proper' definition of Byte will at least encourage people
to write more standards-compliant code.

note that this is necessary even to read in text in a format other than your
native one. for instance, i have written programs which needed to deal with both
UTF-8 and Latin-1 text. both are trivially converted to and from byte streams in
Haskell, but there is no way to get a raw byte stream.
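
As a sketch of what "trivially converted" could look like once a [Byte]
stream is available: the function names below are made up for illustration,
and the UTF-8 decoder is deliberately simplified (malformed input becomes
U+FFFD, and overlong forms and surrogates are not rejected).

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

type Byte = Word8

-- Latin-1: every byte is its own code point.
decodeLatin1 :: [Byte] -> String
decodeLatin1 = map (chr . fromIntegral)

-- Simplified UTF-8 decoder; malformed sequences yield U+FFFD.
decodeUtf8 :: [Byte] -> String
decodeUtf8 [] = []
decodeUtf8 (b : bs)
  | b < 0x80 = chr (fromIntegral b) : decodeUtf8 bs
  | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F)
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F)
  | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07)
  | otherwise = '\xFFFD' : decodeUtf8 bs
  where
    isCont c = c .&. 0xC0 == 0x80
    multi n lead =
      let (cont, rest) = splitAt n bs
          -- Fold 6 payload bits from each continuation byte into the code point.
          step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
          code = foldl step (fromIntegral lead) cont :: Int
       in if length cont == n && all isCont cont && code <= 0x10FFFF
            then chr code : decodeUtf8 rest
            else '\xFFFD' : decodeUtf8 bs

-- UTF-8 encoder for code points up to U+10FFFF.
encodeUtf8 :: String -> [Byte]
encodeUtf8 = concatMap (enc . ord)
  where
    enc c
      | c < 0x80 = [fromIntegral c]
      | c < 0x800 = [0xC0 .|. hi 6, cont 0]
      | c < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
      | otherwise = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
      where
        hi s = fromIntegral (c `shiftR` s)
        cont s = 0x80 .|. fromIntegral ((c `shiftR` s) .&. 0x3F)
```

Converting Latin-1 input to UTF-8 output is then just
encodeUtf8 . decodeLatin1 applied to the bytes read from the handle.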

the reason for a separate seek function is that hByteSeek is guaranteed to
seek to the byte number given, independent of the host character format.
mixing h* and hByte* seeking functions should be considered to have undefined
behavior, as it depends on the native character encoding.
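
To see why a byte offset and a character offset can disagree, here is a small
sketch that assumes, purely for illustration, that the native encoding is
UTF-8. The function names are hypothetical.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- In UTF-8 a byte begins a character unless it is a continuation byte (10xxxxxx).
startsChar :: Word8 -> Bool
startsChar b = b .&. 0xC0 /= 0x80

-- Byte offset at which character number k (counting from 0, k >= 0) begins,
-- or Nothing if the stream holds fewer than k + 1 characters.
byteOffsetOfChar :: Int -> [Word8] -> Maybe Int
byteOffsetOfChar k bytes =
  case drop k [i | (i, b) <- zip [0 ..] bytes, startsChar b] of
    (i : _) -> Just i
    [] -> Nothing
```

For the bytes [0x61, 0xC3, 0xA9, 0x62] (an 'a', a two-byte e-acute, then a
'b'), character 2 starts at byte 3, so a seek counted in characters and a
seek counted in bytes land in different places.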

i hope people see the value in doing something like this. it is frustrating to
not be able to write even the most basic of programs without having to resort
to implementation-specific details or overpowered extensions when all i want
is a simple, deterministic, defined interface to the outside world.

PS. if people don't like the name 'Byte', i was also fooling around with
the type 'Octet'.

-- 
--------------------------------------------------------------
John Meacham   http://repetae.net/~john/   [EMAIL PROTECTED]
California Institute of Technology, Alum.
--------------------------------------------------------------