>Could anyone with a serious and/or research application which would
>benefit please send me a few details.
> ... Patrick
Two applications requiring better file handling/persistence:
1/ In Lolita, one of the largest Haskell programs (approx. 60K lines),
a large data structure (the semantic net) is read from a file. For
efficiency this is done in C. Together with the group at Durham we are
parallelising the program, and the file-reading C must be separately
parallelised (made re-entrant).
2/ I'm writing some data-intensive programs and not only do I need to
parse the data, but also the indices over the data (FiniteMaps).
Because read and show are so slow, I'm hand-writing a parser: sigh!
Theology:
I believe that functional languages' inadequate manipulation of files,
or persistent data, is a serious hindrance to their use for real
applications. To test this I'll also ask haskell-users whether they've
encountered it. There's no fundamental problem, it's just a matter of
finding the person-months to engineer a solution. The inadequacies of
current functional file manipulation are as follows:
1/ Using a printable representation implies
o File must be reparsed on input
o data expansion
o loss of type information
o loss of sharing
2/ The file operations are monolithic: there are no facilities for
reading or writing a small part of a file.
3/ Hyperstrict read/write precludes the preservation of
o partially evaluated, and hence any potentially infinite, structures
o data structures containing functions.
4/ If the same file is read more than once and also written, the
file must logically be duplicated to preserve referential
transparency.
5/ There are some type issues, for example the contents of a binary
file must be dynamically typed, in contrast to the static typing
in the rest of the language.
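
To make point 1/ concrete, here is a minimal sketch. The Tree type and
the names shared and printedNodes are invented for illustration; they
are not taken from Lolita or any other program mentioned here. It
builds a structure whose subtrees are shared, prints it with the
derived show, and reads it back: the text has to be reparsed, it is
far larger than the structure that produced it, and the sharing is
gone afterwards.

```haskell
import Data.List (isPrefixOf, tails)

-- Hypothetical example type. Show and Read are derived, so the
-- printable form is whatever the derived instances produce.
data Tree = Leaf | Node Tree Int Tree
  deriving (Show, Read, Eq)

-- Both subtrees of every node are the same value t, so a tree of
-- depth n is built from only n Node applications.
shared :: Int -> Tree
shared 0 = Leaf
shared n = let t = shared (n - 1) in Node t n t

-- Count the Node constructors that appear in a printed tree.
printedNodes :: String -> Int
printedNodes s = length (filter ("Node" `isPrefixOf`) (tails s))

main :: IO ()
main = do
  let t = shared 8
      s = show t
  print (printedNodes s)  -- 255 nodes in the text, built from 8 applications
  print (read s == t)     -- True: equal as a value, but reparsed and unshared
```

The expansion doubles with each extra level of depth, which is why
it hurts for large, heavily shared structures.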
Solutions:
Orthogonal persistence is wonderful, but expensive and awkward in a
file-based environment. I believe we could get a long way with Binary
files, possibly with indexed-sequential access. A proposal on these
lines, together with a critique of existing file handling, is in the
1992 Glasgow FP Workshop.
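
As a rough illustration of what a binary representation buys, here is
a sketch of a byte-level put/get pair. This is NOT the interface from
the workshop proposal: the class name Bin and its methods are
assumptions made up for this example, and it assumes a 64-bit Int. It
also sidesteps the harder issues above: it is hyperstrict, it does not
preserve sharing, and it writes no type tag, so the reader must
already know the type it expects.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical class: put prepends the encoding of a value to a byte
-- list, get decodes a value and returns the bytes left over.
class Bin a where
  put :: a -> [Word8] -> [Word8]
  get :: [Word8] -> (a, [Word8])

-- Fixed-width, big-endian, 8 bytes (assumes Int is 64 bits wide).
instance Bin Int where
  put n rest =
    [fromIntegral ((n `shiftR` s) .&. 0xff) | s <- [56, 48 .. 0]] ++ rest
  get bs =
    let (h, rest) = splitAt 8 bs
    in (foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 h, rest)

-- A list is its length followed by its elements.
instance Bin a => Bin [a] where
  put xs rest = put (length xs) (foldr put rest xs)
  get bs = let (n, bs') = get bs in getN (n :: Int) bs'
    where
      getN 0 r = ([], r)
      getN k r = let (x, r')   = get r
                     (xs, r'') = getN (k - 1) r'
                 in (x : xs, r'')

main :: IO ()
main = do
  let xs         = [3, -1, 70000] :: [Int]
      bytes      = put xs []
      (ys, rest) = get bytes :: ([Int], [Word8])
  print (length bytes)           -- 32: an 8-byte length plus three 8-byte Ints
  print (ys == xs && null rest)  -- True
```

Because every Int has a fixed width, the position of the k-th element
can be computed without decoding the ones before it, which is the kind
of property indexed access to part of a file would rely on.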
For anyone thinking of implementing Binary files, the Parallel Haskell
packing code that Will gave pointers to would be a good starting
place. It'd need to be simplified (it ensures the uniqueness of
closures on a processor). I'd be happy to discuss this with any Brave Soul
who does the implementation. There are papers describing the
packing/unpacking in IFL 1995, and PLDI 1996.
Phil