Con Terzoglou <[EMAIL PROTECTED]> writes:
> Currently I am attempting to convert some of my
> Haskell programs into C programs. I am finding great
> difficulty in this. I was wondering if they were any
> hints or tips that may be of assistance.
Interesting problem and probably very instructive to do. (The OS
kernel hackers that I work with sometimes comment that my C code looks
funny - I like to believe this is because I also write Haskell.)
Assuming you want to do it by hand, there's three main issues to
consider:
1) Lazy vs Strict.
You might also start by translating your programs which exploit
laziness into programs which would work well under strict
evaluation but are still legal Haskell programs. (Some of the Haskell
compilers (not Hugs) used to have flags to make everything strict. Or
you could translate to ML.)
You might try reading "Why Functional Programming Matters"
(http://www.md.chalmers.se/~rjmh/Papers/whyfp.html) which identifies
some key differences between strict and lazy functional programming.
Everyplace where Hughes identifies something good about [lazy]
functional programming, do the opposite.
In some cases, you won't be able to (or won't want to) eliminate
laziness. Your best bet is to read a book or paper about how lazy
evaluation is implemented. The first part of Simon Peyton Jones'
STG paper has a good overview - your goal should be to pick the
one that is simplest to implement/use in C not what is most
efficient (or was most efficient when the paper was written).
http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34
2) Memory allocation/deallocation.
You have to do manual memory management in C.
Choices include:
1) Keep careful track of when you have a pointer to an object and
call free when it's dead. Problem is what to do when an object
has multiple pointers to it. (It's also a little error prone.)
2) Put a reference count in every object, increment the reference
when you duplicate a pointer to an object, decrement the
reference when you discard a pointer. The big problem is that
every object in a cyclic data structure can have a reference
count of 1 even though there are no pointers to the structure
as a whole. (It's also _very_ error-prone. Unlike (1), it is
a nightmare to debug because it's hard to figure out which
part of the system to blame. The Mozilla project uses reference
counting and has developed a cool tool for helping you track
down reference counting errors. The intro is pretty good too.
http://www.mozilla.org/performance/refcnt-balancer.html)
3) Use a conservative garbage collector such as that used in Hugs
or the popular Boehm-Weiser collector. This is easy but will
not teach you an essential skill for C programmers to learn.
A more minor issue is allocation. This is easy enough: for every data
constructor, write an allocation function to build such an object.
The translation from datatypes to C's union-of-structs is quite
straightforward. Incidentally, using an object to represent [] (Nil)
(the Haskell way) instead of using a Null pointer has a beneficial
effect on your code - I recommend you try it.
3) Type systems
Translating ad-hoc polymorphism (ie type classes) can range from
trivial to hard depending on what kind of code you write. At the easy
end, you just figure out what type the function is used at and
call the appropriate method.
At the hard end, you have to duplicate code or tag objects with
enough information about their type that you know how to implement
show, ==, + (or whatever). Various abuses of the C preprocessor
let you semi-automate this process.
You can translate parametric polymorphism to C quite easily by making
sure that most things are pointers and casting.
A useful trick which makes debugging easier though is to put all your
casts in little inline functions like this:
extern inline directory* castFileToDir(file* f) { return (directory*)f; }
The advantages of using this instead of naked casts are:
1) You'll get a type error if the argument to this function has the
wrong type. (This is not true of cast which will cast any
pointer to any other pointer without a murmur.)
2) It's easier to tell what's going on and to grep for uses of a
particular "cast" or, since all such functions begin with the
word "cast", to find all casts in the system.
3) When you run into debugging problems, you can turn off inlining
(read the gcc documentation for why I used "extern inline")
and set a breakpoint on the cast function. Or you can insert
sanity checks on the casts or profiling code or ...
Hope this helps,
--
Alastair Reid [EMAIL PROTECTED] http://www.cs.utah.edu/~reid/