2010/6/22 José Romildo Malaquias <j.romi...@gmail.com> > Hello. > > I have been teaching an introductory course on compiler construction to > our undergraduates students using Appel's "Modern Compiler > Implementation in Java". There are also versions of the book in ML and > C. The books explain how to write a compiler for the Tiger programming > language. > > Now I want to implement a Tiger compiler in Haskell. > > The lexer and parser (built with the help of alex and happy) and the > type checker are already working. > > Next step is implementing a variable escape analysis phase, needed for > the intermediate code generator. The objective of this phase is to find > out if each variable escapes or not. In general an escaping variable > should be allocated in the stack frame (in memory). Non escaping > variables may be allocated in the stack frame or in a register. > > Generally, a variable escapes if it is passed by reference, its address > is taken (using C's & operator, for instance), or it is accessed from a > nested function. Only the last is possible with Tiger. > > The approach adopted by Appel in his books is easy: a muttable field in > the abstract syntax of variable declarations, of for expressions, and of > function formal parameters, which introduce new variables, is used to > collect the escaping information of the variable. In the ML version of > the book this is a reference to boolean (bool ref). Initially, in a > conservative approach, the reference is initialized to true. > > In the variable escaping analysis phase of the compiler, a function > findEscape looks for escaping variables and record this information in > the escape fields of the abstract syntax. To do this the entire abstract > syntax tree is traversed looking for escaping uses of every variable, > and, when found, the field is set to true. > > findEscape uses a symbol table to accomplish its work, binding variable > names to a reference to boolean (the same reference used in the abstract > syntax tree). When processing a variable declaraction, for instance, it > inserts the variable name into the symbol table, binding it to its > escaping field. When processing an expression which is a simple > variable, if the variable occurs in a nested function, its binding in > the symbol table is set to true. This reflects directly in the abstract > syntax tree of the variable declaration, as the escape field in the > variable declaration and the binding in the symbol table are the same > reference to bool. > > I am look for good ways to implement the variable escaping analysis > phase in Haskell, which is a pure language. I see two alternatives: > > 1) adopt the same approach as Appel used in his ML version of the > compiler: use a mutable reference in the IO monad (Data.IORef) to > hold the variable escaping information, and write everything inside > the IO monad > > 2) build a new abstract syntax tree with updated information regarding > variable escaping > > The second option is more elegant in my point of view, but would be much > less efficient than the first one. >
In Haskell you will get a lot of sharing between the original AST and the updated AST. This greatly reduces the overhead of #2. Have you read Chris Okasaki's Purely Functional Datastructures book? My other bit of advice would be to code it in the purely functional way and then do some benchmarking / profiling to see that it does have sufficiently good performance characteristics. If you find that it's getting awkward to manage the purely functional code then refactor that code into a monad that hides the tedious bits. I had some other comments to make but others here have already covered it :) (Such as using Writer or State or ST instead of IO, etc). Jason
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe