Hi Tom, I've been working on another code-generating graphics compiler, generating GPU code. As always, I run into the problem of efficient common subexpression elimination. In Pan, Vertigo & Pajama, I used lazy memoization, using stable pointers and weak references, to avoid the worst-case-exponential behavior you mention below. I'm now using a bottom-up CSE method that's slower and more complicated than I'm going for.
What's your latest wisdom about CSE in DSELs? Thanks, - Conal On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins <tomahawk...@gmail.com> wrote: > I've been programming with Haskell for a few years and love it. One > of my favorite applications of Haskell is using for domain specific > languages. However, after designing a handful of DSLs, I continue to > hit what appears to be a fundamental hurdle -- or at least I have yet > to find an adequate solution. > > My DSLs invariably define a datatype to capture expressions; something > like this: > > data Expression > = Add Expression Expression > | Sub Expression Expression > | Variable String > | Constant Int > deriving Eq > > Using the datatype Expression, it is easy to mass a collections of > functions to help assemble complex expressions, which leads to very > concise programs in the DSL. > > The problem comes when I want to generate efficient code from an > Expression (ie. to C or some other target language). The method I use > invovles converting the tree of subexpressions into an acyclic graphic > to eliminate common subexpressions. The nodes are then topologically > ordered and assigned an instruction, or statement for each node. For > example: > > let a = Add (Constant 10) (Variable "i1") > b = Sub (Variable "i2") (Constant 2) > c = Add a b > > would compile to a C program that may look like this: > > a = 10 + i1; > b = i2 - 2; > c = a + b; > > The process of converting an expression tree to a graph uses either Eq > or Ord (either derived or a custom instance) to search and build a set > of unique nodes to be ordered for execution. In this case "a", then > "b", then "c". The problem is expressions often have shared, > equivalent subnodes, which dramatically grows the size of the tree. > For example: > > let d = Add c c > e = Add d d -- "e" now as 16 leaf nodes. > > As these trees grow in size, the equality comparison in graph > construction quickly becomes the bottleneck for DSL compilation. > What's worse, the phase transition from tractable to intractable is > very sharp. In one of my DSL programs, I made a seemingly small > change, and compilation time went from milliseconds to > not-in-a-million-years. > > Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this > problem in OCaml because each "let" expression was mutable, and I > could use the physical equality operator to perform fast comparisons. > Unfortunately, I have grown to love Haskell's type system and its lack > of side effects, and could never go back. > > Is there anything that can be done to dramatically speed up > comparisons, or is there a better approach I can take to extract > common subexpressions? I should point out I have an opportunity to > get Haskell on a real industrial application. But if I can't solve > this problem, I may have to resort to far less eloquent languages. > :-( > > Thanks for any and all help. > > -Tom > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe >
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe