Hi

I'm learning Haskell through writing a compiler. I'm seeing huge memory use in a function which converts the dataflow graph to the form required by Data.Graph. It needs to return a map from dataflow nodes to Vertexs, a map in the other direction, and the list of edges (as Vertex pairs).

        total time  =       18.78 secs   (939 ticks @ 20 ms)
        total alloc = 6,616,297,296 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc

j                              Dataflow              87.6   99.9
graphvizCompile                Dataflow               6.4    0.0
f                              Dataflow               2.9    0.0
con2tag_DFReal#                Representation         2.1    0.0

I assume the allocation is being garbage-collected pretty quickly, because a) 6,616,297,296 bytes is stupid (!) and b) Process Explorer informs me that the peak private bytes of the program is not more than a couple of MB.

The offending code looks like this:

-- (next Vertex available for use, edges so far, map so far, map so far)
type DependencyEdgesAcc = (Vertex, [Edge], Map.Map DFNode Vertex, Map.Map 
Vertex DFNode)

-- Process a list of DFNodes in the same context.
dependencyEdges :: DependencyEdgesAcc -> [DFNode] -> DependencyEdgesAcc
dependencyEdges acc [] = {-# SCC "a" #-} acc
dependencyEdges acc@(_, _, mnv1, _) (n:ns) =
  -- Check whether this node has been done before.
  {-# SCC "j" #-} case {-# SCC "i" #-} Map.lookup n mnv1 of
    Just _ -> {-# SCC "b" #-} dependencyEdges acc ns
    Nothing ->
      -- Find the list of dependencies.
      let ndeps = {-# SCC "c" #-} nodeDependencies n in
      -- Process them first.
      let (v2, es2, mnv2, mvn2) = {-# SCC "d" #-} dependencyEdges acc ndeps in
      -- Now we can claim v2 as the label for this node n.
      -- Look up vdep for each dependency ndep and add an edge from the vdep to 
v2.
      let {(v4, es4, mnv4, mvn4) = {-# SCC "e" #-} List.foldl' (
          \(v3, es3, mnv3, mvn3) ndep ->
            case {-# SCC "f" #-} Map.lookup ndep mnv3 of
              Just vdep -> (v3, (vdep, v2):es3, mnv3, mvn3)
              Nothing -> error $ "this node should have been added already: " 
++ show ndep
          ) (v2, es2, mnv2, mvn2) ndeps } in
      assert (v2 == v4)
      assert (mvn2 == mvn4)
      assert (mnv2 == mnv4)
      -- Finally, add this node to the accumulator and then recurse.
      dependencyEdges (v2+1, es4, {-# SCC "g" #-} Map.insert n v2 mnv2, {-# SCC 
"h" #-} Map.insert v2 n mvn2) ns

with the following profile:

    dependencyGraph      Dataflow                                            
4395           1   0.0    0.0    93.4   99.9
     getRootDFNodes      Dataflow                                            
4397           5   0.0    0.0     0.0    0.0
     dependencyEdges     Dataflow                                            
4396         282   0.0    0.0    93.4   99.9
      a                  Dataflow                                            
4407         104   0.0    0.0     0.0    0.0
      j                  Dataflow                                            
4399         178  87.6   99.9    93.4   99.9
       b                 Dataflow                                            
4417          75   0.0    0.0     0.0    0.0
       g                 Dataflow                                            
4405         103   0.1    0.0     0.2    0.0
        con2tag_DFBool#  Representation                                      
4419        1972   0.0    0.0     0.0    0.0
        con2tag_DFReal#  Representation                                      
4414      186890   0.1    0.0     0.1    0.0
        con2tag_DFNode#  Representation                                      
4413        1204   0.0    0.0     0.0    0.0
       h                 Dataflow                                            
4404         103   0.0    0.0     0.0    0.0
       e                 Dataflow                                            
4403         103   0.0    0.0     4.8    0.0
        f                Dataflow                                            
4412         174   2.9    0.0     4.8    0.0
         con2tag_DFBool# Representation                                      
4420      245794   0.0    0.0     0.0    0.0
         con2tag_DFReal# Representation                                      
4416    25772482   1.9    0.0     1.9    0.0
         con2tag_DFNode# Representation                                      
4415        2068   0.0    0.0     0.0    0.0
       d                 Dataflow                                            
4402         103   0.0    0.0     0.0    0.0
       c                 Dataflow                                            
4401         103   0.0    0.0     0.0    0.0
        nodeDependencies Dataflow                                            
4406         103   0.0    0.0     0.0    0.0
       i                 Dataflow                                            
4400         178   0.6    0.0     0.7    0.0
        con2tag_DFBool#  Representation                                      
4418       48230   0.0    0.0     0.0    0.0
        con2tag_DFReal#  Representation                                      
4410     5045646   0.1    0.0     0.1    0.0
        con2tag_DFNode#  Representation                                      
4409        1928   0.0    0.0     0.0    0.0

Is this a space leak, or is my algorithm just silly? (It could well be the latter...)

I've tried adding random bangs, to no effect. So, can anyone help me? :)

Cheers
Ben
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to