Hi
I'm learning Haskell through writing a compiler. I'm seeing huge memory
use in a function which converts the dataflow graph to the form required
by Data.Graph. It needs to return a map from dataflow nodes to Vertexs,
a map in the other direction, and the list of edges (as Vertex pairs).
total time = 18.78 secs (939 ticks @ 20 ms)
total alloc = 6,616,297,296 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
j Dataflow 87.6 99.9
graphvizCompile Dataflow 6.4 0.0
f Dataflow 2.9 0.0
con2tag_DFReal# Representation 2.1 0.0
I assume the allocation is being garbage-collected pretty quickly,
because a) 6,616,297,296 bytes is stupid (!) and b) Process Explorer
informs me that the peak private bytes of the program is not more than a
couple of MB.
The offending code looks like this:
-- (next Vertex available for use, edges so far, map so far, map so far)
type DependencyEdgesAcc = (Vertex, [Edge], Map.Map DFNode Vertex, Map.Map
Vertex DFNode)
-- Process a list of DFNodes in the same context.
dependencyEdges :: DependencyEdgesAcc -> [DFNode] -> DependencyEdgesAcc
dependencyEdges acc [] = {-# SCC "a" #-} acc
dependencyEdges acc@(_, _, mnv1, _) (n:ns) =
-- Check whether this node has been done before.
{-# SCC "j" #-} case {-# SCC "i" #-} Map.lookup n mnv1 of
Just _ -> {-# SCC "b" #-} dependencyEdges acc ns
Nothing ->
-- Find the list of dependencies.
let ndeps = {-# SCC "c" #-} nodeDependencies n in
-- Process them first.
let (v2, es2, mnv2, mvn2) = {-# SCC "d" #-} dependencyEdges acc ndeps in
-- Now we can claim v2 as the label for this node n.
-- Look up vdep for each dependency ndep and add an edge from the vdep to
v2.
let {(v4, es4, mnv4, mvn4) = {-# SCC "e" #-} List.foldl' (
\(v3, es3, mnv3, mvn3) ndep ->
case {-# SCC "f" #-} Map.lookup ndep mnv3 of
Just vdep -> (v3, (vdep, v2):es3, mnv3, mvn3)
Nothing -> error $ "this node should have been added already: "
++ show ndep
) (v2, es2, mnv2, mvn2) ndeps } in
assert (v2 == v4)
assert (mvn2 == mvn4)
assert (mnv2 == mnv4)
-- Finally, add this node to the accumulator and then recurse.
dependencyEdges (v2+1, es4, {-# SCC "g" #-} Map.insert n v2 mnv2, {-# SCC
"h" #-} Map.insert v2 n mvn2) ns
with the following profile:
dependencyGraph Dataflow
4395 1 0.0 0.0 93.4 99.9
getRootDFNodes Dataflow
4397 5 0.0 0.0 0.0 0.0
dependencyEdges Dataflow
4396 282 0.0 0.0 93.4 99.9
a Dataflow
4407 104 0.0 0.0 0.0 0.0
j Dataflow
4399 178 87.6 99.9 93.4 99.9
b Dataflow
4417 75 0.0 0.0 0.0 0.0
g Dataflow
4405 103 0.1 0.0 0.2 0.0
con2tag_DFBool# Representation
4419 1972 0.0 0.0 0.0 0.0
con2tag_DFReal# Representation
4414 186890 0.1 0.0 0.1 0.0
con2tag_DFNode# Representation
4413 1204 0.0 0.0 0.0 0.0
h Dataflow
4404 103 0.0 0.0 0.0 0.0
e Dataflow
4403 103 0.0 0.0 4.8 0.0
f Dataflow
4412 174 2.9 0.0 4.8 0.0
con2tag_DFBool# Representation
4420 245794 0.0 0.0 0.0 0.0
con2tag_DFReal# Representation
4416 25772482 1.9 0.0 1.9 0.0
con2tag_DFNode# Representation
4415 2068 0.0 0.0 0.0 0.0
d Dataflow
4402 103 0.0 0.0 0.0 0.0
c Dataflow
4401 103 0.0 0.0 0.0 0.0
nodeDependencies Dataflow
4406 103 0.0 0.0 0.0 0.0
i Dataflow
4400 178 0.6 0.0 0.7 0.0
con2tag_DFBool# Representation
4418 48230 0.0 0.0 0.0 0.0
con2tag_DFReal# Representation
4410 5045646 0.1 0.0 0.1 0.0
con2tag_DFNode# Representation
4409 1928 0.0 0.0 0.0 0.0
Is this a space leak, or is my algorithm just silly? (It could well be
the latter...)
I've tried adding random bangs, to no effect. So, can anyone help me? :)
Cheers
Ben
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe