On 22/11/2011 19:12, David Terei wrote:
What change was made in HEAD compared to 7.2.1 to give the compile
time performance improvement for -fasm? It causes a huge slowdown for
-fllvm.
ghc-7.2.1 -O0 -fllvm T3016.hs => 2:17.47 s
ghc-head -O0 -fllvm T3016.hs => 10:34.84 s
This module has a lot of large Integer literals, and 7.4.1 has a new
internal representation for these which is much more compact. In the
backend a large integer literal turns into something like
x = mkInteger [I# xxx, I# yyy, I# zzz]
which will turn into
l1 = I# xxx
l2 = I# yyy
l3 = I# zzz
c1 = (:) l1 c2
c2 = (:) l2 c3
c3 = (:) l3 []
x = mkInteger c1
So nothing unusual, just lots of static data. I'm not sure why LLVM
should be taking so long - what does the generated LLVM code look like?
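For illustration, here is a minimal Haskell sketch of the encoding described above. It is not the actual GHC code. It assumes the literal is split into 31-bit chunks, least significant first, and it leaves out the sign flag that (as far as I know) the real mkInteger also takes. The names chunkBits, toChunks, fromChunks and staticClosures are made up for this example.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))

-- Assumption for this sketch: 31 bits per chunk.
chunkBits :: Int
chunkBits = 31

-- Hypothetical helper: split a non-negative Integer into chunks,
-- least significant chunk first. Negative inputs are not handled
-- and simply yield the empty list.
toChunks :: Integer -> [Int]
toChunks n
  | n <= 0    = []
  | otherwise = fromInteger (n .&. mask) : toChunks (n `shiftR` chunkBits)
  where
    mask = (1 `shiftL` chunkBits) - 1

-- Hypothetical inverse: rebuild the Integer from its chunks.
fromChunks :: [Int] -> Integer
fromChunks = foldr (\c acc -> toInteger c .|. (acc `shiftL` chunkBits)) 0

-- Static closures for one literal laid out as in the listing above:
-- one boxed Int and one cons cell per chunk, plus the closure for x itself.
staticClosures :: Integer -> Int
staticClosures n = 2 * length (toChunks n) + 1
```

Under these assumptions, toChunks (2^31) gives [0,1], and a three-chunk literal such as 2^62 accounts for 7 static closures, matching l1..l3, c1..c3 and x above. A module full of such literals therefore produces static data roughly in proportion to the total number of chunks.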
Cheers,
Simon
Cheers,
David
On 22 November 2011 10:41, David Terei <[email protected]> wrote:
Sorry, numbers I reported were with GHC 7.2.1, HEAD is indeed much quicker.
On 22 November 2011 01:18, Simon Marlow <[email protected]> wrote:
On 22/11/2011 07:36, David Terei wrote:
Argh! I spent the weekend fixing LLVM bugs and managed to close all
but one! I was so close to freedom.
Yep, can confirm. -fasm -O0 takes 30 seconds on my machine; -fllvm -O0
takes 2 min. 1 min is spent in the code generator (so either bad performance
in the LLVM code generator or in the LLVM mangler) and 1 min in the LLVM
tools (mostly llc), so probably nothing we can do there. Haven't tried
-O1 yet for -fllvm, but -O1 -fasm takes 8 min on my machine.
Can you recheck that? -O -fasm takes 8 *seconds* here:
'/64playpen/simonmar/nightly/HEAD-cam-04-unx/x86_64-unknown-linux/inplace/bin/ghc-stage2'
-fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output
-no-user-package-conf -rtsopts -fno-ghci-history -c T3016.hs -O -fasm +RTS -s
   3,104,745,616 bytes allocated in the heap
     829,479,400 bytes copied during GC
      79,029,208 bytes maximum residency (20 sample(s))
         842,880 bytes maximum slop
             218 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5969 colls,     0 par    1.51s    1.51s     0.0003s    0.0097s
  Gen  1        20 colls,     0 par    1.86s    1.86s     0.0932s    0.4177s

  Parallel GC work balance: -nan (0 / 0, ideal 1)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    (  7.96s)       0.00s    (  0.00s)
  Task  1 (worker) :    0.00s    (  7.99s)       0.00s    (  0.00s)
  Task  2 (bound)  :    3.82s    (  4.61s)       3.37s    (  3.38s)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    3.82s  (  4.61s elapsed)
  GC      time    3.37s  (  3.38s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    7.20s  (  7.99s elapsed)
Cheers,
Simon
Created ticket to track: http://hackage.haskell.org/trac/ghc/ticket/5652
Cheers,
David
On 21 November 2011 07:02, Simon Marlow <[email protected]> wrote:
David,
It looks like T3016 is running out of time with -fllvm:
=====> T3016(optllvm) 763 of 3159 [0, 1, 0]
cd ./simplCore/should_compile &&
'/64playpen/simonmar/nightly/HEAD-cam-04-unx/x86_64-unknown-linux/inplace/bin/ghc-stage2'
-fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output
-no-user-package-conf -rtsopts -fno-ghci-history -c T3016.hs -O -fllvm
>T3016.comp.stderr 2>&1
Timeout happened...killing process...
Compile failed (status 25344) errors were:
*** unexpected failure for T3016(optllvm)
Maybe it's hitting a bad case in the LLVM generator? This module is just
full of large integer constants, which ends up as a lot of top-level
static data.
Cheers,
Simon
_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc