#5996: fix CSE -------------------------------------+-------------------------------------- Reporter: michalt | Owner: Type: bug | Status: new Priority: normal | Component: Compiler Version: 7.5 | Keywords: Os: Unknown/Multiple | Architecture: Unknown/Multiple Failure: Runtime performance bug | Testcase: Blockedby: | Blocking: Related: | -------------------------------------+-------------------------------------- The current version of CSE is slightly broken -- contrary to the comments explaining it, it does not add an additional mapping when it substitutes the RHS of some binding. Taking the example from ```CSE.lhs```
{{{ module Test2 where data C a b = C a b a = undefined {-# NOINLINE a #-} b = undefined {-# NOINLINE b #-} x1 = C a b x2 = C x1 b y1 = C a b y2 = C y1 b }}} We want to detect that ```y1``` is the same as ```x1``` and so rewrite it to ```y1 = x1```. But at this point we also want to add a new substitution that changes ```y1``` to ```x1```. So that we can get ```y2 = C x1 b``` and then ```y2 = x2```. This is '''not''' done by the current code: {{{ > ghc -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts [1 of 1] Compiling Test2 ( Test2.hs, Test2.o ) ==================== Tidy Core ==================== Result size = 40 b = undefined a = b x1 = \\ @ a_aaf @ b_aag -> C (a) (b) x2 = \\ @ b_aao @ a_aap @ b1_aaq -> C (x1) (b) y1 = x1 y2 = \\ @ b_aaG @ a_aaH @ b1_aaI -> C (x1) (b) }}} I wrote a patch to fix that and we get the desired result: {{{ > ~/dev/ghc-work/inplace/bin/ghc-stage2 -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts [1 of 1] Compiling Test2 ( Test2.hs, Test2.o ) ==================== Tidy Core ==================== Result size = 30 b = undefined a = b x1 = \\ @ a_aal @ b_aam -> C (b) (b) x2 = \\ @ b_aau @ a_aav @ b1_aaw -> C (x1) (b) y1 = x1 y2 = x2 }}} The fix seems quite easy but it made nofib unhappy -- see nofib1 attachment. Apparently there is quite a bit additional alloctation happening in a few benchmarks. I managed to narrow it down to the ```GHC.IO.Encoding.*``` modules. Adding a simple ```GHC_OPTIONS -fno-cse``` seems to improve the performance quite a bit (and above the current HEAD!) -- see nofib2 attachment. There seems to be a bit more code bloat, but the performance looks worth it. I haven't yet looked into exactly what causes the excessive allocation with CSE in the ```GHC.IO.Encoding.*``` modules (and I'm also a bit surprised by that -- I thought the main issue with CSE would be bigger memory usage). So any suggestions are more than welcome. ;) All the patches are in https://github.com/michalt/ghc (branch "cse") and https://github.com/michalt/packages-base (branch "cse"). -- Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5996> GHC <http://www.haskell.org/ghc/> The Glasgow Haskell Compiler _______________________________________________ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs