#5996: fix CSE
-------------------------------------+--------------------------------------
 Reporter:  michalt                  |          Owner:                  
     Type:  bug                      |         Status:  new             
 Priority:  normal                   |      Component:  Compiler        
  Version:  7.5                      |       Keywords:                  
       Os:  Unknown/Multiple         |   Architecture:  Unknown/Multiple
  Failure:  Runtime performance bug  |       Testcase:                  
Blockedby:                           |       Blocking:                  
  Related:                           |  
-------------------------------------+--------------------------------------
 The current version of CSE is slightly broken -- contrary to the comments
 explaining it, it does not add an additional mapping when it substitutes
 the RHS
 of some binding. Taking the example from ```CSE.lhs```

 {{{
 module Test2 where

 data C a b = C a b
 a = undefined
 {-# NOINLINE a #-}
 b = undefined
 {-# NOINLINE b #-}

 x1 = C a b
 x2 = C x1 b
 y1 = C a b
 y2 = C y1 b
 }}}

 We want to detect that ```y1``` is the same as ```x1``` and so rewrite it
 to
 ```y1 = x1```. But at this point we also want to add a new substitution
 that
 changes ```y1``` to ```x1```. So that we can get ```y2 = C x1 b``` and
 then
 ```y2 = x2```. This is '''not''' done by the current code:
 {{{
 > ghc -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts
 [1 of 1] Compiling Test2            ( Test2.hs, Test2.o )

 ==================== Tidy Core ====================
 Result size = 40

 b = undefined
 a = b
 x1 = \\ @ a_aaf @ b_aag -> C (a) (b)
 x2 = \\ @ b_aao @ a_aap @ b1_aaq -> C (x1) (b)
 y1 = x1
 y2 = \\ @ b_aaG @ a_aaH @ b1_aaI -> C (x1) (b)
 }}}

 I wrote a patch to fix that and we get the desired result:
 {{{

 > ~/dev/ghc-work/inplace/bin/ghc-stage2 -O2 -fforce-recomp Test2.hs
 -ddump-simpl -dsuppress-all -rtsopts
 [1 of 1] Compiling Test2            ( Test2.hs, Test2.o )

 ==================== Tidy Core ====================
   Result size = 30

 b = undefined
 a = b
 x1 = \\ @ a_aal @ b_aam -> C (b) (b)
 x2 = \\ @ b_aau @ a_aav @ b1_aaw -> C (x1) (b)
 y1 = x1
 y2 = x2
 }}}

 The fix seems quite easy but it made nofib unhappy -- see nofib1
 attachment. Apparently there is quite a bit additional alloctation
 happening in
 a few benchmarks. I managed to narrow it down to the
 ```GHC.IO.Encoding.*```
 modules. Adding a simple ```GHC_OPTIONS -fno-cse``` seems to improve the
 performance quite a bit (and above the current HEAD!) -- see nofib2
 attachment.
 There seems to be a bit more code bloat, but the performance looks worth
 it. I
 haven't yet looked into exactly what causes the excessive allocation with
 CSE in
 the ```GHC.IO.Encoding.*``` modules (and I'm also a bit surprised by that
 -- I
 thought the main issue with CSE would be bigger memory usage). So any
 suggestions are more than welcome. ;)

 All the patches are in https://github.com/michalt/ghc (branch "cse") and
 https://github.com/michalt/packages-base (branch "cse").

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5996>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to