#4470: Loop optimization: identical counters
---------------------------------+------------------------------------------
    Reporter:  choenerzs         |       Owner:                   
        Type:  feature request   |      Status:  new              
    Priority:  normal            |   Component:  Compiler         
     Version:                    |    Keywords:  loop optimization
    Testcase:                    |   Blockedby:                   
          Os:  Unknown/Multiple  |    Blocking:                   
Architecture:  Unknown/Multiple  |     Failure:  None/Unknown     
---------------------------------+------------------------------------------
 Consider the small program below, where 'f' has to counters 'i' and 'j'.
 Both are completely identical; the only difference is that 'i' is used to
 change 's', while 'j' changes 'm'. It would be beneficial to have GHC
 transform 'f' into something like 'ff' so that one register less is
 required.

 Neither GHC nor LLVM perform this optimization.

 Code of this kind occurs when one uses the "vector library". See this
 discussion: [http://www.haskell.org/pipermail/glasgow-haskell-
 users/2010-November/019446.html]

 {{{
 {-# LANGUAGE BangPatterns #-}

 module Main where

 import Criterion.Main

 f :: Int -> Int -> Int -> Int -> Int
 f !i !j !s !m
   | i == 0    = s+m
   | otherwise = f (i-1) (j-1) (s + i+1) (m + j*5)

 g :: Int -> Int
 g !k = f k k 0 0


 ff :: Int -> Int -> Int -> Int
 ff !i !s !m
   | i == 0    = s+m
   | otherwise = ff (i-1) (s + i+1) (m + i*5)

 gg :: Int -> Int
 gg !k = ff k 0 0



 {-
 main = do
   print $ g 20
   print $ gg 20
 -}

 main = defaultMain
   [ bench " g" $ whnf g  20 -- 67.9ns
   , bench "gg" $ whnf gg 20 -- 64.5ns
   ]
 }}}

 Function 'f' produces this core:
 {{{
 $wf =
   \ (ww_s1uU :: Int#)
     (ww1_s1uY :: Int#)
     (ww2_s1v2 :: Int#)
     (ww3_s1v6 :: Int#) ->
     case ww_s1uU of wild_B1 {
       __DEFAULT ->
         $wf
           (-# wild_B1 1)
           (-# ww1_s1uY 1)
           (+# (+# ww2_s1v2 wild_B1) 1)
           (+# ww3_s1v6 (*# ww1_s1uY 5));
       0 -> +# ww2_s1v2 ww3_s1v6
     }
 }}}

 'wild_B1' and 'ww1_s1uY' should be merged in this case.

 The attached source is above program.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4470>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to