#4978: Continuation passing style loop doesn't compile into a loop
---------------------------------+------------------------------------------
    Reporter:  tibbe             |        Owner:              
        Type:  bug               |       Status:  new         
    Priority:  normal            |    Milestone:  7.4.1       
   Component:  Compiler          |      Version:  7.0.1       
    Keywords:                    |     Testcase:              
   Blockedby:                    |   Difficulty:              
          Os:  Unknown/Multiple  |     Blocking:              
Architecture:  Unknown/Multiple  |      Failure:  None/Unknown
---------------------------------+------------------------------------------

Comment(by tibbe):

 With $weta inlined the code looks good enough; There's no longer any
 allocation in the common case i.e. when there's still space in the write
 buffer. Hopefully that also means there isn't a heap check in the common
 path.

 It seems to me that at least some primops should look really cheap, like
 memory writes and reads, which only require a few instruction. Looking at
 `primOpIsCheap` (which simply calls `primOpOkForSpeculation`) we don't
 consider `writeWord8OffAddr#` cheap as it has side effects.

 If `primOpIsCheap` would pattern match on the actual primop and return
 `True` or `False` on a case-by-case basis, we could make
 `writeWord8OffAddr#` look cheap. Shouldn't calls to out-of-line primops
 (like `newPinnedByteArray#`) also look cheap, as the primop itself will
 never be inlined?

 The ''optimal'' Core for `$wa` would look something like:

 {{{
 $wa =
   \ (xs :: [Word8])
     (k :: Buffer -> [ByteString])
     (addr :: Addr#)
     (fp :: ForeignPtrContents)
     (o :: Int#)
     (u :: Int#)
     (l :: Int#) ->
     case xs of ys {
       [] -> k (Buffer addr fp o u l);
       : x xs1_awV ->
         case <=# 1 l of _ {
           False ->
             case u of wild2_X18 {
               __DEFAULT ->
                 : (PS addr fp o wild2_X18)
                   (case newPinnedByteArray# 32752 realWorld#
                    of _ { (# _, mbarr#_a10k #) ->
                    let { a2_s10s = byteArrayContents# (mbarr#_a10k `cast`
 ...) } in
                    let { fp' = PlainPtr mbarr#_a10k } in
                    case touch# fp' realWorld# of _ { __DEFAULT ->
                    $wa ys k a2_s10s fp' 0 0 32752
                    }
                    });
               0 ->
                 case newPinnedByteArray# 32752 realWorld#
                 of _ { (# _, mbarr#_a10k #) ->
                 let { a2_s10w = byteArrayContents# (mbarr#_a10k `cast`
 ...) } in
                 let { fp' = PlainPtr mbarr#_a10k } in
                 case touch# fp' realWorld# of _ { __DEFAULT ->
                 $wa ys k a2_s10w fp' 0 0 32752
                 }
                 }
             };
           True ->
             case x of _ { W8# x# ->
             case writeWord8OffAddr#
                    (plusAddr# addr (+# o u)) 0 x# realWorld#
             of s2_aYX { __DEFAULT ->
             case touch# fp s2_aYX of _ { __DEFAULT ->
             $wa
               xs1_awV
               k
               addr
               fp
               o
               (+# u 1)
               (-# l 1)
             }
             }
             }
         }
     }
 }}}

 I'm not sure it's possible to get there as the code transformations I
 applied manually are non-obvious (i.e. I shared the call to
 `writeWord8OffAddr#` between all three branches).

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4978#comment:17>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to