Manuel

OK, I've improved the eta expander so it does a better
job of the code you found.

Try now.  On the HEAD.  Which should compile OK now, incidentally.

How's your parallel Haskell stuff coming?

Simon


| -----Original Message-----
| From: Manuel M. T. Chakravarty [mailto:[EMAIL PROTECTED]]
| Sent: 02 November 2000 13:54
| To: Simon Peyton-Jones
| Cc: [EMAIL PROTECTED]
| Subject: RE: Fun with GHC's optimiser
| 
| 
| Simon Peyton-Jones <[EMAIL PROTECTED]> wrote,
| 
| > I can never resist messages like these, even when I'm meant
| > to be doing other things.  
| 
| That's good to know ;-)
| 
| > It's very helpful when people offer
| > fairly precise performance-bug reports.  Thanks!
| 
| Thanks for the prompt reply!  To SimonM, too.
| 
| > | I am wondering whether there is a particular reason why the
| > | optimiser doesn't pull the
| > | 
| > |   (1)  a = NO_CCS PArray! [wild1 mba#];
| > 
| > This one is a definite bug.  It turns out that the head of the
| > before-ghci-branch doesn't have this bug, so I'm disinclined
| > to investigate it further.  
| 
| No problem, I can always build a new ghc from CVS.
| 
| > |   (2)  case w of wild3 {
| > |          I# e# ->
| > | 
| > | As for (2), the loop would be nice and straight if that
| > | unboxing where outside of the loop - as it is, we break the
| > | pipeline once per iteration it seems
| > 
| > This one is a bit harder.  Basically we want to make a wrapper
| > for a recursive function if it's sure to evaluate its free 
| variables.
| > 
| > In fact the 'liberate-case' pass (which isn't switched on in 4.08)
| > is meant to do just this. It's in simplCore/LiberateCase.lhs,
| > and it's not very complicated.  I've just tried it and it 
| doesn't seem
| > to have the desired effect, but I'm sure that's for a boring reason.
| > If anyone would like to fix it, go ahead!
| 
| Ok - something for a rainy weekend, I guess...
| 
| > Incidentally, you'll find that -ddump-simpl gives you a dump that
| > is pretty close to STG and usually much more readable.  Most
| > performance bugs show up there.  -dverbose-simpl gives you more
| > clues about what is happening where.
| 
| Good to know.  I just wasn't sure whether I wouldn't miss
| out on some optimisations.
| 
| > | Also if somebody is looking at the attached source, I was
| > | wondering why, when I use the commented out code in
| > | `newPArray', I get a lot worse code (the STG code is in a
| > | comment at the end of the file).  In particular, the lambda
| > | abstraction is not inlined, whereas `fill' gets inlined into
| > | the code of which the dump is above.  Is it because the
| > | compiler has a lot harder time with explicit recursion than
| > | with fold/build?  If so, the right RULES magic should allow
| > | me to do the same for my own recursively defined
| > | combinators, shouldn't it?
| > 
| > I couldn't figure out exactly what you meant.  The only commented
| > out code is STG code.  Maybe send a module with the actual
| > source you are bothered about.
| 
| Appended the version of the module where the code is
| activated (it was commented out with --) - you'll find the
| problematic code by searching for (**) in the code.  It
| generates
| 
| $w$snewPArray
|   = \ ww :: Int# w :: Int ->
|       case newIntArray# @ RealWorld ww realWorld#
|       of wild { (# s2#, mba# #) ->
|       case $wsimpleGen
|                ww
|                (\ i :: Int ->
|                     case i of wild1 { I# i# ->
|                     case w of wild2 { I# e# ->
|                     __coerce (ST RealWorld ())
|                     (\ s# :: (State# RealWorld) ->
|                          case writeIntArray# @ RealWorld mba# i# e# s#
|                          of s2#1 { __DEFAULT ->
|                          (# s2#1, () #)
|                          })
|                     }
|                     })
|                s2#
|       of wild1 { (# new_s, r #) ->
|       let {
|         a :: Int
|         a = $wI# ww
|       } in  (# a, (__coerce ByteArray# mba#) #)
|       }
|       }
| 
| One thing that is not nice is in the lambda abstraction
| based to $wsimpleGen.  There is one lambda, then two
| unboxing operations and another lambda for the state
| variables s#.  These two nested lambda's become in STG
| 
|               stg_c1Op =
|                   NO_CCS[] \r[i]
|                       case i of wild1 {
|                         I# i# ->
|                             case w of wild2 {
|                               I# e# ->
|                                   let {
|                                     stg_c1Sv =
|                                         NO_CCS[] \r[s#]
|                                             case 
| writeIntArray# [mba# i# e# s#] of s2#1 {
|                                               DEFAULT -> 
| (#,#) [s2#1 ()]
|                                             };
|                                   } in  stg_c1Sv;
|                             };
|                       };
| 
| So, stg_c1Sv is allocated and immediately entered.  I would
| have hoped that the two lambda abstractions are merged into
| something like this:
| 
|                (\ i :: Int s# :: (State# RealWorld) ->
|                     case i of wild1 { I# i# ->
|                     case w of wild2 { I# e# ->
|                     case writeIntArray# @ RealWorld mba# i# e# s#
|                          of s2#1 { __DEFAULT -> (# s2#1, () #)
|                     })
|                     }
|                     })
| 
| Maybe it is the __coerce, which prevents this from
| happening?
| 
| Also, I am wondering why in the case of the 
| 
|   foldr (fill mpa e) (return $ unsafeFreezeMPArray mpa) [0..n-1])
| 
| the function `fill' gets inlined into the foldr loop, but in
| the above, the lambda abstraction gets not inlined into
| $wsimpleGen.  Maybe it is because $wsimpleGen itself gets
| not inlined...
| 
| Cheers,
| Manuel
| 

_______________________________________________
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Reply via email to