If you use INLINEABLE, that should make the function specialisable to a particular monad, even if it's in a different module. You shouldn't need INLINE for that.
I don't understand the difference between cases (2) and (3). I am still suspicious of why there are so many calls to this one function that it, alone, is allocating a significant proportion of compilation of the entire run of GHC. Are you sure there isn't an algorithmic improvement to be had, to simply reduce the number of calls? Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of | Richard Eisenberg | Sent: 16 December 2014 21:46 | To: Joachim Breitner | Cc: ghc-devs@haskell.org | Subject: Re: performance regressions | | I've learned several very interesting things in this analysis. | | - Inlining polymorphic methods is very important. Here are some data | points to back up that claim: | * Original implementation using zipWithAndUnzipM: 8,472,613,440 | bytes allocated in the heap | * Adding {-# INLINE #-} to the definition thereof: 6,639,253,488 | bytes allocated in the heap | * Using `inline` at call site to force inlining: 6,281,539,792 | bytes allocated in the heap | | The middle step above allowed GHC to specialize zipWithAndUnzipM to my | particular monad, but GHC didn't see fit to actually inline the | function. Using `inline` forced it, to good effect. (I did not collect | data on code sizes, but it wouldn't be hard to.) | | By comparison: | * Hand-written recursion: 6,587,809,112 bytes allocated in the | heap | Interestingly, this is *not* the best result! | | Conclusion: We should probably add INLINE pragmas to Util and | MonadUtils. | | | - I then looked at rejiggering the algorithm to keep the common case | fast. This had a side effect of changing the zipWithAndUnzipM to | mapAndUnzipM, from Control.Monad. To my surprise, this brought | disaster! | * Using `inline` and mapAndUnzipM: 7,463,047,432 bytes | allocated in the heap | * Hand-written recursion: 5,848,602,848 bytes | allocated in the heap | | That last number is better than the numbers above because of the | algorithm streamlining. But, the inadequacy of mapAndUnzipM surprised | me -- it already has an INLINE pragma in Control.Monad of course. | Looking at -ddump-simpl, it seems that mapAndUnzipM was indeed getting | inlined, but a call to `map` remained, perhaps causing extra | allocation. | | Conclusion: We should examine the implementation of mapAndUnzipM (and | similar functions) in Control.Monad. Is it as fast as possible? | | | | In the end, I was unable to bring the allocation numbers down to where | they were before my work. This is because the flattener now deals in | roles. Most of its behavior is the same between nominal and | representational roles, so it seems silly (though very possible) to | specialize the code to nominal to keep that path fast. Instead, I | identified one key spot and made that go fast. | | Thus, there is a 7% bump to memory usage on very-type-family-heavy | code, compared to before my commit on Friday. (On more ordinary code, | there is no noticeable change.) | | Validating my patch locally now; will push when that's done. | | Thanks, | Richard | | On Dec 16, 2014, at 10:41 AM, Joachim Breitner <mail@joachim- | breitner.de> wrote: | | > Hi, | > | > | > Am Dienstag, den 16.12.2014, 09:59 -0500 schrieb Richard Eisenberg: | >> On Dec 16, 2014, at 4:01 AM, Joachim Breitner <mail@joachim- | breitner.de> wrote: | >> | >>> another guess (without looking at the code, sorry): Are they in | the | >>> same module? I.e., can GHC specialize the code to your particular | Monad? | > | >> No, they're not in the same module. I could also try moving the | >> zipWithAndUnzipM function to the same module, and even specializing | >> it by hand to the right monad. | > | > I did mean zipWithAndUnzipM, so maybe yes: Try that. | > | > (I find it hard to believe that any polymorphic monadic code should | > perform well, with those many calls to an unknown (>>=) with a | > function parameter, but maybe I'm too pessimistic here.) | > | > | >> Could that be preventing the fusing? | > | > There is not going to be any fusing here, at least not list fusion; | > that would require your code to be written in terms of functions | with | > fusion rules. | > | > Greetings, | > Joachim | > | > -- | > Joachim "nomeata" Breitner | > m...@joachim-breitner.de * http://www.joachim-breitner.de/ | > Jabber: nome...@joachim-breitner.de * GPG-Key: 0xF0FBF51F Debian | > Developer: nome...@debian.org | > | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs@haskell.org | > http://www.haskell.org/mailman/listinfo/ghc-devs | | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs