Re: performance regressions

Richard Eisenberg Tue, 16 Dec 2014 07:06:20 -0800

On Dec 16, 2014, at 2:59 AM, Jan Stolarek <jan.stola...@p.lodz.pl> wrote:
> What does your algorithm look like when you write it directly? Something like 
> this:
> 
> flatten_many fmode roles tys
> = unzip `liftM` mapM go (zip roles tys)
> where
>   go (Nominal,ty)          = flatten_one (fmode { fe_eq_rel = NomEq }) ty
>   go (Representational,ty) = flatten_one (fmode { fe_eq_rel = ReprEq }) ty
>   go (Phantom, ty)         = -- See Note [Phantoms in the flattener]
>                              return (ty, mkTcPhantomCo ty ty)
> 
> ?
> 
> Maybe this has something to do with `zipWithAndUnzipM` not being 
> tail-recursive vs. direct version 
> being able to fuse intermediate lists?

My direct version is even uglier:

> flatten_many fmode roles tys
>   = go roles tys
>   where
>     go (Nominal:rs)          (ty:tys)
>       = do { (xi, co) <- flatten_one (setFEEqRel fmode NomEq)  ty
>            ; (xis, cos) <- go rs tys
>            ; return (xi:xis, co:cos) }
>     go (Representational:rs) (ty:tys)
>       = do { (xi, co) <- flatten_one (setFEEqRel fmode ReprEq) ty
>            ; (xis, cos) <- go rs tys
>            ; return (xi:xis, co:cos) }
>     go (Phantom:rs)          (ty:tys)
>       = do { (xis, cos) <- go rs tys
>            ; -- See Note [Phantoms in the flattener]
>              return (ty:xis, mkTcPhantomCo ty ty:cos) }
>     go _ _ = return ([], [])

I could refactor to make it better, but I would be worried that the version you 
wrote would suffer from the same problems as zipWithAndUnzipM. Will check to 
see, though.

On Dec 16, 2014, at 4:01 AM, Joachim Breitner <m...@joachim-breitner.de> wrote:

> another guess (without looking at the code, sorry): Are they in the same
> module? I.e., can GHC specialize the code to your particular Monad?

No, they're not in the same module. I could also try moving the 
zipWithAndUnzipM function to the same module, and even specializing it by hand 
to the right monad. Could that be preventing the fusing?

On Dec 16, 2014, at 8:49 AM, Simon Peyton Jones <simo...@microsoft.com> wrote:

> That seems surprising.  I'd build a profiled compiler (GhcProfiled=YES) and 
> see what it says.  
> 
> If it increases allocation by 30% overall, there must be a LOT of calls to 
> this function.  Should there be so many?

I've been working from a profiled compiler. That's how I found that this 
function was the culprit -- it certainly wasn't my first guess!

Yes, there are A LOT of calls: 7,106,808 to be exact. (The test case is 
perf/compiler/T9872a; the function is flatten_many). The number of calls 
doesn't vary from before my commit, though, so the raw number isn't the problem 
-- it's the allocation.

I'll try turning some of these knobs to see where the difference is.

Thanks,
Richard

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

Re: performance regressions

Reply via email to