Re: cross module optimization issues

Don Stewart Sat, 22 Nov 2008 10:55:19 -0800

jwlato:
> On Wed, Nov 19, 2008 at 4:17 PM, Simon Peyton-Jones
> <[EMAIL PROTECTED]> wrote:
> > | I'm compiling with -O2 -Wall.  After looking at the Core output, I
> > | think I've found the key difference.  A function that is bound in a
> > | "where" statement is different between the monolithic and split
> > | sources.  I have no idea why, though.  I'll experiment with a few
> > | different things to see if I can get this resolved.
> >
> > In general, splitting code across modules should not make programs less 
> > efficient -- as Don says, GHC does quite aggressive cross-module inlining.
> >
> > There is one exception, though.  If a non-exported non-recursive function 
> > is called exactly once, then it is inlined *regardless of size*, because 
> > doing so does not cause code duplication.  But if it's exported and is 
> > large, then its inlining is not exposed -- and even if it were it might not 
> > be inlined, because doing so duplicates its code an unknown number of 
> > times.  You can change the threshold for (a) exposing and (b) using an 
> > inlining, with flags -funfolding-creation-threshold and 
> > -funfolding-use-threshold respectively.
> >
> > If you find there's something else going on then I'm all ears.
> >
> > Simon
> >
> 
> I did finally find the changes that make a difference.  I think it's
> safe to say that I have no idea what's actually going on, so I'll just
> report my results and let others try to figure it out.
> 
> I tried upping the thresholds mentioned, up to
> -funfolding-creation-threshold 200 -funfolding-use-threshold 100.
> This didn't seem to make any performance difference (I didn't check
> the core output).
> 
> This project is based on Oleg's Iteratee code; I started using his
> IterateeM.hs and Enumerator.hs files and added my own stuff to
> Enumerator.hs (thanks Oleg, great work as always).  When I started
> cleaning up by moving my functions from Enumerator.hs to MyEnum.hs, my
> minimal test case increased from 19s to 43s.
> 
> I've found two factors that contributed.  When I was cleaning up, I
> also removed a bunch of unused functions from IterateeM.hs (some of
> the test functions and functions specific to his running example of
> HTTP encoding).  When I added those functions back in, and added
> INLINE pragmas to the exported functions in MyEnum.hs, I got the
> performance back.
> 
> In general I hadn't added export lists to the modules yet, so all
> functions should have been exported.
> 
> So it seems that somehow the unused functions in IterateeM.hs are
> affecting how the functions I care about get implemented (or
> exported).  I did not expect that.  Next step for me is to see what
> happens if I INLINE the functions I'm exporting and remove the others,
> I suppose.
> 
> Thank you Simon and Don for your advice, especially since I'm pretty
> far over my head at this point.
>


Since this is IO code, could it be a -fno-state-hack scenario?
Simon wrote recently about when and why -fno-state-hack would be
needed, if you want to follow that up.
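
For reference, a minimal sketch of the two knobs discussed in this
thread: INLINE pragmas on exported functions, and -fno-state-hack set
per module. The names step and sumChunks are made up for illustration;
they are not from the Iteratee code, and in a real project they would
sit in their own module (say MyEnum.hs) with an export list. Whether
either knob helps in this particular case is something only the Core
output can tell you.

```haskell
-- Per-module flag; it can also be passed on the ghc command line.
{-# OPTIONS_GHC -fno-state-hack #-}

import Data.List (foldl')

-- Hypothetical exported helper. INLINE asks GHC to expose its
-- unfolding to importing modules regardless of its size.
step :: Int -> Int -> Int
step acc x = acc + x * x
{-# INLINE step #-}

-- Hypothetical exported driver: folds 'step' over every chunk,
-- carrying one strict accumulator across all of them.
sumChunks :: [[Int]] -> Int
sumChunks = foldl' (foldl' step) 0
{-# INLINE sumChunks #-}
```

Comparing the -ddump-simpl output with and without the flag should
show whether the state hack is what changes the code.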

-- Don
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
