jwlato:
> On Wed, Nov 19, 2008 at 4:17 PM, Simon Peyton-Jones
> <[EMAIL PROTECTED]> wrote:
> > | I'm compiling with -O2 -Wall. After looking at the Core output, I
> > | think I've found the key difference. A function that is bound in a
> > | "where" statement is different between the monolithic and split
> > | sources. I have no idea why, though. I'll experiment with a few
> > | different things to see if I can get this resolved.
> >
> > In general, splitting code across modules should not make programs less
> > efficient -- as Don says, GHC does quite aggressive cross-module inlining.
> >
> > There is one exception, though. If a non-exported non-recursive function
> > is called exactly once, then it is inlined *regardless of size*, because
> > doing so does not cause code duplication. But if it's exported and is
> > large, then its inlining is not exposed -- and even if it were it might not
> > be inlined, because doing so duplicates its code an unknown number of
> > times. You can change the threshold for (a) exposing and (b) using an
> > inlining, with flags -funfolding-creation-threshold and
> > -funfolding-use-threshold respectively.
> >
> > If you find there's something else going on then I'm all ears.
> >
> > Simon
> >
>
> I did finally find the changes that make a difference. I think it's
> safe to say that I have no idea what's actually going on, so I'll just
> report my results and let others try to figure it out.
>
> I tried upping the thresholds mentioned, up to
> -funfolding-creation-threshold 200 -funfolding-use-threshold 100.
> This didn't seem to make any performance difference (I didn't check
> the core output).
>
> This project is based on Oleg's Iteratee code; I started using his
> IterateeM.hs and Enumerator.hs files and added my own stuff to
> Enumerator.hs (thanks Oleg, great work as always). When I started
> cleaning up by moving my functions from Enumerator.hs to MyEnum.hs, my
> minimal test case increased from 19s to 43s.
>
> I've found two factors that contributed. When I was cleaning up, I
> also removed a bunch of unused functions from IterateeM.hs (some of
> the test functions and functions specific to his running example of
> HTTP encoding). When I added those functions back in, and added
> INLINE pragmas to the exported functions in MyEnum.hs, I got the
> performance back.
>
> In general I hadn't added export lists to the modules yet, so all
> functions should have been exported.
>
> So it seems that somehow the unused functions in IterateeM.hs are
> affecting how the functions I care about get implemented (or
> exported). I did not expect that. Next step for me is to see what
> happens if I INLINE the functions I'm exporting and remove the others,
> I suppose.
>
> Thank you Simon and Don for your advice, especially since I'm pretty
> far over my head at this point.
>

Is this, since it is in IO code, a -fno-state-hack scenario? Simon wrote
recently about when and why -fno-state-hack would be needed, if you want
to follow that up.

-- Don
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
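
For readers following the thread, here is a minimal sketch of the
arrangement being discussed: a module with an explicit export list, an
INLINE pragma on the exported function, and a non-exported helper that is
used exactly once. The module reuses the MyEnum name from the message above,
but sumSquares and step are hypothetical and are not taken from the
Iteratee code; this only illustrates the mechanism Simon describes and
is not a reconstruction of the actual source.

```haskell
-- MyEnum.hs: illustrative only; sumSquares and step are hypothetical names.
module MyEnum
  ( sumSquares  -- the only exported function
  ) where

import Data.List (foldl')

-- The INLINE pragma asks GHC to keep this function's unfolding in the
-- interface file, so importing modules can inline it even if it would
-- otherwise be considered too large to expose.
{-# INLINE sumSquares #-}
sumSquares :: [Int] -> Int
sumSquares = foldl' step 0

-- Not exported and used exactly once: this is the case where, per the
-- explanation quoted above, GHC can inline regardless of size because
-- no code is duplicated.
step :: Int -> Int -> Int
step acc x = acc + x * x
```

Compiling with something like `ghc -O2 -ddump-simpl MyEnum.hs` prints the
Core, which is one way to compare what GHC produces with and without the
pragma or the export list.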