Alfredo Di Napoli <[email protected]> writes:

> Hey Ben,
>
Hi Alfredo,

Sorry for the late response! The email queue from the weekend was a bit
longer than I would like.

> as promised I’m back to you with something more articulated and hopefully
> meaningful. I do hear you perfectly: probably trying to dive head-first
> into this without at least a rough understanding of the performance
> hotspots or the GHC overall architecture is going to do me more harm than
> good (I get the overall picture and I’m aware of the different stages of
> the GHC compilation pipeline, but it’s far from saying I’m proficient with
> the architecture as a whole). I have also read a couple of years ago the GHC
> chapter in the “Architecture of Open Source Applications” book, but I don’t
> know how much that is still relevant. If it is, I guess I should refresh my
> memory.
>
It sounds like you have done a good amount of reading. That's great.
Perhaps skimming the AOSA chapter again wouldn't hurt, but otherwise
it's likely worthwhile diving in.

> I’m currently trying to move on 2 fronts; please advise if I’m a fool
> flogging a dead horse or if I have any hope of getting anything done ;)
>
> 1. I’m trying to treat indeed the compiler as a black box (as you
> advised) trying to build a sufficiently large program where GHC is not “as
> fast as I would like” (I know that’s a very lame definition of “slow”,
> hehe). In particular, I have built the stage2 compiler with the “prof”
> flavour as you suggested, and I have chosen 2 examples as a reference
> “benchmark” for performance; DynFlags.hs (which seems to have been
> mentioned multiple times as a GHC perf killer) and the highlighting-kate
> package as posted here: https://ghc.haskell.org/trac/ghc/ticket/9221 .

Indeed, #9221 would be a very interesting ticket to look at. The
highlighting-kate package is interesting in the context of that ticket
as it has a very large amount of parallelism available.

If you do want to look at #9221, note that the cost centre profiler may
not provide the whole story.
In particular, it has been speculated that the scaling issues may be
due to either,

 * threads hitting a blackhole, resulting in blocking
 * the usual scaling limitations of GHC's stop-the-world GC

The eventlog may be quite useful for characterising these.

> The idea would be to compile those with -v +RTS -p -hc -RTS enabled,
> look at the output from the .prof file AND the `-v` flag, find any
> hotspot, try to change something, recompile, observe diff, rinse and
> repeat. Do you think I have any hope of making progress this way? In
> particular, I think compiling DynFlags.hs is a bit of a dead-end; I
> whipped up this buggy script which
> escalated into a Behemoth which is compiling pretty much half of the
> compiler once again :D
>
> ```
> #!/usr/bin/env bash
>
> ../ghc/inplace/bin/ghc-stage2 --make -j8 -v +RTS -A256M -qb0 -p -h \
> -RTS -DSTAGE=2 -I../ghc/includes -I../ghc/compiler -I../ghc/compiler/stage2 \
> -I../ghc/compiler/stage2/build \
> -i../ghc/compiler/utils:../ghc/compiler/types:../ghc/compiler/typecheck:../ghc/compiler/basicTypes \
> -i../ghc/compiler/main:../ghc/compiler/profiling:../ghc/compiler/coreSyn:../ghc/compiler/iface:../ghc/compiler/prelude \
> -i../ghc/compiler/stage2/build:../ghc/compiler/simplStg:../ghc/compiler/cmm:../ghc/compiler/parser:../ghc/compiler/hsSyn \
> -i../ghc/compiler/ghci:../ghc/compiler/deSugar:../ghc/compiler/simplCore:../ghc/compiler/specialise \
> -fforce-recomp -c "$@"
> ```
>
> I’m running it with `./dynflags.sh ../ghc/compiler/main/DynFlags.hs` but
> it’s taking a long time to compile (20+ mins on my 2014 Mac Pro) because it’s
> pulling in half of the compiler anyway :D I tried to reuse the .hi files
> from my stage2 compilation but I failed (GHC was complaining about
> interface file mismatch). Long story short, I don’t think it will be a
> very agile way to proceed. Am I right? Do you have any recommendation in
> such sense?
> Do I have any hope to compile DynFlags.hs in a way which would
> make this perf investigation feasible?
>
What I usually do in this case is just take the relevant `ghc` command
line directly from the `make` output and execute it manually. I would
imagine your debug cycle would look something like,

 * instrument the compiler
 * build stage1
 * use stage2 to build DynFlags using the stage1 compiler (using a saved
   command line)
 * think
 * repeat

This should only take a few minutes per iteration.

> The second example (the highlighting-kate package) seems much more
> promising. It takes maybe 1-2 mins on my machine, which is enough to take a
> look at the perf output. Do you think I should follow this second lead? In
> principle any 50+ modules package I think would do (better if with a lot of
> TH ;) ) but this seems like a low-entry barrier start.
>
> 2. The second path I’m exploring is simply to take a less holistic approach
> and try to dive into a performance ticket like the ones listed here:
> https://www.reddit.com/r/haskell/comments/45q90s/is_anything_being_done_to_remedy_the_soul/czzq6an/
> Maybe some are very specific, but it seems like fixing small things and
> moving forward could help give me an understanding of different sub-parts of
> GHC, which seems less intimidating than the black-box approach.
>
Do you have any specific tickets from these lists that you found
interesting?

> In conclusion, what do you think is the best approach, 1 or 2, both or
> none? ;)

I would say that it largely depends upon what you feel most comfortable
with. If you feel up for it, I think #9221 would be a nice, fairly
self-contained, yet high-impact ticket which would be worth spending a
few days diving further into.

Cheers,

- Ben

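
The saved-command-line loop described above can be sketched roughly as
follows. This is only an illustration: the log name `build.log`, the
output file `dynflags.cmd`, and the helper `extract_cmd` are hypothetical,
and it assumes the `make` output was captured to a file and that the build
printed full compiler command lines.

```bash
#!/usr/bin/env bash
# Sketch only: the file names and the helper name are hypothetical.

# Print the last line of a saved build log that mentions the given source
# file. With a verbose build log, that line is the full compiler invocation.
extract_cmd() {
  local log="$1" src="$2"
  # -F treats the path as a fixed string rather than a regular expression.
  grep -F -- "$src" "$log" | tail -n 1
}

# Intended use (not executed here):
#   extract_cmd build.log compiler/main/DynFlags.hs > dynflags.cmd
#   time bash dynflags.cmd    # replay after each stage1 rebuild
```

Saving the command to a file keeps each iteration to a single replay step,
without going through `make` again for the one module under study.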
_______________________________________________
ghc-devs mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
