Re: Measuring performance of GHC
On Tue, Dec 6, 2016 at 10:10 PM Ben Gamari wrote:

> [...]
>
> > How should we proceed? Should I open a new ticket focused on this?
> > (maybe we could try to figure out all the details there?)
>
> That sounds good to me.

Cool, opened https://ghc.haskell.org/trac/ghc/ticket/12941 to track this.

Cheers,
Michal

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: Measuring performance of GHC
Johannes Waldmann writes:

> Hi Ben, thanks,
>
>> 4. run the build, `cabal configure --ghc-options="-p -hc" $args && cabal
>> build`
>
> cabal configure $args --ghc-options="+RTS -p -hc -RTS"

Ahh, yes, of course. I should have tried this before hitting send.

>> You should end up with a .prof and .hp file.
>
> Yes, that works. Typical output starts like this:
>
>     COST CENTRE    MODULE     %time %alloc
>
>     SimplTopBinds  SimplCore   60.7   57.3
>     OccAnal        SimplCore    6.0    6.0
>     Simplify       SimplCore    3.0    0.5

Ahh yes. One of the things I neglected to mention is that the profiled build flavour includes only a few cost centers. One of the tricky aspects of the cost-center profiler is that it affects core-to-core optimizations, meaning that the act of profiling may actually shift costs around. Consequently, by default the build flavour includes a rather conservative set of cost centers, to avoid distorting the results and to preserve compiler performance.

Typically when I've profiled the compiler I already have a region of interest in mind, so I simply add `OPTIONS_GHC -fprof-auto` pragmas to the modules involved. The build system already adds this flag to a few top-level modules, hence the cost centers which you observe (see compiler/ghc.mk; search for GhcProfiled).

If you don't have a particular piece of the compiler in mind to study, you can certainly just pepper every module with cost centers by adding -fprof-auto to GhcStage2HcOpts (e.g. in mk/build.mk). The resulting compiler may be a bit slow, and you may need to be just a tad more careful in evaluating the profile. It might be nice if we had a more aggressive profiled build flavour which added cost centers to a larger fraction of the compiler's machinery, while excluding low-level utilities like FastString, which are critical to the compiler's performance.

> These files are always called ghc.{prof,hp},
> how could this be changed?
> Ideally, the output file name
> would depend on the package being compiled,
> then the mechanism could probably be used with 'stack' builds.

We really should have a way to do this but sadly do not currently. Ideally we would also have a way to change the default eventlog destination path.

> Building executables mentioned in the cabal file will
> already overwrite profiling info from building libraries.

Note that you can instruct `cabal` to build only a single component of a package. For instance, in the case of the `text` package you can build just the library component with `cabal build text`.

> When I 'cabal build' the 'text' package,
> then the last actual compilation (which leaves
> the profiling info) is for cbits/cbits.c

Ahh right. Moreover, there is likely another GHC invocation after that to link the final library. This is why I typically just use GHC directly, perhaps stealing the command line produced by `cabal` (with `-v`).

> I don't see how to build Data/Text.hs alone
> (with ghc, not via cabal), I am getting
> Failed to load interface for ‘Data.Text.Show’

Hmm, I'm not sure I see the issue. In the case of `text` I can just run `ghc` from the source root (ensuring that I set the #include path with `-I`),

    $ git clone git://github.com/bos/text
    $ cd text
    $ ghc Data/Text.hs -Iinclude

However, some other packages (particularly those that make heavy use of CPP) aren't entirely straightforward. In these cases I often find myself copying bits from the command line produced by cabal.

Cheers,

- Ben
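[Editor's note: a minimal sketch of the `-fprof-auto`/SCC approach Ben describes above. The module, function, and cost-center names here are invented for illustration; the pragmas are real GHC syntax, and are harmless no-ops when the module is compiled without -prof.]

```haskell
{-# OPTIONS_GHC -fprof-auto #-}  -- auto-annotate every binding in this module
module Main where

-- Hypothetical "region of interest"; in a -prof build this gets its
-- own cost center in the .prof report thanks to -fprof-auto above.
fib :: Int -> Integer
fib n = go n 0 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)

main :: IO ()
main = print ({-# SCC "fib30" #-} fib 30)  -- explicit SCC for one call site
```

Compiled with `ghc -prof Main.hs` and run with `./Main +RTS -p`, this produces a Main.prof in which "fib30" and the auto-inserted cost centers appear; without -prof the program behaves identically and the annotations are ignored.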
Re: Measuring performance of GHC
Hi,

On Wednesday, 07.12.2016, 11:34 +0100, Johannes Waldmann wrote:
> When I 'cabal build' the 'text' package,
> then the last actual compilation (which leaves
> the profiling info) is for cbits/cbits.c
>
> I don't see how to build Data/Text.hs alone
> (with ghc, not via cabal), I am getting
> Failed to load interface for ‘Data.Text.Show’

You can run

    $ cabal build -v

and then copy’n’paste the command line that you are interested in, add the flags

    +RTS -p -hc -RTS -fforce-recomp

and run that again.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
m...@joachim-breitner.de • https://www.joachim-breitner.de/
XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
Debian Developer: nome...@debian.org
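[Editor's note: Joachim's copy'n'paste step can be scripted. A hedged sketch — the build.log contents below are a made-up stand-in for real `cabal build -v` output, and the grep pattern assumes the ghc invocation contains "ghc --make".]

```shell
# Save the verbose build output once, then extract the ghc invocation
# and append the profiling flags. build.log here is invented.
cat > build.log <<'EOF'
Configuring text-1.2.2.1...
Building text-1.2.2.1...
/usr/local/bin/ghc --make -O2 -Iinclude Data/Text.hs
Linking...
EOF

grep -m1 'ghc --make' build.log \
  | sed 's/$/ +RTS -p -hc -RTS -fforce-recomp/'
```

Running the printed command then recompiles just that module with profiling enabled, leaving a fresh .prof/.hp pair.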
Re: Measuring performance of GHC
Hi Ben, thanks,

>> 4. run the build, `cabal configure --ghc-options="-p -hc" $args && cabal
>> build`

cabal configure $args --ghc-options="+RTS -p -hc -RTS"

>> You should end up with a .prof and .hp file.

Yes, that works. Typical output starts like this:

    COST CENTRE    MODULE     %time %alloc

    SimplTopBinds  SimplCore   60.7   57.3
    OccAnal        SimplCore    6.0    6.0
    Simplify       SimplCore    3.0    0.5

These files are always called ghc.{prof,hp}; how could this be changed? Ideally, the output file name would depend on the package being compiled; then the mechanism could probably be used with 'stack' builds. Building executables mentioned in the cabal file will already overwrite profiling info from building libraries.

When I 'cabal build' the 'text' package, then the last actual compilation (which leaves the profiling info) is for cbits/cbits.c. I don't see how to build Data/Text.hs alone (with ghc, not via cabal); I am getting

    Failed to load interface for ‘Data.Text.Show’

- J.
Re: Measuring performance of GHC
Joachim Breitner writes:

> Hi,
>
> On Tuesday, 06.12.2016, 17:14 -0500, Ben Gamari wrote:
>> Joachim Breitner writes:
>>> On Tuesday, 06.12.2016, 19:27 +, Michal Terepeta wrote:
>>>> (isn't that what perf.haskell.org is doing?)
>>>
>>> for compiler performance, it only reports the test suite perf test
>>> numbers so far.
>>>
>>> If someone modifies the nofib runner to give usable timing results for
>>> the compiler, I can easily track these numbers as well.
>>
>> I have a module [1] that does precisely this for the PITA project (which
>> I still have yet to put up on a public server; I'll try to make time for
>> this soon).
>
> Are you saying that the compile time measurements of a single run of
> the compiler are actually useful?

Not really; I generally ignore the compile times. However, knowing compiler allocations on a per-module basis is quite nice.

> I’d expect we first have to make nofib call the compiler repeatedly.

This would be a good idea, though.

> Also, shouldn’t this then become part of nofib-analyse?

The logic for producing these statistics is implemented by nofib-analyse's Slurp module today. All the script does is produce the statistics in a more consistent format.

Cheers,

- Ben
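[Editor's note: a toy sketch of what "a more consistent format" might mean in practice — rendering per-module allocation figures as CSV. The log-line format here is invented for illustration; the real logic lives in nofib-analyse's Slurp module and Ben's SummarizeResults.hs.]

```haskell
module Main where

-- Hypothetical log lines of the form "<module> <bytes allocated>".
parseLine :: String -> Maybe (String, Integer)
parseLine s = case words s of
  [m, n] -> Just (m, read n)
  _      -> Nothing  -- reject anything that isn't exactly two fields

-- Render the parsed metrics as CSV with a header row, ready for
-- tracking over time or diffing between compilers.
toCsv :: [(String, Integer)] -> String
toCsv rows = unlines ("module,alloc" : [m ++ "," ++ show n | (m, n) <- rows])

main :: IO ()
main = putStr (toCsv (maybe [] id (mapM parseLine sample)))
  where
    sample = ["Data.Text 123456", "Data.Text.Show 7890"]  -- invented numbers
```

This prints a two-column CSV (`module,alloc` header, one row per module), which a tool or spreadsheet can then compare "before and after" a compiler change.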
Re: Measuring performance of GHC
Hi,

On Tuesday, 06.12.2016, 17:14 -0500, Ben Gamari wrote:
> Joachim Breitner writes:
>
>> Hi,
>>
>> On Tuesday, 06.12.2016, 19:27 +, Michal Terepeta wrote:
>>> (isn't that what perf.haskell.org is doing?)
>>
>> for compiler performance, it only reports the test suite perf test
>> numbers so far.
>>
>> If someone modifies the nofib runner to give usable timing results for
>> the compiler, I can easily track these numbers as well.
>
> I have a module [1] that does precisely this for the PITA project (which
> I still have yet to put up on a public server; I'll try to make time for
> this soon).

Are you saying that the compile time measurements of a single run of the compiler are actually useful? I’d expect we first have to make nofib call the compiler repeatedly.

Also, shouldn’t this then become part of nofib-analyse?

Greetings,
Joachim

--
Joachim “nomeata” Breitner
m...@joachim-breitner.de • https://www.joachim-breitner.de/
XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
Debian Developer: nome...@debian.org
Re: Measuring performance of GHC
Joachim Breitner writes:

> Hi,
>
> On Tuesday, 06.12.2016, 19:27 +, Michal Terepeta wrote:
>> (isn't that what perf.haskell.org is doing?)
>
> for compiler performance, it only reports the test suite perf test
> numbers so far.
>
> If someone modifies the nofib runner to give usable timing results for
> the compiler, I can easily track these numbers as well.

I have a module [1] that does precisely this for the PITA project (which I still have yet to put up on a public server; I'll try to make time for this soon).

Cheers,

- Ben

[1] https://github.com/bgamari/ghc-perf-import/blob/master/SummarizeResults.hs
Re: Measuring performance of GHC
Johannes Waldmann writes:

> Hi,
>
>> ... to compile it with a profiled GHC and look at the report?
>
> How hard is it to build hackage or stackage
> with a profiled ghc? (Does it require ghc magic, or can I do it?)

Not terribly hard, although it could be made smoother. To start you'll need to compile a profiled GHC. To do this you simply want to do something like the following:

 1. install the necessary build dependencies [1]
 2. get the sources [2]
 3. configure the tree to produce a profiled compiler:
    a. cp mk/build.mk.sample mk/build.mk
    b. uncomment the line `BuildFlavour=prof` in mk/build.mk
 4. `./boot && ./configure --prefix=$dest && make && make install`

Then for a particular package,

 1. get a working directory: `cabal unpack $pkg && cd $pkg-*`
 2. `args="--with-ghc=$dest/bin/ghc --allow-newer=base,ghc-prim,template-haskell,..."`
 3. install dependencies: `cabal install --only-dependencies $args .`
 4. run the build: `cabal configure --ghc-options="-p -hc" $args && cabal build`

You should end up with a .prof and .hp file. Honestly, I often skip the `cabal` step entirely and just use `ghc` to compile a module of interest directly.

[1] https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation
[2] https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources

>> ... some obvious sub-optimal algorithms in GHC.
>
> obvious to whom? you mean sub-optimality is already known,
> or that it would become obvious once the reports are there?

I think "obvious" may have been a bit of a strong word here. There are sub-optimal algorithms in the compiler, and they can be found with a bit of work. If you have a good testcase tickling such an algorithm, finding the issue can be quite straightforward; if not, the process can be a bit trickier. However, GHC is just another Haskell program, and performance issues are approached just like in any other project.

> Even without profiling - does hackage collect timing information from
> its automated builds?

Sadly it doesn't. But...

> What needs to be done to add timing information in places like
> https://hackage.haskell.org/package/obdd-0.6.1/reports/1 ?

I've discussed with Herbert the possibility of adding instrumentation to his matrix builder [3] to collect this sort of information.

As a general note, keep in mind that timings are quite unstable, dependent upon factors beyond our control at all levels of the stack. For this reason, I generally prefer to rely on allocations, not runtimes, while profiling. As always, don't hesitate to drop by #ghc if you run into trouble.

Cheers,

- Ben

[3] http://matrix.hackage.haskell.org/packages
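[Editor's note: step 3 of the build recipe above amounts to a one-line edit of mk/build.mk. A sketch — the build.mk contents below are an abbreviated, invented stand-in for the real mk/build.mk.sample, which lists many more flavours.]

```shell
# Stand-in for `cp mk/build.mk.sample mk/build.mk`; the sample file
# ships with all flavour lines commented out.
cat > build.mk <<'EOF'
# Uncomment ONE of the following lines:
#BuildFlavour = perf
#BuildFlavour = prof
#BuildFlavour = quick
EOF

# Step 3b: uncomment only the prof flavour.
sed -i 's/^#BuildFlavour = prof$/BuildFlavour = prof/' build.mk
grep '^BuildFlavour' build.mk
```

After this, `./boot && ./configure --prefix=$dest && make && make install` (step 4) produces a compiler whose stage-2 executable is itself built with profiling enabled.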
Re: Measuring performance of GHC
Hi,

On Tuesday, 06.12.2016, 19:27 +, Michal Terepeta wrote:
> (isn't that what perf.haskell.org is doing?)

For compiler performance, it only reports the test suite perf test numbers so far.

If someone modifies the nofib runner to give usable timing results for the compiler, I can easily track these numbers as well.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
m...@joachim-breitner.de • https://www.joachim-breitner.de/
XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
Debian Developer: nome...@debian.org
Re: Measuring performance of GHC
Michal Terepeta writes:

>> I don't have a strong opinion on which of these would be better.
>> However, I would point out that currently the tests/perf/compiler tests
>> are extremely labor-intensive to maintain while doing relatively little
>> to catch performance regressions. There are a few issues here:
>>
>>  * some tests aren't very reproducible between runs, meaning that
>>    contributors sometimes don't catch regressions in their local
>>    validations
>>  * many tests aren't very reproducible between platforms and all tests
>>    are inconsistent between differing word sizes. This means that we end
>>    up having many sets of expected performance numbers in the testsuite.
>>    In practice nearly all of these except 64-bit Linux are out-of-date.
>>  * our window-based acceptance criterion for performance metrics doesn't
>>    catch most regressions, which typically bump allocations by a couple
>>    percent or less (whereas the acceptance thresholds range from 5% to
>>    20%). This means that the testsuite fails to catch many deltas, only
>>    failing when some unlucky person finally pushes the number over the
>>    threshold.
>>
>> Joachim and I discussed this issue a few months ago at Hac Phi; he had
>> an interesting approach to tracking expected performance numbers which
>> may both alleviate these issues and reduce the maintenance burden that
>> the tests pose. I wrote down some terse notes in #12758.
>
> Thanks for mentioning the ticket!

Sure!

> To be honest, I'm not a huge fan of having performance tests being
> treated the same as any other tests. IMHO they are quite different:
>
> - They usually need a quiet environment (e.g., cannot run two different
>   tests at the same time). But with ordinary correctness tests, I can
>   run as many as I want concurrently.

This is absolutely true; if I had a nickel for every time I saw the testsuite fail, only to pass upon re-running, I would be able to fund a great deal of GHC development ;)

> - The output is not really binary (correct vs incorrect) but some kind of a
>   number (or collection of numbers) that we want to track over time.

Yes, and this is more or less the idea which the ticket is supposed to capture; we track performance numbers in the GHC repository in git notes and have Harbormaster (or some other stable test environment) maintain them. Exact metrics would be recorded for every commit, and we could warn during validate if something changes suspiciously (e.g. look at the mean and variance of the metric over the past N commits and squawk if the commit bumps the metric by more than some number of sigmas).

This sort of scheme could be implemented in either the testsuite or nofib. It's not clear that one is better than the other (although we would want to teach the testsuite driver to run performance tests serially).

> - The decision whether to fail is harder. Since output might be noisy, you
>   need to have either quite relaxed bounds (and miss small
>   regressions) or try to enforce stronger bounds (and suffer from the
>   flakiness and maintenance overhead).

Yep. That is right.

> So for the purpose of:
> "I have a small change and want to check its effect on compiler
> performance and expect, e.g., ~1% difference"
> the model of running benchmarks separately from tests is much nicer. I
> can run them when I'm not doing anything else on the computer and then
> easily compare the results. (that's what I usually do for nofib). For
> tracking the performance over time, one could set something up to run
> the benchmarks when idle. (isn't that what perf.haskell.org is
> doing?)
>
> Due to that, if we want to extend tests/perf/compiler to support this
> use case, I think we should include there benchmarks that are *not*
> tests (and are not included in ./validate), but there's some easy tool
> to run all of them and give you a quick comparison of what's changed.

When you put it like this it does sound like nofib is the natural choice here.

> To a certain degree this would be then orthogonal to the improvements
> suggested in the ticket. But we could probably reuse some things
> (e.g., dumping .csv files for perf metrics?)

Indeed.

> How should we proceed? Should I open a new ticket focused on this?
> (maybe we could try to figure out all the details there?)

That sounds good to me.

Cheers,

- Ben
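[Editor's note: the "mean and variance over the past N commits, squawk beyond some number of sigmas" check Ben sketches above could look something like the following toy sketch. The metric history and thresholds are invented numbers for illustration.]

```haskell
module Main where

-- Mean and (population) standard deviation of a metric's history.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

stddev :: [Double] -> Double
stddev xs = sqrt (mean [(x - m) ^ (2 :: Int) | x <- xs])
  where m = mean xs

-- Squawk if the new commit moves the metric more than k sigmas
-- away from the mean of the last N recorded values.
suspicious :: Double -> [Double] -> Double -> Bool
suspicious k history new = abs (new - mean history) > k * stddev history

main :: IO ()
main = do
  let history = [100.0, 101.0, 99.0, 100.0, 100.0]  -- e.g. MB allocated
  print (suspicious 2 history 110.0)  -- a 10% jump, well outside 2 sigma
  print (suspicious 2 history 100.5)  -- within the historical noise
```

Unlike the fixed 5%–20% windows, the threshold here adapts to how noisy each metric actually is: a stable metric gets a tight bound, a flaky one a loose bound.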
Re: Measuring performance of GHC
> On Tue, Dec 6, 2016 at 2:44 AM Ben Gamari wrote:
>
> Michal Terepeta writes:
>
> [...]
>> Looking at the comments on the proposal from Moritz, most people would
>> prefer to extend/improve nofib or `tests/perf/compiler` tests. So I
>> guess the main question is - what would be better:
>> - Extending nofib with modules that are compile only (i.e., not
>>   runnable) and focus on stressing the compiler?
>> - Extending `tests/perf/compiler` with ability to run all the tests and
>>   do easy "before and after" comparisons?
>
> I don't have a strong opinion on which of these would be better.
> However, I would point out that currently the tests/perf/compiler tests
> are extremely labor-intensive to maintain while doing relatively little
> to catch performance regressions. There are a few issues here:
>
>  * some tests aren't very reproducible between runs, meaning that
>    contributors sometimes don't catch regressions in their local
>    validations
>  * many tests aren't very reproducible between platforms and all tests
>    are inconsistent between differing word sizes. This means that we end
>    up having many sets of expected performance numbers in the testsuite.
>    In practice nearly all of these except 64-bit Linux are out-of-date.
>  * our window-based acceptance criterion for performance metrics doesn't
>    catch most regressions, which typically bump allocations by a couple
>    percent or less (whereas the acceptance thresholds range from 5% to
>    20%). This means that the testsuite fails to catch many deltas, only
>    failing when some unlucky person finally pushes the number over the
>    threshold.
>
> Joachim and I discussed this issue a few months ago at Hac Phi; he had
> an interesting approach to tracking expected performance numbers which
> may both alleviate these issues and reduce the maintenance burden that
> the tests pose. I wrote down some terse notes in #12758.

Thanks for mentioning the ticket!

To be honest, I'm not a huge fan of having performance tests being treated the same as any other tests. IMHO they are quite different:

- They usually need a quiet environment (e.g., cannot run two different tests at the same time). But with ordinary correctness tests, I can run as many as I want concurrently.

- The output is not really binary (correct vs incorrect) but some kind of a number (or collection of numbers) that we want to track over time.

- The decision whether to fail is harder. Since output might be noisy, you need to have either quite relaxed bounds (and miss small regressions) or try to enforce stronger bounds (and suffer from the flakiness and maintenance overhead).

So for the purpose of "I have a small change and want to check its effect on compiler performance and expect, e.g., ~1% difference", the model of running benchmarks separately from tests is much nicer. I can run them when I'm not doing anything else on the computer and then easily compare the results (that's what I usually do for nofib). For tracking the performance over time, one could set something up to run the benchmarks when idle. (isn't that what perf.haskell.org is doing?)

Due to that, if we want to extend tests/perf/compiler to support this use case, I think we should include there benchmarks that are *not* tests (and are not included in ./validate), but provide some easy tool to run all of them and give you a quick comparison of what's changed. To a certain degree this would then be orthogonal to the improvements suggested in the ticket. But we could probably reuse some things (e.g., dumping .csv files for perf metrics?)

How should we proceed? Should I open a new ticket focused on this? (maybe we could try to figure out all the details there?)

Thanks,
Michal
Re: Measuring performance of GHC
Hi,

> ... to compile it with a profiled GHC and look at the report?

How hard is it to build hackage or stackage with a profiled ghc? (Does it require ghc magic, or can I do it?)

> ... some obvious sub-optimal algorithms in GHC.

Obvious to whom? Do you mean the sub-optimality is already known, or that it would become obvious once the reports are there?

Even without profiling - does hackage collect timing information from its automated builds? What needs to be done to add timing information in places like https://hackage.haskell.org/package/obdd-0.6.1/reports/1 ?

- J.W.
Re: Measuring performance of GHC
> | - One of the core issues I see in day to day programming (even though
> |   not necessarily with haskell right now) is that the spare time I have
> |   to file bug reports, boil down performance regressions etc. and file
> |   them with open source projects is not paid for and hence minimal.
> |   Hence whenever the tools I use make it really easy for me to file a
> |   bug, performance regression or fix something that takes the least time,
> |   the chances of me being able to help out increase greatly. This was one
> |   of the ideas behind using just pull requests.
> |   E.g. this code seems to be really slow, or has subjectively regressed
> |   in compilation time. I also feel confident I can legally share this
> |   code snippet. So I just create a quick pull request with a short
> |   description, and then carry on with whatever pressing task I’m trying
> |   to solve right now.
>
> There's the same difficulty at the other end too - people who might fix
> perf regressions are typically not paid for either. So they (e.g. me) tend
> to focus on things where there is a small repro case, which in turn costs
> work to produce. E.g. #12745, which I fixed recently in part because
> thomie found a lovely small example.
>
> So I'm a bit concerned that lowering the barrier to entry for perf reports
> might not actually lead to better perf. (But undeniably the suite we built
> up would be a Good Thing, so we'd be a bit further forward.)
>
> Simon

I did not intend to imply that there was a surplus of time on the other end :)

Whether this would result in a bunch of tiny test cases that can pinpoint the underlying issue, I'm not certain. But say we tagged the test cases (e.g., uses TH, uses GADTs, uses X, Y and Z) and ran these samples on every commit or every other commit (whatever the available hardware would allow the test suite to run on, maybe even backtesting where possible); regressions w.r.t. subsets might then be identifiable, e.g. "this commit made test cases predominantly with GADTs spike".

Worst case scenario, we have to declare defeat, decide that this approach has not produced any viable results, and we wasted the time of contributors providing the samples. On the other hand, we would never know without the samples, as they would never have been provided in the first place?

Cheers,
moritz
RE: Measuring performance of GHC
| - One of the core issues I see in day to day programming (even though
|   not necessarily with haskell right now) is that the spare time I have
|   to file bug reports, boil down performance regressions etc. and file
|   them with open source projects is not paid for and hence minimal.
|   Hence whenever the tools I use make it really easy for me to file a
|   bug, performance regression or fix something that takes the least time,
|   the chances of me being able to help out increase greatly. This was one
|   of the ideas behind using just pull requests.
|   E.g. this code seems to be really slow, or has subjectively regressed
|   in compilation time. I also feel confident I can legally share this
|   code snippet. So I just create a quick pull request with a short
|   description, and then carry on with whatever pressing task I’m trying
|   to solve right now.

There's the same difficulty at the other end too - people who might fix perf regressions are typically not paid for either. So they (e.g. me) tend to focus on things where there is a small repro case, which in turn costs work to produce. E.g. #12745, which I fixed recently in part because thomie found a lovely small example.

So I'm a bit concerned that lowering the barrier to entry for perf reports might not actually lead to better perf. (But undeniably the suite we built up would be a Good Thing, so we'd be a bit further forward.)

Simon
Re: Measuring performance of GHC
Hi,

I see the following challenges here, which have partially been touched by the discussion in the mentioned proposal:

- The tests we are looking at might be quite time intensive (lots of modules that take substantial time to compile). Is this practical to run when people locally execute nofib to get *some* idea of the performance implications? Where is the threshold for the total execution time of running nofib?

- One of the core issues I see in day to day programming (even though not necessarily with haskell right now) is that the spare time I have to file bug reports, boil down performance regressions etc. and file them with open source projects is not paid for and hence minimal. Hence whenever the tools I use make it really easy for me to file a bug, performance regression or fix something that takes the least time, the chances of me being able to help out increase greatly. This was one of the ideas behind using just pull requests. E.g. this code seems to be really slow, or has subjectively regressed in compilation time; I also feel confident I can legally share this code snippet; so I just create a quick pull request with a short description, and then carry on with whatever pressing task I'm trying to solve right now.

- Making sure that measurements are reliable. (E.g. running on a dedicated machine with no other applications interfering.) I assume Joachim has quite some experience here.

Thanks.

Cheers,
Moritz

> On Dec 6, 2016, at 9:44 AM, Ben Gamari wrote:
>
> Michal Terepeta writes:
>
>> Interesting! I must have missed this proposal. It seems that it didn't
>> meet with much enthusiasm though (but it also proposes to have a
>> completely separate repo on github).
>>
>> Personally, I'd be happy with something more modest:
>> - A collection of modules/programs that are more representative of real
>>   Haskell programs and stress various aspects of the compiler.
>>   (this seems to be a weakness of nofib, where >90% of modules compile
>>   in less than 0.4s)
>
> This would be great.
>
>> - A way to compile all of those and do "before and after" comparisons
>>   easily. To measure the time, we should probably try to compile each
>>   module at least a few times. (it seems that this is not currently
>>   possible with `tests/perf/compiler` and nofib only compiles the
>>   programs once AFAICS)
>>
>> Looking at the comments on the proposal from Moritz, most people would
>> prefer to extend/improve nofib or `tests/perf/compiler` tests. So I
>> guess the main question is - what would be better:
>> - Extending nofib with modules that are compile only (i.e., not
>>   runnable) and focus on stressing the compiler?
>> - Extending `tests/perf/compiler` with ability to run all the tests and
>>   do easy "before and after" comparisons?
>
> I don't have a strong opinion on which of these would be better.
> However, I would point out that currently the tests/perf/compiler tests
> are extremely labor-intensive to maintain while doing relatively little
> to catch performance regressions. There are a few issues here:
>
>  * some tests aren't very reproducible between runs, meaning that
>    contributors sometimes don't catch regressions in their local
>    validations
>  * many tests aren't very reproducible between platforms and all tests
>    are inconsistent between differing word sizes. This means that we end
>    up having many sets of expected performance numbers in the testsuite.
>    In practice nearly all of these except 64-bit Linux are out-of-date.
>  * our window-based acceptance criterion for performance metrics doesn't
>    catch most regressions, which typically bump allocations by a couple
>    percent or less (whereas the acceptance thresholds range from 5% to
>    20%). This means that the testsuite fails to catch many deltas, only
>    failing when some unlucky person finally pushes the number over the
>    threshold.
>
> Joachim and I discussed this issue a few months ago at Hac Phi; he had
> an interesting approach to tracking expected performance numbers which
> may both alleviate these issues and reduce the maintenance burden that
> the tests pose. I wrote down some terse notes in #12758.
>
> Cheers,
>
> - Ben
Re: Measuring performance of GHC
Michal Terepeta writes: > Interesting! I must have missed this proposal. It seems that it didn't meet > with much enthusiasm though (but it also proposes to have a completely > separate > repo on github). > > Personally, I'd be happy with something more modest: > - A collection of modules/programs that are more representative of real > Haskell programs and stress various aspects of the compiler. > (this seems to be a weakness of nofib, where >90% of modules compile > in less than 0.4s) This would be great. > - A way to compile all of those and do "before and after" comparisons > easily. To measure the time, we should probably try to compile each > module at least a few times. (it seems that this is not currently > possible with `tests/perf/compiler` and > nofib only compiles the programs once AFAICS) > > Looking at the comments on the proposal from Moritz, most people would > prefer to > extend/improve nofib or `tests/perf/compiler` tests. So I guess the main > question is - what would be better: > - Extending nofib with modules that are compile only (i.e., not > runnable) and focus on stressing the compiler? > - Extending `tests/perf/compiler` with ability to run all the tests and do > easy "before and after" comparisons? > I don't have a strong opinion on which of these would be better. However, I would point out that currently the tests/perf/compiler tests are extremely labor-intensive to maintain while doing relatively little to catch performance regressions. There are a few issues here: * some tests aren't very reproducible between runs, meaning that contributors sometimes don't catch regressions in their local validations * many tests aren't very reproducible between platforms and all tests are inconsistent between differing word sizes. This means that we end up having many sets of expected performance numbers in the testsuite. In practice nearly all of these except 64-bit Linux are out-of-date. 
* our window-based acceptance criterion for performance metrics doesn't catch most regressions, which typically bump allocations by a couple percent or less (whereas the acceptance thresholds range from 5% to 20%). This means that the testsuite fails to catch many deltas, only failing when some unlucky person finally pushes the number over the threshold. Joachim and I discussed this issue a few months ago at Hac Phi; he had an interesting approach to tracking expected performance numbers which may both alleviate these issues and reduce the maintenance burden that the tests pose. I wrote down some terse notes in #12758. Cheers, - Ben
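Ben's point about the window-based acceptance criterion compounds over time: a stream of small regressions stays inside the window until one unlucky commit finally crosses it. A small sketch of this effect (illustrative Python, not from the thread; the 2% and 10% figures are hypothetical numbers within the thresholds Ben quotes):

```python
# Illustrative sketch: why a fixed acceptance window around a stored
# expected value lets small regressions accumulate silently.
# The per-commit regression and window size below are hypothetical.

def commits_until_failure(baseline, per_commit_regression, window):
    """Count how many commits of steady small regressions land before
    the windowed check (deviation from the stored expected value
    exceeding `window`) finally fails."""
    value = baseline
    commits = 0
    while abs(value - baseline) / baseline <= window:
        value *= 1.0 + per_commit_regression
        commits += 1
    return commits

if __name__ == "__main__":
    # A 2% allocation regression per commit against a 10% window:
    # several commits pass the testsuite before the threshold trips.
    print(commits_until_failure(1_000_000, 0.02, 0.10))
```

Only the commit that finally trips the window gets blamed, even though the cost was spread over all of its predecessors.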
Re: Measuring performance of GHC
Michal Terepeta writes: > Hi everyone, > > I've been running nofib a few times recently to see the effect of some > changes > on compile time (not the runtime of the compiled program). And I've started > wondering how representative nofib is when it comes to measuring compile > time > and compiler allocations? It seems that most of the nofib programs compile > really quickly... > > Is there some collections of modules/libraries/applications that were put > together with the purpose of benchmarking GHC itself and I just haven't > seen/found it? > Sadly no; I've put out a number of calls for minimal programs (e.g. small, fairly free-standing real-world applications) but the response hasn't been terribly strong. I frankly can't blame people for not wanting to take the time to strip out dependencies from their working programs. Joachim and I have previously discussed the possibility of manually collecting a set of popular Hackage libraries on a regular basis for use in compiler performance characterization. Cheers, - Ben
Re: Measuring performance of GHC
On Mon, Dec 5, 2016 at 12:00 PM Moritz Angermann wrote: > Hi, > > I’ve started the GHC Performance Regression Collection Proposal[1] > (Rendered [2]) > a while ago with the idea of having a trivially community curated set of > small[3] > real-world examples with performance regressions. I might be at fault here > for > not describing this to the best of my abilities. Thus if there is > interested, and > this sounds like an useful idea, maybe we should still pursue this > proposal? > > Cheers, > moritz > > [1]: https://github.com/ghc-proposals/ghc-proposals/pull/26 > [2]: > https://github.com/angerman/ghc-proposals/blob/prop/perf-regression/proposals/-perf-regression.rst > [3]: for some definition of small > Interesting! I must have missed this proposal. It seems that it didn't meet with much enthusiasm though (but it also proposes to have a completely separate repo on github). Personally, I'd be happy with something more modest: - A collection of modules/programs that are more representative of real Haskell programs and stress various aspects of the compiler. (this seems to be a weakness of nofib, where >90% of modules compile in less than 0.4s) - A way to compile all of those and do "before and after" comparisons easily. To measure the time, we should probably try to compile each module at least a few times. (it seems that this is not currently possible with `tests/perf/compiler` and nofib only compiles the programs once AFAICS) Looking at the comments on the proposal from Moritz, most people would prefer to extend/improve nofib or `tests/perf/compiler` tests. So I guess the main question is - what would be better: - Extending nofib with modules that are compile only (i.e., not runnable) and focus on stressing the compiler? - Extending `tests/perf/compiler` with ability to run all the tests and do easy "before and after" comparisons? 
Personally, I'm slightly leaning towards `tests/perf/compiler`, since this would allow sharing the same module both as a test for `validate` and as a benchmark for comparing the performance of the compiler before and after a change. What do you think? Thanks, Michal
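The "compile each module a few times and compare before/after" idea proposed earlier in the thread could look roughly like the following. This is a hypothetical harness, not existing tooling: `time_command` and the stand-in workload are inventions for illustration, and a real run would pass an actual compiler invocation such as `["ghc", "-O", "M.hs"]`.

```python
# Hedged sketch of a tiny before/after timing harness: run a command
# several times and keep the fastest sample, which is less noisy than
# a single measurement. Nothing here is GHC-specific.
import subprocess
import sys
import time

def time_command(cmd, runs=3):
    """Run `cmd` `runs` times; return the minimum wall-clock time."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        samples.append(time.perf_counter() - start)
    return min(samples)

if __name__ == "__main__":
    # Stand-in workload so the sketch is runnable anywhere; substitute
    # a real GHC invocation to time an actual compile.
    t = time_command([sys.executable, "-c", "pass"])
    print(f"fastest of 3 runs: {t:.3f}s")
```

Running the same harness on the same module set before and after a patch gives the per-module deltas that neither nofib nor `tests/perf/compiler` currently produces directly.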
Re: Measuring performance of GHC
Hi, I started the GHC Performance Regression Collection Proposal[1] (Rendered [2]) a while ago with the idea of having a trivially community-curated set of small[3] real-world examples with performance regressions. I might be at fault here for not describing this to the best of my abilities. Thus if there is interest, and this sounds like a useful idea, maybe we should still pursue this proposal? Cheers, moritz [1]: https://github.com/ghc-proposals/ghc-proposals/pull/26 [2]: https://github.com/angerman/ghc-proposals/blob/prop/perf-regression/proposals/-perf-regression.rst [3]: for some definition of small > On Dec 5, 2016, at 6:31 PM, Simon Peyton Jones via ghc-devs > wrote: > > If not, maybe we should create something? IMHO it sounds reasonable to have > > separate benchmarks for: > > - Performance of GHC itself. > > - Performance of the code generated by GHC. > > > I think that would be great, Michal. We have a small and unrepresentative > sample in testsuite/tests/perf/compiler > > Simon > > From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Michal > Terepeta > Sent: 04 December 2016 19:47 > To: ghc-devs > Subject: Measuring performance of GHC > > Hi everyone, > > > > I've been running nofib a few times recently to see the effect of some changes > > on compile time (not the runtime of the compiled program). And I've started > > wondering how representative nofib is when it comes to measuring compile time > > and compiler allocations? It seems that most of the nofib programs compile > > really quickly... > > > > Is there some collections of modules/libraries/applications that were put > > together with the purpose of benchmarking GHC itself and I just haven't > > seen/found it? > > > > If not, maybe we should create something? IMHO it sounds reasonable to have > > separate benchmarks for: > > - Performance of GHC itself. > > - Performance of the code generated by GHC.
> > > > Thanks, > > Michal
RE: Measuring performance of GHC
If not, maybe we should create something? IMHO it sounds reasonable to have separate benchmarks for: - Performance of GHC itself. - Performance of the code generated by GHC. I think that would be great, Michal. We have a small and unrepresentative sample in testsuite/tests/perf/compiler Simon From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Michal Terepeta Sent: 04 December 2016 19:47 To: ghc-devs Subject: Measuring performance of GHC Hi everyone, I've been running nofib a few times recently to see the effect of some changes on compile time (not the runtime of the compiled program). And I've started wondering how representative nofib is when it comes to measuring compile time and compiler allocations? It seems that most of the nofib programs compile really quickly... Is there some collections of modules/libraries/applications that were put together with the purpose of benchmarking GHC itself and I just haven't seen/found it? If not, maybe we should create something? IMHO it sounds reasonable to have separate benchmarks for: - Performance of GHC itself. - Performance of the code generated by GHC. Thanks, Michal
Re: Measuring performance of GHC
Seems like a good idea, for sure. I have not, but I might eventually. On 4 Dec 2016 21:52, "Joachim Breitner" wrote: > Hi, > > did you try to compile it with a profiled GHC and look at the report? I > would not be surprised if it would point to some obvious sub-optimal > algorithms in GHC. > > Greetings, > Joachim > > Am Sonntag, den 04.12.2016, 20:04 + schrieb David Turner: > > Nod nod. > > > > amazonka-ec2 has a particularly painful module containing just a > > couple of hundred type definitions and associated instances and > > stuff. None of the types is enormous. There's an issue open on > > GitHub[1] where I've guessed at some possible better ways of > > splitting the types up to make GHC's life easier, but it'd be great > > if it didn't need any such shenanigans. It's a bit of a pathological > > case: auto-generated 15kLoC and lots of deriving, but I still feel it > > should be possible to compile with less than 2.8GB RSS. > > > > [1] https://github.com/brendanhay/amazonka/issues/304 > > > > Cheers, > > > > David > > > > On 4 Dec 2016 19:51, "Alan & Kim Zimmerman" > > wrote: > > I agree. > > > > I find compilation time on things with large data structures, such as > > working with the GHC AST via the GHC API get pretty slow. > > > > To the point where I have had to explicitly disable optimisation on > > HaRe, otherwise the build takes too long. > > > > Alan > > > > > > On Sun, Dec 4, 2016 at 9:47 PM, Michal Terepeta > l.com> wrote: > > > Hi everyone, > > > > > > I've been running nofib a few times recently to see the effect of > > > some changes > > > on compile time (not the runtime of the compiled program). And I've > > > started > > > wondering how representative nofib is when it comes to measuring > > > compile time > > > and compiler allocations? It seems that most of the nofib programs > > > compile > > > really quickly... 
> > > > > > Is there some collections of modules/libraries/applications that > > > were put > > > together with the purpose of benchmarking GHC itself and I just > > > haven't > > > seen/found it? > > > > > > If not, maybe we should create something? IMHO it sounds reasonable > > > to have > > > separate benchmarks for: > > > - Performance of GHC itself. > > > - Performance of the code generated by GHC. > > > > > > Thanks, > > > Michal > -- > Joachim “nomeata” Breitner > m...@joachim-breitner.de • https://www.joachim-breitner.de/ > XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F > Debian Developer: nome...@debian.org
Re: Measuring performance of GHC
Hi, did you try to compile it with a profiled GHC and look at the report? I would not be surprised if it would point to some obvious sub-optimal algorithms in GHC. Greetings, Joachim Am Sonntag, den 04.12.2016, 20:04 + schrieb David Turner: > Nod nod. > > amazonka-ec2 has a particularly painful module containing just a > couple of hundred type definitions and associated instances and > stuff. None of the types is enormous. There's an issue open on > GitHub[1] where I've guessed at some possible better ways of > splitting the types up to make GHC's life easier, but it'd be great > if it didn't need any such shenanigans. It's a bit of a pathological > case: auto-generated 15kLoC and lots of deriving, but I still feel it > should be possible to compile with less than 2.8GB RSS. > > [1] https://github.com/brendanhay/amazonka/issues/304 > > Cheers, > > David > > On 4 Dec 2016 19:51, "Alan & Kim Zimmerman" > wrote: > I agree. > > I find compilation time on things with large data structures, such as > working with the GHC AST via the GHC API get pretty slow. > > To the point where I have had to explicitly disable optimisation on > HaRe, otherwise the build takes too long. > > Alan > > > On Sun, Dec 4, 2016 at 9:47 PM, Michal Terepeta l.com> wrote: > > Hi everyone, > > > > I've been running nofib a few times recently to see the effect of > > some changes > > on compile time (not the runtime of the compiled program). And I've > > started > > wondering how representative nofib is when it comes to measuring > > compile time > > and compiler allocations? It seems that most of the nofib programs > > compile > > really quickly... > > > > Is there some collections of modules/libraries/applications that > > were put > > together with the purpose of benchmarking GHC itself and I just > > haven't > > seen/found it? > > > > If not, maybe we should create something? IMHO it sounds reasonable > > to have > > separate benchmarks for: > > - Performance of GHC itself. 
> > - Performance of the code generated by GHC. > > > > Thanks, > > Michal -- Joachim “nomeata” Breitner m...@joachim-breitner.de • https://www.joachim-breitner.de/ XMPP: nome...@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F Debian Developer: nome...@debian.org
Re: Measuring performance of GHC
Nod nod. amazonka-ec2 has a particularly painful module containing just a couple of hundred type definitions and associated instances and stuff. None of the types is enormous. There's an issue open on GitHub[1] where I've guessed at some possible better ways of splitting the types up to make GHC's life easier, but it'd be great if it didn't need any such shenanigans. It's a bit of a pathological case: auto-generated 15kLoC and lots of deriving, but I still feel it should be possible to compile with less than 2.8GB RSS. [1] https://github.com/brendanhay/amazonka/issues/304 Cheers, David On 4 Dec 2016 19:51, "Alan & Kim Zimmerman" wrote: I agree. I find compilation time on things with large data structures, such as working with the GHC AST via the GHC API get pretty slow. To the point where I have had to explicitly disable optimisation on HaRe, otherwise the build takes too long. Alan On Sun, Dec 4, 2016 at 9:47 PM, Michal Terepeta wrote: > Hi everyone, > > I've been running nofib a few times recently to see the effect of some > changes > on compile time (not the runtime of the compiled program). And I've started > wondering how representative nofib is when it comes to measuring compile > time > and compiler allocations? It seems that most of the nofib programs compile > really quickly... > > Is there some collections of modules/libraries/applications that were put > together with the purpose of benchmarking GHC itself and I just haven't > seen/found it? > > If not, maybe we should create something? IMHO it sounds reasonable to have > separate benchmarks for: > - Performance of GHC itself. > - Performance of the code generated by GHC. 
> > Thanks, > Michal
Re: Measuring performance of GHC
I agree. I find compilation time on things with large data structures, such as working with the GHC AST via the GHC API, gets pretty slow. To the point where I have had to explicitly disable optimisation on HaRe, otherwise the build takes too long. Alan On Sun, Dec 4, 2016 at 9:47 PM, Michal Terepeta wrote: > Hi everyone, > > I've been running nofib a few times recently to see the effect of some > changes > on compile time (not the runtime of the compiled program). And I've started > wondering how representative nofib is when it comes to measuring compile > time > and compiler allocations? It seems that most of the nofib programs compile > really quickly... > > Is there some collections of modules/libraries/applications that were put > together with the purpose of benchmarking GHC itself and I just haven't > seen/found it? > > If not, maybe we should create something? IMHO it sounds reasonable to have > separate benchmarks for: > - Performance of GHC itself. > - Performance of the code generated by GHC. > > Thanks, > Michal
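For "performance of GHC itself", compiler allocations are the metric the thread keeps coming back to. GHC's RTS can report them via `+RTS -t --machine-readable -RTS`, which emits a list of (key, value) string pairs. A hedged sketch of pulling out the allocation figure; the SAMPLE text below is abbreviated example data in that pair-list shape, not a real measurement:

```python
# Hedged sketch: extract "bytes allocated" from GHC's
# `+RTS -t --machine-readable` summary. The pair-list text happens to
# also be a valid Python literal, so ast.literal_eval can parse it.
import ast

# Abbreviated sample in the machine-readable format; not real data.
SAMPLE = ' [("bytes allocated", "506448"), ("num_GCs", "1"), ("peak_megabytes_allocated", "1")]'

def bytes_allocated(stats_text):
    """Parse the (key, value) pair list; return allocations as an int."""
    pairs = dict(ast.literal_eval(stats_text.strip()))
    return int(pairs["bytes allocated"])

if __name__ == "__main__":
    print(bytes_allocated(SAMPLE))
```

Tracking this number per module across two compiler builds would give the "before and after" allocation comparison discussed above, independent of wall-clock noise.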