Hi there! Just a quick update on our CI situation. Ben, John, Davean and I had a discussion about CI yesterday: what we can do about it, as well as some notes on why we are frustrated with it. This is an open invitation to anyone who earnestly wants to work on CI: please come forward and help! We'd be glad to have more people involved!
First the good news: over the last few weeks we've seen that we *can* improve CI performance quite substantially, and the goal is now to have an MR go through CI within at most 3 hours. There are some ideas on how to make this even faster, especially on wide (high core count) machines; however, that will take a bit more time.

Now to the more thorny issue: stat failures. We do not want GHC to regress, and I believe everyone is on board with that mission. Yet we have just witnessed a train of marge trials all fail due to a -2% regression in a few tests, thus blocking anything from getting into master for at least another day. This is (in my opinion) not acceptable! We just had five days of nothing working because master was broken and subsequently all CI pipelines kept failing; we have thus effectively wasted a week. We can mitigate the latter part by enforcing marge for all merges to master (and with faster pipeline turnaround times this might be more palatable than with 9-12h turnaround times -- when you need to get something done! ha!), but that won't help us with issues where marge can't find a set of buildable MRs, because she just keeps hitting a combination of MRs that somehow together increase or decrease metrics.

We have three knobs to adjust:

- Make GHC build faster / make the testsuite run faster. There is some rather interesting work going on about parallelizing (earlier) during builds. We've also seen that we've wasted enormous amounts of time during darwin builds in the kernel, because of a bug in the test driver.

- Use faster hardware. We've seen that this alone can cut windows build times from 220min to 80min.

- Reduce the number of builds. We used to build two pipelines for each marge merge, and if either of the two failed, marge's merge would fail as well. So not only did we build twice as much as we needed to, we also doubled our chances of hitting bogus build failures.

We need to do something about this, and I'd advocate for simply not making stat failures block marge. Build errors should of course still fail, but stat failures should not (a rough sketch of what I mean is at the end of this mail). We would then have a separate dashboard (Ben has some old code lying around for this, which someone would need to pick up and polish, ...) that tracks GHC's performance for each commit to master, with easy access from the dashboard to the offending commit. We will also need to consider the implications of synthetic micro benchmarks, as opposed to, say, building Cabal or other packages, which reflect more closely the real-world experience of users using GHC.

I will try to provide a data-driven report on GHC's CI on a bi-weekly or monthly basis going forward (we will have to see what the cost of writing it up is, and how useful it turns out to be). My sincere hope is that it will help us better understand our CI situation, instead of just having some vague complaints about it.

Cheers,
 Moritz
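P.S.: to make the "don't block marge on stats" idea a bit more concrete, here is a minimal sketch of what I have in mind. This is not the actual testsuite driver code; all names here (MARGE_RUN, check_metric, MetricResult) are made up for illustration. The point is only that a metric check compares a measurement against a baseline within a tolerance, and under a marge run a regression is reported as a warning (for the dashboard) instead of failing the pipeline, while everything else still fails as usual.

    # Hypothetical sketch, not GHC's real perf-check code: compare a metric
    # against a baseline within a tolerance, and demote out-of-tolerance
    # results to a warning when running under marge.
    import os
    from dataclasses import dataclass

    @dataclass
    class MetricResult:
        name: str
        baseline: float
        measured: float
        tolerance_pct: float  # allowed deviation, e.g. 2.0 means +/- 2%

        @property
        def deviation_pct(self) -> float:
            return (self.measured - self.baseline) / self.baseline * 100.0

        @property
        def within_tolerance(self) -> bool:
            return abs(self.deviation_pct) <= self.tolerance_pct

    def check_metric(result: MetricResult, is_marge_run: bool) -> bool:
        """Return True if the pipeline should treat this metric check as passed."""
        if result.within_tolerance:
            return True
        msg = (f"{result.name}: {result.deviation_pct:+.1f}% vs baseline "
               f"(tolerance +/-{result.tolerance_pct}%)")
        if is_marge_run:
            # Record the regression (e.g. for a per-commit dashboard),
            # but do not block the merge.
            print(f"WARNING (not blocking marge): {msg}")
            return True
        print(f"FAILURE: {msg}")
        return False

    if __name__ == "__main__":
        marge = os.environ.get("MARGE_RUN", "0") == "1"   # hypothetical flag
        r = MetricResult("T12345 bytes allocated", baseline=1.00e9,
                         measured=1.02e9, tolerance_pct=1.0)
        raise SystemExit(0 if check_metric(r, marge) else 1)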