Re: On CI

2021-03-24 Thread Andreas Klebinger
> What about the case where the rebase *lessens* the improvement? That is, you're expecting these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a blanket "accept improvements" won't tell you. I don't think that scenario currently triggers a CI failure. So this wouldn'

Re: On CI

2021-03-24 Thread Moritz Angermann
Yes, this is exactly one of the issues that marge might run into as well, the aggregate ends up performing differently from the individual ones. Now we have marge to ensure that at least the aggregate builds together, which is the whole point of these merge trains. Not to end up in a situation wher

Re: On CI

2021-03-24 Thread Richard Eisenberg
What about the case where the rebase *lessens* the improvement? That is, you're expecting these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a blanket "accept improvements" won't tell you. I'm not hard against this proposal, because I know precise tracking has its o

Re: On CI

2021-03-24 Thread Andreas Klebinger
After the idea of letting marge accept unexpected perf improvements and looking at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4759 which failed because of a single test, for a single build flavour crossing the improvement threshold where CI fails after rebasing I wondered. When would acc
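As a rough illustration of the failure mode discussed above, here is a minimal sketch of a two-sided acceptance window for a single metric. The names `classify_metric` and `ci_passes`, the 2% tolerance and the sample numbers are all hypothetical; this is not the GHC testsuite driver's actual logic.

```python
def classify_metric(baseline: float, measured: float, tolerance_pct: float) -> str:
    """Classify a measurement against its baseline using a symmetric window.

    Hypothetical helper: a change larger than tolerance_pct in either
    direction is reported, so an unexpected improvement is flagged as well.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    change_pct = (measured - baseline) / baseline * 100.0
    if change_pct > tolerance_pct:
        return "regression"
    if change_pct < -tolerance_pct:
        return "improvement"
    return "within-tolerance"


def ci_passes(result: str, accept_improvements: bool) -> bool:
    """Decide whether a job passes; regressions always fail in this sketch."""
    if result == "within-tolerance":
        return True
    if result == "improvement":
        return accept_improvements
    return False


# Before a rebase the metric sits just inside the window; afterwards it
# crosses the improvement threshold and a strict policy fails the job.
before = classify_metric(1000.0, 985.0, 2.0)
after = classify_metric(1000.0, 975.0, 2.0)
print(before, ci_passes(before, accept_improvements=False))
print(after, ci_passes(after, accept_improvements=False))
print(after, ci_passes(after, accept_improvements=True))
```

Under the "accept improvements" policy the job passes after the rebase, but, as noted elsewhere in the thread, such a policy on its own would not report an expected improvement that shrinks.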

RE: On CI

2021-03-18 Thread Ben Gamari
Simon Peyton Jones via ghc-devs writes: > > We need to do something about this, and I'd advocate for just not making > > stats fail with marge. > > Generally I agree. One point you don’t mention is that our perf tests > (which CI forces us to look at assiduously) are often pretty weird > cases.

Re: On CI

2021-03-18 Thread Ben Gamari
Karel Gardas writes: > On 3/17/21 4:16 PM, Andreas Klebinger wrote: >> Now that isn't really an issue anyway I think. The question is rather is >> 2% a large enough regression to worry about? 5%? 10%? > > 5-10% is still around system noise even on a lightly loaded workstation. > Not sure if CI is n

Re: On CI

2021-03-18 Thread John Ericson
My guess is most of the "noise" is not run time, but the compiled code changing in hard to predict ways. https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1776/diffs for example was a very small PR that took *months* of on-off work to get passing metrics tests. In the end, binding `is_boot`

Re: On CI

2021-03-18 Thread davean
I left the wiggle room for things like longer wall time causing more time events in the IO Manager/RTS which can be a thermal/HW issue. They're small and indirect though -davean On Thu, Mar 18, 2021 at 1:37 PM Sebastian Graf wrote: > To be clear: All performance tests that run as part of CI mea

Re: On CI

2021-03-18 Thread Sebastian Graf
To be clear: All performance tests that run as part of CI measure allocations only. No wall clock time. Those measurements are (mostly) deterministic and reproducible between compiles of the same worktree and not impacted by thermal issues/hardware at all. Am Do., 18. März 2021 um 18:09 Uhr schrie

Re: On CI

2021-03-18 Thread davean
That really shouldn't be near system noise for a well constructed performance test. You might be seeing things like thermal issues, etc though - good benchmarking is a serious subject. Also we're not talking wall clock tests, we're talking specific metrics. The machines do tend to be bare metal, bu

Re: On CI

2021-03-17 Thread Karel Gardas
On 3/17/21 4:16 PM, Andreas Klebinger wrote: > Now that isn't really an issue anyway I think. The question is rather is > 2% a large enough regression to worry about? 5%? 10%? 5-10% is still around system noise even on a lightly loaded workstation. Not sure if CI is not run on some shared cloud reso

Re: On CI

2021-03-17 Thread Merijn Verstraaten
On 17 Mar 2021, at 16:16, Andreas Klebinger wrote: > > While I fully agree with this. We should *always* want to know if a small > synthetic benchmark regresses by a lot. > Or in other words we don't want CI to accept such a regression for us ever, > but the developer of a patch should need to e

Re: On CI

2021-03-17 Thread Andreas Klebinger
> I'd be quite happy to accept a 25% regression on T9872c if it yielded a 1% improvement on compiling Cabal. T9872 is very very very strange! (Maybe if *all* the T9872 tests regressed, I'd be more worried.) While I fully agree with this. We should *always* want to know if a small synthetic benchma

Re: On CI

2021-03-17 Thread John Ericson
Yes, I think the counter point of "automating what Ben does" so people besides Ben can do it is very important. In this case, I think a good thing we could do is asynchronously build more of master post-merge, such as use the perf stats to automatically bisect anything that is fishy, including

Re: On CI

2021-03-17 Thread Sebastian Graf
Re: Performance drift: I opened https://gitlab.haskell.org/ghc/ghc/-/issues/17658 a while ago with an idea of how to measure drift a bit better. It's basically an automatically checked version of "Ben stares at performance reports every two weeks and sees that T9872 has regressed by 10% since 9.0"

Re: On CI

2021-03-17 Thread Richard Eisenberg
> On Mar 17, 2021, at 6:18 AM, Moritz Angermann > wrote: > > But what do we expect of patch authors? Right now if five people write > patches to GHC, and each of them eventually manage to get their MRs green, > after a long review, they finally see it assigned to marge, and then it > starts

Re: On CI

2021-03-17 Thread Moritz Angermann
I am not advocating to drop perf tests during merge requests, I just want them to not be fatal for marge batches. Yes this means that a bunch of unrelated merge requests all could be fine wrt the perf checks per merge request, but the aggregate might fail perf. And then subsequently the next MR
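The point about aggregates can be made concrete with a small sketch. It assumes, purely for illustration, that the per-MR changes to one metric compose multiplicatively; real patches interact in less predictable ways. The function names, the 2% tolerance and the sample changes are hypothetical.

```python
from math import prod


def within_tolerance(change_pct: float, tolerance_pct: float) -> bool:
    """True if a relative change (in percent) stays inside the window."""
    return abs(change_pct) <= tolerance_pct


def combined_change_pct(changes_pct: list[float]) -> float:
    """Combine per-MR changes, assuming they compose multiplicatively.

    This is a simplifying assumption made for the sketch, not a claim
    about how changes to GHC actually interact.
    """
    factor = prod(1.0 + c / 100.0 for c in changes_pct)
    return (factor - 1.0) * 100.0


tolerance = 2.0                 # hypothetical acceptance window, in percent
mr_changes = [1.5, 1.2, 0.9]    # hypothetical per-MR changes to one metric

# Each MR passes on its own ...
print([within_tolerance(c, tolerance) for c in mr_changes])

# ... but the batch as a whole can land outside the window.
batch = combined_change_pct(mr_changes)
print(round(batch, 2), within_tolerance(batch, tolerance))
```

In this toy model every individual MR is green while the batch is not, which is the situation in which a fatal perf check blocks a marge batch without any single MR being at fault.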

Re: On CI

2021-03-17 Thread Spiwack, Arnaud
Ah, so it was really two identical pipelines (one for the branch where Margebot batches commits, and one for the MR that Margebot creates before merging). That's indeed a non-trivial amount of purely wasted computer-hours. Taking a step back, I am inclined to agree with the proposal of not checkin

RE: On CI

2021-03-17 Thread Simon Peyton Jones via ghc-devs
We need to do something about this, and I'd advocate for just not making stats fail with marge. Generally I agree. One point you don’t mention is that our perf tests (which CI forces us to look at assiduously) are often pretty weird cases. So there is at least a danger that these more exotic

Re: On CI

2021-03-17 Thread Moritz Angermann
*why* is a very good question. The MR fixing it is here: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5275 On Wed, Mar 17, 2021 at 4:26 PM Spiwack, Arnaud wrote: > Then I have a question: why are there two pipelines running on each merge > batch? > > On Wed, Mar 17, 2021 at 9:22 AM Moritz

Re: On CI

2021-03-17 Thread Spiwack, Arnaud
Then I have a question: why are there two pipelines running on each merge batch? On Wed, Mar 17, 2021 at 9:22 AM Moritz Angermann wrote: > No it wasn't. It was about the stat failures described in the next > paragraph. I could have been more clear about that. My apologies! > > On Wed, Mar 17, 20

Re: On CI

2021-03-17 Thread Moritz Angermann
No it wasn't. It was about the stat failures described in the next paragraph. I could have been more clear about that. My apologies! On Wed, Mar 17, 2021 at 4:14 PM Spiwack, Arnaud wrote: > > and if either of both (see below) failed, marge's merge would fail as well. >> > > Re: “see below” is th

Re: On CI

2021-03-17 Thread Spiwack, Arnaud
> and if either of both (see below) failed, marge's merge would fail as well. > Re: “see below” is this referring to a missing part of your email?

Re: On CI

2021-02-22 Thread John Ericson
*To:* ghc-devs <ghc-devs@haskell.org> *Subject:* Re: On CI I'm not opposed to some effort going into this, but I would strongly oppose putting all our effort there. Incremental CI can cut multiple hours to < mere minutes, especially with the test suite being

Re: On CI

2021-02-22 Thread Spiwack, Arnaud
n > > > > *From:* ghc-devs *On Behalf Of *John > Ericson > *Sent:* 22 February 2021 05:53 > *To:* ghc-devs > *Subject:* Re: On CI > > > > I'm not opposed to some effort going into this, but I would strongly > oppose putting all our effort there. Inc

RE: On CI

2021-02-22 Thread Simon Peyton Jones via ghc-devs
r 10) I think we need to do less compiling - hence incremental CI. Simon From: ghc-devs On Behalf Of John Ericson Sent: 22 February 2021 05:53 To: ghc-devs Subject: Re: On CI I'm not opposed to some effort going into this, but I would strongly oppose putting all our effort there. Incr

Re: On CI

2021-02-21 Thread John Ericson
the aggressive caching scheme. Just my 2p Josef *From:* ghc-devs <ghc-devs-boun...@haskell.org> on behalf of Simon Peyton Jones via ghc-devs <ghc-devs@haskell.org> *Sent:* F

Re: On CI

2021-02-19 Thread Richard Eisenberg
es" and restart building the libraries. That way we can > validate that the build failure was a true build failure and not just due to > the aggressive caching scheme. > > Just my 2p > > Josef > > From: ghc-devs <ghc-devs-boun...@haskell.org> on

Re: On CI

2021-02-19 Thread Sebastian Graf
--- > *From:* ghc-devs on behalf of Simon Peyton > Jones via ghc-devs > *Sent:* Friday, February 19, 2021 8:57 AM > *To:* John Ericson ; ghc-devs < > ghc-devs@haskell.org> > *Subject:* RE: On CI > > >1. Building and testing happen together. When test

Re: On CI

2021-02-19 Thread Josef Svenningsson via ghc-devs
nt: Friday, February 19, 2021 8:57 AM To: John Ericson ; ghc-devs Subject: RE: On CI 1. Building and testing happen together. When tests fail spuriously, we also have to rebuild GHC in addition to re-running the tests. That's pure waste. https://gitlab.haskell.org/ghc/ghc/-/i

RE: On CI

2021-02-19 Thread Ben Gamari
Simon Peyton Jones via ghc-devs writes: >> 1. Building and testing happen together. When tests fail >> spuriously, we also have to rebuild GHC in addition to re-running >> the tests. That's pure waste. >> https://gitlab.haskell.org/ghc/ghc/-/issues/13897 tracks this more >> or less.

RE: On CI

2021-02-19 Thread Simon Peyton Jones via ghc-devs
imon From: ghc-devs On Behalf Of John Ericson Sent: 19 February 2021 03:19 To: ghc-devs Subject: Re: On CI I am also wary of us deferring checking whole platforms and what not. I think that's just kicking the can down the road, and will result in more variance and uncertainty. It might

Re: On CI

2021-02-18 Thread John Ericson
I am also wary of us deferring checking whole platforms and what not. I think that's just kicking the can down the road, and will result in more variance and uncertainty. It might be alright for those authoring PRs, but it will make Ben's job keeping the system running even more grueling.

Re: On CI

2021-02-18 Thread Ben Gamari
Moritz Angermann writes: > At this point I believe we have ample Linux build capacity. Darwin looks > pretty good as well; the ~4 M1s we have should in principle also be able to > build x86_64-darwin at acceptable speeds. Although on Big Sur only. > > The aarch64-Linux story is a bit constrained by

Re: On CI

2021-02-18 Thread Ben Gamari
Apologies for the latency here. This thread has required a fair amount of reflection. Sebastian Graf writes: > Hi Moritz, > > I, too, had my gripes with CI turnaround times in the past. Here's a > somewhat radical proposal: > >- Run "full-build" stage builds only on Marge MRs. Then we can as

Re: On CI

2021-02-18 Thread Moritz Angermann
I'm glad to report that my math was off. But it was off only because I assumed that we'd successfully build all windows configurations, which we of course don't. Thus some builds fail faster. Sylvain also provided a windows machine temporarily, until it expired. This led to a slew of new windows w

Re: On CI

2021-02-17 Thread Moritz Angermann
At this point I believe we have ample Linux build capacity. Darwin looks pretty good as well; the ~4 M1s we have should in principle also be able to build x86_64-darwin at acceptable speeds. Although on Big Sur only. The aarch64-Linux story is a bit constrained by powerful and fast CI machines but p

Re: On CI

2021-02-17 Thread Sebastian Graf
Hi Moritz, I, too, had my gripes with CI turnaround times in the past. Here's a somewhat radical proposal: - Run "full-build" stage builds only on Marge MRs. Then we can assign to Marge much earlier, but probably have to do a bit more of (manual) bisecting of spoiled Marge batches.