On 28.01.2019 11:11, Edward Welbourne wrote:
> On 25.01.2019 11:08, Lars Knoll wrote:
>>> The CI problem comes from the fact that if we have a high rate of
>>> stagings to qtbase/dev, we at some point get into a deadlock
>>> situation, even if we disregard any flakiness in the system. That's
>>> because higher rates imply that more changes are tested together.
>>> This in turn increases the risk of rejection of all changes because
>>> of one bad change in the set. So the current system doesn't scale
>>> and basically rate-limits the amount of changes going into a branch
>>> (or worse, we end up getting traffic jams where the rate actually
>>> goes down to zero).
>
> My working guess at what the present system does is that it piles up
> a staging branch with everything that gets staged while an
> integration is running; when one integration completes (with maybe
> some modest delay if the staging branch is short), the staging branch
> gets its turn to attempt to integrate (possibly via a rebase onto a
> freshly-integrated branch). I hope someone who knows the actual
> process can describe it in this thread.
>
> If that's reasonably close to true, then we shall indeed get many
> commits piling up in each staging branch, increasing the likelihood
> of failure in the integration attempt. We could mitigate that in
> various ways by tweaking the process.
>
> In particular, we could cap the length of staging branches (perhaps
> with a bit of flexibility to let in commits with the same owner as
> some already in the branch, so that groups of related changes don't
> get split up); once a staging branch hits the cap, we start a fresh
> staging branch. This gives us a queue of staging branches, rather
> than just one, each waiting to be integrated.
>
>>> To me this means we need to seriously rethink that part of our CI
>>> system, and ideally test changes (or patch series) individually and
>>> in parallel. So maybe we should adjust our CI system that way:
>>>
>>> * test changes (or patch series) individually
>>> * if they pass CI, merge or cherry-pick them into the branch
>>> * reject only if the merge/cherry-pick gives conflicts
>
> We could equally run several integrations in parallel and select one
> of those that succeed (probably the one that entered the staging
> queue earliest) to be the new tip; all others that succeed, plus any
> new staging branches grown in the interval, get rebased onto that and
> tested in parallel again. That'll be "wasteful" of Coin resources in
> so far as some branches pass repeatedly before getting accepted, but
> it'll avoid the small risk you describe below. The "speculative"
> integrations being run in parallel with the "will win if several
> succeed" one make the most of Coin having the capacity to run several
> in parallel - assuming it does.
>
>>> This adds a very small risk that two parallel changes don't
>>> conflict during the merge/cherry-pick process, but together cause a
>>> test regression. To help with that, we can simply run a regular
>>> status check on the repo. If this happens, the repo will be blocked
>>> for further testing until someone submits a fix or reverts an
>>> offending change, which is acceptable.
>
> When those events happen, they're going to gum up the whole works
> until they're sorted out; they require manual intervention, Which Is
> Bad.
>
> Robert Loehning (25 January 2019 17:49) replied:
>> Could that be solved by testing the combinations of changes again?
>>
>> * test changes (or patch series) individually
>> * if they pass CI, merge or cherry-pick them into some local branch
>> * reject if the merge/cherry-pick gives conflicts
>> * when time period x has passed or the local branch contains y
>>   changes, test the local branch
>>   good: push the local branch to the public one
>>   bad: repeat step four with a subset of the changes it had before
>>
>> Assuming that y is significantly greater than 1, the added overhead
>> for one more test run seems relatively small to me.
>
> IIUC, you're describing a two-stage integration process: test many
> staging branches in parallel; accumulate successes; combine those and
> re-test; if good, keep. There shall be new staging branches coming
> out of the sausage machine while the earlier composite is going
> through. We have to work out what to do with those. If the composite
> fails, these fresh successes can be combined and tested just as the
> earlier one was; but, if the earlier composite passes, we need to
> rebase the integrations that have passed while it was tested.
> However, all of these have passed previously, so we have fair
> confidence in them; so we can combine them all together on top of the
> prior composite and set about testing this as a second-stage
> composite integration. So I think that has a good chance of working
> well.
>
> A note on merge vs rebase here: when merging several branches that
> have passed first-stage integration, a conflict excludes a whole
> branch, though it's probably caused by one or a few of the commits in
> the branch; whereas rebasing lets you detect which individual commits
> cause the conflicts, so that only those get left out. It also gives
> you a linear history. Given that each staging branch is typically a
> jumble of mostly unrelated commits (albeit there may be a few related
> commits in it), these branches have no "semantic" relevance, so
> aren't valuable to keep in the history. So I'd encourage the use of
> rebase (discarding individual commits when they conflict) rather than
> actual merges (discarding whole branches on conflict); either way,
> it's what I mean by "combine" above.
>
> Suppose Coin is capable of running N+1 integrations in parallel; then
> we'll typically be doing one second-stage integration while (up to) N
> first-stage integrations run in parallel. If the second-stage one
> wins, the (up to N) first-stage successes will be combined together
> on top of it; otherwise, they just get combined as they are; either
> way, they form the new second-stage integration. This typically has a
> good chance of success (for the same reason that Lars's "small risk"
> is indeed small); but, when it fails, we haven't got ourselves into a
> broken state that gums up the works until manual intervention fixes
> the problem.
>
> So, in effect, Robert's model is Lars's with one of the N+1
> integrations it could have run in parallel being given up to applying
> his "regular status check" immediately to the result of his merge,
> with the whole merge being reverted if it fails. Since we expect that
> to be rare, this reduces how many integrations we can do in parallel
> by one, to forestall Lars's "one small risk" and ensure the tip is
> always good (Which Is Important).
>
> So, modulo using rebase rather than merge to combine the first-stage
> successes, I like Robert's model better,
>
> Eddy.
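For concreteness, the two-stage scheme above might look roughly like
the toy sketch below (Python). Everything in it is invented for
illustration: passes_ci() and cherry_pick() stand in for Coin and git,
the failing commit "B" and the capacity N are made up, and none of
this is Coin's actual API.

    from concurrent.futures import ThreadPoolExecutor

    # Toy model: a "head" is just the tuple of commits applied so far;
    # which commits break tests or conflict on rebase is made-up input.
    BAD = {"B"}          # commits whose changes break the tests
    CONFLICTS = set()    # commits that conflict when rebased

    N = 4  # assumption: Coin affords N first-stage runs plus one second-stage

    def passes_ci(head):
        # stand-in for a full Coin build-and-test run
        return not any(commit in BAD for commit in head)

    def cherry_pick(head, commit):
        # stand-in for git cherry-pick; None signals a conflict
        return None if commit in CONFLICTS else head + (commit,)

    def combine(tip, branches):
        # Eddy's "combine": rebase each commit of each branch onto tip,
        # dropping only the individual commits that conflict (a merge
        # would have to drop a whole branch instead)
        head = tip
        for branch in branches:
            for commit in branch:
                new_head = cherry_pick(head, commit)
                if new_head is not None:
                    head = new_head
        return head

    def integration_round(tip, staging_branches, composite=None):
        # One cycle: up to N first-stage integrations run in parallel
        # with (at most) one second-stage composite integration.
        with ThreadPoolExecutor(max_workers=N + 1) as pool:
            firsts = {branch: pool.submit(passes_ci, combine(tip, [branch]))
                      for branch in staging_branches[:N]}
            second = pool.submit(passes_ci, composite) if composite else None
        if second is not None and second.result():
            tip = composite          # the composite becomes the new tip
        winners = [branch for branch, run in firsts.items() if run.result()]
        # first-stage successes, rebased onto the (possibly new) tip,
        # form the next round's second-stage composite
        return tip, (combine(tip, winners) if winners else None)

    tip, composite = integration_round(("x",), [("A",), ("B",), ("C",)])
    print(tip)        # ('x',) - the tip only moves when a composite passes
    print(composite)  # ('x', 'A', 'C') - the next second-stage candidate

The point of combine() here is the rebase argument above: a
conflicting commit gets dropped individually instead of taking its
whole staging branch down with it.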
Thank you, glad to hear that. But since you started with "IIUC", let
me illustrate it:

1. The target branch is at x.
2. Changes A, B, C, D, E are staged.
3. Tests are run for x+A, x+B, ... - x+B fails, the others pass.

So far this is the same as Lars suggested, IIUC. Lars would then
proceed:

4L. The target branch is updated to x+A+C+D+E.

I suggested instead:

4R. Tests are run for x+A+C+D+E; they fail.
5R. Tests are run for x+A+C; they pass.
6R. The target branch is updated to x+A+C.

Whether this is done by merging or rebasing is a matter of taste, but
it should not make a difference to the resulting code.

The different runs in step 3 could be done in parallel if the CI
allows it. Step 4R could also be done in parallel with step 3, but
that is a bit optimistic: if even one of the changes tested in step 3
fails, the parallel run of step 4R becomes a waste of energy.

Cheers,
Robert
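PS: Step 4R's "repeat step four with a subset" leaves open how the
subset is chosen. One simple possibility, sketched below in Python
with a made-up pass/fail oracle (here D and E are invented to fail
only in combination), is to greedily re-add one change at a time and
keep it only while CI still passes:

    def passes_ci(base, changes):
        # stand-in for a real CI run on base plus the given changes
        return not ({"D", "E"} <= set(changes))

    def shrink(base, changes):
        # greedy variant of step 4R: re-add changes one at a time,
        # keeping each only if the combination still passes
        kept = []
        for change in changes:
            if passes_ci(base, kept + [change]):
                kept.append(change)
        return kept

    # B already failed alone in step 3, so the retry starts from the
    # remaining changes; the greedy pass keeps A, C, D and drops E
    print(shrink("x", ["A", "C", "D", "E"]))   # -> ['A', 'C', 'D']

This costs one CI run per change, much like step 3 itself, though
sequentially rather than in parallel; a bisection-style strategy would
need fewer runs when combined failures are as rare as we expect.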