Thanks for gathering this data!

>> What are some notable cases of recent regressions that have landed because 
>> of non-use of commit queue and caused serious problems?
> 
> Some recent examples of regressions that would have been prevented by 
> mandatory commit/merge-queue as proposed:
> 
> https://commits.webkit.org/250940@main 
> <https://commits.webkit.org/250940@main> (PR did use Merge-Queue, but we’re 
> missing the feature that would have caught the problem)

What feature specifically?

> https://commits.webkit.org/250343@main 
> <https://commits.webkit.org/250343@main> (Commit did use Commit-Queue, but 
> skipped layout tests because of a previous iteration passed EWS)

Are you also proposing that merge-queue should re-run layout tests 
unconditionally? Do we know how long that takes on average? Do we have the 
hardware to support that?

> https://commits.webkit.org/250791@main 
> <https://commits.webkit.org/250791@main> (Should have failed EWS, but flakey 
> tests had different names, cross referencing trunk results would have made 
> that obvious)

How would merge-queue cross reference with trunk results or otherwise resolve 
this confusion?

> https://commits.webkit.org/248624@main 
> <https://commits.webkit.org/248624@main> (Landed manually, broke the build)

Seems like this would have been avoided by Merge-Queue alone. That’s nice.

For the other three cases, whether the proposal as-is would have resolved them 
or not seems muddy. Might need revision.

Thanks,
Geoff

>> Do you have any data on how frequent such regressions are, compared to the 
>> base rate of regressions that have landed despite use of commit queue?
> 
> Commit and Merge queue have basically prevented us from breaking the Mac 
> build, the only example of a broken Mac build I can find come from by-passing 
> Commit and Merge Queue. Layout test breakages are quite a bit more difficult 
> to reason about because breakages tend to not just be platform specific, but 
> configuration specific as well.
> 
>> 
>> Do you have any data on how frequently regressions are resolved by patches 
>> that land outside commit queue?
> 
> In the last week, we had 200 commits. 50 of those were made through the 
> Unsafe-Merge-Queue. Break down is here:
> 
> 15 were feature work that should have gone through Merge Queue
> 12 were test gardening
> 9 were build or test fixes
> 3 were 3rd party imports
> 3 were reverts
> 3 were tooling changes
> 2 were buildbot changes
> 2 were contributors.json changes
> 1 was a documentation change
> 
> My proposal would have sent the feature work, gardening work, imports, 
> tooling change and documentation change to Merge-Queue rather than 
> Unsafe-Merge-Queue. The build and test fixes would have needed modification 
> to their commit messages, but were the kind of changes I would want landed 
> via Unsafe-Merge-Queue.
> 
>> 
>>> and reduce demands of post-commit test gardening.
>> 
>> Is this goal distinct from preventing regressions from landing? If so, how?
> 
> I suppose I should have said:
> "The goal is to increase the stability of the build, speed up EWS and reduce 
> demands of post-commit test gardening by preventing regressions from 
> landing”, since build instability, slow EWS and post-commit test gardening 
> are all consequences of regressions landing.
> 
>> 
>>>> What problem are you trying to solve, and with what level of urgency?
>>> 
>>> Urgency is not high. I started this with the expectation it would be a 
>>> somewhat long and contentious discussion. The motivating change is that the 
>>> GitHub transition makes this proposal possible, from a technical 
>>> perspective, in a way it is not while the project is still backed by 
>>> Subversion.
>> 
>> I don’t understand the premise here. There are lots of ways to enforce 
>> commit policy on a Subversion repository.
> 
> Kind of. The problem here is that we want to provide enough escape hatches so 
> that any committer can quickly repair a broken build, so we have to check not 
> just the committer, but the commit itself. This is more analogous to ensuring 
> that commit has “Reviewed by” in it’s commit message (something that 
> Subversion does not enforce in our repository, despite it being policy) than 
> it is to ensuring that the committer has certain privileges. We’ve always 
> implemented these kind of checks in buildbot, not the Subversion server, and 
> it’s much easier to implement complicated logic that takes into consideration 
> the content of the commit in buildbot than it is Subversion hooks.
> 
>> On the meta level, while we are still dealing with serious regressions in 
>> our workflow caused by git and GitHub, I recommend that we do not push 
>> forward with more unrelated sweeping changes to the project and its 
>> workflow. Just like in software development, where we need to fix 
>> regressions before we can move forward with major feature work, so too in 
>> tooling we need to do the same. Otherwise we just pile chaos on top of 
>> chaos, and there is no way to know if things are improving or getting worse, 
>> and no way to hold ourselves accountable for improvement. 
>> 
>> Thanks,
>> Geoff
>> 
>>> 
>>> Jonathan
>>> 
>>>> 
>>>> Thanks,
>>>> Geoff
>>>> 
>>>> 
>>>>> On Jun 2, 2022, at 2:35 PM, Jonathan Bedard via webkit-dev 
>>>>> <webkit-dev@lists.webkit.org <mailto:webkit-dev@lists.webkit.org>> wrote:
>>>>> 
>>>>> As we move to GitHub, I would like to propose we strengthen our 
>>>>> protections on `main` by making MergeQueue and CommitQueue mandatory. 
>>>>> This would mean that with a few exceptions, all changes would need to be 
>>>>> built and run layout tests before they are landed. To spell out what the 
>>>>> exceptions I had in mind are:
>>>>> 
>>>>> - Revert commits, identified by a commit message that starts with 
>>>>> “Unreviewed, revering…” would be exempt
>>>>> - Changes which only modify files that do not effect building or testing 
>>>>> WebKit would be exempt. These files specifically are:
>>>>>          .github/
>>>>>          JSTests/
>>>>>          ManualTests/
>>>>>          metadata/
>>>>>          PerformanceTests/
>>>>>          Tools/
>>>>>            CISuport/
>>>>>            EWSTools/
>>>>>            WebKitBot
>>>>>         Websites/
>>>>> - Emergency build and infrastructure fixes, identified by a commit 
>>>>> message that starts with “Emergency build fix” or “Emergency 
>>>>> infrastructure fix” would be exempt
>>>>> - A reviewer who is not the commit author can overwrite this protection 
>>>>> by adding `unsafe-merge-queue` instead of the commit author
>>>>> - Changes which passed an EWS layout test queue within the last 7 days 
>>>>> would skip the layout test check
>>>>> 
>>>>> These exceptions are designed to provide contributors for a way to 
>>>>> by-pass potentially slow checks if extraordinary situations, or in ones 
>>>>> where CI has already validated the change. I think we should keep the 
>>>>> ability for any committer to deploy an emergency fix, because our project 
>>>>> has many contributors in different timezones and with different holiday 
>>>>> schedules.
>>>>> 
>>>>> We know that this policy change would potentially slow down development, 
>>>>> so I think these 3 improvements block making MergeQueue and CommitQueue 
>>>>> mandatory:
>>>>> 
>>>>> - run-webkit-tests consulting results.webkit.org 
>>>>> <http://results.webkit.org/> to avoid retrying known flakey or failing 
>>>>> tests
>>>>> - Another MergeQueue bot
>>>>> - Xcode workspace builds to speed up incremental builds
>>>>> 
>>>>> Jonathan Bedard
>>>>> WebKit Continuous Integration
>>>>> 
>>>>> _______________________________________________
>>>>> webkit-dev mailing list
>>>>> webkit-dev@lists.webkit.org <mailto:webkit-dev@lists.webkit.org>
>>>>> https://lists.webkit.org/mailman/listinfo/webkit-dev 
>>>>> <https://lists.webkit.org/mailman/listinfo/webkit-dev>
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev

Reply via email to