So...where did we land on this? To take a few steps back and make sure we're all talking about the same thing, I want to mention that I myself was responsible for merging a PR that broke the build a few weeks ago. There was a warning that only appeared in some of the versions, but when checking on the results I immediately navigated to the "Tests" page before letting the main build page load. I just assumed that if there were test failures that must mean the build itself was fine, and with this tunnel-vision I missed that a subset of JDK builds had failed.
I'm mentioning this now not to make excuses, but because I really hope it goes without saying that this was an honest (though sloppy) mistake, and that no one genuinely believes it is acceptable to merge a PR that causes any of the builds to fail to compile. Yet multiple people in this thread so far have voiced support for "gating merges on the successful completion of all parts of the build except tests". Just to be totally clear, I don't think that was ever in question -- though it certainly doesn't hurt to remind everyone. So this thread is not about whether to merge with failing *builds*; it's about whether it should be acceptable to merge with failing *tests*.

It seems like we're mostly in agreement that the current system isn't great, but I don't think we can (or rather, should) start enforcing a new rule to only merge fully-green builds until we can actually get to a reasonable percentage of green builds. Do we have any stats on how often builds are currently failing due to flaky tests? Speaking from personal experience, it's been a while since I've seen a truly green build with all tests passing.

Lastly, just to play devil's advocate for a moment: before we commit to this, I think we need to consider how such a policy would impact the contributor experience. It's incredibly disheartening to go through all the work of submitting a PR and getting it reviewed, only to be blocked from merging at the last minute by test failures completely outside the contributor's control. We already struggle to get some folks, new contributors in particular, all the way through the review process without abandoning their PRs. I do think we've been doing a better job lately, but it's a cycle that ebbs and flows, and most community PRs still take an admirable degree of patience and persistence just to get enough reviews to reach the point of merging.
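To partly answer my own stats question: here is a minimal sketch of how we could tally recent build results. The Jenkins API path in the comment is an assumption based on the CI link later in this thread, and the sample results below are made up, not actual kafka-pr history:

```python
from collections import Counter

def non_green_rate(results):
    """Fraction of builds whose result is anything other than SUCCESS."""
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1 - counts["SUCCESS"] / total

# Real data could come from something like (path assumed, untested):
#   https://ci-builds.apache.org/job/Kafka/job/kafka-pr/api/json?tree=builds[result]
# Made-up sample results for illustration:
sample = ["UNSTABLE", "SUCCESS", "UNSTABLE", "FAILURE", "UNSTABLE"]
print(non_green_rate(sample))  # 0.8
```

Even a rough number like this over the last hundred or so PR builds would tell us how far we are from being able to enforce a green-build rule.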
So I just think we need to be careful not to make this situation even worse by requiring contributors to wait for a green build. I'm also worried about our ability to stay on top of disabling flaky tests, especially if we need to wait for someone with enough context to make a judgement call. Can we really rely on someone to drop everything at any time to look at a failing test? At the same time, I would hesitate to be overly aggressive about reverting or disabling tests without having someone who understands the context take a look.

That said, I do think it's worth trying this out as an experiment, as long as we make clear to frustrated contributors that this isn't necessarily the new policy from here on out if it isn't going well.

On Thu, Nov 16, 2023 at 3:32 AM Igor Soarez <i...@soarez.me> wrote:

> Hi all,
>
> I think at least one of those is my fault, apologies.
> I'll try to make sure all my tests are passing from now on.
>
> It doesn't help that GitHub always shows that the tests have failed,
> even when they have not. I suspect this is because Jenkins always
> marks the builds as unstable, even when all tests pass, because
> the "Archive JUnit-formatted test results" step seems to persistently
> fail with "[Checks API] No suitable checks publisher found.".
> e.g.
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14770/1/pipeline/
>
> Can we get rid of that persistent failure and actually mark successful
> test runs as green?
>
> --
> Igor
>
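On Igor's point about the persistent Checks API failure: if the problem is the JUnit archiving step trying (and failing) to publish GitHub checks, the Jenkins JUnit plugin lets a pipeline opt out of checks publishing per step. A minimal sketch, assuming a declarative Jenkinsfile -- the stage name, build command, and results glob here are hypothetical, not Kafka's actual pipeline:

```groovy
pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                sh './gradlew test'  // hypothetical build command
            }
            post {
                always {
                    // skipPublishingChecks stops the JUnit plugin from trying to
                    // publish results via the Checks API, which seems to be what
                    // fails with "[Checks API] No suitable checks publisher found."
                    junit testResults: '**/build/test-results/**/*.xml',
                          skipPublishingChecks: true
                }
            }
        }
    }
}
```

If that removes the noise, successful runs should stop being marked unstable for a reason unrelated to the tests themselves -- worth verifying against our actual Jenkinsfile before changing anything.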