So...where did we land on this?

To take a few steps back and make sure we're all talking about the same
thing, I want to mention that I myself was responsible for merging a PR
that broke the build a few weeks ago. There was a warning that only
appeared on some of the JDK versions, but when checking the results I
navigated straight to the "Tests" page before letting the main build
page load. I just assumed that seeing test failures meant the build
itself was fine, and with that tunnel vision I missed that a subset of
the JDK builds had failed.

I'm mentioning this now not to make excuses, but because I really hope that
it goes without saying that this was an honest (though sloppy) mistake, and
that no one genuinely believes it is acceptable to merge any PR that causes
any of the builds to fail to compile. Yet multiple people in this thread so
far have voiced support for "gating merges on the successful completion of
all parts of the build except tests". Just to be totally clear, I really
don't think that was ever in question -- though it certainly doesn't hurt
to remind everyone.

So, this thread is not about whether or not to merge with failing
*builds*, but whether it should be acceptable to merge with failing
*tests*. It seems
like we're in agreement for the most part that the current system isn't
great, but I don't think we can (or rather, should) start enforcing the new
rule to only merge fully-green builds until we can actually get to a
reasonable percentage of green builds. Do we have any stats on how often
builds are failing right now due to flaky tests? Just speaking from
personal experience, it's been a while since I've seen a truly green build
with all tests passing.
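
If it would help, here's a rough sketch (in Python, against Jenkins'
standard JSON API) of how we could pull a green-vs-not-green rate for
recent runs. The job path below is just a placeholder for whichever
trunk/PR job we care about, and note this only counts overall build
results -- it can't tell us which failures were actually caused by
flaky tests without digging into the test reports themselves:

import requests

# Placeholder job URL -- point this at the actual Kafka job on ci-builds.apache.org
JOB_URL = "https://ci-builds.apache.org/job/Kafka/job/<some-job>"

# Ask Jenkins for the overall result of (up to) the last 100 builds
resp = requests.get(
    JOB_URL + "/api/json?tree=builds[number,result]{0,100}", timeout=30)
resp.raise_for_status()

# result is None for builds still in progress, so skip those
builds = [b for b in resp.json()["builds"] if b["result"] is not None]
green = sum(1 for b in builds if b["result"] == "SUCCESS")
print(f"{green}/{len(builds)} of the last {len(builds)} completed builds were SUCCESS")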

Lastly, just to play devil's advocate for a moment, before we commit to
this I think we need to consider how such a policy would impact the
contributor experience. It's incredibly disheartening to go through all the
work of submitting a PR and getting it reviewed, only to be blocked from
merging at the last minute by test failures completely outside that
contributor's control. We already struggle just getting some folks, new
contributors in particular, all the way through the review process without
abandoning their PRs. I do think we've been doing a better job lately, but
it's a cycle that ebbs and flows, and most community PRs still take an
admirable degree of patience and persistence just to get enough reviews to
reach the point of merging. So I think we need to be careful not to
make this situation even worse by also making contributors wait for a
green build. I'm also worried about our ability to stay on top of
disabling flaky tests,
especially if we need to wait for someone with enough context to make a
judgement call. Can we really rely on everyone to drop everything at any
time to check on a failing test? At the same time I would be hesitant to be
overly aggressive about reverting/disabling tests without having someone
who understands the context take a look.

That said, I do think it's worth trying this out as an experiment, as long
as we're clear with frustrated contributors that, if it isn't going well,
this won't necessarily be the policy from here on out.

On Thu, Nov 16, 2023 at 3:32 AM Igor Soarez <i...@soarez.me> wrote:

> Hi all,
>
> I think at least one of those is my fault, apologies.
> I'll try to make sure all my tests are passing from now on.
>
> It doesn't help that GitHub always shows that the tests have failed,
> even when they have not. I suspect this is because Jenkins always
> marks the builds as unstable, even when all tests pass, because
> the "Archive JUnit-formatted test results" step seems to persistently
> fail with "[Checks API] No suitable checks publisher found.".
> e.g.
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14770/1/pipeline/
>
> Can we get rid of that persistent failure and actually mark successful
> test runs as green?
>
> --
> Igor
>
