>
> I also don't think it leads to the right behaviour or incentives.

The gap between when a test is authored and the point at which it's
determined to be flaky, combined with the difficulty of assigning
responsibility (an "unrelated" change can in some cases make a previously
stable test become flaky), makes this a real devil of a problem to fix.
Hence its long and rich legacy. ;)

While I agree with the general sentiment of "if we email the dev list with
a failure, or we git blame a test and poke the author to fix it, they'll
do the right thing", we still end up with cases where people have rotated
off the project and nobody feels a sense of ownership over a failure in a
test someone else wrote, or where another change broke something, etc. At
least from where I sit, I can't see a solution to this problem that
doesn't involve some collective action for things not directly under one's
purview.

Also, fwiw, in my experience "soft" gatekeeping for things like this will
just lead to the problem persisting in perpetuity. The problem strikes me
as too complex and too unpredictably distributed over time to be solvable
by incentivizing the "right" behavior (proactive prevention of introducing
things like this, hygiene and rigor in authorship, etc.), but I'm sure
there are ways of approaching this that I'm not thinking of.

But maybe I'm making a mountain out of a molehill. @bes - if you think that
emailing the dev list when a failure is encountered on rotation would be
sufficient to keep this problem under control with an obviously much
lighter touch, I'm +1 for giving it a shot.
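
To make that concrete, the rotation helper could start as small as the
sketch below. To be clear, the job URL, the report shape, and the mail
relay here are illustrative assumptions on my part, not a description of
our actual CI setup:

    # Rough sketch of a trunk-failure "nag" the person on rotation could run.
    import json
    import smtplib
    import urllib.request
    from email.message import EmailMessage

    TRUNK_JOB = "https://ci-cassandra.apache.org/job/Cassandra-trunk"  # assumed
    DEV_LIST = "dev@cassandra.apache.org"

    def failed_tests():
        """Failing test names from the last completed trunk run."""
        # Assumes a Jenkins-style JSON test report API.
        url = f"{TRUNK_JOB}/lastCompletedBuild/testReport/api/json"
        with urllib.request.urlopen(url) as resp:
            report = json.load(resp)
        return [
            f"{case['className']}.{case['name']}"
            for suite in report.get("suites", [])
            for case in suite.get("cases", [])
            if case.get("status") in ("FAILED", "REGRESSION")
        ]

    def nag(failures):
        msg = EmailMessage()
        msg["Subject"] = f"[trunk CI] {len(failures)} failing test(s)"
        msg["From"] = "rotation-person@example.org"  # placeholder
        msg["To"] = DEV_LIST
        msg.set_content("Failing on the last trunk run:\n\n" + "\n".join(failures))
        with smtplib.SMTP("localhost") as smtp:  # assumed local relay
            smtp.send_message(msg)

    if __name__ == "__main__":
        failures = failed_tests()
        if failures:
            nag(failures)

Cron it or run it by hand while on rotation; the mechanism matters much
less than somebody owning the triage that week.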

On Fri, Jan 24, 2020 at 10:12 AM Benedict Elliott Smith <bened...@apache.org>
wrote:

> > due to oversight on a commit or a delta breaking some test the author
> > thinks is unrelated to their diff but turns out to be a second-order
> > consequence of their change that they didn't expect
>
> In my opinion/experience, this is all a direct consequence of lack of
> trust in CI caused by flakiness.  We have finite time to dedicate to our
> jobs, and figuring out whether or not a run is really clean for this patch
> is genuinely costly when you cannot trust the result.  Those costs
> multiply rapidly across the contributor base.
>
> That does not conflict with what you are saying.  I don't, however, think
> it is reasonable to place the burden on the person trying to commit at that
> moment, whether by positive sentiment or "computer says no".  I also
> don't think it leads to the right behaviour or incentives.
>
> I further think there's been a degradation of community behaviour to some
> extent caused by the bifurcation of CI infrastructure and approach.
> Ideally we would all use a common platform, and there would be regular
> trunk runs to compare against, like-for-like.
>
> IMO, we should email dev@ if there are failing runs for trunk, and there
> should be a rotating role amongst the contributors to figure out who broke
> it, and poke them to fix it (or to just fix it, if easy).
>
>
> On 24/01/2020, 14:57, "Joshua McKenzie" <jmcken...@apache.org> wrote:
>
>     >
>     > gating PRs on clean runs won’t achieve anything other than dealing
>     > with folks who straight up ignore the spirit of the policy and
>     > knowingly commit code with test breakage
>
>     I think there's some nuance here. We have a lot of suites (novnode,
>     cdc, etc etc) where failures show up because people didn't run those
>     tests or didn't think to check them when they did. Likewise, I'd posit
>     we have a non-trivial number of failures (>= 15%? making up a number
>     here) that are due to oversight on a commit or a delta breaking some
>     test the author thinks is unrelated to their diff but turns out to be
>     a second-order consequence of their change that they didn't expect.
>     I'm certainly not claiming we have bad actors here merging known test
>     failures because they don't care.
>
>     This seems to me like an issue of collective ownership, and whether or
>     not we're willing to take it on as a project. If I have a patch, I run
>     CI, and a test fails that's unrelated to my diff (or I think it is, or
>     conclude it's unrelated after inspection, whatever), that's the crucial
>     moment where we can either say "Welp, not my problem. Merge time.", or
>     say "hey, we all live in this neighborhood together and while this
>     trash on the ground isn't actually mine, it's my neighborhood so if I
>     clean this up it'll benefit me and all the rest of us."
>
>     Depending on how much time and energy fixing a flake like that may
>     take, this may prove to be economically unsustainable for some/many
>     participants on the project. A lot of us are paid to work on C* by
>     organizations with specific priorities for the project that are not
>     directly related to "has a green test board". But I do feel comfortable
>     making the case that there's a world in which "don't merge if any tests
>     fail, clean up whatever failures you run into" *could* be a sustainable
>     model, assuming everyone in the ecosystem was willing and able to
>     engage in that collectively beneficial behavior.
>
>     Does the above make sense?
>
>     On Fri, Jan 24, 2020 at 7:39 AM Aleksey Yeshchenko
>     <alek...@apple.com.invalid> wrote:
>
>     > As for GH for code review, I find that it works very well for nits.
>     > It’s also great for doc changes, given how GH allows you to suggest
>     > changes to files in-place and automatically create PRs for those
>     > changes. That lowers the barrier for those tiny contributions.
>     >
>     > For anything relatively substantial, I vastly prefer to summarise my
>     > feedback (and see others’ feedback summarised) in JIRA comments - an
>     > opinion I and other contributors have shared in one or two similar
>     > threads over the years.
>     >
>     >
>     > > On 24 Jan 2020, at 12:21, Aleksey Yeshchenko
>     > > <alek...@apple.com.INVALID> wrote:
>     > >
>     > > The person introducing flakiness to a test will almost always have
>     > > run it locally and on CI first with success. It’s usually later
>     > > when they first start failing, and it’s often tricky to attribute
>     > > to a particular commit/person.
>     > >
>     > > So long as we have these - and we’ve had flaky tests for as long as
>     > > C* has existed - the problem will persist, and gating PRs on clean
>     > > runs won’t achieve anything other than dealing with folks who
>     > > straight up ignore the spirit of the policy and knowingly commit
>     > > code with test breakage that can be attributed to their change. I’m
>     > > not aware of such committers in this community, however.
>     > >
>     > >> On 24 Jan 2020, at 09:01, Benedict Elliott Smith
>     > >> <bened...@apache.org> wrote:
>     > >>
>     > >>> I find it only useful for nits, or for coaching-level comments
>     > >>> that I would never want propagated to Jira.
>     > >>
>     > >> Actually, I'll go one step further. GitHub encourages comments
>     > >> that are too trivial, poisoning the well for third parties trying
>     > >> to find useful information.  If the comment wouldn't be made in
>     > >> Jira, it probably shouldn't be made.
>     > >>
>     > >>
>     > >>
>     > >> On 24/01/2020, 08:56, "Benedict Elliott Smith"
>     > >> <bened...@apache.org> wrote:
>     > >>
>     > >>   The common factor is flaky tests, not people.  You get a clean
>     > >>   run, you commit.  Turns out, a test was flaky.  This reduces
>     > >>   trust in CI, so people commit without looking as closely at
>     > >>   results.  Gating on clean tests doesn't help, as you run until
>     > >>   you're clean.  Rinse and repeat.  Breakages accumulate.
>     > >>
>     > >>   This is what happens leading up to every release - nobody
>     > >>   commits knowing there's a breakage.  We have a problem with bad
>     > >>   tests, not bad people or process.
>     > >>
>     > >>   FWIW, I no longer like the GitHub workflow.  I find it only
>     > >>   useful for nits, or for coaching-level comments that I would
>     > >>   never want propagated to Jira.  I find a strong patch submission
>     > >>   of any size is better managed with human-curated Jira comments,
>     > >>   and provides a better permanent record.  When skimming a
>     > >>   discussion, Jira is more informative than GitHub.  Even with the
>     > >>   GitHub UX, the context hinders rather than helps.
>     > >>
>     > >>   As to propagating to Jira: has anyone here ever read them?  I
>     > >>   haven't as they're impenetrable; ugly and almost entirely noise.
>     > >>   If anything, I would prefer that we discourage GitHub for review
>     > >>   as a project, not move towards it.
>     > >>
>     > >>   This is without getting into the problem of multiple branch PRs.
>     > >>   Until this is _provably_ painless, we cannot introduce a
>     > >>   workflow that requires it and blocks commit on it.  Working with
>     > >>   multiple branches is difficult enough already, surely?
>     > >>
>     > >>
>     > >>
>     > >>   On 24/01/2020, 03:16, "Jeff Jirsa" <jji...@gmail.com> wrote:
>     > >>
>     > >>       100% agree
>     > >>
>     > >>       François and team wrote a doc on testing and gating commits
>     > >>       Blake wrote a doc on testing and gating commits
>     > >>       Every release there’s a thread on testing and gating commits
>     > >>
>     > >>       People are the common factor every time. Nobody wants to
>     > >>       avoid merging their patch because someone broke a test
>     > >>       elsewhere.
>     > >>
>     > >>       We can’t block merging technically with the repo as it
>     > >>       exists now so it’s always going to come down to people and
>     > >>       peer pressure, until or unless someone starts reverting
>     > >>       commits that break tests
>     > >>
>     > >>       (Of course, someone could write a tool that automatically
>     > >>       reverts new commits as long as tests fail....)
>     > >>
>     > >>       On Jan 23, 2020, at 5:54 PM, Joshua McKenzie
>     > >>       <jmcken...@apache.org> wrote:
>     > >>>
>     > >>>
>     > >>>>
>     > >>>>
>     > >>>> I am reacting to what I currently see
>     > >>>> happening in the project; tests fail as the norm and this is
>     > >>>> kinda seen as expected, even though it goes against the policies
>     > >>>> as I understand it.
>     > >>>
>     > >>> After over half a decade seeing us all continue to struggle with
>     > >>> this problem, I've come around to the school of "apply pain" (I
>     > >>> mean that as light-hearted as you can take it) when there's a
>     > >>> failure to incent fixing; specifically in this case, the only
>     > >>> idea I can think of is preventing merge w/any failing tests on a
>     > >>> PR. We go through this cycle as we approach each major release:
>     > >>> we have the gatekeeper of "we're not going to cut a release with
>     > >>> failing tests obviously", and we clean them up. After the
>     > >>> release, the pressure is off, we exhale, relax, and flaky test
>     > >>> failures (and others) start to creep back in.
>     > >>>
>     > >>> If the status quo is the world we want to live in, that's totally
>     > >>> fine and no judgement intended - we can build tooling around test
>     > >>> failure history and known flaky tests etc to optimize engineer
>     > >>> workflows around that expectation. But what I keep seeing on
>     > >>> threads like this (and have always heard brought up in
>     > >>> conversation) is that our collective *moral* stance is that we
>     > >>> should have green test boards and not merge code if it introduces
>     > >>> failing tests.
>     > >>>
>     > >>> Not looking to prescribe or recommend anything, just hoping that
>     > >>> observation above might be of interest or value to the
>     > >>> conversation.
>     > >>>
>     > >>>> On Thu, Jan 23, 2020 at 4:17 PM Michael Shuler
>     > >>>> <mich...@pbandjelly.org> wrote:
>     > >>>>
>     > >>>>> On 1/23/20 3:53 PM, David Capwell wrote:
>     > >>>>>
>     > >>>>> 2) Nightly build email to dev@?
>     > >>>>
>     > >>>> Nope. builds@c.a.o is where these go.
>     > >>>> https://lists.apache.org/list.html?bui...@cassandra.apache.org
>     > >>>>
>     > >>>> Michael