Thanks Xintong. +1 to the proposal.

On Tue, Jun 29, 2021 at 11:05 AM 刘建刚 <liujiangangp...@gmail.com> wrote:

> +1 for the proposal. Since the test time is long and environment may vary,
> unstable tests are really annoying for developers. The solution is welcome.
>
> Best
> liujiangang
>
> Jingsong Li <jingsongl...@gmail.com> 于2021年6月29日周二 上午10:31写道:
>
> > +1 Thanks Xintong for the update!
> >
> > Best,
> > Jingsong
> >
> > On Mon, Jun 28, 2021 at 6:44 PM Till Rohrmann <trohrm...@apache.org>
> > wrote:
> >
> > > +1, thanks for updating the guidelines Xintong!
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Jun 28, 2021 at 11:49 AM Yangze Guo <karma...@gmail.com>
> wrote:
> > >
> > > > +1
> > > >
> > > > Thanks Xintong for drafting this doc.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Mon, Jun 28, 2021 at 5:42 PM JING ZHANG <beyond1...@gmail.com>
> > wrote:
> > > > >
> > > > > Thanks Xintong for giving detailed documentation.
> > > > >
> > > > > The best practice for handling test failure is very detailed, it's
> a
> > > good
> > > > > guidelines document with clear action steps.
> > > > >
> > > > > +1 to Xintong's proposal.
> > > > >
> > > > > Xintong Song <tonysong...@gmail.com> 于2021年6月28日周一 下午4:07写道:
> > > > >
> > > > > > Thanks all for the discussion.
> > > > > >
> > > > > > Based on the opinions so far, I've drafted the new guidelines
> [1],
> > > as a
> > > > > > potential replacement of the original wiki page [2].
> > > > > >
> > > > > > Hopefully this draft has covered the most opinions discussed and
> > > > consensus
> > > > > > made in this discussion thread.
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1uUbxbgbGErBXtmEjhwVhBWG3i6nhQ0LXs96OlntEYnU/edit?usp=sharing
> > > > > >
> > > > > > [2]
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Jun 25, 2021 at 10:40 PM Piotr Nowojski <
> > > pnowoj...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the clarification Till. +1 for what you have
> written.
> > > > > > >
> > > > > > > Piotrek
> > > > > > >
> > > > > > > pt., 25 cze 2021 o 16:00 Till Rohrmann <trohrm...@apache.org>
> > > > > > napisał(a):
> > > > > > >
> > > > > > > > One quick note for clarification. I don't have anything
> against
> > > > builds
> > > > > > > > running on your personal Azure account and this is not what I
> > > > > > understood
> > > > > > > > under "local environment". For me "local environment" means
> > that
> > > > > > someone
> > > > > > > > runs the test locally on his machine and then says that the
> > > > > > > > tests have passed locally.
> > > > > > > >
> > > > > > > > I do agree that there might be a conflict of interests if a
> PR
> > > > author
> > > > > > > > disables tests. Here I would argue that we don't have
> malignant
> > > > > > > committers
> > > > > > > > which means that every committer will probably first check
> the
> > > > > > respective
> > > > > > > > ticket for how often the test failed. Then I guess the next
> > step
> > > > would
> > > > > > be
> > > > > > > > to discuss on the ticket whether to disable it or not. And
> > > finally,
> > > > > > after
> > > > > > > > reaching a consensus, it will be disabled. If we see someone
> > > > abusing
> > > > > > this
> > > > > > > > policy, then we can still think about how to guard against
> it.
> > > But,
> > > > > > > > honestly, I have very rarely seen such a case. I am also ok
> to
> > > > pull in
> > > > > > > the
> > > > > > > > release manager to make the final call if this resolves
> > concerns.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > >
> > > > > > > > On Fri, Jun 25, 2021 at 9:07 AM Piotr Nowojski <
> > > > pnowoj...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 for the general idea, however I have concerns about a
> > couple
> > > > of
> > > > > > > > details.
> > > > > > > > >
> > > > > > > > > > I would first try to not introduce the exception for
> local
> > > > builds.
> > > > > > > > > > It makes it quite hard for others to verify the build and
> > to
> > > > make
> > > > > > > sure
> > > > > > > > > that the right things were executed.
> > > > > > > > >
> > > > > > > > > I would counter Till's proposal to ignore local green
> builds.
> > > If
> > > > > > > > committer
> > > > > > > > > is merging and closing a PR, with official azure failure,
> but
> > > > there
> > > > > > > was a
> > > > > > > > > green build before or in local azure it's IMO enough to
> leave
> > > the
> > > > > > > > message:
> > > > > > > > >
> > > > > > > > > > Latest build failure is a known issue: FLINK-12345
> > > > > > > > > > Green local build: URL
> > > > > > > > >
> > > > > > > > > This should address Till's concern about verification.
> > > > > > > > >
> > > > > > > > > On the other hand I have concerns about disabling tests.*
> It
> > > > > > shouldn't
> > > > > > > be
> > > > > > > > > the PR author/committer that's disabling a test on his own,
> > as
> > > > > > that's a
> > > > > > > > > conflict of interests*. I have however no problems with
> > > disabling
> > > > > > test
> > > > > > > > > instabilities that were marked as "blockers" though, that
> > > should
> > > > work
> > > > > > > > > pretty well. But the important thing here is to correctly
> > judge
> > > > > > bumping
> > > > > > > > > priorities of test instabilities based on their frequency
> and
> > > > current
> > > > > > > > > general health of the system. I believe that release
> managers
> > > > should
> > > > > > be
> > > > > > > > > playing a big role here in deciding on the guidelines of
> what
> > > > should
> > > > > > > be a
> > > > > > > > > priority of certain test instabilities.
> > > > > > > > >
> > > > > > > > > What I mean by that is two example scenarios:
> > > > > > > > > 1. if we have a handful of very frequently failing tests
> and
> > a
> > > > > > handful
> > > > > > > of
> > > > > > > > > very rarely failing tests (like one reported failure and no
> > > > another
> > > > > > > > > occurrence in many months, and let's even say that the
> > failure
> > > > looks
> > > > > > > like
> > > > > > > > > infrastructure/network timeout), we should focus on the
> > > > frequently
> > > > > > > > failing
> > > > > > > > > ones, and probably we are safe to ignore for the time being
> > the
> > > > rare
> > > > > > > > issues
> > > > > > > > > - at least until we deal with the most pressing ones.
> > > > > > > > > 2. If we have tons of rarely failing test instabilities, we
> > > > should
> > > > > > > > probably
> > > > > > > > > start addressing them as blockers.
> > > > > > > > >
> > > > > > > > > I'm using my own conscious and my best judgement when I'm
> > > > > > > > > bumping/decreasing priorities of test instabilities (and
> > bugs),
> > > > but
> > > > > > as
> > > > > > > > > individual committer I don't have the full picture. As I
> > wrote
> > > > > > above, I
> > > > > > > > > think release managers are in a much better position to
> keep
> > > > > > adjusting
> > > > > > > > > those kind of guidelines.
> > > > > > > > >
> > > > > > > > > Best, Piotrek
> > > > > > > > >
> > > > > > > > > pt., 25 cze 2021 o 08:10 Yu Li <car...@gmail.com>
> > napisał(a):
> > > > > > > > >
> > > > > > > > > > +1 for Xintong's proposal.
> > > > > > > > > >
> > > > > > > > > > For me, resolving problems directly (fixing the
> > > infrastructure
> > > > > > issue,
> > > > > > > > > > disabling unstable tests and creating blocker JIRAs to
> > track
> > > > the
> > > > > > fix
> > > > > > > > and
> > > > > > > > > > re-enable them asap, etc.) is (in most cases) better than
> > > > working
> > > > > > > > around
> > > > > > > > > > them (verify locally, manually check and judge the
> failure
> > as
> > > > > > > > > "unrelated",
> > > > > > > > > > etc.), and I believe the proposal could help us pushing
> > those
> > > > more
> > > > > > > > "real"
> > > > > > > > > > solutions forward.
> > > > > > > > > >
> > > > > > > > > > Best Regards,
> > > > > > > > > > Yu
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, 25 Jun 2021 at 10:58, Yangze Guo <
> > karma...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Creating a blocker issue for the manually disabled
> tests
> > > > sounds
> > > > > > > good
> > > > > > > > to
> > > > > > > > > > me.
> > > > > > > > > > >
> > > > > > > > > > > Minor: I'm still a bit worried about the commits merged
> > > > before we
> > > > > > > fix
> > > > > > > > > > > the unstable tests can also break those tests. Instead
> of
> > > > letting
> > > > > > > the
> > > > > > > > > > > assigners keep a look at all potentially related
> commits,
> > > > they
> > > > > > can
> > > > > > > > > > > maintain a branch that is periodically synced with the
> > > master
> > > > > > > branch
> > > > > > > > > > > while enabling the unstable test. So that they can
> catch
> > > the
> > > > > > > breaking
> > > > > > > > > > > changes asap.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jun 24, 2021 at 9:52 PM Till Rohrmann <
> > > > > > > trohrm...@apache.org>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > I like the idea of creating a blocker issue for a
> > > disabled
> > > > > > test.
> > > > > > > > This
> > > > > > > > > > > will
> > > > > > > > > > > > force us to resolve it in a timely manner and it
> won't
> > > fall
> > > > > > > through
> > > > > > > > > the
> > > > > > > > > > > > cracks.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Till
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Jun 24, 2021 at 8:06 AM Jingsong Li <
> > > > > > > > jingsongl...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > +1 to Xintong's proposal
> > > > > > > > > > > > >
> > > > > > > > > > > > > I also have some concerns about unstable cases.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think unstable cases can be divided into these
> > types:
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Force majeure: For example, network timeout,
> sudden
> > > > > > > > environmental
> > > > > > > > > > > > > collapse, they are accidental and can always be
> > solved
> > > by
> > > > > > > > > triggering
> > > > > > > > > > > azure
> > > > > > > > > > > > > again. Committers should wait for the next green
> > azure.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Obvious mistakes: For example, some errors caused
> > by
> > > > > > obvious
> > > > > > > > > > reasons
> > > > > > > > > > > may
> > > > > > > > > > > > > be repaired quickly. At this time, do we need to
> > wait,
> > > > or not
> > > > > > > > wait
> > > > > > > > > > and
> > > > > > > > > > > just
> > > > > > > > > > > > > ignore?
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Difficult questions: These problems are very
> > > difficult
> > > > to
> > > > > > > find.
> > > > > > > > > > There
> > > > > > > > > > > > > will be no solution for a while and a half. We
> don't
> > > even
> > > > > > know
> > > > > > > > the
> > > > > > > > > > > reason.
> > > > > > > > > > > > > At this time, we should ignore it. (Maybe it's
> judged
> > > by
> > > > the
> > > > > > > > author
> > > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > case. But what about the old case whose author
> can't
> > be
> > > > > > found?)
> > > > > > > > > > > > >
> > > > > > > > > > > > > So, the ignored cases should be the block of the
> next
> > > > release
> > > > > > > > until
> > > > > > > > > > the
> > > > > > > > > > > > > reason is found or the case is fixed?  We need to
> > > ensure
> > > > that
> > > > > > > > > someone
> > > > > > > > > > > will
> > > > > > > > > > > > > take care of these cases, because there is no
> > deepening
> > > > of
> > > > > > > failed
> > > > > > > > > > > tests, no
> > > > > > > > > > > > > one may continue to pay attention to these cases.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think this guideline should consider these
> > > situations,
> > > > and
> > > > > > > show
> > > > > > > > > how
> > > > > > > > > > > to
> > > > > > > > > > > > > solve them.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Jingsong
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Jun 24, 2021 at 10:57 AM Jark Wu <
> > > > imj...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks to Xintong for bringing up this topic, I'm
> > +1
> > > in
> > > > > > > > general.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, I think it's still not very clear how we
> > > > address
> > > > > > the
> > > > > > > > > > > unstable
> > > > > > > > > > > > > > tests.
> > > > > > > > > > > > > > I think this is a very important part of this new
> > > > > > guideline.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > According to the discussion above, if some tests
> > are
> > > > > > > unstable,
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > > > manually disable it.
> > > > > > > > > > > > > > But I have some questions in my mind:
> > > > > > > > > > > > > > 1) Is the instability judged by the committer
> > > > themselves or
> > > > > > > by
> > > > > > > > > some
> > > > > > > > > > > > > > metrics?
> > > > > > > > > > > > > > 2) Should we log the disable commit in the
> > > > corresponding
> > > > > > > issue
> > > > > > > > > and
> > > > > > > > > > > > > increase
> > > > > > > > > > > > > > the priority?
> > > > > > > > > > > > > > 3) What if nobody looks into this issue and this
> > > > becomes
> > > > > > some
> > > > > > > > > > > potential
> > > > > > > > > > > > > > bugs released with the new version?
> > > > > > > > > > > > > > 4) If no person is actively working on the issue,
> > who
> > > > > > should
> > > > > > > > > > > re-enable
> > > > > > > > > > > > > it?
> > > > > > > > > > > > > > Would it block PRs again?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Jark
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, 24 Jun 2021 at 10:04, Xintong Song <
> > > > > > > > > tonysong...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks all for the feedback.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Till @Yangze
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'm also not convinced by the idea of having an
> > > > exception
> > > > > > > for
> > > > > > > > > > local
> > > > > > > > > > > > > > builds.
> > > > > > > > > > > > > > > We need to execute the entire build (or at
> least
> > > the
> > > > > > > failing
> > > > > > > > > > stage)
> > > > > > > > > > > > > > > locally, to make sure subsequent test cases
> > > > prevented by
> > > > > > > the
> > > > > > > > > > > failure
> > > > > > > > > > > > > one
> > > > > > > > > > > > > > > are all executed. In that case, it's probably
> > > easier
> > > > to
> > > > > > > rerun
> > > > > > > > > the
> > > > > > > > > > > build
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > azure than locally.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Concerning disabling unstable test cases that
> > > > regularly
> > > > > > > block
> > > > > > > > > PRs
> > > > > > > > > > > from
> > > > > > > > > > > > > > > merging, maybe we can say that such cases can
> > only
> > > be
> > > > > > > > disabled
> > > > > > > > > > when
> > > > > > > > > > > > > > someone
> > > > > > > > > > > > > > > is actively looking into it, likely the person
> > who
> > > > > > disabled
> > > > > > > > the
> > > > > > > > > > > case.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > this person is no longer actively working on
> it,
> > > > he/she
> > > > > > > > should
> > > > > > > > > > > enable
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > case again no matter if it is fixed or not.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Jing
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the suggestions.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1 to provide guidelines on handling test
> > failures.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. Report the test failures in the JIRA.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1 on this. Currently, the release managers are
> > > > > > monitoring
> > > > > > > > the
> > > > > > > > > ci
> > > > > > > > > > > and
> > > > > > > > > > > > > > cron
> > > > > > > > > > > > > > > build instabilities and reporting them on JIRA.
> > We
> > > > should
> > > > > > > > also
> > > > > > > > > > > > > encourage
> > > > > > > > > > > > > > > other contributors to do that for PRs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. Set a deadline to find out the root cause
> and
> > > > solve
> > > > > > the
> > > > > > > > > > failure
> > > > > > > > > > > for
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > new created JIRA  because we could not block
> > > other
> > > > > > commit
> > > > > > > > > > merges
> > > > > > > > > > > for
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > long
> > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. What to do if the JIRA has not made
> > significant
> > > > > > progress
> > > > > > > > > when
> > > > > > > > > > > > > reached
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > the deadline time?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'm not sure about these two. It feels a bit
> > > against
> > > > the
> > > > > > > > > > voluntary
> > > > > > > > > > > > > nature
> > > > > > > > > > > > > > > of open source projects.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > IMHO, frequent instabilities are more likely to
> > be
> > > > > > upgraded
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > > > critical
> > > > > > > > > > > > > > > / blocker priority, receive more attention and
> > > > eventually
> > > > > > > get
> > > > > > > > > > > fixed.
> > > > > > > > > > > > > > > Release managers are also responsible for
> looking
> > > for
> > > > > > > > assignees
> > > > > > > > > > for
> > > > > > > > > > > > > such
> > > > > > > > > > > > > > > issues. If a case is still not fixed soonish,
> > even
> > > > with
> > > > > > all
> > > > > > > > > these
> > > > > > > > > > > > > > efforts,
> > > > > > > > > > > > > > > I'm not sure how setting a deadline can help
> > this.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4. If we disable the respective tests
> > temporarily,
> > > we
> > > > > > also
> > > > > > > > > need a
> > > > > > > > > > > > > > mechanism
> > > > > > > > > > > > > > > > to ensure the issue would be continued to be
> > > > > > investigated
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > > future.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1. As mentioned above, we may consider
> disabling
> > > > such
> > > > > > > tests
> > > > > > > > > iff
> > > > > > > > > > > > > someone
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > actively working on it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Jun 23, 2021 at 9:56 PM JING ZHANG <
> > > > > > > > > beyond1...@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Xintong,
> > > > > > > > > > > > > > > > +1 to the proposal.
> > > > > > > > > > > > > > > > In order to better comply with the rule, it
> is
> > > > > > necessary
> > > > > > > to
> > > > > > > > > > > describe
> > > > > > > > > > > > > > > what's
> > > > > > > > > > > > > > > > best practice if encountering test failure
> > which
> > > > seems
> > > > > > > > > > unrelated
> > > > > > > > > > > with
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > current commits.
> > > > > > > > > > > > > > > > How to avoid merging PR with test failures
> and
> > > not
> > > > > > > blocking
> > > > > > > > > > code
> > > > > > > > > > > > > > merging
> > > > > > > > > > > > > > > > for a long time?
> > > > > > > > > > > > > > > > I tried to think about the possible steps,
> and
> > > > found
> > > > > > > there
> > > > > > > > > are
> > > > > > > > > > > some
> > > > > > > > > > > > > > > > detailed problems that need to be discussed
> in
> > a
> > > > step
> > > > > > > > > further:
> > > > > > > > > > > > > > > > 1. Report the test failures in the JIRA.
> > > > > > > > > > > > > > > > 2. Set a deadline to find out the root cause
> > and
> > > > solve
> > > > > > > the
> > > > > > > > > > > failure
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > new created JIRA  because we could not block
> > > other
> > > > > > commit
> > > > > > > > > > merges
> > > > > > > > > > > for
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > long
> > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > >     When is a reasonable deadline here?
> > > > > > > > > > > > > > > > 3. What to do if the JIRA has not made
> > > significant
> > > > > > > progress
> > > > > > > > > > when
> > > > > > > > > > > > > > reached
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > the deadline time?
> > > > > > > > > > > > > > > >     There are several situations as follows,
> > > maybe
> > > > > > > > different
> > > > > > > > > > > cases
> > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > different approaches.
> > > > > > > > > > > > > > > >     1. the JIRA is non-assigned yet
> > > > > > > > > > > > > > > >     2. not found the root cause yet
> > > > > > > > > > > > > > > >     3. not found a good solution, but already
> > > > found the
> > > > > > > > root
> > > > > > > > > > > cause
> > > > > > > > > > > > > > > >     4. found a solution, but it needs more
> time
> > > to
> > > > be
> > > > > > > done.
> > > > > > > > > > > > > > > > 4. If we disable the respective tests
> > > temporarily,
> > > > we
> > > > > > > also
> > > > > > > > > > need a
> > > > > > > > > > > > > > > mechanism
> > > > > > > > > > > > > > > > to ensure the issue would be continued to be
> > > > > > investigated
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > > future.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > > > JING ZHANG
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Stephan Ewen <se...@apache.org>
> 于2021年6月23日周三
> > > > > > 下午8:16写道:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > +1 to Xintong's proposal
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Jun 23, 2021 at 1:53 PM Till
> > Rohrmann <
> > > > > > > > > > > > > trohrm...@apache.org>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I would first try to not introduce the
> > > > exception
> > > > > > for
> > > > > > > > > local
> > > > > > > > > > > > > builds.
> > > > > > > > > > > > > > It
> > > > > > > > > > > > > > > > > makes
> > > > > > > > > > > > > > > > > > it quite hard for others to verify the
> > build
> > > > and to
> > > > > > > > make
> > > > > > > > > > sure
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > right things were executed. If we see
> that
> > > this
> > > > > > > becomes
> > > > > > > > > an
> > > > > > > > > > > issue
> > > > > > > > > > > > > > then
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > can revisit this idea.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Jun 23, 2021 at 4:19 AM Yangze
> Guo
> > <
> > > > > > > > > > > karma...@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > +1 for appending this to community
> > > > guidelines for
> > > > > > > > > merging
> > > > > > > > > > > PRs.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Till Rohrmann
> > > > > > > > > > > > > > > > > > > I agree that with this approach
> unstable
> > > > tests
> > > > > > will
> > > > > > > > not
> > > > > > > > > > > block
> > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > commit merges. However, it might be
> hard
> > to
> > > > > > prevent
> > > > > > > > > > merging
> > > > > > > > > > > > > > commits
> > > > > > > > > > > > > > > > > > > that are related to those tests and
> > should
> > > > have
> > > > > > > been
> > > > > > > > > > passed
> > > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > It's
> > > > > > > > > > > > > > > > > > > true that this judgment can be made by
> > the
> > > > > > > > committers,
> > > > > > > > > > but
> > > > > > > > > > > no
> > > > > > > > > > > > > one
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > ensure the judgment is always precise
> and
> > > so
> > > > that
> > > > > > > we
> > > > > > > > > have
> > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > discussion thread.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Regarding the unstable tests, how about
> > > > adding
> > > > > > > > another
> > > > > > > > > > > > > exception:
> > > > > > > > > > > > > > > > > > > committers verify it in their local
> > > > environment
> > > > > > and
> > > > > > > > > > > comment in
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > cases?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 8:23 PM 刘建刚 <
> > > > > > > > > > > liujiangangp...@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > It is a good principle to run all
> tests
> > > > > > > > successfully
> > > > > > > > > > > with any
> > > > > > > > > > > > > > > > change.
> > > > > > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > > > means a lot for project's stability
> and
> > > > > > > > development.
> > > > > > > > > I
> > > > > > > > > > > am big
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > proposal.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best
> > > > > > > > > > > > > > > > > > > > liujiangang
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Till Rohrmann <trohrm...@apache.org>
> > > > > > > 于2021年6月22日周二
> > > > > > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > One way to address the problem of
> > > > regularly
> > > > > > > > failing
> > > > > > > > > > > tests
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > block
> > > > > > > > > > > > > > > > > > > > > merging of PRs is to disable the
> > > > respective
> > > > > > > tests
> > > > > > > > > for
> > > > > > > > > > > the
> > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > being.
> > > > > > > > > > > > > > > > > > > Of
> > > > > > > > > > > > > > > > > > > > > course, the failing test then needs
> > to
> > > be
> > > > > > > fixed.
> > > > > > > > > But
> > > > > > > > > > at
> > > > > > > > > > > > > least
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > way
> > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > would not block everyone from
> making
> > > > > > progress.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 12:00 PM
> > Arvid
> > > > Heise
> > > > > > <
> > > > > > > > > > > > > > ar...@apache.org
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I think this is overall a good
> > idea.
> > > > So +1
> > > > > > > from
> > > > > > > > > my
> > > > > > > > > > > side.
> > > > > > > > > > > > > > > > > > > > > > However, I'd like to put a higher
> > > > priority
> > > > > > on
> > > > > > > > > > > > > > infrastructure
> > > > > > > > > > > > > > > > > then,
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > particular docker image/artifact
> > > > caches.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:50 AM
> > Till
> > > > > > > Rohrmann
> > > > > > > > <
> > > > > > > > > > > > > > > > > > trohrm...@apache.org
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks for bringing this topic
> to
> > > our
> > > > > > > > attention
> > > > > > > > > > > > > Xintong.
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > your
> > > > > > > > > > > > > > > > > > > > > > > proposal makes a lot of sense
> and
> > > we
> > > > > > should
> > > > > > > > > > follow
> > > > > > > > > > > it.
> > > > > > > > > > > > > It
> > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > give us
> > > > > > > > > > > > > > > > > > > > > > > confidence that our changes are
> > > > working
> > > > > > and
> > > > > > > > it
> > > > > > > > > > > might
> > > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > > > > incentive
> > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > quickly fix build
> instabilities.
> > > > Hence,
> > > > > > +1.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 22, 2021 at 11:12
> AM
> > > > Xintong
> > > > > > > > Song <
> > > > > > > > > > > > > > > > > > > tonysong...@gmail.com>
> > > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > In the past a couple of
> weeks,
> > > I've
> > > > > > > > observed
> > > > > > > > > > > several
> > > > > > > > > > > > > > > times
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > PRs
> > > > > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > > > > > merged without a green light
> > from
> > > > the
> > > > > > CI
> > > > > > > > > tests,
> > > > > > > > > > > where
> > > > > > > > > > > > > > > > failure
> > > > > > > > > > > > > > > > > > > cases
> > > > > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > > > > > considered *unrelated*. This
> > may
> > > > not
> > > > > > > always
> > > > > > > > > > cause
> > > > > > > > > > > > > > > problems,
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > > > > > > > > > > increase the chance of
> breaking
> > > our
> > > > > > code
> > > > > > > > > base.
> > > > > > > > > > In
> > > > > > > > > > > > > fact,
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > > > > > > occurred
> > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > me twice in the past few
> weeks
> > > > that I
> > > > > > had
> > > > > > > > to
> > > > > > > > > > > revert a
> > > > > > > > > > > > > > > > commit
> > > > > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > > > > breaks
> > > > > > > > > > > > > > > > > > > > > > > > the master branch due to
> this.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > I think it would be nicer to
> > > > enforce a
> > > > > > > > > stricter
> > > > > > > > > > > rule,
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > no
> > > > > > > > > > > > > > > > > > PRs
> > > > > > > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > > > merged without passing CI.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > The problems of merging PRs
> > with
> > > > > > > > "unrelated"
> > > > > > > > > > test
> > > > > > > > > > > > > > > failures
> > > > > > > > > > > > > > > > > are:
> > > > > > > > > > > > > > > > > > > > > > > > - It's not always
> > straightforward
> > > > to
> > > > > > tell
> > > > > > > > > > > whether a
> > > > > > > > > > > > > > test
> > > > > > > > > > > > > > > > > > > failures are
> > > > > > > > > > > > > > > > > > > > > > > > related or not.
> > > > > > > > > > > > > > > > > > > > > > > > - It prevents subsequent test
> > > cases
> > > > > > from
> > > > > > > > > being
> > > > > > > > > > > > > > executed,
> > > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > > fail
> > > > > > > > > > > > > > > > > > > > > > > > relating to the PR changes.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > To make things easier for the
> > > > > > committers,
> > > > > > > > the
> > > > > > > > > > > > > following
> > > > > > > > > > > > > > > > > > > exceptions
> > > > > > > > > > > > > > > > > > > > > > might
> > > > > > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > > > considered acceptable.
> > > > > > > > > > > > > > > > > > > > > > > > - The PR has passed CI in the
> > > > > > > contributor's
> > > > > > > > > > > personal
> > > > > > > > > > > > > > > > > workspace.
> > > > > > > > > > > > > > > > > > > > > Please
> > > > > > > > > > > > > > > > > > > > > > > post
> > > > > > > > > > > > > > > > > > > > > > > > the link in such cases.
> > > > > > > > > > > > > > > > > > > > > > > > - The CI tests have been
> > > triggered
> > > > > > > multiple
> > > > > > > > > > > times, on
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > > > > commit,
> > > > > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > > > each stage has at least
> passed
> > > for
> > > > > > once.
> > > > > > > > > Please
> > > > > > > > > > > also
> > > > > > > > > > > > > > > > comment
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > > > > cases.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > If we all agree on this, I'd
> > > > update the
> > > > > > > > > > community
> > > > > > > > > > > > > > > > guidelines
> > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > > merging
> > > > > > > > > > > > > > > > > > > > > > > > PRs wrt. this proposal. [1]
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Please let me know what do
> you
> > > > think.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Best, Jingsong Lee
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


-- 
Best regards!
Rui Li

Reply via email to