I was hoping that by being stricter, we are going to make an effort to fix the 
flaky
tests. The reason is precisely what you mention: if you cannot commit, you need
to fix this situation. That is what I meant with providing an additional 
incentive to
make testing more robust: currently there is no incentive. I do not think that
improving tests is a responsibility of a single developer, but rather a 
responsibility
of all of us. Disabling the tests is a one-time solution to get to a clean run, 
trying to
accelerate the process to get to it, as we did not want to block development for
weeks. Then flaky tests should just not go in, and if they do, we can just 
revert
the patch (this is what [2] says btw).

The first thing we need to do is identifying why a test is flaky. After 
examining runs
for the last few days, I saw many of them fall in following categories:
- Many of them are flaky because of estimations such as data size. One possible
solution is to mask data size for those tests, as we already mask some 
environment
dependent information.
- Some of them are flaky because environment issues, e.g., I see this a lot with
TestTriggersMoveWorkloadManager. If their logic cannot be rewritten, a possible
solution is to add a max number of retries selectively for those tests 
(surefire has
a rerunFailingTestsCount option that I am not familiar with), expecting that 
they
pass at least once.

Not sure if you have other ideas?

-Jesús


On 5/15/18, 11:59 AM, "Vihang Karajgaonkar" <vih...@cloudera.com> wrote:

    Can we also define a standard process to identify a flaky test and thereby
    making it eligible to be disabled? I am worried that the intermittent the
    flaky ones will stall the patches when we restart allowing the commits.
    
    On Tue, May 15, 2018 at 10:50 AM, Vineet Garg <vg...@hortonworks.com> wrote:
    
    > +1
    >
    > > On May 15, 2018, at 9:13 AM, Alan Gates <alanfga...@gmail.com> wrote:
    > >
    > > +1.
    > >
    > > Alan.
    > >
    > > On Tue, May 15, 2018 at 9:12 AM, Sergio Pena <sergio.p...@cloudera.com>
    > > wrote:
    > >
    > >> +1
    > >>
    > >> On Tue, May 15, 2018 at 11:05 AM, Gunther Hagleitner <
    > >> ghagleit...@hortonworks.com> wrote:
    > >>
    > >>> +1
    > >>> ________________________________________
    > >>> From: Sankar Hariappan <shariap...@hortonworks.com>
    > >>> Sent: Tuesday, May 15, 2018 9:03 AM
    > >>> To: dev@hive.apache.org
    > >>> Subject: Re: [VOTE] Stricter commit guidelines
    > >>>
    > >>> +1
    > >>>
    > >>>
    > >>> On 15/05/18, 9:30 PM, "Sahil Takiar" <takiar.sa...@gmail.com> wrote:
    > >>>
    > >>>> +1
    > >>>>
    > >>>> On Tue, May 15, 2018 at 10:56 AM, Owen O'Malley <
    > owen.omal...@gmail.com
    > >>>
    > >>>> wrote:
    > >>>>
    > >>>>> +1
    > >>>>>
    > >>>>> On Tue, May 15, 2018 at 8:55 AM, Peter Vary <pv...@cloudera.com>
    > >> wrote:
    > >>>>>
    > >>>>>> +1 - Hoping for something like this for a long while! Thanks for
    > >>> taking
    > >>>>>> this up all!
    > >>>>>>
    > >>>>>>> On May 15, 2018, at 5:44 PM, Jesus Camacho Rodriguez <
    > >>>>>> jcama...@apache.org> wrote:
    > >>>>>>>
    > >>>>>>> Forgot to mention the length of the vote in original message.
    > >>>>>>>
    > >>>>>>> Let's leave the vote open for a shorter period than usual, for
    > >>> instance
    > >>>>>> 48 hours, i.e., till Wednesday 10pm PST. Situation can only get
    > >> worse
    > >>>>> than
    > >>>>>> it is now if we do not take action for a longer period.
    > >>>>>>>
    > >>>>>>> As Alan suggested, vote passes if there is a lazy majority (at
    > >>> least 3
    > >>>>>> votes, more +1s than -1s).
    > >>>>>>>
    > >>>>>>> Thanks,
    > >>>>>>> Jesús
    > >>>>>>>
    > >>>>>>>
    > >>>>>>> On 5/15/18, 8:37 AM, "Andrew Sherman" <asher...@cloudera.com>
    > >>> wrote:
    > >>>>>>>
    > >>>>>>>   +1
    > >>>>>>>
    > >>>>>>>   On Tue, May 15, 2018 at 2:34 AM Rui Li <lirui.fu...@gmail.com>
    > >>>>> wrote:
    > >>>>>>>
    > >>>>>>>> +1
    > >>>>>>>>
    > >>>>>>>> On Tue, May 15, 2018 at 2:24 PM, Prasanth Jayachandran <
    > >>>>>>>> pjayachand...@hortonworks.com> wrote:
    > >>>>>>>>
    > >>>>>>>>> +1
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> Thanks
    > >>>>>>>>> Prasanth
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> On Mon, May 14, 2018 at 10:44 PM -0700, "Jesus Camacho
    > >> Rodriguez"
    > >>> <
    > >>>>>>>>> jcama...@apache.org<mailto:jcama...@apache.org>> wrote:
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> After work has been done to ignore most of the tests that were
    > >>>>> failing
    > >>>>>>>>> consistently/intermittently [1], I wanted to start this vote to
    > >>>>> gather
    > >>>>>>>>> support from the community to be stricter wrt committing patches
    > >>> to
    > >>>>>> Hive.
    > >>>>>>>>> The committers guide [2] already specifies that a +1 should be
    > >>>>> obtained
    > >>>>>>>>> before committing, but there is another clause that allows
    > >>> committing
    > >>>>>>>> under
    > >>>>>>>>> the presence of flaky tests (clause 4). Flaky tests are as good
    > >> as
    > >>>>>> having
    > >>>>>>>>> no tests, hence I propose to remove clause 4 and enforce the +1
    > >>> from
    > >>>>>>>>> testing infra before committing.
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> As I see it, by enforcing that we always get a +1 from the
    > >> testing
    > >>>>>> infra
    > >>>>>>>>> before committing, 1) we will have a more stable project, and 2)
    > >>> we
    > >>>>>> will
    > >>>>>>>>> have another incentive as a community to create a more robust
    > >>> testing
    > >>>>>>>>> infra, e.g., replacing flaky tests for similar unit tests that
    > >> are
    > >>>>> not
    > >>>>>>>>> flaky, trying to decrease running time for tests, etc.
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> Please, share your thoughts about this.
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> Here is my +1.
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> Thanks,
    > >>>>>>>>>
    > >>>>>>>>> Jes?s
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> [1] http://mail-archives.apache.org/mod_mbox/hive-dev/201805.
    > >>>>>>>>> mbox/%3C63023673-AEE5-41A9-BA52-5A5DFB2078B6%40apache.org%3E
    > >>>>>>>>>
    > >>>>>>>>> [2] https://cwiki.apache.org/confluence/display/Hive/
    > >>>>>>>>> HowToCommit#HowToCommit-PreCommitruns,andcommittingpatches
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>
    > >>>>>>>>
    > >>>>>>>> --
    > >>>>>>>> Best regards!
    > >>>>>>>> Rui Li
    > >>>>>>>>
    > >>>>>>>
    > >>>>>>>
    > >>>>>>>
    > >>>>>>
    > >>>>>>
    > >>>>>
    > >>>>
    > >>>>
    > >>>>
    > >>>> --
    > >>>> Sahil Takiar
    > >>>> Software Engineer
    > >>>> takiar.sa...@gmail.com | (510) 673-0309
    > >>>
    > >>>
    > >>>
    > >>
    >
    >
    


Reply via email to