1. I think we should have a special process for patches which change 
dependencies. We know that they have the potential for damage way beyond their 
.patch size, and I fear them.

2. Like Allen says, we can't afford to have a full test run on every patch, 
because then the infra is overloaded and either you don't get a turnaround 
on a test within the day of submission, or the queue builds up so big that 
it's only by Sunday evening that the backlog is cleared.

3. And we don't test the object stores enough: even though you can do it with 
just a set of credentials, we can't grant them to Jenkins (security), and it 
still takes lots of time (though with HADOOP-14553 we will cut the windows time 
down).

4. And like Allen also says, tests are a bit unreliable on the test infra. 
Example: TestKDiag, one of mine. No idea why it fails; it does work locally. 
Generally though, I think a lot of them are race conditions where the Jenkins 
machines execute things in a different order, or simply take longer than we 
expect.

How about we identify those tests which fail intermittently on Jenkins alone 
and somehow downgrade them/get them explicitly excluded? I know it's cheating, 
and we should try to fix them first (after all, the way they fail may change, 
which would be a regression).
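
As a rough illustration of the "identify the intermittent ones" step, here is 
a minimal sketch. It assumes each Jenkins run has already been reduced to a 
map of test name -> passed; the class FlakyTestFinder and that input shape are 
hypothetical, not something in Hadoop, Yetus or Jenkins. A test counts as 
intermittent if it both passed and failed somewhere in the history.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper; not part of Hadoop, Yetus or Jenkins. */
final class FlakyTestFinder {

  private FlakyTestFinder() {
  }

  /**
   * Find tests that both passed and failed across a set of runs.
   *
   * @param runs one map per CI run: test name -> true if it passed.
   *             This input shape is an assumption made for the sketch.
   * @return sorted names of tests with mixed outcomes
   */
  static Set<String> findIntermittent(List<Map<String, Boolean>> runs) {
    Set<String> passedSomewhere = new HashSet<>();
    Set<String> failedSomewhere = new HashSet<>();
    for (Map<String, Boolean> run : runs) {
      for (Map.Entry<String, Boolean> outcome : run.entrySet()) {
        if (outcome.getValue()) {
          passedSomewhere.add(outcome.getKey());
        } else {
          failedSomewhere.add(outcome.getKey());
        }
      }
    }
    // Intermittent = seen passing AND seen failing.
    // Tests that always fail are plain broken and are not reported here.
    Set<String> intermittent = new TreeSet<>(passedSomewhere);
    intermittent.retainAll(failedSomewhere);
    return intermittent;
  }
}
```

Tests which always fail are deliberately left out of the result: those are 
broken rather than unreliable, and excluding them would hide a real failure.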

LambdaTestUtils.eventually() is designed to support spinning until a test 
passes, and with the most recent fix (HADOOP-14851) it may actually do this. It 
can help with race conditions (& inconsistent object stores) by wrapping up the 
entire retry-until-something-works process. But it only works if the race 
condition is between the production code and the assertion; if it is in the 
production code across threads, that's a serious problem.
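
To make the retry-until-something-works idea concrete, here is a standalone 
sketch. It is not the LambdaTestUtils code and does not try to reproduce its 
signature; the class RetryUntil and its parameters are assumptions for 
illustration only. It keeps re-running a probe until it stops throwing, or 
rethrows the last failure once the timeout has passed.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in; NOT the Hadoop LambdaTestUtils implementation. */
final class RetryUntil {

  private RetryUntil() {
  }

  /**
   * Re-run a probe until it succeeds or the timeout expires.
   *
   * @param timeoutMillis how long to keep retrying
   * @param intervalMillis sleep between attempts
   * @param probe the check; it signals "not yet" by throwing
   * @return the value of the first successful probe call
   * @throws Exception the last failure, if the timeout is reached
   */
  static <T> T eventually(long timeoutMillis, long intervalMillis,
      Callable<T> probe) throws Exception {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      try {
        return probe.call();
      } catch (InterruptedException e) {
        // Never swallow interrupts: give up immediately.
        throw e;
      } catch (Exception | AssertionError e) {
        // The probe failed; only give up once the deadline has passed.
        if (System.nanoTime() - deadline >= 0) {
          throw e;
        }
      }
      Thread.sleep(intervalMillis);
    }
  }
}
```

A test would wrap its assertion in the probe, for example 
RetryUntil.eventually(30000, 500, () -> { assertListingContains(path); return 
null; }), where assertListingContains is a made-up assertion. This only hides 
a delay between the production code and the check; it does nothing for a race 
inside the production code itself.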

Anyway: tests fail, we should care. If you want to learn how to care, try and 
do what Allen has been busy with: try and keep Jenkins happy.

Maybe we should have a week to try and collaborate on that, with a focus on one 
or more specific builds (branch 3?) for now, and get that stable and happy?

If we have to do it with a Jenkins profile and skipping the unreliable tests, 
so be it.


> On 14 Sep 2017, at 22:44, Arun Suresh <arun.sur...@gmail.com> wrote:
> 
> I actually like this idea:
> 
>> One approach: do a dependency:list of each module and for those that show
>> a change with the patch we run tests there.
> 
> Can 'jdeps' be used to prune the list of sub modules on which we do
> pre-commit ? Essentially, we figure out which classes actually use the
> modified classes from the patch and then run the pre-commit on those
> packages ?
> 
> Cheers
> -Arun
> 
> On Thu, Sep 14, 2017 at 2:23 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> 
>> On Thu, Sep 14, 2017 at 1:59 PM, Sean Busbey <bus...@apache.org> wrote:
>> 
>>> 
>>> 
>>> On 2017-09-14 15:36, Chris Douglas <cdoug...@apache.org> wrote:
>>>> This has gotten bad enough that people are dismissing legitimate test
>>>> failures among the noise.
>>>> 
>>>> On Thu, Sep 14, 2017 at 1:20 PM, Allen Wittenauer
>>>> <a...@effectivemachines.com> wrote:
>>>>>        Someone should probably invest some time into integrating the
>>>>> HBase flaky test code a) into Yetus and then b) into Hadoop.
>>>> 
>>>> What does the HBase flaky test code do? Another extension to
>>>> test-patch could run all new/modified tests multiple times, and report
>>>> to JIRA if any run fails.
>>>> 
>>> 
>>> The current HBase stuff segregates untrusted tests by looking through
>>> nightly test runs to find things that fail intermittently. We then don't
>>> include those tests in either nightly or precommit tests. We have a
>>> different job that just runs the untrusted tests and if they start
>>> passing removes them from the list.
>>> 
>>> There's also a project getting used by SOLR called "BeastIT" that goes
>>> through running parallel copies of a given test a large number of times
>>> to reveal flaky tests.
>>> 
>>> Getting either/both of those into Yetus and used here would be a huge
>>> improvement.
>>> 
>>> I discussed this on yetus-dev a while back and Allen thought it'd be
>> non-trivial:
>> 
>> https://lists.apache.org/thread.html/552ad614d1b3d5226a656b60c0108457bcaa1219fb9ad985f8750ba1@%3Cdev.yetus.apache.org%3E
>> 
>> I unfortunately don't have the test-patch.sh expertise to dig into this.
>> 
>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>> 
>>> 
>> 


