With fairly apt timing, we've recently had a PR merged (
https://github.com/apache/geode/pull/4745) that introduced a test that has
resulted in fairly consistently flaky failures in the pipeline (JIRA
ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
The PR was quite large and touched/created a lot of tests, so StressNewTest
never ran on it:
https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given how
frequently the test is failing in the pipeline (11 failures on various
IntegrationTest jobs over the past 6 commits), it seems very likely that
had StressNewTest been allowed to run, this issue would have been caught at
the PR stage and could have been remedied then, instead of resulting in a
nearly constantly red pipeline.

I've already given my +1 to this proposal, but this situation has prompted
me to make it a +2.

On Fri, Mar 6, 2020 at 1:46 PM Donal Evans <doev...@pivotal.io> wrote:

> With fairly apt timing, we've recently had a PR merged (
> https://github.com/apache/geode/pull/4745) that introduced a test that
> has resulted in fairly consistently flaky failures in the pipeline (JIRA
> ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
> The PR was quite large and touched/created a lot of tests, so StressNewTest
> never ran on it:
> https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given
> how frequently the test is failing in the pipeline (11 failures on various
> IntegrationTest jobs over the past 6 commits), it seems very likely that
> had StressNewTest been allowed to run, this issue would have been caught at
> the PR stage and could have been remedied then, instead of resulting in a
> nearly constantly red pipeline.
>
> I've already given my +1 to this proposal, but this situation has prompted
> me to make it a *+1*.
>
> On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade <aging...@pivotal.io>
> wrote:
>
>> The stress test is to identify the flaky-ness within the test; and
>> assuming
>> any changes to the test may have introduced the flaky-ness.
>> It's about paying the cost upfront or later when the test is determined to
>> be flaky.
>> If 25+ tests have been changed in a PR, the cost of running stress test
>> for
>> those;  and gating the PR for so long.
>> Knowing how much pain it causes to fix a flaky test after a certain/long
>> duration of time; I am +1 for doing this change.
>>
>> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith <dsm...@pivotal.io> wrote:
>>
>> > What's the current timeout for StressNewTest? Maybe if we just up the
>> > threshold to 100 tests or so and up the timeout to match we can catch
>> > pretty much all PRs.
>> >
>> > I'm not sure why the job is flagging more tests than it should. It looks
>> > like at some point @rhoughon changed it to read the merge base from some
>> > file created by concourse as an optimization [1] - I suspect maybe that
>> > file is inaccurate?
>> >
>> > I originally wrote this job. It's definitely not a panacea, it will only
>> > catch a new flaky test if
>> >  - the test is really flaky (likely to fail more than 1/50 times)
>> >  - the change actually happened in the test file itself, and not the
>> > product or some other test file.
>> >
>> > [1]
>> >
>> >
>> https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87
>> >
>> > -Dan
>> >
>> > On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols <onich...@pivotal.io>
>> wrote:
>> >
>> > > We don’t tend to look too closely at successful PR checks to see
>> whether
>> > > they actually checked anything at all.
>> > >
>> > > One example I found is
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
>> > > <
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
>> > > >:
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.
>> > >
>> > > Here are 92 more examples (url’s omitted for brevity — use the example
>> > > above as a template and just replace the last 4 digits):
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6243)
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6249)
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6402)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6262)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6430)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6439)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6449)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6454)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6458)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6459)
>> > > 28 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6224)
>> > > 28 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6441)
>> > > 28 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6448)
>> > > 28 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6452)
>> > > 29 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6102)
>> > > 29 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6177)
>> > > 30 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5939)
>> > > 30 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5940)
>> > > 30 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5949)
>> > > 30 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6473)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5953)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6187)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6470)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6471)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6474)
>> > > 31 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6475)
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5958)
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6173)
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6236)
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6237)
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6242)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6246)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6248)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6250)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6251)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6254)
>> > > 33 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6255)
>> > > 34 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6139)
>> > > 34 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6141)
>> > > 34 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6264)
>> > > 34 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6266)
>> > > 34 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6271)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6142)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6143)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6277)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6309)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6413)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6414)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6419)
>> > > 35 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6420)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6144)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6146)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6147)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6310)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6326)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6329)
>> > > 36 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6330)
>> > > 40 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6159)
>> > > 40 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6160)
>> > > 40 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6188)
>> > > 41 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6440)
>> > > 41 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6480)
>> > > 41 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6483)
>> > > 43 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6190)
>> > > 44 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5768)
>> > > 44 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5772)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6191)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6218)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6219)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6225)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6244)
>> > > 46 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6245)
>> > > 48 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6129)
>> > > 49 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5804)
>> > > 51 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5877)
>> > > 53 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5742)
>> > > 54 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 5757)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6232)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6233)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6234)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6259)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6260)
>> > > 66 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6278)
>> > > 79 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6481)
>> > > 103 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6493)
>> > > 106 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6206)
>> > > 106 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6207)
>> > > 115 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 5769)
>> > > 118 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6226)
>> > > 721 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6197)
>> > > 722 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6217)
>> > > 722 is too many changed tests to stress test. Allowing this job to
>> pass
>> > > without stress testing.  (build 6221)
>> > >
>> > > Please note, sometimes the number of files reported seems to be way
>> > higher
>> > > than it should be.  For example
>> > https://github.com/apache/geode/pull/4691
>> > > <https://github.com/apache/geode/pull/4691> shows exactly 1 test file
>> > > touched, but
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278
>> > > <
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278
>> > >
>> > > thinks it touched 66 test files.
>> > >
>> > > There’s currently no good data to predict how long StressNew will
>> take,
>> > it
>> > > just depends on the mix of tests touched.  I am aware of at least 4
>> cases
>> > > (with less than 25 files) where the timeout was hit.  In two of those
>> > cases
>> > > we re-ran with a longer timeout, and in two cases the PR author split
>> up
>> > > the PR into half a dozen smaller PRs to get around it.
>> > >
>> > >
>> > > > On Mar 1, 2020, at 7:52 PM, Anthony Baker <aba...@pivotal.io>
>> wrote:
>> > > >
>> > > > What percentage of PR’s are currently subject to the 25 test file
>> rule?
>> > > How many would be subject to the concourse timeout?
>> > > >
>> > > > I’d like to understand the scope of the impact for this change.
>> > > >
>> > > > Anthony
>> > > >
>> > > >
>> > > >> On Mar 1, 2020, at 8:58 AM, Owen Nichols <onich...@pivotal.io>
>> wrote:
>> > > >>
>> > > >> Impossible, no. Inconvenient, perhaps, but a small price to pay for
>> > > being
>> > > >> able to trust that green means green.
>> > > >>
>> > > >> With or without this proposed change, if anyone is having trouble
>> > > getting
>> > > >> their PR to pass StressNew, please bring it up on the dev list and
>> we
>> > > can
>> > > >> discuss the appropriate solution on a case-by-case basis (e.g.
>> > > increasing
>> > > >> timeout, fixing the logic that identifies changed test files,
>> > splitting
>> > > >> into multiple PRs, authorizing an override, etc).
>> > > >>
>> > > >> On Sun, Mar 1, 2020 at 8:56 AM Dan Smith <dsm...@pivotal.io>
>> wrote:
>> > > >>
>> > > >>> Won't this make it impossible to merge refactoring changes that
>> touch
>> > > a lot
>> > > >>> of tests?
>> > > >>>
>> > > >>> -Dan
>> > > >>>
>> > > >>> On Sat, Feb 29, 2020, 12:37 PM Robert Houghton <
>> rhough...@pivotal.io
>> > >
>> > > >>> wrote:
>> > > >>>
>> > > >>>> Yes, as it should
>> > > >>>>
>> > > >>>> On Sat, Feb 29, 2020, 12:25 Dan Smith <dsm...@pivotal.io> wrote:
>> > > >>>>
>> > > >>>>> Doesn't the build fail when concourse times out?
>> > > >>>>
>> > > >>>>
>> > > >>>
>> > > >
>> > >
>> > >
>> >
>>
>

Reply via email to