With fairly apt timing, we've recently had a PR merged ( https://github.com/apache/geode/pull/4745) that introduced a test that has resulted in fairly consistently flaky failures in the pipeline (JIRA ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843). The PR was quite large and touched/created a lot of tests, so StressNewTest never ran on it: https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given how frequently the test is failing in the pipeline (11 failures on various IntegrationTest jobs over the past 6 commits), it seems very likely that had StressNewTest been allowed to run, this issue would have been caught at the PR stage and could have been remedied then, instead of resulting in a nearly constantly red pipeline.
I've already given my +1 to this proposal, but this situation has prompted me to make it a +2. On Fri, Mar 6, 2020 at 1:46 PM Donal Evans <doev...@pivotal.io> wrote: > With fairly apt timing, we've recently had a PR merged ( > https://github.com/apache/geode/pull/4745) that introduced a test that > has resulted in fairly consistently flaky failures in the pipeline (JIRA > ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843). > The PR was quite large and touched/created a lot of tests, so StressNewTest > never ran on it: > https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given > how frequently the test is failing in the pipeline (11 failures on various > IntegrationTest jobs over the past 6 commits), it seems very likely that > had StressNewTest been allowed to run, this issue would have been caught at > the PR stage and could have been remedied then, instead of resulting in a > nearly constantly red pipeline. > > I've already given my +1 to this proposal, but this situation has prompted > me to make it a *+1*. > > On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade <aging...@pivotal.io> > wrote: > >> The stress test is to identify the flaky-ness within the test; and >> assuming >> any changes to the test may have introduced the flaky-ness. >> It's about paying the cost upfront or later when the test is determined to >> be flaky. >> If 25+ tests have been changed in a PR, the cost of running stress test >> for >> those; and gating the PR for so long. >> Knowing how much pain it causes to fix a flaky test after a certain/long >> duration of time; I am +1 for doing this change. >> >> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith <dsm...@pivotal.io> wrote: >> >> > What's the current timeout for StressNewTest? Maybe if we just up the >> > threshold to 100 tests or so and up the timeout to match we can catch >> > pretty much all PRs. >> > >> > I'm not sure why the job is flagging more tests than it should. It looks >> > like at some point @rhoughon changed it to read the merge base from some >> > file created by concourse as an optimization [1] - I suspect maybe that >> > file is inaccurate? >> > >> > I originally wrote this job. It's definitely not a panacea, it will only >> > catch a new flaky test if >> > - the test is really flaky (likely to fail more than 1/50 times) >> > - the change actually happened in the test file itself, and not the >> > product or some other test file. >> > >> > [1] >> > >> > >> https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87 >> > >> > -Dan >> > >> > On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols <onich...@pivotal.io> >> wrote: >> > >> > > We don’t tend to look too closely at successful PR checks to see >> whether >> > > they actually checked anything at all. >> > > >> > > One example I found is >> > > >> > >> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957 >> > > < >> > > >> > >> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957 >> > > >: >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. >> > > >> > > Here are 92 more examples (url’s omitted for brevity — use the example >> > > above as a template and just replace the last 4 digits): >> > > 26 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6243) >> > > 26 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6249) >> > > 26 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6402) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6262) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6430) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6439) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6449) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6454) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6458) >> > > 27 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6459) >> > > 28 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6224) >> > > 28 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6441) >> > > 28 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6448) >> > > 28 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6452) >> > > 29 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6102) >> > > 29 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6177) >> > > 30 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5939) >> > > 30 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5940) >> > > 30 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5949) >> > > 30 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6473) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5953) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6187) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6470) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6471) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6474) >> > > 31 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6475) >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5958) >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6173) >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6236) >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6237) >> > > 32 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6242) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6246) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6248) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6250) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6251) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6254) >> > > 33 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6255) >> > > 34 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6139) >> > > 34 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6141) >> > > 34 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6264) >> > > 34 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6266) >> > > 34 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6271) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6142) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6143) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6277) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6309) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6413) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6414) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6419) >> > > 35 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6420) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6144) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6146) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6147) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6310) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6326) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6329) >> > > 36 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6330) >> > > 40 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6159) >> > > 40 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6160) >> > > 40 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6188) >> > > 41 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6440) >> > > 41 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6480) >> > > 41 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6483) >> > > 43 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6190) >> > > 44 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5768) >> > > 44 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5772) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6191) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6218) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6219) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6225) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6244) >> > > 46 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6245) >> > > 48 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6129) >> > > 49 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5804) >> > > 51 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5877) >> > > 53 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5742) >> > > 54 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 5757) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6232) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6233) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6234) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6259) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6260) >> > > 66 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6278) >> > > 79 is too many changed tests to stress test. Allowing this job to pass >> > > without stress testing. (build 6481) >> > > 103 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6493) >> > > 106 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6206) >> > > 106 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6207) >> > > 115 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 5769) >> > > 118 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6226) >> > > 721 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6197) >> > > 722 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6217) >> > > 722 is too many changed tests to stress test. Allowing this job to >> pass >> > > without stress testing. (build 6221) >> > > >> > > Please note, sometimes the number of files reported seems to be way >> > higher >> > > than it should be. For example >> > https://github.com/apache/geode/pull/4691 >> > > <https://github.com/apache/geode/pull/4691> shows exactly 1 test file >> > > touched, but >> > > >> > >> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278 >> > > < >> > > >> > >> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/6278 >> > > >> > > thinks it touched 66 test files. >> > > >> > > There’s currently no good data to predict how long StressNew will >> take, >> > it >> > > just depends on the mix of tests touched. I am aware of at least 4 >> cases >> > > (with less than 25 files) where the timeout was hit. In two of those >> > cases >> > > we re-ran with a longer timeout, and in two cases the PR author split >> up >> > > the PR into half a dozen smaller PRs to get around it. >> > > >> > > >> > > > On Mar 1, 2020, at 7:52 PM, Anthony Baker <aba...@pivotal.io> >> wrote: >> > > > >> > > > What percentage of PR’s are currently subject to the 25 test file >> rule? >> > > How many would be subject to the concourse timeout? >> > > > >> > > > I’d like to understand the scope of the impact for this change. >> > > > >> > > > Anthony >> > > > >> > > > >> > > >> On Mar 1, 2020, at 8:58 AM, Owen Nichols <onich...@pivotal.io> >> wrote: >> > > >> >> > > >> Impossible, no. Inconvenient, perhaps, but a small price to pay for >> > > being >> > > >> able to trust that green means green. >> > > >> >> > > >> With or without this proposed change, if anyone is having trouble >> > > getting >> > > >> their PR to pass StressNew, please bring it up on the dev list and >> we >> > > can >> > > >> discuss the appropriate solution on a case-by-case basis (e.g. >> > > increasing >> > > >> timeout, fixing the logic that identifies changed test files, >> > splitting >> > > >> into multiple PRs, authorizing an override, etc). >> > > >> >> > > >> On Sun, Mar 1, 2020 at 8:56 AM Dan Smith <dsm...@pivotal.io> >> wrote: >> > > >> >> > > >>> Won't this make it impossible to merge refactoring changes that >> touch >> > > a lot >> > > >>> of tests? >> > > >>> >> > > >>> -Dan >> > > >>> >> > > >>> On Sat, Feb 29, 2020, 12:37 PM Robert Houghton < >> rhough...@pivotal.io >> > > >> > > >>> wrote: >> > > >>> >> > > >>>> Yes, as it should >> > > >>>> >> > > >>>> On Sat, Feb 29, 2020, 12:25 Dan Smith <dsm...@pivotal.io> wrote: >> > > >>>> >> > > >>>>> Doesn't the build fail when concourse times out? >> > > >>>> >> > > >>>> >> > > >>> >> > > > >> > > >> > > >> > >> >