+1, thks Etienne for the hard work and to have spent time on it even if gradle was making it more complicated, really appreciated.
Romain Manni-Bucau @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance> 2018-04-26 15:29 GMT+02:00 Etienne Chauchot <echauc...@apache.org>: > Hi guys, > After some investigations, it appears to be a timing problem in the test. > If we bloc until the end of the threads in metricsExecutorService before > the ThreadLeakTracker rule is exectuted, then the test passes in gradle (no > leaks detected). I guess we were lucky with the other build runs that the > threads had time to finish before the rule gets executed. > > I also added a blackbox test with a real pipeline with a metrics DoFn and > it passes with no leaks with Romain's fix. > > Finally I ran these 2 tests on the actual master without Romain's fix and > they fail. > > So I'll update Romain's PR and merge it. > > Etienne > > > Le mardi 24 avril 2018 à 14:19 +0000, Kenneth Knowles a écrit : > > What I am trying to figure out is which of these we have: > > (a) the "test is broken" scenario > > - pass in maven (thread did not leak) > - pass in idea (thread did not leak) > - fail in gradle (thread did not leak, but the test incorrectly thinks > one did) > > (b) the "found a bug" scenario > > - pass in maven (thread leak not detected/exercised because of maven > setup) > - pass in idea (thread leak not detected/exercised because of idea setup) > - fail in gradle (thread leak exercised & detected because of gradle > setup) > > If no thread leaked under gradle and the test failed then it seems like we > are in (a) where the test depends on aspects of its environment that it > probably shouldn't. If there is a thread leak in a different module getting > detected here, that's a variant of (a) + (b) since the test is broken _and_ > we found a bug. That's what I'm trying to figure out with my questions. I > haven't had time to look at the code to see which it is. > > To be fair, we might be stuck in scenario (a) because you don't have a way > to test what you are hoping to test in a framework-independent way, and if > it was a critical thing to test we'd want to customize our environment to > catch it. > > Kenn > > > On Wed, Apr 18, 2018 at 11:18 PM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > > No, otherwise it wouldnt pass in idea and maven. > > Le 19 avr. 2018 00:57, "Kenneth Knowles" <k...@google.com> a écrit : > > Does the test think a thread leaked when no thread leaked? > > On Wed, Apr 18, 2018 at 1:51 PM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > > The leaking threadq are already dumped but we dont track who created it > > Le 18 avr. 2018 22:50, "Kenneth Knowles" <k...@google.com> a écrit : > > For this particular test, is it failing because a different test class > leaked threads? Is there a way to make something like a testing base class > with @Before and @After actions that would catch the leak closer to where > it happened? A quick search found this: https://github.com/ > elastic/elasticsearch-mapper-attachments/issues/88 but if that doesn't > apply maybe something else does. The very best thing is probably to use a > dynamic analysis since it is not very practical to unit test for leaks > (thread, memory, file handle, whatever) and other global correctness > criteria. > > Kenn > > On Wed, Apr 18, 2018 at 10:13 AM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > > > > Le 18 avr. 2018 18:53, "Scott Wegner" <sweg...@google.com> a écrit : > > I wanted to provide some additional information about this issue and > Gradle test configuration. > > For context, BEAM-4088 [1] refers to a new test on PR/4965 [2] which > verifies that no threads are being leaked. Under Maven the test passes, but > under Gradle it reports leaked threads [3]. > > I'm not familiar with this test's logic, but presumably some differences > in test isolation are causing the failures. In Gradle, test classes are > executed in parallel in forked VMs, and those VMs are reused between test > classes. This configuration optimizes for execution time, but can break if > tests are not well isolated. > > When a test fails in this configuration, the first step should be to > understand why it fails and see if the test can be implemented with > better isolation. Barring that, we can update the execution parameters at > the project level for stronger sandboxing at the cost of longer execution > time. In Gradle, this is done by setting maxParallelForks [4] and forkEvery > [5] options on the Test task. > > > I dont know, first of all no hardcoded value must be in any build file > until required. A required case can be "fork and dont reuse the forks" > cause this IO is known as leaking. > > MaxFork and forEvery are about forks. > The test works if it runs alone in a fork until another test execution > leaked and still runs. > > Killing the daemon can help (and is a good practise on the CI), maybe > worth a try. > > We can also fork a thread for rhis particular test bit it just hides a > nasty bug so probably not worth it. > > > [1] https://issues.apache.org/jira/browse/BEAM-4088 > [2] https://github.com/apache/beam/pull/4965 > [3] https://scans.gradle.com/s/grha56432j3t2/tests/ > jqhvlvf72f7pg-ipde5etqqejoa?openStackTraces=WzBd > [4] https://docs.gradle.org/current/dsl/org.gradle.api. > tasks.testing.Test.html#org.gradle.api.tasks.testing.Test:maxParallelForks > > [5] https://docs.gradle.org/current/dsl/org.gradle.api. > tasks.testing.Test.html#org.gradle.api.tasks.testing.Test:forkEvery > > On Thu, Apr 12, 2018 at 10:03 AM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > > Seems so but is not. Most of beam tests assume they are alone in their vm > when executing for a bunch of reason, if not the case a lot of side effects > can happen (backend state, local cache drop, ...., uncontrolled resources > and failure due to the GBKTest which creates 100M keys etc...) so you have > to have one test per vm at any time. If this assumption is true my test > will pass, if it is false gradle setup is wrong. > > Le 12 avr. 2018 18:40, "Kenneth Knowles" <k...@google.com> a écrit : > > It seems that the test probably depends on some details of Maven or our > Maven configuration. If so, that's a problem with the test. It should be > able to succeed in any build system, or as a standalone JUnit main built > from the suite, etc. > > Kenn > > On Thu, Apr 12, 2018 at 9:27 AM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > > on maven it is quite reliable, ran it > 10 times without any failure. > > I suspect (but didnt check by lack of time) gradle parallelism is > different somehow and can lead to some flackyness here. > > Romain Manni-Bucau > @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > > > 2018-04-12 18:20 GMT+02:00 Scott Wegner <sweg...@google.com>: > > It looks like the precommit failure [1] is for a new test that was added. > > Have you debugged the test to ensure it's not flaky? > > > > [1] > > https://builds.apache.org/job/beam_PreCommit_Java_ > GradleBuild/4059/testReport/junit/org.apache.beam.runners.direct/ > ExecutorServiceParallelExecutorTest/ensureMetricsThreadDoesntLeak/ > > > > On Thu, Apr 12, 2018 at 5:06 AM Romain Manni-Bucau < > rmannibu...@gmail.com> > > wrote: > >> > >> Hi guys, > >> > >> did the gradle track changed the way test execution was done? > >> > >> This PR https://github.com/apache/beam/pull/4965 works very well with > >> maven and sometimes doesn't pass with gradle. Think we should keep the > >> previous setup which was globally reliable (I'm not speaking of tests > >> which are not but of the setup). > >> > >> Any inputs or in progress todo i missed? > >> > >> Romain Manni-Bucau > >> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book > > > > -- > > > > > > Got feedback? http://go/swegner-feedback > > > > -- > > > Got feedback? http://go/swegner-feedback > > > > > > > >