I'm reproducing the problem. There seems to be a race in Curator's ConnectionState#checkTimeouts() and HandleHolder#internalClose() with ZooKeeper#testableWaitForShutdown. I'll figure it out over the weekend.
-JZ

> On Nov 30, 2018, at 3:22 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>
> Thanks Jordan,
> I got one of the guys from the 2.x back port PR to try and run the tests and
> they seem to have the same problem on the master branch.
>
> On Sat, 1 Dec. 2018, 12:51 am Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
> I'll try running tests over the weekend
>
>> On Nov 27, 2018, at 7:49 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>
>> Did you get anywhere with this Jordan?
>>
>> I've just done a bit of debugging on it, and it seems that when the
>> teardown() method in the BaseClassForTests method gets called, the close()
>> method on the server instance does not kill one of the threads. There is a
>> ReaderThread that seems to run indefinitely, and this appears to cause
>> issues with subsequent tests that run. I don't know when this has started
>> happening and whether it's something unique to my environment, but it
>> happens consistently, so after the first test in a suite runs, the second
>> test just hangs when trying to close the Curator framework.
>>
>> On Tue, Nov 20, 2018 at 2:47 PM Cameron McKenzie <mckenzie....@gmail.com> wrote:
>> At the moment that's the first test that's being run for me, and it is
>> actually worse than just failing, it's hanging indefinitely. Something is
>> not getting cleaned up correctly it would seem. I haven't done a lot of
>> digging yet as I want to make sure it's not just my environment first.
>> cheers
>>
>> On Tue, Nov 20, 2018 at 2:39 PM Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>> It would be good to know which ones consistently fail.
>>
>>> On Nov 19, 2018, at 10:18 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>>
>>> Thanks,
>>> Yeah, they've been flaky in the past but would eventually succeed, but now,
>>> for me at least, they're just failing consistently.
>>>
>>> On Tue, Nov 20, 2018 at 2:14 PM Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>> They've been flakey for a long time. I haven't run it in many months
>>> though. I'll try to run soon.
>>>
>>> -JZ
>>>
>>> > On Nov 19, 2018, at 10:00 PM, Cameron McKenzie <cammcken...@apache.org> wrote:
>>> >
>>> > Guys,
>>> > Is anyone else having issues running unit tests? I haven't done it for a
>>> > while, but it appears that the code that injects LOST events when session
>>> > expiration occurs while in a SUSPENDED state is not working. The LOST
>>> > event just never appears.
>>> >
>>> > If I run TestBackgroundStates.testConnectionStateListenable() this just
>>> > times out at line 124 when waiting for the LOST event to appear.
>>> >
>>> > Can someone run this test case when they've got a minute just to confirm
>>> > that it's not something broken in my environment?
>>> > cheers
>>> > Cam