BTW - looks like there's already an open bug for this: https://issues.apache.org/jira/browse/CURATOR-472 - this is what's causing the problem. I'll have a possible fix soon.
> On Nov 30, 2018, at 6:52 PM, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>
> I'm reproducing the problem. There seems to be a race in Curator's
> ConnectionState#checkTimeouts() and HandleHolder#internalClose() with
> ZooKeeper#testableWaitForShutdown. I'll figure it out over the weekend.
>
> -JZ
>
>> On Nov 30, 2018, at 3:22 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>
>> Thanks Jordan,
>> I got one of the guys from the 2.x back port PR to try and run the tests and
>> they seem to have the same problem on the master branch.
>>
>> On Sat, 1 Dec. 2018, 12:51 am Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>> I'll try running tests over the weekend
>>
>>> On Nov 27, 2018, at 7:49 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>>
>>> Did you get anywhere with this Jordan?
>>>
>>> I've just done a bit of debugging on it, and it seems that when the
>>> teardown() method in the BaseClassForTests class gets called, the close()
>>> method on the server instance does not kill one of the threads. There is a
>>> ReaderThread that seems to run indefinitely, and this appears to cause
>>> issues with subsequent tests that run. I don't know when this has started
>>> happening and whether it's something unique to my environment, but it
>>> happens consistently, so after the first test in a suite runs, the second
>>> test just hangs when trying to close the Curator framework.
>>>
>>> On Tue, Nov 20, 2018 at 2:47 PM Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>> At the moment that's the first test that's being run for me, and it is
>>> actually worse than just failing, it's hanging indefinitely. Something is
>>> not getting cleaned up correctly it would seem. I haven't done a lot of
>>> digging yet as I want to make sure it's not just my environment first.
>>> cheers
>>>
>>> On Tue, Nov 20, 2018 at 2:39 PM Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>> It would be good to know which ones consistently fail.
>>>
>>>> On Nov 19, 2018, at 10:18 PM, Cameron McKenzie <mckenzie....@gmail.com> wrote:
>>>>
>>>> Thanks,
>>>> Yeah, they've been flaky in the past but would eventually succeed, but
>>>> now, for me at least, they're just failing consistently.
>>>>
>>>> On Tue, Nov 20, 2018 at 2:14 PM Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>> They've been flakey for a long time. I haven't run it in many months
>>>> though. I'll try to run soon.
>>>>
>>>> -JZ
>>>>
>>>> > On Nov 19, 2018, at 10:00 PM, Cameron McKenzie <cammcken...@apache.org> wrote:
>>>> >
>>>> > Guys,
>>>> > Is anyone else having issues running unit tests? I haven't done it for a
>>>> > while, but it appears that the code that injects LOST events when session
>>>> > expiration occurs while in a SUSPENDED state is not working. The LOST event
>>>> > just never appears.
>>>> >
>>>> > If I run TestBackgroundStates.testConnectionStateListenable() this just
>>>> > times out at line 124 when waiting for the LOST event to appear.
>>>> >
>>>> > Can someone run this test case when they've got a minute just to confirm
>>>> > that it's not something broken in my environment?
>>>> > cheers
>>>> > Cam