Yeah, I'm still getting failures too. I will have more of a look if I get time tonight. cheers
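For anyone following along, the "is the counter still moving" check that the simplified test relies on (described in the quoted messages below) can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (ProgressCheck, awaitProgress) are hypothetical and not taken from the Curator test, and a scheduled task stands in for the client threads that increment the counter each time they acquire a lease.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressCheck {
    // Hypothetical helper: returns true if 'counter' rises above the value it had
    // when the call started, polling until the timeout expires.
    public static boolean awaitProgress(AtomicInteger counter, long timeoutMs) throws InterruptedException {
        int start = counter.get();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (counter.get() > start) {
                return true;
            }
            Thread.sleep(10);
        }
        return counter.get() > start;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger counter = new AtomicInteger();
        ScheduledExecutorService worker = Executors.newSingleThreadScheduledExecutor();
        // Stand-in for a client thread that bumps the counter on each acquired lease.
        worker.scheduleAtFixedRate(counter::incrementAndGet, 0, 20, TimeUnit.MILLISECONDS);
        System.out.println("progress while running: " + awaitProgress(counter, 2000));

        // Stand-in for "nobody can get a lease any more": stop the worker entirely.
        worker.shutdownNow();
        worker.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("progress after stop: " + awaitProgress(counter, 200));
    }
}
```

In the real test the second check would run after the server restart, and a timeout there corresponds to the failure mode where no thread ever gets a lease again.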
On Thu, Jun 2, 2016 at 3:01 PM, Jordan Zimmerman <[email protected]> wrote:

> Hmm - I’m still getting failures - maybe I’m wrong. It’s late and I’m off
> to bed. I’ll look at this more tomorrow.
>
> -Jordan
>
> > On Jun 1, 2016, at 10:59 PM, Cameron McKenzie <[email protected]> wrote:
> >
> > The counter is just being used to check if semaphores are still being
> > acquired. Essentially it just runs in a loop acquiring semaphores (and
> > incrementing the counter when they are acquired).
> >
> > Then it shuts down the server, waits until the session is lost, then
> > restarts the server and then checks that semaphores are being acquired
> > correctly again (by checking that the counter is being incremented).
> >
> > This is just a simplified version of the test that is failing.
> >
> > When the test fails, all of the threads are attempting to get a lease on
> > the semaphore, but none of them get it, then the test times out while
> > waiting.
> >
> > On Thu, Jun 2, 2016 at 1:29 PM, Jordan Zimmerman <[email protected]> wrote:
> >
> >> I also had to add:
> >>
> >> while(!lost.get() && (counter.get() > 0))
> >> {
> >>     Thread.sleep(1000);
> >> }
> >> Which seems more correct to me.
> >>
> >>> On Jun 1, 2016, at 9:07 PM, Cameron McKenzie <[email protected]> wrote:
> >>>
> >>> I have just pushed an interprocess_mutex_issue branch. The test case is in
> >>> TestInterprocessMutexNotReconnecting
> >>>
> >>> For me it's failing around 20% of the time.
> >>> cheers
> >>>
> >>> On Thu, Jun 2, 2016 at 11:17 AM, Cameron McKenzie <[email protected]> wrote:
> >>>
> >>>> Yep, just let me confirm that it's actually getting the same problem. I'm
> >>>> sure it was before, but I've just run it a bunch of times and everything's
> >>>> been fine.
> >>>>
> >>>> On Thu, Jun 2, 2016 at 11:15 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>
> >>>>> Can you push your unit test somewhere?
> >>>>>
> >>>>>> On Jun 1, 2016, at 7:37 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>
> >>>>>> Indeed. There seems to be a problem with InterProcessSemaphoreV2 though.
> >>>>>> I've written a simplified unit test that just has a bunch of clients
> >>>>>> attempting to grab a lease on the semaphore. When I shutdown and restart
> >>>>>> ZK, about 25% of the time none of the clients can reacquire the semaphore.
> >>>>>>
> >>>>>> Still trying to work out what's going on, but I'm probably not going to
> >>>>>> have a lot of time today to look at it.
> >>>>>> cheers
> >>>>>>
> >>>>>> On Thu, Jun 2, 2016 at 10:30 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>
> >>>>>>> Odd - SemaphoreClient does seem wrong.
> >>>>>>>
> >>>>>>>> On Jun 1, 2016, at 1:43 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> It looks like under some circumstances (which I haven't worked out yet)
> >>>>>>>> that the InterprocessMutex acquire() is not working correctly when
> >>>>>>>> reconnecting to ZK. Still digging into why this is.
> >>>>>>>>
> >>>>>>>> There also seems to be a bug in the SemaphoreClient, unless I'm missing
> >>>>>>>> something. At lines 126 and 140 it does compareAndSet() calls but throws an
> >>>>>>>> exception if they return true. As far as I can work out, this means that
> >>>>>>>> whenever the lock is acquired, an exception gets thrown indicating that
> >>>>>>>> there are Multiple acquirers.
> >>>>>>>>
> >>>>>>>> This test is failing fairly consistently. It seems to be the remaining test
> >>>>>>>> that keeps failing in the Jenkins build also
> >>>>>>>> cheers
> >>>>>>>>
> >>>>>>>> On Wed, Jun 1, 2016 at 3:10 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Looks like I was incorrect. The NoWatcherException is being thrown on
> >>>>>>>>> success as well, and the problem is not in the cluster restart. Will keep
> >>>>>>>>> digging.
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 1, 2016 at 2:52 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster() is failing (assertion at
> >>>>>>>>>> line 294). Again, it seems like some sort of race condition with the
> >>>>>>>>>> watcher removal.
> >>>>>>>>>>
> >>>>>>>>>> When I run it in Eclipse, it fails maybe 25% of the time. When it fails
> >>>>>>>>>> it seems that it's got something to do with watcher removal. When the test
> >>>>>>>>>> passes, this error is not logged.
> >>>>>>>>>>
> >>>>>>>>>> org.apache.zookeeper.KeeperException$NoWatcherException: KeeperErrorCode
> >>>>>>>>>> = No such watcher for /foo/bar/lock/leases
> >>>>>>>>>> at org.apache.zookeeper.ZooKeeper$ZKWatchManager.containsWatcher(ZooKeeper.java:377)
> >>>>>>>>>> at org.apache.zookeeper.ZooKeeper$ZKWatchManager.removeWatcher(ZooKeeper.java:252)
> >>>>>>>>>> at org.apache.zookeeper.WatchDeregistration.unregister(WatchDeregistration.java:58)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:712)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxn.access$1500(ClientCnxn.java:97)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:948)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:99)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>>>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236)
> >>>>>>>>>>
> >>>>>>>>>> Is it possible it's something to do with the way that the cluster is
> >>>>>>>>>> restarted at line 282? The old cluster is not shutdown, a new one is just
> >>>>>>>>>> created.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 1, 2016 at 10:44 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I’ll try to address this as part of CURATOR-333
> >>>>>>>>>>>
> >>>>>>>>>>>> On May 31, 2016, at 7:08 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe we need to look at some way of providing a hook for tests to wait
> >>>>>>>>>>>> reliably for asynch tasks to finish?
> >>>>>>>>>>>>
> >>>>>>>>>>>> The latest round of tests ran OK. One test failed on an unrelated thing
> >>>>>>>>>>>> (ConnectionLoss), but this appears to be a transient thing as it's worked
> >>>>>>>>>>>> ok the next time around.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will start getting a release together. Thanks for your help with the
> >>>>>>>>>>>> updated tests.
> >>>>>>>>>>>> cheers
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 1, 2016 at 9:12 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> The problem is in-flight watchers and async background calls. There’s no
> >>>>>>>>>>>>> way to cancel these and they can take time to occur - even after a recipe
> >>>>>>>>>>>>> instance is closed.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On May 31, 2016, at 5:11 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Ok, running it again now.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is the problem that the watcher clean up for the recipes is done
> >>>>>>>>>>>>>> asynchronously after they are closed?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Jun 1, 2016 at 1:35 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> OK - please try now. I added a loop in the “no watchers” checker. If there
> >>>>>>>>>>>>>>> are remaining watchers, it will sleep a bit and try again.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On May 31, 2016, at 1:33 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Looks like these failures are intermittent. Running them directly in
> >>>>>>>>>>>>>>>> Eclipse they seem to be passing. I will run the whole thing again in the
> >>>>>>>>>>>>>>>> morning and see how it goes.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 2:29 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> There are still 2 tests failing for me:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> FAILURE! - in org.apache.curator.framework.recipes.cache.TestPathChildrenCache
> >>>>>>>>>>>>>>>>> testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>> Time elapsed: 17.488 sec <<< FAILURE!
> >>>>>>>>>>>>>>>>> java.lang.AssertionError: One or more child watchers are still registered:
> >>>>>>>>>>>>>>>>> [/test]
> >>>>>>>>>>>>>>>>> at org.apache.curator.framework.imps.TestCleanState.closeAndTestClean(TestCleanState.java:53)
> >>>>>>>>>>>>>>>>> at org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(TestPathChildrenCache.java:707)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> FAILURE! - in org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster
> >>>>>>>>>>>>>>>>> testCluster(org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster)
> >>>>>>>>>>>>>>>>> Time elapsed: 96.641 sec <<< FAILURE!
> >>>>>>>>>>>>>>>>> java.lang.AssertionError: expected [true] but found [false]
> >>>>>>>>>>>>>>>>> at org.testng.Assert.fail(Assert.java:94)
> >>>>>>>>>>>>>>>>> at org.testng.Assert.failNotEquals(Assert.java:494)
> >>>>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:42)
> >>>>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:52)
> >>>>>>>>>>>>>>>>> at org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster.testCluster(TestInterProcessSemaphoreCluster.java:294)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Failed tests:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testKilledSession:707 One or more child
> >>>>>>>>>>>>>>>>> watchers are still registered: [/test]
> >>>>>>>>>>>>>>>>> Run 2: PASS
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
> >>>>>>>>>>>>>>>>> found [false]
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Tests run: 495, Failures: 2, Errors: 0, Skipped: 0
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:52 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks, CURATOR-332 wasn't pushed. I will run the tests against that, and
> >>>>>>>>>>>>>>>>>> if it's all good will merge into CURATOR-3.0
> >>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:32 PM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Actually - I don’t remember if branch CURATOR-332 is merged yet. I
> >>>>>>>>>>>>>>>>>>> made/pushed my changes in CURATOR-332
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -jordan
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On May 26, 2016, at 10:04 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'm still seeing 6 failed tests that seem related to the same stuff after
> >>>>>>>>>>>>>>>>>>>> merging your fix:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Failed tests:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testBasics:863 One or more child watchers
> >>>>>>>>>>>>>>>>>>>> are still registered: [/test]
> >>>>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testBasics:863 One or more child watchers
> >>>>>>>>>>>>>>>>>>>> are still registered: [/test]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test]
> >>>>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testEnsurePath:363 One or more child
> >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three]
> >>>>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testEnsurePath:363 One or more child
> >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
> >>>>>>>>>>>>>>>>>>>> found [false]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>>>>>>>>>>>>>>>>> Run 1: PASS
> >>>>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testMultiClientVersioned:256 One or more data
> >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/count]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>>>>>>>>>>>>>>>>> Run 1: TestSharedCount.testSimple:174 One or more data watchers are still
> >>>>>>>>>>>>>>>>>>>> registered: [/count]
> >>>>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testSimple:174 One or more data watchers are still
> >>>>>>>>>>>>>>>>>>>> registered: [/count]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I see the problem. The fix is not simple though so I’ll spend some time on
> >>>>>>>>>>>>>>>>>>>>> it. The TL;DR is that exists watchers are still supposed to get set when
> >>>>>>>>>>>>>>>>>>>>> there is a KeeperException.NoNode and the code isn’t handling it. But,
> >>>>>>>>>>>>>>>>>>>>> while I was looking at the code I realized there are some significant
> >>>>>>>>>>>>>>>>>>>>> additional problems. Curator, here, is trying to mirror what ZooKeeper does
> >>>>>>>>>>>>>>>>>>>>> internally which is insanely complicated. In hindsight, the whole ZK
> >>>>>>>>>>>>>>>>>>>>> watcher mechanism should’ve been decoupled from the mutator APIs. But, of
> >>>>>>>>>>>>>>>>>>>>> course, that’s easy for me to say now.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Thanks Scott,
> >>>>>>>>>>>>>>>>>>>>>> Those tests are now passing for me.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Jordan, testNodeCache:testBasics() is failing consistently on the 3.0
> >>>>>>>>>>>>>>>>>>>>>> branch. It appears that this is actually potentially a bug in the
> >>>>>>>>>>>>>>>>>>>>>> NodeCache. It ends up leaking a Watcher reference. I've had a quick look
> >>>>>>>>>>>>>>>>>>>>>> through, but I haven't dived in in any detail. It's the
> >>>>>>>>>>>>>>>>>>>>>> WatcherRemovalManager stuff I think. If you've got time, can you have a
> >>>>>>>>>>>>>>>>>>>>>> look? If not, let me know and I'll do some more digging.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thanks Scott.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Push the fix to master and merge it into 3.0.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Then I guess, I'll push new versions of 2.11 and 3.2 onto Nexus.
> >>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Alright, I have a fix, but it wants to be applied to both master and 3.0.
> >>>>>>>>>>>>>>>>>>>>>>>> Where should I push the fix?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks Scott,
> >>>>>>>>>>>>>>>>>>>>>>>>> If you just checkout the CURATOR-3.0 branch, they are failing there.
> >>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Sure, what SHA are they failing at Cam?
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Scott can you take a look?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Tree cache tests are still failing. I've tried a few times but no love:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151 actual 6
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> expected -31:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I will have a look into what's going on in the morning. Given that these
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> may take some messing about to fix up, do we just want to vote on 2.11.0
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> separately, as that is all ready to go?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Great news. Thanks.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ====================
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jordan Zimmerman
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have fixed up the test case, all good now.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looks like it was introduced with the schema validation stuff. It now does
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an ACL check prior to the backgrounding call. Because the unit test uses a
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> bogus ACL provider it just throws an exception
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> final String adjustedPath =
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     adjustPath(client.fixForNamespace(givenPath, createMode.isSequential()));
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> List<ACL> aclList = acling.getAclList(adjustedPath);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode, data,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     aclList);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> String returnPath = null;
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if ( backgrounding.inBackground() )
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> {
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>     pathInBackground(adjustedPath, data, givenPath);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I guess the answer is to get the test to force a failure in a
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different way. With the UnhandledErrorListener, the expectation is that
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this only gets called on backgrounding operations?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Same on the master branch, but it passes there, so maybe something has
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> legitimately broken the test. Will let you know if I get stuck.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan Zimmerman <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let me know if you need help.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It might be a bad merge. Have you compared it to the master branch?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -JZ
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM, Cameron McKenzie <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There's a test TestFrameworkBackground:testErrorListener that is failing
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistently on the CURATOR-3.0 branch.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can't see how it has ever worked. It seems to try and provoke an error
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via a bad ACL provider.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> But this ACL provider is called by the CreateBuilderImpl prior to the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backgrounding call, which means that the exception that it throws happens
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in the main Thread of the unit test. So, it just throws an
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedOperationException which is propagated up the stack at which
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> point the unit test fails.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I will look at fixing this, but I just don't understand how it ever
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> worked?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
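The testErrorListener point at the bottom of the thread (an exception thrown before the backgrounding call never reaches the background error listener) can be illustrated with a small stand-alone sketch. This is not Curator code: every name here (BackgroundErrorSketch, bogusAclLookup, createCheckFirst, createCheckInBackground, runAndReport) is hypothetical, a plain Consumer stands in for an UnhandledErrorListener, and an ExecutorService stands in for the background machinery.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class BackgroundErrorSketch {
    // Hypothetical stand-in for an ACL provider that always fails.
    public static void bogusAclLookup(String path) {
        throw new UnsupportedOperationException("bogus ACL provider for " + path);
    }

    // Variant A: the failing lookup runs on the caller's thread, before any work is handed off.
    public static void createCheckFirst(String path, ExecutorService background,
                                        Consumer<Throwable> errorListener) {
        bogusAclLookup(path); // throws here, so the submit below is never reached
        background.submit(() -> runAndReport(() -> { }, errorListener));
    }

    // Variant B: the failing lookup runs inside the background task.
    public static void createCheckInBackground(String path, ExecutorService background,
                                               Consumer<Throwable> errorListener) {
        background.submit(() -> runAndReport(() -> bogusAclLookup(path), errorListener));
    }

    // Runs the work and routes any failure to the listener instead of the caller.
    public static void runAndReport(Runnable work, Consumer<Throwable> errorListener) {
        try {
            work.run();
        } catch (Throwable t) {
            errorListener.accept(t);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService background = Executors.newSingleThreadExecutor();
        AtomicReference<Throwable> seen = new AtomicReference<>();
        try {
            createCheckFirst("/foo", background, seen::set);
        } catch (UnsupportedOperationException e) {
            System.out.println("caller thread saw: " + e.getMessage());
        }
        System.out.println("listener after variant A: " + seen.get());

        createCheckInBackground("/foo", background, seen::set);
        background.shutdown();
        background.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("listener after variant B: " + seen.get());
    }
}
```

Variant A mirrors the ordering described in the thread, where the failure surfaces on the test's own thread. Variant B shows the ordering a listener-based test would need, which is one way to read the suggestion of forcing the failure in a different way.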
