You're right that shutdown calls flush, but when the RoD request hits, it breaks the while loop and shutdown isn't called. I suppose it is relying on PrepRequestProcessor to call shutdown.

-Flavio
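For readers following along, here is a minimal sketch of the group-commit loop behavior being discussed (see Flavio's conjecture quoted below). This is a simplified paraphrase for illustration, not the actual SyncRequestProcessor source; the names and the 1000-request threshold are taken from the description in this thread.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.LinkedBlockingQueue;

    // Simplified model of the group-commit loop described in this thread; not the
    // real SyncRequestProcessor, just the shape of the behavior under discussion.
    class GroupCommitLoop extends Thread {
        static final Object REQUEST_OF_DEATH = new Object();   // the "RoD" sentinel
        final LinkedBlockingQueue<Object> queuedRequests = new LinkedBlockingQueue<>();
        final List<Object> toFlush = new ArrayList<>();

        @Override
        public void run() {
            try {
                while (true) {
                    Object req = queuedRequests.take();
                    if (req == REQUEST_OF_DEATH) {
                        break;                       // loop exits with toFlush possibly non-empty
                    }
                    appendToTxnLog(req);
                    toFlush.add(req);
                    if (toFlush.size() > 1000) {     // the group-commit threshold mentioned below
                        flush();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        void shutdown() throws InterruptedException {
            queuedRequests.add(REQUEST_OF_DEATH);
            join();
            flush();                                 // the "final flush" Chris refers to
        }

        void appendToTxnLog(Object req) { /* append to the transaction log */ }

        void flush() {
            // sync the log, then hand each flushed request to the next processor
            // (FinalRequestProcessor in the standalone pipeline)
            toFlush.clear();
        }
    }

The point of contention is the break on the death request: anything still sitting in toFlush is only written out and handed to FinalRequestProcessor if and when the final flush in shutdown() actually runs.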
> On 06 Oct 2015, at 18:24, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>
> Flavio, thank you for confirmation.
>
> Regarding the SyncRequestProcessor, I thought the case you described would be covered by the final flush call from within SyncRequestProcessor#shutdown. Let me know if I'm missing an important point though.
>
> --Chris Nauroth
>
> On 10/6/15, 9:14 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>
>> That's a good analysis, Chris. It might be worth doing.
>>
>> I have a new conjecture for Jordan's issue. I was having a look at SyncRequestProcessor and the following seems to happen. This request processor tries to accumulate requests to perform a group commit to the transaction log. If the load is high, it tries to group as many as 1000. The trouble is that if it receives a RoD at a high load, then it is currently breaking the main thread loop and not flushing the requests accumulated in the toFlush list. Note that flushing includes executing them in FinalRequestProcessor.
>>
>> This is a change we can try to make and see if it sorts out the problem.
>>
>> -Flavio
>>
>>> On 05 Oct 2015, at 23:47, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>>
>>> I was able to do more debugging today, using Jordan's ZooKeeper-only test (no Curator in the mix) mentioned in CURATOR-268.
>>>
>>> https://gist.github.com/Randgalt/c3358b740f7f09d47bb0
>>>
>>> My observation is that the transactions are getting processed and persisted to the log, but the client's connection to the server is getting closed before a successful reply is sent. This causes the client to reconnect and retry the same request. For a retried delete, this causes a NoNode error. For a retried create, this causes a NodeExists error. The effect is also visible by logging the last zxid seen by the client, which will show gaps, i.e. 1, 2, 3, 5, 6, ...
>>>
>>> I think a contributing factor is the shutdown sequence in ZooKeeperServerMain. I see NIOServerCnxnFactory.shutdown() getting called and closing all connections before shutting down the request processing pipeline. If requests are in flight in the pipeline after sockets have been closed, but before the pipeline has been stopped, then the server could process the transaction and persist it, but not tell the client about it. I was able to reduce the frequency of test failures by reordering the shutdown sequence so that the request processing pipeline is shut down before the sockets, but it appears this is only a partial fix. Before this change, I could only run Jordan's test for a few seconds before it aborted. After this change, I can keep it running for multiple minutes, but it still aborts eventually.
>>>
>>> Note that even if we improve the handling for this on a graceful shutdown, this scenario would still be possible in the event of hard failure (i.e. power loss). I know some applications wrap the ZooKeeper API calls with additional semantics, so that a NoNode/NodeExists response is treated as a successful idempotent response. For example, here is something in HBase:
>>>
>>> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.html
>>>
>>> I don't know the Curator codebase very well, so I don't know if it has similar options.
>>>
>>> --Chris Nauroth
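In the spirit of the RecoverableZooKeeper idea Chris mentions above, here is a small sketch of an application-side wrapper that treats the retried-request errors he describes as idempotent successes. It is an illustration, not RecoverableZooKeeper's actual implementation; the class and method names are made up, while the ZooKeeper client calls are the standard ones.

    import java.util.List;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.ACL;

    class IdempotentZk {
        private final ZooKeeper zk;

        IdempotentZk(ZooKeeper zk) {
            this.zk = zk;
        }

        // A retried delete whose first attempt actually reached the server
        // comes back as NoNode; treat that as success.
        void deleteIdempotent(String path, int version) throws KeeperException, InterruptedException {
            try {
                zk.delete(path, version);
            } catch (KeeperException.NoNodeException ignored) {
                // already gone, possibly because our earlier attempt succeeded
            }
        }

        // A retried create whose first attempt actually reached the server
        // comes back as NodeExists; treat that as success.
        String createIdempotent(String path, byte[] data, List<ACL> acl, CreateMode mode)
                throws KeeperException, InterruptedException {
            try {
                return zk.create(path, data, acl, mode);
            } catch (KeeperException.NodeExistsException ignored) {
                return path; // already there, possibly from our earlier attempt
            }
        }
    }

The obvious caveat is that the wrapper cannot tell whether the existing (or missing) node is really the result of its own earlier attempt or of another client, so this only helps for operations the application knows are effectively idempotent.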
>>> On 10/5/15, 10:41 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>>>
>>>> You said you can reproduce it with raw ZooKeeper, and I thought you were saying you can't reproduce it with raw ZooKeeper, but it sounds like I got it wrong.
>>>>
>>>> -Flavio
>>>>
>>>>> On 05 Oct 2015, at 18:36, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>>
>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw ZooKeeper, yes?
>>>>>
>>>>> I don't know what you mean. The test that the user posted used Curator. I changed all the Curator usages to raw ZooKeeper usages and the problem still shows.
>>>>>
>>>>> -JZ
>>>>>
>>>>>> On Oct 5, 2015, at 12:17 PM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>>
>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw ZooKeeper, yes?
>>>>>>
>>>>>> I'll have a look at the jira to see if I have any insight.
>>>>>>
>>>>>> -Flavio
>>>>>>
>>>>>>> On 05 Oct 2015, at 18:15, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>>>>
>>>>>>> What we're seeing is transaction rollbacks. The bug was reported against Curator but I can reproduce it with raw ZooKeeper:
>>>>>>> https://issues.apache.org/jira/browse/CURATOR-268
>>>>>>>
>>>>>>> -JZ
>>>>>>>
>>>>>>>> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>>>>
>>>>>>>> It is safe because the requests in the submittedRequests queue haven't been prepared yet. The simplest pipeline is the one of the standalone server: PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor. If the request hasn't gone through PrepRequestProcessor, then nothing has changed in the state of ZooKeeper. The ones that have gone through PrepRequestProcessor will complete regularly. For quorum, the pipeline is a bit more complex, but the reasoning is very similar.
>>>>>>>>
>>>>>>>> -Flavio
>>>>>>>>
>>>>>>>>> On 05 Oct 2015, at 17:55, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>
>>>>>>>>> That would mean that there's no safe way to shut down the server, right? Ideally, you'd want the server to shut down gracefully: a) stop receiving requests; b) complete current requests; c) shut down. That's how most servers work. Of course, you might want a quick-die shutdown, but that's not usual behavior.
>>>>>>>>>
>>>>>>>>> -JZ
>>>>>>>>>
>>>>>>>>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> You suggested that it is a bug, and I'm arguing that it isn't a bug. You may want to optimize and still process the requests in the queue before injecting the RoD, but discarding them doesn't sound like a bug because you can't guarantee that requests submitted concurrently with the server shutting down will be executed. Optimizing isn't the same as spotting a bug. Also, if you are trying to shut down, you probably want to do it asap, rather than wait for a whole batch of operations to complete.
>>>>>>>>>>
>>>>>>>>>> -Flavio
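For context, the standalone chaining Flavio describes earlier in this thread is wired up roughly as follows. This is a sketch, not a verbatim excerpt from the codebase, and constructor signatures may differ slightly between versions.

    import org.apache.zookeeper.server.FinalRequestProcessor;
    import org.apache.zookeeper.server.PrepRequestProcessor;
    import org.apache.zookeeper.server.RequestProcessor;
    import org.apache.zookeeper.server.SyncRequestProcessor;
    import org.apache.zookeeper.server.ZooKeeperServer;

    class StandalonePipelineSketch {
        // Each processor forwards to the next one; a request that has not yet
        // passed PrepRequestProcessor has not touched any server state, which
        // is the basis of the safety argument above.
        static RequestProcessor buildPipeline(ZooKeeperServer zks) {
            RequestProcessor finalProcessor = new FinalRequestProcessor(zks);
            SyncRequestProcessor syncProcessor = new SyncRequestProcessor(zks, finalProcessor);
            syncProcessor.start();   // runs the group-commit loop on its own thread
            PrepRequestProcessor prepProcessor = new PrepRequestProcessor(zks, syncProcessor);
            prepProcessor.start();   // requests enter the pipeline here
            return prepProcessor;
        }
    }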
>>>>>>>>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Flavio, that isn't logical. Just because you can't make that guarantee doesn't imply that you should discard already queued transactions.
>>>>>>>>>>>
>>>>>>>>>>> -JZ
>>>>>>>>>>>
>>>>>>>>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Injecting the RoD means that we are shutting down the server pipeline. If the server is shutting down, then we can't guarantee that a request submitted concurrently will be executed anyway, so clearing the queue of submitted requests (submitted but not yet prepped for execution) sounds like correct behavior to me.
>>>>>>>>>>>>
>>>>>>>>>>>> -Flavio
>>>>>>>>>>>>
>>>>>>>>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jordan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's an interesting find. I think you have a good theory. Have you already tried patching this to see if the bug reported against Curator goes away? (BTW, is there a corresponding Curator JIRA?)
>>>>>>>>>>>>>
>>>>>>>>>>>>> That logic dates all the way back to the initial import of the codebase. I can't find a definitive explanation, but my best guess is that dropping pending requests (instead of gracefully quiescing) can give a faster shutdown in the event of a heavily overloaded server. However, the correctness of this choice looks questionable, especially in stand-alone mode where you don't have a cluster of other machines to compensate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Something else interesting is that this doesn't even really guarantee that the request of death is the only thing remaining to be processed. There is no synchronization over the queue covering both the clear and the enqueue of the request of death, so I think there is a window in which other requests could trickle in ahead of the request of death.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --Chris Nauroth
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman" <jord...@bluejeansnet.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why does PrepRequestProcessor.shutdown() call submittedRequests.clear(); before adding the death request? What if there are pending requests? I'm trying to track down a bug reported in Curator. It only happens in standalone ZK instances. From what I can tell, shutting down a standalone instance might result in lost transactions. Am I looking down the wrong path, or is this a possibility?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Jordan
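For readers following along, the shutdown behavior Jordan is asking about boils down to the pattern below. This is a paraphrase of what is described in this thread, not a verbatim excerpt from PrepRequestProcessor; the RequestStage type is a made-up stand-in for the next processor in the chain.

    import java.util.concurrent.LinkedBlockingQueue;

    // Paraphrase of the drop-pending shutdown pattern discussed in this thread.
    class PrepStageSketch {
        static final Object REQUEST_OF_DEATH = new Object();
        final LinkedBlockingQueue<Object> submittedRequests = new LinkedBlockingQueue<>();
        RequestStage nextStage;   // stand-in for the next processor (e.g. SyncRequestProcessor)

        public void shutdown() {
            submittedRequests.clear();               // pending requests are dropped here
            submittedRequests.add(REQUEST_OF_DEATH); // then the death request is enqueued
            nextStage.shutdown();                    // and shutdown is propagated downstream
        }

        interface RequestStage {
            void shutdown();
        }
    }

Two consequences discussed in this thread follow directly from this shape: requests already sitting in submittedRequests are discarded rather than drained, and because nothing synchronizes the clear() with the add(), another request can still slip into the queue between the two calls and land ahead of the death request.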