You're right that shutdown calls flush, but when the RoD (request-of-death)
request hits, it breaks the while loop and shutdown isn't called. I suppose
it is relying on PrepRequestProcessor to call shutdown.
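
For reference, here is a rough sketch of the two paths involved (simplified
for illustration, not the exact code):

    // SyncRequestProcessor.shutdown() (sketch): the final flush lives here,
    // not in the main loop.
    public void shutdown() {
        queuedRequests.add(requestOfDeath);
        try {
            join();                      // wait for run() to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!toFlush.isEmpty()) {
            flush(toFlush);              // the flush Chris refers to
        }
    }

    // inside run() (sketch): on the RoD the loop simply breaks without
    // flushing, so the batch sitting in toFlush is only written out if
    // shutdown() above is actually invoked.
    if (si == requestOfDeath) {
        break;
    }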

-Flavio

> On 06 Oct 2015, at 18:24, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> 
> Flavio, thank you for confirmation.
> 
> Regarding the SyncRequestProcessor, I thought the case you described would
> be covered by the final flush call from within
> SyncRequestProcessor#shutdown.  Let me know if I'm missing an important
> point though.
> 
> --Chris Nauroth
> 
> 
> 
> 
> On 10/6/15, 9:14 AM, "Flavio Junqueira" <f...@apache.org> wrote:
> 
>> That's a good analysis, Chris. It might be worth doing.
>> 
>> I have a new conjecture for Jordan's issue. I was having a look at
>> SyncRequestProcessor and the following seems to happen. This request
>> processor tries to accumulate requests to perform a group commit to the
>> transaction log. If the load is high, it tries to group as many as 1000.
>> The trouble is that if it receives a RoD under high load, it currently
>> breaks the main thread loop without flushing the requests
>> accumulated in the toFlush list. Note that flushing includes executing
>> them in FinalRequestProcessor.
>> 
>> This is a change we can try to make and see if it sorts out the problem.
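>>
>> Roughly the shape I have in mind (simplified sketch, not the actual code):
>>
>>     while (true) {
>>         Request si = queuedRequests.take();
>>         if (si == requestOfDeath) {
>>             // proposed change: drain the pending batch before leaving the
>>             // loop instead of dropping it; flush() logs, syncs, and hands
>>             // each request on to FinalRequestProcessor
>>             if (!toFlush.isEmpty()) {
>>                 flush(toFlush);
>>             }
>>             break;
>>         }
>>         zks.getZKDatabase().append(si);  // append to the transaction log
>>         toFlush.add(si);
>>         if (toFlush.size() > 1000) {     // group-commit threshold
>>             flush(toFlush);
>>         }
>>     }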
>> 
>> -Flavio
>> 
>> 
>>> On 05 Oct 2015, at 23:47, Chris Nauroth <cnaur...@hortonworks.com>
>>> wrote:
>>> 
>>> I was able to do more debugging today, using Jordan's ZooKeeper-only
>>> test
>>> (no Curator in the mix) mentioned in CURATOR-268.
>>> 
>>> https://gist.github.com/Randgalt/c3358b740f7f09d47bb0
>>> 
>>> 
>>> My observation is that the transactions are getting processed and
>>> persisted to the log, but the client's connection to the server is
>>> getting
>>> closed before a successful reply is sent.  This causes the client to
>>> reconnect and retry the same request.  For a retried delete, this
>>> causes a
>>> NoNode error.  For a retried create, this causes a NodeExists error.
>>> The
>>> effect is also visible by logging the last zxid seen by the client,
>>> which
>>> will show gaps, e.g. 1, 2, 3, 5, 6, ...
>>> 
>>> I think a contributing factor is the shutdown sequence in
>>> ZooKeeperServerMain.  I see NIOServerCnxnFactory.shutdown() getting
>>> called
>>> and closing all connections before shutting down the request processing
>>> pipeline.  If requests are in flight in the pipeline after sockets have
>>> been closed, but before the pipeline has been stopped, then the server
>>> could process the transaction and persist it, but not tell the client
>>> about it.  I was able to reduce the frequency of test failures by
>>> reordering the shutdown sequence so that the request processing pipeline
>>> is shut down before the sockets, but it appears this is only a partial
>>> fix.
>>> Before this change, I could only run Jordan's test for a few seconds
>>> before it aborted.  After this change, I can keep it running for
>>> multiple
>>> minutes, but it still aborts eventually.
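>>>
>>> In rough terms the reordering I tried looks like this (sketch only, not
>>> the actual patch):
>>>
>>>     // before (sketch): sockets are torn down first, so an in-flight
>>>     // request can commit without its reply ever reaching the client
>>>     cnxnFactory.shutdown();   // close client connections
>>>     zkServer.shutdown();      // stop the Prep/Sync/Final pipeline
>>>
>>>     // after (sketch): quiesce the pipeline first, then close the sockets
>>>     zkServer.shutdown();
>>>     cnxnFactory.shutdown();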
>>> 
>>> Note that even if we improve the handling for this on a graceful
>>> shutdown,
>>> this scenario would still be possible in the event of hard failure (i.e.
>>> power loss).  I know some applications wrap the ZooKeeper API calls with
>>> additional semantics, so that a NoNode/NodeExists response is treated
>>> as a
>>> successful idempotent response.  For example, here is something in
>>> HBase:
>>> 
>>> 
>>> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.html
>>> 
>>> 
>>> I don't know the Curator codebase very well, so I don't know if it has
>>> similar options.
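>>>
>>> Purely to illustrate that wrapping idea (a hypothetical helper, not HBase
>>> or Curator code):
>>>
>>>     import org.apache.zookeeper.KeeperException;
>>>     import org.apache.zookeeper.ZooKeeper;
>>>
>>>     // Treat NoNode on a *retried* delete as success, assuming the first
>>>     // attempt was applied before the connection dropped.
>>>     static void deleteIdempotent(ZooKeeper zk, String path, int version)
>>>             throws KeeperException, InterruptedException {
>>>         boolean retried = false;
>>>         while (true) {
>>>             try {
>>>                 zk.delete(path, version);
>>>                 return;
>>>             } catch (KeeperException.NoNodeException e) {
>>>                 if (retried) {
>>>                     return;      // earlier attempt most likely succeeded
>>>                 }
>>>                 throw e;
>>>             } catch (KeeperException.ConnectionLossException e) {
>>>                 retried = true;  // client reconnects; try the delete again
>>>             }
>>>         }
>>>     }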
>>> 
>>> --Chris Nauroth
>>> 
>>> 
>>> 
>>> 
>>> On 10/5/15, 10:41 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>>> 
>>>> You said you can reproduce it with raw ZooKeeper, and I thought you were
>>>> saying you can't reproduce it with raw ZooKeeper, but it sounds like I got
>>>> it wrong.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 05 Oct 2015, at 18:36, Jordan Zimmerman
>>>>> <jor...@jordanzimmerman.com>
>>>>> wrote:
>>>>> 
>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>>> ZooKeeper , yes?
>>>>> 
>>>>> I don’t know what you mean. The test that the user posted used
>>>>> Curator.
>>>>> I changed all the Curator usages to raw ZooKeeper usages and the
>>>>> problem
>>>>> still shows.
>>>>> 
>>>>> -JZ
>>>>> 
>>>>>> On Oct 5, 2015, at 12:17 PM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>> 
>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>>> ZooKeeper , yes?
>>>>>> 
>>>>>> I'll have a look at the jira to see if I have any insight.
>>>>>> 
>>>>>> -Flavio
>>>>>> 
>>>>>>> On 05 Oct 2015, at 18:15, Jordan Zimmerman
>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>> 
>>>>>>> What we’re seeing is transaction rollbacks. The bug was reported
>>>>>>> against Curator but I can reproduce it with raw ZooKeeper:
>>>>>>> https://issues.apache.org/jira/browse/CURATOR-268
>>>>>>> <https://issues.apache.org/jira/browse/CURATOR-268>
>>>>>>> 
>>>>>>> -JZ
>>>>>>> 
>>>>>>>> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> It is safe because the requests in the submittedRequests queue
>>>>>>>> haven't been prepared yet. The simplest pipeline is the one of the
>>>>>>>> standalone server: PrepRequestProcessor -> SyncRequestProcessor ->
>>>>>>>> FinalRequestProcessor. If a request hasn't gone through
>>>>>>>> PrepRequestProcessor, then nothing has changed in the state of
>>>>>>>> ZooKeeper. The ones that have gone through it will complete regularly.
>>>>>>>> For quorum, the pipeline is a bit more complex, but the reasoning is
>>>>>>>> very similar.
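>>>>>>>>
>>>>>>>> For reference, the standalone chain is wired up roughly like this
>>>>>>>> (simplified sketch; zks is the ZooKeeperServer instance):
>>>>>>>>
>>>>>>>>     RequestProcessor finalProcessor = new FinalRequestProcessor(zks);
>>>>>>>>     SyncRequestProcessor syncProcessor =
>>>>>>>>         new SyncRequestProcessor(zks, finalProcessor);
>>>>>>>>     syncProcessor.start();
>>>>>>>>     PrepRequestProcessor prepProcessor =
>>>>>>>>         new PrepRequestProcessor(zks, syncProcessor);
>>>>>>>>     prepProcessor.start();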
>>>>>>>> 
>>>>>>>> -Flavio
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 05 Oct 2015, at 17:55, Jordan Zimmerman
>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>> 
>>>>>>>>> That would mean that there’s no safe way to shut down the server,
>>>>>>>>> right? Ideally, you’d want the server to shut down gracefully: a)
>>>>>>>>> stop receiving requests; b) complete current requests; c) shut
>>>>>>>>> down.
>>>>>>>>> That’s how most servers work. Of course, you might want a
>>>>>>>>> quick-die
>>>>>>>>> shutdown but that’s not usual behavior.
>>>>>>>>> 
>>>>>>>>> -JZ
>>>>>>>>> 
>>>>>>>>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> You suggested that it is a bug, and I'm arguing that it isn't a
>>>>>>>>>> bug. You may want to optimize and still process the requests in
>>>>>>>>>> the
>>>>>>>>>> queue before injecting RoD, but discarding them doesn't sound
>>>>>>>>>> like
>>>>>>>>>> a bug because you can't guarantee that requests submitted
>>>>>>>>>> concurrently with the server shutting down will be executed.
>>>>>>>>>> Optimizing isn't the same as spotting a bug. Also, if you are
>>>>>>>>>> trying to shut down, you probably want to do it asap, rather than
>>>>>>>>>> wait for a whole batch of operations to complete.
>>>>>>>>>> 
>>>>>>>>>> -Flavio
>>>>>>>>>> 
>>>>>>>>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman
>>>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Flavio, that isn’t logical. Just because you can’t make that
>>>>>>>>>>> guarantee doesn’t imply that you should flush already queued
>>>>>>>>>>> transactions.
>>>>>>>>>>> 
>>>>>>>>>>> -JZ
>>>>>>>>>>> 
>>>>>>>>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Injecting the RoD means that we are shutting down the server
>>>>>>>>>>>> pipeline. If the server is shutting down, then we can't
>>>>>>>>>>>> guarantee
>>>>>>>>>>>> that a request submitted concurrently will be executed anyway,
>>>>>>>>>>>> so
>>>>>>>>>>>> clearing the queue of submitted requests (submitted but not yet
>>>>>>>>>>>> prepped for execution) sounds like correct behavior to me.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Flavio
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth
>>>>>>>>>>>>> <cnaur...@hortonworks.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Jordan,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> That's an interesting find.  I think you have a good theory.
>>>>>>>>>>>>> Have you
>>>>>>>>>>>>> already tried patching this to see if the bug reported against
>>>>>>>>>>>>> Curator
>>>>>>>>>>>>> goes away?  (BTW, is there a corresponding Curator JIRA?)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> That logic dates all the way back to the initial import of the
>>>>>>>>>>>>> codebase.
>>>>>>>>>>>>> I can't find a definitive explanation, but my best guess is
>>>>>>>>>>>>> that dropping
>>>>>>>>>>>>> pending requests (instead of gracefully quiescing) can give a
>>>>>>>>>>>>> faster
>>>>>>>>>>>>> shutdown in the event of a heavily overloaded server.
>>>>>>>>>>>>> However,
>>>>>>>>>>>>> the
>>>>>>>>>>>>> correctness of this choice looks questionable, especially in
>>>>>>>>>>>>> stand-alone
>>>>>>>>>>>>> mode where you don't have a cluster of other machines to
>>>>>>>>>>>>> compensate.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Something else interesting is that this doesn't even really
>>>>>>>>>>>>> guarantee that
>>>>>>>>>>>>> the request of death is the only thing remaining to be
>>>>>>>>>>>>> processed.  There
>>>>>>>>>>>>> is no synchronization over the queue covering both the clear
>>>>>>>>>>>>> and the
>>>>>>>>>>>>> enqueue of the request of death, so I think there is a window
>>>>>>>>>>>>> in which
>>>>>>>>>>>>> other requests could trickle in ahead of the request of death.
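>>>>>>>>>>>>>
>>>>>>>>>>>>> Roughly (sketch of the ordering in question, not the exact code):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     submittedRequests.clear();        // drop anything queued so far
>>>>>>>>>>>>>     // <-- window: a concurrently submitted request can land here,
>>>>>>>>>>>>>     //     ahead of the request of death
>>>>>>>>>>>>>     submittedRequests.add(requestOfDeath);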
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --Chris Nauroth
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman"
>>>>>>>>>>>>> <jord...@bluejeansnet.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Why does PrepRequestProcessor.shutdown() call
>>>>>>>>>>>>>> submittedRequests.clear();
>>>>>>>>>>>>>> before adding the death request? What if there are pending
>>>>>>>>>>>>>> requests? I’m
>>>>>>>>>>>>>> trying to track down a bug reported in Curator. It only
>>>>>>>>>>>>>> happens in
>>>>>>>>>>>>>> Standalone ZK instances. From what I can tell, shutting down
>>>>>>>>>>>>>> a
>>>>>>>>>>>>>> standalone
>>>>>>>>>>>>>> instance might result in lost transactions. Am I looking down
>>>>>>>>>>>>>> the wrong
>>>>>>>>>>>>>> path or is this a possibility?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Jordan
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 
