Flavio, thank you for confirmation.

Regarding the SyncRequestProcessor, I thought the case you described would
be covered by the final flush call from within
SyncRequestProcessor#shutdown.  Let me know if I'm missing an important
point though.

--Chris Nauroth




On 10/6/15, 9:14 AM, "Flavio Junqueira" <f...@apache.org> wrote:

>That's a good analysis, Chris. I might be worth doing.
>
>I have a new conjecture for Jordan's issue. I was having a look at
>SyncRequestProcessor and the following seems to happen. This request
>processor tries to accumulate requests to perform a group commit to the
>transaction log. If the load is high, it tries to group as many as 1000.
>The trouble is that if it receives a RoD at a high load, then it is
>currently breaking the main thread loop and not flushing the requests
>accumulated in the toFlush list. Note that flushing includes executing
>them in FinalRequestProcessor.
>
>This is a change we can try to make and see if it sorts out the problem.
>
>-Flavio
>
>
>> On 05 Oct 2015, at 23:47, Chris Nauroth <cnaur...@hortonworks.com>
>>wrote:
>> 
>> I was able to do more debugging today, using Jordan's ZooKeeper-only
>>test
>> (no Curator in the mix) mentioned in CURATOR-268.
>> 
>> https://gist.github.com/Randgalt/c3358b740f7f09d47bb0
>> 
>> 
>> My observation is that the transactions are getting processed and
>> persisted to the log, but the client's connection to the server is
>>getting
>> closed before a successful reply is sent.  This causes the client to
>> reconnect and retry the same request.  For a retried delete, this
>>causes a
>> NoNode error.  For a retried create, this causes a NodeExists error.
>>The
>> effect is also visible by logging the last zxid seen by the client,
>>which
>> will show gaps.  i.e. 1, 2, 3, 5, 6, ...
>> 
>> I think a contributing factor is the shutdown sequence in
>> ZooKeeperServerMain.  I see NIOServerCnxnFactory.shutdown() getting
>>called
>> and closing all connections before shutting down the request processing
>> pipeline.  If requests are in flight in the pipeline after sockets have
>> been closed, but before the pipeline has been stopped, then the server
>> could process the transaction and persist it, but not tell the client
>> about it.  I was able to reduce the frequency of test failures by
>> reordering the shutdown sequence so that the request processing pipeline
>> is shutdown before the sockets, but it appears this is only a partial
>>fix.
>> Before this change, I could only run Jordan's test for a few seconds
>> before it aborted.  After this change, I can keep it running for
>>multiple
>> minutes, but it still aborts eventually.
>> 
>> Note that even if we improve the handling for this on a graceful
>>shutdown,
>> this scenario would still be possible in the event of hard failure (i.e.
>> power loss).  I know some applications wrap the ZooKeeper API calls with
>> additional semantics, so that a NoNode/NodeExists response is treated
>>as a
>> successful idempotent response.  For example, here is something in
>>HBase:
>> 
>> 
>>http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/zookeeper/Reco
>>ve
>> rableZooKeeper.html
>> 
>> 
>> I don't know the Curator codebase very well, so I don't know if it has
>> similar options.
>> 
>> --Chris Nauroth
>> 
>> 
>> 
>> 
>> On 10/5/15, 10:41 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>> 
>>> You said you can reproduce it with raw ZooKeeper and I thought you're
>>> saying you can't reproduce with raw ZooKeeper, but sounds like I got it
>>> wrong.
>>> 
>>> -Flavio
>>> 
>>>> On 05 Oct 2015, at 18:36, Jordan Zimmerman
>>>><jor...@jordanzimmerman.com>
>>>> wrote:
>>>> 
>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>> ZooKeeper , yes?
>>>> 
>>>> I don’t know what you mean. The test that the user posted used
>>>>Curator.
>>>> I changed all the Curator usages to raw ZooKeeper usages and the
>>>>problem
>>>> still shows.
>>>> 
>>>> -JZ
>>>> 
>>>>> On Oct 5, 2015, at 12:17 PM, Flavio Junqueira <f...@apache.org> wrote:
>>>>> 
>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>> ZooKeeper , yes?
>>>>> 
>>>>> I'll have a look at the jira to see if I have any insight.
>>>>> 
>>>>> -Flavio
>>>>> 
>>>>>> On 05 Oct 2015, at 18:15, Jordan Zimmerman
>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>> 
>>>>>> What we’re seeing is transaction rollbacks. The bug was reported
>>>>>> against Curator but I can reproduce it with raw ZooKeeper:
>>>>>> https://issues.apache.org/jira/browse/CURATOR-268
>>>>>> <https://issues.apache.org/jira/browse/CURATOR-268>
>>>>>> 
>>>>>> -JZ
>>>>>> 
>>>>>>> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org>
>>>>>>>wrote:
>>>>>>> 
>>>>>>> It is safe because the requests in the submittedRequests queue
>>>>>>> haven't been prepared yet. The simplest pipeline is the one of the
>>>>>>> standalone server: preprequestprocessor -> syncrequestprocessor ->
>>>>>>> finalrequestprocessor. If the request hasn't gone through prepRP,
>>>>>>> then nothing has changed in the state of zookeeper. The ones that
>>>>>>> have gone through prepPR will complete regularly. For quorum, the
>>>>>>> pipeline is a bit more complex, but the reasoning is very similar.
>>>>>>> 
>>>>>>> -Flavio
>>>>>>> 
>>>>>>> 
>>>>>>>> On 05 Oct 2015, at 17:55, Jordan Zimmerman
>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>> 
>>>>>>>> That would mean that there’s no safe way to shut down the server,
>>>>>>>> right? Ideally, you’d want the server to shut down gracefully: a)
>>>>>>>> stop receiving requests; b) complete current requests; c) shut
>>>>>>>>down.
>>>>>>>> That’s how most servers work. Of course, you might want a
>>>>>>>>quick-die
>>>>>>>> shutdown but that’s not usual behavior.
>>>>>>>> 
>>>>>>>> -JZ
>>>>>>>> 
>>>>>>>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> You suggested that it is a bug, and I'm arguing that it isn't a
>>>>>>>>> bug. You may want to optimize and still process the requests in
>>>>>>>>>the
>>>>>>>>> queue before injecting RoD, but discarding them doesn't sound
>>>>>>>>>like
>>>>>>>>> a bug because you can't guarantee that requests submitted
>>>>>>>>> concurrently with the server shutting down will be executed.
>>>>>>>>> Optimizing isn't the same as spotting a bug. Also, if you are
>>>>>>>>> trying to shut down, you probably want to do it asap, rather than
>>>>>>>>> wait for a whole batch of operations to complete.
>>>>>>>>> 
>>>>>>>>> -Flavio
>>>>>>>>> 
>>>>>>>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman
>>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Flavio, that isn’t logical. Just because you can’t make that
>>>>>>>>>> guarantee doesn’t imply that you should flush already queued
>>>>>>>>>> transactions.
>>>>>>>>>> 
>>>>>>>>>> -JZ
>>>>>>>>>> 
>>>>>>>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Injecting the RoD means that we are shutting down the server
>>>>>>>>>>> pipeline. If the server is shutting down, then we can't
>>>>>>>>>>>guarantee
>>>>>>>>>>> that a request submitted concurrently will be executed anyway,
>>>>>>>>>>>so
>>>>>>>>>>> clearing the queue of submitted requests (submitted but no
>>>>>>>>>>>preped
>>>>>>>>>>> for execution) sounds like correct behavior to me.
>>>>>>>>>>> 
>>>>>>>>>>> -Flavio
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth
>>>>>>>>>>>> <cnaur...@hortonworks.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Jordan,
>>>>>>>>>>>> 
>>>>>>>>>>>> That's an interesting find.  I think you have a good theory.
>>>>>>>>>>>> Have you
>>>>>>>>>>>> already tried patching this to see if the bug reported against
>>>>>>>>>>>> Curator
>>>>>>>>>>>> goes away?  (BTW, is there a corresponding Curator JIRA?)
>>>>>>>>>>>> 
>>>>>>>>>>>> That logic dates all the way back to the initial import of the
>>>>>>>>>>>> codebase.
>>>>>>>>>>>> I can't find a definitive explanation, but my best guess is
>>>>>>>>>>>> that dropping
>>>>>>>>>>>> pending requests (instead of gracefully quiescing) can give a
>>>>>>>>>>>> faster
>>>>>>>>>>>> shutdown in the event of a heavily overloaded server.
>>>>>>>>>>>>However,
>>>>>>>>>>>> the
>>>>>>>>>>>> correctness of this choice looks questionable, especially in
>>>>>>>>>>>> stand-alone
>>>>>>>>>>>> mode where you don't have a cluster of other machines to
>>>>>>>>>>>> compensate.
>>>>>>>>>>>> 
>>>>>>>>>>>> Something else interesting is that this doesn't even really
>>>>>>>>>>>> guarantee that
>>>>>>>>>>>> the request of death is the only thing remaining to be
>>>>>>>>>>>> processed.  There
>>>>>>>>>>>> is no synchronization over the queue covering both the clear
>>>>>>>>>>>> and the
>>>>>>>>>>>> enqueue of the request of death, so I think there is a window
>>>>>>>>>>>> in which
>>>>>>>>>>>> other requests could trickle in ahead of the request of death.
>>>>>>>>>>>> 
>>>>>>>>>>>> --Chris Nauroth
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman"
>>>>>>>>>>>> <jord...@bluejeansnet.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Why does PrepRequestProcessor.shutdown() call
>>>>>>>>>>>>> submittedRequests.clear();
>>>>>>>>>>>>> before adding the death request? What if there are pending
>>>>>>>>>>>>> requests? I¹m
>>>>>>>>>>>>> trying to track down a bug reported in Curator. It only
>>>>>>>>>>>>> happens in
>>>>>>>>>>>>> Standalone ZK instances. From what I can tell, shutting down
>>>>>>>>>>>>>a
>>>>>>>>>>>>> standalone
>>>>>>>>>>>>> instance might result in lost transactions. Am I looking down
>>>>>>>>>>>>> the wrong
>>>>>>>>>>>>> path or is this a possibility?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -Jordan
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>
>

Reply via email to