Thank you!

> On Oct 10, 2015, at 5:36 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
> 
> I have created ZOOKEEPER-2288 and attached my partial fix patch.  Jordan,
> I can see you already spotted it and linked it with CURATOR-268.  Thanks
> again for reporting this.
> 
> --Chris Nauroth
> 
> 
> 
> 
> On 10/6/15, 11:01 AM, "Flavio Junqueira" <f...@apache.org> wrote:
> 
>> You're right that shutdown calls flush, but when the RoD request hits,
>> it breaks the while loop and shutdown isn't called. I suppose it is
>> relying on PrepRequestProcessor to call shutdown.
>> 
>> -Flavio
>> 
>>> On 06 Oct 2015, at 18:24, Chris Nauroth <cnaur...@hortonworks.com>
>>> wrote:
>>> 
>>> Flavio, thank you for confirmation.
>>> 
>>> Regarding the SyncRequestProcessor, I thought the case you described
>>> would
>>> be covered by the final flush call from within
>>> SyncRequestProcessor#shutdown.  Let me know if I'm missing an important
>>> point though.
>>> 
>>> --Chris Nauroth
>>> 
>>> 
>>> 
>>> 
>>> On 10/6/15, 9:14 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>>> 
>>>> That's a good analysis, Chris. It might be worth doing.
>>>> 
>>>> I have a new conjecture for Jordan's issue. I was having a look at
>>>> SyncRequestProcessor and the following seems to happen. This request
>>>> processor tries to accumulate requests to perform a group commit to the
>>>> transaction log. If the load is high, it tries to group as many as
>>>> 1000 requests.
>>>> The trouble is that if it receives a RoD at a high load, then it is
>>>> currently breaking the main thread loop and not flushing the requests
>>>> accumulated in the toFlush list. Note that flushing includes executing
>>>> them in FinalRequestProcessor.
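The loop described above can be sketched in a self-contained way (a toy model with illustrative names, not the real SyncRequestProcessor): the proposed change is to flush the accumulated batch before breaking out on the request of death, instead of discarding it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified model of the SyncRequestProcessor main loop: requests are
// batched into toFlush for a group commit to the transaction log. The
// change sketched here is to flush the accumulated batch before exiting
// on the request-of-death (RoD), rather than dropping it.
public class FlushOnShutdownSketch {
    static final String ROD = "request-of-death";

    public static List<String> drain(LinkedBlockingQueue<String> submitted)
            throws InterruptedException {
        List<String> toFlush = new ArrayList<>();
        List<String> committed = new ArrayList<>();
        while (true) {
            String req = submitted.take();
            if (req.equals(ROD)) {
                // Proposed behavior: flush what we already accumulated
                // before breaking out of the loop.
                committed.addAll(toFlush);
                toFlush.clear();
                break;
            }
            toFlush.add(req);
            if (toFlush.size() >= 1000) {   // group-commit threshold
                committed.addAll(toFlush);
                toFlush.clear();
            }
        }
        return committed;
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<String> q = new LinkedBlockingQueue<>();
        q.add("create /a");
        q.add("delete /b");
        q.add(ROD);
        System.out.println(drain(q));   // prints [create /a, delete /b]
    }
}
```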
>>>> 
>>>> This is a change we can try to make and see if it sorts out the
>>>> problem.
>>>> 
>>>> -Flavio
>>>> 
>>>> 
>>>>> On 05 Oct 2015, at 23:47, Chris Nauroth <cnaur...@hortonworks.com>
>>>>> wrote:
>>>>> 
>>>>> I was able to do more debugging today, using Jordan's ZooKeeper-only
>>>>> test
>>>>> (no Curator in the mix) mentioned in CURATOR-268.
>>>>> 
>>>>> https://gist.github.com/Randgalt/c3358b740f7f09d47bb0
>>>>> 
>>>>> 
>>>>> My observation is that the transactions are getting processed and
>>>>> persisted to the log, but the client's connection to the server is
>>>>> getting
>>>>> closed before a successful reply is sent.  This causes the client to
>>>>> reconnect and retry the same request.  For a retried delete, this
>>>>> causes a
>>>>> NoNode error.  For a retried create, this causes a NodeExists error.
>>>>> The
>>>>> effect is also visible by logging the last zxid seen by the client,
>>>>> which
>>>>> will show gaps, e.g. 1, 2, 3, 5, 6, ...
>>>>> 
>>>>> I think a contributing factor is the shutdown sequence in
>>>>> ZooKeeperServerMain.  I see NIOServerCnxnFactory.shutdown() getting
>>>>> called
>>>>> and closing all connections before shutting down the request
>>>>> processing
>>>>> pipeline.  If requests are in flight in the pipeline after sockets
>>>>> have
>>>>> been closed, but before the pipeline has been stopped, then the server
>>>>> could process the transaction and persist it, but not tell the client
>>>>> about it.  I was able to reduce the frequency of test failures by
>>>>> reordering the shutdown sequence so that the request processing
>>>>> pipeline
>>>>> is shutdown before the sockets, but it appears this is only a partial
>>>>> fix.
>>>>> Before this change, I could only run Jordan's test for a few seconds
>>>>> before it aborted.  After this change, I can keep it running for
>>>>> multiple
>>>>> minutes, but it still aborts eventually.
>>>>> 
>>>>> Note that even if we improve the handling for this on a graceful
>>>>> shutdown,
>>>>> this scenario would still be possible in the event of hard failure
>>>>> (i.e.
>>>>> power loss).  I know some applications wrap the ZooKeeper API calls
>>>>> with
>>>>> additional semantics, so that a NoNode/NodeExists response is treated
>>>>> as a
>>>>> successful idempotent response.  For example, here is something in
>>>>> HBase:
>>>>> 
>>>>> 
>>>>> 
>>>>> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.html
>>>>> 
>>>>> 
>>>>> I don't know the Curator codebase very well, so I don't know if it has
>>>>> similar options.
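The kind of wrapper mentioned above can be sketched as follows. This is in the spirit of HBase's RecoverableZooKeeper but uses hypothetical stand-in names and an in-memory fake server rather than the real ZooKeeper API: after a connection-loss retry, NodeExists on a create is treated as evidence the first attempt actually committed.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch (not the HBase RecoverableZooKeeper API): a create
// whose success reply is lost gets retried, and the resulting NodeExists
// error on the retry is treated as an idempotent success.
public class IdempotentRetrySketch {
    // In-memory stand-in for the server's znode namespace.
    final Set<String> znodes = new HashSet<>();
    private boolean dropNextReply = true;  // simulate one lost success reply

    // Raw create: may persist the node yet "lose" the reply once.
    private void rawCreate(String path) {
        boolean existed = !znodes.add(path);
        if (dropNextReply) {
            dropNextReply = false;
            throw new RuntimeException("ConnectionLoss"); // reply never arrived
        }
        if (existed) throw new RuntimeException("NodeExists");
    }

    // Idempotent wrapper: retry on connection loss; on a retry, treat
    // NodeExists as success, since the earlier attempt may have committed.
    public void createIdempotent(String path) {
        boolean retried = false;
        while (true) {
            try {
                rawCreate(path);
                return;
            } catch (RuntimeException e) {
                if (e.getMessage().equals("ConnectionLoss")) {
                    retried = true;
                    continue;
                }
                if (e.getMessage().equals("NodeExists") && retried) {
                    return; // idempotent success
                }
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        IdempotentRetrySketch zk = new IdempotentRetrySketch();
        zk.createIdempotent("/a");       // survives the lost reply
        System.out.println(zk.znodes);   // prints [/a]
    }
}
```

A delete wrapper would mirror this, treating NoNode on a retried delete as success.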
>>>>> 
>>>>> --Chris Nauroth
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/5/15, 10:41 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>>>>> 
>>>>>> You said you can reproduce it with raw ZooKeeper, and I thought you
>>>>>> were saying you can't reproduce it with raw ZooKeeper, but it sounds
>>>>>> like I got it wrong.
>>>>>> 
>>>>>> -Flavio
>>>>>> 
>>>>>>> On 05 Oct 2015, at 18:36, Jordan Zimmerman
>>>>>>> <jor...@jordanzimmerman.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>>>>> ZooKeeper, yes?
>>>>>>> 
>>>>>>> I don’t know what you mean. The test that the user posted used
>>>>>>> Curator.
>>>>>>> I changed all the Curator usages to raw ZooKeeper usages and the
>>>>>>> problem
>>>>>>> still shows.
>>>>>>> 
>>>>>>> -JZ
>>>>>>> 
>>>>>>>> On Oct 5, 2015, at 12:17 PM, Flavio Junqueira <f...@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> can reproduce it with raw ZooKeeper -> can't reproduce it with raw
>>>>>>>> ZooKeeper, yes?
>>>>>>>> 
>>>>>>>> I'll have a look at the jira to see if I have any insight.
>>>>>>>> 
>>>>>>>> -Flavio
>>>>>>>> 
>>>>>>>>> On 05 Oct 2015, at 18:15, Jordan Zimmerman
>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>> 
>>>>>>>>> What we’re seeing is transaction rollbacks. The bug was reported
>>>>>>>>> against Curator but I can reproduce it with raw ZooKeeper:
>>>>>>>>> https://issues.apache.org/jira/browse/CURATOR-268
>>>>>>>>> <https://issues.apache.org/jira/browse/CURATOR-268>
>>>>>>>>> 
>>>>>>>>> -JZ
>>>>>>>>> 
>>>>>>>>>> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> It is safe because the requests in the submittedRequests queue
>>>>>>>>>> haven't been prepared yet. The simplest pipeline is the one of
>>>>>>>>>> the
>>>>>>>>>> standalone server: PrepRequestProcessor -> SyncRequestProcessor ->
>>>>>>>>>> FinalRequestProcessor. If the request hasn't gone through
>>>>>>>>>> PrepRequestProcessor, then nothing has changed in the state of
>>>>>>>>>> ZooKeeper. The ones that have gone through it will complete
>>>>>>>>>> regularly. For quorum, the
>>>>>>>>>> pipeline is a bit more complex, but the reasoning is very
>>>>>>>>>> similar.
>>>>>>>>>> 
>>>>>>>>>> -Flavio
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 05 Oct 2015, at 17:55, Jordan Zimmerman
>>>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> That would mean that there’s no safe way to shut down the
>>>>>>>>>>> server,
>>>>>>>>>>> right? Ideally, you’d want the server to shut down gracefully:
>>>>>>>>>>> a)
>>>>>>>>>>> stop receiving requests; b) complete current requests; c) shut
>>>>>>>>>>> down.
>>>>>>>>>>> That’s how most servers work. Of course, you might want a
>>>>>>>>>>> quick-die
>>>>>>>>>>> shutdown but that’s not usual behavior.
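The stop-accept / drain / close ordering described above can be sketched with a plain ExecutorService standing in for the request pipeline (illustrative only, not the ZooKeeperServerMain shutdown path):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Graceful shutdown order: (a) stop accepting new work,
// (b) drain in-flight work, (c) only then tear down (e.g. close sockets).
public class GracefulShutdownSketch {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pipeline = Executors.newSingleThreadExecutor();
        pipeline.submit(() -> System.out.println("processing in-flight request"));

        pipeline.shutdown();                        // (a) reject new submissions
        boolean drained =
            pipeline.awaitTermination(5, TimeUnit.SECONDS);  // (b) drain
        System.out.println("drained=" + drained);   // (c) now safe to close sockets
    }
}
```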
>>>>>>>>>>> 
>>>>>>>>>>> -JZ
>>>>>>>>>>> 
>>>>>>>>>>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> You suggested that it is a bug, and I'm arguing that it isn't a
>>>>>>>>>>>> bug. You may want to optimize and still process the requests in
>>>>>>>>>>>> the
>>>>>>>>>>>> queue before injecting RoD, but discarding them doesn't sound
>>>>>>>>>>>> like
>>>>>>>>>>>> a bug because you can't guarantee that requests submitted
>>>>>>>>>>>> concurrently with the server shutting down will be executed.
>>>>>>>>>>>> Optimizing isn't the same as spotting a bug. Also, if you are
>>>>>>>>>>>> trying to shut down, you probably want to do it asap, rather
>>>>>>>>>>>> than
>>>>>>>>>>>> wait for a whole batch of operations to complete.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Flavio
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman
>>>>>>>>>>>>> <jor...@jordanzimmerman.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Flavio, that isn’t logical. Just because you can’t make that
>>>>>>>>>>>>> guarantee doesn’t imply that you should flush already queued
>>>>>>>>>>>>> transactions.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -JZ
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Injecting the RoD means that we are shutting down the server
>>>>>>>>>>>>>> pipeline. If the server is shutting down, then we can't
>>>>>>>>>>>>>> guarantee
>>>>>>>>>>>>>> that a request submitted concurrently will be executed
>>>>>>>>>>>>>> anyway,
>>>>>>>>>>>>>> so
>>>>>>>>>>>>>> clearing the queue of submitted requests (submitted but not yet
>>>>>>>>>>>>>> prepped for execution) sounds like correct behavior to me.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Flavio
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth
>>>>>>>>>>>>>>> <cnaur...@hortonworks.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Jordan,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> That's an interesting find.  I think you have a good theory.
>>>>>>>>>>>>>>> Have you
>>>>>>>>>>>>>>> already tried patching this to see if the bug reported
>>>>>>>>>>>>>>> against
>>>>>>>>>>>>>>> Curator
>>>>>>>>>>>>>>> goes away?  (BTW, is there a corresponding Curator JIRA?)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> That logic dates all the way back to the initial import of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> codebase.
>>>>>>>>>>>>>>> I can't find a definitive explanation, but my best guess is
>>>>>>>>>>>>>>> that dropping
>>>>>>>>>>>>>>> pending requests (instead of gracefully quiescing) can give
>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>> faster
>>>>>>>>>>>>>>> shutdown in the event of a heavily overloaded server.
>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> correctness of this choice looks questionable, especially in
>>>>>>>>>>>>>>> stand-alone
>>>>>>>>>>>>>>> mode where you don't have a cluster of other machines to
>>>>>>>>>>>>>>> compensate.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Something else interesting is that this doesn't even really
>>>>>>>>>>>>>>> guarantee that
>>>>>>>>>>>>>>> the request of death is the only thing remaining to be
>>>>>>>>>>>>>>> processed.  There
>>>>>>>>>>>>>>> is no synchronization over the queue covering both the clear
>>>>>>>>>>>>>>> and the
>>>>>>>>>>>>>>> enqueue of the request of death, so I think there is a
>>>>>>>>>>>>>>> window
>>>>>>>>>>>>>>> in which
>>>>>>>>>>>>>>> other requests could trickle in ahead of the request of
>>>>>>>>>>>>>>> death.
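One way to close that window, sketched here with stand-in names (not the actual PrepRequestProcessor code), is to guard both the clear and the RoD enqueue with a lock that submissions also take:

```java
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of closing the race described above: without a lock spanning
// clear() and the RoD enqueue, a concurrently submitted request can land
// between the two calls and end up ahead of the request of death.
public class ShutdownRaceSketch {
    static final String ROD = "request-of-death";
    final LinkedBlockingQueue<String> submittedRequests = new LinkedBlockingQueue<>();
    private final Object shutdownLock = new Object();
    private boolean shuttingDown = false;

    // Submissions and shutdown take the same lock, so nothing can slip in
    // between clear() and the RoD enqueue.
    public boolean submit(String req) {
        synchronized (shutdownLock) {
            if (shuttingDown) return false;  // rejected, not silently dropped later
            return submittedRequests.add(req);
        }
    }

    public void shutdown() {
        synchronized (shutdownLock) {
            shuttingDown = true;
            submittedRequests.clear();
            submittedRequests.add(ROD);
        }
    }

    public static void main(String[] args) {
        ShutdownRaceSketch p = new ShutdownRaceSketch();
        p.submit("create /a");
        p.shutdown();
        boolean accepted = p.submit("create /b");  // too late: rejected
        System.out.println(p.submittedRequests + " accepted=" + accepted);
        // prints [request-of-death] accepted=false
    }
}
```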
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> --Chris Nauroth
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman"
>>>>>>>>>>>>>>> <jord...@bluejeansnet.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Why does PrepRequestProcessor.shutdown() call
>>>>>>>>>>>>>>>> submittedRequests.clear();
>>>>>>>>>>>>>>>> before adding the death request? What if there are pending
>>>>>>>>>>>>>>>> requests? I'm
>>>>>>>>>>>>>>>> trying to track down a bug reported in Curator. It only
>>>>>>>>>>>>>>>> happens in
>>>>>>>>>>>>>>>> Standalone ZK instances. From what I can tell, shutting
>>>>>>>>>>>>>>>> down
>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>> standalone
>>>>>>>>>>>>>>>> instance might result in lost transactions. Am I looking
>>>>>>>>>>>>>>>> down
>>>>>>>>>>>>>>>> the wrong
>>>>>>>>>>>>>>>> path or is this a possibility?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -Jordan
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 
