"can reproduce it with raw ZooKeeper" -> "can't reproduce it with raw ZooKeeper",
yes?

I'll have a look at the jira to see if I have any insight.

-Flavio

> On 05 Oct 2015, at 18:15, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
> 
> What we’re seeing is transaction rollbacks. The bug was reported against 
> Curator but I can reproduce it with raw ZooKeeper: 
> https://issues.apache.org/jira/browse/CURATOR-268 
> 
> -JZ
> 
>> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org> wrote:
>> 
>> It is safe because the requests in the submittedRequests queue haven't been 
>> prepared yet. The simplest pipeline is that of the standalone server: 
>> PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor. If 
>> a request hasn't gone through PrepRequestProcessor, then nothing has changed 
>> in the state of ZooKeeper. The ones that have gone through 
>> PrepRequestProcessor will complete normally. For quorum, the pipeline is a 
>> bit more complex, but the reasoning is very similar.
>> 
>> -Flavio
>> 
>> 
>>> On 05 Oct 2015, at 17:55, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>> 
>>> That would mean that there’s no safe way to shut down the server, right? 
>>> Ideally, you’d want the server to shut down gracefully: a) stop receiving 
>>> requests; b) complete current requests; c) shut down. That’s how most 
>>> servers work. Of course, you might want a quick-die shutdown but that’s not 
>>> usual behavior.
>>> 
>>> -JZ
>>> 
>>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org> wrote:
>>>> 
>>>> You suggested that it is a bug, and I'm arguing that it isn't a bug. You 
>>>> may want to optimize and still process the requests in the queue before 
>>>> injecting RoD, but discarding them doesn't sound like a bug because you 
>>>> can't guarantee that requests submitted concurrently with the server 
>>>> shutting down will be executed. Optimizing isn't the same as spotting a 
>>>> bug. Also, if you are trying to shut down, you probably want to do it 
>>>> asap, rather than wait for a whole batch of operations to complete.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>> 
>>>>> Flavio, that isn’t logical. Just because you can’t make that guarantee 
>>>>> doesn’t imply that you should flush already queued transactions.
>>>>> 
>>>>> -JZ
>>>>> 
>>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org> wrote:
>>>>>> 
>>>>>> Injecting the RoD means that we are shutting down the server pipeline. 
>>>>>> If the server is shutting down, then we can't guarantee that a request 
>>>>>> submitted concurrently will be executed anyway, so clearing the queue of 
>>>>>> submitted requests (submitted but not prepped for execution) sounds like 
>>>>>> correct behavior to me.
>>>>>> 
>>>>>> -Flavio  
>>>>>> 
>>>>>> 
>>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>>>>>> 
>>>>>>> Hi Jordan,
>>>>>>> 
>>>>>>> That's an interesting find.  I think you have a good theory.  Have you
>>>>>>> already tried patching this to see if the bug reported against Curator
>>>>>>> goes away?  (BTW, is there a corresponding Curator JIRA?)
>>>>>>> 
>>>>>>> That logic dates all the way back to the initial import of the codebase.
>>>>>>> I can't find a definitive explanation, but my best guess is that
>>>>>>> dropping pending requests (instead of gracefully quiescing) can give a
>>>>>>> faster shutdown in the event of a heavily overloaded server.  However,
>>>>>>> the correctness of this choice looks questionable, especially in
>>>>>>> stand-alone mode where you don't have a cluster of other machines to
>>>>>>> compensate.
>>>>>>> 
>>>>>>> Something else interesting is that this doesn't even really guarantee
>>>>>>> that the request of death is the only thing remaining to be processed.
>>>>>>> There is no synchronization over the queue covering both the clear and
>>>>>>> the enqueue of the request of death, so I think there is a window in
>>>>>>> which other requests could trickle in ahead of the request of death.
>>>>>>> 
>>>>>>> --Chris Nauroth
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman" <jord...@bluejeansnet.com> wrote:
>>>>>>> 
>>>>>>>> Why does PrepRequestProcessor.shutdown() call
>>>>>>>> submittedRequests.clear(); before adding the death request? What if
>>>>>>>> there are pending requests? I'm trying to track down a bug reported in
>>>>>>>> Curator. It only happens in standalone ZK instances. From what I can
>>>>>>>> tell, shutting down a standalone instance might result in lost
>>>>>>>> transactions. Am I looking down the wrong path or is this a
>>>>>>>> possibility?
>>>>>>>> 
>>>>>>>> -Jordan
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
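
For readers following the thread, the two shutdown orderings being debated can
be sketched in a few lines of Java. This is only an illustration under stated
assumptions, not ZooKeeper's actual code: the class ShutdownSketch, its method
names, and the string REQUEST_OF_DEATH are hypothetical stand-ins, requests are
modeled as plain strings, and the sketch is single-threaded, so it does not
model the unsynchronized window Chris describes or the rest of the processor
pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a submitted-requests queue; NOT ZooKeeper's classes.
class ShutdownSketch {
    // Hypothetical stand-in for the "request of death" marker.
    static final String REQUEST_OF_DEATH = "REQUEST_OF_DEATH";

    // Ordering described in the thread: clear the queue, then enqueue the marker.
    // Anything submitted but not yet taken by the consumer is discarded.
    static void shutdownByClearing(LinkedBlockingQueue<String> submitted) {
        submitted.clear();
        submitted.add(REQUEST_OF_DEATH);
    }

    // Alternative ordering: leave queued requests in place and append the marker,
    // so the consumer works through them before it stops.
    static void shutdownKeepingQueued(LinkedBlockingQueue<String> submitted) {
        submitted.add(REQUEST_OF_DEATH);
    }

    // Consumer loop: handle requests in order until the marker (or an empty queue).
    static List<String> process(LinkedBlockingQueue<String> submitted) {
        List<String> processed = new ArrayList<>();
        String request;
        while ((request = submitted.poll()) != null
                && !REQUEST_OF_DEATH.equals(request)) {
            processed.add(request);
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> pending = Arrays.asList("create /a", "setData /a");

        LinkedBlockingQueue<String> first = new LinkedBlockingQueue<>(pending);
        shutdownByClearing(first);
        System.out.println("clear first: " + process(first));
        // prints "clear first: []"

        LinkedBlockingQueue<String> second = new LinkedBlockingQueue<>(pending);
        shutdownKeepingQueued(second);
        System.out.println("keep queued: " + process(second));
        // prints "keep queued: [create /a, setData /a]"
    }
}
```

In this model the first ordering drops both pending requests, which matches
Jordan's concern, while the second processes them at the cost of a slower
shutdown, which is the trade-off Flavio and Chris point out.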
