What we’re seeing is transaction rollbacks. The bug was reported against 
Curator but I can reproduce it with raw ZooKeeper: 
https://issues.apache.org/jira/browse/CURATOR-268 

-JZ

> On Oct 5, 2015, at 12:00 PM, Flavio Junqueira <f...@apache.org> wrote:
> 
> It is safe because the requests in the submittedRequests queue haven't been 
> prepared yet. The simplest pipeline is the one of the standalone server: 
> preprequestprocessor -> syncrequestprocessor -> finalrequestprocessor. If the 
> request hasn't gone through prepRP, then nothing has changed in the state of 
> zookeeper. The ones that have gone through prepRP will complete regularly. 
> For quorum, the pipeline is a bit more complex, but the reasoning is very 
> similar.
> 
> -Flavio
> 
> 
>> On 05 Oct 2015, at 17:55, Jordan Zimmerman <jor...@jordanzimmerman.com> 
>> wrote:
>> 
>> That would mean that there’s no safe way to shut down the server, right? 
>> Ideally, you’d want the server to shut down gracefully: a) stop receiving 
>> requests; b) complete current requests; c) shut down. That’s how most 
>> servers work. Of course, you might want a quick-die shutdown but that’s not 
>> usual behavior.
>> 
>> -JZ
>> 
>>> On Oct 5, 2015, at 11:30 AM, Flavio Junqueira <f...@apache.org> wrote:
>>> 
>>> You suggested that it is a bug, and I'm arguing that it isn't a bug. You 
>>> may want to optimize and still process the requests in the queue before 
>>> injecting RoD, but discarding them doesn't sound like a bug because you 
>>> can't guarantee that requests submitted concurrently with the server 
>>> shutting down will be executed. Optimizing isn't the same as spotting a 
>>> bug. Also, if you are trying to shut down, you probably want to do it asap, 
>>> rather than wait for a whole batch of operations to complete.
>>> 
>>> -Flavio
>>> 
>>>> On 05 Oct 2015, at 14:57, Jordan Zimmerman <jor...@jordanzimmerman.com> 
>>>> wrote:
>>>> 
>>>> Flavio, that isn’t logical. Just because you can’t make that guarantee 
>>>> doesn’t imply that you should flush already queued transactions.
>>>> 
>>>> -JZ
>>>> 
>>>>> On Oct 5, 2015, at 3:24 AM, Flavio Junqueira <f...@apache.org> wrote:
>>>>> 
>>>>> Injecting the RoD means that we are shutting down the server pipeline. If 
>>>>> the server is shutting down, then we can't guarantee that a request 
>>>>> submitted concurrently will be executed anyway, so clearing the queue of 
>>>>> submitted requests (submitted but not prepped for execution) sounds like 
>>>>> correct behavior to me.
>>>>> 
>>>>> -Flavio  
>>>>> 
>>>>> 
>>>>>> On 04 Oct 2015, at 23:05, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>>>>> 
>>>>>> Hi Jordan,
>>>>>> 
>>>>>> That's an interesting find.  I think you have a good theory.  Have you
>>>>>> already tried patching this to see if the bug reported against Curator
>>>>>> goes away?  (BTW, is there a corresponding Curator JIRA?)
>>>>>> 
>>>>>> That logic dates all the way back to the initial import of the codebase.
>>>>>> I can't find a definitive explanation, but my best guess is that dropping
>>>>>> pending requests (instead of gracefully quiescing) can give a faster
>>>>>> shutdown in the event of a heavily overloaded server.  However, the
>>>>>> correctness of this choice looks questionable, especially in stand-alone
>>>>>> mode where you don't have a cluster of other machines to compensate.
>>>>>> 
>>>>>> Something else interesting is that this doesn't even really guarantee that
>>>>>> the request of death is the only thing remaining to be processed.  There
>>>>>> is no synchronization over the queue covering both the clear and the
>>>>>> enqueue of the request of death, so I think there is a window in which
>>>>>> other requests could trickle in ahead of the request of death.
>>>>>> 
>>>>>> --Chris Nauroth
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 10/1/15, 8:21 PM, "Jordan Zimmerman" <jord...@bluejeansnet.com> wrote:
>>>>>> 
>>>>>>> Why does PrepRequestProcessor.shutdown() call submittedRequests.clear();
>>>>>>> before adding the death request? What if there are pending requests? I'm
>>>>>>> trying to track down a bug reported in Curator. It only happens in
>>>>>>> Standalone ZK instances. From what I can tell, shutting down a standalone
>>>>>>> instance might result in lost transactions. Am I looking down the wrong
>>>>>>> path or is this a possibility?
>>>>>>> 
>>>>>>> -Jordan
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
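
For readers following the thread, the behavior under discussion can be modeled with a small single-threaded sketch. It is not ZooKeeper's code: the class name, the string-typed requests, and the REQUEST_OF_DEATH marker are hypothetical stand-ins, and the real PrepRequestProcessor runs on its own thread. The sketch only contrasts clearing the queue before enqueuing the death request with enqueuing it behind the pending requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class ShutdownQueueSketch {
    // Hypothetical stand-in for the "request of death" marker (assumption).
    static final String REQUEST_OF_DEATH = "REQUEST_OF_DEATH";

    /** Consumer loop: handles requests in order until it sees the marker. */
    static List<String> process(LinkedBlockingQueue<String> queue) {
        List<String> processed = new ArrayList<>();
        String request;
        // poll() returns null on an empty queue; the marker normally ends the loop first.
        while ((request = queue.poll()) != null) {
            if (REQUEST_OF_DEATH.equals(request)) {
                break;
            }
            processed.add(request);
        }
        return processed;
    }

    /** Shutdown as questioned in the thread: clear pending requests, then add the marker. */
    static List<String> shutdownWithClear(List<String> pending) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(pending);
        queue.clear();               // submitted-but-unprepared requests are dropped here
        queue.add(REQUEST_OF_DEATH);
        return process(queue);
    }

    /** Graceful alternative: the marker goes behind whatever is already queued. */
    static List<String> shutdownGraceful(List<String> pending) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(pending);
        queue.add(REQUEST_OF_DEATH);
        return process(queue);
    }

    public static void main(String[] args) {
        List<String> pending = List.of("create /a", "setData /a", "delete /a");
        System.out.println("clear first: " + shutdownWithClear(pending));
        System.out.println("graceful:    " + shutdownGraceful(pending));
    }
}
```

In this model the first variant processes nothing and the second processes all three requests. Flavio's point maps onto the first variant: the dropped requests were never prepared, so they changed no state. Chris's point about the missing synchronization between the clear and the enqueue does not show up here, because the sketch has no concurrent producer.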
