[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750061#comment-13750061
 ] 

Flavio Junqueira commented on ZOOKEEPER-1624:
---------------------------------------------

I was wondering what to do with this jira. We need to work on the Java 
test case and, on top of that, decide what to do for the 3.4 branch. I believe 
that we didn't check in ZOOKEEPER-1572 to the 3.4 branch because [~mahadev] 
said that we don't check new features into an ongoing branch. Well, in this 
case, I'd say we should, so that we can cleanly apply this patch, unless we come 
up with a way of testing that does not rely on the async multi API. 

On the Java test case, [~thawan] says that it doesn't pass, but the last run on 
Jenkins returned +1 for the core unit tests. Did you mean to say that it 
doesn't pass reliably?

Could you people give me some feedback here, please, [~thawan], [~fournc]?
                
> PrepRequestProcessor abort multi-operation incorrectly
> ------------------------------------------------------
>
>                 Key: ZOOKEEPER-1624
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1624
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>            Reporter: Thawan Kooburat
>            Assignee: Thawan Kooburat
>            Priority: Critical
>              Labels: zk-review
>             Fix For: 3.5.0, 3.4.6
>
>         Attachments: ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch, 
> ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch
>
>
> We found this issue when issuing multiple instances of the following 
> multi-op concurrently:
> multi {
> 1. create sequential node /a- 
> 2. create node /b
> }
> The expected result is that only the first multi-op request succeeds 
> and the remaining requests fail because /b already exists.
> However, the reported result is that a subsequent multi-op failed because 
> the sequential node creation failed, which should not be possible.
> Below are the return codes for each sub-op when issuing 3 instances of the 
> above multi-op asynchronously:
> 1. ZOK, ZOK
> 2. ZOK, ZNODEEXISTS
> 3. ZNODEEXISTS, ZRUNTIMEINCONSISTENCY
> After adding more debug logging, I found the cause: PrepRequestProcessor 
> rolls back the outstandingChanges of the second multi-op incorrectly, which 
> makes sequential node name generation incorrect. Below are the sequential 
> node names generated by PrepRequestProcessor:
> 1. create /a-0001
> 2. create /a-0003
> 3. create /a-0001
> The bug is in the getPendingChanges() method. It fails to copy the 
> ChangeRecord for the parent node ("/"), so rollbackPendingChanges() cannot 
> restore the correct previous change record of the parent node when aborting 
> the second multi-op.
> The impact of this bug is that sequential node creation on the same parent 
> node may fail until the previous one is committed. I am not sure whether 
> there are other implications.
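
The failure mode described above can be illustrated with a small standalone
model. This is only a sketch under simplifying assumptions, not the actual
PrepRequestProcessor code: the class MultiRollbackSketch, its fields, and its
methods are hypothetical, pending state is reduced to a map from path to child
version, nothing is ever committed, and sequence numbers start at 0 rather than
matching the names in the report. With the parent left out of the snapshot, the
rollback of the second multi-op drops the pending record for "/", so the third
multi-op reuses a stale sequence number.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MultiRollbackSketch {
    // Committed state: path -> child version. Nothing commits in this sketch.
    private final Map<String, Integer> committed = new HashMap<>();
    // Pending (uncommitted) state: path -> latest pending child version.
    private final Map<String, Integer> outstanding = new HashMap<>();
    // false models the reported bug, true models also snapshotting parents.
    private final boolean snapshotParents;
    public final List<String> generatedNames = new ArrayList<>();

    public MultiRollbackSketch(boolean snapshotParents) {
        this.snapshotParents = snapshotParents;
        committed.put("/", 0);
    }

    private static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    // Pending state wins over committed state, as for not-yet-applied requests.
    private Integer lookup(String path) {
        Integer pending = outstanding.get(path);
        return pending != null ? pending : committed.get(path);
    }

    private void remember(Map<String, Integer> snapshot, String path) {
        Integer pending = outstanding.get(path);
        if (pending != null) {
            snapshot.put(path, pending);
        }
    }

    /** Runs multi { create sequential /a- ; create /b }, one result per sub-op. */
    public List<String> multi() {
        String[] paths = {"/a-", "/b"};
        boolean[] sequential = {true, false};

        // Snapshot the pending records that a rollback must be able to restore.
        Map<String, Integer> snapshot = new HashMap<>();
        for (String p : paths) {
            remember(snapshot, p);
            if (snapshotParents) {
                remember(snapshot, parentOf(p));
            }
        }

        Set<String> touched = new HashSet<>();
        List<String> results = new ArrayList<>();
        boolean failed = false;
        for (int i = 0; i < paths.length; i++) {
            if (failed) {
                results.add("RUNTIMEINCONSISTENCY");
                continue;
            }
            String parent = parentOf(paths[i]);
            int cversion = lookup(parent);
            String path = sequential[i]
                    ? String.format("%s%010d", paths[i], cversion)
                    : paths[i];
            if (sequential[i]) {
                generatedNames.add(path);
            }
            if (lookup(path) != null) {
                results.add("NODEEXISTS");
                failed = true;
                continue;
            }
            outstanding.put(path, 0);
            outstanding.put(parent, cversion + 1);
            touched.add(path);
            touched.add(parent);
            results.add("OK");
        }
        if (failed) {
            // Roll back: drop this multi-op's records, then restore the snapshot.
            outstanding.keySet().removeAll(touched);
            outstanding.putAll(snapshot);
        }
        return results;
    }

    public static void main(String[] args) {
        for (boolean fixed : new boolean[] {false, true}) {
            MultiRollbackSketch server = new MultiRollbackSketch(fixed);
            System.out.println(fixed ? "parent snapshotted:" : "parent not snapshotted:");
            for (int i = 0; i < 3; i++) {
                System.out.println("  " + server.multi());
            }
        }
    }
}
```

In this model, the run without the parent snapshot ends with the third multi-op
failing on its sequential create, which is the same pattern as the return codes
listed in the description. With the parent included in the snapshot, the third
multi-op fails only on /b.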

