[ https://issues.apache.org/jira/browse/ZOOKEEPER-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160158#comment-14160158 ]
Yip Ng commented on ZOOKEEPER-2052:
-----------------------------------
Hongchao:
Thanks for taking a look at the patch. The patch only deals with saving
pending changes and does not change any of the precondition checks for
delete; hence, I am not sure why it didn't work the way you described.
Anyhow, I verified your scenario and it works as expected: the second delete,
on a non-existent node, triggers the abort of the entire operation. A
KeeperException.NoNodeException was thrown at getPathForRecord() in this case
and the /existed node was not deleted.
2014-10-06 02:38:21,795 [myid:] - INFO [ProcessThread(sid:0
cport:11221)::PrepRequestProcessor@798] - Got user-level KeeperException when
processing sessionid:0x148e4d2ed5b0000 type:multi cxid:0x4 zxid:0x4 txntype:2
reqpath:n/a aborting remaining multi ops. Error Path:/non-existed
Error:KeeperErrorCode = NoNode for /non-existed
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
...
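
To illustrate the all-or-nothing behavior described above, here is a minimal,
self-contained Java sketch. It is a toy in-memory model and not ZooKeeper's
implementation: the class AtomicMultiDeleteSketch, its methods, and the nested
NoNodeException are hypothetical stand-ins, and the "work on a copy, publish
only on success" approach is an assumption chosen for brevity rather than a
description of how PrepRequestProcessor tracks pending changes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy in-memory model of an all-or-nothing multi-delete; not ZooKeeper code. */
class AtomicMultiDeleteSketch {

    /** Hypothetical stand-in for KeeperException.NoNodeException. */
    static class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("KeeperErrorCode = NoNode for " + path);
        }
    }

    // Paths that currently exist in the toy tree.
    private final Set<String> nodes = new HashSet<>();

    void create(String path) {
        nodes.add(path);
    }

    boolean exists(String path) {
        return nodes.contains(path);
    }

    /**
     * Deletes every path or none. Deletes are applied to a working copy,
     * and the copy replaces the live state only if all of them succeed.
     */
    void multiDelete(List<String> paths) throws NoNodeException {
        Set<String> pending = new HashSet<>(nodes);
        for (String path : paths) {
            if (!pending.remove(path)) {
                // Abort: the live state in 'nodes' has not been touched.
                throw new NoNodeException(path);
            }
        }
        nodes.clear();
        nodes.addAll(pending);
    }

    public static void main(String[] args) {
        AtomicMultiDeleteSketch tree = new AtomicMultiDeleteSketch();
        tree.create("/existed");
        try {
            tree.multiDelete(Arrays.asList("/existed", "/non-existed"));
        } catch (NoNodeException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("/existed still present: " + tree.exists("/existed"));
    }
}
```

In this model the failed second delete leaves /existed in place, which mirrors
the outcome reported in the log above.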
I did run the patch with "ant clean test" before submitting it and all the
tests passed in my environment. Is there a particular test case you are
referring to for the core tests that failed?
I see 6 failed tests in the core test report, but the error details show
"Address already in use" for 4 of the test cases, and the other 2 are "waiting
for server being up" and "Forked Java VM exited abnormally. Please note the
time in the report does not reflect the time until the VM exit." respectively.
It looks like socket bind issues are preventing the tests from running to
completion.
The test case was a simplified version to reproduce the issue. We use multi()
with delete to remove a set of nodes either in its entirety or not at all, as
advertised by the multi javadoc. Partial deletions would leave the application
state inconsistent.
> Unable to delete a node when the node has no children
> -----------------------------------------------------
>
> Key: ZOOKEEPER-2052
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2052
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Environment: Red Hat Enterprise Linux 6.1 x86_64, standalone or 3
> node ensemble (v3.4.6), 2 Java clients (v3.4.6)
> Reporter: Yip Ng
> Attachments: ZOOKEEPER-2052.patch, ZOOKEEPER-2052.patch, zookeeper.log
>
>
> We stumbled upon a ZooKeeper bug where a node with no children cannot be
> removed on our 3 node ZooKeeper ensemble or standalone ZooKeeper on Red Hat
> Enterprise Linux x86_64 environment. Here is an example scenario/setup:
> o Standalone ZooKeeper or 3 node ensemble (v3.4.6)
> o 2 Java clients (v3.4.6)
> - Client A creates a persistent node (e.g.: /metadata/resources)
> - Client B creates ephemeral nodes under this persistent node
> o Client A attempts to remove the /metadata/resources node via multi op
> delete but fails since there are children
> o Client B's session expired, all the ephemeral nodes are removed
> o Client A attempts to recursively remove /metadata/resources node via
> multi op, this is expected to succeed but got the following exception:
> org.apache.zookeeper.KeeperException$NotEmptyException:
> KeeperErrorCode = Directory not empty
> (Note that Client B is the only client that creates these ephemeral nodes)
> o After this, we use zkCli.sh to inspect the problematic node. zkCli.sh
> shows that the /metadata/resources node indeed has no children, but it
> will not allow the /metadata/resources node to be deleted (shown below).
> [zk: localhost:2181(CONNECTED) 0] ls /
> [zookeeper, metadata]
> [zk: localhost:2181(CONNECTED) 1] ls /metadata
> [resources]
> [zk: localhost:2181(CONNECTED) 2] get /metadata/resources
> null
> cZxid = 0x3
> ctime = Wed Oct 01 22:04:11 PDT 2014
> mZxid = 0x3
> mtime = Wed Oct 01 22:04:11 PDT 2014
> pZxid = 0x9
> cversion = 2
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 3] delete /metadata/resources
> Node not empty: /metadata/resources
> [zk: localhost:2181(CONNECTED) 4] get /metadata/resources
> null
> cZxid = 0x3
> ctime = Wed Oct 01 22:04:11 PDT 2014
> mZxid = 0x3
> mtime = Wed Oct 01 22:04:11 PDT 2014
> pZxid = 0x9
> cversion = 2
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 0
> o The only ways to remove this node are to either:
> a) Restart the ZooKeeper server
> b) set data to /metadata/resources then followed by a subsequent delete.