[
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840421#comment-15840421
]
Mohammad Arshad commented on ZOOKEEPER-2464:
--------------------------------------------
Root cause of the problem is the inconsistent behavior of
DataNode.getChildren() API.
DataNode.getChildren() API Current Behaviour:
# returns null initially
When DataNode is created and no children are added yet, DataNode.getChildren()
returns null
# returns empty set after all the children are deleted:
created a Node
add a child
delete the child
DataNode.getChildren() returns empty set.
I think we should fix this issue by modifying the DataNode.getChildren() API.
We should always return empty set if there is no child.
> NullPointerException on ContainerManager
> ----------------------------------------
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.1
> Reporter: Stefano Salmaso
> Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a
> distributed lock using Curator
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000,
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
> at
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
> at
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
> at
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our
> servers
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)