Fangmin Lv created ZOOKEEPER-2808:
-------------------------------------
Summary: ACL with index 1 might be removed if it's only being used once
Key: ZOOKEEPER-2808
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2808
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.6.0
Reporter: Fangmin Lv
Priority: Critical
When Zeus starts up, it creates a DataTree instance, in which the empty config
znode is created with the READ_UNSAFE ACL; that ACL is stored in a map under
index 1. It then loads the snapshot from disk, which clears the nodes and the
ACL map, but the reconfig znode still holds a reference to ACL index 1. Because
the reconfig znode is reused, it may actually be referencing a different ACL
stored in the snapshot. After leader-follower syncing, the reconfig znode is
added back (if it doesn't exist), which removes the previous reference to ACL
index 1. If index 1 then has 0 references, it is removed from the ACL map,
which can make that ACL unusable, and znodes that still reference it will not
be readable.
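The sequence above can be illustrated with a small, self-contained Java model.
This is only a sketch under stated assumptions, not ZooKeeper's actual code:
the class AclCacheSketch, its methods (addUsage, removeUsage, loadSnapshot,
lookup, reproduce), the "/app" path and the "CREATOR_ALL" ACL name are all
hypothetical, and ACLs are represented as plain strings. It assumes the
snapshot stores a different ACL under index 1 that is used by exactly one znode.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of a reference-counted ACL table.
// ACLs are plain strings here; this is not ZooKeeper's real class.
final class AclCacheSketch {
    private final Map<Long, String> aclByIndex = new HashMap<>();
    private final Map<Long, Integer> refCount = new HashMap<>();
    private long nextIndex = 0;

    // Returns the index for this ACL (allocating one if needed) and counts one more user.
    long addUsage(String acl) {
        long index = -1;
        for (Map.Entry<Long, String> e : aclByIndex.entrySet()) {
            if (e.getValue().equals(acl)) {
                index = e.getKey();
                break;
            }
        }
        if (index < 0) {
            index = ++nextIndex;
            aclByIndex.put(index, acl);
        }
        refCount.merge(index, 1, Integer::sum);
        return index;
    }

    // Drops one user of the index; the ACL is evicted when no users remain.
    void removeUsage(long index) {
        Integer count = refCount.get(index);
        if (count == null) {
            return;
        }
        if (count <= 1) {
            refCount.remove(index);
            aclByIndex.remove(index);
        } else {
            refCount.put(index, count - 1);
        }
    }

    // Replaces the table with the snapshot's contents and recounts references.
    void loadSnapshot(Map<Long, String> acls, Map<String, Long> nodeToAcl) {
        aclByIndex.clear();
        refCount.clear();
        nextIndex = 0;
        for (Map.Entry<Long, String> e : acls.entrySet()) {
            aclByIndex.put(e.getKey(), e.getValue());
            nextIndex = Math.max(nextIndex, e.getKey());
        }
        for (long index : nodeToAcl.values()) {
            refCount.merge(index, 1, Integer::sum);
        }
    }

    // Resolves an index to its ACL, failing the way the reported stack trace does.
    String lookup(long index) {
        String acl = aclByIndex.get(index);
        if (acl == null) {
            throw new RuntimeException("Failed to fetch acls for " + index);
        }
        return acl;
    }

    // Replays the reported sequence; returns true if index 1 became unreadable.
    static boolean reproduce() {
        AclCacheSketch cache = new AclCacheSketch();
        // Startup: the config znode is created and its ACL lands at index 1.
        long staleIndex = cache.addUsage("READ_UNSAFE");
        // Snapshot load: index 1 now means a different ACL, used by one znode.
        Map<String, Long> snapshotNodes = Collections.singletonMap("/app", 1L);
        cache.loadSnapshot(Collections.singletonMap(1L, "CREATOR_ALL"), snapshotNodes);
        // After sync: re-adding the config znode releases its stale reference.
        cache.removeUsage(staleIndex);
        cache.addUsage("READ_UNSAFE");
        try {
            cache.lookup(snapshotNodes.get("/app"));
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ACL index 1 lost: " + reproduce());
    }
}
```

In this model the stale reference held across the snapshot load is what drops
the count for index 1 to zero. One possible direction, offered only as an
assumption, would be to release or re-resolve that reference before the table
is cleared.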
Related error logs:
-----------------------------
2017-06-12 12:02:21,443 [myid:2] - ERROR [CommitProcWorkThread-14:DataTree@249] - ERROR: ACL not available for long 1
2017-06-12 12:02:21,444 [myid:2] - ERROR [CommitProcWorkThread-14:FinalRequestProcessor@567] - Failed to process sessionid:0x201035cc882002d type:getChildren cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
java.lang.RuntimeException: Failed to fetch acls for 1
    at org.apache.zookeeper.server.DataTree.convertLong(DataTree.java:250)
    at org.apache.zookeeper.server.DataTree.getACL(DataTree.java:799)
    at org.apache.zookeeper.server.ZKDatabase.getACL(ZKDatabase.java:574)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:463)
    at org.apache.zookeeper.server.quorum.CommitProcessor$CommitWorkRequest.doWork(CommitProcessor.java:439)
    at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:151)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)