Lasaro Camargos created ZOOKEEPER-2653: ------------------------------------------
Summary: epoch files do not match snapshots and logs Key: ZOOKEEPER-2653 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2653 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.9 Environment: Linux 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux (centos) Reporter: Lasaro Camargos Hi all. After shutting zk down and upgrading to centos 7, ZK would not start with exception Removing file: Dec 19, 2016 10:55:08 PM /hedvig/hpod/log/version-2/log.300ee0308 Removing file: Dec 19, 2016 7:11:23 PM /hedvig/hpod/data/version-2/snapshot.300ee0307 java.lang.RuntimeException: Unable to run quorum server at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558) at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500) at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153) at com.hedvig.hpod.service.PodnetService$1.run(PodnetService.java:2262) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: The current epoch, 3, is older than the last zxid, 17179871862 at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:539) ... 4 more All logs are empty, and the following snapshot and commit logs exist find . . ./log ./log/version-2 ./log/version-2/log.40000010a ./log/version-2/log.300ef712b ./log/version-2/log.300f0659e ./log/version-2/.ignore ./data ./data/version-2 ./data/version-2/snapshot.400000109 ./data/version-2/currentEpoch ./data/version-2/acceptedEpoch ./data/version-2/snapshot.300ef712a ./data/version-2/snapshot.300f0659d ./data/myid.bak ./data/myid On other nodes we had the same exception but no commit log deletion. java.lang.RuntimeException: Unable to run quorum server at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558) at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500) at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153) at com.hedvig.hpod.service.PodnetService$1.run(PodnetService.java:2262) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: The current epoch, 3, is older than the last zxid, 17179871862 at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:539) ./log ./log/version-2 ./log/version-2/log.300f06cfc ./log/version-2/log.300f03890 ./log/version-2/.ignore ./data ./data/version-2 ./data/version-2/snapshot.300f06cfb ./data/version-2/snapshot.300f06f10 ./data/version-2/currentEpoch ./data/version-2/acceptedEpoch ./data/version-2/snapshot.300f0388f ./data/myid.bak ./data/myid /log ./log/version-2 ./log/version-2/log.300f06dbf ./log/version-2/log.300ed96fc ./log/version-2/log.300ef1048 ./log/version-2/.ignore ./data ./data/version-2 ./data/version-2/snapshot.300f06dbe ./data/version-2/currentEpoch ./data/version-2/acceptedEpoch ./data/version-2/snapshot.300ed96fb ./data/version-2/snapshot.300ef1048 ./data/myid.bak ./data/myid The symptoms look like ZOOKEEPER-1549, but we are running 3.4.9 here. Any ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)