Zhuqi Jin created ZOOKEEPER-3848:
------------------------------------
Summary: Zookeeper upgrade fails due to missing snapshots on branch-3.6
Key: ZOOKEEPER-3848
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3848
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.6.2
Reporter: Zhuqi Jin
We tested upgrading a single-node ZooKeeper from branch-3.4/branch-3.5 to
branch-3.6, but the upgraded node failed to start.
The error message is as follows:
{code:java}
2020-05-24 00:24:24,996 [myid:1] - ERROR [main:ZooKeeperServerMain@90] - Unexpected exception, exiting abnormally
java.io.IOException: No snapshot found, but there are log entries. Something is broken!
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:281)
	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:285)
	at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:484)
	at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:655)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:758)
	at org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:130)
	at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:159)
	at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:112)
	at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:67)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:140)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:90)
2020-05-24 00:24:24,999 [myid:1] - INFO [main:ZKAuditProvider@42] - ZooKeeper audit is disabled.
2020-05-24 00:24:25,001 [myid:1] - ERROR [main:ServiceUtils@42] - Exiting JVM with code 1
{code}
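The exception text suggests that the data directory holds transaction logs but no snapshot. A minimal sketch for checking this before starting the upgraded node is shown below. It assumes the usual on-disk layout, with files named `snapshot.<zxid>` and `log.<zxid>` under `dataDir/version-2`, and that no separate `dataLogDir` is configured. The function names are hypothetical and not part of ZooKeeper. The check only looks at file names, so it approximates the server's own test, which inspects actual log entries.

```python
from pathlib import Path


def classify_data_dir(data_dir):
    """Return (snapshots, txn_logs): file names found under data_dir/version-2."""
    version_dir = Path(data_dir) / "version-2"
    if not version_dir.is_dir():
        # Nothing has been written yet, so there is nothing to classify.
        return [], []
    names = sorted(p.name for p in version_dir.iterdir() if p.is_file())
    snapshots = [n for n in names if n.startswith("snapshot.")]
    txn_logs = [n for n in names if n.startswith("log.")]
    return snapshots, txn_logs


def would_hit_missing_snapshot(data_dir):
    """True when transaction logs exist but no snapshot file does."""
    snapshots, txn_logs = classify_data_dir(data_dir)
    return not snapshots and bool(txn_logs)
```

For the configuration in this report, the call would be `would_hit_missing_snapshot("/tmp/zookeeper")`.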
The error can be reproduced with the following steps:
# Step 1: Start a single-node ZooKeeper (compiled from either branch-3.4 or
branch-3.5) with the following configuration (zoo.cfg):
{code:java}
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=localhost:2888:3888{code}
# Step 2: Use a ZooKeeper stress-testing tool, zk-smoketest
([https://github.com/phunt/zk-smoketest.git]), to exercise this node. We invoked
create, set, and get operations in zk-smoketest but no delete operations, so
that the generated data are left on disk.
# Step 3: Upgrade the node to branch-3.6 with the same configuration. After
the upgrade, as the log above shows, ZooKeeper failed to start.
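For readers without zk-smoketest at hand, the workload in Step 2 can be approximated with a small script. The sketch below is not what we ran; it is a stand-in that issues create, set, and get calls and never deletes. It assumes a client object exposing `ensure_path`, `create`, `set`, and `get` in the style of the kazoo library's `KazooClient`. The function name, the `/smoketest` root, and the node count are hypothetical.

```python
def generate_load(client, root="/smoketest", count=10):
    """Create, update, and read back `count` znodes under `root`.

    Nothing is deleted, so the written data stays on disk.
    """
    client.ensure_path(root)
    paths = []
    for i in range(count):
        path = f"{root}/node-{i}"
        client.create(path, b"initial")   # create operation
        client.set(path, b"updated")      # set operation
        data, _stat = client.get(path)    # get operation returns (data, stat)
        if data != b"updated":
            raise RuntimeError(f"unexpected data at {path}: {data!r}")
        paths.append(path)
    return paths
```

With kazoo installed, one possible usage is to build `KazooClient(hosts="localhost:2181")`, call `start()`, pass the client to `generate_load`, and then call `stop()`. Running it a second time against the same server would fail on `create`, because the znodes already exist.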
We learned about ZOOKEEPER-3056 and ZOOKEEPER-3513, and added
{code:java}
zookeeper.snapshot.trust.empty=true {code}
to branch-3.6's configuration (zoo.cfg), but the node still hit the same failure.