suja s created ZOOKEEPER-1612:
---------------------------------
Summary: Zookeeper unable to recover and start once datadir disk is full and disk space cleared
Key: ZOOKEEPER-1612
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1612
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.4.3
Reporter: suja s
Once the disk holding the ZooKeeper data dir becomes full, the server process shuts down.
{noformat}
2012-12-14 13:22:26,959 [myid:2] - ERROR [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@276] - Severe unrecoverable error, exiting
java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:282)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
    at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:56)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
    at org.apache.jute.BinaryOutputArchive.writeBuffer(BinaryOutputArchive.java:119)
    at org.apache.zookeeper.server.DataNode.serialize(DataNode.java:168)
    at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
    at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1115)
    at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
    at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
    at org.apache.zookeeper.server.DataTree.serialize(DataTree.java:1179)
    at org.apache.zookeeper.server.util.SerializeUtils.serializeSnapshot(SerializeUtils.java:138)
    at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:213)
    at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:230)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:242)
    at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:274)
    at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:407)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:82)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:759)
{noformat}
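The trace above shows the write failing partway through FileSnap.serialize. As a general illustration only (this is not ZooKeeper code and not a proposed patch), one common way to avoid leaving a partially written file under its final name is to write to a temporary file in the same directory and rename it once the write has completed. The class name AtomicWriteSketch, the helper writeAtomically and the file name snapshot.demo are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicWriteSketch {

    // Hypothetical helper (not ZooKeeper code): write the bytes to a temporary
    // file in the same directory, then rename it to the target name. If the
    // write fails (for example with "No space left on device"), the temporary
    // file is deleted and no partial file appears under the target name.
    // Assumes the target does not already exist; this sketch does not fsync.
    static void writeAtomically(Path target, byte[] data) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write(data);
            }
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapdir");
        Path target = dir.resolve("snapshot.demo");
        writeAtomically(target, new byte[] {1, 2, 3});
        System.out.println("bytes written: " + Files.size(target));
        Files.delete(target);
        Files.delete(dir);
    }
}
```

Whether something along these lines fits the snapshot and transaction log code paths would need to be checked against the actual sources.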
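Later the disk space was cleared and ZooKeeper was started again. Startup fails
because the server cannot load its database from disk: the log below shows it
reading the snapshot and then failing with an EOFException. Because the load
from disk fails, the server cannot join its peers in the quorum and get a
snapshot diff.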
{noformat}
2012-12-14 16:20:31,489 [myid:2] - INFO [main:FileSnap@83] - Reading snapshot ../dataDir/version-2/snapshot.1000000042
2012-12-14 16:20:31,564 [myid:2] - ERROR [main:QuorumPeer@472] - Unable to load database on disk
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
    at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:436)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
2012-12-14 16:20:31,566 [myid:2] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:473)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
    at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
{noformat}
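The EOFException in the trace above is raised by DataInputStream.readInt while FileHeader.deserialize reads the start of a file. A possible explanation (an assumption, not confirmed in this report) is that a file was created while the disk was full and holds fewer bytes than the header needs. The standalone sketch below reproduces only that JDK behaviour with an empty temporary file; TruncatedHeaderDemo and canReadLeadingInt are hypothetical names and not part of ZooKeeper.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

public class TruncatedHeaderDemo {

    // Hypothetical helper (not ZooKeeper API): returns true if the first
    // 4-byte int of the file can be read, false if the file ends too early.
    static boolean canReadLeadingInt(File f) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            in.readInt(); // throws EOFException on a file shorter than 4 bytes
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file that was created but never written to,
        // for example because the disk was full at the time.
        File empty = File.createTempFile("log.", ".tmp");
        File ok = File.createTempFile("log.", ".tmp");
        try {
            Files.write(ok.toPath(), new byte[] {0, 0, 0, 1});
            System.out.println("empty file readable: " + canReadLeadingInt(empty));
            System.out.println("4-byte file readable: " + canReadLeadingInt(ok));
        } finally {
            empty.delete();
            ok.delete();
        }
    }
}
```

Running it prints false for the empty file and true for the 4-byte file. A check of this kind could in principle be used to spot zero-length or truncated files in the data dir before startup, but whether skipping such files is safe is a separate question for the fix.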