[
https://issues.apache.org/jira/browse/ZOOKEEPER-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589719#comment-15589719
]
Sergey Maslyakov commented on ZOOKEEPER-2332:
---------------------------------------------
I've seen this problem happen on a system that ran out of disk space because
another application filled up the disk. The entry for the transaction log file
was created on the file system, but ZooKeeper was not able to write anything
into it. After the system was rebooted and disk space was freed, ZooKeeper
failed to start.
I think this is a two-fold problem.
# On one hand, ZooKeeper should not be creating corrupted log or snapshot files.
# On the other hand, it should not explode with an unhandled exception if it
does come across an invalid log file.
Before opening a snapshot file, ZooKeeper does some quick and inexpensive
validation and rejects corrupted snapshots. It does not validate log files,
however, and does not handle read/parse errors if it comes across a corrupted
log file.
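To illustrate the kind of inexpensive check meant here, below is a minimal sketch, not the actual ZooKeeper code or a proposed patch. The class and method names (TxnLogHeaderCheck, hasValidHeader, usableLogs) are hypothetical. It assumes the transaction log starts with a 16-byte header (int magic, int version, long dbid) whose magic is the bytes "ZKLG"; please verify that against FileHeader and FileTxnLog before relying on it.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of ZooKeeper.
final class TxnLogHeaderCheck {
    // Assumption: header layout is int magic + int version + long dbid.
    static final int HEADER_SIZE = 4 + 4 + 8;

    // Assumption: the magic number is the four ASCII bytes "ZKLG".
    static final int TXNLOG_MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    // Returns false for an empty or truncated file, or one with a wrong magic,
    // instead of letting an EOFException escape to the caller.
    static boolean hasValidHeader(File logFile) throws IOException {
        if (logFile.length() < HEADER_SIZE) {
            return false; // covers the zero-length file from this issue
        }
        try (DataInputStream in = new DataInputStream(new FileInputStream(logFile))) {
            return in.readInt() == TXNLOG_MAGIC;
        } catch (EOFException e) {
            return false; // file shrank between the length check and the read
        }
    }

    // Keeps only the log files that pass the header check and reports the rest.
    static List<File> usableLogs(List<File> candidates) throws IOException {
        List<File> usable = new ArrayList<>();
        for (File f : candidates) {
            if (hasValidHeader(f)) {
                usable.add(f);
            } else {
                System.err.println("Skipping invalid txn log: " + f);
            }
        }
        return usable;
    }
}
```

Whether an invalid log should be skipped silently, skipped with a warning, or treated as fatal when it is not the most recent file is a separate design question; the sketch only shows that the check itself is cheap.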
The defect is reproducible on the heads of master, branch-3.5, and branch-3.4.
> Zookeeper failed to start for empty txn log
> -------------------------------------------
>
> Key: ZOOKEEPER-2332
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2332
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.6
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Critical
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-2332-v001.diff
>
>
> We found that a ZooKeeper server running version 3.4.6 failed to start
> because there was an empty txn log in the log dir.
> I think we should skip empty log files when restoring the datatree.
> Any suggestions?
> {code}
> 2015-11-27 19:16:16,887 [myid:] - ERROR [main:ZooKeeperServerMain@63] - Unexpected exception, exiting abnormally
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:576)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:595)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:561)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:643)
> at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:158)
> at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:272)
> at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:399)
> at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:122)
> at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:113)
> at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> {code}