[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589719#comment-15589719
 ] 

Sergey Maslyakov commented on ZOOKEEPER-2332:
---------------------------------------------

I've seen this problem happen on a system that ran out of disk space because 
another application filled up the disk. The entry for the transaction log file 
was created on the file system but ZooKeeper was not able to write anything 
into it. After the system was rebooted and disk space was released, ZooKeeper 
failed to start.

I think this is a two-fold problem.
# On one hand, ZooKeeper should not be creating corrupted log or snapshot files.
# On the other hand, it should not explode with an unhandled exception if it 
does come across an invalid log file.
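
To illustrate the second point, here is a minimal sketch of a scan that tolerates truncated files instead of letting the EOFException propagate. It is not ZooKeeper code: the class TolerantLogScan and the method readableLogs are hypothetical names, and the header layout (int magic, int version, long dbid) is an assumption based on my reading of FileHeader.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper.
final class TolerantLogScan {

    /**
     * Returns the candidate files whose header could be read in full.
     * Files that hit end-of-file while reading the header are skipped.
     */
    static List<File> readableLogs(List<File> candidates) throws IOException {
        List<File> readable = new ArrayList<>();
        for (File f : candidates) {
            try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
                // Assumed header layout: int magic, int version, long dbid.
                in.readInt();
                in.readInt();
                in.readLong();
                readable.add(f);
            } catch (EOFException e) {
                // Empty or truncated file: report it and move on to the next one.
                System.err.println("Skipping truncated log file: " + f);
            }
        }
        return readable;
    }
}
```

Whether skipping is safe for a truncated file that is not the most recent one is a separate question; the sketch only shows the exception being handled rather than escaping to main.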

Before opening a snapshot file, ZooKeeper does some quick and inexpensive 
validation and rejects corrupted snapshots. It does not validate log files, 
and it does not handle read/parse errors when it comes across a corrupted 
log file.
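
A cheap pre-check for log files, analogous to the snapshot validation, might look like the sketch below. The names TxnLogCheck and looksLikeValidTxnLog are hypothetical, and the 16-byte header size and the "ZKLG" magic are assumptions based on my reading of FileTxnLog, so treat them as placeholders to verify against the source.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper.
final class TxnLogCheck {

    // Assumed header layout: int magic + int version + long dbid = 16 bytes.
    static final int HEADER_SIZE = 16;

    // Assumed magic value: the ASCII bytes "ZKLG" read as a big-endian int.
    static final int TXNLOG_MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    /**
     * Returns true if the file is long enough to hold a header and
     * starts with the expected magic number.
     */
    static boolean looksLikeValidTxnLog(File f) {
        if (!f.isFile() || f.length() < HEADER_SIZE) {
            // Covers the zero-length file left behind when the disk was full.
            return false;
        }
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            return in.readInt() == TXNLOG_MAGIC;
        } catch (IOException e) {
            return false;
        }
    }
}
```

This only rejects files that cannot possibly contain a header; it does not detect corruption further into the file.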

The defect is reproducible on the heads of master, branch-3.5, and branch-3.4.

> Zookeeper failed to start for empty txn log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-2332
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2332
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.6
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Critical
>             Fix For: 3.6.0
>
>         Attachments: ZOOKEEPER-2332-v001.diff
>
>
> We found that a ZooKeeper server running version 3.4.6 failed to start because 
> there was an empty txn log in the log dir.  
> I think we should skip empty log files while restoring the datatree. 
> Any suggestions?
> {code}
> 2015-11-27 19:16:16,887 [myid:] - ERROR [main:ZooKeeperServerMain@63] - Unexpected exception, exiting abnormally
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:576)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:595)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:561)
> at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:643)
> at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:158)
> at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:272)
> at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:399)
> at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:122)
> at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:113)
> at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
