[ https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412037#comment-13412037 ]
Flavio Junqueira commented on ZOOKEEPER-1453:
---------------------------------------------

Hi Bill, Thanks a lot for your input. I really appreciate that you're taking your time to revisit this part of the code. Check some comments I have below, please:

bq. If a CRC failure is always treated as EOF, then corruption that is not from a partial write during a crash will not be treated like corruption

I think that if we are to assume corruptions like bit flips and such, we need to do much more than adding CRCs and block-aligned writes. I'm a bit concerned about a full redesign of the transaction log scheme to consider cases that the current fault model of zookeeper does not cover.

bq. The log is preformatted to contain valid blocks with an earlier log sequence number.

I'm not sure I understand this step. How do we know the log sequence numbers beforehand?

bq. It may be sufficient to decide a CRC failure is EOF if it is caused by the CRC value being zero.

It sounds right to me.

> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1453
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Patrick Hunt
>            Priority: Critical
>         Attachments: 10.10.5.123-withPath1489.tar.gz, 10.10.5.123.tar.gz,
> 10.10.5.42-withPath1489.tar.gz, 10.10.5.42.tar.gz,
> 10.10.5.44-withPath1489.tar.gz, 10.10.5.44.tar.gz
>
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that
> during server recovery
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next()
> does not indicate if the available logs are valid or not. In some cases (say
> a truncated record and a single txnlog in the datadir) we will not detect
> that the file is corrupt, vs reaching the end of the file.
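For readers following the thread, here is a rough sketch of the idea quoted in the comment above: treat a record whose stored CRC is zero as the end of the log, and treat any other CRC mismatch as corruption. This is not the FileTxnIterator code and not ZooKeeper's on-disk format. The class name `CrcEofSketch`, the `Result` enum, and the record layout (8-byte checksum, 4-byte length, payload) are all assumptions made for illustration. It also assumes unwritten, preallocated space is zero-filled, and it ignores the rare case of a valid record whose checksum happens to be zero.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Adler32;

// Illustrative only: hypothetical record layout, not ZooKeeper's txn log format.
class CrcEofSketch {
    enum Result { RECORD, END_OF_LOG, CORRUPT }

    // Assumed header: 8-byte checksum followed by a 4-byte payload length.
    static final int HEADER_BYTES = Long.BYTES + Integer.BYTES;

    static long checksum(byte[] payload) {
        Adler32 crc = new Adler32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Encodes one record as [checksum][length][payload].
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putLong(checksum(payload))
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Classifies whatever starts at the buffer's current position.
    static Result readNext(ByteBuffer log) {
        if (!log.hasRemaining()) {
            return Result.END_OF_LOG;   // clean end of file
        }
        if (log.remaining() < HEADER_BYTES) {
            return Result.CORRUPT;      // simplification: a partial header is flagged
        }
        long storedCrc = log.getLong();
        int length = log.getInt();
        if (storedCrc == 0L) {
            return Result.END_OF_LOG;   // zero CRC: assumed to be unwritten padding
        }
        if (length < 0 || length > log.remaining()) {
            return Result.CORRUPT;      // length points past the end of the file
        }
        byte[] payload = new byte[length];
        log.get(payload);
        // A non-zero stored CRC that does not match is reported as corruption.
        return checksum(payload) == storedCrc ? Result.RECORD : Result.CORRUPT;
    }

    public static void main(String[] args) {
        byte[] record = encode("txn-1".getBytes(StandardCharsets.UTF_8));
        // One valid record followed by 16 bytes of zero padding.
        byte[] padded = Arrays.copyOf(record, record.length + 16);
        ByteBuffer log = ByteBuffer.wrap(padded);
        System.out.println(readNext(log)); // RECORD
        System.out.println(readNext(log)); // END_OF_LOG

        padded[HEADER_BYTES] ^= 0x01;      // flip one bit in the payload
        System.out.println(readNext(ByteBuffer.wrap(padded))); // CORRUPT
    }
}
```

With this rule the reader can report three outcomes instead of two, which is the distinction the issue description says next() currently does not make. How a torn write at the tail of the log should be classified is left open here, as it is in the thread.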