[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411591#comment-13411591
 ] 

Bill Bridge commented on ZOOKEEPER-1453:
----------------------------------------

The CRC checks certainly should catch a partially written record at the end of 
the log. However, since a partial write is an expected event after a crash, we 
do not want to discard the log as corrupt. We want to ignore the partial record 
at the end of the log and any records that might follow. But if a CRC failure 
is always treated as EOF, then corruption that is not from a partial write 
during a crash will not be treated as corruption, and the CRC will no longer 
fulfill its current role as an assurance of no corruption.

You can solve this problem by putting a block header on every sector of a log 
file. The block header includes a check value. Every log write is an integral 
number of blocks. The log is preformatted to contain valid blocks with an 
earlier log sequence number. Encountering a valid block with a lower sequence 
number means EOF. Encountering a block with an invalid check value means 
corruption. This relies on the assumption that a disk write always completes 
an integral number of sector writes. That assumption generally holds, but in 
very rare circumstances writing a block can leave it in a state where reading 
it back returns an I/O error. This is infrequent enough that the multiple logs 
ZooKeeper uses should be sufficient protection.
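
A minimal sketch of the block scheme described above, in Python for brevity. 
Everything in it is an assumption made for illustration, not ZooKeeper code: 
the 512-byte block size, the header layout (8-byte sequence number plus CRC32 
check value), and the names make_block, preformat, classify_block and 
scan_log. Treating a valid block with a higher sequence number as corruption 
is also an assumption; the proposal does not cover that case.

```python
import struct
import zlib
from enum import Enum

BLOCK_SIZE = 512                      # assumed sector size
HEADER = struct.Struct("<QI")         # hypothetical header: sequence number, CRC32
PAYLOAD_SIZE = BLOCK_SIZE - HEADER.size


class BlockState(Enum):
    VALID = "valid"
    EOF = "eof"
    CORRUPT = "corrupt"


def _check_value(seq: int, body: bytes) -> int:
    # The check value covers the sequence number and the block body.
    return zlib.crc32(struct.pack("<Q", seq) + body)


def make_block(seq: int, payload: bytes = b"") -> bytes:
    """Build one full block; the payload is zero-padded to the block size."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one block")
    body = payload.ljust(PAYLOAD_SIZE, b"\0")
    return HEADER.pack(seq, _check_value(seq, body)) + body


def preformat(num_blocks: int, old_seq: int) -> bytes:
    """Fill a new log with valid blocks carrying an earlier sequence number."""
    return make_block(old_seq) * num_blocks


def classify_block(block: bytes, current_seq: int) -> BlockState:
    if len(block) != BLOCK_SIZE:
        return BlockState.CORRUPT     # writes are always whole blocks
    seq, check = HEADER.unpack_from(block)
    if _check_value(seq, block[HEADER.size:]) != check:
        return BlockState.CORRUPT     # invalid check value
    if seq < current_seq:
        return BlockState.EOF         # still-preformatted block: end of log
    if seq == current_seq:
        return BlockState.VALID
    return BlockState.CORRUPT         # assumption: a newer sequence is unexpected


def scan_log(data: bytes, current_seq: int):
    """Return (number of valid blocks read, reason the scan stopped)."""
    count = 0
    for off in range(0, len(data), BLOCK_SIZE):
        state = classify_block(data[off:off + BLOCK_SIZE], current_seq)
        if state is not BlockState.VALID:
            return count, state
        count += 1
    return count, BlockState.EOF
```

With this layout, a scan that stops on a preformatted block reports EOF, 
while a block whose check value does not match reports corruption, so the two 
cases are no longer confused.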

This is not a quick fix. I think there should be a separate JIRA to propose 
it. In the meantime it may be sufficient to treat a CRC failure as EOF when 
the stored CRC value is zero. This is based on the fact that you clear ahead 
with zeros to reduce the number of allocations. I am hoping the CRC is the 
very last value in a record, and that on disk it is aligned to its size, so 
that a partial record ends in zero. A partial record that ends at the file EOF 
should also be ignored and not considered corruption.
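
A rough sketch of that interim rule, again in Python. The record layout here 
(4-byte length, payload, then a 4-byte CRC32 as the last field) is the layout 
hoped for above, not necessarily the one ZooKeeper actually writes, and 
encode_record and read_records are hypothetical names. Treating a zero length 
field as the start of the zero-filled region is an additional assumption.

```python
import struct
import zlib

LEN = struct.Struct("<I")   # hypothetical 4-byte length prefix
CRC = struct.Struct("<I")   # hypothetical 4-byte CRC32 trailer (last field)


def encode_record(payload: bytes) -> bytes:
    """Encode a non-empty payload as length + payload + CRC."""
    if not payload:
        raise ValueError("empty payloads are reserved for zero-filled space")
    return LEN.pack(len(payload)) + payload + CRC.pack(zlib.crc32(payload))


def read_records(data: bytes):
    """Return (payloads, status), where status is 'eof' or 'corrupt'."""
    records = []
    pos = 0
    while True:
        if len(data) - pos < LEN.size:
            return records, "eof"            # file ends inside the length field
        (length,) = LEN.unpack_from(data, pos)
        if length == 0:
            return records, "eof"            # assumed: zero-filled, cleared-ahead space
        end = pos + LEN.size + length + CRC.size
        if end > len(data):
            return records, "eof"            # partial record ending at file EOF
        payload = data[pos + LEN.size:end - CRC.size]
        (stored,) = CRC.unpack_from(data, end - CRC.size)
        if stored != zlib.crc32(payload):
            # A zero CRC suggests the trailer never reached disk: partial write.
            return records, ("eof" if stored == 0 else "corrupt")
        records.append(payload)
        pos = end
```

Note the weakness this leaves: a corrupted length field that points past the 
end of the file is indistinguishable from a partial record, which is one 
reason the block-header approach would be the more complete fix.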
                
> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1453
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Patrick Hunt
>            Priority: Critical
>         Attachments: 10.10.5.123-withPath1489.tar.gz, 10.10.5.123.tar.gz, 
> 10.10.5.42-withPath1489.tar.gz, 10.10.5.42.tar.gz, 
> 10.10.5.44-withPath1489.tar.gz, 10.10.5.44.tar.gz
>
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that 
> during server recovery 
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next() 
> does not indicate if the available logs are valid or not. In some cases (say 
> a truncated record and a single txnlog in the datadir) we will not detect 
> that the file is corrupt, vs reaching the end of the file.


        

Reply via email to