[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408948#comment-13408948
 ] 

Bill Bridge commented on ZOOKEEPER-1453:
----------------------------------------

This is not just about the disk write cache. It is necessary to disable the 
disk write cache to ensure a commit is truly persistent, but that is not the 
only source of this problem. 

The problem is that the OS maintains a queue of disk writes. At any moment, 
some of them have been submitted to the disk and some are still in the queue, 
and they are not necessarily issued in file block order. When there is a 
reboot, the write for the last data block may still be in the queue while the 
write to the block before it has gone to the disk. 

If the reboot does a hardware reset or is caused by power failure, a similar 
thing can happen to the writes that have already been given to the disk. Disks 
like to have about 4 to 8 requests queued so that they can reorder them to 
reduce seek and rotational latency. If there is a reset from the disk 
controller or a loss of power, the disk cannot complete all the requests it 
was given, so some may complete and some may not. Since the disk reorders 
based on the location of the sectors, the current position of the disk arm, 
and the current rotational position of the platter, it is not possible to 
predict the order in which writes will complete.
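
To make the consequence concrete, here is a minimal sketch of a log scanner 
that tells a clean end of log apart from a hole left by a lost write. It is 
not ZooKeeper's FileTxnLog code or on-disk format: the class TornLogScan, the 
record layout (4-byte length, 8-byte CRC32 value, payload), the zero-filled 
256-byte "preallocated" file, and the simplification that each record 
corresponds to exactly one disk write are all assumptions made for 
illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical sketch; not ZooKeeper's FileTxnLog format or API.
public class TornLogScan {
    enum End { CLEAN, CORRUPT }

    static final class Result {
        final int validRecords;
        final End end;
        Result(int validRecords, End end) {
            this.validRecords = validRecords;
            this.end = end;
        }
        @Override public String toString() {
            return validRecords + " valid record(s), end=" + end;
        }
    }

    static final int HEADER = 12;      // assumed layout: 4-byte length + 8-byte CRC32 value
    static final int FILE_SIZE = 256;  // zero-filled, as if the file were preallocated
    static final String[] PAYLOADS = {"create /a", "create /b", "create /c"};
    static final int RECORD_SIZE = HEADER + PAYLOADS[0].length();

    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Builds a zero-padded log image holding three equally sized records.
    static byte[] sampleLog() {
        ByteBuffer log = ByteBuffer.allocate(FILE_SIZE);
        for (String p : PAYLOADS) {
            byte[] payload = p.getBytes(StandardCharsets.UTF_8);
            log.putInt(payload.length).putLong(checksum(payload)).put(payload);
        }
        return log.array();
    }

    // Simulates a write that never reached the platter: the range stays zero.
    static byte[] loseWrite(byte[] file, int from, int to) {
        byte[] copy = file.clone();
        Arrays.fill(copy, from, to, (byte) 0);
        return copy;
    }

    static boolean allZeroFrom(byte[] file, int offset) {
        for (int i = offset; i < file.length; i++) {
            if (file[i] != 0) return false;
        }
        return true;
    }

    static Result scan(byte[] file) {
        ByteBuffer in = ByteBuffer.wrap(file);
        int count = 0;
        while (in.remaining() >= HEADER) {
            int start = in.position();
            int len = in.getInt();
            long expected = in.getLong();
            if (len == 0) {
                // A zero length looks like padding, but it is a clean end only
                // if nothing written later survives after this point.
                return new Result(count, allZeroFrom(file, start) ? End.CLEAN : End.CORRUPT);
            }
            if (len < 0 || len > in.remaining()) return new Result(count, End.CORRUPT);
            byte[] payload = new byte[len];
            in.get(payload);
            if (checksum(payload) != expected) return new Result(count, End.CORRUPT);
            count++;
        }
        return new Result(count, allZeroFrom(file, in.position()) ? End.CLEAN : End.CORRUPT);
    }

    public static void main(String[] args) {
        byte[] intact = sampleLog();
        // Last write still queued at reboot: the tail is simply missing.
        byte[] lostTail = loseWrite(intact, 2 * RECORD_SIZE, 3 * RECORD_SIZE);
        // Reordered writes: record 3 reached the disk, record 2 did not.
        byte[] hole = loseWrite(intact, RECORD_SIZE, 2 * RECORD_SIZE);
        System.out.println("intact:    " + scan(intact));
        System.out.println("lost tail: " + scan(lostTail));
        System.out.println("hole:      " + scan(hole));
    }
}
```

In this sketch the lost-tail case scans as a clean end with two records, 
because nothing on disk shows that a third was ever written. The hole case 
is reported as corrupt only because the scanner checks for surviving data 
past the first apparent end of log. A scanner that stops at the first 
zero-length header would report one valid record and treat the file as 
complete.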
                
> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1453
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Patrick Hunt
>            Priority: Critical
>         Attachments: 10.10.5.123-withPath1489.tar.gz, 10.10.5.123.tar.gz, 
> 10.10.5.42-withPath1489.tar.gz, 10.10.5.42.tar.gz, 
> 10.10.5.44-withPath1489.tar.gz, 10.10.5.44.tar.gz
>
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that 
> during server recovery 
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next() 
> does not indicate whether the available logs are valid. In some cases (say, 
> a truncated record and a single txnlog in the datadir) we cannot tell a 
> corrupt file apart from one where we have simply reached the end of the file.

