[jira] [Commented] (ZOOKEEPER-1156) Log truncation truncating log too much - can cause data loss

Vishal Kathuria (JIRA) Thu, 18 Aug 2011 11:31:01 -0700

    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087178#comment-13087178
 ]


Vishal Kathuria commented on ZOOKEEPER-1156:
--------------------------------------------

Here is the scenario:

Let's say the current leader A is at zxid 80.
A participant B with zxid 81 joins and receives TRUNC,80 from the leader.

B then calculates the length of the log up to zxid 80. The actual length is, 
say, 450, but because of the bug, the value it calculates is 420. B truncates 
the log to size 420.

When loadDatabase is called again, the log is replayed only up to zxid 79 
because log record 80 is incomplete.

Node B no longer has the change with zxid 80, and the leader will not send 
change 80 to B either.

In my manual repro, the change with zxid 80 was a create. I could see the 
created node when I connected to A but not when connected to B.
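
The scenario above can be sketched with a toy log. This is only an 
illustration, not ZooKeeper's actual log format: the record layout (8-byte 
zxid, 4-byte payload length, payload), the class name TruncReplayDemo and its 
helpers are all hypothetical, and the sizes are chosen so that the log up to 
zxid 80 is exactly 450 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class TruncReplayDemo {
    // Hypothetical record layout: 8-byte zxid, 4-byte payload length, payload.
    static byte[] buildLog(long firstZxid, long lastZxid, int payloadLen) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            for (long zxid = firstZxid; zxid <= lastZxid; zxid++) {
                out.writeLong(zxid);
                out.writeInt(payloadLen);
                out.write(new byte[payloadLen]);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Replays the log and returns the zxid of the last complete record,
    // or -1 if no complete record was found.
    static long replay(byte[] log) {
        long last = -1;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (true) {
                long zxid = in.readLong();
                int len = in.readInt();
                in.readFully(new byte[len]); // throws EOFException on a partial record
                last = zxid;
            }
        } catch (EOFException e) {
            return last; // end of log, or an incomplete trailing record
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Records 72..81, 50 bytes each: zxid 80 ends at byte 450.
        byte[] log = buildLog(72, 81, 38);
        System.out.println("truncated to 450: last zxid = " + replay(Arrays.copyOf(log, 450)));
        System.out.println("truncated to 420: last zxid = " + replay(Arrays.copyOf(log, 420)));
    }
}
```

With the correct length of 450 the replay ends at zxid 80; with the 
underestimated length of 420 record 80 is cut short and the replay ends at 79.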


> Log truncation truncating log too much - can cause data loss
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1156
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1156
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum, server
>    Affects Versions: 3.3.3
>            Reporter: Vishal Kathuria
>            Priority: Blocker
>             Fix For: 3.3.4
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Log truncation relies on calculating the position of a particular zxid to 
> determine the new size of the log file. There is a bug in the 
> PositionInputStream implementation which skips counting the bytes in the log 
> which have value 0. This can lead to underestimating the actual log size. Log 
> records which should be kept can get truncated, leading to data loss on 
> the participant which is executing the trunc.
> Clients can see different values depending on whether they connect to the 
> node on which trunc was executed.
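
The kind of miscount described above can be illustrated with a small counting 
stream. This is a minimal sketch and not the actual PositionInputStream 
source: the class CountingStream, the buggy flag and the sample bytes are 
assumptions made for illustration. InputStream.read() returns -1 at end of 
stream and 0..255 for a byte, so a check of "rc > 0" silently skips bytes of 
value 0.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class PositionCountDemo {
    // Hypothetical counting stream; buggy=true mimics skipping zero-valued bytes.
    static class CountingStream extends FilterInputStream {
        private final boolean buggy;
        private long position = 0;

        CountingStream(InputStream in, boolean buggy) {
            super(in);
            this.buggy = buggy;
        }

        @Override
        public int read() throws IOException {
            int rc = super.read();
            // The buggy check treats a byte of value 0 as "nothing read";
            // the correct check only excludes the end-of-stream marker -1.
            if (buggy ? rc > 0 : rc >= 0) {
                position++;
            }
            return rc;
        }

        long getPosition() {
            return position;
        }
    }

    // Reads the whole array one byte at a time and returns the counted position.
    static long countBytes(byte[] data, boolean buggy) {
        try (CountingStream s = new CountingStream(new ByteArrayInputStream(data), buggy)) {
            while (s.read() != -1) {
                // drain the stream
            }
            return s.getPosition();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] record = {0, 0, 0, 80, 7, 0, 42}; // made-up bytes, four of them zero
        System.out.println("buggy count:   " + countBytes(record, true));
        System.out.println("correct count: " + countBytes(record, false));
    }
}
```

Because serialized integers and longs typically contain many zero bytes, a 
counter like the buggy one reports a position smaller than the real offset, 
which is how a truncation target can come out too short.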

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
