[ 
https://issues.apache.org/jira/browse/HBASE-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965680#comment-13965680
 ] 

Cosmin Lehene commented on HBASE-10829:
---------------------------------------

I can't find this issue in the 0.98.1 release notes. Perhaps the fix version 
should be 0.98.2?

> Flush is skipped after log replay if the last recovered edits file is skipped
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-10829
>                 URL: https://issues.apache.org/jira/browse/HBASE-10829
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0, 0.96.3
>
>         Attachments: hbase-10829_v1.patch, hbase-10829_v2.patch, 
> hbase-10829_v3.patch
>
>
> We caught this in an extended test run where IntegrationTestBigLinkedList 
> failed with some missing keys. 
> The problem is that HRegion.replayRecoveredEdits() returns -1 if all the 
> edits in the log file are skipped, which happens, for example, if the log 
> file contains only a single compaction record (HBASE-2231) or the edits 
> cannot be applied (column family deleted, etc). 
> The caller, HRegion.replayRecoveredEditsIfAny(), only looks at the last 
> returned seqId to decide whether a flush is necessary before opening the 
> region and discarding the replayed recovered edits files. 
> Therefore, if the last recovered edits file is skipped but some edits from 
> earlier recovered edits files are applied, the mandatory flush before opening 
> the region is skipped. If the region server dies after this point before a 
> flush, the edits are lost. 
> This is important to fix, though the sequence of events is super rare for a 
> production cluster. 
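
To make the failure mode in the description concrete, here is a minimal 
standalone sketch. It is not the HBase code or the attached patch: the class 
and method names are hypothetical, and each array element stands in for the 
value replayRecoveredEdits() would return for one recovered edits file (-1 
meaning every edit in that file was skipped). It contrasts a decision based 
on the last returned seqId with one based on the maximum seqId seen across 
all files, which is one plausible way to avoid the skipped flush.

```java
// Hypothetical illustration only; names and structure are assumptions,
// not taken from HRegion or from the hbase-10829 patches.
final class ReplayFlushSketch {

    // Mirrors the behavior described in the issue: only the result of the
    // last recovered edits file decides whether a flush happens.
    static boolean flushNeededLastOnly(long[] perFileSeqIds, long minSeqId) {
        long seqId = minSeqId;
        for (long result : perFileSeqIds) {
            seqId = result; // overwrites earlier results, including applied edits
        }
        return seqId > minSeqId;
    }

    // Assumed alternative: remember the highest seqId returned by any file,
    // so a skipped last file (-1) cannot hide earlier applied edits.
    static boolean flushNeededMax(long[] perFileSeqIds, long minSeqId) {
        long seqId = minSeqId;
        for (long result : perFileSeqIds) {
            seqId = Math.max(seqId, result);
        }
        return seqId > minSeqId;
    }

    public static void main(String[] args) {
        // First file applied edits up to seqId 120; last file was fully skipped.
        long[] results = {120L, -1L};
        long minSeqId = 100L;
        System.out.println("last-only: " + flushNeededLastOnly(results, minSeqId)); // false
        System.out.println("max: " + flushNeededMax(results, minSeqId));            // true
    }
}
```

With the inputs in main(), the last-only check reports that no flush is 
needed even though edits from the first file were applied, which matches the 
data-loss window described in the issue.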



--
This message was sent by Atlassian JIRA
(v6.2#6252)
