[ 
https://issues.apache.org/jira/browse/HBASE-20724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775767#comment-16775767
 ] 

Guanghao Zhang commented on HBASE-20724:
----------------------------------------

{quote}It seems we are tackling the prematurely archived HLog by writing the 
previous files into the newly created HFile
{quote}
Yes, there is a compaction event marker that lists the compacted files and is 
written to the HLog. The problem is that the HLog may be archived after a region 
flush. An archived HLog is not replayed when the RS fails over, so the compacted 
files cannot be removed. You can check the unit test to see how this happens. 
The UT rolls the log first, then flushes the region so that the old log, which 
contains the compaction event marker, is archived. The compacted files are then 
left behind after RS failover. 
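
The sequence in the unit test can be sketched with a toy model. This is only an 
illustration of the ordering problem, not HBase code: the class ToyRegionServer 
and all of its methods and fields are hypothetical names, and the archiving rule 
is simplified to "a flush makes every rolled log archivable".

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegionServer:
    """Toy model only; none of these names are HBase APIs."""
    store_files: set = field(default_factory=set)
    active_wal: list = field(default_factory=list)   # entries in the current log
    rolled_wals: list = field(default_factory=list)  # closed logs, not yet archived

    def put(self):
        self.active_wal.append(("edit",))

    def flush(self, new_file):
        self.store_files.add(new_file)
        # Simplified archiving rule: only flushed edits are tracked, so once
        # the region is flushed every rolled log is archived, even if it
        # holds a compaction marker whose input files are still on disk.
        self.rolled_wals.clear()

    def compact(self, inputs, output):
        # The marker records which files were compacted away. The inputs
        # stay in store_files until a later cleanup (not modeled here).
        self.active_wal.append(("compaction", frozenset(inputs), output))
        self.store_files.add(output)

    def roll_log(self):
        self.rolled_wals.append(self.active_wal)
        self.active_wal = []

    def files_opened_after_failover(self):
        # Only logs that were not archived are replayed.
        compacted = set()
        for wal in self.rolled_wals + [self.active_wal]:
            for entry in wal:
                if entry[0] == "compaction":
                    compacted |= entry[1]
        return self.store_files - compacted


def run_scenario(archive_old_log):
    rs = ToyRegionServer()
    rs.put(); rs.flush("f1")
    rs.put(); rs.flush("f2")
    rs.compact({"f1", "f2"}, "f3")
    if archive_old_log:
        rs.roll_log()    # the marker now lives in the old log
        rs.put()
        rs.flush("f4")   # the old log is archived together with the marker
    return rs.files_opened_after_failover()
```

In this model run_scenario(False) opens only f3, because the marker is still 
replayed. run_scenario(True) opens f1, f2, f3 and f4, which corresponds to the 
compacted files being left behind after failover.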

> Sometimes some compacted storefiles are still opened after region failover
> --------------------------------------------------------------------------
>
>                 Key: HBASE-20724
>                 URL: https://issues.apache.org/jira/browse/HBASE-20724
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>            Reporter: Francis Liu
>            Assignee: Guanghao Zhang
>            Priority: Critical
>         Attachments: HBASE-20724.master.001.patch, 
> HBASE-20724.master.002.patch, HBASE-20724.master.003.patch, 
> HBASE-20724.master.004.patch, HBASE-20724.master.005.patch, 
> HBASE-20724.master.006.patch, HBASE-20724.master.007.patch
>
>
> It is important that the compacted storefiles of a given compaction execution 
> are either all opened or all archived to ensure data consistency. For example, 
> a storefile containing delete tombstones can be archived while older storefiles 
> containing cells that were supposed to be deleted are left unarchived, thereby 
> undeleting those cells.
> When a server fails, compaction markers (in the WAL edit) are used to 
> determine which storefiles are compacted and should be excluded during region 
> open (during failover). But the WALs containing compaction markers can be 
> prematurely archived even though there are still compacted storefiles for 
> that particular compaction event that haven't been archived yet, thus losing 
> compaction information that needs to be replayed in the event of an RS crash. 
> This is because the hlog archiving logic only keeps track of flushed storefiles 
> and not compacted ones.
> https://issues.apache.org/jira/browse/HBASE-20704?focusedCommentId=16507680&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16507680



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
