[ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257519#comment-15257519 ]

Duo Zhang commented on HBASE-15454:
-----------------------------------

[~davelatham] [~clarax98007] Any concerns about the latest patch? Thanks.

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.20
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch,
> HBASE-15454-v3.patch, HBASE-15454-v4.patch, HBASE-15454-v5.patch,
> HBASE-15454-v6.patch, HBASE-15454.patch
>
>
> In date tiered compaction, the store files older than max age are never
> touched by minor compactions. Here we introduce a 'freeze window' operation,
> which does the following:
> 1. Find all store files that contain cells whose timestamps are in the given
> window.
> 2. Compact all these files and output one file for each window that these
> files cover.
> After the compaction, we will have only one file in the given window, and all
> cells whose timestamps are in the given window are in that file. If you do
> not write new cells with an older timestamp in this window, the file will
> never be changed. This makes it easier to do erasure coding on the frozen
> file to reduce redundancy. It also makes it possible to check consistency
> between master and peer cluster incrementally.
> And why use the word 'freeze'?
> Because there is already an 'HFileArchiver' class. I want to use a different
> word to avoid confusion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
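
The two steps of the 'freeze window' operation described in the issue can be illustrated with a small sketch. This is not the code from the attached patches and does not use any HBase API. StoreFile, select_files and freeze_window are hypothetical names, cells are reduced to (timestamp, value) pairs, and windows are assumed to be fixed-size and aligned at multiples of window_size, which is simpler than the tiered windows a real date tiered policy would use.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StoreFile:
    """Hypothetical stand-in for a store file: a name plus (timestamp, value) cells."""
    name: str
    cells: Tuple[Tuple[int, str], ...]


def window_start(ts: int, window_size: int) -> int:
    """Start of the fixed-size window containing ts (assumed window layout)."""
    return ts - ts % window_size


def select_files(files: List[StoreFile], start: int, window_size: int) -> List[StoreFile]:
    """Step 1: files with at least one cell in [start, start + window_size)."""
    end = start + window_size
    return [f for f in files if any(start <= ts < end for ts, _ in f.cells)]


def freeze_window(
    files: List[StoreFile], start: int, window_size: int
) -> Tuple[List[StoreFile], Dict[int, StoreFile]]:
    """Step 2: rewrite the selected files as one output file per covered window.

    Returns the files left untouched and the new files keyed by window start.
    """
    selected = select_files(files, start, window_size)
    untouched = [f for f in files if f not in selected]

    # Route every cell of the selected files to the window it belongs to.
    by_window: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for f in selected:
        for ts, value in f.cells:
            by_window[window_start(ts, window_size)].append((ts, value))

    outputs = {
        w: StoreFile(f"frozen-{w}", tuple(sorted(cells)))
        for w, cells in sorted(by_window.items())
    }
    return untouched, outputs


if __name__ == "__main__":
    files = [
        StoreFile("a", ((5, "v1"), (12, "v2"))),  # spans windows 0 and 10
        StoreFile("b", ((15, "v3"),)),            # window 10 only
        StoreFile("c", ((25, "v4"),)),            # window 20 only
    ]
    untouched, outputs = freeze_window(files, start=10, window_size=10)
    print([f.name for f in untouched], sorted(outputs))
```

In this sketch, file "a" also holds a cell outside the frozen window, so the compaction emits a second output file for that neighbouring window. This is how "one file for each window that these files cover" is read here.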