[ https://issues.apache.org/jira/browse/HBASE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702722#comment-15702722 ]
huaxiang sun commented on HBASE-17172:
--------------------------------------

We have use cases where users want the TTL to be multiple years or so, and there may be lots of deleted cells depending on the use case. I think we want to give users an option to free up the space taken by these deleted cells.

> Optimize major mob compaction with _del files
> ---------------------------------------------
>
>                 Key: HBASE-17172
>                 URL: https://issues.apache.org/jira/browse/HBASE-17172
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>
> Today, when there is a _del file in mobdir, a major mob compaction
> recompacts every mob file. This causes lots of IO and slows down major mob
> compaction (it may take months to finish). This needs to be improved. A few
> ideas are:
> 1) Do not compact all _del files into one; instead, compact them in groups
> keyed by startKey. Then use the firstKey/startKey of each mob file to check
> whether the _del file needs to be included for this partition.
> 2) Based on the time range of the _del file, compaction of files newer than
> that time range does not need to include the _del file.
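
The two ideas in the issue description could be sketched roughly as below. This is only an illustration under stated assumptions, not the HBase implementation: `DelFileSelection`, `FileInfo`, and all method names are hypothetical, row keys are simplified to `String` (HBase compares `byte[]` keys), and each file is assumed to expose its inclusive key range and cell timestamp range.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are HBase classes. */
final class DelFileSelection {

    /** Assumed per-file summary: inclusive row-key range and inclusive timestamp range. */
    static final class FileInfo {
        final String name;
        final String startKey;
        final String endKey;
        final long minTimestamp;
        final long maxTimestamp;

        FileInfo(String name, String startKey, String endKey,
                 long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Idea 1: a _del file is relevant only if its key range overlaps the mob file's. */
    static boolean keyRangesOverlap(FileInfo del, FileInfo mob) {
        return del.startKey.compareTo(mob.endKey) <= 0
            && mob.startKey.compareTo(del.endKey) <= 0;
    }

    /**
     * Idea 2: assuming a delete marker can only mask cells whose timestamp is
     * not newer than the marker, a _del file whose newest marker is older than
     * the mob file's oldest cell cannot affect that mob file.
     */
    static boolean mayMaskByTime(FileInfo del, FileInfo mob) {
        return del.maxTimestamp >= mob.minTimestamp;
    }

    /** Returns only the _del files that pass both checks for the given mob file. */
    static List<FileInfo> selectDelFiles(FileInfo mob, List<FileInfo> delFiles) {
        List<FileInfo> needed = new ArrayList<>();
        for (FileInfo del : delFiles) {
            if (keyRangesOverlap(del, mob) && mayMaskByTime(del, mob)) {
                needed.add(del);
            }
        }
        return needed;
    }
}
```

With a filter like this, a mob file for which `selectDelFiles` returns an empty list would not need to be rewritten because of delete markers, which is where the IO saving described in the issue would come from.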