[ 
https://issues.apache.org/jira/browse/HBASE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050250#comment-13050250
 ] 

zhoushuaifeng commented on HBASE-3969:
--------------------------------------

Hi St, the 3rd solution may not be so good. Among the regions in the queue, 
those whose file count is close to or above blockingStoreFiles should have 
the highest priority, because otherwise flushes will be blocked and puts will 
be affected. Next in importance are regions that need a major compaction to 
clean outdated data, for the reason already mentioned in this issue. Regions 
with only a few files (for example, just reaching compactionThreshold) that 
only need a minor compaction should have the lowest priority; it does not 
matter how long these regions wait in the queue. 
So I think setting the major compaction priority to a suitable value (between 
1 and blockingStoreFiles - compactionThreshold) may be a better choice. What do 
you think?
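
To make the proposed ordering concrete, here is a minimal sketch. It is not 
taken from the attached patches or from the HBase code. It assumes that a 
smaller priority value is served first and that a minor compaction request 
gets priority blockingStoreFiles - storeFiles; the class and method names are 
hypothetical, and the midpoint is just one possible value inside the 
suggested range.

```java
// Hypothetical sketch of the proposed ordering; names are illustrative only
// and do not correspond to HBase's actual classes or methods.
final class CompactionPrioritySketch {

    // Assumption: smaller value = served earlier. A store at or above
    // blockingStoreFiles gets a value <= 0 and therefore goes first.
    static int minorPriority(int storeFiles, int blockingStoreFiles) {
        return blockingStoreFiles - storeFiles;
    }

    // One possible "suitable value": the midpoint of the range
    // [1, blockingStoreFiles - compactionThreshold], never below 1.
    static int majorPriority(int blockingStoreFiles, int compactionThreshold) {
        int upper = blockingStoreFiles - compactionThreshold;
        return Math.max(1, (1 + upper) / 2);
    }

    // A region needing a major compaction is never ranked worse than the
    // fixed major priority, but it keeps a more urgent file-count priority.
    static int priority(boolean needsMajor, int storeFiles,
                        int blockingStoreFiles, int compactionThreshold) {
        int byFileCount = minorPriority(storeFiles, blockingStoreFiles);
        if (!needsMajor) {
            return byFileCount;
        }
        return Math.min(byFileCount,
                majorPriority(blockingStoreFiles, compactionThreshold));
    }
}
```

With, for example, blockingStoreFiles = 7 and compactionThreshold = 3, a 
region with 8 files gets -1, a region with 6 files gets 1, a region with 3 
files that needs a major compaction gets 2, and a region with 3 files that 
only needs a minor compaction gets 4. That matches the order described above.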

> Outdated data can not be cleaned in time
> ----------------------------------------
>
>                 Key: HBASE-3969
>                 URL: https://issues.apache.org/jira/browse/HBASE-3969
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.90.1, 0.90.2, 0.90.3
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.4
>
>         Attachments: HBASE-3969-solution1-for-branch.patch, 
> HBASE-3969-solution1.patch
>
>
> The compaction checker sends regions to the compaction queue to be compacted. 
> But the priority of these regions is too low if they have only a few 
> storefiles. When throughput is high, the compaction queue will always have 
> some regions with higher priority. This may cause the major compaction to be 
> delayed for a long time (even a few days), and the cleaning of outdated data 
> will also be delayed.
> In our test case, we found some regions sent to the queue by the major 
> compaction checker hanging in the queue for more than 2 days! Some scanners 
> on these regions could not get available data for a long time, and their 
> leases expired.


        

Reply via email to