[ 
https://issues.apache.org/jira/browse/HBASE-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113847#comment-13113847
 ] 

Karthik Ranganathan commented on HBASE-4463:
--------------------------------------------

@Stack - we can find the exact amount of data we are writing to the DFS (only 
HFile blocks contribute to this during compactions), so adding a threshold 
like this is not too hard. But the pressure could be on disk IOPS rather than 
network bandwidth, and detecting that would be hard, so we would still need 
to configure the off-peak hours.
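
As a rough sketch of what a configured off-peak window could look like: the 
two ratio values are the defaults named in the issue description, while the 
class name, the method names and the startHour/endHour parameters are 
hypothetical and only illustrate the idea. They are not the actual 
regionserver code.

```java
// Hypothetical helper: picks the compaction selection ratio by hour of day.
class OffPeakRatio {
    // Defaults quoted in the issue description.
    static final double PEAK_RATIO = 1.2;     // hbase.hstore.compaction.ratio
    static final double OFF_PEAK_RATIO = 5.0; // hbase.hstore.compaction.ratio.offpeak

    // Returns true if 'hour' (0-23) lies in [startHour, endHour).
    // The window may wrap around midnight, e.g. 22 -> 6.
    // startHour and endHour are assumed settings, not existing config keys.
    static boolean isOffPeak(int hour, int startHour, int endHour) {
        if (startHour == endHour) {
            return false; // empty window: never off-peak
        }
        if (startHour < endHour) {
            return hour >= startHour && hour < endHour;
        }
        return hour >= startHour || hour < endHour; // window wraps midnight
    }

    // Ratio to use for compaction selection at the given hour.
    static double compactionRatio(int hour, int startHour, int endHour) {
        return isOffPeak(hour, startHour, endHour) ? OFF_PEAK_RATIO : PEAK_RATIO;
    }
}
```

With a window of 22 to 6, compactionRatio(2, 22, 6) selects the off-peak 
ratio and compactionRatio(12, 22, 6) selects the normal one.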

I was trying to come up with a more generic solution, but that involves 
setting up a feedback loop inside the regionserver: keep track of the max, min 
and average latencies over the last k days (this would have to be stored in 
META or some other location, as it needs to persist across restarts), with any 
spikes removed from the values. When we run an aggressive compaction, we need 
to make sure the latencies are still acceptable; otherwise, don't run 
aggressive compactions. This is much harder to get right, though.
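
To make the feedback loop idea a bit more concrete, here is a minimal 
in-memory sketch. Everything in it is an assumption for illustration: the 
class and method names are hypothetical, it tracks only the daily average 
latency (not max and min), it uses the median over the window as one simple 
way to keep spikes out of the baseline, and it leaves out the persistence to 
META that a real version would need.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a latency feedback gate for aggressive compactions.
class LatencyFeedback {
    private final int windowDays;   // the "k" in "last k days"
    private final double tolerance; // e.g. 1.5 allows up to 50% above baseline
    private final Deque<Double> dailyAvgMs = new ArrayDeque<>();

    LatencyFeedback(int windowDays, double tolerance) {
        this.windowDays = windowDays;
        this.tolerance = tolerance;
    }

    // Record one day's average latency, keeping only the last windowDays values.
    void recordDay(double avgLatencyMs) {
        dailyAvgMs.addLast(avgLatencyMs);
        if (dailyAvgMs.size() > windowDays) {
            dailyAvgMs.removeFirst();
        }
    }

    // Median of the recorded daily averages. A single spike day does not
    // move the median much, which stands in for "removing spikes" here.
    double baselineMs() {
        if (dailyAvgMs.isEmpty()) {
            throw new IllegalStateException("no latency history recorded");
        }
        double[] sorted = dailyAvgMs.stream()
                .mapToDouble(Double::doubleValue)
                .sorted()
                .toArray();
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Allow aggressive compactions only while the current latency stays
    // within tolerance of the baseline.
    boolean allowAggressive(double currentLatencyMs) {
        return currentLatencyMs <= baselineMs() * tolerance;
    }
}
```

The regionserver would call allowAggressive() with the latency observed 
while an aggressive compaction is running and fall back to the normal ratio 
when it returns false.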

> Run more aggressive compactions during off peak hours
> -----------------------------------------------------
>
>                 Key: HBASE-4463
>                 URL: https://issues.apache.org/jira/browse/HBASE-4463
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
>
> The number of IOPS on the disk and the top-of-rack bandwidth utilization 
> are much lower at off-peak hours than at peak hours, depending on the 
> application usage pattern. We can use this knowledge to improve the 
> performance of the HBase cluster by raising the compaction selection ratio 
> to a much larger value during off-peak hours: from 
> hbase.hstore.compaction.ratio (1.2 default) to 
> hbase.hstore.compaction.ratio.offpeak (5 default). This will help reduce the 
> average number of files per store.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
