[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484275#comment-14484275
 ] 

Jeremy Carroll commented on HBASE-13103:
----------------------------------------

A few comments from my side. For context, in our architecture we run 
master / slave HBase clusters with replication set up between them.

- Since this operation has a significant performance impact, we would most 
likely run it on the slave cluster first, switch the roles between master / 
slave, and then run the command on the master. Given the time difference 
between when the commands were run, this could end up with different region 
boundaries between the clusters, which is not desired. So I second the idea of 
generating a "reshaping plan" so it can be applied in the same manner on the 
slave cluster.
- We should probably think about running a major compaction before the 
normalize policy runs. We have a lot of tombstones on some of our clusters, 
which can inflate the region size by 60%. Splitting / merging in that 
condition is not ideal, since the inflation is temporary. Region sizes 
measured after a compaction are closer to the steady state and therefore more 
realistic.
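
To make the "reshaping plan" idea a bit more concrete, here is a minimal 
sketch in plain Java. It is not taken from the attached patch and does not 
use any HBase API: the class name ReshapePlanSketch, the plan() method, and 
the sampled keys and sizes are all hypothetical. It assumes the sizes were 
measured after a major compaction, so tombstones do not skew them. The point 
is only that the plan is a plain list of split keys computed once, which can 
then be applied unchanged on the second cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: compute split keys once, reuse the same list on every cluster. */
final class ReshapePlanSketch {

    /**
     * Picks up to numRegions - 1 split keys so that each resulting region
     * holds roughly the same number of bytes.
     *
     * @param keys  sorted sample row keys (assumed input, not an HBase structure)
     * @param sizes sizes[i] is the bytes attributed to keys.get(i),
     *              assumed to be measured after a major compaction
     */
    static List<String> plan(List<String> keys, long[] sizes, int numRegions) {
        if (numRegions < 1 || keys.size() != sizes.length) {
            throw new IllegalArgumentException("bad input");
        }
        long total = 0;
        for (long s : sizes) {
            total += s;
        }
        List<String> splits = new ArrayList<>();
        if (total == 0) {
            return splits; // nothing to balance
        }
        long running = 0; // bytes in rows sorted before keys.get(i)
        int next = 1;     // index of the next boundary to place
        for (int i = 0; i < keys.size() && next < numRegions; i++) {
            // A split key starts a new region, so test before adding bucket i.
            if (i > 0 && running * numRegions >= (long) next * total) {
                splits.add(keys.get(i));
                next++;
            }
            running += sizes[i];
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        long[] sizes = {10, 10, 10, 10, 10, 10, 10, 10};
        // Computed once; the same list would be replayed on the other cluster.
        System.out.println(plan(keys, sizes, 4)); // prints [c, e, g]
    }
}
```

Because the plan is just data, it could be saved when the first cluster is 
reshaped and replayed after the role switch, so both clusters end up with the 
same boundaries regardless of how much time passes between the two runs.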

I think it's a great feature, although most of our clusters are balanced for 
QPS distribution, since CPU is one of our primary capacity planning metrics. 
Any tool which makes it easier to recover from pre-splitting mistakes is 
welcome.


> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13103-v0.patch
>
>
> Often enough, folks misjudge split points or otherwise end up with a 
> suboptimal number of regions. We should have an automated, reliable way to 
> "reshape" or "balance" a table's region boundaries. This would be for tables 
> that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing 
> Balancer that runs AssignmentManager on an interval, to run the above 
> "reshape" operation on an interval. That way, the cluster will automatically 
> self-correct toward a desirable state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
