[ 
https://issues.apache.org/jira/browse/HBASE-18215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054334#comment-16054334
 ] 

Francis Liu commented on HBASE-18215:
-------------------------------------

#1 You don't really need to put tables on the master anymore; just create 
another regionserver group to put the tables on. This makes meta much more 
available and allows you to restart the master when needed without causing 
impact. Adding the master as part of an rsgroup may lead to operational 
surprises. I'd recommend letting the master handle only master responsibilities.

#2 From a quick look at the patch, it is indeed a local file. That will make 
the rsgroup code simpler, but you are pushing complexity onto the user, who now 
has to manage the file: persistence in case of failure, dealing with concurrent 
updates, etc. The APIs aren't that complex, and they are much more 
user-friendly and possibly more flexible.

#3 Yes, we should. Getting the rsgroup patch in took a herculean effort, hence I 
focused only on the essentials. As Stack mentioned, we need a way to do this 
that doesn't cause unacceptable leaking of rsgroup into core code.

#4 Good catch. Would you like to submit a separate patch for this?

#5 Can you provide a specific scenario? When the rsgroup patch was written, if 
I remember correctly, the reverse was true: AM cannot handle null results when 
calling randomAssignment().

#6 Sounds reasonable.   
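
For reference, the overwrite reported in point 4 of the issue can be 
illustrated with a minimal sketch. It is written in Python for brevity and is 
not taken from the HBase code: the server and region names are made up, 
dict.update stands in for Map.putAll, and merge_append is only one possible 
way to keep both lists, not what any patch does.

```python
from collections import defaultdict

# Hypothetical stand-in for BOGUS_SERVER_NAME; the real value is not shown here.
BOGUS = "bogus-server"


def merge_overwrite(per_group):
    """Merge per-group assignment maps the way Map.putAll would.

    If two groups use the same key, the later list replaces the earlier one.
    """
    merged = {}
    for assignments in per_group:
        merged.update(assignments)
    return merged


def merge_append(per_group):
    """Merge per-group assignment maps, appending to an existing key instead."""
    merged = defaultdict(list)
    for assignments in per_group:
        for server, regions in assignments.items():
            merged[server].extend(regions)
    return dict(merged)


# Two groups that each have one region with no live server to go to.
group_a = {"rs1": ["region-1"], BOGUS: ["region-2"]}
group_b = {"rs2": ["region-3"], BOGUS: ["region-4"]}

lost = merge_overwrite([group_a, group_b])  # region-2 is dropped
kept = merge_append([group_a, group_b])     # both bogus regions survive
```

With the overwrite-style merge, only the second group's unassignable region 
remains under the bogus key; the append-style merge keeps both.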



> some advises about refactoring of rsgroup
> -----------------------------------------
>
>                 Key: HBASE-18215
>                 URL: https://issues.apache.org/jira/browse/HBASE-18215
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: chenxu
>         Attachments: HBASE-18215-1.2.4-v1.patch
>
>
> We recently integrated rsgroup into our cluster and, after doing so, found 
> some points that could be refactored. These points may not all be right, but 
> I think they are worth sharing.
> # When hbase.balancer.tablesOnMaster is configured, RSGroupBasedLoadBalancer 
> should consider masterServer assignment first in balanceCluster, 
> roundRobinAssignment, retainAssignment and randomAssignment, i.e. do the same 
> thing as BaseLoadBalancer.
> # Why not use a local file as the persistence layer instead of the rsgroup 
> table? In our implementation, we first modify the local rsgroup file, then 
> load the group info into memory, and after that execute the balancer command; 
> everything works.
> When loading, we do some sanity checks:
> (1) one server cannot be owned by multiple groups
> (2) one table cannot be owned by multiple groups
> (3) if a group has tables, it must also have servers
> (4) the default group must have servers in it
> If a sanity check fails, the rest of the process is abandoned. Working this 
> way can greatly reduce the complexity of the rsgroup implementation: there is 
> no need to wait for the rsgroup table to be online, and methods like 
> moveServers, moveTables, addRSGroup, removeRSGroup and moveServersAndTables 
> can be removed from RSGroupAdminService. Only a refresh method is needed 
> (modify the persistence layer first, then refresh the memory).
> # We should add some group information to the master web UI. To do this, 
> RSGroupBasedLoadBalancer should move to the hbase-server module, because 
> MasterStatusTmpl.jamon depends on it.
> # There may be an issue with RSGroupBasedLoadBalancer.roundRobinAssignment: 
> if two groups both include BOGUS_SERVER_NAME, assignments.putAll will 
> overwrite the previous data.
> # There may be an issue with RSGroupBasedLoadBalancer.randomAssignment: when 
> the return value is BOGUS_SERVER_NAME, AM cannot handle this case. We should 
> return null instead of BOGUS_SERVER_NAME.
> # When RSGroupBasedLoadBalancer.balanceCluster executes, groups are balanced 
> one by one; if there are too many groups, we can do this in parallel.
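
The four sanity checks listed in point 2 of the description could be sketched 
roughly as follows. This is Python for brevity and is not the reporter's 
patch: the dict layout, the check_groups function, the "default" group name 
and the server and table names are all assumptions made for illustration.

```python
def check_groups(groups, default_group="default"):
    """Return a list of violations; an empty list means the checks pass.

    `groups` maps a group name to {"servers": set, "tables": set}
    (a hypothetical in-memory form of the local rsgroup file).
    """
    errors = []
    server_owner = {}
    table_owner = {}
    for name, info in groups.items():
        # (1) one server cannot be owned by multiple groups
        for server in sorted(info["servers"]):
            if server in server_owner:
                errors.append(
                    f"server {server} is in both {server_owner[server]} and {name}")
            else:
                server_owner[server] = name
        # (2) one table cannot be owned by multiple groups
        for table in sorted(info["tables"]):
            if table in table_owner:
                errors.append(
                    f"table {table} is in both {table_owner[table]} and {name}")
            else:
                table_owner[table] = name
        # (3) a group that has tables must also have servers
        if info["tables"] and not info["servers"]:
            errors.append(f"group {name} has tables but no servers")
    # (4) the default group must have servers in it
    if not groups.get(default_group, {}).get("servers"):
        errors.append(f"group {default_group} has no servers")
    return errors


good = {
    "default": {"servers": {"rs1:16020"}, "tables": {"t1"}},
    "batch": {"servers": {"rs2:16020"}, "tables": {"t2"}},
}
bad = {
    "default": {"servers": set(), "tables": set()},
    "batch": {"servers": {"rs1:16020"}, "tables": {"t1"}},
    "adhoc": {"servers": {"rs1:16020"}, "tables": {"t1"}},
    "empty": {"servers": set(), "tables": {"t3"}},
}

good_errors = check_groups(good)
bad_errors = check_groups(bad)
```

A loader built this way would refuse to refresh the in-memory group info 
whenever the returned list is non-empty, which matches the "give up" behaviour 
described above.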



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
