[ https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933578#comment-13933578 ]
Jimmy Xiang commented on HBASE-10569:
-------------------------------------

Attached a patch that passed unit tests, integration tests (including ITBLL), and some live cluster tests. Will put it on RB soon.

Here is what I have done in this patch:
* Moved RPC-related code out of HRegionServer and HMaster so that they are smaller and easier to change and maintain.
* Made HMaster extend HRegionServer, so that HMaster is also an HRegionServer, and removed duplicate code/parameters.
* Because HMaster now extends HRegionServer, HMaster#getMetrics is renamed to getMasterMetrics to avoid a naming conflict with HRegionServer#getMetrics. The same has been done for HMaster#getCoprocessors and #getCoprocessorHost.
* Added HRegionServer#getRpcServices and HMaster#getMasterRpcServices to expose the RPC functionality.
* Changed the references affected by the renames and the new accessors above (a lot, especially in tests).
* HMaster and HRegionServer share one RPC server and one InfoServer.
* RpcServiceInterface is changed a little. Methods #startThreads and #openServer are removed since backup master doesn’t hold the RPC server any more. A parameter HMaster#serviceStarted is introduced to indicate whether a master is active, so that ServerNotRunningYetException can be thrown before a master becomes active.
* Master recovery in case of ZK connection loss is removed since it doesn’t recover listeners added in HRegionServer. We can get this feature back if needed. The other reason I didn’t try to get it back is that we are going to use Raft to choose the active master instead of relying on ZK.
* HRegionServer on the active HMaster communicates with the active HMaster directly instead of going through the RPC. The shortcut helps.
* The master (active/backup) web UI contains info about the corresponding region server.
* After becoming active, a backup master moves user regions away (and moves the meta/namespace regions to the master if they are already assigned somewhere else).
* Integration testing doesn’t restart the master as a region server, or restart the region server that holds the meta.
  One reason is that the startup script can’t tell if a region server should be a master.

Here is a list of things to be done (in separate issues):
* Need to make sure the master listens to the old ports (RPC + web UI) too, to support rolling upgrades from old versions (0.96+) and stay backward compatible.
* Need to consolidate(?) chores/threads/handlers in master/regionserver, so that the active master manager in the backup master has a high priority and can grab the ZK node faster, before we move to Raft.
* Clean up MetaServerShutdownHandler and HMaster#assignMeta in the next major release, when rolling upgrade is no longer an issue. This should be done much later.

> Co-locate meta and master
> -------------------------
>
> Key: HBASE-10569
> URL: https://issues.apache.org/jira/browse/HBASE-10569
> Project: HBase
> Issue Type: Improvement
> Components: master, Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Attachments: hbase-10569_v1.patch
>
>
> I was thinking of simplifying/improving the region assignments. The first step
> is to co-locate the meta and the master, as many people agreed on HBASE-5487.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
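
As a rough illustration of two changes listed in the comment above (HMaster extending HRegionServer with the getMetrics rename, and the serviceStarted check), here is a minimal, self-contained sketch. It is not the code in hbase-10569_v1.patch: the classes RegionServerSketch and MasterSketch, the methods becomeActive and checkServiceStarted, the string return values, and the local ServerNotRunningYetException stand-in are all hypothetical names chosen for the example.

```java
import java.io.IOException;

// Illustrative sketch only; none of these classes are the real HBase classes.
class CoLocationSketch {

    // Local stand-in for HBase's ServerNotRunningYetException (simplified, hypothetical).
    static class ServerNotRunningYetException extends IOException {
        ServerNotRunningYetException(String message) {
            super(message);
        }
    }

    // Stand-in for HRegionServer.
    static class RegionServerSketch {
        String getMetrics() {
            return "regionserver-metrics";
        }
    }

    // Stand-in for HMaster: it is also a region server after the change.
    static class MasterSketch extends RegionServerSketch {
        // Assumed here to be a simple flag that is set once the master becomes active.
        private volatile boolean serviceStarted = false;

        // Named getMasterMetrics so it does not clash with the inherited getMetrics().
        String getMasterMetrics() {
            return "master-metrics";
        }

        // Hypothetical hook called when this master wins the active-master election.
        void becomeActive() {
            serviceStarted = true;
        }

        // Hypothetical guard that a master RPC handler could call first.
        void checkServiceStarted() throws ServerNotRunningYetException {
            if (!serviceStarted) {
                throw new ServerNotRunningYetException("Server is not running yet");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        MasterSketch master = new MasterSketch();

        // Both accessors are available on the same object.
        System.out.println(master.getMetrics());
        System.out.println(master.getMasterMetrics());

        // Before activation, master-only calls are rejected.
        try {
            master.checkServiceStarted();
        } catch (ServerNotRunningYetException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // After activation, the guard passes.
        master.becomeActive();
        master.checkServiceStarted();
        System.out.println("active");
    }
}
```

In this sketch a backup master still behaves as a region server through the inherited methods, while master-only calls are rejected until the flag is set.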