[ https://issues.apache.org/jira/browse/HBASE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen Yuan Jiang updated HBASE-12071:
---------------------------------------
    Attachment: HBASE-12071.v1-master.patch

Comments from the V1 patch against master branch:

"
Currently we have 3 RpcExecutors in SimpleRpcScheduler for different request types:
- priorityExecutor for meta table requests and some admin region operation requests (HConstants.HIGH_QOS)
- replicationExecutor for replication requests (HConstants.REPLICATION_QOS)
- callExecutor for normal requests (HConstants.REPLAY_QOS and HConstants.NORMAL_QOS)

The proposed changes are:
(1). The comment saying that only meta table requests use priorityExecutor is not true; it should be used for all system tables. So change the meta table check to a system table check.
(2). Split HConstants.HIGH_QOS into two: HConstants.ADMIN_QOS for admin requests and HConstants.SYSTEMTABLE_QOS for system tables. This keeps the code flexible: in the future we could extend it to use different executors for more reliability and scalability. (The existing code already does something similar: HConstants.REPLAY_QOS just uses callExecutor today, and could move to another executor in the future.)
(3). Make the requests in Admin.proto that involve Master<->RS communication use priorityExecutor (via HConstants.ADMIN_QOS).
(4). Currently, priorityExecutor uses only 1 LinkedBlockingQueue (default queue size=10 & default handler count=10). I changed it to use 2 LinkedBlockingQueues. Two alternatives are: (a) increase the handler count to 15-20 to handle more load; or (b) make the number of queues configurable.
"

> Separate out thread pool for Master <-> RegionServer communication
> ------------------------------------------------------------------
>
>                 Key: HBASE-12071
>                 URL: https://issues.apache.org/jira/browse/HBASE-12071
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.0
>            Reporter: Sudarshan Kadambi
>            Assignee: Stephen Yuan Jiang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-12071.v1-master.patch
>
>
> Over in HBASE-12028, there is a discussion about the case of a RegionServer
> still being alive despite all its handler threads being dead. One outcome of
> this is that the Master is left hanging on the RS for completion of various
> operations - such as region un-assignment when a table is disabled. Does it
> make sense to create a separate thread pool for communication between the
> Master and the RS? This addresses not just the case of the RPC handler
> threads terminating but also long-running queries or co-processor executions
> holding up master operations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
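
For readers following the patch comment, the proposed mapping from QoS level to executor can be sketched roughly as below. This is only an illustration and not code from the patch: the class name QosDispatchSketch, the Executor enum, and the numeric priority values are all assumptions made for the example. Only the constant names and the executor each one maps to come from the comment.

```java
/**
 * Illustrative sketch of the QoS-to-executor routing described in the comment.
 * Not HBase code: the class, enum, and numeric values are hypothetical.
 */
final class QosDispatchSketch {
    // Hypothetical priority values; the real HConstants values may differ.
    static final int NORMAL_QOS = 0;
    static final int REPLICATION_QOS = 5;
    static final int REPLAY_QOS = 6;
    static final int ADMIN_QOS = 100;       // proposed: admin (Master<->RS) requests
    static final int SYSTEMTABLE_QOS = 200; // proposed: system table requests

    /** Stand-ins for the three RpcExecutors named in the comment. */
    enum Executor { CALL, REPLICATION, PRIORITY }

    /** Picks the executor for a request with the given QoS level. */
    static Executor executorFor(int priority) {
        switch (priority) {
            case ADMIN_QOS:
            case SYSTEMTABLE_QOS:
                // Both halves of the former HIGH_QOS still share priorityExecutor.
                return Executor.PRIORITY;
            case REPLICATION_QOS:
                return Executor.REPLICATION;
            case REPLAY_QOS:
            case NORMAL_QOS:
            default:
                // Replay requests currently share callExecutor with normal requests.
                return Executor.CALL;
        }
    }

    public static void main(String[] args) {
        System.out.println("ADMIN_QOS       -> " + executorFor(ADMIN_QOS));
        System.out.println("SYSTEMTABLE_QOS -> " + executorFor(SYSTEMTABLE_QOS));
        System.out.println("REPLICATION_QOS -> " + executorFor(REPLICATION_QOS));
        System.out.println("REPLAY_QOS      -> " + executorFor(REPLAY_QOS));
        System.out.println("NORMAL_QOS      -> " + executorFor(NORMAL_QOS));
    }
}
```

Keeping ADMIN_QOS and SYSTEMTABLE_QOS as separate cases, even though both currently return the same executor, is what would let one of them move to a dedicated executor later with a one-line change.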