[ 
https://issues.apache.org/jira/browse/HBASE-19757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325436#comment-16325436
 ] 

Ted Yu edited comment on HBASE-19757 at 1/15/18 6:16 PM:
---------------------------------------------------------

In master, we have the following code in RSGroupInfoManagerImpl#refresh()
{code:java}
    if(!masterServices.isInitialized()) {
      specialTables = Arrays.asList(AccessControlLists.ACL_TABLE_NAME, TableName.META_TABLE_NAME,
          TableName.NAMESPACE_TABLE_NAME, RSGROUP_TABLE_NAME);
    } else {
      specialTables =
          masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR);
    }
{code}
If the acl table is about to be created, the call in the else branch may return
a list that does not yet include hbase:acl, so hbase:acl is not treated as one
of the special tables.

In RSGroupBasedLoadBalancer, hbase:acl then has no rs group, so no server is
provided for the table, which leads to the deadlock.
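
To make the gap concrete, here is a minimal, self-contained sketch of one possible way to keep hbase:acl in the list. It is not the attached patch and does not use the HBase API: the class SpecialTablesSketch, the method specialTables and the plain-string table names are hypothetical stand-ins for RSGroupInfoManagerImpl#refresh(), and the union of the fixed list with the namespace listing is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the refresh() logic; not part of HBase. */
public class SpecialTablesSketch {

  // Assumption: plain strings stand in for the TableName constants used in refresh().
  static final List<String> KNOWN_SYSTEM_TABLES =
      Arrays.asList("hbase:acl", "hbase:meta", "hbase:namespace", "hbase:rsgroup");

  /**
   * Always starts from the fixed list, so a system table that has not been
   * created yet (hbase:acl here) is still treated as special. Tables listed
   * in the system namespace are added only once the master is initialized.
   */
  static List<String> specialTables(boolean masterInitialized, List<String> listedSystemTables) {
    Set<String> result = new LinkedHashSet<>(KNOWN_SYSTEM_TABLES);
    if (masterInitialized) {
      result.addAll(listedSystemTables);
    }
    return new ArrayList<>(result);
  }

  public static void main(String[] args) {
    // Simulates the race: the master is initialized, but hbase:acl is not listed yet.
    List<String> listed = Arrays.asList("hbase:meta", "hbase:namespace", "hbase:rsgroup");
    System.out.println(specialTables(true, listed));
  }
}
```

With this shape, the result contains hbase:acl whether or not the namespace listing has caught up, so the balancer would always find it among the special tables.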


was (Author: yuzhih...@gmail.com):
In master, we have the following code in RSGroupInfoManagerImpl#refresh()
{code}
    if(!masterServices.isInitialized()) {
      specialTables = Arrays.asList(AccessControlLists.ACL_TABLE_NAME, TableName.META_TABLE_NAME,
          TableName.NAMESPACE_TABLE_NAME, RSGROUP_TABLE_NAME);
    } else {
      specialTables =
          masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR);
    }
{code}
If the acl table is about to be created, the call in the else branch may end up
not having hbase:acl as one of the special tables.
By always using the assignment in the if block, TestRSGroupsWithACL passes.

> System table gets stuck after enabling region server group feature in secure cluster
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-19757
>                 URL: https://issues.apache.org/jira/browse/HBASE-19757
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Major
>         Attachments: 19757.v1.txt, 19757.v2.txt, 19757.v3.txt
>
>
> I was testing on an hbase-2 secure cluster against hadoop 3 where some tables 
> were created without region server group feature.
> After adding the RSGroupAdminEndpoint and RSGroupBasedLoadBalancer to 
> hbase-site, I restarted the whole cluster.
> After the restart, hbase:meta region got stuck in transition (forever).
> {code}
> 2018-01-10 21:20:16,696 INFO  [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-ctr-e137-1514896590304-8706-01-000002.hwx.site,20000,1515619212617] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=ctr-e137-1514896590304-8706-01-000004.hwx.site,16020,1515618538016, exception=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on ctr-e137-1514896590304-8706-01-000004.hwx.site,16020,1515619181453
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3314)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3291)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1355)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1667)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
