[ https://issues.apache.org/jira/browse/HBASE-25815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caroline Zhou reassigned HBASE-25815:
-------------------------------------

    Assignee: Caroline Zhou

> RSGroupBasedLoadBalancer online status never updates after being set to true
> for the first time
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25815
>                 URL: https://issues.apache.org/jira/browse/HBASE-25815
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Caroline Zhou
>            Assignee: Caroline Zhou
>            Priority: Minor
>
> Once the RSGroupBasedLoadBalancer is “online” (it has found the hbase:meta
> and hbase:rsgroup tables), it never updates the status again. That means
> if hbase:meta or hbase:rsgroup ever goes offline, the balancer doesn’t update
> its status to “offline,” so some callers still take the “online” code path
> even though the catalog tables aren’t available to be read from or written to
> (in particular, anything that calls RSGroupInfoManagerImpl#flushConfig).
>
> Also, in the RSGroupInfoManagerImpl#flushConfig code path, the call to write
> to hbase:rsgroup comes before the update to the rsGroupMap and tableMap, which
> are stored in memory (see the order of [these lines of
> code|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java#L664-L670]).
> As a result, if hbase:rsgroup goes offline after the RSGroupBasedLoadBalancer
> is already marked as “online,” exceptions thrown while trying to write to the
> offline hbase:rsgroup table prevent the in-memory rsGroupMap and tableMap
> from being updated. In terms of the order just mentioned, in-memory state
> should be updated first.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)