[ 
https://issues.apache.org/jira/browse/HBASE-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337191#comment-17337191
 ] 

Viraj Jasani commented on HBASE-25829:
--------------------------------------

{quote}"Loaded *80 regions* from in-memory state of AssignmentManager"

"Loaded *73 regions from 5 regionservers' reports* and found 0 orphan regions"
{quote}
The first log line comes from loadRegionsFromInMemoryState(), which loads all 
regions from the in-memory state, and the second one comes from 
loadRegionsFromRSReport().

By any chance, was there any WARN log similar to _*Region is split but NOT 
offline: \{regionNameAsString}*_ ?

 
{code:java}
private void loadRegionsFromInMemoryState() {
  List<RegionState> regionStates =
      master.getAssignmentManager().getRegionStates().getRegionStates();
  for (RegionState regionState : regionStates) {
    RegionInfo regionInfo = regionState.getRegion();
    if (master.getTableStateManager()
        .isTableState(regionInfo.getTable(), TableState.State.DISABLED)) {
      disabledTableRegions.add(regionInfo.getRegionNameAsString());
    }
    if (regionInfo.isSplitParent()) {
      splitParentRegions.add(regionInfo.getRegionNameAsString());
    }
    HbckRegionInfo.MetaEntry metaEntry =
        new HbckRegionInfo.MetaEntry(regionInfo, regionState.getServerName(),
            regionState.getStamp());
    regionInfoMap.put(regionInfo.getEncodedName(),
        new HbckRegionInfo(metaEntry));
  }
  LOG.info("Loaded {} regions from in-memory state of AssignmentManager",
      regionStates.size());
}

{code}
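To make the gap between the two counts concrete (80 in memory vs. 73 reported, 
which leaves 7), here is a minimal, self-contained sketch of diffing the two 
views. This is not HbckChore code: the class RegionViewDiff, the method 
missingFromReports() and the placeholder region names are all hypothetical and 
only illustrate the set difference.

```java
import java.util.Set;
import java.util.TreeSet;

public class RegionViewDiff {

  // Hypothetical helper: returns the regions present in the master's
  // in-memory view but absent from the regionserver reports.
  static Set<String> missingFromReports(Set<String> inMemory, Set<String> reported) {
    Set<String> missing = new TreeSet<>(inMemory);
    missing.removeAll(reported);
    return missing;
  }

  public static void main(String[] args) {
    Set<String> inMemory = new TreeSet<>();
    Set<String> reported = new TreeSet<>();
    // Placeholder names: 80 regions in memory, the first 73 also reported.
    for (int i = 0; i < 80; i++) {
      String name = String.format("region-%02d", i);
      inMemory.add(name);
      if (i < 73) {
        reported.add(name);
      }
    }
    System.out.println(missingFromReports(inMemory, reported).size());
  }
}
```

If the 7 regions in the difference are exactly the ones logged with 
state=SPLIT and location=null, the two counts are consistent with each other 
and the open question is only why those records linger.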
 

 
{quote}However whenever the balancer runs there are a number of concerning INFO 
level log messages printed of the form _assignment.RegionStates: Skipping, no 
server for state=SPLIT, location=null, table=TABLENAME_
{quote}
Are these regions in RIT (splitting / splitting_new)?

I am trying to chase all references where we set regionLocation to null by 
calling this method:
{code:java}
public ServerName setRegionLocation(final ServerName serverName) {
  ServerName lastRegionLocation = this.regionLocation;
  if (LOG.isTraceEnabled() && serverName == null) {
    LOG.trace("Tracking when we are set to null " + this,
        new Throwable("TRACE"));
  }
  this.regionLocation = serverName;
  this.lastUpdate = EnvironmentEdgeManager.currentTime();
  return lastRegionLocation;
}

{code}
So far I see setRegionLocation(null) references only for legitimate purposes 
such as closing a region, closing it abruptly, a failed open, etc. I haven't 
seen it set to null in the splitting case.

In the meantime, what do the corresponding meta entries look like for these 7 
regions?
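
To collect the encoded region names to look up in meta, a small log parser 
along these lines might help. It is only a sketch: the class 
SplitSkipLogParser and its method are hypothetical, the pattern is inferred 
from the log lines quoted in the issue description, and it assumes each log 
message sits on a single line and that encoded names are 32 hex characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitSkipLogParser {

  // Pattern inferred from the "Skipping, no server ..." messages in the issue.
  private static final Pattern SKIP = Pattern.compile(
      "Skipping, no server for state=SPLIT, location=null, "
          + "table=([^,\\s]+), region=([0-9a-f]{32})");

  // Returns the encoded region names of all matching messages, in log order.
  static List<String> encodedRegionNames(String log) {
    List<String> names = new ArrayList<>();
    Matcher m = SKIP.matcher(log);
    while (m.find()) {
      names.add(m.group(2));
    }
    return names;
  }

  public static void main(String[] args) {
    String log = "2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] "
        + "assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, "
        + "table=IntegrationTestLoadCommonCrawl, region=087fb2f7847c2fc0a0b85eb30a97036e\n";
    for (String name : encodedRegionNames(log)) {
      System.out.println(name);
    }
  }
}
```

The balancer logs the same regions on every run, so the resulting list would 
need de-duplicating before use.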

 

 

> SPLIT state detritus
> --------------------
>
>                 Key: HBASE-25829
>                 URL: https://issues.apache.org/jira/browse/HBASE-25829
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.4.3
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> Seen after an integration test (see HBASE-25824) with 'calm' monkey, so this 
> happened in the happy path.
> There were no errors accessing all loaded table data. The integration test 
> writes a log to HDFS of every cell written to HBase and the verify phase uses 
> that log to read each value and confirm it. That seems fine:
> {noformat}
> 2021-04-30 02:16:33,316 INFO  [main] 
> test.IntegrationTestLoadCommonCrawl$Verify: REFERENCED: 154943544
> 2021-04-30 02:16:33,316 INFO  [main] 
> test.IntegrationTestLoadCommonCrawl$Verify: UNREFERENCED: 0
> 2021-04-30 02:16:33,316 INFO  [main] 
> test.IntegrationTestLoadCommonCrawl$Verify: CORRUPT: 0
> {noformat}
> However whenever the balancer runs there are a number of concerning INFO 
> level log messages printed of the form _assignment.RegionStates: Skipping, no 
> server for state=SPLIT, location=null, table=TABLENAME_ 
> For example:
> {noformat}
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=087fb2f7847c2fc0a0b85eb30a97036e
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=0952b94a920454afe9c40becbb7bf205
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=f87a8b993f7eca2524bf2331b7ee3c06
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=74bb28864a120decdf0f4956741df745
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=bc918b609ade0ae4d5530f0467354cae
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=183a199984539f3917a2f8927fe01572
> 2021-04-30 02:02:09,286 INFO  [master/ip-172-31-58-47:8100.Chore.2] 
> assignment.RegionStates: Skipping, no server for state=SPLIT, location=null, 
> table=IntegrationTestLoadCommonCrawl, region=6cc5ce4fb4adc00445b3ec7dd8760ba8
> {noformat}
> The HBCK chore notices them but does nothing:
> "Loaded *80 regions* from in-memory state of AssignmentManager"
> "Loaded *73 regions from 5 regionservers' reports* and found 0 orphan regions"
> "Loaded 3 tables 80 regions from filesystem and found 0 orphan regions"
> Yes, there are exactly 7 region state records of SPLIT state with 
> server=null. 
> {noformat}
> 2021-04-30 02:02:09,300 INFO  [master/ip-172-31-58-47:8100.Chore.1] 
> master.HbckChore: Loaded 80 regions from in-memory state of AssignmentManager
> 2021-04-30 02:02:09,300 INFO  [master/ip-172-31-58-47:8100.Chore.1] 
> master.HbckChore: Loaded 73 regions from 5 regionservers' reports and found 0 
> orphan regions
> 2021-04-30 02:02:09,306 INFO  [master/ip-172-31-58-47:8100.Chore.1] 
> master.HbckChore: Loaded 3 tables 80 regions from filesystem and found 0 
> orphan regions
> {noformat}
> This repeats indefinitely. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
