[ 
https://issues.apache.org/jira/browse/HBASE-18796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174755#comment-16174755
 ] 

Abhishek Singh Chouhan commented on HBASE-18796:
------------------------------------------------

I spent some time looking at the failure. It looks like a problem elsewhere
that this change has surfaced.
The test does a split and then tries a batch get, which fails with table not
found although the table is there. This happens because, now that we do not
put daughter locations in meta before the daughters are actually opened on the
regionserver, we run into NoServerForRegionException in
ConnectionImplementation#locateRegionInMeta. That should be fine, since there
are retries which should succeed as soon as the region is opened. However, our
retry fails with a TableNotFoundException here:

{code}
try (ReversedClientScanner rcs =
    new ReversedClientScanner(conf, s, TableName.META_TABLE_NAME, this,
        rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(),
        metaReplicaCallTimeoutScanInMicroSecond)) {
  regionInfoRow = rcs.next();
}
if (regionInfoRow == null) {
  throw new TableNotFoundException(tableName);
}
{code}

During one of the retries, the result that we get back has
mayHaveMoreCellsInRow() set to true. Since we don't have
setAllowPartialResults(true) set on our scan, we use CompleteScanResultCache,
which won't return this partial result to the client. We got only 1 row, and
it was partial, so regionInfoRow ends up null. After I do
{code}
    s.addFamily(HConstants.CATALOG_FAMILY);
    s.setOneRowLimit();
+   s.setAllowPartialResults(true);
    if (this.useMetaReplicas) {
      s.setConsistency(Consistency.TIMELINE);
    }
{code}
the client is able to ride over the split during its retries and the test
passes.
[~tedyu] [~apurtell] This issue seems to be something that can be hit during
any other retry in locateRegionInMeta too: whenever mayHaveMoreCellsInRow() is
true for the meta scan, the client gets TableNotFoundException and does not
retry. I can open another JIRA for this if that sounds good.
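
To illustrate the failure mode described above, here is a minimal toy model in
plain Java. It is not HBase code: PartialRowDemo, ToyResult and firstRow are
made-up names, and it only models a single scanner response, on the assumption
that a "complete rows only" cache holds back any chunk flagged as possibly
having more cells. It is a sketch of the reasoning, not of the real
CompleteScanResultCache implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only; every name in this file is hypothetical, not an HBase API. */
final class PartialRowDemo {

    /** Stand-in for a scan result: some cells plus a "more cells may follow" flag. */
    static final class ToyResult {
        final List<String> cells;
        final boolean mayHaveMoreCellsInRow;

        ToyResult(List<String> cells, boolean mayHaveMoreCellsInRow) {
            this.cells = cells;
            this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
        }
    }

    /**
     * Returns what the caller sees from one response.
     * With allowPartialResults, the first chunk is handed over as-is.
     * Without it, chunks are buffered until one says the row is complete;
     * if that never happens within this response, the caller gets null.
     */
    static ToyResult firstRow(List<ToyResult> response, boolean allowPartialResults) {
        List<String> buffered = new ArrayList<>();
        for (ToyResult chunk : response) {
            if (allowPartialResults) {
                return chunk;
            }
            buffered.addAll(chunk.cells);
            if (!chunk.mayHaveMoreCellsInRow) {
                return new ToyResult(buffered, false);
            }
        }
        return null; // row never completed: this is the null regionInfoRow case
    }

    public static void main(String[] args) {
        // One row comes back, but it is flagged as possibly incomplete.
        List<ToyResult> response =
            Arrays.asList(new ToyResult(Arrays.asList("info:regioninfo"), true));

        System.out.println("complete rows only: "
            + (firstRow(response, false) == null ? "null" : "row"));
        System.out.println("partial allowed:    "
            + (firstRow(response, true) == null ? "null" : "row"));
    }
}
```

In this model the null returned in the "complete rows only" case corresponds to
the regionInfoRow == null check that throws TableNotFoundException, which is
why allowing partial results lets the lookup see the row.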

> Admin#isTableAvailable returns incorrect result before daughter regions are 
> opened
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-18796
>                 URL: https://issues.apache.org/jira/browse/HBASE-18796
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0
>
>         Attachments: HBASE-18796.branch-1.001.patch, 
> HBASE-18796.branch-1.001.patch, HBASE-18796.branch-1.002.patch, 
> HBASE-18796.branch-1.003.patch, HBASE-18796.master.001.patch
>
>
> Admin#isTableAvailable checks if it can getServerName for the meta entries it 
> reads. During a split, server locations are added to the meta entries in 
> MetaTableAccessor#splitRegion, although the description of the method says 
> "Does not add the location information to the daughter regions since they are 
> not open yet.". At this point during the split the daughter regions are not 
> actually open, so we can get to a state where the parent is offline and the 
> daughters are not yet open, but isTableAvailable returns true.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
