[ https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847727#comment-17847727 ]

Hudson commented on HBASE-28595:
--------------------------------

Results for branch branch-3
	[build #208 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/208/]: (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/208/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/208/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/208/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}


> Losing exception from scan RPC can lead to partial results
> ----------------------------------------------------------
>
>                 Key: HBASE-28595
>                 URL: https://issues.apache.org/jira/browse/HBASE-28595
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, Scanners
>            Reporter: Csaba Ringhofer
>            Assignee: Csaba Ringhofer
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase
> client and server. It is not clear yet whether other branches are also
> affected.
> The issue happens if the server side of the scan throws an exception and
> closes the scanner, but at the same time, the client gets an rpc connection
> closed error and doesn't process the exception sent by the server. Client
> then thinks it got a network error, which leads to retrying the RPC instead
> of opening a new scanner. But then when the client retry reaches the server,
> the server returns an empty ScanResponse instead of an error, leading to
> closing the scanner on client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the
> server. If there are connection issues during the RPC then the client won't
> really know the state of the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)