guluo2016 commented on PR #5767: URL: https://github.com/apache/hbase/pull/5767#issuecomment-2016469719
> Mind explaining more on why locating is so slow when the replica id does not exist?

Sorry for the late reply, I was busy with other things.

In short, HBase retries 8 times (by default) if the location of the specified region cannot be obtained, and this stage takes a long time.

A detailed analysis of this situation against the master branch:

We get a RegionOfflineException if the region replica id does not exist. The code is here:

```
// AsyncRegionLocator.getRegionLocation
// loc == null because region replica id does not exist
HRegionLocation loc = locs.getRegionLocation(replicaId);
if (loc == null) {
  future.completeExceptionally(
    new RegionOfflineException("No location for " + tableName + ", row='"
      + Bytes.toStringBinary(row) + "', locateType=" + type + ", replicaId=" + replicaId));
}
```

When it gets a RegionOfflineException, HBase retries over and over until maxAttempts is reached, because this exception is not a DoNotRetryIOException. The code is here:

```
// AsyncRpcRetryingCaller.onError
if (error instanceof DoNotRetryIOException && !(error instanceof ScannerResetException)) {
  future.completeExceptionally(error);
  return;
}
...
tryScheduleRetry(error);
```

In this situation we are sure that the region replica does not exist, so it is probably better not to retry. We can throw a DoNotRetryIOException instead to avoid the retries.
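
To make the difference concrete, here is a minimal, self-contained sketch of the retry decision. It is not HBase code: `RetrySketch`, `RetriableException`, `NoRetryException`, and `Call` are hypothetical stand-ins for the retrying caller, `RegionOfflineException`, and `DoNotRetryIOException`, and the value 8 for `maxAttempts` is only assumed from the default mentioned above.

```java
import java.io.IOException;

class RetrySketch {
  /** Hypothetical stand-in for a retriable error such as RegionOfflineException. */
  static class RetriableException extends IOException {
    RetriableException(String msg) { super(msg); }
  }

  /** Hypothetical stand-in for DoNotRetryIOException. */
  static class NoRetryException extends IOException {
    NoRetryException(String msg) { super(msg); }
  }

  /** Hypothetical operation that may fail, e.g. a region location lookup. */
  interface Call {
    void run() throws IOException;
  }

  /** Runs the call until it succeeds, fails with a non-retriable error,
   *  or maxAttempts is used up. Returns the number of attempts made. */
  static int attempts(Call call, int maxAttempts) {
    int tries = 0;
    while (tries < maxAttempts) {
      tries++;
      try {
        call.run();
        return tries; // success
      } catch (NoRetryException e) {
        return tries; // fail fast: retrying cannot help
      } catch (IOException e) {
        // Retriable: try again. A real caller would also pause between attempts.
      }
    }
    return tries;
  }

  /** Attempts made when the lookup always fails with a retriable error. */
  static int attemptsWhenRetriable(int maxAttempts) {
    return attempts(() -> { throw new RetriableException("no location for replica"); }, maxAttempts);
  }

  /** Attempts made when the lookup always fails with a non-retriable error. */
  static int attemptsWhenNotRetriable(int maxAttempts) {
    return attempts(() -> { throw new NoRetryException("no location for replica"); }, maxAttempts);
  }

  public static void main(String[] args) {
    int maxAttempts = 8; // assumed default, taken from the explanation above
    System.out.println("retriable: " + attemptsWhenRetriable(maxAttempts));
    System.out.println("non-retriable: " + attemptsWhenNotRetriable(maxAttempts));
  }
}
```

With the retriable error the sketch uses all 8 attempts before giving up; with the non-retriable error it stops after the first one, which is the behavior the proposed change aims for when the replica id does not exist.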