sadanand48 commented on code in PR #10479:
URL: https://github.com/apache/ozone/pull/10479#discussion_r3440776671
##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/StreamBlockInputStream.java:
##########
@@ -327,10 +338,14 @@ synchronized void readBlockImpl(long length) throws IOException {
}
private void handleExceptions(IOException cause) throws IOException {
- if (cause instanceof StorageContainerException || isConnectivityIssue(cause)) {
- if (shouldRetryRead(cause, retryPolicy, retries++)) {
+ IOException root = unwrapCause(cause);
+ if (root instanceof StorageContainerException || isConnectivityIssue(root) ||
+ root instanceof TimeoutIOException) {
+ if (shouldRetryRead(root, retryPolicy, retries++)) {
+ recordFailedStreamingDatanode();
Review Comment:
initStreamRead() binds a single DN for a long-lived gRPC stream. Mid-read
failures (e.g. a TimeoutIOException when the active DN is stopped) are handled
by closing the stream and re-initializing it.
Without an exclusion, initStreamRead() can select the same DN again on re-init.
The excluded set records the DN that was actively serving the stream when the
error happened, so the retry opens a new stream on a different replica and
resumes from requestedLength = position.
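
To make the intended flow concrete, here is a minimal standalone sketch of the
exclude-and-reselect behaviour. It is not the code in this PR: the class name,
the String datanode IDs, and the `advance`/`getPosition` helpers are
hypothetical, and the two methods named after `initStreamRead()` and
`recordFailedStreamingDatanode()` are simplified stand-ins that only model
replica selection, not the gRPC stream itself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the retry flow; not the real StreamBlockInputStream. */
public class ExcludedReplicaRetrySketch {
  // Replicas of the block, in the order they would normally be tried.
  private final List<String> replicas;
  // Datanodes that failed while actively serving a stream (the "excluded set").
  private final Set<String> failedDatanodes = new HashSet<>();
  // Datanode bound to the current stream, or null if no stream is open.
  private String activeDatanode;
  // Number of bytes already delivered to the caller.
  private long position;

  public ExcludedReplicaRetrySketch(List<String> replicas) {
    this.replicas = new ArrayList<>(replicas);
  }

  /** Stand-in for initStreamRead(): bind the first replica not yet excluded. */
  public String initStreamRead() {
    for (String dn : replicas) {
      if (!failedDatanodes.contains(dn)) {
        activeDatanode = dn;
        return dn;
      }
    }
    throw new IllegalStateException("No replica left to retry on");
  }

  /** Stand-in for recordFailedStreamingDatanode(): exclude the active DN. */
  public void recordFailedStreamingDatanode() {
    if (activeDatanode != null) {
      failedDatanodes.add(activeDatanode);
      activeDatanode = null; // the old stream is considered closed
    }
  }

  /** Simulates bytes being read successfully from the current stream. */
  public void advance(long bytes) {
    position += bytes;
  }

  /** Offset from which a re-initialized stream would resume. */
  public long getPosition() {
    return position;
  }
}
```

In this model, a reader that starts on the first replica, reads some bytes, and
then hits a timeout would call `recordFailedStreamingDatanode()` before
`initStreamRead()`, which then binds the second replica while `getPosition()`
still reports the bytes already read. What should happen once every replica has
been excluded is left open here; the sketch simply throws.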
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]