Lokesh Jain created HDDS-3611:
---------------------------------

             Summary: Ozone client should not consider closed container error 
as failure
                 Key: HDDS-3611
                 URL: https://issues.apache.org/jira/browse/HDDS-3611
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Lokesh Jain


The datanode throws ContainerNotOpenException when a client writes to a 
container that is not open. The Ozone client currently treats this as a failure 
and increments the retry count. Once the count reaches the configured maximum, 
the write fails. MapReduce jobs were seen failing with this error at the 
default retry count of 5.

The idea is to exclude errors caused by a closed container from the retry 
count, so that Ozone client writes do not fail because of closed container 
exceptions.
{code:java}
2020-05-15 02:20:28,375 ERROR [main] org.apache.hadoop.ozone.client.io.KeyOutputStream: Retry request failed. retries get failed due to exceeded maximum allowed retries number: 5
java.io.IOException: Unexpected Storage Container Exception: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.ratis.protocol.StateMachineException: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException from Server e2eec12f-02c5-46e2-9c23-14d6445db219@group-A3BF3ABDC307: Container 15 in CLOSED state
        at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.setIoException(BlockOutputStream.java:551)
        at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$3(BlockOutputStream.java:638)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
        at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
        at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:99)
        at org.apache.ratis.client.impl.OrderedAsync$PendingOrderedRequest.setReply(OrderedAsync.java:60)
        at org.apache.ratis.util.SlidingWindow$RequestMap.setReply(SlidingWindow.java:143)
        at org.apache.ratis.util.SlidingWindow$Client.receiveReply(SlidingWindow.java:314)
        at org.apache.ratis.client.impl.OrderedAsync.lambda$sendRequest$9(OrderedAsync.java:242)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
        at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.lambda$onNext$0(GrpcClientProtocolClient.java:284)
        at java.util.Optional.ifPresent(Optional.java:159)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:340)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.access$100(GrpcClientProtocolClient.java:264)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:284)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:267)
        at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:436)
        at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:658)
...{code}
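
A rough sketch of the proposed behaviour, for illustration only. This is not 
the actual KeyOutputStream code: the class ClosedContainerRetrySketch, the 
WriteAttempt interface, the methods isClosedContainerError and 
writeWithRetries, and the local ContainerNotOpenException stand-in are all 
hypothetical names. It assumes the closed container error can be found by 
walking the cause chain of the IOException, as the nested exceptions in the 
log suggest, and that only other errors should consume the retry budget.

```java
import java.io.IOException;

// Hypothetical sketch; names and structure are assumptions, not Ozone's API.
class ClosedContainerRetrySketch {

    /** Local stand-in for the exception the datanode reports in the log. */
    static class ContainerNotOpenException extends IOException {
        ContainerNotOpenException(String message) {
            super(message);
        }
    }

    /** One write attempt; a real client would write a chunk to a container. */
    interface WriteAttempt {
        void run() throws IOException;
    }

    /** Returns true if any exception in the cause chain is a closed container error. */
    static boolean isClosedContainerError(Throwable error) {
        for (Throwable cause = error; cause != null; cause = cause.getCause()) {
            if (cause instanceof ContainerNotOpenException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the attempt until it succeeds. Closed container errors trigger a
     * new attempt without touching the retry count; any other IOException
     * counts as a retry and fails the write once maxRetries is exceeded.
     * Returns the number of counted retries.
     */
    static int writeWithRetries(WriteAttempt attempt, int maxRetries) throws IOException {
        int retryCount = 0;
        while (true) {
            try {
                attempt.run();
                return retryCount;
            } catch (IOException e) {
                if (isClosedContainerError(e)) {
                    // Assumption: the client would move on to a new container
                    // here. The sketch simply tries again, so it loops for as
                    // long as the attempt keeps reporting a closed container.
                    continue;
                }
                retryCount++;
                if (retryCount > maxRetries) {
                    throw new IOException(
                        "Exceeded maximum allowed retries number: " + maxRetries, e);
                }
            }
        }
    }
}
```

With this shape, a write that hits several closed containers in a row and 
then succeeds reports zero counted retries, while a write that keeps failing 
for any other reason still stops once the configured limit is exceeded.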


