[ 
https://issues.apache.org/jira/browse/HDDS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211251#comment-17211251
 ] 

Bharat Viswanadham commented on HDDS-4164:
------------------------------------------

HDDS-4262 is the root cause of this issue. When the leader changes, the old 
leader replies to all pending requests with NOT LEADER, and the client retries 
them. Previously the retry used a new clientID and callID, so the Ratis server 
could not recognize it as a retry of the same request. With the fix from 
HDDS-4262 I ran a freon test and no longer see KEY_NOT_FOUND.
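
To illustrate why reusing the clientID and callID matters, here is a minimal 
sketch of a retry cache keyed on that pair. It is not the Ozone or Ratis code: 
ToyServer, CallKey and the string replies are hypothetical names made up for 
this example, and it assumes the first commit was applied before its reply was 
lost during the leader change.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch only; not the Ozone or Ratis implementation. */
class RetryCacheSketch {

  /** Identifies one logical client call. A retry must reuse both fields. */
  static final class CallKey {
    final String clientId;
    final long callId;

    CallKey(String clientId, long callId) {
      this.clientId = clientId;
      this.callId = callId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof CallKey)) {
        return false;
      }
      CallKey other = (CallKey) o;
      return callId == other.callId && clientId.equals(other.clientId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(clientId, callId);
    }
  }

  /** Toy server: a commit consumes the open key, so it succeeds only once. */
  static final class ToyServer {
    private final Set<String> openKeys = new HashSet<>();
    private final Map<CallKey, String> retryCache = new HashMap<>();

    void openKey(String key) {
      openKeys.add(key);
    }

    String commitKey(String clientId, long callId, String key) {
      CallKey id = new CallKey(clientId, callId);
      String cached = retryCache.get(id);
      if (cached != null) {
        // Same clientId and callId: recognized as a retry, replay the reply.
        return cached;
      }
      // Unknown call: execute it. The open key is removed on success.
      String reply = openKeys.remove(key) ? "OK" : "KEY_NOT_FOUND";
      retryCache.put(id, reply);
      return reply;
    }
  }

  public static void main(String[] args) {
    ToyServer server = new ToyServer();
    server.openKey("/vol1/bucket1/key1");

    // First attempt is applied, but assume the client never sees the reply.
    System.out.println(server.commitKey("client-A", 1L, "/vol1/bucket1/key1"));
    // Retry with the same IDs: served from the cache, still OK.
    System.out.println(server.commitKey("client-A", 1L, "/vol1/bucket1/key1"));
    // Retry with fresh IDs: executed again, and the open key is already gone.
    System.out.println(server.commitKey("client-B", 2L, "/vol1/bucket1/key1"));
  }
}
```

In this toy model the third call returns KEY_NOT_FOUND because the server 
treats it as a brand-new request, which matches the symptom reported in this 
issue.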

[~ljain] Once you confirm, I will close this bug.

> OM client request fails with "failed to commit as key is not found in OpenKey 
> table"
> ------------------------------------------------------------------------------------
>
>                 Key: HDDS-4164
>                 URL: https://issues.apache.org/jira/browse/HDDS-4164
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: OM HA
>            Reporter: Lokesh Jain
>            Assignee: Bharat Viswanadham
>            Priority: Blocker
>
> OM client request fails with "failed to commit as key is not found in OpenKey 
> table"
> {code:java}
> 20/08/28 03:21:53 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28868 $Proxy17.submitRequest over 
> nodeId=om3,nodeAddress=vc1330.halxg.cloudera.com:9862
> 20/08/28 03:21:53 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28870 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:53 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28869 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28871 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28872 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28866 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28867 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28874 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of call #28875 $Proxy17.submitRequest over 
> nodeId=om1,nodeAddress=vc1325.halxg.cloudera.com:9862
> 20/08/28 03:21:54 ERROR freon.BaseFreonGenerator: Error on executing task 
> 14424
> KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to 
> commit key, as /vol1/bucket1/akjkdz4hoj/14424/104766512182520809entry is not 
> found in the OpenKey table
>         at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:593)
>         at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.commitKey(OzoneManagerProtocolClientSideTranslatorPB.java:650)
>         at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.commitKey(BlockOutputStreamEntryPool.java:306)
>         at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:514)
>         at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
>         at 
> org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.lambda$createKey$0(OzoneClientKeyGenerator.java:118)
>         at com.codahale.metrics.Timer.time(Timer.java:101)
>         at 
> org.apache.hadoop.ozone.freon.OzoneClientKeyGenerator.createKey(OzoneClientKeyGenerator.java:113)
>         at 
> org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:178)
>         at 
> org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:167)
>         at 
> org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:150)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}


