[ 
https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308365#comment-16308365
 ] 

Mukul Kumar Singh commented on HDFS-12913:
------------------------------------------

Thanks for the patch [~zvenczel].

The v2 patch looks good to me. +1.

> TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-12913
>                 URL: https://issues.apache.org/jira/browse/HDFS-12913
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Zsolt Venczel
>            Assignee: Zsolt Venczel
>              Labels: flaky-test
>         Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch
>
>
> Once in every 5000 test runs, the following failure occurs:
> {code}
> 2017-12-11 10:33:09 [INFO] 
> 2017-12-11 10:33:09 [INFO] 
> -------------------------------------------------------
> 2017-12-11 10:33:09 [INFO]  T E S T S
> 2017-12-11 10:33:09 [INFO] 
> -------------------------------------------------------
> 2017-12-11 10:33:09 [INFO] Running 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 262.641 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] 
> testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication)
>   Time elapsed: 262.477 s  <<< ERROR!
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137)
> 2017-12-11 10:37:32   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> 2017-12-11 10:37:32   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2017-12-11 10:37:32   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32   at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1962)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1421)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1862)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:728)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 2017-12-11 10:37:32   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 2017-12-11 10:37:32   at java.security.AccessController.doPrivileged(Native 
> Method)
> 2017-12-11 10:37:32   at javax.security.auth.Subject.doAs(Subject.java:422)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> 2017-12-11 10:37:32 
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:88)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:80)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:380)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.waitForReplicas(TestDNFencingWithReplication.java:80)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.doAnAction(TestDNFencingWithReplication.java:75)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:222)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
> 2017-12-11 10:37:32 Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1962)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1421)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1862)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:728)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 2017-12-11 10:37:32   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 2017-12-11 10:37:32   at java.security.AccessController.doPrivileged(Native 
> Method)
> 2017-12-11 10:37:32   at javax.security.auth.Subject.doAs(Subject.java:422)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> 2017-12-11 10:37:32 
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
> 2017-12-11 10:37:32   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
> 2017-12-11 10:37:32   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> 2017-12-11 10:37:32   at com.sun.proxy.$Proxy23.getBlockLocations(Unknown 
> Source)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:306)
> 2017-12-11 10:37:32   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown 
> Source)
> 2017-12-11 10:37:32   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32   at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> 2017-12-11 10:37:32   at com.sun.proxy.$Proxy27.getBlockLocations(Unknown 
> Source)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:852)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:898)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:271)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:268)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:278)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:84)
> 2017-12-11 10:37:32   ... 6 more
> 2017-12-11 10:37:32 
> 2017-12-11 10:37:32 [INFO] 
> 2017-12-11 10:37:32 [INFO] Results:
> 2017-12-11 10:37:32 [INFO] 
> 2017-12-11 10:37:32 [ERROR] Errors: 
> 2017-12-11 10:37:32 [ERROR]   
> TestDNFencingWithReplication.testFencingStress:137 ? Runtime Deferred
> 2017-12-11 10:37:32 [INFO] 
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
> {code}
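
In the trace above, the StandbyException is raised inside the check that ReplicationToggler passes to GenericTestUtils.waitFor, and it propagates out of the polling loop and fails the test. The attached patches are not reproduced in this message, so the following is only a minimal sketch of one possible way to tolerate such a transient condition, not the actual HDFS-12913 change. It has no Hadoop dependency: the class StandbyTolerantWait, its waitFor method, and the nested StandbyException are hypothetical stand-ins introduced purely for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

public class StandbyTolerantWait {

    /** Hypothetical stand-in for org.apache.hadoop.ipc.StandbyException. */
    static class StandbyException extends Exception {
        StandbyException(String message) {
            super(message);
        }
    }

    /**
     * Polls the check until it returns true or the timeout expires.
     * A StandbyException is treated as "not ready yet" and retried;
     * any other exception still propagates to the caller.
     */
    static void waitFor(Callable<Boolean> check, long intervalMs, long timeoutMs)
            throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            try {
                if (check.call()) {
                    return;
                }
            } catch (StandbyException e) {
                // Assumed transient: no NameNode is active yet, so poll again.
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new TimeoutException("condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated check: the first two calls hit a standby node, the third succeeds.
        waitFor(() -> {
            if (++calls[0] < 3) {
                throw new StandbyException(
                    "Operation category READ is not supported in state standby");
            }
            return true;
        }, 10, 1000);
        System.out.println("ready after " + calls[0] + " calls");
    }
}
```

Retrying only on the standby condition keeps genuine failures visible, and the timeout still bounds how long the test can wait for a node to become active.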


