[ https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308365#comment-16308365 ]
Mukul Kumar Singh commented on HDFS-12913:
------------------------------------------

Thanks for the patch [~zvenczel]. The v2 patch looks good to me. +1

> TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue
> ------------------------------------------------------------------------------------
>
> Key: HDFS-12913
> URL: https://issues.apache.org/jira/browse/HDFS-12913
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Zsolt Venczel
> Assignee: Zsolt Venczel
> Labels: flaky-test
> Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch
>
>
> Once in every 5000 test runs, the following issue happens:
> {code}
> 2017-12-11 10:33:09 [INFO]
> 2017-12-11 10:33:09 [INFO] -------------------------------------------------------
> 2017-12-11 10:33:09 [INFO] T E S T S
> 2017-12-11 10:33:09 [INFO] -------------------------------------------------------
> 2017-12-11 10:33:09 [INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 262.641 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication) Time elapsed: 262.477 s <<< ERROR!
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred
> 2017-12-11 10:37:32     at org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
> 2017-12-11 10:37:32     at org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137)
> 2017-12-11 10:37:32     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2017-12-11 10:37:32     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2017-12-11 10:37:32     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32     at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 2017-12-11 10:37:32     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2017-12-11 10:37:32     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 2017-12-11 10:37:32     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 2017-12-11 10:37:32     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 2017-12-11 10:37:32     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 2017-12-11 10:37:32     at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
> 2017-12-11 10:37:32     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby.
> Visit https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1962)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1421)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1862)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:728)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 2017-12-11 10:37:32     at java.security.AccessController.doPrivileged(Native Method)
> 2017-12-11 10:37:32     at javax.security.auth.Subject.doAs(Subject.java:422)
> 2017-12-11 10:37:32     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> 2017-12-11 10:37:32
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:88)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:80)
> 2017-12-11 10:37:32     at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:380)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.waitForReplicas(TestDNFencingWithReplication.java:80)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.doAnAction(TestDNFencingWithReplication.java:75)
> 2017-12-11 10:37:32     at org.apache.hadoop.test.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:222)
> 2017-12-11 10:37:32     at org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
> 2017-12-11 10:37:32 Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1962)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1421)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1862)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:728)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 2017-12-11 10:37:32     at java.security.AccessController.doPrivileged(Native Method)
> 2017-12-11 10:37:32     at javax.security.auth.Subject.doAs(Subject.java:422)
> 2017-12-11 10:37:32     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> 2017-12-11 10:37:32
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Client.call(Client.java:1437)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> 2017-12-11 10:37:32     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> 2017-12-11 10:37:32     at com.sun.proxy.$Proxy23.getBlockLocations(Unknown Source)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:306)
> 2017-12-11 10:37:32     at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> 2017-12-11 10:37:32     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32     at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> 2017-12-11 10:37:32     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> 2017-12-11 10:37:32     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> 2017-12-11 10:37:32     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> 2017-12-11 10:37:32     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> 2017-12-11 10:37:32     at com.sun.proxy.$Proxy27.getBlockLocations(Unknown Source)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:852)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:841)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:898)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:271)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:268)
> 2017-12-11 10:37:32     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:278)
> 2017-12-11 10:37:32     at org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler$1.get(TestDNFencingWithReplication.java:84)
> 2017-12-11 10:37:32     ... 6 more
> 2017-12-11 10:37:32
> 2017-12-11 10:37:32 [INFO]
> 2017-12-11 10:37:32 [INFO] Results:
> 2017-12-11 10:37:32 [INFO]
> 2017-12-11 10:37:32 [ERROR] Errors:
> 2017-12-11 10:37:32 [ERROR] TestDNFencingWithReplication.testFencingStress:137 ?
> Runtime Deferred
> 2017-12-11 10:37:32 [INFO]
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
> {code}