[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250885#comment-17250885 ]

Akira Ajisaka commented on HDFS-13579:
--------------------------------------

Rethinking this, the fix is not correct. I'm investigating the root cause.

> Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13579
>                 URL: https://issues.apache.org/jira/browse/HDFS-13579
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ewan Higgs
>            Assignee: Akira Ajisaka
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
> we often get OOM errors. It's not every time, but it occurs frequently. We have
> reproduced this on a few different machines. This seems to have been introduced in
> f83716b7f2e5b63e4c2302c374982755233d4dd6 by HDFS-13251.
> Output from the test:
> {code:java}
> java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method)
>     at java.lang.Thread.start(Thread.java:714)
>     at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:578)
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
>     at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
>     at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:270)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:2023)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNode(MiniDFSCluster.java:2023)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:2013)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1992)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1966)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1959)
>     at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailureBase.tearDown(TestDFSStripedOutputStreamWithFailureBase.java:222)
>     at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testCloseWithExceptionsInStreamer(TestDFSStripedOutputStreamWithFailure.java:266)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:54)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250840#comment-17250840 ]

Akira Ajisaka commented on HDFS-13579:
--------------------------------------

In the stack trace, a thread is started while MiniDFSCluster is shutting down, which is not the expected behavior.

The thread is started because DatanodeHttpServer.close calls bossGroup.shutdownGracefully() and then calls workerGroup.shutdownGracefully(). bossGroup.shutdownGracefully() actually closes the workerGroup, and workerGroup.shutdownGracefully() starts a thread if the workerGroup is already closed.

To fix this issue, the order of closing the groups should be reversed. The Netty user guide closes the groups in that reversed order: https://netty.io/wiki/user-guide-for-4.x.html
{code:java}
} finally {
    workerGroup.shutdownGracefully();
    bossGroup.shutdownGracefully();
}
{code}
[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583037#comment-16583037 ]

Jonathan Hung commented on HDFS-13579:
--------------------------------------

FYI I see this on branch-2 as well (which has neither HDFS-13251 nor HDFS-11600). I don't think it's just this particular test either: running the hadoop-hdfs tests locally fails on different tests (depending on how far the test goal happens to get), e.g.
{noformat}
[INFO] Running org.apache.hadoop.TestRefreshCallQueue
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.574 s <<< FAILURE! - in org.apache.hadoop.TestRefreshCallQueue
[ERROR] testRefresh(org.apache.hadoop.TestRefreshCallQueue)  Time elapsed: 0.807 s  <<< ERROR!
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:557)
    at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
    at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
    at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:285)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1986)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNode(MiniDFSCluster.java:1892)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1882)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1861)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1835)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1828)
    at org.apache.hadoop.TestRefreshCallQueue.tearDown(TestRefreshCallQueue.java:83)
{noformat}
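
The JVM reports "unable to create new native thread" when the operating system refuses to create another thread, so failures that move from test to test as the run progresses may point at threads accumulating in the test JVM rather than at heap exhaustion. A minimal diagnostic sketch for checking that, assuming it is called from a test's tearDown or from a debugger: ThreadCensus and its methods are hypothetical names and not part of Hadoop; the only library call used is the standard Thread.getAllStackTraces().

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Hadoop): counts live JVM threads grouped by name prefix. */
public class ThreadCensus {

  /**
   * Strips trailing digits, '-', '#' and spaces from a thread name, so that
   * names such as "nioEventLoopGroup-2-1" and "nioEventLoopGroup-3-4" share one key.
   * A name made up entirely of such characters is returned unchanged.
   */
  public static String prefixOf(String threadName) {
    int end = threadName.length();
    while (end > 0) {
      char c = threadName.charAt(end - 1);
      if (Character.isDigit(c) || c == '-' || c == '#' || c == ' ') {
        end--;
      } else {
        break;
      }
    }
    return end == 0 ? threadName : threadName.substring(0, end);
  }

  /** Returns the number of live threads per name prefix, sorted by prefix. */
  public static Map<String, Integer> countByPrefix() {
    Map<String, Integer> counts = new TreeMap<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      counts.merge(prefixOf(t.getName()), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Print one line per thread family: count, then prefix.
    countByPrefix().forEach((prefix, n) -> System.out.println(n + "\t" + prefix));
  }
}
```

Comparing the output before and after a MiniDFSCluster shutdown would show whether any thread family keeps growing between tests. It does not identify who leaked the threads, only which names to look for.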
[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer
[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478753#comment-16478753 ]

Ewan Higgs commented on HDFS-13579:
-----------------------------------

ad1b988a828608b12cafb6382436cd17f95bfcc5 (HDFS-11600) might be a more likely candidate.