[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer

2020-12-17 Thread Akira Ajisaka (Jira)


[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250885#comment-17250885 ]

Akira Ajisaka commented on HDFS-13579:
--

Rethinking this, the fix is not correct. I'm investigating the root cause.

> Out of memory when running TestDFSStripedOutputStreamWithFailure 
> testCloseWithExceptionsInStreamer
> --
>
> Key: HDFS-13579
> URL: https://issues.apache.org/jira/browse/HDFS-13579
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ewan Higgs
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When running  TestDFSStripedOutputStreamWithFailure 
> testCloseWithExceptionsInStreamer we often get OOM errors. It's not every 
> time, but it occurs frequently. We have reproduced this on a few different 
> machines. This seems to have been introduced in 
> f83716b7f2e5b63e4c2302c374982755233d4dd6 by HDFS-13251.
> Output from the test:
> {code:java}
> java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method)
>     at java.lang.Thread.start(Thread.java:714)
>     at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:578)
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
>     at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
>     at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:270)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:2023)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNode(MiniDFSCluster.java:2023)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:2013)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1992)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1966)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1959)
>     at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailureBase.tearDown(TestDFSStripedOutputStreamWithFailureBase.java:222)
>     at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testCloseWithExceptionsInStreamer(TestDFSStripedOutputStreamWithFailure.java:266)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:54)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70){code}





[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer

2020-12-16 Thread Akira Ajisaka (Jira)


[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250840#comment-17250840 ]

Akira Ajisaka commented on HDFS-13579:
--

In the stack trace, a new thread is started while MiniDFSCluster is shutting down, which is not the expected behavior. The thread is started because DatanodeHttpServer.close calls bossGroup.shutdownGracefully() and then workerGroup.shutdownGracefully(). bossGroup.shutdownGracefully() actually closes the workerGroup as well, and workerGroup.shutdownGracefully() starts a new thread if the workerGroup is already closed. To fix this issue, the order of closing the groups should be reversed.

In the Netty user guide, the groups are shut down in the reverse order: https://netty.io/wiki/user-guide-for-4.x.html
{code:java}
} finally {
    workerGroup.shutdownGracefully();
    bossGroup.shutdownGracefully();
}
{code}
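
For illustration, here is a minimal, self-contained sketch of that reversed shutdown order (the class and field names are hypothetical; the actual DatanodeHttpServer.close() also releases channels and other resources that are not shown here):
{code:java}
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;

// Hypothetical sketch only -- not the actual DatanodeHttpServer code.
public class GracefulShutdownSketch implements AutoCloseable {
  private final EventLoopGroup bossGroup = new NioEventLoopGroup(1);
  private final EventLoopGroup workerGroup = new NioEventLoopGroup();

  @Override
  public void close() {
    // Shut down the worker group first, then the boss group, following the
    // order recommended in the Netty user guide quoted above, so that
    // workerGroup.shutdownGracefully() is not called on a group that was
    // already closed as a side effect of shutting down the boss group
    // (which, per the comment above, spawns an extra thread).
    workerGroup.shutdownGracefully();
    bossGroup.shutdownGracefully();
  }

  public static void main(String[] args) {
    try (GracefulShutdownSketch server = new GracefulShutdownSketch()) {
      // In a real server the two groups would be handed to a ServerBootstrap here.
    }
  }
}
{code}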


[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer

2018-08-16 Thread Jonathan Hung (JIRA)


[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583037#comment-16583037 ]

Jonathan Hung commented on HDFS-13579:
--

FYI, I see this on branch-2 as well (which has neither HDFS-13251 nor HDFS-11600).

I don't think it's just this particular test either; running the hadoop-hdfs tests locally fails on different tests (depending on how far the test goal happens to get), e.g.

{noformat}
[INFO] Running org.apache.hadoop.TestRefreshCallQueue
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.574 s <<< FAILURE! - in org.apache.hadoop.TestRefreshCallQueue
[ERROR] testRefresh(org.apache.hadoop.TestRefreshCallQueue)  Time elapsed: 0.807 s  <<< ERROR!
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:557)
    at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
    at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
    at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:285)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1986)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNode(MiniDFSCluster.java:1892)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1882)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1861)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1835)
    at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1828)
    at org.apache.hadoop.TestRefreshCallQueue.tearDown(TestRefreshCallQueue.java:83)
{noformat}


[jira] [Commented] (HDFS-13579) Out of memory when running TestDFSStripedOutputStreamWithFailure testCloseWithExceptionsInStreamer

2018-05-17 Thread Ewan Higgs (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478753#comment-16478753 ]

Ewan Higgs commented on HDFS-13579:
---

ad1b988a828608b12cafb6382436cd17f95bfcc5 (HDFS-11600) might be a more likely 
candidate.



