Re: test failed due to OOME
Looks like SparkListenerSuite doesn't OOM on QA runs, unlike the Jenkins builds.

I wonder if this is due to a difference between the machines running QA tests and the machines running Jenkins builds.

On Fri, Oct 30, 2015 at 1:19 PM, Ted Yu wrote:
> I noticed that the SparkContext created in each sub-test is not stopped
> upon finishing sub-test.
>
> Would stopping each SparkContext make a difference in terms of heap memory
> consumption?
>
> Cheers
>
> On Fri, Oct 30, 2015 at 12:04 PM, Mridul Muralidharan wrote:
>> It is giving OOM at 32GB? Something looks wrong with that ... that is
>> already on the higher side.
>>
>> Regards,
>> Mridul
>>
>> On Fri, Oct 30, 2015 at 11:28 AM, shane knapp wrote:
>> > here's the current heap settings on our workers:
>> > InitialHeapSize == 2.1G
>> > MaxHeapSize == 32G
>> >
>> > system ram: 128G
>> >
>> > we can bump it pretty easily... it's just a matter of deciding if we
>> > want to do this globally (super easy, but will affect ALL maven builds
>> > on our system -- not just spark) or on a per-job basis (this doesn't
>> > scale that well).
>> >
>> > thoughts?
>> >
>> > On Fri, Oct 30, 2015 at 9:47 AM, Ted Yu wrote:
>> >> This happened recently on Jenkins:
>> >>
>> >> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.3,label=spark-test/3964/console
>> >>
>> >> On Sun, Oct 18, 2015 at 7:54 AM, Ted Yu wrote:
>> >>>
>> >>> From
>> >>> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=spark-test/3846/console
>> >>> :
>> >>>
>> >>> SparkListenerSuite:
>> >>> - basic creation and shutdown of LiveListenerBus
>> >>> - bus.stop() waits for the event queue to completely drain
>> >>> - basic creation of StageInfo
>> >>> - basic creation of StageInfo with shuffle
>> >>> - StageInfo with fewer tasks than partitions
>> >>> - local metrics
>> >>> - onTaskGettingResult() called when result fetched remotely *** FAILED ***
>> >>>   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.OutOfMemoryError: Java heap space
>> >>>   at java.util.Arrays.copyOf(Arrays.java:2271)
>> >>>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>> >>>   at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>> >>>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>> >>>   at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1852)
>> >>>   at java.io.ObjectOutputStream.write(ObjectOutputStream.java:708)
>> >>>   at org.apache.spark.util.Utils$.writeByteBuffer(Utils.scala:182)
>> >>>   at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply$mcV$sp(TaskResult.scala:52)
>> >>>   at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1160)
>> >>>   at org.apache.spark.scheduler.DirectTaskResult.writeExternal(TaskResult.scala:49)
>> >>>   at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1458)
>> >>>   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
>> >>>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>> >>>   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>> >>>   at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
>> >>>   at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
>> >>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:256)
>> >>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >>>   at java.lang.Thread.run(Thread.java:745)
>> >>>
>> >>> Should more heap be given to test suite?
>> >>>
>> >>> Cheers
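One way to check whether the QA and Jenkins machines really differ might be to print the heap limits from inside the JVM that runs the tests, and compare them with the InitialHeapSize and MaxHeapSize values quoted for the workers. The following is only a minimal sketch using the standard java.lang.management API; the class name HeapSettingsReport and its helper methods are made up for illustration and are not part of Spark or the Jenkins setup. If the build forks a separate JVM for tests, the numbers it prints may differ from the worker-wide defaults.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Spark): reports the heap limits of the running JVM.
final class HeapSettingsReport {

    // Formats a byte count in MiB; the management API reports -1 when a value is undefined.
    static String mib(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024L * 1024L)) + "MiB";
    }

    // Builds a one-line summary of initial, maximum, and currently used heap.
    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return "init=" + mib(heap.getInit())
            + " max=" + mib(heap.getMax())
            + " used=" + mib(heap.getUsed());
    }

    public static void main(String[] args) {
        // Run this on both kinds of machine (or log it at the start of a suite) and compare.
        System.out.println(describe());
    }
}
```

Logging this line at the start of the suite on both environments would show whether the test JVM actually receives the 32G limit or something much smaller.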
Re: test failed due to OOME
I believe this is some bug in our tests. For some reason we are using way more memory than necessary. We'll probably need to log into Jenkins and heap dump some running tests and figure out what is going on.

On Mon, Nov 2, 2015 at 7:42 AM, Ted Yu wrote:
> Looks like SparkListenerSuite doesn't OOM on QA runs compared to Jenkins
> builds.
>
> I wonder if this is due to difference between machines running QA tests vs
> machines running Jenkins builds.
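A heap dump of a running test JVM can be taken from outside with the JDK's jmap tool (jmap -dump:live,format=b,file=heap.hprof followed by the process id). As an alternative, here is a rough sketch of triggering a dump from inside the JVM through the HotSpot diagnostic MXBean. It assumes a HotSpot-based JDK; the class name HeapDumper and the temp-directory location are illustrative choices, not anything the Spark test harness provides.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Spark): writes an .hprof heap dump of the current JVM.
final class HeapDumper {

    // Dumps only live (reachable) objects into a fresh temp directory and returns the file path.
    static Path dumpToTempDir() {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            // dumpHeap refuses to overwrite, so use a file name that does not exist yet.
            Path out = dir.resolve("heap-" + System.nanoTime() + ".hprof");
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(out.toString(), true); // true = live objects only
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("wrote " + dumpToTempDir());
    }
}
```

The resulting file can be opened in a heap analyzer to see which objects dominate the heap while the suite runs.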
Re: test failed due to OOME
I noticed that the SparkContext created in each sub-test is not stopped upon finishing sub-test.

Would stopping each SparkContext make a difference in terms of heap memory consumption?

Cheers

On Fri, Oct 30, 2015 at 12:04 PM, Mridul Muralidharan wrote:
> It is giving OOM at 32GB? Something looks wrong with that ... that is
> already on the higher side.
>
> Regards,
> Mridul
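To illustrate the question about stopping each context, here is a small sketch of the mechanism only. FakeContext is a hypothetical stand-in, not the real SparkContext, and the static registry is an assumption made to model "state that stays reachable until stop() is called". Whether the real SparkContext retains memory this way is exactly what is being asked, so this shows the pattern (stop in a finally block per sub-test), not a diagnosis.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch under stated assumptions: FakeContext is a made-up stand-in for a SparkContext.
final class ContextLifecycleSketch {

    static final class FakeContext implements AutoCloseable {
        // Assumed for illustration: a registry that keeps live contexts reachable.
        static final List<FakeContext> LIVE = new ArrayList<>();
        private byte[] buffers;

        FakeContext(int bytes) {
            buffers = new byte[bytes];
            LIVE.add(this);
        }

        // Releases the buffer and removes this context from the registry.
        void stop() {
            buffers = null;
            LIVE.remove(this);
        }

        @Override
        public void close() {
            stop();
        }
    }

    // Runs n sub-tests and returns how many contexts are still reachable afterwards.
    static int runSubTests(int n, boolean stopAfterEach) {
        FakeContext.LIVE.clear();
        for (int i = 0; i < n; i++) {
            FakeContext ctx = new FakeContext(1024);
            try {
                // the body of the sub-test would go here
            } finally {
                if (stopAfterEach) {
                    ctx.stop();
                }
            }
        }
        return FakeContext.LIVE.size();
    }

    public static void main(String[] args) {
        System.out.println("not stopped: " + runSubTests(5, false) + " contexts still reachable");
        System.out.println("stopped:     " + runSubTests(5, true) + " contexts still reachable");
    }
}
```

In this toy model every unstopped context stays reachable for the rest of the run, so the retained memory grows with the number of sub-tests.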
test failed due to OOME
From https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=spark-test/3846/console :

SparkListenerSuite:
- basic creation and shutdown of LiveListenerBus
- bus.stop() waits for the event queue to completely drain
- basic creation of StageInfo
- basic creation of StageInfo with shuffle
- StageInfo with fewer tasks than partitions
- local metrics
- onTaskGettingResult() called when result fetched remotely *** FAILED ***
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.OutOfMemoryError: Java heap space
  at java.util.Arrays.copyOf(Arrays.java:2271)
  at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
  at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
  at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
  at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1852)
  at java.io.ObjectOutputStream.write(ObjectOutputStream.java:708)
  at org.apache.spark.util.Utils$.writeByteBuffer(Utils.scala:182)
  at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply$mcV$sp(TaskResult.scala:52)
  at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1160)
  at org.apache.spark.scheduler.DirectTaskResult.writeExternal(TaskResult.scala:49)
  at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1458)
  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
  at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
  at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:256)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)

Should more heap be given to test suite?

Cheers
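The top frames of the trace show the error being raised inside Arrays.copyOf, called from ByteArrayOutputStream.grow, i.e. while the in-memory serialization buffer was being enlarged. During such a copy the old and the new array are both live, so the peak footprint is larger than the final payload. The sketch below simulates that arithmetic only. It assumes a "double the capacity, or jump to the needed size if doubling is not enough" policy, which is how I understand the JDK 7/8 implementation to behave; the class name, the chunk size, and the initial capacity are illustrative parameters, not values taken from the failing build.

```java
// Simulation only: no real buffers are allocated, just the capacity arithmetic.
final class BufferGrowthSketch {

    // Returns the largest (old + new) capacity that is live at once while
    // totalBytes are written in chunks into a buffer that grows by doubling.
    static long peakTransientBytes(long totalBytes, int chunk, long initialCapacity) {
        long capacity = initialCapacity;
        long count = 0;
        long peak = capacity;
        while (count < totalBytes) {
            long needed = count + Math.min(chunk, totalBytes - count);
            if (needed > capacity) {
                // Assumed growth policy: double, or jump straight to what is needed.
                long newCapacity = Math.max(capacity * 2, needed);
                // While copying, both the old and the new array occupy heap.
                peak = Math.max(peak, capacity + newCapacity);
                capacity = newCapacity;
            }
            count = needed;
        }
        return peak;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L;
        // Hypothetical payload size, chosen only to show the effect.
        System.out.println("peak bytes for a 1 GiB payload: "
            + peakTransientBytes(oneGiB, 1024, 32));
    }
}
```

Under these assumptions a payload of size N can briefly need well over N bytes of contiguous arrays, so a large task result may hit the limit during serialization even when the steady-state heap usage looks modest.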