OK, thank you. What do you suggest I do to get rid of the error?
________________________________
From: Ted Yu <yuzhih...@gmail.com>
Sent: Wednesday, August 3, 2016 6:10 PM
To: Rychnovsky, Dusan
Cc: user@spark.apache.org
Subject: Re: Managed memory leak detected + OutOfMemoryError: Unable to acquire X bytes of memory, got 0

The latest QA run is no longer accessible (error 404):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59141/consoleFull

Judging from the comments on the PR, there is not enough confidence to pull the fix into 1.6.

On Wed, Aug 3, 2016 at 9:05 AM, Rychnovsky, Dusan <dusan.rychnov...@firma.seznam.cz> wrote:

I am confused. I tried to look for a Spark release that has this issue fixed, i.e. one with https://github.com/apache/spark/pull/13027/ merged in, but it looks like the patch has not been merged for 1.6.

How do I get a fixed 1.6 version?

Thanks,
Dusan

[SPARK-4452][SPARK-11293][Core][BRANCH-1.6] Shuffle data structures can starve others on the same thread for memory by lianhuiwang · Pull Request #13027 · apache/spark · GitHub
What changes were proposed in this pull request? This PR is for the branch-1.6 version of the commits in PR #10024. In #9241 it implemented a mechanism to call spill() on those SQL operators that sup...
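For reference, one way to get a 1.6 build with the unmerged patch is to fetch the pull-request branch from GitHub and build Spark from source. A minimal sketch; the local branch name, the --name label, and the Hadoop profile below are illustrative assumptions, not instructions from this thread:

    # Fetch the branch-1.6 backport from pull request #13027 and build it.
    git clone https://github.com/apache/spark.git
    cd spark
    git fetch origin pull/13027/head:pr-13027
    git checkout pr-13027
    # make-distribution.sh produces a runnable Spark 1.6.x distribution;
    # the --name label and the -Phadoop-2.6 profile are assumptions, adjust to your setup.
    ./make-distribution.sh --name patched-1.6 --tgz -Phadoop-2.6

The resulting tarball can then be deployed in place of the stock 1.6.0 distribution.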
________________________________
From: Rychnovsky, Dusan
Sent: Wednesday, August 3, 2016 3:58 PM
To: Ted Yu
Cc: user@spark.apache.org
Subject: Re: Managed memory leak detected + OutOfMemoryError: Unable to acquire X bytes of memory, got 0

Yes, I believe I'm using Spark 1.6.0.

> spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

I don't understand the ticket. It says "Fixed in 1.6.0". I have 1.6.0 and therefore should have it fixed, right? Or what do I do to fix it?

Thanks,
Dusan

________________________________
From: Ted Yu <yuzhih...@gmail.com>
Sent: Wednesday, August 3, 2016 3:52 PM
To: Rychnovsky, Dusan
Cc: user@spark.apache.org
Subject: Re: Managed memory leak detected + OutOfMemoryError: Unable to acquire X bytes of memory, got 0

Are you using Spark 1.6+? See SPARK-11293.

On Wed, Aug 3, 2016 at 5:03 AM, Rychnovsky, Dusan <dusan.rychnov...@firma.seznam.cz> wrote:

Hi,

I have a Spark workflow that works fine when run on a relatively small portion of data, but fails with strange errors when run on big data. In the log files of failed executors I found the following errors.

Firstly,

> Managed memory leak detected; size = 263403077 bytes, TID = 6524

and then a series of

> java.lang.OutOfMemoryError: Unable to acquire 241 bytes of memory, got 0
>   at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
>   at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:346)
>   at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:367)
>   at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
>   at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)

The job keeps failing in the same way (I tried a few times). What could be causing such an error?

I have a feeling that I'm not providing enough context to understand the issue. Please ask for any other information needed.

Thank you,
Dusan
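Until a patched build is in place, mitigations commonly suggested for this class of shuffle OOM aim to reduce per-task memory pressure: run more, smaller shuffle partitions and fewer concurrent tasks per executor. A hypothetical invocation follows; these are general suggestions rather than advice given in this thread, and the class name, jar, and values are illustrative assumptions:

    # Hypothetical example; tune values to your cluster.
    # More shuffle partitions -> smaller per-task memory footprint;
    # fewer cores per executor -> fewer tasks competing for the same memory pool.
    spark-submit \
      --conf spark.sql.shuffle.partitions=2000 \
      --conf spark.executor.cores=2 \
      --conf spark.executor.memory=8g \
      --class com.example.MyJob my-job.jar

For plain RDD jobs, the equivalent is to raise the numPartitions argument of the wide transformation, e.g. reduceByKey(func, 2000).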