@Nezih, can you try again after setting `spark.memory.useLegacyMode` to true? Can you still reproduce the OOM that way?
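In case it helps, here is a minimal sketch of one way to turn that setting on, assuming the application builds its own `SparkConf`. The object name and app name are placeholders, not taken from your job. In 1.6 this flag switches back to the pre-1.6 static memory management.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver used only to illustrate where the flag goes.
object LegacyMemoryModeCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("legacy-memory-mode-check") // placeholder app name
      // Fall back to the pre-1.6 static memory manager.
      .set("spark.memory.useLegacyMode", "true")

    // The master is expected to come from spark-submit (e.g. --master yarn).
    val sc = new SparkContext(conf)
    try {
      // Re-run the jobs that previously hit the OOM here.
    } finally {
      sc.stop()
    }
  }
}
```

The same setting can be passed without a code change via `--conf spark.memory.useLegacyMode=true` on `spark-submit`.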
2016-03-21 10:29 GMT-07:00 Nezih Yigitbasi <nyigitb...@netflix.com.invalid>:

> Hi Spark devs,
> I am using 1.6.0 with dynamic allocation on yarn. I am trying to run a
> relatively big application with 10s of jobs and 100K+ tasks and my app
> fails with the exception below. The closest jira issue I could find is
> SPARK-11293 <https://issues.apache.org/jira/browse/SPARK-11293>, which is
> a critical bug that has been open for a long time. There are other similar
> jira issues (all fixed):
> SPARK-10474 <https://issues.apache.org/jira/browse/SPARK-10474>,
> SPARK-10733 <https://issues.apache.org/jira/browse/SPARK-10733>,
> SPARK-10309 <https://issues.apache.org/jira/browse/SPARK-10309>,
> SPARK-10379 <https://issues.apache.org/jira/browse/SPARK-10379>.
>
> Any workarounds to this issue or any plans to fix it?
>
> Thanks a lot,
> Nezih
>
> 16/03/19 05:12:09 INFO memory.TaskMemoryManager: Memory used in task 46870
> 16/03/19 05:12:09 INFO memory.TaskMemoryManager: Acquired by org.apache.spark.shuffle.sort.ShuffleExternalSorter@1c36f801: 32.0 KB
> 16/03/19 05:12:09 INFO memory.TaskMemoryManager: 1512915599 bytes of memory were used by task 46870 but are not associated with specific consumers
> 16/03/19 05:12:09 INFO memory.TaskMemoryManager: 1512948367 bytes of memory are used for execution and 156978343 bytes of memory are used for storage
> 16/03/19 05:12:09 ERROR executor.Executor: Managed memory leak detected; size = 1512915599 bytes, TID = 46870
> 16/03/19 05:12:09 ERROR executor.Executor: Exception in task 77.0 in stage 273.0 (TID 46870)
> java.lang.OutOfMemoryError: Unable to acquire 128 bytes of memory, got 0
>     at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
>     at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:354)
>     at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:375)
>     at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
>     at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> 16/03/19 05:12:09 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-8,5,main]
> java.lang.OutOfMemoryError: Unable to acquire 128 bytes of memory, got 0
>     at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:120)
>     at org.apache.spark.shuffle.sort.ShuffleExternalSorter.acquireNewPageIfNecessary(ShuffleExternalSorter.java:354)
>     at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:375)
>     at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:237)
>     at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:164)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> 16/03/19 05:12:10 INFO storage.DiskBlockManager: Shutdown hook called
> 16/03/19 05:12:10 INFO util.ShutdownHookManager: Shutdown hook called