What value do you use for spark.yarn.executor.memoryOverhead? Please see https://spark.apache.org/docs/latest/running-on-yarn.html for a description of the parameter.
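For example, you could raise it at submit time along these lines (a sketch only — the 16g executor size, the 2048 MB overhead, and the application jar name are illustrative placeholders, not values taken from your job):

```shell
# Sketch: ask YARN to reserve more off-heap overhead per executor.
# YARN enforces executor memory + overhead as the container limit, so with
# a 384 MB overhead a 16 GB executor gets a 16384 + 384 = 16768 MB container,
# matching the "including 384 MB overhead" line in the logs below.
spark-submit \
  --master yarn \
  --executor-memory 16g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  your-app.jar
```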
Which Spark release are you using? Cheers

On Tue, Feb 2, 2016 at 1:38 PM, Jakob Odersky <ja...@odersky.com> wrote:
> Can you share some code that produces the error? It is probably not
> due to spark but rather the way data is handled in the user code.
> Does your code call any reduceByKey actions? These are often a source
> for OOM errors.
>
> On Tue, Feb 2, 2016 at 1:22 PM, Stefan Panayotov <spanayo...@msn.com> wrote:
> > Hi Guys,
> >
> > I need help with Spark memory errors when executing ML pipelines.
> > The error that I see is:
> >
> > 16/02/02 20:34:17 INFO Executor: Executor is trying to kill task 32.0 in stage 32.0 (TID 3298)
> > 16/02/02 20:34:17 INFO Executor: Executor is trying to kill task 12.0 in stage 32.0 (TID 3278)
> > 16/02/02 20:34:39 INFO MemoryStore: ensureFreeSpace(2004728720) called with curMem=296303415, maxMem=8890959790
> > 16/02/02 20:34:39 INFO MemoryStore: Block taskresult_3298 stored as bytes in memory (estimated size 1911.9 MB, free 6.1 GB)
> > 16/02/02 20:34:39 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
> > 16/02/02 20:34:39 ERROR Executor: Exception in task 12.0 in stage 32.0 (TID 3278)
> > java.lang.OutOfMemoryError: Java heap space
> >         at java.util.Arrays.copyOf(Arrays.java:2271)
> >         at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
> >         at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:86)
> >         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:256)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:745)
> > 16/02/02 20:34:39 INFO DiskBlockManager: Shutdown hook called
> > 16/02/02 20:34:39 INFO Executor: Finished task 32.0 in stage 32.0 (TID 3298). 2004728720 bytes result sent via BlockManager
> > 16/02/02 20:34:39 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-8,5,main]
> > java.lang.OutOfMemoryError: Java heap space
> >         at java.util.Arrays.copyOf(Arrays.java:2271)
> >         at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
> >         at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:86)
> >         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:256)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:745)
> > 16/02/02 20:34:39 INFO ShutdownHookManager: Shutdown hook called
> > 16/02/02 20:34:39 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
> > 16/02/02 20:34:39 INFO MetricsSinkAdapter: azurefs2 thread interrupted.
> > 16/02/02 20:34:39 INFO MetricsSystemImpl: azure-file-system metrics system stopped.
> > 16/02/02 20:34:39 INFO MetricsSystemImpl: azure-file-system metrics system shutdown complete.
> >
> > And …..
> >
> > 16/02/02 20:09:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 10.0.0.5:30050
> > 16/02/02 20:33:51 INFO yarn.YarnAllocator: Completed container container_1454421662639_0011_01_000005 (state: COMPLETE, exit status: -104)
> > 16/02/02 20:33:51 WARN yarn.YarnAllocator: Container killed by YARN for exceeding memory limits. 16.8 GB of 16.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
> > 16/02/02 20:33:56 INFO yarn.YarnAllocator: Will request 1 executor containers, each with 2 cores and 16768 MB memory including 384 MB overhead
> > 16/02/02 20:33:56 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:16768, vCores:2>)
> > 16/02/02 20:33:57 INFO yarn.YarnAllocator: Launching container container_1454421662639_0011_01_000037 for on host 10.0.0.8
> > 16/02/02 20:33:57 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@10.0.0.15:47446/user/CoarseGrainedScheduler, executorHostname: 10.0.0.8
> > 16/02/02 20:33:57 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
> >
> > I'll really appreciate any help here.
> >
> > Thank you,
> >
> > Stefan Panayotov, PhD
> > Home: 610-355-0919
> > Cell: 610-517-5586
> > email: spanayo...@msn.com
> > spanayo...@outlook.com
> > spanayo...@comcast.net