Try setting --driver-memory Xg, with X as large as the driver host can spare.
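For example, something along these lines (the class name, master URL, and jar here are only placeholders for your own job; adjust the sizes to your machines):

  spark-submit \
    --class com.example.YourJob \
    --master <your-master-url> \
    --driver-memory 8g \
    --executor-memory 10g \
    your-job.jar

The OutOfMemoryError is thrown in the "dag-scheduler-event-loop" thread, which runs on the driver, so it is most likely the driver heap (not executor memory) that needs to grow as the partition count climbs.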
On Monday, July 18, 2016 6:31 PM, Saurav Sinha <sauravsinh...@gmail.com> wrote:

Hi,

I am running a Spark job. Master memory: 5G, executor memory: 10G (running on 4 nodes). My job is getting killed as the number of partitions increases to 20K.

16/07/18 14:53:13 INFO DAGScheduler: Got job 17 (foreachPartition at WriteToKafka.java:45) with 13524 output partitions (allowLocal=false)
16/07/18 14:53:13 INFO DAGScheduler: Final stage: ResultStage 640 (foreachPartition at WriteToKafka.java:45)
16/07/18 14:53:13 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 518, ShuffleMapStage 639)
16/07/18 14:53:23 INFO DAGScheduler: Missing parents: List()
16/07/18 14:53:23 INFO DAGScheduler: Submitting ResultStage 640 (MapPartitionsRDD[271] at map at BuildSolrDocs.java:209), which has no missing parents
16/07/18 14:53:23 INFO MemoryStore: ensureFreeSpace(8248) called with curMem=41923262, maxMem=2778778828
16/07/18 14:53:23 INFO MemoryStore: Block broadcast_90 stored as values in memory (estimated size 8.1 KB, free 2.5 GB)
Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space
        at org.apache.spark.util.io.ByteArrayChunkOutputStream.allocateNewChunkIfNeeded(ByteArrayChunkOutputStream.scala:66)
        at org.apache.spark.util.io.ByteArrayChunkOutputStream.write(ByteArrayChunkOutputStream.scala:55)
        at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:294)
        at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:273)
        at org.apache.spark.io.SnappyOutputStreamWrapper.flush(CompressionCodec.scala:197)
        at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822)

Help needed.

--
Thanks and Regards,
Saurav Sinha
Contact: 9742879062