Re: Spark driver getting out of memory

2016-07-24 Thread Raghava Mutharaju
Saurav, we have the same issue. Our application runs fine on 32 nodes with 4 cores each and 256 partitions, but throws an OOM on the driver when run on 64 nodes with 512 partitions. Did you find out the reason behind this behavior, or the relation between the number of partitions and driver RAM?

Re: Spark driver getting out of memory

2016-07-20 Thread RK Aduri
Cache defaults to MEMORY_ONLY. Can you try different storage levels, e.g., MEMORY_ONLY_SER or even DISK_ONLY? You may want to use persist() instead of cache(). There is also an experimental storage level, OFF_HEAP, which might help. On Tue, Jul 19, 2016 at 11:08 PM, Saurav Sinha
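
A minimal Scala sketch of the storage-level suggestion; the RDD and its contents are hypothetical. Note an RDD can only be persisted at one level, so the alternatives are shown commented out:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object StorageLevelSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("storage-level-sketch"))
        val rdd = sc.parallelize(1 to 1000000)

        // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
        // MEMORY_ONLY_SER keeps serialized bytes: smaller footprint, more CPU on access.
        rdd.persist(StorageLevel.MEMORY_ONLY_SER)
        // rdd.persist(StorageLevel.DISK_ONLY) // keep blocks entirely on disk
        // rdd.persist(StorageLevel.OFF_HEAP)  // experimental off-heap storage

        println(rdd.count())
        sc.stop()
      }
    }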

Re: Spark driver getting out of memory

2016-07-20 Thread Saurav Sinha
Hi, I have set driver memory to 10 GB and the job ran with intermediate failures, which Spark recovered from. But I still want to know: as the number of partitions increases, does driver RAM need to be increased, and what is the ratio of number of partitions to RAM? @RK: I am using cache on the RDD. Is this the reason for the high RAM

Re: Spark driver getting out of memory

2016-07-19 Thread RK Aduri
Just want to see if this helps. Are you doing heavy collect()s and then persisting the result? If so, you might want to parallelize that collection by converting it back to an RDD. Thanks, RK. On Tue, Jul 19, 2016 at 12:09 AM, Saurav Sinha wrote: > Hi Mich, > > 1. In what mode are
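
A minimal sketch of this suggestion, assuming the job collects a large result to the driver and then works on it there; all names are hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}

    object ParallelizeCollectSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("parallelize-collect-sketch"))
        val big = sc.parallelize(1 to 10000000)

        // Anti-pattern: collect() pulls every element into the driver JVM,
        // so the driver heap must hold the whole result.
        val onDriver: Array[Int] = big.filter(_ % 2 == 0).collect()

        // Suggested fix: convert the collection back into an RDD so further
        // work (including persisting) is distributed across executors.
        val backOnCluster = sc.parallelize(onDriver, numSlices = 256)
        backOnCluster.persist()
        println(backOnCluster.count())

        sc.stop()
      }
    }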

Re: Spark driver getting out of memory

2016-07-19 Thread Saurav Sinha
Hi Mich, 1. In what mode are you running Spark: standalone, yarn-client, yarn-cluster, etc.? Ans: Spark standalone. 2. You have 4 nodes with each executor having 10G. How many actual executors do you see in the UI (port 4040 by default)? Ans: There are 4 executors, on which I am using 8

Re: Spark driver getting out of memory

2016-07-18 Thread Mich Talebzadeh
Can you please clarify: 1. In what mode are you running Spark: standalone, yarn-client, yarn-cluster, etc.? 2. You have 4 nodes with each executor having 10G. How many actual executors do you see in the UI (port 4040 by default)? 3. What is master memory? Are you referring to driver

Re: Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
I have set --driver-memory 5g. I need to understand whether, as the number of partitions increases, driver memory needs to be increased. What would be the best ratio of number of partitions to driver memory? On Mon, Jul 18, 2016 at 4:07 PM, Zhiliang Zhu wrote: > try to set --driver-memory xg, x would be

Re: Spark driver getting out of memory

2016-07-18 Thread Zhiliang Zhu
Try to set --driver-memory xg, where x is as large as can be set. On Monday, July 18, 2016 6:31 PM, Saurav Sinha wrote: Hi, I am running a Spark job. Master memory - 5G, executor memory 10G (running on 4 nodes). My job is getting killed as the number of partitions increases
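
For reference, a hypothetical spark-submit invocation; the class, host, and jar names are placeholders. The flag is spelled --driver-memory, and in client mode it must be set on the command line or in spark-defaults.conf (setting it in SparkConf has no effect there, since the driver JVM has already started):

    # Hypothetical invocation; class, host, and jar names are placeholders.
    spark-submit \
      --master spark://master-host:7077 \
      --class com.example.WriteToKafka \
      --driver-memory 10g \
      --executor-memory 10g \
      write-to-kafka.jar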

Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
Hi, I am running a Spark job. Master memory - 5G, executor memory 10G (running on 4 nodes). My job is getting killed as the number of partitions increases to 20K. 16/07/18 14:53:13 INFO DAGScheduler: Got job 17 (foreachPartition at WriteToKafka.java:45) with 13524 output partitions (allowLocal=false) 16/07/18
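
For context, a minimal Scala sketch of the foreachPartition pattern the DAGScheduler line refers to; the broker address, topic name, and record type are assumptions, since WriteToKafka.java itself is not shown. Each output partition becomes one task, and the driver tracks metadata for every task, which is one way partition count translates into driver memory pressure:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
    import org.apache.spark.{SparkConf, SparkContext}

    object WriteToKafkaSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("write-to-kafka-sketch"))
        val rdd = sc.parallelize(1 to 1000000).map(_.toString)

        // One task per partition: with ~13,524 output partitions the driver
        // schedules and tracks ~13,524 tasks for this job alone.
        rdd.foreachPartition { records =>
          val props = new Properties()
          props.put("bootstrap.servers", "broker-host:9092") // assumed broker
          props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer")
          props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer")
          val producer = new KafkaProducer[String, String](props)
          // "events" is an assumed topic name
          records.foreach(r => producer.send(new ProducerRecord[String, String]("events", r)))
          producer.close()
        }
        sc.stop()
      }
    }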