Re: OOM on the driver after increasing partitions

2016-06-22 Thread Raghava Mutharaju
Thank you. Sure, if I find something I will post it.

Regards,
Raghava.

On Wed, Jun 22, 2016 at 7:43 PM, Nirav Patel wrote:
> I believe it would be task, partitions, task status etc information. I do
> not know exact of those things but I had OOM on driver with 512MB and

Re: OOM on the driver after increasing partitions

2016-06-22 Thread Nirav Patel
I believe it would be task, partition, and task-status information, etc. I do not know the exact details, but I had an OOM on the driver with 512MB, and increasing it did help. Someone else might be better able to answer about the exact memory usage of the driver. You also seem to use broadcast, which means sending

Re: OOM on the driver after increasing partitions

2016-06-22 Thread Raghava Mutharaju
Ok. Would you be able to shed more light on exactly what metadata it manages and how that relates to a larger number of partitions/nodes? There is one executor running on each node, so there are 64 executors in total. Each executor, as well as the driver, is given 12GB, and this is the maximum

Re: OOM on the driver after increasing partitions

2016-06-22 Thread Nirav Patel
Yes, the driver keeps a fair amount of metadata to manage scheduling across all your executors. I assume with 64 nodes you have more executors as well. A simple way to test is to increase the driver memory.

On Wed, Jun 22, 2016 at 10:10 AM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:
> It is an
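A minimal sketch of the "increase driver memory" test suggested above, assuming the application is launched with spark-submit and picks up conf/spark-defaults.conf. The 4g value is only a placeholder, not a figure from this thread.

```properties
# conf/spark-defaults.conf
# Heap size for the driver JVM. The value below is an illustrative
# assumption; choose it based on what the driver node can spare.
spark.driver.memory   4g
```

In client mode the same setting can instead be passed as --driver-memory on the spark-submit command line. Setting it through SparkConf inside the application has no effect there, because the driver JVM has already started by that point.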

Re: OOM on the driver after increasing partitions

2016-06-22 Thread Raghava Mutharaju
It is an iterative algorithm that uses map, mapPartitions, join, union, filter, broadcast and count. The goal is to compute a set of tuples, and in each iteration a few tuples are added to it. The outline is given below:

1) Start with an initial set of tuples, T
2) In each iteration compute deltaT, and add

Re: OOM on the driver after increasing partitions

2016-06-22 Thread Sonal Goyal
What does your application do?

Best Regards,
Sonal
Founder, Nube Technologies

OOM on the driver after increasing partitions

2016-06-22 Thread Raghava Mutharaju
Hello All, We have a Spark cluster where the driver and master are running on the same node. We are using the Spark Standalone cluster manager. If the number of nodes (and partitions) is increased, the same dataset that used to run to completion on fewer nodes now gives an out of