I believe it would be information about tasks, partitions, task statuses,
etc. I do not know the exact details of those things, but I had an OOM on
the driver with 512MB and increasing it did help. Someone else might be
able to answer the question of the driver's exact memory usage better.

You also seem to use broadcast, which means sending something from the
driver JVM. You can try taking a heap dump when your driver memory is
nearly full, or set JVM args to take one automatically on an
OutOfMemoryError. Analyze it and share your findings :)
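
For reference, one way to pass those JVM args to the driver at submit time
(the dump path is just an example; adjust it for your machine):

    spark-submit \
      --driver-java-options "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver.hprof" \
      ...

The resulting .hprof file can then be opened in a tool such as Eclipse MAT
or jvisualvm to see which objects dominate the driver heap.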



On Wed, Jun 22, 2016 at 4:33 PM, Raghava Mutharaju <
m.vijayaragh...@gmail.com> wrote:

> Ok. Would you be able to shed more light on exactly what metadata it
> manages, and on its relationship to the number of partitions/nodes?
>
> There is one executor running on each node -- so there are 64 executors in
> total. Each executor, as well as the driver, is given 12GB, and this is the
> maximum available limit. So the other options are
>
> 1) Separate the driver from the master, i.e., run them on two separate
> nodes (see the example below)
> 2) Increase the RAM capacity on the driver/master node.
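>
> For option 1, assuming the job is launched with spark-submit, standalone
> cluster deploy mode places the driver on one of the worker nodes rather
> than on the submitting (master) machine, e.g.:
>
>     spark-submit --master spark://<master-host>:7077 \
>       --deploy-mode cluster \
>       --driver-memory 12g \
>       your-app.jar
>
> (your-app.jar is a placeholder; in this mode the jar must be reachable
> from the worker node that ends up hosting the driver.)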
>
> Regards,
> Raghava.
>
>
> On Wed, Jun 22, 2016 at 7:05 PM, Nirav Patel <npa...@xactlycorp.com>
> wrote:
>
>> Yes, the driver keeps a fair amount of metadata to manage scheduling
>> across all of your executors. I assume that with 64 nodes you have more
>> executors as well. A simple way to test this is to increase the driver
>> memory.
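>>
>> For example (16g is just an illustrative bump, not a tuned value):
>>
>>     spark-submit --driver-memory 16g ...
>>
>> or equivalently spark.driver.memory in spark-defaults.conf. Note that in
>> client mode this cannot be set via SparkConf inside the application,
>> since the driver JVM has already started by the time that code runs.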
>>
>> On Wed, Jun 22, 2016 at 10:10 AM, Raghava Mutharaju <
>> m.vijayaragh...@gmail.com> wrote:
>>
>>> It is an iterative algorithm which uses map, mapPartitions, join, union,
>>> filter, broadcast and count. The goal is to compute a set of tuples, and
>>> in each iteration a few tuples are added to it. An outline is given below
>>> (a rough sketch in code follows it).
>>>
>>> 1) Start with an initial set of tuples, T
>>> 2) In each iteration compute deltaT, and add it to T, i.e., T = T +
>>> deltaT
>>> 3) Stop when the current T size (count) is the same as the previous T
>>> size, i.e., deltaT is empty.
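>>>
>>> A minimal Scala sketch of that loop, with computeDelta standing in for
>>> the application-specific map/mapPartitions/join/filter logic (the names
>>> and the toy delta step are hypothetical):
>>>
>>>     import org.apache.spark.{SparkConf, SparkContext}
>>>     import org.apache.spark.rdd.RDD
>>>
>>>     object FixpointSketch {
>>>       // Stand-in for the real, application-specific delta computation.
>>>       def computeDelta(t: RDD[(Int, Int)]): RDD[(Int, Int)] =
>>>         t.map { case (a, b) => (b, a) }
>>>
>>>       def main(args: Array[String]): Unit = {
>>>         val sc = new SparkContext(new SparkConf().setAppName("fixpoint"))
>>>         var t: RDD[(Int, Int)] = sc.parallelize(Seq((1, 2), (2, 3))).cache()
>>>         var prevCount = -1L
>>>         var count = t.count()
>>>         // Iterate until no new tuples are added, i.e., deltaT is empty.
>>>         while (count != prevCount) {
>>>           val deltaT = computeDelta(t)
>>>           t = t.union(deltaT).distinct().cache()  // T = T + deltaT
>>>           prevCount = count
>>>           count = t.count()  // ships only a single Long to the driver
>>>         }
>>>         sc.stop()
>>>       }
>>>     }
>>>
>>> Each count() launches a job, so the number of stages and tasks the driver
>>> must track grows with both the iteration count and the partition count.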
>>>
>>> Do you think something happens on the driver due to the application
>>> logic when the number of partitions is increased?
>>>
>>> Regards,
>>> Raghava.
>>>
>>>
>>> On Wed, Jun 22, 2016 at 12:33 PM, Sonal Goyal <sonalgoy...@gmail.com>
>>> wrote:
>>>
>>>> What does your application do?
>>>>
>>>> Best Regards,
>>>> Sonal
>>>> Founder, Nube Technologies <http://www.nubetech.co>
>>>> Reifier at Strata Hadoop World
>>>> <https://www.youtube.com/watch?v=eD3LkpPQIgM>
>>>> Reifier at Spark Summit 2015
>>>> <https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/>
>>>>
>>>> <http://in.linkedin.com/in/sonalgoyal>
>>>>
>>>>
>>>>
>>>> On Wed, Jun 22, 2016 at 9:57 PM, Raghava Mutharaju <
>>>> m.vijayaragh...@gmail.com> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> We have a Spark cluster where the driver and master are running on the
>>>>> same node. We are using the Spark Standalone cluster manager. If the
>>>>> number of nodes (and the partitions) is increased, the same dataset that
>>>>> used to run to completion on a smaller number of nodes now gives an
>>>>> out-of-memory error on the driver.
>>>>>
>>>>> For example, a dataset that runs on 32 nodes with the number of
>>>>> partitions set to 256 completes, whereas the same dataset run on 64
>>>>> nodes with 512 partitions gives an OOM on the driver side.
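>>>>>
>>>>> (Assuming here that the partition count is set in one of the usual
>>>>> ways, e.g. spark.default.parallelism in the configuration, or an
>>>>> explicit numPartitions argument such as rdd.repartition(512) or
>>>>> rdd.join(other, 512).)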
>>>>>
>>>>> From what I read in the Spark documentation and other articles, the
>>>>> following are the responsibilities of the driver/master.
>>>>>
>>>>> 1) create spark context
>>>>> 2) build DAG of operations
>>>>> 3) schedule tasks
>>>>>
>>>>> I am guessing that 1) and 2) should not change with the number of
>>>>> nodes/partitions. So is it that, since the driver has to keep track of
>>>>> a lot more tasks, it gives an OOM?
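>>>>>
>>>>> (Rough back-of-envelope: with 512 partitions and, say, 5 stages per
>>>>> iteration, that is about 2,560 task entries per iteration for the
>>>>> driver to track -- double the 256-partition run -- though I do not know
>>>>> how much memory each entry takes.)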
>>>>>
>>>>> What could be the possible reasons behind the driver-side OOM when the
>>>>> number of partitions is increased?
>>>>>
>>>>> Regards,
>>>>> Raghava.
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Raghava
>>> http://raghavam.github.io
>>>
>>
>
>
>
>
> --
> Regards,
> Raghava
> http://raghavam.github.io
>
