The parameter spark.yarn.executor.memoryOverhead

2017-10-30 Thread Ashok Kumar
Hi Gurus,

The parameter spark.yarn.executor.memoryOverhead is explained as below:

spark.yarn.executor.memoryOverhead
Default: executorMemory * 0.10, with a minimum of 384
The amount of off-heap memory (in megabytes) to be allocated per executor. This
is memory that accounts for things like VM overheads, interned strings, and
other native overheads. This tends to grow with the executor size (typically
6-10%).
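
As a rough sketch of that rule (assuming the 0.10 factor and the 384 MB floor
quoted above; the object and names below are illustrative, not Spark's actual
code):

// Illustrative only: the default rule is max(10% of executor memory, 384 MB);
// YARN then reserves executor memory plus this overhead for the container.
object OverheadSketch {
  val OverheadFactor = 0.10
  val OverheadMinMb  = 384

  def overheadMb(executorMemoryMb: Int): Int =
    math.max((OverheadFactor * executorMemoryMb).toInt, OverheadMinMb)

  def main(args: Array[String]): Unit = {
    println(overheadMb(10240))          // 1024 (~1 GB for a 10 GB executor)
    println(10240 + overheadMb(10240))  // 11264 (container size YARN reserves)
  }
}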

 

So does that mean that for an executor of 10 GB this should ideally be set to
~10% = 1 GB?

What would happen if we set it higher, say to 30% (~3 GB)?
What is this memory exactly used for (as opposed to the memory allocated to
the executor)?

Thanking you

Re: spark.yarn.executor.memoryOverhead

2016-11-23 Thread Saisai Shao
From my understanding, this memory overhead should include
"spark.memory.offHeap.size", which means the off-heap memory size should not be
larger than the overhead memory size when running on YARN.
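
In practice that means sizing the overhead yourself to cover the configured
off-heap allocation plus the usual native overhead. A rough sketch, with purely
illustrative sizes and a placeholder application jar:

spark-submit \
  --executor-memory 8g \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=2g \
  --conf spark.yarn.executor.memoryOverhead=3072 \
  your-app.jar

(3072 MB here is just 2 GB for the off-heap allocation plus roughly 1 GB for
the usual native overhead.)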

On Thu, Nov 24, 2016 at 3:01 AM, Koert Kuipers  wrote:

> In YarnAllocator I see that memoryOverhead is by default set to
> math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toInt, MEMORY_OVERHEAD_MIN)
>
> This does not take spark.memory.offHeap.size into account, I think. Should
> it?
>
> Something like:
>
> math.max((MEMORY_OVERHEAD_FACTOR * executorMemory + offHeapMemory).toInt,
> MEMORY_OVERHEAD_MIN)
>
>


spark.yarn.executor.memoryOverhead

2016-11-23 Thread Koert Kuipers
In YarnAllocator I see that memoryOverhead is by default set to
math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toInt, MEMORY_OVERHEAD_MIN)

This does not take spark.memory.offHeap.size into account, I think. Should
it?

Something like:

math.max((MEMORY_OVERHEAD_FACTOR * executorMemory + offHeapMemory).toInt,
MEMORY_OVERHEAD_MIN)
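
A self-contained sketch of what the proposed rule would compute, using the
constant names from the snippet above and the documented defaults of 0.10 and
384 MB (illustrative only, not the actual YarnAllocator code):

// Sketch of the proposed rule: add the configured off-heap size on top of the
// factor-based overhead before applying the 384 MB floor. Illustrative only.
object ProposedOverhead {
  val MEMORY_OVERHEAD_FACTOR = 0.10
  val MEMORY_OVERHEAD_MIN    = 384 // MB

  def overheadMb(executorMemoryMb: Int, offHeapMemoryMb: Int): Int =
    math.max((MEMORY_OVERHEAD_FACTOR * executorMemoryMb + offHeapMemoryMb).toInt,
             MEMORY_OVERHEAD_MIN)

  def main(args: Array[String]): Unit = {
    println(overheadMb(8192, 0))     // 819  (current behaviour, no off-heap term)
    println(overheadMb(8192, 2048))  // 2867 (off-heap size folded in)
  }
}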


Re: Increasing spark.yarn.executor.memoryOverhead degrades performance

2016-07-18 Thread Sean Owen
Possibilities:

- You are using more memory now (and not getting killed), but now are
exceeding OS memory and are swapping
- Your heap sizes / config aren't quite right and now, instead of
failing earlier because YARN killed the job, you're running normally
but seeing a lot of time lost to GC thrashing

Based on your description I suspect the first one. Disable swap in
general on cluster machines.
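
If it does turn out to be the second case, one quick check (sketched here with
generic JVM GC-logging flags, nothing specific to this job) is to enable GC
logging on the executors and compare GC time against task time; the per-task
GC Time column in the Spark UI tells the same story:

spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  your-app.jar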

On Mon, Jul 18, 2016 at 4:47 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:
> Hello Experts,
>
> For one of our streaming applications, we intermittently saw:
>
> WARN yarn.YarnAllocator: Container killed by YARN for exceeding memory
> limits. 12.0 GB of 12 GB physical memory used. Consider boosting
> spark.yarn.executor.memoryOverhead.
>
> Based on what I found on the internet and the error message, I increased the
> memoryOverhead to 768. This is actually slowing the application down. We are
> on Spark 1.3, so I am not sure if it's due to GC pauses. Just to do some
> intelligent trials, I wanted to understand what could be causing the
> degradation. Should I increase the driver memoryOverhead as well? Another
> interesting observation is that bringing the executor memory down to 5 GB
> with executor memoryOverhead at 768 showed significant performance gains.
> What are the other associated settings?
>
> regards
> Sunita
>
>




Increasing spark.yarn.executor.memoryOverhead degrades performance

2016-07-18 Thread Sunita Arvind
Hello Experts,

For one of our streaming applications, we intermittently saw:

WARN yarn.YarnAllocator: Container killed by YARN for exceeding memory
limits. 12.0 GB of 12 GB physical memory used. Consider boosting
spark.yarn.executor.memoryOverhead.

Based on what I found on the internet and the error message, I increased the
memoryOverhead to 768. This is actually slowing the application down. We are
on Spark 1.3, so I am not sure if it's due to GC pauses. Just to do some
intelligent trials, I wanted to understand what could be causing the
degradation. Should I increase the driver memoryOverhead as well? Another
interesting observation is that bringing the executor memory down to 5 GB
with executor memoryOverhead at 768 showed significant performance gains.
What are the other associated settings?

regards
Sunita


Re: Boosting spark.yarn.executor.memoryOverhead

2015-08-11 Thread Sandy Ryza
Hi Eric,

This is likely because you are putting the parameter after the primary
resource (latest_msmtdt_by_gridid_and_source.py), which makes it a
parameter to your application instead of a parameter to Spark.
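
In other words, everything meant for Spark needs to come before the primary
resource; a corrected form of the command quoted below would look roughly like:

spark-submit --jars examples.jar \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  latest_msmtdt_by_gridid_and_source.py host table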

-Sandy

On Wed, Aug 12, 2015 at 4:40 AM, Eric Bless eric.bl...@yahoo.com.invalid
wrote:

 Previously I was getting a failure which included the message
 Container killed by YARN for exceeding memory limits. 2.1 GB of 2 GB
 physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

 So I attempted the following -
 spark-submit --jars examples.jar latest_msmtdt_by_gridid_and_source.py
 --conf spark.yarn.executor.memoryOverhead=1024 host table

 This resulted in -
 Application application_1438983806434_24070 failed 2 times due to AM
 Container for appattempt_1438983806434_24070_02 exited with exitCode:
 -1000

 Am I specifying the spark.yarn.executor.memoryOverhead incorrectly?




Re: bitten by spark.yarn.executor.memoryOverhead

2015-03-02 Thread Ted Yu
bq. that 0.1 is always enough?

The answer is: it depends (on use cases).
The value of 0.1 has been validated by several users. I think it is a
reasonable default.

Cheers

On Mon, Mar 2, 2015 at 8:36 AM, Ryan Williams ryan.blake.willi...@gmail.com
 wrote:

 For reference, the initial version of #3525
 (https://github.com/apache/spark/pull/3525, still open) made this
 fraction a configurable value, but consensus went against that being
 desirable, so I removed it and marked SPARK-4665
 (https://issues.apache.org/jira/browse/SPARK-4665) as won't fix.

 My team wasted a lot of time on this failure mode as well and has settled
 on passing --conf spark.yarn.executor.memoryOverhead=1024 to most
 jobs (that works out to 10-20% of --executor-memory, depending on the job).

 I agree that learning about this the hard way is a weak part of the
 Spark-on-YARN onboarding experience.

 The fact that our instinct here is to increase the 0.07 minimum instead of
 the alternate 384 MB minimum
 (https://github.com/apache/spark/blob/3efd8bb6cf139ce094ff631c7a9c1eb93fdcd566/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala#L93)
 seems like evidence that the fraction is the thing we should allow
 people to configure, instead of the absolute amount that is currently
 configurable.

 Finally, do we feel confident that 0.1 is always enough?




Re: bitten by spark.yarn.executor.memoryOverhead

2015-03-02 Thread Sean Owen
The problem is, you're left with two competing options then. You can
go through the process of deprecating the absolute one and removing it
eventually. You take away the ability to set this value directly, though,
meaning you'd have to set absolute values by depending on a % of what
you set your app memory to. I think there's a non-trivial downside that
way too.

No value can always be right, or else it wouldn't be configurable. I
think of this one like any other param that's set in absolute terms,
but with an attempt to be smart about the default.

On Mon, Mar 2, 2015 at 4:36 PM, Ryan Williams
ryan.blake.willi...@gmail.com wrote:
 For reference, the initial version of #3525 (still open) made this fraction
 a configurable value, but consensus went against that being desirable so I
 removed it and marked SPARK-4665 as won't fix.

 My team wasted a lot of time on this failure mode as well and has settled on
 passing --conf spark.yarn.executor.memoryOverhead=1024 to most jobs
 (that works out to 10-20% of --executor-memory, depending on the job).

 I agree that learning about this the hard way is a weak part of the
 Spark-on-YARN onboarding experience.

 The fact that our instinct here is to increase the 0.07 minimum instead of
 the alternate 384 MB minimum seems like evidence that the fraction is the
 thing we should allow people to configure, instead of the absolute amount
 that is currently configurable.

 Finally, do we feel confident that 0.1 is always enough?





bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Koert Kuipers
hey,
running my first map-reduce-like (meaning disk-to-disk, avoiding in-memory
RDDs) computation in Spark on YARN, I immediately got bitten by a too-low
spark.yarn.executor.memoryOverhead. However, it took me about an hour to
find out this was the cause. At first I observed failing shuffles leading
to restarting of tasks, then I realized this was because executors could
not be reached, then I noticed in the resourcemanager logs that containers
got shut down and reallocated (no mention of errors; it seemed the containers
finished their business and shut down successfully), and finally I found
the reason in the nodemanager logs.

I don't think this is a pleasant first experience. I realize
spark.yarn.executor.memoryOverhead needs to be set differently from
situation to situation. But shouldn't the default be a somewhat higher value
so that these errors are unlikely, and then the experts that are willing to
deal with these errors can tune it lower? So why not make the default 10%
instead of 7%? That gives something that works in most situations out of
the box (at the cost of being a little wasteful). It worked for me.
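
For a rough sense of what moving the default factor from 7% to 10% grants at
different executor sizes, here is an illustrative calculation (using the 384 MB
floor; not actual Spark code):

// Illustrative: overhead granted at a 7% vs a 10% default factor,
// both subject to the 384 MB floor.
object DefaultFactorComparison {
  def overheadMb(executorMemoryMb: Int, factor: Double): Int =
    math.max((factor * executorMemoryMb).toInt, 384)

  def main(args: Array[String]): Unit = {
    for (memMb <- Seq(2048, 8192, 32768)) {
      println(s"$memMb MB executor: 7% -> ${overheadMb(memMb, 0.07)} MB, " +
              s"10% -> ${overheadMb(memMb, 0.10)} MB")
    }
  }
}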


Re: bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Ted Yu
I have created SPARK-6085 with pull request:
https://github.com/apache/spark/pull/4836

Cheers

On Sat, Feb 28, 2015 at 12:08 PM, Corey Nolet cjno...@gmail.com wrote:

 +1 to a better default as well.

 We were working fine until we ran against a real dataset which was much
 larger than the test dataset we were using locally. It took me a couple of
 days and a lot of digging through many logs to figure out this value was
 what was causing the problem.



Re: bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Corey Nolet
Thanks for taking this on Ted!

On Sat, Feb 28, 2015 at 4:17 PM, Ted Yu yuzhih...@gmail.com wrote:

 I have created SPARK-6085 with pull request:
 https://github.com/apache/spark/pull/4836

 Cheers




Re: bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Sean Owen
There was a recent discussion about whether to increase or indeed make
configurable this kind of default fraction. I believe the suggestion
there too was that 9-10% is a safer default.

Advanced users can lower the resulting overhead value; it may still
have to be increased in some cases, but a fatter default may make this
kind of surprise less frequent.

I'd support increasing the default; any other thoughts?

On Sat, Feb 28, 2015 at 3:34 PM, Koert Kuipers ko...@tresata.com wrote:
  hey,
  running my first map-reduce-like (meaning disk-to-disk, avoiding in-memory
  RDDs) computation in Spark on YARN, I immediately got bitten by a too-low
  spark.yarn.executor.memoryOverhead. However, it took me about an hour to find
  out this was the cause. At first I observed failing shuffles leading to
  restarting of tasks, then I realized this was because executors could not be
  reached, then I noticed in the resourcemanager logs that containers got shut
  down and reallocated (no mention of errors; it seemed the containers finished
  their business and shut down successfully), and finally I found the reason in
  the nodemanager logs.

  I don't think this is a pleasant first experience. I realize
  spark.yarn.executor.memoryOverhead needs to be set differently from situation
  to situation. But shouldn't the default be a somewhat higher value so that
  these errors are unlikely, and then the experts that are willing to deal with
  these errors can tune it lower? So why not make the default 10% instead of 7%?
  That gives something that works in most situations out of the box (at the
  cost of being a little wasteful). It worked for me.




Re: bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Ted Yu
Having good out-of-box experience is desirable.

+1 on increasing the default.


On Sat, Feb 28, 2015 at 8:27 AM, Sean Owen so...@cloudera.com wrote:

 There was a recent discussion about whether to increase or indeed make
 configurable this kind of default fraction. I believe the suggestion
 there too was that 9-10% is a safer default.

 Advanced users can lower the resulting overhead value; it may still
 have to be increased in some cases, but a fatter default may make this
 kind of surprise less frequent.

 I'd support increasing the default; any other thoughts?



Re: bitten by spark.yarn.executor.memoryOverhead

2015-02-28 Thread Corey Nolet
+1 to a better default as well.

We were working fine until we ran against a real dataset which was much
larger than the test dataset we were using locally. It took me a couple of
days and a lot of digging through many logs to figure out this value was
what was causing the problem.

On Sat, Feb 28, 2015 at 11:38 AM, Ted Yu yuzhih...@gmail.com wrote:

 Having good out-of-box experience is desirable.

 +1 on increasing the default.

