There was a recent discussion about whether to increase this default
fraction, or indeed make it configurable. I believe the suggestion there
too was that 9-10% is a safer default.
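
To make the 7% vs. 10% difference concrete, here is a rough sketch of a
percentage-based default with a 384 MB floor; the floor and the exact
formula are assumptions for illustration, not the actual Spark source:

  // Illustrative only: overhead default as max(floor, fraction * executor memory).
  val executorMemoryMb = 10 * 1024                      // e.g. --executor-memory 10g
  def overheadMb(fraction: Double): Int =
    math.max(384, (executorMemoryMb * fraction).toInt)
  // overheadMb(0.07) => 716 MB, overheadMb(0.10) => 1024 MB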

Advanced users can always lower the resulting overhead value; it may still
have to be increased in some cases, but a larger default would make this
kind of surprise less frequent.
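
For anyone hitting this in the meantime, the value can also be pinned
explicitly per application; a minimal sketch, assuming the value is given
in MB and that 1024 is just an example number:

  // Example only: setting the overhead explicitly for one application
  // (it can also be passed with --conf at submit time).
  import org.apache.spark.SparkConf
  val conf = new SparkConf()
    .set("spark.yarn.executor.memoryOverhead", "1024")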

I'd support increasing the default; any other thoughts?

On Sat, Feb 28, 2015 at 3:34 PM, Koert Kuipers <ko...@tresata.com> wrote:
> hey,
> running my first map-reduce-like (meaning disk-to-disk, avoiding in-memory
> RDDs) computation in Spark on YARN, I immediately got bitten by a too-low
> spark.yarn.executor.memoryOverhead. However, it took me about an hour to
> find out that this was the cause. At first I observed failing shuffles
> leading to restarted tasks, then I realized this was because executors
> could not be reached, then I noticed in the ResourceManager logs that
> containers got shut down and reallocated (no mention of errors; it seemed
> the containers finished their business and shut down successfully), and
> finally I found the reason in the NodeManager logs.
>
> I don't think this is a pleasant first experience. I realize that
> spark.yarn.executor.memoryOverhead needs to be set differently from
> situation to situation, but shouldn't the default be a somewhat higher
> value, so that these errors are unlikely, and the experts who are willing
> to deal with them can tune it lower? So why not make the default 10%
> instead of 7%? That gives something that works in most situations out of
> the box (at the cost of being a little wasteful). It worked for me.
