The article below gives a good idea:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
Play around with two configurations (a large number of executors with few cores
each, and a small number of executors with many cores each). The calculated
values have to be conservative, or it will make the
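For the record, a rough sketch in Scala of the two layouts to compare. The node
size (16 cores / 64 GB, 6 nodes) and all numbers are illustrative assumptions,
not values from this thread:

import org.apache.spark.SparkConf

// (a) several smaller executors per node: 5 cores and ~19 GB each
val smallExecutors = new SparkConf()
  .setAppName("tuning-small-executors")
  .set("spark.executor.instances", "17")   // ~3 per node, one slot left for the driver
  .set("spark.executor.cores", "5")
  .set("spark.executor.memory", "19g")

// (b) one large executor per node: most of the cores and memory
val largeExecutors = new SparkConf()
  .setAppName("tuning-large-executors")
  .set("spark.executor.instances", "6")
  .set("spark.executor.cores", "15")
  .set("spark.executor.memory", "56g")

// Either conf can then back a context, e.g. new SparkContext(smallExecutors).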
I don't think it has anything to do with using all the cores, since 1
executor can run as many tasks as you like. Yes, you'd want them to
request all cores in this case. YARN vs Mesos does not matter here.
Hi Veljko,
I would assume keeping the number of executors per machine to a minimum is
best for performance (as long as you consider memory requirements as well).
Each executor is a process that can run tasks in multiple threads. On a
kernel/hardware level, thread switches are much cheaper than process switches.
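If you want to see that process/thread layout for yourself, a quick sketch
(assumes an existing SparkContext named sc; purely illustrative) is to record
the JVM and thread each task ran on:

import java.lang.management.ManagementFactory

// One element per task; record the executor JVM (pid@host) and the task thread.
val perTask = sc.parallelize(1 to 16, 16).map { _ =>
  val jvm    = ManagementFactory.getRuntimeMXBean.getName
  val thread = Thread.currentThread().getName
  (jvm, thread)
}.collect()

// One line per executor process, counting the distinct task threads it used.
perTask.groupBy(_._1).foreach { case (jvm, rows) =>
  println(s"$jvm ran ${rows.map(_._2).distinct.length} distinct task threads")
}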
Hi Veljko,
I usually ask the following questions: “how much memory per task?” then “how
many CPUs per task?”, and then I calculate based on the memory and CPU
requirements per task. You might be surprised (maybe not you, but at least I
am :) ) that many OOM issues are actually because of this.
Best
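To make that concrete, the back-of-the-envelope check looks roughly like this
(the 0.6 usable fraction and the sizes are assumptions for illustration, not
numbers from this thread):

// Rough per-task memory estimate for one executor layout.
val executorMemoryGb = 19.0   // spark.executor.memory
val executorCores    = 5      // spark.executor.cores = max concurrent tasks
val usableFraction   = 0.6    // rough share left after storage/overhead

val memoryPerTaskGb = executorMemoryGb * usableFraction / executorCores
println(f"~$memoryPerTaskGb%.1f GB available per concurrent task")
// If a single task's working set is bigger than this, OOM is the likely outcome.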
1 per machine is the right number. If you are running very large heaps
(>64GB) you may consider multiple per machine just to make sure each one's
GC pauses aren't excessive, but even this might be better mitigated
with GC tuning.
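If you go the single-large-executor route, that GC tuning is usually passed to
the executor JVMs via spark.executor.extraJavaOptions. A minimal sketch; the G1
settings below are common starting points, not recommendations from this thread:

import org.apache.spark.SparkConf

// Illustrative values; adjust heap size and pause target for your workload.
val conf = new SparkConf()
  .set("spark.executor.memory", "64g")
  .set("spark.executor.extraJavaOptions",
       "-XX:+UseG1GC -XX:MaxGCPauseMillis=200 -verbose:gc -XX:+PrintGCDetails")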