Re: Efficient Spark-Submit planning

2017-09-12 Thread Sonal Goyal
Overall the defaults are sensible, but you definitely have to look at your
application and optimise a few of them. I mostly refer to the following
links when a job is slow or failing, or when we have hardware that we can
see we are not fully utilizing.

http://spark.apache.org/docs/latest/tuning.html
http://spark.apache.org/docs/latest/hardware-provisioning.html
http://spark.apache.org/docs/latest/configuration.html
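
Beyond those docs, a rough starting point many Spark tuning guides suggest
(this is a rule of thumb, not something prescribed in the thread) is: about
5 cores per executor for good HDFS throughput, one core and ~1 GB per node
reserved for the OS and Hadoop daemons, one executor slot reserved for the
driver/ApplicationMaster, and roughly 7% of executor memory set aside for
off-heap overhead. A minimal sketch of that calculation, with a hypothetical
helper name:

```python
def plan_executors(nodes, cores_per_node, mem_per_node_gb):
    """Rough spark-submit sizing using a common rule of thumb.

    Assumptions (not from the thread): 5 cores per executor,
    1 core + 1 GB per node reserved for OS/daemons, one executor
    slot reserved for the driver/AM, ~7% memory overhead.
    """
    usable_cores = cores_per_node - 1              # leave 1 core per node for OS/daemons
    executor_cores = 5                             # rule of thumb for HDFS throughput
    executors_per_node = usable_cores // executor_cores
    num_executors = executors_per_node * nodes - 1  # leave one slot for the driver/AM
    usable_mem = mem_per_node_gb - 1               # leave 1 GB per node for OS/daemons
    executor_mem = int((usable_mem / executors_per_node) * 0.93)  # ~7% overhead
    return num_executors, executor_cores, executor_mem

# Example: 10 nodes, 16 cores and 64 GB each
n, c, m = plan_executors(10, 16, 64)
print(f"--num-executors {n} --executor-cores {c} --executor-memory {m}G")
```

The three values map directly onto the spark-submit flags
`--num-executors`, `--executor-cores`, and `--executor-memory`. Treat the
output as a first guess to benchmark against, not a final answer; skew,
shuffle behaviour, and dynamic allocation can all change the right numbers.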


Thanks,
Sonal
Nube Technologies 





On Tue, Sep 12, 2017 at 2:40 AM, Aakash Basu wrote:

> Hi,
>
> Can someone please clarify how we should effectively calculate the
> parameters to be passed via spark-submit.
>
> Parameters as in -
>
> Cores, NumExecutors, DriverMemory, etc.
>
> Is there any generic calculation that works for most kinds of clusters,
> from a small 3-node cluster up to hundreds of nodes?
>
> Thanks,
> Aakash.
>


Efficient Spark-Submit planning

2017-09-11 Thread Aakash Basu
Hi,

Can someone please clarify how we should effectively calculate the
parameters to be passed via spark-submit.

Parameters as in -

Cores, NumExecutors, DriverMemory, etc.

Is there any generic calculation that works for most kinds of clusters,
from a small 3-node cluster up to hundreds of nodes?

Thanks,
Aakash.