Try --num-executors 3 --executor-cores 4 --executor-memory 2G --conf spark.scheduler.mode=FAIR — with 6 cores and 12 GB per worker, asking for 5 cores and 4 GB per executor leaves little headroom on each node; 4 cores / 2 GB fits more comfortably.
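
Folding those settings into the full command would look like this (master URL and script path are taken from the quoted message below — adjust to your setup):

```shell
# Same master URL and application script as in the original message,
# with the reduced executor settings suggested above.
spark-submit \
  --master spark://192.168.49.37:7077 \
  --num-executors 3 \
  --executor-cores 4 \
  --executor-memory 2G \
  --conf spark.scheduler.mode=FAIR \
  /appdata/bblite-codebase/prima_diabetes_indians.py
```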
On Mon, Jun 11, 2018 at 2:43 PM, Aakash Basu <aakash.spark....@gmail.com> wrote:
> Hi,
>
> I have submitted a job on a *4-node cluster*, where I see most of the
> operations happening on one of the worker nodes while the other two are
> simply idle.
>
> The picture below illustrates that -
>
> How do I properly distribute the load?
>
> My cluster conf (4-node cluster [1 driver; 3 slaves]) -
>
> *Cores - 6*
> *RAM - 12 GB*
> *HDD - 60 GB*
>
> My spark-submit command is as follows -
>
> *spark-submit --master spark://192.168.49.37:7077 --num-executors 3
> --executor-cores 5 --executor-memory 4G
> /appdata/bblite-codebase/prima_diabetes_indians.py*
>
> What to do?
>
> Thanks,
> Aakash.