************************************************************************
yarn-site.xml

<property>
  <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
  <value>0.8</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>3584</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>10752</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>10752</value>
</property>

************************************************************************
spark-defaults.conf

spark.master                        yarn
spark.driver.memory                 9g
spark.executor.memory               1024m
spark.yarn.executor.memoryOverhead  1024m
spark.eventLog.enabled              true
spark.eventLog.dir                  hdfs://tech-master:54310/spark-logs
spark.history.provider              org.apache.spark.deploy.history.FsHistoryProvider
spark.history.fs.logDirectory       hdfs://tech-master:54310/spark-logs
spark.history.fs.update.interval    10s
spark.history.ui.port               18080
spark.ui.enabled                    true
spark.ui.port                       4040
spark.ui.killEnabled                true
spark.ui.retainedDeadExecutors      100
spark.scheduler.mode                FAIR
spark.scheduler.allocation.file     /usr/local/spark/current/conf/fairscheduler.xml
#spark.submit.deployMode            cluster
spark.default.parallelism           30

SPARK_WORKER_MEMORY     10g
SPARK_WORKER_INSTANCES  1
SPARK_WORKER_CORES      5
SPARK_DRIVER_MEMORY     9g
SPARK_DRIVER_CORES      5
SPARK_MASTER_IP         Tech-master
SPARK_MASTER_PORT       7077

On Tue, Feb 13, 2018 at 4:43 PM, akshay naidu <akshaynaid...@gmail.com> wrote:

> Hello,
> I'm trying to run multiple Spark jobs on a cluster running on YARN.
> The master is a 24GB server, with 6 slaves of 12GB each.
>
> fairscheduler.xml settings are:
> <pool name="default">
>   <schedulingMode>FAIR</schedulingMode>
>   <weight>10</weight>
>   <minShare>2</minShare>
> </pool>
>
> I am running 8 jobs simultaneously. The jobs run in parallel, but not
> all of them: at any time only 7 of them run simultaneously, while the
> 8th one sits in the queue WAITING for a job to stop.
>
> Also, out of the 7 running jobs, 4 run much faster than the remaining
> three (maybe resources are not distributed properly).
>
> I want to run n jobs at a time and make them run faster. Right now, one
> job takes more than three minutes while processing a maximum of 1GB of
> data.
>
> Kindly assist me. What am I missing?
>
> Thanks.
>
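
For reference, here is a rough sketch of the container arithmetic implied by the settings above. It assumes all six slaves run a NodeManager with the same yarn-site.xml, and it models only one YARN rule: a request smaller than yarn.scheduler.minimum-allocation-mb is raised to that minimum. The function name container_size_mb is made up for illustration, and the numbers are an estimate, not a measurement from the cluster.

```python
# Values copied from the yarn-site.xml and spark-defaults.conf in this thread.
NODE_MEMORY_MB = 10752        # yarn.nodemanager.resource.memory-mb
MIN_ALLOCATION_MB = 3584      # yarn.scheduler.minimum-allocation-mb
EXECUTOR_MEMORY_MB = 1024     # spark.executor.memory
EXECUTOR_OVERHEAD_MB = 1024   # spark.yarn.executor.memoryOverhead
WORKER_NODES = 6              # "6 slaves of 12GB" (assumed to all run a NodeManager)


def container_size_mb(requested_mb, min_allocation_mb):
    """Hypothetical helper: raise a request to the scheduler minimum.

    Only the below-minimum case is modelled here; rounding of larger
    requests is left out on purpose.
    """
    return max(requested_mb, min_allocation_mb)


# What Spark asks YARN for per executor: heap plus overhead.
requested = EXECUTOR_MEMORY_MB + EXECUTOR_OVERHEAD_MB

# What YARN actually reserves for that executor.
granted = container_size_mb(requested, MIN_ALLOCATION_MB)

# How many such containers fit on one node, and on the whole cluster.
per_node = NODE_MEMORY_MB // granted
cluster_total = per_node * WORKER_NODES

print(f"requested per executor: {requested} MB")
print(f"granted per executor:   {granted} MB")
print(f"containers per node:    {per_node}")
print(f"containers in cluster:  {cluster_total}")
```

Under these assumptions each 2048 MB executor request occupies a 3584 MB container, so a node holds 3 containers and the cluster holds 18 in total. Every Spark application also needs one container for its YARN application master out of the same pool, so the number of containers left for executors shrinks as more jobs are submitted.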