Hi, I have 5 Spark jobs that need to run in parallel to speed up processing; together they take around 6-8 hours. I have 93 container nodes with 8 cores each and a combined memory capacity of around 2.8 TB. Currently I run each job with around 30 executors of 2 cores and 20 GB each, and each job processes around 1 TB of data.

Since the cluster is shared, many other teams launch their jobs alongside mine, so YARN kills my executors and does not add them back because the cluster is running at maximum capacity. I just want to know the best practices for such a resource-constrained environment. These jobs run every day, so I am looking for innovative approaches to this problem. Before anyone suggests it: a dedicated cluster of our own is not an option, so I am looking for alternative solutions. Please guide me. Thanks in advance.
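
For reference, the executor settings each job currently runs with look roughly like this in the driver code (the numbers are the real ones; the app name and the job logic are just placeholders):

  import org.apache.spark.{SparkConf, SparkContext}

  // Each of the 5 jobs is submitted to YARN with roughly these settings:
  // ~30 executors x 2 cores x 20 GB = ~60 cores and ~600 GB per job, so the
  // 5 jobs together ask for ~3 TB, more than the ~2.8 TB the shared cluster has.
  val conf = new SparkConf()
    .setAppName("daily-1tb-job")              // placeholder name
    .set("spark.executor.instances", "30")    // ~30 executors per job
    .set("spark.executor.cores", "2")         // 2 cores per executor
    .set("spark.executor.memory", "20g")      // 20 GB per executor
  val sc = new SparkContext(conf)
  // ... job logic: each run reads ~1 TB of input and takes several hours ...

That is the footprint I am trying to keep stable even when other teams' jobs fill up the cluster.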



