Hi All, I am new to Spark and I am trying to understand how preemption works with Spark on YARN. My goal is to determine the amount of re-work a Spark application has to do if an executor is preempted.
For my test, I am using a 4-node cluster (a Cloudera VM) running Spark 1.3.0, and I am running the Spark PageRank example. I have tried tests with both the Capacity Scheduler and the Fair Scheduler, and I can tell from the Resource Manager and Application Master logs that containers are getting preempted. However, I do not see any task or executor failures in the Spark UI. I checked the logs for the driver (in yarn-client mode), the Application Master, and the preempted container, but I am still unable to answer my questions. The main questions I want to answer are:

1. What happens to the tasks that were killed due to preemption? Why do I not see any failures for these tasks in the history server UI?
2. What happens to the tasks that were already completed by the executor that was preempted? Are there any cases in which these tasks will be recomputed?
3. What happens to the pending tasks that were waiting to be picked up by the executor that was preempted? I am guessing these are scheduled on other executors, but I cannot tell that from the logs.

It would be great to get some help answering these questions.

Thanks,
Surbhi

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Preemption-with-Spark-on-Yarn-tp25146.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
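P.S. For anyone trying to reproduce this, my understanding is that Fair Scheduler preemption is controlled by a yarn-site.xml flag plus timeouts in the allocation file. The sketch below reflects the Hadoop 2.x FairScheduler configuration as I understand it; the queue names, weights, and timeout values are placeholders, not the exact settings I used:

```xml
<!-- yarn-site.xml: preemption must be enabled globally for the Fair Scheduler -->
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>
```

```xml
<!-- fair-scheduler.xml: example allocation file (queue names and values are
     placeholders for illustration) -->
<allocations>
  <queue name="prod">
    <weight>2.0</weight>
    <minResources>4096 mb, 4 vcores</minResources>
  </queue>
  <queue name="test">
    <weight>1.0</weight>
  </queue>
  <!-- Seconds a queue may sit below its fair share / min share before the
       scheduler starts preempting containers from other queues -->
  <fairSharePreemptionTimeout>30</fairSharePreemptionTimeout>
  <defaultMinSharePreemptionTimeout>30</defaultMinSharePreemptionTimeout>
</allocations>
```

With short timeouts like these, submitting a second application to the starved queue should trigger preemption within about half a minute, which is how I forced containers to be killed during the PageRank run.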