Hi All, 

I am new to Spark and I am trying to understand how preemption works with
Spark on YARN. My goal is to determine how much re-work a Spark application
has to do if an executor is preempted. 

For my test, I am using a 4-node cluster with a Cloudera VM running Spark
1.3.0, and I am running the PageRank Spark example. I ran tests with both
the Capacity Scheduler and the Fair Scheduler, and I can tell from the
ResourceManager and Application Master logs that containers are getting
preempted. However, I do not see any task/executor failures in the Spark
UI. I checked the logs for the driver (in yarn-client mode), the
Application Master, and the preempted container, but they do not answer my
questions. The main questions I want to answer are: 

1. What happens to the tasks that were killed due to preemption? Why do I
not see any failures for these tasks in the History Server UI?  
2. What happens to the tasks that were already completed by the executor
that was preempted? Are there any cases in which these tasks will be
recomputed? 
3. What happens to the tasks that were pending and would have been picked
up by the preempted executor? I am guessing these are scheduled on other
executors, but I cannot tell that from the logs. 
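For reference, here is roughly how I launched the test and checked for preemption. The jar path, HDFS input path, and log location are placeholders for my setup, and the exact preemption message wording may differ between schedulers:

```shell
# Launch the bundled PageRank example in yarn-client mode (Spark 1.3.0).
# Jar location and HDFS input path are placeholders for my environment.
spark-submit \
  --class org.apache.spark.examples.SparkPageRank \
  --master yarn-client \
  --num-executors 4 \
  /usr/lib/spark/lib/spark-examples.jar \
  hdfs:///user/surbhi/pagerank-input.txt 10

# Look for preemption messages in the ResourceManager log
# (log path varies by distribution).
grep -i preempt /var/log/hadoop-yarn/*resourcemanager*.log
```

This is how I confirmed that containers were preempted, even though nothing shows up as failed in the Spark UI.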

It would be great to get some help answering these questions. 

Thanks,
Surbhi
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Preemption-with-Spark-on-Yarn-tp25146.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
