In my experience with Spark applications (almost entirely Spark SQL), when a node in the cluster fails completely, jobs that have shuffle blocks on that node fail outright after 4 task retries. It seems that data lineage did not work. What's more, our applications run multiple SQL statements for data analysis. Having the entire application fail after a lengthy calculation because a single job failed is unacceptable. So, to some extent, we care more about stability than speed.
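
For reference, the 4 retries I mention match the default of spark.task.maxFailures, as far as I understand. Below is a minimal sketch of the retry-related settings one could raise to trade speed for stability. The values, the object name and the app name are purely illustrative assumptions, not something we have tuned or verified, and I am not sure that raising them is enough to survive the loss of a whole node.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical entry point; only the configuration keys are real Spark settings.
object StabilityOverSpeedSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stability-over-speed-sketch") // hypothetical name
      // Failures allowed for a single task before the job is aborted (default 4).
      .config("spark.task.maxFailures", "8")
      // Consecutive attempts allowed for a stage before it is aborted (default 4).
      .config("spark.stage.maxConsecutiveAttempts", "8")
      // Retries for shuffle block fetches that fail with IO errors (default 3).
      .config("spark.shuffle.io.maxRetries", "6")
      // Wait between those fetch retries (default 5s).
      .config("spark.shuffle.io.retryWait", "10s")
      .getOrCreate()

    // ... run the SQL statements here ...

    spark.stop()
  }
}
```

The same keys can also be passed with --conf on spark-submit instead of being set in code.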