Hi folks,

We have a few Spark streaming jobs running on a YARN cluster, and from time to time a job needs to be restarted (for example, because it was killed for an external reason).
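For reference, the restart is just a plain resubmission with spark-submit, roughly like this (the class name, jar path and queue below are placeholders, not our exact values):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.OurStreamingJob \
      --queue default \
      /path/to/our-streaming-job.jar

and the driver side follows the usual checkpoint / StreamingContext.getOrCreate pattern, sketched very roughly here (the batch interval, checkpoint path and stream setup are placeholders rather than our real code):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object OurStreamingJob {
      def main(args: Array[String]): Unit = {
        val checkpointDir = "hdfs:///checkpoints/our-streaming-job"

        def createContext(): StreamingContext = {
          val conf = new SparkConf().setAppName("our-streaming-job")
          val ssc = new StreamingContext(conf, Seconds(30))
          ssc.checkpoint(checkpointDir)
          // ... DStream sources and transformations are set up here ...
          ssc
        }

        // On restart this recovers the previous StreamingContext from the
        // checkpoint directory if one exists, otherwise it builds a fresh one.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }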
Once we submit the new job, it fails with the following exception:

    ERROR spark.SparkContext: Failed to add /mnt/data1/yarn/nm/usercache/spark/appcache/application_1537885048149_15382/container_e82_1537885048149_15382_01_000001/__app__.jar to Spark environment
    java.io.FileNotFoundException: Jar /mnt/data1/yarn/nm/usercache/spark/appcache/application_1537885048149_15382/container_e82_1537885048149_15382_01_000001/__app__.jar not found
            at org.apache.spark.SparkContext.addJarFile$1(SparkContext.scala:1807)
            at org.apache.spark.SparkContext.addJar(SparkContext.scala:1835)
            at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:457)

We know that application_1537885048149_15382 corresponds to the previous job that was killed, and that our YARN is cleaning up the usercache directory frequently to avoid choking the filesystem with unused files. Given that, what would you recommend for long-running jobs that have to be restarted when the previous context is no longer available because of this cleanup?

I hope it is clear what I meant; if you need more information, just ask.

Thanks,
JC