Hi,
When I recover from a checkpoint in yarn-cluster mode using Spark Streaming,
the restarted application reuses the application id of the previous run (in my
case application_1428664056212_0016) when writing the Spark event log, even
though the current application id is application_1428664056212_0017. Writing
the event log then fails with the following stack trace:
15/04/14 10:14:01 WARN util.ShutdownHookManager: ShutdownHook '$anon$3' failed, java.io.IOException: Target log file already exists (hdfs://mycluster/spark-logs/eventLog/application_1428664056212_0016)
java.io.IOException: Target log file already exists (hdfs://mycluster/spark-logs/eventLog/application_1428664056212_0016)
    at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:201)
    at org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
    at org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1388)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:107)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
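
For context, the recovery path I mean is the usual StreamingContext.getOrCreate
pattern. Below is a minimal sketch of it, not my actual job: the object name,
checkpoint directory, app name, batch interval and socket source are all
placeholders chosen for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverFromCheckpointExample {
  // Placeholder checkpoint location (assumption, not the real path).
  val checkpointDir = "hdfs://mycluster/tmp/checkpoint-example"

  // Builds a fresh context; getOrCreate only calls this when no
  // checkpoint data is found in checkpointDir.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpoint-example")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // Placeholder source and output operation so the context can start.
    ssc.socketTextStream("localhost", 9999).print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // First run: creates a new context. After a restart: rebuilds the
    // context from the checkpoint data instead of calling createContext.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The failure above shows up on the second kind of run, when the context is
rebuilt from the checkpoint rather than created fresh.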
Can someone help me with this? The issue is SPARK-6892.
Thanks