[ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen updated SPARK-2975:
------------------------------
    Priority: Critical  (was: Minor)

I'm raising the priority of this issue to 'critical', since it causes problems when running on a cluster if some tasks are small enough to be run locally on the driver.

Here's an example exception:

{code}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 21 in stage 0.0 failed 1 times, most recent failure: Lost task 21.0 in stage 0.0 (TID 21, localhost): java.io.IOException: No such file or directory
        java.io.UnixFileSystem.createFileExclusively(Native Method)
        java.io.File.createNewFile(File.java:1006)
        java.io.File.createTempFile(File.java:1989)
        org.apache.spark.util.Utils$.fetchFile(Utils.scala:335)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$3.apply(Executor.scala:342)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$3.apply(Executor.scala:340)
        scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:340)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:180)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1153)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1142)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1141)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1141)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:682)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1359)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{code}

> SPARK_LOCAL_DIRS may cause problems when running in local mode
> --------------------------------------------------------------
>
>                 Key: SPARK-2975
>                 URL: https://issues.apache.org/jira/browse/SPARK-2975
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Josh Rosen
>            Priority: Critical
>
> If we're running Spark in local mode and {{SPARK_LOCAL_DIRS}} is set, the
> {{Executor}} modifies SparkConf so that this value overrides
> {{spark.local.dir}}. Normally, this is safe because the modification takes
> place before SparkEnv is created. In local mode, the Executor uses an
> existing SparkEnv rather than creating a new one, so it winds up with a
> DiskBlockManager that created local directories with the original
> {{spark.local.dir}} setting, but other components attempt to use directories
> specified in the _new_ {{spark.local.dir}}, leading to problems.
>
> I discovered this issue while testing Spark 1.1.0-snapshot1, but I think it
> will also affect Spark 1.0 (haven't confirmed this, though).
>
> (I posted some comments at
> https://github.com/apache/spark/pull/299#discussion-diff-15975800, but also
> opening this JIRA so this isn't forgotten.)

--
This message was sent by Atlassian JIRA
(v6.2#6252)
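
The ordering problem described in the issue can be sketched outside Spark. The following is a toy model, not Spark's code: {{ToyDiskManager}}, {{fetchIntoLocalDir}}, the directory names, and the plain map standing in for SparkConf are all hypothetical. It only assumes what the description and the trace state: one component creates its directories from {{spark.local.dir}} once, the setting is overridden afterwards, and a later component calls {{java.io.File.createTempFile}} against the new value.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the late-override problem; none of these types are Spark's. */
public class LocalDirMismatchSketch {

    /** Hypothetical component that creates its local directory once, at construction. */
    static final class ToyDiskManager {
        final File localDir;

        ToyDiskManager(Map<String, String> conf) throws IOException {
            localDir = new File(conf.get("spark.local.dir"));
            Files.createDirectories(localDir.toPath());
        }
    }

    /** Hypothetical later component that re-reads the setting on every call. */
    static File fetchIntoLocalDir(Map<String, String> conf) throws IOException {
        File dir = new File(conf.get("spark.local.dir"));
        // createTempFile does not create missing parent directories.
        return File.createTempFile("fetchFileTemp", null, dir);
    }

    /** Returns true if overriding the setting after construction breaks the fetch. */
    static boolean reproduce() throws IOException {
        // Scratch location for the sketch; it is left behind for inspection.
        Path base = Files.createTempDirectory("local-dir-sketch");
        Map<String, String> conf = new HashMap<>();

        // 1. The environment is built with the original setting.
        conf.put("spark.local.dir", base.resolve("original").toString());
        ToyDiskManager manager = new ToyDiskManager(conf);

        // 2. The setting is overridden afterwards (the role SPARK_LOCAL_DIRS
        //    plays in the issue). Nothing creates the new directory.
        conf.put("spark.local.dir", base.resolve("from-env").toString());

        // 3. A component that reads the new value fails to create its file.
        try {
            fetchIntoLocalDir(conf);
            return false;
        } catch (IOException e) {
            // The original directory still exists; only the new one is missing.
            return manager.localDir.isDirectory();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("late override breaks temp-file creation: " + reproduce());
    }
}
```

In this sketch the failure disappears if step 2 runs before the {{ToyDiskManager}} is constructed, which mirrors the description's point that the override is safe only when it happens before SparkEnv is created.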