I'm using Spark 1.4.2 with Hadoop 2.7. I tried increasing spark.shuffle.io.maxRetries to 10, but it didn't help.
Any ideas on what could be causing this? This is the exception that I am getting:

[MySparkApplication] WARN : Failed to execute SQL statement select * from TableS s join TableC c on s.property = c.property from X YZ
org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 5710.0 failed 4 times, most recent failure: Lost task 4.3 in stage 5710.0 (TID 341269, ip-10-0-1-80.us-west-2.compute.internal): java.io.FileNotFoundException: /mnt/md0/var/lib/spark/spark-549f7d96-82da-4b8d-b9fe-7f6fe8238478/blockmgr-f44be41a-9036-4b93-8608-4a8b2fabbc06/0b/shuffle_3257_4_0.data (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:128)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:203)
    at org.apache.spark.util.collection.WritablePartitionedIterator$$anon$3.writeNext(WritablePartitionedPairCollection.scala:104)
    at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:757)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1276)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1266)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1266)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1460)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1421)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Thanks,
Sahil