I have since resolved the issue. The problem was that multiple RDDs were trying to write to the same S3 bucket.
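The fix pattern, sketched below with illustrative names (`OutputPaths`, the bucket and RDD names, and the `s3n://` scheme are assumptions, not from the thread): give each RDD its own output prefix so that several `saveAsTextFile` calls never commit into the same location.

```scala
// Sketch only: build a distinct S3 output location per RDD instead of
// pointing several saveAsTextFile calls at one shared path. All names
// here are hypothetical.
object OutputPaths {
  // One unique prefix per RDD under a shared bucket, so the Hadoop
  // FileOutputCommitter of one job cannot clobber another's output.
  def outputPathFor(bucket: String, rddName: String): String =
    s"s3n://$bucket/output/$rddName"
}

// Usage (Spark calls shown as comments, since they need a live cluster):
// rddA.saveAsTextFile(OutputPaths.outputPathFor("my-bucket", "rdd-a"))
// rddB.saveAsTextFile(OutputPaths.outputPathFor("my-bucket", "rdd-b"))
```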
Grega
--
*Grega Kešpret*
Analytics engineer

Celtra — Rich Media Mobile Advertising
celtra.com <http://www.celtra.com/> | @celtramobile <http://www.twitter.com/celtramobile>

On Thu, May 22, 2014 at 8:18 AM, Grega Kešpret <gr...@celtra.com> wrote:
> Hello,
>
> my last reduce task in the job always fails with "java.io.IOException:
> Failed to save output of task" when using saveAsTextFile with an S3
> endpoint (all others are successful). Has anyone had similar problems?
>
> https://gist.github.com/gregakespret/813b540faca678413ad4
>
> -------------
>
> 14/05/21 21:44:45 ERROR SparkHadoopWriter: Error committing the output of task: attempt_201405212144_0000_m_000000_3432
> java.io.IOException: Failed to save output of task: attempt_201405212144_0000_m_000000_3432
>         at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)
>         at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
>         at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
>         at org.apache.hadoop.mapred.SparkHadoopWriter.commit(SparkHadoopWriter.scala:110)
>         at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeToFile$1(PairRDDFunctions.scala:731)
>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
>         at org.apache.spark.scheduler.Task.run(Task.scala:53)
>         at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:46)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:45)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>         at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>
> Grega
> --
> *Grega Kešpret*
> Analytics engineer
>
> Celtra — Rich Media Mobile Advertising
> celtra.com <http://www.celtra.com/> | @celtramobile <http://www.twitter.com/celtramobile>