java.io.IOException: Failed to save output of task

Grega Kešpret Wed, 21 May 2014 23:20:06 -0700

Hello,

The last reduce task in my job always fails with "java.io.IOException:
Failed to save output of task" when using saveAsTextFile with an S3 endpoint
(all other tasks succeed). Has anyone had similar problems?


https://gist.github.com/gregakespret/813b540faca678413ad4


-------------

14/05/21 21:44:45 ERROR SparkHadoopWriter: Error committing the output of task: attempt_201405212144_0000_m_000000_3432
java.io.IOException: Failed to save output of task: attempt_201405212144_0000_m_000000_3432
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
        at org.apache.hadoop.mapred.SparkHadoopWriter.commit(SparkHadoopWriter.scala:110)
        at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeToFile$1(PairRDDFunctions.scala:731)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$2.apply(PairRDDFunctions.scala:734)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
        at org.apache.spark.scheduler.Task.run(Task.scala:53)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:46)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:45)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

Grega
--
*Grega Kešpret*
Analytics engineer

Celtra — Rich Media Mobile Advertising
celtra.com <http://www.celtra.com/> |
@celtramobile<http://www.twitter.com/celtramobile>
