> Hi,
> A MapReduce TeraSort job writing to S3 through s3a fails during job setup
> with a PathExistsException on the output path. Is this a known issue?
> Thanks,
> Prabhu Joseph
> [hrt_qa@hostname root]$ yarn jar
> /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-
> terasort s3a:/bucket/INPUT s3a://bucket/OUTPUT
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of
> 19/06/07 14:13:11 INFO terasort.TeraSort: starting
> 19/06/07 14:13:12 WARN impl.MetricsConfig: Cannot locate configuration:
> tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 19/06/07 14:13:12 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot
> period at 10 second(s).
> 19/06/07 14:13:12 INFO impl.MetricsSystemImpl: s3a-file-system metrics
> system started
> 19/06/07 14:13:14 INFO input.FileInputFormat: Total input files to process
> : 2
> Spent 396ms computing base-splits.
> Spent 3ms computing TeraScheduler splits.
> Computing input splits took 400ms
> Sampling 2 splits of 2
> Making 80 from 10000 sampled records
> Computing parititions took 685ms
> Spent 1088ms computing partitions.
> 19/06/07 14:13:15 INFO client.RMProxy: Connecting to ResourceManager at
> hostname:8032
> 19/06/07 14:13:17 INFO mapreduce.JobResourceUploader: Disabling Erasure
> Coding for path: /user/hrt_qa/.staging/job_1559891760159_0011
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: number of splits:2
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1559891760159_0011
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 19/06/07 14:13:18 INFO conf.Configuration: resource-types.xml not found
> 19/06/07 14:13:18 INFO resource.ResourceUtils: Unable to find
> 'resource-types.xml'.
> 19/06/07 14:13:18 INFO impl.YarnClientImpl: Submitted application
> application_1559891760159_0011
> 19/06/07 14:13:18 INFO mapreduce.Job: The url to track the job:
> http://hostname:8088/proxy/application_1559891760159_0011/
> 19/06/07 14:13:18 INFO mapreduce.Job: Running job: job_1559891760159_0011
> 19/06/07 14:13:33 INFO mapreduce.Job: Job job_1559891760159_0011 running in
> uber mode : false
> 19/06/07 14:13:33 INFO mapreduce.Job:  map 0% reduce 0%
> 19/06/07 14:13:34 INFO mapreduce.Job: Job job_1559891760159_0011 failed
> with state FAILED due to: Job setup failed :
> org.apache.hadoop.fs.PathExistsException: `s3a://bucket/OUTPUT': Setting
> job as Task committer attempt_1559891760159_0011_m_000000_0: Destination
> path exists and committer conflict resolution mode is "fail"
> at
> org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.failDestinationExists(StagingCommitter.java:878)
> at
> org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter.setupJob(DirectoryStagingCommitter.java:71)
> at
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
> at
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 19/06/07 14:13:34 INFO mapreduce.Job: Counters: 2
> Job Counters
> Total time spent by all maps in occupied slots (ms)=0
> Total time spent by all reduces in occupied slots (ms)=0
> 19/06/07 14:13:34 INFO terasort.TeraSort: done
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: Stopping s3a-file-system
> metrics system...
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics
> system stopped.
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics
> system shutdown complete.
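
The exception in the log above says the destination path already exists and the staging committer's conflict resolution mode is "fail". As a minimal sketch of one possible workaround, and not a confirmed fix for this cluster, the S3A committer option `fs.s3a.committer.staging.conflict-mode` can be set to `append` or `replace` in the job or cluster configuration. Which file the property belongs in (for example core-site.xml), and whether `replace` is acceptable for this data, are assumptions to check against your deployment.

```xml
<!-- Sketch only: assumes the directory staging committer shown in the stack trace. -->
<property>
  <name>fs.s3a.committer.staging.conflict-mode</name>
  <!-- Accepted values are fail, append and replace. -->
  <!-- replace deletes existing data under the destination before the job's output is committed. -->
  <value>replace</value>
</property>
```

Because `replace` removes whatever is already under s3a://bucket/OUTPUT, `append` is the more conservative choice if the directory may hold data worth keeping.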

