(Prabhu and I will work on this online; if HADOOP-16058 is in, then it is probably just a test setup problem.)
On Fri, Jun 7, 2019 at 3:18 PM Prabhu Joseph <prabhujose.ga...@gmail.com> wrote:

> Hi,
>
> MapReduce TeraSort Job fails on S3 with Output PathExistsException.
> Is this a known issue?
>
> Thanks,
> Prabhu Joseph
>
> [hrt_qa@hostname root]$ yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-3.1.1.7.0.0.0-115.jar terasort s3a:/bucket/INPUT s3a://bucket/OUTPUT
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 19/06/07 14:13:11 INFO terasort.TeraSort: starting
> 19/06/07 14:13:12 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 19/06/07 14:13:12 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> 19/06/07 14:13:12 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started
> 19/06/07 14:13:14 INFO input.FileInputFormat: Total input files to process : 2
> Spent 396ms computing base-splits.
> Spent 3ms computing TeraScheduler splits.
> Computing input splits took 400ms
> Sampling 2 splits of 2
> Making 80 from 10000 sampled records
> Computing parititions took 685ms
> Spent 1088ms computing partitions.
> 19/06/07 14:13:15 INFO client.RMProxy: Connecting to ResourceManager at hostname:8032
> 19/06/07 14:13:17 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/hrt_qa/.staging/job_1559891760159_0011
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: number of splits:2
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1559891760159_0011
> 19/06/07 14:13:17 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 19/06/07 14:13:18 INFO conf.Configuration: resource-types.xml not found
> 19/06/07 14:13:18 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
> 19/06/07 14:13:18 INFO impl.YarnClientImpl: Submitted application application_1559891760159_0011
> 19/06/07 14:13:18 INFO mapreduce.Job: The url to track the job: http://hostname:8088/proxy/application_1559891760159_0011/
> 19/06/07 14:13:18 INFO mapreduce.Job: Running job: job_1559891760159_0011
> 19/06/07 14:13:33 INFO mapreduce.Job: Job job_1559891760159_0011 running in uber mode : false
> 19/06/07 14:13:33 INFO mapreduce.Job: map 0% reduce 0%
> 19/06/07 14:13:34 INFO mapreduce.Job: Job job_1559891760159_0011 failed with state FAILED due to: Job setup failed : org.apache.hadoop.fs.PathExistsException: `s3a://bucket/OUTPUT': Setting job as Task committer attempt_1559891760159_0011_m_000000_0: Destination path exists and committer conflict resolution mode is "fail"
>     at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.failDestinationExists(StagingCommitter.java:878)
>     at org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter.setupJob(DirectoryStagingCommitter.java:71)
>     at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
>     at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> 19/06/07 14:13:34 INFO mapreduce.Job: Counters: 2
>     Job Counters
>         Total time spent by all maps in occupied slots (ms)=0
>         Total time spent by all reduces in occupied slots (ms)=0
> 19/06/07 14:13:34 INFO terasort.TeraSort: done
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics system...
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics system stopped.
> 19/06/07 14:13:34 INFO impl.MetricsSystemImpl: s3a-file-system metrics system shutdown complete.