[ https://issues.apache.org/jira/browse/HADOOP-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HADOOP-11724:
-----------------------------------
    Attachment: HADOOP-11724.001.patch

Thanks a lot [~yzhangal]. That is a great suggestion. I updated the patch to 
address your comments. Would you mind taking another look? 

> DistCp throws NPE when the target directory is root.
> ----------------------------------------------------
>
>                 Key: HADOOP-11724
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11724
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>         Attachments: HADOOP-11724.000.patch, HADOOP-11724.001.patch
>
>
> DistCp throws an NPE when the target directory is root. This happens because 
> {{CopyCommitter#cleanupTempFiles}} attempts to delete the parent directory of 
> root, which is {{null}}:
> {code}
> $ hadoop distcp pom.xml hdfs://localhost/
> 15/03/17 11:17:44 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 15/03/17 11:17:45 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[pom.xml], 
> targetPath=hdfs://localhost/, targetPathExists=true, preserveRawXattrs=false}
> 15/03/17 11:17:45 INFO Configuration.deprecation: session.id is deprecated. 
> Instead, use dfs.metrics.session-id
> 15/03/17 11:17:45 INFO jvm.JvmMetrics: Initializing JVM Metrics with 
> processName=JobTracker, sessionId=
> 15/03/17 11:17:45 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 15/03/17 11:17:45 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 15/03/17 11:17:45 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with 
> processName=JobTracker, sessionId= - already initialized
> 15/03/17 11:17:45 INFO mapreduce.JobSubmitter: number of splits:1
> 15/03/17 11:17:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_local992233322_0001
> 15/03/17 11:17:46 INFO mapreduce.Job: The url to track the job: 
> http://localhost:8080/
> 15/03/17 11:17:46 INFO tools.DistCp: DistCp job-id: job_local992233322_0001
> 15/03/17 11:17:46 INFO mapreduce.Job: Running job: job_local992233322_0001
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: OutputCommitter set in config 
> null
> 15/03/17 11:17:46 INFO output.FileOutputCommitter: File Output Committer 
> Algorithm version is 1
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: OutputCommitter is 
> org.apache.hadoop.tools.mapred.CopyCommitter
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: Waiting for map tasks
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: Starting task: 
> attempt_local992233322_0001_m_000000_0
> 15/03/17 11:17:46 INFO output.FileOutputCommitter: File Output Committer 
> Algorithm version is 1
> 15/03/17 11:17:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree 
> currently is supported only on Linux.
> 15/03/17 11:17:46 INFO mapred.Task:  Using ResourceCalculatorProcessTree : 
> null
> 15/03/17 11:17:46 INFO mapred.MapTask: Processing split: 
> file:/tmp/hadoop/mapred/staging/lei2046334351/.staging/_distcp-1889397390/fileList.seq:0+220
> 15/03/17 11:17:46 INFO output.FileOutputCommitter: File Output Committer 
> Algorithm version is 1
> 15/03/17 11:17:46 INFO mapred.CopyMapper: Copying 
> file:/Users/lei/work/cloudera/s3a_cp_target/pom.xml to 
> hdfs://localhost/pom.xml
> 15/03/17 11:17:46 INFO mapred.CopyMapper: Skipping copy of 
> file:/Users/lei/work/cloudera/s3a_cp_target/pom.xml to 
> hdfs://localhost/pom.xml
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner:
> 15/03/17 11:17:46 INFO mapred.Task: 
> Task:attempt_local992233322_0001_m_000000_0 is done. And is in the process of 
> committing
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner:
> 15/03/17 11:17:46 INFO mapred.Task: Task 
> attempt_local992233322_0001_m_000000_0 is allowed to commit now
> 15/03/17 11:17:46 INFO output.FileOutputCommitter: Saved output of task 
> 'attempt_local992233322_0001_m_000000_0' to 
> file:/tmp/hadoop/mapred/staging/lei2046334351/.staging/_distcp-1889397390/_logs/_temporary/0/task_local992233322_0001_m_000000
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: Copying 
> file:/Users/lei/work/cloudera/s3a_cp_target/pom.xml to 
> hdfs://localhost/pom.xml
> 15/03/17 11:17:46 INFO mapred.Task: Task 
> 'attempt_local992233322_0001_m_000000_0' done.
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: Finishing task: 
> attempt_local992233322_0001_m_000000_0
> 15/03/17 11:17:46 INFO mapred.LocalJobRunner: map task executor complete.
> 15/03/17 11:17:46 INFO mapred.CopyCommitter: Remove parent: null for 
> hdfs://localhost/
> 15/03/17 11:17:46 WARN mapred.CopyCommitter: Unable to cleanup temp files
> java.lang.NullPointerException
>       at org.apache.hadoop.fs.Path.<init>(Path.java:104)
>       at org.apache.hadoop.fs.Path.<init>(Path.java:93)
>       at 
> org.apache.hadoop.tools.mapred.CopyCommitter.deleteAttemptTempFiles(CopyCommitter.java:141)
>       at 
> org.apache.hadoop.tools.mapred.CopyCommitter.cleanupTempFiles(CopyCommitter.java:130)
>       at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:83)
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:538)
> 15/03/17 11:17:46 INFO mapred.CopyCommitter: Cleaning up temporary work 
> folder: 
> file:/tmp/hadoop/mapred/staging/lei2046334351/.staging/_distcp-1889397390
> 15/03/17 11:17:47 INFO mapreduce.Job: Job job_local992233322_0001 running in 
> uber mode : false
> 15/03/17 11:17:47 INFO mapreduce.Job:  map 100% reduce 0%
> 15/03/17 11:17:47 INFO mapreduce.Job: Job job_local992233322_0001 completed 
> successfully
> 15/03/17 11:17:47 INFO mapreduce.Job: Counters: 22
>       File System Counters
>               FILE: Number of bytes read=103917
>               FILE: Number of bytes written=363277
>               FILE: Number of read operations=0
>               FILE: Number of large read operations=0
>               FILE: Number of write operations=0
>               HDFS: Number of bytes read=0
>               HDFS: Number of bytes written=0
>               HDFS: Number of read operations=8
>               HDFS: Number of large read operations=0
>               HDFS: Number of write operations=0
>       Map-Reduce Framework
>               Map input records=1
>               Map output records=1
>               Input split bytes=151
>               Spilled Records=0
>               Failed Shuffles=0
>               Merged Map outputs=0
>               GC time elapsed (ms)=14
>               Total committed heap usage (bytes)=163577856
>       File Input Format Counters
>               Bytes Read=252
>       File Output Format Counters
>               Bytes Written=70
>       org.apache.hadoop.tools.mapred.CopyMapper$Counter
>               BYTESSKIPPED=23491
>               SKIP=1
> {code}
> The DistCp job still completes successfully despite the exception. 
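> As a minimal sketch (not the attached patch itself), the NPE boils down to 
> {{Path#getParent}} returning {{null}} for the root path while 
> {{new Path(parent, child)}} rejects a {{null}} parent, so the parent-level 
> temp-file cleanup needs a guard. The class name and temp-file glob pattern 
> below are illustrative only:
> {code}
> import org.apache.hadoop.fs.Path;
> 
> // Simplified illustration of the failure mode and a possible guard;
> // this is not the actual CopyCommitter code.
> public class RootParentDemo {
>   public static void main(String[] args) {
>     Path target = new Path("hdfs://localhost/");
>     Path parent = target.getParent();   // null: the root path has no parent
> 
>     if (parent != null) {
>       // Safe: the parent exists, so the temp-file glob can be built under it.
>       Path tempGlob = new Path(parent, ".distcp.tmp.attempt_local992233322_0001*");
>       System.out.println("would clean up " + tempGlob);
>     } else {
>       // Passing a null parent into new Path(parent, child) is what throws
>       // the NullPointerException shown in the stack trace above.
>       System.out.println("target is root; skip parent-level temp-file cleanup");
>     }
>   }
> }
> {code}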



