Hi all,

I am using distcp to copy data from Hadoop 1.0.3 to Hadoop 2.0.1.
When the file path (or file name) contains Chinese characters, an exception is thrown, as shown below. I would appreciate any help with this. Thanks.

[hdfs@host ~]$ hadoop distcp -i -prbugp -m 14 -overwrite -log /tmp/distcp.log hftp://10.xx.xx.aa:50070/tmp/中文路径测试 hdfs://10.xx.xx.bb:54310/tmp/distcp_test14
12/08/28 23:32:31 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=true, maxMaps=14, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hftp://10.xx.xx.aa:50070/tmp/中文路径测试], targetPath=hdfs://10.xx.xx.bb:54310/tmp/distcp_test14}
12/08/28 23:32:33 INFO tools.DistCp: DistCp job log path: /tmp/distcp.log
12/08/28 23:32:34 WARN conf.Configuration: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
12/08/28 23:32:34 WARN conf.Configuration: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
12/08/28 23:32:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/08/28 23:32:36 INFO mapreduce.JobSubmitter: number of splits:1
12/08/28 23:32:36 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
12/08/28 23:32:36 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
12/08/28 23:32:37 INFO mapred.ResourceMgrDelegate: Submitted application application_1345831938927_0039 to ResourceManager at baby20/10.1.1.40:8040
12/08/28 23:32:37 INFO mapreduce.Job: The url to track the job: http://baby20:8088/proxy/application_1345831938927_0039/
12/08/28 23:32:37 INFO tools.DistCp: DistCp job-id: job_1345831938927_0039
12/08/28 23:32:37 INFO mapreduce.Job: Running job: job_1345831938927_0039
12/08/28 23:32:50 INFO mapreduce.Job: Job job_1345831938927_0039 running in uber mode : false
12/08/28 23:32:50 INFO mapreduce.Job: map 0% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job: map 100% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job: Task Id : attempt_1345831938927_0039_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 --> hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 to hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
    ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    ... 11 more
Caused by: java.io.IOException: HTTP_OK expected, received 500
    at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderInputStream.checkResponseCode(HftpFileSystem.java:381)
    at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:121)
    at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
    at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
    ... 16 more