Hi all,

I am using distcp to copy data from Hadoop 1.0.3 to Hadoop 2.0.1.

When the file path (or file name) contains Chinese characters, an
exception is thrown, as shown below. Any help would be appreciated.

Thanks.

[hdfs@host ~]$ hadoop distcp -i -prbugp -m 14 -overwrite -log /tmp/distcp.log hftp://10.xx.xx.aa:50070/tmp/中文路径测试 hdfs://10.xx.xx.bb:54310/tmp/distcp_test14

12/08/28 23:32:31 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=true, maxMaps=14, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hftp://10.xx.xx.aa:50070/tmp/中文路径测试], targetPath=hdfs://10.xx.xx.bb:54310/tmp/distcp_test14}
12/08/28 23:32:33 INFO tools.DistCp: DistCp job log path: /tmp/distcp.log

12/08/28 23:32:34 WARN conf.Configuration: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
12/08/28 23:32:34 WARN conf.Configuration: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
12/08/28 23:32:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/08/28 23:32:36 INFO mapreduce.JobSubmitter: number of splits:1
12/08/28 23:32:36 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
12/08/28 23:32:36 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
12/08/28 23:32:36 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/08/28 23:32:36 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
12/08/28 23:32:36 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir

12/08/28 23:32:37 INFO mapred.ResourceMgrDelegate: Submitted application application_1345831938927_0039 to ResourceManager at baby20/10.1.1.40:8040
12/08/28 23:32:37 INFO mapreduce.Job: The url to track the job: http://baby20:8088/proxy/application_1345831938927_0039/
12/08/28 23:32:37 INFO tools.DistCp: DistCp job-id: job_1345831938927_0039
12/08/28 23:32:37 INFO mapreduce.Job: Running job: job_1345831938927_0039
12/08/28 23:32:50 INFO mapreduce.Job: Job job_1345831938927_0039 running in uber mode : false
12/08/28 23:32:50 INFO mapreduce.Job:  map 0% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job:  map 100% reduce 0%
12/08/28 23:33:00 INFO mapreduce.Job: Task Id : attempt_1345831938927_0039_m_000000_0, Status : FAILED

Error: java.io.IOException: File copy failed: hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 --> hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hftp://10.1.1.26:50070/tmp/中文路径测试/part-r-00017 to hdfs://10.1.1.40:54310/tmp/distcp_test14/part-r-00017
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
        ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        ... 11 more
Caused by: java.io.IOException: HTTP_OK expected, received 500
        at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderInputStream.checkResponseCode(HftpFileSystem.java:381)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:121)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
        at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
        ... 16 more
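
A note for anyone trying to reproduce this: the innermost cause in the trace is an HTTP 500 returned to HftpFileSystem while reading the source file, so one thing that may be worth checking is how the non-ASCII path looks once it is put into a URL. This is only a guess; nothing in the log above confirms that URL encoding is the cause. The small Python sketch below shows what a UTF-8 percent-encoded form of the failing path would look like, and how to pick out the paths in a listing that contain non-ASCII characters. The helper names percent_encode_path and non_ascii_paths are made up for illustration and are not part of Hadoop.

```python
from urllib.parse import quote, unquote


def percent_encode_path(path: str) -> str:
    """Percent-encode a path as UTF-8 bytes, keeping '/' separators as-is."""
    return quote(path, safe="/", encoding="utf-8")


def non_ascii_paths(paths):
    """Return only the paths that contain at least one non-ASCII character."""
    return [p for p in paths if not p.isascii()]


if __name__ == "__main__":
    # Path taken from the stack trace above.
    src = "/tmp/中文路径测试/part-r-00017"
    encoded = percent_encode_path(src)
    print(encoded)
    # Decoding the encoded form should give back the original path.
    print(unquote(encoded) == src)
    # Hypothetical listing, used only to show the filter.
    print(non_ascii_paths(["/tmp/plain/part-r-00000", src]))
```

Comparing a copy of an ASCII-only directory with a copy of a directory flagged by non_ascii_paths, using the same distcp command, would at least show whether the failure is tied to the characters in the path.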

 

 

 
