[ https://issues.apache.org/jira/browse/MAPREDUCE-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832697#action_12832697 ]

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1431:
---------------------------------------------------

Took a closer look: HarFileSystem extends FilterFileSystem and it uses the underlying file system to get file checksum. That's why we got Wrong FS since HarFileSystem passes a har:// path to the underlying fs.getFileChecksum(..). In our case, the underlying fs is hdfs.

> archive does not work with distcp -update
> -----------------------------------------
>
>                 Key: MAPREDUCE-1431
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1431
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: harchive
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Mahadev konar
>             Fix For: 0.22.0
>
>
> The following distcp command works.
> {noformat}
> hadoop distcp -Dmapred.job.queue.name=q har://hdfs-nn_hostname:8020/user/tsz/t101.har/t101 t101_distcp
> {noformat}
> However, it does not work for -update.
> {noformat}
> -bash-3.1$ hadoop distcp -Dmapred.job.queue.name=q -update har://hdfs-nn_hostname:8020/user/tsz/t101.har/t101 t101_distcp
> 10/01/29 20:06:53 INFO tools.DistCp: srcPaths=[har://hdfs-nn_hostname:8020/user/tsz/t101.har/t101]
> 10/01/29 20:06:53 INFO tools.DistCp: destPath=t101
> java.lang.IllegalArgumentException: Wrong FS: har://hdfs-nn_hostname:8020/user/tsz/t101.har/t101/text-00000000, expected: hdfs://nn_hostname
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:463)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:46)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileChecksum(FilterFileSystem.java:250)
>         at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1204)
>         at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1084)
> ...
> {noformat}
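
To illustrate the delegation problem described in the comment, here is a minimal sketch in plain Java. It does not use the Hadoop classes: SimpleFs, ForwardingFs, and tryChecksum are hypothetical stand-ins for a scheme-bound file system and a filter that forwards calls unchanged, and the "checksum" is a placeholder string. It only shows why a har:// path that is passed through as-is gets rejected by an hdfs-bound layer.

```java
import java.net.URI;

class WrongFsSketch {

    /** Hypothetical stand-in for a file system bound to a single URI scheme. */
    static class SimpleFs {
        private final String scheme;

        SimpleFs(String scheme) {
            this.scheme = scheme;
        }

        /** Rejects any path whose scheme differs from this file system's scheme. */
        void checkPath(URI path) {
            String s = path.getScheme();
            if (s != null && !s.equalsIgnoreCase(scheme)) {
                throw new IllegalArgumentException(
                        "Wrong FS: " + path + ", expected scheme: " + scheme);
            }
        }

        /** Placeholder checksum: validates the path, then returns a dummy value. */
        String getFileChecksum(URI path) {
            checkPath(path);
            return "checksum-of:" + path.getPath();
        }
    }

    /** Hypothetical stand-in for a filter that forwards calls to another file system. */
    static class ForwardingFs extends SimpleFs {
        private final SimpleFs underlying;

        ForwardingFs(String scheme, SimpleFs underlying) {
            super(scheme);
            this.underlying = underlying;
        }

        @Override
        String getFileChecksum(URI path) {
            // The path is forwarded untouched, so a har:// path reaches the hdfs layer.
            return underlying.getFileChecksum(path);
        }
    }

    /** Returns the checksum, or the error message if the underlying layer rejects the path. */
    static String tryChecksum(String path) {
        SimpleFs hdfs = new SimpleFs("hdfs");
        SimpleFs har = new ForwardingFs("har", hdfs);
        try {
            return har.getFileChecksum(URI.create(path));
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryChecksum(
                "har://hdfs-nn_hostname:8020/user/tsz/t101.har/t101/text-00000000"));
    }
}
```

In this sketch the call fails inside the underlying layer's path check, which mirrors the order of frames in the stack trace: the forwarding getFileChecksum sits below the scheme check that throws. Whether the real fix should translate the path or override the method in HarFileSystem is not something this sketch decides.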