[ https://issues.apache.org/jira/browse/HDFS-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781294#comment-16781294 ]
Srinivasu Majeti edited comment on HDFS-14323 at 3/1/19 5:05 AM:
-----------------------------------------------------------------

Hi [~jojochuang] and [~zvenczel],

To reproduce, try a webhdfs URL whose path contains a special character, as below:

[hdfs@c1265-node2 root]$ hadoop distcp webhdfs://c2265-node2.hwx.com:50070/tmp/date=1234557 /check

You should see a failure with an exception like the following:

19/02/21 06:35:59 DEBUG security.UserGroupInformation: PrivilegedActionException as:xxxxxxxx (auth:KERBEROS) cause:java.io.FileNotFoundException: File does not exist: /tmp/date%3D1234557
19/02/21 06:35:59 DEBUG ipc.ProtobufRpcEngine: Call: delete took 4ms
19/02/21 06:35:59 ERROR tools.DistCp: Invalid input: org.apache.hadoop.tools.CopyListing$InvalidInputException: webhdfs://c2265-node2.hwx.com:50070/tmp/date=1234557 doesn't exist

c1265-node2 -> 3.x [ hadoop 3.2.0 ] client
c2265-node2.hwx.com -> 2.x cluster NN

Thanks and Regards,
Majeti.

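The difference between the requested path (/tmp/date=1234557) and the path in the FileNotFoundException (/tmp/date%3D1234557) is consistent with the '=' being percent-encoded on the client and then looked up literally on the 2.x side. The following is only a minimal sketch of that encoding step using the JDK's URLEncoder and URLDecoder; the class and method names are hypothetical and it is not the WebHdfsFileSystem code itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Hadoop.
final class PathSegmentEncodingDemo {

    // Percent-encode one path segment the way java.net.URLEncoder does.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    // Reverse of encodeSegment, using java.net.URLDecoder.
    static String decodeSegment(String segment) {
        try {
            return URLDecoder.decode(segment, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String segment = "date=1234557";          // segment from the distcp command
        String encoded = encodeSegment(segment);  // '=' becomes "%3D"
        System.out.println(segment + " -> " + encoded);
        System.out.println(encoded + " -> " + decodeSegment(encoded));
    }
}
```

If the receiving side does not decode "%3D" back to '=', it searches for a file literally named date%3D1234557, which matches the message in the log above.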
> Distcp fails in Hadoop 3.x when 2.x source webhdfs url has special characters
> in hdfs file path
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14323
>                 URL: https://issues.apache.org/jira/browse/HDFS-14323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.2.0
>            Reporter: Srinivasu Majeti
>            Priority: Major
>
> HDFS-13176 added support for semicolons in source/target URLs for the distcp
> use case, and HDFS-13582 added a backward-compatibility fix. There still seems
> to be an issue when running distcp from a 3.x cluster to pull webhdfs data
> from a 2.x Hadoop cluster. We might need to adjust the existing fix as shown
> below, so that it checks whether the URL is actually already encoded before
> marking it as such. That fixes it.
>
> diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> index 5936603c34a..dc790286aff 100644
> --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> @@ -609,7 +609,10 @@ URL toUrl(final HttpOpParam.Op op, final Path fspath,
>      boolean pathAlreadyEncoded = false;
>      try {
>        fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
> -      pathAlreadyEncoded = true;
> +      if(!fspathUri.getPath().equals(fspathUriDecoded))
> +      {
> +        pathAlreadyEncoded = true;
> +      }
>      } catch (IllegalArgumentException ex) {
>        LOG.trace("Cannot decode URL encoded file", ex);
>      }

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
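
The check proposed in the diff treats a path as already encoded only when URL-decoding actually changes it. A standalone sketch of that idea is below, assuming plain String paths; the class name and the isAlreadyEncoded helper are hypothetical and are not taken from WebHdfsFileSystem.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper illustrating the decode-and-compare check from the diff.
final class AlreadyEncodedCheck {

    // Returns true only if URL-decoding changes the path, i.e. the path
    // contained something URLDecoder interprets as an escape.
    static boolean isAlreadyEncoded(String path) {
        try {
            String decoded = URLDecoder.decode(path, "UTF-8");
            return !path.equals(decoded);
        } catch (IllegalArgumentException ex) {
            // Malformed escape such as a trailing '%': not a valid encoding.
            return false;
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAlreadyEncoded("/tmp/date=1234557"));   // raw path
        System.out.println(isAlreadyEncoded("/tmp/date%3D1234557")); // encoded path
    }
}
```

One caveat with this kind of check: URLDecoder also turns '+' into a space, so a raw path containing a literal '+' would be reported as already encoded. Whether that matters for the WebHDFS case is not covered by this sketch.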