[ 
https://issues.apache.org/jira/browse/HDFS-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781294#comment-16781294
 ] 

Srinivasu Majeti edited comment on HDFS-14323 at 3/1/19 5:05 AM:
-----------------------------------------------------------------

Hi [~jojochuang] and [~zvenczel],

 To reproduce, try a webhdfs URL containing a special character, as below.

[hdfs@c1265-node2 root]$ hadoop distcp webhdfs://c2265-node2.hwx.com:50070/tmp/date=1234557 /check

You should see a failure with an exception like the one below.

19/02/21 06:35:59 DEBUG security.UserGroupInformation: PrivilegedActionException as:xxxxxxxx  (auth:KERBEROS) cause:java.io.FileNotFoundException: File does not exist: /tmp/date%3D1234557

19/02/21 06:35:59 DEBUG ipc.ProtobufRpcEngine: Call: delete took 4ms
19/02/21 06:35:59 ERROR tools.DistCp: Invalid input:
org.apache.hadoop.tools.CopyListing$InvalidInputException: webhdfs://c2265-node2.hwx.com:50070/tmp/date=1234557 doesn't exist

c1265-node2 -> 3.x client (Hadoop 3.2.0)

c2265-node2.hwx.com -> 2.x cluster NameNode
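
The %3D in the FileNotFoundException above is the percent-encoded form of the "=" in the requested path, which suggests the path was still encoded when it was looked up on the 2.x side. A minimal sketch of that round trip using only the Java standard library follows. The class and method names are hypothetical, and this is not the WebHDFS client code path, only an illustration of the encoding involved.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration class; not part of Hadoop.
final class EqualsSignEncoding {

  // Percent-encodes a single path segment with java.net.URLEncoder.
  static String encodeSegment(String segment) {
    try {
      return URLEncoder.encode(segment, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(e);
    }
  }

  // Reverses encodeSegment with java.net.URLDecoder.
  static String decodeSegment(String segment) {
    try {
      return URLDecoder.decode(segment, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String encoded = encodeSegment("date=1234557");
    System.out.println(encoded);                // date%3D1234557
    System.out.println(decodeSegment(encoded)); // date=1234557
  }
}
```

If the receiving side never decodes the segment, it ends up looking for a file literally named date%3D1234557, which matches the "File does not exist" message in the log.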

Thanks and Regards,

Majeti.


> Distcp fails in Hadoop 3.x when 2.x source webhdfs url has special characters 
> in hdfs file path
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14323
>                 URL: https://issues.apache.org/jira/browse/HDFS-14323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.2.0
>            Reporter: Srinivasu Majeti
>            Priority: Major
>
> An enhancement to allow semicolons in source/target URLs for the distcp use
> case was made as part of HDFS-13176, with a backward-compatibility fix in
> HDFS-13582. There still seems to be an issue when running distcp from a 3.x
> cluster to pull webhdfs data from a 2.x Hadoop cluster. We might need to
> adjust the existing fix as shown below, so that the path is treated as
> already encoded only when decoding actually changes it. That fixes it.
> diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> index 5936603c34a..dc790286aff 100644
> --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> @@ -609,7 +609,10 @@ URL toUrl(final HttpOpParam.Op op, final Path fspath,
>      boolean pathAlreadyEncoded = false;
>      try {
>        fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
> -      pathAlreadyEncoded = true;
> +      if(!fspathUri.getPath().equals(fspathUriDecoded))
> +      {
> +        pathAlreadyEncoded = true;
> +      }
>      } catch (IllegalArgumentException ex) {
>        LOG.trace("Cannot decode URL encoded file", ex);
>      }
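
The decode-and-compare check in the patch above can be tried in isolation. The following is a minimal sketch, assuming a standalone helper: the class PathEncodingCheck and the method isAlreadyEncoded are hypothetical names and do not exist in WebHdfsFileSystem. It only mirrors the condition added by the diff.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical standalone helper; mirrors the condition added in the patch.
final class PathEncodingCheck {

  // Returns true only when URL-decoding actually changes the path,
  // i.e. the path contained something that decodes to a different string.
  static boolean isAlreadyEncoded(String path) {
    try {
      String decoded = URLDecoder.decode(path, "UTF-8");
      return !path.equals(decoded);
    } catch (IllegalArgumentException | UnsupportedEncodingException ex) {
      // Malformed escape such as a trailing '%': treat as not encoded,
      // like the catch branch in the patch, which leaves the flag false.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isAlreadyEncoded("/tmp/date=1234557"));   // false
    System.out.println(isAlreadyEncoded("/tmp/date%3D1234557")); // true
    System.out.println(isAlreadyEncoded("/tmp/100%"));           // false
  }
}
```

One caveat with this approach: URLDecoder also converts "+" into a space, so a path containing a literal "+" would be reported as already encoded by this check.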


