[ 
https://issues.apache.org/jira/browse/HDFS-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782256#comment-16782256
 ] 

Srinivasu Majeti edited comment on HDFS-14323 at 3/2/19 3:43 AM:
-----------------------------------------------------------------

Hi [~jojochuang],

This is with HDP 3.1 (which ships HDFS 3.1.1). hdfs dfs -ls works, and distcp
with hdfs://hostname:8020 also works fine. The problem is only that the URL is
encoded on the 3.1 side and decoding fails on the 2.6 side, since 2.6 does not
have the decoding feature. We gave the fix to one of our customers yesterday,
and it is working fine with the patch attached here :). Further information on
Hadoop versions: the encoding/decoding feature was introduced in
release-3.1.0-RC0 and is not present up to release-3.0.3-RC0. [~zvenczel] can
comment and confirm.


> Distcp fails in Hadoop 3.x when 2.x source webhdfs url has special characters 
> in hdfs file path
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14323
>                 URL: https://issues.apache.org/jira/browse/HDFS-14323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.2.0
>            Reporter: Srinivasu Majeti
>            Priority: Major
>         Attachments: HDFS-14323v0.patch
>
>
> HDFS-13176 added an enhancement to allow semicolons in source/target URLs
> for the distcp use case, and HDFS-13582 added a backward-compatibility fix.
> There still seems to be an issue when triggering distcp from a 3.x cluster
> to pull webhdfs data from a 2.x Hadoop cluster. We might need to adjust the
> existing fix as shown below so that it checks whether the URL is actually
> already encoded. That fixes it.
> diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> index 5936603c34a..dc790286aff 100644
> --- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> +++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
> @@ -609,7 +609,10 @@ URL toUrl(final HttpOpParam.Op op, final Path fspath,
>      boolean pathAlreadyEncoded = false;
>      try {
>        fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
> -      pathAlreadyEncoded = true;
> +      if(!fspathUri.getPath().equals(fspathUriDecoded))
> +      {
> +        pathAlreadyEncoded = true;
> +      }
>      } catch (IllegalArgumentException ex) {
>        LOG.trace("Cannot decode URL encoded file", ex);
>      }
>  
>  
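
To illustrate the idea behind the patch above, here is a minimal standalone
sketch. It is not the WebHdfsFileSystem code: the class name PathEncodingCheck
and the helper isAlreadyEncoded are hypothetical, and only java.net.URLDecoder
from the JDK is used. The point is that URLDecoder.decode succeeds on a path
that contains no escapes at all, so a successful decode by itself does not
show the path was encoded; comparing the decoded result with the input does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper class, for illustration only (not part of Hadoop).
class PathEncodingCheck {

  // Treat a path as "already encoded" only if decoding actually changes it.
  static boolean isAlreadyEncoded(String path) {
    try {
      String decoded = URLDecoder.decode(path, "UTF-8");
      return !path.equals(decoded);
    } catch (IllegalArgumentException | UnsupportedEncodingException ex) {
      // A malformed escape such as "%do" cannot be decoded,
      // so the path is taken to be a raw, unencoded path.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isAlreadyEncoded("/tmp/a;b.txt"));   // false: nothing to decode
    System.out.println(isAlreadyEncoded("/tmp/a%3Bb.txt")); // true: %3B decodes to ';'
    System.out.println(isAlreadyEncoded("/tmp/100%done"));  // false: invalid escape
  }
}
```

One caveat of this sketch: URLDecoder also turns '+' into a space, so a raw
path containing a literal '+' would be reported as already encoded.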


