[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123333#comment-14123333 ]

Yongjun Zhang commented on HDFS-6776:
-------------------------------------

Hi [~wheat9], 

I know that you are opposed to putting the msg-parsing hack into webhdfs. However, 
you have also said:
{quote}
However, it's okay to me to return a null here so the behavior is similar to 
DistributedFileSystem. The actual fallback logic can happen at the distcp side 
when building the file list, but maybe we can defer it to another jira.
{quote}
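
For clarity, here is a minimal sketch of how I read the "return null" idea. This is illustrative only, not the actual patch; the {{TokenFetcher}} interface and the message check inside it are my own placeholders:
{code}
import java.io.IOException;

import org.apache.hadoop.security.token.Token;

/**
 * Illustrative sketch only, not the actual patch: if fetching a delegation
 * token from the remote cluster fails because that cluster is insecure,
 * return null (similar to what DistributedFileSystem effectively does when
 * no token is needed) and let the caller, e.g. distcp, apply its own
 * fallback logic when building the file list.
 */
public final class TokenFallbackSketch {

  /** Hypothetical stand-in for WebHdfsFileSystem's token retrieval call. */
  public interface TokenFetcher {
    Token<?> fetch(String renewer) throws IOException;
  }

  public static Token<?> getTokenOrNull(TokenFetcher fetcher, String renewer)
      throws IOException {
    try {
      return fetcher.fetch(renewer);
    } catch (IOException e) {
      // Placeholder condition: what exactly identifies an insecure source
      // is the very point under discussion in this JIRA.
      if (e.getMessage() != null
          && e.getMessage().contains("Failed to get the token")) {
        return null; // treat the source as an insecure cluster, no token
      }
      throw e;
    }
  }

  private TokenFallbackSketch() {
  }
}
{code}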

I had quite a few questions for you in my last two comments, and I'd appreciate it 
if you could comment on them. That way, we can better understand your concern 
about why the approach is as fragile as you said.

Do you agree that the correct webhdfs contract is not to fail with this exception 
when accessing an insecure cluster, but rather to be able to access the insecure 
cluster? This is a very important question that I hope you can answer.

We agree that the msg-parsing is a bit hacky, but why is the hack so much worse 
in webhdfs than in distcp, given that webhdfs doesn't work without a fix?

BTW, FYI (not to say that it's a good thing to do), there is already code 
doing msg parsing in webhdfs:
{code}
      // extract UGI-related exceptions and unwrap InvalidToken
      // the NN mangles these exceptions but the DN does not and may need
      // to re-fetch a token if either report the token is expired
      if (re.getMessage().startsWith(
          "Failed to obtain user group information:")) {
        String[] parts = re.getMessage().split(":\\s+", 3);
        re = new RemoteException(parts[1], parts[2]);
        re = ((RemoteException)re).unwrapRemoteException(InvalidToken.class);
      }
{code}
Do you consider this fragile?
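
For reference, the {{split(":\\s+", 3)}} in that code turns the mangled message into the wrapped exception's class name and its message. A small standalone illustration (the sample message below is made up for the demo):
{code}
public class SplitDemo {
  public static void main(String[] args) {
    // Made-up example of the mangled form the quoted code handles.
    String msg = "Failed to obtain user group information: "
        + "org.apache.hadoop.security.token.SecretManager$InvalidToken: "
        + "token is expired";
    // Limit of 3: parts[0] is the prefix, parts[1] the wrapped exception's
    // class name, parts[2] the remaining message (no further splitting).
    String[] parts = msg.split(":\\s+", 3);
    System.out.println(parts[1]); // ...SecretManager$InvalidToken
    System.out.println(parts[2]); // token is expired
  }
}
{code}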

Disclaimer: the patch I did here is not justified by the existence of code like 
the above. Rather, it's because the solution has the simplicity we discussed 
earlier.

Thanks.


> distcp from insecure cluster (source) to secure cluster (destination) doesn't 
> work via webhdfs
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6776
>                 URL: https://issues.apache.org/jira/browse/HDFS-6776
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.3.0, 2.5.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, 
> HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, 
> HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, 
> HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, 
> HDFS-6776.009.patch, HDFS-6776.010.patch, HDFS-6776.011.patch, 
> dummy-token-proxy.js
>
>
> Issuing a distcp command on the secure cluster side, trying to copy data from 
> the insecure cluster to the secure cluster, we see the following problem:
> {code}
> [hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://<insecure-cluster>:<port>/tmp 
> hdfs://<secure-cluster>:8020/tmp/tmptgt
> 14/07/30 20:06:19 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, 
> sourcePaths=[webhdfs://<insecure-cluster>:<port>/tmp], 
> targetPath=hdfs://<secure-cluster>:8020/tmp/tmptgt, targetPathExists=true}
> 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at 
> <secure-cluster>:8032
> 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 
> 'ssl.client.truststore.location' has not been set, no TrustStore will be 
> loaded
> 14/07/30 20:06:20 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to get the token for hadoopuser, 
> user=hadoopuser
> 14/07/30 20:06:20 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to get the token for hadoopuser, 
> user=hadoopuser
> 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered 
> java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>       at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>       at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:781)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:796)
>       at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>       at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
>       at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
>       at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77)
>       at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81)
>       at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342)
>       at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154)
>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:390)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Failed 
> to get the token for hadoopuser, user=hadoopuser
>       at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:334)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:84)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:570)
>       ... 30 more
> [hadoopuser@yjc5u-1 ~]$ 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
