[ 
https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102904#comment-14102904
 ] 

Haohui Mai commented on HDFS-6776:
----------------------------------

I'm unsure that fixing on the filesystem itself is the right approach to take. 
Based on the above comments (and please correct me if I'm wrong), I'm assuming 
that distcp is pulling data from a secure cluster to an insecure cluster.

# What happens if the {{WebHdfsFileSystem}} intends to connect to a secure 
cluster but the attacker has somehow disabled the security of the cluster or 
successfully launch a MITM attack which keep returning {{NullToken}}? Instead 
of ignoring the failures, I think that {{WebHdfsFileSystem}} should fail 
explicitly because that the system is compromised. This is important as copying 
from a secure cluster to an insecure cluster can unintentionally breach the 
confidentiality.

On the implementation:

# It looks like that {{WebHdfsFileSystem}} will issue a 
{{GET_DELEGATION_TOKEN}} request for each request since in catching 
{{NullToken}} nullifies {{delegationToken}}, which significantly affects the 
performance. 

I have encountered this issue in production. What I have done is to put the fix 
on the server side instead of the client side. I asked the NN of the insecure 
cluster to issue a dummy token, which works across all filesystems. That way at 
the very least the user has to be informed instead of allowing the data 
silently flowing from secure to insecure clusters.

> distcp from insecure cluster (source) to secure cluster (destination) doesn't 
> work
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-6776
>                 URL: https://issues.apache.org/jira/browse/HDFS-6776
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.3.0, 2.5.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, 
> HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, 
> HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, 
> HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch
>
>
> Issuing distcp command at the secure cluster side, trying to copy stuff from 
> insecure cluster to secure cluster, and see the following problem:
> {code}
> hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://<insure-cluster>:<port>/tmp 
> hdfs://<sure-cluster>:8020/tmp/tmptgt
> 14/07/30 20:06:19 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, 
> sourcePaths=[webhdfs://<insecure-cluster>:<port>/tmp], 
> targetPath=hdfs://<secure-cluster>:8020/tmp/tmptgt, targetPathExists=true}
> 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at 
> <secure-clister>:8032
> 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 
> 'ssl.client.truststore.location' has not been set, no TrustStore will be 
> loaded
> 14/07/30 20:06:20 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to get the token for hadoopuser, 
> user=hadoopuser
> 14/07/30 20:06:20 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Failed to get the token for hadoopuser, 
> user=hadoopuser
> 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered 
> java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>       at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>       at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:781)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:796)
>       at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>       at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
>       at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
>       at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77)
>       at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81)
>       at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342)
>       at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154)
>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:390)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Failed 
> to get the token for hadoopuser, user=hadoopuser
>       at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:334)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:84)
>       at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:570)
>       ... 30 more
> [hadoopuser@yjc5u-1 ~]$ 
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to