[ https://issues.apache.org/jira/browse/HADOOP-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins moved HDFS-3712 to HADOOP-8828:
-------------------------------------------

        Key: HADOOP-8828  (was: HDFS-3712)
    Project: Hadoop Common  (was: Hadoop HDFS)
    
> Support distcp from secure to insecure clusters
> -----------------------------------------------
>
>                 Key: HADOOP-8828
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8828
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eli Collins
>
> Users currently can't distcp from secure to insecure clusters.
> Relevant background from ATM:
> There's no plumbing to make the HFTP client use AuthenticatedURL when 
> security is enabled. This means that even though you have the servlet 
> filter correctly configured on the server, the client doesn't know how to 
> properly authenticate to that filter.
> The crux of the issue is that security is enabled globally rather than 
> per file system. The trick of using HFTP as the source FS works when the 
> source is insecure, but not when the source is secure.
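>
> For illustration, a minimal sketch of the HFTP trick (hostnames are 
> hypothetical; 50070 and 8020 are the default HFTP and NN RPC ports):
>
>   # Works: pull from an insecure source over read-only HFTP, write to
>   # the secure cluster over HDFS RPC.
>   hadoop distcp hftp://insecure-nn.example.com:50070/src \
>       hdfs://secure-nn.example.com:8020/dst
>
>   # Fails: with a secure source, the HFTP client can't authenticate to
>   # the NN's servlet filter.
>   hadoop distcp hftp://secure-nn.example.com:50070/src \
>       hdfs://insecure-nn.example.com:8020/dst
>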
> Normal cp with two hdfs:// URLs can be made to work. There is indeed 
> logic in o.a.h.ipc.Client to fall back to simple authentication if your 
> client config has security enabled (hadoop.security.authentication set to 
> "kerberos") and the server responds with a simple-authentication 
> response. The thing is, there are at least three bugs with this that I 
> bumped into. All three can be worked around.
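>
> For reference, the client-side switch in question lives in core-site.xml 
> (a sketch; these are the standard values):
>
>   <property>
>     <name>hadoop.security.authentication</name>
>     <!-- "kerberos" on a secure cluster, "simple" on an insecure one -->
>     <value>kerberos</value>
>   </property>
>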
> 1) If your client config has security enabled, you *must* have a valid 
> Kerberos TGT, even if you're interacting with an insecure cluster. The 
> Hadoop client unfortunately reads the local ticket cache before it tries 
> to connect to the server, so it doesn't yet know that it won't need 
> Kerberos credentials.
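>
> In practice that means obtaining a TGT up front, even for the insecure 
> side (principal and realm here are hypothetical):
>
>   kinit alice@SECURE.EXAMPLE.COM   # obtain a TGT from the secure realm
>   klist                            # confirm the ticket cache is populated
>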
> 2) Even though the destination NN is insecure, it has to have a Kerberos 
> principal created for it. You don't need a keytab, and you don't need to 
> change any settings on the destination NN. The principal just needs to 
> exist in the principal database. This is again because the Hadoop client 
> will, before connecting to the remote NN, try to get a service ticket for 
> the remote NN's hdfs/f.q.d.n principal. If that fails, it never gets to 
> the part where it connects to the insecure NN and falls back to simple 
> auth.
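>
> Creating such a placeholder principal with MIT Kerberos looks roughly 
> like this (hostname and realm are hypothetical; note that no keytab is 
> ever exported or deployed):
>
>   kadmin.local -q "addprinc -randkey hdfs/insecure-nn.example.com@SECURE.EXAMPLE.COM"
>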
> 3) Once you get through problems 1 and 2, you will try to connect to the 
> remote, insecure NN. This will work, but the reported principal name of 
> your user will include a realm that the remote NN doesn't know about. You 
> will either need to change the default_realm setting in /etc/krb5.conf on 
> the insecure NN to match the secure NN's realm, or you will need to add 
> custom hadoop.security.auth_to_local mappings on the insecure NN so it 
> knows how to translate this long principal name into a short name.
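>
> A sketch of such a mapping in the insecure NN's core-site.xml (the realm 
> is hypothetical); the rule strips the secure realm off single-component 
> user principals and falls through to the default for everything else:
>
>   <property>
>     <name>hadoop.security.auth_to_local</name>
>     <value>
>       RULE:[1:$1@$0](.*@SECURE.EXAMPLE.COM)s/@.*//
>       DEFAULT
>     </value>
>   </property>
>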
> Even with all these changes, distcp still won't work: the first thing it 
> does when submitting the job is fetch a delegation token for every NN 
> involved, which fails because the insecure NN isn't running a DT secret 
> manager. I haven't been able to figure out a way around this, except to 
> build a custom distcp that doesn't unconditionally do this.
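>
> The failing step can be reproduced in isolation with fetchdt against the 
> insecure NN (a sketch; the exact command and output vary by version):
>
>   hdfs fetchdt --webservice http://insecure-nn.example.com:50070 /tmp/dt.token
>   # Fails: an insecure NN issues no delegation tokens, since no DT
>   # secret manager is running.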
