[ https://issues.apache.org/jira/browse/HDFS-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577627#comment-13577627 ]

Daryn Sharp commented on HDFS-3367:
-----------------------------------

Good questions.  Just some background for others who may not be versed in the 
details of ugi: the {{ugi#doAs}} is required to ensure the 
{{AccessControlContext}} has the correct {{Subject}} containing the 
authentication credentials (ex. a kerberos TGT).  When operations are performed 
outside of a top-level {{ugi#doAs}}, the context lacks a {{Subject}}, so 
authentication fails.  I was bitten by a similar issue when I removed a 
{{doAs}} in the IPC server's instantiation of a SASL server.
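
To make the mechanism concrete, here is a rough sketch (illustrative only, not 
code from this patch) of what the {{doAs}} provides.  JAAS/GSS code looks up 
the {{Subject}} on the current {{AccessControlContext}}, so the kerberos 
credentials are only visible inside a {{ugi#doAs}}:

{noformat}
import java.security.AccessController;
import java.security.PrivilegedExceptionAction;
import javax.security.auth.Subject;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsSketch {
  public static void main(String[] args) throws Exception {
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();

    // Outside a doAs the context carries no Subject, so code that looks
    // up the Subject from the AccessControlContext (as the SASL/GSS
    // layers do) finds nothing and cannot locate the kerberos TGT.
    Subject outside = Subject.getSubject(AccessController.getContext());
    System.out.println("outside doAs: " + outside);  // typically null

    // Inside the doAs the ugi's Subject is pushed onto the context, so
    // the same lookup succeeds and authentication can proceed.
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      @Override
      public Void run() {
        Subject inside = Subject.getSubject(AccessController.getContext());
        System.out.println("inside doAs: " + inside);
        return null;
      }
    });
  }
}
{noformat}

That missing {{Subject}} is exactly what produces the "Failed to find any 
Kerberos tgt" in the description below.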

Proxy ugis (ex. oozie) will operate correctly with this change; in fact, it 
ensures they will work correctly in all cases.  The important part is that 
connections, especially reconnections, must be performed within the ugi context 
used to instantiate the fs (see the sketch after the list below).  Each 
filesystem ensures the original ugi is used for connections in a slightly 
different way:
* Hdfs implicitly uses {{ugi#doAs}} at the ipc client level.  The 
{{ConnectionId}} stores the ugi used to make the first connection and will 
always reconnect with that ugi.  This is why other operations don't require the 
doAs.
* Hftp explicitly uses {{ugi#doAs}} during connections for token operations 
because those operations use the secure kssl port.  Subsequent operations use 
the insecure port which doesn't require a {{ugi#doAs}}.
* Webhdfs uses the SPNEGO secured port for all operations.  The SPNEGO 
token/cookie must be negotiated for any connection and that requires the 
{{ugi#doAs}}.  In particular, HDFS-4493 needs this behavior when a SPNEGO 
token/cookie expires and a new one is required.  Without a doAs, a new SPNEGO 
token/cookie cannot be negotiated.
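
A rough sketch of the oozie-style proxy case (the user and host names below 
are made up, and this is an illustration rather than code from the patch): the 
fs is instantiated under the proxy ugi, and that same ugi is what later 
reconnections, including webhdfs SPNEGO renegotiation, must run under:

{noformat}
import java.net.URI;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUgiSketch {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();

    // A proxy ugi, e.g. what oozie builds for a submitted job's user.
    UserGroupInformation proxy = UserGroupInformation.createProxyUser(
        "jobuser", UserGroupInformation.getLoginUser());

    // The fs is instantiated inside the proxy's doAs, so the proxy ugi is
    // the context its connections were established under.
    FileSystem fs = proxy.doAs(new PrivilegedExceptionAction<FileSystem>() {
      @Override
      public FileSystem run() throws Exception {
        return FileSystem.get(new URI("webhdfs://nn.example.com"), conf);
      }
    });

    // With this change, later operations can reconnect/renegotiate (e.g.
    // a new SPNEGO token/cookie) under the instantiating ugi without the
    // caller wrapping every call in its own doAs.
    fs.listStatus(new Path("/"));
  }
}
{noformat}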

The example that showcases the issue is in Jakob's original description.  That 
use case is exactly what internal applications rely on, and this bug has 
prevented them from using and testing webhdfs.
                
> WebHDFS doesn't use the logged in user when opening connections
> ---------------------------------------------------------------
>
>                 Key: HDFS-3367
>                 URL: https://issues.apache.org/jira/browse/HDFS-3367
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 0.23.0, 1.0.2, 2.0.0-alpha, 3.0.0
>            Reporter: Jakob Homan
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-3367.branch-23.patch, HDFS-3367.patch
>
>
> Something along the lines of
> {noformat}
> UserGroupInformation.loginUserFromKeytab(<blah blah>)
> FileSystem fs = FileSystem.get(new URI("webhdfs://blah"), conf)
> {noformat}
> doesn't work as webhdfs doesn't use the correct context and the user shows up 
> to the spnego filter without kerberos credentials:
> {noformat}
> Exception in thread "main" java.io.IOException: Authentication failed, url=http://<NN>:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=<USER>
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHttpUrlConnection(WebHdfsFileSystem.java:337)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.httpConnect(WebHdfsFileSystem.java:347)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:403)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:675)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initDelegationToken(WebHdfsFileSystem.java:176)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:160)
>       at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
> ...
> Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>       at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:232)
>       at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:141)
>       at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:217)
>       at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHttpUrlConnection(WebHdfsFileSystem.java:332)
>       ... 16 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>       at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:130)
> ...
> {noformat}
> Explicitly getting the current user's context via a doAs block works, but 
> this should be done by webhdfs. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
