[ 
https://issues.apache.org/jira/browse/HADOOP-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019834#comment-16019834
 ] 

Wei-Chiu Chuang commented on HADOOP-14441:
------------------------------------------

Hi [~shahrs87], there are two ways to configure KMS-HA: one is to put the KMS 
servers behind a VIP, and the other is to use LoadBalancingKMSClientProvider, 
which is the approach adopted by Cloudera. At a high level, clients are not 
aware of KMS HA in the former configuration, and the VIP is responsible for 
routing the requests. In the latter, the client is aware that there are 
multiple KMS servers and is itself responsible for routing requests to them.

The bug described here is specific to the LoadBalancingKMSClientProvider 
configuration. When a KMS client requests a delegation token from a KMS 
server, it uses the server address/port as the key under which the token is 
stored in its UGI Credentials map:

{code:title=DelegationTokenAuthenticatedURL#getDelegationToken}
  public org.apache.hadoop.security.token.Token<AbstractDelegationTokenIdentifier>
      getDelegationToken(URL url, Token token, String renewer, String doAsUser)
          throws IOException, AuthenticationException {
    Preconditions.checkNotNull(url, "url");
    Preconditions.checkNotNull(token, "token");
    try {
      token.delegationToken =
          ((KerberosDelegationTokenAuthenticator) getAuthenticator()).
              getDelegationToken(url, token, renewer, doAsUser);
      return token.delegationToken;
    } catch (IOException ex) {
      token.delegationToken = null;
      throw ex;
    }
  }
{code}
The problem is that the client is aware of the real server address/port, so 
when it looks up its Credentials map, the delegation token acquired from one 
KMS server cannot be used for another KMS server.
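
The lookup miss can be sketched in plain Java. This is a simplified model, not 
Hadoop code: the class {{CredentialsSketch}}, the String tokens, the host names 
and the port are all hypothetical, and a HashMap keyed by "host:port" merely 
stands in for the UGI Credentials map.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (not Hadoop code): a map keyed by "host:port" stands in
// for the UGI Credentials map, and a String stands in for a delegation token.
class CredentialsSketch {
    private final Map<String, String> tokensByService = new HashMap<>();

    // Store a token under the address of the KMS server that issued it.
    void addToken(String host, int port, String token) {
        tokensByService.put(host + ":" + port, token);
    }

    // Look up the token for the server the client is about to contact.
    // Returns null when no token was stored under that address.
    String getToken(String host, int port) {
        return tokensByService.get(host + ":" + port);
    }

    public static void main(String[] args) {
        CredentialsSketch creds = new CredentialsSketch();
        // Hypothetical host names and port, for illustration only.
        creds.addToken("kms1.example.com", 16000, "token-from-kms1");
        // Found: same address as the issuing server.
        System.out.println(creds.getToken("kms1.example.com", 16000));
        // Missing: the token from kms1 is not visible under kms2's address.
        System.out.println(creds.getToken("kms2.example.com", 16000));
    }
}
```

In this model, a client holding only the kms1 token has nothing to present 
when the load balancer sends its next request to kms2.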

The test case attached to this jira accurately captures the problem and the 
error.
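
For illustration only, the difference between fetching a token from one 
instance per call (round-robin) and from every instance, as the issue title 
proposes, might look like the sketch below. It is not the attached patch and 
not the real LoadBalancingKMSClientProvider: {{LoadBalancingSketch}}, 
{{fetchToken}}, {{addFromNext}} and {{addFromAll}} are hypothetical names, and 
a plain Map stands in for the Credentials object.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the real LoadBalancingKMSClientProvider.
class LoadBalancingSketch {
    private final List<String> servers; // assumed non-empty
    private int next = 0;               // round-robin cursor

    LoadBalancingSketch(List<String> servers) {
        this.servers = servers;
    }

    // Stand-in for asking one KMS server for a delegation token.
    private static String fetchToken(String server) {
        return "token-from-" + server;
    }

    // Round-robin: each call adds a token for only one server.
    void addFromNext(Map<String, String> creds) {
        String server = servers.get(next);
        next = (next + 1) % servers.size();
        creds.putIfAbsent(server, fetchToken(server));
    }

    // All instances: one call adds a token for every configured server
    // that does not already have one in the map.
    void addFromAll(Map<String, String> creds) {
        for (String server : servers) {
            creds.putIfAbsent(server, fetchToken(server));
        }
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("kms1:16000", "kms2:16000");

        Map<String, String> roundRobin = new HashMap<>();
        new LoadBalancingSketch(servers).addFromNext(roundRobin);
        System.out.println(roundRobin.keySet()); // only one server covered

        Map<String, String> all = new HashMap<>();
        new LoadBalancingSketch(servers).addFromAll(all);
        System.out.println(all.keySet());        // both servers covered
    }
}
```

With the round-robin variant, a single call leaves the client without a token 
for the other server, which matches the failure seen against one KMS but not 
the other.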

bq. Even after the fix, the jobs can fail if one the servers went temporarily 
down and came back later and if the job was launched in between these time 
frame.
I agree this is a problem. Presumably there's a way for the KMS servers to 
share the same URL, but the current Hadoop Authentication framework is shared 
by multiple agents, including the YARN client, so I am not sure what would be 
a better approach to fix it without affecting other agents.

> LoadBalancingKMSClientProvider#addDelegationTokens should add delegation 
> tokens from all KMS instances
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14441
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14441
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.7.0
>         Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HADOOP-14441.001.patch, HADOOP-14441.002.patch, 
> HADOOP-14441.003.patch
>
>
> LoadBalancingKMSClientProvider only gets a delegation token from one KMS 
> instance, in a round-robin fashion. This is arguably a bug, as JavaDoc for 
> {{KeyProviderDelegationTokenExtension#addDelegationTokens}} states:
> {quote}
> /**
>      * The implementer of this class will take a renewer and add all
>      * delegation tokens associated with the renewer to the 
>      * <code>Credentials</code> object if it is not already present, 
> ...
> **/
> {quote}
> This bug doesn't pop up very often, because HDFS clients such as MapReduce 
> unintentionally call {{FileSystem#addDelegationTokens}} multiple times.
> We have a custom client that accesses HDFS/KMS-HA using delegation tokens, 
> and we were puzzled why it always throws "Failed to find any Kerberos tgt" 
> exceptions when talking to one KMS but not the other. It turns out that the 
> client couldn't talk to the KMS because {{FileSystem#addDelegationTokens}} 
> only gets one KMS delegation token at a time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
