[ 
https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186642#comment-14186642
 ] 

Steve Loughran commented on HDFS-7295:
--------------------------------------


[~aw]
bq. What Steve Loughran said.

I don't know whether to be pleased or scared by the fact you are agreeing with 
me. Maybe both.

[~adhoot]

bq. My concern is the damage with a stolen keytab is far greater than the HDFS 
token. It's universal kerberos identity versus something that works only with 
HDFS.

In a more complex application you end up needing to authenticate IPC/REST 
between different services anyway. Example: a pool of tomcat instances talking 
to HBase in YARN running against HDFS. Keytabs avoid having different solutions 
for different parts of the stack. For the example cited, I'd just have one 
single "app" account for the HBase and tomcat instances; {{sudo}} launch them 
all as that user.

bq. Ops team might consider a longer delegation token to be lower risk than 
having a more valuable asset - users's keytab - be exposed on a wide surface 
area (we need all nodes to have access to the keytabs)

Push it out during localization; rely on the NM to set up the paths securely 
and to clean up afterwards. The weaknesses then become:
# packet sniffing. Better encrypt your wires.
# NM process fails and the container then terminates: no cleanup.
# malicious processes gaining root access to the system. But once you have 
root, you can get at enough other things anyway.
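For reference, the localization route could be sketched with the YARN client API along these lines. This is only a hedged sketch, not the patch under discussion; the HDFS staging path and the {{app.keytab}} link name are hypothetical:

```java
import java.io.IOException;
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class KeytabLocalizer {

  // Register a keytab already staged in HDFS as an application-private
  // local resource. The NM downloads it into the container's private
  // directories and removes it when the application finishes -- which is
  // exactly where the "NM dies, no cleanup" weakness above comes from.
  static void addKeytab(Configuration conf, ContainerLaunchContext ctx,
      Path keytabInHdfs) throws IOException {
    FileStatus status = FileSystem.get(conf).getFileStatus(keytabInHdfs);
    LocalResource keytab = Records.newRecord(LocalResource.class);
    keytab.setResource(ConverterUtils.getYarnUrlFromPath(keytabInHdfs));
    keytab.setType(LocalResourceType.FILE);
    // APPLICATION visibility keeps the file private to this app's containers
    keytab.setVisibility(LocalResourceVisibility.APPLICATION);
    keytab.setSize(status.getLen());
    keytab.setTimestamp(status.getModificationTime());
    ctx.setLocalResources(Collections.singletonMap("app.keytab", keytab));
  }
}
```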

bq. Using keytabs for headless accounts will work for services that do not use 
the user account. Spark streaming, for example, runs as the user just like Map 
Reduce. This would mean asking user to create and deploy keytabs for those 
scenarios, correct?

Depends on the duration of the instance. Short-lived: no. Medium-lived: no. 
Long-lived: yes, you need a keytab, but it does not have to be that of the 
user submitting the job, merely one with access to the (persistent) data.
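In code, a long-lived service doing this would log in from the keytab and periodically re-login, something like the sketch below. The principal and keytab path are hypothetical; both {{UserGroupInformation}} calls are the standard Hadoop security API:

```java
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class HeadlessLogin {
  public static void main(String[] args) throws IOException {
    // Hypothetical identity: any principal with access to the persistent
    // data will do; it need not be the submitting user's own principal.
    UserGroupInformation.loginUserFromKeytab(
        "appdata@EXAMPLE.COM", "/etc/security/keytabs/appdata.keytab");

    // A long-running service calls this periodically (e.g. from its IPC
    // retry path) to re-acquire a TGT before the current one expires.
    UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
  }
}
```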

[~bcwalrus]
bq. perhaps we can add a whitelist/blacklist for who can set arbitrary lifetime 
on their DT, and whether there is a cap to the lifetime.

This adds even more complexity to a security system that is already hard for 
some people (myself, for example) to understand.

bq. It's straightforward to build a revocation mechanism, along with some stats 
reporting on DT usages, plus auditing.

Yes, but does it scale? Is every request going to have to trigger a token 
revocation check, or only a fraction of them? Even with that fraction, what 
load ends up being placed on the infrastructure, including, potentially, the 
enterprise-wide Kerberos/AD systems? We also need to think about the 
availability of this token revocation check infrastructure: whether to hide it 
in the NN and add more overhead there (as well as more data to keep in sync), 
or deploy and manage some other token revocation infrastructure. I am not, 
personally, enthused by the idea.


I don't think anyone pretends that keytabs are an ideal solution. I know some 
cluster ops teams will be unhappy about this, but I also think that saying 
"here are near-indefinite kerberos tokens" isn't going to make those people 
happy either. 

There's another option, which we looked at for Slider: pushing out new tokens 
from the client, just as the RM does token renewal today. You've got to 
remember to refresh them regularly, and be able to get those tokens to the 
processes in the YARN containers, processes that may then want to switch over 
to them. I could imagine this working, though, with Oozie jobs scheduled to do 
the renewal, and something in YARN to help with token propagation. 
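The client side of that push could look roughly like the sketch below: fetch fresh delegation tokens with the logged-in (kinit'ed) identity and write them somewhere the containers can pick them up. The renewer principal and the drop path are hypothetical; {{addDelegationTokens}} and {{writeTokenStorageFile}} are the existing Hadoop APIs:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;

public class TokenPusher {
  // Run periodically (e.g. from an Oozie-scheduled job) on a client that
  // has a valid Kerberos TGT, well before the previous tokens expire.
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Collect fresh HDFS delegation tokens; the renewer principal here
    // (the RM's) is a placeholder for whatever the cluster uses.
    Credentials creds = new Credentials();
    fs.addDelegationTokens("yarn/rm@EXAMPLE.COM", creds);

    // Write them to a well-known (hypothetical) path; containers can read
    // the file back with Credentials.readTokenStorageFile() and merge the
    // new tokens into their own UGI, switching over before expiry.
    creds.writeTokenStorageFile(new Path("hdfs:///app/tokens/latest"), conf);
  }
}
```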


> Support arbitrary max expiration times for delegation token
> -----------------------------------------------------------
>
>                 Key: HDFS-7295
>                 URL: https://issues.apache.org/jira/browse/HDFS-7295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>
> Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. 
> This is a problem for different users of HDFS such as long running YARN apps. 
> Users should be allowed to optionally specify max lifetime for their tokens.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
