[ 
https://issues.apache.org/jira/browse/ACCUMULO-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277545#comment-15277545
 ] 

William Slacum commented on ACCUMULO-4306:
------------------------------------------

I think both of those points are really the same thing, which is "Can't a 
process running on the cluster that isn't Accumulo read backing files?" The 
answer is most likely yes, given a couple of circumstances:

1) Any user in the accumulo group will be able to read backing files. 
Authentication is a non-factor for any such process. Potentially using 
something like Ranger would allow stricter authorization policies to mitigate 
this. 

2) Without HDFS authentication enabled, a user could masquerade as another user 
via webhdfs and that allows them to get around the authorization restriction 
mentioned above. In my use case, we have some other safeguards in place to 
mitigate this, such as cell level encryption that would require credentials 
external to both HDFS & Accumulo. 
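
To make #2 concrete, here is a minimal Java sketch of the request shape WebHDFS documents for clusters with security off, where the caller's identity is whatever the {{user.name}} query parameter says. The host, port, and file path are hypothetical placeholders, and the helper name {{openUri}} is invented for illustration; the sketch only builds the URI and does not contact a cluster.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class WebHdfsSimpleAuthSketch {

    // Builds an OPEN request URI in the form WebHDFS accepts when security is
    // off. Nothing here proves who the caller is: the NameNode simply trusts
    // the user.name value. Assumes `path` is already a valid, absolute URI path.
    static URI openUri(String nnHost, int nnHttpPort, String path, String claimedUser) {
        String user = URLEncoder.encode(claimedUser, StandardCharsets.UTF_8);
        return URI.create("http://" + nnHost + ":" + nnHttpPort
                + "/webhdfs/v1" + path + "?op=OPEN&user.name=" + user);
    }

    public static void main(String[] args) {
        // Hypothetical NameNode address and RFile path, for illustration only.
        System.out.println(openUri("nn.example.com", 50070,
                "/accumulo/tables/1/default_tablet/F0000001.rf", "accumulo"));
    }
}
```

With HDFS authentication enabled, the same request would need a valid Kerberos (SPNEGO) credential or a delegation token instead of a bare {{user.name}} claim.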

For clarity's sake, when I say "user" I mean a user as HDFS would see it. I was 
not referring to an Accumulo user, which is, by default, a synthetic account in 
ZooKeeper. As the architecture currently sits, only a subset of users (the 
Accumulo service principals, {{accumulo/HOST@REALM}}) interact with HDFS. 
Mostly I'm trying to get around the fact that "security" is either "on" or 
"off" for all of HDFS, and I have specific read paths through Accumulo for 
external clients that I'd prefer be strongly authenticated so they can get 
their proper authorizations. 

I was thinking about whether or not an encryption zone over Accumulo's data 
directory would help as well, but so long as a user can pretend to be another 
user, they could use the masquerading approach in #2 to decrypt the directory's 
contents.

I believe that the multi-volume case *should* work correctly right now. So long 
as each NN process is running as its own service principal, SASL+GSSAPI should 
be able to handle negotiating tickets between a given TServer (as 
{{accumulo/$HOST@REALM}}) and the NN as if it were any other service.

Would an update to our documentation/user manual that outlines the consequences 
of security configurations (both current and those as a result of this ticket) 
help sway you one way or the other? I think there are already undocumented gaps 
in our current capabilities, and this ticket would just add more unknown 
variables. Specifically, you've mentioned reading backing files, but there are 
other concerns from Accumulo's perspective (such as user authorizations) that 
are a separate class of protection mechanisms which I'm also trying to consider.

As always, I appreciate the comments, Sean :)

> Support Kerberos authentication terminating at Accumulo
> -------------------------------------------------------
>
>                 Key: ACCUMULO-4306
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4306
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core, rpc
>            Reporter: William Slacum
>            Assignee: William Slacum
>              Labels: authentication, kerberos
>             Fix For: 1.8.0
>
>
> We currently support Kerberos authentication via SASL+GSSAPI. Due to an 
> implementation detail, turning it on requires also enabling Kerberos for HDFS.
> This ticket proposes changing the implementation to avoid needing to turn on 
> Kerberos authentication for HDFS, but still (optionally) using it. Mostly, I 
> think this boils down to replacing uses of {{UserGroupInformation}} with 
> {{Subject}} references. There are a couple of places (specifically around creating 
> delegation tokens for use with a Kerberos-enabled Hadoop cluster) where 
> {{UserGroupInformation}} may need to stick around.
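
As a rough illustration of what a {{Subject}}-based login could look like, the sketch below performs a keytab login through plain JAAS, without Hadoop's {{UserGroupInformation}}. It is not the ticket's implementation: the class and method names and the "accumulo-sketch" context name are invented here, and it assumes a JDK that ships {{com.sun.security.auth.module.Krb5LoginModule}}.

```java
import java.security.PrivilegedAction;
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

final class SubjectLoginSketch {

    // In-memory JAAS configuration for a keytab login, so no jaas.conf file
    // is needed. The returned Configuration ignores the requested entry name.
    static Configuration keytabConfig(String principal, String keytabPath) {
        Map<String, String> opts = new HashMap<>();
        opts.put("useKeyTab", "true");
        opts.put("keyTab", keytabPath);
        opts.put("principal", principal);
        opts.put("storeKey", "true");
        opts.put("doNotPrompt", "true");
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED, opts);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {entry};
            }
        };
    }

    // Logs in against the KDC and returns the authenticated Subject.
    // Needs a reachable KDC and a valid keytab to succeed.
    static Subject login(String principal, String keytabPath) throws LoginException {
        LoginContext ctx = new LoginContext("accumulo-sketch", null, null,
                keytabConfig(principal, keytabPath));
        ctx.login();
        return ctx.getSubject();
    }

    // Runs an action with the Subject's credentials available to GSSAPI.
    // Subject.doAs is deprecated on newer JDKs in favor of Subject.callAs.
    static <T> T runAs(Subject subject, PrivilegedAction<T> action) {
        return Subject.doAs(subject, action);
    }
}
```

A server would call {{login}} once with its service principal (for example {{accumulo/HOST@REALM}}) and wrap its SASL setup in {{runAs}}; whether that fully replaces {{UserGroupInformation}} is exactly the open question above, particularly around delegation tokens.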



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
