[ https://issues.apache.org/jira/browse/ACCUMULO-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277545#comment-15277545 ]
William Slacum commented on ACCUMULO-4306:
------------------------------------------
I think both of those points are really the same thing, which is "Can't a
process running on the cluster that isn't Accumulo read backing files?" The
answer is most likely yes, given a couple of circumstances:
1) Any user in the accumulo group will be able to read backing files.
Authentication is a non-factor for any such process. Something like Ranger
could potentially enforce stricter authorization policies to mitigate
this.
2) Without HDFS authentication enabled, a user could masquerade as another user
via webhdfs and that allows them to get around the authorization restriction
mentioned above. In my use case, we have some other safeguards in place to
mitigate this, such as cell level encryption that would require credentials
external to both HDFS & Accumulo.
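To make point 2 concrete: with Hadoop's simple (non-Kerberos) authentication,
WebHDFS takes the caller's identity from the {{user.name}} query parameter and
does not verify it. The sketch below only builds such a URL and makes no
network call. The host name, the port (50070, the usual Hadoop 2 NameNode HTTP
port) and the file path are placeholders I made up for illustration, not values
from this ticket.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class WebHdfsSimpleAuthUrl {

    // Builds a WebHDFS OPEN request URL for an unsecured cluster.
    // With simple authentication the NameNode trusts whatever name is
    // supplied in "user.name"; nothing proves the caller is that user.
    static URI openUrl(String host, int port, String path, String claimedUser) {
        String user = URLEncoder.encode(claimedUser, StandardCharsets.UTF_8);
        return URI.create("http://" + host + ":" + port
                + "/webhdfs/v1" + path
                + "?op=OPEN&user.name=" + user);
    }

    public static void main(String[] args) {
        // Hypothetical host and path, for illustration only.
        System.out.println(openUrl("namenode.example.com", 50070,
                "/accumulo/tables/1/default_tablet/F0000001.rf", "accumulo"));
    }
}
```

Any process that can reach the NameNode's HTTP port can set {{user.name}} to a
member of the accumulo group, which is why the group-based restriction in
point 1 does not hold without HDFS authentication.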
For clarity's sake, when I say "user" I mean a user as HDFS would see it. I was
not referring to an Accumulo user, which is, by default, a synthetic account in
ZooKeeper. As the architecture currently sits, only a subset of users (the
Accumulo service principals, {{accumulo/HOST@REALM}}) interact with HDFS.
Mostly I'm trying to get around the fact that "security" is either "on" or
"off" for all of HDFS, and I have specific read paths through Accumulo for
external clients that I'd prefer be strongly authenticated so they can get
their proper authorizations.
I was thinking about whether or not an encryption zone over Accumulo's data
directory would help as well, but so long as a user can pretend to be another
user, they'd be able to do #2 to decrypt the directory's contents.
I believe that the multi-volume case *should* work correctly right now. So long
as each NN process is running as its own service principal, SASL+GSSAPI should
be able to handle negotiating tickets between a given TServer (as
{{accumulo/$HOST@REALM}}) and the NN as if it were any other service.
Would an update to our documentation/user manual that outlines the consequences
of security configurations (both current and those as a result of this ticket)
help sway you one way or the other? I think there are already undocumented gaps
in our current capabilities, and this would just add more unknown
variables. Specifically you've mentioned reading backing files, but there are
other concerns from Accumulo's perspective (such as user authorizations) that
are a separate class of protection mechanisms which I'm also trying to consider.
As always, I appreciate the comments, Sean :)
> Support Kerberos authentication terminating at Accumulo
> -------------------------------------------------------
>
> Key: ACCUMULO-4306
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4306
> Project: Accumulo
> Issue Type: Improvement
> Components: core, rpc
> Reporter: William Slacum
> Assignee: William Slacum
> Labels: authentication, kerberos
> Fix For: 1.8.0
>
>
> We currently support Kerberos authentication via SASL+GSSAPI. Due to an
> implementation detail, turning it on requires also enabling Kerberos for HDFS.
> This ticket proposes changing the implementation to avoid needing to turn on
> Kerberos authentication for HDFS, but still (optionally) using it. Mostly, I
> think this boils down to replacing uses of {{UserGroupInformation}} with
> {{Subject}} references. There are a couple of places (specifically around creating
> delegation tokens for use with a Kerberos-enabled Hadoop cluster) where
> {{UserGroupInformation}} may need to stick around.
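
As a rough illustration of what "using {{Subject}}" means at the JAAS level,
here is a minimal sketch that runs an action under a
{{javax.security.auth.Subject}} with no Hadoop classes involved. It is not
Accumulo's implementation. The class and method names are hypothetical, and the
Subject is built by hand so the example runs without a KDC. A real server would
get its Subject from a JAAS {{LoginContext}} login against a keytab.

```java
import java.security.PrivilegedAction;
import java.util.Set;
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosPrincipal;

public class SubjectSketch {

    // Hand-built Subject carrying only a principal name (no credentials).
    // Assumption for illustration: in practice this comes from
    // LoginContext.login() using the Krb5LoginModule and a keytab.
    static Subject subjectFor(String principalName) {
        Subject subject = new Subject();
        subject.getPrincipals().add(new KerberosPrincipal(principalName));
        return subject;
    }

    // Runs an action associated with the given Subject and returns the
    // name of its Kerberos principal. A SASL/GSSAPI handshake would be
    // performed inside an action like this one.
    static String runAs(Subject subject) {
        return Subject.doAs(subject, (PrivilegedAction<String>) () -> {
            Set<KerberosPrincipal> principals =
                    subject.getPrincipals(KerberosPrincipal.class);
            return principals.iterator().next().getName();
        });
    }

    public static void main(String[] args) {
        // Hypothetical service principal in the accumulo/HOST@REALM form.
        Subject s = subjectFor("accumulo/host1.example.com@EXAMPLE.COM");
        System.out.println(runAs(s));
    }
}
```

{{Subject.doAs}} is deprecated on recent JDKs in favour of {{Subject.callAs}}.
It is used here because it is also available on older Java versions.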