> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
> > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java,
> >  lines 71-79
> > <https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line71>
> >
> >     Why is this unsupported?
> 
> Josh Elser wrote:
>     Again, see earlier discussion. I have no idea what this is actually 
> supposed to destroy (we have zero documentation on the matter that I found) 
> and it's already been changed to be a noop. I am just as confident that a 
> noop is correct as an unsupported operation is correct (I have no confidence).
>     
>     As far as destroying a UGI, I haven't found any parallel to the 
> Destroyable interface. I'm not sure if something exists that we should be 
> calling.
> 
> Christopher Tubbs wrote:
>     It's a way for objects to remove their internal state (preferably by 
> securely overwriting memory) so any credentials can be removed from memory, 
> to give the user control over their credentials. It is expected to be called 
> by the user.
>     
>     In the earlier implementation, setting the ugi to null would have been 
> appropriate. In the latest version, you can nullify the principal. It's not 
> terribly important, since the principal alone is not sensitive, but it would 
> be helpful to have reasonable behavior when the user calls these methods to 
> avoid things like "I called destroy, but it won't destroy!" or "My 
> application verifies the token isn't destroyed before using it, but it seems 
> to always be destroyed!".
>     
>     Even if it was a truly no-op, we could add an internal "destroyed" flag 
> to give users reasonable behavior. Since we do have the principal, we can 
> just make that null instead of adding a new flag, and use that to check the 
> state of its destruction.
>     
>     Also, I think the other methods which assume it is not destroyed should 
> throw IllegalStateException explicitly if it is, in fact, destroyed. But, I'm 
> not even sure we do that in the other tokens, so it's probably fine to leave 
> that behavior undefined.
> 
> Josh Elser wrote:
>     WRT UserGroupInformation, I don't know how (or if I even can) destroy the 
> current login which is.. awkward to me. Generally speaking, this would be 
> great to get written down somewhere -- I'll try to fire up a ticket for that 
> so we don't have this discussion again in 6mos.

The semantics of destroy are not to destroy the current login. They are to 
destroy the current token, removing any user credentials from it. These 
semantics can be accomplished with a flag, or by setting the principal to null, 
without touching the Kerberos state that they represent.


> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
> > core/src/main/java/org/apache/accumulo/core/conf/Property.java, lines 
> > 165-167
> > <https://reviews.apache.org/r/29386/diff/4/?file=803146#file803146line165>
> >
> >     I don't think we want to expose this. It's high risk. What is the 
> > reason and how is it related to this change?
> 
> Josh Elser wrote:
>     Again, see the previous discussion on this subject. This is necessary to 
> actually perform an apples to apples comparison of the impact of Kerberos on 
> Accumulo. Comparing KRB/SASL with TThreadPoolServer against no KRB/SASL and 
> THsHaServer is an invalid benchmark to determine the overhead of KRB.
>     
>     Like I said earlier, `Experimental` is not really the appropriate 
> annotation here (Internal use would be more accurate), but it's what we have. 
> Do you have a recommendation on some other change in wording/annotation to 
> convey the intent?
>     
>     I'd like to not have to change the codebase any time I want to verify no 
> performance regressions in future versions (having to change code would 
> preclude any non-devs from running their own benchmarks too).
> 
> Christopher Tubbs wrote:
>     My main concern is that I'm not entirely confident we're not exposing 
> these to users in some way, still: see ACCUMULO-2460 and ACCUMULO-2401.
> 
> Josh Elser wrote:
>     It sounds to me like we should introduce a new annotation then to cover 
> things not meant for user consumption. Experimental is for "maybe not stable 
> or entirely working" user facing things. This property is (likely) only 
> relevant for developers. Any objections to addressing this by introduction of 
> a new annotation (not in this changeset)?

That's a simple enough solution. I still kind of object to the idea that 
"experimental" properties have default values at all (it's experimental, so it 
should be okay to set to empty value, and should be null/unused by default), 
which was the underlying problem between ACCUMULO-2460 and ACCUMULO-2401 (the 
default value was added and used as a proxy for what probably should have been 
a constant). If that underlying problem was addressed, then using Experimental 
here would be perfectly fine. However, it may still be useful to have separate, 
clear semantics, with an additional annotation.


> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java, 
> > lines 285-293
> > <https://reviews.apache.org/r/29386/diff/4/?file=803157#file803157line285>
> >
> >     It was discussed in the context of ACCUMULO-259 that the root user 
> > would be a built-in, password-based user, even if other auth mechanisms 
> > were available (like Linux root user). Some proposals were made to make 
> > this easier to work with and more intuitive for users, but the 
> > initialization of the root user should really just be the 
> >     
> >     See ACCUMULO-1300 and related issues, for more details.
> >     
> >     What are the risks of making the root user rely on Kerberos?
> 
> Josh Elser wrote:
>     Thanks for the background; I didn't remember this discussion.
>     
>     Hadoop does have a similar approach that we could adopt but I'm extremely 
> hesitant to recommend we do it (because I think it's some of the ugliest 
> configuration in Hadoop). Look up hadoop.security.auth_to_local: (first 
> result from google for me 
> http://hortonworks.com/blog/fine-tune-your-apache-hadoop-security-settings/). 
> Essentially, we could introduce this very convoluted user mapping into the 
> "root" user. Personally, I think this is much more confusing.
>     
>     I regret not being a part of the previous discussion, but I don't 
> understand why we should rely on always having a password based user. It 
> seems like forcing us into edge-case-city. Do you have more information on 
> this discussion? I didn't find anything on ACCUMULO-259 on it and I want to 
> make sure I'm up to speed. Comparing our "root" user is the Linux root user 
> is a bit of a fallacy in my opinion because they're not equivalent. In a 
> properly configured system, root should never be used by anyone but an 
> administrator to delegate permissions or perform one-time tasks (to avoid 
> permission delegation).
>     
>     In my opinion, avoid some "magical" user helps with the auditing and 
> authentication Accumulo uses as a whole. Each client that comes into Accumulo 
> is a user. They have a Kerberos identity and we treat them as the "Accumulo 
> user" based on that Kerberos identity. The internal "!SYSTEM" user is the 
> only other user; however it is still identified by a specific Kerberos 
> identity (the ones we know the servers use to log in).
>     
>     Also, this may benefit from non-reviewboard discussion, please feel free 
> to pop this over to the ML.
> 
> Christopher Tubbs wrote:
>     As far as I know, all the conversation happened in the comments on 
> ACCUMULO-259 and it's linked issues, as well as on the mailing list, but I 
> don't have direct references, other than the issues I linked.
>     
>     The root user as compared to the Linux root user is analogous, not 
> equivalent. I agree that a root user should never be used except to delegate 
> and perform one time tasks. The problem is that, like the Linux root user, it 
> is sometimes needed to do these tasks when other components of the system 
> (such as the pluggable authentication service) are unavailable. The same 
> permissions could be delegated to an admin in that service, so the built-in 
> root would be even less frequently used, but it still might be needed. This 
> is the same reasoning as setting a BIOS password, or Linux root user, or 
> MySQL root user, or Hadoop SuperUser, or ZK's "super" user.
>     
>     The point may be moot, though, if the system is completely unusable if 
> Kerberos is down (because the system token relies on Kerberos). Though... I 
> can see a use case for restarting the system with Kerberos disabled to do 
> maintenance and to have root available for that.
>     
>     As for the system user being a Kerberos identity, see my other concerns 
> regarding that.
> 
> Josh Elser wrote:
>     bq. This is the same reasoning as setting a BIOS password, or Linux root 
> user, or MySQL root user, or Hadoop SuperUser, or ZK's "super" user.
>     
>     I kind of see what's done here as analogous to what Hadoop already 
> provides -- a single, well-defined user with `System.SYSTEM`.
>     
>     bq. The point may be moot, though, if the system is completely unusable 
> if Kerberos is down (because the system token relies on Kerberos). 
>     
>     Yes, that's accurate.
>     
>     bq. Though... I can see a use case for restarting the system with 
> Kerberos disabled to do maintenance and to have root available for that.
>     
>     I think this would be wrong. If the KDC is down, you're already going to 
> be unable to connect to HDFS or ZooKeeper. Allowing a "backdoor" into the 
> instance without the (strong) Kerberos identity seems like a security risk to 
> me. 
>     
>     I still have to think about the "re-initialization" problem (both 
> malicious and benign as "I lost my credentials"). I'm also a bit worried 
> about trying to tackle the "change authentication handlers" problem in the 
> first patch as this is already gettin really big (despite my best efforts).

bq. Allowing a "backdoor" into the instance without the (strong) Kerberos 
identity seems like a security risk to me.

It's a trade-off between maintenance and security. And, I question the idea 
that a "strong" root password is any less secure than a keytab. It's also an 
acceptable risk, in my opinion, just as it is in the analogous systems.

I agree that tackling the authentication handler problem in a first pass here 
risks derailing this. It's unfortunate that ACCUMULO-1300 was not completed 
sooner.


> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java,
> >  line 41
> > <https://reviews.apache.org/r/29386/diff/4/?file=803164#file803164line41>
> >
> >     I don't think this should be so tightly coupled with the 
> > ZKAuthenticator. It doesn't seem that this coupling is necessary to provide 
> > authentication functionality.
> >     
> >     Have a different implementation of an Authenticator that hijacks the 
> > data storage of another implementation, could be quite confusing to 
> > administrators, especially if they need to swap out the implementation.
> >     
> >     See ACCUMULO-1300 and related issues for other options.
> 
> Josh Elser wrote:
>     I understand the intentions for ACCUMULO-1300; however, that's not 
> implemented. Being able to leverage the existing ZKAuthorizor was wonderful 
> so I really don't want to move away from how this fundamentally works.
>     
>     Shouldn't the Authenticator and Authorizor be fixed per instance 
> (allowing reset via `accumulo init --reset-security` or w/e)? What do you 
> actually want changed aside from documentation to make it clear to 
> administrators (I haven't written any documentation yet).
> 
> Christopher Tubbs wrote:
>     So, I pointed out in the conversation around ACCUMULO-259, that I would 
> prefer a pluggable security module that integrated all the security-related 
> functions, because they are so dependent on each other. However, they were 
> implemented as 3 separately configurable pluggable components. I would still 
> like to see them merged (which doesn't preclude making one from modular 
> components themselves).
>     
>     For now, I'm not really sure what it means for somebody to "create a 
> user" when the authenticator already validates them. That's why it's 
> confusing. Are we essentially making ZKAuthenticator Kerberos-aware, or are 
> we providing a separate Kerberos Authenticator? That's really what it boils 
> down to for me. It seems like we're doing a hybrid... but I think we should 
> solidify on one or the other.
> 
> Josh Elser wrote:
>     I view it as a Kerberos Authenticator. The implementation details of that 
> authenticator is that is happens to persist information into ZK to handle 
> things like permissions and authorizations, but a user doesn't have to know 
> that to use it. We could, hypothetically, swap out ZK for any other 
> persistent store for that information and it would continue to work.
>     
>     What would help alleviate your confusion? More javadocs? A different 
> class name?

Well, part is documentation. Part is behavior.

If it's a separate authenticator, we should make sure that it does not use the 
same storage area in such a tightly coupled way. It should not, for instance, 
be setting things that look like passwords, and "create local user", "list 
local users", and "delete local users" should be unsupported client side 
operations, just as you've done with "change password" (since users are managed 
within Kerberos). We already share storage between the ZKAuthenticator and the 
corresponding permissions handler and authorizer, so it's probably okay to 
share the directory identified with the principal. However, it can seamlessly 
create this directory as needed.

Mainly, I'm thinking the more we can separate the "local user" management 
functions from the KerberosAuthenticator, the better, since users are managed 
in Kerberos. Some of my concerns are with the current state of things, but I'm 
hoping not to exacerbate the problems I see with the current implementations. I 
don't want to make it harder to make improvements later. I'd rather move 
towards a better design, then simply add features to the existing ad-hoc design.


> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
> > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210
> > <https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210>
> >
> >     Why is username the "short" user name? Is that unique in Kerberos? If 
> > not, the long version should be used everywhere instead. Otherwise, one 
> > user can appear to be another in logs, etc.
> >     
> >     If "getShortUserName" is not unique, it should avoided everywhere.
> 
> Josh Elser wrote:
>     Check out: 
> http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html
>     
>     Kerberos principals are of the form: primary/instance@realm. Kerberos 
> principals are typically categorized as users and services. A user is not 
> qualified to a single instance (a host) and represent authentication across 
> the realm. For example, els...@example.com means that I can "roam". 
> Conversely, a service is typically "fixed" to a specific host. For example, 
> accumulo/node1.example....@example.com means that there is a process, logged 
> in as 'accumulo' on the host 'node1.example.com'. That service can't be run 
> on any other host. Now, an important note if someone actually creates a 
> principal "accum...@example.com" this is unique with respect to any other 
> "accumulo/`host`@EXAMPLE.COM" principal. I'm not sure if we need to do 
> anything else other than convention of kerberos principals, or if we should 
> be including the instance in "our" username when present.
>     
>     This kind of ties back into the SystemCredentials discussion again.
> 
> Christopher Tubbs wrote:
>     Okay, so a smart configuration would make shortnames unique. However, 
> UserGroupInformation returns only the `primary` for the short name. This 
> means that user names will have to be unique across realms and instances. 
> Right now, you are storing permissions using the short name. So, any user 
> with the same primary, will be able to masquerade as any other user with the 
> same primary from a different instance and/or realm, and be able to user 
> their permissions and authorizations. That's the problem with the shortname 
> here. That's very unexpected.
> 
> Josh Elser wrote:
>     Bingo. If you look at how HDFS does their configuration, this is the same 
> convention. The lack of documentation from me leaves something to be desired 
> here, and I apologize for that.
>     
>     To save you looking at HDFS (if you care not to look), you'll see that an 
> HDFS process uses a given principal with a special replacement string 
> `_HOST`. The common convention is to use something like 
> `dn/_h...@example.com` (the realm is unimportant for this example). This 
> ensures that the same configuration files can be used across all hosts in the 
> HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the 
> host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as 
> datanodes and this is protected by the fact that access to the KDC is 
> restricted (you can't make your own user). The circle of trust is two-fold: 
> having a keytab with the correct principal and that Hadoop is requires that 
> specific configuration (which restricts the principal).

My concerns here are more about the impact on users, than for the system 
credentials. I don't know what HDFS is doing, but if they aren't (minimally) 
checking the realm when checking permissions/access on an authenticated 
principal, then they are less secure than I think we should be. Referencing 
HDFS also seems to imply that we're not so much doing Kerberos, as we are 
implementing HDFS-specific Kerberos conventions (which are less secure, with 
respect to data authorizations/permissions within Accumulo, than I'm 
comfortable with).


- Christopher


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29386/#review66382
-----------------------------------------------------------


On Dec. 31, 2014, 4:24 p.m., Josh Elser wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29386/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 4:24 p.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-2815
>     https://issues.apache.org/jira/browse/ACCUMULO-2815
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> ACCUMULO-2815 Initial support for Kerberos client authentication.
> 
> Leverage SASL transport provided by Thrift which can speak GSSAPI, which 
> Kerberos implements. Introduced...
> 
> * An Accumulo KerberosToken which is an AuthenticationToken to validate users.
> * Custom thrift processor and invocation handler to ensure server RPCs have a 
> valid KRB identity and Accumulo authentication.
> * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos 
> identities seamlessly.
> * New ClientConf variables to use SASL transport and pass Kerberos server 
> principal
> * Updated ClientOpts and Shell opts to transparently use a KerberosToken when 
> SASL is enabled (no extra client work).
> 
> I believe this is the "bare minimum" for Kerberos support. They are also 
> grossly lacking in unit and integration tests. I believe that I might have 
> somehow broken the client address string in the server (I saw log messages 
> with client: null, but I'm not sure if it's due to these changes or not). A 
> necessary limitation in the Thrift server used is that, like the SSL 
> transport, the SASL transport cannot presently be used with the 
> TFramedTransport, which means none of the [half]async thrift servers will 
> function with this -- we're stuck with the TThreadPoolServer.
> 
> Performed some contrived benchmarks on my laptop (while still using it 
> myself) to get at big-picture view of the performance impact against "normal" 
> operation and Kerberos alone. Each "run" was the duration to ingest 100M 
> records using continuous-ingest, timed with `time`, using 'real'.
> 
> THsHaServer (our default), 6 runs:
> 
> Avg: 10m7.273s (607.273s)
> Min: 9m43.395s
> Max: 10m52.715s
> 
> TThreadPoolServer (no SASL), 5 runs:
> 
> Avg: 11m16.254s (676.254s)
> Min: 10m30.987s
> Max: 12m24.192s
> 
> TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs:
> 
> Avg: 13m17.187s (797.187s)
> Min: 10m52.997s
> Max: 16m0.975s
> 
> The general takeway is that there's about 15% performance degredation in its 
> initial state which is in the realm of what I expected (~10%).
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 
>   core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java 
> 6fe61a5 
>   core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java 
> e75bec6 
>   core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java 
> f481cc3 
>   
> core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java
>  6dc846f 
>   
> core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java
>  5da803b 
>   
> core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java
>  PRE-CREATION 
>   core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f 
>   core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java 
> 6eace77 
>   core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 
>   core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java 
> PRE-CREATION 
>   
> core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java
>  PRE-CREATION 
>   core/src/main/java/org/apache/accumulo/core/security/Credentials.java 
> 525a958 
>   core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 
>   
> core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java
>  PRE-CREATION 
>   
> core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java 
> 40be70f 
>   
> core/src/test/java/org/apache/accumulo/core/rpc/SaslConnectionParamsTest.java 
> PRE-CREATION 
>   proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java 4b048eb 
>   
> server/base/src/main/java/org/apache/accumulo/server/AccumuloServerContext.java
>  09ae4f4 
>   server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java 
> 046cfb5 
>   
> server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandler.java
>  PRE-CREATION 
>   
> server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingWrapper.java
>  PRE-CREATION 
>   server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java 
> 641c0bf 
>   
> server/base/src/main/java/org/apache/accumulo/server/rpc/ThriftServerType.java
>  PRE-CREATION 
>   
> server/base/src/main/java/org/apache/accumulo/server/security/SecurityOperation.java
>  5e81018 
>   
> server/base/src/main/java/org/apache/accumulo/server/security/SecurityUtil.java
>  29e4939 
>   
> server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java
>  a59d57c 
>   
> server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java
>  PRE-CREATION 
>   
> server/base/src/main/java/org/apache/accumulo/server/thrift/UGIAssumingProcessor.java
>  PRE-CREATION 
>   
> server/base/src/test/java/org/apache/accumulo/server/AccumuloServerContextTest.java
>  PRE-CREATION 
>   
> server/base/src/test/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandlerTest.java
>  PRE-CREATION 
>   
> server/base/src/test/java/org/apache/accumulo/server/security/SystemCredentialsTest.java
>  4202a7e 
>   server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java 
> 93a9a49 
>   
> server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogsTest.java
>  f98721f 
>   
> server/gc/src/test/java/org/apache/accumulo/gc/SimpleGarbageCollectorTest.java
>  99558b8 
>   
> server/gc/src/test/java/org/apache/accumulo/gc/replication/CloseWriteAheadLogReferencesTest.java
>  cad1e01 
>   server/master/src/main/java/org/apache/accumulo/master/Master.java 12195fa 
>   server/tracer/src/main/java/org/apache/accumulo/tracer/TraceServer.java 
> 7e33300 
>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 
> d5c1d2f 
>   shell/src/main/java/org/apache/accumulo/shell/Shell.java 58308ff 
>   shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java 8167ef8 
>   shell/src/test/java/org/apache/accumulo/shell/ShellConfigTest.java 0e72c8c 
>   shell/src/test/java/org/apache/accumulo/shell/ShellOptionsJCTest.java 
> PRE-CREATION 
>   test/src/main/java/org/apache/accumulo/test/functional/ZombieTServer.java 
> eb84533 
>   
> test/src/main/java/org/apache/accumulo/test/performance/thrift/NullTserver.java
>  2ebc2e3 
>   
> test/src/test/java/org/apache/accumulo/server/security/SystemCredentialsIT.java
>  fb71f5f 
> 
> Diff: https://reviews.apache.org/r/29386/diff/
> 
> 
> Testing
> -------
> 
> Ensure existing unit tests still function. Accumulo is functional and ran 
> continuous ingest multiple times using a client with only a Kerberos identity 
> (no user/password provided). Used MIT Kerberos with Apache Hadoop 2.6.0 and 
> Apache ZooKeeper 3.4.5.
> 
> 
> Thanks,
> 
> Josh Elser
> 
>

Reply via email to