> On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java, lines > > 272-277 > > <https://reviews.apache.org/r/29386/diff/4/?file=803139#file803139line272> > > > > Are these mutually exclusive?
Yes, as far as I understand it. Hypothetically, I don't think they have to be, but the thrift implementations won't properly work. We have numerous checks on the server side to ensure exclusivity (see above discussion on the subject), but I'm not sure if one of them will also cover client side. Will investigate. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, > > line 41 > > <https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line41> > > > > This link tag is bad. It should be UserGroupInformation, not ugi. Meant to use `<code>ugi</code>` to refer to the parameter, not a link. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, > > lines 52-69 > > <https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line52> > > > > The no-arg constructor is used for Writable, and is used to serialize > > credentials to a file. This is useful for MapReduce. > > > > As is, this *should* work, as long as the user wishes to use the > > UserGroupInformation.getLoginUser(). However, that basically forces the > > MapReduce task to use the MapReduce task's own credentials, to talk to > > Accumulo, and not the user's. This could be a cumbersome pitfall. > > > > Is there a way to serialize the user's ugi, so that it can be > > de-serialized, falling back on the UserGroupInformation.getLoginUser() if > > the user's ugi cannot be deserialized? > > > > At the very least, the serialization should store *something* (magic > > bytes which refer to "nothing here"), so we can distinguish between that > > and future serialized options, if we add any later. MapReduce is something I need to investigate and test. I have no idea how token delegation works with MapReduce (if it's anything more than "use a keytab"). If it's that, this approach should be fine. Otherwise, it'll be problematic. I've also switched this from loginUser to currentUser (I think in the most recent patch) as I think loginUser was incorrect. I'll add some magic bytes to the serialization. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, > > lines 71-79 > > <https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line71> > > > > Why is this unsupported? Again, see earlier discussion. I have no idea what this is actually supposed to destroy (we have zero documentation on the matter that I found) and it's already been changed to be a noop. I am just as confident that a noop is correct as an unsupported operation is correct (I have no confidence). As far as destroying a UGI, I haven't found any parallel to the Destroyable interface. I'm not sure if something exists that we should be calling. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/conf/Property.java, lines > > 150-153 > > <https://reviews.apache.org/r/29386/diff/4/?file=803146#file803146line150> > > > > Descriptions should explain if these are mutually exclusive. Or, at > > least, that auth-conf obviates the need for ssl. Will add a note about SSL exclusivity. I don't know enough to compare the encryption of SSL compared to auth-conf so I don't feel comfortable documenting anything on the subject (yet). > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/conf/Property.java, lines > > 165-167 > > <https://reviews.apache.org/r/29386/diff/4/?file=803146#file803146line165> > > > > I don't think we want to expose this. It's high risk. What is the > > reason and how is it related to this change? Again, see the previous discussion on this subject. This is necessary to actually perform an apples to apples comparison of the impact of Kerberos on Accumulo. Comparing KRB/SASL with TThreadPoolServer against no KRB/SASL and THsHaServer is an invalid benchmark to determine the overhead of KRB. Like I said earlier, `Experimental` is not really the appropriate annotation here (Internal use would be more accurate), but it's what we have. Do you have a recommendation on some other change in wording/annotation to convey the intent? I'd like to not have to change the codebase any time I want to verify no performance regressions in future versions (having to change code would preclude any non-devs from running their own benchmarks too). > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java, > > lines 80-81 > > <https://reviews.apache.org/r/29386/diff/4/?file=803148#file803148line80> > > > > These are malformed javadocs. These tags are supposed to have > > descriptions. Please put javadoc descriptions or leave these off. > > > > Of the choices, in this case, I think the AccumuloConfiguration should > > really be documented. We have a bad habit of passing around configuration > > in an "AccumuloConfiguration" object, with no context of where that > > configuration came from. > > > > This is really just a utility method, so maybe it doesn't matter, but > > if it's only used internally, it might be better to use a ClientContext > > object instead of the configuration, to prevent abuse (like instantiating > > additional instances of AccumuloConfiguration to satisfy the method > > signature). If it uses the configuration from the ClientContext, it's more > > clear which configuration it is using (the one used to create the connector > > or, in the case of the server, the one used to launch the server). > > > > At the very least, the javadocs should be fixed. Good catch. This was unkempt javadoc. Recently made a change to expect ClientConfiguration -- documented the reasons for both and the difference. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java, > > lines 61-64 > > <https://reviews.apache.org/r/29386/diff/4/?file=803151#file803151line61> > > > > This is what AssertionError is for. No, they should throw RuntimeException. The comment is meant that those exceptions weren't thrown from the doAs block. They should just wrap the exceptions in a RuntimeException and let thrift do the correct thing. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java, lines 133-136 > > <https://reviews.apache.org/r/29386/diff/4/?file=803155#file803155line133> > > > > This warnings suppression is unnecessary if proxyProcClass was of type > > Class<? extends TProcessor> and the forName was followed by > > .asSubclass(TProcessor.class). > > > > I realize this wasn't part of this changeset, but it's something I just > > noticed that could be cleaned up. This code is already very problematic (in that it doesn't use TServerUtils). I forget if I made a ticket for this yet, but I will make sure it gets documented in an isue. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java, > > lines 285-293 > > <https://reviews.apache.org/r/29386/diff/4/?file=803157#file803157line285> > > > > It was discussed in the context of ACCUMULO-259 that the root user > > would be a built-in, password-based user, even if other auth mechanisms > > were available (like Linux root user). Some proposals were made to make > > this easier to work with and more intuitive for users, but the > > initialization of the root user should really just be the > > > > See ACCUMULO-1300 and related issues, for more details. > > > > What are the risks of making the root user rely on Kerberos? Thanks for the background; I didn't remember this discussion. Hadoop does have a similar approach that we could adopt but I'm extremely hesitant to recommend we do it (because I think it's some of the ugliest configuration in Hadoop). Look up hadoop.security.auth_to_local: (first result from google for me http://hortonworks.com/blog/fine-tune-your-apache-hadoop-security-settings/). Essentially, we could introduce this very convoluted user mapping into the "root" user. Personally, I think this is much more confusing. I regret not being a part of the previous discussion, but I don't understand why we should rely on always having a password based user. It seems like forcing us into edge-case-city. Do you have more information on this discussion? I didn't find anything on ACCUMULO-259 on it and I want to make sure I'm up to speed. Comparing our "root" user is the Linux root user is a bit of a fallacy in my opinion because they're not equivalent. In a properly configured system, root should never be used by anyone but an administrator to delegate permissions or perform one-time tasks (to avoid permission delegation). In my opinion, avoid some "magical" user helps with the auditing and authentication Accumulo uses as a whole. Each client that comes into Accumulo is a user. They have a Kerberos identity and we treat them as the "Accumulo user" based on that Kerberos identity. The internal "!SYSTEM" user is the only other user; however it is still identified by a specific Kerberos identity (the ones we know the servers use to log in). Also, this may benefit from non-reviewboard discussion, please feel free to pop this over to the ML. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java, > > lines 62-63 > > <https://reviews.apache.org/r/29386/diff/4/?file=803163#file803163line62> > > > > What is the reason for making SystemCredentials able to use Kerberos? > > SystemCredentials are how the system identifies itself to itself. It's all > > internal. It shouldn't rely on external components, in my opinion. The point is that the same identity that we use to authenticate clients should be used to authenticate servers. In the same vein that the SystemCredentials precludes rogue server processes from participating in the instance, relying on a specific principal to identify "valid" servers for this instance guarantees the same thing. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java, > > line 106 > > <https://reviews.apache.org/r/29386/diff/4/?file=803163#file803163line106> > > > > The SystemToken should be a separate type. it is not a password token. > > It is a token type that is exclusive to the system (it's even final, so it > > cannot be sub-class'd). It should not be swapped out with Kerberos tokens, > > or any other kind of token. > > > > It should also not be considered a password token. That just happens to > > be the closest analogy to the current implementation. It's really it's own > > special type. > > > > However, there is a proposal to make individual components > > authenticate, but I don't think this achieves that: ACCUMULO-1165 bq. it is not a password token Really? It generates a secret and uses it to authenticate with a server. That sounds like a password to me :). Seriously though, I respect that you have concerns here, but I don't have the foggiest idea of what you think this should be changed to do. Kerberos gives each server a unique token (authenticated against the KDC) via their keytab. We ensure rogue servers cannot participate in the instance by requiring the primary in the principal ('accumulo' in '`accumulo/_h...@example.com`'). SASL is also ensuring that an RPC cannot even establish a connection to Accumulo servers. What piece of the puzzle are we missing and how do you think this should operate? > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java, > > lines 153-168 > > <https://reviews.apache.org/r/29386/diff/4/?file=803163#file803163line153> > > > > This should not be here. Using Kerberos for the SystemToken removes an > > important check against rogue servers from other instances communicating > > with the current instance. > > > > It also removes all the other mandatory checks in the SystemToken which > > protect against incompatible variations in the "instance.*" properties, > > which are encapsulated in the SystemToken's hash. bq. from other instances communicating with the current instance Not true. It, along with the KerberosAuthenticator/SecurityOperation, guarantees that only servers with a primary component of their principal matching the `general.kerberos.principal` value in accumulo-site.xml can communicate with the instance. One would assume that separate realms or primaries are used for Accumulo instances sharing the same KDC, but we could try to make this stronger by (somehow) incorporating the rest of the `instance.` properties. I don't know how to approach this, however, and am open to suggestions. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java, > > line 41 > > <https://reviews.apache.org/r/29386/diff/4/?file=803164#file803164line41> > > > > I don't think this should be so tightly coupled with the > > ZKAuthenticator. It doesn't seem that this coupling is necessary to provide > > authentication functionality. > > > > Have a different implementation of an Authenticator that hijacks the > > data storage of another implementation, could be quite confusing to > > administrators, especially if they need to swap out the implementation. > > > > See ACCUMULO-1300 and related issues for other options. I understand the intentions for ACCUMULO-1300; however, that's not implemented. Being able to leverage the existing ZKAuthorizor was wonderful so I really don't want to move away from how this fundamentally works. Shouldn't the Authenticator and Authorizor be fixed per instance (allowing reset via `accumulo init --reset-security` or w/e)? What do you actually want changed aside from documentation to make it clear to administrators (I haven't written any documentation yet). > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java, > > line 46 > > <https://reviews.apache.org/r/29386/diff/4/?file=803164#file803164line46> > > > > Why is this seed fixed? To keep using ZKAuthenticator as a "delegate", I need to put some password into ZK. I don't think it's required that these random strings actually be secure, it's mostly intended as obfuscation. Thoughts? > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > shell/src/main/java/org/apache/accumulo/shell/Shell.java, line 352 > > <https://reviews.apache.org/r/29386/diff/4/?file=803174#file803174line352> > > > > Do we even need this principal field, if we can just ask the connector? Probably not. Saving a few method calls on the shell is probably not worth it. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, lines > > 176-180 > > <https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line176> > > > > Descriptions are not consistent with capitalization. ("Use" vs. "use") Obligatory, it bothers me that they're all lowercase :) > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 > > <https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210> > > > > Why is username the "short" user name? Is that unique in Kerberos? If > > not, the long version should be used everywhere instead. Otherwise, one > > user can appear to be another in logs, etc. > > > > If "getShortUserName" is not unique, it should avoided everywhere. Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can "roam". Conversely, a service is typically "fixed" to a specific host. For example, accumulo/node1.example....@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal "accum...@example.com" this is unique with respect to any other "accumulo/`host`@EXAMPLE.COM" principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in "our" username when present. This kind of ties back into the SystemCredentials discussion again. > On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: > > test/src/test/java/org/apache/accumulo/server/security/SystemCredentialsIT.java, > > line 119 > > <https://reviews.apache.org/r/29386/diff/4/?file=803178#file803178line119> > > > > This change reduces the coverage and does not behave the way that was > > being tested. > > > > Previously, we were able to verify that the test properly failed if the > > information used to generate the unique system password for that instance, > > was incorrect (e.g. a system user from another instance). > > > > Now, it seems to be testing that a regular user named "!SYSTEM" with a > > "fake" password doesn't exist. This is not correct. Will look at the diff, perhaps I changed it unintentionally. - Josh ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/#review66382 ----------------------------------------------------------- On Dec. 30, 2014, 11:16 p.m., Josh Elser wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/29386/ > ----------------------------------------------------------- > > (Updated Dec. 30, 2014, 11:16 p.m.) > > > Review request for accumulo. > > > Bugs: ACCUMULO-2815 > https://issues.apache.org/jira/browse/ACCUMULO-2815 > > > Repository: accumulo > > > Description > ------- > > ACCUMULO-2815 Initial support for Kerberos client authentication. > > Leverage SASL transport provided by Thrift which can speak GSSAPI, which > Kerberos implements. Introduced... > > * An Accumulo KerberosToken which is an AuthenticationToken to validate users. > * Custom thrift processor and invocation handler to ensure server RPCs have a > valid KRB identity and Accumulo authentication. > * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos > identities seamlessly. > * New ClientConf variables to use SASL transport and pass Kerberos server > principal > * Updated ClientOpts and Shell opts to transparently use a KerberosToken when > SASL is enabled (no extra client work). > > I believe this is the "bare minimum" for Kerberos support. They are also > grossly lacking in unit and integration tests. I believe that I might have > somehow broken the client address string in the server (I saw log messages > with client: null, but I'm not sure if it's due to these changes or not). A > necessary limitation in the Thrift server used is that, like the SSL > transport, the SASL transport cannot presently be used with the > TFramedTransport, which means none of the [half]async thrift servers will > function with this -- we're stuck with the TThreadPoolServer. > > Performed some contrived benchmarks on my laptop (while still using it > myself) to get at big-picture view of the performance impact against "normal" > operation and Kerberos alone. Each "run" was the duration to ingest 100M > records using continuous-ingest, timed with `time`, using 'real'. > > THsHaServer (our default), 6 runs: > > Avg: 10m7.273s (607.273s) > Min: 9m43.395s > Max: 10m52.715s > > TThreadPoolServer (no SASL), 5 runs: > > Avg: 11m16.254s (676.254s) > Min: 10m30.987s > Max: 12m24.192s > > TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs: > > Avg: 13m17.187s (797.187s) > Min: 10m52.997s > Max: 16m0.975s > > The general takeway is that there's about 15% performance degredation in its > initial state which is in the realm of what I expected (~10%). > > > Diffs > ----- > > core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 > core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java > 6fe61a5 > core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java > e75bec6 > core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java > f481cc3 > > core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java > 6dc846f > > core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java > 5da803b > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f > core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java > 6eace77 > core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 > core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java > PRE-CREATION > > core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/security/Credentials.java > 525a958 > core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 > > core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java > PRE-CREATION > > core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java > 40be70f > > core/src/test/java/org/apache/accumulo/core/rpc/SaslConnectionParamsTest.java > PRE-CREATION > proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java 4b048eb > > server/base/src/main/java/org/apache/accumulo/server/AccumuloServerContext.java > 09ae4f4 > server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java > 046cfb5 > > server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandler.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingWrapper.java > PRE-CREATION > server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java > 641c0bf > > server/base/src/main/java/org/apache/accumulo/server/rpc/ThriftServerType.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/security/SecurityOperation.java > 5e81018 > > server/base/src/main/java/org/apache/accumulo/server/security/SecurityUtil.java > 29e4939 > > server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java > a59d57c > > server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/thrift/UGIAssumingProcessor.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/AccumuloServerContextTest.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandlerTest.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/security/SystemCredentialsTest.java > 4202a7e > server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java > 93a9a49 > > server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogsTest.java > f98721f > > server/gc/src/test/java/org/apache/accumulo/gc/SimpleGarbageCollectorTest.java > 99558b8 > > server/gc/src/test/java/org/apache/accumulo/gc/replication/CloseWriteAheadLogReferencesTest.java > cad1e01 > server/master/src/main/java/org/apache/accumulo/master/Master.java 12195fa > server/tracer/src/main/java/org/apache/accumulo/tracer/TraceServer.java > 7e33300 > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java > d5c1d2f > shell/src/main/java/org/apache/accumulo/shell/Shell.java 58308ff > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java 8167ef8 > shell/src/test/java/org/apache/accumulo/shell/ShellConfigTest.java 0e72c8c > shell/src/test/java/org/apache/accumulo/shell/ShellOptionsJCTest.java > PRE-CREATION > test/src/main/java/org/apache/accumulo/test/functional/ZombieTServer.java > eb84533 > > test/src/main/java/org/apache/accumulo/test/performance/thrift/NullTserver.java > 2ebc2e3 > > test/src/test/java/org/apache/accumulo/server/security/SystemCredentialsIT.java > fb71f5f > > Diff: https://reviews.apache.org/r/29386/diff/ > > > Testing > ------- > > Ensure existing unit tests still function. Accumulo is functional and ran > continuous ingest multiple times using a client with only a Kerberos identity > (no user/password provided). Used MIT Kerberos with Apache Hadoop 2.6.0 and > Apache ZooKeeper 3.4.5. > > > Thanks, > > Josh Elser > >