Re: Review Request 29176: ACCUMULO-3420 Get Visibility Metrics from PrintInfo
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/#review66882 --- core/src/test/java/org/apache/accumulo/core/file/rfile/RFileMetricsTest.java https://reviews.apache.org/r/29176/#comment110579 Is this copied from the RFile unit test? If can the two test be modified to share this code? - kturner On Dec. 22, 2014, 3:25 p.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/ --- (Updated Dec. 22, 2014, 3:25 p.m.) Review request for accumulo. Bugs: ACCUMULO-3420 https://issues.apache.org/jira/browse/ACCUMULO-3420 Repository: accumulo Description --- Added an option to PrintInfo to get visibility metrics. Prints the number of times a visibilty is seen in each locality group. Also shows how many blocks in the locality group have his visibiltiy. Diffs - core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java 43586dd core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java 9dcb3a5 core/src/main/java/org/apache/accumulo/core/file/rfile/VisMetricsGatherer.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/file/rfile/RFileMetricsTest.java PRE-CREATION Diff: https://reviews.apache.org/r/29176/diff/ Testing --- Tested with a few RFiles that I made. Thanks, Jenna Huston
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java, line 41 https://reviews.apache.org/r/29386/diff/4/?file=803164#file803164line41 I don't think this should be so tightly coupled with the ZKAuthenticator. It doesn't seem that this coupling is necessary to provide authentication functionality. Have a different implementation of an Authenticator that hijacks the data storage of another implementation, could be quite confusing to administrators, especially if they need to swap out the implementation. See ACCUMULO-1300 and related issues for other options. Josh Elser wrote: I understand the intentions for ACCUMULO-1300; however, that's not implemented. Being able to leverage the existing ZKAuthorizor was wonderful so I really don't want to move away from how this fundamentally works. Shouldn't the Authenticator and Authorizor be fixed per instance (allowing reset via `accumulo init --reset-security` or w/e)? What do you actually want changed aside from documentation to make it clear to administrators (I haven't written any documentation yet). Christopher Tubbs wrote: So, I pointed out in the conversation around ACCUMULO-259, that I would prefer a pluggable security module that integrated all the security-related functions, because they are so dependent on each other. However, they were implemented as 3 separately configurable pluggable components. I would still like to see them merged (which doesn't preclude making one from modular components themselves). For now, I'm not really sure what it means for somebody to create a user when the authenticator already validates them. That's why it's confusing. Are we essentially making ZKAuthenticator Kerberos-aware, or are we providing a separate Kerberos Authenticator? That's really what it boils down to for me. It seems like we're doing a hybrid... but I think we should solidify on one or the other. Josh Elser wrote: I view it as a Kerberos Authenticator. The implementation details of that authenticator is that is happens to persist information into ZK to handle things like permissions and authorizations, but a user doesn't have to know that to use it. We could, hypothetically, swap out ZK for any other persistent store for that information and it would continue to work. What would help alleviate your confusion? More javadocs? A different class name? Christopher Tubbs wrote: Well, part is documentation. Part is behavior. If it's a separate authenticator, we should make sure that it does not use the same storage area in such a tightly coupled way. It should not, for instance, be setting things that look like passwords, and create local user, list local users, and delete local users should be unsupported client side operations, just as you've done with change password (since users are managed within Kerberos). We already share storage between the ZKAuthenticator and the corresponding permissions handler and authorizer, so it's probably okay to share the directory identified with the principal. However, it can seamlessly create this directory as needed. Mainly, I'm thinking the more we can separate the local user management functions from the KerberosAuthenticator, the better, since users are managed in Kerberos. Some of my concerns are with the current state of things, but I'm hoping not to exacerbate the problems I see with the current implementations. I don't want to make it harder to make improvements later. I'd rather move towards a better design, then simply add features to the existing ad-hoc design. Josh Elser wrote: bq. It should not, for instance, be setting things that look like passwords, and create local user, list local users, and delete local users should be unsupported client side operations, just as you've done with change password (since users are managed within Kerberos) Thanks for pointing that out. That helps me understand where your concerns are coming from. Perhaps it would be cleaner to just keep a reference to the ZkAuthenticator as a delegate (instead of extending it). This would give us a bit more obvious control over what's happening. bq. I'm thinking the more we can separate the local user management functions from the KerberosAuthenticator, the better, since users are managed in Kerberos. Some of my concerns are with the current state of things, but I'm hoping not to exacerbate the problems I see with the current implementations Agreed, those are very.. awkward, presently. Is making sure that we throw exceptions from KerberosAuthenticator when we call these non-sensical methods a sufficient change? Then, we can move to a better design later on like you said? What do
Re: Review Request 29176: ACCUMULO-3420 Get Visibility Metrics from PrintInfo
On Dec. 18, 2014, 5:20 p.m., Christopher Tubbs wrote: core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java, line 34 https://reviews.apache.org/r/29176/diff/1/?file=795044#file795044line34 It'd be neat if there were a MetricsGatherer interface, which this class implemented. That way, you could register other MetricsGatherers with the RFile Reader. The interface should have javadocs which explain when each method is expected to be called (e.g., how they are used by RFile). You mean like https://lucene.apache.org/solr/4_10_2/solr-core/org/apache/solr/core/SolrInfoMBean.html? - Mike --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/#review65502 --- On Dec. 22, 2014, 3:25 p.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/ --- (Updated Dec. 22, 2014, 3:25 p.m.) Review request for accumulo. Bugs: ACCUMULO-3420 https://issues.apache.org/jira/browse/ACCUMULO-3420 Repository: accumulo Description --- Added an option to PrintInfo to get visibility metrics. Prints the number of times a visibilty is seen in each locality group. Also shows how many blocks in the locality group have his visibiltiy. Diffs - core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java 43586dd core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java 9dcb3a5 core/src/main/java/org/apache/accumulo/core/file/rfile/VisMetricsGatherer.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/file/rfile/RFileMetricsTest.java PRE-CREATION Diff: https://reviews.apache.org/r/29176/diff/ Testing --- Tested with a few RFiles that I made. Thanks, Jenna Huston
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29176: ACCUMULO-3420 Get Visibility Metrics from PrintInfo
On Dec. 18, 2014, 12:20 p.m., Christopher Tubbs wrote: core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java, line 34 https://reviews.apache.org/r/29176/diff/1/?file=795044#file795044line34 It'd be neat if there were a MetricsGatherer interface, which this class implemented. That way, you could register other MetricsGatherers with the RFile Reader. The interface should have javadocs which explain when each method is expected to be called (e.g., how they are used by RFile). Mike Drob wrote: You mean like https://lucene.apache.org/solr/4_10_2/solr-core/org/apache/solr/core/SolrInfoMBean.html? I don't know what that is for. The updated patch does what I was thinking. - Christopher --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/#review65502 --- On Dec. 22, 2014, 10:25 a.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/ --- (Updated Dec. 22, 2014, 10:25 a.m.) Review request for accumulo. Bugs: ACCUMULO-3420 https://issues.apache.org/jira/browse/ACCUMULO-3420 Repository: accumulo Description --- Added an option to PrintInfo to get visibility metrics. Prints the number of times a visibilty is seen in each locality group. Also shows how many blocks in the locality group have his visibiltiy. Diffs - core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java 43586dd core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java 9dcb3a5 core/src/main/java/org/apache/accumulo/core/file/rfile/VisMetricsGatherer.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/file/rfile/RFileMetricsTest.java PRE-CREATION Diff: https://reviews.apache.org/r/29176/diff/ Testing --- Tested with a few RFiles that I made. Thanks, Jenna Huston
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, lines 71-79 https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line71 Why is this unsupported? Josh Elser wrote: Again, see earlier discussion. I have no idea what this is actually supposed to destroy (we have zero documentation on the matter that I found) and it's already been changed to be a noop. I am just as confident that a noop is correct as an unsupported operation is correct (I have no confidence). As far as destroying a UGI, I haven't found any parallel to the Destroyable interface. I'm not sure if something exists that we should be calling. Christopher Tubbs wrote: It's a way for objects to remove their internal state (preferably by securely overwriting memory) so any credentials can be removed from memory, to give the user control over their credentials. It is expected to be called by the user. In the earlier implementation, setting the ugi to null would have been appropriate. In the latest version, you can nullify the principal. It's not terribly important, since the principal alone is not sensitive, but it would be helpful to have reasonable behavior when the user calls these methods to avoid things like I called destroy, but it won't destroy! or My application verifies the token isn't destroyed before using it, but it seems to always be destroyed!. Even if it was a truly no-op, we could add an internal destroyed flag to give users reasonable behavior. Since we do have the principal, we can just make that null instead of adding a new flag, and use that to check the state of its destruction. Also, I think the other methods which assume it is not destroyed should throw IllegalStateException explicitly if it is, in fact, destroyed. But, I'm not even sure we do that in the other tokens, so it's probably fine to leave that behavior undefined. Josh Elser wrote: WRT UserGroupInformation, I don't know how (or if I even can) destroy the current login which is.. awkward to me. Generally speaking, this would be great to get written down somewhere -- I'll try to fire up a ticket for that so we don't have this discussion again in 6mos. The semantics of destroy are not to destroy the current login. They are to destroy the current token, removing any user credentials from it. These semantics can be accomplished with a flag, or by setting the principal to null, without touching the Kerberos state that they represent. On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: core/src/main/java/org/apache/accumulo/core/conf/Property.java, lines 165-167 https://reviews.apache.org/r/29386/diff/4/?file=803146#file803146line165 I don't think we want to expose this. It's high risk. What is the reason and how is it related to this change? Josh Elser wrote: Again, see the previous discussion on this subject. This is necessary to actually perform an apples to apples comparison of the impact of Kerberos on Accumulo. Comparing KRB/SASL with TThreadPoolServer against no KRB/SASL and THsHaServer is an invalid benchmark to determine the overhead of KRB. Like I said earlier, `Experimental` is not really the appropriate annotation here (Internal use would be more accurate), but it's what we have. Do you have a recommendation on some other change in wording/annotation to convey the intent? I'd like to not have to change the codebase any time I want to verify no performance regressions in future versions (having to change code would preclude any non-devs from running their own benchmarks too). Christopher Tubbs wrote: My main concern is that I'm not entirely confident we're not exposing these to users in some way, still: see ACCUMULO-2460 and ACCUMULO-2401. Josh Elser wrote: It sounds to me like we should introduce a new annotation then to cover things not meant for user consumption. Experimental is for maybe not stable or entirely working user facing things. This property is (likely) only relevant for developers. Any objections to addressing this by introduction of a new annotation (not in this changeset)? That's a simple enough solution. I still kind of object to the idea that experimental properties have default values at all (it's experimental, so it should be okay to set to empty value, and should be null/unused by default), which was the underlying problem between ACCUMULO-2460 and ACCUMULO-2401 (the default value was added and used as a proxy for what probably should have been a constant). If that underlying problem was addressed, then using Experimental here would be perfectly fine. However, it may still be useful to have separate, clear semantics, with an additional annotation. On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote:
Re: Review Request 29230: ACCUMULO-3439 Add RegexGroupBalancer
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29230/#review66915 --- Nit: whitespace Nit: single-line statements w/out braces server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110619 final server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110620 this(null); server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110621 final server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110622 final server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110623 final server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110624 final server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110625 @Override server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110626 @Override server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110627 @Override server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java https://reviews.apache.org/r/29230/#comment110628 Could this class be made static by passing in the table ID? - Eric Newton On Dec. 19, 2014, 1:52 a.m., kturner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29230/ --- (Updated Dec. 19, 2014, 1:52 a.m.) Review request for accumulo. Bugs: ACCUMULO-3439 https://issues.apache.org/jira/browse/ACCUMULO-3439 Repository: accumulo Description --- The patch contains a new balancer, test, and documentation. Diffs - docs/src/main/resources/examples/README.rgbalancer PRE-CREATION server/base/src/main/java/org/apache/accumulo/server/master/balancer/GroupBalancer.java PRE-CREATION server/base/src/main/java/org/apache/accumulo/server/master/balancer/RegexGroupBalancer.java PRE-CREATION server/base/src/main/java/org/apache/accumulo/server/master/balancer/TableLoadBalancer.java 5eae890 server/base/src/test/java/org/apache/accumulo/server/master/balancer/GroupBalancerTest.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/functional/RegexGroupBalanceIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29230/diff/ Testing --- The new unit test and intergeration test added in the patch run. Thanks, kturner
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29502: ACCUMULO-3458 Adding scan authorizations to IteratorEnvironment
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/#review66908 --- test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java https://reviews.apache.org/r/29502/#comment110613 Thinking a bit more about the batchScanner, `runTest()` could call another parameterized test that accepts a scanner. This would make it easy to test scanner and batch scanner. A reason to do this is if the batch scanner sets up iterators differently and does not pass auths in env, the test might catch it. runTest(auths, shouldFail){ runTest(scanner, auths, shouldFail); runTest(batchScanner, auths, shouldFail); batchScanner.close(); } - kturner On Jan. 6, 2015, 3:54 p.m., Corey Nolet wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/ --- (Updated Jan. 6, 2015, 3:54 p.m.) Review request for accumulo, Christopher Tubbs, Eric Newton, Josh Elser, and kturner. Bugs: ACCUMULO-3458 https://issues.apache.org/jira/browse/ACCUMULO-3458 Repository: accumulo Description --- ACCUMULO-3458 Propagating scan-time authorizations through the IteratorEnvironment so that scan-time iterators can use them. Diffs - core/src/main/java/org/apache/accumulo/core/client/ClientSideIteratorScanner.java 4903656 core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java 335b63a core/src/main/java/org/apache/accumulo/core/client/impl/OfflineScanner.java 2552682 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerImpl.java 666a8af core/src/main/java/org/apache/accumulo/core/client/impl/ScannerOptions.java 9726266 core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReader.java 2a79f05 core/src/main/java/org/apache/accumulo/core/client/mock/MockScannerBase.java 72cb863 core/src/main/java/org/apache/accumulo/core/iterators/IteratorEnvironment.java 9e20cb1 core/src/test/java/org/apache/accumulo/core/client/impl/ScannerImplTest.java be4d467 core/src/test/java/org/apache/accumulo/core/client/impl/TabletServerBatchReaderTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/iterators/DefaultIteratorEnvironment.java 94da7b5 core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java fa46360 core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java 4521e55 core/src/test/java/org/apache/accumulo/core/iterators/user/TransformingIteratorTest.java 4cebab7 server/base/src/test/java/org/apache/accumulo/server/iterators/MetadataBulkLoadFilterTest.java 4a45e99 server/base/src/test/java/org/apache/accumulo/server/replication/StatusCombinerTest.java a9801b0 server/monitor/src/main/java/org/apache/accumulo/monitor/servlets/trace/NullScanner.java bf35557 server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java d1fece5 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Compactor.java 869cc33 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java fe4b16b test/src/main/java/org/apache/accumulo/test/functional/AuthsIterator.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29502/diff/ Testing --- Wrote an integration test to verify that ScanDataSource is actually setting the authorizations on the IteratorEnvironment Thanks, Corey Nolet
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29176: ACCUMULO-3420 Get Visibility Metrics from PrintInfo
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/#review66906 --- core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110609 Should be generic interface MetricsGathererT with a method T getMetrics(); to expose the metrics in ways other than printing. core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110603 The javadoc should describe when init is called (eg. when it is registered with the RFile Reader). core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110604 This javadoc is incorrect, based on its usage here. What appears to be passed in is the column family, not the locality group name. new also seems to imply something is being created, when it simply indicates that a new one has been encountered. Maybe the start or begin instead of new would be more clear. [I'm also entertaining the idea that this method could just take the first key (or key/value pair) from the new locality group in order to make fewer assumptions about how it is used, but I haven't fully convinced myself that's better than strictly defining this method to take the column family portion only.] core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110605 This could take a key and a value as two parameters, to enable gathering metrics of more things. core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110606 could be called startBlock or beginBlock instead of newBlock to avoid the implication that something new is being created. core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java https://reviews.apache.org/r/29176/#comment110607 javadoc should describe the output format, or rely on proposed getMetrics() method and defer the printing to the PrintInfo command. core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java https://reviews.apache.org/r/29176/#comment110608 description should specify whether this requires -v or implies -v core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java https://reviews.apache.org/r/29176/#comment110610 toString isn't safe here; probably better to just pass the column family directly, or the whole key and let the gatherer decide how to use it. core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java https://reviews.apache.org/r/29176/#comment110612 Needs javadoc. Should specify that you can only register one (it will clobber any previously set one), and that you should do so before iterating. This could be enforced, by checking state when this method is called, but minimally, the javadocs should explain. - Christopher Tubbs On Dec. 22, 2014, 10:25 a.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29176/ --- (Updated Dec. 22, 2014, 10:25 a.m.) Review request for accumulo. Bugs: ACCUMULO-3420 https://issues.apache.org/jira/browse/ACCUMULO-3420 Repository: accumulo Description --- Added an option to PrintInfo to get visibility metrics. Prints the number of times a visibilty is seen in each locality group. Also shows how many blocks in the locality group have his visibiltiy. Diffs - core/src/main/java/org/apache/accumulo/core/file/rfile/MetricsGatherer.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java 43586dd core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java 9dcb3a5 core/src/main/java/org/apache/accumulo/core/file/rfile/VisMetricsGatherer.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/file/rfile/RFileMetricsTest.java PRE-CREATION Diff: https://reviews.apache.org/r/29176/diff/ Testing --- Tested with a few RFiles that I made. Thanks, Jenna Huston
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java, line 41 https://reviews.apache.org/r/29386/diff/4/?file=803164#file803164line41 I don't think this should be so tightly coupled with the ZKAuthenticator. It doesn't seem that this coupling is necessary to provide authentication functionality. Have
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/ --- (Updated Jan. 6, 2015, 11:14 p.m.) Review request for accumulo. Changes --- Big set of changes with not all of the stuff I wanted to do yet. * Found a really bad bug where ThriftTransportKey didn't respect the SASL principal which allowed users to reuse other user's cached transports in the same JVM (e.g. switched users) * Introduced KerberosIT which users MiniKdc to do basic Kerberos testing in the build (always runs with Kerberos) which will help greatly with catching regressions. * Introduced a `useKrbForIT` which will try to use MiniKdc for AccumuloClusterITs. Test coverage is no where close to 100% and I don't intend to do so for a first pass. Documentation and system token changes still incoming... Bugs: ACCUMULO-2815 https://issues.apache.org/jira/browse/ACCUMULO-2815 Repository: accumulo Description --- ACCUMULO-2815 Initial support for Kerberos client authentication. Leverage SASL transport provided by Thrift which can speak GSSAPI, which Kerberos implements. Introduced... * An Accumulo KerberosToken which is an AuthenticationToken to validate users. * Custom thrift processor and invocation handler to ensure server RPCs have a valid KRB identity and Accumulo authentication. * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos identities seamlessly. * New ClientConf variables to use SASL transport and pass Kerberos server principal * Updated ClientOpts and Shell opts to transparently use a KerberosToken when SASL is enabled (no extra client work). I believe this is the bare minimum for Kerberos support. They are also grossly lacking in unit and integration tests. I believe that I might have somehow broken the client address string in the server (I saw log messages with client: null, but I'm not sure if it's due to these changes or not). A necessary limitation in the Thrift server used is that, like the SSL transport, the SASL transport cannot presently be used with the TFramedTransport, which means none of the [half]async thrift servers will function with this -- we're stuck with the TThreadPoolServer. Performed some contrived benchmarks on my laptop (while still using it myself) to get at big-picture view of the performance impact against normal operation and Kerberos alone. Each run was the duration to ingest 100M records using continuous-ingest, timed with `time`, using 'real'. THsHaServer (our default), 6 runs: Avg: 10m7.273s (607.273s) Min: 9m43.395s Max: 10m52.715s TThreadPoolServer (no SASL), 5 runs: Avg: 11m16.254s (676.254s) Min: 10m30.987s Max: 12m24.192s TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs: Avg: 13m17.187s (797.187s) Min: 10m52.997s Max: 16m0.975s The general takeway is that there's about 15% performance degredation in its initial state which is in the realm of what I expected (~10%). Diffs (updated) - core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java 6fe61a5 core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java e75bec6 core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java f481cc3 core/src/main/java/org/apache/accumulo/core/client/impl/MasterClient.java a9ad8a1 core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java 6dc846f core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java 5da803b core/src/main/java/org/apache/accumulo/core/client/security/tokens/AbstractKerberosToken.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java 6eace77 core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/security/Credentials.java 525a958 core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/client/impl/ThriftTransportKeyTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java 40be70f
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 7:39 p.m., kturner wrote: core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, line 46 https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line46 This is adding UserGroupInformation to Accumulo API. Is that a stable hadoop API? Does it have to be in API, why not just principal? Josh Elser wrote: UGI is marked as LimitedPrivate and Evolving, but I have no qualms against including it into our API. The problem with accepting a principal alone, we would then have to duplicate the login capabilities that UGI already provides (e.g performing a login via password or keytab). By accepting a UGI directly, we don't have to own this logic. It's my opinion that it's easiest to just use UGI directly, but could see an argument for transparently using it behind the scenes if you think that would be better. kturner wrote: I have no qualms against including it into our API Why do you think its ok to ignore the API marking? Shouldnt we honor whats marked on the assumption that someone in Hadoop land will think they are free to make breaking changes based on whats makred? but could see an argument for transparently using it behind the scenes if you think that would be better. From an implementation perspective, I don't know whats best. I don't know what the benefits of using UGI are. What are the problems with not using it? Would not using it require a complex facade class? Would not using it cause interoperability problems with others using Kerberos in the hadoop ecosystem? If this object is really useful to users of Hadoop and we want to use it in our API, I would advocate for opening a hadoop issue to change the stability guarantees. I am uncertain if we should use it before this change is made, but thats because I don't fully understand the details. Josh Elser wrote: bq. Why do you think its ok to ignore the API marking? Shouldnt we honor whats marked on the assumption that someone in Hadoop land will think they are free to make breaking changes based on whats makred? The only difference between Stable and Evolving is that breaking changes could happen at minor releases instead of just major releases (akin to semver). So, just to make a point, even using only Stable classes will result in us having to worry about breaking changes. It's possible that we could use a very thin shim around the most often used methods of UGI, but I'm worried that would be insufficient. For example, most people use `UserGroupInformation.loginUserFromKeytab` or they are have a cached ticket (from `kinit`'ing on the command line before invoking the application) which we can get access to via `UserGroupInformation.getCurrentUser`. The other approach is to use `UserGroupInformation.createProxyUser` which accepts a principal and proxies another user on top of another UGI -- this is very useful for us as it eliminates the need to do Authorization intersection as some server-user to maintain application security. The server is running an Accumulo query for the user; however Accumulo sees the user, not the server. Back to the point though: this actual assertion is a quick fail to make sure that the user trying to use Kerberos is actually logged in (catch it and fail before we actually try to send an RPC). We could push this farther down into the codebase -- inside ConnectorImpl maybe? -- which would even moreso make this a shell AuthenticationToken. To clarify, your biggest worry is the UGI argument in the constructor and not using it in the implementation, right? kturner wrote: To clarify, your biggest worry is the UGI argument in the constructor and not using it in the implementation, right? Correct. This patch adds KerberosToken to the public API w/ a public method using UGI. I would be much less concerned if UGI were only in implementation of KerberosToken as that gives us a lot more wiggle room. kturner wrote: The only difference between Stable and Evolving is that breaking changes could happen at minor releases instead of just major releases (akin to semver). Will Hadoop deprecate before changing in a minor release? Trying to think through case where a minor hadoop release moves UGI. If Hadoop does not deprecate and just moves it, we would be up the creek w/o a paddle. W/o deprecation we would be forced to deprecated and introduce a facade anyway because we can depend on two hadoop releases simultaneously. 1. Accumulo 1.7.0 release API method w/ UGI in signature, lets call this method foo1(UGI_A). 2. Hadoop release 2.9.0 which moves UGI_A to UGI_B AND leaves UGI_A as deprecated. 3. Accumulo must release 1.8.0 w/ foo1(UGI_A) deprecated and introduce foo1(UGI_B). To bring some closure
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 2:39 p.m., kturner wrote: core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java, line 46 https://reviews.apache.org/r/29386/diff/4/?file=803145#file803145line46 This is adding UserGroupInformation to Accumulo API. Is that a stable hadoop API? Does it have to be in API, why not just principal? Josh Elser wrote: UGI is marked as LimitedPrivate and Evolving, but I have no qualms against including it into our API. The problem with accepting a principal alone, we would then have to duplicate the login capabilities that UGI already provides (e.g performing a login via password or keytab). By accepting a UGI directly, we don't have to own this logic. It's my opinion that it's easiest to just use UGI directly, but could see an argument for transparently using it behind the scenes if you think that would be better. kturner wrote: I have no qualms against including it into our API Why do you think its ok to ignore the API marking? Shouldnt we honor whats marked on the assumption that someone in Hadoop land will think they are free to make breaking changes based on whats makred? but could see an argument for transparently using it behind the scenes if you think that would be better. From an implementation perspective, I don't know whats best. I don't know what the benefits of using UGI are. What are the problems with not using it? Would not using it require a complex facade class? Would not using it cause interoperability problems with others using Kerberos in the hadoop ecosystem? If this object is really useful to users of Hadoop and we want to use it in our API, I would advocate for opening a hadoop issue to change the stability guarantees. I am uncertain if we should use it before this change is made, but thats because I don't fully understand the details. Josh Elser wrote: bq. Why do you think its ok to ignore the API marking? Shouldnt we honor whats marked on the assumption that someone in Hadoop land will think they are free to make breaking changes based on whats makred? The only difference between Stable and Evolving is that breaking changes could happen at minor releases instead of just major releases (akin to semver). So, just to make a point, even using only Stable classes will result in us having to worry about breaking changes. It's possible that we could use a very thin shim around the most often used methods of UGI, but I'm worried that would be insufficient. For example, most people use `UserGroupInformation.loginUserFromKeytab` or they are have a cached ticket (from `kinit`'ing on the command line before invoking the application) which we can get access to via `UserGroupInformation.getCurrentUser`. The other approach is to use `UserGroupInformation.createProxyUser` which accepts a principal and proxies another user on top of another UGI -- this is very useful for us as it eliminates the need to do Authorization intersection as some server-user to maintain application security. The server is running an Accumulo query for the user; however Accumulo sees the user, not the server. Back to the point though: this actual assertion is a quick fail to make sure that the user trying to use Kerberos is actually logged in (catch it and fail before we actually try to send an RPC). We could push this farther down into the codebase -- inside ConnectorImpl maybe? -- which would even moreso make this a shell AuthenticationToken. To clarify, your biggest worry is the UGI argument in the constructor and not using it in the implementation, right? kturner wrote: To clarify, your biggest worry is the UGI argument in the constructor and not using it in the implementation, right? Correct. This patch adds KerberosToken to the public API w/ a public method using UGI. I would be much less concerned if UGI were only in implementation of KerberosToken as that gives us a lot more wiggle room. kturner wrote: The only difference between Stable and Evolving is that breaking changes could happen at minor releases instead of just major releases (akin to semver). Will Hadoop deprecate before changing in a minor release? Trying to think through case where a minor hadoop release moves UGI. If Hadoop does not deprecate and just moves it, we would be up the creek w/o a paddle. W/o deprecation we would be forced to deprecated and introduce a facade anyway because we can depend on two hadoop releases simultaneously. 1. Accumulo 1.7.0 release API method w/ UGI in signature, lets call this method foo1(UGI_A). 2. Hadoop release 2.9.0 which moves UGI_A to UGI_B AND leaves UGI_A as deprecated. 3. Accumulo must release 1.8.0 w/ foo1(UGI_A) deprecated and introduce foo1(UGI_B). Josh Elser wrote:
Re: Review Request 29502: ACCUMULO-3458 Adding scan authorizations to IteratorEnvironment
On Jan. 5, 2015, 9:09 p.m., Christopher Tubbs wrote: core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java, line 63 https://reviews.apache.org/r/29502/diff/2/?file=804705#file804705line63 Should this be Authorizations.EMPTY? Or should it have a default implementation on WrappingIterator which calls source.getAuthorizations()? Christopher Tubbs wrote: make that `getSource().getAuthorizations()` Specific to this test I returned null because all the other getters (other than what was being explicitly tested) were returning null. Were you thinking WrappingIterator should also provide a getAuthorizations() method? On Jan. 5, 2015, 9:09 p.m., Christopher Tubbs wrote: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java, line 46 https://reviews.apache.org/r/29502/diff/2/?file=804711#file804711line46 I wonder if there's a better way to provide environment options, like this and others, at specific scopes. Maybe use some dependency injection, with annotations, like Servlet @Context or JUnit @Rule: @ScanContext Authorizations auths; (throw error if type is not appropriate for context during injection). This feature would be pretty neat. Were you thinking this would extend past just the IteratorEnvironment into other places? Any other fields you can think of that would benefit from this change other than Authorizations? - Corey --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/#review66725 --- On Dec. 31, 2014, 3:40 p.m., Corey Nolet wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/ --- (Updated Dec. 31, 2014, 3:40 p.m.) Review request for accumulo, Christopher Tubbs, Eric Newton, Josh Elser, and kturner. Bugs: ACCUMULO-3458 https://issues.apache.org/jira/browse/ACCUMULO-3458 Repository: accumulo Description --- ACCUMULO-3458 Propagating scan-time authorizations through the IteratorEnvironment so that scan-time iterators can use them. Diffs - core/src/main/java/org/apache/accumulo/core/client/ClientSideIteratorScanner.java 4903656 core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java 335b63a core/src/main/java/org/apache/accumulo/core/client/impl/OfflineScanner.java 2552682 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerImpl.java 666a8af core/src/main/java/org/apache/accumulo/core/client/impl/ScannerOptions.java 9726266 core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReader.java 2a79f05 core/src/main/java/org/apache/accumulo/core/client/mock/MockScannerBase.java 72cb863 core/src/main/java/org/apache/accumulo/core/iterators/IteratorEnvironment.java 9e20cb1 core/src/main/java/org/apache/accumulo/core/iterators/system/VisibilityFilter.java 15c33fa core/src/test/java/org/apache/accumulo/core/iterators/DefaultIteratorEnvironment.java 94da7b5 core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java fa46360 core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java 4521e55 core/src/test/java/org/apache/accumulo/core/iterators/user/TransformingIteratorTest.java 4cebab7 server/base/src/test/java/org/apache/accumulo/server/iterators/MetadataBulkLoadFilterTest.java 4a45e99 server/base/src/test/java/org/apache/accumulo/server/replication/StatusCombinerTest.java a9801b0 server/monitor/src/main/java/org/apache/accumulo/monitor/servlets/trace/NullScanner.java bf35557 server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java d1fece5 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Compactor.java 869cc33 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java fe4b16b test/src/main/java/org/apache/accumulo/test/functional/AuthsIterator.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29502/diff/ Testing --- Wrote an integration test to verify that ScanDataSource is actually setting the authorizations on the IteratorEnvironment Thanks, Corey Nolet
Re: Review Request 29502: ACCUMULO-3458 Adding scan authorizations to IteratorEnvironment
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/ --- (Updated Jan. 6, 2015, 3:44 p.m.) Review request for accumulo, Christopher Tubbs, Eric Newton, Josh Elser, and kturner. Changes --- Fixed based on feedback from Christopher and Keith. Noticed some extra formatting removing whitespace in some places. Bugs: ACCUMULO-3458 https://issues.apache.org/jira/browse/ACCUMULO-3458 Repository: accumulo Description --- ACCUMULO-3458 Propagating scan-time authorizations through the IteratorEnvironment so that scan-time iterators can use them. Diffs (updated) - core/src/main/java/org/apache/accumulo/core/client/BatchDeleter.java 2bfc347 core/src/main/java/org/apache/accumulo/core/client/ClientSideIteratorScanner.java 4903656 core/src/main/java/org/apache/accumulo/core/client/Scanner.java 112179e core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java 335b63a core/src/main/java/org/apache/accumulo/core/client/impl/OfflineScanner.java 2552682 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerImpl.java 666a8af core/src/main/java/org/apache/accumulo/core/client/impl/ScannerIterator.java 1e0ac99 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerOptions.java 9726266 core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReader.java 2a79f05 core/src/main/java/org/apache/accumulo/core/client/mock/MockScannerBase.java 72cb863 core/src/main/java/org/apache/accumulo/core/iterators/IteratorEnvironment.java 9e20cb1 core/src/main/java/org/apache/accumulo/core/iterators/WrappingIterator.java 060fa76 core/src/main/java/org/apache/accumulo/core/iterators/system/VisibilityFilter.java 15c33fa core/src/test/java/org/apache/accumulo/core/client/impl/ScannerImplTest.java be4d467 core/src/test/java/org/apache/accumulo/core/client/impl/TabletServerBatchReaderTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/iterators/DefaultIteratorEnvironment.java 94da7b5 core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java fa46360 core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java 4521e55 core/src/test/java/org/apache/accumulo/core/iterators/user/TransformingIteratorTest.java 4cebab7 server/base/src/test/java/org/apache/accumulo/server/iterators/MetadataBulkLoadFilterTest.java 4a45e99 server/base/src/test/java/org/apache/accumulo/server/replication/StatusCombinerTest.java a9801b0 server/monitor/src/main/java/org/apache/accumulo/monitor/servlets/trace/NullScanner.java bf35557 server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java d1fece5 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Compactor.java 869cc33 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java fe4b16b test/src/main/java/org/apache/accumulo/test/functional/AuthsIterator.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29502/diff/ Testing --- Wrote an integration test to verify that ScanDataSource is actually setting the authorizations on the IteratorEnvironment Thanks, Corey Nolet
Re: Review Request 29502: ACCUMULO-3458 Adding scan authorizations to IteratorEnvironment
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/ --- (Updated Jan. 6, 2015, 3:54 p.m.) Review request for accumulo, Christopher Tubbs, Eric Newton, Josh Elser, and kturner. Changes --- Removing files which were formatted but not changed in any other way to augment the feature in the commit. Bugs: ACCUMULO-3458 https://issues.apache.org/jira/browse/ACCUMULO-3458 Repository: accumulo Description --- ACCUMULO-3458 Propagating scan-time authorizations through the IteratorEnvironment so that scan-time iterators can use them. Diffs (updated) - core/src/main/java/org/apache/accumulo/core/client/ClientSideIteratorScanner.java 4903656 core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java 335b63a core/src/main/java/org/apache/accumulo/core/client/impl/OfflineScanner.java 2552682 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerImpl.java 666a8af core/src/main/java/org/apache/accumulo/core/client/impl/ScannerOptions.java 9726266 core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReader.java 2a79f05 core/src/main/java/org/apache/accumulo/core/client/mock/MockScannerBase.java 72cb863 core/src/main/java/org/apache/accumulo/core/iterators/IteratorEnvironment.java 9e20cb1 core/src/test/java/org/apache/accumulo/core/client/impl/ScannerImplTest.java be4d467 core/src/test/java/org/apache/accumulo/core/client/impl/TabletServerBatchReaderTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/iterators/DefaultIteratorEnvironment.java 94da7b5 core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java fa46360 core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java 4521e55 core/src/test/java/org/apache/accumulo/core/iterators/user/TransformingIteratorTest.java 4cebab7 server/base/src/test/java/org/apache/accumulo/server/iterators/MetadataBulkLoadFilterTest.java 4a45e99 server/base/src/test/java/org/apache/accumulo/server/replication/StatusCombinerTest.java a9801b0 server/monitor/src/main/java/org/apache/accumulo/monitor/servlets/trace/NullScanner.java bf35557 server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java d1fece5 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Compactor.java 869cc33 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java fe4b16b test/src/main/java/org/apache/accumulo/test/functional/AuthsIterator.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29502/diff/ Testing --- Wrote an integration test to verify that ScanDataSource is actually setting the authorizations on the IteratorEnvironment Thanks, Corey Nolet
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: core/src/main/java/org/apache/accumulo/core/conf/Property.java, lines 165-167 https://reviews.apache.org/r/29386/diff/4/?file=803146#file803146line165 I don't think we want to expose this. It's high risk. What is the reason and how is it related to this change? Josh Elser wrote: Again, see the previous discussion on this subject. This is necessary to actually perform an apples to apples comparison of the impact of Kerberos on Accumulo. Comparing KRB/SASL with TThreadPoolServer against no KRB/SASL and THsHaServer is an invalid benchmark to determine the overhead of KRB. Like I said earlier, `Experimental` is not really the appropriate annotation here (Internal use would be more accurate), but it's what we have. Do you have a recommendation on some other change in wording/annotation to convey the intent? I'd like to not have to change the codebase any time I want to verify no performance regressions in future versions (having to change code would preclude any non-devs from running their own benchmarks too). Christopher Tubbs wrote: My main concern is that I'm not entirely confident we're not exposing these to users in some way, still: see ACCUMULO-2460 and ACCUMULO-2401. Josh Elser wrote: It sounds to me like we should introduce a new annotation then to cover things not meant for user consumption. Experimental is for maybe not stable or entirely working user facing things. This property is (likely) only relevant for developers. Any objections to addressing this by introduction of a new annotation (not in this changeset)? Christopher Tubbs wrote: That's a simple enough solution. I still kind of object to the idea that experimental properties have default values at all (it's experimental, so it should be okay to set to empty value, and should be null/unused by default), which was the underlying problem between ACCUMULO-2460 and ACCUMULO-2401 (the default value was added and used as a proxy for what probably should have been a constant). If that underlying problem was addressed, then using Experimental here would be perfectly fine. However, it may still be useful to have separate, clear semantics, with an additional annotation. Set a default of and updated the get() on the enum to default to hsha. - Josh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/#review66382 --- On Jan. 7, 2015, 5:43 a.m., Josh Elser wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/ --- (Updated Jan. 7, 2015, 5:43 a.m.) Review request for accumulo. Bugs: ACCUMULO-2815 https://issues.apache.org/jira/browse/ACCUMULO-2815 Repository: accumulo Description --- ACCUMULO-2815 Initial support for Kerberos client authentication. Leverage SASL transport provided by Thrift which can speak GSSAPI, which Kerberos implements. Introduced... * An Accumulo KerberosToken which is an AuthenticationToken to validate users. * Custom thrift processor and invocation handler to ensure server RPCs have a valid KRB identity and Accumulo authentication. * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos identities seamlessly. * New ClientConf variables to use SASL transport and pass Kerberos server principal * Updated ClientOpts and Shell opts to transparently use a KerberosToken when SASL is enabled (no extra client work). I believe this is the bare minimum for Kerberos support. They are also grossly lacking in unit and integration tests. I believe that I might have somehow broken the client address string in the server (I saw log messages with client: null, but I'm not sure if it's due to these changes or not). A necessary limitation in the Thrift server used is that, like the SSL transport, the SASL transport cannot presently be used with the TFramedTransport, which means none of the [half]async thrift servers will function with this -- we're stuck with the TThreadPoolServer. Performed some contrived benchmarks on my laptop (while still using it myself) to get at big-picture view of the performance impact against normal operation and Kerberos alone. Each run was the duration to ingest 100M records using continuous-ingest, timed with `time`, using 'real'. THsHaServer (our default), 6 runs: Avg: 10m7.273s (607.273s) Min: 9m43.395s Max: 10m52.715s TThreadPoolServer (no SASL), 5 runs: Avg: 11m16.254s (676.254s) Min: 10m30.987s Max: 12m24.192s TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs: Avg: 13m17.187s (797.187s) Min:
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/ --- (Updated Jan. 7, 2015, 5:43 a.m.) Review request for accumulo. Changes --- Eliminates SystemKerberosToken (and then AbstractKerberosToken too). Fixed some warnings around config keys for kerberos configuration properties. Ensure that system user doesn't get an account. Fix some classes called during start-all.sh or stop-all.sh to not require valid krb credentials set up by the user (use the server login). Bugs: ACCUMULO-2815 https://issues.apache.org/jira/browse/ACCUMULO-2815 Repository: accumulo Description --- ACCUMULO-2815 Initial support for Kerberos client authentication. Leverage SASL transport provided by Thrift which can speak GSSAPI, which Kerberos implements. Introduced... * An Accumulo KerberosToken which is an AuthenticationToken to validate users. * Custom thrift processor and invocation handler to ensure server RPCs have a valid KRB identity and Accumulo authentication. * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos identities seamlessly. * New ClientConf variables to use SASL transport and pass Kerberos server principal * Updated ClientOpts and Shell opts to transparently use a KerberosToken when SASL is enabled (no extra client work). I believe this is the bare minimum for Kerberos support. They are also grossly lacking in unit and integration tests. I believe that I might have somehow broken the client address string in the server (I saw log messages with client: null, but I'm not sure if it's due to these changes or not). A necessary limitation in the Thrift server used is that, like the SSL transport, the SASL transport cannot presently be used with the TFramedTransport, which means none of the [half]async thrift servers will function with this -- we're stuck with the TThreadPoolServer. Performed some contrived benchmarks on my laptop (while still using it myself) to get at big-picture view of the performance impact against normal operation and Kerberos alone. Each run was the duration to ingest 100M records using continuous-ingest, timed with `time`, using 'real'. THsHaServer (our default), 6 runs: Avg: 10m7.273s (607.273s) Min: 9m43.395s Max: 10m52.715s TThreadPoolServer (no SASL), 5 runs: Avg: 11m16.254s (676.254s) Min: 10m30.987s Max: 12m24.192s TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs: Avg: 13m17.187s (797.187s) Min: 10m52.997s Max: 16m0.975s The general takeway is that there's about 15% performance degredation in its initial state which is in the realm of what I expected (~10%). Diffs (updated) - core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java 6fe61a5 core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java e75bec6 core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java f481cc3 core/src/main/java/org/apache/accumulo/core/client/impl/MasterClient.java a9ad8a1 core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java 6dc846f core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java 5da803b core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java 6eace77 core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/security/Credentials.java 525a958 core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/client/impl/ThriftTransportKeyTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java 40be70f core/src/test/java/org/apache/accumulo/core/rpc/SaslConnectionParamsTest.java PRE-CREATION minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloClusterImpl.java 27d6b19 minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloConfigImpl.java 26c23ed pom.xml ae188a0 proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java 4b048eb
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29386/ --- (Updated Jan. 7, 2015, 6:16 a.m.) Review request for accumulo. Changes --- Updated default value for property controlling thrift server instantiated. Bugs: ACCUMULO-2815 https://issues.apache.org/jira/browse/ACCUMULO-2815 Repository: accumulo Description --- ACCUMULO-2815 Initial support for Kerberos client authentication. Leverage SASL transport provided by Thrift which can speak GSSAPI, which Kerberos implements. Introduced... * An Accumulo KerberosToken which is an AuthenticationToken to validate users. * Custom thrift processor and invocation handler to ensure server RPCs have a valid KRB identity and Accumulo authentication. * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos identities seamlessly. * New ClientConf variables to use SASL transport and pass Kerberos server principal * Updated ClientOpts and Shell opts to transparently use a KerberosToken when SASL is enabled (no extra client work). I believe this is the bare minimum for Kerberos support. They are also grossly lacking in unit and integration tests. I believe that I might have somehow broken the client address string in the server (I saw log messages with client: null, but I'm not sure if it's due to these changes or not). A necessary limitation in the Thrift server used is that, like the SSL transport, the SASL transport cannot presently be used with the TFramedTransport, which means none of the [half]async thrift servers will function with this -- we're stuck with the TThreadPoolServer. Performed some contrived benchmarks on my laptop (while still using it myself) to get at big-picture view of the performance impact against normal operation and Kerberos alone. Each run was the duration to ingest 100M records using continuous-ingest, timed with `time`, using 'real'. THsHaServer (our default), 6 runs: Avg: 10m7.273s (607.273s) Min: 9m43.395s Max: 10m52.715s TThreadPoolServer (no SASL), 5 runs: Avg: 11m16.254s (676.254s) Min: 10m30.987s Max: 12m24.192s TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs: Avg: 13m17.187s (797.187s) Min: 10m52.997s Max: 16m0.975s The general takeway is that there's about 15% performance degredation in its initial state which is in the realm of what I expected (~10%). Diffs (updated) - core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java 6fe61a5 core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java e75bec6 core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java f481cc3 core/src/main/java/org/apache/accumulo/core/client/impl/MasterClient.java a9ad8a1 core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java 6dc846f core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java 5da803b core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java 6eace77 core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/security/Credentials.java 525a958 core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/client/impl/ThriftTransportKeyTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java 40be70f core/src/test/java/org/apache/accumulo/core/rpc/SaslConnectionParamsTest.java PRE-CREATION minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloClusterImpl.java 27d6b19 minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloConfigImpl.java 26c23ed pom.xml ae188a0 proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java 4b048eb server/base/src/main/java/org/apache/accumulo/server/AccumuloServerContext.java 09ae4f4 server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java 046cfb5 server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandler.java PRE-CREATION
Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
On Dec. 30, 2014, 10:46 p.m., Christopher Tubbs wrote: shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210 Why is username the short user name? Is that unique in Kerberos? If not, the long version should be used everywhere instead. Otherwise, one user can appear to be another in logs, etc. If getShortUserName is not unique, it should avoided everywhere. Josh Elser wrote: Check out: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html Kerberos principals are of the form: primary/instance@realm. Kerberos principals are typically categorized as users and services. A user is not qualified to a single instance (a host) and represent authentication across the realm. For example, els...@example.com means that I can roam. Conversely, a service is typically fixed to a specific host. For example, accumulo/node1.example@example.com means that there is a process, logged in as 'accumulo' on the host 'node1.example.com'. That service can't be run on any other host. Now, an important note if someone actually creates a principal accum...@example.com this is unique with respect to any other accumulo/`host`@EXAMPLE.COM principal. I'm not sure if we need to do anything else other than convention of kerberos principals, or if we should be including the instance in our username when present. This kind of ties back into the SystemCredentials discussion again. Christopher Tubbs wrote: Okay, so a smart configuration would make shortnames unique. However, UserGroupInformation returns only the `primary` for the short name. This means that user names will have to be unique across realms and instances. Right now, you are storing permissions using the short name. So, any user with the same primary, will be able to masquerade as any other user with the same primary from a different instance and/or realm, and be able to user their permissions and authorizations. That's the problem with the shortname here. That's very unexpected. Josh Elser wrote: Bingo. If you look at how HDFS does their configuration, this is the same convention. The lack of documentation from me leaves something to be desired here, and I apologize for that. To save you looking at HDFS (if you care not to look), you'll see that an HDFS process uses a given principal with a special replacement string `_HOST`. The common convention is to use something like `dn/_h...@example.com` (the realm is unimportant for this example). This ensures that the same configuration files can be used across all hosts in the HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as datanodes and this is protected by the fact that access to the KDC is restricted (you can't make your own user). The circle of trust is two-fold: having a keytab with the correct principal and that Hadoop is requires that specific configuration (which restricts the principal). Christopher Tubbs wrote: My concerns here are more about the impact on users, than for the system credentials. I don't know what HDFS is doing, but if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal, then they are less secure than I think we should be. Referencing HDFS also seems to imply that we're not so much doing Kerberos, as we are implementing HDFS-specific Kerberos conventions (which are less secure, with respect to data authorizations/permissions within Accumulo, than I'm comfortable with). Josh Elser wrote: bq. if they aren't (minimally) checking the realm when checking permissions/access on an authenticated principal Do you mean the instance instead of the realm? In the case of a single realm, the KDC is going to verify the correct realm. Assuming you meant the instance though (the optional /hostname), it's typical that a user has the ability to use their credentials anywhere. Thus, you typically see principals without instances for actual users. As far as I understand it, that's what HDFS tends to follow and what I tried to as well. Accumulo doesn't care where you come from, just what your name is and that you have valid credentials. I don't think we're being substantially less secure by not including the instance in the Accumulo principal. Christopher Tubbs wrote: No, I mean the realm, to make it only necessary to guarantee uniqueness within a realm, vs. across all known realms (more reasonable of a guarantee to make for a KDC user admin). We could also include the instance (when specified), if we want to really be careful that users aren't sharing permissions. In my concerns, I'm assuming we
Re: Review Request 29502: ACCUMULO-3458 Adding scan authorizations to IteratorEnvironment
On Jan. 5, 2015, 4:09 p.m., Christopher Tubbs wrote: core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java, line 63 https://reviews.apache.org/r/29502/diff/2/?file=804705#file804705line63 Should this be Authorizations.EMPTY? Or should it have a default implementation on WrappingIterator which calls source.getAuthorizations()? Christopher Tubbs wrote: make that `getSource().getAuthorizations()` Corey Nolet wrote: Specific to this test I returned null because all the other getters (other than what was being explicitly tested) were returning null. Were you thinking WrappingIterator should also provide a getAuthorizations() method? Oh, nevermind. I misunderstood the change. This is fine. - Christopher --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/#review66725 --- On Jan. 6, 2015, 10:54 a.m., Corey Nolet wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29502/ --- (Updated Jan. 6, 2015, 10:54 a.m.) Review request for accumulo, Christopher Tubbs, Eric Newton, Josh Elser, and kturner. Bugs: ACCUMULO-3458 https://issues.apache.org/jira/browse/ACCUMULO-3458 Repository: accumulo Description --- ACCUMULO-3458 Propagating scan-time authorizations through the IteratorEnvironment so that scan-time iterators can use them. Diffs - core/src/main/java/org/apache/accumulo/core/client/ClientSideIteratorScanner.java 4903656 core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java 335b63a core/src/main/java/org/apache/accumulo/core/client/impl/OfflineScanner.java 2552682 core/src/main/java/org/apache/accumulo/core/client/impl/ScannerImpl.java 666a8af core/src/main/java/org/apache/accumulo/core/client/impl/ScannerOptions.java 9726266 core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReader.java 2a79f05 core/src/main/java/org/apache/accumulo/core/client/mock/MockScannerBase.java 72cb863 core/src/main/java/org/apache/accumulo/core/iterators/IteratorEnvironment.java 9e20cb1 core/src/test/java/org/apache/accumulo/core/client/impl/ScannerImplTest.java be4d467 core/src/test/java/org/apache/accumulo/core/client/impl/TabletServerBatchReaderTest.java PRE-CREATION core/src/test/java/org/apache/accumulo/core/iterators/DefaultIteratorEnvironment.java 94da7b5 core/src/test/java/org/apache/accumulo/core/iterators/FirstEntryInRowIteratorTest.java fa46360 core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java 4521e55 core/src/test/java/org/apache/accumulo/core/iterators/user/TransformingIteratorTest.java 4cebab7 server/base/src/test/java/org/apache/accumulo/server/iterators/MetadataBulkLoadFilterTest.java 4a45e99 server/base/src/test/java/org/apache/accumulo/server/replication/StatusCombinerTest.java a9801b0 server/monitor/src/main/java/org/apache/accumulo/monitor/servlets/trace/NullScanner.java bf35557 server/tserver/src/main/java/org/apache/accumulo/tserver/TabletIteratorEnvironment.java d1fece5 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Compactor.java 869cc33 server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java fe4b16b test/src/main/java/org/apache/accumulo/test/functional/AuthsIterator.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ScanIteratorIT.java PRE-CREATION Diff: https://reviews.apache.org/r/29502/diff/ Testing --- Wrote an integration test to verify that ScanDataSource is actually setting the authorizations on the IteratorEnvironment Thanks, Corey Nolet