Re: [Vote] KIP-11 Authorization design for kafka security
Parth, Thanks for driving this. Could you update the status of the KIP in the wiki? Thanks, Jun On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: This vote is now Closed with 4 binding +1s and 4 non binding +1s. Thanks Parth On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
I am sorry to be ignorant about this but what is the new state? Adopted seems too early given we are still in code review process. Should I just make it “Code review”? Thanks Parth On 5/21/15, 8:43 AM, Jun Rao j...@confluent.io wrote: Parth, Thanks for driving this. Could you update the status of the KIP in the wiki? Thanks, Jun On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: This vote is now Closed with 4 binding +1s and 4 non binding +1s. Thanks Parth On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
The KIP and design were accepted, so the WIKI should say accepted or something similar. Specific patch status is reflected in the JIRA. On Thu, May 21, 2015 at 8:37 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I am sorry to be ignorant about this but what is the new state? Adopted seems too early given we are still in code review process. Should I just make it “Code review”? Thanks Parth On 5/21/15, 8:43 AM, Jun Rao j...@confluent.io wrote: Parth, Thanks for driving this. Could you update the status of the KIP in the wiki? Thanks, Jun On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: This vote is now Closed with 4 binding +1s and 4 non binding +1s. Thanks Parth On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
This vote is now Closed with 4 binding +1s and 4 non binding +1s. Thanks Parth On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Fri, May 15, 2015 at 7:35 PM, Jun Rao j...@confluent.io wrote: +1 Thanks, Jun On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
[Vote] KIP-11 Authorization design for kafka security
Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 Thanks, Jun On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 non-binding On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 -Jay On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 non-binding On 5/15/15, 11:43 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 non-binding On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote: +1 non-binding On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 non-binding On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote: +1 non-binding On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [Vote] KIP-11 Authorization design for kafka security
+1 non-binding. Tom Graves On Friday, May 15, 2015 2:00 PM, Don Bosco Durai bo...@apache.org wrote: +1 non-binding On 5/15/15, 11:43 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 non-binding On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote: +1 non-binding On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: [VOTE] KIP-11- Authorization design for kafka security
Suresh, We typically wrap up the voting of a KIP in a few days. However, given that this KIP is quite critical and there seem to be new questions, perhaps we can spend a bit more time to have people's concerns addressed and then resume the voting. Joe, Do you still have concerns given the previous replies? Thanks, Jun
On Thu, Apr 30, 2015 at 7:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: It is a strange choice to return "does not exist" when the condition is actually "not authorized". I have a hard time understanding why that is better for security. Perhaps in the DB world this is expected and changes may be necessary to comply with such behavior. But that should not guide what we do in Kafka. This is a voting thread for an important feature. Security is the number one feature that our users are asking for. Can't minor things like this be done in follow-up jiras? Should the focus be brought back to voting? Btw since I am new to the Kafka community, is there a period when a voting thread needs to wrap up by? Other projects generally follow 3 or 7 days. Regards, Suresh Sent from phone
From: Gwen Shapira gshap...@cloudera.com Sent: Thursday, April 30, 2015 5:32 PM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.org
On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Joe, Let me clarify on authZException. The caller gets a 403 regardless of existence of the topic, even if the topic does not exist you always get 403. This will fall under the case where we do not find any acls for a resource and as per our last decision by default we are going to deny this request.
The reason I'm digging into this is that in Hive we had to fix existing behavior after financial customers objected loudly to getting "insufficient privileges" when a real database would return "table does not exist".
I completely agree that having to handle two separate error conditions (TopicNotExist if user doesn't have READ, unless user has CREATE in which case he can see all topics and can get Unauthorized) adds complexity and will not be fun to debug. However, when implementing security, a lot of the stuff we do is around making customers pass security audits, and I suspect that a "can't know that tables even exist" test is a thing. We share pretty much the same financial customers and they seem to have the same concerns. Perhaps you can double check if you also have this requirement? (and again, sorry for not seeing this earlier and holding up the vote on what seems like a minor point. I just don't want to punt for later something when we already have an idea of what customers expect) Gwen
The configurations are listed explicitly here https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under KafkaConfig. We may add an optional config to allow the authorizer to read an arbitrary property file incrementally but that does not need to be part of this same KIP. The statement “If we can't audit the access then what good is controlling the access?” seems extreme because we still get to control the access, which IMHO is a huge win. The default authorizer implementation right now logs every allowed/denied access (see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala) in debug mode. Anybody who needs auditing could create a log4j appender to allow debug access to this class and send the log output to some audit file. Auditing is still a separate piece; we could either add an auditor interface that wraps authorizer or the other way around so authorizer and auditor can be two separate implementations.
I would love to start a new KIP and jira to discuss approaches in more detail but I don’t see the need to hold up Authorization work for the same. I don’t agree with the “this design seems too specific” given we already have 3 implementations (default, ranger, sentry) that can be supported with the current design. The authorization happens as part of handle and it is the first action, see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103 for one example. Thanks Parth
On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote: Joe, thanks for the clarification. Regarding audits, sorry I might be misunderstanding your email. Currently, if Kafka does not support audits, I think audits should be considered as a separate effort. Here are the reasons: - Audit, whether authorization is available or not, should record operations to determine what
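Editor's sketch: the message above floats "an auditor interface that wraps authorizer". A minimal illustration of that wrapping idea follows, in Python for brevity. It is not the KIP-11 interface; the function signature, the logger name `example.kafka.audit`, and the in-memory handler are all assumptions made for this example.

```python
import logging
from typing import Callable, List

# Hypothetical signature: (principal, operation, resource, host) -> allowed?
AuthorizeFn = Callable[[str, str, str, str], bool]


def with_audit(authorize: AuthorizeFn, audit_log: logging.Logger) -> AuthorizeFn:
    """Wrap an authorizer so every decision is recorded before it is returned."""

    def audited(principal: str, operation: str, resource: str, host: str) -> bool:
        allowed = authorize(principal, operation, resource, host)
        audit_log.info(
            "principal=%s operation=%s resource=%s host=%s decision=%s",
            principal, operation, resource, host,
            "ALLOW" if allowed else "DENY",
        )
        return allowed

    return audited


class ListHandler(logging.Handler):
    """Collects audit messages in memory (a stand-in for an audit file)."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(record.getMessage())


def deny_all(principal: str, operation: str, resource: str, host: str) -> bool:
    # Placeholder policy: no ACLs are configured, so every request is denied.
    return False


audit_logger = logging.getLogger("example.kafka.audit")  # hypothetical name
audit_logger.setLevel(logging.INFO)
audit_logger.propagate = False  # keep audit records out of the root logger
sink = ListHandler()
audit_logger.addHandler(sink)

authorize = with_audit(deny_all, audit_logger)
```

Because the wrapper only needs a callable, the same audit layer could sit in front of any authorizer implementation, which is one way to keep authorizer and auditor as two separate pieces.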
Re: [VOTE] KIP-11- Authorization design for kafka security
Hi, sorry I am coming in late to chime back in on this thread and haven't been able to make the KIP hangouts the last few weeks. Sorry if any of this was brought up already or I missed it. I read through the KIP and the thread(s) and a couple of things jumped out.
- Can we break out the open issues in JIRA (maybe during the hangout) that are in the KIP and resolve/flesh those out more?
- I don't see any updates with the systems test or how we can know the code works.
- We need some implementation/example/sample that we know can work in all different existing entitlement servers and not just ones that run in types of data centers too. I am not saying we should support everything but if someone had to implement https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with Kafka it has to work for them out of the box.
- We should shy away from storing JSON in Zookeeper. Let's store bytes in Storage.
- We should spend some time thinking through exceptions in the wire protocol maybe as part of this so it can keep moving forward.
~ Joe Stein
On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com wrote: Thank you for your reply, Gwen.
1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp.
Yes, I agree with your point: we should not make the rule complex.
2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range.
Supporting ranges sounds reasonable. If this feature will be in the development plan, I also don't think we can put "best matching acl" and "support IP ranges" together.
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security).
If you are interested in joining - let me know and I'll forward you the invite.
Thank you, Gwen. I have the invite and I should be at home at that time. But due to network issues, I may not be able to join the meeting smoothly. Regards Dapeng
-Original Message- From: Gwen Shapira [mailto:gshap...@cloudera.com] Sent: Tuesday, April 28, 2015 1:31 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
While I see the advantage of being able to say something like: "deny user X from hosts h1...h200" also "allow user X from host h189", there are two issues here:
1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp.
2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range.
Gwen
P.S We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite. Gwen
On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png Regards Dapeng
From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security
Thank you for your rapid reply, Parth.
* I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType
Got it, thank you.
* In the first version that I am currently writing there is no group support.
Even when we add it I don't see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match.
About this part, I think we should choose the best matching acl for authorization, whether or not we support groups. For the case https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png if 2 Acls are defined, one that denies an operation from all hosts and one that allows the operation from host1, will the operation from host1 be denied or allowed? According to the wiki, "Deny will take precedence over Allow in competing acls.", so it seems acl_1 will win the competition, but the customer's intention may be allow. I think "deny always takes precedence over Allow" is okay, but host1 - user1host1 default may
Re: [VOTE] KIP-11- Authorization design for kafka security
Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. Thanks, Jun
On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly wrote: Hi, sorry I am coming in late to chime back in on this thread and haven't been able to make the KIP hangouts the last few weeks. Sorry if any of this was brought up already or I missed it. I read through the KIP and the thread(s) and a couple of things jumped out.
- Can we break out the open issues in JIRA (maybe during the hangout) that are in the KIP and resolve/flesh those out more?
- I don't see any updates with the systems test or how we can know the code works.
- We need some implementation/example/sample that we know can work in all different existing entitlement servers and not just ones that run in types of data centers too. I am not saying we should support everything but if someone had to implement https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with Kafka it has to work for them out of the box.
- We should shy away from storing JSON in Zookeeper. Let's store bytes in Storage.
- We should spend some time thinking through exceptions in the wire protocol maybe as part of this so it can keep moving forward.
~ Joe Stein
Re: [VOTE] KIP-11- Authorization design for kafka security
smoothly. Regards Dapeng
-Original Message- From: Gwen Shapira [mailto:gshap...@cloudera.com] Sent: Tuesday, April 28, 2015 1:31 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
While I see the advantage of being able to say something like: "deny user X from hosts h1...h200" also "allow user X from host h189", there are two issues here:
1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp.
2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range.
Gwen
P.S We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite. Gwen
On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png Regards Dapeng
From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security
Thank you for your rapid reply, Parth.
* I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType
Got it, thank you.
* In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match.
About this part, I think we should choose the best matching acl for authorization, whether or not we support groups. For the case https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png if 2 Acls are defined, one that denies an operation from all hosts and one that allows the operation from host1, will the operation from host1 be denied or allowed? According to the wiki, "Deny will take precedence over Allow in competing acls.", so it seems acl_1 will win the competition, but the customer's intention may be allow. I think "deny always takes precedence over Allow" is okay, but host1 - user1host1 default may make sense.
* Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimize the storage layer any further for lookup.
* The reason why we have acl with multi everything is to reduce redundancy in acl storage. I am not sure how we will be able to reduce redundancy if we divide it by using one principal, one host, one operation.
Yes, I also agree that Acl storage should be indexed by resource. Under the resource index, it may be better to add indexes such as hosts and principals. One option may be one principal, one host, one operation. I just give you these scenarios for consideration. For the case defined in the wiki:
Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.
Acl_2 - {user:bob} is denied to READ from host1
Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from {host1, host2}.
For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove alice's READ from host1, the user may have the following ways to achieve this:
1. Remove the parts of acl_3 directly. I think if we make it divided and hierarchical, this kind of operation could be done directly in the backend.
2. Remove acl_3, and add new acls: {group:kafka-devs} is allowed to READ and WRITE from {host1, host2} and {user:alice} is allowed to READ from {host2}
3. Add two denied acls: {user:alice} is denied to WRITE from {host1,host2} and {user:alice} is denied to READ from {host1}
All these can achieve this kind of operation, but I think 1 could be more direct for user operations. If you think this optimization is not urgent, I also agree. Regards Dapeng
-Original Message- From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com
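Editor's sketch: the "Deny always wins" rule, the deny-by-default behavior when no acl matches, and the wiki example (Acl_1 to Acl_3) discussed in this thread can be illustrated with a few lines of Python. This is only an illustration under stated assumptions, not the SimpleAclAuthorizer code: the `Acl` class, its field names, the `authorize` helper, and the simplified wildcard handling are hypothetical, and groups are treated as plain strings because the first version has no group support.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

WILDCARD = "*"


@dataclass(frozen=True)
class Acl:
    """Hypothetical ACL record: several principals, hosts and operations per entry."""
    principals: FrozenSet[str]
    permission: str              # "ALLOW" or "DENY"
    hosts: FrozenSet[str]        # WILDCARD means all hosts
    operations: FrozenSet[str]

    def matches(self, principal: str, host: str, operation: str) -> bool:
        principal_ok = principal in self.principals or "user:*" in self.principals
        host_ok = host in self.hosts or WILDCARD in self.hosts
        return principal_ok and host_ok and operation in self.operations


def authorize(acls: Iterable[Acl], principal: str, host: str, operation: str) -> bool:
    matching = [a for a in acls if a.matches(principal, host, operation)]
    # Deny takes precedence over allow when competing acls match.
    if any(a.permission == "DENY" for a in matching):
        return False
    # With no matching acl at all, the request is denied by default.
    return any(a.permission == "ALLOW" for a in matching)


fs = frozenset

# The three acls from the wiki example quoted above.
WIKI_ACLS = [
    Acl(fs({"user:bob", "user:*"}), "ALLOW", fs({WILDCARD}), fs({"READ"})),
    Acl(fs({"user:bob"}), "DENY", fs({"host1"}), fs({"READ"})),
    Acl(fs({"user:alice", "group:kafka-devs"}), "ALLOW",
        fs({"host1", "host2"}), fs({"READ", "WRITE"})),
]

# Option 3 from the message above: leave acl_3 alone and add two deny acls for alice.
WITH_DENIES = WIKI_ACLS + [
    Acl(fs({"user:alice"}), "DENY", fs({"host1", "host2"}), fs({"WRITE"})),
    Acl(fs({"user:alice"}), "DENY", fs({"host1"}), fs({"READ"})),
]
```

With these definitions, bob's READ from host1 is rejected even though Acl_1 allows it, because Acl_2 also matches and deny wins. After the two deny acls of option 3 are added, alice keeps only READ from host2.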
Re: [VOTE] KIP-11- Authorization design for kafka security
that the code works. We have thorough unit tests for all the new code except for modifications made to KafkaAPI, as that has way too many dependencies to be mocked, which I guess is the reason for no existing unit tests.
* I don’t know if I completely understand the concern. We have talked with the Ranger team (Don Bosco Durai) so we at least have one custom authorizer implementation that has approved this design and they will be able to inject their authorization framework with current interfaces. Do you see any issue with the design which will prevent anyone from providing a custom implementation?
* Did not understand the concern around wire protocol, we are adding AuthorizationException to indicate that an operation was not authorized.
Thanks Parth
On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote: Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. Thanks, Jun
Re: [VOTE] KIP-11- Authorization design for kafka security
the hangout) that are in the KIP and resolve/flesh those out more? - I don't see any updates with the systems test or how we can know the code works. - We need some implementation/example/sample that we know can work in all different existing entitlement servers and not just ones that run in types of data centers too. I am not saying we should support everything but if someone had to implement https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with Kafka it has to work for them out of the box. - We should shy away from storing JSON in Zookeeper. Let's store bytes in Storage. - We should spend some time thinking through exceptions in the wire protocol maybe as part of this so it can keep moving forward. ~ Joe Stein On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com wrote: Thank you for your reply, Gwen. 1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule Deny always wins is very easy to grasp. Yes, I agree with your point: we should not make the rule complex. 2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range. Supporting ranges sounds reasonable. If this feature will be in the development plan, I also don't think we can put the best matching acl and support for IP ranges together. We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite. Thank you, Gwen. I have the invite and I should be at home at that time. But due to a network issue, I may not be able to join the meeting smoothly.
Regards Dapeng -Original Message- From: Gwen Shapira [mailto:gshap...@cloudera.com] Sent: Tuesday, April 28, 2015 1:31 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security While I see the advantage of being able to say something like: deny user X from hosts h1...h200 also allow user X from host h189, there are two issues here: 1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule Deny always wins is very easy to grasp. 2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range. Gwen P.S We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite. Gwen On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png Regards Dapeng From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security Thank you for your rapid reply, Parth. * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType Got it, thank you. * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match.
About this part, I think we should choose the best matching acl for authorization, no matter whether we support groups or not. For the case [cid:image001
Re: [VOTE] KIP-11- Authorization design for kafka security
If you have bucket A and Bucket B and in Bucket A there are patients with Disease X and Bucket B patients without Disease X. Now you try to access Alice from bucket A and you get a 403 and then from Bucket B you get a 404. What does that tell you now about Alice? Yup, she has Disease X. Uniform none existence is a good policy for protecting data. If you don't have permission then 404 not found works too. The context that I thought that applied with this discussion is because I thought the authorization module was going to be a bit more integration where the api responses were happening ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Thu, Apr 30, 2015 at 6:51 PM, Suresh Srinivas sur...@hortonworks.com wrote: Comment on AuthorizationException. I think the intent of exception should be to capture why a request is rejected. It is important from API perspective to be specific to aid debugging. Having a generic or obfuscated exception is not very useful. Does someone on getting an exception reach out to an admin to understand if a topic exists or it's an authorization issue? I am not getting the security concern. System must be ensure disallowing the access by implementing the security correctly. Not based on security by obscurity. Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.commailto:gshap...@cloudera.com Sent: Thursday, April 30, 2015 10:14 AM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.orgmailto:dev@kafka.apache.org * Regarding additional authorizers: Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed Sentry can integrate with the current APIs. Dapeng Sun, a committer on Sentry had some concerns about the IP privileges and how we prioritize privileges - but nothing that prevents Sentry from integrating with the existing solution, from what I could see. 
It seems to me that the design is very generic and adapters can be written for other authorization systems (after all, you just need to implement setACL, getACL and Authorize - all pretty basic), although I can't speak for Oracle's Identity Manager specifically. * Regarding AuthorizationException to indicate that an operation was not authorized: Sorry I missed this in previous reviewed, but now that I look at it - Many systems intentionally don't return AuthorizationException when READ privilege is missing, since this already gives too much information (that the topic exists and that you don't have privileges on it). Instead they return a variant of doesn't exist. I'm wondering if this approach is applicable / desirable for Kafka as well. Note that this doesn't remove the need for AuthorizationException - I'm just suggesting a possible refinement on its use. Gwen On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Thanks for taking the time to review. * All the open issues already have a resolution , I can open a jira for each one and add the resolution to it and resolve them immediately if you want this for tracking purposes. * We will update system tests to verify that the code works. We have thorough unit tests for all the new code except for modifications made to KafkaAPI as that has way too many dependencies to be mocked which I guess is the reason for no existing unit tests. * I don’t know if I completely understand the concern. We have talked with Ranger team (Don Bosco Durai) so we at least have one custom authorizer implementation that has approved this design and they will be able to inject their authorization framework with current interfaces. Do you see any issue with the design which will prevent anyone from providing a custom implementation? 
* Did not understand the concern around wire protocol, we are adding AuthorizationException to indicate that an operation was not authorized. Thanks Parth On 4/30/15, 5:59 AM, Jun Rao j...@confluent.iomailto:j...@confluent.io wrote: Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. Thanks, Jun On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly mailto:joe.st...@stealth.ly wrote: Hi, sorry I am coming in late to chime back in on this thread and haven't been able to make the KIP hangouts the last few weeks. Sorry if any of this was brought up already or I missed it. I read through the KIP and the thread(s) and a couple of things jumped out. - Can we break out the open issues in JIRA (maybe during the hangout) that are in the KIP and resolve/flesh those out more? - I don't see any updates with the systems test or how we can know the code works. - We need some implementation/example
Re: [VOTE] KIP-11- Authorization design for kafka security
Joe, these are good use cases, however in the first phase the granularity is at the Topic (your e.g. bucket) level and not what you are accessing within the Topic. So in your use case, if you don’t have access to “Bucket A”, then you won’t know who is in it, so you won’t know “Alice” or anyone who has “X”. The use case here: there is an HL7 topic specifically for “New Patients”, then only users “A, B or C” can publish to it and only users “X, Y or Z” can consume from it. In addition, only admin users “P, Q and R” can manage the topic permissions. I feel keeping it simple should be good enough for the first phase. Thanks Bosco On 4/30/15, 3:59 PM, Joe Stein joe.st...@stealth.ly wrote: If you have bucket A and Bucket B and in Bucket A there are patients with Disease X and Bucket B patients without Disease X. Now you try to access Alice from bucket A and you get a 403 and then from Bucket B you get a 404. What does that tell you now about Alice? Yup, she has Disease X. Uniform non-existence is a good policy for protecting data. If you don't have permission then 404 not found works too. The context that I thought that applied with this discussion is because I thought the authorization module was going to be a bit more integration where the api responses were happening ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Thu, Apr 30, 2015 at 6:51 PM, Suresh Srinivas sur...@hortonworks.com wrote: Comment on AuthorizationException. I think the intent of the exception should be to capture why a request is rejected. It is important from an API perspective to be specific to aid debugging. Having a generic or obfuscated exception is not very useful. Does someone on getting an exception reach out to an admin to understand if a topic exists or it's an authorization issue? I am not getting the security concern. The system must ensure access is disallowed by implementing the security correctly, not by relying on security by obscurity.
Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.com Sent: Thursday, April 30, 2015 10:14 AM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.org * Regarding additional authorizers: Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed Sentry can integrate with the current APIs. Dapeng Sun, a committer on Sentry had some concerns about the IP privileges and how we prioritize privileges - but nothing that prevents Sentry from integrating with the existing solution, from what I could see. It seems to me that the design is very generic and adapters can be written for other authorization systems (after all, you just need to implement setACL, getACL and Authorize - all pretty basic), although I can't speak for Oracle's Identity Manager specifically. * Regarding AuthorizationException to indicate that an operation was not authorized: Sorry I missed this in previous reviews, but now that I look at it - Many systems intentionally don't return AuthorizationException when READ privilege is missing, since this already gives too much information (that the topic exists and that you don't have privileges on it). Instead they return a variant of doesn't exist. I'm wondering if this approach is applicable / desirable for Kafka as well. Note that this doesn't remove the need for AuthorizationException - I'm just suggesting a possible refinement on its use. Gwen On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Joe, Thanks for taking the time to review. * All the open issues already have a resolution, I can open a jira for each one and add the resolution to it and resolve them immediately if you want this for tracking purposes. * We will update system tests to verify that the code works.
We have thorough unit tests for all the new code except for modifications made to KafkaAPI as that has way too many dependencies to be mocked which I guess is the reason for no existing unit tests. * I don’t know if I completely understand the concern. We have talked with Ranger team (Don Bosco Durai) so we at least have one custom authorizer implementation that has approved this design and they will be able to inject their authorization framework with current interfaces. Do you see any issue with the design which will prevent anyone from providing a custom implementation? * Did not understand the concern around wire protocol, we are adding AuthorizationException to indicate that an operation was not authorized. Thanks Parth On 4/30/15, 5:59 AM, Jun Rao j...@confluent.iomailto:j...@confluent.io wrote: Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. Thanks, Jun
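The 403-versus-404 argument quoted above can be made concrete with a small sketch. It compares two response policies for a caller without READ access: distinct errors, as Suresh prefers, and a uniform "not found", as Joe and Gwen suggest. The function names, the policy labels and the HTTP-style codes are illustrative assumptions; Kafka's wire protocol uses its own error codes.

```python
def respond(topic, user, existing_topics, readers, policy):
    """Return an HTTP-style status for a READ of `topic` by `user` (illustrative only)."""
    exists = topic in existing_topics
    allowed = user in readers.get(topic, set())
    if exists and allowed:
        return 200
    if policy == "uniform_not_found":
        # Unauthorized and missing look identical to the caller.
        return 404
    # policy == "explicit": report the precise reason for the failure.
    return 403 if exists else 404


def leaks_existence(policy):
    """True if an unauthorized caller can tell existing topics from missing ones."""
    existing = {"bucket-a"}      # "bucket-b" does not exist
    readers = {}                 # the caller has READ on nothing
    seen = {
        respond(topic, "mallory", existing, readers, policy)
        for topic in ("bucket-a", "bucket-b")
    }
    return len(seen) > 1


print(leaks_existence("explicit"))           # 403 vs 404 reveals which topic exists
print(leaks_existence("uniform_not_found"))  # both answers are 404
```

The trade-off discussed in the thread shows up directly: the explicit policy is easier to debug, while the uniform one gives an unauthorized caller nothing to distinguish.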
Re: [VOTE] KIP-11- Authorization design for kafka security
Gwen, Thanks for the clarification. My objection is, we should not do it just because of the reason that databases have always done it this way. May be there is a history there that might have forced a choice like that. That has led to other DBs to comply with it. Kafka is a different system. Let's do what is the correct thing to do. I also think it is not clear what users want here. But as an API developer I would want error conditions to be correctly identified so that supportability of the product does not suffer. Today in HDFS (for that matter Hadoop in general), the error conditions are clearly identified, such as: - Object you are trying to access does not exist - You do not have permission to access the object - The operation you are trying to do is invalid Here are some error codes that Amazon Kinesis support describing the failure/error conditions clearly: http://docs.aws.amazon.com/kinesis/latest/APIReference/CommonErrors.html From: Gwen Shapira gshap...@cloudera.com Sent: Thursday, April 30, 2015 6:05 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security I think Kafka's behavior should be driven by what users want. My only indication to what they may want is what we were forced to fix in similar cases. This is why I am advocating this behavior. I agree that this is a minor point that should not be blocking the vote. I already gave my non-binding +1 and thats the best I can do to drive this forward. If this vote passes without the behavior I believe is the right one, I will create a follow up JIRA. However, since we are still in a discussion and since both options are trivial to implement - why exactly are you objecting to Kafka behaving more like a DB in this scenario? Gwen On Thu, Apr 30, 2015 at 5:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: It is a strange choice to return does not exist when the condition is actually not authorized. I have hard time understanding why that is better for security. 
Perhaps in DB world this is expected and changes may be necessary to comply with such behavior. But that should not guide what we do in Kafka. This is a voting thread for an important feature. Security is the number one feature that our users are asking for. Can't minor things like this be done in follow-up JIRAs? Should the focus be brought back to voting? Btw since I am new to the Kafka community, is there a period by which a voting thread needs to wrap up? Other projects generally follow 3 or 7 days. Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.com Sent: Thursday, April 30, 2015 5:32 PM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.org On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Joe, Let me clarify on authZException. The caller gets a 403 regardless of existence of the topic, even if the topic does not exist you always get 403. This will fall under the case where we do not find any acls for a resource and as per our last decision by default we are going to deny this request. The reason I'm digging into this is that in Hive we had to fix existing behavior after financial customers objected loudly to getting insufficient privileges when a real database would return table does not exist. I completely agree that having to handle two separate error conditions (TopicNotExist if user doesn't have READ, unless user has CREATE in which case he can see all topics and can get Unauthorized) adds complexity and will not be fun to debug. However, when implementing security, a lot of the stuff we do is around making customers pass security audits, and I suspect that a "can't know that tables even exist" test is a thing. We share pretty much the same financial customers and they seem to have the same concerns. Perhaps you can double check if you also have this requirement?
(and again, sorry for not seeing this earlier and holding up the vote on what seems like a minor point. I just don't want to punt for later something when we already have an idea of what customers expect) Gwen The configurations are listed explicitly here https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under KafkaConfig. We may add an optional config to allow authorizer to read an arbitrary property file incrementally but that does not need to be part of this same KIP. The statement “If we can't audit the access then what good is controlling the access?” seems extreme because we still get to control the access which IMHO is a huge win. The default authorizer implementation right now logs every allowed/denied access (see here https
Re: [VOTE] KIP-11- Authorization design for kafka security
I think Kafka's behavior should be driven by what users want. My only indication to what they may want is what we were forced to fix in similar cases. This is why I am advocating this behavior. I agree that this is a minor point that should not be blocking the vote. I already gave my non-binding +1 and thats the best I can do to drive this forward. If this vote passes without the behavior I believe is the right one, I will create a follow up JIRA. However, since we are still in a discussion and since both options are trivial to implement - why exactly are you objecting to Kafka behaving more like a DB in this scenario? Gwen On Thu, Apr 30, 2015 at 5:54 PM, Suresh Srinivas sur...@hortonworks.com wrote: It is a strange choice to return does not exist when the condition is actually not authorized. I have hard time understanding why that is better for security. Perhaps in DB world this is expected and changes may be necessary to comply with such behavior. But that should not guide what we do in Kafka. This is a voting thread for an important feature. Security is the number one feature that our users are asking for. Can't minor things like this be done in a follow up jiras? Should the focus be brought back to voting? Btw since I am new to the Kafka community, is there a period when voting thread needs to wrap up by? Other projects generally follow 3 or 7 days. Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.commailto:gshap...@cloudera.com Sent: Thursday, April 30, 2015 5:32 PM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.orgmailto:dev@kafka.apache.org On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Let me clarify on authZException. The caller gets a 403 regardless of existence of the topic, even if the topic does not exist you always get 403. 
This will fall under the case where we do not find any acls for a resource and as per our last decision by default we are going to deny this request. The reason I'm digging into this is that in Hive we had to fix existing behavior after financial customers objected loudly to getting insufficient privileges when a real database would return table does not exist. I completely agree that having to handle two separate error conditions (TopicNotExist if user doesn't have READ, unless user has CREATE in which case he can see all topics and can get Unauthorized) adds complexity and will not be fun to debug. However, when implementing security, a lot of the stuff we do is around making customers pass security audits, and I suspect that a "can't know that tables even exist" test is a thing. We share pretty much the same financial customers and they seem to have the same concerns. Perhaps you can double check if you also have this requirement? (and again, sorry for not seeing this earlier and holding up the vote on what seems like a minor point. I just don't want to punt for later something when we already have an idea of what customers expect) Gwen The configurations are listed explicitly here https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under KafkaConfig. We may add an optional config to allow authorizer to read an arbitrary property file incrementally but that does not need to be part of this same KIP. The statement “If we can't audit the access then what good is controlling the access?” seems extreme because we still get to control the access which IMHO is a huge win. The default authorizer implementation right now logs every allowed/denied access (see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala) in debug mode.
Anybody who needs auditing could create a log4j appender to allow debug access to this class and send the log output to some audit file. Auditing is still a separate piece, we could either add an auditor interface that wraps authorizer or the other way around so authorizer and auditor can be two separate implementations. I would love to start a new KIP and jira to discuss approaches in more detail but I don’t see the need to hold up Authorization work for the same. I don’t agree with the “this design seems too specific” given we already have 3 implementations (default, ranger, sentry) that can be supported with the current design. The authorization happens as part of handle and it is the first action, see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103 for one example. Thanks Parth On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote: Joe, thanks
Re: [VOTE] KIP-11- Authorization design for kafka security
and I'll forward you the invite. Thank you, Gwen. I have the invite and I should be at home at that time. But due to network issue, I may can't join the meeting smoothly. Regards Dapeng -Original Message- From: Gwen Shapira [mailto:gshap...@cloudera.com] Sent: Tuesday, April 28, 2015 1:31 PM To: dev@kafka.apache.org Subject: Re [VOTE] KIP-11- Authorization design for kafka security While I see the advantage of being able to say something like: deny user X from hosts h1...h200 also allow user X from host h189, there are two issues here: 1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule Deny always wins is very easy to grasp. 2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think its a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking few servers in the range. Gwen P.S We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite. Gwen On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac l1.png Regards Dapeng From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security Thank you for your rapid reply, Parth. * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati on+In terface#KIP-11-AuthorizationInterface-PermissionType Got it, thank you. * In the first version that I am currently writing there is no group support. 
Even when we add it I don't see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match. About this part, I think we should choose the best matching acl for authorization, no matter whether we support groups or not. For the case [cid:image001.png@01D08197.E94BD410] https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png if 2 Acls are defined, one that denies an operation from all hosts and one that allows the operation from host1, will the operation from host1 be denied or allowed? According to the wiki, Deny will take precedence over Allow in competing acls, so it seems acl_1 will win the competition, but the customer's intention may be to allow. I think deny always taking precedence over Allow is okay, but host1 - user1host1 default may make sense. * Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimize the storage layer any further for lookup. * The reason why we have acl with multi everything is to reduce redundancy in acl storage. I am not sure how we will be able to reduce redundancy if we divide it by using one principal, one host, one operation. Yes, I also agree that Acl storage should be indexed by resource. Under the resource index, it may be better to add indexes such as hosts and principals. One option may be one principal, one host, one operation. I am just giving you these scenarios to consider. For the case defined in wiki: Acl_1 - {user:bob, user:*} is allowed to READ from all hosts. Acl_2 - {user:bob} is denied to READ from host1 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from {host1, host2}.
For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove alice's READ from host1, the user may have the following ways to achieve it: 1. Remove the parts of acl_3 directly. I think if we make it divided and hierarchical, this kind of operation could be done directly in the backend. 2. Remove acl_3, and add new acls: {group:kafka-devs} is allowed to READ and WRITE from {host1, host2} and {user:alice} is allowed to READ from {host2} 3. Add two denied acls, {user:alice} is denied to WRITE from {host1,host2} and {user:alice} is denied to READ from
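Option 1 above (storing one principal, one host, one operation per entry) can be sketched as a set of triples, where removing part of acl_3 becomes a plain set difference. This is only an illustration of the proposal in Python; the `expand` helper and the tuple layout are assumptions, and the thread does not settle on this storage format.

```python
from itertools import product


def expand(principals, hosts, operations):
    """Split a multi-valued Allow acl into (principal, host, operation) triples."""
    return set(product(principals, hosts, operations))


# Acl_3 from the wiki example: alice and kafka-devs may READ and WRITE from host1, host2.
acl_3 = expand({"user:alice", "group:kafka-devs"}, {"host1", "host2"}, {"READ", "WRITE"})

# Remove alice's WRITE from both hosts and alice's READ from host1.
to_remove = expand({"user:alice"}, {"host1", "host2"}, {"WRITE"}) | {
    ("user:alice", "host1", "READ")
}
remaining = acl_3 - to_remove

# The result equals what option 2 would recreate by hand.
option_2 = expand({"group:kafka-devs"}, {"host1", "host2"}, {"READ", "WRITE"}) | expand(
    {"user:alice"}, {"host2"}, {"READ"}
)
print(remaining == option_2)
print(len(acl_3), len(remaining))
```

The cost Parth points out is visible too: the single multi-valued acl_3 becomes eight stored entries once it is split.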
Re: [VOTE] KIP-11- Authorization design for kafka security
Comment on AuthorizationException. I think the intent of exception should be to capture why a request is rejected. It is important from API perspective to be specific to aid debugging. Having a generic or obfuscated exception is not very useful. Does someone on getting an exception reach out to an admin to understand if a topic exists or it's an authorization issue? I am not getting the security concern. System must be ensure disallowing the access by implementing the security correctly. Not based on security by obscurity. Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.commailto:gshap...@cloudera.com Sent: Thursday, April 30, 2015 10:14 AM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.orgmailto:dev@kafka.apache.org * Regarding additional authorizers: Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed Sentry can integrate with the current APIs. Dapeng Sun, a committer on Sentry had some concerns about the IP privileges and how we prioritize privileges - but nothing that prevents Sentry from integrating with the existing solution, from what I could see. It seems to me that the design is very generic and adapters can be written for other authorization systems (after all, you just need to implement setACL, getACL and Authorize - all pretty basic), although I can't speak for Oracle's Identity Manager specifically. * Regarding AuthorizationException to indicate that an operation was not authorized: Sorry I missed this in previous reviewed, but now that I look at it - Many systems intentionally don't return AuthorizationException when READ privilege is missing, since this already gives too much information (that the topic exists and that you don't have privileges on it). Instead they return a variant of doesn't exist. I'm wondering if this approach is applicable / desirable for Kafka as well. 
Note that this doesn't remove the need for AuthorizationException - I'm just suggesting a possible refinement on its use. Gwen On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Thanks for taking the time to review. * All the open issues already have a resolution , I can open a jira for each one and add the resolution to it and resolve them immediately if you want this for tracking purposes. * We will update system tests to verify that the code works. We have thorough unit tests for all the new code except for modifications made to KafkaAPI as that has way too many dependencies to be mocked which I guess is the reason for no existing unit tests. * I don’t know if I completely understand the concern. We have talked with Ranger team (Don Bosco Durai) so we at least have one custom authorizer implementation that has approved this design and they will be able to inject their authorization framework with current interfaces. Do you see any issue with the design which will prevent anyone from providing a custom implementation? * Did not understand the concern around wire protocol, we are adding AuthorizationException to indicate that an operation was not authorized. Thanks Parth On 4/30/15, 5:59 AM, Jun Rao j...@confluent.iomailto:j...@confluent.io wrote: Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. Thanks, Jun On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.lymailto:joe.st...@stealth.ly wrote: Hi, sorry I am coming in late to chime back in on this thread and haven't been able to make the KIP hangouts the last few weeks. Sorry if any of this was brought up already or I missed it. I read through the KIP and the thread(s) and a couple of things jumped out. - Can we break out the open issues in JIRA (maybe during the hangout) that are in the KIP and resolve/flesh those out more? 
- I don't see any updates with the systems test or how we can know the code works. - We need some implementation/example/sample that we know can work in all different existing entitlement servers and not just ones that run in types of data centers too. I am not saying we should support everything but if someone had to implement https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with Kafka it has to work for them out of the box. - We should shy away from storing JSON in Zookeeper. Lets store bytes in Storage. - We should spend some time thinking through exceptions in the wire protocol maybe as part of this so it can keep moving forward. ~ Joe Stein On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.commailto:dapeng@intel.com wrote: Thank you for your reply, Gwen. 1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule
Re: [VOTE] KIP-11- Authorization design for kafka security
Joe, thanks for the clarification. Regarding audits, sorry I might be misunderstanding your email. Currently, if Kafka does not support audits, I think audits should be considered as a separate effort. Here are the reasons: - Audit, whether authorization is available or not, should record operations to determine what is happening in the system. It should record all the operations such as create, delete, consumption of topics along with user information. It should work whether authorization is enabled or not. In Hadoop long before we added real authorization, we had audit logs. - Authorization will bring an additional element of who was denied. As part of audit effort, it is important to add along with what operations succeeded (and for whom), what operations were denied. From: Joe Stein joe.st...@stealth.ly Sent: Thursday, April 30, 2015 4:12 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security I kind of thought of the authorization module as something that happens in handle(request: RequestChannel.Reuqest) in the request.requestId match If the request doesn't do what it is allowed too it should stop right there. That what it is allowed to-do is a true/false callback to the class loaded with 1 function to accept the data and some more about what it is about (that we have access to). I think all of the other features are awesome but you can build them on top of this and then other can do the same. I am more hooked on the authorization module being a watch dog above handle() than I am on the plug-in implementation options (less is more imho). If we do this approach the audit fits in nice because we are seeing more what happens in one place and decision made for access right there. ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Thu, Apr 30, 2015 at 6:59 PM, Suresh Srinivas sur...@hortonworks.com wrote: Joe, Can you add more details on what generalization looks like? 
Also is this a design issue or code issue? One more question. Does Kafka have audit capabilities today for topic creation, deletion, access etc.? Regards, Suresh Sent from phone _ From: Joe Stein joe.st...@stealth.lymailto:joe.st...@stealth.ly Sent: Thursday, April 30, 2015 3:27 PM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.orgmailto:dev@kafka.apache.org Ok, I read through it all again a few times. I get the provider broker piece now. The configurations are still confusing if there are 2 or 3 and they should be called out more specifically than as a change to a class. Configs are a public interface we should be a bit more explicit. Was there any discussion about any auditing component? How would anyone know if the authorization plugin was running for when or what it was doing? If we can't audit the access then what good is controlling the access? I still don't see where all the command line configuration options come in. There are a lot of things to-do with it but not sure how to use it yet. This plug-in still feels like a very specific case and we should try to generalize it down some more to make it more straight forward for folks. ~ Joestein On Thu, Apr 30, 2015 at 3:51 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: During the discussion Jun pointed out that mirror maker, which right now does not copy any zookeeper config overrides, will now replicate topics but will not replicate any acls. Given the authorizer interface exposes the acl management apis, list/get/add/remove, weproposed that mirror maker can just instantiate an instance of authorizer and call these apis directly to get acls for a topic and add it to the destination cluster if we want to add acls to be replicated as part of mirror maker. 
Thanks Parth On 4/30/15, 12:43 PM, Joe Stein joe.st...@stealth.lymailto: joe.st...@stealth.ly wrote: Parth, Can you explain how Mirror maker will have to start using new acl management tool) and it not affect any other client. If you aren't changing the wire protocol then how do clients use it? ~ Joe stein On Thu, Apr 30, 2015 at 3:15 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Regarding open question: I changed the title to “Questions resolved after community discussions” let me know if you have a better name. I have a question and a bullet point under each question describing the final decision. Not sure how can I make it any cleaner so appreciate any suggestion. Regarding system tests: I went through a bunch of KIP none of which mentions what test cases will be added. Do you want to add a “How do you plan to tet” section in the general KIP template or you think this is just
Re: [VOTE] KIP-11- Authorization design for kafka security
Ah, I'm not talking about security by obscurity. At least in the database world, if you don't have SELECT on a table, you won't even see it when saying show tables because the very fact that the table exists is privileged. In that case, a denied SELECT attempt will return table does not exist, and not permission denied. It is simply a question of what the privilege covers. I was wondering if it is desirable to apply the same model to Kafka. Gwen On Thu, Apr 30, 2015 at 3:51 PM, Suresh Srinivas sur...@hortonworks.com wrote: Comment on AuthorizationException. I think the intent of exception should be to capture why a request is rejected. It is important from API perspective to be specific to aid debugging. Having a generic or obfuscated exception is not very useful. Does someone on getting an exception reach out to an admin to understand if a topic exists or it's an authorization issue? I am not getting the security concern. System must be ensure disallowing the access by implementing the security correctly. Not based on security by obscurity. Regards, Suresh Sent from phone _ From: Gwen Shapira gshap...@cloudera.commailto:gshap...@cloudera.com Sent: Thursday, April 30, 2015 10:14 AM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.orgmailto:dev@kafka.apache.org * Regarding additional authorizers: Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed Sentry can integrate with the current APIs. Dapeng Sun, a committer on Sentry had some concerns about the IP privileges and how we prioritize privileges - but nothing that prevents Sentry from integrating with the existing solution, from what I could see. It seems to me that the design is very generic and adapters can be written for other authorization systems (after all, you just need to implement setACL, getACL and Authorize - all pretty basic), although I can't speak for Oracle's Identity Manager specifically. 
* Regarding AuthorizationException to indicate that an operation was not authorized: Sorry I missed this in previous reviewed, but now that I look at it - Many systems intentionally don't return AuthorizationException when READ privilege is missing, since this already gives too much information (that the topic exists and that you don't have privileges on it). Instead they return a variant of doesn't exist. I'm wondering if this approach is applicable / desirable for Kafka as well. Note that this doesn't remove the need for AuthorizationException - I'm just suggesting a possible refinement on its use. Gwen On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Thanks for taking the time to review. * All the open issues already have a resolution , I can open a jira for each one and add the resolution to it and resolve them immediately if you want this for tracking purposes. * We will update system tests to verify that the code works. We have thorough unit tests for all the new code except for modifications made to KafkaAPI as that has way too many dependencies to be mocked which I guess is the reason for no existing unit tests. * I don’t know if I completely understand the concern. We have talked with Ranger team (Don Bosco Durai) so we at least have one custom authorizer implementation that has approved this design and they will be able to inject their authorization framework with current interfaces. Do you see any issue with the design which will prevent anyone from providing a custom implementation? * Did not understand the concern around wire protocol, we are adding AuthorizationException to indicate that an operation was not authorized. Thanks Parth On 4/30/15, 5:59 AM, Jun Rao j...@confluent.iomailto:j...@confluent.io wrote: Joe, Could you elaborate on why we should not store JSON in ZK? So far, all existing ZK data are in JSON. 
Thanks, Jun On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly mailto:joe.st...@stealth.ly wrote: Hi, sorry I am coming in late to chime back in on this thread and haven't been able to make the KIP hangouts the last few weeks. Sorry if any of this was brought up already or I missed it. I read through the KIP and the thread(s) and a couple of things jumped out. - Can we break out the open issues in JIRA (maybe during the hangout) that are in the KIP and resolve/flesh those out more? - I don't see any updates with the systems test or how we can know the code works. - We need some implementation/example/sample that we know can work in all different existing entitlement servers and not just ones that run in types of data centers too. I am not saying we should support everything but if someone had to implement https
Re: [VOTE] KIP-11- Authorization design for kafka security
I had thought of the authorization module as something that happens in handle(request: RequestChannel.Request), in the request.requestId match. If the request tries to do something it is not allowed to do, it should stop right there. Whether it is allowed is a true/false callback to the loaded class, with one function that accepts the data and some more information about what it is about (that we have access to). I think all of the other features are awesome, but you can build them on top of this, and then others can do the same. I am more hooked on the authorization module being a watchdog above handle() than I am on the plug-in implementation options (less is more, imho). If we take this approach the audit fits in nicely, because we see more of what happens in one place and the access decision is made right there.

~ Joe Stein
http://www.stealth.ly

On Thu, Apr 30, 2015 at 6:59 PM, Suresh Srinivas sur...@hortonworks.com wrote:

Joe, Can you add more details on what generalization looks like? Also, is this a design issue or a code issue? One more question. Does Kafka have audit capabilities today for topic creation, deletion, access etc.? Regards, Suresh Sent from phone

From: Joe Stein joe.st...@stealth.ly Sent: Thursday, April 30, 2015 3:27 PM Subject: Re: [VOTE] KIP-11- Authorization design for kafka security To: dev@kafka.apache.org

Ok, I read through it all again a few times. I get the provider broker piece now. The configurations are still confusing; if there are 2 or 3, they should be called out more specifically than as a change to a class. Configs are a public interface, so we should be a bit more explicit. Was there any discussion about any auditing component? How would anyone know whether the authorization plugin was running, or when, or what it was doing? If we can't audit the access then what good is controlling the access?
I still don't see where all the command line configuration options come in. There are a lot of things to-do with it but not sure how to use it yet. This plug-in still feels like a very specific case and we should try to generalize it down some more to make it more straight forward for folks. ~ Joestein On Thu, Apr 30, 2015 at 3:51 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: During the discussion Jun pointed out that mirror maker, which right now does not copy any zookeeper config overrides, will now replicate topics but will not replicate any acls. Given the authorizer interface exposes the acl management apis, list/get/add/remove, weproposed that mirror maker can just instantiate an instance of authorizer and call these apis directly to get acls for a topic and add it to the destination cluster if we want to add acls to be replicated as part of mirror maker. Thanks Parth On 4/30/15, 12:43 PM, Joe Stein joe.st...@stealth.lymailto: joe.st...@stealth.ly wrote: Parth, Can you explain how Mirror maker will have to start using new acl management tool) and it not affect any other client. If you aren't changing the wire protocol then how do clients use it? ~ Joe stein On Thu, Apr 30, 2015 at 3:15 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.commailto:pbrahmbh...@hortonworks.com wrote: Hi Joe, Regarding open question: I changed the title to “Questions resolved after community discussions” let me know if you have a better name. I have a question and a bullet point under each question describing the final decision. Not sure how can I make it any cleaner so appreciate any suggestion. Regarding system tests: I went through a bunch of KIP none of which mentions what test cases will be added. Do you want to add a “How do you plan to tet” section in the general KIP template or you think this is just a special case where the test cases should be listed and discussed as part of KIP? 
I am not sure if the KIP really is the right forum for this discussion. This can easily be addressed during code review if people think we don’t have enough test coverage. I am still not sure which part is not clear. The Scala exception is added for the internal server-side representation. In the end all of our responses always return just an error code, for which we will add an AuthorizationErrorCode mapped to AuthorizationException. The error code itself will not reveal any information other than the fact that you are not authorized to perform an operation on a resource, and you will get this error code even for non-existent topics if no acls exist for those topics. I can add a diagram if that makes things clearer; I am not convinced it's needed given we have come so far without it. Essentially there are 3 steps * users use the acl cli to add acls
Re: [VOTE] KIP-11- Authorization design for kafka security
On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi Joe, Let me clarify on authZException. The caller gets a 403 regardless of the existence of the topic; even if the topic does not exist you always get a 403. This falls under the case where we do not find any acls for a resource, and as per our last decision we are going to deny such a request by default.

The reason I'm digging into this is that in Hive we had to fix existing behavior after financial customers objected loudly to getting "insufficient privileges" when a real database would return "table does not exist". I completely agree that having to handle two separate error conditions (TopicNotExist if user doesn't have READ, unless user has CREATE, in which case he can see all topics and can get Unauthorized) adds complexity and will not be fun to debug. However, when implementing security, a lot of the stuff we do is around making customers pass security audits, and I suspect that a "can't know that tables even exist" test is a thing. We share pretty much the same financial customers and they seem to have the same concerns. Perhaps you can double check if you also have this requirement? (And again, sorry for not seeing this earlier and holding up the vote on what seems like a minor point. I just don't want to punt for later something when we already have an idea of what customers expect.)

Gwen

The configurations are listed explicitly here https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under KafkaConfig. We may add an optional config to allow the authorizer to read an arbitrary property file incrementally, but that does not need to be part of this same KIP. The statement “If we can't audit the access then what good is controlling the access?” seems extreme, because we still get to control the access, which IMHO is a huge win.
The default authorizer implementation right now logs every allowed/denied access (see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala) in debug mode. Anybody who needs auditing could create a log4j appender to allow debug access to this class and send the log output to some audit file. Auditing is still a separate piece; we could either add an auditor interface that wraps the authorizer or the other way around, so authorizer and auditor can be two separate implementations. I would love to start a new KIP and jira to discuss approaches in more detail, but I don’t see the need to hold up the Authorization work for that. I don’t agree with “this design seems too specific”, given we already have 3 implementations (default, ranger, sentry) that can be supported with the current design. The authorization happens as part of handle and it is the first action; see here https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103 for one example. Thanks Parth

On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote:

Joe, thanks for the clarification. Regarding audits, sorry, I might be misunderstanding your email. Currently, if Kafka does not support audits, I think audits should be considered as a separate effort. Here are the reasons:
- Audit, whether authorization is available or not, should record operations to determine what is happening in the system. It should record all the operations such as create, delete, consumption of topics along with user information. It should work whether authorization is enabled or not. In Hadoop, long before we added real authorization, we had audit logs.
- Authorization will bring an additional element of who was denied. As part of the audit effort, it is important to add, along with what operations succeeded (and for whom), what operations were denied.
From: Joe Stein joe.st...@stealth.ly Sent: Thursday, April 30, 2015 4:12 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

I had thought of the authorization module as something that happens in handle(request: RequestChannel.Request), in the request.requestId match. If the request tries to do something it is not allowed to do, it should stop right there. Whether it is allowed is a true/false callback to the loaded class, with one function that accepts the data and some more information about what it is about (that we have access to). I think all of the other features are awesome, but you can build them on top of this, and then others can do the same. I am more hooked on the authorization module being a watchdog above handle() than I am on the plug-in implementation options (less is more, imho). If we take this approach the audit fits in nicely, because we see more of what happens in one place and the access decision is made right there. ~ Joe Stein
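The two points made in this message (authorization is a single true/false decision taken before anything else in handle, and a resource with no acls is denied by default, whether or not the topic exists) can be sketched roughly as follows. This is only an illustration under stated assumptions: AuthorizeCallback, DefaultDenyAuthorizer, HandleSketch and the NOT_AUTHORIZED string are hypothetical names, not the KIP-11 interface or a real Kafka error code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// All names here are hypothetical; this is not the KIP-11 Authorizer interface.
interface AuthorizeCallback {
    // One true/false decision for a principal, an operation and a resource.
    boolean authorize(String principal, String operation, String resource);
}

final class DefaultDenyAuthorizer implements AuthorizeCallback {
    // resource -> set of "principal|operation" entries that are allowed
    private final Map<String, Set<String>> allowed = new HashMap<>();

    void allow(String principal, String operation, String resource) {
        allowed.computeIfAbsent(resource, r -> new HashSet<>()).add(principal + "|" + operation);
    }

    @Override
    public boolean authorize(String principal, String operation, String resource) {
        // A resource with no acls, including a topic that does not exist, is denied.
        Set<String> entries = allowed.getOrDefault(resource, Collections.<String>emptySet());
        return entries.contains(principal + "|" + operation);
    }
}

final class HandleSketch {
    static final String OK = "OK";
    static final String NOT_AUTHORIZED = "NOT_AUTHORIZED"; // placeholder, not a real error code

    private final AuthorizeCallback authorizer;

    HandleSketch(AuthorizeCallback authorizer) {
        this.authorizer = authorizer;
    }

    // Authorization is the first action; nothing else runs for a denied request.
    String handle(String principal, String operation, String resource) {
        if (!authorizer.authorize(principal, operation, resource)) {
            return NOT_AUTHORIZED;
        }
        return OK; // the actual request processing would go here
    }

    public static void main(String[] args) {
        DefaultDenyAuthorizer authorizer = new DefaultDenyAuthorizer();
        authorizer.allow("user:alice", "READ", "topic:orders");
        HandleSketch api = new HandleSketch(authorizer);
        System.out.println(api.handle("user:alice", "READ", "topic:orders"));
        System.out.println(api.handle("user:bob", "READ", "topic:no-such-topic"));
    }
}
```

In this sketch the caller cannot tell a missing topic from a missing privilege, which is the behaviour Parth describes; the alternative Gwen raises (returning a "does not exist" style error) would only change what handle returns, not where the check happens.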
RE: [VOTE] KIP-11- Authorization design for kafka security
Thank you for your reply, Gwen.

1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp.

Yes, I agree with your point: we should not make the rule complex.

2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking a few servers in the range.

Supporting ranges sounds reasonable. If this feature is in the development plan, I also don't think we can put best-matching acls and IP range support together.

We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security). If you are interested in joining - let me know and I'll forward you the invite.

Thank you, Gwen. I have the invite and I should be at home at that time, but due to network issues I may not be able to join the meeting smoothly.

Regards Dapeng

-Original Message- From: Gwen Shapira [mailto:gshap...@cloudera.com] Sent: Tuesday, April 28, 2015 1:31 PM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

While I see the advantage of being able to say something like "deny user X from hosts h1...h200, also allow user X from host h189", there are two issues here:

1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp.

2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking a few servers in the range.

Gwen

P.S. We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and other outstanding design issues (not all related to security).
If you are interested in joining - let me know and I'll forward you the invite. Gwen On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac l1.png Regards Dapeng From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security Thank you for your rapid reply, Parth. * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati on+In terface#KIP-11-AuthorizationInterface-PermissionType Got it, thank you. * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. it does not matter which principal matches as long as we have a match. About this part, I think we should choose the best matching acl for authorization, no matter we support group or not. For the case [cid:image001.png@01D08197.E94BD410] https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac l1.png if 2 Acls are defined, one that deny an operation from all hosts and one that allows the operation from host1, the operation from host1 will be denied or allowed? According wiki Deny will take precedence over Allow in competing acls., it seems acl_1 will win the competition, but customers' intention may be allow. I think deny always take precedence over Allow is okay, but host1 - user1host1 default may make sense. * Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimized the storage layer any further for lookup. * The reason why we have acl with multi everything is to reduce redundancy in acl storage. 
I am not sure how will we be able to reduce redundancy if we divide it by using one principal,one host, one operation. Yes, I'm also agreed with Acl storage should be indexed by resource. Under resource index, it may be better to add index such as hosts and principals. One option may be one principal, one host, one operation. Just give your these scenarios for considering. For the case defined in wiki: Acl_1 - {user:bob, user:*} is allowed to READ from all hosts. Acl_2 - {user:bob} is denied to READ from host1 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from {host1, host2}. For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove alice's READ from host1, user may have following ways to achieve: 1.Remove the parts of acl_3 directly, I think if we make it divided and hierarchical, this kind of operations could be done directly in backend. 2.Remove
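
For readers following the precedence discussion in this message, here is a minimal sketch of the "Deny always wins" rule applied to Acl_1 and Acl_2 from the wiki example quoted above. It is an illustration only: the AclSketch class, its fields, and the wildcard spellings ("*" for any host, "user:*" for any user) are assumptions made for this sketch, not the KIP-11 data model.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of an acl; field names and wildcard spellings are assumptions.
final class AclSketch {
    enum Permission { ALLOW, DENY }

    static final String ANY_HOST = "*";
    static final String ANY_USER = "user:*";

    final Permission permission;
    final Set<String> principals;
    final Set<String> hosts;
    final Set<String> operations;

    AclSketch(Permission permission, Set<String> principals, Set<String> hosts, Set<String> operations) {
        this.permission = permission;
        this.principals = principals;
        this.hosts = hosts;
        this.operations = operations;
    }

    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // An acl applies when principal, host and operation all match (wildcards included).
    boolean matches(String principal, String host, String operation) {
        boolean principalOk = principals.contains(principal)
                || (principal.startsWith("user:") && principals.contains(ANY_USER));
        boolean hostOk = hosts.contains(host) || hosts.contains(ANY_HOST);
        return principalOk && hostOk && operations.contains(operation);
    }

    // Any matching DENY rejects the request; otherwise at least one ALLOW must match.
    static boolean authorize(List<AclSketch> acls, String principal, String host, String operation) {
        boolean allowed = false;
        for (AclSketch acl : acls) {
            if (!acl.matches(principal, host, operation)) {
                continue;
            }
            if (acl.permission == Permission.DENY) {
                return false; // deny takes precedence over any allow
            }
            allowed = true;
        }
        return allowed; // no matching acl means deny
    }

    public static void main(String[] args) {
        List<AclSketch> acls = Arrays.asList(
                new AclSketch(Permission.ALLOW, setOf("user:bob", "user:*"), setOf("*"), setOf("READ")),
                new AclSketch(Permission.DENY, setOf("user:bob"), setOf("host1"), setOf("READ")));
        System.out.println(authorize(acls, "user:bob", "host1", "READ"));
        System.out.println(authorize(acls, "user:bob", "host2", "READ"));
    }
}
```

Under this rule the order of the acls does not matter, which is what makes it easy to reason about. The case Dapeng raises (deny from all hosts plus allow from host1) comes out as denied for host1 in this sketch; a best-match rule would need an extra specificity ordering that is not modelled here.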
Re: [VOTE] KIP-11- Authorization design for kafka security
Parth, I was thinking that in a multi-tenant environment, an admin may want to carve out some topic space to a user. For example, allow user X to create any topic of X_*. Not sure how critical it is though. Also, with the current api, what would the admin do to replicate the acls from one cluster to another? Will she just list all acls from cli and reissue them to another cluster periodically? Thanks, Jun On Mon, Apr 27, 2015 at 10:56 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for your comments Jun. * Renamed the resource to consumer-group in wiki. * I don’t see a use case where admins/users would want to reserve topic names in advance. Can you describe why this would be needed. Thanks Parth On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote: A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? 
If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). The plugin could be mapping user to linux/ldap/active directory groups but that is again upto the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? 
* Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? Its all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in public API, but I thought it is part of
Re: [VOTE] KIP-11- Authorization design for kafka security
* We are not supporting regex matching to any of the strings (host,resource,principal) yet but this can be added. We have a special wild card (*) to refer to ALL but there is no other regex matching going on right now. We can associate CREATE with topics as you are proposing once KIP-4 is merged I am just not sure if admins currently try to figure out/control what topic names different tenents can have. * With current API they will have to do exactly what you said. Call list for each resource (cluster, topic and group) and reissue the same acls by calling add in the mirrored cluster. Thanks Parth On 4/27/15, 2:17 PM, Jun Rao j...@confluent.io wrote: Parth, I was thinking that in a multi-tenant environment, an admin may want to carve out some topic space to a user. For example, allow user X to create any topic of X_*. Not sure how critical it is though. Also, with the current api, what would the admin do to replicate the acls from one cluster to another? Will she just list all acls from cli and reissue them to another cluster periodically? Thanks, Jun On Mon, Apr 27, 2015 at 10:56 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for your comments Jun. * Renamed the resource to consumer-group in wiki. * I don’t see a use case where admins/users would want to reserve topic names in advance. Can you describe why this would be needed. Thanks Parth On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote: A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. 
On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). 
The plugin could map users to Linux/LDAP/Active Directory groups, but that is again up to the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be an absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect
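The reply at the top of this message describes replicating acls between clusters as: call list for each resource, then reissue the same acls with add on the mirrored cluster. A minimal Python sketch of that loop follows. Everything named here (InMemoryAuthorizer, add_acls, list_acls, mirror_acls, the Acl fields) is hypothetical and stands in for whatever list/add calls the real authorizer or CLI exposes; it is not the KIP-11 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    principal: str   # e.g. "user:alice"
    host: str        # "*" is the special wildcard meaning ALL
    operation: str   # e.g. "READ"
    permission: str  # "allow" or "deny"


class InMemoryAuthorizer:
    """Hypothetical stand-in for one cluster's acl store (list/add only)."""

    def __init__(self):
        self._acls = {}  # resource -> set of Acl

    def add_acls(self, resource, acls):
        self._acls.setdefault(resource, set()).update(acls)

    def list_acls(self, resource):
        return set(self._acls.get(resource, set()))


def mirror_acls(source, target, resources):
    """List acls per resource on source and add the missing ones to target.

    Returns the number of acls added. Acls deleted on the source are NOT
    removed from the target; this sketch only covers list + add.
    """
    added = 0
    for resource in resources:
        missing = source.list_acls(resource) - target.list_acls(resource)
        if missing:
            target.add_acls(resource, missing)
            added += len(missing)
    return added
```

Running mirror_acls periodically is idempotent for additions, since acls already present on the target are skipped. Propagating removals would need an extra step that the thread does not describe.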
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks for your comments Jun. * Renamed the resource to consumer-group in wiki. * I don’t see a use case where admins/users would want to reserve topic names in advance. Can you describe why this would be needed. Thanks Parth On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote: A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. 
Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). The plugin could be mapping user to linux/ldap/active directory groups but that is again upto the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? 
It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper are in the public API section, but I thought they are part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two
Re: [VOTE] KIP-11- Authorization design for kafka security
Hi Sun, thanks for the comments, my answers are below: * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType * In the first version that I am currently writing there is no group support. Even when we add it I don’t see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match. * Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don’t see the need to optimize the storage layer any further for lookup. * The reason why we have acls with multi everything is to reduce redundancy in acl storage. I am not sure how we will be able to reduce redundancy if we divide it by using one principal, one host, one operation. Thanks Parth On 4/26/15, 8:06 PM, Sun, Dapeng dapeng@intel.com wrote: Hi Parth The design looks good, a few minor comments below. Since I just started looking into the discussion I may have missed many previous discussions, so I'm sorry if these comments have already been discussed. 1. About SimpleAclAuthorizer (SimpleAuthorizer): a. As I understand it, I think there should be only one type of privilege (allow/deny) of a topic on a principal, or we make it deny allow. For example, acl_1 host1 - group1- user1 - read-allow and acl_2 host1- group1 - user1 -read-deny, if the two acls are for the same topic, it may be hard to understand. Do you think it's necessary to add some details about this to the wiki? b. And when we authorize a user on a topic, we should perhaps check the user's user level acl first, then check the user's group level acl, and finally check the host level and default level acl. Do you think it's necessary to add some content like this to the wiki? 
For example, host1 - group1- user1host1 - group1host1 2.About SimpleAclAuthorizer (Acl Json will be stored in zookeeper) a. It may be better to make acl json stored hierarchily. It may be easy to search and do authorize. For example, when we authorize a user, we only need user related acls. b. I found one acl may contains multi-principles, multi-operations and multi-hosts, I'm strongly agreed with we provide api like these, but the acls stored in zookeeper or memory we may better to separate to one-principle, one-operation and one host. So we could make sure there are not many acls with same meaning and make acl management easily. Regards Dapeng -Original Message- From: Jun Rao [mailto:j...@confluent.io] Sent: Monday, April 27, 2015 5:02 AM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. 
* Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so
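Parth's reply at the top of this message makes two storage points: acls are indexed by resource because that is the lookup key for every authorize call, and one acl may carry several principals, hosts, and operations to reduce redundancy. A minimal Python sketch of both ideas follows. The names MultiAcl and ResourceIndexedAcls are assumptions made for illustration, not the classes proposed in the KIP; the sample values reuse the user_a/user_b example quoted elsewhere in this thread.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Resource = Tuple[str, str]  # (resource type, name), e.g. ("topic", "Topic1")


@dataclass(frozen=True)
class MultiAcl:
    """One stored acl covering several principals, hosts and operations."""
    principals: FrozenSet[str]
    hosts: FrozenSet[str]
    operations: FrozenSet[str]
    permission: str  # "allow" or "deny"

    def combinations(self) -> int:
        # Number of one-principal/one-host/one-operation entries this
        # single acl would become if stored fully split.
        return len(self.principals) * len(self.hosts) * len(self.operations)


class ResourceIndexedAcls:
    """Hypothetical in-memory cache keyed by resource."""

    def __init__(self) -> None:
        self._by_resource: Dict[Resource, List[MultiAcl]] = {}

    def add(self, resource: Resource, acl: MultiAcl) -> None:
        self._by_resource.setdefault(resource, []).append(acl)

    def lookup(self, resource: Resource) -> List[MultiAcl]:
        # Single dictionary lookup; only acls for this resource are returned.
        return list(self._by_resource.get(resource, []))


topic1 = ("topic", "Topic1")
acl = MultiAcl(
    principals=frozenset({"user_a", "user_b"}),
    hosts=frozenset({"Host1", "Host2", "Host3"}),
    operations=frozenset({"READ", "WRITE", "DESCRIBE"}),
    permission="allow",
)
store = ResourceIndexedAcls()
store.add(topic1, acl)
```

Here one stored entry stands for 2 x 3 x 3 = 18 split entries, which is the redundancy argument in the reply. The trade-off Dapeng raises, that partial revocation is harder on a combined acl, is not addressed by this sketch.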
RE: [VOTE] KIP-11- Authorization design for kafka security
Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png Regards Dapeng From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security Thank you for your rapid reply, Parth. * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In terface#KIP-11-AuthorizationInterface-PermissionType Got it, thank you. * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. it does not matter which principal matches as long as we have a match. About this part, I think we should choose the best matching acl for authorization, no matter we support group or not. For the case [cid:image001.png@01D08197.E94BD410] https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png if 2 Acls are defined, one that deny an operation from all hosts and one that allows the operation from host1, the operation from host1 will be denied or allowed? According wiki Deny will take precedence over Allow in competing acls., it seems acl_1 will win the competition, but customers' intention may be allow. I think deny always take precedence over Allow is okay, but host1 - user1 host1 default may make sense. * Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimized the storage layer any further for lookup. * The reason why we have acl with multi everything is to reduce redundancy in acl storage. I am not sure how will we be able to reduce redundancy if we divide it by using one principal,one host, one operation. Yes, I'm also agreed with Acl storage should be indexed by resource. 
Under resource index, it may be better to add index such as hosts and principals. One option may be one principal, one host, one operation. Just give your these scenarios for considering. For the case defined in wiki: Acl_1 - {user:bob, user:*} is allowed to READ from all hosts. Acl_2 - {user:bob} is denied to READ from host1 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from {host1, host2}. For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove alice's READ from host1, user may have following ways to achieve: 1.Remove the parts of acl_3 directly, I think if we make it divided and hierarchical, this kind of operations could be done directly in backend. 2.Remove acl_3, and add new acl {group:kafka-devs} is allowed to READ and WRITE from {host1, host2} and {user:alice } is allowed to READ from {host2} 3.Add two denied acls,{ user:alice} is denied to WRITE from {host1,host2} and { user:alice} is denied to READ from {host1} All these can achieve this kind of operations, but I think 1 could more directly for user operations. If you think this optimization is not urgent, I'm also agreed. Regards Dapeng -Original Message- From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com] Sent: Tuesday, April 28, 2015 12:18 AM To: dev@kafka.apache.orgmailto:dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security Hi Sun, thanks for the comments, my answers are below: * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In terface#KIP-11-AuthorizationInterface-PermissionType * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. it does not matter which principal matches as long as we have a match. 
* Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimized the storage layer any further for lookup. * The reason why we have acl with multi everything is to reduce redundancy in acl storage. I am not sure how will we be able to reduce redundancy if we divide it by using one principal,one host, one operation. Thanks Parth On 4/26/15, 8:06 PM, Sun, Dapeng dapeng@intel.commailto:dapeng@intel.com wrote: Hi Parth The design looks good, a few minor comments below. Since I just started looking into the discussion and many previous discussions I may missed, I'm sorry if these comments had be discussed. 1. About SimpleAclAuthorizer (SimpleAuthorizer): a. As my understanding, I think there should only one type privilege(allow/deny) of a topic on a principle, or we make it deny allow. For example, acl_1 host1
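Dapeng's acl_3 scenario in this message can be worked through by expanding the combined acl into one-principal, one-operation, one-host entries and subtracting what should be revoked. A minimal Python sketch follows; the expand helper and the tuple layout are assumptions made for illustration and are not part of the KIP.

```python
from itertools import product


def expand(principals, operations, hosts):
    """Split a multi-valued acl into (principal, operation, host) tuples."""
    return set(product(principals, operations, hosts))


# Acl_3 from the wiki example: {user:alice, group:kafka-devs} is allowed
# to READ and WRITE from {host1, host2}.
acl_3 = expand({"user:alice", "group:kafka-devs"},
               {"READ", "WRITE"},
               {"host1", "host2"})

# Revoke alice's WRITE from {host1, host2} and alice's READ from host1.
revoked = (expand({"user:alice"}, {"WRITE"}, {"host1", "host2"})
           | expand({"user:alice"}, {"READ"}, {"host1"}))

remaining = acl_3 - revoked
```

The remaining set is exactly what option 2 in the message builds by hand: kafka-devs keeps READ and WRITE from both hosts, and alice keeps only READ from host2.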
Re: [VOTE] KIP-11- Authorization design for kafka security
While I see the advantage of being able to say something like "deny user X from hosts h1...h200, also allow user X from host h189", there are two issues here: 1. Complex rule systems can be difficult to reason about and therefore end up being less secure. The rule "Deny always wins" is very easy to grasp. 2. We currently don't have any mechanism for specifying IP ranges (or host ranges) at all. I think it's a pretty significant deficiency, but it does mean that we don't need to worry about the issue of blocking a large range while unblocking a few servers in the range. Gwen P.S. We have a call tomorrow (Tuesday, April 28) at 3pm PST to discuss this and other outstanding design issues (not all related to security). If you are interested in joining, let me know and I'll forward you the invite. Gwen On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote: Attach the image. https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png Regards Dapeng From: Sun, Dapeng [mailto:dapeng@intel.com] Sent: Tuesday, April 28, 2015 11:44 AM To: dev@kafka.apache.org Subject: RE: [VOTE] KIP-11- Authorization design for kafka security Thank you for your rapid reply, Parth. * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType Got it, thank you. * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. It does not matter which principal matches as long as we have a match. About this part, I think we should choose the best matching acl for authorization, whether or not we support groups. 
For the case [cid:image001.png@01D08197.E94BD410] https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png if 2 Acls are defined, one that deny an operation from all hosts and one that allows the operation from host1, the operation from host1 will be denied or allowed? According wiki Deny will take precedence over Allow in competing acls., it seems acl_1 will win the competition, but customers' intention may be allow. I think deny always take precedence over Allow is okay, but host1 - user1host1 default may make sense. * Acl storage is indexed by resource right now because that is the primary lookup id for all authorize operations. Given acls are cached I don't see the need to optimized the storage layer any further for lookup. * The reason why we have acl with multi everything is to reduce redundancy in acl storage. I am not sure how will we be able to reduce redundancy if we divide it by using one principal,one host, one operation. Yes, I'm also agreed with Acl storage should be indexed by resource. Under resource index, it may be better to add index such as hosts and principals. One option may be one principal, one host, one operation. Just give your these scenarios for considering. For the case defined in wiki: Acl_1 - {user:bob, user:*} is allowed to READ from all hosts. Acl_2 - {user:bob} is denied to READ from host1 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from {host1, host2}. For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove alice's READ from host1, user may have following ways to achieve: 1.Remove the parts of acl_3 directly, I think if we make it divided and hierarchical, this kind of operations could be done directly in backend. 
2.Remove acl_3, and add new acl {group:kafka-devs} is allowed to READ and WRITE from {host1, host2} and {user:alice } is allowed to READ from {host2} 3.Add two denied acls,{ user:alice} is denied to WRITE from {host1,host2} and { user:alice} is denied to READ from {host1} All these can achieve this kind of operations, but I think 1 could more directly for user operations. If you think this optimization is not urgent, I'm also agreed. Regards Dapeng -Original Message- From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com] Sent: Tuesday, April 28, 2015 12:18 AM To: dev@kafka.apache.orgmailto:dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security Hi Sun, thanks for the comments, my answers are below: * I think the wiki already describes the precedence order as Deny taking precedence over allow when conflicting acls are found https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In terface#KIP-11-AuthorizationInterface-PermissionType * In the first version that I am currently writing there is no group support. Even when we add it I don't see the need to add a precedence for evaluation. it does not matter which principal matches as long as we have a match. * Acl storage is indexed
Re: [VOTE] KIP-11- Authorization design for kafka security
A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. 
Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). The plugin could be mapping user to linux/ldap/active directory groups but that is again upto the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? 
Its all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in public API, but I thought it is part of DefaultAuthorizer (Since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizat io n+ In terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation?
RE: [VOTE] KIP-11- Authorization design for kafka security
Hi Parth The design looks good, a few minor comments below. Since I just started looking into the discussion and many previous discussions I may missed, I'm sorry if these comments had be discussed. 1. About SimpleAclAuthorizer (SimpleAuthorizer): a. As my understanding, I think there should only one type privilege(allow/deny) of a topic on a principle, or we make it deny allow. For example, acl_1 host1 - group1- user1 - read-allow and acl_2 host1- group1 - user1 -read-deny, if the two acls are for a same topic, it may be hard to understand, do you think it's necessary to add some details about this to wiki. b. And when we do authorize a user on a topic, we may should check user's user level acl first, then check user's group level acl, finally we check the host level and default level acl. do you think it's necessary we add some contents like these to wiki. For example, host1 - group1- user1host1 - group1host1 2.About SimpleAclAuthorizer (Acl Json will be stored in zookeeper) a. It may be better to make acl json stored hierarchily. It may be easy to search and do authorize. For example, when we authorize a user, we only need user related acls. b. I found one acl may contains multi-principles, multi-operations and multi-hosts, I'm strongly agreed with we provide api like these, but the acls stored in zookeeper or memory we may better to separate to one-principle, one-operation and one host. So we could make sure there are not many acls with same meaning and make acl management easily. Regards Dapeng -Original Message- From: Jun Rao [mailto:j...@confluent.io] Sent: Monday, April 27, 2015 5:02 AM To: dev@kafka.apache.org Subject: Re: [VOTE] KIP-11- Authorization design for kafka security A few more minor comments. 100. To make it clear, perhaps we should rename the resource group to consumer-group. We can probably make the same change in CLI as well so that it's not confused with user group. 101. Currently, create is only at the cluster level. 
Should it also be at topic level? For example, perhaps it's useful to allow only user X to create topic X. Thanks, Jun On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks for clarifying, Parth. I think you are taking the right approach here. On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in current proposal. I did not see an API to create group but if you have a READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes and I think that means I need to add ―group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So Right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean authorizer will now have to get into implementation details of kafka which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. 
So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). The plugin could be mapping user to linux/ldap/active directory groups but that is again upto the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege
Re: [VOTE] KIP-11- Authorization design for kafka security
I see Groups as something we can add incrementally in the current model. The acls take principalType: name so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don’t see the need to do so. So for a topic1 using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map authenticated user to groups ( This is how hadoop and storm works). The plugin could be mapping user to linux/ldap/active directory groups but that is again upto the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? Its all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in public API, but I thought it is part of DefaultAuthorizer (Since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? 
On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizatio n+ In terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about acls being duplicated and the simplest solution would be to modify acls class so they hold a set of principals instead of single principal. i.e user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType which is Deny acls should be evaluated before allow acls. To give you an example suppose we have following acls acl1 - user1 is allowed to READ from all hosts. acl2 - host1 is allowed to READ regardless of who is the user. acl3 - host2 is allowed to READ regardless of who is the user. acl4 - user1 is denied to READ from host1. As stated in the KIP we first evaluate DENY so if user1 tries to access from host1 he will be denied(acl4), even though both user1 and host1 has acl’s for allow with wildcards (acl1, acl2). 
If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters here.

"Will people actually use hosts with users?" I really don't know, but given that ACLs are part of our public APIs I thought it better to try and cover more use cases. If others think this extra complexity is not worth the value it adds, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even without hosts in the ACL, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host", which is easy to add and can be an incremental improvement. They will, however, lose the ability to restrict a user's access to just a set of hosts.

We agreed to offer a CLI to overcome the JSON ACL config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
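The deny-before-allow evaluation described in the quoted reply to Jeff can be sketched as follows. This is only an illustration in Python, not the KIP-11 authorizer: the `Acl` tuple, the `"*"` wildcard, and the `authorize` function are hypothetical names, and the sketch assumes that a request matching no ACL at all is denied.

```python
from collections import namedtuple

# Hypothetical ACL record for illustration; "*" means "any user" or "any host".
Acl = namedtuple("Acl", ["permission", "user", "host", "operation"])

ACLS = [
    Acl("ALLOW", "user1", "*", "READ"),     # acl1
    Acl("ALLOW", "*", "host1", "READ"),     # acl2
    Acl("ALLOW", "*", "host2", "READ"),     # acl3
    Acl("DENY", "user1", "host1", "READ"),  # acl4
]


def matches(acl, user, host, operation):
    """True when the ACL applies to this user, host and operation."""
    return (acl.operation == operation
            and acl.user in ("*", user)
            and acl.host in ("*", host))


def authorize(acls, user, host, operation):
    matching = [a for a in acls if matches(a, user, host, operation)]
    # Any matching DENY wins, no matter how many ALLOW ACLs also match.
    if any(a.permission == "DENY" for a in matching):
        return False
    # Among ALLOW ACLs the order is irrelevant: a single match is enough.
    # Assumption: when nothing matches, the request is denied.
    return any(a.permission == "ALLOW" for a in matching)


print(authorize(ACLS, "user1", "host1", "READ"))  # False (acl4)
print(authorize(ACLS, "user1", "host2", "READ"))  # True (acl1 or acl3)
```

Because the DENY check runs over all matching ACLs first, the result does not depend on the order in which the ACLs are stored.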
Re: [VOTE] KIP-11- Authorization design for kafka security
We are not talking about the same Groups :) I meant groups of consumers (which KIP-11 lists as a separate resource in the Privilege table).

On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

I see groups as something we can add incrementally in the current model. The ACLs take a principal as principalType:name, so groups can be represented as group:groupName. We are not managing group memberships anywhere in Kafka and I don't see the need to do so. So for topic1, an admin can use the CLI to add an ACL that grants access to group:kafka-test-users. The authorizer implementation can have a plugin that maps the authenticated user to groups (this is how Hadoop and Storm work). The plugin could map users to Linux/LDAP/Active Directory groups, but that is again up to the implementation. What we are offering is an extensible interface, so these features can be added incrementally. I can add support for this in the first release, but I don't see why it would be an absolute necessity.

Thanks
Parth

On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated):
* Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created?

It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :)

Gwen

On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

You are right, moved it to the default implementation section.
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry for the confusion. I'm not sure my last email is clear enough either... Consumers will have a Principal, which may belong to a group. But the consumer configuration also has a group.id, which controls how partitions are shared between consumers and how offsets are committed. I'm talking about those Groups.

On Fri, Apr 24, 2015 at 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:

We are not talking about the same Groups :) I meant groups of consumers (which KIP-11 lists as a separate resource in the Privilege table).

On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

I see groups as something we can add incrementally in the current model. The ACLs take a principal as principalType:name, so groups can be represented as group:groupName. We are not managing group memberships anywhere in Kafka and I don't see the need to do so. So for topic1, an admin can use the CLI to add an ACL that grants access to group:kafka-test-users. The authorizer implementation can have a plugin that maps the authenticated user to groups (this is how Hadoop and Storm work). The plugin could map users to Linux/LDAP/Active Directory groups, but that is again up to the implementation. What we are offering is an extensible interface, so these features can be added incrementally. I can add support for this in the first release, but I don't see why it would be an absolute necessity.

Thanks
Parth

On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated):
* Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource?
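To make the two meanings of "group" in this exchange concrete, here is a small Python sketch. Every name in it (`Principal`, `Resource`, the type strings, the group names) is hypothetical and only illustrates the distinction: a group of users sits on the principal side of an ACL, while a consumer group (the group.id from consumer configuration) sits on the resource side. It does not answer the open questions about group lifecycle raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Who is asking, e.g. a user or a group of users (LDAP, Linux, ...)."""
    type: str
    name: str


@dataclass(frozen=True)
class Resource:
    """What is being accessed, e.g. a topic or a consumer group (group.id)."""
    type: str
    name: str


# A group of users, used on the principal side of an ACL.
user_group = Principal("group", "kafka-test-users")

# A consumer group, used on the resource side of an ACL.
consumer_group = Resource("consumer-group", "billing-consumers")

# Hypothetical ACL entries as (principal, operation, resource) triples.
acls = {
    (user_group, "READ", Resource("topic", "topic1")),
    (user_group, "READ", consumer_group),
}


def is_allowed(principal, operation, resource):
    return (principal, operation, resource) in acls


print(is_allowed(user_group, "READ", consumer_group))
```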
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks for your comments Gari. My responses are inline.

Thanks
Parth

On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote:

Sorry - fat fingered send ...

Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things:

1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the clientIP as Principals).

I think the user-to-group mapping can be done at the authorization implementation layer. In any case, as you pointed out, the Session is part of another JIRA and I think a PR is out (https://reviews.apache.org/r/27204/diff/), so we should discuss it on that PR.

2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal), etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, each of which may want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like:

{version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs] ...
or

{version:1, {acls:[ { principals:[KafkaUserPrincipal:alice,KafkaGroupPrincipal:kafka-devs]

But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals. The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward.

All the principals that you listed above can be supported with the current design. ACLs take a KafkaPrincipal as input, which is a combination of type and principal name, and the authorizer implementations are free to create any extension of this which covers group:groupName, host:HostName, kerberos:kerberosUserName and any other types that may come up. I am not sure how encryption key storage is relevant to the Authorizer, so it would be great if you could elaborate.

3) Are you sure that you want authorize to take a session object? If we use the model in point 1 above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs.

I think it is better to take a session, which can just be a wrapper on top of Subject + host for now. This allows for extension, which in my opinion is more future-proof.

4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well?

In the default implementation I don't plan to do this. It is easy to add later if we want to, but I am not sure why this would ever be expensive: ACLs are cached, the number of ACLs on a single topic should be very small, and iterating over them with simple string comparisons should not really be expensive.

Thanks
Parth

On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com wrote:

Not sure if my newbie vote will count, but I think you are getting pretty close here.
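The idea from this exchange, a principal as a type plus a name and a session that wraps the authenticated principals together with the client host, could look roughly like the Python sketch below. The class names echo the ones used in the thread, but the fields, the `from_string` parser, and `any_principal_allowed` are assumptions made for illustration and are not taken from the KIP or the Kafka code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KafkaPrincipal:
    """A principal as a (type, name) pair, e.g. user:alice or group:kafka-devs."""
    principal_type: str
    name: str

    @classmethod
    def from_string(cls, text):
        # Split on the first ":" only, so the name itself may contain colons.
        principal_type, sep, name = text.partition(":")
        principal_type, name = principal_type.strip().lower(), name.strip()
        if not sep or not principal_type or not name:
            raise ValueError(f"expected 'type:name', got {text!r}")
        return cls(principal_type, name)


@dataclass(frozen=True)
class Session:
    """Assumed shape: the authenticated principals plus the client host."""
    principals: frozenset
    host: str


def any_principal_allowed(session, acl_principals):
    # A few hash lookups per request; this sketch keeps no per-decision cache.
    return not session.principals.isdisjoint(acl_principals)


acl_principals = {KafkaPrincipal.from_string(p)
                  for p in ("user:alice", "group:kafka-devs")}
session = Session(frozenset({KafkaPrincipal("user", "bob"),
                             KafkaPrincipal("group", "kafka-devs")}), "host1")
print(any_principal_allowed(session, acl_principals))  # True, via the group
```

New principal types (host, kerberos, and so on) need no new classes in this sketch, only a new type string, which is the extensibility argument made in the reply.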
Re: [VOTE] KIP-11- Authorization design for kafka security
+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi, I would like to open KIP-11 for voting.

Thanks
Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so that it holds a set of principals instead of a single principal, i.e. user_a and user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs. To give you an example, suppose we have the following ACLs:
acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.
acl4 - user1 is denied to READ from host1.

As stated in the KIP, we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have Allow ACLs with wildcards (acl1, acl2). If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters here.

"Will people actually use hosts with users?" I really don't know, but given that ACLs are part of our public APIs I thought it better to try and cover more use cases.
If others think this extra complexity is not worth the value it adds, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even without hosts in the ACL, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host", which is easy to add and can be an incremental improvement. They will, however, lose the ability to restrict a user's access to just a set of hosts.

We agreed to offer a CLI to overcome the JSON ACL config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON, but that probably has something to do with me being a developer :-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has been covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host-level access to a given user? My understanding is that the ACL will be (omitting some fields):
user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts be at the same level as principal in the hierarchy? This way you could just blanket the allowed/denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the perms would be evaluated if there was more than one match on a principal?

Is the thought that there wouldn't usually be much overlap on hosts? I guess I can imagine a scenario where I want to offline/online access to a particular host or set of hosts, and if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example?
I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just * the hosts for a given user and create a separate global list as I mentioned above? The only other system I know of that ties users with hosts for access is MySQL and I don't love that model. Companies usually standardize on group authorization anyway; are we complicating that issue with the inclusion of hosts attached to users?

Additionally I worry about the debt of big JSON configs in the first place; most non-developers find them non-intuitive already, so anything to ease this I think would be beneficial.

Thanks
Jeff

On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Sorry I missed your last questions. I am +0 on adding a --host option for --list; we could add it for symmetry. Again, if this is only a CLI change it can be added later; if you mean adding this to the authorizer interface, then we should make a decision now. Given a choice, I would actually like to keep only one option, which is the resource-based get (removing even the get based on principal). I see those (getAcl for principal or host) as special filtering cases which can
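Jeff's redundancy concern and the fix proposed in the quoted reply (one ACL holding a set of principals) can be illustrated with the Python sketch below. The `Acl` dataclass and the `merge` helper are hypothetical; they only show how per-user entries that share the same hosts and operations collapse into a single entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    principals: frozenset
    hosts: frozenset
    operations: frozenset


HOSTS = frozenset({"host1", "host2", "host3"})
OPS = frozenset({"READ", "WRITE", "DESCRIBE"})

# One entry per user, as in Jeff's example: the host list repeats each time.
per_user = [
    Acl(frozenset({"user_a"}), HOSTS, OPS),
    Acl(frozenset({"user_b"}), HOSTS, OPS),
]


def merge(acls):
    """Combine ACLs that share hosts and operations into one ACL per pair."""
    grouped = {}
    for acl in acls:
        grouped.setdefault((acl.hosts, acl.operations), set()).update(acl.principals)
    return [Acl(frozenset(principals), hosts, operations)
            for (hosts, operations), principals in grouped.items()]


merged = merge(per_user)
print(len(merged), sorted(merged[0].principals))  # 1 ['user_a', 'user_b']
```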
Re: [VOTE] KIP-11- Authorization design for kafka security
You are right, moved it to the default implementation section.

Thanks
Parth

On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this.

Thanks
Parth

On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi, I would like to open KIP-11 for voting.

Thanks
Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so that it holds a set of principals instead of a single principal, i.e. user_a and user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs. To give you an example, suppose we have the following ACLs:
acl1 - user1 is allowed to READ from all hosts.
Re: [VOTE] KIP-11- Authorization design for kafka security
Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things:

1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the clientIP as Principals).

2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal), etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, each of which may want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like:

{version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs

3) The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward.

On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation?
Re: [VOTE] KIP-11- Authorization design for kafka security
Hi, I would like to open KIP-11 for voting.

Thanks
Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so that it holds a set of principals instead of a single principal, i.e. user_a and user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs. To give you an example, suppose we have the following ACLs:
acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.
acl4 - user1 is denied to READ from host1.

As stated in the KIP, we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have Allow ACLs with wildcards (acl1, acl2). If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters here.

"Will people actually use hosts with users?" I really don't know, but given that ACLs are part of our public APIs I thought it better to try and cover more use cases. If others think this extra complexity is not worth the value it adds, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even without hosts in the ACL, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host", which is easy to add and can be an incremental improvement. They will, however, lose the ability to restrict a user's access to just a set of hosts.

We agreed to offer a CLI to overcome the JSON ACL config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like JSON but that probably has something to do with me being a developer :-). Thanks Parth On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so trying to keep up here, sorry if this has been covered before. First, great job on the KIP proposal and work so far. Are we sure that we want to tie host level access to a given user? My understanding is that the ACL will be (omitting some fields): user_a, host1, host2, host3; user_b, host1, host2, host3. So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts be at the same level as principal in the hierarchy? This way you could just blanket the allowed / denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the perms would be evaluated if there was more than one match on a principal? Is the thought that there wouldn't usually be much overlap on hosts? I guess I can imagine a scenario where I want to offline/online access to a particular host or set of hosts and, if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example? I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just * the hosts for a given user and create a separate global list as I mentioned above? The only other system I know of that ties users with hosts for access is MySQL and I don't love that model. Companies usually standardize on group authorization anyway; are we complicating that issue with the inclusion of hosts attached to users? Additionally I worry about the debt of big JSON configs in the first place; most non-developers find them non-intuitive already, so anything to ease this I think would be beneficial.
Thanks Jeff On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry I missed your last questions. I am +0 on adding a --host option for --list; we could add it for symmetry. Again, if this is only a CLI change it can be added later; if you mean adding this to the authorizer interface then we should make a decision now. Given a choice I would actually like to keep only one option, which is the resource-based get (removing even the get based on principal). I see those (getAcl for principal or host) as a special filtering case which can easily be achieved by a third-party tool by listing all topics, calling getAcls for each topic, and applying filtering logic on the result. I really don’t see the need to make those first-class citizens of the authorizer interface, given that these kinds of queries will be issued outside of the broker JVM, so they will not benefit from the caching, and because the storage will be indexed on resource, both of these options, even as a first-class API, will just scan all topic ACLs and apply filtering logic.
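The third-party filtering that Parth describes above (list all topics, call getAcls per topic, filter the result) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Acl` fields, the `InMemoryAuthorizer` class, and the `list_topics` / `get_acls` method names are hypothetical stand-ins, not the KIP-11 interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    # Hypothetical, simplified ACL record; the real KIP-11 class has more fields.
    principal: str   # e.g. "user:alice"
    host: str        # "*" means any host
    operation: str   # e.g. "READ"
    permission: str  # "ALLOW" or "DENY"


class InMemoryAuthorizer:
    """Stand-in for an authorizer whose storage is indexed by resource only."""

    def __init__(self, acls_by_topic):
        self._acls_by_topic = acls_by_topic

    def list_topics(self):
        return sorted(self._acls_by_topic)

    def get_acls(self, topic):
        # The only lookup offered: ACLs for one resource.
        return list(self._acls_by_topic.get(topic, []))


def acls_for_principal(authorizer, principal):
    """Scan every topic and keep only the ACLs that name `principal`."""
    result = {}
    for topic in authorizer.list_topics():
        matching = [a for a in authorizer.get_acls(topic) if a.principal == principal]
        if matching:
            result[topic] = matching
    return result
```

The point of the sketch is that the principal-based query is a full scan plus a filter either way, which is the argument made above for leaving it out of the authorizer interface.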
Re: [VOTE] KIP-11- Authorization design for kafka security
Sample ACL JSON and ZooKeeper are in the public API, but I thought they are part of DefaultAuthorizer (since Sentry and Argus won't be using ZooKeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the Acl class so that it holds a set of principals instead of a single principal, i.e. user_a, user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs. To give you an example, suppose we have the following ACLs: acl1 - user1 is allowed to READ from all hosts. acl2 - host1 is allowed to READ regardless of who the user is. acl3 - host2 is allowed to READ regardless of who the user is. acl4 - user1 is denied READ from host1.
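The deny-before-allow rule that Parth walks through with acl1 to acl4 can be sketched as below. This is a minimal illustration, not the DefaultAuthorizer code: the `Acl` fields and the `authorize` function are hypothetical, and denying when no ACL matches at all is an assumption made here for the sketch.

```python
from dataclasses import dataclass

WILDCARD = "*"


@dataclass(frozen=True)
class Acl:
    # Hypothetical, simplified ACL record used only for this sketch.
    principal: str   # user name, or "*" for any user
    host: str        # host name, or "*" for any host
    operation: str   # e.g. "READ"
    permission: str  # "ALLOW" or "DENY"


def matches(acl, principal, host, operation):
    """True when the ACL applies to this principal, host and operation."""
    return (acl.operation == operation
            and acl.principal in (WILDCARD, principal)
            and acl.host in (WILDCARD, host))


def authorize(acls, principal, host, operation):
    """Evaluate DENY first; only then look for a matching ALLOW."""
    applicable = [a for a in acls if matches(a, principal, host, operation)]
    if any(a.permission == "DENY" for a in applicable):
        return False
    # Assumption: with no matching ALLOW the request is rejected.
    return any(a.permission == "ALLOW" for a in applicable)


ACLS = [
    Acl("user1", WILDCARD, "READ", "ALLOW"),  # acl1
    Acl(WILDCARD, "host1", "READ", "ALLOW"),  # acl2
    Acl(WILDCARD, "host2", "READ", "ALLOW"),  # acl3
    Acl("user1", "host1", "READ", "DENY"),    # acl4
]
```

With these four entries, user1 reading from host1 is rejected because acl4 is checked before acl1 and acl2, while user1 reading from host2 is allowed regardless of whether acl1 or acl3 is the one that matches.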
Re: [VOTE] KIP-11- Authorization design for kafka security
+1 (non-binding) -- Harsha On April 24, 2015 at 9:59:09 AM, Parth Brahmbhatt (pbrahmbh...@hortonworks.com) wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and ZooKeeper are in the public API, but I thought they are part of DefaultAuthorizer (since Sentry and Argus won't be using ZooKeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the Acl class so that it holds a set of principals instead of a single principal, i.e. user_a, user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs.
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the Acl class so that it holds a set of principals instead of a single principal, i.e. user_a, user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: Deny ACLs should be evaluated before Allow ACLs. To give you an example, suppose we have the following ACLs: acl1 - user1 is allowed to READ from all hosts. acl2 - host1 is allowed to READ regardless of who the user is. acl3 - host2 is allowed to READ regardless of who the user is. acl4 - user1 is denied READ from host1. As stated in the KIP, we evaluate DENY first, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have Allow ACLs with wildcards (acl1, acl2).
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry - fat fingered send ... Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the clientIP as Principals). 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal), etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, each of which may want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs] ... or {version:1, {acls:[ { principals:[KafkaUserPrincipal:alice,KafkaGroupPrincipal:kafka-devs] But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals. The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward. 3) Are you sure that you want authorize to take a session object?
If we use the model in (1) above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs. 4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well? On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com wrote: Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the clientIP as Principals). 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal), etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, each of which may want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs 3) The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward.
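Gari's second ACL form, where each entry is written as `Type:name` and a subject carries several principals at once, might be matched along these lines. This is a sketch under stated assumptions: `TypedPrincipal` and `subject_matches` are hypothetical names invented for illustration, and only the `KafkaUserPrincipal:alice` style strings come from the message itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedPrincipal:
    # Hypothetical value type: a principal type plus a name.
    principal_type: str
    name: str

    @classmethod
    def parse(cls, text):
        """Parse a 'Type:name' string such as 'KafkaGroupPrincipal:kafka-devs'."""
        principal_type, sep, name = text.partition(":")
        if not sep or not principal_type or not name:
            raise ValueError(f"expected 'Type:name', got {text!r}")
        return cls(principal_type, name)


def subject_matches(subject_principals, acl_principals):
    """True when any principal held by the subject is listed in the ACL entry."""
    listed = {TypedPrincipal.parse(p) for p in acl_principals}
    return not listed.isdisjoint(subject_principals)


# A subject that authenticated as user bob and was also mapped to a group.
subject = {
    TypedPrincipal("KafkaUserPrincipal", "bob"),
    TypedPrincipal("KafkaGroupPrincipal", "kafka-devs"),
}
acl_entry = ["KafkaUserPrincipal:alice", "KafkaGroupPrincipal:kafka-devs"]
```

Here `subject_matches(subject, acl_entry)` succeeds through the group principal even though the user principal is not listed, which is the "some resources by username, some by group" case described above.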
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and ZooKeeper are in the public API, but I thought they are part of DefaultAuthorizer (since Sentry and Argus won't be using ZooKeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes, in the current proposal. I did not see an API to create a group, but if you have READ permission on a TOPIC and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes, and I think that means I need to add --group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release, so right now I was going with preserving the ACLs. Do you see any issues with this? Auto deleting would mean the authorizer will now have to get into implementation details of Kafka, which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about the same Groups :) I meant Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The ACLs take principalType: name, so groups can be represented as group: groupName. We are not managing group memberships anywhere in Kafka and I don’t see the need to do so. So for topic1, using the CLI, an admin can add an ACL to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map an authenticated user to groups (this is how Hadoop and Storm work). The plugin could map users to Linux/LDAP/Active Directory groups, but that is again up to the implementation.
What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be an absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and ZooKeeper are in the public API, but I thought they are part of DefaultAuthorizer (since Sentry and Argus won't be using ZooKeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation.
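Parth's answer at the top of this message (READ on the topic plus WRITE on the consumer group is enough to join and consume, and group ACLs are kept even when the group has no members) could be modelled as a simple membership check. This is a sketch only: the tuple layout, the `can_join_and_consume` function, and the sample names are hypothetical and not part of KIP-11.

```python
def can_join_and_consume(grants, principal, topic, group):
    """True when `principal` holds READ on the topic and WRITE on the group.

    `grants` is a set of (principal, resource_type, resource_name, operation)
    tuples. It is stored independently of whether the consumer group currently
    has any members, mirroring the "preserve the ACLs" choice described above.
    """
    needed = {
        (principal, "TOPIC", topic, "READ"),
        (principal, "GROUP", group, "WRITE"),
    }
    return needed <= grants


# Hypothetical grants; the group ACL can exist before any consumer joins.
GRANTS = {
    ("user:alice", "TOPIC", "orders", "READ"),
    ("user:alice", "GROUP", "billing-consumers", "WRITE"),
    ("user:bob", "TOPIC", "orders", "READ"),
}
```

In this model alice may consume `orders` as part of `billing-consumers`, while bob, who lacks WRITE on the group, may not, even though he can read the topic.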
Re: [VOTE] KIP-11- Authorization design for kafka security
I will move the comments about subject versus principal wrt session to the PR above. The comments around keys, etc. are more appropriate there. If I tie this together with my comments in the thread on SASL / Kerberos, what I am having a hard time figuring out is the pluggable framework for both authentication and authorization versus the implementation of specific authentication and authorization providers. As for caching decisions, it just seems silly to authorize on the same operation over and over again (e.g. publishing to the same topic), but perhaps if the ACLs are small enough this will be ok. On Fri, Apr 24, 2015 at 2:18 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for your comments Gari. My responses are inline. Thanks Parth On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote: Sorry - fat fingered send ... Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group or perhaps someone would want to use both the username and the clientIP as Principals). I think the user - group mapping can be done at the Authorization implementation layer. In any case as you pointed out the session is part of another jira and I think a PR is out https://reviews.apache.org/r/27204/diff/ and we should discuss it on that PR. 2) We would then also have multiple concrete Principals, e.g.
KafkaPrincipal KafkaUserPrincipal KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc. This is important as eventually (hopefully sooner than later), we will support multiple types of authentication which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs] ... or {version:1, {acls:[ { principals:[KafkaUserPrincipal:alice,KafkaGroupPrincipal:kafka-devs] But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals. The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward. All the principals that you listed above can be supported with the current design. Acls take a KafkaPrincipal as input, which is a combination of type and principal name, and the authorizer implementations are free to create any extension of this which covers group: groupName, host: HostName, kerberos: kerberosUserName and any other types that may come up. I am not sure how encryption key storage is relevant to the Authorizer, so it would be great if you can elaborate. 3) Are you sure that you want authorize to take a session object? If we use the model in one above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs.
I think it is better to take a session which can just be a wrapper on top of Subject + host for now. This allows for extension which in my opinion is more future requirement proof. 4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well? In the default implementation I don’t plan to do this. It is easy to add later if we want to, but I am not sure why this would ever be expensive when acls are cached, the number of acls on a single topic should be very small, and iterating over them with simple string comparison should not really be expensive. Thanks Parth On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com wrote: Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group or perhaps someone would want to use both the username and the clientIP as Principals) 2) We would then also have multiple
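The typed-principal idea discussed in this exchange (a principal as a combination of type and name, and a session wrapping a set of principals plus a host) could look roughly like the sketch below. It assumes the "Type:name" string form from the second ACL example in the thread; TypedPrincipal, Session and aclAppliesTo are hypothetical names chosen for illustration, not the classes from the KIP or the patch.

```java
import java.util.List;
import java.util.Objects;
import java.util.Set;

class TypedPrincipalSketch {
    // Hypothetical principal: a type such as "KafkaUserPrincipal" plus a name such as "alice".
    static final class TypedPrincipal {
        final String type;
        final String name;

        TypedPrincipal(String type, String name) {
            this.type = type;
            this.name = name;
        }

        // Parses the assumed "Type:name" form, splitting on the first colon.
        static TypedPrincipal parse(String raw) {
            int idx = raw.indexOf(':');
            if (idx <= 0 || idx == raw.length() - 1) {
                throw new IllegalArgumentException("expected Type:name, got: " + raw);
            }
            return new TypedPrincipal(raw.substring(0, idx), raw.substring(idx + 1));
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TypedPrincipal)) {
                return false;
            }
            TypedPrincipal other = (TypedPrincipal) o;
            return type.equals(other.type) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, name);
        }
    }

    // Hypothetical session: the authenticated principals plus the client host.
    static final class Session {
        final Set<TypedPrincipal> principals;
        final String host;

        Session(Set<TypedPrincipal> principals, String host) {
            this.principals = principals;
            this.host = host;
        }
    }

    // True if any principal listed on the ACL is held by the session.
    static boolean aclAppliesTo(List<String> aclPrincipals, Session session) {
        for (String raw : aclPrincipals) {
            if (session.principals.contains(TypedPrincipal.parse(raw))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Session session = new Session(
                Set.of(new TypedPrincipal("KafkaUserPrincipal", "alice"),
                       new TypedPrincipal("KafkaGroupPrincipal", "kafka-devs")),
                "Host1");
        System.out.println(aclAppliesTo(List.of("KafkaGroupPrincipal:kafka-devs"), session));
        System.out.println(aclAppliesTo(List.of("KafkaUserPrincipal:bob"), session));
    }
}
```

Because the check is a set lookup on (type, name) pairs, the same subject can match one ACL through its user principal and another through its group principal, which is the flexibility both sides of the thread were after.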