Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-21 Thread Jun Rao
Parth,

Thanks for driving this. Could you update the status of the KIP in the wiki?

Thanks,

Jun

On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 This vote is now Closed with 4 binding +1s and 4 non binding +1s.

 Thanks
 Parth

 On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote:

 +1
 
 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote:
  Hi,
 
  Opening the voting thread for KIP-11.
 
  Link to the KIP:
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
  Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
 
  Thanks
  Parth
 




Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-21 Thread Parth Brahmbhatt
I am sorry to be ignorant about this but what is the new state? Adopted
seems too early given we are still in code review process. Should I just
make it "Code review"?

Thanks
Parth

On 5/21/15, 8:43 AM, Jun Rao j...@confluent.io wrote:

Parth,

Thanks for driving this. Could you update the status of the KIP in the
wiki?

Thanks,

Jun

On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 This vote is now Closed with 4 binding +1s and 4 non binding +1s.

 Thanks
 Parth

 On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote:

 +1
 
 On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote:
  Hi,
 
  Opening the voting thread for KIP-11.
 
  Link to the KIP:
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
  Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
 
  Thanks
  Parth
 





Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-21 Thread Gwen Shapira
The KIP and design were accepted, so the WIKI should say accepted or
something similar.
Specific patch status is reflected in the JIRA.

On Thu, May 21, 2015 at 8:37 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 I am sorry to be ignorant about this but what is the new state? Adopted
 seems too early given we are still in code review process. Should I just
 make it "Code review"?

 Thanks
 Parth

 On 5/21/15, 8:43 AM, Jun Rao j...@confluent.io wrote:

 Parth,
 
 Thanks for driving this. Could you update the status of the KIP in the
 wiki?
 
 Thanks,
 
 Jun
 
 On Wed, May 20, 2015 at 2:37 PM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:
 
  This vote is now Closed with 4 binding +1s and 4 non binding +1s.
 
  Thanks
  Parth
 
  On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote:
 
  +1
  
  On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote:
   Hi,
  
   Opening the voting thread for KIP-11.
  
   Link to the KIP:
  
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
   Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
  
   Thanks
   Parth
  
 
 




Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-20 Thread Parth Brahmbhatt
This vote is now Closed with 4 binding +1s and 4 non binding +1s.

Thanks
Parth

On 5/20/15, 12:04 PM, Joel Koshy jjkosh...@gmail.com wrote:

+1

On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote:
 Hi,
 
 Opening the voting thread for KIP-11.
 
 Link to the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
 
 Thanks
 Parth




Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-20 Thread Joel Koshy
+1

On Fri, May 15, 2015 at 04:18:49PM +, Parth Brahmbhatt wrote:
 Hi,
 
 Opening the voting thread for KIP-11.
 
 Link to the KIP: 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
 
 Thanks
 Parth



Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-18 Thread Joe Stein
+1

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Fri, May 15, 2015 at 7:35 PM, Jun Rao j...@confluent.io wrote:

 +1

 Thanks,

 Jun

 On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi,
 
  Opening the voting thread for KIP-11.
 
  Link to the KIP:
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
  Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
 
  Thanks
  Parth
 



[Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Parth Brahmbhatt
Hi,

Opening the voting thread for KIP-11.

Link to the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

Thanks
Parth


Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Jun Rao
+1

Thanks,

Jun

On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Hi,

 Opening the voting thread for KIP-11.

 Link to the KIP:
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

 Thanks
 Parth



Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Harsha
+1 non-binding

On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:

Hi,

Opening the voting thread for KIP-11.

Link to the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

Thanks
Parth

Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Jay Kreps
+1

-Jay

On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Hi,

 Opening the voting thread for KIP-11.

 Link to the KIP:
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

 Thanks
 Parth



Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Don Bosco Durai
+1 non-binding


On 5/15/15, 11:43 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 non-binding

On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote:

 +1 non-binding

 On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:

 Hi,

 Opening the voting thread for KIP-11.

 Link to the KIP:
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

 Thanks
 Parth





Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Gwen Shapira
+1 non-binding

On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote:

 +1 non-binding

 On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:

 Hi,

 Opening the voting thread for KIP-11.

 Link to the KIP:
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

 Thanks
 Parth



Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Tom Graves
+1 non-binding.
Tom Graves 


 On Friday, May 15, 2015 2:00 PM, Don Bosco Durai bo...@apache.org wrote:
   

 +1 non-binding


On 5/15/15, 11:43 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 non-binding

On Fri, May 15, 2015 at 9:12 PM, Harsha harsh...@fastmail.fm wrote:

 +1 non-binding

 On Fri, May 15, 2015 at 9:18 AM -0700, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:

 Hi,

 Opening the voting thread for KIP-11.

 Link to the KIP:
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
 Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

 Thanks
 Parth





   

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-05-01 Thread Jun Rao
Suresh,

We typically wrap up the voting of a KIP in a few days. However, given that
this KIP is quite critical and there seems to be new questions, perhaps we
can spend a bit more time to have people's concerns addressed and then
resume the voting.

Joe,

Do you still have concerns given the previous replies?

Thanks,

Jun


On Thu, Apr 30, 2015 at 7:54 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 It is a strange choice to return "does not exist" when the condition is
 actually "not authorized". I have a hard time understanding why that is
 better for security. Perhaps in the DB world this is expected and changes may
 be necessary to comply with such behavior. But that should not guide what
 we do in Kafka.

 This is a voting thread for an important feature. Security is the number
 one feature that our users are asking for. Can't minor things like this be
 done in follow-up JIRAs? Should the focus be brought back to voting?

 Btw, since I am new to the Kafka community, is there a period by which a voting
 thread needs to wrap up? Other projects generally follow 3 or 7 days.

 Regards,
 Suresh

 Sent from phone

 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 5:32 PM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe,
 
  Let me clarify on authZException. The caller gets a 403 regardless of
  existence of the topic; even if the topic does not exist you always get
  403. This will fall under the case where we do not find any acls for a
  resource and as per our last decision by default we are going to deny
  this request.
 

 The reason I'm digging into this is that in Hive we had to fix existing
 behavior after financial customers objected loudly to getting "insufficient
 privileges" when a real database would return "table does not exist".

 I completely agree that having to handle two separate error conditions
 (TopicNotExist if user doesn't have READ, unless user has CREATE in which
 case he can see all topics and can get Unauthorized) adds complexity and
 will not be fun to debug. However, when implementing security, a lot of the
 stuff we do is around making customers pass security audits, and I suspect
 that a "can't know that tables even exist" test is a thing.

 We share pretty much the same financial customers and they seem to have the
 same concerns. Perhaps you can double check if you also have this
 requirement?

 (and again, sorry for not seeing this earlier and holding up the vote on
 what seems like a minor point. I just don't want to punt for later
 something when we already have an idea of what customers expect)

 Gwen
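
 Editor's illustration: the two-error behavior described above can be
 sketched roughly as follows. This is only one reading of the proposal, not
 code from the KIP or the patch. The class, enum, and method names are
 hypothetical, and the handling of a missing topic for a user who holds only
 CREATE is an assumption, since the thread does not spell that case out.

```java
import java.util.EnumSet;
import java.util.Set;

public class TopicErrorSketch {
    // Hypothetical names for illustration; not taken from the Kafka code base.
    enum Operation { READ, CREATE }
    enum Outcome { OK, TOPIC_NOT_EXIST, UNAUTHORIZED }

    // Decide what a caller sees when trying to read a topic.
    static Outcome resolveRead(boolean topicExists, Set<Operation> granted) {
        if (granted.contains(Operation.READ)) {
            // Authorized readers get the truthful answer.
            return topicExists ? Outcome.OK : Outcome.TOPIC_NOT_EXIST;
        }
        if (granted.contains(Operation.CREATE)) {
            // Assumption: CREATE holders can see all topics, so revealing
            // that an existing topic is off limits leaks nothing new.
            return topicExists ? Outcome.UNAUTHORIZED : Outcome.TOPIC_NOT_EXIST;
        }
        // Everyone else cannot learn whether the topic exists at all.
        return Outcome.TOPIC_NOT_EXIST;
    }

    public static void main(String[] args) {
        System.out.println(resolveRead(true, EnumSet.noneOf(Operation.class)));
        System.out.println(resolveRead(true, EnumSet.of(Operation.CREATE)));
        System.out.println(resolveRead(true, EnumSet.of(Operation.READ)));
    }
}
```

 The alternative described earlier in the thread (always answer with an
 authorization error when no acl matches) would collapse all three branches
 into a single deny, which is simpler to debug but fails the "can't know
 that it exists" style of audit.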



 
  The configurations are listed explicitly here
  https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizatiInterface-Changestoexistingclasses
  under KafkaConfig. We may add an optional config to allow the authorizer to
  read arbitrary property files incrementally, but that does not need to be
  part of this same KIP.
 
  The statement “If we can't audit the access then what good is controlling
  the access?” seems extreme because we still get to control the access
  which IMHO is a huge win. The default authorizer implementation right now
  logs every allowed/denied access in debug mode (see here
  https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAthorizer.scala).
  Anybody who needs auditing could create a log4j appender to allow debug
  access to this class and send the log output to some audit file.
 
  Auditing is still a separate piece; we could either add an auditor
  interface that wraps authorizer or the other way around so authorizer and
  auditor can be two separate implementations. I would love to start a new
  KIP and jira to discuss approaches in more detail but I don’t see the
  need to hold up Authorization work for the same.
 
  I don’t agree with the “this design seems too specific” given we already
  have 3 implementations (default, ranger, sentry) that can be supported
  with the current design.
 
  The authorization happens as part of handle and it is the first action,
  see here
  https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103
  for one example.
 
  Thanks
  Parth
 
 
 
  On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote:
 
  Joe, thanks for the clarification.
  
  Regarding audits, sorry I might be misunderstanding your email.
  Currently, if Kafka does not support audits, I think audits should be
  considered as a separate effort. Here are the reasons:
  - Audit, whether authorization is available or not, should record
  operations to determine what

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Joe Stein
Hi, sorry I am coming in late to chime back in on this thread and haven't
been able to make the KIP hangouts the last few weeks. Sorry if any of this
was brought up already or I missed it.

I read through the KIP and the thread(s) and a couple of things jumped out.


   - Can we break out the open issues in JIRA (maybe during the hangout)
   that are in the KIP and resolve/flesh those out more?



   - I don't see any updates with the systems test or how we can know the
   code works.



   - We need some implementation/example/sample that we know can work in
   all different existing entitlement servers and not just ones that run in
   types of data centers too. I am not saying we should support everything but
   if someone had to implement
   https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with
   Kafka it has to work for them out of the box.



   - We should shy away from storing JSON in Zookeeper. Let's store bytes in
   Storage.



   - We should spend some time thinking through exceptions in the wire
   protocol maybe as part of this so it can keep moving forward.


~ Joe Stein

On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com wrote:

 Thank you for your reply, Gwen.

 1. Complex rule systems can be difficult to reason about and therefore
 end up being less secure. The rule "Deny always wins" is very easy to grasp.
 Yes, I agree with your point: we should not make the rule complex.

 2. We currently don't have any mechanism for specifying IP ranges (or host
 ranges) at all. I think it's a pretty significant deficiency, but it does
 mean that we don't need to worry about the issue of blocking a large range
 while unblocking a few servers in the range.
 Supporting ranges sounds reasonable. If this feature will be in the development
 plan, I also don't think we can put "the best matching acl" and "Support
 ip ranges" together.

 We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this
 and other outstanding design issues (not all related to security). If you
 are interested in joining - let me know and I'll forward you the invite.
 Thank you, Gwen. I have the invite and I should be at home at that time.
 But due to network issues, I may not be able to join the meeting smoothly.

 Regards
 Dapeng

 -Original Message-
 From: Gwen Shapira [mailto:gshap...@cloudera.com]
 Sent: Tuesday, April 28, 2015 1:31 PM
 To: dev@kafka.apache.org
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

 While I see the advantage of being able to say something like: "deny user
 X from hosts h1...h200" also "allow user X from host h189", there are two
 issues here:

 1. Complex rule systems can be difficult to reason about and therefore end
 up being less secure. The rule "Deny always wins" is very easy to grasp.

 2. We currently don't have any mechanism for specifying IP ranges (or host
 ranges) at all. I think it's a pretty significant deficiency, but it does
 mean that we don't need to worry about the issue of blocking a large range
 while unblocking a few servers in the range.

 Gwen

 P.S
 We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this
 and other outstanding design issues (not all related to security). If you
 are interested in joining - let me know and I'll forward you the invite.

 Gwen

 On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com
 wrote:

  Attach the image.
 
  https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png
 
  Regards
  Dapeng
 
  From: Sun, Dapeng [mailto:dapeng@intel.com]
  Sent: Tuesday, April 28, 2015 11:44 AM
  To: dev@kafka.apache.org
  Subject: RE: [VOTE] KIP-11- Authorization design for kafka security
 
 
  Thank you for your rapid reply, Parth.
 
 
 
  * I think the wiki already describes the precedence order as Deny taking
  precedence over allow when conflicting acls are found
  https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType
 
  Got it, thank you.
 
 
 
  * In the first version that I am currently writing there is no group
  support. Even when we add it I don't see the need to add a precedence
  for evaluation. it does not matter which principal matches as long as
 
   we have a match.
 
 
 
  About this part, I think we should choose the best matching acl for
  authorization, whether or not we support groups.
 
 
 
  For the case
 
   [cid:image001.png@01D08197.E94BD410]
 
  https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png
 
 
 
  If 2 Acls are defined, one that denies an operation from all hosts and
  one that allows the operation from host1, will the operation from host1
  be denied or allowed?

  According to the wiki, "Deny will take precedence over Allow in competing
  acls.", so it seems acl_1 will win the competition, but the customer's
  intention may be to allow.
 
  I think deny always take precedence over Allow is okay, but  host1
  - user1host1 default may

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Jun Rao
Joe,

Could you elaborate on why we should not store JSON in ZK? So far, all
existing ZK data are in JSON.

Thanks,

Jun

On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly wrote:

 Hi, sorry I am coming in late to chime back in on this thread and haven't
 been able to make the KIP hangouts the last few weeks. Sorry if any of this
 was brought up already or I missed it.

 I read through the KIP and the thread(s) and a couple of things jumped out.


- Can we break out the open issues in JIRA (maybe during the hangout)
that are in the KIP and resolve/flesh those out more?



- I don't see any updates with the systems test or how we can know the
code works.



- We need some implementation/example/sample that we know can work in
all different existing entitlement servers and not just ones that run in
types of data centers too. I am not saying we should support everything
 but
if someone had to implement
https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with
Kafka it has to work for them out of the box.



- We should shy away from storing JSON in Zookeeper. Let's store bytes in
Storage.



- We should spend some time thinking through exceptions in the wire
protocol maybe as part of this so it can keep moving forward.


 ~ Joe Stein

 On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com wrote:

  Thank you for your reply, Gwen.
 
  1. Complex rule systems can be difficult to reason about and therefore
  end up being less secure. The rule Deny always wins is very easy to
 grasp.
  Yes, I'm agreed with your point: we should not make the rule complex.
 
  2. We currently don't have any mechanism for specifying IP ranges (or
 host
  ranges) at all. I think its a pretty significant deficiency, but it does
  mean that we don't need to worry about the issue of blocking a large
 range
  while unblocking few servers in the range.
  Support ranges sounds reasonable. If this feature will be in development
  plan, I also don't think we can put the best matching acl and  Support
  ip ranges together.
 
  We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this
  and other outstanding design issues (not all related to security). If you
  are interested in joining - let me know and I'll forward you the invite.
  Thank you, Gwen. I have the invite and I should be at home at that time.
  But due to network issue, I may can't join the meeting smoothly.
 
  Regards
  Dapeng
 
  -Original Message-
  From: Gwen Shapira [mailto:gshap...@cloudera.com]
  Sent: Tuesday, April 28, 2015 1:31 PM
  To: dev@kafka.apache.org
  Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 
  While I see the advantage of being able to say something like: deny user
  X from hosts h1...h200 also allow user X from host h189, there are two
  issues here:
 
  1. Complex rule systems can be difficult to reason about and therefore
 end
  up being less secure. The rule Deny always wins is very easy to grasp.
 
  2. We currently don't have any mechanism for specifying IP ranges (or
 host
  ranges) at all. I think its a pretty significant deficiency, but it does
  mean that we don't need to worry about the issue of blocking a large
 range
  while unblocking few servers in the range.
 
  Gwen
 
  P.S
  We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this
  and other outstanding design issues (not all related to security). If you
  are interested in joining - let me know and I'll forward you the invite.
 
  Gwen
 
  On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com
  wrote:
 
   Attach the image.
  
   https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac
   l1.png
  
   Regards
   Dapeng
  
   From: Sun, Dapeng [mailto:dapeng@intel.com]
   Sent: Tuesday, April 28, 2015 11:44 AM
   To: dev@kafka.apache.org
   Subject: RE: [VOTE] KIP-11- Authorization design for kafka security
  
  
   Thank you for your rapid reply, Parth.
  
  
  
   * I think the wiki already describes the precedence order as Deny
   taking
   precedence over allow when conflicting acls are found
   https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati
   on+In
  
   terface#KIP-11-AuthorizationInterface-PermissionType
  
   Got it, thank you.
  
  
  
   * In the first version that I am currently writing there is no group
   support. Even when we add it I don't see the need to add a precedence
   for evaluation. it does not matter which principal matches as long as
  
we have a match.
  
  
  
   About this part, I think we should choose the best matching acl for
   authorization, no matter we support group or not.
  
  
  
   For the case
  
[cid:image001.png@01D08197.E94BD410]
  
   https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac
   l1.png
  
  
  
   if 2 Acls are defined, one that deny an operation from all hosts and
   one that allows the operation

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Gwen Shapira
 smoothly.
  
   Regards
   Dapeng
  
   -Original Message-
   From: Gwen Shapira [mailto:gshap...@cloudera.com]
   Sent: Tuesday, April 28, 2015 1:31 PM
   To: dev@kafka.apache.org
   Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
  
   While I see the advantage of being able to say something like: deny
 user
   X from hosts h1...h200 also allow user X from host h189, there are
 two
   issues here:
  
   1. Complex rule systems can be difficult to reason about and therefore
  end
   up being less secure. The rule Deny always wins is very easy to
 grasp.
  
   2. We currently don't have any mechanism for specifying IP ranges (or
  host
   ranges) at all. I think its a pretty significant deficiency, but it
 does
   mean that we don't need to worry about the issue of blocking a large
  range
   while unblocking few servers in the range.
  
   Gwen
  
   P.S
   We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss
 this
   and other outstanding design issues (not all related to security). If
 you
   are interested in joining - let me know and I'll forward you the
 invite.
  
   Gwen
  
   On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com
   wrote:
  
Attach the image.
   
   
 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac
l1.png
   
Regards
Dapeng
   
From: Sun, Dapeng [mailto:dapeng@intel.com]
Sent: Tuesday, April 28, 2015 11:44 AM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-11- Authorization design for kafka security
   
   
Thank you for your rapid reply, Parth.
   
   
   
* I think the wiki already describes the precedence order as Deny
taking
precedence over allow when conflicting acls are found
   
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati
on+In
   
terface#KIP-11-AuthorizationInterface-PermissionType
   
Got it, thank you.
   
   
   
* In the first version that I am currently writing there is no
 group
support. Even when we add it I don't see the need to add a
 precedence
for evaluation. it does not matter which principal matches as long
 as
   
 we have a match.
   
   
   
About this part, I think we should choose the best matching acl for
authorization, no matter we support group or not.
   
   
   
For the case
   
 [cid:image001.png@01D08197.E94BD410]
   
   
 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac
l1.png
   
   
   
if 2 Acls are defined, one that deny an operation from all hosts and
one that allows the operation from host1, the operation from host1
will be denied or allowed?
   
According wiki Deny will take precedence over Allow in competing
acls., it seems acl_1 will win the competition, but customers'
intention may be allow.
   
I think deny always take precedence over Allow is okay, but
 host1
- user1host1 default may make sense.
   
   
   
   
   
* Acl storage is indexed by resource right now because that is the
primary lookup id for all authorize operations. Given acls are cached,
I don't see the need to optimize the storage layer any further for
lookup.

* The reason why we have acl with multi everything is to reduce
redundancy in acl storage. I am not sure how we will be able to reduce
redundancy if we divide it by using one principal, one host, one
operation.
   
   
   
Yes, I also agree that Acl storage should be indexed by resource.
Under the resource index, it may be better to add indexes such as hosts and
principals. One option may be one principal, one host, one operation.
I just offer these scenarios for consideration.
   
   
   
For the case defined in wiki:
   
Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.
   
Acl_2 - {user:bob} is denied to READ from host1
   
Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and
WRITE from {host1, host2}.
   
   
   
For acl_3, if we want to remove alice's WRITE from {host1,host2} and
remove alice's READ from host1, the user may have the following ways to
achieve this:
   
   
   
1. Remove the parts of acl_3 directly. I think if we make it divided
and hierarchical, this kind of operation could be done directly in the
backend.

2. Remove acl_3, and add new acls: {group:kafka-devs} is allowed to
READ and WRITE from {host1, host2} and {user:alice} is allowed to
READ from {host2}.

3. Add two denied acls: {user:alice} is denied to WRITE from
{host1,host2} and {user:alice} is denied to READ from {host1}.
   
   
   
All of these can achieve this kind of operation, but I think option 1 could be
more direct for the user. If you think this optimization is
not urgent, I also agree.
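
Editor's illustration: the "Deny takes precedence over Allow" rule, applied to
the three wiki acls quoted above, can be sketched roughly as follows. This is a
minimal sketch under stated assumptions, not the SimpleAclAuthorizer code: the
class and method names are hypothetical, "*" as the all-hosts marker is an
assumption, group membership is not resolved, and "no matching acl means deny"
follows the default mentioned elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DenyWinsSketch {
    enum Permission { ALLOW, DENY }

    // Hypothetical acl shape: many principals, operations, and hosts per acl.
    static final class Acl {
        final Permission permission;
        final Set<String> principals;
        final Set<String> operations;
        final Set<String> hosts; // "*" stands for all hosts (assumption)

        Acl(Permission permission, Set<String> principals,
            Set<String> operations, Set<String> hosts) {
            this.permission = permission;
            this.principals = principals;
            this.operations = operations;
            this.hosts = hosts;
        }

        boolean matches(String principal, String operation, String host) {
            boolean principalOk = principals.contains(principal)
                || (principal.startsWith("user:") && principals.contains("user:*"));
            boolean hostOk = hosts.contains("*") || hosts.contains(host);
            return principalOk && operations.contains(operation) && hostOk;
        }
    }

    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Any matching DENY wins; otherwise at least one ALLOW must match.
    static boolean authorize(List<Acl> acls, String principal,
                             String operation, String host) {
        boolean allowed = false;
        for (Acl acl : acls) {
            if (!acl.matches(principal, operation, host)) {
                continue;
            }
            if (acl.permission == Permission.DENY) {
                return false;
            }
            allowed = true;
        }
        return allowed; // no matching acl at all means deny
    }

    // Acl_1, Acl_2, Acl_3 as written in the wiki example.
    static List<Acl> wikiAcls() {
        return Arrays.asList(
            new Acl(Permission.ALLOW, setOf("user:bob", "user:*"),
                    setOf("READ"), setOf("*")),
            new Acl(Permission.DENY, setOf("user:bob"),
                    setOf("READ"), setOf("host1")),
            new Acl(Permission.ALLOW, setOf("user:alice", "group:kafka-devs"),
                    setOf("READ", "WRITE"), setOf("host1", "host2")));
    }

    public static void main(String[] args) {
        List<Acl> acls = wikiAcls();
        System.out.println(authorize(acls, "user:bob", "READ", "host1"));    // false
        System.out.println(authorize(acls, "user:bob", "READ", "host2"));    // true
        System.out.println(authorize(acls, "user:alice", "WRITE", "host1")); // true
        System.out.println(authorize(acls, "user:alice", "WRITE", "host3")); // false
    }
}
```

Because evaluation never ranks acls by specificity, option 3 above (adding
deny acls for alice) works without touching acl_3, while a "best matching
acl" rule would need an ordering that this sketch deliberately avoids.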
   
   
   
Regards
   
Dapeng
   
   
   
-Original Message-
   
From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Joe Stein
 j...@confluent.io wrote:
 
  Joe,
  
  Could you elaborate on why we should not store JSON in ZK? So far, all
  existing ZK data are in JSON.
  
  Thanks,
  
  Jun
  
  On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly
 wrote:
  
   Hi, sorry I am coming in late to chime back in on this thread and
  haven't
   been able to make the KIP hangouts the last few weeks. Sorry if any of
  this
   was brought up already or I missed it.
  
   I read through the KIP and the thread(s) and a couple of things jumped
  out.
  
  
  - Can we break out the open issues in JIRA (maybe during the
 hangout)
  that are in the KIP and resolve/flesh those out more?
  
  
  
  - I don't see any updates with the systems test or how we can know
  the
  code works.
  
  
  
  - We need some implementation/example/sample that we know can work
 in
  all different existing entitlement servers and not just ones that
  run in
  types of data centers too. I am not saying we should support
  everything
   but
  if someone had to implement
  https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html
 with
  Kafka it has to work for them out of the box.
  
  
  
  - We should shy away from storing JSON in Zookeeper. Lets store
  bytes in
  Storage.
  
  
  
  - We should spend some time thinking through exceptions in the wire
  protocol maybe as part of this so it can keep moving forward.
  
  
   ~ Joe Stein
  
   On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com
  wrote:
  
Thank you for your reply, Gwen.
   
1. Complex rule systems can be difficult to reason about and
  therefore
end up being less secure. The rule Deny always wins is very easy
 to
   grasp.
Yes, I'm agreed with your point: we should not make the rule
 complex.
   
2. We currently don't have any mechanism for specifying IP ranges
 (or
   host
ranges) at all. I think its a pretty significant deficiency, but it
  does
mean that we don't need to worry about the issue of blocking a large
   range
while unblocking few servers in the range.
Support ranges sounds reasonable. If this feature will be in
  development
plan, I also don't think we can put the best matching acl and 
  Support
ip ranges together.
   
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss
  this
and other outstanding design issues (not all related to security).
 If
  you
are interested in joining - let me know and I'll forward you the
  invite.
Thank you, Gwen. I have the invite and I should be at home at that
  time.
But due to network issue, I may can't join the meeting smoothly.
   
Regards
Dapeng
   
-Original Message-
From: Gwen Shapira [mailto:gshap...@cloudera.com]
Sent: Tuesday, April 28, 2015 1:31 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
   
While I see the advantage of being able to say something like: deny
  user
X from hosts h1...h200 also allow user X from host h189, there
 are
  two
issues here:
   
1. Complex rule systems can be difficult to reason about and
 therefore
   end
up being less secure. The rule Deny always wins is very easy to
  grasp.
   
2. We currently don't have any mechanism for specifying IP ranges
 (or
   host
ranges) at all. I think its a pretty significant deficiency, but it
  does
mean that we don't need to worry about the issue of blocking a large
   range
while unblocking few servers in the range.
   
Gwen
   
P.S
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss
  this
and other outstanding design issues (not all related to security).
 If
  you
are interested in joining - let me know and I'll forward you the
  invite.
   
Gwen
   
On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com
 
wrote:
   
 Attach the image.


  https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-ac
 l1.png

 Regards
 Dapeng

 From: Sun, Dapeng [mailto:dapeng@intel.com]
 Sent: Tuesday, April 28, 2015 11:44 AM
 To: dev@kafka.apache.org
 Subject: RE: [VOTE] KIP-11- Authorization design for kafka
 security


 Thank you for your rapid reply, Parth.



 * I think the wiki already describes the precedence order as Deny
 taking
 precedence over allow when conflicting acls are found

  https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

 Got it, thank you.



 * In the first version that I am currently writing there is no
  group
 support. Even when we add it I don't see the need to add a
  precedence
 for evaluation. it does not matter which principal matches as long
  as

  we have a match

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Sriharsha Chintalapani
 that the code works. We have  
  thorough unit tests for all the new code except for modifications made to  
  KafkaAPI as that has way too many dependencies to be mocked which I guess  
  is the reason for no existing unit tests.  
  * I don’t know if I completely understand the concern. We have talked  
 with  
  Ranger team (Don Bosco Durai) so we at least have one custom authorizer  
  implementation that has approved this design and they will be able to  
  inject their authorization framework with current interfaces. Do you see  
  any issue with the design which will prevent anyone from providing a  
  custom implementation?  
  * Did not understand the concern around wire protocol, we are adding  
  AuthorizationException to indicate that an operation was not authorized.  
   
  Thanks  
  Parth  
   
  On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote:  
   
  Joe,  

  Could you elaborate on why we should not store JSON in ZK? So far, all  
  existing ZK data are in JSON.  

  Thanks,  

  Jun  

  On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly  
 wrote:  

   Hi, sorry I am coming in late to chime back in on this thread and  
  haven't  
   been able to make the KIP hangouts the last few weeks. Sorry if any of  
  this  
   was brought up already or I missed it.  

   I read through the KIP and the thread(s) and a couple of things jumped  
  out.  


   - Can we break out the open issues in JIRA (maybe during the  
 hangout)  
   that are in the KIP and resolve/flesh those out more?  



   - I don't see any updates with the systems test or how we can know  
  the  
   code works.  



   - We need some implementation/example/sample that we know can work  
 in  
   all different existing entitlement servers and not just ones that  
  run in  
   types of data centers too. I am not saying we should support  
  everything  
   but  
   if someone had to implement  
   https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html  
 with  
   Kafka it has to work for them out of the box.  



   - We should shy away from storing JSON in Zookeeper. Lets store  
  bytes in  
   Storage.  



   - We should spend some time thinking through exceptions in the wire  
   protocol maybe as part of this so it can keep moving forward.  


   ~ Joe Stein  

   On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com  
  wrote:  

Thank you for your reply, Gwen.  
 
1. Complex rule systems can be difficult to reason about and  
  therefore  
end up being less secure. The rule Deny always wins is very easy  
 to  
   grasp.  
Yes, I'm agreed with your point: we should not make the rule  
 complex.  
 
2. We currently don't have any mechanism for specifying IP ranges  
 (or  
   host  
ranges) at all. I think it's a pretty significant deficiency, but it
  does  
mean that we don't need to worry about the issue of blocking a large  
   range  
while unblocking few servers in the range.  
Support ranges sounds reasonable. If this feature will be in  
  development  
plan, I also don't think we can put the best matching acl and   
  Support  
ip ranges together.  
 
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss  
  this  
and other outstanding design issues (not all related to security).  
 If  
  you  
are interested in joining - let me know and I'll forward you the  
  invite.  
Thank you, Gwen. I have the invite and I should be at home at that  
  time.  
But due to a network issue, I may not be able to join the meeting smoothly.
 
Regards  
Dapeng  
 
-Original Message-  
From: Gwen Shapira [mailto:gshap...@cloudera.com]  
Sent: Tuesday, April 28, 2015 1:31 PM  
To: dev@kafka.apache.org  
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security  
 
While I see the advantage of being able to say something like: deny  
  user  
X from hosts h1...h200 also allow user X from host h189, there  
 are  
  two  
issues here:  
 
1. Complex rule systems can be difficult to reason about and  
 therefore  
   end  
up being less secure. The rule Deny always wins is very easy to  
  grasp.  
 
2. We currently don't have any mechanism for specifying IP ranges  
 (or  
   host  
ranges) at all. I think it's a pretty significant deficiency, but it
  does  
mean that we don't need to worry about the issue of blocking a large  
   range  
while unblocking few servers in the range.  
 
Gwen  
 
P.S  
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss  
  this  
and other outstanding design issues (not all related to security).  
 If  
  you  
are interested in joining - let me know and I'll forward you the  
  invite.  
 
Gwen  
 
On Mon, Apr 27, 2015

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Joe Stein
 the
   hangout)
that are in the KIP and resolve/flesh those out more?



- I don't see any updates with the systems test or how we can
  know
the
code works.



- We need some implementation/example/sample that we know can
  work
   in
all different existing entitlement servers and not just ones
  that
run in
types of data centers too. I am not saying we should support
everything
 but
if someone had to implement

 https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html
   with
Kafka it has to work for them out of the box.



- We should shy away from storing JSON in Zookeeper. Let's
 store
bytes in
Storage.



- We should spend some time thinking through exceptions in
 the
  wire
protocol maybe as part of this so it can keep moving forward.


 ~ Joe Stein

 On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng
 dapeng@intel.com
  
wrote:

  Thank you for your reply, Gwen.
 
  1. Complex rule systems can be difficult to reason about and
therefore
  end up being less secure. The rule Deny always wins is very
  easy
   to
 grasp.
  Yes, I'm agreed with your point: we should not make the rule
   complex.
 
  2. We currently don't have any mechanism for specifying IP
  ranges
   (or
 host
  ranges) at all. I think it's a pretty significant deficiency,
  but it
does
  mean that we don't need to worry about the issue of blocking a
  large
 range
  while unblocking few servers in the range.
  Support ranges sounds reasonable. If this feature will be in
development
  plan, I also don't think we can put the best matching acl
 and 
Support
  ip ranges together.
 
  We have a call tomorrow (Tuesday, April 28) at 3pm PST - to
  discuss
this
  and other outstanding design issues (not all related to
  security).
   If
you
  are interested in joining - let me know and I'll forward you
 the
invite.
  Thank you, Gwen. I have the invite and I should be at home at
  that
time.
  But due to a network issue, I may not be able to join the meeting
 smoothly.
 
  Regards
  Dapeng
 
  -Original Message-
  From: Gwen Shapira [mailto:gshap...@cloudera.com]
  Sent: Tuesday, April 28, 2015 1:31 PM
  To: dev@kafka.apache.org
  Subject: Re: [VOTE] KIP-11- Authorization design for kafka
  security
 
  While I see the advantage of being able to say something like:
  deny
user
  X from hosts h1...h200 also allow user X from host h189,
 there
   are
two
  issues here:
 
  1. Complex rule systems can be difficult to reason about and
   therefore
 end
  up being less secure. The rule Deny always wins is very
 easy to
grasp.
 
  2. We currently don't have any mechanism for specifying IP
 ranges
   (or
 host
  ranges) at all. I think it's a pretty significant deficiency,
 but
  it
does
  mean that we don't need to worry about the issue of blocking a
  large
 range
  while unblocking few servers in the range.
 
  Gwen
 
  P.S
  We have a call tomorrow (Tuesday, April 28) at 3pm PST - to
  discuss
this
  and other outstanding design issues (not all related to
  security).
   If
you
  are interested in joining - let me know and I'll forward you
 the
invite.
 
  Gwen
 
  On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng
  dapeng@intel.com
   
  wrote:
 
   Attach the image.
  
  
   
 
 
   https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png
  
   Regards
   Dapeng
  
   From: Sun, Dapeng [mailto:dapeng@intel.com]
   Sent: Tuesday, April 28, 2015 11:44 AM
   To: dev@kafka.apache.org
   Subject: RE: [VOTE] KIP-11- Authorization design for kafka
   security
  
  
   Thank you for your rapid reply, Parth.
  
  
  
   * I think the wiki already describes the precedence order
 as
  Deny
   taking
   precedence over allow when conflicting acls are found
  
   
 
 
   https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType
  
   Got it, thank you.
  
  
  
   * In the first version that I am currently writing there
 is no
group
   support. Even when we add it I don't see the need to add a
precedence
   for evaluation. it does not matter which principal matches
 as
  long
as
  
we have a match.
  
  
  
   About this part, I think we should choose the best matching
 acl
   for
   authorization, regardless of whether we support groups or not.
  
  
  
   For the case
  
[cid:image001

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Joe Stein
Say you have Bucket A and Bucket B: Bucket A holds patients with
Disease X and Bucket B holds patients without Disease X.

Now you try to access Alice from Bucket A and you get a 403, and then
from Bucket B you get a 404.

What does that tell you now about Alice? Yup, she has Disease X.

Uniform non-existence is a good policy for protecting data. If you don't
have permission then 404 not found works too.
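
The leak in the bucket example can be sketched like this. It is only an illustration of the argument: the `BUCKETS` data and both function names are hypothetical, and the numbers are HTTP-style status codes used as shorthand, not Kafka error codes.

```python
# Hypothetical data: which patients live in which bucket.
BUCKETS = {
    "A": {"alice", "carol"},   # patients with Disease X
    "B": {"bob"},              # patients without Disease X
}


def status_distinct(bucket, patient, authorized):
    """Distinguish 'forbidden' from 'not found' (this leaks existence)."""
    if patient not in BUCKETS[bucket]:
        return 404
    return 200 if authorized else 403


def status_uniform(bucket, patient, authorized):
    """Answer 404 both for missing records and for missing permission."""
    if patient not in BUCKETS[bucket] or not authorized:
        return 404
    return 200
```

With `status_distinct`, an unauthorized caller who sees 403 for Alice in Bucket A and 404 in Bucket B has learned which bucket she is in. With `status_uniform` both probes return 404, so the responses reveal nothing.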

The context I had in mind for this discussion is that I thought the
authorization module was going to be more tightly integrated with where
the API responses happen.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 6:51 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Comment on AuthorizationException. I think the intent of exception should
 be to capture why a request is rejected. It is important from API
 perspective to be specific to aid debugging. Having a generic or obfuscated
 exception is not very useful. Does someone on getting an exception reach
 out to an admin to understand if a topic exists or it's an authorization
 issue?

 I am not getting the security concern. The system must ensure that access
 is disallowed by implementing security correctly, not through security by
 obscurity.

 Regards,
 Suresh

 Sent from phone

 _
 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 10:14 AM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 * Regarding additional authorizers:
 Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed
 Sentry can integrate with the current APIs. Dapeng Sun, a committer on
 Sentry had some concerns about the IP privileges and how we prioritize
 privileges - but nothing that prevents Sentry from integrating with the
 existing solution, from what I could see. It seems to me that the design is
 very generic and adapters can be written for other authorization systems
 (after all, you just need to implement setACL, getACL and Authorize - all
 pretty basic), although I can't speak for Oracle's Identity Manager
 specifically.
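
To make the "you just need to implement setACL, getACL and Authorize" point concrete, here is a rough sketch of what such a pluggable interface could look like. The method signatures, the `(principal, operation)` tuple encoding and the `InMemoryAuthorizer` class are assumptions for illustration only; the actual interface is the one defined on the KIP-11 wiki page.

```python
from abc import ABC, abstractmethod


class Authorizer(ABC):
    """Hypothetical three-method authorizer contract (names adapted to Python)."""

    @abstractmethod
    def set_acl(self, resource, acls):
        """Replace the ACL entries stored for a resource."""

    @abstractmethod
    def get_acl(self, resource):
        """Return the ACL entries stored for a resource."""

    @abstractmethod
    def authorize(self, principal, operation, resource):
        """Return True if the principal may perform the operation."""


class InMemoryAuthorizer(Authorizer):
    """Toy adapter; a Sentry or Ranger adapter would delegate to that system."""

    def __init__(self):
        self._acls = {}

    def set_acl(self, resource, acls):
        self._acls[resource] = set(acls)

    def get_acl(self, resource):
        return set(self._acls.get(resource, ()))

    def authorize(self, principal, operation, resource):
        # Default deny: a resource with no entries authorizes nobody.
        return (principal, operation) in self._acls.get(resource, ())
```

An adapter for an external authorization system would implement the same three methods and forward each call to that system.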

 * Regarding AuthorizationException to indicate that an operation was not
 authorized: Sorry I missed this in previous reviews, but now that I look
 at it - Many systems intentionally don't return AuthorizationException when
 READ privilege is missing, since this already gives too much information
 (that the topic exists and that you don't have privileges on it). Instead
 they return a variant of doesn't exist. I'm wondering if this approach is
 applicable / desirable for Kafka as well.
 Note that this doesn't remove the need for AuthorizationException - I'm
 just suggesting a possible refinement on its use.

 Gwen



 On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe, Thanks for taking the time to review.
 
  * All the open issues already have a resolution , I can open a jira for
  each one and add the resolution to it and resolve them immediately if you
  want this for tracking purposes.
  * We will update system tests to verify that the code works. We have
  thorough unit tests for all the new code except for modifications made to
  KafkaAPI as that has way too many dependencies to be mocked which I guess
  is the reason for no existing unit tests.
  * I don’t know if I completely understand the concern. We have talked
 with
  Ranger team (Don Bosco Durai) so we at least have one custom authorizer
  implementation that has approved this design and they will be able to
  inject their authorization framework with current interfaces. Do you see
  any issue with the design which will prevent anyone from providing a
  custom implementation?
  * Did not understand the concern around wire protocol, we are adding
  AuthorizationException to indicate that an operation was not authorized.
 
  Thanks
  Parth
 
  On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote:
 
  Joe,
  
  Could you elaborate on why we should not store JSON in ZK? So far, all
  existing ZK data are in JSON.
  
  Thanks,
  
  Jun
  
  On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly wrote:
  
   Hi, sorry I am coming in late to chime back in on this thread and
  haven't
   been able to make the KIP hangouts the last few weeks. Sorry if any of
  this
   was brought up already or I missed it.
  
   I read through the KIP and the thread(s) and a couple of things jumped
  out.
  
  
  - Can we break out the open issues in JIRA (maybe during the
 hangout)
  that are in the KIP and resolve/flesh those out more?
  
  
  
  - I don't see any updates with the systems test or how we can know
  the
  code works.
  
  
  
  - We need some implementation/example

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Don Bosco Durai
Joe, these are good use cases; however, in the first phase the granularity
is at the Topic level (the bucket, in your example) and not what you are
accessing within the Topic. So in your use case, if you don’t have access to
“Bucket A”, then you won’t know who is in it, so you won’t know about “Alice”
or anyone who has “X”.

The use case here: there is an HL7 topic specifically for “New Patients”.
Only users “A, B or C” can publish to it and only users “X, Y or Z”
can consume from it. In addition, only admin users “P, Q and R” can manage
the topic permissions.
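
The topic-level granularity described above could be modelled roughly as below. The topic name `hl7-new-patients`, the action names and the `can` helper are made up for the example and are not part of the KIP.

```python
# Hypothetical topic-level permissions: who may do what on which topic.
TOPIC_ACLS = {
    "hl7-new-patients": {
        "publish": {"A", "B", "C"},
        "consume": {"X", "Y", "Z"},
        "manage": {"P", "Q", "R"},
    }
}


def can(user, action, topic):
    """Check a permission at topic granularity; unknown topics deny by default."""
    return user in TOPIC_ACLS.get(topic, {}).get(action, set())
```

The check never looks inside the topic: a user either has an action on the whole topic or does not, which is why record-level inferences like the Alice example are out of scope for the first phase.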

I feel that keeping it simple should be good enough for the first phase.

Thanks

Bosco



On 4/30/15, 3:59 PM, Joe Stein joe.st...@stealth.ly wrote:

If you have bucket A and Bucket B and in Bucket A there are patients with
Disease X and Bucket B patients without Disease X.

Now you try to access Alice from bucket A and you get a 403  and then
from Bucket B you get a 404.

What does that tell you now about Alice? Yup, she has Disease X.

Uniform non-existence is a good policy for protecting data. If you don't
have permission then 404 not found works too.

The context that I thought that applied with this discussion is because I
thought the authorization module was going to be a bit more integration
where the api responses were happening

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 6:51 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Comment on AuthorizationException. I think the intent of exception
should
 be to capture why a request is rejected. It is important from API
 perspective to be specific to aid debugging. Having a generic or
obfuscated
 exception is not very useful. Does someone on getting an exception reach
 out to an admin to understand if a topic exists or it's an authorization
 issue?

 I am not getting the security concern. The system must ensure that access
 is disallowed by implementing security correctly, not through security by
 obscurity.

 Regards,
 Suresh

 Sent from phone

 _
 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 10:14 AM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 * Regarding additional authorizers:
 Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed
 Sentry can integrate with the current APIs. Dapeng Sun, a committer on
 Sentry had some concerns about the IP privileges and how we prioritize
 privileges - but nothing that prevents Sentry from integrating with the
 existing solution, from what I could see. It seems to me that the
design is
 very generic and adapters can be written for other authorization systems
 (after all, you just need to implement setACL, getACL and Authorize -
all
 pretty basic), although I can't speak for Oracle's Identity Manager
 specifically.

 * Regarding AuthorizationException to indicate that an operation was
not
 authorized: Sorry I missed this in previous reviews, but now that I
look
 at it - Many systems intentionally don't return AuthorizationException
when
 READ privilege is missing, since this already gives too much information
 (that the topic exists and that you don't have privileges on it).
Instead
 they return a variant of doesn't exist. I'm wondering if this
approach is
 applicable / desirable for Kafka as well.
 Note that this doesn't remove the need for AuthorizationException - I'm
 just suggesting a possible refinement on its use.

 Gwen



 On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe, Thanks for taking the time to review.
 
  * All the open issues already have a resolution , I can open a jira
for
  each one and add the resolution to it and resolve them immediately if
you
  want this for tracking purposes.
  * We will update system tests to verify that the code works. We have
  thorough unit tests for all the new code except for modifications
made to
  KafkaAPI as that has way too many dependencies to be mocked which I
guess
  is the reason for no existing unit tests.
  * I don’t know if I completely understand the concern. We have talked
 with
  Ranger team (Don Bosco Durai) so we at least have one custom
authorizer
  implementation that has approved this design and they will be able to
  inject their authorization framework with current interfaces. Do you
see
  any issue with the design which will prevent anyone from providing a
  custom implementation?
  * Did not understand the concern around wire protocol, we are adding
  AuthorizationException to indicate that an operation was not
authorized.
 
  Thanks
  Parth
 
  On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote:
 
  Joe,
  
  Could you elaborate on why we should not store JSON in ZK? So far,
all
  existing ZK data are in JSON.
  
  Thanks,
  
  Jun

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Suresh Srinivas
Gwen,

Thanks for the clarification.

My objection is that we should not do it just because databases have
always done it this way. Maybe there is some history there that forced
a choice like that, and that has led other DBs to comply with it.
Kafka is a different system. Let's do what is the correct thing to do.

I also think it is not clear what users want here. But as an API developer
I would want error conditions to be correctly identified so that 
supportability of the product does not suffer.

Today in HDFS (for that matter Hadoop in general), the error conditions
are clearly identified, such as:
- Object you are trying to access does not exist
- You do not have permission to access the object
- The operation you are trying to do is invalid
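
As a sketch of the alternative being argued for here, the three conditions can each map to their own error type. The exception names, the `VALID_OPERATIONS` set and the `access` function are hypothetical; they are not HDFS or Kafka APIs.

```python
class ObjectNotFound(Exception):
    """The object you are trying to access does not exist."""


class PermissionDenied(Exception):
    """You do not have permission to access the object."""


class InvalidOperation(Exception):
    """The operation you are trying to do is invalid."""


VALID_OPERATIONS = {"READ", "WRITE"}


def access(objects, acls, user, operation, name):
    """Return the object, or raise an error that names the exact failure."""
    if operation not in VALID_OPERATIONS:
        raise InvalidOperation(operation)
    if name not in objects:
        raise ObjectNotFound(name)
    if (user, operation) not in acls.get(name, set()):
        raise PermissionDenied(f"{user} may not {operation} {name}")
    return objects[name]
```

A caller can then handle each case differently, for example by asking an admin for access on `PermissionDenied` but fixing a typo on `ObjectNotFound`, which is the supportability benefit described above.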

Here are some error codes that Amazon Kinesis supports, describing the
failure/error conditions clearly:
http://docs.aws.amazon.com/kinesis/latest/APIReference/CommonErrors.html

From: Gwen Shapira gshap...@cloudera.com
Sent: Thursday, April 30, 2015 6:05 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

I think Kafka's behavior should be driven by what users want. My only
indication of what they may want is what we were forced to fix in similar
cases. This is why I am advocating this behavior.

I agree that this is a minor point that should not be blocking the vote. I
already gave my non-binding +1 and that's the best I can do to drive this
forward.

If this vote passes without the behavior I believe is the right one, I will
create a follow up JIRA. However, since we are still in a discussion and
since both options are trivial to implement - why exactly are you objecting
to Kafka behaving more like a DB in this scenario?

Gwen



On Thu, Apr 30, 2015 at 5:54 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 It is a strange choice to return does not exist when the condition is
 actually not authorized. I have a hard time understanding why that is
 better for security. Perhaps in DB world this is expected and changes may
 be necessary to comply with such behavior. But that should not guide what
 we do in Kafka.

 This is a voting thread for an important feature. Security is the number
 one feature that our users are asking for. Can't minor things like this be
 done in follow-up jiras? Should the focus be brought back to voting?

 Btw since I am new to the Kafka community, is there a period when voting
 thread needs to wrap up by? Other projects generally follow 3 or 7 days.

 Regards,
 Suresh

 Sent from phone

 _
 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 5:32 PM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe,
 
  Let me clarify on authZException. The caller gets a 403 regardless of
  existence of the topic, even if the topic does not exist you always get
  403. This will fall under the case where we do not find any acls for a
  resource and as per our last decision by default we are going to deny
 this
  request.
 

 The reason I'm digging into this is that in Hive we had to fix existing
 behavior after financial customers objected loudly to getting insufficient
 privileges when a real database would return table does not exist.

 I completely agree that having to handle two separate error conditions
 (TopicNotExist if user doesn't have READ, unless user has CREATE in which
 case he can see all topics and can get Unauthorized) adds complexity and
 will not be fun to debug. However, when implementing security, a lot of the
 stuff we do is around making customers pass security audits, and I suspect
 that a "can't know that tables even exist" test is a thing.
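
One possible reading of the two-error-condition rule in the parentheses above is sketched below. The function name, the boolean parameters and the returned strings are assumptions; the thread does not pin down the exact behavior, so treat this as an interpretation rather than the proposal itself.

```python
def read_error(topic_exists, has_read, has_create):
    """Return the error a READ request would get, or None if it succeeds.

    Interpretation (an assumption): users who hold CREATE can already see
    every topic, so telling them "Unauthorized" leaks nothing new. Everyone
    else without READ is told the topic does not exist.
    """
    if not topic_exists:
        return "TopicNotExist"
    if has_read:
        return None
    if has_create:
        return "Unauthorized"
    return "TopicNotExist"
```

The extra branch is the added complexity being acknowledged: the same denied request produces a different error depending on a second, unrelated privilege.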

 We share pretty much the same financial customers and they seem to have the
 same concerns. Perhaps you can double check if you also have this
 requirement?

 (and again, sorry for not seeing this earlier and holding up the vote on
 what seems like a minor point. I just don't want to punt for later
 something when we already have an idea of what customers expect)

 Gwen



 
  The configurations are listed explicitly here
 
  https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under
  KafkaConfig. We may add an optional config to allow authorizer to read an
  arbitrary property files incrementally but that does not need to be part
  of this same KIP.
 
  The statement “If we can't audit the access then what good is controlling
  the access?” seems extreme because we still get to control the access
  which IMHO is a huge win. The default authorizer implementation right now
  logs every allowed/denied access (see here
 
 https

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Gwen Shapira
I think Kafka's behavior should be driven by what users want. My only
indication of what they may want is what we were forced to fix in similar
cases. This is why I am advocating this behavior.

I agree that this is a minor point that should not be blocking the vote. I
already gave my non-binding +1 and that's the best I can do to drive this
forward.

If this vote passes without the behavior I believe is the right one, I will
create a follow-up JIRA. However, since we are still in a discussion and
since both options are trivial to implement - why exactly are you objecting
to Kafka behaving more like a DB in this scenario?

Gwen



On Thu, Apr 30, 2015 at 5:54 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 It is a strange choice to return does not exist when the condition is
 actually not authorized. I have a hard time understanding why that is
 better for security. Perhaps in DB world this is expected and changes may
 be necessary to comply with such behavior. But that should not guide what
 we do in Kafka.

 This is a voting thread for an important feature. Security is the number
 one feature that our users are asking for. Can't minor things like this be
 done in follow-up jiras? Should the focus be brought back to voting?

 Btw since I am new to the Kafka community, is there a period when voting
 thread needs to wrap up by? Other projects generally follow 3 or 7 days.

 Regards,
 Suresh

 Sent from phone

 _
 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 5:32 PM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe,
 
  Let me clarify on authZException. The caller gets a 403 regardless of
  existence of the topic, even if the topic does not exist you always get
  403. This will fall under the case where we do not find any acls for a
  resource and as per our last decision by default we are going to deny
 this
  request.
 

 The reason I'm digging into this is that in Hive we had to fix existing
 behavior after financial customers objected loudly to getting insufficient
 privileges when a real database would return table does not exist.

 I completely agree that having to handle two separate error conditions
 (TopicNotExist if user doesn't have READ, unless user has CREATE in which
 case he can see all topics and can get Unauthorized) adds complexity and
 will not be fun to debug. However, when implementing security, a lot of the
 stuff we do is around making customers pass security audits, and I suspect
 that a "can't know that tables even exist" test is a thing.

 We share pretty much the same financial customers and they seem to have the
 same concerns. Perhaps you can double check if you also have this
 requirement?

 (and again, sorry for not seeing this earlier and holding up the vote on
 what seems like a minor point. I just don't want to punt for later
 something when we already have an idea of what customers expect)

 Gwen



 
  The configurations are listed explicitly here
 
  https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses under
  KafkaConfig. We may add an optional config to allow authorizer to read an
  arbitrary property files incrementally but that does not need to be part
  of this same KIP.
 
  The statement “If we can't audit the access then what good is controlling
  the access?” seems extreme because we still get to control the access
  which IMHO is a huge win. The default authorizer implementation right now
  logs every allowed/denied access (see here
 
  https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala) in debug mode.
  Anybody who needs auditing could create a log4j appender to allow debug
  access to this class and send the log output to some audit file.
 
  Auditing is still a separate piece, we could either add an auditor
  interface that wraps authorizer or the other way around so authorizer and
  auditor can be two separate implementations. I would love to start a new
  KIP and jira to discuss approaches in more details but I don’t see the
  need to hold up Authorization work for the same.
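
The "auditor interface that wraps authorizer" option mentioned above might look roughly like this. Both classes are hypothetical and the list-based audit log stands in for whatever sink a real auditor would use; nothing here is part of KIP-11.

```python
class AllowReadOnly:
    """Stand-in authorizer for the example: permits READ and nothing else."""

    def authorize(self, principal, operation, resource):
        return operation == "READ"


class AuditingAuthorizer:
    """Wraps any authorizer and records every decision it makes."""

    def __init__(self, delegate, audit_log):
        self._delegate = delegate
        self._audit_log = audit_log   # any object with append(), e.g. a list

    def authorize(self, principal, operation, resource):
        allowed = self._delegate.authorize(principal, operation, resource)
        outcome = "ALLOWED" if allowed else "DENIED"
        self._audit_log.append((principal, operation, resource, outcome))
        return allowed
```

Because the wrapper exposes the same `authorize` call as the thing it wraps, auditing can be added or removed without touching the authorizer implementation, which fits the point that the two pieces can be separate.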
 
  I don’t agree with the “this design seems too specific” given we already
  have 3 implementations (default, ranger, sentry) that can be supported
 with
  the current design.
 
  The authorization happens as part of handle and it is the first action,
  see here
 
  https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103 for one example.
 
  Thanks
  Parth
 
 
 
  On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote:
 
  Joe, thanks

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Don Bosco Durai
 and I'll forward you the
  invite.
Thank you, Gwen. I have the invite and I should be at home at
that
  time.
But due to a network issue, I may not be able to join the meeting smoothly.
   
Regards
Dapeng
   
-Original Message-
From: Gwen Shapira [mailto:gshap...@cloudera.com]
Sent: Tuesday, April 28, 2015 1:31 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka
security
   
While I see the advantage of being able to say something like:
deny
  user
X from hosts h1...h200 also allow user X from host h189, there
 are
  two
issues here:
   
1. Complex rule systems can be difficult to reason about and
 therefore
   end
up being less secure. The rule Deny always wins is very easy to
  grasp.
   
2. We currently don't have any mechanism for specifying IP ranges
 (or
   host
ranges) at all. I think it's a pretty significant deficiency, but
it
  does
mean that we don't need to worry about the issue of blocking a
large
   range
while unblocking few servers in the range.
   
Gwen
   
P.S
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to
discuss
  this
and other outstanding design issues (not all related to
security).
 If
  you
are interested in joining - let me know and I'll forward you the
  invite.
   
Gwen
   
On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng
dapeng@intel.com
 
wrote:
   
 Attach the image.


  
https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png

 Regards
 Dapeng

 From: Sun, Dapeng [mailto:dapeng@intel.com]
 Sent: Tuesday, April 28, 2015 11:44 AM
 To: dev@kafka.apache.org
 Subject: RE: [VOTE] KIP-11- Authorization design for kafka
 security


 Thank you for your rapid reply, Parth.



 * I think the wiki already describes the precedence order as
Deny
 taking
 precedence over allow when conflicting acls are found

  
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

 Got it, thank you.



 * In the first version that I am currently writing there is no
  group
 support. Even when we add it I don't see the need to add a
  precedence
 for evaluation. it does not matter which principal matches as
long
  as

  we have a match.



 About this part, I think we should choose the best matching acl
 for
 authorization, no mater we support group or not.



 For the case

  [cid:image001.png@01D08197.E94BD410]

 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png

 if 2 Acls are defined, one that denies an operation from all hosts and one
 that allows the operation from host1, will the operation from host1 be denied
 or allowed?

 According to the wiki, "Deny will take precedence over Allow in competing
 acls", so it seems acl_1 will win the competition, but the customer's
 intention may be to allow.

 I think "deny always takes precedence over Allow" is okay, but  host1
 - user1host1 default may make sense.





 * Acl storage is indexed by resource right now because that is the
 primary lookup id for all authorize operations. Given acls are cached
 I don't see the need to optimize the storage layer any further for lookup.

 * The reason why we have acl with multi everything is to reduce
 redundancy in acl storage. I am not sure how we will be able to reduce
 redundancy if we divide it by using one principal, one host, one operation.

 Yes, I also agree that Acl storage should be indexed by resource.
 Under the resource index, it may be better to add indexes such as hosts and
 principals. One option may be one principal, one host, one operation.
 I am just offering these scenarios for consideration.



 For the case defined in wiki:

 Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.

 Acl_2 - {user:bob} is denied to READ from host1

 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from
 {host1, host2}.

 For acl_3, if we want to remove alice's WRITE from {host1,host2} and remove
 alice's READ from host1, the user may have the following ways to achieve it:

 1. Remove the parts of acl_3 directly. I think if we make it divided and
 hierarchical, this kind of operation could be done directly in the backend.

 2. Remove acl_3, and add new acls: {group:kafka-devs} is allowed to READ and
 WRITE from {host1, host2} and {user:alice} is allowed to READ from {host2}

 3. Add two denied acls: {user:alice} is denied to WRITE from {host1,host2}
 and {user:alice} is denied to READ from
Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Suresh Srinivas
Comment on AuthorizationException. I think the intent of the exception should be to 
capture why a request is rejected. It is important from an API perspective to be 
specific to aid debugging. Having a generic or obfuscated exception is not very 
useful. Does someone, on getting an exception, reach out to an admin to 
understand whether the topic exists or it's an authorization issue?

I am not getting the security concern. The system must ensure that access is 
disallowed by implementing security correctly, not by relying on security by 
obscurity.

Regards,
Suresh

Sent from phone

_
From: Gwen Shapira gshap...@cloudera.com
Sent: Thursday, April 30, 2015 10:14 AM
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
To: dev@kafka.apache.org


* Regarding additional authorizers:
Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed
Sentry can integrate with the current APIs. Dapeng Sun, a committer on
Sentry had some concerns about the IP privileges and how we prioritize
privileges - but nothing that prevents Sentry from integrating with the
existing solution, from what I could see. It seems to me that the design is
very generic and adapters can be written for other authorization systems
(after all, you just need to implement setACL, getACL and Authorize - all
pretty basic), although I can't speak for Oracle's Identity Manager
specifically.

* Regarding AuthorizationException to indicate that an operation was not
authorized: Sorry I missed this in previous reviews, but now that I look
at it - many systems intentionally don't return AuthorizationException when
READ privilege is missing, since this already gives too much information
(that the topic exists and that you don't have privileges on it). Instead
they return a variant of "doesn't exist". I'm wondering if this approach is
applicable / desirable for Kafka as well.
Note that this doesn't remove the need for AuthorizationException - I'm
just suggesting a possible refinement on its use.

Gwen



On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Hi Joe, Thanks for taking the time to review.

 * All the open issues already have a resolution, I can open a jira for
 each one and add the resolution to it and resolve them immediately if you
 want this for tracking purposes.
 * We will update system tests to verify that the code works. We have
 thorough unit tests for all the new code except for modifications made to
 KafkaAPI as that has way too many dependencies to be mocked which I guess
 is the reason for no existing unit tests.
 * I don’t know if I completely understand the concern. We have talked with
 Ranger team (Don Bosco Durai) so we at least have one custom authorizer
 implementation that has approved this design and they will be able to
 inject their authorization framework with current interfaces. Do you see
 any issue with the design which will prevent anyone from providing a
 custom implementation?
 * Did not understand the concern around wire protocol, we are adding
 AuthorizationException to indicate that an operation was not authorized.

 Thanks
 Parth

 On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote:

 Joe,
 
 Could you elaborate on why we should not store JSON in ZK? So far, all
 existing ZK data are in JSON.
 
 Thanks,
 
 Jun
 
 On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly wrote:
 
  Hi, sorry I am coming in late to chime back in on this thread and haven't
  been able to make the KIP hangouts the last few weeks. Sorry if any of this
  was brought up already or I missed it.
 
  I read through the KIP and the thread(s) and a couple of things jumped out.
 
  - Can we break out the open issues in JIRA (maybe during the hangout)
  that are in the KIP and resolve/flesh those out more?
 
  - I don't see any updates with the systems test or how we can know the
  code works.
 
  - We need some implementation/example/sample that we know can work in
  all different existing entitlement servers and not just ones that run in
  types of data centers too. I am not saying we should support everything but
  if someone had to implement
  https://docs.oracle.com/cd/E19225-01/820-6551/bzafm/index.html with
  Kafka it has to work for them out of the box.
 
  - We should shy away from storing JSON in Zookeeper. Lets store bytes in
  Storage.
 
  - We should spend some time thinking through exceptions in the wire
  protocol maybe as part of this so it can keep moving forward.
 
 
  ~ Joe Stein
 
  On Tue, Apr 28, 2015 at 3:33 AM, Sun, Dapeng dapeng@intel.com wrote:
 
   Thank you for your reply, Gwen.
  
   1. Complex rule systems can be difficult to reason about and
 therefore
   end up being less secure. The rule

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Suresh Srinivas
Joe, thanks for the clarification.

Regarding audits, sorry I might be misunderstanding your email. Currently, if 
Kafka does not support audits, I think audits should be considered as a 
separate effort. Here are the reasons:
- Audit, whether authorization is available or not, should record operations to 
determine what is happening in the system. It should record all the operations 
such as create, delete, consumption of topics along with user information. It 
should work whether authorization is enabled or not. In Hadoop long before we 
added real authorization, we had audit logs.
- Authorization will bring an additional element of who was denied. As part of 
audit effort, it is important to add along with what operations succeeded (and 
for whom), what operations were denied.

From: Joe Stein joe.st...@stealth.ly
Sent: Thursday, April 30, 2015 4:12 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

I kind of thought of the authorization module as something that happens in
handle(request: RequestChannel.Request) in the request.requestId match.

If the request doesn't do what it is allowed to, it should stop right
there. What it is allowed to do is a true/false callback to the
class loaded, with 1 function to accept the data and some more about what it
is about (that we have access to).

I think all of the other features are awesome but you can build them on top
of this and then others can do the same.

I am more hooked on the authorization module being a watchdog above
handle() than I am on the plug-in implementation options (less is more
imho).

If we do this approach the audit fits in nicely because we are seeing more of
what happens in one place and the decision made for access right there.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 6:59 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Joe,

 Can you add more details on what generalization looks like? Also is this a
 design issue or code issue?

 One more question. Does Kafka have audit capabilities today for topic
 creation, deletion, access etc.?

 Regards,
 Suresh

 Sent from phone

 _
 From: Joe Stein joe.st...@stealth.ly
 Sent: Thursday, April 30, 2015 3:27 PM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 Ok, I read through it all again a few times. I get the provider broker
 piece now.

 The configurations are still confusing if there are 2 or 3 and they should
 be called out more specifically than as a change to a class. Configs are a
 public interface we should be a bit more explicit.

 Was there any discussion about any auditing component? How would anyone
 know if the authorization plugin was running for when or what it was doing?

 If we can't audit the access then what good is controlling the access?

 I still don't see where all the command line configuration options come in.
 There are a lot of things to-do with it but not sure how to use it yet.

 This plug-in still feels like a very specific case and we should try to
 generalize it down some more to make it more straight forward for folks.

 ~ Joestein

 On Thu, Apr 30, 2015 at 3:51 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

  During the discussion Jun pointed out that mirror maker, which right now
  does not copy any zookeeper config overrides, will now replicate topics
  but will not replicate any acls. Given the authorizer interface exposes
  the acl management apis, list/get/add/remove, we proposed that mirror
  maker can just instantiate an instance of authorizer and call these apis
  directly to get acls for a topic and add them to the destination cluster if
  we want acls to be replicated as part of mirror maker.
 
  Thanks
  Parth
 
  On 4/30/15, 12:43 PM, Joe Stein joe.st...@stealth.ly wrote:
 
  Parth,
  
  Can you explain how Mirror maker will have to start using new acl
  management tool) and it not affect any other client. If you aren't
  changing the wire protocol then how do clients use it?
  
  ~ Joe stein
  
  
  On Thu, Apr 30, 2015 at 3:15 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:
  
   Hi Joe,
  
   Regarding open question: I changed the title to “Questions resolved after
   community discussions” let me know if you have a better name. I have a
   question and a bullet point under each question describing the final
   decision. Not sure how can I make it any cleaner so appreciate any
   suggestion.
  
   Regarding system tests: I went through a bunch of KIPs, none of which
   mentions what test cases will be added. Do you want to add a “How do you
   plan to test” section in the general KIP template or you think this is
   just

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Gwen Shapira
Ah, I'm not talking about security by obscurity.

At least in the database world, if you don't have SELECT on a table, you
won't even see it when saying "show tables" because the very fact that the
table exists is privileged. In that case, a denied SELECT attempt will
return "table does not exist", and not "permission denied".
It is simply a question of what the privilege covers.

I was wondering if it is desirable to apply the same model to Kafka.

Gwen

On Thu, Apr 30, 2015 at 3:51 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Comment on AuthorizationException. I think the intent of the exception should
 be to capture why a request is rejected. It is important from an API
 perspective to be specific to aid debugging. Having a generic or obfuscated
 exception is not very useful. Does someone, on getting an exception, reach
 out to an admin to understand whether the topic exists or it's an
 authorization issue?

 I am not getting the security concern. The system must ensure that access is
 disallowed by implementing security correctly, not by relying on security by
 obscurity.

 Regards,
 Suresh

 Sent from phone

 _
 From: Gwen Shapira gshap...@cloudera.com
 Sent: Thursday, April 30, 2015 10:14 AM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 * Regarding additional authorizers:
 Prasad, who is a PMC on Apache Sentry reviewed the design and confirmed
 Sentry can integrate with the current APIs. Dapeng Sun, a committer on
 Sentry had some concerns about the IP privileges and how we prioritize
 privileges - but nothing that prevents Sentry from integrating with the
 existing solution, from what I could see. It seems to me that the design is
 very generic and adapters can be written for other authorization systems
 (after all, you just need to implement setACL, getACL and Authorize - all
 pretty basic), although I can't speak for Oracle's Identity Manager
 specifically.

 * Regarding AuthorizationException to indicate that an operation was not
 authorized: Sorry I missed this in previous reviews, but now that I look
 at it - many systems intentionally don't return AuthorizationException when
 READ privilege is missing, since this already gives too much information
 (that the topic exists and that you don't have privileges on it). Instead
 they return a variant of "doesn't exist". I'm wondering if this approach is
 applicable / desirable for Kafka as well.
 Note that this doesn't remove the need for AuthorizationException - I'm
 just suggesting a possible refinement on its use.

 Gwen



 On Thu, Apr 30, 2015 at 9:52 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:

  Hi Joe, Thanks for taking the time to review.
 
  * All the open issues already have a resolution , I can open a jira for
  each one and add the resolution to it and resolve them immediately if you
  want this for tracking purposes.
  * We will update system tests to verify that the code works. We have
  thorough unit tests for all the new code except for modifications made to
  KafkaAPI as that has way too many dependencies to be mocked which I guess
  is the reason for no existing unit tests.
  * I don’t know if I completely understand the concern. We have talked
 with
  Ranger team (Don Bosco Durai) so we at least have one custom authorizer
  implementation that has approved this design and they will be able to
  inject their authorization framework with current interfaces. Do you see
  any issue with the design which will prevent anyone from providing a
  custom implementation?
  * Did not understand the concern around wire protocol, we are adding
  AuthorizationException to indicate that an operation was not authorized.
 
  Thanks
  Parth
 
  On 4/30/15, 5:59 AM, Jun Rao j...@confluent.io wrote:
 
  Joe,
  
  Could you elaborate on why we should not store JSON in ZK? So far, all
  existing ZK data are in JSON.
  
  Thanks,
  
  Jun
  
  On Thu, Apr 30, 2015 at 2:06 AM, Joe Stein joe.st...@stealth.ly wrote:
  
   Hi, sorry I am coming in late to chime back in on this thread and haven't
   been able to make the KIP hangouts the last few weeks. Sorry if any of this
   was brought up already or I missed it.
  
   I read through the KIP and the thread(s) and a couple of things jumped out.
  
   - Can we break out the open issues in JIRA (maybe during the hangout)
   that are in the KIP and resolve/flesh those out more?
  
   - I don't see any updates with the systems test or how we can know the
   code works.
  
   - We need some implementation/example/sample that we know can work in
   all different existing entitlement servers and not just ones that run in
   types of data centers too. I am not saying we should support everything but
   if someone had to implement
   https

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Joe Stein
I kind of thought of the authorization module as something that happens in
handle(request: RequestChannel.Request) in the request.requestId match.

If the request doesn't do what it is allowed to, it should stop right
there. What it is allowed to do is a true/false callback to the
class loaded, with 1 function to accept the data and some more about what it
is about (that we have access to).

I think all of the other features are awesome but you can build them on top
of this and then others can do the same.

I am more hooked on the authorization module being a watchdog above
handle() than I am on the plug-in implementation options (less is more
imho).

If we do this approach the audit fits in nicely because we are seeing more of
what happens in one place and the decision made for access right there.
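
To make the "watchdog above handle()" idea concrete, here is a minimal
sketch in Python (not Kafka's Scala code). The request shape, the
make_handle helper and the "AUTHORIZATION_FAILED" string are all
hypothetical names chosen for illustration; the only points taken from the
message above are the single true/false callback and the fact that the
decision and the audit record happen in one place.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical request shape: keys "principal", "api" and "resource".
Request = Dict[str, str]
AuditRecord = Tuple[str, str, str, bool]


def make_handle(
    authorize: Callable[[Request], bool],
    handlers: Dict[str, Callable[[Request], str]],
    audit_log: List[AuditRecord],
) -> Callable[[Request], str]:
    """Wrap the per-API handlers behind one true/false authorization check."""

    def handle(request: Request) -> str:
        allowed = authorize(request)
        # The decision is recorded in the same place it is made.
        audit_log.append(
            (request["principal"], request["api"], request["resource"], allowed)
        )
        if not allowed:
            return "AUTHORIZATION_FAILED"  # the request stops right here
        return handlers[request["api"]](request)

    return handle


audit: List[AuditRecord] = []
handle = make_handle(
    authorize=lambda r: r["principal"] == "user:alice",  # toy callback
    handlers={"produce": lambda r: "OK"},
    audit_log=audit,
)
```

Anything richer (acl storage, CLI, plug-in loading) could be layered behind
the authorize callback without changing this gate.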

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 6:59 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Joe,

 Can you add more details on what generalization looks like? Also is this a
 design issue or code issue?

 One more question. Does Kafka have audit capabilities today for topic
 creation, deletion, access etc.?

 Regards,
 Suresh

 Sent from phone

 _
 From: Joe Stein joe.st...@stealth.ly
 Sent: Thursday, April 30, 2015 3:27 PM
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 To: dev@kafka.apache.org


 Ok, I read through it all again a few times. I get the provider broker
 piece now.

 The configurations are still confusing if there are 2 or 3 and they should
 be called out more specifically than as a change to a class. Configs are a
 public interface we should be a bit more explicit.

 Was there any discussion about any auditing component? How would anyone
 know if the authorization plugin was running for when or what it was doing?

 If we can't audit the access then what good is controlling the access?

 I still don't see where all the command line configuration options come in.
 There are a lot of things to-do with it but not sure how to use it yet.

 This plug-in still feels like a very specific case and we should try to
 generalize it down some more to make it more straight forward for folks.

 ~ Joestein

 On Thu, Apr 30, 2015 at 3:51 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:

  During the discussion Jun pointed out that mirror maker, which right now
  does not copy any zookeeper config overrides, will now replicate topics
  but will not replicate any acls. Given the authorizer interface exposes
  the acl management apis, list/get/add/remove, we proposed that mirror
  maker can just instantiate an instance of authorizer and call these apis
  directly to get acls for a topic and add them to the destination cluster if
  we want acls to be replicated as part of mirror maker.
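
A rough sketch of that replication step, in Python rather than Kafka's
Scala. InMemoryAuthorizer is a hypothetical stand-in for an authorizer
exposing list/get/add/remove style calls, and acls are modelled as plain
strings; none of these names come from the KIP.

```python
from collections import defaultdict
from typing import Dict, Set


class InMemoryAuthorizer:
    """Hypothetical stand-in with list/get/add/remove style acl management."""

    def __init__(self) -> None:
        self._acls: Dict[str, Set[str]] = defaultdict(set)

    def list_resources(self) -> Set[str]:
        return {resource for resource, acls in self._acls.items() if acls}

    def get_acls(self, resource: str) -> Set[str]:
        # Return a copy so callers cannot mutate the stored set.
        return set(self._acls.get(resource, set()))

    def add_acls(self, resource: str, acls: Set[str]) -> None:
        self._acls[resource] |= acls

    def remove_acls(self, resource: str, acls: Set[str]) -> None:
        self._acls[resource] -= acls


def replicate_acls(
    source: InMemoryAuthorizer, destination: InMemoryAuthorizer, topic: str
) -> Set[str]:
    """Copy acls for one topic that the destination lacks; return what was added."""
    missing = source.get_acls(topic) - destination.get_acls(topic)
    if missing:
        destination.add_acls(topic, missing)
    return missing


src, dst = InMemoryAuthorizer(), InMemoryAuthorizer()
src.add_acls("topic-a", {"user:bob ALLOW READ from *"})
added = replicate_acls(src, dst, "topic-a")
```

The sketch only adds acls; whether removals on the source should also be
propagated is left open here, as it is in the thread.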
 
  Thanks
  Parth
 
  On 4/30/15, 12:43 PM, Joe Stein joe.st...@stealth.ly wrote:
 
  Parth,
  
  Can you explain how Mirror maker will have to start using new acl
  management tool) and it not affect any other client. If you aren't
  changing the wire protocol then how do clients use it?
  
  ~ Joe stein
  
  
  On Thu, Apr 30, 2015 at 3:15 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote:
  
   Hi Joe,
  
   Regarding open question: I changed the title to “Questions resolved after
   community discussions” let me know if you have a better name. I have a
   question and a bullet point under each question describing the final
   decision. Not sure how can I make it any cleaner so appreciate any
   suggestion.
  
   Regarding system tests: I went through a bunch of KIPs, none of which
   mentions what test cases will be added. Do you want to add a “How do you
   plan to test” section in the general KIP template or you think this is
   just a special case where the test cases should be listed and discussed as
   part of KIP? I am not sure if KIP really is the right forum for this
   discussion. This can easily be addressed during code review if people
   think we don’t have enough test coverage.
  
   I am still not sure which part is not clear. The scala exception is added
   for internal server side representation. In the end all of our responses
   always return just an error code for which we will add an
   AuthorizationErrorCode mapped to AuthorizationException. The error code
   itself will not reveal any information other than the fact that you are not
   authorized to perform an operation on a resource and you will get this
   error code even for non existent topics if no acls exist for those topics.
  
   I can add a diagram if that makes things more clear, I am not convinced
   its needed given we have come so far without it. Essentially there are 3
   steps
   * users use the acl cli to add acls

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-30 Thread Gwen Shapira
On Thu, Apr 30, 2015 at 4:39 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Hi Joe,

 Let me clarify on authZException. The caller gets a 403 regardless of
 existence of the topic, even if the topic does not exist you always get
 403. This will fall under the case where we do not find any acls for a
 resource and as per our last decision by default we are going to deny this
 request.


The reason I'm digging into this is that in Hive we had to fix existing
behavior after financial customers objected loudly to getting "insufficient
privileges" when a real database would return "table does not exist".

I completely agree that having to handle two separate error conditions
(TopicNotExist if user doesn't have READ, unless user has CREATE in which
case he can see all topics and can get Unauthorized) adds complexity and
will not be fun to debug. However, when implementing security, a lot of the
stuff we do is around making customers pass security audits, and I suspect
that a "can't know that tables even exist" test is a thing.
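
The two error conditions described in the parenthesis can be sketched as a
small decision function. This is Python for illustration only; the enum
members and the read_error name are hypothetical and are not Kafka error
codes.

```python
from enum import Enum


class ReadResult(Enum):
    OK = "ok"
    TOPIC_NOT_EXIST = "topic_not_exist"
    UNAUTHORIZED = "unauthorized"


def read_error(topic_exists: bool, has_read: bool, has_create: bool) -> ReadResult:
    """Hide a topic's existence from callers who are not allowed to know about it."""
    if has_read:
        return ReadResult.OK if topic_exists else ReadResult.TOPIC_NOT_EXIST
    if has_create and topic_exists:
        # A user with CREATE can see all topics, so an honest
        # "unauthorized" reveals nothing new.
        return ReadResult.UNAUTHORIZED
    # No READ and no visibility: existing and missing topics look identical.
    return ReadResult.TOPIC_NOT_EXIST
```

The last branch is the one an audit of the "can't know it exists" kind
would exercise: a caller without READ or CREATE gets the same answer
whether or not the topic is there.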

We share pretty much the same financial customers and they seem to have the
same concerns. Perhaps you can double check if you also have this
requirement?

(and again, sorry for not seeing this earlier and holding up the vote on
what seems like a minor point. I just don't want to punt for later
something when we already have an idea of what customers expect)

Gwen




 The configurations are listed explicitly here
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-Changestoexistingclasses
 under KafkaConfig. We may add an optional config to allow authorizer to read an
 arbitrary property files incrementally but that does not need to be part
 of this same KIP.

 The statement “If we can't audit the access then what good is controlling
 the access?” seems extreme because we still get to control the access
 which IMHO is a huge win. The default authorizer implementation right now
 logs every allowed/denied access (see here
 https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala)
 in debug mode. Anybody who needs auditing could create a log4j appender to
 allow debug access to this class and send the log output to some audit file.

 Auditing is still a separate piece, we could either add an auditor
 interface that wraps authorizer or the other way around so authorizer and
 auditor can be two separate implementations. I would love to start a new
 KIP and jira to discuss approaches in more detail but I don’t see the
 need to hold up Authorization work for the same.

 I don’t agree with the “this design seems too specific” given we already
 have 3 implementations (default, ranger, sentry) that can be supported with
 the current design.

 The authorization happens as part of handle and it is the first action,
 see here
 https://github.com/Parth-Brahmbhatt/kafka/blob/KAFKA-1688-impl/core/src/main/scala/kafka/server/KafkaApis.scala#L103
 for one example.

 Thanks
 Parth



 On 4/30/15, 4:24 PM, Suresh Srinivas sur...@hortonworks.com wrote:

 Joe, thanks for the clarification.
 
 Regarding audits, sorry I might be misunderstanding your email.
 Currently, if Kafka does not support audits, I think audits should be
 considered as a separate effort. Here are the reasons:
 - Audit, whether authorization is available or not, should record
 operations to determine what is happening in the system. It should record
 all the operations such as create, delete, consumption of topics along
 with user information. It should work whether authorization is enabled or
 not. In Hadoop long before we added real authorization, we had audit logs.
 - Authorization will bring an additional element of who was denied. As
 part of audit effort, it is important to add along with what operations
 succeeded (and for whom), what operations were denied.
 
 From: Joe Stein joe.st...@stealth.ly
 Sent: Thursday, April 30, 2015 4:12 PM
 To: dev@kafka.apache.org
 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security
 
 I kind of thought of the authorization module as something that happens in
 handle(request: RequestChannel.Request) in the request.requestId match.
 
 If the request doesn't do what it is allowed to, it should stop right
 there. What it is allowed to do is a true/false callback to the
 class loaded, with 1 function to accept the data and some more about what
 it is about (that we have access to).
 
 I think all of the other features are awesome but you can build them on
 top of this and then others can do the same.
 
 I am more hooked on the authorization module being a watchdog above
 handle() than I am on the plug-in implementation options (less is more
 imho).
 
 If we do this approach the audit fits in nicely because we are seeing more of
 what happens in one place and the decision made for access right there.
 
 ~ Joe Stein

RE: [VOTE] KIP-11- Authorization design for kafka security

2015-04-28 Thread Sun, Dapeng
Thank you for your reply, Gwen.

1. Complex rule systems can be difficult to reason about and therefore end up 
being less secure. The rule "Deny always wins" is very easy to grasp.
Yes, I agree with your point: we should not make the rules complex.

2. We currently don't have any mechanism for specifying IP ranges (or host
ranges) at all. I think its a pretty significant deficiency, but it does mean 
that we don't need to worry about the issue of blocking a large range while 
unblocking few servers in the range.
Supporting ranges sounds reasonable. If this feature will be in the development
plan, I also don't think we can put "the best matching acl" and "support ip
ranges" together.

We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and 
other outstanding design issues (not all related to security). If you are 
interested in joining - let me know and I'll forward you the invite.
Thank you, Gwen. I have the invite and I should be at home at that time. But 
due to network issues, I may not be able to join the meeting smoothly.

Regards
Dapeng

-Original Message-
From: Gwen Shapira [mailto:gshap...@cloudera.com] 
Sent: Tuesday, April 28, 2015 1:31 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

While I see the advantage of being able to say something like: "deny user X 
from hosts h1...h200, also allow user X from host h189", there are two issues 
here:

1. Complex rule systems can be difficult to reason about and therefore end up 
being less secure. The rule "Deny always wins" is very easy to grasp.

2. We currently don't have any mechanism for specifying IP ranges (or host
ranges) at all. I think it's a pretty significant deficiency, but it does mean 
that we don't need to worry about the issue of blocking a large range while 
unblocking a few servers in the range.
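
As an illustration of how simple the "Deny always wins" rule is to
evaluate, here is a minimal Python sketch. The Acl record, the matches and
authorize helpers, and the "*" wildcard handling are assumptions made for
this example and are not the KIP-11 interface; the default deny when no
acl matches follows the decision mentioned elsewhere in this thread.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Acl:
    principal: str  # e.g. "user:bob", or "user:*" for every user
    host: str       # e.g. "host1", or "*" for every host
    operation: str  # e.g. "READ"
    allow: bool     # True for an allow acl, False for a deny acl


def matches(acl: Acl, principal: str, host: str, operation: str) -> bool:
    return (
        acl.principal in (principal, "user:*")
        and acl.host in (host, "*")
        and acl.operation == operation
    )


def authorize(acls: Iterable[Acl], principal: str, host: str, operation: str) -> bool:
    matching = [a for a in acls if matches(a, principal, host, operation)]
    # Any matching deny wins, however specific the competing allow is.
    if any(not a.allow for a in matching):
        return False
    # Otherwise at least one allow must match; no match means deny.
    return any(a.allow for a in matching)


acls = [
    Acl("user:*", "*", "READ", allow=True),         # everyone may READ from all hosts
    Acl("user:bob", "host1", "READ", allow=False),  # bob may not READ from host1
]
```

With these two acls, bob is denied READ from host1 but allowed from any
other host, and no ordering or "best match" logic is needed to reach that
answer.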

Gwen

P.S
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this and 
other outstanding design issues (not all related to security). If you are 
interested in joining - let me know and I'll forward you the invite.

Gwen

On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote:

 Attach the image.

 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png

 Regards
 Dapeng

 From: Sun, Dapeng [mailto:dapeng@intel.com]
 Sent: Tuesday, April 28, 2015 11:44 AM
 To: dev@kafka.apache.org
 Subject: RE: [VOTE] KIP-11- Authorization design for kafka security


 Thank you for your rapid reply, Parth.



 * I think the wiki already describes the precedence order as Deny taking
 precedence over allow when conflicting acls are found
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

 Got it, thank you.



 * In the first version that I am currently writing there is no group
 support. Even when we add it I don't see the need to add a precedence 
 for evaluation. It does not matter which principal matches as long as
 we have a match.

 About this part, I think we should choose the best matching acl for 
 authorization, no matter whether we support groups or not.



 For the case

  [cid:image001.png@01D08197.E94BD410]

 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png



 if 2 Acls are defined, one that denies an operation from all hosts and 
 one that allows the operation from host1, will the operation from host1 
 be denied or allowed?

 According to the wiki, "Deny will take precedence over Allow in competing 
 acls", so it seems acl_1 will win the competition, but the customer's 
 intention may be to allow.

 I think "deny always takes precedence over Allow" is okay, but  host1 
 - user1host1 default may make sense.





 * Acl storage is indexed by resource right now because that is the
 primary lookup id for all authorize operations. Given acls are cached 
 I don't see the need to optimize the storage layer any further for lookup.

 * The reason why we have acl with multi everything is to reduce
 redundancy in acl storage. I am not sure how we will be able to reduce 
 redundancy if we divide it by using one principal, one host, one operation.



 Yes, I also agree that Acl storage should be indexed by resource.
 Under the resource index, it may be better to add indexes such as hosts and 
 principals. One option may be one principal, one host, one operation. 
 I am just offering these scenarios for consideration.



 For the case defined in wiki:

 Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.

 Acl_2 - {user:bob} is denied to READ from host1

 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and 
 WRITE from {host1, host2}.



 For acl_3, if we want to remove alice's WRITE from {host1,host2} and 
 remove alice's READ from host1, user may have following ways to achieve:



 1.Remove the parts of acl_3 directly, I think if we make it divided 
 and hierarchical, this kind of operations could be done directly in backend.

 2.Remove

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Jun Rao
Parth,

I was thinking that in a multi-tenant environment, an admin may want to
carve out some topic space to a user. For example, allow user X to create
any topic of X_*. Not sure how critical it is though.
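
A rough illustration of the idea above, as a Python sketch. This is not part of KIP-11, which offers no pattern matching beyond the * wildcard. The table CREATE_PATTERNS, the function may_create_topic and the principal name user:X are hypothetical, and the use of shell-style patterns via fnmatch is an assumption made only for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical table: principal -> topic-name patterns the principal may create.
CREATE_PATTERNS = {
    "user:X": ["X_*"],
}


def may_create_topic(principal: str, topic: str) -> bool:
    """Return True if any pattern granted to the principal matches the topic name."""
    patterns = CREATE_PATTERNS.get(principal, [])
    return any(fnmatchcase(topic, pattern) for pattern in patterns)
```

With this table, user:X could create X_orders but not Y_orders, and a principal with no entry could create nothing.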

Also, with the current api, what would the admin do to replicate the acls
from one cluster to another? Will she just list all acls from cli and
reissue them to another cluster periodically?

Thanks,

Jun

On Mon, Apr 27, 2015 at 10:56 AM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Thanks for your comments Jun.

 * Renamed the resource to consumer-group in wiki.
 * I don’t see a use case where admins/users would want to reserve topic
 names in advance. Can you describe why this would be needed.

 Thanks
 Parth

 On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote:

 A few more minor comments.
 
 100. To make it clear, perhaps we should rename the resource group to
 consumer-group. We can probably make the same change in CLI as well so
 that
 it's not confused with user group.
 
 101. Currently, create is only at the cluster level. Should it also be at
 topic level? For example, perhaps it's useful to allow only user X to
 create topic X.
 
 Thanks,
 
 Jun
 
 
 On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
 wrote:
 
  Thanks for clarifying, Parth. I think you are taking the right approach
  here.
 
  On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt
  pbrahmbh...@hortonworks.com wrote:
   Sorry Gwen, completely misunderstood the question :-).
  
   * Does everyone have the privilege to create a new Group and use it to
   consume from Topics he's already privileged on?
   Yes in current proposal. I did not see an API to create group
  but if you
   have a READ permission on a TOPIC and WRITE permission on that Group
 you
   are free to join and consume.
  
  
   * Will the CLI tool be used to manage group membership too?
   Yes and I think that means I need to add --group. Updating the
  KIP. Thanks
   for pointing this out.
  
   * Groups are kind of ephemeral, right? If all consumers in the group
   disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
   treat the new group as completely new resource? Can we create ACLs
   before the group exists, in anticipation of it getting created?
   I have considered any auto delete and auto create as out of
  scope for the
   first release. So Right now I was going with preserving the acls. Do
 you
   see any issues with this? Auto deleting would mean authorizer will now
   have to get into implementation details of kafka which I was trying to
   avoid.
  
   Thanks
   Parth
  
   On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
  
  We are not talking about same Groups :)
  
  I meant, Groups of consumers (which KIP-11 lists as a separate
  resource in the Privilege table)
  
  On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
  pbrahmbh...@hortonworks.com wrote:
   I see Groups as something we can add incrementally in the current
  model.
   The acls take principalType: name so groups can be represented as
  group:
   groupName. We are not managing group memberships anywhere in kafka
 and
  I
   don’t see the need to do so.
  
   So for a topic1 using the CLI an admin can add an acl to grant
 access
  to
   group:kafka-test-users.
  
   The authorizer implementation can have a plugin to map authenticated
  user
   to groups ( This is how hadoop and storm works). The plugin could be
   mapping user to linux/ldap/active directory groups but that is again
  upto
   the implementation.
  
   What we are offering is an interface that is extensible so these
  features
   can be added incrementally. I can add support for this in the first
   release but don’t necessarily see why this would be absolute
 necessity.
  
   Thanks
   Parth
  
   On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:
  
  Thanks.
  
  One more thing I'm missing in the KIP is details on the Group
 resource
  (I think we discussed this and it was just not fully updated):
  
  * Does everyone have the privilege to create a new Group and use it
 to
  consume from Topics he's already privileged on?
  * Will the CLI tool be used to manage group membership too?
  * Groups are kind of ephemeral, right? If all consumers in the group
  disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do
 we
  treat the new group as completely new resource? Can we create ACLs
  before the group exists, in anticipation of it getting created?
  
  Its all small details, but it will be difficult to implement KIP-11
  without knowing the answers :)
  
  Gwen
  
  
  On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
  pbrahmbh...@hortonworks.com wrote:
   You are right, moved it to the default implementation section.
  
   Thanks
   Parth
  
   On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com
 wrote:
  
  Sample ACL JSON and Zookeeper is in public API, but I thought it
 is
  part of 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Parth Brahmbhatt
* We are not supporting regex matching on any of the strings
(host, resource, principal) yet, but this can be added. We have a special
wildcard (*) to refer to ALL, but there is no other regex matching going
on right now. We can associate CREATE with topics as you are proposing
once KIP-4 is merged; I am just not sure whether admins currently try to
figure out/control what topic names different tenants can have.
* With the current API they will have to do exactly what you said: call list
for each resource (cluster, topic and group) and reissue the same acls by
calling add in the mirrored cluster.
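
The list-and-reissue procedure in the second point can be sketched in Python as follows. InMemoryAclStore is a hypothetical stand-in for each cluster's ACL store, not the KIP-11 Authorizer interface or the CLI, and the tuple layout used for an acl is an assumption.

```python
class InMemoryAclStore:
    """Hypothetical stand-in for one cluster's ACL store, keyed by resource name."""

    def __init__(self):
        self._acls = {}

    def add(self, resource, acl):
        self._acls.setdefault(resource, set()).add(acl)

    def list_acls(self, resource):
        # Return a copy so callers cannot mutate the store by accident.
        return set(self._acls.get(resource, set()))

    def resources(self):
        return sorted(self._acls)


def mirror_acls(source, target):
    """List every acl on the source and add the missing ones to the target.

    Returns the number of acls that were newly added.
    """
    added = 0
    for resource in source.resources():
        existing = target.list_acls(resource)
        for acl in source.list_acls(resource):
            if acl not in existing:
                target.add(resource, acl)
                added += 1
    return added
```

Running mirror_acls periodically is idempotent, but this sketch never removes an acl from the target after it has been deleted on the source; a real mirroring job would have to decide how to handle that.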

Thanks
Parth

On 4/27/15, 2:17 PM, Jun Rao j...@confluent.io wrote:

Parth,

I was thinking that in a multi-tenant environment, an admin may want to
carve out some topic space to a user. For example, allow user X to create
any topic of X_*. Not sure how critical it is though.

Also, with the current api, what would the admin do to replicate the acls
from one cluster to another? Will she just list all acls from cli and
reissue them to another cluster periodically?

Thanks,

Jun

On Mon, Apr 27, 2015 at 10:56 AM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Thanks for your comments Jun.

 * Renamed the resource to consumer-group in wiki.
 * I don’t see a use case where admins/users would want to reserve topic
 names in advance. Can you describe why this would be needed.

 Thanks
 Parth

 On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote:

 A few more minor comments.
 
 100. To make it clear, perhaps we should rename the resource group to
 consumer-group. We can probably make the same change in CLI as well so
 that
 it's not confused with user group.
 
 101. Currently, create is only at the cluster level. Should it also be
at
 topic level? For example, perhaps it's useful to allow only user X to
 create topic X.
 
 Thanks,
 
 Jun
 
 
 On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
 wrote:
 
  Thanks for clarifying, Parth. I think you are taking the right
approach
  here.
 
  On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt
  pbrahmbh...@hortonworks.com wrote:
   Sorry Gwen, completely misunderstood the question :-).
  
   * Does everyone have the privilege to create a new Group and use
it to
   consume from Topics he's already privileged on?
   Yes in current proposal. I did not see an API to create
group
  but if you
   have a READ permission on a TOPIC and WRITE permission on that
Group
 you
   are free to join and consume.
  
  
   * Will the CLI tool be used to manage group membership too?
   Yes and I think that means I need to add --group. Updating
the
  KIP. Thanks
   for pointing this out.
  
   * Groups are kind of ephemeral, right? If all consumers in the
group
   disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or
do we
   treat the new group as completely new resource? Can we create ACLs
   before the group exists, in anticipation of it getting created?
   I have considered any auto delete and auto create as out of
  scope for the
   first release. So Right now I was going with preserving the acls.
Do
 you
   see any issues with this? Auto deleting would mean authorizer will
now
   have to get into implementation details of kafka which I was
trying to
   avoid.
  
   Thanks
   Parth
  
   On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
  
  We are not talking about same Groups :)
  
  I meant, Groups of consumers (which KIP-11 lists as a separate
  resource in the Privilege table)
  
  On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
  pbrahmbh...@hortonworks.com wrote:
   I see Groups as something we can add incrementally in the current
  model.
   The acls take principalType: name so groups can be represented as
  group:
   groupName. We are not managing group memberships anywhere in
kafka
 and
  I
   don’t see the need to do so.
  
   So for a topic1 using the CLI an admin can add an acl to grant
 access
  to
   group:kafka-test-users.
  
   The authorizer implementation can have a plugin to map
authenticated
  user
   to groups ( This is how hadoop and storm works). The plugin
could be
   mapping user to linux/ldap/active directory groups but that is
again
  upto
   the implementation.
  
   What we are offering is an interface that is extensible so these
  features
   can be added incrementally. I can add support for this in the
first
   release but don’t necessarily see why this would be absolute
 necessity.
  
   Thanks
   Parth
  
   On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com
wrote:
  
  Thanks.
  
  One more thing I'm missing in the KIP is details on the Group
 resource
  (I think we discussed this and it was just not fully updated):
  
  * Does everyone have the privilege to create a new Group and use
it
 to
  consume from Topics he's already privileged on?
  * Will the CLI tool be used to manage group membership too?
  * Groups are kind of ephemeral, right? If all consumers in the
group
  disconnect 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Parth Brahmbhatt
Thanks for your comments Jun.

* Renamed the resource to consumer-group in wiki.
* I don’t see a use case where admins/users would want to reserve topic
names in advance. Can you describe why this would be needed?

Thanks
Parth

On 4/26/15, 2:01 PM, Jun Rao j...@confluent.io wrote:

A few more minor comments.

100. To make it clear, perhaps we should rename the resource group to
consumer-group. We can probably make the same change in CLI as well so
that
it's not confused with user group.

101. Currently, create is only at the cluster level. Should it also be at
topic level? For example, perhaps it's useful to allow only user X to
create topic X.

Thanks,

Jun


On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
wrote:

 Thanks for clarifying, Parth. I think you are taking the right approach
 here.

 On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Sorry Gwen, completely misunderstood the question :-).
 
  * Does everyone have the privilege to create a new Group and use it to
  consume from Topics he's already privileged on?
  Yes in current proposal. I did not see an API to create group
 but if you
  have a READ permission on a TOPIC and WRITE permission on that Group
you
  are free to join and consume.
 
 
  * Will the CLI tool be used to manage group membership too?
  Yes and I think that means I need to add --group. Updating the
 KIP. Thanks
  for pointing this out.
 
  * Groups are kind of ephemeral, right? If all consumers in the group
  disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
  treat the new group as completely new resource? Can we create ACLs
  before the group exists, in anticipation of it getting created?
  I have considered any auto delete and auto create as out of
 scope for the
  first release. So Right now I was going with preserving the acls. Do
you
  see any issues with this? Auto deleting would mean authorizer will now
  have to get into implementation details of kafka which I was trying to
  avoid.
 
  Thanks
  Parth
 
  On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 We are not talking about same Groups :)
 
 I meant, Groups of consumers (which KIP-11 lists as a separate
 resource in the Privilege table)
 
 On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  I see Groups as something we can add incrementally in the current
 model.
  The acls take principalType: name so groups can be represented as
 group:
  groupName. We are not managing group memberships anywhere in kafka
and
 I
  don’t see the need to do so.
 
  So for a topic1 using the CLI an admin can add an acl to grant
access
 to
  group:kafka-test-users.
 
  The authorizer implementation can have a plugin to map authenticated
 user
  to groups ( This is how hadoop and storm works). The plugin could be
  mapping user to linux/ldap/active directory groups but that is again
 upto
  the implementation.
 
  What we are offering is an interface that is extensible so these
 features
  can be added incrementally. I can add support for this in the first
  release but don’t necessarily see why this would be absolute
necessity.
 
  Thanks
  Parth
 
  On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 Thanks.
 
 One more thing I'm missing in the KIP is details on the Group
resource
 (I think we discussed this and it was just not fully updated):
 
 * Does everyone have the privilege to create a new Group and use it
to
 consume from Topics he's already privileged on?
 * Will the CLI tool be used to manage group membership too?
 * Groups are kind of ephemeral, right? If all consumers in the group
 disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do
we
 treat the new group as completely new resource? Can we create ACLs
 before the group exists, in anticipation of it getting created?
 
 Its all small details, but it will be difficult to implement KIP-11
 without knowing the answers :)
 
 Gwen
 
 
 On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  You are right, moved it to the default implementation section.
 
  Thanks
  Parth
 
  On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 Sample ACL JSON and Zookeeper is in public API, but I thought it
is
 part of DefaultAuthorizer (Since Sentry and Argus won't be using
 Zookeeper).
 Am I wrong? Or is it the KIP?
 
 On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Thanks for clarifying Gwen, KIP updated.
 
  I tried to make the distinction by creating a section for all
 public
 APIs
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
 
  Let me know if you think there is a better way to reflect this.
 
  Thanks
  Parth
 
  On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com
wrote:
 
 +1 (non-binding)
 
 Two 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Parth Brahmbhatt
Hi Sun, thanks for the comments, my answers are below:

* I think the wiki already describes the precedence order as Deny taking
precedence over Allow when conflicting acls are found:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType
* In the first version that I am currently writing there is no group
support. Even when we add it I don’t see the need to add a precedence for
evaluation. It does not matter which principal matches as long as we have
a match.
* Acl storage is indexed by resource right now because that is the primary
lookup id for all authorize operations. Given acls are cached I don’t see
the need to optimize the storage layer any further for lookup.
* The reason why we have acls with multi-everything is to reduce redundancy
in acl storage. I am not sure how we will be able to reduce redundancy if
we divide them into one principal, one host, one operation.
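
The evaluation rule described in the points above (acls looked up by resource, any matching principal counts, Deny beats Allow) can be sketched in Python. This is only an illustration, not the SimpleAclAuthorizer code: the Acl class, the function names, the "*" spelling for all hosts, the topic name and the refuse-when-nothing-matches default are all assumptions.

```python
from dataclasses import dataclass

ANY_USER = "user:*"   # wildcard principal, as written in the wiki example
ANY_HOST = "*"        # assumed spelling for "all hosts"


@dataclass(frozen=True)
class Acl:
    principals: frozenset
    permission: str       # "ALLOW" or "DENY"
    operations: frozenset
    hosts: frozenset


def acl_matches(acl, principal, operation, host):
    """True if the acl covers this principal, operation and host."""
    principal_ok = principal in acl.principals or ANY_USER in acl.principals
    host_ok = host in acl.hosts or ANY_HOST in acl.hosts
    return principal_ok and host_ok and operation in acl.operations


def authorize(acls_by_resource, resource, principal, operation, host):
    """Deny wins over allow; with no matching acl the request is refused (assumed)."""
    matching = [acl for acl in acls_by_resource.get(resource, [])
                if acl_matches(acl, principal, operation, host)]
    if any(acl.permission == "DENY" for acl in matching):
        return False
    return any(acl.permission == "ALLOW" for acl in matching)


# Acl_1 and Acl_2 from the wiki example, attached to a hypothetical topic.
acls = {
    "topic:test": [
        Acl(frozenset({"user:bob", "user:*"}), "ALLOW",
            frozenset({"READ"}), frozenset({"*"})),
        Acl(frozenset({"user:bob"}), "DENY",
            frozenset({"READ"}), frozenset({"host1"})),
    ]
}
```

Under this sketch bob is refused READ from host1, because the deny acl matches, but is allowed READ from host2, and alice is allowed READ from host1 through user:*.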

Thanks
Parth

On 4/26/15, 8:06 PM, Sun, Dapeng dapeng@intel.com wrote:

Hi Parth

The design looks good; a few minor comments are below. I have only just
started following the discussion and may have missed many of the earlier
threads, so I'm sorry if these comments have already been discussed.

1. About SimpleAclAuthorizer (SimpleAuthorizer):
a. As I understand it, there should be only one type of privilege
(allow/deny) for a principal on a topic, or else we should define
deny > allow.
For example, given acl_1 (host1 - group1 - user1 - read-allow) and acl_2
(host1 - group1 - user1 - read-deny), if the two acls apply to the same
topic, the result may be hard to understand. Do you think it's necessary
to add some details about this to the wiki?
b. When we authorize a user on a topic, we should perhaps check the user's
user-level acl first, then the user's group-level acl, and finally the
host-level and default-level acls. Do you think it's necessary to add
something like this to the wiki?
For example, host1 - group1 - user1 > host1 - group1 > host1

2. About SimpleAclAuthorizer (Acl Json will be stored in zookeeper)
a. It may be better to store the acl json hierarchically. That may make it
easier to search and to authorize. For example, when we authorize a user,
we only need the acls related to that user.
b. I found that one acl may contain multiple principals, multiple
operations and multiple hosts. I strongly agree that we should provide an
api like this, but for the acls stored in zookeeper or in memory it may be
better to separate them into one principal, one operation and one host.
That way we can make sure there are not many acls with the same meaning,
and acl management becomes easier.


Regards
Dapeng

-Original Message-
From: Jun Rao [mailto:j...@confluent.io]
Sent: Monday, April 27, 2015 5:02 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

A few more minor comments.

100. To make it clear, perhaps we should rename the resource group to
consumer-group. We can probably make the same change in CLI as well so
that it's not confused with user group.

101. Currently, create is only at the cluster level. Should it also be at
topic level? For example, perhaps it's useful to allow only user X to
create topic X.

Thanks,

Jun


On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
wrote:

 Thanks for clarifying, Parth. I think you are taking the right
 approach here.

 On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Sorry Gwen, completely misunderstood the question :-).
 
  * Does everyone have the privilege to create a new Group and use it
  to consume from Topics he's already privileged on?
  Yes in current proposal. I did not see an API to create
  group
 but if you
  have a READ permission on a TOPIC and WRITE permission on that Group
  you are free to join and consume.
 
 
  * Will the CLI tool be used to manage group membership too?
  Yes and I think that means I need to add --group. Updating
  the
 KIP. Thanks
  for pointing this out.
 
  * Groups are kind of ephemeral, right? If all consumers in the group
  disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do
  we treat the new group as completely new resource? Can we create
  ACLs before the group exists, in anticipation of it getting created?
  I have considered any auto delete and auto create as out of
 scope for the
  first release. So Right now I was going with preserving the acls. Do
  you see any issues with this? Auto deleting would mean authorizer
  will now have to get into implementation details of kafka which I
  was trying to avoid.
 
  Thanks
  Parth
 
  On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 We are not talking about same Groups :)
 
 I meant, Groups of consumers (which KIP-11 lists as a separate
 resource in the Privilege table)
 
 On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  I see Groups as something we can add incrementally in the current
 model.
  The acls take principalType: name so

RE: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Sun, Dapeng
Attach the image.
https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png

Regards
Dapeng

From: Sun, Dapeng [mailto:dapeng@intel.com]
Sent: Tuesday, April 28, 2015 11:44 AM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-11- Authorization design for kafka security


Thank you for your rapid reply, Parth.



* I think the wiki already describes the precedence order as Deny taking 
precedence over allow when conflicting acls are found 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

Got it, thank you.



* In the first version that I am currently writing there is no group support. 
Even when we add it I don't see the need to add a precedence for evaluation. 
it does not matter which principal matches as long as we have a match.



About this part, I think we should choose the best-matching acl for 
authorization, whether or not we support groups.



For the case

https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png



If 2 acls are defined, one that denies an operation from all hosts and one that 
allows the operation from host1, will the operation from host1 be denied or 
allowed?

According to the wiki, "Deny will take precedence over Allow in competing acls", 
so it seems acl_1 will win the competition, but the customer's intention may be 
to allow.

I think deny always taking precedence over allow is okay, but an order such as 
host1 - user1 > host1 > default may make sense.





* Acl storage is indexed by resource right now because that is the primary 
lookup id for all authorize operations. Given acls are cached I don't see the 
need to optimized the storage layer any further for lookup.

* The reason why we have acl with multi everything is to reduce redundancy in 
acl storage. I am not sure how will we be able to reduce redundancy if we 
divide it by using one principal,one host, one operation.



Yes, I also agree that Acl storage should be indexed by resource. Under the 
resource index, it may be better to add indexes such as hosts and principals. One 
option may be one principal, one host, one operation. I am just offering these 
scenarios for your consideration.



For the case defined in wiki:

Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.

Acl_2 - {user:bob} is denied to READ from host1

Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE from 
{host1, host2}.



For acl_3, if we want to remove alice's WRITE from {host1, host2} and remove 
alice's READ from host1, the user has the following ways to achieve this:



1. Remove the parts of acl_3 directly. I think if we make the acls divided and 
hierarchical, this kind of operation could be done directly in the backend.

2. Remove acl_3, and add new acls: {group:kafka-devs} is allowed to READ and 
WRITE from {host1, host2}, and {user:alice} is allowed to READ from {host2}.

3. Add two deny acls: {user:alice} is denied WRITE from {host1, host2}, and 
{user:alice} is denied READ from {host1}.



All of these can achieve the result, but I think option 1 is the most direct 
for users. If you think this optimization is not urgent, I 
also agree.
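
To make options 1 and 2 above concrete, here is a small Python sketch that treats an allow acl as a set of (principal, operation, host) entries. The expand helper and this flat representation are assumptions for illustration; they are not how KIP-11 stores acls.

```python
from itertools import product


def expand(principals, operations, hosts):
    """Split a multi-valued allow acl into one-principal/one-operation/one-host entries."""
    return set(product(principals, operations, hosts))


# Acl_3 from the wiki example.
acl_3 = expand({"user:alice", "group:kafka-devs"},
               {"READ", "WRITE"},
               {"host1", "host2"})

# Option 1: remove the unwanted entries from acl_3 directly.
revoked = (expand({"user:alice"}, {"WRITE"}, {"host1", "host2"})
           | expand({"user:alice"}, {"READ"}, {"host1"}))
option_1 = acl_3 - revoked

# Option 2: drop acl_3 and add two replacement acls.
option_2 = (expand({"group:kafka-devs"}, {"READ", "WRITE"}, {"host1", "host2"})
            | expand({"user:alice"}, {"READ"}, {"host2"}))

same_result = option_1 == option_2
```

In this representation both options leave the same five entries: the four kafka-devs entries plus alice's READ from host2. Option 3 is different in kind, since it leaves acl_3 in place and relies on deny taking precedence at evaluation time.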



Regards

Dapeng



-Original Message-

From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com]

Sent: Tuesday, April 28, 2015 12:18 AM

To: dev@kafka.apache.org

Subject: Re: [VOTE] KIP-11- Authorization design for kafka security



Hi Sun, thanks for the comments, my answers are below:



* I think the wiki already describes the precedence order as Deny taking 
precedence over allow when conflicting acls are found 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

* In the first version that I am currently writing there is no group support. 
Even when we add it I don't see the need to add a precedence for evaluation. it 
does not matter which principal matches as long as we have a match.

* Acl storage is indexed by resource right now because that is the primary 
lookup id for all authorize operations. Given acls are cached I don't see the 
need to optimized the storage layer any further for lookup.

* The reason why we have acl with multi everything is to reduce redundancy in 
acl storage. I am not sure how will we be able to reduce redundancy if we 
divide it by using one principal,one host, one operation.



Thanks

Parth



On 4/26/15, 8:06 PM, Sun, Dapeng dapeng@intel.com wrote:



Hi Parth



The design looks good, a few minor comments below. Since I just started

looking into the discussion and many previous discussions I may missed,

I'm sorry if these comments had be discussed.



1. About SimpleAclAuthorizer (SimpleAuthorizer):

a. As my understanding, I think there should only one type

privilege(allow/deny) of a topic on a principle, or we make it deny 

allow.

For example, acl_1  host1

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-27 Thread Gwen Shapira
While I see the advantage of being able to say something like "deny user X
from hosts h1...h200, but also allow user X from host h189", there are two
issues here:

1. Complex rule systems can be difficult to reason about and therefore end
up being less secure. The rule "Deny always wins" is very easy to grasp.

2. We currently don't have any mechanism for specifying IP ranges (or host
ranges) at all. I think it's a pretty significant deficiency, but it does
mean that we don't need to worry about the issue of blocking a large range
while unblocking a few servers in the range.
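
The scenario in the two points above can be sketched in Python, assuming a host-range syntax that KIP-11 does not have. The CIDR notation, the rule tuples and the function names are hypothetical. The sketch only shows what "Deny always wins" would do with a denied range plus one allowed host inside it.

```python
import ipaddress


def host_in(pattern, host):
    """Match a host IP against a single address or a CIDR range (hypothetical syntax)."""
    return ipaddress.ip_address(host) in ipaddress.ip_network(pattern, strict=False)


def allowed(rules, host):
    """Apply 'Deny always wins' to a list of (permission, host pattern) rules."""
    matching = [permission for permission, pattern in rules if host_in(pattern, host)]
    return "DENY" not in matching and "ALLOW" in matching


# Deny a whole range, then try to allow one host inside it.
rules = [("DENY", "10.0.0.0/24"), ("ALLOW", "10.0.0.189")]
```

Here allowed(rules, "10.0.0.189") is False: the single allow cannot punch a hole in the denied range. That keeps the rule easy to reason about, at the cost of the flexibility discussed earlier in the thread.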

Gwen

P.S
We have a call tomorrow (Tuesday, April 28) at 3pm PST - to discuss this
and other outstanding design issues (not all related to security). If you
are interested in joining - let me know and I'll forward you the invite.

Gwen

On Mon, Apr 27, 2015 at 10:15 PM, Sun, Dapeng dapeng@intel.com wrote:

 Attach the image.

 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png

 Regards
 Dapeng

 From: Sun, Dapeng [mailto:dapeng@intel.com]
 Sent: Tuesday, April 28, 2015 11:44 AM
 To: dev@kafka.apache.org
 Subject: RE: [VOTE] KIP-11- Authorization design for kafka security


 Thank you for your rapid reply, Parth.



 * I think the wiki already describes the precedence order as Deny taking
 precedence over allow when conflicting acls are found
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

 Got it, thank you.



 * In the first version that I am currently writing there is no group
 support. Even when we add it I don't see the need to add a precedence for
 evaluation. it does not matter which principal matches as long as

  we have a match.



 About this part, I think we should choose the best matching acl for
 authorization, no matter we support group or not.



 For the case


 https://raw.githubusercontent.com/sundapeng/attachment/master/kafka-acl1.png



 if 2 Acls are defined, one that deny an operation from all hosts and one
 that allows the operation from host1, the operation from host1 will be
 denied or allowed?

 According wiki Deny will take precedence over Allow in competing acls.,
 it seems acl_1 will win the competition, but customers' intention may be
 allow.

 I think deny always take precedence over Allow is okay, but  host1 -
 user1host1 default may make sense.





 * Acl storage is indexed by resource right now because that is the
 primary lookup id for all authorize operations. Given acls are cached I
 don't see the need to optimized the storage layer any further for lookup.

 * The reason why we have acl with multi everything is to reduce
 redundancy in acl storage. I am not sure how will we be able to reduce
 redundancy if we divide it by using one principal,one host, one operation.



 Yes, I'm also agreed with Acl storage should be indexed by resource.
 Under resource index, it may be better to add index such as hosts and
 principals. One option may be one principal, one host, one operation. Just
 give your these scenarios for considering.



 For the case defined in wiki:

 Acl_1 - {user:bob, user:*} is allowed to READ from all hosts.

 Acl_2 - {user:bob} is denied to READ from host1

 Acl_3 - {user:alice, group:kafka-devs} is allowed to READ and WRITE
 from {host1, host2}.



 For acl_3, if we want to remove alice's WRITE from {host1,host2} and
 remove alice's READ from host1, user may have following ways to achieve:



 1.Remove the parts of acl_3 directly, I think if we make it divided and
 hierarchical, this kind of operations could be done directly in backend.

 2.Remove acl_3, and add new acl {group:kafka-devs} is allowed to READ
 and WRITE from {host1, host2} and {user:alice } is allowed to READ from
 {host2}

 3.Add two denied acls,{ user:alice} is denied to WRITE from {host1,host2}
 and { user:alice} is denied to READ from {host1}



 All these can achieve this kind of operations, but I think 1 could more
 directly for user operations. If you think this optimization is not urgent,
 I'm also agreed.



 Regards

 Dapeng



 -Original Message-

 From: Parth Brahmbhatt [mailto:pbrahmbh...@hortonworks.com]

 Sent: Tuesday, April 28, 2015 12:18 AM

 To: dev@kafka.apache.orgmailto:dev@kafka.apache.org

 Subject: Re: [VOTE] KIP-11- Authorization design for kafka security



 Hi Sun, thanks for the comments, my answers are below:



 * I think the wiki already describes the precedence order as Deny taking
 precedence over allow when conflicting acls are found
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PermissionType

 * In the first version that I am currently writing there is no group
 support. Even when we add it I don't see the need to add a precedence for
 evaluation. it does not matter which principal matches as long as we have a
 match.

 * Acl storage is indexed

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-26 Thread Jun Rao
A few more minor comments.

100. To make it clear, perhaps we should rename the resource group to
consumer-group. We can probably make the same change in CLI as well so that
it's not confused with user group.

101. Currently, create is only at the cluster level. Should it also be at
topic level? For example, perhaps it's useful to allow only user X to
create topic X.

Thanks,

Jun


On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
wrote:

 Thanks for clarifying, Parth. I think you are taking the right approach
 here.

 On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Sorry Gwen, completely misunderstood the question :-).
 
  * Does everyone have the privilege to create a new Group and use it to
  consume from Topics he's already privileged on?
  Yes in current proposal. I did not see an API to create group
 but if you
  have a READ permission on a TOPIC and WRITE permission on that Group you
  are free to join and consume.
 
 
  * Will the CLI tool be used to manage group membership too?
  Yes and I think that means I need to add --group. Updating the
 KIP. Thanks
  for pointing this out.
 
  * Groups are kind of ephemeral, right? If all consumers in the group
  disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
  treat the new group as completely new resource? Can we create ACLs
  before the group exists, in anticipation of it getting created?
  I have considered any auto delete and auto create as out of
 scope for the
  first release. So Right now I was going with preserving the acls. Do you
  see any issues with this? Auto deleting would mean authorizer will now
  have to get into implementation details of kafka which I was trying to
  avoid.
 
  Thanks
  Parth
 
  On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 We are not talking about same Groups :)
 
 I meant, Groups of consumers (which KIP-11 lists as a separate
 resource in the Privilege table)
 
 On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  I see Groups as something we can add incrementally in the current
 model.
  The acls take principalType: name so groups can be represented as
 group:
  groupName. We are not managing group memberships anywhere in kafka and
 I
  don’t see the need to do so.
 
  So for a topic1 using the CLI an admin can add an acl to grant access
 to
  group:kafka-test-users.
 
  The authorizer implementation can have a plugin to map authenticated
 user
  to groups ( This is how hadoop and storm works). The plugin could be
  mapping user to linux/ldap/active directory groups but that is again
 upto
  the implementation.
 
  What we are offering is an interface that is extensible so these
 features
  can be added incrementally. I can add support for this in the first
  release but don’t necessarily see why this would be absolute necessity.
 
  Thanks
  Parth
 
  On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 Thanks.
 
 One more thing I'm missing in the KIP is details on the Group resource
 (I think we discussed this and it was just not fully updated):
 
 * Does everyone have the privilege to create a new Group and use it to
 consume from Topics he's already privileged on?
 * Will the CLI tool be used to manage group membership too?
 * Groups are kind of ephemeral, right? If all consumers in the group
 disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
 treat the new group as completely new resource? Can we create ACLs
 before the group exists, in anticipation of it getting created?
 
 Its all small details, but it will be difficult to implement KIP-11
 without knowing the answers :)
 
 Gwen
 
 
 On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  You are right, moved it to the default implementation section.
 
  Thanks
  Parth
 
  On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 Sample ACL JSON and Zookeeper is in public API, but I thought it is
 part of DefaultAuthorizer (Since Sentry and Argus won't be using
 Zookeeper).
 Am I wrong? Or is it the KIP?
 
 On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Thanks for clarifying Gwen, KIP updated.
 
  I tried to make the distinction by creating a section for all
 public
 APIs
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
 
  Let me know if you think there is a better way to reflect this.
 
  Thanks
  Parth
 
  On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 +1 (non-binding)
 
 Two nitpicks for the wiki:
 * Heartbeat is probably a READ and not CLUSTER operation. I'm
 pretty
 sure new consumers need it to be part of a consumer group.
 * Can you clearly separate which parts are the API (common to every
 Authorizer) and which parts are DefaultAuthorizer implementation?
 

RE: [VOTE] KIP-11- Authorization design for kafka security

2015-04-26 Thread Sun, Dapeng
Hi Parth

The design looks good; a few minor comments are below. I have only just started
following the discussion and may have missed many earlier threads, so I
apologize if these points have already been discussed.

1. About SimpleAclAuthorizer (SimpleAuthorizer):
a. As I understand it, there should be only one type of privilege (allow/deny)
for a principal on a topic, or else we should define deny > allow.
For example, with acl_1 host1 - group1 - user1 - read-allow and acl_2
host1 - group1 - user1 - read-deny, if the two acls are for the same topic it
may be hard to understand. Do you think it is necessary to add some details
about this to the wiki?
b. When we authorize a user on a topic, perhaps we should check the user-level
acl first, then the user's group-level acl, and finally the host-level and
default-level acls. Do you think we should add something like this to the wiki?
For example, host1 - group1 - user1 > host1 - group1 > host1

2. About SimpleAclAuthorizer (Acl JSON will be stored in zookeeper):
a. It may be better to store the acl JSON hierarchically. That would make it
easier to search and to authorize. For example, when we authorize a user, we
only need the user-related acls.
b. I found that one acl may contain multiple principals, multiple operations
and multiple hosts. I strongly agree that we should provide an API like this,
but for the acls stored in zookeeper or in memory it may be better to split
them into one principal, one operation and one host each. That way we can make
sure there are not many acls with the same meaning, and acl management becomes
easier.
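
A minimal sketch of the split suggested in 2.b, assuming an acl can be reduced
to (principal, operation, host, permission) tuples. The function names and the
tuple layout are hypothetical and are not taken from the KIP.

```python
from itertools import product


def flatten_acl(principals, operations, hosts, permission):
    """Expand one multi-valued acl into single-valued entries.

    Each returned tuple names exactly one principal, one operation and
    one host, which is the storage form suggested above.
    """
    return [
        (principal, operation, host, permission)
        for principal, operation, host in product(principals, operations, hosts)
    ]


def dedupe(entries):
    """Drop entries with the same meaning and return them in a stable order."""
    return sorted(set(entries))


flat = flatten_acl(["user_a", "user_b"], ["READ", "WRITE"], ["host1"], "ALLOW")
```

With single-valued entries, two acls that grant the same thing collapse to one
entry in a set, which is the management benefit argued for above.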


Regards
Dapeng

-Original Message-
From: Jun Rao [mailto:j...@confluent.io] 
Sent: Monday, April 27, 2015 5:02 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-11- Authorization design for kafka security

A few more minor comments.

100. To make it clear, perhaps we should rename the resource group to 
consumer-group. We can probably make the same change in CLI as well so that 
it's not confused with user group.

101. Currently, create is only at the cluster level. Should it also be at topic 
level? For example, perhaps it's useful to allow only user X to create topic X.

Thanks,

Jun


On Sun, Apr 26, 2015 at 12:36 AM, Gwen Shapira gshap...@cloudera.com
wrote:

 Thanks for clarifying, Parth. I think you are taking the right 
 approach here.

 On Fri, Apr 24, 2015 at 11:46 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:
  Sorry Gwen, completely misunderstood the question :-).
 
  * Does everyone have the privilege to create a new Group and use it 
  to consume from Topics he's already privileged on?
  Yes in current proposal. I did not see an API to create 
  group
 but if you
  have a READ permission on a TOPIC and WRITE permission on that Group 
  you are free to join and consume.
 
 
  * Will the CLI tool be used to manage group membership too?
  Yes and I think that means I need to add --group. Updating 
  the
 KIP. Thanks
  for pointing this out.
 
  * Groups are kind of ephemeral, right? If all consumers in the group 
  disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do 
  we treat the new group as completely new resource? Can we create 
  ACLs before the group exists, in anticipation of it getting created?
  I have considered any auto delete and auto create as out of
 scope for the
  first release. So Right now I was going with preserving the acls. Do 
  you see any issues with this? Auto deleting would mean authorizer 
  will now have to get into implementation details of kafka which I 
  was trying to avoid.
 
  Thanks
  Parth
 
  On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 We are not talking about same Groups :)
 
 I meant, Groups of consumers (which KIP-11 lists as a separate 
 resource in the Privilege table)
 
 On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt 
 pbrahmbh...@hortonworks.com wrote:
  I see Groups as something we can add incrementally in the current
 model.
  The acls take principalType: name so groups can be represented as
 group:
  groupName. We are not managing group memberships anywhere in kafka 
  and
 I
  don’t see the need to do so.
 
  So for a topic1 using the CLI an admin can add an acl to grant 
  access
 to
  group:kafka-test-users.
 
  The authorizer implementation can have a plugin to map 
 authenticated user  to groups ( This is how hadoop and storm 
 works). The plugin could be  mapping user to linux/ldap/active 
 directory groups but that is again upto  the implementation.
 
  What we are offering is an interface that is extensible so these 
 features  can be added incrementally. I can add support for this in 
 the first  release but don’t necessarily see why this would be 
 absolute necessity.
 
  Thanks
  Parth
 
  On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:
 
 Thanks.
 
 One more thing I'm missing in the KIP is details on the Group 
 resource (I think we discussed this and it was just not fully updated):
 
 * Does everyone have the privilege

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
I see Groups as something we can add incrementally in the current model.
The acls take principalType: name so groups can be represented as group:
groupName. We are not managing group memberships anywhere in kafka and I
don’t see the need to do so.

So for a topic1 using the CLI an admin can add an acl to grant access to
group:kafka-test-users.

The authorizer implementation can have a plugin to map an authenticated user
to groups (this is how hadoop and storm work). The plugin could be
mapping users to linux/ldap/active directory groups, but that is again up to
the implementation.

What we are offering is an interface that is extensible so these features
can be added incrementally. I can add support for this in the first
release but don’t necessarily see why this would be an absolute necessity.
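
A rough sketch of the idea above, assuming principals are written as
"principalType:name" and that a pluggable mapper resolves a user to its
groups. StaticGroupMapper and the helper names are hypothetical; a real plugin
might query linux/ldap/active directory instead of a dict.

```python
def parse_principal(text):
    """Split 'principalType:name' into (type, name)."""
    principal_type, _, name = text.partition(":")
    principal_type = principal_type.strip().lower()
    name = name.strip()
    if not principal_type or not name:
        raise ValueError(f"expected 'principalType:name', got {text!r}")
    return principal_type, name


class StaticGroupMapper:
    """Hypothetical user-to-groups plugin backed by an in-memory dict."""

    def __init__(self, memberships):
        self._memberships = memberships

    def groups_for(self, user):
        return self._memberships.get(user, set())


def principals_for(user, mapper):
    """All principals an authenticated user carries: itself plus its groups."""
    principals = {("user", user)}
    principals |= {("group", group) for group in mapper.groups_for(user)}
    return principals


def is_granted(acl_principals, user, mapper):
    """True if any principal named in the acl belongs to the user."""
    granted = {parse_principal(p) for p in acl_principals}
    return bool(granted & principals_for(user, mapper))


mapper = StaticGroupMapper({"alice": {"kafka-test-users"}})
```

Here is_granted(["group:kafka-test-users"], "alice", mapper) is true without
kafka itself managing any group membership; only the mapper knows about it.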

Thanks
Parth

On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

Its all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about
acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1
from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType
which
is
Deny acls should be evaluated before allow acls. To give you an
example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to
access
from host1 he will be denied(acl4), even though both user1 and host1
has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and
it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth
the
value its adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of
hosts
from
ACL users will still be able to whitelist/blacklist host as long as
we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to
restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
We are not talking about same Groups :)

I meant, Groups of consumers (which KIP-11 lists as a separate
resource in the Privilege table)

On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 I see Groups as something we can add incrementally in the current model.
 The acls take principalType: name so groups can be represented as group:
 groupName. We are not managing group memberships anywhere in kafka and I
 don’t see the need to do so.

 So for a topic1 using the CLI an admin can add an acl to grant access to
 group:kafka-test-users.

 The authorizer implementation can have a plugin to map authenticated user
 to groups ( This is how hadoop and storm works). The plugin could be
 mapping user to linux/ldap/active directory groups but that is again upto
 the implementation.

 What we are offering is an interface that is extensible so these features
 can be added incrementally. I can add support for this in the first
 release but don’t necessarily see why this would be absolute necessity.

 Thanks
 Parth

 On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

Its all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about
acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1
from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType
which
is
Deny acls should be evaluated before allow acls. To give you an
example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to
access
from host1 he will be denied(acl4), even though both user1 and host1
has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and
it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth
the
value its adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of
hosts
from
ACL users will still be able to whitelist/blacklist host as long as
we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Sorry for the confusion. I'm not sure my last email was clear enough either...

Consumers will have a Principal which may belong to a group.
But consumer configuration also has a group.id, which controls how
partitions are shared between consumers and how offsets are committed.
I'm talking about those Groups.
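
To make the distinction concrete, here is a small sketch of the rule Parth
describes elsewhere in this thread for the current proposal: READ on the topic
plus WRITE on the consumer group (the group.id) is enough to join and consume.
The grant tuples and the can_consume helper are hypothetical names used only
for illustration.

```python
def can_consume(grants, principal, topic, group_id):
    """Check both resources: the topic and the consumer group (group.id)."""
    return (
        (principal, "TOPIC", topic, "READ") in grants
        and (principal, "GROUP", group_id, "WRITE") in grants
    )


# Grants live independently of whether the consumer group currently has
# members, so an acl can be added before the group is first used.
grants = {
    ("user:alice", "TOPIC", "topic1", "READ"),
    ("user:alice", "GROUP", "billing-consumers", "WRITE"),
    ("user:bob", "TOPIC", "topic1", "READ"),
}
```

In this sketch alice can consume topic1 as part of billing-consumers, while bob
has READ on the topic but no WRITE on the group and so cannot join it.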


On Fri, Apr 24, 2015 at 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:
 We are not talking about same Groups :)

 I meant, Groups of consumers (which KIP-11 lists as a separate
 resource in the Privilege table)

 On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
 I see Groups as something we can add incrementally in the current model.
 The acls take principalType: name so groups can be represented as group:
 groupName. We are not managing group memberships anywhere in kafka and I
 don’t see the need to do so.

 So for a topic1 using the CLI an admin can add an acl to grant access to
 group:kafka-test-users.

 The authorizer implementation can have a plugin to map authenticated user
 to groups ( This is how hadoop and storm works). The plugin could be
 mapping user to linux/ldap/active directory groups but that is again upto
 the implementation.

 What we are offering is an interface that is extensible so these features
 can be added incrementally. I can add support for this in the first
 release but don’t necessarily see why this would be absolute necessity.

 Thanks
 Parth

 On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

Its all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about
acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1
from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType
which
is
Deny acls should be evaluated before allow acls. To give you an
example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to
access
from host1 he will be denied(acl4), even though both user1 and host1
has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and
it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Thanks for your comments Gari. My responses are inline.

Thanks
Parth

On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote:

Sorry - fat fingered send ...


Not sure if my newbie vote will count, but I think you are getting pretty
close here.

Couple of things:

1) I know the Session object is from a different JIRA, but I think that
Session should take a Subject rather than just a single Principal.  The
reason for this is because a Subject can have multiple Principals (for
example both a username and a group or perhaps someone would want to use
both the username and the clientIP as Principals)

I think the user - group mapping can be done at Authorization
implementation layer. In any case as you pointed out the session is part
of another jira and I think a PR is out
https://reviews.apache.org/r/27204/diff/ and we should discuss it on that
PR.


2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important as eventually (hopefully sooner than later), we will
support multiple types of authentication which may each want to populate
the Subject with one or more Principals and perhaps even credentials (this
could be used in the future to hold encryption keys or perhaps the raw info
prior to authentication).

So in this way, if we have different authentication modules, we can add
different types of Principals by extension

This also allows the same subject to have access to some resources based on
username and some based on group.

Given that with this we would have different types of Principals, I would
then modify the ACL to look like:

{version:1,
  {acls:[
{
  principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal],
  principals:[alice,kafka-devs]
  ...

or

{version:1,
  {acls:[
{
  principals:[KafkaUserPrincipal:alice,KafkaGroupPrincipal:kafka-devs]


But in either case this allows for easy identification of the type of
principal and makes it easy to plugin multiple kinds of principals

The advantage of all of this is that it now provides more flexibility for
custom modules for both authentication and authorization moving forward.

All the principals that you listed above can be supported with the current
design. Acls take a KafkaPrincipal as input, which is a combination of type
and principal name, and the authorizer implementations are free to create
any extension of this which covers group: groupName, host: HostName,
kerberos: kerberosUserName and any other types that may come up. I am not
sure how encryption key storage is relevant to the Authorizer, so it would be
great if you could elaborate.


3) Are you sure that you want authorize to take a session object?  If
we use the model in 1) above, we could just populate the Subject with a
KafkaClientAddressPrincipal and then have access to that when evaluating the
ACLs.

I think it is better to take a session, which can just be a wrapper on top
of Subject + host for now. This allows for extension, which in my opinion
is more future-proof.
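
As a sketch only, a session wrapper of this kind could look like the
following. The Session fields, the grants set and the authorize signature are
assumptions made for illustration, and Deny handling is left out to keep it
short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical wrapper: a stand-in for Subject principals plus host."""

    principals: frozenset
    host: str


def authorize(session, operation, resource, grants):
    """Allow if any principal in the session holds a matching grant.

    A grant is (principal, host, operation, resource); host "*" means any host.
    """
    for principal in session.principals:
        for host in (session.host, "*"):
            if (principal, host, operation, resource) in grants:
                return True
    return False


grants = {
    ("user:alice", "host1", "READ", "topic1"),
    ("group:kafka-devs", "*", "WRITE", "topic1"),
}
remote = Session(frozenset({"user:alice", "group:kafka-devs"}), "host2")
local = Session(frozenset({"user:alice"}), "host1")
```

Because callers pass the whole session, adding another field to Session later
would not change the authorize signature.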


4) What about actually caching authorization decisions?  I know ACLs will
be cached, but the actual authorize decision can be expensive as well?

In the default implementation I don’t plan to do this. It is easy to add later
if we want to, but I am not sure why this would ever be expensive when acls
are cached, the number of acls on a single topic should be very small, and
iterating over them with simple string comparison should not really be
expensive.

Thanks
Parth


On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com
wrote:

 Not sure if my newbie vote will count, but I think you are getting
 pretty close here.

 Couple of things:

 1) I know the Session object is from a different JIRA, but I think that
 Session should take a Subject rather than just a single Principal.  The
 reason for this is because a Subject can have multiple Principals (for
 example both a username and a group or perhaps someone would want to use
 both the username and the clientIP as Principals)

 2)  We would then also have multiple concrete Principals, e.g.

 KafkaPrincipal
 KafkaUserPrincipal
 KafkaGroupPrincipal
 (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
 etc

 This is important as eventually (hopefully sooner than later), we will
 support multiple types of authentication which may each want to populate
 the Subject with one or more Principals and perhaps even credentials
(this
 could be used in the future to hold encryption keys or perhaps the raw
info
 prior to authentication).

 So in this way, if we have different authentication modules, we can add
 different types of Principals by extension

 This also allows the same subject to have access to some resources based
 on username and some based on group.

 Given that with this we would have different types of Principals, I

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated, and the simplest solution would be to modify the acl class
so that it holds a set of principals instead of a single principal, i.e.

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType, that is,
Deny acls should be evaluated before allow acls. To give you an example,
suppose we have the following acls:

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY, so if user1 tries to access
from host1 he will be denied (acl4), even though both user1 and host1 have
allow acls with wildcards (acl1, acl2).
If user1 tries to READ from host2, the action will be allowed, and it does
not matter whether we match acl3 or acl1, so I don’t think the evaluation order
matters here.
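
The example above can be sketched as follows. This is a minimal illustration
rather than the DefaultAuthorizer code: the Acl fields are hypothetical, and
denying a request when no acl matches at all is an assumption the thread does
not settle.

```python
from dataclasses import dataclass

WILDCARD = "*"


@dataclass(frozen=True)
class Acl:
    principal: str   # user name, or "*" for any user
    host: str        # host name, or "*" for any host
    operation: str   # e.g. "READ"
    permission: str  # "ALLOW" or "DENY"

    def matches(self, principal, host, operation):
        return (
            self.principal in (WILDCARD, principal)
            and self.host in (WILDCARD, host)
            and self.operation == operation
        )


def authorize(acls, principal, host, operation):
    matching = [a for a in acls if a.matches(principal, host, operation)]
    # Any matching DENY wins, however many ALLOW acls also match.
    if any(a.permission == "DENY" for a in matching):
        return False
    # Otherwise one matching ALLOW is enough; which one matched is irrelevant.
    # With no match at all this returns False (assumed default).
    return any(a.permission == "ALLOW" for a in matching)


acls = [
    Acl("user1", WILDCARD, "READ", "ALLOW"),  # acl1
    Acl(WILDCARD, "host1", "READ", "ALLOW"),  # acl2
    Acl(WILDCARD, "host2", "READ", "ALLOW"),  # acl3
    Acl("user1", "host1", "READ", "DENY"),    # acl4
]
```

user1 reading from host1 is rejected by acl4 even though acl1 and acl2 both
match, while user1 reading from host2 is allowed by either acl1 or acl3.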

“Will people actually use hosts with users?” I really don’t know, but given
that ACLs are part of our public APIs I thought it better to try to cover
more use cases. If others think this extra complexity is not worth the
value it adds, please raise your concerns so we can discuss whether it should
be removed from the acl structure. Note that even in the absence of hosts in
the ACL, users will still be able to whitelist/blacklist hosts as long as we
start supporting principalType = “host”, which is easy to add and can be an
incremental improvement. They will, however, lose the ability to restrict
users' access to just a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like JSON, but that probably has something to do with me being a
developer :-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does it
make sense to have hosts at the same level as principals in the
hierarchy? That way you could just blanket the allowed/denied hosts and
only have to worry about the users. If you follow this, then we can
wildcard the user and have a separate list of just host-based access.
In what order would the permissions be evaluated if there was more than
one match on a principal?

Is the thought that there wouldn't usually be much overlap on hosts? I
can imagine a scenario where I want to offline/online access to a
particular host or set of hosts, and if there was overlap, I'd be running
a bunch of alter commands for just a single host. Maybe this is too
contrived an example?

I agree that this level of granularity gives flexibility, but I wonder
whether people will actually use it rather than just wildcard (*) the
hosts for a given user and create a separate global list as I mentioned
above.

The only other system I know of that ties users to hosts for access is
MySQL, and I don't love that model. Companies usually standardize on group
authorization anyway; are we complicating that by attaching hosts to
users? Additionally, I worry about the debt of big JSON configs in the
first place. Most non-developers already find them non-intuitive, so
anything that eases this would be beneficial.


Thanks

Jeff

On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Sorry I missed your last questions. I am +0 on adding a --host option for
 --list; we could add it for symmetry. Again, if this is only a CLI change
 it can be added later; if you mean adding this to the authorizer
 interface, then we should make a decision now.

 Given a choice I would like to actually keep only one option which is
 resource based get (remove even the get based on principal). I see
those
 (getAcl for principal or host) as special filtering case which can

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
You are right, moved it to the default implementation section.

Thanks
Parth

On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType which
is
Deny acls should be evaluated before allow acls. To give you an
example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to
access
from host1 he will be denied(acl4), even though both user1 and host1
has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth the
value its adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of hosts
from
ACL users will still be able to whitelist/blacklist host as long as we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to
restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like
Jsons but that probably has something to do with me being a developer
:-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has
been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs.
Does
it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts
and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the order that the perms would be evaluated if a there
was
more than one match on a principal ?

Is the thought that there wouldn't usually be much overlap on hosts?
I
guess I can imagine a scenario where I want to offline/online access
to a
particular hosts or set of hosts and if there was overlap, I'm doing
a
bunch of alter commands for just a single host. Maybe this is too
contrived
an example?

I agree that having this level of granularity gives flexibility but I
wonder if people will actually use it and not just * the hosts for a
given
user and create separate global list as i mentioned above?

The only other system I know of that ties users with hosts for access
is
MySql and I don't love that model. Companies usually standardize on
group
authorization anyway, are we 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
Not sure if my newbie vote will count, but I think you are getting pretty
close here.

Couple of things:

1) I know the Session object is from a different JIRA, but I think that
Session should take a Subject rather than just a single Principal. The
reason is that a Subject can have multiple Principals (for example, both
a username and a group, or perhaps someone would want to use both the
username and the client IP as Principals).

2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important because eventually (hopefully sooner rather than later)
we will support multiple types of authentication, each of which may want
to populate the Subject with one or more Principals and perhaps even
credentials (these could be used in the future to hold encryption keys or
perhaps the raw info prior to authentication).

So in this way, if we have different authentication modules, we can add
different types of Principals by extension.

This also allows the same subject to have access to some resources based on
username and some based on group.
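
As a rough illustration of a subject carrying several principals, here is a
small Python sketch. The `Principal` and `Subject` classes, the `grants`
mapping, and the topic names are hypothetical and only model the idea; the
principal type names and the "alice" / "kafka-devs" values are borrowed from
this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    principal_type: str  # e.g. "KafkaUserPrincipal"
    name: str


@dataclass(frozen=True)
class Subject:
    """Hypothetical authenticated subject holding several principals."""
    principals: frozenset


def allowed_resources(subject, grants):
    """Union of the resources granted to any of the subject's principals."""
    resources = set()
    for principal in subject.principals:
        resources |= grants.get(principal, set())
    return resources


alice = Subject(frozenset({
    Principal("KafkaUserPrincipal", "alice"),
    Principal("KafkaGroupPrincipal", "kafka-devs"),
}))

# Illustrative grants: one keyed by username, one keyed by group.
grants = {
    Principal("KafkaUserPrincipal", "alice"): {"topic-a"},
    Principal("KafkaGroupPrincipal", "kafka-devs"): {"topic-b"},
}
```

Here `allowed_resources(alice, grants)` combines the user-based and the
group-based grant, which is the behaviour described in the paragraph above.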

Given that with this we would have different types of Principals, I would
then modify the ACL to look like:

{version:1,
  {acls:[
{
  principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal],
  principals:[alice,kafka-devs





3) The advantage of all of this is that it now provides more flexibility
for custom modules for both authentication and authorization moving forward.



On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira gshap...@cloudera.com
wrote:

 +1 (non-binding)

 Two nitpicks for the wiki:
 * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
 sure new consumers need it to be part of a consumer group.
 * Can you clearly separate which parts are the API (common to every
 Authorizer) and which parts are DefaultAuthorizer implementation? It
 will make reviews and Authorizer implementations a bit easier to know
 exactly which is which.

 Gwen

 On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Hi,
 
  I would like to open KIP-11 for voting.
 
  Thanks
  Parth
 
  On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
  wrote:
 
 Hi Jeff,
 
 Thanks a lot for the review. I think you have a valid point about acls
 being duplicated and the simplest solution would be to modify acls class
 so they hold a set of principals instead of single principal. i.e
 
 user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
 Host1, Host2, Host3.
 
 I think the evaluation order only matters for the permissionType which is
 Deny acls should be evaluated before allow acls. To give you an example
 suppose we have following acls
 
 acl1 - user1 is allowed to READ from all hosts.
 acl2 - host1 is allowed to READ regardless of who is the user.
 acl3 - host2 is allowed to READ regardless of who is the user.
 
 acl4 - user1 is denied to READ from host1.
 
 As stated in the KIP we first evaluate DENY so if user1 tries to access
 from host1 he will be denied(acl4), even though both user1 and host1 has
 acl’s for allow with wildcards (acl1, acl2).
 If user1 tried to READ from host2 , the action will be allowed and it
 does
 not matter if we match acl3 or acl1 so I don’t think the evaluation order
 matters here.
 
 “Will people actually use hosts with users?” I really don’t know but
 given
 ACl’s are part of our Public APIs I thought it is better to try and cover
 more use cases. If others think this extra complexity is not worth the
 value its adding please raise your concerns so we can discuss if it
 should
 be removed from the acl structure. Note that even in absence of hosts
 from
 ACL users will still be able to whitelist/blacklist host as long as we
 start supporting principalType = “host”, easy to add and can be an
 incremental improvement. They will however loose the ability to restrict
 access to users just from a set of hosts.
 
 We agreed to offer a CLI to overcome the JSON acl config
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
 I still like
 Jsons but that probably has something to do with me being a developer
 :-).
 
 Thanks
 Parth
 
 On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:
 
 Parth,
 
 This is a long thread, so trying to keep up here, sorry if this has been
 covered before. First, great job on the KIP proposal and work so far.
 
 Are we sure that we want to tie host level access to a given user? My
 understanding is that the ACL will be (omitting some fields)
 
 user_a, host1, host2, host3
 user_b, host1, host2, host3
 
 So there would potentially be a lot of redundancy in the configs. Does
 it
 make sense to have hosts be at the same level as principal in the
 hierarchy? This way you could just blanket the allowed / denied hosts
 and
 only have to 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Hi,

I would like to open KIP-11 for voting.

Thanks
Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType which is
Deny acls should be evaluated before allow acls. To give you an example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to access
from host1 he will be denied(acl4), even though both user1 and host1 has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and it does
not matter if we match acl3 or acl1 so I don’t think the evaluation order
matters here.

“Will people actually use hosts with users?” I really don’t know but given
ACl’s are part of our Public APIs I thought it is better to try and cover
more use cases. If others think this extra complexity is not worth the
value its adding please raise your concerns so we can discuss if it should
be removed from the acl structure. Note that even in absence of hosts from
ACL users will still be able to whitelist/blacklist host as long as we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like
Jsons but that probably has something to do with me being a developer :-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the order that the perms would be evaluated if a there was
more than one match on a principal ?

Is the thought that there wouldn't usually be much overlap on hosts? I
guess I can imagine a scenario where I want to offline/online access to a
particular hosts or set of hosts and if there was overlap, I'm doing a
bunch of alter commands for just a single host. Maybe this is too
contrived
an example?

I agree that having this level of granularity gives flexibility but I
wonder if people will actually use it and not just * the hosts for a
given
user and create separate global list as i mentioned above?

The only other system I know of that ties users with hosts for access is
MySql and I don't love that model. Companies usually standardize on group
authorization anyway, are we complicating that issue with the inclusion
of
hosts attached to users? Additionally I worry about the debt of big JSON
configs in the first place, most non-developers find them non-intuitive
already, so anything to ease this I think would be beneficial.


Thanks

Jeff

On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Sorry I missed your last questions. I am +0 on adding a --host option for
 --list; we could add it for symmetry. Again, if this is only a CLI change
 it can be added later; if you mean adding this to the authorizer
 interface, then we should make a decision now.

 Given a choice, I would actually like to keep only one option, which is
 the resource-based get (removing even the get based on principal). I see
 those (getAcl for principal or host) as special filtering cases that can
 easily be achieved by a third-party tool that lists all topics, calls
 getAcls for each topic, and applies filtering logic to the results. I
 really don’t see the need to make those first-class citizens of the
 authorizer interface: these kinds of queries will be issued outside of
 the broker JVM, so they will not benefit from the caching, and because
 the storage will be indexed on resource, both options, even as a
 first-class API, will just scan all topic ACLs and apply filtering logic.
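
 The third-party filtering approach described above could look roughly like
 the following Python sketch. The `acls_for_principal` function, the
 dictionary layout of an ACL, and the in-memory `STORE` standing in for the
 resource-indexed storage are all hypothetical; only the idea of listing
 topics and calling a per-topic getAcls comes from the message.

```python
def acls_for_principal(topics, get_acls, principal):
    """Scan every topic's ACLs and keep the ones naming the principal.

    `get_acls` is a stand-in for a resource-based getAcls call.
    """
    result = {}
    for topic in topics:
        hits = [acl for acl in get_acls(topic) if principal in acl["principals"]]
        if hits:
            result[topic] = hits
    return result


# In-memory stand-in for ACL storage that is indexed by resource (topic).
STORE = {
    "topic1": [{"principals": {"user_a"}, "operations": {"READ"}}],
    "topic2": [{"principals": {"user_b"}, "operations": {"WRITE"}}],
}
```

 A call such as `acls_for_principal(STORE, STORE.get, "user_a")` walks every
 topic, which is the same full scan a first-class principal-based API would
 have to perform over resource-indexed storage.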

 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public APIs
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType which
is
Deny acls should be evaluated before allow acls. To give you an example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to access
from host1 he will be denied(acl4), even though both user1 and host1 has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth the
value its adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of hosts
from
ACL users will still be able to whitelist/blacklist host as long as we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like
Jsons but that probably has something to do with me being a developer
:-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has
been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does
it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts
and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the order that the perms would be evaluated if a there
was
more than one match on a principal ?

Is the thought that there wouldn't usually be much overlap on hosts? I
guess I can imagine a scenario where I want to offline/online access
to a
particular hosts or set of hosts and if there was overlap, I'm doing a
bunch of alter commands for just a single host. Maybe this is too
contrived
an example?

I agree that having this level of granularity gives flexibility but I
wonder if people will actually use it and not just * the hosts for a
given
user and create separate global list as i mentioned above?

The only other system I know of that ties users with hosts for access
is
MySql and I don't love that model. Companies usually standardize on
group
authorization anyway, are we complicating that issue with the inclusion
of
hosts attached to users? Additionally I worry about the debt of big
JSON
configs in the first place, most 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Sriharsha Chintalapani
+1 (non-binding)

-- 
Harsha


On April 24, 2015 at 9:59:09 AM, Parth Brahmbhatt (pbrahmbh...@hortonworks.com) 
wrote:

You are right, moved it to the default implementation section.  

Thanks  
Parth  

On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:  

Sample ACL JSON and Zookeeper is in public API, but I thought it is  
part of DefaultAuthorizer (Since Sentry and Argus won't be using  
Zookeeper).  
Am I wrong? Or is it the KIP?  
  
On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt  
pbrahmbh...@hortonworks.com wrote:  
 Thanks for clarifying Gwen, KIP updated.  
  
 I tried to make the distinction by creating a section for all public  
APIs  
  
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
  
 Let me know if you think there is a better way to reflect this.  
  
 Thanks  
 Parth  
  
 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:  
  
+1 (non-binding)  
  
Two nitpicks for the wiki:  
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty  
sure new consumers need it to be part of a consumer group.  
* Can you clearly separate which parts are the API (common to every  
Authorizer) and which parts are DefaultAuthorizer implementation? It  
will make reviews and Authorizer implementations a bit easier to know  
exactly which is which.  
  
Gwen  
  
On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt  
pbrahmbh...@hortonworks.com wrote:  
 Hi,  
  
 I would like to open KIP-11 for voting.  
  
 Thanks  
 Parth  
  
 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com  
 wrote:  
  
Hi Jeff,  
  
Thanks a lot for the review. I think you have a valid point about acls  
being duplicated and the simplest solution would be to modify acls  
class  
so they hold a set of principals instead of single principal. i.e  
  
user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from  
Host1, Host2, Host3.  
  
I think the evaluation order only matters for the permissionType which  
is  
Deny acls should be evaluated before allow acls. To give you an  
example  
suppose we have following acls  
  
acl1 - user1 is allowed to READ from all hosts.  
acl2 - host1 is allowed to READ regardless of who is the user.  
acl3 - host2 is allowed to READ regardless of who is the user.  
  
acl4 - user1 is denied to READ from host1.  
  
As stated in the KIP we first evaluate DENY so if user1 tries to  
access  
from host1 he will be denied(acl4), even though both user1 and host1  
has  
acl’s for allow with wildcards (acl1, acl2).  
If user1 tried to READ from host2 , the action will be allowed and it  
does  
not matter if we match acl3 or acl1 so I don’t think the evaluation  
order  
matters here.  
  
“Will people actually use hosts with users?” I really don’t know but  
given  
ACl’s are part of our Public APIs I thought it is better to try and  
cover  
more use cases. If others think this extra complexity is not worth the  
value its adding please raise your concerns so we can discuss if it  
should  
be removed from the acl structure. Note that even in absence of hosts  
from  
ACL users will still be able to whitelist/blacklist host as long as we  
start supporting principalType = “host”, easy to add and can be an  
incremental improvement. They will however loose the ability to  
restrict  
access to users just from a set of hosts.  
  
We agreed to offer a CLI to overcome the JSON acl config  
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like
Jsons but that probably has something to do with me being a developer  
:-).  
  
Thanks  
Parth  
  
On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:  
  
Parth,  
  
This is a long thread, so trying to keep up here, sorry if this has  
been  
covered before. First, great job on the KIP proposal and work so far.  
  
Are we sure that we want to tie host level access to a given user? My  
understanding is that the ACL will be (omitting some fields)  
  
user_a, host1, host2, host3  
user_b, host1, host2, host3  
  
So there would potentially be a lot of redundancy in the configs.  
Does  
it  
make sense to have hosts be at the same level as principal in the  
hierarchy? This way you could just blanket the allowed / denied hosts  
and  
only have to worry about the users. So if you follow this, then  
  
we can wildcard the user so we can have a separate list of just  
host-based  
access. What's the order that the perms would be evaluated if a there  
was  
more than one match on a principal ?  
  
Is the thought that there wouldn't usually be much overlap on hosts?  
I  
guess I can imagine a scenario where I want to offline/online access  
to a  
particular hosts or set of hosts and if there was overlap, I'm doing  
a  
bunch of alter commands for just a single host. Maybe this is too  

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Thanks for clarifying Gwen, KIP updated.

I tried to make the distinction by creating a section for all public APIs
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

Let me know if you think there is a better way to reflect this.

Thanks
Parth

On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType which
is
Deny acls should be evaluated before allow acls. To give you an example
suppose we have following acls

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who is the user.
acl3 - host2 is allowed to READ regardless of who is the user.

acl4 - user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to access
from host1 he will be denied(acl4), even though both user1 and host1 has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACl’s are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth the
value its adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of hosts
from
ACL users will still be able to whitelist/blacklist host as long as we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however loose the ability to restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
I still like
Jsons but that probably has something to do with me being a developer
:-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has
been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does
it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts
and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the order that the perms would be evaluated if a there
was
more than one match on a principal ?

Is the thought that there wouldn't usually be much overlap on hosts? I
guess I can imagine a scenario where I want to offline/online access to a
particular host or set of hosts and, if there was overlap, I'm doing a
bunch of alter commands for just a single host. Maybe this is too contrived
an example?

I agree that having this level of granularity gives flexibility, but I
wonder if people will actually use it and not just * the hosts for a given
user and create a separate global list as I mentioned above?

The only other system I know of that ties users with hosts for access is
MySQL, and I don't love that model. Companies usually standardize on group
authorization anyway; are we complicating that issue with the inclusion of
hosts attached to users? Additionally, I worry about the debt of big JSON
configs in the first place. Most non-developers find them non-intuitive
already, so anything to ease this would, I think, be beneficial.


Thanks

Jeff

On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Sorry I missed your last questions. I am +0 on adding --host option

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
Sorry - fat fingered send ...


Not sure if my newbie vote will count, but I think you are getting pretty
close here.

Couple of things:

1) I know the Session object is from a different JIRA, but I think that
Session should take a Subject rather than just a single Principal.  The
reason for this is because a Subject can have multiple Principals (for
example both a username and a group or perhaps someone would want to use
both the username and the clientIP as Principals)

2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important as eventually (hopefully sooner rather than later), we will
support multiple types of authentication, each of which may want to populate
the Subject with one or more Principals and perhaps even credentials (this
could be used in the future to hold encryption keys or perhaps the raw info
prior to authentication).

So in this way, if we have different authentication modules, we can add
different types of Principals by extension.

This also allows the same subject to have access to some resources based on
username and some based on group.

Given that with this we would have different types of Principals, I would
then modify the ACL to look like:

{"version": 1,
 "acls": [
   {
     "principal_types": ["KafkaUserPrincipal", "KafkaGroupPrincipal"],
     "principals": ["alice", "kafka-devs"]
     ...

or

{"version": 1,
 "acls": [
   {
     "principals": ["KafkaUserPrincipal:alice", "KafkaGroupPrincipal:kafka-devs"]


But in either case this allows for easy identification of the type of
principal and makes it easy to plug in multiple kinds of principals.

The advantage of all of this is that it now provides more flexibility for
custom modules for both authentication and authorization moving forward.

3) Are you sure that you want authorize to take a session object?  If
we use the model in (1) above, we could just populate the Subject with a
KafkaClientAddressPrincipal and then have access to that when evaluating the
ACLs.

4) What about actually caching authorization decisions?  I know ACLs will
be cached, but the actual authorize decision can be expensive as well?
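
A minimal sketch of points 1) and 2), assuming the JAAS javax.security.auth.Subject is what Session would carry. KafkaUserPrincipal and KafkaGroupPrincipal below are hypothetical classes named after the suggestion above, not existing Kafka types, and SubjectSketch is only a holder for the example.

```java
import java.security.Principal;
import javax.security.auth.Subject;

final class SubjectSketch {
    // Hypothetical principal type for an authenticated user name.
    static final class KafkaUserPrincipal implements Principal {
        private final String name;
        KafkaUserPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    // Hypothetical principal type for a group the user belongs to.
    static final class KafkaGroupPrincipal implements Principal {
        private final String name;
        KafkaGroupPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    // One Subject carrying both a user principal and a group principal.
    static Subject aliceInKafkaDevs() {
        Subject subject = new Subject();
        subject.getPrincipals().add(new KafkaUserPrincipal("alice"));
        subject.getPrincipals().add(new KafkaGroupPrincipal("kafka-devs"));
        return subject;
    }

    // Looks only at principals of the group type, ignoring all others.
    static boolean hasGroup(Subject subject, String group) {
        for (KafkaGroupPrincipal p : subject.getPrincipals(KafkaGroupPrincipal.class)) {
            if (p.getName().equals(group)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Subject subject = aliceInKafkaDevs();
        System.out.println("principals: " + subject.getPrincipals().size());
        System.out.println("in kafka-devs: " + hasGroup(subject, "kafka-devs"));
    }
}
```

An authorizer could then match an ACL against any principal in the Subject, which is what lets the same subject gain access to some resources by username and to others by group.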


 On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira gshap...@cloudera.com
 wrote:

 +1 (non-binding)

 Two nitpicks for the wiki:
 * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
 sure new consumers need it to be part of a consumer group.
 * Can you clearly separate which parts are the API (common to every
 Authorizer) and which parts are DefaultAuthorizer implementation? It
 will make reviews and Authorizer implementations a bit easier to know
 exactly which is which.

 Gwen

 On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
 pbrahmbh...@hortonworks.com wrote:
  Hi,
 
  I would like to open KIP-11 for voting.
 
  Thanks
  Parth
 
  On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
  wrote:
 
 Hi Jeff,
 
 Thanks a lot for the review. I think you have a valid point about acls
 being duplicated and the simplest solution would be to modify acls class
 so they hold a set of principals instead of single principal. i.e
 
 user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
 Host1, Host2, Host3.
 
 I think the 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

It's all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType: Deny
acls should be evaluated before Allow acls. To give you an example,
suppose we have the following acls:

acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.

acl4 - user1 is denied to READ from host1.

As stated in the KIP, we first evaluate DENY, so if user1 tries to access
from host1 he will be denied (acl4), even though both user1 and host1 have
allow acls with wildcards (acl1, acl2).
If user1 tries to READ from host2, the action will be allowed, and it does
not matter whether we match acl3 or acl1, so I don’t think the evaluation
order matters here.
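
The deny-before-allow order in this example can be sketched roughly as follows. This is an illustration only, not the DefaultAuthorizer code: the Acl and AclEvaluator classes, the "*" wildcard, and the refusal when no allow acl matches are all assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical evaluator; not the DefaultAuthorizer from the KIP.
final class AclEvaluator {
    // DENY entries are checked first: one matching DENY overrides every ALLOW.
    static boolean authorize(List<Acl> acls, String user, String host, String op) {
        for (Acl acl : acls) {
            if (acl.deny && acl.matches(user, host, op)) {
                return false;
            }
        }
        for (Acl acl : acls) {
            if (!acl.deny && acl.matches(user, host, op)) {
                return true;
            }
        }
        return false; // assumption: no matching ALLOW means access is refused
    }

    // The four acls from the example in the message.
    static List<Acl> exampleAcls() {
        return Arrays.asList(
            new Acl("user1", "*", "READ", false),     // acl1
            new Acl("*", "host1", "READ", false),     // acl2
            new Acl("*", "host2", "READ", false),     // acl3
            new Acl("user1", "host1", "READ", true)); // acl4
    }

    public static void main(String[] args) {
        List<Acl> acls = exampleAcls();
        System.out.println("user1@host1 READ: " + authorize(acls, "user1", "host1", "READ"));
        System.out.println("user1@host2 READ: " + authorize(acls, "user1", "host2", "READ"));
    }
}

// Hypothetical acl entry; "*" stands for "any user" or "any host".
final class Acl {
    final String user;
    final String host;
    final String operation;
    final boolean deny; // true = DENY, false = ALLOW

    Acl(String user, String host, String operation, boolean deny) {
        this.user = user;
        this.host = host;
        this.operation = operation;
        this.deny = deny;
    }

    boolean matches(String u, String h, String op) {
        return (user.equals("*") || user.equals(u))
            && (host.equals("*") || host.equals(h))
            && operation.equals(op);
    }
}
```

With these four acls, user1 from host1 is refused by acl4 and user1 from host2 is allowed, whichever of acl1 or acl3 is matched first.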

“Will people actually use hosts with users?” I really don’t know, but given
ACLs are part of our public APIs, I thought it better to try and cover
more use cases. If others think this extra complexity is not worth the
value it is adding, please raise your concerns so we can discuss whether it
should be removed from the acl structure. Note that even in the absence of
hosts from the ACL, users will still be able to whitelist/blacklist hosts as
long as we start supporting principalType = “host”, which is easy to add and
can be an incremental improvement. They will, however, lose the ability to
restrict a user's access to just a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
Jsons but that probably has something to do with me being a developer
:-).

Thanks
Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote:

Parth,

This is a long thread, so trying to keep up here, sorry if this has
been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs.
Does
it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts
and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Sorry Gwen, completely misunderstood the question :-).

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
Yes, in the current proposal. I did not see an API to create a group, but if
you have READ permission on a TOPIC and WRITE permission on that Group, you
are free to join and consume.
 

* Will the CLI tool be used to manage group membership too?
Yes and I think that means I need to add --group. Updating the KIP.
Thanks for pointing this out.

* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?
I have considered any auto delete and auto create to be out of scope for the
first release, so right now I am going with preserving the acls. Do you
see any issues with this? Auto deleting would mean the authorizer would
have to get into implementation details of Kafka, which I was trying to
avoid.

Thanks
Parth

On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote:

We are not talking about same Groups :)

I meant, Groups of consumers (which KIP-11 lists as a separate
resource in the Privilege table)

On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 I see Groups as something we can add incrementally in the current model.
 The acls take principalType: name so groups can be represented as group:
 groupName. We are not managing group memberships anywhere in kafka and I
 don’t see the need to do so.

 So for a topic1 using the CLI an admin can add an acl to grant access to
 group:kafka-test-users.

 The authorizer implementation can have a plugin to map an authenticated user
 to groups (this is how Hadoop and Storm work). The plugin could map the
 user to Linux/LDAP/Active Directory groups, but that is again up to
 the implementation.

 What we are offering is an interface that is extensible so these features
 can be added incrementally. I can add support for this in the first
 release but don’t necessarily see why this would be an absolute necessity.

 Thanks
 Parth

 On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote:

Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

Its all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote:

Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote:

+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to
know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, Parth Brahmbhatt
pbrahmbh...@hortonworks.com
 wrote:

Hi Jeff,

Thanks a lot for the review. I think you have a valid point about
acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of single principal. i.e

user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1
from
Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType
which
is
Deny acls should be evaluated before allow acls. To give you an

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
I will move the comments about subject versus principal wrt session to the
PR above.  The comments around keys, etc are more appropriate there.

If I tie this together with my comments in the thread on SASL / Kerberos,
what I am having a hard time figuring out is where the pluggable framework for
both authentication and authorization ends and where the implementation of
specific authentication and authorization providers begins.

As for caching decisions, it just seems silly to authorize on the same
operation over and over again (e.g. publishing to the same topic), but
perhaps if the ACLs are small enough this will be ok.
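
To make the caching idea concrete, here is a rough sketch of a decision cache placed in front of an ACL check. Everything in it is an assumption for illustration: the CachingAuthorizerSketch class, the "principal|operation|resource" key format, and the stand-in check that always allows alice. A real cache would also need to be invalidated whenever ACLs change.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CachingAuthorizerSketch {
    // Hypothetical underlying check: maps a request key to allow/deny.
    private final Function<String, Boolean> aclCheck;
    private final Map<String, Boolean> decisions = new ConcurrentHashMap<>();

    CachingAuthorizerSketch(Function<String, Boolean> aclCheck) {
        this.aclCheck = aclCheck;
    }

    // The ACL check runs only on the first request for a given key.
    boolean authorize(String principal, String operation, String resource) {
        String key = principal + "|" + operation + "|" + resource;
        return decisions.computeIfAbsent(key, aclCheck);
    }

    // Must be called when ACLs change, otherwise stale decisions are served.
    void invalidateAll() {
        decisions.clear();
    }

    // Counts how often the underlying check runs for `calls` identical requests.
    static int evaluationsFor(int calls) {
        AtomicInteger evaluations = new AtomicInteger();
        CachingAuthorizerSketch authorizer = new CachingAuthorizerSketch(key -> {
            evaluations.incrementAndGet();
            return key.startsWith("alice|"); // stand-in for a real ACL lookup
        });
        for (int i = 0; i < calls; i++) {
            authorizer.authorize("alice", "WRITE", "topic1");
        }
        return evaluations.get();
    }

    public static void main(String[] args) {
        System.out.println("ACL evaluations for 3 identical requests: " + evaluationsFor(3));
    }
}
```

Whether this is worth the added invalidation logic depends on how large the ACL sets get, which is the open question here.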



On Fri, Apr 24, 2015 at 2:18 PM, Parth Brahmbhatt 
pbrahmbh...@hortonworks.com wrote:

 Thanks for your comments Gari. My responses are inline.

 Thanks
 Parth

 On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote:

 Sorry - fat fingered send ...
 
 
 Not sure if my newbie vote will count, but I think you are getting
 pretty
 close here.
 
 Couple of things:
 
 1) I know the Session object is from a different JIRA, but I think that
 Session should take a Subject rather than just a single Principal.  The
 reason for this is because a Subject can have multiple Principals (for
 example both a username and a group or perhaps someone would want to use
 both the username and the clientIP as Principals)

 I think the user - group mapping can be done at Authorization
 implementation layer. In any case as you pointed out the session is part
 of another jira and I think a PR is out
 https://reviews.apache.org/r/27204/diff/ and we should discuss it on that
 PR.

 
 2)  We would then also have multiple concrete Principals, e.g.
 
 KafkaPrincipal
 KafkaUserPrincipal
 KafkaGroupPrincipal
 (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
 etc
 
 This is important as eventually (hopefully sooner than later), we will
 support multiple types of authentication which may each want to populate
 the Subject with one or more Principals and perhaps even credentials (this
 could be used in the future to hold encryption keys or perhaps the raw
 info
 prior to authentication).
 
 So in this way, if we have different authentication modules, we can add
 different types of Principals by extension
 
 This also allows the same subject to have access to some resources based
 on
 username and some based on group.
 
 Given that with this we would have different types of Principals, I would
 then modify the ACL to look like:
 
 {"version": 1,
  "acls": [
    {
      "principal_types": ["KafkaUserPrincipal", "KafkaGroupPrincipal"],
      "principals": ["alice", "kafka-devs"]
      ...
 
 or
 
 {"version": 1,
  "acls": [
    {
      "principals": ["KafkaUserPrincipal:alice", "KafkaGroupPrincipal:kafka-devs"]
 
 
 But in either case this allows for easy identification of the type of
 principal and makes it easy to plugin multiple kinds of principals
 
 The advantage of all of this is that it now provides more flexibility for
 custom modules for both authentication and authorization moving forward.

 All the principals that you listed above can be supported with the current
 design. Acls take a KafkaPrincipal as input, which is a combination of type
 and principal name, and the authorizer implementations are free to create
 any extension of this which covers group: groupName, host: HostName,
 kerberos: kerberosUserName and any other types that may come up. I am not
 sure how encryption key storage is relevant to the Authorizer, so it would be
 great if you could elaborate.
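
The "type: name" form described here could be parsed along these lines. This is a sketch under assumptions: TypedPrincipal is a hypothetical stand-in, not the KafkaPrincipal class from the patch, and the tolerance for whitespace after the colon is a guess based on the spelling used in this mail.

```java
final class TypedPrincipal {
    final String type; // e.g. "user", "group", "host", "kerberos"
    final String name;

    TypedPrincipal(String type, String name) {
        this.type = type;
        this.name = name;
    }

    // Splits on the first ':' so that names containing ':' are kept whole.
    static TypedPrincipal parse(String text) {
        int sep = text.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("expected type:name, got " + text);
        }
        String type = text.substring(0, sep).trim();
        String name = text.substring(sep + 1).trim();
        if (type.isEmpty() || name.isEmpty()) {
            throw new IllegalArgumentException("expected type:name, got " + text);
        }
        return new TypedPrincipal(type, name);
    }

    @Override
    public String toString() {
        return type + ":" + name;
    }

    public static void main(String[] args) {
        System.out.println(parse("group:kafka-test-users"));
        System.out.println(parse("host: host1"));
    }
}
```

New principal types then need no change to the ACL format, only an authorizer that knows how to match them.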

 
 3) Are you sure that you want authorize to take a session object?  If
 we use the model in (1) above, we could just populate the Subject with a
 KafkaClientAddressPrincipal and then have access to that when evaluating the
 ACLs.

 I think it is better to take a session, which can just be a wrapper on top
 of Subject + host for now. This allows for extension, which in my opinion
 is more future-proof.

 
 4) What about actually caching authorization decisions?  I know ACLs will
 be cached, but the actual authorize decision can be expensive as well?

 In the default implementation I don’t plan to do this. It is easy to add
 later if we want to, but I am not sure why this would ever be expensive when
 acls are cached, the number of acls on a single topic should be very small,
 and iterating over them with simple string comparison should not really be
 expensive.

 Thanks
 Parth

 
 On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com
 wrote:
 
  Not sure if my newbie vote will count, but I think you are getting
  pretty close here.
 
  Couple of things:
 
  1) I know the Session object is from a different JIRA, but I think that
  Session should take a Subject rather than just a single Principal.  The
  reason for this is because a Subject can have multiple Principals (for
  example both a username and a group or perhaps someone would want to use
  both the username and the clientIP as Principals)
 
  2)  We would then also have multiple