+1 (non-binding)

Two nitpicks for the wiki:

* Heartbeat is probably a READ operation and not a CLUSTER operation. I'm pretty sure new consumers need it in order to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? Knowing exactly which is which will make reviews and Authorizer implementations a bit easier.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:

> Hi,
>
> I would like to open KIP-11 for voting.
>
> Thanks
> Parth
>
> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> wrote:
>
>>Hi Jeff,
>>
>>Thanks a lot for the review. I think you have a valid point about acls
>>being duplicated, and the simplest solution would be to modify the acl class
>>so that an acl holds a set of principals instead of a single principal, i.e.
>>
>><user_a,user_b> has <READ,WRITE,DESCRIBE> permissions on <Topic1> from
>><Host1, Host2, Host3>.
>>
>>I think the evaluation order only matters for the permissionType, that is,
>>deny acls should be evaluated before allow acls. To give you an example,
>>suppose we have the following acls:
>>
>>acl1 -> user1 is allowed to READ from all hosts.
>>acl2 -> host1 is allowed to READ regardless of who the user is.
>>acl3 -> host2 is allowed to READ regardless of who the user is.
>>
>>acl4 -> user1 is denied to READ from host1.
>>
>>As stated in the KIP we first evaluate DENY, so if user1 tries to access
>>from host1 he will be denied (acl4), even though both user1 and host1 have
>>allow acls with wildcards (acl1, acl2).
>>If user1 tries to READ from host2, the action will be allowed and it does
>>not matter if we match acl3 or acl1, so I don’t think the evaluation order
>>matters here.
>>
>>“Will people actually use hosts with users?” I really don’t know, but given
>>ACLs are part of our public APIs I thought it is better to try and cover
>>more use cases. If others think this extra complexity is not worth the
>>value it's adding, please raise your concerns so we can discuss if it should
>>be removed from the acl structure. Note that even in the absence of hosts from
>>the ACL, users will still be able to whitelist/blacklist hosts as long as we
>>start supporting principalType = “host”, which is easy to add and can be an
>>incremental improvement.
>>They will however lose the ability to restrict
>>access to users just from a set of hosts.
>>
>>We agreed to offer a CLI to overcome the JSON acl config:
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
>>I still like JSON, but that probably has something to do with me being a developer :-).
>>
>>Thanks
>>Parth
>>
>>On 4/22/15, 11:38 AM, "Jeff Holoman" <jholo...@cloudera.com> wrote:
>>
>>>Parth,
>>>
>>>This is a long thread, so trying to keep up here, sorry if this has been
>>>covered before. First, great job on the KIP proposal and work so far.
>>>
>>>Are we sure that we want to tie host level access to a given user? My
>>>understanding is that the ACL will be (omitting some fields)
>>>
>>>user_a, host1, host2, host3
>>>user_b, host1, host2, host3
>>>
>>>So there would potentially be a lot of redundancy in the configs. Does it
>>>make sense to have hosts be at the same level as principal in the
>>>hierarchy? This way you could just blanket the allowed / denied hosts and
>>>only have to worry about the users. So if you follow this, then
>>>
>>>we can wildcard the user so we can have a separate list of just host-based
>>>access. What's the order that the perms would be evaluated in if there was
>>>more than one match on a principal?
>>>
>>>Is the thought that there wouldn't usually be much overlap on hosts? I
>>>guess I can imagine a scenario where I want to offline/online access to a
>>>particular host or set of hosts and, if there was overlap, I'm doing a
>>>bunch of alter commands for just a single host. Maybe this is too contrived
>>>an example?
>>>
>>>I agree that having this level of granularity gives flexibility but I
>>>wonder if people will actually use it and not just * the hosts for a given
>>>user and create a separate "global" list as I mentioned above?
>>>
>>>The only other system I know of that ties users with hosts for access is
>>>MySQL and I don't love that model. Companies usually standardize on group
>>>authorization anyway; are we complicating that issue with the inclusion of
>>>hosts attached to users? Additionally I worry about the debt of big JSON
>>>configs in the first place; most non-developers find them non-intuitive
>>>already, so anything to ease this I think would be beneficial.
>>>
>>>
>>>Thanks
>>>
>>>Jeff
>>>
>>>On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>
>>>> Sorry I missed your last questions. I am +0 on adding a --host option for
>>>> --list; we could add it for symmetry. Again, if this is only a CLI change it
>>>> can be added later; if you mean adding this in the authorizer interface then we
>>>> should make a decision now.
>>>>
>>>> Given a choice I would actually like to keep only one option, which is the
>>>> resource-based get (remove even the get based on principal). I see those
>>>> (getAcl for principal or host) as special filtering cases which can easily
>>>> be achieved by a third party tool by doing "list all topics",
>>>> calling getAcls for each topic and applying filtering logic on that. I really
>>>> don’t see the need to make those first class citizens of the authorizer
>>>> interface, given these kinds of queries will be issued outside of the broker JVM
>>>> so they will not benefit from the caching, and because the storage will be
>>>> indexed on resource, both these options even as a first class API will just
>>>> scan all topic acls and apply filtering logic.
>>>>
>>>> Thanks
>>>> Parth
>>>>
>>>> On 4/22/15, 11:08 AM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> wrote:
>>>>
>>>> >Please see all the available options here:
>>>> >https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
>>>> >I think it
>>>> >covers both hosts and operations and allows specifying a list for both.
>>>> >
>>>> >Thanks
>>>> >Parth
>>>> >
>>>> >From: Tom Graves <tgraves...@yahoo.com>
>>>> >Reply-To: Tom Graves <tgraves...@yahoo.com>
>>>> >Date: Wednesday, April 22, 2015 at 11:02 AM
>>>> >To: Parth Brahmbhatt <pbrahmbh...@hortonworks.com>, "dev@kafka.apache.org" <dev@kafka.apache.org>
>>>> >Subject: Re: [DISCUSS] KIP-11- Authorization design for kafka security
>>>> >
>>>> >Thanks for the explanations Parth.
>>>> >
>>>> >On the configs questions, the way I see it is it's more likely to
>>>> >accidentally give everyone access, especially since you have to run a
>>>> >separate command to change the acls. If there was some config for
>>>> >defaults, a cluster admin could change that to be nobody or a certain set
>>>> >of users, then grant others permissions. This would also remove the race
>>>> >between commands. This is something you can always add later though if
>>>> >people request it.
>>>> >
>>>> >So in kafka-acl.sh how do I actually tell it what the operation is?
>>>> >kafka-acl.sh --topic testtopic --add --grandprincipal user:joe,user:kate
>>>> >
>>>> >where does READ, WRITE, etc go? Can I specify them as a list so I don't have to
>>>> >run this a bunch of times for each?
>>>> >
>>>> >Do you want to have a --host option for --list so that admins could see
>>>> >what acls apply to specific host(s)?
>>>> >
>>>> >Tom
>>>> >
>>>> >
>>>> >
>>>> >On Wednesday, April 22, 2015 11:38 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>> >
>>>> >
>>>> >
>>>> >FYI, I have modified the KIP to include group as resource.
>>>> >In order to
>>>> >access the “joinGroup” and “commitOffset” APIs the user will need READ
>>>> >permission on the topic and WRITE permission on the group.
>>>> >
>>>> >I plan to open a VOTE thread by noon if there are no more concerns.
>>>> >
>>>> >Thanks
>>>> >Parth
>>>> >
>>>> >On 4/22/15, 9:03 AM, "Tom Graves" <tgraves...@yahoo.com.INVALID> wrote:
>>>> >
>>>> >>Hey everyone,
>>>> >>Sorry to jump in on the conversation so late. I'm new to Kafka. I'll
>>>> >>apologize in advance if you have already covered some of my questions. I
>>>> >>read through the wiki and had some comments and questions.
>>>> >>1) public enum Operation needs EDIT changed to ALTER
>>>> >
>>>> >> Done.
>>>> >
>>>> >>2) Does the Authorizer class need a setAcls? Rather than just add, to be
>>>> >>able to set an explicit list and overwrite what was there? I see the
>>>> >>kafka-acl.sh lists a removeall so I guess you could do removeall and then
>>>> >>add. I also don't see a removeall in the Authorizer class; is it going
>>>> >>to loop through them all to remove each one?
>>>> >
>>>> > There is an overloaded version of removeAcls in the interface that takes
>>>> >in resource as the only input and, as described in the javadoc, all the acls
>>>> >attached to that resource will be deleted. To cover the setAcl use case
>>>> >the caller can first call remove and then add.
>>>> >
>>>> >>3) Can someone tell me what the use case is for acls based on the hosts?
>>>> >>I can see some possibilities, just wondering if we have concrete ones where
>>>> >>one user is allowed from one host but not another.
>>>> >
>>>> > I am not sure if I understand the question, given the use case you
>>>> >described in your question is what we are trying to cover with the use of
>>>> >hosts in Acl.
>>>> >There are some additional use cases like “allow access to
>>>> >any user from host1,host2” but I think primarily it gives the admins the
>>>> >ability to define acls at a more granular level.
>>>> >
>>>> >>4) I'm a bit unclear how the "resource" works in the Authorizer class.
>>>> >>From what I see we have 2 resources - topics and cluster. If I want to
>>>> >>add an acl to allow "joe" to CREATE for the cluster then I call addAcls
>>>> >>with Acl("user: joe", ALLOW, Set(*), Set(CREATE)) and "cluster"? What
>>>> >>if I want to call addAcls for DESCRIBE on a topic? Is the resource then
>>>> >>"topic" or is it the topic name?
>>>> >
>>>> > We now have 3 resources (added group), please see the updated doc. The
>>>> >CREATE acl that you described is correct. For any topic operation you
>>>> >should use the topic name as the resource name, and for group the user will
>>>> >provide the groupId as the resource name.
>>>> >
>>>> >>5) Is reassigning partitions a CLUSTER_ACTION or superuser? It's not
>>>> >>totally clear to me what the differences between these are. What about increasing
>>>> >>the # of partitions?
>>>> >
>>>> > I see this as an alter topic operation, so it is at the topic level and the
>>>> >user must have alter permissions on the topic.
>>>> >
>>>> >>6) groups are mentioned, are we supporting them right away or is that a follow
>>>> >>on item? (is there going to be a kafka.supergroups)
>>>> >
>>>> > I think it can be a separate jira, just for breaking down the code review
>>>> >into smaller chunks. We will support it in the first version, but I think if we
>>>> >cannot do it for any reason that should not block a release with all the
>>>> >other authZ work. We made deliberate design choices (like introducing a
>>>> >principalType in KafkaPrincipal) to allow supporting groups as an
>>>> >incremental change.
>>>> >
>>>> >>7) Are there config options for setting acls when I create my topic?
>>>> >>Or
>>>> >>do I have to create my topic and then run the kafka-acl.sh script to set
>>>> >>them? Although it's very small, there would be a possible race there in that
>>>> >>someone could start producing to the topic before acls are set.
>>>> >
>>>> > We discussed this yesterday and we agreed to go with kafka-acl.sh. Yes,
>>>> >there is a very very small window of vulnerability but I think that really
>>>> >does not warrant changing the decision in this case.
>>>> >
>>>> >>8) are there configs for cluster level acl defaults? Or does it default
>>>> >>to superusers on bringing up a new cluster and you have to modify with the cli.
>>>> >>thanks,Tom
>>>> >
>>>> > No defaults, the default is superusers will have full access. I don’t
>>>> >think making assumptions about one's security requirements should be our
>>>> >burden.
>>>> >
>>>> >
>>>> >>
>>>> >> On Tuesday, April 21, 2015 7:10 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>> >>
>>>> >>
>>>> >> I have added the notes to KIP-11 Open question sections.
>>>> >>
>>>> >>Thanks
>>>> >>Parth
>>>> >>
>>>> >>On 4/21/15, 4:49 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>>> >>
>>>> >>>Adding my notes from today's call to the thread:
>>>> >>>
>>>> >>>** Deny or Allow all by default? We will add a configuration to
>>>> >>>control this. The configuration will default to “allow” for backward
>>>> >>>compatibility. Security admins can set it to "deny"
>>>> >>>
>>>> >>>** Storing ACLs for default authorizers: We'll store them in ZK. We'll
>>>> >>>support pointing the authorizer to any ZK.
>>>> >>>The use of ZK will be internal to the default authorizer. Authorizer
>>>> >>>reads ACLs from cache every hour. We proposed having a mechanism
>>>> >>>(possibly via a new ZK node) to tell the broker to refresh the cache
>>>> >>>immediately.
>>>> >>>
>>>> >>>** Support deny as permission type - we agreed to keep this.
>>>> >>>
>>>> >>>** Mapping operations to API: We may need to add Group as a resource,
>>>> >>>with JoinGroup and OffsetCommit requiring privilege on the consumer
>>>> >>>group.
>>>> >>>This can be something we pass now and authorizers can support in
>>>> >>>future. - Jay will write specifics to the mailing list discussion.
>>>> >>>
>>>> >>>On Tue, Apr 21, 2015 at 4:32 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>> >>>> Following up on the KIP discussion. Two options for authorizing consumers
>>>> >>>> to read topic "t" as part of group "g":
>>>> >>>> 1. READ permission on resource /topic/t
>>>> >>>> 2. READ permission on resource /topic/t AND WRITE permission on /group/g
>>>> >>>>
>>>> >>>> The advantage of (1) is that it is simpler. The disadvantage is that any
>>>> >>>> member of any group that reads from t can commit offsets as any other
>>>> >>>> member of a different group. This doesn't affect data security (who can
>>>> >>>> access what) but it is a bit of a management issue--a malicious person can
>>>> >>>> cause data loss or duplicates for another consumer by committing offsets.
>>>> >>>>
>>>> >>>> I think I favor (2) but it's worth it to think it through.
>>>> >>>>
>>>> >>>> -Jay
>>>> >>>>
>>>> >>>> On Tue, Apr 21, 2015 at 2:43 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>> >>>>
>>>> >>>>> Hey Jun,
>>>> >>>>>
>>>> >>>>> Yes and we support wild cards for all acl entities: principal, hosts and
>>>> >>>>> operation.
>>>> >>>>>
>>>> >>>>> Thanks
>>>> >>>>> Parth
>>>> >>>>>
>>>> >>>>> On 4/21/15, 9:06 AM, "Jun Rao" <j...@confluent.io> wrote:
>>>> >>>>>
>>>> >>>>> >Harsha, Parth,
>>>> >>>>> >
>>>> >>>>> >Thanks for the clarification. This makes sense. Perhaps we can clarify the
>>>> >>>>> >meaning of those rules in the wiki.
>>>> >>>>> >
>>>> >>>>> >Related to this, it seems that we need to support wildcard in cli/request
>>>> >>>>> >protocol for topics?
>>>> >>>>> >
>>>> >>>>> >Jun
>>>> >>>>> >
>>>> >>>>> >On Mon, Apr 20, 2015 at 9:07 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>> >>>>> >
>>>> >>>>> >> The iptables on unix supports the DENY operator, not that it should
>>>> >>>>> >> matter. The deny operator can also be used to specify “allow user1 to READ
>>>> >>>>> >> from topic1 from all hosts but host1,host2”. Again we could add a host
>>>> >>>>> >> group semantic and extra complexity around that, not sure if it's worth it.
>>>> >>>>> >> In addition, with the DENY operator you are now not forced to create a special
>>>> >>>>> >> group just to support the authorization use case. I am not convinced that
>>>> >>>>> >> the operator itself is really all that confusing. There are 3 practical
>>>> >>>>> >> use cases:
>>>> >>>>> >> - Resource with no acl whatsoever -> allow access to everyone (just for
>>>> >>>>> >> backward compatibility, I would much rather fail closed and force users to
>>>> >>>>> >> explicitly grant acls that allow access to all users.)
>>>> >>>>> >> - Resource with some acl attached -> only users that have a matching allow
>>>> >>>>> >> acl are allowed (i.e. “allow READ access to topic1 to user1 from all
>>>> >>>>> >> hosts”, only user1 has READ access and no other user has access of any
>>>> >>>>> >> kind)
>>>> >>>>> >> - Resource with some allow and some deny acl attached -> users are allowed
>>>> >>>>> >> to perform an operation only when they satisfy an allow acl and do not have
>>>> >>>>> >> a conflicting deny acl.
>>>> >>>>> >> Users that have no acl (allow or deny) will still not
>>>> >>>>> >> have any access. (i.e. “allow READ access to topic1 to user1 from all
>>>> >>>>> >> hosts except host1 and host2”, only user1 has access but not from host1 and
>>>> >>>>> >> host2)
>>>> >>>>> >>
>>>> >>>>> >> I think we need to make a decision on deny primarily because, with the
>>>> >>>>> >> introduction of the acl management API, Acl is now a public class that will be
>>>> >>>>> >> used by Ranger/Sentry and other authorization providers. In the current design
>>>> >>>>> >> the acl has a permissionType enum field with possible values of Allow and
>>>> >>>>> >> Deny. If we choose to remove deny we can assume all acls to be of allow
>>>> >>>>> >> type and remove the permissionType field completely.
>>>> >>>>> >>
>>>> >>>>> >> Thanks
>>>> >>>>> >> Parth
>>>> >>>>> >>
>>>> >>>>> >> On 4/20/15, 6:12 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>>> >>>>> >>
>>>> >>>>> >> >I think that's how it's done in pretty much any system I can think of.
>>>> >>>>> >> >
>>>
>>>--
>>>Jeff Holoman
>>>Systems Engineer
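
For anyone skimming the thread, the evaluation rules discussed above (deny checked before allow, wildcards for principal/host/operation, a resource with no acls falling back to a default, and Jay's option 2 for consumers) can be sketched roughly as below. This is only an illustration in Python, not the KIP-11 interface or the DefaultAuthorizer code: the names Acl, authorize, can_consume and allow_if_no_acls are hypothetical and were made up for this sketch.

```python
from dataclasses import dataclass

WILDCARD = "*"


@dataclass(frozen=True)
class Acl:
    """Hypothetical acl shape: a set of principals, hosts and operations."""
    permission_type: str  # "ALLOW" or "DENY"
    principals: frozenset
    hosts: frozenset
    operations: frozenset

    def matches(self, principal, host, operation):
        # A field matches if it holds the wildcard or the exact value.
        def hit(values, value):
            return WILDCARD in values or value in values

        return (hit(self.principals, principal)
                and hit(self.hosts, host)
                and hit(self.operations, operation))


def acl(permission_type, principals, hosts, operations):
    """Small helper so the examples below stay readable."""
    return Acl(permission_type, frozenset(principals),
               frozenset(hosts), frozenset(operations))


def authorize(acls, principal, host, operation, allow_if_no_acls=True):
    """Deny-before-allow evaluation for the acls attached to one resource."""
    if not acls:
        # Resource with no acl at all: fall back to the configured default.
        return allow_if_no_acls
    matching = [a for a in acls if a.matches(principal, host, operation)]
    # Any matching DENY wins, regardless of how many ALLOW acls also match.
    if any(a.permission_type == "DENY" for a in matching):
        return False
    # Otherwise at least one matching ALLOW is required.
    return any(a.permission_type == "ALLOW" for a in matching)


def can_consume(topic_acls, group_acls, principal, host):
    """Jay's option 2: READ on the topic AND WRITE on the group."""
    return (authorize(topic_acls, principal, host, "READ")
            and authorize(group_acls, principal, host, "WRITE"))


# acl1..acl4 from Parth's example earlier in the thread.
topic_acls = [
    acl("ALLOW", {"user1"}, {WILDCARD}, {"READ"}),  # acl1
    acl("ALLOW", {WILDCARD}, {"host1"}, {"READ"}),  # acl2
    acl("ALLOW", {WILDCARD}, {"host2"}, {"READ"}),  # acl3
    acl("DENY", {"user1"}, {"host1"}, {"READ"}),    # acl4
]
group_acls = [acl("ALLOW", {"user1"}, {WILDCARD}, {"WRITE"})]
```

With these acls, authorize(topic_acls, "user1", "host1", "READ") is denied because acl4 matches, while the same request from host2 is allowed by either acl1 or acl3, which is why the order among allow acls does not matter.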