Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
Let me know if you think there is a better way to reflect this.

Thanks
Parth

On 4/24/15, 9:37 AM, "Gwen Shapira" <gshap...@cloudera.com> wrote:

>+1 (non-binding)
>
>Two nitpicks for the wiki:
>* Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group.
>* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.
>
>Gwen
>
>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>> Hi,
>>
>> I would like to open KIP-11 for voting.
>>
>> Thanks
>> Parth
>>
>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> wrote:
>>
>>>Hi Jeff,
>>>
>>>Thanks a lot for the review. I think you have a valid point about acls being duplicated, and the simplest solution would be to modify the Acl class so it holds a set of principals instead of a single principal, i.e.
>>>
>>><user_a,user_b> has <READ,WRITE,DESCRIBE> permissions on <Topic1> from <Host1, Host2, Host3>.
>>>
>>>I think the evaluation order only matters for the permissionType: Deny acls should be evaluated before Allow acls. To give you an example, suppose we have the following acls:
>>>
>>>acl1 -> user1 is allowed to READ from all hosts.
>>>acl2 -> host1 is allowed to READ regardless of who the user is.
>>>acl3 -> host2 is allowed to READ regardless of who the user is.
>>>
>>>acl4 -> user1 is denied READ from host1.
>>>
>>>As stated in the KIP, we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have allow acls with wildcards (acl1, acl2).
>>>If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don’t think the evaluation order matters here.
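
[Editor's sketch] The deny-before-allow rule and the acl1 to acl4 example above can be illustrated roughly as follows. This is a minimal sketch, not the KIP-11 interface: the `Acl` field names, the `authorize` function, the `acl` helper and the `"*"` wildcard encoding are all assumptions made for illustration.

```python
from dataclasses import dataclass

WILDCARD = "*"  # assumed wildcard encoding, for illustration only


@dataclass(frozen=True)
class Acl:
    """Hypothetical ACL shape holding sets of principals, hosts and operations."""
    principals: frozenset
    permission_type: str  # "ALLOW" or "DENY"
    hosts: frozenset
    operations: frozenset

    def matches(self, principal, host, operation):
        # An ACL applies when principal, host and operation all match,
        # either exactly or through the wildcard.
        return (
            (WILDCARD in self.principals or principal in self.principals)
            and (WILDCARD in self.hosts or host in self.hosts)
            and (WILDCARD in self.operations or operation in self.operations)
        )


def authorize(acls, principal, host, operation):
    """Any matching DENY wins; otherwise at least one matching ALLOW is required."""
    matching = [a for a in acls if a.matches(principal, host, operation)]
    if any(a.permission_type == "DENY" for a in matching):
        return False
    return any(a.permission_type == "ALLOW" for a in matching)


def acl(principals, permission_type, hosts, operations):
    # Small hypothetical helper to keep the example table readable.
    return Acl(frozenset(principals), permission_type,
               frozenset(hosts), frozenset(operations))


ACLS = [
    acl({"user1"}, "ALLOW", {WILDCARD}, {"READ"}),   # acl1
    acl({WILDCARD}, "ALLOW", {"host1"}, {"READ"}),   # acl2
    acl({WILDCARD}, "ALLOW", {"host2"}, {"READ"}),   # acl3
    acl({"user1"}, "DENY", {"host1"}, {"READ"}),     # acl4
]
```

With these four entries, `authorize(ACLS, "user1", "host1", "READ")` is denied by acl4 and `authorize(ACLS, "user1", "host2", "READ")` is allowed. Because DENY entries are checked over the whole matching set first, the order of the list does not change the outcome, which is the point made in the message above.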
>>>
>>>“Will people actually use hosts with users?” I really don’t know, but given that ACLs are part of our public APIs I thought it better to try to cover more use cases. If others think this extra complexity is not worth the value it adds, please raise your concerns so we can discuss whether it should be removed from the acl structure. Note that even in the absence of hosts in the ACL, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = “host”, which is easy to add and can be an incremental improvement. They will however lose the ability to restrict access to users just from a set of hosts.
>>>
>>>We agreed to offer a CLI to overcome the JSON acl config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON, but that probably has something to do with me being a developer :-).
>>>
>>>Thanks
>>>Parth
>>>
>>>On 4/22/15, 11:38 AM, "Jeff Holoman" <jholo...@cloudera.com> wrote:
>>>
>>>>Parth,
>>>>
>>>>This is a long thread, so trying to keep up here, sorry if this has been covered before. First, great job on the KIP proposal and work so far.
>>>>
>>>>Are we sure that we want to tie host level access to a given user? My understanding is that the ACL will be (omitting some fields)
>>>>
>>>>user_a, host1, host2, host3
>>>>user_b, host1, host2, host3
>>>>
>>>>So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts be at the same level as principal in the hierarchy? This way you could just blanket the allowed / denied hosts and only have to worry about the users. So if you follow this, then
>>>>
>>>>we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the perms would be evaluated if there was more than one match on a principal?
>>>>
>>>>Is the thought that there wouldn't usually be much overlap on hosts? I guess I can imagine a scenario where I want to offline/online access to a particular host or set of hosts, and if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example?
>>>>
>>>>I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just * the hosts for a given user and create a separate "global" list as I mentioned above?
>>>>
>>>>The only other system I know of that ties users with hosts for access is MySQL, and I don't love that model. Companies usually standardize on group authorization anyway; are we complicating that issue with the inclusion of hosts attached to users? Additionally I worry about the debt of big JSON configs in the first place; most non-developers find them non-intuitive already, so anything to ease this I think would be beneficial.
>>>>
>>>>
>>>>Thanks
>>>>
>>>>Jeff
>>>>
>>>>On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>
>>>>> Sorry I missed your last questions. I am +0 on adding a --host option for --list; we could add it for symmetry. Again, if this is only a CLI change it can be added later; if you mean adding this in the authorizer interface then we should make a decision now.
>>>>>
>>>>> Given a choice I would like to actually keep only one option, which is a resource-based get (removing even the get based on principal). I see those (getAcl for principal or host) as special filtering cases which can easily be achieved by a third-party tool by doing "list all topics", calling getAcls for each topic, and applying filtering logic on that.
>>>>> I really don’t see the need to make those first-class citizens of the authorizer interface, given that these kinds of queries will be issued outside of the broker JVM so they will not benefit from the caching, and because the storage will be indexed on resource, both these options even as a first-class API will just scan all topic acls and apply filtering logic.
>>>>>
>>>>> Thanks
>>>>> Parth
>>>>>
>>>>> On 4/22/15, 11:08 AM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> wrote:
>>>>>
>>>>> >Please see all the available options here
>>>>> >https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI) . I think it covers both hosts and operations and allows you to specify a list for both.
>>>>> >
>>>>> >Thanks
>>>>> >Parth
>>>>> >
>>>>> >From: Tom Graves <tgraves...@yahoo.com>
>>>>> >Reply-To: Tom Graves <tgraves...@yahoo.com>
>>>>> >Date: Wednesday, April 22, 2015 at 11:02 AM
>>>>> >To: Parth Brahmbhatt <pbrahmbh...@hortonworks.com>, "dev@kafka.apache.org" <dev@kafka.apache.org>
>>>>> >Subject: Re: [DISCUSS] KIP-11- Authorization design for kafka security
>>>>> >
>>>>> >Thanks for the explanations Parth.
>>>>> >
>>>>> >On the configs questions, the way I see it, it's more likely to accidentally give everyone access, especially since you have to run a separate command to change the acls. If there was some config for defaults, a cluster admin could change that to be nobody or a certain set of users, then grant others permissions. This would also remove the race between commands.
>>>>> >This is something you can always add later though if people request it.
>>>>> >
>>>>> >So in kafka-acl.sh how do I actually tell it what the operation is?
>>>>> >kafka-acl.sh --topic testtopic --add --grandprincipal user:joe,user:kate
>>>>> >
>>>>> >where does READ, WRITE, etc go? Can I specify them as a list so I don't have to run this a bunch of times for each?
>>>>> >
>>>>> >Do you want to have a --host option for --list so that admins could see what acls apply to specific host(s)?
>>>>> >
>>>>> >Tom
>>>>> >
>>>>> >
>>>>> >
>>>>> >On Wednesday, April 22, 2015 11:38 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>> >
>>>>> >
>>>>> >
>>>>> >FYI, I have modified the KIP to include group as a resource. In order to access the “joinGroup” and “commitOffset” APIs the user will need READ permission on the topic and WRITE permission on the group.
>>>>> >
>>>>> >I plan to open a VOTE thread by noon if there are no more concerns.
>>>>> >
>>>>> >Thanks
>>>>> >Parth
>>>>> >
>>>>> >On 4/22/15, 9:03 AM, "Tom Graves" <tgraves...@yahoo.com.INVALID> wrote:
>>>>> >
>>>>> >>Hey everyone,
>>>>> >>Sorry to jump in on the conversation so late. I'm new to Kafka. I'll apologize in advance if you have already covered some of my questions. I read through the wiki and had some comments and questions.
>>>>> >>1) public enum Operation needs EDIT changed to ALTER
>>>>> >
>>>>> >> Done.
>>>>> >
>>>>> >>2) Does the Authorizer class need a setAcls? Rather than just add, to be able to set an explicit list and overwrite what was there? I see the kafka-acl.sh lists a removeall so I guess you could do removeall and then add. I also don't see a removeall in the Authorizer class; is it going to loop through them all to remove each one?
>>>>> >
>>>>> > There is an overloaded version of removeAcls in the interface that takes in resource as the only input, and as described in the javadoc all the acls attached to that resource will be deleted. To cover the setAcl use case the caller can first call remove and then add.
>>>>> >
>>>>> >>3) Can someone tell me what the use case is for acls based on the hosts? I can see some possibilities, just wondering if we have concrete ones where one user is allowed from one host but not another.
>>>>> >
>>>>> > I am not sure I understand the question, given that the use case you described in your question is what we are trying to cover with the use of hosts in Acl. There are some additional use cases like “allow access to any user from host1,host2”, but I think primarily it gives the admins the ability to define acls at a more granular level.
>>>>> >
>>>>> >>4) I'm a bit unclear how the "resource" works in the Authorizer class. From what I see we have 2 resources - topics and cluster. If I want to add an acl to allow "joe" to CREATE for the cluster then I call addAcls with Acl("user: joe", ALLOW, Set(*), Set(CREATE)) and "cluster"? What if I want to call addAcls for DESCRIBE on a topic? Is the resource then "topic" or is it the topic name?
>>>>> >
>>>>> > We now have 3 resources (added group), please see the updated doc. The CREATE acl that you described is correct. For any topic operation you should use the topic name as the resource name, and for group the user will provide the groupId as the resource name.
>>>>> >
>>>>> >>5) Is reassigning partitions a CLUSTER_ACTION or superuser? It's not totally clear to me what the differences between these are. What about increasing # of partitions?
>>>>> >
>>>>> > I see this as an alter topic operation, so it is at topic level and the user must have alter permissions on the topic.
>>>>> >
>>>>> >>6) groups are mentioned, are we supporting them right away or is that a follow-on item? (is there going to be a kafka.supergroups)
>>>>> >
>>>>> > I think it can be a separate jira just for breaking down the code review into smaller chunks. We will support it in the first version, but I think if we cannot do it for any reason that should not block a release with all the other authZ work. We made deliberate design choices (like introducing a principalType in KafkaPrincipal) to allow supporting groups as an incremental change.
>>>>> >
>>>>> >>7) Are there config options for setting acls when I create my topic? Or do I have to create my topic and then run the kafka-acl.sh script to set them? Although it's very small, there would be a possible race there in that someone could start producing to the topic before acls are set.
>>>>> >
>>>>> > We discussed this yesterday and we agreed to go with kafka-acl.sh. Yes, there is a very very small window of vulnerability, but I think that really does not warrant changing the decision in this case.
>>>>> >
>>>>> >>8) are there configs for cluster level acl defaults? Or does it default to superusers on bringing up a new cluster and you have to modify with the cli?
>>>>> >>thanks,Tom
>>>>> >
>>>>> > No defaults; the default is that superusers will have full access. I don’t think making assumptions about one's security requirements should be our burden.
>>>>> >
>>>>> >
>>>>> >>
>>>>> >> On Tuesday, April 21, 2015 7:10 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>> >>
>>>>> >>
>>>>> >> I have added the notes to KIP-11 Open question sections.
>>>>> >>
>>>>> >>Thanks
>>>>> >>Parth
>>>>> >>
>>>>> >>On 4/21/15, 4:49 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>>>> >>
>>>>> >>>Adding my notes from today's call to the thread:
>>>>> >>>
>>>>> >>>** Deny or Allow all by default? We will add a configuration to control this. The configuration will default to “allow” for backward compatibility. Security admins can set it to "deny"
>>>>> >>>
>>>>> >>>** Storing ACLs for default authorizers: We'll store them in ZK. We'll support pointing the authorizer to any ZK.
>>>>> >>>The use of ZK will be internal to the default authorizer. Authorizer reads ACLs from cache every hour. We proposed having a mechanism (possibly via a new ZK node) to tell the broker to refresh the cache immediately.
>>>>> >>>
>>>>> >>>** Support deny as permission type - we agreed to keep this.
>>>>> >>>
>>>>> >>>** Mapping operations to API: We may need to add Group as a resource, with JoinGroup and OffsetCommit requiring privilege on the consumer group.
>>>>> >>>This can be something we pass now and authorizers can support in future. - Jay will write specifics to the mailing list discussion.
>>>>> >>>
>>>>> >>>On Tue, Apr 21, 2015 at 4:32 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>> >>>> Following up on the KIP discussion. Two options for authorizing consumers to read topic "t" as part of group "g":
>>>>> >>>> 1. READ permission on resource /topic/t
>>>>> >>>> 2. READ permission on resource /topic/t AND WRITE permission on /group/g
>>>>> >>>>
>>>>> >>>> The advantage of (1) is that it is simpler. The disadvantage is that any member of any group that reads from t can commit offsets as any other member of a different group.
>>>>> >>>> This doesn't affect data security (who can access what) but it is a bit of a management issue--a malicious person can cause data loss or duplicates for another consumer by committing offsets.
>>>>> >>>>
>>>>> >>>> I think I favor (2) but it's worth it to think it through.
>>>>> >>>>
>>>>> >>>> -Jay
>>>>> >>>>
>>>>> >>>> On Tue, Apr 21, 2015 at 2:43 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>> >>>>
>>>>> >>>>> Hey Jun,
>>>>> >>>>>
>>>>> >>>>> Yes, and we support wildcards for all acl entities: principal, hosts and operation.
>>>>> >>>>>
>>>>> >>>>> Thanks
>>>>> >>>>> Parth
>>>>> >>>>>
>>>>> >>>>> On 4/21/15, 9:06 AM, "Jun Rao" <j...@confluent.io> wrote:
>>>>> >>>>>
>>>>> >>>>> >Harsha, Parth,
>>>>> >>>>> >
>>>>> >>>>> >Thanks for the clarification. This makes sense. Perhaps we can clarify the meaning of those rules in the wiki.
>>>>> >>>>> >
>>>>> >>>>> >Related to this, it seems that we need to support wildcard in cli/request protocol for topics?
>>>>> >>>>> >
>>>>> >>>>> >Jun
>>>>> >>>>> >
>>>>> >>>>> >On Mon, Apr 20, 2015 at 9:07 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>> >>>>> >
>>>>> >>>>> >> The iptables on unix supports the DENY operator, not that it should matter. The deny operator can also be used to specify “allow user1 to READ from topic1 from all hosts but host1,host2”. Again we could add a host group semantic and extra complexity around that, not sure if it's worth it.
>>>>> >>>>> >> In addition, with the DENY operator you are not forced to create a special group just to support the authorization use case. I am not convinced that the operator itself is really all that confusing. There are 3 practical use cases:
>>>>> >>>>> >> - Resource with no acl whatsoever -> allow access to everyone (just for backward compatibility; I would much rather fail closed and force users to explicitly grant acls that allow access to all users.)
>>>>> >>>>> >> - Resource with some acl attached -> only users that have a matching allow acl are allowed (i.e. “allow READ access to topic1 to user1 from all hosts”, only user1 has READ access and no other user has access of any kind)
>>>>> >>>>> >> - Resource with some allow and some deny acl attached -> users are allowed to perform an operation only when they satisfy an allow acl and do not have a conflicting deny acl. Users that have no acl (allow or deny) will still not have any access. (i.e. “allow READ access to topic1 to user1 from all hosts except host1 and host2”, only user1 has access but not from host1 and host2)
>>>>> >>>>> >>
>>>>> >>>>> >> I think we need to make a decision on deny primarily because with the introduction of the acl management API, Acl is now a public class that will be used by Ranger/Sentry and other authorization providers.
>>>>> >>>>> >> In the current design the acl has a permissionType enum field with possible values of Allow and Deny. If we choose to remove deny we can assume all acls to be of allow type and remove the permissionType field completely.
>>>>> >>>>> >>
>>>>> >>>>> >> Thanks
>>>>> >>>>> >> Parth
>>>>> >>>>> >>
>>>>> >>>>> >> On 4/20/15, 6:12 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>>>> >>>>> >>
>>>>> >>>>> >> >I think that's how it's done in pretty much any system I can think of.
>>>>
>>>>--
>>>>Jeff Holoman
>>>>Systems Engineer
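
[Editor's sketch] Option (2) from Jay's message in this thread (READ on /topic/t AND WRITE on /group/g) could be checked roughly as below. This is only an illustration under stated assumptions: the `can_consume` function, the flat `GRANTS` set of (principal, operation, resource) tuples and the user names are hypothetical and are not part of KIP-11 or the Authorizer interface.

```python
def can_consume(grants, principal, topic, group):
    """Option (2): require READ on the topic AND WRITE on the consumer group.

    `grants` is a hypothetical set of (principal, operation, resource) tuples,
    with resources written in the /topic/<name> and /group/<id> form used in
    the thread.
    """
    has_topic_read = (principal, "READ", f"/topic/{topic}") in grants
    has_group_write = (principal, "WRITE", f"/group/{group}") in grants
    return has_topic_read and has_group_write


# Illustrative data: alice may read t and commit offsets for g; bob may only read t.
GRANTS = {
    ("alice", "READ", "/topic/t"),
    ("alice", "WRITE", "/group/g"),
    ("bob", "READ", "/topic/t"),
}
```

Under option (1) bob would be able to commit offsets for group g because he can read t; with the combined check he cannot, which is the management issue Jay describes.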