The sample ACL JSON and ZooKeeper are listed under the public API, but I thought they were part of DefaultAuthorizer (since Sentry and Argus won't be using ZooKeeper). Am I wrong? Or is it the KIP?
On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote: > Thanks for clarifying Gwen, KIP updated. > > I tried to make the distinction by creating a section for all public APIs > https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In > terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses > > Let me know if you think there is a better way to reflect this. > > Thanks > Parth > > On 4/24/15, 9:37 AM, "Gwen Shapira" <gshap...@cloudera.com> wrote: > >>+1 (non-binding) >> >>Two nitpicks for the wiki: >>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty >>sure new consumers need it to be part of a consumer group. >>* Can you clearly separate which parts are the API (common to every >>Authorizer) and which parts are DefaultAuthorizer implementation? It >>will make reviews and Authorizer implementations a bit easier to know >>exactly which is which. >> >>Gwen >> >>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt >><pbrahmbh...@hortonworks.com> wrote: >>> Hi, >>> >>> I would like to open KIP-11 for voting. >>> >>> Thanks >>> Parth >>> >>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> >>> wrote: >>> >>>>Hi Jeff, >>>> >>>>Thanks a lot for the review. I think you have a valid point about acls >>>>being duplicated and the simplest solution would be to modify acls class >>>>so they hold a set of principals instead of single principal. i.e >>>> >>>><user_a,user_b> has <READ,WRITE,DESCRIBE> Permissions on <Topic1> from >>>><Host1, Host2, Host3>. >>>> >>>>I think the evaluation order only matters for the permissionType which >>>>is >>>>Deny acls should be evaluated before allow acls. To give you an example >>>>suppose we have following acls >>>> >>>>acl1 -> user1 is allowed to READ from all hosts. >>>>acl2 -> host1 is allowed to READ regardless of who is the user. >>>>acl3 -> host2 is allowed to READ regardless of who is the user. 
>>>> >>>>acl4 -> user1 is denied to READ from host1. >>>> >>>>As stated in the KIP we first evaluate DENY, so if user1 tries to access >>>>from host1 he will be denied (acl4), even though both user1 and host1 have >>>>allow acls with wildcards (acl1, acl2). >>>>If user1 tries to READ from host2, the action will be allowed and it does >>>>not matter if we match acl3 or acl1, so I don’t think the evaluation order >>>>matters here. >>>> >>>>“Will people actually use hosts with users?” I really don’t know, but given >>>>ACLs are part of our public APIs I thought it is better to try and cover >>>>more use cases. If others think this extra complexity is not worth the >>>>value it’s adding please raise your concerns so we can discuss if it should >>>>be removed from the acl structure. Note that even in absence of hosts from >>>>the ACL, users will still be able to whitelist/blacklist hosts as long as we >>>>start supporting principalType = “host”, which is easy to add and can be an >>>>incremental improvement. They will however lose the ability to restrict >>>>access to users just from a set of hosts. >>>> >>>>We agreed to offer a CLI to overcome the JSON acl config >>>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like >>>>JSON but that probably has something to do with me being a developer :-). >>>> >>>>Thanks >>>>Parth >>>> >>>>On 4/22/15, 11:38 AM, "Jeff Holoman" <jholo...@cloudera.com> wrote: >>>> >>>>>Parth, >>>>> >>>>>This is a long thread, so trying to keep up here, sorry if this has been >>>>>covered before. First, great job on the KIP proposal and work so far. >>>>> >>>>>Are we sure that we want to tie host level access to a given user? 
My >>>>>understanding is that the ACL will be (omitting some fields) >>>>> >>>>>user_a, host1, host2, host3 >>>>>user_b, host1, host2, host3 >>>>> >>>>>So there would potentially be a lot of redundancy in the configs. Does >>>>>it >>>>>make sense to have hosts be at the same level as principal in the >>>>>hierarchy? This way you could just blanket the allowed / denied hosts >>>>>and >>>>>only have to worry about the users. So if you follow this, then >>>>> >>>>>we can wildcard the user so we can have a separate list of just >>>>>host-based >>>>>access. What's the order that the perms would be evaluated if a there >>>>>was >>>>>more than one match on a principal ? >>>>> >>>>>Is the thought that there wouldn't usually be much overlap on hosts? I >>>>>guess I can imagine a scenario where I want to offline/online access >>>>>to a >>>>>particular hosts or set of hosts and if there was overlap, I'm doing a >>>>>bunch of alter commands for just a single host. Maybe this is too >>>>>contrived >>>>>an example? >>>>> >>>>>I agree that having this level of granularity gives flexibility but I >>>>>wonder if people will actually use it and not just * the hosts for a >>>>>given >>>>>user and create separate "global" list as i mentioned above? >>>>> >>>>>The only other system I know of that ties users with hosts for access >>>>>is >>>>>MySql and I don't love that model. Companies usually standardize on >>>>>group >>>>>authorization anyway, are we complicating that issue with the inclusion >>>>>of >>>>>hosts attached to users? Additionally I worry about the debt of big >>>>>JSON >>>>>configs in the first place, most non-developers find them non-intuitive >>>>>already, so anything to ease this I think would be beneficial. >>>>> >>>>> >>>>>Thanks >>>>> >>>>>Jeff >>>>> >>>>>On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt < >>>>>pbrahmbh...@hortonworks.com> wrote: >>>>> >>>>>> Sorry I missed your last questions. 
I am +0 on adding a --host option for >>>>>> --list; we could add it for symmetry. Again, if this is only a CLI change it >>>>>> can be added later; if you mean adding this to the authorizer interface then we >>>>>> should make a decision now. >>>>>> >>>>>> Given a choice I would like to actually keep only one option, which is >>>>>> resource based get (remove even the get based on principal). I see those >>>>>> (getAcl for principal or host) as special filtering cases which can easily >>>>>> be achieved by a third party tool by doing "list all topics" and calling >>>>>> getAcls for each topic and applying filtering logic on that. I really >>>>>> don’t see the need to make those first class citizens of the authorizer >>>>>> interface, given these kinds of queries will be issued outside of the broker JVM >>>>>> so they will not benefit from the caching, and because the storage will be >>>>>> indexed on resource both these options even as a first class API will just >>>>>> scan all topic acls and apply filtering logic. >>>>>> >>>>>> Thanks >>>>>> Parth >>>>>> >>>>>> On 4/22/15, 11:08 AM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com> >>>>>> wrote: >>>>>> >>>>>> >Please see all the available options here >>>>>> > >>>>>> >https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I think it >>>>>> >covers both hosts and operations and allows specifying a list for both. 
>>>>>> > >>>>>> >Thanks >>>>>> >Parth >>>>>> > >>>>>> >From: Tom Graves <tgraves...@yahoo.com<mailto:tgraves...@yahoo.com>> >>>>>> >Reply-To: Tom Graves >>>>>><tgraves...@yahoo.com<mailto:tgraves...@yahoo.com>> >>>>>> >Date: Wednesday, April 22, 2015 at 11:02 AM >>>>>> >To: Parth Brahmbhatt >>>>>> ><pbrahmbh...@hortonworks.com<mailto:pbrahmbh...@hortonworks.com>>, >>>>>> >"dev@kafka.apache.org<mailto:dev@kafka.apache.org>" >>>>>> ><dev@kafka.apache.org<mailto:dev@kafka.apache.org>> >>>>>> >Subject: Re: [DISCUSS] KIP-11- Authorization design for kafka >>>>>>security >>>>>> > >>>>>> >Thanks for the explanations Parth. >>>>>> > >>>>>> >On the configs questions, the way I see it is its more likely to >>>>>> >accidentally give everyone access, especially since you have to run >>>>>>a >>>>>> >separate command to change the acls. If there was some config for >>>>>> >defaults, a cluster admin could change that to be nobody or certain >>>>>>set >>>>>> >of users, then grant others permissions. This would also remove the >>>>>>race >>>>>> >between commands. This is something you can always add later though >>>>>>if >>>>>> >people request it. >>>>>> > >>>>>> >So in kafka-acl.sh how do I actually tell it what the operation is? >>>>>> >kafka-acl.sh --topic testtopic --add --grandprincipal >>>>>>user:joe,user:kate >>>>>> > >>>>>> >where does READ, WRITE, etc go? Can specify as a list so I don't >>>>>>have >>>>>>to >>>>>> >run this a bunch of times for each. >>>>>> > >>>>>> >Do you want to have a --host option for --list so that admins could >>>>>>see >>>>>> >what acls apply to specific host(s)? >>>>>> > >>>>>> >Tom >>>>>> > >>>>>> > >>>>>> > >>>>>> >On Wednesday, April 22, 2015 11:38 AM, Parth Brahmbhatt >>>>>> ><pbrahmbh...@hortonworks.com<mailto:pbrahmbh...@hortonworks.com>> >>>>>>wrote: >>>>>> > >>>>>> > >>>>>> > >>>>>> >FYI, I have modified the KIP to include group as resource. 
In order to >>>>>> >access the “joinGroup” and “commitOffset” APIs the user will need READ >>>>>> >permission on topic and WRITE permission on group. >>>>>> > >>>>>> >I plan to open a VOTE thread by noon if there are no more concerns. >>>>>> > >>>>>> >Thanks >>>>>> >Parth >>>>>> > >>>>>> >On 4/22/15, 9:03 AM, "Tom Graves" <tgraves...@yahoo.com.INVALID> wrote: >>>>>> > >>>>>> >>Hey everyone, >>>>>> >>Sorry to jump in on the conversation so late. I'm new to Kafka. I'll >>>>>> >>apologize in advance if you have already covered some of my questions. I >>>>>> >>read through the wiki and had some comments and questions. >>>>>> >>1) public enum Operation needs EDIT changed to ALTER >>>>>> > >>>>>> >> Done. >>>>>> > >>>>>> >>2) Does the Authorizer class need a setAcls? Rather than just add, to be >>>>>> >>able to set an explicit list and overwrite what was there? I see the >>>>>> >>kafka-acl.sh lists a removeall so I guess you could do removeall and then >>>>>> >>add. I also don't see a removeall in the Authorizer class, is it going >>>>>> >>to loop through them all to remove each one? >>>>>> > >>>>>> > There is an overloaded version of removeAcls in the interface that takes >>>>>> >in resource as the only input and, as described in the javadoc, all the acls >>>>>> >attached to that resource will be deleted. To cover the setAcl use case >>>>>> >the caller can first call remove and then add. >>>>>> > >>>>>> >>3) Can someone tell me what the use case is for acls based on the hosts? >>>>>> >>I can see some possibilities, just wondering if we have concrete ones where >>>>>> >>one user is allowed from one host but not another. >>>>>> > >>>>>> > I am not sure if I understand the question, given the use case you >>>>>> >described in your question is what we are trying to cover with use of >>>>>> >hosts in Acl. 
There are some additional use cases like “allow access to >>>>>> >any user from host1,host2” but I think primarily it gives the admins the >>>>>> >ability to define acls at a more granular level. >>>>>> > >>>>>> >>4) I'm a bit unclear how the "resource" works in the Authorizer class. >>>>>> >>From what I see we have 2 resources - topics and cluster. If I want to >>>>>> >>add an acl to allow "joe" to CREATE for the cluster then I call addAcls >>>>>> >>with Acl("user: joe", ALLOW, Set(*), Set(CREATE)) and "cluster"? What >>>>>> >>if I want to call addAcls for DESCRIBE on a topic? Is the resource then >>>>>> >>"topic" or is it the topic name? >>>>>> > >>>>>> > We now have 3 resources (added group), please see the updated doc. The >>>>>> >CREATE acl that you described is correct. For any topic operation you >>>>>> >should use topic name as the resource name and for group the user will >>>>>> >provide groupId as resource name. >>>>>> > >>>>>> >>5) reassigning partitions is a CLUSTER_ACTION or superuser? It's not >>>>>> >>totally clear to me the differences between these. What about increasing >>>>>> >># of partitions? >>>>>> > >>>>>> > I see this as an alter topic operation so it is at topic level and the >>>>>> >user must have alter permissions on topic. >>>>>> > >>>>>> >>6) groups are mentioned, are we supporting right away or is that a follow >>>>>> >>on item? (is there going to be a kafka.supergroups) >>>>>> > >>>>>> > I think it can be a separate jira just for breaking down the code review >>>>>> >into smaller chunks. We will support it in first version but I think if we >>>>>> >can not do it for any reason that should not block a release with all the >>>>>> >other authZ work. We made deliberate design choices (like introducing a >>>>>> >principalType in KafkaPrincipal) to allow supporting groups as an >>>>>> >incremental change. 
>>>>>> > >>>>>> >>7) Are there config options for setting acls when I create my >>>>>>topic? >>>>>>Or >>>>>> >>do I have to create my topic and then run the kafka-acl.sh script >>>>>>to >>>>>>set >>>>>> >>them? Although its very small, there would be possible race there >>>>>>that >>>>>> >>someone could start producing to topic before acls are set. >>>>>> > >>>>>> > We discussed this yesterday and we agreed to go with >>>>>>kafka-acl.sh. >>>>>>Yes >>>>>> >there is a very very small window of vulnerability but I think that >>>>>>really >>>>>> >does not warrant to change the decision in this case. >>>>>> > >>>>>> >>8) are there configs for cluster level acl defaults? Or does it >>>>>>default >>>>>> >>to superusers on bringing up new cluster and you have to modify >>>>>>with >>>>>>cli. >>>>>> >>thanks,Tom >>>>>> > >>>>>> > No defaults, the default is superusers will have full access. I >>>>>>don’t >>>>>> >think making assumptions about ones security requirement should be >>>>>>our >>>>>> >burden. >>>>>> > >>>>>> > >>>>>> >> >>>>>> >> On Tuesday, April 21, 2015 7:10 PM, Parth Brahmbhatt >>>>>> >><pbrahmbh...@hortonworks.com<mailto:pbrahmbh...@hortonworks.com>> >>>>>>wrote: >>>>>> >> >>>>>> >> >>>>>> >> I have added the notes to KIP-11 Open question sections. >>>>>> >> >>>>>> >>Thanks >>>>>> >>Parth >>>>>> >> >>>>>> >>On 4/21/15, 4:49 PM, "Gwen Shapira" >>>>>> >><gshap...@cloudera.com<mailto:gshap...@cloudera.com>> wrote: >>>>>> >> >>>>>> >>>Adding my notes from today's call to the thread: >>>>>> >>> >>>>>> >>>** Deny or Allow all by default? We will add a configuration to >>>>>> >>>control this. The configuration will default to “allow” for >>>>>>backward >>>>>> >>>compatibility. Security admins can set it to "deny" >>>>>> >>> >>>>>> >>>** Storing ACLs for default authorizers: We'll store them in ZK. >>>>>>We'll >>>>>> >>>support pointing the authorizer to any ZK. >>>>>> >>>The use of ZK will be internal to the default authorizer. 
>>>>>>Authorizer >>>>>> >>>reads ACLs from cache every hour. We proposed having mechanism >>>>>> >>>(possibly via new ZK node) to tell broker to refresh the cache >>>>>> >>>immediately. >>>>>> >>> >>>>>> >>>** Support deny as permission type - we agreed to keep this. >>>>>> >>> >>>>>> >>>** Mapping operations to API: We may need to add Group as a >>>>>>resource, >>>>>> >>>with JoinGroup and OffsetCommit require privilege on the consumer >>>>>> >>>group. >>>>>> >>>This can be something we pass now and authorizers can support in >>>>>> >>>future. - Jay will write specifics to the mailing list discussion. >>>>>> >>> >>>>>> >>>On Tue, Apr 21, 2015 at 4:32 PM, Jay Kreps >>>>>> >>><jay.kr...@gmail.com<mailto:jay.kr...@gmail.com>> wrote: >>>>>> >>>> Following up on the KIP discussion. Two options for authorizing >>>>>> >>>>consumers >>>>>> >>>> to read topic "t" as part of group "g": >>>>>> >>>> 1. READ permission on resource /topic/t >>>>>> >>>> 2. READ permission on resource /topic/t AND WRITE permission on >>>>>> >>>>/group/g >>>>>> >>>> >>>>>> >>>> The advantage of (1) is that it is simpler. The disadvantage is >>>>>>that >>>>>> >>>>any >>>>>> >>>> member of any group that reads from t can commit offsets as any >>>>>>other >>>>>> >>>> member of a different group. This doesn't effect data security >>>>>>(who >>>>>> >>>>can >>>>>> >>>> access what) but it is a bit of a management issue--a malicious >>>>>>person >>>>>> >>>>can >>>>>> >>>> cause data loss or duplicates for another consumer by committing >>>>>> >>>>offset. >>>>>> >>>> >>>>>> >>>> I think I favor (2) but it's worth it to think it through. >>>>>> >>>> >>>>>> >>>> -Jay >>>>>> >>>> >>>>>> >>>> On Tue, Apr 21, 2015 at 2:43 PM, Parth Brahmbhatt < >>>>>> >>>> pbrahmbh...@hortonworks.com<mailto:pbrahmbh...@hortonworks.com>> >>>>>> >>>>wrote: >>>>>> >>>> >>>>>> >>>>> Hey Jun, >>>>>> >>>>> >>>>>> >>>>> Yes and we support wild cards for all acl entities principal, >>>>>>hosts >>>>>> >>>>>and >>>>>> >>>>> operation. 
>>>>>> >>>>> >>>>>> >>>>> Thanks >>>>>> >>>>> Parth >>>>>> >>>>> >>>>>> >>>>> On 4/21/15, 9:06 AM, "Jun Rao" <j...@confluent.io> wrote: >>>>>> >>>>> >>>>>> >>>>> >Harsha, Parth, >>>>>> >>>>> > >>>>>> >>>>> >Thanks for the clarification. This makes sense. Perhaps we can clarify the >>>>>> >>>>> >meaning of those rules in the wiki. >>>>>> >>>>> > >>>>>> >>>>> >Related to this, it seems that we need to support wildcard in cli/request >>>>>> >>>>> >protocol for topics? >>>>>> >>>>> > >>>>>> >>>>> >Jun >>>>>> >>>>> > >>>>>> >>>>> >On Mon, Apr 20, 2015 at 9:07 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote: >>>>>> >>>>> > >>>>>> >>>>> >> The iptables on unix supports the DENY operator, not that it should >>>>>> >>>>> >> matter. The deny operator can also be used to specify “allow user1 to READ >>>>>> >>>>> >> from topic1 from all hosts but host1,host2”. Again we could add a host >>>>>> >>>>> >> group semantic and extra complexity around that, not sure if it’s worth it. >>>>>> >>>>> >> In addition with DENY operator you are now not forced to create a special >>>>>> >>>>> >> group just to support the authorization use case. I am not convinced that >>>>>> >>>>> >> the operator itself is really all that confusing. There are 3 practical >>>>>> >>>>> >> use cases: >>>>>> >>>>> >> - Resource with no acl whatsoever -> allow access to everyone (just for >>>>>> >>>>> >> backward compatibility, I would much rather fail closed and force users to >>>>>> >>>>> >> explicitly grant acls that allow access to all users.) 
>>>>>> >>>>> >> - Resource with some acl attached -> only users that have a matching allow >>>>>> >>>>> >> acl are allowed (e.g. “allow READ access to topic1 to user1 from all >>>>>> >>>>> >> hosts”, only user1 has READ access and no other user has access of any >>>>>> >>>>> >> kind) >>>>>> >>>>> >> - Resource with some allow and some deny acl attached -> users are allowed >>>>>> >>>>> >> to perform operation only when they satisfy allow acl and do not have >>>>>> >>>>> >> conflicting deny acl. Users that have no acl (allow or deny) will still not >>>>>> >>>>> >> have any access. (e.g. “allow READ access to topic1 to user1 from all >>>>>> >>>>> >> hosts except host1 and host2”, only user1 has access but not from host1 and >>>>>> >>>>> >> host2) >>>>>> >>>>> >> >>>>>> >>>>> >> I think we need to make a decision on deny primarily because with >>>>>> >>>>> >> introduction of acl management API, Acl is now a public class that will be >>>>>> >>>>> >> used by Ranger/Sentry and other authorization providers. In the current design >>>>>> >>>>> >> the acl has a permissionType enum field with possible values of Allow and >>>>>> >>>>> >> Deny. If we choose to remove deny we can assume all acls to be of allow >>>>>> >>>>> >> type and remove the permissionType field completely. >>>>>> >>>>> >> >>>>>> >>>>> >> Thanks >>>>>> >>>>> >> Parth >>>>>> >>>>> >> >>>>>> >>>>> >> On 4/20/15, 6:12 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote: >>>>>> >>>>> >> >>>>>> >>>>> >> >I think that’s how it’s done in pretty much any system I can think of. 
>>>>>> >>>>> >> > >>>>>> >>>>> >> >>>>>> >>>>> >> >>>>>> >>>>> >>>>>> >>>>> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> > >>>>>> > >>>>>> > >>>>>> >>>>>> >>>>> >>>>> >>>>>-- >>>>>Jeff Holoman >>>>>Systems Engineer >>>> >>> >
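
For readers following the thread, the deny-before-allow evaluation Parth describes with acl1 to acl4 can be sketched roughly as below. This is only an illustration under stated assumptions: the `Acl` tuple, the `matches` and `authorize` functions, and the `"*"` wildcard encoding are hypothetical names, not the KIP-11 classes, and the sketch assumes a resource that already has ACLs attached, so a request matching no allow ACL is refused.

```python
from typing import NamedTuple

WILDCARD = "*"  # hypothetical encoding for "any principal" / "any host"


class Acl(NamedTuple):
    """Illustrative ACL entry; field names are assumptions, not the KIP-11 class."""
    principal: str
    host: str
    operation: str
    permission: str  # "ALLOW" or "DENY"


def matches(acl, principal, host, operation):
    """True if the ACL applies to this request, honoring wildcards."""
    return (acl.principal in (WILDCARD, principal)
            and acl.host in (WILDCARD, host)
            and acl.operation in (WILDCARD, operation))


def authorize(acls, principal, host, operation):
    """Deny wins: any matching DENY refuses, otherwise a matching ALLOW is required."""
    matching = [a for a in acls if matches(a, principal, host, operation)]
    if any(a.permission == "DENY" for a in matching):
        return False
    return any(a.permission == "ALLOW" for a in matching)


# The four ACLs from the example in the thread.
ACLS = [
    Acl("user1", WILDCARD, "READ", "ALLOW"),  # acl1: user1 may READ from all hosts
    Acl(WILDCARD, "host1", "READ", "ALLOW"),  # acl2: anyone may READ from host1
    Acl(WILDCARD, "host2", "READ", "ALLOW"),  # acl3: anyone may READ from host2
    Acl("user1", "host1", "READ", "DENY"),    # acl4: user1 may not READ from host1
]

if __name__ == "__main__":
    print(authorize(ACLS, "user1", "host1", "READ"))  # denied by acl4
    print(authorize(ACLS, "user1", "host2", "READ"))  # allowed by acl1 or acl3
```

Because the deny check runs over all matching ACLs before any allow is considered, the order in which acl1 and acl3 are stored has no effect on the result, which is the point made in the thread.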