Re: [DISCUSSION] KIP-11: ACL Management

Don Bosco Durai Fri, 17 Apr 2015 23:50:35 -0700

Jun

Here is the recent interface from Hive. It is grossly simplified from what
you pasted before.
https://hive.apache.org/javadocs/r1.0.0/api/ql/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.html



At a high level, there is only one authorization method for permission
checks (similar to Parth’s proposal). The method name is
checkPrivileges().

They have also exposed grantPrivileges() and revokePrivileges() APIs (as
suggested by Gwen). How these are implemented is up to the plugin
implementor. The Kafka interface can ignore all Role/Group-related
privileges for now.
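
To make the shape of such an interface concrete, here is a minimal Java
sketch of a three-method authorizer along these lines. It is not Hive's
actual HiveAuthorizer signature and not a proposed Kafka API: the names
SketchAuthorizer and InMemoryAuthorizer, the String-based parameters, and
the in-memory map (standing in for whatever store a plugin owns) are all
assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: method names mirror the three Hive methods,
// but the signatures are simplified for illustration.
interface SketchAuthorizer {
    boolean checkPrivileges(String user, String topic, String action);
    void grantPrivileges(String user, String topic, String action);
    void revokePrivileges(String user, String topic, String action);
}

// In-memory stand-in for a plugin-owned ACL store (assumption: a real
// plugin such as Ranger would read and write its own database instead).
class InMemoryAuthorizer implements SketchAuthorizer {
    private final Map<String, Set<String>> grants = new ConcurrentHashMap<>();

    private static String key(String user, String topic) {
        return user + "@" + topic;
    }

    @Override
    public boolean checkPrivileges(String user, String topic, String action) {
        // Deny by default: access is allowed only if an explicit grant exists.
        Set<String> actions = grants.get(key(user, topic));
        return actions != null && actions.contains(action);
    }

    @Override
    public void grantPrivileges(String user, String topic, String action) {
        grants.computeIfAbsent(key(user, topic), k -> ConcurrentHashMap.newKeySet())
              .add(action);
    }

    @Override
    public void revokePrivileges(String user, String topic, String action) {
        Set<String> actions = grants.get(key(user, topic));
        if (actions != null) {
            actions.remove(action);
        }
    }
}

class AuthorizerSketchDemo {
    public static void main(String[] args) {
        SketchAuthorizer auth = new InMemoryAuthorizer();
        auth.grantPrivileges("u1", "t1", "READ");
        System.out.println(auth.checkPrivileges("u1", "t1", "READ"));
        auth.revokePrivileges("u1", "t1", "READ");
        System.out.println(auth.checkPrivileges("u1", "t1", "READ"));
    }
}
```

The point of the sketch is only that the caller sees three methods, while
where the grants live stays entirely inside the implementation.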

In the Apache Ranger implementation, we store the ACLs in Ranger's central
database. The ACLs can be updated via the Ranger UI or REST APIs; Ranger
doesn’t provide CLIs. In the Hive case, when someone uses the Hive CLI for
grant/revoke, the Ranger implementation of the grantPrivileges() and
revokePrivileges() methods inside HiveServer2 updates Ranger's database.
So regardless of how you configure them, the ACLs always land in the
Ranger database, and the Ranger plugin enforces the policies from that
database.

Giving the plugin the flexibility to keep the ACLs in its own store lets
the plugin implementor extend the authorization features without putting
too much burden on, or creating a dependency on, Kafka development.

In my (maybe biased) opinion, we should keep the default Kafka plugin
implementation and configuration simple for the initial release. As
discussed so far, the default out-of-the-box (OOTB) Kafka implementation
should provide topic-level permissions for users and IPs. This will
address the requirements of most users.

I like the simplicity of Gwen's command line, but I agree with Parth that
we will have to make it flexible enough to add groups, etc. in the future.
>kafka-topic --topic t1 --grant user --action action
>kafka-topic --topic t1 --revoke user --action action


So something like this might be good (but doesn’t have to be exactly like
this):
kafka-topic --perm --topic t1,t2 --grantuser u1,u2 --granthost h1,h2
--revokeuser u3,u4 --revokehost h3,h4 --action a1,a2
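
As a rough illustration of how such an option layout could be parsed, the
Java sketch below splits each comma-separated option into a list. The
option names come from the example command above and are only a proposal
in this thread; the class PermArgsSketch and its parse method are
hypothetical and not part of any existing Kafka tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PermArgsSketch {
    // Options that take a comma-separated list, as in the example command.
    static final List<String> LIST_OPTIONS = Arrays.asList(
        "--topic", "--grantuser", "--granthost",
        "--revokeuser", "--revokehost", "--action");

    // Returns option name -> list of values, in the order the options appeared.
    static Map<String, List<String>> parse(String[] args) {
        Map<String, List<String>> parsed = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String opt = args[i];
            if (opt.equals("--perm")) {
                continue; // mode switch only, carries no value
            }
            if (!LIST_OPTIONS.contains(opt)) {
                throw new IllegalArgumentException("Unknown option: " + opt);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + opt);
            }
            // Consume the next argument as this option's comma-separated value.
            List<String> values = Arrays.asList(args[++i].split(","));
            parsed.computeIfAbsent(opt, k -> new ArrayList<>()).addAll(values);
        }
        return parsed;
    }

    public static void main(String[] args) {
        String[] demo = {"--perm", "--topic", "t1,t2",
                         "--grantuser", "u1,u2", "--action", "a1,a2"};
        System.out.println(parse(demo));
    }
}
```

Adding groups later would then mean adding one more entry (for example a
--grantgroup option, also hypothetical) to the list of known options,
without changing the overall command shape.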

Thanks

Bosco


On 4/17/15, 10:48 PM, "Parth Brahmbhatt" <pbrahmbh...@hortonworks.com>
wrote:

>I have copied Thejas from the Hive team on the cc list. Here is what I
>learned from him:
>
>* Hive calls the authorizer plugin if you execute "grant/revoke Operation
>to User on Table". They do this because Hive provides the SQL layer, and
>SQL has standards for grant/revoke, which they follow.
>* If the plugin provides more entities than can be expressed by the
>above statement (like unix/ldap groups or host-level control), you have to
>go to the plugin’s CLI/UI to create this ACL.
>
>So, as mentioned below, you will have 2 tools: one for very basic
>grant/revoke access and, for anything complex, a secondary
>interface provided by the Authorizer plugin.
>
>Thanks
>Parth
>
>On 4/17/15, 12:01 PM, "Jun Rao" <j...@confluent.io> wrote:
>
>>Hi, Parth,
>>
>>How does this work in Hive? I thought authorization in Hive always goes
>>through its SQL CLI for any authorization plugin. When integrating with
>>Ranger (Argus), does Hive do authorization through a separate CLI?
>>
>>Thanks,
>>
>>Jun
>>
>>
>>On Fri, Apr 17, 2015 at 11:01 AM, Parth Brahmbhatt <
>>pbrahmbh...@hortonworks.com> wrote:
>>
>>> We could do this, but I think it's too simplistic, plus now we are
>>> adding authorization-related options to the CLI, which I thought
>>> everyone wanted to avoid.
>>>
>>> When I say it's too simplistic, I mean there are missing options like
>>> --hosts. And what happens when we start supporting groups? We will
>>> probably end up adding "--grant --groups". I think we will just end up
>>> polluting the Kafka create CLI with all the different ACL options, or we
>>> will have 2 CLIs: one for the basic stuff, and a different tool for
>>> anything advanced. It might be better to just have a single separate ACL
>>> management CLI.
>>>
>>> Thanks
>>> Parth
>>>
>>> On 4/17/15, 10:42 AM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>>
>>> >I've probably been a DBA for too long, but I imagined something like:
>>> >kafka-topic --topic t1 --grant user --action action
>>> >kafka-topic --topic t1 --revoke user --action action
>>> >(i.e. the commandline equivalent of "grant select on table1 to
>>> >gwenshap" and "revoke select on table2 from gwenshap")
>>> >
>>> >When you need gazillion of them, you generate a script with gazillion
>>> >of those and execute.
>>> >
>>> >Maybe it just looks reasonable to me because I'm used to it, though :)
>>> >
>>> >Or maybe including the JSON parsing code in TopicCommand is not so
>>> >bad?
>>> >
>>> >
>>> >
>>> >On Fri, Apr 17, 2015 at 10:30 AM, Parth Brahmbhatt
>>> ><pbrahmbh...@hortonworks.com> wrote:
>>> >> * Yes, Acl pretty much captures everything. Originally I had resource
>>> >> as part of Acls; we can go back to that.
>>> >> * The describe can call getAcl and I plan to do so. addAcl is tricky
>>> >> because the user will have to specify the acls through the command
>>> >> line, which will probably be the location of some file. Basically the
>>> >> CLI won't know how to parse user input and convert it to a
>>> >> principal/acl that the plugin understands. We could add an API in the
>>> >> authorizer that can take a file as input if we want --acl as an option
>>> >> during create.
>>> >> * Yes, also getAcls(Principal principal).
>>> >>
>>> >> Thanks
>>> >> Parth
>>> >>
>>> >>
>>> >> On 4/17/15, 10:05 AM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
>>> >>
>>> >>>On Fri, Apr 17, 2015 at 9:31 AM, Parth Brahmbhatt
>>> >>><pbrahmbh...@hortonworks.com> wrote:
>>> >>>> I was following the Storm model, but I think this is a reasonable
>>> >>>> change. I recommend changing the API names to addAcls, removeAcls,
>>> >>>> and getAcls.
>>> >>>
>>> >>>And they probably just need to get List<Acl> instead of everything I
>>> >>>specified? Looks like Acl encapsulates everything we need.
>>> >>>
>>> >>>> Couple of points to ensure we are on the same page:
>>> >>>> * With this approach the kafka command line will not provide a way
>>> >>>> to add/edit acls during topic creation, nor will it provide a way to
>>> >>>> modify the acls. It will be up to the authorizer to either define a
>>> >>>> command line utility or to allow other means to add acls
>>> >>>> (CLI/UI/REST). For the default implementation we can provide a CLI.
>>> >>>
>>> >>>You looked into this deeper than I did - is there a reason
>>> >>>TopicCommand can't invoke addACL and getACL?
>>> >>>
>>> >>>> * We probably want to add List<Acl> getAcls(Resource resource) so
>>> >>>> users can list all acls on a topic.
>>> >>>
>>> >>>Also getAcls(Principal princ)?
>>> >>>
>>> >>>>
>>> >>>> I haven't looked at how consumer offsets are currently stored, so I
>>> >>>> will have to take a look, but I think that is an implementation
>>> >>>> detail.
>>> >>>>
>>> >>>> Gwen, Jun, and other interested parties, do you have time to jump on
>>> >>>> a quick hangout so we can go over some of the lower-level details?
>>> >>>>
>>> >>>> Thanks
>>> >>>> Parth
>>> >>>> From: Tong Li <liton...@us.ibm.com>
>>> >>>> Reply-To: "dev@kafka.apache.org" <dev@kafka.apache.org>
>>> >>>> Date: Friday, April 17, 2015 at 7:34 AM
>>> >>>> To: "dev@kafka.apache.org" <dev@kafka.apache.org>
>>> >>>> Subject: Re: [DISCUSSION] KIP-11: ACL Management
>>> >>>>
>>> >>>>
>>> >>>> Gwen,
>>> >>>>          There is a product called ElasticSearch which has been
>>> >>>> quite successful. They recently added security, and what they
>>> >>>> actually did is quite nice. They really separated Authentication and
>>> >>>> Authorization, which many people get confused about and often mix
>>> >>>> up. I looked through what they did and was quite impressed by it; I
>>> >>>> think there are many things we can borrow from it. Here is a link:
>>> >>>> http://www.elastic.co/guide/en/shield/current/architecture.html
>>> >>>>
>>> >>>> The product is called "shield" and is implemented as an
>>> >>>> ElasticSearch plugin. The promise here is that you can have a
>>> >>>> running ElasticSearch, then you install this plugin, configure it,
>>> >>>> and your ElasticSearch service is secured. The goal should really be
>>> >>>> the same for Kafka: you have a Kafka service running, you install a
>>> >>>> new plugin (in this case a security plugin), configure it, and your
>>> >>>> Kafka service is secured.
>>> >>>>
>>> >>>> I think the key here is that we should introduce a truly pluggable
>>> >>>> framework in Kafka which allows security, quota, encryption,
>>> >>>> compression, and serialization/deserialization all to be developed
>>> >>>> as plugins that can easily be added to and configured on a running
>>> >>>> Kafka service, at which point the functions/features provided by the
>>> >>>> plugins start working. Once we have this framework in, how a
>>> >>>> security plugin works internally becomes the concern of that plugin.
>>> >>>> For example, how a new user gets registered and how permissions are
>>> >>>> granted and revoked will all be the concern of that plugin; the rest
>>> >>>> of the Kafka components should not really be concerned about them.
>>> >>>> This way we are really following the design principle of separation
>>> >>>> of concerns.
>>> >>>>
>>> >>>> With all that, what I am proposing is the introduction of a truly
>>> >>>> pluggable framework into Kafka, which I have also talked about in a
>>> >>>> previous email. For security we can implement a simple file-based
>>> >>>> security plugin; other plugins such as LDAP and AD for
>>> >>>> authentication can come later, and plugins for authorization such as
>>> >>>> RBAC can also come later if people care enough about using them.
>>> >>>>
>>> >>>> Thanks.
>>> >>>>
>>> >>>> Tong Li
>>> >>>> OpenStack & Kafka Community Development
>>> >>>> Building 501/B205
>>> >>>> liton...@us.ibm.com<mailto:liton...@us.ibm.com>
>>> >>>>
>>> >>>>
>>> >>>> From: Gwen Shapira <gshap...@cloudera.com>
>>> >>>> To: "dev@kafka.apache.org" <dev@kafka.apache.org>
>>> >>>> Date: 04/16/2015 12:44 PM
>>> >>>> Subject: [DISCUSSION] KIP-11: ACL Management
>>> >>>>
>>> >>>> ________________________________
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> Hi Kafka Authorization Fans,
>>> >>>>
>>> >>>> I'm starting a new thread on a specific sub-topic of KIP-11, since
>>> >>>> this is a bit long :)
>>> >>>>
>>> >>>> Currently KIP-11, as I understand it, proposes:
>>> >>>> * Authorizers are pluggable, with Kafka providing DefaultAuthorizer.
>>> >>>> * Kafka tools allow adding / managing ACLs.
>>> >>>> * Those ACLs are stored in ZK and cached in a new TopicCache
>>> >>>> * Authorizers can either use the ACLs defined and stored in Kafka,
>>> >>>> or define and use their own.
>>> >>>>
>>> >>>> I am concerned about two possible issues with this design:
>>> >>>> 1. Separation of concerns - only authorizers should worry about
>>> >>>> ACLs, and therefore the less ACL code that exists in Kafka core, the
>>> >>>> better.
>>> >>>> 2. User confusion - It sounded like we can define ACLs in Kafka
>>> >>>> itself but authorizers can also define their own, so "kafka-topics
>>> >>>> --describe" may show an ACL different from the one in use. This can
>>> >>>> be super confusing for admins.
>>> >>>>
>>> >>>> My alternative suggestion:
>>> >>>> * Authorizer API will include:
>>> >>>> grantPrivilege(List<Principals>, List<Privilege>)
>>> >>>> revokePrivilege(List<Principals>, List<Privilege>),
>>> >>>> getPrivilegesByPrincipal(Principal, Resource)
>>> >>>> ....
>>> >>>> (The exact API can be discussed in detail, but you get the idea)
>>> >>>> * Kafka tools will simply invoke these APIs when topics are added /
>>> >>>> modified / described.
>>> >>>> * Each authorizer (including the default one) will be responsible
>>> >>>> for storing, caching and using those ACLs.
>>> >>>>
>>> >>>> This way, we keep almost all ACL code with the Authorizer, where it
>>> >>>> belongs, and users get a nice unified interface that reflects what
>>> >>>> is actually getting used in the system.
>>> >>>> This is pretty much how Sqoop and Hive implement their authorization
>>> >>>> APIs.
>>> >>>>
>>> >>>> What do you think?
>>> >>>>
>>> >>>> Gwen
>>> >>>>
>>> >>
>>>
>>>
>
