Hi all,

Right now, our various configs can be modified by anyone with access to the
various scripts. I'd like to start a discussion around building out an
authorization layer so we can add more fine-grained controls over who can
change what.

Other projects have a few variants on how to accomplish this.  Typically,
this follows a pattern of calling out to an interface/class that takes in
the operation and context (user and other info) and returns true/false
depending on whether the operation is authorized.
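
To make that pattern concrete, here is a minimal Java sketch. All of the
names in it (ConfigAuthorizer, WriterAllowListAuthorizer, the "user" context
key, and the READ/WRITE operation strings) are hypothetical placeholders for
discussion, not existing Metron code.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical interface: decides whether an operation is allowed in a given context. */
interface ConfigAuthorizer {
    boolean isAuthorized(String operation, Map<String, String> context);
}

/** Illustrative implementation: reads are open, writes require a listed user. */
final class WriterAllowListAuthorizer implements ConfigAuthorizer {
    private final Set<String> writers;

    WriterAllowListAuthorizer(Set<String> writers) {
        this.writers = Set.copyOf(writers);
    }

    @Override
    public boolean isAuthorized(String operation, Map<String, String> context) {
        if (operation.startsWith("READ")) {
            return true; // assumption for this sketch: reads are not restricted
        }
        String user = context.get("user"); // "user" is an assumed context key
        return user != null && writers.contains(user);
    }
}
```

A caller would check something like
isAuthorized("WRITE_CONFIG", Map.of("user", "alice")) before applying a
change, and refuse the change when it returns false.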

In my mind, what we would need out of this is:

   - Ability to apply fine-grained permissions.
   - The various scripts and UI should flow through this authorization
   framework. I believe most (if not all) of our configuration flows through
   ConfigurationsUtils.  Anything that doesn't should either be hooked in or
   refactored to flow through the same code paths.
   - Pluggability. We shouldn't force only one authorizer.
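
As a rough illustration of the pluggability point, an authorizer could be
selected by class name from configuration and instantiated reflectively.
This is only a sketch under assumptions: PluggableAuthorizer,
AllowAllAuthorizer, and AuthorizerLoader are made-up names, and where the
class name would actually be configured is left open.

```java
import java.util.Map;

/** Hypothetical authorizer contract used only for this sketch. */
interface PluggableAuthorizer {
    boolean isAuthorized(String operation, Map<String, String> context);
}

/** Trivial default implementation that permits everything. */
final class AllowAllAuthorizer implements PluggableAuthorizer {
    @Override
    public boolean isAuthorized(String operation, Map<String, String> context) {
        return true;
    }
}

/** Loads an authorizer implementation from a fully qualified class name. */
final class AuthorizerLoader {
    private AuthorizerLoader() {}

    static PluggableAuthorizer load(String className) {
        try {
            // asSubclass rejects classes that do not implement the interface.
            return Class.forName(className)
                    .asSubclass(PluggableAuthorizer.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load authorizer: " + className, e);
        }
    }
}
```

With something like this, a Ranger-backed authorizer would just be another
class name to configure, and deployments without Ranger could keep a simpler
default.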

In particular, I'm proposing we use Apache Ranger
<https://ranger.apache.org/> as a supported authorization framework,
implementing it alongside the authorization framework to validate what we
build. The main catch with Ranger, as I understand it, is that we won't be
able to restrict users with direct access to ZooKeeper via its CLI (i.e.
Ranger can't mirror its ACLs down into ZK's ACLs).  I believe this is a
reasonable limitation, especially as the management UI gets improved to
handle more of the configuration burden and the number of users with access
to the ZK CLI begins to decrease.  Users can still add ZK ACLs separately to
enforce access there.

For anyone not familiar with Ranger, essentially you build a plugin that
hooks into the existing component's authorization framework (e.g. for
Storm, the plugin runs through the IAuthorizer
<https://storm.apache.org/releases/1.2.2/javadocs/org/apache/storm/security/auth/IAuthorizer.html>
interface; for YARN it runs through YarnAuthorizationProvider
<http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-common/apidocs/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.html>).
Additionally, Ranger provides auditing capabilities for this authorization
and has plugins for a good portion of our stack already (so users can
already set up ACLs on HDFS, Storm, etc.). Check out the Ranger GitHub
<https://github.com/apache/ranger> for a list of the plugins they have
built in.

What this means for Metron is building out an authorization setup similar
to Storm's or YARN's, or whatever we choose. We'll want this anyway to keep
our solution pluggable.  At that point, we build a Ranger-compatible
implementation of the authorizer along with the plugin.

I think this could probably be a fairly small feature branch. I'm
suggesting a branch primarily so the Ranger implementation can be done
alongside the general authorization work to validate what's being built.
I think the main tasks would be something like:

   - Build out pluggable authorization for our configs.
   - This includes testing (and possibly doing something similar to Storm,
   which has some IAuthorizers for testing, e.g. NoopAuthorizer,
   DenyAuthorizer, etc.)
   - Ensure that all the code paths consistently flow through this
   authorization.
   - Build a Ranger-compatible version of this.
   - Define the Ranger plugin for this.
   - Make sure auditing is defined.
   - Integration testing (particularly with Kerberos. After all, if they
   want to do authorization and auditing, they're almost certainly using
   Kerberos).
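
For the testing items, here is roughly what Storm-style test authorizers
could look like on our side. The TestableAuthorizer interface and
RecordingAuthorizer are hypothetical, and the NoopAuthorizer and
DenyAuthorizer below are local analogs written for this sketch, not the
Storm classes themselves. The recording wrapper is one possible way to
assert in tests that a code path actually consulted the authorizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical authorizer contract used only for this sketch. */
interface TestableAuthorizer {
    boolean isAuthorized(String operation, Map<String, String> context);
}

/** Permits everything; useful when authorization is not under test. */
final class NoopAuthorizer implements TestableAuthorizer {
    @Override
    public boolean isAuthorized(String operation, Map<String, String> context) {
        return true;
    }
}

/** Denies everything; useful for checking that callers handle rejection. */
final class DenyAuthorizer implements TestableAuthorizer {
    @Override
    public boolean isAuthorized(String operation, Map<String, String> context) {
        return false;
    }
}

/** Wraps another authorizer and records every operation it was asked about. */
final class RecordingAuthorizer implements TestableAuthorizer {
    private final TestableAuthorizer delegate;
    private final List<String> checkedOperations = new ArrayList<>();

    RecordingAuthorizer(TestableAuthorizer delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean isAuthorized(String operation, Map<String, String> context) {
        checkedOperations.add(operation);
        return delegate.isAuthorized(operation, context);
    }

    /** Returns an immutable snapshot of the operations checked so far. */
    List<String> checkedOperations() {
        return List.copyOf(checkedOperations);
    }
}
```

A test for a given config write path could inject a RecordingAuthorizer and
then verify that the expected operation shows up in checkedOperations().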

Is there anything missing that we'd need or want for this?  Are there any
other concerns we'd want to make sure are taken care of?
