Internal daemon-to-daemon authorization

Todd Lipcon Mon, 31 Oct 2016 16:42:50 -0700

Hey folks,

After https://gerrit.cloudera.org/#/c/4765/ finishes code review, the Kudu
daemons will have support for Kerberized RPC. If we start the servers with
--server_require_kerberos, then they'll require that all RPC clients
authenticate themselves via Kerberos.


Of course authentication isn't that useful on its own. I think the next
step along this direction is to add basic "internal service" authorization:

- the master should only allow tablet server service principals to
heartbeat/register/etc
- the tserver should only allow master service principals to create/delete
tablets, etc. (with the possible extension to also allow 'root' users to do
so)

I think there are a couple patterns we could follow here to enforce the
authorization

*Option 1) Try to enforce ACLs on a per-RPC-service level*

This is more or less the approach that Hadoop takes. We can make sure that
different RPC use cases are separated into different RPC "services", and
apply an ACL on each service. For example, we can separate the current
'MasterService' into 'TserverToMasterService' and 'ClientToMasterService',
and then register ACLs on each such service when we start the RPC server.

Pros:
- when adding a new method, it "automatically" gets the right ACL without
having to remember to add special checks
- enforces some nice separation and grouping of RPCs

Cons:
- doesn't handle special cases well, like 'DeleteTablet is usually called
by the master, but we also want to allow admins to call the same method for
use in manual recovery tools'
- would require a bunch of RPC methods to be moved around, and likely break
wire compatibility (I believe the RPC header currently specifies the actual
service name for each method call)

*Option 2) Entirely ad-hoc authorization checks*

At the top of each RPC method implementation, we can manually check the
authorization. Basically, imagine something like this:

bool CheckTabletServerRequest(RpcContext* rpc) {
  // if 'rpc' is from a valid tablet server, return true
  // otherwise, respond to the RPC with an authz error and return false
}

void MasterServiceImpl::TSHeartbeat(const TSHeartbeatRequestPB* req,
                                    TSHeartbeatResponsePB* resp,
                                    rpc::RpcContext* rpc) {
  if (!CheckTabletServerRequest(rpc)) return;
}


Pros:
- 100% flexible - easily allows for request-specific special cases
- relies on no "magic", it's obvious when you look at the code how the
authz check is applied.

Cons:
- easy to forget to check authorization when adding new RPCs. I recall a
CVE or two in the past where we forgot to add the appropriate check in
various internal Hadoop RPCs.


*Option 3) Per-method authorization checks, declared in .proto file*

In the .proto file describing a service, we can add a custom option like:

  rpc TSHeartbeat(TSHeartbeatRequestPB) returns (TSHeartbeatResponsePB) {
    option (kudu.authz = "tablet-servers-only")
  };

When we start the RPC system, we can bind these magic strings to actual
method implementations, and give an error if any RPC method either mentions
an invalid authz policy, or doesn't mention an authz policy at all.

Pros:
- we can require that all methods specify a policy (even if they specify
'none'), making it impossible to forget to authorize
- still retains flexibility: we can always create an exotic policy
implementation for an exotic method as necessary (or specify 'none' and do
the custom check from option 2)
- we can allow the same protobuf option to be specified as a service-wide
'default' for the cases where an entire service has the same authz policy

Cons:
- maybe a little more confusing to developers who have to go looking at the
.proto file to see which authorization policy is being applied to a method?

*Option X? Any other options I missed?*
- I only spent 30m or so thinking about this, so maybe I missed a better
option? Feedback appreciated.


I'm leaning towards option 3 here, but would appreciate feedback before I
go too far down that path.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Internal daemon-to-daemon authorization

Reply via email to