Hi all,
Over the course of the last 4-5 years, I've written at least half a
dozen non-trivial applications utilizing the RAPI and I must say that
I've always felt a bit uncomfortable with the lack of fine-grained
access control in the RAPI implementation.
In the case of an application managing a whole cluster (almost)
exclusively (think ganetimgr, GWM or synnefo), this is not a big issue;
access control is implemented in the management layer, which already
holds user credentials and user/instance associations. And since the
application is supposed to be able to perform anything on the cluster,
the security risk of having the application compromised is there anyway.
There are times, however, when this model doesn't work:
- Applications that are simple and specific in their scope (e.g. only
updating instance tags automatically) do not (and should not) need to
add/remove instances and nodes, for example. The only way simple
applications like these can be made usable by untrusted parties is
to pass them through an intermediate proxy service that performs
access control. This in turn makes writing simple utilities
unnecessarily cumbersome.
- Applications that manage only part of a cluster shouldn't really be
able to cause damage outside their own domain. With the current
implementation, a web interface spawning new instances can
potentially kill instances managed completely outside its scope.
Having to go through this process again today, I decided to have a look
at how easy it would be to implement more fine-grained access control in
the RAPI. I always thought that having the RAPI implemented as a WSGI
application would make things easier (being able to use pluggable
middlewares), but it turns out it's not that difficult with the current
implementation either. What I wanted was for a RAPI user to be able to
create/control/destroy only instances with a specific hostname suffix.
Here are a couple of things I found out in the course of doing this:
1. It turns out that it's pretty easy to add new "roles" in addition to
"read" and "write" to the RAPI implementation. One only needs to
append them to the relevant ganeti.rapi.rlib2 R_2_* classes'
<method>_ACCESS lists. These roles can then be used instead of
"read" or "write" to grant per-endpoint access. RAPI already has
support for multiple privilege levels per user, so a first level of
fine-grained control is trivial to achieve by adding roles like
"instance_write", "node_write", "group_write", "network_write",
granting write access to specific resource types only.
2. It's also easy to monkey-patch RemoteApiHandler.Authenticate to
provide our own authentication and authorization for individual
requests. This allows even more fine-grained control, by inspecting
the RAPI call request and body.
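A rough sketch of point 1, using stand-in classes so that it runs without Ganeti installed. The class names, the "instance_write" role, and the grant_role/is_allowed helpers are made up for illustration; only the <METHOD>_ACCESS naming convention comes from rlib2, and is_allowed is a simplified model of the access check, not Ganeti's actual code.

```python
# Stand-ins for rlib2 resource classes (hypothetical, not the real ones).
class FakeInstanceResource(object):
    GET_ACCESS = []
    PUT_ACCESS = ["write"]
    DELETE_ACCESS = ["write"]


class FakeNodeResource(object):
    GET_ACCESS = []
    PUT_ACCESS = ["write"]


def grant_role(resource_cls, methods, role):
    """Add role to the <METHOD>_ACCESS list of each given HTTP method."""
    for method in methods:
        attr = "%s_ACCESS" % method
        # Work on a copy so a list shared with another class is not
        # modified in place.
        access = list(getattr(resource_cls, attr))
        if role not in access:
            access.append(role)
        setattr(resource_cls, attr, access)


def is_allowed(resource_cls, method, user_options):
    """Simplified check: an empty list means no restriction (assumption)."""
    access = getattr(resource_cls, "%s_ACCESS" % method)
    if not access:
        return True
    return bool(set(access) & set(user_options))


# Grant the new role on instance endpoints only.
grant_role(FakeInstanceResource, ["PUT", "DELETE"], "instance_write")
```

With this in place, a user whose only option is "instance_write" passes the check for PUT and DELETE on the instance resource but not for PUT on the node resource, while a user with "write" still passes everywhere.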
What I ended up with is the attached version of /usr/sbin/ganeti-rapi,
which seems to fulfill my requirements. The only problem with this
implementation is that I have to replace the actual
/usr/sbin/ganeti-rapi script (which is a very thin wrapper anyway), thus
messing with the default Ganeti installation.
What would make things a lot easier from Ganeti's side would be to
allow the admin to specify a "middleware" module, providing a single
RemoteApiHandler class (deriving from the original class) to use instead
of the original, as a command line argument or via an environment
variable. This would in turn require a documented and relatively stable
API for the RemoteApiHandler class, which would become semi-public.
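To illustrate the idea, here is a minimal sketch of how such a handler class could be resolved at startup. Nothing like this exists in Ganeti as far as this mail is concerned: the RAPI_HANDLER_CLASS variable name, the "module:Class" syntax, and load_handler_class are all hypothetical, and the RemoteApiHandler below is a stand-in so the snippet runs on its own.

```python
import importlib
import os


class RemoteApiHandler(object):
    """Stand-in for ganeti.server.rapi.RemoteApiHandler."""


def load_handler_class(spec, base_cls):
    """Resolve a "package.module:ClassName" spec to a subclass of base_cls.

    An empty or missing spec falls back to base_cls itself.
    """
    if not spec:
        return base_cls
    module_name, sep, class_name = spec.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError("Expected 'module:Class', got %r" % spec)
    module = importlib.import_module(module_name)
    handler_cls = getattr(module, class_name)
    # Refuse anything that does not derive from the original handler.
    if not (isinstance(handler_cls, type)
            and issubclass(handler_cls, base_cls)):
        raise TypeError("%s does not derive from %s"
                        % (spec, base_cls.__name__))
    return handler_cls


# Hypothetical environment variable; unset means "use the default handler".
handler_cls = load_handler_class(os.environ.get("RAPI_HANDLER_CLASS"),
                                 RemoteApiHandler)
```

The subclass check is what would tie a pluggable handler to a documented base class API: anything that is not a RemoteApiHandler subclass is rejected before the daemon starts serving requests.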
So, it all comes down to the following questions:
- Are there any plans to implement functionality like this at some
point?
- Does anyone else think this is useful? If yes, I would be willing to
implement pluggable RAPI handlers, provided that we all agree on the
API.
Regards,
Apollon
P.S.: Are you planning to convert the RAPI implementation to Haskell at some
point?
#! /usr/bin/python
"""Bootstrap script for L{ganeti.server.rapi} -- restricted version

A couple of notes
-----------------

We want to add per-opcode access rights to the RAPI, as well as filter some
RAPI calls by e.g. instance name. This is a three-step process:

1. First we define a new user keyword (like "read" or "write") and
   subsequently append it to the <METHOD>_ACCESS lists of the desired rlib2
   classes. This grants per-endpoint access to users that have this special
   option. Note that users should have only this option, and not "write",
   as that would grant unlimited write access.

2. Then we subclass RemoteApiHandler and override its authentication
   method to filter by call arguments and/or request payload.

3. Finally we monkey-patch ganeti.server.rapi and replace the
   RemoteApiHandler class with our own implementation.

"""
# pylint: disable=C0103
# C0103: Invalid name
import sys
sys.path.append("/usr/share/ganeti")
import ganeti.server.rapi
import ganeti.rapi.rlib2
from ganeti import http
from ganeti import serializer


def _check_instance_name(req):
  """ Check the instance name in the URL """
  return req.private.handler.items[0].endswith(".test.example.com")


def _check_instance_creation(req):
  """ Check the instance name in the body """
  data = serializer.LoadJson(req.request_body)
  return data["name"].endswith(".test.example.com")

# The keyword used in rapi/users
ACCESS_KEY = "test"

# Define the endpoints we need to access, as a dictionary.
# We use the endpoint classes as keys and the values are tuples of (access
# methods, check function)
TEST_ENDPOINTS = {
  ganeti.rapi.rlib2.R_2_query:
    (["GET"], None),
  ganeti.rapi.rlib2.R_2_jobs_id_wait:
    (["GET"], None),
  ganeti.rapi.rlib2.R_2_instances:
    (["POST"], _check_instance_creation),
  ganeti.rapi.rlib2.R_2_instances_name:
    (["DELETE"], _check_instance_name),
  ganeti.rapi.rlib2.R_2_instances_name_tags:
    (["PUT", "DELETE"], _check_instance_name),
  ganeti.rapi.rlib2.R_2_instances_name_reboot:
    (["POST"], _check_instance_name),
  ganeti.rapi.rlib2.R_2_instances_name_startup:
    (["PUT"], _check_instance_name),
  ganeti.rapi.rlib2.R_2_instances_name_shutdown:
    (["PUT"], _check_instance_name),
}

# Grant the new keyword access to each listed method of each endpoint
for endpoint, (methods, _) in TEST_ENDPOINTS.items():
  for method in methods:
    access = getattr(endpoint, "%s_ACCESS" % method)
    access.append(ACCESS_KEY)

OrigRApiHandler = ganeti.server.rapi.RemoteApiHandler


class RestrictedRApiHandler(ganeti.server.rapi.RemoteApiHandler):
  """ ACL-restricted RAPI handler class """

  def Authenticate(self, req, username, password):
    # The following call will return True for access granted, False for
    # authentication failure and will raise http.HttpForbidden() for access
    # violation.
    auth = OrigRApiHandler.Authenticate(self, req, username, password)

    # Unknown user or bad password, return immediately
    if not auth:
      return False

    # At this point we certainly have a user
    user = self._user_fn(username)
    if ACCESS_KEY not in user.options:
      # We don't need to bother more with this
      return auth

    ctx = self._GetRequestContext(req)
    if ctx.handler.__class__ in TEST_ENDPOINTS:
      _, check = TEST_ENDPOINTS[ctx.handler.__class__]
      if check is not None and not check(req):
        raise http.HttpForbidden()

    return auth


ganeti.server.rapi.RemoteApiHandler = RestrictedRApiHandler

if __name__ == "__main__":
  sys.exit(ganeti.server.rapi.Main())