Hi all,

in the midst of handling the security stuff I had a moment of clarity about how 
the often-requested per-document permissions could be implemented. We discussed 
a potential approach extensively at the February Boston Developer Summit (notes 
here: 
https://lists.apache.org/thread.html/09a5686bca8049010b82796cc0fe99ef27aed4983a3f02fd6956259f@%3Cdev.couchdb.apache.org%3E)

What was so alluring about this proposal was that it solves per-doc access 
control and per-user-db in one go. E.g. it would let us share a single 
database among multiple mutually distrusting users, allow them to have their 
own set of views, and even let each of them independently use their share of 
the database as a replication endpoint without interfering with any of the 
other users on that database.

I gave it a shot. Essentially, we need to build two new indexes, by-access-id 
and by-access-seq, to make all that work. I’m just focussing on the core of 
this, trying to re-use the existing couch_mrview/couch_index machinery as much 
as possible. Strictly, for replication only by-access-seq would be required, 
but by-access-id is a little easier to do, so I’ve done that first, and I 
believe the results are encouraging.

I’ve put a diff against master into a gist for your perusal:

  https://gist.github.com/janl/20b218a3f0eafbf963ee28780261f9fc


The core bits are:

  
https://gist.github.com/janl/20b218a3f0eafbf963ee28780261f9fc#file-by-access-id-diff-L189-L215

and

  
https://gist.github.com/janl/20b218a3f0eafbf963ee28780261f9fc#file-by-access-id-diff-L189-L215

Here’s an example Doc:

{
  "_id":"1fb94bf8c3d5a73745f3cc4f5f000a8d",
  "_rev":"4-bcbc975e61bdb80f3de1b87f6cad6a76",
  "_access":["b"]
}

It shows up for user b:


curl b:b@127.0.0.1:15984/a/_all_docs

{"total_rows":2,"offset":0,"rows":[
  {"id":"1fb94bf8c3d5a73745f3cc4f5f000a8d","key":["b","1fb94bf8c3d5a73745f3cc4f5f000a8d"],"value":"4-bcbc975e61bdb80f3de1b87f6cad6a76"}
]}

But not for user c:


> curl c:c@127.0.0.1:15984/a/_all_docs

{"total_rows":2,"offset":2,"rows":[

]}


* * *


I’d like to get some general design feedback on this approach to find out if it 
is worth pursuing further. See “Next Steps” way below for my thinking on how to 
get by-access-seq going.

The rest of this email consists of my notes from reading the source. They try 
to explain my thinking and to help folks who might not be very familiar with 
the CouchDB sources follow along with what is happening.

I’d especially like to get some feedback about this from some of the folks here 
who don’t spend their days in the main Erlang codebase :)

Let me know what you think.

Thanks!
Jan

* * *

CouchDB Access Notes

Background: 
https://lists.apache.org/thread.html/09a5686bca8049010b82796cc0fe99ef27aed4983a3f02fd6956259f@%3Cdev.couchdb.apache.org%3E

# Overview

To solve the problems with the db-per-user pattern, we want to introduce 
document level access control. The result should be a single CouchDB database 
that can be used by multiple mutually untrusting users while retaining 
CouchDB’s full semantics.

// TODO: link to appendix: problems with db-per-user

We decided on an approach that defines access control in documents with a new 
property `_access`, which is specified as an array of strings and arrays. 
Strings represent usernames and roles, sub-arrays express a logical AND, and 
the elements of the top-level array are combined with a logical OR. For 
example, an _access field with the value [["management", "senior"], "ceo-jane"] 
would allow everyone with the roles ‘management’ AND ‘senior’, OR the user 
‘ceo-jane’, access to that doc, but not e.g. users with the roles ‘development’ 
and ‘senior’, nor the user ‘vp-jenn’.
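
To make the OR-of-ANDs rule concrete, here is a minimal sketch in JavaScript. 
The function name `isAllowed` and the shape of `userCtx` ({name, roles}) are 
my assumptions for illustration; this is not code from the diff.

```javascript
// Hypothetical helper, for illustration only.
// access:  the doc's _access value, e.g. [["management", "senior"], "ceo-jane"]
// userCtx: assumed to look like { name: "bob", roles: ["management"] }
function isAllowed(access, userCtx) {
  if (!Array.isArray(access)) { return false }
  // A user is identified by their name and by each of their roles.
  var identities = [userCtx.name].concat(userCtx.roles || [])
  // Top level: logical OR over the entries.
  return access.some(function (entry) {
    // A bare string behaves like a one-element AND group.
    var required = Array.isArray(entry) ? entry : [entry]
    // Sub-array: logical AND. An empty group grants nothing.
    return required.length > 0 && required.every(function (item) {
      return identities.indexOf(item) !== -1
    })
  })
}
```

With the example value above, a user holding both ‘management’ and ‘senior’ 
passes, as does ‘ceo-jane’, while a user with ‘development’ and ‘senior’ does 
not.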

To retain CouchDB’s main semantics, we need to introduce new behaviour for the 
_all_docs and _changes endpoints. The plan is to special-case this based on the 
authenticated user context (userCtx, i.e. the username and associated roles 
after authentication).

The existing by-id and by-seq indexes are not equipped to efficiently return 
results per user, so we are introducing two new indexes (either can be 
optionally configured, depending on the use-case and performance and storage 
needs): by-access-id and by-access-seq. In contrast with by-id and by-seq, 
these indexes are not stored in the main database file, but in a separate file, 
ideally managed by the existing couch_index infrastructure.


# Development considerations

This first spike is only concerned with getting by-access-id to work with 
minimal effort.

To get started, let’s look at how _all_docs works today using the by-id index.

## The Anatomy of a Clustered _all_docs Request

CouchDB’s clustering layer consists of three main modules: chttpd, fabric and 
rexi. chttpd’s job is to handle everything HTTP and route requests to the right 
place in the rest of the code. It’s an HTTP router, mapping URLs, request 
methods and options to handler functions that do the work the requests are 
specified to fulfil.

fabric’s job is to distribute a single request from the outside to multiple 
nodes of the cluster. Some requests require only talking to the local node, but 
that’s less important for the moment. fabric includes fabric_rpc, a module that 
turns a request to the cluster into one or more requests to other nodes in the 
cluster.

rexi’s job is to know about the cluster state: which nodes are in the cluster, 
which of them are active/reachable/failed, which shards live on which nodes. 
fabric uses rexi to know which nodes to contact for which shards.

After a bit of indirection, we find ourselves at the first _all_docs-specific 
function in chttpd_db.erl: all_docs_view/4:

```
all_docs_view(Req, Db, Keys, OP) ->
    Args0 = couch_mrview_http:parse_params(Req, Keys),
    Args1 = Args0#mrargs{view_type=map},
    Args2 = couch_mrview_util:validate_args(Args1),
    Args3 = set_namespace(OP, Args2),
    Options = [{user_ctx, Req#httpd.user_ctx}],
    Max = chttpd:chunked_response_buffer_size(),
    VAcc = #vacc{db=Db, req=Req, threshold=Max},
    {ok, Resp} = fabric:all_docs(Db, Options, fun couch_mrview_http:view_cb/2, VAcc, Args3),
    {ok, Resp#vacc.resp}.
```

The first five lines handle query options and request parameters or arguments. 
The next three lines are the bulk of the job: start a response, call 
fabric:all_docs/5 with a callback to handle rows. The last line returns the 
accumulator that is returned by fabric:all_docs/5.

fabric:all_docs/5 is a thin wrapper around fabric_view_all_docs:go/5. Before we 
jump down, we notice that there is also a fabric_view_changes.erl, which we 
should remember for the next iteration when we implement by-access-seq.

go/5 comes in two variants and we’ll ignore the second here for the moment, 
because it is a performance optimisation. The main work for go/5 is in the top 
third of the function. First it gets all shards for the current database from 
mem3, then it starts a fabric_rpc worker for each shard, and then waits for the 
results to come back by calling go/6 with all workers. The bottom two thirds 
are timeout and error handling.
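
The fan-out in go/5 can be pictured with the following toy sketch. It is not 
how fabric is written: `fanOutAllDocs` and `queryShard` are hypothetical names, 
the workers run synchronously, and rows are ordered by plain JavaScript string 
comparison rather than CouchDB’s collation. It only shows the shape of "one 
worker per shard, then combine the rows".

```javascript
// Toy illustration of a scatter/gather _all_docs request.
// shards:     whatever identifies a shard (hypothetical)
// queryShard: function (shard) -> array of rows, each with a string `key`
function fanOutAllDocs(shards, queryShard) {
  // "Start a worker" per shard; here that is just a function call.
  var perShard = shards.map(function (shard) {
    return queryShard(shard)
  })
  // Gather all rows into one list.
  var rows = Array.prototype.concat.apply([], perShard)
  // Present them in key order (plain string order, an assumption).
  rows.sort(function (a, b) {
    if (a.key < b.key) { return -1 }
    if (a.key > b.key) { return 1 }
    return 0
  })
  return rows
}
```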

go/6 registers the handle_message/3 function as the callback for rexi_utils’ 
recv/6 (read “receive”) function.

handle_message/3 comes in a number of variants to handle rexi errors, receiving 
metadata, receiving result rows and a notification “complete” about all rows 
having been sent.

Our next level down is looking into fabric_rpc and how it handles all_docs 
requests. fabric_rpc:all_docs/3 is again a short wrapper, this time around 
couch_mrview:query_all_docs/4, which is the node-local function that handles 
querying.

couch_mrview includes a bunch of functions for map/reduce views. It seems like 
a natural place to make our distinction between a normal by-id request and a 
by-access-id request.

I’m skipping a step here, but with a little printf-debugging I’ve found that 
the `Db` variable we get passed includes the authenticated userCtx, including 
the username and any roles. We can use couch_db:is_admin/1 to get a boolean 
back for the distinction we are going to have to make:

```
query_all_docs(Db, Args0, Callback, Acc) ->
    case couch_db:is_admin(Db) of
        true -> query_all_docs_admin(Db, Args0, Callback, Acc);
        false -> query_all_docs_access(Db, Args0, Callback, Acc)
    end.
```

query_all_docs_admin/4 is the existing query_all_docs/4 function, and we’re 
introducing query_all_docs_access/4, which we now have to fill in with the code 
that queries our view.

Before we can do that, we need to understand how views work.


## The Anatomy of a View Request

Querying a view has three stages:

1. define the view
2. build the view index
3. query the view index

A view definition is always in a design document. It can be one or more 
JavaScript map/reduce functions, Erlang map/reduce functions, or a mango index 
definition.

// TODO: link all these view definition options.

Building the view index is an implicit step in CouchDB. View indexes are 
refreshed at query time, but only if there were any changes in the database 
since the last query. If no refresh is needed, the view result is returned from 
the index directly.
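
As a rough mental model of "refresh at query time, but only if something 
changed", here is a toy sketch. Everything in it is an assumption made for 
illustration: the `db` object with `updateSeq` and `changes`, the name 
`createLazyIndex`, and passing `emit` as a second argument to the map function 
(real CouchDB map functions call a global `emit`).

```javascript
// Toy model of a lazily refreshed view index; not CouchDB's implementation.
// db is assumed to look like:
//   { updateSeq: 2, changes: [{ seq: 1, doc: {...} }, { seq: 2, doc: {...} }] }
function createLazyIndex(mapFn) {
  var indexedSeq = 0   // last database sequence the index has seen
  var rows = []
  var refreshCount = 0

  function refresh(db) {
    db.changes.filter(function (change) {
      return change.seq > indexedSeq
    }).forEach(function (change) {
      // Drop rows from the previous revision of this doc.
      rows = rows.filter(function (row) { return row.id !== change.doc._id })
      // Deleted docs contribute no rows to a regular view.
      if (!change.doc._deleted) {
        mapFn(change.doc, function emit(key, value) {
          rows.push({ id: change.doc._id, key: key, value: value })
        })
      }
    })
    indexedSeq = db.updateSeq
    refreshCount++
  }

  return {
    query: function (db) {
      // Only do indexing work if the database moved on since last time.
      if (db.updateSeq > indexedSeq) { refresh(db) }
      return rows.slice()
    },
    refreshCount: function () { return refreshCount }
  }
}
```

Querying twice without an intervening write refreshes only once; the second 
query is answered straight from the stored rows.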

// TODO: explain query_server

Querying indexes follows a similar path through chttpd, fabric, rexi, 
fabric_rpc down to the per-node handlers in couch_mrview. Just a few lines 
below couch_mrview:query_all_docs/4 we find query_view/5 which decides between 
map and reduce requests. We care about map-only for now. query_view/5 is 
preceded by query_view/6 which includes a call to couch_mrview_util:get_view/4 
which looks like it is where we want to look next, as the map_fold/5 called by 
query_view/5 is about looping over rows. We hope we can re-use all that logic, 
and maybe get_view/4 lets us find out how we can have it return our new view.

get_view/4 calls get_view_index_state/4 which in turn calls 
get_view_index_pid/4 that finally calls into couch_index_server:get_index/4 
which looks like it returns the index for our request. Let’s have a look.

get_index/4 will dive into get_index/2 eventually, and that indeed looks like 
where we need to look. In there, we look up the view index in an ETS table (an 
in-memory database), and if we can’t find it there, we start a new one. Either 
way, a view index is returned. The lookup is by DbName and Sig(nature), an md5 
hash over the `views` property in a design doc, which also corresponds to the 
*.view filename of the view index.
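
The lookup-or-start logic can be sketched like this. It is a stand-in, not a 
port: `createIndexServer`, `signature` and `startIndex` are hypothetical names, 
a `Map` plays the role of the ETS table, and the md5 is taken over a JSON 
string of `views`, whereas the real signature is computed on the Erlang side 
and will not match these values.

```javascript
var crypto = require('crypto')

// Stand-in for the view signature: md5 over the ddoc's `views` property.
// Assumption: JSON.stringify is good enough for illustration.
function signature(ddoc) {
  return crypto.createHash('md5')
    .update(JSON.stringify(ddoc.views || {}))
    .digest('hex')
}

// startIndex(dbName, sig) is a hypothetical function that opens or builds
// an index and returns a handle to it.
function createIndexServer(startIndex) {
  var table = new Map() // plays the role of the ETS table
  return {
    getIndex: function (dbName, ddoc) {
      var sig = signature(ddoc)
      var key = dbName + '/' + sig
      if (!table.has(key)) {
        // Not found: start a new index and remember it.
        table.set(key, startIndex(dbName, sig))
      }
      // Either way, an index is returned.
      return table.get(key)
    }
  }
}
```

Two design docs with identical `views` share one index per database, which is 
why a mocked ddoc has to produce a signature that the lookup can find.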


## Faking the index

So how would we get this to return the index we want to query? We need to 
create an index definition that matches the design doc `views` hash. Hm.

It is relatively easy to produce a map function that behaves like we want:

function (doc) {
  var _access = doc._access
  if (!_access) { return }
  if (!isArray(_access) || _access.length === 0) { return }
  _access.forEach( function (user_or_role) {
    emit([user_or_role, doc._id], doc._rev)
  })
}

At query time, we’d have to match the requesting username and roles against the 
first element in the key-array and return the results, while replacing the 
key-array with the second element (the doc _id). All this doesn’t sound too 
hard. Good.
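
A minimal sketch of that query-time step, operating on rows as the map 
function above would emit them. The name `rowsForUser`, the `userCtx` shape and 
the de-duplication of docs that match via more than one identity are my 
assumptions; the output mimics the usual _all_docs row shape.

```javascript
// rows:    [{ id: docId, key: [userOrRole, docId], value: rev }, ...]
// userCtx: assumed to look like { name: "b", roles: ["team"] }
function rowsForUser(rows, userCtx) {
  var identities = [userCtx.name].concat(userCtx.roles || [])
  var seen = new Set()
  return rows.filter(function (row) {
    var docId = row.key[1]
    // Keep rows whose first key element is the user or one of their roles.
    if (identities.indexOf(row.key[0]) === -1) { return false }
    // A doc reachable via name and role should show up once (assumption).
    if (seen.has(docId)) { return false }
    seen.add(docId)
    return true
  }).map(function (row) {
    // Replace the key-array with its second element, the doc _id.
    return { id: row.id, key: row.key[1], value: { rev: row.value } }
  })
}
```

This only handles plain string entries; sub-arrays in `_access` (the logical 
AND case) would need separate treatment.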

One snag though: if we think ahead and try to see how we could implement 
by-access-seq, we get stuck: a view does not include rows for deleted 
documents, while _changes does. In addition, the update sequence for a document 
is not available in a map function. So a regular view cannot be used here.

The filtering of deleted docs from a view index happens in 
couch_mrview:map_fold/3. So if we could augment that for our internal view 
requests, that could get us a long way towards reusing the rest of the 
couch_mrview/couch_index machinery.

Note to self: make sure view compaction doesn’t remove deleted docs. A cursory 
glance at couch_mrview_compactor:compact_view_btree/5 suggests it does no such 
thing, but we need to validate this and, if it doesn’t hold, change view 
compaction to keep deleted entries.

* * *
 
We’ll start giving this a try by forking things off in 
couch_mrview:query_all_docs/4 and pretending to call a view with a mocked ddoc:

{
  "_id": "_design/_access",
  "language": "_access",
  "views": {} // if needed
} // TODO see which other fields it needs

We’ll try this road to see if we get to the point where we get a “view index 
not found” error, because we didn’t actually have a view index yet. We’ll then 
try and see if we can produce one. We could try the other way around too, 
building the index first and then trying to query, but the approach doesn’t 
make much of a difference.

First demo working: 
https://gist.github.com/janl/20b218a3f0eafbf963ee28780261f9fc


Next Steps:
- make sure the startkey/endkey/descending argument handling is all correct and 
complete
- add key un-munging, so the user/role prefix gets filtered out on reads
- handle roles:
    - instead of querying the _access view once, we need to issue a 
multi-query, probably via #mrargs.multi_get, read up on how that is used
- then we could start thinking about by-access-seq:
    - we need access to the update-seq in couch_access_native_proc:map_doc, 
might require view protocol upgrade, or we have a post-process function that 
tags on the update-seq, we’ll see.
    - the admin/access split we’re doing in query_all_docs should probably 
happen in couch_db:changes_since/5






# More specification details


Documents in databases with _access enabled are private/admin-only by default, 
and can be made public with the special role _public.

TODO: shared id space or auto-prefix ids

