janl commented on code in PR #5792: URL: https://github.com/apache/couchdb/pull/5792#discussion_r2635379917
########## src/docs/rfcs/018-declarative-vdu.md: ########## @@ -0,0 +1,2249 @@ +--- +name: Formal RFC +about: Submit a formal Request For Comments for consideration by the team. +title: '' +labels: rfc, discussion +assignees: '' + +--- + +[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ ) + +# Introduction + +## Abstract + +[NOTE]: # ( Provide a 1-to-3 paragraph overview of the requested change. ) +[NOTE]: # ( Describe what problem you are solving, and the general approach. ) + +## Requirements Language + +[NOTE]: # ( Do not alter the section below. Follow its instructions. ) + +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", +"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this +document are to be interpreted as described in +[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt). + +## Terminology + +[TIP]: # ( Provide a list of any unique terms or acronyms, and their definitions here.) + +--- + +# Detailed Description + +This document specifies a system of declarative document validation and +authorization for CouchDB. It lets users write rules for validating document +updates using expressions that can be evaluated inside the main CouchDB process, +instead of having to invoke JavaScript code and incurring the overhead of +round-tripping documents to the external JavaScript engine. 
+ + +## Design documents + +Users encode validation rules by storing a design document with the following +fields: + +- `language`: must have the value `"query"` +- `validate_doc_update`: contains an _Extended Mango_ expression that encodes + the desired validation rules +- `defs`: (optional) a set of named Extended Mango expressions that may be + referenced from the `validate_doc_update` expression, as a way to encode + reusable functions + + +## Handling write requests + +When a document is updated, the `validate_doc_update` (VDU) fields of all the +design docs in the database are evaluated, and all of them must return a +successful response in order for the update to be accepted. For existing VDUs +written in JavaScript, those continue to be evaluated using the JavaScript +engine. For VDUs in docs with the field `"language": "query"`, the VDU is +evaluated using the functions in the `mango` application, particularly the +`mango_selector` module. This module will need additional functionality to +handle Extended Mango expressions as described below. + + +### Input to declarative VDUs + +JavaScript-based VDUs are functions that accept four arguments: the old and new +versions of the document, the user context, and the database security object. 
+The input to a declarative VDU is a virtual JSON document with the following +top-level properties: + +- `$newDoc`: the body of the new version of the document that the client is + attempting to write +- `$oldDoc`: the body of the previous leaf revision of the document that the new + revision would follow on from +- `$userCtx`: the user context containing the current user's `name` and an array + of `roles` the user possesses +- `$secObj`: the database security object, containing arrays of users and roles + with admin access, and users and roles with member access + +Example of a user context: + + { + "db": "movies", + "name": "Alice", + "roles": ["_admin"] + } + +Example of a security object: + + { + "admins": { + "names": ["Bob"], + "roles": [] + }, + "members": { + "names": ["Mike", "Alice"], + "roles": [] + } + } + +To evaluate a declarative VDU against this virtual JSON document, evaluate the +Extended Mango expression in the `validate_doc_update` field. This will produce +a list of _failures_ that describe the ways in which the input does not match +the selector. Evaluation is considered successful if this list is empty. + +If the expression produces a non-empty list, then no further expressions from +other design docs are evaluated, the write is rejected, and a representation of +the failures is returned to the client. If the expression produces an empty +list, then expressions from other design docs are evaluated. If none of the +`validate_doc_update` fields in any design doc produces a failure, the write is +accepted. + + +### Responses to write requests + +If the selector expression in the `validate_doc_update` field returns an empty +list of failures, then the write is accepted and proceeds as normal, leading to +a 201 or 202 response. + +If any of the selectors fails, then either its list of failures or a custom +error response is returned to the caller, with a 401 or 403 status code as +indicated by the selector itself. 
For example, imagine a design doc contains the +following: + + "validate_doc_update": { + "$all": [ + { + "$userCtx.roles": { "$all": ["_admin"] }, + "$error": "unauthorized" + }, + { + "$newDoc.type": { "$in": ["movie", "director"] }, + "$error": "forbidden" + } + ] + } + +To evaluate this expression, the following steps are performed: + +- Check whether the user context's `roles` array contains the value `"_admin"`. + If it does not, return a 401 response to the client. +- Check whether the new doc's `type` field has the value `"movie"` or + `"director"`. If it does not, return a 403 response to the client. +- Otherwise, accept the write and return a 201 or 202. + +The body of the response contains two fields: + +- `error`: this is either `"unauthorized"` or `"forbidden"` +- `reason`: this contains either a custom error message, or the list of failures + generated by the first non-matching selector. + +If no custom `$reason` is set, then the `reason` field contains a list of +failures like so: + + { + "error": "forbidden", + "reason": { + "failures": [ + { + "path": ["$newDoc", "type"], + "type": "in", + "params": ["movie", "director"] + } + ] + } + } + +This is consistent with the current working of JavaScript VDUs. Such functions +can call `throw({ forbidden: obj })` where `obj` is an object, and it will be +passed back to the client as JSON, i.e. it is already possible for user-defined +VDUs to generate responses like that above. + +A custom error response can be generated by adding extra information to the +selector expressions; see "`$error` and `$reason`" below. + +The intent of this interface is that each individual selector expression +produces a complete list of _all_ the ways in which the input did not match the +selector expression, so that the client can show all the validation errors to +the user in one go. + + +## Extended Mango + +Declarative VDU functions are expressed in an extended variant of Mango. 
It +includes all the selector operators previously designed for use in queries and +filters, and a few additions that are particularly suited to the task of +defining VDUs. Some of these features _only_ make sense for VDUs and should only +be allowed in this context. + + +### Return values + +Currently, the evaluation of a Mango selector by the `mango_selector:match` +function returns a boolean value to indicate whether or not the input value +matched the selector. When evaluating Extended Mango for VDUs, `match` should +instead return a list of _failures_, which are records that describe the ways in +which the input did not match. A failure has the following properties: + +- `path`: an array of keys that give the path to the non-matching value, from + the root of the input document, such that a client could locate the + non-matching value by evaluating `path.reduce((val, key) => val[key], doc)` +- `type`: the name of the matching operator that failed, e.g. `"eq"`, `"in"`, + `"type"`, etc. +- `params`: an array of any other values that the operator used to determine the + match result. For most operators this would just be the expected value. + +An example of a failure object: + + { + "path": ["$secObj", "members", "names", 1], + "type": "eq", + "params": ["Alice"] + } + +A client should be able to construct useful user-facing error messages from the +information in these failure objects such that the user could correct any +mistakes in their input. + +An Extended Mango expression is considered to match a given input if its +evaluation returns an empty list. + +To produce the `path` field, the `match` function will need to track the path it +used to reach the current value. This can be done by adding a "match context" to +its list of parameters that tracks this information along with other things. +This will also be needed to handle negation and relative `$data` references. 
+ + +### General evaluation + +Since `match` currently just returns `true` or `false`, the `mango_selector` +module can implement certain operations using "short-circuit" semantics. For +example, `$and` can be implemented by checking each of its sub-selectors, and +returning `false` as soon as a single one of them returns `false`. Likewise +`$or` can return `true` as soon as single sub-selector returns `true`. + +For VDUs, we want to return a complete list of match failures to the client, so +some compound operators must evaluate all their inputs completely, without +short-circuiting. Specifically: + +- `$and` should evaluate all its sub-selectors and return the combined list of + failures produced by any of them. +- `$or` may return an empty list as soon as any of its sub-selectors returns an + empty list. If all its sub-selectors return non-empty failure lists, it should + return a combined list of all failures. +- `$nor` should be translated as described under "Negation" below, before the + expression is evaluated. +- `$allMatch` should evaluate all list items against the sub-selector and return + a list of all failures produced by any of the items. It must only return an + empty list of all items produce an empty list. +- `$elemMatch` may return an empty list as soon as an item is found that + produces an empty list for the sub-selector. If no item does so, the combined + list of failures from all items should be returned. + +For normal object matchers, `{ "a": X, "b": Y }` should have the same behaviour +as `{ "$and": [{ "a": X }, { "b": Y }] }`. That is, all the fields in the object +should be checked, rather than returning `false` as soon as a single field does +not match, and all failures from all fields should be returned to the caller. + + +### `$if`/`$then`/`$else` + +To produce better error messages for dependent validation rules, a new set of +conditional operators is added. 
The general form is: + + { "$if": A, "$then": B, "$else": C } + +`A`, `B` and `C` are sub-selector expressions. Both the `$then` and `$else` +fields are optional. `$then` defaults to `NONE($then)`, a selector which always +fails with the message that `$then` is required. `$else` defaults to `ANY`, a +selector which always succeeds. (This definition may seem odd but is necessary +for these expressions to be automatically negated; see "Negation" below.) + +To evaluate this operator for input `Doc`, perform these steps: + +- If `match(A, Doc)` returns an empty list, return the result of `match(B, Doc)` +- Otherwise, return the result of `match(C, Doc)` + +If these operators appear in a selector alongside other operators, the effect is +the same as if any other of combination of operators was used. That is, the +`$if`/`$then`/`$else` operators must succeed, and all other operators in the +selector must succeed, in order for the match to be considered successful. For +example: + + { + "$gt": 0, + "$if": { "$gt": 10 }, + "$then": { "$mod": [5, 0] } + } + +This matches inputs that are greater than 0, and if they are greater than 10 +they must also be a multiple of 5. + +These operators may be evaluated in normal query/filter Mango contexts by +translating `{ "$if": A, "$then": B, "$else": C }` to: + + { + "$or": [ + { "$and": [A, B] }, + { "$and": [{ "$not": A }, C] } + ] + } + +This translation should be applied before negations are normalised. + + +### `$data` + +Some rules, especially those concerned with authorization, will need to compare +different fields within the input, particularly comparing the user context to +the security object and the new document. To enable this, Extended Mango +provides a way to reference data from elsewhere in the input. + +The `$data` operator produces the value at the location indicated by its Review Comment: we could make that work, yes, and as such make docs self-referential, e.g. 
match a field’s value against another field’s value
