Hi all,

I’m writing to you about a proposed new feature for CouchDB: declarative 
validate doc updates (VDUs). This is a way to validate documents without 
writing JavaScript code that needs to be evaluated on each doc update. That 
per-update cost is why, in practice, we recommend JS VDUs only for the 
smallest traffic situations.

We have had a general desire for this feature since at least 2018[1] (thanks 
Diana), and we have recently received an RFC (thank you Sovereign Tech Agency & 
James Coglan) that covers in depth what a complete solution would look 
like[2].

[1]: https://github.com/apache/couchdb/issues/1554
[2]: https://github.com/apache/couchdb/pull/5792

We are now looking for community feedback on the plan as outlined below, 
because we can only anticipate end-users’ needs for this feature. Please 
respond here or, if simpler, on the pull request referenced above.

The RFC is concerned with two aspects:

1. Should we extend Mango to support VDU duties, or should we adopt JSON 
Schema, as it is a larger standard around JSON validation?

2. What would an implementation look like that could express the same 
validations as current JS VDUs and should we limit the scope to ease 
introducing this feature?

The RFC analyses point 1 in depth and strongly recommends extending Mango 
rather than adopting JSON Schema, as JSON Schema is not quite suited to the 
way CouchDB needs a validation library to work.

As a consequence, the RFC outlines a list of additions to Mango so it can be 
used as a full replacement for JS VDUs.

Expressing declaratively what a general-purpose programming language can do 
naturally comes with a set of complexities, which makes the RFC a very long 
document to read and, as a result, makes the proposal hard to discuss.

After initial developer feedback, we came up with a phased approach to 
implementing this RFC. This allows us to make quick progress on things we all 
agree on and to move open discussion points to a specific stage, so as not to 
block progress on other stages.

Of course, we also want to make sure that decisions in earlier stages do not 
block options in later stages.

Overall concerns for this whole endeavour are:

1. The resulting additional complexity of using Mango. Users should have an 
easy time picking up the new additions to Mango to express their validation 
needs. Ideally, any additions we make are also useful, or at least not 
confusing, in the indexing context.

2. The resulting additional complexity of Mango’s implementation. It should not 
turn Mango into a maintenance burden.

  - This includes performance concerns during indexing.

3. How far do we need to go on the path of allowing Mango VDUs to be as 
expressive as JS VDUs? Is an 80/20 solution good enough, where folks who need 
more flexibility can always use JS or Erlang (if performance is a concern)?

As such, the plan as it stands for now is as follows:

- Phase 1: Allow Mango as it exists today to be used as a VDU. This already 
exists and is a surprisingly small patch[3]. Thanks, past us. There is broad 
dev consensus that this is a useful addition on its own and could be used as a 
test balloon in the next release to gather wider community feedback.

In this phase, a Mango selector evaluates to a boolean that in the indexing 
case decides whether a doc should be indexed or not. Phase 1 Mango VDUs behave 
the same way. If a doc does not match the selector, its update is rejected with 
a `{forbidden: "Document is not valid"}` response.

[3]: https://github.com/apache/couchdb/pull/5839
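
To make the Phase 1 behaviour concrete, here is a toy sketch in Python. It is 
not the code from the patch: `matches` and `validate_doc_update` are 
hypothetical helpers, only three operators are handled, and real Mango compares 
values across types with its own collation rules, which this sketch ignores. 
The only part taken from the description above is the outcome: a selector 
yields a boolean, and a non-matching doc is rejected with the generic 
forbidden response.

```python
# Toy model of a Phase 1 Mango VDU. Assumption: this mimics the behaviour
# described in this mail, not CouchDB's actual implementation.

def matches(selector, doc):
    """Return True only if doc satisfies every clause of selector."""
    for field, condition in selector.items():
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # a bare value means equality
        for op, arg in condition.items():
            present = field in doc
            value = doc.get(field)
            if op == "$exists":
                ok = present == arg
            elif op == "$eq":
                ok = present and value == arg
            elif op == "$gte":
                # Numbers only here; bool is excluded because it is an int in Python.
                is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
                ok = present and is_number and value >= arg
            else:
                raise ValueError(f"operator not handled in this sketch: {op}")
            if not ok:
                return False  # stop at the first clause that does not hold
    return True


def validate_doc_update(selector, new_doc):
    """Return None to accept the update, or the generic rejection."""
    if matches(selector, new_doc):
        return None
    return {"forbidden": "Document is not valid"}


# Hypothetical selector: invoices need a non-negative amount and a customer.
EXAMPLE_SELECTOR = {
    "type": "invoice",
    "amount": {"$gte": 0},
    "customer": {"$exists": True},
}
```

With `EXAMPLE_SELECTOR`, a doc such as `{"type": "invoice", "amount": 10, 
"customer": "c1"}` is accepted, while the same doc with `"amount": -1` is 
rejected without any hint about which clause failed. That missing detail is 
what Phase 2 is about.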

- Phase 2: The RFC recommends increasing the usefulness of the error response: 
instead of rejecting the document update with a generic response, it suggests 
returning a list of all validation failures, so a human seeing the results can 
fix them in one go rather than one by one with multiple document update 
round trips (which adds server load and increases the possibility of 409 update 
conflicts).

There is currently no consensus on this feature. The reasons are the following:

1. An implementation of this necessarily requires making the evaluation logic 
of Mango more complicated. At a minimum, it requires the evaluation of all 
clauses of a selector, whereas currently, as soon as a clause doesn’t come out 
as `true`, the evaluation can be stopped, which is a performance optimisation 
during indexing.

2. It requires the tracking of errors in a list to return to the caller later. 
If we wanted the indexer to keep the shortcut behaviour, we’d need an 
additional boolean option that switches between the two behaviours. While we 
believe we could make a neat version of this, it does not exist yet and it 
will be more complex than what we have now.

Another aspect of Phase 2 is a conditional construct ($if/$then/$else in the 
RFC). If folks don’t like the `$if` terminology, I’m happy to temporarily 
bikeshed this to `$match/$true/$false`. There is wide consensus that this is a 
useful addition to Mango regardless of the VDU work (think non-uniform docs 
that get normalised during indexing). Please send your bikeshedding votes for 
the exact operator names you prefer :)

3. If we do not keep the shortcut behaviour, a Mango indexer will have to 
allocate more terms only to throw them away later. This is not a tidy 
implementation and might even affect indexing performance, even if only at 
large scale. This needs to be traded off against feature usefulness and code 
complexity.
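
The trade-off between the shortcut behaviour and full error collection can be 
sketched as follows. This is a toy in Python, not a proposal for the actual 
implementation: `evaluate`, `clause_holds`, the `collect_errors` flag and the 
shape of the error entries are all made up for illustration, and the RFC may 
define the error format differently.

```python
# Toy evaluator with a switch between the two behaviours discussed above.
# Assumption: names, the flag, and the error format are hypothetical.

def clause_holds(op, arg, field, doc):
    """Check a single operator against one field of doc."""
    if op == "$eq":
        return field in doc and doc[field] == arg
    if op == "$gte":
        value = doc.get(field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and value >= arg
    raise ValueError(f"operator not handled in this sketch: {op}")


def evaluate(selector, doc, collect_errors=False):
    """Return (ok, errors).

    collect_errors=False: stop at the first failing clause and allocate
    nothing for error details (the indexer case).
    collect_errors=True: visit every clause and record each failure
    (the VDU case).
    """
    errors = []
    for field, condition in selector.items():
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # a bare value means equality
        for op, arg in condition.items():
            if clause_holds(op, arg, field, doc):
                continue
            if not collect_errors:
                return False, []  # shortcut: no further clauses are evaluated
            errors.append({"field": field, "operator": op, "expected": arg})
    return len(errors) == 0, errors
```

For the selector `{"type": "invoice", "amount": {"$gte": 0}}` and the doc 
`{"type": "order", "amount": -5}`, the default call stops after the `type` 
clause, while `collect_errors=True` reports both failing fields. Even in this 
tiny form, the extra branch and the error list show where the added complexity 
comes from.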

- Phase 3: authentication. There is currently consensus that nothing in the 
above will preclude us from adding authn to Mango VDUs, but we’re punting work 
on this for the moment to get the rest of the implementation solid. This is not 
a rejection of this part of the RFC; it’s just a deferral. Coincidentally, the 
most complex addition to Mango (the $data operator that lets you reference the 
values in arbitrary fields in your input set) was mainly added to the RFC to 
support authn, so it is very convenient that we can skip this for now.

- Phase 4: optional additions. This is a loose collection of additions to Mango 
that make all of the above more useful, but are not in themselves required to 
provide the base functionality of Mango VDUs, even if that means that the VDU 
selectors are not as expressive as a corresponding JS function. We can do any 
of this at any time, so this isn’t really a phase; it just collects the bits 
that aren’t required for any of the other phases.

These optional additions are:

1. String manipulation functions (e.g. `$concat`).

These are not controversial, but we are not sure which operations would be 
required. This is a great place to submit feedback.

2. Customisable error reporting (to mirror the current flexible `throw()` 
option in JS VDUs) so developers can set their own error messages.

There is consensus that this is a very optional feature that we should not 
worry about for now unless users request it. It will be an easy addition, but 
could add a level of complexity that would slow down adoption. Please let us 
know what you think.

3. A `$ref` operator that acts like an include mechanism, so a set of base 
selectors can be combined without duplicating the individual selector logic 
(akin to the CommonJS module system we have now).

There is no consensus on this feature. Usefulness and language complexity need 
to be weighed against each other. This is another great place to leave feedback.
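
To give the `$ref` discussion something concrete to point at, here is a toy 
sketch in Python of what an include mechanism could do. Everything in it is an 
assumption for illustration: the `{"$ref": "name"}` shape, the `library` of 
named selectors and the `expand_refs` helper are hypothetical, and the RFC may 
specify different syntax and lookup rules.

```python
# Toy expansion of a hypothetical $ref include operator.
# Assumption: the {"$ref": name} shape and the library lookup are made up.

def expand_refs(selector, library, _seen=()):
    """Return a copy of selector with each {"$ref": name} replaced by library[name]."""
    if isinstance(selector, list):
        return [expand_refs(item, library, _seen) for item in selector]
    if not isinstance(selector, dict):
        return selector  # plain values are kept as they are
    if set(selector) == {"$ref"}:
        name = selector["$ref"]
        if name in _seen:
            raise ValueError(f"circular $ref: {name}")
        # Included selectors may themselves contain references.
        return expand_refs(library[name], library, _seen + (name,))
    return {key: expand_refs(value, library, _seen) for key, value in selector.items()}


# Hypothetical base selectors that several VDUs could share.
LIBRARY = {
    "has_type": {"type": {"$exists": True}},
    "is_invoice": {"$and": [{"$ref": "has_type"}, {"type": "invoice"}]},
}
```

Calling `expand_refs({"$and": [{"$ref": "is_invoice"}, {"amount": {"$gte": 
0}}]}, LIBRARY)` yields one plain selector with no references left in it. The 
cycle check hints at the kind of language complexity that has to be weighed 
against the convenience.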

—

I’ll stop here with a call to action: If you’re a CouchDB user, please respond 
here or in [2] with what you think about this feature.

Thanks for reading, and looking forward to hearing from you :)

Best
Jan