rnewson opened a new pull request, #5558:
URL: https://github.com/apache/couchdb/pull/5558

   ## Overview
   
   Apache CouchDB retains some information (at minimum, doc id, doc revision 
tree and a deleted flag) for all deleted documents forever, in order that 
replication is guaranteed to converge. This is excessively pessimistic and we 
would like to improve matters.
   
   This PR introduces a number of changes to achieve that goal:
   
   1) Database shards (`.couch` files under the `shards/` directory) gain an 
additional header property called `drop_seq`. Once it is set to a positive 
integer, any deleted document with a lower update sequence is skipped entirely 
at the next compaction.
   
   2) The notion of a peer checkpoint document is introduced. These are all 
local docs, and their ids must have the prefix `_local/peer-checkpoint-`.
   
   3) All indexers (mrview, search, nouveau) and the replicator have been 
taught to create and update peer checkpoints with the update sequence they have 
seen at appropriate times (i.e., after they have made every effort to commit the 
changes they've seen to durable storage).
   
   4) A new endpoint, `POST /$dbname/_update_drop_seq`, which gathers 
information about the shards of the database, the update sequences from all 
peer checkpoint documents, and the internal shard sync documents, computes the 
`drop_seq` for each shard, and then sends RPC requests to those shards to 
update the `drop_seq`.
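
The rules in the list above can be illustrated with a minimal Python sketch. It is not the PR's Erlang implementation. `DocInfo`, `kept_by_compaction`, `is_peer_checkpoint_id` and `candidate_drop_seq` are hypothetical names introduced only for illustration, and taking the minimum of the peer checkpoint sequences is an assumption about how a safe `drop_seq` might be chosen, not a description of the actual computation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Prefix required for peer checkpoint document ids (from the PR description).
PEER_CHECKPOINT_PREFIX = "_local/peer-checkpoint-"


@dataclass(frozen=True)
class DocInfo:
    """Hypothetical stand-in for the per-document info seen during compaction."""
    doc_id: str
    update_seq: int
    deleted: bool


def is_peer_checkpoint_id(doc_id: str) -> bool:
    """True if the id carries the peer checkpoint prefix."""
    return doc_id.startswith(PEER_CHECKPOINT_PREFIX)


def kept_by_compaction(doc: DocInfo, drop_seq: int) -> bool:
    """Apply the drop rule: once drop_seq is positive, a deleted document
    with a lower update sequence is skipped; everything else is kept."""
    if drop_seq <= 0:
        # drop_seq not set yet: nothing is dropped.
        return True
    return not (doc.deleted and doc.update_seq < drop_seq)


def candidate_drop_seq(peer_seqs: Iterable[int]) -> Optional[int]:
    """ASSUMPTION: a conservative drop_seq is the lowest update sequence
    reported by any peer checkpoint for the shard. Returns None when there
    are no checkpoints to consult."""
    seqs = list(peer_seqs)
    return min(seqs) if seqs else None
```

Under these assumptions, a deleted document at update sequence 5 would be skipped by a compaction run with `drop_seq` 10, while a live document at the same sequence, or a deleted one at sequence 10, would be retained.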
   
   ## Testing recommendations
   
   There are some simple tests in the eunit and elixir suites which will be run 
via the normal Makefile targets.
   
   Additionally, there is a stateful property-based test that exercises the 
code more comprehensively; it can be started with `make elixir-cluster`. This 
will start a 3-node cluster with a nouveau server running and perform random 
permutations of all relevant operations that could alter which deleted 
documents are dropped (making docs, deleting docs, creating indexes, creating 
and updating peer checkpoints, splitting shards).
   
   ## Related Issues or Pull Requests
   
   N/A
   
   ## Checklist
   
   - [x] Code is written and works correctly
   - [x] Changes are covered by tests
   - [ ] Any new configurable parameters are documented in 
`rel/overlay/etc/default.ini`
   - [TODO] Documentation changes were made in the `src/docs` folder
   - [ ] Documentation changes were backported (separated PR) to affected 
branches
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
