Looking good. I'm going to skip the database internals as I'm not at all qualified to comment. I do have some comments on the HTTP API however.
First, I think this API spec is missing how you find databases which have soft-deleted versions. For example, this could be: /_all_dbs - lists only live databases /_deleted_dbs - lists only dbs with at least one soft-deleted version - API identical otherwise to _all_dbs Secondly, this API lacks the ability to restore a soft-deleted database to a new database, you can only restore a soft-deleted database to the original name, which could be over the top of a new database. I suspect this could be problematic in day-to-day use. I'd propose: - You can't restore a soft-deleted database on top of a "live" database. This prevents data loss from a restore. - Support for the ability to restore a database to a new database path. Could you also clarify whether the restore operation "removes" the soft-deleted version. My assumption is yes, that we're just switching back the database into the "live" keyspace. An alternative API takes the conceptual model of deleted databases moving to a "recycle bin" within the API's namespace. I quite like this, but it's just how it works for me; others may hate it. The original proposal, to me, takes the approach that every database has it's own "recycle bin", so this is merely altering the conceptual containers here. Mostly, I found that the following way feels a bit more RESTful in that there is a specific set of deleted database resources, and you avoid the cognitive dissonance of being able to carry out operations on non-existent resources. In brief it roots operations on soft-deleted databases in a /_deleted_dbs namespace rather than augmenting existing endpoints: GET /_all_dbs - lists only live databases (noted just to clarify this) GET /_deleted_dbs - lists only dbs with at least one soft-deleted version - API identical otherwise to _all_dbs - It's a bit confusing to me that there can be overlap between _deleted_dbs and _all_dbs as a given database may have live and soft-deleted versions, but perhaps this is okay. GET /_deleted_dbs/{db} - Equivalent to `GET /{db}/_deleted_dbs_info` in original. - To me, this makes it very clear that a deleted db has only one operation -- get the metadata -- rather than hanging the _deleted_dbs_info on a path that could be a live database or a non-existent database. POST /_deleted_dbs/_restore - Restore a database at a given timestamp - Allows supplying a destination database. - I view _restore as an RPC-like call than a REST call, so it has a body containing the arguments rather than supplying as part of the path: { "source": {"db": "animaldb", "ts": "<deletedTS>"}, "destination": "animaldb"} - Fails if "destination" exists. If we choose to go with the original suggestion, I would still suggest taking the RPC approach to restore, meaning it looks like this: POST /{db}/_restore { "sourceTS": "<deletedTS>", "destination": "animaldb"} For either approach, "destination" could be optional and default to the original database name. -- Mike. On Wed, 18 Mar 2020, at 12:04, jiangph wrote: > ## API > > 1) `DELETE /{db}` > > There is no change on this endpoint [4] to send DELETE against one > database. The soft-deletion is triggered once > [couchdb][enable_database_recovery] is set to true in configuration > file. > > > 2) `GET /{db}/_deleted_dbs_info` > > returning basic information of all deleted instances for the > specified database, including when the instance was deleted. > Parameters: > > db –Database name > > Request Headers: > > Content-Type –application/json > > Response Headers: > > > Content-Type – > application/json > > Status Codes: > > 200 OK –Request completed successfully > 404 Not Found –Requested database not found > > Request: > > GET /db/_deleted_dbs_info HTTP/1.1 > Accept: application/json > Host: localhost:5984 > > Response: > > HTTP/1.1 200 OK > Cache-Control: must-revalidate > Content-Type: application/json > { > "total_rows": 2, > "rows": [{ > "deleted_when": "20200318.071532", > "info": { > "update_seq": "0000019100b5992700000000", > "doc_del_count": 0, > "doc_count": 3, > "sizes": { > "external": 287, > "views": 0 > } > } > }, { > "deleted_when": "20200318.071703", > "info": { > "update_seq": "0000019105f0e29900000000", > "doc_del_count": 0, > "doc_count": 2, > "sizes": { > "external": 200, > "views": 0 > } > } > }] > } > > > 3) `PUT /{db}/_restore/{deletedTS}` > > Restore a deleted database. > Parameters: > > db –Database name > deletedTS - timestamp when database was deleted > > Request Headers: > > > Accept – > application/json > text/plain > > Response Headers: > > > Content-Type – > application/json > text/plain; charset=utf-8 > > Response JSON Object: > > > ok (boolean) –Operation status. Available in case of success > error (string) –Error type. Available if response code is 4xx > reason (string) –Error description. Available if response code is 4xx > > Status Codes: > > 200 Restored –Database restored successfully > 400 Bad Request –Invalid database name or deleted timestamp > 401 Unauthorized –CouchDB Server Administrator privileges required > 412 Precondition Failed –Database already exists > > > What do you think of that? Any questions or thoughts on this? Once > again a big acknowledgment to Nick and Paul who helped with initial > design and provide consultation on this. > > > Cheers > Peng Hui > > > [1] > https://github.com/apache/couchdb/blob/master/src/couch/src/couch_file.erl#L251 > > [2] > https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L182-L184 > [3] https://activesphere.com/blog/2018/08/05/high-contention-allocator > <https://activesphere.com/blog/2018/08/05/high-contention-allocator> > [4] > https://docs.couchdb.org/en/stable/api/database/common.html#delete--db > >