Looking good. I'm going to skip the database internals as I'm not at all
qualified to comment. I do have some comments on the HTTP API however.
First, I think this API spec is missing how you find databases which have
soft-deleted versions. For example, this could be:
/_all_dbs
- lists only live databases
/_deleted_dbs
- lists only dbs with at least one soft-deleted version
- API identical otherwise to _all_dbs
Secondly, this API lacks the ability to restore a soft-deleted database to a
new database, you can only restore a soft-deleted database to the original
name, which could be over the top of a new database. I suspect this could be
problematic in day-to-day use. I'd propose:
- You can't restore a soft-deleted database on top of a "live" database. This
prevents data loss from a restore.
- Support for the ability to restore a database to a new database path.
Could you also clarify whether the restore operation "removes" the soft-deleted
version. My assumption is yes, that we're just switching back the database into
the "live" keyspace.
An alternative API takes the conceptual model of deleted databases moving to a
"recycle bin" within the API's namespace. I quite like this, but it's just how
it works for me; others may hate it. The original proposal, to me, takes the
approach that every database has it's own "recycle bin", so this is merely
altering the conceptual containers here.
Mostly, I found that the following way feels a bit more RESTful in that there
is a specific set of deleted database resources, and you avoid the cognitive
dissonance of being able to carry out operations on non-existent resources.
In brief it roots operations on soft-deleted databases in a /_deleted_dbs
namespace rather than augmenting existing endpoints:
GET /_all_dbs
- lists only live databases (noted just to clarify this)
GET /_deleted_dbs
- lists only dbs with at least one soft-deleted version
- API identical otherwise to _all_dbs
- It's a bit confusing to me that there can be overlap
between _deleted_dbs and _all_dbs as a given database may
have live and soft-deleted versions, but perhaps this is okay.
GET /_deleted_dbs/{db}
- Equivalent to `GET /{db}/_deleted_dbs_info` in original.
- To me, this makes it very clear that a deleted db has only one
operation -- get the metadata -- rather than hanging the
_deleted_dbs_info
on a path that could be a live database or a non-existent database.
POST /_deleted_dbs/_restore
- Restore a database at a given timestamp
- Allows supplying a destination database.
- I view _restore as an RPC-like call than a REST call, so it has a body
containing the arguments rather than supplying as part of the path:
{ "source": {"db": "animaldb", "ts": "<deletedTS>"}, "destination":
"animaldb"}
- Fails if "destination" exists.
If we choose to go with the original suggestion, I would still suggest taking
the RPC approach to restore, meaning it looks like this:
POST /{db}/_restore
{ "sourceTS": "<deletedTS>", "destination": "animaldb"}
For either approach, "destination" could be optional and default to the
original database name.
--
Mike.
On Wed, 18 Mar 2020, at 12:04, jiangph wrote:
> ## API
>
> 1) `DELETE /{db}`
>
> There is no change on this endpoint [4] to send DELETE against one
> database. The soft-deletion is triggered once
> [couchdb][enable_database_recovery] is set to true in configuration
> file.
>
>
> 2) `GET /{db}/_deleted_dbs_info`
>
> returning basic information of all deleted instances for the
> specified database, including when the instance was deleted.
> Parameters:
>
> db –Database name
>
> Request Headers:
>
> Content-Type –application/json
>
> Response Headers:
>
>
> Content-Type –
> application/json
>
> Status Codes:
>
> 200 OK –Request completed successfully
> 404 Not Found –Requested database not found
>
> Request:
>
> GET /db/_deleted_dbs_info HTTP/1.1
> Accept: application/json
> Host: localhost:5984
>
> Response:
>
> HTTP/1.1 200 OK
> Cache-Control: must-revalidate
> Content-Type: application/json
> {
> "total_rows": 2,
> "rows": [{
> "deleted_when": "20200318.071532",
> "info": {
> "update_seq": "0000019100b5992700000000",
> "doc_del_count": 0,
> "doc_count": 3,
> "sizes": {
> "external": 287,
> "views": 0
> }
> }
> }, {
> "deleted_when": "20200318.071703",
> "info": {
> "update_seq": "0000019105f0e29900000000",
> "doc_del_count": 0,
> "doc_count": 2,
> "sizes": {
> "external": 200,
> "views": 0
> }
> }
> }]
> }
>
>
> 3) `PUT /{db}/_restore/{deletedTS}`
>
> Restore a deleted database.
> Parameters:
>
> db –Database name
> deletedTS - timestamp when database was deleted
>
> Request Headers:
>
>
> Accept –
> application/json
> text/plain
>
> Response Headers:
>
>
> Content-Type –
> application/json
> text/plain; charset=utf-8
>
> Response JSON Object:
>
>
> ok (boolean) –Operation status. Available in case of success
> error (string) –Error type. Available if response code is 4xx
> reason (string) –Error description. Available if response code is 4xx
>
> Status Codes:
>
> 200 Restored –Database restored successfully
> 400 Bad Request –Invalid database name or deleted timestamp
> 401 Unauthorized –CouchDB Server Administrator privileges required
> 412 Precondition Failed –Database already exists
>
>
> What do you think of that? Any questions or thoughts on this? Once
> again a big acknowledgment to Nick and Paul who helped with initial
> design and provide consultation on this.
>
>
> Cheers
> Peng Hui
>
>
> [1]
> https://github.com/apache/couchdb/blob/master/src/couch/src/couch_file.erl#L251
>
> [2]
> https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L182-L184
> [3] https://activesphere.com/blog/2018/08/05/high-contention-allocator
> <https://activesphere.com/blog/2018/08/05/high-contention-allocator>
> [4]
> https://docs.couchdb.org/en/stable/api/database/common.html#delete--db
>
>