Hi Everyone,
Instead of automatically and immediately removing data and indexes from the
database after a delete operation, soft-deletion allows deleted data to be
restored to its original state after a "fat finger" or otherwise undesired
delete operation, within a defined period such as 48 hours.
In CouchDB 3.0, soft-deletion of databases is implemented in [1]. When
soft-deletion is enabled, a delete renames the .couch file to
.<timestamp>.deleted.couch, and that file can be renamed back to .couch to
restore the database. If no restore is needed and the specified period has
passed, the .<timestamp>.deleted.couch file can be deleted to remove the
database permanently.
In CouchDB 4.0, with the introduction of FoundationDB, the data model and
storage have changed. To support soft-deletion there, we propose the solution
below and then plan to implement it.
## Proposed Data model change
In CouchDB 4.0, FoundationDB directories and indirect access are already used
to build a better data model. One key/value pair holds the reference from
`DbKey` to `DbPrefix`, and all other key/value pairs are based on `DbPrefix`
instead of `DbKey`. This decouples the database name from the data in the
database. The current implementation of `DbKey -> DbPrefix` is in [2], so you
can see the following layout in FoundationDB using fdbcli or a similar tool:
```
{?ALL_DBS, DbName} -> {?DBS, DbName}
{?DBS, DbName, other part of key} -> <value>
```
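
To make the indirection concrete, here is a minimal in-memory sketch in Python. It models the FoundationDB key space as a plain dict keyed by tuples. The constants `ALL_DBS` and `DBS` and the helper names are stand-ins chosen for illustration; this is not the actual fabric2_fdb code.

```python
# Toy model of the current layout: a dict stands in for the FoundationDB key space.
ALL_DBS = "all_dbs"   # hypothetical stand-in for ?ALL_DBS
DBS = "dbs"           # hypothetical stand-in for ?DBS

def create_db(kv, db_name):
    """Write the single DbKey -> DbPrefix pair for a new database."""
    db_key = (ALL_DBS, db_name)
    db_prefix = (DBS, db_name)
    kv[db_key] = db_prefix
    return db_prefix

def put(kv, db_name, rest_of_key, value):
    """Store a value under the database's prefix, resolved through DbKey."""
    db_prefix = kv[(ALL_DBS, db_name)]
    kv[db_prefix + rest_of_key] = value

def get(kv, db_name, rest_of_key):
    """Read a value by first following DbKey -> DbPrefix."""
    db_prefix = kv[(ALL_DBS, db_name)]
    return kv[db_prefix + rest_of_key]

kv = {}
create_db(kv, "mydb")
put(kv, "mydb", ("docs", "doc1"), b"{}")
```

Every read and write goes through the one `DbKey` entry, which is why changing only that entry is enough to hide or re-expose a whole database.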
To support soft-deletion, and in particular to allow one database to be
deleted and re-created multiple times, we need a different `DbPrefix` each
time for the same `DbKey`/`DbName`. The proposed change is to use a unique
value allocated by the High Contention Allocator (HCA) algorithm described in
[3].
```
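% Allocate a short, unique prefix for this database instance via the HCA.
DbPrefixAllocator = erlfdb_hca:create(?ERLFDB_EXTEND(DbId, <<"hca">>)),
DbPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
% Point the database key at the newly allocated prefix.
erlfdb:set(Tx, DbKey, DbPrefix),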
```
The data in FoundationDB looks like:
```
{?ALL_DBS, DbName} -> <unique key allocated by hca>
{<unique key allocated by hca>, other part of key} -> <value>
```
Using the HCA algorithm, a unique key can be acquired quickly while avoiding
conflicts. More importantly, the key is short, which saves space because
`DbPrefix` appears in almost every key/value pair of the database.
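
As a rough illustration of why delete/re-create cycles get distinct prefixes, the following Python sketch uses a toy allocator. It is a simplified, single-process imitation of the windowed idea behind the HCA (random candidates inside a moving window), not the erlfdb_hca implementation; the class name, window size, and the `"all_dbs"` key are assumptions made for the example.

```python
import random

class ToyAllocator:
    """Simplified imitation of a high-contention allocator (illustration only).

    Candidates are picked at random inside a moving window so that concurrent
    callers would rarely collide; the window advances once it is half full.
    """

    def __init__(self, window_size=64, seed=None):
        self.start = 0
        self.window_size = window_size
        self.used = set()               # candidates taken in the current window
        self.rng = random.Random(seed)

    def allocate(self):
        # Advance the window once half of it has been handed out.
        if len(self.used) * 2 >= self.window_size:
            self.start += self.window_size
            self.used.clear()
        while True:
            candidate = self.rng.randrange(self.start, self.start + self.window_size)
            if candidate not in self.used:
                self.used.add(candidate)
                return candidate

allocator = ToyAllocator(seed=1)
kv = {}

# Creating "mydb" twice yields two different prefixes for the same name.
first_prefix = allocator.allocate()
kv[("all_dbs", "mydb")] = first_prefix
del kv[("all_dbs", "mydb")]             # delete step simplified for this sketch
second_prefix = allocator.allocate()
kv[("all_dbs", "mydb")] = second_prefix
```

Because the allocated values are small integers, the prefix stays short even though it is repeated in almost every key of the database.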
## Soft-deletion, restore and permanent-deletion
When a database is soft-deleted, the only action is to change the
`DbKey -> DbPrefix` pair. All other data for this database is left unchanged.
To keep namespace management clear, the proposal is to move `DbKey` from
`?ALL_DBS` to `?DELETED_DBS`. The timestamp of the deletion is added to
`DbKey` so that we know when the data in this database can be permanently
deleted. The `DbKey -> DbPrefix` pair is changed to
```
{?DELETED_DBS, DbName, TimeStamp} -> <unique key allocated by hca>
```
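
The key move can be sketched in Python against an in-memory dict. This is an illustration under stated assumptions: the constant names are stand-ins, the prefix `17` stands in for an HCA-allocated value, and the timestamp format is assumed to match the `deleted_when` values shown in the API section.

```python
from datetime import datetime, timezone

ALL_DBS = "all_dbs"           # hypothetical stand-in for ?ALL_DBS
DELETED_DBS = "deleted_dbs"   # hypothetical stand-in for ?DELETED_DBS

def soft_delete(kv, db_name, now=None):
    """Move the DbKey -> DbPrefix pair from ALL_DBS to DELETED_DBS."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%d.%H%M%S")   # assumed timestamp format
    db_prefix = kv.pop((ALL_DBS, db_name))      # KeyError if the database does not exist
    kv[(DELETED_DBS, db_name, timestamp)] = db_prefix
    return timestamp

kv = {
    (ALL_DBS, "mydb"): 17,          # 17 stands in for an HCA-allocated prefix
    (17, "docs", "doc1"): b"{}",    # data stored under that prefix
}
ts = soft_delete(kv, "mydb", now=datetime(2020, 3, 18, 7, 15, 32, tzinfo=timezone.utc))
```

Only one entry moves; the data stored under the prefix is not touched.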
A background task eventually clears the ranges. Depending on the setting for
how long soft-deleted databases are kept, such as 48 hours, the background
task checks the `?DELETED_DBS` namespace, finds eligible key/value pairs,
deletes the data associated with each `DbPrefix`, and finally deletes the
`DbKey -> DbPrefix` pair itself.
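
One pass of such a background task might look like the following Python sketch over an in-memory dict. The function name, the retention default, and the timestamp format are assumptions for illustration; a real implementation would clear FoundationDB ranges inside transactions.

```python
from datetime import datetime, timedelta, timezone

DELETED_DBS = "deleted_dbs"     # hypothetical stand-in for ?DELETED_DBS
TS_FORMAT = "%Y%m%d.%H%M%S"     # assumed timestamp format

def purge_expired(kv, now, retention=timedelta(hours=48)):
    """Permanently remove soft-deleted databases older than the retention period."""
    purged = []
    for key in list(kv):        # iterate over a snapshot, since we delete as we go
        if key[0] != DELETED_DBS:
            continue
        _, db_name, ts = key
        deleted_at = datetime.strptime(ts, TS_FORMAT).replace(tzinfo=timezone.utc)
        if now - deleted_at < retention:
            continue
        db_prefix = kv[key]
        # Clear every key under the prefix first, then the DbKey pair itself.
        for data_key in [k for k in kv if k[0] == db_prefix]:
            del kv[data_key]
        del kv[key]
        purged.append((db_name, ts))
    return purged

kv = {
    (DELETED_DBS, "mydb", "20200318.071532"): 17,
    (17, "docs", "doc1"): b"{}",
    (DELETED_DBS, "mydb", "20200320.120000"): 18,
    (18, "docs", "doc1"): b"{}",
}
purged = purge_expired(kv, now=datetime(2020, 3, 21, tzinfo=timezone.utc))
```

Deleting the data before the `DbKey` pair means an interrupted run leaves the pair in place, so a later pass can find it and retry.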
Over time, the same database may be deleted several times. The
`_deleted_dbs_info` endpoint is proposed to list information about all deleted
instances of the specified database, including deletion timestamp, document
counts, and disk size. This allows users to identify which instance to
restore, and it also provides information for billing. Within a given period,
such as 48 hours, the number of deletions of the same database is most likely
small, so the API is designed to list all instances at once using the GET
method with a query parameter.
After deciding which instance to restore, users can call the `_restore`
endpoint with `deletedTS` to restore the database. The underlying logic is to
change `DbKey -> DbPrefix` back to
```
{?ALL_DBS, DbName} -> <unique key allocated by hca>
```
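
The restore step is the mirror image of soft-deletion. A minimal Python sketch over an in-memory dict follows; the constant names, the function name, and the choice of exceptions are assumptions for illustration only.

```python
ALL_DBS = "all_dbs"           # hypothetical stand-in for ?ALL_DBS
DELETED_DBS = "deleted_dbs"   # hypothetical stand-in for ?DELETED_DBS

def restore(kv, db_name, deleted_ts):
    """Move a soft-deleted DbKey -> DbPrefix pair back under ALL_DBS."""
    if (ALL_DBS, db_name) in kv:
        # Corresponds to "412 Precondition Failed" in the proposed API.
        raise FileExistsError("database already exists")
    deleted_key = (DELETED_DBS, db_name, deleted_ts)
    if deleted_key not in kv:
        raise KeyError("no such deleted instance")
    kv[(ALL_DBS, db_name)] = kv.pop(deleted_key)

kv = {
    (DELETED_DBS, "mydb", "20200318.071532"): 17,
    (DELETED_DBS, "mydb", "20200318.071703"): 18,
    (17, "docs", "doc1"): b"{}",
}
restore(kv, "mydb", "20200318.071532")
```

After the call, `mydb` resolves to prefix 17 again, while the other deleted instance stays under `DELETED_DBS` until it expires.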
Because these are sensitive actions, the `_deleted_dbs_info` and `_restore`
endpoints are meant to be admin-only, so that only authorized users can
restore a database.
## View index and search index
Although the view index and search index are based on `DbPrefix`, a change in
the value of `DbPrefix` has no impact on how they are stored or queried,
because `DbPrefix` is an opaque value. When a database is soft-deleted, the
`DbKey -> DbPrefix` pair is changed, so any access to the view index or search
index is rejected with a `"Database does not exist."` error. This is expected.
The only thing we need to take care of is stopping all indexing and pending
requests for the soft-deleted database.
## API
1) `DELETE /{db}`
This endpoint [4], which sends DELETE against a database, is unchanged.
Soft-deletion is triggered when `enable_database_recovery` in the `[couchdb]`
section of the configuration file is set to true.
2) `GET /{db}/_deleted_dbs_info`
Returns basic information about all deleted instances of the specified
database, including when each instance was deleted.
Parameters:
  db – Database name
Request Headers:
  Content-Type – application/json
Response Headers:
  Content-Type – application/json
Status Codes:
  200 OK – Request completed successfully
  404 Not Found – Requested database not found
Request:
  GET /db/_deleted_dbs_info HTTP/1.1
  Accept: application/json
  Host: localhost:5984
Response:
  HTTP/1.1 200 OK
  Cache-Control: must-revalidate
  Content-Type: application/json

  {
    "total_rows": 2,
    "rows": [{
      "deleted_when": "20200318.071532",
      "info": {
        "update_seq": "0000019100b5992700000000",
        "doc_del_count": 0,
        "doc_count": 3,
        "sizes": {
          "external": 287,
          "views": 0
        }
      }
    }, {
      "deleted_when": "20200318.071703",
      "info": {
        "update_seq": "0000019105f0e29900000000",
        "doc_del_count": 0,
        "doc_count": 2,
        "sizes": {
          "external": 200,
          "views": 0
        }
      }
    }]
  }
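
A client could pick the instance to restore from this response along the following lines. This Python sketch only parses a response body shaped like the example above; the function name is hypothetical, and the timestamp format is inferred from the sample `deleted_when` values.

```python
import json
from datetime import datetime

def latest_deleted_instance(body):
    """Return the `deleted_when` value of the most recently deleted instance."""
    rows = json.loads(body)["rows"]
    newest = max(
        rows,
        key=lambda row: datetime.strptime(row["deleted_when"], "%Y%m%d.%H%M%S"),
    )
    return newest["deleted_when"]

# Abbreviated copy of the example response body.
sample = """{"total_rows": 2, "rows": [
  {"deleted_when": "20200318.071532", "info": {"doc_count": 3}},
  {"deleted_when": "20200318.071703", "info": {"doc_count": 2}}
]}"""
chosen = latest_deleted_instance(sample)
```

The returned string is the value that would be passed as `deletedTS` to the `_restore` endpoint.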
3) `PUT /{db}/_restore/{deletedTS}`
Restores a deleted database.
Parameters:
  db – Database name
  deletedTS – Timestamp when the database was deleted
Request Headers:
  Accept –
    application/json
    text/plain
Response Headers:
  Content-Type –
    application/json
    text/plain; charset=utf-8
Response JSON Object:
  ok (boolean) – Operation status. Available in case of success
  error (string) – Error type. Available if response code is 4xx
  reason (string) – Error description. Available if response code is 4xx
Status Codes:
  200 OK – Database restored successfully
  400 Bad Request – Invalid database name or deleted timestamp
  401 Unauthorized – CouchDB Server Administrator privileges required
  412 Precondition Failed – Database already exists
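
For illustration, a restore call could be assembled from Python's standard library as below. The endpoint is only a proposal at this point, so the sketch builds the request without sending it; the helper name is hypothetical, and admin credentials are left out.

```python
from urllib.parse import quote
from urllib.request import Request

def build_restore_request(base_url, db_name, deleted_ts):
    """Build (but do not send) a PUT request for the proposed _restore endpoint."""
    url = f"{base_url}/{quote(db_name, safe='')}/_restore/{quote(deleted_ts, safe='')}"
    return Request(url, method="PUT", headers={"Accept": "application/json"})

req = build_restore_request("http://localhost:5984", "db", "20200318.071703")
```

Passing `req` to `urllib.request.urlopen` would send it; a real client would also need to supply server admin credentials.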
What do you think? Any questions or thoughts on this are welcome. Once again,
a big thank-you to Nick and Paul, who helped with the initial design and
provided consultation on this.
Cheers
Peng Hui
[1] https://github.com/apache/couchdb/blob/master/src/couch/src/couch_file.erl#L251
[2] https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L182-L184
[3] https://activesphere.com/blog/2018/08/05/high-contention-allocator
[4] https://docs.couchdb.org/en/stable/api/database/common.html#delete--db