Sounds like a good idea to me.

I've never understood why we treat the DB as a LOG (keeping deleted == 0 
records around) when we should just use a LOG (or similar system) to begin with 
instead.

Does anyone use the feature of switching deleted == 1 back to deleted = 0? Has 
this worked out for u?

Seems like some of the feedback on 
https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests that 
this has been a operational pain-point for folks (Tool to delete things 
properly suggestions and such…).

From: Boris Pavlovic <bpavlo...@mirantis.com<mailto:bpavlo...@mirantis.com>>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org<mailto:openstack-dev@lists.openstack.org>>
Date: Monday, March 10, 2014 at 1:29 PM
To: OpenStack Development Mailing List 
<openstack-dev@lists.openstack.org<mailto:openstack-dev@lists.openstack.org>>, 
Victor Sergeyev <vserge...@mirantis.com<mailto:vserge...@mirantis.com>>
Subject: [openstack-dev] [all][db][performance] Proposal: Get rid of soft 
deletion (step by step)

Hi stackers,

(It's proposal for Juno.)

Intro:

Soft deletion means that records from DB are not actually deleted, they are 
just marked as a "deleted". To mark record as a "deleted" we put in special 
table's column "deleted" record's ID value.

Issue 1: Indexes & Queries
We have to add in every query "AND deleted == 0" to get non-deleted records.
It produce performance issue, cause we should add it in any index one "extra" 
column.
As well it produce extra complexity in db migrations and building queries.

Issue 2: Unique constraints
Why we store ID in deleted and not True/False?
The reason is that we would like to be able to create real DB unique 
constraints and avoid race conditions on "insert" operation.

Sample: we Have table (id, name, password, deleted) we would like to put in 
column "name" only unique value.

Approach without UC: if count(`select .... where name = name`) == 0: insert(...)
(race cause we are able to add new record between )

Approach with UC: try: insert(...) except Duplicate: ...

So to add UC we have to add them on (name, deleted). (to be able to make 
insert/delete/insert with same name)

As well it produce performance issues, because we have to use Complex unique 
constraints on 2  or more columns. + extra code & complexity in db migrations.

Issue 3: Garbage collector

It is really hard to make garbage collector that will have good performance and 
be enough common to work in any case for any project.
Without garbage collector DevOps have to cleanup records by hand, (risk to 
break something). If they don't cleanup DB they will get very soon performance 
issue.

To put in a nutshell most important issues:
1) Extra complexity to each select query & extra column in each index
2) Extra column in each Unique Constraint (worse performance)
3) 2 Extra column in each table: (deleted, deleted_at)
4) Common garbage collector is required


To resolve all these issues we should just remove soft deletion.

One of approaches that I see is in step by step removing "deleted" column from 
every table with probably code refactoring.  Actually we have 3 different cases:

1) We don't use soft deleted records:
1.1) Do .delete() instead of .soft_delete()
1.2) Change query to avoid adding extra "deleted == 0" to each query
1.3) Drop "deleted" and "deleted_at" columns

2) We use soft deleted records for internal stuff "e.g. periodic tasks"
2.1) Refactor code somehow: E.g. store all required data by periodic task in 
some special table that has: (id, type, json_data) columns
2.2) On delete add record to this table
2.3-5) similar to 1.1, 1.2, 13

3) We use soft deleted records in API
3.1) Deprecated API call if it is possible
3.2) Make proxy call to ceilometer from API
3.3) On .delete() store info about records in (ceilometer, or somewhere else)
3.4-6) similar to 1.1, 1.2, 1.3

This is not ready RoadMap, just base thoughts to start the constructive 
discussion in the mailing list, so %stacker% your opinion is very important!


Best regards,
Boris Pavlovic
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to