Hi stackers, (It's proposal for Juno.)
Intro: Soft deletion means that records from DB are not actually deleted, they are just marked as a "deleted". To mark record as a "deleted" we put in special table's column "deleted" record's ID value. Issue 1: Indexes & Queries We have to add in every query "AND deleted == 0" to get non-deleted records. It produce performance issue, cause we should add it in any index one "extra" column. As well it produce extra complexity in db migrations and building queries. Issue 2: Unique constraints Why we store ID in deleted and not True/False? The reason is that we would like to be able to create real DB unique constraints and avoid race conditions on "insert" operation. Sample: we Have table (id, name, password, deleted) we would like to put in column "name" only unique value. Approach without UC: if count(`select .... where name = name`) == 0: insert(...) (race cause we are able to add new record between ) Approach with UC: try: insert(...) except Duplicate: ... So to add UC we have to add them on (name, deleted). (to be able to make insert/delete/insert with same name) As well it produce performance issues, because we have to use Complex unique constraints on 2 or more columns. + extra code & complexity in db migrations. Issue 3: Garbage collector It is really hard to make garbage collector that will have good performance and be enough common to work in any case for any project. Without garbage collector DevOps have to cleanup records by hand, (risk to break something). If they don't cleanup DB they will get very soon performance issue. To put in a nutshell most important issues: 1) Extra complexity to each select query & extra column in each index 2) Extra column in each Unique Constraint (worse performance) 3) 2 Extra column in each table: (deleted, deleted_at) 4) Common garbage collector is required To resolve all these issues we should just remove soft deletion. One of approaches that I see is in step by step removing "deleted" column from every table with probably code refactoring. Actually we have 3 different cases: 1) We don't use soft deleted records: 1.1) Do .delete() instead of .soft_delete() 1.2) Change query to avoid adding extra "deleted == 0" to each query 1.3) Drop "deleted" and "deleted_at" columns 2) We use soft deleted records for internal stuff "e.g. periodic tasks" 2.1) Refactor code somehow: E.g. store all required data by periodic task in some special table that has: (id, type, json_data) columns 2.2) On delete add record to this table 2.3-5) similar to 1.1, 1.2, 13 3) We use soft deleted records in API 3.1) Deprecated API call if it is possible 3.2) Make proxy call to ceilometer from API 3.3) On .delete() store info about records in (ceilometer, or somewhere else) 3.4-6) similar to 1.1, 1.2, 1.3 This is not ready RoadMap, just base thoughts to start the constructive discussion in the mailing list, so %stacker% your opinion is very important! Best regards, Boris Pavlovic
_______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev