On Tue, Mar 11, 2014 at 10:24 AM, Mike Wilson <geekinu...@gmail.com> wrote:
> Undeleting things is an important use case in my opinion. We do this in > our environment on a regular basis. In that light I'm not sure that it > would be appropriate just to log the deletion and git rid of the row. I > would like to see it go to an archival table where it is easily restored. > > Although we want to *support* hard deletion, we still want to support the current behavior as well (Soft deletion, where the operator, can prune deleted rows periodically). > -Mike > > > On Mon, Mar 10, 2014 at 3:44 PM, Joshua Harlow <harlo...@yahoo-inc.com>wrote: > >> Sounds like a good idea to me. >> >> I've never understood why we treat the DB as a LOG (keeping deleted == >> 0 records around) when we should just use a LOG (or similar system) to >> begin with instead. >> >> Does anyone use the feature of switching deleted == 1 back to deleted = >> 0? Has this worked out for u? >> >> Seems like some of the feedback on >> https://etherpad.openstack.org/p/operators-feedback-mar14 also suggests >> that this has been a operational pain-point for folks (Tool to delete >> things properly suggestions and such...). >> >> From: Boris Pavlovic <bpavlo...@mirantis.com> >> Reply-To: "OpenStack Development Mailing List (not for usage questions)" >> <openstack-dev@lists.openstack.org> >> Date: Monday, March 10, 2014 at 1:29 PM >> To: OpenStack Development Mailing List <openstack-dev@lists.openstack.org>, >> Victor Sergeyev <vserge...@mirantis.com> >> Subject: [openstack-dev] [all][db][performance] Proposal: Get rid of >> soft deletion (step by step) >> >> Hi stackers, >> >> (It's proposal for Juno.) >> >> Intro: >> >> Soft deletion means that records from DB are not actually deleted, they >> are just marked as a "deleted". To mark record as a "deleted" we put in >> special table's column "deleted" record's ID value. >> >> Issue 1: Indexes & Queries >> We have to add in every query "AND deleted == 0" to get non-deleted >> records. >> It produce performance issue, cause we should add it in any index one >> "extra" column. >> As well it produce extra complexity in db migrations and building >> queries. >> >> Issue 2: Unique constraints >> Why we store ID in deleted and not True/False? >> The reason is that we would like to be able to create real DB unique >> constraints and avoid race conditions on "insert" operation. >> >> Sample: we Have table (id, name, password, deleted) we would like to >> put in column "name" only unique value. >> >> Approach without UC: if count(`select .... where name = name`) == 0: >> insert(...) >> (race cause we are able to add new record between ) >> >> Approach with UC: try: insert(...) except Duplicate: ... >> >> So to add UC we have to add them on (name, deleted). (to be able to >> make insert/delete/insert with same name) >> >> As well it produce performance issues, because we have to use Complex >> unique constraints on 2 or more columns. + extra code & complexity in db >> migrations. >> >> Issue 3: Garbage collector >> >> It is really hard to make garbage collector that will have good >> performance and be enough common to work in any case for any project. >> Without garbage collector DevOps have to cleanup records by hand, (risk >> to break something). If they don't cleanup DB they will get very soon >> performance issue. >> >> To put in a nutshell most important issues: >> 1) Extra complexity to each select query & extra column in each index >> 2) Extra column in each Unique Constraint (worse performance) >> 3) 2 Extra column in each table: (deleted, deleted_at) >> 4) Common garbage collector is required >> >> >> To resolve all these issues we should just remove soft deletion. >> >> One of approaches that I see is in step by step removing "deleted" >> column from every table with probably code refactoring. Actually we have 3 >> different cases: >> >> 1) We don't use soft deleted records: >> 1.1) Do .delete() instead of .soft_delete() >> 1.2) Change query to avoid adding extra "deleted == 0" to each query >> 1.3) Drop "deleted" and "deleted_at" columns >> >> 2) We use soft deleted records for internal stuff "e.g. periodic tasks" >> 2.1) Refactor code somehow: E.g. store all required data by periodic task >> in some special table that has: (id, type, json_data) columns >> 2.2) On delete add record to this table >> 2.3-5) similar to 1.1, 1.2, 13 >> >> 3) We use soft deleted records in API >> 3.1) Deprecated API call if it is possible >> 3.2) Make proxy call to ceilometer from API >> 3.3) On .delete() store info about records in (ceilometer, or somewhere >> else) >> 3.4-6) similar to 1.1, 1.2, 1.3 >> >> This is not ready RoadMap, just base thoughts to start the constructive >> discussion in the mailing list, so %stacker% your opinion is very >> important! >> >> >> Best regards, >> Boris Pavlovic >> >> >> _______________________________________________ >> OpenStack-dev mailing list >> OpenStack-dev@lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> >> > > _______________________________________________ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > >
_______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev