On Fri, Nov 18, 2016 at 1:11 PM, Robert Haas <robertmh...@gmail.com> wrote: > So now I think that we probably need to make this logic a bit smarter. > Add all of the OIDs that need to be dropped to a list. Then have a > loop prior to the main loop (where it says "Perform operations on > collected tables.") which iterates over that list and drops those > tables one by one, starting a transaction every (say) 100 tables or > after an error. For bonus points, if a transaction fails, put all of > the OIDs except the one that provoked the failure back into the list > of OIDs to be dropped, so that we still make a progress even if some > DROPs are failing for some reason.
Okay. > That might sound adding unnecessary work just for the sake of > paranoia, but I don't think it is. Failures here won't be common, but > since they are entirely automated there will be no human intelligence > available to straighten things out. Barring considerable caution, > we'll just enter a flaming death spiral. Thinking more paranoid, an extra way to enter in this flaming death spiral is to not limit the maximum number of failures authorized when dropping a set of orphaned tables and transactions fail multiple times. This is basically not important as the relation on which the drop failed gets dropped from the list but failing on each one of them is a good way to slow down autovacuum, so putting a limit of say 10 transactions failing is I think really important. I have played with what you suggested, and finished with the patch attached. I have run some tests using this function to create some temp tables with several backends to be sure that multiple backend IDs are used: CREATE FUNCTION create_temp_tables(i int) RETURNS void AS $$ BEGIN FOR i IN 1..i LOOP EXECUTE 'CREATE TEMP TABLE aa' || i || ' (a int);'; END LOOP; END $$ LANGUAGE plpgsql; Then I killed the instance. At restart I could see a bunch of temp tables in pg_class, and I let autovacuum do the cleanup after restart. I have tested as well the error code path in the PG_TRY() block by enforcing manually a elog(ERROR) to be sure that the maximum number of failures is correctly handled, better safe than sorry. -- Michael
autovacuum-orphan-cleanup-v3.patch
Description: application/download
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers