The topic of improving vacuum for use in heavy-update environments seems
to come up frequently on the list. Has anyone weighed the costs of
allowing VACUUM to reclaim tuples that are not older than the oldest
transaction but are nonetheless invisible to all running transactions?
It seems that it's not that hard....
Currently, a tuple is not eligible to be reclaimed by VACUUM unless it
was deleted by a transaction that committed before the oldest currently
running transaction started. (I.e., its xmax is known to have committed
before the oldest currently running xid started.) Right?
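For concreteness, here is a minimal C sketch of that rule. All of the
names (Xid, did_commit, reclaimable_current) are invented for
illustration and are not PostgreSQL's actual internals; xid wraparound
is ignored throughout these sketches:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t Xid;

    /* Toy commit log: an xid is "committed" here if it is at or
     * below a high-water mark; the real system consults pg_clog. */
    static Xid last_committed = 1500;
    static bool did_commit(Xid x) { return x <= last_committed; }

    /* Current rule: a tuple is reclaimable only if its xmax
     * committed and precedes the oldest currently running xid. */
    static bool reclaimable_current(Xid xmax, Xid oldest_xmin)
    {
        return did_commit(xmax) && xmax < oldest_xmin;
    }

    int main(void)
    {
        Xid xmax = 1201;

        /* A long-running xid of 1000 holds oldest_xmin down: stuck. */
        printf("oldest_xmin 1000: %d\n", reclaimable_current(xmax, 1000));
        /* If that xid could be left out, oldest_xmin might be 2000. */
        printf("oldest_xmin 2000: %d\n", reclaimable_current(xmax, 2000));
        return 0;
    }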
However, it seems that under certain scenarios (heavy updates to small
tables while a long-running transaction is open) there might be a lot
of tuples that are invisible to all transactions but still cannot be
vacuumed under the current method. Example: updating a single row over
and over again while pg_dump is running.
Suppose that in the system, we have a serializable transaction with xid
1000 and a read committed transaction with xid 1001. Other than these
two, the oldest running xid is 2000.
Suppose we consider a tuple with xmin 1200 and xmax 1201. We will
assume that xid 1201 committed before xid 2000 began to run.
So:
(A) This tuple is invisible to the serializable transaction, since its
snapshot can't ever advance.
(B) The read committed transaction might be able to see it. However, if
its current command started AFTER xid 1201 committed, it can't.
Unless I'm missing something, it seems that when vacuuming you can
leave serializable transactions (like pg_dump) out of the calculation
of the "oldest running transaction," so long as you keep a list of them
and check each tuple T against each serializable transaction X to make
sure that either T's xmin is greater than X's xid, or T's xmax
committed before X started to run. Of course this is "a lot" of work,
but it should mitigate the effect of long-running serializable
transactions until processor power becomes your limiting factor.
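A rough sketch of that per-tuple test, with the same hypothetical
names as above. One caveat: "xmax committed before X started" really
needs commit-time ordering, which the plain xid comparison below only
approximates:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    typedef uint32_t Xid;

    static Xid last_committed = 1500;   /* toy commit log, as above */
    static bool did_commit(Xid x) { return x <= last_committed; }

    /* Dead to serializable transaction X if inserted after X's
     * snapshot was taken (xmin > X), or if the delete committed
     * before X started (approximated here by a commit check plus
     * xmax < X). */
    static bool dead_to_serializable(Xid xmin, Xid xmax, Xid serial_xid)
    {
        if (xmin > serial_xid)
            return true;
        return did_commit(xmax) && xmax < serial_xid;
    }

    /* Reclaimable with respect to the serializable list only if the
     * tuple is dead to every transaction on it. */
    static bool dead_to_all_serializable(Xid xmin, Xid xmax,
                                         const Xid *serials, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if (!dead_to_serializable(xmin, xmax, serials[i]))
                return false;
        return true;
    }

    int main(void)
    {
        Xid serials[] = { 1000 };   /* the pg_dump-style transaction */

        /* The example tuple: xmin 1200, xmax 1201. Dead to xid 1000
         * because it was inserted after that snapshot was taken. */
        printf("dead to all serializable? %d\n",
               dead_to_all_serializable(1200, 1201, serials, 1));
        return 0;
    }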
The read committed ones are a more difficult matter, but I think you
can treat a tuple as dead if it was inserted after the read committed
transaction started to run AND deleted before that transaction's
currently running command started. I suppose the major difficulty here
is that a transaction currently has no way of knowing when another
backend's command started to run?
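Assuming vacuum could somehow learn each backend's current command
snapshot (the open question above), the test might look like this
sketch. The struct and field names are invented, and the command
snapshot is simplified to "the newest xid whose commit the current
command can see":

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t Xid;

    static Xid last_committed = 1500;   /* toy commit log, as above */
    static bool did_commit(Xid x) { return x <= last_committed; }

    typedef struct {
        Xid xid;           /* the read committed transaction's xid */
        Xid cmd_snapshot;  /* newest xid visible as committed to the
                              currently running command */
    } ReadCommittedXact;

    /* The condition from the text: inserted after R started AND the
     * delete committed before R's current command took its snapshot. */
    static bool dead_to_read_committed(Xid xmin, Xid xmax,
                                       const ReadCommittedXact *r)
    {
        return xmin > r->xid
            && did_commit(xmax)
            && xmax <= r->cmd_snapshot;
    }

    int main(void)
    {
        /* The example above: read committed xid 1001 whose current
         * command started after xid 1201 committed. */
        ReadCommittedXact r = { 1001, 1300 };
        Xid xmin = 1200, xmax = 1201;

        printf("dead to xid 1001? %s\n",
               dead_to_read_committed(xmin, xmax, &r) ? "yes" : "no");
        return 0;
    }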
Is this too difficult to do or is it a good idea that no one has enough
'round tuits for?
Regards,
Paul Tillotson