We have reached a point, in both the size and utilization of a set of our 
Postgres 8.1.17 databases, where a vacuumdb -a -z -U postgres is still running 
after 50 hours and we have to kill it off because the additional I/O is causing 
queries to stack up.  We have archived off as much data as possible (100+ GB), 
and we are looking at moving to a better hardware configuration to support the 
increased system activity, but for now I need to know whether there is a way to 
avoid "transaction ID wraparound" if we don't issue a vacuumdb -a to update the 
datfrozenxid value.

This is a logging-type database where data is initially loaded and then "rolled 
up" into weekly/monthly tables.  Many of the tables in the DB don't last more 
than seven days, and after a week the majority of the historical tables are 
static; they stay around until they are finally dropped (based on a retention 
period), so re-running a vacuum on these older static tables doesn't gain much, 
since the data never changes.
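
If a one-time pass per table were sufficient, the kind of thing I picture 
running once a table goes static is (table name is hypothetical):

VACUUM FREEZE log_2009_week12;  -- FREEZE replaces committed XIDs with FrozenXID right away

instead of sweeping the whole cluster.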

I have read (and re-read) the documentation and am a bit confused about exactly 
what needs to happen here:

Since periodic VACUUM runs are needed anyway for the reasons described earlier, 
it's unlikely that any table would not be vacuumed for as long as a billion 
transactions. But to help administrators ensure this constraint is met, VACUUM 
stores transaction ID statistics in the system table pg_database. In 
particular, the datfrozenxid column of a database's pg_database row is updated 
at the completion of any database-wide VACUUM operation (i.e., VACUUM that does 
not name a specific table). The value stored in this field is the freeze cutoff 
XID that was used by that VACUUM command. All normal XIDs older than this 
cutoff XID are guaranteed to have been replaced by FrozenXID within that 
database. A convenient way to examine this information is to execute the query

SELECT datname, age(datfrozenxid) FROM pg_database;

The age column measures the number of transactions from the cutoff XID to the 
current transaction's XID.
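
If I follow, then a check like this (using the billion-transaction figure from 
the passage above) would flag any database that is overdue:

SELECT datname, age(datfrozenxid)
  FROM pg_database
 WHERE age(datfrozenxid) > 1000000000;  -- past the billion-transaction guideline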

So if, once a table is no longer being added to (i.e., it has become static), I 
run a VACUUM against it, the table wouldn't need to be vacuumed again, since the 
tuples and their related transaction IDs never change?  And is there a way to 
set up a vacuum scheme on a table-by-table basis that accomplishes the same goal 
as an all-in-one vacuumdb -a run, but can be spread out over time instead of 
relying on the completion of a single vacuumdb -a command?
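
To make that concrete, here is a sketch of how I would generate the per-table 
commands (the schema filter is just an assumption about our layout), feeding 
the output to psql in batches spread over several nights:

SELECT 'VACUUM ANALYZE ' || schemaname || '.' || tablename || ';'
  FROM pg_tables
 WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

The open question is whether a complete sweep of individual VACUUMs like this 
ever advances datfrozenxid the way a single database-wide vacuumdb -a does.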
