On 12/11/2012 08:11 AM, Michael Sawyers wrote:
Our dba quit last week leaving me with an interesting problem.

We have a table currently using 33gb worth of space for only 152mb worth of
data because of bad processes or autovacuum not being aggressive enough. I
was able to confirm the size difference by doing a create table as select
along with a second test of restoring the table from the dump file to a dev
machine.



There is a very large list of foreign key relationships that I'm not
including for the sake of brevity.

The database version is 8.4.1

The old DBA had said that vacuum full would take days to complete, and we
don't have that much of a window.  So I was considering using the  to force
a full table rewrite.  In testing on a dev machine it only took about five
minutes.

I do not have as much hands on experience with postgres so I wanted to get
thoughts on what is considered the proper way to deal with this kind of
situation.

Any comments would be welcome.
It's true that vacuum-full has historically been very slow as it rewrites data in-place but you can use CLUSTER which is *WAY* faster since it rewrites the data to a new file which it then drops back in place. CLUSTER also gives you new clean indexes. In fact the current versions of PostgreSQL basically use the CLUSTER process to handle VACUUM FULL. (I just don't recall which version applied that change.)

How long did your test copy/dump/restore steps take? This will give you a *rough* idea of the time that CLUSTER will take.

Really the only downside of CLUSTER is the requirement that you have enough available space to write the new copy of the table/indexes.

Another option if you can't tolerate downtime is pg_reorg (http://pgfoundry.org/projects/reorg/) or its fork pg_repack (https://github.com/reorg/pg_repack) which essentially do the copy to a new table in the background then briefly lock the tables to apply any recent changes from the master before dropping the copy into place.

Cheers,
Steve



--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Reply via email to