On 02/27/2012 09:45 AM, Shaun Thomas wrote:
On 02/27/2012 02:08 AM, Reuven M. Lerner wrote:

In the end, it was agreed that we could execute the deletes over
time, deleting items in the background, or in parallel with the
application's work. After all, if the disk is filling up at the rate
of 2 GB/day, then so long as we delete 4 GB/day (which is pretty easy
to do), we should be fine.

Please tell me you understand deleting rows from a PostgreSQL database doesn't work like this. :) The MVCC storage system means you'll basically just be marking all those deleted rows as reusable, so your database will stop growing, but you'll eventually want to purge all the accumulated dead rows.

One way to see how many there are is to use the pgstattuple contrib module. You can just call it on the table name in question:

SELECT * FROM pgstattuple('my_table');

You may find that after your deletes are done, you'll have a free_pct of 80+%. In order to get rid of all that, you'll need to either run CLUSTER on your table(s) or use the select->truncate->insert method anyway.


If he has autovacuum on he could well be just fine with his proposed strategy. Or he could have tables partitioned by time and do the delete by just dropping partitions. There are numerous way he could get this to work.

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to