On 01-09-11 14:22, Scott Marlowe wrote:
On Thu, Sep 1, 2011 at 6:05 AM, Rik Bellens <rik.bell...@telin.ugent.be> wrote:
On 01-09-11 13:31, Scott Marlowe wrote:
On Thu, Sep 1, 2011 at 4:32 AM, Rik Bellens <rik.bell...@telin.ugent.be> wrote:
Hello,

I have two large tables in a database, one containing original data and the
other derived from the first. The first table has several columns and
indexes, while the second has fewer columns and only one index. Both tables
have the same number of rows. Nevertheless, the second table is much larger
on disk than the first one. How can this be explained?
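
For reference, the on-disk footprint of each table (heap plus indexes) can be
compared with the built-in size functions, which exist in 8.3; a minimal
sketch, using the table names that appear later in this thread:

    -- Compare heap size vs. total size (heap + indexes + TOAST)
    SELECT relname,
           pg_size_pretty(pg_relation_size(oid))       AS heap_size,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM pg_class
    WHERE relname IN ('measurement', 'stats_count');

pg_total_relation_size includes indexes and TOAST data, which is often where
the unexpected extra space lives.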
This is most likely due to table bloat.  In PostgreSQL, when you update
or delete a row, a dead version of the row gets left behind.  Vacuum
eventually comes along and marks that empty space as reusable.  If you
delete / update a LOT of rows at once, you'll have a lot of dead rows
whose space can only be reused, after vacuuming, by inserts and updates
that come later.
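
How many dead row versions a table currently carries can be read from the
statistics collector; a minimal sketch, assuming the stats_count table from
this thread (n_dead_tup exists as of 8.3):

    -- Dead row versions not yet reclaimed by (auto)vacuum
    SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'stats_count';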

A few salient questions: What version of PostgreSQL are you running?
Is autovacuum running? Do you do a LOT of bulk deletes / updates?  If
you do a lot of bulk deletes on this table, and you delete everything,
can you switch to using the TRUNCATE command instead?
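
For illustration, emptying a table with TRUNCATE returns the space
immediately, whereas DELETE leaves a dead version of every row behind for
vacuum to deal with:

    -- Instead of: DELETE FROM stats_count;  (leaves one dead row per deleted row)
    TRUNCATE TABLE stats_count;  -- drops the table's files and reclaims the space at once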
I use version 8.3. I see the 'autovacuum launcher process' and 'autovacuum
worker process' in the process list, so I suppose autovacuum is running.

Rows in the measurement table are added once and never deleted or updated.
Adding a row to this table fires a trigger function that adds a row to the
stats_count table. Rows are normally added chronologically, so rows in the
stats_count table are normally not updated either. If, however, a measurement
with an older timestamp is added for some reason, all rows of that device
that come after that time are updated, but I don't think this happens very
often.
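
A trigger like the one described might look roughly as follows; this is a
hypothetical sketch, since the actual function, column names, and counting
logic are not shown in the thread:

    -- Hypothetical sketch; the column names (device_id, measured_at) are
    -- assumptions, and the real counting logic is omitted.
    CREATE OR REPLACE FUNCTION update_stats_count() RETURNS trigger AS $$
    BEGIN
        -- Add a matching row to the derived table for each new measurement.
        INSERT INTO stats_count (device_id, measured_at)
        VALUES (NEW.device_id, NEW.measured_at);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER measurement_stats
        AFTER INSERT ON measurement
        FOR EACH ROW EXECUTE PROCEDURE update_stats_count();

Every out-of-order measurement that forces later stats_count rows to be
updated leaves one dead row version per updated row, which is exactly how
bloat accumulates.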

The table 'stats_count' was created at a later stage, so the first 45M rows
were added at once and in chronological order. However, because the function
that initializes this table took a long time and the client application
crashed a few times, I had to restart it several times. Can it be that there
is some garbage left over from running this function several times without it
finishing? Would running 'VACUUM FULL' be a solution to reclaim some disk
space?
Yeah, could be.  Take a look at this page:
http://wiki.postgresql.org/wiki/Show_database_bloat and see if the
query there sheds some light on your situation.
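
If the bloat query confirms a lot of wasted space, reclaiming it on 8.3 would
look something like this; note that VACUUM FULL takes an exclusive lock on
the table for the duration:

    -- Rewrites the table to compact out dead space; blocks all other access
    -- to the table while it runs.
    VACUUM FULL VERBOSE stats_count;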
Thanks for this answer.

If I run the query, I get 12433752064 wasted bytes (roughly 11.6 GB) on stats_count_pkey, so I suppose that is the reason.
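
Worth noting: since the wasted space reported here is on an index
(stats_count_pkey) rather than on the table itself, VACUUM FULL will not help
much; on 8.3 it does not shrink indexes and can even bloat them further.
Rebuilding the index is the usual remedy:

    -- Rebuilds the index from scratch; blocks writes to the table (and any
    -- reads that would use this index) while it runs.
    REINDEX INDEX stats_count_pkey;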

