Dan Weber wrote:
There are some major problems with pgsql at the moment. I couldn't tell
you if it was the dbmail driver, or pgsql itself, but I know that it
keeps eating disk space like crazy. Sometimes vacuuming it helps, but
not really. I think we should have a full review of the dbpgsql driver.
We had this problem initially as well. The reason why diskusage is so
high has to deal with Postgres' Multi-Version Concurrency Control (MVCC).
Postgres never actually deletes records... when you request that a
record is to be deleted it is simply marked as deleted, but Postgres
doesn't actually remove the data from the filesystem or reclaim that
space. Doing that everytime you deleted a record would be tremendously
slow as well as cause all kinds of locking problems. Instead, VACUUM
FULL is what will actually remove the used space (And is also why when
you run it, the entire table is EXCLUSIVE READ+WRITE locked and the db
becomes almost useless until it finishes).
Now by default Postgres doesn't VACUUM FULL at all.. even if you use the
pg_contrib autovacuum module, it's set to just do normal VACUUM ANALYZE
which may reclaim a little space but not much. If the disk space is
there and you've got a busy db box, we've found that Postgres will suck
it all up. A busy DbMail+Postgres mail server is going to probably have
some pretty aggresive disk usage. The filesystem will seem to fill up
really quickly, but once it hits around 80-90 or so percent, it'll hover
there.
An example would be if I have 10 records, and I delete 5 of them, when
an 11th record gets added it reuses the space of one of the 5 records
that I had earlier marked as deleted. In most scenarios this is fine,
but with mail servers we're constantly dealing with thousands of little
messages (rows) that come in and so this can be a problem w/the default
Postgres install. So after some digging (and head banging), we found
the magic postgres tunable:
max_fsm_pages = 3000000 #
That's right, it's set at 3 million. I think it defaults to 40
thousand. Postgres uses the free space map to track database activity
and decides when to reclaim space used by deleted rows. A good rule of
thumb is to estimate how many rows are likely to be updated (in your
entire db server) between vacuums, and use half of that value for
max_fsm_pages. Also, max_fsm_relations should be upped to the total
number of tables that you'll have on your db server. We have ours at
1000.
Also - make sure you're doing regular vacuums on the database. Either
nightly, or even better is to use the pg_autovacuum module in the
Postgres contrib dir. Looks like it'll be a standard part of the
install in the next PG release.
-----
"Any sufficiently advanced bug is indistinguishable
from a feature." -- Rich Kulawiec
[EMAIL PROTECTED]