Sumbry][ wrote:

Dan Weber wrote:

There are some major problems with pgsql at the moment. I couldn't tell you if it was the dbmail driver, or pgsql itself, but I know that it keeps eating disk space like crazy. Sometimes vacuuming it helps, but not really. I think we should have a full review of the dbpgsql driver.


We had this problem initially as well. The reason why diskusage is so high has to deal with Postgres' Multi-Version Concurrency Control (MVCC).

Postgres never actually deletes records... when you request that a record is to be deleted it is simply marked as deleted, but Postgres doesn't actually remove the data from the filesystem or reclaim that space. Doing that everytime you deleted a record would be tremendously slow as well as cause all kinds of locking problems. Instead, VACUUM FULL is what will actually remove the used space (And is also why when you run it, the entire table is EXCLUSIVE READ+WRITE locked and the db becomes almost useless until it finishes).

Now by default Postgres doesn't VACUUM FULL at all.. even if you use the pg_contrib autovacuum module, it's set to just do normal VACUUM ANALYZE which may reclaim a little space but not much. If the disk space is there and you've got a busy db box, we've found that Postgres will suck it all up. A busy DbMail+Postgres mail server is going to probably have some pretty aggresive disk usage. The filesystem will seem to fill up really quickly, but once it hits around 80-90 or so percent, it'll hover there.

An example would be if I have 10 records, and I delete 5 of them, when an 11th record gets added it reuses the space of one of the 5 records that I had earlier marked as deleted. In most scenarios this is fine, but with mail servers we're constantly dealing with thousands of little messages (rows) that come in and so this can be a problem w/the default Postgres install. So after some digging (and head banging), we found the magic postgres tunable:

    max_fsm_pages = 3000000         #

That's right, it's set at 3 million. I think it defaults to 40 thousand. Postgres uses the free space map to track database activity and decides when to reclaim space used by deleted rows. A good rule of thumb is to estimate how many rows are likely to be updated (in your entire db server) between vacuums, and use half of that value for max_fsm_pages. Also, max_fsm_relations should be upped to the total number of tables that you'll have on your db server. We have ours at 1000.

Also - make sure you're doing regular vacuums on the database. Either nightly, or even better is to use the pg_autovacuum module in the Postgres contrib dir. Looks like it'll be a standard part of the install in the next PG release.

This is very interesting. Would you mind writing a README.postgresql with this (and possibly more) information for tuning PostgreSQL for DBMail?

Ilja

--
Ilja Booij
IC&S B.V.

Stadhouderslaan 57
3583 JD  Utrecht
www.ic-s.nl

T algemeen: 030 6355730
T direct: 030 6355739
F: 030 6355731
E: [EMAIL PROTECTED]

Reply via email to