Sumbry][ wrote:
Dan Weber wrote:
There are some major problems with pgsql at the moment. I couldn't
tell you if it was the dbmail driver, or pgsql itself, but I know
that it keeps eating disk space like crazy. Sometimes vacuuming it
helps, but not really. I think we should have a full review of the
dbpgsql driver.
We had this problem initially as well. The reason why diskusage is so
high has to deal with Postgres' Multi-Version Concurrency Control (MVCC).
Postgres never actually deletes records... when you request that a
record is to be deleted it is simply marked as deleted, but Postgres
doesn't actually remove the data from the filesystem or reclaim that
space. Doing that everytime you deleted a record would be
tremendously slow as well as cause all kinds of locking problems.
Instead, VACUUM FULL is what will actually remove the used space (And
is also why when you run it, the entire table is EXCLUSIVE READ+WRITE
locked and the db becomes almost useless until it finishes).
Now by default Postgres doesn't VACUUM FULL at all.. even if you use
the pg_contrib autovacuum module, it's set to just do normal VACUUM
ANALYZE which may reclaim a little space but not much. If the disk
space is there and you've got a busy db box, we've found that Postgres
will suck it all up. A busy DbMail+Postgres mail server is going to
probably have some pretty aggresive disk usage. The filesystem will
seem to fill up really quickly, but once it hits around 80-90 or so
percent, it'll hover there.
An example would be if I have 10 records, and I delete 5 of them, when
an 11th record gets added it reuses the space of one of the 5 records
that I had earlier marked as deleted. In most scenarios this is fine,
but with mail servers we're constantly dealing with thousands of
little messages (rows) that come in and so this can be a problem w/the
default Postgres install. So after some digging (and head banging),
we found the magic postgres tunable:
max_fsm_pages = 3000000 #
That's right, it's set at 3 million. I think it defaults to 40
thousand. Postgres uses the free space map to track database activity
and decides when to reclaim space used by deleted rows. A good rule
of thumb is to estimate how many rows are likely to be updated (in
your entire db server) between vacuums, and use half of that value for
max_fsm_pages. Also, max_fsm_relations should be upped to the total
number of tables that you'll have on your db server. We have ours at
1000.
Also - make sure you're doing regular vacuums on the database. Either
nightly, or even better is to use the pg_autovacuum module in the
Postgres contrib dir. Looks like it'll be a standard part of the
install in the next PG release.
This is very interesting. Would you mind writing a README.postgresql
with this (and possibly more) information for tuning PostgreSQL for DBMail?
Ilja
--
Ilja Booij
IC&S B.V.
Stadhouderslaan 57
3583 JD Utrecht
www.ic-s.nl
T algemeen: 030 6355730
T direct: 030 6355739
F: 030 6355731
E: [EMAIL PROTECTED]