OK, in general you have to pay for MVCC one way or another. Many databases make you pay as you go, so to speak, by storing all the MVCC info in a separate log to be applied or discarded at some future date. Other databases let you pay later, by storing all the MVCC info in the table itself. Both approaches have similar costs, but the pay-later one can punish you harshly if you let the dead MVCC data stored in the tables get out of hand.
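To make that concrete, here's a minimal sketch in psql (table t is a made-up example) of what paying later looks like: old row versions pile up in the heap until a vacuum marks that space reusable.

  CREATE TABLE t (id int, v text);
  INSERT INTO t SELECT g, 'x' FROM generate_series(1, 100000) g;
  SELECT pg_size_pretty(pg_relation_size('t'));  -- baseline size
  UPDATE t SET v = 'y';                          -- a dead copy of every row stays in the heap
  SELECT pg_size_pretty(pg_relation_size('t'));  -- roughly doubled
  VACUUM t;                                      -- marks dead space reusable (doesn't shrink the file)
  UPDATE t SET v = 'z';                          -- reuses the reclaimed space
  SELECT pg_size_pretty(pg_relation_size('t'));  -- little or no further growth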
8.3 and above are much more aggressive about autovacuuming, and on bigger hardware you can make it VERY aggressive and keep the bloat out while keeping up good throughput. On some servers I set up 4, 6, or 8 autovacuum threads to keep up. If you were on another db you might be adding more drives to make some other part faster instead.

For batch processing, storing all the MVCC data in the data store can be problematic, but for more normal work where you're changing <1% of a table at a time it can be very fast. Some other databases will just run out of space to store transactions and roll back everything you've done. PostgreSQL will gladly let you shoot yourself in the foot by bloating the data store with successive whole-table updates and no vacuums in between.

Bottom line: if your hardware can't keep up, it can't keep up. If vacuum saturates your IO and still can't keep up, you need more disks and / or a better storage subsystem. A 32-disk array with a single controller goes for ~$7k to $10k, and you can sustain some pretty amazing throughput on that kind of IO subsystem.

If you're doing batch processing you can get a lot of return just by making sure you vacuum after each mass update. Especially on a single-use machine with no cost delays for vacuum (vacuum_cost_delay = 0), running a vacuum on a freshly worked table should be pretty fast.
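Here's a rough sketch of that pattern (big_table and the processed column are made-up names); setting vacuum_cost_delay to 0 turns off the cost-based throttling so the vacuum runs flat out on a dedicated box:

  -- sketch only: assumes a dedicated machine where vacuum IO won't hurt anyone else
  SET vacuum_cost_delay = 0;              -- no cost-based throttling for this session
  UPDATE big_table SET processed = true;  -- mass update leaves a dead copy of every row
  VACUUM ANALYZE big_table;               -- reclaim the dead space (and refresh stats) before the next pass

Note that VACUUM can't run inside a transaction block, so don't wrap the whole thing in BEGIN / COMMIT.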