Josh Berkus wrote:

Matthew,



I don't see how a separate database is better than a table in the databases, except that it means scanning only one table and not one per database. For one thing, making it a separate database could make it hard to back up and move your database+pg_avd config.


Basically, I don't like the idea of modifying users' databases. Besides, in the long run most of what needs to be tracked will be moved to the system catalogs. I kind of consider the pg_autovacuum database to be equivalent to the changes that will need to be made to the system catalogs.

I guess it could make it harder to back up if you are moving your database between clusters. Perhaps, if you create a pg_autovacuum schema inside of your database, then we could use that. I just don't like tools that drop things into your database.

Where are you getting 13% from?


13% * 3/4 ~~ 10%


And I think both of us agree that vacuuming tables with less than 10% changes is excessive and could lead to problems on its own, like overlapping vacuums.



I certainly agree that less than 10% would be excessive, but I still feel that 10% may not be high enough. That's why I kinda liked the sliding scale I mentioned earlier: I agree that for very large tables something as low as 10% might be useful, but most tables in a database would not be that large.
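To make the sliding-scale idea concrete, here is a rough sketch of one way it could work. The details of the scale are not spelled out in this message, so everything below is an assumption for illustration: the function name, the page-count cutoffs, the 30% figure for small tables, and the log-linear interpolation are all hypothetical. Only the 10% floor for very large tables comes from the discussion above.

```python
import math


def vacuum_threshold_fraction(table_pages,
                              small_pages=1_000,        # hypothetical cutoff
                              large_pages=1_000_000,    # hypothetical cutoff
                              small_fraction=0.30,      # hypothetical value
                              large_fraction=0.10):     # the 10% floor discussed above
    """Return the fraction of changed rows that would trigger a vacuum.

    Small tables get a high threshold, very large tables get the 10% floor,
    and sizes in between are interpolated linearly on log(pages).
    """
    if table_pages <= small_pages:
        return small_fraction
    if table_pages >= large_pages:
        return large_fraction
    # Position of this table between the two cutoffs, on a log scale (0..1).
    position = ((math.log(table_pages) - math.log(small_pages))
                / (math.log(large_pages) - math.log(small_pages)))
    return small_fraction + position * (large_fraction - small_fraction)
```

With these made-up numbers, a 500-page table would wait for 30% changes, while a 2,000,000-page table would be vacuumed at 10%.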

Do you know of an easy way to get a count of the total pages used by a whole cluster?



Select sum(relpages) from pg_class.




duh....

BTW, do we have any provisions to avoid overlapping vacuums? That is, to prevent a second vacuum on a table if an earlier one is still running?



Only that pg_autovacuum isn't smart enough to kick off more than one vacuum at a time. Basically, pg_autovacuum issues a vacuum on a table and waits for it to finish, then checks the next table in its list to see if it needs to be vacuumed; if so, it vacuums that table and waits for that vacuum to finish. There was some discussion of issuing concurrent vacuums against different tables, but it was decided that since vacuum is I/O bound, it would only make sense to issue concurrent vacuums that were on different spindles, which is not something I wanted to get into. Also, given the recent talk about how vacuum is still such a performance hog, I can't imagine what multiple concurrent vacuums would do to performance. Maybe as 7.5 develops and many of the vacuum performance issues are addressed, we can revisit this question.
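The one-at-a-time behaviour described above can be sketched roughly as follows. This is not pg_autovacuum's actual code, just a minimal illustration of the loop; the function names, the example tables, and the stubbed-out vacuum call are all hypothetical, and the 10% check stands in for whatever threshold the tool really applies.

```python
def run_vacuum_pass(tables, needs_vacuum, vacuum):
    """Visit tables in order and vacuum each one that needs it.

    vacuum(table) is assumed to block until the vacuum has finished,
    so two vacuums can never overlap within a single pass.
    """
    vacuumed = []
    for table in tables:
        if needs_vacuum(table):
            vacuum(table)          # blocks until this table is done
            vacuumed.append(table)
    return vacuumed


# Hypothetical example data: fraction of rows changed per table.
changed = {"orders": 0.25, "customers": 0.02, "events": 0.12}
log = []


def fake_vacuum(table):
    # Stand-in for issuing VACUUM and waiting for it to return.
    log.append(("start", table))
    log.append(("end", table))


done = run_vacuum_pass(changed, lambda t: changed[t] >= 0.10, fake_vacuum)
```

Because each call returns before the next table is even checked, every "start" entry in the log is immediately followed by its matching "end", which is the only protection against overlapping vacuums described here.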


