Matthew,

I am replying to the below as a pg_autovacuum user for multiple client databases. My thoughts:

> Inability to customize thresholds on a per table basis

This hasn't been a big problem for me. I would judge that 80% of my clients would make no use of this feature.

> Inability to set default thresholds on a per database basis

This would be much more useful to us.

> Inability to exclude specific databases / tables from pg_autovacuum monitoring

Same as above -- exclusion is much more needed than incremental raising/lowering. Of course, if one can set levels, one can set them to zero, so perhaps it is the same thing.

> Inability to schedule vacuums during off-peak times

I don't think that this is the job of pg_autovacuum. If a database requires bulk loads and other burst activity, the DBA should schedule manual vacuums around those and not use pg_autovacuum. Also, bgwriter and slow vacuum should make this less of an issue for 7.5.

> Lack of integration related to startup and shutdown

Yes, this is a pain, especially from a security standpoint.

> Ignorance of VACUUM and ANALYZE operations performed outside pg_autovacuum (requires backend integration? or can listen / notify be used?)

Again, I think this is not crucial, personally. Nice if there's some easy way to do it, of course.

> Lack of logging options / syslog integration / log rotation options

Yep, this is a biggie.

Now, let me add my comments as to what my clients have complained about:
-- Lack of integrated security with the Postmaster
-- Inability to detect VACUUMs "backing up" due to too low vacuum mem or too much activity and warn the DBA
-- Inability to Vacuum in parallel on high-capacity machines
-- No "timeout" for locked vacuums.

> I'm not sure how to address all of these concerns, or that they all should be addressed right now. One of my big questions is backend integration.

I concur with the other commenters; backend integration would be nice if pg_autovacuum is not to be permanently a separate script/process. It would eliminate several of the above issues.

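To make the threshold and exclusion items above concrete, here is a minimal sketch of how per-table overrides, per-database defaults and exclusions could be resolved. Everything in it is an assumption for illustration: the names (Thresholds, resolve, needs_vacuum), the database and table names, the numbers, and the "base + scale * tuples" shape of the test are hypothetical, not pg_autovacuum's actual code or configuration format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    base: int      # changed tuples required regardless of table size
    scale: float   # extra changed tuples required per existing tuple


# Illustrative values only; none of these come from pg_autovacuum itself.
GLOBAL_DEFAULT = Thresholds(base=1000, scale=2.0)
DB_DEFAULTS = {"sales": Thresholds(base=500, scale=1.0)}
TABLE_OVERRIDES = {("sales", "orders"): Thresholds(base=200, scale=0.5)}
# A table of None means the whole database is excluded.
EXCLUDED = {("sales", "audit_log"), ("scratch", None)}


def resolve(db: str, table: str) -> Optional[Thresholds]:
    """Return the thresholds in effect, or None if the table is excluded."""
    if (db, None) in EXCLUDED or (db, table) in EXCLUDED:
        return None
    if (db, table) in TABLE_OVERRIDES:      # most specific setting wins
        return TABLE_OVERRIDES[(db, table)]
    if db in DB_DEFAULTS:                   # then the per-database default
        return DB_DEFAULTS[db]
    return GLOBAL_DEFAULT                   # finally the global default


def needs_vacuum(db: str, table: str, tuples: int, changed: int) -> bool:
    """True when the changed-tuple count exceeds the resolved threshold."""
    t = resolve(db, table)
    if t is None:
        return False
    return changed > t.base + t.scale * tuples
```

Exclusion is modeled here as an explicit list rather than as a special threshold value; either would serve, which is the point made above about the two features overlapping.
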
> Since many people do not like tools that clutter their databases by adding tables, I think option 1 (adding a pg_autovacuum table to existing databases) is right out.

Personally, I like the idea of a pg_autovacuum table, and would support it. However, I have no strong objections to the other approaches.

> Right now pg_autovacuum has no memory of what was going on the last time it was run. So if significant changes have happened while pg_autovacuum is not running, they will not be counted in the analysis of when to perform a vacuum or analyze operation which can result in under vacuuming. So, pg_autovacuum should occasionally write down its numbers to the database. The data will be stored in an additional table called table_data

I think we've already had feedback about this. If it's system information, it should go in one of the existing tables, or it should be called something more descriptive than "table_data", and should begin with pg_

Some consideration should also be given to the frequency of updating the persistent data. I would favor an asynchronous, infrequent updating that would permit some loss of information over a synchronous lossless approach. The latter, while more accurate, would detract from server performance on high-volume transaction databases.

> 3. Single-Pass Mode (External Scheduling):
>
> I have received requests to be able to run pg_autovacuum only on request (not as a daemon) making only one pass over all the tables (not looping indefinitely). The advantage being that it will operate more like the current vacuum command except that it will only vacuum tables that need to be vacuumed.

I think this is a completely different utility from pg_autovacuum, and this line of development need not be pursued unless it's easy to do. I certainly don't need it ....

> Syslog support. I'm not sure this is really needed, but a simple patch was submitted by one user and perhaps that can be reviewed / improved and applied.

I need it, and am glad to hear there is a patch. Several of my clients use centralized syslog servers, and do *everything* through syslog.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco
