On Sat, Aug 13, 2011 at 5:31 PM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
> Yeah, I'm not excited about making the planner and statistics more dynamic.
> Feedback loops and plan instability are not fun. I think we should rather be
> more aggressive in setting the visibility map bits.
As far as I can see, the only way we're really going to make any headway setting visibility map bits more aggressively is to get autovacuum to do it for us. I mean, we could consider doing it when we perform a HOT cleanup; and we could also consider having the background writer do it when it evicts buffers. But frankly, those sorts of things are a drop in the bucket. They're just not going to occur frequently enough to really make a dent. Maybe you could make it work for the case where the working set fits in shared_buffers, because then the background writer isn't doing anything anyway and so, hey, why not set visibility map bits? But you're not going to get much of a performance win out of index-only scans on that workload anyway. If the working set does NOT fit in shared_buffers, you're going to be evicting pages like crazy. I can't really see us deciding to dirty all of those pages and write and flush XLOG for them instead of just booting them out. Even if you have the background writer try to do some of that, other backends that are just throwing things over the fence to get foreground work done are going to run circles around it, so it's going to take a long, long time to make much headway by this route.

Assuming you believe the above analysis (which you may not, so feel free to argue if you see something I'm missing), that leaves autovacuum. There are many things to like about running autovacuum more aggressively, even on insert-only tables. Setting visibility map bits isn't useful only for index-only scans: it also significantly speeds up sequential scans that are not I/O-bound, because per-tuple visibility checks can be skipped on all-visible pages. On the flip side, setting visibility map bits (and hint bits) has to be considered a lower priority than vacuuming tables that contain dead tuples, because bloat is a far, far bigger problem than a few unset visibility map bits.
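To make the eviction argument concrete, here is a back-of-envelope model in Python. All the per-page costs are invented for illustration (they are not measurements of PostgreSQL), but the shape of the argument holds for any plausible numbers: dropping a clean buffer is memory-only work, while setting its visibility map bit means dirtying the page and writing and flushing WAL before it can go out.

```python
# Illustrative cost model (all timings are assumptions, not measurements)
# for evicting currently-clean buffers when the working set exceeds
# shared_buffers.

CLEAN_EVICT_MS = 0.01   # assumed: drop a clean buffer (memory-only work)
PAGE_WRITE_MS = 0.2     # assumed: write back a page dirtied by the VM-bit set
WAL_FLUSH_MS = 0.5      # assumed: amortized WAL write+flush cost per page

def eviction_cost_ms(pages, set_vm_bit):
    """Total milliseconds to evict `pages` currently-clean buffers,
    optionally setting each page's visibility map bit on the way out."""
    per_page = (PAGE_WRITE_MS + WAL_FLUSH_MS) if set_vm_bit else CLEAN_EVICT_MS
    return pages * per_page

plain = eviction_cost_ms(10_000, set_vm_bit=False)
with_vm = eviction_cost_ms(10_000, set_vm_bit=True)
print(f"plain eviction: {plain:.0f} ms, with VM-bit setting: {with_vm:.0f} ms")
```

Under these assumed numbers, setting the bit at eviction time makes each eviction tens of times more expensive, which is why foreground backends "just booting them out" will run circles around a background writer that tries to do this.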
There are already situations where vacuum can't keep up, especially with high values of vacuum_cost_delay, and I don't think we can afford to stir any more work into that pot without prioritizing it.

vacuum_cost_delay is an intensely problematic parameter. If you set it too low, vacuum can consume too much I/O bandwidth and you suffer a performance drop. On the other hand, if you set it too high (which is easy to do), autovacuum can't keep up and you get bloat. In the worst case, this sends the system into a tailspin: as all the tables get bigger, it gets harder and harder for vacuum to keep up.

What would be really nice is to make this parameter self-tuning, at least to some degree. We used to run checkpoints at full speed, but we now have this nifty parameter called checkpoint_completion_target that spreads them out over time, but not so much that they fail to complete on schedule. It would be nice to do the same thing for autovacuum. If we can estimate that a table will next need vacuuming in 37 minutes, and that at the current vacuum_cost_delay setting the currently-in-progress vacuum will finish in 83 minutes (or 83 hours), we should realize that we have a problem and vacuum faster. Similarly, if we notice that tables are coming due for vacuuming faster than autovacuum workers are processing them, we should try to speed up.

You can imagine trying to solve this problem by teaching autovacuum to estimate, for every table in the cluster, the time remaining until the next vacuum needs to start, and the times at which we will be able to start the next few vacuums based on when we expect to have autovacuum workers available. We could then detect whether or not we're keeping up, and adjust accordingly. This would allow us, in turn, to run vacuum with a higher cost delay by default, and let it decide for itself when it needs to speed up.
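The feedback rule being proposed could be sketched roughly as follows. This is purely hypothetical code, not anything autovacuum actually does: the helper names are invented, and it assumes vacuum throughput is roughly inversely proportional to the cost delay, so scaling the delay by (time until due) / (projected finish time) makes the vacuum finish just in time. It also includes the later point about gating lower-priority work on keeping up.

```python
def adjusted_cost_delay(current_delay_ms, est_finish_min, next_due_min,
                        min_delay_ms=0.0):
    """Hypothetical feedback rule: if the in-progress vacuum is projected
    to finish after the next vacuum is due, shrink vacuum_cost_delay
    proportionally.  Assumes progress scales inversely with the delay."""
    if est_finish_min <= next_due_min:
        return current_delay_ms                    # keeping up: no change
    return max(min_delay_ms, current_delay_ms * next_due_min / est_finish_min)

def can_do_low_priority_work(est_finish_min, next_due_min, slack=0.8):
    """Take on optional work (e.g. setting visibility map bits) only when
    the projected finish leaves comfortable headroom before the deadline."""
    return est_finish_min <= slack * next_due_min

# The example from the text: the next vacuum is due in 37 minutes, but the
# in-progress one is projected to take 83 minutes at the current delay, so
# the delay is cut and optional work is deferred.
print(adjusted_cost_delay(20.0, est_finish_min=83, next_due_min=37))
print(can_do_low_priority_work(est_finish_min=83, next_due_min=37))
```

The point of the sketch is only the shape of the control loop: a higher default delay becomes safe once the system can detect "behind schedule" and speed itself up, the same way checkpoint_completion_target paces checkpoints against a deadline.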
And if we want vacuum to do lower-priority tasks, such as setting visibility map bits, we could teach it to do so only when it's keeping up.

This all seems like a lot of work, but at the moment I'm not seeing another way to deliver on the suggested goal of "setting visibility map bits more aggressively". Does anyone else?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers