First, thanks for this thoughtful email.

On Tue, Feb 4, 2014 at 7:14 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> I've also had some further thoughts about the right way to drive
>> vacuum scheduling.  I think what we need to do is tightly couple the
>> rate at which we're willing to do vacuuming to the rate at which we're
>> incurring "vacuum debt".  That is, if we're creating 100kB/s of pages
>> needing vacuum, we vacuum at 2-3MB/s (with default settings).
>
> If we can tolerate 2-3MB/s without adverse impact on other work, then we
> can tolerate it.  Do we gain anything substantial by sand-bagging it?
No.  The problem is the other direction.

>> If we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
>> 2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
>> everything goes to heck.
>
> (Your reference to bloat made me think your comments here are about
> vacuuming in general, not specific to IOS.  If that isn't the case, then
> please ignore.)
>
> If we can only vacuum at 2-3MB/s without adversely impacting other
> activity, but we are creating 10MB/s of future vacuum need, then there
> are basically two possibilities I can think of.  Either the 10MB/s
> represents a spike, and vacuum should tolerate it and hope to catch up
> on the debt later.  Or it represents a new permanent condition, in which
> case I bought too few hard drives for the work load, and no scheduling
> decision that autovacuum can make will save me from my folly.  Perhaps
> there is some middle ground between those possibilities, but I don't
> see room for much middle ground.
>
> I guess there might be entirely different possibilities not between
> those two; for example, I don't realize I'm doing something that is
> generating 10MB/s of vacuum debt, and would like to have this thing I'm
> doing be automatically throttled to the point it doesn't interfere with
> other processes (either directly, or indirectly by bloat)

The underlying issue here is that, in order for there not to be a problem,
a user needs to configure their autovacuum processes to vacuum at a rate
which is greater than or equal to the average rate at which vacuum debt
is being created.  If they don't, they get runaway bloat.  But to do
that, they need to know at what rate they are creating vacuum debt, which
is almost impossible to figure out right now; and even if they did know
it, they'd then need to figure out what vacuum cost delay settings would
allow vacuuming at a rate sufficient to keep up, which isn't quite as
hard to estimate but certainly involves nontrivial math.
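To illustrate the sort of math involved -- this is only a back-of-the-envelope sketch, assuming the shipped default cost settings of this era and a worst-case page mix where every page vacuumed is both read from disk and dirtied:

```python
# Rough estimate of autovacuum's effective I/O rate from the cost-based
# delay settings.  The values below are the shipped defaults; the
# all-miss-and-dirty page mix is an assumption for the worst case.

BLOCK_SIZE = 8192                     # bytes per heap page

vacuum_cost_page_miss = 10            # cost charged per page read from disk
vacuum_cost_page_dirty = 20           # cost charged per page dirtied
autovacuum_vacuum_cost_limit = 200    # inherited from vacuum_cost_limit
autovacuum_vacuum_cost_delay = 0.020  # seconds slept each time the limit is hit

# Cost units autovacuum is allowed to spend per second.
units_per_sec = autovacuum_vacuum_cost_limit / autovacuum_vacuum_cost_delay

# Worst case: every page costs a miss plus a dirty.
cost_per_page = vacuum_cost_page_miss + vacuum_cost_page_dirty
pages_per_sec = units_per_sec / cost_per_page
mb_per_sec = pages_per_sec * BLOCK_SIZE / (1024 * 1024)

print(f"{mb_per_sec:.1f} MB/s")       # prints "2.6 MB/s"
```

Different hit/miss/dirty mixes shift the answer up or down, which is presumably where the 2-3MB/s range comes from; doing this calculation in reverse, starting from a measured debt-creation rate, is the part users have to get right today.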
So a lot of people have this set wrong, and it's not easy to get it right
except by frobbing the settings until you find something that works well
in practice.  Also, a whole *lot* of problems in this area are caused by
cases where the rate at which vacuum debt is being created *changes*.
Autovacuum is keeping up, but then you have either a load spike or just a
gradual increase in activity and it doesn't keep up any more.  You don't
necessarily notice right away, and by the time you do, there's no easy
way to recover.

If you've got a table with lots of dead tuples in it, but it's also got
enough internal freespace to satisfy as many inserts and updates as are
happening, then it's possibly reasonable to put off vacuuming in the
hopes that system load will be lower at some time in the future.  But if
you've got a table with lots of dead tuples in it, and you're extending
it to create internal freespace instead of vacuuming it, it is highly
likely that you are not doing what will make the user most happy.  Even
if vacuuming that table slows down foreground activity quite badly, it is
probably better than accumulating an arbitrary amount of bloat.

>> The rate of vacuuming needs to be tied somehow to the rate at which
>> we're creating stuff that needs to be vacuumed.  Right now we don't
>> even have a way to measure that, let alone auto-regulate the
>> aggressiveness of autovacuum on that basis.
>
> There is the formula used to decide when a table gets vacuumed.  Isn't
> the time delta in this formula a measure of how fast we are creating
> stuff that needs to be vacuumed for bloat reasons?  Is your objection
> that it doesn't include other reasons we might want to vacuum, or that
> it just doesn't work very well, or that it is not explicitly exposed?

AFAICT, the problem isn't when the table gets vacuumed so much as *how
fast* it gets vacuumed.  The autovacuum algorithm does a fine job
selecting tables for vacuuming, for the most part.
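For reference, the selection formula Jeff mentions is, in outline, a fixed base threshold plus a scale factor times the table's row count; a table is chosen once its dead-tuple count passes that trigger.  A sketch with the shipped defaults (50 tuples, scale factor 0.2):

```python
# Outline of the autovacuum table-selection trigger, using the shipped
# default settings.  This is a simplification of the real logic in
# autovacuum.c, shown only to make the threshold concrete.

autovacuum_vacuum_threshold = 50      # base dead-tuple count
autovacuum_vacuum_scale_factor = 0.2  # fraction of the table's rows

def needs_vacuum(n_dead_tup: int, reltuples: float) -> bool:
    """True once dead tuples exceed the per-table trigger point."""
    trigger = (autovacuum_vacuum_threshold
               + autovacuum_vacuum_scale_factor * reltuples)
    return n_dead_tup > trigger

# A 10-million-row table is not selected until more than 2,000,050 dead
# tuples have accumulated -- which is why the 0.2 default can be too
# high for very large tables.
print(needs_vacuum(2_000_000, 10_000_000))  # prints "False"
print(needs_vacuum(2_000_051, 10_000_000))  # prints "True"
```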
There are problems with insert-only tables, and sometimes for large
tables the default threshold (0.20) is too high, but it's not terrible.
However, the limit on the overall rate of vacuuming activity to 2-3MB/s,
regardless of how fast we're creating vacuum debt, is a big problem.

>> Similarly, for marking of pages as all-visible, we currently make the
>> same decision whether the relation is getting index-scanned (in which
>> case the failure to mark those pages all-visible may be suppressing
>> the use of index scans or making them less effective) or whether it's
>> not being accessed at all (in which case vacuuming it won't help
>> anything, and might hurt by pushing other pages out of cache).
>
> If it is not getting accessed at all because the database is not very
> active right now, that would be the perfect time to vacuum it.  Between
> "I can accurately project current patterns of (in)activity into the
> future" and "People don't build large tables just to ignore them
> forever", I think the latter is more likely to be true.  If the system
> is busy but this particular table is not, then that would be a better
> reason to de-prioritise vacuuming that table.  But can this degree of
> reasoning really be implemented in a practical way?  In core?

I don't know.  But the algorithm for determining the rate at which we
vacuum (2-3MB/s) could hardly be stupider than it is right now.  It's
almost a constant, and to the extent that it's not a constant, it depends
on the wrong things.  The fact that getting this perfectly right is
unlikely to be easy, and may be altogether impossible, shouldn't
discourage us from trying to come up with something better than what we
have now.

> My experience has been that if too few pages are all visible, it
> generally switches to a seq scan, not an index scan of a different
> index.
> But many things that are semantically possible to be index-only-scans
> would never be planned that way even if allvisible were 100%, so I
> think it would have to do two planning passes, one with the real
> allvisible, and a hypothetical one with allvisible set to 100%.  And
> then there is the possibility that, while a high allvisible would be
> useful, the table is so active that no amount of vacuuming could ever
> keep it high.

Yeah, those are all good points.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers