On Wed, 7 Mar 2018 21:39:08 -0800 Jeff Janes <jeff.ja...@gmail.com> wrote:
> As for preventing it in the first place, based on your description of
> your hardware and operations, I was going to say you need to increase
> the max number of autovac workers, but then I remembered you from
> "Autovacuum slows down with large numbers of tables. More workers
> makes it slower" (
> https://www.postgresql.org/message-id/20151030133252.3033.4249%40wrigleys.postgresql.org).
> So you are probably still suffering from that? Your patch from then
> seemed to be pretty invasive and so controversial.

We have been building from source with that worker-contention patch ever since. It's very effective; there is no way we could have continued to rely on autovacuum without it. It's something of a nuisance to keep updating it for each point release that touches autovacuum, but here we are.

The current patch is motivated by the fact that even with effective workers we still regularly find tables with inflated reltuples. I have some theories about why, but no real proof. Mainly variants on "all the vacuum workers were busy working their way through a list of 100,000 tables and did not get back to the problem table before it became a problem."

I do have a design in mind for a larger, more principled patch that fixes the same issue and some others too, but given the reaction to the earlier one I hesitate to spend a lot of time on it. I'd be happy to discuss a way to move forward, though, if anyone is interested.

Your patch helped, but it was mainly targeted at the lock-contention part of the problem. The other part of the problem is that autovacuum workers force a rewrite of the stats file every time they try to choose a new table to work on. With large numbers of tables and many autovacuum workers this is a significant extra workload.

-dg

-- 
David Gould                da...@sonic.net
If simplicity worked, the world would be overrun with insects.