On Thu, Apr 6, 2023 at 11:52 AM Melanie Plageman <melanieplage...@gmail.com> wrote:
> > Gah, I think I misunderstood you. You are saying that only calling
> > AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
> > not be enough. The frequency at which the number of workers changes will
> > likely be different. This is a good point.
>
> It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...
>
> A not fully baked idea for a solution:
>
> Why not keep the balanced limit in the atomic instead of the number of
> workers for balance. If we expect all of the workers to have the same
> value for cost limit, then why would we just count the workers and not
> also do the division and store that in the atomic variable. We are
> worried about the division not being done often enough, not the number
> of workers being out of date. This solves that, right?

A bird in the hand is worth two in the bush, though. We don't really
have time to redesign the patch before feature freeze, and I can't
convince myself that there's a big enough problem with what you
already did that it would be worth putting off fixing this for another
year. Reading your newer emails, I think that the answer to my
original question is "we don't want to do it at every
vacuum_delay_point because it might be too costly," which is
reasonable.

I don't particularly like this new idea, either, I think. While it may
be true that we expect all the workers to come up with the same
answer, they need not, because rereading the configuration file isn't
synchronized. It would be pretty lame if a worker that had reread an
updated value from the configuration file recomputed the value, and
then another worker that still had an older value recalculated it
again just afterward. Keeping only the number of workers in memory
avoids the possibility of thrashing around in situations like that.

I do kind of wonder if it would be possible to rejigger things so that
we didn't have to keep recalculating av_nworkersForBalance, though.
Perhaps now is not the time due to the impending freeze, but maybe we
should explore maintaining that value in such a way that it is correct
at every instant, instead of recalculating it at intervals.

--
Robert Haas
EDB: http://www.enterprisedb.com
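
For illustration only, the thrashing concern can be sketched in a few lines of standalone C11. This is not the patch's code: the Worker struct, the two shared variables, and the function names design_a_publish() and design_b_limit() are all hypothetical, and plain <stdatomic.h> stands in for PostgreSQL's own atomics. Design A stores the already-divided limit in shared memory, so whichever worker wrote last wins. Design B stores only the worker count, and each worker divides its own view of the configured limit.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a worker; not PostgreSQL's actual struct. */
typedef struct Worker
{
    int cost_limit_cfg;     /* this worker's view of the configured limit */
} Worker;

/* Design A (hypothetical): the divided, per-worker limit is shared. */
static atomic_int shared_balanced_limit;

/* Any worker may overwrite the shared result with its own computation. */
static void
design_a_publish(const Worker *w, int nworkers)
{
    assert(nworkers > 0);
    atomic_store(&shared_balanced_limit, w->cost_limit_cfg / nworkers);
}

/* Design B (hypothetical): only the worker count is shared. */
static atomic_int shared_nworkers;

/* Each worker divides its own configured limit by the shared count. */
static int
design_b_limit(const Worker *w)
{
    int n = atomic_load(&shared_nworkers);

    if (n < 1)
        n = 1;              /* guard against dividing by zero */
    return w->cost_limit_cfg / n;
}
```

With two workers, one that has reread a limit of 400 and one still holding 200, design A's shared value flips between 200 and 100 depending on who published last. Under design B the shared count stays at 2 and a stale worker's result affects only that worker.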