On Sun, 6 Nov 2022 at 20:40, Peter Geoghegan <p...@bowt.ie> wrote: > > On Sun, Nov 6, 2022 at 11:14 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > In general, I do not believe in encouraging users to run VACUUM > > manually in the first place. We would be far better served by > > spending our effort to improve autovacuum's shortcomings. > > I couldn't agree more. A lot of problems seem related to the idea that > VACUUM is just a command that the DBA periodically runs to get a > predictable fixed result, a little like CREATE INDEX. That conceptual > model isn't exactly wrong; it just makes it much harder to apply any > kind of context about the needs of the table over time. There is a > natural cycle to how VACUUM (really autovacuum) is run, and the > details matter. > > There is a significant amount of relevant context that we can't really > use right now. That wouldn't be true if VACUUM only ran within an > autovacuum worker (by definition). The VACUUM command itself would > still be available, and support the same user interface, more or less. > Under the hood the VACUUM command would work by enqueueing a VACUUM > job, to be performed asynchronously by an autovacuum worker. Perhaps > the initial enqueue operation could be transactional, fixing Simon's > complaint.
Ah, I see you got to this idea first! Yes, what we need is for the "VACUUM command" to not fail in a script. Not sure anyone cares where the work takes place. Enqueuing a request for autovacuum to do that work, then blocking until it is complete would do the job. > "No more VACUUMs outside of autovacuum" would enable more advanced > autovacuum.c scheduling, allowing us to apply a lot more context about > the costs and benefits, without having to treat manual VACUUM as an > independent thing. We could coalesce together redundant VACUUM jobs, > suspend and resume VACUUM operations, and have more strategies to deal > with problems as they emerge. +1, but clearly this would not make temp table VACUUMs work. > > I'd like to see some sort of direct attack on its inability to deal > > with temp tables, for instance. (Force the owning backend to > > do it? Temporarily change the access rules so that the data > > moves to shared buffers? Dunno, but we sure haven't tried hard.) This was a $DIRECT attack on making temp tables work! ;-) Temp tables are actually easier, since we don't need any of the concurrency features we get with lazy vacuum. So the answer is to always run a VACUUM FULL on temp tables since this skips any issues with indexes etc.. We would need to check a few things first.... maybe something like this (mostly borrowed heavily from COPY) InvalidateCatalogSnapshot(); if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals()) ereport(WARNING, (errcode(ERRCODE_INVALID_TRANSACTION_STATE), errmsg("vacuum of temporary table ignored because of prior transaction activity"))); CheckTableNotInUse(rel, "VACUUM"); > This is a good example of the kind of thing I have in mind. Perhaps it > could work by killing the backend that owns the temp relation when > things truly get out of hand? I think that that would be a perfectly > reasonable trade-off. +1 > Another related idea: better behavior in the event of a manually > issued VACUUM (now just an enqueued autovacuum) that cannot do useful > work due to the presence of a long running snapshot. The VACUUM > doesn't have to dutifully report "success" when there is no practical > sense in which it was successful. There could be a back and forth > conversation between autovacuum.c and vacuumlazy.c that makes sure > that something useful happens sooner or later. The passage of time > really matters here. Regrettably, neither vacuum nor autovacuum waits for xmin to change; perhaps it should. -- Simon Riggs http://www.EnterpriseDB.com/