On Sun, Oct 22, 2023 at 12:07:59PM -0700, Andres Freund wrote: > Medium term, I think we need an approximate xid->"time of assignment" mapping > that's continually maintained on the primary. One of the things that'd show > us to do is introduce a GUC to control the maximum effect of hs_feedback on > the primary, in a useful unit. Numbers of xids are not a useful unit (100k > xids is forever on some systems, a few minutes at best on others, the rate is > not necessarily that steady when plpgsql exception handles are used, ...) > > It'd be useful to have such a mapping for other features too. E.g. > > - making it visible in pg_stat _activity how problematic a longrunning xact > is - a 3 day old xact that doesn't have an xid assigned and has a recent xmin > is fine, it won't prevent vacuum from doing things. But a somewhat recent > xact that still has a snapshot from before an old xact was cancelled could be > problematic. > > - turn pg_class.relfrozenxid into an understandable timeframe. It's a fair > bit of mental effort to classify "370M xids old" into problem/fine (it's e.g. > not a problem on a system with a high xid rate, on a big table that takes a > bit to a bit to vacuum). > > - using the mapping to compute an xid consumption rate IMO would be one > building block for smarter AV scheduling. Together with historical vacuum > runtimes it'd allow us to start vacuuming early enough to prevent hitting > thresholds, adapt pacing, prioritize between tables etc.
Big +1 to all of this. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com