On Tue, 22 Nov 2022 at 16:28, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.ri...@enterprisedb.com> writes:
> > We seem to have replaced one magic constant with another, so not sure
> > if this is autotuning, but I like it much better than what we had
> > before (i.e. better than my prev patch).
>
> Yeah, the magic constant is still magic, even if it looks like it's
> not terribly sensitive to the exact value.
>
> > 1. I was surprised that you removed the limits on size and just had
> > the wasted work limit. If there is no read traffic that will mean we
> > hardly ever compress, which means the removal of xids at commit will
> > get slower over time. I would prefer that we forced compression on a
> > regular basis, such as every time we process an XLOG_RUNNING_XACTS
> > message (every 15s), as well as when we hit certain size limits.
> > 2. If there is lots of read traffic but no changes flowing, it would
> > also make sense to force compression when the startup process goes
> > idle rather than wait for the work to be wasted first.
>
> If we do those things, do we need a wasted-work counter at all?
>
> I still suspect that 90% of the problem is the max_connections
> dependency in the existing heuristic, because of the fact that
> you have to push max_connections to the moon before it becomes
> a measurable problem. If we do
>
> -        if (nelements < 4 * PROCARRAY_MAXPROCS ||
> -            nelements < 2 * pArray->numKnownAssignedXids)
> +        if (nelements < 2 * pArray->numKnownAssignedXids)
>
> and then add the forced compressions you suggest, where
> does that put us?
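
The call sites I have in mind for the forced compressions (details on
the timing below) would look roughly like this. This is a hand-written
sketch, not compiled; the exported wrapper name is invented here, since
KnownAssignedXidsCompress() is static in procarray.c:

    /*
     * Exported wrapper (name invented for this sketch).
     * KnownAssignedXidsCompress() is static in procarray.c and expects
     * ProcArrayLock to be held exclusively, so callers elsewhere would
     * go through something like this:
     */
    void
    CompressKnownAssignedTransactionIds(void)
    {
        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
        KnownAssignedXidsCompress(true);
        LWLockRelease(ProcArrayLock);
    }

    /*
     * Call sites, roughly:
     *
     * (1) each XLOG_RUNNING_XACTS record (about every 15s):
     *     standby_redo() already routes those records to
     *     ProcArrayApplyRecoveryInfo(), so the forced call can hang
     *     off that path;
     *
     * (2) startup process idle: call the wrapper just before the redo
     *     loop starts waiting for more WAL to arrive.
     */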
The forced compressions I propose happen

* when idle - since we have time to do it when that happens, which
happens often since most workloads are bursty

* every 15s - since we already hold the lock, and which is overall much
less often than every 64 commits, as benchmarked by Michail.

I didn't mean to imply that this supersedes the wasted-work approach;
it was meant to be in addition to it.

The wasted-work counter works well to respond to heavy read-only
traffic and also avoids wasted compressions for write-heavy
workloads. So I still like it the best.

> Also, if we add more forced compressions, it seems like we should have
> a short-circuit for a forced compression where there's nothing to do.
> So more or less like
>
>     nelements = head - tail;
>     if (!force)
>     {
>         if (nelements < 2 * pArray->numKnownAssignedXids)
>             return;
>     }
>     else
>     {
>         if (nelements == pArray->numKnownAssignedXids)
>             return;
>     }

+1

> I'm also wondering why there's not an
>
>     Assert(compress_index == pArray->numKnownAssignedXids);
>
> after the loop, to make sure our numKnownAssignedXids tracking
> is sane.

+1

Both of those are folded into the sketch at the end of this mail.

--
Simon Riggs                http://www.EnterpriseDB.com/
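
For completeness, here is how I read KnownAssignedXidsCompress() with
your simplified heuristic, the short-circuit and the Assert folded in
(a sketch only, not compiled; the wasted-work trigger is not shown
here):

    static void
    KnownAssignedXidsCompress(bool force)
    {
        ProcArrayStruct *pArray = procArray;
        int         head,
                    tail,
                    nelements;
        int         compress_index;
        int         i;

        /* Counts are stable: caller holds ProcArrayLock exclusively */
        head = pArray->headKnownAssignedXids;
        tail = pArray->tailKnownAssignedXids;
        nelements = head - tail;

        if (!force)
        {
            /*
             * Lazy call: compress only when less than 50% of the
             * occupied slots hold valid xids.  The 4 * PROCARRAY_MAXPROCS
             * term is gone, so this no longer depends on max_connections.
             */
            if (nelements < 2 * pArray->numKnownAssignedXids)
                return;
        }
        else
        {
            /* Forced call: nothing to do if the array is already dense */
            if (nelements == pArray->numKnownAssignedXids)
                return;
        }

        /* Squeeze out the invalid entries, keeping valid xids in order */
        compress_index = 0;
        for (i = tail; i < head; i++)
        {
            if (KnownAssignedXidsValid[i])
            {
                KnownAssignedXids[compress_index] = KnownAssignedXids[i];
                KnownAssignedXidsValid[compress_index] = true;
                compress_index++;
            }
        }

        /* Cross-check our valid-entry bookkeeping */
        Assert(compress_index == pArray->numKnownAssignedXids);

        pArray->tailKnownAssignedXids = 0;
        pArray->headKnownAssignedXids = compress_index;
    }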