Hi,

On 2022-11-15 23:14:42 +0000, Simon Riggs wrote:
> * COMMITs: xids are removed from the array by performing a binary
>   search - this gets more and more expensive as the array gets wider
> * SNAPSHOTs: scanning the array for snapshots becomes more expensive
>   as the array gets wider
>
> Hence more frequent compression is effective at reducing the overhead.
> But too frequent compression slows down the startup process, which
> can't then keep up.
> So we're just looking for an optimal frequency of compression for any
> given workload.

What about making the behaviour adaptive based on the amount of wasted
effort during those two operations, rather than just a hardcoded
"emptiness" factor? It's common that basically no snapshots are taken,
and constantly compressing in that case is likely going to be wasted
effort.

The heuristic the patch adds to KnownAssignedXidsRemoveTree() seems
somewhat misplaced to me. When called from ProcArrayApplyXidAssignment()
we probably should always compress - it'll only be issued when a
substantial number of subxids have been assigned, so there'll be a bunch
of cleanup work. The heuristic makes more sense in
ExpireTreeKnownAssignedTransactionIds(), since that will very commonly
be called for individual xids - but even then, afaict, we should take
into account how many xids we've just expired.

I don't think the xids % KAX_COMPRESS_FREQUENCY == 0 filter is a good
idea - if you have a workload with plenty of subxids, you might end up
never compressing, because xids divisible by KAX_COMPRESS_FREQUENCY will
end up as subxids most/all of the time.

Re cost of processing at COMMITs: We do a fresh binary search for each
subxid right now. There's a lot of locality in the xids that can be
expired. Perhaps we could have KnownAssignedXidsSearch() cache the
position of the latest value it found, and search linearly if the
distance from the last looked-up value isn't large?

Greetings,

Andres
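PS: Roughly what I mean by the cached-position lookup, as a standalone
sketch - not actual KnownAssignedXids code; the names (xid_search_cached,
LINEAR_WINDOW) and the window size are invented for illustration:

```c
#include <assert.h>

/* Width of the linear probe before falling back to binary search
 * (an arbitrary illustrative value). */
#define LINEAR_WINDOW 8

typedef unsigned int Xid;

/* Cached index of the previous successful search, -1 if none. */
static int last_hit = -1;

/* Plain binary search over a sorted xid array; returns index or -1. */
static int
xid_binary_search(const Xid *a, int n, Xid key)
{
    int lo = 0, hi = n - 1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
            return mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}

/*
 * Lookup that exploits locality: scan a few entries forward from the
 * cached position of the last hit, and only fall back to a full binary
 * search when that short scan fails.
 */
static int
xid_search_cached(const Xid *a, int n, Xid key)
{
    if (last_hit >= 0 && last_hit < n)
    {
        for (int i = last_hit, steps = 0;
             i < n && steps < LINEAR_WINDOW;
             i++, steps++)
        {
            if (a[i] == key)
                return last_hit = i;
            if (a[i] > key)
                break;      /* sorted: key can't be further right */
        }
    }

    int idx = xid_binary_search(a, n, key);

    if (idx >= 0)
        last_hit = idx;
    return idx;
}
```

When commits expire runs of adjacent (sub)xids, most lookups after the
first would be satisfied by the short linear scan instead of a full
O(log n) search.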