On Mon, Aug 07, 2023 at 12:51:24PM +0200, Tomas Vondra wrote:
> The bad news is this seems to have negative impact on cases with few
> partitions, that'd fit into 16 slots. Which is not surprising, as the
> code has to walk longer arrays, it probably affects caching etc. So this
> would hurt the systems that don't use that many relations - not much,
> but still.
>
> The regression appears to be consistently ~3%, and v2 aimed to improve
> that - at least for the case with just 100 rows. It even gains ~5% in a
> couple cases. It's however a bit strange v2 doesn't really help the two
> larger cases.
>
> Overall, I think this seems interesting - it's hard to not like doubling
> the throughput in some cases. Yes, it's 100 rows only, and the real
> improvements are bound to be smaller, it would help short OLTP queries
> that only process a couple rows.

Indeed.  I wonder whether we could mitigate the regressions by using SIMD
intrinsics in the loops.  Or auto-vectorization, if that is possible.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
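
To make the two options concrete, here is a minimal sketch of what such a loop might look like. It is not taken from the patch: the names SLOT_COUNT, slots_contain_scalar and slots_contain_simd are hypothetical, the slot count of 16 comes from the quoted text, and storing the slots as plain 32-bit integers is an assumption. The scalar version has a fixed trip count and no early exit, which is the shape compilers can sometimes auto-vectorize; the second version uses SSE2 intrinsics explicitly and falls back to the scalar loop elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical slot count; 16 matches the number mentioned in the thread. */
#define SLOT_COUNT 16

/*
 * Scalar search with a fixed trip count and no early exit.  Accumulating
 * the result instead of breaking out of the loop gives the compiler a
 * chance to auto-vectorize it (not guaranteed; depends on compiler/flags).
 */
static bool
slots_contain_scalar(const uint32_t *slots, uint32_t key)
{
	bool		found = false;

	for (size_t i = 0; i < SLOT_COUNT; i++)
		found |= (slots[i] == key);

	return found;
}

/*
 * Explicit SSE2 version: compare four 32-bit slots per iteration and OR
 * the comparison masks together.  Falls back to the scalar loop when SSE2
 * is not available.  Assumes SLOT_COUNT is a multiple of 4.
 */
static bool
slots_contain_simd(const uint32_t *slots, uint32_t key)
{
#if defined(__SSE2__)
	const __m128i keys = _mm_set1_epi32((int) key);
	__m128i		acc = _mm_setzero_si128();

	for (size_t i = 0; i < SLOT_COUNT; i += 4)
	{
		/* unaligned load, so the array needs no special alignment */
		const __m128i chunk = _mm_loadu_si128((const __m128i *) &slots[i]);

		acc = _mm_or_si128(acc, _mm_cmpeq_epi32(chunk, keys));
	}

	/* any set bit means at least one lane matched */
	return _mm_movemask_epi8(acc) != 0;
#else
	return slots_contain_scalar(slots, key);
#endif
}
```

Whether either form recovers the ~3% regression would have to be measured; for arrays this small the cost of the loop itself may not be the dominant factor.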