Hello Thomas,

29.08.2024 01:16, Thomas Munro wrote:
Yeah.  That's quite interesting, and must destabilise that
simple-minded demo.  I'm curious to know exactly what contention is
causing that (about 3/4 of a millisecond that I don't see and now I
want to know what it's waiting for), but it's a very crude test
lacking timer resolution in the earlier messages, and it's an
unrelated topic and a distraction.  Perhaps it explains why you saw
two different behaviours in Q15 with the patch and I didn't, though.
Really it shouldn't be so sensitive to such variations, it's obviously
a terrible plan, and TPC-DS needs a planner hacker mega-brain applied
to it; I'm going to try to nerd-snipe one...

I looked at two perf profiles of such out-of-sync processes and found no
extra calls or whatsoever in the slow one, it just has the number of perf
samples increased proportionally. It made me suspect CPU frequency
scaling... Indeed, with the "performance" governor set and the boost mode
disabled, I'm now seeing much more stable numbers (I do this tuning before
running performance tests, but I had forgotten about that when I ran that
your test, my bad).

I'm sorry for the noise and the distraction.

Still, now I can confirm your results. Without the patch, two parallel
workers gave "Buffers: shared hit=217 / Buffers: shared hit=226" 10 times
out of 10. And with the patch, I've got
"shared hit=219 / shared hit=224" 3 times,
"shared hit=220 / shared hit=223" 4 times,
"shared hit=221 / shared hit=222" 3 times of 10.

On b7b0f3f27~1, my results are:
"shared hit=219 / shared hit=224": 2
"shared hit=220 / shared hit=223": 3
"shared hit=221 / shared hit=222": 4
"shared hit=218 / shared hit=225": 1

Best regards,
Alexander


Reply via email to