> 
> I could do some tests with the patch on some larger machines. What exact
> tests do you propose? Are there some specific postgresql.conf settings and
> pgbench initialization you recommend for this? And was the test above just
> running 'pgbench -S' select-only with specific -T, -j and -c parameters?
> 

With Andres' instructions I ran a couple of tests. With your patches I can
reliably reproduce a speedup of ~3% on single-core tests on a dual-socket
36-core machine for the pgbench select-only test case. When using the full
scale test, my results are unfortunately way too noisy, even for long runs. I
also tried some other queries (for example SELECTs that return 10 or 100 rows
instead of just 1), but can't see much of a speed-up there either, although it
also doesn't hurt.

So I guess the most noticeable one is the select-only benchmark for 1 core:

<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 30255419
latency average = 0.020 ms
latency stddev = 0.001 ms
tps = 50425.693234 (including connections establishing)
tps = 50425.841532 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 31363398
latency average = 0.019 ms
latency stddev = 0.001 ms
tps = 52272.326597 (including connections establishing)
tps = 52272.476380 (excluding connections establishing)

This is the one with 40 clients, 40 threads. Not really an improvement, and
still quite noisy.
<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 876846915
latency average = 0.027 ms
latency stddev = 0.015 ms
tps = 1461407.539610 (including connections establishing)
tps = 1461422.084486 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 872633979
latency average = 0.027 ms
latency stddev = 0.038 ms
tps = 1454387.326179 (including connections establishing)
tps = 1454396.879195 (excluding connections establishing)
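
As a quick sanity check on the numbers above, the relative change can be
computed from the "excluding connections establishing" tps values. This is
only a minimal sketch; the helper name relative_change is made up for
illustration and is not part of pgbench.

```python
def relative_change(master_tps: float, patched_tps: float) -> float:
    """Return the change of patched relative to master, in percent."""
    return (patched_tps - master_tps) / master_tps * 100.0


# tps values (excluding connections establishing) copied from the runs above,
# stored as (master, patched) pairs.
runs = {
    "1 client, 1 thread": (50425.841532, 52272.476380),
    "40 clients, 40 threads": (1461422.084486, 1454396.879195),
}

for name, (master, patched) in runs.items():
    print(f"{name}: {relative_change(master, patched):+.2f}%")
```

By this calculation the single-client run comes out a bit above +3.5%, and
the 40-client run is about half a percent slower, which is well within the
noise I see between runs.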

For tests that don't use the full machine (e.g. 10 clients, 10 threads) I see
speed-ups as well, but not as high as in the single-core run. It seems there
are other bottlenecks (on the machine) coming into play.

-Floris
