Hi,

On 2026-03-10 21:23:26 +0800, Xuneng Zhou wrote:
> On Tue, Mar 10, 2026 at 6:28 PM Michael Paquier <[email protected]> wrote:
> Thanks for running the benchmarks! The performance gains for hash,
> gin, bloom_vacuum, and wal_logging are insignificant, likely because
> these workloads are not I/O-bound. The default number of I/O workers
> is three, which is fairly conservative. When I ran the benchmark
> script with a higher number of I/O workers, some runs showed improved
> performance.

FWIW, another thing that may be an issue is that you're restarting postgres
all the time, as part of drop_caches().  That means we'll spend time reloading
catalog metadata and initializing shared buffers (the first write to a shared
buffers page is considerably more expensive than later ones, as the backing
memory needs to be initialized first).

I found it useful to use the pg_buffercache extension (specifically
pg_buffercache_evict_relation()) to evict just the relation being
tested from shared buffers.
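
For instance, assuming the table under test is heap_test (as in your
script), something along these lines evicts only that relation's
buffers, with no server restart (a sketch; requires a PostgreSQL
version that provides pg_buffercache_evict_relation()):

  CREATE EXTENSION IF NOT EXISTS pg_buffercache;
  -- evict every buffer belonging to heap_test, forcing the next
  -- scan to read it back in, without touching the rest of the cache
  SELECT * FROM pg_buffercache_evict_relation('heap_test'::regclass);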



> > pgstattuple_large          base= 12429.3ms  patch= 11916.8ms   1.04x
> > (  4.1%)  (reads=206945->12983, io_time=6501.91->32.24ms)
> 
> > pgstattuple_large          base= 12642.9ms  patch= 11873.5ms   1.06x
> > (  6.1%)  (reads=206945->12983, io_time=6516.70->143.46ms)
> 
> Yeah, this looks somewhat strange. The io_time has been reduced
> significantly, which should also lead to a substantial reduction in
> runtime.

It's possible that the bottleneck just moved, e.g. to the checksum computation,
if you have data checksums enabled.

It's also worth noting that each test rep likely measures something
different, as
  psql_run "$ROOT" "$PORT" -c "UPDATE heap_test SET data = data || '!' WHERE id % 5 = 0;"

leads to some out-of-page updates.

You're probably better off deleting some of the data in a transaction that is
then rolled back. That will also unset all-visible, but won't otherwise change
the layout, no matter how many test iterations you run.
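
Concretely, something like this (a sketch, reusing the heap_test table
and the id % 5 selection from your script as examples):

  BEGIN;
  DELETE FROM heap_test WHERE id % 5 = 0;
  ROLLBACK;
  -- the aborted delete clears the all-visible bits for the pages it
  -- touched, but the tuple layout stays identical across iterations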


I'd also guess that you're seeing a relatively small win because you're
updating every page. When reading every page from disk, the OS can do
efficient readahead.  If there are only occasional misses, that does not work.



> method=io_uring
> pgstattuple_large          base=  5551.5ms  patch=  3498.2ms   1.59x
> ( 37.0%)  (reads=206945→12983, io_time=2323.49→207.14ms)
> 
> I ran the benchmark for this test again with io_uring, and the result
> is consistent with previous runs. I’m not sure what might be
> contributing to this behavior.

What does a perf profile show?  Is the query CPU bound?
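
E.g. something along these lines (hypothetical invocation, with
BACKEND_PID being the pid of the backend running the query, as
reported by pg_backend_pid()):

  perf record -g -p "$BACKEND_PID" -- sleep 10
  perf report --no-children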


> Another code path that showed significant performance improvement is
> pgstatindex [1]. I've incorporated the test into the script too. Here
> are the results from my testing:
> 
> method=worker io-workers=12
> pgstatindex_large          base=   233.8ms  patch=    54.1ms   4.32x
> ( 76.8%)  (reads=27460→1757, io_time=213.94→6.31ms)
> 
> method=io_uring
> pgstatindex_large          base=   224.2ms  patch=    56.4ms   3.98x
> ( 74.9%)  (reads=27460→1757, io_time=204.41→4.88ms)

Nice!


Greetings,

Andres Freund

