Hi,

On 2026-03-10 21:23:26 +0800, Xuneng Zhou wrote:
> On Tue, Mar 10, 2026 at 6:28 PM Michael Paquier <[email protected]> wrote:
>
> Thanks for running the benchmarks! The performance gains for hash,
> gin, bloom_vacuum, and wal_logging are insignificant, likely because
> these workloads are not I/O-bound. The default number of I/O workers
> is three, which is fairly conservative. When I ran the benchmark
> script with a higher number of I/O workers, some runs showed improved
> performance.
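For reference, the worker count is governed by the io_workers GUC when
io_method = 'worker'; a sketch of a postgresql.conf fragment for testing
with a less conservative setting (the specific value 12 here is just an
example, matching the pgstatindex run below) could be:

    io_method = 'worker'   # perform AIO via background I/O worker processes
    io_workers = 12        # default is 3, often too low for I/O-bound scans

If memory serves, io_method requires a server restart while io_workers
can be changed with a configuration reload, which makes it cheap to
sweep different worker counts in a benchmark script.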
FWIW, another thing that may be an issue is that you're restarting
postgres all the time, as part of drop_caches(). That means we'll spend
time reloading catalog metadata and initializing shared buffers (the
first write to a shared-buffers page is considerably more expensive than
later ones, as the backing memory needs to be initialized first).

I found it useful to use the pg_buffercache extension (specifically
pg_buffercache_evict_relation()) to just drop the relation that is going
to be tested from shared_buffers.

> > pgstattuple_large base= 12429.3ms patch= 11916.8ms 1.04x
> > ( 4.1%) (reads=206945->12983, io_time=6501.91->32.24ms)
> >
> > pgstattuple_large base= 12642.9ms patch= 11873.5ms 1.06x
> > ( 6.1%) (reads=206945->12983, io_time=6516.70->143.46ms)
>
> Yeah, this looks somewhat strange. The io_time has been reduced
> significantly, which should also lead to a substantial reduction in
> runtime.

It's possible that the bottleneck just moved, e.g. to the checksum
computation, if you have data checksums enabled.

It's also worth noting that each of the test reps likely measures
something different, as

  psql_run "$ROOT" "$PORT" -c "UPDATE heap_test SET data = data || '!' WHERE id % 5 = 0;"

likely leads to some out-of-page updates. You're probably better off
deleting some of the data in a transaction that is then rolled back.
That will also unset all-visible, but won't otherwise change the layout,
no matter how many test iterations you run.

I'd also guess that you're seeing a relatively small win because you're
updating every page. When reading every page from disk, the OS can do
efficient readahead. If there are only occasional misses, that does not
work.

> method=io_uring
> pgstattuple_large base= 5551.5ms patch= 3498.2ms 1.59x
> ( 37.0%) (reads=206945->12983, io_time=2323.49->207.14ms)
>
> I ran the benchmark for this test again with io_uring, and the result
> is consistent with previous runs. I'm not sure what might be
> contributing to this behavior.
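To make the two suggestions above concrete, a minimal SQL sketch
(assuming the heap_test table and id/data columns from the benchmark
script, and a server new enough to have
pg_buffercache_evict_relation()):

    -- Evict only the relation under test, instead of restarting postgres:
    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    SELECT * FROM pg_buffercache_evict_relation('heap_test');

    -- Unset all-visible without perturbing the page layout: the deleted
    -- tuples' xmax is set, but the rollback means no new tuple versions
    -- are created, so the relation stays byte-for-byte comparable across
    -- test iterations.
    BEGIN;
    DELETE FROM heap_test WHERE id % 5 = 0;
    ROLLBACK;

Compared with the UPDATE in the script, the rolled-back DELETE keeps the
on-disk layout stable, so repeated reps of the benchmark measure the
same physical relation.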
What does a perf profile show? Is the query CPU bound?

> Another code path that showed significant performance improvement is
> pgstatindex [1]. I've incorporated the test into the script too. Here
> are the results from my testing:
>
> method=worker io-workers=12
> pgstatindex_large base= 233.8ms patch= 54.1ms 4.32x
> ( 76.8%) (reads=27460->1757, io_time=213.94->6.31ms)
>
> method=io_uring
> pgstatindex_large base= 224.2ms patch= 56.4ms 3.98x
> ( 74.9%) (reads=27460->1757, io_time=204.41->4.88ms)

Nice!

Greetings,

Andres Freund
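One hedged way to answer the profiling question above, assuming Linux
perf is available and the query runs long enough to sample (the backend
PID comes from pg_backend_pid() in the session running the query):

    # Sample the backend executing the pgstattuple query, then inspect
    # where CPU time goes; a checksum or memcpy hot spot would suggest
    # the query has become CPU bound rather than I/O bound.
    perf record -g -p <backend_pid> -- sleep 10
    perf report --stdio | head -30

If most samples land in PostgreSQL's page-verification or tuple-scan
routines rather than in I/O wait, that would support the "bottleneck
moved" explanation for the modest pgstattuple speedup.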
