On Mon, Mar 18, 2019 at 5:12 PM Peter Geoghegan <[email protected]> wrote:
> Smarter choices on page splits pay off with higher client counts
> because they reduce contention at likely hot points. It's kind of
> crazy that the code in _bt_check_unique() sometimes has to move right,
> while holding an exclusive buffer lock on the original page and a
> shared buffer lock on its sibling page at the same time. It then has
> to hold a third buffer lock concurrently, this time on any heap pages
> it is interested in.
Actually, by the time we get to 16 clients, this workload does leave the
indexes and tables smaller with the patch than on master. Here is
pg_buffercache output collected after the first 16-client run (a rough
sketch of the query used follows the two tables):
Master
======
         relname         │ relforknumber │ size_main_rel_fork_blocks │ buffer_count │   avg_buffer_usg
─────────────────────────┼───────────────┼───────────────────────────┼──────────────┼────────────────────
 pgbench_history         │             0 │                   123,484 │      123,484 │ 4.9989715266755207
 pgbench_accounts        │             0 │                    34,665 │       10,682 │ 4.4948511514697622
 pgbench_accounts_pkey   │             0 │                     5,708 │        1,561 │ 4.8731582319026265
 pgbench_tellers         │             0 │                       489 │          489 │ 5.0000000000000000
 pgbench_branches        │             0 │                       284 │          284 │ 5.0000000000000000
 pgbench_tellers_pkey    │             0 │                        56 │           56 │ 5.0000000000000000
 ....
Patch
=====
         relname         │ relforknumber │ size_main_rel_fork_blocks │ buffer_count │   avg_buffer_usg
─────────────────────────┼───────────────┼───────────────────────────┼──────────────┼────────────────────
 pgbench_history         │             0 │                   127,864 │      127,864 │ 4.9980447975974473
 pgbench_accounts        │             0 │                    33,933 │        9,614 │ 4.3517786561264822
 pgbench_accounts_pkey   │             0 │                     5,487 │        1,322 │ 4.8857791225416036
 pgbench_tellers         │             0 │                       204 │          204 │ 4.9803921568627451
 pgbench_branches        │             0 │                       198 │          198 │ 4.3535353535353535
 pgbench_tellers_pkey    │             0 │                        14 │           14 │ 5.0000000000000000
 ....
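(Both tables were generated with a query along these lines -- reconstructed
here rather than copied verbatim, so treat it as a sketch: "avg_buffer_usg"
is just avg(usagecount), the size column is pg_relation_size() expressed in
blocks, and the thousands separators in the output come from formatting
that I've omitted:)

SELECT c.relname,
       b.relforknumber,
       pg_relation_size(c.oid::regclass) /
           current_setting('block_size')::int AS size_main_rel_fork_blocks,
       count(*) AS buffer_count,
       avg(b.usagecount) AS avg_buffer_usg
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid::regclass)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
  AND b.relforknumber = 0
GROUP BY c.oid, c.relname, b.relforknumber
ORDER BY buffer_count DESC;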
The main fork for pgbench_history is larger with the patch, but that's
expected and good: pgbench_history grows by one row per transaction, so a
bigger history table just means the patch case got through more
transactions. pgbench_accounts_pkey is about 4% smaller, which is probably
the most interesting observation that can be made here, but the tables are
also smaller: pgbench_accounts itself is ~2% smaller, pgbench_branches is
~30% smaller, and pgbench_tellers is ~60% smaller (the exact numbers are
worked out after this paragraph).
Of course, the smaller tables were already very small, so maybe that
isn't important. I think that this is due to more effective pruning,
possibly because we get better lock arbitration as a consequence of
better splits, and also because duplicates are in heap TID order. I
haven't observed this effect with larger databases, which have been my
focus.
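(To spell out where those percentages come from, it's just arithmetic over
the size_main_rel_fork_blocks column from the two tables; something like
this reproduces them:)

SELECT relname,
       master_blocks,
       patch_blocks,
       round(100.0 * (master_blocks - patch_blocks) / master_blocks, 1)
           AS pct_smaller
FROM (VALUES ('pgbench_accounts_pkey',  5708,  5487),
             ('pgbench_accounts',      34665, 33933),
             ('pgbench_branches',        284,   198),
             ('pgbench_tellers',         489,   204)
     ) AS t(relname, master_blocks, patch_blocks);

That gives roughly 3.9%, 2.1%, 30.3%, and 58.3%, which is where the rounded
figures above come from.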
It isn't weird that shared_buffers doesn't contain all of the
pgbench_accounts blocks, since, of course, the access pattern is highly
skewed by design -- most of the table's blocks were never accessed at all.
This effect seems to be robust, at least with this workload. The
second round of benchmarks (which has its own pgbench -i
initialization) shows very similar amounts of bloat at the same point.
It may not be that significant, but it's also not a fluke.
--
Peter Geoghegan