On 2015-09-01 10:19:19 +0530, Amit Kapila wrote:
> pgbench setup
> ------------------------
> scale factor - 300
> Data is on magnetic disk and WAL on ssd.
> pgbench -M prepared tpc-b
>
> HEAD    - commit 0e141c0f
> Patch-1 - increase_clog_bufs_v1
>
> Client Count/Patch_ver    1     8     16     32     64    128    256
> HEAD                    911  5695   9886  18028  27851  28654  25714
> Patch-1                 954  5568   9898  18450  29313  31108  28213
>
> This data shows an increase of ~5% at 64 clients and 8~10% at higher
> client counts, without degradation at lower client counts. There is
> some fluctuation in the above data at 8 clients, but I attribute that
> to run-to-run variation; if anybody has doubts, I can re-verify the
> data at lower client counts.
> Now if we try to further increase the number of CLOG buffers to 128,
> no improvement is seen.
>
> I have also verified that this improvement can be seen only after the
> contention around ProcArrayLock is reduced. Below is the data with a
> commit from before the ProcArrayLock reduction patch. Setup and test
> are the same as for the previous test.

The buffer replacement algorithm for clog is rather stupid - I do wonder
where the cutoff is at which it starts to hurt. Could you perhaps try to
create a testcase where the xids accessed are so far apart on average
that they're unlikely to be in memory? And then test that across a number
of client counts?

There are two reasons I'd like to see that: first, I'd like to avoid
regressions; second, I'd like to avoid having to bump the maximum number
of buffers by a small amount after every hardware generation...

>  /*
>   * Number of shared CLOG buffers.
>   *
> - * Testing during the PostgreSQL 9.2 development cycle revealed that on a
> + * Testing during the PostgreSQL 9.6 development cycle revealed that on a
>   * large multi-processor system, it was possible to have more CLOG page
> - * requests in flight at one time than the number of CLOG buffers which existed
> - * at that time, which was hardcoded to 8.  Further testing revealed that
> - * performance dropped off with more than 32 CLOG buffers, possibly because
> - * the linear buffer search algorithm doesn't scale well.
> + * requests in flight at one time than the number of CLOG buffers which
> + * existed at that time, which was 32 assuming there are enough shared_buffers.
> + * Further testing revealed that performance either stayed the same or dropped
> + * off with more than 64 CLOG buffers, possibly because the linear buffer
> + * search algorithm doesn't scale well, or because other locking bottlenecks
> + * in the system mask the improvement.
>   *
> - * Unconditionally increasing the number of CLOG buffers to 32 did not seem
> + * Unconditionally increasing the number of CLOG buffers to 64 did not seem
>   * like a good idea, because it would increase the minimum amount of shared
>   * memory required to start, which could be a problem for people running very
>   * small configurations.  The following formula seems to represent a
>   * reasonable compromise: people with very low values for shared_buffers will
> - * get fewer CLOG buffers as well, and everyone else will get 32.
> + * get fewer CLOG buffers as well, and everyone else will get 64.
>   *
>   * It is likely that some further work will be needed here in future releases;
>   * for example, on a 64-core server, the maximum number of CLOG requests that
>   * can be simultaneously in flight will be even larger.  But that will
>   * apparently require more than just changing the formula, so for now we take
> - * the easy way out.
> + * the easy way out.  It could also happen that, after other locking
> + * bottlenecks are removed, a further increase in CLOG buffers would help,
> + * but that's not the case now.
>   */

I think the comment should be more drastically rephrased so that it doesn't
reference individual versions and numbers.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers