On Thu, Mar 31, 2016 at 3:48 PM, Andres Freund <and...@anarazel.de> wrote:
>
> On 2016-03-31 15:07:22 +0530, Amit Kapila wrote:
> > On Thu, Mar 31, 2016 at 4:39 AM, Andres Freund <and...@anarazel.de> wrote:
> > >
> > > On 2016-03-28 22:50:49 +0530, Amit Kapila wrote:
> > > > On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila <amit.kapil...@gmail.com>
> > > > wrote:
> > > >
> > >
> > > Amit, could you run benchmarks on your bigger hardware? Both with
> > > USE_CONTENT_LOCK commented out and in?
> > >
> >
> > Yes.
>
> Cool.
>
Here is the performance data (the configuration of the machine used to
perform this test is given at the end of the mail).

Non-default parameters
------------------------------------
max_connections = 300
shared_buffers = 8GB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

Median of 3, 20-min pgbench tpc-b runs with --unlogged-tables (TPS):

Client Count          2       64      128
HEAD+clog_buf_128  4930    66754    68818
group_clog_v8      5753    69002    78843
content_lock       5668    70134    70501
nocontent_lock     4787    69531    70663

I am not exactly sure why the content lock patch (USE_CONTENT_LOCK defined in
0003-Use-a-much-more-granular-locking-model-for-the-clog-) or the no content
lock patch (USE_CONTENT_LOCK not defined) gives poorer performance at 128
clients; it may be due to some bug in the patch, or due to the reason
mentioned by Robert [1] (usage of two locks instead of one).  On running the
test many times with the content lock and no content lock patches, they
sometimes give 80~81K TPS at 128 clients, which is approximately 3% higher
than the group_clog_v8 patch.  This indicates that the group clog approach is
able to address most of the remaining contention (after increasing clog
buffers) around CLOGControlLock.

There is one small regression observed with the no content lock patch at the
lower client count (2), which might be due to run-to-run variation, or it
might be due to the increased number of instructions from the atomic ops;
this needs to be investigated if we want to pursue the no content lock
approach.

Note, I have not posted TPS numbers for HEAD, as I have already shown above
that increasing the clog buffers has increased TPS from ~36K to ~68K at 128
client-count.

M/c details
-----------------
Power m/c config (lscpu)
-------------------------------------
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                192
On-line CPU(s) list:   0-191
Thread(s) per core:    8
Core(s) per socket:    1
Socket(s):             24
NUMA node(s):          4
Model:                 IBM,8286-42A
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0-47
NUMA node1 CPU(s):     48-95
NUMA node2 CPU(s):     96-143
NUMA node3 CPU(s):     144-191

[1] - http://www.postgresql.org/message-id/CA+TgmoYjpNKdHDFUtJLAMna-O5LGuTDnanHFAOT5=hn_vau...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
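P.S. For anyone skimming the thread who hasn't looked at the patches, below is
a rough standalone sketch of the general leader/follower batching idea behind
a group update: waiters push themselves onto a lock-free list, and the first
of them (the leader) applies the whole batch of status updates under a single
acquisition of the central lock.  This is purely illustrative, written with
C11 atomics and pthreads; the names (group_update_status, GroupReq, clog_lock,
etc.) and all details are made up for the example and are not taken from the
actual patch.  The point of the pattern is that the number of acquisitions of
the central lock scales with the number of batches rather than the number of
transactions, which is why it helps most at high client counts.

/*
 * Toy sketch of a "group update": waiters queue on a lock-free list and the
 * leader applies the whole batch under one lock acquisition.
 * Build with: cc -O2 -pthread group_update_sketch.c
 */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NXACTS   64             /* toy "clog": one status byte per xid */
#define NWORKERS 8

static char clog_status[NXACTS];    /* protected by clog_lock */
static pthread_mutex_t clog_lock = PTHREAD_MUTEX_INITIALIZER;  /* stand-in for the central lock */

typedef struct GroupReq
{
    int         xid;            /* which slot to update */
    char        status;         /* value to store */
    _Atomic(struct GroupReq *) next;
    atomic_bool done;           /* set by the leader once applied */
} GroupReq;

static _Atomic(GroupReq *) group_head;  /* pending requests */

static void
group_update_status(GroupReq *req)
{
    GroupReq   *head = atomic_load(&group_head);

    atomic_store(&req->done, false);

    /* Push ourselves onto the pending list (lock-free). */
    do
        atomic_store(&req->next, head);
    while (!atomic_compare_exchange_weak(&group_head, &head, req));

    if (head != NULL)
    {
        /* Someone else is (or will become) the leader; wait for them. */
        while (!atomic_load(&req->done))
            sched_yield();
        return;
    }

    /* We are the leader: one lock acquisition covers the whole batch. */
    pthread_mutex_lock(&clog_lock);
    head = atomic_exchange(&group_head, NULL);  /* detach the list */
    while (head != NULL)
    {
        GroupReq   *next = atomic_load(&head->next);

        clog_status[head->xid] = head->status;
        atomic_store(&head->done, true);    /* release that follower */
        head = next;
    }
    pthread_mutex_unlock(&clog_lock);
}

static void *
worker(void *arg)
{
    GroupReq    req = {.xid = (int) (long) arg, .status = 1};

    group_update_status(&req);
    return NULL;
}

int
main(void)
{
    pthread_t   tid[NWORKERS];

    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&tid[i], NULL, worker, (void *) i);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NWORKERS; i++)
        printf("xid %d -> status %d\n", i, clog_status[i]);
    return 0;
}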