On Mon, May 21, 2018 at 9:03 AM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Wed, Apr 11, 2018 at 1:05 PM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
> > I heard through the grapevine of some people currently investigating
> > performance problems on busy FreeBSD systems, possibly related to the
> > postmaster pipe.  I suspect this patch might be a part of the solution
> > (other patches probably needed to get maximum value out of this patch:
> > reuse WaitEventSet objects in some key places, and get rid of high
> > frequency PostmasterIsAlive() read() calls).  The autoconf-fu in the
> > last version bit-rotted so it seemed like a good time to post a
> > rebased patch.
>

Hi everyone,

I have benchmarked the change on a FreeBSD box and found a big
performance win once the number of clients goes beyond the number of
hardware threads on the target machine. For smaller numbers of clients
the win was very modest.

The test was performed a few weeks ago. For convenience, PostgreSQL 10.3
as found in the ports tree was used. Three variants were tested:
- stock 10.3
- stock 10.3 + pdeathsig
- stock 10.3 + pdeathsig + kqueue

The appropriate patches were provided by Thomas.

In order to keep this message PG-13 I'm not going to show the actual
script, but a mere outline:

for i in $(seq 1 10); do
    for t in vanilla pdeathsig pdeathsig_kqueue; do
        # start up the relevant version
        for c in 32 64 96; do
            pgbench -j 96 -c $c -T 120 -M prepared -S -U bench \
                -h 172.16.0.2 -P1 bench > ${t}-${c}-out-warmup 2>&1
            pgbench -j 96 -c $c -T 120 -M prepared -S -U bench \
                -h 172.16.0.2 -P1 bench > ${t}-${c}-out 2>&1
        done
        # shut down the relevant version
    done
done

Data from the warmup runs is not used. All the data was pre-read prior
to the test. PostgreSQL was configured with 32GB of shared buffers and
200 max connections; otherwise the defaults were used.

The server is:

Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz
2 package(s) x 8 core(s) x 2 hardware threads, i.e. 32 threads in total

running FreeBSD -head with 'options NUMA' in the kernel config and
sysctl net.inet.tcp.per_cpu_timers=1, on top of ZFS. The load was
generated from a different box over a 100Gbit ethernet link.
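The results below are ministat(1) output over the ten runs for each
variant and client count. A sketch of how the cumulative-tps-* input
files could be built from the pgbench logs (illustrative only: the
per-run output file naming and the awk extraction of pgbench's final
"tps = ... (excluding connections establishing)" line are assumptions,
not the actual script):

for t in vanilla pdeathsig pdeathsig_kqueue; do
    for c in 32 64 96; do
        for i in $(seq 1 10); do
            # pull the end-of-run tps figure out of run $i's pgbench log
            awk '/^tps.*excluding/ { print $3 }' "${t}-${c}-${i}-out"
        done > "cumulative-tps-${t}-${c}"
    done
done

# compare the three builds at a given client count, e.g. 64 clients
ministat -w 74 cumulative-tps-vanilla-64 cumulative-tps-pdeathsig-64 \
    cumulative-tps-pdeathsig_kqueue-64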
x cumulative-tps-vanilla-32
+ cumulative-tps-pdeathsig-32
* cumulative-tps-pdeathsig_kqueue-32
    N           Min           Max        Median           Avg        Stddev
x  10     442898.77     448476.81     444805.17     445062.08     1679.7169
+  10      442057.2     447835.46     443840.28     444235.01     1771.2254
No difference proven at 95.0% confidence
*  10     448138.07     452786.41     450274.56     450311.51     1387.2927
Difference at 95.0% confidence
        5249.43 +/- 1447.41
        1.17948% +/- 0.327501%
        (Student's t, pooled s = 1540.46)

x cumulative-tps-vanilla-64
+ cumulative-tps-pdeathsig-64
* cumulative-tps-pdeathsig_kqueue-64
    N           Min           Max        Median           Avg        Stddev
x  10     411849.26      422145.5     416043.77      416061.9     3763.2545
+  10     407123.74     425727.84     419908.73      417480.7     6817.5549
No difference proven at 95.0% confidence
*  10     542032.71     546106.93     543948.05     543874.06     1234.1788
Difference at 95.0% confidence
        127812 +/- 2631.31
        30.7195% +/- 0.809892%
        (Student's t, pooled s = 2800.47)

x cumulative-tps-vanilla-96
+ cumulative-tps-pdeathsig-96
* cumulative-tps-pdeathsig_kqueue-96
    N           Min           Max        Median           Avg        Stddev
x  10      325263.7        336338     332399.16     331321.82     3571.2478
+  10     321213.33     338669.66     329553.78     330903.58      5652.008
No difference proven at 95.0% confidence
*  10     503877.22     511449.96     508708.41     508808.51     2016.9483
Difference at 95.0% confidence
        177487 +/- 2724.98
        53.5693% +/- 1.17178%
        (Student's t, pooled s = 2900.16)

-- 
Mateusz Guzik <mjguzik gmail.com>