Hello,

At PGConf US Philly last week I was talking with Jim and Jan about performance. One of the items that came up is that PostgreSQL can't run at full throttle for long periods of time. The long and short of it is that, no matter what, autovacuum can't keep up. This is what I did:

Machine:

16vCPU
59G Memory
10G SSD (/)
500G SSD /srv/main/9.6 (PGDATA) : 240MB/s sustained with 15k IOPS
                * Yes, we really got 240MB/s sustained throughput

I used BenchmarkSQL, which is a TPC-C benchmark similar to pgbench but supposedly more thorough.

https://sourceforge.net/projects/benchmarksql/
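
For anyone who wants to reproduce the setup: the schema build and the 500-warehouse load use the scripts that ship with BenchmarkSQL. The script names below are from memory of the 5.0 tree (the older 4.x releases split the build into runSQL.sh and runLoader.sh), so treat this as a sketch rather than gospel:

cd benchmarksql/run
./runDatabaseBuild.sh my_postgres.properties    # create the schema and load the 500 warehouses
./runBenchmark.sh my_postgres.properties        # the measured run itself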

PostgreSQL 9.6 on Ubuntu 16.04 x64.

postgresql.conf:

max_connections: 1000 (just to keep it out of the way)
shared_buffers: 32G (Awesome work Haas)
work_mem: 32M
maintenance_work_mem: 2G
effective_io_concurrency: 1

* Before anybody suggests increasing this: across more than a dozen tests on GCE, anything other than disabling it appears to be a performance hit of ~10% (I can reproduce those tests on another thread if you like).

synchronous_commit: off
checkpoint_timeout: 60min
max_wal_size: 5G
random_page_cost: 1
effective_cache_size: 32GB
        * This probably should be more like 50GB, but still
autovacuum_max_workers: 12
        * One for each table + a couple for system tables
autovacuum_vacuum_scale_factor: 0.1
autovacuum_cost_delay: 5ms
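
If you want to watch whether autovacuum is keeping up while the benchmark hammers away, a query along these lines against pg_stat_user_tables shows dead tuples piling up (or not) per table:

-- How far behind is autovacuum? Dead tuples growing while autovacuum_count
-- barely moves is the symptom we are talking about here.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       last_autovacuum,
       autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;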

Here are the BenchmarkSQL settings for all 4 runs:

17:07:54,268 [main] INFO   jTPCC : Term-00, warehouses=500
17:07:54,269 [main] INFO   jTPCC : Term-00, terminals=128
17:07:54,272 [main] INFO   jTPCC : Term-00, runTxnsPerTerminal=100000
17:07:54,273 [main] INFO   jTPCC : Term-00, limitTxnsPerMin=300000
17:07:54,273 [main] INFO   jTPCC : Term-00, terminalWarehouseFixed=false
17:07:54,274 [main] INFO   jTPCC : Term-00,
17:07:54,274 [main] INFO   jTPCC : Term-00, newOrderWeight=45
17:07:54,274 [main] INFO   jTPCC : Term-00, paymentWeight=43
17:07:54,274 [main] INFO   jTPCC : Term-00, orderStatusWeight=4
17:07:54,275 [main] INFO   jTPCC : Term-00, deliveryWeight=4
17:07:54,275 [main] INFO   jTPCC : Term-00, stockLevelWeight=4
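
Those values come from my_postgres.properties, which looks roughly like this (the connection lines here are placeholders; the workload lines are the ones echoed in the log above):

db=postgres
driver=org.postgresql.Driver
conn=jdbc:postgresql://localhost:5432/benchmarksql
user=postgres
password=postgres
warehouses=500
terminals=128
runTxnsPerTerminal=100000
limitTxnsPerMin=300000
terminalWarehouseFixed=false
newOrderWeight=45
paymentWeight=43
orderStatusWeight=4
deliveryWeight=4
stockLevelWeight=4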

For run 0, I started with:

vacuumdb -U postgres -fz;./runBenchmark.sh my_postgres.properties

And then for each subsequent run, I just ran the benchmark without the vacuum full so that PostgreSQL could prove us wrong. It didn't. Here is the breakdown of the results:

RUN     START DISK SIZE (GB)    END DISK SIZE (GB)      TPS/Terminal
0       54                      78                      868.6796875
1       78                      91                      852.4765625
2       91                      103                     741.4609375
3       103                     116                     686.125
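
(Sizes are in GB. Something along these lines, run before and after each pass, is enough to capture them; adjust the database name to wherever the benchmark schema lives.)

# PGDATA footprint straight from the filesystem...
du -sh /srv/main/9.6
# ...or the database size from inside PostgreSQL
psql -U postgres -d postgres -c "SELECT pg_size_pretty(pg_database_size(current_database()));"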

The good news is, PostgreSQL is not doing half bad against 128 connections with only 16 vCPUs. The bad news is that we more than doubled our disk footprint without getting space reuse or bloat under control. The concern here is that under heavy, persistent write loads we will eventually bloat out and have to VACUUM FULL, no matter what. I know that Jan has done some testing and the best he could get was something like 8 days before PostgreSQL became unusable (but don't quote me on that).
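
To see which relations account for the growth, and how much of each is dead rows waiting to be reclaimed, a quick per-table breakdown like this works:

-- Largest tables (including their indexes) and their live/dead row counts
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;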

I am open to comments, suggestions, running multiple tests with different parameters or just leaving this in the archive for people to reference.

Thanks in advance,

JD


--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****

