Hello,

At PGConf US Philly last week I was talking with Jim and Jan about performance. One of the items that came up is that PostgreSQL can't run at full throttle for long periods of time. The long and short of it is that, no matter what, autovacuum can't keep up. This is what I did:

Machine:

16vCPU
59G Memory
10G SSD (/)
500G SSD /srv/main/9.6 (PGDATA) : 240MB/s sustained with 15k IOPS
                * Yes, we really got 240MB/s sustained throughput

I used BenchmarkSQL, which is a TPC-C benchmark similar to pgbench but supposedly more thorough.

https://sourceforge.net/projects/benchmarksql/
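
For anyone who wants to reproduce the setup: the schema build and the 500-warehouse load use the scripts that ship with BenchmarkSQL. The script names below are from memory of the 5.0 tree (the older 4.x releases split the build into runSQL.sh and runLoader.sh), so treat this as a sketch rather than gospel:

cd benchmarksql/run
./runDatabaseBuild.sh my_postgres.properties    # create the schema and load the 500 warehouses
./runBenchmark.sh my_postgres.properties        # the measured run itself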

PostgreSQL 9.6 on Ubuntu 16.04 x64.

postgresql.conf:

max_connections: 1000 (just to keep it out of the way)
shared_buffers: 32G (Awesome work Haas)
work_mem: 32M
maintenance_work_mem: 2G
effective_io_concurrency: 1

* Before anybody suggests increasing this: across more than a dozen tests on GCE, anything other than disabling it appears to be a performance hit of ~10% (I can reproduce those tests on another thread if you like).

synchronous_commit: off
checkpoint_timeout: 60min
max_wal_size: 5G
random_page_cost: 1
effective_cache_size: 32GB
        * This probably should be more like 50GB, but still
autovacuum_max_workers: 12
        * One for each table + a couple for system tables
autovacuum_vacuum_scale_factor: 0.1
autovacuum_cost_delay: 5ms
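
If you want to watch whether autovacuum is keeping up while the benchmark hammers away, a query along these lines against pg_stat_user_tables shows dead tuples piling up (or not) per table:

-- How far behind is autovacuum? Dead tuples growing while autovacuum_count
-- barely moves is the symptom we are talking about here.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       last_autovacuum,
       autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;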

Here are the BenchmarkSQL settings for all 4 runs:

17:07:54,268 [main] INFO   jTPCC : Term-00, warehouses=500
17:07:54,269 [main] INFO   jTPCC : Term-00, terminals=128
17:07:54,272 [main] INFO   jTPCC : Term-00, runTxnsPerTerminal=100000
17:07:54,273 [main] INFO   jTPCC : Term-00, limitTxnsPerMin=300000
17:07:54,273 [main] INFO   jTPCC : Term-00, terminalWarehouseFixed=false
17:07:54,274 [main] INFO   jTPCC : Term-00,
17:07:54,274 [main] INFO   jTPCC : Term-00, newOrderWeight=45
17:07:54,274 [main] INFO   jTPCC : Term-00, paymentWeight=43
17:07:54,274 [main] INFO   jTPCC : Term-00, orderStatusWeight=4
17:07:54,275 [main] INFO   jTPCC : Term-00, deliveryWeight=4
17:07:54,275 [main] INFO   jTPCC : Term-00, stockLevelWeight=4
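
Those values come from my_postgres.properties, which looks roughly like this (the connection lines here are placeholders; the workload lines are the ones echoed in the log above):

db=postgres
driver=org.postgresql.Driver
conn=jdbc:postgresql://localhost:5432/benchmarksql
user=postgres
password=postgres
warehouses=500
terminals=128
runTxnsPerTerminal=100000
limitTxnsPerMin=300000
terminalWarehouseFixed=false
newOrderWeight=45
paymentWeight=43
orderStatusWeight=4
deliveryWeight=4
stockLevelWeight=4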

For run 0, I started with:

vacuumdb -U postgres -fz;./runBenchmark.sh my_postgres.properties

And then for each subsequent run, I just ran the benchmark without the vacuum full so that PostgreSQL could prove us wrong. It didn't. Here is the breakdown of the results:

RUN     START DISK SIZE (GB)    END DISK SIZE (GB)      TPS/Terminal
0       54                      78                      868.6796875
1       78                      91                      852.4765625
2       91                      103                     741.4609375
3       103                     116                     686.125
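
(Sizes are in GB. Something along these lines, run before and after each pass, is enough to capture them; adjust the database name to wherever the benchmark schema lives.)

# PGDATA footprint straight from the filesystem...
du -sh /srv/main/9.6
# ...or the database size from inside PostgreSQL
psql -U postgres -d postgres -c "SELECT pg_size_pretty(pg_database_size(current_database()));"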

The good news is, PostgreSQL is not doing half bad against 128 connections with only 16 vCPUs. The bad news is that we more than doubled our disk footprint without getting space reuse or bloat under control. The concern here is that under heavy, persistent write loads we will eventually bloat out and have to VACUUM FULL, no matter what. I know that Jan has done some testing and the best he could get was something like 8 days before PostgreSQL became unusable (but don't quote me on that).
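
To see which relations account for the growth, and how much of each is dead rows waiting to be reclaimed, a quick per-table breakdown like this works:

-- Largest tables (including their indexes) and their live/dead row counts
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;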

I am open to comments, suggestions, running multiple tests with different parameters or just leaving this in the archive for people to reference.

Thanks in advance,

JD


--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****

