I repeated your test on a ProLiant DL580 Gen9 with Xeon E7-8890 v3.

pgbench -s 100, then VACUUM of pgbench_accounts after 10000 transactions:

with: alter system set vacuum_cost_delay to DEFAULT;
parallel_vacuum_workers |  time
        1               | 138.703,263 ms
        2               |  83.751,064 ms
        4               |  66.105,861 ms
        8               |  59.820,171 ms

with: alter system set vacuum_cost_delay to 1;
parallel_vacuum_workers |  time
        1               | 127.210,896 ms
        2               |  75.300,278 ms
        4               |  64.253,087 ms
        8               |  60.130,953 ms
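
To make the scaling easier to read, here is a small sketch that turns the timings above into speedups relative to 1 worker. It assumes the values use '.' as a thousands separator and ',' as the decimal mark (so "138.703,263" means 138703.263 ms); the helper names are only illustrative.

```python
# Timings copied from the two tables above, without the "ms" unit.
# Assumption: '.' groups thousands and ',' is the decimal mark.
RAW = {
    "vacuum_cost_delay = DEFAULT": {
        1: "138.703,263", 2: "83.751,064", 4: "66.105,861", 8: "59.820,171",
    },
    "vacuum_cost_delay = 1": {
        1: "127.210,896", 2: "75.300,278", 4: "64.253,087", 8: "60.130,953",
    },
}


def to_ms(text):
    """Convert a string such as '138.703,263' to milliseconds as a float."""
    return float(text.replace(".", "").replace(",", "."))


def speedups(timings):
    """Return {workers: speedup relative to the 1-worker run}."""
    base = to_ms(timings[1])
    return {workers: base / to_ms(t) for workers, t in timings.items()}


if __name__ == "__main__":
    for label, timings in RAW.items():
        print(label)
        for workers, factor in speedups(timings).items():
            print(f"  {workers} workers: {factor:.2f}x")
```

Under that reading of the numbers, 8 workers come out at roughly 2.3x and 2.1x over a single worker, so most of the gain is already reached at 4 workers.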


---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.ru
The Russian Postgres Company

2016-08-23 14:02 GMT+03:00 Masahiko Sawada <sawada.m...@gmail.com>:

> Hi all,
>
> I'd like to propose block level parallel VACUUM.
> This feature makes it possible for VACUUM to use multiple CPU cores.
>
> Vacuum Processing Logic
> ===================
>
> PostgreSQL VACUUM processing logic consists of 2 phases:
> 1. Collecting dead tuple locations on heap.
> 2. Reclaiming dead tuples from heap and indexes.
> Phases 1 and 2 are executed alternately: once the amount of dead
> tuple locations collected in phase 1 reaches maintenance_work_mem,
> phase 2 is executed.
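
The alternation between the two phases can be sketched roughly as below. This is a simplified model, not the PostgreSQL code: `max_dead_tuples` is a hypothetical stand-in for the number of tuple locations that fit in maintenance_work_mem, a block is just a list of "is dead" flags, and the limit is checked after each block.

```python
def vacuum(heap, max_dead_tuples):
    """Simulate the alternating phases on a toy heap.

    heap: list of blocks, each block a list of booleans (True = dead tuple).
    max_dead_tuples: hypothetical capacity of the dead-tuple array.
    Returns (number of phase-2 passes, number of tuples reclaimed).
    """
    dead = []          # collected (block number, offset) locations
    passes = 0
    reclaimed = 0

    def reclaim():
        # Phase 2: remove the collected tuples from indexes and heap
        # (modelled here as simply counting and forgetting them).
        nonlocal passes, reclaimed
        reclaimed += len(dead)
        dead.clear()
        passes += 1

    for blkno, block in enumerate(heap):
        # Phase 1: remember where the dead tuples of this block are.
        for offset, is_dead in enumerate(block):
            if is_dead:
                dead.append((blkno, offset))
        # Switch to phase 2 once the array is full, then resume phase 1.
        if len(dead) >= max_dead_tuples:
            reclaim()

    if dead:
        reclaim()      # final pass for whatever is left
    return passes, reclaimed
```

With a small limit the heap is scanned once but phase 2 runs several times; with a large limit a single phase-2 pass is enough.
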
>
> Basic Design
> ==========
>
> As a PoC, I implemented parallel vacuum so that each worker
> processes both phases 1 and 2 for a particular block range.
> Suppose we vacuum a 1000-block table with 4 workers: each worker
> processes 250 consecutive blocks in phase 1 and then reclaims dead
> tuples from heap and indexes (phase 2).
> To use the visibility map efficiently, each worker scans a particular
> block range of the relation and collects dead tuple locations.
> After each worker has finished its task, the leader process gathers the
> vacuum statistics and updates relfrozenxid if possible.
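
The block-range split described here can be illustrated with a few lines. This is only a sketch: the function name is made up, and how a remainder is distributed when the table size is not a multiple of the worker count is an assumption, since the description only covers the even case.

```python
def block_ranges(nblocks, nworkers):
    """Split blocks [0, nblocks) into nworkers consecutive ranges.

    Returns a list of half-open (start, end) pairs, one per worker.
    Assumption: leftover blocks go one each to the first workers.
    """
    base, extra = divmod(nblocks, nworkers)
    ranges = []
    start = 0
    for worker in range(nworkers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # The example from the text: 1000 blocks, 4 workers, 250 blocks each.
    for worker, (start, end) in enumerate(block_ranges(1000, 4)):
        print(f"worker {worker}: blocks {start}..{end - 1}")
```

Keeping each worker on consecutive blocks preserves sequential reads within its range.
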
>
> I also changed the buffer lock infrastructure so that multiple
> processes can wait for cleanup lock on a buffer.
> And the new GUC parameter vacuum_parallel_workers controls the number
> of vacuum workers.
>
> Performance(PoC)
> =========
>
> I ran parallel vacuum on a 13GB table (pgbench scale 1000) with several
> workers (on my poor virtual machine).
> The results are:
>
> 1. Vacuum whole table without index (disable page skipping)
>   1 worker   : 33 sec
>   2 workers : 27 sec
>   3 workers : 23 sec
>   4 workers : 22 sec
>
> 2. Vacuum table and index (after 10000 transactions executed)
>   1 worker   : 12 sec
>   2 workers : 49 sec
>   3 workers : 54 sec
>   4 workers : 53 sec
>
> As a result of my test, since multiple processes could frequently try to
> acquire the cleanup lock on the same index buffer, the execution time of
> parallel vacuum got worse.
> So far it seems to be effective only for table vacuum, and even there it
> did not improve as much as expected (maybe a disk bottleneck).
>
> Another Design
> ============
> ISTM that processing index vacuum with multiple processes is not a good
> idea in most cases, because many index items can be stored in a page and
> multiple vacuum workers could try to acquire the cleanup lock on the
> same index buffer.
> It would be better for multiple workers to each process a particular
> block range, and then for one worker per index to process index vacuum.
>
> There is still a lot of work to do, but I have attached the PoC patch.
> Feedback and suggestions are very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
