On Wed, Oct 5, 2016 at 7:28 AM, Masahiko Sawada <sawada.m...@gmail.com>
wrote:

> Hi all,
>
> I found some strange behaviour in the autovacuum launcher process
> during XID anti-wraparound vacuum.
>
> Suppose that a database (say test_db) whose datfrozenxid age is about
> to reach autovacuum_freeze_max_age has two tables, T1 and T2.
> T1 is very large and frequently updated, so vacuuming it takes a long
> time.
> T2 is a static, already-frozen table, so vacuum can skip the whole
> table.
> Anti-wraparound vacuum has already been run on the other databases.
>
> Once the age of datfrozenxid of test_db exceeds
> autovacuum_freeze_max_age, the autovacuum launcher launches a worker
> process to do anti-wraparound vacuum on test_db.
> A worker process assigned to test_db begins to vacuum T1, which takes a
> long time.
> Meanwhile, another worker process is assigned to test_db, finishes
> vacuuming T2, and exits.
>
> After a while, the autovacuum launcher launches a new worker, which is
> again assigned to test_db.
> But that worker exits quickly because there is no table that needs
> vacuuming (T1 is being vacuumed by another worker process).
> When a new worker process starts, it sends a SIGUSR2 signal to the
> launcher process to wake it up.
> Although the launcher process calls WaitLatch() after launching the new
> worker, it is woken up and soon launches another new worker process.
>

See also this thread, which was never resolved:

https://www.postgresql.org/message-id/flat/CAMkU%3D1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta%3DYPyFPQ%40mail.gmail.com#CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=ypy...@mail.gmail.com




> As a result, the launcher process launches new worker processes at an
> extremely high frequency regardless of autovacuum_naptime, which
> increases CPU usage.
>
> Why does an autovacuum worker need to wake up the launcher process after
> it has started?
>
> autovacuum.c:L1604
>         /* wake up the launcher */
>         if (AutoVacuumShmem->av_launcherpid != 0)
>             kill(AutoVacuumShmem->av_launcherpid, SIGUSR2);
>


I think that is so the launcher can launch multiple workers in quick
succession if it has fallen behind schedule. It can't launch them in a
tight loop, because its signals to the postmaster would get merged into
one signal, so it has to wait for one worker to get mostly set up before
launching the next.

But it doesn't make any real difference to your scenario, as the
short-lived worker will wake the launcher up a few microseconds later
anyway, when it realizes it has no work to do and so exits.

Cheers,

Jeff
