>> Therefore, rather than "improving" pg_usleep (and uglifying its API),
>> the right answer is to fix parallel vacuum leaders to not depend on
>> pg_usleep in the first place. A better idea might be to use
>> pg_sleep() or equivalent code.
>
> Yes, that is a good idea to explore and it will not require introducing
> an awkward new API. I will look into using something similar to
> pg_sleep.

Looking through the history of the sleep in vacuum_delay_point, commit
720de00af49 replaced WaitLatch with pg_usleep to allow for microsecond
sleep precision [1].

Thomas has proposed a WaitLatchUs implementation in [2], but I have not
yet tried it.

So I see two possible options for dealing with a parallel vacuum leader
being interrupted when a parallel vacuum worker sends it a message.

Option 1/ Something like my initial proposal, which is to create a
function similar to pg_usleep that is able to deal with interrupts
during a sleep. This could be a function scoped only to vacuum.c, so it
can only be used for vacuum delay purposes.

Option 2/ Explore the WaitLatchUs implementation by Thomas, which would
give us a latch-based sleep with microsecond precision.

It is worth mentioning that if we do end up using WaitLatch(Us) inside
vacuum_delay_point, it will need to set only WL_TIMEOUT and
WL_EXIT_ON_PM_DEATH, i.e.

    (void) WaitLatch(MyLatch, WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                     msec,
                     WAIT_EVENT_VACUUM_DELAY);

This way the sleep is not cut short by WL_LATCH_SET when a message is
sent by a parallel worker.

Ultimately, I think option 2 may be worth a closer look, as it is a
cleaner and safer approach that also detects postmaster death.

Thoughts?

[1] https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CA%2BhUKGKVbJE59JkwnUj5XMY%2B-rzcTFciV9vVC7i%3DLUfWPds8Xw%40mail.gmail.com
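
To make option 1 a bit more concrete, here is a rough standalone sketch of what "a sleep that deals with interrupts" could look like. It is not PostgreSQL code and not a patch: the name sleep_full_usec is hypothetical, and it assumes the desired behavior is simply to resume sleeping for the remaining time whenever a signal cuts the sleep short. It relies only on POSIX nanosleep(), which reports the unslept time when it fails with EINTR. A real version in vacuum.c would also have to decide where to service interrupts and how to notice postmaster death, which this sketch does not attempt.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper (illustration only, not an existing PostgreSQL
 * function): sleep for the full requested number of microseconds.
 * If a signal interrupts nanosleep(), continue with the time that
 * nanosleep() reported as still remaining.
 */
static void
sleep_full_usec(long usec)
{
    struct timespec delay;
    struct timespec remain;

    delay.tv_sec = usec / 1000000L;
    delay.tv_nsec = (usec % 1000000L) * 1000L;

    /* On EINTR, "remain" holds the unslept time; retry with it. */
    while (nanosleep(&delay, &remain) == -1 && errno == EINTR)
        delay = remain;
}
```

The retry loop is exactly the part that option 2 would make unnecessary, since a latch wait with only WL_TIMEOUT and WL_EXIT_ON_PM_DEATH is not woken by the worker's message in the first place.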