Robert Haas <[email protected]> wrote (in the other thread):
> Any idea whether it's something newly-introduced or of long standing?

It's the latter. I searched the buildfarm database for failure logs
including the string "server does not shut down" within the last three
years, and got all of the hits attached. Not all of these look like
the failure pattern Michael pointed to, but enough of them do to say
that the problem has existed since at least mid-2017. To be concrete,
we have quite a sample of cases where a standby server has received a
"fast shutdown" signal and acknowledged that in its log, but it never
gets to the expected "shutting down" message, meaning it never starts
the shutdown checkpoint, let alone finishes it. The oldest case that
clearly looks like that is

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2017-06-02%2018%3A54%3A29

A significant majority of the recent cases look just like the piculet
failure Michael pointed to; that is, we fail to shut down the "london"
server while it's acting as standby in the recovery/t/009_twophase.pl
test. But there are very similar failures in other tests.

I also notice that the population of machines showing the problem seems
heavily skewed towards, um, weird cases. For instance, in the set
that have shown this type of failure since January, we have

dragonet: uses JIT
francolin: --disable-spinlocks
gull: armv7
mereswine: armv7
piculet: --disable-atomics
sidewinder: amd64, but running netbsd 7 (and this was 9.6, note)
spurfowl: fairly generic amd64

This leads me to suspect that the problem is (a) some very low-level issue
in spinlocks or latches or the like, or (b) a timing problem that just
doesn't show up on generic Intel-oid platforms. The timing theory is
maybe a bit stronger given that one test case shows this more often than
others. I've not got any clear ideas beyond that.

Anyway, this is *not* new in v12.

regards, tom lane

sysname | snapshot | stage | l
---------------+---------------------+---------------------+-------------------------------------
jacana | 2016-07-23 06:15:32 | pg_upgradeCheck | pg_ctl: server does not shut down\r
pademelon | 2016-08-14 03:49:36 | ECPG-Check | pg_ctl: server does not shut down
mereswine | 2017-02-13 14:24:37 | Check | pg_ctl: server does not shut down
arapaima | 2017-03-04 20:06:10 | StopDb-C:4 | pg_ctl: server does not shut down
nightjar | 2017-06-02 18:54:29 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-02 19:54:11 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-03 15:54:05 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-03 17:54:18 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-03 21:54:09 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-04 00:54:09 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-04 16:34:32 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-04 17:54:16 | SubscriptionCheck | pg_ctl: server does not shut down
hornet | 2017-06-05 16:22:09 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-05 16:54:09 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-05 20:26:24 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-06 03:30:02 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-06 15:54:18 | SubscriptionCheck | pg_ctl: server does not shut down
hornet | 2017-06-06 17:10:02 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-06 18:54:27 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 00:54:07 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 02:54:06 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 15:12:15 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 17:54:07 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 18:54:06 | SubscriptionCheck | pg_ctl: server does not shut down
sungazer | 2017-06-07 19:46:53 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-07 21:03:43 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-08 01:54:07 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-08 15:54:10 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-08 16:57:03 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-08 17:54:09 | SubscriptionCheck | pg_ctl: server does not shut down
nightjar | 2017-06-08 19:54:06 | SubscriptionCheck | pg_ctl: server does not shut down
nudibranch | 2017-07-23 08:31:12 | pg_upgradeCheck | pg_ctl: server does not shut down
nudibranch | 2017-08-06 08:31:25 | pg_upgradeCheck | pg_ctl: server does not shut down
spurfowl | 2017-08-21 18:43:08 | BinInstallCheck | pg_ctl: server does not shut down
arapaima | 2017-10-10 22:12:01 | pg_upgradeCheck | pg_ctl: server does not shut down
calliphoridae | 2017-10-11 20:32:24 | recoveryCheck | pg_ctl: server does not shut down
arapaima | 2017-10-12 16:36:18 | pg_upgradeCheck | pg_ctl: server does not shut down
arapaima | 2017-10-12 16:36:18 | pg_upgradeCheck | pg_ctl: server does not shut down
arapaima | 2017-10-13 10:26:13 | test-decoding-check | pg_ctl: server does not shut down
arapaima | 2017-10-14 07:40:05 | pg_upgradeCheck | pg_ctl: server does not shut down
arapaima | 2017-10-14 23:22:28 | Check | pg_ctl: server does not shut down
arapaima | 2017-10-15 10:32:28 | test-decoding-check | pg_ctl: server does not shut down
arapaima | 2017-10-16 08:40:39 | Check | pg_ctl: server does not shut down
arapaima | 2017-10-17 20:32:25 | pg_upgradeCheck | pg_ctl: server does not shut down
arapaima | 2017-10-18 13:11:05 | pg_upgradeCheck | pg_ctl: server does not shut down
arapaima | 2017-10-18 23:47:09 | Check | pg_ctl: server does not shut down
arapaima | 2017-10-19 09:41:15 | Check | pg_ctl: server does not shut down
spurfowl | 2017-11-15 13:29:06 | BinInstallCheck | pg_ctl: server does not shut down
baiji | 2017-11-27 23:00:01 | Check | pg_ctl: server does not shut down\r
skink | 2017-11-30 08:47:34 | recoveryCheck | pg_ctl: server does not shut down
skink | 2017-12-02 14:30:01 | recoveryCheck | pg_ctl: server does not shut down
gharial | 2017-12-15 09:12:45 | Check | pg_ctl: server does not shut down
nudibranch | 2017-12-17 09:33:02 | pg_upgradeCheck | pg_ctl: server does not shut down
nudibranch | 2018-01-20 10:08:33 | Check | pg_ctl: server does not shut down
nudibranch | 2018-01-27 09:44:39 | Check | pg_ctl: server does not shut down
nudibranch | 2018-02-01 09:49:26 | test-decoding-check | pg_ctl: server does not shut down
mereswine | 2018-03-01 11:05:39 | Check | pg_ctl: server does not shut down
topminnow | 2018-03-10 12:45:57 | pg_rewindCheck | pg_ctl: server does not shut down
mereswine | 2018-03-14 08:10:08 | Check | pg_ctl: server does not shut down
gull | 2018-03-30 02:10:15 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-08 11:37:13 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-08 21:41:39 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-10 23:37:22 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-13 16:06:17 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-13 23:59:24 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-04-14 20:09:25 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-17 11:59:57 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-19 07:31:31 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-20 08:02:00 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-21 04:21:33 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-21 23:06:17 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-22 17:03:22 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-22 17:03:22 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-23 13:08:51 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-24 07:52:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-04-25 01:32:16 | Check | pg_ctl: server does not shut down
gull | 2018-04-25 02:10:12 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-26 17:42:38 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-28 05:58:59 | Check | pg_ctl: server does not shut down
aholehole | 2018-04-30 13:11:22 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-03 02:36:32 | StopDb-C:4 | pg_ctl: server does not shut down
aholehole | 2018-05-04 11:07:37 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-04 18:02:59 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-06 02:52:23 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-07 19:16:33 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-08 13:03:52 | StopDb-C:4 | pg_ctl: server does not shut down
aholehole | 2018-05-09 04:36:06 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-09 10:41:59 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-09 16:41:32 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-10 14:06:04 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-11 09:00:33 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-12 04:07:50 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-15 22:11:06 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-16 10:14:21 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-16 23:01:57 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-17 07:06:03 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-17 14:22:36 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-18 00:15:19 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-18 11:23:03 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-18 22:36:45 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-19 11:54:21 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-19 19:06:08 | StopDb-C:4 | pg_ctl: server does not shut down
aholehole | 2018-05-20 09:32:53 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-20 16:47:17 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-21 03:36:24 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-21 14:38:21 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-22 03:02:05 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-22 11:06:05 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-23 12:13:37 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-23 21:05:13 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-24 08:29:25 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-05-25 14:08:16 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-25 19:11:35 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-27 16:56:10 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-05-29 16:09:14 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-05-30 15:07:44 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-31 13:10:53 | Check | pg_ctl: server does not shut down
aholehole | 2018-05-31 21:12:12 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-05-31 21:12:12 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-04 17:07:20 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-05 00:12:13 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-06 17:06:03 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-06 21:49:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-06 21:49:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-07 06:53:06 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-06-07 19:36:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-08 15:47:15 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-09 07:11:38 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-06-09 21:54:40 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-10 08:12:11 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-12 00:12:15 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-13 23:29:02 | StopDb-C:4 | pg_ctl: server does not shut down
aholehole | 2018-06-15 04:38:36 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-15 04:38:36 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-15 13:54:08 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-15 20:19:02 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-16 08:09:36 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-17 04:38:19 | StopDb-C:1 | pg_ctl: server does not shut down
aholehole | 2018-06-17 19:47:19 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-18 04:06:04 | StopDb-C:4 | pg_ctl: server does not shut down
aholehole | 2018-06-18 16:23:17 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-19 18:45:19 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-20 09:15:42 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-20 09:15:42 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-21 04:41:45 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-21 13:06:30 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-21 23:50:37 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-22 13:11:55 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-23 02:09:06 | Check | pg_ctl: server does not shut down
aholehole | 2018-06-23 13:12:49 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-25 01:06:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-25 01:06:02 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-25 08:25:08 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-25 16:10:48 | test-decoding-check | pg_ctl: server does not shut down
aholehole | 2018-06-26 03:34:45 | pg_upgradeCheck | pg_ctl: server does not shut down
aholehole | 2018-06-26 12:47:21 | StopDb-C:1 | pg_ctl: server does not shut down
gull | 2018-08-24 03:27:16 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-08-24 08:10:05 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-08-29 09:26:52 | pg_rewindCheck | pg_ctl: server does not shut down
gull | 2018-08-30 02:10:05 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-09-06 02:10:16 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-09-14 02:10:05 | recoveryCheck | pg_ctl: server does not shut down
sungazer | 2018-09-18 19:44:53 | subscriptionCheck | pg_ctl: server does not shut down
gull | 2018-09-26 02:10:06 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-10-01 08:10:04 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-03 02:10:05 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-07 03:45:39 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-07 05:07:06 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-10-07 08:10:05 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-13 02:10:06 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-13 05:21:13 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-10-29 03:25:28 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-11-15 11:55:56 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-11-16 09:10:13 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-11-30 10:34:17 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2018-11-30 11:44:54 | recoveryCheck | pg_ctl: server does not shut down
gull | 2018-12-08 04:29:34 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2019-01-30 05:18:59 | recoveryCheck | pg_ctl: server does not shut down
francolin | 2019-02-07 07:55:41 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2019-02-20 03:10:06 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2019-02-22 03:10:05 | recoveryCheck | pg_ctl: server does not shut down
spurfowl | 2019-02-25 11:02:04 | pg_rewindCheck | pg_ctl: server does not shut down
sidewinder | 2019-02-25 16:45:00 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2019-03-06 03:10:05 | recoveryCheck | pg_ctl: server does not shut down
piculet | 2019-03-23 22:28:59 | recoveryCheck | pg_ctl: server does not shut down
gull | 2019-04-11 06:25:20 | recoveryCheck | pg_ctl: server does not shut down
mereswine | 2019-04-12 06:31:30 | recoveryCheck | pg_ctl: server does not shut down
dragonet | 2019-04-16 06:14:01 | recoveryCheck | pg_ctl: server does not shut down
(188 rows)
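
For anyone wanting to redo the per-animal breakdown from a listing like
the one above, here is a minimal Python sketch. It assumes the psql-style
text is available as a string. The names parse_rows and tally_since are
made up for illustration and are not part of any buildfarm tooling, and
SAMPLE is just four rows copied from the listing. A line counts as a row
only if it has four "|"-separated fields with a parseable timestamp in the
second one, so the header, the separator, the row-count footer, and any
wrapped continuation lines are skipped.

```python
from collections import Counter
from datetime import datetime


def parse_rows(text):
    """Return (sysname, snapshot, stage) tuples from psql-style output."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 4:
            continue  # separator, footer, or wrapped tail of a row
        try:
            snapshot = datetime.strptime(parts[1], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # column-header line
        rows.append((parts[0], snapshot, parts[2]))
    return rows


def tally_since(rows, cutoff):
    """Count hits per animal for snapshots at or after the cutoff."""
    return Counter(name for name, snapshot, _ in rows if snapshot >= cutoff)


# Illustrative sample: four rows copied from the listing.
SAMPLE = """\
sysname | snapshot | stage | l
---------------+---------------------+---------------------+----
gull | 2018-12-08 04:29:34 | recoveryCheck | pg_ctl: server
does not shut down
mereswine | 2019-01-30 05:18:59 | recoveryCheck | pg_ctl: server
does not shut down
francolin | 2019-02-07 07:55:41 | recoveryCheck | pg_ctl: server
does not shut down
mereswine | 2019-02-20 03:10:06 | recoveryCheck | pg_ctl: server
does not shut down
(4 rows)
"""

rows = parse_rows(SAMPLE)
counts = tally_since(rows, datetime(2019, 1, 1))
for name, hits in counts.most_common():
    print(name, hits)
```

Run over the full listing with the same cutoff, this should report the
seven animals named in the message, though a hit on the search string
does not by itself show that a failure matches the standby-shutdown
pattern; that still requires reading the individual logs.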