>Synopsis: relayd config reload race condition crash
>Category: user
>Environment:
System : OpenBSD 6.7
Details : OpenBSD 6.7-stable (PU.MP) #105: Mon Jul 20 09:18:44 EDT
2020
[email protected]:/home/brian/sys/arch/amd64/compile/PU.MP
Architecture: OpenBSD.amd64
Machine : amd64
>Description:
relay_statistics in relay.c runs on a timer that isn't synchronized with
config reload. pfe_dispatch_relay in pfe.c fails periodically with
"invalid relay id" and exits prematurely.
>How-To-Repeat:
Run relayctl reload periodically until crash occurs.
>Fix:
Below is a patch that fixes the issue in at least one way. I'm also
seeing other spurious issues with connections so there may be
other issues during config reload.
Index: pfe.c
===================================================================
RCS file: /home/brian/cvs/src/usr.sbin/relayd/pfe.c,v
retrieving revision 1.89
diff -u -r1.89 pfe.c
--- pfe.c 28 May 2017 10:39:15 -0000 1.89
+++ pfe.c 26 Jul 2020 15:28:22 -0000
@@ -270,10 +270,14 @@
case IMSG_STATISTICS:
IMSG_SIZE_CHECK(imsg, &crs);
bcopy(imsg->data, &crs, sizeof(crs));
- if (crs.proc > env->sc_conf.prefork_relay)
- fatalx("%s: invalid relay proc", __func__);
- if ((rlay = relay_find(env, crs.id)) == NULL)
- fatalx("%s: invalid relay id", __func__);
+ if (crs.proc > env->sc_conf.prefork_relay) {
+ log_warn("%s: invalid relay proc", __func__);
+ break;
+ }
+ if ((rlay = relay_find(env, crs.id)) == NULL) {
+ log_warn("%s: invalid relay id", __func__);
+ break;
+ }
bcopy(&crs, &rlay->rl_stats[crs.proc], sizeof(crs));
rlay->rl_stats[crs.proc].interval =
env->sc_conf.statinterval.tv_sec;