On Mon, Feb 15, 2021 at 12:03:42PM +1000, Jonathan Matthew wrote: > It's fairly easy to accidentally configure relayd to try to run check scripts > faster than they finish, for example if you have a check interval of one > second and the check script makes a tcp connection to a host that doesn't > exist any more. > > In this situation, the hce process will keep writing messages to its imsg > buffer to the parent process asking it to run checks, which causes its memory > usage to grow without bounds. If the check script starts working again > (or if you change it to just 'exit 0') the parent works its way through the > backlog and memory usage goes back to normal, but ideally relayd would avoid > doing this to itself. > > If we don't clear the F_CHECK_SENT and F_CHECK_DONE flags in > hce_launch_checks(), check_script() can use them to figure out if the > last check request it sent for the host has finished yet, so it can avoid > building up a backlog of work for the parent. The ICMP and script check > implementations clear these flags as they start checks, and the TCP check > code doesn't use them at all, so this shouldn't affect anything else. > > ok? > ok giovanni@ Thanks Giovanni
> > Index: check_script.c > =================================================================== > RCS file: /cvs/src/usr.sbin/relayd/check_script.c,v > retrieving revision 1.21 > diff -u -p -u -p -r1.21 check_script.c > --- check_script.c 28 May 2017 10:39:15 -0000 1.21 > +++ check_script.c 15 Feb 2021 01:28:54 -0000 > @@ -38,6 +38,9 @@ check_script(struct relayd *env, struct > struct ctl_script scr; > struct table *table; > > + if ((host->flags & (F_CHECK_SENT|F_CHECK_DONE)) == F_CHECK_SENT) > + return; > + > if ((table = table_find(env, host->conf.tableid)) == NULL) > fatalx("%s: invalid table id", __func__); > > @@ -52,7 +55,9 @@ check_script(struct relayd *env, struct > fatalx("invalid script path"); > memcpy(&scr.timeout, &table->conf.timeout, sizeof(scr.timeout)); > > - proc_compose(env->sc_ps, PROC_PARENT, IMSG_SCRIPT, &scr, sizeof(scr)); > + if (proc_compose(env->sc_ps, PROC_PARENT, IMSG_SCRIPT, &scr, > + sizeof(scr)) == 0) > + host->flags |= F_CHECK_SENT; > } > > void > Index: hce.c > =================================================================== > RCS file: /cvs/src/usr.sbin/relayd/hce.c,v > retrieving revision 1.79 > diff -u -p -u -p -r1.79 hce.c > --- hce.c 6 Aug 2018 17:31:31 -0000 1.79 > +++ hce.c 15 Feb 2021 01:28:54 -0000 > @@ -139,7 +139,6 @@ hce_launch_checks(int fd, short event, v > TAILQ_FOREACH(host, &table->hosts, entry) { > if ((host->flags & F_CHECK_DONE) == 0) > host->he = HCE_INTERVAL_TIMEOUT; > - host->flags &= ~(F_CHECK_SENT|F_CHECK_DONE); > if (event_initialized(&host->cte.ev)) { > event_del(&host->cte.ev); > close(host->cte.s); >
signature.asc
Description: PGP signature