On Tue, Oct 10, 2017 at 10:52 AM, Jeremy Kerr <j...@ozlabs.org> wrote:
> Hi all,
>
> I've been debugging an issue where we can't reboot or poweroff a machine
> in the early stages of busybox init. Using the poweroff case as an
> example:
>
>  - kernel starts /sbin/init
>
>  - kernel receives a poweroff event, so calls __orderly_poweroff.
>    Effectively, these will just call out to the /sbin/poweroff usermode
>    helper.
>
>  - /sbin/poweroff just does a:
>
>      kill(1, SIGUSR2);
>
>  - However, /sbin/init has not yet installed a signal handler for
>    SIGUSR2. Because we're PID 1, this means the signal is ignored, and
>    so the command to poweroff the machine is dropped.
>
>  - init keeps booting rather than powering off.
>
> In our particular case, the "poweroff event" is an IPMI soft shutdown
> message. However, the same would apply for any other path that involves
> orderly_poweroff or orderly_reboot.
>
> Even though the signal handlers are installed fairly early in init, we
> can still hit the race between this and the SIGUSR2 being sent fairly
> reliably.
>
> I see a couple of options for resolving this:
>
>  - installing the signal handlers even earlier in init_main(). However,
>    this will only reduce the window for lost events, rather than
>    eliminating it; or

Sure, this should be done. How about this:

--- a/init/init.c
+++ b/init/init.c
@@ -1064,6 +1064,12 @@ int init_main(int argc UNUSED_PARAM, char **argv)
 #endif
 
 	if (!DEBUG_INIT) {
+		/* Some users send poweroff signals to init VERY early.
+		 * To handle this, mask signals early,
+		 * and unmask them only after signal handlers are installed.
+		 */
+		sigprocmask_allsigs(SIG_BLOCK);
+
 		/* Expect to be invoked as init with PID=1 or be invoked as linuxrc */
 		if (getpid() != 1
 		 && (!ENABLE_LINUXRC || applet_name[0] != 'l') /* not linuxrc? */
@@ -1204,6 +1187,8 @@ int init_main(int argc UNUSED_PARAM, char **argv)
 			+ (1 << SIGHUP) /* reread /etc/inittab */
 #endif
 			, record_signo);
+
+		sigprocmask_allsigs(SIG_UNBLOCK);
 	}
 
 	/* Now run everything that needs to be run */

This also covers the code which opens and parses /etc/inittab, which
can be slow (if storage is slow) and can make the race realistic in
the real world.

Can you test whether this change makes the race go away in your case?

>  - using a synchronous channel to send the shutdown/reboot message
>    between the poweroff/reboot helpers, rather than an asynchronous
>    signal. Say, have init listening on a socket, allowing the poweroff
>    binary to wait and/or retry.
>
> However, before I go down the wrong path here: does anyone have other
> ideas that might help eliminating dropped poweroff/reboot events?

The test that processes are being reaped is a good idea.
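
For anyone who wants to check the block/install/unblock ordering outside of
init, here is a minimal standalone sketch in plain POSIX C. It is not busybox
code: early_signal_survives() is a hypothetical name, the handler only
borrows the record_signo name from the patch, and the sigfillset() +
sigprocmask() pair is my assumption of roughly what sigprocmask_allsigs()
amounts to. It runs as an ordinary process, so it shows that a signal sent
while blocked stays pending until a handler exists; it does not reproduce
the PID 1 "ignored" behaviour itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler; sig_atomic_t is safe to store from a handler. */
static volatile sig_atomic_t got_signo;

/* Illustrative handler: just remember which signal arrived. */
static void record_signo(int signo)
{
	got_signo = signo;
}

/*
 * Returns 1 if a SIGUSR2 sent before the handler was installed is still
 * delivered once signals are unblocked, 0 if it was lost, -1 on error.
 */
int early_signal_survives(void)
{
	sigset_t all;
	struct sigaction sa;

	got_signo = 0;

	/* 1. Block every blockable signal as the very first step. */
	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, NULL) != 0)
		return -1;

	/* 2. Simulate the early "poweroff": the signal is sent while no
	 *    handler exists yet. Being blocked, it stays pending. */
	if (raise(SIGUSR2) != 0)
		return -1;
	if (got_signo != 0)
		return -1; /* must not have been delivered yet */

	/* 3. (Slow startup work would happen here.) Install the handler. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = record_signo;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR2, &sa, NULL) != 0)
		return -1;

	/* 4. Unblock: the pending SIGUSR2 is delivered to the handler
	 *    before sigprocmask() returns. */
	if (sigprocmask(SIG_UNBLOCK, &all, NULL) != 0)
		return -1;

	return got_signo == SIGUSR2;
}
```

Calling early_signal_survives() from a small main() should return 1 if the
pending signal reached the handler after the unblock. Swapping steps 1 and 2
(sending before blocking) in a normal process would simply kill it, since the
default action for SIGUSR2 is to terminate, which is the non-PID-1 analogue
of the dropped event described above.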