Hello all,
and welcome to the BugCast of today!
This is about shutting down. Shutting down the director process. Which
doesn't work. Or, let's be more precise: it works sometimes. Other times it
looks like this:
Stopping bareos_dir.
Waiting for PIDS: 2501
120 second watchdog timeout expired. Shutdown terminated.
Now, there are a few things to observe here:
- Sending a *second* SIGTERM will always terminate it immediately. (But the
  shutdown routine doesn't do that; it expects to be obeyed the first time.)
- If a console happens to be open, exiting that console after the first
  SIGTERM didn't work will also terminate it immediately. But the shutdown
  routine doesn't do that either.
Consequently the shutdown routine will time out, and will either fail to
shut down, or forcibly shut down after the timeout (by crashing all other
programs and databases).
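For illustration only, here is a minimal sketch of a stop routine that does
what the real one doesn't: it repeats the SIGTERM if the first one is not
obeyed. The names stop_with_retry, max_attempts and wait_ms are made up for
this example and exist nowhere in bareos or in the rc scripts. It also assumes
the caller is the parent of the process being stopped (it uses waitpid), which
is not how an rc script works, so take it as a sketch of the idea and not as a
drop-in fix.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Hypothetical helper: send SIGTERM to a child process up to max_attempts
 * times, polling for its exit for roughly wait_ms milliseconds after each
 * signal. Returns the number of SIGTERMs sent by the time the child was
 * reaped, or -1 on error or if the child is still running afterwards. */
static int stop_with_retry(pid_t pid, int max_attempts, int wait_ms)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (kill(pid, SIGTERM) != 0) {
            return -1;                      /* no such process, or no permission */
        }
        /* Poll in 10 ms steps instead of blocking forever in waitpid(). */
        for (int waited = 0; waited < wait_ms; waited += 10) {
            int status;
            pid_t r = waitpid(pid, &status, WNOHANG);
            if (r == pid) {
                return attempt;             /* child has exited and was reaped */
            }
            if (r < 0) {
                return -1;                  /* not our child, or other error */
            }
            struct timespec ts = { 0, 10 * 1000 * 1000 };
            nanosleep(&ts, NULL);
        }
    }
    return -1;                              /* still running after all attempts */
}
```

Called as stop_with_retry(pid, 5, 200), this would send at most five SIGTERMs,
200 ms apart, and report how many were needed.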
What is also obvious is that the signal handling is *BROKEN*: it gets hosed
somewhere between signals and events.
And the last time I looked into the code for the signal handling (I think
that was release 2.2.7 or such), this looked so horrible that I never went
there again. It didn't look so very wrong, but it all looked like it was
deliberately coded for LUMMUX.
Also, I just happened to find a few abandoned fixes in the attic
- but I'm not sure which kind of misbehaviour they actually fix, as there
are so many of them - anyway, just have fun with these, or give them to
your children to play with...
+ *** src/lib/signal.c.orig Thu Aug 5 16:29:51 2010
+ --- src/lib/signal.c Sun Oct 3 04:16:25 2010
+ ***************
+ *** 357,369 ****
+ /* Now setup signal handlers */
+ sighandle.sa_flags = 0;
+ sighandle.sa_handler = signal_handler;
+ ! sigfillset(&sighandle.sa_mask);
+ sigignore.sa_flags = 0;
+ sigignore.sa_handler = SIG_IGN;
+ sigfillset(&sigignore.sa_mask);
+ sigdefault.sa_flags = 0;
+ sigdefault.sa_handler = SIG_DFL;
+ ! sigfillset(&sigdefault.sa_mask);
+
+
+ sigaction(SIGPIPE, &sigignore, NULL);
+ --- 357,369 ----
+ /* Now setup signal handlers */
+ sighandle.sa_flags = 0;
+ sighandle.sa_handler = signal_handler;
+ ! sigemptyset(&sighandle.sa_mask);
+ sigignore.sa_flags = 0;
+ sigignore.sa_handler = SIG_IGN;
+ sigfillset(&sigignore.sa_mask);
+ sigdefault.sa_flags = 0;
+ sigdefault.sa_handler = SIG_DFL;
+ ! sigemptyset(&sigdefault.sa_mask);
+
+
+ sigaction(SIGPIPE, &sigignore, NULL);
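What that first patch changes, as far as plain sigaction() semantics go:
sa_mask lists the signals that stay blocked while the handler runs. With
sigfillset() a second signal arriving during the handler is held back until
the handler returns; with sigemptyset() it can interrupt the handler. (For
SIG_DFL no handler runs at all, so the mask on sigdefault should make no
difference.) Whether this is what bites the director is not established here.
The small program below only demonstrates the difference, and every name in
it is made up for the demonstration and is not bareos code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t in_outer = 0;  /* 1 while the SIGUSR1 handler runs */
static volatile sig_atomic_t nested = 0;    /* 1 if SIGUSR2 arrived during it   */

/* SIGUSR2 handler: records whether it ran inside the SIGUSR1 handler. */
static void inner_handler(int sig)
{
    (void)sig;
    if (in_outer) {
        nested = 1;
    }
}

/* SIGUSR1 handler: raises SIGUSR2 at itself while it is still running. */
static void outer_handler(int sig)
{
    (void)sig;
    in_outer = 1;
    raise(SIGUSR2);   /* delivered right here only if SIGUSR2 is not masked */
    in_outer = 0;
}

/* Hypothetical demo function. Installs the SIGUSR1 handler with a full
 * (fill_mask != 0) or empty (fill_mask == 0) sa_mask and triggers it.
 * Returns 1 if SIGUSR2 interrupted the running SIGUSR1 handler, 0 if it
 * was held back until that handler had returned, -1 on error. */
static int second_signal_interrupts_handler(int fill_mask)
{
    struct sigaction outer, inner;
    sigset_t both;

    /* Make sure neither signal is blocked in the caller. */
    sigemptyset(&both);
    sigaddset(&both, SIGUSR1);
    sigaddset(&both, SIGUSR2);
    if (sigprocmask(SIG_UNBLOCK, &both, NULL) != 0) {
        return -1;
    }

    memset(&outer, 0, sizeof outer);
    outer.sa_handler = outer_handler;
    if (fill_mask) {
        sigfillset(&outer.sa_mask);     /* behaviour before the patch */
    } else {
        sigemptyset(&outer.sa_mask);    /* behaviour after the patch  */
    }

    memset(&inner, 0, sizeof inner);
    inner.sa_handler = inner_handler;
    sigemptyset(&inner.sa_mask);

    if (sigaction(SIGUSR1, &outer, NULL) != 0 ||
        sigaction(SIGUSR2, &inner, NULL) != 0) {
        return -1;
    }

    nested = 0;
    raise(SIGUSR1);   /* returns only after outer_handler has returned */
    return nested;
}
```

With an empty mask the second signal gets through while the first handler is
still busy; with a full mask it has to wait.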
Here is another one --
+ *** src/lib/signal.c.orig Sun Sep 22 23:02:39 2013
+ --- src/lib/signal.c Sun Sep 22 23:02:01 2013
+ ***************
+ *** 140,146 ****
+ }
+ Dmsg2(900, "sig=%d %s\n", sig, sig_names[sig]);
+ /* Ignore certain signals -- SIGUSR2 used to interrupt threads */
+ ! if (sig == SIGCHLD || sig == SIGUSR2) {
+ return;
+ }
+ already_dead++;
+ --- 140,146 ----
+ }
+ Dmsg2(900, "sig=%d %s\n", sig, sig_names[sig]);
+ /* Ignore certain signals -- SIGUSR2 used to interrupt threads */
+ ! if (sig == SIGCHLD || sig == SIGUSR2 || sig == 0) {
+ return;
+ }
+ already_dead++;
So with this I say, Stay tuned for the next BugCast!
----------
Footnotes:
* Bug numbers have been randomized for security reasons.