'Twas brillig, and Colin Guthrie at 13/01/14 15:33 did gyre and gimble:
> 'Twas brillig, and Colin Guthrie at 13/01/14 15:08 did gyre and gimble:
>> 'Twas brillig, and Colin Guthrie at 13/01/14 11:30 did gyre and gimble:
>>> 'Twas brillig, and Colin Guthrie at 13/01/14 11:16 did gyre and gimble:
>>>> 2. This is the much odder part of the problem I'm seeing. The call to
>>>> daemon_reload() at the end of the enable_unit() seems to trigger some
>>>> kind of broken daemon reload that puts things into a bad state,
>>>> including a stale /run/nologin file.
>>>>
>>>> I'm not sure WHY this does this, but it's very reliably reproducible. I
>>>> have a native sysvinit script called numlock. All I need to do to
>>>> trigger the bad state is "systemctl disable numlock". After the call,
>>>> the systemd daemon is reloaded and it goes into this bad state
>>>> completely with /run/nologin file.
>>>>
>>>> If I comment out call or use --no-reload, then all is well. If I call
>>>> "systemctl daemon-reload" on it's own, all seems well. It just seems to
>>>> be this reload call specifically at the end of enable_unit() that
>>>> triggers the bad state.
>>>>
>>>>
>>>>
>>>> I'm going to try reverting some of the patches I have applied to see
>>>> where I get with things, as I see Zbigniew backed a few out of fedora
>>>> due to freeze rules, but I did also see some threads from Zbigniew about
>>>> the whole /run/nologin, so I suspect he may be interested in this.
>>>
>>> I reverted the same patches that were reverted in Fedora so our builds
>>> should be quite similar.
>>>
>>> I really hope fedora has this same issue otherwise my debugging just got
>>> more confusing.
>>>
>>> Zbigniew can you reproduce this on F20?
>>
>> It seems to be specifically related to chkconfig. If I shell out instead
>> to something different (e.g. "whoami") all runs fine.
>>
>> I'm wondering if it's something related to semi-systemd stuff supported
>> in our chkconfig... Perhaps our patches are out of date compared to
>> fedora...
>>
>> Still hunting :)
> 
> OK, so it seems that chkconfig will these days notify systemd to reload
> itself.
> 
> static void reloadSystemd(void) {
>     if (systemdActive())
>         system("systemctl daemon-reload > /dev/null 2>&1");
> }
> 
> For whatever reason, doing this in the forked off process AND in systemd
> itself leads to some kind of race.
> 
> Perhaps this happens if two reload operations come in in very quick
> succession, or perhaps the use of system() (and it's subsequent fork) in
> chkconfig just somehow messes up our signal handling in systemctl?
> 
> Either way, commenting this out in systemctl avoids the problem.
> 
> I would suggest three possible fixes:
> 
> 
> 1. Find out why this is racey and fix it.
> 2. Add an option to chkconfig to disable the reload.
> 3. Just drop the reload completely from chkconfig.
> 
> I would suspect that the option route is the best way forward but
> chkconfig will bail out of unsupported options so we should either use
> an ENV var or make sure systemd and chkconfig are updated in lockstep.
> 
> Thoughts?


OK, I've added both an env var and a new option to chkconfig and then
call chkconfig with the appropriate arguments from systemctl.

Attached are my two patches (keep in mind that the earlier patch to
avoid bogus warnings about [Install] sections is still valid too.

Seems to solve the issue for me (at least in my test case).

Cheers!

Col


-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/
>From 0db8cef7a80ea001f3798c4384ffcbe287eb405e Mon Sep 17 00:00:00 2001
From: Colin Guthrie <co...@mageia.org>
Date: Mon, 13 Jan 2014 19:14:40 +0000
Subject: [PATCH 514/514] systemctl: Ensure the --no-reload and --no-redirect
 options are passed to chkconfig

We want to ensure that chkconfig won't create any circular loops incase
it mistakenly things we're in control. Thus the --no-redirect argument
is a good safety net.

The --no-reload prevents a very strange case where the the shelled
out chkconfig will trigger a systemd daemon-reload (by itself shelling
out to systemctl) and then we also reload ourselves very shortly after
and this causes systemd to get into a very strange state whereby it
basically crashes (I suspect a serialisation race of some kind) and
as one of the symptoms it writes out a /run/nologin file which is
what most people notice.

A patch has been added to chkconfig to add the --no-reload behaviour.
---
 src/systemctl/systemctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/systemctl/systemctl.c b/src/systemctl/systemctl.c
index 9ff4350..a459a6c 100644
--- a/src/systemctl/systemctl.c
+++ b/src/systemctl/systemctl.c
@@ -4248,8 +4248,8 @@ static int enable_sysv_units(const char *verb, char **args) {
                 const char *name;
                 _cleanup_free_ char *p = NULL, *q = NULL;
                 bool found_native = false, found_sysv;
-                unsigned c = 1;
-                const char *argv[6] = { "/sbin/chkconfig", NULL, NULL, NULL, NULL };
+                unsigned c = 3;
+                const char *argv[8] = { "/sbin/chkconfig", "--no-reload", "--no-redirect", NULL, NULL, NULL, NULL };
                 char **k, *l;
                 int j;
                 pid_t pid;
-- 
1.8.4.5

--- chkconfig-1.3.61/chkconfig.c.no-systemd-reload	2014-01-13 18:57:24.583940957 +0000
+++ chkconfig-1.3.61/chkconfig.c	2014-01-13 19:05:53.354672079 +0000
@@ -559,7 +559,7 @@
     return 0;
 }
 
-int setService(char * name, int type, int where, int state) {
+int setService(char * name, int type, int where, int state, int noReloadItem) {
     int i, rc;
     int what;
     struct service s;
@@ -613,7 +613,9 @@
 		    rc |= doSetService(s, i, what);
 	    }
 
-            reloadSystemd();
+	    if (!noReloadItem) {
+		    reloadSystemd();
+	    }
 
             return rc;
     } else if (s.type == TYPE_XINETD) {
@@ -650,7 +652,7 @@
 }
 
 int main(int argc, const char ** argv) {
-    int listItem = 0, addItem = 0, delItem = 0, overrideItem = 0, noRedirectItem = 0;
+    int listItem = 0, addItem = 0, delItem = 0, overrideItem = 0, noRedirectItem = 0, noReloadItem = 0;
     int type = TYPE_ANY;
     int rc, i, x;
     char * levels = NULL;
@@ -663,6 +665,7 @@
 	    { "del", '\0', 0, &delItem, 0 },
 	    { "override", '\0', 0, &overrideItem, 0 },
 	    { "no-redirect", '\0', 0, &noRedirectItem, 0},
+	    { "no-reload", '\0', 0, &noReloadItem, 0},
 	    { "list", '\0', 0, &listItem, 0 },
 	    { "level", '\0', POPT_ARG_STRING, &levels, 0 },
 	    { "levels", '\0', POPT_ARG_STRING, &levels, 0 },
@@ -728,6 +731,10 @@
         noRedirectItem = 1;
     }
 
+    if (getenv("SYSTEMCTL_SKIP_RELOAD") != NULL) {
+        noReloadItem = 1;
+    }
+
     if (addItem) {
 	char * name = (char *)poptGetArg(optCon);
         int r;
@@ -738,8 +745,9 @@
 	name = basename(name);
 	delService(name, type, -1);
 	r = addService(name, type);
-        reloadSystemd();
-
+	if (!noReloadItem) {
+	    reloadSystemd();
+	}
         return r;
     } else if (delItem) {
 	char * name = (char *)poptGetArg(optCon);
@@ -749,7 +757,9 @@
 
 	name = basename(name);
 	r = delService(name, type, -1);
-        reloadSystemd();
+	if (!noReloadItem) {
+	    reloadSystemd();
+	}
 
         return r;
     } else if (overrideItem) {
@@ -760,7 +770,9 @@
 
         name = basename(name);
 	r = overrideService(name, type);
-        reloadSystemd();
+	if (!noReloadItem) {
+	    reloadSystemd();
+	}
 
         return r;
     } else if (listItem) {
@@ -827,16 +839,16 @@
 	    if (!noRedirectItem) {
 		forwardSystemd(name, type, "enable");
 	    }
-	    return setService(name, type, where, 1);
+	    return setService(name, type, where, 1, noReloadItem);
         } else if (!strcmp(state, "off")) {
 	    if (!noRedirectItem) {
 		forwardSystemd(name, type, "disable");
 	    }
-	    return setService(name, type, where, 0);
+	    return setService(name, type, where, 0, noReloadItem);
         } else if (!strcmp(state, "reset"))
-	    return setService(name, type, where, -1);
+	    return setService(name, type, where, -1, noReloadItem);
 	else if (!strcmp(state, "resetpriorities"))
-	    return setService(name, type, where, -2);
+	    return setService(name, type, where, -2, noReloadItem);
     }
 
     usage(progname);
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel

Reply via email to