Scott, that is not my problem; that one is well known. In fact, as you can see in [4] of the original post, I had already done all of that and it didn't help. I invested some time and finally found the exact problem.
Anyone who still encounters this (which is likely with the current stable vserver patch, which hasn't changed for years), or anyone trying to run upstart within a chroot, will run into the following:

1. If /sbin/init is called from a user process it won't have pid 1. The common way to react (sysvinit does it too) is to "replace this process with telinit", since the user probably wants to switch the runlevel using "init 3" (what he really wants is "telinit 3"). If upstart is started in a chroot, init's pid is always > 1, so you can't start upstart's /sbin/init in there. It will always execv() telinit and fail, either with a "wrong usage" error (the runlevel is missing) or with the above-mentioned "telinit: Failed to connect to socket /com/ubuntu/upstart: Connection refused", because no /sbin/init is listening on the dbus socket.

   You can fix that for the chroot by patching the upstart source and replacing the (pid > 1) condition with something like (pid > 1 && !have_no_runlevel_in_my_args_so_this_is_not_to_be_intended_as_telinit), so that init only hands over to telinit when a runlevel argument is actually present.

   In addition you need to fix some upstart jobs that signal pid 1 directly using "kill -SIG... 1". For that to work inside the chroot you would need something like "killall -SIG... init", since init won't be pid 1.

2. Now here comes the difference with Linux-VServer. If you use the "plain" init style (see original post), the /sbin/init process within the guest will end up with pid 1. So everything should be fine, right? No, because upstart's /sbin/init is linked against NPTL (libpthread), in contrast to SysV-Init's /sbin/init (used by all prior Ubuntu versions), which is not.
    # SysV:
    $ ldd /sbin/init
            libc.so.6 => /lib/libc.so.6 (0x00002b5756108000)
            /lib64/ld-linux-x86-64.so.2 (0x00002b5755edf000)

    # Upstart:
    $ ldd /sbin/init
            linux-vdso.so.1 => (0x00007fff8dcd0000)
            libdbus-1.so.3 => /lib/libdbus-1.so.3 (0x00007faab9a19000)
            libpthread.so.0 => /lib/libpthread.so.0 (0x00007faab97fd000)
            librt.so.1 => /lib/librt.so.1 (0x00007faab95f5000)
            libc.so.6 => /lib/libc.so.6 (0x00007faab9286000)
            /lib64/ld-linux-x86-64.so.2 (0x00007faab9c58000)

OK, fine. Where's the problem with -lpthread? Upstart's init checks for pid 1 using getpid(2). There is a known BUG, quoting its manpage:

    Since glibc version 2.3.4, the glibc wrapper function for getpid()
    caches PIDs, so as to avoid additional system calls when a process
    calls getpid() repeatedly. Normally this caching is invisible, but
    its correct operation relies on support in the wrapper functions
    for fork(2), vfork(2), and clone(2): if an application bypasses the
    glibc wrappers for these system calls by using syscall(2), then a
    call to getpid() in the child will return the wrong value (to be
    precise: it will return the PID of the parent process). See also
    clone(2) for discussion of a case where getpid() may return the
    wrong value even when invoking clone(2) via the glibc wrapper
    function.

This is the exact bug that causes Upstart's /sbin/init within a Linux-VServer guest to *always* replace itself with telinit. Calling getpid(2) will NOT return the real pid, but the cached one, which is the pid of the util-vserver startup script.

Here is a demonstration, using the attached program as /sbin/init:

    $ gcc -lpthread -o /sbin/init init.c    # (this is how upstart's init is linked)
    $ vserver foo start
    ppid = 20523, pid = 20524
    syscall getpid = 1

Notice the wrong "pid = 20524" above, coming from getpid(2): it is a cached pid. The syscall gave us the correct pid, which is 1!
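The attached init.c is not reproduced in this message. As a rough sketch only (not the actual attachment), a test of this kind could look like the following. The helper names real_pid() and report_pids() are hypothetical and chosen purely for illustration; the only calls relied on are getpid(2), getppid(2) and syscall(2) with SYS_getpid.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel directly for the pid, bypassing
 * any value cached by the glibc getpid() wrapper. */
static pid_t real_pid(void)
{
    return (pid_t) syscall(SYS_getpid);
}

/* Hypothetical helper: print the wrapper's view and the kernel's view of
 * the pid, in the same shape as the demonstration output.
 * Returns nonzero if the two values disagree. */
static int report_pids(FILE *out)
{
    pid_t cached = getpid();   /* may be stale on an affected system */
    pid_t real = real_pid();   /* always comes from the kernel */

    fprintf(out, "ppid = %d, pid = %d\n", (int) getppid(), (int) cached);
    fprintf(out, "syscall getpid = %d\n", (int) real);
    return cached != real;
}
```

A main() that simply calls report_pids(stdout), installed as the guest's /sbin/init, would print two lines in the shape shown above. A nonzero return value means the glibc wrapper and the kernel disagree about the pid; on a system without the bug the two values match and the function returns 0.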
Now the same program linked without NPTL:

    $ gcc -o /sbin/init init.c    # (this is how sysvinit's init is linked)
    $ vserver foo start
    ppid = 20278, pid = 1
    syscall getpid = 1

Everything is fine! So here are your options to fix it:

1. Fix upstart and replace the getpid(2) call with syscall(SYS_getpid) to get the _real_ process id.

or

2. Upgrade to a development version of the Linux-VServer kernel patch against a newer kernel. Even with the exact same host and guest system (same glibc, same /sbin/init binary) but a newer kernel, getpid(2) yields the correct result. I don't have the time to investigate this any further, and the changes to clone(2) and the whole namespace system between kernel versions 2.6.22 and 2.6.26+ are immense. Something along the way has fixed this bug.

   Current Linux-VServer stable kernel: 2.6.22.19-vs2.2.0.7 -- BROKEN with GLIBC 2.10.1-0ubuntu15
   Development Linux-VServer kernel: 2.6.31.6-vs2.3.0.36.24 -- WORKS fine, expected result for getpid(2).

With either fix in place, the /sbin/init provided by upstart will work and everything is fine.

-- 
upstart incompatible with linux-vserver
https://bugs.launchpad.net/bugs/482292