[Touch-packages] [Bug 1608953] Re: Issue with systemd issue when run inside a container
FTR, there is a similar bug 1611973 about postgresql not starting in a cloud instance, but that one is specific to cloud-init, not to LXC. How sure are you that this is really due to the low FDs of pid 1, as opposed to that bug?

Do you have some useful logs to look at, like journal output from that container? (Generators don't log to the journal, though, as they run before journald starts. On a "real" system their error messages land in dmesg; I'm not sure whether containers are allowed to do that.)

** Summary changed:

- Issue with systemd issue when run inside a container
+ PostgreSQL does not start in container

** Changed in: systemd (Ubuntu)
       Status: Fix Released => Incomplete

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1608953

Title:
  PostgreSQL does not start in container

Status in systemd package in Ubuntu:
  Incomplete
Status in systemd source package in Xenial:
  Incomplete

Bug description:
  We have a 16.04 Ubuntu lx-brand container image available in our public cloud and recently discovered a systemd bug that's related to running in a container environment. I'm forwarding below what one of our engineers discovered:

  After installing postgres (apt-get install -y -q postgresql), systemd does not actually start any of the postgres services. We tracked this down to a failure from sed within the /lib/systemd/system-generators/postgresql-generator script. The sed command tries to close stderr (fd 2), which fails, so sed returns an error code, which causes the entire postgres generator to fail.

  The root cause of the problem lies in the systemd code. Because we are running inside of a container (see detect_container), we don't execute the following block of code in the systemd main():

      if (getpid() == 1 && detect_container() <= 0) {

              /* Running outside of a container as PID 1 */
              arg_running_as = MANAGER_SYSTEM;
              make_null_stdio();

  The make_null_stdio function is what sets up fds 0-2 as /dev/null in systemd on bare metal. Having those fds set up is what allows the postgres system generator to work properly, since sed expects to be able to close stderr. Because we never call make_null_stdio when inside any container, the low fds wind up being set up later using /dev/console with O_CLOEXEC, so when we actually run the system generator script, the low fds are not set up at all as sed expects.

  Interestingly, looking at the master branch of systemd, at src/core/main.c, this bug appears to no longer exist. The relevant code block has been moved so it is no longer conditional on being in a container, but the commit was not intended to fix this problem. It was apparently made for color handling on the console:

  commit 3a18b60489504056f9b0b1a139439cbfa60a87e1

  It would be great if this fix could be pulled in to an update for Ubuntu 16.04.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1608953/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
[Touch-packages] [Bug 1608953] Re: Issue with systemd issue when run inside a container
I tried again with "classic" LXC (not lxd), and there the FDs look different: in xenial they point to /dev/pts/4, in yakkety they point to /dev/pts/1. But installing postgresql works fine in both xenial and yakkety.

I also tried with lxc-start-ephemeral, but in that case the low FDs again point to /dev/null and things work fine (but I suppose you don't use ephemeral containers).

So, I'm happy to backport the patch, but I'm unable to create an SRU test case or verify the fix myself.
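For comparing where the low FDs of a given process point across container types, a small helper along these lines may be convenient. It is only a sketch: fd_target is a hypothetical name, it assumes a Linux /proc, and it needs enough privileges to read /proc/PID/fd (root for pid 1).

```shell
#!/bin/sh
# Hypothetical helper: print where fd $2 of process $1 points.
# Prints "unavailable" if the descriptor is closed, the process does not
# exist, or /proc/PID/fd is not readable by the caller.
fd_target() {
    readlink "/proc/$1/fd/$2" 2>/dev/null || echo unavailable
}

# Summarise fds 0-2 of a PID. Defaults to this shell; pass 1 (as root)
# to inspect pid 1 inside the container.
pid=${1:-$$}
for fd in 0 1 2; do
    printf 'pid %s fd %s -> %s\n' "$pid" "$fd" "$(fd_target "$pid" "$fd")"
done
```

Run inside each container with 1 as the argument, this gives one line per fd that can be pasted into the bug for comparison.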
[Touch-packages] [Bug 1608953] Re: Issue with systemd issue when run inside a container
Thanks for the initial analysis! This is very helpful. I'm marking this as fixed in yakkety and adding a xenial task.

I tried to reproduce this. I created a standard xenial and yakkety container:

  lxc launch images:ubuntu/xenial/amd64 x1
  lxc launch images:ubuntu/yakkety/amd64 y1

In both of them pid 1's low fds look okay:

  $ lxc exec x1 -- ls -l /proc/1/fd/{0,1,2}
  lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/0 -> /dev/null
  lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/1 -> /dev/null
  lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/2 -> /dev/null

(same for y1)

PostgreSQL starts fine after installation:

  $ lxc exec x1 -- apt install -y postgresql
  $ lxc exec x1 -- pg_lsclusters
  Ver Cluster Port Status Owner    Data directory               Log file
  9.5 main    5432 online postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log

(again, same for y1)

The generator ran:

  $ lxc exec x1 -- ls -lR /run/systemd/generator/postgresql.service.wants
  /run/systemd/generator/postgresql.service.wants:
  total 0
  lrwxrwxrwx 1 root root 39 Aug 18 05:13 postgresql@9.5-main.service -> /lib/systemd/system/postgresql@.service

So I'm afraid I cannot reproduce this for testing the fix, and a reproducer is a requirement for SRUs. Can you please describe how this can be reproduced?

** Also affects: systemd (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: systemd (Ubuntu)
       Status: New => Fix Released

** Changed in: systemd (Ubuntu Xenial)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1608953

Title:
  Issue with systemd issue when run inside a container

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Xenial:
  Incomplete
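The "generator ran" check above could be scripted for an SRU test case roughly as follows. This is a sketch under assumptions: generator_created_links is a hypothetical name, and the default path is simply the directory listed in the message above.

```shell
#!/bin/sh
# Hypothetical check: succeed if directory $1 exists and contains at
# least one symlink, i.e. the generator created some unit links there.
generator_created_links() {
    dir=$1
    [ -d "$dir" ] || return 1
    for entry in "$dir"/*; do
        # -L is also true for dangling symlinks, so this does not depend
        # on the link target being installed.
        [ -L "$entry" ] && return 0
    done
    return 1
}

# Default to the directory inspected in the message above.
wants_dir=${1:-/run/systemd/generator/postgresql.service.wants}
if generator_created_links "$wants_dir"; then
    echo "generator output present in $wants_dir"
else
    echo "no generator output in $wants_dir"
fi
```

On an affected container the expectation would be the "no generator output" branch, but that is an inference from the report rather than something verified here.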