Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Hi,

On Sat, 11 May 2024 10:52:33 +0200 Martin Steigerwald wrote:
[...]
> In case it is helpful for you I could post a step-by-step guide for a
> minimal Incus setup and/or at least some pointers.

yes this would be useful

> > > > > I just wonder why stage 2 contains /usr/local bin
> > > > > directories. I think that should not be the case. Shall I
> > > > > report this as a different issue?
[...]
> I do think this discussion belongs in a different bug report
> though. I am willing to open a low priority report about this and
> include the previous relevant discussion in it, so it does not get
> lost and you can take your time to ponder it. There is no need
> to rush it.

fine for me, feel free to proceed

> Have a great weekend!

Thanks :)
Lorenzo
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Hi Lorenzo.

Lorenzo - 11.05.24, 02:16:15 CEST:
> On Tue, 07 May 2024 15:08:37 +0200
> Martin Steigerwald wrote:
> >[...]
> > Are init scripts supposed to be started with PATH variable set up and
> > exported or not? How is it done with SysVInit? I bet it would be best
> > to match as close as possible what SysVInit is doing to be as
> > compatible as possible.
>
> I checked this and in sysvinit you don't have this bug because during
> boot sysvscripts are run via /etc/init.d/rc script, and there is an
> 'export PATH' there. It could probably be triggered by calling the
> script directly during runtime.
> In runit we are calling scripts directly in stage1 so we have this bug

I see.

> > Otherwise it might be challenging to chase and find all the corner
> > cases with existing setups. And as there is no issue initializing the
> > network in the container with SysVInit instead of Runit used as PID
> > 1, I'd consider a change in Runit. At least it could be challenging
> > to find whether networking inside a container is the only thing that
> > breaks.
>
> I want to dig this further, I don't recall broken network under docker
> and I don't think it is broken under qemu, but I can be wrong or remember
> something from before /etc/init.d/rc usage was dropped from stage1

Could have something to do with Incus / LXD then. I used Incus in Devuan testing (upcoming Excalibur), which is based on Debian testing (upcoming Trixie).

In case it is helpful for you I could post a step-by-step guide for a minimal Incus setup and/or at least some pointers.

> > > > I just wonder why stage 2 contains /usr/local bin directories. I
> > > > think that should not be the case. Shall I report this as a
> > > > different issue?
> > >
> > > PATH is passed to env call for runsvdir, so I guess one can exec a
> > > bin from local as runscript (not sure) without setting the PATH. I
> > > can't think of other use cases..
[…]
> > Hmm, I get that. I am just a bit concerned as it may be a security
> > issue.
>
> not urgent, but could you elaborate this (security implications)? is
> something like an attacker placing a modified foo in /usr/local/ that
> overrides the legit foo in /usr/bin or is something else? one still
> needs root privileges to write to /usr/local..

Good question. It is how I learned it. :)

Yes, usually /usr/local is only writable by root, however… maybe an admin or user with root privileges put their own version of, say, coreutils or whatever in there for whatever reason and forgot about it. And later on it has some security hole that remains unpatched. With

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/sbin:/usr/bin:/bin

a vulnerable command might be picked up from /usr/local's (s)bin, as it comes even before the regular system managed (s)bin directories.

Yes, you can consider that a user error, but there might even be other scenarios, like a user or admin having installed some special version of ls or some other command in there because they prefer it. Even if the behavior of such a special replacement command is only slightly different from that of the original system default command, it could cause all kinds of trouble.

I believe this may be some of the reasoning behind the rule I learned about: only have system directories in PATH for system provided scripts and programs. It at least appears to me like the approach of least surprise. /usr/local, only root-writable or not, is user managed. There could be anything in there causing all kinds of trouble.

A compromise here would be to change the path like this:

PATH=/usr/sbin:/sbin:/usr/bin:/bin:/usr/local/sbin:/usr/local/bin

With that order at least a modified "ls" from /usr/local/bin would not be picked up, as there is a system managed one available. But a command that is not available in system managed directories would still be picked up from /usr/local directories.

As one point of practical experience, I changed the path to

PATH=/sbin:/usr/sbin:/bin:/usr/bin

in one Incus managed Devuan container which runs a WordPress blog on Apache and PHP FPM and I see no issues. However, as I also don't have anything in /usr/local, that is to be expected.

One approach could be to just change the path to the above system managed directories only path, add some NEWS.Debian entry about it and see whether someone complains :)

I do think this discussion belongs in a different bug report though. I am willing to open a low priority report about this and include the previous relevant discussion in it, so it does not get lost and you can take your time to ponder it. There is no need to rush it.

Have a great weekend!
--
Martin
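The effect of the directory order in PATH can be reproduced without touching /usr/local at all. The following is only a minimal sketch: the temporary directories "system" and "local" are stand-ins for /usr/bin and /usr/local/bin, and the commands foo and bar are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: "system" and "local" are stand-ins for /usr/bin and
# /usr/local/bin; foo and bar are hypothetical commands.
tmp=$(mktemp -d)
mkdir -p "$tmp/system" "$tmp/local"

printf '#!/bin/sh\necho system\n'     > "$tmp/system/foo"
printf '#!/bin/sh\necho local\n'      > "$tmp/local/foo"
printf '#!/bin/sh\necho local-only\n' > "$tmp/local/bar"
chmod +x "$tmp/system/foo" "$tmp/local/foo" "$tmp/local/bar"

# local directory first: the locally installed foo shadows the system one
local_first=$(PATH="$tmp/local:$tmp/system"; foo)

# system directory first: the system foo wins ...
system_first=$(PATH="$tmp/system:$tmp/local"; foo)

# ... but a command that exists only locally is still found
fallback=$(PATH="$tmp/system:$tmp/local"; bar)

echo "local first:  foo -> $local_first"
echo "system first: foo -> $system_first"
echo "system first: bar -> $fallback"

rm -rf "$tmp"
```

Each PATH change happens inside a command substitution, so the PATH of the calling shell is left untouched.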
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Hi,

On Tue, 07 May 2024 15:08:37 +0200 Martin Steigerwald wrote:
>[...]
> Are init scripts supposed to be started with PATH variable set up and
> exported or not? How is it done with SysVInit? I bet it would be best
> to match as close as possible what SysVInit is doing to be as
> compatible as possible.

I checked this and in sysvinit you don't have this bug because during boot sysvscripts are run via the /etc/init.d/rc script, and there is an 'export PATH' there. It could probably be triggered by calling the script directly during runtime.
In runit we are calling scripts directly in stage1 so we have this bug

> Otherwise it might be challenging to chase and find all the corner
> cases with existing setups. And as there is no issue initializing the
> network in the container with SysVInit instead of Runit used as PID
> 1, I'd consider a change in Runit. At least it could be challenging
> to find whether networking inside a container is the only thing that
> breaks.

I want to dig this further, I don't recall broken network under docker and I don't think it is broken under qemu, but I can be wrong or remember something from before /etc/init.d/rc usage was dropped from stage1

> > > I just wonder why stage 2 contains /usr/local bin directories. I
> > > think that should not be the case. Shall I report this as a
> > > different issue?
> >
> > PATH is passed to env call for runsvdir, so I guess one can exec a
> > bin from local as runscript (not sure) without setting the PATH. I
> > can't think of other use cases..
> > I'm fine with removing, just a bit wary, I'm afraid to break some
> > custom setup
>
> Hmm, I get that. I am just a bit concerned as it may be a security
> issue.

not urgent, but could you elaborate this (security implications)? is something like an attacker placing a modified foo in /usr/local/ that overrides the legit foo in /usr/bin or is something else? one still needs root privileges to write to /usr/local..

Lorenzo

> > > I added empty "debug" and "verbose" files in /etc/runit but did
> > > not find any debug output. Maybe those files needed to have some
> > > content. Maybe it requires bootlogd.
> >
> > those files only work for runit stuff (runscripts and the sv
> > trigger), boot scripts are for sysvinit and do not obey to runit
> > settings :( perhaps it's time to roll some native runit
> > bootscripts..
>
> I see. Well that would be great. But also would require a lot of work
> and testing I bet.
>
> Best,
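The difference described in this message (a wrapper that exports PATH before running the boot scripts, versus calling the scripts directly) could be sketched roughly as follows. This is an illustration only: run_boot_scripts is a hypothetical helper, not code from runit or sysvinit, and the PATH value is simply the one quoted elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of runit or sysvinit: run every S* script
# in a directory with PATH both set and exported, so that the scripts and
# anything they start inherit it. The body runs in a subshell, so the
# caller's own PATH is not modified.
run_boot_scripts() (
    PATH=/sbin:/usr/sbin:/bin:/usr/bin
    export PATH
    for script in "$1"/S*; do
        # skip unmatched globs and non-executable files
        [ -x "$script" ] || continue
        "$script" start
    done
)
```

Called as `run_boot_scripts /etc/rcS.d`, every script would see the exported PATH regardless of what environment the init process itself was started with.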
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Hi Lorenzo.

Sorry for the late answer.

Lorenzo - 14.04.24, 11:36:32 CEST:
> On Sat, 13 Apr 2024 15:05:41 +0200
> Martin Steigerwald wrote:
> > Martin Steigerwald - 13.04.24, 14:32:16 CEST:
> > > Any idea how to find the cause of what is happening here?
> >
> > I found the cause:
> >
> > The container starts out with an almost empty environment. In
> > [...]
> >
> > root@zdevuan:~# cat rcS.log
> >
> > >> environment
> >
> > container=lxc
> > PWD=/
> >
> > >> end of environment
> >
> > No PATH defined.
> >
> > The script defines it. See line 8 in my changed script. However it
> > does not export it. Thus adding line 9 fixes the bug I reported:
> >
> > 8 PATH=/sbin:/usr/sbin:/bin:/usr/bin
> > 9 export PATH
> >
> > The network is configured just fine after adding that line.
> >
> > Same goes for stage 2. In /etc/runit/2 I added:
> >[...]
> >
> > Exporting the PATH there as well like
> >
> > 1 #!/bin/sh
> > 2
> > 3 PATH=/usr/local/sbin:/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/bin
> > 4 export PATH
> > 5 SVDIR=/etc/service
> >
> > fixes
> >
> > root@zdevuan:~# cat /etc/boot.d/network
> > #!/usr/bin/env sh
> >
> > /etc/init.d/networking restart
> >
> > The network is configured even without the "export PATH" fix in
> > /etc/runit/1.
>
> OK, so if I understand correctly we either export PATH in the
> /etc/init.d/networking script or we export PATH both in stage 1 and 2
> (so the script does not fail during boot and can be called during
> runtime): is that correct?

In case it is called during both stage 1 and stage 2, yes. And yes, it appears there is a link to the networking script in /etc/rcS.d which would be called in stage 1.

> If yes I think it's better to fix the networking script (ifupdown pkg)
> so that the fix works for sysvinit users too.

Yeah, I would think so too, but:

% grep PATH /etc/init.d/networking
PATH="/sbin:/bin:/usr/sbin:/usr/bin"

Yet, it has no export statement, it just defines the variable.

What may be happening here is that something called from the script requires a valid path, but without export the variable is not passed on to that something. So it might be that the networking script needs an "export PATH" added to it. However:

> Different story if multiple scripts fail during boot because of empty
> PATH; many scripts in /etc/rcS.d/ set their PATH but others don't..
> Could you confirm that no other scripts fail in your container setup
> when PATH is not exported in stage 1 ?

There are some scripts which do not set a command search path:

% grep -L "PATH" *
README
brightness
checkroot-bootclean.sh
hwclock.sh
mariadb
mountall-bootclean.sh
mountnfs-bootclean.sh
mountnfs.sh
procps
rcS
sudo

I am not sure whether those work correctly or not. Some are not even supposed to work inside a container at all.

What I wonder: What is the supposed default or standard here?

Are init scripts supposed to be started with PATH variable set up and exported or not? How is it done with SysVInit? I bet it would be best to match as close as possible what SysVInit is doing to be as compatible as possible.

Otherwise it might be challenging to chase and find all the corner cases with existing setups. And as there is no issue initializing the network in the container with SysVInit instead of Runit used as PID 1, I'd consider a change in Runit. At least it could be challenging to find whether networking inside a container is the only thing that breaks.

Of course in case the PATH variable needs to be set up, one might argue that Incus/LXC needs to do it, because networking is set up just fine even with Runit on physical machines or VMs.

So far the container appears to be working, but I did not check whether every single init script works correctly. Partly due to bootlogd not working inside the container.

> > I just wonder why stage 2 contains /usr/local bin directories. I think
> > that should not be the case. Shall I report this as a different issue?
>
> PATH is passed to env call for runsvdir, so I guess one can exec a bin
> from local as runscript (not sure) without setting the PATH. I can't
> think of other use cases..
> I'm fine with removing, just a bit wary, I'm afraid to break some custom
> setup

Hmm, I get that. I am just a bit concerned as it may be a security issue.

> > I added empty "debug" and "verbose" files in /etc/runit but did not
> > find any debug output. Maybe those files needed to have some content.
> > Maybe it requires bootlogd.
>
> those files only work for runit stuff (runscripts and the sv trigger),
> boot scripts are for sysvinit and do not obey to runit settings :(
> perhaps it's time to roll some native runit bootscripts..

I see. Well that would be great. But also would require a lot of work and testing I bet.

Best,
--
Martin
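The `grep -L "PATH"` check above only separates scripts that mention PATH from those that do not. A slightly finer, still heuristic, classification could look like the sketch below. classify_path_usage is a hypothetical helper, and the patterns only look at line starts, so PATH handling inside sourced files or unusual constructs will be missed.

```shell
#!/bin/sh
# Hypothetical helper: for every regular file in a directory, report
# whether it exports PATH, only assigns it, or never sets it.
# Heuristic: only simple "export PATH..." and "PATH=..." lines are detected.
classify_path_usage() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if grep -Eq '^[[:space:]]*export[[:space:]]+PATH' "$f"; then
            state=exported
        elif grep -Eq '^[[:space:]]*PATH=' "$f"; then
            state=assigned-only
        else
            state=unset
        fi
        printf '%s\t%s\n' "$state" "${f##*/}"
    done
}
```

Running `classify_path_usage /etc/init.d` would then list /etc/init.d/networking as "assigned-only", given the grep output shown in this message.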
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
On Sat, 13 Apr 2024 17:29:48 +0200 Martin Steigerwald wrote:
> Martin Steigerwald - 13.04.24, 15:05:41 CEST:
> > No PATH defined.
> >
> > The script defines it. See line 8 in my changed script. However it
> > does not export it. Thus adding line 9 fixes the bug I reported:
> >
> > 8 PATH=/sbin:/usr/sbin:/bin:/usr/bin
> > 9 export PATH
> >
> > The network is configured just fine after adding that line.
>
> Since configuring networking works on physical machines (I know for
> sure with Devuan 5 aka Daedalus and Devuan Ceres, which was at a
> similar state Devuan Excalibur is currently), regarding the right fix
> the question remains:
>
> What is different with the PATH in both cases and why?
>
> Why is it empty inside the (unprivileged) Incus managed LXC
> container? Is it empty on the physical machine? I doubt it. But where
> does the difference come from? And anyway the PATH is being set in
> both stage 1 and stage 2 scripts, just not exported. So on a physical
> machine it appears that PATH is being exported before already. It
> might be exported before already on a container as well, albeit
> undefined / empty.

If I remember correctly, it's the kernel that sets the environment for init; and it's different even when it boots "statically" vs via initramfs, as the latter defines a lot of extra and unnecessary stuff

> In both cases /bin/sh points to /bin/dash. Both the VM with Excalibur
> and the physical host with Daedalus are usrmerge'd.
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Hello Martin,

thanks for the detailed info and the time spent on it

On Sat, 13 Apr 2024 15:05:41 +0200 Martin Steigerwald wrote:
> Martin Steigerwald - 13.04.24, 14:32:16 CEST:
> > Any idea how to find the cause of what is happening here?
>
> I found the cause:
>
> The container starts out with an almost empty environment. In
> [...]
>
> root@zdevuan:~# cat rcS.log
> >> environment
> container=lxc
> PWD=/
> >> end of environment
>
> No PATH defined.
>
> The script defines it. See line 8 in my changed script. However it
> does not export it. Thus adding line 9 fixes the bug I reported:
>
> 8 PATH=/sbin:/usr/sbin:/bin:/usr/bin
> 9 export PATH
>
> The network is configured just fine after adding that line.
>
> Same goes for stage 2. In /etc/runit/2 I added:
>[...]
>
> Exporting the PATH there as well like
>
> 1 #!/bin/sh
> 2
> 3 PATH=/usr/local/sbin:/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/bin
> 4 export PATH
> 5 SVDIR=/etc/service
>
> fixes
>
> root@zdevuan:~# cat /etc/boot.d/network
> #!/usr/bin/env sh
>
> /etc/init.d/networking restart
>
> The network is configured even without the "export PATH" fix in
> /etc/runit/1.

OK, so if I understand correctly we either export PATH in the /etc/init.d/networking script or we export PATH both in stage 1 and 2 (so the script does not fail during boot and can be called during runtime): is that correct?
If yes I think it's better to fix the networking script (ifupdown pkg) so that the fix works for sysvinit users too.
Different story if multiple scripts fail during boot because of empty PATH; many scripts in /etc/rcS.d/ set their PATH but others don't..
Could you confirm that no other scripts fail in your container setup when PATH is not exported in stage 1 ?

> I just wonder why stage 2 contains /usr/local bin directories. I think
> that should not be the case. Shall I report this as a different issue?

PATH is passed to env call for runsvdir, so I guess one can exec a bin from local as runscript (not sure) without setting the PATH. I can't think of other use cases..
I'm fine with removing, just a bit wary, I'm afraid to break some custom setup

> I am now undoing my debug output.
>
> I think I could provide a merge request for the fixes at a later time.
> For now I like to finish the Devuan template and actually use it.
>
> That bootlogd does not seem to work inside a container is a different
> issue I may report at another time.
>
> I added empty "debug" and "verbose" files in /etc/runit but did not
> find any debug output. Maybe those files needed to have some content.
> Maybe it requires bootlogd.

those files only work for runit stuff (runscripts and the sv trigger), boot scripts are for sysvinit and do not obey to runit settings :(
perhaps it's time to roll some native runit bootscripts..

> But that is for another time :)
>
> Best,

Best regards,
Lorenzo
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Martin Steigerwald - 13.04.24, 15:05:41 CEST:
> No PATH defined.
>
> The script defines it. See line 8 in my changed script. However it does
> not export it. Thus adding line 9 fixes the bug I reported:
>
> 8 PATH=/sbin:/usr/sbin:/bin:/usr/bin
> 9 export PATH
>
> The network is configured just fine after adding that line.

Since configuring networking works on physical machines (I know for sure with Devuan 5 aka Daedalus and Devuan Ceres, which was at a similar state Devuan Excalibur is currently), regarding the right fix the question remains:

What is different with the PATH in both cases and why?

Why is it empty inside the (unprivileged) Incus managed LXC container? Is it empty on the physical machine? I doubt it. But where does the difference come from? And anyway the PATH is being set in both stage 1 and stage 2 scripts, just not exported. So on a physical machine it appears that PATH is being exported before already. It might be exported before already on a container as well, albeit undefined / empty.

In both cases /bin/sh points to /bin/dash. Both the VM with Excalibur and the physical host with Daedalus are usrmerge'd.

--
Martin
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Martin Steigerwald - 13.04.24, 14:32:16 CEST:
> Any idea how to find the cause of what is happening here?

I found the cause:

The container starts out with an almost empty environment. In /etc/runit/1 I added lines 4 to 6:

1 #!/bin/sh
2 # system one time initialization tasks
3
4 echo ">> environment" >> /tmp/rcS.log
5 /usr/bin/env >> /tmp/rcS.log
6 echo ">> end of environment" >> /tmp/rcS.log
7
8 PATH=/sbin:/usr/sbin:/bin:/usr/bin

(For some reason using /tmp/rcS.log did not give me any output. Although /tmp is not mounted elsewhere during the boot process.)

This gives me:

root@zdevuan:~# cat rcS.log
>> environment
container=lxc
PWD=/
>> end of environment

No PATH defined.

The script defines it. See line 8 in my changed script. However it does not export it. Thus adding line 9 fixes the bug I reported:

8 PATH=/sbin:/usr/sbin:/bin:/usr/bin
9 export PATH

The network is configured just fine after adding that line.

Same goes for stage 2. In /etc/runit/2 I added:

38 echo "$runsv_dir" 2>&1 >> /tmp/rc2.log
39 echo ">> environment" >> /tmp/rc2.log
40 env >> /tmp/rc2.log >> /tmp/rc2.log
41 echo ">> end of environment"
42 ls -l /etc/runit/no.emulate.sysv 2>&1 >>/tmp/rc2.log
43 if [ "$runsv_dir" != solo ] && [ ! -e /etc/runit/no.emulate.sysv ]; then
44 echo "run rc2.d scripts…" 2>&1 >>/tmp/rc2.log
45 /lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d' 2>&1 >>/tmp/rc2.log
46 fi

Which gives me:

>> environment
container=lxc
PWD=/
>> end of environment

Exporting the PATH there as well like

1 #!/bin/sh
2
3 PATH=/usr/local/sbin:/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/bin
4 export PATH
5 SVDIR=/etc/service

fixes

root@zdevuan:~# cat /etc/boot.d/network
#!/usr/bin/env sh

/etc/init.d/networking restart

The network is configured even without the "export PATH" fix in /etc/runit/1.

I just wonder why stage 2 contains /usr/local bin directories. I think that should not be the case. Shall I report this as a different issue?

I am now undoing my debug output.

I think I could provide a merge request for the fixes at a later time. For now I like to finish the Devuan template and actually use it.

That bootlogd does not seem to work inside a container is a different issue I may report at another time.

I added empty "debug" and "verbose" files in /etc/runit but did not find any debug output. Maybe those files needed to have some content. Maybe it requires bootlogd.

But that is for another time :)

Best,
--
Martin
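The behaviour found here (PATH is assigned in the script but never reaches the programs it starts) can be reproduced outside the container. The following is a minimal sketch, assuming an env(1) that supports -i to start with an empty environment (GNU coreutils and busybox both do) and a POSIX shell at /bin/sh.

```shell
#!/bin/sh
# Sketch only: start a shell with an empty environment, similar to what
# the container's init sees, and check what a child process inherits.

# Case 1: PATH is assigned but not exported. The shell itself uses it to
# find "env" and "grep", but the child's environment has no PATH entry.
unexported=$(env -i /bin/sh -c \
    'PATH=/sbin:/usr/sbin:/bin:/usr/bin
     env | grep "^PATH=" || echo "PATH not in child environment"')

# Case 2: the same assignment followed by "export PATH".
exported=$(env -i /bin/sh -c \
    'PATH=/sbin:/usr/sbin:/bin:/usr/bin
     export PATH
     env | grep "^PATH="')

echo "without export: $unexported"
echo "with export:    $exported"
```

This matches the observation above: the assignment alone is enough for the script to find its own commands, but anything those commands start in turn has no PATH unless it is exported.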
Bug#1068922: runit-init: configuring network interfaces at boot inside LXC with runit as init system fails
Package: runit-init
Version: 2.1.2-54
Severity: normal
X-Debbugs-Cc: mar...@lichtvoll.de

Dear Maintainer,

Hi! I have Devuan Excalibur with Incus (forked from LXD) managed LXC containers. reportbug said the package is unforked and thus I agreed to send to Debian BTS instead.

All but one of the containers are Alpine Linux. In there I installed dhcpcd for dual stack DHCP from Incus managed dnsmasq.

I am currently configuring myself a Devuan template starting from

incus launch images:devuan/daedalus zdevuan

I installed runit-init and socklog-run in there.

The container comes up but dhcpcd is not running. It should have been started by /etc/init.d/networking due to /etc/network/interfaces:

auto eth0
iface eth0 inet dhcp

And indeed it is:

root@zdevuan:~# /etc/init.d/networking start
Configuring network interfaces...dhcpcd-9.4.1 starting
[…]

However even with:

root@zdevuan:~# cat /etc/boot.d/network
#!/usr/bin/env sh

/etc/init.d/networking start

it does not work.

I looked up how runit stage 2 runs init scripts. It does so by:

root@zdevuan:/etc# grep -r "rc2.d"
runit/2:/lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d'

So I ran /lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d' manually and indeed it picks up /etc/boot.d/network:

root@zdevuan:~# /lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d'
dmesg: read kernel buffer failed: Operation not permitted
Not running dhcpcd because /etc/network/interfaces ... failed!
defines some interfaces that will use a DHCP client ... failed!
Configuring network interfaces...dhcpcd-9.4.1 starting
[…]

That last line is from /etc/boot.d/network.

Thus I tried to find out whether /etc/runit/2 actually runs those scripts on boot:

38 echo "$runsv_dir" 2>&1 >> /tmp/rc2.log
39 ls -l /etc/runit/no.emulate.sysv 2>&1 >>/tmp/rc2.log
40 if [ "$runsv_dir" != solo ] && [ ! -e /etc/runit/no.emulate.sysv ]; then
41 echo "run rc2.d scripts…" 2>&1 >>/tmp/rc2.log
42 /lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d' 2>&1 >>/tmp/rc2.log
43 fi

This gives me:

root@zdevuan:~# cat /tmp/rc2.log
default
run rc2.d scripts…
Not running dhcpcd because /etc/network/interfaces ... failed!
defines some interfaces that will use a DHCP client ... failed!
Configuring network interfaces...failed.

So indeed stage 2 runs the scripts. But it cannot configure the network interface at this time.

However running /lib/runit/async-timeout /lib/runit/run_sysv_scripts '/etc/rc2.d' later just works okay as shown above.

Also putting "/etc/init.d/networking restart" inside "/etc/boot.d/network" does not work:

Running /etc/init.d/networking restart is deprecated because it may not re-enable some interfaces ... (warning).
Reconfiguring network interfaces...failed.

Not even putting

echo "ifdown eth0:"
ifdown eth0
echo "ifup eth0:"
ifup eth0

in there does work:

root@zdevuan:~# cat /tmp/rc2.log
default
run rc2.d scripts…
Not running dhcpcd because /etc/network/interfaces ... failed!
defines some interfaces that will use a DHCP client ... failed!
ifdown eth0:
ifup eth0:

No output from "ifup eth0", which does not seem right. However "ifdown eth0" and "ifup eth0" just work fine after booting. But even if I insert a "sleep 10" before those, it still does not work.

I also looked for how rcS.d scripts are executed by Runit stage 1:

root@zdevuan:/etc# grep -r "rcS.d"
[…]
runit/1:for script in /etc/rcS.d/S* ; do

In there I added for debugging:

11 for script in /etc/rcS.d/S* ; do
12 path=$(realpath "$script")
13 name=${path##*/}
14 [ -e "/etc/runit/no.emulate.sysv.d/$name" ] && continue
[…]
19 echo "run $script" >>/tmp/rcS.log
20 "$script" start --force-sysv 2>&1 >>/tmp/rcS.log
21 done

And indeed stage1 runs the scripts. But configuring network interfaces fails there as well:

root@zdevuan:~# cat /tmp/rcS.log
run /etc/rcS.d/S08mountall.sh
Mounting local filesystems...done.
Activating swapfile swap, if any...done.
run /etc/rcS.d/S09mountall-bootclean.sh
Cleaning up temporary files
run /etc/rcS.d/S10brightness
run /etc/rcS.d/S10procps
Starting Setting kernel variables: sysctl is already running.
run /etc/rcS.d/S10stop-bootlogd-single
run /etc/rcS.d/S10urandom
run /etc/rcS.d/S11networking
Configuring network interfaces...failed.
run /etc/rcS.d/S12mountnfs.sh
run /etc/rcS.d/S13mountnfs-bootclean.sh
Cleaning up temporary files
run /etc/rcS.d/S14bootmisc.sh

However as bootlogd is not being started and would not work inside an LXC container anyway, I am not sure I can see any logging:

root@zdevuan:~# /etc/init.d/bootlogd start
Starting boot logger: bootlogdbootlogd: ioctl(/dev/pts/2, TIOCCONS): Operation not permitted

Any idea how to find the cause of what is happening here?

Best,
--
Martin

-- System Information: Devuan Release:
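A side note on the debug lines in this report: the shell processes redirections from left to right, so `cmd 2>&1 >>file` first points stderr at whatever stdout currently is (typically the console) and only then moves stdout to the file. Error messages therefore do not end up in the log. Whether this explains the missing "ifup eth0" output is an assumption, not something the report confirms. A minimal sketch of the difference, where noisy is a made-up stand-in for a command that writes to both streams:

```shell
#!/bin/sh
# Sketch only: "noisy" is a hypothetical command writing to both streams.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

log_a=$(mktemp)
log_b=$(mktemp)

# Order used in the debug lines: stderr is duplicated onto the current
# stdout first, so it escapes the log file.
leaked_a=$(noisy 2>&1 >>"$log_a")

# Reversed order: stdout goes to the file first, then stderr follows it.
leaked_b=$(noisy >>"$log_b" 2>&1)

content_a=$(cat "$log_a")
content_b=$(cat "$log_b")

echo "2>&1 >>log : leaked='$leaked_a'"
echo ">>log 2>&1 : leaked='$leaked_b'"

rm -f "$log_a" "$log_b"
```

With the second form, messages written to stderr by the init scripts would be captured in /tmp/rc2.log and /tmp/rcS.log as well.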