Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
On Mon, Jul 10, 2023 at 12:11:01PM +0200, Lennart Poettering wrote:
> ProtectHome= protects /home/, nothing else. Hence you can use it, and
> it should not collide with bind's use of the home dir, because it's
> not in /home.
>
> Actually, correcting myself: use BindReadOnlyPaths= for this. Clients
> can still connect to sockets on a read-only fs just fine, but you take
> the privs away to chmod() or chown() the inode that way. So you get
> another line of defense that way.

Thank you, all my questions are answered for the time being. Your help
is appreciated.

Greetings
Marc

--
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Phone: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
On Mon, 10.07.23 11:37, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> Hi Lennart,
>
> On Mon, Jul 10, 2023 at 10:28:52AM +0200, Lennart Poettering wrote:
> > On Sun, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
> >
> > > > It should suffice bind mounting just the notify socket, not the full
> > > > dir.
> > >
> > > Is it intended behavior that an empty file is left at the "mount point"
> > > (what Where= points to) after the unit was stopped?
> >
> > We need an inode we can overmount, and given that this is in /run/
> > (hence inherently ephemeral) and a fixed path it shouldn't matter.
>
> So this is intended. Good to know. I stumbled upon that.
>
> > > If I set ProtectHome=yes, how do I give the user that bind runs as
> > > access to its homedir? Is ReadWritePaths= the solution?
> >
> > ProtectHome= is about /home/ only, i.e. regular ("human") users, not
> > about system users (i.e. uid < 1K). Your bind should *not* run as
> > regular user, but as a system user of course, hence ProtectHome= is
> > something you can just set, and don't need to be concerned about the
> > system user's home dir.
>
> In Debian, bind runs as user bind, which gets created as a system user
> (uid < 1K, yes), and with /var/cache/bind as its home directory, which
> is the directory where, for example, slave zone files get written to.
> So, the running process needs to be able to access its "home directory"
> during its operation even after dropping root.

ProtectHome= protects /home/, nothing else. Hence you can use it, and
it should not collide with bind's use of the home dir, because it's
not in /home.

> > > [Mount]
> > > What=/run/systemd
> > > Where=/var/local/chroot/bind/run/systemd
> > > Type=none
> > > Options=bind
> >
> > Note that /run/ should always be a tmpfs, hence unless you mount a
> > tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.
> >
> > Instead of this .mount unit, consider using in the .service file:
> >
> > TemporaryFileSystem=/var/local/chroot/bind/run
> > BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify
>
> Ah, of course. I obviously didn't read the BindPaths= documentation
> thoroughly enough. That is of course way better. Thanks for helping me
> to read the docs.

Actually, correcting myself: use BindReadOnlyPaths= for this. Clients
can still connect to sockets on a read-only fs just fine, but you take
the privs away to chmod() or chown() the inode that way. So you get
another line of defense that way.

Lennart

--
Lennart Poettering, Berlin
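
For reference, a minimal and untested sketch of the suggestion above as a drop-in, assuming bind chroots itself into /var/local/chroot/bind as elsewhere in this thread. The drop-in file name is hypothetical. In systemd.exec(5) the read-only variant of BindPaths= is spelled BindReadOnlyPaths=. Because TemporaryFileSystem= mounts a fresh, empty tmpfs over the chroot's run/ directory, anything named expects to find there would presumably have to be recreated.

```ini
# /etc/systemd/system/named.service.d/notify-socket.conf  (hypothetical file name)
[Service]
# Fresh tmpfs at the chroot's run/ directory, visible only in this service's mount namespace
TemporaryFileSystem=/var/local/chroot/bind/run
# Expose only the notify socket, read-only, at the path named sees after its chroot()
BindReadOnlyPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify
```

Since these mounts exist only for the lifetime of the service, a separate .mount unit and the matching RequiresMountsFor= line should no longer be needed with this variant.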
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
Hi Lennart,

On Mon, Jul 10, 2023 at 10:28:52AM +0200, Lennart Poettering wrote:
> On Sun, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
>
> > > It should suffice bind mounting just the notify socket, not the full
> > > dir.
> >
> > Is it intended behavior that an empty file is left at the "mount point"
> > (what Where= points to) after the unit was stopped?
>
> We need an inode we can overmount, and given that this is in /run/
> (hence inherently ephemeral) and a fixed path it shouldn't matter.

So this is intended. Good to know. I stumbled upon that.

> > If I set ProtectHome=yes, how do I give the user that bind runs as
> > access to its homedir? Is ReadWritePaths= the solution?
>
> ProtectHome= is about /home/ only, i.e. regular ("human") users, not
> about system users (i.e. uid < 1K). Your bind should *not* run as
> regular user, but as a system user of course, hence ProtectHome= is
> something you can just set, and don't need to be concerned about the
> system user's home dir.

In Debian, bind runs as user bind, which gets created as a system user
(uid < 1K, yes), and with /var/cache/bind as its home directory, which
is the directory where, for example, slave zone files get written to.
So, the running process needs to be able to access its "home directory"
during its operation even after dropping root.

> > [Mount]
> > What=/run/systemd
> > Where=/var/local/chroot/bind/run/systemd
> > Type=none
> > Options=bind
>
> Note that /run/ should always be a tmpfs, hence unless you mount a
> tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.
>
> Instead of this .mount unit, consider using in the .service file:
>
> TemporaryFileSystem=/var/local/chroot/bind/run
> BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify

Ah, of course. I obviously didn't read the BindPaths= documentation
thoroughly enough. That is of course way better. Thanks for helping me
to read the docs.

Greetings
Marc

--
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Phone: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
On Sun, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> > It should suffice bind mounting just the notify socket, not the full
> > dir.
>
> Is it intended behavior that an empty file is left at the "mount point"
> (what Where= points to) after the unit was stopped?

We need an inode we can overmount, and given that this is in /run/
(hence inherently ephemeral) and a fixed path it shouldn't matter.

> > You could also use a hybrid approach: let systemd bind mount this for
> > your service (and thus set up a minimal namespaced env for your
> > service, but with the root where it normally is), and then still let
> > bind do its own chroot() thing inside of it).
>
> I do not quite understand exactly what that means, how would I do that?
> What is "this" in the "mount this" part of sentence?

Use BindPaths= to mount the notify socket to the right place.

> > Generally speaking: it's typically a better idea to rely on proper fs
> > mount namespacing (i.e. decoupling mount tables of services and host)
> > rather than plain chroot() (where they aren't decoupled). If bind only
> > does chroot(), then I think using systemd's namespacing is the much
> > better choice.
>
> Where would I read up on systemd namespacing? Are you referring to what
> is explained in the "Sandboxing" chapter of systemd.exec(5), like
> ProtectSystem, ReadWritePaths etc?

Yes.

> So I would basically set
> ProtectSystem=strict
> ReadWritePaths=/var/local/chroot/bind/pathlist (all paths that bind
> needs to actually write to) and then finally
> ExecStart=/path/to/bind -f -t /var/local/chroot/bind ?

I don't know bind, so can't judge on the command line, but otherwise,
yeah.

> If I set ProtectHome=yes, how do I give the user that bind runs as
> access to its homedir? Is ReadWritePaths= the solution?

ProtectHome= is about /home/ only, i.e. regular ("human") users, not
about system users (i.e. uid < 1K). Your bind should *not* run as
regular user, but as a system user of course, hence ProtectHome= is
something you can just set, and don't need to be concerned about the
system user's home dir.

> > > This works as intended when I start up bind9, but when stopping the name
> > > daemon, the bind mount still lingers around. I have not fully understood
> > > the necessary systemd magic to have var-local-bind-run-systemd.mount
> > > stopped whenever bind9.service stops. How would I do that?
> >
> > You can do Wants= from bind to the mount unit. And then do
> > StopWhenUnneeded= in the mount unit, to release it when not needed.
>
> StopWhenUnneeded was what I needed. So I currently have:
>
> [7/5031]mh@drop:~ $ sudo systemctl cat named
> # /lib/systemd/system/named.service
> [Unit]
> Description=BIND Domain Name Server
> Documentation=man:named(8)
> After=network.target
> Wants=nss-lookup.target
> Before=nss-lookup.target
> RequiresMountsFor=/var/local/chroot/bind/run/systemd
>
> [Service]
> Type=notify
> EnvironmentFile=-/etc/default/named
> ExecStart=/usr/sbin/named -f $OPTIONS
> ExecReload=/usr/sbin/rndc reload
> ExecStop=/usr/sbin/rndc stop

bind doesn't react to SIGTERM correctly on its own?

> Restart=on-failure
>
> [Install]
> WantedBy=multi-user.target
> Alias=bind9.service
> [8/5030]mh@drop:~ $
>
> and
>
> 1 [9/5031]mh@drop:~ $ sudo systemctl cat
> var-local-chroot-bind-run-systemd.mount
> # /etc/systemd/system/var-local-chroot-bind-run-systemd.mount
> [Unit]
> StopWhenUnneeded=true
>
> [Mount]
> What=/run/systemd
> Where=/var/local/chroot/bind/run/systemd
> Type=none
> Options=bind

Note that /run/ should always be a tmpfs, hence unless you mount a
tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.

Instead of this .mount unit, consider using in the .service file:

TemporaryFileSystem=/var/local/chroot/bind/run
BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify

(Under the assumption bind chroots itself into /var/local/chroot/bind)

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
Hi Lennart,

thanks for this helpful answer.

On Tue, Jul 04, 2023 at 10:37:55AM +0200, Lennart Poettering wrote:
> On Mon, 03.07.23 20:52, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
> > (1) go fully systemd
> > That would mean to get rid of bind's -t option completely but use
> > systemd's RootDirectory directive instead. I have not tried this but I
> > think that the bind community might be reluctant to support a setup like
> > that. As an advantage, I could use the BindReadOnlyPaths directive to
> > directly manage the necessary bind mount to make the notify socket
> > accessible.
>
> I'd generally advise this, as it means you can drop caps like
> CAP_SYS_ADMIN and so on right away, i.e. all the stuff that chroot()
> means.

That would, however, mean having the bind binary and all libraries in
the chroot as well, which means either a lot of copying or a lot of bind
mounting into the chroot, introducing a number of challenges regarding
updates etc. Using bind's -t option is charming here because it just
needs configuration, persistent data and a few auxiliary files in the
chroot. This has become harder nowadays since bind loads some libraries
with dlopen() after the chroot, so at least those .so files need to be
in the chroot.

> You don't need to explicitly mount the notify socket if you use this
> btw, systemd will do this for you automatically if you combined
> RootDirectory= and Type=notify.

That is nice to know.

> > (2) try to preserve the classic setup
> > That would probably mean having a
> > /etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
> > | [Mount]
> > | What=/run/systemd
> > | Where=/var/local/bind/run/systemd
> > | Type=none
> > | Options=bind
> > |
> > | [Install]
> > | WantedBy=bind9.service
> > and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
> > bind9.service.
>
> It should suffice bind mounting just the notify socket, not the full
> dir.

Is it intended behavior that an empty file is left at the "mount point"
(what Where= points to) after the unit was stopped?

> You could also use a hybrid approach: let systemd bind mount this for
> your service (and thus set up a minimal namespaced env for your
> service, but with the root where it normally is), and then still let
> bind do its own chroot() thing inside of it).

I do not quite understand exactly what that means, how would I do that?
What is "this" in the "mount this" part of sentence?

> Generally speaking: it's typically a better idea to rely on proper fs
> mount namespacing (i.e. decoupling mount tables of services and host)
> rather than plain chroot() (where they aren't decoupled). If bind only
> does chroot(), then I think using systemd's namespacing is the much
> better choice.

Where would I read up on systemd namespacing? Are you referring to what
is explained in the "Sandboxing" chapter of systemd.exec(5), like
ProtectSystem, ReadWritePaths etc?

So I would basically set
ProtectSystem=strict
ReadWritePaths=/var/local/chroot/bind/pathlist (all paths that bind
needs to actually write to) and then finally
ExecStart=/path/to/bind -f -t /var/local/chroot/bind ?

Is that what you mean?

If I set ProtectHome=yes, how do I give the user that bind runs as
access to its homedir? Is ReadWritePaths= the solution?

> > This works as intended when I start up bind9, but when stopping the name
> > daemon, the bind mount still lingers around. I have not fully understood
> > the necessary systemd magic to have var-local-bind-run-systemd.mount
> > stopped whenever bind9.service stops. How would I do that?
>
> You can do Wants= from bind to the mount unit. And then do
> StopWhenUnneeded= in the mount unit, to release it when not needed.

StopWhenUnneeded was what I needed. So I currently have:

[7/5031]mh@drop:~ $ sudo systemctl cat named
# /lib/systemd/system/named.service
[Unit]
Description=BIND Domain Name Server
Documentation=man:named(8)
After=network.target
Wants=nss-lookup.target
Before=nss-lookup.target
RequiresMountsFor=/var/local/chroot/bind/run/systemd

[Service]
Type=notify
EnvironmentFile=-/etc/default/named
ExecStart=/usr/sbin/named -f $OPTIONS
ExecReload=/usr/sbin/rndc reload
ExecStop=/usr/sbin/rndc stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
Alias=bind9.service
[8/5030]mh@drop:~ $

and

1 [9/5031]mh@drop:~ $ sudo systemctl cat var-local-chroot-bind-run-systemd.mount
# /etc/systemd/system/var-local-chroot-bind-run-systemd.mount
[Unit]
StopWhenUnneeded=true

[Mount]
What=/run/systemd
Where=/var/local/chroot/bind/run/systemd
Type=none
Options=bind
[10/5032]mh@drop:~ $

(test system, I don't usually edit files in /lib/systemd, I know about
the override mechanisms).

Again, thanks for helping, that is highly appreciated.

Greetings
Marc

--
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Phone: *49 6224
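
For reference, a minimal sketch of the sandboxing settings asked about above, written as a drop-in. The drop-in file name and the two writable paths are hypothetical placeholders; the real list depends on where named writes inside the chroot. Per systemd.exec(5), ProtectHome= covers /home, /root and /run/user, so a home directory such as /var/cache/bind is not affected by it.

```ini
# /etc/systemd/system/named.service.d/sandbox.conf  (hypothetical file name)
[Service]
# Mount the file system hierarchy read-only for this service (except /dev, /proc and /sys)
ProtectSystem=strict
# Re-open only what named must write to (placeholder paths, adjust to the actual chroot layout)
ReadWritePaths=/var/local/chroot/bind/var/cache/bind /var/local/chroot/bind/run
# Hide /home, /root and /run/user from the service
ProtectHome=yes
```

The named command line with -f and -t stays in ExecStart= as before; these settings only restrict what the service can see and write before and after it chroots itself.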
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
I would not recommend running a custom chroot to anyone who has SELinux
or a similar security technology enabled. We still offer the bind-chroot
subpackage, which ships a prepared named-chroot.service for doing just
that. But SELinux provides better enforcement without complicating the
deployment and usage of named. I respectfully disagree that a chroot is
still the suggested setup.

Also, BIND9 is full of assertions ensuring that unexpected code paths
are reported. This defensive coding style makes it difficult to succeed
with a remote code execution attack. I have been a maintainer of BIND
for 6 years, and I am not aware of any successful remote code execution
in the last decade. Maybe not ever.

I think the more important protection you can deploy is simple:

Restart=on-abnormal

I think good enough systemd checks are a sufficient replacement for
custom-tailored chroots.

Cheers,
Petr

On 7/4/23 08:40, Marc Haber wrote:
> On Mon, Jul 03, 2023 at 11:21:22PM +0200, Silvio Knizek wrote:
> > why is it suggested to run `named` within its own chroot? For security
> > reasons? This can be achieved much easier with systemd native options.
>
> That feature is two decades older than systemd, and name server
> operators are darn conservative.
>
> Greetings
> Marc

--
Petr Menšík
Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
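
For reference, a minimal sketch of that setting as a drop-in. The file name and the RestartSec= value are assumptions added for illustration and do not come from the thread. Per systemd.service(5), Restart=on-abnormal restarts the service after it was terminated by an unclean signal (for example SIGABRT, which is what abort() raises), after a timeout, or after a watchdog expiry, but not after a clean exit or a plain non-zero exit code.

```ini
# /etc/systemd/system/named.service.d/restart.conf  (hypothetical file name)
[Service]
# Restart after crashes, timeouts and watchdog expiry, but not after regular exits
Restart=on-abnormal
# Illustrative pause between restart attempts; not from the thread
RestartSec=5s
```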
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
On Mon, 03.07.23 20:52, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> (1) go fully systemd
> That would mean to get rid of bind's -t option completely but use
> systemd's RootDirectory directive instead. I have not tried this but I
> think that the bind community might be reluctant to support a setup like
> that. As an advantage, I could use the BindReadOnlyPaths directive to
> directly manage the necessary bind mount to make the notify socket
> accessible.

I'd generally advise this, as it means you can drop caps like
CAP_SYS_ADMIN and so on right away, i.e. all the stuff that chroot()
means.

You don't need to explicitly mount the notify socket if you use this
btw, systemd will do this for you automatically if you combined
RootDirectory= and Type=notify.

> (2) try to preserve the classic setup
> That would probably mean having a
> /etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
> | [Mount]
> | What=/run/systemd
> | Where=/var/local/bind/run/systemd
> | Type=none
> | Options=bind
> |
> | [Install]
> | WantedBy=bind9.service
> and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
> bind9.service.

It should suffice bind mounting just the notify socket, not the full
dir.

You could also use a hybrid approach: let systemd bind mount this for
your service (and thus set up a minimal namespaced env for your
service, but with the root where it normally is), and then still let
bind do its own chroot() thing inside of it).

Generally speaking: it's typically a better idea to rely on proper fs
mount namespacing (i.e. decoupling mount tables of services and host)
rather than plain chroot() (where they aren't decoupled). If bind only
does chroot(), then I think using systemd's namespacing is the much
better choice.

> This works as intended when I start up bind9, but when stopping the name
> daemon, the bind mount still lingers around. I have not fully understood
> the necessary systemd magic to have var-local-bind-run-systemd.mount
> stopped whenever bind9.service stops. How would I do that?

You can do Wants= from bind to the mount unit. And then do
StopWhenUnneeded= in the mount unit, to release it when not needed.

So, I personally would always lock things down with systemd mechanisms
(i.e. systemd-analyze security will help you big time). If the daemon
in question then does further lockdown, that's great (as sometimes a
daemon might need privs during startup but not later), but generally
systemd should be better at locking things down, given the seccomp
stuff and all that other stuff it nowadays does.

Lennart

--
Lennart Poettering, Berlin
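
For reference, the Wants= plus StopWhenUnneeded= suggestion could be sketched as the two files below, using the mount unit name and paths from the question. The drop-in file name is hypothetical, and the After= line is an added assumption so that the mount is in place before named starts.

```ini
# /etc/systemd/system/bind9.service.d/notify-mount.conf  (hypothetical file name)
[Unit]
# Pull in the mount unit whenever the service is started
Wants=var-local-bind-run-systemd.mount
# Assumption: order the service after the mount
After=var-local-bind-run-systemd.mount

# /etc/systemd/system/var-local-bind-run-systemd.mount
[Unit]
# Unmount again once no active unit wants this mount any more
StopWhenUnneeded=true

[Mount]
What=/run/systemd
Where=/var/local/bind/run/systemd
Type=none
Options=bind
```

After a `systemctl daemon-reload`, running `systemd-analyze security` against the service unit lists which sandboxing options are still unset.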
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
On Mon, Jul 03, 2023 at 11:21:22PM +0200, Silvio Knizek wrote:
> why is it suggested to run `named` within its own chroot? For security
> reasons? This can be achieved much easier with systemd native options.

That feature is two decades older than systemd, and name server
operators are darn conservative.

Greetings
Marc

--
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Phone: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
Hi Marc,

why is it suggested to run `named` within its own chroot? For security
reasons? This can be achieved much more easily with systemd native options.

Something like `/etc/systemd/system/named.service`

```ini
[Unit]
Description=Internet domain name server
After=network.target

[Service]
Type=notify
User=named
DynamicUser=true
ExecStart=/usr/bin/named -f -c /etc/named/named.conf
ExecReload=/usr/bin/kill -HUP $MAINPID
# Deny execution everywhere, then allow the two binaries plus the library
# directories, which must stay executable for the dynamic loader and shared
# libraries (compare the allow-list example in systemd.exec(5))
NoExecPaths=/
ExecPaths=/usr/bin/named /usr/bin/kill /usr/lib -/usr/lib64
# Allow binding to port 53 without running as root
AmbientCapabilities=CAP_NET_BIND_SERVICE
ProtectSystem=full
ProtectHome=yes
# %p expands to the unit name without its suffix, here "named"
RuntimeDirectory=%p
StateDirectory=%p
CacheDirectory=%p
LogsDirectory=%p
ConfigurationDirectory=%p

[Install]
WantedBy=multi-user.target
```

Make sure `directory` in `/etc/named/named.conf` points to
`/var/lib/named`.
Further security considerations may apply. Testing is necessary.

BR
Silvio
[systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
Hi,

this is a user-level question from someone who wants to make use of
systemd but has not quite grown the gut feeling about which way is the
right way to go.

I am running bind 9 on more than a handful of systems providing name
services as recursive and/or authoritative name servers. As has been
recommended for two decades, I run bind in a chroot, using its own
feature to chroot itself after starting up (-t /path/to/chroot).

In Debian bookworm, the systemd units that come with Debian's bind9
package have recently changed from Type=simple to Type=notify. Combined
with named -t, this means that systemd will never notice that the name
daemon has correctly started up unless systemd's notify socket is also
reachable in the chroot. This in turn means that bind is continuously
restarted by systemd.

As a quick fix, I issue
mount --bind /run/systemd /path/to/chroot/run/systemd
manually.

I am currently wondering which is the preferred way to achieve this
more cleanly:

(1) go fully systemd
That would mean to get rid of bind's -t option completely but use
systemd's RootDirectory directive instead. I have not tried this but I
think that the bind community might be reluctant to support a setup like
that. As an advantage, I could use the BindReadOnlyPaths directive to
directly manage the necessary bind mount to make the notify socket
accessible.

(2) try to preserve the classic setup
That would probably mean having a
/etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
| [Mount]
| What=/run/systemd
| Where=/var/local/bind/run/systemd
| Type=none
| Options=bind
|
| [Install]
| WantedBy=bind9.service
and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
bind9.service.

This works as intended when I start up bind9, but when stopping the name
daemon, the bind mount still lingers around. I have not fully understood
the necessary systemd magic to have var-local-bind-run-systemd.mount
stopped whenever bind9.service stops. How would I do that?

How would you solve this issue? Method (1), Method (2), or one that I
didn't think of yet?

Greetings
Marc

--
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Phone: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
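
For reference, option (1) might look roughly like the following untested drop-in sketch. The root path and the named command line are placeholders, and the approach assumes that the named binary, its libraries and its configuration are all available inside that root. Per systemd.exec(5), when RootDirectory= is combined with notification access (as with Type=notify), the notification socket is made available inside the root automatically.

```ini
# Drop-in sketch for option (1); file name and paths are placeholders
[Service]
Type=notify
# systemd changes the root before executing named, so named's own -t option is not used
RootDirectory=/var/local/chroot/bind
# Clear the packaged ExecStart= before setting a new one
ExecStart=
# This path is resolved inside RootDirectory=
ExecStart=/usr/sbin/named -f
```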