Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-10 Thread Marc Haber
On Mon, Jul 10, 2023 at 12:11:01PM +0200, Lennart Poettering wrote:
> ProtectHome= protects /home/, nothing else. Hence you can use it, and
> it should not collide with bind's use of the home dir, because it's
> not in /home.
> 
> Actually, correcting myself: use BindReadOnlyPaths= for this. Clients
> can still connect to sockets on a read-only fs just fine, but you take
> away the privs to chmod() or chown() the inode that way. So you get
> another line of defense that way.

Thank you, all my questions are answered for the time being. Your help
is appreciated.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mail address in header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-10 Thread Lennart Poettering
On Mo, 10.07.23 11:37, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> Hi Lennart,
>
> On Mon, Jul 10, 2023 at 10:28:52AM +0200, Lennart Poettering wrote:
> > On So, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
> >
> > > > It should suffice bind mounting just the notify socket, not the full
> > > > dir.
> > >
> > > Is it intended behavior that an empty file is left at the "mount point"
> > > (what Where= points to) after the unit was stopped?
> >
> > We need an inode we can overmount, and given that this is in /run/
> > (hence inherently ephemeral) and a fixed path it shouldn't matter.
>
> So this is intended. Good to know. I stumbled upon that.
>
> > > If I set ProtectHome=yes, how do I give the user that bind runs as
> > > access to its homedir? Is ReadWritePaths= the solution?
> >
> > ProtectHome= is about /home/ only, i.e. regular ("human") users, not
> > about system users (i.e. uid < 1K). Your bind should *not* run as
> > regular user, but as a system user of course, hence ProtectHome= is
> > something you can just set, and don't need to be concerned about the
> > system user's home dir.
>
> In Debian, bind runs as user bind, which gets created as a system user
> (uid < 1K, yes), and with /var/cache/bind as its home directory, which
> is the directory where, for example, slave zone files get written to.
> So, the running process needs to be able to access its "home directory"
> during its operation even after dropping root.

ProtectHome= protects /home/, nothing else. Hence you can use it, and
it should not collide with bind's use of the home dir, because it's
not in /home.

>
> > > [Mount]
> > > What=/run/systemd
> > > Where=/var/local/chroot/bind/run/systemd
> > > Type=none
> > > Options=bind
> >
> > Note that /run/ should always be a tmpfs, hence unless you mount a
> > tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.
> >
> > Instead of this .mount unit, consider using in the .service file:
> >
> > TemporaryFileSystem=/var/local/chroot/bind/run
> > BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify
>
> Ah, of course. I obviously didn't read the BindPaths= documentation
> thoroughly enough. That is of course way better. Thanks for helping me
> to read the docs.

Actually, correcting myself: use BindReadOnlyPaths= for this. Clients
can still connect to sockets on a read-only fs just fine, but you take
away the privs to chmod() or chown() the inode that way. So you get
another line of defense that way.
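
A minimal sketch of how this could look as a drop-in, assuming bind chroots
itself into /var/local/chroot/bind via -t. The drop-in file name is
hypothetical; systemd.exec(5) spells the read-only variant
BindReadOnlyPaths=. With this in place the separate .mount unit and the
RequiresMountsFor= line should no longer be needed.

```ini
# /etc/systemd/system/named.service.d/notify-socket.conf (hypothetical name)
[Service]
# Give the chroot's /run its own tmpfs, visible only in this service's
# mount namespace.
TemporaryFileSystem=/var/local/chroot/bind/run
# Expose only the notify socket, read-only, at the path named sees after
# it has chrooted itself.
BindReadOnlyPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify
```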

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-10 Thread Marc Haber
Hi Lennart,

On Mon, Jul 10, 2023 at 10:28:52AM +0200, Lennart Poettering wrote:
> On So, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
> 
> > > It should suffice bind mounting just the notify socket, not the full
> > > dir.
> >
> > Is it intended behavior that an empty file is left at the "mount point"
> > (what Where= points to) after the unit was stopped?
> 
> We need an inode we can overmount, and given that this is in /run/
> (hence inherently ephemeral) and a fixed path it shouldn't matter.

So this is intended. Good to know. I stumbled upon that.

> > If I set ProtectHome=yes, how do I give the user that bind runs as
> > access to its homedir? Is ReadWritePaths= the solution?
> 
> ProtectHome= is about /home/ only, i.e. regular ("human") users, not
> about system users (i.e. uid < 1K). Your bind should *not* run as
> regular user, but as a system user of course, hence ProtectHome= is
> something you can just set, and don't need to be concerned about the
> system user's home dir.

In Debian, bind runs as user bind, which gets created as a system user
(uid < 1K, yes), and with /var/cache/bind as its home directory, which
is the directory where, for example, slave zone files get written to.
So, the running process needs to be able to access its "home directory"
during its operation even after dropping root.

> > [Mount]
> > What=/run/systemd
> > Where=/var/local/chroot/bind/run/systemd
> > Type=none
> > Options=bind
> 
> Note that /run/ should always be a tmpfs, hence unless you mount a
> tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.
> 
> Instead of this .mount unit, consider using in the .service file:
> 
> TemporaryFileSystem=/var/local/chroot/bind/run
> BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify

Ah, of course. I obviously didn't read the BindPaths= documentation
thoroughly enough. That is of course way better. Thanks for helping me
to read the docs.

Greetings
Marc



Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-10 Thread Lennart Poettering
On So, 09.07.23 20:14, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> > It should suffice bind mounting just the notify socket, not the full
> > dir.
>
> Is it intended behavior that an empty file is left at the "mount point"
> (what Where= points to) after the unit was stopped?

We need an inode we can overmount, and given that this is in /run/
(hence inherently ephemeral) and a fixed path it shouldn't matter.

> > You could also use a hybrid approach: let systemd bind mount this for
> > your service (and thus set up a minimal namespaced env for your
> > service, but with the root where it normally is), and then still let
> > bind do its own chroot() thing inside of it.
>
> I do not quite understand exactly what that means, how would I do that?
> What is "this" in the "mount this" part of sentence?

use BindPaths= to mount the notify socket to the right place.

> > Generally speaking: it's typically a better idea to rely on proper fs
> > mount namespacing (i.e. decoupling mount tables of services and host)
> > rather than plain chroot() (where they aren't decoupled). If bind only
> > does chroot(), then I think using systemd's namespacing is the much
> > better choice.
>
> Where would I read up on systemd namespacing? Are you referring to what
> is explained in the "Sandboxing" chapter of systemd.exec(5), like
> ProtectSystem, ReadWritePaths etc?

yes.

> So I would basically set
> ProtectSystem=strict
> ReadWritePaths=/var/local/chroot/bind/pathlist (all paths that bind
> needs to actually write to) and then finally
> ExecStart=/path/to/bind -f -t /var/local/chroot/bind ?

I don't know bind, so can't judge on the command line, but otherwise, yeah.

> If I set ProtectHome=yes, how do I give the user that bind runs as
> access to its homedir? Is ReadWritePaths= the solution?

ProtectHome= is about /home/ only, i.e. regular ("human") users, not
about system users (i.e. uid < 1K). Your bind should *not* run as
regular user, but as a system user of course, hence ProtectHome= is
something you can just set, and don't need to be concerned about the
system user's home dir.
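
A sketch of how these settings could be combined, assuming the Debian layout
mentioned elsewhere in this thread (system user bind with home
/var/cache/bind) and a chroot at /var/local/chroot/bind. The ReadWritePaths=
entry and the named options are illustrative assumptions, not a complete or
tested list.

```ini
[Service]
# Mount the whole file system hierarchy read-only for this service
# (except /dev, /proc and /sys).
ProtectSystem=strict
# Hides /home, /root and /run/user; bind's home lives elsewhere.
ProtectHome=yes
# Re-open only what named has to write to, e.g. its cache directory
# inside the chroot (hypothetical path).
ReadWritePaths=/var/local/chroot/bind/var/cache/bind
ExecStart=/usr/sbin/named -f -u bind -t /var/local/chroot/bind
```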

> > > This works as intended when I start up bind9, but when stopping the name
> > > daemon, the bind mount still lingers around. I have not fully understood
> > > the necessary systemd magic to have var-local-bind-run-systemd.mount
> > > stopped whenever bind9.service stops. How would I do that?
> >
> > You can do Wants= from bind to the mount unit. And then do
> > StopWhenUnneeded= in the mount unit, to release it when not needed.
>
> StopWhenUnneeded was what I needed. So I currently have:
>
> [7/5031]mh@drop:~ $ sudo systemctl cat named
> # /lib/systemd/system/named.service
> [Unit]
> Description=BIND Domain Name Server
> Documentation=man:named(8)
> After=network.target
> Wants=nss-lookup.target
> Before=nss-lookup.target
> RequiresMountsFor=/var/local/chroot/bind/run/systemd
>
> [Service]
> Type=notify
> EnvironmentFile=-/etc/default/named
> ExecStart=/usr/sbin/named -f $OPTIONS
> ExecReload=/usr/sbin/rndc reload
> ExecStop=/usr/sbin/rndc stop

bind doesn't react to SIGTERM correctly on its own?

> Restart=on-failure
>
> [Install]
> WantedBy=multi-user.target
> Alias=bind9.service
> [8/5030]mh@drop:~ $
>
> and
>
> 1 [9/5031]mh@drop:~ $ sudo systemctl cat var-local-chroot-bind-run-systemd.mount
> # /etc/systemd/system/var-local-chroot-bind-run-systemd.mount
> [Unit]
> StopWhenUnneeded=true
>
> [Mount]
> What=/run/systemd
> Where=/var/local/chroot/bind/run/systemd
> Type=none
> Options=bind

Note that /run/ should always be a tmpfs, hence unless you mount a
tmpfs to /var/local/chroot/bind/run/ first, the above is a bit ugly.

Instead of this .mount unit, consider using in the .service file:

TemporaryFileSystem=/var/local/chroot/bind/run
BindPaths=/run/systemd/notify:/var/local/chroot/bind/run/systemd/notify

(Under the assumption bind chroots itself into /var/local/chroot/bind)

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-09 Thread Marc Haber
Hi Lennart,

thanks for this helpful answer.

On Tue, Jul 04, 2023 at 10:37:55AM +0200, Lennart Poettering wrote:
> On Mo, 03.07.23 20:52, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:
> > (1) go fully systemd
> > That would mean getting rid of bind's -t option completely and using
> > systemd's RootDirectory= directive instead. I have not tried this, but I
> > think that the bind community might be reluctant to support a setup like
> > that. On the plus side, I could use the BindReadOnlyPaths= directive to
> > directly manage the necessary bind mount to make the notify socket
> > accessible.
> 
> I'd generally advise this, as it means you can drop caps like
> CAP_SYS_ADMIN and so on right away, i.e. all the stuff that chroot()
> means.

That would, however, mean having the bind binary and all libraries in
the chroot as well, which means either a lot of copying or a lot of bind
mounting into the chroot, introducing a number of challenges regarding
updates etc.

Using bind's -t option is charming here because it just needs
configuration, persistent data and a few auxiliary files in the chroot.
This has become harder nowadays since bind loads some libraries with
dlopen() after the chroot, so at least those .so files need to be in the
chroot.

> You don't need to explicitly mount the notify socket if you use this
> btw, systemd will do this for you automatically if you combine
> RootDirectory= and Type=notify.

That is nice to know.

> > (2) try to preserve the classic setup
> > That would probably mean having a
> > /etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
> > | [Mount]
> > | What=/run/systemd
> > | Where=/var/local/bind/run/systemd
> > | Type=none
> > | Options=bind
> > |
> > | [Install]
> > | WantedBy=bind9.service
> > and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
> > bind9.service.
> 
> It should suffice bind mounting just the notify socket, not the full
> dir.

Is it intended behavior that an empty file is left at the "mount point"
(what Where= points to) after the unit was stopped?

> You could also use a hybrid approach: let systemd bind mount this for
> your service (and thus set up a minimal namespaced env for your
> service, but with the root where it normally is), and then still let
> bind do its own chroot() thing inside of it.

I do not quite understand exactly what that means, how would I do that?
What is "this" in the "mount this" part of sentence?

> Generally speaking: it's typically a better idea to rely on proper fs
> mount namespacing (i.e. decoupling mount tables of services and host)
> rather than plain chroot() (where they aren't decoupled). If bind only
> does chroot(), then I think using systemd's namespacing is the much
> better choice.

Where would I read up on systemd namespacing? Are you referring to what
is explained in the "Sandboxing" chapter of systemd.exec(5), like
ProtectSystem, ReadWritePaths etc?

So I would basically set
ProtectSystem=strict
ReadWritePaths=/var/local/chroot/bind/pathlist (all paths that bind
needs to actually write to) and then finally
ExecStart=/path/to/bind -f -t /var/local/chroot/bind ?

Is that what you mean?

If I set ProtectHome=yes, how do I give the user that bind runs as
access to its homedir? Is ReadWritePaths= the solution?

> > This works as intended when I start up bind9, but when stopping the name
> > daemon, the bind mount still lingers around. I have not fully understood
> > the necessary systemd magic to have var-local-bind-run-systemd.mount
> > stopped whenever bind9.service stops. How would I do that?
> 
> You can do Wants= from bind to the mount unit. And then do
> StopWhenUnneeded= in the mount unit, to release it when not needed.

StopWhenUnneeded was what I needed. So I currently have:

[7/5031]mh@drop:~ $ sudo systemctl cat named
# /lib/systemd/system/named.service
[Unit]
Description=BIND Domain Name Server
Documentation=man:named(8)
After=network.target
Wants=nss-lookup.target
Before=nss-lookup.target
RequiresMountsFor=/var/local/chroot/bind/run/systemd

[Service]
Type=notify
EnvironmentFile=-/etc/default/named
ExecStart=/usr/sbin/named -f $OPTIONS
ExecReload=/usr/sbin/rndc reload
ExecStop=/usr/sbin/rndc stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
Alias=bind9.service
[8/5030]mh@drop:~ $ 

and

1 [9/5031]mh@drop:~ $ sudo systemctl cat var-local-chroot-bind-run-systemd.mount
# /etc/systemd/system/var-local-chroot-bind-run-systemd.mount
[Unit]
StopWhenUnneeded=true

[Mount]
What=/run/systemd
Where=/var/local/chroot/bind/run/systemd
Type=none
Options=bind
[10/5032]mh@drop:~ $ 

(test system, I don't usually edit files in /lib/systemd, I know about the
override mechanisms).

Again, thanks for helping, that is highly appreciated.

Greetings
Marc


Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-05 Thread Petr Menšík
I would not recommend using a custom chroot to anyone who has enabled
SELinux or a similar security technology.


We still offer the bind-chroot subpackage, which ships a prepared
named-chroot.service for doing just that. But SELinux provides better
enforcement without complicating the deployment and usage of named. I
respectfully disagree that a chroot is still the suggested setup.


Also, BIND9 is full of assertions ensuring that unexpected code paths are
reported. This defensive coding style makes it difficult to succeed with
a remote code execution attack. I have been a maintainer of BIND for
6 years, and I am not aware of any successful remote code execution in
the last decade. Maybe not ever.


I think the more important protection you can deploy is simple:

Restart=on-abnormal

I think good enough systemd checks are a sufficient replacement for
custom-tailored chroots.


Cheers,
Petr

On 7/4/23 08:40, Marc Haber wrote:
> On Mon, Jul 03, 2023 at 11:21:22PM +0200, Silvio Knizek wrote:
> > why is it suggested to run `named` within its own chroot? For security
> > reasons? This can be achieved much easier with systemd native options.
>
> That feature is two decades older than systemd, and name server
> operators are darn conservative.
>
> Greetings
> Marc


--
Petr Menšík
Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB



Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-04 Thread Lennart Poettering
On Mo, 03.07.23 20:52, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> (1) go fully systemd
> That would mean getting rid of bind's -t option completely and using
> systemd's RootDirectory= directive instead. I have not tried this, but I
> think that the bind community might be reluctant to support a setup like
> that. On the plus side, I could use the BindReadOnlyPaths= directive to
> directly manage the necessary bind mount to make the notify socket
> accessible.

I'd generally advise this, as it means you can drop caps like
CAP_SYS_ADMIN and so on right away, i.e. all the stuff that chroot()
means.

You don't need to explicitly mount the notify socket if you use this
btw, systemd will do this for you automatically if you combine
RootDirectory= and Type=notify.
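
A minimal sketch of that variant, assuming the chroot tree lives in
/var/local/chroot/bind and already contains the named binary, its libraries
and its configuration. The path is an illustrative assumption and this is
not a tested setup.

```ini
[Service]
Type=notify
# systemd sets up the root itself, so named is started without -t.
RootDirectory=/var/local/chroot/bind
# The binary is looked up inside RootDirectory=, so it must exist there.
ExecStart=/usr/sbin/named -f
```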


> (2) try to preserve the classic setup
> That would probably mean having a
> /etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
> | [Mount]
> | What=/run/systemd
> | Where=/var/local/bind/run/systemd
> | Type=none
> | Options=bind
> |
> | [Install]
> | WantedBy=bind9.service
> and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
> bind9.service.

It should suffice bind mounting just the notify socket, not the full
dir.

You could also use a hybrid approach: let systemd bind mount this for
your service (and thus set up a minimal namespaced env for your
service, but with the root where it normally is), and then still let
bind do its own chroot() thing inside of it.

Generally speaking: it's typically a better idea to rely on proper fs
mount namespacing (i.e. decoupling mount tables of services and host)
rather than plain chroot() (where they aren't decoupled). If bind only
does chroot(), then I think using systemd's namespacing is the much
better choice.

> This works as intended when I start up bind9, but when stopping the name
> daemon, the bind mount still lingers around. I have not fully understood
> the necessary systemd magic to have var-local-bind-run-systemd.mount
> stopped whenever bind9.service stops. How would I do that?

You can do Wants= from bind to the mount unit. And then do
StopWhenUnneeded= in the mount unit, to release it when not needed.
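
As a sketch of that wiring, assuming the unit names from the question
(bind9.service and var-local-bind-run-systemd.mount). The drop-in file name
is hypothetical, and the After= line is an addition here, since Wants= on
its own does not order the two units.

```ini
# /etc/systemd/system/bind9.service.d/notify-mount.conf (hypothetical drop-in)
[Unit]
# Pull in the mount when the service starts, and start the service only
# after the mount is in place.
Wants=var-local-bind-run-systemd.mount
After=var-local-bind-run-systemd.mount

# /etc/systemd/system/var-local-bind-run-systemd.mount (added to that unit)
[Unit]
# Stop (unmount) again once no active unit wants or requires this mount.
StopWhenUnneeded=true
```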

So, I personally would always lock things down with systemd mechanisms
(i.e. systemd-analyze security will help you big time). If the daemon
in question then does further lockdown, that's great (as sometimes a
daemon might need privs during startup but not later), but generally
systemd should be better at locking things down, given the seccomp
stuff and all that other stuff it nowadays does.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-04 Thread Marc Haber
On Mon, Jul 03, 2023 at 11:21:22PM +0200, Silvio Knizek wrote:
> why is it suggested to run `named` within its own chroot? For security 
> reasons? This can be achieved much easier with systemd native options.

That feature is two decades older than systemd, and name server
operators are darn conservative.

Greetings
Marc



Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-03 Thread Silvio Knizek
Hi Marc,

why is it suggested to run `named` within its own chroot? For security reasons? 
This can be achieved much easier with systemd native options.

Something like

`/etc/systemd/system/named.service`

```ini
[Unit]
Description=Internet domain name server
After=network.target

[Service]
Type=notify
User=named
DynamicUser=true
ExecStart=/usr/bin/named -f -c /etc/named/named.conf
ExecReload=/usr/bin/kill -HUP $MAINPID
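# Deny execution everywhere except the two binaries and the library
# directories whose shared objects they need to map.
NoExecPaths=/
ExecPaths=/usr/bin/named /usr/bin/kill /usr/lib /usr/lib64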
AmbientCapabilities=CAP_NET_BIND_SERVICE
ProtectSystem=full
ProtectHome=yes
RuntimeDirectory=%p
StateDirectory=%p
CacheDirectory=%p
LogsDirectory=%p
ConfigurationDirectory=%p

[Install]
WantedBy=multi-user.target
```

Make sure `directory` in `/etc/named/named.conf` points to `/var/lib/named`.

Further security considerations may apply. Testing is necessary.

BR  
Silvio


[systemd-devel] bind-mount of /run/systemd for chrooted bind9/named

2023-07-03 Thread Marc Haber
Hi,

this is a user-level question from someone who wants to make use of
systemd but has not quite grown the gut feeling about which way is the
right way to go.

I am running bind 9 on more than a handful of systems providing name
services as recursive and/or authoritative name servers. As has been
recommended for two decades, I run bind in a chroot, using its own
feature to chroot itself after starting up (-t /path/to/chroot).

In Debian bookworm, the systemd units that come with Debian's bind9
package have recently changed from Type=simple to Type=notify.

Combined with named -t, this means that systemd will never notice that
the name daemon has correctly started up unless systemd's notify socket
is also reachable in the chroot. This in turn means that bind is
continuously restarted by systemd. As a quick fix, I issue mount --bind
/run/systemd /path/to/chroot/run/systemd manually.

I am currently wondering which is the preferred way to achieve this
more cleanly:

(1) go fully systemd
That would mean getting rid of bind's -t option completely and using
systemd's RootDirectory= directive instead. I have not tried this, but I
think that the bind community might be reluctant to support a setup like
that. On the plus side, I could use the BindReadOnlyPaths= directive to
directly manage the necessary bind mount to make the notify socket
accessible.

(2) try to preserve the classic setup
That would probably mean having a
/etc/systemd/system/var-local-bind-run-systemd.mount with the contents:
| [Mount]
| What=/run/systemd
| Where=/var/local/bind/run/systemd
| Type=none
| Options=bind
| 
| [Install]
| WantedBy=bind9.service
and adding a RequiresMountsFor=/var/local/bind/run/systemd to the
bind9.service.

This works as intended when I start up bind9, but when stopping the name
daemon, the bind mount still lingers around. I have not fully understood
the necessary systemd magic to have var-local-bind-run-systemd.mount
stopped whenever bind9.service stops. How would I do that?

How would you solve this issue? Method (1), Method (2), or one that I
didn't think of yet?

Greetings
Marc
