Re: [systemd-devel] Question: as a user of systemd-homed --storage=luks how to change --ssh-authorized-keys= without asking root?

2024-07-15 Thread Lennart Poettering
On Di, 09.07.24 18:02, Laurent GUERBY (laur...@guerby.net) wrote:

> Hi,
>
> On a debian testing system (systemd 256.2-1) I created a user with:
>
> trixie# homectl create utest --storage=luks --ssh-authorized-keys="xxx"
>
> Then I used ssh to log in as the user
>
> ssh utest@trixie
>
> And it all worked as described here as new feature of systemd 256:
>
> https://mastodon.social/@pid_eins/112370336310304287
>
> My question is how is the user "utest" able to change its
> --ssh-authorized-keys? I tried:
>
> utest@trixie$ homectl update utest --ssh-authorized-keys="xxx"
> Assertion 'user_name' failed at src/home/homectl.c:237, function
> acquire_existing_password(). Aborting.
>
> So without success.
>
> Changing password worked as the user:
>
> utest@trixie$ homectl passwd
>
> And changing the utest --ssh-authorized-keys= as root worked too.
>
> Did I miss something?

We currently do not allow changing that without privileges. There's a PR
here to add what you are looking for:

https://github.com/systemd/systemd/pull/31153

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Scheduling 3 periodic jobs using systemd

2024-07-12 Thread Lennart Poettering
On Fr, 12.07.24 09:12, t.schnei...@disroot.org (t.schnei...@disroot.org) wrote:

> Hello,
> I have a backup job (using rsnapshot) that must be executed daily, weekly
> and monthly.
> Therefore I created these systemd-timers:

Why three timers? Just use a single one and use OnCalendar= multiple
times to define multiple trigger calendar times.

(If you really want multiple separate timer units, you can make them
activate the same service btw, via the Unit= setting.)
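
For illustration, a single timer with several OnCalendar= lines might
look like the sketch below. The unit name and the concrete times are
assumptions made up for this example, not taken from the original
setup. Note that every trigger starts the same service, so the service
itself would have to work out which rsnapshot level is due.

```ini
# backup.timer (hypothetical name); activates backup.service by default
[Unit]
Description=Run backups on several calendar schedules (illustrative)

[Timer]
# OnCalendar= may be given more than once; each line adds a trigger
OnCalendar=*-*-* 02:00:00
OnCalendar=Mon *-*-* 03:00:00
OnCalendar=*-*-01 04:00:00
# Only needed if the service name differs from the timer name:
#Unit=rsnapshot.service

[Install]
WantedBy=timers.target
```

Running systemd-analyze calendar 'Mon *-*-* 03:00:00' shows how an
expression is parsed and when it would elapse next.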

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about the killing spree during the transition from the initrd to the root file system.

2024-07-09 Thread Lennart Poettering
On Mo, 08.07.24 15:57, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> On Mon, Jul 08, 2024 at 01:16:56PM +0200, Lennart Poettering wrote:
> > On Do, 04.07.24 12:44, Demi Marie Obenour (d...@invisiblethingslab.com) 
> > wrote:
> >
> > > > No, these belong to your process, systemd couldn't really reach into
> > > > your processes to close them, even if it wanted to.
> > > >
> > > > But do note that any files you keep open or mapped at the moment of 
> > > > transition
> > > > will remain pinned in memory, and cannot be released by the
> > > > kernel. This means that even though during the tmpfs→host transition
> > > > we generally destroy the initrd's tmpfs' contents, the stuff you keep
> > > > pinned will stick around.
> > > >
> > > > Generally, only special purpose software should be left around that
> > > > way, if it is carefully written to handle this. For example it is not
> > > > allowed to dlopen() anything (and hence no NSS either! No
> > > > gethostbyname() or getpwnam() or so), because you'd otherwise end up
> > > > with a weird mix and match of shared libs from the initrd and the host.
> > >
> > > If one does need to e.g. do DNS lookups in such a process, what is the
> > > best way to do it?
> >
> > Well, simply don't write programs like that, of course.
> >
> > But if you really feel you must:
> >
> > If you need DNS, then do the lookups via your own statically linked
> > DNS lib maybe?
> >
> > You could talk to resolved's varlink or D-Bus interfaces too, but I
> > find this a bit icky, since you'll end up consuming services provided
> > by the OS on the root fs, while you should instead provide services to
> > that OS, but not consume them.
> >
> > If you want user/group name resolution: these are generally resources
> > managed by the host OS, hence you almost certainly are doing things
> > wrong if you want to resolve them from your initrd service. You could
> > talk to userdbd of course, via Varlink IPC, but the same applies as
> > above: it's a bit icky if you consume services provided by the OS, if
> > you are such a low-level daemon that must survive from initrd into
> > host.
> >
> > In many ways: if you run like this you should consider yourself
> > conceptually closer to kernelspace than to userspace. And hence, the
> > same way as kernelspace generally doesn't resolve users or hostnames
> > you shouldn't really either.
>
> What is the most common use-case for such daemons?  I thought that it
> was for network-attached root filesystems.  Such a daemon might well
> need to do DNS lookups.

As I said above, if you really can't avoid DNS, then do DNS, but do it
yourself, i.e. add your own DNS client, and do not use OS services for
this. i.e. no NSS that involves dlopen() on modules from the rootfs or
talks to IPC services of the OS.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about the killing spree during the transition from the initrd to the root file system.

2024-07-08 Thread Lennart Poettering
On Do, 04.07.24 21:48, Zheng Chuan (zhengch...@huawei.com) wrote:

> >> I have some processes in my initrd needed to be excluded from the killing 
> >> spree
> >> during switch-root and needed to continue to run in the root file system. 
> >> I read
> >> the ROOT_STORAGE_DAEMONS.md and the source code of killall.c, and I've 
> >> learned
> >> that there are methods to exclude the processes from the killing spree, 
> >> such as
> >> setting `@` to `argv[0][0]`.
> >>
> >> However, I'm not sure if this is without potential consequences. For 
> >> example, could
> >> it be that even though my processes survive, some resources that the 
> >> processes
> >> depends on are discarded after switch-root, such as file
> >> descriptors?
> >
> > No, these belong to your process, systemd couldn't really reach into
> > your processes to close them, even if it wanted to.
> >
> > But do note that any files you keep open or mapped at the moment of 
> > transition
> > will remain pinned in memory, and cannot be released by the
> > kernel. This means that even though during the tmpfs→host transition
> > we generally destroy the initrd's tmpfs' contents, the stuff you keep
> > pinned will stick around.
> >
> Yes, tmpfs will release all memory and may leave the fd(deleted) which belong 
> to
> the remaining process, what's more, tmpfs could not do more things like setcap.
> To solve this, we want to change the tmpfs into initramfs and keep the memory 
> with some
> memory waste, is that OK?

Well, keeping the initrd's tmpfs populated sucks of course, we
generally avoid that. But I don't know your scenario. Whether it's OK
to pin resources from the initrd tmpfs forever you need to figure out
in your case.

> > Generally, only special purpose software should be left around that
> > way, if it is carefully written to handle this. For example it is not
> > allowed to dlopen() anything (and hence no NSS either! No
> > gethostbyname() or getpwnam() or so), because you'd otherwise end up
> > with a weird mix and match of shared libs from the initrd and the host.
> >
> Yes,we keep all software version including shared libs as same as the host.
> In our scenario, we want to do kexec from old os to the newer one, and we 
> want to
> pull up the process we cared as soon as possible before we do switch-root and 
> other slow
> stuff like scanning disks and probing drivers, etc.

Are you aware of systemd's "soft reboot" logic?

It allows a very quick way to reboot, without replacing the
kernel. It even allows you to keep specific services around during
reboot.

I know of companies that deploy this together with kernel live
patching to make OS updates work without downtimes.

That said, in your initrd scenario: consider starting up your service
in the initrd, getting things running, and then terminating before the
initrd transition while passing your open fds to PID 1 via the
"fdstore" logic. Then, after the transition start the service anew,
and you'll get the fds passed back in, so that the service can
continue doing what it needs to do, without any sockets and similar
resources being released in between.
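
A rough sketch of the service side of that approach follows. The unit
name and binary path are hypothetical, and the same unit would have to
exist in both the initrd and the host. The daemon itself would push its
fds with sd_notify()-style messages (FDSTORE=1, optionally FDNAME=...)
before exiting, and pick them up again via $LISTEN_FDS and
$LISTEN_FDNAMES on the next start. FileDescriptorStorePreserve= is
assumed to be available here; it needs a reasonably recent systemd.

```ini
# mydaemon.service (hypothetical unit and path)
[Service]
ExecStart=/usr/lib/mydaemon/mydaemon
# Accept notification messages (including FDSTORE=1) from the main process
NotifyAccess=main
# Let the service park up to 16 fds with the service manager
FileDescriptorStoreMax=16
# Keep the stored fds around while the unit is stopped
FileDescriptorStorePreserve=yes
```

Whether this fits depends on the daemon being able to resume purely
from the fds it gets passed back in.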

> >> Question 2:
> >> If my processes are excluded from the killing spree during switch-root and 
> >> continue to run in
> >> the root file system, what are the potential consequences?
> >
> > You are running processes from a different context, pinning files
> > from an emptied file system.
> >
> Is that OK if
> i. we keep all initrd files by initramfs without releasing them
> ii. keep all software version as the same between initrd and host
> iii. reopen some files like logs in case of OOM
>
> Otherwise, I am not sure if it is OK under some safety feature like SELinux,
> we will test them
> later.

Well, I can't tell you what is OK in your specific scenario, but I
personally find arrangements like that icky.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about the killing spree during the transition from the initrd to the root file system.

2024-07-08 Thread Lennart Poettering
On Do, 04.07.24 12:44, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> > No, these belong to your process, systemd couldn't really reach into
> > your processes to close them, even if it wanted to.
> >
> > But do note that any files you keep open or mapped at the moment of 
> > transition
> > will remain pinned in memory, and cannot be released by the
> > kernel. This means that even though during the tmpfs→host transition
> > we generally destroy the initrd's tmpfs' contents, the stuff you keep
> > pinned will stick around.
> >
> > Generally, only special purpose software should be left around that
> > way, if it is carefully written to handle this. For example it is not
> > allowed to dlopen() anything (and hence no NSS either! No
> > gethostbyname() or getpwnam() or so), because you'd otherwise end up
> > with a weird mix and match of shared libs from the initrd and the host.
>
> If one does need to e.g. do DNS lookups in such a process, what is the
> best way to do it?

Well, simply don't write programs like that, of course.

But if you really feel you must:

If you need DNS, then do the lookups via your own statically linked
DNS lib maybe?

You could talk to resolved's varlink or D-Bus interfaces too, but I
find this a bit icky, since you'll end up consuming services provided
by the OS on the root fs, while you should instead provide services to
that OS, but not consume them.

If you want user/group name resolution: these are generally resources
managed by the host OS, hence you almost certainly are doing things
wrong if you want to resolve them from your initrd service. You could
talk to userdbd of course, via Varlink IPC, but the same applies as
above: it's a bit icky if you consume services provided by the OS, if
you are such a low-level daemon that must survive from initrd into
host.

In many ways: if you run like this you should consider yourself
conceptually closer to kernelspace than to userspace. And hence, the
same way as kernelspace generally doesn't resolve users or hostnames
you shouldn't really either.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] passing additional FDs to service

2024-07-08 Thread Lennart Poettering
On Fr, 05.07.24 16:19, Andrea Pappacoda (and...@pappacoda.it) wrote:

> Hi all!
>
> I'm writing a small FastCGI daemon which, in addition to the socket used
> to talk FastCGI to the web server, talks SMTP through another (inet)
> socket (as an SMTP client).
>
> The FastCGI socket is created by systemd with a .socket unit and passed
> to the service as an fd (which also enables socket activation), while
> the SMTP socket is opened and managed by the daemon itself.
>
> What I'm asking here is if there's a way to also pass the SMTP socket as
> a file descriptor to the daemon from systemd, so that the daemon doesn't
> need to manage sockets itself (as all it does is reading fds passed by
> the service manager) and can be further restricted with options like
> PrivateNetwork=yes.

Did I get this right: you want systemd to create an outgoing socket
for you, connect it to some IP service, and hand it in pre-connected?
How is that supposed to work given that IP is generally unreliable?
I.e. when you connect to some IP service it might fail, and you might
need to retry, but the socket systemd passed in to you cannot be
reused once it has failed.

That said, there's actually a TODO list item to add something like this, but
mostly with AF_UNIX (i.e. reliable) sockets in mind. And maybe this
could be used for per-connection service instances (following the logic
that it's OK if we let the whole incoming connection and its service
instance fail if the onward connection fails).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about the killing spree during the transition from the initrd to the root file system.

2024-07-04 Thread Lennart Poettering
On Do, 04.07.24 11:24, chenruyi (A) (chenru...@huawei.com) wrote:

> Hi,
>
> I have some processes in my initrd needed to be excluded from the killing 
> spree
> during switch-root and needed to continue to run in the root file system. I 
> read
> the ROOT_STORAGE_DAEMONS.md and the source code of killall.c, and I've learned
> that there are methods to exclude the processes from the killing spree, such 
> as
> setting `@` to `argv[0][0]`.
>
> However, I'm not sure if this is without potential consequences. For example, 
> could
> it be that even though my processes survive, some resources that the processes
> depends on are discarded after switch-root, such as file
> descriptors?

No, these belong to your process, systemd couldn't really reach into
your processes to close them, even if it wanted to.

But do note that any files you keep open or mapped at the moment of transition
will remain pinned in memory, and cannot be released by the
kernel. This means that even though during the tmpfs→host transition
we generally destroy the initrd's tmpfs' contents, the stuff you keep
pinned will stick around.

Generally, only special purpose software should be left around that
way, if it is carefully written to handle this. For example it is not
allowed to dlopen() anything (and hence no NSS either! No
gethostbyname() or getpwnam() or so), because you'd otherwise end up
with a weird mix and match of shared libs from the initrd and the host.

Hence, you really should know what you are doing. Otherwise it's
almost always a better idea to allow the daemon to terminate in the
initrd, and then start a new instance from the host fs after the
transition.

> I have the following two questions:
>
> Question 1:
> Why is it necessary to kill processes during the transition from the
> initrd to the main system?

It's not strictly necessary. It's simply about hygiene, because
everything that sticks around will pin initrd resources, and we'd
really like to get rid of those.

Generally it's the absolute exception that stuff should stick around,
hence it's a good idea to just kill everything but allow a focussed
exception logic to that.

> Question 2:
> If my processes are excluded from the killing spree during switch-root and 
> continue to run in
> the root file system, what are the potential consequences?

You are running processes from a different context, pinning files
from an emptied file system.

Generally, don't do this, unless you know exactly what you are
doing. There are usually much better approaches.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] [External] : Re: rsyslog / journald - el7 vs el8

2024-07-01 Thread Lennart Poettering
On Mo, 01.07.24 13:30, Ricardo Esteves (ricardo.lopes.este...@oracle.com) wrote:

> Hi,
>
> But does anyone know why /etc/rsyslog.d/listen.conf was removed from systemd
> rpm package in EL8?

This is neither a RHEL mailing list, nor an rsyslog mailing
list. Maybe contact RHEL support for help on this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] mounts with "nofail" can be unmounted on shutdown before "After=*-fs.target" units

2024-07-01 Thread Lennart Poettering
On Sa, 29.06.24 19:26, MichaIng (mi...@dietpi.com) wrote:

> Hey guys, I have question regarding a certain behaviour of the systemd mount
> generator.
>
> Mounts do not have `Before=*-fs.target` if the `nofail` mount option is
> added to their `/etc/fstab` entry.
>
> From the man page I see that this is intended behaviour:
> - 
> https://www.freedesktop.org/software/systemd/man/latest/systemd.mount.html#Default%20Dependencies
> - 
> https://www.freedesktop.org/software/systemd/man/latest/systemd.mount.html#noauto
>
> First of all, I see the reason why it seems to be not important for mounts
> to start before certain targets, if one explicitly declares that it is okay
> for them to fail. But I do not see a downside of adding `Before=*-fs.target`
> either.

To be able to mount a block-based mount we need to wait for that device
to show up first. That can take ages, and given you used "nofail" it
might even be expected to fail entirely (i.e. time out). But we
shouldn't delay things for stuff that quite possibly can fail.

Similarly, a mount may take ages if it is network bound and something is
wrong with the network.

Hence "nofail" disables the ordering; it means "whether this mount is there
or not doesn't really matter for anything else". Both at startup and at
shutdown, and all the time in between.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd --user managers after systemd upgrade

2024-07-01 Thread Lennart Poettering
On Sa, 29.06.24 14:57, Mike Gilbert (flop...@gentoo.org) wrote:

> I recently added systemd v256 to Gentoo's ebuild repo. While testing
> the upgrade process from v255, I have run into an issue.
>
> After the upgrade, my KDE Plasma session stopped working, and I was
> unable to execute a reboot from the GUI.
>
> Looking at the journal, I see several messages like this one:
>
> Jun 29 14:21:30 naomi systemd[2387904]:
> /usr/lib/systemd/systemd-executor (deleted): error while loading
> shared libraries: libsystemd-core-255.so: cannot open shared object
> file: No such file or directory
>
> It appears to be executing a deleted binary
> (/usr/lib/systemd/systemd-executor), likely via /proc/1/fd/..., and
> then fails when loading a deleted shared library
> (libsystemd-core-255.so).
>
> The new versions of these files do exist on the filesystem. Also, I
> was able to reboot the system by switching to a text console and
> pressing ctrl-alt-delete.
>
> Any idea what happened here? I'm not sure if this is a systemd bug, or
> if I missed something in my packaging script (ebuild).

See this discussion:

https://github.com/systemd/systemd/pull/33279

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] rsyslog / journald - el7 vs el8

2024-07-01 Thread Lennart Poettering
On Mo, 01.07.24 13:15, Ricardo Esteves (ricardo.lopes.este...@oracle.com) wrote:

> Hi,
>
> On RHEL7 (and clones) systemd package included /etc/rsyslog.d/listen.conf:
> $SystemLogSocketName /run/systemd/journal/syslog
>
> which makes rsyslog get the logs from journald.
>
> On RHEL8 (and clones) this file is not included anymore. Does anyone know
> why?
>
> I see that on EL8 rsyslog.conf now includes the module imjournal to get the
> logs directly from journald db.
>
> Though the rsyslog documentation says it's not recommended because it is quite
> heavy.
>
> What would be the correct way (less heavy) to get logs from journald into
> rsyslog on EL8?

That sounds like a question for the rsyslog community. This here is a
systemd mailing list.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about the behavior of systemd (when requesting A/AAAA via multiple interfaces)

2024-07-01 Thread Lennart Poettering
On Mo, 01.07.24 12:56, 松藤 諒太 (r-matsuf...@intelligent-design.co.jp) wrote:

Hi!

> At this condition, I've found that systemd-resolved performed to return the
> result of those queries to application
> unless all queries are completed being resolved via one of multiple
> interfaces.

we have two rules when looking things up:

1. don't mix & match replies from different sources (i.e. we will not
   return a reply that combines an A reply from netif 1 and an AAAA
   reply from netif 2)

2. first positive reply wins, last negative reply wins (i.e. if we
   submit queries on multiple interfaces in parallel and we only see
   negative replies we'll wait until the very last query is complete
   before we report this back to the client. if however we get a
   positive reply any time, we immediately return that.)

When a local app issues a lookup request with unspecified address
family, we'll fire off a pair of lookups (i.e. A + AAAA) on each
interface that matches the domain name routing rules, and wait for
both of these to finish, then combine the results of both (this
follows rule 1 above, as both replies come from the same iface in this
case), and then pick the right one of these combined replies to propagate
to the app, according to rule 2 above.

> If is there any reason or restriction that resolved should wait for
> completing all queries through one of interfaces to return the result,
> I'm afraid I would ask the question for why it is ?

Well, apps might implement rules on whether they prefer ipv4 or ipv6
if both are available, hence we need to hand them both sets of
addresses so that they can make their choice.

i.e. we generally want to reply with "complete" responses, i.e. that
carry all information that can be acquired from a specific DNS
server that gives us information about the query. But if there are multiple
DNS servers that give us info, then we make a choice and only return
one reply.

> Furthermore, does systemd provide the configuration to switch this behavior
> ?

There is no configuration option to control this behaviour.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Ordering between user@.service and systemd-logind.service

2024-07-01 Thread Lennart Poettering
On Mo, 01.07.24 12:39, Lennart Poettering (lenn...@poettering.net) wrote:

> > Hmm, so we typically don't sync on systemd-logind for user
> > stuff/sessions if we can avoid that, since the root user is a user
> > that shall be allowed logging in too, and typically much earlier than
> > regular users, i.e. long before logind is up.
> >
> > That said, given that user@.service is pretty much a logind concept, I
> > guess we should have at least that dep in place.
> >
> > Can you please file an issue (or even better a PR), that adds the dep on
> > logind to user@.service)? That'd be great!
>
> (I guess this bug was introduced by
> 278e815bfa3e4c2e3914e00121c37fc844cb2025 btw, which had an indirect
> dep in place, which was removed entirely. Replacing it with a
> dependency on systemd-logind.service should be safe and good afaics)

To cut things short I have now prepped a PR myself for this:

https://github.com/systemd/systemd/pull/33555

Hence, no further need to do anything, I took care of everything.

Thanks,

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Ordering between user@.service and systemd-logind.service

2024-07-01 Thread Lennart Poettering
On Mo, 01.07.24 12:36, Lennart Poettering (lenn...@poettering.net) wrote:

> On So, 30.06.24 22:48, Vladimir Kudrya (vladimir-...@yandex.ru) wrote:
>
> > Hello everyone!
> >
> > I'm noticing an issue on my system (Debian sid) on shutdown. Wlroots
> > compositors try to communicate release of session to logind, but logind is
> > already gone, so conflicts arise due to activation attempts, journal is
> > spammed with stuff like this:
> >
> > Jun 29 10:38:13 hostname systemd[1]: Requested transaction contradicts 
> > existing jobs: Transaction for systemd-logind.service/start is destructive 
> > (dev-disk-by\x2dpath-pci\x2d:02:00.0\x2dnvme\x2d1\x2dpart-by\x2dlabel-swap_1.swap
> >  has 'stop' job queued, but 'start' is included in transaction).
> > Jun 29 10:38:13 hostname uwsm_sway.desktop[5886]: 00:27:37.977 [ERROR] 
> > [wlr] [libseat] [libseat/backend/logind.c:199] Could not close device: 
> > Could not activate remote peer 'org.freedesktop.login1': activation request 
> > failed: a concurrent deactivation request is already in progress
> >
> > Adding After=systemd-logind.service to user@.service seems to fix this issue
> > with no ill effects. But two questions arise: why there is no such ordering
> > by default, and is it conceptually correct?
>
> Hmm, so we typically don't sync on systemd-logind for user
> stuff/sessions if we can avoid that, since the root user is a user
> that shall be allowed logging in too, and typically much earlier than
> regular users, i.e. long before logind is up.
>
> That said, given that user@.service is pretty much a logind concept, I
> guess we should have at least that dep in place.
>
> Can you please file an issue (or even better a PR), that adds the dep on
> logind to user@.service)? That'd be great!

(I guess this bug was introduced by
278e815bfa3e4c2e3914e00121c37fc844cb2025 btw, which had an indirect
dep in place, which was removed entirely. Replacing it with a
dependency on systemd-logind.service should be safe and good afaics)

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Ordering between user@.service and systemd-logind.service

2024-07-01 Thread Lennart Poettering
On So, 30.06.24 22:48, Vladimir Kudrya (vladimir-...@yandex.ru) wrote:

> Hello everyone!
>
> I'm noticing an issue on my system (Debian sid) on shutdown. Wlroots
> compositors try to communicate release of session to logind, but logind is
> already gone, so conflicts arise due to activation attempts, journal is
> spammed with stuff like this:
>
> Jun 29 10:38:13 hostname systemd[1]: Requested transaction contradicts 
> existing jobs: Transaction for systemd-logind.service/start is destructive 
> (dev-disk-by\x2dpath-pci\x2d:02:00.0\x2dnvme\x2d1\x2dpart-by\x2dlabel-swap_1.swap
>  has 'stop' job queued, but 'start' is included in transaction).
> Jun 29 10:38:13 hostname uwsm_sway.desktop[5886]: 00:27:37.977 [ERROR] [wlr] 
> [libseat] [libseat/backend/logind.c:199] Could not close device: Could not 
> activate remote peer 'org.freedesktop.login1': activation request failed: a 
> concurrent deactivation request is already in progress
>
> Adding After=systemd-logind.service to user@.service seems to fix this issue
> with no ill effects. But two questions arise: why there is no such ordering
> by default, and is it conceptually correct?

Hmm, so we typically don't sync on systemd-logind for user
stuff/sessions if we can avoid that, since the root user is a user
that shall be allowed logging in too, and typically much earlier than
regular users, i.e. long before logind is up.

That said, given that user@.service is pretty much a logind concept, I
guess we should have at least that dep in place.

Can you please file an issue (or even better a PR), that adds the dep on
logind to user@.service)? That'd be great!
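
The local workaround described in the question can be expressed as a
drop-in, roughly as follows. The file name is an arbitrary choice for
this sketch, and this is not necessarily what an upstream fix would
look like.

```ini
# /etc/systemd/system/user@.service.d/10-after-logind.conf
[Unit]
# Start after logind and, because ordering is reversed on shutdown,
# stop before logind goes away
After=systemd-logind.service
```

Run systemctl daemon-reload afterwards so the drop-in is picked up.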

Thank you,

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Put some users to a different slice?

2024-06-26 Thread Lennart Poettering
On Di, 25.06.24 11:57, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Hi,
>
> I'd like to put a subgroup of our users into a separate slice (below
> the user slice), so that they could be restricted further than the other
> users.

We always had the intention to add something like this, but this is
not supported right now. This is not as easy to support, because
currently the hierarchy carries meaning, i.e. to determine if
something belongs to user scope, we check if it's below the user.slice
cgroup and so on. We could fix this, by stuffing more info in the
session scope name, and this has been on the todo list for a long
time, but so far no one sat down and implemented this. This would
require making sure the sd-login APIs are all updated to look in the
scope unit name, and never consult slices up the tree anymore.

Hence, sorry, but we cannot do this right now.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] OpenFile directive still gives logging errors when opening a file while using the option graceful

2024-06-21 Thread Lennart Poettering
On Mi, 19.06.24 13:19, Adam Nilsson (johan.adam.nils...@gmail.com) wrote:

> Hi, so I am developing a service that may need to open a file under
> the /dev directory that may or may not exist. As I do not want the
> service to be root but the file has root permissions. I am trying to
> use the OpenFile directive using the graceful option in the service
> file. If the file is missing I get the following error:
> .service: Could not open "/dev/": No such file
> or directory
> The service is still starting so that is not the issue. My question is:
> should there still be an error in the log complaining that it was
> unable to open the file? Reading the documentation for OpenFile it says
> that  'if "graceful" is specified, errors during file/socket opening
> are ignored' so I interpreted that as any error opening the file
> should be silenced but I do understand that it could be interpreted as
> the service will still start if errors were found opening the file.

Yes, this is a bug I guess. Can you please file an issue on github, so
that we keep track of this, and don't forget to fix this?

Even better: send a PR that fixes this.
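
For reference, the kind of setup under discussion might look roughly
like this. The device path, fd name and binary are placeholders
invented for the sketch, since the real path is not shown above.

```ini
# example.service (hypothetical unit, path and fd name)
[Service]
ExecStart=/usr/local/bin/example-daemon
# Run unprivileged; the file is opened before privileges are dropped
DynamicUser=yes
# Syntax: OpenFile=path[:fd-name:options]
# "graceful" lets the service start even if the path cannot be opened
OpenFile=/dev/example0:exampledev:graceful
```

The service would then find the fd, if it could be opened, via
$LISTEN_FDS and $LISTEN_FDNAMES.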

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Systemd, cgrupsv2, cgrulesengd, and nftables

2024-06-14 Thread Lennart Poettering
On Fr, 14.06.24 10:06, Mikhail Morfikov (mmorfi...@gmail.com) wrote:

> > --
> > Lennart Poettering, Berlin
>
> I don't need any warranty, I need a way to make this work.

Yeah, but this is the wrong forum to ask for help then. What you are
doing is strictly against how systemd and cgroup2 are designed. I mean,
do what you want, but this is not supported, you are on your own.

> I'm not sure whether I understand the "single-writer rule", so correct me if 
> I'm
> wrong. I don't want to write pids to systemd services using cgrulesengd. I 
> just
> want to create my own cgroup tree, for instance
> /sys/fs/cgroup/morfikownia/ and I

Yeah, that's not how this works. On systemd systems the top of the
cgroup tree is managed by systemd. If you want to manage your own
cgroups, then ask for a delegated subtree, and do your stuff there,
but don't interfere with the top of the tree; you'll step on systemd's
feet then, and systemd will run over your feet all the time.

> want to place there all the processes managed by cgrulesengd (via the
> /etc/cgrules.conf file). So systemd won't be touching anything inside
> /sys/fs/cgroup/morfikownia/ and cgrulesengd won't be touching anything in the
> rest of the cgroup tree -- is this "single-writer rule" ?

Yeah, sorry, that's not how this works.

> > And you must delegate a subtree to other managers if a
> > different manager shall also manage cgroups.
>
> How can this be done?

There are plenty of docs around about this, go read them:

https://systemd.io/CGROUP_DELEGATION
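
As a very rough sketch of what asking for a delegated subtree can look
like: a service that sets Delegate=yes gets its own cgroup subtree, and
the program running in it may create and manage sub-cgroups below that
point. The unit name and binary are hypothetical. Note this only covers
managing the service's own processes inside its subtree; it does not
make it OK to pull processes out of other systemd-managed cgroups.

```ini
# my-cgroup-manager.service (hypothetical unit and path)
[Service]
ExecStart=/usr/local/sbin/my-cgroup-manager
# Hand the cgroup subtree of this unit over to the service itself
Delegate=yes
```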

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Systemd, cgrupsv2, cgrulesengd, and nftables

2024-06-13 Thread Lennart Poettering
On Do, 13.06.24 21:38, Mikhail Morfikov (mmorfi...@gmail.com) wrote:

> I'm trying to make the 4 things (systemd, cgrupsv2, cgrulesengd, and nftables)
> work together, but I think I'm missing something.

Is "cgrulesengd" interfering with the cgroup tree?

Sorry, but that's simply not supported. cgroupv2 has a single-writer
rule, i.e. every part of the tree has only a single writer, a single
manager. And you must delegate a subtree to other managers if a
different manager shall also manage cgroups.

Hence, if you have something that just takes systemd managed processes
and moves them elsewhere, it's simply not supported. Sorry, you voided
your warranty.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] "primary" Condition for drbd?

2024-06-13 Thread Lennart Poettering
On Do, 13.06.24 11:27, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> I missed to mention, drbdadm does know:
>
>   # drbdadm role space
>   Primary/Secondary
>
> meaning "space" is primary on this host. You can also look
> at /proc/drbd:
>
>   # cat /proc/drbd
>   version: 8.4.11 (api:1/proto:86-101)
>   srcversion: 19D914EA50F713FCCE48607
>
>1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-
>   ns:4044664 nr:0 dw:4044664 dr:726481 al:188 bm:0 lo:0 pe:0 ua:0 
> ap:0 ep:1 wo:f oos:0
>2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-
>   ns:0 nr:2228224 dw:2228224 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 
> wo:f oos:0
>3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-
>   ns:699492 nr:0 dw:699492 dr:266197 al:75 bm:0 lo:0 pe:0 ua:0 ap:0 
> ep:1 wo:f oos:0
>
> Problem is, it doesn't say "/dev/drbd1". "space" and "1" are
> defined in drbd's config files, mapping it to "/dev/drbd1". To
> evaluate the condition you have to perform a pretty complex
> task.

You are assuming that people know what drbd is and does, and what
"primary" or "secondary" means in that context. I certainly have no
clue.

> Does systemd allow to run a program to evaluate a custom
> condition in a unit file? Maybe I am too blind to see, but
> I haven't found it mentioned on https://www.freedesktop.org/\
> software/systemd/man/latest/systemd.unit.html.

There's ExecCondition=. But what would you use it for?

ExecCondition= is intended for quick checks.
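
A minimal sketch of such a quick check, assuming the "drbdadm role" output
format quoted above ("Primary/Secondary", local role first). The helper
names are made up. ExecCondition= treats exit code 0 as "go ahead" and
exit codes 1 to 254 as "skip this unit without marking it failed":

```python
def local_role(role_output: str) -> str:
    """Return the local role from `drbdadm role` output such as 'Primary/Secondary'."""
    return role_output.strip().split("/", 1)[0]


def exec_condition_exit_code(role_output: str) -> int:
    """Map the drbd role to an exit code suitable for ExecCondition=."""
    # 0: condition met, the unit is started.
    # 1: condition not met, the unit is skipped but not marked as failed.
    return 0 if local_role(role_output) == "Primary" else 1
```

A wrapper script would run "drbdadm role <resource>", feed its stdout to
exec_condition_exit_code() and exit with the result.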

> The context is: I want to mount/activate /dev/drbd1, if it is
> primary. (If it is not, then /dev/drbd1 should be silently
> ignored.) Next I want to start the LXC containers provided by
> the virtual block device, and setup networking and storage
> accordingly. The same should happen for /dev/drbd{2..n}.

I have no idea what this all means, but I have the suspicion you
actually want a generator, i.e. a plugin to systemd that adds some
deps depending on external configuration files.

i.e. implement this stuff:

https://www.freedesktop.org/software/systemd/man/latest/systemd.generator.html

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] how to automate undirected UDP broadcast route for linklocal addresses?

2024-06-11 Thread Lennart Poettering
On Mo, 10.06.24 13:52, Bill Plunkett (b...@plunkware.com) wrote:

> I have a yocto-hardknott (systemd v247) embedded system using link local
> addressing.  Undirected UDP broadcasts (i.e. dest ip = 255.255.255.255) are
> failing with a 'Network is unreachable' error.  I've been able to fix this
> with the 'route add' command shown below, but would prefer to automate it
> with systemd.  Any help is greatly appreciated.
>
> This command allows undirected broadcasts:
>
> route add -net 255.255.255.255 netmask 255.255.255.255 dev eth0
>
> systemd configuration:
>
> [Match]
> Name=eth0
> KernelCommandLine=!nfsroot
>
> [Network]
> DHCP=ipv4
> LinkLocalAddressing=ipv4
>
> [DHCP]
> ClientIdentifier=mac
> RouteMetric=10


It should suffice to add the following to your .network file:

[Route]
Destination=255.255.255.255/32

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Custom initrd services

2024-06-11 Thread Lennart Poettering
On Di, 11.06.24 09:37, Carolina Jubran (cjub...@nvidia.com) wrote:

> Hello!
>
> I have modules that need to be loaded using services because some of
> them don't autoload their modules.

"Modules"? Do you mean kernel modules (kmod) by that?

> I want the modules to load in initrd, before initrd.target. However,
> the service is not consistently loading in initrd in the current
> implementation.

If you have kernel modules that do not autoload, it should typically
suffice to list those kernel modules in a file in /etc/modules-load.d/
in the initrd.

That said, in 2024 kernel modules that do not autoload seems quite out
of place.

> To resolve this issue, I added "systemctl add-wants initrd.target
> A.service" in the dracut configuration.

Hmm, I don't follow? What would such a line do to kernel modules?

> Is this sufficient for my setup, or do I need to make any further
> changes in the systemd side to ensure the correct loading before
> initrd.target? I'm using systemd-load-modules to load the modules.

I am not sure I follow really. That service that needs the kernel
modules, what kind of service is that?

Note that services can add the following to their deps:

Wants=modprobe@foobar.service
After=modprobe@foobar.service

To ensure the kmod "foobar" is definitely loaded before the service
begins execution.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Use container's service as dependencies in Host's systemd service unit

2024-06-11 Thread Lennart Poettering
On Di, 11.06.24 18:25, ls...@wewakecorp.com (ls...@wewakecorp.com) wrote:

> Hello, thank you all developers for your efforts first.
>
> I just wonder whether I can use container's services as dependencies in
> Host's system service unit or not..

No that is not directly supported.

But do note that systemd when run in a container will send out READY=1
sd_notify() messages when it finished startup, and systemd-nspawn may
wait for that and forward it to its own supervisor if you pass
--notify-ready=yes to it.

In other words: if you run systemd inside a container, and define
various services inside the container, then you can properly
synchronize on them collectively to finish start-up by waiting for
systemd-nspawn@.service to start up (if you add --notify-ready=yes,
that is).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] "primary" Condition for drbd?

2024-06-11 Thread Lennart Poettering
On Di, 11.06.24 11:13, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> Hi folks,
>
> would it be possible to add a Condition to check if a drbd
> resource ( a virtual block device with replication via network,
> see https://linbit.com/drbd/) is primary? I checked
>
>   /sys/devices/virtual/block/drbd1
>
> for example, but I haven't found a way to distinguish primary
> from secondary block devices yet. Apparently there is no directory
> element showing up only for primary mode.

No idea what a "primary" or "secondary" block device is, but I sense
this is some drbd-specific concept, and that sounds way too
specific for us to add a new ConditionXYZ= type for, if that's what
you are asking.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] soft-reboot and service templates

2024-06-10 Thread Lennart Poettering
On Mi, 05.06.24 14:58, Luca Boccassi (luca.bocca...@gmail.com) wrote:

> On Wed, 5 Jun 2024 at 14:45, Thorsten Kukuk  wrote:
> >
> > Hi,
> >
> > while playing with soft-reboot and services surviving this:
> > A standard service file works, but if I use a service template (e.g.
> > test@.service), the service gets stopped during soft-reboot.
> > The reason is:
> > -Slice=system.slice
> > +Slice=system-test.slice
> >
> > Is it somehow possible, that also "test@.service" stays alive during a
> > soft-reboot?
> > Or is there another way to pass variables to a service file at
> > startup? I don't need to run the same service several times in
> > parallel.
> >
> > Thanks,
> >   Thorsten
>
> Yeah we haven't tested this scenario at all - did you try adding a
> drop-in for system-test.slice.d that sets the parameters to avoid
> having it stopped on softreboot?

Instead of a drop-in I'd either:

1. just define the slice unit fully yourself and give it a nice
   Description= and things, and set the settings different from the
   defaults as you like. Or in other words, instead of adding
   "system-test.slice.d/50-something.conf" just add
   "system-test.slice" itself. (The fact you don't have to provide
   .slice unit files at all is convenience, not more, it doesn't mean
   you should not define unit files for slices, I'd encourage everyone
   to do that, and at the very least define a nice Description=)

2. You can just set Slice=system.slice in your template unit file, to
   override the default slice that groups template instances.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-06-10 Thread Lennart Poettering
On Mo, 10.06.24 11:19, Lennart Poettering (lenn...@poettering.net) wrote:

> This would basically be a driver-specific re-implementation of our
> generic systemd-tpm2-generator, that knows that when optee/ftpm stuff
> is used we *must* schedule tpm2.target in all cases, even if /dev/tpm0
> already exists. The sysfs path in the script above I made up of
> course, you'd have to find some sysfs file that exists exactly when
> the optee/ftpm case applies, and that is available before any kmod is
> loaded.
>
> You can make this a shell script as above, but I'd always recommend
> trying to keep shell out of the boot process, hence maybe write it in
> a better language. Generators run super early during boot, and we have
> to wait for all of them to complete before we continue, hence
> something reasonably fast would be good, shell sucks for that.
>
> The above would implement the generator interface, i.e.:
>
> https://www.freedesktop.org/software/systemd/man/latest/systemd.generator.html
>
> But again, if you can provide us with a generic interface we can
> instead just add this to upstream systemd-tpm2-generator and be happy.

Hmm, so here's an idea how we could hack this in userspace with some
loose coupling between systemd and optee drivers/supplicant:

Device nodes on Linux can carry xattrs in the "trusted." xattr
namespace. Maybe we can use that to tell systemd-tpm2-generator that
/dev/tpm0 is not usable without a daemon.

i.e.: in systemd-tpm2-generator, we would look for an xattr
"trusted.needs-daemon" on /dev/tpm0. If it's set and non-zero, then
we'll unconditionally enqueue tpm2.target. This would be relatively
nice I think, because it means the information is directly attached to
the device in question (i.e. if for some reason /dev/tpm0 goes away,
then the xattr goes away automatically too), and it would be metainfo
we can attach without udev around already.

This would then mean that something optee supplicant ships would have
to set that xattr. It could do so either in its own optee supplicant C
code, the instant it takes control of the tpm device; or it could do
that via a udev rule that it ships and that matches against the
driver. This rule would have to be included in the initrd so that it
runs there already and sets the xattr, so that when we transition into
the host (where the udev db is flushed out, but the device nodes stick
around) it's set.

Of course, I'd much prefer some proper kernel infra for this all
(i.e. some clear sysfs attr on the tpm device), but such a
userspace-based xattr solution seems to be acceptable to me, if you
want this. Should be a 5-line patch to systemd-tpm2-generator I'd be
willing to merge.
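
A minimal sketch of the check described above, as it might look on the
generator side. This is only an illustration of the proposal: the xattr
name "trusted.needs-daemon" is the one suggested in this mail, the helper
name and the value parsing ("set and non-zero") are assumptions, and the
actual generator is written in C, not Python:

```python
import os

XATTR_NAME = "trusted.needs-daemon"  # name as proposed in this mail


def needs_daemon(path="/dev/tpm0", getxattr=os.getxattr):
    """Return True if the device node is marked as requiring a userspace daemon."""
    try:
        value = getxattr(path, XATTR_NAME)
    except OSError:
        # Attribute missing, device missing, or no permission: treat as unset.
        return False
    text = value.decode(errors="replace").strip(" \t\r\n\0")
    # "Set and non-zero" is interpreted here as: not empty and not "0".
    return text not in ("", "0")
```

The getxattr parameter only exists so the logic can be exercised without
root, since reading "trusted." xattrs requires privileges.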

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-06-10 Thread Lennart Poettering
[...] you'd have to find some sysfs file that exists exactly when
the optee/ftpm case applies, and that is available before any kmod is
loaded.

You can make this a shell script as above, but I'd always recommend
trying to keep shell out of the boot process, hence maybe write it in
a better language. Generators run super early during boot, and we have
to wait for all of them to complete before we continue, hence
something reasonably fast would be good, shell sucks for that.

The above would implement the generator interface, i.e.:

https://www.freedesktop.org/software/systemd/man/latest/systemd.generator.html

But again, if you can provide us with a generic interface we can
instead just add this to upstream systemd-tpm2-generator and be happy.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] confusion with systemd-repart

2024-06-10 Thread Lennart Poettering
On So, 09.06.24 19:00, Xogium (cont...@xogium.me) wrote:

> Hi,
> thank you for the help, I really appreciate it. I'm sorry for the very
> late reply, I had an issue with my mail server and only sorted it out
> today.
>
> I had to jump through a couple of problems, but I've mostly got
> something stable now.
>
> The first is that all of the partitions needed for my bootloaders were
> all of type linux-generic. I ended up creating my own UUIDs so that
> repart doesn't try to match against them.
>
> The second one which I'm still no closer to figuring out is that upon
> running once successfully, subsequent invocations of repart (i.e: on
> next boot), appears to be racy in some way with my setup, and I see no
> way of figuring out why.

Hmm, generally, systemd-repart is declarative: it adjusts a disk only
if it doesn't match the declared state and otherwise is a NOP. Thus,
it generally should only have an effect on first boot, not on second
boot, because then things should already match the intended
declaration.

Or in other words: if things are racy on 2nd boot then not because
repart wasn't doing things right (it isn't doing anything after all),
but because of other reasons?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Hiding systemd-cryptsetup password prompt

2024-06-07 Thread Lennart Poettering
On Do, 06.06.24 19:42, Sergio Arroutbi (sarro...@redhat.com) wrote:

> > > I miss an option where systemd-cryptsetup is executed headless, but
> > > continues running, without exiting.
> > >
> > > I have tried with keyfile=/dev/urandom and option=keyfile-size=60,
> > but
> > > it is too quick. I also tried try-empty-password, but this is tried only
> > > once.
> > >
> > > I am running out of ideas.
> >
> > Hmm, I am not sure I follow? So do you or do you not want cryptsetup
> > to ask for passwords via the ask-password agent stuff?
> >
>
> We are developing a PKCS11 plugin for Clevis (
> https://github.com/latchset/clevis). Clevis allows automatic boot encrypted
> disks unlocking by storing some information into LUKS metadata.

systemd-cryptsetup supports TPM2 and PKCS#11 natively, you know that?
Why isn't that enough for your usecase? What are you missing?

From my PoV Clevis/Tang are useful if you want the networked
unlock, but if you want TPM/PKCS11 then I am pretty sure
systemd-cryptsetup can do that already much better?

> To do so, it is executed in parallel to systemd-cryptsetup and,
> while the password is prompted to the user (and the agent runs),
> Clevis provides the key by writing to the systemd-cryptsetup
> ask-password socket.

Sorry, but this is simply the wrong approach. The ask-password stuff
is for *interactively* asking the user for *passwords* and *PINs*,
i.e. for querying *human* users for secrets. It should *not* be used
for automatic supplying of key material from non-interactive sources.

If you want to supply unlock keys non-interactively, then specify an
AF_UNIX/SOCK_STREAM socket as path to a key file in
/etc/crypttab. Example:

   test1 /dev/disk/by-uuid/7376e512-00a4-4a49-8c51-970f0dae5ab1 /run/foobar/keysock -

If you do it like that then systemd-cryptsetup will connect to
/run/foobar/keysock and read the key from it. Hence, write a small
service (could be socket activated,
i.e. ListenStream=/run/foobar/keysock) that listens on that
AF_UNIX/SOCK_STREAM socket, and simply respond to any request with
your raw key blob, then close the connection.

You can even list the same socket inode on multiple /etc/crypttab
lines. If you do, you can determine from the AF_UNIX peer address on
incoming connections for which volume the key is being requested. For
details see:

https://www.freedesktop.org/software/systemd/man/latest/crypttab.html#AF_UNIX%20Key%20Files
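
A minimal sketch of such a key service, assuming a plain (not
socket-activated) Python process. All function names are made up, and
lookup_key stands in for however the key material is actually obtained.
The peer name format (NUL, random string, "/cryptsetup/", volume name) is
taken from the man page section linked above:

```python
import os
import socket


def volume_from_peer(peer):
    """Return the volume name encoded in the client's abstract socket name, or None."""
    if isinstance(peer, str):
        peer = peer.encode()
    # Abstract namespace addresses start with a NUL byte.
    if not peer or not peer.startswith(b"\0"):
        return None
    _, sep, volume = peer.partition(b"/cryptsetup/")
    return volume.decode() if sep else None


def open_key_socket(path):
    """Create the listening AF_UNIX/SOCK_STREAM socket at `path`."""
    if os.path.exists(path):
        os.unlink(path)
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    os.chmod(path, 0o600)
    listener.listen(1)
    return listener


def serve_one(listener, lookup_key):
    """Answer one connection with the raw key blob, then close it."""
    conn, peer = listener.accept()
    with conn:
        volume = volume_from_peer(peer)
        conn.sendall(lookup_key(volume))
    return volume
```

A real service would loop over serve_one() and, with socket activation,
take the listening socket from systemd instead of calling
open_key_socket() itself.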

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-umount doesn't unmount LVM volumes

2024-06-07 Thread Lennart Poettering
On Fr, 07.06.24 08:31, Vladimir Mokrozub (m...@mfc.tambov.gov.ru) wrote:

>
> > Uh, LVM is simply nothing anyone here tests, it's not really where the
> > future is. Please reproduce with a current systemd version (i.e. 252
> > is two years old, an eternity in Linux), and file a bug, and maybe
> > someone with an interest in LVM will look into it, but don't hold your breath.
>
> Sorry for offtopic but why do you say there is no future for LVM? What's
> wrong with it?

I am pretty sure the future is with multi-device file
systems. btrfs, bcachefs (even zfs if it weren't for that license
fuckup).

Conceptually, faking block devices is just a silly approach.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-06-07 Thread Lennart Poettering
On Fr, 07.06.24 14:09, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> > How is this supposed to work anyway? is the supplicant supposed to
> > exit before initrd transition, and be started anew after the
> > transition?
>
> Yes, and tee-supplicant must be started again before any of the TPM using 
> services.
> This now works for initrd start and also shutdown, but fails in main rootfs
> where services like systemd-pcrmachine.service, systemd-tpm2-setup.service and
> systemd-pcrfs-root.service fail since TPM device is not functional without
> tee-supplicant in userspace.

So how do you enqueue tpm2.target again? Via the unmodified upstream
systemd-tpm2-generator?

So the upstream generator assumes that if /dev/tpmrm0 already exists
it doesn't need to bother with tpm2.target, and that the TPM device
already works. But that's not really the case for you I guess, you
have a TPM device node *before* it actually works, right? You need to
start the tee service for it to start working, if I understood
correctly.

So I guess this is what happens:

When the generator runs early in the initrd it sees that /dev/tpmrm0
is absent, it enqueues tpm2.target to wait for it, which pulls in the
tee agent, and all is good.

Afterwards we do the initrd→host transition.

When the generator then runs again, early in the host fs it sees that
/dev/tpmrm0 already exists, and doesn't do anything. Hence all
sync'ing is off and stuff will start using the tpm before it is
usable.

I guess to fix this we have to somehow ensure that after the
transition we'll detect that the /dev/tpmrm0 device is not actually
usable, and we have to enqueue tpm2.target after all.

Is there any reasonable way we can detect this?

For example, for this kind of TPM device is there maybe a sysfs
attribute file in /sys/class/tpm/tpm0/ or so which tells us whether
the device already works, or if it needs some userspace component?
Note that at that point udev is not operable anymore/yet hence we
cannot just ask the udev db for this.

> tee-supplicant-initrd@.service:
>
> [Unit]
> Description=TEE Supplicant on %i (initrd)
> DefaultDependencies=no
> After=dev-%i.device
> Wants=dev-%i.device tpm2.target
> Conflicts=shutdown.target tee-supplicant@teepriv0.service
> Before=tpm2.target sysinit.target shutdown.target tee-supplicant@teepriv0.service initrd-switch-root.target
>
> [Service]
> Type=simple

This is the default type, you can drop this.

That said, I am pretty sure this is actually not correct. Type=simple
means that we consider the service ready the instant we fork()ed off
the process for it. But that almost certainly means that the TPM
device is not ready to use yet, because the TEE supplicant won't even
have opened the device it operates on, and not have that set up.

So I'd expect the TEE service would use sd_notify() to send a
"READY=1" notification to the service manager once it did everything
so that /dev/tpmrm0 is ready to go. You'd then use Type=notify or
Type=notify-reload in the unit file to tell systemd that it shall wait
for sd_notify().
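
A minimal sketch of the notification protocol itself, for illustration
only. tee-supplicant is a C program and would more likely call
sd_notify() from libsystemd. The Python below just shows what is sent:
one datagram containing "READY=1" to the AF_UNIX socket named in
$NOTIFY_SOCKET:

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a notification such as 'READY=1' to the service manager.

    Returns False if no notification socket was passed in the environment.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a socket in the abstract namespace.
    addr = b"\0" + path[1:].encode() if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

The service would call sd_notify("READY=1") only after it has opened its
device and the TPM is actually usable.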

> EnvironmentFile=-@sysconfdir@/default/tee-supplicant
> ExecStart=@sbindir@/tee-supplicant $OPTARGS
>
> tee-supplicant@.service:
>
> [Unit]
> Description=TEE Supplicant on %i
> DefaultDependencies=no
> After=dev-%i.device
> Wants=dev-%i.device

Same here, should pull in tpm2.target.

> Conflicts=shutdown.target tee-supplicant-initrd@teepriv0.service
> Before=systemd-pcrmachine.service systemd-tpm2-setup.service sysinit.target shutdown.target
> After=tpm2.target initrd-switch-root.target tee-supplicant-initrd@teepriv0.service

These deps look incorrect, just use the same ones as up top.
>
> [Service]
> Type=simple
> EnvironmentFile=-@sysconfdir@/default/tee-supplicant
> ExecStart=@sbindir@/tee-supplicant $OPTARGS

I don't think the two service files need to differ between initrd and
host fs. Just use the same service file, i.e. I don't see a reason for
having two distinct unit files, just use the one you listed above as
tee-supplicant-initrd@.service for both cases (and drop the -initrd suffix).

> > Please provide proper boot logs, with debug logging enabled.
>
> Debug logging is available from here, sadly log is too big to view
> nicely on the web page and has to be downloaded:
>
> https://ledge.validation.linaro.org/scheduler/job/88420

This indeed shows that tpm2.target doesn't get enqueued again after
the initrd transition. So my educated guess above seems to be right,
and we need to find a way now to automatically determine from a TPM
device node whether it is ready to use or not. So far we assumed if we
have one it was ready to use, but that appears to be incorrect for
these TEE devices. So how do we detect this case so that we can delay
TPM operations until the thing is working again via the tpm2.target
stuff?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-umount doesn't unmount LVM volumes

2024-06-06 Thread Lennart Poettering
On Mi, 05.06.24 09:12, Vladimir Mokrozub (m...@mfc.tambov.gov.ru) wrote:

> Hello,
>
> OS: Debian 12
> systemd: 252
>
> Could someone please explain why systemd-umount doesn't unmount LVM volumes
> by device:
>
> $ systemd-mount /dev/vg0/lv0 /mnt/lvm/
> Started unit mnt-lvm.mount for mount point: /mnt/lvm
>
> $ findmnt -n /mnt/lvm
> /mnt/lvm /dev/mapper/vg0-lv0 ext4   rw,relatime
>
> $ systemctl list-units '*.mount'
> mnt-lvm.mount loaded active mounted /mnt/lvm
>
> $ systemd-umount /dev/vg0/lv0
> $ systemd-umount /dev/mapper/vg0-lv0
> $ systemd-umount /dev/dm-0
>
> None of the above commands unmount LVM volume. They don't produce any output
> and the exit status is zero.
> On the other hand, unmounting by the mountpoint works fine:
>
> $ systemd-umount /mnt/lvm/
> Stopped unit mnt-lvm.mount for mount point: /mnt/lvm
>
> This only happens with LVM, not with regular block devices.
> Is this a bug or a feature?

Uh, LVM is simply nothing anyone here tests, it's not really where the
future is. Please reproduce with a current systemd version (i.e. 252
is two years old, an eternity in Linux), and file a bug, and maybe
someone with an interest in LVM will look into it, but don't hold your breath.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Hiding systemd-cryptsetup password prompt

2024-06-06 Thread Lennart Poettering
On Mi, 05.06.24 15:36, Sergio Arroutbi (sarro...@redhat.com) wrote:

> Hello. I have tried with headless=yes. The issue with this is that
> systemd-cryptsetup ends, so I can not provide the password for decryption
> through socket provided in /run/systemd/ask-password/sck.numbers
>
> I miss an option where systemd-cryptsetup is executed headless, but
> continues running, without exiting.
>
> I have tried with keyfile=/dev/urandom and option=keyfile-size=60, but
> it is too quick. I also tried try-empty-password, but this is tried only
> once.
>
> I am running out of ideas.

Hmm, I am not sure I follow? So do you or do you not want cryptsetup
to ask for passwords via the ask-password agent stuff?

I initially thought you don't, but now you do?

Or do you want to filter stuff, i.e. that
systemd-ask-password-agent-tty only does its thing if asked for some
passwords, but not for others?

if that's what you want, let's take a step back, what are you actually
trying to do? Can you describe your scenario better?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-06-06 Thread Lennart Poettering
> [...] TPM2 PCR Extension (Varlink).
>
> ^ this should only be started once tee-supplicant is running again
> from main rootfs

This suggests tpm2.target hasn't been enqueued on the host system?
Maybe you forgot to include the generator in the host system?

Please provide proper boot logs, with debug logging enabled.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Sysext questions

2024-06-06 Thread Lennart Poettering
On Do, 06.06.24 16:49, Itxaka Serrano Garcia (itxaka.gar...@spectrocloud.com) wrote:

> Another extra question, trying an extension that is signed, if I don't
> provide the signature in the verity.d dir, the service hangs because it's
> asking for a password. Is it possible to skip that somehow? I don't want it
> to ask for a password, if there is not a key, just fail to load it.

That's a bug. Please file an issue about this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Sysext questions

2024-06-06 Thread Lennart Poettering
On Mi, 05.06.24 18:28, Itxaka Serrano Garcia (itxaka.gar...@spectrocloud.com) wrote:

> Hello again!
>
> A few sysext questions that have arisen from our testing
>
>  - image policy is configurable but it's there a single config file where
> we can put that so it's used system wide? For example to only allow
> verity+signed? Service override?

This does not exist right now, because I was a bit unsure how best to
expose this, i.e. whether to maintain separate config files for
portabled, sysext, confext, nspawn, or a single knob.

Note that the plan so far was to complement the userspace enforced
logic with a kernel-enforced logic, that refuses to allow mounting of
block-based file systems unless they are dm-verity or dm-crypt via
some LSM. Hence, the more fine-grained userspace image policy would be
one thing, the more generic kernel image policy would be the
other. Because of that I think the userspace knobs should be
per-subsystem, i.e. one setting for portabled, a separate one for
sysext, and for confext and so on (in particular as for sysext one
probably wants dm-verity, while for confext dm-crypt is probably necessary).

Anyway, having something like this is definitely planned, but not
implemented yet, and not fully thought to the end. If you want to work
on this, that would be great.

>  - I can't see anything preventing a manual call to sysext refresh from
> overriding the default policy, i.e if we set it at the service level in an
> immutable system, nothing prevents someone from calling the sysext command
> manually and override the image policy no?

Yeah. If, let's say, we added /etc/systemd/sysext.conf with an
ImagePolicy= setting, we would have one level of security.

And some future LSM would then provide a 2nd level of security on
this.

Neither exist right now.

>  - I also don't see anything that can run against a single sysext and
> return a validity check, to check individual files conform to a given
> policy for example? Any idea if there is something like that? Sysext verify
> SYSEXT_FILE --image-policy=whatever

This exists: "systemd-dissect --validate"

>  - I have also seen that having several extensions verity+signed, if there
> is just one that it's not either verity or signed, the whole merge
> stops?

That'd be a bug. The intention is definitely that we gracefully skip
over DDIs that do not check out because of OS version mismatches,
image policy mismatches, or missing keys, and still apply the others.

> Is there any reasoning for that? Is that a bug? Should I open a bug for
> this? IMHO it makes no sense as they are individual files so if something
> does not match the policy it should just be skipped and the rest of the
> extensions loaded anyway. But of course I have low visibility onto this, so
> there may be good reasons for it.

Yes, this is a bug, and I think there's already an issue filed about
this, specific to the key-in-keyring issue.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Hiding systemd-cryptsetup password prompt

2024-06-04 Thread Lennart Poettering
On Di, 04.06.24 13:08, Sergio Arroutbi (sarro...@redhat.com) wrote:

> Hello.
>
> We are implementing a feature related to PKCS#11 that, when some conditions
> are met (mostly that PKCS11 PIN has not been stored in configuration and
> input to our systemd unit), requires systemd-cryptsetup service password
> prompt to be hidden from TTY and executed only listening to password
> provided by the socket defined in
> https://systemd.io/PASSWORD_AGENTS/

The boot-time password prompt on the TTY is just an agent too. Mask it
via "systemctl mask systemd-ask-password-console.service".

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-30 Thread Lennart Poettering
On Do, 30.05.24 17:08, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> > Hmm, this is an interesting idea, I kinda like it. But I am not sure
> > how far this will get us, because I think even for FDE we eventually
> > want to store asymmetric keys, not symmetric ones (i.e. I think we
> > should start supporting things like TPM2+FIDO or TPM2+PKCS11 or
> > TPM2+ssh-agent where both devices operate in tandem, in a challenge
> > response model, not sure how far you get with that if we can only
> > protect symmetric keys)
>
> How would TPM2+FIDO work?

ChromeOS is passing a nonce from the TPM to the FIDO device, which
then signs it, which the TPM then can verify.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-30 Thread Lennart Poettering
On Do, 30.05.24 22:43, Lennart Poettering (lenn...@poettering.net) wrote:

> > What about combining two different secrets, such that _both_ must be
> > accessible?  At a minimum, something like HASH(SECRET1||SECRET2) is
> > guaranteed to be available if and only if both SECRET1 and SECRET2 are
> > available.  This won't work with TPM-bound keys that are not accessible
> > outside the TPM, but my understanding is that the most common cases
> > (LUKS and fscrypt keys and systemd credentials) must be accessible in
> > cleartext on the host _anyway_.  If the secret to be sealed is provided
> > externally, then one can use symmetric encryption with a randomly
> > generated key to have the same effect.
>
> Hmm, this is an interesting idea, I kinda like it. But I am not sure
> how far this will get us, because I think even for FDE we eventually
> want to store asymmetric keys, not symmetric ones (i.e. I think we
> should start supporting things like TPM2+FIDO or TPM2+PKCS11 or
> TPM2+ssh-agent where both devices operate in tandem, in a challenge
> response model, not sure how far you get with that if we can only
> protect symmetric keys)
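
A minimal sketch of the HASH(SECRET1||SECRET2) combination from the quoted
mail, assuming SHA-256 as the hash (the mail does not name one) and a
made-up function name:

```python
import hashlib


def combine_secrets(secret1: bytes, secret2: bytes) -> bytes:
    """Derive one key that can only be computed when both secrets are known."""
    if not secret1 or not secret2:
        raise ValueError("both secrets are required")
    # Plain concatenation as in the quoted mail. This assumes secret1 has a
    # fixed, known length so the boundary between the inputs is unambiguous.
    return hashlib.sha256(secret1 + secret2).digest()
```

As the reply notes, this only covers secrets that are available in
cleartext on the host, not asymmetric or TPM-bound keys.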

Eh, I might have figured out a way how I can do this, somewhat
inspired by this:

TPMs implement hierarchies of keys after all where each key is wrapped
by its parent, and you can apparently nest things pretty liberally, to
as many levels as one likes.

So here's what systemd's TPM2-based FDE does right now:

When enrolling: it ensures that a "storage root key" (SRK) exists on
the TPM. It then loads the plaintext FDE encryption key as a symmetric
key into the TPM, so that it is "wrapped" by the SRK. It then reads
back the wrapped (i.e. encrypted) key (this is called "sealing") and
writes that to the LUKS superblock. When unlocking we take that
wrapped key, load it back into the TPM and then read back the
plaintext key (this is called "unsealing"). Since the SRK is specific
to the TPM only the TPM can give us access to our FDE key. This model
is then enriched with TPM2 "extended policies" which we set while
sealing and which tell the TPM to insist that during unsealing the
PCRs are in a specific state.

So much so good. This allows us to define *one* extended policy for the
FDE key. And as mentioned that's a problem for us, because we'd like
to define *two* extended policies (i.e. the pcrlock one, and the
signed PCR one). But if we take advantage of the fact that we can wrap
keys arbitrarily we can do it like this:

When enrolling: as before, take care of the SRK. But now generate
another key, wrapped by the SRK and with our first policy built into
it. And then seal the FDE key against that "intermediate" key, and
build our 2nd policy into that sealing.

To unlock we then first have to load the intermediate key (which will
just work) and then load the FDE key below it (which will require us
to fulfill policy 1) and then unseal the FDE key (which will
require us to fulfill policy 2).

Unless I am missing something this should work and do exactly what I
want: I can combine policies arbitrarily.
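
(The enroll/unlock sequence above can be sketched as a toy model. This
is not TPM code and contains no cryptography: TpmObject, load(),
unseal() and the two state flags are hypothetical names made up for
illustration, and which of the two policies goes on which level is an
arbitrary choice here. It only shows why both policies end up being
mandatory.)

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A "policy" is modelled as a plain predicate over the observed system state.
Policy = Callable[[dict], bool]


def no_policy(_state: dict) -> bool:
    return True


@dataclass
class TpmObject:
    name: str
    parent: Optional["TpmObject"]
    policy: Policy                  # must hold whenever this object is *used*
    secret: Optional[bytes] = None  # only set for sealed data


def load(obj: TpmObject, state: dict) -> TpmObject:
    """Loading a child uses its parent key, so the parent's policy must hold."""
    parent = obj.parent
    if parent is not None and not parent.policy(state):
        raise PermissionError(f"policy of {parent.name} not fulfilled")
    return obj


def unseal(obj: TpmObject, state: dict) -> bytes:
    """Unsealing uses the sealed object itself, so its own policy must hold."""
    if obj.secret is None:
        raise ValueError(f"{obj.name} holds no sealed data")
    if not obj.policy(state):
        raise PermissionError(f"policy of {obj.name} not fulfilled")
    return obj.secret


def signed_pcr_policy(state: dict) -> bool:   # stands in for policy 1
    return state.get("signed_pcr_ok", False)


def pcrlock_policy(state: dict) -> bool:      # stands in for policy 2
    return state.get("pcrlock_ok", False)


# Enrollment: SRK -> intermediate key (policy 1) -> sealed FDE key (policy 2)
srk = TpmObject("srk", None, no_policy)
intermediate = TpmObject("intermediate", srk, signed_pcr_policy)
fde_key = TpmObject("fde-key", intermediate, pcrlock_policy, secret=b"volume key")


def unlock(state: dict) -> bytes:
    load(intermediate, state)      # parent is the SRK: just works
    load(fde_key, state)           # uses the intermediate key: needs policy 1
    return unseal(fde_key, state)  # uses the sealed object: needs policy 2
```

In this model unlock() only returns the key if both predicates hold;
failing either one raises PermissionError at the corresponding step.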

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-30 Thread Lennart Poettering
On Mi, 29.05.24 14:48, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> > > > (you can of course include PolicyAuthorizeNV in the policy you sign
> > > > for PolicyAuthorize, but that doesn't work, since we want to pin the
> > > > local nvindex really, and allocate it locally, and the signer (i.e. the
> > > > OS vendor) cannot possibly do that. Or you could include the
> > > > PolicyAuthorize in the policy you store in the nvindex for
> > > > PolicyAuthorizeNV use, but that feels much less interesting since it
> > > > means the enforcement of the combination is subject to local,
> > > > delegated policy choices instead of mandated by the policy of the
> > > > actual object we want to protect)
> >
> > this here is where i discuss what you are saying ^^^
> >
> > so technically this works, but this means objects are effectively
> > protected by local policy only. And whether to also protect by OS vendor
> > policy is then a choice of the local policy, but not a choice of the
> > original object's policy anymore. Or in other words: that shifts
> > around who owns which part of the policy. Ideally we want that when I
> > create a protected object in the TPM I can say: "to unlock this you
> > *must* validate OS vendor policy *and* local pcrlock policy". But you
> > cannot do that. You can only say "to unlock this you *must* validate
> > local pcrlock policy", and then hope that that local policy also
> > enforces validation via OS vendor policy.
>
> What about combining two different secrets, such that _both_ must be
> accessible?  At a minimum, something like HASH(SECRET1||SECRET2) is
> guaranteed to be available if and only if both SECRET1 and SECRET2 are
> available.  This won't work with TPM-bound keys that are not accessible
> outside the TPM, but my understanding is that the most common cases
> (LUKS and fscrypt keys and systemd credentials) must be accessible in
> cleartext on the host _anyway_.  If the secret to be sealed is provided
> externally, then one can use symmetric encryption with a randomly
> generated key to have the same effect.

Hmm, this is an interesting idea, I kinda like it. But I am not sure
how far this will get us, because I think even for FDE we eventually
want to store asymmetric keys, not symmetric ones (i.e. I think we
should start supporting things like TPM2+FIDO or TPM2+PKCS11 or
TPM2+ssh-agent where both devices operate in tandem, in a challenge
response model, not sure how far you get with that if we can only
protect symmetric keys)
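
(For reference, the HASH(SECRET1||SECRET2) construction suggested in
the quoted text could look roughly like the sketch below. The function
name and the choice of SHA-256 are assumptions. The length prefixes
are an addition that is not part of the suggestion; they only make
sure that two different splits of the same bytes do not hash
identically.)

```python
import hashlib


def combine_secrets(secret1: bytes, secret2: bytes) -> bytes:
    """Derive one key that can only be computed if both inputs are known."""
    h = hashlib.sha256()
    for secret in (secret1, secret2):
        # Length prefix (an addition): keeps (b"ab", b"c") and (b"a", b"bc")
        # from producing the same hash input.
        h.update(len(secret).to_bytes(8, "big"))
        h.update(secret)
    return h.digest()
```

Each input would be sealed under its own policy, and the derived value
would be used as the actual symmetric key. As noted above, this only
helps where the protected secret is symmetric and may exist in
cleartext on the host.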

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-30 Thread Lennart Poettering
On Mi, 29.05.24 14:42, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> > Hence, maybe tickets aren't the way to go, they bring complexity, they
> > would make a pretty relevant feature of our policies go down the drain
> > – even though they would combine the two relevant policies correctly.
>
> What about inserting an explicit delay into the boot process until the
> ticket expires?

Sorry, but no. That would be racy (since the TPM clocks are relatively
inaccurate afaics, unlike system clocks). Also it's one hell of an
ugly hack and given that TPMs are slow as fuck anyway and already slow
down boots measurably (heh, pun!) I am sure we shouldn't try to make
it even slower by inserting artificial sleeps...

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-29 Thread Lennart Poettering
On Mi, 29.05.24 17:00, Andrei Borzenkov (arvidj...@gmail.com) wrote:

> If you use pcrlock for more flexibility it will change into
>
> PolicyPCR(PCR1, PCR2, ...)
> PolicyAuthorize
> PolicyPCR(PCR3, PCR4, ...)
> PolicyOR(digest1, digest2, ...)
> PolicyAuthorizeNV
> Unseal

When you do this then the policy made up of the three expressions in
the middle would have to be stored in the nvindex. Which you
definitely can do, and this is exactly what I discussed below, see
below:

> > (you can of course include PolicyAuthorizeNV in the policy you sign
> > for PolicyAuthorize, but that doesn't work, since we want to pin the
> > local nvindex really, and allocate it locally, and the signer (i.e. the
> > OS vendor) cannot possibly do that. Or you could include the
> > PolicyAuthorize in the policy you store in the nvindex for
> > PolicyAuthorizeNV use, but that feels much less interesting since it
> > means the enforcement of the combination is subject to local,
> > delegated policy choices instead of mandated by the policy of the
> > actual object we want to protect)

this here is where i discuss what you are saying ^^^

so technically this works, but this means objects are effectively
protected by local policy only. And whether to also protect by OS vendor
policy is then a choice of the local policy, but not a choice of the
original object's policy anymore. Or in other words: that shifts
around who owns which part of the policy. Ideally we want that when I
create a protected object in the TPM I can say: "to unlock this you
*must* validate OS vendor policy *and* local pcrlock policy". But you
cannot do that. You can only say "to unlock this you *must* validate
local pcrlock policy", and then hope that that local policy also
enforces validation via OS vendor policy.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-29 Thread Lennart Poettering
On Mi, 29.05.24 10:36, Lennart Poettering (lenn...@poettering.net) wrote:

> But still, I am not ready to give up, there must be some other way I
> think, that I have missed so far.

I posted this on the tpm2-tss ML now:

https://lore.kernel.org/tpm2/ZlbtJ0jcy8rrUbUg@gardel-login/T/#u

Maybe they have an idea what we can do.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-shutdown disarms hardware watchdog when finished

2024-05-29 Thread Lennart Poettering
On Mi, 29.05.24 10:51, Andreas Svensson (andreas.svens...@axis.com) wrote:

> Hello,
>
> I have a system that should keep the hardware watchdog active while
> rebooting the system. It has worked fine up to systemd version v254.
>
> I noticed that since systemd version v254 my system stops the hardware
> watchdog after systemd-shutdown completes. I think it's the
> watchdog_free_device function that's responsible.
>
> The watchdog_free_device function will call watchdog_set_device(NULL) from
> watchdog.h. Since commit f81048f8 the watchdog will be disarmed and stopped
> if changed in watchdog_set_device.
>
> There's a comment just above watchdog_free_device in shutdown.c that
> contradicts what's actually happening right now: "Note that the watchdog is
> explicitly not stopped here".
>
> Is this the intended behavior? Anything I can do to get my system back to
> its behavior before version v254 where my hardware watchdog is still
> active/running after systemd-shutdown has finished?

Yes, this is a bug and a regression.

Can you file an issue on github about this please?

(even better provide a PR that fixes this)

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-29 Thread Lennart Poettering
On Di, 28.05.24 17:36, Demi Marie Obenour (d...@invisiblethingslab.com) wrote:

> > (you can of course include PolicyAuthorizeNV in the policy you sign
> > for PolicyAuthorize, but that doesn't work, since we want to pin the
> > local nvindex really, and allocate it locally, and the signer (i.e. the
> > OS vendor) cannot possibly do that. Or you could include the
> > PolicyAuthorize in the policy you store in the nvindex for
> > PolicyAuthorizeNV use, but that feels much less interesting since it
> > means the enforcement of the combination is subject to local,
> > delegated policy choices instead of mandated by the policy of the
> > actual object we want to protect)
>
> Does this work in practice?  I agree that this is ugly, but "ugly" might
> be better than "not working".

Well, it should work. I am still not ready to give up on finding a
better solution to this. For example, I have some vague hopes that we
can make TPM "tickets" work for this.

As I understand it, tickets would allow us to validate policies once,
which would give us back a "ticket" that is valid for a
specific time. Then we can bind the policies of other objects to the
availability of such valid tickets, and then combine two ticket
validations that way.

Superficially that would do what we need. i.e. if I get one ticket for
the signed PCR policy (i.e. for the PolicyAuthorize thing) and another
ticket for the pcrlock policy (i.e. the PolicyAuthorizeNV thing) then I
can build a policy checking both tickets and be fine.

Except that things aren't that easy (well, the above isn't precisely
"easy" either), because suddenly a time-out comes into play, and we
lose this nice "fuse blowing" feature of PCRs: i.e. while we boot we
measure the boot phase into PCR 11 after all, to ensure that secrets
that shall only be unlockable in, let's say, the initrd
cannot possibly be unlocked any later, because the PCR is "destroyed"
via the later phase measurement. If we use tickets we could still
unlock things till the end of the timeout, which we probably have to
pick large because of differences of boot speeds, hence this
compromises security quite a bit I'd say.
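
(The "fuse blowing" property follows from how PCR extension works:
the new value is a hash over the old value and the new measurement,
so an earlier value cannot be restored without a reset. A minimal
sketch, assuming the SHA-256 bank; the phase strings and the all-zero
start value are illustrative only, since on a real system PCR 11
already contains the UKI measurements at that point.)

```python
import hashlib

PCR_SIZE = 32  # SHA-256 bank


def pcr_extend(pcr: bytes, event: bytes) -> bytes:
    """new PCR value = SHA256(old PCR value || SHA256(event data))"""
    return hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()


pcr11 = bytes(PCR_SIZE)                      # illustrative start value
pcr11 = pcr_extend(pcr11, b"enter-initrd")   # illustrative phase string
value_in_initrd = pcr11                      # secrets get bound to this value

pcr11 = pcr_extend(pcr11, b"leave-initrd")   # later boot phase measurement
value_after_initrd = pcr11                   # initrd-only policies now fail
```

A policy bound to value_in_initrd stops matching the moment the next
phase is measured, no matter how much wall-clock time has passed. A
ticket with a timeout cannot give that guarantee.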

Hence, maybe tickets aren't the way to go, they bring complexity, they
would make a pretty relevant feature of our policies go down the drain
– even though they would combine the two relevant policies correctly.

But still, I am not ready to give up, there must be some other way I
think, that I have missed so far.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-28 Thread Lennart Poettering
On Di, 28.05.24 21:21, Andrei Borzenkov (arvidj...@gmail.com) wrote:

> On 28.05.2024 17:49, Lennart Poettering wrote:
> >
> > systemd-cryptenroll supports pin, literal PCR, signed PCR — in any
> > combination. (plus pcrlock, but that currently cannot be combined
> > with signed PCR, because afaics not expressible in the TPM policy language).
> >
>
> Why not? You can AND pcrlock with other policies just like currently literal
> PCR is ANDed with signed PCR. You can even use signed PCR in pcrlock policy
> - PolicyOR does not care what policies are combined, literal PCR (like is
> done currently) or signed PCR. Or what semantic do you have in mind that
> cannot be expressed?

pcrlock is ultimately a PolicyAuthorizeNV policy, and signed policies
use PolicyAuthorize. Both of these policy items do not *extend* the
policy so far enqueued, but *replace* it instead. (This is different
from policies such as PolicyPCR or PolicyAuthValue and so on, which
result in extension, i.e. "AND") Thus, there's no directly obvious
way to combine them.

(you can of course include PolicyAuthorizeNV in the policy you sign
for PolicyAuthorize, but that doesn't work, since we want to pin the
local nvindex really, and allocate it locally, and the signer (i.e. the
OS vendor) cannot possibly do that. Or you could include the
PolicyAuthorize in the policy you store in the nvindex for
PolicyAuthorizeNV use, but that feels much less interesting since it
means the enforcement of the combination is subject to local,
delegated policy choices instead of mandated by the policy of the
actual object we want to protect)

I have so far not found a nice way out of this problem. Seems to be a
limitation of the TPM policy language.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-28 Thread Lennart Poettering
On Mo, 27.05.24 22:42, Aleksandar Kostadinov (akost...@redhat.com) wrote:

> > if you want to use literal PCR policies like clevis does it, systemd
> > can do that for you just fine?
>
> clevis combines multiple methods and combinations. Like pin, PCRs (not
> signing), tang servers, but can be combined in different ways.

systemd-cryptenroll supports pin, literal PCR, signed PCR — in any
combination. (plus pcrlock, but that currently cannot be combined
with signed PCR, because afaics not expressible in the TPM policy language).

> > > P.S. also would be great if systemd also supported tang so that both -
> > > signed PCRs and tang to be required for automatic unlock.
> >
> > I am not convinced networked unlock with  really is something
> > relevant for anyone but a select few folks who run major data centers
> > and are willing to pay the price for doing the work. It's also just a
> > bunch of shell scripts last time I looked, or did that change? If so,
> > doubly uninterested.
>
> Actually my use case is to keep a remote private server where I was
> concerned about somebody taking the hardware away. So the network
> policy based encryption pretty much covered my main concerns. + TPM to
> make local data access more difficult but I don't really see this as a
> likely threat. And you can build the tang server with a raspberry or
> install it on an openrwt router. So definitely something close to
> trivial for anybody building a home server.
>
> I didn't go in depth into how tang and clevis worked. `tang` (the
> server https://github.com/latchset/tang) seems to be using a lot of c
> but also a lot of shell. If it is good for big datacenters, then it
> should be fine for me also.

The relevant pieces are all glued-together shell scripts:

https://github.com/latchset/clevis/blob/master/src/pins/tpm2/clevis-decrypt-tpm2

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How to set both name and altname of a NIC with a given device_addr

2024-05-27 Thread Lennart Poettering
On Mo, 27.05.24 16:59, Lars Petter Mostad (lar...@gmail.com) wrote:

> Hi,
>
> Currently I'm using a udev rule to set a known name for a network
> interface connected
> to certain pins on an SoC, then I use a .link file to set altnames for
> that interface.
> The udev rule matches the base address of the memory mapped registers of the
> MAC connected to the given pins (e.g. ATTR{device_addr}=="1af"). The .link
> file matches the OriginalName set by the udev rule.

Hmm, what? that's not an "original" name if it's already changed by the
udev rule.

Please do the renaming in .link files only, both the main name and the
alternative name. To match the device properly in the .link file,
given your slightly unusual sysattr match, please write a udev rule
that sets some udev property of your choice,
i.e. ENV{VENDORXYZ_MYFANCYDEVICE}=1 and then match against that prop
in the .link file via Property=. And stop matching via OriginalName=.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Long delay for ping Systemd 252.25 when DNSSEC is enabled

2024-05-27 Thread Lennart Poettering
On So, 26.05.24 18:52, Patrick ZAJDA (patr...@zajda.fr) wrote:

> Hello,
>
> I am on Debian Bookworm, SystemD 252.25 (bookworm-proposed-update).

That's a 2y old version of systemd. Even in current versions of
systemd DNSSEC support is experimental, but it should behave much
better. Please run something less ancient.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-27 Thread Lennart Poettering
On Mo, 27.05.24 14:47, Aleksandar Kostadinov (akost...@redhat.com) wrote:

> Excuse me for top-posting but I can second that. Earlier I had a long
> thread about not being able to get the signed PCRs work, I never
> figured out that a signature was only created for 11.
>
> It would really help people not to lose their time if documentation
> stated - there be dragons, go only if you want to become a TPM
> low-level details and linux boot expert.
>
> Eventually I went with clevis and tang. Although if systemd allowed
> signing with more PCRs, that would definitely be very useful.

clevis/tang does not allow signing PCRs, last time I looked.

It's really not comparable.

if you want to use literal PCR policies like clevis does it, systemd
can do that for you just fine?

systemd-cryptenroll --tpm2-pcrs= is for literal PCR enrollments.

You can combine that with --tpm2-public-key= stuff for PCR 11.

> If somebody from systemd team managed to use signed PCRs to unlock
> together with the new systemd-pcrlock for non-11 PCRs, please write a
> short how to install and what to do by kernel upgrade. Presently it is
> not usable for regular or advanced users. Which is fine as long the
> documentation doesn't suggest it is (and it presently does).

Yeah, I want a pony too, and I keep demanding one, but no one gives one
to me for free. Weird.

Honestly, maybe dial down your expectations a bit, both of you. All
this TPM support in systemd is fairly new, and it's definitely not
user facing stuff anyway (hence super-friendly docs are *not* my
priority, sorry, got enough on my plate), it's something distros
should integrate and we are only at the beginning of that path.

And complaining that things aren't just polished yet is certainly not
helping anyone to get the tiniest step ahead on that path. It just
annoys the people who you apparently believe work for you for free.

> P.S. also would be great if systemd also supported tang so that both -
> signed PCRs and tang to be required for automatic unlock.

I am not convinced networked unlock with  really is something
relevant for anyone but a select few folks who run major data centers
and are willing to pay the price for doing the work. It's also just a
bunch of shell scripts last time I looked, or did that change? If so,
doubly uninterested.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Bump: Journal file disk usage on frequently rebooted systems ... again

2024-05-27 Thread Lennart Poettering
On So, 26.05.24 10:23, Jens Schmidt (farb...@vodafonemail.de) wrote:

> 3.4MiB just to store 856 characters?

It stores structured logs for each of these entries, see "journalctl
-o verbose", i.e. a *lot* more data than you see in the simple output.

It also maintains an index for each field, so that "systemctl status" can
reasonably quickly show only the data for a specific unit, and so
on.

Which systemd version are you using?

In v252 many of the 64bit fields of the original journal format were
optionally reduced to 32bit, which makes the format a lot smaller.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-27 Thread Lennart Poettering
On Sa, 25.05.24 13:23, Andrei Borzenkov (arvidj...@gmail.com) wrote:

> These are PCRs for which you intend to provide signed policy. These PCRs
> must be listed in JSON file that is given to systemd-cryptsetup as
> tpm2-signature= parameter. The only PCR for which there is systemd tool to
> compute it is PCR 11. You should be able to add other PCRs to this JSON file
> and it should work, but you will need to compute the values yourself.
>
> Unfortunately, this is yet another case where systemd pretends to be generic
> while in reality it is not.

Hmm, where do we pretend anything?

We give you a tool to predict/sign the measurements for PCR 11 because
we can just do that from the UKI. For other PCRs it's a very different
story however.

(And we do provide a tool for that too nowadays btw, i.e. systemd-pcrlock).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] PCR signing / enrolling on UKI and validation by systemd-cryptenroll

2024-05-27 Thread Lennart Poettering
On Sa, 25.05.24 09:00, Felix Rubio (fe...@kngnt.org) wrote:

> Hi everybody,
>
> For some time now I have been using UKIs, with SB enabled and tying FDE
> decryption on PCRs 7+11+14, with the PCR 11 being measured during UKI
> creation. Then, I use systemd-cryptenroll to update the secret:
>
> 
> PCR11=$(/usr/lib/systemd/ukify -c /etc/kernel/uki.conf --measure
> --output=/tmp/arch-linux.efi build | grep 11:sha256)
> systemd-cryptenroll --unlock-key-file=/root/creds/fdepassword.txt
> --wipe-slot=tpm2 --tpm2-device=auto --tpm2-pcrs=7+11:sha256=d05ee4...+14
> /dev/nvme0n1p5
> 
>
> This works, flawlessly. Now, I am exploring the possibility to not bind to
> the value of those PCRS but to their signature, given that I am also
> embedding that in the UKI (the correspondent .pcrsig section is in place).
> However, I am a bit lost:
> * in .pcrsig there is only the signature for pcr11, and there seems to be no
> way to embed the signatures for other PCR values.

systemd-measure/ukify doesn't support embedding anything else, since those
measurements do not depend on the UKI but on external factors, hence
it makes little sense to include them in the UKI pcrsig section,
except for specialist cases where you know your hardware/systemd very
well and never update it separately from the kernel.

> * when used in cryptenroll, how should I use this? So far, seems should be a
> call like
> 
> systemd-cryptenroll --unlock-key-file=/root/creds/fdepassword.txt
> --wipe-slot=tpm2 --tpm2-device=auto
> --tpm2-public-key=/root/creds/tpm2-pcr-public.pem
> --tpm2-public-key-pcrs=
> 
>
> ... but then I do not see what should be provided in tpm2-public-key-pcrs.
> The same values I am currently giving to --tpm2-pcrs? the signatures that I
> get from the .pcrsig for 11 + the calculated signatures for the current
> values of the PCRs 7 and 14?

You can specify whatever you like there, as long as you then can
provide the right signature files. Thing though is that
systemd-measure/ukify won't prep those signatures for you, you'd have
to use a different tool for that. (Or prep a patch teaching literal
PCR specifications in systemd-measure, we would be open to merging
this I guess).

Or in other words: there are three parts to signed PCR policies:

1. enrollment
2. unlocking
3. signing

Of these steps 1 + 2 as implemented in systemd should just work for
PCRs other than 11. But step 3 is simply not implemented for them.

That all said, with recent systemd versions we added "systemd-pcrlock"
that is supposed to cover other PCRs than 11 nicely, maintaining a
local policy (which I think is much preferable for the other PCRs,
since they are dependent on local configuration, hardware and so on,
not OS constructs).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] keeping a backup ESP partition in sync

2024-05-27 Thread Lennart Poettering
t because of the writes done only in the primary ESP by
> firmware/sd-boot/sd-stub, right? If so, maybe this is indeed going to
> be very fragile...

The whole exercise is done to keep them bootable. So yes, the writes
done by firmware/boot loader are going to remain local to the ESP used
for booting, but that should be fine as long as after boot we ensure
the differences are evened out, i.e. that "bootctl random-seed" is
used from userspace to place a fresh random seed on every listed ESP,
that "bootctl update" updates the boot loader in every listed ESP and
that "kernel-install" copies kernels into every listed ESP and so on,
that "systemd-bless-boot" resets the boot counters for the booted
kernel on every ESP, and so on.

(the way I'd implement this, is not by actually teaching these
commands individual multi-ESP support, but simply by implementing a
single sync_esp() call or so which syncs the relevant info from
primary to secondary ESPs correctly, and that each of these commands
just calls as its last step. For single-ESP setups this call would be a NOP)
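
(To make the idea a bit more concrete, here is a rough sketch of what
such a call could do. Nothing like this exists in systemd today as far
as this thread is concerned; the signature, the whole-tree one-way
copy and the returned list are all assumptions for illustration. It
does not remove stale files from the secondaries and does not replace
files atomically, both of which a real implementation would have to
think about.)

```python
import filecmp
import shutil
from pathlib import Path
from typing import Iterable, List


def sync_esp(primary: Path, secondaries: Iterable[Path]) -> List[Path]:
    """One-way sync from the primary ESP to every secondary ESP.

    Returns the destination paths that were written. With no secondaries
    (single-ESP setups) this does nothing.
    """
    copied: List[Path] = []
    for secondary in secondaries:
        for src in sorted(primary.rglob("*")):
            if not src.is_file():
                continue
            dst = secondary / src.relative_to(primary)
            # Skip files whose contents are already identical.
            if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
                continue
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied
```

Each of the tools mentioned above would then do its work on the
primary ESP as before and finish with something like
sync_esp(primary, secondaries).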

Yes, it's a bit of work.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Measured systemd-sysext

2024-05-24 Thread Lennart Poettering
her have a much much simpler lsm-bpf as alternative, that
just does this one thing and nothing else. IMA keeps its logs in
kernel memory, unbounded, with no mechanism for rotation, which I
personally find a complete dealbreaker.)

So much about my current ideas regarding all this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-05-24 Thread Lennart Poettering
On Fr, 24.05.24 10:12, Lennart Poettering (lenn...@poettering.net) wrote:

> And that's really all.
>
> To summarize, a unit file like this:
>
> [Unit]
> Description=TEE Supplicant on %i
> Documentation=man:tee-supplicant(8)
> DefaultDependencies=no
> After=dev-%i.device
> Wants=dev-%i.device
> Conflicts=shutdown.target
> Before=sysinit.target shutdown.target
>
> [Service]
> ExecStart=@sbindir@/tee-supplicant -d /dev/%I

So, I looked at the man page for that daemon:

https://manpages.debian.org/testing/tee-supplicant/tee-supplicant.8.en.html

This seems like the service is simply not suitable for running in the
initrd, i.e. it stores its data in /var/lib/optee-client/data/tee, but
/var/ is only available in late boot. During the initrd and even after
the initrd→host transition, until local-fs.target and
systemd-remount-fs.service have been invoked /var/ is not available.

Hence, what you are trying to do is not going to fly: you need to move
the service to early boot for disk encryption to work, but the service
wants to store stuff on the disk, hence only can run after disk
encryption succeeded. That means it simply doesn't work out.

(Except of course if that man page is completely out-of-date and the
service is nowadays fine with running with just /run/ around, and
without touching /var/ whatsoever).

(Also, the thing looks fishy generally, as it references /lib/, but
that's a legacy dir; in systemd we nowadays require merged /usr/ and
hence do not support a separate /lib/)

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-05-24 Thread Lennart Poettering
> enqueued. So are you sure your udev rule even works?
>
> The udev rule works. With debug logs I see:
>
> https://ledge.validation.linaro.org/scheduler/job/87556
>
> tee-supplicant@teepriv0.service: starting held back,
> waiting for: basic.target

Yupp, that's the DefaultDependencies=no thing.

> So I think I need to disable the default dependencies with
> DefaultDependencies=no
> With my limited understanding I think I would also need the tpm2.target to
> wait for tee-supplicant startup. "WantedBy=sysinit.target tpm2.target" would
> do it? Or alternatively in tpm2.target
>
> After=dev-tpmrm0.device dev-teepriv0.device
> Wants=dev-tpmrm0.device dev-teepriv0.device

Why should tpm2.target wait for tee-supplicant startup though? it
should be enough for the /dev/tpmrm0 device to materialize, no? And
that happens when tee-supplicant does its thing, hence there's no need
to explicitly wait for that, is there?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] tee-supplicant initrd startup before tpm2.target and dev-tpmrm0.device

2024-05-23 Thread Lennart Poettering
On Do, 23.05.24 10:54, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Hi,
>
> I'm running in circles and failing to start optee userspace daemon
> tee-supplicant correctly with systemd in initrd.
>
> In certain firmware/HW configurations with optee and firmware TPM trusted
> application, the setup needs tee-supplicant to start in initrd userspace
> before the fTPM kernel module gets enumerated, but I'm failing to express
> this in the systemd service dependencies.
>
> TPM usage in firmware is being detected correctly and tpm2.target is queued
> correctly, but the dev-tpmrm0.device is not found since
> tee-supplicant@teepriv0.service is not getting started before it.
>
> optee kernel driver is loaded and working. /dev/teepriv0 is
> generated by udev but not

Note that udev does not generate device nodes. The kernel does. udev
just chmods/chown/acls it and maintains metadata about it.

> before dev-tpmrm0.device.
>
> tee-supplicant@.service:
>
> [Unit]
> Description=TEE Supplicant on %i
>
> [Service]
> User=root

This line is redundant.

> EnvironmentFile=-@sysconfdir@/default/tee-supplicant
> ExecStart=@sbindir@/tee-supplicant $OPTARGS
>
> [Install]
> WantedBy=basic.target

Usually you'd hook services into "sysinit.target" not
"basic.target". The job of "basic.target" is really to combine
sysinit.target (i.e. early-boot services), local-fs.target
(i.e. local mounts), swaps.target (swaps), sockets.target (well, you
guessed it), and so on.

Hence, if you plug in services, use sysinit.target.

> udev rule is:
>
> KERNEL=="tee[0-9]*", MODE="0660", OWNER="root", GROUP="teeclnt", \
> TAG+="systemd"
>
> # If a /dev/teepriv[0-9]* device is detected, start an instance of
> # tee-supplicant.service with the device name as parameter
> KERNEL=="teepriv[0-9]*", MODE="0660", OWNER="root", GROUP="teeclnt", \
> TAG+="systemd", ENV{SYSTEMD_WANTS}+="tee-supplicant@%k.service"
>
> So basically dev-tpmrm0.device depends on tee-supplicant@teepriv0.service
> started on dev-teepriv0.device by udev. How to express this dependency?

I am not sure I grok this dependency chain?

What do you mean by ordering the service against dev-tpmrm0.device?
why would you order this? I mean, when
tee-supplicant@teepriv0.service is invoked it will do its thing and
synthesize a /dev/tpmrm0, right?

Generally, you cannot really order device units, it's not under
systemd's unit engine's control when they show up. They show up when
user plugs in a device, or udev triggers a device or the kernel
otherwise probes and makes a device available, and that can be any
time. So we can *wait* for devices, and we can sometimes call tools
that synthesize synthetic devices, but we cannot order arbitrary
devices, that simply is out of our control.

> I tried to queue tee-supplicant@.service with "Wants: tpm2.target" but that
> did not work and seems wrong. The dependency is earlier to the kernel
> /dev/tpmrm0 device node.
> Then I tried to amend the teepriv udev rule to
> ENV{SYSTEMD_WANTS}+="tee-supplicant@%k.service tpm2.target" and
> ENV{SYSTEMD_BEFORE}+="tpm2.target" but this did not work either. I must be
> doing this somehow wrong. Any ideas what would work?
>
> Example serial log from a rockpi4b board where fTPM is failing to be detected
> in initramfs since tee-supplicant wasn't started:
> https://ledge.validation.linaro.org/scheduler/job/87532

This shows an ordering cycle. Address that first. If you have an
ordering cycle systemd will drop jobs from the initial transactions in
an attempt to fix it, but it's not always clear that the one it drops
is the best one to drop.

The logs do not show that your "tee-supplicant@.service" unit gets
enqueued. So are you sure your udev rule even works?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about propagation of INVOCATION_ID and JOURNAL_STREAM env variables in Desktop Environments

2024-05-22 Thread Lennart Poettering
On Mi, 22.05.24 17:13, Nop (ctx...@gmail.com) wrote:

> Hello folks,
> I have a question about what you guys considers to be the right/expect way.
>
> I read documentation a bit about INVOCATION_ID and JOURNAL_STREAM and, to
> me, it feels like those two variables should not be propagated from
> DE.

What precisely do you mean by "DE"? Most desktop environments these
days use systemd as service manager for per-user services too, and
hence they'll get a properly initialized INVOCATION_ID set for each
service.

If you use a traditional DE that does not use systemd, then you
shouldn't have INVOCATION_ID set, since you are invoked from a PAM
session, not from a system service, hence nothing to clean up.

Hence, I am not really grokking your question. Either you buy into
systemd or you don't. If you do you should be in the clear and have
valid invocation IDs, and if you don't you should also be clear and
not have the variable set at all. So all should be good?

> I mean, if I start KDE Plasma, for example, using systemd, it will receive
> an $INVOCATION_ID. Now I start any application by clicking around, it will
> inherit and get the very same $INVOCATION_ID.

Applications are usually started via systemd too, so no?

> If the application happen to be konsole or kitty (terminal
> emulator), it still inherit the variables, so that any command run
> inside this terminal emulator also inherit from it.  Feels really
> weird to me, no?

Graphical terminal apps should really not let INVOCATION_ID leak into
their sub-sessions, they are kinda their own thing then.

> I checked a bit what gnome is doing and it confused me even more: all
> applications inherit the variables but the 'gnome-terminal' filters out few
> variables (including those two) so that commands don't have it:
> https://gitlab.gnome.org/GNOME/gnome-terminal/-/blob/master/src/terminal-client-utils.cc#L227

Seems it's doing things right then.

> What is your take on this? At which level the filter should occurs if it
> should even occurs at all?

When you open a new "sub-session" or so, and fork processes off down
the tree, then it might make sense to unset these env vars for them.
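
A minimal sketch of that, assuming a POSIX shell. The helper name
run_in_subsession is made up for illustration; it is not something systemd
or any terminal emulator ships. The variables are unset in a subshell only,
so the calling process keeps its own values:

```shell
# Hypothetical helper: run a command with the service manager's
# per-invocation variables removed from its environment.
run_in_subsession() {
    (
        # Unsetting inside a subshell leaves the caller's environment untouched.
        unset INVOCATION_ID JOURNAL_STREAM
        exec "$@"
    )
}

# Example: start a shell that no longer sees the two variables.
# run_in_subsession "${SHELL:-/bin/sh}"
```

A terminal emulator would do the equivalent in its own language right
before spawning the child shell.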

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] keeping a backup ESP partition in sync

2024-05-22 Thread Lennart Poettering
On Fr, 17.05.24 11:03, Alexander Gordeev (a...@gordius.net) wrote:

> Hi,
>
> I've tried systemd-boot recently, I like it a lot. Thanks!
> There is still one concern. I'd like to have a backup EFI partition
> because you know things can happen and my rootfs is on a mirror
> anyway. There is a popular approach with setting up a mdraid version
> 1.0 to sync the ESPs. I don't like it because (1) FAT32 is not super
> reliable and (2) if there is a power outage when a partial state is
> written, then issues can happen, I think.

Yeah, RAID on ESP is a *bad* idea if implemented by the OS. UEFI has
write access to the ESP, and this is *actively* used by both firmware
stuff and by sd-boot/sd-stub to maintain try counters, random seeds
and so on. Thus, whenever you boot the fs is written to, and that
hence means on every single boot your RAID array will come up dirty.

If you have some firmware that does RAID natively you could probably
do ESP-on-RAID, but without it it's a recipe for disaster, not a
recipe for robustness.

> I think it is better to have them mounted as e.g. /boot/efi and
> /boot/eficopy and make changes like this:
> 1. update /boot/efi
> 2. make sure the update is actually written to the device
> 3. update /boot/eficopy
>
> Right now I do this manually with rsync. I'm thinking about adding
> kernel/initramfs/dpkg hooks. Maybe there are easier ways to do it?
> Otherwise maybe this feature is desirable in systemd-boot?

I don't see why systemd-boot would care about multiple disks – however
I do agree that for systems with many disks it might make sense to
teach *bootctl* some limited support for an ESP that exists in
multiple copies on multiple devices.

Hence, if somebody sends a patch that teaches "bootctl install" and
"bootctl update" and the others to deal with multiple ESPs then I
guess I'd be on board with that.

That said, the intended semantics for that are not clear to me at
all. i.e. there are some options:

1. mount the current ("primary") ESP to /efi/, and operate exclusively
   on that, except that at the very end after syncing the ESP is dd'ed
   on the block level onto a set of matching partitions on other HDDs
   without any consideration of their current contents.

2. mount the current ("primary") ESP to /efi/, and expect that
   "secondary" ESPs are mounted to /efi.mirror/$DEVNAME/ or so, and
   then first operate on the primary ESP, and then only sync a very
   specific subset of dirs from the primary to the secondary ESPs,
   i.e. /loader/, /efi/Linux and /efi/systemd. Syncing would be
   "dumb", i.e. stupidly copy over, and remove dentrys not existing in
   the source.

   This is far from trivial to implement, because how would we even
   decide what to mount to /efi.mirror/$DEVNAME/, how would we expect
   users to mark the set of partitions? probably would require some
   udev rule, but that creates messy problems around waiting for these
   mirrors on boot (because we do update the ESP automatically at
   boot, for updating the random seed automatically, and more). After
   all it should be OK if mirrors go missing, but that means we cannot
   really delay booting waiting for them anymore (because we cannot
   distinguish the case "device is just slow to pop up" from the case
   "device is dead").

2b. same as 2, but try to be "smart" with syncing, i.e. look at file
    mtimes, and let the newer versions win. Probably doomed to fail,
    due to clock/timezone unreliability in early boot and in
    particular firmware writes.

3. some scheme where there's no primary nor secondary, but just an
   equal set of partitions. This is harder than it sounds, since it
   raises questions what to do if updating some partitions works but
   things fail on others: do we undo the first change again, or do we
   just continue? if we declared one of the copies as "primary" (as
   suggested above) this problem goes away somewhat, since it would
   mean we could have strict success rules on the "primary" copy, and
   lax rules on the "secondary" copies. This also would have the
   problem that 3rd party tools are generally not ready to deal with
   the fact that there's more than one equivalent ESP.

Hence, approach 2 is probably the best, but the waiting issue is a
major headache. it would probably mean we store away the list of
primary+secondary ESPs we have seen so far in a file in the ESP (which
is then sync'ed to all). And then add "bootctl wait-secondary-esps" or
so as a new tool that waits for them to show up, with some time-out
applied. But, uh, this gets so involved so quickly. (as you then
probably also need "bootctl add-secondary-esp" and "bootctl
remove-secondary-esp")
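
To make the "dumb" sync of approach 2 a bit more concrete, here is a rough
sketch, nothing more. The function name is hypothetical, the directory list
is the subset named above written relative to the ESP root (the exact
spelling is an assumption), and it ignores the whole waiting problem. The
remove-then-copy step is also not atomic, so a crash in the middle leaves
the mirror incomplete:

```shell
# Hypothetical sketch of the "dumb" sync from option 2: copy a fixed set of
# directories from the primary ESP to one secondary ESP, dropping anything
# on the secondary that no longer exists on the primary.
sync_esp_subset() {
    primary=$1      # mount point of the primary ESP, e.g. /efi
    secondary=$2    # mount point of one mirror, e.g. /efi.mirror/sdb1

    for dir in loader EFI/Linux EFI/systemd; do
        # Remove the old copy first so stale entries cannot survive.
        rm -rf "${secondary:?}/$dir"
        if [ ! -d "$primary/$dir" ]; then
            continue
        fi
        mkdir -p "$secondary/$(dirname "$dir")"
        cp -R "$primary/$dir" "$secondary/$dir"
    done

    # Ask the kernel to flush the copies to disk before returning.
    sync
}
```

Anything outside the listed directories on the secondary is left alone,
which is the point of syncing only a specific subset.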

But anyway, if this matters to you, feel free to send a patch for
this, but it's not really a job for a day or two, it's much more
involved than one might think.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd prerelease 256-rc1

2024-04-26 Thread Lennart Poettering
On Fr, 26.04.24 09:49, Neal Gompa (ngomp...@gmail.com) wrote:

> > Well, people moved off split-usr quite successfully, which is a bigger
> > feat than cleaning up the /boot/efi/ mess I'd say.
> >
> > Fedora is currently merging /usr/bin/ and /usr/sbin/, which I am pretty
> > sure is a bigger change too.
>
> Neither of those involved screwing with mountpoints and changing code
> around bootloaders.
>
> From a distribution perspective, UsrMerge and the bin+sbin merge are
> significantly simpler things.

Well, believe what you want. But even in Fedora it's probably <= 15
packages that care about the EFI mount point.

the sbin/bin merge and the usr merge otoh touched pretty much *every*
package in the repo.

I think the reproducible build stuff currently being done in Fedora is
also going to be a harder thing to do.

But anyway, we can certainly agree that we have different
concepts/metrics of "hard" or "easy" tasks.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd prerelease 256-rc1

2024-04-26 Thread Lennart Poettering
On Fr, 26.04.24 09:47, Neal Gompa (ngomp...@gmail.com) wrote:

> > > > * systemd-gpt-auto-generator will stop generating units for ESP or
> > > >   XBOOTLDR partitions if it finds mount entries for or below the /boot/
> > > >   or /efi/ hierarchies in /etc/fstab. This is to prevent the generator
> > > >   from interfering with systems where the ESP is explicitly configured
> > > >   to be mounted at some path, for example /boot/efi/ (this type of
> > > >   setup is obsolete, but still commonly found).
> > >
> > > This is not obsolete. Please do not say it is when it is not true.
> >
> > Uh, we mark outdated concepts as obsolete all the time. You might
> > disagree with that, but that doesn't change the fact that from our PoV
> > /boot/efi/ is obsolete, just like split /usr/, or cgroupv1.
> >
> > Nesting /efi/ in /boot/ is bad for plenty reasons, as has been widely
> > discussed, so I am not going to repeat this here. And this has been
> > communicated for multiple years now, and all the automatisms in
> > systemd do not work for such a setup, hence I think saying that this
> > setup is obsolete by now is not an understatement.
> >
> > I know that Fedora is sadly behind on boot loader topics, but that's
> > no reason for changing our stance from systemd upstream on these
> > things.
>
> There are fewer distros using /efi than /boot/efi, and no major
> distributions that use /efi.
>
> Complaining about it being a Fedora thing (which I guess I need to
> remind this audience that I am involved in more than Fedora, and every
> distribution I work on does use /boot/efi instead of /efi) is weird
> since it's not just Fedora. It's pretty much everyone.

Yeah, as the NEWS entry says, /boot/efi/ is commonly found. So?
Doesn't change the fact it's a bad idea and from systemd's PoV an
obsolete concept.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd prerelease 256-rc1

2024-04-26 Thread Lennart Poettering
On Fr, 26.04.24 10:39, Dan Nicholson (d...@endlessos.org) wrote:

> On Fri, Apr 26, 2024 at 10:11 AM Adrian Vovk  wrote:
> >
> > Perhaps Fedora can be adjusted to follow the BLS's recommended mount points?
>
> The problem with all of these type of "we've realized a better way and
> the old way is obsolete" is that it's left as someone else's issue to
> actually change existing users from the obsolete way. I've written
> code to migrate away from some old setup several times at Endless and
> it's always scary that you're going to screw a whole class of users
> and the only way out of that will be manual intervention. That's
> doubly so for something like this where it's touching critical boot
> files. Doing something wrong there may make someone's system unusable.
>
> So, while I do agree with the sentiment that /boot/efi is a bad idea
> and should not be done anymore, I have a lot of sympathy for Fedora
> continuing to use it.

Well, people moved off split-usr quite successfully, which is a bigger
feat than cleaning up the /boot/efi/ mess I'd say.

Fedora is currently merging /usr/bin/ and /usr/sbin/, which I am pretty
sure is a bigger change too.

No one here has any illusions, this is not going to be fixed from today
to tomorrow, just like the usr-merge wasn't done in a day or the
sbin-merge doesn't happen in a single day either. But I am very sure
we shouldn't let the Linux platform stagnate like this. I think it
really should be time to clean up /boot/efi/, we don't want people
to get bored after the sbin-merge is complete, after all!

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd prerelease 256-rc1

2024-04-26 Thread Lennart Poettering
On Do, 25.04.24 18:52, Neal Gompa (ngomp...@gmail.com) wrote:

> > * systemd-gpt-auto-generator will stop generating units for ESP or
> >   XBOOTLDR partitions if it finds mount entries for or below the /boot/
> >   or /efi/ hierarchies in /etc/fstab. This is to prevent the generator
> >   from interfering with systems where the ESP is explicitly configured
> >   to be mounted at some path, for example /boot/efi/ (this type of
> >   setup is obsolete, but still commonly found).
>
> This is not obsolete. Please do not say it is when it is not true.

Uh, we mark outdated concepts as obsolete all the time. You might
disagree with that, but that doesn't change the fact that from our PoV
/boot/efi/ is obsolete, just like split /usr/, or cgroupv1.

Nesting /efi/ in /boot/ is bad for plenty of reasons, as has been widely
discussed, so I am not going to repeat this here. And this has been
communicated for multiple years now, and all the automatisms in
systemd do not work for such a setup, hence I think saying that this
setup is obsolete by now is not an understatement.

I know that Fedora is sadly behind on boot loader topics, but that's
no reason for changing our stance from systemd upstream on these
things.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Fastest way to dump last X Mo of logs from the journal ?

2024-04-25 Thread Lennart Poettering
On Do, 25.04.24 12:49, Andy Pieters (syst...@andypieters.me.uk) wrote:

> On Thu, 25 Apr 2024 at 12:48, Lennart Poettering 
> wrote:
>
> > On Mi, 24.04.24 14:48, Etienne Champetier (champetier.etie...@gmail.com)
> > wrote:
> >
> >
> > what is "last X Mo" supposed to mean? is "mo" supposed to mean months?
> > thus: show logs from a given number of most recent months? if so, just
> > use:
> >
> megabytes (mega octets in French)

oh, wow. weird.

megabytes of what though? of formatted text? or of a journal file on disk?

such a weird request...

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Fastest way to dump last X Mo of logs from the journal ?

2024-04-25 Thread Lennart Poettering
On Mi, 24.04.24 14:48, Etienne Champetier (champetier.etie...@gmail.com) wrote:

> Hi all,
>
> sos report includes the last X Mo of logs, sometimes filtered,
> sometimes not

what is "last X Mo" supposed to mean? is "mo" supposed to mean months?
thus: show logs from a given number of most recent months? if so, just
use:

   journalctl --since=-3month

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Custom nobody user/group name not equivalent

2024-04-17 Thread Lennart Poettering
On Mi, 17.04.24 17:17, Opty (opt...@gmail.com) wrote:

> Hello,
>
> when using
>
> -8<-
>   -Dnobody-user=overflowuid \
>   -Dnobody-group=overflowgid \
> -8<-
>
> in Slackware I got
>
> -8<-
> ../meson.build:1081: WARNING:
> The configured user name "overflowuid" and group name "overflowgid" of
> the nobody user/group are not equivalent.
> Please re-check that both "nobody-user" and "nobody-group" options are
> correctly set.
> -8<-
>
> because of
>
> -8<-
> commit 8374cc623579e57ae79b62fce2f11627957148e2
> Author: Yu Watanabe 
> Date:   Thu Dec 7 17:19:11 2017 +0900
>
> meson: warn if nobody-user and nobody-group are set to different name
>
> It may work, but is very strange. So, let's warn about that.
>
> v2:
> Debian uses nobody and nogroup. Do not warn such case.
> -8<-
>
> Why do you find different names "very strange" but allow Debian to use
> nobody and nogroup?

This is a place where distros should not depart from each
other. Calling the user "nobody" and the group the same is simply the
least surprising thing: it's commonly understood that users which have
their own matching groups should also name them the same
way. Departing from that rule just to be different is just annoying.

This is a warning, to push distros to just stop trying to be different
in this corner case, it's a waste of brain cells having to deal with
pointless differences like this everywhere.

Let me turn this around: why do you think it's a great idea for
Slackware being its own thing and naming these groups completely
differently from everyone?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-04-16 Thread Lennart Poettering
On Di, 16.04.24 15:02, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Hi,
>
> On Mon, Apr 15, 2024 at 05:41:00PM +0200, Lennart Poettering wrote:
> > Would be good to have that with systemd.log_level=debug, to see if
> > tpm2.target even gets enqueued.
>
> Here is the verbose log:
>
> https://people.linaro.org/~mikko.rapeli/systemd_255_tpm2_target_qemu_swtpm_boot_encryption_failure.txt

So that shows that tpm2.target is only enqueued once the /dev/tpmrm0
device actually shows up. But that makes it useless. The idea is that
the target is already enqueued when during very early setup in
systemd-tpm2-generator we come to the conclusion that "yes, a tpm2
device has not been found and set up by Linux yet, but the firmware
indicates there is one, hence let's schedule a job for it, that
everything can sync on". But this determination never happened
here, tpm2.target was never enqueued, hence never acted as a
synchronization milestone.

(As a temporary hack you can *force* systemd-tpm2-generator to assume
that a TPM device will show up via systemd.tpm2_wait=1 on the kernel
cmdline, and thus enqueue tpm2.target. But that's only suitable as
local hack: we should be able to determine all this automatically
based on firmware properties, see below.)

> System is qemu arm64 with UEFI / ARM System Ready compatible firmware,
> secure boot and TPM2 device via swtpm.

So this firmware implements UEFI and ACPI? As an indication whether the
firmware supports TPM2, we check for the existence of the
/sys/firmware/acpi/tables/TPM2 ACPI table. Does that exist for you?

See src/shared/efi-api.c, function efi_has_tpm2().

Do you have /sys/kernel/security/tpm0/binary_bios_measurements?

> systemd-tpm2-setup-early.service: ConditionSecurity=measured-uki
> failed.

So this suggests you haven't booted the system with a UKI or that your
firmware doesn't actually do TPM.

Or in other words: ConditionSecurity=measured-uki will only hold if
the aforementioned ACPI table exists *and* the StubPcrKernelImage EFI
variable is found to be set.
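
If you want to check all of that in one go, something like the following
would do. It is only a sketch: the function name is made up, the efivarfs
mount point is assumed to be the usual /sys/firmware/efi/efivars, and the
optional prefix argument exists only so the function can be tried against a
fake tree. Reading securityfs may require root:

```shell
# Hypothetical helper: report whether the firmware-side hints mentioned
# above are visible. An optional prefix allows testing against a fake tree.
check_tpm2_hints() {
    root=${1:-}

    for path in \
        /sys/firmware/acpi/tables/TPM2 \
        /sys/kernel/security/tpm0/binary_bios_measurements
    do
        if [ -e "$root$path" ]; then
            echo "present: $path"
        else
            echo "missing: $path"
        fi
    done

    # EFI variables appear in efivarfs as <Name>-<VendorGUID>; match by name only.
    found=missing
    for var in "$root"/sys/firmware/efi/efivars/StubPcrKernelImage-*; do
        if [ -e "$var" ]; then
            found=present
        fi
    done
    echo "$found: StubPcrKernelImage EFI variable"
}

# Example: check_tpm2_hints
```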

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-04-15 Thread Lennart Poettering
On Mo, 15.04.24 17:41, Lennart Poettering (lenn...@poettering.net) wrote:

> > or the services needed for systemd-repart config with Encrypt=tpm2
>
> Ah, repart is interesting. We are missing the tpm2.target dependency
> there. That's a bug. Will fix.

→ https://github.com/systemd/systemd/pull/32283

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-04-15 Thread Lennart Poettering
On Mo, 15.04.24 17:23, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Hi,
>
> On Mon, Apr 15, 2024 at 04:02:46PM +0200, Lennart Poettering wrote:
> > On Mo, 15.04.24 10:38, Mikko Rapeli (mikko.rap...@linaro.org) wrote:
> >
> > > Hi,
> > >
> > > On Fri, Apr 12, 2024 at 05:03:18PM +0300, Aleksandar Kostadinov wrote:
> > > > Shouldn't the kernel automatically load the necessary modues when
> > > > devices are detected... given proper udev rules and module
> > > > availability in the initrd filesystem? I guess it depends on how you
> > > > build your initrd system for including them.
> > >
> > > The modules do get loaded but too late in the initramfs stage and
> > > something in the tpm2.target and related service was failing and
> > > creating TPM2 encrypted rootfs fails. I could not figure out at which
> > > stage the driver needs to be loaded, e.g.
> > > Before: modprobe@tpm_tis_core.service modprobe@tpm_tis.service modprobe@tpm_ftpm_tee.service
> > >
> > > But I'm also trying to fix the root causes why TPM modules can't be
> > > built into the kernel so lucky that resolves these issues. Would be nice
> > > to know to which stage the TPM2 module loading would need to happen
> > > though.
> >
> > This should not require manual handling. The driver should be
> > auto-loaded via udev and stuff, like any other driver. Or does the
> > "tpm-ftpm_tee" thing carry no modalias info that autoloads it if some
> > specific hw is around?
>
> With latest rebase/update from systemd 254 to 255 I'm not yet testing on fTPM
> devices but trying to get TPM2 backed rootfs generated with qemu and swtpm
> which required basic tpm_tis_core and tpm_tis modules to be loaded. udev does
> load them but too late for tpm2.target

mm? the only purpose of tpm2.target is to only show up once
/dev/tpmrm0 actually has materialized. It comes with these two deps:

   After=dev-tpmrm0.device
   Wants=dev-tpmrm0.device

This basically means that if tpm2.target is enqueued into the boot, it
will delay it forever basically — unless /dev/tpmrm0 shows up.

Are you sure tpm2.target even gets enqueued for you though?

That's normally the responsibility of systemd-tpm2-generator, whose
job is to see if the firmware reported that it talked to a TPM2
device. And if it did, then it will assume the device will show up on
Linux too if we just wait long enough for that, and for that it
enqueues tpm2.target.

I really have no idea about your platform, but if this is not an
ACPI/UEFI device, then you have to enqueue tpm2.target some other way
if you determine that "yes, the device has a tpm2 device, but no, Linux
hasn't found it yet". Probably means adding another generator (or if
this is reasonably generic, then we can also add it to the upstream
systemd-tpm2-generator, for example if this info is reasonably
available from DT metadata or so).
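
Very roughly, such an extra generator could look like the sketch below.
Everything platform-specific in it is an assumption: the function name, the
"hint" path standing in for whatever check your platform offers, the unit
directory, and the choice to hook tpm2.target into sysinit.target via a
.wants/ symlink in the generator output directory:

```shell
# Hypothetical generator sketch: pull tpm2.target into the boot transaction
# when some platform-specific hint says a TPM2 will appear later.
enqueue_tpm2_target() {
    dest=$1                                   # "normal" generator output directory
    hint=$2                                   # placeholder for the platform's own check
    unit_dir=${3:-/usr/lib/systemd/system}    # assumed location of tpm2.target

    # No hint, no dependency: boot must not wait for a device that never comes.
    if [ ! -e "$hint" ]; then
        return 0
    fi

    mkdir -p "$dest/sysinit.target.wants"
    ln -sf "$unit_dir/tpm2.target" "$dest/sysinit.target.wants/tpm2.target"
}

# As a generator, systemd would pass the output directory as the first argument:
# enqueue_tpm2_target "$1" /path/to/platform/hint
```

The important property is the early return: if the hint is wrong and the
target gets enqueued anyway, boot waits for a device that never shows up.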

> or the services needed for systemd-repart config with Encrypt=tpm2

Ah, repart is interesting. We are missing the tpm2.target dependency
there. That's a bug. Will fix.

> to work. Changing TPM drivers to built-in fixed all these issues, and I'm now
> able to do this since I have the RPMB in kernel patches applied and
> tee-supplicant is not needed anymore. The issue with TPM drivers as modules
> was somewhere in the mount of the newly created TPM2 backed filesystem,
> possibly ConditionSecurity=measured-uki failing.
>
> Full boot log in: https://pastebin.com/raw/6xy5x5NP

Would be good to have that with systemd.log_level=debug, to see if
tpm2.target even gets enqueued.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-04-15 Thread Lennart Poettering
On Mo, 15.04.24 10:38, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Hi,
>
> On Fri, Apr 12, 2024 at 05:03:18PM +0300, Aleksandar Kostadinov wrote:
> > Shouldn't the kernel automatically load the necessary modues when
> > devices are detected... given proper udev rules and module
> > availability in the initrd filesystem? I guess it depends on how you
> > build your initrd system for including them.
>
> The modules do get loaded but too late in the initramfs stage and something
> in the tpm2.target and related service was failing and creating TPM2 encrypted
> rootfs fails. I could not figure out at which stage the driver needs to be
> loaded, e.g.
> Before: modprobe@tpm_tis_core.service modprobe@tpm_tis.service modprobe@tpm_ftpm_tee.service
>
> But I'm also trying to fix the root causes why TPM modules can't be built
> into the kernel so lucky that resolves these issues. Would be nice to know to
> which stage the TPM2 module loading would need to happen though.

This should not require manual handling. The driver should be
auto-loaded via udev and stuff, like any other driver. Or does the
"tpm-ftpm_tee" thing carry no modalias info that autoloads it if some
specific hw is around?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Serial console flow control will stuck systemd

2024-04-12 Thread Lennart Poettering
On Fr, 12.04.24 09:33, 细石泉 (nicheln...@gmail.com) wrote:

> systemd will write log to console directly when bootup. A unexpected serial
> console flow control maybe block systemd at embedded systems. I guess
> systemd doesn't do a good job of initializing the serial console.

It doesn't do any job of it. We try hard to not reconfigure the serial
port at all, and we tell getty to not do this either. The assumption
is that whatever you configure on the kernel cmdline should remain in
effect all the way through.

> If a noise XOFF(HEX13) generate when systemd bootup, systemd stucked.
> Should systemd turn off any flow control stuff when initializing the serial
> console?

Tell the kernel what kind of flow control you want, we should not
reconfigure things then.

Note that iirc XON/XOFF handling is enabled by default on most
linux ttys, i.e. "stty" will generally report "ixon" on terminals,
including graphical ones. And C-S/C-Q is generally understood to just
work to suspend terminal output. Hence, turning this off would
probably be quite confusing to most.
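
To see what a given terminal currently does, you can look at the stty
output. A small sketch, where the function name is made up and the device
name in the usage comment is just an example:

```shell
# Hypothetical helper: read `stty -a` output on stdin and print "ixon" if
# XON/XOFF output flow control is enabled, or "-ixon" if it is disabled.
ixon_state() {
    # Put every token on its own line, then match the flag as a whole line
    # so that "-ixoff" and friends are not picked up by accident.
    tr ' ;' '\n\n' | grep -x -e 'ixon' -e '-ixon' | head -n 1
}

# Example, for a serial console on ttyS0 (device name is an assumption):
# stty -a < /dev/ttyS0 | ixon_state
```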

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How to debug systemd services failing to start with 11/SEGV?

2024-04-09 Thread Lennart Poettering
On Di, 09.04.24 14:42, Alexander Dahl (a...@thorsis.com) wrote:

> Hello everyone,
>
> I'm currently trying to build a firmware for an embedded device and
> running into trouble because systemd seems to crash.  The BSP is
> based on pengutronix DistroKit (master) built with ptxdist and the
> target is the Microchip SAM9X60-Curiosity board, which is arm v5te
> architecture (that board is not part of DistroKit, support for that is
> in an upper layer of mine not public yet (?)).
>
> Everything is quite recent, building systemd version 255.2 currently.
> On startup I get messages like this (this is the first one, later on
> there are lot more, all with the same status):
>
>[   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>[   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>[   11.292640] systemd[1]: Failed to start systemd-journald.service.
>[FAILED] Failed to start systemd-journald.service.
>See 'systemctl status systemd-journald.service' for details.
>
> The system drops me on a shell later, where I can run the above
> mentioned command, which gives:
>
> ~ # systemctl status systemd-journald.service
> x systemd-journald.service - Journal Service
>  Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; 
> static)
>  Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 
> 11min a>
> TriggeredBy: x systemd-journald-dev-log.socket
>  * systemd-journald-audit.socket
>  x systemd-journald.socket
>Docs: man:systemd-journald.service(8)
>  man:journald.conf(5)
> Process: 197 ExecStart=/usr/lib/systemd/systemd-journald 
> (code=killed, sign>
>Main PID: 197 (code=killed, signal=SEGV)
>FD Store: 0 (limit: 4224)
> CPU: 330ms
>
> This does not help me much.  Other services crashing: systemd-udevd
> and systemd-timesyncd, also with status 11/SEGV which is segmentation
> fault, right?

Yes.

> I had this board running with an older version of systemd, but I can
> not remember which was the last good version.
>
> Could anyone give me a hint please how to debug this?

"coredumpctl gdb" should open the most recent coredump in gdb for you,
where you can then get a backtrace.

The coredump should also show up in the logs with a backtrace.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] EXT: Re: Custom target between basic and multi-user targets

2024-04-09 Thread Lennart Poettering
On Di, 09.04.24 09:39, Agrain Patrick (patrick.agr...@al-enterprise.com) wrote:

> Hello,
>
> Thank you Lennart.
> The target inclusion between basic and multi-user is OK. The
> list-dependencies shows it as expected.

> I also add the set-default to foo.target to be able to check it.
>
> From https://www.freedesktop.org/software/systemd/man/latest/bootup.html,
> at this stage, filesystems should be mounted (confirmed by the logs
> on the serial console), so normally I should be able to execute any
> binary as 'root' called by ExecStart.
>
> Is that correct ?

What you are writing here is very vague. If you have a service between
basic.target and multi-user.target then yes, local file systems are mounted
(they are mounted in time for the local-fs.target, which is ordered
before basic.target). Remote file systems might not be, they are
ordered against remote-fs.target instead, which is *not* ordered
before basic.target (simply because various network management
solutions do not run in early boot)

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Custom target between basic and multi-user targets

2024-04-04 Thread Lennart Poettering
On Do, 04.04.24 14:34, Agrain Patrick (patrick.agr...@al-enterprise.com) wrote:

> Hello,
>
> Is it possible to insert a custom foo.target between basic.target
> and multi-user.target by just adding some
> After/Before/Wants/Requires in the foo.[target | service] files ?

Yes.
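
For illustration, a minimal sketch of such a target, written out by a
small shell function so it can be dropped into any unit directory. The
function name, the unit name foo.target and the description are
placeholders, and the exact dependency set is only one way to do it:

```shell
# Hypothetical sketch: write a foo.target that sits after basic.target and
# is pulled in by (and therefore reached before) multi-user.target.
write_foo_target() {
    unit_dir=$1    # e.g. /etc/systemd/system

    cat > "$unit_dir/foo.target" <<'EOF'
[Unit]
Description=Hypothetical milestone between basic.target and multi-user.target
Requires=basic.target
After=basic.target
Before=multi-user.target

[Install]
WantedBy=multi-user.target
EOF
}

# Afterwards, as root:
# write_foo_target /etc/systemd/system && systemctl enable foo.target
```

Services that belong to the new milestone would then carry
Before=foo.target in [Unit] and WantedBy=foo.target in [Install].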

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Reducing unmount/mount of partitions on soft-reboot

2024-03-14 Thread Lennart Poettering
On Mi, 13.03.24 16:57, Aditya Gupta (adit...@linux.ibm.com) wrote:

> Hello,
>
> I tried systemd-soft-reboot on a RHEL system, and it's amazing in terms
> of it's ability to do a userspace reboot, within fraction of time of a
> full system reboot. For example, for a Power system taking around 50
> seconds to do a normal reboot, it took around 4-5 seconds for a
> systemd-soft-reboot.
>
> I have a question on further optimisation. After soft-reboot, I notice
> much of the time is taken up by .device and .mount services. This was my
> observation based on 'systemd-analyze blame'. Please do let me know if
> I am seeing the wrong numbers, or if there's a better way to know.
>
> Is there some way to 'pass-through' these mounts ? That is, I might not
> need to unmount and remount my boot/root paritions.

Bind mount the relevant mounts from the current system into
/run/nextroot/ if you are using that.

If you are not using /run/nextroot/ then you can also define the mount
via a .mount unit (rather than letting it be auto-generated via /etc/fstab
+ systemd-fstab-generator), and then set DefaultDependencies=no in it,
so that it does not get an implicit Conflicts= dependency on umount.target.

This is briefly documented on the systemd-soft-reboot.service man page btw.
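
A sketch of what such a hand-written mount unit could look like, again
wrapped in a shell function. The function name, mount point, device label
and file system type are all placeholders, and since the default
dependencies are switched off, the ordering lines are my assumption of what
you would want to add back by hand:

```shell
# Hypothetical sketch: a hand-written mount unit that opts out of the
# default dependencies, so it gets no implicit Conflicts=umount.target.
write_persistent_mount_unit() {
    unit_dir=$1    # e.g. /etc/systemd/system

    # The file name must match the mount point: /srv/data -> srv-data.mount
    cat > "$unit_dir/srv-data.mount" <<'EOF'
[Unit]
Description=Hypothetical mount that should survive soft-reboot
DefaultDependencies=no
# Ordering that the default dependencies would otherwise have provided:
After=local-fs-pre.target
Before=local-fs.target

[Mount]
What=/dev/disk/by-label/data
Where=/srv/data
Type=ext4

[Install]
WantedBy=local-fs.target
EOF
}

# Afterwards, as root:
# write_persistent_mount_unit /etc/systemd/system && systemctl daemon-reload
```

The matching /etc/fstab line should go away at the same time, so that there
is only one definition of the mount.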

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How to install libudev from source?

2024-03-07 Thread Lennart Poettering
On Do, 07.03.24 17:09, Vru Inbvi (vru.in...@gmail.com) wrote:

> Hi,
>
> I am struggling to install libudev from source (with Ubuntu)
> Can someone please explain what the correct way to do this is, or point me
> to relevant/updated documentation?

https://systemd.io/HACKING

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 13:06, Arseny Maslennikov (a...@cs.msu.ru) wrote:

> > The question of course is how many SSH instances you serve every
> > minute. My educated guess is that most SSH installations have a use
> > pattern that's more on the "sporadic use" side of things. There are
> > certainly heavy use scenarios though (e.g. let's say you are github
> > and server git via sshd).
>
> A more relevant source of problems here IMO is not the "fair use"
> pattern, but the misuse pattern.
>
> The per-connection template unit mode, unfortunately, is really unfit
> for any machine with ssh daemons exposed to the IPv4 internet: within
> several months of operation such a machine starts getting at least 3-5
> unauthed connections a second from hierarchically and geographically
> distributed sources. Those clients are probing for vulnerabilities and
> dictionary passwords, they are doomed to never be authenticated on a
> reasonable system, so this is junk traffic at the end of the day.
>
> If sshd is deployed the classic way (№1 or №3), each junk connection is
> accepted and possibly rate-limited by the sshd program itself, and the
> pid1-manager's state is unaffected. Units are only created for
> authorized connections via PAM hooks in the "session stack";
> same goes for other accounting entities and resources.
> If sshd is deployed the per-connection unit way (№2), each junk connection
> will fiddle with system manager state, IOW make the machine create and
> immediately destroy a unit: fork-exec, accounting and sandboxing setup
> costs, etc. If the instance units for junk connections are not
> automatically collected (e. g. via `CollectMode=inactive-or-failed`
> property), this leads to unlimited memory use for pid1 on an unattended
> machine (really bad), powered by external actors.

Well, whatever sshd does as ratelimiting systemd can do too
afaics. I.e. the sshd@.service definitions that we suggest and that the
big distros use all get the ExecStart=- thing right, so that an
unclean exit of sshd does not result in a pinned unit. Moreover, there are
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource= and
MaxConnections=, which ensure that any attempt to flood the socket
is reasonably contained, and the system recovers from that.

Current versions of systemd enable these settings by default, hence I
think we actually should be fine by default, even if you do not tune
these .socket parameters.
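
If you do want to tune them, a drop-in for the socket unit could look
roughly like the sketch below. The drop-in path and all numbers are
illustrative assumptions, not the shipped defaults or a recommendation.
MaxConnections= and MaxConnectionsPerSource= only have an effect on
Accept=yes sockets, i.e. the per-connection mode:

```ini
# /etc/systemd/system/sshd.socket.d/limits.conf  (hypothetical drop-in)
[Socket]
# Upper bound on concurrently running per-connection instances.
MaxConnections=64
# Upper bound on concurrent connections from one source address.
MaxConnectionsPerSource=8
# Rate limit on how often the listening socket may trigger
# within the given interval.
PollLimitIntervalSec=2s
PollLimitBurst=50
```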

> > I'd suggest to distros to default to mode
> > 2, and alternatively support mode 3 if possible (and mode 1 if they
> > don't want to patch the support for mode 3 in)
>
> So mode 2 only really makes sense for deployments which are only ever
> accessible from intranets with little junk traffic.

What precisely do you think is missing in systemd that
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource=,
MaxConnections= can't cover?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 14:44, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com) wrote:

> > Lennart Poettering, Berlin
>
> Thanks a lot for the responses Andrei, Poettering .
> We took it from blfs in PhotonOS.
> https://www.linuxfromscratch.org/blfs/view/11.3-systemd/introduction/systemd-units.html
> We need to do some more work on these unit files.

But that tarball actually contains a correct sshd -i line that
includes the "-" that causes the return value to be ignored, as it
should be. Hence if your distro didn't do this even though it imported
this from LFS, then it's your distro that broke that...
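
For reference, the relevant bit of a per-connection template unit
looks roughly like this. It is a sketch, not a copy of the LFS file,
and the path to the sshd binary is an assumption that differs between
distros:

```ini
# sshd@.service  (per-connection template, sketch)
[Unit]
Description=OpenSSH per-connection server daemon

[Service]
# The leading "-" makes systemd treat a non-zero exit status of sshd
# as equivalent to success, so a failed login attempt does not leave
# a failed unit instance behind.
ExecStart=-/usr/sbin/sshd -i
# In inetd mode the connection socket is passed as stdin/stdout.
StandardInput=socket
```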

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 11:11, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com) wrote:

> Hi All,
>
> What is the rationale behind using sshd.socket other than not keeping sshd
> daemon running always and reducing memory consumption?

Note that there are two distinct modes to running sshd via socket
activation: the per-connection mode (using sshd's native inetd mode),
where there's a separate instance forked off by systemd for each
connection, and a mode where systemd just binds the socket, but
it's served by a single instance. The latter is only supported via an
out-of-tree patch afaik though, which at least debian/ubuntu ship:

https://salsa.debian.org/ssh-team/openssh/-/commit/7fa10262be3c7d9fd2fca9c9710ac4ef3f788b08

Unless you have a gazillion of connections coming in every second I'd
probably just use the per-connection inetd mode, simply because it's
supported upstream. Would be great of course if openssh would just add
support for the single-instance mode in upstream too, but as I
understand ssh upstream is a bit special, and doesn't want to play
ball on this.

To summarize the benefits of each mode:

1. Traditional mode (i.e. no socket activation)
   + connections are served immediately, minimal latency during
     connection setup
   - takes up resources all the time, even if not used

2. Per-connection socket activation mode
   + takes up almost no resources when not used
   + zero state shared between connections
   + robust updates: socket stays connectible throughout updates
   + robust towards failures in sshd: the bad instance dies, but sshd
     stays connectible in general
   + resource accounting/enforcement separate for each connection
   - slightly bigger latency for each connection coming in
   - slightly more resources being used if many connections are
     established in parallel, since each will get a whole sshd
     instance of its own.

3. Single-instance socket activation mode
   + takes up almost no resources when not used
   + robust updates: socket stays connectible throughout updates

> With sshd.socket, systemd does a fork/exec on each connection which is
> expensive and with the sshd.service approach server will just connect with
> the client which is less expensive and faster compared to
> sshd.socket.

The question of course is how many SSH instances you serve every
minute. My educated guess is that most SSH installations have a use
pattern that's more on the "sporadic use" side of things. There are
certainly heavy use scenarios though (e.g. let's say you are github
and serve git via sshd). I'd suggest to distros to default to mode
2, and alternatively support mode 3 if possible (and mode 1 if they
don't want to patch the support for mode 3 in).

> And if there are issues in unit files like in
> https://github.com/systemd/systemd/issues/29897 it will make the system
> unusable.

Did any distro ship a unit file like that? That was clearly a buggy
(local?) unit file, I am not aware of any big distro shipping such a
unit file.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Can I provide separate enabling for dbus-activation and "normal" start ?

2024-02-23 Thread Lennart Poettering
On Do, 22.02.24 17:09, Max Gautier (m...@max.gautier.name) wrote:

> Is it possible when writing a dbus-activable service to provide two
> separate and independent ways to enable it ?
>
> The D-Bus service file would for instance be:
> [D-BUS Service]
> Name=org.freedesktop.Notifications
> Exec=notification-daemon
> SystemdService=dbus-org.freedesktop.Notifications.service
>
> The systemd service:
> [Unit]
> PartOf=graphical-session.target
> After=graphical-session.target
>
> [Service]
> Type=dbus
> BusName=org.freedesktop.Notifications
> ExecStart=notification-daemon
>
> [Install]
> Alias=dbus-org.freedesktop.Notifications.service
> WantedBy=graphical-session.target
>
>
> With that systemd service file, `systemctl enable` would cause the
> service to be started by graphical-session.target and by
> dbus-activation; but it is possible to have two separate enable
> commands, one which would enable the dbus activation, one the
> graphical-session start ?
>
> I suppose I should have two separate unit files but I'm not completely
> sure how to do that without copying the whole file (i.e, is there some
> Install/Unit relation I can use for that ?)

No, in systemd there's only one "systemctl enable" and it applies the
[Install] section of the unit file, and that's really all there is.

You can probably add two unit files and use Alias= so that they pick a
common name as alias.

But one unit cannot have two distinct [Install] sections, if that's
what you are looking for.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-02-20 Thread Lennart Poettering
On Di, 20.02.24 10:24, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Thanks, I will check this. It sounds like optee needs a similar dependency
> generator.
>
> I wonder how many kernel subsystems/drivers which need userspace daemons
> would need systemd side dependency generators. Is it only the ones inside
> initramfs and/or pre-rootfs mount which need special handling?

Well, systemd to a large part is about getting deps in order,
i.e. start things in the right order but still as parallelized as
possible to make sure we can boot properly, fast.

For regular (i.e. late boot) services things are easier, since we can
hide various deps via socket activation and services typically just
have fewer deps, but during early boot things always require careful
consideration on what you need to schedule when. That's hardly
surprising, is it?

TPM stuff in particular is stuff that we want to make use of super
early, because it's inherently part of the boot process to measure
progress and resources we are using. It's what "Measured Boot" after
all means. And that means you need to know what you do, and can't
really escape that.

> In the end the logic is quite straight forward. If kernel side support is
> there, then a daemon needs to be started before user service start, but
> boot should continue without if kernel support is not detected.

systemd generators are our way to allow dynamic extension of the
systemd unit dependency graph. It's the fact that you want things
dynamic (i.e. responsive to the fact whether your system has a
specific kind of tpm device/secure enclave) that means you have to do
this with a generator.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-02-19 Thread Lennart Poettering
On Mo, 19.02.24 10:36, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> > After=dev-tpmrm0.device tee-supplicant@teepriv0.service
> > Wants=dev-tpmrm0.device tee-supplicant@teepriv0.service
>
> I think my problems come from:
>
> After=tee-supplicant@teepriv0.service
> Wants=tee-supplicant@teepriv0.service
>
> Basically tee-supplicant should only be started if /dev/teepriv* device node
> is available. Then in my case with fTPM devices, all TPM using and encrypted
> rootfs creating services need to depend on the service which starts
> tee-supplicant but only if /dev/teepriv0 exists. If teepriv0 doesn't exist,
> then tee-supplicant should not be started and the dependencies to it should
> not exist either.

Is /dev/teepriv* guaranteed to be available when userspace is invoked?
or is it something that itself requires some kmod loading to show up,
i.e. that "udevadm trigger" causes to load?

> How should this dependency be expressed in systemd services?
>
> Can tee-supplicant@.service include:
>
> Before=systemd-pcrphase-initrd.service systemd-pcrphase.service systemd-pcrmachine.service
> WantedBy=systemd-pcrphase-initrd.service systemd-pcrphase.service systemd-pcrmachine.service
>
> In my testing this does not seem to work inside initramfs.
>
> If systemd-pcrphase-initrd.service, systemd-pcrphase.service and
> systemd-pcrmachine.service have After= and Wants= to
> tee-supplicant@teepriv0.service then things work, except on boards which
> have no optee and no /dev/teepriv0 where tee-supplicant seems to be started
> and fails due to missing optee which breaks the initramfs boot.

For your usecase the new tpm2.target available in git main is what you
really should focus on: all TPM using services should order themselves
after that. All stuff needed to make a TPM device appear should be
placed before that.

The systemd-tpm2-generator that now exists in git main analyzes the
uefi/acpi firmware situation and automatically adds a dev-tpmrm0.device
dependency on tpm2.target if it comes to the conclusion that such a
device will show up. This generator is not going to cover your
specific case, but I think it would be a good blueprint for you:
i.e. write a generator that checks if /dev/teepriv* exists. If not,
just exit. If yes, generate the required deps to pull in
tee-supplicant@.service, and add the dev-tpmrm0.device dep just like
systemd-tpm2-generator does.
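
To make that a bit more concrete: the output of such a generator could
be a drop-in along the lines of the sketch below, written into the
generator's output directory only if /dev/teepriv0 is found. The file
name and the exact set of dependencies are assumptions for
illustration, not what systemd-tpm2-generator itself emits. Keep in
mind that generators run very early, so this only works if the device
node already exists at that point:

```ini
# <generator output dir>/tpm2.target.d/50-tee-supplicant.conf
# (hypothetical file name; only written if /dev/teepriv0 exists)
[Unit]
# Pull in the supplicant and wait for the TPM resource manager device
# before tpm2.target is reached.
Wants=tee-supplicant@teepriv0.service dev-tpmrm0.device
After=tee-supplicant@teepriv0.service dev-tpmrm0.device
```

TPM-using services would then only need After=tpm2.target, and on
boards without optee nothing is generated at all.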

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issues supporting systems with and without TPM and firmware TPM (was Re: Handle device node timeout?)

2024-02-19 Thread Lennart Poettering
On Fr, 16.02.24 11:28, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Support for fTPM devices is problematic. First, the kernel support must be
> modules but loading needs to be specially handled after starting
> tee-supplicant. For normal boot udev handles optee detection and triggers
> tee-supplicant@teepriv0.service startup which unloads tpm_ftpm_tee kernel
> module, starts tee-supplicant and then loads the kernel module again. After
> this RPMB works. To do the same in initramfs, I added Wants: and After:
> dependencies from systemd-repart.service, systemd-cryptsetup@.service,
> systemd-pcrmachine.service and systemd-pcrphase-initrd.service:

Kernel module unloading is not supposed to happen in clean
codepaths. It's a debug/development feature, it's not safe to do as
part of regular boot.

But why do you need to unload a kernel module at all? That smells...

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Handle device node timeout?

2024-02-19 Thread Lennart Poettering
On Di, 16.01.24 16:06, Mikko Rapeli (mikko.rap...@linaro.org) wrote:

> Hi,
>
> I have services which depend on a specific device node. How can I run
> some recovery actions when the default 90s timeout for finding this
> device is hit?
>
> OnFailure= doesn't work as the service is not even started.
>
> Specifically the case is about supporting TPM2 encrypted rootfs but falling
> back to plain-text rootfs generation if there is no TPM2 device. Currently
> my initramfs works with TPM2 but without it fails with:

In git main there's new infra to deal with this case:

https://github.com/systemd/systemd/pull/30194

That should hopefully solve this systematically and generically.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] logind: Activating session/opening seat fails in systemd v254

2024-02-16 Thread Lennart Poettering
On Do, 15.02.24 22:16, Nils Kattenbeck (nilskem...@gmail.com) wrote:

> Hi everyone,
>
> I am working on a kiosk-type device which is supposed to start a
> weston instance upon boot.
> Our images were previously based on Debian 12 and Fedora 38, now we
> are working on unifying them. Between the two old image variants the
> systemd units were mostly identical, however, on Fedora 39 with
> systemd 254 they no longer work. Weston/libseat now fails with the
> message: "Could not activate session: Permission denied". (Also see
> the logind logs at the end).

Neither Weston nor libseat (whatever that is) are a systemd
thing. Please contact the relevant projects for help?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Scan all USB devices from Linux service

2024-02-14 Thread Lennart Poettering
On Mi, 14.02.24 20:24, Muni Sekhar (munisekhar...@gmail.com) wrote:

> HI all,
>
> USB devices can have multiple interfaces (functional units) that serve
> different purposes (e.g., data transfer, control, audio, etc.).
>
> Each interface can have an associated string descriptor (referred to
> as iInterface). The string descriptor provides a human-readable name
> or description for the interface.
>
> From user space service utility, How to scan all the USB devices
> connected to the system and read each interface string
> descriptor (iInterface) and check whether it matches "Particular
> String" or not.

You can use sd-device.h: allocate an enumerator via
sd_device_enumerator_new(), then apply some filter via
sd_device_enumerator_add_match_sysattr() and then enumerate through it via
sd_device_enumerator_get_device_first()/sd_device_enumerator_get_device_next().

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Issue with systemd-logind

2024-02-14 Thread Lennart Poettering
On Mi, 14.02.24 15:03, Akshaya Maran (akshayamara...@gmail.com) wrote:

> Hi,
>
> I am trying to run weston11.0.1 using systemd logind launcher but got this 
> error
> " logind: failed to get session seat
> logind: cannot setup systemd-logind helper error:"

This looks like an error message from some weston thing. Please ask
that community for help.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] ConditionNeedsUpdate, read-only /usr, and sysext

2024-02-14 Thread Lennart Poettering
On Mi, 07.02.24 20:42, Valentin David (m...@valentindavid.com) wrote:

> Hello everybody,
>
> The behavior of ConditionNeedsUpdate is that if /etc/.updated is
> older than /usr/, then it is true.
>
> I have some issues with this. But maybe I do not use it the right
> way.
>
> First, when using a read-only /usr partition (updated through
> sysupdate), the time of /usr is of the build of that filesystem. In
> the case of GNOME OS, to ensure reproducibility bit by bit, we set
> all times to some time in 2011. So that does not work for us.

Hmm, I wonder if the os-release file in /usr/ should optionally have a
timestamp field which could be used. That could be directly
initialized from $SOURCE_DATE_EPOCH at build time (maybe the field
should even be named like that). I think that would make sense, no?

> But now let's say we work-around that, and we make our system take a
> date that is reproducible, let's say the git commit of our
> metadata. Then we have a second issue.
>
> Because of systemd-sysext, it might be that /usr is not anymore the
> time of the /usr filesystem, but the time of a directory created on
> the fly by systemd-sysext (or maybe it keeps the time from the /
> filesystem, I do not know, but for sure the time stamp is from when
> systemd-sysext was started). If systemd-update-done happens after
> systemd-sysext (and it effectively does on 254), then the date of
> /etc/.updated will become the time when systemd-sysext started.

Uh. That'd be a bug. Can you file an issue about this?

> Let's imagine that I do not boot that machine often. My system is
> booting a new version. And there is already another new version
> available on the sysupdate server. My system will download a build
> of /usr that is likely to be older than the boot time. So next
> reboot, the condition will be false, even though I did have an
> update. And it will be false until I download a version that was
> built after the boot time of my last successful update.
>
> So my question is, is there plan to replace time stamp comparison
> for ConditionNeedsUpdate with something that works better with
> sysupdate and sysext? Maybe copying IMAGE_VERSION from
> /usr/lib/os-release into /etc/.updated for example?

Yeah, we should fix this.

I have so far never thought about the mixture of sysext and
ConditionNeedsUpdate=. This is uncharted territory. But I think we
can fix this. But please open issues about this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] What creates a new machine-id ?

2024-02-08 Thread Lennart Poettering
On Do, 08.02.24 09:35, Agrain Patrick (patrick.agr...@al-enterprise.com) wrote:

> Hello,
>
> Our embedded system is based on a Rocky Linux 8 distribution which embeds 
> systemd-239.
>
> At first bootup, a machine-id is created and remains persistent over the 
> following reboots.
> System upgrade sometimes creates a new machine-id, sometimes not.
> By 'system upgrade', I mean either new linux kernel or upgraded Rocky 
> packages or both.
>
> Could you precise me what event(s) in the previous upgrade cases
> trigger a new machine-id ?

See:

https://www.freedesktop.org/software/systemd/man/latest/machine-id.html#Initialization

Or in other words: the machine ID is supposed to be persisted in
/etc/. If your upgrade procedure somehow causes the machine ID to be
invalidated, then we'll assign a new one though. We basically
make sure that whatever happens, on boot we initialize it.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Detecting Systemd crash

2024-02-05 Thread Lennart Poettering
On Sa, 03.02.24 16:55, Álvaro Cebrián Juan (acebrianj...@gmail.com) wrote:

> Great question!
>
> I am very interested in detecting systemd crashes too since I have
> experienced them recently and have been asked to come up with a solution to
> react when a PID1 crash happens.
> In fact, in my recent experiences, a journald crash was enough to render
> the system into an unreliable/degraded state in which some top-level
> applications worked while others didn't.
>
> So adding to David's 1st question, I need to detect systemd and journald
> crashes and then trigger a `systemctl reboot --force --force`
> command

As mentioned elsewhere in this thread just use RuntimeWatchdogSec= in
systemd-system.conf(5)
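
A minimal sketch of what that could look like. The drop-in file name
and the 30s value are just examples, and the system needs a hardware
watchdog that the kernel exposes:

```ini
# /etc/systemd/system.conf.d/watchdog.conf  (hypothetical drop-in)
[Manager]
# PID 1 programs the hardware watchdog and keeps pinging it. If PID 1
# stops doing so (crash, freeze), the hardware resets the machine.
RuntimeWatchdogSec=30s
```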

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Detecting Systemd crash

2024-02-05 Thread Lennart Poettering
On Mo, 05.02.24 13:54, Lennart Poettering (lenn...@poettering.net) wrote:

> you can just use the usual hw watchdog. If pid1 dies it will not ping
> the hw watchdog, and thus a reset is triggered automatically. In fact
> we actually configure the hw watchdog by default these days on hw that
> has it (which are most PCs).

Actually, we don't really, I need to correct myself. We probably
should though, dunno.

See RuntimeWatchdogSec= in systemd-system.conf(5)

>
> > 2: How do I get Systemd to freeze to test such program? I mean, if I kill
> > Systemd, the kernel would crash so I have to somehow tell Systemd to freeze?
>
> Not really, the kernel blocks SIGSTOP for PID1.
>
> Lennart
>
> --
> Lennart Poettering, Berlin

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Detecting Systemd crash

2024-02-05 Thread Lennart Poettering
On So, 04.02.24 00:06, David Timber (d...@dev.snart.me) wrote:

> Systemd crashed on me the other day. I was writing up some Systemd units and
> testing them out by daemon-reload every time I wanted to test them out. Not
> the best way to go on about, I know. My bad abusing Systemd to the point of
> crashing. Perhaps it was just a bit flip that caused this.
>
>systemd[2368]: Assertion 'path_is_absolute(p)' failed at
>src/basic/chase.c:628, function chase(). Aborting.
>systemd[1]: Assertion 'path_is_absolute(p)' failed at
>src/basic/chase.c:628, function chase(). Aborting.
>systemd[1]: Caught  from our own process.
>systemd-coredump[32497]: Due to PID 1 having crashed coredump
>collection will now be turned off.
>systemd-coredump[32497]: [] Process 32496 (systemd) of user 0
>dumped core.
>systemd[1]: Caught , dumped core as pid 32496.
>systemd[1]: Freezing execution.
>
>...
>
>systemd-journald[871]: Failed to send stream file descriptor to
>service manager: Transport endpoint is not connected
>
> I didn't even bother trying producing stack trace. I can get on that if
> anyone wants it. My machine started doing some weird things like
> Firefox not

If this is a current systemd version (v255), please generate a stack trace
and submit it as a github issue to us, we'll look into it. If it's
older, please report to your distro first.

> being able to do Ajax properly whilst being able to go to a new page,
> Chromium not being able to create a new tab whilst all the text editors
> worked just fine, all the systemctl commands timing out. So basically, I was
> using Linux without fork(). Anyway.
> Well, I think any software can crash for any reason whatsoever. The
> problem

Yeah, an assert like the above is an error we need to fix in systemd.

> with Systemd I realised from this incident is that I had no way of knowing
> that Systemd had crashed until I opened up the journal and kernel logs and
> saw that Systemd had crashed some time ago. In this particular incident,
> Systemd caught the signal and decided to just freeze. No idea why you'd want
> that because if it had just crashed, the kernel would have just panicked and
> I would have realised something went wrong.
>
> 1: So I decided that I need a some sort of "watchdog" that warns me when
> something like this happens. Using dbus to poll the status of the Systemd
> process, it could be a GUI app running under a seat, just a daemon that
> writes a warning message using `wall` or just send mail using a primed up
> MUA process. I wonder if someone already had the same idea and went on to
> make one.

you can just use the usual hw watchdog. If pid1 dies it will not ping
the hw watchdog, and thus a reset is triggered automatically. In fact
we actually configure the hw watchdog by default these days on hw that
has it (which are most PCs).

> 2: How do I get Systemd to freeze to test such program? I mean, if I kill
> Systemd, the kernel would crash so I have to somehow tell Systemd to freeze?

Not really, the kernel blocks SIGSTOP for PID1.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-pcrlock Failed to submit super PCR policy

2024-02-05 Thread Lennart Poettering
On Mo, 05.02.24 09:24, Dominick Grift (dominick.gr...@defensec.nl) wrote:

Please run "SYSTEMD_LOG_LEVEL=debug systemd-pcrlock make-policy" from
the command line, then file a github issue about this, and paste the
output there.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Systemd units complains about cgroup with 5.15.x kernel

2024-02-01 Thread Lennart Poettering
On Do, 01.02.24 16:30, Thierry Bultel (thierry.bul...@linatsea.fr) wrote:

> Hi,
>
> I am using systemd v255,
> and currently using a kernel vendor branch :
>
> g...@github.com:varigit/linux-imx.git
> lf-5.15.y_var01
> imx_v7_defconfig
>
> I had no issue with the older 5.4 kernel.
>
> I have verified that the kernel has the following options:
>
> CONFIG_DEVTMPFS=y
> CONFIG_CGROUPS=y
> CONFIG_INOTIFY_USER=y
> CONFIG_SIGNALFD=y
> CONFIG_TIMERFD=y
> CONFIG_EPOLL=y
> CONFIG_UNIX=y
> CONFIG_SYSFS=y
> CONFIG_PROC_FS=y
> CONFIG_FHANDLE=y
>
> CONFIG_NET_NS=y
>
> CONFIG_SYSFS_DEPRECATED is not set
>
> CONFIG_AUTOFS_FS=y
> CONFIG_AUTOFS4_FS=y
> CONFIG_TMPFS_POSIX_ACL=y
> CONFIG_TMPFS_XATTR=y
>
> --->
>
> systemd is failing to start some units:
>
> systemd[1]: wpa_supplicant.service: Failed to create cgroup
> /system.slice/wpa_supplicant.service: No such file or directory
> and also;
>  (agetty)[217]: serial-getty@ttymxc0.service: Failed to attach to cgroup
> /system.slice/system-serial\x2dgetty.slice/serial-getty@ttymxc0.service: No
> medium found
>
> ... and I do not have a serial console.
>
> I am currently digging into systemd code to find out what is possibly wrong
> .. but if anyone gets a clue, I would appreciate !

Educated guess: you have no cgroup v2 or so?

Would make sense to provide logs, and to use strace to check what
precisely fails.

Ask your distro for help?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Delaying VM startup until block devices are available

2024-01-26 Thread Lennart Poettering
On Do, 25.01.24 16:28, Orion Poplawski (or...@nwra.com) wrote:

> We have various VMs that are back by luks encrypted LVs.  At boot the volumes
> are decrypted by clevis.  The problem we are seeing at the moment is that the
> VMs are started before the block devices are decrypted.  Our current
> solution is:

We generally wait for all devices listed in /etc/crypttab, unless you
set noauto or nofail.

>
> # cat /etc/systemd/system/virtqemud.service.d/override.conf
> [Unit]
> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>
> Where we list each of the volumes to be decrypted as blocking the virtqemud
> service.
>
> Does anyone have any better alternatives?  My main issue is that it feels
> somewhere in between fine-grained and coarse-grained control.
>
> Ideally I think one would be able to have each individual VM startup
> automatically delayed until the devices each used became available, but I
> don't see how to do this.

I am not sure how libvirt works, but if it runs every VM in a systemd
unit, then you could just order the device before that unit, or the
unit after the device.

Really depends on how libvirt splits things up.
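
If there is such a per-VM unit, a drop-in like the following sketch
would do it. The unit name vm-01.service is purely hypothetical (I
don't know what libvirt actually names things), and the device unit
name is the escaped form of /dev/mapper/luks-vm-01-disk0, as printed
by "systemd-escape -p --suffix=device":

```ini
# /etc/systemd/system/vm-01.service.d/wait-for-disk.conf
# (hypothetical unit name)
[Unit]
# Only start this VM once its decrypted volume has shown up.
Requires=dev-mapper-luks\x2dvm\x2d01\x2ddisk0.device
After=dev-mapper-luks\x2dvm\x2d01\x2ddisk0.device
```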

> Alternatively it seems like one should be able to delay all VM startup until
> all volumes in /etc/crypttab were unlocked, rather than having to specify each
> one.  But I don't see a target for that.

This is default behaviour. Anything listed in /etc/crypttab is ordered
before cryptsetup.target, which is ordered before sysinit.target,
which is ordered before basic.target, which is ordered before regular services.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Bump: Testing LogFilterPatterns= on user-level services

2024-01-26 Thread Lennart Poettering
On Do, 25.01.24 22:29, Farblos (akfkqu.9df...@vodafonemail.de) wrote:

> Hi.
>
> I sent below mail some week ago, Barry's reply left me unsure as to
> whether this would be a bug or not.  I still tend do assume that I'm
> "doing something wrong".

This is currently not supported. The filters are communicated by the
service manager to journald via xattrs on the cgroups, and journald
will only consider those for cgroups owned by root, i.e. not on
cgroups delegated to unpriv users, as is done for systemd --user
instances.

Interpreting arbitrary regexes configured by unpriv code in priv code
comes at some risk, because afair constructing them can come at O(2^n)
time, i.e. a rogue regex could make us consume unbounded time on
processing journal messages.

Hence, I wouldn't hold your breath. Unless someone figures out a smart
way to deal with this it's unlikely to be supported.

We should document this however I guess. Hence if you file an issue
that would be more than welcome, so that we can keep track of this.
Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Permanently remove services

2024-01-19 Thread Lennart Poettering
On Do, 18.01.24 23:40, Nils Kattenbeck (nilskem...@gmail.com) wrote:

> > > They are turning up as failed units, so they are being run,
> > > even if I don't have any TPM module. Also, I have a notifier in
> > > my waybar telling me of failed services and I don't want to see
> > > them there.
> >
> > Can you provide logs about this? The goal is definitely to make these
> > NOPs on TPM-less systems. I am a bit puzzled that the conditioning
> > they come with is not sufficient. We might need to tweak something
> > there then.
> >
> > The idea is that the system does TPM setup on systems that have a tpm
> > and on systems lacking that silently just skips all these so that
> > everything always works fully automatically and robustly without any
> > ugly error output.
> >
> > hence, any chance you can provide logs about this? and what kind of
> > system is this? i.e. does it really lack a tpm?
>
> In the past I have seen errors on systems which do not have
> libtss2/tpm2-tss installed though I am not sure if those should be
> silenced. After all, the unit being enabled means that one wants to
> use it if possible - and if the libraries are missing that should be
> noticeable to the user instead of a silent fail.

No, the libs are installed, that's what the "systemd-creds has-tpm2"
output shows.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Permanently remove services

2024-01-18 Thread Lennart Poettering
On Do, 18.01.24 22:53, Morten Bo Johansen (morte...@hotmail.com) wrote:

> ~/ % systemd-creds has-tpm2
> partial
> +firmware
> -driver
> +system
> +subsystem
> +libraries

OK, so this indicates that your system has TPM support on all levels
with a single exception: you lack an actual linux driver for your
specific hw. And that puzzles me, because to my knowledge at least
linux should support all relevant tpm2 interfaces just fine. This
suggests that you haven't got the right modules installed.

i don't know arch but is there possibly some extra package you have to
install to get more drivers?

tpm2 drivers are super basic stuff, it sounds really weird to me to
split this out. It's a condition this stuff indeed is not prepared for
though: that everything is set up properly, from firmware to kernel to
userspace, but the driver is not actually available.

> The output from journalctl --unit systemd-tpm2-setup-early.service:
>
>-- Boot b3fca98d73f6441590174a72ac0d27fa --
>jan 18 18:13:02 gatsby systemd-tpm2-setup[329]: Failed to create TPM2 context: State not recoverable
>jan 18 18:13:02 gatsby systemd-tpm2-setup[329]: ERROR:tcti:src/tss2-tcti/tcti-device.c:451:Tss2_Tcti_Device_Init() Failed to open specified TCTI device file /dev/tpmrm0: No such file or direc>
>jan 18 18:13:03 gatsby systemd[1]: systemd-tpm2-setup-early.service: Main process exited, code=exited, status=1/FAILURE
>jan 18 18:13:03 gatsby systemd[1]: systemd-tpm2-setup-early.service: Failed with result 'exit-code'.
>jan 18 18:13:03 gatsby systemd[1]: Failed to start TPM2 SRK Setup (Early).
>
> There is a /dev/tpm0 file but not a /dev/tpmrm0 file

Oh, interesting. Is it possible that your system has only a TPM 1.2
device? (maybe your bios allows switching between TPM 2.0 and 1.2 modes)

It could be that we simply misdetect the TPM 1.2 case; I admittedly
never tested things on such a system. How old is that PC?
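
A rough way to check the TPM 1.2 theory from a shell, as a sketch only:
the function name tpm_hint and its two optional arguments are made up
for illustration and are not part of systemd. It assumes that the
resource manager node /dev/tpmrm0 is only created for TPM 2.0 chips,
and that newer kernels expose tpm_version_major in sysfs; on older
kernels the fallback branch is just a guess.

```shell
#!/bin/sh
# Hypothetical helper: guess which TPM generation the kernel exposes.
# $1 = sysfs TPM class dir (default /sys/class/tpm)
# $2 = device dir (default /dev)
# Both arguments exist only so the function can be tried on a fake tree.
tpm_hint() {
    sysfs="${1:-/sys/class/tpm}"
    dev="${2:-/dev}"

    # No character device at all: no driver bound, or no TPM.
    if [ ! -e "$dev/tpm0" ]; then
        echo "no-tpm-device"
        return 0
    fi

    # Newer kernels report the major version directly.
    if [ -r "$sysfs/tpm0/tpm_version_major" ]; then
        echo "tpm-major-$(cat "$sysfs/tpm0/tpm_version_major")"
    # Assumption: tpmrm0 only shows up for TPM 2.0 devices.
    elif [ -e "$dev/tpmrm0" ]; then
        echo "tpm-major-2"
    else
        echo "probably-tpm-1.2"
    fi
}
```

Calling tpm_hint without arguments inspects the running system. A
result of "probably-tpm-1.2" would match the situation above, where
/dev/tpm0 exists but /dev/tpmrm0 does not.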

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Permanently remove services

2024-01-18 Thread Lennart Poettering
On Do, 18.01.24 22:26, Morten Bo Johansen (morte...@hotmail.com) wrote:

> On 2024-01-18 Lennart Poettering wrote:
>
> > hence, any chance you can provide logs about this? and what kind of
> > system is this? i.e. does it really lack a tpm?
>
> I shall try to accommodate you. How do I get the log?
>
> The command "systemctl --plain --no-legend list-units --state=failed"
> does not provide enough info.

ideally boot with "systemd.log_level=debug" on the kernel cmdline, and
then paste "journalctl -b" somewhere.

The full output of "systemd-creds has-tpm2" would be good too.
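
Something along these lines would gather both in one file, as a sketch:
the function names, the report file name and the optional cmdline
argument are invented for illustration; only "systemd-creds has-tpm2",
"journalctl -b" and the systemd.log_level=debug switch come from the
text above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line asks for
# debug logging. $1 = cmdline file (default /proc/cmdline).
debug_logging_enabled() {
    cmdline="${1:-/proc/cmdline}"
    grep -qwF 'systemd.log_level=debug' "$cmdline"
}

# Hypothetical helper: write the requested information into one file.
# $1 = output file (default boot-report.txt), $2 = cmdline file.
collect_boot_report() {
    out="${1:-boot-report.txt}"
    cmdline="${2:-/proc/cmdline}"
    {
        if debug_logging_enabled "$cmdline"; then
            echo "debug logging: enabled"
        else
            echo "debug logging: NOT enabled (reboot with systemd.log_level=debug)"
        fi
        echo "== systemd-creds has-tpm2 =="
        # Exits non-zero when support is partial; the output is still useful.
        systemd-creds has-tpm2 2>&1
        echo "== journalctl -b =="
        journalctl -b --no-pager 2>&1
    } > "$out"
}
```

Run collect_boot_report after rebooting with the debug switch set, then
paste the resulting file somewhere. Reading the full journal may need
root or membership in the systemd-journal group.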

> I have no external TPM module installed and I don't think my
> rather old cpu, "Intel(R) Core(TM) i5-4570T CPU @ 2.90GHz", has
> any on-board TPM2 capability?

That sounds fairly recent, so I would assume that your machine has a
TPM.

Which OS is this? Is it possible that your kernel has TPM2 support
enabled, but for some reason the driver for your hw is not available
(for example not included in the initrd)?

Lennart

--
Lennart Poettering, Berlin

