Re: [systemd-devel] Exploring Minimal Systemd in Initramfs for Faster Boot

2024-09-24 Thread Mantas Mikulėnas
On Mon, Sep 23, 2024 at 12:33 PM  wrote:

> Hi Team,
>
> I'm exploring the possibility of splitting the systemd binary to
> optimize boot time before and after switching to the root filesystem.
>
> I’m aware that the systemd binary is quite large and may not fit in the
> initramfs, but is it feasible to have a minimal version of systemd that
> can invoke essential services and continue tracking them after
> transitioning to the main root filesystem?
>

Unless I missed any recent changes, transitioning to the main root
filesystem always involves launching the main systemd executable anew from
rootfs (even if the initramfs was already running systemd). It's fine if a
minimal initramfs-systemd execs a larger rootfs-systemd during the
transition as long as their versions are compatible – though I am not sure
if there is actually any state handed over between them in the first place;
IIRC normally none of the initramfs services are expected to survive the
transition.

-- 
Mantas Mikulėnas


Re: [systemd-devel] Learning Help: modeling system-user services with `run0`

2024-09-10 Thread Mantas Mikulėnas
On Tue, Sep 10, 2024 at 5:51 PM Divine Eguzouwa 
wrote:

> Assuming: run0 (and all of systemd for that matter) security works by
> sandboxing a service's "cgroup-namespace environment" (i.e., through
> User=/Group=, and/or NoNewPrivileges=, and/or etc.) and directly
> executing the given command therein...
>

Most of those parameters actually have nothing to do with cgroups. For
example, User= is just a regular process-level setreuid(), the same as you
would get from doas/su/runuser.


>
> I have a chain of services that executes a process belonging to
> User=/Group=one, that will read from a specific directory belonging to
> User=/Group=two, subsequently resulting in running a /bin executable that
> belongs to User=/Group=ANY
>
> Please walk me through how to model run0 --user to accomplish this in an
> "environment" *without authentication*? So far I keep bumping into "Failed
> to start transient service unit: Interactive authentication required."
> errors which leads me to believe that my earlier assumption is incorrect
>

I don't think run0 is meant for such use at all; starting transient
services in batch would be better done through systemd-run (or the
corresponding D-Bus API) – assuming you need them to be independent
services at all, that is.

Both use the same authorization method (PolicyKit) – you could write an
/etc/polkit-1 rule to allow a certain user to create transient units – but
I don't know if that can be granular enough to only allow userA>userB
transitions. Most likely it will be "all or nothing", i.e. if you allow
userA to call run0/systemd-run, that user will be able to become *any*
user... A chain of predefined .service units might work better.

-- 
Mantas Mikulėnas


Re: [systemd-devel] Is a socket with Accept=yes and ListenFIFO impossible?

2024-09-05 Thread Mantas Mikulėnas
FIFOs aren't sockets – they do not have an equivalent to accept() and there
is no multiplexing of inputs; all writes to the FIFO immediately go to the
"listening" file descriptor. So it's almost more like a datagram socket
than a stream one, in a sense.

If you want a true socket that's filesystem-based, create a Unix socket by
specifying the path via ListenStream, then connect to it using nc -U.
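
As a rough illustration of the difference (a Python sketch, not anything
systemd itself runs; the function name and the temporary socket path are
made up for the example): with a Unix stream socket each writer gets its own
connection out of accept(), which is what Accept=yes needs in order to spawn
one service instance per client.

```python
import os
import socket
import tempfile


def collect_messages(messages):
    """Send each message over its own connection to a Unix stream socket.

    Returns the payloads as the listener saw them, one per accept().
    """
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.socket")  # hypothetical path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(len(messages))
            # Every client opens a separate connection; the small payloads
            # sit in the socket buffer until the listener accepts them.
            for msg in messages:
                with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                    client.connect(path)
                    client.sendall(msg)
            # accept() hands out one connected socket per client, so the
            # inputs stay separate instead of being mixed into one stream.
            for _ in messages:
                conn, _addr = listener.accept()
                with conn:
                    chunks = []
                    while True:
                        data = conn.recv(4096)
                        if not data:
                            break
                        chunks.append(data)
                    received.append(b"".join(chunks))
    return received
```

A FIFO has no accept() step at all, so two writers would simply end up on the
same file descriptor.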

On Thu, Sep 5, 2024, 13:38 Steve Traylen  wrote:

> Was trying to set up a trivial socket and service to process multiple
> inputs:
>
> # Socket emailoutput.socket
> [Unit]
> Description=Send email via a socket.
>
> [Socket]
> Accept=yes
> ListenFIFO=/run/emailoutput.socket
>
> # Service emailoutput@.service
> [Unit]
> Description=email
>
> [Service]
> ExecStart=/usr/bin/mailx -s 'Testing from socket' st...@example.ch
> StandardInput=socket
>
>
> Starting the socket always produces: "Unit configured for accepting
> sockets, but sockets are non-accepting. Refusing"
>
> After switching the socket to "ListenStream=127.0.0.1:", everything
> works and I can netcat files into the network socket.
> Is it impossible to use Accept=yes with ListenFIFO?
>
>
> The motivation for this was that I wanted to perform a systemd-run of a
> command outputting to that socket to pipe it into mailx.
>
>
>


Re: [systemd-devel] Using systemd-networkd with TI switchdev switch

2024-09-03 Thread Mantas Mikulėnas
I think that's something you could report through GitHub issues for systemd.

On Tue, Sep 3, 2024 at 10:27 AM Matthias Hörger  wrote:

> Hi,
>
> I'm using a chip from TI which contains a hardware switch. The switch
> driver uses the switchdev interface. So the switch can be configured via
> the bridge device.
>
> The TI documentation has the requirement for the switch ports to be UP
> before joining the bridge.
>
> "Port’s netdev devices have to be in UP state before joining the bridge"
>
>
> https://software-dl.ti.com/jacinto7/esd/processor-sdk-linux-am69/09_00_00_06/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/Network/CPSWng-Native-Ethernet.html#switch-mode
>
> Can this somehow be configured with systemd-networkd?
>
> Currently,  my switch ports go UP after joining the bridge. This leads to
> a wrong configuration. This can be observed in the latest systemd release
> and older versions as well.
>
> Regards,
> Matthias
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Updating network file during boot

2024-08-23 Thread Mantas Mikulėnas
I might be missing something, but... the systemd renaming is just another
udev rule, one in 80-net-setup-link, isn't it? Rules for the same interface
can't race with each other, they're processed linearly. (Rules for
*different* interfaces can race but that happens regardless of the method.)

Last I checked, the udev rule that applies .link files is supposed to honor
a previously set NAME=, and any rules that set NAME= after systemd should
just override it as usual.

That said, because the enp* naming is done by .link files, it doesn't make
sense to have a .link file reference a /sys/enp* DEVPATH because at that
point in time, the enp* naming hasn't been applied yet... this is not a
race, quite the opposite – you're trying to make thing X conditional on the
result of same thing X.

On Fri, Aug 23, 2024, 11:22 Henti Smith  wrote:

> On Thu, 22 Aug 2024 at 19:07, Andrei Borzenkov 
> wrote:
>
>> On 22.08.2024 16:56, Henti Smith wrote:
>> > I've switched to using "Property=" as follows:
>> >  # Fixed MAC and name for enp6s0 (Block Diagram) when debug board is not
>> >  # plugged in
>> >  # Renamed to mvc-sw2 by PCI Address.
>> >  [Match]
>> >  Property=DEVPATH=/devices/pci:00/:00:11.0/:05:00.0/net/enp5s0
>> >
>> >  [Link]
>> >  MACAddress=02:00:00:00:06:00
>> >  Name=mvc-sw2
>> >
>> > However this is also inconsistent:
>>
>> What systemd version?
>>
>>
> Ubuntu focal : 245.4-4ubuntu3.23
>
> We will be moving to Jammy in the near future.
>
> I'm going to attempt using UDEV for device naming again and see if I can
> find a way to stop systemd from renaming devices, which is what happened
> before.
>
> If there are any pointers on using either exclusively UDEV or link files
> to manage device naming without UDEV and systemd clobbering each
> other, it will be very welcome.
>
> Henti
>


Re: [systemd-devel] Starting the sshd service on a 'non-bash' system

2024-08-03 Thread Mantas Mikulėnas
I assume you mean the inetd-style sshd@.service, not the regular
sshd.service? (Or does your distribution patch systemd-style socket
activation into sshd?)

There is usually no dependency on a shell, unless the .service unit
explicitly calls /bin/sh (note that the inetd-style socket activation uses
a different .service). Forkstat or extrace can reveal what is being exec'd
when the connection is made.

On Thu, Aug 1, 2024, 15:40 Mark Corbin  wrote:

> Hello
>
> I was wondering whether anybody has any experience of running the sshd
> service successfully on a system with a 'non-bash' shell?
>
> We're using systemd 250.5 and openssh 8.9p1. Both ssh and scp work as
> expected with '/bin/sh -> bash.bash' on the target, but with '/bin/sh ->
> busybox.nosuid' (ash shell) the connections fail.
>
> The sshd logs on the target show:
> Jul 31 15:24:56 hc sshd[17826]: Connection from UNKNOWN port 65535 on
> 192.168.12.246 port 65535
> Jul 31 15:24:56 hc sshd[17826]: debug1: kex_exchange_identification:
> write: Broken pipe
> Jul 31 15:24:56 hc sshd[17826]: banner exchange: Connection from UNKNOWN
> port 65535: Broken pipe
>
> Some extra debug messages that I've added to both systemd and sshd show
> that the incoming socket gets closed somewhere between the handover from
> the systemd socket service to the systemd sshd service. This results in
> sshd being unable to get any peer details. The call to getpeername in
> service_spawn fails with ENOTCONN.
>
> I can't see anything obvious in either the systemd source that suggests a
> dependency on bash.
>
> Any ideas gratefully appreciated.
>
> Regards
>
> Mark
>


Re: [systemd-devel] [EXT] Some base questions around systemd-resolved

2024-08-02 Thread Mantas Mikulėnas
 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 65494
> ;; QUESTION SECTION:
> ;_sip._tcp.osvsig-mets-prod.voip.itsvic.local. IN SRV
>
> ;; ANSWER SECTION:
> _sip._tcp.osvsig-mets-prod.voip.itsvic.local. 3600 IN SRV 20 0 5060
> osvn2-mets-prod.voip.itsvic.local.
> _sip._tcp.osvsig-mets-prod.voip.itsvic.local. 3600 IN SRV 10 0 5060
> osvn1-mets-prod.voip.itsvic.local.
>
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.53#53(127.0.0.53)
> ;; WHEN: Tue Jul 30 15:38:47 AEST 2024
> ;; MSG SIZE  rcvd: 179
>
> Thanks for any help.
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Some base questions around systemd-resolved

2024-08-01 Thread Mantas Mikulėnas
On Fri, Aug 2, 2024, 02:04 struth  wrote:

> Hello systemd-devel group.
> I have just started using systemd-resolved to try and achieve a goal that
> I will try to explain.
> I know very little about it (web searches so far) so please excuse any
> silly questions or trains of thought.
> I have a Debian Bullseye client in a Microsoft network that uses a .local
> domain.
> I know that this is a bad policy, but there is nothing I can do about
> it. I have no choice or authority in this matter. This is how they have
> configured their whole environment.
> I have read here  [ https://github.com/systemd/systemd/issues/8852 ] that
> .local can be used.
>
> At times there is complete isolation from the 4 Domain DNS servers and I
> want my client machine to still be able to resolve DNS entries
> (specifically SRV records with included A records) during this outage.
>
> I thought that systemd-resolved could cache the DNS entries and retain
> them until any of the DNS Servers returned to service.
> This only seems to happen for a short time after the outage. After some
> time ( I don't know how to tell how long) the entries seem to drop from
> cache.
>

Generally DNS caches are required to honor the TTL of each record and not
cache it further than that. Your dig output shows a TTL of 1h (3600) for
the SRV records, which was defined as part of the DNS entry and starts
counting down the moment that record is received from an authoritative
server (the DC).

"Serve stale" – caching beyond TTL if servers are outright unreachable –
was defined a few years back, but only some caches implement it (as an
option). I think bind and unbound do; systemd-resolved only gained it (as
StaleRetentionSec=) in v254, which Bullseye's version predates. So run
unbound configured that way and have resolved speak to it.
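
To make the TTL versus serve-stale behaviour concrete, here is a toy cache in
Python. It is only a sketch under my own assumptions: the class, its method
names and the `upstream_reachable` flag are invented for illustration and do
not mirror how resolved, bind or unbound are implemented.

```python
import time


class TtlCache:
    """Toy DNS-style cache: entries expire after their TTL.

    With serve_stale=True, expired entries are kept and handed out only
    while the upstream servers are unreachable.
    """

    def __init__(self, serve_stale=False, clock=time.monotonic):
        self._entries = {}
        self._serve_stale = serve_stale
        self._clock = clock

    def store(self, name, value, ttl):
        # The TTL starts counting down from the moment the record is stored.
        self._entries[name] = (value, self._clock() + ttl)

    def lookup(self, name, upstream_reachable=True):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() < expires_at:
            return value  # still within TTL
        if self._serve_stale:
            # Expired: only answer from cache if nobody can be asked.
            return value if not upstream_reachable else None
        del self._entries[name]  # strict mode drops expired data
        return None
```

A strict cache returns nothing once the 3600 seconds are over, even during an
outage; the serve-stale variant keeps answering until the servers come back.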

> I would ideally like the entries to stay in cache until updated from DNS
> Server again (once one returns to service).
>
> On the SRV point: How can I be sure that it caches the full result of the
> SRV query?
> Eg: SRV gives 2 x A-records which then need to resolve to 2xIP-addresses.
>

I think it's necessary to actually make those A-queries for them to be
cached. Some are provided along with the SRV reply but I'm not sure how
caching of the 'Additional' section works.


> I'm not sure of the mailing list's policy for including config samples or
> logs, so I will include them here in the email and see what happens.
> Please excuse if this is too much or too little information.
>
> root@VATCPCOMMLC1:~# cat /etc/systemd/resolved.conf
> [Resolve]
> DNS= 10.24.1.135 10.24.129.135 10.24.1.136 10.24.129.136
> #FallbackDNS=
> Domains=itsvic.local
> #DNSSEC=no
> #DNSOverTLS=no
> #MulticastDNS=yes
> #LLMNR=yes
> #Cache=yes
> DNSStubListener=yes
> #DNSStubListenerExtra=
> #ReadEtcHosts=yes
> #ResolveUnicastSingleLabel=no
> root@VATCPCOMMLC1:~#
>
> root@VATCPCOMMLC1:~# ls -l /etc/resolv.conf
> lrwxrwxrwx 1 root root 39 Jul 30 14:11 /etc/resolv.conf ->
> ../run/systemd/resolve/stub-resolv.conf
> root@VATCPCOMMLC1:~#
> root@VATCPCOMMLC1:~# cat ../run/systemd/resolve/stub-resolv.conf
> nameserver 127.0.0.53
> options edns0 trust-ad
> search itsvic.local
> root@VATCPCOMMLC1:~#
>
> root@VATCPCOMMLC1:~# cat /etc/nsswitch.conf
> # /etc/nsswitch.conf
> passwd:     files systemd
> group:      files systemd
> shadow:     files
> gshadow:    files
> hosts:      files dns
> networks:   files
> protocols:  db files
> services:   db files
> ethers:     db files
> rpc:        db files
> netgroup:   nis
> root@VATCPCOMMLC1:~#
> root@VATCPCOMMLC1:~# resolvectl status
> Global
>  Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
>   resolv.conf mode: stub
> Current DNS Server: 10.24.1.135
>DNS Servers: 10.24.1.135 10.24.129.135 10.24.1.136 10.24.129.136
> DNS Domain: itsvic.local
> Link 2 (ens192)
> Current Scopes: none
>  Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS
> DNSSEC=no/unsupported
>
> Link 3 (ens224)
> Current Scopes: none
>  Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS
> DNSSEC=no/unsupported
>
> Link 4 (bond0)
> Current Scopes: LLMNR/IPv4 LLMNR/IPv6
>  Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS
> DNSSEC=no/unsupported
>
> root@VATCPCOMMLC1:~# dig srv _sip._tcp.osvsig-mets-prod.voip.itsvic.local
> ; <<>> DiG 9.16.48-Debian <<>> srv
> _sip._tcp.osvsig-mets-prod.voip.itsvic.local
> ;; global options: +cmd
> ;; Got answer:
> ;; WARNING: .local is reserved for Multicast DNS
> ;; You are currently testing what happens when an mDNS query is leaked to
> DNS
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57884
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 65494
> ;; QUESTION SECTION:
> ;_sip._tcp.osvsig-mets-prod.voip.itsvic.local. IN SRV
>
> ;; ANSWER SECTION:
> _sip._tcp.osvsig-mets-prod.voip.itsvic.local. 3600 IN SRV 20 0 5060
> osvn2-mets-prod.voip.itsvic.

Re: [systemd-devel] help re-configuring bond and ipoib devices/networks

2024-07-29 Thread Mantas Mikulėnas
Some network types use longer or shorter addresses, not all of them try to
mimic Ethernet.

For example FireWire uses 64-bit hardware addresses but IP-over-FW extends
it to 128-bit addresses in ARP for technical reasons, and I think it's the
same for Infiniband and IPoIB.

Unfortunately Networkd doesn't understand any of it.
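
A quick way to check how long such an address really is (a small Python
sketch; the function name is made up): count the colon-separated octets.

```python
def hardware_address_bits(text):
    """Return the length in bits of a colon-separated hex hardware address."""
    octets = [int(part, 16) for part in text.split(":")]
    if any(not 0 <= octet <= 0xFF for octet in octets):
        raise ValueError(f"not a list of octets: {text!r}")
    return 8 * len(octets)


# The address from the quoted .netdev file: 20 octets, i.e. 160 bits.
IPOIB_ADDRESS = "80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11"
```

`hardware_address_bits(IPOIB_ADDRESS)` gives 160, compared with 48 for a
six-octet Ethernet MAC.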

On Tue, Jul 30, 2024, 04:42 serenissi  wrote:

> I can't tell more about the IPoIB going down after networkd restart
> without additional debugging info. But going by the complaints, did you try
> removing the problematic keys (ipoib is part of netdev, not network.
> network has no knowledge of the device type)?
>
> Also are you sure 80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11
> is the mac address? It is clearly more than 48 bits.
> On 7/29/24 08:43, Chandler Sobel-Sorenson wrote:
>
> I'm quite frustrated having spent many hours and little success, when
> things were perfectly fine before our backup generator decided not to kick
> in and power surges ensued, messing up all kinds of things (stupid
> electrons).  We have a server that is a bit more important than the others,
> runs LDAP and hosts our home directories, and runs the subnet manager for
> an Infiniband network.
>
> The main uplink is a 2x10gbe bond device, and there's a local 1gbe
> 10-net interface/network, and an Infiniband with IPoIB.  The main issue is
> getting the IPoIB working.  The bond and its 2 links work fine after the
> machine boots, but whenever I restart systemd-networkd, it becomes
> unreachable even though there aren't any problems reported in the system
> logs and networkctl reports it's still routable.  The only thing I've
> figured to get it reachable again is to reboot the system, so I'd like to
> figure out the problem there because that shouldn't be happening.
>
> The IPoIB is driving me nuts, I read through all of systemd.network
> and systemd.netdev docs, and it's somehow become rogue and unmanaged when
> it used to be configured.  systemd-networkd keeps telling me about unknown
> keys and unknown sections, even though they were all added in versions
> previous to the current, which is:
>
> systemd 252 (252.26-1~deb12u2) running on Debian 11 Linux 5.10.0-20-amd64
>
> Below are the configs and messages from systemd.  Hope you can help.
> # networkctl
> IDX LINK     TYPE       OPERATIONAL SETUP
>   1 lo       loopback   carrier     unmanaged
>   2 ens1f0   ether      enslaved    configured <--10gbe link 1
>   3 enp4s0f0 ether      off         unmanaged  <--not connected
>   4 ens1f1   ether      enslaved    configured <--10gbe link 2
>   5 enp4s0f1 ether      routable    configured <--1gbe local 10-net
>   6 ibs3     infiniband off         unmanaged  <--lazy, rogue
>   7 ibs3d1   infiniband off         unmanaged  <--not connected
>   8 bond007  bond       routable    configured
>
> *Infiniband, IPoIB, ibs3*
> # cat 10-ibs3.netdev
> [Match]
>
> [NetDev]
> Name=ibs3
> Kind=ipoib
> MTUBytes=65520
> MACAddress=80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11
>
> [IPoIB]
> Mode=connected
>
>
> # cat ibs3-10_10_11_203.network
> [Match]
> Name=ibs3
>
> PermanentMACAddress=80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11
> Path=/devices/pci:00/:00:01.0/:01:00.0/net/ibs3
> Driver=ib_ipoib
> Type=infiniband
> Kind=ipoib
> Property=ID_NET_MANAGED_BY=io.systemd.Network
>
> [Link]
> MTUBytes=65520
>
> [Network]
> Kind=ipoib
> Address=10.10.11.203/24
> Gateway=10.10.11.203
> LinkLocalAddressing=no
> IPv4AcceptLocal=yes
> KeepConfiguration=static
>
> [IPoIB]
> Mode=connected
> systemd-networkd complaints
> /etc/systemd/network/10-ibs3.netdev:8: Not a valid hardware address,
> ignoring assignment:
> 80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11
> /etc/systemd/network/10-ibs3.netdev:8: Not a valid hardware address,
> ignoring assignment:
> 80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:6f:85:11
> /etc/systemd/network/ibs3-10_10_11_203.network:15: Unknown key 'Kind' in
> section [Network], ignoring.
> /etc/systemd/network/ibs3-10_10_11_203.network:22: Unknown section
> 'IPoIB'. Ignoring.
>
> *bond*
> # cat 10-bond007.netdev
> [NetDev]
> Name=bond007
> Kind=bond
> MTUBytes=9000
>
> [Bond]
> Mode=802.3ad
> MIIMonitorSec=1000
> UpDelaySec=1000
> DownDelaySec=2000
>
>
> # cat Intel_X710_DA2-bond007.network
> [Match]
> Path=pci-:05:00.0
> Path=pci-:05:00.1
>
> [Network]
> Bond=bond007
>
>
> # cat bond007-10_140_78_70.network
> [Match]
> Name=bond007
>
> [Network]
> Address=10.140.78.70/28
> Gateway=10.140.78.65
> DNS=128.196.11.233
> DNS=128.196.11.234
> DNS=128.196.11.235
> LinkLocalAddressing=no
> IPv6AcceptRA=no
>
> Best Regards,
> Chandler

Re: [systemd-devel] Best Practices with homectl ↔ passwd/groups/shadow ?

2024-07-29 Thread Mantas Mikulėnas
I'm not sure if that's related to homectl - it seems that you're trying to
specify User= and Group= within a user service. The whole "systemd --user"
service manager (user@xxx.service) is unprivileged and runs as your user,
so it cannot change its UID anyway or set any supplementary groups except
those that it already has.
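
The same restriction can be seen from any unprivileged process. A minimal
Python sketch (the function name is mine, and this is not what systemd
executes internally): setting supplementary groups needs privileges, so the
call is refused when run as a regular user.

```python
import os


def try_set_supplementary_groups(groups):
    """Attempt to set the supplementary group list of this process."""
    try:
        os.setgroups(groups)
    except PermissionError as exc:
        # An unprivileged process is refused, which matches the
        # "Operation not permitted" in the quoted journal output.
        return f"refused: {exc.strerror}"
    return "ok"
```

Called as `try_set_supplementary_groups(os.getgroups())` from a normal login
session, I would expect the "refused" branch even though the list passed in is
the one the process already has.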

On Mon, Jul 29, 2024, 17:43 Divine Eguzouwa 
wrote:

> Is it wise to use only `homectl` to manage human users *without* reciprocal
> entries in /etc/passwd, /etc/group, or /etc/shadow?
>
> $ systemd-analyze security wireplumber --user
>
> | NAME  | Description| Exposure|
>
> | --| -- | --- |
>
> | ❌ User=/DynamicUser= | Service runs.. | 0.4 |
>
> → Overall exposure level for wireplumber.service...
>
>
> $ systemctl edit wireplumber.service --user
> ### Editing
> /home/me/.config/systemd/user/wireplumber.service.d/override.conf
> ### Anything between here and the comment below will become the contents
> of the...
>
> [Service]
>
> User=%u
>
> Group=%g
>
> ### Edits below this comment will be discarded
> ...
>
> $ systemctl daemon-reload --user
>
> $ systemctl restart wireplumber.service --user
> $ journalctl -r --unit=wireplumber --user
> systemd[851]: Failed to start Multimedia Service Session Manager.
> systemd[851]: wireplumber.service: Failed with result 'exit-code'.
> systemd[851]: wireplumber.service: Start request repeated too quickly.
> systemd[851]: wireplumber.service: Scheduled restart job, restart counter
> is at 5.
> systemd[851]: wireplumber.service: Failed with result 'exit-code'.
> systemd[851]: wireplumber.service: Main process exited, code=exited,
> status=216/GROUP
> (eplumber)[11087]: wireplumber.service: Failed at step GROUP spawning
> /usr/bin/wireplumber: Operation not permitted
> *(eplumber)[11087]: wireplumber.service: Failed to determine supplementary
> groups: Operation not permitted*
> systemd[851]: Started Multimedia Service Session Manager.
>
>
>
> homectl should already know of this user's supplementary groups, unless
> homectl is searching for them in `/etc/group` instead?
>
> --D
>
>
>


Re: [systemd-devel] "OnUnitInactiveSec Timer not firing" issue

2024-07-29 Thread Mantas Mikulėnas
On Mon, Jul 29, 2024 at 9:33 AM Windl, Ulrich  wrote:

> Hi!
>
>
>
> I tried to use my first systemd timer, but failed: Either I don’t
> understand it correctly, or there is a bug in systemd (228 of SLES12 SP5):
>
> (See also https://unix.stackexchange.com/q/779714/320598)
>
>
>
> It seems it’s not enough to “enable” the timer, but also “start” it (well,
> it may seem logical from the systemd point of view, but from a cron user’s
> point of view enabling should be enough)
>

"Start" is the primary action in systemd. Starting a .service runs it;
starting a .mount mounts it; starting a .timer schedules it; starting a
.socket listens on it.


> Furthermore it seems to be necessary to run the service unit itself,  too
> (assuming it must be enabled also, right?)
>

No. The purpose of the timer is to start the service, so starting the
service manually (or "enabling" it, to be started on boot) would be
redundant.


> But the biggest thing is that systemd seems to lose the point-in-time of
> the last activation, so the timer won’t fire any more (e.g. after package
> upgrade when everything enabled would be re-enabled, and everything started
> would be re-started).
>
> But most of all if the system reboots, the timer also won’t fire any more.
>
>
>
> So can anybody explain how things should work?
>
>
>
> My expectation was that an OnUnitInactiveSec timer would fire immediately if 
> it never ran, and then every day from that.
>
>
>
> Kind regards,
>
> Ulrich
>
>
>
>
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] "systemd-path systemd-search-user-unit" does not match reality

2024-07-25 Thread Mantas Mikulėnas
On Thu, Jul 25, 2024 at 10:42 AM Vladimir Panteleev <
g...@vladimir.panteleev.md> wrote:

> It looks like "systemd-path systemd-search-user-unit" isn't accurate
> or does not correspond to the list of paths that systemd is looking
> in.
>

The paths for user units and other user configuration depend on your
XDG_{CONFIG,DATA}_{DIRS,HOME} environment variables.

Since systemd-search-user-unit is running as part of your interactive
session (and under your interactive shell) whereas systemd --user itself *is
not*, they will likely have different lists of environment variables,
especially if you have Nix set up a custom XDG_* through /etc/profile or
similar.
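
A simplified sketch of why the two can disagree, in Python. It only models
the XDG-derived part of the search path (the fixed directories such as
/etc/systemd/user are left out), the function name is made up, and the Nix
path in the usage note is purely hypothetical.

```python
import os.path


def xdg_user_unit_dirs(env, home):
    """Derive the XDG-based user unit directories from an environment mapping.

    Defaults follow the XDG Base Directory specification.
    """
    config_home = env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    data_home = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    data_dirs = (env.get("XDG_DATA_DIRS") or "/usr/local/share:/usr/share").split(":")
    bases = [config_home, data_home, *data_dirs]
    return [os.path.join(base, "systemd", "user") for base in bases if base]
```

Feeding it the environment of an interactive shell, for instance
`{"XDG_DATA_DIRS": "/nix/profile/share:/usr/share"}`, and the (empty) set of
XDG variables that the manager started with yields two different lists, which
is the mismatch described above.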

While systemd --user has a few ways to push environment variables into the
services it starts, those all happen after initialization; there's no good
equivalent for providing envvars for systemd itself. You would need to
`sudo systemctl edit user@$UID` and add some [Service] Environment=
definitions there.

(In the early days I used to edit the user@.service to invoke
`ExecStart=/bin/sh -l -c "exec systemd --user"` so that it would go through
the shell's ~/.profile processing, but I'm not sure if that works these
days.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] namespace problem

2024-07-18 Thread Mantas Mikulėnas
On Thu, Jul 18, 2024, 15:43 Thomas Köller  wrote:

> Am 18.07.24 um 14:04 schrieb Mantas Mikulėnas:
> > Yes, but namespace persistence actually relies on filesystem access –
> > it's implemented as a bind-mount of the namespace file descriptor (onto
> > /run/netns for the 'ip netns' tool), as otherwise namespaces only exist
> > as long as processes that hold them.
> >
> > So if you have any service options that cause a new *mount* namespace to
> > be created (preventing its filesystem mounts from being visible outside
> > the unit), then it cannot pin persistent network namespaces.
>
> Quoting the manual page:
> ProtectSystem=
> Takes a boolean argument or the special values "full" or
> "strict". If true, mounts the /usr/ and the boot loader directories
> (/boot and /efi) read-only for processes invoked by this unit. If set
> to "full", the /etc/ directory is mounted read-only, too.
>
> No mention of /var or /run.


It still works this way whether it's mentioned or not. Once the unit's
process is put in a new mount namespace, the entire `/` is marked private
so that any mounts made underneath `/` remain visible only in that
namespace. This equally affects the "read-only /etc" mount done by systemd
itself as well as the /run/netns mount done by 'ip' or any other mounts
done anywhere else.

In theory it would be possible to carve out exceptions such as marking /run
shared again, but then /run/systemd would need to be marked private again,
etc. – and mount propagation across namespaces is complex enough as it is.

> Also, note that the bind mounts in
> /var/run/netns and /run/netns are actually created by 'ip netns add',
> they just aren't usable.
>

No, the mount *points* in /run/netns are created (as regular empty files),
but they don't become actual mounts, that's why they're not usable.

There's a distinction between mount points (files or directories seen in
`ls`) and mounts (seen in `findmnt`) – make your service script log its
findmnt output to a file and compare it to findmnt output seen from the
outside.

(ember) /home/grawity $ mount | grep netns
tmpfs on /run/netns type tmpfs
(rw,nosuid,nodev,size=3268196k,nr_inodes=819200,mode=755,inode64)
(ember) /home/grawity $ sudo systemd-run --shell -p ProtectSystem=full
Running as unit: run-u1253.service; invocation ID:
9d4675b9ef7c40d68486b3058ee8a60b
Press ^] three times within 1s to disconnect TTY.
root@ember /home/grawity # mount | grep netns
tmpfs on /run/netns type tmpfs
(rw,nosuid,nodev,size=3268196k,nr_inodes=819200,mode=755,inode64)
root@ember /home/grawity # ip netns add foo
root@ember /home/grawity # mount | grep netns
tmpfs on /run/netns type tmpfs
(rw,nosuid,nodev,size=3268196k,nr_inodes=819200,mode=755,inode64)
nsfs on /run/netns/foo type nsfs (rw)
root@ember /home/grawity # exit
Finished with result: success
Main processes terminated with: code=exited, status=0/SUCCESS
Service runtime: 18.451s
(ember) /home/grawity $ mount | grep netns
tmpfs on /run/netns type tmpfs
(rw,nosuid,nodev,size=3268196k,nr_inodes=819200,mode=755,inode64)
(ember) /home/grawity $

(The non-systemd rough equivalent is `unshare --mount
--propagation=private`, and you can attach to a namespace using `nsenter` –
an "ip netns exec" is approximately an `nsenter --net`.)
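
If you would rather compare the two views programmatically than eyeball
findmnt output, the mount table of a process is available in
/proc/self/mountinfo, where the fifth field is the mount point. A small
Python sketch (the function name is mine; escaped characters such as \040 in
paths are not handled):

```python
def mounts_under(mountinfo_text, prefix):
    """Return the mount points found in mountinfo text at or below prefix."""
    found = []
    below = prefix.rstrip("/") + "/"
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a mountinfo record
        mount_point = fields[4]  # fifth field: where the mount is attached
        if mount_point == prefix or mount_point.startswith(below):
            found.append(mount_point)
    return found
```

Run it on the contents of /proc/self/mountinfo once from inside the service
and once from a root shell: a /run/netns/NAME entry that appears only in the
first result is a mount that never propagated to the outside.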

>


Re: [systemd-devel] namespace problem

2024-07-18 Thread Mantas Mikulėnas
On Thu, Jul 18, 2024 at 2:14 PM Thomas Köller 
wrote:

> > Does it use any hardening options at all?
>
> Thanks for the hint. As it seems this is an undocumented side effect of
> 'ProtectSystem = full'. From reading the docs I got the impression that
> only file system access is affected by this parameter.
>

Yes, but namespace persistence actually relies on filesystem access – it's
implemented as a bind-mount of the namespace file descriptor (onto
/run/netns for the 'ip netns' tool), as otherwise namespaces only exist as
long as processes that hold them.

So if you have any service options that cause a new *mount* namespace to be
created (preventing its filesystem mounts from being visible outside the
unit), then it cannot pin persistent network namespaces.

(It's also a bit overkill to use ProtectSystem for this kind of script,
really.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] namespace problem

2024-07-18 Thread Mantas Mikulėnas
Would really like to see the contents of the .service file. Does it use any
hardening options at all?

On Thu, Jul 18, 2024 at 10:49 AM Thomas Köller 
wrote:

> Hi,
>
> I have a problem creating a namespace from a systemd service. The
> service (type oneshot) invokes a shell script containing these two lines:
>
>  ip netns add vpnlink
>  iw phy phy0 set netns name vpnlink
>
> Both commands succeed, meaning they do not return an error, and so the
> service start is successful. However, the newly created network
> namespace is apparently unusable. Invoking the script from a root shell
> outside of the systemd service successfully creates the namespace. The
> log below illustrates the problem:
>
> root@htpc:~/netsu# ip netns list
> root@htpc:~/netsu# ./netsu
> root@htpc:~/netsu# ip netns list
> vpnlink (id: 0)
> root@htpc:~/netsu# ip netns exec vpnlink ip link show
> 1: lo:  mtu 65536 qdisc noop state DOWN mode DEFAULT group
> default qlen 1000
>  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 4: wlan_usb:  mtu 1500 qdisc noop state DOWN mode
> DEFAULT group default qlen 1000
>  link/ether 00:0f:60:06:7f:3b brd ff:ff:ff:ff:ff:ff
> root@htpc:~/netsu# ip netns del vpnlink
> root@htpc:~/netsu# ip netns list
> root@htpc:~/netsu# systemctl restart network-setup.service
> root@htpc:~/netsu# systemctl status network-setup.service
> ● network-setup.service
>   Loaded: loaded (/etc/systemd/system/network-setup.service;
> enabled; preset: disabled)
>  Drop-In: /usr/lib/systemd/system/service.d
>   └─10-timeout-abort.conf
>   Active: active (exited) since Thu 2024-07-18 09:34:55 CEST; 14s ago
>  Process: 3320 ExecStart=/root/netsu/netsu (code=exited,
> status=0/SUCCESS)
> Main PID: 3320 (code=exited, status=0/SUCCESS)
>  CPU: 29ms
>
> Jul 18 09:34:55 htpc systemd[1]: Starting network-setup.service...
> Jul 18 09:34:55 htpc systemd[1]: Finished network-setup.service.
> root@htpc:~/netsu# ip netns list
> Error: Peer netns reference is invalid.
> Error: Peer netns reference is invalid.
> vpnlink
> root@htpc:~/netsu# ip netns exec vpnlink ip link show
> setting the network namespace "vpnlink" failed: Invalid argument
> root@htpc:~/netsu# ip netns del vpnlink
>
> Am I missing something? Of course, the process running the root shell
> invoked from the command line is ultimately also a child of systemd,
> which is the system's init process.
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] passing additional FDs to service

2024-07-05 Thread Mantas Mikulėnas
A service could receive multiple listener sockets, but I don't remember
systemd having an option to pass client connection sockets – and I don't
think it would make much sense, as the SMTP server is likely to close the
connection while the service is still running, and then systemd would
definitely have no way to inject a replacement socket.

Instead, I'd probably make the fcgi service talk SMTP to localhost or even
over a Unix socket (i.e. to a local MTA); filesystem-based Unix sockets
are not bound to a network namespace, so they should stay reachable even
with PrivateNetwork=yes.

On Fri, Jul 5, 2024, 17:25 Andrea Pappacoda  wrote:

> Hi all!
>
> I'm writing a small FastCGI daemon which, in addition to the socket used
> to talk FastCGI to the web server, talks SMTP through another (inet)
> socket (as an SMTP client).
>
> The FastCGI socket is created by systemd with a .socket unit and passed
> to the service as an fd (which also enables socket activation), while
> the SMTP socket is opened and managed by the daemon itself.
>
> What I'm asking here is if there's a way to also pass the SMTP socket as
> a file descriptor to the daemon from systemd, so that the daemon doesn't
> need to manage sockets itself (as all it does is reading fds passed by
> the service manager) and can be further restricted with options like
> PrivateNetwork=yes.
>
> Ideally, I'd just get fd 3 and use it to listen for incoming requests,
> and get fd 4 and use it to talk TLS + SMTP over TCP to the remote (or
> local) SMTP server.
>
> Is this currently possible with systemd? Am I missing something which
> would make this a bad idea?
>
> Thanks!
>


Re: [systemd-devel] configuring nspawn private network (mtu & mac)

2024-07-01 Thread Mantas Mikulėnas
On Mon, Jul 1, 2024 at 6:35 PM Ede Wolf  wrote:

> Got it. On the host I need to match OriginalName, not Name, in the link
> file. Then I am able to set the macaddress and the mtu.
>
> Since I am not sure, how stable those names are (as in
> vb-webserver@if3), is there maybe a way I have missed to label the
> interface names in the .nspawn file to later reference them in the .link
> file?
>

"@if3" is not part of the name. The interface name should be just
"vb-webserver" and is based directly on the nspawn name.

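For illustration only, a .link file matching on that name could look like
the sketch below. The file name, MAC address and MTU are placeholders I made
up, not values from this thread; the shell function merely writes the file
into whichever directory you pass it (normally /etc/systemd/network).

```shell
# Write a hypothetical .link file for the host side of the nspawn veth.
# $1 = target directory (normally /etc/systemd/network)
write_vb_link() {
    cat > "$1/10-vb-webserver.link" <<'EOF'
[Match]
OriginalName=vb-webserver

[Link]
# Placeholder values, adjust to taste
MACAddress=02:00:00:00:00:01
MTUBytes=1400
EOF
}

# Usage (as root): write_vb_link /etc/systemd/network
```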
-- 
Mantas Mikulėnas


Re: [systemd-devel] Question about the behavior of systemd (when requesting A/AAAA via multiple interfaces)

2024-07-01 Thread Mantas Mikulėnas
On Mon, Jul 1, 2024 at 6:57 AM 松藤 諒太 
wrote:

> Dear contributors to systemd-resolved:
>
> Hello. I'm Ryota Matsufuji.
>
> Could I ask a question about the behavior of systemd-resolved?
>
> When being requested v4 and v6 address by application(such as wget with
> default option or firefox),
> depending on the interfaces' configuration, I watched multiple queries
> for both v4 and v6 address are launched through those interfaces.
>
> At this condition, I've found that systemd-resolved performed to return
> the result of those queries to application
> unless all queries are completed being resolved via one of multiple
> interfaces.
>
> I imagined that when A and AAAA record are received, disregarding any
> interface completed resolving queries through itself,
> resolved could return the result.
> (for instance, received A from eth0 and AAAA from eth1, and not received
> AAAA from eth0 and A from eth1)
>
> Actually, It seems not as above.
>
> If is there any reason or restriction that resolved should wait for
> completing all queries through one of interfaces to return the result,
> I'm afraid I would ask the question for why it is ?
>

Not 100% sure about this, but as far as I know, it's because
systemd-resolved deliberately tries to avoid mixing address information
from different sources, in order to support "split-view DNS" or
"split-horizon DNS" that is commonly used with corporate VPNs. (But the
logic is general and applied to all interfaces, not only to VPN interfaces;
see `scope` and `DnsScope` in the source code.)

For example, if you're connected through VPN to an IPv6-capable workplace
network, the same server might be seen as having an IPv4 NAT address
through public DNS (eth0) but direct IPv6 through internal DNS (vpn0), and
it would not be correct to merge the public A and internal AAAA records
with the same priority, because the former might have different firewall
restrictions than the latter, etc. – instead, *all of* vpn0:IPvX gets
priority over eth0:IPvX.

(The same also applies if different interfaces provide different records of
the same type; e.g. if both public DNS and internal DNS provide different A
records for the same server, you would still want to prioritize one answer
instead of merging both.)

So instead of handling each record type independently, the high-level
ResolveHostname() varlink call treats the [IPv4+IPv6] group of answers from
the same interface as an indivisible [IPvX] unit, which means it must wait
for both A and AAAA replies from eth0 in order to produce the full
eth0:[IPvX] answer.


> Furthermore, does systemd provide the configuration to switch this
> behavior ?
>
> If so, could I get the information about the config option?
>

I don't think there is an option to disable it if you are using the
'resolve' module in /etc/nsswitch.conf (which uses the high-level
ResolveHostname call), but I suspect that switching to the traditional
'dns' module (which makes low-level A/AAAA queries to 127.0.0.53) would
bypass this logic.

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemd --user managers after systemd upgrade

2024-06-29 Thread Mantas Mikulėnas
v255 added a new systemd-executor binary – instead of direct
fork/setup/exec, now it's fork/exec(executor)/setup/exec(service), to avoid
doing too much stuff after fork. But the binary is executed off an open fd,
so even though you've upgraded it on disk, the manager is still holding
onto its old copy.

I guess holding onto the open fd ended up achieving the opposite of what
was intended.

I think asking systemd to reexec itself after the upgrade is how you're
supposed to handle it – i.e. first "systemctl daemon-reexec" the system
manager (or "telinit u" if you like), then "systemctl --user
daemon-reexec", or a mass "systemctl kill -s SIGRTMIN+25 user@\*.service".
(On Arch it's one of the very few daemon restarts that are automatically
done via post-upgrade hooks.)

On Sat, Jun 29, 2024, 22:05 Mike Gilbert  wrote:

> I recently added systemd v256 to Gentoo's ebuild repo. While testing
> the upgrade process from v255, I have run into an issue.
>
> After the upgrade, my KDE Plasma session stopped working, and I was
> unable to execute a reboot from the GUI.
>
> Looking at the journal, I see several messages like this one:
>
> Jun 29 14:21:30 naomi systemd[2387904]:
> /usr/lib/systemd/systemd-executor (deleted): error while loading
> shared libraries: libsystemd-core-255.so: cannot open shared object
> file: No such file or directory
>
> It appears to be executing a deleted binary
> (/usr/lib/systemd/systemd-executor), likely via /proc/1/fd/..., and
> then fails when loading a deleted shared library
> (libsystemd-core-255.so).
>
> The new versions of these files do exist on the filesystem. Also, I
> was able to reboot the system by switching to a text console and
> pressing ctrl-alt-delete.
>
> Any idea what happened here? I'm not sure if this is a systemd bug, or
> if I missed something in my packaging script (ebuild).
>


Re: [systemd-devel] Default run0 background colors not working

2024-06-28 Thread Mantas Mikulėnas
Does your terminal emulator support OSC 11 to report the *current*
background color?
(i.e. printf '\e]11;?\e\\'; read; should cause the terminal to respond with
the RGB value)
"Tint" is relative, so it cannot be applied if the original background
color is unknown.
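
If you want to script that check, something like the following might do.
It is only a sketch: the function names are mine, the reply format
("rgb:RRRR/GGGG/BBBB") is what xterm-compatible terminals commonly send, and
a reply that arrives in several pieces would be truncated by the single
read.

```shell
# Pull the "rgb:RRRR/GGGG/BBBB" part out of an OSC 11 reply string.
parse_osc11_reply() {
    printf '%s' "$1" | sed -n 's|.*\(rgb:[0-9A-Fa-f/]*\).*|\1|p'
}

# Ask the terminal for its background color; prints nothing if no reply
# arrives within about half a second. Needs a controlling terminal.
query_background() {
    saved=$(stty -g < /dev/tty) || return 1
    stty raw -echo min 0 time 5 < /dev/tty
    printf '\033]11;?\033\\' > /dev/tty
    reply=$(dd bs=64 count=1 2>/dev/null < /dev/tty)
    stty "$saved" < /dev/tty
    parse_osc11_reply "$reply"
}
```

An empty result from query_background (run inside the same ssh session)
would suggest the terminal does not answer OSC 11, in which case the tint
cannot be computed.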


On Fri, Jun 28, 2024 at 2:48 PM Lucas Sánchez  wrote:

> I recently updated to systemd 256 on my headless Raspberry Pi 4
> running Arch Linux ARM, and decided to try the new run0. I found out
> that when running it from ssh the default red and yellow background
> colors for root and other users aren't working for me, although
> manually setting --background does work. Setting
> $SYSTEMD_TINT_BACKGROUND makes no difference.
>
> Any ideas?
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Issues with Service Dependencies in Systemd

2024-05-21 Thread Mantas Mikulėnas
On Tue, May 21, 2024 at 11:47 PM Robert Landers 
wrote:

> Hello hello,
>
> I'm encountering an issue with Systemd service dependencies that I
> can't seem to resolve despite following the documentation. Either
> there's a misunderstanding on my part or there's a potential bug.
>
> 1. I cannot modify a specific service (immutable.service) because it's
> generated dynamically by a tool which I also cannot modify.
>

A service may consist of multiple config files – even if you cannot modify
/etc/…/immutable.service itself, you can still extend it by creating
/etc/…/immutable.service.d/foo.conf (e.g. using `systemctl edit immutable`)
which lets you add any properties (or reset/remove many of them); and
technically, you're already doing something like that using your WantedBy=
and RequiredBy= settings – they're both implemented by extending the
specified service with Wants/Requires without actually modifying its file.


> 2. I need to create a service (after-reboot.service) that runs after
> the network is completely up and running and before immutable.service.
> 3. I need to prevent immutable.service from starting if
> after-reboot.service fails to start.
>

Use the "drop-in" mechanism to extend immutable.service with:

# /etc/systemd/system/immutable.service.d/special.conf
[Unit]
After=after-reboot.service
Requires=after-reboot.service
# or Requisite=, or AssertSomething=, or whatever suits

> [Install]
> WantedBy=multi-user.target
> RequiredBy=immutable.service
>

This should work, but all of [Install] is only re-applied when you
`systemctl [re]enable after-reboot`, so make sure you have done that.
(That's the reason it's under [Install] and not under [Unit].)

But since it's done to a .service, it doesn't imply any Before/After (if I
remember correctly, the Wants-implies-After is .target-specific magic), so
that may be what makes RequiredBy= insufficient. Use a .conf to add both
Requires *and* After to immutable.service.


-- 
Mantas Mikulėnas


Re: [systemd-devel] MulticastDNS Responder Hostname in Early Boot

2024-04-29 Thread Mantas Mikulėnas
On Mon, Apr 29, 2024 at 9:16 AM Justin Brown 
wrote:

> Hello,
>
> I'm having some trouble with resolved as a multicast DNS responder in early
> boot. I'm trying to setup a headless system with full disk encryption, and
> I need to connect remotely (currently using tinyssh) to unlock sysroot and
> other volumes before the boot continues. I use networkd to setup the dhcp
> interface, which works fine. The problem is that resolved won't use the
> value in /etc/hostname, and I can't find a resolved or networkd option to
> specify a hostname.
>

Does your initramfs actually contain /etc/hostname? resolved will use the
value that's been set as the *kernel* hostname.

Usually the loading of /etc/hostname into the kernel hostname is done by
systemd, and if it hasn't done so then I'm guessing the file is not part of
the initrd...

(But you can use "/bin/hostname testvm" or "sysctl kernel.hostname=testvm"
or "echo testvm > /proc/sys/kernel/hostname" or pass
"systemd.hostname=testvm" as a kernel command line option to achieve the
same thing.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] How to chain services driven by a timer?

2024-04-10 Thread Mantas Mikulėnas
On Wed, Apr 10, 2024 at 5:50 PM Brian Reichert  wrote:

> My goal is to implement a service that runs after logrotate.service
> completes.
>
> logrotate.service is triggered by a timer logrotate.timer.
>
> I don't want to modify either of logrotate.service or logrotate.timer,
> as they are provided by the OS vendor (SLES 12 SP5, in my case.)
>

In a sense, you're already modifying units provided by the OS vendor –
whenever you use `systemctl enable` to link a service into
multi-user.target.wants/, that "modifies" multi-user.target by adding a
Wants= dependency, just that that happens in a way that does not get reset
during package updates. Systemd has plenty of mechanisms for that; e.g. you
can add additional settings to
`/etc/systemd/system/logrotate.service.d/*.conf` (using systemctl edit) or
you can link some units into logrotate.service.wants/ in the same way as is
done with .targets.

(You need to do this to the .service, as that's what actually gets
activated periodically.)

> My current service file:
>
>   [Unit]
>   Description=Activities after logrotation
>
>   Requires=logrotate.service
>   Wants=logrotate.service
>

That seems like the complete opposite of what you're trying to achieve –
this makes *your* unit trigger the start of logrotate, not the other way
around.


>   After=logrotate.service
>

After= does not define a trigger. It only defines the execution order when
multiple units are triggered at once.


>
>   [Service]
>   #Type=oneshot
>   Type=simple
>
>   ExecStart=/usr/bin/logger 'XXX post log rotation'
>
>   [Install]
>   WantedBy=timers.target
>

The WantedBy= (or the .wants/ symlink that results from it) is the only
trigger being defined in your unit. But since the service was set up to be
started by timers.target, and timers.target itself only starts once during
boot (when timers get scheduled), that means your service is also started
once during boot and that's that.

If you want it to be triggered by logrotate.service, then you need
WantedBy=logrotate.service. Then each time logrotate.service is started on
schedule, it'll cause your service to be started as a dependency, and the
After= will actually work to define the order.
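
Put together, the unit might end up looking roughly like this. It is a
sketch, not a tested configuration: the unit name post-logrotate.service is
made up, and I've assumed Type=oneshot since the command just runs and
exits.

```shell
# Write a hypothetical post-logrotate.service into the directory given
# as $1 (normally /etc/systemd/system).
write_post_logrotate_unit() {
    cat > "$1/post-logrotate.service" <<'EOF'
[Unit]
Description=Activities after log rotation
# Ordering only: wait until logrotate.service has finished
After=logrotate.service

[Service]
Type=oneshot
ExecStart=/usr/bin/logger 'XXX post log rotation'

[Install]
# The actual trigger: started whenever logrotate.service is started
WantedBy=logrotate.service
EOF
}

# Usage (as root):
#   write_post_logrotate_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable post-logrotate.service
```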

-- 
Mantas Mikulėnas


Re: [systemd-devel] How to debug systemd services failing to start with 11/SEGV?

2024-04-10 Thread Mantas Mikulėnas
On Wed, Apr 10, 2024 at 4:08 PM Alexander Dahl  wrote:

> Note: platform here is 32 bit arm, namely v5te on Microchip SAM9X60
> SoC.  Kernel is 6.6, maybe I did not get the kernelconfig right and
> some options are not set correctly?  Or maybe those crashes are real?
> Then I could need some help how to _really_ enable coredumps for
> journald, udevd, and timesyncd.  Got a hint off-list to pass
> 'systemd.dump_core=true' to kernel cmdline, but that had no effect on
> coredump creation.
>

I would just set kernel.core_pattern to a *file* path, e.g.
"/var/log/core.%P". Then use the shell's ulimit command to raise the
coredump size limit as it defaults to zero (ulimit -c unlimited), and
manually start /usr/lib/systemd/systemd-timesyncd from the shell (timesyncd
is the simplest one and doesn't do anything system-critical).

Alternatively, run the service under the debugger: `gdb /usr/.../timesyncd`.
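
Spelled out as commands, the first approach could look like the sketch
below. The function name is mine; it has to run as root and it changes
kernel.core_pattern system-wide until you set it back or reboot.

```shell
# Collect a core file from a crashing daemon by running it by hand.
# $1 = path of the daemon binary, e.g. /usr/lib/systemd/systemd-timesyncd
run_with_core_dumps() {
    # Write cores to a plain file instead of piping them to a handler
    sysctl -w kernel.core_pattern=/var/log/core.%P || return 1
    (
        ulimit -c unlimited   # the soft core size limit is usually 0
        exec "$1"
    )
}

# Usage (as root):
#   run_with_core_dumps /usr/lib/systemd/systemd-timesyncd
```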

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemctl inaccessible when enabling DynamicUser=true

2024-03-29 Thread Mantas Mikulėnas
I don't know, but it might be related to this note:
https://github.com/systemd/systemd/commit/0a207d8f234ff7ea0d7988445e38685090fc930e

On Fri, Mar 29, 2024, 19:53 Nils Kattenbeck  wrote:

> On Fri, Mar 29, 2024 at 7:04 AM Mantas Mikulėnas 
> wrote:
> >
> > It's probably a difference between dbus-daemon and dbus-broker, I
> suspect.
>
> Hi, that was indeed the problem. Installing dbus-broker on one of the
> machines did in fact fix this. Any idea why that might be? I do not
> know the differences between the two so I am not completely
> comfortable with replacing dbus-daemon with dbus-broker just to fix
> this small issue. This issue at least confirms to me that they are not
> drop-in compatible.
>
> Kind regards,
> Nils
>


Re: [systemd-devel] systemctl inaccessible when enabling DynamicUser=true

2024-03-28 Thread Mantas Mikulėnas
It's probably a difference between dbus-daemon and dbus-broker, I suspect.

On Thu, Mar 28, 2024 at 8:55 PM Nils Kattenbeck 
wrote:

> On Thu, Mar 28, 2024 at 3:08 PM Luca Boccassi 
> wrote:
> >
> > Works just fine here in Debian with 252:
>
> Hm, weird. With logging enabled I get the following output:
>
> $ sudo systemd-run -t --collect -p DynamicUser=true -E
> SYSTEMD_LOG_LEVEL=debug systemctl --failed
> Running as unit: run-u1497.service
> Press ^] three times within 1s to disconnect TTY.
> Cannot stat /proc/1/root: Permission denied
> running_in_chroot(): Permission denied
> Bus n/a: changing state UNSET → OPENING
> sd-bus: starting bus by connecting to /run/dbus/system_bus_socket...
> Bus n/a: changing state OPENING → AUTHENTICATING
> Successfully forked off '(pager)' as PID 386098.
> Skipping PR_SET_MM, as we don't have privileges.
> sd_pid_get_owner_uid() failed, enabling pager secure mode: No data
> available
> Pager executable is "less", options "FRSXMK", quit_on_interrupt: yes
> Bus n/a: changing state AUTHENTICATING → HELLO
> Bus n/a: changing state HELLO → CLOSING
> Failed to list units: Transport endpoint is not connected
> Bus n/a: changing state CLOSING → CLOSED
> $ systemd 252 (252.22-1~deb12u1)
> +PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS
> +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD
> +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2
> +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT
> default-hierarchy=unified
>
> I can also can reproduce this on another machine running Ubuntu 22.04
> LTS with the systemd 249 (249.11-0ubuntu3.12). On my laptop (Fedora
> 40) I cannot reproduce the error and it works like in your case. The
> other two machines are servers.
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] How to automatically decrypt a disk on connection

2024-03-27 Thread Mantas Mikulėnas
On Wed, Mar 27, 2024, 16:36 Orion Poplawski  wrote:

>
>
> Can I setup a unit that gets started automatically when a particular
> dev-disk-by-uuid device becomes present?
>

Just link it under dev-disk-foo.device.wants/ (systemctl enable, or
systemctl add-wants).

Alternatively, ENV{SYSTEMD_WANTS}="foo.service" from udev will have the
same effect.
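
A rough sketch of both variants follows. The rule file name, the UUID and
the unit name are placeholders; I've matched on ID_FS_UUID, which assumes
udev has already probed the device, and used "+=" so that an existing
SYSTEMD_WANTS value is not overwritten.

```shell
# Print the .device unit name systemd uses for a /dev/disk/by-uuid link.
# $1 = filesystem or LUKS UUID
device_unit_for_uuid() {
    systemd-escape --path --suffix=device "/dev/disk/by-uuid/$1"
}

# Write a udev rule that pulls in a unit when that UUID appears.
# $1 = rules directory (normally /etc/udev/rules.d), $2 = UUID, $3 = unit
write_wants_rule() {
    cat > "$1/99-unlock-on-plug.rules" <<EOF
ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_UUID}=="$2", ENV{SYSTEMD_WANTS}+="$3"
EOF
}

# Usage (as root), either:
#   systemctl add-wants "$(device_unit_for_uuid 1234-ABCD)" foo.service
# or:
#   write_wants_rule /etc/udev/rules.d 1234-ABCD foo.service
#   udevadm control --reload
```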


> Thanks.
>
> --
> Orion Poplawski
> he/him/his  - surely the least important thing about me
> IT Systems Manager 720-772-5637
> NWRA, Boulder/CoRA Office FAX: 303-415-9702
> 3380 Mitchell Lane   or...@nwra.com
> Boulder, CO 80301 https://www.nwra.com/
>
>


Re: [systemd-devel] Forking service behind socket and service.

2024-03-27 Thread Mantas Mikulėnas
On Wed, Mar 27, 2024 at 9:35 AM Steve Traylen  wrote:

> Hi,
> I have a old legacy service that's a bit odd in that it was previously
> launched by xinetd but is weird in that:
>
> After an initial quite quick auth and set up a daemon is forked off to run
> the much longer process.
>
> Setting this up with a socket me.socket and me@.service does not work
> quite right for us.
>
> In particular I want the socket to close once the fork happens.
>
> If the service is Type=forking things do work but socket is persisted -
> that's not great for thing doing the original submission. It expects the
> socket to be short lived.
>
> Only workaround I have which is rubbish for obvious reasons is to use the
> service with Type=simple, KillMode=None to leave the forked process running.
>
> Anyway to persuade the socket service to close earlier.
>
> Presumably in the xinetd world xinetd was oblivious to this forked process.
>

Well, xinetd predates the whole idea of per-service process tracking using
cgroups or whatever else; it only knows about its immediate children.

But even in the systemd world, if you have xinetd.service, everything
spawned by xinetd is still just part of xinetd.service and can continue
running as long as xinetd (the main process of the .service) keeps running.

So I would say just keep using xinetd for it? It's definitely not a good
fit for systemd .socket as you've found out, but it can continue to run
under (x)inetd or a custom `systemd-socket-activate` service (that's mainly
a CLI tool for testing but it would work as a service too).

-- 
Mantas Mikulėnas


Re: [systemd-devel] ConditionFirstBoot question

2024-03-12 Thread Mantas Mikulėnas
On Tue, Mar 12, 2024, 15:06  wrote:

> Hi,
>
> I have a system that needs to perform some tasks on first boot.  I have
> this working for the most part but I had some general questions and would
> like some guidance on the proper implementation.
>
> The tasks I need to perform on first boot include changing the hostname,
> formatting some flash and mounting the filesystem, updating some config
> files, etc.
>

If this is supposed to prepare the root filesystem, then the initramfs
might be a good place as it always runs before anything in the rootfs does.

(On the other hand, I've seen designs where the first boot goes to a
completely different target and only after it's done it either reboots or
switches to multi-user.target.)


> I found that if I put these tasks sequentially into single script
> /etc/hostname is updated and the hostname command returns the updated
> value, the hostname shown at login is still the old value.
>
> Now if I split tasks into individual services and order them with Before
> and After the hostname is correct at login.
>

If you mean the banner shown *above* the "login:" prompt – that's shown by
agetty, which is getty@tty1.service. It doesn't look at /etc/hostname, it
looks at the kernel parameter that's set from /etc/hostname during early
boot (before any services).

So you need to update the kernel hostname using the `hostname` command, and
you probably want to order your service before getty@tty1 (maybe before
systemd-user-sessions.service).


> I suspect this is because of systemd's parallelization and the monolithic
> script will not finish before the service or target that pulls in
> /etc/hostname, is that correct?
>

/etc/hostname is read *very* early – if I remember correctly, that's done
by systemd itself before it starts any services at all. The rest of the
system doesn't use that file but looks only at the kernel hostname.

If you're changing it using the `hostname` command, then it mostly doesn't
matter, as you're updating the same kernel parameter as systemd does.
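
As a sketch of what such a unit could look like (the unit name and the
script path are invented for the example, and I'm assuming the script itself
ends by pushing the new name into the kernel):

```shell
# Write a hypothetical first-boot unit into the directory given as $1
# (normally /etc/systemd/system).
write_firstboot_unit() {
    cat > "$1/firstboot-setup.service" <<'EOF'
[Unit]
Description=First-boot provisioning (hypothetical example)
ConditionFirstBoot=yes
# Finish before the login prompt is drawn
Before=systemd-user-sessions.service getty@tty1.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Hypothetical script; it should end by updating the kernel hostname,
# e.g. with: hostname -F /etc/hostname
ExecStart=/usr/local/sbin/firstboot-setup

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root): write_firstboot_unit /etc/systemd/system
```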


> It is not clear to me what specific target or service my service needs to
> be set Before so that the script can finish.
>
> It also occurs to me that perhaps it makes more sense to split the tasks
> out so that they can run in parallel to optimize the boot, would this be
> the best practice?
>
> Thanks, Matt.
>


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Mantas Mikulėnas
On Wed, Mar 6, 2024 at 12:21 PM Arseny Maslennikov  wrote:

> So mode 2 only really makes sense for deployments which are only ever
> accessible from intranets with little junk traffic.
>

Which is the case for "deployments" that are *not servers* in the first
place. Many distros are oriented towards personal computers, which are
usually behind a firewall so junk traffic is not a concern, but which you
might want to SSH/VNC/RDP at unexpected moments.

For example, when I first started using systemd in ~2011, my laptop still
had a 5400 rpm HDD, and its boot time mattered far more than it does for
"deployments", so systemd's promise of on-demand startup of everything (to
reduce the boot I/O contention while still keeping the actual service
available) was particularly attractive.

(Of course, these days most systems have SSDs while even the baseline
systemd startup process runs twice as many Assorted Things as my full
desktop environment did in the past, so maybe the issue is no longer
relevant.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] Wireguard routes only after connect

2024-02-24 Thread Mantas Mikulėnas
On Wed, Feb 14, 2024, 10:55 Julian Zielke  wrote:

> Hi,
>
>
>
> is there a possibility to only add the routes from allowed-ips to the
> kernel routing table after the peer has connected?
>
> Because since the tunnel itself is stateless, there is no way for me to
> make use of OSPF to route packets to a selective server running a tunnel to
> the same endpoint (for loadbalancing and multi-wan reasons).
>

The easiest method might be to make the server itself talk OSPF with the
"stub router" option enabled (or BGP; I think some places use internal BGP
for that).

>


Re: [systemd-devel] Howto unshare when user session starts.

2024-02-21 Thread Mantas Mikulėnas
Use pam_namespace for mount namespacing (part of Linux-PAM, not systemd). I
don't think it handles user namespaces yet, but that would probably be a
fairly small change.

On Wed, Feb 21, 2024 at 7:57 PM Stef Bon  wrote:

> Hi,
>
> maybe this is a question simple to answer.
>
> I want the user sessions to start in a {mount,user} namespace. How can
> I do this? I know there is the command systemd-nspawn. But to use this
> I have to adjust the first command to start a session. Or is it
> possible by setting parameters in logind?
>
> Stef
> the Netherlands
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Assistance Needed with 'loginctl list-users' Command Display Issue

2024-02-12 Thread Mantas Mikulėnas
Also, if you're using a terminal that doesn't recognize OSCs (it should
just ignore unknown ones), export SYSTEMD_URLIFY=0 to disable the hyperlink
feature that's making a mess out of systemctl output.

On Tue, Feb 13, 2024, 06:53 Sangeetha Elumalai 
wrote:

> Hi,
>
> The* 'loginctl list-users'* command isn't displaying the user list. I
> would appreciate any suggestions on resolving this issue. Do I need to
> enable any specific service for this functionality?
>
> Here are the logs:
> ```
> # who
> root ttyS0Feb 15 19:12
> #
>
>
> # loginctl list-users
> No users.
> #
>
> # loginctl list-sessions
> No sessions.
> #
>
> # systemctl status systemd-logind
> ● systemd-logind.service - User Login Management
>  Loaded: loaded
> (8;;file://beagleboneblack/lib/systemd/system/systemd-logind.service/lib/systemd/system/systemd-logind.service8;;;
> static)
>  Active: active (running) since Wed 2023-02-15 19:12:10; 11 months 27
> days >
>Docs: 8;;man:sd-login(3)man:sd-login(3)8;;
>
>  8;;man:systemd-logind.service(8)man:systemd-logind.service(8)8;;
>  8;;man:logind.conf(5)man:logind.conf(5)8;;
>
>  8;;man:org.freedesktop.login1(5)man:org.freedesktop.login1(5)8;;
>Main PID: 135 (systemd-logind)
>  Status: "Processing requests..."
>   Tasks: 1 (limit: 1060)
>  Memory: 632.0K
> CPU: 482ms
>  CGroup: /system.slice/systemd-logind.service
>  └─135 /lib/systemd/systemd-logind
>
> Feb 15 19:12:06 beagleboneblack systemd[1]: Starting
> systemd-logind.service...
> Feb 15 19:12:10 beagleboneblack systemd-logind[135]: New seat seat0.
> Feb 15 19:12:10 beagleboneblack systemd[1]: Started systemd-logind.service.
> #
> ```
>
> Thank you,
> -Sangeetha
>
>


Re: [systemd-devel] Assistance Needed with 'loginctl list-users' Command Display Issue

2024-02-12 Thread Mantas Mikulėnas
You need to make sure the PAM configuration for whichever service you're
logging in through includes pam_systemd.so in the 'session' group. Check
/etc/pam.d on other distributions. (For tty logins it's /etc/pam.d/login,
but usually it's indirect via /etc/pam.d/common-session or something along
those lines.)
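
A quick way to see which PAM services reference the module is sketched
below. The function name is mine, and note that it does not follow
"@include" or "substack" lines, so an indirect reference shows up under the
included file's name only.

```shell
# List PAM service files that reference pam_systemd.so on a
# non-comment line.
# $1 = PAM config directory (defaults to /etc/pam.d)
pam_systemd_users() {
    grep -lE '^[^#]*pam_systemd\.so' "${1:-/etc/pam.d}"/* 2>/dev/null
}

# Usage: pam_systemd_users
# A typical entry looks like:  session  optional  pam_systemd.so
```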

On Tue, Feb 13, 2024, 06:53 Sangeetha Elumalai 
wrote:

> Hi,
>
> The* 'loginctl list-users'* command isn't displaying the user list. I
> would appreciate any suggestions on resolving this issue. Do I need to
> enable any specific service for this functionality?
>
> Here are the logs:
> ```
> # who
> root ttyS0Feb 15 19:12
> #
>
>
> # loginctl list-users
> No users.
> #
>
> # loginctl list-sessions
> No sessions.
> #
>
> # systemctl status systemd-logind
> ● systemd-logind.service - User Login Management
>  Loaded: loaded
> (8;;file://beagleboneblack/lib/systemd/system/systemd-logind.service/lib/systemd/system/systemd-logind.service8;;;
> static)
>  Active: active (running) since Wed 2023-02-15 19:12:10; 11 months 27
> days >
>Docs: 8;;man:sd-login(3)man:sd-login(3)8;;
>
>  8;;man:systemd-logind.service(8)man:systemd-logind.service(8)8;;
>  8;;man:logind.conf(5)man:logind.conf(5)8;;
>
>  8;;man:org.freedesktop.login1(5)man:org.freedesktop.login1(5)8;;
>Main PID: 135 (systemd-logind)
>  Status: "Processing requests..."
>   Tasks: 1 (limit: 1060)
>  Memory: 632.0K
> CPU: 482ms
>  CGroup: /system.slice/systemd-logind.service
>  └─135 /lib/systemd/systemd-logind
>
> Feb 15 19:12:06 beagleboneblack systemd[1]: Starting
> systemd-logind.service...
> Feb 15 19:12:10 beagleboneblack systemd-logind[135]: New seat seat0.
> Feb 15 19:12:10 beagleboneblack systemd[1]: Started systemd-logind.service.
> #
> ```
>
> Thank you,
> -Sangeetha
>
>


Re: [systemd-devel] network signals

2024-02-06 Thread Mantas Mikulėnas
/org/freedesktop/network1 is the toplevel object that represents the
networkd daemon as a whole. Interface-specific signals would be sent for
the object paths of the individual interfaces, so you need to either:

a) call *GetLinkByName()* on the toplevel object to find the interface
object, and then subscribe for signals on that;

or b) instead of filtering by path='/xx', filter by *path_namespace='/xx'*
so that it will also match child objects.
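
For example, option (b) could be tried along these lines. This is a sketch;
the helper function is only there to keep the long match rule in one place,
and wlan0 is taken from your test.

```shell
# Build a match rule covering networkd's top-level object and every
# object below it (i.e. the per-link objects).
networkd_match_rule() {
    printf '%s' "type='signal',interface='org.freedesktop.DBus.Properties',member='PropertiesChanged',path_namespace='/org/freedesktop/network1'"
}

# Usage:
#   dbus-monitor --system "$(networkd_match_rule)"
# and for option (a), to find the object path of one link:
#   busctl call org.freedesktop.network1 /org/freedesktop/network1 \
#       org.freedesktop.network1.Manager GetLinkByName s wlan0
```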

On Tue, Feb 6, 2024 at 10:30 AM ashok athukuri 
wrote:

> Hello All,
>
>
> I'm trying to use DBUS signal "PropertiesChanged" of
> org.freedesktop.network1 /org/freedesktop/network1 *to detect network
> interface is up or down OR the speed of the interface is changed.*
>
> Is it possible? using a subscription to "PropertiesChanged" signal via the
> below interface or PATH ?
> If yes, how to test those, Currently I don't see any anything on my
> dbus-monitor.
>
> Here is how I am testing :
>
> # dbus-monitor --system "type='signal',path='/org/freedesktop/network1'"
> # ifconfig wlan0 down
>
> # busctl introspect org.freedesktop.network1 /org/freedesktop/network1
> NAME                                TYPE      SIGNATURE RESULT/VALUE FLAGS
> org.freedesktop.DBus.Introspectable interface -         -            -
> .Introspect                         method    -         s            -
> org.freedesktop.DBus.Peer           interface -         -            -
> .GetMachineId                       method    -         s            -
> .Ping                               method    -         -            -
> org.freedesktop.DBus.Properties     interface -         -            -
> .Get                                method    ss        v            -
> .GetAll                             method    s         a{sv}        -
> .Set                                method    ssv       -            -
> .PropertiesChanged                  signal    sa{sv}as  -            -
> org.freedesktop.network1.Manager    interface -         -            -
> .Describe                           method    -         s            -
> .DescribeLink                       method    i         s            -
> .ForceRenewLink                     method    i         -            -
> .GetLinkByIndex                     method    i         so           -
> .GetLinkByName                      method    s         io           -
> .ListLinks                          method    -         a(iso)       -
> .ReconfigureLink                    method    i         -            -
> .Reload                             method    -         -            -
> .RenewLink                          method    i         -            -
> .RevertLinkDNS                      method    i         -            -
> .RevertLinkNTP                      method    i         -            -
> .SetLinkDNS                         method    ia(iay)   -            -
> .SetLinkDNSEx                       method    ia(iayqs) -            -
> .SetLinkDNSOverTLS                  method    is        -            -
> .SetLinkDNSSEC                      method    is        -            -
> .SetLinkDNSSECNegativeTrustAnchors  method    ias       -            -
> .SetLinkDefaultRoute                method    ib        -            -
> .SetLinkDomains                     method    ia(sb)    -            -
> .SetLinkLLMNR                       method    is        -            -
> .SetLinkMulticastDNS                method    is        -            -
> .SetLinkNTP                         method    ias       -            -
> .AddressState                       property  s         "routable"   emits-change
> .CarrierState                       property  s         "carrier"    emits-change
> .IPv4AddressState                   property  s         "routable"   emits-change
> .IPv6AddressState                   property  s         "routable"   emits-change
> .NamespaceId                        property  t         4026531840   const
> .OnlineState                        property  s         "partial"    emits-change
> .OperationalState                   property  s         "routable"   emits-change
> root@MK3AC-WS100269:/var/lib/evse/cache#
>
> Thanks,
> Ashok
>
>

-- 
Mantas Mikulėnas


Re: [systemd-devel] Detecting Systemd crash

2024-02-05 Thread Mantas Mikulėnas
On Mon, Feb 5, 2024, 14:54 Lennart Poettering 
wrote:

> On So, 04.02.24 00:06, David Timber (d...@dev.snart.me) wrote:
>
> > 2: How do I get Systemd to freeze to test such program? I mean, if I kill
> > Systemd, the kernel would crash so I have to somehow tell Systemd to
> freeze?
>
> Not really, the kernel blocks SIGSTOP for PID1.
>

Attaching gdb to pid1 should do the job.


Re: [systemd-devel] Delaying VM startup until block devices are available

2024-01-25 Thread Mantas Mikulėnas
On Fri, Jan 26, 2024 at 1:29 AM Orion Poplawski  wrote:

> We have various VMs that are backed by LUKS-encrypted LVs.  At boot the
> volumes are decrypted by clevis.  The problem we are seeing at the moment
> is that the VMs are started before the block devices are decrypted.  Our
> current solution is:
>
> # cat /etc/systemd/system/virtqemud.service.d/override.conf
> [Unit]
> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>
> Where we list each of the volumes to be decrypted as blocking the virtqemud
> service.
>

> Does anyone have any better alternatives?  My main issue is that it feels
> somewhere in between fine-grained and coarse-grained control.
>
> Ideally I think one would be able to have each individual VM startup
> automatically delayed until the devices each used became available, but I
> don't see how to do this.
>

You can't really do this with systemd if it's not systemd that does the
startup... The libvirt daemons need to be patched to watch udev events and
wait for the devices they require.


>
> Alternatively it seems like one should be able to delay all VM startup
> until
> all volumes in /etc/crypttab were unlocked, rather than having to specify
> each
> one.  But I don't see a target for that.
>

If this were plain systemd-cryptsetup, you could add a drop-in for
"systemd-cryptsetup@.service" that adds Before=foo.target. I'm not sure if
clevis integrates with that. (Although honestly I don't see much point in
using clevis for data volumes at all – just use it for the rootfs, and
regular keyfiles in /etc/private for everything else...)
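
A minimal sketch of that drop-in, assuming plain systemd-cryptsetup rather than clevis. The target name vm-storage.target is made up here as a stand-in for "foo.target", and the function name and directory argument are only illustrative. The target would presumably still have to exist as a unit, and virtqemud.service would need After=vm-storage.target for the ordering to have any effect.

```shell
# Write a drop-in that orders every systemd-cryptsetup@ instance before a
# given target. Arguments:
#   $1  unit directory (e.g. /etc/systemd/system)
#   $2  target name (hypothetical example: vm-storage.target)
write_cryptsetup_dropin() {
    dropin_dir="$1/systemd-cryptsetup@.service.d"
    mkdir -p "$dropin_dir" || return 1
    printf '[Unit]\nBefore=%s\n' "$2" > "$dropin_dir/10-before-target.conf"
}

# Example (as root), followed by a reload of the unit files:
#   write_cryptsetup_dropin /etc/systemd/system vm-storage.target
#   systemctl daemon-reload
```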

-- 
Mantas Mikulėnas


Re: [systemd-devel] Permanently remove services

2024-01-20 Thread Mantas Mikulėnas
On Sat, Jan 20, 2024 at 8:02 AM Andrei Borzenkov 
wrote:

> On 19.01.2024 20:22, Mantas Mikulėnas wrote:
> > On Fri, Jan 19, 2024, 19:12 Morten Bo Johansen 
> wrote:
> >
> >> On 2024-01-19 Mantas Mikulėnas wrote:
> >>
> >>> In general I've learned to not quite trust what the firmware shows...
> >> we've
> >>> had a batch of Skylake-or-so desktops that *did* have a CPU-integrated
> >> fTPM
> >>> but it wasn't even mentioned until we did a BIOS update, even though
> CPU
> >>> spec said it should be present.
> >>>
> >>> However, your CPU is from Haswell era and according to the spec sheet
> it
> >>> definitely seems to lack Intel's PTT "built-in TPM 2.0" feature (it has
> >> the
> >>> older IPT but that's a different thing, not a TPM equivalent), so that
> >>> seems correct. If I understand correctly, the only option for that CPU
> >>> would be a discrete TPM chip, and if the manufacturer had bothered to
> >>> include one, it ought to be showing up in the BIOS settings.
> >>>
> >>> On the other hand, you said you have a /dev/tpm0... I'm somewhat
> curious
> >>> about whether there are any mentions of 'tpm' or 'tis' or something like
> >> that
> >>> in your `dmesg`?
> >>
> >> ~/ % dmesg | grep -i tpm
> >>
> >> [0.275738] tpm_tis 00:05: 1.2 TPM (device-id 0x0, rev-id 78)
> >>
> >
> > Well, that also looks like a TPM1.2 is present; it matches the absence of
> > /dev/tpmrm0 (which is a 2.0 thing).
> >
> > (It's not very useful in general; I've used it to store my SSH key in the
> > past, but it's slow and only does RSA-2048, and the software is
> completely
> > different from what's used for newer variants. You can use it through
> > TrouSerS + OpenCryptoki.)
> >
> > I wonder what makes systemd think it's a 2.0.
> >
>
> systemd does not check for TPM 2.0 at all. The conditions in these
> services are
>
> ConditionSecurity=measured-uki
> ConditionPathExists=!/run/systemd/tpm2-srk-public-key.pem
>
> Where "measured-uki" basically checks if specific EFI variable
> (StubPcrKernelImage) exists and has "correct" value.
>

That must be commits 03d808c and 9f32bb9 then.

-- 
Mantas Mikulėnas


Re: [systemd-devel] Permanently remove services

2024-01-19 Thread Mantas Mikulėnas
On Fri, Jan 19, 2024, 19:12 Morten Bo Johansen  wrote:

> On 2024-01-19 Mantas Mikulėnas wrote:
>
> > In general I've learned to not quite trust what the firmware shows...
> we've
> > had a batch of Skylake-or-so desktops that *did* have a CPU-integrated
> fTPM
> > but it wasn't even mentioned until we did a BIOS update, even though CPU
> > spec said it should be present.
> >
> > However, your CPU is from Haswell era and according to the spec sheet it
> > definitely seems to lack Intel's PTT "built-in TPM 2.0" feature (it has
> the
> > older IPT but that's a different thing, not a TPM equivalent), so that
> > seems correct. If I understand correctly, the only option for that CPU
> > would be a discrete TPM chip, and if the manufacturer had bothered to
> > include one, it ought to be showing up in the BIOS settings.
> >
> > On the other hand, you said you have a /dev/tpm0... I'm somewhat curious
> > about whether there are any mentions of 'tpm' or 'tis' or something like
> that
> > in your `dmesg`?
>
> ~/ % dmesg | grep -i tpm
>
> [0.275738] tpm_tis 00:05: 1.2 TPM (device-id 0x0, rev-id 78)
>

Well, that also looks like a TPM1.2 is present; it matches the absence of
/dev/tpmrm0 (which is a 2.0 thing).
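
As a rough sketch of the same check in script form: newer kernels expose the TPM major version in sysfs, and otherwise the presence of the resource-manager device is a hint. The function name tpm_family and its optional root argument (only there so it can be tried against a fake tree) are illustrative. The fallback guess assumes a kernel recent enough to create /dev/tpmrm0 for 2.0 chips.

```shell
# Guess the TPM family from what the kernel exposes. The optional argument
# is an alternative filesystem root, used only to make this testable.
tpm_family() {
    root="${1:-}"
    if [ -r "$root/sys/class/tpm/tpm0/tpm_version_major" ]; then
        # Newer kernels report the major version directly (1 or 2).
        cat "$root/sys/class/tpm/tpm0/tpm_version_major"
    elif [ -e "$root/dev/tpmrm0" ]; then
        echo 2      # the resource-manager device only exists for TPM 2.0
    elif [ -e "$root/dev/tpm0" ]; then
        echo 1      # /dev/tpm0 without tpmrm0 suggests TPM 1.2 (a guess)
    else
        echo none
    fi
}
```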

(It's not very useful in general; I've used it to store my SSH key in the
past, but it's slow and only does RSA-2048, and the software is completely
different from what's used for newer variants. You can use it through
TrouSerS + OpenCryptoki.)

I wonder what makes systemd think it's a 2.0.


Re: [systemd-devel] Permanently remove services

2024-01-19 Thread Mantas Mikulėnas
On Fri, Jan 19, 2024, 17:47 Morten Bo Johansen  wrote:

> On 2024-01-18 Lennart Poettering wrote:
>
> > On Do, 18.01.24 22:53, Morten Bo Johansen (morte...@hotmail.com) wrote:
> >
> >> ~/ % systemd-creds has-tpm2
> >> partial
> >> +firmware
> >> -driver
> >> +system
> >> +subsystem
> >> +libraries
> >
> > OK, so this indicates that your system has TPM support on all levels
> > with a single exception: you lack an actual Linux driver for your
> > specific hw. And that puzzles me, because to my knowledge at least
> > Linux should support all relevant TPM2 interfaces just fine. This
> > suggests that you haven't got the right modules installed.
>
> I think that perhaps systemd-creds gets it wrong? There really
> does not seem to be any TPM support on this computer, either
> version 1.2 or 2. In the bios settings, there is no "security
> chip" entry under the "Security" tab and no other settings
> pertaining to TPM in the bios at all.


In general I've learned to not quite trust what the firmware shows... we've
had a batch of Skylake-or-so desktops that *did* have a CPU-integrated fTPM
but it wasn't even mentioned until we did a BIOS update, even though CPU
spec said it should be present.

However, your CPU is from Haswell era and according to the spec sheet it
definitely seems to lack Intel's PTT "built-in TPM 2.0" feature (it has the
older IPT but that's a different thing, not a TPM equivalent), so that
seems correct. If I understand correctly, the only option for that CPU
would be a discrete TPM chip, and if the manufacturer had bothered to
include one, it ought to be showing up in the BIOS settings.

On the other hand, you said you have a /dev/tpm0... I'm somewhat curious
about whether there are any mentions of 'tpm' or 'tis' or something like that
in your `dmesg`?

> I ran Windows 11 in a VM
> to check what it thinks about it and it also says that there is
> no TPM support, either 1.2 or 2.
>

A virtual machine won't be able to see the real TPM either way (or any
other real hardware; it's kinda what makes it a virtual machine). All it
would see is a vTPM provided by the VM host software.


Re: Activation environment(s)?

2024-01-08 Thread Mantas Mikulėnas
The traditional dbus-daemon keeps a separate environment for services it
spawns directly (i.e. those that don't specify SystemdService= in their
D-Bus .service files), though that doesn't apply to services it runs via
systemd, so you need to keep both in sync.

On the other hand, dbus-broker runs everything via systemd (using transient
service units if necessary), and as far as I know it no longer keeps track
of a separate activation environment and all changes are just directly
forwarded to systemd's environment instead.

It depends on which implementation your distribution uses.
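
One way to find out which case applies, sketched under the assumption that on dbus-broker systems dbus.service is an alias whose canonical Id is dbus-broker.service (this is how several distributions ship it, but it is not guaranteed). The function name is illustrative.

```shell
# Classify the message bus implementation from the canonical unit name
# printed by 'systemctl show -p Id --value dbus.service'.
classify_bus_impl() {
    case "$1" in
        dbus-broker.service)
            echo "dbus-broker: activation goes through systemd" ;;
        dbus.service)
            echo "dbus-daemon: separate activation environment possible" ;;
        *)
            echo "unknown" ;;
    esac
}

# Usage for the user session bus:
#   classify_bus_impl "$(systemctl --user show -p Id --value dbus.service)"
```

With dbus-daemon, running dbus-update-activation-environment with the --systemd option is the usual way to update both environments in one step.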


On Mon, Jan 8, 2024, 17:58 Vladimir Kudrya  wrote:

> Hello.
>
> In context of a modern systemd-managed user session, is there a separate
> dbus activation environment, or is it merged with systemd? If one
> intends to manage environment variables, is systemctl (or
> org.freedesktop.systemd1.Manager Environment) enough?
>
> A variable added by dbus-update-activation-environment (even without
> --systemd option) shows up in systemd activation environment. I couldn't
> find a pure dbus service to check if reverse is true.
>
>
>
>


Re: Troubleshooting timedatectl and hostnamectl failed to activate service: timed out

2023-12-13 Thread Mantas Mikulėnas
Activation is not client-side, it's handled automatically by dbus-daemon –
which either spawns the service directly or starts it as a systemd service.

In this case, check whether your logs show systemd-hostnamed.service
attempting to start; either it fails to start (missing libraries?
Apparmor?) or dbus-daemon fails to contact systemd (pid1 crashed?).
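
To find which unit dbus-daemon will ask systemd to start for a given bus name, you can read the SystemdService= key from the D-Bus service file. A small sketch, assuming the file is named after the bus name and lives in the usual system-services directory (both are conventions, not guarantees); the function name and the optional directory argument are illustrative.

```shell
# Print the systemd unit that D-Bus activation would start for a bus name.
#   $1  bus name, e.g. org.freedesktop.hostname1
#   $2  directory with D-Bus service files (optional)
systemd_unit_for_bus_name() {
    dir="${2:-/usr/share/dbus-1/system-services}"
    sed -n 's/^SystemdService=//p' "$dir/$1.service"
}

# Usage: look at this boot's logs for that unit and for the bus itself.
#   unit=$(systemd_unit_for_bus_name org.freedesktop.hostname1)
#   journalctl -b -u "$unit" -u dbus.service --no-pager
```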

On Wed, Dec 13, 2023, 19:45 Sean Caron  wrote:

> Hi everyone,
>
> I'm on Ubuntu 20.04 LTS, kernel version 5.4.0-163-generic, systemd 245
> (245.4-4ubuntu3.22).
>
> I have some systems where I am receiving the following error messages when
> people attempt to use timedatectl or hostnamectl:
>
>
> Failed to query server: Failed to activate service
> 'org.freedesktop.timedate1': timed out (service_start_timeout=25000ms)
>
> Failed to query system properties: Failed to activate service
> 'org.freedesktop.hostname1': timed out (service_start_timeout=25000ms)
>
>
> I tried setting SYSTEMD_LOG_LEVEL=debug and rerunning the commands and it
> didn't really give me anything useful for determining the root cause of the
> issue. Here's an example of that output for timedatectl status:
>
>
> Bus n/a: changing state UNSET → OPENING
> Bus n/a: changing state OPENING → AUTHENTICATING
> Bus n/a: changing state AUTHENTICATING → HELLO
> Sent message type=method_call sender=n/a destination=org.freedesktop.DBus
> path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello
> cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
> Got message type=method_return sender=org.freedesktop.DBus
> destination=:1.15318 path=n/a interface=n/a member=n/a cookie=1
> reply_cookie=1 signature=s error-name=n/a error-message=n/a
> Bus n/a: changing state HELLO → RUNNING
> Sent message type=method_call sender=n/a
> destination=org.freedesktop.timedate1 path=/org/freedesktop/timedate1
> interface=org.freedesktop.DBus.Properties member=GetAll cookie=2
> reply_cookie=0 signature=s error-name=n/a error-message=n/a
> Got message type=error sender=org.freedesktop.DBus destination=:1.15318
> path=n/a interface=n/a member=n/a cookie=3 reply_cookie=2 signature=s
> error-name=org.freedesktop.DBus.Error.TimedOut error-message=Failed to
> activate service 'org.freedesktop.timedate1': timed out
> (service_start_timeout=25000ms)
> Failed to query server: Failed to activate service
> 'org.freedesktop.timedate1': timed out (service_start_timeout=25000ms)
> Bus n/a: changing state RUNNING → CLOSED
>
>
> I read that sometimes these issues can be caused by filesystem permissions
> on subdirectories in /var such as /var/tmp or /var/lib/systemd but I
> checked these and compared against a working system and I don't see any
> obvious differences.
>
> I have tried using strace on timedatectl and hostnamectl to try and see
> what's hanging things up but that hasn't really provided any fruitful
> direction, either.
>
> I didn't really know this was occurring until an end user reported it to
> me so I don't necessarily know how long the issue has been occurring or
> have a change in mind that could have broken things. I'm not sure if the
> upgrade from Ubuntu 18 to Ubuntu 20 broke it, or if some security
> configuration broke it. Or perhaps there is a missing dependency package on
> the broken systems?
>
> Could anyone out there please provide a little bit more guidance on how I
> might troubleshoot this and determine the root cause of the issue? I
> really hate to bother folks here but I'm feeling stuck.
>
> Thank you!
>
> Sean
>


Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-13 Thread Mantas Mikulėnas
On Wed, Dec 13, 2023 at 10:36 AM Christopher Wong 
wrote:

> Hi Mantas,
>
>
>
> I tried with StopWhenUnneeded=no in user-runtime-dir@.service, then when
> user@1001.service fails the status of user-runtime-dir@.service is
> active. At this state the directory /run/user/1001 is created, it is empty,
> owned by root. Running the mount command doesn’t show /run/user/1001.
>

Run the "/usr/lib/systemd/systemd-user-runtime-dir start 1001" manually and
check whether the mounted filesystem is there afterwards.

If it's still not there, then run "mount -t tmpfs -o uid=1001,mode=0700
none /run/user/1001" and then check whether it stays mounted.
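
Since findmnt isn't available on your system, here is a small sketch for the "check whether it stays mounted" part that reads /proc/self/mountinfo directly. The function name and the optional second argument (an alternative mountinfo file, for testing) are illustrative.

```shell
# Succeed if $1 is a mount point according to a mountinfo file
# (default: /proc/self/mountinfo). Field 5 of each line is the mount point.
is_mountpoint() {
    awk -v p="$1" '$5 == p { found = 1 } END { exit !found }' \
        "${2:-/proc/self/mountinfo}"
}

# Usage (as root):
#   /usr/lib/systemd/systemd-user-runtime-dir start 1001
#   is_mountpoint /run/user/1001 && echo mounted || echo "not mounted"
```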


>
>
> I have mentioned it before, but I want to point out that if I put 
> “ExecStartPre=+chown
> %i /run/user/%i” in user@.service then the user@1001.service can be
> started manually. The mount command doesn’t show /run/user/1001 either, but
> since the service is started the path contains bus socket and systemd
> directory with content, which are the things I am after.
>
>
>
> The main issue here is that /run/user/1001 is owned by root after
> user-runtime-dir@.service has been exited successfully.
>

No, that's only a symptom of the main issue.

The current design that systemd implements is to have a user-specific tmpfs
mounted at that location (for quota purposes), and so the underlying
mountpoint is deliberately created as owned by root – its ownership is not
changed because it's supposed to have a new filesystem mounted on top
(which would make the mountpoint hidden and its ownership moot).

If you specifically want to *not* have an additional tmpfs there, then you
can continue using the manual "ExecStartPre=chown" (or in fact you could
replace the entire user-runtime-dir@ with a simpler one that only mkdirs
and chowns), but in that case you shouldn't be saying that it's a systemd
issue that it doesn't chown something that it was never meant to chown to
begin with.
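
For illustration, such a simpler replacement could look roughly like the unit printed below. This is only a sketch and not the unit shipped by systemd: the /bin paths, the oneshot layout and the cleanup in ExecStop are all assumptions, and the function name is made up.

```shell
# Print a stripped-down replacement for user-runtime-dir@.service that only
# creates and chowns the directory, without mounting a tmpfs on it.
print_simple_runtime_dir_unit() {
    cat <<'EOF'
[Unit]
Description=User Runtime Directory /run/user/%i (plain directory)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/mkdir -p /run/user/%i
ExecStart=/bin/chown %i /run/user/%i
ExecStart=/bin/chmod 0700 /run/user/%i
ExecStop=/bin/rm -rf /run/user/%i
EOF
}

# Example (as root):
#   print_simple_runtime_dir_unit > /etc/systemd/system/user-runtime-dir@.service
#   systemctl daemon-reload
```

Note that without the tmpfs the per-user size limit is gone, so anything the user writes there counts against /run as a whole.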


>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Wednesday, 13 December 2023 at 08:08
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> On Tue, Dec 12, 2023 at 6:15 PM Christopher Wong <
> christopher.w...@axis.com> wrote:
>
> Hi Mantas,
>
>
>
> After user@1001.service failed, it triggered the stopping process and
> became inactive.
>
>
>
> Ah yeah, that makes sense, user-runtime-dir@ has StopWhenUnneeded=yes –
> so of course after user@1001 crashes you're not going to see anything
> mounted anymore.
>
>
>
> Could you try temporarily removing that option / setting it to 'no', just
> to see what changes?
>
>
>
>
>
> ○ user-runtime-dir@1001.service - User Runtime Directory /run/user/1001
>
>  Loaded: loaded (/etc/systemd/system/user-runtime-dir@.service;
> static)
>
> Drop-In: /usr/lib/systemd/system/service.d
>
>  └─10-axis.conf, 20-axis-sandbox.conf
>
>  Active: inactive (dead) since Tue 2023-12-12 16:33:35 CET; 36min ago
>
>Duration: 315ms
>
>Docs: man:user@.service(5)
>
> Process: 16325 ExecStartPre=ls -la /run/user (code=exited,
> status=0/SUCCESS)
>
> Process: 16327 ExecStartPre=mount (code=exited, status=0/SUCCESS)
>
> Process: 16329 ExecStart=/usr/lib/systemd/systemd-user-runtime-dir
> start 1001 (code=exited, status=0/SUCCESS)
>
> Process: 16334 ExecStartPost=sleep 5 (code=exited, status=0/SUCCESS)
>
> Process: 16347 ExecStartPost=ls -la /run/user/1001 (code=exited,
> status=0/SUCCESS)
>
> Process: 16351 ExecStartPost=mount (code=exited, status=0/SUCCESS)
>
> Process: 16361 ExecStop=/usr/lib/systemd/systemd-user-runtime-dir
> stop 1001 (code=exited, status=0/SUCCESS)
>
>Main PID: 16329 (code=exited, status=0/SUCCESS)
>
> CPU: 48ms
>
>
>
> /etc/fstab doesn’t include anything for /run/user/1001 and there is no
> run-user-1001.mount unit either.
>
>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Tuesday, 12 December 2023 at 17:05
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> That sounds like it's getting immediately unmounted (or maybe not being
> mounted at all despite the program doing so).
>
>
>
> Does the user-runtime-dir service continue to show as "active" after this,
> or does it return to "inactive"?
>
>
>
> Does your /etc/fstab have any ment

Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-12 Thread Mantas Mikulėnas
On Tue, Dec 12, 2023 at 6:15 PM Christopher Wong 
wrote:

> Hi Mantas,
>
>
>
> After user@1001.service failed, it triggered the stopping process and
> became inactive.
>

Ah yeah, that makes sense, user-runtime-dir@ has StopWhenUnneeded=yes – so
of course after user@1001 crashes you're not going to see anything mounted
anymore.

Could you try temporarily removing that option / setting it to 'no', just
to see what changes?


>
>
> ○ user-runtime-dir@1001.service - User Runtime Directory /run/user/1001
>
>  Loaded: loaded (/etc/systemd/system/user-runtime-dir@.service;
> static)
>
> Drop-In: /usr/lib/systemd/system/service.d
>
>  └─10-axis.conf, 20-axis-sandbox.conf
>
>  Active: inactive (dead) since Tue 2023-12-12 16:33:35 CET; 36min ago
>
>Duration: 315ms
>
>Docs: man:user@.service(5)
>
> Process: 16325 ExecStartPre=ls -la /run/user (code=exited,
> status=0/SUCCESS)
>
> Process: 16327 ExecStartPre=mount (code=exited, status=0/SUCCESS)
>
> Process: 16329 ExecStart=/usr/lib/systemd/systemd-user-runtime-dir
> start 1001 (code=exited, status=0/SUCCESS)
>
> Process: 16334 ExecStartPost=sleep 5 (code=exited, status=0/SUCCESS)
>
> Process: 16347 ExecStartPost=ls -la /run/user/1001 (code=exited,
> status=0/SUCCESS)
>
> Process: 16351 ExecStartPost=mount (code=exited, status=0/SUCCESS)
>
> Process: 16361 ExecStop=/usr/lib/systemd/systemd-user-runtime-dir
> stop 1001 (code=exited, status=0/SUCCESS)
>
>Main PID: 16329 (code=exited, status=0/SUCCESS)
>
> CPU: 48ms
>
>
>
> /etc/fstab doesn’t include anything for /run/user/1001 and there is no
> run-user-1001.mount unit either.
>
>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Tuesday, 12 December 2023 at 17:05
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> That sounds like it's getting immediately unmounted (or maybe not being
> mounted at all despite the program doing so).
>
>
>
> Does the user-runtime-dir service continue to show as "active" after this,
> or does it return to "inactive"?
>
>
>
> Does your /etc/fstab have any mentions of /run/user/1001? Or more
> generally, are there any run-user-1001.mount units? (If you 'systemctl
> status' this unit, does the status include a source path?)
>
>
>
> On Tue, Dec 12, 2023, 17:34 Christopher Wong 
> wrote:
>
> Hi Mantas,
>
>
>
> I currently have the following flow:
>
>
>
>1. No /run/user/1001 directory
>2. systemctl start user@1001.service
>3. systemd start user-runtime-dir@1001.service which ends successfully.
>4. The directory /run/user/1001 exists now, but is empty, owned by
>root with mode 0700
>5. I don’t have findmnt on my system, so I used mount, but
>/run/user/1001 is not listed.
>6. systemd start user@1001.service which fails due to permission
>denied.
>
>
>
> I can’t explain why the /run/user/1001 is owned by root after
> user-runtime-dir@1001.service successfully exited. I added some personal
> print in systemd code to ensure that the mount command returned success
> (r=0). Although, the mount was successful the command “mount” didn’t list
> it. In the list of mounts starting with /run I could only find these
> entries:
>
>
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run type tmpfs
> (rw,nosuid,nodev,mode=755)
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run/credentials type tmpfs
> (ro,nosuid,nodev,noexec,mode=755)
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run/systemd/incoming type
> tmpfs (ro,nosuid,nodev,mode=755)
>
>
>
> If I do a chown of the directory in user@1001.service then it works
>
>
>
> root@host:/run/user# ls -la 1001
>
> drwx------    3 ida      root            80 Dec 12 16:19 .
>
> drwxr-xr-x    3 root     root            60 Dec 12 16:19 ..
>
> srw-rw-rw-    1 ida      ssh-user         0 Dec 12 16:19 bus
>
> drwxr-xr-x    5 ida      ssh-user       140 Dec 12 16:19 systemd
>
>
>
> The ”mount” command don’t list /run/user/1001 for the successful case
> either.
>
>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Monday, 11 December 2023 at 17:56
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> On Mon, Dec 11, 2023, 17:28 Christopher Wong 
> wrote:
>
> Hi Mantas

Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-12 Thread Mantas Mikulėnas
That sounds like it's getting immediately unmounted (or maybe not being
mounted at all despite the program doing so).

Does the user-runtime-dir service continue to show as "active" after this,
or does it return to "inactive"?

Does your /etc/fstab have any mentions of /run/user/1001? Or more
generally, are there any run-user-1001.mount units? (If you 'systemctl
status' this unit, does the status include a source path?)

On Tue, Dec 12, 2023, 17:34 Christopher Wong 
wrote:

> Hi Mantas,
>
>
>
> I currently have the following flow:
>
>
>
>1. No /run/user/1001 directory
>2. systemctl start user@1001.service
>3. systemd start user-runtime-dir@1001.service which ends successfully.
>4. The directory /run/user/1001 exists now, but is empty, owned by
>root with mode 0700
>5. I don’t have findmnt on my system, so I used mount, but
>/run/user/1001 is not listed.
>6. systemd start user@1001.service which fails due to permission
>denied.
>
>
>
> I can’t explain why the /run/user/1001 is owned by root after
> user-runtime-dir@1001.service successfully exited. I added some personal
> print in systemd code to ensure that the mount command returned success
> (r=0). Although, the mount was successful the command “mount” didn’t list
> it. In the list of mounts starting with /run I could only find these
> entries:
>
>
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run type tmpfs
> (rw,nosuid,nodev,mode=755)
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run/credentials type tmpfs
> (ro,nosuid,nodev,noexec,mode=755)
>
> Dec 12 16:19:35 host mount[14500]: tmpfs on /run/systemd/incoming type
> tmpfs (ro,nosuid,nodev,mode=755)
>
>
>
> If I do a chown of the directory in user@1001.service then it works
>
>
>
> root@host:/run/user# ls -la 1001
>
> drwx------    3 ida      root            80 Dec 12 16:19 .
>
> drwxr-xr-x    3 root     root            60 Dec 12 16:19 ..
>
> srw-rw-rw-    1 ida      ssh-user         0 Dec 12 16:19 bus
>
> drwxr-xr-x    5 ida      ssh-user       140 Dec 12 16:19 systemd
>
>
>
> The ”mount” command don’t list /run/user/1001 for the successful case
> either.
>
>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Monday, 11 December 2023 at 17:56
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> On Mon, Dec 11, 2023, 17:28 Christopher Wong 
> wrote:
>
> Hi Mantas,
>
>
>
> I have added ExecStartPre to user@.service to run “id” and “ls -la”:
>
>
>
> Dec 11 15:50:34 host systemd-user-runtime-dir[40287]: Will mount
> /run/user/1001 owned by 1001:118
>
> Dec 11 15:50:34 host systemd-user-runtime-dir[40287]: Mounting tmpfs
> (tmpfs) on /run/user/1001 (MS_NOSUID|MS_NODEV
> "mode=0700,uid=1001,gid=118,size=99426304,nr_inodes=24274")...
>
> Dec 11 15:50:34 host systemd[1]: Finished User Runtime Directory
> /run/user/1001.
>
> Dec 11 15:50:34 host systemd[1]: Starting User Manager for UID 1001...
>
> Dec 11 15:50:34 host id[40291]: uid=1001(ida) gid=118(ssh-users)
> groups=118(ssh-users),236(systemd-journal)
>
> Dec 11 15:50:34 host ls[40293]: drwxr-xr-x    3 root root   60 Dec 11 15:50 .
>
> Dec 11 15:50:34 host ls[40293]: drwxr-xr-x   98 root root 2120 Dec 11 15:30 ..
>
> Dec 11 15:50:34 host ls[40293]: drwx------    2 root root   40 Dec 11 15:50 1001
>
> Dec 11 15:50:34 host systemd[40294]: systemd 254.7-2-g9edc143 running in
> user mode for user 1001/ida. (-PAM -AUDIT -SELINUX -APPARMOR +IMA -SMACK
> +SECCOMP +GCRYPT +GNUTLS +OPENSSL -ACL +BLKID +CURL -ELFUTILS -FIDO2 -IDN2
> -IDN -IPTC +KMOD -LIBCRYPTSETUP +LIBFDISK -PCRE2 -PWQUALITY -P11KIT
> -QRENCODE -TPM2 +BZIP2 -LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON -UTMP
> -SYSVINIT default-hierarchy=unified)
>
>
>
> The /run/user/1001 belongs to root with mode 0700. Should this belong to
> root?
>
> No, it should be owned by UID 1001 (though still mode 0700).
>
> Is it because I manually start user@1001.service as root?
>
> Which user started the .service is usually not important, all services get
> a "fresh" environment that's fully described by the unit file.
>
>
>
> So even if you did 'systemctl start' as root, the unit has User=%i so the
> instance parameter tells it which UID to run as, so will be running as UID
> 1001. Likewise user-runtime-dir@1001 will get the UID for the mount from
> its instance name (you can see that the "Mounting tmpfs" message has the
> correct information).
>

Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-11 Thread Mantas Mikulėnas
On Mon, Dec 11, 2023, 17:28 Christopher Wong 
wrote:

> Hi Mantas,
>
>
>
> I have added ExecStartPre to user@.service to run “id” and “ls -la”:
>
>
>
> Dec 11 15:50:34 host systemd-user-runtime-dir[40287]: Will mount
> /run/user/1001 owned by 1001:118
>
> Dec 11 15:50:34 host systemd-user-runtime-dir[40287]: Mounting tmpfs
> (tmpfs) on /run/user/1001 (MS_NOSUID|MS_NODEV
> "mode=0700,uid=1001,gid=118,size=99426304,nr_inodes=24274")...
>
> Dec 11 15:50:34 host systemd[1]: Finished User Runtime Directory
> /run/user/1001.
>
> Dec 11 15:50:34 host systemd[1]: Starting User Manager for UID 1001...
>
> Dec 11 15:50:34 host id[40291]: uid=1001(ida) gid=118(ssh-users)
> groups=118(ssh-users),236(systemd-journal)
>
> Dec 11 15:50:34 host ls[40293]: drwxr-xr-x    3 root root   60 Dec 11 15:50 .
>
> Dec 11 15:50:34 host ls[40293]: drwxr-xr-x   98 root root 2120 Dec 11 15:30 ..
>
> Dec 11 15:50:34 host ls[40293]: drwx------    2 root root   40 Dec 11 15:50 1001
>
> Dec 11 15:50:34 host systemd[40294]: systemd 254.7-2-g9edc143 running in
> user mode for user 1001/ida. (-PAM -AUDIT -SELINUX -APPARMOR +IMA -SMACK
> +SECCOMP +GCRYPT +GNUTLS +OPENSSL -ACL +BLKID +CURL -ELFUTILS -FIDO2 -IDN2
> -IDN -IPTC +KMOD -LIBCRYPTSETUP +LIBFDISK -PCRE2 -PWQUALITY -P11KIT
> -QRENCODE -TPM2 +BZIP2 -LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON -UTMP
> -SYSVINIT default-hierarchy=unified)
>
>
>
> The /run/user/1001 belongs to root with mode 0700. Should this belong to
> root?
>
No, it should be owned by UID 1001 (though still mode 0700).

> Is it because I manually start user@1001.service as root?
>
Which user started the .service is usually not important, all services get
a "fresh" environment that's fully described by the unit file.

So even if you did 'systemctl start' as root, the unit has User=%i, so the
instance parameter tells it which UID to run as, and it will be running as
UID 1001. Likewise user-runtime-dir@1001 will get the UID for the mount from
its instance name (you can see that the "Mounting tmpfs" message has the
correct information).

> However, after user-runtime-dir@1001.service has finished it startup,
> the user@1001.service is started as uid=1001 and therefore can’t create
> any directories under /run/user/1001. Resulting in user@1001.service
> failed to start.
>
>
>
> If I add “ExecStartPre=+chown %i /run/user/%i” to user@.service then it
> works! But I am unsure if this is really the way fix this.
>

So far, it sounds like the directory is being created *by something else*
before user-runtime-dir@ is even invoked.

Try adding the same "-/bin/ls -lad /run/user/%i" as both ExecStartPre and
ExecStartPost of user-runtime-dir@ (and maybe even a findmnt). If the
directory already exists during ExecStartPre, start looking for other
services or cronjobs, or tmpfiles.d configs, or 'su' invocations, which may
cause it to be created.
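
A sketch of that debugging drop-in, so the stock unit file doesn't need to be edited. The function name, the drop-in file name and the directory argument are illustrative; the two Exec lines are the ones suggested above.

```shell
# Write a temporary debugging drop-in for user-runtime-dir@.service that
# lists the runtime directory before and after the real ExecStart runs.
#   $1  unit directory (e.g. /etc/systemd/system)
write_runtime_dir_debug_dropin() {
    dropin_dir="$1/user-runtime-dir@.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/90-debug-ls.conf" <<'EOF'
[Service]
ExecStartPre=-/bin/ls -lad /run/user/%i
ExecStartPost=-/bin/ls -lad /run/user/%i
EOF
}

# Example (as root):
#   write_runtime_dir_debug_dropin /etc/systemd/system
#   systemctl daemon-reload
```

The leading "-" makes systemd ignore a failing ls, which matters for ExecStartPre because the directory is not expected to exist yet at that point.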

There might also be something that chowns it to root *after* it was created
correctly. If you actually see the tmpfs mount in 'findmnt' or in 'mount',
but it's owned by root despite having uid=1001 in its mount options,
something has chowned it...or your tmpfs feature is broken.

If you don't see it in findmnt at all, even after user-runtime-dir has
succeeded – either the mount failed quietly, or... something (like systemd
itself) has quietly unmounted it.


>
> Regarding the testing, I have done both restart of everything and manual,
> but the result is the same. Now that I have the
> “Environment=XDG_RUNTIME_DIR=/run/user/%i” I no longer need to do
> “systemctl set-environment …”
>
>
>
> Thank you for taking your time!
>
>
>
> Best regards,
>
> Christopher Wong
>
>
>
>
>
> *From: *Mantas Mikulėnas 
> *Date: *Friday, 8 December 2023 at 21:53
> *To: *Christopher Wong 
> *Cc: *Systemd 
> *Subject: *Re: [systemd-devel] Manual start of user@.service failed
> with permission denied
>
> On Fri, Dec 8, 2023 at 6:53 PM Christopher Wong 
> wrote:
>
> Hi Mantas,
>
>
>
> I have from your suggestion done the following:
>
>
>
> Putting the below in user@.service
>
>
>
> [Service]
>
> ...
>
> Environment=XDG_RUNTIME_DIR=/run/user/%i
>
> Environment=SYSTEMD_LOG_LEVEL=debug
>
>
>
> Putting the below in user-runtime-dir@.service
>
>
>
> [Service]
>
> ...
>
> Environment=SYSTEMD_LOG_LEVEL=debug
>
>
>
> Then I have disabled the global set-log-level debug (if this is also
> required, please let me know).
>
>
>
> Unlike set-environment that's not global, it only affects pid1.
>
>
>
>
>
> What I can see from the log

Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-08 Thread Mantas Mikulėnas
 ignoring: Permission denied
>
> Dec 08 17:33:29 host systemd[36280]: Failed to create
> '/run/user/1001/systemd/inaccessible/sock', ignoring: Permission denied
>
> Dec 08 17:33:29 host systemd[36280]: Failed to create
> '/run/user/1001/systemd/inaccessible/chr', ignoring: Permission denied
>
> Dec 08 17:33:29 host systemd[36280]: Failed to create
> '/run/user/1001/systemd/inaccessible/blk', ignoring: Permission denied
>

What's the ownership of /run/user/1001 and /run/user/1001/systemd after all
of this?

Are you rebooting between tests or just manually starting it?

My current guess is that due to the earlier `systemctl set-environment`,
some *other* thing that's running as root inherited the /run/user/1001 path
and created root-owned directories there? That's the issue with setting
global environment, it needs to be unset afterwards...

-- 
Mantas Mikulėnas


Re: [systemd-devel] Manual start of user@.service failed with permission denied

2023-12-08 Thread Mantas Mikulėnas
On Fri, Dec 8, 2023, 12:22 Christopher Wong 
wrote:

> Hi Luca,
>
>
>
> Sorry, for late reply, below is a log with debug. This time I run with a
> user with higher UID, but the result is the same.
>
>
>
> root@host:~# systemd-analyze set-log-level debug
>
> root@host:~# systemctl set-environment XDG_RUNTIME_DIR="/run/user/1001"
>

I'd avoid doing that globally. If you really want to have a PAM-less
system, then edit the unit to set this through its Environment= instead.

> root@host:~# systemctl start user@1001.service
>
> Job for user@1001.service failed because the control process exited with
> error code.
>
> See "systemctl status user@1001.service" and "journalctl -xeu
> user@1001.service" for details.
>
> root@host:~# journalctl -xeu user@1001.service
>
> Dec 08 09:35:53 host systemd[1]: /usr/lib/systemd/system/user@.service:19:
> Support for option PAMName= has been disabled at compile time and it is
> ignored
>
> Dec 08 09:35:53 host systemd[1]: user@1001.service: Trying to enqueue job
> user@1001.service/start/replace
>
> Dec 08 09:35:53 host systemd[1]: user@1001.service: Installed new job
> user@1001.service/start as 6724
>
> Dec 08 09:35:53 host systemd[1]: user@1001.service: Enqueued job
> user@1001.service/start as 6724
>
> Dec 08 09:35:53 host systemd[1]: user@1001.service: starting held back,
> waiting for: user-runtime-dir@1001.service
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: Will spawn child
> (service_enter_start): /usr/lib/systemd/systemd
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: Failed to set
> 'memory.zswap.max' attribute on
> '/user.slice/user-1001.slice/user@1001.service' to 'max': No such file or
> directory
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: Passing 0 fds to
> service
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: About to execute:
> /usr/lib/systemd/systemd --user
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: Forked
> /usr/lib/systemd/systemd as 6899
>
> Dec 08 09:35:54 host (systemd)[6899]: Found cgroup2 on /sys/fs/cgroup/,
> full unified hierarchy
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: Changed dead -> start
>
> Dec 08 09:35:54 host systemd[1]: Starting User Manager for UID 1001...
>
> Dec 08 09:35:54 host (systemd)[6899]: Bind-mounting / on
> /run/systemd/mount-rootfs (MS_BIND|MS_REC "")...
>
> Dec 08 09:35:54 host systemd[1]: user@1001.service: User lookup
> succeeded: uid=1001 gid=118
>
> Dec 08 09:35:54 host (systemd)[6899]: Applying namespace mount on
> /run/systemd/mount-rootfs/run/credentials
>
> Dec 08 09:35:54 host (systemd)[6899]: Bind-mounting
> /run/systemd/inaccessible/dir on /run/systemd/mount-rootfs/run/credentials
> (MS_BIND|MS_REC "")...
>
> Dec 08 09:35:54 host (systemd)[6899]: Successfully mounted
> /run/systemd/inaccessible/dir to /run/systemd/mount-rootfs/run/credentials
>
> Dec 08 09:35:54 host (systemd)[6899]: Applying namespace mount on
> /run/systemd/mount-rootfs/run/systemd/incoming
>
> Dec 08 09:35:54 host (systemd)[6899]: Followed source symlinks
> /run/systemd/propagate/user@1001.service →
> /run/systemd/propagate/user@1001.service.
>
> Dec 08 09:35:54 host (systemd)[6899]: Bind-mounting
> /run/systemd/propagate/user@1001.service on
> /run/systemd/mount-rootfs/run/systemd/incoming (MS_BIND "")...
>
> Dec 08 09:35:54 host (systemd)[6899]: Successfully mounted
> /run/systemd/propagate/user@1001.service to
> /run/systemd/mount-rootfs/run/systemd/incoming
>
> Dec 08 09:35:54 host (systemd)[6899]: Applying namespace mount on
> /run/systemd/mount-rootfs/sys
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Failed to umount
> /run/systemd/mount-rootfs/sys, ignoring: Device or resource busy
>
> Dec 08 09:35:54 host (systemd)[6899]: Mounting sysfs (sysfs) on
> /run/systemd/mount-rootfs/sys (MS_NOSUID|MS_NODEV|MS_NOEXEC "")...
>
> Dec 08 09:35:54 host (systemd)[6899]: user@1001.service: Executing:
> /usr/lib/systemd/systemd --user
>
> Dec 08 09:35:54 host systemd[6899]: Failed to copy os-release for
> propagation, ignoring: Permission denied
>
> Dec 08 09:35:54 host systemd[6899]: Failed to allocate manager object:
> Perm

Re: how to keep eth link down across reboots ?

2023-12-07 Thread Mantas Mikulėnas
On boot, interfaces are admin-down *by default* until something explicitly
brings them up. If you don't configure any network management software to
bring eth0 up, then it'll be down.
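
To confirm that nothing is bringing the link up, the administrative state can be read from sysfs. A minimal sketch, assuming the interface is called eth0 as above; the helper names are made up for illustration, and IFF_UP is the 0x1 flag bit from <linux/if.h>.

```python
IFF_UP = 0x1  # administrative "up" flag, as defined in <linux/if.h>


def is_admin_up(flags_text):
    """Interpret the contents of /sys/class/net/<iface>/flags (hex, e.g. '0x1002')."""
    return bool(int(flags_text.strip(), 16) & IFF_UP)


def read_admin_state(iface, sysfs_root="/sys/class/net"):
    """Return True if the named interface is administratively up."""
    with open(f"{sysfs_root}/{iface}/flags") as fh:
        return is_admin_up(fh.read())
```

Calling `read_admin_state("eth0")` after boot should return False for as long as no network manager has touched the interface.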

On Thu, Dec 7, 2023, 22:52 lejeczek  wrote:

> Hi guys.
>
> Perhaps not strictly _systemd_ question but community here surely is
> capable - a matter of me being lucky - how would you keep an Ethernet
> link/port powered down?
> I was thinking I'll try first _udev_ rules - given other tools/managers
> are told to stay away from the link/port
> Is there a better, best way to put such link/port down & keep it that way
> - naturally, please steer clear of "unplug the cable" type of ideas.
>
> many thanks, L.
>


Re: [systemd-devel] WSL Ubuntu creates XDG_RUNTIME_DIR with incorrect permissions

2023-11-29 Thread Mantas Mikulėnas
On Wed, Nov 29, 2023, 20:59 Thomas Larsen Wessel  wrote:

> Thanks both of you! :)
>
> I have taken some time to digest your answers. And in particular I have
> tried to investigate this line closer:
>
> *Nov 27 12:34:22 tumbleweed unknown: WSL (2): Creating login session for
> andrei*
>
> I have found the equivalent log line on my WSL Ubuntu. I was hoping I
> could find out more about where it's coming from, i.e. which process / service
> prints this. But journalctl does not tell me much about the origin.
>
> journalctl -b --grep "Creating login session for velle" -o verbose
> Wed 2023-11-29 18:41:19.982271 CET [s=d318bdab5d1f4ad7a48a947e6fff4a01;i=2d53;b=c8682ff139cf40da8326fd63d7c34d7c;m=1649>
>     _TRANSPORT=kernel
>     _MACHINE_ID=967980c77d4743298ceaeb5d512bf388
>     _HOSTNAME=ELCON45223
>     PRIORITY=6
>     SYSLOG_FACILITY=1
>     MESSAGE=WSL (2): Creating login session for velle
>     _BOOT_ID=c8682ff139cf40da8326fd63d7c34d7c
>     _SOURCE_MONOTONIC_TIMESTAMP=23368229
>
> Most log entries in journalctl have a _PID field, but some don't, and this
> one does not. Why? What does it tell us when a log entry has no _PID? As far
> as I know every process has a PID, even systemd itself has a PID (which is
> always 1). Or am I wrong about that? I see no reason why those PIDs are
> not saved together with the log entries.
>

According to _TRANSPORT the message went through the kernel log (dmesg),
i.e. it was either kernel-generated (no PID) or it was written by the
process to /dev/kmsg (PID information not preserved) rather than being sent
the usual way through syslog. There might have been a PID but journald had
no way to obtain it.
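
To see how common this is on a given system, one could filter `journalctl -o json` output for entries that lack `_PID` and look at their `_TRANSPORT`. A rough sketch; the function name is hypothetical, and it only assumes one JSON object per line, which is what `-o json` emits.

```python
import json


def entries_without_pid(json_lines):
    """Yield (transport, message) for journal entries that carry no _PID field.

    json_lines: iterable of text lines as produced by `journalctl -o json`.
    """
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if "_PID" not in entry:
            yield entry.get("_TRANSPORT"), entry.get("MESSAGE")
```

Feeding it the lines of `journalctl -b -o json` should mostly turn up entries whose transport is `kernel`, like the one quoted above.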

That aside, another thing about WSL2 is that the entire VM actually boots a
"system distro" first and the user-facing Ubuntu distro is started as a
container. So there are several processes that run within the VM but exist
outside of the container's PID namespace and therefore don't have PIDs from
the Ubuntu container's PoV; only the "host" namespace has PIDs for them.

(Consider how a container's "PID 1" looks from outside the container...)



>
>
> On Mon, Nov 27, 2023 at 10:37 AM Andrei Borzenkov 
> wrote:
>
>> On Mon, Nov 27, 2023 at 1:06 AM Thomas Larsen Wessel 
>> wrote:
>> >>
>> >> WSL does not use systemd by default.
>> >
>> >
>> > According to this article, systemd has been the default on WSL Ubuntu
>> since June 2023. https://learn.microsoft.com/en-us/windows/wsl/systemd
>> >
>> > "Systemd is now the default for the current version of Ubuntu that will
>> be installed using the wsl --install command default."
>> >
>> > Also when I look in the /var/log/auth.log, there are many lines with
>> systemd, e.g.:
>> >
>> > Nov 25 22:30:14 ELCON45223 systemd-logind[155]: New session 6 of user
>> velle.
>> > Nov 25 22:30:14 ELCON45223 systemd: pam_unix(systemd-user:session):
>> session opened for user velle(uid=1000) by (uid=0)
>> >
>> > Could someone please help me understand exactly which part creates this
>> XDG_RUNTIME_DIR folder?
>>
>> /run/user/$UID for the "console" session (the one you get when
>> starting a WSL instance) is created by WSL before systemd. Adding "ls
>> -l /run/user" to user-runtime-dir@1000.service ExecStartPre:
>>
>> Nov 27 12:34:22 tumbleweed unknown: WSL (2) ERROR:
>> WaitForBootProcess:3237: /sbin/init failed to start within 1
>> Nov 27 12:34:22 tumbleweed unknown: ms
>> Nov 27 12:34:22 tumbleweed unknown: WSL (2): Creating login session for
>> andrei
>> ...
>> Nov 27 12:34:22 tumbleweed systemd[1]: Created slice User Slice of UID
>> 1000.
>> Nov 27 12:34:22 tumbleweed systemd[1]: Starting User Runtime Directory
>> /run/user/1000...
>> Nov 27 12:34:22 tumbleweed ls[520]: total 0
>> Nov 27 12:34:22 tumbleweed ls[520]: drwxr-xr-x 4 andrei users 120 Nov
>> 27 12:34 1000
>> Nov 27 12:34:22 tumbleweed systemd-logind[160]: New session 11 of user
>> andrei.
>> Nov 27 12:34:22 tumbleweed systemd[1]: Finished User Runtime Directory
>> /run/user/1000.
>>
>> So logind invokes user-runtime-dir@1000.service, but it sees the
>> existing directory and does nothing. I would suggest asking this
>> question on WSL support channels.
>>
>> > Is it part of the systemd repo or not? And if the answer is (or may be)
>> different between Ubuntu and WSL Ubuntu, I would be happy if you share what
>> you know about any of those cases :) Right now, I barely know where to
>> report this issue.
>> >
>> >
>> > On Sun, Nov 26, 2023 at 10:07 AM Andrei Borzenkov 
>> wrote:
>> >>
>> >> On 26.11.2023 02:39, Thomas Larsen Wessel wrote:
>> >> > I set up WSL on Windows 10 and created an instance from the default
>> Ubuntu
>> >> > 22.04 image.
>> >> >
>> >> > I ran some (non-GUI) software that somehow relies on Qt, and
>> apparently Qt
>> >> > does some checks on the XDG environment, so I got the following.
>> >> >
>> >> > *Warning: QStandardPaths: wrong permissions on runtime directory
>> >> > /run/user/1000/, 0755 instead of 0700*
>> >> >
>> >> > And yes, all the user folders are set 

Re: [systemd-devel] setting cpulimit/iolimit on mysql thread not entire process

2023-11-27 Thread Mantas Mikulėnas
On Tue, Nov 28, 2023 at 8:27 AM jai  wrote:

> I am able to set cpulimit, iolimit, etc for a process using its pid
> through cgroups v2. But for some threads of a single mysql process, how can
> I achieve that?
>

You cannot; 1) the limits are per-cgroup and the entire service is a single
cgroup; 2) the threads are created by mysqld, not by systemd, and systemd
does not monitor and move service processes across cgroups once the service
is already running; 3) afaik, in cgroups v2 it isn't even allowed for
threads of a single process to straddle multiple cgroups anymore.

I'm not a DBA but I've heard that one common way to handle this would be to
create a separate MySQL instance (probably on a separate machine, even)
that would replicate all the data, for the heavy users to query. (Or the
other way around, main instance for the heavy updates ⇒ replica for regular
queries.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] networkd 249.11 fails to create ip6gre and vti6 tunnels

2023-11-27 Thread Mantas Mikulėnas
Kernel and systemd changes aside, I kind of want to say that you need to
specify an interface for the link-local endpoint to be bound to – just as
with regular sockets. If the tunnel were device-bound and not independent,
that would happen by default.

It also seems weird that the tunnel has endpoints with different scopes; I
think I've seen routers reject such packets with a "Scope Mismatch" error.

I would try building systemd from Git source; if I remember correctly,
systemd-networkd could be run directly from the build directory, making it
possible to `git bisect` down to the change that fixed this.

On Mon, Nov 27, 2023, 19:38 Danilo Egea Gondolfo <
danilo.egea.gondo...@gmail.com> wrote:

> Hello,
>
> I'm looking for help to understand an issue we are observing on Ubuntu
> 22.04.
>
> networkd is failing with "netdev could not be created: Invalid argument"
> when I try to create either an ip6gre or vti6 device.
>
> We believe this problem started when we pulled this change [1] in to the
> kernel 5.15. The problem also happens with the most recent upstream kernel
> so it's not an issue introduced by Ubuntu.
>
> The problem doesn't happen on recent versions of systemd but we'd like to
> fix it on systemd 249 (used by Ubuntu 22.04).
>
> How to reproduce the problem (tested on Ubuntu 22.04 (jammy) with systemd
> 249.11-0ubuntu3.11 and kernel 5.15.0-89-generic):
>
> --- /etc/systemd/network/tun0.netdev ---
> [NetDev]
> Name=tun0
> Kind=ip6gre
>
> [Tunnel]
> Independent=true
> Local=fe80::1
> Remote=2001:dead:beef::2
> --
>
> --- /etc/systemd/network/tun0.network ---
> [Match]
> Name=tun0
>
> [Network]
> LinkLocalAddressing=ipv6
> ConfigureWithoutCarrier=yes
> --
>
> After restarting networkd I see this in the logs
> tun0: netdev could not be created: Invalid argument
> tun0: netdev removed
>
> If we boot a kernel that doesn't have [1], the interface tun0 is created.
>
> Here is the full log with debug enabled
> https://paste.ubuntu.com/p/dPbPxgRThW/
>
> As I said, the problem seems to be fixed already in systemd, but I'm
> looking for help to understand what changes fixed it.
> The theory is that the netlink attributes used to configure the tunnel
> local/remote IPs might be wrong.
>
> This problem is documented here
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2037667
>
> Thanks in advance.
>
> [1] -
> https://github.com/torvalds/linux/commit/b0ad3c179059089d809b477a1d445c1183a7b8fe
>


Re: [systemd-devel] How to properly wait for udev?

2023-11-27 Thread Mantas Mikulėnas
On Mon, Nov 27, 2023 at 10:30 AM Lennart Poettering 
wrote:

> On So, 26.11.23 00:39, Richard Weinberger (richard.weinber...@gmail.com)
> wrote:
>
> > Hello!
> >
> > After upgrading my main test worker to a recent distribution, the UBI
> > test suite [0] fails at various places with -EBUSY.
> > The reason is that these tests create and remove UBI volumes rapidly.
> > A typical test sequence is as follows:
> > 1. creation of /dev/ubi0_0
> > 2. some exclusive operation, such as atomic update or volume resize on
> > /dev/ubi0_0
> > 3. removal of /dev/ubi0_0
> >
> > Both steps 2 and 3 can fail with -EBUSY because the udev worker still
> > holds a file descriptor to /dev/ubi0_0.
>
> Hmm, I have no experience with UBI, but are you sure we open that? why
> would we? are such devices analyzed by blkid? We generally don't open
> device nodes unless we have a reason to, such as doing blkid on it or
> so.
>

blkid and 60-persistent-storage indeed analyze ubi devices, it seems.

-- 
Mantas Mikulėnas


Re: [systemd-devel] WSL Ubuntu creates XDG_RUNTIME_DIR with incorrect permissions

2023-11-26 Thread Mantas Mikulėnas
On Mon, Nov 27, 2023 at 6:02 AM Thomas Larsen Wessel 
wrote:

> WSL does not use systemd by default.
>
>
> According to this article, systemd has been the default on WSL Ubuntu since
> June 2023. https://learn.microsoft.com/en-us/windows/wsl/systemd
>
> *"Systemd is now the default for the current version of Ubuntu that will
> be installed using the wsl --install command default."*
>
> Also when I look in the /var/log/auth.log, there are many lines with
> systemd, e.g.:
>
>
> Nov 25 22:30:14 ELCON45223 systemd-logind[155]: New session 6 of user velle.
> Nov 25 22:30:14 ELCON45223 systemd: pam_unix(systemd-user:session): session opened for user velle(uid=1000) by (uid=0)
>
> Could someone please help me understand exactly which part creates this
> XDG_RUNTIME_DIR folder? Is it part of the systemd repo or not? And if the
> answer is (or may be) different between Ubuntu and WSL Ubuntu, I would be
> happy if you share what you know about any of those cases :) Right now,
> I barely know where to report this issue.
>

In Ubuntu it is *likely* to be systemd invoked through PAM (not systemd as
in init/pid1, but one of the additional components), but in general it is
*not guaranteed* to be a systemd component (some Linux distributions use
alternative PAM modules to do this).

In a 100% systemd-based system, 1) pam_systemd requests systemd-logind to
create a user session (your syslog line 1), 2) systemd-logind starts the
user@ system service; 3) as a dependency this also starts the
user-runtime-dir@ system service; 4) the user-runtime-dir@ service
creates the runtime directory for you. In older versions it was slightly
different; logind did it internally.

-- 
Mantas Mikulėnas


Re: [systemd-devel] How to properly wait for udev?

2023-11-26 Thread Mantas Mikulėnas
If I remember correctly, udev (recent versions) takes a BSD lock using
flock(2) while processing the device, and tools are supposed to do the
same. The flock() call can be set to wait until the lock can be taken.
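
A minimal sketch of that locking pattern, assuming the tool locks the same device node that udev does. `with_device_lock` is a made-up helper, and whether udev takes this lock for UBI character devices at all is not something I have verified.

```python
import fcntl
import os


def with_device_lock(path, action):
    """Open path, take an exclusive BSD lock (blocking), run action(fd), release."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Blocks until no other open file description holds a conflicting lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        return action(fd)
    finally:
        # Closing the descriptor also drops the lock.
        os.close(fd)
```

The point of the blocking call is that no retry loop is needed: the caller simply sleeps in flock() until the other lock holder is done.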

On Sun, Nov 26, 2023 at 1:40 AM Richard Weinberger <
richard.weinber...@gmail.com> wrote:

> Hello!
>
> After upgrading my main test worker to a recent distribution, the UBI
> test suite [0] fails at various places with -EBUSY.
> The reason is that these tests create and remove UBI volumes rapidly.
> A typical test sequence is as follows:
> 1. creation of /dev/ubi0_0
> 2. some exclusive operation, such as atomic update or volume resize on
> /dev/ubi0_0
> 3. removal of /dev/ubi0_0
>
> Both steps 2 and 3 can fail with -EBUSY because the udev worker still
> holds a file descriptor to /dev/ubi0_0.
>
> FWIW, the problem can also get triggered using UBI's shell utilities
> if the system is fast enough, e.g.
> # ubimkvol -N testv -S 50 -n 0 /dev/ubi0 && ubirmvol -n 0 /dev/ubi0
> Volume ID 0, size 50 LEBs (793600 bytes, 775.0 KiB), LEB size 15872
> bytes (15.5 KiB), dynamic, name "testv", alignment 1
> ubirmvol: error!: cannot UBI remove volume
>  error 16 (Device or resource busy)
>
> Instead of adding a retry loop around -EBUSY, I believe the best
> solution is to add code to wait for udev.
> For example, having a udev barrier in ubi_mkvol() and ubi_rmvol() [1]
> seems like a good idea to me.
>
> What function from libsystemd do you suggest for waiting until udev is
> done with rule processing?
> My naive approach, using udev_queue_is_empty() and
> sd_device_get_is_initialized(), does not resolve all failures so far.
> Firstly, udev_queue_is_empty() doesn't seem to be exported by
> libsystemd. I have open-coded it as:
> static int udev_queue_is_empty(void) {
>return access("/run/udev/queue", F_OK) < 0 ?
>(errno == ENOENT ? true : -errno) : false;
> }
>
> Additionally, sd_device_get_is_initialized() seems to return sometimes
> true even if the udev worker still has the volume open.
> In short, which API do you recommend to ensure that the device my
> thread has created is actually usable?
>
> [0]: http://git.infradead.org/mtd-utils.git/tree/HEAD:/tests/ubi-tests
> [1]: http://git.infradead.org/mtd-utils.git/blob/HEAD:/lib/libubi.c#l994
>
> --
> Thanks,
> //richard
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] (no subject)

2023-10-16 Thread Mantas Mikulėnas
cal understanding of the login process, so I am a bad person
> to be asking this question, but no one else seems to be trying. Just a lot of
> complaining users adding to the bug report.
>
> I use nomachine, and while I have mentioned my suspicions to their tech
> support, I can't be very technically convincing since I don't understand
> what has changed in the login process causing them to miss something.
>
> x2go is barely maintained, which is a hint to me that the login process
> needs some kind of systemd enhancement missing in older code.
>
> Is there someone on this mailing list who can join the dots and give good
> clues about what might be going wrong with the login process of nomachine,
> x2go etc
>
> How could I prove to nomachine tech support that nomachine workstation is
> not logging in correctly?
>

Compare the output of `loginctl` and maybe `cat /proc/self/cgroup` with
what you see over e.g. SSH.

A "correct" login (i.e. one that goes through PAM) will have pam_systemd
create you a "session" in systemd-logind and move your process to a fresh
cgroup named after your UID, e.g. in cgroupv2 systems it would be
"/user.slice/user-UID.slice/session-XXX.scope" (and everything that's
launched via your 'systemd --user' would likewise be under
".../user-UID.slice/user@UID.service")

Whereas if your processes are still inside x2go's "service" cgroup, that's
an indication that it's not doing PAM setup correctly.
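
The comparison can be scripted if that helps when reporting it. This is only a sketch of the check described above: the function name and the three result labels are invented here, and the patterns assume the usual logind naming of session scopes and user slices.

```python
import re

SESSION_RE = re.compile(r"/user\.slice/user-(\d+)\.slice/session-[^/]+\.scope")
USER_MANAGER_RE = re.compile(r"/user\.slice/user-(\d+)\.slice/user@\1\.service")


def classify_cgroup(proc_cgroup_text):
    """Classify the contents of /proc/<pid>/cgroup.

    Returns 'session', 'user-manager', or 'other'.
    """
    for line in proc_cgroup_text.splitlines():
        # Each line is "hierarchy-ID:controller-list:path"; cgroup v2 uses "0::path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        # Only look at the unified (v2) hierarchy or the v1 name=systemd hierarchy.
        if controllers not in ("", "name=systemd"):
            continue
        if SESSION_RE.search(path):
            return "session"
        if USER_MANAGER_RE.search(path):
            return "user-manager"
    return "other"
```

Running it on the contents of /proc/self/cgroup from an SSH login and from the remote-desktop login should give 'session' in the first case; 'other' in the second would point at the missing PAM setup.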

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemctl stop going through timeout even though all processes have exited

2023-10-13 Thread Mantas Mikulėnas
What value do you have in /sys/fs/cgroup/systemd/release_agent, as seen by
systemd? Does it point to the correct executable?

Does e.g. forkstat (or execsnoop or similar) show that the executable is
being run when the cgroup empties?

On Fri, Oct 13, 2023, 04:20 Martin Schwenke  wrote:

> Hi Mantas,
>
> Yes, it looks like cgroups v1.
>
> Would this be a kernel bug?  systemd bug?
>
> Thanks...
>
> peace & happiness,
> martin
>
> On Wed, 11 Oct 2023 08:19:59 +0300, Mantas Mikulėnas
>  wrote:
>
> > Is this with cgroups v1 or v2? If cgroups v1 is involved (thanks
> Docker), I
> > recall it was a bit complex for systemd to get notified when the cgroup
> > actually empties – via /sys/fs/cgroup/systemd/release_agent that
> specifies
> > a helper executable that the kernel runs... I wonder if that mechanism is
> > broken on your system.
> >
> > On Wed, Oct 11, 2023 at 7:38 AM Martin Schwenke 
> wrote:
> >
> > > I'm seeing "systemctl stop " for several services taking a
> > > long time because it goes through the timeout process, even though all
> > > relevant processes have exited.
> > >
> > > I'll give 2 examples.  Both examples are running inside a privileged
> > > Rocky Linux 8.8 Docker container on a Rocky Linux 8.8 host.  The
> > > systemd version, reported by "systemctl --version" in the container
> > > is:
> > >
> > >   systemd 239 (239-74.el8_8.5)
> > >
> > > Here is ctdb.service:
> > >
> > >   [Unit]
> > >   Description=CTDB
> > >   Documentation=man:ctdbd(1) man:ctdb(7)
> > >   After=network-online.target time-sync.target
> > >   ConditionFileNotEmpty=/etc/ctdb/nodes
> > >
> > >   [Service]
> > >   Type=forking
> > >   LimitCORE=infinity
> > >   LimitNOFILE=1048576
> > >   TasksMax=4096
> > >   PIDFile=/var/run/ctdb/ctdbd.pid
> > >   ExecStart=/usr/sbin/ctdbd
> > >   ExecStop=/usr/bin/ctdb shutdown
> > >   KillMode=control-group
> > >   Restart=no
> > >
> > >   [Install]
> > >   WantedBy=multi-user.target
> > >
> > > "/usr/bin/ctdb shutdown" causes a controlled shutdown.  In many cases,
> > > starting and then stopping using systemctl works fine.  However, many
> > > times it takes >90s to stop, as per TimeoutStopSec.  If I reduce that
> > > value then the duration reduces accordingly.  I can confirm using both
> > > "ps auxfww" and "systemd-cgls" that within the container there are no
> > > relevant processes a moment after "systemctl stop ctdb" is run.  In
> > > particular, in systemd-cgls ctdb.service is gone but "systemctl stop
> > > ctdb" is still waiting.
> > >
> > > Before attempting to stop, the service is successfully started:
> > >
> > >   Oct 11 00:56:44 rocky1 systemd[710741]: ctdb.service: Executing:
> > > /usr/sbin/ctdbd
> > >   Oct 11 00:56:44 rocky1 ctdbd[710741]: CTDB logging to location
> > > file:/var/log/log.ctdb
> > >   Oct 11 00:56:44 rocky1 systemd[1]: Received SIGCHLD from PID 710741
> > > (ctdbd).
> > >   Oct 11 00:56:44 rocky1 systemd[1]: Child 710741 (ctdbd) died
> > > (code=exited, status=0/SUCCESS)
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Child 710741
> belongs to
> > > ctdb.service.
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Control process
> exited,
> > > code=exited status=0
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Got final SIGCHLD
> for
> > > state start.
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: New main PID 710742
> > > belongs to service, we are happy.
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Main PID loaded:
> 710742
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Changed start ->
> running
> > >   Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Job
> ctdb.service/start
> > > finished, result=done
> > >   Oct 11 00:56:44 rocky1 systemd[1]: Started CTDB.
> > >   -- Subject: Unit ctdb.service has finished start-up
> > >   -- Defined-By: systemd
> > >   -- Support:
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> > >   --
> > >   -- Unit ctdb.service has finished starting up.
> > >   --
> > >   -- The start-up result is done.
> > >
> > > The relevant part of the log while stopping seems to be:
> > >

Re: [systemd-devel] systemctl stop going through timeout even though all processes have exited

2023-10-10 Thread Mantas Mikulėnas
r=JobRemoved cookie=59 reply_cookie=0 signature=uoss error-name=n/a
> error-message=n/a
>   Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: Unit entered failed
> state.
>
> It would be very useful if systemd could log what it is still waiting
> for when it times out.
>
> Note that during start and stop, CTDB runs a lot of subprocesses,
> including some that use systemctl to start and stop various services
> that it, in turn, manages.
>
> The full debug level log, after running:
>
>   systemd-analyze log-level debug
>
> is uploaded to:
>
>   https://meltin.net/uploads/systemd/ctdb-stop.log
>
> I'm happy to reply and attach it, but it is 48KB.
>
> The only theory I can come up with is some sort of race where
> processes are created during shutdown and systemd gets confused.
>
> I see a similar thing for a much simpler service, winbind:
>
> Here is winbind.service:
>
>   [Unit]
>   Description=Samba Winbind Daemon
>   Documentation=man:winbindd(8) man:samba(7) man:smb.conf(5)
>   After=network.target nmb.service
>
>   [Service]
>   Type=notify
>   PIDFile=/var/run/winbindd.pid
>   EnvironmentFile=-/etc/sysconfig/samba
>   ExecStart=/usr/sbin/winbindd --foreground --no-process-group
> $WINBINDOPTIONS
>   ExecReload=/bin/kill -HUP $MAINPID
>   LimitCORE=infinity
>
>   [Install]
>   WantedBy=multi-user.target
>
> Yesterday I watched it do the same thing as CTDB.  I could start the
> service by hand but it would time out during stop, nearly every time,
> even though there were no relevant processes running anymore.
> winbindd sends a READY=1 notification after successfully starting.  It
> does not send STOPPING=1.  winbindd is much simpler during shutdown.
> I can get logs for this one too if necessary.
>
> Thanks for any help.
>
> peace & happiness,
> martin
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Create a tmpfile with content from output of executing a command

2023-10-09 Thread Mantas Mikulėnas
No, and that doesn't sound like a good fit for systemd-tmpfiles anyway.

Use an ordinary Type=oneshot .service instead. (In current systemd
versions, you can even specify StandardOutput to be a file; in older
versions calling `/bin/sh -c "... > file"` is fine.)
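
As an illustration of what such a oneshot service might execute, here is a small merge script. It is a sketch only: the script and its function name are hypothetical, it takes the input files as arguments and writes the merged text to stdout, so the service's StandardOutput= (or a shell redirect) decides where the result lands.

```python
import sys


def merge_files(paths, out):
    """Write the contents of each file in paths to out, in the order given."""
    for path in paths:
        with open(path, "r", encoding="utf-8") as fh:
            for line in fh:
                out.write(line)


if __name__ == "__main__":
    # Usage: merge.py FILE [FILE...]  (output goes to stdout)
    merge_files(sys.argv[1:], sys.stdout)
```

Note that an input file without a trailing newline will run straight into the first line of the next one.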

Your email domain has a strict SPF/DMARC policy, and the obsolete
mailing-list system used by Freedesktop.org does not play well with that,
leading to some places such as Gmail quarantining all list-relayed messages
from you. (I think it's because the list server damages your DKIM
signatures.)

On Thu, Oct 5, 2023 at 12:44 PM Renjaya Raga Zenta <
renjaya.ze...@formulatrix.com> wrote:

> Hi list,
>
> I'd like to create a temporary file using systemd-tmpfiles. The file
> will contain merge of multiple text files. Can the argument field in
> tmpfiles.d be a path to an executable? So I can create a script to
> print the content of those multiple files.
>
> Or maybe there is another way to do this?
>
>
> Thank you.
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] systemd-tmpfiles service related queries

2023-10-02 Thread Mantas Mikulėnas
On Mon, Oct 2, 2023 at 2:36 PM Pintu Agarwal  wrote:

> Hi All,
>
> I have a doubt related to systemd-tmpfiles-setup.service.
> This service is mentioned to be started after local-fs.target.
> {{{
> After=local-fs.target systemd-sysusers.service
> Before=sysinit.target shutdown.target
> }}}
> In this case this service takes only ~125ms.
> systemd-tmpfiles-setup.service (123ms)
>
> But in our case (QC chipset, arm64, quad-core), we wanted to move this
> service to start before local-fs.target, so we can push some of our
> services upward.
> {{{
> After=systemd-sysusers.service systemd-journald.service
> Before=local-fs.target sysinit.target shutdown.target
> }}}
> In this case it is taking more than ~1s but it helps to reduce the
> timing of other services.
> systemd-tmpfiles-setup.service (1.177s)
>
> So, I wanted to know two things:
> 1) What is the dependency if starting this service after local-fs target
> only ?
> 2) Is it fine to move this service to start before local-fs.target ?
> What could be the consequences and effect and how to verify it ?
>

The consequences are that if you configure tmpfiles to create something in
a separate mounted filesystem, without this dependency (ordering) it may
accidentally create files in the "lower" mountpoints before the filesystem
is mounted...

The alternative to using local-fs.target is to go through all of your
tmpfiles.d configurations and add specific After=foo.mount or
RequiresMountsFor=/foo/bar ordering – for each filesystem that the
configuration expects to be available – into your tmpfiles service.

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemd-nspawn/systemd.nspawn machinectl enable/start

2023-10-02 Thread Mantas Mikulėnas
Each nspawn container that's managed via machinectl is run as an instance
of "systemd-nspawn@.service". Add a [Service] ExecStartPre= to the instance
you need, using `systemctl edit` or similar.

On Mon, Oct 2, 2023 at 1:37 AM Rob Ert  wrote:

> Hello all,
>
> As I have not been able to find an answer to my question after consulting
> man pages and google, I am turning to this mailing list.
>
> I have a systemd-nspawn os container that I have set to automatically
> start with machinectl enable.
> I would like to automatically have a bcachefs snapshot created before the
> machine is started. As provisions for a hook to script something like this
> do not seem to be supported in systemd.nspawn,
> I would like to know what and where the best way and place to achieve this
> is?
>
> Please cc me.
>
> Many thanks,
> and all the best,
> Rob
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] Systemd cgroup setup issue in containers

2023-09-29 Thread Mantas Mikulėnas
On Fri, Sep 29, 2023, 12:54 Lewis Gaul  wrote:

> Hi systemd team,
>
> I've encountered an issue when running systemd inside a container using
> cgroups v2, where if a container exec process is created at the wrong
> moment during early startup then systemd will fail to move all processes
> into a child cgroup, and therefore fail to enable controllers due to the
> "no internal processes" rule introduced in cgroups v2. In other words, a
> systemd container is started and very soon after a process is created via
> e.g. 'podman exec systemd-ctr cmd', where the exec process is placed in the
> container's namespaces (although not a child of the container's PID 1).
> This is not a totally crazy thing to be doing - this was hit when testing a
> systemd container, using a container exec "probe" to check when the
> container is ready.
>

Wouldn't it be better to have the container inform the host via
NOTIFY_SOCKET (the Type=notify mechanism)? I believe systemd has had
support for sending readiness notifications from init to a container
manager for quite a while.

(Alternatively, connect out to the container's systemd or dbus Unix socket
and query it directly that way, but NOTIFY_SOCKET would avoid the need to
time it correctly.)

Other than that – I'm not a container expert but this does seem like a
self-inflicted problem to me. If you spawn processes unknown to systemd, it
makes sense that systemd will fail to handle them.

>


Re: [systemd-devel] Starting a service before any networking

2023-09-28 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 12:31 PM Mark Rogers 
wrote:

> On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas  wrote:
>
>> So now I'm curious: if the first command you run is to bring the
>> interface *down*, then what exactly brought it up?
>>
>
> Good question. The reason for down/up was that this was working as a way
> to reset the connection after boot, so I just transferred that to the
> ExecStartPre.
>
> Looking at the "journalctl -u dhcpcd" output, this is what I see from my
> last boot:
> Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc
> noop state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[372]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:05 pi ip[383]: 2: eth0: 
> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[383]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
> Feb 14 10:12:36 pi dhcpcd[385]: timed out
> Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
> Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
> Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
> Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
> Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
> Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
> Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:38 pi ip[524]: 2: eth0: 
> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:38 pi ip[524]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu
> 1500 qdisc pfifo_fast state UP group default qlen 1000
> Feb 14 10:12:38 pi ip[529]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.
>
>  (I deleted the "ip addr" output from the interfaces other than eth0 for
> brevity.)
>
> The interesting thing is surely that dhcpcd is being started twice.
> Assuming that was always happening then that suggests dhcpcd was bringing
> the network up early (and failing but leaving it in a "stuck" state) and
> then again later (where it was unable to recover from the first failure,
> but now can)?
>

That's possible... but again, I don't see how it would get into this
"stuck" state in any other way but driver and/or hardware issues, as the
kernel driver is where the power-up sequence is done... dhcpcd (like 'ip
link set eth0 up') pretty much just tells the OS to power the NIC on, then
waits.

(My previous laptop had a Realtek Ethernet NIC that often wouldn't
recognize Ethernet link after suspend/resume until I removed it from the
PCI bus... took several kernel releases until they fixed that.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 12:14 PM Mark Rogers 
wrote:

> On Wed, 27 Sept 2023 at 09:39, Mantas Mikulėnas  wrote:
>
>> It might be an issue with the kernel driver for your Ethernet interface,
>> then (as setting the interface 'up/down' usually reinitializes the
>> controller) – or possibly a physical issue with your cable or your switch,
>> but it doesn't seem like the kind of issue that userspace configuration
>> should be *able* to lead to in the first place. (...except maybe for EEE
>> "power saving" stuff that might tip over a really marginal link.)
>>
>
> What doesn't make sense is that this had previously worked, although it's
> possible that the network hardware has changed since it was previously
> tested.
>
>
>> (It's sort of like blaming a segfault crash on the user: if a program
>> crashes, that's inherently a bug regardless of configuration. Here it's
>> similar: if the Ethernet cable is really connected but the driver still
>> reports "no carrier", that's either an interface issue or – if you see the
>> same on multiple Pi's – perhaps a NIC driver issue, but it's not something
>> that configuration ought to be *able* to do.)
>>
>
> OK, in that case if this persists I'll have to look at upgrading the whole
> system, which I'm trying to avoid doing. But:
>
>
>> Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
>> edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
>> "manually" bring the interface up, then down (possibly with a 'sleep .5'
>> after each), and hopefully when dhcpcd brings it up the /second/ time it
>> will work.
>>
>
> This has worked:
> [Service]
> ExecStartPre=ip addr
> ExecStartPre=ip link set eth0 down
> ExecStartPre=ip link set eth0 up
> ExecStartPre=ip addr
>
> (the "ip addr" calls are just to log the before/after state to journal).
> It's booted in that state several times now successfully. I'll need to do
> more testing yet but I am inclined to leave it at that (I hate workarounds
> rather than actually fixing the issue but I suspect this is far as I'll
> get).
>

So now I'm curious: if the first command you run is to bring the interface
*down*, then what exactly brought it up?

Normally interfaces start in (administrative) 'down' state until something
– such as dhcpcd – brings them up (and starts waiting for carrier, etc).
But if this is in ExecStartPre and dhcpcd isn't running yet, then how is
eth0 'up'?

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 11:23 AM Mark Rogers 
wrote:

> On Tue, 26 Sept 2023 at 20:41, Mark Rogers 
> wrote:
>
>> (I should be able to find another Pi to test for any physical hardware
>> issues, I'll try that tomorrow.)
>>
>
> I have today tested on a different Pi, different PSU, different cable, all
> with exactly the same results. There is definitely something about the
> early boot stages which is different from later on that means bringing the
> network up early (as happens now) will usually fail.
>
> (Some more background: This is a heavily modified install for a specific
> application so it's almost certainly something I have broken somewhere.
> However it has worked for years, I'm trying to resolve an issue on a unit
> that was returned because of physical damage to the SD card, so I've
> rebuilt it from an old image and now have this problem. I just need to
> break down the boot sequence to find out which step is causing the
> interface to get into a state where it fails like this. Systemd version is
> 241.)
>

It might be an issue with the kernel driver for your Ethernet interface,
then (as setting the interface 'up/down' usually reinitializes the
controller) – or possibly a physical issue with your cable or your switch,
but it doesn't seem like the kind of issue that userspace configuration
should be *able* to lead to in the first place. (...except maybe for EEE
"power saving" stuff that might tip over a really marginal link.)

(It's sort of like blaming a segfault crash on the user: if a program
crashes, that's inherently a bug regardless of configuration. Here it's
similar: if the Ethernet cable is really connected but the driver still
reports "no carrier", that's either an interface issue or – if you see the
same on multiple Pi's – perhaps a NIC driver issue, but it's not something
that configuration ought to be *able* to do.)


>
> Alternatively I guess there's the workaround option: detect the condition
> at a later stage of the boot and run the down/up sequence to fix it. If I
> try that, where is likely the best place in the sequence to put it? If I
> wanted to make it, in effect, part of the dhcpcd unit (in that when dhcpcd
> starts it first runs a down/up script), how should I do that without
> modifying system dhcpcd unit files?
>

Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
"manually" bring the interface up, then down (possibly with a 'sleep .5'
after each), and hopefully when dhcpcd brings it up the /second/ time it
will work.

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas

On 2023-09-26 21:31, Mark Rogers wrote:
> On Tue, 26 Sept 2023 at 13:44, Mantas Mikulėnas wrote:
>
>> I'm still not entirely sure of the situation but right now it sounds
>> like the configuration is okay but the Ethernet interface is failing
>> to establish a physical link on the first try. Does it also show
>> "<NO-CARRIER>" within the interface flags?
>
> eth0:  mtu 1500 qdisc pfifo_fast
> state DOWN group default qlen 1000
>
> I've done a lot more testing now and there's a race condition somewhere
> as it does sometimes (rarely) boot OK and get an IP address with no
> config changes.


That's not a race condition; it's a fault in the network interface 
itself. "NO-CARRIER" means it's physically unable to establish the 
Ethernet link – an external condition that the service ordering has no 
effect on.


(The interface *is* already brought "up" – in the `ip link set` sense –
because it shows the <UP> flag, which was probably done by dhcpcd when
it started up; now the DHCP client is sitting there waiting for carrier
before it can do anything else.)


At this stage ordering is not a problem because dhcpcd, like any 
self-respecting DHCP client, is able to monitor carrier status; it 
doesn't just immediately give up.


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas
On Tue, Sep 26, 2023, 15:32 Mark Rogers  wrote:

> On Tue, 26 Sept 2023 at 13:08, Mantas Mikulėnas  wrote:
>
>> Depends on what exactly runs dhcpcd and wpa_supplicant. Is that done by
>> networking.service (ifupdown)? NetworkManager? Are they standalone services?
>>
>
> How do I tell?
>

Run `systemctl status <PID>` or browse `systemd-cgls` to map a process to
its .service unit.
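
For scripting the same lookup, here is a rough sketch of the idea rather than anything systemd ships: systemd tracks processes by cgroup, so the unit name normally appears as a component of the path in /proc/<PID>/cgroup. The function names are made up for illustration, and the sketch assumes the process sits in a .service or .scope cgroup.

```python
from pathlib import Path
from typing import Optional


def unit_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the innermost path component that looks like a unit name."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        # cgroup v2 has an empty controller field; v1/hybrid setups keep
        # systemd's own tree under "name=systemd".
        if controllers not in ("", "name=systemd"):
            continue
        for component in reversed(path.strip().split("/")):
            if component.endswith((".service", ".scope")):
                return component
    return None


def unit_of_pid(pid: int) -> Optional[str]:
    """Hypothetical helper: read the cgroup file of a live process."""
    return unit_from_cgroup(Path(f"/proc/{pid}/cgroup").read_text())
```

Calling `unit_of_pid()` with the PID of the running dhcpcd process should then name the unit that started it, which answers the "standalone or part of ifupdown" question.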


> (System is a Pi running an elderly Raspbian. The issue I am having is that
> the device is not getting an IP address - if i wait until booted I have to
> issue "ip link set eth0 down" and "ip link set eth0 up" to get it to retry
> the DHCP request
>


> ("up" alone isn't sufficient, despite "ip addr" showing the interface as
> DOWN.
>

I think you're confusing two different states, which have similar
indications – "administrative" up/down that you control (the "<UP>" flag,
with nothing shown when down) and "operational" up/down that represents the
actual interface status (the "<LOWER_UP>" vs "<NO-CARRIER>" flags and/or the
"state XXX" field).

"state DOWN" is *not* directly controlled by `ip link set up` – it's the
result of the interface being inoperative for some other reason even though
it is administratively <UP> (i.e. turned on).
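
To make the two states concrete, here is a small sketch that splits one line of `ip link` output into the administrative and operational parts. The helper name and the sample line in the usage note are illustrative only (they are not taken from the logs in this thread), and it assumes the usual iproute2 layout of "<FLAGS> ... state XXX".

```python
import re


def link_state(ip_link_line: str) -> dict:
    """Separate administrative state (UP flag) from operational state."""
    flags_match = re.search(r"<([^>]*)>", ip_link_line)
    flags = set(flags_match.group(1).split(",")) if flags_match else set()
    state_match = re.search(r"\bstate (\S+)", ip_link_line)
    return {
        # Set by `ip link set DEV up`, cleared by `ip link set DEV down`.
        "admin_up": "UP" in flags,
        # Reported by the driver once the physical link is established.
        "carrier": "LOWER_UP" in flags,
        # Shown when the interface is administratively up but has no link.
        "no_carrier": "NO-CARRIER" in flags,
        "operstate": state_match.group(1) if state_match else None,
    }
```

For a line such as `2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN`, this reports admin_up as true and carrier as false: the interface has been turned on but the link never came up.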

I'm still not entirely sure of the situation but right now it sounds like
the configuration is okay but the Ethernet interface is failing to
establish a physical link on the first try. Does it also show
"<NO-CARRIER>" within the interface flags?

> I am assuming that this is because the config file isn't in place when
> dhcpcd starts but I may be mistaken.)
>
>
>> I would generally expect Before/Wants=network-pre.target to work, but
>> that relies on your network services themselves being set up correctly –
>> they too need to order themselves After that target.
>>
>
> In that case I should probably return to Before/Wants=network-pre.target
> and work out what is breaking it, but same question as above: how do I
> figure that out?
>

`systemctl cat` for direct configuration and `systemctl list-dependencies
--after` (if I remember it right) should be a good start.



> --
> Mark Rogers
>
>


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas
Depends on what exactly runs dhcpcd and wpa_supplicant. Is that done by
networking.service (ifupdown)? NetworkManager? Are they standalone services?

I would generally expect Before/Wants=network-pre.target to work, but that
relies on your network services themselves being set up correctly – they
too need to order themselves After that target.

On Tue, Sep 26, 2023, 13:51 Mark Rogers  wrote:

> I'm sure this is trivial but I've gone round in circles without success.
>
> I have a script which reads from an SQLite database and generates various
> system configuration files - at the moment these are dhcpcd.conf and
> wpa_supplicant.conf but this might grow in future.
>
> As such the only dependency the script has is that the filesystem is up
> and running. But the script must complete before anything that the script
> manages the configuration file for.
>
> My current unit looks like this:
> [Unit]
> Before=networking.service
> After=local-fs.target
>
> [Service]
> Type=oneshot
> ExectStart=/path/to/script
>
> [Install]
> RequiredBy=network.target
>
> Where am I going wrong and what is the right way to do this?
>
> I've also tried Before=network-pre.target and Wants=network-pre.target
> without success - it was that not working that set me off trying to fix it.
> --
> Mark Rogers
>
>


Re: [systemd-devel] What condition(s) do .device units wait for?

2023-09-15 Thread Mantas Mikulėnas
.device units wait for *udev* to broadcast the uevent about that device
being added, which happens after udev has 1. received the initial kernel
uevent (either real or produced by systemd-udev-trigger.service) and 2.
finished processing all its .rules for that device (which means everything
that rules launched from RUN= must have exited, etc).

Only devices that udev rules have tagged with TAG+="systemd" will produce
.device units; generally 99-systemd.rules will add that to disk devices.

If any of the rules have marked the device with ENV{SYSTEMD_READY}="0", the
.device unit will keep waiting until another event removes that.


On Sat, Sep 16, 2023, 07:54 Philip Couling  wrote:

> I'm trying to understand why a system is timing out waiting for a device
> in /etc/fstab when a simple "mount -av" will succeed.
>
> To reach systemd, initramfs has already mounted the device as the base
> layer to an overlay mount used as the root file system, so it's definitely
> ready to use in the Linux kernel. In /etc/fstab, fsck is set to 0.
>
> What condition does systemd wait for that could be timing out on a device
> that's already mounted?
>
>


Re: [systemd-devel] Online backup API for systemd-journal?

2023-09-04 Thread Mantas Mikulėnas
On Mon, Sep 4, 2023 at 5:35 PM Etienne Doms  wrote:

> Hi,
>
> I have some embedded systems in the wild, not connected to anything,
> on which you can push a button "something went wrong, create a dump".
> Then later I can fetch the said dump and inspect it.
>
> I'd like to include the whole journal, for the current boot, in a
> binary format so that I can later do "journalctl --file
> path/to/journal-dump.bin" from another machine. I understand that
> internally everything is stored in /var/log/journal/, but
> I guess that I cannot blindly tar/cp the .journal files, since this
> would be racy.
>
> So, is there an API to safely dump a big ".journal" file containing a
> snapshot of "journalctl -b"? I could not find anything in the
> documentation, sorry in advance if I missed something obvious.
>

Run `journalctl --rotate` (or send SIGUSR2 to systemd-journald). All
"rotated" .journal files (containing an '@' in their name) are offline and
can be copied.
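
A minimal sketch of the selection step, assuming the naming rule described above (archived files carry an '@' in their name, active ones do not). The function names are hypothetical, and the sketch deliberately skips files ending in ".journal~".

```python
from pathlib import Path
from typing import Iterable, List


def archived_journals(names: Iterable[str]) -> List[str]:
    """Keep only rotated (offline) journal files from a list of file names."""
    return sorted(n for n in names if "@" in n and n.endswith(".journal"))


def archived_in(journal_dir: str) -> List[Path]:
    """Hypothetical helper: list the archived files of one journal directory."""
    directory = Path(journal_dir)
    names = (p.name for p in directory.iterdir())
    return [directory / n for n in archived_journals(names)]
```

The dump button would run the rotation first, then copy whatever `archived_in()` returns for the machine's journal directory.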

> For now I just dump it with "-o json" which is fine, but then I cannot
> feed another journalctl with the given json, and need to do manual
> filtering.


If you dump with `-o export` instead (or convert the JSON to the export
format), you can later feed the dump into systemd-journal-remote(8) (which
is somewhere in /lib/systemd) to import it back into a .journal file.

-- 
Mantas Mikulėnas


Re: [systemd-devel] networkd: IPv6: equivalent of 'default via fe80::1` with policy routing?

2023-09-01 Thread Mantas Mikulėnas
On Fri, Sep 1, 2023 at 2:55 PM TJ  wrote:

> I may just be over-thinking this but I have a scenario that I can
> configure manually but have not been able to figure out how to amend the
> networkd configuration to match!
>
> # echo "2 starlink" >> /etc/iproute2/rt-tables
> # ip -6 rule add from 2001:0DB8:1:1::/64 table starlink priority 100
> # ip -6 route add default via fe80::1 dev WAN table starlink
>
> Note: 'via' required to prevent failed neighbour solicitations for
> external addresses.
>
> The issue is I cannot see how to achieve both 'default' and 'via' in
> .network
>   ROUTE section (when specifying a routing table).
> I see recommendations to use `Gateway=::` as an alias for 'default' but
> that prevents
>   setting the next-hop router address explicitly, which results in failure
> due to neighbour
>   solicitation.
>

No; `default` has nothing to do with the gateway field. It's an alias for
the route *destination network* field, specifically ::/0 for IPv6 or
0.0.0.0/0 for IPv4.

What you have is a completely standard IPv6 default route, regardless of
which table it's in:

[Route]
Destination=::/0
Gateway=fe80::1

-- 
Mantas Mikulėnas


Re: [systemd-devel] Additional Locale Variables for Units and Number Format

2023-08-29 Thread Mantas Mikulėnas
It sounds like you're reinventing LC_NUMERIC.

The locale system has a lot more than just LANG; it already allows the
number format to be overridden separately from the "language". Take a look
at `locale -k LC_NUMERIC` and <
https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html>.

Adding custom variables would require changing a lot – I guess the main
consumers are libc (Glibc) and libstdc++ (GCC), but of course there are
many places which set the existing LC_* and expect things to change
accordingly, or which might implement the standard interfaces on their own
without using libc.

On Tue, Aug 29, 2023, 20:17 TJ Shipp  wrote:

> I am trying to add in support for a separate variable to change our unit
> system, and having both LANG and UNITS to identify the "locale" of the
> system.
> We are also not only looking for English versus Metric, but are looking
> for mixed units as well (both Imperial and Metric hybrid), as well as
> looking to add number formats (1,000.00 vs 1.000,00)
>
> And what is the best way to add support for a new system environment
> variable such as UNITS?
>
> P.S. If anyone is interested in contracting to do this work, please send
> me a private message outside this list.
>


Re: [systemd-devel] Append to logfile with year-month

2023-08-24 Thread Mantas Mikulėnas
On Thu, Aug 24, 2023 at 10:49 AM Cecil Westerhof 
wrote:

> In a service file I can use:
> StandardOutput=append:/var/log/root/aptCacheUsage.log
>
> but I want to use something like:
> StandardOutput=append:/var/log/root/aptCacheUsage_$(date +%%Y-%%m).log
>
> This does not work, because this puts it in:
> /var/log/root/aptCacheUsage_$(date +%Y-%m).log
>
> Is there a way I can put it in:
> /var/log/root/aptCacheUsage_2023-08.log
>
> while it would automatically next month go into:
>/var/log/root/aptCacheUsage_2023-09.log
>
>
Not with built-in systemd tools. If it's a periodic (not permanently
running) service, best you can do is script a monthly cronjob that
automatically edits the StandardOutput line in your .service unit.


> I could of-course put it into:
> /var/log/root/aptCacheUsage.log
>
> and at the beginning of the month move it if it exists with a timed
> service, but I really would not like that kind of solution.
>

It's called /etc/logrotate.conf and it's what everyone else does. It's what
Debian/Ubuntu itself uses for /var/log/apt*.log and such.

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemd-cryptenroll with TPM2

2023-08-21 Thread Mantas Mikulėnas
Have your initramfs *extend* a PCR after it retrieves the key from the TPM,
before it switches to (or even unlocks) the rootfs. As most PCRs cannot be
rolled back without a reboot, this would prevent the key from being
unsealed from a running system even if it manages to boot (without causing
the initramfs to fail earlier). Systemd already has some tools for this;
see "systemd-pcrphase".

On Mon, Aug 21, 2023, 17:40 Aleksandar Kostadinov 
wrote:

> Hello,
>
> This is more of a user question but I didn't find any other suitable forum
> to ask.
>
> I want to install a server that should have an encrypted root but be able
> to reboot unattended.
>
> systemd-cryptenroll with TPM2 looks like a viable option. I'm concerned
> about which PCRs to pin so that an average attacker  won't be able to
> decrypt the volume having physical possession of the server. This means I'm
> not concerned about cracking the TPM chip or reading out life memory.
>
> To me it is acceptable to pin a lot of them so that adding/changing
> devices would prevent automatic decryption. Also 5 looks good about changed
> GPT partitions.
>
> I'm concerned though about an attacker replacing the encrypted root volume
> with a non-encrypted one. Which may result in system booting an attacker
> controlled environment while PCRs may be in a state that allows decryption
> of the original root volume.
>
> Would anything prevent the system from booting with a replaced root volume?
>
> If it can boot in such a way, which PCRs need to be pinned to remove the
> ability to decrypt the original root volume?
>
> If there is presently no such PCR, can some custom validation be added in
> the process to take care of that?
>
> Thank you!
>


Re: [systemd-devel] machinectl shell .bashrc

2023-08-16 Thread Mantas Mikulėnas
By default `machinectl shell` runs the user's shell with the "login" flag,
exactly as during console or SSH logins. For Bash, that means it will look
for ~/.bash_profile or ~/.profile *instead of* ~/.bashrc.

Usually people have a ~/.bash_profile that sets up "once per session"
things if any, then manually sources ~/.bashrc (with the '.' or 'source'
command) as Bash in login mode never does so automatically.

The same applies to global configs; Bash in login mode will read
/etc/profile, but not /etc/bash.bashrc unless the latter is explicitly
sourced (e.g. Arch's /etc/profile has an "if [ "$BASH" ]; then .
/etc/bash.bashrc; fi").

On Thu, Aug 17, 2023 at 3:09 AM LuKaRo  wrote:

> Hi,
>
> somehow, when using machinectl shell to access my nspawn containers, my
> .bashrc is ignored, although bash is correctly used as my shell. However,
> when specifying /bin/bash explicitly, the .bashrc gets sourced correctly.
> Any ideas?
>
> *lukas@home*:*~*$ sudo machinectl shell x11
> Connected to machine x11. Press ^] three times within 1s to exit session.
> [root@x11 ~]# echo $HISTFILESIZE; echo $0;
> 500
> /bin/bash
> [root@x11 ~]#
> logout
> Connection to machine x11 terminated.*lukas@home*:*~*$ sudo machinectl shell 
> x11 /bin/bash
> Connected to machine x11. Press ^] three times within 1s to exit 
> session.*root@x11*:*~*# echo $HISTFILESIZE; echo $0;
>
> /bin/bash*root@x11*:*~*#
> exit
> Connection to machine x11 terminated.*lukas@home*:*~*$
>
> Thanks,
> lukaro
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] nspawn container sees total host memory instead of MemoryMax value

2023-08-06 Thread Mantas Mikulėnas
As far as I know, that's normal – /proc/meminfo always reflects the total
amount of memory, regardless of cgroup limits. LXC uses lxcfs to mount a
fake meminfo file there; nspawn doesn't have an equivalent.
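
If the goal is just to confirm that the limit took effect, one option is to read it back from the cgroup on the host. The sketch below assumes cgroup v2 mounted at /sys/fs/cgroup and that the container's unit lives in machine.slice; both the path layout and the helper names are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def parse_memory_max(text: str) -> Optional[int]:
    """Parse the contents of a cgroup v2 memory.max file.

    Returns the limit in bytes, or None when the file says "max" (no limit).
    """
    value = text.strip()
    return None if value == "max" else int(value)


def unit_memory_limit(unit: str,
                      slice_name: str = "machine.slice",
                      cgroup_root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Hypothetical helper: read the limit of one unit from the host side."""
    path = Path(cgroup_root) / slice_name / unit / "memory.max"
    return parse_memory_max(path.read_text())
```

With MemoryMax=2G applied, `unit_memory_limit("systemd-nspawn@my-container-real-name.service")` would be expected to return 2147483648, while `free` inside the container keeps showing the host total.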

On Sun, Aug 6, 2023, 18:55 Paulo Coghi - Coghi IT 
wrote:

> I used "systemctl set-property
> systemd-nspawn@my-container-real-name.service MemoryMax=2G", to test
> defining a limit on RAM usage of a nspawn container.
>
> But after setting the limit, with the config being created at
> "/etc/systemd/system.control/" correctly, when I start the container and
> enter on it, the "free" command still shows the memory info from the host.
>
> Is this correct? If yes, is there a way to make the container to show only
> the memory separated to it?
>
> Paulo Coghi
>


Re: [systemd-devel] multiple starts for a socket-based service

2023-08-06 Thread Mantas Mikulėnas

On 2023-08-06 03:42, Ross Boylan wrote:

> On Fri, Aug 4, 2023 at 4:32 PM Kevin P. Fleming wrote:
>
>> On Fri, Aug 4, 2023, at 18:11, Ross Boylan wrote:
>>
>>> Theory: since br0 has no associated IP address when socket creation is
>>> attempted, the socket creation fails.  If so, I need to delay socket
>>> startup until br0 has an IP4 address, but I'm not sure how to do
>>> that--or even if that is the problem.
>>
>> This is almost certainly the cause, and the reason that the 'FreeBind'
>> parameter can be set in .socket files :-)
>
> Thank you, Kevin.  Setting FreeBind=yes results in successful socket
> activation on system startup.
>
> I still find the description of FreeBind on the man page puzzling: "
> Controls whether the socket can be bound to non-local IP addresses."
> But 192.168.1.10 is a local IP address, and for that matter one can
> only directly create sockets on the local machine.  The rest of the
> description makes clear the option is for my case, but I don't see how
> that relates to the quoted sentence.  Presumably the problem is the
> meaning of "non-local IP addresses".  Can anyone explain?


"Local" in the same sense as 'localhost'.

If the IP address is configured on any of the machine's interfaces, then 
it is a "local" address -- and if it's not assigned on an interface 
(yet), then it's non-local.


At the time .socket units start, in most cases, the network interfaces 
are likely to not have any IP addresses set up yet (it's even somewhat 
deliberate that .socket units start before services), so 192.168.1.10 is 
not yet "local" at that point in time, and sockets cannot bind() to it 
yet -- you get "Cannot assign requested address" as the error message. 
(This applies equally to systemd .socket units as to listening sockets 
that daemons might set up directly.)


The "free bind" option bypasses this restriction; it makes bind() calls 
always succeed, although the socket still doesn't actually begin 
receiving packets until later (when that IP address gets configured on a 
local interface).
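
For anyone who wants to see the effect outside systemd, here is a rough Linux-only sketch of the socket option behind FreeBind= (IP_FREEBIND). The helper name is made up, and the numeric fallback 15 is the Linux value of the constant, used in case the Python socket module does not export it.

```python
import socket

# Linux-specific socket option; 15 is its value in <linux/in.h>.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_listener(addr: str, port: int, free_bind: bool) -> socket.socket:
    """Create a listening TCP socket, optionally with IP_FREEBIND set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if free_bind:
            # Allow bind() to an address that is not (yet) configured locally.
            sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((addr, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock
```

Without the option, binding to an address the machine does not currently hold normally fails with "Cannot assign requested address"; with it, the bind succeeds and the socket simply stays quiet until the address shows up.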


It is possible to delay .socket startup until after an interface is 
configured (probably by ordering it *after* network-online.target, 
whereas your current version has an implicit 'before' instead), but 
it'll be easier to enable FreeBind=.


Finally, if the machine only has one IP address, it's even easier to not 
bother with binding to a specific address at all -- just specify 
"ListenStream=14987" to make it bind to the wildcard 0.0.0.0 and [::] 
addresses instead. Such a socket will automatically listen on any 
current *and future* IP addresses assigned to the machine.




> Did I only run into this problem because I specified a BindToDevice
> directive?  It seemed like a good idea since there are potentially 2
> interfaces the socket could attach to, either the virtual interface
> br0 or the actual physical network interface that requests come in on.


No, but the directive is not really useful here. Sockets do not "attach" 
to interfaces in the way you probably imagine -- they primarily attach 
to IP addresses and let the system's IP stack handle everything else. 
(That is, sockets *do not* directly grab packets from an interface; the 
network stack does that globally.)




> The message
> systemd[1]: Listening on Socket to tickle to update family netboot config.
> still occurs interspersed with kernel messages from ~2s after boot,
> before IP addresses are configured.
>
> Ross
>
> Current config:
> # /etc/systemd/system/family.socket
> [Unit]
> Description=Socket to tickle to update family netboot config
>
> [Install]
> WantedBy=network-online.target
>
> [Socket]
> ListenStream=192.168.1.10:14987
> # want to run a new job, aka service, for each connection.
> Accept=Yes
> BindToDevice=br0
> # must wait until it has an IP address
> FreeBind=true
> # 2s is default
> TriggerLimitIntervalSec=5s


Re: [systemd-devel] multiple starts for a socket-based service

2023-08-03 Thread Mantas Mikulėnas
On Thu, Aug 3, 2023, 21:09 Ross Boylan 
wrote:

> Hi, systemd-ers.  I'm trying to do something that seems at cross
> purposes with systemd's assumptions, and I'm hoping for some guidance.
>
> Goal: remote client sends a 1 line command to a server, which executes
> a script that does not create a long-running service.
> These events will be rare.  I believe the basic model for systemd
> sockets is that service is launched on first contact, and is then
> expected to hang around to handle later requests.  Because such
> requests are rare, I'd rather that the service exit and the process be
> repeated for later connections.
>
> Is there a way to achieve this with systemd?


> It looks as if Accept=yes in the [Socket] section might work, but I'm
> not sure about the details, and have several concerns:
> 1. systemd may attempt to restart my service script if it exits (as it
> seems to have in the logs below).

> 2. man systemd.socket recommends using Accept=no "for performance
> reasons".  But that seems to imply the service that is activated must
> hang around to handle future requests.
>

That's precisely the performance reason, at least historically. Spawning a
whole new process is heavier than having a running daemon that only needs
to fork, at most (and more commonly just starts a thread or even uses a
single-threaded event loop), so while it is perfectly fine for your case,
it's not recommended for "production" servers.

(Especially if the time needed to load up e.g. a python interpreter, import
all the modules, etc. might be more than the time needed for the script to
actually perform its task...)

> 3. Accept=yes seems to imply that several instances of the service
> could share the same socket, which seems dicey.  Is systemd
> automatically doing the handshake for TCP sockets,


It does, that's what "accept" means in socket programming.

However, accept() does not reuse the same socket – each accepted connection
spawns a *new* socket, and with Accept=yes it's that "per connection"
socket that's passed on to the service. The original "listening" socket
stays within systemd and is not used for communications, its only purpose
is to wait for new connections to arrive.

This is no different from how you'd handle multiple concurrent connections
in your own program – you create a base "listener" socket, bind it, then
each accept(listener) generates a new "client" socket.

With systemd, the default mode of Accept=no provides your service with the
"listener" but it is the service's job to loop over accept()-ing any number
of clients it wants.
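
As a rough sketch of the Accept=yes side: each service instance receives only the already-accepted connection socket, by default as file descriptor 3 (the first fd passed under the sd_listen_fds convention). The reply format and function names below are invented for illustration, and the sketch assumes the unit does not redirect the socket to stdin with StandardInput=socket.

```python
import socket

SD_LISTEN_FDS_START = 3  # first file descriptor passed by systemd


def read_line(conn: socket.socket, limit: int = 4096) -> str:
    """Read up to the first newline (or EOF) from the connection."""
    data = b""
    while b"\n" not in data and len(data) < limit:
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.split(b"\n", 1)[0].decode(errors="replace")


def handle(conn: socket.socket) -> None:
    """Serve exactly one client: read its one-line command and acknowledge."""
    with conn:
        command = read_line(conn)
        conn.sendall(f"ok: {command}\n".encode())


def main() -> None:
    # With Accept=yes this fd is the per-connection socket, not the listener,
    # so there is no accept() loop here; the process exits after one client.
    handle(socket.socket(fileno=SD_LISTEN_FDS_START))
```

The script run by the per-connection service would call `main()` and exit; systemd keeps the listening socket and starts a fresh instance for the next connection.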

For what it's worth, systemd's .socket units are based on the traditional
Unix "inetd" service that used to have the same two modes (Accept=yes was
called "nowait"). If systemd doesn't quite work for you, you can still
install and use "xinetd" on most Linux distros (and probably five or six
other similar superservers).

> in which the server
> generates a new port, communicates it to the client, and it is this
> second port that gets handed to the service?  Or is the expectation
> that the client service will do that early on and close the original
> port?
>

That's not how the TCP handshake works at all. (It's how Arpanet protocols
worked fifty years ago, back when TCP/IP did not exist yet and RFCs were
numbered in the double digits.)

In TCP it is never necessary for a server to generate a new port for each
client, because the client *brings its own* port number – there's one for
each end, and the combination of (local address, local port, remote
address, remote port) is what allows distinguishing multiple concurrent
connections. The sockets returned by accept() are automatically associated
with the correct endpoints.

If you look at the list of active sockets in `netstat -nt` or `ss -nt` (add
-l or -a to also show the listeners), all connections to your server stay
on the same port 14987, but the 4-value (local address, local port, remote
address, remote port) tuple is unique for each.
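The same thing can be seen from inside a program. This is only a sketch in
Python (the function name is invented, and it uses a kernel-assigned
loopback port rather than 14987): several clients connect to one listener,
and the server-side sockets all report the same local port while differing
in the remote port.

```python
import socket


def four_tuples(n=2):
    """Return (listening port, list of per-connection address tuples)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    address = listener.getsockname()

    # The kernel completes the handshakes into the listen backlog,
    # so the clients can connect before accept() is called.
    clients = [socket.create_connection(address) for _ in range(n)]
    accepted = [listener.accept()[0] for _ in range(n)]

    # (local addr, local port) + (remote addr, remote port) per connection.
    tuples = [s.getsockname() + s.getpeername() for s in accepted]

    for s in clients + accepted + [listener]:
        s.close()
    return address[1], tuples
```

Every tuple has the listening port in its second field, yet no two tuples
are equal, because each client brought a different source port.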

> Although calls to the service should be rare, it's easy to imagine
> that during development I inadvertently generate rapidly repeated
> calls so that several live processes could end up accessing the same
> socket.
>
> Finally, I'd like to ignore rapid bursts of requests.  Most systemd
> limits seem to put the unit in a permanently failed state if that
> happens, but I would prefer if that didn't happen, ie. ignore but
> continue, rather than ignore and fail.


> My first attempt follows.  It appears to have generated 5 quick
> invocations of the script and then permanent failure of the service
> with service-start-limit-hit.  So even when the burst window ended,
> the service remained down.  I think what happened was that when my
> script finished the service unit took that as a failure (? the logs do
> show the service succeeding, though RemainAfterExit=no)  and tried to
> restart it.  It did this 5 times and then hit a default limit.  Since
> the service and the socket were then considered failed, no more
> traffic on the socket triggered any action.
>

No; more likely what happened is that you forgot Ac

Re: [systemd-devel] Securing bind with systemd methods (was: bind-mount of /run/systemd for chrooted bind9/named)

2023-07-17 Thread Mantas Mikulėnas
On Mon, Jul 17, 2023, 15:44 Marc Haber 
wrote:

>
> # /lib is necessary here, or execve will fail without indication for
> # reason - that was a surprise and hard to debug because even strace
> # didn't hint me towards the real issue
> ExecPaths=/usr/sbin/named /usr/sbin/rndc /lib
>

This one in particular is not a systemd issue: All dynamically linked
binaries are executed through /lib/ld-linux*.so as their "interpreter".
(`file` will show the exact path.) I wish that had a dedicated errno,
though.


Re: [systemd-devel] Running a non-idempotent command from udev

2023-07-15 Thread Mantas Mikulėnas
Is that "once per boot", "once per interface appearance", or "once per
physical NIC lifetime"? Can the command check its effects directly (i.e.
check whether a setting has been set, or whatever the task is)?

If it's once per boot, a flag file in /run/thing_done.$ifname would be a
common solution... If it needs to be done again if the interface disappears
and reappears – udev ENV{thing_done}="1".
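A minimal sketch of the flag-file variant, assuming the command is wrapped
in a small helper (run_once and the paths here are hypothetical, and the
demo uses a temporary directory instead of /run). Creating the file with
O_EXCL makes "check and mark" a single atomic step, so two overlapping
invocations cannot both run the command:

```python
import os
import tempfile


def run_once(flag_path, action):
    """Run action() only if flag_path did not exist; return True if it ran."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, atomically.
        fd = os.open(flag_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    # Note: the flag is set before the action runs, so a failed action
    # is not retried; swap the order if retries are preferable.
    action()
    return True


def demo():
    calls = []
    with tempfile.TemporaryDirectory() as directory:
        flag = os.path.join(directory, "thing_done.eth0")
        first = run_once(flag, lambda: calls.append("ran"))
        second = run_once(flag, lambda: calls.append("ran"))
    return first, second, calls
```

With the flag under /run it disappears on reboot, which gives the "once
per boot" behaviour.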

On Sat, Jul 15, 2023, 19:20 Demi Marie Obenour 
wrote:

> What is the appropriate solution for running a non-idempotent command
> from udev?  One command needs to be run exactly once when a network
> interface appears, and another command needs to be run exactly once when
> a network interface disappears.  Both commands need to run after
> network-pre.target, but that can be handled in the script themselves.
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
> Invisible Things Lab
>


Re: [systemd-devel] Security and technical differences between systemd-nspawn and OpenVZ / LXC

2023-07-06 Thread Mantas Mikulėnas
On Thu, Jul 6, 2023 at 6:05 PM Paulo Coghi - Coghi IT 
wrote:

> Hello Systemd Devel team,
>
> I've been using OpenVZ for 11 years in production without the security
> problems I faced with LXC. But as a non-official mainstream library of
> Linux kernel, there is always a gap. Virtuozzo is working on OpenVZ 9 with
> kernel 5.14 now, but it is still not released.
>
> Systemd-nspawn seems promising, and I would like to cordially ask a few
> questions.
>
> 1. Does systemd-nspawn officially support system containers?
> I would like to not conclude it myself, but it seems so, after reading the
> official documentation.
>

Yes, it's mostly what nspawn is designed for.


> 2. The "experience" inside a system container is similar to a VM, like on
> OpenVZ?
> On OpenVZ containers, except for kernel related activities (like adding
> kernel modules), everything is identical to a virtual machine, with the
> "root" user from the container being able to manage everything, like adding
> new users, changing firewall rules, installing multiple services (web
> servers, databases), managing cron jobs, etc.
>

All of that is pretty much implied by "system container".


>
> 3. Security - Can those OS containers be used in production, with multiple
> containers from multiple owners inside the same host?
> On LXC, for example, there are vulnerabilities that can be exploited,
> allowing a container user to escape to the host. On OpenVZ, it seems that
> this was already addressed more than a decade ago.
> Does systemd-nspawn provide such security, not allowing a "container user"
> to escape to the host?
>

Both nspawn and LXC these days use "user namespaces" for isolation (i.e.
container root is no longer the same UID as host root, and each container
is mapped to a unique set of host UIDs as well). LXC calls those
"unprivileged containers" and seems to consider them
<https://linuxcontainers.org/lxc/security/> as safe as the kernel's regular
user separation, so the same would apply to nspawn as well. The
corresponding nspawn option is PrivateUsers=.


>
> 4. Storage and Inodes
> On OpenVZ, we could create "virtualized" file systems, like ploop, which
> avoids consuming inodes on the host's file system, while lightweight enough
> to provide near-native performance.
> Is there any approach to have similar benefits through systemd-nspawn?
>

Nspawn supports running containers off a loop-mounted image, but has
nothing built in with the same features as ploop. That said, ploop seems to
be a fully separate kernel module (i.e. not strictly part of OpenVZ), so in
theory you could still use it with nspawn. Alternatively, you could use
regular loop devices (which can be space-efficient with all recent kernels,
as they now support TRIM) if you don't need the snapshotting.

Though, "consuming inodes" is only a problem with Ext4, isn't it? Does the
same type of problem even exist on more modern filesystems like XFS or
Btrfs?

-- 
Mantas Mikulėnas


Re: [systemd-devel] Enrolling PCR11 does not work as expected

2023-07-05 Thread Mantas Mikulėnas
On Wed, Jul 5, 2023 at 2:11 PM Felix Rubio  wrote:

> For what is explained on the systemd-pcrphase.service(8) and
> comparing it to what I see in the log of the systemd services, there are
> three events in relation to this question:
>
> systemd-pcrphase-initrd.service
> [...]
> [systemd-ask-password-console.service]
> [...]
> systemd-pcrphase-sysinit
> systemd-pcrphase
>
> This means that, indeed, running cryptenroll after the new kernel has
> booted will never provide the correct PCR registry for 11. But then...
> what options do I have? Do I need to choose between having PCRs 7 and
> 14, so that I make sure that SB is up and running and all the certs from
> shim have not changed, or to have only PCR 11 so that I know that the
> UKI has not changed although SB can potentially be even disabled
> (please, correct me if wrong)?
>

I think the idea is to use `systemd-measure` to precompute PCR 11 for a
specific phase, then use the precomputed PCR value instead of the "live"
PCR value when sealing the data.

systemd-cryptenroll does not accept raw PCR values directly (though I use a
separate python script for that); instead it accepts --tpm2-public-key= as
a public key that could be used to *sign* PCR values, and an external
--tpm2-signature= path that'll contain the signed data.

So I believe you're supposed to use systemd-measure to precompute and sign
PCR 11, put the signed file in /boot, and tell systemd-cryptenroll to use
that when unlocking. (Later you only need to re-sign the PCR measurements
in /boot without needing to re-do cryptenroll.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] Anonymous SYSTEMD_NOTIFY socket

2023-06-28 Thread Mantas Mikulėnas
On Tue, Jun 27, 2023 at 8:36 PM Adrian Vovk  wrote:

> Hello!
>
> I'm working on passing sd_notify events from systemd-{pull,import} through
> sysupdate.
>
> All services that consume sd_notify events (systemd itself, importd,
> machined, homed, etc) act as daemons and own a directory in /run. Thus,
> they can open a notification socket at, say, /run/SERVICENAME/notify and
> set NOTIFY_SOCKET to that. Also, there's no cleanup involved: if the
> service goes away the file sticks around until the service is restarted.
>
> sysupdate, however, is the first instance of a worker process forking off
> another worker process. Thus, we cannot bind the notify socket to some
> stable name. Here are potential approaches I've explored to solve this:
>
> - Simply pass through the NOTIFY_SOCKET environment variable. That's not
> suitable because we want to export an overall progress value (smoothly from
> 0 to 100), but systemd-import is forked off multiple times so we'd instead
> export a progress value that bounces around from 0 to 100 and back to 0.
> Also progress messages would come from different PIDs for a single
> invocation of sysupdate
>
> - Create a temporary file and use that as the socket. Problem: What
> happens if systemd-sysupdate crashes and we don't get to clean up that
> file? Over time that potentially clutters up /tmp! Is this a concern?
>

I assume you meant named sockets (like one would usually find in /run), not
actually regular temporary files?

systemd-tmpfiles should correctly clean up broken sockets in /tmp, IIRC it
supports checking whether the socket is bound or stale (though maybe /run
is still a better place, even for temporary sockets).

Personally my concern would be the crash itself, not the lack of cleanup.
But if the sockets are kept in a single place, say, /run/sysupdate/notify/,
then the subsequent restart could clean out all of them?


>
> - Use socketpair to open an anonymous socket and pass it into the child.
> This one seems ideal on paper but it doesn't actually work. I can modify
> sd_notify to just use an open file descriptor instead of trying to open its
> own, and that does work except for some reason process credentials aren't
> sent over. Also using the socket pair method doesn't work all that well
> with CLOEXEC, though maybe we don't want to CLOEXEC (We'd only close the
> socket when sd_notify is called with unset_env=true)
>
> - Create an abstract socket, named after the PID of the parent sysupdate.
> i.e. @/run/sysupdate/PID/notify. I'm not super familiar with abstract
> sockets so I'm not sure of the downsides
>

Abstract sockets are tied to the network namespace, instead of the
filesystem (mount namespace). That's the main difference, as far as I know.
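For illustration, a small Linux-only sketch in Python (the helper name and
the socket name are invented): an abstract address is simply one that
starts with a NUL byte. Nothing is created in the filesystem, and the name
is released as soon as the socket is closed, so there is no stale file to
clean up after a crash.

```python
import os
import socket


def bind_abstract(name):
    """Bind a datagram socket in the abstract namespace (Linux only)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # A leading NUL byte selects the abstract namespace instead of a path.
    sock.bind(b"\0" + name.encode())
    return sock
```

Binding the same name twice fails while the first socket is open, and
succeeds again once it has been closed.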

-- 
Mantas Mikulėnas


Re: [systemd-devel] Correct shutdown ordering with socket-activated dependencies

2023-06-18 Thread Mantas Mikulėnas
On Sun, Jun 18, 2023 at 8:00 PM Ferenc Wágner  wrote:

> Hi,
>
> As an example, please consider rabbitmq-server.service, which is an
> Erlang application, so it uses the services of the Erlang Port Mapper
> Daemon (epmd), which is socket-activated:
>
> # /lib/systemd/system/rabbitmq-server.service
> [Unit]
> After=epmd.socket
> Wants=epmd.socket
>
> # /lib/systemd/system/epmd.service
> [Unit]
> After=network.target
> Requires=epmd.socket
>
> # /lib/systemd/system/epmd.socket
> [Install]
> WantedBy=sockets.target
>
> Side question: rabbitmq-server does not specify DefaultDependencies=no,
> are After/Wants=epmd.socket necessary there?  Isn't the implicit
> After=basic.target enough?
>

It might be enough on boot, but not necessarily on manual start. If you (or
the package manager) directly run `systemctl start rabbitmq-server`, the
socket would not be started unless it's explicitly listed in
Wants/Requires=.


>
> But anyway, this achieves a stable and maximally parallel startup.
>
> However, problems arise during shutdown, because the ExecStop command of
> rabbitmq-server.service uses epmd as well, but epmd.service can be
> stopped earlier and won't be reactivated (epmd.socket: Suppressing
> connection request since unit stop is scheduled; as an additional detail
> rabbitmq-server then tries to start its own epmd and fails because
> epmd.socket still holds the address, but this is irrelevant here).  So
> after much struggle ExecStop fails and the unit is brought down anyway,
> but the failure message already polluted the logs.
>
> This can be worked around by adding After=epmd.service to
> rabbitmq-server.service, but that pretty much defeats the advantages of
> socket activation by sequencing startups and being explicit about
> dependencies.  Now rabbitmq-server and epmd are just examples here, any
> service using a socket-activated service during its shutdown is
> affected and prone to timeouts or failures.
>

Not entirely sure here, but I *think* this might be unavoidable.

-- 
Mantas Mikulėnas


Re: [systemd-devel] VLAN interface stuck in pending

2023-06-13 Thread Mantas Mikulėnas
On Sun, Jun 4, 2023 at 6:34 PM Matthias Luft  wrote:

> Ahh, I read about this problem when e.g. renaming the VLAN interface
> which would then match the physical interface - and I just assumed this
> wouldn't be an issue as the VLAN interface would not exist yet, going by
> the ordered config files. Obviously should not have assumed... thank you
> very much!
>
>
They're not processed as a single list. All of the .link files are
completely independent from .netdev or .network files, for example, even
though they're in the same directory. (It's actually *udev* that applies
.link configuration, not networkd – whenever an interface *shows up* from
anywhere its .link configuration is applied, whether it was created
manually or otherwise.)

Similarly the .netdev files are processed independently whereas .network
files are attached to interfaces first (i.e. networkd doesn't look for
interfaces matching 10-wan.network; it looks for .network files matching
the eth0 device).

-- 
Mantas Mikulėnas


Re: [systemd-devel] Usage of PCR[7]

2023-06-05 Thread Mantas Mikulėnas
On Mon, Jun 5, 2023 at 11:38 PM Adrian Vovk  wrote:

>
> 2. The alternative approach involves pre-calculating PCR[7] on the
> client if we're updating DBX or Shim. Here's how I envision this
> going:
> - We read the TPM log (which we can trust because we're currently
> booted to system verified via the chain of trust) and extract
> everything read into PCR[7]
> - We clear PCR[16], then start replaying everything from the TPM log
> - When we reach the measurement of DBX, we pre-calculate the new value
> of DBX and measure that in instead. This would probably need
> collaboration w/ fwupd
> - When we reach the measurements made by Shim, we use the new values
> instead. See https://github.com/rhboot/shim/issues/555
> - PCR[16] now contains the future value for PCR[7]. We enroll (into a
> new keyslot) TPM decryption. We seal against 16+11+14, but then
> configure it to unseal against 7+11+14 (this is the one step I'm iffy
> about. Is this possible??)
>

You don't need to replay everything *into a real PCR* at all – the extend
operation is just a regular hash operation SHA(pcr||value), you can
recalculate everything in software, then seal the keyslot against your
provided PCR values instead of the "live" ones.
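Roughly, in Python (a sketch only: the function names are mine, it assumes
a SHA-256 bank and a PCR that starts out as all zeros, and it takes the
per-event digests as already extracted from the TPM log):

```python
import hashlib


def extend(pcr, digest, alg="sha256"):
    """One extend operation: new PCR value = H(old PCR value || digest)."""
    return hashlib.new(alg, pcr + digest).digest()


def replay(event_digests, alg="sha256"):
    """Recompute a PCR in software from a list of event digests."""
    # Assumption: the PCR starts as all zero bytes at boot.
    pcr = bytes(hashlib.new(alg).digest_size)
    for digest in event_digests:
        pcr = extend(pcr, digest, alg)
    return pcr
```

To predict a future value you would replace the digest of the event that
is going to change (e.g. the new DBX or Shim measurement) in the list and
replay the whole chain; the order of events matters.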

I have an old hack proof of concept for that (written mostly because I
didn't want to touch any of that SB signing even with a stick):

1. PCR[4] replay in userspace https://github.com/grawity/tpm_futurepcr
(code is ugly but it's really just calculating a hash chain, while
"updating" certain TPM log events)

2. Creating systemd-compatible LUKS tpm2 tokens against arbitrary PCR
values https://git.nullroute.lt/cgit/hacks/tpmreseal.git/
(systemd has extended its LUKS token format a little bit since then, but
the basic format still works, at least I'm able to use it on my system)

I expected #1 to be superseded by `systemd-measure` (available in latest
systemd); apparently it's not quite the same but it does focus on Secure
Boot and signed PCRs, so maybe you can get `systemd-measure` to do exactly
what you want? There's a github RFE filed for #2 so it might show up in
systemd-cryptenroll someday.

-- 
Mantas Mikulėnas


Re: [systemd-devel] triggering a remove handker manually via cmd

2023-06-05 Thread Mantas Mikulėnas
Technically yes, `udevadm trigger --action=` can be used to trigger rules
for any kind of action including remove (or just writing 'remove' into the
corresponding device's "/sys/.../uevent" file), just keep in mind that this
won't *actually* remove the device...which might result in udev and other
software being a bit confused when a "removed" device continues to exist.
(Normally it's the removal that triggers rules, not the other way around.)

On Mon, Jun 5, 2023 at 8:46 AM daggs  wrote:

> Greetings,
>
> given a rule file which has a add and remove handlers, is there a way to
> manually trigger the remove handler of that file?
>
> Thanks,
>
> Dagg
>


-- 
Mantas Mikulėnas


Re: [systemd-devel] How to prevent users form seeing other user processes with loginctl/systemctl ?

2023-06-04 Thread Mantas Mikulėnas
Assuming you already have "hidepid" configured for /proc, you'll still need
to block access to the corresponding systemd D-Bus call:

$ cat /etc/dbus-1/system.d/systemd-restrict.conf

















On Sun, Jun 4, 2023, 12:50 antisimus  wrote:

> Hello,
>
> Is there a way to hide process information (pids, command line) and
> prevent one user accessing other user processes information.
>
> On a shared system this can be a potential security risk and I really do
> not like idea users inspecting each other's running processes.
> Here I have user *bob *accessing user *alice *process info but same can
> be done even to inspect *root *users processes
>
> systemd 247 (247.3-7+deb11u2)
> Linux systemd-vps 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12)
> x86_64 GNU/Linux
>
> bob@systemd-vps:~$ loginctl user-status alice
> alice (1002)
>Since: Sun 2023-06-04 08:37:18 UTC; 2min 39s ago
>State: active
> Sessions: *7
>   Linger: no
> Unit: user-1002.slice
>   ├─session-7.scope
>   │ ├─1025 sshd: alice [priv]
>   │ ├─1046 sshd: alice@pts/1
>   │ ├─1047 -bash
>   │ ├─1305 bash myapp.sh
>   │ └─1306 sleep 5
>   └─user@1002.service
> └─init.scope
>   ├─1028 /lib/systemd/systemd --user
>   └─1029 (sd-pam)
>
>
>
>  bob@systemd-vps:~$ loginctl user-status root
> root (0)
>Since: Sun 2023-06-04 09:43:03 UTC; 3min 45s ago
>State: active
> Sessions: 5 *1
>   Linger: no
> Unit: user-0.slice
>   ├─session-1.scope
>   │ ├─740 sshd: root@pts/0
>   │ ├─765 -bash
>   │ ├─769 su - bob
>   │ ├─770 -bash
>   │ ├─877 loginctl user-status root
>   │ └─878 less
>   ├─session-5.scope
>   │ ├─820 sshd: root@pts/2
>   │ ├─826 -bash
>   │ └─872 sleep 100
>   └─user@0.service
> └─init.scope
>   ├─747 /lib/systemd/systemd --user
>   └─748 (sd-pam)
>
>
> Best regards,
> Ante
>


Re: [systemd-devel] VLAN interface stuck in pending

2023-06-01 Thread Mantas Mikulėnas
On Tue, May 30, 2023 at 11:15 AM Matthias Luft  wrote:

> Hi Friends,
>
> I'm on Debian Bullseye with 247.3-7+deb11u2 and am trying to configure a
> VLAN interface with networkd.
>
> My configuration is listed below. VLAN interface gets created correctly,
> however, no IP gets assigned. If I assign an IP manually, the interface
> is functional. networkctl shows the interface as pending:
>
> # networkctl
> IDX LINK            TYPE     OPERATIONAL SETUP
>   1 lo              loopback carrier     unmanaged
>   2 wan             ether    carrier     configured
>   3 isp_uplink_vlan vlan     degraded    pending
>
>
> With debugging enabled, I see the following in the systemd-networkd
> journal:
>
> isp_uplink_vlan: Interface is being renamed, pending initialization.
> ...
> isp_uplink_vlan: link_check_ready(): link is in pending state.
>
>
> Full log output listed here: https://pastebin.com/4urn8TBX
>
>
> Would you have any pointers what I am missing here?
>
> Thank you in advance & cheers,
> Matthias
>
> -
>
> Configuration:
>
> # cat 01-wan.link
> [Match]
> MACAddress=99:xx:xx:xx:xx:xx
>
> [Link]
> Name=wan
>

VLANs have the same MAC address as their parent device, so this .link file
tells the system to rename *both* interfaces to the same name "wan".

Add a "Type=ether" match to avoid this.

-- 
Mantas Mikulėnas


Re: [systemd-devel] How to authenticate login using org.freedesktop.login1

2023-05-24 Thread Mantas Mikulėnas
On Wed, May 24, 2023 at 9:42 AM Lal, Arun  wrote:

> Hi All,
>
>
>
> I was trying to authenticate a user from a daemon running in my machine.
> And I found systemd-login can be used.
>
> I went through documentation for interface org.freedesktop.login1, but I
> am not clear on how it can be used.
>
>
>
> Let's assume that there is a daemon called xyz running in my device which
> has a webserver component. And it receives a request to login from https
> side.
>
> And once the daemon has username and password, I would like to invoke some
> dbus calls to org.freedesktop.login1 to perform the authentication.
>

systemd-logind does not have that functionality. It's a session manager,
not an authentication service. (And the sessions it manages are meant for
mostly interactive connections; not for webapp sessions.)

Usually system authentication is done by loading libpam in-process (must be
done from a privileged process running as root). If that is not possible
(e.g. if you're using an unprivileged webapp), the *saslauthd* daemon from
Cyrus-SASL would be one option – it is designed to be used by various
network services to validate passwords over a Unix socket interface and has
a PAM backend (`saslauthd -a pam`).

I don't know of other such daemons (surprisingly, SSSD doesn't expose an
authenticate call through its D-Bus interface either, keeping it internal
to PAM only), but that's the general approach if you plan on writing your
own.

-- 
Mantas Mikulėnas


Re: [systemd-devel] Resource limits getting enforced only for processes in user's terminal not for su [user] from root's terminal

2023-05-06 Thread Mantas Mikulėnas
Create the cgroups *through systemd*, by creating .slice units for that
purpose.

You can either create individual slices for each user, or you can enable
Delegate= on a slice and then systemd will allow you to manage your own
sub-cgroups inside.

On Fri, May 5, 2023 at 10:16 AM jaimin bhaduri  wrote:

> I created a cgroup named mycgroup using 'mkdir /sys/fs/cgroup/mycgroup'.
> 'ls /sys/fs/cgroup/mycgroup' shows only memory and pid files. The io and
> cpu files were missing.
>
> They are visible after I execute 'echo +cpu +io >
> /sys/fs/cgroup/cgroup.subtree_control'.
>
> But 'systemctl daemon-reload' again deletes the cpu and io files.
> Executing 'echo +cpu +io > /sys/fs/cgroup/cgroup.subtree_control' again
> brings the files back but the values of cpu.max and io.max files are now
> reset to default.
>
> This happens to all the cgroups I create.
> How do I enable cpu, io, memory, pids for the entire cgroups directory so
> that daemon reload or any other event does not delete those files for any
> of my created cgroup?
>
> On Tue, May 2, 2023 at 12:54 PM jaimin bhaduri  wrote:
>
>> Ok I am understanding.
>>
>> Using php, I created cgroups for every user with their username in
>> /sys/fs/cgroup and set values in their cpu.max, memory.high, memory.high,
>> pids.max, etc.
>> I made the below service file where I am moving pids of users to their
>> cgroups. For example, pids of user5 will be appended to
>> /sys/fs/cgroup/user5/cgroup.procs.
>> I am doing this for all users in loop after every 5 seconds as per the
>> below configuration.
>>
>> *Content of /etc/systemd/system/cgroups.service:*
>>
>> [Unit]
>> Description=Move processes of user to cgroup
>>
>> [Service]
>> Type=simple
>> User=root
>> ExecStart=/bin/bash -c 'while true; do \
>>   pgrep -u user1 | grep -vxFf /sys/fs/cgroup/user1/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user1/cgroup.procs"; \
>>   pgrep -u user2 | grep -vxFf /sys/fs/cgroup/user2/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user2/cgroup.procs"; \
>>   pgrep -u user3 | grep -vxFf /sys/fs/cgroup/user3/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user3/cgroup.procs"; \
>>   pgrep -u user4 | grep -vxFf /sys/fs/cgroup/user4/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user4/cgroup.procs"; \
>>   pgrep -u user5 | grep -vxFf /sys/fs/cgroup/user5/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user5/cgroup.procs"; \
>>   sleep 5; \
>> done'
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> This solution is working. But is this a good way to enforce resource
>> limits on users? There can be more than 100 users also in some cases.
>>
>>
>>
>> On Tue, Apr 25, 2023 at 9:33 AM Mantas Mikulėnas 
>> wrote:
>>
>>>
>>>
>>> On Tue, Apr 25, 2023, 06:44 jaimin bhaduri  wrote:
>>>
>>>>
>>>> */etc/systemd/system/user-1000.slice.d/override.conf:*
>>>> [Unit]
>>>> Description=User Slice for UID 1000
>>>>
>>>> [Slice]
>>>> CPUAccounting=1
>>>> MemoryAccounting=1
>>>> IOAccounting=1
>>>> TasksAccounting=1
>>>> CPUQuota=55%
>>>> MemoryMax=
>>>> MemoryHigh=1G
>>>> IOReadBandwidthMax=
>>>> IOWriteBandwidthMax=
>>>> IOReadIOPSMax=
>>>> IOWriteIOPSMax=
>>>> TasksMax=100
>>>>
>>>> [Install]
>>>> WantedBy=multi-user.target
>>>>
>>>> */etc/system/user/aa.service:*
>>>> [Unit]
>>>> Description=Resource limits for user aa
>>>>
>>>> [Service]
>>>> Slice=user-1000.slice
>>>> Environment=USER_UID=1000
>>>> User=%i
>>>> WorkingDirectory=%h
>>>> Type=simple
>>>> ExecStart=/bin/bash -c 'echo "User %EUID %USER_UID" && sudo -u
>>>> \#$USER_UID $SHELL'
>>>> Restart=always
>>>> RestartSec=10
>>>>
>>>> [Install]
>>>> WantedBy=default.target
>>>>
>>>>
>>>> I made the above mentioned override.conf(slice file) and aa.service
>>>> file for the user named 'aa'.
>>>> Then I executed 'systemctl --user enable aa.service', 'systemctl --user
>>>> daemon-reload' and 'syste

Re: [systemd-devel] feature request: optional, with delay, for equivalent of setterm blank for VT login prompt

2023-04-30 Thread Mantas Mikulėnas
On Sun, Apr 30, 2023, 11:29  wrote:

> The following is a feature request. At src/login/loginctl.c ?
> The looked up feature is the equivalent of
> setterm --blank aDelay --powerdown SomeOtherDelay
> , only as soon as the login prompt appears. Before login.
> I mean, I ask to leave the current state as is. But allow the admin to set
> up these command also for the login prompt, if he chooses to. And it will
> be cleared while logining in.
> That command just put the display into a sort of blank mode.
> For me, the current situation is that I have to login to get the ability
> to run that setterm command. Which, I think, is not always desirable. A

Then instead run it as a oneshot .service with StandardInput= (and maybe
TTYPath=) set appropriately. (It's one of the few situations where it is
appropriate for services to access a tty.)

I think such settings *could* be made part of vconsole.conf, though. Try
opening a feature request on systemd's GitHub.


> console in the remote servers room can be in blank mode when no emergency
> actions are required. In particular, when the server is remotely
> supervised.
> And so is a desktop that also acts as the single printer gateway for a
> small office. Or for a user that has gone away after the desktop was
> turned on (but before he logged in), because he had some urgent call to
> attend to.
> For a machine that boots into graphic mode, not plain old text mode, a
> similar feature is implemented out of the box. Isn't it?
>

It's implemented by the "graphic mode" itself (usually by Xorg), not by
systemd.



> --
> u34
>


Re: [systemd-devel] systemd-devel Digest, Vol 156, Issue 26

2023-04-26 Thread Mantas Mikulėnas
The main difference is that "containers" are chroots with their own PID
namespace, at least, while an ordinary chroot still keeps the PID numbering
from the host. In other words, the container has its own PID 1 – and
systemd really wants to be PID 1, as init. A container runtime such as
nspawn will also delegate a cgroup subtree to the container, which systemd
also needs (being rather strongly built around cgroups).

So they're chroots with a few additional features needed to "boot" an init
system.

On Wed, Apr 26, 2023, 20:09 Benjamin Godfrey 
wrote:

> Re: The question regarding missing dependencies.
>
> You might try this:
>
> *zcat /boot/initramfs-$(uname -r).img | grep -E "(depends|provides)"*
>
> If you ran an update and dependencies were missing. Then
>
> *update-initramfs -u*
>
> wouldn't be of any help to you.
>
> I hadn't introduced myself.   My background is in art and literature, I
> started working on the front end, and I started moving to the backend to
> troubleshoot problems with the web server.  Learning NodeJs, Php and Bash,
> and I'm learning C++.  In particular, I'm trying to get the Apache2 to
> serve.  I have Ubuntu working on a Chromebook.
>
> Is it really that much more complicated to complete the boot process in
> chroot? I'll try the container solution.
>
> yours truly,
> Benjamin Godfrey
>
>
> On Wed, Apr 26, 2023 at 5:00 AM <
> systemd-devel-requ...@lists.freedesktop.org> wrote:
>
>> Today's Topics:
>>
>>1. Re:  Failed to start up manager. Freezing execution. (Caeri Tech)
>>2.  Completing the boot process for systemd inside the chroot
>>   (Benjamin Godfrey)
>>3. Re:  Completing the boot process for systemd inside the
>>   chroot (Lennart Poettering)
>>
>>
>> --
>>
>> Message: 1
>> Date: Tue, 25 Apr 2023 09:18:35 -0400
>> From: Caeri Tech 
>> To: Lennart Poettering 
>> Cc: systemd-devel@lists.freedesktop.org
>> Subject: Re: [systemd-devel] Failed to start up manager. Freezing
>> execution.
>> Message-ID:
>> > hgc+r3syjkjbt7vfin-s8b4kkuasfhfcagkmydoy...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Interesting.  If I downgrade systemd and systemd-libs from 253 to 252 it
>> works as before.
>>
>> Is there perhaps a dependency that I'm now required to include in the
>> initramfs config?
>>
>> On Tue., Apr. 25, 2023, 4:54 a.m. Lennart Poettering, <
>> lenn...@poettering.net> wrote:
>>
>> > On Di, 25.04.23 01:43, Caeri Tech (caerit...@gmail.com) wrote:
>> >
>> > > *:: running cleanup hook [udev]*
>> >
>> > This output doesn't look as if systemd was actually involved?
>> >
>> > > But it still freezes execution.
>> > >
>> > > The rescue and emergency shells do not start if I activate the hook
>> > > and again it freezes. The shells do start, however, when the hook
>> > > is not activated.
>> > Anyway, without debug logs as suggested in my earlier mail this is
>> > really hard to debug. Enable debug logging.
>> >
>> > Lennart
>> >
>> > --
>> > Lennart Poettering, Berlin
>> >
>>
>> --
>>
>> Message: 2
>> Date: Tue, 25 Apr 2023 13:26:08 -0700
>> From: Benjamin Godfrey 
>> To: systemd-devel@lists.freedesktop.org
>> Subject: [systemd-devel] Completing the boot process for systemd
>> inside the chroot
>> Message-ID:
>> <
>> cajxclxpvjick9mbsjndjn_y0r9m8-4habw5xxk3ffxsuwf4...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> It seems I sent this to the wrong list. I hope this isn't off topic.
>>
>> Dear systemd mailing list,
>>
>> I am trying to finish the boot process for systemd inside the chroot. I
>> have followed the instructions in the documentation, but I am still having
>> some problems.
>>
>> The following is typical output:
>>
>> [...$ sudo systemctl start systemd-journald
>> Running in chroot, ignoring request: start...] ### Showing that systemd is
>> bailing out because of the chroot.
>>
>> [...$ ps --no-headers -o comm 1
>> init...]   ### showing the Chromebook is using init at PID 1
>>
>> [...$ file /sbin/init
>> /sbin/init: symbolic link to /lib/systemd/systemd ...] ### showing that
>> init and systemd are

Re: [systemd-devel] Resource limits getting enforced only for processes in user's terminal not for su [user] from root's terminal

2023-04-23 Thread Mantas Mikulėnas
On Mon, Apr 24, 2023 at 7:04 AM jaimin bhaduri  wrote:

> Cgroups v2 is enabled in almalinux 9.1 with 5.14.0-70.22.1.el9_0.x86_64
> kernel and systemd 250 (250-12.el9_1.3).
>
> Content of /etc/systemd/system/user-1002.slice.d/override.conf:
>
> [Unit]
> Description=User Slice for UID 1002
>
> [Slice]
> CPUAccounting=1
> MemoryAccounting=1
> IOAccounting=1
> TasksAccounting=1
> CPUQuota=70%
> MemoryMax=1G
> MemoryHigh=1G
> IOReadBandwidthMax=/ 1G
> IOWriteBandwidthMax=/ 1G
> IOReadIOPSMax=/ 1000
> IOWriteIOPSMax=/ 1000
> TasksMax=200
>
> [Install]
> WantedBy=multi-user.target
>
> I execute systemctl daemon-reload after saving the slice file.
> Every value is getting enforced for the user when I test them by running
> some commands from the user's terminal.
> But they don't work after I run the same commands from the root's terminal
> after doing su to that user.
> They also don't work when a user's process is started from a php script
> using putenv('user_uid');.
> How do I make them work for all the user's processes no matter how they
> start?
>

Using cgroup-based limits means that something needs to actually *move* the
process into the appropriate cgroup. (They are not uid-based limits!)
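
To check which cgroup a process actually ended up in, you can read
/proc/PID/cgroup. A minimal sketch (the helper name cgroup_of is made up for
illustration). On a cgroup-v2 system, a process started from a real login
would typically show a path under user.slice/user-UID.slice, while one
started via `su` from root's terminal stays in the cgroup of root's session:

```shell
# cgroup_of is a hypothetical helper, not a systemd tool.
# It prints the cgroup membership of a PID (defaults to the current shell).
cgroup_of() {
    cat "/proc/${1:-$$}/cgroup"
}

cgroup_of
```

Run it once from the user's own terminal and once after `su`, and compare
the two paths.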

As php-fpm does not support cgroup management on its own, you might need to
run multiple instances of php-fpm@.service (not just multiple pools in the
same instance), each instance specifying "Slice=user-%i.slice" similar to
how user@.service does it.
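
As a rough sketch of such a template, assuming the instance name is the
numeric UID: the unit name, the php-fpm binary path and the per-instance
config path below are all assumptions, so adjust them to your distro. The
file is only staged in a scratch directory here:

```shell
# Sketch only: paths and file names are assumptions, not distro defaults.
# The unit is staged in a scratch directory; copy it to /etc/systemd/system/
# as root and run "systemctl daemon-reload" to actually use it.
stage="${STAGE:-$(mktemp -d)}"

cat > "$stage/php-fpm@.service" <<'EOF'
[Unit]
Description=PHP-FPM instance for UID %i

[Service]
# %i is the instance name (the UID), so each instance lands in that
# user's slice and inherits the limits set on user-%i.slice.
Slice=user-%i.slice
ExecStart=/usr/sbin/php-fpm --nodaemonize --fpm-config /etc/php-fpm/uid-%i.conf

[Install]
WantedBy=multi-user.target
EOF

echo "staged $stage/php-fpm@.service"
```

Once installed, `systemctl enable --now php-fpm@1002.service` would start
the instance for UID 1002 inside user-1002.slice.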

For `su`, you would need to configure its PAM stack to invoke pam_systemd,
but this is usually *deliberately* not done, as doing so would cause other
issues, especially for scripts that use `su` for non-interactive purposes.
(Besides that, systemd-logind does not allow creating a new session from
within another one, so the only time `su` would be allowed to do this is
exactly the time when it would be undesirable...)

Instead, `machinectl shell foo@` or `systemd-run --user -M foo@.host --pty
...` could be used if you need to manually run something as another user
(but as soon as you need to do it twice, you should just make a .service with
Slice=, or even a --user service).

-- 
Mantas Mikulėnas


Re: [systemd-devel] One shot service failure on Fedora 37

2023-04-17 Thread Mantas Mikulėnas
On Tue, Apr 18, 2023, 02:59 Bill Steinberg  wrote:

>
> Hi Barry,
>
> Thanks for the response. Answers inline below.
>
> On Apr 17, 2023, at 5:09 PM, Barry  wrote:
>
>
>
> On 17 Apr 2023, at 19:05, Bill Steinberg  wrote:
>
> Hello systemd devel,
>
> I have a systemd service that I’ve run on prior versions of fedora which
> fails to start via systemd on Fedora release 37. It is a oneshot service
> that starts the distributed checksum clearing house’s dccifd service via a
> shell script. Here is the definition of the service:
>
> [Unit]
> Description=Distributed Checksum Clearinghouses dccifd daemon
> After=syslog.target network.target
>
> [Service]
> Type=oneshot
>
>
> Oneshot seems wrong.
>
> RemainAfterExit=yes
> ExecStart=/var/dcc/libexec/rcDCC -m dccifd start
>
> Does this run a background daemon?
>
>
> Yes, the rcDCC shell script starts and runs a linux executable, a
> background daemon as you call it.
>

A "background daemon" is just Type=forking.


> Can you just run that daemon directly?
>
>
> I could run the shell script directly to start the dccifd executable
> however if the fedora linux server is rebooted I would need to remember to
> run the shell script manually. I’d like the dccifd daemon to start
> automatically when the fedora linux server is started. Isn’t systemd meant
> for this?
>

"Directly" means *not using wrapper scripts.* You can put command line
arguments in ExecStart.


> Hopefully that program can be run without daemonising.
>
>
> ExecStop=/var/dcc/libexec/rcDCC -m dccifd stop
>
> If it is oneshot it does not need a stop
>
>
> Is there another type that should be used besides oneshot? I may want to
> run systemctl stop dccifd.service, for example when dccifd is being
> upgraded to a new version.
>
> The dccifd executable is started and stopped using a shell script. It is
> not run directly. One reason is that the shell script contains the
> arguments that are passed to the dccifd linux executable.
>

That's still just Type=forking.

Make sure the script `exec`s the main process rather than spawning it
underneath as usual.

But why put those arguments in a shell script? Isn't systemd meant for this?
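
A minimal sketch of what that could look like, assuming the daemon binary
lives at /var/dcc/libexec/dccifd (a guess based on where rcDCC lives; check
your installation) and that it backgrounds itself when started. The
arguments are deliberately left out since I don't know what rcDCC passes:

```shell
# Sketch only: the dccifd path is an assumption; append whatever arguments
# the rcDCC script would normally pass to it.
# The unit is staged in a scratch directory; copy it to /etc/systemd/system/
# as root to use it.
stage="${STAGE:-$(mktemp -d)}"

cat > "$stage/dccifd.service" <<'EOF'
[Unit]
Description=Distributed Checksum Clearinghouses dccifd daemon
After=network.target

[Service]
# The daemon forks into the background by itself, hence Type=forking.
Type=forking
ExecStart=/var/dcc/libexec/dccifd
Restart=no

[Install]
WantedBy=multi-user.target
EOF

echo "staged $stage/dccifd.service"
```

No ExecStop= is needed in this form; `systemctl stop dccifd.service` signals
the processes in the service's cgroup. If the daemon writes a PID file,
adding PIDFile= helps systemd find the main process.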


>
> Restart=no
>
> [Install]
> WantedBy=multi-user.target
>
>
>
> The error in the journalctl log is:
>
> systemd[1]: Starting dccifd.service - Distributed Checksum Clearinghouses
> dccifddaemon…
> systemd[1]: dccifd.service: Main process exited, code=killed,
> status=11/SEGV
> systemd[1]: dccifd.service: Failed with result 'signal'.
> systemd[1]: Failed to start dccifd.service - Distributed Checksum
> Clearinghouses dccifddaemon.
>
> The two scripts in ExecStart and ExecStop run successfully outside of
> systemd. Any info as to why systemd fails when executing these scripts
> would be appreciated.
>
> Best,
> Bill Steinberg
>
>
>


Re: [systemd-devel] [EXT]Re: systemd user instance not working in only one account, XDG_RUNTIME_DIR not being set

2023-04-11 Thread Mantas Mikulėnas
On Tue, Apr 11, 2023, 19:23 Chandler  wrote:

> Mantas Mikulėnas wrote on 4/10/23 10:31 PM:
> > The same pam_systemd module registers a "session" with logind (which
> > triggers the creation of runtime directory as well as the startup of
> > user@.service; note: /not/ user@)
> hmmm... it's a bit ambiguous since we use LDAP too.  There, "uid" is a
> user name, while "uidNumber" would be the equivalent to $UID variable in
> bash, and "UID" printed by loginctl.
>

The rest of the system doesn't really care or change its terminology even
if you use LDAP; here it's still a UID in the regular Unix sense ("uid_t"
or "struct passwd->pw_uid"), where it's always an integer.
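
To make the distinction concrete: id(1) maps a login name to that integer,
no matter whether the account comes from /etc/passwd or LDAP. A small
illustration, using root only because that account exists practically
everywhere:

```shell
# Look up the numeric UID behind a user name.
uid=$(id -u root)
echo "root has UID $uid"
```

For a regular account, the same lookup gives you the instance name to use,
e.g. `systemctl status "user@$(id -u userName).service"`.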



> `systemctl start user@.service` does something though, since
> `status` shows it's running and everything, e.g.:
>

Systemd resolves user names when provided via User=, which is where the
instance name goes in this case, but that's not the intended usage of
user@.service.


> * user@userName.service - User Manager for UID userName
>Loaded: loaded (/usr/lib/systemd/system/user@.service; static; vendor
> preset: disabled)
>Active: active (running) since Mon 2023-04-10 17:05:53 MST; 15h ago
>  Main PID: 1635797 (systemd)
>Status: "Startup finished in 408ms."
> Tasks: 155
>Memory: 102.1M
>CGroup: /user.slice/user-userName.slice/user@userName.service
>|-dbus.service
>| `-1635943 /usr/bin/dbus-daemon ...
>|-docker.service
>| |-1635811 rootlesskit ...
>| |-1635831 /proc/self/exe ...
>| |-1635857 slirp4netns ...
>| |-1635868 dockerd
>| `-1635915 containerd --config /run/user/$UID/docker ...
>`-init.scope
>  |-1635797 /usr/lib/systemd/systemd --user
>  `-1635800 (sd-pam)
>
> what have I done??  I guess I should stop the service?
>

You should just stop it, or it'll result in a bit of a mess when both
user@fooName and user@1000 are started for the same account.


> > and sets
> > XDG_RUNTIME_DIR after the session has been registered. Check whether
> > your tty or display is shown in the `loginctl` session list.
> Well, our Session, UID, and user names are shown, but the SEAT and TTY
> columns are blank for everyone...
>
> > Note that logind does not allow registering sessions from within another
> > session, so tools like `su` won't be able to do that (except for some
> > situations where they can but you wouldn't want them to) – only a fresh
> > login gets you a session. So usually step 1 is to not use `su` or `sudo`
> > here – run `machinectl shell foo@` if you need a shell for a local user.
> Gotcha, thanks, I didn't know that or about machinectl!
>
> So, I tried stopping user@userName.service since that seemed incorrect
> to start it manually from root.  The /run/user/$UID dir went away.  I
>

In recent systemd versions it *shouldn't break anything*, at the very
least, as the runtime directory is now created via regular dependencies
(i.e. through user-runtime-dir@$UID.service). Still, it should probably be
left to logind (using the linger flag if necessary), i.e. it should not be
*necessary* to start it manually.

In older versions (without user-runtime-dir@), this would have failed as
there was nothing that would create the runtime directory on demand (with
logind itself doing it as part of the session registration).

> tried `machinectl shell userName@` which was successful but there is
> still no /run/user/$UID dir and `systemctl --user` returning the same
> bus connection failure message from before...
>

That still sounds like a PAM issue; machined does set up a PAM session for
`machinectl shell`, so I'm guessing pam_systemd is completely gone from
/etc/pam.d (wherever your distro usually has it).

(Maybe a pam_succeed_if has been told to skip too many modules for a
certain user, for example?)
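
A quick way to check is to grep the PAM configuration. A minimal sketch
(find_pam_systemd is a made-up helper name, and /etc/pam.d is only the
usual default location):

```shell
# find_pam_systemd is a hypothetical helper: it lists every PAM config line
# that mentions pam_systemd. The directory defaults to /etc/pam.d.
find_pam_systemd() {
    grep -rn 'pam_systemd' "${1:-/etc/pam.d}"
}

find_pam_systemd || echo "pam_systemd is not referenced anywhere"
```

If it does show up, look at the lines just above it in the same file for a
pam_succeed_if (or similar) that might jump over it for the affected user.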


> I tried checking `systemctl status user@$UID.service` for another
> account that is not logged in at all or listed in `loginctl` output
> (let's say user2), and reported it was loaded but inactive, and no
> /run/user/$UID dir for user2 either.  Then after `machinectl shell
> user2@` and checking user@$UID.service status again, it is active and
> running, and /run/user/$UID is created and `systemctl --user status`
> works too.
>
> So there is something quirky with the other user's account preventing
> proper implementation/startup of systemd user instance... any other ideas?
>

Did that user already have an active logind session at that time? Normally
user@ is started either when the user's session count goes from 0 to non-0
(or on boot if linger is enabled); but if something has manually stopped
it, I don't think logind will try to restart it?

Check the system logs.

>


Re: [systemd-devel] systemd user instance not working in only one account, XDG_RUNTIME_DIR not being set

2023-04-10 Thread Mantas Mikulėnas
On Tue, Apr 11, 2023, 03:41 Chandler  wrote:

> systemd has been working great here, system-wide as well as in all user
> instances except one.  I'm not exactly sure what all the steps are in
> the process to get a systemd user instance running.  The directory
> /run/user/$UID was not being created, though.
>
> I made some progress by running `systemctl start
> user@.service` and the /run/user/$UID was created.
>
> `systemctl --user status` returns `Failed to connect to bus: No such
> file or directory`.  XDG_RUNTIME_DIR is not being set, but a command
> like `XDG_RUNTIME_DIR=/run/user/$UID systemctl --user status` runs
> successfully, so I think it's down to this last piece.
>

The same pam_systemd module registers a "session" with logind (which
triggers the creation of runtime directory as well as the startup of
user@.service; note: *not* user@) and sets XDG_RUNTIME_DIR after the
session has been registered. Check whether your tty or display is shown in
the `loginctl` session list.

Note that logind does not allow registering sessions from within another
session, so tools like `su` won't be able to do that (except for some
situations where they can but you wouldn't want them to) – only a fresh
login gets you a session. So usually step 1 is to not use `su` or `sudo`
here – run `machinectl shell foo@` if you need a shell for a local user.
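
To see how far a given login got, something like the following can help. It
is only a sketch (check_user_runtime is a made-up helper name): it reports
whether XDG_RUNTIME_DIR is set and whether the user manager's private socket
and the user bus socket exist inside it:

```shell
# check_user_runtime is a hypothetical helper, not part of systemd.
check_user_runtime() {
    if [ -z "${XDG_RUNTIME_DIR:-}" ]; then
        echo "XDG_RUNTIME_DIR is not set"
        return 1
    fi
    # systemd/private is the user manager's own socket; bus is the user D-Bus.
    for sock in systemd/private bus; do
        if [ -S "$XDG_RUNTIME_DIR/$sock" ]; then
            echo "found socket $XDG_RUNTIME_DIR/$sock"
        else
            echo "missing socket $XDG_RUNTIME_DIR/$sock"
        fi
    done
}

check_user_runtime || true
```

Run it in the affected account and in a working one; "not set" points at
PAM/logind, while missing sockets point at the user manager not running.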


Re: [systemd-devel] systemctl daemon-reexec forgets running services and starts everything new

2023-04-10 Thread Mantas Mikulėnas
On Tue, Apr 4, 2023 at 11:33 AM Wasser, Erik  wrote:

> # Some details to the hardware #
>
> Our metal runs OpenVZ/Virtuozzo with this kernel (without any problems):
>
> > Linux FQDN_REDACTED 3.10.0-1127.18.2.vz7.163.46 #1 SMP Fri Nov 20
> 21:47:55 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> The container with the `systemctl daemon-reexec` problem reports the
> following kernel:
>
> Linux FQDN_REDACTED 5.4.0 #1 SMP Thu Apr 22 16:18:59 MSK 2021 x86_64
> x86_64 x86_64 GNU/Linux
>

Hold on a moment – if it is actually an *OpenVZ container*, not a VM,
how/why is it even reporting a different kernel than the host OS? Isn't the
entire point of OpenVZ to share a single kernel with the guest containers?
Is it actually 3.10 **pretending** to be 5.4 just to make it pass systemd's
kernel version checks?

-- 
Mantas Mikulėnas


Re: [systemd-devel] systemd-resolved and dhclient nameservers

2023-04-07 Thread Mantas Mikulėnas
Systemd-resolved can't actually see your leases all on its own; the DHCP
client needs to provide it that information. Networkd and NetworkManager do
it directly through the D-Bus IPC.

For standalone dhcpcd, you would likely need to install the `resolvconf`
emulation that comes with systemd (usually packaged separately to avoid
conflicts with the real resolvconf/openresolv), or find a hook script that
would upload the configuration using resolvectl. For dhclient, probably the
same.
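
A hook along these lines might do it for dhclient. This is a sketch, not a
tested script: it assumes your dhclient-script sources exit hooks (the
directory differs per distro, e.g. /etc/dhcp/dhclient-exit-hooks.d/ on
Debian-like systems) and relies on the variables dhclient-script normally
exports ($reason, $interface, $new_domain_name_servers, $new_domain_name):

```shell
# Sketch of a dhclient exit hook that hands lease data to systemd-resolved.
update_resolved() {
    case "${reason:-}" in
        BOUND|RENEW|REBIND|REBOOT)
            if [ -n "${new_domain_name_servers:-}" ]; then
                # Unquoted on purpose: one argument per DNS server.
                resolvectl dns "$interface" $new_domain_name_servers
            fi
            if [ -n "${new_domain_name:-}" ]; then
                resolvectl domain "$interface" "$new_domain_name"
            fi
            ;;
        EXPIRE|FAIL|RELEASE|STOP)
            # Drop the per-link settings again when the lease goes away.
            resolvectl revert "$interface"
            ;;
    esac
}

update_resolved
```

Afterwards `resolvectl status` should list the servers under the link, and
they should appear in /run/systemd/resolve/resolv.conf.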

On Fri, Apr 7, 2023, 19:50 Alex Cheamitru  wrote:

> Hi,
>
> I'm curious about whether or not dhclient-provided DNS servers will be
> added to resolv.conf when it is managed by systemd-resolved.  The man page
> mentions "... and dynamically at run time, for example from DHCP
> leases", but I'm not sure if that only applies to systemd-networkd
> leases.
>
> To test, I emptied my current resolv.conf (which contained
> dhclient-provided nameservers) and started systemd-resolved.
>
> > /etc/resolv.conf
> systemctl start systemd-resolved
>
> After doing this, I see no nameservers defined
> in /run/systemd/resolve/resolv.conf.
>
> Is that expected behavior?  I was hoping to see my DHCP-provided
> nameservers in resolv.conf, but that wasn't the case.
>
> Thanks,
> Alex
>


Re: [systemd-devel] creating device nodes

2023-04-05 Thread Mantas Mikulėnas
.device units do not mknod, they only represent existing state.

/dev/fuse is usually created through tmpfiles.d (which gets its
configuration via kmod-static-nodes.service).

# kmod static-nodes --format=tmpfiles
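
If that service doesn't run inside the container, a hand-written tmpfiles.d
entry could do the same job. A sketch, assuming the usual character device
numbers for fuse (10:229); the file name fuse.conf and the mode are my
choice, and the file is only staged in a scratch directory here:

```shell
# Sketch only: fuse.conf is a hypothetical file name. Copy the result to
# /etc/tmpfiles.d/ as root to use it.
stage="${STAGE:-$(mktemp -d)}"

cat > "$stage/fuse.conf" <<'EOF'
# Type Path      Mode User Group Age Argument (major:minor)
c      /dev/fuse 0666 root root  -   10:229
EOF

cat "$stage/fuse.conf"
```

Once it is in /etc/tmpfiles.d/, `systemd-tmpfiles --create
/etc/tmpfiles.d/fuse.conf` applies it immediately; otherwise it is picked up
at boot.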

On Wed, Apr 5, 2023 at 11:13 AM Richard Hector 
wrote:

> Hi all,
>
> I want to create a device (/dev/fuse) in an LXC container. The kernel
> bit works; I can mknod manually, but I'd rather use a systemd unit, and
> make it a dependency of mounting filesystems from /etc/fstab.
>
> It looks like .device units are supposed to be created automatically if
> there's an appropriate udev rule with TAG+="systemd" - these lines
> exist in /usr/lib/udev/rules.d/99-systemd.rules:
>
> # Asynchronously mount file systems implemented by these modules as soon
> as they are loaded.
> SUBSYSTEM=="module", KERNEL=="fuse", TAG+="systemd",
> ENV{SYSTEMD_WANTS}+="sys-fs-fuse-connections.mount"
>
> The comment seems to suggest it will cause the filesystems to be mounted
> when the device is created, which is kind of the reverse of what I'm
> after. Do I need a different line?
>
> Or do I need to create a .device unit file manually? I can't see much
> info on doing that.
>
> Cheers,
> Richard
>


-- 
Mantas Mikulėnas

