[systemd-devel] udev can fail to read stdout of processes spawned in udev_event_spawn
Hi,

In tracking down an issue we are having with usb-modeswitch, I found that the root cause is an issue in udev: when a rule sets a PROGRAM= and uses the result, it will sometimes receive an empty result even though the program did produce output. This appears to be because the on_spawn_sigchld handler used by spawn_wait does not check whether there is output left in the stdout pipe when the program exits, so there is a race between this event and the io event that reads the output.

What is the best way to fix this issue? Locally I have had success just calling the on_spawn_io callback in the process-success branch of on_spawn_sigchld, but I am unsure if this is an acceptable fix.

Thanks,
Paul
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
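The behavior the proposed fix relies on can be demonstrated from the shell: data a child wrote to a pipe stays in the kernel pipe buffer after the child has exited, so a handler that drains stdout on SIGCHLD still recovers the output. A minimal sketch (the fifo stands in for udev's stdout pipe):

```shell
# A pipe's contents outlive the writer: output produced before exit
# stays in the kernel pipe buffer until a reader drains it. This is
# why draining stdout from the SIGCHLD path still recovers the result.
fifo=$(mktemp -u)
mkfifo "$fifo"

echo RESULT > "$fifo" &   # short-lived "PROGRAM": writes and exits
exec 3< "$fifo"           # open the read end
wait                      # the writer has exited now ("SIGCHLD")
sleep 0.2

read -r line <&3          # ...yet its output is still readable
exec 3<&-
rm -f "$fifo"

echo "$line" > /tmp/pipe-demo.out
echo "read after writer exit: $line"
```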
Re: [systemd-devel] Journalctl --list-boots problem
On Thu, Oct 31, 2019 at 4:34 PM Lennart Poettering wrote:
> On Di, 08.10.19 16:57, Martin Townsend (mtownsend1...@gmail.com) wrote:
>
> > Thanks for your help. In the end I just created a symlink from
> > /etc/machine-id to /data/etc/machine-id. It complains really early on
> > boot with
> > Cannot open /etc/machine-id: No such file or directory
> >
> > So I guess it's trying to read /etc/machine-id for something before
> > fstab has been processed and the data partition is ready.
> >
> > But the journal seems to be working ok and --list-boots is fine. The
> > initramfs would definitely be a more elegant solution to ensure
> > /etc/machine-id is ready.
> >
> > I don't suppose you know what requires /etc/machine-id so early in
> > the boot?
>
> PID 1 does.
>
> You have to have a valid /etc/machine-id really, everything else is
> not supported. And it needs to be available when PID 1 initializes.
>
> You basically have three options:
>
> 1. Make it read-only at boot, initialize it persistently on OS install.
>
> 2. Make it read-only, initialize it to an empty file on OS install, in
>    which case systemd (i.e. PID 1) overmounts it with a random one
>    during early boot. In this mode the system will come up with a new
>    identity on each boot, and thus journal files from previous boots
>    will be considered to belong to different systems.
>
> 2b. (Same as 2, but mount / writable during later boot, at which time
>     the machine ID is committed to disk automatically.)
>
> 3. Make it writable during early boot, and initialize it originally to
>    an empty file. In this case PID 1 will generate a random one and
>    persist it to disk right away.
>
> Also see:
>
> https://www.freedesktop.org/software/systemd/man/machine-id.html
>
> Lennart
>
> --
> Lennart Poettering, Berlin

Hi Lennart,

Thank you for the information, it was very useful. Reading the link on machine-id gives me another option: pass the machine-id from U-Boot to the kernel via its bootargs.
As some background, this is for an embedded system that uses systemd and Mender, which seems to be becoming fairly popular, so hopefully this may help someone else who stumbles across this post with the same problem. Mender provides image updates on an A/B root filesystem, so having /etc/machine-id within the root filesystem isn't really feasible, but with Mender you get a persistent data partition that both root filesystems share, hence why I have opted for a symlink to the persistent partition.

So for embedded systems using Mender that want a persistent machine-id I see two options now:

1) Use an initramfs to mount the persistent Mender data partition and store the machine-id there, with /etc/machine-id as a symlink.

2) Thanks to your link, store the machine-id in the U-Boot environment and pass it in as the systemd.machine_id= kernel command line parameter.

Out of interest, what does PID 1 need /etc/machine-id for before it has processed fstab (and hence before the persistent data partition is ready to read through the /etc/machine-id symlink)? We haven't implemented either of the above, so I wouldn't mind knowing what the impact would be.

Cheers,
Martin.
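Option 2 could look roughly like this at the U-Boot console; the environment variable name machine_id is purely illustrative, and the value must be the 32-character lowercase hex string described in the machine-id man page:

```
# U-Boot console (variable name is illustrative)
setenv machine_id 0123456789abcdef0123456789abcdef
setenv bootargs "${bootargs} systemd.machine_id=${machine_id}"
saveenv
```

systemd.machine_id= is documented in kernel-command-line(7); with it set, PID 1 uses that ID without needing the data partition mounted first.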
Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)
Sorry for the imprecision there, it's not a kernel panic but a complete freeze of the system during a reboot or shutdown (roughly 10 to 20% of the time). The error message that appears on the screen is about missing libraries, which I could later locate in /usr/lib64 (I do not remember their names right now).

Concerning the .mount files, there is NOT one associated with /live/image. Instead, all the others from /etc/fstab are created:

cat /etc/fstab
live:/srv/live/root /root nfs intr,nolock 0 0
live:/srv/live/home /home nfs intr,nolock 0 0
live:/srv/live/opt  /opt  nfs intr,nolock 0 0

Each nfs entry has a corresponding .mount file. The most important kernel options are:

root=172.16.16.38:/srv/live/cos8 rootovl

where 'rootovl' is the option provided by the 90overlay-root dracut module (imported from Debian buster, see below). Should I modify overlay-mount.sh? Thank you.

###
### cat /usr/lib/dracut/modules.d/90overlay-root/README
###

dracut rootfs overlayfs module

Make any rootfs ro, but writable via overlayfs. This is convenient, if for example using an ro-nfs-mount. Add the parameter "rootovl" to the kernel, to activate this feature.

This happens pre-pivot. Therefore the final root file system is already mounted. It will be set ro, and turned into an overlayfs mount with an underlying tmpfs. The original root and the tmpfs will be mounted at /live/image and /live/cow in the final rootfs.
###
### cat /usr/lib/dracut/modules.d/90overlay-root/module-setup.sh
###

#!/bin/bash

check() {
    # do not add this module if the kernel does not have overlayfs support
    [ -d /lib/modules/$kernel/kernel/fs/overlayfs ] || return 1
}

depends() {
    # we do not depend on any modules - just some root
    return 0
}

# called by dracut
installkernel() {
    instmods overlay
}

install() {
    inst_hook pre-pivot 10 "$moddir/overlay-mount.sh"
}

###
### cat /usr/lib/dracut/modules.d/90overlay-root/overlay-mount.sh
###

#!/bin/sh

# make a read-only nfsroot writeable by using overlayfs
# the nfsroot is already mounted to $NEWROOT
# add the parameter rootovl to the kernel, to activate this feature

. /lib/dracut-lib.sh

if ! getargbool 0 rootovl ; then
    return
fi

modprobe overlay

# a little bit of tuning
mount -o remount,nolock,noatime $NEWROOT

# Move root
# --move does not always work. Google >mount move "wrong fs"< for
# details
mkdir -p /live/image
mount --bind $NEWROOT /live/image
umount $NEWROOT

# Create tmpfs
mkdir /cow
mount -n -t tmpfs -o mode=0755 tmpfs /cow
mkdir /cow/work /cow/rw

# Merge both into the new filesystem
mount -t overlay -o noatime,lowerdir=/live/image,upperdir=/cow/rw,workdir=/cow/work,default_permissions overlay $NEWROOT

# Let the filesystems survive the pivot
mkdir -p $NEWROOT/live/cow
mkdir -p $NEWROOT/live/image
mount --bind /cow/rw $NEWROOT/live/cow
umount /cow
mount --bind /live/image $NEWROOT/live/image
umount /live/image

From: Lennart Poettering
Sent: Thursday, October 31, 2019 6:34:15 PM
To: Matteo Guglielmi
Cc: systemd-devel@lists.freedesktop.org
Subject: Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)

On Mo, 28.10.19 09:47, Matteo Guglielmi (matteo.guglie...@dalco.ch) wrote:

> almost 20% of the time I get a kernel panic error
> due to a bunch of missing libraries.

A kernel panic? because of "missing libraries"? that doesn't sound right. The kernel doesn't need "libraries".
iirc it's totally fine to unmount the backing fs after you mounted the overlayfs, the file systems remain pinned in the background by the overlayfs.

> > How can I instruct systemd to avoid unmounting
> > /live/image (or postpone it to a later moment)?

You can extend the .mount unit file for /live/image and add an explicit dep: i.e. create /etc/systemd/system/live-image.mount.d/50-my-drop-in.conf, then add:

[Unit]
After=some-other.mount

You get the idea...

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] Mount units with After=autofs.service cause ordering cycles
On 10/31/19 2:59 PM, Lennart Poettering wrote:

On Do, 31.10.19 14:09, John Florian (jflor...@doubledog.org) wrote:

# /etc/systemd/system/var-www-pub.mount
[Unit]
Description=mount /pub served via httpd
Requires=autofs.service
After=autofs.service

[Mount]
What=/mnt/pub
Where=/var/www/pub
Options=bind,context=system_u:object_r:httpd_sys_content_t

[Install]
WantedBy=multi-user.target

~~~

The above worked for a long time, but once again a `dnf upgrade` seems to have broken things, because now I have an ordering cycle that systemd must break. Since I haven't changed my mount units, my ability to mesh with those shipped by the OS proves fragile. I'm deliberately avoiding too much detail here because it would seem that there should be a relatively simple solution to this general sort of task -- I just can't seem to discover it. Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually dumped along with the log message.

systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
systemd[1]: local-fs.target: Found dependency on autofs.service/start
systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
systemd[1]: local-fs.target: Found dependency on network-online.target/start
systemd[1]: local-fs.target: Found dependency on network.target/start
systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
systemd[1]: local-fs.target: Found dependency on sysinit.target/start
systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
systemd[1]: local-fs.target: Found dependency on local-fs.target/start
systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break ordering cycle starting with local-fs.target/start

The ordering dep between local-fs.target and var-www-pub.mount is what you have to get rid of to remove the cycle.
Set:

…
[Unit]
DefaultDependencies=no
Conflicts=umount.target
Before=umount.target
…
[Install]
WantedBy=remote-fs.target
…

i.e. make this a dep of remote-fs.target, not the implicit local-fs.target, so that we don't pull it in during early boot, but only during late boot, before remote-fs.target.

Thanks Lennart! That did the trick. I and others I know have knocked heads on this one several times and somehow never came to this conclusion. It makes sense now that I see it, however. Maybe local-fs.target should have stood out to me, but I think it was mostly accepted since if you follow all deps far enough, you'll eventually cover (most?) everything. I think this just means I need to use `systemctl show` more, even though `systemctl cat` is so much easier to digest for what I think I need to know. Abstracting the default deps is both good in expression but also difficult in comprehension. I wish there was something "in between", but I don't even know how to define what that means. Maybe just grouping all the settings from `show` somehow, e.g. ordering, deps, etc., or maybe by unit type: unit, exec, mount, etc.
Re: [systemd-devel] Mount units with After=autofs.service cause ordering cycles
On Do, 31.10.19 14:09, John Florian (jflor...@doubledog.org) wrote:

> > > # /etc/systemd/system/var-www-pub.mount
> > > [Unit]
> > > Description=mount /pub served via httpd
> > > Requires=autofs.service
> > > After=autofs.service
> > >
> > > [Mount]
> > > What=/mnt/pub
> > > Where=/var/www/pub
> > > Options=bind,context=system_u:object_r:httpd_sys_content_t
> > >
> > > [Install]
> > > WantedBy=multi-user.target
> > >
> > > ~~~
> > >
> > > The above worked for a long time, but once again a `dnf upgrade` seems to
> > > have broken things, because now I have an ordering cycle that systemd must
> > > break. Since I haven't changed my mount units, my ability to mesh with
> > > those shipped by the OS proves fragile. I'm deliberately avoiding too much
> > > detail here because it would seem that there should be a relatively simple
> > > solution to this general sort of task -- I just can't seem to discover it.
> > > Any recommendations that don't involve an entirely different approach?
>
> > What precisely is the ordering cycle you are seeing? It's usually
> > dumped along with the log message.
> systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
> systemd[1]: local-fs.target: Found dependency on autofs.service/start
> systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
> systemd[1]: local-fs.target: Found dependency on network-online.target/start
> systemd[1]: local-fs.target: Found dependency on network.target/start
> systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
> systemd[1]: local-fs.target: Found dependency on sysinit.target/start
> systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
> systemd[1]: local-fs.target: Found dependency on local-fs.target/start
> systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break
> ordering cycle starting with local-fs.target/start

The ordering dep between local-fs.target and var-www-pub.mount is what you have to get rid of to remove the cycle. Set:

…
[Unit]
DefaultDependencies=no
Conflicts=umount.target
Before=umount.target
…
[Install]
WantedBy=remote-fs.target
…

i.e. make this a dep of remote-fs.target, not the implicit local-fs.target, so that we don't pull it in during early boot, but only during late boot, before remote-fs.target.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] Mount units with After=autofs.service cause ordering cycles
On 10/31/19 1:08 PM, Lennart Poettering wrote:

On Mo, 14.10.19 16:23, John Florian (j...@doubledog.org) wrote:

So, I much prefer the expressiveness of systemd's mount units to the naive era of /etc/fstab, but I've found one situation where I seem to always get stuck and am never able to find a reliable solution that survives OS (Fedora & CentOS) updates. I have an NFS filesystem mounted by autofs at /pub that needs to be bind mounted in various places such as /var/www/pub and /var/ftp/pub. So I create a unit that looks like:

~~~
# /etc/systemd/system/var-www-pub.mount
[Unit]
Description=mount /pub served via httpd
Requires=autofs.service
After=autofs.service

[Mount]
What=/mnt/pub
Where=/var/www/pub
Options=bind,context=system_u:object_r:httpd_sys_content_t

[Install]
WantedBy=multi-user.target
~~~

The above worked for a long time, but once again a `dnf upgrade` seems to have broken things, because now I have an ordering cycle that systemd must break. Since I haven't changed my mount units, my ability to mesh with those shipped by the OS proves fragile. I'm deliberately avoiding too much detail here because it would seem that there should be a relatively simple solution to this general sort of task -- I just can't seem to discover it. Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually dumped along with the log message.
systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
systemd[1]: local-fs.target: Found dependency on autofs.service/start
systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
systemd[1]: local-fs.target: Found dependency on network-online.target/start
systemd[1]: local-fs.target: Found dependency on network.target/start
systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
systemd[1]: local-fs.target: Found dependency on sysinit.target/start
systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
systemd[1]: local-fs.target: Found dependency on local-fs.target/start
systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break ordering cycle starting with local-fs.target/start
Re: [systemd-devel] is the watchdog useful?
On Thu, Oct 31, 2019 at 06:30:33PM +0100, Lennart Poettering wrote:
> On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote:
>
> > In principle, the watchdog for services is nice. But in practice it seems
> > to bring only grief. The Fedora bugtracker is full of automated reports of
> > ABRTs, and of those that were fired by the watchdog, pretty much 100% are
> > bogus, in the sense that the machine was resource-starved and the watchdog
> > fired.
> >
> > There are a few downsides to the watchdog killing the service:
> > 1. if it is something like logind, it is possible that it will cause
> >    user-visible failure of other services
> > 2. restarting the service causes additional load on the machine
> > 3. coredump handling causes additional load on the machine, quite
> >    significant
> > 4. those failures are reported in bugtrackers and waste everyone's time.
> >
> > I had the following ideas:
> > 1. disable coredumps for watchdog abrts: systemd could set some flag
> >    on the unit or otherwise notify systemd-coredump about this, and it
> >    could just log the occurrence but not dump the core file.
> > 2. generally disable watchdogs and make them opt-in. We have
> >    'systemd-analyze service-watchdogs', and we could make the default
> >    configurable to "yes|no".
> >
> > What do you think?
>
> Isn't this more a reason to substantially increase the watchdog
> interval by default? i.e. 30min if needed?

Yep, there was a proposal like that. I want to make it 1h in Fedora.

Zbyszek
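For a service that keeps the watchdog but with the generous interval floated in this thread, the relevant unit settings would look something like this (the 30min figure is just the value discussed above):

```ini
[Service]
# the service must send sd_notify("WATCHDOG=1") more often than this
WatchdogSec=30min
# restart rather than stay dead if the watchdog ever does fire
Restart=on-watchdog
```

Both settings are documented in systemd.service(5); the interval can also be adjusted globally via RuntimeWatchdogSec= and friends, which is a separate (hardware) watchdog mechanism.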
Re: [systemd-devel] systemd as a docker process manager
On So, 27.10.19 20:50, Jeff Solomon (jsolomon8...@gmail.com) wrote:

> This is a followup to this thread:
>
> https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html
>
> to see if there are any new developments.
>
> We have a multi-process application that already uses systemd successfully.
> Our customers want to put the application into a container, and that
> container should be docker because that is what they use. We can't use
> systemd-nspawn or podman or whatever, because our customers want to use
> docker, because they are already using docker for other applications.
>
> I understand that containers are not a security technology, but we want to
> find a solution that allows us to run systemd in a docker container that
> isn't blatantly less secure than systemd running outside of a container. I
> have yet to find a way.
>
> Fundamentally, the problem is that the systemd in the container requires
> read/write access to the host's /sys/fs/cgroup/systemd directory in order
> to function at all.

It only requires write access to the subtree it lives in, not to what lives above it. See how nspawn does it.

> Even if the container isn't privileged, if you mount the host's
> /sys/fs/cgroup directory inside the container and let the container
> write to it, you have a security hole that doesn't exist when systemd
> is just run on the host. That hole is described here:

Three options:

1. Docker should use CLONE_NEWCGROUP to get its own cgroup subtree, hiding what is outside of it.

2. Docker should mount the root of the cgroup tree read-only, and only the subtree the container is supposed to live in writable.

3. Just use cgroupsv2.

I don't know Docker really, you'd have to enquire whether they support that. They are a bit behind on these things, but maybe if you ping them, they will add this for you.
(Of course, systemd-nspawn supports all three of the above.)

> https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/
>
> Using user namespaces doesn't help because then the container user wouldn't
> have permission to write to /sys/fs/cgroup/systemd.

It doesn't need write access to that dir, only to the subtree it is supposed to live in.

> Our application runs as a non-root user. The security concern is that any
> user on the host who is in the docker group would be able to start a shell
> inside the container as "container root" and then be able to get root on
> the host. So basically membership in the docker group is equivalent to host
> root.
>
> Taking a step back - I wonder (mostly asking Lennart) if there is a way to
> run systemd without it needing access to /sys/fs/cgroup/systemd? I'm sure
> there isn't but I thought I would ask.

No. systemd requires cgroups. But it's fine to mount only the subtree it needs writable. systemd carefully makes sure that the service manager never steps beyond its territory, the access boundaries are clear, and that allows you to carefully arrange the cgroup tree so that only the subtree and the hierarchy systemd really needs (i.e. the name=systemd hierarchy) is writable.

(I mean, cgroupsv1 and non-userns containers are not safe anyway, so you are just closing one gaping hole while leaving many others open, but of course, this is your choice.)

> Is there a way to run systemd's user service without it having the system
> systemd service as a parent?

This is not supported, sorry.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)
On Mo, 28.10.19 09:47, Matteo Guglielmi (matteo.guglie...@dalco.ch) wrote:

> almost 20% of the time I get a kernel panic error
> due to a bunch of missing libraries.

A kernel panic? because of "missing libraries"? that doesn't sound right. The kernel doesn't need "libraries".

iirc it's totally fine to unmount the backing fs after you mounted the overlayfs, the file systems remain pinned in the background by the overlayfs.

> > How can I instruct systemd to avoid unmounting
> > /live/image (or postpone it to a later moment)?

You can extend the .mount unit file for /live/image and add an explicit dep: i.e. create /etc/systemd/system/live-image.mount.d/50-my-drop-in.conf, then add:

[Unit]
After=some-other.mount

You get the idea...

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] is the watchdog useful?
On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote:

> In principle, the watchdog for services is nice. But in practice it seems
> to bring only grief. The Fedora bugtracker is full of automated reports of
> ABRTs, and of those that were fired by the watchdog, pretty much 100% are
> bogus, in the sense that the machine was resource-starved and the watchdog
> fired.
>
> There are a few downsides to the watchdog killing the service:
> 1. if it is something like logind, it is possible that it will cause
>    user-visible failure of other services
> 2. restarting the service causes additional load on the machine
> 3. coredump handling causes additional load on the machine, quite significant
> 4. those failures are reported in bugtrackers and waste everyone's time.
>
> I had the following ideas:
> 1. disable coredumps for watchdog abrts: systemd could set some flag
>    on the unit or otherwise notify systemd-coredump about this, and it
>    could just log the occurrence but not dump the core file.
> 2. generally disable watchdogs and make them opt-in. We have
>    'systemd-analyze service-watchdogs', and we could make the default
>    configurable to "yes|no".
>
> What do you think?

Isn't this more a reason to substantially increase the watchdog interval by default? i.e. 30min if needed?

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] nspawn and ovs bridges
On Mi, 23.10.19 20:09, Michał Zegan (webczat_...@poczta.onet.pl) wrote:

> Hello,
> my use case is the following: make a test of routing protocols without
> having... enough real hardware. I decided to do that via containers
> using systemd-nspawn, and because I may need many interconnected
> networks and things like qos settings applied without dirty scripts, I
> decided to try openvswitch for bridge management.
> The problem here is that systemd-nspawn does not really support adding
> the created veth interface to the ovs bridge, even for the so-called
> fake bridge, because it says "operation not supported". The same happens
> if I try to do ip link set iface master fakebridge.
> How do I deal with that situation correctly? Any ideas?

Uh. Some network device types might need some special calls to migrate them to a network namespace (wifi too?), and so far nobody sat down and did the necessary work to make that happen. It's a matter of debugging, researching and then probably making a minor fix to nspawn, to do the migration for you.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] How-to for systemd user timers instead of cron/crontab?
On Do, 17.10.19 10:58, Paul Menzel (pmenzel+systemd-de...@molgen.mpg.de) wrote:

> Dear systemd folks,
>
> I couldn’t find simple documentation for “normal” users on how
> to use systemd timers instead of cron/crontab. The Arch Wiki
> has a page [1], but I am afraid it’s still too complicated
> for our users.

There are no such docs afaik. It's not too different from system timer units, except you place them in ~/.config/systemd/user/*.timer. And a few other things are different, and a few things are not available...

Yes, it would be great to have more docs about this; maybe file an issue on github asking for that, but I wouldn't hold my breath, I fear...

Lennart

--
Lennart Poettering, Berlin
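As a minimal illustration of the layout Lennart describes, a user-level replacement for a daily cron job could consist of two files under ~/.config/systemd/user/; the names and the echo payload here are invented for the example:

```ini
# ~/.config/systemd/user/daily-job.service
[Unit]
Description=Example daily job

[Service]
Type=oneshot
ExecStart=/bin/echo "daily job ran"

# ~/.config/systemd/user/daily-job.timer
[Unit]
Description=Run daily-job once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

After `systemctl --user daemon-reload`, enable it with `systemctl --user enable --now daily-job.timer`. `Persistent=true` catches up on activations missed while the user manager wasn't running, roughly matching anacron behavior.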
Re: [systemd-devel] Mutually exclusive (timer-triggered) services
On Mo, 14.10.19 18:30, Alexander Koch (m...@alexanderkoch.net) wrote:

> * flock leaves the lock file behind, so you'd need some type of cleanup in
>   case you really want the jobs to be trace-free. This is not as trivial as
>   it might seem, e.g. you cannot do it from the service units themselves in
>   `ExecStartPost=` or similar.

Linux supports BSD and POSIX locks on all kinds of fs objects, including dirs. It should be possible to just lock the cache dir or so of pacman (i.e. an fs object that exists anyway), no need to introduce a new lock file.

Lennart

--
Lennart Poettering, Berlin
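Lennart's suggestion of locking an existing directory can be demonstrated with flock(1); the path below stands in for pacman's cache dir and is purely illustrative:

```shell
# Serialize two housekeeping jobs on an existing directory: flock(1)
# takes a BSD lock on the directory itself, so no extra lock file is
# left behind. /tmp/pkgcache-demo stands in for pacman's cache dir.
mkdir -p /tmp/pkgcache-demo
rm -f /tmp/pkgcache-demo/log

# Whichever job grabs the lock second blocks until the first releases it.
flock /tmp/pkgcache-demo -c 'echo sync >> /tmp/pkgcache-demo/log; sleep 0.2' &
flock /tmp/pkgcache-demo -c 'echo cleanup >> /tmp/pkgcache-demo/log' &
wait

cat /tmp/pkgcache-demo/log   # both jobs ran, one at a time
```

In the service units this becomes e.g. `ExecStart=/usr/bin/flock /var/cache/pacman/pkg /usr/bin/pacman -Sy` (cache path assumed); flock blocks by default, so the timer-triggered jobs simply queue up on the directory.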
Re: [systemd-devel] Mutually exclusive (timer-triggered) services
On Mo, 14.10.19 12:45, Alexander Koch (m...@alexanderkoch.net) wrote:

> Dear [systemd-devel],
>
> imagine you've got multiple services that perform system housekeeping
> tasks, all triggered by .timer units. These services all happen to use
> a specific resource (e.g. the system package manager) so they must not
> be run in parallel, but they all need to be run.
>
> Is there a systemd'ish way of modeling this?
>
> I first thought of using `Conflicts=` but having read the manpages I
> understand that queueing one of the services would actively stop any
> running instance of any of the others.
>
> `After=` is not an option either as that (unless 'Type=oneshot', which
> isn't to be used for long-running tasks) doesn't delay up to completion
> but only to initialization. Furthermore I think you'd run into trouble
> ordering more than two units using this approach.
>
> Ideally I'd think of something like a 'virtual resource' that can be
> specified by service units, like this (real use case on Arch Linux):
>
> [Unit]
> Description=Pacman sync
> Locks=pacman-db
>
> [Service]
> ExecStart=/usr/bin/pacman -Sy
>
> [Unit]
> Description=Pacman cleanup
> Locks=pacman-db
>
> [Service]
> ExecStart=/usr/bin/paccache -r -k 0 -u
>
> The value of `Locks=` shall be an arbitrary string describing the
> virtual resource the service requires exclusive access to. systemd
> would then delay the start of a unit if there is another unit with an
> identical `Locks=` entry currently active.
>
> A nice advantage of this concept is that services depending on the same
> virtual resource would not need to know of each other, which simplifies
> shipping them via separate packages.
>
> Any thoughts on this? Maybe I'm just too blind to see the obvious
> solution to this simple problem.

I presume pacman uses file system locks anyway, no? I think the best approach would be to make those (optionally) blocking directly in pacman, no?
I mean, you can add five layers of locking on top, but ultimately it appears to me that in this case you just want to make the locking that already exists blocking. Linux fs locking exists in non-blocking *and* blocking flavours anyway; it's just a matter of making pacman expose that, maybe with a new --block switch or so.

Other than that, you can of course use tools such as "flock(1)" around pacman.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] How to control the login prompt from my application service unit file?
On Di, 15.10.19 04:15, Moji, Shashidhar (shashidhar.m...@dellteam.com) wrote:

> Hi,
> We have a VMware vApp based solution. Our application gets installed
> during first boot. Till now we had a SLES11 OS based VM and we upgraded
> to SLES12. Now we have systemd instead of init scripts for service
> handling. In SLES11, we had a service dependency configured in the init
> scripts that held back the login prompt until our application
> installation was done. But in SLES12, we get the login prompt before our
> application is installed.
>
> How do we hold back the login prompt until our application installation
> is complete? We tried adding Before=getty@.service in our application
> install unit file, but it's not helping.

getty@.service is just a template for a unit, not a unit itself. Thus you cannot have a dependency on it as a whole. You have two options:

1. You can add a drop-in getty@.service.d/foobar.conf, i.e. extend the
   getty@.service file that all VT gettys are instantiated from. In there,
   just place:

   [Unit]
   After=…

2. Order your unit before systemd-user-sessions.service. All gettys and
   other logins order themselves after that service, so if you order yours
   before it you get the behaviour you are looking for.

The first option is nicer, since it's more specific to a getty type, while the latter applies to all logins, including SSH or graphical.

Lennart

--
Lennart Poettering, Berlin
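Option 1 could look like this as a concrete drop-in; the file name and the myapp-install.service unit are invented for the example:

```ini
# /etc/systemd/system/getty@.service.d/wait-for-install.conf
[Unit]
# hold every VT getty back until the (hypothetical) installer unit is done
After=myapp-install.service
```

Run `systemctl daemon-reload` after creating the drop-in. Note that After= only orders the units; something still has to pull myapp-install.service into the boot transaction, e.g. WantedBy=multi-user.target in its [Install] section.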
Re: [systemd-devel] Mount units with After=autofs.service cause ordering cycles
On Mo, 14.10.19 16:23, John Florian (j...@doubledog.org) wrote:

> So, I much prefer the expressiveness of systemd's mount units to the naive
> era of /etc/fstab, but I've found one situation where I seem to always get
> stuck and am never able to find a reliable solution that survives OS
> (Fedora & CentOS) updates. I have an NFS filesystem mounted by autofs at
> /pub that needs to be bind mounted in various places such as /var/www/pub
> and /var/ftp/pub. So I create a unit that looks like:
>
> ~~~
> # /etc/systemd/system/var-www-pub.mount
> [Unit]
> Description=mount /pub served via httpd
> Requires=autofs.service
> After=autofs.service
>
> [Mount]
> What=/mnt/pub
> Where=/var/www/pub
> Options=bind,context=system_u:object_r:httpd_sys_content_t
>
> [Install]
> WantedBy=multi-user.target
> ~~~
>
> The above worked for a long time, but once again a `dnf upgrade` seems to
> have broken things, because now I have an ordering cycle that systemd must
> break. Since I haven't changed my mount units, my ability to mesh with
> those shipped by the OS proves fragile. I'm deliberately avoiding too much
> detail here because it would seem that there should be a relatively simple
> solution to this general sort of task -- I just can't seem to discover it.
> Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually dumped along with the log message.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c
On Mo, 14.10.19 16:27, Jonas Meurer (jo...@freesources.org) wrote: > Yeah, something like that was my hope as well: use plymouth and > framebuffer or something alike for spawning the passphrase prompt. But > I'm not sure yet how to ensure that we change to the passphrase prompt > (or overlay the graphical desktop environment). > > Another idea that came into my mind: spawn the passphrase prompt > *before* system suspend, just like it's apparently done with the > screenlock right now. > > The passphrase prompt could write to a fifo pipe or talk to a small > daemon that waits for the luks passphrase(s) to be entered. Paging doesn't allow that really. It's always ugly. You'd have to have your own UI stack in the initrd, i.e. basically have an alternative root disk that possesses the screen exclusively as long as the system is up but not unlocked yet. So most likely a comprehensive approach would be: in systemd-suspend.service pass control to a binary in the initrd that runs in its own fs namespace with only tmpfs and api vfs visible, which includes plymouth and so on. It then switches to a new VT, does plymouth there, then suspends, on coming back lets plymouth ask its question and then unlocks the disk. And maybe even uses the cgroup freezer to freeze all processes on the host (i.e. everything except the stuff run from the initrd) before suspend, and thaws them only after the password has been entered again, so that the whole OS remains frozen and doesn't partially get woken up but hang on the root disk, because typing in the pw might take a long time... But even that is very ugly for various reasons. For example, CLOCK_MONOTONIC will not be paused while the host remains frozen. Thus watchdog events will be missed (actual system suspend pauses CLOCK_MONOTONIC, which makes this safe for it), and then your system is hosed.
Moreover, your initrd main process will be a child of a frozen process (as PID 1 is from the host), and this means you have to be very very careful with what you do, since you then cannot rely on some of the most basic functions of the OS. For example, PID 1 normally reaps processes which get reparented to it. Thus in your initrd you should be very careful never to have processes die while they have children, as those will accumulate as unreaped children of PID 1 then... One can ignore issues like that, but they are frickin ugly. > >> They might not be 100% available from just memory. What happens > >> if the DE needs to load assets (fonts, .ui files) for the > >> passphrase prompt from disk? (Actually, do any GPU drivers need > >> to load firmware from /lib on resume?) > >> > > > > In Ubuntu, casper component, we work around it by reading the files to > > ensure they are in the fscache, and then if one force unmounts the > > filesystem underneath them (cdrom eject) plymouth can still "read" > > fonts and display late boot messages. So there are ways of doing this. > > Again, the simplest solution would be to spawn the passphrase prompt > *before* suspend, to ensure that all required components are already in > memory. Or do you see caveats? Programs are memory mapped on Linux, i.e. nominally on disk, with only the bits paged in as they are used and executed. Similarly, data files are typically memory mapped too. This means that preparing anything in advance is not that easy, you have to lock it into RAM too. Which you can do, but it doesn't really scale, since our dep trees are large and fonts/media files in particular are. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c
On Do, 10.10.19 17:22, Jonas Meurer (jo...@freesources.org) wrote: > >> systemd-homed maintains only the home directory via LUKS encryption, > >> and leaves the OS itself unencrypted (under the assumption it's > >> protected differently, for example via verity – if immutable — or via > >> encryption bound to the TPM), and uses the passphrase only for > >> home. This means the whole UI stack to prompt the user is around > >> without problems, and the problem gets much much easier. > >> > >> So what's your story on the UI stack? Do you intend to actually copy > >> the full UI stack into the ramdisk? If not, what do you intend to do > >> instead? > > As Tim already wrote, the UI stack was not our focus so far. But I > agree that it's a valid concern. My silent hope was to find a solution > for a simple password prompt that can be overlaid over whatever > graphical stack is running on the system. But we haven't looked into it > yet, so it might well be impossible to do something like this. > > But since the graphical interface is running already, I doubt that we > would have to copy the whole stack into the ramfs. We certainly need to > take care of all *new* dependencies that a password prompt application > pulls in, but the wayland/x11/gnome basics should just be there, as they > have been in use just before the suspend started, no? No. During suspend it's likely the kernel flushes caches. This means GNOME tools previously in memory might not be anymore and would have to be paged in again when they are executed again. But that's going to hang if your root disk is paused. > > [...] While it would be great to make the suspension as smooth as > > possible, I think there is also a place for people who *really* want a > > whole encrypted disk during suspend and are okay to jump through a few > > hoops for that. > > Let me stress this aspect a bit more: at the moment, full disk > encryption is the way to go if you want encryption at rest on your > laptop.
I applaud your efforts regarding systemd-homed, but they > probably will not be the default setup anytime soon. Especially not if > you talk about immutable or TPM-bound-encrypted rootfs. And > *unencrypted* rootfs clearly shouldn't be the way to go. Think about all > the sensitive stuff in /etc, /var/lib and even unencrypted swap > devices. I disagree. In my view, immutable+non-encrypted /usr is certainly the way to go. Not sure why you'd encrypt the OS if it's Open Source anyway. /var should be locked to the TPM, and $HOME bound to the user's credentials. System resources should not be protected by user credentials. And user resources not (at least not exclusively) by system credentials. > But the main point of your feedback probably is that we need a clear > vision how to technically solve the UI issue before you consider this a > valid candidate for systemd inclusion, right? Yes. > By the way, we discovered a possible deadlock between luksSuspend and > the final sync() in the Linux kernel system suspend implementation that > you most likely will discover with your luksSuspend systemd-homed as > well. We're currently working on getting a kernel patch accepted that > adds a run-time switch to disable the sync() in the kernel system > suspend implementation. Hmm, so far this all just worked for me, I didn't run into any trouble with suspending just $HOME. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c
On Do, 10.10.19 12:01, Tim Dittler (tim.ditt...@systemli.org) wrote: > > So what's your story on the UI stack? Do you intend to actually copy > > the full UI stack into the ramdisk? If not, what do you intend to do > > instead? > > > > Lennart > > Thank you for your feedback, Lennart. To be honest, the UX of the > operation has been a secondary concern for us so far. We're basically > exploring what is possible atm. Our current approach is to re-use the > initramfs which was used during boot before. This doesn't include > X11/wayland. While it would be great to make the suspension as smooth as > possible, I think there is also a place for people who *really* want a > whole encrypted disk during suspend and are okay to jump through a few > hoops for that. Well, but if you have no way to acquire the password you are in trouble. You have to think about the UX at some point. You'd have to rework systemd-suspend.service (and similar services) to transition to your initrd fully, then run systemd-sleep from there I figure, and then maybe have a drop-in in /usr/lib/systemd/system-sleep/ that unlocks the root fs. But it's not going to be nice if there's no UI support. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] Unexpected behaviour not noticed by systemctl command
On Mo, 07.10.19 11:43, Andy Pieters (syst...@andypieters.me.uk) wrote: > Hi guys > > Just lately I ran into a fumble. I was trying to stop and disable a > service and I typed in: > > systemctl stop --now example.service > > The service duly stopped but wasn't disabled because the --now switch > is only applicable to the disable/enable/mask commands > > However, shouldn't it be good practice to produce a warning or an > error when a switch is used that has no effect? We consider most switches "modifiers" that just modify behaviour slightly but not generally, and hence if they don't apply we simply ignore them. Maybe the "--now" switch is a bit more heavyweight than the others, and we should refuse it. Can you maybe file a github issue about this (maybe even prep a PR), and we'll look into it. > Do you think it would be worth me writing a bug report for it? Yes! Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] /cdrom mounted from initrd is stopped on boot, possibly confused about device-bound
On Mi, 09.10.19 14:28, Dimitri John Ledkov (x...@ubuntu.com) wrote: > Ubuntu installer images use an initrd, which has udevd but no systemd. > > It mounts /dev/sr0 as /root/cdrom, then pivots to /root, meaning > /root/cdrom becomes just /cdrom, and execs systemd as pid 1. > > At this point cdrom.mount is stopped as it's bound to an inactive > dev-sr0.device. Then sometime later dev-sr0.device becomes active, but > nothing remounts /cdrom back in. > > My question is why on startup, when processing cdrom.mount, it > determines that dev-sr0 is inactive, when clearly it's fully > operational (it contains media, media is locked, and is mounted, and > is serving content). > > I notice that SYSTEMD_MOUNT_DEVICE_BOUND is set to 1 on the udev > device, and it seems impossible to undo via mount unit. 60-cdrom_id.rules sets that. > > I also wonder why, initially, /dev/sr0 is inactive, but later becomes > active - as in what causes it to become active, and what is missing in > the initrd. When PID 1 initializes and udev is not running, no device is considered to be around. The devices only appear when they are triggered by systemd-udev-trigger.service for the first time. > Things appear to work if I specify in the 60-cdrom_id.rules > SYSTEMD_READY=1, then on boot there are no warning messages that > cdrom.mount is bound to an inactive device. > > Shouldn't 60-cdrom_id.rules set SYSTEMD_READY=1 if after importing > cdrom_id variables ID_CDROM_MEDIA is non-empty? Such that > dev-sr0.device initial state is correct, if one booted with cdrom > media in place. SYSTEMD_READY=1 doesn't do anything, it's SYSTEMD_READY=0 that has an effect. I.e. a device that lacks SYSTEMD_READY= at all is equivalent to SYSTEMD_READY=1. The only reason for setting the property is to turn the readiness off; a device is by default considered ready if the "systemd" udev tag is set.
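To illustrate the semantics described above in udev rule syntax, a hedged sketch (hypothetical, not the shipped 60-cdrom_id.rules): since only SYSTEMD_READY=0 has any effect and absence counts as ready, a rule would mark the device not-ready only while no medium is detected:

```
# Hypothetical sketch: flag the drive as not ready for systemd only
# while no medium is present; when media is in place at boot, the
# property is simply absent and the device counts as ready.
KERNEL=="sr[0-9]*", ENV{ID_CDROM_MEDIA}=="", ENV{SYSTEMD_READY}="0"
```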
Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] Journalctl --list-boots problem
On Di, 08.10.19 16:57, Martin Townsend (mtownsend1...@gmail.com) wrote: > Thanks for your help. In the end I just created a symlink from > /etc/machine-id to /data/etc/machine-id. It complains really early on > boot with > Cannot open /etc/machine-id: No such file or directory > > So I guess it's trying to read /etc/machine-id for something before > fstab has been processed and the data partition is ready. > > But the journal seems to be working ok and --list-boots is fine. The > initramfs would definitely be a more elegant solution to ensure > /etc/machine-id is ready. > > I don't suppose you know what requires /etc/machine-id so early in the boot? PID 1 does. You have to have a valid /etc/machine-id really, everything else is not supported. And it needs to be available when PID 1 initializes. You basically have three options: 1. Make it read-only at boot, initialize it persistently on OS install 2. Make it read-only, initialize it to an empty file on OS install, in which case systemd (i.e. PID 1) overmounts it with a random one during early boot. In this mode the system will come up with a new identity on each boot, and thus journal files from previous boots will be considered to belong to different systems. 2b. (Same as 2, but mount / writable during later boot, at which time the machine ID is committed to disk automatically) 3. Make it writable during early boot, and initialize it originally to an empty file. In this case PID 1 will generate a random one and persist it to disk right away. Also see: https://www.freedesktop.org/software/systemd/man/machine-id.html Lennart -- Lennart Poettering, Berlin
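Option 3 above can be sketched at image build time like this (the build-root variable is hypothetical, standing in for wherever the image rootfs is assembled):

```shell
# Sketch of option 3: ship an *empty* /etc/machine-id in the image so
# that PID 1 generates and persists a machine ID on first boot (this
# requires /etc to be writable during early boot). "$ROOT" is a
# hypothetical image build root, not a path from the thread.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc"
: > "$ROOT/etc/machine-id"   # present but empty -- not absent
```

The distinction matters: a *missing* file produces the "Cannot open /etc/machine-id" complaint quoted above, while an *empty* file signals PID 1 to provision the ID itself.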
Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?
On Di, 01.10.19 15:33, Colin Walters (walt...@verbum.org) wrote: > On Sun, Sep 29, 2019, at 6:08 AM, Lennart Poettering wrote: > > > i.e maybe write down a spec, that declares how to store settings > > shared between host OS, boot loader and early-boot kernel environment > > on systems that have no EFI NVRAM, and then we can make use of > > that. i.e. come up with semantics inspired by the boot loader spec for > > finding the boot partition to use, then define a couple of files in > > there for these params. > > I like the idea in general but it would mean there's no mechanism to > "roll back" to a previous configuration by default, which is a quite > important part of OSTree (and other similar systems). (Relatedly > this is also why ostree extends the BLS spec with an > atomically-swappable /boot/loader symlink, though I want to get away > from that eventually) Well, what I proposed is a file. OSTree can cover files on disk, no? > That said, maybe one thing we want regardless is a "safe mode" boot > that skips any OS customization and will get one booted enough to be > able to fix/retry for configuration like this. > > BTW related to EFI - as you know AWS doesn't support it, and we're > making a general purpose OS. Fedora isn't just about desktops, and > we need to be careful about doing anything in the OS that diverges > from the server side. (That said I only recently discovered that > GCP supports it as well as vTPMs, working on "blessing" our Fedora > CoreOS images to note they support it > https://github.com/coreos/mantle/pull/1060 ) I doubt on AWS you want to configure keymaps though, do you? Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?
On Mo, 07.10.19 10:32, Colin Guthrie (gm...@colin.guthr.ie) wrote: > Colin Walters wrote on 01/10/2019 20:33: > > On Sun, Sep 29, 2019, at 6:08 AM, Lennart Poettering wrote: > > > >> i.e maybe write down a spec, that declares how to store settings > >> shared between host OS, boot loader and early-boot kernel environment > >> on systems that have no EFI NVRAM, and then we can make use of > >> that. i.e. come up with semantics inspired by the boot loader spec for > >> finding the boot partition to use, then define a couple of files in > >> there for these params. > > > > I like the idea in general but it would mean there's no mechanism to "roll > > back" to a previous configuration by default, which is a quite important > > part of OSTree (and other similar systems). (Relatedly this is also why > > ostree extends the BLS spec with an atomically-swappable /boot/loader > > symlink, though I want to get away from that eventually) > > Just out of curiosity, when /boot is the EFI (as is recommended in the > BLS) how do you deal with symlinks when the FS is FAT based? Fedora doesn't do that. Fedora doesn't implement the boot loader spec. They implemented their own thing that is an interpreted macro language (!?), they just call it "BLS". It's why I myself never use the acronym "BLS", to avoid unnecessary confusion. Really, the Fedora thing is just a bad idea. The Fedora thing totally missed the idea that boot loader drop-ins are supposed to be dead-simple, trivially parsable and generatable static drop-in files, and just replicated the bad bad idea inherent to GRUB, which is that everything needs to be an (ideally Turing-complete) programming language again. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?
On Mo, 30.09.19 16:07, Hans de Goede (hdego...@redhat.com) wrote: > > So what you are arguing for is replacing the overlay initramfs > > with a key-value config file which gets used by both the bootloader > > and the OS. > > > > That is an interesting concept, esp. since it limits (as you advocate) > > what can be done in the overlay from "everything" to "set specific > > config variables to a value". > > > > So yes I can get behind this. > > While discussing this with Alberto an interesting problem came up. > > If we put this file in /boot/loader as you suggest, then the boot-loader > can get to it and use it to set its keymap (and in the future probably also > other stuff) but how does the localed in the initrd get to this > file? The boot loader could append it to the kernel cmdline for example. > I agree with you that having a generic mechanism to share config > between the OS and early-boot (so bootloader + initrd) is useful, > but are we then going to make the initrd mount /boot (or the ESP)? I wouldn't, no. Given that this is configuration that the boot loader is supposed to grok and parse, it could just pass it on on the kernel cmdline. This would also allow boot loaders to provide a menu-driven scheme for changing kbd layouts, which they then can sanely pass on to the initrd and OS. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?
On Mo, 30.09.19 13:23, Hans de Goede (hdego...@redhat.com) wrote: > > i.e. generating initrd images with cpio and so on is hacky, gluey, > > Linux-specific. If you just use plain simple, standardized config > > files at clearly defined locations, then reading and writing them is > > simple, they can be shared between all players, are not Linux specific > > and so on. I think systemd could certainly commit to updating those > > files then. > > This sounds interesting, although I'm not sure I like the one file > per setting approach; why not have a $BOOT/loader/config file which > has key=value pairs and kbdmap would be a key in that file? Currently there's $BOOT/loader/loader.conf which is read by sd-boot and private property of it, if you so will. We could probably open that up a bit, and make it part of the boot loader spec too. The format after all is pretty much the same semantically as the boot loader spec. > I'm afraid that will not work, some countries have multiple variants, > we actually have a bunch of Fedora bugs open about the disk unlock > support in plymouth and the "de-neo" keymap and there also are the > somewhat popular dvorak variants. Well, I am not sure we need to support more than /etc/vconsole.conf supports. Not in the initrd... > So we could do this as say a base setting but then we would need to add > a kbdmap_variant setting which when set makes the keymap loaded > $kbdmap-$variant.map (in Linux Console terms) I guess we could specify > that setting a variant this way is allowed, but that variant settings > are OS / bootloader specific and may be ignored? I am not sure we really need to configure a 100% fully featured keymap there, as long as the basics work in the initrd/boot loader we are fine... Multimedia keys and per-machine tweaks can happily happen in later boot. I'd go as far as /etc/vconsole.conf, but not further really.
Lennart -- Lennart Poettering, Berlin
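For reference, sd-boot's existing $BOOT/loader/loader.conf is already a flat key/value file; the kbdmap key discussed here would be a hypothetical extension of it:

```
# /boot/loader/loader.conf
# "default" and "timeout" are real sd-boot keys; "kbdmap" below is a
# hypothetical key sketching the idea from this thread, shown
# commented out because it is not implemented.
default  fedora-*
timeout  5
#kbdmap  de-neo
```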
Re: [systemd-devel] user slice changes for uid ranges
On Fr, 27.09.19 15:56, Stijn De Weirdt (stijn.dewei...@ugent.be) wrote: > hi all, > > i'm looking for an "easy" way to set resource limits on a group of users. > > we are lucky enough that this group of users is within a (although > large) high enough range, so a range of uids is ok for us. > > generating a user-.slice file for every user (or symlinking them or > whatever) looks a bit cumbersome, and probably not really performance > friendly if the range is in eg 100k (possible) uids. > > e.g. if this range was 100k-200k, i was more looking for a way to do > e.g. user-1X.slice or user-10:20.slice > > (i think this is different from/not covered by the templated/prefix user > slice patch > https://github.com/systemd/systemd/commit/5396624506e155c4bc10c0ee65b939600860ab67) I am not sure this helps you very much right now. But ultimately the plan is to allow resource limits to be configured in detail as part of each user record. This is implemented here already: https://github.com/poettering/systemd/commits/homed It hasn't been merged upstream yet, but hopefully will be soon. Lennart -- Lennart Poettering, Berlin
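One mechanism that does exist today (assuming a systemd new enough to support truncated-prefix drop-in directories, as in the commit linked above) is a drop-in under user-.slice.d; note it applies to *every* user slice rather than a uid range, so it only partially answers the question:

```
# /etc/systemd/system/user-.slice.d/50-limits.conf
# Applies to all user-$UID.slice units alike; per-uid-range limits
# are not expressible this way.
[Slice]
MemoryMax=1G
CPUQuota=50%
```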
Re: [systemd-devel] systemd-growfs blocks boot until completed
On Fr, 27.09.19 17:12, Mirza Krak (mi...@mkrak.org) wrote: > Den fre 27 sep. 2019 kl 15:23 skrev Lennart Poettering > : > > > > On Fr, 27.09.19 14:35, Mirza Krak (mi...@mkrak.org) wrote: > > > > > Hi, > > > > > > I have been using the systemd-growfs feature for a while, and been > > > happy with it so far. > > > > > > But recently I upgraded my distribution (custom based on Yocto) which > > > also upgraded systemd from 239 to 241, and I can see that there has > > > been a change in behavior of the "systemd-growfs" feature. > > > > > > In systemd 241, it blocks the boot process while it is growing the > > > filesystem, here is an example log: > > > > > > Mounting /data... > > > [ 10.693190] EXT4-fs (mmcblk0p4): mounted filesystem with ordered > > > data mode. Opts: (null) > > > [ OK ] Mounted /data. > > > Starting Grow File System on /data... > > > [ 10.780109] EXT4-fs (mmcblk0p4): resizing filesystem from 131072 to > > > 30773248 blocks > > > [**] A start job is running for Grow File System on /data (11s / > > > no limit) > > > [ *** ] A start job is running for Grow File System on /data (21s / > > > no limit) > > > [*** ] A start job is running for Grow File System on /data (30s / no > > > limit) > > > [*** ] A start job is running for Grow File System on /data (42s / no > > > limit) > > > [**] A start job is running for Grow File System on /data (52s / no > > > limit) > > > [**] A start job is running for Grow Fil…stem on /data (1min 2s / no > > > limit) > > > [ *** ] A start job is running for Grow Fil…tem on /data (1min 15s / no > > > limit) > > > [ *** ] A start job is running for Grow Fil…tem on /data (1min 26s / no > > > limit) > > > [ *** ] A start job is running for Grow Fil…tem on /data (1min 36s / no > > > limit) > > > [ *** ] A start job is running for Grow Fil…tem on /data (1min 46s / no > > > limit) > > > [ ***] A start job is running for Grow Fil…tem on /data (1min 56s / no > > > limit) > > > [**] A start job is running for Grow Fil…stem on /data (2min 6s / no > > > limit) > > > [**] A start job is running for Grow Fil…tem on /data (2min 17s / no > > > limit) > > > [* ] A start job is running for Grow Fil…tem on /data (2min 27s / no > > > limit) > > > [ *** ] A start job is running for Grow Fil…tem on /data (2min 35s / no > > > limit) > > > > > > In the previous version (239), this occurred in the background and did > > > not obstruct the boot process in any noticeable way. Which matched my > > > expectations on how this feature would work. > > > > > > So my question is, was the change intentional and if so, what was the > > > reasoning? > > > > Hmm, the tool doesn't do much. It just calls an fs ioctl. If you > > attach gdb to the process (or strace it), can you see what it is > > blocking on? > > It seems that the ioctl operation is blocking until the resize is completed, > > openat(AT_FDCWD, "/data", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3 > openat(AT_FDCWD, "/dev/block/179:4", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 4 > ioctl(4, BLKBSZGET, [1024]) = 0 > ioctl(4, BLKGETSIZE64, [31511805952]) = 0 > fstatfs64(3, 88, 0x7eb56bf0)= 0 > ioctl(3, _IOC(_IOC_WRITE, 0x66, 0x10, 0x8), 0x7eb56be > > I would like to clarify that it eventually will complete (after 5 > minutes on my device), and then the boot proceeds as normal. The ioctl > behavior has not changed, as it was blocking in the kernel in my > previous distribution version as well, but the > systemd-growfs@data.service did not block the boot on systemd 239, > where this was performed in parallel, but now it seems to block.
> > The Linux kernel is: 4.19.71 > > This is what the systemd-growfs@.service looks like: > > # Automatically generated by systemd-fstab-generator > [Unit] > Description=Grow File System on %f > Documentation=man:systemd-growfs@.service(8) > DefaultDependencies=no > BindsTo=%i.mount > Conflicts=shutdown.target > After=%i.mount > Before=shutdown.target local-fs.target > > [Service] > Type=oneshot > RemainAfterExit=yes > ExecStart=/lib/systemd/systemd-growfs /data > TimeoutSec=0 Hmm, interesting. I wasn't aware of this change in behaviour. Most likely we should make this configurable, given that both behaviours might be desirable. Can you file an RFE bug on github that asks for this to be made configurable? Thanks, Lennart -- Lennart Poettering, Berlin
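For context, generated systemd-growfs@.service units like the one quoted above come from an fstab entry carrying the x-systemd.growfs option, roughly like this (device and mount point taken from the log above; the rest of the line is an assumed example):

```
# /etc/fstab
/dev/mmcblk0p4  /data  ext4  defaults,x-systemd.growfs  0  2
```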
Re: [systemd-devel] watchdog and sd_notify between host and container systemd etc inits
On Do, 26.09.19 07:24, mikko.rap...@bmw.de (mikko.rap...@bmw.de) wrote: > Hi, > > I'd like to run systemd as init on the host but run various containers, and some of them > with their own container side systemd init. > > Then I'd like to have sd_notify and watchdog available to check the health of the > systemd init in the container. I trust the init in the container to check the health > of all the services and processes running there. > > If the systemd init in the container fails to respond to the watchdog, then I'd like to > restart only the container, not the whole host system. > > For the container systemd watchdog, I've proposed a patch: > > https://github.com/systemd/systemd/pull/13643 > > Comments on the PR mention that sd_notify support would be better, but AFAIK it uses > the PID of processes and thus doesn't work with another systemd init as PID 1 in > the container PID namespace. > > Thus we invented a simple fifo between the host init and container init where > the container writes MACHINE and HOSTNAME as a watchdog ping. This works well with a > custom watchdog manager on the host and systemd init in an LXC container. > > These don't seem to fit very well into systemd, and we'd also like to know sd_notify type > things like when the container is in running state, which systemd-nspawn does > provide, but I have use cases also for LXC containers... > > So, could you provide some ideas and/or feedback how this kind of functionality > could/should be implemented with systemd? I would normally assume that it's the job of the container manager to watchdog/supervise its container. And the host systemd to supervise the container manager in turn. I.e. it should be PID 1 on the host that gets sd_notify() keep-alive messages from your container manager, ensuring it's alive. And the container manager should get them from PID 1 in the container, and ensure it remains alive. And PID 1 in the container payload gets the messages from the services below it. I.e.
instead of forwarding these messages across these boundaries, just have a clear 1:1 supervisor-client relationship. Or in other words: try to convince the LXC maintainers to send out sd_notify() watchdog messages from LXC's own supervisor, and optionally expect them from the payload PID 1. systemd-nspawn at least partly works that way already. Lennart -- Lennart Poettering, Berlin
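On the host side, the supervision chain described above could be wired up like this, assuming a hypothetical container-manager binary that sends sd_notify() WATCHDOG=1 keep-alives itself:

```
# /etc/systemd/system/my-container.service (hypothetical unit)
[Service]
Type=notify
NotifyAccess=main
# Host PID 1 restarts the manager (and with it the container) when
# the WATCHDOG=1 keep-alive messages stop arriving in time.
WatchdogSec=30s
Restart=on-watchdog
ExecStart=/usr/local/bin/my-container-manager
```

The manager in turn supervises the container's PID 1, completing the 1:1 chain without forwarding messages across namespace boundaries.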
Re: [systemd-devel] How to only allow service to start when network-online.target is met?
On Di, 17.09.19 15:44, Dom Rodriguez (shym...@shymega.org.uk) wrote: > Hello, > > I've got a service unit, which is linked to a timer unit, and I'd like to have > the behaviour associated with `Condition=` directives, but for targets. To > explain further, my desired expectation is for the service unit to *only* start > when the target is active. Ideally, I don't want the service unit to fail > either, but have similar behaviour to `Condition=`, which doesn't mark the > unit as failed, but merely as not meeting condition(s). > > Is this possible with systemd? Requisite= can do this, but it causes the depending unit to fail if the depended-on unit is not active yet or still in the process of being activated. Lennart -- Lennart Poettering, Berlin
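A sketch of the timer-activated service using Requisite= (unit and binary names are hypothetical):

```
# /etc/systemd/system/my-task.service (hypothetical)
[Unit]
# Fail immediately instead of pulling in the target if it is not
# active when the timer fires:
Requisite=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/my-task
```

Note the failure-on-inactive behaviour is exactly the caveat mentioned in the reply; Requisite= gates on the target but, unlike Condition=, marks the unit as failed.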
Re: [systemd-devel] sdbus_event loop state mark as volatile?
On Do, 05.09.19 10:46, Stephen Hemminger (step...@networkplumber.org) wrote: > The libsystemd bus event loop is: > > > while (e->state != SD_EVENT_FINISHED) { > r = sd_event_run(e, (uint64_t) -1); > > But since e->state is changed by another thread it > should be marked volatile to avoid the compiler thinking > the state doesn't get changed. None of systemd's libraries are thread safe. They are written in a threads-aware style though. This means you should only use a specific context object from a single thread at a time, and need to do your own locking around it if which thread that is changes over time. systemd doesn't keep global state generally, which means doing your own locking around the sd_xyz objects should suffice and work reasonably well. Lennart -- Lennart Poettering, Berlin
Re: [systemd-devel] exceeding match limit via sd_bus_add_match
On Mo, 09.09.19 11:42, Ivan Mikhaylov (i.mikhay...@yadro.com) wrote: > I have a system with a lot of sdbus properties which have to be 'match'ed. After > reaching some match limit I'm getting -105 (ENOBUFS) on a regular basis. The > -105 (ENOBUFS) represents exceeding 'some limit', according to the doc. > > In the manpage for sd_bus_add_match() there is no helpful information about > possible reasons for this over-the-limit case. I'm trying to figure it out from > the systemd code, with little success so far. > > What is the limit and where can I tweak it? It's generally the daemon that puts a limit on this, not the client. I.e. you need to consult the dbus-daemon/dbus-broker configuration for the limit. Usually, instead of having many fine-grained matches it's more efficient to have a few broader ones. I.e. instead of matching each property individually, consider matching the whole interface or so. Lennart -- Lennart Poettering, Berlin
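As an illustration of the broader-match suggestion, a single PropertiesChanged match per object (the object path here is a hypothetical example) replaces one match per property, which keeps the daemon-side match count low:

```
type='signal',interface='org.freedesktop.DBus.Properties',member='PropertiesChanged',path='/org/example/Device0'
```

The handler then inspects the changed-properties dictionary in the signal body and picks out the properties it cares about.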
Re: [systemd-devel] Watchdog problem
On Sa, 07.09.19 15:11, Mikael Djurfeldt (mik...@djurfeldt.com) wrote: > Hi, > > I couldn't figure out a better place to ask this question. Please point me > to another place if you have a better idea. (Perhaps I should bring it up > with the VirtualBox developers instead?) > > I run Debian buster (with some upgraded packages) as a VirtualBox guest > with a Windows host. > > When the host has gone to sleep and wakes up again, I get logged out from > my gdm session. It starts out like this: > > Sep 7 13:23:58 hat kernel: [82210.177399] 11:23:58.337557 timesync > vgsvcTimeSyncWorker: Radical guest time change: 2 491 379 233 000ns > (GuestNow=1 567 855 438 281 378 000 ns GuestLast=1 567 852 946 902 145 000 > ns fSetTimeLastLoop=false) > Sep 7 13:23:59 hat systemd[1]: systemd-logind.service: Watchdog timeout > (limit 3min)! > Sep 7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: (EE) > Sep 7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: Fatal server error: > Sep 7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: (EE) systemd-logind > disappeared (stopped/restarted?) > > Can I fix this by setting systemd-logind.service WatchdogSec to something > else? What should I set it to in order to disable the watchdog? I tried to find > documentation for WatchdogSec but failed. Can you please point me to the > proper documentation? This looks like a kernel/VirtualBox issue. We use CLOCK_MONOTONIC to determine when the last keep-alive message was received. Unlike CLOCK_REALTIME this means the clock stops while the system is suspended. If VirtualBox doesn't get this right, please report this as an issue to VirtualBox. Lennart -- Lennart Poettering, Berlin ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
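[Editor's note] To answer the documentation question: WatchdogSec= is described in systemd.service(5), and setting it to 0 disables the per-service watchdog logic. An untested drop-in sketch, for experimentation only, since per Lennart's answer the real bug is the hypervisor's clock handling:

```ini
# /etc/systemd/system/systemd-logind.service.d/override.conf
# (e.g. created via "systemctl edit systemd-logind.service")
[Service]
WatchdogSec=0
```

Apply with `systemctl daemon-reload` followed by a restart of the service.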
Re: [systemd-devel] set-property CPUAffinity
On Di, 03.09.19 19:49, Alexey Perevalov (a.pereva...@samsung.com) wrote: > Hello Michal, > > Thank you for the response! > > On 9/3/19 6:03 PM, Michal Koutný wrote: > > Hello Alexey. > > > > On Fri, Aug 30, 2019 at 01:21:50PM +0300, Alexey Perevalov > > wrote: > > > [...] > > > The question is: changing the CPUAffinity property (cpuset.cpus) is not yet > > > allowed in the systemd API, right? Is it planned? > > Note that CPUAffinity= uses the mechanism of sched_setaffinity(2), which > > is different from using cpuset controller restrictions (that's also why > > you find it in `man systemd.exec` and not in `man > > systemd.resource-control`). > > > > IMO, systemd may eventually support the cpuset controller with a > > different directive. > > Does it mean the community is open to enhancement in this direction? > > It looks like current work on issues and enhancements is done on GitHub, > so I can create an RFE. The old cgroupsv1 cpuset was a horrible, broken interface, which is why we are not supporting it. On cgroupsv2 things are better, and since 047f5d63d7a1ab75073f8485e2f9b550d25b0772 we now have support for it in systemd. Lennart -- Lennart Poettering, Berlin ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
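[Editor's note] The commit Lennart references added cgroup-v2 cpuset support to systemd's resource-control directives. A sketch, assuming systemd >= 244 on the unified cgroup hierarchy (the unit name is made up); contrast with CPUAffinity=, which goes through sched_setaffinity(2) per process rather than the cgroup:

```ini
# Drop-in for some.service: pin the unit's cgroup to CPUs 0-3
# via the cgroup v2 cpuset controller.
[Service]
AllowedCPUs=0-3
```

AllowedCPUs= (and its companion AllowedMemoryNodes=) is documented in systemd.resource-control(5).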
Re: [systemd-devel] DynamicUser shared by service instances
On Mo, 02.09.19 18:37, sqwishy (someb...@froghat.ca) wrote: > Hi. > > I was looking at how dynamic users are implemented and noticed that instances > seem to > share one dynamic user within their service. In the example below, I have an > attached > portable service with StateDirectory=derp-%i > > # ls -dn /var/lib/private/derp-{foo,bar} > drwxr-xr-x 2 64000 64000 4096 Sep 2 17:59 /var/lib/private/derp-bar/ > drwxr-xr-x 2 63000 63000 4096 Sep 2 17:59 /var/lib/private/derp-foo/ > > # systemctl start f30-derp@{foo,bar} > > # ls -dn /var/lib/private/derp-{foo,bar} > drwxr-xr-x 2 63000 63000 4096 Sep 2 17:59 /var/lib/private/derp-bar/ > drwxr-xr-x 2 63000 63000 4096 Sep 2 17:59 /var/lib/private/derp-foo/ > > # ls -l /run/systemd/dynamic-uid/ > total 4 > -rw--- 1 root root 9 Sep 2 18:12 63000 > lrwxrwxrwx 1 root root 8 Sep 2 18:12 direct:63000 -> f30-derp > lrwxrwxrwx 1 root root 5 Sep 2 18:12 direct:f30-derp -> 63000 > > Normally the state directories are created under the same owner; I set > different owners explicitly to show that the second instance's directory is chowned. > > I guess I'm wondering whether this behaviour is intentional? I found it surprising, > but that might just be me. You can pick the name for the DynamicUser= via User=. What did you set it to? By default it's derived from the unit name. If two units specify the same name they get the same user. Lennart -- Lennart Poettering, Berlin ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
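[Editor's note] A corollary of Lennart's answer: a template can give each instance its own dynamic user by deriving User= from the instance name via the %i specifier. An untested sketch for the template in the thread (the `derp-` prefix is taken from the example above, the rest is assumption):

```ini
# f30-derp@.service (fragment)
[Service]
DynamicUser=yes
User=derp-%i
StateDirectory=derp-%i
```

With this, `f30-derp@foo` and `f30-derp@bar` request the names `derp-foo` and `derp-bar` and so are allocated distinct dynamic UIDs.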
Re: [systemd-devel] systemd put it under a custom slice
On Mi, 30.10.19 11:08, Bhasker C V (bhas...@unixindia.com) wrote: > Hi all, > > I have been googling for a few days now but could not stumble upon the > solution I am looking for. > > Apologies if this is a noob question. > > Is there a way to use custom slices with my systemd-nspawn container? > > I see that the systemd-nspawn man page says to use --slice=, but any such > cgroup created is not accepted by this option (I don't know how to create > a slice externally from systemd unit files) > > $ sudo cgcreate -g freezer,memory:/test This is not supported. systemd owns the cgroup tree; only subtrees for which delegation is explicitly turned on can be managed by other programs, for example for the purpose of container managers. Thus, creating cgroups manually, directly via cgcreate at the top of the tree, is explicitly not supported. Use systemd's own concepts instead, i.e. slice units. Lennart -- Lennart Poettering, Berlin ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
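[Editor's note] Concretely, the systemd-native route is a slice unit plus --slice=. An untested sketch; the unit name and resource limits are made up:

```ini
# /etc/systemd/system/machines-test.slice
[Unit]
Description=Test slice for nspawn containers

[Slice]
MemoryMax=1G
CPUWeight=50
```

After `systemctl daemon-reload`, something like `systemd-nspawn --slice=machines-test.slice -D /path/to/rootfs` should place the container under that slice, with systemd rather than cgcreate managing the cgroup.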
[systemd-devel] Problems with DNS resolution in German Rail WiFi
Dear systemd folks, For over half a year, I have been having problems with the German Rail WiFi (Deutsche Bahn, WIFIonICE) [1]. I only have problems with the WiFi in the train; the WiFi at the train stations (operated by Deutsche Telekom(?)) works fine. I am able to connect to the WiFi network, but when accessing the captive portal page to log in, and after logging in, I have serious DNS problems in the browsers (Mozilla Firefox 70.0 and Google Chromium). I am using Debian Sid/unstable. It looks like DNS requests are not answered in time (also confirmed by the developer tools in the browser). SSH (and even Mozilla Thunderbird) seems to work better. The fellow train travelers do not seem to have any problems. Testing on the console shows the following (the German error message translates to "The connection timed out"):

```
$ time host bahn.de
bahn.de has address 18.185.205.203
bahn.de has address 35.157.56.133
bahn.de has address 35.158.56.207
bahn.de mail is handled by 10 mailgate2.deutschebahn.com.
bahn.de mail is handled by 10 mailgate1.deutschebahn.com.

real    0m0,243s
user    0m0,021s
sys     0m0,000s

$ systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist abgelaufen

$ time systemd-resolve bahn.de
bahn.de: resolve call failed: DNSSEC validation failed: failed-auxiliary

real    0m55,967s
user    0m0,006s
sys     0m0,006s

$ time systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist abgelaufen

real    2m0,094s
user    0m0,005s
sys     0m0,007s

$ time systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist abgelaufen

real    2m0,113s
user    0m0,014s
sys     0m0,000s
```

How can this be debugged (the next time I am on the ICE)? Is this systemd-resolved related? If not, who should I bug about it? Kind regards, Paul [1]: https://www.bahn.de/p/view/service/zug/railnet_ice_bahnhof.shtml ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/systemd-devel
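[Editor's note] Not part of the thread, but the transcript itself suggests a starting point: one failure reads "DNSSEC validation failed: failed-auxiliary", and captive-portal DNS servers that mangle DNSSEC are a common cause of exactly this symptom. As a diagnostic only (not a recommendation), DNSSEC validation can be switched off temporarily in resolved.conf:

```ini
# /etc/systemd/resolved.conf (fragment) - diagnostic only:
# if resolution recovers with this, the portal's DNS is breaking DNSSEC.
[Resolve]
DNSSEC=no
```

Apply with `systemctl restart systemd-resolved`; `systemd-resolve --status` (or `resolvectl status` on newer systemd) then shows the per-link DNS configuration actually in use.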