[systemd-devel] udev can fail to read stdout of processes spawned in udev_event_spawn

2019-10-31 Thread Paul Davey
Hi,

In tracking down an issue we are having with usb-modeswitch I found
that the root cause is an issue in udev: when a rule sets
PROGRAM= and uses the result, it will sometimes receive an empty
result even though the program did produce output.

This appears to be because the on_spawn_sigchld handler used
by spawn_wait does not check whether there is output left in the stdout
pipe when the program exits, so there is a race between this event and
the io event that reads the output.

What is the best way to fix this issue?  I have locally had success
just calling the on_spawn_io callback in the process success branch of
on_spawn_sigchld, but I am unsure if this is an acceptable fix.
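
For what it's worth, the underlying assumption (that data written to a
pipe remains readable after the writer has exited) can be checked with a
trivial bash illustration; this is not udev code, just a sketch:

    # the writer exits immediately; we only read long after its SIGCHLD
    # would have been delivered -- the data is still in the pipe, so
    # draining stdout from the exit handler should be safe
    exec 3< <(echo "some output")   # writer exits right away
    sleep 1                         # simulate the handler running late
    cat <&3                         # still prints "some output"
    exec 3<&-                       # close the read end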

Thanks,
Paul

Re: [systemd-devel] Journalctl --list-boots problem

2019-10-31 Thread Martin Townsend
On Thu, Oct 31, 2019 at 4:34 PM Lennart Poettering wrote:

> On Di, 08.10.19 16:57, Martin Townsend (mtownsend1...@gmail.com) wrote:
>
> > Thanks for your help.  In the end I just created a symlink from
> > /etc/machine-id to /data/etc/machine-id.  It complains really early on
> > boot with
> > Cannot open /etc/machine-id: No such file or directory
> >
> > So I guess it's trying to read /etc/machine-id for something before
> > fstab has been processed and the data partition is ready.
> >
> > But the journal seems to be working ok and --list-boots is fine.  The
> > initramfs would definitely be a more elegant solution to ensure
> > /etc/machine-id is ready.
> >
> > I don't suppose you know what requires /etc/machine-id so early in the
> > boot?
>
> PID 1 does.
>
> You have to have a valid /etc/machine-id really, everything else is
> not supported. And it needs to be available when PID 1 initializes.
>
> You basically have three options:
>
> 1. Make it read-only at boot, initialize persistently on OS install
>
> 2. Make it read-only, initialize it to an empty file on OS install, in
>which case systemd (i.e. PID 1) overmounts it with a random one
>during early boot. In this mode the system will come up with a new
>identity on each boot, and thus journal files from previous boots
>will be considered to belong to different systems.
>
> 2b. (Same as 2, but mount / writable during later boot, at which time
> the machine ID is committed to disk automatically)
>
> 3. Make it writable during early boot, and initialize it originally to
>an empty file. In this case PID 1 will generate a random one and
>persist it to disk right away.
>
> Also see:
>
> https://www.freedesktop.org/software/systemd/man/machine-id.html
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>

Hi Lennart,

Thank you for the information, it was very useful.  Reading the link on
machine-id gives me another option: pass the machine-id to the kernel via
U-Boot in its bootargs.  As some background, this is for an embedded
system that is using systemd and Mender, which seems to be becoming
fairly popular, so hopefully this may help someone else who stumbles across
this post with the same problem.  Mender provides image updates on an A/B
root filesystem, so having /etc/machine-id within the root filesystem isn't
really feasible, but with Mender you get a persistent data partition that
both root filesystems share, hence why I have opted for a symlink to the
persistent partition. So for embedded systems using Mender that want a
persistent machine-id I see two options now:

1) Use an initramdisk to mount the persistent Mender data partition and
store the machine-id there, with /etc/machine-id as a symlink to it
2) Thanks to your link: store the machine-id in the U-Boot environment and
pass it in as the systemd.machine_id= kernel command line parameter
(sketched below).
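
Something along the following lines should work for option 2; the
variable name and the use of systemd-id128 are my assumptions, we haven't
tried this on our hardware yet:

    # one-time, from the running system (needs u-boot-tools and a
    # configured /etc/fw_env.config):
    fw_setenv machine_id "$(systemd-id128 new)"

    # then, in the U-Boot boot script, append it to the kernel command line:
    # setenv bootargs "${bootargs} systemd.machine_id=${machine_id}"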

Out of interest, what does PID 1 need /etc/machine-id for before it has
processed fstab (at which point the persistent data partition would be ready
and the /etc/machine-id symlink readable)?  We haven't implemented either of
the above, so I wouldn't mind knowing what the impact would be.

Cheers,
Martin.

Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)

2019-10-31 Thread Matteo Guglielmi

Sorry for the imprecision there: it's not a kernel panic but a complete
freeze of the system during a reboot or shutdown (roughly 10-20% of the
time).

The error message that appears on the screen is about missing libraries,
which I could later locate in /usr/lib64 (I do not remember their names
right now).

Concerning the .mount file, there is NOT one associated with /live/image.
Instead, all the others from /etc/fstab are created:

cat /etc/fstab

live:/srv/live/root /root nfs intr,nolock 0 0
live:/srv/live/home /home nfs intr,nolock 0 0
live:/srv/live/opt  /opt  nfs intr,nolock 0 0

Each nfs entry has a corresponding .mount file.



The most important kernel options are:

root=172.16.16.38:/srv/live/cos8 rootovl

where 'rootovl' is the option provided by the 90overlay-root dracut module
(imported from Debian buster, see below).

Should I modify overlay-mount.sh?

Thank you.

###
### cat /usr/lib/dracut/modules.d/90overlay-root/README ###
###

dracut rootfs overlayfs module

Make any rootfs ro, but writable via overlayfs.
This is convenient if, for example, you are using a read-only NFS mount.

Add the parameter "rootovl" to the kernel to activate this feature.

This happens pre-pivot. Therefore the final root file system is already
mounted. It will be set ro, and turned into an overlayfs mount with an
underlying tmpfs.

The original root and the tmpfs will be mounted at /live/image and
/live/cow in the final rootfs.



### cat /usr/lib/dracut/modules.d/90overlay-root/module-setup.sh ###


#!/bin/bash

check() {
# do not add modules if the kernel does not have overlayfs support
[ -d /lib/modules/$kernel/kernel/fs/overlayfs ] || return 1
}

depends() {
# We do not depend on any modules - just some root
return 0
}

# called by dracut
installkernel() {
instmods overlay
}

install() {
inst_hook pre-pivot 10 "$moddir/overlay-mount.sh"
}


#
### cat /usr/lib/dracut/modules.d/90overlay-root/overlay-mount.sh ###
#

#!/bin/sh

# make a read-only nfsroot writeable by using overlayfs
# the nfsroot is already mounted to $NEWROOT
# add the parameter rootovl to the kernel, to activate this feature

. /lib/dracut-lib.sh

if ! getargbool 0 rootovl ; then
return
fi

modprobe overlay

# a little bit tuning
mount -o remount,nolock,noatime $NEWROOT

# Move root
# --move does not always work. Google >mount move "wrong fs"< for
# details
mkdir -p /live/image
mount --bind $NEWROOT /live/image
umount $NEWROOT

# Create tmpfs
mkdir /cow
mount -n -t tmpfs -o mode=0755 tmpfs /cow
mkdir /cow/work /cow/rw

# Merge both to new Filesystem
mount -t overlay \
  -o noatime,lowerdir=/live/image,upperdir=/cow/rw,workdir=/cow/work,default_permissions \
  overlay $NEWROOT

# Let filesystems survive pivot
mkdir -p $NEWROOT/live/cow
mkdir -p $NEWROOT/live/image
mount --bind /cow/rw $NEWROOT/live/cow
umount /cow
mount --bind /live/image $NEWROOT/live/image
umount /live/image





From: Lennart Poettering 
Sent: Thursday, October 31, 2019 6:34:15 PM
To: Matteo Guglielmi
Cc: systemd-devel@lists.freedesktop.org
Subject: Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)

On Mo, 28.10.19 09:47, Matteo Guglielmi (matteo.guglie...@dalco.ch) wrote:

>
> almost 20% of the time I get a kernel panic error
> due to a bunch of missing libraries.

A kernel panic? because of "missing libraries"? that doesn't sound
right. The kernel doesn't need "libraries".

iirc it's totally fine to unmount the backing fs after you mounted the
overlayfs, the file systems remain pinned in the background by the overlayfs.

>
>
> How can I instruct systemd to avoid unmounting
>
> /live/image (or postpone it to a later moment)?

You can extend the .mount unit file for /live/image and add an
explicit dep: i.e. create
/etc/systemd/system/live-image.mount.d/50-my-drop-in.conf, then add:

   [Unit]
   After=some-other.mount

You get the idea...

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] [SPAM]Re: Mount units with After=autofs.service cause ordering cycles

2019-10-31 Thread John Florian

On 10/31/19 2:59 PM, Lennart Poettering wrote:

On Do, 31.10.19 14:09, John Florian (jflor...@doubledog.org) wrote:


# /etc/systemd/system/var-www-pub.mount
[Unit]
Description=mount /pub served via httpd
Requires=autofs.service
After=autofs.service

[Mount]
What=/mnt/pub
Where=/var/www/pub
Options=bind,context=system_u:object_r:httpd_sys_content_t

[Install]
WantedBy=multi-user.target

~~~

The above worked for a long time, but once again a `dnf upgrade` seems to
have broken things, because now I have an ordering cycle that systemd must
break.  Since I haven't changed my mount units, my ability to mesh with
those shipped by the OS proves fragile. I'm deliberately avoiding too much
detail here because it would seem that there should be a relatively simple
solution to this general sort of task -- I just can't seem to discover it.
Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually
dumped along with the log message.

systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
systemd[1]: local-fs.target: Found dependency on autofs.service/start
systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
systemd[1]: local-fs.target: Found dependency on network-online.target/start
systemd[1]: local-fs.target: Found dependency on network.target/start
systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
systemd[1]: local-fs.target: Found dependency on sysinit.target/start
systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
systemd[1]: local-fs.target: Found dependency on local-fs.target/start
systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break ordering cycle starting with local-fs.target/start

The ordering dep between local-fs.target and var-www-pub.mount is what
you have to get rid of to remove the cycle. Set:

…
[Unit]
DefaultDependencies=no
Conflicts=umount.target
Before=umount.target
…
[Install]
WantedBy=remote-fs.target
…

i.e. make this a dep of remote-fs.target, not the implicit
local-fs.target, so that we don't pull it in during early boot, but only
during late boot, before remote-fs.target.


Thanks Lennart!  That did the trick.  I and others I know have knocked
heads on this one several times and somehow never came to this
conclusion.  It makes sense now that I see it, however.  Maybe
local-fs.target should have stood out to me, but I think it was mostly
accepted since, if you follow all deps far enough, you'll eventually
cover (most?) everything.


I think this just means I need to use `systemctl show` more, even though
`systemctl cat` is so much easier to digest for what I think I need to
know.  Abstracting the default deps is good in expression but
difficult in comprehension.  I wish there were something "in between",
but I don't even know how to define what that means.  Maybe just
grouping all the settings from `show` somehow, e.g. ordering, deps,
etc., or maybe by unit type: unit, exec, mount, etc.
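
For anyone else who lands here, a quick way to inspect just the
dependency-related properties of the mount unit (the property list is
illustrative, pick whichever you care about):

    systemctl show -p Requires,Wants,After,Before,WantedBy var-www-pub.mount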



Re: [systemd-devel] [SPAM]Re: Mount units with After=autofs.service cause ordering cycles

2019-10-31 Thread Lennart Poettering
On Do, 31.10.19 14:09, John Florian (jflor...@doubledog.org) wrote:

> > > # /etc/systemd/system/var-www-pub.mount
> > > [Unit]
> > > Description=mount /pub served via httpd
> > > Requires=autofs.service
> > > After=autofs.service
> > >
> > > [Mount]
> > > What=/mnt/pub
> > > Where=/var/www/pub
> > > Options=bind,context=system_u:object_r:httpd_sys_content_t
> > >
> > > [Install]
> > > WantedBy=multi-user.target
> > >
> > > ~~~
> > >
> > > The above worked for a long time, but once again a `dnf upgrade` seems to
> > > have broken things, because now I have an ordering cycle that systemd must
> > > break.  Since I haven't changed my mount units, my ability to mesh with
> > > those shipped by the OS proves fragile. I'm deliberately avoiding too much
> > > detail here because it would seem that there should be a relatively simple
> > > solution to this general sort of task -- I just can't seem to discover it.
> > > Any recommendations that don't involve an entirely different approach?
> > What precisely is the ordering cycle you are seeing? It's usually
> > dumped along with the log message.
>
> systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
> systemd[1]: local-fs.target: Found dependency on autofs.service/start
> systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
> systemd[1]: local-fs.target: Found dependency on network-online.target/start
> systemd[1]: local-fs.target: Found dependency on network.target/start
> systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
> systemd[1]: local-fs.target: Found dependency on sysinit.target/start
> systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
> systemd[1]: local-fs.target: Found dependency on local-fs.target/start
> systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break
> ordering cycle starting with local-fs.target/start

The ordering dep between local-fs.target and var-www-pub.mount is what
you have to get rid of to remove the cycle. Set:

…
[Unit]
DefaultDependencies=no
Conflicts=umount.target
Before=umount.target
…
[Install]
WantedBy=remote-fs.target
…

i.e. make this a dep of remote-fs.target, not the implicit
local-fs.target, so that we don't pull it in during early boot, but only
during late boot, before remote-fs.target.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] [SPAM]Re: Mount units with After=autofs.service cause ordering cycles

2019-10-31 Thread John Florian

On 10/31/19 1:08 PM, Lennart Poettering wrote:

On Mo, 14.10.19 16:23, John Florian (j...@doubledog.org) wrote:


So, I much prefer the expressiveness of systemd's mount units to the naive
era of /etc/fstab, but I've found one situation where I seem to always get
stuck and am never able to find a reliable solution that survives OS (Fedora
& CentOS) updates.  I have an NFS filesystem mounted by autofs at /pub that
needs to be bind mounted in various places such as /var/www/pub and
/var/ftp/pub. So I create a unit that looks like:

~~~

# /etc/systemd/system/var-www-pub.mount
[Unit]
Description=mount /pub served via httpd
Requires=autofs.service
After=autofs.service

[Mount]
What=/mnt/pub
Where=/var/www/pub
Options=bind,context=system_u:object_r:httpd_sys_content_t

[Install]
WantedBy=multi-user.target

~~~

The above worked for a long time, but once again a `dnf upgrade` seems to
have broken things, because now I have an ordering cycle that systemd must
break.  Since I haven't changed my mount units, my ability to mesh with
those shipped by the OS proves fragile. I'm deliberately avoiding too much
detail here because it would seem that there should be a relatively simple
solution to this general sort of task -- I just can't seem to discover it.
Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually
dumped along with the log message.


systemd[1]: local-fs.target: Found ordering cycle on var-www-pub.mount/start
systemd[1]: local-fs.target: Found dependency on autofs.service/start
systemd[1]: local-fs.target: Found dependency on rpc-statd.service/start
systemd[1]: local-fs.target: Found dependency on network-online.target/start
systemd[1]: local-fs.target: Found dependency on network.target/start
systemd[1]: local-fs.target: Found dependency on NetworkManager.service/start
systemd[1]: local-fs.target: Found dependency on sysinit.target/start
systemd[1]: local-fs.target: Found dependency on systemd-update-done.service/start
systemd[1]: local-fs.target: Found dependency on local-fs.target/start
systemd[1]: local-fs.target: Job var-www-pub.mount/start deleted to break
ordering cycle starting with local-fs.target/start



Re: [systemd-devel] is the watchdog useful?

2019-10-31 Thread Zbigniew Jędrzejewski-Szmek
On Thu, Oct 31, 2019 at 06:30:33PM +0100, Lennart Poettering wrote:
> On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote:
> 
> > In principle, the watchdog for services is nice. But in practice it seems
> > to bring only grief. The Fedora bugtracker is full of automated reports of
> > ABRTs, and of those that were fired by the watchdog, pretty much 100% are
> > bogus, in the sense that the machine was resource starved and the watchdog
> > fired.
> >
> > There are a few downsides to the watchdog killing the service:
> > 1. if it is something like logind, it is possible that it will cause
> > user-visible failure of other services
> > 2. restarting of the service causes additional load on the machine
> > 3. coredump handling causes additional load on the machine, quite
> > significant
> > 4. those failures are reported in bugtrackers and waste everyone's time.
> >
> > I had the following ideas:
> > 1. disable coredumps for watchdog abrts: systemd could set some flag
> > on the unit or otherwise notify systemd-coredump about this, and it could
> > just log the occurrence but not dump the core file.
> > 2. generally disable watchdogs and make them opt-in. We have
> > 'systemd-analyze service-watchdogs', and we could make the default
> > configurable to "yes|no".
> >
> > What do you think?
> 
> Isn't this more a reason to substantially increase the watchdog
> interval by default? i.e. 30min if needed?

Yep, there was a proposal like that. I want to make it 1h in Fedora.

Zbyszek


Re: [systemd-devel] systemd as a docker process manager

2019-10-31 Thread Lennart Poettering
On So, 27.10.19 20:50, Jeff Solomon (jsolomon8...@gmail.com) wrote:

> This is a followup to this thread:
>
> https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html
>
> To see if there are any new developments.
>
> We have a multi-process application that already uses systemd successfully.
> Our customers want to put the application into a container, and that
> container should be docker because that is what they use. We can't use
> systemd-nspawn or podman or whatever; our customers want docker because
> they are already using it for other applications.
>
> I understand that containers are not a security technology but we want to
> find a solution that allows us to run systemd in a docker container that
> isn't blatantly less secure than systemd running outside of a container. I
> have yet to find a way.
>
> Fundamentally, the problem is that the systemd in the container requires
> read/write access to the host's /sys/fs/cgroup/systemd directory in order
> to function at all.

It only requires write access to the subtree it lives in, not to what
lives above it. See how nspawn does it.

> Even if the container isn't privileged, if you mount the host's
> /sys/fs/cgroup directory inside the container and let the container write
> to it, you have a security hole that doesn't exist when systemd is just
> run on the host. That hole is described here:

Three options:

1. Docker should use CLONE_NEWCGROUP to get its own cgroup subtree
   hiding what is outside of it.

2. Docker should mount the root of the cgroup tree read-only, and only the
   subtree the container is supposed to live in writable.

3. Just use cgroupsv2.

I don't know Docker really; you'd have to ask them whether they support
that. They are a bit behind on these things, but maybe if you ping
them, they will add this for you.

(Of course, systemd-nspawn supports all three of the above.)
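
For reference, the commonly cited recipe for option 2 under today's
Docker looks roughly like this; the image name is a placeholder and the
exact flag set is an assumption, not something tested for this thread:

    docker run -d --name mysystemd \
        --tmpfs /run --tmpfs /tmp \
        -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
        myimage /sbin/init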

> https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/
>
> Using user namespaces doesn't help because then the container user wouldn't
> have permission to write to the /sys/fs/cgroup/systemd.

It doesn't need write access to that dir, only to the subtree it is
supposed to live in.

> Our application runs as a non-root user. The security concern is that any
> user on the host who is in the docker group would be able to start a shell
> inside the container as "container root" and then be able to get root on
> the host. So basically membership in the docker group is equivalent to host
> root.
>
> Taking a step back - I wonder (mostly asking Lennart) if there is a way to
> run systemd without it needing access to /sys/fs/cgroup/systemd? I'm sure
> there isn't but I thought I would ask.

No. systemd requires cgroups. But it's fine to mount only the subtree
it needs writable. systemd carefully makes sure that the service
manager never steps beyond its territory and that the access boundaries
are clear, which allows you to carefully arrange the cgroup tree so that
only the subtree and the hierarchy systemd really needs (i.e. the
name=systemd hierarchy) are writable.

(I mean, cgroupsv1 and non-userns containers are not safe anyway, so
you are just closing one gaping hole while leaving many others open,
but of course, this is your choice).

> Is there a way to run systemd's user service without it having the system
> systemd service as a parent?

This is not supported, sorry.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] rootnfs + rootovl (diskless client shutdown problem)

2019-10-31 Thread Lennart Poettering
On Mo, 28.10.19 09:47, Matteo Guglielmi (matteo.guglie...@dalco.ch) wrote:

>
> almost 20% of the time I get a kernel panic error
> due to a bunch of missing libraries.

A kernel panic? because of "missing libraries"? that doesn't sound
right. The kernel doesn't need "libraries".

iirc it's totally fine to unmount the backing fs after you mounted the
overlayfs, the file systems remain pinned in the background by the overlayfs.

>
>
> How can I instruct systemd to avoid unmounting
>
> /live/image (or postpone it to a later moment)?

You can extend the .mount unit file for /live/image and add an
explicit dep: i.e. create
/etc/systemd/system/live-image.mount.d/50-my-drop-in.conf, then add:

   [Unit]
   After=some-other.mount

You get the idea...

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] is the watchdog useful?

2019-10-31 Thread Lennart Poettering
On Mo, 21.10.19 17:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote:

> In principle, the watchdog for services is nice. But in practice it seems
> to bring only grief. The Fedora bugtracker is full of automated reports of
> ABRTs, and of those that were fired by the watchdog, pretty much 100% are
> bogus, in the sense that the machine was resource starved and the watchdog
> fired.
>
> There are a few downsides to the watchdog killing the service:
> 1. if it is something like logind, it is possible that it will cause
> user-visible failure of other services
> 2. restarting of the service causes additional load on the machine
> 3. coredump handling causes additional load on the machine, quite significant
> 4. those failures are reported in bugtrackers and waste everyone's time.
>
> I had the following ideas:
> 1. disable coredumps for watchdog abrts: systemd could set some flag
> on the unit or otherwise notify systemd-coredump about this, and it could
> just log the occurrence but not dump the core file.
> 2. generally disable watchdogs and make them opt-in. We have
> 'systemd-analyze service-watchdogs', and we could make the default
> configurable to "yes|no".
>
> What do you think?

Isn't this more a reason to substantially increase the watchdog
interval by default? i.e. 30min if needed?
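
If you want to try that for a single service right away, a drop-in along
these lines should do it (the path and the 30min value are illustrative):

    # /etc/systemd/system/systemd-logind.service.d/watchdog.conf
    [Service]
    WatchdogSec=30min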

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] nspawn and ovs bridges

2019-10-31 Thread Lennart Poettering
On Mi, 23.10.19 20:09, Michał Zegan (webczat_...@poczta.onet.pl) wrote:

> Hello,
> My use case is the following: make a test of routing protocols without
> having... enough real hardware. I decided to do that via containers
> using systemd-nspawn, and because I may need many interconnected
> networks and things like qos settings applied without dirty scripts, I
> decided to try openvswitch for bridge management.
> The problem here is that systemd-nspawn does not really support adding
> the created veth interface to the ovs bridge, even for the so-called
> fake bridge, because it says "operation not supported". The same happens
> if I try to do ip link set iface master fakebridge.
> How to deal with that situation correctly? Any ideas?

Uh. Some network device types might need some special calls to migrate
them to a network namespace (wifi too?); so far nobody sat down and did
the necessary work to make that happen. It's a matter of debugging,
researching and then probably making a minor fix to nspawn, to do the
migration for you.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] How-to for systemd user timers instead of cron/crontab?

2019-10-31 Thread Lennart Poettering
On Do, 17.10.19 10:58, Paul Menzel (pmenzel+systemd-de...@molgen.mpg.de) wrote:

> Dear systemd folks,
>
>
> I couldn’t find a simple documentation for “normal” users how
> to use systemd timers instead of cron/crontab? The Arch Wiki
> has a page [1], but I am afraid it’s still too complicated
> for our users.

There are no such docs afaik.

It's not too different from system-wide timer units, except you place them
in ~/.config/systemd/user/*.timer. And a few other things are
different, and a few things are not available...
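
A minimal sketch of what that looks like (unit names and the command are
made up for illustration):

    # ~/.config/systemd/user/cleanup.service
    [Service]
    Type=oneshot
    ExecStart=/usr/bin/find %h/tmp -mtime +7 -delete

    # ~/.config/systemd/user/cleanup.timer
    [Timer]
    OnCalendar=daily
    Persistent=true

    [Install]
    WantedBy=timers.target

Then enable it with "systemctl --user enable --now cleanup.timer".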

Yes, it would be great to have more docs about this, maybe file an
issue on github asking for that, but I wouldn't hold my breath I
fear...

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Mutually exclusive (timer-triggered) services

2019-10-31 Thread Lennart Poettering
On Mo, 14.10.19 18:30, Alexander Koch (m...@alexanderkoch.net) wrote:

> * flock leaves the lock file behind so you'd need some type of cleanup
>   in case you really want the jobs to be trace-free. This is not as
>   trivial as it might seem, e.g. you cannot do it from the service units
>   themselves in `ExecStartPost=` or similar.

Linux supports BSD and POSIX locks on all kinds of fs objects,
including dirs. It should be possible to just lock the cache dir or so
of pacman (i.e. an fs object that exists anyway), no need to introduce
a new locking file.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Mutually exclusive (timer-triggered) services

2019-10-31 Thread Lennart Poettering
On Mo, 14.10.19 12:45, Alexander Koch (m...@alexanderkoch.net) wrote:

> Dear [systemd-devel],
>
> imagine you've got multiple services that perform system housekeeping
> tasks, all triggered by .timer units. These services all happen to use
> a specific resource (e.g. the system package manager) so they must not
> be run in parallel, but they all need to be run.
>
> Is there a systemd'ish way of modeling this?
>
> I first thought of using `Conflicts=` but having read the manpages I
> understand that queueing one of the services would actively stop any
> running instance of any of the others.
>
> `After=` is not an option either as that (unless 'Type=oneshot', which
> isn't to be used for long-running tasks) doesn't delay up to completion
> but only to initialization. Furthermore I think you'd run into trouble
> ordering more than two units using this approach.
>
> Ideally I'd think of something like a 'virtual resource' that can be
> specified by service units, like this (real use case on Arch Linux):
>
> [Unit]
> Description=Pacman sync
> Locks=pacman-db
>
> [Service]
> ExecStart=/usr/bin/pacman -Sy
>
> 
>
> [Unit]
> Description=Pacman cleanup
> Locks=pacman-db
>
> [Service]
> ExecStart=/usr/bin/paccache -r -k 0 -u
>
> The value of `Locks=` shall be an arbitrary string describing the
> virtual resource the service requires exclusive access to. systemd
> would then delay the start of a unit if another unit with an identical
> `Locks=` entry is currently active.
>
> A nice advantage of this concept is that services depending on the same
> virtual resource would not need to know of each other, which simplifies
> shipping them via separate packages.
>
> Any thoughts on this? Maybe I'm just too blind to see the obvious
> solution to this simple problem.

I presume pacman uses file system locks anyway, no? I think the best
approach would be to make those (optionally) blocking directly in
pacman, no?

I mean, you can add five layers of locking on top, but ultimately it
appears to me that in this case you just want to make the locking that
already exists blocking. Linux fs locking exists in non-blocking
*and* blocking flavours anyway; it's just a matter of making pacman
expose that, maybe with a new --block switch or so?

Other than that you can of course use tools such as "flock(1)" around
pacman.
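
For example, directly in the service units (the lock path is
illustrative; flock(1) can lock a directory, so pacman's cache dir works):

    [Service]
    ExecStart=/usr/bin/flock --exclusive /var/cache/pacman/pkg /usr/bin/pacman -Sy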

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] How to control the login prompt from my application service unit file?

2019-10-31 Thread Lennart Poettering
On Di, 15.10.19 04:15, Moji, Shashidhar (shashidhar.m...@dellteam.com) wrote:

> Hi,
> We have a VMware vApp based solution. Our application gets installed during
> first boot.
> Until now we had a SLES11 OS based VM and we upgraded to SLES12. Now we have
> systemd instead of init scripts for service handling.
> In SLES11, we had a service dependency configured in init scripts that was
> holding back the login prompt until our application installation was done. But
> in SLES12, we get the login prompt before our application is installed.
>
> How do we hold the login prompt until our application installation is
> complete? We tried adding Before=getty@.service in our application
> install unit file, but it's not helping.

getty@.service is just a template for a unit, not a unit itself. Thus
you cannot have a dependency on it as a whole.

You have two options:

1. You can add a dropin getty@.service.d/foobar.conf, i.e. extend the
   getty@.service file that all VT gettys are instantiated of. In
   there, just place:

   [Unit]
   After=…

2. Order your unit before systemd-user-sessions.service. All gettys
   and other logins order themselves after that service, so if you order
   yours before it you get the behaviour you are looking for.

The first option is nicer, since it's more specific to a getty type,
while the latter applies to all logins, including SSH and graphical.
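
For option 1 the drop-in could look like this; the drop-in name and the
service being waited for are placeholders for your installer unit:

    # /etc/systemd/system/getty@.service.d/wait-for-install.conf
    [Unit]
    After=my-app-install.service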

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Mount units with After=autofs.service cause ordering cycles

2019-10-31 Thread Lennart Poettering
On Mo, 14.10.19 16:23, John Florian (j...@doubledog.org) wrote:

> So, I much prefer the expressiveness of systemd's mount units to the naive
> era of /etc/fstab, but I've found one situation where I seem to always get
> stuck and am never able to find a reliable solution that survives OS (Fedora
> & CentOS) updates.  I have an NFS filesystem mounted by autofs at /pub that
> needs to be bind mounted in various places such as /var/www/pub and
> /var/ftp/pub. So I create a unit that looks like:
>
> ~~~
>
> # /etc/systemd/system/var-www-pub.mount
> [Unit]
> Description=mount /pub served via httpd
> Requires=autofs.service
> After=autofs.service
>
> [Mount]
> What=/mnt/pub
> Where=/var/www/pub
> Options=bind,context=system_u:object_r:httpd_sys_content_t
>
> [Install]
> WantedBy=multi-user.target
>
> ~~~
>
> The above worked for a long time, but once again a `dnf upgrade` seems to
> have broken things, because now I have an ordering cycle that systemd must
> break.  Since I haven't changed my mount units, my ability to mesh with
> those shipped by the OS proves fragile. I'm deliberately avoiding too much
> detail here because it would seem that there should be a relatively simple
> solution to this general sort of task -- I just can't seem to discover it.
> Any recommendations that don't involve an entirely different approach?

What precisely is the ordering cycle you are seeing? It's usually
dumped along with the log message.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c

2019-10-31 Thread Lennart Poettering
On Mo, 14.10.19 16:27, Jonas Meurer (jo...@freesources.org) wrote:

> Yeah, something like that was my hope as well: use plymouth and
> framebuffer or something alike for spawning the passphrase prompt. But
> I'm not sure yet how to ensure that we change to the passphrase prompt
> (or overlay the graphical desktop environment).
>
> Another idea that came into my mind: spawn the passphrase prompt
> *before* system suspend, just like it's apparently done with the
> screenlock right now.
>
> The passphrase prompt could write to a fifo pipe or talk to a small
> daemon that waits for the luks passphrase(s) to be entered.

Paging doesn't allow that really. It's always ugly. You'd have to have
your own UI stack in the initrd, i.e. basically have an alternative
root disk, that possesses the screen exclusively as long as the system
is up but not unlocked yet.

So most likely a comprehensive approach would be:

in systemd-suspend.service pass control to a binary in the initrd that
runs in its own fs namespace with only tmpfs and api vfs visible,
which includes plymouth and so on. It then switches to a new VT, does
plymouth there, then suspends, on coming back lets plymouth ask its
question and then unlocks the disk. And maybe even uses the cgroup
freezer to freeze all processes on the host (i.e. everything except
the stuff run from the initrd) before suspend, and thaw it only after
the password has been entered again, so that the whole OS remains
frozen and doesn't partially get woken up but hangs on the root disk,
because typing in the pw might take a long time...

But even that is very ugly for various reasons. For example,
CLOCK_MONOTONIC will not be paused while the host remains frozen. Thus
watchdog events will be missed (actual system suspend pauses
CLOCK_MONOTONIC, which makes this safe for it), and then your system
is hosed. Moreover, your initrd main process will be a child of a
frozen process (as PID 1 from the host is), and this means you have to
be very very careful with what you do, since you then cannot rely on
some of the most basic functions of the OS. For example, PID 1 normally
reaps processes which get reparented to it. Thus in your initrd you
should be very careful never to have processes die while they have
children, as those will accumulate as unreaped children of PID 1
then... One can ignore issues like that, but they are frickin ugly.

> >> They might not be 100% available from just memory. What happens
> >> if the DE needs to load assets (fonts, .ui files) for the
> >> passphrase prompt from disk? (Actually, do any GPU drivers need
> >> to load firmware from /lib on resume?)
> >>
> >
> > In Ubuntu, casper component, we work around it by reading the files to
> > ensure they are in the fscache, and then if one force unmounts the
> > filesystem underneath them (cdrom eject) plymouth can still "read"
> > fonts and display late boot messages. So there are ways of doing this.
>
> Again, the simplest solution would be to spawn the passphrase prompt
> *before* suspend, to ensure that all required components are already in
> memory. Or do you see caveats?

Programs are memory mapped on Linux, i.e. nominally on disk, with only the
bits paged in as they are used, as they are executed. Similarly, data
files are typically memory mapped too. This means that preparing
anything in advance is not that easy; you have to lock it into RAM
too. Which you can do, but it doesn't really scale, since our dep trees
are large, and fonts/media files in particular so.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c

2019-10-31 Thread Lennart Poettering
On Do, 10.10.19 17:22, Jonas Meurer (jo...@freesources.org) wrote:

> >> systemd-homed maintains only the home directory via LUKS encryption,
> >> and leaves the OS itself unencrypted (under the assumption it's
> >> protected differently, for example via verity – if immutable —  or via
> >> encryption bound to the TPM), and uses the passphrase only for
> >> home. THis means the whole UI stack to prompt the user is around
> >> without problems, and the problem gets much much easier.
> >>
> >> So what's your story on the UI stack? Do you intend to actually copy
> >> the full UI stack into the ramdisk? If not, what do you intend to do
> >> instead?
>
> As Tim already wrote, the UI stack was not our focus so far. But I
> agree that it's a valid concern. My silent hope was to find a solution
> for a simple password prompt that can be overlaid over whatever
> graphical stack is running on the system. But we haven't looked into it
> yet, so it might well be impossible to do something like this.
>
> But since the graphical interface is running already, I doubt that we
> would have to copy the whole stack into the ramfs. We certainly need to
> take care of all *new* dependencies that a password prompt application
> pulls in, but the wayland/x11/gnome basics should just be there, as they
> have been in use just before the suspend started, no?

No. During suspend it's likely the kernel flushes caches. This means
GNOME tools previously in memory might not be anymore and would have
to be paged in again when they are executed. But that's going to
hang if your root disk is paused.

> > [...] While it would be great to make the suspension as smooth as
> > possible, I think there is also a place for people who *really* want a
> > whole encrypted disk during suspend and are okay to jump through a few
> > hoops for that.
>
> Let me stress this aspect a bit more: at the moment, full disk
> encryption is the way to go if you want encryption at rest on your
> laptop. I applaud your efforts regarding systemd-homed, but they
> probably will not be the default setup anytime soon. Especially not if
> you talk about immutable or TPM-bound-encrypted rootfs. And an
> *unencrypted* rootfs clearly shouldn't be the way to go. Think about all
> the sensitive stuff in /etc, /var/lib and even unencrypted swap
> devices.

I disagree.

In my view, immutable+non-encrypted /usr is certainly the way to
go. Not sure why you'd encrypt the OS if it's Open Source anyway.

/var should be locked to the TPM, and $HOME bound to the user's
credentials.

System resources should not be protected by user credentials. And user
resources not (at least not exclusively) by system credentials.

> But the main point of your feedback probably is that we need a clear
> vision of how to technically solve the UI issue before you consider this
> a valid candidate for systemd inclusion, right?

Yes.

> By the way, we discovered a possible dead lock between luksSuspend and
> the final sync() in the Linux Kernel system suspend implementation that
> you most likely will discover with your luksSuspend systemd-homed as
> well. We're currently working on getting a Kernel patch accepted that
> adds a run-time switch to disable the sync() in the Kernel system
> suspend implementation.

Hmm, so far this all just worked for me, I didn't run into any trouble
with suspending just $HOME?

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] RFC: luksSuspend support in sleep/sleep.c

2019-10-31 Thread Lennart Poettering
On Do, 10.10.19 12:01, Tim Dittler (tim.ditt...@systemli.org) wrote:

> > So what's your story on the UI stack? Do you intend to actually copy
> > the full UI stack into the ramdisk? If not, what do you intend to do
> > instead?
> >
> > Lennart
>
> Thank you for your feedback, Lennart. To be honest, the UX of the
> operation has been a secondary concern for us so far. We're basically
> exploring what is possible atm. Our current approach is to re-use the
> initramfs which was used during boot before. This doesn't include
> X11/wayland. While it would be great to make the suspension as smooth as
> possible, I think there is also a place for people who *really* want a
> whole encrypted disk during suspend and are okay to jump through a few
> hoops for that.

Well, but if you have no way to acquire the password you are in
trouble. You have to think about the UX at some point.

You'd have to rework systemd-suspend.service (and similar services) to
transition to your initrd fully, then run systemd-sleep from there I
figure and then maybe have a drop-in /usr/lib/systemd/system-sleep/
that unlocks the root fs. But it's not going to be nice if there's no
UI support.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Unexpected behaviour not noticed by systemctl command

2019-10-31 Thread Lennart Poettering
On Mo, 07.10.19 11:43, Andy Pieters (syst...@andypieters.me.uk) wrote:

> Hi guys
>
> Just lately I ran into a fumble. I was trying to stop and disable a
> service and I typed in:
>
> systemctl stop --now example.service
>
> The service duly stopped but wasn't disabled, because the --now switch
> is only applicable to the disable/enable/mask commands.
>
> However, shouldn't it be good practice to produce a warning or an
> error when a switch is used that has no effect?

We consider most switches "modifiers" that just modify behaviour
slightly but not generally, and hence if they don't apply we simply
ignore them. Maybe the "--now" switch is a bit more heavyweight than
the others, and we should refuse it. Can you maybe file a github issue
about this (maybe even prep a PR), and we'll look into it.

> Do you think it would be worth me writing a bug report for it?

Yes!

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] /cdrom mounted from initrd is stopped on boot, possibly confused about device-bound

2019-10-31 Thread Lennart Poettering
On Mi, 09.10.19 14:28, Dimitri John Ledkov (x...@ubuntu.com) wrote:

> Ubuntu installer images use initrd, which has udevd but no systemd.
>
> It mounts /dev/sr0 as /root/cdrom, then pivots to /root, meaning
> /root/cdrom becomes just /cdrom and exec systemd as pid 1.
>
> At this point cdrom.mount is stopped as it's bound to an inactive
> dev-sr0.device. Then sometime later dev-sr0.device becomes active, but
> nothing remounts /cdrom back in.
>
> My question is why on startup, when processing cdrom.mount it
> determines that dev-sr0 is inactive, when clearly it's fully
> operational (it contains media, media is locked, and is mounted, and
> is serving content).
>
> I notice that SYSTEMD_MOUNT_DEVICE_BOUND is set to 1 on the udev
> device, and it seems impossible to undo via mount unit.

60-cdrom_id.rules sets that.
>
> I also wonder why, initially, /dev/sr0 is inactive, but later becomes
> active - as in what causes it to become active, and what is missing in
> the initrd.

When PID 1 initializes and udev is not running, no device is considered
to be around. The devices only appear when they are triggered by
systemd-udev-trigger.service for the first time.

> Things appear to work if I specify SYSTEMD_READY=1 in the
> 60-cdrom_id.rules; then on boot there are no warning messages that
> cdrom.mount is bound to an inactive device.
>
> Shouldn't 60-cdrom_id.rules set SYSTEMD_READY=1 if, after importing the
> cdrom_id variables, ID_CDROM_MEDIA is non-empty? Such that
> dev-sr0.device's initial state is correct if one booted with cdrom
> media in place.

SYSTEMD_READY=1 doesn't do anything, it's SYSTEMD_READY=0 that has an
effect. i.e. a device that lacks SYSTEMD_READY= at all is equivalent
to SYSTEMD_READY=1. The only reason for setting the property is to
turn the readiness off, it's by default considered ready if the
"systemd" udev tag is set.

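So a rule along the lines you suggest would have to use the inverse
logic, roughly like this (an untested sketch of the idea, not the actual
60-cdrom_id.rules change):

    # mark the drive inactive for systemd only while no medium is present
    SUBSYSTEM=="block", KERNEL=="sr[0-9]*", ENV{ID_CDROM_MEDIA}!="?*", ENV{SYSTEMD_READY}="0"
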
Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Journalctl --list-boots problem

2019-10-31 Thread Lennart Poettering
On Di, 08.10.19 16:57, Martin Townsend (mtownsend1...@gmail.com) wrote:

> Thanks for your help.  In the end I just created a symlink from
> /etc/machine-id to /data/etc/machine-id.  It complains really early on
> boot with
> Cannot open /etc/machine-id: No such file or directory
>
> So I guess it's trying to read /etc/machine-id for something before
> fstab has been processed and the data partition is ready.
>
> But the journal seems to be working ok and --list-boots is fine.  The
> initramfs would definitely be a more elegant solution to ensure
> /etc/machine-id is ready.
>
> I don't suppose you know what requires /etc/machine-id so early in the boot?

PID 1 does.

You have to have a valid /etc/machine-id really, everything else is
not supported. And it needs to be available when PID 1 initializes.

You basically have three options:

1. Make it read-only at boot, initialize persistently on OS install

2. Make it read-only, initialize it to an empty file on OS install, in
   which case systemd (i.e. PID 1) overmounts it with a random one
   during early boot. In this mode the system will come up with a new
   identity on each boot, and thus journal files from previous boots
   will be considered to belong to different systems.

2b. (Same as 2, but mount / writable during later boot, at which time
the machine ID is committed to disk automatically)

3. Make it writable during early boot, and initialize it originally to
   an empty file. In this case PID 1 will generate a random one and
   persist it to disk right away.

Also see:

https://www.freedesktop.org/software/systemd/man/machine-id.html

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?

2019-10-31 Thread Lennart Poettering
On Di, 01.10.19 15:33, Colin Walters (walt...@verbum.org) wrote:

> On Sun, Sep 29, 2019, at 6:08 AM, Lennart Poettering wrote:
>
> > i.e maybe write down a spec, that declares how to store settings
> > shared between host OS, boot loader and early-boot kernel environment
> > on systems that have no EFI NVRAM, and then we can make use of
> > that. i.e. come up with semantics inspired by the boot loader spec for
> > finding the boot partition to use, then define a couple of files in
> > there for these params.
>
> I like the idea in general but it would mean there's no mechanism to
> "roll back" to a previous configuration by default, which is a quite
> important part of OSTree (and other similar systems).  (Relatedly
> this is also why ostree extends the BLS spec with an
> atomically-swappable /boot/loader symlink, though I want to get away
> from that eventually)

Well, what I proposed is a file. OSTree can cover files on disk, no?

> That said, maybe one thing we want regardless is a "safe mode" boot
> that skips any OS customization and will get one booted enough to be
> able to fix/retry for configuration like this.
>
> BTW related to EFI - as you know AWS doesn't support it, and we're
> making a general purpose OS.  Fedora isn't just about desktops, and
> we need to be careful about doing anything in the OS that diverges
> from the server side.  (That said I only recently discovered that
> GCP supports it as well as vTPMs, working on "blessing" our Fedora
> CoreOS images to note they support it
> https://github.com/coreos/mantle/pull/1060 )

I doubt on AWS you want to configure keymaps though, do you?

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?

2019-10-31 Thread Lennart Poettering
On Mo, 07.10.19 10:32, Colin Guthrie (gm...@colin.guthr.ie) wrote:

> Colin Walters wrote on 01/10/2019 20:33:
> > On Sun, Sep 29, 2019, at 6:08 AM, Lennart Poettering wrote:
> >
> >> i.e maybe write down a spec, that declares how to store settings
> >> shared between host OS, boot loader and early-boot kernel environment
> >> on systems that have no EFI NVRAM, and then we can make use of
> >> that. i.e. come up with semantics inspired by the boot loader spec for
> >> finding the boot partition to use, then define a couple of files in
> >> there for these params.
> >
> > I like the idea in general but it would mean there's no mechanism to "roll 
> > back" to a previous configuration by default, which is a quite important 
> > part of OSTree (and other similar systems).   (Relatedly this is also why 
> > ostree extends the BLS spec with an atomically-swappable /boot/loader 
> > symlink, though I want to get away from that eventually)
>
> Just out of curiosity, when /boot is the EFI system partition (as is
> recommended in the BLS), how do you deal with symlinks when the FS is
> FAT-based?

Fedora doesn't do that. Fedora doesn't implement the boot loader
spec. They implemented their own thing that is an interpreted macro
language (!?), they just call it "BLS". It's why I myself never use
the acronym "BLS", to avoid unnecessary confusion. Really, the Fedora
thing is just a bad idea. The Fedora thing totally missed the idea
that boot loader drop-ins are supposed to be dead-simple, trivially
parsable and generatable static drop-in files, and just replicated the
bad bad idea inherent to GRUB which is that everything needs to be an
(ideally Turing-complete) programming language again.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?

2019-10-31 Thread Lennart Poettering
On Mo, 30.09.19 16:07, Hans de Goede (hdego...@redhat.com) wrote:

> > So what you are arguing for is replacing the overlay initramfs
> > with a key-value config file which gets used by both the bootloader
> > and the OS.
> >
> > That is an interesting concept, esp. since it limits (as you advocate)
> > what can be done in the overlay from "everything" to "set specific
> > config variables to a value".
> >
> > So yes I can get behind this.
>
> While discussing this with Alberto an interesting problem came up.
>
> If we put this file in /boot/loader as you suggest, then the boot-loader
> can get to it and use it to set its keymap (and in the future probably also
> other stuff) but how does the localed in the initrd get to this
> file?

Boot loader could append it to the kernel cmdline for example.

> I agree with you that having a generic mechanism to share config
> between the OS and early-boot (so bootloader + initrd) is useful,
> but are we then going to make the initrd mount /boot (or the ESP) ?

I wouldn't, no. Given that this is configuration that the boot loader
is supposed to grok and parse, it could just pass it on on the kernel cmdline.

This would also allow boot loaders to provide a menu-driven scheme for
changing kbd layouts, which they then can sanely pass on to the initrd
and OS.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Make systemd-localed modify the kernel commandline for the initrd keymap?

2019-10-31 Thread Lennart Poettering
On Mo, 30.09.19 13:23, Hans de Goede (hdego...@redhat.com) wrote:

> > i.e. generating initrd images with cpio and so on is hacky, gluey,
> > Linux-specific. If you just use plain simple, standardized config
> > files at clearly defined locations, then reading and writing them is
> > simple, they can be shared between all players, are not Linux specific
> > and so on. I think systemd could certainly commit to updating those
> > files then.
>
> This sounds interesting, although I'm not sure I like the one file
> per setting approach why not have a $BOOT/loader/config file which
> has key=value pairs and kbdmap would be a key in that file?

Currently there's $BOOT/loader/loader.conf, which is read by sd-boot
and is, if you will, its private property. We could probably open
that up a bit and make it part of the boot loader spec too. The
format, after all, is semantically pretty much the same as the boot
loader spec.

> I'm afraid that will not work; some countries have multiple variants.
> We actually have a bunch of Fedora bugs open about the disk unlock
> support in plymouth and the "de-neo" keymap, and there are also the
> somewhat popular dvorak variants.

Well, I am not sure we need to support more than /etc/vconsole.conf
supports. Not in the initrd...

> So we could do this as, say, a base setting, but then we would need to
> add a kbdmap_variant setting which, when set, makes the keymap loaded
> $kbdmap-$variant.map (in Linux console terms). I guess we could specify
> that setting a variant this way is allowed, but that variant settings
> are OS/bootloader specific and may be ignored?

I am not sure we really need to configure a 100% fully featured keymap
there. As long as the basics work in the initrd/boot loader we are
fine... mmkeys and per-machine tweaks can happily happen in later
boot. I'd go as far as /etc/vconsole.conf, but not further really.
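
For reference, /etc/vconsole.conf is already a flat key=value file, e.g.
(values picked purely for illustration):

    # /etc/vconsole.conf
    KEYMAP=de-neo
    FONT=eurlatgr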

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] user slice changes for uid ranges

2019-10-31 Thread Lennart Poettering
On Fr, 27.09.19 15:56, Stijn De Weirdt (stijn.dewei...@ugent.be) wrote:

> hi all,
>
> i'm looking for an "easy" way to set resource limits on a group of users.
>
> we are lucky enough that this group of users is within a (although
> large) high enough range, so a range of uids is ok for us.
>
> generating a user-.slice file for every user (or symlinking them or
> whatever) looks a bit cumbersome, and probably not really
> performance-friendly if the range covers e.g. 100k (possible) uids.
>
> e.g. if this range was 100k-200k, i was more looking for a way to do
> e.g. user-1X.slice or user-10:20.slice
>
> (i think this is different from/not covered by the templated/prefix user
> slice patch
> https://github.com/systemd/systemd/commit/5396624506e155c4bc10c0ee65b939600860ab67)

I am not sure this helps you very much right now. But ultimately the
plan is to allow resource limits to be configured in detail as part of
each user record. This is implemented here already:

https://github.com/poettering/systemd/commits/homed

This hasn't been merged upstream yet, but hopefully will be soon.
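
That said, if a blanket limit for all users is enough, drop-ins for
user-.slice should already apply to every per-user slice. A sketch
(note it covers every uid; a range cannot be expressed this way):

# /etc/systemd/system/user-.slice.d/50-limits.conf
# applies to every user-$UID.slice, i.e. all users
[Slice]
CPUQuota=200%
MemoryMax=2G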

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] systemd-growfs blocks boot until completed

2019-10-31 Thread Lennart Poettering
On Fr, 27.09.19 17:12, Mirza Krak (mi...@mkrak.org) wrote:

> On Fri, 27 Sep 2019 at 15:23, Lennart Poettering wrote:
> >
> > On Fr, 27.09.19 14:35, Mirza Krak (mi...@mkrak.org) wrote:
> >
> > > Hi,
> > >
> > > I have been using the systemd-growfs feature for a while, and been
> > > happy with it so far.
> > >
> > > But recently I upgraded my distribution (custom based on Yocto) which
> > > also upgraded systemd from 239 to 241, and I can see that there has
> > > been a change in behavior of the "systemd-growfs" feature.
> > >
> > > In systemd 241, it blocks the boot process while it is growing the
> > > filesystem, here is an example log:
> > >
> > >  Mounting /data...
> > > [   10.693190] EXT4-fs (mmcblk0p4): mounted filesystem with ordered
> > > data mode. Opts: (null)
> > > [  OK  ] Mounted /data.
> > >  Starting Grow File System on /data...
> > > [   10.780109] EXT4-fs (mmcblk0p4): resizing filesystem from 131072 to
> > > 30773248 blocks
> > > [**] A start job is running for Grow File System on /data (11s /
> > > no limit)
> > > [  *** ] A start job is running for Grow File System on /data (21s /
> > > no limit)
> > > [***   ] A start job is running for Grow File System on /data (30s / no 
> > > limit)
> > > [***   ] A start job is running for Grow File System on /data (42s / no 
> > > limit)
> > > [**] A start job is running for Grow File System on /data (52s / no 
> > > limit)
> > > [**] A start job is running for Grow Fil…stem on /data (1min 2s / no
> > > limit)
> > > [ ***  ] A start job is running for Grow Fil…tem on /data (1min 15s / no
> > > limit)
> > > [  *** ] A start job is running for Grow Fil…tem on /data (1min 26s / no
> > > limit)
> > > [  *** ] A start job is running for Grow Fil…tem on /data (1min 36s / no
> > > limit)
> > > [ ***  ] A start job is running for Grow Fil…tem on /data (1min 46s / no
> > > limit)
> > > [   ***] A start job is running for Grow Fil…tem on /data (1min 56s / no
> > > limit)
> > > [**] A start job is running for Grow Fil…stem on /data (2min 6s / no
> > > limit)
> > > [**] A start job is running for Grow Fil…tem on /data (2min 17s / no
> > > limit)
> > > [* ] A start job is running for Grow Fil…tem on /data (2min 27s / no
> > > limit)
> > > [ ***  ] A start job is running for Grow Fil…tem on /data (2min 35s / no
> > > limit)
> > >
> > > In the previous version (239), this occurred in the background and did
> > > not obstruct the boot process in any noticeable way, which matched my
> > > expectations of how this feature would work.
> > >
> > > So my question is, was the change intentional and if so, what was the 
> > > reasoning?
> >
> > Hmm, the tool doesn't do much. It just calls an fs ioctl. If you
> > attach gdb to the process (or strace it), can you see what it is
> > blocking on?
>
> It seems that the ioctl operation is blocking until the resize is completed,
>
> openat(AT_FDCWD, "/data", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> openat(AT_FDCWD, "/dev/block/179:4", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 4
> ioctl(4, BLKBSZGET, [1024]) = 0
> ioctl(4, BLKGETSIZE64, [31511805952])   = 0
> fstatfs64(3, 88, 0x7eb56bf0)= 0
> ioctl(3, _IOC(_IOC_WRITE, 0x66, 0x10, 0x8), 0x7eb56be
>
> I would like to clarify that it does eventually complete (after 5
> minutes on my device), and then the boot proceeds as normal. The ioctl
> behavior has not changed; it was blocking in the kernel in my previous
> distribution version as well. But on systemd 239 the
> systemd-growfs@data.service ran in parallel and did not block the
> boot, whereas now it does.
>
> The Linux kernel is: 4.19.71
>
> This is what the systemd-growfs@.service looks like:
>
> # Automatically generated by systemd-fstab-generator
> [Unit]
> Description=Grow File System on %f
> Documentation=man:systemd-growfs@.service(8)
> DefaultDependencies=no
> BindsTo=%i.mount
> Conflicts=shutdown.target
> After=%i.mount
> Before=shutdown.target local-fs.target
>
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/lib/systemd/systemd-growfs /data
> TimeoutSec=0

Hmm, interesting. I wasn't aware of this change in behaviour. Most
likely we should make this configurable, given that both behaviours
might be desirable. Can you file an RFE bug on GitHub asking for that?
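
(For reference, the undecoded ioctl in the strace above matches ext4's
online resize request. Below is a minimal illustrative sketch of the
blocking call, with the path and block count taken from the log; this
is not systemd's actual code.)

/* Minimal sketch of an ext4 online resize, the operation systemd-growfs
 * performs on the mount point; EXT4_IOC_RESIZE_FS does not return until
 * the kernel has finished resizing, which is why the unit appears hung. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef EXT4_IOC_RESIZE_FS
/* _IOW('f', 16, __u64) == _IOC(_IOC_WRITE, 0x66, 0x10, 0x8),
 * as seen in the strace output above */
#define EXT4_IOC_RESIZE_FS _IOW('f', 16, uint64_t)
#endif

int main(void) {
        int fd = open("/data", O_RDONLY | O_CLOEXEC);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        uint64_t blocks = 30773248; /* target size in fs blocks, from the log */
        if (ioctl(fd, EXT4_IOC_RESIZE_FS, &blocks) < 0) /* blocks for minutes */
                perror("ioctl");

        close(fd);
        return 0;
}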

Thanks,

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] watchdog and sd_notify between host and container systemd etc inits

2019-10-31 Thread Lennart Poettering
On Do, 26.09.19 07:24, mikko.rap...@bmw.de (mikko.rap...@bmw.de) wrote:

> Hi,
>
> I'd like to run systemd as init on the host, but run various containers,
> some of them with their own container-side systemd init.
>
> Then I'd like to have sd_notify and watchdog available to check the
> health of the systemd init in the container. I trust the init in the
> container to check the health of all the services and processes running
> there.
>
> If the systemd init in the container fails to respond to the watchdog,
> then I'd like to restart only the container, not the whole host system.
>
> For the container systemd watchdog, I've proposed patch:
>
> https://github.com/systemd/systemd/pull/13643
>
> Comments on the PR mention that sd_notify support would be better, but
> AFAIK it uses process PIDs and thus doesn't work with another systemd
> init as PID 1 in the container's PID namespace.
>
> Thus we invented a simple FIFO between host init and container init where
> container writes MACHINE and HOSTNAME as watchdog ping. This works well with a
> custom watchdog manager on host and systemd init in an LXC container.
>
> These don't seem to fit systemd very well, and we'd also like to know
> sd_notify-type things, like when the container is in the running state.
> systemd-nspawn does provide that, but I also have use cases for LXC
> containers...
>
> So, could you provide some ideas and/or feedback on how this kind of
> functionality could/should be implemented with systemd?

I would normally assume that it's the job of the container manager to
watchdog/supervise its container, and of the host systemd to supervise
the container manager in turn. I.e. it should be PID 1 on the host that
gets sd_notify() keep-alive messages from your container manager,
ensuring it's alive. The container manager should get them from PID 1
in the container, ensuring that remains alive. And PID 1 in the
container payload gets the messages from the services below it. I.e.
instead of forwarding these messages across these boundaries, just
have a clear 1:1 supervisor-client relationship at each level.

Or in other words: try to convince the LXC maintainers to send out
sd_notify() watchdog messages from their own supervisor, and optionally
expect them from the payload's PID 1. systemd-nspawn at least partly
works that way already.
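
A rough sketch of that 1:1 relationship, assuming the container manager
runs from a host service with Type=notify and WatchdogSec= set (this is
illustrative only, not nspawn's actual code):

/* Sketch: a container manager sending keep-alive pings to the host PID 1. */
#include <stdint.h>
#include <unistd.h>
#include <systemd/sd-daemon.h>

int main(void) {
        uint64_t usec = 0;

        sd_notify(0, "READY=1"); /* tell the host PID 1 we are up */

        if (sd_watchdog_enabled(0, &usec) > 0) /* WatchdogSec= from the unit */
                for (;;) {
                        /* only ping while the container payload (its PID 1)
                         * still answers our own keep-alive protocol */
                        sd_notify(0, "WATCHDOG=1");
                        usleep(usec / 2); /* ping at half the interval */
                }
        return 0;
}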

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] How to only allow service to start when network-online.target is met?

2019-10-31 Thread Lennart Poettering
On Di, 17.09.19 15:44, Dom Rodriguez (shym...@shymega.org.uk) wrote:

> Hello,
>
> I've got a service unit, which is linked to a timer unit, and I'd like to have
> the behaviour associated with `Condition=` directives, but for targets. To
> explain further, I expect the service unit to *only* start
> when the target is active. Ideally, I don't want the service unit to fail
> either, but to behave as with `Condition=`, which doesn't mark the
> unit as failed, but merely as not meeting its condition(s).
>
> Is this possible with systemd?

Requisite= can do this, but it causes the depending unit to fail if
the depended-on unit is not activated yet or in the process of being
activated.
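
A sketch of the relevant lines in the service unit (note that, unlike
Condition*=, a failed Requisite= marks the unit as failed):

[Unit]
Requisite=network-online.target
After=network-online.target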

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] sdbus_event loop state mark as volatile?

2019-10-31 Thread Lennart Poettering
On Do, 05.09.19 10:46, Stephen Hemminger (step...@networkplumber.org) wrote:

> The libsystemd bus event loop is:
>
>
> while (e->state != SD_EVENT_FINISHED) {
>         r = sd_event_run(e, (uint64_t) -1);
>         ...
> }
>
> But since e->state is changed by another thread, it
> should be marked volatile to keep the compiler from assuming
> the state never changes.

None of systemd's libraries are thread-safe. They are written in a
threads-aware style, though. This means you should only use a specific
context object from a single thread at a time, and need to do your own
locking around it if the thread that uses it changes over
time. systemd generally doesn't keep global state, which means doing
your own locking around the sd_xyz objects should suffice and work
reasonably well.
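
Something like this, roughly; the helper name is made up, and the point
is that the lock, not volatile, provides the needed memory barrier:

/* Sketch: serialize all access to one sd_event object with an external
 * lock, since libsystemd objects are not thread safe by themselves. */
#include <pthread.h>
#include <systemd/sd-event.h>

static pthread_mutex_t loop_lock = PTHREAD_MUTEX_INITIALIZER;

/* hypothetical helper: run one loop iteration while holding the lock */
static int locked_event_run(sd_event *e) {
        int r;

        pthread_mutex_lock(&loop_lock);
        r = sd_event_run(e, 0); /* 0: don't block while holding the lock */
        pthread_mutex_unlock(&loop_lock);
        return r;
}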

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] exceeding match limit via sd_bus_add_match

2019-10-31 Thread Lennart Poettering
On Mo, 09.09.19 11:42, Ivan Mikhaylov (i.mikhay...@yadro.com) wrote:

> I have a system with a lot of sd-bus properties which have to be
> 'match'ed. After reaching some match limit, I'm getting -105 (ENOBUFS)
> on a regular basis. The -105 (ENOBUFS) represents exceeding 'some
> limit', according to the docs.
>
> In the man page for sd_bus_add_match() there is no helpful information
> about possible reasons for this over-the-limit case. I'm trying to
> figure it out from the systemd code, with little success so far.
>
> What is the limit, and where can I tweak it?

It's generally the daemon that puts a limit on this, not the
client; i.e. you need to consult the dbus-daemon/dbus-broker
configuration for the limit.

Usually, instead of having many fine-grained matches, it's more
efficient to have a few broader ones; i.e. instead of matching each
property individually, consider matching the whole interface or so.
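
For example, something along these lines; the interface name here is
made up:

/* Sketch: one broad PropertiesChanged match for a whole interface,
 * replacing a separate sd_bus_add_match() call per property. */
#include <systemd/sd-bus.h>

static int add_broad_match(sd_bus *bus, sd_bus_message_handler_t cb,
                           void *userdata) {
        return sd_bus_add_match(bus, /* slot= */ NULL,
                                "type='signal',"
                                "interface='org.freedesktop.DBus.Properties',"
                                "member='PropertiesChanged',"
                                "arg0='org.example.SomeInterface'",
                                cb, userdata);
}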

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] Watchdog problem

2019-10-31 Thread Lennart Poettering
On Sa, 07.09.19 15:11, Mikael Djurfeldt (mik...@djurfeldt.com) wrote:

> Hi,
>
> I couldn't figure out a better place to ask this question. Please point me
> to another place if you have a better idea. (Perhaps I should bring it up
> with VirtualBox developers instead?)
>
> I run Debian buster (with some upgraded packages) as a Virtualbox guest
> with a Windows host.
>
> When the host has gone to sleep and wakes up again, I get logged out from
> my gdm session. It starts out like this:
>
> Sep  7 13:23:58 hat kernel: [82210.177399] 11:23:58.337557 timesync
> vgsvcTimeSyncWorker: Radical guest time change: 2 491 379 233 000ns
> (GuestNow=1 567 855 438 281 378 000 ns GuestLast=1 567 852 946 902 145 000
> ns fSetTimeLastLoop=false)
> Sep  7 13:23:59 hat systemd[1]: systemd-logind.service: Watchdog timeout
> (limit 3min)!
> Sep  7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: (EE)
> Sep  7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: Fatal server error:
> Sep  7 13:23:59 hat /usr/lib/gdm3/gdm-x-session[1285]: (EE) systemd-logind
> disappeared (stopped/restarted?)
>
> Can I fix this by setting systemd-logind.service WatchdogSec to something
> else? What should I set it to to disable watchdogs? I tried to find
> documentation for WatchdogSec but failed. Can you please point me to the
> proper documentation?

This looks like a kernel/virtualbox issue. We use CLOCK_MONOTONIC to
determine when the last keep-alive message was received. Unlike
CLOCK_REALTIME this means the clock stops while the system is
suspended. If VirtualBox doesn't get this right, please report this as
an issue to VirtualBox.
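
If you just want to silence it locally in the meantime, a drop-in
sketch (via "systemctl edit systemd-logind.service"); setting
WatchdogSec=0 disables the watchdog logic for a service:

[Service]
WatchdogSec=0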

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] set-property CPUAffinity

2019-10-31 Thread Lennart Poettering
On Di, 03.09.19 19:49, Alexey Perevalov (a.pereva...@samsung.com) wrote:

> Hello Michal,
>
> Thank you for response!
>
> On 9/3/19 6:03 PM, Michal Koutný wrote:
> > Hello Alexey.
> >
> > On Fri, Aug 30, 2019 at 01:21:50PM +0300, Alexey Perevalov 
> >  wrote:
> > > [...]
> > > The question is: changing CPUAffinity property (cpuset.cpus) is not yet
> > > allowed in systemd API, right? Is it planned?
> > Note that CPUAffinity= uses the mechanism of sched_setaffinity(2), which
> > is different from using cpuset controller restrictions (that's also why
> > you find it in `man systemd.exec` and not in `man
> > systemd.resource-control`).
> >
> > IMO, systemd may eventually support the cpuset controller with a
> > different directive.
>
> Does this mean the community is open to enhancements in this direction?
>
> It looks like current work on issues and enhancements happens on GitHub,
> so I can create an RFE.

The old cgroupsv1 cpuset was a horrible, broken interface, which is why
we are not supporting it. On cgroupsv2 things are better, and since
047f5d63d7a1ab75073f8485e2f9b550d25b0772 we now have support for it in
systemd.
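
A sketch of the directives that commit introduced, assuming the names
AllowedCPUs=/AllowedMemoryNodes= and a unified (cgroupsv2) hierarchy:

[Service]
AllowedCPUs=0-3
AllowedMemoryNodes=0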

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] DynamicUser shared by service instances

2019-10-31 Thread Lennart Poettering
On Mo, 02.09.19 18:37, sqwishy (someb...@froghat.ca) wrote:

> Hi.
>
> I was looking at how dynamic users are implemented and noticed that
> instances seem to share one dynamic user within their service. In the
> example below, I have an attached portable service with
> StateDirectory=derp-%i
>
> # ls -dn /var/lib/private/derp-{foo,bar}
> drwxr-xr-x 2 64000 64000 4096 Sep  2 17:59 /var/lib/private/derp-bar/
> drwxr-xr-x 2 63000 63000 4096 Sep  2 17:59 /var/lib/private/derp-foo/
>
> # systemctl start f30-derp@{foo,bar}
>
> # ls -dn /var/lib/private/derp-{foo,bar}
> drwxr-xr-x 2 63000 63000 4096 Sep  2 17:59 /var/lib/private/derp-bar/
> drwxr-xr-x 2 63000 63000 4096 Sep  2 17:59 /var/lib/private/derp-foo/
>
> # ls -l /run/systemd/dynamic-uid/
> total 4
> -rw--- 1 root root 9 Sep  2 18:12 63000
> lrwxrwxrwx 1 root root 8 Sep  2 18:12 direct:63000 -> f30-derp
> lrwxrwxrwx 1 root root 5 Sep  2 18:12 direct:f30-derp -> 63000
>
> Normally the state directories are created under the same owner; I set
> different owners explicitly to show that the second instance's directory
> gets chowned.
>
> I guess I'm wondering if this behaviour is intentional? I found it
> surprising, but that might just be me.

You can pick the name for the DynamicUser= via User=. What did you set
it to? By default it's derived from the unit name. If two units
specify the same name they get the same user.
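
If you want one dynamic user per instance, a sketch for the template
unit; specifiers such as %i are expanded in User=, and the names below
are taken from the example above:

[Service]
DynamicUser=yes
User=derp-%i
StateDirectory=derp-%i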

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] systemd put it under a custom slice

2019-10-31 Thread Lennart Poettering
On Mi, 30.10.19 11:08, Bhasker C V (bhas...@unixindia.com) wrote:

> Hi all,
>
> I have been googling for a few days now but could not find the
> solution I am looking for.
>
> Apologies if this is a noob question.
>
> Is there a way to use custom slices with my systemd-nspawn container?
>
> I see that the systemd-nspawn man page says to use --slice=, but any such
> cgroup created is not accepted by this option (I don't know how to create
> a slice externally from systemd unit files)
>
> $ sudo cgcreate  -g freezer,memory:/test

This is not supported.

systemd owns the cgroup tree; only subtrees for which delegation is
explicitly turned on can be managed by other programs, for example
container managers.

Thus, creating cgroups manually at the top of the tree, directly via
cgcreate, is explicitly not supported. Use systemd's own concepts
instead, i.e. slice units; direct cgroup access bypassing systemd at
the top of the tree will not work.
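
A sketch of the supported route; the slice name and limit are made up:

# /etc/systemd/system/test.slice
[Unit]
Description=Slice for my containers

[Slice]
MemoryMax=1G

# then:
#   systemd-nspawn --slice=test.slice -D /path/to/container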

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] Problems with DNS resolution in German Rail WiFi

2019-10-31 Thread Paul Menzel

Dear systemd folks,


For over half a year now, I have been having problems with the German
Rail WiFi (Deutsche Bahn, WIFIonICE) [1]. I only have problems with the
WiFi on the train itself; the WiFi at the train stations (operated by
Deutsche Telekom(?)) works fine.


I am able to connect to the WiFi network, but when accessing the
captive portal page to log in, and after logging in, I have serious DNS
problems in the browsers (Mozilla Firefox 70.0 and Google Chromium). I
am using Debian Sid/unstable.


It looks like DNS requests are not answered in time (also confirmed by
the developer tools in the browser). SSH (and even Mozilla Thunderbird)
seem to work better. Fellow train travelers do not seem to have any
problems.


Testing on the console shows:

```
$ time host bahn.de
bahn.de has address 18.185.205.203
bahn.de has address 35.157.56.133
bahn.de has address 35.158.56.207
bahn.de mail is handled by 10 mailgate2.deutschebahn.com.
bahn.de mail is handled by 10 mailgate1.deutschebahn.com.

real0m0,243s
user0m0,021s
sys 0m0,000s
$ systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist 
abgelaufen

$ time systemd-resolve bahn.de
bahn.de: resolve call failed: DNSSEC validation failed: failed-auxiliary

real0m55,967s
user0m0,006s
sys 0m0,006s
$ time systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist 
abgelaufen


real2m0,094s
user0m0,005s
sys 0m0,007s
$ time systemd-resolve bahn.de
bahn.de: resolve call failed: Die Wartezeit für die Verbindung ist 
abgelaufen


real2m0,113s
user0m0,014s
sys 0m0,000s
```

(The German errors above translate to "the wait time for the connection
has expired", i.e. a connection timeout.)

How can this be debugged the next time I am on the ICE? Is this related
to systemd-resolved? If not, whom should I bug about it?
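
Would something like the following be a sensible way to capture more
data on the next trip? A sketch: raise resolved's log level via a
drop-in, and separately test with DNSSEC disabled, since the
failed-auxiliary error above points at DNSSEC validation:

# sketch: drop-in via "systemctl edit systemd-resolved.service"
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug

# sketch: /etc/systemd/resolved.conf, to rule out DNSSEC
[Resolve]
DNSSEC=no

# then inspect with: journalctl -u systemd-resolved -b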



Kind regards,

Paul


[1]: https://www.bahn.de/p/view/service/zug/railnet_ice_bahnhof.shtml
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel