[systemd-devel] ConditionNeedsUpdate, read-only /usr, and sysext

2024-02-07 Thread Valentin David
Hello everybody,

The behavior of ConditionNeedsUpdate= is that it evaluates to true if
/etc/.updated is older than /usr/.
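
For reference, a unit gated this way looks roughly like the following sketch
(the unit and command are made up for illustration; ConditionNeedsUpdate=
takes /etc or /var as its argument, and such units are typically ordered
Before=systemd-update-done.service so they run before the stamp file is
refreshed):

[Unit]
Description=Example: rebuild caches after an offline /usr update
ConditionNeedsUpdate=/etc
Before=systemd-update-done.service

[Service]
Type=oneshot
ExecStart=/usr/bin/example-rebuild-caches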

I have some issues with this, but maybe I am not using it the right way.

First, when using a read-only /usr partition (updated through sysupdate), the
timestamp of /usr is that of the build of that filesystem. In the case of
GNOME OS, to ensure bit-by-bit reproducibility, we set all timestamps to a
fixed date in 2011. So that does not work for us.

But now let's say we work around that and make our system use a reproducible
date, say the date of the git commit of our metadata. Then we have a second
issue.

Because of systemd-sysext, the timestamp of /usr may no longer be that of the
/usr filesystem, but that of a directory created on the fly by systemd-sysext
(or maybe it keeps the timestamp from the / filesystem, I do not know, but the
timestamp is certainly from when systemd-sysext was started). If
systemd-update-done runs after systemd-sysext (and it effectively does on
254), then the date of /etc/.updated becomes the time when systemd-sysext
started.

Let's imagine that I do not boot that machine often. My system boots into a
new version, and there is already another new version available on the
sysupdate server. My system will download a build of /usr whose timestamp is
likely older than the boot time. So on the next reboot the condition will be
false, even though I did get an update, and it will stay false until I
download a version that was built after the boot time of my last successful
update.

So my question is: is there a plan to replace the timestamp comparison for
ConditionNeedsUpdate= with something that works better with sysupdate and
sysext? Maybe copying IMAGE_VERSION from /usr/lib/os-release into
/etc/.updated, for example?
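
To make the suggestion concrete, here is a sketch of what I have in mind
(purely illustrative, this is not an existing format): systemd-update-done
would record the image version in the stamp file, and the condition would
compare versions instead of mtimes.

# /usr/lib/os-release, shipped in the /usr image
IMAGE_VERSION=45.2

# /etc/.updated, written by systemd-update-done
IMAGE_VERSION=45.2

# ConditionNeedsUpdate= would then be true whenever the two values differ,
# independently of any mtime.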

Thanks,
--
Valentin David
m...@valentindavid.com


[systemd-devel] udev database cross-version compatibility

2023-09-26 Thread Valentin David
Hello,

Back in 2014 and again in 2020, there were discussions on the mailing list
related to udev database version safety. This mattered for knowing whether
libudev from a container could safely access the /run/udev/data files, given
that libudev and systemd-udevd could potentially be different versions.

The conclusion was that there was no guarantee, and based on that Flatpak has
not provided /run/udev/data to applications.

Later, the format changed in #16853 (udev: make uevents "sticky"). But this 
caused issue #17605 (Units with BindsTo= are being killed on upgrade 
from v246 to v247).

This was fixed by #17622 (sd-device: make sd_device_has_current_tag() 
compatible with udev database generated by older udevd).
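
For what it is worth, the compatibility check added there is visible through
the public sd-device API. A minimal sketch of how a consumer inside a
container might use it (device name and tag are just examples, error handling
trimmed; build with -lsystemd):

#include <systemd/sd-device.h>
#include <stdio.h>

int main(void) {
        sd_device *dev = NULL;
        int r;

        /* "sda" is only an example device. */
        r = sd_device_new_from_subsystem_sysname(&dev, "block", "sda");
        if (r < 0) {
                fprintf(stderr, "Failed to open device: %d\n", r);
                return 1;
        }

        /* sd_device_has_current_tag() only reports tags written by the
         * currently running udevd, which is the compatibility measure
         * referenced above for databases written by older versions. */
        if (sd_device_has_current_tag(dev, "systemd") > 0)
                printf("device currently tagged 'systemd'\n");

        sd_device_unref(dev);
        return 0;
}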

It seems to me that, because udev needs to handle upgrades and downgrades, it
will continue to maintain some compatibility across versions.

Is it safe now for flatpak to provide /run/udev/data to containers?

(Also, snapd does it, oops)

--
Valentin David
m...@valentindavid.com


Re: [systemd-devel] systemd-repart very slow creation of partitions with Encrypt=

2023-06-05 Thread Valentin David
On Mon, Jun 5, 2023 at 11:09 AM Lennart Poettering 
wrote:

> On Mo, 05.06.23 10:41, Valentin David (valentin.da...@canonical.com)
> wrote:
>
> > On Mon, Jun 5, 2023 at 9:56 AM Lennart Poettering <
> lenn...@poettering.net>
> > wrote:
> >
> > > On So, 04.06.23 14:25, Valentin David (valentin.da...@canonical.com)
> > > wrote:
> > >
> > > > I have been trying to create a root partition from initrd with
> > > > systemd-repart. The repart.d file for this partition is as follow:
> > > >
> > > > [Partition]
> > > > Type=root
> > > > Label=root
> > > > Encrypt=tpm2
> > > > Format=ext4
> > > > FactoryReset=yes
> > > >
> > > > I am just using systemd-repart.service in initrd, without
> modification
> > > > (that is, it finds the disk from /sysusr/usr). Even though this is
> > > working,
> > > > the problem I have is that it takes a very long time for the
> partition to
> > > > be created. Looking at the logs, it spends most of time in the
> > > > reencryption.
> > >
> > > reencryption? We don't do any reencryption really. i.e. we do not
> > > actually support anything like "cryptsetup reencrypt" at all. All we
> > > do is the equivalent of "cryptsetup luksFormat". Are you suggesting
> > > that repart is slower at formatting a block device via LUKS than
> > > invoking cryptsetup directly would be? I'd find that very surprising...
> > >
> >
> > This is what it looks like in src/partition/repart.c. Function
> > partition_encrypt calls sym_crypt_reencrypt_init_by_passphrase and
> > then sym_crypt_reencrypt.
> > And make_filesystem is called before partition_encrypt. So it must
> > reencrypt since mkfs was called before.
>
> Oh, fuck, yeah, Daan added that.
>
> This is a bug really.
>

I will open an issue on GitHub then.


Re: [systemd-devel] systemd-repart very slow creation of partitions with Encrypt=

2023-06-05 Thread Valentin David
I think that behavior was introduced by
https://github.com/systemd/systemd/commit/48a09a8fff480aab9a68e95e95cc37f6b1438751

On Mon, Jun 5, 2023 at 10:41 AM Valentin David 
wrote:

>
>
> On Mon, Jun 5, 2023 at 9:56 AM Lennart Poettering 
> wrote:
>
>> On So, 04.06.23 14:25, Valentin David (valentin.da...@canonical.com)
>> wrote:
>>
>> > I have been trying to create a root partition from initrd with
>> > systemd-repart. The repart.d file for this partition is as follow:
>> >
>> > [Partition]
>> > Type=root
>> > Label=root
>> > Encrypt=tpm2
>> > Format=ext4
>> > FactoryReset=yes
>> >
>> > I am just using systemd-repart.service in initrd, without modification
>> > (that is, it finds the disk from /sysusr/usr). Even though this is
>> working,
>> > the problem I have is that it takes a very long time for the partition
>> to
>> > be created. Looking at the logs, it spends most of time in the
>> > reencryption.
>>
>> reencryption? We don't do any reencryption really. i.e. we do not
>> actually support anything like "cryptsetup reencrypt" at all. All we
>> do is the equivalent of "cryptsetup luksFormat". Are you suggesting
>> that repart is slower at formatting a block device via LUKS than
>> invoking cryptsetup directly would be? I'd find that very surprising...
>>
>
> This is what it looks like in src/partition/repart.c. Function
> partition_encrypt calls sym_crypt_reencrypt_init_by_passphrase and then 
> sym_crypt_reencrypt.
> And make_filesystem is called before partition_encrypt. So it must
> reencrypt since mkfs was called before.
>
>
>> > For 11GB partition on a VM, it takes more than 2 minutes. On the bare
>> metal
>> > with a 512 GB nvme disk, it has been running for 3 hours. And it is
>> still
>> > not finished.
>>
>> This is really strange. The LUKS formatting should just write a
>> superblock onto the disk, which is just a couple of sectors, and should
>> barely take any time.
>>
>> Or are you saying "mke2fs" takes that long?
>>
>> Note that we specify lazy_itable_init=1 during formatting ext4, hence
>> it should actually be super fast too...
>>
>
> No. mkfs was done. In the logs it was all about reencryption. See
> https://gitlab.gnome.org/-/snippets/5809/raw/main/snippetfile1.txt
>
>
>>
>> > I do not think cryptsetup reencryption supports holes. Is it normal to
>> have
>> > a full reencryption of a disk that was just initialized with mkfs.ext4?
>> If
>> > so, could we at least move the effective reencryption after
>> > systemd-repart.service, so that the rest of the system can continue to
>> boot?
>> >
>> > I am running:
>> > systemd 253.4 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR
>> +IMA
>> > +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS
>> +FIDO2
>> > +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY
>> +P11KIT
>> > -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMM
>> > ON +UTMP +SYSVINIT default-hierarchy=unified)
>> >
>> > Cryptsetup: v2.6.1
>>
>> I am a bit puzzled by this. Would be good to figure out what actually
>> is so slow here? formatting luks? formatting ext4? discarding?
>>
>> Lennart
>>
>> --
>> Lennart Poettering, Berlin
>>
>


Re: [systemd-devel] systemd-repart very slow creation of partitions with Encrypt=

2023-06-05 Thread Valentin David
On Mon, Jun 5, 2023 at 9:56 AM Lennart Poettering 
wrote:

> On So, 04.06.23 14:25, Valentin David (valentin.da...@canonical.com)
> wrote:
>
> > I have been trying to create a root partition from initrd with
> > systemd-repart. The repart.d file for this partition is as follow:
> >
> > [Partition]
> > Type=root
> > Label=root
> > Encrypt=tpm2
> > Format=ext4
> > FactoryReset=yes
> >
> > I am just using systemd-repart.service in initrd, without modification
> > (that is, it finds the disk from /sysusr/usr). Even though this is
> working,
> > the problem I have is that it takes a very long time for the partition to
> > be created. Looking at the logs, it spends most of time in the
> > reencryption.
>
> reencryption? We don't do any reencryption really. i.e. we do not
> actually support anything like "cryptsetup reencrypt" at all. All we
> do is the equivalent of "cryptsetup luksFormat". Are you suggesting
> that repart is slower at formatting a block device via LUKS than
> invoking cryptsetup directly would be? I'd find that very surprising...
>

This is what it looks like in src/partition/repart.c: partition_encrypt()
calls sym_crypt_reencrypt_init_by_passphrase() and then sym_crypt_reencrypt(),
and make_filesystem() is called before partition_encrypt(). So it has to
reencrypt the data that mkfs just wrote.
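
For context, this is roughly the shape of those libcryptsetup calls (a
simplified sketch based on the public API, not the actual repart code;
parameter values and the function name are illustrative):

#include <libcryptsetup.h>

/* Sketch of "encrypt existing data in place": once mkfs has written a
 * filesystem, the plaintext has to be read back and rewritten encrypted,
 * which is why this scales with the size of the partition. */
static int encrypt_in_place(struct crypt_device *cd, const char *name,
                            const char *passphrase, size_t passphrase_size) {
        struct crypt_params_reencrypt params = {
                .mode = CRYPT_REENCRYPT_ENCRYPT,
                .direction = CRYPT_REENCRYPT_FORWARD,
                .resilience = "checksum",
                .hash = "sha256",
        };
        int r;

        /* A LUKS2 header and a keyslot are assumed to exist already,
         * e.g. from crypt_format() and crypt_keyslot_add_by_volume_key(). */
        r = crypt_reencrypt_init_by_passphrase(cd, name,
                                               passphrase, passphrase_size,
                                               CRYPT_ANY_SLOT, 0,
                                               NULL, NULL, &params);
        if (r < 0)
                return r;

        /* This walks the whole device and is the part that takes
         * minutes to hours. */
        return crypt_reencrypt(cd, NULL);
}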


> > For 11GB partition on a VM, it takes more than 2 minutes. On the bare
> metal
> > with a 512 GB nvme disk, it has been running for 3 hours. And it is still
> > not finished.
>
> This is really strange. The LUKS formatting should just write a
> superblock onto the disk, which is just a couple of sectors, and should
> barely take any time.
>
> Or are you saying "mke2fs" takes that long?
>
> Note that we specify lazy_itable_init=1 during formatting ext4, hence
> it should actually be super fast too...
>

No. mkfs was done. In the logs it was all about reencryption. See
https://gitlab.gnome.org/-/snippets/5809/raw/main/snippetfile1.txt


>
> > I do not think cryptsetup reencryption supports holes. Is it normal to
> have
> > a full reencryption of a disk that was just initialized with mkfs.ext4?
> If
> > so, could we at least move the effective reencryption after
> > systemd-repart.service, so that the rest of the system can continue to
> boot?
> >
> > I am running:
> > systemd 253.4 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA
> > +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS
> +FIDO2
> > +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT
> > -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMM
> > ON +UTMP +SYSVINIT default-hierarchy=unified)
> >
> > Cryptsetup: v2.6.1
>
> I am a bit puzzled by this. Would be good to figure out what actually
> is so slow here? formatting luks? formatting ext4? discarding?
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>


[systemd-devel] systemd-repart very slow creation of partitions with Encrypt=

2023-06-04 Thread Valentin David
I have been trying to create a root partition from the initrd with
systemd-repart. The repart.d file for this partition is as follows:

[Partition]
Type=root
Label=root
Encrypt=tpm2
Format=ext4
FactoryReset=yes

I am just using systemd-repart.service in the initrd, without modification
(that is, it finds the disk from /sysusr/usr). Even though this is working,
the problem I have is that it takes a very long time for the partition to be
created. Looking at the logs, it spends most of its time in the reencryption.

For an 11 GB partition on a VM, it takes more than 2 minutes. On bare metal
with a 512 GB NVMe disk, it has been running for 3 hours and is still not
finished.

I do not think cryptsetup reencryption supports holes. Is it normal to have a
full reencryption of a disk that was just initialized with mkfs.ext4? If so,
could we at least move the actual reencryption to after
systemd-repart.service, so that the rest of the system can continue to boot?

I am running:
systemd 253.4 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA
+SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2
+IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT
-QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON
+UTMP +SYSVINIT default-hierarchy=unified)

Cryptsetup: v2.6.1


Re: [systemd-devel] Unmountable mounts and systemd-fsck@.service conflicting with shutdown.target

2023-01-06 Thread Valentin David
It is a call to systemd-mount done from the initramfs. It ends up in
/run/systemd/transient and survives the root switch. The generated unit
contains a Requires= on the corresponding systemd-fsck@.service instance.
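
Roughly, the generated transient unit looks like the following (reconstructed
for illustration, not a verbatim dump; device and mount point names follow the
example from my original mail):

# /run/systemd/transient/some-disk.mount
[Unit]
Requires=systemd-fsck@some\x2dblock0p1.service
After=systemd-fsck@some\x2dblock0p1.service

[Mount]
What=/dev/some-block0p1
Where=/some/disk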

Is the conflict on shutdown.target there to make shutdown kill fsck if it is
still running?

Generated systemd-cryptsetup@.service units have DefaultDependencies=no and no
conflict on shutdown.target. Maybe that conflict is missing there, then, since
"cryptsetup attach" might also be running.


On Fri, Jan 6, 2023 at 1:34 PM Lennart Poettering 
wrote:

> On Do, 05.01.23 14:18, Valentin David (valentin.da...@canonical.com)
> wrote:
>
> > Hello,
> >
> > In Ubuntu Core, we have some mounts that cannot be unmounted until we
> have
> > switched root.
> >
> > To simplify, this looks like that:
> >
> > / mounts a ro loop devices backed by /some/disk/some/path/image.img
> > /some/disk mounts a block device (let's say /dev/some-block0p1)
> >
> > In this case, /some/disk cannot be unmounted.
> >
> > We do not want to lazily unmount, we cannot get errors if something
> fails.
> > (Unless we had a lazy unmount that would only work when read-only)
> >
> > We do remount /some/disk read-only on shutdown. And in the shutdown
> > initramfs, we unmount /oldroot/some/disk.
> >
> > However, we get an error message with systemd trying to unmount it. While
> > functionally, it does not matter, it is still very problematic to have
> > error messages.
> >
> > Using `DefaultDependencies=no` is not enough. I have tried to be clever
> and
> > add some-disk.mount to shutdown.target.wants so it would not try to
> unmount
> > it. But systemd got confused with conflicts and randomly kills stop jobs
> > until there is no conflict.
> >
> > Debugging it, I have found that this is because some-disk.mount depends
> on
> > systemd-fsck@some\x2dblock0p1.service. And systemd-fsck@.service
> conflicts
> > with shutdown.target.
> >
> > I wonder if having conflict on shutdown.target really needed. Could we
> > remove it? (And also add DefaultDependencies=no to
> > system-systemd\x2dfsck.slice) With this, mounts with DefaultDependencies=no
> > do not get unmounted as part of shutdown.target. (They do during
> > systemd-shutdown)
>
> hmm, so we generally want system services to go away before
> shutdown. This is a very special case though. I wonder if we can just
> override systemd-fsck@….service for that specific case?
>
> How are those mounts established? i.e. by which unit is the
> systemd-fsck@.service instance pulled in? and how was that configured?
> fstab? ubuntu-own code?
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>


[systemd-devel] Unmountable mounts and systemd-fsck@.service conflicting with shutdown.target

2023-01-05 Thread Valentin David
Hello,

In Ubuntu Core, we have some mounts that cannot be unmounted until we have
switched root.

To simplify, this looks like that:

/ mounts a read-only loop device backed by /some/disk/some/path/image.img
/some/disk mounts a block device (let's say /dev/some-block0p1)

In this case, /some/disk cannot be unmounted.

We do not want to unmount lazily, because then we would not get errors if
something fails. (Unless we had a lazy unmount that would only work when
read-only.)

We do remount /some/disk read-only on shutdown, and in the shutdown initramfs
we unmount /oldroot/some/disk.

However, we get an error message from systemd trying to unmount it. While it
does not matter functionally, it is still very problematic to have error
messages.

Using `DefaultDependencies=no` is not enough. I have tried to be clever and
add some-disk.mount to shutdown.target.wants so that it would not try to
unmount it, but systemd got confused by the conflicts and randomly killed stop
jobs until there was no conflict left.

Debugging it, I found that this is because some-disk.mount depends on
systemd-fsck@some\x2dblock0p1.service, and systemd-fsck@.service conflicts
with shutdown.target.

I wonder if the conflict on shutdown.target is really needed. Could we remove
it? (And also add DefaultDependencies=no to system-systemd\x2dfsck.slice.)
With this, mounts with DefaultDependencies=no would not get unmounted as part
of shutdown.target. (They still do during systemd-shutdown.)
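
In case removing the conflict upstream is not acceptable, overriding it
locally seems to require a full per-instance unit, since, as far as I know,
drop-ins cannot remove existing dependencies such as Conflicts=. A rough
sketch (paraphrased from the shipped unit, not verbatim):

# /etc/systemd/system/systemd-fsck@some\x2dblock0p1.service
[Unit]
Description=File System Check on /dev/some-block0p1 (no shutdown conflict)
DefaultDependencies=no
BindsTo=dev-some\x2dblock0p1.device
After=dev-some\x2dblock0p1.device local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/lib/systemd/systemd-fsck /dev/some-block0p1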