Re: [systemd-devel] Hosts without /etc/machine-id on boot

2014-11-24 Thread Didier Roche

On 21/11/2014 00:41, Lennart Poettering wrote:

> On Thu, 20.11.14 17:23, Didier Roche (didro...@ubuntu.com) wrote:
>
> > > a) make /etc writable before systemd is invoked. If you use an initrd
> > > this is without risk, given that the initrd should really invoke
> > > fsck on the root disk anyway, and there's hence little reason to
> > > transition to a read-only root, rather than just doing rw
> > > right-away.
> >
> > Interesting, I'll run that by our kernel team. However, we run fsck a
> > little bit later on in the boot process to be able to pipe the output to
> > plymouth.
>
> At least on Fedora plymouth already runs on the initrd. If Ubuntu does
> the same, then there shouldn't be a difference regarding where fsck is
> run...
>
> Note that running fsck in the initrd for the root fs is really the
> right thing to do: running fsck from the file system you are about to
> check, which you hence cannot trust, is really wrong.
>
> > I'm not sure we should then have two code paths:
> > - one fscking from the initrd if /etc/machine-id is empty (like after a
> > factory reset), showing the results and eventual failures to the user in
> > some way
> > - and then, the general use case: fscking through the systemd service via
> > systemd-fsck-root.service before local-fs.target and piping the result in
> > plymouth
>
> The latter is useful only really on non-initrd boots where there isn't
> any initrd where the fsck could run. General purpose distributions
> should really run fsck in the initrd.
>
> Note that systemd-fsck-root.service skips itself when it notices that
> the fs was already checked (via a flag file in /run).

Indeed, that makes sense and we should investigate that with our kernel
team in the near future. I'm going to open a thread on it.

> > > The guarantee with /etc/machine-id is really that it is valid at *any*
> > > time, in early boot and late boot and all the time in between.
> >
> > I think I will go down that path, which is an interesting one and matches
> > some of my thoughts. Thanks for the guidance and documentation on what's
> > the right approach to achieve this race-free! I'll work on something
> > around that and propose a patch.
>
> Looking forward to it.


Here we go :)
I did some refactoring to reuse some functions in the first patch and
added the binary helper, unit and man page.

It should follow your advice and be race-free. I tested various cases locally.

Cheers,
Didier
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Hosts without /etc/machine-id on boot

2014-11-20 Thread Lennart Poettering
On Thu, 20.11.14 17:23, Didier Roche (didro...@ubuntu.com) wrote:

> >a) make /etc writable before systemd is invoked. If you use an initrd
> >this is without risk, given that the initrd should really invoke
> >fsck on the root disk anyway, and there's hence little reason to
> >transition to a read-only root, rather than just doing rw
> >right-away.
> 
> Interesting, I'll run that by our kernel team. However, we run fsck a
> little bit later on in the boot process to be able to pipe the output to
> plymouth.

At least on Fedora plymouth already runs on the initrd. If Ubuntu does
the same, then there shouldn't be a difference regarding where fsck is
run...

Note that running fsck in the initrd for the root fs is really the
right thing to do: running fsck from the file system you are about to
check, which you hence cannot trust, is really wrong.

> I'm not sure we should then have two code paths:
> - one fscking from the initrd if /etc/machine-id is empty (like after a
> factory reset), showing the results and eventual failures to the user in
> some way
> - and then, the general use case: fscking through the systemd service via
> systemd-fsck-root.service before local-fs.target and piping the result in
> plymouth

The latter is useful only really on non-initrd boots where there isn't
any initrd where the fsck could run. General purpose distributions
should really run fsck in the initrd.

Note that systemd-fsck-root.service skips itself when it notices that
the fs was already checked (via a flag file in /run).

> >The guarantee with /etc/machine-id is really that it is valid at *any*
> >time, in early boot and late boot and all the time in between.
>
> I think I will go down that path, which is an interesting one and matches some of
> my thoughts. Thanks for the guidance and documentation on what's the right
> approach to achieve this race-free! I'll work on something around that and
> propose a patch.

Looking forward to it.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Hosts without /etc/machine-id on boot

2014-11-20 Thread Didier Roche

On 20/11/2014 13:45, Lennart Poettering wrote:

> On Wed, 19.11.14 09:45, Didier Roche (didro...@ubuntu.com) wrote:
>
> > Hey,
> >
> > Some other topic related to "empty /etc" discussions: when preparing some
> > generic distro images, we want to ensure that all new
> > instances will get a different /etc/machine-id file.
> > As part of the empty /etc at boot, we first thought that removing
> > /etc/machine-id would be sufficient, however, the instance then doesn't
> > generate a new machine-id file and complains heavily.
> >
> > The new debug message of systemd 216+ helped shed some light on it:
> > http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
> > and adding a debug statement in machine_id_setup() from
> > src/core/machine-id-setup.c just before "open(etc_machine_id,
> > O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" explains what happens with
> > /proc/mounts:
> >
> > [2.119041] systemd[1]: rootfs / rootfs rw
> > [2.126775] systemd[1]:
> > /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4
> > ro,relatime,data=ordered
> >
> > It's clear then that at this stage of the boot process / is readonly.
> > The error message (and code) will say that in this case, what is supported
> > is an empty /etc/machine-id. After reboot, the consequence is that
> > /etc/machine-id is mounted as a tmpfs:
>
> Yes, generation of the machine ID is done very very early at boot,
> before we fork off the first non-PID1 userspace process, and hence
> before any file system could be remounted writable.

That makes sense.

> > tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
> >
> > However, this means that each boot of this instance will result in a
> > different machine-id, which isn't what is desired in the empty /etc case
> > after a factory reset. I know that there is the utility
> > systemd-machine-id-setup that we are running on systemd postinst in
> > debian/ubuntu, but that doesn't cover the factory reset one.
> >
> > Is there anything obvious that I'm missing to cover that case or anything in
> > the pipe?
>
> You have a couple of options:
>
> a) make /etc writable before systemd is invoked. If you use an initrd
> this is without risk, given that the initrd should really invoke
> fsck on the root disk anyway, and there's hence little reason to
> transition to a read-only root, rather than just doing rw
> right-away.

Interesting, I'll run that by our kernel team. However, we run fsck a
little bit later on in the boot process to be able to pipe the output to
plymouth.

I'm not sure we should then have two code paths:
- one fscking from the initrd if /etc/machine-id is empty (like after a
factory reset), showing the results and eventual failures to the user in
some way
- and then, the general use case: fscking through the systemd service
via systemd-fsck-root.service before local-fs.target and piping the
result in plymouth

> b) pre-initialize the machine ID before you boot, at build time.
>
> c) live with random ids

Those two are nice features, but they do not apply to a machine that
we factory reset with an empty /etc.

> d) pass in the id to use via $container_uuid (if you use a container
> manager), or via the DMI uuid field (if you use kvm). Then, create
> /etc/machine-id as an empty file, and systemd will initialize it to
> this ID rather than a random one.
>
> Usually, option d) is preferable in cloud setups I guess since it
> allows seeding the machine id from some externally used UUID, the way
> many container/virtualization managers define one anyway.

Fully agreed (not the case I raised here, but nice to know we can
pass a pre-generated UUID to containers/VMs).

> I'd be open to add another option on top of this:
>
> e) boot up with /etc read-only and /etc/machine-id empty, so that the
> usual logic of c) generates a random machine id and overmounts
> /etc/machine-id with it. But then, add a tiny new bootup service,
> that runs shortly after local-fs.target (i.e. the point where /etc/
> has been made writable if it's supposed to be made writable
> according to fstab), and that syncs the random one used so far back
> to disk, so that at the next boot-up it is fully initialized. This
> tiny service should be properly conditioned so that it only runs if
> /etc/machine-id is overmounted and /etc writable
> (i.e. ConditionPathIsMountPoint=/etc/machine-id and
> ConditionPathIsReadWrite=/etc). Special care should be taken so
> that replacement of the mount by the normal file is
> race-free. (This probably means the tool should open a new mount
> namespace temporarily, unmount /etc/machine-id there, update the
> file underneath and then return to the original mount namespace
> and unmount the file there too, so that at no time the file is
> invalid).
>
> The guarantee with /etc/machine-id is really that it is valid at *any*
> time, in early boot and late boot and all the time in between.

I think I will go that path which is an interesting one and mapping some

Re: [systemd-devel] Hosts without /etc/machine-id on boot

2014-11-20 Thread Lennart Poettering
On Wed, 19.11.14 09:45, Didier Roche (didro...@ubuntu.com) wrote:

> Hey,
> 
> Some other topic related to "empty /etc" discussions: when preparing some
> generic distro images, we want to ensure that all new
> instances will get a different /etc/machine-id file.
> As part of the empty /etc at boot, we first thought that removing
> /etc/machine-id would be sufficient, however, the instance then doesn't
> generate a new machine-id file and complains heavily.
> 
> The new debug message of systemd 216+ helped shed some light on it:
> http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
> and adding a debug statement in machine_id_setup() from
> src/core/machine-id-setup.c just before "open(etc_machine_id,
> O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" explains what happens with
> /proc/mounts:
> 
> [2.119041] systemd[1]: rootfs / rootfs rw
> [2.126775] systemd[1]: 
> /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 
> ro,relatime,data=ordered
> 
> 
> It's clear then that at this stage of the boot process / is readonly.
> The error message (and code) will say that in this case, what is supported
> is an empty /etc/machine-id. After reboot, the consequence is that
> /etc/machine-id is mounted as a tmpfs:

Yes, generation of the machine ID is done very very early at boot,
before we fork off the first non-PID1 userspace process, and hence
before any file system could be remounted writable.
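
As an illustration of the read-only observation quoted above, here is a minimal Python sketch (not systemd code) that reads /proc/mounts-style text and reports whether the entry currently visible at a mount point carries the "ro" option. The function names and the sample text are assumptions made for this example; the sample reuses the two mount lines from the debug output, with the usual trailing dump/pass fields added.

```python
def mount_options(mounts_text, mount_point):
    """Return the options of the last entry mounted on mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry on the same mount point shadows an earlier one.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is mounted and its visible entry has the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


sample = """\
rootfs / rootfs rw 0 0
/dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 ro,relatime,data=ordered 0 0
"""

print(is_read_only(sample))                        # True: the ext4 entry shadows rootfs
print(mount_options(sample, "/etc/machine-id"))    # None: nothing is mounted there
```

On a live system the same functions could be fed the contents of /proc/mounts, for instance to see whether /etc/machine-id has been overmounted.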

> tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
> 
> However, this means that each boot of this instance will result in a
> different machine-id, which isn't what is desired in the empty /etc case
> after a factory reset. I know that there is the utility
> systemd-machine-id-setup that we are running on systemd postinst in
> debian/ubuntu, but that doesn't cover the factory reset one.
> 
> Is there anything obvious that I'm missing to cover that case or anything in
> the pipe?

You have a couple of options:

a) make /etc writable before systemd is invoked. If you use an initrd
   this is without risk, given that the initrd should really invoke
   fsck on the root disk anyway, and there's hence little reason to
   transition to a read-only root, rather than just doing rw
   right-away.

b) pre-initialize the machine ID before you boot, at build time.

c) live with random ids

d) pass in the id to use via $container_uuid (if you use a container
   manager), or via the DMI uuid field (if you use kvm). Then, create
   /etc/machine-id as an empty file, and systemd will initialize it to
   this ID rather than a random one.

Usually, option d) is preferable in cloud setups I guess since it
allows seeding the machine id from some externally used UUID, the way
many container/virtualization managers define one anyway.
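
For option d), the externally supplied UUID usually arrives in the dashed form, while machine-id(5) documents the file content as 32 lowercase hexadecimal characters. The sketch below only illustrates that formatting step in Python; it does not show how systemd itself reads $container_uuid or the DMI field, and the function name and the example UUID are arbitrary choices for this illustration.

```python
import uuid


def machine_id_from_uuid(value):
    """Format a UUID string as 32 lowercase hex characters, without dashes."""
    return uuid.UUID(value.strip()).hex


print(machine_id_from_uuid("0F8FAD5B-D9CB-469F-A165-70867728950E"))
# 0f8fad5bd9cb469fa16570867728950e
```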

I'd be open to add another option on top of this:

e) boot up with /etc read-only and /etc/machine-id empty, so that the
   usual logic of c) generates a random machine id and overmounts
   /etc/machine-id with it. But then, add a tiny new bootup service,
   that runs shortly after local-fs.target (i.e. the point where /etc/
   has been made writable if it's supposed to be made writable
   according to fstab), and that syncs the random one used so far back
   to disk, so that at the next boot-up it is fully initialized. This
   tiny service should be properly conditioned so that it only runs if
   /etc/machine-id is overmounted and /etc writable
   (i.e. ConditionPathIsMountPoint=/etc/machine-id and
   ConditionPathIsReadWrite=/etc). Special care should be taken so
   that replacement of the mount by the normal file is
   race-free. (This probably means the tool should open a new mount
   namespace temporarily, unmount /etc/machine-id there, update the
   file underneath and then return to the original mount namespace
   and unmount the file there too, so that at no time the file is
   invalid).
   invalid).
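
A unit along the lines of option e) might be sketched as follows. Only After=local-fs.target and the two Condition lines come from the description above; the description text, the oneshot type, the dependency lines and the helper path are assumptions for illustration, and the helper binary name in particular is a hypothetical placeholder.

```ini
# Sketch only: everything except After= and the two Condition lines is assumed.
[Unit]
Description=Commit a transient machine ID to disk
# Assumed, so the unit can run early instead of waiting for basic.target
DefaultDependencies=no
Conflicts=shutdown.target
Before=shutdown.target
After=local-fs.target
ConditionPathIsMountPoint=/etc/machine-id
ConditionPathIsReadWrite=/etc

[Service]
Type=oneshot
# Hypothetical helper path; the real tool would do the race-free replacement
ExecStart=/usr/lib/systemd/hypothetical-machine-id-commit
```

With both conditions in place the service is skipped on systems where /etc/machine-id is already a regular file on disk, or where /etc never becomes writable.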

The guarantee with /etc/machine-id is really that it is valid at *any*
time, in early boot and late boot and all the time in between.
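
The cases discussed in this thread (file missing, file present but empty, file initialized) can be told apart with a small check. This is a minimal Python sketch under the assumption that a valid ID is exactly the 32 lowercase hexadecimal characters documented in machine-id(5); the function name and the returned labels are invented for the example and are not systemd's own logic.

```python
import re
import tempfile
from pathlib import Path

_MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}")


def machine_id_state(path):
    """Classify a machine-id file as 'missing', 'empty', 'valid' or 'invalid'."""
    path = Path(path)
    if not path.exists():
        return "missing"
    content = path.read_text().strip()
    if not content:
        return "empty"
    return "valid" if _MACHINE_ID_RE.fullmatch(content) else "invalid"


# Demonstration on a throwaway file rather than the real /etc/machine-id.
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "machine-id"
    print(machine_id_state(candidate))   # missing
    candidate.write_text("")
    print(machine_id_state(candidate))   # empty
    candidate.write_text("0f8fad5bd9cb469fa16570867728950e\n")
    print(machine_id_state(candidate))   # valid
```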

Hope this makes sense?

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] Hosts without /etc/machine-id on boot

2014-11-19 Thread Didier Roche

Hey,

Some other topic related to "empty /etc" discussions: when preparing 
some generic distro images, we want to ensure that all
new instances will get a different /etc/machine-id file.
As part of the empty /etc at boot, we first thought that removing
/etc/machine-id would be sufficient, however, the instance then doesn't
generate a new machine-id file and complains heavily.


The new debug message of systemd 216+ helped shed some light on it:
http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
and adding a debug statement in machine_id_setup() from
src/core/machine-id-setup.c just before "open(etc_machine_id, 
O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" explains what happens with 
/proc/mounts:


[2.119041] systemd[1]: rootfs / rootfs rw
[2.126775] systemd[1]: 
/dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 
ro,relatime,data=ordered


It's clear then that at this stage of the boot process / is readonly.
The error message (and code) will say that in this case, what is 
supported is an empty /etc/machine-id. After reboot, the consequence is 
that /etc/machine-id is mounted as a tmpfs:


tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)

However, this means that each boot of this instance will result in a
different machine-id, which isn't what is desired in the empty /etc case 
after a factory reset. I know that there is the utility 
systemd-machine-id-setup that we are running on systemd postinst in 
debian/ubuntu, but that doesn't cover the factory reset one.


Is there anything obvious that I'm missing to cover that case or 
anything in the pipe?

Cheers,
Didier