On 20/11/2014 13:45, Lennart Poettering wrote:
On Wed, 19.11.14 09:45, Didier Roche (didro...@ubuntu.com) wrote:
Hey,
Another topic related to the "empty /etc" discussions: when preparing
generic distro images, we want to ensure that all new
instances will get a different /etc/machine-id file.
As part of the empty /etc at boot, we first thought that removing
/etc/machine-id would be sufficient. However, the instance then doesn't
generate a new machine-id file and complains heavily.
The new debug message of systemd 216+ helped shed some light on it:
http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
and adding a debug statement in machine_id_setup() from
src/core/machine-id-setup.c just before "open(etc_machine_id,
O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" shows what /proc/mounts
contains at that point:
[ 2.119041] systemd[1]: rootfs / rootfs rw
[ 2.126775] systemd[1]:
/dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4
ro,relatime,data=ordered
It's clear then that at this stage of the boot process / is read-only.
The error message (and the code) say that in this case, what is supported
is an empty /etc/machine-id. After reboot, the consequence is that
/etc/machine-id is mounted as a tmpfs:
Yes, generation of the machine ID is done very very early at boot,
before we fork off the first non-PID1 userspace process, and hence
before any file system could be remounted writable.
That makes sense.
tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
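
For anyone who wants to reproduce the read-only check outside of PID 1, here is a minimal sketch that parses /proc/mounts-style text. The helper names (mount_options, is_read_only) are made up for illustration and are not part of systemd. It assumes the standard "device mountpoint fstype options ..." field order and takes the last entry for a mount point as the effective one, since later mounts cover earlier ones.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last entry mounted on mount_point.

    mounts_text is text in /proc/mounts format. Returns None when
    nothing is mounted on mount_point.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, (dump, pass)
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point):
    """True if the effective mount on mount_point carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        text = f.read()
    print("/ read-only:", is_read_only(text, "/"))
    print("/etc/machine-id overmounted:",
          mount_options(text, "/etc/machine-id") is not None)
```

Fed with the two log lines above, is_read_only() reports / as read-only, because the ext4 entry comes after the rootfs one.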
However, this means that each boot of this instance will result in a
different machine-id, which isn't what is desired in the empty /etc case
after a factory reset. I know that there is the
systemd-machine-id-setup utility, which we run in the systemd postinst in
Debian/Ubuntu, but that doesn't cover the factory reset case.
Is there anything obvious that I'm missing to cover that case, or anything in
the pipeline?
You have a couple of options:
a) make /etc writable before systemd is invoked. If you use an initrd
this is without risk, given that the initrd should really invoke
fsck on the root disk anyway, and there's hence little reason to
transition to a read-only root, rather than just doing rw
right-away.
Interesting, I'll run that by our kernel team. However, we run fsck a
little bit later in the boot process to be able to pipe the output to
plymouth.
I'm not sure we should then have two code paths:
- one running fsck from the initrd if /etc/machine-id is empty (like after a
factory reset), showing the results and any failures to the user in
some way
- and then, the general use case: running fsck through the systemd service
systemd-fsck-root.service before local-fs.target and piping the
result to plymouth
b) pre-initialize the machine ID before you boot, at build time.
c) live with random ids
Those two are actually nice features, but they don't apply to a machine
that we factory reset with an empty /etc.
d) pass in the id to use via $container_uuid (if you use a container
manager), or via the DMI uuid field (if you use kvm). Then, create
/etc/machine-id as an empty file, and systemd will initialize it to
this ID rather than a random one.
Usually, option d) is preferable in cloud setups I guess since it
allows seeding the machine id from some externally used UUID, the way
many container/virtualization managers define one anyway.
Fully agreed (not the case I brought up here, but nice to know we can
pass a pre-generated UUID to containers/VMs).
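
To illustrate how option d) relates the two formats: an externally assigned UUID is usually written with dashes, while /etc/machine-id holds 32 lowercase hex characters without dashes. The sketch below only shows that relationship. The function name uuid_to_machine_id and the sample UUID are made up for illustration, and systemd does this conversion itself when it picks up the ID.

```python
import uuid


def uuid_to_machine_id(value):
    """Convert a textual UUID into the machine-id form.

    The result is 32 lowercase hexadecimal characters with no dashes.
    Raises ValueError if value is not a valid UUID string.
    """
    return uuid.UUID(value).hex


if __name__ == "__main__":
    # Arbitrary sample value, not taken from a real machine.
    print(uuid_to_machine_id("01234567-89AB-CDEF-0123-456789ABCDEF"))
```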
I'd be open to adding another option on top of this:
e) boot up with /etc read-only and /etc/machine-id empty, so that the
usual logic of c) generates a random machine id and overmounts
/etc/machine-id with it. But then, add a tiny new bootup service,
that runs shortly after local-fs.target (i.e. the point where /etc/
has been made writable if it's supposed to be made writable
according to fstab), and that syncs the random one used so far back
to disk, so that at the next boot-up it is fully initialized. This
tiny service should be properly conditioned so that it only runs if
/etc/machine-id is overmounted and /etc writable
(i.e. ConditionPathIsMountPoint=/etc/machine-id and
ConditionPathIsReadWrite=/etc). Special care should be taken so
that replacement of the mount by the normal file is
race-free. (This probably means the tool should open a new mount
namespace temporarily, unmount /etc/machine-id there, update the
file underneath and then return to the original mount namespace
and unmount the file there too, so that at no time the file is
invalid).
The guarantee with /etc/machine-id is really that it is valid at *any*
time, in early boot and late boot and all the time in between.
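
To make the "valid at any time" requirement concrete for the on-disk update step alone, here is a minimal sketch. It validates the ID, writes it to a temporary file in the same directory and then renames it over the target, so a reader sees either the old content or the complete new content. The function name write_machine_id is hypothetical, and the sketch deliberately leaves out the mount namespace handling from e); it assumes the underlying file is already reachable at the given path.

```python
import os
import re
import tempfile

# 32 lowercase hexadecimal characters, as stored in /etc/machine-id.
_MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}")


def write_machine_id(path, machine_id):
    """Atomically replace the file at path with machine_id plus a newline."""
    if not _MACHINE_ID_RE.fullmatch(machine_id):
        raise ValueError("not a valid machine id: %r" % (machine_id,))

    directory = os.path.dirname(path) or "."
    # The temporary file must be on the same file system for rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".machine-id.")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(machine_id + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        os.chmod(tmp_path, 0o444)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The 0444 mode mirrors the mode used in the open() call quoted earlier in this thread.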
I think I will go down that path, which is an interesting one and matches some
of my thoughts. Thanks for the guidance and documentation on the
right approach to achieve this race-free! I'll work on something along
those lines and propose a patch.
This should bring us one step closer (even if it will require an empty
/etc/machine-id file for now) to the factory reset (with an empty /etc)
case.
Hope this makes sense?
It completely does, thanks again for your detailed answer!
Cheers,
Didier
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel