Re: [systemd-devel] Hosts without /etc/machine-id on boot
On 21/11/2014 00:41, Lennart Poettering wrote:
> On Thu, 20.11.14 17:23, Didier Roche (didro...@ubuntu.com) wrote:
>
>>> a) make /etc writable before systemd is invoked. If you use an initrd
>>> this is without risk, given that the initrd should really invoke
>>> fsck on the root disk anyway, and there's hence little reason to
>>> transition to a read-only root, rather than just doing rw
>>> right away.
>>
>> Interesting, I'll run that by our kernel team. However, we run fsck a
>> little bit later on in the boot process to be able to pipe the output
>> to plymouth.
>
> At least on Fedora plymouth already runs in the initrd. If Ubuntu does
> the same, then there shouldn't be a difference regarding where fsck is
> run...
>
> Note that running fsck in the initrd for the root fs is really the
> right thing to do: running fsck from the file system you are about to
> check, which you hence cannot trust, is really wrong.
>
>> I'm not sure we should then have two code paths:
>> - one fscking from the initrd if /etc/machine-id is empty (like after
>>   a factory reset), showing the results and any failures to the user
>>   in some way
>> - and then, the general use case: fscking through the systemd service
>>   via systemd-fsck-root.service before local-fs.target and piping the
>>   result to plymouth
>
> The latter is really only useful on non-initrd boots, where there
> isn't any initrd in which fsck could run. General purpose
> distributions should really run fsck in the initrd. Note that
> systemd-fsck-root.service skips itself when it notices that the fs
> was already checked (via a flag file in /run).

Indeed, that makes sense and we should investigate that with our kernel
team in the near future. I'm going to open a thread on it.

>>> The guarantee with /etc/machine-id is really that it is valid at
>>> *any* time, in early boot and late boot and all the time in between.
>>
>> I think I will go down that path, which is an interesting one and
>> matches some of my own thoughts. Thanks for the guidance and
>> documentation on what's the right approach to achieve this race-free!
>> I'll work on something around that and propose a patch.
>
> Looking forward to it.

Here we go :)
I did some refactoring to reuse some functions on the first path, and
added the binary helper, unit and man page. It should follow your
advice and be race-free. I tested various cases locally.

Cheers,
Didier
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Hosts without /etc/machine-id on boot
On Thu, 20.11.14 17:23, Didier Roche (didro...@ubuntu.com) wrote:

>> a) make /etc writable before systemd is invoked. If you use an initrd
>> this is without risk, given that the initrd should really invoke
>> fsck on the root disk anyway, and there's hence little reason to
>> transition to a read-only root, rather than just doing rw
>> right away.
>
> Interesting, I'll run that by our kernel team. However, we run fsck a
> little bit later on in the boot process to be able to pipe the output
> to plymouth.

At least on Fedora plymouth already runs in the initrd. If Ubuntu does
the same, then there shouldn't be a difference regarding where fsck is
run...

Note that running fsck in the initrd for the root fs is really the
right thing to do: running fsck from the file system you are about to
check, which you hence cannot trust, is really wrong.

> I'm not sure we should then have two code paths:
> - one fscking from the initrd if /etc/machine-id is empty (like after
>   a factory reset), showing the results and any failures to the user
>   in some way
> - and then, the general use case: fscking through the systemd service
>   via systemd-fsck-root.service before local-fs.target and piping the
>   result to plymouth

The latter is really only useful on non-initrd boots, where there
isn't any initrd in which fsck could run. General purpose
distributions should really run fsck in the initrd. Note that
systemd-fsck-root.service skips itself when it notices that the fs
was already checked (via a flag file in /run).

>> The guarantee with /etc/machine-id is really that it is valid at
>> *any* time, in early boot and late boot and all the time in between.
>
> I think I will go down that path, which is an interesting one and
> matches some of my own thoughts. Thanks for the guidance and
> documentation on what's the right approach to achieve this race-free!
> I'll work on something around that and propose a patch.

Looking forward to it.

Lennart

-- 
Lennart Poettering, Red Hat
Re: [systemd-devel] Hosts without /etc/machine-id on boot
On 20/11/2014 13:45, Lennart Poettering wrote:
> On Wed, 19.11.14 09:45, Didier Roche (didro...@ubuntu.com) wrote:
>
>> Hey,
>>
>> Some other topic related to the "empty /etc" discussions: when
>> preparing generic distro images, we want to ensure that every new
>> instance gets a different /etc/machine-id file.
>> As part of the empty /etc at boot, we first thought that removing
>> /etc/machine-id would be sufficient. However, the instance then
>> doesn't generate a new machine-id file and complains heavily.
>>
>> The new debug message of systemd 216+ helped shed some light on it:
>> http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
>> and adding a debug statement in machine_id_setup() from
>> src/core/machine-id-setup.c just before "open(etc_machine_id,
>> O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" shows what /proc/mounts
>> contains at that point:
>>
>> [2.119041] systemd[1]: rootfs / rootfs rw
>> [2.126775] systemd[1]: /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 ro,relatime,data=ordered
>>
>> It's clear then that at this stage of the boot process / is read-only.
>> The error message (and code) say that in this case, what is supported
>> is an empty /etc/machine-id. After reboot, the consequence is that
>> /etc/machine-id is mounted as a tmpfs:
>
> Yes, generation of the machine ID is done very, very early at boot,
> before we fork off the first non-PID1 userspace process, and hence
> before any file system could be remounted writable.

That makes sense.

>> tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
>>
>> However, this means that each boot of this instance will result in a
>> different machine-id, which isn't what is desired in the empty /etc
>> case after a factory reset. I know that there is the
>> systemd-machine-id-setup utility, which we run in the systemd postinst
>> in Debian/Ubuntu, but that doesn't cover the factory reset case.
>>
>> Is there anything obvious that I'm missing to cover that case, or
>> anything in the pipeline?
>
> You have a couple of options:
>
> a) make /etc writable before systemd is invoked. If you use an initrd
> this is without risk, given that the initrd should really invoke
> fsck on the root disk anyway, and there's hence little reason to
> transition to a read-only root, rather than just doing rw
> right away.

Interesting, I'll run that by our kernel team. However, we run fsck a
little bit later on in the boot process to be able to pipe the output
to plymouth.
I'm not sure we should then have two code paths:
- one fscking from the initrd if /etc/machine-id is empty (like after
  a factory reset), showing the results and any failures to the user
  in some way
- and then, the general use case: fscking through the systemd service
  via systemd-fsck-root.service before local-fs.target and piping the
  result to plymouth

> b) pre-initialize the machine ID before you boot, at build time.
>
> c) live with random ids

Those two are actually nice features, but they don't apply to a machine
that is factory reset with an empty /etc.

> d) pass in the id to use via $container_uuid (if you use a container
> manager), or via the DMI uuid field (if you use kvm). Then, create
> /etc/machine-id as an empty file, and systemd will initialize it to
> this ID rather than a random one.
>
> Usually, option d) is preferable in cloud setups I guess, since it
> allows seeding the machine id from some externally used UUID, the way
> many container/virtualization managers define one anyway.

Fully agreed (that's not the case I brought up here, but it's nice to
know we can pass a pre-generated UUID to containers/VMs).

> I'd be open to add another option on top of this:
>
> e) boot up with /etc read-only and /etc/machine-id empty, so that the
> usual logic of c) generates a random machine id and overmounts
> /etc/machine-id with it. But then, add a tiny new bootup service
> that runs shortly after local-fs.target (i.e. the point where
> /etc/ has been made writable if it's supposed to be made writable
> according to fstab), and that syncs the random one used so far back
> to disk, so that at the next boot-up it is fully initialized. This
> tiny service should be properly conditioned so that it only runs if
> /etc/machine-id is overmounted and /etc is writable
> (i.e. ConditionPathIsMountPoint=/etc/machine-id and
> ConditionPathIsReadWrite=/etc). Special care should be taken so
> that replacement of the mount by the normal file is race-free. (This
> probably means the tool should open a new mount namespace
> temporarily, unmount /etc/machine-id there, update the file
> underneath and then return to the original mount namespace and
> unmount the file there too, so that at no time the file is
> invalid.)
>
> The guarantee with /etc/machine-id is really that it is valid at
> *any* time, in early boot and late boot and all the time in between.

I think I will go down that path, which is an interesting one and
matches some of my own thoughts. Thanks for the guidance and
documentation on what's the right approach to achieve this race-free!
I'll work on something around that and propose a patch.
Re: [systemd-devel] Hosts without /etc/machine-id on boot
On Wed, 19.11.14 09:45, Didier Roche (didro...@ubuntu.com) wrote:

> Hey,
>
> Some other topic related to the "empty /etc" discussions: when
> preparing generic distro images, we want to ensure that every new
> instance gets a different /etc/machine-id file.
> As part of the empty /etc at boot, we first thought that removing
> /etc/machine-id would be sufficient. However, the instance then
> doesn't generate a new machine-id file and complains heavily.
>
> The new debug message of systemd 216+ helped shed some light on it:
> http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
> and adding a debug statement in machine_id_setup() from
> src/core/machine-id-setup.c just before "open(etc_machine_id,
> O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" shows what /proc/mounts
> contains at that point:
>
> [2.119041] systemd[1]: rootfs / rootfs rw
> [2.126775] systemd[1]: /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 ro,relatime,data=ordered
>
> It's clear then that at this stage of the boot process / is read-only.
> The error message (and code) say that in this case, what is supported
> is an empty /etc/machine-id. After reboot, the consequence is that
> /etc/machine-id is mounted as a tmpfs:

Yes, generation of the machine ID is done very, very early at boot,
before we fork off the first non-PID1 userspace process, and hence
before any file system could be remounted writable.

> tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
>
> However, this means that each boot of this instance will result in a
> different machine-id, which isn't what is desired in the empty /etc
> case after a factory reset. I know that there is the
> systemd-machine-id-setup utility, which we run in the systemd postinst
> in Debian/Ubuntu, but that doesn't cover the factory reset case.
>
> Is there anything obvious that I'm missing to cover that case, or
> anything in the pipeline?

You have a couple of options:

a) make /etc writable before systemd is invoked. If you use an initrd
   this is without risk, given that the initrd should really invoke
   fsck on the root disk anyway, and there's hence little reason to
   transition to a read-only root, rather than just doing rw
   right away.

b) pre-initialize the machine ID before you boot, at build time.

c) live with random ids

d) pass in the id to use via $container_uuid (if you use a container
   manager), or via the DMI uuid field (if you use kvm). Then, create
   /etc/machine-id as an empty file, and systemd will initialize it to
   this ID rather than a random one.

Usually, option d) is preferable in cloud setups I guess, since it
allows seeding the machine id from some externally used UUID, the way
many container/virtualization managers define one anyway.

I'd be open to add another option on top of this:

e) boot up with /etc read-only and /etc/machine-id empty, so that the
   usual logic of c) generates a random machine id and overmounts
   /etc/machine-id with it. But then, add a tiny new bootup service
   that runs shortly after local-fs.target (i.e. the point where
   /etc/ has been made writable if it's supposed to be made writable
   according to fstab), and that syncs the random one used so far back
   to disk, so that at the next boot-up it is fully initialized. This
   tiny service should be properly conditioned so that it only runs if
   /etc/machine-id is overmounted and /etc is writable
   (i.e. ConditionPathIsMountPoint=/etc/machine-id and
   ConditionPathIsReadWrite=/etc). Special care should be taken so
   that replacement of the mount by the normal file is race-free. (This
   probably means the tool should open a new mount namespace
   temporarily, unmount /etc/machine-id there, update the file
   underneath and then return to the original mount namespace and
   unmount the file there too, so that at no time the file is
   invalid.)

The guarantee with /etc/machine-id is really that it is valid at *any*
time, in early boot and late boot and all the time in between.

Hope this makes sense?

Lennart

-- 
Lennart Poettering, Red Hat
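
The conditions listed for option e) can be written down as a small
pre-flight check. What follows is only a Python sketch under stated
assumptions, not the helper proposed later in the thread: all function
names are hypothetical, the mount-point test reads /proc/mounts-style
text instead of relying on ConditionPathIsMountPoint=, and the ID format
(32 lowercase hexadecimal characters, newline-terminated) is taken from
machine-id(5).

```python
import re

# Assumption: a machine ID is exactly 32 lowercase hex characters.
MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_machine_id(text):
    """True if text is a 32-char lowercase hex ID, with an optional final newline."""
    if text.endswith("\n"):
        text = text[:-1]
    return MACHINE_ID_RE.fullmatch(text) is not None


def is_overmounted(proc_mounts_text, path="/etc/machine-id"):
    """True if path appears as a mount point in /proc/mounts-style text."""
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # Second field of each /proc/mounts entry is the mount point.
        if len(fields) >= 2 and fields[1] == path:
            return True
    return False


def should_commit(proc_mounts_text, etc_writable, current_id):
    """Sync the transient ID back to disk only when every condition holds."""
    return (
        is_overmounted(proc_mounts_text)
        and etc_writable
        and is_valid_machine_id(current_id)
    )


# Illustrative inputs: a tmpfs overmount like the one shown in the thread,
# and a made-up machine ID.
MOUNTS = "tmpfs /etc/machine-id tmpfs ro,relatime,size=204948k,mode=755 0 0\n"
EXAMPLE_ID = "0123456789abcdef0123456789abcdef\n"

print(should_commit(MOUNTS, True, EXAMPLE_ID))
```

The sketch only decides whether a commit is appropriate. The race-free
replacement itself (temporary mount namespace, unmount, write, unmount
in the original namespace) needs privileges and is deliberately left
out.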
[systemd-devel] Hosts without /etc/machine-id on boot
Hey,

Some other topic related to the "empty /etc" discussions: when
preparing generic distro images, we want to ensure that every new
instance gets a different /etc/machine-id file.
As part of the empty /etc at boot, we first thought that removing
/etc/machine-id would be sufficient. However, the instance then
doesn't generate a new machine-id file and complains heavily.

The new debug message of systemd 216+ helped shed some light on it:
http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
and adding a debug statement in machine_id_setup() from
src/core/machine-id-setup.c just before "open(etc_machine_id,
O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" shows what /proc/mounts
contains at that point:

[2.119041] systemd[1]: rootfs / rootfs rw
[2.126775] systemd[1]: /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 ro,relatime,data=ordered

It's clear then that at this stage of the boot process / is read-only.
The error message (and code) say that in this case, what is supported
is an empty /etc/machine-id. After reboot, the consequence is that
/etc/machine-id is mounted as a tmpfs:

tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)

However, this means that each boot of this instance will result in a
different machine-id, which isn't what is desired in the empty /etc
case after a factory reset. I know that there is the
systemd-machine-id-setup utility, which we run in the systemd postinst
in Debian/Ubuntu, but that doesn't cover the factory reset case.

Is there anything obvious that I'm missing to cover that case, or
anything in the pipeline?

Cheers,
Didier
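
The read-only state reported in this message can be read directly off
the two /proc/mounts lines quoted in it. A minimal Python sketch,
assuming the usual /proc/mounts field order (device, mount point, type,
options) and assuming that the last entry listed for a mount point is
the one currently visible; the helper name mount_options is made up for
illustration:

```python
def mount_options(proc_mounts_text, mount_point):
    """Return the option set of the last mount listed at mount_point, or None."""
    options = None
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options (dump/pass may follow).
        if len(fields) >= 4 and fields[1] == mount_point:
            options = set(fields[3].split(","))
    return options


# The two root entries from the debug output quoted in the message.
SAMPLE = (
    "rootfs / rootfs rw\n"
    "/dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 "
    "ro,relatime,data=ordered\n"
)

root_options = mount_options(SAMPLE, "/")
print("root is read-only:", "ro" in root_options)
```

With the sample above, the ext4 entry is the later of the two mounts on
/, so its "ro" option is what decides the result.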