There is a scenario where the real rootfs is located on a bcache device.
For that, the bcache device has to be registered at the initrd stage,
which already happens now. We would then locate the file system on it,
do pivot_root and so on.

The bcache<i> naming, I believe, is not guaranteed at this point unless
we have a rule that says so.

Side-tracking to our field use-cases: we need /dev/bcache<i> names to be
persistent and based on superblock UUIDs. So I expect /dev/bcache<i>
names to be persisted by UUID on first discovery (which corresponds to
the MAAS deploy stage, not commissioning as in the case of disk serial
numbers).

However, we also expect bcache<i> names to match the names in MAAS, which
may not happen in this scenario because the <backing-dev-name> : bcache<i>
mapping is not enforced.

Going back to https://bugs.launchpad.net/curtin/+bug/1728742, I think we
can break it down into two problems:

1. bcache device numbers are not static across reboots and we need a
static mapping of superblock UUID to bcache<i> for a given device. This
requires CACHED_UUID to be present in the uevent environment, which is
only possible during a successful registration, where this code path is
triggered. Since rootfs on bcache is a requirement, it makes sense to do
this at the initrd stage, before we have to do pivot_root to the real
rootfs.

Doing something like that once systemd is running, after pivot_root and
after the /dev devtmpfs has been transferred to the real rootfs, doesn't
sound right to me, as we have this problem with double registration. In
summary, I think /dev/bcache/by-uuid/ symlinks for bcache devices that
exist on initial boot should be created via udev rules in the initrd.

This is what this bug is about.
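
A minimal sketch of the by-uuid symlink step, written as a shell function
rather than as a udev rule so that it can be tried outside udev. It
assumes the kernel device name and CACHED_UUID are taken from the uevent
environment; the function name bcache_link_by_uuid and the overridable
base directory are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical helper: mimic what a udev rule keyed on CACHED_UUID would do.
# $1 = kernel device name (e.g. bcache0)
# $2 = CACHED_UUID from the uevent environment
# $3 = base directory (defaults to /dev; overridable so it can be tested)
bcache_link_by_uuid() {
    devname=$1
    cached_uuid=$2
    devdir=${3:-/dev}

    # Without CACHED_UUID (no successful registration) there is nothing to link.
    [ -n "$cached_uuid" ] || return 1

    mkdir -p "$devdir/bcache/by-uuid" || return 1

    # Relative target, so the link stays valid after devtmpfs is moved
    # from the initrd to the real rootfs.
    ln -sf "../../$devname" "$devdir/bcache/by-uuid/$cached_uuid"
}
```

Because the link target is relative, nothing in it depends on where /dev
is mounted at the time the link is created.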

2. bcache device names may not match the ones in MAAS. This has
implications for our use of Juju Storage functionality, where we need
device special files with static names without file systems or partition
tables present. After commissioning in MAAS there is already metadata
present about a given machine: disk serial numbers are gathered (if
present; this is not guaranteed and is block driver-specific AFAIK, but
it is a sane assumption to make), and the device names assigned during
the ephemeral image boot are stored in a database together with the
associated serial numbers. These can be queried to set up dname symlinks
on deployment.

In order to make the <backing-dev-name> : bcache<i> mapping static we
essentially need a mapping of disk serial numbers to bcache superblock
UUIDs, which are in turn mapped to bcache<i> names.
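
The two-step lookup described above can be sketched as follows. This is
only an illustration under assumptions: both tables are passed in as
plain "key value" lines (keys without whitespace), and the function names
lookup and bcache_name_for_serial are hypothetical. Where the tables
would actually come from (MAAS metadata, by-uuid symlinks) is not
covered here.

```shell
#!/bin/sh
# Hypothetical helper: read "key value" lines on stdin and print the value
# for key $1. Exits non-zero if the key is not present.
lookup() {
    awk -v k="$1" '$1 == k { print $2; found = 1; exit } END { exit !found }'
}

# Resolve a disk serial number to a bcache<i> name in two steps:
#   serial -> bcache superblock UUID -> bcache<i>
# $1 = disk serial number
# $2 = "serial uuid" table, one pair per line
# $3 = "uuid bcache-name" table, one pair per line
bcache_name_for_serial() {
    serial=$1
    serial_to_uuid=$2
    uuid_to_name=$3

    uuid=$(printf '%s\n' "$serial_to_uuid" | lookup "$serial") || return 1
    printf '%s\n' "$uuid_to_name" | lookup "$uuid"
}
```

The point of the intermediate UUID is that neither table refers to a
kernel-assigned device number directly, so the result does not depend on
registration order.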

I would say that https://bugs.launchpad.net/curtin/+bug/1728742 is about
p. 2.

====

The rationale for p. 1 is that the init script sets up devtmpfs
initially, which then gets moved over to the real rootfs (init-bottom
script) before pivot_root is performed. systemd then runs its mount
point setup code, which checks whether a given entry in its hard-coded
table of mount points is already a mount point and skips its setup if
so. So anything set up during the initrd stage will stay there after
systemd runs, as devtmpfs is moved and reused.

https://git.launchpad.net/~usd-import-team/ubuntu/+source/systemd/tree/src/core/mount-setup.c?h=applied/ubuntu/xenial-updates#n77
  { "devtmpfs", "/dev", "devtmpfs", "mode=755", MS_NOSUID|MS_STRICTATIME,

path_is_mount_point -> fd_is_mount_point
https://git.launchpad.net/~usd-import-team/ubuntu/+source/systemd/tree/src/core/mount-setup.c?h=applied/ubuntu/xenial-updates#n161

static int mount_one(const MountPoint *p, bool relabel) {
...
        r = path_is_mount_point(p->where, AT_SYMLINK_FOLLOW);
        if (r < 0 && r != -ENOENT) {
                log_full_errno((p->mode & MNT_FATAL) ? LOG_ERR : LOG_DEBUG, r,
                               "Failed to determine whether %s is a mount point: %m", p->where);
                return (p->mode & MNT_FATAL) ? r : 0;
        }
        if (r > 0)
                return 0;
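
The "skip if it is already a mount point" behaviour can be approximated
from a shell for experimentation. This is a rough sketch and not how
systemd implements the check: it simply looks for the path in the mount
point field (field 5) of a mountinfo-format file. The function names
is_mount_point and mount_dev_unless_present are hypothetical, and the
second one only prints the decision instead of mounting anything.

```shell
#!/bin/sh
# Hypothetical helper: succeed if path $1 is listed as a mount point in a
# mountinfo-format file ($2, defaults to /proc/self/mountinfo).
is_mount_point() {
    awk -v p="$1" '$5 == p { found = 1; exit } END { exit !found }' \
        "${2:-/proc/self/mountinfo}"
}

# Mirror of the early return in mount_one(): if /dev is already a mount
# point (e.g. devtmpfs moved over from the initrd), leave it alone.
# $1 = optional mountinfo-format file, for testing.
mount_dev_unless_present() {
    if is_mount_point /dev "$1"; then
        echo "skip"
    else
        echo "mount"
    fi
}
```

On a system booted through the initrd path described here,
mount_dev_unless_present would be expected to print "skip", since /dev
was already mounted before systemd started.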


init script:
https://git.launchpad.net/~usd-import-team/ubuntu/+source/initramfs-tools/tree/init?h=applied/ubuntu/xenial-updates
[ -d /dev ] || mkdir -m 0755 /dev
...

# Note that this only becomes /dev on the real filesystem if udev's scripts
# are used; which they will be, but it's worth pointing out
if ! mount -t devtmpfs -o nosuid,mode=0755 udev /dev; then
     echo "W: devtmpfs not available, falling back to tmpfs for /dev"
     mount -t tmpfs -o nosuid,mode=0755 udev /dev
     [ -e /dev/console ] || mknod -m 0600 /dev/console c 5 1
     [ -e /dev/null ] || mknod /dev/null c 1 3
fi
...


init-bottom:
https://git.launchpad.net/~usd-import-team/ubuntu/+source/systemd/tree/debian/extra/initramfs-tools/scripts/init-bottom/udev?h=applied/ubuntu/xenial-updates

...
# move the /dev tmpfs to the rootfs
mount -n -o move /dev ${rootmnt}/dev

# create a temporary symlink to the final /dev for other initramfs scripts
if command -v nuke >/dev/null; then
  nuke /dev
else
  rm -rf /dev
fi
ln -s ${rootmnt}/dev /dev

https://bugs.launchpad.net/bugs/1729145

Title:
  /dev/bcache/by-uuid links not created after reboot
