On Thu, Sep 24, 2020 at 11:40 AM Pavel Tikhomirov
<ptikhomi...@virtuozzo.com> wrote:
>
>
>
> On 9/23/20 7:09 PM, Amir Goldstein wrote:
> > On Wed, Sep 23, 2020 at 6:23 PM Pavel Tikhomirov
> > <ptikhomi...@virtuozzo.com> wrote:
> >>
> >> This relaxes uuid checks for overlay index feature. It is only possible
> >> in case there is only one filesystem for all the work/upper/lower
> >> directories and bare file handles from this backing filesystem are uniq.
> >> In case we have multiple filesystems here just fall back to normal
> >> "index=on".
> >>
> >> This is needed when overlayfs is/was mounted in a container with
> >> index enabled (e.g.: to be able to resolve inotify watch file handles on
> >> it to paths in CRIU), and this container is copied and started alongside
> >> with the original one. This way the "copy" container can't have the same
> >> uuid on the superblock and mounting the overlayfs from it later would
> >> fail.
> >>
> >> That is an example of the problem on top of loop+ext4:
> >>
> >> dd if=/dev/zero of=loopbackfile.img bs=100M count=10
> >> losetup -fP loopbackfile.img
> >> losetup -a
> >>    #/dev/loop0: [64768]:35 (/loop-test/loopbackfile.img)
> >> mkfs.ext4 loopbackfile.img
> >> mkdir loop-mp
> >> mount -o loop /dev/loop0 loop-mp
> >> mkdir loop-mp/{lower,upper,work,merged}
> >> mount -t overlay overlay -oindex=on,lowerdir=loop-mp/lower,\
> >> upperdir=loop-mp/upper,workdir=loop-mp/work loop-mp/merged
> >> umount loop-mp/merged
> >> umount loop-mp
> >> e2fsck -f /dev/loop0
> >> tune2fs -U random /dev/loop0
> >>
> >> mount -o loop /dev/loop0 loop-mp
> >> mount -t overlay overlay -oindex=on,lowerdir=loop-mp/lower,\
> >> upperdir=loop-mp/upper,workdir=loop-mp/work loop-mp/merged
> >>    #mount: /loop-test/loop-mp/merged:
> >>    #mount(2) system call failed: Stale file handle.
> >>
> >> mount -t overlay overlay -oindex=nouuid,lowerdir=loop-mp/lower,\
> >> upperdir=loop-mp/upper,workdir=loop-mp/work loop-mp/merged
> >>
> >> If you just change the uuid of the backing filesystem, overlay is not
> >> mounting any more. In Virtuozzo we copy container disks (ploops) when
> >> crate the copy of container and we require fs uuid to be uniq for a new
> >> container.
> >>
> >> v2: in v1 I missed actual uuid check skip - add it
> >>
> >> CC: Amir Goldstein <amir7...@gmail.com>
> >> CC: Vivek Goyal <vgo...@redhat.com>
> >> CC: Miklos Szeredi <mik...@szeredi.hu>
> >> CC: linux-unio...@vger.kernel.org
> >> CC: linux-kernel@vger.kernel.org
> >> Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
> >> ---
> >
> > Look reasonable, but you need to rebase over
> > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git 
> > overlayfs-next
> >
> > Which makes a lot of your work unneeded.
> > ofs is propagated to most of the relevant helpers
> > and you should propagate it down to ovl_decode_real_fh().
>
> Thanks! Will do.
>
> >
> > Some minor comments below...
> >
> >>   fs/overlayfs/Kconfig     | 16 +++++++++++
> >>   fs/overlayfs/export.c    |  6 ++--
> >>   fs/overlayfs/namei.c     | 35 +++++++++++++++--------
> >>   fs/overlayfs/overlayfs.h | 23 +++++++++++----
> >>   fs/overlayfs/ovl_entry.h |  2 +-
> >>   fs/overlayfs/super.c     | 61 +++++++++++++++++++++++++++++-----------
> >>   6 files changed, 106 insertions(+), 37 deletions(-)
> >>
> >> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> >> index dd188c7996b3..b00fd44006f9 100644
> >> --- a/fs/overlayfs/Kconfig
> >> +++ b/fs/overlayfs/Kconfig
> >> @@ -61,6 +61,22 @@ config OVERLAY_FS_INDEX
> >>
> >>            If unsure, say N.
> >>
> >> +config OVERLAY_FS_INDEX_NOUUID
> >> +       bool "Overlayfs: relax uuid checks of inodes index feature"
> >> +       depends on OVERLAY_FS
> >> +       depends on OVERLAY_FS_INDEX
> >> +       help
> >> +         If this config option is enabled then overlay will skip uuid 
> >> checks
> >> +         for index lower to upper inode map, this only can be done if all
> >> +         upper and lower directories are on the same filesystem where 
> >> basic
> >> +         fhandles are uniq.
> >> +
> >> +         It is needed to overcome possible change of uuid on superblock 
> >> of the
> >> +         backing filesystem, e.g. when you copied the virtual disk and 
> >> mount
> >> +         both the copy of the disk and the original one at the same time.
> >> +
> >> +         If unsure, say N.
> >> +
> >
> > Please do not add a config option for this.
> > Isn't a mount option sufficient for your needs?
>
> Users inside Virtuozzo container can mount overlayfs inside the CT (we
> assume that they do "regular" mounts without any "index=" option as
> docker does) so we wan't to setup the default in kernel config, so that
> all "regular" mounts of the user become "index=nouuid" automatically,
> and thus we would be able to both migrate (CRIU inotify resolution by
> fhandle on dump) and copy (copy disk with uuid change) the container
> without problem.
>

That is a problem.
So far, mount options always override module and kernel config options.
So if mount option says inodex=on explicitly, it should be index=on.

The only way I see for you to tackle this is if nouuid option is independent
of index option, e.g. index=on,anon_uuid (or uuid=off).

But if kernel config has a way to turn this on, then module param and
mount option should have a way to turn it off.

[...]
> >> @@ -1889,9 +1911,14 @@ static int ovl_fill_super(struct super_block *sb, 
> >> void *data, int silent)
> >>          if (err)
> >>                  goto out_free_oe;
> >>
> >> +       if (ofs->config.index == OVL_INDEX_NOUUID && ofs->numfs > 1) {
> >> +               pr_warn("The index=nouuid requires a single fs for lower 
> >> and upper, falling back to index=on.\n");
> >> +               ofs->config.index = OVL_INDEX_ON;
> >> +       }
> >> +
> >
> > It's too late for this fallback now, you already did relaxed 
> > ovl_verify_origin()
> > and now we will continue as if all is ok.
> > Please fail the mount in this case.
> >
> > I don't think that users that specifically requested index=nouuid would 
> > care to
> > fallback to index=on.
>
> No, it's we who will force users to switch to index=nouuid in our
> Virtuozzo case through kernel config default, so probably having a
> fallback is a good thing, as users will be able to use their overlay at
> least until we "copy" their container.
>
> Maybe I can just move my check before ovl_get_indexdir (as far as I can
> see it is the only place from ovl_fill_super where we reach
> ovl_verify_fh or ovl_decode_real_fh) and after ovl_get_lowerstack where
> numfs is calculated. - Will do.

Yes, that would be fine techicaly.
I am not sure if the fallback is helpful though.

Anyway, you would need to get approval from Miklos
on the option itself and how it behaves.

Thanks,
Amir.

Reply via email to