Am Montag, den 25.01.2021, 20:51 +0000 schrieb Filipe Manana:
> On Mon, Jan 25, 2021 at 7:51 PM Roman Anasal <roman.ana...@bdsu.de>
> wrote:
> > This is analogous to the preceding patch ("btrfs: send: fix invalid
> > commands for inodes with changed type but same gen") but for
> > changed
> > rdev:
> > 
> > When doing an incremental send, if a new inode has the same number
> > as an
> > inode in the parent subvolume, was created with the same generation
> > but
> > has differing rdev it will not be detected as changed and thus not
> > recreated. This will lead to incorrect results on the receiver
> > where the
> > inode will keep the rdev of the inode in the parent subvolume or
> > even
> > fail when also the ref is unchanged.
> > 
> > This case does not happen when doing incremental sends with
> > snapshots
> > that are kept read-only by the user all the time, but it may happen
> > if
> > - a snapshot was modified in the same transaction as its parent
> > after it
> >   was created
> > - the subvol used as parent was created independently from the sent
> > subvol
> > 
> > Example reproducers:
> > 
> >   # case 1: same ino at same path
> >   btrfs subvolume create subvol1
> >   btrfs subvolume create subvol2
> >   mknod subvol1/a c 1 3
> >   mknod subvol2/a c 1 5
> >   btrfs property set subvol1 ro true
> >   btrfs property set subvol2 ro true
> >   btrfs send -p subvol1 subvol2 | btrfs receive --dump
> > 
> > The produced tree state here is:
> >   |-- subvol1
> >   |   `-- a         (ino 257, c 1,3)
> >   |
> >   `-- subvol2
> >       `-- a         (ino 257, c 1,5)
> > 
> > Where subvol1/a and subvol2/a are character devices with differing
> > minor
> > numbers but same inode number and same generation.
> > 
> > Example output of the receive command:
> >   At subvol subvol2
> >   snapshot        ./subvol2                       uuid=7513941c-
> > 4ef7-f847-b05e-4fdfe003af7b transid=9 parent_uuid=b66f015b-c226-
> > 2548-9e39-048c7fdbec99 parent_transid=9
> >   utimes          ./subvol2/                      atime=2021-01-
> > 25T17:14:36+0000 mtime=2021-01-25T17:14:36+0000 ctime=2021-01-
> > 25T17:14:36+0000
> >   link            ./subvol2/a                     dest=a
> >   unlink          ./subvol2/a
> >   utimes          ./subvol2/                      atime=2021-01-
> > 25T17:14:36+0000 mtime=2021-01-25T17:14:36+0000 ctime=2021-01-
> > 25T17:14:36+0000
> >   utimes          ./subvol2/a                     atime=2021-01-
> > 25T17:14:36+0000 mtime=2021-01-25T17:14:36+0000 ctime=2021-01-
> > 25T17:14:36+0000
> > 
> > => the `link` command causes the receiver to fail with:
> >    ERROR: link a -> a failed: File exists
> > 
> > Second example:
> >   # case 2: same ino at different path
> >   btrfs subvolume create subvol1
> >   btrfs subvolume create subvol2
> >   mknod subvol1/a c 1 3
> >   mknod subvol2/b c 1 5
> >   btrfs property set subvol1 ro true
> >   btrfs property set subvol2 ro true
> >   btrfs send -p subvol1 subvol2 | btrfs receive --dump
> 
> As I've told you before for the v1 patchset from a week or two ago,
> this is not a supported scenario for incremental sends.
> Incremental sends are meant to be used on RO snapshots of the same
> subvolume, and those snapshots must never be changed after they were
> created.
> 
> Incremental sends were simply not designed for these cases, and can
> never be guaranteed to work with such cases.
> 
> The bug is not having incremental sends fail right away, with an
> explicit error message, when the send and parent roots aren't RO
> snapshots of the same subvolume.

I am sorry, I didn't want to anger you or to appear to be just stubborn
by posting this.

As I wrote in the cover letter I am aware that this is not a supported
use case and I understand that that makes the patches likely to be
rejected.
As said the reason I _wrote_ the patches was simply to learn more about
the btrfs code and its internals and see if I would be able to
understand it enough. The reason I _submitted_ them was just to
document what I found out so others could have a look into it and just
in case it maybe useful at a later time.

I also don't want to claim that these will add full support for sending
unrelated roots - they don't! They only handle those very specific edge
cases I found, which are currently _possible_, although still not
supported.

I took a deeper look into the rest to see if it could be supported:
the comparing algorithm actually works fine, even with completely
unrelated subvolumes (i.e. btrfs_compare_trees, changed_cb,
changed_inode etc.), but the processing of the changes (i.e.
process_recorded_refs etc.) is heavily based on (ino, gen) as
identifying handle, which can not be changed without the high risk of
regression - just as you said in your earlier comments - since side
effects of any changes are hard to see or understand without a very
deep understanding of the whole code; which is why I didn't even try to
touch that parts.

I apologize if I appeared to be stubborn or ignorant of your feedback!
That really wasn't my intent.


> > The produced tree state here is:
> >   |-- subvol1
> >   |   `-- a         (ino 257, c 1,3)
> >   |
> >   `-- subvol2
> >       `-- b         (ino 257, c 1,5)
> > 
> > Where subvol1/a and subvol2/b are character devices with differing
> > minor
> > numbers but same inode number and same generation.
> > 
> > Example output of the receive command:
> >   At subvol subvol2
> >   snapshot        ./subvol2                       uuid=1c175819-
> > 8b97-0046-a20e-5f95e37cbd40 transid=13 parent_uuid=bad4a908-21b4-
> > 6f40-9a08-6b0768346725 parent_transid=13
> >   utimes          ./subvol2/                      atime=2021-01-
> > 25T17:18:46+0000 mtime=2021-01-25T17:18:46+0000 ctime=2021-01-
> > 25T17:18:46+0000
> >   link            ./subvol2/b                     dest=a
> >   unlink          ./subvol2/a
> >   utimes          ./subvol2/                      atime=2021-01-
> > 25T17:18:46+0000 mtime=2021-01-25T17:18:46+0000 ctime=2021-01-
> > 25T17:18:46+0000
> >   utimes          ./subvol2/b                     atime=2021-01-
> > 25T17:18:46+0000 mtime=2021-01-25T17:18:46+0000 ctime=2021-01-
> > 25T17:18:46+0000
> > 
> > => subvol1/a is renamed to subvol2/b instead of recreated to
> > updated
> >    rdev which results in received subvol2/b having the wrong minor
> >    number:
> > 
> >   257 crw-r--r--. 1 root root 1, 3 Jan 25 17:18 subvol2/b
> > 
> > Signed-off-by: Roman Anasal <roman.ana...@bdsu.de>
> > ---
> > v2:
> >   - add this patch to also handle changed rdev
> > ---
> >  fs/btrfs/send.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> > index c8b1f441f..ef544525f 100644
> > --- a/fs/btrfs/send.c
> > +++ b/fs/btrfs/send.c
> > @@ -6263,6 +6263,7 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >         struct btrfs_inode_item *right_ii = NULL;
> >         u64 left_gen = 0;
> >         u64 right_gen = 0;
> > +       u64 left_rdev, right_rdev;
> >         u64 left_type, right_type;
> > 
> >         sctx->cur_ino = key->objectid;
> > @@ -6285,6 +6286,8 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >                                 struct btrfs_inode_item);
> >                 left_gen = btrfs_inode_generation(sctx->left_path-
> > >nodes[0],
> >                                 left_ii);
> > +               left_rdev = btrfs_inode_rdev(sctx->left_path-
> > >nodes[0],
> > +                               left_ii);
> >         } else {
> >                 right_ii = btrfs_item_ptr(sctx->right_path-
> > >nodes[0],
> >                                 sctx->right_path->slots[0],
> > @@ -6300,6 +6303,9 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >                 right_gen = btrfs_inode_generation(sctx-
> > >right_path->nodes[0],
> >                                 right_ii);
> > 
> > +               right_rdev = btrfs_inode_rdev(sctx->right_path-
> > >nodes[0],
> > +                               right_ii);
> > +
> >                 left_type = S_IFMT & btrfs_inode_mode(
> >                                 sctx->left_path->nodes[0],
> > left_ii);
> >                 right_type = S_IFMT & btrfs_inode_mode(
> > @@ -6310,7 +6316,8 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >                  * the inode as deleted+reused because it would
> > generate a
> >                  * stream that tries to delete/mkdir the root dir.
> >                  */
> > -               if ((left_gen != right_gen || left_type !=
> > right_type) &&
> > +               if ((left_gen != right_gen || left_type !=
> > right_type ||
> > +                   left_rdev != right_rdev) &&
> >                     sctx->cur_ino != BTRFS_FIRST_FREE_OBJECTID)
> >                         sctx->cur_inode_recreated = 1;
> >         }
> > @@ -6350,8 +6357,7 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >                                 sctx->left_path->nodes[0],
> > left_ii);
> >                 sctx->cur_inode_mode = btrfs_inode_mode(
> >                                 sctx->left_path->nodes[0],
> > left_ii);
> > -               sctx->cur_inode_rdev = btrfs_inode_rdev(
> > -                               sctx->left_path->nodes[0],
> > left_ii);
> > +               sctx->cur_inode_rdev = left_rdev;
> >                 if (sctx->cur_ino != BTRFS_FIRST_FREE_OBJECTID)
> >                         ret = send_create_inode_if_needed(sctx);
> >         } else if (result == BTRFS_COMPARE_TREE_DELETED) {
> > @@ -6396,8 +6402,7 @@ static int changed_inode(struct send_ctx
> > *sctx,
> >                                         sctx->left_path->nodes[0],
> > left_ii);
> >                         sctx->cur_inode_mode = btrfs_inode_mode(
> >                                         sctx->left_path->nodes[0],
> > left_ii);
> > -                       sctx->cur_inode_rdev = btrfs_inode_rdev(
> > -                                       sctx->left_path->nodes[0],
> > left_ii);
> > +                       sctx->cur_inode_rdev = left_rdev;
> >                         ret = send_create_inode_if_needed(sctx);
> >                         if (ret < 0)
> >                                 goto out;
> > --
> > 2.26.2
> > 
> 
> 

Reply via email to