On Wed, Mar 24, 2021 at 11:15 AM bingjingc <bingji...@synology.com> wrote: > > From: BingJing Chang <bingji...@synology.com> > > In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in > a already existed hole."), existed holes can be skipped by calling > find_first_non_hole() to adjust *start and *len. However, if the given > len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the > *len will not be set to zero because (em->start + em->len) is less than > (*start + *len). Then the ret will be 1 but the *len will not be set to > 0. The propagated non-zero ret will result in fallocate failure. > > In the while-loop of btrfs_replace_file_extents(), len is not updated > every time before it calls find_first_non_hole(). That is, if the last > file extent in the given hole-punching range has been dropped but > btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out > of reserved space of the given transaction), the problem can happen.
This is not entirely clear. Dropping the last extent and still returning ENOSPC is confusing. I think you mean that it drops the last file extent item that does not represent hole (disk_bytenr > 0), and after it there's only one file extent item representing a hole (disk_bytenr == 0). It fails with -ENOSPC when attempting to drop the file extent item representing the hole, after successfully dropping the non-hole file extent item. Is that it? > After it calls find_first_non_hole(), the cur_offset will be adjusted > to be larger than or equal to end. However, since the len is not set to > zero. The break-loop condition (ret && !len) will not meet. After it > leaves the while-loop, uncleared ret will result in fallocate failure. Ok, fallocate will return 1, an unexpected return value. > > We're not able to construct a reproducible way to let > btrfs_drop_extents() fails with -ENOSPC after it drops the last file > extent but with remaining holes. However, it's quite easy to fix. We > just need to update and check the len every time before we call > find_first_non_hole(). To make the while loop more readable, we also > pull the variable updates to the bottom of loop like this: > while (cur_offset < end) { > ... > // update cur_offset & len > // advance cur_offset & len in hole-punching case if needed > } > > Reported-by: Robbie Ko <robbi...@synology.com> > Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a > already existed hole.") > Reviewed-by: Robbie Ko <robbi...@synology.com> > Reviewed-by: Chung-Chiang Cheng <ccch...@synology.com> > Signed-off-by: BingJing Chang <bingji...@synology.com> Looks good. Please just update that paragraph to be more clear about what is going on. Thanks. > --- > fs/btrfs/file.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 0e155f0..dccb017 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, > struct btrfs_path *path, > extent_info->file_offset += replace_len; > } > > - cur_offset = drop_args.drop_end; > - > ret = btrfs_update_inode(trans, root, BTRFS_I(inode)); > if (ret) > break; > @@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, > struct btrfs_path *path, > BUG_ON(ret); /* shouldn't happen */ > trans->block_rsv = rsv; > > - if (!extent_info) { > + cur_offset = drop_args.drop_end; > + len = end - cur_offset; > + if (!extent_info && len) { > ret = find_first_non_hole(BTRFS_I(inode), &cur_offset, > &len); > if (unlikely(ret < 0)) > -- > 2.7.4 > -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”