On Mon, Aug 06, 2018 at 10:30:30AM +0800, robbieko wrote: > From: Robbie Ko <robbi...@synology.com> > > Commit e9894fd3e3b3 ("Btrfs: fix snapshot vs nocow writting") > forced nocow writes to fallback to COW, during writeback, > when a snapshot is created. This resulted in writes made before > creating the snapshot to unexpectedly fail with ENOSPC during > writeback when success (0) was returned to user space through > the write system call. > > The steps leading to this problem are: > > 1. When it's not possible to allocate data space for a write, > the buffered write path checks if a NOCOW write is possible. > If it is, it will not reserve space and success (0) is returned > to user space. > > 2. Then when a snapshot is created, the root's will_be_snapshotted > atomic is incremented and writeback is triggered for all inode's > that belong to the root being snapshotted. Incrementing that atomic > forces all previous writes to fallback to COW during writeback > (running delalloc). > > 3. This results in the writeback for the inodes to fail and therefore > setting the ENOSPC error in their mappings, so that a subsequent fsync > on them will report the error to user space. So it's not a completely > silent data loss (since fsync will report ENOSPC) but it's a very > unexpected and undesirable behaviour, because if a clean > shutdown/unmount of the filesystem happens without previous calls to > fsync, it is expected to have the data present in the files after > mounting the filesystem again. > > So fix this by adding a new atomic named snapshot_force_cow to the > root structure which prevents this behaviour and works the following way: > > 1. It is incremented when we start to create a snapshot after > triggering writeback and before waiting for writeback to finish. > > 2. This new atomic is now what is used by writeback (running delalloc) > to decide whether we need to fallback to COW or not. Because we > incremented this new atomic after triggering writeback in the snapshot > creation ioctl, we ensure that all buffered writes that happened > before snapshot creation will succeed and not fallback to COW > (which would make them fail with ENOSPC). > > 3. The existing atomic, will_be_snapshotted, is kept because it is > used to force new buffered writes, that start after we started > snapshotting, to reserve data space even when NOCOW is possible. > This makes these writes fail early with ENOSPC when there's no > available space to allocate, preventing the unexpected behaviour > of writeback later failing with ENOSPC due to a fallback to COW mode. > > Fixes: e9894fd3e3b3 ("Btrfs: fix snapshot vs nocow writting") > Signed-off-by: Robbie Ko <robbi...@synology.com> > Reviewed-by: Filipe Manana <fdman...@suse.com>
Added to misc-next and will be probably in the 2nd pull for the 4.19 merge window, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html