On Mon, Aug 06, 2018 at 10:30:30AM +0800, robbieko wrote:
> From: Robbie Ko <robbi...@synology.com>
> 
> Commit e9894fd3e3b3 ("Btrfs: fix snapshot vs nocow writting")
> forced NOCOW writes to fall back to COW during writeback when a
> snapshot is created. As a result, writes made before creating the
> snapshot could unexpectedly fail with ENOSPC during writeback, even
> though success (0) had already been returned to user space through
> the write system call.
> 
> The steps leading to this problem are:
> 
> 1. When it's not possible to allocate data space for a write,
> the buffered write path checks if a NOCOW write is possible.
> If it is, it will not reserve space and success (0) is returned
> to user space.
> 
> 2. Then when a snapshot is created, the root's will_be_snapshotted
> atomic is incremented and writeback is triggered for all inodes
> that belong to the root being snapshotted. Incrementing that atomic
> forces all previous writes to fall back to COW during writeback
> (running delalloc).
> 
> 3. This causes writeback for the inodes to fail and therefore sets
> the ENOSPC error in their mappings, so that a subsequent fsync on
> them will report the error to user space. So it's not a completely
> silent data loss (since fsync will report ENOSPC), but it's a very
> unexpected and undesirable behaviour: after a clean shutdown/unmount
> of the filesystem without previous calls to fsync, the data is
> expected to be present in the files once the filesystem is mounted
> again.
> 
> So fix this by adding a new atomic named snapshot_force_cow to the
> root structure which prevents this behaviour and works the following way:
> 
> 1. It is incremented when we start to create a snapshot, after
> triggering writeback and before waiting for writeback to finish.
> 
> 2. This new atomic is now what is used by writeback (running delalloc)
> to decide whether we need to fall back to COW or not. Because we
> increment this new atomic after triggering writeback in the snapshot
> creation ioctl, we ensure that all buffered writes that happened
> before snapshot creation will succeed and not fall back to COW
> (which would make them fail with ENOSPC).
> 
> 3. The existing atomic, will_be_snapshotted, is kept because it is
> used to force new buffered writes, that start after we started
> snapshotting, to reserve data space even when NOCOW is possible.
> This makes these writes fail early with ENOSPC when there's no
> available space to allocate, preventing the unexpected behaviour
> of writeback later failing with ENOSPC due to a fallback to COW mode.
> 
> Fixes: e9894fd3e3b3 ("Btrfs: fix snapshot vs nocow writting")
> Signed-off-by: Robbie Ko <robbi...@synology.com>
> Reviewed-by: Filipe Manana <fdman...@suse.com>

Added to misc-next and will probably be in the 2nd pull for the 4.19
merge window, thanks.
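
For anyone following the ordering argument in the commit message, here is a
minimal user-space sketch in C of how the two counters interact. It is not the
kernel code: struct model_root and all of the helper functions are hypothetical
names made up for illustration, and writeback is reduced to a single decision
point inside snapshot_begin().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two counters kept on the root. */
struct model_root {
	atomic_int will_be_snapshotted;	/* existing counter */
	atomic_int snapshot_force_cow;	/* counter added by the patch */
};

static void model_root_init(struct model_root *root)
{
	atomic_init(&root->will_be_snapshotted, 0);
	atomic_init(&root->snapshot_force_cow, 0);
}

/*
 * New buffered write that could not reserve data space: it may proceed
 * without a reservation only if NOCOW is possible and no snapshot has
 * started on this root.
 */
static bool write_may_skip_reservation(struct model_root *root,
				       bool nocow_possible)
{
	return nocow_possible &&
	       atomic_load(&root->will_be_snapshotted) == 0;
}

/* Writeback (running delalloc): must the range fall back to COW? */
static bool writeback_must_cow(struct model_root *root)
{
	return atomic_load(&root->snapshot_force_cow) != 0;
}

/*
 * Snapshot creation in the order described in the commit message.
 * Returns the COW decision seen by writes that were buffered before
 * the snapshot started.
 */
static bool snapshot_begin(struct model_root *root)
{
	bool earlier_writes_cow;

	atomic_fetch_add(&root->will_be_snapshotted, 1);

	/* Writeback is triggered here, before snapshot_force_cow is set. */
	earlier_writes_cow = writeback_must_cow(root);

	atomic_fetch_add(&root->snapshot_force_cow, 1);

	/* The real code would now wait for writeback to finish. */
	return earlier_writes_cow;
}

/* Assumed teardown for the model: drop both counters again. */
static void snapshot_end(struct model_root *root)
{
	atomic_fetch_sub(&root->snapshot_force_cow, 1);
	atomic_fetch_sub(&root->will_be_snapshotted, 1);
}
```

In this model, writes buffered before snapshot_begin() are written back while
snapshot_force_cow is still zero, so they stay NOCOW. Writes that start
afterwards see will_be_snapshotted set and have to reserve data space up front,
so they fail early with ENOSPC rather than at writeback time.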