Re: [PATCH 14/25] vfs: make remap_file_range functions take and return bytes completed

2018-10-09 Thread Amir Goldstein
On Wed, Oct 10, 2018 at 3:14 AM Darrick J. Wong wrote: > > From: Darrick J. Wong > > Change the remap_file_range functions to take a number of bytes to > operate upon and return the number of bytes they operated on. This is a > requirement for allowing fs implementations to return short clone/de

Re: [PATCH 15/25] vfs: plumb RFR_* remap flags through the vfs clone functions

2018-10-09 Thread Amir Goldstein
On Wed, Oct 10, 2018 at 9:22 AM Amir Goldstein wrote: > > On Wed, Oct 10, 2018 at 3:14 AM Darrick J. Wong > wrote: > > > > From: Darrick J. Wong > > > > Plumb a remap_flags argument through the {do,vfs}_clone_file_range > > functions so that clone can take advantage of it. > > > > Signed-off-by

Re: [PATCH 15/25] vfs: plumb RFR_* remap flags through the vfs clone functions

2018-10-09 Thread Amir Goldstein
On Wed, Oct 10, 2018 at 3:14 AM Darrick J. Wong wrote: > > From: Darrick J. Wong > > Plumb a remap_flags argument through the {do,vfs}_clone_file_range > functions so that clone can take advantage of it. > > Signed-off-by: Darrick J. Wong > --- [...] > diff --git a/fs/overlayfs/file.c b/fs/overl

Re: [PATCH 08/25] vfs: combine the clone and dedupe into a single remap_file_range

2018-10-09 Thread Amir Goldstein
On Wed, Oct 10, 2018 at 3:12 AM Darrick J. Wong wrote: > > From: Darrick J. Wong > > Combine the clone_file_range and dedupe_file_range operations into a > single remap_file_range file operation dispatch since they're > fundamentally the same operation. The differences between the two can > be m

Re: [PATCH 06/25] vfs: strengthen checking of file range inputs to generic_remap_checks

2018-10-09 Thread Amir Goldstein
On Wed, Oct 10, 2018 at 3:11 AM Darrick J. Wong wrote: > > From: Darrick J. Wong > > File range remapping, if allowed to run past the destination file's EOF, > is an optimization on a regular file write. Regular file writes that > extend the file length are subject to various constraints which a

Recovery options for damaged beginning of the filesystem

2018-10-09 Thread Shapranov Vladimir
Hi, I've got a filesystem with the first ~50Mb accidentally dd'ed. "btrfs check" fails with the following error (regardless of "-s"): checksum verify failed on 21037056 found FC8A6557 wanted 2F51D090 checksum verify failed on 21037056 found FC8A6557 wanted 2F51D090 checksum verify failed on 21037056 fo

Re: [PATCH v2 00/25] fs: fixes for serious clone/dedupe problems

2018-10-09 Thread Darrick J. Wong
On Wed, Oct 10, 2018 at 12:02:08PM +1100, Dave Chinner wrote: > On Tue, Oct 09, 2018 at 05:10:38PM -0700, Darrick J. Wong wrote: > > Hi all, > > > > Dave, Eric, and I have been chasing a stale data exposure bug in the XFS > > reflink implementation, and tracked it down to reflink forgetting to do

Re: [PATCH v2 00/25] fs: fixes for serious clone/dedupe problems

2018-10-09 Thread Dave Chinner
On Tue, Oct 09, 2018 at 05:10:38PM -0700, Darrick J. Wong wrote: > Hi all, > > Dave, Eric, and I have been chasing a stale data exposure bug in the XFS > reflink implementation, and tracked it down to reflink forgetting to do > some of the file-extending activities that must happen for regular > w

Re: [PATCH 01/25] xfs: add a per-xfs trace_printk macro

2018-10-09 Thread Dave Chinner
On Tue, Oct 09, 2018 at 05:10:45PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong > > Add a "xfs_tprintk" macro so that developers can use trace_printk to > print out arbitrary debugging information with the XFS device name > attached to the trace output. > > Signed-off-by: Darrick J. Won

[PATCH 25/25] xfs: support returning partial reflink results

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Back when the XFS reflink code only supported clone_file_range, we were only able to return zero or negative error codes to userspace. However, now that copy_file_range (which returns bytes copied) can use XFS' clone_file_range, we have the opportunity to return partial res

[PATCH 19/25] vfs: hide file range comparison function

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong There are no callers of vfs_dedupe_file_range_compare, so we might as well make it a static helper and remove the export. Signed-off-by: Darrick J. Wong --- fs/read_write.c| 191 ++-- include/linux/fs.h |3 - 2 file

[PATCH 20/25] vfs: implement opportunistic short dedupe

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong For a given dedupe request, the bytes_deduped field in the control structure tells userspace if we managed to deduplicate some, but not all of, the requested regions starting from the file offsets supplied. However, due to sloppy coding, the current dedupe code returns FILE_

[PATCH 24/25] xfs: fix pagecache truncation prior to reflink

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Prior to remapping blocks, it is necessary to remove pages from the destination file's page cache. Unfortunately, the truncation is not aggressive enough -- if page size > block size, we'll end up zeroing subpage blocks instead of removing them. So, round the start offset

[PATCH 23/25] ocfs2: support partial clone range and dedupe range

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Change the ocfs2 remap code to allow for returning partial results. Signed-off-by: Darrick J. Wong --- fs/ocfs2/file.c |7 + fs/ocfs2/refcounttree.c | 73 ++- fs/ocfs2/refcounttree.h | 12 3 files ch

[PATCH 15/25] vfs: plumb RFR_* remap flags through the vfs clone functions

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Plumb a remap_flags argument through the {do,vfs}_clone_file_range functions so that clone can take advantage of it. Signed-off-by: Darrick J. Wong --- fs/ioctl.c |2 +- fs/nfsd/vfs.c |2 +- fs/overlayfs/copy_up.c |2 +- fs/overlayfs/file.

[PATCH 10/25] vfs: rename clone_verify_area to remap_verify_area

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Since we use clone_verify_area for both clone and dedupe range checks, rename the function to make it clear that it's for both. Signed-off-by: Darrick J. Wong --- fs/read_write.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/read_wri

[PATCH 17/25] vfs: make remapping to source file eof more explicit

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Create a RFR_TO_SRC_EOF flag to explicitly declare that the caller wants the remap implementation to remap to the end of the source file, once the files are locked. Signed-off-by: Darrick J. Wong --- fs/ioctl.c |3 ++- fs/nfsd/vfs.c |3 ++- fs/read_wr

[PATCH 21/25] ocfs2: truncate page cache for clone destination file before remapping

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong When cloning blocks into another file, truncate the page cache before we start remapping blocks so that concurrent reads wait for us to finish. Signed-off-by: Darrick J. Wong --- fs/ocfs2/refcounttree.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-)

[PATCH 22/25] ocfs2: fix pagecache truncation prior to reflink

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Prior to remapping blocks, it is necessary to remove pages from the destination file's page cache. Unfortunately, the truncation is not aggressive enough -- if page size > block size, we'll end up zeroing subpage blocks instead of removing them. So, round the start offset

[PATCH 18/25] vfs: enable remap callers that can handle short operations

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Plumb in a remap flag that enables the filesystem remap handler to shorten remapping requests for callers that can handle it. Now copy_file_range can report partial success (in case we run up against alignment problems, resource limits, etc.). Signed-off-by: Darrick J. Won

[PATCH 13/25] vfs: pass remap flags to generic_remap_checks

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Pass the same remap flags to generic_remap_checks for consistency. Signed-off-by: Darrick J. Wong --- fs/read_write.c|2 +- include/linux/fs.h |2 +- mm/filemap.c |4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/read_write.

[PATCH 16/25] vfs: plumb RFR_* remap flags through the vfs dedupe functions

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Plumb a remap_flags argument through the vfs_dedupe_file_range_one functions so that dedupe can take advantage of it. Signed-off-by: Darrick J. Wong --- fs/overlayfs/file.c |3 ++- fs/read_write.c |9 ++--- include/linux/fs.h |2 +- 3 files changed, 9

[PATCH 12/25] vfs: pass remap flags to generic_remap_file_range_prep

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Plumb the remap flags through the filesystem from the vfs function dispatcher all the way to the prep function to prepare for behavior changes in subsequent patches. Signed-off-by: Darrick J. Wong --- fs/ocfs2/file.c |2 +- fs/ocfs2/refcounttree.c |6 +++--

[PATCH 14/25] vfs: make remap_file_range functions take and return bytes completed

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Change the remap_file_range functions to take a number of bytes to operate upon and return the number of bytes they operated on. This is a requirement for allowing fs implementations to return short clone/dedupe results to the user, which will enable us to obey resource lim

[PATCH 11/25] vfs: create generic_remap_file_range_touch to update inode metadata

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Create a new VFS helper to handle inode metadata updates when remapping into a file. If the operation can possibly alter the file contents, we must update the ctime and mtime and remove security privileges, just like we do for regular file writes. Wire up ocfs2 to ensure c

[PATCH 08/25] vfs: combine the clone and dedupe into a single remap_file_range

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Combine the clone_file_range and dedupe_file_range operations into a single remap_file_range file operation dispatch since they're fundamentally the same operation. The differences between the two can be made in the prep functions. Signed-off-by: Darrick J. Wong --- Docu

[PATCH 09/25] vfs: rename vfs_clone_file_prep to be more descriptive

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong The vfs_clone_file_prep is a generic function to be called by filesystem implementations only. Rename the prefix to generic_ and make it more clear that it applies to remap operations, not just clones. Signed-off-by: Darrick J. Wong --- fs/ocfs2/refcounttree.c |2 +-

[PATCH 06/25] vfs: strengthen checking of file range inputs to generic_remap_checks

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong File range remapping, if allowed to run past the destination file's EOF, is an optimization on a regular file write. Regular file writes that extend the file length are subject to various constraints which are not checked by range cloning. This is a correctness problem bec

[PATCH 05/25] vfs: check file ranges before cloning files

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Move the file range checks from vfs_clone_file_prep into a separate generic_remap_checks function so that all the checks are collected in a central location. This forms the basis for adding more checks from generic_write_checks that will make cloning's input checking more c

[PATCH 07/25] vfs: skip zero-length dedupe requests

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Don't bother calling the filesystem for a zero-length dedupe request; we can return zero and exit. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/read_write.c |5 + 1 file changed, 5 insertions(+) diff --git a/fs/read_write.c b/fs/read_wri

[PATCH v2 00/25] fs: fixes for serious clone/dedupe problems

2018-10-09 Thread Darrick J. Wong
Hi all, Dave, Eric, and I have been chasing a stale data exposure bug in the XFS reflink implementation, and tracked it down to reflink forgetting to do some of the file-extending activities that must happen for regular writes. We then started auditing the clone, dedupe, and copyfile code and rea

[PATCH 03/25] xfs: zero posteof blocks when cloning above eof

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong When we're reflinking between two files and the destination file range is well beyond the destination file's EOF marker, zero any posteof speculative preallocations in the destination file so that we don't expose stale disk contents. The previous strategy of trying to clear

[PATCH 02/25] xfs: refactor clonerange preparation into a separate helper

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Refactor all the reflink preparation steps into a separate helper that we'll use to land all the upcoming fixes for insufficient input checks. This rework also moves the invalidation of the destination range to the prep function so that it is done before the range is remapp

[PATCH 04/25] xfs: update ctime and remove suid before cloning files

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Before cloning into a file, update the ctime and remove sensitive attributes like suid, just like we'd do for a regular file write. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 25 + 1 file changed, 25 inse

[PATCH 01/25] xfs: add a per-xfs trace_printk macro

2018-10-09 Thread Darrick J. Wong
From: Darrick J. Wong Add a "xfs_tprintk" macro so that developers can use trace_printk to print out arbitrary debugging information with the XFS device name attached to the trace output. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_error.h |5 + 1 file changed, 5 insertions(+) diff

[PATCH v3 2/2] btrfs: Add zstd support to grub btrfs

2018-10-09 Thread Nick Terrell
Adds zstd support to the btrfs module. Tested on Ubuntu-18.04 with a btrfs /boot partition with and without zstd compression. A test case was also added to the test suite that fails before the patch, and passes after. Signed-off-by: Nick Terrell --- v1 -> v2: - Fix comments from Daniel Kiper. v

[PATCH v3 0/2] btrfs: Add zstd support to grub btrfs

2018-10-09 Thread Nick Terrell
Hi all, This patch set imports the upstream zstd library, adds zstd support to the btrfs module, and adds a test case. I've also tested the patch set by storing my boot partition in btrfs with and without zstd compression and rebooting. The first patch imports the files needed to support zstd deco

Re: CoW behavior when writing same content

2018-10-09 Thread Chris Murphy
On Tue, Oct 9, 2018 at 11:25 AM, Andrei Borzenkov wrote: > 09.10.2018 18:52, Chris Murphy wrote: >>> In this case is root/big_file and snapshot/big_file still share the same >>> data? >> >> You'll be left with three files. /big_file and root/big_file will >> share extents, > > How comes they sha

Re: [PATCH v3 4/6] btrfs-progs: lowmem check: Add dev_item check for used bytes and total bytes

2018-10-09 Thread Hans van Kranenburg
On 10/09/2018 03:14 AM, Qu Wenruo wrote: > > > On 2018/10/9 6:20 AM, Hans van Kranenburg wrote: >> On 10/08/2018 02:30 PM, Qu Wenruo wrote: >>> Obviously, used bytes can't be larger than total bytes. >>> >>> Signed-off-by: Qu Wenruo >>> --- >>> check/mode-lowmem.c | 5 + >>> 1 file changed,

Re: [PATCH v2 2/2] btrfs: Add zstd support to grub btrfs

2018-10-09 Thread Nick Terrell
> On Oct 9, 2018, at 12:07 PM, Daniel Kiper wrote: > On Mon, Oct 08, 2018 at 04:06:21PM -0700, Nick Terrell wrote: >> Adds zstd support to the btrfs module. I'm not sure that my changes to the >> Makefiles are correct, please let me know if I need to do something >> differently. >> >> Tested on U

Re: [PATCH v2 1/2] Import upstream zstd-1.3.6

2018-10-09 Thread Daniel Kiper
On Mon, Oct 08, 2018 at 04:06:20PM -0700, Nick Terrell wrote: > Import zstd-1.3.6 from upstream [1]. Only the files needed for decompression > are imported. > > I used the latest zstd release, which includes patches [2] to build cleanly > in GRUB. > > Upstream zstd commit hash: 4fa456d7f12f8b27bd3b2f

Re: [PATCH 7/9] btrfs: Add support for recovery for a RAID 5 btrfs profiles.

2018-10-09 Thread Daniel Kiper
On Thu, Sep 27, 2018 at 08:35:02PM +0200, Goffredo Baroncelli wrote: > From: Goffredo Baroncelli > > Add support for recovery for a RAID 5 btrfs profile. In addition > it is added some code as preparatory work for RAID 6 recovery code. > > Signed-off-by: Goffredo Baroncelli > --- > grub-core/fs/

Re: [PATCH v2 2/2] btrfs: Add zstd support to grub btrfs

2018-10-09 Thread Daniel Kiper
On Mon, Oct 08, 2018 at 04:06:21PM -0700, Nick Terrell wrote: > Adds zstd support to the btrfs module. I'm not sure that my changes to the > Makefiles are correct, please let me know if I need to do something > differently. > > Tested on Ubuntu-18.04 with a btrfs /boot partition with and without zs

Re: [PATCH 9/9] btrfs: Add RAID 6 recovery for a btrfs filesystem.

2018-10-09 Thread Daniel Kiper
On Thu, Sep 27, 2018 at 08:35:04PM +0200, Goffredo Baroncelli wrote: > From: Goffredo Baroncelli > > Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some > disks (up to two) are missing. This code uses the md RAID 6 code already > present in grub. > > Signed-off-by: Goffredo Ba

Re: [PATCH 4/9] btrfs: Avoid a rescan for a device which was already not found.

2018-10-09 Thread Daniel Kiper
On Thu, Sep 27, 2018 at 08:34:59PM +0200, Goffredo Baroncelli wrote: > From: Goffredo Baroncelli > > Change the behavior of find_device(): before the patch, a read of a missed > device may trigger a rescan. However, it is never recorded that a device > is missed, so each single read of a missed de

Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.

2018-10-09 Thread Daniel Kiper
On Thu, Sep 27, 2018 at 08:34:56PM +0200, Goffredo Baroncelli wrote: > From: Goffredo Baroncelli > > Signed-off-by: Goffredo Baroncelli Code LGTM. Though comment begs improvement. I will send you updated comment for approval shortly. Daniel

Re: CoW behavior when writing same content

2018-10-09 Thread Andrei Borzenkov
09.10.2018 18:52, Chris Murphy wrote: > On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois > wrote: >> Hi, >> >> If I have a snapshot where I overwrite a big file but which only a >> small portion of it is different, will the whole file be rewritten in >> the snapshot? Or only the different part of

Next btrfs development cycle open - 4.21

2018-10-09 Thread David Sterba
From: David Sterba Hi, a friendly reminder of the timetable and what's expected at this phase. 4.18 - current 4.19 - upcoming, urgent regression fixes only 4.20 - development closed, pull request in prep, fixes or regressions only 4.21 - development open, until 4.20-rc5 (at least) (https://btr

Re: CoW behavior when writing same content

2018-10-09 Thread Roman Mamedov
On Tue, 9 Oct 2018 09:52:00 -0600 Chris Murphy wrote: > You'll be left with three files. /big_file and root/big_file will > share extents, and snapshot/big_file will have its own extents. You'd > need to copy with --reflink for snapshot/big_file to have shared > extents with /big_file - or dedupl
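
[Editor's illustration] The reply above says a copy has to be made with --reflink for the two files to share extents. The following is a minimal sketch of what such a copy could look like from a program, assuming Linux and a filesystem that supports the FICLONE ioctl (btrfs and reflink-enabled XFS are examples). The function name clone_or_copy is hypothetical and is not taken from the thread; the ioctl number is the value of FICLONE in <linux/fs.h>. When cloning is not supported, the sketch falls back to an ordinary byte copy, which shares no extents.

```python
import fcntl
import shutil

# FICLONE as defined in <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Try to reflink src_path into dst_path (hypothetical helper).

    Returns True if the destination shares extents with the source
    (the ioctl succeeded), False if a plain byte copy was made instead.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to make dst reference src's extents.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Not supported here (wrong filesystem, cross-device, ...):
            # fall back to copying the bytes, which allocates new extents.
            shutil.copyfileobj(src, dst)
            return False
```

Either way the destination ends up with the same contents; only the return value tells the caller whether the data is shared on disk.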

Re: [PATCH] Btrfs: fix wrong dentries after fsync of file that got its parent replaced

2018-10-09 Thread David Sterba
On Tue, Oct 09, 2018 at 03:05:29PM +0100, fdman...@kernel.org wrote: > From: Filipe Manana > > In a scenario like the following: > > mkdir /mnt/A # inode 258 > mkdir /mnt/B # inode 259 > touch /mnt/B/bar # inode 260 > > sync > > mv /mnt/B/bar /mn

Re: [PATCH] Btrfs: fix warning when replaying log after fsync of a tmpfile

2018-10-09 Thread David Sterba
On Tue, Oct 09, 2018 at 06:01:56PM +0200, David Sterba wrote: > > Reported-by: Martin Steigerwald > > Link: https://lore.kernel.org/linux-btrfs/319.NTnn27ZJZE@merkaba/ > > Fixes: 471d557afed1 ("Btrfs: fix loss of prealloc extents past i_size after > > fsync log replay") > > Signed-off-by: Fil

Re: [PATCH] Btrfs: fix warning when replaying log after fsync of a tmpfile

2018-10-09 Thread David Sterba
On Mon, Oct 08, 2018 at 11:12:55AM +0100, fdman...@kernel.org wrote: > From: Filipe Manana > > When replaying a log which contains a tmpfile (which necessarily has a > link count of 0) we end up calling inc_nlink(), at > fs/btrfs/tree-log.c:replay_one_buffer(), which produces a warning like > the

Re: CoW behavior when writing same content

2018-10-09 Thread Chris Murphy
On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois wrote: > Hi, > > If I have a snapshot where I overwrite a big file but which only a > small portion of it is different, will the whole file be rewritten in > the snapshot? Or only the different part of the file? Depends on how the application modi
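
[Editor's illustration] The preview above is cut off, so the following is an assumption about one way an application could modify a file, not a restatement of the reply. On a copy-on-write filesystem, an application that rewrites a file in place and touches only the blocks that changed would be expected to leave the untouched blocks shared with the snapshot, whereas writing a fresh copy of the whole file would not. The helper name rewrite_changed_blocks and the 4096-byte block size are hypothetical choices for this sketch.

```python
def rewrite_changed_blocks(path, new_data, block_size=4096):
    """Overwrite in place only the blocks of `path` that differ from new_data.

    Returns the number of blocks that were actually written.
    """
    written = 0
    with open(path, "r+b") as f:
        for offset in range(0, len(new_data), block_size):
            chunk = new_data[offset:offset + block_size]
            f.seek(offset)
            # Compare against what is already on disk at this offset.
            if f.read(len(chunk)) != chunk:
                f.seek(offset)
                f.write(chunk)
                written += 1
        # Drop any tail left over if the old file was longer.
        f.truncate(len(new_data))
    return written
```

For a three-block file in which only the middle block differs, the function writes a single block and leaves the other two alone.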

Re: [PATCH] btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled

2018-10-09 Thread David Sterba
On Tue, Oct 09, 2018 at 02:36:45PM +0800, Qu Wenruo wrote: > Some qgroup trace events like btrfs_qgroup_release_data() and > btrfs_qgroup_free_delayed_ref() can still be triggered even qgroup is > not enabled. > > This is caused by the lack of qgroup status check before really calling > qgroup fun

CoW behavior when writing same content

2018-10-09 Thread Gervais, Francois
Hi, If I have a snapshot where I overwrite a big file but which only a small portion of it is different, will the whole file be rewritten in the snapshot? Or only the different part of the file? Something like: $ dd if=/dev/urandom of=/big_file bs=1M count=1024 $ cp /big_file root/ $ btrfs sub s

[PATCH] generic: test for file fsync after moving it to a new parent directory

2018-10-09 Thread fdmanana
From: Filipe Manana Test that if we move a file from a directory B to a directory A, replace directory B with directory A, fsync the file and then power fail, after mounting the filesystem the file has a single parent, named B and there is no longer any directory with the name A. This test is mo

[PATCH] Btrfs: fix wrong dentries after fsync of file that got its parent replaced

2018-10-09 Thread fdmanana
From: Filipe Manana In a scenario like the following: mkdir /mnt/A # inode 258 mkdir /mnt/B # inode 259 touch /mnt/B/bar # inode 260 sync mv /mnt/B/bar /mnt/A/bar mv -T /mnt/A /mnt/B fsync /mnt/B/bar After replaying the log we end up wit

[PATCH v2 00/23] various dynamic_debug patches

2018-10-09 Thread Rasmus Villemoes
v2: Added various acks/reviews. I'll follow up with rewriting the x86 part as asm macros once that work is in mainline. Patches 15, 16 are in next-20181009; in hindsight I should probably have asked Rafael not to pick those. Patch 17 textually depends on those, and patch 19 removes the .

[PATCH v2 14/23] btrfs: implement btrfs_debug* in terms of helper macro

2018-10-09 Thread Rasmus Villemoes
First, the btrfs_debug macros open-code (one possible definition of) DYNAMIC_DEBUG_BRANCH, so they don't benefit from the HAVE_JUMP_LABEL optimization. Second, changes on x86-64 later in this series require that all struct _ddebug descriptors in a translation unit use distinct identifiers. Using