raid1 mount as read only

2019-02-18 Thread Stefan K
Hello, I've played a little bit with raid1. My steps were: 1. create a raid1 with btrfs (add device; balance start -mconvert=raid1 -dconvert=raid1 /) 2. after finishing, I shut down the server, removed a device and started it again, 3. it works (I used degraded options in fstab) 4. I shut down the

Re: [PATCH v5.1 12/12] btrfs: Do mandatory tree block check before submitting bio

2019-02-18 Thread Nikolay Borisov
On 18.02.19 at 7:27, Qu Wenruo wrote: > There are at least 2 reports about memory bit flip sneaking into on-disk > data. > > Currently we only have a relaxed check triggered at > btrfs_mark_buffer_dirty() time, as it's not mandatory and only for > CONFIG_BTRFS_FS_CHECK_INTEGRITY enabled buil

Re: [PATCH v5.1 12/12] btrfs: Do mandatory tree block check before submitting bio

2019-02-18 Thread Qu Wenruo
[snip] >> >> Reported-by: Leonard Lausen >> Signed-off-by: Qu Wenruo >> --- >> fs/btrfs/disk-io.c | 10 ++ >> fs/btrfs/tree-checker.c | 24 +--- >> fs/btrfs/tree-checker.h | 8 >> 3 files changed, 39 insertions(+), 3 deletions(-) >> >> diff --git a/fs/b

[PATCH v2] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Johannes Thumshirn
We recently had a customer issue with a corrupted filesystem. When trying to mount this image btrfs panicked with a division by zero in calc_stripe_length(). The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length() takes this value and divides it by the number of copies the RAID prof

Re: [PATCH v2] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Hans van Kranenburg
On 2/18/19 10:48 AM, Johannes Thumshirn wrote: > We recently had a customer issue with a corrupted filesystem. When trying > to mount this image btrfs panicked with a division by zero in > calc_stripe_length(). > > The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length() > takes this

Re: [PATCH v2] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Johannes Thumshirn
On 18/02/2019 10:55, Hans van Kranenburg wrote: > On 2/18/19 10:48 AM, Johannes Thumshirn wrote: >> We recently had a customer issue with a corrupted filesystem. When trying >> to mount this image btrfs panicked with a division by zero in >> calc_stripe_length(). >> >> The corrupt chunk had a 'num_

[PATCH v2.1] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Johannes Thumshirn
We recently had a customer issue with a corrupted filesystem. When trying to mount this image btrfs panicked with a division by zero in calc_stripe_length(). The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length() takes this value and divides it by the number of copies the RAID prof
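The failure mode described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the patch itself: the `sketch_` names and flag values are hypothetical, `-EINVAL` is chosen for portability, and the reading that an integer division of 1 stripe by 2 copies yields a zero divisor is inferred from the truncated description.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative flag values (assumption); the kernel has its own constants. */
#define SKETCH_BG_RAID1 (1ULL << 4)
#define SKETCH_BG_DUP   (1ULL << 5)

/* Reject a chunk whose stripe count cannot match a two-copy profile. */
int sketch_check_num_stripes(uint64_t type, uint16_t num_stripes)
{
	if ((type & (SKETCH_BG_RAID1 | SKETCH_BG_DUP)) && num_stripes != 2)
		return -EINVAL;
	return 0;
}

/*
 * Data stripes = stripes / copies. With 1 stripe and 2 copies the integer
 * division gives 0, which must never be used as a divisor, so the check
 * above has to run before this helper is reached.
 */
uint64_t sketch_stripe_length(uint64_t chunk_len, uint16_t num_stripes,
			      uint16_t ncopies)
{
	uint64_t data_stripes;

	assert(ncopies != 0);
	data_stripes = num_stripes / ncopies;
	assert(data_stripes != 0);
	return chunk_len / data_stripes;
}
```

A caller would run `sketch_check_num_stripes()` while validating the chunk and refuse to continue on a non-zero result, so that `sketch_stripe_length()` is only ever reached with a sane stripe count.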

Re: [GIT PULL] Btrfs fixes for 4.15-rc2

2019-02-18 Thread Alex Lyakas
Hi David, > Btrfs: incremental send, fix wrong unlink path after renaming file > (2017-11-28 17:15:30 +0100) > > > David Sterba (2): > btrfs: add missing device::flush_bio puts Is there a reason that this one should not be t

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-18 Thread Johannes Thumshirn
On 16/02/2019 06:39, Dave Chinner wrote: [..] >> We've supported this since mid 2018 and commit ba23cba9b3bd ("fs: >> allow per-device dax status checking for filesystems"). That is, >> we can have DAX on the XFS RT device indepently of the data device. >> >> That is, you set up pmem in three segm

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
So the takeaway from this is that btrfs send doesn't work this way, I can't copy a file with reflink=always and expect it to be diffed correctly as in a parent-child situation or? On 2019-02-17 04:11, Remi Gauvin wrote: On 2019-02-16 3:08 p.m., Andrei Borzenkov wrote: File size of "real" ser

Re: [PATCH 1/2] btrfs: check for refs on snapshot delete resume

2019-02-18 Thread David Sterba
On Wed, Feb 06, 2019 at 03:46:14PM -0500, Josef Bacik wrote: > There's a bug in snapshot deletion where we won't update the > drop_progress key if we're in the UPDATE_BACKREF stage. This is a > problem because we could drop refs for blocks we know don't belong to > ours. If we crash or umount at

Re: [PATCH 4/4] Btrfs: remove no longer needed range length checks for deduplication

2019-02-18 Thread David Sterba
On Thu, Jan 31, 2019 at 04:31:49PM +, Filipe Manana wrote: > On Wed, Dec 12, 2018 at 6:07 PM wrote: > > > > From: Filipe Manana > > > > Comparing the content of the pages in the range to deduplicate is now done > > by the generic helper generic_remap_file_range_prep(), which takes care of > >

Re: [PATCH 3/4] Btrfs: check if destination root is read-only for deduplication

2019-02-18 Thread David Sterba
On Thu, Jan 31, 2019 at 04:44:39PM +, Hugo Mills wrote: > On Thu, Jan 31, 2019 at 04:39:22PM +, Filipe Manana wrote: > > On Thu, Dec 13, 2018 at 4:08 PM David Sterba wrote: > > > > > > On Wed, Dec 12, 2018 at 06:05:58PM +, fdman...@kernel.org wrote: > > > > From: Filipe Manana > > > >

Re: [PATCH 3/4] Btrfs: check if destination root is read-only for deduplication

2019-02-18 Thread David Sterba
On Wed, Dec 12, 2018 at 06:05:58PM +, fdman...@kernel.org wrote: > From: Filipe Manana > > Checking if the destination root is read-only was being performed only for > clone operations. Make deduplication check it as well, as it does not make > sense to not do it, even if it is an operation t

Re: [GIT PULL] Btrfs fixes for 4.15-rc2

2019-02-18 Thread David Sterba
On Mon, Feb 18, 2019 at 12:37:49PM +0200, Alex Lyakas wrote: > Hi David, > > > Btrfs: incremental send, fix wrong unlink path after renaming file > > (2017-11-28 17:15:30 +0100) > > > > > > David Sterba (2): > > btrfs: add m

Re: [PATCH 3/4] Btrfs: check if destination root is read-only for deduplication

2019-02-18 Thread Filipe Manana
On Mon, Feb 18, 2019 at 3:36 PM David Sterba wrote: > > On Thu, Jan 31, 2019 at 04:44:39PM +, Hugo Mills wrote: > > On Thu, Jan 31, 2019 at 04:39:22PM +, Filipe Manana wrote: > > > On Thu, Dec 13, 2018 at 4:08 PM David Sterba wrote: > > > > > > > > On Wed, Dec 12, 2018 at 06:05:58PM +

[PATCH 1/2] Btrfs: add missing error handling after doing leaf/node binary search

2019-02-18 Thread fdmanana
From: Filipe Manana The function map_private_extent_buffer() can return an -EINVAL error, and it is called by generic_bin_search() which will return back the error. The btrfs_bin_search() function in turn calls generic_bin_search() and the key_search() function calls btrfs_bin_search(), so both c

[PATCH 2/2] Btrfs: report and handle error on unexpected first key on extent buffer

2019-02-18 Thread fdmanana
From: Filipe Manana When there is a kind of corruption in an extent buffer such that its first key does not match the key at the respective parent slot, one of two things happens: 1) When assertions are enabled, we effectively hit a BUG_ON() which requires rebooting the machine later. This al

[PATCH v2 1/2] Btrfs: add missing error handling after doing leaf/node binary search

2019-02-18 Thread fdmanana
From: Filipe Manana The function map_private_extent_buffer() can return an -EINVAL error, and it is called by generic_bin_search() which will return back the error. The btrfs_bin_search() function in turn calls generic_bin_search() and the key_search() function calls btrfs_bin_search(), so both c
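The error-propagation pattern described above can be sketched as follows. This is a generic user-space illustration, not the btrfs code: the `sketch_` functions are hypothetical, and the return convention (0 for found, 1 for not found, negative for error) is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical search over a sorted array.
 * Returns 0 if found (*slot = index), 1 if not found (*slot = insert
 * position), or a negative errno value if the input is unusable.
 */
int sketch_bin_search(const int *keys, int nr, int key, int *slot)
{
	int low = 0, high = nr;

	assert(slot != NULL);
	if (!keys || nr < 0)
		return -EINVAL;

	while (low < high) {
		int mid = low + (high - low) / 2;

		if (keys[mid] < key) {
			low = mid + 1;
		} else if (keys[mid] > key) {
			high = mid;
		} else {
			*slot = mid;
			return 0;
		}
	}
	*slot = low;
	return 1;
}

/* Returns 1 if present, 0 if absent, negative errno on error. */
int sketch_key_present(const int *keys, int nr, int key)
{
	int slot;
	int ret = sketch_bin_search(keys, nr, key, &slot);

	if (ret < 0)
		return ret;	/* propagate; do not treat as "not found" */
	return ret == 0;
}
```

The point of the sketch is the `ret < 0` branch: without it, an error from the search would be indistinguishable from a legitimate "not found" result and `slot` would be used uninitialized.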

Re: [PATCH] Btrfs: fix file corruption after snapshotting

2019-02-18 Thread David Sterba
On Mon, Feb 04, 2019 at 02:28:10PM +, fdman...@kernel.org wrote: > From: Filipe Manana > > When we are mixing buffered writes with direct IO writes against the same > file and snapshotting is happening concurrently, we can end up with a > corrupt file content in the snapshot. Example: The pa

Re: [PATCH v2 1/2] Btrfs: add missing error handling after doing leaf/node binary search

2019-02-18 Thread Nikolay Borisov
On 18.02.19 at 19:07, fdman...@kernel.org wrote: > From: Filipe Manana > > The function map_private_extent_buffer() can return an -EINVAL error, and > it is called by generic_bin_search() which will return back the error. The > btrfs_bin_search() function in turn calls generic_bin_search()

Re: [PATCH v2 1/2] Btrfs: add missing error handling after doing leaf/node binary search

2019-02-18 Thread Filipe Manana
On Mon, Feb 18, 2019 at 5:11 PM Nikolay Borisov wrote: > > > > On 18.02.19 at 19:07, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > The function map_private_extent_buffer() can return an -EINVAL error, and > > it is called by generic_bin_search() which will return back the error. T

Re: [PATCH] Btrfs: fix file corruption after snapshotting

2019-02-18 Thread Filipe Manana
On Mon, Feb 18, 2019 at 5:09 PM David Sterba wrote: > > On Mon, Feb 04, 2019 at 02:28:10PM +, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > When we are mixing buffered writes with direct IO writes against the same > > file and snapshotting is happening concurrently, we can end up

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Sat, Feb 16, 2019 at 1:26 PM Andrei Borzenkov wrote: > > 15.02.2019 22:11, Chris Murphy wrote: > > The proven way it works, is as I've described, and many emails over a > > decade on this list, inferred from the man page, and the step by step > > recipe on the Wiki. > > > > Oh well. OK, you'r

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 6:05 AM André Malm wrote: > > So the takeaway from this is that btrfs send doesn't work this way, I > can't copy a file with reflink=always and expect it to be diffed > correctly as in a parent-child situation or? Sorry about the confusion, André. What you're doing isn't d

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-18 Thread Dan Williams
On Mon, Feb 18, 2019 at 2:50 AM Johannes Thumshirn wrote: > > On 16/02/2019 06:39, Dave Chinner wrote: > [..] > > >> We've supported this since mid 2018 and commit ba23cba9b3bd ("fs: > >> allow per-device dax status checking for filesystems"). That is, > >> we can have DAX on the XFS RT device ind

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
What causes the extent to be incomplete? And can I avoid it? I tried with my example but with wget with a larger file (1G) and the diff produced was only 60MB which indicates that it "mostly" works. Basically what I'm trying to achieve is having a "reference" / "master" btrfs subvolume where I

Re: Corrupted filesystem, looking for guidance

2019-02-18 Thread Sébastien Luttringer
On Tue, 2019-02-12 at 15:40 -0700, Chris Murphy wrote: > On Mon, Feb 11, 2019 at 8:16 PM Sébastien Luttringer wrote: > > FYI: This only does full stripe reads, recomputes parity and overwrites the > parity strip. It assumes the data strips are correct, so long as the > underlying member devices d

Re: raid1 mount as read only

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 2:01 AM Stefan K wrote: > > Hello, > > I've played a little bit with raid1: > my steps was: > 1. create a raid1 with btrfs (add device; balance start -mconvert=raid1 > -dconvert=raid1 /) > 2. after finishing, i shutdown the server and remove a device and start it > again

Re: Corrupted filesystem, looking for guidance

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 1:14 PM Sébastien Luttringer wrote: > > On Tue, 2019-02-12 at 15:40 -0700, Chris Murphy wrote: > > On Mon, Feb 11, 2019 at 8:16 PM Sébastien Luttringer > > wrote: > > > > FYI: This only does full stripe reads, recomputes parity and overwrites the > > parity strip. It assu

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Graham Cobb
On 18/02/2019 19:58, André Malm wrote: > What causes the extent to be incomplete? And can I avoid it? Does it matter? I presume the send is working OK, it is just that it sends a little more data than it needs to. Or have you seen any data loss? Graham

raid 1 filesystem corruption

2019-02-18 Thread Rudolf Kastl
Any hints on how to recover is greatly appreciated. uname -a: Linux localhost-live 4.18.16-300.fc29.x86_64 #1 SMP Sat Oct 20 23:24:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux btrfs --version: btrfs-progs v4.17.1 dmesg: [ 46.918149] BTRFS info (device sda2): disk space caching is enabled [ 46.

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
r, read-write. And once it's in the "master state" you want, just snapshot it: btrfs sub snap -r A A.20190218-master And now continue to make changes to A subvolume, and on whatever schedule you want: btrfs sub snap -r A A.20190218-1412 btrfs sub snap -r A A.20190218-1850 btrfs

Re: raid 1 filesystem corruption

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 2:19 PM Rudolf Kastl wrote: > > Any hints on how to recover is greatly appreciated. > > uname -a: > Linux localhost-live 4.18.16-300.fc29.x86_64 #1 SMP Sat Oct 20 > 23:24:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux > > btrfs --version: > btrfs-progs v4.17.1 > > dmesg: > [

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
ending the diffs of the child subvolumes. Maybe this is a bit optimistic. I'm not sure what you get out of this method that depends on reflink rather than just making read only snapshots. Why don't you create subvolume A as the master, read-write. And once it's in the "maste

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-18 Thread Dave Chinner
On Sat, Feb 16, 2019 at 06:57:45PM -0700, Andreas Dilger wrote: > While it may be a bit of a stretch to call this "forensic evidence", making We do forensic analysis of corrupt filesystems looking for evidence of what went wrong, not just looking for evidence of what happened on systems that have

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
g a difference between master and child? Yes I think that might work but I haven't tried it. But of course, master must already be on the destination. btrfs sub create master ##populate the master btrfs sub create childofmaster cp --reflink master/bunchoffiles childofmaster/ btrfs sub

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-18 Thread Ric Wheeler
On 2/17/19 9:22 PM, Dave Chinner wrote: On Sun, Feb 17, 2019 at 06:42:59PM -0500, Ric Wheeler wrote: On 2/17/19 4:09 PM, Dave Chinner wrote: On Sun, Feb 17, 2019 at 03:36:10PM -0500, Ric Wheeler wrote: One proposal for btrfs was that we should look at getting discard out of the synchronous pat

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
the destination. btrfs sub create master ##populate the master btrfs sub create childofmaster cp --reflink master/bunchoffiles childofmaster/ btrfs sub snap -r master master.20190218-initial btrfs sub snap -r childofmaster childofmaster.20190218-initial btrfs send master.20190218-initial | btrfs

Re: raid 1 filesystem corruption

2019-02-18 Thread Chris Murphy
> On Mon, 18 Feb 2019 at 23:35, Rudolf Kastl wrote: > > > > I tried a btrfs check --repair on sda2: OK I think this is a risky decision, hopefully it works out though. a. There's corruption and IO error we don't understand the source of yet. Repairs can make problems much worse if the sou

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 3:58 PM André Malm wrote: > > Ok, but I don't want to keep old snapshots of the child volumes. Only > the latest and then diffing it in regards to the master. Would that be > possible? In order to do an incremental send/receive you need to have the -p snapshot on both sour

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
I assume I would have to use rsync (with --inplace possibly) to keep the master volume in sync between machines? Say for example I have a (large) file on master, on machine A, I cp reflink it to a child subvolume. I then send -p child subvolume to remote machine B (which already has the maste

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 4:58 PM André Malm wrote: > > I assume i would have to use rsync (with --inplace possibly) to keep the > master volume in sync between machines? Why? You previously said you didn't want to do that: " if I change / remove, say 10 GB worth of data from the master subvolume

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 5:16 PM Chris Murphy wrote: > rsync'd subvolumes across volumes aren't consistent identical by btrfs s/consistent/considered -- Chris Murphy

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread André Malm
Rsync is probably a bad idea, yes. I could btrfs send -p the changed "new" master subvolume and then delete the old master subvolume and then reference the new master subvolume when transferring it later on, I guess? I'll explain the problem I'm trying to solve a bit better; Say I have a program

Re: [PATCH 2/2] Btrfs: report and handle error on unexpected first key on extent buffer

2019-02-18 Thread Qu Wenruo
On 2019/2/19 12:58 AM, fdman...@kernel.org wrote: > From: Filipe Manana > > When there is a kind of corruption in an extent buffer such that its first > key does not match the key at the respective parent slot, one of two things > happens: Isn't that handled by read_tree_block() already? Thank

Re: [PATCH 1/2] fstests: generic, test fsync after succession of file renames

2019-02-18 Thread Chao Yu
On 2019/2/13 2:08, fdman...@kernel.org wrote: > From: Filipe Manana > > Test that after a combination of file renames, linking and creating a new > file with the old name of a renamed file, if we fsync the new file, after > a power failure we are able to mount the filesystem and all file names >

[PATCH] fs/btrfs: init csum_list before possible free

2019-02-18 Thread Dan Robertson
The scrub_ctx csum_list member must be initialized before scrub_free_ctx is called. If the csum_list is not initialized beforehand, the list_empty call in scrub_free_csums will result in a null deref. Signed-off-by: Dan Robertson --- fs/btrfs/scrub.c | 2 +- 1 file changed, 1 insertion(+), 1 del
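The ordering issue described above can be sketched in user-space C. This is a minimal illustration under assumptions, not the scrub code: the `sketch_` list type and functions are hypothetical stand-ins for a circular doubly linked list. A zero-filled list head has `next == NULL`, so an emptiness test of the form `head->next == head` reports "not empty" and the free path then dereferences NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list head (hypothetical stand-in). */
struct sketch_list {
	struct sketch_list *next, *prev;
};

static void sketch_list_init(struct sketch_list *h)
{
	h->next = h;
	h->prev = h;
}

static int sketch_list_empty(const struct sketch_list *h)
{
	return h->next == h;
}

struct sketch_ctx {
	struct sketch_list csum_list;
	void *buf;
};

/* Free path: walks the list, so the head must already be initialized. */
void sketch_free_ctx(struct sketch_ctx *ctx)
{
	if (!ctx)
		return;
	assert(ctx->csum_list.next != NULL);
	while (!sketch_list_empty(&ctx->csum_list)) {
		struct sketch_list *e = ctx->csum_list.next;

		e->next->prev = e->prev;
		e->prev->next = e->next;
		free(e);
	}
	free(ctx->buf);
	free(ctx);
}

/* Setup path: initialize the list before any branch that can bail out. */
struct sketch_ctx *sketch_setup_ctx(int fail_early)
{
	struct sketch_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	sketch_list_init(&ctx->csum_list);	/* before the first error path */

	if (fail_early)
		goto nomem;
	ctx->buf = malloc(16);
	if (!ctx->buf)
		goto nomem;
	return ctx;

nomem:
	sketch_free_ctx(ctx);
	return NULL;
}
```

Moving `sketch_list_init()` below the first `goto nomem` would reproduce the problem: the free path would run against a zero-filled head.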

Re: Btrfs send with parent different size depending on source of files.

2019-02-18 Thread Chris Murphy
On Mon, Feb 18, 2019 at 5:28 PM André Malm wrote: > > Rsync is probably i bad idea yes. I could btrfs send -p the changed > "new" master subvolume and then delete the old master subvolume and then > reference the new master subvolume when transferring it later on i guess? I'm not sure how your ap

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-18 Thread Dave Chinner
On Mon, Feb 18, 2019 at 06:15:34PM -0800, Jane Chu wrote: > On 2/15/2019 9:39 PM, Dave Chinner wrote: > > >On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > >>On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > >>>(This is a joint proposal with Hannes Reinecke) > >>

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-18 Thread Matthew Wilcox
On Sun, Feb 17, 2019 at 12:40:09PM -0800, Andy Lutomirski wrote: > So I'm highly in favor of this patch. If XFS wants to disallow > writing the birth time, fine, but I think that behavior should be > overridable. Please, no. We need to have consistent behaviour between at least Linux local files

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-18 Thread Dave Chinner
On Mon, Feb 18, 2019 at 08:04:47PM -0800, Matthew Wilcox wrote: > On Sun, Feb 17, 2019 at 12:40:09PM -0800, Andy Lutomirski wrote: > > So I'm highly in favor of this patch. If XFS wants to disallow > > writing the birth time, fine, but I think that behavior should be > > overridable. > > Please,

Re: [PATCH] fs/btrfs: init csum_list before possible free

2019-02-18 Thread Nikolay Borisov
On 19.02.19 at 4:56, Dan Robertson wrote: > The scrub_ctx csum_list member must be initialized before > scrub_free_ctx is called. If the csum_list is not initialized > beforehand, the list_empty call in scrub_free_csums will result > in a null deref. > > Signed-off-by: Dan Robertson Review

Re: [PATCH v2.1] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Peter Becker
I would prefer "< 2" .. in both cases (RAID1 and DUP) because this allow us to implement N-Copy RAID1 in the future. diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 03f223aa7194..9024eee889b9 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6791,10 +6791,10 @@ static int btr

Re: [PATCH v2.1] btrfs: ensure that a DUP or RAID1 block group has exactly two stripes

2019-02-18 Thread Nikolay Borisov
On 19.02.19 at 9:33, Peter Becker wrote: > I would prefer "< 2" .. in both cases (RAID1 and DUP) because this > allow us to implement N-Copy RAID1 in the future. NAK When/if N-Copy raid1 patches land then they can modify the code and it will be visible from the commit message why it's < 2.