On 2019/2/12 3:55 PM, Remi Gauvin wrote:
> On 2019-02-12 2:47 a.m., Qu Wenruo wrote:
>>
>>
>> Consider this use case:
>>
>> One btrfs with 2 devices, RAID1 for data and metadata.
>>
>> One day devid 2 got failure, and before replacement arrives, user can
>> only use devid 1 alone. (Maybe that's th
On 2019-02-12 2:47 a.m., Qu Wenruo wrote:
>
>
> Consider this use case:
>
> One btrfs with 2 devices, RAID1 for data and metadata.
>
> One day devid 2 got failure, and before replacement arrives, user can
> only use devid 1 alone. (Maybe that's the root fs).
>
> Then new disk arrived, user repl
On 2019/2/12 3:43 PM, Remi Gauvin wrote:
> On 2019-02-12 2:22 a.m., Qu Wenruo wrote:
>
>>> Does this mean you would rely on scrub/CSUM to repair the missing data
>>> if device is restored?
>>
>> Yes, just as btrfs usually does.
>>
>
> I don't really understand the implications of the problems wi
On 2019-02-12 2:22 a.m., Qu Wenruo wrote:
>> Does this mean you would rely on scrub/CSUM to repair the missing data
>> if device is restored?
>
> Yes, just as btrfs usually does.
>
I don't really understand the implications of the problems with mounting
fs when single/dup data chunk are allocat
On 2019/2/12 3:20 PM, Remi Gauvin wrote:
> On 2019-02-12 2:03 a.m., Qu Wenruo wrote:
>
>> So we only need to consider missing devices as writable, and calculate
>> our chunk allocation profile with missing devices too.
>>
>> Then every thing should work as expected, without annoying SINGLE/DUP
>>
On 2019-02-12 2:03 a.m., Qu Wenruo wrote:
> So we only need to consider missing devices as writable, and calculate
> our chunk allocation profile with missing devices too.
>
> Then every thing should work as expected, without annoying SINGLE/DUP
> chunks blocking later degraded mount.
>
>
Does
[PROBLEM]
The following script can easily create unnecessary SINGLE or DUP chunks:
#!/bin/bash
dev1="/dev/test/scratch1"
dev2="/dev/test/scratch2"
dev3="/dev/test/scratch3"
mnt="/mnt/btrfs"
umount $dev1 $dev2 $dev3 $mnt &> /dev/null
mkfs.btrfs -f $dev1 $dev2 -d raid1 -m raid1
mo
On 2019/2/12 2:22 PM, Steve Leung wrote:
>
>
> - Original Message -
>> From: "Qu Wenruo"
>> To: "STEVE LEUNG" , linux-btrfs@vger.kernel.org
>> Sent: Sunday, February 10, 2019 6:52:23 AM
>> Subject: Re: corruption with multi-device btrfs + single bcache, won't mount
>
>> - Original
- Original Message -
> From: "Qu Wenruo"
> To: "STEVE LEUNG" , linux-btrfs@vger.kernel.org
> Sent: Sunday, February 10, 2019 6:52:23 AM
> Subject: Re: corruption with multi-device btrfs + single bcache, won't mount
> - Original Message -
> From: "Qu Wenruo"
> On 2019/2/10 下午2:
On 2019/1/11 1:01 PM, Qu Wenruo wrote:
[snip]
> +# FS QA Test 179
> +#
> +# Test if btrfs will lockup at subvolume deletion when qgroups are enabled.
> +#
> +# This bug is going to be fixed by a patch for the kernel titled
> +# "btrfs: qgroup: Don't trigger backref walk at delayed ref insert time"
On 2019/1/17 12:00 AM, Josef Bacik wrote:
> qgroups will do the old roots lookup at delayed ref time, which could be
> while walking down the extent root while running a delayed ref. This
> should be fine, except we specifically lock eb's in the backref walking
> code irrespective of path->skip_l
Hello,
The context is a BTRFS filesystem on top of an md device (raid5 on 6 disks).
The system runs Arch Linux, and the kernel was a vanilla 4.20.2.
# btrfs fi us /home
Overall:
Device size: 27.29TiB
Device allocated: 5.01TiB
Device unallocated: 22.
Still reproducible on 4.20.7.
The behavior is slightly different on current kernels (4.20.7, 4.14.96)
which makes the problem a bit more difficult to detect.
# repro-hole-corruption-test
i: 91, status: 0, bytes_deduped: 131072
i: 92, status: 0, bytes_deduped: 131072
Keep the lines extended up to 80 characters where possible, and indent the
arguments.
Signed-off-by: Anand Jain
---
v3: changelog added.
fs/btrfs/props.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index 77a03076b18e..3c15
v3: Merge patch 2/5 and 3/5 as in v1.
Patch 1/5 of v1 is not included, as it's already integrated in misc-next.
While adding the readmirror property I found a few things that could be
cleaned up. As these aren't part of the upcoming readmirror property,
I am sending them separately.
Anand Jain (3):
btrfs: k
Drop the forward declarations of the functions
prop_compression_validate(), prop_compression_apply() and
prop_compression_extract() by moving prop_handlers[], btrfs_props_init(),
prop_compression_validate(), prop_compression_apply() and
prop_compression_extract() appropriately within the file. No funct
btrfs_set_prop() is a redirect to __btrfs_set_prop() with the
transaction handle equal to NULL, and __btrfs_set_prop() in turn passes
trans directly to do_setxattr(), which creates a transaction when trans is NULL.
Instead rename __btrfs_set_prop() to btrfs_set_prop(), and update the
caller with NULL
On Mon, Feb 11, 2019 at 5:17 AM Austin S. Hemmelgarn
wrote:
>
> Last I knew, it was systemd itself doing the pause, because we provide
> no real device for udev to wait on appearing.
Well there's more than one thing responsible for the net behavior. The
most central thing waiting is the kernel. A
On Mon, Feb 11, 2019 at 05:36:13PM +0100, David Sterba wrote:
> > I have re-written the code though to make it cleaner and
> > to silence the static checkers.
>
> Maybe there's something new the static checker needs to learn.
Gar. Yes. You're right. I hadn't thought about that read locks could
On Fri, Feb 08, 2019 at 09:10:41AM +0200, Nikolay Borisov wrote:
>
>
> On 8.02.19 9:02, Anand Jain wrote:
> > In preparation to drop forward declaration of the functions,
> > prop_compression_validate(), prop_compression_apply() and
> > prop_compression_extract(). Move prop_handlers[], btrf
On Fri, Feb 08, 2019 at 03:39:37PM +0800, Anand Jain wrote:
> We have killed volume mutex (commit: dccdb07bc996
> btrfs: kill btrfs_fs_info::volume_mutex). This trivial one seems to have
> escaped.
>
> Signed-off-by: Anand Jain
> ---
> v2: Delete the wrong comment instead of fixing it.
This pat
We should drop the lock on this error path. This is from static
analysis and I don't know if it's possible to hit this error path in
real life.
Signed-off-by: Dan Carpenter
---
fs/btrfs/dev-replace.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-repla
On Mon, Feb 11, 2019 at 05:36:13PM +0100, David Sterba wrote:
> On Sat, Feb 09, 2019 at 12:02:55PM +0300, Dan Carpenter wrote:
> > Back in the day, before commit 0b246afa62b0 ("btrfs: root->fs_info
> > cleanup, add fs_info convenience variables") then we used to take
> > different locks.
>
> Nope,
On Sat, Feb 09, 2019 at 12:02:55PM +0300, Dan Carpenter wrote:
> Back in the day, before commit 0b246afa62b0 ("btrfs: root->fs_info
> cleanup, add fs_info convenience variables") then we used to take
> different locks.
Nope, it's the same per-filesystem lock, just the old code got there
in two dif
[snip]
>>> Looking at the dev
>>> docs and the description for 'offset' field in btrfs_file_extent_item I
>>> can sort of deduce that this field will only be different than null if
>>> this reference is for an extent which is shared between 2 snapshots.
>>
>> Don't forget reflink and data CoW.
>>
>
On 11.02.19 15:23, Qu Wenruo wrote:
>
>
> On 2019/2/11 8:55 PM, Nikolay Borisov wrote:
>>
>>
>> On 11.02.19 7:16, Qu Wenruo wrote:
>>> Current delayed ref interface has several problems:
>>> - Longer and longer parameter lists
>>> bytenr
>>> num_bytes
>>> parent
>>>
On 2019/2/11 8:55 PM, Nikolay Borisov wrote:
>
>
> On 11.02.19 7:16, Qu Wenruo wrote:
>> Current delayed ref interface has several problems:
>> - Longer and longer parameter lists
>> bytenr
>> num_bytes
>> parent
>> -- so far so good
>> ref_root
>> owner
>> offset
>>
On 11.02.19 7:16, Qu Wenruo wrote:
> Similar to btrfs_inc_extent_ref(), just use btrfs_ref to replace the
> long parameter list and the confusing @owner parameter.
>
> Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
> ---
> fs/btrfs/ctree.h | 5 +---
> fs/btrfs/extent-t
On 11.02.19 7:16, Qu Wenruo wrote:
> Now we don't need to play the dirty game of reusing @owner for tree block
> level.
>
> Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
> ---
> fs/btrfs/ctree.h | 5 ++--
> fs/btrfs/extent-tree.c | 57 -
On 11.02.19 7:16, Qu Wenruo wrote:
> It's a perfect match for btrfs_ref_tree_mod() to use btrfs_ref, as
> btrfs_ref describes a metadata/data reference update comprehensively.
>
> Now we have one less function using the confusing owner/level trick.
>
> Signed-off-by: Qu Wenruo
Reviewed-by: Ni
On 11.02.19 7:16, Qu Wenruo wrote:
> Just like btrfs_add_delayed_tree_ref(), use btrfs_ref to refactor
> btrfs_add_delayed_data_ref().
>
> Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
> ---
> fs/btrfs/delayed-ref.c | 20 ++--
> fs/btrfs/delayed-ref.h | 7 ++
On 11.02.19 7:16, Qu Wenruo wrote:
> btrfs_add_delayed_tree_ref() has a longer and longer parameter list, and
> some callers like btrfs_inc_extent_ref() are using @owner as level for
> delayed tree ref.
>
> Instead of making the parameter list longer and longer, use btrfs_ref to
> refactor
On 11.02.19 7:16, Qu Wenruo wrote:
> Current delayed ref interface has several problems:
> - Longer and longer parameter lists
> bytenr
> num_bytes
> parent
> -- so far so good
> ref_root
> owner
> offset
> -- I don't feel good now
>
> - Different interpret
On 2019-02-10 13:34, Chris Murphy wrote:
On Sat, Feb 9, 2019 at 5:13 AM waxhead wrote:
Understood, but that is not quite what I meant - let me rephrase...
If BTRFS still can't mount, why would it blindly accept a previously
non-existing disk to take part of the pool?!
It doesn't do it blindl
On 2/7/19 7:04 PM, Stefan K wrote:
Thanks, with degraded as kernel parameter and also in the fstab it works as
expected
That should be the normal behaviour,
IMO in the long term it will be. But before that we have a few items to
fix around this, such as the serviceability part.
-Anan
Use the refcount_t for fs_info::scrub_workers_refcnt instead of int.
Signed-off-by: Anand Jain
---
v5: Fix refcount validation warning.
Use refcount_set() instead of refcount_inc() when count is 0.
v4: born
fs/btrfs/ctree.h | 2 +-
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/scrub.c | 10 +
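The v5 note above (set the count on the first reference instead of incrementing it from 0) can be illustrated with a small userspace sketch. This is not the patch itself: struct scrub_pool, pool_get() and pool_put() are hypothetical names, a plain unsigned int stands in for refcount_t, and the caller is assumed to hold whatever lock serializes get/put. The motivation, as I understand it, is that the kernel's refcount_inc() warns when asked to increment from 0, because that pattern normally indicates a use-after-free.

```c
#include <assert.h>

/* Hypothetical stand-in for the scrub worker state; not a kernel type. */
struct scrub_pool {
	unsigned int refcnt;	/* plays the role of the refcount_t field */
	int workers_alive;	/* 1 while the (imaginary) workers exist */
};

/* Take a reference; the first user creates the workers. */
static void pool_get(struct scrub_pool *p)
{
	if (p->refcnt == 0) {
		/* First user: bring the workers up and *set* the count to 1. */
		p->workers_alive = 1;
		p->refcnt = 1;
	} else {
		/* Later users only take an extra reference. */
		p->refcnt++;
	}
}

/* Drop a reference; the last user tears the workers down. */
static void pool_put(struct scrub_pool *p)
{
	assert(p->refcnt > 0);
	if (--p->refcnt == 0)
		p->workers_alive = 0;
}
```

In kernel terms the two branches of pool_get() would correspond to refcount_set(&cnt, 1) and refcount_inc(&cnt) respectively.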
From: Jeff Mahoney
The pending chunks list contains chunks that are allocated in the
current transaction but haven't been created yet. The pinned chunks
list contains chunks that are being released in the current transaction.
Both describe chunks that are not reflected on disk as in use but are
u
From: Jeff Mahoney
We currently overload the pending_chunks list to handle updating
btrfs_device->commit_bytes used. We don't actually care about
the extent mapping or even the device mapping for the chunk - we
just need the device, and we can end up processing it multiple
times. The fs_devices
This function is very similar to find_first_extent_bit except that it
locates the first contiguous span of space which does not have bits set.
Its intended use is in the freespace trimming code.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/extent_io.c | 73 +++
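The search described above (find the first contiguous span with no bits set) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel function: struct set_range and first_clear_span() are made-up names, and a sorted array of non-overlapping inclusive ranges stands in for the kernel's extent state tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one "bits set" range; [start, end] inclusive. */
struct set_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Find the first contiguous clear span at or after @from.
 * @ranges must be sorted by start and must not overlap.
 * Returns 0 and fills [*clear_start, *clear_end] (inclusive),
 * or -1 if everything from @from up to UINT64_MAX is set.
 */
static int first_clear_span(const struct set_range *ranges, size_t nr,
			    uint64_t from, uint64_t *clear_start,
			    uint64_t *clear_end)
{
	uint64_t cur = from;
	size_t i;

	assert(clear_start && clear_end);

	for (i = 0; i < nr; i++) {
		if (ranges[i].end < cur)
			continue;	/* range lies entirely before the cursor */
		if (ranges[i].start > cur) {
			/* Gap between the cursor and the next set range. */
			*clear_start = cur;
			*clear_end = ranges[i].start - 1;
			return 0;
		}
		/* Cursor is inside a set range: skip past its end. */
		if (ranges[i].end == UINT64_MAX)
			return -1;
		cur = ranges[i].end + 1;
	}

	/* No set range at or after the cursor: the rest is clear. */
	*clear_start = cur;
	*clear_end = UINT64_MAX;
	return 0;
}
```

With set ranges [0, 9] and [20, 29], a search from 0 yields the clear span [10, 19], and a search from 25 yields [30, UINT64_MAX].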
During device shrink, pinned/pending chunks (i.e. those which have been
deleted/created, respectively, in the current transaction and haven't
touched disk) need to be accounted for. Presently
this happens after the main relocation loop in btrfs_shrink_device,
which could lead to m
Currently unallocated chunks are always trimmed. For example
2 consecutive trims on large storage would trim freespace twice
irrespective of whether the space was actually allocated or not between
those trims.
Optimise this behavior by exploiting the newly introduced alloc_state
tree of btrfs_devi
Instead of always calling the allocator to search for a free extent
that satisfies the input criteria, switch btrfs_trim_free_extents to
using find_first_clear_extent_bit. With this change it's no longer
necessary to read the device tree in order to figure out holes in
the devices.
Now the code a
Now that those functions no longer require a transaction handle to
inspect pending/pinned chunks, the argument can be removed. At the same
time, also remove any surrounding code which acquired the handle.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/extent-tree.c | 36 +++-
Rather than hijacking the existing defines let's just define new bits,
with more descriptive names. Instead of using yet more (currently at 18)
bits for the new flags, use the fact those flags will be specific to
the device allocation tree so define them using existing EXTENT_* flags.
Signed-off-b
Up until now, trimming the freespace was done irrespective of what the
arguments of the FITRIM ioctl were. For example, fstrim's -o/-l arguments
were entirely ignored. Fix it by correctly handling those parameters.
This requires breaking if the found freespace extent is after the end
of the passed
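Honouring an offset/length pair, as described above, amounts to clipping each free extent to the requested window. A minimal sketch follows, assuming half-open byte ranges; clip_to_trim_range() is a hypothetical helper for illustration and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: clip the free extent [*start, *start + *len) to the
 * requested trim window [range_start, range_start + range_len).
 * Returns false if no part of the extent lies inside the window.
 */
static bool clip_to_trim_range(uint64_t range_start, uint64_t range_len,
			       uint64_t *start, uint64_t *len)
{
	uint64_t range_end = range_start + range_len;	/* exclusive */
	uint64_t end;

	assert(start && len);
	end = *start + *len;				/* exclusive */

	/* A huge length (e.g. "trim everything") may wrap; saturate instead. */
	if (range_end < range_start)
		range_end = UINT64_MAX;

	if (*start >= range_end || end <= range_start)
		return false;

	if (*start < range_start)
		*start = range_start;
	if (end > range_end)
		end = range_end;

	*len = end - *start;
	return true;
}
```

A caller walking free extents in ascending order could stop iterating as soon as an extent starts at or beyond the end of the window, since no later extent can overlap it.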
This is used in more than one place, so let's factor it out into ctree.h.
No functional changes.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/ctree.h | 2 ++
fs/btrfs/extent-tree.c | 1 -
fs/btrfs/volumes.c | 1 -
3 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/ctr
This function is going to be used to clear out the device extent
allocation information. Give it a more generic name and export it. This
is in preparation to replacing the pending/pinned chunk lists with an
extent tree. No functional changes.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/extent_io
Chunks read from disk currently don't get their ->orig_block_len member
set, in contrast when a new chunk is allocated, the respective
extent_map's ->orig_block_len is assigned the size of the stripe of this
chunk. Let's apply the same strategy for chunks which are read from
disk, not only does thi
Here is the second version of the FITRIM patchset. For background information
consult the previous [0] post. Changes since v1:
* Dropped some cleanup patches as they have been merged in the meantime.
* In Patch 2 switched list iteration to list_for_each_entry_safe in
btrfs_cleanup_one_tra
Hi!
I've hit a BUG ON during btrfs check:
-
server:~# btrfs check --progress --repair /dev/sde
enabling repair mode
Opening filesystem to check...
Checking filesystem on /dev/sde
UUID: d5fa971b-6546-424d-87c1-dcd688eacdac
[1/7] checking root items
On 2/9/19 1:02 AM, David Sterba wrote:
On Wed, Jan 30, 2019 at 02:44:59PM +0800, Anand Jain wrote:
Fixes the circular locking dependency warning as in patch 1/3,
and patch 2/3 adds lockdep_assert_held() to scrub_workers_get().
Patch 3/3 converts scrub_workers_refcnt into refcount_t.
Anand Ja