On 04/24/2018 02:50 PM, Qu Wenruo wrote:
On 2018-04-24 14:43, Su Yue wrote:
On 04/24/2018 02:17 PM, Qu Wenruo wrote:
On 2018-04-24 13:52, Su Yue wrote:
For an extent item which contains many tree block backrefs, like
=
In 020-extent-ref-cases/keyed_block_ref.img
item 10 key (29470720 METADATA_ITEM 0) itemoff 3450 itemsize 222
refs 23 gen 10 flags TREE_BLOCK
Hi Chris,
On Tue, Apr 24, 2018 at 10:07 PM, Chris Murphy wrote:
> I don't have answer to your question, but I'm curious exactly how you
> simulate a crash? For my own really rudimentary testing I've been doing
> crazy things like:
>
> # grub-mkconfig -o /boot/efi && echo
Hi Chris,
We are using software we developed called CrashMonkey [1]. It
simulates the state on storage after a crash (taking into account
FLUSH and FUA flags). Talk slides on how it works can be found here
[2].
It is similar to dm-log-writes if you have used that in the past.
[1]
On Tue, Apr 24, 2018 at 8:35 PM, Jayashree Mohan
wrote:
> Hi,
>
> While investigating crash consistency bugs on btrfs, we came across
> workloads that demonstrate inconsistent behavior of fsync.
>
> Consider the following workload where fsync on the directory did not
Hi,
While investigating crash consistency bugs on btrfs, we came across
workloads that demonstrate inconsistent behavior of fsync.
Consider the following workload where fsync on the directory did not persist it.
Workload 1:
mkdir A
Sync
rename (A, B)
creat B/foo
fsync B/foo
fsync B
---crash---
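The steps of Workload 1 can be replayed with plain system calls. What follows is only a minimal sketch and not CrashMonkey itself: the function name `run_workload_1` and the scratch directory are assumptions made for illustration. It stops at the point where the crash would be injected, so it shows the sequence of calls, not the post-crash state.

```python
import os
import tempfile


def run_workload_1(root):
    """Replay Workload 1 under `root`, stopping where the crash would occur."""
    a = os.path.join(root, "A")
    b = os.path.join(root, "B")

    os.mkdir(a)          # mkdir A
    os.sync()            # sync
    os.rename(a, b)      # rename (A, B)

    # creat B/foo, then fsync B/foo
    fd = os.open(os.path.join(b, "foo"), os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    # fsync B (a directory has to be opened read-only to be fsynced)
    dfd = os.open(b, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)

    # ---crash--- would be injected here.
    # Return what is visible before any crash: top-level names and B's contents.
    return sorted(os.listdir(root)), sorted(os.listdir(b))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(run_workload_1(scratch))
```

Running it on a scratch directory only confirms the pre-crash view (directory B containing foo). Checking what survives a crash still needs a tool that replays the recorded block IO.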
path->keep_lock is set but @path immediately gets released; this sets
->keep_lock only when it's necessary.
Signed-off-by: Liu Bo
---
fs/btrfs/tree-defrag.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/tree-defrag.c
Currently the btrfs raid1/10 balancer balances requests to mirrors
based on pid % number of mirrors.
Make the logic aware of:
- whether one of the underlying devices is non-rotational
- queue length of the underlying devices
By default, try the pid % num_mirrors guess, but:
- If one of the mirrors is non-rotational,
On 2018-04-24 22:44, David Sterba wrote:
> On Tue, Apr 24, 2018 at 01:03:13PM +0800, Qu Wenruo wrote:
>> It's pretty handy if we can get debug output for locking status of an
>> extent buffer, especially for race related debugging.
>>
>> So add the following output for btrfs_print_tree() and
>>
Both the defrag ioctl and autodefrag call btrfs_defrag_file()
for file defragmentation.
Kernel default target extent size: 256KiB.
Btrfs-progs default: 32MiB.
Both are bigger than the maximum size of a compressed extent, 128KiB.
That leads to rewriting all compressed data on disk.
Fix that by check
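The size comparison behind this report can be written out as a small sketch. The three sizes come from the text above. The selection rule (an extent shorter than the target is picked for rewrite) is a simplifying assumption, and the helper names are hypothetical, not the kernel's.

```python
KIB = 1024
MIB = 1024 * KIB

MAX_COMPRESSED_EXTENT = 128 * KIB   # maximum size of a compressed extent
KERNEL_DEFAULT_TARGET = 256 * KIB   # kernel default target extent size
PROGS_DEFAULT_TARGET = 32 * MIB     # btrfs-progs default target extent size


def is_defrag_candidate(extent_len, target_len):
    """Assumed rule: an extent shorter than the target is picked for rewrite."""
    return extent_len < target_len


def compressed_extents_rewritten(target_len):
    """True if even a maximum-size compressed extent falls under the target."""
    return is_defrag_candidate(MAX_COMPRESSED_EXTENT, target_len)


if __name__ == "__main__":
    for name, target in (("kernel", KERNEL_DEFAULT_TARGET),
                         ("progs", PROGS_DEFAULT_TARGET)):
        print(name, compressed_extents_rewritten(target))
```

Under that assumption both defaults classify every compressed extent as a candidate, which matches the behaviour described above.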
Currently btrfs_inode has a size of 1136 bytes (on x86_64).
struct btrfs_inode stores several variables related to compression code;
all states use 1 or 2 bits.
Let's declare bitfields for the compression-related variables, to reduce
sizeof btrfs_inode to 1128 bytes.
Signed-off-by: Timofey Titovets
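The effect of packing small state values into bitfields can be illustrated outside the kernel with ctypes. This is a sketch only: the structures and field names (`state_a`, `state_b`, `flag_c`) are hypothetical and do not mirror struct btrfs_inode, so the byte counts differ from the 1136 and 1128 quoted above.

```python
import ctypes


class SeparateFields(ctypes.Structure):
    # Hypothetical layout: one full unsigned int per small state value.
    _fields_ = [
        ("state_a", ctypes.c_uint),
        ("state_b", ctypes.c_uint),
        ("flag_c", ctypes.c_uint),
    ]


class PackedFields(ctypes.Structure):
    # Same values declared as bitfields of 2, 2 and 1 bits,
    # so they can share a single unsigned int.
    _fields_ = [
        ("state_a", ctypes.c_uint, 2),
        ("state_b", ctypes.c_uint, 2),
        ("flag_c", ctypes.c_uint, 1),
    ]


if __name__ == "__main__":
    print(ctypes.sizeof(SeparateFields), ctypes.sizeof(PackedFields))
```

The packed variant still holds every value that fits in its declared width, it just costs fewer bytes per structure.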
Currently btrfs_dedupe_file_range() is restricted to a 16MiB range to
limit locking time and memory requirements for the dedup ioctl().
For an input range that is too big, the code silently sets the range to 16MiB.
Let's remove that restriction by iterating over the dedup range.
That's backward compatible and will not change
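Iterating over a large range in pieces of at most 16MiB can be sketched as below. The 16MiB limit is from the text above. The generator name `iter_dedupe_chunks` and its parameters are assumptions for illustration, not the patch's code.

```python
MAX_DEDUPE_CHUNK = 16 * 1024 * 1024  # 16MiB, the per-call limit named above


def iter_dedupe_chunks(offset, length, chunk=MAX_DEDUPE_CHUNK):
    """Yield (offset, length) pieces that cover the range, each <= chunk bytes."""
    end = offset + length
    while offset < end:
        step = min(chunk, end - offset)
        yield offset, step
        offset += step


if __name__ == "__main__":
    for piece in iter_dedupe_chunks(0, 40 * 1024 * 1024):
        print(piece)
```

A caller would run the per-chunk dedupe work once for each yielded piece, so locking time and memory per step stay bounded while the whole requested range is processed.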
1st patch: remove the 16MiB restriction from the extent_same ioctl()
by iterating over the passed range.
I did not see much difference in performance, so it just removes the
logic restriction.
Patches 2-3: update the defrag ioctl():
- Fix bad behaviour with full rewriting of all compressed
extents in
Currently the defrag ioctl only supports recompressing files with a specified
compression type.
Allow setting the compression type to none when calling defrag, and use
BTRFS_DEFRAG_RANGE_COMPRESS as a flag that the user requests a change of compression
type.
Signed-off-by: Timofey Titovets
---
On 4/23/18 5:43 PM, David Sterba wrote:
> On Tue, Apr 17, 2018 at 02:45:33PM -0400, Jeff Mahoney wrote:
>> On a file system with many snapshots and qgroups enabled, an interrupted
>> balance can end up taking a long time to mount due to recovering the
>> relocations during mount. It does this in
On Wed, Apr 18, 2018 at 05:56:31PM +0800, Anand Jain wrote:
> @@ -155,29 +155,26 @@ static int __btrfs_map_block(struct btrfs_fs_info
> *fs_info,
> *
> * uuid_mutex (global lock)
> *
> - * protects the fs_uuids list that tracks all per-fs fs_devices, resulting
>
On Tue, Apr 24, 2018 at 01:03:13PM +0800, Qu Wenruo wrote:
> It's pretty handy if we can get debug output for locking status of an
> extent buffer, especially for race related debugging.
>
> So add the following output for btrfs_print_tree() and
> btrfs_print_leaf():
> - refs
> - write_locks (as
On Tue, Apr 24, 2018 at 05:23:59PM +0300, Nikolay Borisov wrote:
> It's used only in inode.c so makes no sense to have it exported. Also
> move the definition of btrfs_delalloc_work to inode.c since it's used
> only in this file.
>
> Signed-off-by: Nikolay Borisov
Reviewed-by:
It's used only in inode.c so makes no sense to have it exported. Also
move the definition of btrfs_delalloc_work to inode.c since it's used
only in this file.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/ctree.h | 9 -
fs/btrfs/inode.c | 8
2 files changed, 8
Hi all,
Any thoughts on this?
We completely understand you are all busy and might be traveling, so
we only need a simple ack from you: that when we fsync a directory in
btrfs, we can expect the contents to get persisted. We understand that
is not your highest priority item, and that you will fix
Now that the initialization part and the critical section code have
been split it's a lot easier to open code add_delayed_tree_ref. Do
so in the following manner:
1. The common init code is put immediately after memory-to-be-init is
allocated, followed by the ref-specific member initialization.
The majority of the init code for struct btrfs_delayed_ref_node is
duplicated in add_delayed_data_ref and add_delayed_tree_ref. Factor
out the common bits in init_delayed_ref_common. This function is
going to be used in future patches to clean that up. No functional
changes
Signed-off-by: Nikolay
Use the newly introduced function when initialising the head_ref in
add_delayed_ref_head. No functional changes.
Signed-off-by: Nikolay Borisov
---
fs/btrfs/delayed-ref.c | 63 --
1 file changed, 4 insertions(+), 59 deletions(-)
Use the newly introduced common helper. No functional changes
Signed-off-by: Nikolay Borisov
---
fs/btrfs/delayed-ref.c | 35 +++
1 file changed, 11 insertions(+), 24 deletions(-)
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
Use the newly introduced helper and remove the duplicate code.
No functional changes
Signed-off-by: Nikolay Borisov
---
fs/btrfs/delayed-ref.c | 34 ++
1 file changed, 10 insertions(+), 24 deletions(-)
diff --git a/fs/btrfs/delayed-ref.c
Now that the initialization part and the critical section code have
been split it's a lot easier to open code add_delayed_data_ref. Do
so in the following manner:
1. The common init function is put immediately after memory-to-be-init
is allocated, followed by the specific data ref initialization.
add_delayed_ref_head really performed 2 independent operations -
initialising the ref head and adding it to a list. Now that the init
part is in a separate function let's complete the separation between
both operations. This results in a lot simpler interface for
add_delayed_ref_head since the
add_delayed_ref_head implements the logic to both initialize a head_ref
structure as well as perform the necessary operations to add it to the
delayed ref machinery. This has resulted in a very cumbersome interface
with loads of parameters and code, which at first glance, looks very
unwieldy.
On Tue, Apr 24, 2018 at 12:48:09PM +0800, Qu Wenruo wrote:
> -static int btrfs_validate_super(struct btrfs_fs_info *fs_info)
> +/*
> + * Check the validation of btrfs super block.
> + *
> + * @sb: super block to check
> + * @super_mirror:the super block number to check its
On Mon, Apr 23, 2018 at 12:31:17PM +0300, Nikolay Borisov wrote:
>
>
> On 23.04.2018 12:27, Qu Wenruo wrote:
> >
> >
> > On 2018-04-23 15:54, Nikolay Borisov wrote:
> >> While trying to make sense of the lifecycle of delayed iputs it became
> >> apparent
> >> that the delay_iput parameter of
On Mon, Apr 23, 2018 at 10:54:15AM +0300, Nikolay Borisov wrote:
> It's always set to 0 so remove it
>
> Signed-off-by: Nikolay Borisov
> ---
> fs/btrfs/inode.c | 14 +-
> 1 file changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/fs/btrfs/inode.c
On 24.04.2018 16:22, David Sterba wrote:
> On Mon, Apr 23, 2018 at 10:54:17AM +0300, Nikolay Borisov wrote:
>> It's used only in inode.c so makes no sense to have it exported.
>>
>> Signed-off-by: Nikolay Borisov
>> ---
>> fs/btrfs/ctree.h | 2 --
>> 1 file changed, 2
On 24.04.2018 02:03, David Sterba wrote:
> Almost all callers pass the start and len as 2 arguments but this is not
> necessary, all the information is provided by the eb. By reordering the
> calls to num_extent_pages, we don't need the local variables with
> start/len.
>
> Signed-off-by: David
On 24.04.2018 02:03, David Sterba wrote:
> The loops iterating eb pages use unsigned long, that's an overkill as
> we know that there are at most 16 pages (64k / 4k), and 4 by default
> (with nodesize 16k).
>
> Signed-off-by: David Sterba
Reviewed-by: Nikolay Borisov
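The page counts quoted above follow from dividing nodesize by the page size. A minimal sketch of that arithmetic, assuming 4k pages as in the quoted text (the helper name `num_pages` is hypothetical and is not the kernel's num_extent_pages):

```python
def num_pages(nodesize, page_size=4096):
    """Number of pages needed to hold one extent buffer of `nodesize` bytes."""
    return (nodesize + page_size - 1) // page_size


if __name__ == "__main__":
    print(num_pages(16 * 1024))  # default nodesize of 16k
    print(num_pages(64 * 1024))  # largest nodesize mentioned above, 64k
```

With counts this small, a plain int loop counter is more than enough.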
On Mon, Apr 23, 2018 at 10:54:17AM +0300, Nikolay Borisov wrote:
> It's used only in inode.c so makes no sense to have it exported.
>
> Signed-off-by: Nikolay Borisov
> ---
> fs/btrfs/ctree.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h
Use the wrappers and reduce the amount of low-level details about the
waitqueue management.
Signed-off-by: David Sterba
---
fs/btrfs/compression.c | 7 +--
fs/btrfs/delayed-inode.c | 9 +++--
fs/btrfs/dev-replace.c | 10 --
fs/btrfs/extent-tree.c | 7
Reduce number of standalone barriers before waitqueue_active calls.
Changes v2:
* add 2 barriers to btrfs_sync_log and do not assume they're implied,
(pointed out by Nikolay)
git://github.com/kdave/btrfs-devel.git cleanup/cond-wake
David Sterba (3):
btrfs: introduce conditional wakeup
Currently the code assumes that there's an implied barrier by the
sequence of code preceding the wakeup, namely the mutex unlock.
As Nikolay pointed out:
I think this is wrong (not your code) but the original assumption that
the RELEASE semantics provided by mutex_unlock is sufficient.
According
Add convenience wrappers for the waitqueue management that involves
memory barriers to prevent deadlocks. The helpers will let us remove
barriers and the necessary comments in several places.
Reviewed-by: Nikolay Borisov
Signed-off-by: David Sterba
---
On 2018-04-24 19:30, David Sterba wrote:
> On Tue, Apr 24, 2018 at 07:28:27PM +0800, Qu Wenruo wrote:
>>> I've read the discussion under previous version again, IMHO the best way
>>> to report what's going on is to use 2 functions for mount and pre-commit
>>> time.
>>
>> OK, next version will
Hi,
btrfs-progs version 4.16.1 has been released. This is a bugfix release.
Changes:
* remove obsolete tools: btrfs-debug-tree, btrfs-zero-log, btrfs-show-super,
btrfs-calc-size
* sb-mod: new debugging tool to edit superblock items
* mkfs: detect if thin-provisioned device does not
On Tue, Apr 24, 2018 at 07:28:27PM +0800, Qu Wenruo wrote:
> > I've read the discussion under previous version again, IMHO the best way
> > to report what's going on is to use 2 functions for mount and pre-commit
> > time.
>
> OK, next version will go that direction.
>
> Although it may still be
On 2018-04-24 18:48, David Sterba wrote:
> On Tue, Apr 24, 2018 at 12:48:07PM +0800, Qu Wenruo wrote:
>> Although we have already checked incompat flags manually before really
>> mounting it, we could still enhance btrfs_check_super_valid() to check
>> incompat flags for later write time super
On Tue, Apr 24, 2018 at 10:52:41AM +0800, Su Yue wrote:
>
>
> On 01/24/2018 03:42 AM, David Sterba wrote:
> > On Sun, Jan 07, 2018 at 01:54:21PM -0800, Rosen Penev wrote:
> >> As btrfs is specific to Linux, %m can be used instead of strerror(errno)
> >> in format strings. This has some size
[adding linux-btrfs list to cc]
On Tue, Apr 17, 2018 at 04:44:42PM -0700, Howard McLauchlan wrote:
> This test aims to verify correct behaviour with chattr operations and
> btrfs send/receive. The intent is to check general correctness as well
> as special interactions with troublesome
On Tue, Apr 24, 2018 at 12:48:07PM +0800, Qu Wenruo wrote:
> Although we have already checked incompat flags manually before really
> mounting it, we could still enhance btrfs_check_super_valid() to check
> incompat flags for later write time super block validation check.
But the calls are in
On 2018-04-24 18:29, David Sterba wrote:
> On Tue, Apr 24, 2018 at 02:22:15PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2018-04-24 13:59, Nikolay Borisov wrote:
>>>
>>>
>>> On 24.04.2018 02:03, David Sterba wrote:
The eb length is nodesize, as initialized in __alloc_extent_buffer.
On Tue, Apr 24, 2018 at 02:22:15PM +0800, Qu Wenruo wrote:
>
>
> On 2018-04-24 13:59, Nikolay Borisov wrote:
> >
> >
> > On 24.04.2018 02:03, David Sterba wrote:
> >> The eb length is nodesize, as initialized in __alloc_extent_buffer.
> >> Regardless of start, we should always get the same
On 2018-04-24 14:48, Su Yue wrote:
>
>
> On 04/24/2018 02:02 PM, Qu Wenruo wrote:
>>
>>
>> On 2018-04-24 13:52, Su Yue wrote:
>>> There is no delayed ref in btrfs-progs, so remove related comments.
>>>
>>
>> Indeed.
>> Delayed ref is only used to speed up extent tree modification with the
>>
On 2018-04-24 14:43, Su Yue wrote:
>
>
> On 04/24/2018 02:17 PM, Qu Wenruo wrote:
>>
>>
>> On 2018-04-24 13:52, Su Yue wrote:
>>> For an extent item which contains many tree block backrefs, like
>>> ===
>>> In
On 04/24/2018 02:02 PM, Qu Wenruo wrote:
On 2018-04-24 13:52, Su Yue wrote:
There is no delayed ref in btrfs-progs, so remove related comments.
Indeed.
Delayed ref is only used to speed up extent tree modification with the
cost of code complexity.
Thanks for your explanation :).
For
On 04/24/2018 02:17 PM, Qu Wenruo wrote:
On 2018-04-24 13:52, Su Yue wrote:
For an extent item which contains many tree block backrefs, like
=
In 020-extent-ref-cases/keyed_block_ref.img
item 10 key (29470720 METADATA_ITEM 0)
On 2018-04-24 13:59, Nikolay Borisov wrote:
>
>
> On 24.04.2018 02:03, David Sterba wrote:
>> The eb length is nodesize, as initialized in __alloc_extent_buffer.
>> Regardless of start, we should always get the same number of pages, so
>> use that fact.
>>
>> Signed-off-by: David Sterba
On 2018-04-24 13:52, Su Yue wrote:
> For an extent item which contains many tree block backrefs, like
> =
> In 020-extent-ref-cases/keyed_block_ref.img
>
> item 10 key (29470720 METADATA_ITEM 0) itemoff 3450 itemsize 222
>
On 2018-04-24 13:52, Su Yue wrote:
> After the call of ref_for_same_block, ref1->parent must equal
> ref2->parent, so the block of exchange is never reached.
>
> So remove the block of exchange.
Reviewed-by: Qu Wenruo
The patch looks good, but considering how much difference the
On 2018-04-24 13:52, Su Yue wrote:
> There is no delayed ref in btrfs-progs, so remove related comments.
>
Indeed.
Delayed ref is only used to speed up extent tree modification with the
cost of code complexity.
For btrfs-progs we don't need to worry about it at all.
Thanks,
Qu
>