On Thu, Oct 18, 2018 at 04:23:18PM -0400, Josef Bacik wrote:
> ->page_mkwrite is extremely expensive in btrfs. We have to reserve
> space, which can take 6 lifetimes, and we could possibly have to wait on
> writeback on the page, another several lifetimes. To avoid this simply
> drop the mmap_sem
On Thu, Oct 18, 2018 at 04:23:15PM -0400, Josef Bacik wrote:
> If we drop the mmap_sem we have to redo the vma lookup which requires
> redoing the fault handler. Chances are we will just come back to the
> same page, so save this page in our vmf->cached_page and reuse it in the
> next loop through
On Thu, Oct 18, 2018 at 04:23:14PM -0400, Josef Bacik wrote:
> Johannes' patches didn't quite cover all of the IO cases that we need to
> drop the mmap_sem for, this patch covers the rest of them.
>
> Signed-off-by: Josef Bacik
> ---
> mm/filemap.c | 11 +++
> 1 file changed, 11 insertio
On Thu, Oct 18, 2018 at 04:23:17PM -0400, Josef Bacik wrote:
> Before we didn't set the retry flag on our vm_fault. We want to allow
> file systems to drop the mmap_sem if they so choose, so set this flag
> and deal with VM_FAULT_RETRY appropriately.
>
> Signed-off-by: Josef Bacik
> ---
> mm/me
On Thu, Oct 18, 2018 at 04:23:16PM -0400, Josef Bacik wrote:
> This is preparation for dropping the mmap_sem in page_mkwrite. We need
> to know if we used our cached page so we can be sure it is the page we
> already did the page_mkwrite stuff on so we don't have to redo all of
> that work.
>
> S
On Thu, Oct 18, 2018 at 04:23:13PM -0400, Josef Bacik wrote:
> From: Johannes Weiner
>
> Reads can take a long time, and if anybody needs to take a write lock on
> the mmap_sem it'll block any subsequent readers to the mmap_sem while
> the read is outstanding, which could cause long delays. Inst
To allow delayed subtree swap rescan, btrfs needs to record per-root
info about which tree blocks get swapped.
So this patch introduces per-root btrfs_qgroup_swapped_blocks structure,
which records which tree blocks get swapped.
The designed workflow will be:
1) Record the subtree root block get
[Bad format in previous reply, send again]
On 10/18/18 10:41 PM, Christoph Anton Mitterer wrote:
Hey.
So I'm back from a longer vacation and now had the time to try out your
patches from below:
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
I found that the errors are due to something about
On 10/18/18 10:41 PM, Christoph Anton Mitterer wrote:
Hey.
So I'm back from a longer vacation and now had the time to try out your
patches from below:
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
I found that the errors are due to something about the inode_extref check
in lowmem mode.
I hav
On 10/19/2018 02:02 AM, Chris Murphy wrote:
On Tue, Oct 16, 2018 at 10:08 PM, Anand Jain wrote:
So a possible solution for the reproducible builds:
usual mkfs.btrfs dev
Write the data
unmount; create btrfs-image with uuid/fsid/time sanitized; mark it as a
seed (RO).
chec
On 2018/10/19 12:20 AM, David Sterba wrote:
> On Thu, Oct 18, 2018 at 07:17:27PM +0800, Qu Wenruo wrote:
>> +void btrfs_qgroup_clean_swapped_blocks(struct btrfs_root *root)
>> +{
>> +struct btrfs_qgroup_swapped_blocks *swapped_blocks;
>> +struct btrfs_qgroup_swapped_block *cur, *next;
>> +
We want to be able to cache the result of a previous loop of a page
fault in the case that we use VM_FAULT_RETRY, so introduce
handle_mm_fault_cacheable that will take a struct vm_fault directly, add
a ->cached_page field to vm_fault, and add helpers to init/cleanup the
struct vm_fault.
I've conve
Johannes' patches didn't quite cover all of the IO cases that we need to
drop the mmap_sem for, this patch covers the rest of them.
Signed-off-by: Josef Bacik
---
mm/filemap.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/mm/filemap.c b/mm/filemap.c
index 1ed35cd99b2c..65395ee
This is preparation for dropping the mmap_sem in page_mkwrite. We need
to know if we used our cached page so we can be sure it is the page we
already did the page_mkwrite stuff on so we don't have to redo all of
that work.
Signed-off-by: Josef Bacik
---
include/linux/mm.h | 6 +-
mm/filemap
->page_mkwrite is extremely expensive in btrfs. We have to reserve
space, which can take 6 lifetimes, and we could possibly have to wait on
writeback on the page, another several lifetimes. To avoid this simply
drop the mmap_sem if we didn't have the cached page and do all of our
work and return
Before we didn't set the retry flag on our vm_fault. We want to allow
file systems to drop the mmap_sem if they so choose, so set this flag
and deal with VM_FAULT_RETRY appropriately.
Signed-off-by: Josef Bacik
---
mm/memory.c | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)
di
From: Johannes Weiner
Reads can take a long time, and if anybody needs to take a write lock on
the mmap_sem it'll block any subsequent readers to the mmap_sem while
the read is outstanding, which could cause long delays. Instead drop
the mmap_sem if we do any reads at all.
Signed-off-by: Johann
Getting some production testing running on these patches shortly to verify they
are ready for primetime, but in the meantime they've had a bunch of xfstests
runs on xfs, btrfs, and ext4 using kvm-xfstests.
v2->v3:
- dropped the RFC, ready for a real review.
- fixed a kbuild error for !MMU configs.
If we drop the mmap_sem we have to redo the vma lookup which requires
redoing the fault handler. Chances are we will just come back to the
same page, so save this page in our vmf->cached_page and reuse it in the
next loop through the fault handler.
Signed-off-by: Josef Bacik
---
mm/filemap.c |
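The retry-with-cached-page idea described above can be illustrated with a small userspace toy. This is only a sketch under stated assumptions: `toy_fault`, `toy_handle_fault` and `toy_fault_loop` are hypothetical names invented for illustration and are not the kernel's structures or functions. The first pass does the lookup, remembers the page and asks for a retry (standing in for the case where the lock was dropped); the second pass finds the cached page and skips the lookup.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_result { TOY_DONE, TOY_RETRY };

struct toy_fault {
    int *cached_page;   /* hypothetical analogue of vmf->cached_page */
    unsigned lookups;   /* how many times the expensive lookup ran */
};

static int toy_page;    /* stands in for the page being faulted in */

/* One pass through the toy fault handler. */
static enum toy_result toy_handle_fault(struct toy_fault *f)
{
    if (f->cached_page)
        return TOY_DONE;        /* reuse the page saved by the last pass */

    f->lookups++;               /* expensive lookup, done only once */
    f->cached_page = &toy_page; /* remember it for the next loop */
    return TOY_RETRY;           /* pretend the lock was dropped */
}

/* Loop until the handler stops asking for a retry; returns pass count. */
static unsigned toy_fault_loop(struct toy_fault *f)
{
    unsigned passes = 0;
    enum toy_result r;

    do {
        passes++;
        r = toy_handle_fault(f);
    } while (r == TOY_RETRY);

    return passes;
}
```

Starting from a zero-initialized `struct toy_fault`, the loop takes two passes but performs the lookup only once, which is the saving the patch description is after.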
From: Darrick J. Wong
Plumb in a remap flag that enables the filesystem remap handler to
shorten remapping requests for callers that can handle it. Now
copy_file_range can report partial success (in case we run up against
alignment problems, resource limits, etc.).
We also enable CAN_SHORTEN fo
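For callers, partial success means a single call may handle fewer bytes than requested, so the caller has to loop. A minimal sketch of such a loop follows, assuming a hypothetical callback type `remap_fn` and a hypothetical mock `capped_remap`; neither is part of the kernel or libc API, they only stand in for an operation that may shorten a request.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for an operation that may shorten a request:
 * returns the number of bytes handled, 0 if no progress is possible,
 * or -1 on error. */
typedef ssize_t (*remap_fn)(size_t offset, size_t len, void *ctx);

/* Call op repeatedly until len bytes are handled, op reports 0, or it
 * fails. Returns the total handled, or -1 if nothing was handled before
 * the failure. */
static ssize_t remap_all(remap_fn op, size_t offset, size_t len, void *ctx)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = op(offset + done, len - done, ctx);

        if (n < 0)
            return done ? (ssize_t)done : -1;
        if (n == 0)
            break;              /* short result with no further progress */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Mock op: pretends a resource limit caps each call at *ctx bytes. */
static ssize_t capped_remap(size_t offset, size_t len, void *ctx)
{
    size_t cap = *(size_t *)ctx;

    (void)offset;
    return (ssize_t)(len < cap ? len : cap);
}
```

With a cap of 4096 bytes, a 10000-byte request is completed in three calls; a caller that ignored the short return would silently handle only the first 4096 bytes.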
From: Darrick J. Wong
Change the remap_file_range functions to take a number of bytes to
operate upon and return the number of bytes they operated on. This is a
requirement for allowing fs implementations to return short clone/dedupe
results to the user, which will enable us to obey resource lim
From: Darrick J. Wong
Combine the clone_file_range and dedupe_file_range operations into a
single remap_file_range file operation dispatch since they're
fundamentally the same operation. The differences between the two can
be made in the prep functions.
Signed-off-by: Darrick J. Wong
Reviewed-
From: Darrick J. Wong
File range remapping, if allowed to run past the destination file's EOF,
is an optimization on a regular file write. Regular file writes that
extend the file length are subject to various constraints which are not
checked by range cloning.
This is a correctness problem bec
From: Goffredo Baroncelli
The caller knows better if this error is fatal or not, i.e. another disk is
available or not.
This is a preparatory patch.
Signed-off-by: Goffredo Baroncelli
Reviewed-by: Daniel Kiper
---
grub-core/fs/btrfs.c | 10 --
1 file changed, 4 insertions(+), 6 delet
From: Goffredo Baroncelli
The original code which handles the recovery of a RAID 6 disk array
assumes that all reads are multiples of 1 << GRUB_DISK_SECTOR_BITS and it
assumes that all the I/O is done via the struct grub_diskfilter_segment.
This is not true for the btrfs code. In order to reuse t
From: Goffredo Baroncelli
Move the code in charge of reading the data from disk into a separate
function. This helps to separate the error handling logic (which depends on
the different raid profiles) from the read from disk logic.
Refactoring this code increases the general readability too.
This i
On Tue, Oct 16, 2018 at 10:08 PM, Anand Jain wrote:
>
> So a possible solution for the reproducible builds:
>usual mkfs.btrfs dev
>Write the data
>unmount; create btrfs-image with uuid/fsid/time sanitized; mark it as a
> seed (RO).
>check/verify the hash of the image.
Gotcha. G
From: Goffredo Baroncelli
This helper is used in a few places to help with debugging. As a
conservative approach, the error is only logged.
This does not impact the error handling.
Signed-off-by: Goffredo Baroncelli
Reviewed-by: Daniel Kiper
---
grub-core/fs/btrfs.c | 24 +++-
From: Goffredo Baroncelli
Add support for recovery for a RAID 5 btrfs profile. In addition,
some code is added as preparatory work for the RAID 6 recovery code.
Signed-off-by: Goffredo Baroncelli
---
grub-core/fs/btrfs.c | 161 +--
1 file changed, 156 inse
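RAID 5 recovery in general rests on XOR parity: the missing chunk of a stripe equals the XOR of all surviving chunks, data and parity alike. The sketch below shows only that general technique; `raid5_rebuild` is a hypothetical helper written for illustration and is not the code from this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rebuild the one missing chunk of a stripe by XOR-ing every surviving
 * chunk (data and parity alike) into out. chunks[missing] is never read
 * and may be NULL. */
static void raid5_rebuild(const uint8_t *const *chunks, size_t nchunks,
                          size_t missing, size_t chunk_len, uint8_t *out)
{
    memset(out, 0, chunk_len);

    for (size_t i = 0; i < nchunks; i++) {
        if (i == missing)
            continue;           /* skip the device we cannot read */
        for (size_t b = 0; b < chunk_len; b++)
            out[b] ^= chunks[i][b];
    }
}
```

For a stripe of two data chunks and one parity chunk, passing the surviving data chunk and the parity chunk reproduces the lost data chunk byte for byte.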
Hi All,
The aim of this patch set is to provide support for a BTRFS raid5/6
filesystem in GRUB.
The first patch implements the basic support for raid5/6, i.e. this works when
all the disks are present.
The next 5 patches are preparatory ones.
The 7th patch implements the raid5 recovery for
From: Goffredo Baroncelli
Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some
disks (up to two) are missing. This code uses the md RAID 6 code already
present in grub.
Signed-off-by: Goffredo Baroncelli
Reviewed-by: Daniel Kiper
---
grub-core/fs/btrfs.c | 60 +
From: Goffredo Baroncelli
A portion of the logging code is moved outside of the internal for(;;). The part
that is left inside is the one which depends on the internal for(;;) index.
This is a preparatory patch. The next one will refactor the code inside
the for(;;) into another function.
Signed
From: Goffredo Baroncelli
Signed-off-by: Goffredo Baroncelli
Signed-off-by: Daniel Kiper
Reviewed-by: Daniel Kiper
---
grub-core/fs/btrfs.c | 73
1 file changed, 73 insertions(+)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195
From: Goffredo Baroncelli
Currently, a read from a missing device triggers a rescan. However, it is never
recorded that the device is missing. So, each read of a missing device
triggers a rescan again and again. This behavior causes a lot of unneeded
rescans, leading to huge slowdowns.
This patch fixes ab
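The problem described above is a missing negative cache: a failed lookup is forgotten, so it is repeated on every read. One way to avoid that, sketched here under assumptions, is to record the lookup result even when the device was not found. The structures and names below (`dev_cache`, `find_device`, `rescan_for_device`) are hypothetical and are not GRUB's actual data structures; the truncated message does not show how the patch itself does it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 8

/* Hypothetical lookup cache; not GRUB's actual structures. */
struct dev_entry {
    uint64_t id;
    bool known;         /* a lookup was already done for this id */
    bool present;       /* result of that lookup */
};

struct dev_cache {
    struct dev_entry entries[MAX_DEVS];
    size_t count;
    unsigned rescans;   /* number of expensive scans performed */
};

/* Stand-in for the expensive scan; here every device is reported missing. */
static bool rescan_for_device(struct dev_cache *c, uint64_t id)
{
    (void)id;
    c->rescans++;
    return false;
}

/* Return true if the device is available; scan at most once per id. */
static bool find_device(struct dev_cache *c, uint64_t id)
{
    for (size_t i = 0; i < c->count; i++)
        if (c->entries[i].known && c->entries[i].id == id)
            return c->entries[i].present;   /* cached, even if missing */

    bool present = rescan_for_device(c, id);

    if (c->count < MAX_DEVS) {              /* remember negative results too */
        c->entries[c->count].id = id;
        c->entries[c->count].known = true;
        c->entries[c->count].present = present;
        c->count++;
    }
    return present;
}
```

Three reads of the same missing device then cost a single scan instead of three.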
On Fri, Oct 12, 2018 at 03:32:32PM -0400, Josef Bacik wrote:
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -4374,7 +4374,6 @@ static int btrfs_log_changed_extents(struct
> btrfs_trans_handle *trans,
>
> INIT_LIST_HEAD(&extents);
>
> - down_write(&inode->dio_sem);
I'll
On Thu, Oct 18, 2018 at 07:17:27PM +0800, Qu Wenruo wrote:
> +void btrfs_qgroup_clean_swapped_blocks(struct btrfs_root *root)
> +{
> + struct btrfs_qgroup_swapped_blocks *swapped_blocks;
> + struct btrfs_qgroup_swapped_block *cur, *next;
> + int i;
> +
> + swapped_blocks = &root->sw
Hey.
So I'm back from a longer vacation and now had the time to try out your
patches from below:
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
> I found that the errors are due to something about the inode_extref check
> in lowmem mode.
> I have written three patches to detect and report errors ab
Hello guys!
I have a 2TB disk formatted as btrfs. My notebook broke last May, so I
haven't used the disk in a long time, only to back up a few files, but from a
Windows PC (using the Windows btrfs driver), and maybe (I don't remember) to put
a few files on the disk. Then my thinkpad
16.10.2018 0:33, Chris Murphy wrote:
> On Mon, Oct 15, 2018 at 3:26 PM, Anton Shepelev wrote:
>> Chris Murphy to Anton Shepelev:
>>
How can I track down the origin of this mount point:
/dev/sda2 on /home/hana type btrfs
(rw,relatime,space_cache,subvolid=259,subvol=/@/.snapshot
On 18/10/2018 08.02, Anton Shepelev wrote:
I wrote:
What may be the reason for a CRC mismatch on a BTRFS file in
a virtual machine:
csum failed ino 175524 off 1876295680 csum 451760558
expected csum 1446289185
Shall I seek the culprit in the host machine or in the
guest one? Supposing the hos
I wrote:
>What may be the reason for a CRC mismatch on a BTRFS file in
>a virtual machine:
>
>csum failed ino 175524 off 1876295680 csum 451760558
>expected csum 1446289185
>
>Shall I seek the culprit in the host machine or in the
>guest one? Supposing the host machine healthy, what
>operations on
Refactor btrfs_qgroup_trace_subtree_swap() into
qgroup_trace_subtree_swap(), which only needs two extent buffers and some
bool flags to control the behavior.
Also, allow the functions it depends on to accept the parameter @exec_post to
determine whether we need to trigger a backref walk.
This provides the basis
To allow delayed subtree swap rescan, btrfs needs to record per-root
info about which tree blocks get swapped.
So this patch introduces per-root btrfs_qgroup_swapped_blocks structure,
which records which tree blocks get swapped.
The designed workflow will be:
1) Record the subtree root block get
Before this patch, the qgroup code traces the whole subtree of the file and reloc
trees unconditionally.
This makes qgroup numbers consistent, but it could cause tons of
unnecessary extent traces, which cause a lot of overhead.
However, for a subtree swap during balance, since both subtrees contain the
same conte
Commit fb235dc06fac ("btrfs: qgroup: Move half of the qgroup accounting
time out of commit trans") makes btrfs_qgroup_extent_record::old_roots
populated at insert time.
It's OK for most cases as btrfs_qgroup_extent_record is inserted at
delayed ref head insert time, which has a less restrictive lock
Currently only relocation code cares about btrfs_root::reloc_root, and
they have the method to sync btrfs_root::reloc_root without screwing
things up.
However qgroup code doesn't really have the ability to keep
btrfs_root::reloc_root reliable.
Currently if someone outside of relocation code want
This patchset can be fetched from github:
https://github.com/adam900710/linux/tree/qgroup_balance_skip_trees
It is still based on v4.19-rc1, but with previously submitted patches
as dependencies.
This patchset addresses the heavy-load subtree scan by delaying it until
we're going to modify the swappe
Since it's replaced by new delayed subtree swap code, remove the
original code.
The cleanup is small since most of its core function is still used by
delayed subtree swap trace.
Signed-off-by: Qu Wenruo
---
fs/btrfs/qgroup.c | 94 ---
fs/btrfs/qgroup.
On 2018/10/18 2:16 PM, Tony Prokott wrote:
> On Wed, 17 Oct 2018 17:57:25 -0700 Qu Wenruo
> wrote
> ...
> > > But after chrooting to update-initramfs and cataloging resulting image
> content, usb_storage and uas were present under /lib/modules/xxx already, and
> failing systems st
49 matches