Introduce a new function, scrub_one_data_stripe(), to check all data and
tree blocks inside a data stripe.
This function will not try to recover any error, but only checks whether
any data/tree block has a mismatched csum.
If data is missing its csum, which is completely valid for cases like
nodatasum, it will j
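A minimal, self-contained sketch of the idea follows. The struct layout, the
bitwise CRC-32C routine, its seed/finalization convention and the
little-endian csum comparison are illustrative assumptions, not the actual
btrfs-progs code:

/*
 * Sketch only: walk one data stripe sector by sector, compare the stored
 * csum with a freshly computed one and record mismatches -- no repair is
 * attempted.
 */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CSUM_SIZE 4	/* crc32c */

struct sketch_data_stripe {
	uint32_t nr_sectors;
	uint32_t sectorsize;
	const uint8_t *data;	/* stripe contents already read from disk */
	const uint8_t *csums;	/* on-disk csums, NULL for nodatasum */
	uint8_t *error_bitmap;	/* one bit per sector, set on mismatch */
};

static uint32_t crc32c(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0x82F63B78U : 0);
	}
	return crc;
}

/* Returns the number of csum mismatches found (0 means all good). */
static int scrub_one_data_stripe_sketch(struct sketch_data_stripe *stripe)
{
	int errors = 0;

	/* Missing csums (e.g. nodatasum files) are valid: nothing to check. */
	if (!stripe->csums)
		return 0;

	for (uint32_t i = 0; i < stripe->nr_sectors; i++) {
		uint32_t csum = ~crc32c(~0U,
				stripe->data + (size_t)i * stripe->sectorsize,
				stripe->sectorsize);

		if (memcmp(&csum, stripe->csums + (size_t)i * SKETCH_CSUM_SIZE,
			   SKETCH_CSUM_SIZE)) {
			stripe->error_bitmap[i / 8] |= 1 << (i % 8);
			errors++;
		}
	}
	return errors;
}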
For READ, the caller usually wants only the range it requested, rather
than the full stripe map.
In this case we should remove the unrelated stripe maps, as in the
following case:
                32K               96K
                 |<-request range->|
        0                64K              128K
RAID0:  |                 |                 |
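The trimming described above amounts to the following sketch; the map and
stripe structures here are simplified stand-ins, not the real btrfs_map_block
output:

/*
 * Sketch: drop stripes that do not intersect the requested
 * [start, start + len) range and clamp the remaining ones.
 */
#include <stdint.h>

struct sketch_stripe {
	uint64_t logical;	/* logical start covered by this stripe */
	uint64_t physical;	/* physical start on the device */
	uint64_t length;
};

struct sketch_map {
	int num_stripes;
	struct sketch_stripe stripes[16];
};

static void trim_map_to_request(struct sketch_map *map,
				uint64_t start, uint64_t len)
{
	int kept = 0;

	for (int i = 0; i < map->num_stripes; i++) {
		struct sketch_stripe s = map->stripes[i];
		uint64_t s_end = s.logical + s.length;
		uint64_t r_end = start + len;

		/* Skip stripes completely outside the requested range. */
		if (s_end <= start || s.logical >= r_end)
			continue;

		/* Clamp the front and the back to the request. */
		if (s.logical < start) {
			uint64_t diff = start - s.logical;

			s.logical += diff;
			s.physical += diff;
			s.length -= diff;
		}
		if (s.logical + s.length > r_end)
			s.length = r_end - s.logical;

		map->stripes[kept++] = s;
	}
	map->num_stripes = kept;
}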
Introduce a new function, scrub_one_full_stripe(), to check a full
stripe.
It handles the full stripe scrub in the following steps:
0) Check whether we need to check the full stripe at all
   If the full stripe contains no extent, why waste our CPU and IO?
1) Read out the full stripe
   Then we know how many devices are
Introduce a new function, scrub_one_block_group(), to scrub a block group.
For Single/DUP/RAID0/RAID1/RAID10, we use the old mirror-number-based
map_block() and check extent by extent.
For parity-based profiles (RAID5/6), we use the new map_block_v2() and check
full stripe by full stripe.
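A rough sketch of that dispatch, with simplified, assumed signatures for the
helpers this series adds:

/*
 * Sketch: mirror-based profiles are checked extent by extent, parity-based
 * profiles (RAID5/6) full stripe by full stripe.
 */
#include <stdint.h>

#define SKETCH_BG_RAID56	(1ULL << 0)	/* stands in for RAID5|RAID6 flags */

struct sketch_block_group {
	uint64_t start;
	uint64_t len;
	uint64_t flags;
};

/* Placeholders with simplified signatures. */
int scrub_extents_in_range_sketch(uint64_t start, uint64_t len);
int scrub_one_full_stripe_sketch(uint64_t logical);

static int scrub_one_block_group_sketch(struct sketch_block_group *bg,
					uint64_t full_stripe_len)
{
	int ret = 0;

	if (bg->flags & SKETCH_BG_RAID56) {
		for (uint64_t cur = bg->start; cur < bg->start + bg->len;
		     cur += full_stripe_len)
			ret |= scrub_one_full_stripe_sketch(cur);
	} else {
		/*
		 * The real code iterates EXTENT/METADATA items; here we just
		 * hand the whole block group range to the extent checker.
		 */
		ret = scrub_extents_in_range_sketch(bg->start, bg->len);
	}
	return ret;
}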
Signed-off-by: Qu
Introduce new local structures, scrub_full_stripe and scrub_stripe, for
the incoming offline RAID56 scrub support.
For pure stripe/mirror-based profiles, like raid0/1/10/dup/single, we
will follow the original bytenr- and mirror-number-based iteration, so
they don't need any extra structures for these
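Based only on the descriptions in this series, the two structures would look
roughly like the sketch below; the field names are guesses, not the exact
scrub.c definitions:

/* Rough shape of the new local structures. */
#include <stdint.h>

#define SKETCH_MAX_STRIPES	17	/* data stripes plus P and Q at most */

struct scrub_stripe_sketch {
	uint64_t logical;		/* logical bytenr of the stripe */
	uint64_t physical;		/* physical location on its device */
	void *data;			/* stripe contents read from disk */
	uint8_t *csums;			/* data csums, NULL if none */
	unsigned long *corrupted_bitmap;/* per-sector csum mismatch bits */
};

struct scrub_full_stripe_sketch {
	uint64_t logical_start;		/* start of the full stripe */
	uint32_t nr_stripes;		/* data stripes plus parities */
	uint32_t bg_type;		/* RAID5 or RAID6 */
	int err_read_stripes;		/* stripes we failed to read */
	int err_csum_dstripes;		/* data stripes with csum mismatch */
	struct scrub_stripe_sketch stripes[SKETCH_MAX_STRIPES];
};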
Introduce new functions, check/recover_tree_mirror(), to check and
recover mirror-based tree blocks (Single/DUP/RAID0/1/10).
check_tree_mirror() can also be used on in-memory tree blocks via the @data
parameter.
This is very handy for the RAID5/6 case, either checking the data stripe
tree block by @byte
Introduce new functions, check/recover_data_mirror(), to check and recover
mirror-based data blocks.
Unlike tree blocks, data blocks must be recovered sector by sector, so we
introduce a corrupted_bitmap for the check and recover paths.
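A sketch of the sector-by-sector recovery loop; read_sector_from_mirror() and
sector_csum_matches() are placeholders standing in for the real mirror read
and csum verification paths:

/*
 * Sketch: for every sector marked bad in corrupted_bitmap, try the other
 * mirrors one by one and clear the bit once a copy with a matching csum
 * is found.
 */
#include <stdint.h>
#include <stddef.h>

int read_sector_from_mirror(uint64_t logical, int mirror, void *buf);
int sector_csum_matches(const void *buf, const uint8_t *expected_csum);

/* Returns the number of sectors that could not be recovered. */
static int recover_data_mirror_sketch(uint64_t logical, uint32_t nr_sectors,
				      uint32_t sectorsize, int num_mirrors,
				      uint8_t *corrupted_bitmap, uint8_t *data,
				      const uint8_t *csums, uint32_t csum_size)
{
	int unrecovered = 0;

	for (uint32_t i = 0; i < nr_sectors; i++) {
		int fixed = 0;

		if (!(corrupted_bitmap[i / 8] & (1 << (i % 8))))
			continue;	/* this sector already checked out */

		for (int mirror = 1; mirror <= num_mirrors && !fixed; mirror++) {
			uint8_t *sector = data + (size_t)i * sectorsize;

			if (read_sector_from_mirror(logical +
					(uint64_t)i * sectorsize, mirror, sector))
				continue;
			if (sector_csum_matches(sector,
					csums + (size_t)i * csum_size)) {
				corrupted_bitmap[i / 8] &= ~(1 << (i % 8));
				fixed = 1;
			}
		}
		if (!fixed)
			unrecovered++;
	}
	return unrecovered;
}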
Signed-off-by: Qu Wenruo
Signed-off-by: Su Yue
---
scrub.c | 212 +
Introduce a new function, scrub_one_extent(), as a wrapper to check one
mirror-based extent.
It accepts a btrfs_path parameter, @path, which must point to a
META/EXTENT_ITEM.
It also accepts @start and @len, which must describe a subset of that
META/EXTENT_ITEM.
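Conceptually the wrapper is little more than the dispatch below; the helper
names follow this series, but the signatures are simplified guesses:

/* Sketch: hand @start/@len to the tree or data checker for every mirror. */
#include <stdint.h>

int check_tree_mirror_sketch(uint64_t bytenr, int mirror);
int check_data_mirror_sketch(uint64_t start, uint64_t len, int mirror);

static int scrub_one_extent_sketch(uint64_t start, uint64_t len,
				   int is_metadata, int num_mirrors)
{
	int ret = 0;

	for (int mirror = 1; mirror <= num_mirrors; mirror++) {
		if (is_metadata)
			ret |= check_tree_mirror_sketch(start, mirror);
		else
			ret |= check_data_mirror_sketch(start, len, mirror);
	}
	return ret;
}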
Signed-off-by: Qu Wenruo
---
scrub.c | 148 +++
Introduce a new header, kernel-lib/raid56.h, for the later raid56 work.
It contains 2 functions, taken from the original btrfs-progs code:
void raid6_gen_syndrome(int disks, size_t bytes, void **ptrs);
int raid5_gen_result(int nr_devs, size_t stripe_len, int dest, void **data);
It will be expanded later and some
For anyone who wants to try it, it can be fetched from my repo:
https://github.com/adam900710/btrfs-progs/tree/offline_scrub
Several reports of kernel scrub screwing up good data stripes have been
on the ML for some time.
And since kernel scrub won't account for P/Q corruption, it is quite
hard for us to detect err
Introduce an internal helper, write_full_stripe(), to calculate P/Q and
write the whole full stripe.
This is useful for recovering RAID56 stripes.
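A minimal sketch of the RAID5 case: P is recomputed as the XOR of the data
stripes and everything is written back with pwrite(). The RAID6 Q computation
(raid6_gen_syndrome() in this series), the real on-disk layout and proper
error handling are all simplified assumptions here:

/*
 * Sketch: recompute P, then write every stripe (data + parity) back at its
 * physical offset.  The stripes array must hold nr_data data stripes
 * followed by the parity stripe.
 */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

struct sketch_stripe_loc {
	int fd;			/* device the stripe lives on */
	uint64_t physical;	/* byte offset on that device */
	uint8_t *buf;		/* stripe_len bytes of content */
};

static int write_full_stripe_sketch(struct sketch_stripe_loc *stripes,
				    int nr_data, size_t stripe_len)
{
	struct sketch_stripe_loc *parity = &stripes[nr_data];

	/* P = data[0] ^ data[1] ^ ... ^ data[nr_data - 1] */
	memcpy(parity->buf, stripes[0].buf, stripe_len);
	for (int i = 1; i < nr_data; i++)
		for (size_t b = 0; b < stripe_len; b++)
			parity->buf[b] ^= stripes[i].buf[b];

	/* Write the data stripes and the freshly computed parity back. */
	for (int i = 0; i <= nr_data; i++) {
		ssize_t ret = pwrite(stripes[i].fd, stripes[i].buf, stripe_len,
				     stripes[i].physical);

		if (ret < 0 || (size_t)ret != stripe_len)
			return -1;
	}
	return 0;
}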
Signed-off-by: Qu Wenruo
---
scrub.c | 44
1 file changed, 44 insertions(+)
diff --git a/scrub.c b/scrub.c
i
Now, btrfs-progs has a kernel-scrub equivalent.
A new option, --offline, is added to "btrfs scrub start".
If --offline is given, btrfs scrub will just act like the kernel scrub:
check every copy of every extent and report corrupted data and whether it's
recoverable.
The advantage compared to kernel scr
Introduce a wrapper to recover raid56 data.
The logic is the same as the kernel one, but with different interfaces,
since the kernel code cares about performance while in btrfs-progs we
don't care that much.
And the interface is more caller-friendly inside btrfs-progs.
Signed-off-by: Qu Wenruo
---
kernel-
Introduce a new function, recover_from_parities(), to recover data stripes.
It just wraps raid56_recov() with extra checks against the
scrub_full_stripe structure.
Signed-off-by: Qu Wenruo
---
scrub.c | 51 +++
1 file changed, 51 insertions(+)
diff --g
Introduce a new function, btrfs_check_extent_exists(), to check whether
there is any extent in the range specified by the caller.
The range can be large, and if any extent exists in it, the function
returns >0 (in fact it returns 1).
It returns 0 if no extent is found.
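The return convention can be illustrated with a self-contained overlap check
over an in-memory extent list; the real function walks the extent tree with
the usual search helpers instead:

/* Sketch: return 1 if any extent overlaps [start, start + len), else 0. */
#include <stdint.h>

struct sketch_extent {
	uint64_t start;
	uint64_t len;
};

static int check_extent_exists_sketch(const struct sketch_extent *extents,
				      int nr_extents, uint64_t start,
				      uint64_t len)
{
	uint64_t end = start + len;

	for (int i = 0; i < nr_extents; i++) {
		/* Overlap: extent starts before our end and ends after our start. */
		if (extents[i].start < end &&
		    extents[i].start + extents[i].len > start)
			return 1;
	}
	return 0;
}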
Signed-off-by: Qu Wenr
Introduce a new function, verify_parities(), to check whether the parities
match for a full stripe whose data stripes all match their csums.
The caller should fill the scrub_full_stripe structure properly before
calling this function.
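For RAID5 the verification reduces to an XOR-and-compare; a self-contained
sketch (the buffer layout is assumed, and RAID6 Q verification via the Galois
tables is left out):

/*
 * Sketch: recompute P from the data stripes and compare it with the parity
 * read from disk.  Returns 0 if they match, 1 if P is corrupted, -1 on OOM.
 */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int verify_parities_sketch(uint8_t * const *data_stripes, int nr_data,
				  const uint8_t *ondisk_p, size_t stripe_len)
{
	uint8_t *p = malloc(stripe_len);
	int mismatch;

	if (!p)
		return -1;

	memcpy(p, data_stripes[0], stripe_len);
	for (int i = 1; i < nr_data; i++)
		for (size_t b = 0; b < stripe_len; b++)
			p[b] ^= data_stripes[i][b];

	mismatch = memcmp(p, ondisk_p, stripe_len) ? 1 : 0;
	free(p);
	return mismatch;
}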
Signed-off-by: Qu Wenruo
---
scrub.c | 69 +++
Copied from kernel lib/raid6/recov.c.
Minor modifications include:
- Renamed from raid6_datap_recov_intx() to raid5_recov_datap()
- Renamed the parameter from faila to dest1
Signed-off-by: Qu Wenruo
---
kernel-lib/raid56.c | 41 +
kernel-lib/raid56.h | 2 ++
Use the kernel RAID6 Galois field tables for the later RAID6 recovery.
The Galois tables file, kernel-lib/tables.c, is generated by a user space
program, mktables.
The Galois field table declarations, in kernel-lib/raid56.h, are completely
copied from the kernel.
mktables.c is copied from the kernel with minor header/macro
mo
Copied from the kernel lib/raid6/recov.c raid6_2data_recov_intx1() function,
with the following modifications:
- Renamed to raid6_recov_data2() for a shorter name
- s/kfree/free/g
Signed-off-by: Qu Wenruo
---
Makefile                        |  4 +--
raid56.c => kernel-lib/raid56.c | 69 +++
Introduce a new function: btrfs_read_data_csums(), to read out the csums
for sectors in a given range.
This is quite useful for reading out data csums, so we don't need to do it
with open code.
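A self-contained sketch of the semantics, with a flat array of csum "items"
standing in for the real csum tree lookup:

/*
 * Sketch: gather per-sector csums for [start, start + len) from a sorted
 * list of csum items (each covering a contiguous range, one csum per
 * sector) and report which sectors have no csum at all (nodatasum).
 */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CSUM_SIZE 4

struct sketch_csum_item {
	uint64_t bytenr;	/* logical start covered by this item */
	uint32_t nr_sectors;	/* number of csums in this item */
	const uint8_t *csums;	/* nr_sectors * SKETCH_CSUM_SIZE bytes */
};

/* Returns the number of sectors for which a csum was found. */
static int read_data_csums_sketch(const struct sketch_csum_item *items,
				  int nr_items, uint64_t start, uint64_t len,
				  uint32_t sectorsize, uint8_t *out,
				  uint8_t *found_bitmap)
{
	uint32_t nr_sectors = len / sectorsize;
	int found = 0;

	memset(found_bitmap, 0, (nr_sectors + 7) / 8);

	for (uint32_t i = 0; i < nr_sectors; i++) {
		uint64_t bytenr = start + (uint64_t)i * sectorsize;

		for (int j = 0; j < nr_items; j++) {
			const struct sketch_csum_item *item = &items[j];
			uint64_t item_end = item->bytenr +
				(uint64_t)item->nr_sectors * sectorsize;

			if (bytenr < item->bytenr || bytenr >= item_end)
				continue;

			memcpy(out + (size_t)i * SKETCH_CSUM_SIZE,
			       item->csums + ((bytenr - item->bytenr) /
					sectorsize) * SKETCH_CSUM_SIZE,
			       SKETCH_CSUM_SIZE);
			found_bitmap[i / 8] |= 1 << (i % 8);
			found++;
			break;
		}
	}
	return found;
}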
Signed-off-by: Qu Wenruo
Signed-off-by: Su Yue
---
Makefile | 2 +-
csum.c | 136 +++
Introduce a new function, __btrfs_map_block_v2().
Unlike the old btrfs_map_block(), which needs different parameters to handle
different RAID profiles, this new function uses a unified btrfs_map_block
structure to handle all RAID profiles in a more meaningful way:
Return physical address along with log
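The unified structure described here would look roughly like the sketch
below; the member names are guesses, not the definitions from the patched
volumes code:

/*
 * Rough shape of the unified map result: one variable-length array of
 * stripes, each carrying both the logical and the physical address, so
 * the caller handles every RAID profile the same way.
 */
#include <stdint.h>

struct sketch_map_stripe {
	uint64_t logical;	/* logical bytenr this stripe maps */
	uint64_t physical;	/* physical bytenr on the device */
	int devid;		/* which device the stripe lives on */
};

struct sketch_map_block {
	uint64_t start;		/* logical start of the mapped range */
	uint64_t length;	/* mapped length */
	uint64_t type;		/* block group profile flags */
	uint32_t stripe_len;
	int num_stripes;	/* data stripes plus P/Q if any */
	struct sketch_map_stripe stripes[];	/* flexible array member */
};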
On Wed, May 24, 2017 at 05:27:24PM +0800, Qu Wenruo wrote:
> At 05/24/2017 05:22 PM, Eryu Guan wrote:
> > On Wed, May 24, 2017 at 03:58:11PM +0800, Qu Wenruo wrote:
> > > At 05/24/2017 01:16 PM, Qu Wenruo wrote:
> > > > At 05/24/2017 01:08 PM, Eryu Guan wrote:
The original 'verify_dir_item' verifies the namelen of a dir_item against
fixed values, but not against the item boundary.
If the corrupted namelen is not bigger than the fixed value, for example 255,
the function will think the dir_item is fine. And then reading beyond the
boundary will cause a crash.
Add a parameter 'slot' and check n
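The boundary check itself is simple; a sketch with offsets flattened to plain
integers (the real code checks against the extent_buffer item boundaries for
the given slot):

/*
 * Sketch: reject a dir_item whose name (and data) would extend past the
 * end of the item it lives in.
 */
#include <stdint.h>

struct sketch_dir_item {
	uint32_t item_len;	/* total length of the item (from the slot) */
	uint32_t header_len;	/* size of the fixed dir_item header */
	uint16_t name_len;
	uint16_t data_len;
};

/* Returns 1 if the dir_item is corrupted, 0 if it fits inside its item. */
static int verify_dir_item_sketch(const struct sketch_dir_item *di)
{
	uint32_t used = di->header_len + (uint32_t)di->name_len +
			(uint32_t)di->data_len;

	/* name_len may look "sane" (e.g. 255) yet still cross the boundary. */
	return used > di->item_len;
}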
When reading out a name from an inode_ref or dir_item, it's possible that a
corrupted name_len leads to a read beyond the item boundary.
Since there are already patches for btrfs-progs, this one is for btrfs.
Introduce a function, btrfs_check_namelen; it should be called before reading
a name from an extent_buffer.
The function
Reading a name using 'read_extent_buffer' and 'memcmp_extent_buffer'
may cause a read beyond the item boundary if the namelen field in a dir_item
or inode_ref is corrupted.
Example:
1. Corrupt one dir_item namelen to be 255.
2. Run 'ls -lar /mnt/test/ > /dev/null'
dmesg:
[ 48.451449] BTRFS info
I've run into a few systems where we start getting immediate ENOSPC
errors on any operation as soon as we update to a recent kernel.
These are all small filesystems (not MIXED), which should have had
plenty of free metadata space but no unallocated chunks.
I was able to trace this back to commit a
On Wed, Apr 19, 2017 at 11:44 AM, Henk Slager wrote:
> I also have a WD40EZRX and the fs on it is also almost exclusively a
> btrfs receive target and it has now for the second time csum (just 5 )
> errors. Extended selftest at 16K hours shows no problem and I am not
> fully sure if this is a mag
On Wed, May 24, 2017 at 11:41:49AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues
>
> If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable
> immediately.
>
> IF IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin
> if it needs allocation either due to file extension, writ
From: Goldwyn Rodrigues
aio_rw_flags is introduced in struct iocb (using aio_reserved1), which will
carry the RWF_* flags. We cannot use aio_flags because they are not
checked for validity, which may break existing applications.
Note that the only place RWF_HIPRI takes effect is dio_await_one().
Al
From: Goldwyn Rodrigues
Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Christoph Hellwig
---
fs/read_write.c    | 12 +++-
include/linux/fs.h | 14 ++
2 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 47c1d4484df9..53c816
From: Goldwyn Rodrigues
Find out if the write will trigger a wait due to writeback. If yes,
return -EAGAIN.
Return -EINVAL for buffered AIO: there are multiple causes of
delay such as page locks, dirty throttling logic, page loading
from disk etc. which cannot be taken care of.
Signed-off-by: G
From: Goldwyn Rodrigues
A new bio operation flag, REQ_NOWAIT, is introduced to identify bios
originating from an iocb with IOCB_NOWAIT. This flag indicates that
the request should return immediately if it cannot be made, instead
of retrying.
Stacked devices such as md (the ones with make_request_fn hooks)
currently
From: Goldwyn Rodrigues
If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable
immediately.
If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin
if it needs allocation, either due to file extension, writing to a hole,
or COW, or due to waiting for other DIOs to finish.
Signed-off-by: Goldwy
From: Goldwyn Rodrigues
Return EAGAIN for direct I/O if any of the following holds:
+ i_rwsem is not immediately lockable
+ The write goes beyond the end of the file (will trigger allocation)
+ Blocks are not allocated at the write location
Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Jan Kara
---
fs/ext4/file
From: Goldwyn Rodrigues
IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps.
This is used by XFS in the XFS patch.
Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Christoph Hellwig
---
fs/iomap.c            | 2 ++
include/linux/iomap.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/fs/iom
From: Goldwyn Rodrigues
Return EAGAIN if any of the following holds:
+ i_rwsem is not lockable
+ NODATACOW or PREALLOC is not set
+ We cannot nocow at the desired location
+ The write extends beyond the end of the file and is not allocated
Signed-off-by: Goldwyn Rodrigues
Acked-by: David Sterba
---
fs/
From: Goldwyn Rodrigues
RWF_NOWAIT informs the kernel to bail out if an AIO request will block,
for reasons such as file allocations or a triggered writeback, or if it
would block while allocating requests for direct I/O.
RWF_NOWAIT is translated to IOCB_NOWAIT for iocb->ki_flags.
The check
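The series targets io_submit(), but the user-visible contract of the flag can
be shown with the synchronous pwritev2() entry point, which takes the same
RWF_* flags the AIO path receives via aio_rw_flags. A sketch, guarded for
headers that do not know RWF_NOWAIT yet:

/*
 * Sketch: issue a write that must not block; if the kernel would have to
 * wait (allocation, writeback, contended lock), it fails with EAGAIN and
 * the caller can retry later or fall back to a blocking write.
 */
#define _GNU_SOURCE
#include <sys/uio.h>
#include <errno.h>
#include <stdio.h>

static ssize_t try_nowait_write(int fd, const void *buf, size_t len, off_t off)
{
#ifdef RWF_NOWAIT
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret = pwritev2(fd, &iov, 1, off, RWF_NOWAIT);

	if (ret < 0 && errno == EAGAIN)
		fprintf(stderr, "write would block, deferring\n");
	return ret;
#else
	(void)fd; (void)buf; (void)len; (void)off;
	errno = EOPNOTSUPP;	/* headers too old to know RWF_NOWAIT */
	return -1;
#endif
}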
From: Goldwyn Rodrigues
filemap_range_has_page() returns true if the file's mapping has
a page within the mentioned range. This function will be used
to check whether a write() call will cause a writeback of previous
writes.
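An illustrative kernel-style fragment of how such a check slots into a NOWAIT
write path (not the exact hunk from this patch):

/*
 * Illustrative only: before doing a NOWAIT direct write, bail out with
 * -EAGAIN if the range still has page cache pages that might need to be
 * written back first.
 */
#include <linux/fs.h>
#include <linux/pagemap.h>

static ssize_t nowait_dio_precheck(struct kiocb *iocb, size_t count)
{
	struct address_space *mapping = iocb->ki_filp->f_mapping;

	if ((iocb->ki_flags & IOCB_NOWAIT) &&
	    filemap_range_has_page(mapping, iocb->ki_pos,
				   iocb->ki_pos + count - 1))
		return -EAGAIN;
	return 0;
}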
Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Christoph Hellwig
---
includ
Formerly known as non-blocking AIO.
This series adds a nonblocking feature to asynchronous I/O writes.
io_submit() can be delayed for a number of reasons:
- Block allocation for files
- Data writeback for direct I/O
- Sleeping while waiting to acquire i_rwsem
- A congested block device
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Only read bio->bi_error once in the common path.
Reviewed-by: Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig
Reviewed-by: Bart Van Assche
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> A few (but not all) dm targets use a special EWOULDBLOCK error code for
> failing REQ_RAHEAD requests that fail due to a lack of available resources.
> But no one else knows about this magic code, and lower level drivers also
> don't gen
On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig
Reviewed-by: Bart Van Assche
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote:
> We will only have sense data if the command executed and got a SCSI
> result, so this is pointless.
>
> Signed-off-by: Christoph Hellwig
> ---
> drivers/scsi/osd/osd_initiator.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
On Thu, 2017-05-18 at 15:17 +0200, Christoph Hellwig wrote:
> Instead of reinventing it poorly.
Reviewed-by: Bart Van Assche
On 24.05.2017 06:01, Anand Jain wrote:
> Instead of sending each argument of struct compressed_bio, send
> the compressed_bio itself.
>
> Also by having struct compressed_bio in btrfs_decompress_bio()
> it would help tracing.
>
> Signed-off-by: Anand Jain
> ---
> This patch is preparatory for
On 2017-05-23 14:32, Kai Krakow wrote:
On Tue, 23 May 2017 07:21:33 -0400, "Austin S. Hemmelgarn" wrote:
On 2017-05-22 22:07, Chris Murphy wrote:
On Mon, May 22, 2017 at 5:57 PM, Marc MERLIN wrote:
On Mon, May 22, 2017 at 05:26:25PM -0600, Chris Murphy wrote:
[...]
Oh,
Marc MERLIN posted on Tue, 23 May 2017 09:58:47 -0700 as excerpted:
> That's a valid point, and in my case, I can back it up/restore, it just
> takes a bit of time, but most of the time is manually babysitting all
> those subvolumes that I need to recreate by hand with btrfs send/restore
> relatio
At 05/24/2017 05:22 PM, Eryu Guan wrote:
On Wed, May 24, 2017 at 03:58:11PM +0800, Qu Wenruo wrote:
At 05/24/2017 01:16 PM, Qu Wenruo wrote:
At 05/24/2017 01:08 PM, Eryu Guan wrote:
On Wed, May 24, 2017 at 12:28:34PM +0800, Qu Wenruo wrote:
At 05/24/2017 12:24 PM, Eryu Guan wrote:
On
On Wed, May 24, 2017 at 03:58:11PM +0800, Qu Wenruo wrote:
> At 05/24/2017 01:16 PM, Qu Wenruo wrote:
> > At 05/24/2017 01:08 PM, Eryu Guan wrote:
> > > On Wed, May 24, 2017 at 12:28:34PM +0800, Qu Wenruo wrote:
> > > > At 05/24/2017 12:24 PM, Eryu Guan wrote:
David,
Can I ping you on this patch? I wonder if there is any concern.
Thanks, Anand
On 04/28/2017 05:14 PM, Anand Jain wrote:
We allow recursive mounts with subvol options such as [1]
[1]
mount -o rw,compress=lzo /dev/sdc /btrfs1
mount -o ro,subvol=sv2 /dev/sdc /btrfs2
And except for t
The bdev->bd_disk, !bdev_get_queue and q->make_request_fn checks
are all things you don't need, any blkdev_issue_flush should not
either, although I'll need to look into the weird loop workaround
again, which doesn't make much sense to me.
I tried to confirm q->make_request_fn and got lost, I
Hello,
It occurs when enabling quotas on a volume. When there are a
lot of snapshots that are deleted, the system becomes extremely
unresponsive (IO often waiting for 30s on a SSD). When I don't have
quotas, removing snapshots is fast.
Same problem here. It is now common knowledge in the list th
Thanks for comments.
But that said, I find the log spam today from e.g. docker + devicemapper + xfs
annoying, and switching to overlay2 fixed that as a side effect which is nice.
Having overlay2 log would reintroduce that problem.
You are right, docker with overlay2 logs additional 6 lines du
Thanks for the comments.
On 05/19/2017 11:01 PM, Theodore Ts'o wrote:
On Fri, May 19, 2017 at 08:17:55AM +0800, Anand Jain wrote:
XFS already logs its own unmounts.
Nice. as far as I know its only in XFS.
Ext4 logs mounts, but not unmounts.
I prefer to let each filesystem log
its own u
By looking at the logs we should be able to know when the FS was
mounted and unmounted and which options were used, which helps forensic
investigations.
Signed-off-by: Anand Jain
---
v2:
. Colin pointed out that when docker runs, this patch will create
messages which could be considered too chatty. In v2 I
2015-04-23 17:42 GMT+02:00 Marc Cousin :
> On 20/04/2015 11:51, Marc Cousin wrote:
>> On 31/03/2015 19:05, David Sterba wrote:
>>> On Mon, Mar 30, 2015 at 05:09:52PM +0200, Marc Cousin wrote:
> So it would be good to sample the active threads and see where it's
> spending the time. It could
At 05/24/2017 01:16 PM, Qu Wenruo wrote:
At 05/24/2017 01:08 PM, Eryu Guan wrote:
On Wed, May 24, 2017 at 12:28:34PM +0800, Qu Wenruo wrote:
At 05/24/2017 12:24 PM, Eryu Guan wrote:
On Wed, May 24, 2017 at 08:22:25AM +0800, Qu Wenruo wrote:
At 05/23/2017 07:13 PM, Eryu Guan wrote:
On