Re: [PATCH] btrfs: fix dead lock while running replace and defrag concurrently

2014-11-14 Thread David Sterba
On Mon, Nov 10, 2014 at 03:36:08PM +0800, Gui Hecheng wrote: This can be reproduced by fstests: btrfs/070 [...] Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com Tested-by: David Sterba dste...@suse.cz -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a

[PULL] [PATCH 0/6] Support for 'pending changes'

2014-11-14 Thread David Sterba
There are some actions that modify global filesystem state but cannot be performed at the time of request, but rather later at the transaction commit time when the filesystem is in a known state. The change can be requested from any context and is permanent after sync. This changes the semantics

[PATCH 1/6] btrfs: add support for processing pending changes

2014-11-14 Thread David Sterba
There are some actions that modify global filesystem state but cannot be performed at the time of request, but later at the transaction commit time when the filesystem is in a known state. For example enabling new incompat features on-the-fly or issuing transaction commit from unsafe contexts
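The request-now, apply-at-commit idea described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct fs_state`, the `PENDING_*` bit names, and the three helper functions are hypothetical stand-ins, not the identifiers used in the patch, and C11 atomics stand in for whatever primitives the kernel code uses.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit numbers for illustration; not the kernel's definitions. */
enum pending_bit {
    PENDING_SET_INODE_CACHE = 0,
    PENDING_COMMIT = 1,
};

/* Stand-in for the per-filesystem state that holds requested changes. */
struct fs_state {
    atomic_ulong pending_changes; /* one bit per requested change */
    int applied_count;            /* how many changes have been applied */
};

/* Request a change from any context: a single atomic bit set, no locks. */
static void request_pending(struct fs_state *fs, enum pending_bit bit)
{
    assert((unsigned)bit < 8 * sizeof(unsigned long));
    atomic_fetch_or(&fs->pending_changes, 1UL << bit);
}

/* Report whether anything is waiting for the next commit. */
static int has_pending(struct fs_state *fs)
{
    return atomic_load(&fs->pending_changes) != 0;
}

/*
 * Called at "transaction commit": atomically take every requested bit and
 * clear the mask, then apply each change. Returns the bits that were taken.
 */
static unsigned long apply_pending(struct fs_state *fs)
{
    unsigned long taken = atomic_exchange(&fs->pending_changes, 0UL);

    /* Visit each set bit once; a real implementation would act on it here. */
    for (unsigned long bits = taken; bits != 0; bits &= bits - 1)
        fs->applied_count++;
    return taken;
}
```

Taking the whole mask with one exchange means a request that arrives during the commit is not lost; it simply stays set for the following commit.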

[PATCH 4/6] btrfs: introduce pending action: commit

2014-11-14 Thread David Sterba
In some contexts, like in sysfs handlers, we don't want to trigger a transaction commit. It's a heavy operation, we don't know what external locks may be taken. Instead, make it possible to finish the operation through sync syscall or SYNC_FS ioctl. Signed-off-by: David Sterba dste...@suse.cz ---

[PATCH 6/6] btrfs: move commit out of sysfs when changing label

2014-11-14 Thread David Sterba
Signed-off-by: David Sterba dste...@suse.cz --- fs/btrfs/sysfs.c | 21 - 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 226f7261533a..92db3f648df4 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -369,9 +369,6 @@

[PATCH 3/6] btrfs: switch inode_cache option handling to pending changes

2014-11-14 Thread David Sterba
The pending mount option(s) now share namespace and bits with the normal options, and the existing one for (inode_cache) is unset unconditionally at each transaction commit. Introduce a separate namespace for pending changes and enhance the descriptions of the intended change to use separate bits

[PATCH 5/6] btrfs: move commit out of sysfs when changing features

2014-11-14 Thread David Sterba
Signed-off-by: David Sterba dste...@suse.cz --- fs/btrfs/sysfs.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index b2e7bb4393f6..226f7261533a 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -111,7 +111,6 @@ static

[PATCH 2/6] btrfs: do commit in sync_fs if there are pending changes

2014-11-14 Thread David Sterba
If a pending change is requested, it's not processed unless there is a transaction commit about to happen, not even after sync or SYNC_FS ioctl. For example a remount that toggles the inode_cache option will not take effect after sync on a quiescent filesystem. Signed-off-by: David Sterba

Re: [PATCH 2/6] btrfs: do commit in sync_fs if there are pending changes

2014-11-14 Thread Filipe David Manana
On Fri, Nov 14, 2014 at 10:33 AM, David Sterba dste...@suse.cz wrote: If a pending change is requested, it's not processed unless there is a transaction commit about to happen, not even after sync or SYNC_FS ioctl. For example a remount that toggles the inode_cache option will not take effect

[PATCH 2/6 v2] btrfs: do commit in sync_fs if there are pending changes

2014-11-14 Thread David Sterba
If a pending change is requested, it's not processed unless there is a transaction commit about to happen, not even after sync or SYNC_FS ioctl. For example a remount that toggles the inode_cache option will not take effect after sync on a quiescent filesystem. Signed-off-by: David Sterba

[PATCH 0/9] Implement device scrub/replace for RAID56

2014-11-14 Thread Miao Xie
This patchset implements the device scrub/replace function for RAID56. Most of the implementation for the common data is similar to the other RAID types. The difference, and the difficulty, is the parity processing. In order to avoid the problem that the data is easily changed outside the stripe lock, we do most

[PATCH 6/9] Btrfs,raid56: support parity scrub on raid56

2014-11-14 Thread Miao Xie
The implementation is: - Read and check all the data with checksum in the same stripe. All the data which has a checksum is COW data, and we are sure that it does not change even though we don't lock the stripe, because the space of that data can only be reclaimed after the current transaction is

[PATCH 1/9] Btrfs: remove noused bbio_ret in __btrfs_map_block in condition

2014-11-14 Thread Miao Xie
From: Zhao Lei zhao...@cn.fujitsu.com bbio_ret in this condition is always !NULL because the previous code already has a check-and-skip: if (!bbio_ret) goto out; Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/volumes.c | 3 +--

[PATCH 5/9] Btrfs,raid56: use a variant to record the operation type

2014-11-14 Thread Miao Xie
We will introduce a new operation type later. If we keep using integer variables as booleans to record the operation type, we would have to add a new variable and increase the size of the raid bio structure. That is not good, so this patch defines a different number for each operation, and we can just use a

[PATCH 4/9] Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted

2014-11-14 Thread Miao Xie
This patch implements the RAID5/6 common data repair function. The implementation is similar to scrub on the other RAID types such as RAID1; the difference is that we don't read the data from the mirror, we use the data repair function of RAID5/6. Signed-off-by: Miao Xie mi...@cn.fujitsu.com ---

[PATCH 9/9] Btrfs, replace: enable dev-replace for raid56

2014-11-14 Thread Miao Xie
From: Zhao Lei zhao...@cn.fujitsu.com Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/dev-replace.c | 5 - 1 file changed, 5 deletions(-) diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index 6f662b3..6aa835c 100644 ---

[PATCH 2/9] Btrfs: remove unnecessary code of stripe_index assignment in __btrfs_map_block

2014-11-14 Thread Miao Xie
From: Zhao Lei zhao...@cn.fujitsu.com stripe_index's value is set again in a later line: stripe_index = 0; Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/volumes.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git

[PATCH 8/9] Btrfs, replace: write raid56 parity into the replace target device

2014-11-14 Thread Miao Xie
This function reuses the code of parity scrub, and we just write the right parity or the corrected parity into the target device before the parity scrub ends. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/raid56.c | 23 +++ fs/btrfs/scrub.c | 2 +- 2 files changed,

[PATCH 7/9] Btrfs, replace: write dirty pages into the replace target device

2014-11-14 Thread Miao Xie
The implementation is simple: - In order to avoid changing the code logic of btrfs_map_bio and RAID56, we add the stripes of the replace target devices at the end of the stripe array in btrfs bio, and we sort those target device stripes in the array. And we keep the number of the target

[PATCH 3/9] Btrfs, raid56: don't change bbio and raid_map

2014-11-14 Thread Miao Xie
Because we will reuse bbio and raid_map during the scrub later, it is better that we don't change any variable of bbio and don't free it at the end of the IO request. So we introduce similar variables into the raid bio, and don't access those bbio variables any more. Signed-off-by: Miao Xie

Re: [PATCH] Btrfs: don't take the chunk_mutex/dev_list mutex in statfs V2

2014-11-14 Thread David Sterba
On Thu, Nov 06, 2014 at 09:35:17AM -0500, Josef Bacik wrote: Yeah, df is going to be slightly racy if you change alloc_start in the middle of it, which is fine since I'm going to kill alloc_start soon anyway. Go on, seems that there was never a real usecase for that. According to the

[PATCH] btrfs: fix wrong accounting of raid1 data profile in statfs

2014-11-14 Thread David Sterba
The sizes that are obtained from space infos are in raw units and have to be adjusted according to the raid factor. This was missing for f_bavail and df reported doubled size for raid1. Reported-by: Martin Steigerwald mar...@lichtvoll.de Fixes: ba7b6e62f420 (btrfs: adjust statfs calculations
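The adjustment described above is plain arithmetic: a raw byte count is divided by the number of copies the profile keeps. A minimal sketch follows, assuming a factor of 2 for the mirrored profiles and 1 otherwise; the `enum profile` values and both function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical profile list for illustration only. */
enum profile {
    PROFILE_SINGLE,
    PROFILE_RAID0,
    PROFILE_DUP,
    PROFILE_RAID1,
    PROFILE_RAID10,
};

/* Number of raw bytes consumed per byte of user data (assumed values). */
static unsigned raid_factor(enum profile p)
{
    switch (p) {
    case PROFILE_DUP:
    case PROFILE_RAID1:
    case PROFILE_RAID10:
        return 2; /* every byte is stored twice */
    default:
        return 1;
    }
}

/* Convert a raw free-space figure into what a user can actually write. */
static uint64_t usable_bytes(uint64_t raw_bytes, enum profile p)
{
    unsigned factor = raid_factor(p);

    assert(factor != 0);
    return raw_bytes / factor;
}
```

With 2 GiB of raw free space on raid1, `usable_bytes` yields 1 GiB; skipping the division is what would make df report the doubled size mentioned in the report.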

Re: [PATCH 0/9] Implement device scrub/replace for RAID56

2014-11-14 Thread Chris Mason
On Fri, Nov 14, 2014 at 8:50 AM, Miao Xie mi...@cn.fujitsu.com wrote: This patchset implements the device scrub/replace function for RAID56. Most of the implementation for the common data is similar to the other RAID types. The difference, and the difficulty, is the parity processing. In order to avoid

Re: [PATCH 2/9] Btrfs: remove unnecessary code of stripe_index assignment in __btrfs_map_block

2014-11-14 Thread David Sterba
On Fri, Nov 14, 2014 at 09:50:54PM +0800, Miao Xie wrote: From: Zhao Lei zhao...@cn.fujitsu.com stripe_index's value is set again in a later line: stripe_index = 0; Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com Signed-off-by: Miao Xie mi...@cn.fujitsu.com Reviewed-by: David Sterba

Re: [PATCH 1/9] Btrfs: remove noused bbio_ret in __btrfs_map_block in condition

2014-11-14 Thread David Sterba
On Fri, Nov 14, 2014 at 09:50:53PM +0800, Miao Xie wrote: From: Zhao Lei zhao...@cn.fujitsu.com bbio_ret in this condition is always !NULL because the previous code already has a check-and-skip: if (!bbio_ret) goto out; Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com

[RFC PATCH V9 01/17] Btrfs: subpagesize-blocksize: Get rid of whole page reads.

2014-11-14 Thread Chandan Rajendra
Based on original patch from Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com For the subpagesize-blocksize scenario, a page can contain multiple blocks. This patch handles this case. This patch also brings back check_page_locked() to reliably unlock pages in readpage's end bio function.

[RFC PATCH V9 00/17] Btrfs: Subpagesize-blocksize: Get rid of whole page I/O.

2014-11-14 Thread Chandan Rajendra
This patchset continues with the work posted earlier at http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg38775.html. Changes from V1: 1. Remove usage of bio_vec->bv_{len,offset} in end_bio_extent_readpage() and end_bio_extent_writepage(). Changes from V2: 1. Get __extent_writepage()

[RFC PATCH V9 02/17] Btrfs: subpagesize-blocksize: Get rid of whole page writes.

2014-11-14 Thread Chandan Rajendra
This commit brings back functions that set/clear EXTENT_WRITEBACK bits. These are required to reliably clear PG_writeback page flag. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/extent_io.c | 47 +++ fs/btrfs/inode.c | 40

[RFC PATCH V9 12/17] Btrfs: subpagesize-blocksize: Search for all ordered extents that could span across a page.

2014-11-14 Thread Chandan Rajendra
In subpagesize-blocksize scenario it is not sufficient to search using the first byte of the page to make sure that there are no ordered extents present across the page. Fix this. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/extent_io.c | 3 ++- fs/btrfs/inode.c |

[RFC PATCH V9 03/17] Btrfs: subpagesize-blocksize: __btrfs_buffered_write: Reserve/release extents aligned to block size.

2014-11-14 Thread Chandan Rajendra
Currently, the code reserves/releases extents in multiples of PAGE_CACHE_SIZE units. Fix this. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/file.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/file.c
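The difference between reserving in page-sized units and in block-sized units comes down to how the write range is rounded. The following is a rough sketch of that arithmetic, not the code from the patch: `reserve_bytes` and the two rounding helpers are hypothetical names, and the unit is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    return x & ~(align - 1);
}

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Bytes to reserve for a write of len bytes at file offset pos when space
 * is handed out in multiples of unit bytes.
 */
static uint64_t reserve_bytes(uint64_t pos, uint64_t len, uint64_t unit)
{
    assert(unit != 0 && (unit & (unit - 1)) == 0);
    if (len == 0)
        return 0;
    return round_up_pow2(pos + len, unit) - round_down_pow2(pos, unit);
}
```

For a 100-byte write at offset 2048, a 4096-byte unit reserves 4096 bytes while a 2048-byte unit reserves only 2048, which is the over-reservation that block-aligned accounting avoids.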

[RFC PATCH V9 13/17] Btrfs: subpagesize-blocksize: Deal with partial ordered extent allocations.

2014-11-14 Thread Chandan Rajendra
In subpagesize-blocksize scenario, extent allocations for only some of the dirty blocks of a page can succeed, while allocation for rest of the blocks can fail. This patch allows I/O against such partially allocated ordered extents to be submitted. Signed-off-by: Chandan Rajendra

[RFC PATCH V9 15/17] Btrfs: subpagesize-blocksize: Revert commit fc4adbff823f76577ece26dcb88bf6f8392dbd43.

2014-11-14 Thread Chandan Rajendra
In subpagesize-blocksize, we have multiple blocks in a page. Checking for existence of a page in the page cache isn't a sufficient check, since we could be truncating a subset of the blocks mapped by the page. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/btrfs_inode.h

[RFC PATCH V9 10/17] Btrfs: subpagesize-blocksize: fallocate: Work with sectorsized units.

2014-11-14 Thread Chandan Rajendra
While at it, this commit changes btrfs_truncate_page() to truncate sectorsized blocks instead of pages. Hence the function has been renamed to btrfs_truncate_block(). Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/ctree.h | 2 +- fs/btrfs/file.c | 41

[RFC PATCH V9 17/17] Btrfs: subpagesize-blocksize: Prevent writes to an extent buffer when PG_writeback flag is set.

2014-11-14 Thread Chandan Rajendra
In non-subpagesize-blocksize scenario, BTRFS_HEADER_FLAG_WRITTEN flag prevents Btrfs code from writing into an extent buffer whose pages are under writeback. This facility isn't sufficient for achieving the same in subpagesize-blocksize scenario, since we have more than one extent buffer mapped to

[RFC PATCH V9 09/17] Btrfs: subpagesize-blocksize: __extent_writepage: Write only dirty blocks of a page.

2014-11-14 Thread Chandan Rajendra
The code now loops across 'ordered extents' instead of 'extent maps' to figure out the dirty blocks of the page to be submitted for a write operation. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/extent_io.c | 74 1

[RFC PATCH V9 05/17] Btrfs: subpagesize-blocksize: Read tree blocks whose size is PAGE_CACHE_SIZE.

2014-11-14 Thread Chandan Rajendra
In the case of subpagesize-blocksize, this patch makes it possible to read only a single metadata block from the disk instead of all the metadata blocks that map into a page. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/disk-io.c | 45 +++--

[RFC PATCH V9 14/17] Btrfs: subpagesize-blocksize: Explicitly Track I/O status of blocks of an ordered extent.

2014-11-14 Thread Chandan Rajendra
In subpagesize-blocksize scenario a page can have more than one block. So in addition to PagePrivate2 flag, we would have to track the I/O status of each block of a page to reliably mark the ordered extent as complete. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com ---
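One way to picture per-block tracking is a small bitmap per page, with one bit for each block. The sketch below is an assumption-laden illustration and not the data structure from the patch: `struct page_io_state`, `PAGE_BYTES`, and the helper names are all hypothetical, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u /* assumed page size for this sketch */

/* Hypothetical per-page tracking state. */
struct page_io_state {
    uint32_t block_size;  /* bytes per block; must divide PAGE_BYTES */
    uint64_t done_bitmap; /* bit i is set once I/O for block i finished */
};

static unsigned blocks_per_page(const struct page_io_state *s)
{
    return PAGE_BYTES / s->block_size;
}

/* Record that I/O finished for the block containing offset_in_page. */
static void mark_block_done(struct page_io_state *s, uint32_t offset_in_page)
{
    assert(offset_in_page < PAGE_BYTES);
    s->done_bitmap |= 1ULL << (offset_in_page / s->block_size);
}

/* The page is complete only when every one of its blocks is marked done. */
static int page_io_complete(const struct page_io_state *s)
{
    unsigned n = blocks_per_page(s);
    uint64_t all = (n >= 64) ? ~0ULL : ((1ULL << n) - 1);

    return (s->done_bitmap & all) == all;
}
```

With 2048-byte blocks, finishing the first block alone does not complete the page; a single page-level flag cannot express that intermediate state.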

[RFC PATCH V9 11/17] Btrfs: subpagesize-blocksize: btrfs_page_mkwrite: Reserve space in sectorsized units.

2014-11-14 Thread Chandan Rajendra
In subpagesize-blocksize scenario, if i_size occurs in a block which is not the last block in the page, then the space to be reserved should be calculated appropriately. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/inode.c | 33 ++--- 1 file

[RFC PATCH V9 07/17] Btrfs: subpagesize-blocksize: Allow mounting filesystems where sectorsize != PAGE_SIZE

2014-11-14 Thread Chandan Rajendra
From: Chandra Seetharaman sekha...@us.ibm.com This patch allows mounting filesystems with blocksize smaller than the PAGE_SIZE. Signed-off-by: Chandra Seetharaman sekha...@us.ibm.com Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/disk-io.c | 6 -- 1 file changed, 6

[RFC PATCH V9 04/17] Btrfs: subpagesize-blocksize: Define extent_buffer_head.

2014-11-14 Thread Chandan Rajendra
From: Chandra Seetharaman sekha...@us.ibm.com In order to handle multiple extent buffers per page, first we need to create a way to handle all the extent buffers that are attached to a page. This patch creates a new data structure 'struct extent_buffer_head', and moves fields that are common to

[RFC PATCH V9 08/17] Btrfs: subpagesize-blocksize: Compute and look up csums based on sectorsized blocks.

2014-11-14 Thread Chandan Rajendra
Checksums are applicable to sectorsize units. The current code uses bio->bv_len units to compute and look up checksums. This works on machines where sectorsize == PAGE_CACHE_SIZE. This patch makes the checksum computation and lookup code work with sectorsize units. Signed-off-by: Chandan

[RFC PATCH V9 06/17] Btrfs: subpagesize-blocksize: Write only dirty extent buffers belonging to a page

2014-11-14 Thread Chandan Rajendra
For the subpagesize-blocksize scenario, this patch adds the ability to write a single extent buffer to the disk. Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com --- fs/btrfs/disk-io.c | 20 ++-- fs/btrfs/extent_io.c | 300 --- 2 files

[RFC PATCH V9 16/17] Btrfs: subpagesize-blocksize: Track blocks of ordered extent submitted for write I/O.

2014-11-14 Thread Chandan Rajendra
In the subpagesize-blocksize scenario, the following command (with 4k as the PAGE_SIZE and 2k as the block size) can cause false accounting of blocks of an ordered extent that is written to disk: $ xfs_io -f -c "pwrite 0 10240" \ -c "sync_range 0 4096" \ -c "sync_range 8192 2048" \ -c "pwrite 10240 2048

Re: [PATCH 2/2 v4] btrfs-progs: optimize btrfs_scan_lblkid() for multiple calls

2014-11-14 Thread David Sterba
On Tue, Nov 11, 2014 at 12:47:35PM +0100, Karel Zak wrote: What I see as critical is the missing ./configure, because it's pretty ugly to add hardcoded dependencies (e.g. libudev), there are also no checks for other libs, the Makefile does not care about the place where libs are installed, header files,

Re: soft lockup - CPU#0 stuck - Kernel 3.17.2

2014-11-14 Thread Chris Mason
On Fri, Nov 14, 2014 at 1:23 AM, Patrick Schmid sch...@phys.ethz.ch wrote: Hi Chris, here comes the kernel log from this night. It's from the beginning of the lockups until reboot. [ log file that was too big for vger ] What kind of machine is this? I'm trying to reproduce here. Thanks, Chris

Re: soft lockup - CPU#0 stuck - Kernel 3.17.2

2014-11-14 Thread Patrick Schmid
What kind of machine is this? I'm trying to reproduce here. The frontend hardware is a 24 core Xeon E5-2620 on an Intel S2600GZ board with 128 GiB RAM, the NIC is an Intel X520-DA2 10Gb/s SFP+, running Ubuntu 14.04LTS. Do you need more info? Patrick

Re: soft lockup - CPU#0 stuck - Kernel 3.17.2

2014-11-14 Thread Chris Mason
On Fri, Nov 14, 2014 at 1:23 PM, Patrick Schmid sch...@phys.ethz.ch wrote: What kind of machine is this? I'm trying to reproduce here. The frontend hardware is a 24 core Xeon E5-2620 on an Intel S2600GZ board with 128 GiB RAM, the NIC is an Intel X520-DA2 10Gb/s SFP+, running Ubuntu

Suggestion on reducing short kernel hangs from my btrfs filesystems: bcache?

2014-11-14 Thread Marc MERLIN
I have a server which runs zoneminder (video recording which is CPU and disk IO intensive) while also doing a bunch of I/O over serial ports. I have a dual core Intel(R) Core(TM) i3-2100T CPU @ 2.50GHz (4 virtual CPUs in /proc/cpuinfo). It's pretty clear that when zoneminder is doing more work,

Two persistent problems

2014-11-14 Thread Hugo Mills
Chris, Josef, anyone else who's interested, On IRC, I've been seeing reports of two persistent unsolved problems. Neither is showing up very often, but both have turned up often enough to indicate that there's something specific going on worthy of investigation. One of them is

Re: Two persistent problems

2014-11-14 Thread Josef Bacik
On 11/14/2014 04:51 PM, Hugo Mills wrote: Chris, Josef, anyone else who's interested, On IRC, I've been seeing reports of two persistent unsolved problems. Neither is showing up very often, but both have turned up often enough to indicate that there's something specific going on worthy

Re: soft lockup - CPU#0 stuck - Kernel 3.17.2

2014-11-14 Thread Chris Mason
On Fri, Nov 14, 2014 at 1:31 PM, Chris Mason c...@fb.com wrote: On Fri, Nov 14, 2014 at 1:23 PM, Patrick Schmid sch...@phys.ethz.ch wrote: What kind of machine is this? I'm trying to reproduce here. The frontend hardware is a 24 core Xeon E5-2620 on an Intel S2600GZ board with 128

Re: [PATCH 0/2] add progress indicator to btrfs-convert

2014-11-14 Thread David Sterba
On Sun, Nov 09, 2014 at 11:16:54PM +0100, Silvio Fricke wrote: I have tried an ext4->btrfs conversion of a 1TB (90% full) and have seen that btrfs has no progress bar or something like that. Furthermore I have found [1] a project idea for this. I have tried a generic tasklet-based solution

btrfs-progs: ARGV0_BUF_SIZE causes problems with tests

2014-11-14 Thread WorMzy Tykashi
Hi guys, I found a bit of a weird corner-case today. [1] It seems that, due to the use of a 64-byte constant (ARGV0_BUF_SIZE) in utils.c, some tests fail with a buffer overflow detected error if the progs are built in a location with a sufficiently long path. For example: clone the btrfs-progs
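The failure mode described here, a long path copied into a fixed 64-byte buffer, can be avoided by a bounded copy that truncates. This is a minimal sketch of that general technique, not the fix that went into btrfs-progs: `copy_argv0` is a hypothetical helper, and only the `ARGV0_BUF_SIZE` name and its 64-byte value come from the report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ARGV0_BUF_SIZE 64 /* the 64-byte constant mentioned in the report */

/*
 * Copy a program path into a fixed-size buffer, truncating instead of
 * overflowing. Returns 1 if the whole path fit, 0 if it was truncated.
 */
static int copy_argv0(char buf[ARGV0_BUF_SIZE], const char *path)
{
    int needed;

    assert(buf != NULL && path != NULL);
    /* snprintf never writes more than ARGV0_BUF_SIZE bytes, NUL included. */
    needed = snprintf(buf, ARGV0_BUF_SIZE, "%s", path);
    return needed >= 0 && needed < ARGV0_BUF_SIZE;
}
```

A caller can treat a return value of 0 as a signal that the displayed name was shortened, rather than letting a fortified libc abort the program.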