[patch] btrfs: send: silence an integer overflow warning

2016-04-12 Thread Dan Carpenter
The "sizeof(*arg->clone_sources) * arg->clone_sources_count" expression can overflow. It causes several static checker warnings. It's all under CAP_SYS_ADMIN so it's not that serious, but let's silence the warnings. Signed-off-by: Dan Carpenter diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c inde
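The overflow described in this snippet can be guarded against by checking the operands before multiplying. What follows is a minimal user-space sketch, not the change made by the patch (the diff is truncated in this listing). The helper name `checked_array_size` is hypothetical, and the count is assumed to be a 64-bit value supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the patch): compute
 * elem_size * count into *out, refusing any product that would
 * not fit in size_t.
 * Returns true on success, false if the multiplication would overflow.
 */
static bool checked_array_size(size_t elem_size, uint64_t count, size_t *out)
{
    /* Dividing first avoids performing the overflowing multiply at all. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;

    *out = elem_size * (size_t)count;
    return true;
}
```

A caller would reject the request (for example with an invalid-argument error) whenever the helper returns false, and only then pass `*out` on to an allocator.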

Re: enospace regression in 4.4

2016-04-12 Thread Duncan
Julian Taylor posted on Tue, 12 Apr 2016 17:52:57 +0200 as excerpted: > $ sudo btrfs fi balance start -dusage=0 . > ERROR: error during balancing '.': No space left on device Not much to add, but this one really surprises me and it may be related to the new problem you're seeing. I don't recall

Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace

2016-04-12 Thread Yauhen Kharuzhy
On Tue, Apr 12, 2016 at 10:15:50PM +0800, Anand Jain wrote: > Thanks for various comments, tests and feedback. Seems working for me. I have triggered OOM killer while testing this in VirtualBox but I don't think that it is related to autoreplace, it seems to be scrub implementation issue: [ 449

Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-12 Thread Chris Murphy
On Tue, Apr 12, 2016 at 9:48 AM, lenovomi wrote: > root@ubuntu:/home/ubuntu# btrfs restore -D -v /dev/sda /mnt/usb/ > checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC > checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC > checksum verify failed on 1780

Re: [PATCH 01/13] btrfs: Introduce a new function to check if all chunks a OK for degraded mount

2016-04-12 Thread Yauhen Kharuzhy
On Tue, Apr 12, 2016 at 10:15:51PM +0800, Anand Jain wrote: > From: Qu Wenruo > > Introduce a new function, btrfs_check_degradable(), to judge if all chunks > in btrfs is OK for degraded mount. > > It provides the new basis for accurate btrfs mount/remount and even > runtime degraded mount check

Re: enospace regression in 4.4

2016-04-12 Thread Julian Taylor
On 12.04.2016 20:09, Henk Slager wrote: > On Tue, Apr 12, 2016 at 5:52 PM, Julian Taylor > wrote: >> smaller testcase that shows the immediate enospc after fallocate -> rm, >> though I don't know if it is really related to the full filesystem >> bugging out as the balance does work if you wait a f

Re: loop subsystem corrupted after mounting multiple btrfs sub-volumes

2016-04-12 Thread Stanislav Brabec
On Feb 26, 2016 at 21:30 Al Viro wrote: Sigh... sys_mount() (mount_bdev(), actually) has no way to tell if two loop devices refer to the same underlying object. As far as it's concerned, you are asking to mount a completely unrelated block device. Which just happens to see the data (living in s

Re: enospace regression in 4.4

2016-04-12 Thread Henk Slager
On Tue, Apr 12, 2016 at 5:52 PM, Julian Taylor wrote: > smaller testcase that shows the immediate enospc after fallocate -> rm, > though I don't know if it is really related to the full filesystem > bugging out as the balance does work if you wait a few seconds after the > balance. > But this sequ

Re: [PATCH] Btrfs: track transid for delayed ref flushing

2016-04-12 Thread Josef Bacik
On 04/12/2016 01:43 PM, Liu Bo wrote: On Mon, Apr 11, 2016 at 05:37:40PM -0400, Josef Bacik wrote: Using the offwakecputime bpf script I noticed most of our time was spent waiting on the delayed ref throttling. This is what is supposed to happen, but sometimes the transaction can commit and the

Re: [PATCH] Btrfs: track transid for delayed ref flushing

2016-04-12 Thread Liu Bo
On Mon, Apr 11, 2016 at 05:37:40PM -0400, Josef Bacik wrote: > Using the offwakecputime bpf script I noticed most of our time was spent > waiting > on the delayed ref throttling. This is what is supposed to happen, but > sometimes the transaction can commit and then we're waiting for throttling

Re: scrub: Tree block spanning stripes, ignored

2016-04-12 Thread Ivan P
Feel free to send me that modified btrfsck when you finish it, I'm open for experiments as long as I have my backup copy. Regards, Ivan. On Mon, Apr 11, 2016 at 3:10 AM, Qu Wenruo wrote: > There seems to be something wrong with btrfsck. > > Not sure if it's kernel clear_cache mount option or btr

Re: [PATCH] Btrfs: do not create empty block group if we have allocated data

2016-04-12 Thread Liu Bo
On Mon, Apr 11, 2016 at 05:02:18PM +0200, David Sterba wrote: > On Mon, Dec 14, 2015 at 06:29:32PM -0800, Liu Bo wrote: > > Now we force to create empty block group to keep data profile alive, > > however, in the below example, we eventually get an empty block group > > while we're trying to get mo

[PATCH] Btrfs: remove BUG_ON()'s in btrfs_map_block

2016-04-12 Thread Josef Bacik
btrfs_map_block can go horribly wrong in the face of fs corruption, let's agree to not be assholes and panic at any possible chance things are all fucked up. Signed-off-by: Josef Bacik --- fs/btrfs/volumes.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git
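The general idea of this patch, returning an error to the caller instead of panicking when corrupted data is found, can be sketched in user-space C. This is an illustration only: the diff is truncated in this listing, and `struct map_model` and `map_lookup_checked` are hypothetical names, not kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for mapping data that may be corrupted on disk. */
struct map_model {
    int num_stripes;   /* number of stripes the mapping claims to have */
    int stripe_index;  /* stripe selected for this request */
};

/*
 * Validate the mapping and report corruption through the return value.
 * A BUG_ON()-style assertion would halt the machine at this point;
 * returning a negative errno lets the caller fail the request cleanly.
 */
static int map_lookup_checked(const struct map_model *map)
{
    if (map == NULL)
        return -EINVAL;
    if (map->num_stripes <= 0)
        return -EIO;
    if (map->stripe_index < 0 || map->stripe_index >= map->num_stripes)
        return -EIO;
    return 0;
}
```

The design choice is that a damaged filesystem should produce a failed I/O request that the caller can handle, rather than taking down the whole system.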

Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-12 Thread lenovomi
One more thing, root@ubuntu:/home/ubuntu# btrfs fi show Label: 'universe' uuid: 51e1c933-a39d-4bff-9cf7-f369b4b5d414 Total devices 4 FS bytes used 10.75TiB devid1 size 2.73TiB used 2.70TiB path /dev/sda devid2 size 2.73TiB used 2.70TiB path /dev/sdb devid

Re: enospace regression in 4.4

2016-04-12 Thread Julian Taylor
smaller testcase that shows the immediate enospc after fallocate -> rm, though I don't know if it is really related to the full filesystem bugging out as the balance does work if you wait a few seconds after the balance. But this sequence of commands did work in 4.2. $ sudo btrfs fi show /dev/map

Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-12 Thread lenovomi
Hi Chris, i tried these: 1) mount -o ro,recovery /dev/sda /mnt/usb [ 1707.971925] BTRFS info (device sdd): enabling auto recovery [ 1707.971933] BTRFS info (device sdd): disk space caching is enabled [ 1708.005073] BTRFS: sdd checksum verify failed on 17802818387968 wanted BFB02AEC found FF4

Re: [PATCH 10/13] btrfs: introduce helper functions to perform hot replace

2016-04-12 Thread kbuild test robot
Hi Anand, [auto build test ERROR on btrfs/next] [also build test ERROR on v4.6-rc3 next-20160412] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Anand-Jain/Introduce-device-state-failed-spare

Btrfs fails desatrerous on fuzzy tests

2016-04-12 Thread Juergen Sauer
Hi! do you know this paper ? http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016.pdf It was rushing through the Linux press sites in Germany, see also [german]: http://www.pro-linux.de/news/1/23449/fuzzy-test-f%C3%BCr-dateisysteme-vorgestellt.

[PATCH 09/13] btrfs: provide framework to get and put a spare device

2016-04-12 Thread Anand Jain
From: Anand Jain This adds functions to get and put a spare device from the list, so that the hot replace code can pick a spare device when needed. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/ctree.h | 1 + fs/btrfs/super.c | 5 + fs/btrfs/volumes.c | 53 ++

[PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace

2016-04-12 Thread Anand Jain
Thanks for various comments, tests and feedback. Background: Spare device and Auto replace: Spare device is predominantly used to narrow the time window of a degraded raid mode, because any further disk failure during that window would lead to catastrophic data loss. Data center sto

[PATCH 01/13] btrfs: Introduce a new function to check if all chunks a OK for degraded mount

2016-04-12 Thread Anand Jain
From: Qu Wenruo Introduce a new function, btrfs_check_degradable(), to judge if all chunks in btrfs are OK for degraded mount. It provides the new basis for accurate btrfs mount/remount and even runtime degraded mount checks, rather than the old one-size-fits-all method. Signed-off-by: Qu Wenruo --- f
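The per-chunk idea can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the body of btrfs_check_degradable() (the diff is truncated in this listing). `struct chunk_model` and `all_chunks_degradable` are hypothetical, and each chunk is assumed to carry its own count of missing devices and the number of failures its profile tolerates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a chunk; not the kernel's structures. */
struct chunk_model {
    int missing_devices;     /* stripes of this chunk whose device is absent */
    int tolerated_failures;  /* assumed: 0 for single, 1 for raid1 */
};

/*
 * Return true only if every chunk can still be used despite its own
 * missing devices. A single global threshold would instead compare the
 * total number of missing devices against the weakest profile.
 */
static bool all_chunks_degradable(const struct chunk_model *chunks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (chunks[i].missing_devices > chunks[i].tolerated_failures)
            return false;
    }
    return true;
}
```

In this model, a filesystem with a raid1 chunk that lost one device and a single-profile chunk that lost none passes the check, while a single-profile chunk sitting on the missing device fails it.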

[PATCH 11/13] btrfs: introduce device dynamic state transition to offline or failed

2016-04-12 Thread Anand Jain
From: Anand Jain This patch provides helper functions to force a device to offline or failed. We need these device states for the following reasons: 1) a. it can be reported that a device has failed when it does b. close the device when it goes offline so that the block layer can clean up 2)

[PATCH 07/13] btrfs: add check not to mount a spare device

2016-04-12 Thread Anand Jain
From: Anand Jain Spare devices can be scanned but shouldn't be mountable. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/disk-io.c | 8 1 file changed, 8 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 65c9f19d8017..e9fca3bc7e42 10064

[PATCH 10/13] btrfs: introduce helper functions to perform hot replace

2016-04-12 Thread Anand Jain
From: Anand Jain Hot replace / auto replace is an important volume manager feature and is critical to data center operations, so that the degraded volume can be brought back to a healthy state at the earliest and without manual intervention. This modifies the existing replace code to suit the

[PATCH 12/13] btrfs: check device for critical errors and mark failed

2016-04-12 Thread Anand Jain
From: Anand Jain Write and Flush errors are considered as critical errors, upon which the device will be brought offline and marked as failed. Write and Flush errors are identified using device error statistics. This is monitored using a kthread btrfs_health. Signed-off-by: Anand Jain Tested-by

Re: [PATCH 10/13] btrfs: introduce helper functions to perform hot replace

2016-04-12 Thread Anand Jain
On 04/09/2016 06:05 AM, Yauhen Kharuzhy wrote: On Sat, Apr 02, 2016 at 09:30:48AM +0800, Anand Jain wrote: Hot replace / auto replace is an important volume manager feature and is critical to data center operations, so that the degraded volume can be brought back to a healthy state at the ear

Re: Global hotspare functionality

2016-04-12 Thread Anand Jain
On 04/05/2016 03:32 AM, Yauhen Kharuzhy wrote: 2016-04-01 18:15 GMT-07:00 Anand Jain : Issue 2. At the start of autoreplacing a drive with a hotspare, the kernel crashes in the transaction handling code (inside btrfs_commit_transaction() called by the autoreplace initiating routines). I 'fixed' this by removing o

[PATCH 05/13] btrfs: Cleanup num_tolerated_disk_barrier_failures

2016-04-12 Thread Anand Jain
From: Qu Wenruo As we now use the per-chunk degradable check, the global num_tolerated_disk_barrier_failures is of no use, so clean it up. Signed-off-by: Qu Wenruo [Btrfs: resolve conflict to apply 'btrfs: Cleanup num_tolerated_disk_barrier_failures'] Signed-off-by: Anand Jain --- fs/btrfs/ctree

[PATCH 03/13] btrfs: Do per-chunk degraded check for remount

2016-04-12 Thread Anand Jain
From: Qu Wenruo Just as for the mount time check, use the new btrfs_check_degraded() to do the per-chunk check. Signed-off-by: Qu Wenruo Btrfs: use btrfs_error instead of btrfs_err during remount Signed-off-by: Anand Jain --- fs/btrfs/super.c | 11 +++ 1 file changed, 7 insertions(+), 4 d

[PATCH 08/13] btrfs: support btrfs dev scan for spare device

2016-04-12 Thread Anand Jain
From: Anand Jain When the user or system calls the BTRFS_IOC_SCAN_DEV ioctl, this patch will make sure the device is added to the device list and set as spare. The operation is the same for BTRFS_IOC_DEVICES_READY, since the BTRFS_IOC_DEVICES_READY ioctl has been doing that by legacy. Signed-o

[PATCH 02/13] btrfs: Do per-chunk check for mount time check

2016-04-12 Thread Anand Jain
From: Qu Wenruo Now use the btrfs_check_degraded() to do mount time degraded check. With this patch, now we can mount with the following case: # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc # wipefs -a /dev/sdc # mount /dev/sdb /mnt/btrfs -o degraded As the single data chunk is only in

[PATCH 04/13] btrfs: Allow barrier_all_devices to do per-chunk device check

2016-04-12 Thread Anand Jain
From: Qu Wenruo The last user of num_tolerated_disk_barrier_failures is barrier_all_devices(). But it can easily be changed to the new per-chunk degradable check framework. Now btrfs_device will have two extra members, representing send/wait errors, set at write_dev_flush() time. And then check it

[PATCH 06/13] btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV

2016-04-12 Thread Anand Jain
From: Anand Jain Add the BTRFS_FEATURE_INCOMPAT_SPARE_DEV (400) flag to identify a spare device. Along with this, it checks in the mount context so that a spare device will fail to mount, as spare devices aren't mountable. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/ctree

[PATCH 13/13] btrfs: check for failed device and hot replace

2016-04-12 Thread Anand Jain
From: Anand Jain This patch checks for a failed device and kicks off auto replace. If the user decides to disable auto replace, that can be done through a future sysfs or ioctl interface that sets the fs_info->no_auto_replace parameter to 1. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn ---

[PATCH 1/1] btrfs: fix lock dep warning move scratch super outside of chunk_mutex

2016-04-12 Thread Anand Jain
Move scratch super outside of the chunk lock to avoid the lockdep warning below. A better place to scratch super is in the function btrfs_rm_dev_replace_free_srcdev() just before free_device, which is outside of the chunk lock as well. To reproduce: (fresh boot) mkfs.btrfs -f -draid5 -mraid5 /de

Re: Hi

2016-04-12 Thread Stevenson, Marjorie
How are you doing today? I sent you an email but I am yet to get your response, so I am sending you a reminder.

enospace regression in 4.4

2016-04-12 Thread Julian Taylor
hi, I have a system with two filesystems which are both affected by the notorious enospace bug when there is plenty of unallocated space available. The system is a raid0 on two 900 GiB disks and an iscsi single/dup 1.4TiB. To deal with the problem I use a cronjob that uses fallocate to give me an a

[PATCH] fstests: btrfs/091: Disable compress to avoid output dismatch

2016-04-12 Thread Qu Wenruo
If btrfs/091 is run with the "-o compress=lzo" mount option, the test case will fail, as compression makes extents much smaller on disk, making the output different from the golden output. As this test case is only testing qgroup, not compression, disable compression manually in the test case. Signed-off-by: Qu Wenruo ---

[PATCH v2] btrfs: qgroup: Fix qgroup accounting when creating snapshot

2016-04-12 Thread Qu Wenruo
Current btrfs qgroup design implies a requirement that after calling btrfs_qgroup_account_extents() there must be a commit root switch. Normally this is OK, as btrfs_qgroup_account_extents() is only called inside btrfs_commit_transaction() just before commit_cowonly_roots(). However there is a exc

Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-12 Thread lenovomi
Hi Chris, I tried mount with ro.recovery and again kernel panic: https://bpaste.net/show/895089db279a https://bpaste.net/show/f3cf84532e26 2) I tried to execute restore but these are results: https://bpaste.net/show/191e87b20a54 3) should i run repair? thanks On Tue, Apr 12, 2016 at 5:43 A

Lockdep warning when running btrfs/114

2016-04-12 Thread Qu Wenruo
Hi, when debugging the qgroup problem, I found the following lockdep warning output when running btrfs/114. It seems to be easier to trigger if all qgroup tests are run in a row (-g qgroup). The source is the integration-4.6 branch *WITHOUT* my qgroup fix patch v2 (not submitted yet). Just post