Re: Another ENOSPC situation

2016-04-01 Thread Chris Murphy
On Fri, Apr 1, 2016 at 10:55 PM, Duncan <1i5t5.dun...@cox.net> wrote: > Marc Haber posted on Fri, 01 Apr 2016 15:40:29 +0200 as excerpted: >> [4/502]mh@swivel:~$ sudo btrfs fi usage / >> Overall: >> Device size: 600.00GiB >> Device allocated:600.00GiB >> Dev

Re: [PATCH 10/13] btrfs: introduce helper functions to perform hot replace

2016-04-01 Thread kbuild test robot
Hi Anand, [auto build test ERROR on btrfs/next] [also build test ERROR on v4.6-rc1 next-20160401] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Anand-Jain/Introduce-device-state-failed-Hot

Re: Another ENOSPC situation

2016-04-01 Thread Duncan
Marc Haber posted on Fri, 01 Apr 2016 15:40:29 +0200 as excerpted: > Hi, > > just for a change, this is another btrfs on a different host. The host > is also running Debian unstable with mainline kernels, the btrfs in > question was created (not converted) in March 2015 with btrfs-tools > 3.17. I

Re: Global hotspare functionality

2016-04-01 Thread Anand Jain
On 04/02/2016 09:33 AM, Yauhen Kharuzhy wrote: On Sat, Apr 02, 2016 at 09:15:56AM +0800, Anand Jain wrote: On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote: On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote: Hi Yauhen, Issue 2. At start of autoreplacing drive by hotspare, kernel

Re: Global hotspare functionality

2016-04-01 Thread Yauhen Kharuzhy
On Sat, Apr 02, 2016 at 09:15:56AM +0800, Anand Jain wrote: > > > On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote: > >On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote: > >> > >>Hi Yauhen, > >> > > > >>> > >>>Issue 2. > >>>At start of autoreplacing drive by hotspare, kernel crashes in trans

[PATCH 00/13 v3] Introduce device state 'failed', Hot spare and Auto replace

2016-04-01 Thread Anand Jain
Thanks for various comments, tests and feedback. Background: Hot spare and Auto replace: Hot spare is predominately used to mitigate or narrow the time window of a degraded mode, during which any further disk failure might lead to a catastrophic data loss. Data center storage generally will ha

[PATCH 10/13] btrfs: introduce helper functions to perform hot replace

2016-04-01 Thread Anand Jain
Hot replace / auto replace is an important volume manager feature and is critical to data center operations, so that the degraded volume can be brought back to a healthy state at the earliest and without manual intervention. This modifies the existing replace code to suit the need of auto replac

[PATCH 02/13] btrfs: Do per-chunk check for mount time check

2016-04-01 Thread Anand Jain
From: Qu Wenruo Now use the btrfs_check_degraded() to do mount time degraded check. With this patch, now we can mount with the following case: # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc # wipefs -a /dev/sdc # mount /dev/sdb /mnt/btrfs -o degraded As the single data chunk is only in
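
The scenario in this message (raid1 metadata, single data, one of two devices wiped) can be modelled in a few lines. The following is only a minimal sketch of the per-chunk idea, not the kernel code: the names `Chunk`, `fs_degradable` and `global_check`, the tolerance table, and the assumption that the single data chunk lives only on /dev/sdb are all illustrative.

```python
from dataclasses import dataclass

# Device failures each profile can survive (simplified, assumed for illustration).
TOLERATED_FAILURES = {
    "single": 0, "dup": 0, "raid0": 0,
    "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2,
}


@dataclass(frozen=True)
class Chunk:
    profile: str
    devices: tuple  # devices that hold this chunk's stripes


def chunk_degradable(chunk, missing):
    """A chunk is usable if it lost no more stripes than its profile tolerates."""
    lost = sum(1 for dev in chunk.devices if dev in missing)
    return lost <= TOLERATED_FAILURES[chunk.profile]


def fs_degradable(chunks, missing):
    """Per-chunk check: every chunk must survive the missing devices."""
    return all(chunk_degradable(c, missing) for c in chunks)


def global_check(chunks, missing):
    """Hypothetical one-size-fits-all check: the weakest profile decides."""
    tolerance = min(TOLERATED_FAILURES[c.profile] for c in chunks)
    return len(missing) <= tolerance


# mkfs.btrfs -m raid1 -d single /dev/sdb /dev/sdc, then /dev/sdc is wiped.
chunks = [
    Chunk("raid1", ("/dev/sdb", "/dev/sdc")),  # metadata
    Chunk("single", ("/dev/sdb",)),            # data, assumed to sit on sdb only
]
missing = {"/dev/sdc"}

print(fs_degradable(chunks, missing))  # per-chunk verdict
print(global_check(chunks, missing))   # global verdict
```

In this model the per-chunk check accepts the degraded mount because no chunk lost more stripes than its profile tolerates, while the global check refuses it because the single profile tolerates no missing device at all.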

[PATCH 03/13] btrfs: Do per-chunk degraded check for remount

2016-04-01 Thread Anand Jain
From: Qu Wenruo Just the same for mount time check, use new btrfs_check_degraded() to do per chunk check. Signed-off-by: Qu Wenruo Btrfs: use btrfs_error instead of btrfs_err during remount Signed-off-by: Anand Jain --- fs/btrfs/super.c | 11 +++ 1 file changed, 7 insertions(+), 4 d

[PATCH 01/13] btrfs: Introduce a new function to check if all chunks a OK for degraded mount

2016-04-01 Thread Anand Jain
From: Qu Wenruo Introduce a new function, btrfs_check_degradable(), to judge if all chunks in btrfs are OK for degraded mount. It provides the new basis for accurate btrfs mount/remount and even runtime degraded mount check other than old one-size-fits-all method. Signed-off-by: Qu Wenruo --- f

[PATCH 07/13] btrfs: add check not to mount a spare device

2016-04-01 Thread Anand Jain
Spare devices can be scanned but shouldn't be mountable. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/disk-io.c | 8 1 file changed, 8 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 7f02f1766037..b99329e37965 100644 --- a/fs/btrfs/di

[PATCH 08/13] btrfs: support btrfs dev scan for spare device

2016-04-01 Thread Anand Jain
When the user or system calls the BTRFS_IOC_SCAN_DEV ioctl, this patch will make sure the device is added to the device list and set as spare. This operation will be the same for BTRFS_IOC_DEVICES_READY as well, since the BTRFS_IOC_DEVICES_READY ioctl has been doing that by legacy. Signed-off-by: Anand Jain

[PATCH 05/13] btrfs: Cleanup num_tolerated_disk_barrier_failures

2016-04-01 Thread Anand Jain
From: Qu Wenruo As we use per-chunk degradable check, now the global num_tolerated_disk_barrier_failures is of no use. So clean it up. Signed-off-by: Qu Wenruo [Btrfs: resolve conflict to apply 'btrfs: Cleanup num_tolerated_disk_barrier_failures'] Signed-off-by: Anand Jain --- fs/btrfs/ctree

[PATCH 11/13] btrfs: introduce device dynamic state transition to offline or failed

2016-04-01 Thread Anand Jain
This patch provides helper functions to force a device to offline or failed, and we need these device states for the following reasons, 1) a. it can be reported that device has failed when it does b. close the device when it goes offline so that blocklayer can cleanup 2) identify the candid

[PATCH 06/13] btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV

2016-04-01 Thread Anand Jain
Add BTRFS_FEATURE_INCOMPAT_SPARE_DEV (400) flag to identify a spare device. Along with this it checks in the mount context that a spare device will fail to mount, as spare devices aren't mountable. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/ctree.h | 4 +++- 1 file

[PATCH 04/13] btrfs: Allow barrier_all_devices to do per-chunk device check

2016-04-01 Thread Anand Jain
From: Qu Wenruo The last user of num_tolerated_disk_barrier_failures is barrier_all_devices(). But it can be easily changed to the new per-chunk degradable check framework. Now btrfs_device will have two extra members, representing send/wait error, set at write_dev_flush() time. And then check it

[PATCH 09/13] btrfs: provide framework to get and put a spare device

2016-04-01 Thread Anand Jain
This adds functions to get and put a spare device from the list, so that hot replace code can pick a spare device when needed. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/ctree.h | 1 + fs/btrfs/super.c | 5 + fs/btrfs/volumes.c | 53 +

[PATCH 13/13] btrfs: check for failed device and hot replace

2016-04-01 Thread Anand Jain
This patch checks for a failed device and kicks off auto replace. If the user decides to disable auto replace, that can be done through a future sysfs or future ioctl interface that sets the fs_info->no_auto_replace parameter to 1. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- fs/btrfs/ctree.h

[PATCH 12/13] btrfs: check device for critical errors and mark failed

2016-04-01 Thread Anand Jain
Write and Flush errors are considered as critical errors, upon which the device will be brought offline and marked as failed. Write and Flush errors are identified using device error statistics. This is monitored using a kthread btrfs_health. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelg

Re: Global hotspare functionality

2016-04-01 Thread Anand Jain
On 03/31/2016 06:17 AM, Yauhen Kharuzhy wrote: On Tue, Mar 29, 2016 at 10:40:40PM +0300, Yauhen Kharuzhy wrote: Hi. I am testing hotspare v2 on kernel v4.4.5 (I will try latest Chris' tree later) now with lockdep debugging enabled. At starting of replacement, lockdep warning is displayed, be

Re: Global hotspare functionality

2016-04-01 Thread Anand Jain
On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote: On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote: Hi Yauhen, Issue 2. At start of autoreplacing drive by hotspare, kernel crashes in transaction handling code (inside of btrfs_commit_transaction() called by autoreplace initiating r

Re: [PATCH 12/12] btrfs: check device for critical errors and mark failed

2016-04-01 Thread Anand Jain
On 03/30/2016 08:49 AM, Yauhen Kharuzhy wrote: On Tue, Mar 29, 2016 at 10:22:29PM +0800, Anand Jain wrote: Write and Flush errors are considered as critical errors, upon which the device will be brought offline and marked as failed. Write and Flush errors are identified using device error stat

Re: [PATCH 12/12] btrfs: check device for critical errors and mark failed

2016-04-01 Thread Anand Jain
On 03/30/2016 06:41 AM, Yauhen Kharuzhy wrote: On Tue, Mar 29, 2016 at 10:22:29PM +0800, Anand Jain wrote: Write and Flush errors are considered as critical errors, upon which the device will be brought offline and marked as failed. Write and Flush errors are identified using device error stat

Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager
On Fri, Apr 1, 2016 at 10:40 PM, Marc Haber wrote: > On Fri, Apr 01, 2016 at 09:20:52PM +0200, Henk Slager wrote: >> On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber >> wrote: >> > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote: >> >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager w

[GIT PULL] Btrfs

2016-04-01 Thread Chris Mason
Hi Linus, My for-linus-4.6 branch: git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus-4.6 Has a few fixes Dave Sterba had queued up. These are all pretty small, but since they were tested I decided against waiting for more: Alex Lyakas (2) commits (+18/-10): btr

RE: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

2016-04-01 Thread James Johnston
> I grabbed this part from the log after the machine crashed again > following trying to transfer a bunch of files that included ones with > csum errors, let me know if this looks like the same issue you were > having: > Idk? You hit a soft lockup, mine got a "kernel BUG at..." Your stack trace

Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber
On Fri, Apr 01, 2016 at 09:20:52PM +0200, Henk Slager wrote: > On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber > wrote: > > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote: > >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote: > >> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber

[PATCH 0/8] btrfs: uapi migration for user-visible API components

2016-04-01 Thread Jeff Mahoney
commit 55e301fd57a (Btrfs: move fs/btrfs/ioctl.h to include/uapi/linux/btrfs.h) was intended to make the ioctl definitions available to userspace. Unfortunately, moving just that file wasn't enough and many of the ioctls aren't actually usable without the userspace programmer filling in the gaps.

[PATCH 4/8] btrfs: uapi/linux/btrfs.h migration, move feature flags

2016-04-01 Thread Jeff Mahoney
The compat/compat_ro/incompat feature flags are used by the feature set/get ioctls. Signed-off-by: Jeff Mahoney --- fs/btrfs/ctree.h | 25 - include/uapi/linux/btrfs.h | 31 +++ 2 files changed, 31 insertions(+), 25 deletions(-) diff

[PATCH 6/8] btrfs: uapi/linux/btrfs.h migration, move struct btrfs_ioctl_defrag_range_args

2016-04-01 Thread Jeff Mahoney
struct btrfs_ioctl_defrag_range_args is used by the BTRFS_IOC_DEFRAG_RANGE ioctl. Signed-off-by: Jeff Mahoney --- fs/btrfs/ctree.h | 31 --- include/uapi/linux/btrfs.h | 38 +- 2 files changed, 37 insertions(+), 32 deletio

[PATCH 2/8] btrfs: uapi/linux/btrfs.h migration, qgroup limit flags

2016-04-01 Thread Jeff Mahoney
The BTRFS_QGROUP_LIMIT_* flags are required to tell the kernel which fields are valid when using the BTRFS_IOC_QGROUP_LIMIT ioctl. Signed-off-by: Jeff Mahoney --- fs/btrfs/ctree.h | 8 include/uapi/linux/btrfs.h | 22 +- 2 files changed, 21 insertions(+),

[PATCH 7/8] btrfs: uapi/linux/btrfs_tree.h migration, item types and defines

2016-04-01 Thread Jeff Mahoney
The BTRFS_IOC_SEARCH_TREE ioctl returns file system items directly to userspace. In order to decode them, full type information is required. Create a new header, btrfs_tree to contain these since most users won't need them. Signed-off-by: Jeff Mahoney --- fs/btrfs/ctree.h| 949

[PATCH 3/8] btrfs: uapi/linux/btrfs.h migration, document subvol flags

2016-04-01 Thread Jeff Mahoney
Signed-off-by: Jeff Mahoney --- include/uapi/linux/btrfs.h | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 9651af3..0316e23 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h

[PATCH 8/8] btrfs: uapi/linux/btrfs_tree.h, use __u8 and __u64

2016-04-01 Thread Jeff Mahoney
u8 and u64 aren't exported to userspace, while __u8 and __u64 are. Signed-off-by: Jeff Mahoney --- include/uapi/linux/btrfs_tree.h | 52 - 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/

[PATCH 5/8] btrfs: uapi/linux/btrfs.h migration, move balance flags

2016-04-01 Thread Jeff Mahoney
The BTRFS_BALANCE_* flags are used by struct btrfs_ioctl_balance_args.flags and btrfs_ioctl_balance_args.{data,meta,sys}.flags in the BTRFS_IOC_BALANCE ioctl. Signed-off-by: Jeff Mahoney --- fs/btrfs/volumes.h | 46 - include/uapi/linux/btrfs.h | 64 ++

[PATCH 1/8] btrfs: uapi/linux/btrfs.h migration, move BTRFS_LABEL_SIZE

2016-04-01 Thread Jeff Mahoney
BTRFS_LABEL_SIZE is required to define the BTRFS_IOC_GET_FSLABEL and BTRFS_IOC_SET_FSLABEL ioctls. Signed-off-by: Jeff Mahoney --- fs/btrfs/ctree.h | 1 - include/uapi/linux/btrfs.h | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h

Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager
On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber wrote: > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote: >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote: >> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber >> > wrote: >> > > btrfs balance -mprofiles seems to do something. one k

Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

2016-04-01 Thread mitch
I grabbed this part from the log after the machine crashed again following trying to transfer a bunch of files that included ones with csum errors, let me know if this looks like the same issue you were having: Mar 31 00:49:42 sl-server kernel: NMI watchdog: BUG: soft lockup - CPU#21 stuck for 22

Re: btrfs_destroy_inode WARN_ON.

2016-04-01 Thread Dave Jones
On Fri, Apr 01, 2016 at 02:12:27PM -0400, Dave Jones wrote: > BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 30s! > Showing busy workqueues and worker pools: > workqueue events: flags=0x0 > pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=1/256 > pending: vmstat_shephe

Re: btrfs_destroy_inode WARN_ON.

2016-04-01 Thread Dave Jones
On Sun, Mar 27, 2016 at 09:14:00PM -0400, Dave Jones wrote: > > WARNING: CPU: 2 PID: 32570 at fs/btrfs/inode.c:9261 > btrfs_destroy_inode+0x389/0x3f0 [btrfs] > > CPU: 2 PID: 32570 Comm: rm Not tainted 4.5.0-think+ #14 > > c039baf9 ef721ef0 88025966fc08 8957bcd

Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber
On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote: > On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote: > > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber > > wrote: > > > btrfs balance -mprofiles seems to do something. one kworker and one > > > btrfs-transaction process hog one CP

Re: [PATCH v9 00/19] Btrfs dedupe framework

2016-04-01 Thread David Sterba
On Fri, Apr 01, 2016 at 08:26:43AM +0800, Qu Wenruo wrote: > > > David Sterba wrote on 2016/03/31 18:12 +0200: > > On Wed, Mar 30, 2016 at 03:55:55PM +0800, Qu Wenruo wrote: > >> This March 30th patchset update mostly addresses the patchset structure > >> comment from David: > >> 1) Change the pa

Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber
On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote: > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber > wrote: > > btrfs balance -mprofiles seems to do something. one kworker and one > > btrfs-transaction process hog one CPU core each for hours, while > > blocking the filesystem for minutes a

Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager
On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote: > Hi, > > just for a change, this is another btrfs on a different host. The host > is also running Debian unstable with mainline kernels, the btrfs in > question was created (not converted) in March 2015 with btrfs-tools > 3.17. It is the root fs o

Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-04-01 Thread Marc Haber
On Sat, Feb 27, 2016 at 10:14:50PM +0100, Marc Haber wrote: > I have again the issue of no space left on device while rebalancing > (with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable): just for the record: The host started acting up in more and more interesting ways, and after a call of rm

Another ENOSPC situation

2016-04-01 Thread Marc Haber
Hi, just for a change, this is another btrfs on a different host. The host is also running Debian unstable with mainline kernels, the btrfs in question was created (not converted) in March 2015 with btrfs-tools 3.17. It is the root fs of my main work notebook which is under workstation load, with

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread David Sterba
On Fri, Apr 01, 2016 at 08:09:56PM +0800, Qu Wenruo wrote: > > > On 04/01/2016 07:39 PM, David Sterba wrote: > > On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote: > >>> After another look, why don't we use nodesize directly? Or stripesize > >>> where applies. With max_size == 0 the test

Re: "bad metadata" not fixed by btrfs repair

2016-04-01 Thread Marc Haber
On Thu, Mar 31, 2016 at 08:42:46PM +0200, Henk Slager wrote: > So also false alerts. btrfs-tools 4.5.1 with Qu's patch from patchwork doesn't show those warnings any more. Greetings Marc -- - Marc Haber | "I don'

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread Qu Wenruo
On 04/01/2016 07:39 PM, David Sterba wrote: On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote: After another look, why don't we use nodesize directly? Or stripesize where applies. With max_size == 0 the test does not make sense, we ought to know the alignment. Yes, my first thought is

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread David Sterba
On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote: > > After another look, why don't we use nodesize directly? Or stripesize > > where applies. With max_size == 0 the test does not make sense, we ought > > to know the alignment. > > > Yes, my first thought is also to use nodesize directly, w

Re: empty disk reports full

2016-04-01 Thread Hugo Mills
On Fri, Apr 01, 2016 at 11:50:50AM +0200, Alejandro Vargas wrote: > I am using a 2Tb disk for incremental backups. > > I use rsync for backing up to a subvolume, and each day I create a snapshot > of the latest snapshot and rsync into it. > > When the disk becomes nearly full (100Gb or les

Re: [PATCH v10 02/21] btrfs: dedupe: Introduce function to initialize dedupe info

2016-04-01 Thread kbuild test robot
Hi Wang, [auto build test ERROR on btrfs/next] [also build test ERROR on v4.6-rc1 next-20160401] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Qu-Wenruo/Btrfs-dedupe-framework/20160401

empty disk reports full

2016-04-01 Thread Alejandro Vargas
I am using a 2Tb disk for incremental backups. I use rsync for backing up to a subvolume, and each day I create a snapshot of the latest snapshot and rsync into it. When the disk becomes nearly full (100Gb or less available) I delete the oldest subvolume (with btrfs subvolume delete). My

[PATCH v10 16/21] btrfs: dedupe: Add basic tree structure for on-disk dedupe method

2016-04-01 Thread Qu Wenruo
Introduce a new tree, the dedupe tree, to record on-disk dedupe hashes. It serves as persistent hash storage instead of the in-memory-only implementation. Unlike Liu Bo's implementation, in this version we won't do a hack for bytenr -> hash search, but add a new type, DEDUP_BYTENR_ITEM for such search case, just like in-memory ba

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread Qu Wenruo
David Sterba wrote on 2016/04/01 10:44 +0200: On Fri, Apr 01, 2016 at 08:28:18AM +0800, Qu Wenruo wrote: David Sterba wrote on 2016/03/31 18:30 +0200: On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote: At least 2 users from the mail list reported that btrfsck reported a false alert of "bad met

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread David Sterba
On Fri, Apr 01, 2016 at 08:28:18AM +0800, Qu Wenruo wrote: > > > David Sterba wrote on 2016/03/31 18:30 +0200: > > On Thu, Mar 31, 2016 at 10:19:34AM +0800, Qu Wenruo wrote: > >> At least 2 users from the mail list reported that btrfsck reported a false alert of > >> "bad metadata [,) crossing stripe