[PATCH RFC] Make add and device ioctls device path len consistent

2016-02-18 Thread Anand Jain
 In case we need to fix this properly, this patch helps.
 It's a bit of overkill, though, justified only by being theoretically
 correct. As discussed, either this or Chris's suggestion of simply using
 BTRFS_SUBVOL_NAME_MAX fixes the problem equally well in practice.

 These patches are on top of the patches
   btrfs: Introduce device delete by devid
   btrfs-progs: Introduce device delete by devid
 respectively.

Thanks, Anand

Anand Jain (1):
  btrfs: make add and device ioctl args path max consistent

 fs/btrfs/ioctl.c   |  2 +-
 include/uapi/linux/btrfs.h | 19 ++-
 2 files changed, 19 insertions(+), 2 deletions(-)

Anand Jain (1):
  btrfs-progs: make add and delete path max len consistent

 cmds-device.c |  2 +-
 ioctl.h   | 19 ++-
 2 files changed, 19 insertions(+), 2 deletions(-)

-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC] btrfs-progs: make add and delete path max len consistent

2016-02-18 Thread Anand Jain
The add device ioctl uses BTRFS_PATH_NAME_MAX, whereas delete
uses BTRFS_SUBVOL_NAME_MAX to hold the device path. This patch
makes them consistent.

Depends on
  btrfs: make add and device ioctl args path max consistent

Signed-off-by: Anand Jain 
---
 cmds-device.c |  2 +-
 ioctl.h   | 19 ++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/cmds-device.c b/cmds-device.c
index 879ba94d7ea5..be07e34c2af8 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -160,7 +160,7 @@ static int _cmd_device_remove(int argc, char **argv,
struct  btrfs_ioctl_vol_args arg;
int res;
 
-   struct btrfs_ioctl_vol_args_v2 argv2 = {0};
+   struct btrfs_ioctl_vol_args_v3 argv2 = {0};
int is_devid = 0;
 
if (string_is_numerical(argv[i])) {
diff --git a/ioctl.h b/ioctl.h
index 1560dcb4b457..8e444ca374fd 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -92,6 +92,23 @@ struct btrfs_ioctl_vol_args_v2 {
};
 };
 
+struct btrfs_ioctl_vol_args_v3 {
+   __s64 fd;
+   __u64 transid;
+   __u64 flags;
+   union {
+   struct {
+   __u64 size;
+   struct btrfs_qgroup_inherit __user *qgroup_inherit;
+   };
+   __u64 unused[4];
+   };
+   union {
+   char name[BTRFS_PATH_NAME_MAX + 1];
+   u64 devid;
+   };
+};
+
 /*
  * structure to report errors and progress to userspace, either as a
  * result of a finished scrub, a canceled scrub or a progress inquiry
@@ -689,7 +706,7 @@ static inline char *btrfs_err_str(enum btrfs_err_code err_code)
 #define BTRFS_IOC_GET_SUPPORTED_FEATURES _IOR(BTRFS_IOCTL_MAGIC, 57, \
   struct btrfs_ioctl_feature_flags[3])
 #define BTRFS_IOC_RM_DEV_V2 _IOW(BTRFS_IOCTL_MAGIC, 58, \
-  struct btrfs_ioctl_vol_args_v2)
+  struct btrfs_ioctl_vol_args_v3)
 #ifdef __cplusplus
 }
 #endif
-- 
2.7.0



[PATCH RFC] btrfs: make add and device ioctl args path max consistent

2016-02-18 Thread Anand Jain
The add device ioctl uses BTRFS_PATH_NAME_MAX, whereas delete
uses BTRFS_SUBVOL_NAME_MAX for the device path. This patch makes
them consistent.

Signed-off-by: Anand Jain 
---
 fs/btrfs/ioctl.c   |  2 +-
 include/uapi/linux/btrfs.h | 19 ++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 1f72f8ed38f5..15fc91291c40 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2672,7 +2672,7 @@ out:
 static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 {
struct btrfs_root *root = BTRFS_I(file_inode(file))->root;
-   struct btrfs_ioctl_vol_args_v2 *vol_args;
+   struct btrfs_ioctl_vol_args_v3 *vol_args;
int ret;
 
if (!capable(CAP_SYS_ADMIN))
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 3975e683af72..3d58182b68e8 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -90,6 +90,23 @@ struct btrfs_ioctl_vol_args_v2 {
};
 };
 
+struct btrfs_ioctl_vol_args_v3 {
+   __s64 fd;
+   __u64 transid;
+   __u64 flags;
+   union {
+   struct {
+   __u64 size;
+   struct btrfs_qgroup_inherit __user *qgroup_inherit;
+   };
+   __u64 unused[4];
+   };
+   union {
+   char name[BTRFS_PATH_NAME_MAX + 1];
+   __u64 devid;
+   };
+};
+
 /*
  * structure to report errors and progress to userspace, either as a
  * result of a finished scrub, a canceled scrub or a progress inquiry
@@ -671,6 +688,6 @@ static inline char *btrfs_err_str(enum btrfs_err_code err_code)
 #define BTRFS_IOC_GET_SUPPORTED_FEATURES _IOR(BTRFS_IOCTL_MAGIC, 57, \
   struct btrfs_ioctl_feature_flags[3])
 #define BTRFS_IOC_RM_DEV_V2 _IOW(BTRFS_IOCTL_MAGIC, 58, \
-  struct btrfs_ioctl_vol_args_v2)
+  struct btrfs_ioctl_vol_args_v3)
 
 #endif /* _UAPI_LINUX_BTRFS_H */
-- 
2.7.0
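
For illustration only (not part of the patch), here is a minimal sketch of why the choice of constant matters for the argument layout. The demo_* structs are simplified stand-ins that assume the field layout shown in the diff above, with the size/qgroup_inherit union collapsed into unused[4]; the ioctl number 58 and the two constants are taken from this thread, and 0x94 is assumed to be BTRFS_IOCTL_MAGIC. Because _IOW() encodes the argument size, switching BTRFS_IOC_RM_DEV_V2 to the larger v3 struct also changes the ioctl number.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

#define BTRFS_IOCTL_MAGIC     0x94   /* assumed value of the btrfs ioctl magic */
#define BTRFS_PATH_NAME_MAX   4087
#define BTRFS_SUBVOL_NAME_MAX 4039

/* Stand-in for btrfs_ioctl_vol_args (add device): fd plus path buffer. */
struct demo_vol_args {
	int64_t fd;
	char name[BTRFS_PATH_NAME_MAX + 1];
};

/* Stand-in for btrfs_ioctl_vol_args_v2 (delete device by path or devid). */
struct demo_vol_args_v2 {
	int64_t fd;
	uint64_t transid;
	uint64_t flags;
	uint64_t unused[4];
	union {
		char name[BTRFS_SUBVOL_NAME_MAX + 1];
		uint64_t devid;
	};
};

/* Stand-in for the proposed v3: same header, longer path buffer. */
struct demo_vol_args_v3 {
	int64_t fd;
	uint64_t transid;
	uint64_t flags;
	uint64_t unused[4];
	union {
		char name[BTRFS_PATH_NAME_MAX + 1];
		uint64_t devid;
	};
};

/* v1 and v2 are both 4096 bytes; v3 grows by the 48 extra path bytes. */
_Static_assert(sizeof(struct demo_vol_args) == 4096, "v1 size");
_Static_assert(sizeof(struct demo_vol_args_v2) == 4096, "v2 size");
_Static_assert(sizeof(struct demo_vol_args_v3) == 4144, "v3 size");

/* Ioctl number for RM_DEV_V2 when it takes the v2 struct. */
unsigned long demo_rm_dev_nr_v2(void)
{
	return _IOW(BTRFS_IOCTL_MAGIC, 58, struct demo_vol_args_v2);
}

/* Ioctl number for RM_DEV_V2 when it takes the v3 struct. */
unsigned long demo_rm_dev_nr_v3(void)
{
	return _IOW(BTRFS_IOCTL_MAGIC, 58, struct demo_vol_args_v3);
}

/* The two encodings differ only in the size field. */
void demo_example(void)
{
	assert(demo_rm_dev_nr_v2() != demo_rm_dev_nr_v3());
	assert(_IOC_SIZE(demo_rm_dev_nr_v3()) == sizeof(struct demo_vol_args_v3));
}
```

The number change should only matter if the v2-based number had already been released to userspace, which is an assumption worth checking before relying on this.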



Oops on Send

2016-02-18 Thread Jan Alexander Steffens
Greetings,

I'm getting reproducible Oopses when attempting "btrfs send" on my
root vol, quickly ending in a hard lockup of the machine. Both
incremental send and non-incremental send crash, the former much
earlier. The volume is in a LUKS container.

The initial fault is always during LZO decompression, but scrub does
not find any errors. Could this be invalid compressed data with good
checksums? "btrfs check" finds bad extent backrefs which --repair
claims to fix, but they reappear on a subsequent check.

Is there a way to find out which files' extents are damaged so I can
delete them and complete my backup?

I'm not subscribed, so please include my address in To or Cc.

Kind regards,
Jan



BUG: unable to handle kernel paging request at c90001bb6000
IP: [] memcpy_erms+0x6/0x10
PGD 40f08d067 PUD 40f08e067 PMD 40c65e067 PTE 0
Oops: 0002 [#1] PREEMPT SMP
Modules linked in: ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
nf_reject_ipv4 xt_tcpudp rfcomm tun bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter uas usb_storage
bnep joydev mousedev tpm_infineon arc4 nls_iso8859_1 nls_cp437 vfat
fat snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi
iTCO_wdt iTCO_vendor_support iwlmvm mac80211 x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm iwlwifi irqbypass btusb btrtl
btbcm input_leds btintel led_class cfg80211 bluetooth psmouse
serio_raw snd_hda_intel e1000e snd_hda_codec rfkill i2c_i801
rtsx_pci_ms crc16 memstick snd_hda_core ptp mei_me pps_core snd_hwdep
lpc_ich mei
 shpchp snd_pcm wmi thermal battery fjes evdev tpm_tis mac_hid ac
processor sch_fq_codel usbip_host usbip_core snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_timer snd soundcore cuse
fuse nfs lockd grace sunrpc fscache tcp_cdg tpm_rng rng_core tpm
vhba(O) ip_tables x_tables btrfs xor raid6_pq sha256_ssse3
sha256_generic hmac drbg ansi_cprng algif_skcipher af_alg dm_crypt
dm_mod sd_mod rtsx_pci_sdmmc mmc_core atkbd libps2 crct10dif_pclmul
xhci_pci crc32_pclmul crc32c_intel xhci_hcd aesni_intel ehci_pci ahci
aes_x86_64 lrw ehci_hcd libahci gf128mul glue_helper ablk_helper
cryptd libata scsi_mod usbcore rtsx_pci usb_common i8042 serio i915
video button intel_gtt i2c_algo_bit drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm
CPU: 4 PID: 12114 Comm: kworker/u16:5 Tainted: G   O 4.4.1-2-ARCH #1
Hardware name: Notebook W740SU
 /W740SU  , BIOS 4.6.5 09/11/2014
Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
task: 8803942f9b80 ti: 8803aee78000 task.ti: 8803aee78000
RIP: 0010:[]  [] memcpy_erms+0x6/0x10
RSP: 0018:8803aee7bc58  EFLAGS: 00010286
RAX: c90001bb5ff8 RBX: 1000 RCX: 0ff8
RDX: 1000 RSI: 880260ca7008 RDI: c90001bb6000
RBP: 8803aee7bd28 R08: c90001bb4000 R09: 1000
R10:  R11: 880408b5c398 R12: 
R13: 00230021 R14: 880408b5c398 R15: 00230029
FS:  () GS:88041fb0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: c90001bb6000 CR3: 01809000 CR4: 001406e0
Stack:
 a04dbce3 8102166e 8803aee7bc90 1600
 8803581f5800 000e  0022e029
 c90001bb4000 0002 001a 001a0048
Call Trace:
 [] ? lzo_decompress_biovec+0x1e3/0x2d0 [btrfs]
 [] ? kernel_fpu_end+0xe/0x20
 [] end_compressed_bio_read+0x1c5/0x300 [btrfs]
 [] bio_endio+0x3f/0x60
 [] end_workqueue_fn+0x3c/0x40 [btrfs]
 [] btrfs_scrubparity_helper+0x77/0x2e0 [btrfs]
 [] btrfs_endio_helper+0xe/0x10 [btrfs]
 [] process_one_work+0x14b/0x440
 [] worker_thread+0x48/0x4a0
 [] ? process_one_work+0x440/0x440
 [] kthread+0xd8/0xf0
 [] ? kthread_worker_fn+0x170/0x170
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_worker_fn+0x170/0x170
Code: f3 c3 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83
e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 
a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
RIP  [] memcpy_erms+0x6/0x10
 RSP 
CR2: c90001bb6000
---[ end trace 99ccc6da83f67660 ]---
BUG: unable to handle kernel paging request at ffd8
IP: [] kthread_data+0x10/0x20
PGD 180c067 PUD 180e067 PMD 0
Oops:  [#2] PREEMPT SMP
Modules linked in: ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
nf_reject_ipv4 xt_tcpudp rfcomm tun bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter uas usb_storage
bnep joydev mousedev tpm_infineon arc4 nls_iso8859_1 nls_cp437 vfat
fat snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi
iTCO_wdt iT

Re: [patch] btrfs: array overflow in btrfs_ioctl_rm_dev_v2()

2016-02-18 Thread David Sterba
On Thu, Feb 18, 2016 at 03:14:04PM +0800, Anand Jain wrote:
> 
> 
> Thanks Dan.
>   Chris pointed out as well. We are working on it..

In for-next since yesterday.

>   Just one concern when device is added the max device length is
>   BTRFS_PATH_NAME_MAX. However below fix is proper from the vol_args
>   perspective.

Yeah, there's a mess in the various PATH constants, but they're all > 4000
and that should be enough for most uses.


Re: [PATCH] btrfs: fix ifnullfree.cocci warnings

2016-02-18 Thread David Sterba
On Wed, Feb 17, 2016 at 07:04:41PM +0800, kbuild test robot wrote:
> fs/btrfs/volumes.c:1886:2-7: WARNING: NULL check before freeing functions 
> like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not 
> needed. Maybe consider reorganizing relevant code to avoid passing NULL 
> values.
> 
>  NULL check before some freeing functions is not needed.
> 
>  Based on checkpatch warning
>  "kfree(NULL) is safe this check is probably not required"
>  and kfreeaddr.cocci by Julia Lawall.
> 
> Generated by: scripts/coccinelle/free/ifnullfree.cocci
> 
> CC: Anand Jain 
> Signed-off-by: Fengguang Wu 

Thanks, applied to the respective branch.
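
As a userspace analogue of the warning quoted above (a sketch only, not the actual change in fs/btrfs/volumes.c): the C standard defines free(NULL) as a no-op, and the quoted checkpatch message says the same of kfree(NULL), so the guard can simply be dropped. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object owning one heap-allocated member. */
struct demo_dev {
	char *name;
};

/* Before: the NULL check is redundant. */
void demo_release_checked(struct demo_dev *d)
{
	if (d->name)
		free(d->name);
	d->name = NULL;
}

/* After: free(NULL) is a no-op, so call it unconditionally. */
void demo_release(struct demo_dev *d)
{
	free(d->name);
	d->name = NULL;
}

/* Exercise both the NULL and the allocated case. */
void demo_example(void)
{
	struct demo_dev d = { NULL };

	demo_release(&d);            /* safe on a NULL member */
	d.name = malloc(8);
	demo_release(&d);            /* frees the allocation */
	assert(d.name == NULL);
}
```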


Re: btrfs: reada: simplify dev->reada_in_flight processing

2016-02-18 Thread David Sterba
On Thu, Feb 18, 2016 at 09:37:41AM +0800, Zhao Lei wrote:
> CC: David Sterba 
> I'll fix this indent problem in following branch:
> https://github.com/zhaoleidd/btrfs.git integration-4.5
> 
> Could you pick them again?

Done, fixed manually and updated in for-next.


Re: [PATCH v4 12/13] btrfs: introduce device delete by devid

2016-02-18 Thread David Sterba
On Thu, Feb 18, 2016 at 02:59:26PM +0800, Anand Jain wrote:
> #define BTRFS_PATH_NAME_MAX 4087
> #define BTRFS_SUBVOL_NAME_MAX 4039
> 
>   I am fine with using BTRFS_SUBVOL_NAME_MAX for now. But theoretical
>   anomaly is that add-device code path will use BTRFS_PATH_NAME_MAX and
>   delete device will use BTRFS_SUBVOL_NAME_MAX.. its only theoretical
>   as most of the devices path are well below 4k IMO. So its a good
>   trade off than other solutions like.. (just for the understanding),
> 
> - Update add device code as well to use btrfs_ioctl_vol_args_v2
>   Which means we need to introduce BTRFS_IOC_ADD_DEV_V2 (system
>   PATH_MAX is 4096).
> 
> OR
> 
> - Create new btrfs_ioctl_vol_args_v3 with name[BTRFS_PATH_NAME_MAX+1]
> (instead of name[BTRFS_SUBVOL_NAME_MAX+1]) and BTRFS_IOC_RM_DEV_V2
> will be the only consumer of btrfs_ioctl_vol_args_v3 as of now.

I'd rather not introduce any new structures or ioctls; the problem seems
to be marginal. As mentioned, paths are way shorter than 4k. A symlink
can be used as a workaround. We can document that.

The use of BTRFS_SUBVOL_NAME_MAX is confusing for devices; we can do:

#define BTRFS_DEVICE_PATH_MAX   BTRFS_SUBVOL_NAME_MAX
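
A minimal sketch of how such an alias could be used, assuming the value 4039 quoted above. The helper demo_set_device_path() is hypothetical and only illustrates rejecting an over-long device path before it is copied into a name buffer of BTRFS_DEVICE_PATH_MAX + 1 bytes; it is not code from the kernel or btrfs-progs.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define BTRFS_SUBVOL_NAME_MAX 4039
#define BTRFS_DEVICE_PATH_MAX BTRFS_SUBVOL_NAME_MAX

/* The alias stays below the system PATH_MAX of 4096 mentioned in the thread. */
_Static_assert(BTRFS_DEVICE_PATH_MAX < 4096, "device path limit below PATH_MAX");

/*
 * Hypothetical helper: copy a device path into an ioctl name buffer.
 * Returns 0 on success or -ENAMETOOLONG if the path does not fit.
 */
int demo_set_device_path(char dst[BTRFS_DEVICE_PATH_MAX + 1], const char *path)
{
	size_t len = strlen(path);

	if (len > BTRFS_DEVICE_PATH_MAX)
		return -ENAMETOOLONG;
	memcpy(dst, path, len + 1);  /* include the terminating NUL */
	return 0;
}

/* A short path is accepted and copied verbatim. */
void demo_example(void)
{
	char buf[BTRFS_DEVICE_PATH_MAX + 1];

	assert(demo_set_device_path(buf, "/dev/sdb") == 0);
	assert(strcmp(buf, "/dev/sdb") == 0);
}
```

A path longer than the limit would be rejected here, which is where the symlink workaround comes in.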


Re: btrfs send error

2016-02-18 Thread Duncan
Filipe Manana posted on Wed, 17 Feb 2016 14:01:15 + as excerpted:

> Just upgrade to a 4.3 or 4.4 kernel, or build a kernel with the patch
> below.
> The fix for this landed in 4.3:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d6589101b67a55107652050dfbf414403a93e351

Has that patch been earmarked for stable, and stable just hasn't caught 
up to it yet?

Because the OP is on 4.1-LTS, which is definitely still within the 
reasonable support window and being an LTS, should still be supported.

If it's already earmarked for stable and simply hasn't made it into a 
stable release yet, that's understandable, tho having it specifically 
stated, thus making waiting for it to hit stable an option, would be 
nice. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: task btrfs-cleaner:770 blocked for more than 120 seconds.

2016-02-18 Thread Christian Rohmann


On 02/14/2016 11:42 PM, Roman Mamedov wrote:
> FWIW I had a persistently repeating deadlock on 4.1 and 4.3, but
> after upgrade to 4.4 it no longer happens.


Apparently also with 4.4 there is some sort of blocking happening ...
just at 38580:

 cut 

[Wed Feb 17 16:43:48 2016] INFO: task btrfs-cleaner:38580 blocked for
more than 120 seconds.
[Wed Feb 17 16:43:48 2016]   Not tainted 4.4.0-customkernel #1
[Wed Feb 17 16:43:48 2016] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Feb 17 16:43:48 2016] btrfs-cleaner   D 882c27295dc0 0
38580  2 0x
[Wed Feb 17 16:43:48 2016]  882c16fa6480 88161a980280
882a3d744000 882a3d743df8
[Wed Feb 17 16:43:48 2016]  8815fead7104 882c16fa6480
 8815fead7108
[Wed Feb 17 16:43:48 2016]  81559a31 8815fead7100
81559cba 8155b5a0
[Wed Feb 17 16:43:48 2016] Call Trace:
[Wed Feb 17 16:43:48 2016]  [] ? schedule+0x31/0x80
[Wed Feb 17 16:43:48 2016]  [] ?
schedule_preempt_disabled+0xa/0x10
[Wed Feb 17 16:43:48 2016]  [] ?
__mutex_lock_slowpath+0x90/0x110
[Wed Feb 17 16:43:48 2016]  [] ? mutex_lock+0x1b/0x30
[Wed Feb 17 16:43:48 2016]  [] ?
btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
[Wed Feb 17 16:43:48 2016]  [] ? __schedule+0x286/0x8f0
[Wed Feb 17 16:43:48 2016]  [] ?
cleaner_kthread+0x1a7/0x200 [btrfs]
[Wed Feb 17 16:43:48 2016]  [] ?
check_leaf+0x340/0x340 [btrfs]
[Wed Feb 17 16:43:48 2016]  [] ? kthread+0xcf/0xf0
[Wed Feb 17 16:43:48 2016]  [] ? kthread_park+0x50/0x50
[Wed Feb 17 16:43:48 2016]  [] ? ret_from_fork+0x3f/0x70
[Wed Feb 17 16:43:48 2016]  [] ? kthread_park+0x50/0x50

[Wed Feb 17 17:23:48 2016] INFO: task btrfs-cleaner:38580 blocked for
more than 120 seconds.
[Wed Feb 17 17:23:48 2016]   Not tainted 4.4.0-customkernel #1
[Wed Feb 17 17:23:48 2016] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Feb 17 17:23:48 2016] btrfs-cleaner   D 881627a35dc0 0
38580  2 0x
[Wed Feb 17 17:23:48 2016]  882c16fa6480 88161a956f00
882a3d744000 882a3d743df8
[Wed Feb 17 17:23:48 2016]  8815fead7104 882c16fa6480
 8815fead7108
[Wed Feb 17 17:23:48 2016]  81559a31 8815fead7100
81559cba 8155b5a0
[Wed Feb 17 17:23:48 2016] Call Trace:
[Wed Feb 17 17:23:48 2016]  [] ? schedule+0x31/0x80
[Wed Feb 17 17:23:48 2016]  [] ?
schedule_preempt_disabled+0xa/0x10
[Wed Feb 17 17:23:48 2016]  [] ?
__mutex_lock_slowpath+0x90/0x110
[Wed Feb 17 17:23:48 2016]  [] ? mutex_lock+0x1b/0x30
[Wed Feb 17 17:23:48 2016]  [] ?
btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
[Wed Feb 17 17:23:48 2016]  [] ? __schedule+0x286/0x8f0
[Wed Feb 17 17:23:48 2016]  [] ?
cleaner_kthread+0x1a7/0x200 [btrfs]
[Wed Feb 17 17:23:48 2016]  [] ?
check_leaf+0x340/0x340 [btrfs]
[Wed Feb 17 17:23:48 2016]  [] ? kthread+0xcf/0xf0
[Wed Feb 17 17:23:48 2016]  [] ? kthread_park+0x50/0x50
[Wed Feb 17 17:23:48 2016]  [] ? ret_from_fork+0x3f/0x70
[Wed Feb 17 17:23:48 2016]  [] ? kthread_park+0x50/0x50

[Wed Feb 17 17:57:48 2016] INFO: task btrfs-cleaner:38580 blocked for
more than 120 seconds.
[Wed Feb 17 17:57:48 2016]   Not tainted 4.4.0-customkernel #1
[Wed Feb 17 17:57:48 2016] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Feb 17 17:57:48 2016] btrfs-cleaner   D 881627a95dc0 0
38580  2 0x
[Wed Feb 17 17:57:48 2016]  882c16fa6480 88161a980fc0
882a3d744000 882a3d743df8
[Wed Feb 17 17:57:48 2016]  8815fead7104 882c16fa6480
 8815fead7108
[Wed Feb 17 17:57:48 2016]  81559a31 8815fead7100
81559cba 8155b5a0
[Wed Feb 17 17:57:48 2016] Call Trace:
[Wed Feb 17 17:57:48 2016]  [] ? schedule+0x31/0x80
[Wed Feb 17 17:57:48 2016]  [] ?
schedule_preempt_disabled+0xa/0x10
[Wed Feb 17 17:57:48 2016]  [] ?
__mutex_lock_slowpath+0x90/0x110
[Wed Feb 17 17:57:48 2016]  [] ? mutex_lock+0x1b/0x30
[Wed Feb 17 17:57:48 2016]  [] ?
btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
[Wed Feb 17 17:57:48 2016]  [] ? __schedule+0x286/0x8f0
[Wed Feb 17 17:57:48 2016]  [] ?
cleaner_kthread+0x1a7/0x200 [btrfs]
[Wed Feb 17 17:57:48 2016]  [] ?
check_leaf+0x340/0x340 [btrfs]
[Wed Feb 17 17:57:48 2016]  [] ? kthread+0xcf/0xf0
[Wed Feb 17 17:57:48 2016]  [] ? kthread_park+0x50/0x50
[Wed Feb 17 17:57:48 2016]  [] ? ret_from_fork+0x3f/0x70
[Wed Feb 17 17:57:48 2016]  [] ? kthread_park+0x50/0x50


 cut 


[PATCH 1/2] btrfs: add GET_SUPPORTED_FEATURES to the control device ioctls

2016-02-18 Thread David Sterba
The control device is accessible when no filesystem is mounted, and we
may want to query the features supported by the module. This is already
possible using the sysfs files; this ioctl is for parity and
convenience.

Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h | 1 +
 fs/btrfs/ioctl.c | 3 +--
 fs/btrfs/super.c | 4 
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index bfe4a337fb4d..47bc50fd4f55 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4089,6 +4089,7 @@ void btrfs_test_inode_set_ops(struct inode *inode);
 
 /* ioctl.c */
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+int btrfs_ioctl_get_supported_features(struct file *file, void __user *arg);
 void btrfs_update_iflags(struct inode *inode);
 void btrfs_inherit_iflags(struct inode *inode, struct inode *dir);
 int btrfs_is_empty_uuid(u8 *uuid);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 952172ca7e45..f4c6ed5c5300 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5187,8 +5187,7 @@ static int btrfs_ioctl_set_fslabel(struct file *file, void __user *arg)
  .compat_ro_flags = BTRFS_FEATURE_COMPAT_RO_##suffix, \
  .incompat_flags = BTRFS_FEATURE_INCOMPAT_##suffix }
 
-static int btrfs_ioctl_get_supported_features(struct file *file,
- void __user *arg)
+int btrfs_ioctl_get_supported_features(struct file *file, void __user *arg)
 {
static const struct btrfs_ioctl_feature_flags features[3] = {
INIT_FEATURE_FLAGS(SUPP),
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d41e09fe8e38..dda6f64dfd73 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2163,6 +2163,10 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
break;
ret = !(fs_devices->num_devices == fs_devices->total_devices);
break;
+   case BTRFS_IOC_GET_SUPPORTED_FEATURES:
+   ret = btrfs_ioctl_get_supported_features(NULL,
+   (void __user*)arg);
+   break;
}
 
kfree(vol);
-- 
2.7.1
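
For illustration, a rough userspace sketch of how the ioctl might be called on the control device once this patch is applied. It assumes the control node lives at /dev/btrfs-control and that BTRFS_IOCTL_MAGIC is 0x94; the struct and request number are redefined locally under demo_* names (mirroring the _IOR(BTRFS_IOCTL_MAGIC, 57, struct btrfs_ioctl_feature_flags[3]) definition seen earlier in this thread) so the example does not depend on any installed btrfs header.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define BTRFS_IOCTL_MAGIC 0x94   /* assumed value of the btrfs ioctl magic */

/* Local mirror of the three feature bitmasks returned per entry. */
struct demo_feature_flags {
	uint64_t compat_flags;
	uint64_t compat_ro_flags;
	uint64_t incompat_flags;
};

#define DEMO_IOC_GET_SUPPORTED_FEATURES \
	_IOR(BTRFS_IOCTL_MAGIC, 57, struct demo_feature_flags[3])

/*
 * Query the module's feature flags through a control device node.
 * Returns 0 on success or a negative errno value on failure.
 */
int demo_query_supported_features(const char *ctl_path,
				  struct demo_feature_flags out[3])
{
	int fd = open(ctl_path, O_RDONLY);
	int ret = 0;

	if (fd < 0)
		return -errno;
	if (ioctl(fd, DEMO_IOC_GET_SUPPORTED_FEATURES, out) < 0)
		ret = -errno;   /* capture errno before close() can change it */
	close(fd);
	return ret;
}

/* A missing device node is reported as an error rather than crashing. */
void demo_example(void)
{
	struct demo_feature_flags flags[3];

	assert(demo_query_supported_features("/nonexistent/btrfs-control",
					     flags) < 0);
}
```

On a machine with the module loaded, calling demo_query_supported_features("/dev/btrfs-control", flags) would be the intended use; out[0] corresponds to the INIT_FEATURE_FLAGS(SUPP) entry in the diff above.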



[PATCH 2/2] btrfs: drop unused argument in btrfs_ioctl_get_supported_features

2016-02-18 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h | 2 +-
 fs/btrfs/ioctl.c | 4 ++--
 fs/btrfs/super.c | 3 +--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 47bc50fd4f55..82ce847318ae 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4089,7 +4089,7 @@ void btrfs_test_inode_set_ops(struct inode *inode);
 
 /* ioctl.c */
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
-int btrfs_ioctl_get_supported_features(struct file *file, void __user *arg);
+int btrfs_ioctl_get_supported_features(void __user *arg);
 void btrfs_update_iflags(struct inode *inode);
 void btrfs_inherit_iflags(struct inode *inode, struct inode *dir);
 int btrfs_is_empty_uuid(u8 *uuid);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f4c6ed5c5300..dcda7ea1e928 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5187,7 +5187,7 @@ static int btrfs_ioctl_set_fslabel(struct file *file, void __user *arg)
  .compat_ro_flags = BTRFS_FEATURE_COMPAT_RO_##suffix, \
  .incompat_flags = BTRFS_FEATURE_INCOMPAT_##suffix }
 
-int btrfs_ioctl_get_supported_features(struct file *file, void __user *arg)
+int btrfs_ioctl_get_supported_features(void __user *arg)
 {
static const struct btrfs_ioctl_feature_flags features[3] = {
INIT_FEATURE_FLAGS(SUPP),
@@ -5466,7 +5466,7 @@ long btrfs_ioctl(struct file *file, unsigned int
case BTRFS_IOC_SET_FSLABEL:
return btrfs_ioctl_set_fslabel(file, argp);
case BTRFS_IOC_GET_SUPPORTED_FEATURES:
-   return btrfs_ioctl_get_supported_features(file, argp);
+   return btrfs_ioctl_get_supported_features(argp);
case BTRFS_IOC_GET_FEATURES:
return btrfs_ioctl_get_features(file, argp);
case BTRFS_IOC_SET_FEATURES:
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index dda6f64dfd73..737e6a85c71e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2164,8 +2164,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
ret = !(fs_devices->num_devices == fs_devices->total_devices);
break;
case BTRFS_IOC_GET_SUPPORTED_FEATURES:
-   ret = btrfs_ioctl_get_supported_features(NULL,
-   (void __user*)arg);
+   ret = btrfs_ioctl_get_supported_features((void __user*)arg);
break;
}
 
-- 
2.7.1



Re: [PATCH 1/2] fstests: generic test for directory fsync after rename operation

2016-02-18 Thread Filipe Manana
On Thu, Feb 18, 2016 at 1:30 AM, Dave Chinner  wrote:
> On Mon, Feb 15, 2016 at 10:54:23AM +, fdman...@kernel.org wrote:
>> From: Filipe Manana 
>>
>> Test that if we move one file between directories, fsync the parent
>> directory of the old directory, power fail and remount the filesystem,
>> the file is not lost and it's located at the destination directory.
>>
>> This is motivated by a bug found in btrfs, which is fixed by the patch
>> (for the linux kernel) titled:
>>
>>   "Btrfs: fix file loss on log replay after renaming a file and fsync"
>>
>> Tested against ext3, ext4, xfs, f2fs and reiserfs.
>>
>> Signed-off-by: Filipe Manana 
> 
>> +# We expect our file foo to exist, have an entry in the new parent
>> +# directory (c/) and not have anymore an entry in the old parent directory
>> +# (a/b/).
>> +[ -e $SCRATCH_MNT/a/b/foo ] && echo "File foo is still at directory a/b/"
>> +[ -e $SCRATCH_MNT/c/foo ] || echo "File foo is not at directory c/"
>> +
>> +# The new file named bar should also exist.
>> +[ -e $SCRATCH_MNT/a/bar ] || echo "File bar is missing"
>
> This can all be replaced simply by:
>
> ls -R $SCRATCH_MNT | _filter_scratch
>
> Because the golden image match will tell us if files are missing or
> in the wrong place.

The problem with that is that ext3/4 have the lost+found directory, which
xfs, btrfs, etc. don't have.
Would you mind something like this:

# exclude lost+found directory specific to some filesystems (ext3/4)
ls -R $SCRATCH_MNT | grep -v 'lost+found' | tr -s '\n' | _filter_scratch

(since you usually dislike generic tests having any specific logic for
specific filesystems)

Also do I need to remove _need_to_be_root for the 3 tests I submitted?
I only noticed there was a submitted patch that kills that function
after sending them.

thanks


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com
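
For readers following the thread, a rough C sketch of the operation sequence the test exercises, as far as it can be inferred from the description and the checks quoted above: foo starts in a/b/, is renamed into c/, bar is created in a/, and a/ (the parent of the old directory) is fsynced. The exact ordering, all demo_* names and the temporary directory are assumptions, and the power failure and remount that the real test performs are not simulated here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build "<root>/<rel>" into buf. */
static void demo_path(char *buf, size_t n, const char *root, const char *rel)
{
	snprintf(buf, n, "%s/%s", root, rel);
}

/* Create an empty file; returns 0 on success, -1 on failure. */
static int demo_touch(const char *path)
{
	int fd = open(path, O_CREAT | O_WRONLY, 0644);

	if (fd < 0)
		return -1;
	return close(fd);
}

/* fsync a directory given its path; returns 0 on success, -1 on failure. */
static int demo_fsync_dir(const char *path)
{
	int fd = open(path, O_RDONLY | O_DIRECTORY);
	int ret;

	if (fd < 0)
		return -1;
	ret = fsync(fd);
	close(fd);
	return ret;
}

/* Returns 1 if <root>/<rel> exists, 0 otherwise. */
int demo_exists(const char *root, const char *rel)
{
	char p[512];

	demo_path(p, sizeof(p), root, rel);
	return access(p, F_OK) == 0;
}

/* Move a/b/foo to c/foo, create a/bar, then fsync a/. */
int demo_rename_then_fsync(const char *root)
{
	char a[512], ab[512], c[512], src[512], dst[512], bar[512];

	demo_path(a, sizeof(a), root, "a");
	demo_path(ab, sizeof(ab), root, "a/b");
	demo_path(c, sizeof(c), root, "c");
	demo_path(src, sizeof(src), root, "a/b/foo");
	demo_path(dst, sizeof(dst), root, "c/foo");
	demo_path(bar, sizeof(bar), root, "a/bar");

	if (mkdir(a, 0755) || mkdir(ab, 0755) || mkdir(c, 0755))
		return -1;
	if (demo_touch(src))
		return -1;
	if (rename(src, dst))        /* move foo from a/b/ to c/ */
		return -1;
	if (demo_touch(bar))         /* new file next to the old directory */
		return -1;
	return demo_fsync_dir(a);    /* fsync a/, the parent of a/b/ */
}

/* Run the sequence in a fresh temporary directory and check the layout. */
int demo_example(void)
{
	char tmpl[] = "/tmp/rename-fsync-XXXXXX";
	char *root = mkdtemp(tmpl);

	if (!root || demo_rename_then_fsync(root))
		return -1;
	assert(demo_exists(root, "c/foo"));
	assert(!demo_exists(root, "a/b/foo"));
	assert(demo_exists(root, "a/bar"));
	return 0;
}
```

The three assertions correspond to the three checks in the quoted test; after a simulated power failure the same layout is what the golden output would have to show.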


Re: [PATCH 13/13] btrfs: optimize check for stale device

2016-02-18 Thread David Sterba
On Sat, Feb 13, 2016 at 10:01:40AM +0800, Anand Jain wrote:
> Optimize check for stale device to only be checked when there is device
> added or changed. If there is no update to the device, there is no need
> to call btrfs_free_stale_device().
> 
> Signed-off-by: Anand Jain 

http://thread.gmane.org/gmane.comp.file-systems.btrfs/48909/focus=48976

So why did you include the patch in this series?

I see crashes with btrfs/011 on a non-debugging config:

[  641.714363] BUG: unable to handle kernel NULL pointer dereference at 
0068
[  641.716057] IP: [] scrub_setup_ctx.isra.19+0x1f6/0x260 
[btrfs]
[  641.717036] PGD 720c1067 PUD 720c2067 PMD 0
[  641.717749] Oops:  [#1] PREEMPT SMP
[  641.718432] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs 
libcrc32c ppdev acpi_cpufreq 8250_fintek parport_pc parport bochs_drm ttm 
drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt button joydev 
tpm_tis tpm i2c_piix4 serio_raw pcspkr dm_mod btrfs xor raid6_pq sr_mod cdrom 
ata_generic ata_piix sym53c8xx e1000 scsi_transport_spi floppy sg
[  641.723163] CPU: 0 PID: 27766 Comm: btrfs Not tainted 
4.5.0-rc3-next-20160212-1.g38290f0-vanilla #1
[  641.724420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by 
qemu-project.org 04/01/2014
[  641.725723] task: 8800742481c0 ti: 880071d1 task.ti: 
880071d1
[  641.726954] RIP: 0010:[]  [] 
scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
[  641.728404] RSP: 0018:880071d13ce8  EFLAGS: 00010202
[  641.729413] RAX: 88007231e800 RBX: 88007231e800 RCX: 
[  641.730610] RDX: a0195638 RSI: a017c5a8 RDI: 88007231ea80
[  641.731832] RBP: 880071d13d18 R08:  R09: 88007204ea00
[  641.733085] R10: 0008 R11:  R12: 
[  641.734307] R13: 0001 R14: 88007231e9f8 R15: 003f
[  641.735544] FS:  7f03ed36d8c0() GS:88007fc0() 
knlGS:
[  641.736883] CS:  0010 DS:  ES:  CR0: 80050033
[  641.738022] CR2: 0068 CR3: 720c CR4: 06f0
[  641.739325] Stack:
[  641.740156]  8800724d4000 8800724d4000  
8800722ef000
[  641.741735]   8800724d4fc8 880071d13d98 
a01566fd
[  641.743163]  88007b127000 0019 8800724d4ce8 

[  641.744599] Call Trace:
[  641.745553]  [] btrfs_scrub_dev+0x13d/0x510 [btrfs]
[  641.746894]  [] btrfs_dev_replace_start+0x279/0x3f0 [btrfs]
[  641.748282]  [] btrfs_ioctl+0x1869/0x2070 [btrfs]
[  641.749587]  [] ? pte_alloc_one+0x33/0x40
[  641.750850]  [] do_vfs_ioctl+0x96/0x590
[  641.752128]  [] ? __do_page_fault+0x181/0x450
[  641.753432]  [] SyS_ioctl+0x79/0x90
[  641.754663]  [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[  641.756037] Code: 00 48 c7 c2 38 56 19 a0 48 c7 c6 a8 c5 17 a0 e8 21 39 f7 
e0 45 85 ed 48 c7 83 68 02 00 00 00 00 00 00 48 89 d8 0f 84 03 ff ff ff <49> 83 
7c 24 68 00 74 40 c7 83 78 02 00 00 20 00 00 00 4c 89 a3
[  641.760392] RIP  [] scrub_setup_ctx.isra.19+0x1f6/0x260 
[btrfs]
[  641.761970]  RSP 
[  641.763190] CR2: 0068
[  641.767218] ---[ end trace f46d4e6a90bda310 ]---

The dereference happens at offset 0x68, which matches bdev in
btrfs_device, so this patch is my best guess at the moment. I'm not able
to reproduce it directly, so I need to wait for a rebuild and repeat.
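
The reasoning from fault address to struct member can be sketched as follows. The layout of demo_device is entirely made up (it is NOT the real struct btrfs_device); it only places a pointer member at offset 0x68 to show that reading that member through a NULL base pointer faults at address 0x68, which is the value CR2 reports in the trace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 0x68 bytes of earlier members, then the pointer. */
struct demo_device {
	unsigned char earlier_members[0x68];
	void *bdev;
};

_Static_assert(offsetof(struct demo_device, bdev) == 0x68,
	       "bdev sits at offset 0x68 in this made-up layout");

/* Address touched when a member at member_offset is read through base. */
uintptr_t demo_fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}

/* With a NULL base, the fault address equals the member offset. */
void demo_example(void)
{
	assert(demo_fault_address(0, offsetof(struct demo_device, bdev)) == 0x68);
}
```

Going the other way, matching a small fault address against member offsets (for example with pahole on the real struct) is what points at bdev here.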


Re: [PATCH 1/2] fstests: generic test for directory fsync after rename operation

2016-02-18 Thread Darrick J. Wong
On Thu, Feb 18, 2016 at 01:38:41PM +, Filipe Manana wrote:
> On Thu, Feb 18, 2016 at 1:30 AM, Dave Chinner  wrote:
> > On Mon, Feb 15, 2016 at 10:54:23AM +, fdman...@kernel.org wrote:
> >> From: Filipe Manana 
> >>
> >> Test that if we move one file between directories, fsync the parent
> >> directory of the old directory, power fail and remount the filesystem,
> >> the file is not lost and it's located at the destination directory.
> >>
> >> This is motivated by a bug found in btrfs, which is fixed by the patch
> >> (for the linux kernel) titled:
> >>
> >>   "Btrfs: fix file loss on log replay after renaming a file and fsync"
> >>
> >> Tested against ext3, ext4, xfs, f2fs and reiserfs.
> >>
> >> Signed-off-by: Filipe Manana 
> > 
> >> +# We expect our file foo to exist, have an entry in the new parent
> >> +# directory (c/) and not have anymore an entry in the old parent directory
> >> +# (a/b/).
> >> +[ -e $SCRATCH_MNT/a/b/foo ] && echo "File foo is still at directory a/b/"
> >> +[ -e $SCRATCH_MNT/c/foo ] || echo "File foo is not at directory c/"
> >> +
> >> +# The new file named bar should also exist.
> >> +[ -e $SCRATCH_MNT/a/bar ] || echo "File bar is missing"
> >
> > This can all be replaced simply by:
> >
> > ls -R $SCRATCH_MNT | _filter_scratch
> >
> > Because the golden image match will tell us if files are missing or
> > in the wrong place.
> 
> The problem with that is ext3/4 have the lost+found directory that
> xfs, btrfs, etc don't have.

XFS can have lost+found too, though this seems unlikely on the scratch mount.

> Do you mind about something like this:
> 
> # exclude lost+found directory specific to some filesystems (ext3/4)
> ls -R $SCRATCH_MNT | grep -v 'lost+found' | tr -s '\n' | _filter_scratch

Why not put "a" and "c" under $SCRATCH_MNT/test-335/?

--D

> 
> (since you usually dislike generic tests having any specific logic for
> specific filesystems)
> 
> Also do I need to remove _need_to_be_root for the 3 tests I submitted?
> I only noticed there was a submitted patch that kills that function
> after sending them.
> 
> thanks
> 
> 
> >
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > da...@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] fstests: generic test for directory fsync after rename operation

2016-02-18 Thread Filipe Manana
On Thu, Feb 18, 2016 at 4:43 PM, Darrick J. Wong
 wrote:
> On Thu, Feb 18, 2016 at 01:38:41PM +, Filipe Manana wrote:
>> On Thu, Feb 18, 2016 at 1:30 AM, Dave Chinner  wrote:
>> > On Mon, Feb 15, 2016 at 10:54:23AM +, fdman...@kernel.org wrote:
>> >> From: Filipe Manana 
>> >>
>> >> Test that if we move one file between directories, fsync the parent
>> >> directory of the old directory, power fail and remount the filesystem,
>> >> the file is not lost and it's located at the destination directory.
>> >>
>> >> This is motivated by a bug found in btrfs, which is fixed by the patch
>> >> (for the linux kernel) titled:
>> >>
>> >>   "Btrfs: fix file loss on log replay after renaming a file and fsync"
>> >>
>> >> Tested against ext3, ext4, xfs, f2fs and reiserfs.
>> >>
>> >> Signed-off-by: Filipe Manana 
>> > 
>> >> +# We expect our file foo to exist, have an entry in the new parent
>> >> +# directory (c/) and not have anymore an entry in the old parent directory
>> >> +# (a/b/).
>> >> +[ -e $SCRATCH_MNT/a/b/foo ] && echo "File foo is still at directory a/b/"
>> >> +[ -e $SCRATCH_MNT/c/foo ] || echo "File foo is not at directory c/"
>> >> +
>> >> +# The new file named bar should also exist.
>> >> +[ -e $SCRATCH_MNT/a/bar ] || echo "File bar is missing"
>> >
>> > This can all be replaced simply by:
>> >
>> > ls -R $SCRATCH_MNT | _filter_scratch
>> >
>> > Because the golden image match will tell us if files are missing or
>> > in the wrong place.
>>
>> The problem with that is ext3/4 have the lost+found directory that
>> xfs, btrfs, etc don't have.
>
> XFS can have lost+found too, though this seems unlikely on the scratch mount.
>
>> Do you mind about something like this:
>>
>> # exclude lost+found directory specific to some filesystems (ext3/4)
>> ls -R $SCRATCH_MNT | grep -v 'lost+found' | tr -s '\n' | _filter_scratch
>
> Why not put "a" and "c" under $SCRATCH_MNT/test-335/?

Would work as well. I was thinking earlier of just doing two ls -R
calls, one for a/ and other for c/.
Thanks Darrick.
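
A minimal sketch of the two alternatives discussed above, run against a
temporary directory rather than a real scratch mount. The helper names
(list_filtered, list_subdir), the MNT placeholder and the simulated
lost+found directory are assumptions for illustration only; a real test
would use $SCRATCH_MNT and _filter_scratch from fstests instead.

```shell
#!/bin/bash
# Stand-in for the scratch mount; in fstests this would be $SCRATCH_MNT.
mnt=$(mktemp -d)
trap 'rm -rf "$mnt"' EXIT

# Simulate the lost+found directory that ext3/4 creates at mkfs time, plus
# the layout the test expects after the rename (foo in c/, bar in a/).
mkdir -p "$mnt/lost+found" "$mnt/test-335/a/b" "$mnt/test-335/c"
touch "$mnt/test-335/a/bar" "$mnt/test-335/c/foo"

# Alternative 1: list the whole tree and drop every line mentioning
# lost+found; tr -s squeezes the blank lines that the filter leaves behind.
list_filtered() {
	ls -R "$1" | grep -v 'lost+found' | tr -s '\n' | sed "s#$1#MNT#"
}

# Alternative 2: keep the test files under their own subdirectory and list
# only that, so no filesystem-specific filtering is needed.
list_subdir() {
	ls -R "$1/test-335" | sed "s#$1#MNT#"
}

echo "Filtered listing of the whole tree:"
list_filtered "$mnt"
echo "Listing of the test subdirectory only:"
list_subdir "$mnt"
```

The second form keeps the golden output identical across filesystems
without the generic test having to know about lost+found at all.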

>
> --D
>
>>
>> (since you usually dislike generic tests having any specific logic for
>> specific filesystems)
>>
>> Also do I need to remove _need_to_be_root for the 3 tests I submitted?
>> I only noticed there was a submitted patch that kills that function
>> after sending them.
>>
>> thanks
>>
>>
>> >
>> > Cheers,
>> >
>> > Dave.
>> > --
>> > Dave Chinner
>> > da...@fromorbit.com


Re: [PATCH V3 2/2] btrfs-progs: Introduce device delete by devid

2016-02-18 Thread David Sterba
On Tue, Oct 06, 2015 at 05:33:10PM +0800, Anand Jain wrote:
> From: Anand Jain 
> 
> This patch introduces new option  for the command
> 
>   btrfs device delete [..]  
> 
> In a user reported issue on a 3-disk-RAID1, one disk failed with its
> SB unreadable. Now with this patch user will have a choice to delete
> the device using devid.
> 
> The other method we could do, is to match the input device_path
> to the available device_paths with in the kernel. But that won't
> work in all the cases, like what if user provided mapper path
> when the path within the kernel is a non-mapper path.
> 
> This patch depends on the below kernel patch for the new feature to work,
> however it will fail-back to the old interface for the kernel without the
> patch
> 
>   Btrfs: Introduce device delete by devid
> 
> Signed-off-by: Anand Jain 

Applied and refreshed.


Re: task btrfs-cleaner:770 blocked for more than 120 seconds.

2016-02-18 Thread Liu Bo
On Thu, Feb 18, 2016 at 01:35:24PM +0100, Christian Rohmann wrote:
> 
> 
> On 02/14/2016 11:42 PM, Roman Mamedov wrote:
> > FWIW I had a persistently repeating deadlock on 4.1 and 4.3, but
> > after upgrade to 4.4 it no longer happens.
> 
> 
> Apparently also with 4.4 there is some sort of blocking happening ...
> just at 38580:

OK, what does 'sysrq-w' say?

Thanks,

-liubo
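
For readers unfamiliar with it, 'sysrq-w' asks the kernel to log every
task in uninterruptible (blocked) state. A minimal sketch of collecting
that output from a shell, assuming root privileges; the helper name
dump_blocked_tasks and the default of 200 log lines are hypothetical
choices, not something requested in this thread.

```shell
#!/bin/bash
# Trigger the sysrq 'w' action (dump blocked tasks) and show the tail of
# the kernel log. dump_blocked_tasks is a hypothetical helper name.
dump_blocked_tasks() {
	local trigger=/proc/sysrq-trigger
	# Writing to the trigger file requires root.
	if [ ! -w "$trigger" ]; then
		echo "cannot write to $trigger (need root?)" >&2
		return 1
	fi
	echo w > "$trigger" || return 1
	# First argument: number of kernel log lines to show (assumed default 200).
	dmesg | tail -n "${1:-200}"
}
```

Running "dump_blocked_tasks 500" as root while btrfs-cleaner is hung
should capture the stacks of all blocked tasks, not only the one reported
by the hung task detector.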

> 
>  cut 
> 
> [Wed Feb 17 16:43:48 2016] INFO: task btrfs-cleaner:38580 blocked for
> more than 120 seconds.
> [Wed Feb 17 16:43:48 2016]   Not tainted 4.4.0-customkernel #1
> [Wed Feb 17 16:43:48 2016] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Wed Feb 17 16:43:48 2016] btrfs-cleaner   D 882c27295dc0 0
> 38580  2 0x
> [Wed Feb 17 16:43:48 2016]  882c16fa6480 88161a980280
> 882a3d744000 882a3d743df8
> [Wed Feb 17 16:43:48 2016]  8815fead7104 882c16fa6480
>  8815fead7108
> [Wed Feb 17 16:43:48 2016]  81559a31 8815fead7100
> 81559cba 8155b5a0
> [Wed Feb 17 16:43:48 2016] Call Trace:
> [Wed Feb 17 16:43:48 2016]  [] ? schedule+0x31/0x80
> [Wed Feb 17 16:43:48 2016]  [] ?
> schedule_preempt_disabled+0xa/0x10
> [Wed Feb 17 16:43:48 2016]  [] ?
> __mutex_lock_slowpath+0x90/0x110
> [Wed Feb 17 16:43:48 2016]  [] ? mutex_lock+0x1b/0x30
> [Wed Feb 17 16:43:48 2016]  [] ?
> btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
> [Wed Feb 17 16:43:48 2016]  [] ? __schedule+0x286/0x8f0
> [Wed Feb 17 16:43:48 2016]  [] ?
> cleaner_kthread+0x1a7/0x200 [btrfs]
> [Wed Feb 17 16:43:48 2016]  [] ?
> check_leaf+0x340/0x340 [btrfs]
> [Wed Feb 17 16:43:48 2016]  [] ? kthread+0xcf/0xf0
> [Wed Feb 17 16:43:48 2016]  [] ? kthread_park+0x50/0x50
> [Wed Feb 17 16:43:48 2016]  [] ? ret_from_fork+0x3f/0x70
> [Wed Feb 17 16:43:48 2016]  [] ? kthread_park+0x50/0x50
> 
> [Wed Feb 17 17:23:48 2016] INFO: task btrfs-cleaner:38580 blocked for
> more than 120 seconds.
> [Wed Feb 17 17:23:48 2016]   Not tainted 4.4.0-customkernel #1
> [Wed Feb 17 17:23:48 2016] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Wed Feb 17 17:23:48 2016] btrfs-cleaner   D 881627a35dc0 0
> 38580  2 0x
> [Wed Feb 17 17:23:48 2016]  882c16fa6480 88161a956f00
> 882a3d744000 882a3d743df8
> [Wed Feb 17 17:23:48 2016]  8815fead7104 882c16fa6480
>  8815fead7108
> [Wed Feb 17 17:23:48 2016]  81559a31 8815fead7100
> 81559cba 8155b5a0
> [Wed Feb 17 17:23:48 2016] Call Trace:
> [Wed Feb 17 17:23:48 2016]  [] ? schedule+0x31/0x80
> [Wed Feb 17 17:23:48 2016]  [] ?
> schedule_preempt_disabled+0xa/0x10
> [Wed Feb 17 17:23:48 2016]  [] ?
> __mutex_lock_slowpath+0x90/0x110
> [Wed Feb 17 17:23:48 2016]  [] ? mutex_lock+0x1b/0x30
> [Wed Feb 17 17:23:48 2016]  [] ?
> btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
> [Wed Feb 17 17:23:48 2016]  [] ? __schedule+0x286/0x8f0
> [Wed Feb 17 17:23:48 2016]  [] ?
> cleaner_kthread+0x1a7/0x200 [btrfs]
> [Wed Feb 17 17:23:48 2016]  [] ?
> check_leaf+0x340/0x340 [btrfs]
> [Wed Feb 17 17:23:48 2016]  [] ? kthread+0xcf/0xf0
> [Wed Feb 17 17:23:48 2016]  [] ? kthread_park+0x50/0x50
> [Wed Feb 17 17:23:48 2016]  [] ? ret_from_fork+0x3f/0x70
> [Wed Feb 17 17:23:48 2016]  [] ? kthread_park+0x50/0x50
> 
> [Wed Feb 17 17:57:48 2016] INFO: task btrfs-cleaner:38580 blocked for
> more than 120 seconds.
> [Wed Feb 17 17:57:48 2016]   Not tainted 4.4.0-customkernel #1
> [Wed Feb 17 17:57:48 2016] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Wed Feb 17 17:57:48 2016] btrfs-cleaner   D 881627a95dc0 0
> 38580  2 0x
> [Wed Feb 17 17:57:48 2016]  882c16fa6480 88161a980fc0
> 882a3d744000 882a3d743df8
> [Wed Feb 17 17:57:48 2016]  8815fead7104 882c16fa6480
>  8815fead7108
> [Wed Feb 17 17:57:48 2016]  81559a31 8815fead7100
> 81559cba 8155b5a0
> [Wed Feb 17 17:57:48 2016] Call Trace:
> [Wed Feb 17 17:57:48 2016]  [] ? schedule+0x31/0x80
> [Wed Feb 17 17:57:48 2016]  [] ?
> schedule_preempt_disabled+0xa/0x10
> [Wed Feb 17 17:57:48 2016]  [] ?
> __mutex_lock_slowpath+0x90/0x110
> [Wed Feb 17 17:57:48 2016]  [] ? mutex_lock+0x1b/0x30
> [Wed Feb 17 17:57:48 2016]  [] ?
> btrfs_delete_unused_bgs+0xee/0x3f0 [btrfs]
> [Wed Feb 17 17:57:48 2016]  [] ? __schedule+0x286/0x8f0
> [Wed Feb 17 17:57:48 2016]  [] ?
> cleaner_kthread+0x1a7/0x200 [btrfs]
> [Wed Feb 17 17:57:48 2016]  [] ?
> check_leaf+0x340/0x340 [btrfs]
> [Wed Feb 17 17:57:48 2016]  [] ? kthread+0xcf/0xf0
> [Wed Feb 17 17:57:48 2016]  [] ? kthread_park+0x50/0x50
> [Wed Feb 17 17:57:48 2016]  [] ? ret_from_fork+0x3f/0x70
> [Wed Feb 17 17:57:48 2016]  [] ? kthread_park+0x50/0x50
> 
> 
>  cut 

Re: Oops on Send

2016-02-18 Thread Henk Slager
On Thu, Feb 18, 2016 at 9:29 AM, Jan Alexander Steffens
 wrote:
> Greetings,
>
> I'm getting reproducible Oopses when attempting "btrfs send" on my
> root vol, quickly ending in a hard lockup of the machine. Both
> incremental send and non-incremental send crash, the former much
> earlier. The volume is in a LUKS container.
>
> The initial fault is always during LZO decompression, but scrub does
> not find any errors. Could this be invalid compressed data with good
> checksums? "btrfs check" finds bad extent backrefs which --repair
> claims to fix, but they reappear on a subsequent check.
>
> Is there a way to find out which files' extents are damaged so I can
> delete them and complete my backup?

man btrfs-send
 -v
  Enable verbose debug output. Each occurrence of this option
increases the verbose level more.


[PATCH] Btrfs: fix deadlock between direct IO reads and buffered writes

2016-02-18 Thread fdmanana
From: Filipe Manana 

While running a test with a mix of buffered IO and direct IO against
the same files I hit a deadlock reported by the following trace:

[11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 seconds.
[11642.142452]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.146332] kworker/u32:3   D 880230ef7988 [11642.147737] systemd-journald[571]: Sent WATCHDOG=1 notification.
[11642.149771] 0 15282  2 0x
[11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[11642.154074]  880230ef7988 0246 00014ec0 88023ec94ec0
[11642.156722]  880233fe8f80 880230ef8000 88023ec94ec0 7fff
[11642.159205]  0002 8147b7f9 880230ef79a0 8147b541
[11642.161403] Call Trace:
[11642.162129]  [] ? bit_wait+0x2f/0x2f
[11642.163396]  [] schedule+0x82/0x9a
[11642.164871]  [] schedule_timeout+0x43/0x109
[11642.167020]  [] ? bit_wait+0x2f/0x2f
[11642.167931]  [] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.182320]  [] ? trace_hardirqs_on+0xd/0xf
[11642.183762]  [] ? timekeeping_get_ns+0xe/0x33
[11642.185308]  [] ? ktime_get+0x41/0x52
[11642.186782]  [] io_schedule_timeout+0xa0/0x102
[11642.188217]  [] ? io_schedule_timeout+0xa0/0x102
[11642.189626]  [] bit_wait_io+0x1b/0x39
[11642.190803]  [] __wait_on_bit_lock+0x4c/0x90
[11642.192158]  [] __lock_page+0x66/0x68
[11642.193379]  [] ? autoremove_wake_function+0x3a/0x3a
[11642.194831]  [] lock_page+0x31/0x34 [btrfs]
[11642.197068]  [] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.199188]  [] extent_writepages+0x4b/0x5c [btrfs]
[11642.200723]  [] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.202465]  [] btrfs_writepages+0x28/0x2a [btrfs]
[11642.203836]  [] do_writepages+0x23/0x2c
[11642.205624]  [] __filemap_fdatawrite_range+0x5a/0x61
[11642.207057]  [] filemap_fdatawrite_range+0x13/0x15
[11642.208529]  [] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.210375]  [] ? btrfs_scrubparity_helper+0x140/0x33a [btrfs]
[11642.212132]  [] btrfs_run_ordered_extent_work+0x25/0x34 [btrfs]
[11642.213837]  [] btrfs_scrubparity_helper+0x15c/0x33a [btrfs]
[11642.215457]  [] btrfs_flush_delalloc_helper+0xe/0x10 [btrfs]
[11642.217095]  [] process_one_work+0x256/0x48b
[11642.218324]  [] worker_thread+0x1f5/0x2a7
[11642.219466]  [] ? rescuer_thread+0x289/0x289
[11642.220801]  [] kthread+0xd4/0xdc
[11642.222032]  [] ? kthread_parkme+0x24/0x24
[11642.223190]  [] ret_from_fork+0x3f/0x70
[11642.224394]  [] ? kthread_parkme+0x24/0x24
[11642.226295] 2 locks held by kworker/u32:3/15282:
[11642.227273]  #0:  ("%s-%s""btrfs", name){.+}, at: [] process_one_work+0x165/0x48b
[11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: [] process_one_work+0x165/0x48b
[11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 seconds.
[11642.232872]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.235776] kworker/u32:8   D 88020de5f848 0 15289  2 0x
[11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
[11642.238670]  88020de5f848 0246 00014ec0 88023ed54ec0
[11642.240475]  88021b1ece40 88020de6 88023ed54ec0 7fff
[11642.242154]  0002 8147b7f9 88020de5f860 8147b541
[11642.243715] Call Trace:
[11642.244390]  [] ? bit_wait+0x2f/0x2f
[11642.245432]  [] schedule+0x82/0x9a
[11642.246392]  [] schedule_timeout+0x43/0x109
[11642.247479]  [] ? bit_wait+0x2f/0x2f
[11642.248551]  [] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.249968]  [] ? trace_hardirqs_on+0xd/0xf
[11642.251043]  [] ? timekeeping_get_ns+0xe/0x33
[11642.252202]  [] ? ktime_get+0x41/0x52
[11642.253210]  [] io_schedule_timeout+0xa0/0x102
[11642.254307]  [] ? io_schedule_timeout+0xa0/0x102
[11642.256118]  [] bit_wait_io+0x1b/0x39
[11642.257131]  [] __wait_on_bit_lock+0x4c/0x90
[11642.258200]  [] __lock_page+0x66/0x68
[11642.259168]  [] ? autoremove_wake_function+0x3a/0x3a
[11642.260516]  [] lock_page+0x31/0x34 [btrfs]
[11642.261841]  [] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.263531]  [] extent_writepages+0x4b/0x5c [btrfs]
[11642.264747]  [] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.266148]  [] btrfs_writepages+0x28/0x2a [btrfs]
[11642.267264]  [] do_writepages+0x23/0x2c
[11642.268280]  [] __writeback_single_inode+0xda/0x5ba
[11642.269407]  [] writeback_sb_inodes+0x27b/0x43d
[11642.270476]  [] __writeback_inodes_wb+0x76/0xae
[11642.271547]  [] wb_writeback+0x19e/0x41c
[11642.272588]  [] wb_workfn+0x201/0x341
[11642.273523]  [] ? wb_workfn+0x201/0x341
[11642.274479]  [] process_one_work+0x256/0x48b
[11642.275497]  [] worker_thread+0x1f5/0x2a7
[11642.276518]  [] ? rescuer_thread+0x289/0x289
[11642.277

Re: Major HDD performance degradation on btrfs receive

2016-02-18 Thread Henk Slager
On Tue, Feb 16, 2016 at 5:44 AM, Nazar Mokrynskyi  wrote:
> I have 2 SSD with BTRFS filesystem (RAID) on them and several subvolumes.
> Each 15 minutes I'm creating read-only snapshot of subvolumes /root, /home
> and /web inside /backup.
> After this I'm searching for last common subvolume on /backup_hdd, sending
> difference between latest common snapshot and simply latest snapshot to
> /backup_hdd.
> On top of all above there is snapshots rotation, so that /backup contains
> much less snapshots than /backup_hdd.
>
> I'm using this setup for last 7 months or so and this is luckily the longest
> period when I had no problems with BTRFS at all.
> However, last 2+ months btrfs receive command loads HDD so much that I can't
> even get list of directories in it.
> This happens even if diff between snapshots is really small.
> HDD contains 2 filesystems - mentioned BTRFS and ext4 for other files, so I
> can't even play mp3 file from ext4 filesystem while btrfs receive is
> running.
> Since I'm running everything each 15 minutes this is a real headache.
>
> My guess is that performance hit might be caused by filesystem fragmentation
> even though there is more than enough empty space. But I'm not sure how to
> properly check this and can't, obviously, run defragmentation on read-only
> subvolumes.
>
> I'll be thankful for anything that might help to identify and resolve this
> issue.
>
> ~> uname -a
> Linux nazar-pc 4.5.0-rc4-haswell #1 SMP Tue Feb 16 02:09:13 CET 2016 x86_64
> x86_64 x86_64 GNU/Linux
>
> ~> btrfs --version
> btrfs-progs v4.4
>
> ~> sudo btrfs fi show
> Label: none  uuid: 5170aca4-061a-4c6c-ab00-bd7fc8ae6030
> Total devices 2 FS bytes used 71.00GiB
> devid1 size 111.30GiB used 111.30GiB path /dev/sdb2
> devid2 size 111.30GiB used 111.29GiB path /dev/sdc2
>
> Label: 'Backup'  uuid: 40b8240a-a0a2-4034-ae55-f8558c0343a8
> Total devices 1 FS bytes used 252.54GiB
> devid1 size 800.00GiB used 266.08GiB path /dev/sda1
>
> ~> sudo btrfs fi df /
> Data, RAID0: total=214.56GiB, used=69.10GiB
> System, RAID1: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=4.00GiB, used=1.87GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> ~> sudo btrfs fi df /backup_hdd
> Data, single: total=245.01GiB, used=243.61GiB
> System, DUP: total=32.00MiB, used=48.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=10.50GiB, used=8.93GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Relevant mount options:
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030/ btrfs
> compress=lzo,noatime,relatime,ssd,subvol=/root0 1
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030/home btrfs
> compress=lzo,noatime,relatime,ssd,subvol=/home 01
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030/backup btrfs
> compress=lzo,noatime,relatime,ssd,subvol=/backup 01
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030/web btrfs
> compress=lzo,noatime,relatime,ssd,subvol=/web 01
> UUID=40b8240a-a0a2-4034-ae55-f8558c0343a8/backup_hdd btrfs
> compress=lzo,noatime,relatime,noexec 01

As already indicated by Duncan, the amount of snapshots might be just
too much. The fragmentation on the HDD might have become very high. If
there is limited amount of RAM in the system (so limited caching), too
much time is lost in seeks. In addition:

 compress=lzo
this also increases the chance of scattering fragments and fragmentation.

 noatime,relatime
I am not sure why you have this. Hopefully you have the actual mount
listed as   noatime

You could use the principles of the tool/package called  snapper  to
do a sort of non-linear snapshot thinning: the further back in time you
go, the fewer snapshots you keep per timeframe (a much coarser
granularity).

You could use skinny metadata (recreate the fs with newer tools or use
btrfstune -x on /dev/sda1). I think at the moment this flag is not
enabled on /dev/sda1

If you put just 1 btrfs fs on the hdd (so move all the content from
the ext4 fs into the btrfs fs) you might get better overall
performance. I assume the ext4 fs is on the second (slower) part of
the HDD, and I think that is a disadvantage.
But you probably have reasons for why the setup is like it is.
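
A small sketch for checking the atime point above: it pulls the
atime-related options out of an option string and reads the options the
kernel actually applied from /proc/mounts. The helper names atime_opts
and effective_opts are hypothetical, and the list of option names
matched is an assumption based on the common mount options.

```shell
#!/bin/bash
# Print the atime-related options found in a comma-separated option string.
# atime_opts is a hypothetical helper, not part of any btrfs tool.
atime_opts() {
	local found="" o
	local -a list
	IFS=',' read -ra list <<< "$1"
	for o in "${list[@]}"; do
		case $o in
		noatime|relatime|strictatime|atime|nodiratime|diratime|lazytime)
			found="$found $o" ;;
		esac
	done
	echo "${found# }"
}

# Print the options in effect for a mount point, taken from /proc/mounts
# (fields: device, mount point, fs type, options, ...). If the mount point
# is listed more than once, the last entry is used.
effective_opts() {
	awk -v mp="$1" '$2 == mp { o = $4 } END { print o }' /proc/mounts
}

# Options as written in the fstab quoted above: both flags are requested.
atime_opts "compress=lzo,noatime,relatime,ssd,subvol=/root"

# Options actually applied to / on the running system, if available.
if [ -r /proc/mounts ]; then
	atime_opts "$(effective_opts /)"
fi
```

The first call prints "noatime relatime", showing that the fstab line
asks for both; the second call shows what the running system ended up
with.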


Re: [PATCH] Btrfs: fix deadlock between direct IO reads and buffered writes

2016-02-18 Thread Liu Bo
On Thu, Feb 18, 2016 at 05:42:50PM +, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> While running a test with a mix of buffered IO and direct IO against
> the same files I hit a deadlock reported by the following trace:
> 
> [11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 
> seconds.
> [11642.142452]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
> [11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [11642.146332] kworker/u32:3   D 880230ef7988 [11642.147737] 
> systemd-journald[571]: Sent WATCHDOG=1 notification.
> [11642.149771] 0 15282  2 0x
> [11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper 
> [btrfs]
> [11642.154074]  880230ef7988 0246 00014ec0 
> 88023ec94ec0
> [11642.156722]  880233fe8f80 880230ef8000 88023ec94ec0 
> 7fff
> [11642.159205]  0002 8147b7f9 880230ef79a0 
> 8147b541
> [11642.161403] Call Trace:
> [11642.162129]  [] ? bit_wait+0x2f/0x2f
> [11642.163396]  [] schedule+0x82/0x9a
> [11642.164871]  [] schedule_timeout+0x43/0x109
> [11642.167020]  [] ? bit_wait+0x2f/0x2f
> [11642.167931]  [] ? trace_hardirqs_on_caller+0x17b/0x197
> [11642.182320]  [] ? trace_hardirqs_on+0xd/0xf
> [11642.183762]  [] ? timekeeping_get_ns+0xe/0x33
> [11642.185308]  [] ? ktime_get+0x41/0x52
> [11642.186782]  [] io_schedule_timeout+0xa0/0x102
> [11642.188217]  [] ? io_schedule_timeout+0xa0/0x102
> [11642.189626]  [] bit_wait_io+0x1b/0x39
> [11642.190803]  [] __wait_on_bit_lock+0x4c/0x90
> [11642.192158]  [] __lock_page+0x66/0x68
> [11642.193379]  [] ? autoremove_wake_function+0x3a/0x3a
> [11642.194831]  [] lock_page+0x31/0x34 [btrfs]
> [11642.197068]  [] 
> extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
> [11642.199188]  [] extent_writepages+0x4b/0x5c [btrfs]
> [11642.200723]  [] ? btrfs_writepage_start_hook+0xce/0xce 
> [btrfs]
> [11642.202465]  [] btrfs_writepages+0x28/0x2a [btrfs]
> [11642.203836]  [] do_writepages+0x23/0x2c
> [11642.205624]  [] __filemap_fdatawrite_range+0x5a/0x61
> [11642.207057]  [] filemap_fdatawrite_range+0x13/0x15
> [11642.208529]  [] btrfs_start_ordered_extent+0xd0/0x1a1 
> [btrfs]
> [11642.210375]  [] ? btrfs_scrubparity_helper+0x140/0x33a 
> [btrfs]
> [11642.212132]  [] btrfs_run_ordered_extent_work+0x25/0x34 
> [btrfs]
> [11642.213837]  [] btrfs_scrubparity_helper+0x15c/0x33a 
> [btrfs]
> [11642.215457]  [] btrfs_flush_delalloc_helper+0xe/0x10 
> [btrfs]
> [11642.217095]  [] process_one_work+0x256/0x48b
> [11642.218324]  [] worker_thread+0x1f5/0x2a7
> [11642.219466]  [] ? rescuer_thread+0x289/0x289
> [11642.220801]  [] kthread+0xd4/0xdc
> [11642.222032]  [] ? kthread_parkme+0x24/0x24
> [11642.223190]  [] ret_from_fork+0x3f/0x70
> [11642.224394]  [] ? kthread_parkme+0x24/0x24
> [11642.226295] 2 locks held by kworker/u32:3/15282:
> [11642.227273]  #0:  ("%s-%s""btrfs", name){.+}, at: [] 
> process_one_work+0x165/0x48b
> [11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: [] 
> process_one_work+0x165/0x48b
> [11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 
> seconds.
> [11642.232872]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
> [11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [11642.235776] kworker/u32:8   D 88020de5f848 0 15289  2 
> 0x
> [11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
> [11642.238670]  88020de5f848 0246 00014ec0 
> 88023ed54ec0
> [11642.240475]  88021b1ece40 88020de6 88023ed54ec0 
> 7fff
> [11642.242154]  0002 8147b7f9 88020de5f860 
> 8147b541
> [11642.243715] Call Trace:
> [11642.244390]  [] ? bit_wait+0x2f/0x2f
> [11642.245432]  [] schedule+0x82/0x9a
> [11642.246392]  [] schedule_timeout+0x43/0x109
> [11642.247479]  [] ? bit_wait+0x2f/0x2f
> [11642.248551]  [] ? trace_hardirqs_on_caller+0x17b/0x197
> [11642.249968]  [] ? trace_hardirqs_on+0xd/0xf
> [11642.251043]  [] ? timekeeping_get_ns+0xe/0x33
> [11642.252202]  [] ? ktime_get+0x41/0x52
> [11642.253210]  [] io_schedule_timeout+0xa0/0x102
> [11642.254307]  [] ? io_schedule_timeout+0xa0/0x102
> [11642.256118]  [] bit_wait_io+0x1b/0x39
> [11642.257131]  [] __wait_on_bit_lock+0x4c/0x90
> [11642.258200]  [] __lock_page+0x66/0x68
> [11642.259168]  [] ? autoremove_wake_function+0x3a/0x3a
> [11642.260516]  [] lock_page+0x31/0x34 [btrfs]
> [11642.261841]  [] 
> extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
> [11642.263531]  [] extent_writepages+0x4b/0x5c [btrfs]
> [11642.264747]  [] ? btrfs_writepage_start_hook+0xce/0xce 
> [btrfs]
> [11642.266148]  [] btrfs_writepages+0x28/0x2a [btrfs]
> [11642.267264]  [] do_writepages+0x23/0x2c
> [11642.268280]  [] __writeback_single_inode+0xda/0x5ba
> [11642.269407]  [] writeback_sb_inodes+0x27b/0x43d
> [11642.270476]  [] __writeback_inodes_wb+0

Re: [PATCH] Btrfs: fix deadlock between direct IO reads and buffered writes

2016-02-18 Thread Filipe Manana
On Thu, Feb 18, 2016 at 7:53 PM, Liu Bo  wrote:
> On Thu, Feb 18, 2016 at 05:42:50PM +, fdman...@kernel.org wrote:
>> From: Filipe Manana 
>>
>> While running a test with a mix of buffered IO and direct IO against
>> the same files I hit a deadlock reported by the following trace:
>>
>> [11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 
>> seconds.
>> [11642.142452]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
>> [11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [11642.146332] kworker/u32:3   D 880230ef7988 [11642.147737] 
>> systemd-journald[571]: Sent WATCHDOG=1 notification.
>> [11642.149771] 0 15282  2 0x
>> [11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper 
>> [btrfs]
>> [11642.154074]  880230ef7988 0246 00014ec0 
>> 88023ec94ec0
>> [11642.156722]  880233fe8f80 880230ef8000 88023ec94ec0 
>> 7fff
>> [11642.159205]  0002 8147b7f9 880230ef79a0 
>> 8147b541
>> [11642.161403] Call Trace:
>> [11642.162129]  [] ? bit_wait+0x2f/0x2f
>> [11642.163396]  [] schedule+0x82/0x9a
>> [11642.164871]  [] schedule_timeout+0x43/0x109
>> [11642.167020]  [] ? bit_wait+0x2f/0x2f
>> [11642.167931]  [] ? trace_hardirqs_on_caller+0x17b/0x197
>> [11642.182320]  [] ? trace_hardirqs_on+0xd/0xf
>> [11642.183762]  [] ? timekeeping_get_ns+0xe/0x33
>> [11642.185308]  [] ? ktime_get+0x41/0x52
>> [11642.186782]  [] io_schedule_timeout+0xa0/0x102
>> [11642.188217]  [] ? io_schedule_timeout+0xa0/0x102
>> [11642.189626]  [] bit_wait_io+0x1b/0x39
>> [11642.190803]  [] __wait_on_bit_lock+0x4c/0x90
>> [11642.192158]  [] __lock_page+0x66/0x68
>> [11642.193379]  [] ? autoremove_wake_function+0x3a/0x3a
>> [11642.194831]  [] lock_page+0x31/0x34 [btrfs]
>> [11642.197068]  [] 
>> extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
>> [11642.199188]  [] extent_writepages+0x4b/0x5c [btrfs]
>> [11642.200723]  [] ? btrfs_writepage_start_hook+0xce/0xce 
>> [btrfs]
>> [11642.202465]  [] btrfs_writepages+0x28/0x2a [btrfs]
>> [11642.203836]  [] do_writepages+0x23/0x2c
>> [11642.205624]  [] __filemap_fdatawrite_range+0x5a/0x61
>> [11642.207057]  [] filemap_fdatawrite_range+0x13/0x15
>> [11642.208529]  [] btrfs_start_ordered_extent+0xd0/0x1a1 
>> [btrfs]
>> [11642.210375]  [] ? btrfs_scrubparity_helper+0x140/0x33a 
>> [btrfs]
>> [11642.212132]  [] btrfs_run_ordered_extent_work+0x25/0x34 
>> [btrfs]
>> [11642.213837]  [] btrfs_scrubparity_helper+0x15c/0x33a 
>> [btrfs]
>> [11642.215457]  [] btrfs_flush_delalloc_helper+0xe/0x10 
>> [btrfs]
>> [11642.217095]  [] process_one_work+0x256/0x48b
>> [11642.218324]  [] worker_thread+0x1f5/0x2a7
>> [11642.219466]  [] ? rescuer_thread+0x289/0x289
>> [11642.220801]  [] kthread+0xd4/0xdc
>> [11642.222032]  [] ? kthread_parkme+0x24/0x24
>> [11642.223190]  [] ret_from_fork+0x3f/0x70
>> [11642.224394]  [] ? kthread_parkme+0x24/0x24
>> [11642.226295] 2 locks held by kworker/u32:3/15282:
>> [11642.227273]  #0:  ("%s-%s""btrfs", name){.+}, at: 
>> [] process_one_work+0x165/0x48b
>> [11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: 
>> [] process_one_work+0x165/0x48b
>> [11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 
>> seconds.
>> [11642.232872]   Not tainted 4.4.0-rc6-btrfs-next-21+ #1
>> [11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [11642.235776] kworker/u32:8   D 88020de5f848 0 15289  2 
>> 0x
>> [11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
>> [11642.238670]  88020de5f848 0246 00014ec0 
>> 88023ed54ec0
>> [11642.240475]  88021b1ece40 88020de6 88023ed54ec0 
>> 7fff
>> [11642.242154]  0002 8147b7f9 88020de5f860 
>> 8147b541
>> [11642.243715] Call Trace:
>> [11642.244390]  [] ? bit_wait+0x2f/0x2f
>> [11642.245432]  [] schedule+0x82/0x9a
>> [11642.246392]  [] schedule_timeout+0x43/0x109
>> [11642.247479]  [] ? bit_wait+0x2f/0x2f
>> [11642.248551]  [] ? trace_hardirqs_on_caller+0x17b/0x197
>> [11642.249968]  [] ? trace_hardirqs_on+0xd/0xf
>> [11642.251043]  [] ? timekeeping_get_ns+0xe/0x33
>> [11642.252202]  [] ? ktime_get+0x41/0x52
>> [11642.253210]  [] io_schedule_timeout+0xa0/0x102
>> [11642.254307]  [] ? io_schedule_timeout+0xa0/0x102
>> [11642.256118]  [] bit_wait_io+0x1b/0x39
>> [11642.257131]  [] __wait_on_bit_lock+0x4c/0x90
>> [11642.258200]  [] __lock_page+0x66/0x68
>> [11642.259168]  [] ? autoremove_wake_function+0x3a/0x3a
>> [11642.260516]  [] lock_page+0x31/0x34 [btrfs]
>> [11642.261841]  [] 
>> extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
>> [11642.263531]  [] extent_writepages+0x4b/0x5c [btrfs]
>> [11642.264747]  [] ? btrfs_writepage_start_hook+0xce/0xce 
>> [btrfs]
>> [11642.266148]  [] btrfs_writepages+0x28/0x2a [btrfs]
>> [11642.267264]  [] do_writepages+0x23/0x2c
>

[PATCH v2] fstests: btrfs, test directory fsync after deleting snapshots

2016-02-18 Thread fdmanana
From: Filipe Manana 

Test that if we fsync a directory that had a snapshot entry in it that
was deleted and crash, the next time we mount the filesystem, the log
replay procedure will not fail and the snapshot is not present anymore.

This issue is fixed by the following patch for the linux kernel:

  "Btrfs: fix unreplayable log after snapshot delete + parent dir fsync"

Signed-off-by: Filipe Manana 
Tested-by: Liu Bo 
Reviewed-by: Liu Bo 
---

V2: Removed the call to _need_to_be_root since there's a patch around to
kill it.
Removed explicit snapshot existence tests with bash and replaced
with calls to ls -R.

 tests/btrfs/118 | 98 +
 tests/btrfs/118.out | 15 
 tests/btrfs/group   |  1 +
 3 files changed, 114 insertions(+)
 create mode 100755 tests/btrfs/118
 create mode 100644 tests/btrfs/118.out

diff --git a/tests/btrfs/118 b/tests/btrfs/118
new file mode 100755
index 0000000..d0a1f2e
--- /dev/null
+++ b/tests/btrfs/118
@@ -0,0 +1,98 @@
+#! /bin/bash
+# FSQA Test No. 118
+#
+# Test that if we fsync a directory that had a snapshot entry in it that was
+# deleted and crash, the next time we mount the filesystem, the log replay
+# procedure will not fail and the snapshot is not present anymore.
+#
+#---
+#
+# Copyright (C) 2016 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   _cleanup_flakey
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_init_flakey
+_mount_flakey
+
+# Create a snapshot at the root of our filesystem (mount point path), delete it,
+# fsync the mount point path, crash and mount to replay the log. This should
+# succeed and after the filesystem is mounted the snapshot should not be visible
+# anymore.
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap1
+_run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap1
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT
+
+echo "Filesystem content before first power failure:"
+ls -R $SCRATCH_MNT | _filter_scratch
+
+_flakey_drop_and_remount
+
+echo "Filesystem content after first power failure:"
+# Must match what we had before the power failure, we don't expect to see the
+# snapshot anymore.
+ls -R $SCRATCH_MNT | _filter_scratch
+
+# Similar scenario as above, but this time the snapshot is created inside a
+# directory and not directly under the root (mount point path).
+mkdir $SCRATCH_MNT/testdir
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/testdir/snap2
+_run_btrfs_util_prog subvolume delete $SCRATCH_MNT/testdir/snap2
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir
+
+echo "Filesystem content before second power failure:"
+ls -R $SCRATCH_MNT | _filter_scratch
+
+_flakey_drop_and_remount
+
+echo "Filesystem content after second power failure:"
+# Must match what we had before the power failure, we don't expect to see the
+# snapshot anymore.
+ls -R $SCRATCH_MNT | _filter_scratch
+
+_unmount_flakey
+
+status=0
+exit
diff --git a/tests/btrfs/118.out b/tests/btrfs/118.out
new file mode 100644
index 000..fee12ad
--- /dev/null
+++ b/tests/btrfs/118.out
@@ -0,0 +1,15 @@
+QA output created by 118
+Filesystem content before first power failure:
+SCRATCH_MNT:
+Filesystem content after first power failure:
+SCRATCH_MNT:
+Filesystem content before second power failure:
+SCRATCH_MNT:
+testdir
+
+SCRATCH_MNT/testdir:
+Filesystem content after second power failure:
+SCRATCH_MNT:
+testdir
+
+SCRATCH_MNT/testdir:
diff --git a/tests/btrfs/group b/tests/btrfs/group
index f74ffbb..a2fa412 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -118,3 +118,4 @@
 115 auto qgroup
 116 auto quick metadata
 117 auto quick send clone
+118 auto quick snapshot m

[PATCH 1/2 v2] fstests: generic test for directory fsync after rename operation

2016-02-18 Thread fdmanana
From: Filipe Manana 

Test that if we move one file between directories, fsync the parent
directory of the old directory, power fail and remount the filesystem,
the file is not lost and it's located at the destination directory.

This is motivated by a bug found in btrfs, which is fixed by the patch
(for the linux kernel) titled:

  "Btrfs: fix file loss on log replay after renaming a file and fsync"

Tested against ext3, ext4, xfs, f2fs and reiserfs.

Signed-off-by: Filipe Manana 
---

V2: Removed the call to _need_to_be_root since there's a patch around to
kill it.
Removed explicit file existence tests with bash and replaced them with
calls to ls -R.

 tests/generic/335 | 96 +++
 tests/generic/335.out | 19 ++
 tests/generic/group   |  1 +
 3 files changed, 116 insertions(+)
 create mode 100755 tests/generic/335
 create mode 100644 tests/generic/335.out

diff --git a/tests/generic/335 b/tests/generic/335
new file mode 100755
index 000..0f79b6d
--- /dev/null
+++ b/tests/generic/335
@@ -0,0 +1,96 @@
+#! /bin/bash
+# FSQA Test No. 335
+#
+# Test that if we move one file between directories, fsync the parent directory
+# of the old directory, power fail and remount the filesystem, the file is not
+# lost and it's located at the destination directory.
+#
+#---
+#
+# Copyright (C) 2016 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   _cleanup_flakey
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_init_flakey
+_mount_flakey
+
+# Create our test directories and the file we will later check if it has
+# disappeared.
+mkdir -p $SCRATCH_MNT/a/b
+mkdir $SCRATCH_MNT/c
+touch $SCRATCH_MNT/a/b/foo
+
+# Make sure everything is durably persisted.
+sync
+
+# Now move our test file into a new parent directory.
+mv $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/c/
+
+# Create a new file inside the parent directory of the directory where our test
+# file foo was previously at. This is just to ensure the fsync we do next
+# against that parent directory actually does something and it's not a noop.
+touch $SCRATCH_MNT/a/bar
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
+
+echo "Filesystem content before power failure:"
+ls -R $SCRATCH_MNT/a $SCRATCH_MNT/c | _filter_scratch
+
+# Simulate a power failure / crash and remount the filesystem, so that the
+# journal/log is replayed.
+_flakey_drop_and_remount
+
+# We expect our file foo to exist, have an entry in the new parent
+# directory (c/) and no longer have an entry in the old parent directory
+# (a/b/).
+# The new file named bar should also exist.
+echo "Filesystem content after power failure:"
+# Must match what we had before the power failure.
+ls -R $SCRATCH_MNT/a $SCRATCH_MNT/c | _filter_scratch
+
+_unmount_flakey
+
+status=0
+exit
diff --git a/tests/generic/335.out b/tests/generic/335.out
new file mode 100644
index 000..bd38d75
--- /dev/null
+++ b/tests/generic/335.out
@@ -0,0 +1,19 @@
+QA output created by 335
+Filesystem content before power failure:
+SCRATCH_MNT/a:
+b
+bar
+
+SCRATCH_MNT/a/b:
+
+SCRATCH_MNT/c:
+foo
+Filesystem content after power failure:
+SCRATCH_MNT/a:
+b
+bar
+
+SCRATCH_MNT/a/b:
+
+SCRATCH_MNT/c:
+foo
diff --git a/tests/generic/group b/tests/generic/group
index 5f699ce..f270edb 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -337,3 +337,4 @@
 332 auto quick clone
 333 auto clone
 334 auto clone
+335 auto quick metadata
-- 
2.7.0.rc3



[PATCH 2/2 v2] fstests: generic test for file fsync after rename operation

2016-02-18 Thread fdmanana
From: Filipe Manana 

Test that if we have a file F1 with two links, one in a directory A and
the other in directory B, if we remove the link in directory B, move some
other file F2 from directory B into directory C, fsync inode F1, power
fail and remount the filesystem, file F2 exists and is located only in
directory C.

This is motivated by a bug found in btrfs, which is fixed by the patch
(for the linux kernel) titled:

   "Btrfs: fix file loss on log replay after renaming a file and fsync"

Tested against ext3, ext4, xfs, f2fs and reiserfs.

Signed-off-by: Filipe Manana 
---

V2: Removed the call to _need_to_be_root since there's a patch around to
kill it.
Removed explicit file existence tests with bash and replaced them
with calls to ls -R.

 tests/generic/336 | 96 +++
 tests/generic/336.out | 17 +
 tests/generic/group   |  1 +
 3 files changed, 114 insertions(+)
 create mode 100755 tests/generic/336
 create mode 100644 tests/generic/336.out

diff --git a/tests/generic/336 b/tests/generic/336
new file mode 100755
index 000..acf9856
--- /dev/null
+++ b/tests/generic/336
@@ -0,0 +1,96 @@
+#! /bin/bash
+# FSQA Test No. 336
+#
+# Test that if we have a file F1 with two links, one in a directory A and the
+# other in directory B, if we remove the link in directory B, move some other
+# file F2 from directory B into directory C, fsync inode F1, power fail and
+# remount the filesystem, file F2 exists and is located only in directory C.
+#
+#---
+#
+# Copyright (C) 2016 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   _cleanup_flakey
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_init_flakey
+_mount_flakey
+
+# Create our test directories and the file we will later check if it has
+# disappeared (file bar).
+mkdir $SCRATCH_MNT/a
+mkdir $SCRATCH_MNT/b
+mkdir $SCRATCH_MNT/c
+touch $SCRATCH_MNT/a/foo
+ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
+touch $SCRATCH_MNT/b/bar
+
+# Make sure everything is durably persisted.
+sync
+
+# Now delete one of the hard links of file foo and move file bar into c/
+unlink $SCRATCH_MNT/b/foo_link
+mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
+
+# Now fsync file foo.
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo
+
+echo "Filesystem content before power failure:"
+ls -R $SCRATCH_MNT/a $SCRATCH_MNT/b $SCRATCH_MNT/c | _filter_scratch
+
+# Simulate a power failure / crash and remount the filesystem, so that the
+# journal/log is replayed.
+_flakey_drop_and_remount
+
+# We expect that after the journal/log was replayed, we no longer have the link
+# foo_link and file bar was moved from directory b/ to directory c/.
+echo "Filesystem content after power failure:"
+# Must match what we had before the power failure.
+ls -R $SCRATCH_MNT/a $SCRATCH_MNT/b $SCRATCH_MNT/c | _filter_scratch
+
+_unmount_flakey
+
+status=0
+exit
diff --git a/tests/generic/336.out b/tests/generic/336.out
new file mode 100644
index 000..6c82fc4
--- /dev/null
+++ b/tests/generic/336.out
@@ -0,0 +1,17 @@
+QA output created by 336
+Filesystem content before power failure:
+SCRATCH_MNT/a:
+foo
+
+SCRATCH_MNT/b:
+
+SCRATCH_MNT/c:
+bar
+Filesystem content after power failure:
+SCRATCH_MNT/a:
+foo
+
+SCRATCH_MNT/b:
+
+SCRATCH_MNT/c:
+bar
diff --git a/tests/generic/group b/tests/generic/group
index f270edb..a47e23d 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -338,3 +338,4 @@
 333 auto clone
 334 auto clone
 335 auto quick metadata
+336 auto quick metadata
-- 
2.7.0.rc3


Re: RAID 6 full, but there is still space left on some devices

2016-02-18 Thread Henk Slager
On Thu, Feb 18, 2016 at 3:03 AM, Qu Wenruo  wrote:
>
>
> Dan Blazejewski wrote on 2016/02/17 18:04 -0500:
>>
>> Hello,
>>
>> I upgraded my kernel to 4.4.2, and btrfs-progs to 4.4. I also added
>> another 4TB disk and kicked off a full balance (currently 7x4TB
>> RAID6). I'm interested to see what an additional drive will do to
>> this. I'll also have to wait and see if a full system balance on a
>> newer version of BTRFS tools does the trick or not.
>>
>> I also noticed that "btrfs device usage" shows multiple entries for
>> Data, RAID 6 on some drives. Is this normal? Please note that /dev/sdh
>> is the new disk, and I only just started the balance.
>>
>> # btrfs dev usage /mnt/data
>> /dev/sda, ID: 5
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:   733.67GiB
>>
>> /dev/sdb, ID: 6
>> Device size: 3.64TiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated: 2.15TiB
>>
>> /dev/sdc, ID: 7
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdd, ID: 1
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdf, ID: 3
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdg, ID: 2
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdh, ID: 8
>> Device size: 3.64TiB
>> Data,RAID6:320.00KiB
>> Unallocated: 3.64TiB
>>
>
> Not sure how that multiple chunk type shows up.
> Maybe all these shown RAID6 has different number of stripes?

Indeed, it's 4 different sets of stripe widths, i.e. how many drives each
chunk is striped across. Someone suggested some time ago to indicate this
in the output of the 'btrfs de us' command.

The fs has only the RAID6 profile and I am not fully sure if the
'Unallocated' numbers are correct (on RAID10 they are 2x too high
with unpatched v4.4 progs), but anyhow the lower devids are way too
full.

From the size, one can derive how many devices (or stripe width):
732.69GiB 4, 1.43TiB 5, 1.48TiB 6, 320.00KiB 7
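
One way to arrive at these numbers is to count how many devices report
each Data,RAID6 size in the output above. A minimal sketch of that count
follows; the function name count_devices_per_size is made up for
illustration, and it assumes a given size shows up at most once per device:

```shell
#!/bin/bash
# Count, for each distinct "Data,RAID6" size, how many devices report it.
# Reads "btrfs device usage" style text on stdin (quote prefixes are fine).
count_devices_per_size() {
    awk '
        /Data,RAID6:/ {
            size = $0
            sub(/^.*Data,RAID6:[ \t]*/, "", size)   # keep only the size field
            sub(/[ \t\r]+$/, "", size)              # drop trailing whitespace
            seen[size]++
        }
        END {
            for (s in seen)
                printf "%s %d\n", s, seen[s]
        }
    ' | sort -k2,2n
}

# Usage (needs a mounted btrfs filesystem):
#   btrfs device usage /mnt/data | count_devices_per_size
```

Fed the 'btrfs dev usage /mnt/data' output quoted above, this counts 4
devices for 732.69GiB, 5 for 1.43TiB, 6 for 1.48TiB and 7 for 320.00KiB.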

>> Qu, in regards to your question, I ran RAID 1 on multiple disks of
>> different sizes. I believe I had a mix of 2x4TB, 1x2TB, and 1x3TB
>> drive. I replaced the 2TB drive first with a 4TB, and balanced it.
>> Later on, I replaced the 3TB drive with another 4TB, and balanced,
>> yielding an array of 4x4TB RAID1. A little while later, I wound up
>> sticking a fifth 4TB drive in, and converting to RAID6. The sixth 4TB
>> drive was added some time after that. The seventh was added just a few
>> minutes ago.
>
>
> Personally speaking, I just came up with one method to balance all these
> disks, and in fact you don't need to add a disk.
>
> 1) Balance all data chunks to single profile
> 2) Balance all metadata chunks to single or RAID1 profile
> 3) Balance all data chunks back to RAID6 profile
> 4) Balance all metadata chunks back to RAID6 profile
> The system chunk is so small that normally you don't need to bother.
>
> The trick is that single is the most flexible chunk type: it only needs
> one disk with unallocated space.
> And the btrfs chunk allocator will allocate chunks to the device with the
> most unallocated space.
>
> So after 1) and 2) you should find that chunk allocation is almost
> perfectly balanced across all devices, as long as they are the same size

Re: RAID 6 full, but there is still space left on some devices

2016-02-18 Thread Dan Blazejewski
Qu, thanks for your input. I cancelled the existing balance, and
kicked off a balance set to dconvert=single. Should be busy for the
next few days, but I already see the multiple RAID 6 stripes
disappearing, and the chunk distribution across all drives is starting
to normalize. I'll let you know if it works once it's done. Thanks!

On Wed, Feb 17, 2016 at 9:03 PM, Qu Wenruo  wrote:
>
>
> Dan Blazejewski wrote on 2016/02/17 18:04 -0500:
>>
>> Hello,
>>
>> I upgraded my kernel to 4.4.2, and btrfs-progs to 4.4. I also added
>> another 4TB disk and kicked off a full balance (currently 7x4TB
>> RAID6). I'm interested to see what an additional drive will do to
>> this. I'll also have to wait and see if a full system balance on a
>> newer version of BTRFS tools does the trick or not.
>>
>> I also noticed that "btrfs device usage" shows multiple entries for
>> Data, RAID 6 on some drives. Is this normal? Please note that /dev/sdh
>> is the new disk, and I only just started the balance.
>>
>> # btrfs dev usage /mnt/data
>> /dev/sda, ID: 5
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:   733.67GiB
>>
>> /dev/sdb, ID: 6
>> Device size: 3.64TiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated: 2.15TiB
>>
>> /dev/sdc, ID: 7
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdd, ID: 1
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdf, ID: 3
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdg, ID: 2
>> Device size: 3.64TiB
>> Data,RAID6:  1.43TiB
>> Data,RAID6:732.69GiB
>> Data,RAID6:  1.48TiB
>> Data,RAID6:320.00KiB
>> Metadata,RAID6:  2.55GiB
>> Metadata,RAID6:982.00MiB
>> Metadata,RAID6:  1.50GiB
>> System,RAID6:   16.00MiB
>> Unallocated:25.21MiB
>>
>> /dev/sdh, ID: 8
>> Device size: 3.64TiB
>> Data,RAID6:320.00KiB
>> Unallocated: 3.64TiB
>>
>
> Not sure how that multiple chunk type shows up.
> Maybe all these shown RAID6 has different number of stripes?
>
>>
>>
>> Qu, in regards to your question, I ran RAID 1 on multiple disks of
>> different sizes. I believe I had a mix of 2x4TB, 1x2TB, and 1x3TB
>> drive. I replaced the 2TB drive first with a 4TB, and balanced it.
>> Later on, I replaced the 3TB drive with another 4TB, and balanced,
>> yielding an array of 4x4TB RAID1. A little while later, I wound up
>> sticking a fifth 4TB drive in, and converting to RAID6. The sixth 4TB
>> drive was added some time after that. The seventh was added just a few
>> minutes ago.
>
>
> Personally speaking, I just came up with one method to balance all these
> disks, and in fact you don't need to add a disk.
>
> 1) Balance all data chunks to single profile
> 2) Balance all metadata chunks to single or RAID1 profile
> 3) Balance all data chunks back to RAID6 profile
> 4) Balance all metadata chunks back to RAID6 profile
> The system chunk is so small that normally you don't need to bother.
>
> The trick is that single is the most flexible chunk type: it only needs
> one disk with unallocated space.
> And the btrfs chunk allocator will allocate chunks to the device with the
> most unallocated space.
>
> So after 1) and 2) you should find that chunk allocation is almost
> perfectly balanced across all devices, as long as they are the same size.
>
> Now you have a balanced base layout for RAID6 allocation. That should make
> things go quite smoothly and result in a balanced RAID6 chunk layout.
>
> Thanks,
> Qu
>
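
The four steps above could be scripted roughly as follows. This is only a
sketch: the function name rebalance_via_single and the DRY_RUN variable are
made up for illustration, it picks RAID1 for the metadata in step 2, and by
default it only prints the commands. Depending on the kernel and btrfs-progs
version, a conversion that lowers metadata redundancy may be refused unless
forced; that case is not handled here.

```shell
#!/bin/bash
# Sketch of the four-step rebalance: convert to single/RAID1 first so chunks
# spread evenly over all devices, then convert back to RAID6.
# With DRY_RUN unset or set to 1 the commands are only printed.
rebalance_via_single() {
    local mnt=$1
    local step
    for step in -dconvert=single -mconvert=raid1 -dconvert=raid6 -mconvert=raid6; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            btrfs balance start "$step" "$mnt" || return 1   # stop on first failure
        else
            echo "btrfs balance start $step $mnt"
        fi
    done
}
```

Run 'rebalance_via_single /mnt/data' to review the commands, and
'DRY_RUN=0 rebalance_via_single /mnt/data' to actually start the balances.
Each step can take days on an array of this size.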

Re: RAID 6 full, but there is still space left on some devices

2016-02-18 Thread Qu Wenruo



Henk Slager wrote on 2016/02/19 00:27 +0100:

On Thu, Feb 18, 2016 at 3:03 AM, Qu Wenruo  wrote:



Dan Blazejewski wrote on 2016/02/17 18:04 -0500:


Hello,

I upgraded my kernel to 4.4.2, and btrfs-progs to 4.4. I also added
another 4TB disk and kicked off a full balance (currently 7x4TB
RAID6). I'm interested to see what an additional drive will do to
this. I'll also have to wait and see if a full system balance on a
newer version of BTRFS tools does the trick or not.

I also noticed that "btrfs device usage" shows multiple entries for
Data, RAID 6 on some drives. Is this normal? Please note that /dev/sdh
is the new disk, and I only just started the balance.

# btrfs dev usage /mnt/data
/dev/sda, ID: 5
 Device size: 3.64TiB
 Data,RAID6:  1.43TiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  2.55GiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated:   733.67GiB

/dev/sdb, ID: 6
 Device size: 3.64TiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated: 2.15TiB

/dev/sdc, ID: 7
 Device size: 3.64TiB
 Data,RAID6:  1.43TiB
 Data,RAID6:732.69GiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  2.55GiB
 Metadata,RAID6:982.00MiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated:25.21MiB

/dev/sdd, ID: 1
 Device size: 3.64TiB
 Data,RAID6:  1.43TiB
 Data,RAID6:732.69GiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  2.55GiB
 Metadata,RAID6:982.00MiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated:25.21MiB

/dev/sdf, ID: 3
 Device size: 3.64TiB
 Data,RAID6:  1.43TiB
 Data,RAID6:732.69GiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  2.55GiB
 Metadata,RAID6:982.00MiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated:25.21MiB

/dev/sdg, ID: 2
 Device size: 3.64TiB
 Data,RAID6:  1.43TiB
 Data,RAID6:732.69GiB
 Data,RAID6:  1.48TiB
 Data,RAID6:320.00KiB
 Metadata,RAID6:  2.55GiB
 Metadata,RAID6:982.00MiB
 Metadata,RAID6:  1.50GiB
 System,RAID6:   16.00MiB
 Unallocated:25.21MiB

/dev/sdh, ID: 8
 Device size: 3.64TiB
 Data,RAID6:320.00KiB
 Unallocated: 3.64TiB



Not sure how that multiple chunk type shows up.
Maybe all these shown RAID6 has different number of stripes?


Indeed, it's 4 different sets of stripe widths, i.e. how many drives each
chunk is striped across. Someone suggested some time ago to indicate this
in the output of the 'btrfs de us' command.

The fs has only the RAID6 profile and I am not fully sure if the
'Unallocated' numbers are correct (on RAID10 they are 2x too high
with unpatched v4.4 progs), but anyhow the lower devids are way too
full.

From the size, one can derive how many devices (or stripe width):
732.69GiB 4, 1.43TiB 5, 1.48TiB 6, 320.00KiB 7


Qu, in regards to your question, I ran RAID 1 on multiple disks of
different sizes. I believe I had a mix of 2x4TB, 1x2TB, and 1x3TB
drive. I replaced the 2TB drive first with a 4TB, and balanced it.
Later on, I replaced the 3TB drive with another 4TB, and balanced,
yielding an array of 4x4TB RAID1. A little while later, I wound up
sticking a fifth 4TB drive in, and converting to RAID6. The sixth 4TB
drive was added some time after that. The seventh was added just a few
minutes ago.



Personally speaking, I just came up with one method to balance all these
disks, and in fact you don't need to add a disk.

1) Balance all data chunks to single profile
2) Balance all metadata chunks to single or RAID1 profile
3) Balance all data chunks back to RAID6 profile
4) Balance all metadata chunks back to RAID6 profile
The system chunk is so small that normally you don't need to bother.

The trick is that single is the most flexible chunk type: it only needs
one disk with unallocated space.
And the btrfs chunk allocator will allocate chunks to the device with the
most unallocated space.

So after 1) and 2) you should find that chunk allocation is almost
perfectly balanced across all devices, as long as they are the same size.

Now you have a balanced base layout for RAID6 allocation. That should make
things go quite smoothly and result in a balanced RAID6 chunk layout.


This is a good trick to get out of 'the RAID6 full' situa

[PATCH] btrfs-progs: fix symlink creation multiple times

2016-02-18 Thread Hongxu Jia
The Makefile rule that creates the library symlinks races in parallel
builds. The recipe runs once for each target in $(lib_links), and every
run created both links, so each link was created twice:
$ make -j 40 DESTDIR=/image install BUILD_VERBOSE=1
...
  [LN] libbtrfs.so.0
  [LN] libbtrfs.so
  ln -s -f libbtrfs.so.0.1 libbtrfs.so.0
  ln -s -f libbtrfs.so.0.1 libbtrfs.so.0
  ln -s -f libbtrfs.so.0.1 libbtrfs.so
  ln -s -f libbtrfs.so.0.1 libbtrfs.so
...

This failed occasionally:
...
|symlinkat: couldn't stat 'git/libbtrfs.so' even though symlink creation succeeded (No such file or directory).
|ln: failed to create symbolic link 'libbtrfs.so': No such file or directory
...

Create only the link named by the target ($@) in each run of the recipe.

Signed-off-by: Hongxu Jia 
---
 Makefile.in | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 1f4002e..16eeaf9 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -236,8 +236,7 @@ $(libs_static): $(libbtrfs_objects)
 
 $(lib_links):
@echo "[LN] $@"
-   $(Q)$(LN_S) -f libbtrfs.so.0.1 libbtrfs.so.0
-   $(Q)$(LN_S) -f libbtrfs.so.0.1 libbtrfs.so
+   $(Q)$(LN_S) -f libbtrfs.so.0.1 $@
 
 # keep intermediate files from the below implicit rules around
 .PRECIOUS: $(addsuffix .o,$(progs))
-- 
1.9.1



Re: [PATCH 13/13] btrfs: optimize check for stale device

2016-02-18 Thread Anand Jain


On 02/18/2016 11:13 PM, David Sterba wrote:

On Sat, Feb 13, 2016 at 10:01:40AM +0800, Anand Jain wrote:

Optimize the check for stale devices so that it runs only when a device
is added or changed. If there is no update to the device, there is no need
to call btrfs_free_stale_device().

Signed-off-by: Anand Jain 


http://thread.gmane.org/gmane.comp.file-systems.btrfs/48909/focus=48976

So why did you include the patch in this series?


 Non-technical. Just getting miscellaneous device management related
 patches through.



I see crashes with btrfs/011 on a non-debugging config

[  641.714363] BUG: unable to handle kernel NULL pointer dereference at 0068
[  641.716057] IP: [] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
[  641.717036] PGD 720c1067 PUD 720c2067 PMD 0
[  641.717749] Oops:  [#1] PREEMPT SMP

::

[  641.723163] CPU: 0 PID: 27766 Comm: btrfs Not tainted 4.5.0-rc3-next-20160212-1.g38290f0-vanilla #1
[  641.724420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[  641.725723] task: 8800742481c0 ti: 880071d1 task.ti: 880071d1
[  641.726954] RIP: 0010:[]  [] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
[  641.728404] RSP: 0018:880071d13ce8  EFLAGS: 00010202
[  641.729413] RAX: 88007231e800 RBX: 88007231e800 RCX: 
[  641.730610] RDX: a0195638 RSI: a017c5a8 RDI: 88007231ea80
[  641.731832] RBP: 880071d13d18 R08:  R09: 88007204ea00
[  641.733085] R10: 0008 R11:  R12: 
[  641.734307] R13: 0001 R14: 88007231e9f8 R15: 003f
[  641.735544] FS:  7f03ed36d8c0() GS:88007fc0() knlGS:
[  641.736883] CS:  0010 DS:  ES:  CR0: 80050033
[  641.738022] CR2: 0068 CR3: 720c CR4: 06f0
[  641.739325] Stack:
[  641.740156]  8800724d4000 8800724d4000  8800722ef000
[  641.741735]   8800724d4fc8 880071d13d98 a01566fd
[  641.743163]  88007b127000 0019 8800724d4ce8
[  641.744599] Call Trace:
[  641.745553]  [] btrfs_scrub_dev+0x13d/0x510 [btrfs]
[  641.746894]  [] btrfs_dev_replace_start+0x279/0x3f0 [btrfs]
[  641.748282]  [] btrfs_ioctl+0x1869/0x2070 [btrfs]
[  641.749587]  [] ? pte_alloc_one+0x33/0x40
[  641.750850]  [] do_vfs_ioctl+0x96/0x590
[  641.752128]  [] ? __do_page_fault+0x181/0x450
[  641.753432]  [] SyS_ioctl+0x79/0x90
[  641.754663]  [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[  641.756037] Code: 00 48 c7 c2 38 56 19 a0 48 c7 c6 a8 c5 17 a0 e8 21 39 f7 e0 45 85 ed 48 c7 83 68 02 00 00 00 00 00 00 48 89 d8 0f 84 03 ff ff ff <49> 83 7c 24 68 00 74 40 c7 83 78 02 00 00 20 00 00 00 4c 89 a3
[  641.760392] RIP  [] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
[  641.761970]  RSP 
[  641.763190] CR2: 0068
[  641.767218] ---[ end trace f46d4e6a90bda310 ]---

the dereference happens at offset 0x68 which matches bdev in
btrfs_device, so this patch is my best guess at the moment. I'm not able
to reproduce it directly so I need to wait for a rebuild and repeat.



  Looks like dev was fine when find_device was called, but
  later it was null when ->bdev was accessed.

  I couldn't reproduce it here. There are 10 workouts within btrfs/011;
  any idea which workout caused this? As of now I am guessing:

  workout "-m dup -d single" 1 cancel quick

  Digging more.

Thanks, Anand

