3.15.0-rc5: btrfs and sync deadlock: call_rwsem_down_read_failed

2014-05-22 Thread Marc MERLIN
I got my laptop to hang all IO to one of its devices again, this time drive #2. This is the 3rd time it has happened, and I've already lost data as a result, since things that haven't hit disk don't make it at this point. I was doing balance and btrfs send/receive. Then cron started a scrub in the

Re: Linuxcon-JP Btrfs talk

2014-05-22 Thread Duncan
Marc MERLIN posted on Wed, 21 May 2014 20:19:06 -0700 as excerpted: If you're new with Btrfs, this may be a useful walkthrough for you. You can go through the slides which I wrote to be readable without the video, but the video is available too if you'd like:

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Duncan
Tomasz Chmielewski posted on Thu, 22 May 2014 03:22:58 +0100 as excerpted: One disk in RAID-1 crashed, so powered off, changed disk, powered on, trying to mount degraded. Unfortunately it hangs (running 3.14.4). # mount -o degraded,compress=lzo,noatime /dev/sdb4 /home (...never

[PATCH 2/2 v3] btrfs: usage error should not be logged into system log

2014-05-22 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com I have an opinion that system logs /var/log/messages are valuable info to investigate the real system issues at the data center. People handling data center issues do spend a lot of time and effort analyzing messages files. Having usage error logged into

[PATCH 1/2 v3] btrfs: label should not contain return char

2014-05-22 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com Generally, if you use echo test > /sys/fs/btrfs/fsid/label it would introduce a return char at the end, which cannot be part of the label. The correct command is echo -n test > /sys/fs/btrfs/fsid/label This patch will check for this user error
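The trailing-newline behaviour described in this patch can be seen without touching sysfs. The following is a minimal sketch, not part of the patch: `byte_count` is a hypothetical helper that pipes a command's output to `wc -c` instead of writing it to the label file, and `printf '%s'` stands in for `echo -n` as the portable way to omit the newline.

```shell
#!/bin/sh
# byte_count CMD [ARGS...]
# Hypothetical helper: runs CMD and prints how many bytes it wrote,
# i.e. how many bytes would have reached the label file.
byte_count() {
    "$@" | wc -c | tr -d ' '
}

# Plain echo appends a newline, so "test" becomes 5 bytes.
byte_count echo test

# printf '%s' writes only the 4 bytes of "test".
byte_count printf '%s' test
```

The extra byte in the first case is the return char that the patch wants to keep out of the label.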

Re: [PATCH 1/2 v2] btrfs: label should not contain return char

2014-05-22 Thread Anand Jain
On 21/05/14 00:33, David Sterba wrote: On Tue, May 20, 2014 at 02:36:48PM +0800, Anand Jain wrote: From: Anand Jain anand.j...@oracle.com Generally, if you use echo test > /sys/fs/btrfs/fsid/label it would introduce a return char at the end, which cannot be part of the label. The correct

Re: ditto blocks on ZFS

2014-05-22 Thread Austin S Hemmelgarn
On 2014-05-21 19:05, Martin wrote: Very good comment from Ashford. Sorry, but I see no advantages from Russell's replies other than for a feel-good factor or a dangerous false sense of security. At best, there is a weak justification that for metadata, again going from 2% to 4% isn't

Re: [PATCH 2/2 v3] btrfs: usage error should not be logged into system log

2014-05-22 Thread Koen Kooi
Anand Jain wrote on 22-05-14 12:41: From: Anand Jain anand.j...@oracle.com I have an opinion that system logs /var/log/messages are valuable info to investigate the real system issues at the data center. People handling data center issues do

Re: [PATCH 1/2 v3] btrfs: label should not contain return char

2014-05-22 Thread David Sterba
On Thu, May 22, 2014 at 06:41:11PM +0800, Anand Jain wrote: @@ -385,7 +392,8 @@ static ssize_t btrfs_label_store(struct kobject *kobj, return PTR_ERR(trans); spin_lock(&root->fs_info->super_lock); - strcpy(fs_info->super_copy->label, buf); +

Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount

2014-05-22 Thread Chris Mason
On 05/21/2014 09:21 PM, Jeff Mahoney wrote: On 05/21/2014 08:12 PM, Chris Mason wrote: The Btrfs sysfs code removes entries for raid types that are no longer in use. This means that if you have a raid0 FS and use balance to turn it into a raid1 FS, the raid0 sysfs entries will go away.

Re: 3.15.0-rc5: btrfs and sync deadlock: call_rwsem_down_read_failed / balance seems to create locks that block everything else

2014-05-22 Thread Marc MERLIN
On Thu, May 22, 2014 at 02:09:21AM -0700, Marc MERLIN wrote: I got my laptop to hang all IO to one of its devices again, this time drive #2. This is the 3rd time it has happened, and I've already lost data as a result, since things that haven't hit disk don't make it at this point. I was doing

Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount

2014-05-22 Thread Jeff Mahoney
On 5/22/14, 8:19 AM, Chris Mason wrote: Can we safely reinit a kobject that has been put in use in sysfs? Given all the things that can hold refs etc is this legal? It depends on how the kobject is being used. It wouldn't be safe to re-use the

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Tomasz Chmielewski
One disk in RAID-1 crashed, so powered off, changed disk, powered on, trying to mount degraded. Unfortunately it hangs (running 3.14.4). # mount -o degraded,compress=lzo,noatime /dev/sdb4 /home (...never returns...) 1) Just to be sure, btrfs raid1, not btrfs on md/raid1 or the

Re: ditto blocks on ZFS

2014-05-22 Thread Tomasz Chmielewski
I thought an important idea behind btrfs was that we avoid by design in the first place the very long and vulnerable RAID rebuild scenarios suffered for block-level RAID... This may be true for SSD disks - for ordinary disks it's not entirely the case. For most RAID rebuilds, it still seems

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Chris Murphy
On May 22, 2014, at 3:43 AM, Duncan 1i5t5.dun...@cox.net wrote: Note that unlike md/raid1, btrfs raid1 won't mount writable with only a single device. You must have at least two devices to mount writable, tho a formerly two-device raid1 with a device missing should mount read-only. No, a

Re: [PATCH] Btrfs: don't remove raid type sysfs entries until unmount

2014-05-22 Thread Chris Mason
On 05/22/2014 11:05 AM, Jeff Mahoney wrote: On 5/22/14, 8:19 AM, Chris Mason wrote: Can we safely reinit a kobject that has been put in use in sysfs? Given all the things that can hold refs etc is this legal? It depends on how the kobject is being used. It wouldn't

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Tomasz Chmielewski
Try -o recovery,degraded I would drop the other options for now, since they aren't necessary to recover from a device failure. Yes I've tried that as well, and it ends in the similar hang - high IO for a while, then no IO at all, mount does not return. It *does* mount as ro,degraded, but

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Chris Murphy
On May 22, 2014, at 11:50 AM, Tomasz Chmielewski t...@virtall.com wrote: Try -o recovery,degraded I would drop the other options for now, since they aren't necessary to recover from a device failure. Yes I've tried that as well, and it ends in the similar hang - high IO for a while,

[PATCH] fs: btrfs: volumes.c: Fix for possible null pointer dereference

2014-05-22 Thread Rickard Strandqvist
There is otherwise a risk of a possible null pointer dereference. Was largely found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- fs/btrfs/volumes.c |5 +++-- 1 file changed, 3 insertions(+), 2

Re: 3.15.0-rc5: btrfs and sync deadlock: call_rwsem_down_read_failed / balance seems to create locks that block everything else

2014-05-22 Thread Duncan
Marc MERLIN posted on Thu, 22 May 2014 06:15:29 -0700 as excerpted: Balance cancel hangs too and so does sync [...] For balance, if it comes to having to stop it on new mount after a shutdown, there is of course the skip_balance mount option. I was able to stop my btrfs send/receive, in turn

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Duncan
Tomasz Chmielewski posted on Thu, 22 May 2014 18:50:25 +0100 as excerpted: It *does* mount as ro,degraded, but then, it's not possible to add a disk and recover to a functioning RAID-1. Also, when I try to remount rw, the mount command hangs as well. Is there anything else I can try? It's

Re: mount hangs after disk crash (RAID-1)

2014-05-22 Thread Chris Murphy
On May 22, 2014, at 11:50 AM, Tomasz Chmielewski t...@virtall.com wrote: Try -o recovery,degraded I would drop the other options for now, since they aren't necessary to recover from a device failure. Yes I've tried that as well, and it ends in the similar hang - high IO for a while,

Re: ditto blocks on ZFS

2014-05-22 Thread ashford
Russell, Overall, there are still a lot of unknowns WRT the stability, and ROI (Return On Investment) of implementing ditto blocks for BTRFS. The good news is that there's a lot of time before the underlying structure is in place to support, so there's time to figure this out a bit better. On

Re: 3.15.0-rc5: btrfs and sync deadlock: call_rwsem_down_read_failed / balance seems to create locks that block everything else

2014-05-22 Thread Marc MERLIN
On Thu, May 22, 2014 at 08:52:34PM +, Duncan wrote: It's been running for at least 15mn in 'cancel mode'. Is that normal? I'd guess so. It's probably in the middle of operations for a single chunk, and only checks for cancel between chunks. Given the possible complexity of those

Re: [PATCH 2/2 v3] btrfs: usage error should not be logged into system log

2014-05-22 Thread Anand Jain
On 22/05/14 19:21, Koen Kooi wrote: Anand Jain wrote on 22-05-14 12:41: From: Anand Jain anand.j...@oracle.com I have an opinion that system logs /var/log/messages are valuable info to investigate the real system issues at the data center.

[PATCH 2/2 v4] btrfs: usage error should not be logged into system log

2014-05-22 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com I have an opinion that system logs under /var/log/ are valuable info to investigate the real system issues at the data center. People handling data center issues do spend a lot of time and effort analyzing messages files. Having usage error logged into system

[PATCH 1/2 v4] btrfs: label should not contain return char

2014-05-22 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com Generally, if you use echo test > /sys/fs/btrfs/fsid/label it would introduce a return char at the end, which cannot be part of the label. The correct command is echo -n test > /sys/fs/btrfs/fsid/label This patch will check for this user error

[PATCH 1/2] xfstests: add helper require function _require_btrfs_cloner

2014-05-22 Thread Filipe David Borba Manana
So that the same check (btrfs cloner program presence) can be reused by other tests. Signed-off-by: Filipe David Borba Manana fdman...@gmail.com --- common/rc | 7 +++ tests/btrfs/035 | 4 +--- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/common/rc b/common/rc index

[PATCH v3] Btrfs: ensure readers see new data after a clone operation

2014-05-22 Thread Filipe David Borba Manana
We were cleaning the clone target file range from the page cache before we did replace the file extent items in the fs tree. This was racy, as right after cleaning the relevant range from the page cache and before replacing the file extent items, a read against that range could be performed by

[PATCH 2/2] xfstests: add test for btrfs ioctl clone operation

2014-05-22 Thread Filipe David Borba Manana
This is a test to verify that the btrfs ioctl clone operation is able to clone extents of a file to different positions of the file, that is, the source and target files are the same. Existing tests only cover the case where the source and target files are different. Signed-off-by: Filipe David

[PATCH] Btrfs: clear compress-force when remounting with compress option

2014-05-22 Thread Wang Shilong
Steps to reproduce: # mkfs.btrfs -f /dev/sdb # mount /dev/sdb /mnt -o compress-force=lzo # mount /dev/sdb /mnt -o remount,compress=zlib # cat /proc/mounts Remounting from compress-force to compress could not clear compress-force option. The problem is there is no way for users to clear
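The reproduction steps end with an inspection of /proc/mounts. A small sketch of how that check could be scripted follows; it is not from the patch. `has_opt` and `mount_opts` are hypothetical helpers, and the sample option string is an assumption about how the options field might look, not captured output.

```shell
#!/bin/sh
# has_opt OPTIONS NAME
# Hypothetical helper: succeeds if the comma-separated OPTIONS string
# contains NAME, either bare or in the NAME=value form.
has_opt() {
    case ",$1," in
        *",$2,"* | *",$2="*) return 0 ;;
        *) return 1 ;;
    esac
}

# mount_opts MOUNTPOINT
# Hypothetical helper: prints the options field (4th column) of the
# /proc/mounts entry whose mount point matches MOUNTPOINT.
mount_opts() {
    awk -v m="$1" '$2 == m { print $4 }' /proc/mounts
}

# Example on a literal string (assumed sample, not real output).
opts="rw,relatime,compress-force=lzo,space_cache"
if has_opt "$opts" compress-force; then
    echo "compress-force is still set"
fi
```

On a real system, after the remount step, one could run `has_opt "$(mount_opts /mnt)" compress-force` to see whether the option was cleared.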

Re: ditto blocks on ZFS

2014-05-22 Thread Russell Coker
On Thu, 22 May 2014 15:09:40 ashf...@whisperpc.com wrote: You've addressed half of the issue. It appears that the metadata is normally a bit over 1% using the current methods, but two samples do not make a statistical universe. The good news is that these two samples are from opposite