[PATCH v2] Btrfs: fix a dio write regression
From: Liu Bo

This bug is introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9
(Btrfs: lock extents as we map them in DIO).

In dio write, we should unlock the section which we didn't do IO on in case
that we fall back to buffered write. But we need to not only unlock the
section but also cleanup reserved space for the section.

This bug was found while running xfstests 133, with this 133 no longer
complains.

Signed-off-by: Liu Bo
---
v1->v2: apply style comments from David Sterba.

 fs/btrfs/inode.c |   24 ++++++++++++++++++++----
 1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7131fac..ea6a4ee 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5993,11 +5993,27 @@ unlock:
 	 * in the case of read we need to unlock only the end area that we
 	 * aren't using if there is any left over space.
 	 */
-	if (lockstart < lockend)
-		clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-				 unlock_bits, 1, 0, &cached_state, GFP_NOFS);
-	else
+	if (lockstart < lockend) {
+		if (create && len < lockend - lockstart) {
+			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
+					 lockstart + len - 1, unlock_bits, 1, 0,
+					 &cached_state, GFP_NOFS);
+			/*
+			 * Beside unlock, we also need to cleanup reserved space
+			 * for the left range by attaching EXTENT_DO_ACCOUNTING.
+			 */
+			clear_extent_bit(&BTRFS_I(inode)->io_tree,
+					 lockstart + len, lockend,
+					 unlock_bits | EXTENT_DO_ACCOUNTING,
+					 1, 0, NULL, GFP_NOFS);
+		} else {
+			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
+					 lockend, unlock_bits, 1, 0,
+					 &cached_state, GFP_NOFS);
+		}
+	} else {
 		free_extent_state(cached_state);
+	}

 	free_extent_map(em);
--
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
btrfsprogs: cases of snapshot failures
Since btrfs does not do recursive atomic snapshots (which I am ok with), I
am doing this myself. A handful of suggestions/problems came up.

1. Maybe btrfsprogs could gain an option to do recursive non-atomic
snapshots at the userspace level, simply invoking low-level atomic
snapshots one by one?

For the following, the kernel is 3.4.4 with the too-overloaded "0.19"
version of btrfsprogs.

2. Subvolume directories are somewhat special, as you may know. Only
`btrfs sub create/snap` creates them, and they cannot be rmdird.

# btrfs sub list .
ID 256 top level 5 path HEAD
[...]
ID 450 top level 5 path HEAD/woven
ID 451 top level 5 path HEAD/leet

Attempting to snapshot a directory with further subvolumes in it has the
strange effect that directories get created, and do so with the wrong
inode info:

# ls -l HEAD
total 4
drwxr-xr-x 1 root root  18 Aug 18 01:04 .
dr-xr-xr-x 1 root root 218 Aug 23 00:25 ..
drwxr-xr-x 1 root root  66 Aug 17 23:04 leet
drwxrwx--- 1 root root 100 Aug 18 00:53 woven
# btrfs sub snap HEAD today
Create a snapshot of 'HEAD' in './today'
# ls -l today
total 4
drwxr-xr-x 1 root root  18 Aug 18 01:04 .
dr-xr-xr-x 1 root root 228 Aug 23 00:25 ..
drwxr-xr-x 1 root root   0 Aug 23 00:25 leet
drwxr-xr-x 1 root root   0 Aug 23 00:25 woven

3. The creation of these non-special directories in today/ is undesired,
because now I need to rmdir them first before creating the subsnapshots.

# btrfs sub snap HEAD today
Create a snapshot of 'HEAD' in './today'
# btrfs sub snap HEAD/leet today/
Create a snapshot of 'HEAD/leet' in 'today//leet'
ERROR: cannot snapshot 'HEAD/leet' - File exists

4. Because today/leet already exists as a non-subvolume root, btrfsprogs
defaults to creating another directory inside it (unexpected, but ok).
However, while doing so, it runs into some unexplainable ENOTTY:

# btrfs sub snap HEAD today
Create a snapshot of 'HEAD' in './today'
# btrfs sub snap HEAD/leet today/leet
Create a snapshot of 'HEAD/leet' in 'today/leet/leet'
ERROR: cannot snapshot 'HEAD/leet' - Inappropriate ioctl for device
(^failure where success would have been expected)

Successes:

# rmdir today/leet/
# mkdir today/leet
# btrfs sub snap HEAD/leet today/leet
Create a snapshot of 'HEAD/leet' in 'today/leet/leet'

So today/leet as created by `sub snap HEAD` is also some weird object.
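The workaround described in items 3 and 4 (remove the placeholder directory, then snapshot the nested subvolume in its place) can be sketched as a small path-mapping helper. This is only an illustration, not part of btrfsprogs: `snapshot_dest()` and `nested_snapshot_cmd()` are hypothetical names, the sketch assumes that `btrfs sub snap` creates the snapshot at the destination path when that path does not exist, and it does no shell quoting of paths.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: map a nested subvolume path under src_root to the
 * matching path under dst_root, e.g. ("HEAD", "today", "HEAD/leet") gives
 * "today/leet". Returns 0 on success, -1 if sub is not inside src_root or
 * the result does not fit in out.
 */
int snapshot_dest(const char *src_root, const char *dst_root,
		  const char *sub, char *out, size_t outlen)
{
	size_t n;
	int w;

	assert(src_root && dst_root && sub && out);
	n = strlen(src_root);
	/* sub must start with src_root followed by a path separator */
	if (strncmp(sub, src_root, n) != 0 || sub[n] != '/')
		return -1;
	w = snprintf(out, outlen, "%s%s", dst_root, sub + n);
	return (w < 0 || (size_t)w >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: build the command line for one nested subvolume.
 * It first removes the empty placeholder directory that the parent
 * snapshot left behind, then snapshots the nested subvolume in its place.
 * Paths are not quoted, so names with spaces are not handled.
 */
int nested_snapshot_cmd(const char *src_root, const char *dst_root,
			const char *sub, char *cmd, size_t cmdlen)
{
	char dest[4096];
	int w;

	if (snapshot_dest(src_root, dst_root, sub, dest, sizeof(dest)) != 0)
		return -1;
	w = snprintf(cmd, cmdlen, "rmdir %s && btrfs sub snap %s %s",
		     dest, sub, dest);
	return (w < 0 || (size_t)w >= cmdlen) ? -1 : 0;
}
```

A caller would snapshot the top-level subvolume first and then run the generated command for each nested subvolume reported by `btrfs sub list`, parents before children. The result is non-atomic, as suggested in item 1.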
Re: Computer crash, btrfs partition errors
Hi

The full output of btrfs-debug-tree is 190MB compressed, do you still
want it?

As for the conditions, I was running a repo sync which I had suspended
with CTRL-Z. I then got distracted and mistakenly started the sync again
(not sure if you are familiar with the repo command, it spawns git
processes to check out the project). The crash did not occur immediately,
but it was probably within 2 minutes of me starting this second repo sync.
The lock up was bad enough that BUSIER did not reboot the PC, I had to
power down.

After the patch to btrfsck, I got this error:

# ./work/builds/btrfs-progs/btrfsck --repair /dev/sda2
enabling repair mode
checking extents
leaf parent key incorrect 46329503744
bad block 46329503744
owner ref check failed [46329503744 4096]
repair deleting extent record: key 46329503744 168 4096
adding new tree backref on start 46329503744 len 4096 parent 256 root 256
repaired damaged extent references
*** glibc detected *** ./work/builds/btrfs-progs/btrfsck: corrupted double-linked list: 0x1202b220 ***
=== Backtrace: =
/lib64/libc.so.6(+0x77896)[0x7f0008c59896]
/lib64/libc.so.6(+0x77cfb)[0x7f0008c59cfb]
/lib64/libc.so.6(+0x784a8)[0x7f0008c5a4a8]
/lib64/libc.so.6(cfree+0x6c)[0x7f0008c5d84c]
./work/builds/btrfs-progs/btrfsck[0x415db8]
./work/builds/btrfs-progs/btrfsck[0x415e15]
./work/builds/btrfs-progs/btrfsck[0x40aa9e]
./work/builds/btrfs-progs/btrfsck[0x4046c7]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f0008c0636d]
./work/builds/btrfs-progs/btrfsck[0x4017f9]
=== Memory map: 
0040-00427000 r-xp 00:22 1375631 /mnt/DevSystem/Work/builds/btrfs-progs/btrfsck
00626000-00627000 r--p 00026000 00:22 1375631 /mnt/DevSystem/Work/builds/btrfs-progs/btrfsck
00627000-00628000 rw-p 00027000 00:22 1375631 /mnt/DevSystem/Work/builds/btrfs-progs/btrfsck
01f27000-2576c000 rw-p 00:00 0 [heap]
7f000400-7f0004021000 rw-p 00:00 0
7f0004021000-7f000800 ---p 00:00 0
7f00089cb000-7f00089e r-xp 08:22 298053 /lib64/libgcc_s.so.1
7f00089e-7f0008be ---p 00015000 08:22 298053 /lib64/libgcc_s.so.1
7f0008be-7f0008be1000 r--p 00015000 08:22 298053 /lib64/libgcc_s.so.1
7f0008be1000-7f0008be2000 rw-p 00016000 08:22 298053 /lib64/libgcc_s.so.1
7f0008be2000-7f0008d64000 r-xp 08:22 2883622 /lib64/libc-2.14.1.so
7f0008d64000-7f0008f64000 ---p 00182000 08:22 2883622 /lib64/libc-2.14.1.so
7f0008f64000-7f0008f68000 r--p 00182000 08:22 2883622 /lib64/libc-2.14.1.so
7f0008f68000-7f0008f69000 rw-p 00186000 08:22 2883622 /lib64/libc-2.14.1.so
7f0008f69000-7f0008f6e000 rw-p 00:00 0
7f0008f6e000-7f0008fef000 r-xp 08:22 2883678 /lib64/libm-2.14.1.so
7f0008fef000-7f00091ee000 ---p 00081000 08:22 2883678 /lib64/libm-2.14.1.so
7f00091ee000-7f00091ef000 r--p 0008 08:22 2883678 /lib64/libm-2.14.1.so
7f00091ef000-7f00091f rw-p 00081000 08:22 2883678 /lib64/libm-2.14.1.so
7f00091f-7f00091f4000 r-xp 08:22 394806 /lib64/libuuid.so.1.3.0
7f00091f4000-7f00093f3000 ---p 4000 08:22 394806 /lib64/libuuid.so.1.3.0
7f00093f3000-7f00093f4000 r--p 3000 08:22 394806 /lib64/libuuid.so.1.3.0
7f00093f4000-7f00093f5000 rw-p 4000 08:22 394806 /lib64/libuuid.so.1.3.0
7f00093f5000-7f0009415000 r-xp 08:22 2883603 /lib64/ld-2.14.1.so
7f00095d1000-7f00095d4000 rw-p 00:00 0
7f0009612000-7f0009615000 rw-p 00:00 0
7f0009615000-7f0009616000 r--p 0002 08:22 2883603 /lib64/ld-2.14.1.so
7f0009616000-7f0009617000 rw-p 00021000 08:22 2883603 /lib64/ld-2.14.1.so
7f0009617000-7f0009618000 rw-p 00:00 0
7fff70f01000-7fff70f23000 rw-p 00:00 0 [stack]
7fff70fff000-7fff7100 r-xp 00:00 0 [vdso]
ff60-ff601000 r-xp 00:00 0 [vsyscall]
Aborted
Re: [PATCH] Btrfs: fix a dio write regression
Hi,

a few minor style comments,

On Wed, Aug 22, 2012 at 06:11:14PM +0800, bo.li@oracle.com wrote:
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -5993,10 +5993,24 @@ unlock:
>  	 * in the case of read we need to unlock only the end area that we
>  	 * aren't using if there is any left over space.
>  	 */
> -	if (lockstart < lockend)
> -		clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
> -				 unlock_bits, 1, 0, &cached_state, GFP_NOFS);
> -	else
> +	if (lockstart < lockend) {
> +		if (create && len < lockend - lockstart) {
> +			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
> +					 lockstart + len - 1, unlock_bits, 1, 0,
> +					 &cached_state, GFP_NOFS);
> +			/*
> +			 * Beside unlock, we also need to cleanup reserved space
> +			 * for the left range by attaching EXTENT_DO_ACCOUNTING.
> +			 */
> +			clear_extent_bit(&BTRFS_I(inode)->io_tree,
> +					 lockstart + len, lockend, unlock_bits |
> +					 EXTENT_DO_ACCOUNTING, 1, 0, NULL,

I'd prefer to see unlock_bits and the new value on one line

> +					 GFP_NOFS);
> +		} else

add { ... } around this

> +			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
> +					 lockend, unlock_bits, 1, 0,
> +					 &cached_state, GFP_NOFS);
> +	} else

here too

>  		free_extent_state(cached_state);
>
>  	free_extent_map(em);
Re: raw partition or LV for btrfs?
On Tue, Aug 14, 2012 at 07:23:48AM -0400, Calvin Walton wrote:
> A patch to add support for `btrfs fi defrag -c none ` or so would
> make this easier, and shouldn't be too hard to do :)

This one is on my 'nice to have' list. The ioctl needs to be extended so
that 'none' means actually using no compression during the defrag;
currently it means 'whatever compression the file has set'.

david
Re: linux 3.5.0: BTRFS error in compress_file_range:581 (failed to join transaction)
On Tue, Aug 14, 2012 at 09:00:53PM -0700, Marc MERLIN wrote:
> > What does the 'ret' shows? Is it -ENOSPC?
>
> I got nothing else in my logs.

Unless it was a second error from a filesystem that had already gone RO,
there should be more than the "Failed to join transaction" message, and
the first occurrence of a transaction abort would print a stacktrace as
well.

As you wrote in the next paragraphs, it was probably a cable
disconnection, so my bet is on EIO, and the transaction abort did the
right thing, so

> I powered the laptop back on and it came up like nothing ever happened.

s/I/you/ . (besides the few uncommitted changes)

david
Re: Hung I/O, Kernel BUG with corrupt leaf (bad key order)
On Tue, Aug 14, 2012 at 01:20:36PM -0500, Peter Marheine wrote:
> Hi all,
>
> I'm running btrfs in a 3-disk RAID1 configuration. After a hard
> power-off, I'm seeing a lot of hung I/O tasks on this volume,
> apparently due to a corrupt leaf. I first noticed the problem on
> kernel 3.4.7, and it's persisted with 3.4.8. Relevant parts of the
> kernel log follow.

What was the filesystem activity when the power-off happened?

>
> [ 85.179621] block group 38684065792 has an wrong amount of free space
> [ 85.179667] btrfs: failed to load free space cache for block group
> 38684065792
> [ 136.969477] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 136.998953] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 137.000492] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 137.000708] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 153.912922] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 153.913020] [ cut here ]
> [ 153.913055] kernel BUG at fs/btrfs/inode.c:828!

 809 static noinline int cow_file_range(struct inode *inode,
 810                                    struct page *locked_page,
 811                                    u64 start, u64 end, int *page_started,
 812                                    unsigned long *nr_written,
 813                                    int unlock)
 814 {
[...]
 828         BUG_ON(btrfs_is_free_space_inode(root, inode));

Together with the 'block group' warning above, this seems to be the bug
that Liu Bo fixed with the patches

Btrfs: fix a bug of writting free space cache with nodatacow option
Btrfs: fix a bug of writting free space cache during balance
Btrfs: fix btrfs_is_free_space_inode to recognize btree inode

that should appear in 3.6.

You can try mounting with 'nospace_cache' or 'clear_cache' to see whether
rebuilding the space cache from scratch makes a difference, but I'm afraid
the bad keys will remain and would have to be removed via offline fsck.

david
Re: How to get Btrfs on 2nd partition of USB HDD to automount as read/write
On Tue, Aug 21, 2012 at 03:28:22PM -0400, dg1727 wrote:
> Thanks a lot for these answers. As an exercise, how would I track
> that patch so I can tell when it has been released? Pointing me to
> a webpage that covers this would be fine.

You can easily check that the patch appears in the main progs repo at

http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=summary

or you can clone the repo and check the git log directly.

david
Re: Computer crash, btrfs partition errors
On Tue, Aug 21, 2012 at 09:50:58AM -0700, Not Zippy wrote:
> Thanks for the analysis, unfortunately I get the same assert error
> when I attempt to run the repair from the compiled source.
>
> # ./work/builds/btrfs-progs/btrfsck
> usage: btrfsck dev
> Btrfs Btrfs v0.19

'git describe' would be more helpful, but as the following command
succeeded, I assume that you're on the right version.

> # ./work/builds/btrfs-progs/btrfsck --repair /dev/sda2
> enabling repair mode
> checking extents
> leaf parent key incorrect 46329503744
> bad block 46329503744
> owner ref check failed [46329503744 4096]
> repair deleting extent record: key 46329503744 168 4096
> adding new tree backref on start 46329503744 len 4096 parent 256 root 256
> repaired damaged extent references

so it's able to fix that error, but due to the failure in the next phase
the change is not permanent -- a quick hack here would be to commit
immediately after this phase

--- a/btrfsck.c
+++ b/btrfsck.c
@@ -3572,6 +3572,8 @@ int main(int ac, char **av)
 	if (ret)
 		fprintf(stderr, "Errors found in extent allocation tree\n");

+	goto out;
+
 	fprintf(stderr, "checking fs roots\n");
 	ret = check_fs_roots(root, &root_cache);
 	if (ret)
---

> checking fs roots
> btrfsck: btrfsck.c:397: process_inode_item: Assertion `!(rec->ino !=
> key->objectid || rec->refs > 1)' failed.
> Aborted

This would need more information to analyze further, like the full output
of btrfs-debug-tree and the actual values from the Assertion, so that they
can be matched against that output.

Additionally, I'm still interested in details about the conditions that
led to this corruption. It's less painful to reproduce it locally than to
play remote debugging ping-pong (though sometimes nothing else is
available).

I think it'd be interesting to dig deeper, but you were able to get to
your data so it's probably not that urgent for either of us.

david
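Capturing the actual values from the failed assertion could look roughly like the sketch below. It is written under stated assumptions: `struct inode_rec_sketch` and `check_inode_rec()` are hypothetical stand-ins, not the real btrfsck types, and only the three fields named in the assertion text are modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the record fields named in the assertion. */
struct inode_rec_sketch {
	uint64_t ino;
	uint32_t refs;
};

/*
 * Evaluates the same condition as the failed assertion,
 * !(rec->ino != key->objectid || rec->refs > 1), but prints the actual
 * values to stderr before reporting failure. Returns 1 when the record
 * is consistent with the key and 0 otherwise.
 *
 * Intended use at the call site: assert(check_inode_rec(rec, objectid));
 */
int check_inode_rec(const struct inode_rec_sketch *rec, uint64_t key_objectid)
{
	if (rec->ino != key_objectid || rec->refs > 1) {
		fprintf(stderr,
			"inode rec mismatch: rec->ino=%llu key->objectid=%llu rec->refs=%u\n",
			(unsigned long long)rec->ino,
			(unsigned long long)key_objectid,
			(unsigned int)rec->refs);
		return 0;
	}
	return 1;
}
```

The printed inode number and objectid can then be searched for in the btrfs-debug-tree output.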
Re: [PATCH] Btrfs-progs: seg fault in get_label_unmounted
On Wed, Aug 15, 2012 at 04:29:53PM +0800, Anand jain wrote:
> From: Anand Jain
>
> btrfs f l /
> No valid Btrfs found on /
> Segmentation fault (core dumped)

Patches fixing this have been sent about 4 times. The last one was from
Alexander's 'btrfs prop', which modified it a bit more (to return the
label instead of printing it):

http://permalink.gmane.org/gmane.comp.file-systems.btrfs/18287

while Danny fixes more instances of unhandled error codes from open_ctree:

http://permalink.gmane.org/gmane.comp.file-systems.btrfs/15305

JFYI,
david
interaction with hardware RAID?
It is well documented that btrfs data recovery (after silent corruption) is dependent on the use of btrfs's own RAID1. However, I'm curious about whether any hardware RAID vendors are contemplating ways to integrate more closely with btrfs, for example, such that when btrfs detects a bad checksum, it would be able to ask the hardware RAID controller to return all alternate copies of the block. Is this technically possible within any hardware RAID device today, even though not implemented in btrfs? Has there been any suggestion that vendors would support this in future, presumably for the benefit of btrfs, ZFS and other checksumming filesystems?
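The recovery step this question refers to, trying each available copy of a block until one matches the stored checksum, can be sketched as follows. This is a simplified user-space illustration, not btrfs code: `pick_good_copy()` is a hypothetical name, the copies are plain in-memory buffers, and the CRC-32C here uses the standard parameters, which may differ from how the filesystem seeds and stores its checksums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), standard init and final XOR. */
uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Hypothetical helper: given every available copy of one block (for
 * example, one per mirror), return the index of the first copy whose
 * checksum matches the expected value, or -1 if all copies are bad.
 */
int pick_good_copy(const uint8_t *const copies[], size_t ncopies,
		   size_t blocksize, uint32_t expected)
{
	size_t i;

	assert(copies != NULL);
	for (i = 0; i < ncopies; i++) {
		if (crc32c(copies[i], blocksize) == expected)
			return (int)i;
	}
	return -1;
}
```

With btrfs's own RAID1 the filesystem can read each mirror itself. The question above is whether a hardware controller could expose its copies so that a loop like this has more than one candidate to try.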
Re: [GIT PULL v2] Update LZO compression
On Tue, Aug 21, 2012 at 05:21:50PM +0200, Markus F.X.J. Oberhumer wrote:
> as suggested on the mailing list I have converted the updated LZO
> code into git, so please pull my "lzo-update" branch from ...
> [ Changes in v2: Optimize code for CPUs with inefficient unaligned
> access => significant speed increase on ARM ]

I can confirm that this new code runs at the same speed as the current
lzo code in the Linux kernel on my ARM926EJ-S based platform. I only
tested decompression, using the attached hacky userspace code.

# time ./lzo-bench/old/unlzop /dev/null
real    0m 0.29s
# time ./lzo-bench/new/unlzop /dev/null
real    0m 0.29s

(where lzoimage is a Linux Image compressed with lzop)

So, from my side there are no more objections. Thanks for doing this
work, Markus.

Johannes

lzo-bench.tar.gz
Description: Binary data
[PATCH] Btrfs: fix a dio write regression
From: Liu Bo

This bug is introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9
(Btrfs: lock extents as we map them in DIO).

In dio write, we should unlock the section which we didn't do IO on in case
that we fall back to buffered write. But we need to not only unlock the
section but also cleanup reserved space for the section.

This bug was found while running xfstests 133, with this 133 no longer
complains.

Signed-off-by: Liu Bo
---
 fs/btrfs/inode.c |   22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7131fac..e4ab92b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5993,10 +5993,24 @@ unlock:
 	 * in the case of read we need to unlock only the end area that we
 	 * aren't using if there is any left over space.
 	 */
-	if (lockstart < lockend)
-		clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-				 unlock_bits, 1, 0, &cached_state, GFP_NOFS);
-	else
+	if (lockstart < lockend) {
+		if (create && len < lockend - lockstart) {
+			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
+					 lockstart + len - 1, unlock_bits, 1, 0,
+					 &cached_state, GFP_NOFS);
+			/*
+			 * Beside unlock, we also need to cleanup reserved space
+			 * for the left range by attaching EXTENT_DO_ACCOUNTING.
+			 */
+			clear_extent_bit(&BTRFS_I(inode)->io_tree,
+					 lockstart + len, lockend, unlock_bits |
+					 EXTENT_DO_ACCOUNTING, 1, 0, NULL,
+					 GFP_NOFS);
+		} else
+			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
+					 lockend, unlock_bits, 1, 0,
+					 &cached_state, GFP_NOFS);
+	} else
 		free_extent_state(cached_state);

 	free_extent_map(em);
--
1.7.7.6
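The range arithmetic in the patch can be illustrated in user space. The sketch below is not kernel code: `struct byte_range` and `split_locked_range()` are hypothetical names, and where the patch calls `clear_extent_bit()` on each half, the sketch only computes the two inclusive ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: an inclusive byte range [start, end]. */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Mirrors the condition used in the patch. On a write (create != 0) that
 * mapped fewer bytes than were locked, the locked range
 * [lockstart, lockend] is split into the part that did IO and the
 * left-over part, whose reserved space also has to be released.
 * Returns true and fills both ranges when a split is needed, false when
 * the whole range can be unlocked in one call.
 */
bool split_locked_range(uint64_t lockstart, uint64_t lockend, uint64_t len,
			int create, struct byte_range *done,
			struct byte_range *leftover)
{
	/* The patch only reaches this logic when lockstart < lockend. */
	assert(lockstart < lockend);
	assert(len > 0);

	if (!create || len >= lockend - lockstart)
		return false;

	done->start = lockstart;
	done->end = lockstart + len - 1;
	leftover->start = lockstart + len;
	leftover->end = lockend;
	return true;
}
```

For a locked range of 0 to 8191 and a 4096-byte write, this yields 0 to 4095 for the part that did IO and 4096 to 8191 for the left-over part, which is the range the patch clears with `EXTENT_DO_ACCOUNTING` added to `unlock_bits`.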