Re: corruption: yet another one after deleting a ro snapshot
At 01/16/2017 12:53 PM, Christoph Anton Mitterer wrote:
> On Mon, 2017-01-16 at 11:16 +0800, Qu Wenruo wrote:
>> It would be very nice if you could paste the output of
>> "btrfs-debug-tree -t extent " and "btrfs-debug-tree -t root "
>>
>> That would help us to fix the bug in lowmem mode.
>
> I'll send you the link in a private mail ... if any other developer
> needs it, just ask me or Qu for the link.
>
>> BTW, if it's possible, would you please try to run btrfs-check before
>> your next deletion on ro-snapshots?
>
> You mean in general, when I do my next runs of backups respectively
> snapshot-cleanup? Sure, actually I did this this time as well (in
> original mode, though), and no error was found.
> For what should I look out?

Nothing special, just in case the fs is already corrupted.

>> Not really needed, as all corruption happens on tree blocks of root
>> 6403; it means, if it's a real corruption, it will only disturb you
>> (make the fs suddenly RO) when you try to modify something (leaves
>> under that node) in that subvolume.
>
> Ah... and it couldn't cause corruption to the same data blocks if they
> were used by another snapshot?

No, it won't cause corruption to any data block, no matter shared or not.

>> And I highly suspect the subvolume 6403 is the RO snapshot you just
>> removed.
>
> I guess there is no way to find out whether it was that snapshot, is
> there?

"btrfs subvolume list" could do it. If there is no output for 6403, then
it's removed.

And "btrfs-debug-tree -t root" also has info for it: a deleted subvolume
won't have a corresponding ROOT_BACKREF, and its ROOT_ITEM should have a
non-zero drop key. And in your case, your subvolume is indeed undergoing
deletion.

I also checked the extent tree, and the result is a little interesting:

1) Most tree backrefs are good.
   In fact, 3 of the 4 errors reported are tree blocks shared by other
   subvolumes, like:

   item 77 key (5120 METADATA_ITEM 1) itemoff 13070 itemsize 42
       extent refs 2 gen 11 flags TREE_BLOCK|FULL_BACKREF
       tree block skinny level 1
       tree block backref root 7285
       tree block backref root 6572

   This means the tree blocks are shared by 2 other subvolumes, 7285 and
   6572. Subvolume 7285 is completely OK, while 6572 is also undergoing
   subvolume deletion (though the real deletion hasn't started yet). And
   considering the generation, I assume 6403 was deleted before 6572.

   So we're almost clear that btrfs (maybe only btrfsck) doesn't handle
   it well when there are multiple subvolumes undergoing deletion.

   This gives us enough info to try to build such an image by ourselves
   now (although it's still quite hard to do).

   That also explains why btrfs-progs test image 021 won't trigger the
   problem: it has only one subvolume undergoing deletion and no full
   backref extent.

2) The scary lowmem mode output is a false alert.
   I manually checked the used size of a block group and it's OK.

BTW, most of your block groups are completely used, without any free
space. But interestingly, most data extents are just 512K: larger than
the compressed extent upper limit, but still quite small. In other
words, your fs seems to be fragmented, considering the upper limit of a
data extent is 128M. (Or is your case quite common in the real world?)

>> If 'btrfs subvolume list' can't find that subvolume, then I think it's
>> mostly OK for you to RW mount and wait for the subvolume to be fully
>> deleted.
>>
>> And I think you have already provided enough data for us to, at least
>> try to, reproduce the bug.
>
> I won't do the remount,rw this night, so you have the rest of your
> day/night time to think of anything further I should test or provide
> you with from that fs... then it will be "gone" (in the sense of
> mounted RW).
> Just give your veto if I should wait :)
>
> Thanks,
> Chris.

At least from the extent and root tree dumps, I found nothing wrong.

It's still possible that some full backref needs to be checked from the
subvolume trees (considering your fs size, not really practical) and
could be wrong, but the possibility is quite low. And in that case,
there should be more than 4 extent tree errors reported.

So you are mostly OK to mount it rw any time you want, and I have
already downloaded the raw data.

The hard part remaining for us developers is to build a small image that
reproduces your situation.

Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
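Qu reads the shared-subvolume relationship straight off the "tree block backref root N" lines in the extent item dump. As a toy illustration, those lines can be scraped mechanically; the parsing helper below is mine, not a btrfs-progs tool, and it only handles the line shape quoted above:

```python
import re

def backref_roots(item_dump):
    """Scrape root ids out of 'tree block backref root N' lines in a
    btrfs-debug-tree extent item dump (toy helper, not part of btrfs-progs)."""
    return [int(n) for n in re.findall(r"tree block backref root (\d+)", item_dump)]

# The extent item quoted in the mail above.
item = """\
item 77 key (5120 METADATA_ITEM 1) itemoff 13070 itemsize 42
    extent refs 2 gen 11 flags TREE_BLOCK|FULL_BACKREF
    tree block skinny level 1
    tree block backref root 7285
    tree block backref root 6572
"""

roots = backref_roots(item)
print(roots)          # [7285, 6572]
# 6403 (the deleted subvolume) is absent: it left no explicit backref,
# the block is only reachable via the FULL_BACKREF parent refs.
print(6403 in roots)  # False
```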
Re: corruption: yet another one after deleting a ro snapshot
On Mon, 2017-01-16 at 11:16 +0800, Qu Wenruo wrote:
> It would be very nice if you could paste the output of
> "btrfs-debug-tree -t extent " and "btrfs-debug-tree -t root "
>
> That would help us to fix the bug in lowmem mode.

I'll send you the link in a private mail ... if any other developer
needs it, just ask me or Qu for the link.

> BTW, if it's possible, would you please try to run btrfs-check before
> your next deletion on ro-snapshots?

You mean in general, when I do my next runs of backups respectively
snapshot-cleanup? Sure, actually I did this this time as well (in
original mode, though), and no error was found.
For what should I look out?

> Not really needed, as all corruption happens on tree blocks of root
> 6403; it means, if it's a real corruption, it will only disturb you
> (make the fs suddenly RO) when you try to modify something (leaves
> under that node) in that subvolume.

Ah... and it couldn't cause corruption to the same data blocks if they
were used by another snapshot?

> And I highly suspect the subvolume 6403 is the RO snapshot you just
> removed.

I guess there is no way to find out whether it was that snapshot, is
there?

> If 'btrfs subvolume list' can't find that subvolume, then I think it's
> mostly OK for you to RW mount and wait for the subvolume to be fully
> deleted.
>
> And I think you have already provided enough data for us to, at least
> try to, reproduce the bug.

I won't do the remount,rw this night, so you have the rest of your
day/night time to think of anything further I should test or provide
you with from that fs... then it will be "gone" (in the sense of
mounted RW).
Just give your veto if I should wait :)

Thanks,
Chris.
Re: [PATCH] fstests: generic: splitted large dio write could trigger assertion on btrfs
On Thu, Jan 12, 2017 at 02:22:06PM -0800, Liu Bo wrote:
> On btrfs, if a large dio write (>=128MB) got split, the outstanding_extents
> assertion would complain. Note that CONFIG_BTRFS_ASSERT is required.
>
> Regression test for
>   Btrfs: adjust outstanding_extents counter properly when dio write is split
>
> Signed-off-by: Liu Bo
> ---
> I didn't figure out how to check if CONFIG_BTRFS_ASSERT is enabled, but since
> there is no btrfs specific stuff in the test case, it might be better to not
> have such a _require check b/c it doesn't make sense to other FS.
>
>  tests/generic/392     | 75 +++
>  tests/generic/392.out |  2 ++
>  tests/generic/group   |  1 +
>  3 files changed, 78 insertions(+)
>  create mode 100755 tests/generic/392
>  create mode 100644 tests/generic/392.out

There're over 400 generic tests now, seems your repo hasn't been updated
for a long time :)

>
> diff --git a/tests/generic/392 b/tests/generic/392
> new file mode 100755
> index 000..4d88c44
> --- /dev/null
> +++ b/tests/generic/392
> @@ -0,0 +1,75 @@
> +#! /bin/bash
> +# FS QA Test generic/392
> +#
> +# If a larger dio write (size >= 128M) got split, the assertion in endio
> +# would complain (CONFIG_BTRFS_ASSERT is required).
> +#
> +# Regression test for
> +#   Btrfs: adjust outstanding_extents counter properly when dio write is split
> +#
> +#---
> +# Copyright (c) 2017 Liu Bo.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#---
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# remove previous $seqres.full before test
> +rm -f $seqres.full
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_odirect
> +
> +# 2G / 1K
> +fsblock=$((1 << 21))
> +fssize=$((1 << 31))
> +_require_fs_space $SCRATCH_MNT $fsblock

You should mkfs & mount $SCRATCH_DEV first before _require_fs_space,
otherwise it's testing against the wrong fs.

> +
> +_scratch_mkfs_sized $fssize >> $seqres.full 2>&1

_scratch_mkfs_sized also makes sure SCRATCH_DEV is big enough to make
the filesystem, so in this test _require_fs_space and _scratch_mkfs_sized
could both do the work, and only one is needed. But not all filesystems
have _scratch_mkfs_sized support, so I'd prefer using _require_fs_space.

Thanks,
Eryu

> +_scratch_mount >> $seqres.full 2>&1
> +
> +echo "Silence is golden"
> +
> +blocksize=$(( (128 + 1) * 2 * 1024 * 1024))
> +$XFS_IO_PROG -f -d -c "pwrite -b ${blocksize} 0 ${blocksize}" $SCRATCH_MNT/testfile.$seq >> $seqres.full 2>&1
> +
> +_scratch_unmount
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/392.out b/tests/generic/392.out
> new file mode 100644
> index 000..665233c
> --- /dev/null
> +++ b/tests/generic/392.out
> @@ -0,0 +1,2 @@
> +QA output created by 392
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index 2c16bd1..1631933 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -394,3 +394,4 @@
>  389 auto quick acl
>  390 auto freeze stress dangerous
>  391 auto quick rw
> +392 auto quick dangerous
> --
> 2.5.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
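For readers wondering why the test picks a (128 + 1) * 2 MiB write: btrfs accounts dio writes in extents of at most 128 MiB (BTRFS_MAX_EXTENT_SIZE in the kernel), so a 258 MiB write must be tracked as several outstanding extents, which is exactly what the fixed counter adjusts when the dio gets split. A back-of-envelope check; the ceiling-division split count is my simplification of the kernel's accounting, not its literal code:

```python
MiB = 1024 * 1024
BTRFS_MAX_EXTENT_SIZE = 128 * MiB           # kernel's per-extent accounting cap

blocksize = (128 + 1) * 2 * MiB             # the pwrite size used by the test
extents = -(-blocksize // BTRFS_MAX_EXTENT_SIZE)  # ceiling division

print(blocksize)   # 270532608 bytes = 258 MiB, comfortably over the 128 MiB cap
print(extents)     # 3 accounted extents for one dio write
```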
Re: corruption: yet another one after deleting a ro snapshot
At 01/16/2017 10:56 AM, Christoph Anton Mitterer wrote:
> On Mon, 2017-01-16 at 09:38 +0800, Qu Wenruo wrote:
>> So the fs is REALLY corrupted.
>
> *sigh* ... (not as in fuck-I'm-losing-my-data™ ... but as in *sigh*
> another-possibly-deeply-hidden-bug-in-btrfs-that-might-eventually-
> cause-data-loss...)
>
>> BTW, lowmem mode seems to have a new false alert when checking the
>> block group item.
>
> Anything you want to check me there?

It would be very nice if you could paste the output of
"btrfs-debug-tree -t extent " and "btrfs-debug-tree -t root ".

That would help us to fix the bug in lowmem mode.

>> Did you have any "lightweight" method to reproduce the bug?
>
> Na, not at all... as I've said this already happened to me once
> before, and in both cases I was cleaning up old ro-snapshots.
> At least in the current case the fs was only ever filled via
> send/receive (well apart from minor mkdirs or so)... so there
> shouldn't have been any "extreme ways" of using it.

Since it's mostly populated by receive: yes, receive is completely sane,
since it's done purely in user-space. So if we find a way to reproduce
it, it won't involve anything special.

BTW, if it's possible, would you please try to run btrfs-check before
your next deletion of ro-snapshots?

> I think (but am not sure) that this was also the case on the other
> occasion that happened to me with a different fs (i.e. I think it was
> also a backup 8TB disk).
>
>> For example, on a 1G btrfs fs with moderate operations, for example
>> 15min or so, to reproduce the bug?
>
> Well I could try to produce it, but I guess you'd have far better
> means to do so.
> As I've said I was mostly doing send (with -p) | receive to do
> incremental backups... and after a while I was cleaning up the old
> snapshots on the backup fs.
> Of course the snapshot subvols are pretty huge... as I've said close
> to 8TB (7.5 or so)... everything from quite big files (4GB) to very
> small, symlinks (no devices/sockets/fifos)... perhaps some
> hardlinks... some refcopied files.
> The whole fs has compression enabled.
>
>> > Shall I rw-mount the fs and do sync and wait and retry? Or is there
>> > anything else that you want me to try before in order to get the
>> > kernel bug (if any) or btrfs-progs bug nailed down?
>>
>> Personally speaking, rw mount would help, to verify if it's just a
>> bug that will disappear after the deletion is done.
>
> Well but then we might lose any chance to further track it down.
> And even if it would go away, it would still at least be a bug in
> terms of an fsck false positive, if not more (in the sense of...
> corruptions may happen if some affected parts of the fs are used while
> not cleaned up again).
>
>> But considering the size of your fs, it may not be a good idea as we
>> don't have a reliable method to recover/rebuild the extent tree yet.
>
> So what do you effectively want now? Wait and try something else?
> RW mount and recheck to see whether it goes away with that? (And even
> if, should I rather re-create/populate the fs from scratch just to be
> sure?)
>
> What I can also offer in addition... as mentioned some times
> previously, I do have full lists of the reg-files/dirs/symlinks as
> well as SHA512 sums of each of the reg-files, as they are expected to
> be on the fs respectively the snapshot.
> So I can offer to do a full verification pass of these, to see whether
> anything is missing or (file)data actually corrupted.

Not really needed, as all corruption happens on tree blocks of root
6403; it means, if it's a real corruption, it will only disturb you
(make the fs suddenly RO) when you try to modify something (leaves
under that node) in that subvolume.

At least data is good.

And I highly suspect the subvolume 6403 is the RO snapshot you just
removed.

If 'btrfs subvolume list' can't find that subvolume, then I think it's
mostly OK for you to RW mount and wait for the subvolume to be fully
deleted.

And I think you have already provided enough data for us to, at least
try to, reproduce the bug.

Thanks,
Qu

> Of course that will take a while, and even if everything verifies, I'm
> still not really sure whether I'd trust that fs anymore ;-)
>
> Cheers,
> Chris.
Re: corruption: yet another one after deleting a ro snapshot
On Mon, 2017-01-16 at 09:38 +0800, Qu Wenruo wrote:
> So the fs is REALLY corrupted.

*sigh* ... (not as in fuck-I'm-losing-my-data™ ... but as in *sigh*
another-possibly-deeply-hidden-bug-in-btrfs-that-might-eventually-
cause-data-loss...)

> BTW, lowmem mode seems to have a new false alert when checking the
> block group item.

Anything you want to check me there?

> Did you have any "lightweight" method to reproduce the bug?

Na, not at all... as I've said this already happened to me once before,
and in both cases I was cleaning up old ro-snapshots.
At least in the current case the fs was only ever filled via
send/receive (well apart from minor mkdirs or so)... so there shouldn't
have been any "extreme ways" of using it.
I think (but am not sure) that this was also the case on the other
occasion that happened to me with a different fs (i.e. I think it was
also a backup 8TB disk).

> For example, on a 1G btrfs fs with moderate operations, for example
> 15min or so, to reproduce the bug?

Well I could try to produce it, but I guess you'd have far better means
to do so.
As I've said I was mostly doing send (with -p) | receive to do
incremental backups... and after a while I was cleaning up the old
snapshots on the backup fs.
Of course the snapshot subvols are pretty huge... as I've said close to
8TB (7.5 or so)... everything from quite big files (4GB) to very small,
symlinks (no devices/sockets/fifos)... perhaps some hardlinks... some
refcopied files.
The whole fs has compression enabled.

> > Shall I rw-mount the fs and do sync and wait and retry? Or is there
> > anything else that you want me to try before in order to get the
> > kernel bug (if any) or btrfs-progs bug nailed down?
>
> Personally speaking, rw mount would help, to verify if it's just a
> bug that will disappear after the deletion is done.

Well but then we might lose any chance to further track it down.
And even if it would go away, it would still at least be a bug in terms
of an fsck false positive, if not more (in the sense of... corruptions
may happen if some affected parts of the fs are used while not cleaned
up again).

> But considering the size of your fs, it may not be a good idea as we
> don't have a reliable method to recover/rebuild the extent tree yet.

So what do you effectively want now? Wait and try something else?
RW mount and recheck to see whether it goes away with that? (And even
if, should I rather re-create/populate the fs from scratch just to be
sure?)

What I can also offer in addition... as mentioned some times
previously, I do have full lists of the reg-files/dirs/symlinks as well
as SHA512 sums of each of the reg-files, as they are expected to be on
the fs respectively the snapshot.
So I can offer to do a full verification pass of these, to see whether
anything is missing or (file)data actually corrupted.

Of course that will take a while, and even if everything verifies, I'm
still not really sure whether I'd trust that fs anymore ;-)

Cheers,
Chris.
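Christoph's offered verification pass, comparing the snapshot's files against stored SHA512 sums, is straightforward to script. A minimal sketch; the manifest format (path mapped to hex digest) is assumed here and this is not his actual tooling:

```python
import hashlib
import os
import tempfile

def sha512_of(path, bufsize=1 << 20):
    """Stream a file through SHA-512 in 1 MiB chunks; return the hex digest."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest):
    """manifest maps path -> expected hex digest; return the paths that are
    missing or whose content no longer matches."""
    return [p for p, want in manifest.items()
            if not os.path.isfile(p) or sha512_of(p) != want]

# demo against a throwaway file
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"some snapshot payload\n")
    name = f.name
print(verify({name: sha512_of(name)}))   # [] -> nothing missing or corrupted
os.unlink(name)
```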
[PATCH] btrfs: raid56: Remove unused variable in lock_stripe_add
Variable 'walk' in lock_stripe_add() is never used. Remove it.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/raid56.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 453eefdcb591..b8ffd9ea7499 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -693,11 +693,9 @@ static noinline int lock_stripe_add(struct btrfs_raid_bio *rbio)
 	struct btrfs_raid_bio *freeit = NULL;
 	struct btrfs_raid_bio *cache_drop = NULL;
 	int ret = 0;
-	int walk = 0;
 
 	spin_lock_irqsave(&h->lock, flags);
 	list_for_each_entry(cur, &h->hash_list, hash_list) {
-		walk++;
 		if (cur->bbio->raid_map[0] == rbio->bbio->raid_map[0]) {
 			spin_lock(&cur->bio_list_lock);
--
2.11.0
Re: [PATCH 0/3 for-4.10] RAID56 scrub fixes
Hi David, Chris,

Any comment on this patchset?

Although we don't recommend users to use RAID5/6 for now, this patchset
still fixes 2 quite important bugs for RAID5/6.

So far the tests show no regression. (Test cases like btrfs/011 and
btrfs/069 are still causing problems, but v4.10-rc without these patches
also causes them.)

And the patchset is relatively small enough for an rc.

Thanks,
Qu

At 12/12/2016 05:38 PM, Qu Wenruo wrote:
> Can be fetched from github:
> https://github.com/adam900710/linux.git raid56_fixes
>
> Fixes 2 scrub bugs:
> 1) Scrub recovers correct data, but wrong parity
> 2) Scrub reports wrong csum error number, or even unrecoverable error
>
> The patches are still undergoing xfstests, but the current
> for-linus-4.10 is already causing a deadlock for btrfs/011, even
> without the patches.
> So I'd remove btrfs/011 and continue the test, even though these test
> cases won't trigger the real recover code.
>
> But the current internal test cases are quite good so far.
> I'll test them for several extra loops, and submit the internal test
> for reference.
> (Since it's not suitable for xfstests, I'd only submit the test
> script, which needs to manually probe the chunk layout.)
>
> Qu Wenruo (3):
>   btrfs: scrub: Introduce full stripe lock for RAID56
>   btrfs: scrub: Fix RAID56 recovery race condition
>   btrfs: raid56: Use correct stolen pages to calculate P/Q
>
>  fs/btrfs/ctree.h       |   4 ++
>  fs/btrfs/extent-tree.c |   3 +
>  fs/btrfs/raid56.c      |  62 ++--
>  fs/btrfs/scrub.c       | 192 +
>  4 files changed, 257 insertions(+), 4 deletions(-)
Re: corruption: yet another one after deleting a ro snapshot
At 01/16/2017 01:04 AM, Christoph Anton Mitterer wrote: On Thu, 2017-01-12 at 10:38 +0800, Qu Wenruo wrote: IIRC, RO mount won't continue background deletion. I see. Would you please try 4.9 btrfs-progs? Done now, see results (lowmem and original mode) below: # btrfs version btrfs-progs v4.9 # btrfs check /dev/nbd0 ; echo $? Checking filesystem on /dev/nbd0 UUID: 326d292d-f97b-43ca-b1e8-c722d3474719 checking extents ref mismatch on [37765120 16384] extent item 0, found 1 Backref 37765120 parent 6403 root 6403 not found in extent tree backpointer mismatch on [37765120 16384] owner ref check failed [37765120 16384] ref mismatch on [5120 16384] extent item 0, found 1 Backref 5120 parent 6403 root 6403 not found in extent tree backpointer mismatch on [5120 16384] owner ref check failed [5120 16384] ref mismatch on [78135296 16384] extent item 0, found 1 Backref 78135296 parent 6403 root 6403 not found in extent tree backpointer mismatch on [78135296 16384] owner ref check failed [78135296 16384] ref mismatch on [5960381235200 16384] extent item 0, found 1 Backref 5960381235200 parent 6403 root 6403 not found in extent tree backpointer mismatch on [5960381235200 16384] checking free space cache checking fs roots checking csums checking root refs found 7483995824128 bytes used err is 0 total csum bytes: 7296183880 total tree bytes: 10875944960 total fs tree bytes: 2035286016 total extent tree bytes: 1015988224 btree space waste bytes: 920641324 file data blocks allocated: 8267656339456 referenced 8389440876544 0 # btrfs check --mode=lowmem /dev/nbd0 ; echo $? 
Checking filesystem on /dev/nbd0 UUID: 326d292d-f97b-43ca-b1e8-c722d3474719 checking extents ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0 ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[500393050112 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[581997428736 1073741824] used 1073741824 but extent items used 0 ERROR: block group[626557714432 1073741824] used 1073741824 but extent items used 0 ERROR: block group[668433645568 1073741824] used 1073741824 but extent items used 0 ERROR: block group[948680261632 1073741824] used 1073741824 but extent items used 0 ERROR: block group[982503129088 1073741824] used 1073741824 but extent items used 0 ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items used 0 ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items used 0 ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items used 0 ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items used 0 ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items used 0 ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items used 0 ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4261710659584 1073741824] used 
1073741824 but extent items used 1074266112 ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items used 0 ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items used 0 ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items used 0 ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items used 1207959552 ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items used 0 ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items used 1074266112 ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items used 0 ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items used 0 ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items used 0
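A side observation on the lowmem numbers (my inference from the dump, not a confirmed diagnosis): the overcounted block groups exceed their 1 GiB size by amounts that match the extent sizes Qu notes elsewhere in the thread, 128 MiB (the maximum data extent) and 512 KiB (the typical extent size on this fs), which would fit extents straddling a block group boundary being attributed wholly to one group:

```python
# Each lowmem complaint says a 1 GiB block group's extent items account
# for more than the group itself can hold. The excess over 1 GiB is telling.
GiB = 1 << 30

used = GiB                            # block group length: 1073741824
reported = (1207959552, 1074266112)   # "extent items used" values above
overcounts = {items: items - used for items in reported}

for items, over in overcounts.items():
    # 134217728 B = 128 MiB (max data extent); 524288 B = 512 KiB
    print(items, "overshoots the block group by", over, "bytes =", over // 1024, "KiB")
```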
delete subvolume bad key order
Hello,

I get the message 'bad key order', and as you can see, the debug output
is very scary. The RAM is ECC based and I have already checked it; no
errors there. A 'btrfs check --repair' unfortunately did not help. What
else can I do?

The setup is as follows:

sda --\
sdb ---\
sdc ----> md0 (Raid5) -> vg0 (Stripe) -> btrfs2 (lv) -> btrfs
sdd ---/

Log: http://pastebin.com/Pu9ENpUM
Overview: http://pastebin.com/PnddKQYz
Debug: http://pastebin.com/qmR30AjJ

Greeting
York
[ISSUE] uncorrectable errors on Raid1
Hello all,

I have some concerns about the RAID 1 of BTRFS. I have encountered 114
uncorrectable errors on the directory hosting my 'seafile-data'
(Seafile is a software to back up data).

My 2 hard drives seem to be fine. The SMARTCTL reports do not identify
any bad blocks (Reallocated_Event_Count or Current_Pending_Sector).

How can I have uncorrectable errors, since BTRFS is assuring data
integrity? How did my data get corrupted? What can I do to ensure that
it does not happen again?

Sincerely,

You can find below all the useful information I can think of. If you
need more, let me know.

sudo btrfs scrub status /mnt
scrub status for 89f6f57e-90d9-46ac-1132-144e6ac150e4
	scrub started at Sat Jan 14 17:09:36 2017 and finished after 2207 seconds
	total bytes scrubbed: 598.03GiB with 114 errors
	error details: csum=114
	corrected errors: 0, uncorrectable errors: 114, unverified errors: 0

If I look at the dmesg log, I can see that both copies of the logical
block seem to be corrupted.

[ 1047.312852] BTRFS: bdev /dev/sde1 errs: wr 0, rd 0, flush 0, corrupt 49, gen 0
[ 1047.352631] BTRFS: unable to fixup (regular) error at logical 429848649728 on dev /dev/sde1
[ 1062.667080] BTRFS: checksum error at logical 441348554752 on dev /dev/sdd1, sector 195114560, root 5, inode 964364, offset 819200, length 4096, links 1 (path: seafile-data/storage/blocks/bd71e3e1-95bd-40fc-b6db-55c4ea9467c1/30/bfa04bb182ff8050fe4a0f357da7df335e7511)
[ 1062.667092] BTRFS: bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 18, gen 0
[ 1062.710999] BTRFS: unable to fixup (regular) error at logical 441348554752 on dev /dev/sdd1
[ 1074.536137] BTRFS: checksum error at logical 441348554752 on dev /dev/sde1, sector 195075648, root 5, inode 964364, offset 819200, length 4096, links 1 (path: seafile-data/storage/blocks/bd71e3e1-95bd-40fc-b6db-55c4ea9467c1/30/bfa04bb182ff8050fe4a0f357da7df335e7511)

sudo btrfs inspect-internal logical-resolve 441348554752 -v /mnt
ioctl ret=0, total_size=4096, bytes_left=4056, bytes_missing=0,
cnt=3, missed=0 ioctl ret=0, bytes_left=3965, bytes_missing=0, cnt=1, missed=0 /vault/seafile-data/storage/blocks/bd71e3e1-95bd-40fc-b6db-55c4ea9467c1/30/bfa04bb182ff8050fe4a0f357da7df335e7511 If I attempt to read the corresponding file, I have an " Input/output error ". Here is my Raid1 configuration: sudo btrfs fi show /mnt Label: none uuid: 91f6f57e-23d7-46ac-8056-144e6ac150e4 Total devices 2 FS bytes used 299.02GiB devid 1 size 2.73TiB used 301.03GiB path /dev/sdd1 devid 2 size 2.73TiB used 301.01GiB path /dev/sde1 btrfs-progs v3.19.1 sudo btrfs fi df /mnt Data, RAID1: total=299.00GiB, used=298.15GiB Data, single: total=8.00MiB, used=0.00B System, RAID1: total=8.00MiB, used=64.00KiB System, single: total=4.00MiB, used=0.00B Metadata, RAID1: total=2.00GiB, used=887.55MiB Metadata, single: total=8.00MiB, used=0.00B GlobalReserve, single: total=304.00MiB, used=0.00B sudo btrfs fi us /mnt Overall: Device size: 5.46TiB Device allocated: 602.04GiB Device unallocated: 4.87TiB Device missing: 0.00B Used: 598.04GiB Free (estimated): 2.44TiB (min: 2.44TiB) Data ratio: 2.00 Metadata ratio: 2.00 Global reserve: 304.00MiB (used: 0.00B) Data,single: Size:8.00MiB, Used:0.00B /dev/sdd1 8.00MiB Data,RAID1: Size:299.00GiB, Used:298.15GiB /dev/sdd1 299.00GiB /dev/sde1 299.00GiB Metadata,single: Size:8.00MiB, Used:0.00B /dev/sdd1 8.00MiB Metadata,RAID1: Size:2.00GiB, Used:887.55MiB /dev/sdd1 2.00GiB /dev/sde1 2.00GiB System,single: Size:4.00MiB, Used:0.00B /dev/sdd1 4.00MiB System,RAID1: Size:8.00MiB, Used:64.00KiB /dev/sdd1 8.00MiB /dev/sde1 8.00MiB Unallocated: /dev/sdd1 2.43TiB /dev/sde1 2.43TiB btrfs --version btrfs-progs v3.19.1 sudo smartctl -a /dev/sde smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-327.28.3.el7.x86_64] (local build) Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Western Digital Red (AF) Device Model: WDC WD30EFRX-68EUZN0 Serial Number: WD-WCC4N1003742 LU WWN Device Id: 5 
0014ee 25f64a417 Firmware Version: 80.00A80 User Capacity: 3 000 592 982 016 bytes [3,00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 5400 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ACS-2 (minor revision not indicated) SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s) Local Time is: Sun Jan 15 16:46:37 2017 CET SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total
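For context on why scrub gives up here: on raid1 a sector is only correctable if at least one mirror's copy still matches its checksum; when both copies are bad, as the dmesg lines above show for logical 441348554752 on both sdd1 and sde1, there is nothing to fix up from. A toy model of that decision (crc32 stands in for btrfs's crc32c, and the state names are my own, not kernel terminology):

```python
import zlib

def scrub_sector(copies, expected_csum):
    """copies: the bytes each mirror holds for one sector.
    Returns ('ok'|'corrected'|'uncorrectable', recovered data or None)."""
    good = [c for c in copies if zlib.crc32(c) == expected_csum]
    if not good:
        # every mirror fails the checksum -> counted as uncorrectable
        return "uncorrectable", None
    state = "ok" if len(good) == len(copies) else "corrected"
    return state, good[0]

data = b"block"
csum = zlib.crc32(data)
print(scrub_sector([data, data], csum)[0])          # ok
print(scrub_sector([b"xxxxx", data], csum)[0])      # corrected (rewrite bad mirror)
print(scrub_sector([b"xxxxx", b"yyyyy"], csum)[0])  # uncorrectable
```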
Re: corruption: yet another one after deleting a ro snapshot
On Thu, 2017-01-12 at 10:38 +0800, Qu Wenruo wrote: > IIRC, RO mount won't continue background deletion. I see. > Would you please try 4.9 btrfs-progs? Done now, see results (lowmem and original mode) below: # btrfs version btrfs-progs v4.9 # btrfs check /dev/nbd0 ; echo $? Checking filesystem on /dev/nbd0 UUID: 326d292d-f97b-43ca-b1e8-c722d3474719 checking extents ref mismatch on [37765120 16384] extent item 0, found 1 Backref 37765120 parent 6403 root 6403 not found in extent tree backpointer mismatch on [37765120 16384] owner ref check failed [37765120 16384] ref mismatch on [5120 16384] extent item 0, found 1 Backref 5120 parent 6403 root 6403 not found in extent tree backpointer mismatch on [5120 16384] owner ref check failed [5120 16384] ref mismatch on [78135296 16384] extent item 0, found 1 Backref 78135296 parent 6403 root 6403 not found in extent tree backpointer mismatch on [78135296 16384] owner ref check failed [78135296 16384] ref mismatch on [5960381235200 16384] extent item 0, found 1 Backref 5960381235200 parent 6403 root 6403 not found in extent tree backpointer mismatch on [5960381235200 16384] checking free space cache checking fs roots checking csums checking root refs found 7483995824128 bytes used err is 0 total csum bytes: 7296183880 total tree bytes: 10875944960 total fs tree bytes: 2035286016 total extent tree bytes: 1015988224 btree space waste bytes: 920641324 file data blocks allocated: 8267656339456 referenced 8389440876544 0 # btrfs check --mode=lowmem /dev/nbd0 ; echo $? 
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6997067956224 1073741824] used 1073741824
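An aside on reading the original-mode output: every one of the four error groups names the same parent and root. As a quick sanity check (not part of btrfs-progs; just a sketch over the lines quoted above), the Backref lines can be parsed to confirm that all the mismatched tree blocks belong to root 6403, the subvolume undergoing deletion:

```python
import re

# The four "Backref" lines copied verbatim from the original-mode
# `btrfs check` output above.
report = """\
Backref 37765120 parent 6403 root 6403 not found in extent tree
Backref 5120 parent 6403 root 6403 not found in extent tree
Backref 78135296 parent 6403 root 6403 not found in extent tree
Backref 5960381235200 parent 6403 root 6403 not found in extent tree
"""

pat = re.compile(r"Backref (\d+) parent (\d+) root (\d+)")
blocks = []   # bytenrs of tree blocks with missing backrefs
roots = set() # every root number that appears in an error
for line in report.splitlines():
    m = pat.search(line)
    if m:
        bytenr, parent, root = map(int, m.groups())
        blocks.append(bytenr)
        roots.add(root)

print(sorted(blocks))  # the four affected tree block bytenrs
print(roots)           # a single root: the deleted snapshot
```

If `roots` came out as anything other than `{6403}`, the corruption would not be confined to the snapshot being dropped, contradicting the analysis above.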
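The lowmem-mode errors also have a pattern worth noting: the "extent items used" figure is always 0, exactly used+128MiB (1207959552), or exactly used+512KiB (1074266112) — i.e. off by zero extents, one 128MiB max-size extent, or one of the 512K extents discussed elsewhere in this thread, which fits Qu's conclusion that these are accounting false alerts rather than real corruption. A small sketch (my own, over three representative ERROR lines) tabulating the deltas:

```python
import re

# Three representative ERROR lines from the lowmem output above,
# one for each observed "extent items used" value.
errors = """\
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items used 1074266112
"""

pat = re.compile(r"block group\[(\d+) (\d+)\] used (\d+) but extent items used (\d+)")
deltas = []
for line in errors.splitlines():
    start, length, used, accounted = map(int, pat.search(line).groups())
    deltas.append(accounted - used)
    print(f"bg @{start}: used={used} accounted={accounted} delta={accounted - used}")

# Deltas are -1 GiB (nothing counted), +128 MiB (one max-size data
# extent), and +512 KiB (one small extent) respectively.
print(deltas)
```

That the discrepancies are whole single-extent sizes is only suggestive, but it is consistent with the lowmem counter attributing extents near block-group boundaries to the wrong group rather than with on-disk damage.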
[LSF/MM TOPIC] [LSF/MM ATTEND] BTRFS Encryption
I am working on the BTRFS encryption stage 2 design [1], which is built around the requirements of a data-center solution. I will present an overview of the proposed design to obtain constructive feedback and comments, and I hope this will help finalize the design before it is taken up for implementation. Though the current experiment is only within BTRFS, the encryption-method part of the design should become part of fs/crypto when appropriate.

The presentation and review will be in two parts: the encryption-method part, in which experts from fs/crypto may like to participate, and the btrfs part, in which btrfs experts may like to comment on the aspects of the proposed design specific to btrfs.

[1] (Working draft; I hope to put more details into it before LSF.)
https://docs.google.com/document/d/1jWB3lyY2PF5CSAzcOnR8Yzh_45xU3Ub9bf3zKoMruWs/edit?usp=sharing

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html