Re: Incremental send/receive broken after snapshot restore
On 01.07.2018 02:16, Marc MERLIN wrote:
> Sorry that I missed the beginning of this discussion, but I think this is
> what I documented here after hitting the same problem:

This is similar, yes. IIRC you had a different starting point though. Here it
should have been possible to use only standard, documented tools, without any
need for low-level surgery, if done right.

> http://marc.merlins.org/perso/btrfs/post_2018-03-09_Btrfs-Tips_-Rescuing-A-Btrfs-Send-Receive-Relationship.html

M-m-m ... the statement "because the source had a Parent UUID value too, I was
actually supposed to set Received UUID on the destination to it" is entirely
off the mark, nor does it even match the subsequent command. You probably
meant to say "because the source had a *Received* UUID value too, I was
actually supposed to set Received UUID on the destination to it". That is
correct. And that is what I meant above - received_uuid is a misnomer; it is
actually used as a common data set identifier. Two subvolumes with the same
received_uuid are presumed to have identical content. Which makes the very
idea of being able to freely manipulate it rather questionable.

P.S. Of course "parent" is also highly ambiguous in the btrfs world. We really
need to come up with acceptable terminology to disambiguate the tree parent,
the snapshot parent and the replication parent. The latter would probably
better be called "base snapshot" (NetApp calls it "common snapshot"); an error
message "Could not find base snapshot matching UUID xxx" would be far less
ambiguous.

> Marc
>
> On Sun, Jul 01, 2018 at 01:03:37AM +0200, Hannes Schweizer wrote:
>> On Sat, Jun 30, 2018 at 10:02 PM Andrei Borzenkov wrote:
>>>
>>> On 30.06.2018 21:49, Andrei Borzenkov wrote:
>>>> On 30.06.2018 20:49, Hannes Schweizer wrote:
>>> ...
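The "common data set identifier" idea described above can be sketched in a few
lines of Python (a toy model, not btrfs code; all names and structures here
are illustrative): a replication tool matches a source snapshot to its replica
on the destination by comparing the source's received_uuid (or, if unset, its
own uuid) against the destination's received_uuid.

```python
# Toy model of received_uuid matching, as described in the thread.
# Not btrfs code; dict fields are illustrative stand-ins for subvolume info.

def content_id(subvol):
    """The identifier of the data set: received_uuid if set, else uuid."""
    return subvol["received_uuid"] or subvol["uuid"]

def find_base_on_destination(source_snap, destination_subvols):
    """Return the destination subvolume presumed to have identical content."""
    for d in destination_subvols:
        if d["received_uuid"] == content_id(source_snap):
            return d
    return None

src = {"uuid": "aaaa", "received_uuid": None}    # original read-only snapshot
dst = {"uuid": "bbbb", "received_uuid": "aaaa"}  # result of send/receive
assert find_base_on_destination(src, [dst]) is dst
```

This is also why freely editing received_uuid is dangerous: two subvolumes
with the same identifier are presumed identical, whether or not they actually
are.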
>> I've tested a few restore methods beforehand, and simply creating a
>> writeable clone from the restored snapshot does not work for me, eg:
>> # create some source snapshots
>> btrfs sub create test_root
>> btrfs sub snap -r test_root test_snap1
>> btrfs sub snap -r test_root test_snap2
>>
>> # send a full and incremental backup to external disk
>> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external
>> btrfs sub snap -r test_root test_snap3
>> btrfs send -c test_snap2 test_snap3 | btrfs receive
>> /run/media/schweizer/external
>>
>> # simulate disappearing source
>> btrfs sub del test_*
>>
>> # restore full snapshot from external disk
>> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
>>
>> # create writeable clone
>> btrfs sub snap test_snap3 test_root
>>
>> # try to continue with backup scheme from source to external
>> btrfs sub snap -r test_root test_snap4
>>
>> # this fails!!
>> btrfs send -c test_snap3 test_snap4 | btrfs receive
>> /run/media/schweizer/external
>> At subvol test_snap4
>> ERROR: parent determination failed for 2047
>> ERROR: empty stream is not considered valid
>
> Yes, that's expected. An incremental stream always needs a valid parent -
> this will be cloned on the destination and the incremental changes applied
> to it. The "-c" option is just additional sugar on top of that which might
> reduce the size of the stream, but in this case (i.e. without "-p") it also
> attempts to guess the parent subvolume for test_snap4, and this fails
> because test_snap3 and test_snap4 do not have a common parent, so
> test_snap3 is rejected as a valid parent snapshot. You can restart the
> incremental-forever chain by using an explicit "-p" instead:
>
> btrfs send -p test_snap3 test_snap4
>
> Subsequent snapshots (test_snap5 etc.) will all have a common parent with
> their immediate predecessor again, so "-c" will work. Note that technically
> "btrfs send" with a single "-c" option is entirely equivalent to
> "btrfs send -p". Using "-p" would have avoided this issue.
> :) Although this implicit check for a common parent may be considered a
> good thing in this case.
>
> P.S. Looking at the above, it probably needs to be in the manual page for
> btrfs-send. It took me quite some time to actually understand the meaning
> of "-p" and "-c" and the behavior if they are present.
...
>> Is there some way to reset the received_uuid of the following snapshot
>> on online?
>> ID 258 gen 13742 top level 5 parent_uuid -
>> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid
>> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo
>
> There is no "official" tool, but this question has come up quite often.
> Search this list; I believe a one-liner using python-btrfs was posted
> recently. Note that a patch that removes received_uuid when the "ro"
> property is removed was also suggested; hopefully it will be merged at some
> point. Still, I personally consider the ability to flip the read-only
> property a very bad thing that should never have been exposed in the first
> place.
>>>
>>> Note that if
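The parent-determination failure quoted above can be sketched as a toy model
(not the actual btrfs-progs algorithm; names are illustrative): without "-p",
send must guess a parent for the new snapshot, and a clone source is only
acceptable if it shares a snapshot parent with the target. A restored
snapshot no longer shares one, so the guess fails exactly as in the
transcript.

```python
# Toy model of "btrfs send -c" parent guessing without "-p".
# Illustrative only: real send uses uuids and generation numbers.

def guess_parent(target, clone_sources):
    """Return a clone source sharing the target's snapshot parent, else None."""
    for c in clone_sources:
        if c["parent_uuid"] is not None and c["parent_uuid"] == target["parent_uuid"]:
            return c
    return None

# test_snap3 was restored from backup, so it is not a snapshot of the
# recreated test_root; test_snap4 is.
snap3 = {"name": "test_snap3", "parent_uuid": None}       # restored, no parent
snap4 = {"name": "test_snap4", "parent_uuid": "root-v2"}  # snap of new test_root

assert guess_parent(snap4, [snap3]) is None  # "parent determination failed"
```

With an explicit "-p test_snap3" there is nothing to guess, which is why the
chain can be restarted that way.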
[PATCH] btrfs-progs: free-space-cache: Don't panic when free space cache is corrupted
In btrfs_add_free_space(), if the free space to be added is already
there, we trigger ASSERT(), which is just another BUG_ON().

Let's remove such BUG_ON() completely.

Reported-by: Lewis Diamond
Signed-off-by: Qu Wenruo
---
 free-space-cache.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/free-space-cache.c b/free-space-cache.c
index 9b83a71ca59a..2ef2d307cc5d 100644
--- a/free-space-cache.c
+++ b/free-space-cache.c
@@ -838,10 +838,8 @@ int btrfs_add_free_space(struct btrfs_free_space_ctl *ctl, u64 offset,
 	try_merge_free_space(ctl, info);
 	ret = link_free_space(ctl, info);
-	if (ret) {
+	if (ret)
 		printk(KERN_CRIT "btrfs: unable to add free space :%d\n", ret);
-		BUG_ON(ret == -EEXIST);
-	}
 	return ret;
 }
-- 
2.18.0
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
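The intent of the patch can be shown with a small Python model (hypothetical,
not the actual C code; the dict-backed cache and function names are
illustrative): a duplicate free-space entry now produces a logged error
return instead of aborting the whole tool, so a check of a corrupted cache
can keep going.

```python
# Toy model of the patched behavior: return -EEXIST instead of BUG_ON().
import errno

def add_free_space(entries, offset, size):
    """Return 0 on success, -EEXIST if the range is already tracked."""
    if offset in entries:
        # Previously a BUG_ON here aborted the entire process.
        print(f"btrfs: unable to add free space :{-errno.EEXIST}")
        return -errno.EEXIST
    entries[offset] = size
    return 0

cache = {}
assert add_free_space(cache, 4096, 16384) == 0
assert add_free_space(cache, 4096, 16384) == -errno.EEXIST  # no crash now
```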
Re: btrfs check --readonly crash
On 2018-07-01 09:59, Lewis Diamond wrote:
> Hi,
> I've been told to report this issue to this mailing list.
>
> sudo btrfs check --readonly /dev/sdc
> Checking filesystem on /dev/sdc
> UUID: 2630aec8-8399-4bd8-9397-8c04953a35d5
> checking extents
> checking free space cache
> there is no free space entry for 1596152365056-1596152381440
> there is no free space entry for 1596152365056-1597220323328
> cache appears valid but isn't 1596146581504
> there is no free space entry for 1613348585472-1613348618240
> there is no free space entry for 1613348585472-1614400192512
> cache appears valid but isn't 1613326450688
> block group 1645538705408 has wrong amount of free space, free space
> cache has 58212352 block group has 58277888
> failed to load free space cache for block group 1645538705408
> block group 1683119669248 has wrong amount of free space, free space
> cache has 52838400 block group has 52953088
> failed to load free space cache for block group 1683119669248
> btrfs: unable to add free space :-17

Your free space cache is corrupted. And if not handled well, it could have
(and may already have) damaged your fs further.

You could try "btrfs check --clear-space-cache v1 <device>" to remove the
free space cache completely, and then re-try "btrfs check --readonly" to see
if it works.

Thanks,
Qu

> free-space-cache.c:843: btrfs_add_free_space: BUG_ON `ret == -EEXIST`
> triggered, value 1
> btrfs(+0x37337)[0x556024d5d337]
> btrfs(btrfs_add_free_space+0x11d)[0x556024d5da2d]
> btrfs(load_free_space_cache+0xde9)[0x556024d5e889]
> btrfs(cmd_check+0x15fe)[0x556024d8b9ee]
> btrfs(main+0x88)[0x556024d38768]
> /usr/lib/libc.so.6(__libc_start_main+0xeb)[0x7f9382a9506b]
> btrfs(_start+0x2a)[0x556024d3888a]
> Aborted
>
> This happened on a USB HDD duo in RAID 1 which had btrfs issues
> (probably due to a bad controller). One of the HDDs was subsequently
> dropped by a 2-year-old (causing subsequent checks to crash).
>
> -Lewis
btrfs check --readonly crash
Hi,
I've been told to report this issue to this mailing list.

sudo btrfs check --readonly /dev/sdc
Checking filesystem on /dev/sdc
UUID: 2630aec8-8399-4bd8-9397-8c04953a35d5
checking extents
checking free space cache
there is no free space entry for 1596152365056-1596152381440
there is no free space entry for 1596152365056-1597220323328
cache appears valid but isn't 1596146581504
there is no free space entry for 1613348585472-1613348618240
there is no free space entry for 1613348585472-1614400192512
cache appears valid but isn't 1613326450688
block group 1645538705408 has wrong amount of free space, free space
cache has 58212352 block group has 58277888
failed to load free space cache for block group 1645538705408
block group 1683119669248 has wrong amount of free space, free space
cache has 52838400 block group has 52953088
failed to load free space cache for block group 1683119669248
btrfs: unable to add free space :-17
free-space-cache.c:843: btrfs_add_free_space: BUG_ON `ret == -EEXIST`
triggered, value 1
btrfs(+0x37337)[0x556024d5d337]
btrfs(btrfs_add_free_space+0x11d)[0x556024d5da2d]
btrfs(load_free_space_cache+0xde9)[0x556024d5e889]
btrfs(cmd_check+0x15fe)[0x556024d8b9ee]
btrfs(main+0x88)[0x556024d38768]
/usr/lib/libc.so.6(__libc_start_main+0xeb)[0x7f9382a9506b]
btrfs(_start+0x2a)[0x556024d3888a]
Aborted

This happened on a USB HDD duo in RAID 1 which had btrfs issues
(probably due to a bad controller). One of the HDDs was subsequently
dropped by a 2-year-old (causing subsequent checks to crash).

-Lewis
Re: A list of bugs in btrfs found by fuzzing
Dear BTRFS developers,

I would like to know whether these issues have been fixed or handled. Thanks.

-Wen

> On Jun 3, 2018, at 6:22 PM, Wen Xu wrote:
>
> Hi btrfs maintainers and developers,
>
> Here is a list of bugs found in the upstream kernel recently. Please check:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=199833
> https://bugzilla.kernel.org/show_bug.cgi?id=199835
> https://bugzilla.kernel.org/show_bug.cgi?id=199837
> https://bugzilla.kernel.org/show_bug.cgi?id=199839
> https://bugzilla.kernel.org/show_bug.cgi?id=199847
> https://bugzilla.kernel.org/show_bug.cgi?id=199849
>
> Thanks,
> Wen
Re: Incremental send/receive broken after snapshot restore
Sorry that I missed the beginning of this discussion, but I think this is
what I documented here after hitting the same problem:
http://marc.merlins.org/perso/btrfs/post_2018-03-09_Btrfs-Tips_-Rescuing-A-Btrfs-Send-Receive-Relationship.html

Marc

On Sun, Jul 01, 2018 at 01:03:37AM +0200, Hannes Schweizer wrote:
> On Sat, Jun 30, 2018 at 10:02 PM Andrei Borzenkov wrote:
> >
> > On 30.06.2018 21:49, Andrei Borzenkov wrote:
> > > On 30.06.2018 20:49, Hannes Schweizer wrote:
> > ...
> > >>
> > >> I've tested a few restore methods beforehand, and simply creating a
> > >> writeable clone from the restored snapshot does not work for me, eg:
> > >> # create some source snapshots
> > >> btrfs sub create test_root
> > >> btrfs sub snap -r test_root test_snap1
> > >> btrfs sub snap -r test_root test_snap2
> > >>
> > >> # send a full and incremental backup to external disk
> > >> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external
> > >> btrfs sub snap -r test_root test_snap3
> > >> btrfs send -c test_snap2 test_snap3 | btrfs receive
> > >> /run/media/schweizer/external
> > >>
> > >> # simulate disappearing source
> > >> btrfs sub del test_*
> > >>
> > >> # restore full snapshot from external disk
> > >> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
> > >>
> > >> # create writeable clone
> > >> btrfs sub snap test_snap3 test_root
> > >>
> > >> # try to continue with backup scheme from source to external
> > >> btrfs sub snap -r test_root test_snap4
> > >>
> > >> # this fails!!
> > >> btrfs send -c test_snap3 test_snap4 | btrfs receive
> > >> /run/media/schweizer/external
> > >> At subvol test_snap4
> > >> ERROR: parent determination failed for 2047
> > >> ERROR: empty stream is not considered valid
> > >>
> > >
> > > Yes, that's expected. An incremental stream always needs a valid
> > > parent - this will be cloned on the destination and the incremental
> > > changes applied to it. The "-c" option is just additional sugar on top
> > > of that which might reduce the size of the stream, but in this case
> > > (i.e. without "-p") it also attempts to guess the parent subvolume for
> > > test_snap4, and this fails because test_snap3 and test_snap4 do not
> > > have a common parent, so test_snap3 is rejected as a valid parent
> > > snapshot. You can restart the incremental-forever chain by using an
> > > explicit "-p" instead:
> > >
> > > btrfs send -p test_snap3 test_snap4
> > >
> > > Subsequent snapshots (test_snap5 etc.) will all have a common parent
> > > with their immediate predecessor again, so "-c" will work.
> > >
> > > Note that technically "btrfs send" with a single "-c" option is
> > > entirely equivalent to "btrfs send -p". Using "-p" would have avoided
> > > this issue. :) Although this implicit check for a common parent may be
> > > considered a good thing in this case.
> > >
> > > P.S. Looking at the above, it probably needs to be in the manual page
> > > for btrfs-send. It took me quite some time to actually understand the
> > > meaning of "-p" and "-c" and the behavior if they are present.
> > >
> > ...
> > >>
> > >> Is there some way to reset the received_uuid of the following
> > >> snapshot on online?
> > >> ID 258 gen 13742 top level 5 parent_uuid -
> > >> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid
> > >> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo
> > >>
> > >
> > > There is no "official" tool, but this question has come up quite
> > > often. Search this list; I believe a one-liner using python-btrfs was
> > > posted recently. Note that a patch that removes received_uuid when the
> > > "ro" property is removed was also suggested; hopefully it will be
> > > merged at some point. Still, I personally consider the ability to flip
> > > the read-only property a very bad thing that should never have been
> > > exposed in the first place.
> > >
> >
> > Note that if you remove received_uuid (explicitly or - in the future -
> > implicitly) you will not be able to restart incremental send anymore.
> > Without received_uuid there will be no way to match the source
> > test_snap3 with the destination test_snap3. So you *must* preserve it
> > and start with a writable clone.
> >
> > received_uuid is a misnomer. I wish it were named "content_uuid" or
> > "snap_uuid" with the semantics:
> >
> > 1. When a read-only snapshot of a writable volume is created,
> > content_uuid is initialized
> >
> > 2. A read-only snapshot of a read-only snapshot inherits content_uuid
> >
> > 3. The destination of "btrfs send" inherits content_uuid
> >
> > 4. A writable snapshot of a read-only snapshot clears content_uuid
> >
> > 5. Clearing the read-only property clears content_uuid
> >
> > This would make it more straightforward to cascade and restart
> > replication by having a single subvolume property to match against.
>
> Indeed, the current terminology is a bit confusing, and the patch
> removing the received_uuid when manually switching ro to false should
> definitely be merged. As recommended, I'll simply create a writeable
> clone of the restored snapshot and use -p instead of -c when restoring
> again
Re: Incremental send/receive broken after snapshot restore
On Sat, Jun 30, 2018 at 10:02 PM Andrei Borzenkov wrote:
>
> On 30.06.2018 21:49, Andrei Borzenkov wrote:
> > On 30.06.2018 20:49, Hannes Schweizer wrote:
> ...
> >>
> >> I've tested a few restore methods beforehand, and simply creating a
> >> writeable clone from the restored snapshot does not work for me, eg:
> >> # create some source snapshots
> >> btrfs sub create test_root
> >> btrfs sub snap -r test_root test_snap1
> >> btrfs sub snap -r test_root test_snap2
> >>
> >> # send a full and incremental backup to external disk
> >> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external
> >> btrfs sub snap -r test_root test_snap3
> >> btrfs send -c test_snap2 test_snap3 | btrfs receive
> >> /run/media/schweizer/external
> >>
> >> # simulate disappearing source
> >> btrfs sub del test_*
> >>
> >> # restore full snapshot from external disk
> >> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
> >>
> >> # create writeable clone
> >> btrfs sub snap test_snap3 test_root
> >>
> >> # try to continue with backup scheme from source to external
> >> btrfs sub snap -r test_root test_snap4
> >>
> >> # this fails!!
> >> btrfs send -c test_snap3 test_snap4 | btrfs receive
> >> /run/media/schweizer/external
> >> At subvol test_snap4
> >> ERROR: parent determination failed for 2047
> >> ERROR: empty stream is not considered valid
> >>
> >
> > Yes, that's expected. An incremental stream always needs a valid parent -
> > this will be cloned on the destination and the incremental changes
> > applied to it. The "-c" option is just additional sugar on top of that
> > which might reduce the size of the stream, but in this case (i.e.
> > without "-p") it also attempts to guess the parent subvolume for
> > test_snap4, and this fails because test_snap3 and test_snap4 do not have
> > a common parent, so test_snap3 is rejected as a valid parent snapshot.
> > You can restart the incremental-forever chain by using an explicit "-p"
> > instead:
> >
> > btrfs send -p test_snap3 test_snap4
> >
> > Subsequent snapshots (test_snap5 etc.) will all have a common parent
> > with their immediate predecessor again, so "-c" will work.
> >
> > Note that technically "btrfs send" with a single "-c" option is entirely
> > equivalent to "btrfs send -p". Using "-p" would have avoided this
> > issue. :) Although this implicit check for a common parent may be
> > considered a good thing in this case.
> >
> > P.S. Looking at the above, it probably needs to be in the manual page
> > for btrfs-send. It took me quite some time to actually understand the
> > meaning of "-p" and "-c" and the behavior if they are present.
> >
> ...
> >>
> >> Is there some way to reset the received_uuid of the following snapshot
> >> on online?
> >> ID 258 gen 13742 top level 5 parent_uuid -
> >> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid
> >> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo
> >>
> >
> > There is no "official" tool, but this question has come up quite often.
> > Search this list; I believe a one-liner using python-btrfs was posted
> > recently. Note that a patch that removes received_uuid when the "ro"
> > property is removed was also suggested; hopefully it will be merged at
> > some point. Still, I personally consider the ability to flip the
> > read-only property a very bad thing that should never have been exposed
> > in the first place.
> >
>
> Note that if you remove received_uuid (explicitly or - in the future -
> implicitly) you will not be able to restart incremental send anymore.
> Without received_uuid there will be no way to match the source test_snap3
> with the destination test_snap3. So you *must* preserve it and start with
> a writable clone.
>
> received_uuid is a misnomer. I wish it were named "content_uuid" or
> "snap_uuid" with the semantics:
>
> 1. When a read-only snapshot of a writable volume is created, content_uuid
> is initialized
>
> 2. A read-only snapshot of a read-only snapshot inherits content_uuid
>
> 3. The destination of "btrfs send" inherits content_uuid
>
> 4. A writable snapshot of a read-only snapshot clears content_uuid
>
> 5. Clearing the read-only property clears content_uuid
>
> This would make it more straightforward to cascade and restart
> replication by having a single subvolume property to match against.

Indeed, the current terminology is a bit confusing, and the patch
removing the received_uuid when manually switching ro to false should
definitely be merged. As recommended, I'll simply create a writeable
clone of the restored snapshot and use -p instead of -c when restoring
again (which kinds of snapshot relations are accepted for incremental
send/receive needs better documentation).

Fortunately, with all your hints regarding received_uuid I was able to
successfully restart the incremental chain WITHOUT restarting from scratch:

# replace incorrectly propagated received_uuid on destination with actual
# uuid of source snapshot
btrfs property set /run/media/schweizer/external/diablo_external.2018-06-24T19-37-39 ro false
set_received_uuid.py de9421c5-d160-2949-bf09-613949b4611c 1089 0.0
Re: So, does btrfs check lowmem take days? weeks?
On Sat, Jun 30, 2018 at 10:49:07PM +0800, Qu Wenruo wrote:
> But the last abort looks pretty possible to be the culprit.
>
> Would you try to dump the extent tree?
> # btrfs inspect dump-tree -t extent | grep -A50 156909494272

Sure, there you go:

	item 25 key (156909494272 EXTENT_ITEM 55320576) itemoff 14943 itemsize 24
		refs 19715 gen 31575 flags DATA
	item 26 key (156909494272 EXTENT_DATA_REF 571620086735451015) itemoff 14915 itemsize 28
		extent data backref root 21641 objectid 374857 offset 235175936 count 1452
	item 27 key (156909494272 EXTENT_DATA_REF 1765833482087969671) itemoff 14887 itemsize 28
		extent data backref root 23094 objectid 374857 offset 235175936 count 1442
	item 28 key (156909494272 EXTENT_DATA_REF 1807626434455810951) itemoff 14859 itemsize 28
		extent data backref root 21503 objectid 374857 offset 235175936 count 1454
	item 29 key (156909494272 EXTENT_DATA_REF 1879818091602916231) itemoff 14831 itemsize 28
		extent data backref root 21462 objectid 374857 offset 235175936 count 1454
	item 30 key (156909494272 EXTENT_DATA_REF 3610854505775117191) itemoff 14803 itemsize 28
		extent data backref root 23134 objectid 374857 offset 235175936 count 1442
	item 31 key (156909494272 EXTENT_DATA_REF 3754675454231458695) itemoff 14775 itemsize 28
		extent data backref root 23052 objectid 374857 offset 235175936 count 1442
	item 32 key (156909494272 EXTENT_DATA_REF 5060494667839714183) itemoff 14747 itemsize 28
		extent data backref root 23174 objectid 374857 offset 235175936 count 1440
	item 33 key (156909494272 EXTENT_DATA_REF 5476627808561673095) itemoff 14719 itemsize 28
		extent data backref root 22911 objectid 374857 offset 235175936 count 1
	item 34 key (156909494272 EXTENT_DATA_REF 6378484416458011527) itemoff 14691 itemsize 28
		extent data backref root 23012 objectid 374857 offset 235175936 count 1442
	item 35 key (156909494272 EXTENT_DATA_REF 7338474132555182983) itemoff 14663 itemsize 28
		extent data backref root 21872 objectid 374857 offset 235175936 count 1
	item 36 key (156909494272 EXTENT_DATA_REF 7516565391717970823) itemoff 14635 itemsize 28
		extent data backref root 21826 objectid 374857 offset 235175936 count 1452
	item 37 key (156909494272 SHARED_DATA_REF 14871537025024) itemoff 14631 itemsize 4
		shared data backref count 10
	item 38 key (156909494272 SHARED_DATA_REF 14871617568768) itemoff 14627 itemsize 4
		shared data backref count 73
	item 39 key (156909494272 SHARED_DATA_REF 14871619846144) itemoff 14623 itemsize 4
		shared data backref count 59
	item 40 key (156909494272 SHARED_DATA_REF 14871623270400) itemoff 14619 itemsize 4
		shared data backref count 68
	item 41 key (156909494272 SHARED_DATA_REF 14871623532544) itemoff 14615 itemsize 4
		shared data backref count 70
	item 42 key (156909494272 SHARED_DATA_REF 14871626383360) itemoff 14611 itemsize 4
		shared data backref count 76
	item 43 key (156909494272 SHARED_DATA_REF 14871635132416) itemoff 14607 itemsize 4
		shared data backref count 60
	item 44 key (156909494272 SHARED_DATA_REF 14871649533952) itemoff 14603 itemsize 4
		shared data backref count 79
	item 45 key (156909494272 SHARED_DATA_REF 14871862378496) itemoff 14599 itemsize 4
		shared data backref count 70
	item 46 key (156909494272 SHARED_DATA_REF 14909667098624) itemoff 14595 itemsize 4
		shared data backref count 72
	item 47 key (156909494272 SHARED_DATA_REF 14909669720064) itemoff 14591 itemsize 4
		shared data backref count 58
	item 48 key (156909494272 SHARED_DATA_REF 14909734567936) itemoff 14587 itemsize 4
		shared data backref count 73
	item 49 key (156909494272 SHARED_DATA_REF 14909920477184) itemoff 14583 itemsize 4
		shared data backref count 79
	item 50 key (156909494272 SHARED_DATA_REF 14942279335936) itemoff 14579 itemsize 4
		shared data backref count 79
	item 51 key (156909494272 SHARED_DATA_REF 14942304862208) itemoff 14575 itemsize 4
		shared data backref count 72
	item 52 key (156909494272 SHARED_DATA_REF 14942348378112) itemoff 14571 itemsize 4
		shared data backref count 67
	item 53 key (156909494272 SHARED_DATA_REF 14942366138368) itemoff 14567 itemsize 4
		shared data backref count 51
	item 54 key (156909494272 SHARED_DATA_REF 14942384799744) itemoff 14563 itemsize 4
		shared data backref count 64
	item 55 key (156909494272 SHARED_DATA_REF 14978234613760)
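For cross-checking a dump like the one above, a few lines of Python can tally
the per-backref "count" fields against the refs field of the EXTENT_ITEM.
This is a quick helper under an assumed output format (the two-line item
layout shown above), not an official btrfs-progs tool.

```python
# Sum the "count N" fields from an extent-tree dump excerpt.
# Assumes the "btrfs inspect dump-tree" text layout shown above.
import re

def backref_counts(dump_text):
    """Sum 'count N' fields of EXTENT_DATA_REF / shared data backref lines."""
    return sum(int(m) for m in re.findall(r"\bcount (\d+)\b", dump_text))

sample = """\
item 26 key (156909494272 EXTENT_DATA_REF 571620086735451015) itemoff 14915 itemsize 28
        extent data backref root 21641 objectid 374857 offset 235175936 count 1452
item 37 key (156909494272 SHARED_DATA_REF 14871537025024) itemoff 14631 itemsize 4
        shared data backref count 10
"""
assert backref_counts(sample) == 1462
```

Feeding it the full grep output (including the items past the truncation
point here) would show whether the backrefs sum to the 19715 refs recorded in
item 25.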
Re: Incremental send/receive broken after snapshot restore
On 30.06.2018 21:49, Andrei Borzenkov wrote:
> On 30.06.2018 20:49, Hannes Schweizer wrote:
...
>>
>> I've tested a few restore methods beforehand, and simply creating a
>> writeable clone from the restored snapshot does not work for me, eg:
>> # create some source snapshots
>> btrfs sub create test_root
>> btrfs sub snap -r test_root test_snap1
>> btrfs sub snap -r test_root test_snap2
>>
>> # send a full and incremental backup to external disk
>> btrfs send test_snap2 | btrfs receive /run/media/schweizer/external
>> btrfs sub snap -r test_root test_snap3
>> btrfs send -c test_snap2 test_snap3 | btrfs receive
>> /run/media/schweizer/external
>>
>> # simulate disappearing source
>> btrfs sub del test_*
>>
>> # restore full snapshot from external disk
>> btrfs send /run/media/schweizer/external/test_snap3 | btrfs receive .
>>
>> # create writeable clone
>> btrfs sub snap test_snap3 test_root
>>
>> # try to continue with backup scheme from source to external
>> btrfs sub snap -r test_root test_snap4
>>
>> # this fails!!
>> btrfs send -c test_snap3 test_snap4 | btrfs receive
>> /run/media/schweizer/external
>> At subvol test_snap4
>> ERROR: parent determination failed for 2047
>> ERROR: empty stream is not considered valid
>>
>
> Yes, that's expected. An incremental stream always needs a valid parent -
> this will be cloned on the destination and the incremental changes applied
> to it. The "-c" option is just additional sugar on top of that which might
> reduce the size of the stream, but in this case (i.e. without "-p") it
> also attempts to guess the parent subvolume for test_snap4, and this fails
> because test_snap3 and test_snap4 do not have a common parent, so
> test_snap3 is rejected as a valid parent snapshot. You can restart the
> incremental-forever chain by using an explicit "-p" instead:
>
> btrfs send -p test_snap3 test_snap4
>
> Subsequent snapshots (test_snap5 etc.) will all have a common parent with
> their immediate predecessor again, so "-c" will work.
>
> Note that technically "btrfs send" with a single "-c" option is entirely
> equivalent to "btrfs send -p". Using "-p" would have avoided this
> issue. :) Although this implicit check for a common parent may be
> considered a good thing in this case.
>
> P.S. Looking at the above, it probably needs to be in the manual page for
> btrfs-send. It took me quite some time to actually understand the meaning
> of "-p" and "-c" and the behavior if they are present.
>
...
>>
>> Is there some way to reset the received_uuid of the following snapshot
>> on online?
>> ID 258 gen 13742 top level 5 parent_uuid -
>> received_uuid 6c683d90-44f2-ad48-bb84-e9f241800179 uuid
>> 46db1185-3c3e-194e-8d19-7456e532b2f3 path diablo
>>
>
> There is no "official" tool, but this question has come up quite often.
> Search this list; I believe a one-liner using python-btrfs was posted
> recently. Note that a patch that removes received_uuid when the "ro"
> property is removed was also suggested; hopefully it will be merged at
> some point. Still, I personally consider the ability to flip the read-only
> property a very bad thing that should never have been exposed in the first
> place.
>

Note that if you remove received_uuid (explicitly or - in the future -
implicitly) you will not be able to restart incremental send anymore.
Without received_uuid there will be no way to match the source test_snap3
with the destination test_snap3. So you *must* preserve it and start with a
writable clone.

received_uuid is a misnomer. I wish it were named "content_uuid" or
"snap_uuid" with the semantics:

1. When a read-only snapshot of a writable volume is created, content_uuid
is initialized

2. A read-only snapshot of a read-only snapshot inherits content_uuid

3. The destination of "btrfs send" inherits content_uuid

4. A writable snapshot of a read-only snapshot clears content_uuid

5. Clearing the read-only property clears content_uuid

This would make it more straightforward to cascade and restart replication
by having a single subvolume property to match against.
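The five proposed content_uuid rules can be written out as a small Python
model. This is purely illustrative of the proposal in this message (it is not
how btrfs behaves today, and all names here are made up for the sketch):

```python
# Toy model of the proposed "content_uuid" semantics (rules 1-5 above).
import uuid

def snap_ro(src):
    # Rule 1: a ro snapshot of a writable volume initializes content_uuid.
    # Rule 2: a ro snapshot of a ro snapshot inherits it.
    return {"content_uuid": src["content_uuid"] or uuid.uuid4().hex, "ro": True}

def receive(src):
    # Rule 3: the destination of send/receive inherits content_uuid.
    return {"content_uuid": src["content_uuid"], "ro": True}

def snap_rw(src):
    # Rule 4: a writable snapshot clears content_uuid.
    return {"content_uuid": None, "ro": False}

def set_rw(sub):
    # Rule 5: clearing the read-only property clears content_uuid.
    sub["ro"], sub["content_uuid"] = False, None

vol = {"content_uuid": None, "ro": False}
s1 = snap_ro(vol)                 # rule 1
s2 = snap_ro(s1)                  # rule 2
dst = receive(s1)                 # rule 3
assert s1["content_uuid"] == s2["content_uuid"] == dst["content_uuid"]
assert snap_rw(dst)["content_uuid"] is None   # rule 4
set_rw(dst)                       # rule 5
assert dst["content_uuid"] is None
```

With these rules, restarting replication reduces to finding any pair of
subvolumes with equal content_uuid on the two sides.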
Re: Incremental send/receive broken after snapshot restore
30.06.2018 20:49, Hannes Schweizer пишет: > On Sat, Jun 30, 2018 at 8:24 AM Andrei Borzenkov wrote: >> >> Do not reply privately to mails on list. >> >> 29.06.2018 22:10, Hannes Schweizer пишет: >>> On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov >>> wrote: 28.06.2018 23:09, Hannes Schweizer пишет: > Hi, > > Here's my environment: > Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 > Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux > btrfs-progs v4.17 > > Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 > Total devices 2 FS bytes used 3.16TiB > devid1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 > devid2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 > Data, RAID0: total=3.16TiB, used=3.15TiB > System, RAID0: total=16.00MiB, used=240.00KiB > Metadata, RAID0: total=7.00GiB, used=4.91GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 > Total devices 2 FS bytes used 3.52TiB > devid1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 > devid2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 > Data, RAID1: total=3.52TiB, used=3.52TiB > System, RAID1: total=8.00MiB, used=512.00KiB > Metadata, RAID1: total=6.00GiB, used=5.11GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 > Total devices 1 FS bytes used 3.65TiB > devid1 size 5.46TiB used 3.66TiB path > /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc > Data, single: total=3.64TiB, used=3.64TiB > System, DUP: total=32.00MiB, used=448.00KiB > Metadata, DUP: total=11.00GiB, used=9.72GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > > The following automatic backup scheme is in place: > hourly: > btrfs sub snap -r online/root online/root. > > daily: > btrfs sub snap -r online/root online/root. > btrfs send -c online/root. > online/root. | btrfs receive offline > btrfs sub del -c online/root. 
> > monthly: > btrfs sub snap -r online/root online/root. > btrfs send -c online/root. > online/root. | btrfs receive external > btrfs sub del -c online/root. > > Now here are the commands leading up to my problem: > After the online filesystem suddenly went ro, and btrfs check showed > massive problems, I decided to start the online array from scratch: > 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 > /dev/mapper/online1 > > As you can see from the backup commands above, the snapshots of > offline and external are not related, so in order to at least keep the > extensive backlog of the external snapshot set (including all > reflinks), I decided to restore the latest snapshot from external. > 2: btrfs send external/root. | btrfs receive online > > I wanted to ensure I can restart the incremental backup flow from > online to external, so I did this > 3: mv online/root. online/root > 4: btrfs sub snap -r online/root online/root. > 5: btrfs property set online/root ro false > > Now, I naively expected a simple restart of my automatic backups for > external should work. > However after running > 6: btrfs sub snap -r online/root online/root. > 7: btrfs send -c online/root. > online/root. | btrfs receive external You just recreated your "online" filesystem from scratch. Where "old_external_reference" comes from? You did not show steps used to create it. > I see the following error: > ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file > or directory > > Which is unfortunate, but the second problem actually encouraged me to > post this message. 
> As planned, I had to start the offline array from scratch as well, > because I no longer had any reference snapshot for incremental backups > on other devices: > 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 > /dev/mapper/offline1 > > However restarting the automatic daily backup flow bails out with a > similar error, although no potentially problematic previous > incremental snapshots should be involved here! > ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory > Again - before you can *re*start incremental-forever sequence you need initial full copy. How exactly did you restart it if no snapshots exist either on source or on destination? >>> >>> Thanks for your help regarding this issue! >>> >>> Before the online crash, I've used the following online -> external >>> backup scheme: >>> btrfs sub snap -r online/root online/root. >>> btrfs send
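The restart this thread converges on (one full send to re-seed the destination, then an explicit -p instead of -c for the first incremental after a restore) could be scripted roughly as below. All subvolume names are hypothetical placeholders, and with the default DRY_RUN=1 the commands are only printed, never executed:

```shell
#!/bin/sh
# Sketch: restart an incremental-forever btrfs backup chain after restoring
# the source from a backup. Subvolume names are hypothetical placeholders.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# 1) Re-seed the destination with a full send of the restored snapshot.
run btrfs send online/root.base
#    (in real use, pipe into: btrfs receive /mnt/external)

# 2) First incremental after the restore: explicit -p, not -c, because the
#    new snapshot and the restored base share no common parent subvolume.
run btrfs subvolume snapshot -r online/root online/root.next
run btrfs send -p online/root.base online/root.next

# 3) From here on, successive snapshots again share a parent with their
#    immediate predecessor, so the usual -c invocation works again.
```

The DRY_RUN guard is only there so the sketch can be read and tested safely; setting DRY_RUN=0 would run the real commands.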
Re: Incremental send/receive broken after snapshot restore
On Sat, Jun 30, 2018 at 8:24 AM Andrei Borzenkov wrote: > > Do not reply privately to mails on list. > > 29.06.2018 22:10, Hannes Schweizer пишет: > > On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov > > wrote: > >> > >> 28.06.2018 23:09, Hannes Schweizer пишет: > >>> Hi, > >>> > >>> Here's my environment: > >>> Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 > >>> Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux > >>> btrfs-progs v4.17 > >>> > >>> Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 > >>> Total devices 2 FS bytes used 3.16TiB > >>> devid1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 > >>> devid2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 > >>> Data, RAID0: total=3.16TiB, used=3.15TiB > >>> System, RAID0: total=16.00MiB, used=240.00KiB > >>> Metadata, RAID0: total=7.00GiB, used=4.91GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 > >>> Total devices 2 FS bytes used 3.52TiB > >>> devid1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 > >>> devid2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 > >>> Data, RAID1: total=3.52TiB, used=3.52TiB > >>> System, RAID1: total=8.00MiB, used=512.00KiB > >>> Metadata, RAID1: total=6.00GiB, used=5.11GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 > >>> Total devices 1 FS bytes used 3.65TiB > >>> devid1 size 5.46TiB used 3.66TiB path > >>> /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc > >>> Data, single: total=3.64TiB, used=3.64TiB > >>> System, DUP: total=32.00MiB, used=448.00KiB > >>> Metadata, DUP: total=11.00GiB, used=9.72GiB > >>> GlobalReserve, single: total=512.00MiB, used=0.00B > >>> > >>> > >>> The following automatic backup scheme is in place: > >>> hourly: > >>> btrfs sub snap -r online/root online/root. 
> >>> > >>> daily: > >>> btrfs sub snap -r online/root online/root. > >>> btrfs send -c online/root. > >>> online/root. | btrfs receive offline > >>> btrfs sub del -c online/root. > >>> > >>> monthly: > >>> btrfs sub snap -r online/root online/root. > >>> btrfs send -c online/root. > >>> online/root. | btrfs receive external > >>> btrfs sub del -c online/root. > >>> > >>> Now here are the commands leading up to my problem: > >>> After the online filesystem suddenly went ro, and btrfs check showed > >>> massive problems, I decided to start the online array from scratch: > >>> 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 > >>> /dev/mapper/online1 > >>> > >>> As you can see from the backup commands above, the snapshots of > >>> offline and external are not related, so in order to at least keep the > >>> extensive backlog of the external snapshot set (including all > >>> reflinks), I decided to restore the latest snapshot from external. > >>> 2: btrfs send external/root. | btrfs receive online > >>> > >>> I wanted to ensure I can restart the incremental backup flow from > >>> online to external, so I did this > >>> 3: mv online/root. online/root > >>> 4: btrfs sub snap -r online/root online/root. > >>> 5: btrfs property set online/root ro false > >>> > >>> Now, I naively expected a simple restart of my automatic backups for > >>> external should work. > >>> However after running > >>> 6: btrfs sub snap -r online/root online/root. > >>> 7: btrfs send -c online/root. > >>> online/root. | btrfs receive external > >> > >> You just recreated your "online" filesystem from scratch. Where > >> "old_external_reference" comes from? You did not show steps used to > >> create it. > >> > >>> I see the following error: > >>> ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file > >>> or directory > >>> > >>> Which is unfortunate, but the second problem actually encouraged me to > >>> post this message. 
> >>> As planned, I had to start the offline array from scratch as well, > >>> because I no longer had any reference snapshot for incremental backups > >>> on other devices: > >>> 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 > >>> /dev/mapper/offline1 > >>> > >>> However restarting the automatic daily backup flow bails out with a > >>> similar error, although no potentially problematic previous > >>> incremental snapshots should be involved here! > >>> ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory > >>> > >> > >> Again - before you can *re*start incremental-forever sequence you need > >> initial full copy. How exactly did you restart it if no snapshots exist > >> either on source or on destination? > > > > Thanks for your help regarding this issue! > > > > Before the online crash, I've used the following online -> external > > backup scheme: > > btrfs sub snap -r online/root online/root. > > btrfs send -c online/root. > > online/root. | btrfs receive
Re: So, does btrfs check lowmem take days? weeks?
On 2018-06-30 10:44, Marc MERLIN wrote:
> Well, there goes that. After about 18H:
> ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872,
> owner: 374857, offset: 235175936) wanted: 1, have: 1452
> backref.c:466: __add_missing_keys: Assertion `ref->root_id` failed, value 0
> btrfs(+0x3a232)[0x56091704f232]
> btrfs(+0x3ab46)[0x56091704fb46]
> btrfs(+0x3b9f5)[0x5609170509f5]
> btrfs(btrfs_find_all_roots+0x9)[0x560917050a45]
> btrfs(+0x572ff)[0x56091706c2ff]
> btrfs(+0x60b13)[0x560917075b13]
> btrfs(cmd_check+0x2634)[0x56091707d431]
> btrfs(main+0x88)[0x560917027260]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f93aa508561]
> btrfs(_start+0x2a)[0x560917026dfa]
> Aborted

I think that's the root cause: some invalid extent tree backref or bad tree block is blowing up the backref code. All previous error messages may be garbage unless you're using Su's latest branch, as lowmem mode tends to report false alerts on referencer count mismatches. But the final abort looks very likely to be the culprit.

Would you try to dump the extent tree?

# btrfs inspect dump-tree -t extent <device> | grep -A50 156909494272

It should help us locate the culprit and hopefully give us some chance to fix it.

Thanks,
Qu

>
> That's https://github.com/Damenly/btrfs-progs.git
>
> Whoops, I didn't use the tmp1 branch, let me try again with that and
> report back, although the problem above is still going to be there since
> I think the only difference will be this, correct?
> https://github.com/Damenly/btrfs-progs/commit/b5851513a12237b3e19a3e71f3ad00b966d25b3a
>
> Marc
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs suddenly think's it's raid6
On 2018-06-30 06:11, marble wrote:
> Hey,
> Thanks for the quick reply :-)
>
>> Is there anything like unexpected power loss happens?
> Power loss may have happened.
>> And would you provide the following data for debugging?
>>
>> # btrfs ins dump-super -fFa /dev/mapper/black
> I attached it.

Looks pretty good. Then we can go through the normal salvage routine.

Please try the following commands and keep an especially close eye on the stderr output.
(The --follow option was added in a recent btrfs-progs release, so please make sure your btrfs-progs is up to date.)

# btrfs ins dump-tree -b 1084653568 --follow /dev/mapper/black
# btrfs ins dump-tree -b 1083604992 --follow /dev/mapper/black
# btrfs ins dump-tree -b 1083981824 --follow /dev/mapper/black
# btrfs ins dump-tree -b 1084325888 --follow /dev/mapper/black

Or just use the old roots and let btrfs check judge:

# btrfs check --tree-root 1084751872 /dev/mapper/black
# btrfs check --tree-root 1083801600 /dev/mapper/black
# btrfs check --tree-root 1084145664 /dev/mapper/black
# btrfs check --tree-root 1084473344 /dev/mapper/black

>> And further more, what's the device mapper setup for /dev/mapper/black?
>> Is there anything like RAID here?
> I think this is the result of luks. "black" is the name I passed to
> cryptsetup open. Maybe some powerloss screwed up encryption?

I'm not sure exactly what happens on power loss here. Anyway, let's see how the btrfs check and dump-tree commands above turn out.

Thanks,
Qu

>> Thanks,
>> Qu
> Cheers,
> marble
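The four `btrfs check --tree-root` attempts Qu lists could be driven by a small loop so each candidate root is tried in turn. The bytenrs are the ones from the message above; the DRY_RUN guard (defaulting to 1, print-only) is an addition for illustration so the sketch never touches a real device:

```shell
#!/bin/sh
# Sketch: try each candidate old tree root in turn (bytenrs from the
# dump-super discussion above). DRY_RUN defaults to 1: print, don't execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

DEV=/dev/mapper/black
for root in 1084751872 1083801600 1084145664 1084473344; do
    if run btrfs check --tree-root "$root" "$DEV"; then
        echo "candidate tree root $root passed check"
        [ "${DRY_RUN:-1}" = 1 ] || break   # in dry-run mode, list them all
    fi
done
```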
[josef-btrfs:blk-iolatency-v7 14/14] mm/readahead.c:504:6: error: implicit declaration of function 'blk_cgroup_congested'
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git blk-iolatency-v7
head:   4f16e9aa09862911cb7ec38061a48b91a72142c3
commit: 4f16e9aa09862911cb7ec38061a48b91a72142c3 [14/14] skip readahead if the cgroup is congested
config: openrisc-or1ksim_defconfig (attached as .config)
compiler: or1k-linux-gcc (GCC) 6.0.0 20160327 (experimental)
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 4f16e9aa09862911cb7ec38061a48b91a72142c3
        # save the attached .config to linux build tree
        make.cross ARCH=openrisc

All errors (new ones prefixed by >>):

   mm/readahead.c: In function 'page_cache_sync_readahead':
>> mm/readahead.c:504:6: error: implicit declaration of function 'blk_cgroup_congested' [-Werror=implicit-function-declaration]
     if (blk_cgroup_congested())
         ^~~~
   cc1: some warnings being treated as errors

vim +/blk_cgroup_congested +504 mm/readahead.c

   481
   482  /**
   483   * page_cache_sync_readahead - generic file readahead
   484   * @mapping: address_space which holds the pagecache and I/O vectors
   485   * @ra: file_ra_state which holds the readahead state
   486   * @filp: passed on to ->readpage() and ->readpages()
   487   * @offset: start offset into @mapping, in pagecache page-sized units
   488   * @req_size: hint: total size of the read which the caller is performing in
   489   *            pagecache pages
   490   *
   491   * page_cache_sync_readahead() should be called when a cache miss happened:
   492   * it will submit the read. The readahead logic may decide to piggyback more
   493   * pages onto the read request if access patterns suggest it will improve
   494   * performance.
   495   */
   496  void page_cache_sync_readahead(struct address_space *mapping,
   497                                 struct file_ra_state *ra, struct file *filp,
   498                                 pgoff_t offset, unsigned long req_size)
   499  {
   500          /* no read-ahead */
   501          if (!ra->ra_pages)
   502                  return;
   503
 > 504          if (blk_cgroup_congested())
   505                  return;
   506
   507          /* be dumb */
   508          if (filp && (filp->f_mode & FMODE_RANDOM)) {
   509                  force_page_cache_readahead(mapping, filp, offset, req_size);
   510                  return;
   511          }
   512
   513          /* do read-ahead */
   514          ondemand_readahead(mapping, ra, filp, false, offset, req_size);
   515  }
   516  EXPORT_SYMBOL_GPL(page_cache_sync_readahead);
   517

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
[josef-btrfs:blk-iolatency-v7 14/14] mm/readahead.c:504:6: error: implicit declaration of function 'blk_cgroup_congested'; did you mean 'bdi_rw_congested'?
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git blk-iolatency-v7
head:   4f16e9aa09862911cb7ec38061a48b91a72142c3
commit: 4f16e9aa09862911cb7ec38061a48b91a72142c3 [14/14] skip readahead if the cgroup is congested
config: sh-allnoconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 4f16e9aa09862911cb7ec38061a48b91a72142c3
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=sh

All errors (new ones prefixed by >>):

   mm/readahead.c: In function 'page_cache_sync_readahead':
>> mm/readahead.c:504:6: error: implicit declaration of function 'blk_cgroup_congested'; did you mean 'bdi_rw_congested'? [-Werror=implicit-function-declaration]
     if (blk_cgroup_congested())
         ^~~~
         bdi_rw_congested
   cc1: some warnings being treated as errors

vim +504 mm/readahead.c

   481
   482  /**
   483   * page_cache_sync_readahead - generic file readahead
   484   * @mapping: address_space which holds the pagecache and I/O vectors
   485   * @ra: file_ra_state which holds the readahead state
   486   * @filp: passed on to ->readpage() and ->readpages()
   487   * @offset: start offset into @mapping, in pagecache page-sized units
   488   * @req_size: hint: total size of the read which the caller is performing in
   489   *            pagecache pages
   490   *
   491   * page_cache_sync_readahead() should be called when a cache miss happened:
   492   * it will submit the read. The readahead logic may decide to piggyback more
   493   * pages onto the read request if access patterns suggest it will improve
   494   * performance.
   495   */
   496  void page_cache_sync_readahead(struct address_space *mapping,
   497                                 struct file_ra_state *ra, struct file *filp,
   498                                 pgoff_t offset, unsigned long req_size)
   499  {
   500          /* no read-ahead */
   501          if (!ra->ra_pages)
   502                  return;
   503
 > 504          if (blk_cgroup_congested())
   505                  return;
   506
   507          /* be dumb */
   508          if (filp && (filp->f_mode & FMODE_RANDOM)) {
   509                  force_page_cache_readahead(mapping, filp, offset, req_size);
   510                  return;
   511          }
   512
   513          /* do read-ahead */
   514          ondemand_readahead(mapping, ra, filp, false, offset, req_size);
   515  }
   516  EXPORT_SYMBOL_GPL(page_cache_sync_readahead);
   517

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: btrfs send/receive vs rsync
Marc MERLIN posted on Fri, 29 Jun 2018 09:24:20 -0700 as excerpted:

>> If instead of using a single BTRFS filesystem you used LVM volumes
>> (maybe with thin provisioning and monitoring of the volume group free
>> space) for each of your servers to back up, with one BTRFS filesystem per
>> volume, you would have fewer snapshots per filesystem and isolate
>> problems in case of corruption. If you eventually decide to start from
>> scratch again this might help a lot in your case.
>
> So, I already have problems due to too many block layers:
> - raid 5 + ssd - bcache - dmcrypt - btrfs
>
> I get occasional deadlocks due to upper layers sending more data to the
> lower layer (bcache) than it can process. I'm a bit wary of adding yet
> another layer (LVM), but you're otherwise correct that keeping smaller
> btrfs filesystems would help with performance and with containing
> possible damage.
>
> Has anyone actually done this? :)

So I definitely use (and advocate!) the split-em-up strategy, and I use btrfs, but that's pretty much all the similarity we have.

I'm all ssd, having left spinning rust behind. My strategy avoids unnecessary layers like lvm (tho crypt can arguably be necessary), preferring direct on-device (gpt) partitioning for simplicity of management and disaster recovery. And my backup and recovery strategy is an equally simple mkfs and full-filesystem-fileset copy to an identically sized filesystem, with backups easily bootable/mountable in place of the working copy if necessary, and multiple backups so if disaster takes out the backup I was writing at the same time as the working copy, I still have a backup to fall back to.

So it's different enough I'm not sure how much my experience will help you.
But I /can/ say the subdivision is nice, as it means I can keep my root filesystem read-only by default for reliability, my most-at-risk log filesystem tiny for near-instant scrub/balance/check, and my also at risk home small as well, with the big media files being on a different filesystem that's mostly read-only, so less at risk and needing less frequent backups. The tiny boot and large updates (distro repo, sources, ccache) are also separate, and mounted only for boot maintenance or updates.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
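The split-em-up suggestion quoted at the top of this message (one filesystem per backed-up server, optionally on thin LVs) might look roughly like the sketch below. The volume group, pool, and client names are invented for illustration, and with the default DRY_RUN=1 nothing is actually created:

```shell
#!/bin/sh
# Sketch: one thin LV + one btrfs filesystem per backup source, so snapshot
# counts stay low per filesystem and corruption is isolated to one client.
# vg0/thinpool and the client names are hypothetical; DRY_RUN defaults to 1,
# so the commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

for client in hostA hostB hostC; do
    run lvcreate -V 2T --thinpool vg0/thinpool -n "backup-$client"
    run mkfs.btrfs -L "backup-$client" "/dev/vg0/backup-$client"
done
```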
Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files
Austin S. Hemmelgarn posted on Fri, 29 Jun 2018 14:31:04 -0400 as excerpted:

> On 2018-06-29 13:58, james harvey wrote:
>> On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn wrote:
>>> On 2018-06-29 11:15, james harvey wrote:
>>>> On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote:
>>>>>
>>>>> And an open question I have about scrub is whether it only ever is
>>>>> checking csums, meaning nodatacow files are never scrubbed, or if
>>>>> the copies are at least compared to each other?
>>>>
>>>> Scrub never looks at nodatacow files. It does not compare the copies
>>>> to each other.
>>>>
>>>> Qu submitted a patch to make check compare the copies:
>>>> https://patchwork.kernel.org/patch/10434509/
>>>>
>>>> This hasn't been added to btrfs-progs git yet.
>>>>
>>>> IMO, I think the offline check should look at nodatacow copies like
>>>> this, but I still think this also needs to be added to scrub. In the
>>>> patch thread, I discuss my reasons why. In brief: online scanning;
>>>> this goes along with the user's expectation of scrub ensuring mirrored
>>>> data integrity; and recommendations to set up scrub on a periodic
>>>> basis to me mean it's the place to put it.
>>>
>>> That said, it can't sanely fix things if there is a mismatch. At
>>> least, not unless BTRFS gets proper generational tracking to handle
>>> temporarily missing devices. As of right now, sanely fixing things
>>> requires significant manual intervention, as you have to bypass the
>>> device read selection algorithm to be able to look at the state of the
>>> individual copies so that you can pick one to use and forcibly rewrite
>>> the whole file by hand.
>>
>> Absolutely. The user would need to use manual intervention as you
>> describe, or restore the single file(s) from backup. But it's a good
>> opportunity to tell the user they had partial data corruption, even if
>> it can't be auto-fixed. Otherwise they get intermittent data
>> corruption, depending on which copies are read.
> The thing is though, as things stand right now, you need to manually
> edit the data on-disk directly or restore the file from a backup to fix
> the file. While it's technically true that you can manually repair this
> type of thing in both of the cases for doing it without those patches I
> mentioned, it's functionally impossible for a regular user to do it
> without potentially losing some data.

[Usual backups rant, user vs. admin variant, nocow/tmpfs edition. Regulars can skip as the rest is already predicted from past posts, for them. =;^]

"Regular user"? "Regular users" don't need to bother with this level of detail. They simply get their "admin" to do it, even if that "admin" is their kid, or the kid from next door that's good with computers, or the geek squad (aka nsa-agent-squad) guy/gal, doing it... or telling them to install "a real OS", meaning whatever MS/Apple/Google something that they know how to deal with.

If the "user" is dealing with setting nocow, choosing btrfs in the first place, etc, then they're _not_ a "regular user" by definition; they're already an admin.

And as any admin learns rather quickly, the value of data is defined by the number of backups it's worth having of that data.

Which means it's not a problem. Either the data had a backup and it's (reasonably) trivial to restore the data from that backup, or the data was defined by lack of having that backup as of only trivial value, so low as to not be worth the time/trouble/resources necessary to make that backup in the first place.

Which of course means what was defined as of most value, either the data if there was a backup, or the time/trouble/resources that would have gone into creating it if not, is *always* saved.

(And of course the same goes for "I had a backup, but it's old", except in this case it's the value of the data delta between the backup and current. As soon as it's worth more than the time/trouble/hassle of updating the backup, it will by definition be updated.
Not having a newer backup available thus simply means the value of the data that changed between the last backup and current was simply not enough to justify updating the backup, and again, what was of most value is *always* saved, either the data, or the time that would have otherwise gone into making the newer backup.) Because while a "regular user" may not know it because it's not his /job/ to know it, if there's anything an admin knows *well* it's that the working copy of data **WILL** be damaged. It's not a matter of if, but of when, and of whether it'll be a fat-finger mistake, or a hardware or software failure, or wetware (theft, ransomware, etc), or wetware (flood, fire and the water that put it out damage, etc), tho none of that actually matters after all, because in the end, the only thing that matters was how the value of that data was defined by the number of backups made of it, and how quickly and conveniently at least
Re: Incremental send/receive broken after snapshot restore
Do not reply privately to mails on list. 29.06.2018 22:10, Hannes Schweizer пишет: > On Fri, Jun 29, 2018 at 7:44 PM Andrei Borzenkov wrote: >> >> 28.06.2018 23:09, Hannes Schweizer пишет: >>> Hi, >>> >>> Here's my environment: >>> Linux diablo 4.17.0-gentoo #5 SMP Mon Jun 25 00:26:55 CEST 2018 x86_64 >>> Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz GenuineIntel GNU/Linux >>> btrfs-progs v4.17 >>> >>> Label: 'online' uuid: e4dc6617-b7ed-4dfb-84a6-26e3952c8390 >>> Total devices 2 FS bytes used 3.16TiB >>> devid1 size 1.82TiB used 1.58TiB path /dev/mapper/online0 >>> devid2 size 1.82TiB used 1.58TiB path /dev/mapper/online1 >>> Data, RAID0: total=3.16TiB, used=3.15TiB >>> System, RAID0: total=16.00MiB, used=240.00KiB >>> Metadata, RAID0: total=7.00GiB, used=4.91GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> Label: 'offline' uuid: 5b449116-93e5-473e-aaf5-bf3097b14f29 >>> Total devices 2 FS bytes used 3.52TiB >>> devid1 size 5.46TiB used 3.53TiB path /dev/mapper/offline0 >>> devid2 size 5.46TiB used 3.53TiB path /dev/mapper/offline1 >>> Data, RAID1: total=3.52TiB, used=3.52TiB >>> System, RAID1: total=8.00MiB, used=512.00KiB >>> Metadata, RAID1: total=6.00GiB, used=5.11GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> Label: 'external' uuid: 8bf13621-01f0-4f09-95c7-2c157d3087d0 >>> Total devices 1 FS bytes used 3.65TiB >>> devid1 size 5.46TiB used 3.66TiB path >>> /dev/mapper/luks-3c196e96-d46c-4a9c-9583-b79c707678fc >>> Data, single: total=3.64TiB, used=3.64TiB >>> System, DUP: total=32.00MiB, used=448.00KiB >>> Metadata, DUP: total=11.00GiB, used=9.72GiB >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> >>> The following automatic backup scheme is in place: >>> hourly: >>> btrfs sub snap -r online/root online/root. >>> >>> daily: >>> btrfs sub snap -r online/root online/root. >>> btrfs send -c online/root. >>> online/root. | btrfs receive offline >>> btrfs sub del -c online/root. 
>>> >>> monthly: >>> btrfs sub snap -r online/root online/root. >>> btrfs send -c online/root. >>> online/root. | btrfs receive external >>> btrfs sub del -c online/root. >>> >>> Now here are the commands leading up to my problem: >>> After the online filesystem suddenly went ro, and btrfs check showed >>> massive problems, I decided to start the online array from scratch: >>> 1: mkfs.btrfs -f -d raid0 -m raid0 -L "online" /dev/mapper/online0 >>> /dev/mapper/online1 >>> >>> As you can see from the backup commands above, the snapshots of >>> offline and external are not related, so in order to at least keep the >>> extensive backlog of the external snapshot set (including all >>> reflinks), I decided to restore the latest snapshot from external. >>> 2: btrfs send external/root. | btrfs receive online >>> >>> I wanted to ensure I can restart the incremental backup flow from >>> online to external, so I did this >>> 3: mv online/root. online/root >>> 4: btrfs sub snap -r online/root online/root. >>> 5: btrfs property set online/root ro false >>> >>> Now, I naively expected a simple restart of my automatic backups for >>> external should work. >>> However after running >>> 6: btrfs sub snap -r online/root online/root. >>> 7: btrfs send -c online/root. >>> online/root. | btrfs receive external >> >> You just recreated your "online" filesystem from scratch. Where >> "old_external_reference" comes from? You did not show steps used to >> create it. >> >>> I see the following error: >>> ERROR: unlink root/.ssh/agent-diablo-_dev_pts_3 failed. No such file >>> or directory >>> >>> Which is unfortunate, but the second problem actually encouraged me to >>> post this message. 
>>> As planned, I had to start the offline array from scratch as well, >>> because I no longer had any reference snapshot for incremental backups >>> on other devices: >>> 8: mkfs.btrfs -f -d raid1 -m raid1 -L "offline" /dev/mapper/offline0 >>> /dev/mapper/offline1 >>> >>> However restarting the automatic daily backup flow bails out with a >>> similar error, although no potentially problematic previous >>> incremental snapshots should be involved here! >>> ERROR: unlink o925031-987-0/2139527549 failed. No such file or directory >>> >> >> Again - before you can *re*start incremental-forever sequence you need >> initial full copy. How exactly did you restart it if no snapshots exist >> either on source or on destination? > > Thanks for your help regarding this issue! > > Before the online crash, I've used the following online -> external > backup scheme: > btrfs sub snap -r online/root online/root. > btrfs send -c online/root. > online/root. | btrfs receive external > btrfs sub del -c online/root. > > By sending the existing snapshot from external to online (basically a > full copy of external/old_external_reference to online/root), it > should have been possible to restart the monthly online -> external > backup scheme, right? > You did