Re: Help Recovering BTRFS array
Hi Duncan,

I'm not sure if this will attach to my original message... Thank you for your reply. For some reason I'm not getting list messages, even though I know I am subscribed.

I know all too well about the golden rule of data; it has bitten me a few times. The data on this array is mostly data that I don't really care about, and I was able to copy off what I wanted. The main reason I sent it to the list was just to see if I could somehow return the FS to a working state without having to recreate it.

I'm just surprised that all 3 copies of the superblock got corrupted. Probably my lack of understanding, but I always assumed that if one copy got corrupted it would be replaced by a good copy, therefore leaving all copies in a good state. Is that not the case? If it is, then what bad luck that all 3 got messed up at the same time.

Some information I forgot to include in my original message:

uname -a
Linux thebeach 4.12.13-gentoo-GMAN #1 SMP Sat Sep 16 15:28:26 ADT 2017 x86_64 Intel(R) Core(TM) i5-2320 CPU @ 3.00GHz GenuineIntel GNU/Linux

btrfs --version
btrfs-progs v4.10.2

Anyway, thank you again for your reply. I will leave the FS intact for a few days in case any more details could help the development of BTRFS, and maybe help avoid this happening again or lead to a recovery option.

Marc
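A note on the superblock-copy question above: as far as I can tell, btrfs writes all of its superblock mirrors on every commit but only reads and verifies the primary copy at 64KiB during mount, so a bad mirror is not automatically rewritten from a good one outside of a scrub or an explicit repair. When at least one copy is still valid, btrfs-progs can try to rebuild the bad ones from it; a minimal sketch, assuming /dev/md0 is the device the filesystem is mounted from as elsewhere in this thread:

# Attempt to rewrite damaged superblock copies from a still-valid one.
# This only helps when at least one of the three copies has a good magic;
# here all three reported "bad magic", so there is nothing to copy from.
btrfs rescue super-recover -v /dev/md0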
Re: Help Recovering BTRFS array
grondinm posted on Mon, 18 Sep 2017 14:14:08 -0300 as excerpted:

> superblock: bytenr=65536, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 65536
>
> superblock: bytenr=67108864, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 67108864
>
> superblock: bytenr=274877906944, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 274877906944
>
> Now i'm really panicked. Is the FS toast? Can any recovery be attempted?

First, I'm a user and list regular, not a dev. With luck they can help beyond the suggestions below...

However, there's no need to panic in any case, due to the sysadmin's first rule of backups: the true value of any data is defined by the number of backups of that data you consider(ed) it worth having.

As a result, there are precisely two possibilities, neither of which calls for panic.

1) No need to panic because you have a backup, and recovery is as simple as restoring from that backup.

2) You don't have a backup, in which case the lack of that backup means you have defined the value of the data as only trivial, worth less than the time/trouble/resources you saved by not making that backup. Because the data is only of trivial value anyway, and you saved the more valuable assets of the time/trouble/resources you would have put into that backup were the data of more than trivial value, you've still saved the stuff you considered most valuable, so again, no need to panic.

It's a binary state. There's no third possibility available, and no possibility that you lost what your actions, or lack of them in the case of no backup, defined as of most value to you.

(As for the freshness of that backup, the same rule applies, but to the data delta between the state as of the backup and the current state. If the value of the changed data is worth it to you to have it backed up, you'll have freshened your backup. If not, you have defined it to be of such trivial value as not to be worth the time/trouble/resources to do so.)

That said, at the time you're calculating the value of the data against the value of the time/trouble/resources required to back it up, the loss potential remains theoretical. Once something actually happens to the data, it's no longer theoretical, and the data, while of trivial enough value to be worth the risk when it was theoretical, may still be valuable enough to you to spend at least some time/trouble trying to recover it.

In that case, since you can still mount, I'd suggest mounting read-only to prevent any further damage, and then copying off what data you can to a different, unaffected filesystem. Then, if there's still data you want that you couldn't simply copy off, you can try btrfs restore.

While I do have backups here, a couple of times when things went bad, btrfs restore was able to get back pretty much everything to current, whereas had I restored from backups I'd have lost enough changed data to hurt, even if I had defined it as of trivial enough value, while the risk remained theoretical, that I hadn't yet freshened the backup.

(Since then I've upgraded the rest of my storage to ssd, thus lowering the time and hassle cost of backups and encouraging me to do them more frequently. Talking of which, I need to freshen them in the near future. It's now on my list for my next day off...)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
Richard Stallman
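For completeness, the read-only mount plus btrfs restore route Duncan suggests looks roughly like the sketch below. The mount points /mnt/rescue and /mnt/target are hypothetical, and /dev/md0 is the device used elsewhere in this thread; restore does not write to the damaged filesystem, but it needs enough free space on the target.

# Mount the damaged filesystem read-only so nothing more gets written to it,
# then copy off whatever is still readable to a separate, healthy filesystem.
mount -o ro /dev/md0 /mnt/rescue
cp -a /mnt/rescue/. /mnt/target/
umount /mnt/rescue

# For anything that could not be copied while mounted, try restore against
# the raw device. -D/--dry-run only lists what would be recovered; drop it
# to actually write the recovered files into /mnt/target.
btrfs restore -D /dev/md0 /mnt/target
btrfs restore /dev/md0 /mnt/target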
Help Recovering BTRFS array
Hello,

I will try to provide all the information pertinent to the situation I find myself in.

Yesterday, while trying to write some data to a BTRFS filesystem on top of an mdadm raid5 array encrypted with dmcrypt, comprising 4 1TB HDDs, my system became unresponsive and I had no choice but to hard reset. The system came back up with no problem and the array in question mounted without a complaint. Once I tried to write data to it again, however, the system became unresponsive again and required another hard reset. Again the system came back up and everything mounted with no complaints.

This time I decided to run some checks. I ran a raid check by issuing 'echo check > /sys/block/md0/md/sync_action'. This completed without a single error. So I performed a proper restart, just because, and once the system came back up I initiated a scrub on the btrfs filesystem. This greeted me with my first indication that something is wrong:

btrfs sc stat /media/Storage2
scrub status for e5bd5cf3-c736-48ff-b1c6-c9f678567788
        scrub started at Mon Sep 18 06:05:21 2017, running for 07:40:47
        total bytes scrubbed: 1.03TiB with 1 errors
        error details: super=1
        corrected errors: 0, uncorrectable errors: 0, unverified errors: 0

I was concerned, but since it was still scrubbing I left it. Now things look really bleak... Every few minutes the scrub process goes into a D state as shown by htop; it eventually keeps going and, as far as I can see, is still scrubbing (slowly). I decided to check something else (based on the error above). I ran btrfs inspect-internal dump-super -a -f /dev/md0, which gave me this:

superblock: bytenr=65536, device=/dev/md0
-
ERROR: bad magic on superblock on /dev/md0 at 65536

superblock: bytenr=67108864, device=/dev/md0
-
ERROR: bad magic on superblock on /dev/md0 at 67108864

superblock: bytenr=274877906944, device=/dev/md0
-
ERROR: bad magic on superblock on /dev/md0 at 274877906944

Now I'm really panicked. Is the FS toast? Can any recovery be attempted?

Here is the output of dump-super with the -F option:

superblock: bytenr=65536, device=/dev/md0
-
csum_type               43668 (INVALID)
csum_size               32
csum                    0x76c647b04abf1057f04e40d1dc52522397258064b98a1b8f6aa6934c74c0dd55 [DON'T MATCH]
bytenr                  6376050623103086821
flags                   0x7edcc412b742c79f
                        ( WRITTEN | RELOC | METADUMP | unknown flag: 0x7edcc410b742c79c )
magic                   ..l~...q [DON'T MATCH]
fsid                    2cf827fa-7ab8-e290-b152-1735c2735a37
label                   .a.9.@.=4.#.|.D...]..dh=d,..k..n..~.5.i.8...(.._.tl.a.@..2..qidj.>Hy.U..{X5.kG0.)t..;/.2...@.T.|.u.<.`!J*9./8...&.g\.V...*.,/95.uEs..W.i..z..h...n(...VGn^F...H...5.DT..3.A..mK...~..}.1..n.
generation              1769598730239175261
root                    14863846352370317867
sys_array_size          1744503544
chunk_root_generation   18100024505086712407
root_level              79
chunk_root              10848092274453435018
chunk_root_level        156
log_root                7514172289378668244
log_root_transid        6227239369566282426
log_root_level          18
total_bytes             5481087866519986730
bytes_used              13216280034370888020
sectorsize              4102056786
nodesize                1038279258
leafsize                276348297
stripesize              2473897044
root_dir                12090183195204234845
num_devices             12836127619712721941
compat_flags            0xf98ff436fc954bd4
compat_ro_flags         0x3fe8246616164da7
                        ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID | unknown flag: 0x3fe8246616164da4 )
incompat_flags          0x3989a5037330bfd8
                        ( COMPRESS_LZO | COMPRESS_LZOv2 | EXTENDED_IREF | RAID56 | SKINNY_METADATA | NO_HOLES | unknown flag: 0x3989a5037330bc10 )
cache_generation        10789185961859482334
uuid_tree_generation    14921288820846890813
dev_item.uuid           e6e382b3-de66-4c25-7cc9-3cc43cde9c24
dev_item.fsid           f8430e37-12ca-adaf-b038-f0ee10ce6327 [DON'T MATCH]
dev_item.type           7909001383421391155
dev_item.total_bytes    4839925749276763097
dev_item.bytes_used     14330418354255459170
dev_item.io_align       4136652250
dev_item.io_width       1113335506
dev_item.sector_size    1197062542
dev_item.devid          16559830033162408461
dev_item.dev_group      3271056113
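For reference, the three bytenr values in that output are the fixed locations of the btrfs superblock mirrors: 64KiB (65536), 64MiB (67108864) and 256GiB (274877906944). A quick, read-only way to check whether the on-disk magic survives at those offsets is sketched below; /dev/md0 is the device used in this thread, and 64 is the byte offset of the 8-byte magic within each superblock.

# Dump the 8 magic bytes of each superblock copy without writing anything.
# On a healthy filesystem every line shows the ASCII string _BHRfS_M.
for off in 65536 67108864 274877906944; do
    dd if=/dev/md0 bs=1 skip=$((off + 64)) count=8 2>/dev/null | xxd
done

If no copy shows the magic, either the superblocks really are gone or, since dm-crypt is part of this stack, the check may be reading below the encryption layer; running the same command against the device the filesystem is actually mounted from would distinguish the two.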