Hi Qu,

Oh dear, that is not good news!
I have been running the find-root command since yesterday, but it only
seems to be outputting the following message:

ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096

I tried both the latest btrfs tools compiled from source and the ones I
have installed, with the same result.

Is there a CLI utility I could use to determine whether the log contains
any other content?

Kind regards
Michael

On 30 April 2018 at 04:02, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2018-04-29 22:08, Michael Wade wrote:
>> Hi Qu,
>>
>> Got this error message:
>>
>> ./btrfs inspect dump-tree -b 20800943685632 /dev/md127
>> btrfs-progs v4.16.1
>> bytenr mismatch, want=20800943685632, have=3118598835113619663
>> ERROR: cannot read chunk root
>> ERROR: unable to open /dev/md127
>>
>> I have attached the dumps for:
>>
>> dd if=/dev/md127 of=/tmp/chunk_root.copy1 bs=1 count=32K skip=266325721088
>> dd if=/dev/md127 of=/tmp/chunk_root.copy2 bs=1 count=32K skip=266359275520
>
> Unfortunately, both dumps are corrupted and contain mostly garbage.
> I think the underlying stack (mdraid) has something wrong or has failed
> to recover its data.
>
> This means your last chance will be btrfs-find-root.
>
> Please try:
> # btrfs-find-root -o 3 <device>
>
> And provide all the output.
>
> But please keep in mind, the chunk root is a critical tree, and so far
> it's already heavily damaged.
> Although I could still continue trying to recover, there is a pretty low
> chance now.
>
> Thanks,
> Qu
>>
>> Kind regards
>> Michael
>>
>>
>> On 29 April 2018 at 10:33, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>> On 2018-04-29 16:59, Michael Wade wrote:
>>>> Ok, will it be possible for me to install the new version of the tools
>>>> on my current kernel without overriding the existing install? I'm
>>>> hesitant to update the kernel/btrfs as it might break the ReadyNAS
>>>> interface / future firmware upgrades.
>>>>
>>>> Perhaps I could grab this:
>>>> https://github.com/kdave/btrfs-progs/releases/tag/v4.16.1 and
>>>> hopefully build from source and then run the binaries directly?
>>>
>>> Of course, that's how most of us test btrfs-progs builds.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Kind regards
>>>>
>>>> On 29 April 2018 at 09:33, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>>>
>>>>>
>>>>> On 2018-04-29 16:11, Michael Wade wrote:
>>>>>> Thanks Qu,
>>>>>>
>>>>>> Please find attached the log file for the chunk recover command.
>>>>>
>>>>> Strangely, btrfs chunk recovery found no extra chunks beyond the
>>>>> current system chunk range.
>>>>>
>>>>> Which means it's the chunk tree itself that is corrupted.
>>>>>
>>>>> Please dump the chunk tree with the latest btrfs-progs (which provides
>>>>> the new --follow option).
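
(Regarding the question above about whether the find-root log contains anything
other than the repeated alignment error: one minimal way to check, assuming only
standard coreutils, is to capture the output and then summarise the distinct
messages; the same filter can be applied to an already-saved log, and the path
/tmp/find-root.log is purely illustrative:

  # capture both stdout and stderr of the find-root run (path is illustrative)
  btrfs-find-root -o 3 /dev/md127 2>&1 | tee /tmp/find-root.log
  # count each distinct message, most frequent first; anything other than
  # the "not aligned to sectorsize" error will show up here
  grep -v 'not aligned to sectorsize' /tmp/find-root.log | sort | uniq -c | sort -rn

If the second command prints nothing, the log really does contain only the
alignment errors.)
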
>>>>>
>>>>> # btrfs inspect dump-tree -b 20800943685632 <device>
>>>>>
>>>>> If it doesn't work, please provide the following binary dumps:
>>>>>
>>>>> # dd if=<dev> of=/tmp/chunk_root.copy1 bs=1 count=32K skip=266325721088
>>>>> # dd if=<dev> of=/tmp/chunk_root.copy2 bs=1 count=32K skip=266359275520
>>>>> (And we will need to repeat similar dumps several times, according to
>>>>> the above dump.)
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>
>>>>>>
>>>>>> Kind regards
>>>>>> Michael
>>>>>>
>>>>>> On 28 April 2018 at 12:38, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2018-04-28 17:37, Michael Wade wrote:
>>>>>>>> Hi Qu,
>>>>>>>>
>>>>>>>> Thanks for your reply. I will investigate upgrading the kernel;
>>>>>>>> however, I worry that future ReadyNAS firmware upgrades would fail on
>>>>>>>> a newer kernel version (I don't have much Linux experience, so maybe
>>>>>>>> my concerns are unfounded!?).
>>>>>>>>
>>>>>>>> I have attached the output of the dump super command.
>>>>>>>>
>>>>>>>> I did actually run chunk recover before, without the verbose option;
>>>>>>>> it took around 24 hours to finish but did not resolve my issue. Happy
>>>>>>>> to start that again if you need its output.
>>>>>>>
>>>>>>> The system chunk only contains the following chunks:
>>>>>>> [0, 4194304]: Initial temporary chunk, not used at all
>>>>>>> [20971520, 29360128]: System chunk created by mkfs, should be fully
>>>>>>> used up
>>>>>>> [20800943685632, 20800977240064]:
>>>>>>> The newly created large system chunk.
>>>>>>>
>>>>>>> The chunk root is still in the 2nd chunk and thus valid, but some of
>>>>>>> its leaves are out of that range.
>>>>>>>
>>>>>>> If you can't wait 24h for chunk recovery to run, my advice would be to
>>>>>>> move the disk to some other computer and use the latest btrfs-progs to
>>>>>>> execute the following command:
>>>>>>>
>>>>>>> # btrfs inspect dump-tree -b 20800943685632 --follow
>>>>>>>
>>>>>>> If we're lucky enough, we may read out the tree leaf containing the
>>>>>>> new system chunk and save the day.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks so much for your help.
>>>>>>>>
>>>>>>>> Kind regards
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On 28 April 2018 at 09:45, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2018-04-28 16:30, Michael Wade wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I was hoping that someone would be able to help me resolve the
>>>>>>>>>> issues I am having with my ReadyNAS BTRFS volume. Basically my
>>>>>>>>>> trouble started after a power cut; subsequently the volume would
>>>>>>>>>> not mount. Here are the details of my setup as it is at the moment:
>>>>>>>>>>
>>>>>>>>>> uname -a
>>>>>>>>>> Linux QAI 4.4.116.alpine.1 #1 SMP Mon Feb 19 21:58:38 PST 2018
>>>>>>>>>> armv7l GNU/Linux
>>>>>>>>>
>>>>>>>>> The kernel is pretty old for btrfs.
>>>>>>>>> Strongly recommended to upgrade.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> btrfs --version
>>>>>>>>>> btrfs-progs v4.12
>>>>>>>>>
>>>>>>>>> So are the user tools.
>>>>>>>>>
>>>>>>>>> Although I think it won't be a big problem, as the needed tools
>>>>>>>>> should be there.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> btrfs fi show
>>>>>>>>>> Label: '11baed92:data' uuid: 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>> Total devices 1 FS bytes used 5.12TiB
>>>>>>>>>> devid 1 size 7.27TiB used 6.24TiB path /dev/md127
>>>>>>>>>
>>>>>>>>> So, it's btrfs on mdraid.
>>>>>>>>> It would normally make things harder to debug, so I could only
>>>>>>>>> provide advice from the btrfs side.
>>>>>>>>> For the mdraid part, I can't guarantee anything.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here are the relevant dmesg logs for the current state of the device:
>>>>>>>>>>
>>>>>>>>>> [ 19.119391] md: md127 stopped.
>>>>>>>>>> [ 19.120841] md: bind<sdb3>
>>>>>>>>>> [ 19.121120] md: bind<sdc3>
>>>>>>>>>> [ 19.121380] md: bind<sda3>
>>>>>>>>>> [ 19.125535] md/raid:md127: device sda3 operational as raid disk 0
>>>>>>>>>> [ 19.125547] md/raid:md127: device sdc3 operational as raid disk 2
>>>>>>>>>> [ 19.125554] md/raid:md127: device sdb3 operational as raid disk 1
>>>>>>>>>> [ 19.126712] md/raid:md127: allocated 3240kB
>>>>>>>>>> [ 19.126778] md/raid:md127: raid level 5 active with 3 out of 3 devices, algorithm 2
>>>>>>>>>> [ 19.126784] RAID conf printout:
>>>>>>>>>> [ 19.126789] --- level:5 rd:3 wd:3
>>>>>>>>>> [ 19.126794] disk 0, o:1, dev:sda3
>>>>>>>>>> [ 19.126799] disk 1, o:1, dev:sdb3
>>>>>>>>>> [ 19.126804] disk 2, o:1, dev:sdc3
>>>>>>>>>> [ 19.128118] md127: detected capacity change from 0 to 7991637573632
>>>>>>>>>> [ 19.395112] Adding 523708k swap on /dev/md1. Priority:-1 extents:1 across:523708k
>>>>>>>>>> [ 19.434956] BTRFS: device label 11baed92:data devid 1 transid 151800 /dev/md127
>>>>>>>>>> [ 19.739276] BTRFS info (device md127): setting nodatasum
>>>>>>>>>> [ 19.740440] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740450] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740498] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740512] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740552] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740560] BTRFS critical (device md127): unable to find logical 3208757641216 len 4096
>>>>>>>>>> [ 19.740576] BTRFS error (device md127): failed to read chunk root
>>>>>>>>>
>>>>>>>>> This shows it pretty clearly: btrfs fails to read the chunk root.
>>>>>>>>> And according to your "len 4096" above, it's a pretty old fs, as it's
>>>>>>>>> still using a 4K nodesize rather than the 16K nodesize.
>>>>>>>>>
>>>>>>>>> According to the above output, your superblock somehow lacks the
>>>>>>>>> needed system chunk mapping, which is used to initialize the chunk
>>>>>>>>> mapping.
>>>>>>>>>
>>>>>>>>> Please provide the following command output:
>>>>>>>>>
>>>>>>>>> # btrfs inspect dump-super -fFa /dev/md127
>>>>>>>>>
>>>>>>>>> Also, please consider running the following command and dumping all
>>>>>>>>> its output:
>>>>>>>>>
>>>>>>>>> # btrfs rescue chunk-recover -v /dev/md127
>>>>>>>>>
>>>>>>>>> Please note that the above command can take a long time to finish,
>>>>>>>>> and if it works without problems, it may solve your problem.
>>>>>>>>> But if it doesn't work, the output could help me to manually craft a
>>>>>>>>> fix for your super block.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> [ 19.783975] BTRFS error (device md127): open_ctree failed
>>>>>>>>>>
>>>>>>>>>> In an attempt to recover the volume myself I ran a few BTRFS
>>>>>>>>>> commands, mostly using advice from here:
>>>>>>>>>> https://lists.opensuse.org/opensuse/2017-02/msg00930.html. However,
>>>>>>>>>> that actually seems to have made things worse, as I can no longer
>>>>>>>>>> mount the file system, not even in readonly mode.
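
(A side note on the two 32 KiB dd dumps requested earlier in this thread: since
both byte offsets are multiples of 4096, an equivalent but much faster form than
bs=1 is to read in 4 KiB blocks. A sketch only, assuming GNU dd and the same
device, offsets and output paths as above; please double-check the arithmetic
before running:

  # 266325721088 / 4096 = 65020928, 266359275520 / 4096 = 65029120,
  # and 32K = 8 blocks of 4096 bytes
  dd if=/dev/md127 of=/tmp/chunk_root.copy1 bs=4096 skip=65020928 count=8
  dd if=/dev/md127 of=/tmp/chunk_root.copy2 bs=4096 skip=65029120 count=8

The resulting files should be byte-for-byte identical to the bs=1 versions.)
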
>>>>>>>>>>
>>>>>>>>>> So, starting from the beginning, here is a list of things I have
>>>>>>>>>> done so far (hopefully I remembered the order in which I ran them!):
>>>>>>>>>>
>>>>>>>>>> 1. Noticed that my backups to the NAS were not running (I didn't get
>>>>>>>>>> notified that the volume had basically "died").
>>>>>>>>>> 2. The ReadyNAS UI indicated that the volume was inactive.
>>>>>>>>>> 3. SSHed onto the box and found that the first drive was not marked
>>>>>>>>>> as operational (the log showed I/O errors / UNKOWN (0x2003)), so I
>>>>>>>>>> replaced the disk and let the array resync.
>>>>>>>>>> 4. After the resync the volume was still inaccessible, so I looked
>>>>>>>>>> at the logs once more and saw something like the following, which
>>>>>>>>>> seemed to indicate that the replay log had been corrupted when the
>>>>>>>>>> power went out:
>>>>>>>>>>
>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>> BTRFS: error (device md127) in btrfs_replay_log:2524: errno=-5 IO
>>>>>>>>>> failure (Failed to recover log tree)
>>>>>>>>>> BTRFS error (device md127): pending csums is 155648
>>>>>>>>>> BTRFS error (device md127): cleaner transaction attach returned -30
>>>>>>>>>> BTRFS critical (device md127): corrupt leaf, non-root leaf's nritems
>>>>>>>>>> is 0: block=232292352, root=7, slot=0
>>>>>>>>>>
>>>>>>>>>> 5. Then I ran:
>>>>>>>>>>
>>>>>>>>>> btrfs rescue zero-log
>>>>>>>>>>
>>>>>>>>>> 6. I was then able to mount the volume in readonly mode and ran:
>>>>>>>>>>
>>>>>>>>>> btrfs scrub start
>>>>>>>>>>
>>>>>>>>>> which fixed some errors but not all:
>>>>>>>>>>
>>>>>>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:00:34
>>>>>>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>>>>>>> error details: csum=6
>>>>>>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>>>>>>
>>>>>>>>>> scrub status for 20628cda-d98f-4f85-955c-932a367f8821
>>>>>>>>>> scrub started at Tue Apr 24 17:27:44 2018, running for 04:34:43
>>>>>>>>>> total bytes scrubbed: 224.26GiB with 6 errors
>>>>>>>>>> error details: csum=6
>>>>>>>>>> corrected errors: 0, uncorrectable errors: 6, unverified errors: 0
>>>>>>>>>>
>>>>>>>>>> 7. Seeing this hanging, I rebooted the NAS.
>>>>>>>>>> 8. I think this is when the volume would not mount at all.
>>>>>>>>>> 9. Seeing log entries like these:
>>>>>>>>>>
>>>>>>>>>> BTRFS warning (device md127): checksum error at logical
>>>>>>>>>> 20800943685632 on dev /dev/md127, sector 520167424: metadata node
>>>>>>>>>> (level 1) in tree 3
>>>>>>>>>>
>>>>>>>>>> I ran
>>>>>>>>>>
>>>>>>>>>> btrfs check --fix-crc
>>>>>>>>>>
>>>>>>>>>> And that brings us to where I am now: some seemingly corrupted BTRFS
>>>>>>>>>> metadata and being unable to mount the drive even with the recovery
>>>>>>>>>> option.
>>>>>>>>>>
>>>>>>>>>> Any help you can give is much appreciated!
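
(One more low-risk check that might be worth doing, since the full superblock
dump was already requested above with btrfs inspect dump-super -fFa: the btrfs
superblock keeps up to four backup root pointers, and an older copy of the chunk
root might still be readable. A sketch only; the bytenr to pass to dump-tree is
whatever the grep reports, not a known-good value:

  # list the chunk root bytenr recorded in each backup root slot
  btrfs inspect dump-super -f /dev/md127 | grep backup_chunk_root
  # then try reading any candidate it reports, for example:
  # btrfs inspect dump-tree -b <bytenr-from-above> /dev/md127

If one of those block numbers differs from 20800943685632 and dump-tree can read
it, that may give another data point to work with.)
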
>>>>>>>>>>
>>>>>>>>>> Kind regards
>>>>>>>>>> Michael
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html