RE: Help with leaf parent key incorrect
> -----Original Message-----
> From: Anand Jain [mailto:anand.j...@oracle.com]
> Sent: Monday, 26 February 2018 7:27 PM
> To: Paul Jones <p...@pauljones.id.au>; linux-btrfs@vger.kernel.org
> Subject: Re: Help with leaf parent key incorrect
>
> > There is one io error in the log below,
>
> Apparently, that's not a real EIO. We need to fix it.
> But can't be the root cause we are looking for here.
>
> > Feb 24 22:41:59 home kernel: BTRFS: error (device dm-6) in btrfs_run_delayed_refs:3076: errno=-5 IO failure
> > Feb 24 22:41:59 home kernel: BTRFS info (device dm-6): forced readonly
>
> static int run_delayed_extent_op(struct btrfs_trans_handle *trans,
>                                  struct btrfs_fs_info *fs_info,
>                                  struct btrfs_delayed_ref_head *head,
>                                  struct btrfs_delayed_extent_op *extent_op)
> {
> ::
>
>         } else {
>                 err = -EIO;
>                 goto out;
>         }
>
> > but other than that I have never had io errors before, or any other troubles.
>
> Hm. btrfs dev stat shows real disk IO errors.
> As this FS isn't mountable .. pls try
>    btrfs dev stat > file
>    search for 'device stats', there will be one for each disk.
> Or it reports in the syslog when it happens not necessarily
> during dedupe.

vm-server ~ # btrfs dev stat /media/storage/
[/dev/mapper/b-storage--b].write_io_errs    0
[/dev/mapper/b-storage--b].read_io_errs     0
[/dev/mapper/b-storage--b].flush_io_errs    0
[/dev/mapper/b-storage--b].corruption_errs  0
[/dev/mapper/b-storage--b].generation_errs  0
[/dev/mapper/a-storage--a].write_io_errs    0
[/dev/mapper/a-storage--a].read_io_errs     0
[/dev/mapper/a-storage--a].flush_io_errs    0
[/dev/mapper/a-storage--a].corruption_errs  0
[/dev/mapper/a-storage--a].generation_errs  0

vm-server ~ # btrfs dev stat /
[/dev/sdb1].write_io_errs    0
[/dev/sdb1].read_io_errs     0
[/dev/sdb1].flush_io_errs    0
[/dev/sdb1].corruption_errs  0
[/dev/sdb1].generation_errs  0
[/dev/sda1].write_io_errs    0
[/dev/sda1].read_io_errs     0
[/dev/sda1].flush_io_errs    0
[/dev/sda1].corruption_errs  0
[/dev/sda1].generation_errs  0

vm-server ~ # btrfs dev stat /dev/mapper/a-backup--a
ERROR: '/dev/mapper/a-backup--a' is not a mounted btrfs device

I check syslog regularly and I haven't seen any errors on any drives for over a year.

> > One of my other filesystems share the same two discs and it is still fine, so I
> > think the hardware is probably ok.
>
> Right. I guess that too. A confirmation will be better.
>
> > I've copied the beginning of the errors below.
>
> At my end finding the root cause of 'parent transid verify failed'
> during/after dedupe is kind of fading as the disk seems to have had
> no issues, which I had in mind.
>
> Also, there wasn't abrupt power-recycle here? I presume.

No, although now that I think about it I just realised it happened right after I upgraded from 4.15.4 to 4.15.5 and I didn't quit bees before rebooting, I let the system do it. Not sure if it's relevant or not.
I also just noticed that the kernel has spawned hundreds of kworkers - the highest number I can see is 516.

> It's better to save the output disk1-log and disk2-log as below
> before further efforts to recovery. Just in case if something
> pops out.
>
>   btrfs in dump-super -fa disk1 > disk1-log
>   btrfs in dump-tree --degraded disk1 >> disk1-log [1]

I applied the patch and started dumping the tree, but I stopped it after about 10 mins and 9GB.
Because I use zstd and free space tree the recovery tools wouldn't do anything in RW mode, so I've decided to just blow it away and restore from a backup. I made a block level copy of both discs in case I need anything.

Thanks for your help anyway.

Regards,
Paul.
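
The per-device counters pasted above can also be checked mechanically. The following is a minimal sketch, not part of btrfs-progs: it assumes the `[device].counter value` line shape shown in this message, and the function names `parse_dev_stats` and `nonzero_counters` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches lines such as "[/dev/sdb1].write_io_errs 0". The whitespace before
# the value is optional so that output whose alignment spaces were lost
# (e.g. "write_io_errs0") still parses.
STAT_LINE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>[a-z_]+?)\s*(?P<value>\d+)$"
)


def parse_dev_stats(text):
    """Return {device: {counter: value}} parsed from `btrfs dev stat` output.

    Lines that do not look like a counter (prompts, ERROR lines) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m:
            stats.setdefault(m["dev"], {})[m["counter"]] = int(m["value"])
    return stats


def nonzero_counters(stats):
    """List (device, counter, value) for every counter that is not zero."""
    return [
        (dev, name, val)
        for dev, counters in stats.items()
        for name, val in counters.items()
        if val != 0
    ]
```

Feeding it the saved output of `btrfs dev stat` for a mounted filesystem and getting an empty list from `nonzero_counters` would correspond to the all-zero counters reported here.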
Re: Help with leaf parent key incorrect
> There is one io error in the log below,

Apparently, that's not a real EIO. We need to fix it.
But can't be the root cause we are looking for here.

> Feb 24 22:41:59 home kernel: BTRFS: error (device dm-6) in btrfs_run_delayed_refs:3076: errno=-5 IO failure
> Feb 24 22:41:59 home kernel: BTRFS info (device dm-6): forced readonly

static int run_delayed_extent_op(struct btrfs_trans_handle *trans,
                                 struct btrfs_fs_info *fs_info,
                                 struct btrfs_delayed_ref_head *head,
                                 struct btrfs_delayed_extent_op *extent_op)
{
::

        } else {
                err = -EIO;
                goto out;
        }

> but other than that I have never had io errors before, or any other troubles.

Hm. btrfs dev stat shows real disk IO errors.
As this FS isn't mountable .. pls try
   btrfs dev stat > file
   search for 'device stats', there will be one for each disk.
Or it reports in the syslog when it happens not necessarily
during dedupe.

> One of my other filesystems share the same two discs and it is still fine, so I think the hardware is probably ok.

Right. I guess that too. A confirmation will be better.

> I've copied the beginning of the errors below.

At my end finding the root cause of 'parent transid verify failed'
during/after dedupe is kind of fading as the disk seems to have had
no issues, which I had in mind.

Also, there wasn't abrupt power-recycle here? I presume.

It's better to save the output disk1-log and disk2-log as below
before further efforts to recovery. Just in case if something
pops out.

  btrfs in dump-super -fa disk1 > disk1-log
  btrfs in dump-tree --degraded disk1 >> disk1-log [1]

  btrfs in dump-super -fa disk2 > disk2-log
  btrfs in dump-tree --degraded disk2 >> disk2-log [1]

[1]
 --degraded option is in the ML.
 [PATCH] btrfs-progs: dump-tree: add degraded option

Thanks, Anand
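
For anyone triaging a similar report, the 'parent transid verify failed' messages discussed in this thread have a regular shape: a block address, the generation the parent expected ("wanted"), and the generation actually read from disk ("found"). A minimal sketch for condensing a long dmesg or `btrfs check` capture into one entry per block might look like the following. The function name `summarize_transid_failures` is a hypothetical helper, not part of any btrfs tool.

```python
import re
from collections import defaultdict

# Message shape as it appears in the kernel log and in `btrfs check` output.
TRANSID = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(log_text):
    """Map each block address to (wanted, found, number of occurrences).

    If a block is reported with different generations over time, the last
    (wanted, found) pair seen in the log is kept.
    """
    hits = defaultdict(int)
    gens = {}
    for m in TRANSID.finditer(log_text):
        bytenr, wanted, found = map(int, m.groups())
        hits[bytenr] += 1
        gens[bytenr] = (wanted, found)
    return {bytenr: (*gens[bytenr], hits[bytenr]) for bytenr in gens}
```

The result makes it easier to see how many distinct blocks are affected and how far behind each one is, which is harder to judge from thousands of rate-limited log lines.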
Re: Help with leaf parent key incorrect
On 02/25/2018 06:16 PM, Paul Jones wrote:
> Hi all,
>
> I was running dedupe on my filesystem and something went wrong overnight, by the time I noticed the fs was readonly.

Thanks for the report. I have few questions..
 Kind of raid profile used here?
 Dedupe tool that was used?
 Was the fs full before dedupe?
 Were there any IO errors?

Thanks, Anand

> When trying to check it this is what I get:
> vm-server ~ # btrfs check /dev/mapper/a-backup--a
> parent transid verify failed on 2371034071040 wanted 62977 found 62893
> parent transid verify failed on 2371034071040 wanted 62977 found 62893
> parent transid verify failed on 2371034071040 wanted 62977 found 62893
> parent transid verify failed on 2371034071040 wanted 62977 found 62893
> Ignoring transid failure
> leaf parent key incorrect 2371034071040
> ERROR: cannot open file system
>
> Is there a way to fix this? I'm using kernel 4.15.5
>
> This is the last part of dmesg
>
> [ +0.02] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
> [ +0.04] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
> [ +1.107963] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
> [ +0.05] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
> [ +1.473598] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +0.001927] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +0.03] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +0.60] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +0.01] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
> [ +2.676048] verify_parent_transid: 10362 callbacks suppressed
> [ +0.02] BTRFS error (device dm-6): parent transid verify failed on 2373991677952 wanted 63210 found 63208
> [ +0.03] BTRFS error (device dm-6): parent transid verify failed on 2373991677952 wanted 63210 found 63208
> [ +0.078432] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.43] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.01] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.058638] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.139174] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [ +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
> [Feb25 20:48] BTRFS info (device dm-6): using free space tree
> [ +0.02] BTRFS error (device dm-6): Remounting read-write after error is not allowed
> [Feb25 20:49] BTRFS error (device dm-6): cleaner transaction attach returned -30
> [ +0.238718] BTRFS warning (device dm-6): page private not zero on page 1596642967552
> [ +0.03] BTRFS warning (device dm-6): page private not zero on page 1596642971648
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596642975744
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596642979840
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643672064
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643676160
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643680256
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643684352
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643704832
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643708928
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643713024
> [ +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643717120
> [ +0.28] BTRFS warning (device dm-6): page private not zero on page 2363051098112
> [ +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051102208
> [ +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051106304
> [ +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051110400
> [ +0.01] BTRFS warning (device dm-6): page private not zero on page 2368056344576
> [ +0.00] BTRFS