Marc MERLIN posted on Wed, 02 Jul 2014 13:41:52 -0700 as excerpted:

> This got triggered by an rsync I think. I'm not sure which of my btrfs
> FS has the issue yet since BUG_ON isn't very helpful as discussed
> earlier.
>
> [160562.925463] parent transid verify failed on 2776298520576
> wanted 41015 found 18120
> [160562.950297] ------------[ cut here ]------------
> [160562.965904] kernel BUG at fs/btrfs/locking.c:269!
>
> But shouldn't messages like 'parent transid verify failed' print which
> device this happened on to give the operator a hint on where the problem
> is?
>
> Could someone do a pass at those and make sure they all print the device
> ID/name?
kernel 3.16 series here, rc2+ when this happened, rc3+ now. IOW, both the 3.15 series (Marc) and the 3.16 series (me) are known affected.

FWIW, I'm not sure what originally triggered it, but I recently had a couple of bad shutdowns -- systemd would say it was turning off the system but it wouldn't power off, and after a manual shutoff and later reboot, the two read-write-mounted btrfs filesystems (home and log; root, including the rest of the system, is read-only by default here) failed to mount. The mount failures triggered the same parent transid verify failed errors and a kernel BUG at fs/btrfs/locking.c -- I /believe/ line 269, but I couldn't swear to it.

The biggest difference here (other than the fact that it happened on mount of critical filesystems at boot, so probably double-digit seconds since boot at most) was that the parent transid numbers were only a few off, something like wanted nnnn, found nnnn-2.

[End of technical stuff. The rest is discussion of my recovery experience, both because it might be of help to others and because it lets me tell my story. =:^) ]

I have backups, but they weren't as current as I would have liked, so I decided to try recovery.

My rootfs is btrfs as well, a /separate/ btrfs, but it stays mounted read-only by default and is only mounted read-write for updates, so it wasn't damaged. That includes the bits of /var that I can get away with keeping read-only, with various /var/lib/* subdirs symlinked to /home/var/lib/* subdirs where they need to be writable, /var/log being a separate dedicated filesystem -- one of the two that was damaged -- and the usual /var/run and /var/lock symlinks to /run and /run/lock (/run being a tmpfs mount in standard systemd configurations). As a result, the rootfs mounted from the initramfs, and systemd on it was invoked as the new PID 1 init in the transfer from the initramfs.
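Since those transid numbers turn out to matter later (more below), it's worth capturing them from the kernel log before doing anything else. A minimal sketch -- the function name and the idea of saving dmesg to a file are mine, not anything standard:

```shell
# Hypothetical helper: pull the "parent transid verify failed" lines out
# of a captured kernel log, so the wanted/found generation numbers can
# later be compared against what btrfs-find-root reports.
transid_errors() {
    # $1 = path to a captured dmesg/journal text file
    grep -o 'parent transid verify failed on [0-9]* wanted [0-9]* found [0-9]*' "$1"
}
```

Usage would be something like `dmesg > /tmp/kmsg; transid_errors /tmp/kmsg` -- assuming, of course, that the box stays up long enough to run it.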
Systemd was in turn able to start early-boot services and anything that didn't depend on home (including the bits of /var/lib symlinked into home) or log being mounted. But that of course left a number of critical services failing due to dependencies on the home and/or log mounts, since those mounts were failing.

Fortunately, while some of the time the errors would trigger a full kernel lockup with the above parent transid and locking BUG, other times the mount attempt would simply error out and systemd would drop me to the emergency-mode root-login prompt. (If it hadn't, I'd have had to boot the backup.) Since the main rootfs, including /usr, /etc and much of /var, was already mounted and safely read-only -- so I wasn't too worried about damaging it -- that left me with only a partly working system, but with access to all the normal recovery tools, manpages, etc. I'd normally have.

The only big thing missing initially (other than X/kde, of course) was network access, due to dependencies on the unmountable filesystems for local DNS and, I think, iptables logging. I could have reconfigured that if I'd had to, but after I got log back up I found I had network access again (presumably with fallback to the ISP's DNS), and was able to get to the wiki to research recovery of home a bit further.

I decided to tackle log (/var/log) before home, since it was smaller and I figured I could use anything I learned in the process to help save more of home. My policy is no backup of the log partition, since I don't do backups regularly enough for the logs on them to be of much likely use. That left me trying to repair or recover what I could, then doing a mkfs.btrfs on it.

The various repair options I tried didn't help -- they either died without helpful output or triggered the same lockup. Mounting with recovery or recovery,ro wouldn't work, and neither would btrfs check.
Btrfs rescue didn't look useful, as I couldn't find useful documentation on chunk-recover, and the supers looked fine (btrfs-show-super) so super-recover was unnecessary. I tried btrfs-zero-log on the log partition, but it didn't make the problem better and may have made it worse, so I didn't try it on home.

That left btrfs restore. I used it on log without really understanding what I was doing, and lost an entire directory worth of logs. =:^( Fortunately I was able to learn a bit from that, however, and the home restore went rather better, with no permanent loss AFAIK. FWIW, I also tried btrfs-image, but it couldn't get anywhere at all -- the result was a zero-byte image.

The *OTHER* thing I learned, which in hindsight I should have known but didn't think about until I was beyond fixing it: if you want to experiment on a filesystem and thus dd a direct image of the device to a file, for later dd-ing back if necessary, then for btrfs raid1, dd *ALL* the devices to separate images. Don't image just one of the two, figuring the other is raid1-identical anyway, because it's NOT.

I had dd-ed one device of the raid1 log btrfs and verified matching md5sums on the device and image before trying btrfs-zero-log, figuring I could simply dd the image back if that didn't help. But since it was raid1, and the images were of course somewhat large, being whole-partition images, I thought I only needed one of the two, and made the mistake of neither dd-ing the second device nor md5sum-verifying my (it turned out invalid) assumption that the second device matched the first. Of course it doesn't: the data and metadata may or may not be identical, but there's device-specific info, in the supers at least. So after I found out btrfs-zero-log wasn't going to help, I had only one image of the two to dd back in order to try something different.
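The lesson above can be sketched as a small script. This is my illustration, not anything from btrfs-progs, and the device and image paths are placeholders:

```shell
# Image a device to a file and verify the copy before experimenting on
# the original.  For btrfs raid1, run this for EVERY member device --
# the members are NOT byte-identical, so one image is not enough.
image_and_verify() {
    dev=$1 img=$2
    dd if="$dev" of="$img" bs=1M conv=fsync 2>/dev/null || return 1
    # catch a bad copy immediately, before trusting the image
    [ "$(md5sum < "$dev")" = "$(md5sum < "$img")" ]
}

# Example (placeholder device names -- substitute your own):
# for dev in /dev/sda6 /dev/sdb6; do
#     image_and_verify "$dev" "/backup/$(basename "$dev").img" || exit 1
# done
```

Naturally the filesystem should be unmounted while imaging, or the md5sums won't match anything stable anyway.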
It's possible that had I actually dd-ed both images, and could thus have dd-ed both back after btrfs-zero-log didn't help, I could have recovered more of the log files on that filesystem than I did, because I'd have had the raid1 second copy to work from as well. But anyway, logs have only a certain value to me, and if I'm to lose files, log files are what I'd choose to lose. And I learned to dd *BOTH* images next time, and indeed did just that for home, so all in all I'm prepared to say it was a worthwhile trade: a few log files destroyed for the knowledge and experience gained. =:^)

Moving on... After I btrfs-restored what I could from the damaged log btrfs, I mkfs.btrfs-ed it and copied the recovered log files back to the new filesystem. A reboot later I had confirmed that systemd could once again mount the new filesystem at boot, and I tackled home. Except by this time I was sleepy and had work the next day, so I shut back down and went to bed, saving the home challenge for a day later.

Upon reboot after work the next day, I found that the network was working again, and I could access the wiki to reread about btrfs-find-root and btrfs restore.

It took me a bit of experimentation. (I'm going to try to update the wiki to reflect what I learned, as the page covering restore and find-root is still a bit vague, mentioning information without making it clear how to get it; maybe this'll be what it takes to actually get me a wiki account so I /can/ do such updates.) Eventually I figured out that generation and transid effectively refer to the same thing (which the wiki does suggest), and that the tree roots which the wiki says you should check for -- as many as possible -- are enumerated in the restore -l (list tree roots) report. That last bit the wiki currently doesn't mention at all -- I had to find -l by myself.
Meanwhile, at this point the pieces began to fall together, and I figured out that the numbers in the found/wanted of the parent transid verify failed kernel traces were these same generation numbers, AND how btrfs-find-root and btrfs restore, as well as the kernel traces, all fit together on this generation/transid thing.

Further, generation/transid increases serially, as it is in effect tracking the root-tree-root commit count -- the number of times the filesystem has actually been fully, atomically updated and had a new root-tree-root committed. The transid verify faileds I was seeing in the logs referred to these same transid/generation numbers (simple enough to infer when found/wanted were only a couple of commits apart, as I was seeing, though that is NOT the case in the trace above), and I could actually tell restore to look for the different roots it could still find, based on the associated bytenr reported by btrfs-find-root. Suddenly a lot of those logs I've seen posted with found/wanted generation/transid numbers actually make some sort of sense!

After figuring all that out, it turned out that both the generation/transid recorded in the supers as current, and the one a single commit back, were both nearly entirely whole. There were only a handful of found/wanted errors on the restore, tho I had to run it several times to fill in additional files, as it kept deciding it was looping too much in the big dirs and not making progress. And as far as I can tell, the only files missing from the restore are the last few rss/atom feed updates my feed reader pulled.
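Put together, the procedure I ended up with looks roughly like this. It's a sketch from memory with placeholder names, not a tested tool -- check btrfs restore --help and the wiki before trusting any of it:

```shell
# Hypothetical wrapper around the recovery steps described above.
btrfs_restore_sketch() {
    if [ $# -lt 2 ]; then
        echo "usage: btrfs_restore_sketch <device> <dest-dir> [bytenr]" >&2
        return 1
    fi
    dev=$1 dest=$2 bytenr=${3:-}

    # 1. enumerate the root-tree roots (with their generations and
    #    bytenrs) that are still findable on the device
    btrfs-find-root "$dev"

    # 2. list the tree roots restore itself can see (the -l report)
    btrfs restore -l "$dev"

    # 3. restore files; with a bytenr from step 1, start from that
    #    older root instead of the current (possibly damaged) one
    if [ -n "$bytenr" ]; then
        btrfs restore -t "$bytenr" "$dev" "$dest"
    else
        btrfs restore "$dev" "$dest"
    fi
}
```

In practice, as noted, I had to rerun step 3 several times before it stopped giving up on the big dirs.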
The other possibilities would be news (nntp) updates, but my client would show the messages as unread again if it lost them, and that didn't happen; and mail, but while my servers are POP3-only, my client is configured to download but not delete for a week, just in case something like this /does/ happen and I lose a few messages locally, so again I'd have seen messages shown as new, and I didn't. So AFAIK the rss/atom feed was the only thing affected, and like the logs, that's of only limited value to me and no big loss.

After the restore, I again did a mkfs.btrfs to create a new filesystem, mounted it, and copied everything back from the restore.

**BUT** One OTHER thing I learned about btrfs restore! The --help output (and manpage) suggest that -x is used to restore extended attributes. What it does *NOT* say is that evidently "extended attributes" in this case includes file ownership and standard *ix permissions. Either that, or restore never restores those in any case; I'm not sure which. Anyway, while restore seemed to give me back nearly all my files, they were all root/root ownership, with root umask-modified perms (644, 744 for dirs).

THAT metadata was a HEADACHE to restore -- manually! Fortunately I was able to hack up a find -exec script to compare ownership and perms against the backup (which, as I mentioned, I had, tho it was a bit less current than I would have liked), doing a chown/chmod --reference to the file in the backup, where the file existed in the backup. That covered most files, but a few were still left root/root/644. A bit of admin time with mc to find and figure out appropriate ownership/perms for each (recursive) case, and those were corrected as well.

So, umm... while I hope there isn't a next time, at least I now have some idea of and experience with how restore works, and I know a couple of things NOT to do next time, as well as to try -x on btrfs restore and hope that restores ownership/perms too.
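For reference, the --reference fix-up I hacked together was along these lines. The function name and layout are my reconstruction, and it assumes GNU chown/chmod (which provide --reference):

```shell
# For every path in the restored tree that also exists in the backup,
# copy the backup's owner and mode back onto the restored copy.
fix_perms_from_backup() {
    restored=$1 backup=$2
    ( cd "$restored" && find . -print ) | while IFS= read -r path; do
        ref=$backup/$path
        tgt=$restored/$path
        if [ -e "$ref" ]; then
            chown --reference="$ref" "$tgt"
            chmod --reference="$ref" "$tgt"
        fi
    done
    # Anything NOT in the backup is still root/root 644 afterward and
    # needs fixing by hand (mc, in my case).
}
```

The chown half of course needs root if the backup files aren't owned by the user running it, which on a restored home they mostly won't be.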
If not, then I guess we need an improved restore that can, because having everything restored as root/root/644 SUCKS, tho obviously not as much as not having it restored AT ALL would!

And hopefully some users find this experience helpful. I know that if someone had posted it before, I'd have definitely read it with interest, retaining at least some of it, and would likely have saved it for later reference, just in case. So it would have helped me, and here's hoping it can help someone else as well.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman