On 10/10/18 19:25, Larkin Lowrey wrote:
On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:
On 10/10/18 17:44, Larkin Lowrey wrote:
(..)
About once a week, or so, I'm running into the above situation where
FS seems to deadlock. All IO to the FS blocks, there is no IO
activity at all. I have to hard reboot the system to recover. There
are no error indications except for the following which occurs well
before the FS freezes up:
BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of
free space
BTRFS warning (device dm-3): failed to load free space cache for block group
78691883286528, rebuilding it now
Do I have any options other the nuking the FS and starting over?
Unmount cleanly & mount again with -o space_cache=v2.
It froze while unmounting. The attached zip is a stack dump captured
via 'echo t > /proc/sysrq-trigger'. A second attempt after a hard
reboot worked.
Trace says freespace cache writeout failed midway while the scsi device
was resetting itself and then went aaaarrrghh. Probably managed to hit
different blocks on the second attempt. So chances are your controller,
disk or something else is broken, dying, or both.
When things have settled and you have verified that r/o mounting works
and is stable, try rescuing the data (when necessary) before scrubbing,
dm-device-checking or whatever you have set up.
-h