I'm getting a kernel crash and a complete system lockup when trying to access the journal on a two-disk btrfs filesystem with data/metadata as RAID1.
I can't get a proper log because the whole system hangs and even kdump fails; either it doesn't start or I'm doing something wrong. Also, because there are several call traces and they are all printed on screen within a few seconds, I can photograph only the last few. I did manage to get some low-quality, blurry frames from an 80 FPS recording. From them I could see:

kernel BUG at fs/btrfs/extent_io.c:2062
  extent_i...@2062.png => http://i.imgur.com/uuxOGIR.png

kernel BUG at fs/btrfs/extent_io.c:2140
  extent_i...@2140.png => http://i.imgur.com/j5xrt7w.png

kernel BUG at fs/btrfs/extent_io.c:2338
  extent_io.c@2338_0.png => http://i.imgur.com/EosplAu.png
  extent_io.c@2338_1.png => http://i.imgur.com/rsE9qNT.png

kernel BUG at fs/btrfs/volumes.c:5399
  volumes.c@5399_0.png => http://i.imgur.com/iV9zqAv.png
  volumes.c@5399_1.png => http://i.imgur.com/VCyr07R.png

And better photos:

BUG: scheduling while atomic: kworker/u16
  scheduling_while_atomic_0.jpg => http://i.imgur.com/asHjcM9.jpg
  scheduling_while_atomic_1.jpg => http://i.imgur.com/OJSFDUx.jpg
  scheduling_while_atomic_2.jpg => http://i.imgur.com/0nHQin8.jpg
  scheduling_while_atomic_3.jpg => http://i.imgur.com/ZmzOh7f.jpg

Watchdog detected hard LOCKUP on cpu
  watchdog_detected_hard_LOCKUP_0.jpg => http://i.imgur.com/6W4FlfI.jpg
  watchdog_detected_hard_LOCKUP_1.jpg => http://i.imgur.com/WxxGozJ.jpg
  watchdog_detected_hard_LOCKUP_2.jpg => http://i.imgur.com/0Mmifwf.jpg

BUG: unable to handle kernel paging request
  unable_to_handle_kernel_paging_request.jpg => http://i.imgur.com/4Sz4v96.jpg

BUG: unable to handle kernel
  unable_to_handle_kernel.jpg => http://i.imgur.com/T0x7K4a.jpg

The odd thing is that it crashes only sometimes, and reading all the files doesn't trigger it; it happens only when I try to open the journal with journalctl. Also, btrfs scrub and balance finish without any errors. Even btrfs check and check --repair completed successfully without finding anything to repair.
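For reference, the checks mentioned above correspond roughly to the commands in the sketch below. It only prints them (a dry run), since they need root and btrfs check expects an unmounted filesystem. The mount point and device path are placeholders I made up for illustration, not the actual paths on the affected system.

```shell
#!/bin/sh
# Dry-run sketch of the checks mentioned above. MNT and DEV are
# placeholders (assumptions), not the paths from the actual system.
MNT="${MNT:-/mnt/btrfs}"    # hypothetical mount point
DEV="${DEV:-/dev/sdX}"      # hypothetical member device

# Print each command instead of running it; the real operations need
# root, and btrfs check expects an unmounted filesystem.
show() {
    printf '%s\n' "$*"
}

list_checks() {
    show "btrfs scrub start -B $MNT"    # -B: stay in the foreground until done
    show "btrfs balance start $MNT"     # full balance, no filters
    show "btrfs check $DEV"             # read-only check
    show "btrfs check --repair $DEV"    # repair mode
}

list_checks
```

To use it against a real filesystem, set MNT and DEV in the environment and run the printed commands by hand as root.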
This crash also happened on v4.1.6, and I'll now recompile v4.2 since it has been released.

I've been getting this crash since I decided to test how well Linux handles the loss of one disk on btrfs RAID1 (I just pulled one disk out). It kept working, but there were some call traces, and when I plugged the disk back in, btrfs failed to write to it and after a few minutes the system froze, although a SMART test on that disk had passed before that. Then I rebooted and ran scrub, which fixed the errors on that disk.

Next I wanted to test the other disk, and for that I executed

    echo 1 > /sys/block/sdf/device/delete

which caused an immediate system hang. And now this filesystem crashes the kernel when I try to view the journal.

I think RAID1 should handle cases where one disk disappears or is corrupted, but currently it doesn't, and it crashes the whole system.

PS. This is a resend without attachments.