On Sun, 27 Mar 2016 13:04:25 -0600, Chris Murphy <li...@colorremedies.com> wrote:

> As for the csum errors with this one single VDI file, you're going to
> have to come up with a way to reproduce it consistently. You'll need
> to have a good copy on a filesystem that comes up clean with btrfs
> check and scrub. And then reproduce the corruption somehow. One hint
> based on the other two users with similar setups or workload is they
> aren't using the discard mount option and you are. I'd say unless you
> have a newer SSD that supports queued trim, it probably shouldn't be
> used, it's known to cause the kinds of hangs you report with drives
> that only support non-queued trim. Those drives are better off getting
> fstrim e.g. once a week on a timer.

Let's get back to the csum errors later. They only occur on the main
drive, which has other corruptions too, as I found out, so the csum
errors may well be a side effect.

I'm currently trying to fix the remaining problems of the backup drive,
which uses no discard at all (it's not an SSD).

To help clear up the confusion, here's the output of:

$ lsblk -o NAME,MODEL,FSTYPE,LABEL,MOUNTPOINT
NAME          MODEL            FSTYPE LABEL      MOUNTPOINT
sda           Crucial_CT128MX1
├─sda1                         vfat   ESP        /boot
├─sda2
└─sda3                         bcache
  ├─bcache0                    btrfs  system
  ├─bcache1                    btrfs  system
  └─bcache2                    btrfs  system     /
sdb           SAMSUNG HD103SJ
├─sdb1                         swap   swap0      [SWAP]
└─sdb2                         bcache
  └─bcache2                    btrfs  system     /
sdc           SAMSUNG HD103SJ
├─sdc1                         swap   swap1      [SWAP]
└─sdc2                         bcache
  └─bcache0                    btrfs  system
sdd           SAMSUNG HD103UJ
├─sdd1                         swap   swap2      [SWAP]
└─sdd2                         bcache
  └─bcache1                    btrfs  system
sde           003-9VT166
└─sde1                         btrfs  usb-backup

(The mountpoint column was pretty bogus due to multiple subvolumes, so I
corrected it.)

BTW: The discard option has run smoothly for the last 12 months or so.
(Apparently the SSD is soon to die: the smartctl lifetime counter is
almost used up. bcache + btrfs can be pretty stressful, I think.) I'm
not even sure whether the btrfs mount option "discard" has any effect at
all when the filesystem is mounted through bcache.
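
One rough way to look into that last point is to ask the block layer what
each device advertises. This is only a sketch, assuming the usual sysfs
attribute queue/discard_max_bytes is present. The helper name
discard_supported is made up for illustration, and the device names are the
ones from the lsblk output above. A non-zero value only says the device
accepts discard requests. It does not prove that bcache forwards them to the
SSD.

```shell
# Report whether a block device advertises discard (TRIM) support.
# Usage: discard_supported <device-name> [sysfs-root]
# Returns 0 if supported, 1 if not, 2 if the attribute cannot be read.
# (Hypothetical helper; the sysfs root is a parameter so it can be tested.)
discard_supported() {
    dev="$1"
    root="${2:-/sys/block}"
    attr="$root/$dev/queue/discard_max_bytes"
    [ -r "$attr" ] || return 2
    # A value of 0 means the device does not accept discard requests.
    [ "$(cat "$attr")" -gt 0 ]
}

# Device names taken from the lsblk output above.
for dev in sda bcache0 bcache1 bcache2; do
    if discard_supported "$dev"; then
        echo "$dev: discard advertised"
    else
        echo "$dev: no discard advertised (or device not found)"
    fi
done
```

If the bcache devices report no discard while sda does, the btrfs "discard"
mount option most likely never reaches the SSD.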

BTW2: Even fstrim will issue queued trim if the drive supports it, which
gets you into the same trouble. Queued trim needs to be disabled in the
kernel, according to [1].

Side note: I have the MX100 model with the firmware update applied, which
is supposed to fix the problem. I never saw the libata fault messages in
dmesg.

[1]: http://forums.crucial.com/t5/Crucial-SSDs/M500-M5x0-QUEUED-TRIM-data-corruption-alert-mostly-for-Linux/td-p/151028

-- 
Regards,
Kai

Replies to list-only preferred.