Henk Slager wrote on 2016/04/01 01:27 +0200:
On Thu, Mar 31, 2016 at 10:44 PM, Kai Krakow <hurikha...@gmail.com> wrote:
Hello!
I already reported this in another thread, but that report was confusing
because it intermixed multiple volumes. So let's start a new thread:
Since one of the last kernel upgrades, one VDI file (containing an NTFS
image with Windows 7) gets damaged when I run the machine in VirtualBox.
I found out about this after hitting a "duplicate object" error, after
which btrfs went RO. I fixed it by deleting the VDI and restoring it from
backup, but now I get csum errors as soon as some VM IO goes into the
VDI file.
The FS is still usable. One effect is that after reading all files with
rsync (to copy them to my backup), each call of "du" or "df" hangs, and
similar calls to "btrfs {sub|fi} ..." show the same effect. I guess one
consequence is that the FS does not unmount properly during shutdown.
Kernel is 4.5.0 by now (the FS is much much older, dates back to 3.x
series, and never had problems), including Gentoo patch-set r1.
One possibility could be that the vbox kernel modules somehow corrupt
btrfs kernel area since kernel 4.5.
In order to make this reproducible (or an attempt to reproduce) for
others, you could unload VirtualBox stuff and restore the VDI file
from backup (or whatever big file) and then make pseudo-random, but
reproducible writes to the file.
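A minimal sketch of such pseudo-random but reproducible writes, assuming Python 3 is available. The function names, the 4 KiB block size and the default write count are illustrative choices, not anything taken from this thread. A fixed seed makes every run overwrite the same offsets with the same bytes, so two runs on identical copies of a file should end with identical checksums:

```python
import hashlib
import os
import random


def replay_writes(path, seed, count=1000, block_size=4096):
    """Overwrite `count` block-aligned regions of an existing file.

    Offsets and payloads both come from random.Random(seed), so the same
    seed always produces the same sequence of writes.
    """
    rng = random.Random(seed)
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        blocks = f.tell() // block_size
        if blocks == 0:
            raise ValueError("file is smaller than one block")
        for _ in range(count):
            offset = rng.randrange(blocks) * block_size
            payload = rng.getrandbits(8 * block_size).to_bytes(block_size, "little")
            f.seek(offset)
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # push the data down to the filesystem


def sha256_of(path, bufsize=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running replay_writes() with the same seed on a fresh copy of the restored file and comparing sha256_of() against a run on a known-good filesystem would show whether the written data survives, while dmesg shows whether csum errors appear.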
It is not clear to me what 'Gentoo patch-set r1' is and does. So just
boot a vanilla v4.5 kernel from kernel.org and see if you get csum
errors in dmesg.
Also, where does 'duplicate object' come from? dmesg? If so, please
post the surrounding lines, straight from dmesg.
The device layout is:
$ lsblk -o NAME,MODEL,FSTYPE,LABEL,MOUNTPOINT
NAME          MODEL             FSTYPE  LABEL   MOUNTPOINT
sda           Crucial_CT128MX1
├─sda1                          vfat    ESP     /boot
├─sda2
└─sda3                          bcache
  ├─bcache0                     btrfs   system
  ├─bcache1                     btrfs   system
  └─bcache2                     btrfs   system  /usr/src
sdb           SAMSUNG HD103SJ
├─sdb1                          swap    swap0   [SWAP]
└─sdb2                          bcache
  └─bcache2                     btrfs   system  /usr/src
sdc           SAMSUNG HD103SJ
├─sdc1                          swap    swap1   [SWAP]
└─sdc2                          bcache
  └─bcache1                     btrfs   system
sdd           SAMSUNG HD103UJ
├─sdd1                          swap    swap2   [SWAP]
└─sdd2                          bcache
  └─bcache0                     btrfs   system
Mount options are:
$ mount|fgrep btrfs
/dev/bcache2 on / type btrfs (rw,noatime,compress=lzo,nossd,discard,space_cache,autodefrag,subvolid=256,subvol=/gentoo/rootfs)
The FS uses RAID1 for metadata (mraid=1) and RAID0 for data (draid=0).
Output of btrfsck is:
(also available here:
https://gist.github.com/kakra/bfcce4af242f6548f4d6b45c8afb46ae)
$ btrfsck /dev/disk/by-label/system
checking extents
ref mismatch on [10443660537856 524288] extent item 1, found 2
This 10443660537856 number is bigger than the 1832931324360 number
found for total bytes. AFAIK, this is already wrong.
Nope. That's a btrfs logical address, which can be beyond the real disk
bytenr.
The easiest way to reproduce such a case is to write something to a 256M
btrfs and balance the fs several times.
Then all chunks can be at a bytenr beyond 256M.
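To illustrate why a logical address can exceed the device size, here is a deliberately simplified toy model in Python. It is not btrfs code: the class, the 64M chunk size and the allocation policy are assumptions made for illustration. The only behaviour it borrows from the explanation above is that relocated chunks get new logical addresses, while their physical location stays inside the device:

```python
from dataclasses import dataclass

MiB = 1024 * 1024


@dataclass
class Chunk:
    logical: int   # start address in the logical address space
    physical: int  # start offset on the (single) toy device


class ToyVolume:
    """Toy chunk allocator: logical addresses only ever grow."""

    def __init__(self, device_size, chunk_size):
        self.device_size = device_size
        self.chunk_size = chunk_size
        self.next_logical = 0
        self.chunks = []

    def _free_physical(self):
        used = {c.physical for c in self.chunks}
        for offset in range(0, self.device_size, self.chunk_size):
            if offset not in used:
                return offset
        raise RuntimeError("no free physical space")

    def allocate(self):
        chunk = Chunk(self.next_logical, self._free_physical())
        self.next_logical += self.chunk_size  # never reused in this model
        self.chunks.append(chunk)
        return chunk

    def balance(self):
        # Relocate every existing chunk: allocate a new one, drop the old.
        for old in list(self.chunks):
            self.allocate()
            self.chunks.remove(old)


vol = ToyVolume(device_size=256 * MiB, chunk_size=64 * MiB)
vol.allocate()
vol.allocate()
for _ in range(3):
    vol.balance()
for c in vol.chunks:
    print(f"logical {c.logical // MiB}M -> physical {c.physical // MiB}M")
```

In this model every chunk ends up at a logical address above 256M after three balances, although all physical offsets still fit on the 256M device. A large number in "ref mismatch on [...]" is therefore not an error by itself.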
The real problem is that the extent has a mismatched reference.
Normally it can be fixed by the --init-extent-tree option, but it usually
indicates a bigger problem, especially as it has already caused a kernel
delayed-ref problem.
Not to mention the error "extent item 11271947091968 has multiple extent
items", which makes the problem more serious.
I assume some older kernel has already screwed up the extent tree;
although delayed-ref is bug-prone, it has improved in recent years.
But the fs tree seems less damaged, so I assume the extent tree
corruption could be fixed by "--init-extent-tree".
For the only fs tree error (missing csum), if "btrfsck
--init-extent-tree --repair" works without any problem, the simplest
fix would be to just remove the file.
Or you can spend a lot of CPU time and disk IO to rebuild the whole csum
tree with the "--init-csum-tree" option.
Thanks,
Qu
[...]
checking fs roots
root 4336 inode 4284125 errors 1000, some csum missing
What is in this inode?
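One way to answer that is to walk the affected subvolume and look for the inode number, much like "find -inum" would. The sketch below assumes that "root 4336" is the ID of a subvolume that is mounted or reachable somewhere, and that each btrfs subvolume reports its own st_dev, so the walk can stay inside a single subvolume. The function name is made up for this example:

```python
import os


def find_paths_by_inode(root, inode):
    """Return all paths below `root` with the given inode number.

    Directories on a different st_dev (other filesystems or subvolumes)
    are skipped, because inode numbers are only unique per subvolume.
    """
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        same_dev = []
        for name in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, name)).st_dev == root_dev:
                    same_dev.append(name)
            except OSError:
                pass  # vanished or unreadable entry, skip it
        dirnames[:] = same_dev  # prune the walk in place
        for name in filenames + dirnames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue
            if st.st_ino == inode and st.st_dev == root_dev:
                matches.append(full)
    return matches
```

Calling find_paths_by_inode() on the top directory of that subvolume with inode 4284125 should list the file in question, presumably the VDI, if it still exists.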
Checking filesystem on /dev/disk/by-label/system
UUID: d2bb232a-2e8f-4951-8bcc-97e237f1b536
found 1832931324360 bytes used err is 1
total csum bytes: 1730105656
total tree bytes: 6494474240
total fs tree bytes: 3789783040
total extent tree bytes: 608219136
btree space waste bytes: 1221460063
file data blocks allocated: 2406059724800
referenced 2040857763840
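For longer reports like the one in the gist, a small script can pull out the two kinds of error lines quoted in this thread. This is a sketch built only on the line formats visible above. Treating the "errors" value as a hexadecimal bitmask is an assumption, and the field names are invented for this example:

```python
import re

REF_MISMATCH = re.compile(
    r"ref mismatch on \[(\d+) (\d+)\] extent item (\d+), found (\d+)"
)
INODE_ERROR = re.compile(
    r"root (\d+) inode (\d+) errors ([0-9a-fA-F]+)(?:, (.*))?$"
)


def summarize_check_output(text):
    """Split btrfsck output into extent ref mismatches and inode errors."""
    extents, inodes = [], []
    for line in text.splitlines():
        line = line.strip()
        m = REF_MISMATCH.search(line)
        if m:
            bytenr, length, item_refs, found_refs = map(int, m.groups())
            extents.append({
                "bytenr": bytenr,          # logical address of the extent
                "length": length,
                "item_refs": item_refs,    # count stored in the extent item
                "found_refs": found_refs,  # count the checker found
            })
            continue
        m = INODE_ERROR.search(line)
        if m:
            inodes.append({
                "root": int(m.group(1)),
                "inode": int(m.group(2)),
                "errors": int(m.group(3), 16),  # assumed to be hex
                "detail": m.group(4) or "",
            })
    return extents, inodes
```

Feeding it the saved output of "btrfsck /dev/disk/by-label/system" gives two lists that are easy to count or sort, for example to see how many extents are affected.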
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html