btrfs csum failed

2011-05-03 Thread Martin Schitter
since my last debian kernel update to 2.6.38-2-amd64 i have been getting 
csum failures. it's a volume full of huge kvm images on md RAID1 and 
LVM, so i used the mount options 'noatime,nodatasum' to maximize 
performance.


it happened for the first time two weeks ago, and now a kvm image 
is unreadable again. i have to use an older snapshot to replace the 
virtual machine.


these are the entries in dmesg/kernel log on every access:
...
 [2412668.409442] btrfs csum failed ino 258 off 2331529216 csum 3632892464 private 2115348581

...
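
The fields of such a message can be pulled out with a short script when many of them need to be compared. The following Python is only a sketch: the reading of `off` as a byte offset into the file, of `csum` as the checksum computed from the data just read and of `private` as the stored checksum, the 4096-byte block size, and the name `parse_csum_failures` are assumptions, not something stated in this thread.

```python
import re

# Matches the message format quoted above. \s+ is used between tokens so
# that a line wrapped by a mail client or pager still matches.
CSUM_RE = re.compile(
    r"btrfs csum failed\s+ino\s+(?P<ino>\d+)\s+off\s+(?P<off>\d+)"
    r"\s+csum\s+(?P<csum>\d+)\s+private\s+(?P<private>\d+)"
)

BLOCK_SIZE = 4096  # assumption: 4 KiB blocks


def parse_csum_failures(log_text):
    """Return one dict per 'btrfs csum failed' message found in log_text."""
    failures = []
    for match in CSUM_RE.finditer(log_text):
        entry = {name: int(value) for name, value in match.groupdict().items()}
        # Assumption: 'off' is a byte offset, so this is the block index.
        entry["block"] = entry["off"] // BLOCK_SIZE
        failures.append(entry)
    return failures


sample = ("[2412668.409442] btrfs csum failed ino 258 off 2331529216 "
          "csum 3632892464 private 2115348581")
for failure in parse_csum_failures(sample):
    print(failure)
```

For the line quoted above this gives block 569221 of inode 258, since 2331529216 is an exact multiple of 4096.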

it's a production machine, so i can't run too many experiments on it.
do you see an obvious way to solve this problem?

thanks!
martin
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs csum failed

2011-05-03 Thread Martin Schitter

On 2011-05-04 02:28, Josef Bacik wrote:

> Wait why are you running with btrfs in production?


do you know a better alternative for continuous snapshots? :)

it has worked surprisingly well for more than a year.
well, the performance could be better for vm image hosting, but it works.

we used cache='writeback' for a long time, but now all virtual instances 
are set to cache='none'.



> What OS is in this vm image?


2.6.30-bpo.1-amd64 with virtio-driver

could you give me some advice on how to debug/report this specific problem 
more precisely?


thanks
martin


Re: btrfs csum failed

2011-05-04 Thread Martin Schitter

On 2011-05-04 04:18, Fajar A. Nugraha wrote:

>> could you give me some advice on how to debug/report this specific
>> problem more precisely?
>
> If it's not reproducible then I'd suspect it'd be hard to do.


the last working snapshot is from 2011-05-02-17:13. i can reproduce this 
file system corruption on one specific file in every later hourly snapshot.


whenever i run a simple:

  cat snapshot-2011-05-02-18:13/sata-images/image_xy.raw > /dev/null

i get an "Input/output error" and the quoted debug messages in dmesg and
the kernel log.

could this serve as a useful starting point for further investigation?
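
To see how much of the file is affected, rather than stopping at the first error as `cat` does, the file can be read block by block and the failing offsets collected. A minimal sketch, assuming a 4096-byte read size and that unreadable blocks surface as EIO; `find_unreadable_ranges` is a hypothetical helper name, and caching or readahead may influence which ranges get reported.

```python
import errno
import os

CHUNK = 4096  # assumption: read granularity of one 4 KiB block


def find_unreadable_ranges(path, chunk=CHUNK):
    """Read path chunk by chunk; return (offset, length) pairs that raised EIO."""
    bad = []
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            length = min(chunk, size - offset)
            try:
                # pread reads at an explicit offset, so one failed chunk
                # does not stop the scan of the following chunks.
                os.pread(fd, length, offset)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise  # anything other than an I/O error is unexpected
                bad.append((offset, length))
            offset += length
    finally:
        os.close(fd)
    return bad
```

Called on the snapshot copy of the image, for example `find_unreadable_ranges("snapshot-2011-05-02-18:13/sata-images/image_xy.raw")`, it would return the list of ranges that fail to read, which can then be compared with the `off` values in the kernel log.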


> Usually checksum errors are an early sign of hardware failure (most
> commonly disk or power supply).


that looks very implausible to me. there is a RAID1 layer beneath btrfs 
in our setup and i don't see any errors there.


and the 'nodatasum' option should also ignore csum issues, shouldn't it?

martin


Re: btrfs csum failed

2011-05-04 Thread Martin Schitter

On 2011-05-04 13:51, cwillu wrote:

>> that looks very implausible to me. there is a RAID1 layer beneath btrfs in
>> our setup and i don't see any errors there.
>
> That doesn't rule out the possibility of corruption when it was
> written in the first place, or some similar problem that the raid1
> faithfully reproduced on both mirrors.  That's not to say that it's
> impossible that the problem is in btrfs, just that it's not the only
> plausible possibility.


well -- i back up all images every night. this process 
should work like a simple "scrub", because all data (and its checksums) 
gets read. that's how i stumbled over this problem!



>> and the 'nodatasum' option should also ignore csum issues, shouldn't it?
>
> No, it only affects writing new checksums; any existing checksums are
> still checked.


would it make sense to remount the volume with checksumming enabled 
and run additional tests to find similar suspect blocks, so that files 
don't suddenly break like this?


martin


Re: btrfs csum failed

2011-05-04 Thread Martin Schitter

On 2011-05-04 14:31, Kaspar Schleiser wrote:

> Is the btrfs RAID1 itself inside a virtual machine? I've had data
> corruption with virtio block devices > 1TB on early squeeze kernels.


no -- it's on the (native) host side. and we use a very recent kernel 
from debian 'testing' (2.6.38-2).


martin


Re: btrfs csum failed

2011-05-04 Thread Martin Schitter

On 2011-05-04 14:39, Chris Mason wrote:

> What OS is inside these virtual machines?  The btrfs unstable tree has
> some fixes for windows based OSes.


we have only linux guests of different flavors, no windows guests.

both corruptions during these last weeks affected different virtual 
block device images of the same guest instance.



> Is your kvm config using O_DIRECT?


yes -- the kvm/qemu option cache="none" implies O_DIRECT.


> I've also got patches here that force us to honor nodatasum even when
> the file has csums, that can help if the contents of the file are
> actually good.


that sounds interesting! in our case it may be easier to use some 
recent backup data, but it could be very helpful in similar situations.


i would really like to help isolate the cause of this failure and 
find a practical strategy to prevent further breakdowns.


thanks
martin


Re: btrfs csum failed

2011-05-04 Thread Martin Schitter

On 2011-05-04 15:23, Edward Ned Harvey wrote:

>> From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-ow...@vger.kernel.org]
>> On Behalf Of Martin Schitter
>>
>> well -- i am doing a backup of all images every night. this
>> process should work like a simple "scrub" because all data (and its
>> checksums) will be read.


> Sorry, not correct.  When you read all the data using something in
> user-land, the OS only needs to read one side of the data.  It can
> accelerate by staggering the read requests across multiple disks.  So
> some sectors remain unread on some disks.
>
> When you scrub, it reads all the data from all the redundant copies
> (mirrored or raid) on all the individual disks in the raid set.
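
The difference between the two kinds of read can be illustrated with a toy model. This is only a sketch: the two mirrors are plain Python lists, the strict alternation between them is an assumption made for illustration (real read balancing is more involved), and `userland_read` and `scrub` are hypothetical names.

```python
def userland_read(mirrors):
    """Read each block from one mirror only, alternating between mirrors."""
    block_count = len(mirrors[0])
    return [mirrors[i % len(mirrors)][i] for i in range(block_count)]


def scrub(mirrors):
    """Read every block from every mirror; return indices where copies differ."""
    return [i for i, copies in enumerate(zip(*mirrors)) if len(set(copies)) > 1]


good = [b"a", b"b", b"c", b"d"]
disk0 = list(good)
disk1 = list(good)
disk1[2] = b"X"  # silent corruption on the second mirror only

# The ordinary read never touches block 2 on disk1, so it sees nothing wrong.
print(userland_read([disk0, disk1]) == good)
# The scrub-style pass compares both copies of every block.
print(scrub([disk0, disk1]))
```

Here the ordinary read returns correct data because block 2 happens to be served from the intact mirror, while the scrub-style pass reports block 2 as inconsistent.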


ok -- i see -- you're right!

i know there are some benefits in the way btrfs and zfs implement RAID / 
multiple-disk usage and checksumming, but i also want to stay on the 
safe side when it comes to real practical problems. so i decided to use 
'classical' linux software RAID-1 as the base layer. that's a very 
old-fashioned solution, but it usually simply works... and you can replace a 
broken disk regardless of the filesystem(s) in use. in general i 
try to use btrfs only for its snapshot features, in a very 
simple way.


it looks very strange to me that i don't see any SMART warnings on the 
hard disks or errors on other filesystems on the same raid array. there 
was also no reboot, power failure or similar when the corruption 
suddenly appeared. so i think a btrfs bug would be the most plausible 
explanation.


martin
