I've had two different filesystems fail in the same way recently. In
both cases the server crashed (probably due to something unrelated to
btrfs). On reboot the fsck fails as follows:
# btrfsck /dev/raidvg/vol4
parent transid verify failed on 20971520 wanted 206856 found 214247
parent transid ver
>
> I don't suppose you have the dmesg errors from the crash? This error
> shows the header in the block is incorrect, so either something was
> written to the wrong place or not written at all.
>
> Have you run memtest86 on this system?
>
> How did it crash...was a power off used to reset the mach
> >
> > No dmesg. This has happened on two different machines that both have
> > other active btrfs filesystems, so I suspect it's not a memory issue.
> > In both cases it was the same data that was being copied when the
> > crash occurred.
>
> Ok, is there anything special about this data?
>
>
> Does the array have any kind of writeback cache?
>
Yes, the array has a writeback cache.
>
> Are all of the filesystems spread across all of the drives? Or do some
> filesystems use some drives only?
>
In all cases the array is presenting 1 physical volume to the host
system (which is RA
> >
> > Yes, the array has a writeback cache.
>
> Ok, this would be my top suspect then, especially if it had to be
> powered off to reset it. The errors you sent look like some IO just
> didn't happen, which the btrfs code goes to great lengths to
> detect and complain about.
>
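The check behind the "parent transid verify failed" message compares the
generation that a parent pointer expects with the generation stored in the
header of the block actually read from disk. A minimal sketch in C, assuming
simplified stand-in names (struct block_header and check_parent_transid are
hypothetical and are not the kernel's definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a tree block header. */
struct block_header {
	uint64_t bytenr;     /* logical address of the block */
	uint64_t generation; /* transaction id that last wrote the block */
};

/*
 * Return 0 when the block carries the generation the parent pointer
 * recorded ("wanted"), -1 when it carries something else ("found").
 * A mismatch means the block on disk is not the version the parent
 * points at, e.g. because a write never reached the media.
 */
int check_parent_transid(const struct block_header *hdr, uint64_t wanted)
{
	assert(hdr != NULL);
	return hdr->generation == wanted ? 0 : -1;
}
```

With the numbers from the fsck output at the top of the thread (block
20971520, wanted 206856, found 214247), this sketch returns -1.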
While the array
>
> I wonder if the barrier messages are making it to this write back
> cache. Do you see any messages about barriers in your kernel logs?
>
None relating to the array. The only barrier messages I see are for
filesystems on the server's internal disks.
--
Bill
--
To unsubscribe from this li
>
> Bill, I've got a great little application that you can use to test the
> safety of the array against power failures. You'll have to pull the
> plug on the poor machine about 10 times to be sure, just let me know if
> you're interested.
>
> If the raid array works, the power failure test won'
>
> If the write cache isn't working, you'll get errors about 50% of the
> time. If you run it 10 times without any errors you're probably safe.
>
Ok, I managed 12 times with no errors, so there's at least another
data point.
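
As a rough check on what a clean streak means: if a broken write cache
shows errors on about 50% of runs, as quoted above, the chance of getting
through several runs by luck shrinks quickly. A small sketch in C, assuming
the runs are independent (clean_run_probability is a hypothetical helper,
not part of the test application):

```c
#include <assert.h>

/*
 * Probability of `runs` consecutive clean power-pull tests when each run
 * independently shows an error with probability `p_error`.
 * Assumption: runs are independent of each other.
 */
double clean_run_probability(double p_error, int runs)
{
	double p = 1.0;
	int i;

	assert(runs >= 0);
	assert(p_error >= 0.0 && p_error <= 1.0);

	for (i = 0; i < runs; i++)
		p *= 1.0 - p_error;
	return p;
}
```

Under that assumption, 10 clean runs with a bad cache has probability
1/1024 (about 0.1%), and 12 clean runs has probability 1/4096 (about
0.02%).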
--
Bill
Fix sparse warnings:
fs/btrfs/free-space-cache.c:1078:40: warning: symbol 'node' shadows an earlier one
fs/btrfs/free-space-cache.c:1230:32: warning: symbol 'node' shadows an earlier one
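
For illustration, this is the kind of pattern that triggers the warning: a
variable declared in an inner scope with the same name as one already
declared in the enclosing function. The sketch below is not the code in
free-space-cache.c; struct item, sum_shadowed and sum_fixed are hypothetical
names used only to show the pattern and one way to remove it.

```c
#include <stddef.h>

/* Hypothetical singly linked list, used only for this illustration. */
struct item {
	int value;
	struct item *next;
};

/* Shadowing version: the inner 'node' hides the function-level 'node'. */
int sum_shadowed(struct item *a, struct item *b)
{
	struct item *node;
	int total = 0;

	for (node = a; node; node = node->next)
		total += node->value;

	if (b) {
		struct item *node;	/* shadows the outer 'node' */

		for (node = b; node; node = node->next)
			total += node->value;
	}
	return total;
}

/* Fixed version: drop the inner declaration and reuse the outer 'node'. */
int sum_fixed(struct item *a, struct item *b)
{
	struct item *node;
	int total = 0;

	for (node = a; node; node = node->next)
		total += node->value;

	for (node = b; node; node = node->next)
		total += node->value;

	return total;
}
```

Both functions return the same result; the second simply has no inner
declaration for sparse to flag.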
Signed-off-by: Bill Pemberton
---
 fs/btrfs/free-space-cache.c |    4 +---
1 files c