assertion failures

2010-02-24 Thread Bill Pemberton
I've had two different filesystems fail in the same way recently. In both cases the server crashed (probably due to something unrelated to btrfs). On reboot, fsck fails as follows:

  # btrfsck /dev/raidvg/vol4
  parent transid verify failed on 20971520 wanted 206856 found 214247
  parent transid ver

Re: assertion failures

2010-02-25 Thread Bill Pemberton
> I don't suppose you have the dmesg errors from the crash? This error
> shows the header in the block is incorrect, so either something was
> written to the wrong place or not written at all.
>
> Have you run memtest86 on this system?
>
> How did it crash...was a power off used to reset the mach

Re: assertion failures

2010-02-26 Thread Bill Pemberton
> > No dmesg. This has happened on two different machines that both have
> > other active btrfs filesystems, so I suspect it's not a memory issue.
> > In both cases it was the same data that was being copied when the
> > crash occurred.
>
> Ok, is there anything special about this data?

Re: assertion failures

2010-02-26 Thread Bill Pemberton
> Does the array have any kind of writeback cache?

Yes, the array has a writeback cache.

> Are all of the filesystems spread across all of the drives? Or do some
> filesystems use some drives only?

In all cases the array is presenting 1 physical volume to the host system (which is RA

Re: assertion failures

2010-02-26 Thread Bill Pemberton
> > Yes, the array has a writeback cache.
>
> Ok, this would be my top suspect then, especially if it had to be
> powered off to reset it. The errors you sent look like some IO just
> didn't happen, which the btrfs code goes to great lengths to
> detect and complain about.

While the array

Re: assertion failures

2010-02-26 Thread Bill Pemberton
> I wonder if the barrier messages are making it to this write back
> cache. Do you see any messages about barriers in your kernel logs?

None relating to the array. The only barrier messages I see are for filesystems on the server's internal disks.

-- Bill

Re: assertion failures

2010-02-26 Thread Bill Pemberton
> Bill, I've got a great little application that you can use to test the
> safety of the array against power failures. You'll have to pull the
> plug on the poor machine about 10 times to be sure, just let me know if
> you're interested.
>
> If the raid array works, the power failure test won'

Re: assertion failures

2010-02-27 Thread Bill Pemberton
> If the write cache isn't working, you'll get errors about 50% of the
> time. If you run it 10 times without any errors you're probably safe.

Ok, I managed 12 times with no errors, so there's at least another data point.

-- Bill

[PATCH] btrfs: fix shadows sparse warning

2010-04-30 Thread Bill Pemberton
Fix sparse warnings:

  fs/btrfs/free-space-cache.c:1078:40: warning: symbol 'node' shadows an earlier one
  fs/btrfs/free-space-cache.c:1230:32: warning: symbol 'node' shadows an earlier one

Signed-off-by: Bill Pemberton
---
 fs/btrfs/free-space-cache.c |    4 +---
 1 files c
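
For readers unfamiliar with the warning class, here is a minimal sketch of what "symbol 'node' shadows an earlier one" means in C. It is not the code from fs/btrfs/free-space-cache.c: `struct item`, `last_key_shadowed` and `last_key_fixed` are hypothetical names invented for illustration, and the fix shown (dropping the inner declaration and reusing the outer variable) is only one common way to silence such a warning, not necessarily what the patch does.

```c
#include <stddef.h>

/* Hypothetical list type, used only for this illustration. */
struct item {
	int key;
	struct item *next;
};

/* Before: the inner 'node' hides the outer one, which is the kind of
 * declaration a shadowing warning points at. */
static int last_key_shadowed(struct item *head)
{
	struct item *node = head;
	int last = -1;

	if (node) {
		struct item *node;	/* shadows the outer 'node' */

		for (node = head; node; node = node->next)
			last = node->key;
	}
	return last;
}

/* After: a single 'node' declaration; behaviour is unchanged. */
static int last_key_fixed(struct item *head)
{
	struct item *node;
	int last = -1;

	for (node = head; node; node = node->next)
		last = node->key;
	return last;
}
```

Both functions return the key of the last list element, or -1 for an empty list. Only the second avoids the redundant inner declaration.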