Jan Mikkelsen wrote:
(Scott:  I should have emailed you this earlier, but Christmas and various
other things got in the way.)

Ian West wrote:
On Sun, Jan 07, 2007 at 02:25:02PM -0500, Mike Tancsa wrote:
At 11:43 AM 1/7/2007, Craig Rodrigues wrote:
On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
[ Areca kernel panic, IO failures ... ]
I have seen this identical fault with the new Areca driver.  My machine
is Opteron hardware, but running a regular i386/SMP kernel/world.  With
everything at 6.2-RC2 (as of the 29th of December) except the Areca
driver, the machine is rock solid.  With the 29th of December version of
the Areca driver, the box will crash on extracting a large tar file,
removing a large directory structure, or pretty much anything that does
a lot of disk I/O to different files/locations.  There is no error log
prior to seeing the following messages:

Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433111040, length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433209344, length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433242112, length=32768)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437612544, length=4096)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437616640, length=12288)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437633024, length=6144)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437639168, length=2048)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437641216, length=6144)]error = 5

There is a string of these (error 5 is EIO), followed by a crash and
reboot.  The filesystem can be left dirty enough that background fsck
seems unable to recover it.
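
For anyone trying to reproduce this, the load is nothing exotic; a crude
loop along these lines is enough to trigger it here (the mount point and
tarball below are placeholders, not the exact ones I used):

    #!/bin/sh
    # Crude I/O stress loop against the Areca-backed filesystem.
    # /mnt/areca and /tmp/src.tar are placeholders; any large tarball
    # with many files and directories should do.
    while :; do
        tar -xf /tmp/src.tar -C /mnt/areca || break
        rm -rf /mnt/areca/* || break
    done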

The Areca card in question is running the latest firmware/boot code and
has shown no problems either before or since backing out the Areca
driver.

The volume I ran the tests on was a 250GB volume on a RAID-6 RAID set.

I have seen various problems with various Areca drivers.  All on
6.2-RC1/amd64 with an Areca RAID-6 volume.

Areca 1.20.00.02 seems to work fine.

Areca 1.20.00.12 (from the Areca website) seems to have data corruption
problems.  My tests involve doing a "diff -r" on a filesystem with 2GB of
data.  It will occasionally find differences in files.  On examination,
the last 640 bytes of the first block of the affected file contain data
from another file "nearby" in the filesystem.  Unmounting and remounting
the filesystem and rerunning the test shows either no problem, or a
difference in another file entirely.  I think this is the cause of the
g_vfs_done failures with this version of the driver; the offsets are
wrong because the data is corrupted.
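
For reference, the test is essentially the following loop; SRC and MNT
are placeholders, and the real data set was about 2GB:

    #!/bin/sh
    # Copy a reference tree onto the Areca volume once, then repeatedly
    # compare it against the original.  Remounting between passes shows
    # whether the bad data made it to disk or only to the buffer cache.
    SRC=/usr/src            # reference data (placeholder)
    MNT=/mnt/areca          # Areca-backed filesystem (placeholder)
    cp -Rp "$SRC" "$MNT/copy"
    while :; do
        diff -r "$SRC" "$MNT/copy" && echo "pass clean"
        umount "$MNT" && mount "$MNT"   # assumes an fstab entry for $MNT
    done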

Areca 1.20.00.13 (as currently in the tree) does not seem to have data
corruption problems, but I can trigger g_vfs_done failures under heavy I/O.
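
One quick way to see whether a failing region is still readable
afterwards is a raw read at one of the logged offsets (device and offset
here are taken from the g_vfs_done messages quoted above; adjust for
your own log):

    # Re-read 8KB at the first offset from the g_vfs_done errors above.
    dd if=/dev/da0s1g bs=8192 skip=$((433078272 / 8192)) count=1 \
        of=/dev/null

If that read succeeds once the machine is back up, it points at the
driver rather than the media.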

I have raised this with Areca support, and I'm waiting to hear back from
Erich Chen.

Regards,

Jan Mikkelsen


I discussed this issue at length with the release engineering team
today, and we're going to go ahead with keeping the 1.20.00.13 version
in 6.2, since it has been working very reliably for a number of other
testers, and reverting it at this late stage of the release represents
more risk.  A note about this issue will likely be put into the 6.2
errata document as well.

I plan to dig into this problem next week unless Areca fixes it first.
Please let me know if you hear anything from them.

Scott

