One of my servers has a similar problem.
Ubuntu 10.04.2 LTS 64 bit
kernel 2.6.32-31-server
The RAID controller is a Symbios Logic LSI MegaSAS 9260 (rev 03)
(default drivers)
Today I found the following error in dmesg:
task xfssyncd: blocked for more than 120 seconds
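A minimal sketch for spotting this kind of hung-task warning in saved dmesg output. The function name and the optional `:<pid>` after the task name are assumptions made for illustration; only the message text quoted above comes from the report.

```python
import re

# Pattern for the hung-task warning quoted above. The optional ":<pid>" part is
# an assumption about how fuller kernel log lines may look.
HUNG_TASK = re.compile(
    r"task (?P<name>[^\s:]+)(?::(?P<pid>\d+))?:? "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def find_hung_tasks(log_text):
    """Return (task_name, seconds) pairs for every hung-task warning found."""
    return [(m.group("name"), int(m.group("secs")))
            for m in HUNG_TASK.finditer(log_text)]

if __name__ == "__main__":
    sample = "task xfssyncd: blocked for more than 120 seconds"
    print(find_hung_tasks(sample))
```

Feeding it the text of a saved dmesg log gives a quick list of which tasks were reported as blocked and for how long.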
Hi BDV,
I suggest updating the driver to the most recent one from LSI and installing it
using DKMS.
You might need to change the source a little bit, because there are statements
like
#if (KERNEL_VERSION = 2.6.32) .
but it has to be
#if (KERNEL_VERSION 2.6.32) .
You'll find it just
That is good to hear. There have been some workarounds in the LSI
driver to handle problems similar to this. It might be that you're
still having a problem but the new fw/driver combination is able to mask
it much better. I would keep an eye on your system for anything
suspicious for a while so
Hi Jason,
the previous module versions were the ones shipped with ubuntu amd64
server 10.04(.1) LTS (up to 2.6.32-25-server) and 10.10
(2.6.35-24-server).
I'll have a look at DKMS. I have never used it before.
If the server runs without related issues for the next 2 months I'll
close the bug if it's
Hi Jason,
thanks for your hints.
I updated the firmware of the LSI SAS controller and reduced the amount of data
on the filesystems. Since the update, and with the filesystems less full, the
error has not occurred again.
# cat /proc/scsi/mptsas/9
ioc1: LSISAS1068E B3, FwRev=011f0200h, Ports=1, MaxQ=483
# cat
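To compare controller details before and after a firmware update, the `ioc1:` line shown above can be split into fields. This is only a sketch: the function name is hypothetical and the field layout is assumed to match the single line quoted in this message.

```python
def parse_ioc_line(line):
    """Split a line like 'ioc1: LSISAS1068E B3, FwRev=..., Ports=1' into a dict."""
    ioc, _, rest = line.partition(":")
    fields = [part.strip() for part in rest.split(",")]
    # The first field is the chip description; the rest are key=value pairs.
    info = {"ioc": ioc.strip(), "chip": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        info[key] = value
    return info

if __name__ == "__main__":
    line = "ioc1: LSISAS1068E B3, FwRev=011f0200h, Ports=1, MaxQ=483"
    print(parse_ioc_line(line)["FwRev"])
```

The values are kept as strings on purpose, since how the firmware revision should be decoded is not stated in the thread.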
Ha, wait!
I forgot the most important fact.
I manually installed a much newer driver version, and I reinstall it after every
kernel update.
It's the driver from the zip archive from LSI. It is actually meant for Red Hat
and SUSE, but the archive contains a source tarball.
The mpt messages in your logs suggest that the firmware had an NCQ
problem that required it to abort all the outstanding commands and have
the OS retry them (see http://en.wikipedia.org/wiki/NCQ for what NCQ
is). You can disable NCQ, usually at the cost of I/O performance, to
work around the issue
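Before changing anything, it can help to see what queue depth each disk currently uses. The sketch below reads the `queue_depth` attribute that Linux exposes under `/sys/block/<dev>/device/`. Whether that attribute is present and writable for disks behind this particular controller is an assumption, and the function name is made up for illustration.

```python
from pathlib import Path

def queue_depths(sys_block="/sys/block"):
    """Map each block device name to its current queue depth, where exposed."""
    depths = {}
    for dev in sorted(Path(sys_block).iterdir()):
        depth_file = dev / "device" / "queue_depth"
        if depth_file.is_file():  # skip devices without this attribute
            depths[dev.name] = int(depth_file.read_text().strip())
    return depths

if __name__ == "__main__":
    for name, depth in queue_depths().items():
        print(name, depth)
```

A depth of 1 means only one command is outstanding per disk at a time. For disks attached through libata, writing 1 to that file as root is the usual way to switch NCQ off.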
Hello again,
here are some other logs that might be connected to the failure:
[...]
Jul 2 17:52:25 speicher48 kernel: [17690.334458] sd 10:0:20:0: [sdx] CDB:
Read(10): 28 00 00 00 00 00 00 00 08 00
Jul 2 17:52:25 speicher48 kernel: [17690.338722] sd 10:0:20:0: [sdx] Device
not ready
Jul 2
Hi,
here is a dmesg log of the system.
There are a lot of messages and errors from the LSI driver module (mptbase).
Maybe someone knows how to interpret these.
[...]
[48087.552179] mptbase: ioc0: LogInfo(0x3108): Originator={PL}, Code={SATA
NCQ Fail All Commands After Error},
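With many such lines in a log, a short script can count how often each mptbase LogInfo message occurs per controller. This is a sketch under assumptions: the regular expression is modelled only on the one line quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Modelled on the single mptbase line quoted above; other layouts may differ.
LOGINFO = re.compile(
    r"mptbase: (?P<ioc>ioc\d+): LogInfo\((?P<code>0x[0-9a-fA-F]+)\): "
    r"Originator=\{(?P<originator>[^}]*)\}, Code=\{(?P<text>[^}]*)\}"
)

def summarize_loginfo(log_text):
    """Count LogInfo events per (ioc, originator, message text)."""
    joined = " ".join(log_text.split())  # undo line wrapping added by mail clients
    counts = Counter()
    for m in LOGINFO.finditer(joined):
        counts[(m.group("ioc"), m.group("originator"), m.group("text"))] += 1
    return counts

if __name__ == "__main__":
    sample = ("[48087.552179] mptbase: ioc0: LogInfo(0x3108): Originator={PL}, "
              "Code={SATA\nNCQ Fail All Commands After Error},")
    for key, n in summarize_loginfo(sample).items():
        print(n, key)
```

Collapsing whitespace first means a message wrapped across two lines, as in the excerpt, is still matched as one event.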
** Attachment added: screen shot of hanging system
http://launchpadlibrarian.net/51117530/sp48-hang.jpg
--
system hangs after strange errors - raid6 and xfs defective (lsi driver?)
https://bugs.launchpad.net/bugs/599830
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
** Attachment added: error during xfs access
http://launchpadlibrarian.net/51117657/xfs.log1
** Package changed: ubuntu => ecs (Ubuntu)
--
ubuntu-bugs mailing list