Hi All,

I have a nasty problem with open-iscsi on SLES10 with an Infortrend
iSCSI array.

Everything goes wrong as soon as the read/write load becomes heavy.
Network dumps suggest the problem is always present, though; it only
becomes critical under heavy load.

My setup:

1x HP DL585 - SLES10 x86_64
1x HP DL585 - RHEL4 x86_64
1x HP DL380 - SLES10 i586

2x Cisco 2960G (gigabit) switches

2x Infortrend A16E-G2130-4 with 16x 1TB disks each

The two Infortrend arrays have all their gigabit Ethernet ports
plugged into one of the Cisco switches. Two fibre links connect that
switch to the other Cisco switch, which has the three servers plugged
into it.  The network is completely isolated from our other company
networks.

At first I thought it was a network problem, so we replaced our dodgy
Netgear switches with quality Cisco networking gear, but the problem
is the same. If anything it is worse: the Cisco switches allow higher
throughput (an extra ~20mb/s) and the errors are now more reliably
reproducible.

None of the Linux Ethernet statistics report any errors (ifconfig) and
the Cisco switches don't report any packet errors either.  The
Infortrend arrays don't provide Ethernet statistics.

Wireshark (Ethereal) shows many TCP errors: clusters of Duplicate ACKs
and a few "previous segment lost".

dmesg output from the SLES10 x86_64 server:

sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 1945613312
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 3866132584
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 1565827648
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429620296
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429619272
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429618248
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429617224
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429616200
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429615176
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429614152
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 429613128
Buffer I/O error on device dm-9, logical block 53701593
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701594
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701595
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701596
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701597
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701598
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701599
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701600
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701601
lost page write due to I/O error on dm-9
Buffer I/O error on device dm-9, logical block 53701602
lost page write due to I/O error on dm-9
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 2972647720
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 2717078440
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 1566942880
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 1023998416
I/O error in filesystem ("dm-9") meta-data dev dm-9 block 0x3d08f850       ("xfs_trans_read_buf") error 5 buf count 8192
sd 11:0:0:4: SCSI error: return code = 0x00020000
end_request: I/O error, dev sdi, sector 2048020038
I/O error in filesystem ("dm-9") meta-data dev dm-9 block 0x7a124cc6       ("xlog_iodone") error 5 buf count 1024
xfs_force_shutdown(dm-9,0x2) called from line 960 of file fs/xfs/xfs_log.c.  Return address = 0xffffffff882913aa
Filesystem "dm-9": Log I/O Error Detected.  Shutting down filesystem: dm-9
Please umount the filesystem, and rectify the problem(s)
xfs_force_shutdown(dm-9,0x1) called from line 424 of file fs/xfs/xfs_rw.c.  Return address = 0xffffffff882a5139
xfs_force_shutdown(dm-9,0x1) called from line 424 of file fs/xfs/xfs_rw.c.  Return address = 0xffffffff882a5139
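
In case it helps with reading the return code in the dmesg output
above, here is a minimal sketch that splits it into its four bytes. It
assumes the usual Linux SCSI mid-layer layout of the 32-bit result
word (status, message, host and driver bytes, from low to high). The
function names and the partial host-byte table are only for
illustration and are not taken from any tool. Under that assumption,
0x00020000 has only the host byte set, to 0x02, which the kernel
headers name DID_BUS_BUSY.

```python
# Sketch: decode a Linux SCSI result word such as 0x00020000.
# Assumption: result = status | msg << 8 | host << 16 | driver << 24.

# Partial table of host-byte codes as named in the kernel's SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(result):
    """Return the four bytes of a SCSI result word as a dict."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def host_byte_name(result):
    """Return the symbolic name of the host byte, if it is in the table."""
    host = decode_scsi_result(result)["host"]
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    code = 0x00020000  # value seen in the dmesg output
    print(decode_scsi_result(code))
    print(host_byte_name(code))
```

The host byte is set by the low-level driver or transport rather than
by the target, so a result with only that byte set points at the
initiator/transport side and not at a SCSI status from the array.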

/var/log/messages from the SLES10 i586 server:

kernel:  connection2:0: iscsi: detected conn error (1011)
iscsid: detected iSCSI connection 2:0 error (1011) state (3)
iscsid: connect failed (113)
iscsid: connect failed (113)
iscsid: connect failed (113)
kernel:  session2: iscsi: session recovery timed out after 120 secs
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: scsi: Device offlined - not ready after error
recovery
kernel: sd 2:0:0:0: SCSI error: return code = 0x00020000
kernel: end_request: I/O error, dev sda, sector 768252872
kernel: sd 2:0:0:0: rejecting I/O to offline device
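
For reference, the numeric codes in the log above can be mapped to
names with a small sketch like the one below. It rests on two
assumptions that should be checked against the installed headers: that
open-iscsi numbers its connection errors from a base of 1000, so that
1011 is ISCSI_ERR_CONN_FAILED, and that the value in "connect failed
(113)" is a Linux errno, where 113 is EHOSTUNREACH ("No route to
host"). The tables are deliberately partial and the helper name
explain() is made up for this example.

```python
# Sketch: translate the numeric codes in open-iscsi log lines to names.
# Both tables are partial and are assumptions to verify against the
# kernel and open-iscsi headers on the affected machine.
import re

ISCSI_ERR_BASE = 1000
ISCSI_CONN_ERRORS = {
    ISCSI_ERR_BASE + 11: "ISCSI_ERR_CONN_FAILED",
}

# Linux errno values commonly returned by a failed connect().
LINUX_ERRNO = {
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    113: "EHOSTUNREACH",
}

# Matches "detected conn error (N)" and "connection X:Y error (N)".
CONN_ERROR_RE = re.compile(r"(?:conn|connection \d+:\d+) error \((\d+)\)")
# Matches "connect failed (N)".
CONNECT_FAILED_RE = re.compile(r"connect failed \((\d+)\)")


def explain(line):
    """Return a symbolic name for the code in a log line, or None."""
    match = CONNECT_FAILED_RE.search(line)
    if match:
        code = int(match.group(1))
        return LINUX_ERRNO.get(code, "errno %d" % code)
    match = CONN_ERROR_RE.search(line)
    if match:
        code = int(match.group(1))
        return ISCSI_CONN_ERRORS.get(code, "iscsi error %d" % code)
    return None


if __name__ == "__main__":
    for line in (
        "kernel:  connection2:0: iscsi: detected conn error (1011)",
        "iscsid: detected iSCSI connection 2:0 error (1011) state (3)",
        "iscsid: connect failed (113)",
    ):
        print(line, "->", explain(line))
```

If those mappings hold, the sequence in the log is a dropped
connection, followed by reconnect attempts that cannot reach the
target at all, until the 120 second recovery timer expires and the
device is offlined.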

Any help would be much appreciated!

Thanks,

Stuart.