Mike,
Thanks for your prompt/detailed reply.
The Vtrak is an iSCSI disk array, and unfortunately it doesn't seem to be
reporting any errors.
Is there any sort of debugging I could do to discern what the cause of the
problem is? I'm pretty sure it's not a pulled cable, but either the target or
the initiator dropping the connection for some reason...
What sort of "invalid data" would cause the initiator to drop the connection?
There are no errors about pings or nops. The fragment I sent was the
"beginning" of the problem. Can the verbosity of the iscsi messages be
increased?
Below is the output of iscsiadm -m session -P 3. The connection failed again
over the weekend, and this time I wasn't even doing any reads/writes.
There are some different error messages in there too, though I've seen these
before.
Sep 27 02:51:04 spin kernel: connection4:0: iscsi: detected conn error (1011)
Sep 27 02:51:05 spin iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
Sep 27 02:51:08 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
Sep 27 02:51:41 spin last message repeated 9 times
Sep 27 02:52:45 spin last message repeated 17 times
Sep 27 02:53:49 spin last message repeated 17 times
Sep 27 02:54:52 spin last message repeated 17 times
Sep 27 02:55:56 spin last message repeated 17 times
Sep 27 02:56:00 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
Sep 27 02:56:04 spin kernel: session4: iscsi: session recovery timed out after 300 secs
Sep 27 02:56:04 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
...
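As a rough cross-check of the excerpt above, the retries and the recovery
window can be tallied with a short script. This is only a sketch: the log text
is pasted from this mail, the year 2009 is taken from the mail headers because
syslog lines carry none, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Lines copied from the excerpt above (wrapped lines rejoined).
LOG = """\
Sep 27 02:51:04 spin kernel: connection4:0: iscsi: detected conn error (1011)
Sep 27 02:51:08 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
Sep 27 02:51:41 spin last message repeated 9 times
Sep 27 02:52:45 spin last message repeated 17 times
Sep 27 02:53:49 spin last message repeated 17 times
Sep 27 02:54:52 spin last message repeated 17 times
Sep 27 02:55:56 spin last message repeated 17 times
Sep 27 02:56:00 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
Sep 27 02:56:04 spin kernel: session4: iscsi: session recovery timed out after 300 secs
"""


def stamp(line, year=2009):
    """Parse the 15-character syslog timestamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def summarize(log):
    """Return (refused connect attempts, seconds from conn error to timeout)."""
    first_error = timeout = None
    refused = 0
    last_was_refused = False
    for line in log.splitlines():
        repeat = re.search(r"last message repeated (\d+) times", line)
        if repeat:
            # syslog collapses identical lines; expand them into the count
            if last_was_refused:
                refused += int(repeat.group(1))
            continue
        last_was_refused = "Connection refused" in line
        if last_was_refused:
            refused += 1
        elif "detected conn error" in line and first_error is None:
            first_error = stamp(line)
        elif "session recovery timed out" in line:
            timeout = stamp(line)
    gap = None
    if first_error and timeout:
        gap = (timeout - first_error).total_seconds()
    return refused, gap


if __name__ == "__main__":
    refused, gap = summarize(LOG)
    print(f"refused connects: {refused}, error-to-timeout gap: {gap} s")
```

On this excerpt the gap between the first conn error and the recovery timeout
comes out at 300 seconds, which matches the "timed out after 300 secs" message,
so the target was refusing connections for the whole recovery window.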
Sep 27 04:02:33 spin last message repeated 7 times
Sep 27 04:02:35 spin kernel: iscsi: cmd 0x28 is not queued (8)
Sep 27 04:02:35 spin kernel: sd 8:0:0:0: SCSI error: return code = 0x0001
Sep 27 04:02:35 spin kernel: end_request: I/O error, dev sde, sector 549316386
Sep 27 04:02:35 spin kernel: I/O error in filesystem ("sde1") meta-data dev sde1 block 0x20bde700 ("xfs_trans_read_buf") error 5 buf count 8192
Sep 27 04:02:37 spin iscsid: connect to 142.20.199.117:3260 failed (Connection refused)
Sep 27 04:02:39 spin kernel: iscsi: cmd 0x28 is not queued (8)
Sep 27 04:02:39 spin kernel: sd 8:0:0:0: SCSI error: return code = 0x0001
Sep 27 04:02:39 spin kernel: end_request: I/O error, dev sde, sector 446314
Sep 27 04:02:39 spin kernel: I/O error in filesystem ("sde1") meta-data dev sde1 block 0x6cf48 ("xfs_trans_read_buf") error 5 buf count 4096
...
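Both failed commands above are opcode 0x28, which is READ(10) in SCSI, and
"error 5" is EIO on Linux. As a sanity check that the sde sector numbers and
the sde1 block numbers refer to the same reads, the two can be compared
directly. This sketch assumes the XFS message reports blocks in 512-byte units
counted from the start of sde1; the function name is made up for illustration.

```python
# (sector on sde, block on sde1) pairs copied from the log lines above.
PAIRS = [
    (549316386, 0x20BDE700),
    (446314, 0x6CF48),
]


def partition_offsets(pairs):
    """Return device sector minus partition block for each pair.

    If the assumption about units holds, every pair should give the same
    value: the sector at which sde1 starts on sde.
    """
    return [sector - block for sector, block in pairs]


if __name__ == "__main__":
    print(partition_offsets(PAIRS))
```

Both pairs give the same offset of 34 sectors, so the block-layer errors and
the XFS metadata errors line up as the same two failed reads.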
iSCSI Transport Class version 2.0-724
version 2.0-871
Target: reserve.vtrak
Current Portal: 142.20.199.117:3260,1
Persistent Portal: 142.20.199.117:3260,1
**
Interface:
**
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: unknown
Iface IPaddress: 142.20.199.72
Iface HWaddress: default
Iface Netdev: default
SID: 4
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: Unknown
Internal iscsid Session State: REOPEN
Negotiated iSCSI params:
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 131072
MaxXmitDataSegmentLength: 524288
FirstBurstLength: 131072
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1
Attached SCSI devices:
Host Number: 8 State: running
scsi8 Channel 00 Id 0 Lun: 0
Attached scsi disk sde State: running
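If it helps to watch for this state from a script, the "Key: value" lines of
the output above can be pulled into a dictionary. This is a minimal sketch that
only looks at the field names and values shown in this mail; the function names
are hypothetical, and the text would normally come from running
iscsiadm -m session -P 3 rather than from a pasted string.

```python
# A few lines copied from the iscsiadm -m session -P 3 output above.
SESSION_OUTPUT = """\
Target: reserve.vtrak
Current Portal: 142.20.199.117:3260,1
SID: 4
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: Unknown
Internal iscsid Session State: REOPEN
Attached SCSI devices:
Host Number: 8 State: running
"""


def parse_fields(text):
    """Collect 'Key: value' lines; the first occurrence of a key wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields.setdefault(key, value.strip())
    return fields


def in_recovery(fields):
    """True when the session shows the states seen in this report."""
    return (
        fields.get("iSCSI Connection State") == "TRANSPORT WAIT"
        or fields.get("Internal iscsid Session State") == "REOPEN"
    )


if __name__ == "__main__":
    fields = parse_fields(SESSION_OUTPUT)
    print(fields["Target"], "in recovery:", in_recovery(fields))
```

Note that the SCSI host and disk still show "running" in the output above even
though the connection is in TRANSPORT WAIT, so the connection and session state
fields are the ones worth checking.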
-----Original Message-----
From: Mike Christie [mailto:micha...@cs.wisc.edu]
Sent: Friday, September 25, 2009 8:46 PM
To: open-iscsi@googlegroups.com
Cc: Drew Morris
Subject: Re: connection error kills iscsi session
djmorris wrote:
> Problem: While running a backup to a mounted iscsi device, iscsid gets
> a "conn error (1011)", and my drive starts getting input/output
> errors.
>
> Details:
> Kernel Version: 2.6.23.17-88.fc7PAE
> Distribution: Fedora core 7
> open-iscsi version: 2.0-871
>
> ISCSI Target Device: Promise Vtrak M610i
> backup program: BackupPC (uses rsync)
> my linux box and target are both plugged into a Netgear 516T Switch
> (1000 baseT, 16 port, unmanaged)
>
> When I log into the target after this happens, i notice that there are
> no active iscsi sessions
> However, when i run
> iscsiadm -m session, it lists the active session
>
> tcp: [2] TAR.GET.IP.ADR:3260,1 reserve.vtrak
>
> I can log out with
> iscsiadm -m node -U manual
> but i cannot log in after that or even discover the target device
> unless i reboot that device
>
> sample from my /var/log/messa