Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-25 Thread Mike Christie

Klemens Kittan wrote:
> Am Wednesday, 20. August 2008 19:27 schrieb Mike Christie:
>> Klemens Kittan wrote:
>>> Here is the whole log.
>> Thanks. In that log, are you doing IO to both disks on both connections
>> or just one disk?
>>
>> It looks like the initiator is not getting a response to the ping in
>> time (within node.conn[0].timeo.noop_out_timeout seconds). It might be
>> a driver bug and we are dropping it. To check
>> that you would need to run ethereal or wireshark so we can see what the
>> network layer is seeing, but I do not think this is the case given some
>> of the other ping times in there.
>>
>> If the target is just so slow that it is useless to try sending the nop
>> you could just run without nops on by setting
>> node.conn[0].timeo.noop_out_timeout = 0
>> node.conn[0].timeo.noop_out_interval = 0
>> or turn node.conn[0].timeo.noop_out_timeout to a higher value like 45
>> and see if that helps.
>>
>> In the logs the responses to some other pings are around 9 seconds
>> which is cutting it close when you have a timeout of 10 seconds, so you
>> should definitely try to increase the node.conn[0].timeo.noop_out_timeout.
>>
> 
> Hi Mike,
> 
> we think now that the problem isn't with the iscsi-stuff but comes
> from the ethernet driver (e1000 "Detected Tx Unit Hang"). Sorry and
> thank you very much for your patience and help!
> 

No problem. Thanks for updating us with your findings. It will help 
other people when searching the list for their issues.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-25 Thread Klemens Kittan
On Wednesday, 20 August 2008 19:27, Mike Christie wrote:
> Klemens Kittan wrote:
> > Here is the whole log.
>
> Thanks. In that log, are you doing IO to both disks on both connections
> or just one disk?
>
> It looks like the initiator is not getting a response to the ping in
> time (within node.conn[0].timeo.noop_out_timeout seconds). It might be
> a driver bug and we are dropping it. To check
> that you would need to run ethereal or wireshark so we can see what the
> network layer is seeing, but I do not think this is the case given some
> of the other ping times in there.
>
> If the target is just so slow that it is useless to try sending the nop
> you could just run without nops on by setting
> node.conn[0].timeo.noop_out_timeout = 0
> node.conn[0].timeo.noop_out_interval = 0
> or turn node.conn[0].timeo.noop_out_timeout to a higher value like 45
> and see if that helps.
>
> In the logs the responses to some other pings are around 9 seconds
> which is cutting it close when you have a timeout of 10 seconds, so you
> should definitely try to increase the node.conn[0].timeo.noop_out_timeout.
>

Hi Mike,

we think now that the problem isn't with the iscsi-stuff but comes
from the ethernet driver (e1000 "Detected Tx Unit Hang"). Sorry and
thank you very much for your patience and help!

Best regards,
Klemens




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie

Klemens Kittan wrote:
> 
> Here is the whole log.
> 

Thanks. In that log, are you doing IO to both disks on both connections
or just one disk?

It looks like the initiator is not getting a response to the ping in
time (within node.conn[0].timeo.noop_out_timeout seconds). It might be a
driver bug and we are dropping it. To check that, you would need to run
ethereal or wireshark so we can see what the network layer is seeing,
but I do not think this is the case given some of the other ping times
in there.

If the target is just so slow that it is useless to try sending the nop,
you could run with nops turned off by setting
node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].timeo.noop_out_interval = 0
or raise node.conn[0].timeo.noop_out_timeout to a higher value like 45
and see if that helps.

In the logs the responses to some other pings take around 9 seconds,
which is cutting it close when you have a timeout of 10 seconds, so you
should definitely try increasing node.conn[0].timeo.noop_out_timeout.




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Klemens Kittan
On Wednesday, 20 August 2008 11:52, Mike Christie wrote:
> Klemens Kittan wrote:
> > Am Wednesday, 20. August 2008 09:43 schrieb Mike Christie:
> >> Klemens Kittan wrote:
> >>> Am Tuesday, 19. August 2008 19:15 schrieb Mike Christie:
> >>>> Klemens Kittan wrote:
> >>>>> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
> >>>>>> Klemens Kittan wrote:
> >>>>>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
> >>>>>>>> Mike Christie wrote:
> >>>>>>>>> Klemens Kittan wrote:
> >>>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>>>>>
> >>>>>>>>> Thanks. It looks like your target is responding to other IO, but
> >>>>>>>>> did not respond to the ping quick enough so it timed out. Let me
> >>>>>>>>> make a patch for you to test. I should hopefully have it later
> >>>>>>>>> today.
> >>>>>>>>
> >>>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>>>>>
> >>>>>>>> To apply the patch untar and unzip the source then cd to the dir.
> >>>>>>>> Then do:
> >>>>>>>>
> >>>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>>>>>
> >>>>>>>> Then do the normal make and make install. You will probably want
> >>>>>>>> to reboot the box to make sure you are using the new modules.
> >>>>>>>
> >>>>>>> Unfortunately I got the same errors.
> >>>>>>
> >>>>>> Could you send the log output?
> >>>>>
> >>>>> Here is the /var/log/syslog.
> >>>>
> >>>> Shoot. For some reason that nop is just not finishing in a decent
> >>>> amount of time. Could you try the attached patch. It gives the nop
> >>>> even more time to complete and it spits out a bunch of debug info to
> >>>> make sure open-iscsi did not leak the task.
> >>>
> >>> Unfortunately, the attached file is empty.
> >>
> >> Oh yeah, if you just log into the target and do not do any IO to the
> >> disks. Do you see any messages like this:
> >>
> >> Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping timeout of
> >> 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
> >> Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected conn
> >> error (1011)
> >
> > I get these messages all the time (with and without IO traffic):
>
> You should get these. I am just worried about getting these
>
> > Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout
> > of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping
> > 4336967081, now 4336970339 task 81003797aac0
> > Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn
> > error (1011)
>
> when there is no IO traffic.
>
> > Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
> > Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
> > Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task
> > 8100f9c541c0
> > Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task
> > 81003797adc0
> > Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
> > Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task
> > 8100f9c54140
> > Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
> > Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task
> > 81003797aac0
> >
> > With IO traffic I get these messages:
>
> Could you give me a large chunk of the log? I need the stuff that
> happened before this part.
>
> > Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping
> > timeout of 10 secs with recv timeout of 5 secs expired last rx
> > 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
> > Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected
> > conn error (1011)
> > Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0
> > error (1011) state (3)
> > Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after
> > recovery (1 attempts)
> >
> > Thanks,
> > Klemens
>

Here is the whole log.

Thanks,
Klemens
Aug 20 06:25:10 baltrum syslogd 1.5.0#5: restart.
Aug 20 06:31:01 baltrum /USR/SBIN/CRON[17699]: (clamav) CMD ([ -x /usr/bin/freshclam ] && /usr/bin/freshclam --quiet >/dev/null)
Aug 20 06:51:33 baltrum -- MARK --
Aug 20 07:11:33 baltrum -- MARK --
Aug 20 07:17:01 baltrum /USR/SBIN/CRON[19082]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 20 07:31:01 baltrum /USR/SBIN/CRON[19505]: (clamav) CMD ([ -x /usr/bin/freshclam ] && /usr/bin/freshclam --quiet >/dev/null)
Aug 20 07:51:33 baltrum -- MARK --
Aug 20 08:11:33 baltrum -- MARK --
Aug 20 08:17:01 baltrum /USR/SBIN/CRON[20887]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 20 08:19:32 baltrum multipathd: ift_backup: stop event checker thread 
Aug 20 08:19:32 baltrum multipathd: ift_vz: stop event checker thread 
Aug 20 08:19:32 baltrum multipathd: ift_mail: stop event checker thread 
Aug 20 08:19:32 baltrum multipathd: ift_sql: stop ev

Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie

Klemens Kittan wrote:
> Am Wednesday, 20. August 2008 09:43 schrieb Mike Christie:
>> Klemens Kittan wrote:
>>> Am Tuesday, 19. August 2008 19:15 schrieb Mike Christie:
>>>> Klemens Kittan wrote:
>>>>> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
>>>>>> Klemens Kittan wrote:
>>>>>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
>>>>>>>> Mike Christie wrote:
>>>>>>>>> Klemens Kittan wrote:
>>>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>>>> Thanks. It looks like your target is responding to other IO, but
>>>>>>>>> did not respond to the ping quick enough so it timed out. Let me
>>>>>>>>> make a patch for you to test. I should hopefully have it later
>>>>>>>>> today.
>>>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>>>
>>>>>>>> To apply the patch untar and unzip the source then cd to the dir.
>>>>>>>> Then do:
>>>>>>>>
>>>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>>>
>>>>>>>> Then do the normal make and make install. You will probably want to
>>>>>>>> reboot the box to make sure you are using the new modules.
>>>>>>> Unfortunately I got the same errors.
>>>>>> Could you send the log output?
>>>>> Here is the /var/log/syslog.
>>>> Shoot. For some reason that nop is just not finishing in a decent amount
>>>> of time. Could you try the attached patch. It gives the nop even more
>>>> time to complete and it spits out a bunch of debug info to make sure
>>>> open-iscsi did not leak the task.
>>> Unfortunately, the attached file is empty.
>> Oh yeah, if you just log into the target and do not do any IO to the
>> disks. Do you see any messages like this:
>>
>> Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping
>> timeout of
>> 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
>> Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected
>> conn
>> error (1011)
>>
> 
> I get these messages all the time (with and without IO traffic):

You should get these. I am just worried about getting these

> Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout
> of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping
> 4336967081, now 4336970339 task 81003797aac0
> Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn
> error (1011)

when there is no IO traffic.

> Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
> Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
> Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 
> 8100f9c541c0
> Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 
> 81003797adc0
> Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
> Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 
> 8100f9c54140
> Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
> Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 
> 81003797aac0
> 
> With IO traffic I get these messages:


Could you give me a large chunk of the log? I need the stuff that 
happened before this part.

> Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout 
> of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 
> 4336967081, now 4336970339 task 81003797aac0
> Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn 
> error (1011)
> Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error 
> (1011) state (3)
> Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery 
> (1 
> attempts)
> 
> Thanks,
> Klemens
> 





Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Klemens Kittan
On Wednesday, 20 August 2008 09:43, Mike Christie wrote:
> Klemens Kittan wrote:
> > Am Tuesday, 19. August 2008 19:15 schrieb Mike Christie:
> >> Klemens Kittan wrote:
> >>> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
> >>>> Klemens Kittan wrote:
> >>>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
> >>>>>> Mike Christie wrote:
> >>>>>>> Klemens Kittan wrote:
> >>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>>>
> >>>>>>> Thanks. It looks like your target is responding to other IO, but
> >>>>>>> did not respond to the ping quick enough so it timed out. Let me
> >>>>>>> make a patch for you to test. I should hopefully have it later
> >>>>>>> today.
> >>>>>>
> >>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>>>
> >>>>>> To apply the patch untar and unzip the source then cd to the dir.
> >>>>>> Then do:
> >>>>>>
> >>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>>>
> >>>>>> Then do the normal make and make install. You will probably want to
> >>>>>> reboot the box to make sure you are using the new modules.
> >>>>>
> >>>>> Unfortunately I got the same errors.
> >>>>
> >>>> Could you send the log output?
> >>>
> >>> Here is the /var/log/syslog.
> >>
> >> Shoot. For some reason that nop is just not finishing in a decent amount
> >> of time. Could you try the attached patch. It gives the nop even more
> >> time to complete and it spits out a bunch of debug info to make sure
> >> open-iscsi did not leak the task.
> >
> > Unfortunately, the attached file is empty.
>
> Oh yeah, if you just log into the target and do not do any IO to the
> disks. Do you see any messages like this:
>
> Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping
> timeout of
> 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
> Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected
> conn
> error (1011)
>

I get these messages all the time (with and without IO traffic):
Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 8100f9c541c0
Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 81003797adc0
Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 8100f9c54140
Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 81003797aac0
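
Editorial note: in the debug output above, each `send` is followed by an `iscsi_free_mgmt_task` for the same task pointer, so the gap between the two kernel timestamps gives a rough idea of how long each nop was outstanding. The C sketch below does that arithmetic. The names `nop_record`, `parse_nop_line` and `nop_round_trip_us` are hypothetical helpers for illustration, not part of open-iscsi, and the sketch assumes each log entry sits on a single line.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum nop_event { NOP_SEND, NOP_FREE };

struct nop_record {
    enum nop_event kind;  /* which debug printk produced the line */
    uint64_t when_us;     /* kernel timestamp in microseconds */
    uint64_t task;        /* task pointer printed by %p */
};

/* Parse one syslog line such as
 *   "... kernel: [168482.943791] send 8100f9c541c0"
 * Returns 1 on success, 0 if the line is not a send/free debug line. */
static int parse_nop_line(const char *line, struct nop_record *out)
{
    const char *bracket;
    unsigned long long sec, usec, task;
    char word[32];

    assert(line != NULL && out != NULL);
    bracket = strchr(line, '[');
    if (!bracket)
        return 0;
    if (sscanf(bracket, "[%llu.%6llu] %31s %llx",
               &sec, &usec, word, &task) != 4)
        return 0;

    if (strcmp(word, "send") == 0)
        out->kind = NOP_SEND;
    else if (strcmp(word, "iscsi_free_mgmt_task") == 0)
        out->kind = NOP_FREE;
    else
        return 0;

    out->when_us = sec * 1000000ULL + usec;
    out->task = task;
    return 1;
}

/* Microseconds between a send line and the matching free line,
 * or -1 if the lines do not form a send/free pair for one task. */
static int64_t nop_round_trip_us(const char *send_line, const char *free_line)
{
    struct nop_record s, f;

    if (!parse_nop_line(send_line, &s) || !parse_nop_line(free_line, &f))
        return -1;
    if (s.kind != NOP_SEND || f.kind != NOP_FREE || s.task != f.task)
        return -1;
    return (int64_t)(f.when_us - s.when_us);
}
```

For the first pair in the log (task 8100f9c541c0, 168482.943791 to 168483.026817) this gives 83026 microseconds, and for the third pair (task 8100f9c54140) 245445 microseconds. The free timestamp marks when the task was released, so these are approximate upper bounds on the actual response time.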

With IO traffic I get these messages:
Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn error (1011)
Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery (1 attempts)

Thanks,
Klemens





Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie

Klemens Kittan wrote:
> Am Tuesday, 19. August 2008 19:15 schrieb Mike Christie:
>> Klemens Kittan wrote:
>>> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
>>>> Klemens Kittan wrote:
>>>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
>>>>>> Mike Christie wrote:
>>>>>>> Klemens Kittan wrote:
>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>>>> patch for you to test. I should hopefully have it later today.
>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>
>>>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>>>> do:
>>>>>>
>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>
>>>>>> Then do the normal make and make install. You will probably want to
>>>>>> reboot the box to make sure you are using the new modules.
>>>>> Unfortunately I got the same errors.
>>>> Could you send the log output?
>>> Here is the /var/log/syslog.
>> Shoot. For some reason that nop is just not finishing in a decent amount
>> of time. Could you try the attached patch. It gives the nop even more
>> time to complete and it spits out a bunch of debug info to make sure
>> open-iscsi did not leak the task.
>>
> 
> Unfortunately, the attached file is empty.
> 

Oh yeah, if you just log into the target and do not do any IO to the
disks, do you see any messages like this:

Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected conn error (1011)
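
Editorial note: the three jiffies counters in that message (last rx, last ping, now) can be turned into elapsed times by subtraction. A small C sketch follows. `ticks_elapsed` and `ticks_to_ms` are hypothetical helper names, and the tick rate of 250 per second is an assumption inferred from the log itself (2500 ticks between "last ping" and "now" for a 10 secs timeout). It is not stated anywhere in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tick rate: 2500 ticks for the 10 secs ping timeout implies 250. */
#define ASSUMED_HZ 250

/* Ticks between two jiffies samples. The unsigned subtraction followed by
 * a signed cast stays correct even if the counter wrapped in between. */
static int64_t ticks_elapsed(uint64_t earlier, uint64_t later)
{
    return (int64_t)(later - earlier);
}

/* Convert a tick count to milliseconds for a given tick rate. */
static int64_t ticks_to_ms(int64_t ticks, int hz)
{
    assert(hz > 0);
    return ticks * 1000 / hz;
}
```

With the values from the message, the ping had been outstanding for 2500 ticks (10000 ms at the assumed rate), while the last receive on the connection was only 1231 ticks (roughly 4.9 seconds) before the timeout fired. That fits the earlier observation that the target was answering other IO but not the ping.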




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie
Klemens Kittan wrote:
> Am Tuesday, 19. August 2008 19:15 schrieb Mike Christie:
>> Klemens Kittan wrote:
>>> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
>>>> Klemens Kittan wrote:
>>>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
>>>>>> Mike Christie wrote:
>>>>>>> Klemens Kittan wrote:
>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>>>> patch for you to test. I should hopefully have it later today.
>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>
>>>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>>>> do:
>>>>>>
>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>
>>>>>> Then do the normal make and make install. You will probably want to
>>>>>> reboot the box to make sure you are using the new modules.
>>>>> Unfortunately I got the same errors.
>>>> Could you send the log output?
>>> Here is the /var/log/syslog.
>> Shoot. For some reason that nop is just not finishing in a decent amount
>> of time. Could you try the attached patch. It gives the nop even more
>> time to complete and it spits out a bunch of debug info to make sure
>> open-iscsi did not leak the task.
>>
> 
> Unfortunately, the attached file is empty.
> 

Sorry, here it is.


--- open-iscsi-2.0-869.2/kernel/libiscsi.c	2008-05-08 19:53:48.0 -0500
+++ open-iscsi-2.0-869.2.nop/kernel/libiscsi.c	2008-08-19 12:13:38.0 -0500
@@ -319,8 +319,10 @@ void iscsi_free_mgmt_task(struct iscsi_c
 	if (conn->login_mtask == mtask)
 		return;
 
-	if (conn->ping_mtask == mtask)
+	if (conn->ping_mtask == mtask) {
+		printk(KERN_ERR "iscsi_free_mgmt_task %p\n", mtask);
 		conn->ping_mtask = NULL;
+	}
 	__kfifo_put(conn->session->mgmtpool.queue,
 		    (void*)&mtask, sizeof(void*));
 }
@@ -501,6 +503,7 @@ static void iscsi_send_nopout(struct isc
 
 	/* only track our nops */
 	if (!rhdr) {
+		printk(KERN_ERR "send %p\n", mtask);
 		conn->ping_mtask = mtask;
 		conn->last_ping = jiffies;
 	}
@@ -628,6 +631,7 @@ static int __iscsi_complete_pdu(struct i
 		conn->exp_statsn = be32_to_cpu(hdr->statsn) + 1;
 
 		if (conn->ping_mtask != mtask) {
+			printk(KERN_ERR "userspace nop\n");
 			/*
 			 * If this is not in response to one of our
 			 * nops then it must be from userspace.
@@ -1367,19 +1371,28 @@ static void iscsi_check_transport_timeou
 
 	recv_timeout *= HZ;
 	last_recv = conn->last_recv;
+	/*
+	 * Don't fire the eh if the ping timed out but we are getting
+	 * other IO responses. Just give it more time.
+	 */
 	if (conn->ping_mtask &&
 	    time_before_eq(conn->last_ping + (conn->ping_timeout * HZ),
-			   jiffies)) {
+			   jiffies) &&
+	    time_before_eq(last_recv + (recv_timeout * 2), jiffies)) {
 		iscsi_conn_printk(KERN_ERR, conn, "ping timeout of %d secs "
-				  "expired, last rx %lu, last ping %lu, "
-				  "now %lu\n", conn->ping_timeout, last_recv,
-				  conn->last_ping, jiffies);
+				  "with recv timeout of %d secs expired "
+				  "last rx %lu, last ping %lu, now %lu "
+				  "task %p\n",
+				  conn->ping_timeout, conn->recv_timeout,
+				  last_recv, conn->last_ping, jiffies,
+				  conn->ping_mtask);
 		spin_unlock(&session->lock);
 		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
 		return;
 	}
 
-	if (time_before_eq(last_recv + recv_timeout, jiffies)) {
+	if (!conn->ping_mtask &&
+	    time_before_eq(last_recv + recv_timeout, jiffies)) {
 		/* send a ping to try to provoke some traffic */
 		debug_scsi("Sending nopout as ping on conn %p\n", conn);
 		iscsi_send_nopout(conn, NULL);
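
Editorial note: the last hunk changes when a timed-out ping is treated as a connection failure. Before the patch, an expired ping alone was enough. After it, the connection must also have received nothing for twice the recv timeout. The userspace C sketch below mirrors only that condition so it can be tried against the jiffies values in the logs from this thread. `ping_state`, `ticks_before_eq`, `old_check_fails_conn` and `patched_check_fails_conn` are hypothetical names. The tick rate of 250 and the 5 secs recv timeout for the Aug 14 log are assumptions, not values confirmed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's time_before_eq(a, b):
 * true when a is at or before b, tolerant of counter wraparound. */
static bool ticks_before_eq(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) >= 0;
}

struct ping_state {
    bool ping_outstanding;       /* stands in for conn->ping_mtask != NULL */
    uint64_t last_ping;          /* jiffies when the nop was sent */
    uint64_t last_recv;          /* jiffies of the last received PDU */
    unsigned ping_timeout_secs;  /* noop_out_timeout */
    unsigned recv_timeout_secs;  /* noop_out_interval */
};

/* Condition before the patch: the ping timer alone decides. */
static bool old_check_fails_conn(const struct ping_state *s,
                                 uint64_t now, unsigned hz)
{
    assert(hz > 0);
    return s->ping_outstanding &&
           ticks_before_eq(s->last_ping +
                           (uint64_t)s->ping_timeout_secs * hz, now);
}

/* Condition after the patch: additionally require that nothing at all
 * was received for twice the recv timeout. */
static bool patched_check_fails_conn(const struct ping_state *s,
                                     uint64_t now, unsigned hz)
{
    uint64_t recv_timeout = (uint64_t)s->recv_timeout_secs * hz;

    return old_check_fails_conn(s, now, hz) &&
           ticks_before_eq(s->last_recv + recv_timeout * 2, now);
}
```

Under those assumptions, the Aug 14 values (last rx 4315069195, last ping 4315067926, now 4315070426) trip the old condition but not the patched one, because data had been received about 4.9 seconds earlier. The Aug 20 values (last rx 4336967839, last ping 4336967081, now 4336970339) trip the patched condition as well, since the last receive was exactly 2500 ticks old.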


Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Klemens Kittan
On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
> Klemens Kittan wrote:
> > Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
> >> Klemens Kittan wrote:
> >>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
> >>>> Mike Christie wrote:
> >>>>> Klemens Kittan wrote:
> >>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>
> >>>>> Thanks. It looks like your target is responding to other IO, but did
> >>>>> not respond to the ping quick enough so it timed out. Let me make a
> >>>>> patch for you to test. I should hopefully have it later today.
> >>>>
> >>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>
> >>>> To apply the patch untar and unzip the source then cd to the dir. Then
> >>>> do:
> >>>>
> >>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>
> >>>> Then do the normal make and make install. You will probably want to
> >>>> reboot the box to make sure you are using the new modules.
> >>>
> >>> Unfortunately I got the same errors.
> >>
> >> Could you send the log output?
> >
> > Here is the /var/log/syslog.
>
> Shoot. For some reason that nop is just not finishing in a decent amount
> of time. Could you try the attached patch. It gives the nop even more
> time to complete and it spits out a bunch of debug info to make sure
> open-iscsi did not leak the task.
>

Unfortunately, the attached file is empty.

Thanks,
Klemens





Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-19 Thread Mike Christie
Klemens Kittan wrote:
> Am Monday, 18. August 2008 20:10 schrieb Mike Christie:
>> Klemens Kittan wrote:
>>> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
>>>> Mike Christie wrote:
>>>>> Klemens Kittan wrote:
>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>> patch for you to test. I should hopefully have it later today.
>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>
>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>> do:
>>>>
>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>
>>>> Then do the normal make and make install. You will probably want to
>>>> reboot the box to make sure you are using the new modules.
>>> Unfortunately I got the same errors.
>> Could you send the log output?
>>
> 
> Here is the /var/log/syslog.
> 

Shoot. For some reason that nop is just not finishing in a decent amount
of time. Could you try the attached patch? It gives the nop even more
time to complete, and it prints a bunch of debug info to make sure
open-iscsi did not leak the task.




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-18 Thread Mike Christie

Klemens Kittan wrote:
> Am Friday, 15. August 2008 20:03 schrieb Mike Christie:
>> Mike Christie wrote:
>>> Klemens Kittan wrote:
>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>> Thanks. It looks like your target is responding to other IO, but did not
>>> respond to the ping quick enough so it timed out. Let me make a patch
>>> for you to test. I should hopefully have it later today.
>> Try the attached patch over  open-iscsi-2.0-869.2 tarball modules.
>>
>> To apply the patch untar and unzip the source then cd to the dir. Then do:
>>
>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>
>> Then do the normal make and make install. You will probably want to
>> reboot the box to make sure you are using the new modules.
>>
> 
> Unfortunately I got the same errors.
> 

Could you send the log output?




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-18 Thread Klemens Kittan
On Friday, 15 August 2008 20:03, Mike Christie wrote:
> Mike Christie wrote:
> > Klemens Kittan wrote:
> >> Here is the configuration of my debian kernel (2.6.25-2).
> >
> > Thanks. It looks like your target is responding to other IO, but did not
> > respond to the ping quick enough so it timed out. Let me make a patch
> > for you to test. I should hopefully have it later today.
>
> Try the attached patch over  open-iscsi-2.0-869.2 tarball modules.
>
> To apply the patch untar and unzip the source then cd to the dir. Then do:
>
> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>
> Then do the normal make and make install. You will probably want to
> reboot the box to make sure you are using the new modules.
>

Unfortunately I got the same errors.

Thanks,
Klemens

-- 
Klemens Kittan
Systemadministrator

Uni-Potsdam, Inst. f. Informatik
August-Bebel-Str. 89
14482 Potsdam

Tel.:   +49-331-977/3125
Fax.:   +49-331-977/3122
eMail   : [EMAIL PROTECTED]

gpg --recv-keys --keyserver wwwkeys.de.pgp.net 6EA09333




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-06-23 Thread Mike Christie

Michael Kindermann wrote:
> Hello,
> thanks for quick reply. During weekend I did backups on the iscsi-device
> with not one single error on kernel 2.6.22. 
> Am Freitag, den 13.06.2008, 19:11 +0200 schrieb Mike Christie:
>> Michael Kindermann wrote:
>>> Hello,
>>>
>>> We receive errors below on a standard debian lenny/testing sytem since
>> kernelupdate from 2.6.22-3-686 to latest debian-kernel 2.6.24-3-686. The
>> iscsi-device is Eonstore E16A-2130.
>>> The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel
>> the errors disappear. Errors only happen during copying on the
>> iscsi-devices.
>>> Is this behaviour a debian specific problem and I have to compile
>> open-scsi?   
>>>
>>>
>>>
>>> Jun 13 14:32:30 hg2 kernel:  connection1:0: iscsi: detected conn error
>> (1011)
>>> Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error
>> (1011) state (3)
>>> Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
>> The READs or WRITEs from the copy operations are timing out. The SCSI 
>> layer sets a timer on each command which is probably the default of 60 
>> seconds (scsi layer sets to 30 and udev normal raises this to 60). If 
>> the command does not complete in that time it starts the scsi error 
>> handler and you end up getting these errors in the worst case where we 
>> cannot just abort and restart the command or reset the device.
>>
>> Are you copying to the iscsi device or from it (and are you then copying 
>> to to/from a non-iscsi device), or is it mixed?
> 
> These errors posted earlier resulting from simply copying a dvdimage
> from a local logical volume on SATA-Drives to a logical volume on the
> iscsi-device. Normal usage is to do backups by rsync (dirvish) to this
> device, which are very slow due to timeouts (normal rsync linux backups
> via ssh and the windows-filesystems by local rsyncing cifs-mounts).
> Results stay the same. The errors only happen when copying to the
> iscsi-device. Restoring of data from iscsi to local volumes works great
> without errors on the 2.6.24 -Kernel.   
> 

Weird. We actually fixed a write bug in that kernel, so writes should be 
better. Maybe we are now overloading the target - I do not know.

If this happens again you might want to lower the queue depths by
setting the following values lower:

node.session.cmds_max
node.session.queue_depth

Are you doing IO to lots of disks at the same time? And what type of 
target is this? Does it have good backing storage - fast sas or scsi 
disk or slower ide or sata ones?




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-06-16 Thread Michael Kindermann

Hello,
thanks for the quick reply. Over the weekend I ran backups to the
iscsi-device on kernel 2.6.22 without a single error.
On Friday, 13.06.2008, 19:11 +0200, Mike Christie wrote:
> Michael Kindermann wrote:
> > Hello,
> > 
> > We receive the errors below on a standard debian lenny/testing system
> > since the kernel update from 2.6.22-3-686 to the latest debian-kernel
> > 2.6.24-3-686. The iscsi-device is Eonstore E16A-2130.
> > The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel
> > the errors disappear. Errors only happen during copying on the
> > iscsi-devices.
> > Is this behaviour a Debian-specific problem, and do I have to compile
> > open-iscsi myself?
> > 
> > Jun 13 14:32:30 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
> > Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> > Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
> 
> The READs or WRITEs from the copy operations are timing out. The SCSI 
> layer sets a timer on each command, which is probably the default of 60 
> seconds (the scsi layer sets it to 30 and udev normally raises this to 
> 60). If the command does not complete in that time it starts the scsi 
> error handler, and you end up getting these errors in the worst case 
> where we cannot just abort and restart the command or reset the device.
> 
> Are you copying to the iscsi device or from it (and are you then copying 
> to/from a non-iscsi device), or is it mixed?

The errors posted earlier resulted from simply copying a DVD image
from a local logical volume on SATA drives to a logical volume on the
iscsi-device. Normal usage is to do backups by rsync (dirvish) to this
device, which are very slow due to timeouts (normal rsync Linux backups
via ssh, and the Windows filesystems by locally rsyncing CIFS mounts).
The results stay the same. The errors only happen when copying to the
iscsi-device. Restoring data from iscsi to local volumes works great
without errors on the 2.6.24 kernel.

Another effect is that backups are slow and no keyboard interaction is
possible during these timeouts.

> 
> When you were using 2.6.22-3-686, were you also using the open-iscsi deb 
> package 2.0.869.2-2 or was it an older version?
We just boot the previous 2.6.22 kernel.

dpkg -l | grep iscsi
ii  open-iscsi  2.0.869.2-2

> 
> On the broken setup could you run
> 
> iscsiadm -m session -P 3
> 
> and send all the output?
iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-724
iscsiadm version 2.0-869
Target: iqn.2002-10.com.infortrend:raid.sn7457154.20
Current Portal: 192.168.7.227:3260,1
Persistent Portal: 192.168.7.227:3260,1
**
Interface:
**
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01.b75ebc4b5f99
Iface IPaddress: 192.168.7.2
Iface HWaddress: default
Iface Netdev: default
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: Unknown
Internal iscsid Session State: NO CHANGE

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 131072
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1

Attached SCSI devices:

Host Number: 5  State: running
scsi5 Channel 00 Id 0 Lun: 0
Attached scsi disk sdc  State: running
scsi5 Channel 00 Id 0 Lun: 1
Attached scsi disk sdd  State: running
scsi5 Channel 00 Id 0 Lun: 2
Attached scsi disk sde  State: running
scsi5 Channel 00 Id 0 Lun: 3
Attached scsi disk sdf  State: running
scsi5 Channel 00 Id 0 Lun: 4
Attached scsi disk sdg  State: running
scsi5 Channel 00 Id 0 Lun: 5
Attached scsi disk sdh  State: running
scsi5 Channel 00 Id 0 Lun: 6
Attached scsi disk sdi  State: running

greets
Michael

> > 
> 
-- 
Michael Kindermann
Systemadministrator
HandyGames

www.handy-games.com GmbH
i_Park Klingholz 13
97232 Giebelstadt
Germany

Tel: +49 (0) 9334 9757 - 35
mail:[EMAIL PROTECTED]



Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-06-13 Thread Mike Christie

Michael Kindermann wrote:
> Hello,
> 
> We receive the errors below on a standard debian lenny/testing system since 
> the kernel update from 2.6.22-3-686 to the latest debian-kernel 2.6.24-3-686. 
> The iscsi-device is Eonstore E16A-2130.
> The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel the 
> errors disappear. Errors only happen during copying on the iscsi-devices.
> Is this behaviour a Debian-specific problem, and do I have to compile 
> open-iscsi myself?
>  
> 
> 
> 
> 
> Jun 13 14:32:30 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
> Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded

The READs or WRITEs from the copy operations are timing out. The SCSI 
layer sets a timer on each command, which is probably the default of 60 
seconds (the scsi layer sets it to 30 and udev normally raises this to 
60). If the command does not complete in that time it starts the scsi 
error handler, and you end up getting these errors in the worst case 
where we cannot just abort and restart the command or reset the device.
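
One way to check this reading against the posted log is to measure how long the connection stays up between each "operational after recovery" message and the next "detected conn error". The sketch below does that for the log lines from the original report (wrapped lines rejoined). The helper names are made up for illustration, and a gap close to 60 seconds is only suggestive of the command timer, not proof.

```python
# Log lines copied from the original report in this thread
# (lines that were wrapped by the mail client have been rejoined).
LOG = """\
Jun 13 14:32:30 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
Jun 13 14:32:42 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:33:41 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
Jun 13 14:33:56 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:34:56 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
"""


def seconds_of_day(line):
    """Return the HH:MM:SS field of a syslog line as seconds since midnight."""
    hours, minutes, seconds = (int(x) for x in line.split()[2].split(":"))
    return hours * 3600 + minutes * 60 + seconds


def recovery_to_error_gaps(log):
    """Seconds between each recovery message and the next connection error."""
    gaps = []
    last_recovery = None
    for line in log.splitlines():
        if "operational after recovery" in line:
            last_recovery = seconds_of_day(line)
        elif "detected conn error" in line and last_recovery is not None:
            gaps.append(seconds_of_day(line) - last_recovery)
            last_recovery = None
    return gaps


print(recovery_to_error_gaps(LOG))  # prints [59, 60]
```

The timer currently applied to a disk can usually be read from /sys/block/<disk>/device/timeout (for example sdc from the session output), which would show whether it really is 60 seconds on this system.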

Are you copying to the iscsi device or from it (and are you then copying 
to/from a non-iscsi device), or is it mixed?

When you were using 2.6.22-3-686, were you also using the open-iscsi deb 
package 2.0.869.2-2 or was it an older version?

On the broken setup could you run

iscsiadm -m session -P 3

and send all the output?




iscsi errors with debian-kernel 2.6.24-3-686

2008-06-13 Thread Michael Kindermann

Hello,

We receive the errors below on a standard debian lenny/testing system since 
the kernel update from 2.6.22-3-686 to the latest debian-kernel 2.6.24-3-686. 
The iscsi-device is Eonstore E16A-2130.
The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel the 
errors disappear. Errors only happen during copying on the iscsi-devices.
Is this behaviour a Debian-specific problem, and do I have to compile 
open-iscsi myself?




Jun 13 14:32:30 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
Jun 13 14:32:42 hg2 iscsid: received iferror -38
Jun 13 14:32:42 hg2 last message repeated 4 times
Jun 13 14:32:42 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:33:41 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
Jun 13 14:33:42 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 13 14:33:55 hg2 kernel: iscsi: host reset succeeded
Jun 13 14:33:56 hg2 iscsid: received iferror -38
Jun 13 14:33:56 hg2 last message repeated 4 times
Jun 13 14:33:56 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:34:56 hg2 kernel:  connection1:0: iscsi: detected conn error (1011)
Jun 13 14:34:57 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)


Greets
Michael

-- 
Michael Kindermann
Systemadministrator
HandyGames
 
www.handy-games.com GmbH
i_Park Klingholz 13
97232 Giebelstadt
Germany
 
_
 
Tel.: +49 (0) 9334 9757 - 35
Fax: +49 (0) 9334 9757 - 19
Mail: [EMAIL PROTECTED]
 
_ 
 
Handelsregister HRB 8667 Amtsgericht Würzburg
Steuer-Nummer 257/142/90099
USt-Identifikationsnummer (VAT): DE209182197
 
Geschäftsführer (CEO):
Christopher Kassulke
Markus Kassulke


