Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Wednesday, 20 August 2008 19:27, Mike Christie wrote:
>> Klemens Kittan wrote:
>>> Here is the whole log.
>>
>> Thanks. In that log, are you doing IO to both disks on both connections
>> or just one disk?
>>
>> It looks like the initiator is not getting a response to the ping in
>> time (within node.conn[0].timeo.noop_out_timeout seconds). It might be
>> a driver bug and we are dropping it. To check that you would need to
>> run ethereal or wireshark so we can see what the network layer is
>> seeing, but I do not think this is the case given some of the other
>> ping times in there.
>>
>> If the target is just so slow that it is useless to try sending the nop
>> you could just run without nops by setting
>>
>> node.conn[0].timeo.noop_out_timeout = 0
>> node.conn[0].timeo.noop_out_interval = 0
>>
>> or raise node.conn[0].timeo.noop_out_timeout to a higher value like 45
>> and see if that helps.
>>
>> In the logs the responses to some other pings take around 9 seconds,
>> which is cutting it close when you have a timeout of 10 seconds, so you
>> should definitely try to increase node.conn[0].timeo.noop_out_timeout.
>
> Hi Mike,
>
> we now think that the problem isn't with the iSCSI stuff but comes
> from the ethernet driver (e1000 "Detected Tx Unit Hang"). Sorry and
> thank you very much for your patience and help!

No problem. Thanks for updating us with your findings. It will help
other people when searching the list for their issues.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
Re: iscsi errors with debian-kernel 2.6.24-3-686
On Wednesday, 20 August 2008 19:27, Mike Christie wrote:
> Klemens Kittan wrote:
> > Here is the whole log.
>
> Thanks. In that log, are you doing IO to both disks on both connections
> or just one disk?
>
> It looks like the initiator is not getting a response to the ping in
> time (within node.conn[0].timeo.noop_out_timeout seconds). It might be
> a driver bug and we are dropping it. To check that you would need to
> run ethereal or wireshark so we can see what the network layer is
> seeing, but I do not think this is the case given some of the other
> ping times in there.
>
> If the target is just so slow that it is useless to try sending the nop
> you could just run without nops by setting
>
> node.conn[0].timeo.noop_out_timeout = 0
> node.conn[0].timeo.noop_out_interval = 0
>
> or raise node.conn[0].timeo.noop_out_timeout to a higher value like 45
> and see if that helps.
>
> In the logs the responses to some other pings take around 9 seconds,
> which is cutting it close when you have a timeout of 10 seconds, so you
> should definitely try to increase node.conn[0].timeo.noop_out_timeout.

Hi Mike,

we now think that the problem isn't with the iSCSI stuff but comes
from the ethernet driver (e1000 "Detected Tx Unit Hang"). Sorry and
thank you very much for your patience and help!

Best regards,
Klemens
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> Here is the whole log.

Thanks. In that log, are you doing IO to both disks on both connections
or just one disk?

It looks like the initiator is not getting a response to the ping in
time (within node.conn[0].timeo.noop_out_timeout seconds). It might be
a driver bug and we are dropping it. To check that you would need to
run ethereal or wireshark so we can see what the network layer is
seeing, but I do not think this is the case given some of the other
ping times in there.

If the target is just so slow that it is useless to try sending the nop
you could just run without nops by setting

node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].timeo.noop_out_interval = 0

or raise node.conn[0].timeo.noop_out_timeout to a higher value like 45
and see if that helps.

In the logs the responses to some other pings take around 9 seconds,
which is cutting it close when you have a timeout of 10 seconds, so you
should definitely try to increase node.conn[0].timeo.noop_out_timeout.
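
The two options in the message above (disable the nops, or raise the nop timeout to 45) are settings in the node record, which iscsiadm can update with `-o update -n <name> -v <value>`. The following is only a sketch that builds and prints those command lines without running them. The target name and portal are hypothetical placeholders, not values from this thread, and the helper name `noop_update_commands` is made up for illustration.

```python
import shlex

# Hypothetical target and portal, for illustration only.
TARGET = "iqn.2008-08.example:storage.disk1"
PORTAL = "192.168.0.10:3260"

# Option 1 from the message: run without nops.
DISABLE_NOPS = {
    "node.conn[0].timeo.noop_out_interval": 0,
    "node.conn[0].timeo.noop_out_timeout": 0,
}

# Option 2 from the message: keep the nops but allow 45 seconds for the reply.
RELAX_TIMEOUT = {
    "node.conn[0].timeo.noop_out_timeout": 45,
}


def noop_update_commands(target, portal, settings):
    """Build one iscsiadm argv list per setting to update in the node record."""
    return [
        ["iscsiadm", "-m", "node", "-T", target, "-p", portal,
         "-o", "update", "-n", name, "-v", str(value)]
        for name, value in settings.items()
    ]


# Print the commands instead of executing them, so they can be reviewed first.
for argv in noop_update_commands(TARGET, PORTAL, RELAX_TIMEOUT):
    print(shlex.join(argv))
```

Updating the node record probably does not change a session that is already logged in, so a logout and login of that session is likely needed before the new value applies.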
Re: iscsi errors with debian-kernel 2.6.24-3-686
On Wednesday, 20 August 2008 11:52, Mike Christie wrote:
> Klemens Kittan wrote:
> > On Wednesday, 20 August 2008 09:43, Mike Christie wrote:
> >> Klemens Kittan wrote:
> >>> On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
> >>>> Klemens Kittan wrote:
> >>>>> On Monday, 18 August 2008 20:10, Mike Christie wrote:
> >>>>>> Klemens Kittan wrote:
> >>>>>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
> >>>>>>>> Mike Christie wrote:
> >>>>>>>>> Klemens Kittan wrote:
> >>>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>>>>>
> >>>>>>>>> Thanks. It looks like your target is responding to other IO, but
> >>>>>>>>> did not respond to the ping quick enough so it timed out. Let me
> >>>>>>>>> make a patch for you to test. I should hopefully have it later
> >>>>>>>>> today.
> >>>>>>>>
> >>>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>>>>>
> >>>>>>>> To apply the patch untar and unzip the source then cd to the dir.
> >>>>>>>> Then do:
> >>>>>>>>
> >>>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>>>>>
> >>>>>>>> Then do the normal make and make install. You will probably want
> >>>>>>>> to reboot the box to make sure you are using the new modules.
> >>>>>>>
> >>>>>>> Unfortunately I got the same errors.
> >>>>>>
> >>>>>> Could you send the log output?
> >>>>>
> >>>>> Here is the /var/log/syslog.
> >>>>
> >>>> Shoot. For some reason that nop is just not finishing in a decent
> >>>> amount of time. Could you try the attached patch. It gives the nop
> >>>> even more time to complete and it spits out a bunch of debug info to
> >>>> make sure open-iscsi did not leak the task.
> >>>
> >>> Unfortunately, the attached file is empty.
> >>
> >> Oh yeah, if you just log into the target and do not do any IO to the
> >> disks, do you see any messages like this:
> >>
> >> Aug 14 09:52:23 baltrum kernel: [81064.665749] connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
> >> Aug 14 09:52:23 baltrum kernel: [81064.669756] connection2:0: detected conn error (1011)
> >
> > I get these messages all the time (with and without IO traffic):
>
> You should get these. I am just worried about getting these
>
> > Aug 20 09:56:10 baltrum kernel: [168687.391990] connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
> > Aug 20 09:56:10 baltrum kernel: [168687.396001] connection1:0: detected conn error (1011)
>
> when there is no IO traffic.
>
> > Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
> > Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
> > Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 8100f9c541c0
> > Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 81003797adc0
> > Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
> > Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 8100f9c54140
> > Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
> > Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 81003797aac0
> >
> > With IO traffic I get these messages:
>
> Could you give me a large chunk of the log? I need the stuff that
> happened before this part.
>
> > Aug 20 09:56:10 baltrum kernel: [168687.391990] connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
> > Aug 20 09:56:10 baltrum kernel: [168687.396001] connection1:0: detected conn error (1011)
> > Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> > Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery (1 attempts)
> >
> > Thanks,
> > Klemens

Here is the whole log.

Thanks,
Klemens

Aug 20 06:25:10 baltrum syslogd 1.5.0#5: restart.
Aug 20 06:31:01 baltrum /USR/SBIN/CRON[17699]: (clamav) CMD ([ -x /usr/bin/freshclam ] && /usr/bin/freshclam --quiet >/dev/null)
Aug 20 06:51:33 baltrum -- MARK --
Aug 20 07:11:33 baltrum -- MARK --
Aug 20 07:17:01 baltrum /USR/SBIN/CRON[19082]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 20 07:31:01 baltrum /USR/SBIN/CRON[19505]: (clamav) CMD ([ -x /usr/bin/freshclam ] && /usr/bin/freshclam --quiet >/dev/null)
Aug 20 07:51:33 baltrum -- MARK --
Aug 20 08:11:33 baltrum -- MARK --
Aug 20 08:17:01 baltrum /USR/SBIN/CRON[20887]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 20 08:19:32 baltrum multipathd: ift_backup: stop event checker thread
Aug 20 08:19:32 baltrum multipathd: ift_vz: stop event checker thread
Aug 20 08:19:32 baltrum multipathd: ift_mail: stop event checker thread
Aug 20 08:19:32 baltrum multipathd: ift_sql: stop ev
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Wednesday, 20 August 2008 09:43, Mike Christie wrote:
>> Klemens Kittan wrote:
>>> On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
>>>> Klemens Kittan wrote:
>>>>> On Monday, 18 August 2008 20:10, Mike Christie wrote:
>>>>>> Klemens Kittan wrote:
>>>>>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
>>>>>>>> Mike Christie wrote:
>>>>>>>>> Klemens Kittan wrote:
>>>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>>>>
>>>>>>>>> Thanks. It looks like your target is responding to other IO, but
>>>>>>>>> did not respond to the ping quick enough so it timed out. Let me
>>>>>>>>> make a patch for you to test. I should hopefully have it later
>>>>>>>>> today.
>>>>>>>>
>>>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>>>
>>>>>>>> To apply the patch untar and unzip the source then cd to the dir.
>>>>>>>> Then do:
>>>>>>>>
>>>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>>>
>>>>>>>> Then do the normal make and make install. You will probably want
>>>>>>>> to reboot the box to make sure you are using the new modules.
>>>>>>>
>>>>>>> Unfortunately I got the same errors.
>>>>>>
>>>>>> Could you send the log output?
>>>>>
>>>>> Here is the /var/log/syslog.
>>>>
>>>> Shoot. For some reason that nop is just not finishing in a decent
>>>> amount of time. Could you try the attached patch. It gives the nop
>>>> even more time to complete and it spits out a bunch of debug info to
>>>> make sure open-iscsi did not leak the task.
>>>
>>> Unfortunately, the attached file is empty.
>>
>> Oh yeah, if you just log into the target and do not do any IO to the
>> disks, do you see any messages like this:
>>
>> Aug 14 09:52:23 baltrum kernel: [81064.665749] connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
>> Aug 14 09:52:23 baltrum kernel: [81064.669756] connection2:0: detected conn error (1011)
>
> I get these messages all the time (with and without IO traffic):

You should get these. I am just worried about getting these

> Aug 20 09:56:10 baltrum kernel: [168687.391990] connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
> Aug 20 09:56:10 baltrum kernel: [168687.396001] connection1:0: detected conn error (1011)

when there is no IO traffic.

> Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
> Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
> Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 8100f9c541c0
> Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 81003797adc0
> Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
> Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 8100f9c54140
> Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
> Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 81003797aac0
>
> With IO traffic I get these messages:

Could you give me a large chunk of the log? I need the stuff that
happened before this part.

> Aug 20 09:56:10 baltrum kernel: [168687.391990] connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
> Aug 20 09:56:10 baltrum kernel: [168687.396001] connection1:0: detected conn error (1011)
> Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery (1 attempts)
>
> Thanks,
> Klemens
Re: iscsi errors with debian-kernel 2.6.24-3-686
On Wednesday, 20 August 2008 09:43, Mike Christie wrote:
> Klemens Kittan wrote:
> > On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
> >> Klemens Kittan wrote:
> >>> On Monday, 18 August 2008 20:10, Mike Christie wrote:
> >>>> Klemens Kittan wrote:
> >>>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
> >>>>>> Mike Christie wrote:
> >>>>>>> Klemens Kittan wrote:
> >>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>>>
> >>>>>>> Thanks. It looks like your target is responding to other IO, but
> >>>>>>> did not respond to the ping quick enough so it timed out. Let me
> >>>>>>> make a patch for you to test. I should hopefully have it later
> >>>>>>> today.
> >>>>>>
> >>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>>>
> >>>>>> To apply the patch untar and unzip the source then cd to the dir.
> >>>>>> Then do:
> >>>>>>
> >>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>>>
> >>>>>> Then do the normal make and make install. You will probably want to
> >>>>>> reboot the box to make sure you are using the new modules.
> >>>>>
> >>>>> Unfortunately I got the same errors.
> >>>>
> >>>> Could you send the log output?
> >>>
> >>> Here is the /var/log/syslog.
> >>
> >> Shoot. For some reason that nop is just not finishing in a decent amount
> >> of time. Could you try the attached patch. It gives the nop even more
> >> time to complete and it spits out a bunch of debug info to make sure
> >> open-iscsi did not leak the task.
> >
> > Unfortunately, the attached file is empty.
>
> Oh yeah, if you just log into the target and do not do any IO to the
> disks, do you see any messages like this:
>
> Aug 14 09:52:23 baltrum kernel: [81064.665749] connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
> Aug 14 09:52:23 baltrum kernel: [81064.669756] connection2:0: detected conn error (1011)

I get these messages all the time (with and without IO traffic):

Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 8100f9c541c0
Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 81003797adc0
Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 8100f9c54140
Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 81003797aac0

With IO traffic I get these messages:

Aug 20 09:56:10 baltrum kernel: [168687.391990] connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
Aug 20 09:56:10 baltrum kernel: [168687.396001] connection1:0: detected conn error (1011)
Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery (1 attempts)

Thanks,
Klemens
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
>> Klemens Kittan wrote:
>>> On Monday, 18 August 2008 20:10, Mike Christie wrote:
>>>> Klemens Kittan wrote:
>>>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
>>>>>> Mike Christie wrote:
>>>>>>> Klemens Kittan wrote:
>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>>
>>>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>>>> patch for you to test. I should hopefully have it later today.
>>>>>>
>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>
>>>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>>>> do:
>>>>>>
>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>
>>>>>> Then do the normal make and make install. You will probably want to
>>>>>> reboot the box to make sure you are using the new modules.
>>>>>
>>>>> Unfortunately I got the same errors.
>>>>
>>>> Could you send the log output?
>>>
>>> Here is the /var/log/syslog.
>>
>> Shoot. For some reason that nop is just not finishing in a decent amount
>> of time. Could you try the attached patch. It gives the nop even more
>> time to complete and it spits out a bunch of debug info to make sure
>> open-iscsi did not leak the task.
>
> Unfortunately, the attached file is empty.

Oh yeah, if you just log into the target and do not do any IO to the
disks, do you see any messages like this:

Aug 14 09:52:23 baltrum kernel: [81064.665749] connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
Aug 14 09:52:23 baltrum kernel: [81064.669756] connection2:0: detected conn error (1011)
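
The "last rx", "last ping" and "now" fields in the ping timeout message are jiffies values, so they can be turned into seconds to see how old the ping and the last received data were when the timeout fired. This is a rough sketch, not part of open-iscsi. It assumes HZ=250, which is only inferred from the sample: the 10 second ping timeout fired exactly 2500 jiffies after the last ping. The helper name `ping_ages` is made up for illustration.

```python
import re

# Assumption: 250 jiffies per second. The sample below reports a 10 s ping
# timeout expiring exactly 2500 jiffies after the last ping.
HZ = 250

# Matches both the stock message and the one printed by the debug patch.
PING_RE = re.compile(r"last rx (\d+), last ping (\d+), now (\d+)")

SAMPLE = ("Aug 14 09:52:23 baltrum kernel: [81064.665749] connection2:0: "
          "ping timeout of 10 secs expired, last rx 4315069195, "
          "last ping 4315067926, now 4315070426")


def ping_ages(line, hz=HZ):
    """Return (seconds since last ping, seconds since last rx), or None if no match."""
    match = PING_RE.search(line)
    if match is None:
        return None
    last_rx, last_ping, now = (int(group) for group in match.groups())
    return (now - last_ping) / hz, (now - last_rx) / hz


since_ping, since_rx = ping_ages(SAMPLE)
print(f"{since_ping:.2f} s since the ping, {since_rx:.2f} s since the last rx")
```

For the sample line this gives 10.00 s since the ping and 4.92 s since the last rx. Data arrived after the ping was sent, which fits the earlier reading that the target answers other IO but not the ping.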
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
>> Klemens Kittan wrote:
>>> On Monday, 18 August 2008 20:10, Mike Christie wrote:
>>>> Klemens Kittan wrote:
>>>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
>>>>>> Mike Christie wrote:
>>>>>>> Klemens Kittan wrote:
>>>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>>>
>>>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>>>> patch for you to test. I should hopefully have it later today.
>>>>>>
>>>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>>>
>>>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>>>> do:
>>>>>>
>>>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>>>
>>>>>> Then do the normal make and make install. You will probably want to
>>>>>> reboot the box to make sure you are using the new modules.
>>>>>
>>>>> Unfortunately I got the same errors.
>>>>
>>>> Could you send the log output?
>>>
>>> Here is the /var/log/syslog.
>>
>> Shoot. For some reason that nop is just not finishing in a decent amount
>> of time. Could you try the attached patch. It gives the nop even more
>> time to complete and it spits out a bunch of debug info to make sure
>> open-iscsi did not leak the task.
>
> Unfortunately, the attached file is empty.

Sorry, here it is.

--- open-iscsi-2.0-869.2/kernel/libiscsi.c	2008-05-08 19:53:48.0 -0500
+++ open-iscsi-2.0-869.2.nop/kernel/libiscsi.c	2008-08-19 12:13:38.0 -0500
@@ -319,8 +319,10 @@ void iscsi_free_mgmt_task(struct iscsi_c
 	if (conn->login_mtask == mtask)
 		return;
 
-	if (conn->ping_mtask == mtask)
+	if (conn->ping_mtask == mtask) {
+		printk(KERN_ERR "iscsi_free_mgmt_task %p\n", mtask);
 		conn->ping_mtask = NULL;
+	}
 	__kfifo_put(conn->session->mgmtpool.queue,
 		    (void*)&mtask, sizeof(void*));
 }
@@ -501,6 +503,7 @@ static void iscsi_send_nopout(struct isc
 
 	/* only track our nops */
 	if (!rhdr) {
+		printk(KERN_ERR "send %p\n", mtask);
 		conn->ping_mtask = mtask;
 		conn->last_ping = jiffies;
 	}
@@ -628,6 +631,7 @@ static int __iscsi_complete_pdu(struct i
 			conn->exp_statsn = be32_to_cpu(hdr->statsn) + 1;
 
 			if (conn->ping_mtask != mtask) {
+				printk(KERN_ERR "userspace nop\n");
 				/*
 				 * If this is not in response to one of our
 				 * nops then it must be from userspace.
@@ -1367,19 +1371,28 @@ static void iscsi_check_transport_timeou
 
 	recv_timeout *= HZ;
 	last_recv = conn->last_recv;
+	/*
+	 * Don't fire the eh if the ping timed out but we are getting
+	 * other IO responses. Just give it more time.
+	 */
 	if (conn->ping_mtask &&
 	    time_before_eq(conn->last_ping + (conn->ping_timeout * HZ),
-			   jiffies)) {
+			   jiffies) &&
+	    time_before_eq(last_recv + (recv_timeout * 2), jiffies)) {
 		iscsi_conn_printk(KERN_ERR, conn, "ping timeout of %d secs "
-				  "expired, last rx %lu, last ping %lu, "
-				  "now %lu\n", conn->ping_timeout, last_recv,
-				  conn->last_ping, jiffies);
+				  "with recv timeout of %d secs expired "
+				  "last rx %lu, last ping %lu, now %lu "
+				  "task %p\n",
+				  conn->ping_timeout, conn->recv_timeout,
+				  last_recv, conn->last_ping, jiffies,
+				  conn->ping_mtask);
 		spin_unlock(&session->lock);
 		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
 		return;
 	}
 
-	if (time_before_eq(last_recv + recv_timeout, jiffies)) {
+	if (!conn->ping_mtask &&
+	    time_before_eq(last_recv + recv_timeout, jiffies)) {
 		/* send a ping to try to provoke some traffic */
 		debug_scsi("Sending nopout as ping on conn %p\n", conn);
 		iscsi_send_nopout(conn, NULL);
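
To see what the last hunk of the patch changes, the two branches of iscsi_check_transport_timeouts can be modelled outside the kernel. This is a simplified sketch and not the kernel code: it covers only the two conditions shown in the hunk, `time_before_eq` is reduced to a plain comparison that ignores jiffies wraparound, HZ=250 is an assumption inferred from the logs, and the function name `check_transport_timeout` and its return strings are invented for illustration.

```python
def time_before_eq(a, b):
    """Simplified stand-in for the kernel macro: true if a is at or before b.

    The real macro also copes with jiffies wraparound; this model does not.
    """
    return a <= b


def check_transport_timeout(now, last_ping, last_recv, ping_outstanding,
                            ping_timeout_s, recv_timeout_s, hz):
    """Return 'fail', 'send_ping' or 'wait', following the patched branch order."""
    recv_timeout = recv_timeout_s * hz
    # Patched rule: fail only if the ping timed out AND nothing at all has
    # been received for twice the recv timeout.
    if (ping_outstanding
            and time_before_eq(last_ping + ping_timeout_s * hz, now)
            and time_before_eq(last_recv + recv_timeout * 2, now)):
        return "fail"
    # Patched rule: do not send another ping while one is still outstanding.
    if not ping_outstanding and time_before_eq(last_recv + recv_timeout, now):
        return "send_ping"
    return "wait"


# Jiffies values taken from the two log excerpts in this thread (HZ=250 assumed).
aug20 = check_transport_timeout(4336970339, 4336967081, 4336967839,
                                True, 10, 5, 250)
aug14 = check_transport_timeout(4315070426, 4315067926, 4315069195,
                                True, 10, 5, 250)
print(aug20, aug14)
```

With these inputs the Aug 20 values give "fail", matching the error that was still logged with the patch applied, while the Aug 14 values give "wait": under the patched rule that earlier timeout would not have fired yet, because data had been received less than 10 seconds before.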
Re: iscsi errors with debian-kernel 2.6.24-3-686
On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
> Klemens Kittan wrote:
> > On Monday, 18 August 2008 20:10, Mike Christie wrote:
> >> Klemens Kittan wrote:
> >>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
> >>>> Mike Christie wrote:
> >>>>> Klemens Kittan wrote:
> >>>>>> Here is the configuration of my debian kernel (2.6.25-2).
> >>>>>
> >>>>> Thanks. It looks like your target is responding to other IO, but did
> >>>>> not respond to the ping quick enough so it timed out. Let me make a
> >>>>> patch for you to test. I should hopefully have it later today.
> >>>>
> >>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
> >>>>
> >>>> To apply the patch untar and unzip the source then cd to the dir. Then
> >>>> do:
> >>>>
> >>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
> >>>>
> >>>> Then do the normal make and make install. You will probably want to
> >>>> reboot the box to make sure you are using the new modules.
> >>>
> >>> Unfortunately I got the same errors.
> >>
> >> Could you send the log output?
> >
> > Here is the /var/log/syslog.
>
> Shoot. For some reason that nop is just not finishing in a decent amount
> of time. Could you try the attached patch. It gives the nop even more
> time to complete and it spits out a bunch of debug info to make sure
> open-iscsi did not leak the task.

Unfortunately, the attached file is empty.

Thanks,
Klemens
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Monday, 18 August 2008 20:10, Mike Christie wrote:
>> Klemens Kittan wrote:
>>> On Friday, 15 August 2008 20:03, Mike Christie wrote:
>>>> Mike Christie wrote:
>>>>> Klemens Kittan wrote:
>>>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>>>
>>>>> Thanks. It looks like your target is responding to other IO, but did
>>>>> not respond to the ping quick enough so it timed out. Let me make a
>>>>> patch for you to test. I should hopefully have it later today.
>>>>
>>>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>>>
>>>> To apply the patch untar and unzip the source then cd to the dir. Then
>>>> do:
>>>>
>>>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>>>
>>>> Then do the normal make and make install. You will probably want to
>>>> reboot the box to make sure you are using the new modules.
>>>
>>> Unfortunately I got the same errors.
>>
>> Could you send the log output?
>
> Here is the /var/log/syslog.

Shoot. For some reason that nop is just not finishing in a decent amount
of time. Could you try the attached patch? It gives the nop even more
time to complete and it spits out a bunch of debug info to make sure
open-iscsi did not leak the task.
Re: iscsi errors with debian-kernel 2.6.24-3-686
Klemens Kittan wrote:
> On Friday, 15 August 2008 20:03, Mike Christie wrote:
>> Mike Christie wrote:
>>> Klemens Kittan wrote:
>>>> Here is the configuration of my debian kernel (2.6.25-2).
>>>
>>> Thanks. It looks like your target is responding to other IO, but did not
>>> respond to the ping quick enough so it timed out. Let me make a patch
>>> for you to test. I should hopefully have it later today.
>>
>> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>>
>> To apply the patch untar and unzip the source then cd to the dir. Then do:
>>
>> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>>
>> Then do the normal make and make install. You will probably want to
>> reboot the box to make sure you are using the new modules.
>
> Unfortunately I got the same errors.

Could you send the log output?
Re: iscsi errors with debian-kernel 2.6.24-3-686
On Friday, 15 August 2008 20:03, Mike Christie wrote:
> Mike Christie wrote:
> > Klemens Kittan wrote:
> >> Here is the configuration of my debian kernel (2.6.25-2).
> >
> > Thanks. It looks like your target is responding to other IO, but did not
> > respond to the ping quick enough so it timed out. Let me make a patch
> > for you to test. I should hopefully have it later today.
>
> Try the attached patch over open-iscsi-2.0-869.2 tarball modules.
>
> To apply the patch untar and unzip the source then cd to the dir. Then do:
>
> patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
>
> Then do the normal make and make install. You will probably want to
> reboot the box to make sure you are using the new modules.

Unfortunately I got the same errors.

Thanks,
Klemens

--
Klemens Kittan
Systemadministrator
Uni-Potsdam, Inst. f. Informatik
August-Bebel-Str. 89
14482 Potsdam
Tel.: +49-331-977/3125
Fax.: +49-331-977/3122
eMail : [EMAIL PROTECTED]
gpg --recv-keys --keyserver wwwkeys.de.pgp.net 6EA09333
Re: iscsi errors with debian-kernel 2.6.24-3-686
Michael Kindermann wrote:
> Hello,
> thanks for the quick reply. During the weekend I did backups on the
> iscsi-device with not one single error on kernel 2.6.22.
> On Friday, 13.06.2008, 19:11 +0200, Mike Christie wrote:
>> Michael Kindermann wrote:
>>> Hello,
>>>
>>> We receive the errors below on a standard debian lenny/testing system
>>> since the kernel update from 2.6.22-3-686 to the latest debian-kernel
>>> 2.6.24-3-686. The iscsi-device is Eonstore E16A-2130.
>>> The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel
>>> the errors disappear. Errors only happen during copying on the
>>> iscsi-devices.
>>> Is this behaviour a debian specific problem and do I have to compile
>>> open-iscsi?
>>>
>>> Jun 13 14:32:30 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
>>> Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
>>> Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
>>
>> The READs or WRITEs from the copy operations are timing out. The SCSI
>> layer sets a timer on each command which is probably the default of 60
>> seconds (scsi layer sets to 30 and udev normally raises this to 60). If
>> the command does not complete in that time it starts the scsi error
>> handler and you end up getting these errors in the worst case where we
>> cannot just abort and restart the command or reset the device.
>>
>> Are you copying to the iscsi device or from it (and are you then copying
>> to/from a non-iscsi device), or is it mixed?
>
> The errors posted earlier result from simply copying a dvdimage
> from a local logical volume on SATA-Drives to a logical volume on the
> iscsi-device. Normal usage is to do backups by rsync (dirvish) to this
> device, which are very slow due to timeouts (normal rsync linux backups
> via ssh and the windows-filesystems by local rsyncing cifs-mounts).
> Results stay the same. The errors only happen when copying to the
> iscsi-device. Restoring of data from iscsi to local volumes works great
> without errors on the 2.6.24 kernel.

Weird. We actually fixed a write bug in that kernel, so writes should be
better. Maybe we are now overloading the target - I do not know. If this
happens again you might want to lower the queue depths by setting the
following values lower

node.session.cmds_max
node.session.queue_depth

Are you doing IO to lots of disks at the same time? And what type of
target is this? Does it have good backing storage - fast sas or scsi
disk or slower ide or sata ones?
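
The per-command timer and the queue depth discussed above can be checked per disk through sysfs, where SCSI disks expose `device/timeout` (in seconds) and `device/queue_depth`. The sketch below only reads those two files. The helper name `scsi_disk_settings` is made up, the `sysfs_root` parameter exists only so the function can be tried against a test directory, and it is an assumption here that the sysfs queue depth reflects node.session.queue_depth for an iSCSI disk.

```python
from pathlib import Path


def scsi_disk_settings(disk, sysfs_root="/sys"):
    """Read the command timeout (seconds) and queue depth of one SCSI disk.

    `disk` is a block device name such as "sdc". Both values come from
    <sysfs_root>/block/<disk>/device/.
    """
    device_dir = Path(sysfs_root) / "block" / disk / "device"
    return {
        name: int((device_dir / name).read_text().strip())
        for name in ("timeout", "queue_depth")
    }
```

Calling `scsi_disk_settings("sdc")` on the initiator would show whether the 60 second timeout mentioned above is really in effect for that disk. The disk name is just an example.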
Re: iscsi errors with debian-kernel 2.6.24-3-686
Hello,
thanks for the quick reply. During the weekend I did backups on the
iscsi-device with not one single error on kernel 2.6.22.

On Friday, 13.06.2008, 19:11 +0200, Mike Christie wrote:
> Michael Kindermann wrote:
> > Hello,
> >
> > We receive the errors below on a standard debian lenny/testing system
> > since the kernel update from 2.6.22-3-686 to the latest debian-kernel
> > 2.6.24-3-686. The iscsi-device is Eonstore E16A-2130.
> > The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel
> > the errors disappear. Errors only happen during copying on the
> > iscsi-devices.
> > Is this behaviour a debian specific problem and do I have to compile
> > open-iscsi?
> >
> > Jun 13 14:32:30 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
> > Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> > Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
>
> The READs or WRITEs from the copy operations are timing out. The SCSI
> layer sets a timer on each command which is probably the default of 60
> seconds (scsi layer sets to 30 and udev normally raises this to 60). If
> the command does not complete in that time it starts the scsi error
> handler and you end up getting these errors in the worst case where we
> cannot just abort and restart the command or reset the device.
>
> Are you copying to the iscsi device or from it (and are you then copying
> to/from a non-iscsi device), or is it mixed?

The errors posted earlier result from simply copying a dvdimage
from a local logical volume on SATA-Drives to a logical volume on the
iscsi-device. Normal usage is to do backups by rsync (dirvish) to this
device, which are very slow due to timeouts (normal rsync linux backups
via ssh and the windows-filesystems by local rsyncing cifs-mounts).
Results stay the same. The errors only happen when copying to the
iscsi-device. Restoring of data from iscsi to local volumes works great
without errors on the 2.6.24 kernel.
Another effect is that backups are slow and no keyboard interaction is
possible during these timeouts.

> When you were using 2.6.22-3-686, were you also using the open-iscsi deb
> package 2.0.869.2-2 or was it an older version?

We just boot from the former 2.6.22 kernel.

dpkg -l |grep iscsi
ii  open-iscsi  2.0.869.2-2

> On the broken setup could you run
>
> iscsiadm -m session -P 3
>
> and send all the output?

iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-724
iscsiadm version 2.0-869
Target: iqn.2002-10.com.infortrend:raid.sn7457154.20
    Current Portal: 192.168.7.227:3260,1
    Persistent Portal: 192.168.7.227:3260,1
        ** Interface: **
        Iface Name: default
        Iface Transport: tcp
        Iface Initiatorname: iqn.1993-08.org.debian:01.b75ebc4b5f99
        Iface IPaddress: 192.168.7.2
        Iface HWaddress: default
        Iface Netdev: default
        SID: 1
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: Unknown
        Internal iscsid Session State: NO CHANGE
        Negotiated iSCSI params:
        HeaderDigest: None
        DataDigest: None
        MaxRecvDataSegmentLength: 131072
        MaxXmitDataSegmentLength: 65536
        FirstBurstLength: 65536
        MaxBurstLength: 262144
        ImmediateData: Yes
        InitialR2T: No
        MaxOutstandingR2T: 1
        Attached SCSI devices:
        Host Number: 5 State: running
        scsi5 Channel 00 Id 0 Lun: 0
            Attached scsi disk sdc State: running
        scsi5 Channel 00 Id 0 Lun: 1
            Attached scsi disk sdd State: running
        scsi5 Channel 00 Id 0 Lun: 2
            Attached scsi disk sde State: running
        scsi5 Channel 00 Id 0 Lun: 3
            Attached scsi disk sdf State: running
        scsi5 Channel 00 Id 0 Lun: 4
            Attached scsi disk sdg State: running
        scsi5 Channel 00 Id 0 Lun: 5
            Attached scsi disk sdh State: running
        scsi5 Channel 00 Id 0 Lun: 6
            Attached scsi disk sdi State: running

greets
Michael

--
Michael Kindermann
Systemadministrator
HandyGames
www.handy-games.com GmbH
i_Park Klingholz 13
97232 Giebelstadt
Germany
Tel: +49 (0) 9334 9757 - 35
mail:[EMAIL PROTECTED]
Re: iscsi errors with debian-kernel 2.6.24-3-686
Michael Kindermann wrote:
> Hello,
>
> We receive errors below on a standard debian lenny/testing system since
> kernelupdate from 2.6.22-3-686 to latest debian-kernel 2.6.24-3-686. The
> iscsi-device is Eonstore E16A-2130.
> The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel the
> errors disappear. Errors only happen during copying on the iscsi-devices.
> Is this behaviour a debian specific problem and I have to compile open-iscsi?
>
> Jun 13 14:32:30 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
> Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded

The READs or WRITEs from the copy operations are timing out. The SCSI layer sets a timer on each command, which is probably at the default of 60 seconds (the SCSI layer sets it to 30 and udev normally raises this to 60). If the command does not complete in that time, the SCSI error handler starts, and you end up getting these errors in the worst case, where we cannot just abort and restart the command or reset the device.

Are you copying to the iscsi device or from it (and are you then copying to/from a non-iscsi device), or is it mixed?

When you were using 2.6.22-3-686, were you also using the open-iscsi deb package 2.0.869.2-2, or was it an older version?

On the broken setup could you run

iscsiadm -m session -P 3

and send all the output?
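Editorial sketch: on Linux, the per-command SCSI timer discussed in this message is normally exposed for each disk in sysfs as /sys/block/<disk>/device/timeout, in seconds. The following Python sketch, which assumes that layout, reports the value for the disks named in this thread (sdc through sdi). The helper name read_scsi_timeout and its sysfs_root parameter are illustrative and not part of open-iscsi.

```python
from pathlib import Path

# Disks reported as attached in the iscsiadm output from this thread.
DISKS = ["sdc", "sdd", "sde", "sdf", "sdg", "sdh", "sdi"]


def read_scsi_timeout(disk, sysfs_root="/sys"):
    """Return the SCSI command timeout in seconds, or None if unavailable.

    sysfs_root is a hypothetical parameter added so the function can be
    pointed at a test directory instead of the real /sys.
    """
    path = Path(sysfs_root) / "block" / disk / "device" / "timeout"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing device, unreadable file, or unexpected content.
        return None


if __name__ == "__main__":
    for disk in DISKS:
        timeout = read_scsi_timeout(disk)
        if timeout is None:
            print(f"{disk}: no timeout attribute found")
        else:
            print(f"{disk}: command timeout {timeout} s")
```

Comparing the reported value on the working and the failing kernel would show whether both setups run with the same command timer.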
iscsi errors with debian-kernel 2.6.24-3-686
Hello,

We receive errors below on a standard debian lenny/testing system since kernelupdate from 2.6.22-3-686 to latest debian-kernel 2.6.24-3-686. The iscsi-device is Eonstore E16A-2130.
The open-iscsi deb package is 2.0.869.2-2. When we use the older kernel the errors disappear. Errors only happen during copying on the iscsi-devices.
Is this behaviour a debian specific problem and I have to compile open-iscsi?

Jun 13 14:32:30 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 13 14:32:31 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 13 14:32:41 hg2 kernel: iscsi: host reset succeeded
Jun 13 14:32:42 hg2 iscsid: received iferror -38
Jun 13 14:32:42 hg2 last message repeated 4 times
Jun 13 14:32:42 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:33:41 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 13 14:33:42 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 13 14:33:55 hg2 kernel: iscsi: host reset succeeded
Jun 13 14:33:56 hg2 iscsid: received iferror -38
Jun 13 14:33:56 hg2 last message repeated 4 times
Jun 13 14:33:56 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:34:56 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 13 14:34:57 hg2 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)

Greets
Michael

--
Michael Kindermann
System administrator
HandyGames
www.handy-games.com GmbH
i_Park Klingholz 13
97232 Giebelstadt
Germany

Tel.: +49 (0) 9334 9757 - 35
Fax: +49 (0) 9334 9757 - 19
Mail: [EMAIL PROTECTED]

Commercial register: HRB 8667, Amtsgericht Würzburg
Tax number: 257/142/90099
VAT identification number: DE209182197
Managing directors (CEO): Christopher Kassulke, Markus Kassulke
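Editorial sketch: the timestamps in the log above can be checked against the 60-second SCSI command timer mentioned elsewhere in this thread. The Python sketch below measures the time from each "operational after recovery" line to the next "detected conn error" line. The year 2008 is taken from the message date, the function names are illustrative, and the log is assumed to stay within one day.

```python
from datetime import datetime

# Relevant lines copied from the log in the original message.
LOG = """\
Jun 13 14:32:30 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 13 14:32:42 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:33:41 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
Jun 13 14:33:56 hg2 iscsid: connection1:0 is operational after recovery (1 attempts)
Jun 13 14:34:56 hg2 kernel: connection1:0: iscsi: detected conn error (1011)
"""


def parse_time(line, year=2008):
    """Parse the 15-character syslog timestamp at the start of a line."""
    # Syslog lines carry no year, so one is supplied explicitly.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def seconds_from_recovery_to_next_error(log):
    """Return the gap in seconds between each recovery and the next error."""
    gaps = []
    last_recovery = None
    for line in log.splitlines():
        if "is operational after recovery" in line:
            last_recovery = parse_time(line)
        elif "detected conn error" in line and last_recovery is not None:
            delta = parse_time(line) - last_recovery
            gaps.append(int(delta.total_seconds()))
            last_recovery = None
    return gaps


if __name__ == "__main__":
    print(seconds_from_recovery_to_next_error(LOG))  # prints [59, 60]
```

Gaps of 59 and 60 seconds are consistent with commands hitting a 60-second timer shortly after each recovery, although the log alone does not prove that this is the cause.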