Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:18 AM, awiddersh...@hotmail.com wrote:
> I seem to be getting this over and over again in the logs:
>
> Jun 12 09:13:40 example-server kernel:  connection0:0: iscsi: detected conn error (1011)

1011 is just a generic error code meaning there was a connection
problem. We do not know what caused it: the target could have died,
someone could have pulled a cable, or there could be a bug. We do not
have enough info.

> Jun 12 09:13:40 example-server iscsid: Kernel reported iSCSI connection 0:0 error (1011) state (3)
> Jun 12 09:13:43 example-server iscsid: connection0:0 is operational after recovery (1 attempts)
> Jun 12 09:14:06 example-server iscsid: Got nop in, but kernel supports nop handling.

This indicates something went screwy. The kernel has that el5 string, so
are you using CentOS or RHEL? If so, what is the iscsi tools version?

rpm -q iscsi-initiator-utils

Is there anything else in the log before or after that? Something about
a nop or ping timing out?
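
If it helps with checking the log, here is a rough Python sketch that pulls out the lines worth looking at. The keyword pattern is only a guess at what an interesting line contains (an error code, a nop, a ping, a timeout); it is not an official list of open-iscsi messages. The sample text is just the log lines from this thread.

```python
import re

# Guessed keywords for "interesting" iSCSI log lines (an assumption, not an
# official list of open-iscsi messages).
PATTERN = re.compile(
    r"error \(\d+\)|\bnop\b|\bping\b|tim(?:e|ed|ing)\s*out",
    re.IGNORECASE,
)

def interesting_lines(log_text):
    """Return kernel/iscsid log lines that match PATTERN."""
    hits = []
    for line in log_text.splitlines():
        if ("iscsid" in line or "kernel" in line) and PATTERN.search(line):
            hits.append(line)
    return hits

sample = """\
Jun 12 09:13:40 example-server kernel:  connection0:0: iscsi: detected conn error (1011)
Jun 12 09:13:40 example-server iscsid: Kernel reported iSCSI connection 0:0 error (1011) state (3)
Jun 12 09:13:43 example-server iscsid: connection0:0 is operational after recovery (1 attempts)
Jun 12 09:14:06 example-server iscsid: Got nop in, but kernel supports nop handling.
"""

for line in interesting_lines(sample):
    print(line)
```

On the sample this keeps the two error lines and the nop line and drops the recovery line. To use it on a real log, read the file contents and pass them in instead of `sample`.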

What type of target is this with?

We used to send the nops as pings from userspace, but later moved it to
the kernel. The userspace tools support both (if the kernel does not
support it then we drop down and send from userspace). The message above
indicates that the kernel supports nops, but for some reason the kernel
sent it to userspace to handle. This should never happen.

It could happen if the target is not setting something on the iscsi
packet correctly. To detect this we could take a wireshark/tcpdump trace
and see the packet causing the problem.
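
For the trace, something like the following Python sketch builds a tcpdump command line that saves whole packets to a file Wireshark can open. The interface name, target address, and output file name are placeholders, and port 3260 is the usual iSCSI port, which I am assuming the target uses.

```python
import shlex

def tcpdump_command(interface, target_ip, outfile, port=3260):
    """Build a tcpdump argv that saves full iSCSI packets to a pcap file."""
    return [
        "tcpdump",
        "-i", interface,   # capture interface
        "-s", "0",         # do not truncate packets; the iSCSI headers are needed
        "-w", outfile,     # write raw packets for Wireshark
        "host", target_ip, "and", "tcp", "port", str(port),
    ]

# Placeholder values: replace with the real interface and target address.
cmd = tcpdump_command("eth0", "192.0.2.10", "iscsi-1011.pcap")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Run the printed command as root while the problem reproduces, then stop it once the 1011 error shows up in the log again.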

-- 
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.



Re: Need help debugging 1011 connection errors

2012-06-12 Thread awiddersh...@hotmail.com
> This indicates something went screwy. The kernel has that el5 string, so
> are you using CentOS or RHEL? If so, what is the iscsi tools version?

It is RHEL5 and the tools are as follows:

iscsi-initiator-utils-6.2.0.872-13.el5

> Is there anything else in the log before or after that? Something about
> a nop or ping timing out?

No, it is just what I sent you, repeated over and over again.

> What type of target is this with?

SUN COMSTAR

> It could happen if the target is not setting something on the iscsi
> packet correctly. To detect this we could take a wireshark/tcpdump trace
> and see the packet causing the problem.

I doubt this. We have about 300 other hosts connected to this without an
issue.




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>> It could happen if the target is not setting something on the iscsi
>> packet correctly. To detect this we could take a wireshark/tcpdump trace
>> and see the packet causing the problem.
>
> I doubt this. We have about 300 other hosts connected to this without an
> issue.
 

Are those other hosts running the same kernel version and tools as the
machine you hit this issue with?




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:41 AM, Mike Christie wrote:
> Are those other hosts running the same kernel version and tools as the
> machine you hit this issue with?
 

If it is easy to replicate, could you send a trace?




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:45 AM, Mike Christie wrote:
> If it is easy to replicate, could you send a trace?
 

Actually. Do not waste your time. Just update your kernel.

There is a bug: the kernel you are using does not support kernel nops
as pings, and it returns a different error code than userspace expects
when checking whether the kernel supports them.
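
Roughly, the failure mode looks like the Python sketch below. It is only an illustration: the actual error codes are not named in this thread, so `ENOSYS` (the code userspace checks for) and `EINVAL` (the code the old kernel returns instead) are placeholders, and `userspace_must_send_nops` is a made-up name, not a function in the tools.

```python
import errno

def userspace_must_send_nops(probe_rc):
    """Fall back to userspace nops only if the probe says 'not implemented'.

    probe_rc is the (negative errno) result of asking the kernel to handle
    nops. ENOSYS is a placeholder for whatever code userspace really expects.
    """
    return probe_rc == -errno.ENOSYS

# Kernel without nop support that answers with the expected code:
# userspace takes over the nops, which is the intended fallback.
print(userspace_must_send_nops(-errno.ENOSYS))

# Kernel without nop support that answers with some other code (EINVAL is a
# placeholder): userspace wrongly concludes the kernel handles nops.
print(userspace_must_send_nops(-errno.EINVAL))
```

In the second case nobody answers nops at all, because userspace thinks the kernel is doing it and the kernel cannot.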




Re: Need help debugging 1011 connection errors

2012-06-12 Thread awiddersh...@hotmail.com
Awesome. I'll do that and hopefully everything is happy afterward including 
these random connection issues. 




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:55 AM, Mike Christie wrote:
> Actually. Do not waste your time. Just update your kernel.
 

On second or third thought, could you send a trace? With the bug below,
we are not sending nops at all, so the nop must be coming from the
target, or the kernel code is really messing up, or there is corruption
somewhere that is making us misread packet headers.


> There is a bug: the kernel you are using does not support kernel nops
> as pings, and it returns a different error code than userspace expects
> when checking whether the kernel supports them.
 




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 12:09 PM, Mike Christie wrote:
> On second or third thought, could you send a trace? With the bug below,
> we are not sending nops at all, so the nop must be coming from the
> target, or the kernel code is really messing up, or there is corruption
> somewhere that is making us misread packet headers.
 

Sorry. Ignore that. I was right the first time. It is such old code, and
I have not looked at it for a long time, so I went cross-eyed looking
at different versions :) It looks like the target could send a nop as a
ping. We do not reply because of the bug I mentioned, and then the target
drops the connection. This will happen over and over.
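
For anyone who does look at a trace later, this Python sketch shows how to tell a NOP-In that is a ping (the initiator must answer it with a NOP-Out) from one that needs no reply. The field offsets follow the NOP-In layout in RFC 3720: opcode 0x20 in the low six bits of byte 0, and the Target Transfer Tag in bytes 20-23, where 0xffffffff means no reply is expected. The helper names are made up for the example.

```python
import struct

NOP_IN_OPCODE = 0x20        # iSCSI NOP-In opcode (RFC 3720)
RESERVED_TAG = 0xFFFFFFFF   # tag value meaning "no reply expected"
BHS_LEN = 48                # size of the iSCSI Basic Header Segment

def nop_in_needs_reply(bhs):
    """Return True if this header is a NOP-In that the initiator must answer."""
    if len(bhs) < BHS_LEN:
        raise ValueError("need a full 48-byte header")
    opcode = bhs[0] & 0x3F
    if opcode != NOP_IN_OPCODE:
        return False
    # Bytes 20-23 hold the Target Transfer Tag in network byte order.
    (ttt,) = struct.unpack_from(">I", bhs, 20)
    return ttt != RESERVED_TAG

def make_nop_in(ttt):
    """Build a minimal NOP-In header for the demo (other fields left zero)."""
    bhs = bytearray(BHS_LEN)
    bhs[0] = NOP_IN_OPCODE
    bhs[1] = 0x80                                  # final bit, always set for NOP-In
    struct.pack_into(">I", bhs, 16, RESERVED_TAG)  # ITT: not a reply to our NOP-Out
    struct.pack_into(">I", bhs, 20, ttt)
    return bytes(bhs)

print(nop_in_needs_reply(make_nop_in(0x00000001)))    # target ping
print(nop_in_needs_reply(make_nop_in(RESERVED_TAG)))  # no reply expected
```

A target ping that never gets its NOP-Out back is what would lead the target to drop the connection, which then shows up on the initiator as the 1011 error.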

So please just upgrade. Do not waste your time getting me traces.

Sorry for the confusion.




RE: Need help debugging 1011 connection errors

2012-06-12 Thread Andrew Widdersheim

Understood. 
  
