RE: Need help debugging 1011 connection errors

2012-06-12 Thread Andrew Widdersheim

Understood. 
  

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.



Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 12:09 PM, Mike Christie wrote:
> On 06/12/2012 11:55 AM, Mike Christie wrote:
>> On 06/12/2012 11:45 AM, Mike Christie wrote:
>>> On 06/12/2012 11:41 AM, Mike Christie wrote:
>>>> On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>>>>>> This indicates something went screwy. The kernel has that el5 string, so
>>>>>> are you using Centos or RHEL? If so what is the iscsi tools version
>>>>>
>>>>> It is RHEL5 and the tools are as follows:
>>>>>
>>>>> iscsi-initiator-utils-6.2.0.872-13.el5
>>>>>
>>>>>> Is there anything else in the log before or after that? Something about
>>>>>> a nop or ping timing out?
>>>>>
>>>>> No, it is just what I sent you, repeated over and over again.
>>>>>
>>>>>> What type of target is this with?
>>>>>
>>>>> SUN COMSTAR
>>>>>
>>>>>> It could happen if the target is not setting something on the iscsi
>>>>>> packet correctly. To detect this we could take a wireshark/tcpdump trace
>>>>>> and see the packet causing the problem.
>>>>>
>>>>> I doubt this. We have about 300 other hosts connected to this without an
>>>>> issue.
>>>>>
>>>>
>>>> Are those other hosts running the same kernel version and tools as the
>>>> machine you hit this issue with?
>>>>
>>>
>>> If it is easy to replicate, could you send a trace?
>>>
>>
>> Actually. Do not waste your time. Just update your kernel.
>>
> 
> On second or third thought could you send a trace? With the bug below it
> means that we are not sending nops at all, so it must be coming from the
> target or the kernel code is really messing up or there is corruption
> somewhere because we are misreading packet headers.
> 

Sorry. Ignore that. I was right the first time. It is such old code, and I
have not looked at it for a long time, so I went cross-eyed looking
at different versions :) It looks like the target could send a nop as a
ping. We do not reply, because of the bug I mentioned, and then the target
drops the connection. This will happen over and over.

So please just upgrade. Do not waste your time getting me traces.

Sorry for the confusion.




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:55 AM, Mike Christie wrote:
> On 06/12/2012 11:45 AM, Mike Christie wrote:
>> On 06/12/2012 11:41 AM, Mike Christie wrote:
>>> On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>>>>> This indicates something went screwy. The kernel has that el5 string, so
>>>>> are you using Centos or RHEL? If so what is the iscsi tools version
>>>>
>>>> It is RHEL5 and the tools are as follows:
>>>>
>>>> iscsi-initiator-utils-6.2.0.872-13.el5
>>>>
>>>>> Is there anything else in the log before or after that? Something about
>>>>> a nop or ping timing out?
>>>>
>>>> No, it is just what I sent you, repeated over and over again.
>>>>
>>>>> What type of target is this with?
>>>>
>>>> SUN COMSTAR
>>>>
>>>>> It could happen if the target is not setting something on the iscsi
>>>>> packet correctly. To detect this we could take a wireshark/tcpdump trace
>>>>> and see the packet causing the problem.
>>>>
>>>> I doubt this. We have about 300 other hosts connected to this without an
>>>> issue.
>>>>
>>>
>>> Are those other hosts running the same kernel version and tools as the
>>> machine you hit this issue with?
>>>
>>
>> If it is easy to replicate, could you send a trace?
>>
> 
> Actually. Do not waste your time. Just update your kernel.
> 

On second or third thought could you send a trace? With the bug below it
means that we are not sending nops at all, so it must be coming from the
target or the kernel code is really messing up or there is corruption
somewhere because we are misreading packet headers.


> There is a bug, because that kernel you are using does not support
> kernel nops as pings and is returning a different error code than what
> userspace is expecting when checking if the kernel supports it.
> 




Re: Need help debugging 1011 connection errors

2012-06-12 Thread awiddersh...@hotmail.com
Awesome. I'll do that and hopefully everything is happy afterward including 
these random connection issues. 

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msg/open-iscsi/-/grCIZutEzE4J.



Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:45 AM, Mike Christie wrote:
> On 06/12/2012 11:41 AM, Mike Christie wrote:
>> On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>>>> This indicates something went screwy. The kernel has that el5 string, so
>>>> are you using Centos or RHEL? If so what is the iscsi tools version
>>>
>>> It is RHEL5 and the tools are as follows:
>>>
>>> iscsi-initiator-utils-6.2.0.872-13.el5
>>>
>>>> Is there anything else in the log before or after that? Something about
>>>> a nop or ping timing out?
>>>
>>> No, it is just what I sent you, repeated over and over again.
>>>
>>>> What type of target is this with?
>>>
>>> SUN COMSTAR
>>>
>>>> It could happen if the target is not setting something on the iscsi
>>>> packet correctly. To detect this we could take a wireshark/tcpdump trace
>>>> and see the packet causing the problem.
>>>
>>> I doubt this. We have about 300 other hosts connected to this without an 
>>> issue.
>>>
>>
>> Are those other hosts running the same kernel version and tools as the
>> machine you hit this issue with?
>>
> 
> If it is easy to replicate, could you send a trace?
> 

Actually. Do not waste your time. Just update your kernel.

There is a bug, because that kernel you are using does not support
kernel nops as pings and is returning a different error code than what
userspace is expecting when checking if the kernel supports it.
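
The mismatch described above can be pictured with a toy model. This is only a sketch and not the open-iscsi code: the function names are made up, and the two errno values are assumptions picked for illustration (the real codes involved are not stated in this thread). It shows how a support check that looks for one specific "unsupported" code can misread any other error as "supported", and one plausible consequence of that.

```python
import errno

# Hypothetical values, chosen only for illustration.
EXPECTED_UNSUPPORTED = errno.ENOSYS  # code this toy userspace treats as "no kernel nops"
OLD_KERNEL_RESULT = errno.EINVAL     # code this toy old kernel returns instead


def kernel_handles_nops(probe_result):
    """Toy userspace check: only the expected code means 'unsupported'."""
    return probe_result != EXPECTED_UNSUPPORTED


def who_sends_nops(probe_result, kernel_really_supports):
    """Return which side ends up sending nops as pings in this toy model."""
    if kernel_handles_nops(probe_result):
        # Userspace steps back and leaves nops to the kernel.
        return "kernel" if kernel_really_supports else "nobody"
    # Userspace saw the expected code and falls back to sending nops itself.
    return "userspace"


if __name__ == "__main__":
    print(who_sends_nops(0, True))                      # newer kernel
    print(who_sends_nops(EXPECTED_UNSUPPORTED, False))  # clean fallback
    print(who_sends_nops(OLD_KERNEL_RESULT, False))     # the mismatch
```

In the third case userspace believes the kernel is in charge while the kernel is not, so neither side handles nops. Updating the kernel removes the mismatch.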




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:41 AM, Mike Christie wrote:
> On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>>> This indicates something went screwy. The kernel has that el5 string, so 
>>> are you using Centos or RHEL? If so what is the iscsi tools version 
>>
>> It is RHEL5 and the tools are as follows:
>>
>> iscsi-initiator-utils-6.2.0.872-13.el5
>>
>>> Is there anything else in the log before or after that? Something about 
>>> a nop or ping timing out? 
>>
>> No, it is just what I sent you, repeated over and over again.
>>
>>> What type of target is this with?
>>
>> SUN COMSTAR
>>
>>> It could happen if the target is not setting something on the iscsi 
>>> packet correctly. To detect this we could take a wireshark/tcpdump trace 
>>> and see the packet causing the problem. 
>>
>> I doubt this. We have about 300 other hosts connected to this without an 
>> issue.
>>
> 
> Are those other hosts running the same kernel version and tools as the
> machine you hit this issue with?
> 

If it is easy to replicate, could you send a trace?




Re: Disconnected iSCSI and umount problems

2012-06-12 Thread Mike Christie
On 06/12/2012 11:22 AM, awiddersh...@hotmail.com wrote:
> I am curious to know if you were able to test this and have any of the same 
> issues.
> 

I am not able to hit it in upstream or rhel kernels.




Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:33 AM, awiddersh...@hotmail.com wrote:
>> This indicates something went screwy. The kernel has that el5 string, so 
>> are you using Centos or RHEL? If so what is the iscsi tools version 
> 
> It is RHEL5 and the tools are as follows:
> 
> iscsi-initiator-utils-6.2.0.872-13.el5
> 
>> Is there anything else in the log before or after that? Something about 
>> a nop or ping timing out? 
> 
> No, it is just what I sent you, repeated over and over again.
> 
>> What type of target is this with?
> 
> SUN COMSTAR
> 
>> It could happen if the target is not setting something on the iscsi 
>> packet correctly. To detect this we could take a wireshark/tcpdump trace 
>> and see the packet causing the problem. 
> 
> I doubt this. We have about 300 other hosts connected to this without an 
> issue.
> 

Are those other hosts running the same kernel version and tools as the
machine you hit this issue with?




Re: Need help debugging 1011 connection errors

2012-06-12 Thread awiddersh...@hotmail.com
>This indicates something went screwy. The kernel has that el5 string, so 
>are you using Centos or RHEL? If so what is the iscsi tools version 

It is RHEL5 and the tools are as follows:

iscsi-initiator-utils-6.2.0.872-13.el5

>Is there anything else in the log before or after that? Something about 
>a nop or ping timing out? 

No, it is just what I sent you, repeated over and over again.

>What type of target is this with?

SUN COMSTAR

>It could happen if the target is not setting something on the iscsi 
>packet correctly. To detect this we could take a wireshark/tcpdump trace 
>and see the packet causing the problem. 

I doubt this. We have about 300 other hosts connected to this without an 
issue.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msg/open-iscsi/-/Zl8RzmwW7uoJ.



Re: Need help debugging 1011 connection errors

2012-06-12 Thread Mike Christie
On 06/12/2012 11:18 AM, awiddersh...@hotmail.com wrote:
> I seem to be getting this over and over again in the logs:
> 
> Jun 12 09:13:40 example-server kernel:  connection0:0: iscsi: detected conn 
> error (1011)

1011 is just a generic error code meaning there was a connection
problem. We do not know what caused it. It could have been that the target
died, someone pulled a cable, or there is a bug. We do not have enough info.

> Jun 12 09:13:40 example-server iscsid: Kernel reported iSCSI connection 0:0 
> error (1011) state (3)
> Jun 12 09:13:43 example-server iscsid: connection0:0 is operational after 
> recovery (1 attempts)
> Jun 12 09:14:06 example-server iscsid: Got nop in, but kernel supports nop 
> handling.

This indicates something went screwy. The kernel has that el5 string, so
are you using Centos or RHEL? If so, what is the iscsi tools version:

rpm -q iscsi-initiator-utils

Is there anything else in the log before or after that? Something about
a nop or ping timing out?

What type of target is this with?

We used to send the nops as pings from userspace, but later moved it to
the kernel. The userspace tools support both (if the kernel does not
support it then we drop down and send from userspace). The message above
indicates that the kernel supports nops, but for some reason the kernel
sent it to userspace to handle. This should never happen.

It could happen if the target is not setting something on the iscsi
packet correctly. To detect this we could take a wireshark/tcpdump trace
and see the packet causing the problem.
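
To answer the question about what else is in the log, it can help to count how often each of these messages shows up. The following is a minimal sketch, assuming each syslog entry sits on one line (the excerpt quoted above is wrapped by the mail client). The pattern names and the `summarize` helper are made up for illustration; the sample lines are the ones from this thread.

```python
import re
from collections import Counter

# Log lines from this thread, unwrapped to one entry per line.
SAMPLE_LOG = """\
Jun 12 09:13:40 example-server kernel:  connection0:0: iscsi: detected conn error (1011)
Jun 12 09:13:40 example-server iscsid: Kernel reported iSCSI connection 0:0 error (1011) state (3)
Jun 12 09:13:43 example-server iscsid: connection0:0 is operational after recovery (1 attempts)
Jun 12 09:14:06 example-server iscsid: Got nop in, but kernel supports nop handling.
"""

# Hypothetical labels for the three messages discussed above.
PATTERNS = {
    "conn_error": re.compile(r"detected conn error \((\d+)\)"),
    "recovered": re.compile(r"is operational after recovery"),
    "nop_in": re.compile(r"Got nop in, but kernel supports nop handling"),
}


def summarize(lines):
    """Count how many lines match each pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    print(dict(summarize(SAMPLE_LOG.splitlines())))
```

Because `summarize` accepts any iterable of lines, an open log file can be passed in directly. Which file holds these messages depends on the syslog configuration of the host.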




Re: Disconnected iSCSI and umount problems

2012-06-12 Thread awiddersh...@hotmail.com
I am curious to know if you were able to test this and have any of the same 
issues.

On Wednesday, March 21, 2012 2:36:36 PM UTC-4, awidde...@hotmail.com wrote:
>
> Here is the output of 'uname -a'
>
> Linux test-server 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 
> EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>
> Yes, I have two iSCSI disks on this test machine. When I do ' ls 
> /sys/block/ | grep sd' I still see all of the disks:
>
> sda
> sdb
> sdc
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msg/open-iscsi/-/zo2S5bcZdq8J.



Need help debugging 1011 connection errors

2012-06-12 Thread awiddersh...@hotmail.com
I seem to be getting this over and over again in the logs:

Jun 12 09:13:40 example-server kernel:  connection0:0: iscsi: detected conn 
error (1011)
Jun 12 09:13:40 example-server iscsid: Kernel reported iSCSI connection 0:0 
error (1011) state (3)
Jun 12 09:13:43 example-server iscsid: connection0:0 is operational after 
recovery (1 attempts)
Jun 12 09:14:06 example-server iscsid: Got nop in, but kernel supports nop 
handling.

I'm not really sure what the 1011 error indicates. The "Got nop in, but
kernel supports nop handling." message is confusing as well. It is the
first time I have ever seen it, and I'm wondering if it indicates a
problem.

The kernel on the machine is quite old and could be the culprit, whether
because of drivers or something else. Currently the kernel is:

2.6.18-8.1.14.el5

Hopefully I will update it shortly to see if that fixes any of the issues,
but I thought I would post here to see if anyone had any input.

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msg/open-iscsi/-/AyeuqA44Gj0J.