Re: Problem using multiple NICs

2009-11-17 Thread Pasi Kärkkäinen
On Thu, Nov 12, 2009 at 04:49:47PM -0800, Jim Cole wrote:
 
 Hi - I am running into problems utilizing two NICs in an iSCSI setup
 for multipath IO. The setup involves a Linux server (Ubuntu 9.10
 Server) with two Broadcom NetXtreme II GbE NICs connected to two
 separate switches on a single subnet, which is dedicated to EqualLogic
 SAN access.
 

Here's what I did when I tested multiple interfaces with Equallogic:
http://pasik.reaktio.net/open-iscsi-multiple-ifaces-test.txt
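For reference, a minimal sketch of per-interface discovery and login (the
target IQN and portal address below are placeholders, not values from this
thread):

  # bind one iface definition to each NIC (same as the steps quoted below)
  iscsiadm -m iface -I eth4 --op=update -n iface.net_ifacename -v eth4
  iscsiadm -m iface -I eth5 --op=update -n iface.net_ifacename -v eth5
  # discover through both ifaces, then log in through each one separately
  iscsiadm -m discovery -t st -p 10.0.0.10:3260 -I eth4 -I eth5
  iscsiadm -m node -T iqn.2001-05.com.equallogic:example-vol -p 10.0.0.10:3260 -I eth4 --login
  iscsiadm -m node -T iqn.2001-05.com.equallogic:example-vol -p 10.0.0.10:3260 -I eth5 --login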

-- Pasi

 I have setup two iface definitions using the following steps.
 
   - iscsiadm -m iface -I eth4 --op=new
   - iscsiadm -m iface -I eth5 --op=new
   - iscsiadm -m iface -I eth4 --op=update -n iface.net_ifacename -v
 eth4
   - iscsiadm -m iface -I eth5 --op=update -n iface.net_ifacename -v
 eth5
 
 I have also tried specifying the MAC addresses explicitly with no
 change in behavior.
 
 Discovery was performed with the following command and worked as
 expected, generating node entries for both interfaces.
 
 - iscsiadm -m discovery -t st -p xx.xx.xx.xx:3260 -I eth4 -I eth5
 
 Up to this point everything looks good. And I have no trouble logging
 one interface into the desired target. However attempts to login the
 second interface always result in a time out. The message is
 
   iscsiadm: Could not login to [iface: eth4, target: target, portal:
 xx.xx.xx.xx,3260]:
   iscsiadm: initiator reported error (8 - connection timed out)
 
 The problem is not specific to one interface. I am able to login with
 either one. I just can't seem to login with both at the same time.
 
 I am using the open-iscsi package that ships with the Ubuntu distro
 (open-iscsi 2.0.870.1-0ubuntu12).
 
 I have another server on the same network, with identical hardware and
 iSCSI configuration, that is working properly. The only difference is
 that the other server is running CentOS 5.4 and using the initiator
 that ships with that distro (iscsi-initiator-utils
 6.2.0.871-0.10.el5).
 
 If anyone could provide any guidance on how to further diagnose, and
 hopefully solve, this problem, it would be greatly appreciated.
 
 TIA
 
 Jim
 





Re: iscsi diagnosis help

2009-11-17 Thread Pasi Kärkkäinen
On Mon, Nov 16, 2009 at 09:39:00PM -0500, Hoot, Joseph wrote:
 
 On Nov 16, 2009, at 8:19 PM, Hoot, Joseph wrote:
 
  thanks.  That helps.  So I know that with the EqualLogic targets, there is 
  a Group IP which, I believe, responds with an iscsi login_redirect. 
  
  1) Could the Login authentication failed message be the response because 
  of a login redirect messages from the EQL redirect?
  
  and then my next question is more for curiosity sake:
  
  2) Are there plans in the future to have more than one connection per 
  session?  and I guess in addition to that, would that mean multiple 
  connections to a single volume over the same nic?
  
  
 
 
 Also Mike, I'm seeing one or two of these every 30-40 minutes if I slam our 
 EqualLogic with roughly 7-15k IOPS (reads and writes) non stop on 3 volumes.  
 In this type of scenario, would you expect to see timeouts like this once in 
 awhile?  If so, do you think increasing my NOOP timeouts would assist so we 
 don't get these?  maybe set it to 15 seconds instead of 10?
 

EqualLogic does active load balancing (redirects) during operation.
I'm not sure about the errors, though.
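If you do want to experiment with the ping timeouts mentioned above, they are
the noop settings in iscsid.conf; a sketch with example values only (new
sessions, or updated node records, are needed for them to take effect):

  # send an iSCSI NOP (ping) every 10 seconds...
  node.conn[0].timeo.noop_out_interval = 10
  # ...and drop the connection if no reply arrives within 15 seconds
  node.conn[0].timeo.noop_out_timeout = 15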

-- Pasi

 
  
  On Nov 16, 2009, at 7:18 PM, Mike Christie wrote:
  
  Hoot, Joseph wrote:
  Hi all,
  
  I'm trying to understand what I'm seeing in my /var/log/messages.  Here's 
  what I have:
  
  Nov 13 10:49:47 oim6102506 kernel:  connection5:0: ping timeout of 10 
  secs expired, last rx 191838122, last ping 191839372, now 191841872
  Nov 13 10:49:47 oim6102506 kernel:  connection5:0: detected conn error 
  (1011)
  Nov 13 10:49:47 oim6102506 iscsid: Kernel reported iSCSI connection 5:0 
  error (1011) state (3)
  Nov 13 10:49:50 oim6102506 iscsid: Login authentication failed with 
  target 
  iqn.2001-05.com.equallogic:0-8a0906-e7d1dea02-786272c42554aef2-ovm-2-lun03
  Nov 13 10:49:52 oim6102506 iscsid: connection5:0 is operational after 
  recovery (1 attempts)
  
  the first line, what is connection5:0?  is that referenced from 
  iscsiadm somewhere? I only ask because I'm seeing iscsid messages and 
  kernel messages.  I also have dm-multipath running, which usually shows 
  up as dm-multipath or something like that.  I understand that iscsid is 
  the process that is logging in and out.  But is the kernel: message 
  just an iscsi modules that is loaded into the kernel, which is why it is 
  being logged as kernel:?
  
  
  It is the session id and connection id.
  
  connection$SESSION_ID:$CONNECTION_ID
  
  If you run iscsiadm -m session -P 1 or -P 3
  
  You will see
  
  #iscsiadm -m session -P 1
  Target: iqn.1992-08.com.netapp:sn.33615311
 Current Portal: 10.15.85.19:3260,3
 Persistent Portal: 10.15.85.19:3260,3
 Iface Transport: tcp
 Iface IPaddress: 10.11.14.37
 Iface HWaddress: default
 Iface Netdev: default
 SID: 7
 iSCSI Connection State: LOGGED IN
 Internal iscsid Session State: NO CHANGE
  
  
  Session number is the SID value.
  
  If you run
  iscsiadm -m session
  tcp [2] 10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311
  
  the session number/SID is the value in brackets.
  
  
  If you run iscsiadm in session mode (iscsiadm -m session) then you can 
  use the -R argument and pass in a SID to do an opertaion like
  
  iscsiadm -m session -R 2 --rescan
  
  would rescan that session.
  
  Connection number is currently always zero.
  
  
  For the second question, iscsid handles login and logout, and error 
  handling, and the kernel basically passes iscsi packets around.
  
  
  Nov 13 10:49:47 oim6102506 kernel:  connection5:0: ping timeout of 10 
  secs expired, last rx 191838122, last ping 191839372, now 191841872
  
  
  so here the iscsi kernel code sends a iscsi ping/nop every noop_interval 
  seconds, and if we do not get a response withing noop_timeout seconds it 
  will fire off a connection error.
  
  
  
  Nov 13 10:49:47 oim6102506 kernel:  connection5:0: detected conn error 
  (1011)
  
  
  Here is the kernel code notifying userspace of the problem.
  
  
  Nov 13 10:49:47 oim6102506 iscsid: Kernel reported iSCSI connection 5:0 
  error (1011) state (3)
  
  
  And there iscsid is accepting the error (probably no need for the error 
  to be logged twice).
  
  
  Nov 13 10:49:50 oim6102506 iscsid: Login authentication failed with target
  
  
  And then here iscsid handled the error by killing the tcp/ip connection, 
  reconnection the tcp/ip connection, and then re-logging into the iscsi 
  target. But for some reason we could not log back in right away.
  
  
  iqn.2001-05.com.equallogic:0-8a0906-e7d1dea02-786272c42554aef2-ovm-2-lun03
  Nov 13 10:49:52 oim6102506 iscsid: connection5:0 is operational after 
  recovery (1 attempts)
  
  But it looks like we tried again and we got back in.
  
  
  
  Maybe one of these modules?
  iscsi_tcp  19785  46 
  libiscsi_tcp   21829  1 iscsi_tcp
  

Re: iscsi diagnosis help

2009-11-17 Thread Yangkook Kim
 2) Are there plans in the future to have more than one connection per session?

I don't think so. If you want to know the reason, read the thread
titled "MC/S support" in the open-iscsi mailing list.

Kim

2009/11/17, Pasi Kärkkäinen pa...@iki.fi:
 [full quoted message trimmed]

Poor read performance with IET

2009-11-17 Thread pryker
I am currently running IET on a CentOS 5.4 server with the following
kernel:

Linux titan1 2.6.18-128.7.1.el5 #1 SMP Mon Aug 24 08:21:56 EDT 2009
x86_64 x86_64 x86_64 GNU/Linux

The server is a dual quad-core 2.8 GHz system with 16 GB RAM.  I am
also using Coraid disk shelves via AoE for my block storage, which I am
offering up as an iSCSI target.

I am running v 0.4.17 of IET.

I am getting very good write performance but lousy read performance.
Performing a simple sequential write to the iSCSI target I get 94
megabytes per sec.  With reads I am only getting 12.4 megabytes per
sec.

My ietd.conf looks like this:

Target iqn.2009-11.net.storage:titan.diskshelf1.e1.2
  Lun 1 Path=/dev/etherd/e1.2,Type=blockio
  Alias e1.2
  MaxConnections 1
  InitialR2T No
  ImmediateData Yes
  MaxRecvDataSegmentLength 262144
  MaxXmitDataSegmentLength 262144

I have also made the following tweaks to tcp/ip:

sysctl net.ipv4.tcp_rmem=100 100 100
sysctl net.ipv4.tcp_wmem=100 100 100
sysctl net.ipv4.tcp_tw_recycle=1
sysctl net.ipv4.tcp_tw_reuse=1
sysctl net.core.rmem_max=524287
sysctl net.core.wmem_max=524287
sysctl net.core.wmem_default=524287
sysctl net.core.optmem_max=524287
sysctl net.core.netdev_max_backlog=30

I am using Broadcom cards in the iSCSI target server.  I have enabled
jumbo frames on them (MTU 9000).  They are connected directly to a
Windows server, with no switch in between, and I am accessing the iSCSI
target with the MS iSCSI initiator.  The NICs on the Windows server are
also set to an MTU of 9000.
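For reference, a quick end-to-end check of the jumbo-frame path is a
non-fragmentable ping just under the 9000-byte MTU (8972 bytes of ICMP payload
plus 28 bytes of IP/ICMP headers equals 9000; the address below is a
placeholder):

  ping -M do -s 8972 -c 3 192.168.1.20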

I also notice that load averages on the Linux box will get into the
7's and 8's when I try pushing the system by performing multiple
transfers.

Any feedback on what I might be missing here would be great!

Thanks

Phil





Re: iscsi diagnosis help

2009-11-17 Thread Mike Christie
Hoot, Joseph wrote:
 thanks.  That helps.  So I know that with the EqualLogic targets, there is a 
 Group IP which, I believe, responds with an iscsi login_redirect. 
 
 1) Could the Login authentication failed message be the response because of 
 a login redirect messages from the EQL redirect?
 

It could be, but then when we retry we end up handling the redirect ok. 
I do not know why the first redirect would fail.

Take a wireshark trace or run iscsid by hand

iscsid -d 8

send the log output.

That will let us know where we failed, but it would not tell us why. 
Normally EQL targets would leave something in their logs about why.
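If running iscsid by hand is awkward, a capture that wireshark can read can
also be taken with tcpdump while reproducing the failure (interface name and
portal address below are placeholders):

  tcpdump -i eth0 -s 0 -w eql-login.pcap host 10.0.0.10 and port 3260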


 and then my next question is more for curiosity sake:
 
 2) Are there plans in the future to have more than one connection per 
 session?  and I guess in addition to that, would that mean multiple 
 connections to a single volume over the same nic?
 

No plans for MC/s. You can do multiple sessions to the same volume 
though. You can have multiple sessions over the same nic or over 
different nics or some combo.
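For example, with two ifaces bound to different NICs, something like this
sketch creates two sessions to one volume, which dm-multipath can then combine
(target IQN and portal are placeholders):

  iscsiadm -m node -T iqn.2001-05.com.equallogic:example-vol -p 10.0.0.10:3260 -I eth4 --login
  iscsiadm -m node -T iqn.2001-05.com.equallogic:example-vol -p 10.0.0.10:3260 -I eth5 --login
  iscsiadm -m session     # each login shows up as its own SID
  multipath -ll           # verify both paths reach the volume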

 
 
 [full quoted message trimmed]
Re: iscsi diagnosis help

2009-11-17 Thread Mike Christie
Hoot, Joseph wrote:
 [earlier quoted text trimmed]
 Also Mike, I'm seeing one or two of these every 30-40 minutes if I slam our 
 EqualLogic with roughly 7-15k IOPS (reads and writes) non stop on 3 volumes.  
 In this type of scenario, would you expect to see timeouts like this once in 
 awhile?  If so, do you think increasing my NOOP timeouts would assist so we 
 don't get these?  maybe set it to 15 seconds instead of 10?
 

It might be a bug.

What version of open-iscsi are you using? What kernel? Is it a distro or 
kernel.org one? And are you using the open-iscsi kernel modules that come 
with an open-iscsi.org tarball or the kernel modules that come with 
your kernel?
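For example, something along these lines collects that information (package
names differ between distros):

  iscsiadm -V                       # open-iscsi/iscsiadm version
  uname -r                          # running kernel
  rpm -q iscsi-initiator-utils      # RHEL/CentOS; on Debian/Ubuntu: dpkg -l open-iscsi
  modinfo iscsi_tcp | grep -iE 'filename|version'   # module path and version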





Re: Problem using multiple NICs

2009-11-17 Thread Mike Christie
Jim Cole wrote:
 Hi - I am running into problems utilizing two NICs in an iSCSI setup
 for multipath IO. The setup involves a Linux server (Ubuntu 9.10
 Server) with two Broadcom NetXtreme II GbE NICs connected to two
 separate switches on a single subnet, which is dedicated to EqualLogic
 SAN access.
 
 I have setup two iface definitions using the following steps.
 
   - iscsiadm -m iface -I eth4 --op=new
   - iscsiadm -m iface -I eth5 --op=new
   - iscsiadm -m iface -I eth4 --op=update -n iface.net_ifacename -v
 eth4
   - iscsiadm -m iface -I eth5 --op=update -n iface.net_ifacename -v
 eth5
 
 I have also tried specifying the MAC addresses explicitly with no
 change in behavior.
 
 Discovery was performed with the following command and worked as
 expected, generating node entries for both interfaces.
 
 - iscsiadm -m discovery -t st -p xx.xx.xx.xx:3260 -I eth4 -I eth5
 
 Up to this point everything looks good. And I have no trouble logging
 one interface into the desired target. However attempts to login the
 second interface always result in a time out. The message is
 
   iscsiadm: Could not login to [iface: eth4, target: target, portal:
 xx.xx.xx.xx,3260]:
   iscsiadm: initiator reported error (8 - connection timed out)
 
 The problem is not specific to one interface. I am able to login with
 either one. I just can't seem to login with both at the same time.
 

Can you do ping through each interface at the same time?

Do

ping -I eth4 xx.xx.xx.xx

in one console and

ping -I eth5 xx.xx.xx.xx

in another.
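If both pings work only one at a time, a common culprit with two NICs on one
subnet is Linux answering ARP on the "wrong" interface and reverse-path
filtering dropping the replies. The sysctls usually suggested for this layout
are sketched below (an assumption about this setup, not something confirmed in
this thread):

  sysctl -w net.ipv4.conf.eth4.arp_ignore=1
  sysctl -w net.ipv4.conf.eth4.arp_announce=2
  sysctl -w net.ipv4.conf.eth5.arp_ignore=1
  sysctl -w net.ipv4.conf.eth5.arp_announce=2
  # relax reverse-path filtering on the iSCSI ports
  sysctl -w net.ipv4.conf.eth4.rp_filter=0
  sysctl -w net.ipv4.conf.eth5.rp_filter=0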





Re: iscsi diagnosis help

2009-11-17 Thread Mike Christie
Pasi Kärkkäinen wrote:
 [earlier quoted text trimmed]
 
 Equallogic does active loadbalancing (redirects) during operation..
 dunno about the errors though.
 

Oh yeah, forgot about that. Thanks Pasi!

Joseph, look in the EQL target logs for something about the EQL box 
doing load balancing. I think normally we handle the load balancing more 
gracefully, but we might be messing up. I think if EQL was load 
balancing, in the open-iscsi logs we would see something about getting an 
async iscsi pdu from the target that asks us to log out. Then when we 
relogin the target would redirect us to the optimal path.





Re: iscsi diagnosis help

2009-11-17 Thread Hoot, Joseph
more INLINE below...

On Nov 17, 2009, at 7:27 PM, Mike Christie wrote:

  [earlier quoted text trimmed]
 Oh yeah, forgot about that. Thanks Pasi!
 
 Joseph, look in the EQL target logs for something about the EQL box 
 doing load balancing. I think normally we handle the load balancing more 
 gracefully, but we might be messing up. I think if EQL was load 
 balancing in the open-iscsi logs we would see something about getting a 
 async iscsi pdu from the target that asks us to logout. Then when we 
 relogin the target would redirect us to the optimal path.


There are two things that the EQL does, I believe: one is an async logout, 
the other is a login_redirect.  Unfortunately, from the EQL syslog side we don't 
see any errors related to this.  It's my understanding, however, that when a 
login is initially attempted to the EQL, it hits the group IP or an aliased 
IP sitting on a real nic.  The group IP looks at all the interfaces on the EQL 
and decides, based on some algorithm, which EQL nic the session should connect 
to.  It then sends the initiator that made the request a login_redirect, which 
I thought is basically a logout-and-reconnect pdu.  It would say, for example, 
you can't log into the group IP, but you can log into this IP (a real nic) 
that it would prefer you be logged into.

I'm thinking that the failed login is actually the result of that attempt to 
log into the group IP, which then sends a login redirect pdu back to the 
initiator.

Don, does this seem like normal EQL traffic to an OiS initiator?

 

===
Joseph R. Hoot
Lead System Programmer/Analyst
(w) 716-878-4832
(c) 716-759-HOOT
joe.h...@itec.suny.edu
GPG KEY:   7145F633
===





Re: iscsi diagnosis help

2009-11-17 Thread Mike Christie
Hoot, Joseph wrote:
 more INLINE below...
 
 [earlier quoted text trimmed]
 There are two things that the EQL does, I believe-- one thing is async 
 logout, the other is login_redirect.   Unfortunately, from the EQL syslog 
 side we don't see any errors related to this.  It's my understanding, 
 however, that when a login is initially attempted to the EQL, it hits the 
 group ip or an alias'd IP sitting on a real nic.  The group IP looks at all 
 the interfaces on the EQL and decides, based on some algorithm, which EQL nic 
 the session should connect to.  It then sends the initiator that made the 
 request a login_redirect, which I thought is basically a logout and 
 reconnect pdu.  It would say, for example, you're can't log into the group 
 IP, however, you can log into this IP (a real nic) that it would prefer you 
 be logged into.
 
 I'm thinking that the failed login is actually the result of that attempt 
 to log into the group IP and it sending a login redirect pdu back to it.
 

If the target was load balancing us it would:

- Send a async logout pdu.
- We then send a logout pdu.
- When we get the logout response pdu we kill the tcp ip connection
- We then create a new tcp connection
- We then log in to the portal that was passed into iscsiadm/iscsid (the 
one in the DB that you see when you run iscsiadm -m node, which is 
probably what you call the group IP). For this process we send a login 
pdu. It then sends a login response pdu with the login redirect 
response. In this response we also get the new IP to log into.
- We see that response and kill the tcp connection, and create a new tcp 
connection to the portal we are being redirected to.
- We then log into the portal we were redirected to. We again do this by 
sending a login pdu. This time the login response pdu should be ok and 
we are done.


We do not know which login pdu failed right now. You would need a 
wireshark trace or iscsid debugging. iscsid could hit the Login 
authentication failed path for either of the login pdus sent.

You should not normally see that message even when we are being 
redirected. We go through the login redirect when we initially log into 
the target (like when you do iscsiadm -m node -l or service iscsi 
start), and if you look in your logs you should not see a login failed 
message for that. If you do, it might be a clue.





Re: iscsi diagnosis help

2009-11-17 Thread Mike Christie
Mike Christie wrote:
 [earlier quoted text trimmed]
 
 If the target was load balancing us it would:
 
 - Send a async logout pdu.
 - We then send a logout pdu.
 - When we get the logout response pdu we kill the tcp ip connection
 - We then create a new tcp connection
 - We then log in to the portal that was passed into iscsiadm/iscsid (the 
 one in the DB that you see when you run iscsiadm -m node, which is 
 probably what you call the group IP). For this process we send a login 
 pdu. It then sends a login response pdu with the login redirect 
 response. In this response we also get the new IP to log into.
 - We see that response and kill the tcp connection, and create a new tcp 
 connection to the portal we are being redirected to.
 - We then log into the portal we were redirected to. We again do this by 
 sending a login pdu. This time the login response pdu should be ok and 
 we are done.

Oh yeah, I meant to also say that this is pretty much the same process 
that happens when we do the first login, and when we have to relogin 
because of a connection problem like the nop/ping timeout. The only 
difference in those cases is that we do not get the async logout and we 
do not send a logout pdu. We start at the "kill the tcp/ip connection" 
step.

So even if we are not getting load balanced we would be in the same 
place in the open-iscsi code when we are getting the login failed errors.


To get back on track with solving why we get the nop timeouts: if we are 
not seeing load balancing or async logout messages, it could be the 
open-iscsi bug I mentioned in the other mail. If you can send the 
open-iscsi and kernel info I asked for there, we can start down that 
path.





Re: [RFC-PATCH] libiscsi dhcp handler

2009-11-17 Thread Rakesh Ranjan
Rakesh Ranjan wrote:
 David Miller wrote:
 From: Rakesh Ranjan rak...@chelsio.com
 Date: Mon, 16 Nov 2009 18:41:49 +0530

 Herein attached patches to support dhcp based provisioning for iSCSI
 offload capable cards. I have made dhcp code as generic as possible,
 please go through the code. Based on the feedback I will submit final
 version of these patches.
 You can't really add objects to the build before the patch that
 adds the source for that object.

 
 Hi david,
 
 Fixed patch attached.

ping ...






Re: Poor read performance with IET

2009-11-17 Thread Gopu Krishnan
Pryker,

Test your environment with NULL I/O. Perform the read and write tests and check
the performance, which should reach around 100 MB/s. If it does not, the problem
is likely in IET or your environment; otherwise the problem is more likely with
your disk I/O or back-end driver.
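A sketch of how that comparison might be run (IET also supports a nullio LUN
type in ietd.conf; check the ietd.conf man page for the exact syntax). Device
names below are taken from the original post or are placeholders:

  # raw read from the AoE backing store on the target, for comparison
  dd if=/dev/etherd/e1.2 of=/dev/null bs=1M count=4096 iflag=direct
  # read over iSCSI from a Linux initiator (sdx = the imported LUN)
  dd if=/dev/sdx of=/dev/null bs=1M count=4096 iflag=direct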

Thanks
Gopala krishnan Varatharajan.

On Tue, Nov 17, 2009 at 8:51 PM, pryker pry...@gmail.com wrote:

 [full quoted message trimmed]



-- 
Regards

Gopu.
