disconnecting part of the state machine

2010-03-01 Thread Or Gerlitz
Hi Mike,

I'm just reviewing this for the iser disconnect change we're discussing,
and I noticed that stop conn is called after ep_disconnect. I wasn't sure if
this is needed, or maybe a bug, or just something we can live with...

Here's the sequence of events I collected with the RHEL 5.4 initiator
(6.2.0.871-0.10.el5); for what it's worth, I was running over 2.6.33.

The number on the left is the line number in the attached file.
I just did a login and, after a few seconds, a logout from the command line.

48  kep_connect
53  kep_bind
94  kcreate_session
133 kcreate_conn
139 kbind_conn

153 -- login
203 -- login (opcode 0x23)

xxx kset_param
410 kstart_conn

450 -- logout
479 -- logout (opcode 0x26)

486 kep_disconnect
491 kstop_conn
497 kdestroy_conn
504 kdestroy_session


Or.

-- 
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.



iscsid_log.txt.bz2
Description: BZip2 compressed data


Re: multipathing

2010-03-01 Thread Or Gerlitz
Mike Christie wrote:
 The refcounting method sounds good. If iser has cleaned up what gets set
 in iscsi_iser_conn_bind once its ep_disconnect has completed, then you
 should be ok. So iser_conn->ib_conn has to be NULLed so later when
 iscsi_iser_conn_bind is called for the new conn it can be set.

okay, I am looking at how to do this

 And we will have to watch out for a rmmod while there are ib_conns 
 left to completely destroy

yes, I never was a big fan of rmmod flows, but, I admit they exist...

Or.




Re: multipathing

2010-03-01 Thread Or Gerlitz
Or Gerlitz wrote:
 Mike Christie wrote:

 And we will have to watch out for a rmmod while there are ib_conns left 

 yes, I never was a big fan of rmmod flows, but, I admit they exist...

I do hit this -EEXIST race from time to time when running things like a
stop/start sequence.

Now I get it quite often, but I am playing with the code. I am quite sure that
I also saw this with unmodified code on recent kernels (e.g. 2.6.30 and
above).
Has anyone else reported it as well?

Or.

Loading iSCSI transport class v2.0-870.
[ cut here ]
WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcc/0xe4()
Hardware name: X7DW3
sysfs: cannot create duplicate filename '/class/iscsi_endpoint'
Modules linked in: scsi_transport_iscsi(+) iw_cxgb3 cxgb3 mdio autofs4 nfs 
nfs_acl auth_rpcgss lockd sunrpc rds ib_ipoib rdma_ucm rdma_cm ib_ucm ib_uverbs 
ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_multipath video output 
battery ac joydev sr_mod sg igb ib_mthca ib_mad ib_core button floppy ioatdma 
rng_core pcspkr dca dm_region_hash dm_log dm_mod usb_storage ata_piix libata 
shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last 
unloaded: scsi_transport_iscsi]
Pid: 5341, comm: modprobe Tainted: GW  2.6.33 #1
Call Trace:
 [8110a4fc] ? sysfs_add_one+0xcc/0xe4
 [810349a2] ? warn_slowpath_common+0x77/0x8e
 [81034a15] ? warn_slowpath_fmt+0x51/0x59
 [8110a428] ? sysfs_pathname+0x35/0x3d
 [8110a428] ? sysfs_pathname+0x35/0x3d
 [8110a4fc] ? sysfs_add_one+0xcc/0xe4
 [8110a9f5] ? create_dir+0x4f/0x85
 [8110aa60] ? sysfs_create_dir+0x35/0x4a
 [8114a364] ? kobject_add_internal+0xd2/0x18d
 [8114a443] ? kset_register+0x24/0x3a
 [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
 [811e2f1a] ? __class_register+0x128/0x1c0
 [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
 [a0357050] ? iscsi_transport_init+0x50/0x145 [scsi_transport_iscsi]
 [810001e1] ? do_one_initcall+0x50/0x13d
 [81060142] ? sys_init_module+0xc8/0x222
 [81001eeb] ? system_call_fastpath+0x16/0x1b
---[ end trace c8375848060e5033 ]---
kobject_add_internal failed for iscsi_endpoint with -EEXIST, don't try to 
register things with the same name in the same directory.
Pid: 5341, comm: modprobe Tainted: GW  2.6.33 #1
Call Trace:
 [8114a3e0] ? kobject_add_internal+0x14e/0x18d
 [8114a443] ? kset_register+0x24/0x3a
 [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
 [811e2f1a] ? __class_register+0x128/0x1c0
 [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
 [a0357050] ? iscsi_transport_init+0x50/0x145 [scsi_transport_iscsi]
 [810001e1] ? do_one_initcall+0x50/0x13d
 [81060142] ? sys_init_module+0xc8/0x222
 [81001eeb] ? system_call_fastpath+0x16/0x1b




Re: multipathing

2010-03-01 Thread Or Gerlitz
Mike Christie wrote:
 So iser_conn->ib_conn has to be NULLed so later when
 iscsi_iser_conn_bind is called for the new conn it can be set.

Nullifying iser_conn->ib_conn in iser_conn_terminate is problematic, since
i_c_terminate is called in the ep_disconnect flow and i_c_stop is called later;
if iser_conn->ib_conn is NULL by then, it doesn't call iser_conn_put and the
ep isn't destroyed...

I can change things, of course; I just wanted to share these findings with you.
Let me keep looking at this tomorrow.

Or.




Re: read only access on 1 LUN for multiple initiators

2010-03-01 Thread Yangkook Kim
 Remember that clients (especially Linux) do cache blocks
 locally

Doesn't the kernel refresh the read cache for blocks at some interval?

If I create a new file locally on the target and wait for some time,
a read command like ls on the initiator side will read data from the
block device, not from a stale cache, so that I can read the updated data.

I believed that this is how the file system is supposed to work,
but is it not?

 they do not invalidate that cache.

Does clients not invalidating the cache mean that they do not
update their read cache unless the volume is remounted?

Thanks.

kim

2010/3/1, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de:
 On 27 Feb 2010 at 0:23, Kapetanakis Giannis wrote:

 Hi,

 This has probably been brought up before, but I couldn't find any info.

 I'd like to set up multiple initiators (web/ftp cluster) to access
 the same iscsi target read-only.
 I would prefer to do this without a cluster fs (i.e. GFS) if possible.
 However, I want to have write access on the target server locally.

 I managed to do this but I have the following problem:
 When I write something on the target (locally on the server)
 the updates are not propagated to the initiators.

 Naturally, read-only media don't propagate changes, because they don't
 change. Remember that clients (especially Linux) do cache blocks
 locally, and they do not invalidate that cache. So you could even
 experience application crashes when accessing data structures that were
 partially cached while being changed on the original.


 If I unmount and mount again I can see the changes.

 Naturally, because unmount invalidates the cached blocks for the
 device.
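
[Editor's sketch] A toy model of the behaviour described above, assuming a client that caches each block on first read and never revalidates it. The class and method names are invented for illustration; this is not how the kernel page cache is implemented.

```python
class BlockDevice:
    """Backing store shared by the target host and the initiators."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks.get(lba, b"")


class CachingClient:
    """Initiator-side view: caches every block it reads, never invalidates."""

    def __init__(self, device):
        self.device = device
        self.cache = {}

    def read(self, lba):
        if lba not in self.cache:  # only a cache miss reaches the device
            self.cache[lba] = self.device.read(lba)
        return self.cache[lba]

    def remount(self):
        self.cache.clear()  # unmounting drops the cached blocks


disk = BlockDevice()
disk.write(0, b"old")
client = CachingClient(disk)

first = client.read(0)   # miss: fetched from the device
disk.write(0, b"new")    # local write on the target host
stale = client.read(0)   # hit: the client still returns the old data
client.remount()
fresh = client.read(0)   # miss again after the remount
print(first, stale, fresh)
```

On a real system cached pages can also be evicted under memory pressure, so a client may occasionally pick up newer blocks, but nothing guarantees when or whether that happens.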

 Regards,
 Ulrich


 I'm sharing /dev/vg01/iscsi, which is an ext3 fs. It's locally mounted on
 the server and also mounted (/dev/sde -> /mnt) on the clients.

 Server is centos 5.4 scsi-target-utils-0.0-6.20091205snap.el5_4.1

 <target iqn.2008-09.com.example:server.target2>
  backing-store /dev/vg01/iscsi
  incominguser user pass
  initiator-address 10.0.0.0/26
  write-cache off
 </target>

 client is Fedora 12 iscsi-initiator-utils-6.2.0.870-10.fc12.1.x86_64

 scsi22 : iSCSI Initiator over TCP/IP
 scsi 22:0:0:0: RAID  IET  Controller   0001 PQ: 0
 ANSI: 5
 scsi 22:0:0:0: Attached scsi generic sg5 type 12
 scsi 22:0:0:1: Direct-Access IET  VIRTUAL-DISK 0001 PQ: 0
 ANSI: 5
 sd 22:0:0:1: Attached scsi generic sg6 type 0
 sd 22:0:0:1: [sde] 16777216 512-byte logical blocks: (8.58 GB/8.00 GiB)
 sd 22:0:0:1: [sde] Write Protect is off
 sd 22:0:0:1: [sde] Mode Sense: 79 00 00 08
 sd 22:0:0:1: [sde] Write cache: disabled, read cache: enabled, doesn't
 support DPO or FUA
   sde: unknown partition table
 sd 22:0:0:1: [sde] Attached SCSI disk
 kjournald starting.  Commit interval 5 seconds
 EXT3-fs: mounted filesystem with ordered data mode.

 so is GFS the only option?

 Giannis




Starting up

2010-03-01 Thread sci3ntist
Hello,

I'm new to SANs and iSCSI, and I'm looking for a place to start.
This group seems like a great one, so could you please
give me some links to tutorials or videos about it?

Regards,
Sci3ntist




Re: read only access on 1 LUN for multiple initiators

2010-03-01 Thread Romeo Theriault
 so is GFS the only option?


Is NFS an option?



-- 
Romeo Theriault




Re: to iface or not to iface?

2010-03-01 Thread Romeo Theriault
 If you do not use ifaces, then IO will be routed based on the route table.
 So I think probably, IO would go through the same NIC on the server. Is this
 what you are seeing?


I don't actually have the env. set up yet. Right now I'm just trying to
determine what would be the best way to set up the env. without creating
another subnet for the iscsi SAN traffic.


 If you wanted to use dm-multipath to round robin over both NICs on the
 linux server then you would use ifaces to bind each session to each NIC.


Ok, good to know.


 Are the two switches connected to each other? If they were and you are
 using one subnet, you would have better redundancy. Above you have 2 paths
 to the target, but if the switches are connected you have 4 paths.


I'm not sure; I'll check this out with the networking crew.


 The network layer should figure out there is another NIC that can be used
 and just use it. A problem might be while we are switching nics IO could
 time out and both paths could be down if they both ended up using the same
 nic due to the routing table. So you would want to set up dm-multipath with a
 higher no_path_retry, because when you switch over you might also have to
 relogin to the target through the new nic.

 If you used ifaces then the failover should be smoother. The other path
 would already be logged in, so dm-multipath could just restart the IO right
 away.


Perfect, this is exactly the information I was looking for. Thank you for
the help, I really appreciate your response.
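
[Editor's sketch] The path counts quoted above (2 paths versus 4) can be checked with a small enumeration. The topology is an assumption, since the thread does not spell it out: two initiator NICs and two target portals, with one NIC and one portal on each switch. All names are placeholders.

```python
from itertools import product

# Assumed topology: each switch has one initiator NIC and one target portal.
NIC_SWITCH = {"eth0": "sw-a", "eth1": "sw-b"}
PORTAL_SWITCH = {"portal-a": "sw-a", "portal-b": "sw-b"}


def paths(switches_linked):
    """Return the (NIC, portal) pairs that can reach each other."""
    return [
        (nic, portal)
        for nic, portal in product(NIC_SWITCH, PORTAL_SWITCH)
        if switches_linked or NIC_SWITCH[nic] == PORTAL_SWITCH[portal]
    ]


isolated = paths(switches_linked=False)
linked = paths(switches_linked=True)
print(len(isolated), isolated)
print(len(linked), linked)
```

The pairs only become distinct dm-multipath paths if each session is bound to its NIC through an iface; without that binding the route table decides which NIC carries the traffic.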

-- 
Romeo Theriault
System Administrator
Information Technology Services




[PATCH] Missing field initialization

2010-03-01 Thread Avi Kaplan
Hello,
The struct iscsi_sw_tcp_conn contains the field struct iscsi_conn *conn,
which is never initialized and never used.

I am currently working on a patch that will change the open-iscsi logging
mechanism (as was discussed between Mike, Erez and Ulrich). Some of the
logging is done in the context of a connection, and in those cases I use the
struct iscsi_conn. During my work I came across the function
iscsi_sw_tcp_conn_restore_callbacks, which receives a struct
iscsi_sw_tcp_conn, from which I need to extract the iscsi_conn field and
use it for logging. However, it is not initialized, hence my fix.

One more thing: the function iscsi_sw_tcp_conn_set_callbacks receives a struct
iscsi_conn. Wouldn't it be better if iscsi_sw_tcp_conn_restore_callbacks
received the same? If it did, that would have solved my problem, but in any
case we should fix the missing field initialization (by initializing the field
or removing it from the struct).

Thanks,
  Avi




Failover time of iSCSI multipath devices.

2010-03-01 Thread bet
Hi all.  I am going through some testing of my multipathed iSCSI
devices and I am seeing some longer than expected delays.  I am
running the latest RHEL 5.4 packages as of this morning.  I am seeing
the failure of the iSCSI sessions take about 67 seconds.  After the
iSCSI failure the multipath layer picks up almost immediately.  Here
is a breakdown of /var/log/messages, I am testing dd while pulling a
network cable:

I started the dd at 07:13:51

Cable pulled at:
Mar  1 07:14:27 bentCluster-1 kernel:  connection4:0: ping timeout of
5 secs expired, recv timeout 5, last rx 4884304, last ping 4889304,
now 4894304

iSCSI errors at:
Mar  1 07:14:28 bentCluster-1 iscsid: Kernel reported iSCSI connection
4:0 error (1011) state (3)

SCSI error and multipath failures at:
Mar  1 07:15:35 bentCluster-1 kernel:  session2: session recovery
timed out after 15 secs
Mar  1 07:15:35 bentCluster-1 kernel: sd 3:0:0:1: SCSI error: return
code = 0x000f
Mar  1 07:15:35 bentCluster-1 kernel: end_request: I/O error, dev sdf,
sector 3164079
Mar  1 07:15:35 bentCluster-1 kernel: device-mapper: multipath:
Failing path 8:80.

And then I/O starts again on the device I am sending I/O down.

Finally the other devices fail:

Mar  1 07:15:48 bentCluster-1 kernel: device-mapper: multipath:
Failing path 8:112.

The entire dd took 138 seconds.  It looks like the delay is in the
iSCSI layer.  It took from 07:14:28 to 07:15:35 for the iSCSI session
to fail.

I am using the timeouts:

 ● node.session.timeo.replacement_timeout = 15
 ● node.conn[0].timeo.noop_out_timeout = 5
 ● node.conn[0].timeo.noop_out_interval = 5

http://kbase.redhat.com/faq/docs/DOC-2877
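
[Editor's sketch] The gap between the configured and the observed failover time can be worked out from the log timestamps above. The assumption here is that the replacement_timeout timer starts when the connection error is reported at 07:14:28; the variable names are illustrative.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

# Timestamps copied from the /var/log/messages excerpts above.
conn_error = datetime.strptime("07:14:28", FMT)
recovery_timed_out = datetime.strptime("07:15:35", FMT)

# node.session.timeo.replacement_timeout = 15
replacement_timeout = timedelta(seconds=15)

observed = recovery_timed_out - conn_error
expected_failure = conn_error + replacement_timeout
unexplained = observed - replacement_timeout

print("observed delay:", int(observed.total_seconds()), "s")
print("expected failure at:", expected_failure.strftime(FMT))
print("unexplained delay:", int(unexplained.total_seconds()), "s")
```

Under that assumption the session should have failed roughly 52 seconds earlier than it did, which is the delay the two questions below ask about.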

So I guess I have two questions:

1.  Based on my timeouts I would think that my session would time out
after 15 seconds.  Does anyone have an idea why it is taking 67 seconds?
Am I missing any other timeout values?

2.  In a perfect world what is the best case scenario for the failure
of my iSCSI session?

Thanks in advance.

-Ben




Re: [PATCH] decrease sndtmo

2010-03-01 Thread Mike Christie

On 03/01/2010 05:38 AM, Erez Zilber wrote:

On Wed, Feb 3, 2010 at 11:51 PM, Mike Christie micha...@cs.wisc.edu wrote:

On 02/03/2010 06:07 AM, Erez Zilber wrote:


On Wed, Feb 3, 2010 at 11:30 AM, Mike Christie micha...@cs.wisc.edu wrote:


On 02/03/2010 01:50 AM, Erez Zilber wrote:


It looks like I posted it at Red Hat and never got a response, and I
probably then forgot about it and never asked upstream. Will send mail
upstream now.


Which list are you sending it to? I thought it was lkml, but didn't
find any discussion there.



I think I found a nicer solution. See the attached patch made over Linus's
tree. I am just not sure if we are allowed to set the sk_err field; maybe
it is supposed to be internal to the socket code. The patch seems to be
working for me.



Works great for me.



Ok. I am going to post it to netdev today/tomorrow, to make sure they are ok
with how I am accessing the sock struct.



Mike,

Did you get any response from the netdev list?



I just sent it to linux-scsi after looking at some similar code. It 
ended up getting merged in James's tree.





Re: 2.6.14-23_compat.patch CentOS 5.4

2010-03-01 Thread Mike Christie

On 02/02/2010 02:53 AM, Erez Zilber wrote:


I've attached 2 versions. One fixes only the  5.5 case and the other
one handles all RHEL versions that are < 6.0. I prefer the 2nd one
(assuming that there will be no API breakage until RHEL 6.0).



Merged the second one. Thanks!




Re: iscsi ifaces / multipathing / etc

2010-03-01 Thread Mike Christie

On 03/01/2010 02:46 AM, Or Gerlitz wrote:

Mike Christie wrote:

Ah never mind. For some reason I thought you had to have a mask, but if
you give rdma_resolve_addr an addr then it will do the right thing and
use only the port you wanted, right?


YES. Providing rdma_resolve_addr a source address is like calling rdma_bind_addr
with this source address and then calling rdma_resolve_addr with only the
destination address. So it's like bind(2) in that respect. Currently the rdma
stack, through the include/rdma/rdma_cm.h rdma-cm API, doesn't support things
like SO_BINDTODEVICE to either a network or an rdma device. But even if it
would/will, I prefer to stay with IP addresses.



I am not sure I understand why you prefer to use the IP. Was your reason the
part you wrote about iser being IP based?




I understand how SO_BINDTODEVICE is used for the tcp transport, but it's all done
in user space, and later, when the connection is bound to the end-point (->
socket created/bound/connected from user space), things are moved to the kernel.
This isn't the case with iser. I believe that at this point we agree that there
should be a way to pass the source address that the user bound to the iscsi
interface down to the kernel iser transport code, correct?



Whether it is done in the kernel or userspace or if you support 
SO_BINDTODEVICE or not is not an issue. We can change userspace so you
get a device like tcp and offload and/or we can change the kernel in any 
sane way so you can bind by whatever. You do not have to support 
SO_BINDTODEVICE. You can work like how the offload drivers do.


I am not sure where you get that I agree ip address is best. All I am 
saying above is I think I see the API you wanted to use.


If the device name and port do not change normally that seems better to 
me since it works like the other drivers.
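
[Editor's sketch] The bind(2) analogy quoted above can be shown with plain TCP sockets on loopback. This is an analogy only: it uses the standard socket API, not the rdma-cm API, and the addresses and ephemeral ports are placeholders.

```python
import socket

# Listener standing in for the target portal (loopback, ephemeral port).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

# Client: choose the source address first, then connect to the destination.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.bind(("127.0.0.1", 0))        # explicit source address, like bind(2)
bound_addr = client.getsockname()
client.connect(server.getsockname())

# The peer sees exactly the source address the client bound to.
peer_sock, peer_addr = server.accept()
print(bound_addr, peer_addr)

peer_sock.close()
client.close()
server.close()
```

Binding by address pins the source IP, whereas SO_BINDTODEVICE pins the interface by name. That is the same address-versus-device choice being debated in this thread.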





Re: [PATCH] Missing field initialization

2010-03-01 Thread Mike Christie

On 03/01/2010 04:33 AM, Avi Kaplan wrote:

Hello,
The struct iscsi_sw_tcp_conn contains the field struct iscsi_conn *conn,
which is never initialized and never used.

I am currently working on a patch that will change the open-iscsi logging
mechanism (as was discussed between Mike, Erez and Ulrich). Some of the
logging is done in the context of a connection, and in those cases I use the
struct iscsi_conn. During my work I came across the function
iscsi_sw_tcp_conn_restore_callbacks, which receives a struct
iscsi_sw_tcp_conn, from which I need to extract the iscsi_conn field and
use it for logging. However, it is not initialized, hence my fix.

One more thing: the function iscsi_sw_tcp_conn_set_callbacks receives a struct
iscsi_conn. Wouldn't it be better if iscsi_sw_tcp_conn_restore_callbacks
received the same? If it did, that would have solved my problem, but in any
case we should fix the missing field initialization (by initializing the field
or removing it from the struct).



If it is not used remove it.

I think you can actually get the iscsi_conn from the sk_user_data field, 
but changing iscsi_sw_tcp_conn_restore_callbacks to get a iscsi_conn so 
it works like set_callbacks is fine.





Re: disconnecting part of the state machine

2010-03-01 Thread Mike Christie

On 03/01/2010 08:15 AM, Or Gerlitz wrote:

Hi Mike,

I'm just reviewing this for the iser disconnect change we're discussing,
and I noticed that stop conn is called after ep_disconnect. I wasn't sure if
this is needed, or maybe a bug, or just something we can live with...



ep_disconnect deals with low-level stuff like ib, tcp, and hardware/firmware
stuff at the card level.


stop conn deals with the iscsi connection and cleans up iscsi-level
stuff like pdus, iscsi cmds, and session/connection state. It is
definitely needed.
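
[Editor's sketch] The split described above can be checked against the event trace from the original post. The line numbers and event names are copied from that trace; the layer labels are an editorial reading of this reply, not something the kernel reports.

```python
# (log line number, kernel event) pairs from the trace in the original post.
TRACE = [
    (48, "kep_connect"), (53, "kep_bind"), (94, "kcreate_session"),
    (133, "kcreate_conn"), (139, "kbind_conn"), (410, "kstart_conn"),
    (486, "kep_disconnect"), (491, "kstop_conn"),
    (497, "kdestroy_conn"), (504, "kdestroy_session"),
]

# Editorial labels: which layer each teardown step cleans up.
TEARDOWN_LAYER = {
    "kep_disconnect": "transport (ib/tcp/hardware)",
    "kstop_conn": "iscsi (pdus, cmds, connection state)",
    "kdestroy_conn": "iscsi connection object",
    "kdestroy_session": "iscsi session object",
}


def teardown_steps(trace):
    """Return the teardown events in the order they appear in the log."""
    return [name for _, name in sorted(trace) if name in TEARDOWN_LAYER]


steps = teardown_steps(TRACE)
for name in steps:
    print(f"{name:18} {TEARDOWN_LAYER[name]}")
```

In this trace the transport is torn down first and the iSCSI-level cleanup follows, so the two calls do different jobs and neither makes the other redundant.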





Re: iscsi ifaces / multipathing / etc

2010-03-01 Thread Mike Christie

On 03/01/2010 06:29 PM, Mike Christie wrote:

If the device name and port do not change normally that seems better to
me since it works like the other drivers.



Oh yeah, just to be clear, I am saying I prefer the above, but that is based
on what I understand today. As I said I did not understand why you think 
IP based is best for iser when all other drivers use the other option. 
Beat your point into my head if I am not getting it :) I am open to change.





Re: multipathing

2010-03-01 Thread Mike Christie

On 03/01/2010 09:15 AM, Or Gerlitz wrote:

Or Gerlitz wrote:

Mike Christie wrote:



And we will have to watch out for a rmmod while there are ib_conns left



yes, I never was a big fan of rmmod flows, but, I admit they exist...


I do hit this -EEXIST race from time to time when running things like a
stop/start sequence.

Now I get it quite often, but I am playing with the code. I am quite sure that
I also saw this with unmodified code on recent kernels (e.g. 2.6.30 and
above).
Has anyone else reported it as well?

Or.

Loading iSCSI transport class v2.0-870.
[ cut here ]
WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcc/0xe4()
Hardware name: X7DW3
sysfs: cannot create duplicate filename '/class/iscsi_endpoint'
Modules linked in: scsi_transport_iscsi(+) iw_cxgb3 cxgb3 mdio autofs4 nfs 
nfs_acl auth_rpcgss lockd sunrpc rds ib_ipoib rdma_ucm rdma_cm ib_ucm ib_uverbs 
ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_multipath video output 
battery ac joydev sr_mod sg igb ib_mthca ib_mad ib_core button floppy ioatdma 
rng_core pcspkr dca dm_region_hash dm_log dm_mod usb_storage ata_piix libata 
shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last 
unloaded: scsi_transport_iscsi]
Pid: 5341, comm: modprobe Tainted: GW  2.6.33 #1
Call Trace:
  [8110a4fc] ? sysfs_add_one+0xcc/0xe4
  [810349a2] ? warn_slowpath_common+0x77/0x8e
  [81034a15] ? warn_slowpath_fmt+0x51/0x59
  [8110a428] ? sysfs_pathname+0x35/0x3d
  [8110a428] ? sysfs_pathname+0x35/0x3d
  [8110a4fc] ? sysfs_add_one+0xcc/0xe4
  [8110a9f5] ? create_dir+0x4f/0x85
  [8110aa60] ? sysfs_create_dir+0x35/0x4a
  [8114a364] ? kobject_add_internal+0xd2/0x18d
  [8114a443] ? kset_register+0x24/0x3a
  [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
  [811e2f1a] ? __class_register+0x128/0x1c0
  [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
  [a0357050] ? iscsi_transport_init+0x50/0x145 [scsi_transport_iscsi]
  [810001e1] ? do_one_initcall+0x50/0x13d
  [81060142] ? sys_init_module+0xc8/0x222
  [81001eeb] ? system_call_fastpath+0x16/0x1b
---[ end trace c8375848060e5033 ]---
kobject_add_internal failed for iscsi_endpoint with -EEXIST, don't try to 
register things with the same name in the same directory.
Pid: 5341, comm: modprobe Tainted: GW  2.6.33 #1
Call Trace:
  [8114a3e0] ? kobject_add_internal+0x14e/0x18d
  [8114a443] ? kset_register+0x24/0x3a
  [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
  [811e2f1a] ? __class_register+0x128/0x1c0
  [a0357000] ? iscsi_transport_init+0x0/0x145 [scsi_transport_iscsi]
  [a0357050] ? iscsi_transport_init+0x50/0x145 [scsi_transport_iscsi]
  [810001e1] ? do_one_initcall+0x50/0x13d
  [81060142] ? sys_init_module+0xc8/0x222
  [81001eeb] ? system_call_fastpath+0x16/0x1b



I have not seen it. Something probably has an iscsi_transport sysfs file 
open, so that object is not completely freed by the time you start up 
again. We see this error a lot with dm-multipath testing when paths 
(/dev/sdXs) are removed and added quickly.
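
If the explanation above holds, a test script that unloads and reloads the
module could wait for the stale sysfs entry to go away before running
modprobe again. The following is only a minimal sketch of that idea, not
something from the thread: wait_until_gone() is a hypothetical helper, and
the path a caller would pass (for example /sys/class/iscsi_endpoint, inferred
from the '/class/iscsi_endpoint' name in the warning) is an assumption. It
works around the race in a test loop and does nothing about whatever is
holding the file open.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: poll until `path` no longer exists or until
 * roughly `timeout_ms` milliseconds have passed.
 * Returns 0 once the path is gone, -1 on timeout or on an unexpected
 * stat() error.
 */
static int wait_until_gone(const char *path, int timeout_ms)
{
	const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
	int waited_ms = 0;

	for (;;) {
		struct stat st;

		if (stat(path, &st) == -1)
			return errno == ENOENT ? 0 : -1;
		if (waited_ms >= timeout_ms)
			return -1;
		nanosleep(&step, NULL);
		waited_ms += 10;
	}
}
```

A caller would run something like wait_until_gone("/sys/class/iscsi_endpoint",
5000) after rmmod and only call modprobe when it returns 0.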





Re: Failover time of iSCSI multipath devices.

2010-03-01 Thread Mike Christie

On 03/01/2010 12:06 PM, bet wrote:
> 1.  Based on my timeouts I would think that my session would time out

Yes. It should timeout about 15 secs after you see

> Mar  1 07:14:27 bentCluster-1 kernel:  connection4:0: ping timeout of
> 5 secs expired, recv timeout 5, last rx 4884304, last ping 4889304,
> now 4894304

You might be hitting a bug where the network layer gets stuck trying to 
send data. I attached a patch that should fix the problem.

If you do not know how to build a RHEL kernel let me know the arch you 
are using and I can build a kernel here (it takes about a day).

> after 15 seconds.  Anyone have an idea why is it taking 67 seconds?
> Am I missing any other timeout values?

No. The ones you have set are it.

> 2.  In a perfect world what is the best case scenario for the failure
> of my iSCSI session?

It should work like in that doc.
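
The counters in the ping timeout line quoted above can be checked by hand.
The sketch below pulls the five numbers out of such a line and computes the
gaps between them. parse_ping_timeout() and implied_ticks_per_sec() are
hypothetical names used only for illustration, and treating the counters as
timer ticks is an assumption, not something stated in the log.

```c
#include <stdio.h>
#include <string.h>

struct ping_timeout_info {
	int ping_timeout_secs;
	int recv_timeout_secs;
	unsigned long last_rx;
	unsigned long last_ping;
	unsigned long now;
};

/* Returns 0 on success, -1 if the line is not a ping timeout message. */
static int parse_ping_timeout(const char *line, struct ping_timeout_info *out)
{
	const char *p = strstr(line, "ping timeout of");

	if (!p)
		return -1;
	if (sscanf(p, "ping timeout of %d secs expired, recv timeout %d, "
		      "last rx %lu, last ping %lu, now %lu",
		   &out->ping_timeout_secs, &out->recv_timeout_secs,
		   &out->last_rx, &out->last_ping, &out->now) != 5)
		return -1;
	return 0;
}

/* Ticks per second implied by the gap between the last ping and "now". */
static unsigned long implied_ticks_per_sec(const struct ping_timeout_info *i)
{
	if (i->ping_timeout_secs <= 0)
		return 0;
	return (i->now - i->last_ping) / (unsigned long)i->ping_timeout_secs;
}
```

For the quoted line, last ping minus last rx is 5000 and now minus last ping
is also 5000. Those match the 5 second recv timeout and the 5 second ping
timeout if the counter advances 1000 times per second.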


diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 5c39369..e840806 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -276,11 +276,12 @@ static int iscsi_sw_tcp_xmit(struct iscsi_conn *conn)
 
 	while (1) {
 		rc = iscsi_sw_tcp_xmit_segment(tcp_conn, segment);
-		if (rc < 0) {
+		if (rc == -EAGAIN)
+			return rc;
+		else if (rc < 0) {
 			rc = ISCSI_ERR_XMIT_FAILED;
 			goto error;
-		}
-		if (rc == 0)
+		} else if (rc == 0)
 			break;
 
 		consumed += rc;
@@ -561,9 +562,10 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
 	struct iscsi_conn *conn = cls_conn->dd_data;
 	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
 	struct iscsi_sw_tcp_conn *tcp_sw_conn = tcp_conn->dd_data;
+	struct socket *sock = tcp_sw_conn->sock;
 
 	/* userspace may have goofed up and not bound us */
-	if (!tcp_sw_conn->sock)
+	if (!sock)
 		return;
 	/*
 	 * Make sure our recv side is stopped.
@@ -574,6 +576,11 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
 	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
 	write_unlock_bh(&tcp_sw_conn->sock->sk->sk_callback_lock);
 
+	if (sock->sk->sk_sleep && waitqueue_active(sock->sk->sk_sleep)) {
+		sock->sk->sk_err = EIO;
+		wake_up_interruptible(sock->sk->sk_sleep);
+	}
+
 	iscsi2_conn_stop(cls_conn, flag);
 	iscsi_sw_tcp_release_conn(conn);
 }
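
As far as the diff shows, the first hunk changes how the transmit loop treats
the return code of each segment send: -EAGAIN is now handed back to the
caller unchanged instead of being turned into a transmit failure. The
following userspace model is only a sketch of that control flow and is not
kernel code. xmit_all(), scripted_xmit_segment() and struct scripted_sender
are hypothetical names, and returning -EIO for a hard failure is an
assumption, since the error path of the real function is not part of the
diff.

```c
#include <errno.h>
#include <stddef.h>

/* A segment sender returns bytes sent, 0 when done, or a negative errno. */
typedef int (*xmit_segment_fn)(void *ctx);

/* Test double: replays a fixed list of return codes, then reports "done". */
struct scripted_sender {
	const int *results;
	size_t count;
	size_t next;
};

static int scripted_xmit_segment(void *ctx)
{
	struct scripted_sender *s = ctx;

	if (s->next >= s->count)
		return 0;
	return s->results[s->next++];
}

/*
 * Model of the patched loop: -EAGAIN is propagated as-is, any other
 * negative code becomes a hard failure, 0 ends the loop, and positive
 * values are added to the byte count that is returned on success.
 */
static int xmit_all(xmit_segment_fn xmit_segment, void *ctx)
{
	int consumed = 0;

	for (;;) {
		int rc = xmit_segment(ctx);

		if (rc == -EAGAIN)
			return rc;	/* transient: let the caller decide */
		else if (rc < 0)
			return -EIO;	/* assumed stand-in for the error path */
		else if (rc == 0)
			break;

		consumed += rc;
	}
	return consumed;
}
```

With the pre-patch ordering, the -EAGAIN case would have fallen into the
rc < 0 branch and been reported as a transmit failure.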


Re: Failover time of iSCSI multipath devices.

2010-03-01 Thread guy keren

Mike Christie wrote:
> On 03/01/2010 12:06 PM, bet wrote:
>> 1.  Based on my timeouts I would think that my session would time out
>
> Yes. It should timeout about 15 secs after you see
>
>> Mar  1 07:14:27 bentCluster-1 kernel:  connection4:0: ping timeout of
>> 5 secs expired, recv timeout 5, last rx 4884304, last ping 4889304,
>> now 4894304
>
> You might be hitting a bug where the network layer gets stuck trying to 
> send data. I attached a patch that should fix the problem.
>
> If you do not know how to build a RHEL kernel let me know the arch you 
> are using and I can build a kernel here (it takes about a day).
>
>> after 15 seconds.  Anyone have an idea why is it taking 67 seconds?
>> Am I missing any other timeout values?
>
> No. The ones you have set are it.
>
>> 2.  In a perfect world what is the best case scenario for the failure
>> of my iSCSI session?
>
> It should work like in that doc.



wouldn't the abort timeout also have an effect here? or will iSCSI fail 
the coming abort (that the mid-layer sends when it gets an error sending 
a SCSI command) immediately?


--guy




Re: Kernel oops on login

2010-03-01 Thread An Oneironaut
Ok so I guess working with old versions of open-iscsi is not accepted
here :).  So I upgraded to the latest and greatest semi-stable
release, 871.  I no longer see the "is not queued" messages and my
login and logout work fine.  However this is only the case if I
don't have my flash device mounted on /dev/sdc.  If the flash is
mounted I get this kernel oops:

 -T iqn.1999-02.com.nexsan:p0:sataboy:01731a5a --login
Logging in to [iface: default, target: iqn.
1999-02.com.nexsan:p0:sataboy:01731a5a, portal: 172.19.151.169,3260]
kobject_add failed for sdc with -EEXIST, don't try to register things
with the same name in the same directory.
BUG: unable to handle kernel NULL pointer dereference at virtual
address 0008
 printing eip:
*pde = 
Oops:  [#1]
Modules linked in: iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi
CPU:0
EIP:0060:[c0189241]Not tainted VLI
EFLAGS: 00210292   (2.6.22.10-vs2.2.0.5-cisco-nmx #1)
EIP is at create_dir+0x21/0x190
eax: c2280584   ebx: f7191bac   ecx: c2280588   edx: 
esi: c2280584   edi: c2280584   ebp:    esp: f7191b7c
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process iscsid (pid: 5038, ti=f719 task=f77c75b0 task.ti=f719)
Stack: c0378110 c037566c c2280588 c2280584 c037566c c0103c41 c2280584
c2280584
   c2280584  c01893d4 f7191bac   c0203a1f
f7b35ac0
   f7d690c0 c0186e80  f7b35b10 f7b35ac0 c22804c0 c2280584
f7d690c0
Call Trace:
 [c0103c41] dump_stack+0x11/0x20
 [c01893d4] sysfs_create_dir+0x24/0x70
 [c0203a1f] kobject_shadow_add+0x7f/0x1a0
 [c0186e80] register_disk+0x50/0x1f0
 [c01f4b72] blk_register_queue+0x52/0x90
 [c026c6c8] sd_probe+0x278/0x3f0
 [c0189e47] sysfs_create_link+0x57/0x150
 [c0230c57] driver_probe_device+0x87/0x190
 [c0331fc1] klist_next+0x51/0xb0
 [c022ff24] bus_for_each_drv+0x44/0x70
 [c0230e19] device_attach+0x79/0x80
 [c0230d60] __device_attach+0x0/0x10
 [c022fe95] bus_attach_device+0x45/0x90
 [c022eb93] device_add+0x493/0x560
 [c0268502] scsi_sysfs_add_sdev+0x32/0x230
 [c02665bd] scsi_probe_and_add_lun+0x95d/0x980
 [c0266e91] __scsi_scan_target+0x491/0x5f0
 [c0166dcb] mntput_no_expire+0x1b/0x70
 [c015bac3] link_path_walk+0x63/0xc0
 [c02676a6] scsi_scan_target+0xb6/0xe0
 [f8a008fa] iscsi_user_scan_session+0x9a/0xb0 [scsi_transport_iscsi]
 [f8a00820] iscsi_user_scan+0x0/0x30 [scsi_transport_iscsi]
 [f8a00860] iscsi_user_scan_session+0x0/0xb0 [scsi_transport_iscsi]
 [c022dd42] device_for_each_child+0x22/0x40
 [f8a00820] iscsi_user_scan+0x0/0x30 [scsi_transport_iscsi]
 [f8a00843] iscsi_user_scan+0x23/0x30 [scsi_transport_iscsi]
 [c026818b] store_scan+0xbb/0xf0
 [c013b614] __alloc_pages+0x64/0x2f0
 [c02680d0] store_scan+0x0/0xf0
 [c0232196] class_device_attr_store+0x26/0x40
 [c0188151] sysfs_write_file+0xb1/0x110
 [c01880a0] sysfs_write_file+0x0/0x110
 [c0153820] vfs_write+0xa0/0x140
 [c0153da1] sys_write+0x41/0x70
 [c010280e] sysenter_past_esp+0x5f/0x85
 ===
Code: 74 26 00 8d bc 27 00 00 00 00 83 ec 28 89 5c 24 18 8b 5c 24 2c
89 74 24 1c 89 7c 24 20 89 6c 24 24 89 d5 89 4c 24 08 89 44 24 0c 8b
42 08 83 c0 68 e8 94 a2 1a 0031 c0 b9 ff ff ff ff 8b 7c 24
EIP: [c0189241] create_dir+0x21/0x190 SS:ESP 0068:f7191b7c
Mar  1 21:09:07 localhost kernel: BUG: unable to handle kernel NULL
pointer dereference at virtual address 0008
Mar  1 21:09:07 localhost kClocksource tsc unstable (delta =
2941960870 ns)
ernel:  printingTime: pit clocksource has been installed.
 eip:
Mar  1 21:09:07 localhost kernel: *pde = 
Mar  1 21:09:07 localhost kernel: Oops:  [#1]
Mar  1 21:09:07 localhost kernel: CPU:0
Mar  1 21:09:07 localhost kernel: EIP:0060:[c0189241]Not
tainted VLI
Mar  1 21:09:07 localhost kernel: EFLAGS: 00210292   (2.6.22.10-
vs2.2.0.5-cisco-nmx #1)
Mar  1 21:09:07 localhost kernel: EIP is at create_dir+0x21/0x190
Mar  1 21:09:07 localhost kernel: eax: c2280584   ebx: f7191bac   ecx:
c2280588   edx: 
Mar  1 21:09:07 localhost kernel: esi: c2280584   edi: c2280584   ebp:
   esp: f7191b7c
Mar  1 21:09:07 localhost kernel: ds: 007b   es: 007b   fs:   gs:
0033  ss: 0068
Mar  1 21:09:07 localhost kernel: Process iscsid (pid: 5038,
ti=f719 task=f77c75b0 task.ti=f719)
Mar  1 21:09:07 localhost kernel: Stack: c0378110 c037566c c2280588
c2280584 c037566c c0103c41 c2280584 c2280584
Mar  1 21:09:07 localhost kernel:c2280584  c01893d4
f7191bac   c0203a1f f7b35ac0
Mar  1 21:09:07 localhost kernel:f7d690c0 c0186e80 
f7b35b10 f7b35ac0 c22804c0 c2280584 f7d690c0
Mar  1 21:09:07 localhost kernel: Call Trace:
Mar  1 21:09:07 localhost kernel:  [c0103c41] dump_stack+0x11/0x20
Mar  1 21:09:07 localhost kernel:  [c01893d4] sysfs_create_dir
+0x24/0x70
Mar  1 21:09:07 localhost kernel:  [c0203a1f] kobject_shadow_add
+0x7f/0x1a0
Mar  1 21:09:07 localhost kernel:  [c0186e80] register_disk
+0x50/0x1f0
Mar  1 21:09:07 localhost kernel:  [c01f4b72] blk_register_queue
+0x52/0x90