Re: Should connection restored?

2008-05-27 Thread Mike Christie

HIMANSHU wrote:
 Yeah Mike, this time you hit the bull's eye.
 
 So the only way (from the initiator's side) to successfully relogin to the
 blocked target was increasing the timeout value, which I tried.
 
 Now we have to change the IET code to allow relogin, right?
 
 The IET people are not really responding to my posts. Do you have any
 idea what should be changed?
 

I do not think IET needs any changes. If I restart IET, it just lets me 
log back in with no changes. Or if I do

/etc/init.d/iscsi-target stop

wait a long time

/etc/init.d/iscsi-target start

the initiator logs back in.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: Should connection restored?

2008-05-21 Thread Mike Christie

HIMANSHU wrote:
 
 Yeah, I am using IET.
 
 I was getting "session recovery timed out after 120 secs" when the
 timeout was 120.
 
 Now that it is 86400, I have never seen "session recovery timed out
 after 86400 secs".
 
 Otherwise, is it normal behavior for IET that the connection to
 existing targets is lost after restarting the target daemon?

I am not sure what you mean by connection is lost. If you mean that you 
see those 1011 errors then yes. When IET and open-iscsi are running 
normally we have a tcp connection to the IET. When you reboot IET, it 
closes it, so we try to reconnect. When we detect the problem we spit 
out an error 1011. If a nop/ping times out first though you might see a 
slightly different error, but normally when the tcp connection changes 
state we are notified and you see 1011 errors.

If by "the connection is lost" you mean that we cannot do disk IO, you
get FS errors, and the iscsi session and its disks are basically dead
and unusable because they only spit out errors, then it is sort of
expected :) We will not begin to fail IO until that replacement/recovery
timer expires. So if you reboot IET and it takes longer than that timer
to relogin, then you will get FS/IO/SCSI/BLOCK errors.


 
 In Open-e, the disk was blocked only for some time; after that it
 recovered.
 But in my case, it is blocked forever.

When you say blocked, do you mean if you run iscsiadm -m session -P 3 
that the disks state says blocked or do you mean something else?

Like I said before, when the connection problem is detected we block the
scsi disks. When we are logged back in, or if the replacement/recovery
timer fires, we unblock them. If we are logged in we execute IO. If we are
not then we fail IO.
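
The blocking behaviour described above can be summarised as a small decision table. This is only an illustrative Python model of that description, not open-iscsi code; the function name `io_outcome` and its arguments are made up for this sketch.

```python
def io_outcome(logged_in: bool, recovery_timer_expired: bool) -> str:
    """Model what happens to a new IO request, per the description above.

    - session logged in: devices are unblocked and IO is executed
    - session down, timer still running: devices are blocked, IO is queued
    - session down, timer fired: devices are unblocked, IO is failed
    """
    if logged_in:
        return "execute"
    if not recovery_timer_expired:
        return "queue"
    return "fail"


if __name__ == "__main__":
    # Walk through the three situations discussed in this thread.
    for logged_in, expired in [(True, False), (False, False), (False, True)]:
        print(logged_in, expired, io_outcome(logged_in, expired))
```

Seen this way, a process that looks hung is in the "queue" case, and the FS/IO errors only start once the "fail" case is reached.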


 
 Open-e might have changed IET to support this persistent connection, or
 they might be using something else? Thoughts?





Re: Should connection restored?

2008-05-20 Thread Konrad Rzeszutek

 The rule: the target daemon can be restarted only after logout.
 
 But it sounds a little weird. Is it?

It looks like a safety feature. 

 
 If some I/O is going on on the initiator's side, I cannot log out of that target.

You sure about that? Did you do 'iscsiadm -m node -U all' and the
session wouldn't logout?

 
 Without restarting the target daemon, we cannot make target/LUN
 additions or modifications visible to initiators through the discovery
 command.

From the initiator side that is not true. The initiator can
get new targets, rescan the LUNs, find new devices, etc. The limitation
is in the iscsi-target software which doesn't allow you to dynamically
do these things. If you have the experience you could write this
functionality and propose it to the iscsi-target folks.





Re: Should connection restored?

2008-05-20 Thread HIMANSHU

Thanks a lot Konrad and Mike for your continuous help.

@Konrad

Sorry, I meant to say that if I/O is going on, it shouldn't be allowed to
log out of that target.

Suppose I added a few targets and want the initiators to know about this
change; then I have to restart the iscsi-target daemon. Without this,
iscsiadm -m discovery cannot detect the addition of these new
targets. It still shows the old ones.

Do you mean that the addition of a new target should be announced
by the target itself, or that the discovery command should sense it?

How can an initiator sense the addition of new targets other than by
discovery? Are there any special commands? I am using IET.

LUN addition can be sensed by --rescan. But after restarting
iscsi-target, as the disks' status is blocked, the --rescan command hangs
completely.

So after restarting iscsi-target, new targets/LUNs can be sensed, but
the original connections are lost.
Is this normal behavior for IET?
--
@Mike

Yeah, I am using IET.

I was getting "session recovery timed out after 120 secs" when the
timeout was 120.

Now that it is 86400, I have never seen "session recovery timed out
after 86400 secs".

Otherwise, is it normal behavior for IET that the connection to
existing targets is lost after restarting the target daemon?

In Open-e, the disk was blocked only for some time; after that it
recovered.
But in my case, it is blocked forever.

Open-e might have changed IET to support this persistent connection, or
they might be using something else? Thoughts?

-





Re: Should connection restored?

2008-05-19 Thread HIMANSHU

I changed timeout as follows.

1. iscsiadm --mode discovery --type sendtargets --portal 192.168.7.174

2. iscsiadm -m node -T iqn.2008-05.com.abcde:Tar1 -p 192.168.7.174:3260 --login

3. iscsiadm -m node -T iqn.2008-05.com.qualexsystems:Tar1 -p 192.168.7.174:3260 -o update -n node.session.timeo.replacement_timeout -v 82400
(succeeded)

4. I mounted the exposed disk and started some I/O on it.

5. Then I restarted the iscsi-target daemon.

Output of dmesg during step 4
-
__journal_remove_journal_head: freeing b_frozen_data
 scsi 1:0:0:1: rejecting I/O to dead device
 Buffer I/O error on device sdh1, logical block 529
 lost page write due to I/O error on sdh1
 scsi2 : iSCSI Initiator over TCP/IP
   Vendor: IET   Model: VIRTUAL-DISK  Rev: 0
   Type:   Direct-Access  ANSI SCSI revision: 04
 SCSI device sdg: 630784 512-byte hdwr sectors (323 MB)
 sdg: Write Protect is off
 sdg: Mode Sense: 77 00 00 08
 SCSI device sdg: drive cache: write through
 SCSI device sdg: 630784 512-byte hdwr sectors (323 MB)
 sdg: Write Protect is off
 sdg: Mode Sense: 77 00 00 08
 SCSI device sdg: drive cache: write through
  sdg: sdg1
 sd 2:0:0:0: Attached scsi disk sdg
 sd 2:0:0:0: Attached scsi generic sg6 type 0
   Vendor: IET   Model: VIRTUAL-DISK  Rev: 0
   Type:   Direct-Access  ANSI SCSI revision: 04
 SCSI device sdh: 630784 512-byte hdwr sectors (323 MB)
 sdh: Write Protect is off
 sdh: Mode Sense: 77 00 00 08
 SCSI device sdh: drive cache: write through
 SCSI device sdh: 630784 512-byte hdwr sectors (323 MB)
 sdh: Write Protect is off
 sdh: Mode Sense: 77 00 00 08
 SCSI device sdh: drive cache: write through
  sdh: sdh1
 sd 2:0:0:1: Attached scsi disk sdh
 sd 2:0:0:1: Attached scsi generic sg7 type 0
 kjournald starting.  Commit interval 5 seconds
 EXT3 FS on sdh1, internal journal
 EXT3-fs: mounted filesystem with ordered data mode.

 dmesg after step 5

session1: iscsi: session recovery timed out after 120 secs
 iscsi: cmd 0x2a is not queued (7)
 sd 2:0:0:1: SCSI error: return code = 0x0001
 end_request: I/O error, dev sdh, sector 390208
 Buffer I/O error on device sdh1, logical block 195073
 lost page write due to I/O error on sdh1
 Buffer I/O error on device sdh1, logical block 195074
 lost page write due to I/O error on sdh1
 Aborting journal on device sdh1.
 iscsi: cmd 0x2a is not queued (7)
 sd 2:0:0:1: SCSI error: return code = 0x0001
 end_request: I/O error, dev sdh, sector 390208
 Buffer I/O error on device sdh1, logical block 195073
 lost page write due to I/O error on sdh1
 Buffer I/O error on device sdh1, logical block 195074
 lost page write due to I/O error on sdh1
 __journal_remove_journal_head: freeing b_committed_data
 ext3_abort called.
 EXT3-fs error (device sdh1): ext3_journal_start_sb: Detected aborted journal
 Remounting filesystem read-only
 iscsi: cmd 0x2a is not queued (7)
 sd 2:0:0:1: SCSI error: return code = 0x0001
 end_request: I/O error, dev sdh, sector 66
 Buffer I/O error on device sdh1, logical block 2
 lost page write due to I/O error on sdh1
 iscsi: cmd 0x2a is not queued (7)
 iscsi: cmd 0x2a is not queued (7)
 sd 2:0:0:1: SCSI error: return code = 0x0001
 end_request: I/O error, dev sdh, sector 376896
 Buffer I/O error on device sdh1, logical block 188417
 lost page write due to I/O error on sdh1
 sd 2:0:0:1: SCSI error: return code = 0x0001
 end_request: I/O error, dev sdh, sector 376900
 Buffer I/O error on device sdh1, logical block 188419
 lost page write due to I/O error on sdh1
---
What exactly is happening here?
The timeout value should have changed from 120 to 86400. How can I
confirm that?
Can there be some other problem as well?
-
Target: iqn.2008-05.com.qualexsystems:Tar2
   Current Portal: 192.168.7.174:3260,1
   Persistent Portal: 192.168.7.174:3260,1
   **
   Interface:
   **
   Iface Name: default
   Iface Transport: tcp
   Iface IPaddress: default
   Iface HWaddress: default
   Iface Netdev: default
   SID: 4
   iSCSI Connection State: LOGGED IN
   Internal iscsid Session State: NO CHANGE
   
   Negotiated iSCSI params:
   
   HeaderDigest: CRC32C
   DataDigest: None
   MaxRecvDataSegmentLength: 131072
   MaxXmitDataSegmentLength: 8192
   FirstBurstLength: 65536
   MaxBurstLength: 262144
   ImmediateData: No
   InitialR2T: Yes
   MaxOutstandingR2T: 1
   
   

Re: Should connection restored?

2008-05-14 Thread Mike Christie

HIMANSHU wrote:
 
 Hi..
 
 It is probably not the problem of replacement_timer.
 

Did you try what I asked? Did you do the same timing in the tests?

In your log you had this:
session13: iscsi: session recovery timed out after 120 secs

When this is printed out it means the replacement timer has fired and
the devices will be unblocked (while they are blocked, IO will be queued
and processes will look like they are hung waiting for it), and IO
will be failed until the session comes back.
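
One way to check which timer value was actually in effect is to pull the number out of that kernel message. Below is a minimal Python sketch, assuming the message format is exactly as quoted in this thread; the helper name `recovery_timeouts` is hypothetical.

```python
import re
from typing import List, Tuple

# Pattern for the kernel message quoted above, for example:
#   session13: iscsi: session recovery timed out after 120 secs
RECOVERY_RE = re.compile(
    r"session(\d+): iscsi: session recovery timed out after (\d+) secs"
)


def recovery_timeouts(log_text: str) -> List[Tuple[int, int]]:
    """Return (session id, timeout in seconds) for every matching log line."""
    return [(int(sid), int(secs)) for sid, secs in RECOVERY_RE.findall(log_text)]
```

The input can be any captured log text, for instance the output of dmesg saved to a file. If the reported number is still 120 after the setting was changed, the running session was not using the new value.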




Re: Should connection restored?

2008-05-12 Thread HIMANSHU

I) How to increase replacement_timeout on the initiator?

   echo 82400 > /sys/block/sdh/device/timeout
or
   After discovery, nodes & sendtargets are created. In nodes, there is a
default file.



node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.FastAbort = Yes


node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.active_timeout = 5
node.conn[0].timeo.idle_timeout = 60
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 15
...
...
Can these values be useful to me? Or will only a high replacement timeout
value and

node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0

as given in the documentation do?
One more thing: the replacement timeout is not available on the target
side, right? It is only to be changed on the initiator side.

Here is the target's ietd file:
Target iqn.2008-04.com.qualexsystems:Tar1
Alias Tar1
Lun 1 Path=/dev/Vg1/Lv2,Type=fileio
Lun 0 Path=/dev/Vg1/Lv1,Type=fileio
HeaderDigest CRC32C,None
InitialR2T Yes
MaxBurstLength 262144
MaxRecvDataSegmentLength 8192
DataPDUInOrder Yes
ImmediateData No
MaxXmitDataSegmentLength 8192
FirstBurstLength 65536
MaxOutstandingR2T 8
DataSequenceInOrder Yes
DataDigest CRC32C,None
DefaultTime2Wait 2
MaxConnections 1
DefaultTime2Retain 20
ErrorRecoveryLevel 0
Wthreads 8
 
-
II) After an initiator has logged in to a particular target
iqn.2008-04.com.qualexsystems:Tar1, is it acceptable to add LUNs to that target?
--
III) If bidirectional CHAP authentication is set for 2 targets,
Tar1 & Tar2, can the same initiator machine log in to both targets?

 Because for the 2nd target's login, /etc/iscsi/iscsid.conf on the
initiator has to be changed so that login is possible. But here we
are losing our previous target's authentication when we overwrite
iscsid.conf with Tar2's Uname & Pwd.

 And Tar1 is still logged in though iscsid.conf no longer
contains its authentication parameters, only Tar2's. Is this
accepted behavior, or is it weird? Your views?
--
IV) We are not using multipath, so I can't try your previous
suggestion:

Spring for an extra nic on the target and use multipath and set
queue_if_no_path or no_path_retry queue.
--
   Is there any other way to make connections persistent? When I used
Open-e (DSS) as the target machine, the initiator retained persistent
connections after a target restart. Our target restart code was already
mentioned earlier.

Thank you very much
-



Re: Should connection restored?

2008-05-12 Thread Mike Christie

HIMANSHU wrote:
 I) How to increase replacement_timeout on initiator?
 
    echo 82400 > /sys/block/sdh/device/timeout
 or

Yeah, the timeout above is not the replacement_timeout so writing to 
that sysfs file will not help.



 After discovery, nodes & sendtargets are created. In nodes, there is a
 default file.
 
 
 
 node.session.timeo.replacement_timeout = 120
 node.session.err_timeo.abort_timeout = 10
 node.session.err_timeo.reset_timeout = 30
 node.session.iscsi.FastAbort = Yes
 
 
 node.conn[0].timeo.logout_timeout = 15
 node.conn[0].timeo.login_timeout = 15
 node.conn[0].timeo.auth_timeout = 45
 node.conn[0].timeo.active_timeout = 5
 node.conn[0].timeo.idle_timeout = 60
 node.conn[0].timeo.ping_timeout = 5
 node.conn[0].timeo.noop_out_interval = 10
 node.conn[0].timeo.noop_out_timeout = 15
 ...
 ...
 Can these values be useful to me?Or only replacement timeout HIGH
 value and


Yes. This is the replacement_timeout you want to set. Run
iscsiadm -m node -T target -p ip:port -o update -n 
node.session.timeo.replacement_timeout -v 82400

You can also edit the file by hand, but I would use iscsiadm in general 
because it will handle changes in the file format for you.
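
For scripting the same update, here is a minimal Python sketch that only assembles the iscsiadm arguments shown above and optionally runs them. The function names are hypothetical, and the target and portal values are placeholders to be replaced with your own.

```python
import subprocess
from typing import List


def replacement_timeout_cmd(target: str, portal: str, seconds: int) -> List[str]:
    """Build the iscsiadm argument list for the update shown above."""
    return [
        "iscsiadm", "-m", "node",
        "-T", target, "-p", portal,
        "-o", "update",
        "-n", "node.session.timeo.replacement_timeout",
        "-v", str(seconds),
    ]


def set_replacement_timeout(target: str, portal: str, seconds: int) -> None:
    """Run the command; raises CalledProcessError if iscsiadm fails."""
    subprocess.run(replacement_timeout_cmd(target, portal, seconds), check=True)
```

I would expect the stored value to be applied at login, so a session that is already logged in may need a logout and login before it picks up the change; that is an assumption worth verifying on your version.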


 
 node.conn[0].timeo.noop_out_interval = 0
 node.conn[0].timeo.noop_out_timeout = 0
 
 as given in documentation will do?
 And 1 more replacement timeout is not available in target side?
 Right.Only to be changed on the initiator side.

Yeah, right. There is no target setting.


 
 Here's target's ietd file
 Target iqn.2008-04.com.qualexsystems:Tar1
 Alias Tar1
 Lun 1 Path=/dev/Vg1/Lv2,Type=fileio
 Lun 0 Path=/dev/Vg1/Lv1,Type=fileio
 HeaderDigest CRC32C,None
 InitialR2T Yes
 MaxBurstLength 262144
 MaxRecvDataSegmentLength 8192
 DataPDUInOrder Yes
 ImmediateData No
 MaxXmitDataSegmentLength 8192
 FirstBurstLength 65536
 MaxOutstandingR2T 8
 DataSequenceInOrder Yes
 DataDigest CRC32C,None
 DefaultTime2Wait 2
 MaxConnections 1
 DefaultTime2Retain 20
 ErrorRecoveryLevel 0
 Wthreads 8
  
 -
 II) After particular target iqn.2008-04.com.qualexsystems:Tar1 is
 logged in to an initiator,is it Ethical to add LUN's on that target?

Yes. To find them on the initiator side you then would need to do

iscsiadm -m session --rescan


 --
 III) If CHAP Bidirectional authentication is given to 2 targets
 Tar1 & Tar2, can the same initiator machine log in to both targets?
 

Yes.

 Because for the 2nd target's login, /etc/iscsi/iscsid.conf on the
 initiator should be changed so that login can be possible. But here we
 are losing our previous target's authentication when we overwrite
 iscsid.conf with Tar2's Uname & Pwd.
 
 And still Tar1 is logged in though iscsid.conf doesn't
 contain its authentication parameters, but contains Tar2's. Is it
 accepted behavior or is it weird? Your views?


iscsid.conf is only read when you do iscsiadm -m discovery. iscsiadm
will read iscsid.conf and put those values in
/etc/iscsi/nodes/target/portal/default.

So if you had different CHAP values for the targets they would get their 
own default file in /etc/iscsi/nodes.
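
To see what a given target's record actually contains, the file can be parsed as plain text. This is a minimal Python sketch, assuming the record uses the "key = value" layout quoted earlier in this thread; the function name `parse_node_record` is hypothetical and the layout may differ between open-iscsi versions.

```python
from typing import Dict


def parse_node_record(text: str) -> Dict[str, str]:
    """Parse "key = value" lines from a node record into a dict."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not a setting.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings
```

Reading each target's default file this way and looking up node.session.timeo.replacement_timeout, or the authentication keys, shows the per-target values rather than whatever iscsid.conf currently holds.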

 --
 IV) We are not using Multipath.So i can't try your previous
 suggestion.
 
 Spring for an extra nic on the target and use multipath and set
 queue_if_no_path or no_path_retry queue.
 --
 Is there any other way to make connections persistent?When i used
 open-e(DSS) as target machine,Initiator retained persistent
 connections after target restart.Our target restart code is already
 mentioned earlier.
 

Set the replacement_timeout how I described above.




Re: Should connection restored?

2008-05-09 Thread MAKHU

Hello All,

1. I am using latest iscsitarget-0.4.16-1 from sourceforge.

Now,can you tell whether it is expected to retain the connection with
initiator once the target is restarted?
-
2. You wrote:

The reason for having you do iscsid -d 8 was so we could see why we
cannot log back in

   The iscsid -d 8 log messages can be seen in dmesg, right?

   Check the 2nd-last line of the dmesg output here: connection13:0: iscsi:
detected conn error (1011)
---

iscsi: cmd 0x28 is not queued (7)
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 0
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 8
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 0
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 835576
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 835576
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 0
iscsi: cmd 0x28 is not queued (7)
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 8
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 16
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 0
iscsi: cmd 0x28 is not queued (7)
sd 14:0:0:1: SCSI error: return code = 0x0001
end_request: I/O error, dev sdi, sector 0

scsi14 : iSCSI Initiator over TCP/IP
   Vendor: IET   Model: VIRTUAL-DISK  Rev: 0
   Type:   Direct-Access  ANSI SCSI revision: 04
 SCSI device sdh: 1048576 512-byte hdwr sectors (537 MB)
 sdh: Write Protect is off
 sdh: Mode Sense: 77 00 00 08
 SCSI device sdh: drive cache: write through
 SCSI device sdh: 1048576 512-byte hdwr sectors (537 MB)
 sdh: Write Protect is off
 sdh: Mode Sense: 77 00 00 08
 SCSI device sdh: drive cache: write through
  sdh: sdh1
 sd 14:0:0:0: Attached scsi disk sdh
 sd 14:0:0:0: Attached scsi generic sg6 type 0
   Vendor: IET   Model: VIRTUAL-DISK  Rev: 0
   Type:   Direct-Access  ANSI SCSI revision: 04
 SCSI device sdi: 835584 512-byte hdwr sectors (428 MB)
 sdi: Write Protect is off
 sdi: Mode Sense: 77 00 00 08
 SCSI device sdi: drive cache: write through
 SCSI device sdi: 835584 512-byte hdwr sectors (428 MB)
 sdi: Write Protect is off
 sdi: Mode Sense: 77 00 00 08
 SCSI device sdi: drive cache: write through
  sdi: sdi1
 sd 14:0:0:1: Attached scsi disk sdi
 sd 14:0:0:1: Attached scsi generic sg7 type 0

  connection13:0: iscsi: detected conn error (1011)

session13: iscsi: session recovery timed out after 120 secs
---
3. when session is recovered.

[EMAIL PROTECTED] iscsi]# iscsiadm -m session -P 3
iSCSI Transport Class version 1.1-646
iscsiadm version 2.0-865
Target: iqn.2008-04.com.qualexsystems:Tar1
Current Portal: 192.168.7.173:3260,1
Persistent Portal: 192.168.7.173:3260,1
**
Interface:
**
Iface Name: default
Iface Transport: tcp
Iface IPaddress: default
Iface HWaddress: default
Iface Netdev: default
SID: 13
iSCSI Connection State: IN LOGIN
Internal iscsid Session State: REPOEN

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 8192
MaxXmitDataSegmentLength: 8192
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: No
InitialR2T: Yes
MaxOutstandingR2T: 1

Attached SCSI devices:

Host Number: 14 State: running
scsi14 Channel 00 Id 0 Lun: 0
Attached scsi disk sdh  State: running
scsi14 Channel 00 Id 0 Lun: 1
Attached scsi disk sdi  State: running

Though the disk state is running, the iSCSI Connection State is still IN LOGIN.
So the disks are still unusable.

Conclusion: What is moral of the story i 

Re: Should connection restored?

2008-05-09 Thread Konrad Rzeszutek

On Fri, May 09, 2008 at 02:36:25AM -0700, MAKHU wrote:
 
 Hello All,
 
 1. I am using latest iscsitarget-0.4.16-1 from sourceforge.

And an older version of Open-iSCSI. I would say 868-20. Have you tried
using the one that got released about a week ago?

... snip ...
 
 Host Number: 14 State: running
 scsi14 Channel 00 Id 0 Lun: 0
 Attached scsi disk sdh  State: running
 scsi14 Channel 00 Id 0 Lun: 1
 Attached scsi disk sdi  State: running
 
 Though Disk state is running,still iSCSI Connection State: IN LOGIN.
 So disks still becomes unusable.

Are the disks really unusable? What happens when you do 'dd if=/dev/sdh 
of=/dev/null count=1' ?




Re: Should connection restored?

2008-05-09 Thread Mike Christie

MAKHU wrote:
 Conclusion: What is moral of the story i got is that in our target-
 initiator pair,Persistent connection is not possible.After target/
 Initiator iscsi-target daemon restart(code given below),connection
 is lost causing disks to be unusable.

What you should have taken from the mails was that you should either:

1. Spring for an extra nic on the target and use multipath and set 
queue_if_no_path or no_path_retry queue.

2. set the iscsi node.session.timeo.replacement_timeout timer very high.





Should connection restored?

2008-05-08 Thread MAKHU

When a target is logged in to an initiator and then either the target or
the initiator is restarted, the connection is lost.

Should the connection be restored?



Re: Should connection restored?

2008-05-08 Thread Konrad Rzeszutek

On Thu, May 08, 2008 at 08:52:35AM -0700, MAKHU wrote:
 
 When target is logged in to an initiator and then either target/
 initiator is restarted,connection is lost.

It goes the other way. The initiator logs in to the target.

If the initiator (client) is restarted the connection would be lost.
If the target is restarted it might do one of the following:
 1). If it is a NetApp, send a command asking the initiator to logout.
 2). If it is an EqualLogic, send an AsyncMsg telling the initiator that the block
 device is going to be off-line.
 3). For others it might just terminate the connection without notifying
 the initiator at all. At which point the nop-ping timer (which runs
 by default every 15 seconds) would figure out the connection is lost, kick
 a retry and if that failed terminate the connection.




Re: Should connection restored?

2008-05-08 Thread Mike Christie

Konrad Rzeszutek wrote:

  2). If is a EqualLogic, send a AsyncMsg telling the initiator that the block
  device is going to be off-line.
 What is this? I do not think we handle this. Is it a vendor specific
 async iscsi event?
 
 Yes. To be exact attached is a TCP dump, look for seq no 1774. We do try to
 re-login afterwards.

Ah that is just target requests logout. That is standard and like you 
said we handle it by trying to relogin. You worried me :) I just fixed 
that code in 754 or something and I thought I goofed and now they did an 
actual vendor specific one and we had to add some more new code to 
handle it.


 
  3). For others it might just terminate the connection without notifying
  the initiator at all. At which point the nop-ping timer (which runs
  by default every 15 seconds) would figure out the connection is lost, kick
  a retry and if that failed terminate the connection.
 For 1 and 3, it depends on the version of open-iscsi you are using and
 what the target returns on the retry if it is able to send something at all.

 With the current open-iscsi code, if the tcp/ip socket connect fails we
 continue to retry that forever or until the user manually kills it. If,
 when we try the relogin, the target is still up and responding and
 responds with target not found, then we will kill the connection. If we
 get some other errors we will continue to retry the login.
 
 I thought the re-login routine did some back-off. Like 1 second, 2 seconds,
 then 4 seconds and so on.. Granted the attached dump shows it to try to
 then 4 seconds and so on.. Granted the attached dump shows it to try to

No, we use the default time2wait (DefaultTime2Wait) we got during login
negotiation. We are also a little broken in that we use the time2wait
returned from a logout response when we are not supposed to.

 re-login non-stop and maybe I am confusing it with the 2.0-868-20 which
 had the nop-ping code in the iSCSI daemon instead of in the kernel.


The nop ping code does not change the relogin behavior here.

You are probably remembering linux-iscsi. It will nicely back off.
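
The difference between the two retry policies discussed here can be illustrated with a small Python sketch. Neither function is taken from open-iscsi or linux-iscsi; the names are hypothetical, the cap on the back-off is an assumption, and the value 2 in the demo is simply the DefaultTime2Wait from the ietd settings posted in this thread.

```python
from itertools import islice
from typing import Iterator


def fixed_delays(time2wait: int) -> Iterator[int]:
    """Retry delay as described above: always the negotiated time2wait."""
    while True:
        yield time2wait


def backoff_delays(start: int = 1, cap: int = 60) -> Iterator[int]:
    """Doubling delay (1, 2, 4, ...) up to a cap; the cap is an assumption."""
    delay = start
    while True:
        yield delay
        delay = min(delay * 2, cap)


if __name__ == "__main__":
    print(list(islice(fixed_delays(2), 5)))    # [2, 2, 2, 2, 2]
    print(list(islice(backoff_delays(), 5)))   # [1, 2, 4, 8, 16]
```

With a fixed delay the relogin attempts look non-stop in a packet dump, which matches what the attached dump reportedly showed.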


