Re: Configuration question

2008-08-20 Thread Mike Christie

v42bis wrote:
 I have changed the timeouts in iscsid.conf as directed in the iSCSI
 Root section 8.2 of the README so that in case the open-iscsi
 initiator loses connection or has other communications problems with
 my OpenSolaris target then the open-iscsi initiator will wait for up
 to 24 hours before it fails to the SCSI layer:
 
 node.session.timeo.replacement_timeout = 86400
 node.conn[0].timeo.login_timeout = 15
 node.conn[0].timeo.logout_timeout = 15
 node.conn[0].timeo.noop_out_interval = 0
 node.conn[0].timeo.noop_out_timeout = 0
 
 My OpenSolaris target recently core dumped and came back online in
 about 5 minutes. By that time, all of my ext3 partitions mounted over
 iscsi had aborted their journals. Shouldn't iscsi wait for 24 hours
 before I see any failures on my SCSI layer affecting my ext3
 partitions?

It should have. Do you have the logs? Do you see something about the
replacement or recovery timeout expiring? It would report the correct
86400 value, but the log would show that it fired a lot sooner, like
the 5 minutes you mention. If that is what happened, you may be hitting
a bug where the kernel cannot support long timeouts: basically, the
kernel's timer rolls over and does not catch itself correctly, or maybe
we are not supposed to be setting the timeout that high. We are still
investigating to see who is at fault.
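
As a quick sanity check on the values quoted above, here is a minimal Python sketch (not part of open-iscsi) that parses iscsid.conf-style "name = value" lines and confirms that a replacement_timeout of 86400 seconds is 24 hours. The helper name parse_settings is made up for illustration.

```python
# Illustrative only: parse "name = value" lines as they appear in iscsid.conf.
SETTINGS = """\
node.session.timeo.replacement_timeout = 86400
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
"""


def parse_settings(text):
    """Return a dict of setting name -> integer value (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        settings[name.strip()] = int(value.strip())
    return settings


settings = parse_settings(SETTINGS)
hours = settings["node.session.timeo.replacement_timeout"] / 3600
print(f"replacement_timeout = {hours:g} hours")
```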

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Klemens Kittan
On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
 Klemens Kittan wrote:
  On Monday, 18 August 2008 20:10, Mike Christie wrote:
  Klemens Kittan wrote:
  On Friday, 15 August 2008 20:03, Mike Christie wrote:
  Mike Christie wrote:
  Klemens Kittan wrote:
  Here is the configuration of my debian kernel (2.6.25-2).
 
  Thanks. It looks like your target is responding to other IO, but did
  not respond to the ping quick enough so it timed out. Let me make a
  patch for you to test. I should hopefully have it later today.
 
  Try the attached patch over  open-iscsi-2.0-869.2 tarball modules.
 
  To apply the patch untar and unzip the source then cd to the dir. Then
  do:
 
  patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch
 
  Then do the normal make and make install. You will probably want to
  reboot the box to make sure you are using the new modules.
 
  Unfortunately I got the same errors.
 
  Could you send the log output?
 
  Here is the /var/log/syslog.

 Shoot. For some reason that nop is just not finishing in a decent amount
 of time. Could you try the attached patch. It gives the nop even more
 time to complete and it spits out a bunch of debug info to make sure
 open-iscsi did not leak the task.


Unfortunately, the attached file is empty.

Thanks,
Klemens





Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie

Klemens Kittan wrote:
 On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
 Klemens Kittan wrote:
 On Monday, 18 August 2008 20:10, Mike Christie wrote:
 Klemens Kittan wrote:
 On Friday, 15 August 2008 20:03, Mike Christie wrote:
 Mike Christie wrote:
 Klemens Kittan wrote:
 Here is the configuration of my debian kernel (2.6.25-2).
 Thanks. It looks like your target is responding to other IO, but did
 not respond to the ping quick enough so it timed out. Let me make a
 patch for you to test. I should hopefully have it later today.
 Try the attached patch over  open-iscsi-2.0-869.2 tarball modules.

 To apply the patch untar and unzip the source then cd to the dir. Then
 do:

 patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch

 Then do the normal make and make install. You will probably want to
 reboot the box to make sure you are using the new modules.
 Unfortunately I got the same errors.
 Could you send the log output?
 Here is the /var/log/syslog.
 Shoot. For some reason that nop is just not finishing in a decent amount
 of time. Could you try the attached patch. It gives the nop even more
 time to complete and it spits out a bunch of debug info to make sure
 open-iscsi did not leak the task.

 
 Unfortunately, the attached file is empty.
 

Oh yeah, if you just log into the target and do not do any IO to the
disks, do you see any messages like this?

Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping timeout of 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected conn error (1011)
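
To read a message like the one above: the three large numbers are jiffies counters. Below is a minimal Python sketch, assuming the kernel was built with HZ=250 (an assumption; the thread does not state the tick rate). Under that assumption the gap between the last ping and "now" works out to exactly the 10-second ping timeout. The regex and the function name are illustrative only.

```python
import re

HZ = 250  # assumed kernel tick rate; not stated in the thread

LINE = ("connection2:0: ping timeout of 10 secs expired, last rx 4315069195, "
        "last ping 4315067926, now 4315070426")

PATTERN = re.compile(r"last rx (\d+), last ping (\d+), now (\d+)")


def ping_deltas(line, hz=HZ):
    """Return (seconds since last rx, seconds since last ping)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a ping timeout line")
    last_rx, last_ping, now = (int(g) for g in match.groups())
    return (now - last_rx) / hz, (now - last_ping) / hz


since_rx, since_ping = ping_deltas(LINE)
print(f"since last rx: {since_rx:.3f}s, since last ping: {since_ping:.3f}s")
```

With a different HZ the absolute numbers change, but the ratio between the two gaps does not.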




Re: iscsi errors with debian-kernel 2.6.24-3-686

2008-08-20 Thread Mike Christie

Klemens Kittan wrote:
 On Wednesday, 20 August 2008 09:43, Mike Christie wrote:
 Klemens Kittan wrote:
 On Tuesday, 19 August 2008 19:15, Mike Christie wrote:
 Klemens Kittan wrote:
 On Monday, 18 August 2008 20:10, Mike Christie wrote:
 Klemens Kittan wrote:
 On Friday, 15 August 2008 20:03, Mike Christie wrote:
 Mike Christie wrote:
 Klemens Kittan wrote:
 Here is the configuration of my debian kernel (2.6.25-2).
 Thanks. It looks like your target is responding to other IO, but
 did not respond to the ping quick enough so it timed out. Let me
 make a patch for you to test. I should hopefully have it later
 today.
 Try the attached patch over  open-iscsi-2.0-869.2 tarball modules.

 To apply the patch untar and unzip the source then cd to the dir.
 Then do:

 patch -p1 -i where-the-patch-is-saved/relax-ping-timer.patch

 Then do the normal make and make install. You will probably want to
 reboot the box to make sure you are using the new modules.
 Unfortunately I got the same errors.
 Could you send the log output?
 Here is the /var/log/syslog.
 Shoot. For some reason that nop is just not finishing in a decent amount
 of time. Could you try the attached patch. It gives the nop even more
 time to complete and it spits out a bunch of debug info to make sure
 open-iscsi did not leak the task.
 Unfortunately, the attached file is empty.
 Oh yeah, if you just log into the target and do not do any IO to the
 disks. Do you see any messages like this:

 Aug 14 09:52:23 baltrum kernel: [81064.665749]  connection2:0: ping
 timeout of
 10 secs expired, last rx 4315069195, last ping 4315067926, now 4315070426
 Aug 14 09:52:23 baltrum kernel: [81064.669756]  connection2:0: detected
 conn
 error (1011)

 
 I get these messages all the time (with and without IO traffic):

You should get these (the send/iscsi_free_mgmt_task debug lines quoted
below come from the patch). I am just worried about getting these

  Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
  Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn error (1011)

when there is no IO traffic.

 Aug 20 09:49:13 baltrum kernel: [168482.943791] send 8100f9c541c0
 Aug 20 09:49:13 baltrum kernel: [168483.026754] send 81003797adc0
 Aug 20 09:49:13 baltrum kernel: [168483.026817] iscsi_free_mgmt_task 8100f9c541c0
 Aug 20 09:49:13 baltrum kernel: [168483.031189] iscsi_free_mgmt_task 81003797adc0
 Aug 20 09:49:18 baltrum kernel: [168487.772859] send 8100f9c54140
 Aug 20 09:49:18 baltrum kernel: [168488.018304] iscsi_free_mgmt_task 8100f9c54140
 Aug 20 09:49:18 baltrum kernel: [168488.018342] send 81003797aac0
 Aug 20 09:49:18 baltrum kernel: [168488.026632] iscsi_free_mgmt_task 81003797aac0
 
 With IO traffic I get these messages:


Could you give me a large chunk of the log? I need the stuff that 
happened before this part.

 Aug 20 09:56:10 baltrum kernel: [168687.391990]  connection1:0: ping timeout of 10 secs with recv timeout of 5 secs expired last rx 4336967839, last ping 4336967081, now 4336970339 task 81003797aac0
 Aug 20 09:56:10 baltrum kernel: [168687.396001]  connection1:0: detected conn error (1011)
 Aug 20 09:56:10 baltrum iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
 Aug 20 09:56:14 baltrum iscsid: connection1:0 is operational after recovery (1 attempts)
 
 Thanks,
 Klemens
 





Re: iSER login process

2008-08-20 Thread Erez Zilber

On Wed, Aug 20, 2008 at 12:04 AM, Jesse Butler [EMAIL PROTECTED] wrote:


 Ok, I've tried the configuration and login now whilst specifying the
 TPGT.  I don't hit the same error now, but I do see this:

 # iscsiadm -m node -T iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p 10.8.0.6:3260 -l
 Login session [iface: default, target: iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346, portal: 10.8.0.6,3260]
 iscsiadm: initiator reported error (14 - iSCSI driver does not support requested capability.)
 iscsiadm: Could not execute operation on all records. Err 107.

 So, progress!

 Here is the set of operations I performed.

 Thanks
 Jesse


 # iscsiadm -m node -T iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p 10.8.0.6:3260,1 -o new
 New iSCSI node [tcp:[hw=default,ip=,net_if=default,iscsi_if=default] 10.8.0.6,3260,1 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346] added

 # iscsiadm -m node -T iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p 10.8.0.6:3260,1
 node.name = iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346
 node.tpgt = 1
 node.startup = manual
 iface.hwaddress = default
 iface.iscsi_ifacename = default
 iface.net_ifacename = default
 iface.transport_name = tcp
 node.discovery_address = empty
 node.discovery_port = 0
 node.discovery_type = static
 node.session.initial_cmdsn = 0
 node.session.initial_login_retry_max = 4
 node.session.cmds_max = 128
 node.session.queue_depth = 32
 node.session.auth.authmethod = None
 node.session.auth.username = empty
 node.session.auth.password = empty
 node.session.auth.username_in = empty
 node.session.auth.password_in = empty
 node.session.timeo.replacement_timeout = 120
 node.session.err_timeo.abort_timeout = 10
 node.session.err_timeo.reset_timeout = 30
 node.session.iscsi.FastAbort = Yes
 node.session.iscsi.InitialR2T = No
 node.session.iscsi.ImmediateData = Yes
 node.session.iscsi.FirstBurstLength = 262144
 node.session.iscsi.MaxBurstLength = 16776192
 node.session.iscsi.DefaultTime2Retain = 0
 node.session.iscsi.DefaultTime2Wait = 2
 node.session.iscsi.MaxConnections = 1
 node.session.iscsi.MaxOutstandingR2T = 1
 node.session.iscsi.ERL = 0
 node.conn[0].address = 10.8.0.6
 node.conn[0].port = 3260
 node.conn[0].startup = manual
 node.conn[0].tcp.window_size = 524288
 node.conn[0].tcp.type_of_service = 0
 node.conn[0].timeo.logout_timeout = 15
 node.conn[0].timeo.login_timeout = 15
 node.conn[0].timeo.auth_timeout = 45
 node.conn[0].timeo.active_timeout = 5
 node.conn[0].timeo.idle_timeout = 60
 node.conn[0].timeo.ping_timeout = 5
 node.conn[0].timeo.noop_out_interval = 10
 node.conn[0].timeo.noop_out_timeout = 15
 node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
 node.conn[0].iscsi.HeaderDigest = None,CRC32C

I think that this is the problem. iSER doesn't use
HeaderDigest/DataDigest. I strongly suggest that you use
iscsi_discovery which does all the work for you (including setting
HeaderDigest/DataDigest to None).

Erez
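
If iscsi_discovery is not available, the HeaderDigest value on an existing node record can also be changed with iscsiadm's update operation. The Python sketch below only builds and prints such a command line; it does not run it. The target name and portal are the ones from this thread, and the helper name header_digest_update_cmd is made up for illustration.

```python
import shlex


def header_digest_update_cmd(target, portal, value="None"):
    """Build (but do not run) an iscsiadm command that updates HeaderDigest."""
    return [
        "iscsiadm", "-m", "node",
        "-T", target,
        "-p", portal,
        "-o", "update",
        "-n", "node.conn[0].iscsi.HeaderDigest",
        "-v", value,
    ]


cmd = header_digest_update_cmd(
    "iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346",
    "10.8.0.6:3260",
)
# shlex.join quotes the bracketed setting name so it is safe to paste into a shell.
print(shlex.join(cmd))
```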




RE: Reducing amount of logmessages openiscsi/multipath [with md3000i]

2008-08-20 Thread Kees Hoekzema



 -Original Message-
 He wants the RDAC hw handler and its path checker (rdac). The MD3000i is
 an LSI re-branded box and is the same as the IBM DS3300.
 
 
  You also want to make sure that you are using the md3000i hw handler or
  scsi_dh module if you are not already. The dm-devel guys can help you
  out.
 
 Here is the multipath.conf he should be using for the MD3000i (or for the
 DS3300, but you will need to modify the vendor/model entry):
 
 # Note: The same as the IBM DS3300
 #
 device {
         vendor                  "DELL"
         product                 "MD3000i"
         product_blacklist       "Universal Xport"
         features                "1 queue_if_no_path"
         path_grouping_policy    group_by_prio
         hardware_handler        "1 rdac"
         path_checker            rdac
         prio                    rdac
         failback                immediate
 }
 
 

This fixed the problem, although a simple flush was not enough to activate
the changes, so it took a bit longer to get the result I wanted. In the end
I saw that it was still using the directio path_checker and noticed that
you cannot change the 'path_checker' on the fly with just a multipath -F /
-v 2; a reload of the modules was needed.

$ multipath -ll
webdata (36001ec9000d16311067f484e260c) dm-0 DELL,MD3000i
[size=2.0T][features=0][hwhandler=1 rdac]
\_ round-robin 0 [prio=1000][active]
 \_ 0:0:0:0 sda 8:0   [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 1:0:0:0 sdb 8:16  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 2:0:0:0 sdc 8:32  [active][ghost]
 \_ 3:0:0:0 sdd 8:48  [active][ghost]

That is the output of multipath now, and no more spam in my syslog or dmesg.

(The different priorities you see are because the second path (with a prio
of 1) is a longer path through 1 more switch, so I'd rather not give it a
high priority, and if the '1000' path fails, it just uses the '1' path.)
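
The failover behaviour described above can be sketched in a few lines of Python. This is only a toy model of priority-based group selection, not how multipathd is implemented; the data layout and the pick_group function are invented for illustration, and only paths in the "ready" state are treated as usable.

```python
import copy

# Toy model of the path groups shown in the multipath -ll output above.
groups = [
    {"prio": 1000, "paths": {"sda": "ready"}},
    {"prio": 1, "paths": {"sdb": "ready"}},
    {"prio": 0, "paths": {"sdc": "ghost", "sdd": "ghost"}},
]


def pick_group(candidates):
    """Return the highest-priority group with at least one ready path."""
    usable = [g for g in candidates if "ready" in g["paths"].values()]
    if not usable:
        return None
    return max(usable, key=lambda g: g["prio"])


healthy = pick_group(groups)

# Simulate losing the path in the prio-1000 group.
degraded_groups = copy.deepcopy(groups)
degraded_groups[0]["paths"]["sda"] = "failed"
degraded = pick_group(degraded_groups)

print(healthy["prio"], degraded["prio"])
```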

A big thanks to everyone who directed me on the right path ;)

-kees





Re: Debian Lenny; Kernel 2.6.26, Iser, Mellanox : iser_cma_handler:event: 3, error: -110

2008-08-20 Thread Dr. Volker Jaenisch

Hello Mike!

Mike Christie wrote:
 I used the following commands:

 tgtadm --lld iscsi --op new --mode target --tid 1 -T
 de.inqbus.poseidon:disk1
 tgtadm --lld iscsi --op bind --mode target --tid 1 -I 10.6.0.1
 tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b
 /dev/vg1/test
 

 Could you try a
 tgtadm --lld fcoe --op bind --mode target --tid 1 -I ALL
   
./tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL

results in exactly the same behavior

Aug 20 14:44:51 hades kernel: [97036.613263] iser: iser_connect:connecting to: 10.6.0.2, port 0xbc0c
Aug 20 14:44:51 hades kernel: [97036.616516] iser: iser_cma_handler:event 0 conn 81000c513080 id 81000d8d1800
Aug 20 14:44:51 hades kernel: [97036.871932] iser: iscsi_iser_ep_poll:ib conn 81000c513080 rc = 0
Aug 20 14:44:51 hades kernel: [97037.179318] iser: iscsi_iser_ep_poll:ib conn 81000c513080 rc = 0
Aug 20 14:44:51 hades kernel: [97037.439327] iser: iscsi_iser_ep_poll:ib conn 81000c513080 rc = 0
Aug 20 14:44:52 hades kernel: [97037.663318] iser: iser_cma_handler:event 3 conn 81000c513080 id 81000d8d1800
Aug 20 14:44:52 hades kernel: [97037.663318] iser: iser_cma_handler:event: 3, error: -110
Aug 20 14:44:52 hades kernel: [97037.703064] iser: iscsi_iser_ep_poll:ib conn 81000c513080 rc = -1
Aug 20 14:44:55 hades kernel: [97041.675045] iser: iscsi_iser_ep_disconnect:ib conn 81000c513080 state 4
Aug 20 14:44:55 hades kernel: [97041.675076] iser: iser_conn_terminate:Failed to disconnect, conn: 0x81000c513080 err -22
Aug 20 14:44:55 hades kernel: [97041.675121] iser:
iser_free_ib_conn_res:freeing conn 81000c513080 cma_id
81000d8d1800 fmr pool 0
000 qp 
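
For anyone reading along, some of the numbers in the log above can be decoded with a short Python sketch. The errno names are the standard Linux values (110 is ETIMEDOUT, 22 is EINVAL); the table is hard-coded so the snippet does not depend on the host platform. Reading the port as a byte-swapped value is an interpretation, not something stated in the thread: 0xbc0c with its two bytes swapped is 0x0cbc, which is 3260, the usual iSCSI port. The function names are illustrative.

```python
# Linux errno values, hard-coded so the example is platform independent.
LINUX_ERRNO = {22: "EINVAL", 110: "ETIMEDOUT"}


def decode_kernel_error(code):
    """Map a negative kernel return code such as -110 to its errno name."""
    return LINUX_ERRNO.get(-code, "unknown")


def swap16(value):
    """Swap the two bytes of a 16-bit value (network order <-> little-endian)."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


print(decode_kernel_error(-110))
print(decode_kernel_error(-22))
print(swap16(0xBC0C))
```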

Any other ideas ?

Best regards

Volker

-- 

   inqbus it-consulting  +49 ( 341 )  5643800
   Dr.  Volker Jaenisch  http://www.inqbus.de
   Herloßsohnstr.12  0 4 1 5 5Leipzig
   N  O  T -  F Ä L L E  +49 ( 170 )  3113748






Re: iSER login process

2008-08-20 Thread Jesse Butler



Erez Zilber wrote:
 On Wed, Aug 20, 2008 at 12:04 AM, Jesse Butler [EMAIL PROTECTED] wrote:
   
 Ok, I've tried the configuration and login now whilst specifying the
 TPGT.  I don't hit the same error now, but I do see this:

 # iscsiadm -m node -T
 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p
 10.8.0.6:3260 -l
 Login session [iface: default, target:
 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346, portal:
 10.8.0.6,3260]
 iscsiadm: initiator reported error (14 - iSCSI driver does not support
 requested capability.)
 iscsiadm: Could not execute operation on all records. Err 107.

 So, progress!

 Here is the set of operations I performed.

 Thanks
 Jesse


 # iscsiadm -m node -T
 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p
 10.8.0.6:3260,1 -o new
 New iSCSI node [tcp:[hw=default,ip=,net_if=default,iscsi_if=default]
 10.8.0.6,3260,1
 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346] added

 # iscsiadm -m node -T
 iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p
 10.8.0.6:3260,1
 node.name = iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346
 node.tpgt = 1
 node.startup = manual
 iface.hwaddress = default
 iface.iscsi_ifacename = default
 iface.net_ifacename = default
 iface.transport_name = tcp
 node.discovery_address = empty
 node.discovery_port = 0
 node.discovery_type = static
 node.session.initial_cmdsn = 0
 node.session.initial_login_retry_max = 4
 node.session.cmds_max = 128
 node.session.queue_depth = 32
 node.session.auth.authmethod = None
 node.session.auth.username = empty
 node.session.auth.password = empty
 node.session.auth.username_in = empty
 node.session.auth.password_in = empty
 node.session.timeo.replacement_timeout = 120
 node.session.err_timeo.abort_timeout = 10
 node.session.err_timeo.reset_timeout = 30
 node.session.iscsi.FastAbort = Yes
 node.session.iscsi.InitialR2T = No
 node.session.iscsi.ImmediateData = Yes
 node.session.iscsi.FirstBurstLength = 262144
 node.session.iscsi.MaxBurstLength = 16776192
 node.session.iscsi.DefaultTime2Retain = 0
 node.session.iscsi.DefaultTime2Wait = 2
 node.session.iscsi.MaxConnections = 1
 node.session.iscsi.MaxOutstandingR2T = 1
 node.session.iscsi.ERL = 0
 node.conn[0].address = 10.8.0.6
 node.conn[0].port = 3260
 node.conn[0].startup = manual
 node.conn[0].tcp.window_size = 524288
 node.conn[0].tcp.type_of_service = 0
 node.conn[0].timeo.logout_timeout = 15
 node.conn[0].timeo.login_timeout = 15
 node.conn[0].timeo.auth_timeout = 45
 node.conn[0].timeo.active_timeout = 5
 node.conn[0].timeo.idle_timeout = 60
 node.conn[0].timeo.ping_timeout = 5
 node.conn[0].timeo.noop_out_interval = 10
 node.conn[0].timeo.noop_out_timeout = 15
 node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
 node.conn[0].iscsi.HeaderDigest = None,CRC32C
 

 I think that this is the problem. iSER doesn't use
 HeaderDigest/DataDigest. I strongly suggest that you use
 iscsi_discovery which does all the work for you (including setting
 HeaderDigest/DataDigest to None).

 Erez

   

Hello Erez-

The HeaderDigest setting here indicates a list of options [None, 
CRC32C].  If running on iSER, we'll negotiate to None, and all will be 
well.
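
A minimal sketch of the list negotiation described here, assuming the usual iSCSI rule that the responder picks the first value in the offered list that it also supports and answers "Reject" when none match. The negotiate function is illustrative and is not open-iscsi code.

```python
def negotiate(offered, supported):
    """Return the first offered value the responder supports, else "Reject"."""
    for value in offered:
        if value in supported:
            return value
    return "Reject"


initiator_offer = ["None", "CRC32C"]

# A responder that does not use digests supports only "None".
iser_result = negotiate(initiator_offer, {"None"})

# A responder that supports both still lands on "None", the first offer.
tcp_result = negotiate(initiator_offer, {"None", "CRC32C"})

print(iser_result, tcp_result)
```

The order of the offer matters: with "CRC32C,None" a responder that supports both would select CRC32C instead.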

I would like to take your advice, but the distribution that I am using 
does not have the iscsi_discovery with the -t option, so I just used 
static.  It could be that what I'm running just won't work (we have 
discussed offline a known-to-work configuration, I will try that).  As 
an aside, I think it's kinda nutty that there's a chance that the RHEL 
5.2 config doesn't work... since, eh, well it's ship

There may be something else in the config, though.  I'm just trying to 
figure out what this is:

# iscsiadm -m node -T iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346 -p 10.8.0.6:3260 -l
Login session [iface: default, target: iqn.1986-03.com.sun:02:aff22998-3466-4bf4-ee3c-958fd4b5d346, portal: 10.8.0.6,3260]
iscsiadm: initiator reported error (14 - iSCSI driver does not support requested capability.)
iscsiadm: Could not execute operation on all records. Err 107.
#


I have yet to find it in the code (but I do have my day job).  If I 
don't hear anything, I'll just roll back to RHEL 5.2.

Best
Jesse










Re: Configuration question

2008-08-20 Thread v42bis



On Aug 20, 1:39 am, Mike Christie [EMAIL PROTECTED] wrote:
 v42bis wrote:
  Thanks for the reply, Mike.

 No problem.

  The iscsi connections failed about 1m13s after my iscsi target went
  down (timestamps that follow are synced from same ntp master, however
  clock skew may account for a few seconds difference [1m45sec seems
  very conspicuous - a multiplier of default 15sec timers?]). The target
  went down at Aug 19 13:33:33.

 Actually this looks like a different problem. What version of open-iscsi
 are you using? Do a iscsiadm -P 3. The top part should dump the
 iscsiadm version.

`iscsiadm -P 3` just spits out the usage/help information - no
version. I know it is version open-iscsi-2.0-865.15, though.


  Aug 19 13:36:42 ak1-vz2 kernel: iscsi: scsi conn_destroy(): host_busy
  0 host_failed 0

 This means that userspace decided to kill the iscsi session/connection
 which means that we ignore the recovery/replacement timeout and just
 kill everything which forces IO errors. We only did this for fatal
 errors, but we should not do that anymore.

What userspace process would have done that?


  The above did not affect normal operation of my open-iscsi initiators.

 That is weirder. In this setup do you have multiple
 sessions/connections? When you checked the machine were all the
 session/connections running? There should have been two sessions that
 were destroyed.

Only one session per connection. One connection to each iscsi target.

All of the filesystems and iscsi connections seemed fine, as far as I
could tell.


 In older open-iscsi userspace tools there were certain errors the target
 could send us and iscsid would consider it a fatal error and it would
 kill the sessions like above. For example if a target was shutting down
 it could tell us that it was not coming back, so we would kill the
 session. There was also a case where iscsid got confused and thought it
 was a fatal error and would kill the session. We now just retry forever
 or until the user kills the session manually to avoid problems like this.

To confirm: open-iscsi version 2.0-869.2 and above will never kill
iscsi sessions unless the user explicitly tells iscsid to logout/kill
the session? I want to make sure my open-iscsi initiators never return
errors until replacement_timeout is reached. I'd rather have any
processes accessing filesystems on iscsi hang forever than have the
connections lost and journals aborted.

Looking at the code, is there any problem with setting such a high
replacement_timeout?


 Please tell me you were using an older version than open-iscsi-2.0-869.2
 :) If you were using open-iscsi-2.0-869.2 then we have a different
 problem :(

I am definitely running 2.0-865.15. I will upgrade to 2.0-869.2.

It would be *very* convenient if the Changelog would include changes
in every version and not just the current release. :)

