Hi Lee,

Thanks for adding the async login support to upstream. I've run some tests 
using the iscsiadm built from there, and I have a couple of questions:

1. How can the return status of the async logins be gathered? If I 
understood correctly, the proposed way is to look for the connections in 
the output of "iscsiadm -m session" after the async logins were launched.
Currently I am using a sampling loop (sketched below) that checks the 
output of "iscsiadm -m session" at 1-second intervals for the presence of 
the expected target and portal connections, and breaks once all of them are 
found or the expected timeout interval elapses, which for the default iSCSI 
settings I take to be:

(120 seconds timeout per connection login) * (number of connections) / (number of workers)

Is there a better way? I am not sure how to gather the error status when a 
connection fails to log in in such a case.
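
For reference, this is roughly the sampling loop (a minimal sketch only; 
wait_for_sessions is an illustrative helper name, not the actual test code, 
and the parsing assumes the usual "tcp: [N] <portal> <target> ..." line 
format of "iscsiadm -m session"):

    import subprocess
    import time

    def wait_for_sessions(expected, timeout, interval=1.0):
        """Poll 'iscsiadm -m session' until every expected
        (target, portal) pair appears or the timeout elapses.
        Returns the set of pairs that never showed up."""
        deadline = time.monotonic() + timeout
        missing = set(expected)
        while missing and time.monotonic() < deadline:
            cp = subprocess.run(["iscsiadm", "-m", "session"],
                                capture_output=True, text=True)
            # A non-zero rc typically means there are no active sessions yet.
            out = cp.stdout if cp.returncode == 0 else ""
            # Typical session line:
            # tcp: [1] 10.35.18.220:3260,1 iqn.2003-01.org.vm-18-220.iqn1 (non-flash)
            for line in out.splitlines():
                fields = line.split()
                if len(fields) >= 4:
                    portal, target = fields[2], fields[3]
                    missing.discard((target, portal))
            if missing:
                time.sleep(interval)
        return missing

When this returns a non-empty set I only know which connections never 
showed up, not why they failed to log in.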

2. Would async login also be supported for non-login-all mode? With 
"iscsiadm -m node -T target -p portal -I interface --login" I get the same 
timeouts with and without the --no-wait flag: in both cases the test waits 
240 seconds when two connections are down and a single node-login worker is 
used, so I assume the feature currently does not apply to this login mode. 
A quick check of the arithmetic follows.

-- Simulating one portal down (2 connections down) with one worker, using 
node login without --no-wait

# python3 ./initiator.py  -j 1 -i 10.35.18.220 10.35.18.156  -d 
10.35.18.156 

2020-08-18 15:59:01,874 INFO    (MainThread) Removing prior sessions and 
nodes
2020-08-18 15:59:01,882 INFO    (MainThread) Deleting all nodes
2020-08-18 15:59:01,893 INFO    (MainThread) No active sessions
2020-08-18 15:59:01,943 INFO    (MainThread) Setting 10.35.18.156 as 
invalid address for target iqn.2003-01.org.vm-18-220.iqn2
2020-08-18 15:59:01,943 INFO    (MainThread) Setting 10.35.18.156 as 
invalid address for target iqn.2003-01.org.vm-18-220.iqn1
2020-08-18 15:59:01,943 INFO    (MainThread) Discovered connections: 
{('iqn.2003-01.org.vm-18-220.iqn1', '0.0.0.0:0,0'), 
('iqn.2003-01.org.vm-18-220.iqn2', '0.0.0.0:0,0'), 
('iqn.2003-01.org.vm-18-220.iqn2', '10.35.18.220:3260,1'), 
('iqn.2003-01.org.vm-18-220.iqn1', '10.35.18.220:3260,1')}
2020-08-18 15:59:01,944 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn1 portal 0.0.0.0:0,0
2020-08-18 15:59:01,956 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn2 portal 0.0.0.0:0,0
2020-08-18 15:59:01,968 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn2 portal 10.35.18.220:3260,1
2020-08-18 15:59:01,980 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn1 portal 10.35.18.220:3260,1
2020-08-18 15:59:01,995 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn1 portal 0.0.0.0:0,0 (nowait=False)
2020-08-18 16:01:02,019 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn2 portal 0.0.0.0:0,0 (nowait=False)
2020-08-18 16:01:02,028 ERROR   (MainThread) Job failed: Command 
['iscsiadm', '--mode', 'node', '--targetname', 
'iqn.2003-01.org.vm-18-220.iqn1', '--interface', 'default', '--portal', 
'0.0.0.0:0,0', '--login'] failed rc=8 out='Logging in to [iface: default, 
target: iqn.2003-01.org.vm-18-220.iqn1, portal: 0.0.0.0,0]' err='iscsiadm: 
Could not login to [iface: default, target: iqn.2003-01.org.vm-18-220.iqn1, 
portal: 0.0.0.0,0].\niscsiadm: initiator reported error (8 - connection 
timed out)\niscsiadm: Could not log into all portals'
2020-08-18 16:03:02,045 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn2 portal 10.35.18.220:3260,1 (nowait=False)
2020-08-18 16:03:02,053 ERROR   (MainThread) Job failed: Command 
['iscsiadm', '--mode', 'node', '--targetname', 
'iqn.2003-01.org.vm-18-220.iqn2', '--interface', 'default', '--portal', 
'0.0.0.0:0,0', '--login'] failed rc=8 out='Logging in to [iface: default, 
target: iqn.2003-01.org.vm-18-220.iqn2, portal: 0.0.0.0,0]' err='iscsiadm: 
Could not login to [iface: default, target: iqn.2003-01.org.vm-18-220.iqn2, 
portal: 0.0.0.0,0].\niscsiadm: initiator reported error (8 - connection 
timed out)\niscsiadm: Could not log into all portals'
2020-08-18 16:03:02,321 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn1 portal 10.35.18.220:3260,1 (nowait=False)
2020-08-18 16:03:02,695 INFO    (MainThread) Connecting completed in 
240.752s

-- Simulating one portal down (2 connections down) with one worker, using 
node login with --no-wait

# python3 ./initiator.py  -j 1 -i 10.35.18.220 10.35.18.156  -d 
10.35.18.156  --nowait

2020-08-18 16:16:05,802 INFO    (MainThread) Removing prior sessions and 
nodes
2020-08-18 16:16:06,075 INFO    (MainThread) Deleting all nodes
2020-08-18 16:16:06,090 INFO    (MainThread) No active sessions
2020-08-18 16:16:06,130 INFO    (MainThread) Setting 10.35.18.156 as 
invalid address for target iqn.2003-01.org.vm-18-220.iqn2
2020-08-18 16:16:06,131 INFO    (MainThread) Setting 10.35.18.156 as 
invalid address for target iqn.2003-01.org.vm-18-220.iqn1
2020-08-18 16:16:06,131 INFO    (MainThread) Discovered connections: 
{('iqn.2003-01.org.vm-18-220.iqn2', '10.35.18.220:3260,1'), 
('iqn.2003-01.org.vm-18-220.iqn1', '0.0.0.0:0,0'), 
('iqn.2003-01.org.vm-18-220.iqn1', '10.35.18.220:3260,1'), 
('iqn.2003-01.org.vm-18-220.iqn2', '0.0.0.0:0,0')}
2020-08-18 16:16:06,132 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn2 portal 10.35.18.220:3260,1
2020-08-18 16:16:06,147 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn1 portal 0.0.0.0:0,0
2020-08-18 16:16:06,162 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn1 portal 10.35.18.220:3260,1
2020-08-18 16:16:06,176 INFO    (MainThread) Adding node for target 
iqn.2003-01.org.vm-18-220.iqn2 portal 0.0.0.0:0,0
2020-08-18 16:16:06,190 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn2 portal 10.35.18.220:3260,1 (nowait=True)
2020-08-18 16:16:06,324 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn1 portal 0.0.0.0:0,0 (nowait=True)
2020-08-18 16:18:06,351 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn1 portal 10.35.18.220:3260,1 (nowait=True)
2020-08-18 16:18:06,356 ERROR   (MainThread) Job failed: Command 
['iscsiadm', '--mode', 'node', '--targetname', 
'iqn.2003-01.org.vm-18-220.iqn1', '--interface', 'default', '--portal', 
'0.0.0.0:0,0', '--login', '--no_wait'] failed rc=8 out='Logging in to 
[iface: default, target: iqn.2003-01.org.vm-18-220.iqn1, portal: 
0.0.0.0,0]' err='iscsiadm: Could not login to [iface: default, target: 
iqn.2003-01.org.vm-18-220.iqn1, portal: 0.0.0.0,0].\niscsiadm: initiator 
reported error (8 - connection timed out)\niscsiadm: Could not log into all 
portals'
2020-08-18 16:18:06,589 INFO    (login_0) Login to target 
iqn.2003-01.org.vm-18-220.iqn2 portal 0.0.0.0:0,0 (nowait=True)
2020-08-18 16:20:06,643 ERROR   (MainThread) Job failed: Command 
['iscsiadm', '--mode', 'node', '--targetname', 
'iqn.2003-01.org.vm-18-220.iqn2', '--interface', 'default', '--portal', 
'0.0.0.0:0,0', '--login', '--no_wait'] failed rc=8 out='Logging in to 
[iface: default, target: iqn.2003-01.org.vm-18-220.iqn2, portal: 
0.0.0.0,0]' err='iscsiadm: Could not login to [iface: default, target: 
iqn.2003-01.org.vm-18-220.iqn2, portal: 0.0.0.0,0].\niscsiadm: initiator 
reported error (8 - connection timed out)\niscsiadm: Could not log into all 
portals'
2020-08-18 16:20:06,656 INFO    (MainThread) Connecting completed in 
240.524s
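
For completeness, the login workers in the test script follow roughly this 
pattern (a sketch only; login and login_all are illustrative names, 
simplified from what initiator.py actually does):

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def login(target, portal, iface="default", no_wait=False):
        """Run one node login; returns (target, portal, error-or-None)."""
        cmd = ["iscsiadm", "--mode", "node",
               "--targetname", target,
               "--interface", iface,
               "--portal", portal,
               "--login"]
        if no_wait:
            cmd.append("--no_wait")
        cp = subprocess.run(cmd, capture_output=True, text=True)
        return target, portal, (cp.stderr if cp.returncode != 0 else None)

    def login_all(connections, jobs=1, no_wait=False):
        """Log in to all (target, portal) pairs with 'jobs' worker threads,
        collecting the per-connection error status synchronously."""
        with ThreadPoolExecutor(max_workers=jobs,
                                thread_name_prefix="login") as pool:
            futures = [pool.submit(login, t, p, no_wait=no_wait)
                       for t, p in connections]
            return [f.result() for f in futures]

With jobs=1 and two dead connections this blocks for 2 * 120 seconds, which 
is the behavior shown in the logs above.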


Thanks for helping out,
Amit

On Thursday, August 13, 2020 at 5:32:26 PM UTC+3 nir...@gmail.com wrote:

> On Thu, Aug 13, 2020 at 1:32 AM The Lee-Man <leeman...@gmail.com> wrote:
>
>> On Sunday, August 9, 2020 at 11:08:50 AM UTC-7, Amit Bawer wrote:
>>>
>>> ...
>>>
>>>>
>>>>> The other option is to use one login-all call without parallelism, but 
>>>>> that would have other implications on our system to consider.
>>>>>
>>>>
>>>> Such as? 
>>>>
>>> As mentioned above,  unless there is a way to specify a list of targets 
>>> and portals for a single login (all) command.
>>>
>>>>
>>>>> Your answers would be helpful once again.
>>>>>
>>>>> Thanks,
>>>>> - Amit
>>>>>
>>>>>
>>>> You might be interested in a new feature I'm considering adding to 
>>>> iscsiadm to do asynchronous logins. In other words, when asked to log in 
>>>> to one or more targets, iscsiadm would send the login request to the 
>>>> targets, then return success immediately. It is then up to the end-user 
>>>> (you in this case) to poll for when the target actually shows up.
>>>>
>>> This sounds very interesting, but it will probably be available to us 
>>> only on later RHEL releases, if it is chosen to be delivered downstream.
>>> At present it seems we can only use the login-all way, or logins in 
>>> dedicated threads per target-portal.
>>>
>>>>
>>>> ...
>>>>
>>>
>> So you can only use RH-released packages?
>>
>
> Yes, we support RHEL and CentOS now.
>  
>
>> That's fine with me, but I'm asking you to test a new feature and see if 
>> it fixes your problems. If it helped, I would add it here in this repo, 
>> and Red Hat would get it by default when they update, which they do 
>> regularly, as does my company (SUSE).
>>
>
> Sure, this is how we do things. But using async login is something we can 
> use only
> in a future version, maybe RHEL/CentOS 8.4, since it is probably too late 
> for 8.3.
>
> Just as a "side" point, I wouldn't attack your problem by manually listing 
>> nodes to login to.
>>
>> It does seem as if you assume you are the only iscsi user on the system. 
>> In that case, you have complete control of the node database. Assuming your 
>> targets do not change, you can set up your node database once and never 
>> have to discover iscsi targets again. Of course if targets change, you can 
>> update your node database, but only as needed, i.e. full discovery 
>> shouldn't be needed each time you start up, unless targets are really 
>> changing all the time in your environment.
>>
>
> This is partly true. In oVirt, the vdsm daemon manages iSCSI connections, 
> so usually only vdsm manipulates the database.
>
> However, even in vdsm we have an issue when we attach a Cinder-based volume.
> In this case we use os-brick (https://github.com/openstack/os-brick) to 
> attach the volume, and it will discover and log in to the volume.
>
> And of course we cannot prevent an admin from changing the database for 
> their
> valid reasons.
>
> So being able to log in to and out of specific nodes is very attractive 
> for us. 
>
> If you do discovery and have nodes in your node database you don't like, 
>> just remove them.
>>
>
> We can do this, adding and removing nodes we added, but we cannot remove 
> nodes we did not add. It may be something added by os-brick or an 
> administrator.
>
> Another point about your scheme: you are setting each node's 'startup' to 
>> 'manual', but manual is the default, and since you seem to own the 
>> open-iscsi code on this system, you can ensure the default is manual. 
>> Perhaps because this is a test?
>>
>
> No, this is our production setup. I don't know why we specify manual, maybe
> this was not the default in 2009 when this code was written, or maybe the 
> intent
> was to be explicit about it, in case the default would change?
>
> Do you see a problem with explicit node.startup=manual?
>  
>
>>
>> So, again, I ask: will you test the async login code? It's really 
>> not much extra work -- just a "git clone" and a "make install" (mostly). If 
>> not, the async feature may make it into iscsiadm anyway, some time soon, 
>> but I'd really prefer other testers for this feature before that.
>>
>
> Sure, we will test this.
>
> Having an async login API sounds great, but my concern is how we wait for 
> the login result. For example, with systemd many things became 
> asynchronous, but there is no good way to wait for them. A few examples 
> are mounts that can fail after the mount command completes (because udev 
> changes permissions on the mount after completion), or multipath devices, 
> which may not be ready after connecting to a target.
>
> Can you elaborate on how you would wait for the login result, and how you 
> would get the login error for reporting up the stack? How can you handle 
> timeouts? This is easy to do when using a synchronous API with threads.
>
> From our point of view we want to be able to:
>
>     start async login process
>     for result in login results:
>         add result to response
>     return response with connection details
>
> This runs on every host in a cluster, and the results are returned to the 
> oVirt engine, which manages the cluster.
>
> Cheers,
> Nir
>
