Hi Mike:

Thank you for your reply.

On Friday, September 21, 2018 at 9:35:25 AM UTC-7, Mike Christie wrote:
>
> > ...
>
> This is one of those things where I said I would fix if anyone ever 
> complains, but luckily I made it through my entire time as maintainer 
> and no one ever did. So, I guess basically tag you're it :) 
>
> Checkout node.session.initial_login_retry_max and iscsi_login_eh. The 
> current state machine detects the iscsid restart case as a relogin and 
> so that setting does not kick in. You could: 
>
> 1. modify the state machine check so it does. 
> 2. add a new setting that works like initial_login_retry_max but doesn't 
> really work like it :) What I mean is that initial_login_retry_max was 
> trying to handle a lot of different cases so it is a little odd. See the 
> iscsid.conf comments. You probably want to make something that is more 
> straightforward. 
>

I didn't see this reply until this morning, but it's funny that I came to
similar conclusions. Even though Chris and I "own" this code now,
I don't know it well enough yet, so I spent a bunch of time learning
what is going on here. And the libopeniscsiusr library complicates
things a bit because some functionality now lives in two places! (Not
sure how I let that happen, but that's a different story.)

I actually picked number 2. I created session.relogin_max for the
new value because there was already a similar value for discovery.
So now the daemon code, when trying to recover a session, will
retry at most "relogin_max" times, by default 32. But this still isn't
an ideal long-term solution, because the retries happen
_per-session_, and it's easy to imagine hundreds of sessions. This
should probably be done asynchronously, and it probably shouldn't
keep the daemon from coming completely up.

But the good news is that now, after about a minute, the stale
session goes away if the target can't be reached. And if the
target comes back during that minute, the session is correctly
reestablished.
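
As a rough illustration of the bounded-retry idea described above, here
is a minimal Python sketch. It is not the daemon code (which is C), and
the names recover_session, try_login, and retry_delay are hypothetical.
Only the cap of 32 comes from this message; the 2-second delay is an
assumption chosen so that 32 attempts take roughly a minute.

```python
import time
from typing import Callable

RELOGIN_MAX = 32  # default cap on relogin attempts, per the message above


def recover_session(try_login: Callable[[], bool],
                    relogin_max: int = RELOGIN_MAX,
                    retry_delay: float = 2.0,  # assumed value, not from the daemon
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to re-establish one session, giving up after relogin_max attempts.

    Returns True if a login attempt succeeded (session re-established),
    or False if every attempt failed (session should be dropped as stale).
    """
    for attempt in range(1, relogin_max + 1):
        if try_login():
            return True
        # Wait before the next attempt, but not after the final failure.
        if attempt < relogin_max:
            sleep(retry_delay)
    return False
```

Because the loop blocks for one session at a time, running it serially
over hundreds of sessions would multiply the worst-case delay, which is
the per-session concern raised above.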

I also fixed some bugs so that we can see the session correctly
during this period of stale-session-connection-retry, and we will
get a message along the lines of "target temporarily not connected"
if we try to log out of one of these stale sessions.

I will post this solution as a pull request on
GitHub "real soon now". :)

