So it seems that a number of different patches to solve the same problem
got accepted in aggregate and together created the opposite problem ;)

Before 1.8.9, SOL sessions dropped readily due to a misdirected keepalive
packet.

Now, an SOL session will never figure out it got dropped.

lanplus.c: This has had a retry mechanism at its core for years, and I don't
quite understand why more layers were put on top of it.  I know
keepalive was broken because getdeviceid/sol data got mixed up, but a patch
last year straightened that out.

ipmi_sol.c: In CVS 1.51, there is an attempt to loop on the entry at a higher
level.  By doing both this and the above, I think the time to give up has
been made too long, and I would like to see one removed.  (I think this
particular change is less universal than the above, and as such I would
remove it here.)  However, making matters worse:

ipmi_sol.c: CVS 1.52 changed ipmi_sol_keepalive_using_getdeviceid to fake a
return of 0 most of the time.  I understand the intent, but it's my least
favorite approach.  This, in conjunction with the above change, produces the
following behavior on a truly downed session:


-ipmi_sol_red_pill calls ipmi_sol_keepalive_using_getdeviceid  (IMO the
correct keepalive that was just incorrectly implemented in 1.8.8, since the
SOL keepalive isn't a guarantee)
-keepalive_using_getdeviceid calls the keepalive function in lanplus.c
-keepalive function tries, retries, and gives up, returning a bad status to
keepalive_using_getdeviceid
-keepalive_using_getdeviceid receives the negative status, increments *its*
retry counter, and fakes success to the caller, ipmi_sol_red_pill
-ipmi_sol_red_pill happily continues
.
.
.
-keepalive function returns a bad status to keepalive_using_getdeviceid
-keepalive_using_getdeviceid realizes it has exhausted its retries, resets its
counter to zero, and actually propagates the failure to ipmi_sol_red_pill
-red_pill thinks this is the first time it has ever failed and calls
keepalive_using_getdeviceid again, which fakes a 'good' status because it
has plenty of retries left, resetting red_pill's retry count
(infinite loop)
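
To make the sequence above concrete, here is a minimal stand-alone model of
the three layers. It is only a sketch, not the ipmitool source: the function
bodies, the retry limits (INNER_RETRIES, OUTER_RETRIES) and the max_iter cap
are assumptions made up for illustration, and lanplus_keepalive() is a stub
that always fails, standing in for a truly downed session.

```c
#include <assert.h>

#define INNER_RETRIES 3   /* hypothetical limit inside the keepalive wrapper */
#define OUTER_RETRIES 3   /* hypothetical limit inside the red_pill loop */

/* Stub for the lanplus.c keepalive on a dead session: assume it has
 * already retried internally and reports failure every time. */
static int lanplus_keepalive(void)
{
    return -1;
}

/* Model of the wrapper: it hides failures from the caller until its own
 * counter is exhausted, then resets the counter and reports one failure. */
static int keepalive_using_getdeviceid(void)
{
    static int retries = 0;

    if (lanplus_keepalive() == 0) {
        retries = 0;
        return 0;
    }
    if (++retries < INNER_RETRIES)
        return 0;               /* fake success */
    retries = 0;
    return -1;                  /* real failure, reported once */
}

/* Model of the outer loop: gives up only after OUTER_RETRIES consecutive
 * failures. Returns the iteration at which it gave up, or -1 if it was
 * still running after max_iter iterations. */
static int red_pill(int max_iter)
{
    int failures = 0;

    assert(max_iter > 0);
    for (int i = 1; i <= max_iter; i++) {
        if (keepalive_using_getdeviceid() == 0)
            failures = 0;       /* a faked success wipes the count */
        else if (++failures >= OUTER_RETRIES)
            return i;
    }
    return -1;
}
```

In this model the wrapper reports at most one failure in a row, so the outer
count never gets past 1 and red_pill(1000) returns -1: the loop is still
running when the cap is reached, even though every underlying keepalive
failed.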


I propose we back out the explicit keepalives in ipmi_sol and leave the
retries to lanplus.c.
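
As a rough sketch of what that proposal amounts to, the model below keeps a
single retry loop in the transport layer and lets every layer above it pass
the status straight through. Again, this is an illustration under assumed
names and limits (send_once, LANPLUS_RETRIES, max_iter), not a patch against
the real ipmi_sol.c or lanplus.c.

```c
#include <assert.h>

#define LANPLUS_RETRIES 4       /* hypothetical transport-level retry limit */

static int send_attempts = 0;   /* counts raw sends, for inspection only */

/* Hypothetical single send on a dead session: always fails. */
static int send_once(void)
{
    send_attempts++;
    return -1;
}

/* The only retry loop in the model lives here. */
static int lanplus_keepalive(void)
{
    for (int i = 0; i < LANPLUS_RETRIES; i++) {
        if (send_once() == 0)
            return 0;
    }
    return -1;
}

/* The wrapper no longer counts or fakes anything; it just propagates. */
static int keepalive_using_getdeviceid(void)
{
    return lanplus_keepalive();
}

/* The outer loop treats the first reported failure as final. Returns the
 * iteration at which it gave up, or -1 if it never did within max_iter. */
static int red_pill(int max_iter)
{
    assert(max_iter > 0);
    for (int i = 1; i <= max_iter; i++) {
        if (keepalive_using_getdeviceid() != 0)
            return i;
    }
    return -1;
}
```

With one retry layer, the time to give up is bounded by the transport's own
limit: here red_pill(1000) returns 1 after exactly LANPLUS_RETRIES sends.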