On 03/27/2014 12:47 AM, Prasanna Kumar wrote:
> Hi,
>
> When going through the patch, noticed a small issue
> The function smi_mod_timer hard codes time to  (jiffies +
> SI_TIMEOUT_JIFFIES) and does not make use of passed value "new_val" .
>

You are right, I have fixed it.  Thank you.

-corey

>
>
> regards,
> Prasanna
>
>
> On Wed, Mar 26, 2014 at 10:34 PM, Corey Minyard <[email protected]
> <mailto:[email protected]>> wrote:
>
>     On 03/26/2014 11:51 AM, Strösser, Bodo wrote:
>     > This time with the CC to the list. Sorry for the noise.
>     >
>     >
>     >> -----Original Message-----
>     >> From: Corey Minyard [mailto:[email protected]
>     <mailto:[email protected]>] On Behalf Of Corey Minyard
>     >> Sent: Wednesday, March 26, 2014 12:41 AM
>     >> To: Strösser, Bodo
>     >> Cc: [email protected]
>     <mailto:[email protected]>
>     >> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF does
>     not always work
>     >>
>     >> On 03/25/2014 05:32 PM, Strösser, Bodo wrote:
>     >>> Attached you'll find my modified patch to explain my
>     hypothesis better.
>     >>> It seems to work fine. Thus, my hypothesis seems to be right.
>     >> timer_running needs to be a bool,
>     > Yes, of course. int is what we had to use in the good old times  ;-)
>
>     Indeed, old habits die hard :).
>
>     >
>     >
>     >> but beyond that, this is good.  I did
>     >> think about this, but I thought timer_running() would be
>     enough.  But
>     >> thinking about it some more, I think you are right.  It's not
>     quite good
>     >> enough.
>     >
>     > Have attached a new version of the patch. This time using bool
>     and with your
>     > comments, that I've slightly extended. Have added my From and
>     Signed-off-by
>     > hoping that this is ok for you.
>
>     Perfect, I have it queued for the next merge window.
>
>     Thanks again,
>
>     -corey
>
>     >
>     > The patch is based on SuSE-SLES11-SP2 (3.0.101-xxx), but it
>     applies to
>     > mainline also, printing some offsets.
>     >
>     > Bodo
>     >
>     >
>     >
>     >> -corey
>     >>
>     >>> Bodo
>     >>>
>     >>>> -----Original Message-----
>     >>>> From: Strösser, Bodo
>     >>>> Sent: Tuesday, March 25, 2014 9:41 PM
>     >>>> To: '[email protected] <mailto:[email protected]>'
>     >>>> Cc: [email protected]
>     <mailto:[email protected]>
>     >>>> Subject: RE: ipmi_si: KCS: timeout waiting for IBF or OBF
>     does not always work
>     >>>>
>     >>>> Hi Corey,
>     >>>>
>     >>>> Sorry, the patch isn't good. The problem is, that the timeout
>     time now is
>     >>>> a multiple of what it should be (5 seconds).
>     >>>>
>     >>>> I guess, I know the reason for this:
>     >>>> I think, the timer is reset quite often while the driver
>     loops waiting for OBF
>     >>>> in KCS status 40. The reset can happen, as timer_pending() is
>     not the right test.
>     >>>> For example the timer isn't pending if smi_timeout already is
>     started, but has to
>     >>>> wait for the lock the is held by ipmi_thread().
>     >>>>
>     >>>> Maybe we could add a flag to smi_info, that is set in
>     smi_mod_timer() and is reset
>     >>>> in smi_timeout() id smi_mod_timer() is not called. Then
>     ipmi_thread() could test
>     >>>> the flag instead of calling timer_pending(). But maybe you
>     have a better idea?
>     >>>>
>     >>>> Bodo
>     >>>>
>     >>>>> -----Original Message-----
>     >>>>> From: Corey Minyard [mailto:[email protected]
>     <mailto:[email protected]>] On Behalf Of Corey Minyard
>     >>>>> Sent: Monday, March 24, 2014 7:54 PM
>     >>>>> To: Strösser, Bodo
>     >>>>> Cc: [email protected]
>     <mailto:[email protected]>
>     >>>>> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF
>     does not always work
>     >>>>>
>     >>>>> On 03/24/2014 11:30 AM, Strösser, Bodo wrote:
>     >>>>>> Hi Corey,
>     >>>>>>
>     >>>>>> Thank you for the quick reply and the patches.
>     >>>>>>
>     >>>>>> We had tests running for the entire weekend with a driver
>     >>>>>> patched with my change. No more errors. So it works.
>     >>>>> It certainy fixes the issue you were seeing, but it would
>     cause some
>     >>>>> other issues that would possibly cause some error conditions
>     to not be
>     >>>>> handled correctly, and it would make the power management
>     people very
>     >>>>> unhappy with me :-0.
>     >>>>>
>     >>>>> Hopefully the patch I submitted will fix the issue without
>     those side
>     >>>>> effects.  But your patch pointed me in the right direction,
>     so it wasn't
>     >>>>> without merit.
>     >>>>>
>     >>>>> Thanks again,
>     >>>>>
>     >>>>> -corey
>     >>>>>
>     >>>>>> But of course, your patch is a much more sane fix. I'll test
>     >>>>>> it ASAP and will report the result.
>     >>>>>>
>     >>>>>> Bodo
>     >>>>>>
>     >>>>>>
>     >>>>>>> -----Original Message-----
>     >>>>>>> From: Corey Minyard [mailto:[email protected]
>     <mailto:[email protected]>] On Behalf Of Corey Minyard
>     >>>>>>> Sent: Monday, March 24, 2014 5:10 PM
>     >>>>>>> To: Strösser, Bodo
>     >>>>>>> Cc: [email protected]
>     <mailto:[email protected]>
>     >>>>>>> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF
>     does not always work
>     >>>>>>>
>     >>>>>>> On 03/21/2014 04:02 PM, Strösser, Bodo wrote:
>     >>>>>>>> Hi Corey,
>     >>>>>>>>
>     >>>>>>>> We are using servers, that have a BMC connected via KCS.
>     >>>>>>>> We are running SuSE SLES11-SP2, kernel 3.0.101-0.7.17, but
>     >>>>>>>> the problem I see seems to be relevant for Mainline also.
>     >>>>>>>>
>     >>>>>>>> For our BMC, ipmi_si.ko uses both, ipmi_thread() and
>     >>>>>>>> smi_timeout() in parallel to call smi_event_handler().
>     >>>>>>>> But only smi_timeout() passes the elapsed time for timeout
>     >>>>>>>> handling to smi_event_handler(), but not ipmi_thread().
>     >>>>>>>>
>     >>>>>>>> After smi_timeout detected IDLE, it does not start the
>     >>>>>>>> timer again. Instead, a call to sender() will restart the
>     >>>>>>>> timer. Unfortunately, if a command to the BMC is not
>     >>>>>>>> started via sender(), but generated internally (eg. from
>     >>>>>>>> smi_event_handler() because of smi_info->req_events) and
>     >>>>>>>> started via smi_info->handlers->start_transaction(),
>     >>>>>>>> AFAICS the timer is not started. So, the timeouts in the
>     >>>>>>>> KCS driver while waiting for IBF or OBF won't work in this
>     >>>>>>>> situation.
>     >>>>>>>>
>     >>>>>>>> The BMCs in our servers seem to have a sporadic problem
>     >>>>>>>> that leads to the error recovery being started because of
>     >>>>>>>> an timeout while waiting for OBF. But in case the timer is
>     >>>>>>>> stopped while the problem occurs, the driver loop waiting
>     >>>>>>>> for OBF forever.
>     >>>>>>> I think I can see what is happening here, but I don't
>     think your fix is
>     >>>>>>> correct.  With your fix the driver would never time out if
>     the interface
>     >>>>>>> wasn't working, because it's starting the timer in the
>     thread, the timer
>     >>>>>>> is what is used to time out operations, and the thread
>     will run
>     >>>>>>> constantly while the driver is operating.
>     >>>>>>>
>     >>>>>>> Try the attached patches instead, please.  I've done some
>     minimal testing.
>     >>>>>>>
>     >>>>>>>> P.S.: There might be another minor issue in ipmi_kcs_sm.c
>     >>>>>>>> If check_obf() times out, it starts the error handling
>     but does
>     >>>>>>>> not reset kcs->obf_timeout. Thus, in state KCS_ERROR2
>     check_obf()
>     >>>>>>>> is called with the timer already expired. So I'd like to
>     >>>>>>>> suggest this small change:
>     >>>>>>>>
>     >>>>>>>> --- a/drivers/char/ipmi/ipmi_kcs_sm.c   2014-03-19
>     10:42:51.000000000 +0100
>     >>>>>>>> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c   2014-03-19
>     10:48:33.000000000 +0100
>     >>>>>>>> @@ -251,8 +251,9 @@ static inline int check_obf(struct si_sm
>     >>>>>>>>        if (!GET_STATUS_OBF(status)) {
>     >>>>>>>>                kcs->obf_timeout -= time;
>     >>>>>>>>                if (kcs->obf_timeout < 0) {
>     >>>>>>>> -                  start_error_recovery(kcs, "OBF not
>     ready in time");
>     >>>>>>>> -                  return 1;
>     >>>>>>>> +                      start_error_recovery(kcs, "OBF not
>     ready in time");
>     >>>>>>>> +                      kcs->obf_timeout = OBF_RETRY_TIMEOUT;
>     >>>>>>>> +                      return 1;
>     >>>>>>>>                }
>     >>>>>>>>                return 0;
>     >>>>>>>>        }
>     >>>>>>>>
>     >>>>>>> Indeed, you are right.  I've attached another patch for that.
>     >>>>>>>
>     >>>>>>> -corey
>
>
>     
> ------------------------------------------------------------------------------
>     Learn Graph Databases - Download FREE O'Reilly Book
>     "Graph Databases" is the definitive new guide to graph databases
>     and their
>     applications. Written by three acclaimed leaders in the field,
>     this first edition is now available. Download your free book today!
>     http://p.sf.net/sfu/13534_NeoTech
>     _______________________________________________
>     Openipmi-developer mailing list
>     [email protected]
>     <mailto:[email protected]>
>     https://lists.sourceforge.net/lists/listinfo/openipmi-developer
>
>


------------------------------------------------------------------------------
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer

Reply via email to