Hi, When going through the patch, noticed a small issue The function smi_mod_timer hard codes time to (jiffies + SI_TIMEOUT_JIFFIES) and does not make use of passed value "new_val" .
regards, Prasanna On Wed, Mar 26, 2014 at 10:34 PM, Corey Minyard <[email protected]> wrote: > On 03/26/2014 11:51 AM, Strösser, Bodo wrote: > > This time with the CC to the list. Sorry for the noise. > > > > > >> -----Original Message----- > >> From: Corey Minyard [mailto:[email protected]] On Behalf Of Corey > Minyard > >> Sent: Wednesday, March 26, 2014 12:41 AM > >> To: Strösser, Bodo > >> Cc: [email protected] > >> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF does not > always work > >> > >> On 03/25/2014 05:32 PM, Strösser, Bodo wrote: > >>> Attached you'll find my modified patch to explain my hypothesis better. > >>> It seems to work fine. Thus, my hypothesis seems to be right. > >> timer_running needs to be a bool, > > Yes, of course. int is what we had to use in the good old times ;-) > > Indeed, old habits die hard :). > > > > > > >> but beyond that, this is good. I did > >> think about this, but I thought timer_running() would be enough. But > >> thinking about it some more, I think you are right. It's not quite good > >> enough. > > > > Have attached a new version of the patch. This time using bool and with > your > > comments, that I've slightly extended. Have added my From and > Signed-off-by > > hoping that this is ok for you. > > Perfect, I have it queued for the next merge window. > > Thanks again, > > -corey > > > > > The patch is based on SuSE-SLES11-SP2 (3.0.101-xxx), but it applies to > > mainline also, printing some offsets. > > > > Bodo > > > > > > > >> -corey > >> > >>> Bodo > >>> > >>>> -----Original Message----- > >>>> From: Strösser, Bodo > >>>> Sent: Tuesday, March 25, 2014 9:41 PM > >>>> To: '[email protected]' > >>>> Cc: [email protected] > >>>> Subject: RE: ipmi_si: KCS: timeout waiting for IBF or OBF does not > always work > >>>> > >>>> Hi Corey, > >>>> > >>>> Sorry, the patch isn't good. The problem is, that the timeout time > now is > >>>> a multiple of what it should be (5 seconds). > >>>> > >>>> I guess, I know the reason for this: > >>>> I think, the timer is reset quite often while the driver loops > waiting for OBF > >>>> in KCS status 40. The reset can happen, as timer_pending() is not the > right test. > >>>> For example the timer isn't pending if smi_timeout already is > started, but has to > >>>> wait for the lock the is held by ipmi_thread(). > >>>> > >>>> Maybe we could add a flag to smi_info, that is set in smi_mod_timer() > and is reset > >>>> in smi_timeout() id smi_mod_timer() is not called. Then ipmi_thread() > could test > >>>> the flag instead of calling timer_pending(). But maybe you have a > better idea? > >>>> > >>>> Bodo > >>>> > >>>>> -----Original Message----- > >>>>> From: Corey Minyard [mailto:[email protected]] On Behalf Of Corey > Minyard > >>>>> Sent: Monday, March 24, 2014 7:54 PM > >>>>> To: Strösser, Bodo > >>>>> Cc: [email protected] > >>>>> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF does not > always work > >>>>> > >>>>> On 03/24/2014 11:30 AM, Strösser, Bodo wrote: > >>>>>> Hi Corey, > >>>>>> > >>>>>> Thank you for the quick reply and the patches. > >>>>>> > >>>>>> We had tests running for the entire weekend with a driver > >>>>>> patched with my change. No more errors. So it works. > >>>>> It certainy fixes the issue you were seeing, but it would cause some > >>>>> other issues that would possibly cause some error conditions to not > be > >>>>> handled correctly, and it would make the power management people very > >>>>> unhappy with me :-0. > >>>>> > >>>>> Hopefully the patch I submitted will fix the issue without those side > >>>>> effects. But your patch pointed me in the right direction, so it > wasn't > >>>>> without merit. > >>>>> > >>>>> Thanks again, > >>>>> > >>>>> -corey > >>>>> > >>>>>> But of course, your patch is a much more sane fix. I'll test > >>>>>> it ASAP and will report the result. > >>>>>> > >>>>>> Bodo > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Corey Minyard [mailto:[email protected]] On Behalf Of > Corey Minyard > >>>>>>> Sent: Monday, March 24, 2014 5:10 PM > >>>>>>> To: Strösser, Bodo > >>>>>>> Cc: [email protected] > >>>>>>> Subject: Re: ipmi_si: KCS: timeout waiting for IBF or OBF does not > always work > >>>>>>> > >>>>>>> On 03/21/2014 04:02 PM, Strösser, Bodo wrote: > >>>>>>>> Hi Corey, > >>>>>>>> > >>>>>>>> We are using servers, that have a BMC connected via KCS. > >>>>>>>> We are running SuSE SLES11-SP2, kernel 3.0.101-0.7.17, but > >>>>>>>> the problem I see seems to be relevant for Mainline also. > >>>>>>>> > >>>>>>>> For our BMC, ipmi_si.ko uses both, ipmi_thread() and > >>>>>>>> smi_timeout() in parallel to call smi_event_handler(). > >>>>>>>> But only smi_timeout() passes the elapsed time for timeout > >>>>>>>> handling to smi_event_handler(), but not ipmi_thread(). > >>>>>>>> > >>>>>>>> After smi_timeout detected IDLE, it does not start the > >>>>>>>> timer again. Instead, a call to sender() will restart the > >>>>>>>> timer. Unfortunately, if a command to the BMC is not > >>>>>>>> started via sender(), but generated internally (eg. from > >>>>>>>> smi_event_handler() because of smi_info->req_events) and > >>>>>>>> started via smi_info->handlers->start_transaction(), > >>>>>>>> AFAICS the timer is not started. So, the timeouts in the > >>>>>>>> KCS driver while waiting for IBF or OBF won't work in this > >>>>>>>> situation. > >>>>>>>> > >>>>>>>> The BMCs in our servers seem to have a sporadic problem > >>>>>>>> that leads to the error recovery being started because of > >>>>>>>> an timeout while waiting for OBF. But in case the timer is > >>>>>>>> stopped while the problem occurs, the driver loop waiting > >>>>>>>> for OBF forever. > >>>>>>> I think I can see what is happening here, but I don't think your > fix is > >>>>>>> correct. With your fix the driver would never time out if the > interface > >>>>>>> wasn't working, because it's starting the timer in the thread, the > timer > >>>>>>> is what is used to time out operations, and the thread will run > >>>>>>> constantly while the driver is operating. > >>>>>>> > >>>>>>> Try the attached patches instead, please. I've done some minimal > testing. > >>>>>>> > >>>>>>>> P.S.: There might be another minor issue in ipmi_kcs_sm.c > >>>>>>>> If check_obf() times out, it starts the error handling but does > >>>>>>>> not reset kcs->obf_timeout. Thus, in state KCS_ERROR2 check_obf() > >>>>>>>> is called with the timer already expired. So I'd like to > >>>>>>>> suggest this small change: > >>>>>>>> > >>>>>>>> --- a/drivers/char/ipmi/ipmi_kcs_sm.c 2014-03-19 > 10:42:51.000000000 +0100 > >>>>>>>> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c 2014-03-19 > 10:48:33.000000000 +0100 > >>>>>>>> @@ -251,8 +251,9 @@ static inline int check_obf(struct si_sm > >>>>>>>> if (!GET_STATUS_OBF(status)) { > >>>>>>>> kcs->obf_timeout -= time; > >>>>>>>> if (kcs->obf_timeout < 0) { > >>>>>>>> - start_error_recovery(kcs, "OBF not ready in > time"); > >>>>>>>> - return 1; > >>>>>>>> + start_error_recovery(kcs, "OBF not ready > in time"); > >>>>>>>> + kcs->obf_timeout = OBF_RETRY_TIMEOUT; > >>>>>>>> + return 1; > >>>>>>>> } > >>>>>>>> return 0; > >>>>>>>> } > >>>>>>>> > >>>>>>> Indeed, you are right. I've attached another patch for that. > >>>>>>> > >>>>>>> -corey > > > > ------------------------------------------------------------------------------ > Learn Graph Databases - Download FREE O'Reilly Book > "Graph Databases" is the definitive new guide to graph databases and their > applications. Written by three acclaimed leaders in the field, > this first edition is now available. Download your free book today! > http://p.sf.net/sfu/13534_NeoTech > _______________________________________________ > Openipmi-developer mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/openipmi-developer >
------------------------------------------------------------------------------
_______________________________________________ Openipmi-developer mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/openipmi-developer
