On Fri, 26 Mar 2021 19:54:26 +0100, Rafael J. Wysocki wrote:
> On Fri, Mar 26, 2021 at 6:53 PM Zhou Ti (x2019cwm) <x2019...@stfx.ca> wrote:
> >
> > On Fri, 26 Mar 2021 18:01:47 +0100, Rafael J. Wysocki wrote:
> > > On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm) <x2019...@stfx.ca> 
> > > wrote:
> > > >
> > > > On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > > > > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm) <x2019...@stfx.ca> 
> > > > > wrote:
> > > > > >
> > > > > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker 
> > > > > > > wrote:
> > > > > > > > On Tue, Mar 16, 2021 at 04:08:08PM +0000, Zhou Ti (x2019cwm) 
> > > > > > > > wrote:
> > > > > > > > > But I don't think it's a good idea to handle this in callers, 
> > > > > > > > > because logically the function shouldn't return negative 
> > > > > > > > > values. Returning 0 directly would allow idle governors to 
> > > > > > > > > get another chance to select again.
> > > > > > > >
> > > > > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle 
> > > > > > > > are the only
> > > > > > > > callers of this. In any case we need to fix it.
> > > > > > >
> > > > > > > Yes, we do.
> > > > > > >
> > > > > > > So I said that I preferred to address this in the callers and the 
> > > > > > > reason why
> > > > > > > is because, for example, for the teo governor it would be a 
> > > > > > > matter of using
> > > > > > > a different data type to store the tick_nohz_get_sleep_length() 
> > > > > > > return value,
> > > > > > > like in the (untested) patch below.
> > > > > > >
> > > > > > > So at least in this case there is no need to add any new branches 
> > > > > > > anywhere.
> > > > > > >
> > > > > > > I'm still not sure about menu, because it is more complicated, 
> > > > > > > but even if
> > > > > > > that one needs an extra branch, that is a win already.
> > > > > >
> > > > > > I would like to point out the potential trouble that fixing this 
> > > > > > issue in the
> > > > > > callers could cause.
> > > > > >
> > > > > > 1. This function is called multiple times in menu governor and TEO
> > > > > > governor.
> > > > >
> > > > > What do you mean by "multiple times"?
> > > > >
> > > > > Each of the governors calls it once per cycle and its previous return
> > > > > value is not used in the next cycle at least in teo.
> > > >
> > > > I remember a governor called this function twice in a cycle, I guess I 
> > > > remember
> > > > wrong.
> > >
> > > That obviously depends on the governor, but both teo and menu call it
> > > once per cycle.
> > >
> > > > > > I'm not sure that receiving results using signed integers is enough
> > > > > > to solve all the problems, in the worst case it may require 
> > > > > > increasing
> > > > > > the logical complexity of the code.
> > > > >
> > > > > That is a valid concern, so it is a tradeoff between increasing the
> > > > > logical complexity of the code and adding branches to it.
> > > > >
> > > > > > 2. This function is important for developing idle governor.
> > > > > > If the problem is not fixed in the function itself, then this 
> > > > > > potential
> > > > > > pitfall should be explicitly stated in the documentation.
> > > > >
> > > > > That I can agree with.
> > > > >
> > > > > > This is because
> > > > > > it is difficult to predict from the definition and naming of the 
> > > > > > function
> > > > > > that it might return a negative number. I actually discovered this 
> > > > > > anomaly
> > > > > > when I was doing data analysis on my own idle governor. For some 
> > > > > > idle control
> > > > > > algorithms, this exception return could lead to serious 
> > > > > > consequences,
> > > > > > because negative return logically won't happen.
> > > > >
> > > > > Well, it's a matter of how to take the possible negative return value
> > > > > into account so it does not affect the result of the computations.
> > > >
> > > > I think it is challenging for some algorithms to take negative return 
> > > > values
> > > > into account properly. For TEO (and even menu), it is possible to
> > > > solve the problem by just changing the way the data is received is 
> > > > because the
> > > > learning mechanism for both algorithms is simple.
> > >
> > > Of course this depends on the governor.
> > >
> > > > One of the interesting things about the CPUIdle subsystem is that it is 
> > > > well
> > > > suited to introduce machine learning and probabilistic statistical 
> > > > methods.
> > >
> > > You need to remember that the governor code runs in the idle loop
> > > context which is expected to be reasonably fast.
> > >
> > > That's why we are worrying about individual branches here.
> > >
> > > > This means that many of the more complex and data-sensitive algorithms 
> > > > can
> > > > potentially be explored. In the best case we will still need to add 
> > > > additional
> > > > code complexity to a new algorithm.
> > >
> > > So I'm not sure what the problem with adding an upfront negative value
> > > check to the governor is.
> > >
> > > > It would reduce a lot of unnecessary considerations (for example, 
> > > > highlight
> > > > this shortcoming in the documentation) if we could ensure that this 
> > > > function
> > > > would work as it is logically defined. But I don't really understand
> > > > how much of a burden adding an extra branch would impose, so I don't 
> > > > know if
> > > > this tradeoff is worth it.
> > >
> > > It ultimately depends on the governor, which is why I think that the
> > > negative value check should be done by the governor, if needed, and
> > > not by the function called by it, because in the latter case the check
> > > may be redundant and we end up with an extra branch (or two branches
> > > in this particular case) for no good reason whatsoever.
> > >
> > > Yes, there are governors which simply can do the negative value check
> > > upfront right after calling that function and ensure that they will
> > > not deal with negative values going forward.  This is probably what
> > > I'll do in the menu case.
> > >
> > > However, if the governor is simple enough and it can avoid doing the
> > > explicit negative value check, I don't see a reason to do that check
> > > elsewhere "just in case".
> >
> > Makes sense. I will submit my patch to fix this issue in menu and TEO.
> 
> Well, I have patches for that already and they are not
> super-straightforward.  Though If you want to try to fix this
> yourself, I'll wait for your submission.

Thanks! I really like this subsystem, so I hope to contribute a little.
I still have some questions.

For TEO governor:
    Even if we change some datatypes as your patch did before, some explicit 
    type conversions still need to be added to prevent wrong results.

    For example:
        line 276 (because 1u > -1 will be false)
        line 329
        line 422
    
    Since some of them are in a loop, is the overhead caused by the
    type conversions worth it? Or do I need to do some pre-processing to avoid
    duplicate conversions (which may cause additional space overhead)?
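
    To illustrate the kind of comparison I mean, here is a minimal
    standalone sketch (not the actual teo code; the function and
    parameter names are hypothetical, and it assumes the unsigned
    operand always fits in a signed 64-bit value):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from teo.c.
 *
 * Mixed comparison: sleep_len is implicitly converted to uint64_t,
 * so a negative value turns into a huge positive one.
 */
bool gt_mixed(uint64_t threshold, int64_t sleep_len)
{
	return threshold > sleep_len;
}

/*
 * Explicit conversion: compare in the signed domain instead.
 * Assumption: threshold never exceeds INT64_MAX.
 */
bool gt_signed(uint64_t threshold, int64_t sleep_len)
{
	return (int64_t)threshold > sleep_len;
}
```

    With a threshold of 1 and a sleep length of -1, gt_mixed() returns
    false while gt_signed() returns true, which is the same effect as
    "1u > -1" above.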
    
For menu governor:
    If we simply change the datatypes, even more conversions are required.

    For example:
        lines 133-142
        line 202
        line 289
        lines 302-309
        line 320
        lines 327-328
        line 347

    If we just do exception handling on next_timer_ns and delta_next,
    that still requires two additional branches.
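
    By "exception handling" I mean an upfront check roughly like the
    sketch below, applied once to each of the two values. The helper
    name is hypothetical and this is not existing kernel code; it only
    shows where the extra branch per value would come from:

```c
#include <stdint.h>

/*
 * Hypothetical helper: treat a negative sleep length (a timer that
 * should already have expired) as zero, so later code only ever sees
 * non-negative values.
 */
int64_t clamp_sleep_length(int64_t len_ns)
{
	return len_ns < 0 ? 0 : len_ns;
}
```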
