Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-29 Thread Zhou Ti (x2019cwm)
On Mon 2021-03-29 8:45, Rafael J. Wysocki wrote:
> On Fri, Mar 26, 2021 at 11:53 PM Zhou Ti (x2019cwm)  wrote:
> >
> > On Fri, 26 Mar 2021 19:54:26 +0100, Rafael J. Wysocki wrote:
> > > On Fri, Mar 26, 2021 at 6:53 PM Zhou Ti (x2019cwm)  
> > > wrote:
> > > >
> > > > On Fri, 26 Mar 2021 18:01:47 +0100, Rafael J. Wysocki wrote:
> > > > > On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  
> > > > > wrote:
> > > > > >
> > > > > > On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > > > > > > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm) 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > > > > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic 
> > > > > > > > > Weisbecker wrote:
> > > > > > > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti 
> > > > > > > > > > (x2019cwm) wrote:
> > > > > > > > > > > But I don't think it's a good idea to handle this in 
> > > > > > > > > > > callers, because logically the function shouldn't return 
> > > > > > > > > > > negative values. Returning 0 directly would allow idle 
> > > > > > > > > > > governors to get another chance to select again.
> > > > > > > > > >
> > > > > > > > > > Hmm, I'm going to leave the last word to Rafael since 
> > > > > > > > > > cpuidle are the only
> > > > > > > > > > callers of this. In any case we need to fix it.
> > > > > > > > >
> > > > > > > > > Yes, we do.
> > > > > > > > >
> > > > > > > > > So I said that I preferred to address this in the callers and 
> > > > > > > > > the reason why
> > > > > > > > > is because, for example, for the teo governor it would be a 
> > > > > > > > > matter of using
> > > > > > > > > a different data type to store the 
> > > > > > > > > tick_nohz_get_sleep_length() return value,
> > > > > > > > > like in the (untested) patch below.
> > > > > > > > >
> > > > > > > > > So at least in this case there is no need to add any new 
> > > > > > > > > branches anywhere.
> > > > > > > > >
> > > > > > > > > I'm still not sure about menu, because it is more 
> > > > > > > > > complicated, but even if
> > > > > > > > > that one needs an extra branch, that is a win already.
> > > > > > > >
> > > > > > > > I would like to point out the potential trouble that fixing 
> > > > > > > > this issue in the
> > > > > > > > callers could cause.
> > > > > > > >
> > > > > > > > 1. This function is called multiple times in menu governor and 
> > > > > > > > TEO
> > > > > > > > governor.
> > > > > > >
> > > > > > > What do you mean by "multiple times"?
> > > > > > >
> > > > > > > Each of the governors calls it once per cycle and its previous 
> > > > > > > return
> > > > > > > value is not used in the next cycle at least in teo.
> > > > > >
> > > > > > I remember a governor called this function twice in a cycle, I 
> > > > > > guess I remember
> > > > > > wrong.
> > > > >
> > > > > That obviously depends on the governor, but both teo and menu call it
> > > > > once per cycle.
> > > > >
> > > > > > > > I'm not sure that receiving results using signed integers is 
> > > > > > > > enough
> > > > > > > > to solve all the problems, in the worst case it may require 
> > > > > > > > increasing
> > > > > > > > the logical complexity of the code.
> > > > > > >
> > > > > > > That is a valid concern, so it is a tradeoff between increasing 
> > > > > > > the
> > > > > > > logical complexity of the code and adding branches to it.
> > > > > > >
> > > > > > > > 2. This function is important for developing idle governor.
> > > > > > > > If the problem is not fixed in the function itself, then this 
> > > > > > > > potential
> > > > > > > > pitfall should be explicitly stated in the documentation.
> > > > > > >
> > > > > > > That I can agree with.
> > > > > > >
> > > > > > > > This is because
> > > > > > > > it is difficult to predict from the definition and naming of 
> > > > > > > > the function
> > > > > > > > that it might return a negative number. I actually discovered 
> > > > > > > > this anomaly
> > > > > > > > when I was doing data analysis on my own idle governor. For 
> > > > > > > > some idle control
> > > > > > > > algorithms, this exception return could lead to serious 
> > > > > > > > consequences,
> > > > > > > > because negative return logically won't happen.
> > > > > > >
> > > > > > > Well, it's a matter of how to take the possible negative return 
> > > > > > > value
> > > > > > > into account so it does not affect the result of the computations.
> > > > > >
> > > > > > I think it is challenging for some algorithms to take negative 
> > > > > > return values
> > > > > > into account properly. For TEO (and even menu), it is possible to
> > > > > > solve the problem by just changing the way the data is received is 
> > > > > > because the
> > > > > > learning mechanism for both algorithms is simple.
> > > > >
> > > > > Of course this depends on the governor.
> > > > >
> > > > > > One of the interesting things about the CPUIdle subsystem is that 
> > > > > 

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-29 Thread Rafael J. Wysocki
On Fri, Mar 26, 2021 at 11:53 PM Zhou Ti (x2019cwm)  wrote:
>
> On Fri, 26 Mar 2021 19:54:26 +0100, Rafael J. Wysocki wrote:
> > On Fri, Mar 26, 2021 at 6:53 PM Zhou Ti (x2019cwm)  wrote:
> > >
> > > On Fri, 26 Mar 2021 18:01:47 +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  
> > > > wrote:
> > > > >
> > > > > On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > > > > > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm) 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > > > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker 
> > > > > > > > wrote:
> > > > > > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) 
> > > > > > > > > wrote:
> > > > > > > > > > But I don't think it's a good idea to handle this in 
> > > > > > > > > > callers, because logically the function shouldn't return 
> > > > > > > > > > negative values. Returning 0 directly would allow idle 
> > > > > > > > > > governors to get another chance to select again.
> > > > > > > > >
> > > > > > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle 
> > > > > > > > > are the only
> > > > > > > > > callers of this. In any case we need to fix it.
> > > > > > > >
> > > > > > > > Yes, we do.
> > > > > > > >
> > > > > > > > So I said that I preferred to address this in the callers and 
> > > > > > > > the reason why
> > > > > > > > is because, for example, for the teo governor it would be a 
> > > > > > > > matter of using
> > > > > > > > a different data type to store the tick_nohz_get_sleep_length() 
> > > > > > > > return value,
> > > > > > > > like in the (untested) patch below.
> > > > > > > >
> > > > > > > > So at least in this case there is no need to add any new 
> > > > > > > > branches anywhere.
> > > > > > > >
> > > > > > > > I'm still not sure about menu, because it is more complicated, 
> > > > > > > > but even if
> > > > > > > > that one needs an extra branch, that is a win already.
> > > > > > >
> > > > > > > I would like to point out the potential trouble that fixing this 
> > > > > > > issue in the
> > > > > > > callers could cause.
> > > > > > >
> > > > > > > 1. This function is called multiple times in menu governor and TEO
> > > > > > > governor.
> > > > > >
> > > > > > What do you mean by "multiple times"?
> > > > > >
> > > > > > Each of the governors calls it once per cycle and its previous 
> > > > > > return
> > > > > > value is not used in the next cycle at least in teo.
> > > > >
> > > > > I remember a governor called this function twice in a cycle, I guess 
> > > > > I remember
> > > > > wrong.
> > > >
> > > > That obviously depends on the governor, but both teo and menu call it
> > > > once per cycle.
> > > >
> > > > > > > I'm not sure that receiving results using signed integers is 
> > > > > > > enough
> > > > > > > to solve all the problems, in the worst case it may require 
> > > > > > > increasing
> > > > > > > the logical complexity of the code.
> > > > > >
> > > > > > That is a valid concern, so it is a tradeoff between increasing the
> > > > > > logical complexity of the code and adding branches to it.
> > > > > >
> > > > > > > 2. This function is important for developing idle governor.
> > > > > > > If the problem is not fixed in the function itself, then this 
> > > > > > > potential
> > > > > > > pitfall should be explicitly stated in the documentation.
> > > > > >
> > > > > > That I can agree with.
> > > > > >
> > > > > > > This is because
> > > > > > > it is difficult to predict from the definition and naming of the 
> > > > > > > function
> > > > > > > that it might return a negative number. I actually discovered 
> > > > > > > this anomaly
> > > > > > > when I was doing data analysis on my own idle governor. For some 
> > > > > > > idle control
> > > > > > > algorithms, this exception return could lead to serious 
> > > > > > > consequences,
> > > > > > > because negative return logically won't happen.
> > > > > >
> > > > > > Well, it's a matter of how to take the possible negative return 
> > > > > > value
> > > > > > into account so it does not affect the result of the computations.
> > > > >
> > > > > I think it is challenging for some algorithms to take negative return 
> > > > > values
> > > > > into account properly. For TEO (and even menu), it is possible to
> > > > > solve the problem by just changing the way the data is received is 
> > > > > because the
> > > > > learning mechanism for both algorithms is simple.
> > > >
> > > > Of course this depends on the governor.
> > > >
> > > > > One of the interesting things about the CPUIdle subsystem is that it 
> > > > > is well
> > > > > suited to introduce machine learning and probabilistic statistical 
> > > > > methods.
> > > >
> > > > You need to remember that the governor code runs in the idle loop
> > > > context which is expected to be reasonably fast.
> > > >
> > > > That's why we are worrying about 

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-26 Thread Zhou Ti (x2019cwm)
On Fri, 26 Mar 2021 19:54:26 +0100, Rafael J. Wysocki wrote:
> On Fri, Mar 26, 2021 at 6:53 PM Zhou Ti (x2019cwm)  wrote:
> >
> > On Fri, 26 Mar 2021 18:01:47 +0100, Rafael J. Wysocki wrote:
> > > On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  
> > > wrote:
> > > >
> > > > On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > > > > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  
> > > > > wrote:
> > > > > >
> > > > > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker 
> > > > > > > wrote:
> > > > > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) 
> > > > > > > > wrote:
> > > > > > > > > But I don't think it's a good idea to handle this in callers, 
> > > > > > > > > because logically the function shouldn't return negative 
> > > > > > > > > values. Returning 0 directly would allow idle governors to 
> > > > > > > > > get another chance to select again.
> > > > > > > >
> > > > > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle 
> > > > > > > > are the only
> > > > > > > > callers of this. In any case we need to fix it.
> > > > > > >
> > > > > > > Yes, we do.
> > > > > > >
> > > > > > > So I said that I preferred to address this in the callers and the 
> > > > > > > reason why
> > > > > > > is because, for example, for the teo governor it would be a 
> > > > > > > matter of using
> > > > > > > a different data type to store the tick_nohz_get_sleep_length() 
> > > > > > > return value,
> > > > > > > like in the (untested) patch below.
> > > > > > >
> > > > > > > So at least in this case there is no need to add any new branches 
> > > > > > > anywhere.
> > > > > > >
> > > > > > > I'm still not sure about menu, because it is more complicated, 
> > > > > > > but even if
> > > > > > > that one needs an extra branch, that is a win already.
> > > > > >
> > > > > > I would like to point out the potential trouble that fixing this 
> > > > > > issue in the
> > > > > > callers could cause.
> > > > > >
> > > > > > 1. This function is called multiple times in menu governor and TEO
> > > > > > governor.
> > > > >
> > > > > What do you mean by "multiple times"?
> > > > >
> > > > > Each of the governors calls it once per cycle and its previous return
> > > > > value is not used in the next cycle at least in teo.
> > > >
> > > > I remember a governor called this function twice in a cycle, I guess I 
> > > > remember
> > > > wrong.
> > >
> > > That obviously depends on the governor, but both teo and menu call it
> > > once per cycle.
> > >
> > > > > > I'm not sure that receiving results using signed integers is enough
> > > > > > to solve all the problems, in the worst case it may require 
> > > > > > increasing
> > > > > > the logical complexity of the code.
> > > > >
> > > > > That is a valid concern, so it is a tradeoff between increasing the
> > > > > logical complexity of the code and adding branches to it.
> > > > >
> > > > > > 2. This function is important for developing idle governor.
> > > > > > If the problem is not fixed in the function itself, then this 
> > > > > > potential
> > > > > > pitfall should be explicitly stated in the documentation.
> > > > >
> > > > > That I can agree with.
> > > > >
> > > > > > This is because
> > > > > > it is difficult to predict from the definition and naming of the 
> > > > > > function
> > > > > > that it might return a negative number. I actually discovered this 
> > > > > > anomaly
> > > > > > when I was doing data analysis on my own idle governor. For some 
> > > > > > idle control
> > > > > > algorithms, this exception return could lead to serious 
> > > > > > consequences,
> > > > > > because negative return logically won't happen.
> > > > >
> > > > > Well, it's a matter of how to take the possible negative return value
> > > > > into account so it does not affect the result of the computations.
> > > >
> > > > I think it is challenging for some algorithms to take negative return 
> > > > values
> > > > into account properly. For TEO (and even menu), it is possible to
> > > > solve the problem by just changing the way the data is received is 
> > > > because the
> > > > learning mechanism for both algorithms is simple.
> > >
> > > Of course this depends on the governor.
> > >
> > > > One of the interesting things about the CPUIdle subsystem is that it is 
> > > > well
> > > > suited to introduce machine learning and probabilistic statistical 
> > > > methods.
> > >
> > > You need to remember that the governor code runs in the idle loop
> > > context which is expected to be reasonably fast.
> > >
> > > That's why we are worrying about individual branches here.
> > >
> > > > This means that many of the more complex and data-sensitive algorithms 
> > > > can
> > > > potentially be explored. In the best case we will still need to add 
> > > > additional
> > > > code complexity to a new algorithm.
> > >
> > > So I'm not sure what the problem with adding 

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-26 Thread Rafael J. Wysocki
On Fri, Mar 26, 2021 at 6:53 PM Zhou Ti (x2019cwm)  wrote:
>
> On Fri, 26 Mar 2021 18:01:47 +0100, Rafael J. Wysocki wrote:
> > On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  wrote:
> > >
> > > On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > > > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  
> > > > wrote:
> > > > >
> > > > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker 
> > > > > > wrote:
> > > > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) 
> > > > > > > wrote:
> > > > > > > > But I don't think it's a good idea to handle this in callers, 
> > > > > > > > because logically the function shouldn't return negative 
> > > > > > > > values. Returning 0 directly would allow idle governors to get 
> > > > > > > > another chance to select again.
> > > > > > >
> > > > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle are 
> > > > > > > the only
> > > > > > > callers of this. In any case we need to fix it.
> > > > > >
> > > > > > Yes, we do.
> > > > > >
> > > > > > So I said that I preferred to address this in the callers and the 
> > > > > > reason why
> > > > > > is because, for example, for the teo governor it would be a matter 
> > > > > > of using
> > > > > > a different data type to store the tick_nohz_get_sleep_length() 
> > > > > > return value,
> > > > > > like in the (untested) patch below.
> > > > > >
> > > > > > So at least in this case there is no need to add any new branches 
> > > > > > anywhere.
> > > > > >
> > > > > > I'm still not sure about menu, because it is more complicated, but 
> > > > > > even if
> > > > > > that one needs an extra branch, that is a win already.
> > > > >
> > > > > I would like to point out the potential trouble that fixing this 
> > > > > issue in the
> > > > > callers could cause.
> > > > >
> > > > > 1. This function is called multiple times in menu governor and TEO
> > > > > governor.
> > > >
> > > > What do you mean by "multiple times"?
> > > >
> > > > Each of the governors calls it once per cycle and its previous return
> > > > value is not used in the next cycle at least in teo.
> > >
> > > I remember a governor called this function twice in a cycle, I guess I 
> > > remember
> > > wrong.
> >
> > That obviously depends on the governor, but both teo and menu call it
> > once per cycle.
> >
> > > > > I'm not sure that receiving results using signed integers is enough
> > > > > to solve all the problems, in the worst case it may require increasing
> > > > > the logical complexity of the code.
> > > >
> > > > That is a valid concern, so it is a tradeoff between increasing the
> > > > logical complexity of the code and adding branches to it.
> > > >
> > > > > 2. This function is important for developing idle governor.
> > > > > If the problem is not fixed in the function itself, then this 
> > > > > potential
> > > > > pitfall should be explicitly stated in the documentation.
> > > >
> > > > That I can agree with.
> > > >
> > > > > This is because
> > > > > it is difficult to predict from the definition and naming of the 
> > > > > function
> > > > > that it might return a negative number. I actually discovered this 
> > > > > anomaly
> > > > > when I was doing data analysis on my own idle governor. For some idle 
> > > > > control
> > > > > algorithms, this exception return could lead to serious consequences,
> > > > > because negative return logically won't happen.
> > > >
> > > > Well, it's a matter of how to take the possible negative return value
> > > > into account so it does not affect the result of the computations.
> > >
> > > I think it is challenging for some algorithms to take negative return 
> > > values
> > > into account properly. For TEO (and even menu), it is possible to
> > > solve the problem by just changing the way the data is received is 
> > > because the
> > > learning mechanism for both algorithms is simple.
> >
> > Of course this depends on the governor.
> >
> > > One of the interesting things about the CPUIdle subsystem is that it is 
> > > well
> > > suited to introduce machine learning and probabilistic statistical 
> > > methods.
> >
> > You need to remember that the governor code runs in the idle loop
> > context which is expected to be reasonably fast.
> >
> > That's why we are worrying about individual branches here.
> >
> > > This means that many of the more complex and data-sensitive algorithms can
> > > potentially be explored. In the best case we will still need to add 
> > > additional
> > > code complexity to a new algorithm.
> >
> > So I'm not sure what the problem with adding an upfront negative value
> > check to the governor is.
> >
> > > It would reduce a lot of unnecessary considerations (for example, 
> > > highlight
> > > this shortcoming in the documentation) if we could ensure that this 
> > > function
> > > would work as it is logically defined. But I don't really understand
> > > how 

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-26 Thread Zhou Ti (x2019cwm)
which is expected to be reasonably fast.
>
> That's why we are worrying about individual branches here.
>
> > This means that many of the more complex and data-sensitive algorithms can
> > potentially be explored. In the best case we will still need to add 
> > additional
> > code complexity to a new algorithm.
>
> So I'm not sure what the problem with adding an upfront negative value
> check to the governor is.
>
> > It would reduce a lot of unnecessary considerations (for example, highlight
> > this shortcoming in the documentation) if we could ensure that this function
> > would work as it is logically defined. But I don't really understand
> > how much of a burden adding an extra branch would impose, so I don't know if
> > this tradeoff is worth it.
>
> It ultimately depends on the governor, which is why I think that the
> negative value check should be done by the governor, if needed, and
> not by the function called by it, because in the latter case the check
> may be redundant and we end up with an extra branch (or two branches
> in this particular case) for no good reason whatsoever.
>
> Yes, there are governors which simply can do the negative value check
> upfront right after calling that function and ensure that they will
> not deal with negative values going forward.  This is probably what
> I'll do in the menu case.
>
> However, if the governor is simple enough and it can avoid doing the
> explicit negative value check, I don't see a reason to do that check
> elsewhere "just in case".

Makes sense. I will submit my patch to fix this issue in menu and TEO.


From: Rafael J. Wysocki 
Sent: March 26, 2021 13:01
To: Zhou Ti (x2019cwm)
Cc: Rafael J. Wysocki; Rafael J. Wysocki; Frederic Weisbecker; LKML; Rafael J. 
Wysocki; Peter Zijlstra; Thomas Gleixner; Yunfeng Ye; Paul E . McKenney; 
Marcelo Tosatti; Ingo Molnar; Linux PM
Subject: Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() 
from returning negative value

On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  wrote:
>
> On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  wrote:
> > >
> > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > > > > > But I don't think it's a good idea to handle this in callers, 
> > > > > > because logically the function shouldn't return negative values. 
> > > > > > Returning 0 directly would allow idle governors to get another 
> > > > > > chance to select again.
> > > > >
> > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle are the 
> > > > > only
> > > > > callers of this. In any case we need to fix it.
> > > >
> > > > Yes, we do.
> > > >
> > > > So I said that I preferred to address this in the callers and the 
> > > > reason why
> > > > is because, for example, for the teo governor it would be a matter of 
> > > > using
> > > > a different data type to store the tick_nohz_get_sleep_length() return 
> > > > value,
> > > > like in the (untested) patch below.
> > > >
> > > > So at least in this case there is no need to add any new branches 
> > > > anywhere.
> > > >
> > > > I'm still not sure about menu, because it is more complicated, but even 
> > > > if
> > > > that one needs an extra branch, that is a win already.
> > >
> > > I would like to point out the potential trouble that fixing this issue in 
> > > the
> > > callers could cause.
> > >
> > > 1. This function is called multiple times in menu governor and TEO
> > > governor.
> >
> > What do you mean by "multiple times"?
> >
> > Each of the governors calls it once per cycle and its previous return
> > value is not used in the next cycle at least in teo.
>
> I remember a governor called this function twice in a cycle, I guess I 
> remember
> wrong.

That obviously depends on the governor, but both teo and menu call it
once per cycle.

> > > I'm not sure that receiving results using signed integers is enough
> > > to solve all the problems, in the worst case it may require increasing
> > > the logical complexity of the code.
> >
> > That is a valid concern, so it is a tradeoff between increasing the
> &

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-26 Thread Rafael J. Wysocki
On Thu, Mar 25, 2021 at 9:37 PM Zhou Ti (x2019cwm)  wrote:
>
> On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> > On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  wrote:
> > >
> > > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> > > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > > > > > But I don't think it's a good idea to handle this in callers, 
> > > > > > because logically the function shouldn't return negative values. 
> > > > > > Returning 0 directly would allow idle governors to get another 
> > > > > > chance to select again.
> > > > >
> > > > > Hmm, I'm going to leave the last word to Rafael since cpuidle are the 
> > > > > only
> > > > > callers of this. In any case we need to fix it.
> > > >
> > > > Yes, we do.
> > > >
> > > > So I said that I preferred to address this in the callers and the 
> > > > reason why
> > > > is because, for example, for the teo governor it would be a matter of 
> > > > using
> > > > a different data type to store the tick_nohz_get_sleep_length() return 
> > > > value,
> > > > like in the (untested) patch below.
> > > >
> > > > So at least in this case there is no need to add any new branches 
> > > > anywhere.
> > > >
> > > > I'm still not sure about menu, because it is more complicated, but even 
> > > > if
> > > > that one needs an extra branch, that is a win already.
> > >
> > > I would like to point out the potential trouble that fixing this issue in 
> > > the
> > > callers could cause.
> > >
> > > 1. This function is called multiple times in menu governor and TEO
> > > governor.
> >
> > What do you mean by "multiple times"?
> >
> > Each of the governors calls it once per cycle and its previous return
> > value is not used in the next cycle at least in teo.
>
> I remember a governor called this function twice in a cycle, I guess I 
> remember
> wrong.

That obviously depends on the governor, but both teo and menu call it
once per cycle.

> > > I'm not sure that receiving results using signed integers is enough
> > > to solve all the problems, in the worst case it may require increasing
> > > the logical complexity of the code.
> >
> > That is a valid concern, so it is a tradeoff between increasing the
> > logical complexity of the code and adding branches to it.
> >
> > > 2. This function is important for developing idle governor.
> > > If the problem is not fixed in the function itself, then this potential
> > > pitfall should be explicitly stated in the documentation.
> >
> > That I can agree with.
> >
> > > This is because
> > > it is difficult to predict from the definition and naming of the function
> > > that it might return a negative number. I actually discovered this anomaly
> > > when I was doing data analysis on my own idle governor. For some idle 
> > > control
> > > algorithms, this exception return could lead to serious consequences,
> > > because negative return logically won't happen.
> >
> > Well, it's a matter of how to take the possible negative return value
> > into account so it does not affect the result of the computations.
>
> I think it is challenging for some algorithms to take negative return values
> into account properly. For TEO (and even menu), it is possible to
> solve the problem by just changing the way the data is received is because the
> learning mechanism for both algorithms is simple.

Of course this depends on the governor.

> One of the interesting things about the CPUIdle subsystem is that it is well
> suited to introduce machine learning and probabilistic statistical methods.

You need to remember that the governor code runs in the idle loop
context which is expected to be reasonably fast.

That's why we are worrying about individual branches here.

> This means that many of the more complex and data-sensitive algorithms can
> potentially be explored. In the best case we will still need to add additional
> code complexity to a new algorithm.

So I'm not sure what the problem with adding an upfront negative value
check to the governor is.

> It would reduce a lot of unnecessary considerations (for example, highlight
> this shortcoming in the documentation) if we could ensure that this function
> would work as it is logically defined. But I don't really understand
> how much of a burden adding an extra branch would impose, so I don't know if
> this tradeoff is worth it.

It ultimately depends on the governor, which is why I think that the
negative value check should be done by the governor, if needed, and
not by the function called by it, because in the latter case the check
may be redundant and we end up with an extra branch (or two branches
in this particular case) for no good reason whatsoever.

Yes, there are governors which simply can do the negative value check
upfront right after calling that function and ensure that they will
not deal with negative values going forward.  This is probably what
I'll 

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-25 Thread Zhou Ti (x2019cwm)
On March 25, 2021 15:50, Rafael J. Wysocki wrote:
> On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  wrote:
> >
> > On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> > > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > > > > But I don't think it's a good idea to handle this in callers, because 
> > > > > logically the function shouldn't return negative values. Returning 0 
> > > > > directly would allow idle governors to get another chance to select 
> > > > > again.
> > > >
> > > > Hmm, I'm going to leave the last word to Rafael since cpuidle are the 
> > > > only
> > > > callers of this. In any case we need to fix it.
> > >
> > > Yes, we do.
> > >
> > > So I said that I preferred to address this in the callers and the reason 
> > > why
> > > is because, for example, for the teo governor it would be a matter of 
> > > using
> > > a different data type to store the tick_nohz_get_sleep_length() return 
> > > value,
> > > like in the (untested) patch below.
> > >
> > > So at least in this case there is no need to add any new branches 
> > > anywhere.
> > >
> > > I'm still not sure about menu, because it is more complicated, but even if
> > > that one needs an extra branch, that is a win already.
> >
> > I would like to point out the potential trouble that fixing this issue in 
> > the
> > callers could cause.
> >
> > 1. This function is called multiple times in menu governor and TEO
> > governor.
> 
> What do you mean by "multiple times"?
> 
> Each of the governors calls it once per cycle and its previous return
> value is not used in the next cycle at least in teo.

I remember a governor called this function twice in a cycle, I guess I remember 
wrong.

> 
> > I'm not sure that receiving results using signed integers is enough
> > to solve all the problems, in the worst case it may require increasing
> > the logical complexity of the code.
> 
> That is a valid concern, so it is a tradeoff between increasing the
> logical complexity of the code and adding branches to it.
> 
> > 2. This function is important for developing idle governor.
> > If the problem is not fixed in the function itself, then this potential
> > pitfall should be explicitly stated in the documentation.
> 
> That I can agree with.
> 
> > This is because
> > it is difficult to predict from the definition and naming of the function
> > that it might return a negative number. I actually discovered this anomaly
> > when I was doing data analysis on my own idle governor. For some idle 
> > control
> > algorithms, this exception return could lead to serious consequences,
> > because negative return logically won't happen.
> 
> Well, it's a matter of how to take the possible negative return value
> into account so it does not affect the result of the computations.

I think it is challenging for some algorithms to take negative return values 
into account properly. For TEO (and even menu), it is possible to 
solve the problem by just changing the way the data is received is because the 
learning mechanism for both algorithms is simple. 

One of the interesting things about the CPUIdle subsystem is that it is well 
suited to introduce machine learning and probabilistic statistical methods.
This means that many of the more complex and data-sensitive algorithms can 
potentially be explored. In the best case we will still need to add additional 
code complexity to a new algorithm.

It would reduce a lot of unnecessary considerations (for example, highlight 
this shortcoming in the documentation) if we could ensure that this function 
would work as it is logically defined. But I don't really understand 
how much of a burden adding an extra branch would impose, so I don't know if 
this tradeoff is worth it.


Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-25 Thread Rafael J. Wysocki
On Thu, Mar 25, 2021 at 8:18 PM Zhou Ti (x2019cwm)  wrote:
>
> On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> > On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> > > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > > > But I don't think it's a good idea to handle this in callers, because 
> > > > logically the function shouldn't return negative values. Returning 0 
> > > > directly would allow idle governors to get another chance to select 
> > > > again.
> > >
> > > Hmm, I'm going to leave the last word to Rafael since cpuidle are the only
> > > callers of this. In any case we need to fix it.
> >
> > Yes, we do.
> >
> > So I said that I preferred to address this in the callers and the reason why
> > is because, for example, for the teo governor it would be a matter of using
> > a different data type to store the tick_nohz_get_sleep_length() return 
> > value,
> > like in the (untested) patch below.
> >
> > So at least in this case there is no need to add any new branches anywhere.
> >
> > I'm still not sure about menu, because it is more complicated, but even if
> > that one needs an extra branch, that is a win already.
>
> I would like to point out the potential trouble that fixing this issue in the
> callers could cause.
>
> 1. This function is called multiple times in menu governor and TEO
> governor.

What do you mean by "multiple times"?

Each of the governors calls it once per cycle and its previous return
value is not used in the next cycle at least in teo.

> I'm not sure that receiving results using signed integers is enough
> to solve all the problems, in the worst case it may require increasing
> the logical complexity of the code.

That is a valid concern, so it is a tradeoff between increasing the
logical complexity of the code and adding branches to it.

> 2. This function is important for developing idle governor.
> If the problem is not fixed in the function itself, then this potential
> pitfall should be explicitly stated in the documentation.

That I can agree with.

> This is because
> it is difficult to predict from the definition and naming of the function
> that it might return a negative number. I actually discovered this anomaly
> when I was doing data analysis on my own idle governor. For some idle control
> algorithms, this exception return could lead to serious consequences,
> because negative return logically won't happen.

Well, it's a matter of how to take the possible negative return value
into account so it does not affect the result of the computations.

> >
> > ---
> >  drivers/cpuidle/governors/teo.c |8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/governors/teo.c
> > ===
> > --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> > +++ linux-pm/drivers/cpuidle/governors/teo.c
> > @@ -100,8 +100,8 @@ struct teo_idle_state {
> >   * @intervals: Saved idle duration values.
> >   */
> >  struct teo_cpu {
> > -   u64 time_span_ns;
> > -   u64 sleep_length_ns;
> > +   s64 time_span_ns;
> > +   s64 sleep_length_ns;
> > struct teo_idle_state states[CPUIDLE_STATE_MAX];
> > int interval_idx;
> > u64 intervals[INTERVALS];
> > @@ -216,7 +216,7 @@ static bool teo_time_ok(u64 interval_ns)
> >   */
> >  static int teo_find_shallower_state(struct cpuidle_driver *drv,
> > struct cpuidle_device *dev, int 
> > state_idx,
> > -   u64 duration_ns)
> > +   s64 duration_ns)
> >  {
> > int i;
> >
> > @@ -242,7 +242,7 @@ static int teo_select(struct cpuidle_dri
> >  {
> > struct teo_cpu *cpu_data = per_cpu_ptr(_cpus, dev->cpu);
> > s64 latency_req = cpuidle_governor_latency_req(dev->cpu);
> > -   u64 duration_ns;
> > +   s64 duration_ns;
> > unsigned int hits, misses, early_hits;
> > int max_early_idx, prev_max_early_idx, constraint_idx, idx, i;
> > ktime_t delta_tick;
>


Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-25 Thread Zhou Ti (x2019cwm)
On March 25, 2021 14:56, Rafael J. Wysocki wrote:
> On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> > On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > > But I don't think it's a good idea to handle this in callers, because 
> > > logically the function shouldn't return negative values. Returning 0 
> > > directly would allow idle governors to get another chance to select again.
> >
> > Hmm, I'm going to leave the last word to Rafael since cpuidle are the only
> > callers of this. In any case we need to fix it.
> 
> Yes, we do.
> 
> So I said that I preferred to address this in the callers and the reason why
> is because, for example, for the teo governor it would be a matter of using
> a different data type to store the tick_nohz_get_sleep_length() return value,
> like in the (untested) patch below.
> 
> So at least in this case there is no need to add any new branches anywhere.
> 
> I'm still not sure about menu, because it is more complicated, but even if
> that one needs an extra branch, that is a win already.

I would like to point out the potential trouble that fixing this issue in the 
callers could cause.

1. This function is called multiple times in menu governor and TEO 
governor. I'm not sure that receiving results using signed integers is enough 
to solve all the problems, in the worst case it may require increasing 
the logical complexity of the code.

2. This function is important for developing idle governor. 
If the problem is not fixed in the function itself, then this potential 
pitfall should be explicitly stated in the documentation. This is because 
it is difficult to predict from the definition and naming of the function 
that it might return a negative number. I actually discovered this anomaly 
when I was doing data analysis on my own idle governor. For some idle control 
algorithms, this exception return could lead to serious consequences, 
because negative return logically won't happen.

> 
> ---
>  drivers/cpuidle/governors/teo.c |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> Index: linux-pm/drivers/cpuidle/governors/teo.c
> ===
> --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> +++ linux-pm/drivers/cpuidle/governors/teo.c
> @@ -100,8 +100,8 @@ struct teo_idle_state {
>   * @intervals: Saved idle duration values.
>   */
>  struct teo_cpu {
> -   u64 time_span_ns;
> -   u64 sleep_length_ns;
> +   s64 time_span_ns;
> +   s64 sleep_length_ns;
> struct teo_idle_state states[CPUIDLE_STATE_MAX];
> int interval_idx;
> u64 intervals[INTERVALS];
> @@ -216,7 +216,7 @@ static bool teo_time_ok(u64 interval_ns)
>   */
>  static int teo_find_shallower_state(struct cpuidle_driver *drv,
> struct cpuidle_device *dev, int state_idx,
> -   u64 duration_ns)
> +   s64 duration_ns)
>  {
> int i;
> 
> @@ -242,7 +242,7 @@ static int teo_select(struct cpuidle_dri
>  {
> struct teo_cpu *cpu_data = per_cpu_ptr(_cpus, dev->cpu);
> s64 latency_req = cpuidle_governor_latency_req(dev->cpu);
> -   u64 duration_ns;
> +   s64 duration_ns;
> unsigned int hits, misses, early_hits;
> int max_early_idx, prev_max_early_idx, constraint_idx, idx, i;
> ktime_t delta_tick;



Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-25 Thread Rafael J. Wysocki
On Thursday, March 25, 2021 2:14:00 PM CET Frederic Weisbecker wrote:
> On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> > But I don't think it's a good idea to handle this in callers, because 
> > logically the function shouldn't return negative values. Returning 0 
> > directly would allow idle governors to get another chance to select again.
> 
> Hmm, I'm going to leave the last word to Rafael since cpuidle are the only
> callers of this. In any case we need to fix it.

Yes, we do.

So I said that I preferred to address this in the callers and the reason why
is because, for example, for the teo governor it would be a matter of using
a different data type to store the tick_nohz_get_sleep_length() return value,
like in the (untested) patch below.

So at least in this case there is no need to add any new branches anywhere.

I'm still not sure about menu, because it is more complicated, but even if
that one needs an extra branch, that is a win already.

---
 drivers/cpuidle/governors/teo.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===
--- linux-pm.orig/drivers/cpuidle/governors/teo.c
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -100,8 +100,8 @@ struct teo_idle_state {
  * @intervals: Saved idle duration values.
  */
 struct teo_cpu {
-   u64 time_span_ns;
-   u64 sleep_length_ns;
+   s64 time_span_ns;
+   s64 sleep_length_ns;
struct teo_idle_state states[CPUIDLE_STATE_MAX];
int interval_idx;
u64 intervals[INTERVALS];
@@ -216,7 +216,7 @@ static bool teo_time_ok(u64 interval_ns)
  */
 static int teo_find_shallower_state(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int state_idx,
-   u64 duration_ns)
+   s64 duration_ns)
 {
int i;
 
@@ -242,7 +242,7 @@ static int teo_select(struct cpuidle_dri
 {
struct teo_cpu *cpu_data = per_cpu_ptr(_cpus, dev->cpu);
s64 latency_req = cpuidle_governor_latency_req(dev->cpu);
-   u64 duration_ns;
+   s64 duration_ns;
unsigned int hits, misses, early_hits;
int max_early_idx, prev_max_early_idx, constraint_idx, idx, i;
ktime_t delta_tick;





Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-25 Thread Frederic Weisbecker
On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> But I don't think it's a good idea to handle this in callers, because 
> logically the function shouldn't return negative values. Returning 0 directly 
> would allow idle governors to get another chance to select again.

Hmm, I'm going to leave the last word to Rafael since cpuidle are the only
callers of this. In any case we need to fix it.

Thanks.


Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-17 Thread Zhou Ti (x2019cwm)
On March 16, 2021 12:25, Peter Zijlstra wrote:
>On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
>> But I don't think it's a good idea to handle this in callers, because
>> logically the function shouldn't return negative values. Returning 0
>> directly would allow idle governors to get another chance to select
>> again.


>A: Because it messes up the order in which people normally read text.
>Q: Why is top-posting such a bad thing?
>A: Top-posting.
>Q: What is the most annoying thing in e-mail?

Sorry about it, I am a newbie, I won't do it again.

Re: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Peter Zijlstra
On Tue, Mar 16, 2021 at 04:08:08PM +, Zhou Ti (x2019cwm) wrote:
> But I don't think it's a good idea to handle this in callers, because
> logically the function shouldn't return negative values. Returning 0
> directly would allow idle governors to get another chance to select
> again.


A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?


回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Zhou Ti (x2019cwm)
But I don't think it's a good idea to handle this in callers, because logically 
the function shouldn't return negative values. Returning 0 directly would allow 
idle governors to get another chance to select again.


发件人: Zhou Ti (x2019cwm) 
发送时间: 2021年3月16日 3:57
收件人: Rafael J. Wysocki; Frederic Weisbecker; Peter Zijlstra
抄送: Thomas Gleixner; LKML; Yunfeng Ye; Paul E . McKenney; Marcelo Tosatti; Ingo 
Molnar; raf...@kernel.org
主题: 回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from 
returning negative value

Yes, the return of a negative number results in a very large unsigned integer, 
which idle governors use as a baseline prediction for future interrupts and to 
correct their own parameters. This problem can lead to the selection of idle 
states that are too deep, which can be detrimental to both energy and 
performance.


发件人: Rafael J. Wysocki 
发送时间: 2021年3月16日 3:26
收件人: Frederic Weisbecker; Peter Zijlstra
抄送: Thomas Gleixner; LKML; Zhou Ti (x2019cwm); Yunfeng Ye; Paul E . McKenney; 
Marcelo Tosatti; Ingo Molnar; raf...@kernel.org
主题: Re: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from 
returning negative value

On 3/16/2021 3:53 PM, Frederic Weisbecker wrote:
> On Tue, Mar 16, 2021 at 03:35:37PM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 16, 2021 at 02:37:03PM +0100, Frederic Weisbecker wrote:
>>> On Tue, Mar 16, 2021 at 01:21:29PM +0100, Peter Zijlstra wrote:
>>>> On Thu, Mar 11, 2021 at 01:36:59PM +0100, Frederic Weisbecker wrote:
>>>>> From: "Zhou Ti (x2019cwm)" 
>>>>>
>>>>> If the hardware clock happens to fire its interrupts late, two possible
>>>>> issues can happen while calling tick_nohz_get_sleep_length(). Either:
>>>>>
>>>>> 1) The next clockevent device event is due past the last idle entry time.
>>>>>
>>>>> or:
>>>>>
>>>>> 2) The last timekeeping update happened before the last idle entry time
>>>>> and the next timer callback expires before the last idle entry time.
>>>>>
>>>>> Make sure that both cases are handled to avoid returning a negative
>>>>> duration to the cpuidle governors.
>>>> Why? ... and wouldn't it be cheaper the fix the caller to
>>>> check negative once, instead of adding two branches here?
>>> There are already two callers and potentially two return values to check
>>> for each because the function returns two values.
>>>
>>> I'd rather make the API more robust instead of fixing each callers and 
>>> worrying
>>> about future ones.
>> But what's the actual problem? The Changelog doesn't say why returning a
>> negative value is a problem, and in fact the return value is explicitly
>> signed.
>>
>> Anyway, I don't terribly mind the patch, I was just confused by the lack
>> of actual justification.
> And you're right, the motivation is pure FUD: I don't know exactly
> how the cpuidle governors may react to such negative values and so this
> is just to prevent from potential accident.
>
> Rafael, does that look harmless to you?

No, this is a problem now.  Both governors using this assign the return
value of it to a u64 var and so negative values confuse them.

That said I think it's better to deal with the issue in the callers.

I can send a patch for that if needed.




回复: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from returning negative value

2021-03-16 Thread Zhou Ti (x2019cwm)
Yes, the return of a negative number results in a very large unsigned integer, 
which idle governors use as a baseline prediction for future interrupts and to 
correct their own parameters. This problem can lead to the selection of idle 
states that are too deep, which can be detrimental to both energy and 
performance.


发件人: Rafael J. Wysocki 
发送时间: 2021年3月16日 3:26
收件人: Frederic Weisbecker; Peter Zijlstra
抄送: Thomas Gleixner; LKML; Zhou Ti (x2019cwm); Yunfeng Ye; Paul E . McKenney; 
Marcelo Tosatti; Ingo Molnar; raf...@kernel.org
主题: Re: [PATCH 01/10] tick/nohz: Prevent tick_nohz_get_sleep_length() from 
returning negative value

On 3/16/2021 3:53 PM, Frederic Weisbecker wrote:
> On Tue, Mar 16, 2021 at 03:35:37PM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 16, 2021 at 02:37:03PM +0100, Frederic Weisbecker wrote:
>>> On Tue, Mar 16, 2021 at 01:21:29PM +0100, Peter Zijlstra wrote:
 On Thu, Mar 11, 2021 at 01:36:59PM +0100, Frederic Weisbecker wrote:
> From: "Zhou Ti (x2019cwm)" 
>
> If the hardware clock happens to fire its interrupts late, two possible
> issues can happen while calling tick_nohz_get_sleep_length(). Either:
>
> 1) The next clockevent device event is due past the last idle entry time.
>
> or:
>
> 2) The last timekeeping update happened before the last idle entry time
> and the next timer callback expires before the last idle entry time.
>
> Make sure that both cases are handled to avoid returning a negative
> duration to the cpuidle governors.
 Why? ... and wouldn't it be cheaper the fix the caller to
 check negative once, instead of adding two branches here?
>>> There are already two callers and potentially two return values to check
>>> for each because the function returns two values.
>>>
>>> I'd rather make the API more robust instead of fixing each callers and 
>>> worrying
>>> about future ones.
>> But what's the actual problem? The Changelog doesn't say why returning a
>> negative value is a problem, and in fact the return value is explicitly
>> signed.
>>
>> Anyway, I don't terribly mind the patch, I was just confused by the lack
>> of actual justification.
> And you're right, the motivation is pure FUD: I don't know exactly
> how the cpuidle governors may react to such negative values and so this
> is just to prevent from potential accident.
>
> Rafael, does that look harmless to you?

No, this is a problem now.  Both governors using this assign the return
value of it to a u64 var and so negative values confuse them.

That said I think it's better to deal with the issue in the callers.

I can send a patch for that if needed.