On Thu, Sep 24, 2020 at 1:00 PM Lukasz Luba <lukasz.l...@arm.com> wrote: > > > > On 9/24/20 11:24 AM, Rafael J. Wysocki wrote: > > On Thu, Sep 24, 2020 at 11:25 AM Lukasz Luba <lukasz.l...@arm.com> wrote: > >> > >> Hi Rafael, > >> > >> On 9/23/20 2:48 PM, Rafael J. Wysocki wrote: > >>> On Wed, Sep 16, 2020 at 8:45 AM Viresh Kumar <viresh.ku...@linaro.org> > >>> wrote: > >>>> > >>>> In order to prepare for lock-less stats update, add support to defer any > >>>> updates to it until cpufreq_stats_record_transition() is called. > >>> > >>> This is a bit devoid of details. > >>> > >>> I guess you mean reset in particular, but that's not clear from the above. > >>> > >>> Also, it would be useful to describe the design somewhat. > >>> > >>>> Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org> > >>>> --- > >>>> drivers/cpufreq/cpufreq_stats.c | 75 ++++++++++++++++++++++++--------- > >>>> 1 file changed, 56 insertions(+), 19 deletions(-) > >>>> > >>>> diff --git a/drivers/cpufreq/cpufreq_stats.c > >>>> b/drivers/cpufreq/cpufreq_stats.c > >>>> index 94d959a8e954..3e7eee29ee86 100644 > >>>> --- a/drivers/cpufreq/cpufreq_stats.c > >>>> +++ b/drivers/cpufreq/cpufreq_stats.c > >>>> @@ -22,17 +22,22 @@ struct cpufreq_stats { > >>>> spinlock_t lock; > >>>> unsigned int *freq_table; > >>>> unsigned int *trans_table; > >>>> + > >>>> + /* Deferred reset */ > >>>> + unsigned int reset_pending; > >>>> + unsigned long long reset_time; > >>>> }; > >>>> > >>>> -static void cpufreq_stats_update(struct cpufreq_stats *stats) > >>>> +static void cpufreq_stats_update(struct cpufreq_stats *stats, > >>>> + unsigned long long time) > >>>> { > >>>> unsigned long long cur_time = get_jiffies_64(); > >>>> > >>>> - stats->time_in_state[stats->last_index] += cur_time - > >>>> stats->last_time; > >>>> + stats->time_in_state[stats->last_index] += cur_time - time; > >>>> stats->last_time = cur_time; > >>>> } > >>>> > >>>> -static void cpufreq_stats_clear_table(struct cpufreq_stats *stats) > >>>> +static void cpufreq_stats_reset_table(struct cpufreq_stats *stats) > >>>> { > >>>> unsigned int count = stats->max_state; > >>>> > >>>> @@ -41,42 +46,67 @@ static void cpufreq_stats_clear_table(struct > >>>> cpufreq_stats *stats) > >>>> memset(stats->trans_table, 0, count * count * sizeof(int)); > >>>> stats->last_time = get_jiffies_64(); > >>>> stats->total_trans = 0; > >>>> + > >>>> + /* Adjust for the time elapsed since reset was requested */ > >>>> + WRITE_ONCE(stats->reset_pending, 0); > >>> > >>> What if this runs in parallel with store_reset()? > >>> > >>> The latter may update reset_pending to 1 before the below runs. > >>> Conversely, this may clear reset_pending right after store_reset() has > >>> set it to 1, but before it manages to set reset_time. Is that not a > >>> problem? > >> > >> I wonder if we could just drop the reset feature. Is there a tool > >> which uses this file? The 'reset' sysfs would probably have to stay > >> forever, but an empty implementation is not an option? > > > > Well, having an empty sysfs attr would be a bit ugly, but the > > implementation of it could be simplified. > > > >> The documentation states: > >> 'This can be useful for evaluating system behaviour under different > >> governors without the need for a reboot.' > >> With the scenario of fast-switch this resetting complicates the > >> implementation and the justification of having it just for experiments > >> avoiding reboot is IMO weak. The real production code would have to pay > >> extra cycles every time. Also, we would probably not experiment with > >> cpufreq different governors, since the SchedUtil is considered the best > >> option. > > > > It would still be good to have a way to test it against the other > > available options, though. > > > > Experimenting with different governors would still be possible, just > the user-space would have to take a snapshot of the stats when switching > to a new governor. Then the values presented in the stats would just > need to be calculated in this user tool against the snapshot. > > The resetting is also not that bad, since nowadays more components > maintain some kind of local statistics/history (scheduler, thermal). > I would recommend to reset the whole system and repeat the same tests > with different governor, just to be sure that everything starts from > similar state (utilization, temperature, other devfreq devices > frequencies etc).
Well, if everyone agrees on removing the reset feature, let's drop the sysfs attr too, as it would be useless going forward. Admittedly, I don't have a strong opinion and since intel_pstate doesn't use a frequency table, this is not relevant for systems using that driver anyway.