On Tue Feb 13, 2024 at 3:26 PM AEST, Shrikanth Hegde wrote:
> powerVM hypervisor updates the VPA fields with stolen time data.
> It currently reports enqueue_dispatch_tb and ready_enqueue_tb for
> this purpose. In linux these two fields are used to report the stolen time.
>
> The VPA fields are updated at the TB frequency. On powerPC its mostly
> set at 512Mhz. Hence this needs a conversion to ns when reporting it
> back as rest of the kernel timings are in ns. This conversion is already
> handled in tb_to_ns function. So use that function to report accurate
> stolen time.
>
> Observed this issue and used an Capped Shared Processor LPAR(SPLPAR) to
> simplify the experiments. In all these cases, 100% VP Load is run using
> stress-ng workload. Values of stolen time is in percentages as reported
> by mpstat. With the patch values are close to expected.
>
>               6.8.rc1    +Patch
> 12EC/12VP       0.0        0.0
> 12EC/24VP      25.7       50.2
> 12EC/36VP      37.3       69.2
> 12EC/48VP      38.5       78.3
>
>
> Fixes: 0e8a63132800 ("powerpc/pseries: Implement
> CONFIG_PARAVIRT_TIME_ACCOUNTING")

Good find and fix. Paper bag for me.

I wonder why we didn't catch it in the first place. Maybe we didn't
understand the hypervisor's sharing algorithm and what we expected
it to report. In any case this is right.

The KVM implementation of the counters is in TB, so that's fine.

Reviewed-by: Nicholas Piggin <npig...@gmail.com>

Thanks,
Nick

> Signed-off-by: Shrikanth Hegde <sshe...@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/lpar.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 4561667832ed..bdcc428e1c2b 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -662,8 +662,12 @@ u64 pseries_paravirt_steal_clock(int cpu)
>  {
>  	struct lppaca *lppaca = &lppaca_of(cpu);
>
> -	return be64_to_cpu(READ_ONCE(lppaca->enqueue_dispatch_tb)) +
> -		be64_to_cpu(READ_ONCE(lppaca->ready_enqueue_tb));
> +	/*
> +	 * VPA steal time counters are reported at TB frequency. Hence do a
> +	 * conversion to ns before returning
> +	 */
> +	return tb_to_ns(be64_to_cpu(READ_ONCE(lppaca->enqueue_dispatch_tb)) +
> +			be64_to_cpu(READ_ONCE(lppaca->ready_enqueue_tb)));
>  }
>  #endif
>
> --
> 2.39.3
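
For anyone following the arithmetic, here is a rough userspace sketch of the
unit conversion the patch relies on. It is not the kernel's tb_to_ns()
implementation; sketch_tb_to_ns(), sketch_steal_ns() and the SKETCH_*
constants are hypothetical names made up for illustration, and the 512 MHz
timebase is only the "mostly" value mentioned in the changelog. At that
frequency 512 ticks correspond to 1000 ns, so raw tick counts read as
nanoseconds come out at roughly half the real steal time, which looks
consistent with the 25.7 vs 50.2 row above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed timebase frequency; the changelog says it is mostly 512 MHz. */
#define SKETCH_TB_HZ        512000000ULL
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper, NOT the kernel's tb_to_ns(): convert timebase
 * ticks to nanoseconds with plain integer arithmetic.
 */
static uint64_t sketch_tb_to_ns(uint64_t ticks, uint64_t tb_hz)
{
	assert(tb_hz != 0);

	/* Handle whole seconds separately so the multiply cannot overflow
	 * for frequencies in the range discussed here. */
	uint64_t secs = ticks / tb_hz;
	uint64_t rem = ticks % tb_hz;

	return secs * SKETCH_NSEC_PER_SEC +
	       (rem * SKETCH_NSEC_PER_SEC) / tb_hz;
}

/*
 * Hypothetical stand-in for the patched steal clock: sum the two VPA
 * counters (already converted to CPU endianness) and convert once.
 */
static uint64_t sketch_steal_ns(uint64_t enqueue_dispatch_tb,
				uint64_t ready_enqueue_tb,
				uint64_t tb_hz)
{
	return sketch_tb_to_ns(enqueue_dispatch_tb + ready_enqueue_tb, tb_hz);
}
```

With SKETCH_TB_HZ, sketch_steal_ns(256, 256, SKETCH_TB_HZ) gives 1000 ns,
whereas the pre-patch code would in effect have reported 512.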