Hi Jerin,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Sent: Thursday, August 18, 2016 17:22
> To: dev at dpdk.org
> Cc: thomas.monjalon at 6wind.com; jianbo.liu at linaro.org;
> viktorin at rehivetech.com; Jerin Jacob <jerin.jacob at caviumnetworks.com>
> Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
>
> Existing cntvct_el0 based rte_rdtsc() provides portable
> means to get wall clock counter at user space. Typically
> it runs at <= 100MHz.
>
> The alternative method to enable rte_rdtsc() for high resolution
> wall clock counter is through armv8 PMU subsystem.
> The PMU cycle counter runs at CPU frequency, However,
> access to PMU cycle counter from user space is not enabled
> by default in the arm64 linux kernel.
> It is possible to enable cycle counter at user space access
> by configuring the PMU from the privileged mode (kernel space).
>
> by default rte_rdtsc() implementation uses portable
> cntvct_el0 scheme. Application can choose the PMU based
> implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
>
> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> ---
>
> The PMU based scheme useful for high accuracy performance profiling.
> Find below the example steps to configure the PMU based cycle counter on an
> armv8 machine.
>
> # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> # cd armv8_pmu_cycle_counter_el0
> # make
> # sudo insmod pmu_el0_cycle_counter.ko
> # cd $DPDK_DIR
> # make config T=arm64-armv8a-linuxapp-gcc
> # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> # make -j 4

Can we make this kernel module a part of DPDK as well? Maybe under linuxapp,
so that it is compiled along with DPDK?

>
> ---
>  .../common/include/arch/arm/rte_cycles_64.h | 33
> ++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> index 14f2612..867a946 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> @@ -45,6 +45,11 @@ extern "C" {
>   * @return
>   *   The time base for this lcore.
>   */
> +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> +/**
> + * This call is portable to any ARMv8 architecture, however, typically
> + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> + */
>  static inline uint64_t
>  rte_rdtsc(void)
>  {
> @@ -53,6 +58,34 @@ rte_rdtsc(void)
>  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
>  	return tsc;
>  }
> +#else
> +/**
> + * This is an alternative method to enable rte_rdtsc() with high resolution
> + * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
> + * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
> + * access to PMU cycle counter from user space is not enabled by default in
> + * arm64 linux kernel.
> + * It is possible to enable cycle counter at user space access by configuring
> + * the PMU from the privileged mode (kernel space).
> + *
> + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> + * val |= (BIT(0) | BIT(2));
> + * isb();
> + * asm volatile("msr pmcr_el0, %0" : : "r" (val));

In your git repo I see that the cycle counter is not disabled on cleanup
(PMCNTENCLR_EL0). It would be better to disable the cycle counter as well
at module exit.

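To illustrate the cleanup point, here is a rough sketch of the register bits
involved. It is only a sketch under stated assumptions: all helper names
(pmu_bit, pmu_cycle_counter_mask, pmu_cnten_set, pmu_cnten_clr,
pmu_userenr_init) are made up for this mail and do not come from the module
or from DPDK. The bit positions are the ones used in the comment quoted
above. The helpers only model the write-one-to-set / write-one-to-clear
behaviour of PMCNTENSET_EL0 / PMCNTENCLR_EL0 in plain C so that it builds on
any host; the real exit path would write the same mask to PMCNTENCLR_EL0
with msr from kernel space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: single-bit mask, like the kernel's BIT(n). */
static inline uint64_t pmu_bit(unsigned int n)
{
	assert(n < 64); /* shifting a 64-bit value by >= 64 is undefined */
	return UINT64_C(1) << n;
}

/* Bit 31 of PMCNTENSET_EL0 / PMCNTENCLR_EL0 selects the cycle counter. */
static inline uint64_t pmu_cycle_counter_mask(void)
{
	return pmu_bit(31);
}

/* Value the quoted comment writes to PMUSERENR_EL0: bits 0 and 2. */
static inline uint64_t pmu_userenr_init(void)
{
	return pmu_bit(0) | pmu_bit(2);
}

/* Software model of a write to PMCNTENSET_EL0: 1-bits enable counters. */
static inline uint64_t pmu_cnten_set(uint64_t enabled, uint64_t mask)
{
	return enabled | mask;
}

/* Software model of a write to PMCNTENCLR_EL0: 1-bits disable counters. */
static inline uint64_t pmu_cnten_clr(uint64_t enabled, uint64_t mask)
{
	return enabled & ~mask;
}
```

With this model, module init corresponds to
pmu_cnten_set(state, pmu_cycle_counter_mask()) and module exit to
pmu_cnten_clr(state, pmu_cycle_counter_mask()), which leaves any other
enabled counters untouched.
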
> + *
> + */
> +static inline uint64_t
> +rte_rdtsc(void)
> +{
> +	uint64_t tsc;
> +
> +	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> +	return tsc;
> +}
> +#endif
>
>  static inline uint64_t
>  rte_rdtsc_precise(void)
> --
> 2.5.5

Do you also plan to support performance monitor event counters?

Regards,
Nipun