Philippe Gerum wrote:
> Here is likely why we have different levels of accuracy and performance:
> firstly, my version is bluntly based on the kHz freq; secondly, it
> calculates the other way around, i.e. ns2tsc, so that TSC values are kept
> in the inner code, but more efficiently converted from ns counts passed to
> the outer interface:
>
> static unsigned long ns2cyc_scale;
> #define NS2CYC_SCALE_FACTOR 10 /* 2^10, carefully chosen */
>
> static inline void set_ns2cyc_scale(unsigned long cpu_khz)
> {
> 	ns2cyc_scale = (cpu_khz << NS2CYC_SCALE_FACTOR) / 1000000;
> }
>
> static inline unsigned long long ns_2_cycles(unsigned long long ns)
> {
> 	return ns * ns2cyc_scale >> NS2CYC_SCALE_FACTOR;
> }

Your version performs ~50% better than mine (outperforming the original version by a factor of 7 on a 1 GHz box, vs. 4.8). I think you compared non-optimised code, didn't you? Without -O2, I see 15 times better performance.

[Gilles' variant still refuses to get benchmarked.]

Jan
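
For anyone following the thread, here is a minimal sketch of the scaled ns-to-cycles conversion quoted above, next to a plain division-based reference, so the rounding behaviour can be checked. It is not the code that was benchmarked: the helper names (ns2cyc_scale_for, ns_to_cycles_scaled, ns_to_cycles_exact) and the use of 64-bit types throughout are assumptions made for illustration; only NS2CYC_SCALE_FACTOR and the arithmetic come from the quoted snippet.

```c
#include <assert.h>
#include <stdint.h>

#define NS2CYC_SCALE_FACTOR 10 /* fixed-point shift (2^10), as in the quoted code */

/* Hypothetical helper: fixed-point multiplier for a CPU clock given in kHz.
 * cycles per ns = cpu_khz / 10^6, pre-multiplied by 2^NS2CYC_SCALE_FACTOR. */
static inline uint64_t ns2cyc_scale_for(uint64_t cpu_khz)
{
    assert(cpu_khz > 0);
    return (cpu_khz << NS2CYC_SCALE_FACTOR) / 1000000u;
}

/* Scaled conversion: one multiply and one shift, no division on the hot path.
 * The result is truncated, both here and when the multiplier was computed. */
static inline uint64_t ns_to_cycles_scaled(uint64_t ns, uint64_t scale)
{
    return (ns * scale) >> NS2CYC_SCALE_FACTOR;
}

/* Reference conversion using a 64-bit division, for comparison only
 * (assumed here; it is not the "original version" from the thread). */
static inline uint64_t ns_to_cycles_exact(uint64_t ns, uint64_t cpu_khz)
{
    return ns * cpu_khz / 1000000u;
}
```

At 1 GHz (cpu_khz = 1000000) the multiplier is exactly 1024, so the scaled result matches the reference. At 2.4 GHz the multiplier truncates from 2457.6 to 2457, so 1 ms converts to 2399414 cycles instead of 2400000, roughly 0.02% short. That loss is the accuracy side of the trade-off discussed above.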
_______________________________________________
Xenomai-core mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-core
