TSCs are the only clocksource that Xenomai is able to use at the moment, and it relies on them being synchronized across cores. Unfortunately that is not always the case, and the offsets can be substantial. [1]
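
Whether a particular machine is affected can be checked from user space: when Linux decides the TSC is unusable it switches to a different clocksource, and its choice is visible in sysfs. Below is a minimal sketch, assuming the usual sysfs files under /sys/devices/system/clocksource/clocksource0. The helper names read_clocksource and tsc_in_use are made up for illustration and are not part of Xenomai or the kernel.

```python
from pathlib import Path

# Usual sysfs location of the clocksource files on Linux (assumed here).
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksource names as reported by the kernel.

    Hypothetical helper: it only reads the two sysfs text files.
    """
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def tsc_in_use(current):
    """True if the kernel is still using the TSC as its clocksource."""
    return current == "tsc"
```

Calling read_clocksource() and passing the first element to tsc_in_use() tells you whether the kernel kept the TSC. A result other than "tsc" is only a hint: the kernel may have rejected the TSC for reasons other than a failed sync check, or the clocksource may have been overridden on the command line, so the boot log is still the place to look for the actual reason.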

Linux tests for that at boot and whenever a CPU is hotplugged, and it falls back to HPET if the TSCs turn out not to be in sync. That test result should also tell the user that the system in question is not fit to be used with Xenomai. I already have a patch that returns an error from the init of the Xenomai module. But that reaction is pretty drastic, especially since it potentially disables Xenomai on systems that are already running a Xenomai stack "just fine". So the idea was to relax the TSC sync test in Linux and allow an offset of a few hundred ns via a kernel parameter.

The systems on which I have seen unsynchronized TSCs are sometimes off by something like 2000 cycles, but most of the time their TSCs are in sync at boot time. But I have also seen the occasional tens of thousands of cycles, and I have seen machines where the TSCs seem to always be off, sometimes by hundreds of thousands of cycles. I am not sure whether one could just blame the BIOS or the hardware for that, and blaming a component you cannot control does not help solve the problem. The fact is, we need to get these counters into relatively close sync, or we need another clocksource like the HPET.

TSCs can be synced by writing MSR 0x10. Given the potential for SMIs that is not trivial, but hopefully possible. And here I am hoping/assuming that once synced they will not start drifting. Modern CPUs have monotonic TSCs and advertise it via CPUID. Linux used to have syncing code, but that was dropped in 2007. [2] At the time it did not work well enough, probably also because of power management and changing frequencies. But today a TSC is stable across power states, and syncing is hopefully feasible again.

To conclude, I basically see two ways to approach the problem:

1. rely on Linux to test the TSCs and refuse Xenomai service
   - patch Linux to sync the TSCs again
   - maybe patch Linux to slightly relax the sync test, once we are close enough
2. rely on Linux to test the TSCs and fall back to another clock, which needs to be implemented and won't be as fast

I certainly prefer option 1, because it seems like less effort and it keeps a high-performance clocksource. And, if we get the patches into mainline, it has the potential to improve Linux itself instead of carrying these changes in the ipipe patch. I am not sure how "bad" it would be performance-wise to use HPET in Xenomai.

We should include the Linux TSC experts in this discussion at some point. But right now I am curious to hear opinions, thoughts, and expertise from the Xenomai community!

Henning

[1] http://www.xenomai.org/pipermail/xenomai/2016-August/036615.html
[2] https://github.com/torvalds/linux/commit/95492e4646e5de8b43d9a7908d6177fb737b61f0
