Hey,

I would like to draw attention to this issue again. IMHO it is
something that really needs to be addressed, and doing so will probably
involve a discussion and patches in mainline Linux.

At some point I will start such a discussion "in the name of Xenomai",
but before that I would like to have one here.

Henning

On Fri, 12 Aug 2016 11:30:57 +0200,
Henning Schild <[email protected]> wrote:

> TSCs are the only clocksource that Xenomai is able to use at the
> moment and it relies on them to be synchronized across cores.
> Unfortunately that is not always the case and the offsets can be
> substantial. [1]
> 
> Linux does test for that on bootup and whenever a CPU is hotplugged
> and it will fall back to HPET if TSCs don't turn out to be in sync.
> That test result should also tell the user that the system in
> question is not fit to be used with Xenomai. I already have a patch
> that will actually return an error from the init of the Xenomai
> module.
> 
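
For reference, the outcome of that Linux test can be observed from
userspace. The following is only a minimal sketch: it assumes the usual
sysfs clocksource files are present, and the function names are made up
for illustration. It is not what the Xenomai module init does.

```python
from pathlib import Path

# Standard sysfs location of the clocksource selection on Linux.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksource names.

    Returns (None, []) if the sysfs files cannot be read.
    """
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


def linux_trusts_tsc(current):
    """True if Linux is currently using the TSC as its clocksource.

    Hypothetical helper: if Linux fell back to e.g. "hpet", the TSC
    sync test (or the clocksource watchdog) has rejected the TSC.
    """
    return current == "tsc"
```

A current clocksource other than "tsc" on a machine that lists "tsc"
as available would be a hint that the system is affected. Note that
the value can still change after boot, so a single check is not proof.
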
> But that reaction is pretty drastic, especially since it potentially
> disables Xenomai on systems that are already running on a Xenomai
> stack "just fine". So the idea was to relax the TSC sync test in
> Linux and allow a few hundred ns offset via a kernel parameter.
> Systems on which I have seen unsynchronized TSCs are sometimes off by
> something like 2000 cycles, but most of the time the TSCs are in sync
> at boot time.
> 
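
To put cycle counts next to a tolerance given in ns, the conversion
depends on the TSC frequency. A small worked sketch, where the 2 GHz
frequency and the 500 ns tolerance are assumed example values and not
measurements or an existing kernel parameter:

```python
def cycles_to_ns(cycles, tsc_hz):
    """Convert a TSC cycle count to nanoseconds for a given TSC frequency."""
    return cycles * 1e9 / tsc_hz


def within_tolerance(offset_cycles, tsc_hz, tolerance_ns):
    """True if the absolute offset is no larger than the tolerance."""
    return abs(cycles_to_ns(offset_cycles, tsc_hz)) <= tolerance_ns


TSC_HZ = 2.0e9        # assumed 2 GHz TSC, example value only
TOLERANCE_NS = 500    # hypothetical "few hundred ns" limit

print(cycles_to_ns(2000, TSC_HZ))                    # 1000.0
print(within_tolerance(2000, TSC_HZ, TOLERANCE_NS))  # False
print(within_tolerance(600, TSC_HZ, TOLERANCE_NS))   # True
```

So under these assumptions an offset of 2000 cycles is already 1 us,
which a limit of a few hundred ns would not cover.
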
> But I have also seen the occasional tens of thousands, and I have
> seen machines where the TSCs seem to always be off, sometimes by
> hundreds of thousands of cycles. I am not sure whether one could just
> blame the BIOS or the hardware for that, and blaming a component you
> cannot control does not help solve the problem.
> 
> Fact is, we need to get these counters in relatively close sync or we
> need another clocksource like the HPET.
> 
> TSCs can be synced by writing MSR 0x10. Given the potential for SMIs
> that is not trivial, but hopefully possible. And here I am
> hoping/assuming that once synced they will not start drifting. Modern
> CPUs have monotonic TSCs and report this via CPUID.
> 
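
One possible way to cope with SMIs while measuring the offset is to
take many samples and only trust the one with the shortest round trip.
The sketch below shows just the arithmetic on already collected
samples. It is an assumption about how this could be done, not the code
Linux used to have, and the sample layout (t0, t_remote, t1) is made up
for illustration.

```python
def estimate_offset(samples):
    """Estimate the TSC offset of a remote core relative to the local one.

    Each sample is (t0, t_remote, t1) in cycles:
      t0       local TSC before asking the remote core
      t_remote TSC value read on the remote core
      t1       local TSC after the answer arrived

    An SMI or other delay inflates t1 - t0, so the sample with the
    smallest round trip is the least disturbed one. For that sample the
    remote read is assumed to have happened halfway between t0 and t1.

    Returns (offset_cycles, round_trip_cycles).
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    t0, t_remote, t1 = min(samples, key=lambda s: s[2] - s[0])
    midpoint = (t0 + t1) // 2
    return t_remote - midpoint, t1 - t0


samples = [
    (1000, 3100, 1200),      # round trip 200 cycles
    (2000, 9000, 12000),     # round trip 10000 cycles, e.g. hit by an SMI
    (20000, 22050, 20100),   # round trip 100 cycles, best sample
]
print(estimate_offset(samples))  # (2000, 100)
```

The resulting offset would then be the correction to apply via MSR 0x10
on the remote core. The round trip of the best sample bounds the error:
the estimate cannot be off by more than half of it.
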
> Linux used to have syncing code but that was dropped in 2007. [2]
> At the time it did not work well enough, probably also because of
> power management and changing frequencies. But today a TSC is stable
> across power states, and syncing is hopefully feasible again.
> 
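
Whether a given CPU advertises such a TSC can be checked without
issuing CPUID by hand, because Linux exports the result as flags in
/proc/cpuinfo. A minimal sketch, assuming an x86 system; the function
name is hypothetical, the flag names "constant_tsc" and "nonstop_tsc"
are the ones Linux uses:

```python
def tsc_flags(cpuinfo_text):
    """Report TSC-related CPU flags found in /proc/cpuinfo content.

    Looks at the first "flags" line only, i.e. the first CPU listed.
    Returns an empty dict if no such line exists.
    """
    wanted = ("constant_tsc", "nonstop_tsc")
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return {name: name in flags for name in wanted}
    return {}


def read_tsc_flags(path="/proc/cpuinfo"):
    """Convenience wrapper that reads the file and parses it."""
    with open(path) as f:
        return tsc_flags(f.read())
```

"constant_tsc" means the TSC ticks at a fixed rate regardless of
frequency scaling, and "nonstop_tsc" means it keeps counting in deep
C-states. Neither flag says anything about the counters being in sync
across cores.
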
> To conclude, I basically see two ways to approach the problem:
> 
> 1. rely on Linux to test the TSCs and refuse Xenomai service
>  - patch Linux to sync the TSCs again
>  - maybe patch Linux to slightly relax the sync test, once we are
> close enough
> 
> 2. rely on Linux to test the TSCs and fall back to another clock,
> which needs to be implemented and won't be as fast
> 
> I certainly prefer option 1 because it seems like less effort and it
> keeps a high-performance clocksource. And if we get the patches into
> mainline, it has the potential to improve Linux and keep these
> changes out of the ipipe-patch. I am not sure how "bad" it would be
> performance-wise to use HPET in Xenomai.
> 
> We should include the Linux TSC experts in this discussion at some
> point. But right now I am curious to hear opinions, thoughts, and
> expertise from the Xenomai community!
> 
> Henning
> 
> [1]
> http://www.xenomai.org/pipermail/xenomai/2016-August/036615.html
> [2]
> https://github.com/torvalds/linux/commit/95492e4646e5de8b43d9a7908d6177fb737b61f0
> 


_______________________________________________
Xenomai mailing list
[email protected]
https://xenomai.org/mailman/listinfo/xenomai
