Re: [RFC][PATCH 4.19] x86/ipipe: Protect TLB flushing against context switch by head domain

Jan Kiszka via Xenomai Thu, 12 Mar 2020 09:12:47 -0700

On 12.03.20 16:59, Philippe Gerum wrote:

On 3/12/20 2:48 PM, Jan Kiszka wrote:

From: Jan Kiszka <jan.kis...@siemens.com>


A Xenomai application is very rarely triggering

WARNING: CPU: 0 PID: 1997 at arch/x86/mm/tlb.c:560 [...]
(local_tlb_gen > mm_tlb_gen)

This could be triggered by loaded_mm and loaded_mm_asid becoming out of
sync when flush_tlb_func_common is interrupted by the head domain to
switch a real-time task right between the retrieval of both values, or
maybe even after that but before writing mm_tlb_gen back to
cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen.

Avoid that case by making the retrieval atomic while keeping the TLB
flush interruptible. Now, there could still be an interrupt during the
flush. To avoid writing back to the wrong context, we first atomically
check after the flush if nothing changed and only write if that is the
case. That may mean another TLB flush is triggered needlessly, but
that's rare and acceptable.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---

Due to the rare nature of this issue, we are not yet confident that we
have truly fixed it this way.
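
For readers following along, the snapshot-and-recheck scheme described in
the commit log could be sketched roughly as below. This is a user-space
illustration only, not the patch itself: struct tlb_state_sketch,
hard_irq_save_sketch(), do_flush_sketch() and the other *_sketch names are
hypothetical stand-ins, and the interrupt-disabling helpers are empty stubs
here, where a real I-pipe kernel would have to keep the head domain from
preempting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_ASIDS_SKETCH 6

/* Hypothetical stand-in for the per-CPU TLB state (not the kernel struct). */
struct tlb_state_sketch {
    const void *loaded_mm;
    uint16_t loaded_mm_asid;
    uint64_t ctx_tlb_gen[NR_ASIDS_SKETCH];
};

struct tlb_state_sketch cpu_state;
int mm_a_sketch, mm_b_sketch;   /* two dummy address spaces */

/* Stubs: assumed to block head-domain preemption in a real kernel. */
unsigned long hard_irq_save_sketch(void) { return 0; }
void hard_irq_restore_sketch(unsigned long flags) { (void)flags; }

/* Optional hook that simulates the head domain running during the flush. */
void (*during_flush_hook)(void);

/* Simulated head-domain context switch to another mm/ASID pair. */
void simulate_head_switch_sketch(void)
{
    cpu_state.loaded_mm = &mm_b_sketch;
    cpu_state.loaded_mm_asid = 2;
}

/* The flush itself stays interruptible; here it only runs the hook. */
void do_flush_sketch(void)
{
    if (during_flush_hook)
        during_flush_hook();
}

/*
 * Returns true if mm_tlb_gen was written back, false if no flush was
 * needed or if the loaded context changed while flushing.
 */
bool flush_and_record_sketch(uint64_t mm_tlb_gen)
{
    unsigned long flags;
    const void *mm;
    uint16_t asid;
    uint64_t local_gen;
    bool recorded = false;

    /* Step 1: read mm, ASID and local generation as one atomic snapshot. */
    flags = hard_irq_save_sketch();
    mm = cpu_state.loaded_mm;
    asid = cpu_state.loaded_mm_asid;
    assert(asid < NR_ASIDS_SKETCH);
    local_gen = cpu_state.ctx_tlb_gen[asid];
    hard_irq_restore_sketch(flags);

    if (local_gen >= mm_tlb_gen)
        return false;           /* already up to date */

    /* Step 2: flush with interrupts enabled. */
    do_flush_sketch();

    /* Step 3: write back only if the snapshot still matches. */
    flags = hard_irq_save_sketch();
    if (cpu_state.loaded_mm == mm && cpu_state.loaded_mm_asid == asid) {
        cpu_state.ctx_tlb_gen[asid] = mm_tlb_gen;
        recorded = true;
    }
    hard_irq_restore_sketch(flags);

    return recorded;
}
```

If the context changed during the flush, the generation is simply not
recorded, so a later flush request may redo the work. That is the needless
but rare extra flush the commit log accepts.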

Philippe, I'm seeing some similar attempt in dovetail but it appears to
me it's missing some cases.


Not "some cases", but the last one in your patch specifically if I read
it correctly, which I assumed was not applicable, at least not the way I
read your change, when I worked on this a year ago. This explains why
that particular change is not present in the commit (3aa2fc2fb4c) you
seem to have cherry picked from dovetail for the 5.x kernel series. This
said, these are tricky issues, so as you hinted in your commit log,
there is likely room for improvement in any case, and I may have
overlooked things.

Too bad that development was forking here
and information isn't flowing smoothly yet.


You just demonstrated that the information is there, and that anyone can
access it freely by looking at the EVL development tree. I'm sorry to

It's there but it now requires polling to extract it. I suspect I will
find more interesting changes once reviewing the dovetail queue
completely (I already found the reverse: KVM was broken in dovetail due
to incomplete forward porting; will fix when I come along the code).

hear that forking my own code for the most part in order to find a
better approach for others to benefit from in the long run can be a
problem. I did not find any other way to go back to the drawing board as
required by the technical goals I'm pursuing with EVL, which differ from
Xenomai's.

I've seen this with other spin-offs/rewrites/etc. of the ipipe-like
kernel queue a couple of times: Even if colors and edges look
differently, the core concept remains the same. Thus you also share the
conceptual problems - and often also the solutions. Doing this multiple
times is just wasted time. That's why we really need to get Xenomai
based in dovetail for upcoming kernels so that test results and fixes
flow in both directions automatically again.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
