RE: Still getting Deadlocks with condition variables

Lange Norbert via Xenomai Mon, 15 Jun 2020 04:53:14 -0700


> -----Original Message-----
> From: Philippe Gerum <r...@xenomai.org>
> Sent: Montag, 15. Juni 2020 12:03
> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai
> (xenomai@xenomai.org) <xenomai@xenomai.org>;
> 'jan.kis...@siemens.com' <jan.kis...@siemens.com>
> Subject: Re: Still getting Deadlocks with condition variables
>
> On 6/15/20 11:06 AM, Lange Norbert wrote:
> >>
> >> This code does not take away any protection, on the contrary this
> >> ensures that PROT_EXEC is set for all stacks along with read and
> >> write access, which is glibc's default for the x86_64 architecture.
> >
> > I meant that it might have to do some non-atomic procedure, for
> > example when splitting up a continuous bigger mapping with the stack
> > in the middle, as the protection flags are now different.
> >
>
> We are talking about mprotect(), not mmap().


My bad, yes.

>
> >> The fault is likely due to mm code fixing up those protections for
> >> the relevant page(s). It looks like such pages are force faulted-in,
> >> which would explain the #PF, and the SIGXCPU notification as a
> >> consequence. These are minor faults in the MMU management sense, so
> >> this is transparent for common applications.
> >
> > I don’t know enough about the x86 (and don’t want to know), but this
> > needs some explanation. First, the DSOs don’t need an executable stack
> > (the build system did not care to add the .note.GNU-stack everywhere), so
> > this specific issue can be worked around.
> >
> > -   I don’t understand why this is very timing sensitive. If a page is
> >     marked to #PF (or removed), then this should fault predictably on the
> >     next access (I don’t share data on the stack such that Linux threads
> >     could run into a #PF instead).
>
> Faulting is what it does. It is predictable and synchronous, you seem to be
> assuming that the fault is somehow async or proxied, it is not.

It does happen rather sparsely and is affected by small changes in unrelated
code. Loading a DSO (which requires an executable stack) might trigger a #PF
or not. Nothing but the respective RT thread is accessing its *private* stack;
the page fault happens at a callq.

If that stack access were to cause a #PF, then the code would run into it
*every* time. Yet it does not; it is really hard to get this to reproduce.
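
To see which already-loaded objects ask for an executable stack, something
like the following could be used. This is only a sketch, assuming glibc on
Linux; `struct stack_report`, `check_object` and `scan_loaded_objects` are
made-up names for illustration. It inspects objects that are already mapped,
so it does not predict what a later dlopen() will do.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative counters, not part of any library API. */
struct stack_report {
    int exec_stack; /* objects whose PT_GNU_STACK header has PF_X set */
    int no_note;    /* objects without any PT_GNU_STACK header */
};

/* Called by dl_iterate_phdr() once per loaded ELF object. */
static int check_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct stack_report *rep = data;
    int found = 0;
    (void)size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];

        if (ph->p_type != PT_GNU_STACK)
            continue;
        found = 1;
        if (ph->p_flags & PF_X) {
            rep->exec_stack++;
            /* The main program is reported with an empty name. */
            fprintf(stderr, "executable stack requested by: %s\n",
                    info->dlpi_name[0] ? info->dlpi_name : "(main program)");
        }
    }
    if (!found)
        rep->no_note++;
    return 0; /* 0 means: keep iterating */
}

/* Fills *rep and returns the value of the last callback (0 here). */
int scan_loaded_objects(struct stack_report *rep)
{
    rep->exec_stack = 0;
    rep->no_note = 0;
    return dl_iterate_phdr(check_object, rep);
}
```

Calling scan_loaded_objects() once at startup and logging both counters would
at least show whether the process already contains such an object.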

> > -   If that’s a non-atomic operation (perhaps only if the sparse tables
> >     need modification in a higher level), then I would expect some sort
> >     of lazy locking (RCU?). Is this ending up in chaos as cores running
> >     Xenomai are "idle" for Linux, and pick up outdated data?
>
> I have no idea why you would bring up RCU in this picture, there is no
> convoluted aspect in what happens. There is no chaos, only a plain simple
> #PF event which unfortunately occurs as a result of running an apparently
> innocuous regular operation which is loading a DSO. The reason for the #PF
> can be explained, how it is dealt with is fine, the rt loop in your app just
> does not like observing it for a legitimate reason.

I am asking a question. I assume page tables need to be reallocated under
some circumstances. I understand a #PF *has to happen* according to your
explanation, and I don’t know *where it happens if I don’t observe the RT
task switching*.

The faults are very timing sensitive.

So I could imagine the multilevel page-table map looking like this (I don’t
know how many levels x86 is using nowadays, but that’s beside the point):
[first level] -> [second level] -> [stack mapping]

If the mprotect syscall changes just the *private* stack mapping, then the RT
thread will always fault (not what I observe).
If the syscall modifies lower levels, then any thread can hit the #PF, and if
it's not an RT thread then no #PF will be observed (by WARNSW).

This is my conjecture. If it is true, then the question becomes:
-   under what circumstances can this appear?
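
One way to probe this outside of Xenomai would be to count the minor faults a
thread takes when it touches an already-populated page again after mprotect()
has added PROT_EXEC. The following is a rough sketch for plain Linux, not a
reproduction of the actual problem: it uses an anonymous mapping instead of a
real thread stack, and `minor_faults` and `faults_after_mprotect` are
hypothetical helper names. The result depends on the kernel, so no particular
value is claimed.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor-fault count so far, or -1 on error. */
static long minor_faults(void)
{
    struct rusage ru;
#ifdef RUSAGE_THREAD
    int who = RUSAGE_THREAD; /* per-thread counters (Linux-specific) */
#else
    int who = RUSAGE_SELF;   /* fallback: whole process */
#endif
    if (getrusage(who, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Returns the number of minor faults taken when re-touching a populated
 * page after mprotect() added PROT_EXEC, or -1 if any step failed.
 */
long faults_after_mprotect(void)
{
    long page = sysconf(_SC_PAGESIZE);
    long before, after;
    char *p;

    if (page <= 0)
        return -1;
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *(volatile char *)p = 1; /* first touch: populate the page */

    /* Same flags the thread describes for stacks: read, write and exec. */
    if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    before = minor_faults();
    *(volatile char *)p = 2; /* second touch, after the protection change */
    after = minor_faults();

    munmap(p, (size_t)page);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Running this in a loop, from several threads, might show whether the fault is
taken every time or only occasionally on a given kernel.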

>
> > -   Are/can such minor faults be handled in Xenomai? In other words, is
> >     the WARNSW correct, or is this actually just the check causing the
> >     issues? Would it make sense to handle such minor faults in Xenomai
> >     (only demoting to Linux if necessary)?
> >
> >> Not for those of us who do not want the application code to run into
> >> any page fault unfortunately.
> >>
> >> Loading DSOs while the real-time system is running just proved to be
> >> a bad idea it seems (did not check how other *libc implementations
> >> behave on dlopen() though).
> >
> > glibc dlopens files on its own BTW, for nss plugins and encodings.
> > Practically that means you would need to check everything running (in
> > non-rt threads) for dlopen and various calls that could resolve names to
> > uid/gid, do dns lookups, use iconv etc.
> >
>
> The glibc is fortunately not dlopening DSOs at every corner. You mention
> very specific features that would have to take place during the app init
> chores instead, or at the very least in a way which is synchronized with a
> quiescent state of the rt portion of the process.

Still important to know, so you could, for example, "prime" the library by
doing those calls up-front.
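
Such priming could look roughly like the sketch below, called during init
before any RT thread is started. It is an assumption on my side that these
particular calls cover the plugins an application later needs;
`prime_glibc_plugins` is a made-up name, and the lookup targets ("localhost",
the two encodings) are placeholders for whatever the application really uses.

```c
#include <grp.h>
#include <iconv.h>
#include <netdb.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Performs a few lookups whose side effect is that glibc loads the
 * corresponding NSS and gconv modules now rather than later.
 * Returns how many of the four lookups succeeded; a failed lookup is
 * not an error for the purpose of priming.
 */
int prime_glibc_plugins(void)
{
    int ok = 0;
    struct addrinfo *res = NULL;
    iconv_t cd;

    /* passwd and group databases (uid/gid to name resolution) */
    if (getpwuid(getuid()) != NULL)
        ok++;
    if (getgrgid(getgid()) != NULL)
        ok++;

    /* hosts database; "localhost" is only a placeholder */
    if (getaddrinfo("localhost", NULL, NULL, &res) == 0) {
        freeaddrinfo(res);
        ok++;
    }

    /* character set conversion; the encodings are placeholders */
    cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd != (iconv_t)-1) {
        iconv_close(cd);
        ok++;
    }
    return ok;
}
```

This only helps for the databases and encodings actually exercised here;
anything else would still be loaded on first use.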

> > This is an issue of changing protection on existing mappings, or the mmap
> > call in broader terms.
>
> This is an issue with some of mprotect() side-effects.
>
> > Knowing why and under what circumstances this causes trouble would be
> > rather important (would have been important some years ago when we
> > *started* porting to Xenomai and picked solutions).
> >
>
> I believe the whole thread has already framed the how and why fairly
> precisely. With respect to knowing those things in advance, I can only
> recommend that people who based their project on Xenomai over the past
> 15 years share their knowledge and experience by contributing
> documentation and participating in the mailing list.

You can't expect "users" of Xenomai to have your in-depth knowledge.
I can't describe the inner workings, and I had hoped I could keep my distance
from the 24M LOC Linux kernel.

>
> > I could for ex. expect the kernel option CONFIG_COMPACTION causing
> > similar issues (and pretty impossible to triage).
> >
>
> CONFIG_COMPACTION is a known source of latency spots, just like
> transparent huge pages are not going to be helpful, regardless of the
> underlying rt infrastructure.

COMPACTION could help for long-running applications: you don’t want
latency, but you want heavy fragmentation even less.
(I'd guess that kind of latency is not really comparable with an RT thread
getting demoted.)

>
> To sum up what we have all been saying, the problem is not about causing a
> PTE miss due to mprotect() altering permissions on pages, but how we can
> handle the minor fault from primary mode. On arm, arm64 and ppc, such
> fault can be handled directly from primary mode. On x86, the inner MMU
> code which is in charge of handling faults does not allow that, there is the
> need for switching to secondary mode.
>
> Regarding handling PTE misses directly from primary mode, this would be a
> mess with x86: sharing the MMU management logic between Cobalt and the
> regular linux mm sub-system which otherwise run totally asynchronously is
> something I for one won't even try.

Ok, sounds perfectly messy (the x86 architecture that is).

> Without Xenomai, you would take a fault the same way, the difference is
> that with Xenomai, the WARNSW notifier tells you so, issuing a signal, which
> may be triggering a separate issue with condvars.

So WARNSW works correctly, thanks for clearing that up.

Norbert
