On 05/13/2016 07:54 AM, Jan Kiszka wrote:
> On 2016-05-13 00:26, Philippe Gerum wrote:
>> On 05/12/2016 09:27 PM, Jan Kiszka wrote:
>>> On 2016-05-12 21:08, Philippe Gerum wrote:
>>>> On 05/12/2016 08:42 PM, Jan Kiszka wrote:
>>>>> On 2016-05-12 20:35, Philippe Gerum wrote:
>>>>>> On 05/12/2016 08:24 PM, Jan Kiszka wrote:
>>>>>>> On 2016-05-12 20:20, Gilles Chanteperdrix wrote:
>>>>>>>> On Thu, May 12, 2016 at 07:17:15PM +0200, Jan Kiszka wrote:
>>>>>>>>> On 2016-05-12 19:12, Gilles Chanteperdrix wrote:
>>>>>>>>>> On Thu, May 12, 2016 at 06:59:04PM +0200, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On Thu, May 12, 2016 at 06:50:03PM +0200, Jan Kiszka wrote:
>>>>>>>>>>>> On 2016-05-12 18:31, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On Thu, May 12, 2016 at 06:06:16PM +0200, Jan Kiszka wrote:
>>>>>>>>>>>>>> Gilles,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> regarding commit bec5d0dd42 (rtdm: make syscalls conforming
>>>>>>>>>>>>>> rather than current) - I remember a discussion on that topic,
>>>>>>>>>>>>>> but I do not find its traces any more. Do you have a pointer?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In any case, I'm confronted with a use case for the old
>>>>>>>>>>>>>> (Xenomai 2), lazy switching behaviour: lightweight,
>>>>>>>>>>>>>> performance sensitive IOCTL services that can (and should) be
>>>>>>>>>>>>>> called without any switching from both domains.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why not use a plain Linux driver? ioctl_nrt callbacks are
>>>>>>>>>>>>> redundant with plain Linux drivers.
>>>>>>>>>>>>
>>>>>>>>>>>> Because that forces the calling layer to either call the same
>>>>>>>>>>>> service via a plain Linux device if the calling thread is
>>>>>>>>>>>> currently relaxed or go for the RT device if the caller is in
>>>>>>>>>>>> primary. Doable, but I would really like to avoid this pain for
>>>>>>>>>>>> the users.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What were the arguments in favour of migrating threads to
>>>>>>>>>>>>>> real-time first?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I currently see the real need only for IOCTLs, but the question
>>>>>>>>>>>>>> is then if we shouldn't go back to "__xn_exec_current" in all
>>>>>>>>>>>>>> RTDM cases to avoid unwanted migration costs (which are
>>>>>>>>>>>>>> significantly higher than syscall restarts).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I do not find commit bec5d0dd42 in xenomai-2.6 git tree, and I do
>>>>>>>>>>>>
>>>>>>>>>>>> Xenomai 2 is still following the lazy scheme - we reverted that
>>>>>>>>>>>> commit later on in 7df0c1d96b. Xenomai 3 changed it again with
>>>>>>>>>>>> the commit above.
>>>>>>>>>>>>
>>>>>>>>>>>>> not remember merging this. However I find commit
>>>>>>>>>>>>> 13bfdd477ab880499d2e8f3b82c49ef4d2cccff0 from 2010 which seems to
>>>>>>>>>>>>> explain the reason pretty clearly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> At the time of the discussion we had concluded that it was the way
>>>>>>>>>>>>> to go. With __xn_exec_current you may enter the ioctl_rt callback
>>>>>>>>>>>>> from secondary domain, which is counter-intuitive, error-prone,
>>>>>>>>>>>>> and forces you to cripple driver code with checks for the current
>>>>>>>>>>>>> domain.
>>>>>>>>>>>>
>>>>>>>>>>>> Nope, normal drivers are not affected as they just implement those
>>>>>>>>>>>> services in the respective mode they want to support there and
>>>>>>>>>>>> have a simple -ENOSYS for the rest (explicitly in IOCTLs or
>>>>>>>>>>>> implicitly by leaving out the implementation of the counterpart
>>>>>>>>>>>> handler).
>>>>>>>>>>>
>>>>>>>>>>> Yes, I got mixed up trying to remember.
>>>>>>>>>>> I think the crux of the problem is that if a thread running in
>>>>>>>>>>> primary mode gets (temporarily) switched to secondary mode by
>>>>>>>>>>> gdb, the ioctl_nrt handler gets invoked, which is almost
>>>>>>>>>>> certainly the wrong thing to do. You want the thread to migrate
>>>>>>>>>>> to primary mode to execute ioctl_rt, which __xn_exec_conforming
>>>>>>>>>>> achieves. Otherwise running an application in gdb causes the
>>>>>>>>>>> application to behave differently.
>>>>>>>>>>
>>>>>>>>>> And trying to avoid this issue indeed cripples code with checks
>>>>>>>>>> for rtdm_in_rt_context:
>>>>>>>>>> https://git.xenomai.org/xenomai-2.6.git/tree/ksrc/drivers/analogy/rtdm_interface.c#n194
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I don't remember details here, but this is a special case: The driver
>>>>>>>>> also provides read_nrt - is that really useful for Analogy?
>>>>>>>>>
>>>>>>>>> In most cases, you are fine with not providing the nrt (or rt)
>>>>>>>>> handler, or with a simple
>>>>>>>>>
>>>>>>>>> default:
>>>>>>>>>         return -ENOSYS;
>>>>>>>>>
>>>>>>>>> in your ioctl dispatcher.
>>>>>>>>
>>>>>>>> You are missing the point: if you enter read_nrt, there are two
>>>>>>>> cases:
>>>>>>>> - either the thread is real-time capable and has been relaxed by gdb
>>>>>>>>   and you want to switch to read_rt for the reasons I already
>>>>>>>>   explained, in that case, you must return -ENOSYS;
>>>>>>>> - or the thread is not real-time capable and the nrt handler
>>>>>>>>   applies.
>>>>>>>>
>>>>>>>> So, you need at least
>>>>>>>>
>>>>>>>> read_nrt()
>>>>>>>> {
>>>>>>>>         if (rt_capable)
>>>>>>>>                 return -ENOSYS;
>>>>>>>>
>>>>>>>>         /* Do the normal case here */
>>>>>>>> }
>>>>>>>
>>>>>>> Now tell me how many drivers have read_nrt, write_nrt? 1 in-tree.
>>>>>>> recvmsg_nrt, sendmsg_nrt? 0 in-tree. Analogy is special (still like to
>>>>>>> understand why, though).
>>>>>>> And having some special code in the exceptional
>>>>>>> case is probably better than the side effects we get from eagerly
>>>>>>> switching now.
>>>>>>>
>>>>>>
>>>>>> Sorry, that is exactly the opposite: your use case is exceptional and I
>>>>>> believe is wrong. The normal use case is the one that does not ask the
>>>>>> user to track the current mode for knowing what any random driver would
>>>>>> eventually do depending on the calling context.
>>>>>
>>>>> You still miss the point that this is not required in 99% of the cases.
>>>>> There is no such problem. There is only Analogy.
>>>>>
>>>>
>>>> I'm not discussing Analogy at all, those drivers are still biased by the
>>>> legacy 2.x logic for dealing with modes and need fixing. I have never
>>>> been convinced by the reasoning behind rtdm_in_rt_context(), which
>>>> perfectly illustrates why messing with the call mode is not the
>>>> application's business.
>>>
>>> You still need rtdm_in_rt_context() for the (rare) case of having the
>>> same handler for both service_rt and service_nrt. That didn't change
>>> with any switching strategy adjustment. It can't as long as there are
>>> services behind a syscall that may handle any mode, thus that syscall is
>>> unable to filter for the service in the background. We really need to
>>> differentiate here.
>>>
>>>>
>>>>> Every driver must ensure that a service is only exposed to users in the
>>>>> right mode. That is a functional requirement, and drivers that fail to
>>>>> do so only work by chance (thus with the restricted workload they are
>>>>> tested against). If that is fulfilled, it doesn't matter to the driver
>>>>> when the switch happens. It's pure optimization.
>>>>>
>>>>
>>>> You don't seem to get my point either. Let's proceed differently, please
>>>> sketch the application code that would require __xn_exec_current for
>>>> RTDM calls.
>>>
>>> You cut the more interesting case (migration ping-pong when calling
>>> non-RT drivers from relaxed threads), and I hope you will not forget to
>>> answer this.
>>>
>>
>> I'm not ignoring the question, I have been postponing the answer until I
>> understand why the application could be put in a situation making this
>> migration a problem, and whether another approach would exist for
>> solving that problem within the current scheme.
>
> These two scenarios are unrelated: this migration issue would still be
> there even if we solved the one below via a different application/driver
> design.
>

Which starts to be an issue only because the caller is a Cobalt shadow
undergoing the SCHED_WEAK policy, calling an RTDM driver for a non-rt
operation very frequently. For this reason, those two scenarios are very
much related.

>>
>>> But let's go to our case:
>>>
>>> We have a non-blocking service in the driver, the classic case of
>>> accessing a privileged resource that userspace can't or shouldn't touch
>>> directly. Think of some kind of register access that requires low-level
>>> synchronization with other threads and interrupt handlers. That service
>>> is called by both RT and non-RT threads (SCHED_WEAK) at higher frequency
>>> (some thousand times per second). The RT threads are obviously on the
>>> time critical path, must not migrate, and that can be achieved perfectly
>>> already by providing that service under ioctl_rt. The non-RT threads
>>> could be migrated to RT, but then they would pay an unneeded price,
>>> contributing to a higher system load, in the worst case overload.
>>> Therefore, the very same service shall be provided under ioctl_nrt as
>>> well. Makes sense?
>>>
>>
>> I understand the conflict with the "rt-always-has-precedence" rule
>> implemented by the conforming state, then I have another question:
>>
>> assuming the nrt thread undergoes the SCHED_WEAK policy because it is
>> mainly operating from the Linux space but still needs to synchronize
>> with the rt side at some point, which kind of high frequency interaction
>> with the rt side is this?
>>
>> Sharing some resource requiring mutual exclusion via a Cobalt synchro,
>> waiting for rt events, something else?
>>
>
> The synchronization need is first of all only on the hardware access
> (thus inside the driver), not necessarily at application level.
> In fact,
> there are even scenarios where you only want to exploit the driver as
> permission checker on privileged resource accesses (userspace shall only
> access certain MMIO registers in a page, thus the driver acts as
> gatekeeper). Then there could be no synchronization at all but still the
> need to provide migration-free accesses.
>

I get the idea of the resource gatekeeper, which does make a lot of
sense. However I still don't get which benefit your caller has in
undergoing the SCHED_WEAK policy - which implies that it has to share
synchronization points with Cobalt - compared to running as a regular
(glibc) thread, under whichever policy that could fit?

Leaving the non-RT ioctl call aside, which are those Cobalt calls the
SCHED_WEAK thread needs to invoke for synchronizing with rt threads?

-- 
Philippe.
_______________________________________________
Xenomai mailing list
Xenomai@xenomai.org
https://xenomai.org/mailman/listinfo/xenomai
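
For readers following the thread, the two dispatch policies under
discussion can be modelled in plain user-space C. This is only a rough
sketch, not Cobalt or RTDM code: every type and function name below
(struct caller, struct driver, model_ioctl(), weak_thread_migrations(),
and so on) is hypothetical. It assumes that "conforming" moves any
RT-capable caller to primary mode before the handler runs, that
"current" runs the handler matching the caller's present mode, that
-ENOSYS triggers one restart in the opposite mode, and that a
SCHED_WEAK-like thread relaxes again after each call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in Xenomai. */
enum mode { MODE_SECONDARY, MODE_PRIMARY };
enum policy { POLICY_CURRENT, POLICY_CONFORMING };
enum { HANDLED_RT = 1, HANDLED_NRT = 2 };

struct caller {
	bool rt_capable;     /* true for a shadow thread, weak ones included */
	enum mode mode;      /* mode the thread currently runs in */
	unsigned migrations; /* mode switches paid so far */
};

struct driver {
	int (*ioctl_rt)(void);  /* NULL means "handler not provided" */
	int (*ioctl_nrt)(void);
};

/* The same gatekeeper service, exposed under both handlers. */
static int gate_rt(void)  { return HANDLED_RT; }
static int gate_nrt(void) { return HANDLED_NRT; }

/* Switch mode and count the switch, but only if the mode changes. */
static void migrate(struct caller *c, enum mode target)
{
	if (c->mode != target) {
		c->mode = target;
		c->migrations++;
	}
}

/* Run the handler for the given mode; a missing handler yields -ENOSYS. */
static int invoke(const struct driver *d, enum mode m)
{
	int (*h)(void) = (m == MODE_PRIMARY) ? d->ioctl_rt : d->ioctl_nrt;

	return h ? h() : -ENOSYS;
}

static int model_ioctl(struct caller *c, const struct driver *d, enum policy p)
{
	int ret;

	assert(c != NULL && d != NULL);

	/* Conforming: an RT-capable caller is moved to primary up front. */
	if (p == POLICY_CONFORMING && c->rt_capable)
		migrate(c, MODE_PRIMARY);

	ret = invoke(d, c->mode);
	if (ret == -ENOSYS) {
		/* Restart once in the opposite mode, if the caller can get there. */
		enum mode other = (c->mode == MODE_PRIMARY) ?
			MODE_SECONDARY : MODE_PRIMARY;

		if (other == MODE_PRIMARY && !c->rt_capable)
			return -ENOSYS;
		migrate(c, other);
		ret = invoke(d, c->mode);
	}
	return ret;
}

/* Count mode switches for n calls from a thread that relaxes after each call. */
static unsigned weak_thread_migrations(const struct driver *d, enum policy p,
				       unsigned n)
{
	struct caller c = { .rt_capable = true, .mode = MODE_SECONDARY,
			    .migrations = 0 };
	unsigned i;

	for (i = 0; i < n; i++) {
		(void)model_ioctl(&c, d, p);
		migrate(&c, MODE_SECONDARY); /* assumed relax on return */
	}
	return c.migrations;
}
```

Under these assumptions, 1000 calls from the weak thread to a driver
offering both handlers cost no mode switch with POLICY_CURRENT and 2000
with POLICY_CONFORMING; a driver offering only ioctl_nrt also costs 2000
under POLICY_CONFORMING, which is the ping-pong case. Conversely, an
RT-capable caller that was relaxed (for instance by gdb) lands in the
nrt handler under POLICY_CURRENT and in the rt handler under
POLICY_CONFORMING, which is the behaviour change Gilles describes.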