Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> as already indicated, I'm starting to understand the ipipe bug Roman
>>> sees. It seems to melt down to the following path:
>>>
>>>  - exception raised over non-root domain (__rt_event_wait...)
>>>  - root domain is stalled on entry of __ipipe_handle_exception
>>>  - fault causing task is first relaxed, then scheduled away under Linux
>>>  - scheduled-in Linux task was interrupted in __ipipe_divert_exception,
>>>    shortly before __fixup_if
>>>  - __fixup_if finds root domain stalled and propagates this to the
>>>    register set of the interrupted context (user space task running on
>>>    its first fpu instruction, having triggered device_not_available).
>>>  - return to user space task with irqs disable - bang!
>>>
>> Good catch.
>>
>>> Two ways to approach this:
>>>  1. Do we actually have to stall the root domain in
>>>     __ipipe_handle_exception before ipipe_trap_notify? I don't see why we
>>>     should be better off with doing this afterwards.
>> We do, because the root domain may install an I-pipe event handler on
>> exceptions as well, and the callee may assume that the virtual interrupt
>> state is correct.
>
> But from that POV, you would have to stall all domains before calling
> the hook, not just root

Why? A non-root domain may not affect the root stall bit; that is simply
forbidden. So there is no point in making it consistent for non-root
domains, since they may not act upon it anyway.

> .
>
>>>  2. Avoid that __ipipe_divert_exception is interruptible and can pick up
>>>     the stall flag from a different Linux task. But I don't know if there
>>>     aren't more race windows like that.
>>>
>> Since the core of the issue is about a preemption point that may be
>> introduced by a thread migration to secondary, the same goes with
>> __ipipe_syscall_root; this is what I stumbled upon on a different trace
>> set.
>>
>> The way to fix this properly is to decouple fixup_if() from the current
>> global interrupt state at call time, and rather make such state
>> context-dependent, so that iret emulation always uses the proper state
>> value. A typical approach would be to record the stall bit value on the
>> caller's stack, and feed fixup_if() with it.
>>
>
> Didn't get yet how this should work, but I guess you've implemented it
> in -06. Will check.
>
> Jan
>

--
Philippe.
