On 20/05/2020 09:06, Jürgen Groß wrote:
> On 19.05.20 21:44, Andy Lutomirski wrote:
>> On Tue, May 19, 2020 at 11:58 AM Thomas Gleixner <t...@linutronix.de>
>> wrote:
>>>
>>> Andy Lutomirski <l...@kernel.org> writes:
>>>> B: Turn this thing around. Specifically, in the one and only case we
>>>> care about, we know pretty much exactly what context we got this entry
>>>> in: we're running in a schedulable context doing an explicitly
>>>> preemptible hypercall, and we have RIP pointing at a SYSCALL
>>>> instruction (presumably, but we shouldn't bet on it) in the hypercall
>>>> page. Ideally we would change the Xen PV ABI so the hypercall would
>>>> return something like EAGAIN instead of auto-restarting and we could
>>>> ditch this mess entirely. But the ABI seems to be set in stone or at
>>>> least in molasses, so how about just:
>>>>
>>>> idt_entry(exit(regs));
>>>> if (inhcall && need_resched())
>>>>     schedule();
>>>
>>> Which brings you into the situation that you call schedule() from the
>>> point where we just moved it out. If we would go there we'd need to
>>> ensure that RCU is watching as well. idtentry_exit() might have it
>>> turned off ....
>>
>> I don't think this is possible. Once you untangle all the wrappers,
>> the call sites are effectively:
>>
>> __this_cpu_write(xen_in_preemptible_hcall, true);
>> CALL_NOSPEC to the hypercall page
>> __this_cpu_write(xen_in_preemptible_hcall, false);
>>
>> I think IF=1 when this happens, but I won't swear to it. RCU had
>> better be watching.
>
> Preemptible hypercalls are never done with interrupts off. To be more
> precise: they are only ever done during ioctl() processing.
>
> I can add an ASSERT() to xen_preemptible_hcall_begin() if you want.
>
>>
>> As I understand it, the one and only situation Xen wants to handle is
>> that an interrupt gets delivered during the hypercall. The hypervisor
>> is too clever for its own good and deals with this by rewinding RIP to
>> the beginning of whatever instruction did the hypercall and delivers
>> the interrupt, and we end up in this handler. So, if this happens,
>> the idea is to not only handle the interrupt but to schedule if
>> scheduling would be useful.
>
> Correct. More precise: the hypercalls in question can last very long
> (up to several seconds) and so they need to be interruptible. As said
> before: the interface how this is done is horrible. :-(

Forget seconds. DOMCTL_domain_kill gets to ~14 minutes for a 2TB domain.

The reason for the existing logic is to be able to voluntarily preempt.
It doesn't need to remain the way it is, but some adequate form of
pre-emption does need to stay.

~Andrew
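
For readers following the thread, the control flow under discussion can be
modelled in plain userspace C. This is only a rough sketch, not kernel code:
every name in it (in_preemptible_hcall, resched_pending, irqs_enabled,
model_schedule, model_irq_exit, model_preemptible_hcall) is a hypothetical
stand-in invented for illustration. It assumes the flag is set around the
hypercall as in the quoted call sites, that each interrupt restarts the
hypercall, and that the interrupt exit path reschedules only when the flag is
set and a reschedule is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real kernel symbols. */
static bool in_preemptible_hcall;  /* models the per-CPU xen_in_preemptible_hcall flag */
static bool resched_pending;       /* models what need_resched() would report */
static bool irqs_enabled = true;   /* models IF=1 */
static int schedule_calls;         /* counts how often the model "scheduled" */

/* Models schedule(): record the call and clear the pending request. */
static void model_schedule(void)
{
    schedule_calls++;
    resched_pending = false;
}

/*
 * Models the tail of the interrupt handler: reschedule only if the
 * interrupt landed inside a preemptible hypercall and a reschedule
 * is actually wanted.
 */
static void model_irq_exit(void)
{
    if (in_preemptible_hcall && resched_pending)
        model_schedule();
}

/*
 * Models one long-running hypercall that is interrupted `interrupts`
 * times and restarted after each interrupt. Returns how many times
 * the model scheduled while the hypercall was in progress.
 */
static int model_preemptible_hcall(int interrupts, bool want_resched)
{
    int before = schedule_calls;

    /* The kind of check proposed for the begin helper: interrupts must be on. */
    assert(irqs_enabled);

    in_preemptible_hcall = true;
    for (int i = 0; i < interrupts; i++) {
        resched_pending = want_resched;  /* an interrupt arrives */
        model_irq_exit();                /* handler runs, may schedule */
    }
    in_preemptible_hcall = false;

    return schedule_calls - before;
}
```

In this model, an interrupt that arrives while the flag is clear never
triggers a reschedule from the interrupt exit path, which is the property the
flag exists to provide.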