On Thu, Oct 18, 2018 at 10:25 AM Nadav Amit <na...@vmware.com> wrote: > > at 10:00 AM, Andy Lutomirski <l...@amacapital.net> wrote: > > > > > > >> On Oct 18, 2018, at 9:47 AM, Nadav Amit <na...@vmware.com> wrote: > >> > >> at 8:51 PM, Andy Lutomirski <l...@amacapital.net> wrote: > >> > >>>> On Wed, Oct 17, 2018 at 8:12 PM Nadav Amit <na...@vmware.com> wrote: > >>>> at 6:22 PM, Andy Lutomirski <l...@amacapital.net> wrote: > >>>> > >>>>>> On Oct 17, 2018, at 5:54 PM, Nadav Amit <na...@vmware.com> wrote: > >>>>>> > >>>>>> It is sometimes beneficial to prevent preemption for very few > >>>>>> instructions, or prevent preemption for some instructions that precede > >>>>>> a branch (this latter case will be introduced in the next patches). > >>>>>> > >>>>>> To provide such functionality on x86-64, we use an empty REX-prefix > >>>>>> (opcode 0x40) as an indication that preemption is disabled for the > >>>>>> following instruction. > >>>>> > >>>>> Nifty! > >>>>> > >>>>> That being said, I think you have a few bugs. First, you can’t just > >>>>> ignore > >>>>> a rescheduling interrupt, as you introduce unbounded latency when this > >>>>> happens — you’re effectively emulating preempt_enable_no_resched(), > >>>>> which > >>>>> is not a drop-in replacement for preempt_enable(). To fix this, you may > >>>>> need to jump to a slow-path trampoline that calls schedule() at the end > >>>>> or > >>>>> consider rewinding one instruction instead. Or use TF, which is only a > >>>>> little bit terrifying… > >>>> > >>>> Yes, I didn’t pay enough attention here. For my use-case, I think that > >>>> the > >>>> easiest solution would be to make synchronize_sched() ignore preemptions > >>>> that happen while the prefix is detected. It would slightly change the > >>>> meaning of the prefix. > >> > >> So thinking about it further, rewinding the instruction seems the easiest > >> and most robust solution. I’ll do it. > >> > >>>>> You also aren’t accounting for the case where you get an exception that > >>>>> is, in turn, preempted. > >>>> > >>>> Hmm.. Can you give me an example for such an exception in my use-case? I > >>>> cannot think of an exception that might be preempted (assuming #BP, #MC > >>>> cannot be preempted). > >>> > >>> Look for cond_local_irq_enable(). > >> > >> I looked at it. Yet, I still don’t see how exceptions might happen in my > >> use-case, but having said that - this can be fixed too. > > > > I’m not totally certain there’s a case that matters. But it’s worth > > checking > > > >> To be frank, I paid relatively little attention to this subject. Any > >> feedback about the other parts and especially on the high-level approach? > >> Is > >> modifying the retpolines in the proposed manner (assembly macros) > >> acceptable? > > > > It’s certainly a neat idea, and it could be a real speedup. > > Great. So I’ll try to shape things up, and I still wait for other comments > (from others). > > I’ll just mention two more patches I need to cleanup (I know I still owe you > some > work, so obviously it will be done later): > > 1. Seccomp trampolines. On my Ubuntu, when I run Redis, systemd installs 17 > BPF filters on the Redis server process that are invoked on each > system-call. Invoking each one requires an indirect branch. The patch keeps > a per-process kernel code-page that holds trampolines for these functions.
I wonder how many levels of branches are needed before the branches involved exceed the retpoline cost. > > 2. Binary-search for system-calls. Use the per-process kernel code-page also > to hold multiple trampolines for the 16 common system calls of a certain > process. The patch uses an indirection table and a binary-search to find the > proper trampoline. Same comment applies here. > > Thanks again, > Nadav -- Andy Lutomirski AMA Capital Management, LLC