From: Andy Lutomirski
Sent: November 5, 2018 at 5:22:32 PM GMT
> To: Peter Zijlstra <pet...@infradead.org>
> Cc: Nadav Amit <na...@vmware.com>, Ingo Molnar <mi...@redhat.com>,
> linux-kernel@vger.kernel.org, x...@kernel.org, H. Peter Anvin
> <h...@zytor.com>, Thomas Gleixner <t...@linutronix.de>, Borislav Petkov
> <b...@alien8.de>, Dave Hansen <dave.han...@linux.intel.com>, Andy Lutomirski
> <l...@kernel.org>, Kees Cook <keesc...@chromium.org>, Dave Hansen
> <dave.han...@intel.com>, Masami Hiramatsu <mhira...@kernel.org>
> Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during
> early_init
>
>> On Nov 5, 2018, at 6:09 AM, Peter Zijlstra <pet...@infradead.org> wrote:
>>
>>> On Fri, Nov 02, 2018 at 04:29:41PM -0700, Nadav Amit wrote:
>>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
>>> index aac0c1f7e354..367c1d0c20a3 100644
>>> --- a/arch/x86/kernel/jump_label.c
>>> +++ b/arch/x86/kernel/jump_label.c
>>> @@ -52,7 +52,13 @@ static void __ref __jump_label_transform(struct jump_entry *entry,
>>>  	jmp.offset = jump_entry_target(entry) -
>>>  		     (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
>>>
>>> -	if (early_boot_irqs_disabled)
>>> +	/*
>>> +	 * As long as we are in early boot, we can use text_poke_early(), which
>>> +	 * is more efficient: the memory was still not marked as read-only (it
>>> +	 * is only marked after poking_init()). This also prevents us from using
>>> +	 * text_poke() before poking_init() is called.
>>> +	 */
>>> +	if (!early_boot_done)
>>>  		poker = text_poke_early;
>>>
>>>  	if (type == JUMP_LABEL_JMP) {
>>
>> It took me a while to untangle init/maze^H^Hin.c...
>> but I think this is all we need:
>>
>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
>> index aac0c1f7e354..ed5fe274a7d8 100644
>> --- a/arch/x86/kernel/jump_label.c
>> +++ b/arch/x86/kernel/jump_label.c
>> @@ -52,7 +52,12 @@ static void __ref __jump_label_transform(struct jump_entry *entry,
>>  	jmp.offset = jump_entry_target(entry) -
>>  		     (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
>>
>> -	if (early_boot_irqs_disabled)
>> +	/*
>> +	 * As long as we're UP and not yet marked RO, we can use
>> +	 * text_poke_early; SYSTEM_BOOTING guarantees both, as we switch to
>> +	 * SYSTEM_SCHEDULING before going either.
>> +	 */
>> +	if (system_state == SYSTEM_BOOTING)
>>  		poker = text_poke_early;
>>
>>  	if (type == JUMP_LABEL_JMP) {
>
> Can we move this logic into text_poke() and get rid of text_poke_early()?

This will negatively affect poking of modules during module loading, e.g.,
apply_paravirt(). This can be resolved by keeping track of when the module is
write-protected and passing a module parameter to text_poke(). Is it worth
the complexity?

> FWIW, alternative patching was, at some point, a significant fraction of
> total boot time in some cases. This was probably mostly due to unnecessary
> sync_core() calls. Although I think this was reported on a VM, and
> sync_core() used to be *extremely* expensive on a VM, but that’s fixed
> now, and it even got backported, I think.
>
> (Hmm. Maybe we can also make jump label patching work in early boot, too!)

It may be possible to resolve the dependencies between poking_init() and the
other *_init() calls. I first considered doing that, but it makes the code very
fragile, and I don’t see the value in getting rid of text_poke_early() from
either a security or a simplicity point of view. Let me know if you think
otherwise.

Regards,
Nadav
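
For reference, the poker selection debated in this thread can be modeled in
user space roughly as follows. This is only a sketch and not kernel code:
every model_* name, the MODEL_SYSTEM_* enum and the two counters are
hypothetical stand-ins, and both pokers are reduced to a plain memcpy(), so
nothing here reflects what the real text_poke() or text_poke_early() do to
page protections.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's system_state; two states only. */
enum model_system_state { MODEL_SYSTEM_BOOTING, MODEL_SYSTEM_SCHEDULING };

static enum model_system_state model_state = MODEL_SYSTEM_BOOTING;

/* Counters that record which poker was chosen (illustration only). */
static int early_pokes, late_pokes;

typedef void *(*poker_fn)(void *addr, const void *opcode, size_t len);

/* Models the early path: text is assumed writable, so write in place. */
static void *model_text_poke_early(void *addr, const void *opcode, size_t len)
{
	early_pokes++;
	return memcpy(addr, opcode, len);
}

/* Models the late path; the real function has to cope with read-only text. */
static void *model_text_poke(void *addr, const void *opcode, size_t len)
{
	late_pokes++;
	return memcpy(addr, opcode, len);
}

/* Selection in the shape of the second quoted diff: early poker while booting. */
static void model_jump_label_transform(void *addr, const void *opcode,
				       size_t len)
{
	poker_fn poker = model_text_poke;

	if (model_state == MODEL_SYSTEM_BOOTING)
		poker = model_text_poke_early;

	poker(addr, opcode, len);
}
```

Folding the check into a single text_poke(), as suggested above, would move
the `if` into the poker itself; the module-loading case would then need extra
state (for example a per-module "not yet write-protected" flag), which the
model above deliberately leaves out.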