On Thu, Nov 23, 2017 at 11:04 PM, Michael Ellerman <[email protected]> wrote:
> Christophe LEROY <[email protected]> writes:
>> On 22/11/2017 at 12:48, Michael Ellerman wrote:
>>> Christophe LEROY <[email protected]> writes:
>>>> On 22/11/2017 at 00:07, Balbir Singh wrote:
>>>>> On Wed, Nov 22, 2017 at 1:28 AM, Christophe Leroy
>>>>> <[email protected]> wrote:
>>>>>> On powerpc32, patch_instruction() is called by apply_feature_fixups()
>>>>>> which is called from early_init()
> ...
>>>>>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>>>>>> index c9de03e0c1f1..d469224c4ada 100644
>>>>>> --- a/arch/powerpc/lib/code-patching.c
>>>>>> +++ b/arch/powerpc/lib/code-patching.c
>>>>>> @@ -146,11 +147,8 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>>>>>>  	 * During early early boot patch_instruction is called
>>>>>>  	 * when text_poke_area is not ready, but we still need
>>>>>>  	 * to allow patching. We just do the plain old patching
>>>>>> -	 * We use slab_is_available and per cpu read * via this_cpu_read
>>>>>> -	 * of text_poke_area. Per-CPU areas might not be up early
>>>>>> -	 * this can create problems with just using this_cpu_read()
>>>>>>  	 */
>>>>>> -	if (!slab_is_available() || !this_cpu_read(text_poke_area))
>>>>>> +	if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
>>>>>>  		return __patch_instruction(addr, instr);
>>>>>
>>>>> On ppc64, we call apply_feature_fixups() in early_setup() after we've
>>>>> relocated ourselves. Sorry for missing the ppc32 case. I would like to
>>>>> avoid PTRRELOC when unnecessary.
>>>>
>>>> What do you suggest then?
>>>>
>>>> Some #ifdef PPC32 around that?
>>>
>>> No, I don't think that improves anything.
>>>
>>> I think the comment about per-cpu not being up is wrong, you'll just get
>>> the static version of text_poke_area, which should be NULL. So we don't
>>> need the slab_is_available() check anyway.
>>>
>>> So I'll take this as-is.
>>>
>>> Having said that I absolutely hate PTRRELOC, so if it starts spreading
>>> we will have to come up with something less bug prone.
>>
>> Would something like that be the solution?
>
> I don't love that actual patch, there's a lot of churn just for one
> flag.
>
> But the idea is not so bad.
>
> In fact I don't think we ever need to use the text_poke_area when we
> call do_feature_fixups().
>
> Most of the calls are in early boot.
>
> The exception is for modules, but when we do the fixups *of the module*,
> the module text is not mapped read only yet.
>
> So I think we can just do something like below.
>
> cheers
>
>
> diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
> index abef812de7f8..1090024e8519 100644
> --- a/arch/powerpc/include/asm/code-patching.h
> +++ b/arch/powerpc/include/asm/code-patching.h
> @@ -31,6 +31,7 @@ unsigned int create_cond_branch(const unsigned int *addr,
>  				unsigned long target, int flags);
>  int patch_branch(unsigned int *addr, unsigned long target, int flags);
>  int patch_instruction(unsigned int *addr, unsigned int instr);
> +int raw_patch_instruction(unsigned int *addr, unsigned int instr);
>
>  int instr_is_relative_branch(unsigned int instr);
>  int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr);
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index d469224c4ada..d1eb24cbef58 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -23,7 +23,7 @@
>  #include <asm/code-patching.h>
>  #include <asm/setup.h>
>
> -static int __patch_instruction(unsigned int *addr, unsigned int instr)
> +int raw_patch_instruction(unsigned int *addr, unsigned int instr)
>  {
>  	int err;
>
> @@ -148,8 +148,8 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>  	 * when text_poke_area is not ready, but we still need
>  	 * to allow patching.
>  	 * We just do the plain old patching
>  	 */
> -	if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
> -		return __patch_instruction(addr, instr);
> +	if (!this_cpu_read(text_poke_area))
> +		return raw_patch_instruction(addr, instr);
>
>  	local_irq_save(flags);
>
> @@ -184,7 +184,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>
>  int patch_instruction(unsigned int *addr, unsigned int instr)
>  {
> -	return __patch_instruction(addr, instr);
> +	return raw_patch_instruction(addr, instr);
>  }
>
>  #endif /* CONFIG_STRICT_KERNEL_RWX */
> diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
> index 41cf5ae273cf..0872d60ede10 100644
> --- a/arch/powerpc/lib/feature-fixups.c
> +++ b/arch/powerpc/lib/feature-fixups.c
> @@ -62,7 +62,7 @@ static int patch_alt_instruction(unsigned int *src, unsigned int *dest,
>  		}
>  	}
>
> -	patch_instruction(dest, instr);
> +	raw_patch_instruction(dest, instr);
>
>  	return 0;
>  }
> @@ -91,7 +91,7 @@ static int patch_feature_section(unsigned long value, struct fixup_entry *fcur)
>  	}
>
>  	for (; dest < end; dest++)
> -		patch_instruction(dest, PPC_INST_NOP);
> +		raw_patch_instruction(dest, PPC_INST_NOP);
>
>  	return 0;
>  }
> @@ -129,7 +129,7 @@ void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
>
>  	for (; start < end; start++) {
>  		dest = (void *)start + *start;
> -		patch_instruction(dest, PPC_INST_LWSYNC);
> +		raw_patch_instruction(dest, PPC_INST_LWSYNC);
>  	}
>  }
>
> @@ -147,7 +147,7 @@ static void do_final_fixups(void)
>  	length = (__end_interrupts - _stext) / sizeof(int);
>
>  	while (length--) {
> -		patch_instruction(dest, *src);
> +		raw_patch_instruction(dest, *src);
>  		src++;
>  		dest++;
>  	}
This looks more promising, but there is a subtle dependence between
marking areas as R/O/X and the raw_patch_ins* bits. I saw that Michael
has merged that patch as-is, so I guess we get to continue to optimise :)

Balbir
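For anyone following along, the control flow under discussion boils down to a small decision in patch_instruction(). A user-space sketch of that decision (simplified and hypothetical: the globals, the direct store standing in for the icache-flushing write, and the elided text_poke_area mapping are all stand-ins modelled on the kernel code, not the real implementation):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int u32;

/* Stand-in for the per-CPU text_poke_area pointer: NULL until the
 * patching VM area is set up late in boot. */
static void *text_poke_area;

/* The raw path: store the instruction directly.  In the kernel this
 * also flushes the icache for the patched address. */
static int raw_patch_instruction(u32 *addr, u32 instr)
{
	*addr = instr;
	return 0;
}

/* Simplified patch_instruction(): before text_poke_area exists (early
 * boot, feature fixups), fall back to raw patching.  Once it exists,
 * the real code maps addr into text_poke_area and writes through that
 * temporary writable mapping; that path is elided here. */
static int patch_instruction(u32 *addr, u32 instr)
{
	if (!text_poke_area)
		return raw_patch_instruction(addr, instr);
	/* ... map addr into text_poke_area, write, unmap ... */
	return raw_patch_instruction(addr, instr);
}

/* A word of "kernel text" to patch; 0x48000000 is an unconditional
 * branch encoding, used only as a recognisable before-value. */
static u32 insn = 0x48000000u;
```

With text_poke_area still NULL, patching `insn` to the NOP encoding (0x60000000) takes the raw path, which is exactly why Michael's patch can call raw_patch_instruction() unconditionally from the fixup code.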
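On the PTRRELOC side, the ppc32 wrinkle is that apply_feature_fixups() can run before the kernel has relocated itself, so a global's link-time address does not yet point at its storage. A contrived user-space model of that (the names, constants, and byte-array "memory" are invented for illustration; the real macro applies the relocation offset computed at boot):

```c
#include <assert.h>
#include <string.h>

/* Invented model: the image is linked to run at LINK_BASE but was
 * loaded at LOAD_BASE, and relocations have not been applied yet. */
enum { LINK_BASE = 0x000, LOAD_BASE = 0x100, MEM_SIZE = 0x300 };

static unsigned char mem[MEM_SIZE];	/* pretend physical memory */

/* Shift a link-time "address" to where the data actually sits --
 * morally what PTRRELOC() does before the kernel relocates itself. */
#define PTRRELOC(link_addr) ((link_addr) + (LOAD_BASE - LINK_BASE))

/* Made-up link-time address of a variable inside the image. */
enum { TEXT_POKE_AREA_ADDR = 0x40 };

static void load_image(void)
{
	memset(mem, 0, sizeof(mem));
	/* The loader placed the image at LOAD_BASE, so the variable's
	 * initialized value (42, standing in for a valid pointer)
	 * lands at the relocated location, not the link-time one. */
	mem[PTRRELOC(TEXT_POKE_AREA_ADDR)] = 42;
}
```

A naive read at the link-time address sees zero while the PTRRELOC'd read finds the data, which is the bug class the `*PTRRELOC(&text_poke_area)` hunk was guarding against, and why avoiding patch_instruction() entirely in the fixup paths sidesteps it.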

