On Mon, Oct 16, 2023 at 04:01:45PM +1100, Benjamin Gray wrote: > patch_instruction() is designed for patching instructions in otherwise > readonly memory. Other consumers also sometimes need to patch readonly > memory, so have abused patch_instruction() for arbitrary data patches. > > This is a problem on ppc64 as patch_instruction() decides on the patch > width using the 'instruction' opcode to see if it's a prefixed > instruction. Data that triggers this can lead to larger writes, possibly > crossing a page boundary and failing the write altogether. > > Introduce patch_uint(), and patch_ulong(), with aliases patch_u32(), and > patch_u64() (on ppc64) designed for aligned data patches. The patch > size is now determined by the called function, and is passed as an > additional parameter to generic internals. > > While the instruction flushing is not required for data patches, the > use cases for data patching (mainly module loading and static calls) > are less performance sensitive than for instruction patching > (ftrace activation).
That's debatable. While it is nice to be able to activate function tracing quickly, it is not necessarily a hot path. On the flip side, I do have a use case for data patching for ftrace activation :) > So the instruction flushing remains unconditional > in this patch. > > ppc32 does not support prefixed instructions, so is unaffected by the > original issue. Care is taken in not exposing the size parameter in the > public (non-static) interface, so the compiler can const-propagate it > away. > > Signed-off-by: Benjamin Gray <bg...@linux.ibm.com> > > --- > > v2: * Deduplicate patch_32() definition > * Use u32 for val32 > * Remove noinline > --- > arch/powerpc/include/asm/code-patching.h | 33 ++++++++++++ > arch/powerpc/lib/code-patching.c | 66 ++++++++++++++++++------ > 2 files changed, 83 insertions(+), 16 deletions(-) > > diff --git a/arch/powerpc/include/asm/code-patching.h > b/arch/powerpc/include/asm/code-patching.h > index 3f881548fb61..7c6056bb1706 100644 > --- a/arch/powerpc/include/asm/code-patching.h > +++ b/arch/powerpc/include/asm/code-patching.h > @@ -75,6 +75,39 @@ int patch_branch(u32 *addr, unsigned long target, int > flags); > int patch_instruction(u32 *addr, ppc_inst_t instr); > int raw_patch_instruction(u32 *addr, ppc_inst_t instr); > > +/* > + * patch_uint() and patch_ulong() should only be called on addresses where > the > + * patch does not cross a cacheline, otherwise it may not be flushed properly > + * and mixes of new and stale data may be observed. It cannot cross a page > + * boundary, as only the target page is mapped as writable. Should we enforce alignment requirements, especially for patch_ulong() on 64-bit powerpc? I am not sure if there are use cases for unaligned 64-bit writes. That should also ensure that the write doesn't cross a cacheline. > + * > + * patch_instruction() and other instruction patchers automatically satisfy > this > + * requirement due to instruction alignment requirements. > + */ > + > +#ifdef CONFIG_PPC64 > + > +int patch_uint(void *addr, unsigned int val); > +int patch_ulong(void *addr, unsigned long val); > + > +#define patch_u64 patch_ulong > + > +#else > + > +static inline int patch_uint(u32 *addr, unsigned int val) Is there a reason to use u32 * here? > +{ > + return patch_instruction(addr, ppc_inst(val)); > +} > + > +static inline int patch_ulong(void *addr, unsigned long val) > +{ > + return patch_instruction(addr, ppc_inst(val)); > +} > + > +#endif > + > +#define patch_u32 patch_uint > + > static inline unsigned long patch_site_addr(s32 *site) > { > return (unsigned long)site + *site; > diff --git a/arch/powerpc/lib/code-patching.c > b/arch/powerpc/lib/code-patching.c > index b00112d7ad46..60289332412f 100644 > --- a/arch/powerpc/lib/code-patching.c > +++ b/arch/powerpc/lib/code-patching.c > @@ -20,15 +20,14 @@ > #include <asm/code-patching.h> > #include <asm/inst.h> > > -static int __patch_instruction(u32 *exec_addr, ppc_inst_t instr, u32 > *patch_addr) > +static int __patch_memory(void *exec_addr, unsigned long val, void > *patch_addr, > + bool is_dword) > { > - if (!ppc_inst_prefixed(instr)) { > - u32 val = ppc_inst_val(instr); > + if (!IS_ENABLED(CONFIG_PPC64) || likely(!is_dword)) { > + u32 val32 = val; Would be good to add a comment indicating the need for this for BE. - Naveen