Re: [PATCH v2 1/2] powerpc/code-patching: add patch_memory() for writing RO text

2021-12-12 Thread Russell Currey
On Sun, 2021-12-12 at 09:08 +, Christophe Leroy wrote:
> Le 12/12/2021 à 02:03, Russell Currey a écrit :
> > +static int do_patch_memory(void *dest, const void *src, size_t
> > size, unsigned long poke_addr)
> > +{
> > +   unsigned long patch_addr = poke_addr +
> > offset_in_page(dest);
> > +
> > +   if (map_patch_area(dest, poke_addr)) {
> > +   pr_warn("failed to map %lx\n", poke_addr);
> 
> It isn't worth a warning here. If that happens before slab is
> available, 
> it will panic in early_alloc_pgtable().
> 
> If it happens after, you will already get a pile of messages dumping
> the 
> memory state etc ...
> 
> During the last few years, pr_ messages have been removed from most 
> places where ENOMEM is returned.

That's good to know, thanks.

> 
> > +   return -1;
> > +   }
> 
> I have a series reworking error handling at 
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=274823=*
> 
> Especially this one handles map_patch_area() : 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/85259d894069e47f915ea580b169e1adbeec7a61.1638446239.git.christophe.le...@csgroup.eu/
> 
> Would be good if you could rebase your series on top of it.


> 
I've rebased on top of your series (patchwork 274258 & 274823).

> > +
> > +   memcpy((u8 *)patch_addr, src, size);
> 
> Shouldn't we use copy_to_kernel_nofault(), so that we survive from a 
> fault just like patch_instruction() ?

Yes we should.

> > +
> > +   flush_icache_range(patch_addr, size);
> > +
> > +   if (unmap_patch_area(poke_addr)) {
> > +   pr_warn("failed to unmap %lx\n", poke_addr);
> > +   return -1;
> > +   }
> 
> I have changed unmap_page_area() to a void in 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/299804b117fae35c786c827536c91f25352e279b.1638446239.git.christophe.le...@csgroup.eu/
> 
> > +
> > +   return 0;
> > +}
> > +
> > +/**
> > + * patch_memory - write data using the text poke area
> > + *
> > + * @dest:  destination address
> > + * @src:   source address
> > + * @size:  size in bytes
> > + *
> > + * like memcpy(), but using the text poke area. No atomicity
> > guarantees.
> > + * Do not use for instructions, use patch_instruction() instead.
> > + * Handles crossing page boundaries, though you shouldn't need to.
> > + *
> > + * Return value:
> > + * @dest
> > + **/
> > +void *patch_memory(void *dest, const void *src, size_t size)
> > +{
> > +   size_t bytes_written, write_size;
> > +   unsigned long text_poke_addr;
> > +   unsigned long flags;
> > +
> > +   // If the poke area isn't set up, it's early boot and we
> > can just memcpy.
> > +   if (!this_cpu_read(text_poke_area))
> > +   return memcpy(dest, src, size);
> > +
> > +   local_irq_save(flags);
> 
> Do we want to do such potentially big copies with interrupts disabled
> ?

Probably not.  This should never actually get used for big copies - the
problem it was written to solve never copies more than 40 bytes, and is
very unlikely to ever cross a page boundary.

I could disable and re-enable interrupts per-page (per call of
do_patch_memory()) so there's a preemption window on longer operations.

> 
> > +   text_poke_addr = (unsigned
> > long)__this_cpu_read(text_poke_area)->addr;
> > +
> > +   for (bytes_written = 0;
> > +    bytes_written < size;
> > +    bytes_written += write_size) {
> 
> I recommend you to read 
> https://www.kernel.org/doc/html/latest/process/coding-style.html?highlight=coding%20style#naming
> 
> As explained there, local variable names should be short. Using long 
> names is non-productive.
> 
> You could just call it "written", it would allow you to keep the
> for() 
> on a single line, that would be a lot more readable.

I am aware of the coding style, my brain somehow didn't consider
"written" as a better option, which is quite silly.


> > +   // Write as much as possible without crossing a
> > page boundary.
> > +   write_size = min_t(size_t,
> > +  size - bytes_written,
> > +  PAGE_SIZE - offset_in_page(dest
> > + bytes_written));
> 
> Reduce the size of you variable names and keep it on a single line.

> > +
> > +   if (do_patch_memory(dest + bytes_written,
> > +   src + bytes_written,
> > +   write_size,
> > +   text_poke_addr))
> 
> Same, keep a single line as much as possible.
> 
> > +   break;
> > +   }
> > +
> > +   local_irq_restore(flags);
> > +
> > +   return dest;
> 
> Maybe it would be better to return ERR_PTR() of the error returned by
> do_page_memory().

That is indeed much better.

Thanks for the feedback.

- Russell

> 
> > +}
> >   #else /* !CONFIG_STRICT_KERNEL_RWX */
> >   
> >   static int do_patch_instruction(u32 *addr, 

Re: [PATCH v2 1/2] powerpc/code-patching: add patch_memory() for writing RO text

2021-12-12 Thread Christophe Leroy


Le 12/12/2021 à 02:03, Russell Currey a écrit :
> powerpc allocates a text poke area of one page that is used by
> patch_instruction() to modify read-only text when STRICT_KERNEL_RWX
> is enabled.
> 
> patch_instruction() is only designed for instructions,
> so writing data using the text poke area can only happen 4 bytes
> at a time - each with a page map/unmap, pte flush and syncs.
> 
> This patch introduces patch_memory(), implementing the same
> interface as memcpy(), similar to x86's text_poke() and s390's
> s390_kernel_write().  patch_memory() only needs to map the text
> poke area once, unless the write would cross a page boundary.
> 
> Signed-off-by: Russell Currey 
> ---
> v2: Use min_t() instead of min(), fixing the 32-bit build as reported
>  by snowpatch.
> 
> Some discussion here: https://github.com/linuxppc/issues/issues/375
> 
>   arch/powerpc/include/asm/code-patching.h |  1 +
>   arch/powerpc/lib/code-patching.c | 74 
>   2 files changed, 75 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/code-patching.h 
> b/arch/powerpc/include/asm/code-patching.h
> index 4ba834599c4d..604211d8380c 100644
> --- a/arch/powerpc/include/asm/code-patching.h
> +++ b/arch/powerpc/include/asm/code-patching.h
> @@ -31,6 +31,7 @@ int create_cond_branch(struct ppc_inst *instr, const u32 
> *addr,
>   int patch_branch(u32 *addr, unsigned long target, int flags);
>   int patch_instruction(u32 *addr, struct ppc_inst instr);
>   int raw_patch_instruction(u32 *addr, struct ppc_inst instr);
> +void *patch_memory(void *dest, const void *src, size_t size);
>   
>   static inline unsigned long patch_site_addr(s32 *site)
>   {
> diff --git a/arch/powerpc/lib/code-patching.c 
> b/arch/powerpc/lib/code-patching.c
> index c5ed98823835..330602aa59f1 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -17,6 +17,7 @@
>   #include 
>   #include 
>   #include 
> +#include 
>   
>   static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 
> *patch_addr)
>   {
> @@ -178,6 +179,74 @@ static int do_patch_instruction(u32 *addr, struct 
> ppc_inst instr)
>   
>   return err;
>   }
> +
> +static int do_patch_memory(void *dest, const void *src, size_t size, 
> unsigned long poke_addr)
> +{
> + unsigned long patch_addr = poke_addr + offset_in_page(dest);
> +
> + if (map_patch_area(dest, poke_addr)) {
> + pr_warn("failed to map %lx\n", poke_addr);

It isn't worth a warning here. If that happens before slab is available, 
it will panic in early_alloc_pgtable().

If it happens after, you will already get a pile of messages dumping the 
memory state etc ...

During the last few years, pr_ messages have been removed from most 
places where ENOMEM is returned.

> + return -1;
> + }

I have a series reworking error handling at 
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=274823=*

Especially this one handles map_patch_area() : 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/85259d894069e47f915ea580b169e1adbeec7a61.1638446239.git.christophe.le...@csgroup.eu/

Would be good if you could rebase your series on top of it.


> +
> + memcpy((u8 *)patch_addr, src, size);

Shouldn't we use copy_to_kernel_nofault(), so that we survive from a 
fault just like patch_instruction() ?

> +
> + flush_icache_range(patch_addr, size);
> +
> + if (unmap_patch_area(poke_addr)) {
> + pr_warn("failed to unmap %lx\n", poke_addr);
> + return -1;
> + }

I have changed unmap_page_area() to a void in 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/299804b117fae35c786c827536c91f25352e279b.1638446239.git.christophe.le...@csgroup.eu/

> +
> + return 0;
> +}
> +
> +/**
> + * patch_memory - write data using the text poke area
> + *
> + * @dest:destination address
> + * @src: source address
> + * @size:size in bytes
> + *
> + * like memcpy(), but using the text poke area. No atomicity guarantees.
> + * Do not use for instructions, use patch_instruction() instead.
> + * Handles crossing page boundaries, though you shouldn't need to.
> + *
> + * Return value:
> + *   @dest
> + **/
> +void *patch_memory(void *dest, const void *src, size_t size)
> +{
> + size_t bytes_written, write_size;
> + unsigned long text_poke_addr;
> + unsigned long flags;
> +
> + // If the poke area isn't set up, it's early boot and we can just 
> memcpy.
> + if (!this_cpu_read(text_poke_area))
> + return memcpy(dest, src, size);
> +
> + local_irq_save(flags);

Do we want to do such potentially big copies with interrupts disabled ?

> + text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
> +
> + for (bytes_written = 0;
> +  bytes_written < size;
> +  bytes_written += write_size) {

I recommend you to read 

[PATCH v2 1/2] powerpc/code-patching: add patch_memory() for writing RO text

2021-12-11 Thread Russell Currey
powerpc allocates a text poke area of one page that is used by
patch_instruction() to modify read-only text when STRICT_KERNEL_RWX
is enabled.

patch_instruction() is only designed for instructions,
so writing data using the text poke area can only happen 4 bytes
at a time - each with a page map/unmap, pte flush and syncs.

This patch introduces patch_memory(), implementing the same
interface as memcpy(), similar to x86's text_poke() and s390's
s390_kernel_write().  patch_memory() only needs to map the text
poke area once, unless the write would cross a page boundary.

Signed-off-by: Russell Currey 
---
v2: Use min_t() instead of min(), fixing the 32-bit build as reported
by snowpatch.

Some discussion here: https://github.com/linuxppc/issues/issues/375

 arch/powerpc/include/asm/code-patching.h |  1 +
 arch/powerpc/lib/code-patching.c | 74 
 2 files changed, 75 insertions(+)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index 4ba834599c4d..604211d8380c 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -31,6 +31,7 @@ int create_cond_branch(struct ppc_inst *instr, const u32 
*addr,
 int patch_branch(u32 *addr, unsigned long target, int flags);
 int patch_instruction(u32 *addr, struct ppc_inst instr);
 int raw_patch_instruction(u32 *addr, struct ppc_inst instr);
+void *patch_memory(void *dest, const void *src, size_t size);
 
 static inline unsigned long patch_site_addr(s32 *site)
 {
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index c5ed98823835..330602aa59f1 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 
*patch_addr)
 {
@@ -178,6 +179,74 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst 
instr)
 
return err;
 }
+
+static int do_patch_memory(void *dest, const void *src, size_t size, unsigned 
long poke_addr)
+{
+   unsigned long patch_addr = poke_addr + offset_in_page(dest);
+
+   if (map_patch_area(dest, poke_addr)) {
+   pr_warn("failed to map %lx\n", poke_addr);
+   return -1;
+   }
+
+   memcpy((u8 *)patch_addr, src, size);
+
+   flush_icache_range(patch_addr, size);
+
+   if (unmap_patch_area(poke_addr)) {
+   pr_warn("failed to unmap %lx\n", poke_addr);
+   return -1;
+   }
+
+   return 0;
+}
+
+/**
+ * patch_memory - write data using the text poke area
+ *
+ * @dest:  destination address
+ * @src:   source address
+ * @size:  size in bytes
+ *
+ * like memcpy(), but using the text poke area. No atomicity guarantees.
+ * Do not use for instructions, use patch_instruction() instead.
+ * Handles crossing page boundaries, though you shouldn't need to.
+ *
+ * Return value:
+ * @dest
+ **/
+void *patch_memory(void *dest, const void *src, size_t size)
+{
+   size_t bytes_written, write_size;
+   unsigned long text_poke_addr;
+   unsigned long flags;
+
+   // If the poke area isn't set up, it's early boot and we can just 
memcpy.
+   if (!this_cpu_read(text_poke_area))
+   return memcpy(dest, src, size);
+
+   local_irq_save(flags);
+   text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
+
+   for (bytes_written = 0;
+bytes_written < size;
+bytes_written += write_size) {
+   // Write as much as possible without crossing a page boundary.
+   write_size = min_t(size_t,
+  size - bytes_written,
+  PAGE_SIZE - offset_in_page(dest + 
bytes_written));
+
+   if (do_patch_memory(dest + bytes_written,
+   src + bytes_written,
+   write_size,
+   text_poke_addr))
+   break;
+   }
+
+   local_irq_restore(flags);
+
+   return dest;
+}
 #else /* !CONFIG_STRICT_KERNEL_RWX */
 
 static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
@@ -185,6 +254,11 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst 
instr)
return raw_patch_instruction(addr, instr);
 }
 
+void *patch_memory(void *dest, const void *src, size_t size)
+{
+   return memcpy(dest, src, size);
+}
+
 #endif /* CONFIG_STRICT_KERNEL_RWX */
 
 int patch_instruction(u32 *addr, struct ppc_inst instr)
-- 
2.34.1