Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Deepak Gupta

On Fri, May 03, 2024 at 07:38:18PM +, Edgecombe, Rick P wrote:

+Some more shadow stack folks from other archs. We are discussing how uretprobes
work with shadow stack.

Context:
https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/


Thanks Rick.

Yeah I didn't give enough attention to uprobes either.
Although now that I think for RISC-V shadow stack, it shouldn't be an issue.
On RISC-V return addresses don't get pushed as part of call instruction.
There is a distinct instruction "shadow stack push of return address" in prolog.
Similarly in epilog there is distinct instruction "shadow stack pop and check 
with
link register".

On RISC-V, uretprobe would install a uprobe on function start and when it's hit.
It'll replace pt_regs->ra = trampoline_handler. As function will resume, 
trampoline
addr will get pushed and popped. Although trampoline_handler would have to be 
enlightened
to eventually return to original return site.



On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote:


hack below seems to fix it for the current uprobe setup,
we need similar fix for the uretprobe syscall trampoline setup


It seems like a reasonable direction.

Security-wise, applications cannot do this on themselves, or it is an otherwise
privileged thing right?






Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread Waiman Long



On 5/3/24 17:10, David Laight wrote:

From: Waiman Long

Sent: 03 May 2024 17:00

...

David,

Could you respin the series based on the latest upstream code?

I've just reapplied the patches to 'master' and they all apply
cleanly and diffing the new patches to the old ones gives no differences.
So I think they should still apply.

Were you seeing a specific problem?

I don't remember any suggested changed either.
(Apart from a very local variable I used to keep a patch isolated.)


No, I just want to make sure that your patches will still apply. Anyway, 
it will be easier for the maintainer to merge your remaining patches if 
you can send out a new version even if they are almost the same as the 
old ones.


Thanks,
Longman




RE: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread David Laight
From: Waiman Long
> Sent: 03 May 2024 17:00
...
> David,
> 
> Could you respin the series based on the latest upstream code?

I've just reapplied the patches to 'master' and they all apply
cleanly and diffing the new patches to the old ones gives no differences.
So I think they should still apply.

Were you seeing a specific problem?

I don't remember any suggested changed either.
(Apart from a very local variable I used to keep a patch isolated.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCHv4 bpf-next 0/7] uprobe: uretprobe speed up

2024-05-03 Thread Jiri Olsa
On Fri, May 03, 2024 at 11:03:24AM -0700, Andrii Nakryiko wrote:
> On Thu, May 2, 2024 at 1:04 PM Jiri Olsa  wrote:
> >
> > On Thu, May 02, 2024 at 09:43:02AM -0700, Andrii Nakryiko wrote:
> > > On Thu, May 2, 2024 at 5:23 AM Jiri Olsa  wrote:
> > > >
> > > > hi,
> > > > as part of the effort on speeding up the uprobes [0] coming with
> > > > return uprobe optimization by using syscall instead of the trap
> > > > on the uretprobe trampoline.
> > > >
> > > > The speed up depends on instruction type that uprobe is installed
> > > > and depends on specific HW type, please check patch 1 for details.
> > > >
> > > > Patches 1-6 are based on bpf-next/master, but path 1 and 2 are
> > > > apply-able on linux-trace.git tree probes/for-next branch.
> > > > Patch 7 is based on man-pages master.
> > > >
> > > > v4 changes:
> > > >   - added acks [Oleg,Andrii,Masami]
> > > >   - reworded the man page and adding more info to NOTE section [Masami]
> > > >   - rewrote bpf tests not to use trace_pipe [Andrii]
> > > >   - cc-ed linux-man list
> > > >
> > > > Also available at:
> > > >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > > >   uretprobe_syscall
> > > >
> > >
> > > It looks great to me, thanks! Unfortunately BPF CI build is broken,
> > > probably due to some of the Makefile additions, please investigate and
> > > fix (or we'll need to fix something on BPF CI side), but it looks like
> > > you'll need another revision, unfortunately.
> > >
> > > pw-bot: cr
> > >
> > >   [0] 
> > > https://github.com/kernel-patches/bpf/actions/runs/8923849088/job/24509002194
> >
> > yes, I think it's missing the 32-bit libc for uprobe_compat binary,
> > probably it needs to be added to github.com:libbpf/ci.git 
> > setup-build-env/action.yml ?
> > hm but I'm not sure how to test it, need to check
> 
> You can create a custom PR directly against Github repo
> (kernel-patches/bpf) and BPF CI will run all the tests on your custom
> code. This way you can iterate without spamming the mailing list.

I'm running CI tests like that, but I think I need to change the action
which is in other repo (github.com:libbpf/ci.git)

> 
> But I'm just wondering if it's worth complicating setup just for
> testing this x32 compat mode. So maybe just dropping one of those
> patches would be better?

well, we had compat process crashing on uretprobe because of this change,
so I rather keep the test.. or it can go in later on when the CI stuff is
figured out.. I got busy with the shadow stack issue today, will check on
the CI PR next week

jirka



Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Edgecombe, Rick P
On Fri, 2024-05-03 at 22:17 +0200, Jiri Olsa wrote:
> when uretprobe is created, kernel overwrites the return address on user
> stack to point to user space trampoline, so the setup is in kernel hands

I mean for uprobes in general. I'm didn't have any specific ideas in mind, but
in general when we give the kernel more abilities around shadow stack we have to
think if attackers could use it to work around shadow stack protections.

> 
> with the hack below on top of this patchset I'm no longer seeing shadow
> stack app crash on uretprobe.. I'll try to polish it and send out next
> week, any suggestions are welcome ;-)

Thanks. Some comments below.

> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
> index 42fee8959df7..d374305a6851 100644
> --- a/arch/x86/include/asm/shstk.h
> +++ b/arch/x86/include/asm/shstk.h
> @@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct
> *p, unsigned long clon
>  void shstk_free(struct task_struct *p);
>  int setup_signal_shadow_stack(struct ksignal *ksig);
>  int restore_signal_shadow_stack(void);
> +void uprobe_change_stack(unsigned long addr);
> +void uprobe_push_stack(unsigned long addr);

Maybe name them:
shstk_update_last_frame();
shstk_push_frame();


>  #else
>  static inline long shstk_prctl(struct task_struct *task, int option,
>    unsigned long arg2) { return -EINVAL; }
> diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
> index 59e15dd8d0f8..804c446231d9 100644
> --- a/arch/x86/kernel/shstk.c
> +++ b/arch/x86/kernel/shstk.c
> @@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option,
> unsigned long arg2)
> return wrss_control(true);
> return -EINVAL;
>  }
> +
> +void uprobe_change_stack(unsigned long addr)
> +{
> +   unsigned long ssp;

Probably want something like:

if (!features_enabled(ARCH_SHSTK_SHSTK))
return;

So this doesn't try the below if shadow stack is disabled.

> +
> +   ssp = get_user_shstk_addr();
> +   write_user_shstk_64((u64 __user *)ssp, (u64)addr);
> +}

Can we know that there was a valid return address just before this point on the
stack? Or could it be a sigframe or something?

> +
> +void uprobe_push_stack(unsigned long addr)
> +{
> +   unsigned long ssp;

if (!features_enabled(ARCH_SHSTK_SHSTK))
return;

> +
> +   ssp = get_user_shstk_addr();
> +   ssp -= SS_FRAME_SIZE;
> +   write_user_shstk_64((u64 __user *)ssp, (u64)addr);
> +
> +   fpregs_lock_and_load();
> +   wrmsrl(MSR_IA32_PL3_SSP, ssp);
> +   fpregs_unlock();
> +}
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 81e6ee95784d..259457838020 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe)
> regs->r11 = regs->flags;
> regs->cx  = regs->ip;
>  
> +   uprobe_push_stack(r11_cx_ax[2]);

I'm concerned this could be used to push arbitrary frames to the shadow stack.
Couldn't an attacker do a jump to the point that calls this syscall? Maybe this
is what peterz was raising.

> return regs->ax;
>  
>  sigill:
> @@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long
> trampoline_vaddr, struct pt_regs
> return orig_ret_vaddr;
>  
> nleft = copy_to_user((void __user *)regs->sp, _vaddr,
> rasize);
> -   if (likely(!nleft))
> +   if (likely(!nleft)) {
> +   uprobe_change_stack(trampoline_vaddr);
> return orig_ret_vaddr;
> +   }
>  
> if (nleft != rasize) {
> pr_err("return address clobbered: pid=%d, %%sp=%#lx,
> %%ip=%#lx\n",



Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Jiri Olsa
On Fri, May 03, 2024 at 07:38:18PM +, Edgecombe, Rick P wrote:
> +Some more shadow stack folks from other archs. We are discussing how 
> uretprobes
> work with shadow stack.
> 
> Context:
> https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/
> 
> On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote:
> > 
> > hack below seems to fix it for the current uprobe setup,
> > we need similar fix for the uretprobe syscall trampoline setup
> 
> It seems like a reasonable direction.
> 
> Security-wise, applications cannot do this on themselves, or it is an 
> otherwise
> privileged thing right?

when uretprobe is created, kernel overwrites the return address on user
stack to point to user space trampoline, so the setup is in kernel hands

with the hack below on top of this patchset I'm no longer seeing shadow
stack app crash on uretprobe.. I'll try to polish it and send out next
week, any suggestions are welcome ;-)

thanks,
jirka


---
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 42fee8959df7..d374305a6851 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, 
unsigned long clon
 void shstk_free(struct task_struct *p);
 int setup_signal_shadow_stack(struct ksignal *ksig);
 int restore_signal_shadow_stack(void);
+void uprobe_change_stack(unsigned long addr);
+void uprobe_push_stack(unsigned long addr);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
   unsigned long arg2) { return -EINVAL; }
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 59e15dd8d0f8..804c446231d9 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option, 
unsigned long arg2)
return wrss_control(true);
return -EINVAL;
 }
+
+void uprobe_change_stack(unsigned long addr)
+{
+   unsigned long ssp;
+
+   ssp = get_user_shstk_addr();
+   write_user_shstk_64((u64 __user *)ssp, (u64)addr);
+}
+
+void uprobe_push_stack(unsigned long addr)
+{
+   unsigned long ssp;
+
+   ssp = get_user_shstk_addr();
+   ssp -= SS_FRAME_SIZE;
+   write_user_shstk_64((u64 __user *)ssp, (u64)addr);
+
+   fpregs_lock_and_load();
+   wrmsrl(MSR_IA32_PL3_SSP, ssp);
+   fpregs_unlock();
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 81e6ee95784d..259457838020 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe)
regs->r11 = regs->flags;
regs->cx  = regs->ip;
 
+   uprobe_push_stack(r11_cx_ax[2]);
return regs->ax;
 
 sigill:
@@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long 
trampoline_vaddr, struct pt_regs
return orig_ret_vaddr;
 
nleft = copy_to_user((void __user *)regs->sp, _vaddr, 
rasize);
-   if (likely(!nleft))
+   if (likely(!nleft)) {
+   uprobe_change_stack(trampoline_vaddr);
return orig_ret_vaddr;
+   }
 
if (nleft != rasize) {
pr_err("return address clobbered: pid=%d, %%sp=%#lx, 
%%ip=%#lx\n",



Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Edgecombe, Rick P
+Some more shadow stack folks from other archs. We are discussing how uretprobes
work with shadow stack.

Context:
https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/

On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote:
> 
> hack below seems to fix it for the current uprobe setup,
> we need similar fix for the uretprobe syscall trampoline setup

It seems like a reasonable direction.

Security-wise, applications cannot do this on themselves, or it is an otherwise
privileged thing right?




Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Jiri Olsa
On Fri, May 03, 2024 at 03:53:15PM +, Edgecombe, Rick P wrote:
> On Fri, 2024-05-03 at 15:04 +0200, Jiri Olsa wrote:
> > On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote:
> > > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote:
> > > > Adding uretprobe syscall instead of trap to speed up return probe.
> > > > 
> > > > At the moment the uretprobe setup/path is:
> > > > 
> > > >    - install entry uprobe
> > > > 
> > > >    - when the uprobe is hit, it overwrites probed function's return
> > > > address
> > > >  on stack with address of the trampoline that contains breakpoint
> > > >  instruction
> > > > 
> > > >    - the breakpoint trap code handles the uretprobe consumers execution
> > > > and
> > > >  jumps back to original return address
> 
> Hi,
> 
> I worked on the x86 shadow stack support.
> 
> I didn't know uprobes did anything like this. In hindsight I should have 
> looked
> more closely. The current upstream behavior is to overwrite the return address
> on the stack?
> 
> Stupid uprobes question - what is actually overwriting the return address on 
> the
> stack? Is it the kernel? If so perhaps the kernel could just update the shadow
> stack at the same time.

yes, it's in kernel - arch_uretprobe_hijack_return_addr .. so I guess
we need to update the shadow stack with the new return value as well

> 
> > > > 
> > > > This patch replaces the above trampoline's breakpoint instruction with 
> > > > new
> > > > ureprobe syscall call. This syscall does exactly the same job as the 
> > > > trap
> > > > with some more extra work:
> > > > 
> > > >    - syscall trampoline must save original value for rax/r11/rcx 
> > > > registers
> > > >  on stack - rax is set to syscall number and r11/rcx are changed and
> > > >  used by syscall instruction
> > > > 
> > > >    - the syscall code reads the original values of those registers and
> > > >  restore those values in task's pt_regs area
> > > > 
> > > >    - only caller from trampoline exposed in '[uprobes]' is allowed,
> > > >  the process will receive SIGILL signal otherwise
> > > > 
> > > 
> > > Did you consider shadow stacks? IIRC we currently have userspace shadow
> > > stack support available, and that will utterly break all of this.
> > 
> > nope.. I guess it's the extra ret instruction in the trampoline that would
> > make it crash?
> 
> The original behavior seems problematic for shadow stack IIUC. I'm not sure of
> the additional breakage with the new behavior.

I can see it's broken also for current uprobes

> 
> Roughly, how shadow stack works is there is an additional protected stack for
> the app thread. The HW pushes to from the shadow stack with CALL, and pops 
> from
> it with RET. But it also continues to push and pop from the normal stack. On
> pop, if the values don't match between the two stacks, an exception is
> generated. The whole point is to prevent the app from overwriting its stack
> return address to return to random places.
> 
> Userspace cannot (normally) write to the shadow stack, but the kernel can do
> this or adust the SSP (shadow stack pointer). So in the kernel (for things 
> like
> sigreturn) there is an ability to do what is needed. Ptracers also can do 
> things
> like this.

hack below seems to fix it for the current uprobe setup,
we need similar fix for the uretprobe syscall trampoline setup

jirka


---
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 42fee8959df7..99a0948a3b79 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -21,6 +21,7 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, 
unsigned long clon
 void shstk_free(struct task_struct *p);
 int setup_signal_shadow_stack(struct ksignal *ksig);
 int restore_signal_shadow_stack(void);
+void uprobe_change_stack(unsigned long addr);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
   unsigned long arg2) { return -EINVAL; }
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 59e15dd8d0f8..d2c4dbe5843c 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -577,3 +577,11 @@ long shstk_prctl(struct task_struct *task, int option, 
unsigned long arg2)
return wrss_control(true);
return -EINVAL;
 }
+
+void uprobe_change_stack(unsigned long addr)
+{
+   unsigned long ssp;
+
+   ssp = get_user_shstk_addr();
+   write_user_shstk_64((u64 __user *)ssp, (u64)addr);
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 81e6ee95784d..88afbeaacb8f 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -348,7 +348,7 @@ void *arch_uprobe_trampoline(unsigned long *psize)
 * only for native 64-bit process, the compat process still uses
 * standard breakpoint.
 */
-   if (user_64bit_mode(regs)) {
+   if (0 && user_64bit_mode(regs)) {
*psize = 

[REGRESSION][v6.8-rc1] virtio-pci: Introduce admin virtqueue

2024-05-03 Thread Joseph Salisbury

Hi Feng,

During testing, a kernel bug was identified with the suspend/resume 
functionality on instances running in a public cloud [0].  This bug is a 
regression introduced in v6.8-rc1.  After a kernel bisect, the following 
commit was identified as the cause of the regression:


   fd27ef6b44be  ("virtio-pci: Introduce admin virtqueue")

I was hoping to get your feedback, since you are the patch author. Do 
you think gathering any additional data will help diagnose this issue?  
This commit is depended upon by other virtio commits, so a revert test 
is not really straight forward without reverting all the dependencies.   
Any ideas you have would be greatly appreciated.



Thanks,

Joe

http://pad.lv/2063315




Re: [PATCHv4 bpf-next 0/7] uprobe: uretprobe speed up

2024-05-03 Thread Andrii Nakryiko
On Thu, May 2, 2024 at 1:04 PM Jiri Olsa  wrote:
>
> On Thu, May 02, 2024 at 09:43:02AM -0700, Andrii Nakryiko wrote:
> > On Thu, May 2, 2024 at 5:23 AM Jiri Olsa  wrote:
> > >
> > > hi,
> > > as part of the effort on speeding up the uprobes [0] coming with
> > > return uprobe optimization by using syscall instead of the trap
> > > on the uretprobe trampoline.
> > >
> > > The speed up depends on instruction type that uprobe is installed
> > > and depends on specific HW type, please check patch 1 for details.
> > >
> > > Patches 1-6 are based on bpf-next/master, but path 1 and 2 are
> > > apply-able on linux-trace.git tree probes/for-next branch.
> > > Patch 7 is based on man-pages master.
> > >
> > > v4 changes:
> > >   - added acks [Oleg,Andrii,Masami]
> > >   - reworded the man page and adding more info to NOTE section [Masami]
> > >   - rewrote bpf tests not to use trace_pipe [Andrii]
> > >   - cc-ed linux-man list
> > >
> > > Also available at:
> > >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > >   uretprobe_syscall
> > >
> >
> > It looks great to me, thanks! Unfortunately BPF CI build is broken,
> > probably due to some of the Makefile additions, please investigate and
> > fix (or we'll need to fix something on BPF CI side), but it looks like
> > you'll need another revision, unfortunately.
> >
> > pw-bot: cr
> >
> >   [0] 
> > https://github.com/kernel-patches/bpf/actions/runs/8923849088/job/24509002194
>
> yes, I think it's missing the 32-bit libc for uprobe_compat binary,
> probably it needs to be added to github.com:libbpf/ci.git 
> setup-build-env/action.yml ?
> hm but I'm not sure how to test it, need to check

You can create a custom PR directly against Github repo
(kernel-patches/bpf) and BPF CI will run all the tests on your custom
code. This way you can iterate without spamming the mailing list.

But I'm just wondering if it's worth complicating setup just for
testing this x32 compat mode. So maybe just dropping one of those
patches would be better?

>
> >
> >
> >
> > But while we are at it.
> >
> > Masami, Oleg,
> >
> > What should be the logistics of landing this? Can/should we route this
> > through the bpf-next tree, given there are lots of BPF-based
> > selftests? Or you want to take this through
> > linux-trace/probes/for-next? In the latter case, it's probably better
> > to apply only the first two patches to probes/for-next and the rest
> > should still go through the bpf-next tree (otherwise we are running
>
> I think this was the plan, previously mentioned in here:
> https://lore.kernel.org/bpf/20240423000943.478ccf1e735a63c6c1b4c...@kernel.org/
>

Ok, then we'll have to land this patch set as two separate ones. It's
fine, let's figure out if you need to do anything for shadow stacks
and try to land it soon.

> > into conflicts in BPF selftests). Previously we were handling such
> > cross-tree dependencies by creating a named branch or tag, and merging
> > it into bpf-next (so that all SHAs are preserved). It's a bunch of
> > extra work for everyone involved, so the simplest way would be to just
> > land through bpf-next, of course. But let me know your preferences.
> >
> > Thanks!
> >
> > > thanks,
> > > jirka
> > >
> > >
> > > Notes to check list items in Documentation/process/adding-syscalls.rst:
> > >
> > > - System Call Alternatives
> > >   New syscall seems like the best way in here, becase we need
> >
> > typo (thanks, Gmail): because
>
> ok
>
> >
> > >   just to quickly enter kernel with no extra arguments processing,
> > >   which we'd need to do if we decided to use another syscall.
> > >
> > > - Designing the API: Planning for Extension
> > >   The uretprobe syscall is very specific and most likely won't be
> > >   extended in the future.
> > >
> > >   At the moment it does not take any arguments and even if it does
> > >   in future, it's allowed to be called only from trampoline prepared
> > >   by kernel, so there'll be no broken user.
> > >
> > > - Designing the API: Other Considerations
> > >   N/A because uretprobe syscall does not return reference to kernel
> > >   object.
> > >
> > > - Proposing the API
> > >   Wiring up of the uretprobe system call si in separate change,
> >
> > typo: is
>
> ok, thanks
>
> jirka



RE: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread David Laight
From: Waiman Long
> Sent: 03 May 2024 17:00
> To: David Laight ; 'linux-kernel@vger.kernel.org' 
>  ker...@vger.kernel.org>; 'pet...@infradead.org' 
> Cc: 'mi...@redhat.com' ; 'w...@kernel.org' 
> ; 'boqun.f...@gmail.com'
> ; 'Linus Torvalds' ; 
> 'virtualization@lists.linux-
> foundation.org' ; 'Zeng Heng' 
> 
> Subject: Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and 
> per_cpu_ptr().
> 
> 
> On 12/31/23 23:14, Waiman Long wrote:
> >
> > On 12/31/23 16:55, David Laight wrote:
> >> per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
> >> This requires the cpu number be 64bit.
> >> However the value is osq_lock() comes from a 32bit xchg() and there
> >> isn't a way of telling gcc the high bits are zero (they are) so
> >> there will always be an instruction to clear the high bits.
> >>
> >> The cpu number is also offset by one (to make the initialiser 0)
> >> It seems to be impossible to get gcc to convert
> >> __per_cpu_offset[cpu_p1 - 1]
> >> into (__per_cpu_offset - 1)[cpu_p1] (transferring the offset to the
> >> address).
> >>
> >> Converting the cpu number to 32bit unsigned prior to the decrement means
> >> that gcc knows the decrement has set the high bits to zero and doesn't
> >> add a register-register move (or cltq) to zero/sign extend the value.
> >>
> >> Not massive but saves two instructions.
> >>
> >> Signed-off-by: David Laight 
> >> ---
> >>   kernel/locking/osq_lock.c | 6 ++
> >>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> >> index 35bb99e96697..37a4fa872989 100644
> >> --- a/kernel/locking/osq_lock.c
> >> +++ b/kernel/locking/osq_lock.c
> >> @@ -29,11 +29,9 @@ static inline int encode_cpu(int cpu_nr)
> >>   return cpu_nr + 1;
> >>   }
> >>   -static inline struct optimistic_spin_node *decode_cpu(int
> >> encoded_cpu_val)
> >> +static inline struct optimistic_spin_node *decode_cpu(unsigned int
> >> encoded_cpu_val)
> >>   {
> >> -    int cpu_nr = encoded_cpu_val - 1;
> >> -
> >> -    return per_cpu_ptr(_node, cpu_nr);
> >> +    return per_cpu_ptr(_node, encoded_cpu_val - 1);
> >>   }
> >>     /*
> >
> > You really like micro-optimization.
> >
> > Anyway,
> >
> > Reviewed-by: Waiman Long 
> >
> David,
> 
> Could you respin the series based on the latest upstream code?

Looks like a wet bank holiday weekend.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread Waiman Long



On 12/31/23 23:14, Waiman Long wrote:


On 12/31/23 16:55, David Laight wrote:

per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
This requires the cpu number be 64bit.
However the value is osq_lock() comes from a 32bit xchg() and there
isn't a way of telling gcc the high bits are zero (they are) so
there will always be an instruction to clear the high bits.

The cpu number is also offset by one (to make the initialiser 0)
It seems to be impossible to get gcc to convert 
__per_cpu_offset[cpu_p1 - 1]
into (__per_cpu_offset - 1)[cpu_p1] (transferring the offset to the 
address).


Converting the cpu number to 32bit unsigned prior to the decrement means
that gcc knows the decrement has set the high bits to zero and doesn't
add a register-register move (or cltq) to zero/sign extend the value.

Not massive but saves two instructions.

Signed-off-by: David Laight 
---
  kernel/locking/osq_lock.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 35bb99e96697..37a4fa872989 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -29,11 +29,9 @@ static inline int encode_cpu(int cpu_nr)
  return cpu_nr + 1;
  }
  -static inline struct optimistic_spin_node *decode_cpu(int 
encoded_cpu_val)
+static inline struct optimistic_spin_node *decode_cpu(unsigned int 
encoded_cpu_val)

  {
-    int cpu_nr = encoded_cpu_val - 1;
-
-    return per_cpu_ptr(_node, cpu_nr);
+    return per_cpu_ptr(_node, encoded_cpu_val - 1);
  }
    /*


You really like micro-optimization.

Anyway,

Reviewed-by: Waiman Long 


David,

Could you respin the series based on the latest upstream code?

Thanks,
Longman




Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Edgecombe, Rick P
On Fri, 2024-05-03 at 15:04 +0200, Jiri Olsa wrote:
> On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote:
> > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote:
> > > Adding uretprobe syscall instead of trap to speed up return probe.
> > > 
> > > At the moment the uretprobe setup/path is:
> > > 
> > >    - install entry uprobe
> > > 
> > >    - when the uprobe is hit, it overwrites probed function's return
> > > address
> > >  on stack with address of the trampoline that contains breakpoint
> > >  instruction
> > > 
> > >    - the breakpoint trap code handles the uretprobe consumers execution
> > > and
> > >  jumps back to original return address

Hi,

I worked on the x86 shadow stack support.

I didn't know uprobes did anything like this. In hindsight I should have looked
more closely. The current upstream behavior is to overwrite the return address
on the stack?

Stupid uprobes question - what is actually overwriting the return address on the
stack? Is it the kernel? If so perhaps the kernel could just update the shadow
stack at the same time.

> > > 
> > > This patch replaces the above trampoline's breakpoint instruction with new
> > > ureprobe syscall call. This syscall does exactly the same job as the trap
> > > with some more extra work:
> > > 
> > >    - syscall trampoline must save original value for rax/r11/rcx registers
> > >  on stack - rax is set to syscall number and r11/rcx are changed and
> > >  used by syscall instruction
> > > 
> > >    - the syscall code reads the original values of those registers and
> > >  restore those values in task's pt_regs area
> > > 
> > >    - only caller from trampoline exposed in '[uprobes]' is allowed,
> > >  the process will receive SIGILL signal otherwise
> > > 
> > 
> > Did you consider shadow stacks? IIRC we currently have userspace shadow
> > stack support available, and that will utterly break all of this.
> 
> nope.. I guess it's the extra ret instruction in the trampoline that would
> make it crash?

The original behavior seems problematic for shadow stack IIUC. I'm not sure of
the additional breakage with the new behavior.

Roughly, how shadow stack works is there is an additional protected stack for
the app thread. The HW pushes to from the shadow stack with CALL, and pops from
it with RET. But it also continues to push and pop from the normal stack. On
pop, if the values don't match between the two stacks, an exception is
generated. The whole point is to prevent the app from overwriting its stack
return address to return to random places.

Userspace cannot (normally) write to the shadow stack, but the kernel can do
this or adust the SSP (shadow stack pointer). So in the kernel (for things like
sigreturn) there is an ability to do what is needed. Ptracers also can do things
like this.


Re: [PATCH net-next v3 2/2] ipvs: allow some sysctls in non-init user namespaces

2024-05-03 Thread Julian Anastasov

Hello,

On Thu, 18 Apr 2024, Alexander Mikhalitsyn wrote:

> Let's make all IPVS sysctls writtable even when
> network namespace is owned by non-initial user namespace.
> 
> Let's make a few sysctls to be read-only for non-privileged users:
> - sync_qlen_max
> - sync_sock_size
> - run_estimation
> - est_cpulist
> - est_nice
> 
> I'm trying to be conservative with this to prevent
> introducing any security issues in there. Maybe,
> we can allow more sysctls to be writable, but let's
> do this on-demand and when we see real use-case.
> 
> This patch is motivated by user request in the LXC
> project [1]. Having this can help with running some
> Kubernetes [2] or Docker Swarm [3] workloads inside the system
> containers.
> 
> Link: https://github.com/lxc/lxc/issues/4278 [1]
> Link: 
> https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103
>  [2]
> Link: 
> https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682
>  [3]
> 
> Cc: Stéphane Graber 
> Cc: Christian Brauner 
> Cc: Julian Anastasov 
> Cc: Simon Horman 
> Cc: Pablo Neira Ayuso 
> Cc: Jozsef Kadlecsik 
> Cc: Florian Westphal 
> Signed-off-by: Alexander Mikhalitsyn 
> ---
>  net/netfilter/ipvs/ip_vs_ctl.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index 32be24f0d4e4..c3ba71aa2654 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c

...

> @@ -4284,12 +4285,6 @@ static int __net_init 
> ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
>   tbl = kmemdup(vs_vars, sizeof(vs_vars), GFP_KERNEL);
>   if (tbl == NULL)
>   return -ENOMEM;
> -
> - /* Don't export sysctls to unprivileged users */
> - if (net->user_ns != _user_ns) {
> - tbl[0].procname = NULL;
> - ctl_table_size = 0;
> - }
>   } else
>   tbl = vs_vars;
>   /* Initialize sysctl defaults */

Sorry but you have to send v4 because above if-block was
changed with net-next commit 635470eb0aa7 from today...

Regards

--
Julian Anastasov 

Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Jiri Olsa
On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote:
> On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote:
> > Adding uretprobe syscall instead of trap to speed up return probe.
> > 
> > At the moment the uretprobe setup/path is:
> > 
> >   - install entry uprobe
> > 
> >   - when the uprobe is hit, it overwrites probed function's return address
> > on stack with address of the trampoline that contains breakpoint
> > instruction
> > 
> >   - the breakpoint trap code handles the uretprobe consumers execution and
> > jumps back to original return address
> > 
> > This patch replaces the above trampoline's breakpoint instruction with new
> > ureprobe syscall call. This syscall does exactly the same job as the trap
> > with some more extra work:
> > 
> >   - syscall trampoline must save original value for rax/r11/rcx registers
> > on stack - rax is set to syscall number and r11/rcx are changed and
> > used by syscall instruction
> > 
> >   - the syscall code reads the original values of those registers and
> > restore those values in task's pt_regs area
> > 
> >   - only caller from trampoline exposed in '[uprobes]' is allowed,
> > the process will receive SIGILL signal otherwise
> > 
> 
> Did you consider shadow stacks? IIRC we currently have userspace shadow
> stack support available, and that will utterly break all of this.

nope.. I guess it's the extra ret instruction in the trampoline that would
make it crash?

> 
> It would be really nice if the new scheme would consider shadow stacks.

I seem to have the hw with support for user_shstk, let me test that

thanks,
jirka



Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe

2024-05-03 Thread Peter Zijlstra
On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote:
> Adding uretprobe syscall instead of trap to speed up return probe.
> 
> At the moment the uretprobe setup/path is:
> 
>   - install entry uprobe
> 
>   - when the uprobe is hit, it overwrites probed function's return address
> on stack with address of the trampoline that contains breakpoint
> instruction
> 
>   - the breakpoint trap code handles the uretprobe consumers execution and
> jumps back to original return address
> 
> This patch replaces the above trampoline's breakpoint instruction with new
> ureprobe syscall call. This syscall does exactly the same job as the trap
> with some more extra work:
> 
>   - syscall trampoline must save original value for rax/r11/rcx registers
> on stack - rax is set to syscall number and r11/rcx are changed and
> used by syscall instruction
> 
>   - the syscall code reads the original values of those registers and
> restore those values in task's pt_regs area
> 
>   - only caller from trampoline exposed in '[uprobes]' is allowed,
> the process will receive SIGILL signal otherwise
> 

Did you consider shadow stacks? IIRC we currently have userspace shadow
stack support available, and that will utterly break all of this.

It would be really nice if the new scheme would consider shadow stacks.



Re: [PATCH v7 00/16] mm: jit/text allocator

2024-05-03 Thread Liviu Dudau
On Fri, May 03, 2024 at 09:28:25AM +0300, Mike Rapoport wrote:
> On Fri, May 03, 2024 at 01:23:30AM +0100, Liviu Dudau wrote:
> > On Thu, May 02, 2024 at 04:07:05PM -0700, Luis Chamberlain wrote:
> > > On Thu, May 02, 2024 at 11:50:36PM +0100, Liviu Dudau wrote:
> > > > On Mon, Apr 29, 2024 at 09:29:20AM -0700, Luis Chamberlain wrote:
> > > > > On Mon, Apr 29, 2024 at 03:16:04PM +0300, Mike Rapoport wrote:
> > > > > > From: "Mike Rapoport (IBM)" 
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > The patches are also available in git:
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=execmem/v7
> > > > > > 
> > > > > > v7 changes:
> > > > > > * define MODULE_{VADDR,END} for riscv32 to fix the build and avoid
> > > > > >   #ifdefs in a function body
> > > > > > * add Acks, thanks everybody
> > > > > 
> > > > > Thanks, I've pushed this to modules-next for further exposure / 
> > > > > testing.
> > > > > Given the status of testing so far with prior revisions, in that only 
> > > > > a
> > > > > few issues were found and that those were fixed, and the status of
> > > > > reviews, this just might be ripe for v6.10.
> > > > 
> > > > Looks like there is still some work needed. I've picked up next-20240501
> > > > and on arch/mips with CONFIG_MODULE_COMPRESS_XZ=y and 
> > > > CONFIG_MODULE_DECOMPRESS=y
> > > > I fail to load any module:
> > > > 
> > > > # modprobe rfkill
> > > > [11746.539090] Invalid ELF header magic: != ELF
> > > > [11746.587149] execmem: unable to allocate memory
> > > > modprobe: can't load module rfkill (kernel/net/rfkill/rfkill.ko.xz): 
> > > > Out of memory
> > > > 
> > > > The (hopefully) relevant parts of my .config:
> > > 
> > > Thanks for the report! Any chance we can get you to try a bisection? I
> > > think it should take 2-3 test boots. To help reduce scope you try 
> > > modules-next:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next
> > > 
> > > Then can you check by resetting your tree to commmit 3fbe6c2f820a76 (mm:
> > > introduce execmem_alloc() and execmem_free()"). I suspect that should
> > > boot, so your bad commit would be the tip 3c2c250cb3a5fbb ("bpf: remove
> > > CONFIG_BPF_JIT dependency on CONFIG_MODULES of").
> > > 
> > > That gives us only a few commits to bisect:
> > > 
> > > git log --oneline 3fbe6c2f820a76bc36d5546bda85832f57c8fce2..
> > > 3c2c250cb3a5 (HEAD -> modules-next, korg/modules-next) bpf: remove 
> > > CONFIG_BPF_JIT dependency on CONFIG_MODULES of
> > > 11e8e65cce5c kprobes: remove dependency on CONFIG_MODULES
> > > e10cbc38697b powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where 
> > > appropriate
> > > 4da3d38f24c5 x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
> > > 13ae3d74ee70 arch: make execmem setup available regardless of 
> > > CONFIG_MODULES
> > > 460bbbc70a47 powerpc: extend execmem_params for kprobes allocations
> > > e1a14069b5b4 arm64: extend execmem_info for generated code allocations
> > > 971e181c6585 riscv: extend execmem_params for generated code allocations
> > > 0fa276f26721 mm/execmem, arch: convert remaining overrides of 
> > > module_alloc to execmem
> > > 022cef244287 mm/execmem, arch: convert simple overrides of module_alloc 
> > > to execmem
> > > 
> > > With 2-3 boots we should be to tell which is the bad commit.
> > 
> > Looks like 0fa276f26721 is the first bad commit.
> > 
> > $ git bisect log
> > # bad: [3c2c250cb3a5fbbccc4a4ff4c9354c54af91f02c] bpf: remove 
> > CONFIG_BPF_JIT dependency on CONFIG_MODULES of
> > # good: [3fbe6c2f820a76bc36d5546bda85832f57c8fce2] mm: introduce 
> > execmem_alloc() and execmem_free()
> > git bisect start '3c2c250cb3a5' '3fbe6c2f820a76'
> > # bad: [460bbbc70a47e929b1936ca68979f3b79f168fc6] powerpc: extend 
> > execmem_params for kprobes allocations
> > git bisect bad 460bbbc70a47e929b1936ca68979f3b79f168fc6
> > # bad: [0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e] mm/execmem, arch: convert 
> > remaining overrides of module_alloc to execmem
> > git bisect bad 0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e
> > # good: [022cef2442870db738a366d3b7a636040c081859] mm/execmem, arch: 
> > convert simple overrides of module_alloc to execmem
> > git bisect good 022cef2442870db738a366d3b7a636040c081859
> > # first bad commit: [0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e] mm/execmem, 
> > arch: convert remaining overrides of module_alloc to execmem
> > 
> > Maybe MIPS also needs a ARCH_WANTS_EXECMEM_LATE?
> 
> I don't think so. It rather seems there's a bug in the initialization of
> the defaults in execmem. This should fix it:
> 
> diff --git a/mm/execmem.c b/mm/execmem.c
> index f6dc3fabc1ca..0c4b36bc6d10 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -118,7 +118,6 @@ static void __init __execmem_init(void)
>   info->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
>   info->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
>   info->ranges[EXECMEM_DEFAULT].alignment = 1;
> - 

Re: [PATCH v9 1/3] soc: qcom: Add qcom_rproc_minidump module

2024-05-03 Thread Mukesh Ojha




On 4/30/2024 7:08 PM, Bjorn Andersson wrote:

On Tue, Mar 26, 2024 at 07:43:12PM +0530, Mukesh Ojha wrote:

Add qcom_rproc_minidump module in a preparation to remove
minidump specific code from driver/remoteproc/qcom_common.c
and provide needed exported API, this as well helps to
abstract minidump specific data layout from qualcomm's
remoteproc driver.

It is just a copying of qcom_minidump() functionality from
driver/remoteproc/qcom_common.c into a separate file under
qcom_rproc_minidump().



I'd prefer to see this enter the git history as one patch, extracting
this logic from the remoteproc into its own driver - rather than as
presented here give a sense that it's a new thing added. (I'll take care
of the maintainer sync...)

I also would prefer for this to include a problem description,
documenting why this is done.


I've not compared patch 1 and 3, but I'd also like a statement in the
commit message telling if there are any changes, or if the functions are
cleanly moved from one place to another.



Thanks for the review, addressed the comments here in v10.

https://lore.kernel.org/lkml/1714724287-12518-1-git-send-email-quic_mo...@quicinc.com/

-Mukesh



[PATCH v10] remoteproc: qcom: Move minidump related layout and API to soc/qcom directory

2024-05-03 Thread Mukesh Ojha
Currently, Qualcomm Minidump is being used to collect mini version of
remoteproc coredump with the help of boot firmware however, Minidump
as a feature is not limited to be used only for remote processor and
can also be used for Application processors. So, in preparation of
using it move the Minidump related data structure and its function
to its own file under drivers/soc/qcom with qcom_rproc_minidump.c
which clearly says it is only for remoteproc minidump.

Extra changes made apart from the movement is,
1. Adds new config, kernel headers and module macros to get this
   module compiled.
2. Guards the qcom_minidump() with CONFIG_QCOM_RPROC_MINIDUMP.
3. Selects this QCOM_RPROC_MINIDUMP config when QCOM_RPROC_COMMON
   enabled.
4. Added new header qcom_minidump.h .

Signed-off-by: Mukesh Ojha 
---
Changes in v10:
 - Converted all 3 changes sent in v9 to a single to properly
   reflect the movement in git history as per suggestion made by [Bjorn.A]

 - Described the reason of the change in commit text.

Changes in v9: 
https://lore.kernel.org/lkml/1711462394-21540-1-git-send-email-quic_mo...@quicinc.com/
 - Added source file driver/remoteproc/qcom_common.c copyright
   to qcom_rproc_minidump.c
 - Dissociated it from minidump series as this can go separately
   and minidump can put it dependency for the data structure files.

Nothing much changed in these three patches from previous version,
However, giving the link of their older versions.

v8: https://lore.kernel.org/lkml/20240131105734.13090-1-quic_mo...@quicinc.com/
v7: https://lore.kernel.org/lkml/20240109153200.12848-1-quic_mo...@quicinc.com/
v6: 
https://lore.kernel.org/lkml/1700864395-1479-1-git-send-email-quic_mo...@quicinc.com/
v5: 
https://lore.kernel.org/lkml/1694429639-21484-1-git-send-email-quic_mo...@quicinc.com/
v4: 
https://lore.kernel.org/lkml/1687955688-20809-1-git-send-email-quic_mo...@quicinc.com/

 drivers/remoteproc/Kconfig |   1 +
 drivers/remoteproc/qcom_common.c   | 160 -
 drivers/remoteproc/qcom_common.h   |   5 -
 drivers/remoteproc/qcom_q6v5_pas.c |   1 +
 drivers/soc/qcom/Kconfig   |  11 ++
 drivers/soc/qcom/Makefile  |   1 +
 drivers/soc/qcom/qcom_rproc_minidump.c | 177 +
 include/soc/qcom/qcom_minidump.h   |  23 +
 8 files changed, 214 insertions(+), 165 deletions(-)
 create mode 100644 drivers/soc/qcom/qcom_rproc_minidump.c
 create mode 100644 include/soc/qcom/qcom_minidump.h

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..cea960749e2c 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -166,6 +166,7 @@ config QCOM_PIL_INFO
 
 config QCOM_RPROC_COMMON
tristate
+   select QCOM_RPROC_MINIDUMP
 
 config QCOM_Q6V5_COMMON
tristate
diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index 03e5f5d533eb..085fd73fa23a 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "remoteproc_internal.h"
 #include "qcom_common.h"
@@ -26,61 +25,6 @@
 #define to_smd_subdev(d) container_of(d, struct qcom_rproc_subdev, subdev)
 #define to_ssr_subdev(d) container_of(d, struct qcom_rproc_ssr, subdev)
 
-#define MAX_NUM_OF_SS   10
-#define MAX_REGION_NAME_LENGTH  16
-#define SBL_MINIDUMP_SMEM_ID   602
-#define MINIDUMP_REGION_VALID  ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' 
<< 0)
-#define MINIDUMP_SS_ENCR_DONE  ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' 
<< 0)
-#define MINIDUMP_SS_ENABLED('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' 
<< 0)
-
-/**
- * struct minidump_region - Minidump region
- * @name   : Name of the region to be dumped
- * @seq_num:   : Use to differentiate regions with same name.
- * @valid  : This entry to be dumped (if set to 1)
- * @address: Physical address of region to be dumped
- * @size   : Size of the region
- */
-struct minidump_region {
-   charname[MAX_REGION_NAME_LENGTH];
-   __le32  seq_num;
-   __le32  valid;
-   __le64  address;
-   __le64  size;
-};
-
-/**
- * struct minidump_subsystem - Subsystem's SMEM Table of content
- * @status : Subsystem toc init status
- * @enabled : if set to 1, this region would be copied during coredump
- * @encryption_status: Encryption status for this subsystem
- * @encryption_required : Decides to encrypt the subsystem regions or not
- * @region_count : Number of regions added in this subsystem toc
- * @regions_baseptr : regions base pointer of the subsystem
- */
-struct minidump_subsystem {
-   __le32  status;
-   __le32  enabled;
-   __le32  encryption_status;
-   __le32  encryption_required;
-   __le32  region_count;
-   __le64  regions_baseptr;
-};
-
-/**
- * struct minidump_global_toc - Global Table of Content
- * @status : 

Re: [PATCH v7 00/16] mm: jit/text allocator

2024-05-03 Thread Mike Rapoport
On Fri, May 03, 2024 at 01:23:30AM +0100, Liviu Dudau wrote:
> On Thu, May 02, 2024 at 04:07:05PM -0700, Luis Chamberlain wrote:
> > On Thu, May 02, 2024 at 11:50:36PM +0100, Liviu Dudau wrote:
> > > On Mon, Apr 29, 2024 at 09:29:20AM -0700, Luis Chamberlain wrote:
> > > > On Mon, Apr 29, 2024 at 03:16:04PM +0300, Mike Rapoport wrote:
> > > > > From: "Mike Rapoport (IBM)" 
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > The patches are also available in git:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=execmem/v7
> > > > > 
> > > > > v7 changes:
> > > > > * define MODULE_{VADDR,END} for riscv32 to fix the build and avoid
> > > > >   #ifdefs in a function body
> > > > > * add Acks, thanks everybody
> > > > 
> > > > Thanks, I've pushed this to modules-next for further exposure / testing.
> > > > Given the status of testing so far with prior revisions, in that only a
> > > > few issues were found and that those were fixed, and the status of
> > > > reviews, this just might be ripe for v6.10.
> > > 
> > > Looks like there is still some work needed. I've picked up next-20240501
> > > and on arch/mips with CONFIG_MODULE_COMPRESS_XZ=y and 
> > > CONFIG_MODULE_DECOMPRESS=y
> > > I fail to load any module:
> > > 
> > > # modprobe rfkill
> > > [11746.539090] Invalid ELF header magic: != ELF
> > > [11746.587149] execmem: unable to allocate memory
> > > modprobe: can't load module rfkill (kernel/net/rfkill/rfkill.ko.xz): Out 
> > > of memory
> > > 
> > > The (hopefully) relevant parts of my .config:
> > 
> > Thanks for the report! Any chance we can get you to try a bisection? I
> > think it should take 2-3 test boots. To help reduce scope you try 
> > modules-next:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next
> > 
> > Then can you check by resetting your tree to commmit 3fbe6c2f820a76 (mm:
> > introduce execmem_alloc() and execmem_free()"). I suspect that should
> > boot, so your bad commit would be the tip 3c2c250cb3a5fbb ("bpf: remove
> > CONFIG_BPF_JIT dependency on CONFIG_MODULES of").
> > 
> > That gives us only a few commits to bisect:
> > 
> > git log --oneline 3fbe6c2f820a76bc36d5546bda85832f57c8fce2..
> > 3c2c250cb3a5 (HEAD -> modules-next, korg/modules-next) bpf: remove 
> > CONFIG_BPF_JIT dependency on CONFIG_MODULES of
> > 11e8e65cce5c kprobes: remove dependency on CONFIG_MODULES
> > e10cbc38697b powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where 
> > appropriate
> > 4da3d38f24c5 x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
> > 13ae3d74ee70 arch: make execmem setup available regardless of CONFIG_MODULES
> > 460bbbc70a47 powerpc: extend execmem_params for kprobes allocations
> > e1a14069b5b4 arm64: extend execmem_info for generated code allocations
> > 971e181c6585 riscv: extend execmem_params for generated code allocations
> > 0fa276f26721 mm/execmem, arch: convert remaining overrides of module_alloc 
> > to execmem
> > 022cef244287 mm/execmem, arch: convert simple overrides of module_alloc to 
> > execmem
> > 
> > With 2-3 boots we should be to tell which is the bad commit.
> 
> Looks like 0fa276f26721 is the first bad commit.
> 
> $ git bisect log
> # bad: [3c2c250cb3a5fbbccc4a4ff4c9354c54af91f02c] bpf: remove CONFIG_BPF_JIT 
> dependency on CONFIG_MODULES of
> # good: [3fbe6c2f820a76bc36d5546bda85832f57c8fce2] mm: introduce 
> execmem_alloc() and execmem_free()
> git bisect start '3c2c250cb3a5' '3fbe6c2f820a76'
> # bad: [460bbbc70a47e929b1936ca68979f3b79f168fc6] powerpc: extend 
> execmem_params for kprobes allocations
> git bisect bad 460bbbc70a47e929b1936ca68979f3b79f168fc6
> # bad: [0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e] mm/execmem, arch: convert 
> remaining overrides of module_alloc to execmem
> git bisect bad 0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e
> # good: [022cef2442870db738a366d3b7a636040c081859] mm/execmem, arch: convert 
> simple overrides of module_alloc to execmem
> git bisect good 022cef2442870db738a366d3b7a636040c081859
> # first bad commit: [0fa276f26721e0ffc2ae9c7cf67dcc005b43c67e] mm/execmem, 
> arch: convert remaining overrides of module_alloc to execmem
> 
> Maybe MIPS also needs a ARCH_WANTS_EXECMEM_LATE?

I don't think so. It rather seems there's a bug in the initialization of
the defaults in execmem. This should fix it:

diff --git a/mm/execmem.c b/mm/execmem.c
index f6dc3fabc1ca..0c4b36bc6d10 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -118,7 +118,6 @@ static void __init __execmem_init(void)
info->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
info->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
info->ranges[EXECMEM_DEFAULT].alignment = 1;
-   return;
}
 
if (!execmem_validate(info))
 
> Best regards,
> Liviu

-- 
Sincerely yours,
Mike.