Re: [PATCH V3 1/3] vhost-vdpa: flush workers on suspend

2024-05-21 Thread Jason Wang
On Tue, May 21, 2024 at 9:39 PM Steven Sistare
 wrote:
>
> On 5/20/2024 10:28 PM, Jason Wang wrote:
> > On Mon, May 20, 2024 at 11:21 PM Steve Sistare
> >  wrote:
> >>
> >> Flush to guarantee no workers are running when suspend returns.
> >>
> >> Fixes: f345a0143b4d ("vhost-vdpa: uAPI to suspend the device")
> >> Signed-off-by: Steve Sistare 
> >> Acked-by: Eugenio Pérez 
> >> ---
> >>   drivers/vhost/vdpa.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >> index ba52d128aeb7..189596caaec9 100644
> >> --- a/drivers/vhost/vdpa.c
> >> +++ b/drivers/vhost/vdpa.c
> >> @@ -594,6 +594,7 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
> >>  struct vdpa_device *vdpa = v->vdpa;
> >>  const struct vdpa_config_ops *ops = vdpa->config;
> >>  int ret;
> >> +   struct vhost_dev *vdev = &v->vdev;
> >>
> >>  if (!(ops->get_status(vdpa) & VIRTIO_CONFIG_S_DRIVER_OK))
> >>  return 0;
> >> @@ -601,6 +602,8 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
> >>  if (!ops->suspend)
> >>  return -EOPNOTSUPP;
> >>
> >> +   vhost_dev_flush(vdev);
> >
> > vhost-vDPA doesn't use workers, see:
> >
> >  vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
> > vhost_vdpa_process_iotlb_msg);
> >
> > So I wonder if this is a must.
>
> True, but I am adding this to be future proof.  I could instead log a warning
> or an error message if vhost_vdpa_suspend is called and 
> v->vdev.use_worker=true,
> but IMO we should just fix it, given that the fix is trivial.

I meant we need to know if it fixes any actual issue or not.

Thanks

>
> - Steve
>
>
>




Re: [PATCH V3 2/3] vduse: suspend

2024-05-21 Thread Jason Wang
On Tue, May 21, 2024 at 9:39 PM Steven Sistare
 wrote:
>
> On 5/20/2024 10:30 PM, Jason Wang wrote:
> > On Mon, May 20, 2024 at 11:21 PM Steve Sistare
> >  wrote:
> >>
> >> Support the suspend operation.  There is little to do, except flush to
> >> guarantee no workers are running when suspend returns.
> >>
> >> Signed-off-by: Steve Sistare 
> >> ---
> >>   drivers/vdpa/vdpa_user/vduse_dev.c | 24 
> >>   1 file changed, 24 insertions(+)
> >>
> >> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> >> b/drivers/vdpa/vdpa_user/vduse_dev.c
> >> index 73c89701fc9d..7dc46f771f12 100644
> >> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> >> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> >> @@ -472,6 +472,18 @@ static void vduse_dev_reset(struct vduse_dev *dev)
> >>  up_write(&dev->rwsem);
> >>   }
> >>
> >> +static void vduse_flush_work(struct vduse_dev *dev)
> >> +{
> >> +   flush_work(&dev->inject);
> >> +
> >> +   for (int i = 0; i < dev->vq_num; i++) {
> >> +   struct vduse_virtqueue *vq = dev->vqs[i];
> >> +
> >> +   flush_work(&vq->inject);
> >> +   flush_work(&vq->kick);
> >> +   }
> >> +}
> >> +
> >>   static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx,
> >>  u64 desc_area, u64 driver_area,
> >>  u64 device_area)
> >> @@ -724,6 +736,17 @@ static int vduse_vdpa_reset(struct vdpa_device *vdpa)
> >>  return ret;
> >>   }
> >>
> >> +static int vduse_vdpa_suspend(struct vdpa_device *vdpa)
> >> +{
> >> +   struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> >> +
> >> +   down_write(&dev->rwsem);
> >> +   vduse_flush_work(dev);
> >> +   up_write(&dev->rwsem);
> >
> > Can this forbid the new work to be scheduled?
>
> Are you suggesting I return an error below if the dev is suspended?
> I can do that.

I mean the irq injection work can still be scheduled after vduse_vdpa_suspend().

>
> However, I now suspect this implementation of vduse_vdpa_suspend is not
> complete in other ways, so I withdraw this patch pending future work.
> Thanks for looking at it.

Ok.

Thanks

>
> - Steve
>
> > static int vduse_dev_queue_irq_work(struct vduse_dev *dev,
> >  struct work_struct *irq_work,
> >  int irq_effective_cpu)
> > {
> >  int ret = -EINVAL;
> >
> >  down_read(&dev->rwsem);
> >  if (!(dev->status & VIRTIO_CONFIG_S_DRIVER_OK))
> >  goto unlock;
> >
> >  ret = 0;
> >  if (irq_effective_cpu == IRQ_UNBOUND)
> >  queue_work(vduse_irq_wq, irq_work);
> >  else
> >  queue_work_on(irq_effective_cpu,
> >vduse_irq_bound_wq, irq_work);
> > unlock:
> >  up_read(&dev->rwsem);
> >
> >  return ret;
> > }
> >
> > Thanks
> >
> >> +
> >> +   return 0;
> >> +}
> >> +
> >>   static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
> >>   {
> >>  struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> >> @@ -806,6 +829,7 @@ static const struct vdpa_config_ops 
> >> vduse_vdpa_config_ops = {
> >>  .set_vq_affinity= vduse_vdpa_set_vq_affinity,
> >>  .get_vq_affinity= vduse_vdpa_get_vq_affinity,
> >>  .reset  = vduse_vdpa_reset,
> >> +   .suspend= vduse_vdpa_suspend,
> >>  .set_map= vduse_vdpa_set_map,
> >>  .free   = vduse_vdpa_free,
> >>   };
> >> --
> >> 2.39.3
> >>
> >
>




Re: [PATCH V3 3/3] vdpa_sim: flush workers on suspend

2024-05-21 Thread Jason Wang
On Tue, May 21, 2024 at 9:39 PM Steven Sistare
 wrote:
>
> On 5/20/2024 10:32 PM, Jason Wang wrote:
> > On Mon, May 20, 2024 at 11:21 PM Steve Sistare
> >  wrote:
> >>
> >> Flush to guarantee no workers are running when suspend returns.
> >> Add a lock to enforce ordering between clearing running, flushing,
> >> and posting new work in vdpasim_kick_vq.  It must be a spin lock
> >> because vdpasim_kick_vq may be reached via eventfd_write.
> >>
> >> Signed-off-by: Steve Sistare 
> >> ---
> >>   drivers/vdpa/vdpa_sim/vdpa_sim.c | 16 ++--
> >>   drivers/vdpa/vdpa_sim/vdpa_sim.h |  1 +
> >>   2 files changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c 
> >> b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> index 8ffea8430f95..67ed49d95bf0 100644
> >> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> @@ -322,7 +322,7 @@ static u16 vdpasim_get_vq_size(struct vdpa_device 
> >> *vdpa, u16 idx)
> >>  return VDPASIM_QUEUE_MAX;
> >>   }
> >>
> >> -static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx)
> >> +static void vdpasim_do_kick_vq(struct vdpa_device *vdpa, u16 idx)
> >>   {
> >>  struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> >>  struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
> >> @@ -337,6 +337,15 @@ static void vdpasim_kick_vq(struct vdpa_device *vdpa, 
> >> u16 idx)
> >>  vdpasim_schedule_work(vdpasim);
> >>   }
> >>
> >> +static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx)
> >> +{
> >> +   struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> >> +
> >> +   spin_lock(&vdpasim->kick_lock);
> >> +   vdpasim_do_kick_vq(vdpa, idx);
> >> +   spin_unlock(&vdpasim->kick_lock);
> >> +}
> >> +
> >>   static void vdpasim_set_vq_cb(struct vdpa_device *vdpa, u16 idx,
> >>struct vdpa_callback *cb)
> >>   {
> >> @@ -520,8 +529,11 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >>  struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> >>
> >>  mutex_lock(&vdpasim->mutex);
> >> +   spin_lock(&vdpasim->kick_lock);
> >>  vdpasim->running = false;
> >> +   spin_unlock(&vdpasim->kick_lock);
> >>  mutex_unlock(&vdpasim->mutex);
> >> +   kthread_flush_work(&vdpasim->work);
> >>
> >>  return 0;
> >>   }
> >> @@ -537,7 +549,7 @@ static int vdpasim_resume(struct vdpa_device *vdpa)
> >>  if (vdpasim->pending_kick) {
> >>  /* Process pending descriptors */
> >>  for (i = 0; i < vdpasim->dev_attr.nvqs; ++i)
> >> -   vdpasim_kick_vq(vdpa, i);
> >> +   vdpasim_do_kick_vq(vdpa, i);
> >>
> >>  vdpasim->pending_kick = false;
> >>  }
> >> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h 
> >> b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> >> index bb137e479763..5eb6ca9c5ec5 100644
> >> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
> >> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
> >> @@ -75,6 +75,7 @@ struct vdpasim {
> >>  bool pending_kick;
> >>  /* spinlock to synchronize iommu table */
> >>  spinlock_t iommu_lock;
> >> +   spinlock_t kick_lock;
> >
> > It looks to me this is not initialized?
>
> Yup, I lost that line while fiddling with different locking schemes.
> Thanks, will fix in V4.
>
> @@ -236,6 +236,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr
> *dev_attr,
>
>  mutex_init(&vdpasim->mutex);
>  spin_lock_init(&vdpasim->iommu_lock);
> +   spin_lock_init(&vdpasim->kick_lock);
>
> With that fix, does this patch earn your RB?

Yes.

Thanks

>
> - Steve
>
> >>   };
> >>
> >>   struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *attr,
> >> --
> >> 2.39.3
> >>
> >
>




Re: [PATCH v3 2/2] LoongArch: Add steal time support in guest side

2024-05-21 Thread kernel test robot
Hi Bibo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 3c999d1ae3c75991902a1a7dad0cb62c2a3008b4]

url:
https://github.com/intel-lab-lkp/linux/commits/Bibo-Mao/LoongArch-KVM-Add-steal-time-support-in-kvm-side/20240521-104902
base:   3c999d1ae3c75991902a1a7dad0cb62c2a3008b4
patch link:
https://lore.kernel.org/r/20240521024556.419436-3-maobibo%40loongson.cn
patch subject: [PATCH v3 2/2] LoongArch: Add steal time support in guest side
config: loongarch-kismet-CONFIG_PARAVIRT-CONFIG_PARAVIRT_TIME_ACCOUNTING-0-0 
(https://download.01.org/0day-ci/archive/20240522/202405221028.qrcedmnq-...@intel.com/config)
reproduce: 
(https://download.01.org/0day-ci/archive/20240522/202405221028.qrcedmnq-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405221028.qrcedmnq-...@intel.com/

kismet warnings: (new ones prefixed by >>)
>> kismet: WARNING: unmet direct dependencies detected for PARAVIRT when 
>> selected by PARAVIRT_TIME_ACCOUNTING
   

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



[PATCH v2 4/4] selftests/bpf: add test validating uprobe/uretprobe stack traces

2024-05-21 Thread Andrii Nakryiko
Add a set of tests to validate that stack traces captured from or in the
presence of active uprobes and uretprobes are valid and complete.

For this we use BPF programs that are installed either on entry or exit
of user function, plus deep-nested USDT. One of target functions
(target_1) is recursive to generate two different entries in the stack
trace for the same uprobe/uretprobe, testing potential edge conditions.

Without fixes in this patch set, we get something like this for one of
the scenarios:

 caller: 0x758fff - 0x7595ab
 target_1: 0x758fd5 - 0x758fff
 target_2: 0x758fca - 0x758fd5
 target_3: 0x758fbf - 0x758fca
 target_4: 0x758fb3 - 0x758fbf
 ENTRY #0: 0x758fb3 (in target_4)
 ENTRY #1: 0x758fd3 (in target_2)
 ENTRY #2: 0x758ffd (in target_1)
 ENTRY #3: 0x7fffe000
 ENTRY #4: 0x7fffe000
 ENTRY #5: 0x6f8f39
 ENTRY #6: 0x6fa6f0
 ENTRY #7: 0x7f403f229590

Entry #3 and #4 (0x7fffe000) are uretprobe trampoline addresses
which obscure actual target_1 and another target_1 invocations. Also
note that between entry #0 and entry #1 we are missing an entry for
target_3, which is fixed in patch #2.

With all the fixes, we get desired full stack traces:

 caller: 0x758fff - 0x7595ab
 target_1: 0x758fd5 - 0x758fff
 target_2: 0x758fca - 0x758fd5
 target_3: 0x758fbf - 0x758fca
 target_4: 0x758fb3 - 0x758fbf
 ENTRY #0: 0x758fb7 (in target_4)
 ENTRY #1: 0x758fc8 (in target_3)
 ENTRY #2: 0x758fd3 (in target_2)
 ENTRY #3: 0x758ffd (in target_1)
 ENTRY #4: 0x758ff3 (in target_1)
 ENTRY #5: 0x75922c (in caller)
 ENTRY #6: 0x6f8f39
 ENTRY #7: 0x6fa6f0
 ENTRY #8: 0x7f986adc4cd0

Now there is a logical and complete sequence of function calls.
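
To make the annotated traces above easier to follow, here is a minimal
user-space sketch of how an address can be mapped back to the function
range that contains it. The struct and function names (addr_range,
find_range, example_ranges) are hypothetical and not part of the patch;
the ranges are copied from the example output, and treating each range
as half-open [start, stop) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type; not taken from the selftest itself. */
struct addr_range {
	const char *name;
	uint64_t start;	/* first address of the function */
	uint64_t stop;	/* one past the last address (assumed half-open) */
};

/* Address ranges copied from the example output in the commit message. */
const struct addr_range example_ranges[] = {
	{ "caller",   0x758fff, 0x7595ab },
	{ "target_1", 0x758fd5, 0x758fff },
	{ "target_2", 0x758fca, 0x758fd5 },
	{ "target_3", 0x758fbf, 0x758fca },
	{ "target_4", 0x758fb3, 0x758fbf },
};

/* Return the range containing ip, or NULL if no range contains it. */
const struct addr_range *find_range(const struct addr_range *ranges,
				    size_t n, uint64_t ip)
{
	size_t i;

	assert(ranges != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (ip >= ranges[i].start && ip < ranges[i].stop)
			return &ranges[i];
	}
	return NULL;
}
```

With these ranges, 0x758fb7 resolves to target_4 and 0x75922c to caller,
while a trampoline address such as 0x7fffe000 matches no range at all.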

Signed-off-by: Andrii Nakryiko 
---
 .../bpf/prog_tests/uretprobe_stack.c  | 186 ++
 .../selftests/bpf/progs/uretprobe_stack.c |  96 +
 2 files changed, 282 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
 create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c

diff --git a/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c 
b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
new file mode 100644
index ..6deb8d560ddd
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include 
+#include "uretprobe_stack.skel.h"
+#include "../sdt.h"
+
+/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
+ * call chain, each being traced by our BPF program. On entry or return from
+ * each target_*() we are capturing user stack trace and recording it in
+ * global variable, so that user space part of the test can validate it.
+ *
+ * Note, we put each target function into a custom section to get those
+ * __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
+ * to know address range of those functions
+ */
+__attribute__((section("uprobe__target_4")))
+__weak int target_4(void)
+{
+   STAP_PROBE1(uretprobe_stack, target, 42);
+   return 42;
+}
+
+extern const void *__start_uprobe__target_4;
+extern const void *__stop_uprobe__target_4;
+
+__attribute__((section("uprobe__target_3")))
+__weak int target_3(void)
+{
+   return target_4();
+}
+
+extern const void *__start_uprobe__target_3;
+extern const void *__stop_uprobe__target_3;
+
+__attribute__((section("uprobe__target_2")))
+__weak int target_2(void)
+{
+   return target_3();
+}
+
+extern const void *__start_uprobe__target_2;
+extern const void *__stop_uprobe__target_2;
+
+__attribute__((section("uprobe__target_1")))
+__weak int target_1(int depth)
+{
+   if (depth < 1)
+   return 1 + target_1(depth + 1);
+   else
+   return target_2();
+}
+
+extern const void *__start_uprobe__target_1;
+extern const void *__stop_uprobe__target_1;
+
+extern const void *__start_uretprobe_stack_sec;
+extern const void *__stop_uretprobe_stack_sec;
+
+struct range {
+   long start;
+   long stop;
+};
+
+static struct range targets[] = {
+   {}, /* we want target_1 to map to target[1], so need 1-based indexing */
+   { (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
+   { (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
+   { (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
+   { (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
+};
+
+static struct range caller = {
+   (long)&__start_uretprobe_stack_sec,
+   (long)&__stop_uretprobe_stack_sec,
+};
+
+static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
+{
+   int i, j;
+   va_list args;
+
+   if (!ASSERT_GT(stack_len, 0, "stack_len"))
+   return;
+
+   stack_len /= 8;
+
+   /* check if we have enough entries to satisfy test expectations */
+   if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
+

[PATCH v2 3/4] perf,x86: avoid missing caller address in stack traces captured in uprobe

2024-05-21 Thread Andrii Nakryiko
When tracing user functions with uprobe functionality, it's common to
install the probe (e.g., a BPF program) at the first instruction of the
function. This is often going to be `push %rbp` instruction in function
preamble, which means that within that function frame pointer hasn't
been established yet. This leads to consistently missing an actual
caller of the traced function, because perf_callchain_user() only
records current IP (capturing traced function) and then following frame
pointer chain (which would be caller's frame, containing the address of
caller's caller).

So when we have target_1 -> target_2 -> target_3 call chain and we are
tracing an entry to target_3, captured stack trace will report
target_1 -> target_3 call chain, which is wrong and confusing.

This patch proposes a x86-64-specific heuristic to detect `push %rbp`
instruction being traced. If that's the case, with the assumption that
application is compiled with frame pointers, this instruction would be
a strong indicator that this is the entry to the function. In that case,
return address is still pointed to by %rsp, so we fetch it and add to
stack trace before proceeding to unwind the rest using frame
pointer-based logic.

Signed-off-by: Andrii Nakryiko 
---
 arch/x86/events/core.c  | 20 
 include/linux/uprobes.h |  2 ++
 kernel/events/uprobes.c |  2 ++
 3 files changed, 24 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5b0dd07b1ef1..82d5570b58ff 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2884,6 +2884,26 @@ perf_callchain_user(struct perf_callchain_entry_ctx 
*entry, struct pt_regs *regs
return;
 
pagefault_disable();
+
+#ifdef CONFIG_UPROBES
+   /*
+* If we are called from uprobe handler, and we are indeed at the very
+* entry to user function (which is normally a `push %rbp` instruction,
+* under assumption of application being compiled with frame pointers),
+* we should read return address from *regs->sp before proceeding
+* to follow frame pointers, otherwise we'll skip immediate caller
+* as %rbp is not yet setup.
+*/
+   if (current->utask) {
+   struct arch_uprobe *auprobe = current->utask->auprobe;
+   u64 ret_addr;
+
+   if (auprobe && auprobe->insn[0] == 0x55 /* push %rbp */ &&
+   !__get_user(ret_addr, (const u64 __user *)regs->sp))
+   perf_callchain_store(entry, ret_addr);
+   }
+#endif
+
while (entry->nr < entry->max_stack) {
if (!valid_user_frame(fp, sizeof(frame)))
break;
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 0c57eec85339..7b785cd30d86 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -76,6 +76,8 @@ struct uprobe_task {
struct uprobe   *active_uprobe;
unsigned long   xol_vaddr;
 
+   struct arch_uprobe  *auprobe;
+
struct return_instance  *return_instances;
unsigned intdepth;
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 1c99380dc89d..504693845187 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2072,6 +2072,7 @@ static void handler_chain(struct uprobe *uprobe, struct 
pt_regs *regs)
bool need_prep = false; /* prepare return uprobe, when needed */
 
down_read(&uprobe->register_rwsem);
+   current->utask->auprobe = &uprobe->arch;
for (uc = uprobe->consumers; uc; uc = uc->next) {
int rc = 0;
 
@@ -2086,6 +2087,7 @@ static void handler_chain(struct uprobe *uprobe, struct 
pt_regs *regs)
 
remove &= rc;
}
+   current->utask->auprobe = NULL;
 
if (need_prep && !remove)
prepare_uretprobe(uprobe, regs); /* put bp at return */
-- 
2.43.0




[PATCH v2 2/4] perf,uprobes: fix user stack traces in the presence of pending uretprobes

2024-05-21 Thread Andrii Nakryiko
When kernel has pending uretprobes installed, it hijacks original user
function return address on the stack with a uretprobe trampoline
address. There could be multiple such pending uretprobes (either on
different user functions or on the same recursive one) at any given
time within the same task.

This approach interferes with the user stack trace capture logic, which
would report surprising addresses (like 0x7fffe000) that correspond
to a special "[uprobes]" section that kernel installs in the target
process address space for uretprobe trampoline code, while logically it
should be an address somewhere within the calling function of another
traced user function.

This is easy to correct for, though. Uprobes subsystem keeps track of
pending uretprobes and records original return addresses. This patch is
using this to do a post-processing step and restore each trampoline
address entries with correct original return address. This is done only
if there are pending uretprobes for current task.

This is a similar approach to what fprobe/kretprobe infrastructure is
doing when capturing kernel stack traces in the presence of pending
return probes.

Reported-by: Riham Selim 
Signed-off-by: Andrii Nakryiko 
---
 kernel/events/callchain.c | 43 ++-
 kernel/events/uprobes.c   |  9 
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 1273be84392c..b17e3323f7f6 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "internal.h"
 
@@ -176,13 +177,51 @@ put_callchain_entry(int rctx)
put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
 }
 
+static void fixup_uretprobe_trampoline_entries(struct perf_callchain_entry 
*entry,
+  int start_entry_idx)
+{
+#ifdef CONFIG_UPROBES
+   struct uprobe_task *utask = current->utask;
+   struct return_instance *ri;
+   __u64 *cur_ip, *last_ip, tramp_addr;
+
+   if (likely(!utask || !utask->return_instances))
+   return;
+
+   cur_ip = &entry->ip[start_entry_idx];
+   last_ip = &entry->ip[entry->nr - 1];
+   ri = utask->return_instances;
+   tramp_addr = uprobe_get_trampoline_vaddr();
+
+   /*
+* If there are pending uretprobes for the current thread, they are
+* recorded in a list inside utask->return_instances; each such
+* pending uretprobe replaces traced user function's return address on
+* the stack, so when stack trace is captured, instead of seeing
+* actual function's return address, we'll have one or many uretprobe
+* trampoline addresses in the stack trace, which are not helpful and
+* misleading to users.
+* So here we go over the pending list of uretprobes, and each
+* encountered trampoline address is replaced with actual return
+* address.
+*/
+   while (ri && cur_ip <= last_ip) {
+   if (*cur_ip == tramp_addr) {
+   *cur_ip = ri->orig_ret_vaddr;
+   ri = ri->next;
+   }
+   cur_ip++;
+   }
+#endif
+}
+
 struct perf_callchain_entry *
 get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
   u32 max_stack, bool crosstask, bool add_mark)
 {
struct perf_callchain_entry *entry;
struct perf_callchain_entry_ctx ctx;
-   int rctx;
+   int rctx, start_entry_idx;
 
entry = get_callchain_entry();
if (!entry)
@@ -215,7 +254,9 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool 
kernel, bool user,
if (add_mark)
perf_callchain_store_context(&ctx, 
PERF_CONTEXT_USER);
 
+   start_entry_idx = entry->nr;
perf_callchain_user(&ctx, regs);
+   fixup_uretprobe_trampoline_entries(entry, 
start_entry_idx);
}
}
 
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index d60d24f0f2f4..1c99380dc89d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2149,6 +2149,15 @@ static void handle_trampoline(struct pt_regs *regs)
 
instruction_pointer_set(regs, ri->orig_ret_vaddr);
do {
+   /* pop current instance from the stack of pending 
return instances,
+* as it's not pending anymore: we just fixed up 
original
+* instruction pointer in regs and are about to call 
handlers;
+* this allows fixup_uretprobe_trampoline_entries() to 
properly fix up
+* captured stack traces from uretprobe handlers, in 
which pending
+* trampoline addresses on the stack are replaced with 
correct
+* original 

[PATCH v2 1/4] uprobes: rename get_trampoline_vaddr() and make it global

2024-05-21 Thread Andrii Nakryiko
This helper is needed in another file, so make it a bit more uniquely
named and expose it internally.

Signed-off-by: Andrii Nakryiko 
---
 include/linux/uprobes.h | 1 +
 kernel/events/uprobes.c | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index f46e0ca0169c..0c57eec85339 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -138,6 +138,7 @@ extern bool arch_uretprobe_is_alive(struct return_instance 
*ret, enum rp_check c
 extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
 void *src, unsigned long len);
+extern unsigned long uprobe_get_trampoline_vaddr(void);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 8ae0eefc3a34..d60d24f0f2f4 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1827,7 +1827,7 @@ void uprobe_copy_process(struct task_struct *t, unsigned 
long flags)
  *
  * Returns -1 in case the xol_area is not allocated.
  */
-static unsigned long get_trampoline_vaddr(void)
+unsigned long uprobe_get_trampoline_vaddr(void)
 {
struct xol_area *area;
unsigned long trampoline_vaddr = -1;
@@ -1878,7 +1878,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, 
struct pt_regs *regs)
if (!ri)
return;
 
-   trampoline_vaddr = get_trampoline_vaddr();
+   trampoline_vaddr = uprobe_get_trampoline_vaddr();
orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, 
regs);
if (orig_ret_vaddr == -1)
goto fail;
@@ -2187,7 +2187,7 @@ static void handle_swbp(struct pt_regs *regs)
int is_swbp;
 
bp_vaddr = uprobe_get_swbp_addr(regs);
-   if (bp_vaddr == get_trampoline_vaddr())
+   if (bp_vaddr == uprobe_get_trampoline_vaddr())
return handle_trampoline(regs);
 
uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
-- 
2.43.0




[PATCH v2 0/4] Fix user stack traces captured from uprobes

2024-05-21 Thread Andrii Nakryiko
This patch set reports two issues with captured stack traces.

First issue, fixed in patch #2, deals with fixing up uretprobe trampoline
addresses in captured stack trace. This issue happens when there are pending
return probes, for which kernel hijacks some of the return addresses on user
stacks. The code is matching those special uretprobe trampoline addresses with
the list of pending return probe instances and replaces them with actual
return addresses. This is the same fixup logic that fprobe/kretprobe has for
kernel stack traces.
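
The fixup described above can be modelled in user space roughly as
follows. This is only a sketch of the idea, not the kernel code:
struct ret_instance is a simplified stand-in for the kernel's
struct return_instance, and fixup_trampoline_entries is a hypothetical
name. It assumes the pending list is ordered from the most recent
(innermost) return probe to the oldest, matching the order in which
trampoline addresses appear in the captured trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct return_instance. */
struct ret_instance {
	uint64_t orig_ret_vaddr;	/* return address the trampoline replaced */
	struct ret_instance *next;	/* next older pending return probe */
};

/*
 * Walk the captured trace once. Every entry equal to the trampoline
 * address is replaced with the original return address of the next
 * pending return probe, consuming the list in order.
 */
void fixup_trampoline_entries(uint64_t *ips, size_t nr,
			      const struct ret_instance *ri,
			      uint64_t tramp_addr)
{
	size_t i;

	assert(ips != NULL || nr == 0);

	for (i = 0; i < nr && ri != NULL; i++) {
		if (ips[i] == tramp_addr) {
			ips[i] = ri->orig_ret_vaddr;
			ri = ri->next;
		}
	}
}
```

Entries that are not trampoline addresses are left untouched, and the
walk stops early once the pending list is exhausted.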

Second issue, which patch #3 is fixing with the help of heuristic, is having
to do with capturing user stack traces in entry uprobes. At the very entrance
to user function, frame pointer in rbp register is not yet setup, so actual
caller return address is still pointed to by rsp. Patch is using a simple
heuristic, looking for `push %rbp` instruction, to fetch this extra direct
caller return address, before proceeding to unwind the stack using rbp.
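
As a rough user-space illustration of this heuristic (not the kernel
implementation): if the probed instruction's first byte is 0x55, the
one-byte encoding of `push %rbp`, the word at the top of the stack is
taken to be the caller's return address. The function name
entry_caller_addr and its parameters are hypothetical, and the stack is
modelled as a plain array rather than being read from user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_PUSH_RBP 0x55	/* one-byte opcode for `push %rbp` */

/*
 * If the probed instruction is `push %rbp`, the frame pointer is not
 * set up yet and the return address is still at the top of the stack.
 * Store it in *ret_addr and return true; otherwise return false and
 * leave *ret_addr untouched.
 */
bool entry_caller_addr(uint8_t first_insn_byte, const uint64_t *stack_top,
		       uint64_t *ret_addr)
{
	assert(ret_addr != NULL);

	if (first_insn_byte != X86_PUSH_RBP || stack_top == NULL)
		return false;

	*ret_addr = *stack_top;
	return true;
}
```

When the function returns true, the caller would record *ret_addr in
the trace before unwinding the remaining frames through the frame
pointer chain.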

Patch #4 adds tests to BPF selftests that validate that captured stack
traces at various points are what we expect to get. This patch, while being BPF
selftests, is isolated from any other BPF selftests changes and can go in
through non-BPF tree without the risk of merge conflicts.

Patches are based on latest linux-trace/probes/for-next.

v1->v2:
  - fixed GCC aggressively inlining test_uretprobe_stack() function (BPF CI);
  - fixed comments (Peter).

Andrii Nakryiko (4):
  uprobes: rename get_trampoline_vaddr() and make it global
  perf,uprobes: fix user stack traces in the presence of pending
uretprobes
  perf,x86: avoid missing caller address in stack traces captured in
uprobe
  selftests/bpf: add test validating uprobe/uretprobe stack traces

 arch/x86/events/core.c|  20 ++
 include/linux/uprobes.h   |   3 +
 kernel/events/callchain.c |  43 +++-
 kernel/events/uprobes.c   |  17 +-
 .../bpf/prog_tests/uretprobe_stack.c  | 186 ++
 .../selftests/bpf/progs/uretprobe_stack.c |  96 +
 6 files changed, 361 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_stack.c
 create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_stack.c

-- 
2.43.0




Re: [PATCHv6 bpf-next 0/9] uprobe: uretprobe speed up

2024-05-21 Thread Alexei Starovoitov
On Tue, May 21, 2024 at 1:49 PM Deepak Gupta  wrote:
>
> On Tue, May 21, 2024 at 12:48:16PM +0200, Jiri Olsa wrote:
> >hi,
> >as part of the effort on speeding up the uprobes [0] coming with
> >return uprobe optimization by using syscall instead of the trap
> >on the uretprobe trampoline.
>
> I understand this provides an optimization on x86. I believe primary reason
> is syscall is straight-line microcode and short sequence while trap delivery
> still does all the GDT / IDT and segmentation checks and it makes delivery
> of the trap slow.
>
> So doing syscall improves that. Although it seems x86 is going to get rid of
> that as part of FRED [1, 2]. And linux kernel support for FRED is already 
> upstream [2].
> So I am imagining x86 hardware already exists with FRED support.
>
> On other architectures, I believe trap delivery for breakpoint instruction
> is same as syscall instruction.
>
> Given that x86 trap delivery is pretty much going following the suit here and
> intend to make trap delivery cost similar to syscall delivery.
>
> Sorry for being buzzkill here but ...
> Is it worth introducing this syscall which otherwise has no use on other 
> arches
> and x86 (and x86 kernel) has already taken steps to match trap delivery 
> latency with
> syscall latency would have similar cost?
>
> Did you do any study of this on FRED enabled x86 CPUs?

afaik CPUs with FRED do not exist on the market and it's
not clear when they will be available.
And when they finally will be on the shelves
the overhead of FRED vs int3 would still have to be measured.
int3 with FRED might still be higher than syscall with FRED.

>
> [1] - 
> https://www.intel.com/content/www/us/en/content-details/780121/flexible-return-and-event-delivery-fred-specification.html
> [2] - https://docs.kernel.org/arch/x86/x86_64/fred.html
>
> >
> >The speed up depends on instruction type that uprobe is installed
> >and depends on specific HW type, please check patch 1 for details.
> >



Re: [PATCHv6 9/9] man2: Add uretprobe syscall page

2024-05-21 Thread Alejandro Colomar
Hi Jirka,

On Tue, May 21, 2024 at 10:24:30PM GMT, Jiri Olsa wrote:
> how about the change below?

Much better.  I still have a few comments below.  :-)

> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/man/man2/uretprobe.2 b/man/man2/uretprobe.2
> new file mode 100644
> index ..959b7a47102b
> --- /dev/null
> +++ b/man/man2/uretprobe.2
> @@ -0,0 +1,55 @@
> +.\" Copyright (C) 2024, Jiri Olsa 
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +uretprobe \- execute pending return uprobes
> +.SH SYNOPSIS
> +.nf
> +.B int uretprobe(void)
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR uretprobe ()
> +system call is an alternative to breakpoint instructions for triggering 
> return
> +uprobe consumers.
> +.P
> +Calls to
> +.BR uretprobe ()
> +system call are only made from the user-space trampoline provided by the 
> kernel.
> +Calls from any other place result in a
> +.BR SIGILL .
> +.SH RETURN VALUE
> +The
> +.BR uretprobe ()
> +system call return value is architecture-specific.
> +.SH ERRORS
> +.BR SIGILL

This should be a tagged paragraph, preceded with '.TP'.  See any manual
page with an ERRORS section for an example.

Also, BR is Bold alternating with Roman, but this is just bold, so it
should use '.B'.

.TP
.B SIGILL

> +The
> +.BR uretprobe ()
> +system call was called by user.
> +.SH VERSIONS
> +Details of the
> +.BR uretprobe ()
> +system call behavior vary across systems.
> +.SH STANDARDS
> +None.
> +.SH HISTORY
> +TBD
> +.SH NOTES
> +The
> +.BR uretprobe ()
> +system call was initially introduced for the x86_64 architecture where it 
> was shown

We have a strong-ish limit at column 80.  Please break after
'architecture', which is a clause boundary.

Have a lovely night!
Alex

> +to be faster than breakpoint traps.
> +It might be extended to other architectures.
> +.P
> +The
> +.BR uretprobe ()
> +system call exists only to allow the invocation of return uprobe consumers.
> +It should
> +.B never
> +be called directly.
> +Details of the arguments (if any) passed to
> +.BR uretprobe ()
> +and the return value are architecture-specific.
> 

-- 





Re: [PATCHv6 bpf-next 0/9] uprobe: uretprobe speed up

2024-05-21 Thread Deepak Gupta

On Tue, May 21, 2024 at 12:48:16PM +0200, Jiri Olsa wrote:

hi,
as part of the effort on speeding up the uprobes [0] coming with
return uprobe optimization by using syscall instead of the trap
on the uretprobe trampoline.


I understand this provides an optimization on x86. I believe the primary reason
is that syscall is a short, straight-line microcode sequence, while trap delivery
still does all the GDT / IDT and segmentation checks, which makes delivery
of the trap slow.

So doing a syscall improves that. However, it seems x86 is going to get rid of
that overhead as part of FRED [1, 2]. And Linux kernel support for FRED is already upstream [2].

So I imagine x86 hardware with FRED support already exists.

On other architectures, I believe trap delivery for a breakpoint instruction
costs the same as for a syscall instruction.

x86 trap delivery is pretty much following suit here, with the intent of
making trap delivery cost similar to syscall delivery.

Sorry for being a buzzkill here but ...
Is it worth introducing this syscall, which has no use on other arches,
when x86 (and the x86 kernel) has already taken steps to bring trap delivery
latency in line with syscall latency?

Did you do any study of this on FRED enabled x86 CPUs?

[1] - https://www.intel.com/content/www/us/en/content-details/780121/flexible-return-and-event-delivery-fred-specification.html
[2] - https://docs.kernel.org/arch/x86/x86_64/fred.html



> The speed up depends on instruction type that uprobe is installed
> and depends on specific HW type, please check patch 1 for details.





Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-21 Thread Luca Weiss
On Tuesday, 21 May 2024 10:58:07 CEST Krzysztof Kozlowski wrote:
> On 20/05/2024 17:11, Luca Weiss wrote:
> > Hi Krzysztof
> > 
> > Ack, sounds good.
> > 
> > Maybe also from you, any opinion between these two binding styles?
> > 
> > So first using index of mboxes for the numbering, where for the known
> > usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
> > 
> > The second variant is using mbox-names to get the correct channel-mbox
> > mapping.
> > 
> > -   qcom,ipc-1 = < 8 13>;
> > -   qcom,ipc-2 = < 8 9>;
> > -   qcom,ipc-3 = < 8 19>;
> > +   mboxes = <0>, < 13>, < 9>, < 19>;
> > 
> > vs.
> > 
> > -   qcom,ipc-1 = < 8 13>;
> > -   qcom,ipc-2 = < 8 9>;
> > -   qcom,ipc-3 = < 8 19>;
> > +   mboxes = < 13>, < 9>, < 19>;
> > +   mbox-names = "ipc-1", "ipc-2", "ipc-3";
> 
> Sorry, don't get, ipc-1 is the first mailbox, so why would there be <0>
> in first case?

Actually not, ipc-0 would be permissible by the driver, used for the 0th host

e.g. from:

/* Iterate over all hosts to check whom wants a kick */
for (host = 0; host < smsm->num_hosts; host++) {
hostp = &smsm->hosts[host];

Even though no mailbox is specified in any upstream dts for this 0th host I
didn't want the bindings to restrict that, that's why in the first example
there's an empty element (<0>) for the 0th smsm host

> Anyway, the question is if you need to know that some
> mailbox is missing. But then it is weird to name them "ipc-1" etc.

In either case we'd just query the mbox (either by name or index) and then
see if it's there? Not quite sure I understand the sentence..
Pretty sure either binding would work the same way.
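
To make the comparison concrete, here is a minimal userspace sketch of the two lookup styles. It does not use the kernel mailbox API: `struct fake_mbox`, `lookup_by_index()` and `lookup_by_name()` are hypothetical names invented for illustration, and the channel numbers are simply the ones from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one mailbox specifier; -1 marks an empty <0> slot. */
struct fake_mbox {
	int channel;
};

/* Style 1: the position in "mboxes" is the smsm host number, so host 0 needs a placeholder. */
static const struct fake_mbox by_index[] = { { -1 }, { 13 }, { 9 }, { 19 } };

/* Style 2: "mbox-names" carries the host number, so absent hosts are simply omitted. */
static const struct fake_mbox by_name[] = { { 13 }, { 9 }, { 19 } };
static const char *const names[] = { "ipc-1", "ipc-2", "ipc-3" };

/* Return the channel for a host, or -1 if no mailbox is described for it. */
static int lookup_by_index(unsigned int host)
{
	if (host >= sizeof(by_index) / sizeof(by_index[0]))
		return -1;
	return by_index[host].channel;
}

/* Same query, but resolved through the "ipc-N" name instead of the position. */
static int lookup_by_name(unsigned int host)
{
	char want[16];
	size_t i;

	snprintf(want, sizeof(want), "ipc-%u", host);
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strcmp(names[i], want) == 0)
			return by_name[i].channel;
	return -1;
}
```

In both styles a query for host 0 comes back empty and hosts 1 to 3 resolve to the same channels, which is the sense in which either binding would work the same way from the driver's point of view.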

Regards
Luca

> 
> Best regards,
> Krzysztof
> 
> 







Re: [PATCHv6 9/9] man2: Add uretprobe syscall page

2024-05-21 Thread Jiri Olsa
On Tue, May 21, 2024 at 01:48:59PM +0200, Jiri Olsa wrote:
> On Tue, May 21, 2024 at 01:36:25PM +0200, Alejandro Colomar wrote:
> > Hi Jiri,
> > 
> > On Tue, May 21, 2024 at 12:48:25PM GMT, Jiri Olsa wrote:
> > > Adding man page for new uretprobe syscall.
> > > 
> > > Signed-off-by: Jiri Olsa 
> > > ---
> > >  man2/uretprobe.2 | 50 
> > >  1 file changed, 50 insertions(+)
> > >  create mode 100644 man2/uretprobe.2
> > > 
> > > diff --git a/man2/uretprobe.2 b/man2/uretprobe.2
> > > new file mode 100644
> > > index ..690fe3b1a44f
> > > --- /dev/null
> > > +++ b/man2/uretprobe.2
> > > @@ -0,0 +1,50 @@
> > > +.\" Copyright (C) 2024, Jiri Olsa 
> > > +.\"
> > > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > > +.\"
> > > +.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
> > > +.SH NAME
> > > +uretprobe \- execute pending return uprobes
> > > +.SH SYNOPSIS
> > > +.nf
> > > +.B int uretprobe(void)
> > > +.fi
> > 
> > What header file provides this system call?
> 
> there's no header, it's used/called only by user space trampoline
> provided by kernel, it's not expected to be called by user
> 
> > 
> > > +.SH DESCRIPTION
> > > +The
> > > +.BR uretprobe ()
> > > +syscall is an alternative to breakpoint instructions for
> > > +triggering return uprobe consumers.
> > > +.P
> > > +Calls to
> > > +.BR uretprobe ()
> > > +suscall are only made from the user-space trampoline provided by the 
> > > kernel.
> > 
> > s/suscall/system call/
> 
> ugh leftover sry
> 
> > 
> > > +Calls from any other place result in a
> > > +.BR SIGILL .
> > 
> > Maybe add an ERRORS section?
> > 
> > > +
> > 
> > We don't use blank lines; it causes a groff(1) warning, and other
> > problems.  Instead, use '.P'.
> > 
> > > +.SH RETURN VALUE
> > > +The
> > > +.BR uretprobe ()
> > > +syscall return value is architecture-specific.
> > > +
> > 
> > .P
> > 
> > > +.SH VERSIONS
> > > +This syscall is not specified in POSIX,
> > 
> > Redundant with "STANDARDS: None.".
> > 
> > > +and details of its behavior vary across systems.
> > 
> > Keep this.
> 
> ok
> 
> > 
> > > +.SH STANDARDS
> > > +None.
> > > +.SH HISTORY
> > > +TBD
> > > +.SH NOTES
> > > +The
> > > +.BR uretprobe ()
> > > +syscall was initially introduced for the x86_64 architecture where it 
> > > was shown
> > > +to be faster than breakpoint traps. It might be extended to other 
> > > architectures.
> > 
> > Please use semantic newlines.
> > 
> > $ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p'
> >Use semantic newlines
> >  In the source of a manual page, new sentences should be started on
> >  new lines, long sentences should be split  into  lines  at  clause
> >  breaks  (commas,  semicolons, colons, and so on), and long clauses
> >  should be split at phrase boundaries.  This convention,  sometimes
> >  known as "semantic newlines", makes it easier to see the effect of
> >  patches, which often operate at the level of individual sentences,
> >  clauses, or phrases.
> 

how about the change below?

thanks,
jirka


---
diff --git a/man/man2/uretprobe.2 b/man/man2/uretprobe.2
new file mode 100644
index ..959b7a47102b
--- /dev/null
+++ b/man/man2/uretprobe.2
@@ -0,0 +1,55 @@
+.\" Copyright (C) 2024, Jiri Olsa 
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+uretprobe \- execute pending return uprobes
+.SH SYNOPSIS
+.nf
+.B int uretprobe(void)
+.fi
+.SH DESCRIPTION
+The
+.BR uretprobe ()
+system call is an alternative to breakpoint instructions for triggering return
+uprobe consumers.
+.P
+Calls to
+.BR uretprobe ()
+system call are only made from the user-space trampoline provided by the kernel.
+Calls from any other place result in a
+.BR SIGILL .
+.SH RETURN VALUE
+The
+.BR uretprobe ()
+system call return value is architecture-specific.
+.SH ERRORS
+.BR SIGILL
+The
+.BR uretprobe ()
+system call was called by user.
+.SH VERSIONS
+Details of the
+.BR uretprobe ()
+system call behavior vary across systems.
+.SH STANDARDS
+None.
+.SH HISTORY
+TBD
+.SH NOTES
+The
+.BR uretprobe ()
+system call was initially introduced for the x86_64 architecture where it was shown
+to be faster than breakpoint traps.
+It might be extended to other architectures.
+.P
+The
+.BR uretprobe ()
+system call exists only to allow the invocation of return uprobe consumers.
+It should
+.B never
+be called directly.
+Details of the arguments (if any) passed to
+.BR uretprobe ()
+and the return value are architecture-specific.



Re: [PATCH 06/12] remoteproc: qcom_q6v5_pas: switch to mbn files by default

2024-05-21 Thread Bjorn Andersson
On Tue, May 21, 2024 at 11:49:42AM +0200, neil.armstr...@linaro.org wrote:
> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> > We have been pushing userspace to use mbn files by default for ages.
> > As a preparation for making the firmware-name optional, make the driver
> > use .mbn instead of .mdt files by default.
> 
> I think we should have a mechanism to fallback to .mdt since downstream
> uses split mdt on the devices filesystem.
> 

Let's ignore and continue to move away from the split .mdt files.

Combining split files is trivial and removes a class of problems where
people mix and match their parts. (And worst case you can rename/symlink
your downstream firmware to match the requested filename)

Regards,
Bjorn



Re: [GIT PULL] remoteproc updates for v6.10

2024-05-21 Thread pr-tracker-bot
The pull request you sent on Mon, 20 May 2024 20:12:20 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
> tags/rproc-v6.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ab7b884a34ffda718cb93c772f575e45e8241c62

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [GIT PULL] rpmsg updates for v6.10

2024-05-21 Thread pr-tracker-bot
The pull request you sent on Mon, 20 May 2024 19:58:46 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
> tags/rpmsg-v6.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e66128fa8e7e38ebd0b0c95578f8020aec6c0dee

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [PATCH v2 1/2] drivers: remoteproc: xlnx: add attach detach support

2024-05-21 Thread Mathieu Poirier
Hi Tanmay,

On Fri, May 10, 2024 at 05:51:25PM -0700, Tanmay Shah wrote:
> It is possible that remote processor is already running before
> linux boot or remoteproc platform driver probe. Implement required
> remoteproc framework ops to provide resource table address and
> connect or disconnect with remote processor in such case.
> 
> Signed-off-by: Tanmay Shah 
> ---
> 
> Changes in v2:
>   - Fix following sparse warnings
> 
> drivers/remoteproc/xlnx_r5_remoteproc.c:827:21: sparse:expected struct 
> rsc_tbl_data *rsc_data_va
> drivers/remoteproc/xlnx_r5_remoteproc.c:844:18: sparse:expected struct 
> resource_table *rsc_addr
> drivers/remoteproc/xlnx_r5_remoteproc.c:898:24: sparse:expected void 
> volatile [noderef] __iomem *addr
> 
>  drivers/remoteproc/xlnx_r5_remoteproc.c | 164 +++-
>  1 file changed, 160 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c 
> b/drivers/remoteproc/xlnx_r5_remoteproc.c
> index 84243d1dff9f..039370cffa32 100644
> --- a/drivers/remoteproc/xlnx_r5_remoteproc.c
> +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
> @@ -25,6 +25,10 @@
>  /* RX mailbox client buffer max length */
>  #define MBOX_CLIENT_BUF_MAX  (IPI_BUF_LEN_MAX + \
>sizeof(struct zynqmp_ipi_message))
> +
> +#define RSC_TBL_XLNX_MAGIC   ((uint32_t)'x' << 24 | (uint32_t)'a' << 16 | \
> +  (uint32_t)'m' << 8 | (uint32_t)'p')
> +
>  /*
>   * settings for RPU cluster mode which
>   * reflects possible values of xlnx,cluster-mode dt-property
> @@ -73,6 +77,15 @@ struct mbox_info {
>   struct mbox_chan *rx_chan;
>  };
>  
> +/* Xilinx Platform specific data structure */
> +struct rsc_tbl_data {
> + const int version;
> + const u32 magic_num;
> + const u32 comp_magic_num;

Why is a complement magic number needed?

> + const u32 rsc_tbl_size;
> + const uintptr_t rsc_tbl;
> +} __packed;
> +
>  /*
>   * Hardcoded TCM bank values. This will stay in driver to maintain backward
>   * compatibility with device-tree that does not have TCM information.
> @@ -95,20 +108,24 @@ static const struct mem_bank_data 
> zynqmp_tcm_banks_lockstep[] = {
>  /**
>   * struct zynqmp_r5_core
>   *
> + * @rsc_tbl_va: resource table virtual address
>   * @dev: device of RPU instance
>   * @np: device node of RPU instance
>   * @tcm_bank_count: number TCM banks accessible to this RPU
>   * @tcm_banks: array of each TCM bank data
>   * @rproc: rproc handle
> + * @rsc_tbl_size: resource table size retrieved from remote
>   * @pm_domain_id: RPU CPU power domain id
>   * @ipi: pointer to mailbox information
>   */
>  struct zynqmp_r5_core {
> + struct resource_table *rsc_tbl_va;

Shouldn't this be of type "void __iomem *"?  Did sparse give you trouble on that
one?

>   struct device *dev;
>   struct device_node *np;
>   int tcm_bank_count;
>   struct mem_bank_data **tcm_banks;
>   struct rproc *rproc;
> + u32 rsc_tbl_size;
>   u32 pm_domain_id;
>   struct mbox_info *ipi;
>  };
> @@ -621,10 +638,19 @@ static int zynqmp_r5_rproc_prepare(struct rproc *rproc)
>  {
>   int ret;
>  
> - ret = add_tcm_banks(rproc);
> - if (ret) {
> - dev_err(&rproc->dev, "failed to get TCM banks, err %d\n", ret);
> - return ret;
> + /**

Using "/**" is for comments that will end up in the documentation, which I don't
think is needed here.  Please correct throughout the patch.

> +  * For attach/detach use case, Firmware is already loaded so
> +  * TCM isn't really needed at all. Also, for security TCM can be
> +  * locked in such case and linux may not have access at all.
> +  * So avoid adding TCM banks. TCM power-domains requested during attach
> +  * callback.
> +  */
> + if (rproc->state != RPROC_DETACHED) {
> + ret = add_tcm_banks(rproc);
> + if (ret) {
> + dev_err(&rproc->dev, "failed to get TCM banks, err %d\n", ret);
> + return ret;
> + }
>   }
>  
>   ret = add_mem_regions_carveout(rproc);
> @@ -662,6 +688,123 @@ static int zynqmp_r5_rproc_unprepare(struct rproc 
> *rproc)
>   return 0;
>  }
>  
> +static struct resource_table *zynqmp_r5_get_loaded_rsc_table(struct rproc 
> *rproc,
> +  size_t *size)
> +{
> + struct zynqmp_r5_core *r5_core;
> +
> + r5_core = rproc->priv;
> +
> + *size = r5_core->rsc_tbl_size;
> +
> + return r5_core->rsc_tbl_va;
> +}
> +
> +static int zynqmp_r5_get_rsc_table_va(struct zynqmp_r5_core *r5_core)
> +{
> + struct device *dev = r5_core->dev;
> + struct rsc_tbl_data *rsc_data_va;
> + struct resource_table *rsc_addr;
> + struct resource res_mem;
> + struct device_node *np;
> + int ret;
> +
> + /**
> +  * It is expected from remote processor firmware to provide resource
> +  * table address via struct 

Re: [PATCHv6 bpf-next 1/9] x86/shstk: Make return uprobe work with shadow stack

2024-05-21 Thread Jiri Olsa
On Tue, May 21, 2024 at 04:22:21PM +0200, Oleg Nesterov wrote:
> On 05/21, Jiri Olsa wrote:
> >
> > Currently the application with enabled shadow stack will crash
> > if it sets up return uprobe. The reason is the uretprobe kernel
> > code changes the user space task's stack, but does not update
> > shadow stack accordingly.
> >
> > Adding new functions to update values on shadow stack and using
> > them in uprobe code to keep shadow stack in sync with uretprobe
> > changes to user stack.
> 
> I don't think my ack has any value in this area but looks good to me.
> 
> Reviewed-by: Oleg Nesterov 
> 
> 
> > Fixes: 8b1c23543436 ("x86/shstk: Add return uprobe support")
> 
> Hmm... Was this commit ever applied?

should have been:
  488af8ea7131 x86/shstk: Wire in shadow stack interface

will send new version

thanks,
jirka

> 
> Oleg.
> 



Re: [PATCH] rpmsg: char: fix rpmsg_eptdev structure documentation

2024-05-21 Thread Mathieu Poirier
On Fri, May 17, 2024 at 06:56:54PM +0200, Arnaud Pouliquen wrote:
> Add missing @ tags for some rpmsg_eptdev structure parameters.
> 
> This fixes warning messages on build:
> drivers/rpmsg/rpmsg_char.c:75: warning: Function parameter or struct member 
> 'remote_flow_restricted' not described in 'rpmsg_eptdev'
> drivers/rpmsg/rpmsg_char.c:75: warning: Function parameter or struct member 
> 'remote_flow_updated' not described in 'rpmsg_eptdev'
> 
> Fixes: 5550201c0fe2 ("rpmsg: char: Add RPMSG GET/SET FLOWCONTROL IOCTL 
> support")
> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/rpmsg_char.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index 1cb8d7474428..98d95ce5b6fb 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -52,8 +52,8 @@ static DEFINE_IDA(rpmsg_minor_ida);
>   * @readq:   wait object for incoming queue
>   * @default_ept: set to channel default endpoint if the default endpoint 
> should be re-used
>   *  on device open to prevent endpoint address update.
> - * remote_flow_restricted: to indicate if the remote has requested for flow 
> to be limited
> - * remote_flow_updated: to indicate if the flow control has been requested
> + * @remote_flow_restricted: to indicate if the remote has requested for flow 
> to be limited
> + * @remote_flow_updated: to indicate if the flow control has been requested

I will apply this patch next week when rc1 comes out.

Thanks,
Mathieu

>   */
>  struct rpmsg_eptdev {
>   struct device dev;
> -- 
> 2.25.1
> 



Re: [PATCH] remoteproc: mediatek: Zero out only remaining bytes of IPI buffer

2024-05-21 Thread Mathieu Poirier
On Mon, May 20, 2024 at 01:27:24PM +0200, AngeloGioacchino Del Regno wrote:
> In scp_ipi_handler(), instead of zeroing out the entire shared
> buffer, which may be as large as 600 bytes, overwrite it with the
> received data, then zero out only the remaining bytes.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/remoteproc/mtk_scp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
> index e5214d43181e..dc70cf7db44d 100644
> --- a/drivers/remoteproc/mtk_scp.c
> +++ b/drivers/remoteproc/mtk_scp.c
> @@ -117,8 +117,8 @@ static void scp_ipi_handler(struct mtk_scp *scp)
>   return;
>   }
>  
> - memset(scp->share_buf, 0, scp_sizes->ipi_share_buffer_size);
>   memcpy_fromio(scp->share_buf, &rcv_obj->share_buf, len);
> + memset(&scp->share_buf[len], 0, scp_sizes->ipi_share_buffer_size - len);

I will apply this patch when rc1 comes out next week.

Thanks,
Mathieu

>   handler(scp->share_buf, len, ipi_desc[id].priv);
>   scp_ipi_unlock(scp, id);
>  
> -- 
> 2.45.1
> 
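
For readers who want the idea outside the driver, the following is a minimal userspace sketch of "copy first, then zero only the tail". It is an illustration under assumptions, not the driver code: `fill_share_buf()` and `SHARE_BUF_SIZE` are hypothetical names, plain `memcpy()` stands in for `memcpy_fromio()`, and the 600-byte bound is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHARE_BUF_SIZE 600	/* assumed upper bound, taken from the commit message */

/*
 * Copy len bytes of a received message into buf, then clear only the
 * bytes after it. Returns 0 on success, -1 if the message does not fit.
 */
static int fill_share_buf(unsigned char *buf, const unsigned char *msg, size_t len)
{
	if (len > SHARE_BUF_SIZE)
		return -1;

	memcpy(buf, msg, len);				/* overwrite the head */
	memset(&buf[len], 0, SHARE_BUF_SIZE - len);	/* zero the tail only */
	return 0;
}
```

Every byte of the buffer is still written exactly once, so no stale data from a previous message survives, but the first len bytes are no longer written twice.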



[PATCH] remoteproc: stm32_rproc: Fix mailbox interrupts queuing

2024-05-21 Thread Gwenael Treuveur
Manage interrupt coming from coprocessor also when state is
ATTACHED.

Fixes: 35bdafda40cc ("remoteproc: stm32_rproc: Add mutex protection for workqueue")
Signed-off-by: Gwenael Treuveur 
Acked-by: Arnaud Pouliquen 
---
 drivers/remoteproc/stm32_rproc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 88623df7d0c3..8c7f7950b80e 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -294,7 +294,7 @@ static void stm32_rproc_mb_vq_work(struct work_struct *work)
 
mutex_lock(&rproc->lock);
 
-   if (rproc->state != RPROC_RUNNING)
+   if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED)
goto unlock_mutex;
 
if (rproc_vq_interrupt(rproc, mb->vq_id) == IRQ_NONE)

base-commit: 4d5ba6ead1dc9fa298d727e92db40cd98564d1ac
-- 
2.25.1




Re: [PATCH] tools/latency-collector: fix -Wformat-security compile warns

2024-05-21 Thread Steven Rostedt
On Tue, 21 May 2024 09:11:08 -0600
Shuah Khan  wrote:

> Any thoughts on this patch?

Sorry, this one fell through the cracks. Daniel Bristot has been
maintaining his tools and I thought this was one of his changes.

I'll take a look at it.

-- Steve



Re: [PATCH v3 2/9] riscv: mm: Pre-allocate vmemmap/direct map PGD entries

2024-05-21 Thread Björn Töpel
Björn Töpel  writes:

> From: Björn Töpel 
>
> The RISC-V port copies the PGD table from init_mm/swapper_pg_dir to
> all userland page tables, which means that if the PGD level table is
> changed, other page tables have to be updated as well.
>
> Instead of having the PGD changes ripple out to all tables, the
> synchronization can be avoided by pre-allocating the PGD entries/pages
> at boot, avoiding the synchronization altogether.
>
> This is currently done for the bpf/modules, and vmalloc PGD regions.
> Extend this scheme for the PGD regions touched by memory hotplugging.
>
> Prepare the RISC-V port for memory hotplug by pre-allocating
> vmemmap/direct map entries at the PGD level. This will roughly waste
> ~128 worth of 4K pages when memory hotplugging is enabled in the
> kernel configuration.
>
> Reviewed-by: Alexandre Ghiti 
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/include/asm/kasan.h | 4 ++--
>  arch/riscv/mm/init.c   | 7 +++
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kasan.h b/arch/riscv/include/asm/kasan.h
> index 0b85e363e778..e6a0071bdb56 100644
> --- a/arch/riscv/include/asm/kasan.h
> +++ b/arch/riscv/include/asm/kasan.h
> @@ -6,8 +6,6 @@
>  
>  #ifndef __ASSEMBLY__
>  
> -#ifdef CONFIG_KASAN
> -
>  /*
>   * The following comment was copied from arm64:
>   * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
> @@ -34,6 +32,8 @@
>   */
>  #define KASAN_SHADOW_START   ((KASAN_SHADOW_END - KASAN_SHADOW_SIZE) & 
> PGDIR_MASK)
>  #define KASAN_SHADOW_END MODULES_LOWEST_VADDR
> +
> +#ifdef CONFIG_KASAN
>  #define KASAN_SHADOW_OFFSET  _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
>  
>  void kasan_init(void);
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index b66f846e7634..c98010ede810 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -27,6 +27,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1488,10 +1489,16 @@ static void __init 
> preallocate_pgd_pages_range(unsigned long start, unsigned lon
>   panic("Failed to pre-allocate %s pages for %s area\n", lvl, area);
>  }
>  
> +#define PAGE_END KASAN_SHADOW_START
> +
>  void __init pgtable_cache_init(void)
>  {
>   preallocate_pgd_pages_range(VMALLOC_START, VMALLOC_END, "vmalloc");
>   if (IS_ENABLED(CONFIG_MODULES))
>   preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, 
> "bpf/modules");
> + if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) {
> + preallocate_pgd_pages_range(VMEMMAP_START, VMEMMAP_END, 
> "vmemmap");
> + preallocate_pgd_pages_range(PAGE_OFFSET, PAGE_END, "direct 
> map");

Alex pointed out that KASAN PGDs should be preallocated as well! I'll
address this in the next revision.


Björn



Re: [PATCH] tools/latency-collector: fix -Wformat-security compile warns

2024-05-21 Thread Shuah Khan

On 4/3/24 19:10, Shuah Khan wrote:

Fix the following -Wformat-security compile warnings adding missing
format arguments:

latency-collector.c: In function ‘show_available’:
latency-collector.c:938:17: warning: format not a string literal and
no format arguments [-Wformat-security]
   938 | warnx(no_tracer_msg);
   | ^

latency-collector.c:943:17: warning: format not a string literal and
no format arguments [-Wformat-security]
   943 | warnx(no_latency_tr_msg);
   | ^

latency-collector.c: In function ‘find_default_tracer’:
latency-collector.c:986:25: warning: format not a string literal and
no format arguments [-Wformat-security]
   986 | errx(EXIT_FAILURE, no_tracer_msg);
   |
  ^~~~
latency-collector.c: In function ‘scan_arguments’:
latency-collector.c:1881:33: warning: format not a string literal and
no format arguments [-Wformat-security]
  1881 | errx(EXIT_FAILURE, no_tracer_msg);
   | ^~~~

Signed-off-by: Shuah Khan 
---
  tools/tracing/latency/latency-collector.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/tracing/latency/latency-collector.c b/tools/tracing/latency/latency-collector.c
index 0fd9c747d396..cf263fe9deaf 100644
--- a/tools/tracing/latency/latency-collector.c
+++ b/tools/tracing/latency/latency-collector.c
@@ -935,12 +935,12 @@ static void show_available(void)
}
  
  	if (!tracers) {

-   warnx(no_tracer_msg);
+   warnx("%s", no_tracer_msg);
return;
}
  
  	if (!found) {

-   warnx(no_latency_tr_msg);
+   warnx("%s", no_latency_tr_msg);
tracefs_list_free(tracers);
return;
}
@@ -983,7 +983,7 @@ static const char *find_default_tracer(void)
for (i = 0; relevant_tracers[i]; i++) {
valid = tracer_valid(relevant_tracers[i], &notracer);
if (notracer)
-   errx(EXIT_FAILURE, no_tracer_msg);
+   errx(EXIT_FAILURE, "%s", no_tracer_msg);
if (valid)
return relevant_tracers[i];
}
@@ -1878,7 +1878,7 @@ static void scan_arguments(int argc, char *argv[])
}
valid = tracer_valid(current_tracer, &notracer);
if (notracer)
-   errx(EXIT_FAILURE, no_tracer_msg);
+   errx(EXIT_FAILURE, "%s", no_tracer_msg);
if (!valid)
errx(EXIT_FAILURE,
  "The tracer %s is not supported by your kernel!\n", current_tracer);


Any thoughts on this patch?

thanks,
-- Shuah




Re: [PATCH] uprobes: prevent mutex_lock() under rcu_read_lock()

2024-05-21 Thread Breno Leitao
On Mon, May 20, 2024 at 10:30:17PM -0700, Andrii Nakryiko wrote:
> Recent changes made uprobe_cpu_buffer preparation lazy, and moved it
> deeper into __uprobe_trace_func(). This is problematic because
> __uprobe_trace_func() is called inside rcu_read_lock()/rcu_read_unlock()
> block, which then calls prepare_uprobe_buffer() -> uprobe_buffer_get() ->
> mutex_lock(>mutex), leading to a splat about using mutex under
> non-sleepable RCU:
> 
>   BUG: sleeping function called from invalid context at 
> kernel/locking/mutex.c:585
>in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 98231, name: 
> stress-ng-sigq
>preempt_count: 0, expected: 0
>RCU nest depth: 1, expected: 0
>...
>Call Trace:
> 
> dump_stack_lvl+0x3d/0xe0
> __might_resched+0x24c/0x270
> ? prepare_uprobe_buffer+0xd5/0x1d0
> __mutex_lock+0x41/0x820
> ? ___perf_sw_event+0x206/0x290
> ? __perf_event_task_sched_in+0x54/0x660
> ? __perf_event_task_sched_in+0x54/0x660
> prepare_uprobe_buffer+0xd5/0x1d0
> __uprobe_trace_func+0x4a/0x140
> uprobe_dispatcher+0x135/0x280
> ? uprobe_dispatcher+0x94/0x280
> uprobe_notify_resume+0x650/0xec0
> ? atomic_notifier_call_chain+0x21/0x110
> ? atomic_notifier_call_chain+0xf8/0x110
> irqentry_exit_to_user_mode+0xe2/0x1e0
> asm_exc_int3+0x35/0x40
>RIP: 0033:0x7f7e1d4da390
>Code: 33 04 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b9 01 00 00 00 e9 b2 fc ff 
> ff 66 90 f3 0f 1e fa 31 c9 e9 a5 fc ff ff 0f 1f 44 00 00  0f 1e fa b8 27 
> 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 6e
>RSP: 002b:7ffd2abc3608 EFLAGS: 0246
>RAX:  RBX: 76d325f1 RCX: 
>RDX: 76d325f1 RSI: 000a RDI: 7ffd2abc3690
>RBP: 000a R08: 00017fb7 R09: 00017fb7
>R10: 00017fb7 R11: 0246 R12: 00017ff2
>R13: 7ffd2abc3610 R14:  R15: 7ffd2abc3780
> 
> 
> Luckily, it's easy to fix by moving prepare_uprobe_buffer() to be called
> slightly earlier: into uprobe_trace_func() and uretprobe_trace_func(), outside
> of RCU locked section. This still keeps this buffer preparation lazy and helps
> avoid the overhead when it's not needed. E.g., if there is only BPF uprobe
> handler installed on a given uprobe, buffer won't be initialized.
> 
> Note, the other user of prepare_uprobe_buffer(), __uprobe_perf_func(), is not
> affected, as it doesn't prepare buffer under RCU read lock.
> 
> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao 
> Signed-off-by: Andrii Nakryiko 

Tested-by: Breno Leitao 



Re: [PATCH] uprobes: prevent mutex_lock() under rcu_read_lock()

2024-05-21 Thread Oleg Nesterov
On 05/20, Andrii Nakryiko wrote:
>
> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao 
> Signed-off-by: Andrii Nakryiko 
> ---
>  kernel/trace/trace_uprobe.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Reviewed-by: Oleg Nesterov 




Re: [PATCHv6 bpf-next 1/9] x86/shstk: Make return uprobe work with shadow stack

2024-05-21 Thread Oleg Nesterov
On 05/21, Jiri Olsa wrote:
>
> Currently the application with enabled shadow stack will crash
> if it sets up return uprobe. The reason is the uretprobe kernel
> code changes the user space task's stack, but does not update
> shadow stack accordingly.
>
> Adding new functions to update values on shadow stack and using
> them in uprobe code to keep shadow stack in sync with uretprobe
> changes to user stack.

I don't think my ack has any value in this area but looks good to me.

Reviewed-by: Oleg Nesterov 


> Fixes: 8b1c23543436 ("x86/shstk: Add return uprobe support")

Hmm... Was this commit ever applied?

Oleg.




Re: [PATCH v3 5/9] riscv: mm: Add memory hotplugging support

2024-05-21 Thread Oscar Salvador
On Tue, May 21, 2024 at 03:19:37PM +0200, Alexandre Ghiti wrote:
> On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
> > +   if (PageReserved(page)) {
> > +   __ClearPageReserved(page);
> 
> What's the difference between __ClearPageReserved() and
> ClearPageReserved()? Because it seems like free_reserved_page() calls
> the latter already, so why would you need to call
> __ClearPageReserved() on the first page?

__{Set,Clear}Page are the non-atomic versions.
Usually used when you know that no one else can fiddle with the page, which
should be the case here since we are removing the memory.
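
A rough userspace analogy of the two flavours, using C11 atomics. This is only a sketch: the flag names and both helpers are hypothetical, and the kernel's page-flag accessors are implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>

#define FLAG_RESERVED (1UL << 0)	/* hypothetical bit positions */
#define FLAG_DIRTY    (1UL << 1)

/* Atomic variant: a single read-modify-write, safe against concurrent updaters. */
static void clear_flag_atomic(atomic_ulong *flags, unsigned long flag)
{
	atomic_fetch_and(flags, ~flag);
}

/*
 * Non-atomic variant: separate load and store. Cheaper, but only correct
 * when the caller knows nobody else can modify *flags at the same time.
 */
static void clear_flag_nonatomic(unsigned long *flags, unsigned long flag)
{
	*flags &= ~flag;
}
```

Both leave the same value behind when there is a single writer; the difference only shows up if another CPU updates a neighbouring bit between the load and the store of the non-atomic variant.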

As to why we have __ClearPageReserved and then having
free_reserved_page() call ClearPageReserved I do not really know.
Looking at the history, it has always been like this.

I remember I looked at this a few years ago but I cannot remember the outcome
of that.

Maybe David remembers better, but I think we could remove that
__ClearPageReserved.
Looking at powerpc implementation code, it does not do the
__ClearPageReserved and relies only on free_reserved_page().

I will have a look.

-- 
Oscar Salvador
SUSE Labs



Re: [PATCH v3 5/9] riscv: mm: Add memory hotplugging support

2024-05-21 Thread Björn Töpel
Alexandre Ghiti  writes:

> On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
>>
>> From: Björn Töpel 
>>
>> For an architecture to support memory hotplugging, a couple of
>> callbacks need to be implemented:
>>
>>  arch_add_memory()
>>   This callback is responsible for adding the physical memory into the
>>   direct map, and call into the memory hotplugging generic code via
>>   __add_pages() that adds the corresponding struct page entries, and
>>   updates the vmemmap mapping.
>>
>>  arch_remove_memory()
>>   This is the inverse of the callback above.
>>
>>  vmemmap_free()
>>   This function tears down the vmemmap mappings (if
>>   CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
>>   backing vmemmap pages. Note that for persistent memory, an
>>   alternative allocator for the backing pages can be used; The
>>   vmem_altmap. This means that when the backing pages are cleared,
>>   extra care is needed so that the correct deallocation method is
>>   used.
>>
>>  arch_get_mappable_range()
>>   This function returns the PA range that the direct map can map.
>>   Used by the MHP internals for sanity checks.
>>
>> The page table unmap/teardown functions are heavily based on code from
>> the x86 tree. The same remove_pgd_mapping() function is used in both
>> vmemmap_free() and arch_remove_memory(), but in the latter function
>> the backing pages are not removed.
>>
>> Signed-off-by: Björn Töpel 
>> ---
>>  arch/riscv/mm/init.c | 261 +++
>>  1 file changed, 261 insertions(+)
>>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 6f72b0b2b854..6693b742bf2f 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -1493,3 +1493,264 @@ void __init pgtable_cache_init(void)
>> }
>>  }
>>  #endif
>> +
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static void __meminit free_pagetable(struct page *page, int order)
>> +{
>> +   unsigned int nr_pages = 1 << order;
>> +
>> +   /*
>> +* vmemmap/direct page tables can be reserved, if added at
>> +* boot.
>> +*/
>> +   if (PageReserved(page)) {
>> +   __ClearPageReserved(page);
>
> What's the difference between __ClearPageReserved() and
> ClearPageReserved()? Because it seems like free_reserved_page() calls
> the latter already, so why would you need to call
> __ClearPageReserved() on the first page?

Indeed! x86 copy pasta (which uses bootmem info page that RV doesn't).

>> +   while (nr_pages--)
>> +   free_reserved_page(page++);
>> +   return;
>> +   }
>> +
>> +   free_pages((unsigned long)page_address(page), order);
>> +}
>> +
>> +static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
>> +{
>> +   pte_t *pte;
>> +   int i;
>> +
>> +   for (i = 0; i < PTRS_PER_PTE; i++) {
>> +   pte = pte_start + i;
>> +   if (!pte_none(*pte))
>> +   return;
>> +   }
>> +
>> +   free_pagetable(pmd_page(*pmd), 0);
>> +   pmd_clear(pmd);
>> +}
>> +
>> +static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)
>> +{
>> +   pmd_t *pmd;
>> +   int i;
>> +
>> +   for (i = 0; i < PTRS_PER_PMD; i++) {
>> +   pmd = pmd_start + i;
>> +   if (!pmd_none(*pmd))
>> +   return;
>> +   }
>> +
>> +   free_pagetable(pud_page(*pud), 0);
>> +   pud_clear(pud);
>> +}
>> +
>> +static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
>> +{
>> +   pud_t *pud;
>> +   int i;
>> +
>> +   for (i = 0; i < PTRS_PER_PUD; i++) {
>> +   pud = pud_start + i;
>> +   if (!pud_none(*pud))
>> +   return;
>> +   }
>> +
>> +   free_pagetable(p4d_page(*p4d), 0);
>> +   p4d_clear(p4d);
>> +}
>> +
>> +static void __meminit free_vmemmap_storage(struct page *page, size_t size,
>> +  struct vmem_altmap *altmap)
>> +{
>> +   if (altmap)
>> +   vmem_altmap_free(altmap, size >> PAGE_SHIFT);
>> +   else
>> +   free_pagetable(page, get_order(size));
>> +}
>> +
>> +static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long 
>> addr, unsigned long end,
>> +bool is_vmemmap, struct vmem_altmap 
>> *altmap)
>> +{
>> +   unsigned long next;
>> +   pte_t *ptep, pte;
>> +
>> +   for (; addr < end; addr = next) {
>> +   next = (addr + PAGE_SIZE) & PAGE_MASK;
>
> Nit: use ALIGN() instead.
>
>> +   if (next > end)
>> +   next = end;
>> +
>> +   ptep = pte_base + pte_index(addr);
>> +   pte = READ_ONCE(*ptep);
>
> Nit: Use ptep_get()
>
>> +
>> +   if (!pte_present(*ptep))
>> +   continue;
>> +
>> +   pte_clear(&init_mm, addr, ptep);
>> +   if (is_vmemmap)
>> +   

Re: [PATCH v3 9/9] riscv: mm: Add support for ZONE_DEVICE

2024-05-21 Thread Björn Töpel
Alexandre Ghiti  writes:

> On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
>>
>> From: Björn Töpel 
>>
>> ZONE_DEVICE pages need DEVMAP PTEs support to function
>> (ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit
>> in the PTE for DEVMAP mark, add the corresponding helpers, and enable
>> ARCH_HAS_PTE_DEVMAP for riscv64.
>>
>> Signed-off-by: Björn Töpel 
>> ---
>>  arch/riscv/Kconfig|  1 +
>>  arch/riscv/include/asm/pgtable-64.h   | 20 
>>  arch/riscv/include/asm/pgtable-bits.h |  1 +
>>  arch/riscv/include/asm/pgtable.h  | 17 +
>>  4 files changed, 39 insertions(+)
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index 2724dc2af29f..0b74698c63c7 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -36,6 +36,7 @@ config RISCV
>> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> select ARCH_HAS_PMEM_API
>> select ARCH_HAS_PREPARE_SYNC_CORE_CMD
>> +   select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
>> select ARCH_HAS_PTE_SPECIAL
>> select ARCH_HAS_SET_DIRECT_MAP if MMU
>> select ARCH_HAS_SET_MEMORY if MMU
>> diff --git a/arch/riscv/include/asm/pgtable-64.h 
>> b/arch/riscv/include/asm/pgtable-64.h
>> index 221a5c1ee287..c67a9bbfd010 100644
>> --- a/arch/riscv/include/asm/pgtable-64.h
>> +++ b/arch/riscv/include/asm/pgtable-64.h
>> @@ -400,4 +400,24 @@ static inline struct page *pgd_page(pgd_t pgd)
>>  #define p4d_offset p4d_offset
>>  p4d_t *p4d_offset(pgd_t *pgd, unsigned long address);
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +static inline int pte_devmap(pte_t pte);
>> +static inline pte_t pmd_pte(pmd_t pmd);
>> +
>> +static inline int pmd_devmap(pmd_t pmd)
>> +{
>> +   return pte_devmap(pmd_pte(pmd));
>> +}
>> +
>> +static inline int pud_devmap(pud_t pud)
>> +{
>> +   return 0;
>> +}
>> +
>> +static inline int pgd_devmap(pgd_t pgd)
>> +{
>> +   return 0;
>> +}
>> +#endif
>> +
>>  #endif /* _ASM_RISCV_PGTABLE_64_H */
>> diff --git a/arch/riscv/include/asm/pgtable-bits.h 
>> b/arch/riscv/include/asm/pgtable-bits.h
>> index 179bd4afece4..a8f5205cea54 100644
>> --- a/arch/riscv/include/asm/pgtable-bits.h
>> +++ b/arch/riscv/include/asm/pgtable-bits.h
>> @@ -19,6 +19,7 @@
>>  #define _PAGE_SOFT  (3 << 8)/* Reserved for software */
>>
>>  #define _PAGE_SPECIAL   (1 << 8)/* RSW: 0x1 */
>> +#define _PAGE_DEVMAP(1 << 9)/* RSW, devmap */
>>  #define _PAGE_TABLE _PAGE_PRESENT
>>
>>  /*
>> diff --git a/arch/riscv/include/asm/pgtable.h 
>> b/arch/riscv/include/asm/pgtable.h
>> index 7933f493db71..02fadc276064 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -387,6 +387,13 @@ static inline int pte_special(pte_t pte)
>> return pte_val(pte) & _PAGE_SPECIAL;
>>  }
>>
>> +#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
>> +static inline int pte_devmap(pte_t pte)
>> +{
>> +   return pte_val(pte) & _PAGE_DEVMAP;
>> +}
>> +#endif
>
> Not sure you need the #ifdef here.

Without it, 32-bit builds break (when CONFIG_ARCH_HAS_PTE_DEVMAP is not
defined, a default implementation is already provided). Maybe it's cleaner
just to use that instead?
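
A minimal userspace sketch of the pattern under discussion, assuming a stand-in pte_t and a hypothetical SKETCH_HAS_PTE_DEVMAP switch in place of CONFIG_ARCH_HAS_PTE_DEVMAP. The bit values are taken from the quoted pgtable-bits.h hunk; everything else is illustrative and is not the kernel's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the quoted pgtable-bits.h hunk. */
#define _PAGE_SOFT    (3UL << 8)   /* both RSW (reserved for software) bits */
#define _PAGE_SPECIAL (1UL << 8)
#define _PAGE_DEVMAP  (1UL << 9)

/* Stand-in for the kernel's pte_t; an assumption for this sketch only. */
typedef struct { uint64_t pte; } pte_t;

static inline uint64_t pte_val(pte_t pte) { return pte.pte; }
static inline pte_t __pte(uint64_t v) { pte_t p = { v }; return p; }

/* Hypothetical switch emulating CONFIG_ARCH_HAS_PTE_DEVMAP=y. */
#define SKETCH_HAS_PTE_DEVMAP 1

#if SKETCH_HAS_PTE_DEVMAP
/* Arch-specific helper: test the second RSW bit. */
static inline int pte_devmap(pte_t pte)
{
	return !!(pte_val(pte) & _PAGE_DEVMAP);
}
#else
/* Shape of a generic fallback: no devmap support, always 0. */
static inline int pte_devmap(pte_t pte)
{
	(void)pte;
	return 0;
}
#endif

static inline pte_t pte_mkdevmap(pte_t pte)
{
	return __pte(pte_val(pte) | _PAGE_DEVMAP);
}
```

With the switch set to 0, the arch helper would disappear and only the fallback would be compiled, which is why defining the arch version unconditionally would clash on a configuration that also pulls in the fallback.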

>> +
>>  /* static inline pte_t pte_rdprotect(pte_t pte) */
>>
>>  static inline pte_t pte_wrprotect(pte_t pte)
>> @@ -428,6 +435,11 @@ static inline pte_t pte_mkspecial(pte_t pte)
>> return __pte(pte_val(pte) | _PAGE_SPECIAL);
>>  }
>>
>> +static inline pte_t pte_mkdevmap(pte_t pte)
>> +{
>> +   return __pte(pte_val(pte) | _PAGE_DEVMAP);
>> +}
>> +
>>  static inline pte_t pte_mkhuge(pte_t pte)
>>  {
>> return pte;
>> @@ -711,6 +723,11 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
>> return pte_pmd(pte_mkdirty(pmd_pte(pmd)));
>>  }
>>
>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>> +{
>> +   return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
>> +}
>> +
>>  static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
>> pmd_t *pmdp, pmd_t pmd)
>>  {
>> --
>> 2.40.1
>>
>
> Otherwise, you can add:
>
> Reviewed-by: Alexandre Ghiti 

Thank you!


Björn



Re: [PATCH v3 9/9] riscv: mm: Add support for ZONE_DEVICE

2024-05-21 Thread Alexandre Ghiti
On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> ZONE_DEVICE pages need DEVMAP PTEs support to function
> (ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit
> in the PTE for DEVMAP mark, add the corresponding helpers, and enable
> ARCH_HAS_PTE_DEVMAP for riscv64.
>
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/Kconfig|  1 +
>  arch/riscv/include/asm/pgtable-64.h   | 20 
>  arch/riscv/include/asm/pgtable-bits.h |  1 +
>  arch/riscv/include/asm/pgtable.h  | 17 +
>  4 files changed, 39 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 2724dc2af29f..0b74698c63c7 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -36,6 +36,7 @@ config RISCV
> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> select ARCH_HAS_PMEM_API
> select ARCH_HAS_PREPARE_SYNC_CORE_CMD
> +   select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_SET_DIRECT_MAP if MMU
> select ARCH_HAS_SET_MEMORY if MMU
> diff --git a/arch/riscv/include/asm/pgtable-64.h 
> b/arch/riscv/include/asm/pgtable-64.h
> index 221a5c1ee287..c67a9bbfd010 100644
> --- a/arch/riscv/include/asm/pgtable-64.h
> +++ b/arch/riscv/include/asm/pgtable-64.h
> @@ -400,4 +400,24 @@ static inline struct page *pgd_page(pgd_t pgd)
>  #define p4d_offset p4d_offset
>  p4d_t *p4d_offset(pgd_t *pgd, unsigned long address);
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +static inline int pte_devmap(pte_t pte);
> +static inline pte_t pmd_pte(pmd_t pmd);
> +
> +static inline int pmd_devmap(pmd_t pmd)
> +{
> +   return pte_devmap(pmd_pte(pmd));
> +}
> +
> +static inline int pud_devmap(pud_t pud)
> +{
> +   return 0;
> +}
> +
> +static inline int pgd_devmap(pgd_t pgd)
> +{
> +   return 0;
> +}
> +#endif
> +
>  #endif /* _ASM_RISCV_PGTABLE_64_H */
> diff --git a/arch/riscv/include/asm/pgtable-bits.h 
> b/arch/riscv/include/asm/pgtable-bits.h
> index 179bd4afece4..a8f5205cea54 100644
> --- a/arch/riscv/include/asm/pgtable-bits.h
> +++ b/arch/riscv/include/asm/pgtable-bits.h
> @@ -19,6 +19,7 @@
>  #define _PAGE_SOFT  (3 << 8)/* Reserved for software */
>
>  #define _PAGE_SPECIAL   (1 << 8)/* RSW: 0x1 */
> +#define _PAGE_DEVMAP(1 << 9)/* RSW, devmap */
>  #define _PAGE_TABLE _PAGE_PRESENT
>
>  /*
> diff --git a/arch/riscv/include/asm/pgtable.h 
> b/arch/riscv/include/asm/pgtable.h
> index 7933f493db71..02fadc276064 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -387,6 +387,13 @@ static inline int pte_special(pte_t pte)
> return pte_val(pte) & _PAGE_SPECIAL;
>  }
>
> +#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
> +static inline int pte_devmap(pte_t pte)
> +{
> +   return pte_val(pte) & _PAGE_DEVMAP;
> +}
> +#endif

Not sure you need the #ifdef here.

> +
>  /* static inline pte_t pte_rdprotect(pte_t pte) */
>
>  static inline pte_t pte_wrprotect(pte_t pte)
> @@ -428,6 +435,11 @@ static inline pte_t pte_mkspecial(pte_t pte)
> return __pte(pte_val(pte) | _PAGE_SPECIAL);
>  }
>
> +static inline pte_t pte_mkdevmap(pte_t pte)
> +{
> +   return __pte(pte_val(pte) | _PAGE_DEVMAP);
> +}
> +
>  static inline pte_t pte_mkhuge(pte_t pte)
>  {
> return pte;
> @@ -711,6 +723,11 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
> return pte_pmd(pte_mkdirty(pmd_pte(pmd)));
>  }
>
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +   return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
> +}
> +
>  static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
> pmd_t *pmdp, pmd_t pmd)
>  {
> --
> 2.40.1
>

Otherwise, you can add:

Reviewed-by: Alexandre Ghiti 

Thanks,

Alex



Re: [PATCH V3 3/3] vdpa_sim: flush workers on suspend

2024-05-21 Thread Steven Sistare

On 5/20/2024 10:32 PM, Jason Wang wrote:

On Mon, May 20, 2024 at 11:21 PM Steve Sistare
 wrote:


Flush to guarantee no workers are running when suspend returns.
Add a lock to enforce ordering between clearing running, flushing,
and posting new work in vdpasim_kick_vq.  It must be a spin lock
because vdpasim_kick_vq may be reached via eventfd_write.

Signed-off-by: Steve Sistare 
---
  drivers/vdpa/vdpa_sim/vdpa_sim.c | 16 ++--
  drivers/vdpa/vdpa_sim/vdpa_sim.h |  1 +
  2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 8ffea8430f95..67ed49d95bf0 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -322,7 +322,7 @@ static u16 vdpasim_get_vq_size(struct vdpa_device *vdpa, 
u16 idx)
 return VDPASIM_QUEUE_MAX;
  }

-static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx)
+static void vdpasim_do_kick_vq(struct vdpa_device *vdpa, u16 idx)
  {
 struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
 struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
@@ -337,6 +337,15 @@ static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 
idx)
 vdpasim_schedule_work(vdpasim);
  }

+static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx)
+{
+   struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+
+   spin_lock(&vdpasim->kick_lock);
+   vdpasim_do_kick_vq(vdpa, idx);
+   spin_unlock(&vdpasim->kick_lock);
+}
+
  static void vdpasim_set_vq_cb(struct vdpa_device *vdpa, u16 idx,
   struct vdpa_callback *cb)
  {
@@ -520,8 +529,11 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
 struct vdpasim *vdpasim = vdpa_to_sim(vdpa);

 mutex_lock(&vdpasim->mutex);
+   spin_lock(&vdpasim->kick_lock);
 vdpasim->running = false;
+   spin_unlock(&vdpasim->kick_lock);
 mutex_unlock(&vdpasim->mutex);
+   kthread_flush_work(&vdpasim->work);

 return 0;
  }
@@ -537,7 +549,7 @@ static int vdpasim_resume(struct vdpa_device *vdpa)
 if (vdpasim->pending_kick) {
 /* Process pending descriptors */
 for (i = 0; i < vdpasim->dev_attr.nvqs; ++i)
-   vdpasim_kick_vq(vdpa, i);
+   vdpasim_do_kick_vq(vdpa, i);

 vdpasim->pending_kick = false;
 }
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index bb137e479763..5eb6ca9c5ec5 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -75,6 +75,7 @@ struct vdpasim {
 bool pending_kick;
 /* spinlock to synchronize iommu table */
 spinlock_t iommu_lock;
+   spinlock_t kick_lock;


It looks to me this is not initialized?


Yup, I lost that line while fiddling with different locking schemes.
Thanks, will fix in V4.

@@ -236,6 +236,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr 
*dev_attr,


mutex_init(&vdpasim->mutex);
spin_lock_init(&vdpasim->iommu_lock);
+   spin_lock_init(&vdpasim->kick_lock);

With that fix, does this patch earn your RB?

- Steve


  };

  struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *attr,
--
2.39.3
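
The ordering argument in the commit message (clear running under the lock, then flush, so that no kick can post new work after suspend) can be sketched in userspace C. This is only an illustration: struct sim, the atomic_flag spin lock and the scheduled counter are assumptions standing in for struct vdpasim, spinlock_t kick_lock and vdpasim_schedule_work(), and the flush itself is reduced to a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for struct vdpasim; field names are illustrative. */
struct sim {
	atomic_flag kick_lock;	/* plays the role of spinlock_t kick_lock */
	bool running;
	bool pending_kick;
	int scheduled;		/* counts work items that would be queued */
};

static inline void sim_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;		/* spin until the flag is released */
}

static inline void sim_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static inline void sim_init(struct sim *s)
{
	atomic_flag_clear(&s->kick_lock);
	s->running = true;
	s->pending_kick = false;
	s->scheduled = 0;
}

/* Unlocked body: remember the kick while suspended, else queue work. */
static inline void sim_do_kick(struct sim *s)
{
	if (!s->running) {
		s->pending_kick = true;
		return;
	}
	s->scheduled++;
}

/* Locked entry point, as reached from the kick path. */
static inline void sim_kick(struct sim *s)
{
	sim_lock(&s->kick_lock);
	sim_do_kick(s);
	sim_unlock(&s->kick_lock);
}

static inline void sim_suspend(struct sim *s)
{
	sim_lock(&s->kick_lock);
	s->running = false;
	sim_unlock(&s->kick_lock);
	/* A flush of already-queued work would go here, after the unlock. */
}

static inline void sim_resume(struct sim *s)
{
	s->running = true;
	if (s->pending_kick) {
		sim_do_kick(s);	/* replay the kick seen while suspended */
		s->pending_kick = false;
	}
}
```

Once sim_suspend() has released the lock, any later sim_kick() sees running == false and only records a pending kick, so a flush placed after the unlock has nothing new to race with.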







Re: [PATCH V3 2/3] vduse: suspend

2024-05-21 Thread Steven Sistare

On 5/20/2024 10:30 PM, Jason Wang wrote:

On Mon, May 20, 2024 at 11:21 PM Steve Sistare
 wrote:


Support the suspend operation.  There is little to do, except flush to
guarantee no workers are running when suspend returns.

Signed-off-by: Steve Sistare 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 24 
  1 file changed, 24 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 73c89701fc9d..7dc46f771f12 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -472,6 +472,18 @@ static void vduse_dev_reset(struct vduse_dev *dev)
 up_write(&dev->rwsem);
  }

+static void vduse_flush_work(struct vduse_dev *dev)
+{
+   flush_work(&dev->inject);
+
+   for (int i = 0; i < dev->vq_num; i++) {
+   struct vduse_virtqueue *vq = dev->vqs[i];
+
+   flush_work(&vq->inject);
+   flush_work(&vq->kick);
+   }
+}
+
  static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx,
 u64 desc_area, u64 driver_area,
 u64 device_area)
@@ -724,6 +736,17 @@ static int vduse_vdpa_reset(struct vdpa_device *vdpa)
 return ret;
  }

+static int vduse_vdpa_suspend(struct vdpa_device *vdpa)
+{
+   struct vduse_dev *dev = vdpa_to_vduse(vdpa);
+
+   down_write(&dev->rwsem);
+   vduse_flush_work(dev);
+   up_write(&dev->rwsem);


Can this forbid the new work to be scheduled?


Are you suggesting I return an error below if the dev is suspended?
I can do that.

However, I now suspect this implementation of vduse_vdpa_suspend is not
complete in other ways, so I withdraw this patch pending future work.
Thanks for looking at it.

- Steve


static int vduse_dev_queue_irq_work(struct vduse_dev *dev,
 struct work_struct *irq_work,
 int irq_effective_cpu)
{
 int ret = -EINVAL;

 down_read(&dev->rwsem);
 if (!(dev->status & VIRTIO_CONFIG_S_DRIVER_OK))
 goto unlock;

 ret = 0;
 if (irq_effective_cpu == IRQ_UNBOUND)
 queue_work(vduse_irq_wq, irq_work);
 else
 queue_work_on(irq_effective_cpu,
   vduse_irq_bound_wq, irq_work);
unlock:
 up_read(&dev->rwsem);

 return ret;
}

Thanks


+
+   return 0;
+}
+
  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
  {
 struct vduse_dev *dev = vdpa_to_vduse(vdpa);
@@ -806,6 +829,7 @@ static const struct vdpa_config_ops vduse_vdpa_config_ops = 
{
 .set_vq_affinity= vduse_vdpa_set_vq_affinity,
 .get_vq_affinity= vduse_vdpa_get_vq_affinity,
 .reset  = vduse_vdpa_reset,
+   .suspend= vduse_vdpa_suspend,
 .set_map= vduse_vdpa_set_map,
 .free   = vduse_vdpa_free,
  };
--
2.39.3
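
One possible shape of the "return an error if the dev is suspended" idea raised in this exchange, as a hedged userspace sketch. The suspended flag, the -EBUSY choice and all names ending in _sketch are assumptions and are not part of the patch; locking (the rwsem in the real driver) is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_DRIVER_OK 0x4	/* value of VIRTIO_CONFIG_S_DRIVER_OK */

/* Stand-in for struct vduse_dev; only the fields the sketch needs. */
struct dev_sketch {
	unsigned int status;
	bool suspended;		/* hypothetical flag, not in the patch */
	int queued;		/* counts work items that would be queued */
};

/* Mirrors the checks in the quoted vduse_dev_queue_irq_work(). */
static inline int queue_irq_work_sketch(struct dev_sketch *dev)
{
	if (!(dev->status & SKETCH_DRIVER_OK))
		return -EINVAL;
	if (dev->suspended)
		return -EBUSY;	/* assumed error code: refuse new work */
	dev->queued++;
	return 0;
}

static inline void suspend_sketch(struct dev_sketch *dev)
{
	dev->suspended = true;
	/* Flushing already-queued work would follow here. */
}
```

Setting the flag before flushing is what would keep the flush from racing with newly queued work; whether that alone makes suspend complete is the open question noted in the reply above.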







Re: [PATCH V3 1/3] vhost-vdpa: flush workers on suspend

2024-05-21 Thread Steven Sistare

On 5/20/2024 10:28 PM, Jason Wang wrote:

On Mon, May 20, 2024 at 11:21 PM Steve Sistare
 wrote:


Flush to guarantee no workers are running when suspend returns.

Fixes: f345a0143b4d ("vhost-vdpa: uAPI to suspend the device")
Signed-off-by: Steve Sistare 
Acked-by: Eugenio Pérez 
---
  drivers/vhost/vdpa.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index ba52d128aeb7..189596caaec9 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -594,6 +594,7 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
 struct vdpa_device *vdpa = v->vdpa;
 const struct vdpa_config_ops *ops = vdpa->config;
 int ret;
+   struct vhost_dev *vdev = &v->vdev;

 if (!(ops->get_status(vdpa) & VIRTIO_CONFIG_S_DRIVER_OK))
 return 0;
@@ -601,6 +602,8 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
 if (!ops->suspend)
 return -EOPNOTSUPP;

+   vhost_dev_flush(vdev);


vhost-vDPA doesn't use workers, see:

 vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
vhost_vdpa_process_iotlb_msg);

So I wonder if this is a must.


True, but I am adding this to be future proof.  I could instead log a warning
or an error message if vhost_vdpa_suspend is called and v->vdev.use_worker=true,
but IMO we should just fix it, given that the fix is trivial.

- Steve






Re: [PATCH v3 7/9] riscv: Enable memory hotplugging for RISC-V

2024-05-21 Thread Alexandre Ghiti
On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> Enable ARCH_ENABLE_MEMORY_HOTPLUG and ARCH_ENABLE_MEMORY_HOTREMOVE for
> RISC-V.
>
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index fe5281398543..2724dc2af29f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -16,6 +16,8 @@ config RISCV
> select ACPI_REDUCED_HARDWARE_ONLY if ACPI
> select ARCH_DMA_DEFAULT_COHERENT
> select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION
> +   select ARCH_ENABLE_MEMORY_HOTPLUG if SPARSEMEM_VMEMMAP && 64BIT && MMU

Not sure you need 64BIT && MMU here since ARCH_SPARSEMEM_ENABLE
depends on MMU and SPARSEMEM_VMEMMAP_ENABLE is only enabled on 64BIT.

> +   select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG
> select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
> select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> select ARCH_HAS_BINFMT_FLAT
> --
> 2.40.1
>

But anyway, to me that does not require a new version so you can add:

Reviewed-by: Alexandre Ghiti 

Thanks,

Alex



Re: [PATCH v3 5/9] riscv: mm: Add memory hotplugging support

2024-05-21 Thread Alexandre Ghiti
On Tue, May 21, 2024 at 1:49 PM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> For an architecture to support memory hotplugging, a couple of
> callbacks need to be implemented:
>
>  arch_add_memory()
>   This callback is responsible for adding the physical memory into the
>   direct map, and call into the memory hotplugging generic code via
>   __add_pages() that adds the corresponding struct page entries, and
>   updates the vmemmap mapping.
>
>  arch_remove_memory()
>   This is the inverse of the callback above.
>
>  vmemmap_free()
>   This function tears down the vmemmap mappings (if
>   CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
>   backing vmemmap pages. Note that for persistent memory, an
>   alternative allocator for the backing pages can be used; The
>   vmem_altmap. This means that when the backing pages are cleared,
>   extra care is needed so that the correct deallocation method is
>   used.
>
>  arch_get_mappable_range()
>   This function returns the PA range that the direct map can map.
>   Used by the MHP internals for sanity checks.
>
> The page table unmap/teardown functions are heavily based on code from
> the x86 tree. The same remove_pgd_mapping() function is used in both
> vmemmap_free() and arch_remove_memory(), but in the latter function
> the backing pages are not removed.
>
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/mm/init.c | 261 +++
>  1 file changed, 261 insertions(+)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 6f72b0b2b854..6693b742bf2f 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1493,3 +1493,264 @@ void __init pgtable_cache_init(void)
> }
>  }
>  #endif
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static void __meminit free_pagetable(struct page *page, int order)
> +{
> +   unsigned int nr_pages = 1 << order;
> +
> +   /*
> +* vmemmap/direct page tables can be reserved, if added at
> +* boot.
> +*/
> +   if (PageReserved(page)) {
> +   __ClearPageReserved(page);

What's the difference between __ClearPageReserved() and
ClearPageReserved()? Because it seems like free_reserved_page() calls
the latter already, so why would you need to call
__ClearPageReserved() on the first page?

> +   while (nr_pages--)
> +   free_reserved_page(page++);
> +   return;
> +   }
> +
> +   free_pages((unsigned long)page_address(page), order);
> +}
> +
> +static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
> +{
> +   pte_t *pte;
> +   int i;
> +
> +   for (i = 0; i < PTRS_PER_PTE; i++) {
> +   pte = pte_start + i;
> +   if (!pte_none(*pte))
> +   return;
> +   }
> +
> +   free_pagetable(pmd_page(*pmd), 0);
> +   pmd_clear(pmd);
> +}
> +
> +static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)
> +{
> +   pmd_t *pmd;
> +   int i;
> +
> +   for (i = 0; i < PTRS_PER_PMD; i++) {
> +   pmd = pmd_start + i;
> +   if (!pmd_none(*pmd))
> +   return;
> +   }
> +
> +   free_pagetable(pud_page(*pud), 0);
> +   pud_clear(pud);
> +}
> +
> +static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
> +{
> +   pud_t *pud;
> +   int i;
> +
> +   for (i = 0; i < PTRS_PER_PUD; i++) {
> +   pud = pud_start + i;
> +   if (!pud_none(*pud))
> +   return;
> +   }
> +
> +   free_pagetable(p4d_page(*p4d), 0);
> +   p4d_clear(p4d);
> +}
> +
> +static void __meminit free_vmemmap_storage(struct page *page, size_t size,
> +  struct vmem_altmap *altmap)
> +{
> +   if (altmap)
> +   vmem_altmap_free(altmap, size >> PAGE_SHIFT);
> +   else
> +   free_pagetable(page, get_order(size));
> +}
> +
> +static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long 
> addr, unsigned long end,
> +bool is_vmemmap, struct vmem_altmap 
> *altmap)
> +{
> +   unsigned long next;
> +   pte_t *ptep, pte;
> +
> +   for (; addr < end; addr = next) {
> +   next = (addr + PAGE_SIZE) & PAGE_MASK;

Nit: use ALIGN() instead.
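
A small userspace check of how the open-coded expression relates to ALIGN(), as a sketch only. PAGE_SHIFT of 12 and the local macro definitions are assumptions that mirror the usual kernel ones; the exact ALIGN() form intended by the nit is not stated in this message.

```c
#include <assert.h>

#define PAGE_SHIFT 12			/* assumed 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))
/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* The expression used in the quoted loop. */
static inline unsigned long next_open_coded(unsigned long addr)
{
	return (addr + PAGE_SIZE) & PAGE_MASK;
}

/* Same result with ALIGN(): note the +1 so aligned addresses advance. */
static inline unsigned long next_with_align(unsigned long addr)
{
	return ALIGN(addr + 1, PAGE_SIZE);
}
```

ALIGN(addr, PAGE_SIZE) on its own returns addr unchanged when addr is already page aligned, so a loop using it without the +1 would not make progress.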

> +   if (next > end)
> +   next = end;
> +
> +   ptep = pte_base + pte_index(addr);
> +   pte = READ_ONCE(*ptep);

Nit: Use ptep_get()

> +
> +   if (!pte_present(*ptep))
> +   continue;
> +
> +   pte_clear(&init_mm, addr, ptep);
> +   if (is_vmemmap)
> +   free_vmemmap_storage(pte_page(pte), PAGE_SIZE, 
> altmap);
> +   }
> +}
> +
> +static void __meminit remove_pmd_mapping(pmd_t *pmd_base, unsigned long 
> addr, unsigned long end,
> +bool is_vmemmap, struct 

Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-21 Thread Dmitry Baryshkov
On Tue, 21 May 2024 at 13:20, Kalle Valo  wrote:
>
> Dmitry Baryshkov  writes:
>
> > On Tue, 21 May 2024 at 12:52,  wrote:
> >>
> >> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> >> > Qualcomm platforms have different sets of the firmware files, which
> >> > differ from platform to platform (and from board to board, due to the
> >> > embedded signatures). Rather than listing all the firmware files,
> >> > including full paths, in the DT, provide a way to determine firmware
> >> > path based on the root DT node compatible.
> >>
> >> Ok this looks quite over-engineered but necessary to handle the legacy,
> >> but I really think we should add a way to look for a board-specific path
> >> first and fallback to those SoC specific paths.
> >
> > Again, CONFIG_FW_LOADER_USER_HELPER => delays.
>
> To me this also looks like very over-engineered, can you elaborate more
> why this is needed? Concrete examples would help to understand better.

Sure. During the meeting last week Arnd suggested evaluating if we can
drop firmware-name from the board DT files. Several reasons for that:
- DT should describe the hardware, not the Linux-firmware locations
- having firmware name in DT complicates updating the tree to use
different firmware API (think of mbn vs mdt vs any other format)
- If the DT gets supplied by the vendor (e.g. for
SystemReady-certified devices), there should be a sync between the
vendor's DT, linux kernel and the rootfs. Dropping firmware names from
DT solves that by removing one piece of the equation

Now for the complexity of the solution. Each SoC family has their own
firmware set. This includes firmware for the DSPs, for modem, WiFi
bits, GPU shader, etc.
For the development boards these files are signed by the testing key
and the actual signature is not validated against the root of trust
certificate.
For the end-user devices the signature is actually validated against
the bits fused to the SoC during the manufacturing process. The CA
certificate (and thus the fuses) differs from vendor to vendor (and from
device to device).

Not all of the firmware files are a part of the public linux-firmware
tree. However we need to support the rootfs bundled with the firmware
for different platforms (both public and vendor). The non-signed files
come from the Adreno GPU and can be shared between platforms. All
other files are SoC-specific and in some cases device-specific.

So for example the SDM845 db845c (open device) loads following firmware files:
Not signed:
- qcom/a630_sqe.fw
- qcom/a630_gmu.bin

Signed, will work for any non-secured sdm845 device:
- qcom/sdm845/a630_zap.mbn
- qcom/sdm845/adsp.mbn
- qcom/sdm845/cdsp.mbn
- qcom/sdm845/mba.mbn
- qcom/sdm845/modem.mbn
- qcom/sdm845/wlanmdsp.mbn (loaded via TQFTP)
- qcom/venus-5.2/venus.mbn

Signed, works only for DB845c.
- qcom/sdm845/Thundercomm/db845c/slpi.mbn

In comparison, the SDM845 Pixel-3 phone (aka blueline) should load the
following firmware files:
- qcom/a630_sqe.fw (the same, non-signed file)
- qcom/a630_gmu.bin (the same, non-signed file)
- qcom/sdm845/Google/blueline/a630_zap.mbn
- qcom/sdm845/Google/blueline/adsp.mbn
- qcom/sdm845/Google/blueline/cdsp.mbn
- qcom/sdm845/Google/blueline/ipa_fws.mbn
- qcom/sdm845/Google/blueline/mba.mbn
- qcom/sdm845/Google/blueline/modem.mbn
- qcom/sdm845/Google/blueline/venus.mbn
- qcom/sdm845/Google/blueline/wlanmdsp.mbn
- qcom/sdm845/Google/blueline/slpi.mbn

The Lenovo Yoga C630 WoS laptop (SDM850 is a variant of SDM845) uses
another set of files:
- qcom/a630_sqe.fw (the same, non-signed file)
- qcom/a630_gmu.bin (the same, non-signed file)
- qcom/sdm850/LENOVO/81JL/qcdxkmsuc850.mbn
- qcom/sdm850/LENOVO/81JL/qcadsp850.mbn
- qcom/sdm850/LENOVO/81JL/qccdsp850.mbn
- qcom/sdm850/LENOVO/81JL/ipa_fws.elf
- qcom/sdm850/LENOVO/81JL/qcdsp1v2850.mbn
- qcom/sdm850/LENOVO/81JL/qcdsp2850.mbn
- qcom/sdm850/LENOVO/81JL/qcvss850.mbn
- qcom/sdm850/LENOVO/81JL/wlanmdsp.mbn
- qcom/sdm850/LENOVO/81JL/qcslpi850.mbn

If we look at one of the recent platforms, e.g. SM8650-QRD, this list
grows further:
- qcom/gen70900_sqe.fw (generic, non-signed)
- qcom/gmu_gen70900.bin (generic, non-signed)
- qcom/sm8650/gen70900_zap.mbn
- qcom/sm8650/adsp.mbn
- qcom/sm8650/adsp_dtb.mbn
- qcom/sm8650/cdsp.mbn
- qcom/sm8650/cdsp_dtb.mbn
- qcom/sm8650/ipa_fws.mbn
- qcom/sm8650/modem.mbn
- qcom/sm8650/modem_dtb.mbn
- qcom/sm8650/vpu33_4v.mbn (or maybe qcom/vpu-33/vpu_4v.mbn)
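
The path layout in the lists above (a SoC directory, optionally followed by a vendor/board directory) can be sketched as a small path builder. fw_path() and its parameters are hypothetical names for illustration; the helper actually proposed in the patch is not shown in this message, and the board-first, SoC-fallback order is the suggestion from earlier in the thread, not confirmed behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "qcom/<soc>/<board_dir>/<file>" when a
 * board directory is given, or "qcom/<soc>/<file>" when board_dir is NULL.
 * Returns the snprintf() result (length that would have been written).
 */
static inline int fw_path(char *buf, size_t len, const char *soc,
			  const char *board_dir, const char *file)
{
	if (board_dir)
		return snprintf(buf, len, "qcom/%s/%s/%s",
				soc, board_dir, file);
	return snprintf(buf, len, "qcom/%s/%s", soc, file);
}
```

A caller following the board-first suggestion would try the board-specific path and, if that file is missing, retry with board_dir set to NULL.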

-- 
With best wishes
Dmitry



Re: [PATCH v3 3/9] riscv: mm: Change attribute from __init to __meminit for page functions

2024-05-21 Thread Alexandre Ghiti
On Tue, May 21, 2024 at 1:48 PM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> Prepare for memory hotplugging support by changing from __init to
> __meminit for the page table functions that are used by the upcoming
> architecture specific callbacks.
>
> Changing the __init attribute to __meminit prevents the functions from
> being removed after init. The __meminit attribute makes sure the
> functions are kept in the kernel text post init, but only if memory
> hotplugging is enabled for the build.
>
> Reviewed-by: David Hildenbrand 
> Reviewed-by: Oscar Salvador 
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/include/asm/mmu.h |  4 +--
>  arch/riscv/include/asm/pgtable.h |  2 +-
>  arch/riscv/mm/init.c | 56 ++--
>  3 files changed, 28 insertions(+), 34 deletions(-)
>
> diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
> index 947fd60f9051..c9e03e9da3dc 100644
> --- a/arch/riscv/include/asm/mmu.h
> +++ b/arch/riscv/include/asm/mmu.h
> @@ -31,8 +31,8 @@ typedef struct {
>  #define cntx2asid(cntx)((cntx) & SATP_ASID_MASK)
>  #define cntx2version(cntx) ((cntx) & ~SATP_ASID_MASK)
>
> -void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
> -  phys_addr_t sz, pgprot_t prot);
> +void __meminit create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa, 
> phys_addr_t sz,
> + pgprot_t prot);
>  #endif /* __ASSEMBLY__ */
>
>  #endif /* _ASM_RISCV_MMU_H */
> diff --git a/arch/riscv/include/asm/pgtable.h 
> b/arch/riscv/include/asm/pgtable.h
> index 58fd7b70b903..7933f493db71 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -162,7 +162,7 @@ struct pt_alloc_ops {
>  #endif
>  };
>
> -extern struct pt_alloc_ops pt_ops __initdata;
> +extern struct pt_alloc_ops pt_ops __meminitdata;
>
>  #ifdef CONFIG_MMU
>  /* Number of PGD entries that a user-mode program can use */
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index c98010ede810..c969427eab88 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -295,7 +295,7 @@ static void __init setup_bootmem(void)
>  }
>
>  #ifdef CONFIG_MMU
> -struct pt_alloc_ops pt_ops __initdata;
> +struct pt_alloc_ops pt_ops __meminitdata;
>
>  pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
>  pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> @@ -357,7 +357,7 @@ static inline pte_t *__init 
> get_pte_virt_fixmap(phys_addr_t pa)
> return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
>  }
>
> -static inline pte_t *__init get_pte_virt_late(phys_addr_t pa)
> +static inline pte_t *__meminit get_pte_virt_late(phys_addr_t pa)
>  {
> return (pte_t *) __va(pa);
>  }
> @@ -376,7 +376,7 @@ static inline phys_addr_t __init 
> alloc_pte_fixmap(uintptr_t va)
> return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
>  }
>
> -static phys_addr_t __init alloc_pte_late(uintptr_t va)
> +static phys_addr_t __meminit alloc_pte_late(uintptr_t va)
>  {
> struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 
> 0);
>
> @@ -384,9 +384,8 @@ static phys_addr_t __init alloc_pte_late(uintptr_t va)
> return __pa((pte_t *)ptdesc_address(ptdesc));
>  }
>
> -static void __init create_pte_mapping(pte_t *ptep,
> - uintptr_t va, phys_addr_t pa,
> - phys_addr_t sz, pgprot_t prot)
> +static void __meminit create_pte_mapping(pte_t *ptep, uintptr_t va, 
> phys_addr_t pa, phys_addr_t sz,
> +pgprot_t prot)
>  {
> uintptr_t pte_idx = pte_index(va);
>
> @@ -440,7 +439,7 @@ static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
> return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
>  }
>
> -static pmd_t *__init get_pmd_virt_late(phys_addr_t pa)
> +static pmd_t *__meminit get_pmd_virt_late(phys_addr_t pa)
>  {
> return (pmd_t *) __va(pa);
>  }
> @@ -457,7 +456,7 @@ static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
> return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
>  }
>
> -static phys_addr_t __init alloc_pmd_late(uintptr_t va)
> +static phys_addr_t __meminit alloc_pmd_late(uintptr_t va)
>  {
> struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 
> 0);
>
> @@ -465,9 +464,9 @@ static phys_addr_t __init alloc_pmd_late(uintptr_t va)
> return __pa((pmd_t *)ptdesc_address(ptdesc));
>  }
>
> -static void __init create_pmd_mapping(pmd_t *pmdp,
> - uintptr_t va, phys_addr_t pa,
> - phys_addr_t sz, pgprot_t prot)
> +static void __meminit create_pmd_mapping(pmd_t *pmdp,
> +uintptr_t va, phys_addr_t pa,
> +phys_addr_t sz, pgprot_t prot)
>  {
> pte_t *ptep;
> phys_addr_t pte_phys;
> @@ -503,7 +502,7 @@ 

[RESEND PATCH v5 7/7] remoteproc: stm32: Add support of an OP-TEE TA to load the firmware

2024-05-21 Thread Arnaud Pouliquen
The new TEE remoteproc device is used to manage remote firmware in a
secure, trusted context. The 'st,stm32mp1-m4-tee' compatibility is
introduced to delegate the loading of the firmware to the trusted
execution context. In such cases, the firmware should be signed and
adhere to the image format defined by the TEE.

Signed-off-by: Arnaud Pouliquen 
---
Update from V4:
- remove hard coded remote proc ID STM32_MP1_M4_PROC_ID, get the
  ID from the DT,
- replace find_loaded_rsc_table by get_loaded_rsc_table.
---
 drivers/remoteproc/stm32_rproc.c | 65 ++--
 1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 8cd838df4e92..f6f748814bf2 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "remoteproc_internal.h"
@@ -257,6 +258,19 @@ static int stm32_rproc_release(struct rproc *rproc)
return 0;
 }
 
+static int stm32_rproc_tee_stop(struct rproc *rproc)
+{
+   int err;
+
+   stm32_rproc_request_shutdown(rproc);
+
+   err = tee_rproc_stop(rproc);
+   if (err)
+   return err;
+
+   return stm32_rproc_release(rproc);
+}
+
 static int stm32_rproc_prepare(struct rproc *rproc)
 {
struct device *dev = rproc->dev.parent;
@@ -693,8 +707,20 @@ static const struct rproc_ops st_rproc_ops = {
.get_boot_addr  = rproc_elf_get_boot_addr,
 };
 
+static const struct rproc_ops st_rproc_tee_ops = {
+   .prepare= stm32_rproc_prepare,
+   .start  = tee_rproc_start,
+   .stop   = stm32_rproc_tee_stop,
+   .kick   = stm32_rproc_kick,
+   .load   = tee_rproc_load_fw,
+   .parse_fw   = tee_rproc_parse_fw,
+   .get_loaded_rsc_table = tee_rproc_get_loaded_rsc_table,
+
+};
+
 static const struct of_device_id stm32_rproc_match[] = {
-   { .compatible = "st,stm32mp1-m4" },
+   {.compatible = "st,stm32mp1-m4",},
+   {.compatible = "st,stm32mp1-m4-tee",},
{},
 };
 MODULE_DEVICE_TABLE(of, stm32_rproc_match);
@@ -853,17 +879,42 @@ static int stm32_rproc_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct stm32_rproc *ddata;
struct device_node *np = dev->of_node;
+   struct tee_rproc *trproc = NULL;
struct rproc *rproc;
unsigned int state;
+   u32 proc_id;
int ret;
 
ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(32));
if (ret)
return ret;
 
-   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata));
-   if (!rproc)
-   return -ENOMEM;
+   if (of_device_is_compatible(np, "st,stm32mp1-m4-tee")) {
+   /*
+* Delegate the firmware management to the secure context.
+* The firmware loaded has to be signed.
+*/
+   ret = of_property_read_u32(np, "st,proc-id", &proc_id);
+   if (ret) {
+   dev_err(dev, "failed to read st,proc-id property\n");
+   return ret;
+   }
+
+   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_tee_ops, NULL, sizeof(*ddata));
+   if (!rproc)
+   return -ENOMEM;
+
+   trproc = tee_rproc_register(dev, rproc, proc_id);
+   if (IS_ERR(trproc)) {
+   dev_err_probe(dev, PTR_ERR(trproc),
+ "signed firmware not supported by TEE\n");
+   return PTR_ERR(trproc);
+   }
+   } else {
+   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata));
+   if (!rproc)
+   return -ENOMEM;
+   }
 
ddata = rproc->priv;
 
@@ -915,6 +966,9 @@ static int stm32_rproc_probe(struct platform_device *pdev)
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
}
+   if (trproc)
+   tee_rproc_unregister(trproc);
+
return ret;
 }
 
@@ -935,6 +989,9 @@ static void stm32_rproc_remove(struct platform_device *pdev)
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
}
+   if (rproc->tee_interface)
+   tee_rproc_unregister(rproc->tee_interface);
+
 }
 
 static int stm32_rproc_suspend(struct device *dev)
-- 
2.25.1




[RESEND PATCH v5 4/7] remoteproc: core introduce rproc_set_rsc_table_on_start function

2024-05-21 Thread Arnaud Pouliquen
Split rproc_start() to prepare the update of the management of
the cache table on start, for the support of the firmware loading
by the TEE interface.
- create rproc_set_rsc_table_on_start() to address the management of
  the cache table in a specific function, as done in
  rproc_reset_rsc_table_on_stop().
- rename rproc_set_rsc_table() to rproc_set_rsc_table_on_attach()
- move rproc_reset_rsc_table_on_stop() to be close to the
  rproc_set_rsc_table_on_start() function

Suggested-by: Mathieu Poirier 
Signed-off-by: Arnaud Pouliquen 
---
 drivers/remoteproc/remoteproc_core.c | 116 ++-
 1 file changed, 62 insertions(+), 54 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index f276956f2c5c..42bca01f3bde 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1264,18 +1264,9 @@ void rproc_resource_cleanup(struct rproc *rproc)
 }
 EXPORT_SYMBOL(rproc_resource_cleanup);
 
-static int rproc_start(struct rproc *rproc, const struct firmware *fw)
+static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct firmware *fw)
 {
struct resource_table *loaded_table;
-   struct device *dev = &rproc->dev;
-   int ret;
-
-   /* load the ELF segments to memory */
-   ret = rproc_load_segments(rproc, fw);
-   if (ret) {
-   dev_err(dev, "Failed to load program segments: %d\n", ret);
-   return ret;
-   }
 
/*
 * The starting device has been given the rproc->cached_table as the
@@ -1291,6 +1282,64 @@ static int rproc_start(struct rproc *rproc, const struct firmware *fw)
rproc->table_ptr = loaded_table;
}
 
+   return 0;
+}
+
+static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
+{
+   /* A resource table was never retrieved, nothing to do here */
+   if (!rproc->table_ptr)
+   return 0;
+
+   /*
+* If a cache table exists the remote processor was started by
+* the remoteproc core.  That cache table should be used for
+* the rest of the shutdown process.
+*/
+   if (rproc->cached_table)
+   goto out;
+
+   /*
+* If we made it here the remote processor was started by another
+* entity and a cache table doesn't exist.  As such make a copy of
+* the resource table currently used by the remote processor and
+* use that for the rest of the shutdown process.  The memory
+* allocated here is free'd in rproc_shutdown().
+*/
+   rproc->cached_table = kmemdup(rproc->table_ptr,
+ rproc->table_sz, GFP_KERNEL);
+   if (!rproc->cached_table)
+   return -ENOMEM;
+
+   /*
+* Since the remote processor is being switched off the clean table
+* won't be needed.  Allocated in rproc_set_rsc_table_on_start().
+*/
+   kfree(rproc->clean_table);
+
+out:
+   /*
+* Use a copy of the resource table for the remainder of the
+* shutdown process.
+*/
+   rproc->table_ptr = rproc->cached_table;
+   return 0;
+}
+
+static int rproc_start(struct rproc *rproc, const struct firmware *fw)
+{
+   struct device *dev = &rproc->dev;
+   int ret;
+
+   /* load the ELF segments to memory */
+   ret = rproc_load_segments(rproc, fw);
+   if (ret) {
+   dev_err(dev, "Failed to load program segments: %d\n", ret);
+   return ret;
+   }
+
+   rproc_set_rsc_table_on_start(rproc, fw);
+
ret = rproc_prepare_subdevices(rproc);
if (ret) {
dev_err(dev, "failed to prepare subdevices for %s: %d\n",
@@ -1450,7 +1499,7 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw)
return ret;
 }
 
-static int rproc_set_rsc_table(struct rproc *rproc)
+static int rproc_set_rsc_table_on_attach(struct rproc *rproc)
 {
struct resource_table *table_ptr;
struct device *dev = &rproc->dev;
@@ -1540,54 +1589,13 @@ static int rproc_reset_rsc_table_on_detach(struct rproc *rproc)
 
/*
 * The clean resource table is no longer needed.  Allocated in
-* rproc_set_rsc_table().
+* rproc_set_rsc_table_on_attach().
 */
kfree(rproc->clean_table);
 
return 0;
 }
 
-static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
-{
-   /* A resource table was never retrieved, nothing to do here */
-   if (!rproc->table_ptr)
-   return 0;
-
-   /*
-* If a cache table exists the remote processor was started by
-* the remoteproc core.  That cache table should be used for
-* the rest of the shutdown process.
-*/
-   if (rproc->cached_table)
-   goto out;
-
-   /*
-* If we made it here the remote processor was started by another
-* entity and a cache table doesn't exist.  As such make a 

[RESEND PATCH v5 2/7] dt-bindings: remoteproc: Add compatibility for TEE support

2024-05-21 Thread Arnaud Pouliquen
The "st,stm32mp1-m4-tee" compatible is used in a system configuration
where the Cortex-M4 firmware is loaded by the Trusted Execution Environment
(TEE).
For instance, this compatible is used in both the Linux and OP-TEE
device trees:
- In OP-TEE, a node is defined in the device tree with the
  st,stm32mp1-m4-tee compatible to support signed remoteproc firmware.
  Based on DT properties, OP-TEE authenticates, loads, starts, and stops
  the firmware.
- On Linux, when the compatible is set, the Cortex-M resets should not
  be declared in the device tree.

Signed-off-by: Arnaud Pouliquen 
Reviewed-by: Rob Herring 
---
 .../bindings/remoteproc/st,stm32-rproc.yaml   | 51 ---
 1 file changed, 43 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
index 370af61d8f28..36ea54016b76 100644
--- a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
@@ -16,7 +16,12 @@ maintainers:
 
 properties:
   compatible:
-const: st,stm32mp1-m4
+enum:
+  - st,stm32mp1-m4
+  - st,stm32mp1-m4-tee
+description:
+  Use "st,stm32mp1-m4" for the Cortex-M4 coprocessor management by non-secure context
+  Use "st,stm32mp1-m4-tee" for the Cortex-M4 coprocessor management by secure context
 
   reg:
 description:
@@ -142,21 +147,41 @@ properties:
 required:
   - compatible
   - reg
-  - resets
 
 allOf:
   - if:
   properties:
-reset-names:
-  not:
-contains:
-  const: hold_boot
+compatible:
+  contains:
+const: st,stm32mp1-m4
 then:
+  if:
+properties:
+  reset-names:
+not:
+  contains:
+const: hold_boot
+  then:
+required:
+  - st,syscfg-holdboot
+  else:
+properties:
+  st,syscfg-holdboot: false
+required:
+  - reset-names
   required:
-- st,syscfg-holdboot
-else:
+- resets
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: st,stm32mp1-m4-tee
+then:
   properties:
 st,syscfg-holdboot: false
+reset-names: false
+resets: false
 
 additionalProperties: false
 
@@ -188,5 +213,15 @@ examples:
   st,syscfg-rsc-tbl = < 0x144 0x>;
   st,syscfg-m4-state = < 0x148 0x>;
 };
+  - |
+#include 
+m4@1000 {
+  compatible = "st,stm32mp1-m4-tee";
+  reg = <0x1000 0x4>,
+<0x3000 0x4>,
+<0x3800 0x1>;
+  st,syscfg-rsc-tbl = < 0x144 0x>;
+  st,syscfg-m4-state = < 0x148 0x>;
+};
 
 ...
-- 
2.25.1




[RESEND PATCH v5 1/7] remoteproc: Add TEE support

2024-05-21 Thread Arnaud Pouliquen
Add a remoteproc TEE (Trusted Execution Environment) driver
that will be probed by the TEE bus. If the associated trusted
application is supported on the secure side, this driver offers a client
interface to load firmware in the secure context.
This firmware can be authenticated by the secure trusted application.

Signed-off-by: Arnaud Pouliquen 
---
update from V4
- fix commit message,
- fix Kconfig typo,
- introduce tee_rproc_release_loaded_rsc_table function to release the
  resource table,
- reorder function variables in declaration in reverse ascending order,
- introduce try_module_get and module_put to prevent the module from being
  removed while in use,
- remove rsc_table field in tee_rproc structure,
- remove tee_rproc_find_loaded_rsc_table as it does not seem to correspond
  to the proper usage regarding the ops definition [1]. The resource table
  is loaded before it is used,
- add __force attribute when casting the type of the resource table to fix
  a build warning.

[1]https://elixir.bootlin.com/linux/latest/source/include/linux/remoteproc.h#L374
---
 drivers/remoteproc/Kconfig  |  10 +
 drivers/remoteproc/Makefile |   1 +
 drivers/remoteproc/tee_remoteproc.c | 429 
 include/linux/remoteproc.h  |   4 +
 include/linux/tee_remoteproc.h  |  99 +++
 5 files changed, 543 insertions(+)
 create mode 100644 drivers/remoteproc/tee_remoteproc.c
 create mode 100644 include/linux/tee_remoteproc.h

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..6c1c07202276 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -365,6 +365,16 @@ config XLNX_R5_REMOTEPROC
 
  It's safe to say N if not interested in using RPU r5f cores.
 
+
+config TEE_REMOTEPROC
+   tristate "Remoteproc support by a TEE application"
+   depends on OPTEE
+   help
+ Support a remote processor with a TEE application. The Trusted
+ Execution Context is responsible for loading the trusted firmware
+ image and managing the remote processor's lifecycle.
+ This can be either built-in or a loadable module.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 91314a9b43ce..fa8daebce277 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_RCAR_REMOTEPROC) += rcar_rproc.o
 obj-$(CONFIG_ST_REMOTEPROC)+= st_remoteproc.o
 obj-$(CONFIG_ST_SLIM_REMOTEPROC)   += st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)  += stm32_rproc.o
+obj-$(CONFIG_TEE_REMOTEPROC)   += tee_remoteproc.o
 obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
diff --git a/drivers/remoteproc/tee_remoteproc.c b/drivers/remoteproc/tee_remoteproc.c
new file mode 100644
index ..f13546628ec9
--- /dev/null
+++ b/drivers/remoteproc/tee_remoteproc.c
@@ -0,0 +1,429 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) STMicroelectronics 2024 - All Rights Reserved
+ * Author: Arnaud Pouliquen 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "remoteproc_internal.h"
+
+#define MAX_TEE_PARAM_ARRY_MEMBER  4
+
+/*
+ * Authentication of the firmware and load in the remote processor memory
+ *
+ * [in]  params[0].value.a: unique 32bit identifier of the remote processor
+ * [in]  params[1].memref:  buffer containing the firmware image
+ */
+#define TA_RPROC_FW_CMD_LOAD_FW 1
+
+/*
+ * Start the remote processor
+ *
+ * [in]  params[0].value.a: unique 32bit identifier of the remote processor
+ */
+#define TA_RPROC_FW_CMD_START_FW   2
+
+/*
+ * Stop the remote processor
+ *
+ * [in]  params[0].value.a: unique 32bit identifier of the remote processor
+ */
+#define TA_RPROC_FW_CMD_STOP_FW 3
+
+/*
+ * Return the address of the resource table, or 0 if not found
+ * No check is done to verify that the address returned is accessible by
+ * the non secure context. If the resource table is loaded in a protected
+ * memory the access by the non secure context will lead to a data abort.
+ *
+ * [in]  params[0].value.a: unique 32bit identifier of the remote processor
+ * [out]  params[1].value.a:   32bit LSB resource table memory address
+ * [out]  params[1].value.b:   32bit MSB resource table memory address
+ * [out]  params[2].value.a:   32bit LSB resource table memory size
+ * [out]  params[2].value.b:   32bit MSB resource table memory size
+ */
+#define TA_RPROC_FW_CMD_GET_RSC_TABLE  4
+
+/*
+ * Return the address of the core dump
+ *
+ * [in]  params[0].value.a: unique 32bit identifier of the remote processor
+ * [out] params[1].memref:  address of the core dump image if it exists,
+ *                          else NULL
+ */
+#define 

[RESEND PATCH v5 0/7] Introduction of a remoteproc tee to load signed firmware

2024-05-21 Thread Arnaud Pouliquen
Main updates from the previous version [1]:
--

1) use rproc->table_ptr as the unique reference to the resource table
 --> update remoteproc_core.c to manage the resource table based on the
 new rproc->tee_interface field:
 - on start, get the resource table address from the TEE remoteproc
   instead of finding it in the firmware (ops choice to be confirmed)
 - on stop, unmap the resource table before updating the
   rproc->table_ptr pointer.

2) retrieve the TEE rproc Identifier from the device tree instead of
   hardcoding it
 -->  Add a new "st,proc-id" property in device tree.

More details on the updates are listed in the commit messages.

[1] 
https://lore.kernel.org/linux-arm-kernel/20240115135249.296822-1-arnaud.pouliq...@foss.st.com/T/#m9ebb2e8f6d5e90f055827e4f227ce0877bc6d761

base-commit: c8d8f841e95bcc07ac8c5621fc171a24f1fd5cdb

Description of the feature:
--
This series proposes the implementation of a remoteproc tee driver to
communicate with a TEE trusted application responsible for authenticating
and loading the remoteproc firmware image in an Arm secure context.

1) Principle:

The remoteproc tee driver provides services to communicate with the OP-TEE
trusted application running in the Trusted Execution Environment (TEE).
The trusted application in TEE manages the remote processor lifecycle:

- authenticating and loading firmware images,
- isolating and securing the remote processor memories,
- supporting multi-firmware (e.g., TF-M + Zephyr on a Cortex-M33),
- managing the start and stop of the firmware by the TEE.

2) Format of the signed image:

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/ta/remoteproc/src/remoteproc_core.c#L18-L57

3) OP-TEE trusted application API:

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/ta/remoteproc/include/ta_remoteproc.h

4) OP-TEE signature script

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/scripts/sign_rproc_fw.py

Example of usage:
sign_rproc_fw.py --in  --in  --out  --key 
${OP-TEE_PATH}/keys/default.pem


5) Impact on User space Application

No sysfs impact. The user only needs to provide the signed firmware image
instead of the ELF image.
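
As an illustration of the point above, here is a minimal user-space sketch
that selects a firmware image and starts the remote processor through the
existing remoteproc sysfs attributes ("firmware" and "state"). The helper
names write_attr() and boot_signed_firmware(), the directory passed in
(for example /sys/class/remoteproc/remoteproc0) and the firmware file name
are assumptions made for this example, not part of the series. The only
difference from the non-TEE case would be the file name, which designates
the signed image.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on any error.
 */
static int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: select the firmware image, then start the remote
 * processor. rproc_dir is the remoteproc instance directory and fw_name
 * is the image name, looked up by the kernel firmware loader.
 */
static int boot_signed_firmware(const char *rproc_dir, const char *fw_name)
{
	char path[256];
	int n;

	n = snprintf(path, sizeof(path), "%s/firmware", rproc_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (write_attr(path, fw_name))
		return -1;

	n = snprintf(path, sizeof(path), "%s/state", rproc_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	return write_attr(path, "start");
}
```

A caller would pass the instance directory and the signed image name, e.g.
boot_signed_firmware("/sys/class/remoteproc/remoteproc0", "fw_signed.bin"),
where both strings are placeholders.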


For more information about the implementation, a presentation is available here
(note that the format of the signed image has evolved between the presentation
and the integration in OP-TEE).

https://resources.linaro.org/en/resource/6c5bGvZwUAjX56fvxthxds

Arnaud Pouliquen (7):
  remoteproc: Add TEE support
  dt-bindings: remoteproc: Add compatibility for TEE support
  dt-bindings: remoteproc: Add processor identifier property
  remoteproc: core introduce rproc_set_rsc_table_on_start function
  remoteproc: core: support of the tee interface
  remoteproc: stm32: Create sub-functions to request shutdown and
release
  remoteproc: stm32: Add support of an OP-TEE TA to load the firmware

 .../bindings/remoteproc/st,stm32-rproc.yaml   |  58 ++-
 drivers/remoteproc/Kconfig|  10 +
 drivers/remoteproc/Makefile   |   1 +
 drivers/remoteproc/remoteproc_core.c  | 135 +++---
 drivers/remoteproc/stm32_rproc.c  | 149 --
 drivers/remoteproc/tee_remoteproc.c   | 429 ++
 include/linux/remoteproc.h|   4 +
 include/linux/tee_remoteproc.h|  99 
 8 files changed, 784 insertions(+), 101 deletions(-)
 create mode 100644 drivers/remoteproc/tee_remoteproc.c
 create mode 100644 include/linux/tee_remoteproc.h

-- 
2.25.1




[RESEND PATCH v5 6/7] remoteproc: stm32: Create sub-functions to request shutdown and release

2024-05-21 Thread Arnaud Pouliquen
To prepare for the support of TEE remoteproc, create sub-functions
that can be used in both cases, with and without remoteproc TEE support.

Signed-off-by: Arnaud Pouliquen 
---
 drivers/remoteproc/stm32_rproc.c | 84 +++-
 1 file changed, 51 insertions(+), 33 deletions(-)

diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 88623df7d0c3..8cd838df4e92 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -209,6 +209,54 @@ static int stm32_rproc_mbox_idx(struct rproc *rproc, const unsigned char *name)
return -EINVAL;
 }
 
+static void stm32_rproc_request_shutdown(struct rproc *rproc)
+{
+   struct stm32_rproc *ddata = rproc->priv;
+   int err, dummy_data, idx;
+
+   /* Request shutdown of the remote processor */
+   if (rproc->state != RPROC_OFFLINE && rproc->state != RPROC_CRASHED) {
+   idx = stm32_rproc_mbox_idx(rproc, STM32_MBX_SHUTDOWN);
+   if (idx >= 0 && ddata->mb[idx].chan) {
+   /* Dummy data is sent so that the transmit can block. */
+   err = mbox_send_message(ddata->mb[idx].chan,
+   &dummy_data);
+   if (err < 0)
+   dev_warn(&rproc->dev, "warning: remote FW shutdown without ack\n");
+   }
+   }
+}
+
+static int stm32_rproc_release(struct rproc *rproc)
+{
+   struct stm32_rproc *ddata = rproc->priv;
+   unsigned int err = 0;
+
+   /* To allow platform Standby power mode, set remote proc Deep Sleep. */
+   if (ddata->pdds.map) {
+   err = regmap_update_bits(ddata->pdds.map, ddata->pdds.reg,
+ddata->pdds.mask, 1);
+   if (err) {
+   dev_err(&rproc->dev, "failed to set pdds\n");
+   return err;
+   }
+   }
+
+   /* Update coprocessor state to OFF if available. */
+   if (ddata->m4_state.map) {
+   err = regmap_update_bits(ddata->m4_state.map,
+ddata->m4_state.reg,
+ddata->m4_state.mask,
+M4_STATE_OFF);
+   if (err) {
+   dev_err(&rproc->dev, "failed to set copro state\n");
+   return err;
+   }
+   }
+
+   return 0;
+}
+
 static int stm32_rproc_prepare(struct rproc *rproc)
 {
struct device *dev = rproc->dev.parent;
@@ -519,17 +567,9 @@ static int stm32_rproc_detach(struct rproc *rproc)
 static int stm32_rproc_stop(struct rproc *rproc)
 {
struct stm32_rproc *ddata = rproc->priv;
-   int err, idx;
+   int err;
 
-   /* request shutdown of the remote processor */
-   if (rproc->state != RPROC_OFFLINE && rproc->state != RPROC_CRASHED) {
-   idx = stm32_rproc_mbox_idx(rproc, STM32_MBX_SHUTDOWN);
-   if (idx >= 0 && ddata->mb[idx].chan) {
-   err = mbox_send_message(ddata->mb[idx].chan, "detach");
-   if (err < 0)
-   dev_warn(&rproc->dev, "warning: remote FW shutdown without ack\n");
-   }
-   }
+   stm32_rproc_request_shutdown(rproc);
 
err = stm32_rproc_set_hold_boot(rproc, true);
if (err)
@@ -541,29 +581,7 @@ static int stm32_rproc_stop(struct rproc *rproc)
return err;
}
 
-   /* to allow platform Standby power mode, set remote proc Deep Sleep */
-   if (ddata->pdds.map) {
-   err = regmap_update_bits(ddata->pdds.map, ddata->pdds.reg,
-ddata->pdds.mask, 1);
-   if (err) {
-   dev_err(&rproc->dev, "failed to set pdds\n");
-   return err;
-   }
-   }
-
-   /* update coprocessor state to OFF if available */
-   if (ddata->m4_state.map) {
-   err = regmap_update_bits(ddata->m4_state.map,
-ddata->m4_state.reg,
-ddata->m4_state.mask,
-M4_STATE_OFF);
-   if (err) {
-   dev_err(&rproc->dev, "failed to set copro state\n");
-   return err;
-   }
-   }
-
-   return 0;
+   return stm32_rproc_release(rproc);
 }
 
 static void stm32_rproc_kick(struct rproc *rproc, int vqid)
-- 
2.25.1




[RESEND PATCH v5 5/7] remoteproc: core: support of the tee interface

2024-05-21 Thread Arnaud Pouliquen
1) on start:
- When the TEE loader is used, the resource table is loaded by an external
  entity. In that case the resource table address is not found in the
  firmware but is provided by the TEE remoteproc framework.
  Use rproc_get_loaded_rsc_table() instead of rproc_find_loaded_rsc_table().
- test that rproc->cached_table is not NULL before performing the memcpy

2) on stop:
The use of the cached_table seems mandatory:
- during the recovery sequence, to have a snapshot of the resources used
  from the resource table,
- on stop, to allow the deinitialization of resources after the remote
  processor has been shut down.
However, if the TEE interface is being used, we first need to unmap the
table_ptr before setting it to rproc->cached_table.
The update of rproc->table_ptr to rproc->cached_table is performed in
tee_remoteproc.

Signed-off-by: Arnaud Pouliquen 
---
 drivers/remoteproc/remoteproc_core.c | 31 +---
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 42bca01f3bde..3a642151c983 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1267,6 +1267,7 @@ EXPORT_SYMBOL(rproc_resource_cleanup);
 static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct firmware *fw)
 {
struct resource_table *loaded_table;
+   struct device *dev = &rproc->dev;
 
/*
 * The starting device has been given the rproc->cached_table as the
@@ -1276,12 +1277,21 @@ static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct firmwa
 * this information to device memory. We also update the table_ptr so
 * that any subsequent changes will be applied to the loaded version.
 */
-   loaded_table = rproc_find_loaded_rsc_table(rproc, fw);
-   if (loaded_table) {
-   memcpy(loaded_table, rproc->cached_table, rproc->table_sz);
-   rproc->table_ptr = loaded_table;
+   if (rproc->tee_interface) {
+   loaded_table = rproc_get_loaded_rsc_table(rproc, &rproc->table_sz);
+   if (IS_ERR(loaded_table)) {
+   dev_err(dev, "can't get resource table\n");
+   return PTR_ERR(loaded_table);
+   }
+   } else {
+   loaded_table = rproc_find_loaded_rsc_table(rproc, fw);
}
 
+   if (loaded_table && rproc->cached_table)
+   memcpy(loaded_table, rproc->cached_table, rproc->table_sz);
+
+   rproc->table_ptr = loaded_table;
+
return 0;
 }
 
@@ -1318,11 +1328,16 @@ static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
kfree(rproc->clean_table);
 
 out:
-   /*
-* Use a copy of the resource table for the remainder of the
-* shutdown process.
+   /* If the remoteproc_tee interface is used, the resource table must first be unmapped
+* before updating the rproc->table_ptr reference.
 */
-   rproc->table_ptr = rproc->cached_table;
+   if (!rproc->tee_interface) {
+   /*
+* Use a copy of the resource table for the remainder of the
+* shutdown process.
+*/
+   rproc->table_ptr = rproc->cached_table;
+   }
return 0;
 }
 
-- 
2.25.1




[RESEND PATCH v5 3/7] dt-bindings: remoteproc: Add processor identifier property

2024-05-21 Thread Arnaud Pouliquen
Add the "st,proc-id" property to identify the remote processor.
This property defines a unique ID, common to Linux, U-Boot and
OP-TEE, that identifies a coprocessor.
This ID will be used in requests to the OP-TEE remoteproc Trusted Application
to specify the remote processor.

Signed-off-by: Arnaud Pouliquen 
---
 .../devicetree/bindings/remoteproc/st,stm32-rproc.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
index 36ea54016b76..409123cd4667 100644
--- a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
@@ -48,6 +48,10 @@ properties:
   - description: The offset of the hold boot setting register
   - description: The field mask of the hold boot
 
+  st,proc-id:
+description: remote processor identifier
+$ref: /schemas/types.yaml#/definitions/uint32
+
   st,syscfg-tz:
 deprecated: true
 description:
@@ -182,6 +186,8 @@ allOf:
 st,syscfg-holdboot: false
 reset-names: false
 resets: false
+  required:
+- st,proc-id
 
 additionalProperties: false
 
@@ -220,6 +226,7 @@ examples:
   reg = <0x1000 0x4>,
 <0x3000 0x4>,
 <0x3800 0x1>;
+  st,proc-id = <0>;
   st,syscfg-rsc-tbl = < 0x144 0x>;
   st,syscfg-m4-state = < 0x148 0x>;
 };
-- 
2.25.1




Re: [PATCH v3 1/9] riscv: mm: Properly forward vmemmap_populate() altmap parameter

2024-05-21 Thread Alexandre Ghiti
Hi Björn,

On Tue, May 21, 2024 at 1:48 PM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> Make sure that the altmap parameter is properly passed on to
> vmemmap_populate_hugepages().
>
> Signed-off-by: Björn Töpel 
> ---
>  arch/riscv/mm/init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 2574f6a3b0e7..b66f846e7634 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1434,7 +1434,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  * memory hotplug, we are not able to update all the page tables with
>  * the new PMDs.
>  */
> -   return vmemmap_populate_hugepages(start, end, node, NULL);
> +   return vmemmap_populate_hugepages(start, end, node, altmap);
>  }
>  #endif
>
> --
> 2.40.1
>

You can add:

Reviewed-by: Alexandre Ghiti 

Thanks,

Alex



Re: [PATCH v5 2/7] dt-bindings: remoteproc: Add compatibility for TEE support

2024-05-21 Thread Arnaud POULIQUEN


On 5/21/24 11:24, Krzysztof Kozlowski wrote:
> On 21/05/2024 10:09, Arnaud Pouliquen wrote:
>> The "st,stm32mp1-m4-tee" compatible is utilized in a system configuration
>> where the Cortex-M4 firmware is loaded by the Trusted execution Environment
>> (TEE).
>> For instance, this compatible is used in both the Linux and OP-TEE
>> device-tree:
>> - In OP-TEE, a node is defined in the device tree with the
>>   st,stm32mp1-m4-tee to support signed remoteproc firmware.
>>   Based on DT properties, OP-TEE authenticates, loads, starts, and stops
>>   the firmware.
>> - On Linux, when the compatibility is set, the Cortex-M resets should not
>>   be declared in the device tree.
>>
> 
> Not tested.
> 
> Please use scripts/get_maintainer.pl to get a list of necessary people
> and lists to CC. It might happen, that command when run on an older
> kernel, gives you outdated entries. Therefore please be sure you base
> your patches on recent Linux kernel.
> 
> Tools like b4 or scripts/get_maintainer.pl provide you proper list of
> people, so fix your workflow. Tools might also fail if you work on some
> ancient tree (don't, instead use mainline), work on fork of kernel
> (don't, instead use mainline) or you ignore some maintainers (really
> don't). Just use b4 and everything should be fine, although remember
> about `b4 prep --auto-to-cc` if you added new patches to the patchset.
> 
> You missed at least devicetree list (maybe more), so this won't be
> tested by automated tooling. Performing review on untested code might be
> a waste of time, thus I will skip this patch entirely till you follow
> the process allowing the patch to be tested.
> 
> Please kindly resend and include all necessary To/Cc entries.

I apologize for this oversight; I will resend the series with the
missing To and Cc entries added.

Thanks!
Arnaud

> 
> Best regards,
> Krzysztof
> 



Re: [PATCH 1/1] x86/vector: Fix vector leak during CPU offline

2024-05-21 Thread Thomas Gleixner
On Wed, May 15 2024 at 12:51, Dongli Zhang wrote:
> On 5/13/24 3:46 PM, Thomas Gleixner wrote:
>> So yes, moving the invocation of irq_force_complete_move() before the
>> irq_needs_fixup() call makes sense, but it wants this to actually work
>> correctly:
>> @@ -1097,10 +1098,11 @@ void irq_force_complete_move(struct irq_
>>  goto unlock;
>>  
>>  /*
>> - * If prev_vector is empty, no action required.
>> + * If prev_vector is empty or the descriptor was previously
>> + * not on the outgoing CPU no action required.
>>   */
>>  vector = apicd->prev_vector;
>> -if (!vector)
>> +if (!vector || apicd->prev_cpu != smp_processor_id())
>>  goto unlock;
>>  
>
> The above may not work. migrate_one_irq() relies on irq_force_complete_move() to
> always reclaim the apicd->prev_vector. Otherwise, the call of
> irq_do_set_affinity() later may return -EBUSY.

You're right. But that still can be handled in irq_force_complete_move()
with a single unconditional invocation in migrate_one_irq():

cpu = smp_processor_id();
if (!vector || (apicd->cur_cpu != cpu && apicd->prev_cpu != cpu))
goto unlock;

because there are only two cases when a cleanup is required:

   1) The outgoing CPU is the current target

   2) The outgoing CPU was the previous target

No?
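
To make the two cases concrete, here is a minimal userspace sketch of that
check. It assumes a simplified stand-in for the per-interrupt vector data:
struct apicd_model and needs_forced_cleanup() are hypothetical names used
only for illustration, and the field names simply follow the snippet above
rather than the actual kernel structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vector bookkeeping discussed above. */
struct apicd_model {
	unsigned int cur_cpu;     /* current target CPU */
	unsigned int prev_cpu;    /* previous target CPU */
	unsigned int prev_vector; /* 0 when no move is pending */
};

/*
 * Return true when a forced cleanup is required on the outgoing CPU:
 * a previous vector exists and the outgoing CPU is either the current
 * target or the previous target.
 */
static bool needs_forced_cleanup(const struct apicd_model *d,
				 unsigned int outgoing_cpu)
{
	if (!d->prev_vector)
		return false;

	return d->cur_cpu == outgoing_cpu || d->prev_cpu == outgoing_cpu;
}
```

With prev_vector set, cur_cpu = 2 and prev_cpu = 5, the check is true for
outgoing CPUs 2 and 5 and false for any other CPU; it is always false once
prev_vector is 0.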

Thanks,

tglx



[PATCH v3 9/9] riscv: mm: Add support for ZONE_DEVICE

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

ZONE_DEVICE pages need DEVMAP PTE support to function
(ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit
in the PTE for the DEVMAP mark, add the corresponding helpers, and enable
ARCH_HAS_PTE_DEVMAP for riscv64.
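
For readers unfamiliar with the RSW bits, the following userspace sketch
models the bit manipulation the helpers perform. The pte_model_t type and
the MODEL_/model_ prefixed names are hypothetical stand-ins for pte_t and
the kernel helpers. Only the bit positions (bit 8 for the special mark,
bit 9 for the devmap mark) are taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t. */
typedef struct {
	uint64_t pte;
} pte_model_t;

#define MODEL_PAGE_SPECIAL (UINT64_C(1) << 8) /* first RSW bit, already in use */
#define MODEL_PAGE_DEVMAP  (UINT64_C(1) << 9) /* second RSW bit, devmap mark */

/* Non-zero when the devmap mark is set in the entry. */
static inline int model_pte_devmap(pte_model_t pte)
{
	return (pte.pte & MODEL_PAGE_DEVMAP) != 0;
}

/* Return a copy of the entry with the devmap mark set. */
static inline pte_model_t model_pte_mkdevmap(pte_model_t pte)
{
	pte.pte |= MODEL_PAGE_DEVMAP;
	return pte;
}
```

Setting the devmap mark leaves the special bit and every other bit of the
entry untouched, which is why both marks can share the two-bit RSW field.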

Signed-off-by: Björn Töpel 
---
 arch/riscv/Kconfig|  1 +
 arch/riscv/include/asm/pgtable-64.h   | 20 
 arch/riscv/include/asm/pgtable-bits.h |  1 +
 arch/riscv/include/asm/pgtable.h  | 17 +
 4 files changed, 39 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 2724dc2af29f..0b74698c63c7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -36,6 +36,7 @@ config RISCV
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PMEM_API
select ARCH_HAS_PREPARE_SYNC_CORE_CMD
+   select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP if MMU
select ARCH_HAS_SET_MEMORY if MMU
diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index 221a5c1ee287..c67a9bbfd010 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -400,4 +400,24 @@ static inline struct page *pgd_page(pgd_t pgd)
 #define p4d_offset p4d_offset
 p4d_t *p4d_offset(pgd_t *pgd, unsigned long address);
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static inline int pte_devmap(pte_t pte);
+static inline pte_t pmd_pte(pmd_t pmd);
+
+static inline int pmd_devmap(pmd_t pmd)
+{
+   return pte_devmap(pmd_pte(pmd));
+}
+
+static inline int pud_devmap(pud_t pud)
+{
+   return 0;
+}
+
+static inline int pgd_devmap(pgd_t pgd)
+{
+   return 0;
+}
+#endif
+
 #endif /* _ASM_RISCV_PGTABLE_64_H */
diff --git a/arch/riscv/include/asm/pgtable-bits.h 
b/arch/riscv/include/asm/pgtable-bits.h
index 179bd4afece4..a8f5205cea54 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -19,6 +19,7 @@
 #define _PAGE_SOFT  (3 << 8)/* Reserved for software */
 
 #define _PAGE_SPECIAL   (1 << 8)/* RSW: 0x1 */
+#define _PAGE_DEVMAP    (1 << 9)    /* RSW, devmap */
 #define _PAGE_TABLE _PAGE_PRESENT
 
 /*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7933f493db71..02fadc276064 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -387,6 +387,13 @@ static inline int pte_special(pte_t pte)
return pte_val(pte) & _PAGE_SPECIAL;
 }
 
+#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
+static inline int pte_devmap(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_DEVMAP;
+}
+#endif
+
 /* static inline pte_t pte_rdprotect(pte_t pte) */
 
 static inline pte_t pte_wrprotect(pte_t pte)
@@ -428,6 +435,11 @@ static inline pte_t pte_mkspecial(pte_t pte)
return __pte(pte_val(pte) | _PAGE_SPECIAL);
 }
 
+static inline pte_t pte_mkdevmap(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_DEVMAP);
+}
+
 static inline pte_t pte_mkhuge(pte_t pte)
 {
return pte;
@@ -711,6 +723,11 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
return pte_pmd(pte_mkdirty(pmd_pte(pmd)));
 }
 
+static inline pmd_t pmd_mkdevmap(pmd_t pmd)
+{
+   return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
+}
+
 static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
pmd_t *pmdp, pmd_t pmd)
 {
-- 
2.40.1
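
A small user-space model of the bit layout may help when reading the pgtable-bits.h hunk. This is a sketch under stated assumptions: the three bit values are copied from the hunk, while `pte_model_t`, `model_mkdevmap()` and `model_devmap()` are hypothetical names that only imitate `pte_mkdevmap()` and `pte_devmap()` on a raw 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the pgtable-bits.h hunk in this patch. */
#define _PAGE_SOFT	(3 << 8)	/* both RSW bits */
#define _PAGE_SPECIAL	(1 << 8)
#define _PAGE_DEVMAP	(1 << 9)

/* The two software bits are distinct and together fill the RSW field. */
static_assert((_PAGE_SPECIAL & _PAGE_DEVMAP) == 0, "bits must not overlap");
static_assert((_PAGE_SPECIAL | _PAGE_DEVMAP) == _PAGE_SOFT, "RSW fully used");

/* Hypothetical model of pte_t: just the raw 64-bit entry. */
typedef struct { uint64_t val; } pte_model_t;

/* Set the devmap mark, leaving all other bits untouched. */
static pte_model_t model_mkdevmap(pte_model_t pte)
{
	pte.val |= _PAGE_DEVMAP;
	return pte;
}

/* Report whether the devmap mark is set. */
static int model_devmap(pte_model_t pte)
{
	return (pte.val & _PAGE_DEVMAP) != 0;
}
```

With _PAGE_SPECIAL already on bit 8, bit 9 is the only RSW bit left, which is what the static assertions check.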




[PATCH v3 8/9] virtio-mem: Enable virtio-mem for RISC-V

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

Now that RISC-V has memory hotplugging support, virtio-mem can be used
on the platform.

Acked-by: David Hildenbrand 
Signed-off-by: Björn Töpel 
---
 drivers/virtio/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index c17193544268..4e5cebf1b82a 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -122,7 +122,7 @@ config VIRTIO_BALLOON
 
 config VIRTIO_MEM
tristate "Virtio mem driver"
-   depends on X86_64 || ARM64
+   depends on X86_64 || ARM64 || RISCV
depends on VIRTIO
depends on MEMORY_HOTPLUG
depends on MEMORY_HOTREMOVE
-- 
2.40.1




[PATCH v3 7/9] riscv: Enable memory hotplugging for RISC-V

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

Enable ARCH_ENABLE_MEMORY_HOTPLUG and ARCH_ENABLE_MEMORY_HOTREMOVE for
RISC-V.

Signed-off-by: Björn Töpel 
---
 arch/riscv/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index fe5281398543..2724dc2af29f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -16,6 +16,8 @@ config RISCV
select ACPI_REDUCED_HARDWARE_ONLY if ACPI
select ARCH_DMA_DEFAULT_COHERENT
select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION
+   select ARCH_ENABLE_MEMORY_HOTPLUG if SPARSEMEM_VMEMMAP && 64BIT && MMU
+   select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG
select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
select ARCH_HAS_BINFMT_FLAT
-- 
2.40.1




[PATCH v3 6/9] riscv: mm: Take memory hotplug read-lock during kernel page table dump

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

During memory hot remove, the ptdump functionality can end up touching
stale data. Avoid any potential crashes (or worse), by holding the
memory hotplug read-lock while traversing the page table.

This change is analogous to arm64's commit bf2b59f60ee1 ("arm64/mm:
Hold memory hotplug lock while walking for kernel page table dump").

Reviewed-by: David Hildenbrand 
Reviewed-by: Oscar Salvador 
Signed-off-by: Björn Töpel 
---
 arch/riscv/mm/ptdump.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/riscv/mm/ptdump.c b/arch/riscv/mm/ptdump.c
index 1289cc6d3700..9d5f657a251b 100644
--- a/arch/riscv/mm/ptdump.c
+++ b/arch/riscv/mm/ptdump.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -370,7 +371,9 @@ bool ptdump_check_wx(void)
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
+   get_online_mems();
ptdump_walk(m, m->private);
+   put_online_mems();
 
return 0;
 }
-- 
2.40.1




[PATCH v3 5/9] riscv: mm: Add memory hotplugging support

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

For an architecture to support memory hotplugging, a couple of
callbacks need to be implemented:

 arch_add_memory()
  This callback is responsible for adding the physical memory into the
  direct map, and for calling into the memory hotplugging generic code
  via __add_pages(), which adds the corresponding struct page entries
  and updates the vmemmap mapping.

 arch_remove_memory()
  This is the inverse of the callback above.

 vmemmap_free()
  This function tears down the vmemmap mappings (if
  CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
  backing vmemmap pages. Note that for persistent memory, an
  alternative allocator for the backing pages can be used: the
  vmem_altmap. This means that when the backing pages are cleared,
  extra care is needed so that the correct deallocation method is
  used.

 arch_get_mappable_range()
  This function returns the PA range that the direct map can map.
  Used by the MHP internals for sanity checks.

The page table unmap/teardown functions are heavily based on code from
the x86 tree. The same remove_pgd_mapping() function is used in both
vmemmap_free() and arch_remove_memory(), but in the latter function
the backing pages are not removed.

Signed-off-by: Björn Töpel 
---
 arch/riscv/mm/init.c | 261 +++
 1 file changed, 261 insertions(+)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 6f72b0b2b854..6693b742bf2f 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1493,3 +1493,264 @@ void __init pgtable_cache_init(void)
}
 }
 #endif
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static void __meminit free_pagetable(struct page *page, int order)
+{
+   unsigned int nr_pages = 1 << order;
+
+   /*
+* vmemmap/direct page tables can be reserved, if added at
+* boot.
+*/
+   if (PageReserved(page)) {
+   __ClearPageReserved(page);
+   while (nr_pages--)
+   free_reserved_page(page++);
+   return;
+   }
+
+   free_pages((unsigned long)page_address(page), order);
+}
+
+static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
+{
+   pte_t *pte;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PTE; i++) {
+   pte = pte_start + i;
+   if (!pte_none(*pte))
+   return;
+   }
+
+   free_pagetable(pmd_page(*pmd), 0);
+   pmd_clear(pmd);
+}
+
+static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)
+{
+   pmd_t *pmd;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PMD; i++) {
+   pmd = pmd_start + i;
+   if (!pmd_none(*pmd))
+   return;
+   }
+
+   free_pagetable(pud_page(*pud), 0);
+   pud_clear(pud);
+}
+
+static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
+{
+   pud_t *pud;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PUD; i++) {
+   pud = pud_start + i;
+   if (!pud_none(*pud))
+   return;
+   }
+
+   free_pagetable(p4d_page(*p4d), 0);
+   p4d_clear(p4d);
+}
+
+static void __meminit free_vmemmap_storage(struct page *page, size_t size,
+  struct vmem_altmap *altmap)
+{
+   if (altmap)
+   vmem_altmap_free(altmap, size >> PAGE_SHIFT);
+   else
+   free_pagetable(page, get_order(size));
+}
+
+static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long addr, 
unsigned long end,
+bool is_vmemmap, struct vmem_altmap 
*altmap)
+{
+   unsigned long next;
+   pte_t *ptep, pte;
+
+   for (; addr < end; addr = next) {
+   next = (addr + PAGE_SIZE) & PAGE_MASK;
+   if (next > end)
+   next = end;
+
+   ptep = pte_base + pte_index(addr);
+   pte = READ_ONCE(*ptep);
+
+   if (!pte_present(*ptep))
+   continue;
+
+   pte_clear(&init_mm, addr, ptep);
+   if (is_vmemmap)
+   free_vmemmap_storage(pte_page(pte), PAGE_SIZE, altmap);
+   }
+}
+
+static void __meminit remove_pmd_mapping(pmd_t *pmd_base, unsigned long addr, 
unsigned long end,
+bool is_vmemmap, struct vmem_altmap 
*altmap)
+{
+   unsigned long next;
+   pte_t *pte_base;
+   pmd_t *pmdp, pmd;
+
+   for (; addr < end; addr = next) {
+   next = pmd_addr_end(addr, end);
+   pmdp = pmd_base + pmd_index(addr);
+   pmd = READ_ONCE(*pmdp);
+
+   if (!pmd_present(pmd))
+   continue;
+
+   if (pmd_leaf(pmd)) {
+   pmd_clear(pmdp);
+   if (is_vmemmap)
+   free_vmemmap_storage(pmd_page(pmd), PMD_SIZE, 
altmap);
+   continue;
+   

[PATCH v3 4/9] riscv: mm: Refactor create_linear_mapping_range() for memory hot add

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

Add a parameter to the direct map setup function, so it can be used in
arch_add_memory() later.

Reviewed-by: Alexandre Ghiti 
Reviewed-by: David Hildenbrand 
Reviewed-by: Oscar Salvador 
Signed-off-by: Björn Töpel 
---
 arch/riscv/mm/init.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index c969427eab88..6f72b0b2b854 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1227,7 +1227,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 }
 
 static void __meminit create_linear_mapping_range(phys_addr_t start, 
phys_addr_t end,
- uintptr_t fixed_map_size)
+ uintptr_t fixed_map_size, 
const pgprot_t *pgprot)
 {
phys_addr_t pa;
uintptr_t va, map_size;
@@ -1238,7 +1238,7 @@ static void __meminit 
create_linear_mapping_range(phys_addr_t start, phys_addr_t
best_map_size(pa, va, end - pa);
 
create_pgd_mapping(swapper_pg_dir, va, pa, map_size,
-  pgprot_from_va(va));
+  pgprot ? *pgprot : pgprot_from_va(va));
}
 }
 
@@ -1282,22 +1282,19 @@ static void __init 
create_linear_mapping_page_table(void)
if (end >= __pa(PAGE_OFFSET) + memory_limit)
end = __pa(PAGE_OFFSET) + memory_limit;
 
-   create_linear_mapping_range(start, end, 0);
+   create_linear_mapping_range(start, end, 0, NULL);
}
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
-   create_linear_mapping_range(ktext_start, ktext_start + ktext_size, 0);
-   create_linear_mapping_range(krodata_start,
-   krodata_start + krodata_size, 0);
+   create_linear_mapping_range(ktext_start, ktext_start + ktext_size, 0, 
NULL);
+   create_linear_mapping_range(krodata_start, krodata_start + 
krodata_size, 0, NULL);
 
memblock_clear_nomap(ktext_start,  ktext_size);
memblock_clear_nomap(krodata_start, krodata_size);
 #endif
 
 #ifdef CONFIG_KFENCE
-   create_linear_mapping_range(kfence_pool,
-   kfence_pool + KFENCE_POOL_SIZE,
-   PAGE_SIZE);
+   create_linear_mapping_range(kfence_pool, kfence_pool + 
KFENCE_POOL_SIZE, PAGE_SIZE, NULL);
 
memblock_clear_nomap(kfence_pool, KFENCE_POOL_SIZE);
 #endif
-- 
2.40.1
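
The new parameter follows the usual "optional override" pattern: NULL keeps the address-derived default, a non-NULL pointer wins. The sketch below shows only that selection logic in user space. All names ending in `_sketch`, the two protection values, and the placeholder policy inside `pgprot_from_va_sketch()` are assumptions made for illustration; they are not the kernel's `pgprot_t` or `pgprot_from_va()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pgprot_t. */
typedef struct { unsigned long bits; } pgprot_sketch_t;

#define PROT_RO_SKETCH	0x1UL
#define PROT_RW_SKETCH	0x3UL

/* Placeholder policy standing in for pgprot_from_va(): low addresses are read-only. */
static pgprot_sketch_t pgprot_from_va_sketch(unsigned long va)
{
	pgprot_sketch_t prot = { va < 0x100000UL ? PROT_RO_SKETCH : PROT_RW_SKETCH };

	return prot;
}

/* Same expression as in the patch: use the override when one is given. */
static pgprot_sketch_t pick_pgprot(unsigned long va, const pgprot_sketch_t *pgprot)
{
	return pgprot ? *pgprot : pgprot_from_va_sketch(va);
}

static void pick_pgprot_example(void)
{
	pgprot_sketch_t rw = { PROT_RW_SKETCH };

	/* Existing boot-time callers pass NULL and keep the old behaviour. */
	assert(pick_pgprot(0x1000UL, NULL).bits == PROT_RO_SKETCH);
	/* A caller that supplies its own protection gets exactly that. */
	assert(pick_pgprot(0x1000UL, &rw).bits == PROT_RW_SKETCH);
}
```

Passing a pointer rather than a value lets all existing call sites stay unchanged apart from the extra NULL argument.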




Re: [PATCHv6 9/9] man2: Add uretprobe syscall page

2024-05-21 Thread Jiri Olsa
On Tue, May 21, 2024 at 01:36:25PM +0200, Alejandro Colomar wrote:
> Hi Jiri,
> 
> On Tue, May 21, 2024 at 12:48:25PM GMT, Jiri Olsa wrote:
> > Adding man page for new uretprobe syscall.
> > 
> > Signed-off-by: Jiri Olsa 
> > ---
> >  man2/uretprobe.2 | 50 
> >  1 file changed, 50 insertions(+)
> >  create mode 100644 man2/uretprobe.2
> > 
> > diff --git a/man2/uretprobe.2 b/man2/uretprobe.2
> > new file mode 100644
> > index ..690fe3b1a44f
> > --- /dev/null
> > +++ b/man2/uretprobe.2
> > @@ -0,0 +1,50 @@
> > +.\" Copyright (C) 2024, Jiri Olsa 
> > +.\"
> > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > +.\"
> > +.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
> > +.SH NAME
> > +uretprobe \- execute pending return uprobes
> > +.SH SYNOPSIS
> > +.nf
> > +.B int uretprobe(void)
> > +.fi
> 
> What header file provides this system call?

there's no header, it's used/called only by user space trampoline
provided by kernel, it's not expected to be called by user

> 
> > +.SH DESCRIPTION
> > +The
> > +.BR uretprobe ()
> > +syscall is an alternative to breakpoint instructions for
> > +triggering return uprobe consumers.
> > +.P
> > +Calls to
> > +.BR uretprobe ()
> > +suscall are only made from the user-space trampoline provided by the 
> > kernel.
> 
> s/suscall/system call/

ugh leftover sry

> 
> > +Calls from any other place result in a
> > +.BR SIGILL .
> 
> Maybe add an ERRORS section?
> 
> > +
> 
> We don't use blank lines; it causes a groff(1) warning, and other
> problems.  Instead, use '.P'.
> 
> > +.SH RETURN VALUE
> > +The
> > +.BR uretprobe ()
> > +syscall return value is architecture-specific.
> > +
> 
> .P
> 
> > +.SH VERSIONS
> > +This syscall is not specified in POSIX,
> 
> Redundant with "STANDARDS: None.".
> 
> > +and details of its behavior vary across systems.
> 
> Keep this.

ok

> 
> > +.SH STANDARDS
> > +None.
> > +.SH HISTORY
> > +TBD
> > +.SH NOTES
> > +The
> > +.BR uretprobe ()
> > +syscall was initially introduced for the x86_64 architecture where it was 
> > shown
> > +to be faster than breakpoint traps. It might be extended to other 
> > architectures.
> 
> Please use semantic newlines.
> 
> $ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p'
>Use semantic newlines
>  In the source of a manual page, new sentences should be started on
>  new lines, long sentences should be split  into  lines  at  clause
>  breaks  (commas,  semicolons, colons, and so on), and long clauses
>  should be split at phrase boundaries.  This convention,  sometimes
>  known as "semantic newlines", makes it easier to see the effect of
>  patches, which often operate at the level of individual sentences,
>  clauses, or phrases.

ok

thanks,
jirka

> 
> > +.P
> > +The
> > +.BR uretprobe ()
> > +syscall exists only to allow the invocation of return uprobe consumers.
> 
> s/syscall/system call/
> 
> > +It should
> > +.B never
> > +be called directly.
> > +Details of the arguments (if any) passed to
> > +.BR uretprobe ()
> > +and the return value are architecture-specific.
> > -- 
> > 2.44.0
> 
> Have a lovely day!
> Alex
> 
> -- 
> 





[PATCH v3 3/9] riscv: mm: Change attribute from __init to __meminit for page functions

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

Prepare for memory hotplugging support by changing from __init to
__meminit for the page table functions that are used by the upcoming
architecture specific callbacks.

Changing the __init attribute to __meminit prevents the functions from
being removed after init. The __meminit attribute makes sure the
functions are kept in the kernel text post init, but only if memory
hotplugging is enabled for the build.

Reviewed-by: David Hildenbrand 
Reviewed-by: Oscar Salvador 
Signed-off-by: Björn Töpel 
---
 arch/riscv/include/asm/mmu.h |  4 +--
 arch/riscv/include/asm/pgtable.h |  2 +-
 arch/riscv/mm/init.c | 56 ++--
 3 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 947fd60f9051..c9e03e9da3dc 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -31,8 +31,8 @@ typedef struct {
 #define cntx2asid(cntx)((cntx) & SATP_ASID_MASK)
 #define cntx2version(cntx) ((cntx) & ~SATP_ASID_MASK)
 
-void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
-  phys_addr_t sz, pgprot_t prot);
+void __meminit create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa, 
phys_addr_t sz,
+ pgprot_t prot);
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_MMU_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 58fd7b70b903..7933f493db71 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -162,7 +162,7 @@ struct pt_alloc_ops {
 #endif
 };
 
-extern struct pt_alloc_ops pt_ops __initdata;
+extern struct pt_alloc_ops pt_ops __meminitdata;
 
 #ifdef CONFIG_MMU
 /* Number of PGD entries that a user-mode program can use */
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index c98010ede810..c969427eab88 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -295,7 +295,7 @@ static void __init setup_bootmem(void)
 }
 
 #ifdef CONFIG_MMU
-struct pt_alloc_ops pt_ops __initdata;
+struct pt_alloc_ops pt_ops __meminitdata;
 
 pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
 pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
@@ -357,7 +357,7 @@ static inline pte_t *__init get_pte_virt_fixmap(phys_addr_t 
pa)
return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
 }
 
-static inline pte_t *__init get_pte_virt_late(phys_addr_t pa)
+static inline pte_t *__meminit get_pte_virt_late(phys_addr_t pa)
 {
return (pte_t *) __va(pa);
 }
@@ -376,7 +376,7 @@ static inline phys_addr_t __init alloc_pte_fixmap(uintptr_t 
va)
return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
 }
 
-static phys_addr_t __init alloc_pte_late(uintptr_t va)
+static phys_addr_t __meminit alloc_pte_late(uintptr_t va)
 {
struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 0);
 
@@ -384,9 +384,8 @@ static phys_addr_t __init alloc_pte_late(uintptr_t va)
return __pa((pte_t *)ptdesc_address(ptdesc));
 }
 
-static void __init create_pte_mapping(pte_t *ptep,
- uintptr_t va, phys_addr_t pa,
- phys_addr_t sz, pgprot_t prot)
+static void __meminit create_pte_mapping(pte_t *ptep, uintptr_t va, 
phys_addr_t pa, phys_addr_t sz,
+pgprot_t prot)
 {
uintptr_t pte_idx = pte_index(va);
 
@@ -440,7 +439,7 @@ static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
 }
 
-static pmd_t *__init get_pmd_virt_late(phys_addr_t pa)
+static pmd_t *__meminit get_pmd_virt_late(phys_addr_t pa)
 {
return (pmd_t *) __va(pa);
 }
@@ -457,7 +456,7 @@ static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
 }
 
-static phys_addr_t __init alloc_pmd_late(uintptr_t va)
+static phys_addr_t __meminit alloc_pmd_late(uintptr_t va)
 {
struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 0);
 
@@ -465,9 +464,9 @@ static phys_addr_t __init alloc_pmd_late(uintptr_t va)
return __pa((pmd_t *)ptdesc_address(ptdesc));
 }
 
-static void __init create_pmd_mapping(pmd_t *pmdp,
- uintptr_t va, phys_addr_t pa,
- phys_addr_t sz, pgprot_t prot)
+static void __meminit create_pmd_mapping(pmd_t *pmdp,
+uintptr_t va, phys_addr_t pa,
+phys_addr_t sz, pgprot_t prot)
 {
pte_t *ptep;
phys_addr_t pte_phys;
@@ -503,7 +502,7 @@ static pud_t *__init get_pud_virt_fixmap(phys_addr_t pa)
return (pud_t *)set_fixmap_offset(FIX_PUD, pa);
 }
 
-static pud_t *__init get_pud_virt_late(phys_addr_t pa)
+static pud_t *__meminit get_pud_virt_late(phys_addr_t pa)
 {
return (pud_t *)__va(pa);
 }
@@ -521,7 

[PATCH v3 2/9] riscv: mm: Pre-allocate vmemmap/direct map PGD entries

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

The RISC-V port copies the PGD table from init_mm/swapper_pg_dir to
all userland page tables, which means that if the PGD level table is
changed, other page tables have to be updated as well.

Instead of having the PGD changes ripple out to all tables, the
synchronization can be avoided altogether by pre-allocating the PGD
entries/pages at boot.

This is currently done for the bpf/modules, and vmalloc PGD regions.
Extend this scheme for the PGD regions touched by memory hotplugging.

Prepare the RISC-V port for memory hotplug by pre-allocating
vmemmap/direct map entries at the PGD level. This will waste roughly
128 4K pages when memory hotplugging is enabled in the kernel
configuration.

Reviewed-by: Alexandre Ghiti 
Signed-off-by: Björn Töpel 
---
 arch/riscv/include/asm/kasan.h | 4 ++--
 arch/riscv/mm/init.c   | 7 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kasan.h b/arch/riscv/include/asm/kasan.h
index 0b85e363e778..e6a0071bdb56 100644
--- a/arch/riscv/include/asm/kasan.h
+++ b/arch/riscv/include/asm/kasan.h
@@ -6,8 +6,6 @@
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_KASAN
-
 /*
  * The following comment was copied from arm64:
  * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
@@ -34,6 +32,8 @@
  */
 #define KASAN_SHADOW_START ((KASAN_SHADOW_END - KASAN_SHADOW_SIZE) & 
PGDIR_MASK)
 #define KASAN_SHADOW_END   MODULES_LOWEST_VADDR
+
+#ifdef CONFIG_KASAN
 #define KASAN_SHADOW_OFFSET    _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
 
 void kasan_init(void);
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index b66f846e7634..c98010ede810 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -27,6 +27,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1488,10 +1489,16 @@ static void __init preallocate_pgd_pages_range(unsigned 
long start, unsigned lon
panic("Failed to pre-allocate %s pages for %s area\n", lvl, area);
 }
 
+#define PAGE_END KASAN_SHADOW_START
+
 void __init pgtable_cache_init(void)
 {
preallocate_pgd_pages_range(VMALLOC_START, VMALLOC_END, "vmalloc");
if (IS_ENABLED(CONFIG_MODULES))
preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, 
"bpf/modules");
+   if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) {
+   preallocate_pgd_pages_range(VMEMMAP_START, VMEMMAP_END, 
"vmemmap");
+   preallocate_pgd_pages_range(PAGE_OFFSET, PAGE_END, "direct 
map");
+   }
 }
 #endif
-- 
2.40.1




[PATCH v3 1/9] riscv: mm: Properly forward vmemmap_populate() altmap parameter

2024-05-21 Thread Björn Töpel
From: Björn Töpel 

Make sure that the altmap parameter is properly passed on to
vmemmap_populate_hugepages().

Signed-off-by: Björn Töpel 
---
 arch/riscv/mm/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 2574f6a3b0e7..b66f846e7634 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1434,7 +1434,7 @@ int __meminit vmemmap_populate(unsigned long start, 
unsigned long end, int node,
 * memory hotplug, we are not able to update all the page tables with
 * the new PMDs.
 */
-   return vmemmap_populate_hugepages(start, end, node, NULL);
+   return vmemmap_populate_hugepages(start, end, node, altmap);
 }
 #endif
 
-- 
2.40.1




[PATCH v3 0/9] riscv: Memory Hot(Un)Plug support

2024-05-21 Thread Björn Töpel
From: Björn Töpel 


Memory Hot(Un)Plug support (and ZONE_DEVICE) for the RISC-V port


Introduction
============

To quote "Documentation/admin-guide/mm/memory-hotplug.rst": "Memory
hot(un)plug allows for increasing and decreasing the size of physical
memory available to a machine at runtime."

This series adds memory hot(un)plugging, and ZONE_DEVICE support for
the RISC-V Linux port.

MM configuration
================

RISC-V MM has the following configuration:

 * Memory blocks are 128M, analogous to x86-64. It uses PMD
   ("hugepage") vmemmaps. From that follows that 2M (PMD) worth of
   vmemmap spans 32768 pages of 4K each, which gets us 128M.

 * The pageblock size is the minimum virtio_mem size, and on
   RISC-V it's 2M (2^9 * 4K).
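
The arithmetic behind those two numbers can be checked at compile time. Note the assumption: the cover letter does not state the size of `struct page`; 64 bytes is the value implied by "2M spans 32768 pages" and is the common 64-bit case. All macro names below are made up for this sketch.

```c
#include <assert.h>

#define SZ_4K_SKETCH	(4UL << 10)
#define SZ_2M_SKETCH	(2UL << 20)
#define SZ_128M_SKETCH	(128UL << 20)

/* Assumption: sizeof(struct page) is 64 bytes. */
#define STRUCT_PAGE_SIZE_SKETCH	64UL

/* Number of struct page entries that fit in one PMD-sized (2M) vmemmap chunk. */
#define PAGES_PER_VMEMMAP_PMD	(SZ_2M_SKETCH / STRUCT_PAGE_SIZE_SKETCH)

static_assert(PAGES_PER_VMEMMAP_PMD == 32768, "2M / 64 bytes");
static_assert(PAGES_PER_VMEMMAP_PMD * SZ_4K_SKETCH == SZ_128M_SKETCH,
	      "one vmemmap PMD describes one 128M memory block");

/* Pageblock: 2^9 base pages of 4K. */
static_assert((1UL << 9) * SZ_4K_SKETCH == SZ_2M_SKETCH, "pageblock size");

/* Memory covered by the struct pages stored in one 2M vmemmap mapping. */
static unsigned long memory_block_bytes(void)
{
	return PAGES_PER_VMEMMAP_PMD * SZ_4K_SKETCH;
}
```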

Implementation
==============

The PGD table on RISC-V is shared/copied between all processes. Patch 1
is a small fix that forwards the altmap parameter in
vmemmap_populate(). To avoid doing page table synchronization, patch 2
pre-allocates the PGD entries for vmemmap/direct map. By doing that
the init_mm PGD will be fixed at kernel init, and synchronization can
be avoided altogether.

The following two patches (patches 3-4) do some preparation, followed
by the actual MHP implementation (patches 5-6). Then, MHP and virtio-mem
are enabled (patches 7-8), and finally ZONE_DEVICE support is added
(patch 9).

MHP and locking
===============

TL;DR: The MHP does not step on any toes, except for ptdump.
Additional locking is required for ptdump.

Long version: For v2 I spent some time digging into init_mm
synchronization/update. Here are my findings, and I'd love them to be
corrected if incorrect.

It's been a gnarly path...

The `init_mm` structure is a special mm (perhaps not a "real" one).
It's a "lazy context" that tracks kernel page table resources, e.g.,
the kernel page table (swapper_pg_dir), a kernel page_table_lock (more
about the usage below), mmap_lock, and such.

`init_mm` does not track/contain any VMAs. Having the `init_mm` is
convenient, so that the regular kernel page table walk/modify
functions can be used.

Now, `init_mm` being special means that the locking for kernel page
tables is special as well.

On RISC-V the PGD (top-level page table structure), similar to x86, is
shared (copied) with user processes. If the kernel PGD is modified, it
has to be synched to user-mode processes' PGDs. This is avoided by
pre-populating the PGD, so it'll be fixed from boot.

The in-kernel pgd regions are documented in
`Documentation/arch/riscv/vm-layout.rst`.

The distinct regions are:
 * vmemmap
 * vmalloc/ioremap space
 * direct mapping of all physical memory
 * kasan
 * modules, BPF
 * kernel

Memory hotplug is the process of adding/removing memory to/from the
kernel.

Adding is done in two phases:
 1. Add the memory to the kernel
 2. Online memory, making it available to the page allocator.

Step 1 is partially architecture dependent, and updates the init_mm
page table:
 * Update the direct map page tables. The direct map is a linear map,
   representing all physical memory: `virt = phys + PAGE_OFFSET`
 * Add a `struct page` for each added page of memory. Update the
   vmemmap (virtual mapping to the `struct page`), so we can easily
   transform a kernel virtual address to a `struct page *` address.
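
The two mappings touched in step 1 boil down to simple address arithmetic, sketched below in user space. Everything here is an assumption made for illustration: the constants are arbitrary example values rather than the real RISC-V layout, the 64-byte `struct page` size is assumed, and the vmemmap indexing is simplified to start at PFN 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example values; the real ones depend on the kernel configuration. */
#define PAGE_SHIFT_SKETCH	12
#define PAGE_OFFSET_SKETCH	UINT64_C(0xff60000000000000)
#define VMEMMAP_START_SKETCH	UINT64_C(0xff1c000000000000)
#define STRUCT_PAGE_SIZE_SKETCH	UINT64_C(64)

static_assert((PAGE_OFFSET_SKETCH & ((UINT64_C(1) << PAGE_SHIFT_SKETCH) - 1)) == 0,
	      "the linear map must start on a page boundary");

/* Direct map: virt = phys + PAGE_OFFSET. */
static uint64_t phys_to_virt_sketch(uint64_t phys)
{
	return phys + PAGE_OFFSET_SKETCH;
}

static uint64_t virt_to_phys_sketch(uint64_t virt)
{
	return virt - PAGE_OFFSET_SKETCH;
}

/* vmemmap: one struct page slot per page frame, indexed by PFN. */
static uint64_t virt_to_page_addr_sketch(uint64_t virt)
{
	uint64_t pfn = virt_to_phys_sketch(virt) >> PAGE_SHIFT_SKETCH;

	return VMEMMAP_START_SKETCH + pfn * STRUCT_PAGE_SIZE_SKETCH;
}
```

Hot-adding memory therefore has to populate page tables for both ranges: the direct-map addresses of the new physical range, and the vmemmap slots that hold its `struct page` entries.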

From an MHP perspective, there are two regions of the PGD that are
updated:
 * vmemmap
 * direct mapping of all physical memory

The `struct mm_struct` has a couple of locks in play:
 * `spinlock_t page_table_lock` protects the page table, and some
counters
 * `struct rw_semaphore mmap_lock` protects an mm's VMAs

Note again that `init_mm` does not contain any VMAs, but still uses
the mmap_lock in some places.

The `page_table_lock` was originally used to protect all page
tables, but more recently a split page table lock has been introduced.
The split lock has a per-table lock for the PTE and PMD tables. If
split lock is disabled, all tables are guarded by
`mm->page_table_lock` (for user processes). Split page table locks are
not used for init_mm.

MHP operations are typically synchronized using
`DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock)`.

Actors
------

The following non-MHP actors in the kernel traverse (read) and/or
modify the kernel PGD.

 * `ptdump`

   Walks the entire `init_mm`, via `ptdump_walk_pgd()` with the
   `mmap_write_lock(init_mm)` taken.

   Observation: ptdump can race with MHP, and needs additional locking
   to avoid crashes/races.

 * `set_direct_*` / `arch/riscv/mm/pageattr.c`

   The `set_direct_*` functionality is used to "synchronize" the
   direct map to other kernel mappings, e.g. modules/kernel text. The
   direct map is using "as large huge table mappings as possible",
   which means that the `set_direct_*` might need to split the direct
   map.

  The `set_direct_*` functions operate with the
  `mmap_write_lock(init_mm)` taken.

  Observation: `set_direct_*` uses 

Re: [PATCHv6 9/9] man2: Add uretprobe syscall page

2024-05-21 Thread Alejandro Colomar
Hi Jiri,

On Tue, May 21, 2024 at 12:48:25PM GMT, Jiri Olsa wrote:
> Adding man page for new uretprobe syscall.
> 
> Signed-off-by: Jiri Olsa 
> ---
>  man2/uretprobe.2 | 50 
>  1 file changed, 50 insertions(+)
>  create mode 100644 man2/uretprobe.2
> 
> diff --git a/man2/uretprobe.2 b/man2/uretprobe.2
> new file mode 100644
> index ..690fe3b1a44f
> --- /dev/null
> +++ b/man2/uretprobe.2
> @@ -0,0 +1,50 @@
> +.\" Copyright (C) 2024, Jiri Olsa 
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +uretprobe \- execute pending return uprobes
> +.SH SYNOPSIS
> +.nf
> +.B int uretprobe(void)
> +.fi

What header file provides this system call?

> +.SH DESCRIPTION
> +The
> +.BR uretprobe ()
> +syscall is an alternative to breakpoint instructions for
> +triggering return uprobe consumers.
> +.P
> +Calls to
> +.BR uretprobe ()
> +suscall are only made from the user-space trampoline provided by the kernel.

s/suscall/system call/

> +Calls from any other place result in a
> +.BR SIGILL .

Maybe add an ERRORS section?

> +

We don't use blank lines; it causes a groff(1) warning, and other
problems.  Instead, use '.P'.

> +.SH RETURN VALUE
> +The
> +.BR uretprobe ()
> +syscall return value is architecture-specific.
> +

.P

> +.SH VERSIONS
> +This syscall is not specified in POSIX,

Redundant with "STANDARDS: None.".

> +and details of its behavior vary across systems.

Keep this.

> +.SH STANDARDS
> +None.
> +.SH HISTORY
> +TBD
> +.SH NOTES
> +The
> +.BR uretprobe ()
> +syscall was initially introduced for the x86_64 architecture where it was 
> shown
> +to be faster than breakpoint traps. It might be extended to other 
> architectures.

Please use semantic newlines.

$ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p'
   Use semantic newlines
 In the source of a manual page, new sentences should be started on
 new lines, long sentences should be split  into  lines  at  clause
 breaks  (commas,  semicolons, colons, and so on), and long clauses
 should be split at phrase boundaries.  This convention,  sometimes
 known as "semantic newlines", makes it easier to see the effect of
 patches, which often operate at the level of individual sentences,
 clauses, or phrases.

> +.P
> +The
> +.BR uretprobe ()
> +syscall exists only to allow the invocation of return uprobe consumers.

s/syscall/system call/

> +It should
> +.B never
> +be called directly.
> +Details of the arguments (if any) passed to
> +.BR uretprobe ()
> +and the return value are architecture-specific.
> -- 
> 2.44.0

Have a lovely day!
Alex

-- 





Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-21 Thread Kalle Valo
Dmitry Baryshkov  writes:

> On Tue, 21 May 2024 at 12:52,  wrote:
>>
>> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
>> > Qualcomm platforms have different sets of the firmware files, which
>> > differ from platform to platform (and from board to board, due to the
>> > embedded signatures). Rather than listing all the firmware files,
>> > including full paths, in the DT, provide a way to determine firmware
>> > path based on the root DT node compatible.
>>
>> Ok this looks quite over-engineered but necessary to handle the legacy,
>> but I really think we should add a way to look for a board-specific path
>> first and fallback to those SoC specific paths.
>
> Again, CONFIG_FW_LOADER_USER_HELPER => delays.

To me this also looks very over-engineered; can you elaborate on why
this is needed? Concrete examples would help to understand better.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-21 Thread Sebastian Andrzej Siewior
On 2024-05-15 23:05:36 [+0100], Qais Yousef wrote:
> rt_task() checks if a task has RT priority. But depends on your
> dictionary, this could mean it belongs to RT class, or is a 'realtime'
> task, which includes RT and DL classes.
> 
> Since this has caused some confusion already on discussion [1], it
> seemed a clean up is due.
> 
> I define the usage of rt_task() to be tasks that belong to RT class.
> Make sure that it returns true only for RT class and audit the users and
> replace the ones required the old behavior with the new realtime_task()
> which returns true for RT and DL classes. Introduce similar
> realtime_prio() to create similar distinction to rt_prio() and update
> the users that required the old behavior to use the new function.
> 
> Move MAX_DL_PRIO to prio.h so it can be used in the new definitions.
> 
> Document the functions to make it more obvious what is the difference
> between them. PI-boosted tasks is a factor that must be taken into
> account when choosing which function to use.
> 
> Rename task_is_realtime() to realtime_task_policy() as the old name is
> confusing against the new realtime_task().

I *think* everyone using rt_task() means to include DL tasks. And
everyone means !SCHED-people since they know when the difference matters.

> No functional changes were intended.
> 
> [1] 
> https://lore.kernel.org/lkml/20240506100509.gl40...@noisy.programming.kicks-ass.net/
> 
> Reviewed-by: Phil Auld 
> Signed-off-by: Qais Yousef 
> ---
> 
> Changes since v1:
> 
>   * Use realtime_task_policy() instead task_has_realtime_policy() (Peter)
>   * Improve commit message readability about replace some rt_task()
> users.
> 
> v1 discussion: 
> https://lore.kernel.org/lkml/20240514234112.792989-1-qyou...@layalina.io/
> 
>  fs/select.c   |  2 +-

fs/bcachefs/six.c
six_owner_running() has rt_task(), but imho it should use realtime_task()
to consider DL. I think it is way worse that it has its own locking
rather than using what everyone else uses, but then again it wouldn't be
the new hot thing…

>  include/linux/ioprio.h|  2 +-
>  include/linux/sched/deadline.h|  6 --
>  include/linux/sched/prio.h|  1 +
>  include/linux/sched/rt.h  | 27 ++-
>  kernel/locking/rtmutex.c  |  4 ++--
>  kernel/locking/rwsem.c|  4 ++--
>  kernel/locking/ww_mutex.h |  2 +-
>  kernel/sched/core.c   |  6 +++---
>  kernel/time/hrtimer.c |  6 +++---
>  kernel/trace/trace_sched_wakeup.c |  2 +-
>  mm/page-writeback.c   |  4 ++--
>  mm/page_alloc.c   |  2 +-
>  13 files changed, 48 insertions(+), 20 deletions(-)
…
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index 70625dff62ce..08b95e0a41ab 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1996,7 +1996,7 @@ static void __hrtimer_init_sleeper(struct 
> hrtimer_sleeper *sl,
>* expiry.
>*/
>   if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> - if (task_is_realtime(current) && !(mode & HRTIMER_MODE_SOFT))
> + if (realtime_task_policy(current) && !(mode & 
> HRTIMER_MODE_SOFT))
>   mode |= HRTIMER_MODE_HARD;
>   }
>  
> @@ -2096,7 +2096,7 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum 
> hrtimer_mode mode,
>   u64 slack;
>  
>   slack = current->timer_slack_ns;
> - if (rt_task(current))
> + if (realtime_task(current))
>   slack = 0;
>  
>   hrtimer_init_sleeper_on_stack(, clockid, mode);
> @@ -2301,7 +2301,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 
> delta,
>* Override any slack passed by the user if under
>* rt contraints.
>*/
> - if (rt_task(current))
> + if (realtime_task(current))
>   delta = 0;

I know this is just converting what is already here but…
__hrtimer_init_sleeper() looks at the policy to figure out if the task
is realtime to decide if it should expire in HARD-IRQ context. This is
correct, a boosted task should not sleep.

hrtimer_nanosleep() + schedule_hrtimeout_range_clock() is looking at
priority to decide if slack should be removed. This should also look at
policy since a boosted task shouldn't sleep.

In order to be PI-boosted you need to acquire a lock, and the only lock
you can hold while sleeping without generating a warning is a mutex_t
(or equivalent sleeping lock) on PREEMPT_RT.
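
To make the priority-versus-policy distinction concrete, here is a minimal
user-space sketch of the three checks. It assumes the priority layout the
patch relies on (MAX_DL_PRIO of 0, MAX_RT_PRIO of 100, DL tasks at prio -1);
struct task and the POLICY_* names are hypothetical stand-ins, not the
kernel's definitions.

```c
#include <stdbool.h>

/* Assumed priority layout; mirrors the kernel's values but is only a model. */
#define MAX_DL_PRIO 0
#define MAX_RT_PRIO 100

/* Hypothetical policy names standing in for SCHED_NORMAL, SCHED_FIFO, ... */
enum policy { POLICY_NORMAL, POLICY_FIFO, POLICY_RR, POLICY_DEADLINE };

/* Hypothetical stand-in for the two task_struct fields that matter here. */
struct task {
	int prio;           /* effective priority, changes under PI boosting */
	enum policy policy; /* policy the task was given, unaffected by boosting */
};

/* True only for a priority in the RT class range. */
static bool rt_task(const struct task *t)
{
	return t->prio >= MAX_DL_PRIO && t->prio < MAX_RT_PRIO;
}

/* True for a priority in either the RT or the DL range. */
static bool realtime_task(const struct task *t)
{
	return t->prio < MAX_RT_PRIO;
}

/* True only if the task's own policy is RT or DL; ignores PI boosting. */
static bool realtime_task_policy(const struct task *t)
{
	return t->policy == POLICY_FIFO || t->policy == POLICY_RR ||
	       t->policy == POLICY_DEADLINE;
}
```

In this model a POLICY_NORMAL task boosted to prio 50 satisfies rt_task()
and realtime_task() but not realtime_task_policy(), which is the case the
slack handling has to decide about.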

>   hrtimer_init_sleeper_on_stack(, clock_id, mode);
> diff --git a/kernel/trace/trace_sched_wakeup.c 
> b/kernel/trace/trace_sched_wakeup.c
> index 0469a04a355f..19d737742e29 100644
> --- a/kernel/trace/trace_sched_wakeup.c
> +++ b/kernel/trace/trace_sched_wakeup.c
> @@ -545,7 +545,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
>*  - wakeup_dl handles tasks belonging to sched_dl class only.
>*/
>   if (tracing_dl || (wakeup_dl && !dl_task(p)) ||
> - (wakeup_rt && !dl_task(p) 

[PATCHv6 9/9] man2: Add uretprobe syscall page

2024-05-21 Thread Jiri Olsa
Adding man page for new uretprobe syscall.

Signed-off-by: Jiri Olsa 
---
 man2/uretprobe.2 | 50 
 1 file changed, 50 insertions(+)
 create mode 100644 man2/uretprobe.2

diff --git a/man2/uretprobe.2 b/man2/uretprobe.2
new file mode 100644
index ..690fe3b1a44f
--- /dev/null
+++ b/man2/uretprobe.2
@@ -0,0 +1,50 @@
+.\" Copyright (C) 2024, Jiri Olsa 
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH uretprobe 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+uretprobe \- execute pending return uprobes
+.SH SYNOPSIS
+.nf
+.B int uretprobe(void)
+.fi
+.SH DESCRIPTION
+The
+.BR uretprobe ()
+syscall is an alternative to breakpoint instructions for
+triggering return uprobe consumers.
+.P
+Calls to
+.BR uretprobe ()
+syscall are only made from the user-space trampoline provided by the kernel.
+Calls from any other place result in a
+.BR SIGILL .
+
+.SH RETURN VALUE
+The
+.BR uretprobe ()
+syscall return value is architecture-specific.
+
+.SH VERSIONS
+This syscall is not specified in POSIX,
+and details of its behavior vary across systems.
+.SH STANDARDS
+None.
+.SH HISTORY
+TBD
+.SH NOTES
+The
+.BR uretprobe ()
+syscall was initially introduced for the x86_64 architecture where it was shown
+to be faster than breakpoint traps. It might be extended to other architectures.
+.P
+The
+.BR uretprobe ()
+syscall exists only to allow the invocation of return uprobe consumers.
+It should
+.B never
+be called directly.
+Details of the arguments (if any) passed to
+.BR uretprobe ()
+and the return value are architecture-specific.
-- 
2.44.0




[PATCHv6 bpf-next 8/9] selftests/bpf: Add uretprobe shadow stack test

2024-05-21 Thread Jiri Olsa
Adding uretprobe shadow stack test that runs all existing
uretprobe tests with shadow stack enabled if it's available.

Signed-off-by: Jiri Olsa 
---
 .../selftests/bpf/prog_tests/uprobe_syscall.c | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
index 3ef324c2db50..fda456401284 100644
--- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
@@ -9,6 +9,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include "uprobe_syscall.skel.h"
 #include "uprobe_syscall_executed.skel.h"
 
@@ -297,6 +300,56 @@ static void test_uretprobe_syscall_call(void)
close(go[1]);
close(go[0]);
 }
+
+/*
+ * Borrowed from tools/testing/selftests/x86/test_shadow_stack.c.
+ *
+ * For use in inline enablement of shadow stack.
+ *
+ * The program can't return from the point where shadow stack gets enabled
+ * because there will be no address on the shadow stack. So it can't use
+ * syscall() for enablement, since it is a function.
+ *
+ * Based on code from nolibc.h. Keep a copy here because this can't pull
+ * in all of nolibc.h.
+ */
+#define ARCH_PRCTL(arg1, arg2) \
+({ \
+   long _ret;  \
+   register long _num  asm("eax") = __NR_arch_prctl;   \
+   register long _arg1 asm("rdi") = (long)(arg1);  \
+   register long _arg2 asm("rsi") = (long)(arg2);  \
+   \
+   asm volatile (  \
+   "syscall\n" \
+   : "=a"(_ret)\
+   : "r"(_arg1), "r"(_arg2),   \
+ "0"(_num) \
+   : "rcx", "r11", "memory", "cc"  \
+   );  \
+   _ret;   \
+})
+
+#ifndef ARCH_SHSTK_ENABLE
+#define ARCH_SHSTK_ENABLE  0x5001
+#define ARCH_SHSTK_DISABLE 0x5002
+#define ARCH_SHSTK_SHSTK   (1ULL <<  0)
+#endif
+
+static void test_uretprobe_shadow_stack(void)
+{
+   if (ARCH_PRCTL(ARCH_SHSTK_ENABLE, ARCH_SHSTK_SHSTK)) {
+   test__skip();
+   return;
+   }
+
+   /* Run all of the uretprobe tests. */
+   test_uretprobe_regs_equal();
+   test_uretprobe_regs_change();
+   test_uretprobe_syscall_call();
+
+   ARCH_PRCTL(ARCH_SHSTK_DISABLE, ARCH_SHSTK_SHSTK);
+}
 #else
 static void test_uretprobe_regs_equal(void)
 {
@@ -312,6 +365,11 @@ static void test_uretprobe_syscall_call(void)
 {
test__skip();
 }
+
+static void test_uretprobe_shadow_stack(void)
+{
+   test__skip();
+}
 #endif
 
 void test_uprobe_syscall(void)
@@ -322,4 +380,6 @@ void test_uprobe_syscall(void)
test_uretprobe_regs_change();
if (test__start_subtest("uretprobe_syscall_call"))
test_uretprobe_syscall_call();
+   if (test__start_subtest("uretprobe_shadow_stack"))
+   test_uretprobe_shadow_stack();
 }
-- 
2.45.0




[PATCHv6 bpf-next 7/9] selftests/bpf: Add uretprobe syscall call from user space test

2024-05-21 Thread Jiri Olsa
Adding test to verify that when called from outside of the
trampoline provided by kernel, the uretprobe syscall will cause
calling process to receive SIGILL signal and the attached bpf
program is not executed.

Acked-by: Andrii Nakryiko 
Reviewed-by: Masami Hiramatsu (Google) 
Signed-off-by: Jiri Olsa 
---
 .../selftests/bpf/prog_tests/uprobe_syscall.c | 95 +++
 .../bpf/progs/uprobe_syscall_executed.c   | 17 
 2 files changed, 112 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c

diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
index 1a50cd35205d..3ef324c2db50 100644
--- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
@@ -7,7 +7,10 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "uprobe_syscall.skel.h"
+#include "uprobe_syscall_executed.skel.h"
 
 __naked unsigned long uretprobe_regs_trigger(void)
 {
@@ -209,6 +212,91 @@ static void test_uretprobe_regs_change(void)
}
 }
 
+#ifndef __NR_uretprobe
+#define __NR_uretprobe 462
+#endif
+
+__naked unsigned long uretprobe_syscall_call_1(void)
+{
+   /*
+* Pretend we are uretprobe trampoline to trigger the return
+* probe invocation in order to verify we get SIGILL.
+*/
+   asm volatile (
+   "pushq %rax\n"
+   "pushq %rcx\n"
+   "pushq %r11\n"
+   "movq $" __stringify(__NR_uretprobe) ", %rax\n"
+   "syscall\n"
+   "popq %r11\n"
+   "popq %rcx\n"
+   "retq\n"
+   );
+}
+
+__naked unsigned long uretprobe_syscall_call(void)
+{
+   asm volatile (
+   "call uretprobe_syscall_call_1\n"
+   "retq\n"
+   );
+}
+
+static void test_uretprobe_syscall_call(void)
+{
+   LIBBPF_OPTS(bpf_uprobe_multi_opts, opts,
+   .retprobe = true,
+   );
+   struct uprobe_syscall_executed *skel;
+   int pid, status, err, go[2], c;
+
+   if (!ASSERT_OK(pipe(go), "pipe"))
+   return;
+
+   skel = uprobe_syscall_executed__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "uprobe_syscall_executed__open_and_load"))
+   goto cleanup;
+
+   pid = fork();
+   if (!ASSERT_GE(pid, 0, "fork"))
+   goto cleanup;
+
+   /* child */
+   if (pid == 0) {
+   close(go[1]);
+
+   /* wait for parent's kick */
+   err = read(go[0], &c, 1);
+   if (err != 1)
+   exit(-1);
+
+   uretprobe_syscall_call();
+   _exit(0);
+   }
+
+   skel->links.test = bpf_program__attach_uprobe_multi(skel->progs.test, pid,
+   "/proc/self/exe",
+   "uretprobe_syscall_call", &opts);
+   if (!ASSERT_OK_PTR(skel->links.test, "bpf_program__attach_uprobe_multi"))
+   goto cleanup;
+
+   /* kick the child */
+   write(go[1], &c, 1);
+   err = waitpid(pid, &status, 0);
+   ASSERT_EQ(err, pid, "waitpid");
+
+   /* verify the child got killed with SIGILL */
+   ASSERT_EQ(WIFSIGNALED(status), 1, "WIFSIGNALED");
+   ASSERT_EQ(WTERMSIG(status), SIGILL, "WTERMSIG");
+
+   /* verify the uretprobe program wasn't called */
+   ASSERT_EQ(skel->bss->executed, 0, "executed");
+
+cleanup:
+   uprobe_syscall_executed__destroy(skel);
+   close(go[1]);
+   close(go[0]);
+}
 #else
 static void test_uretprobe_regs_equal(void)
 {
@@ -219,6 +307,11 @@ static void test_uretprobe_regs_change(void)
 {
test__skip();
 }
+
+static void test_uretprobe_syscall_call(void)
+{
+   test__skip();
+}
 #endif
 
 void test_uprobe_syscall(void)
@@ -227,4 +320,6 @@ void test_uprobe_syscall(void)
test_uretprobe_regs_equal();
if (test__start_subtest("uretprobe_regs_change"))
test_uretprobe_regs_change();
+   if (test__start_subtest("uretprobe_syscall_call"))
+   test_uretprobe_syscall_call();
 }
diff --git a/tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c b/tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c
new file mode 100644
index ..0d7f1a7db2e2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "vmlinux.h"
+#include 
+#include 
+
+struct pt_regs regs;
+
+char _license[] SEC("license") = "GPL";
+
+int executed = 0;
+
+SEC("uretprobe.multi")
+int test(struct pt_regs *regs)
+{
+   executed = 1;
+   return 0;
+}
-- 
2.45.0




[PATCHv6 bpf-next 6/9] selftests/bpf: Add uretprobe syscall test for regs changes

2024-05-21 Thread Jiri Olsa
Adding test that creates uprobe consumer on uretprobe which changes some
of the registers. Making sure the changed registers are propagated to the
user space when the uretprobe syscall trampoline is used on x86_64.

To be able to do this, adding support to bpf_testmod to create uprobe via
new attribute file:
  /sys/kernel/bpf_testmod_uprobe

This file expects a file offset and creates the related uprobe on the current
process exe file, or removes the existing uprobe if the offset is 0. There can
be only a single uprobe at any time.

The uprobe has a specific consumer that changes the registers used in the
uretprobe syscall trampoline, which are later checked in the test.
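
For illustration, driving the attribute file from user space could look
roughly like the sketch below. The helper name write_uprobe_offset() and its
path parameter are hypothetical (the path is a parameter only so the helper
can be exercised without the module loaded); this is not necessarily how the
selftest itself does it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write a decimal file offset to the attribute file at 'path', for example
 * /sys/kernel/bpf_testmod_uprobe. A non-zero offset asks for a uprobe to be
 * registered, 0 asks for the existing one to be removed.
 *
 * Returns 0 on success or a negative errno value.
 */
static int write_uprobe_offset(const char *path, unsigned long offset)
{
	char buf[32];
	ssize_t ret;
	int fd, n;

	n = snprintf(buf, sizeof(buf), "%lu", offset);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	ret = write(fd, buf, n);
	if (ret < 0)
		ret = -errno; /* capture errno before close() can change it */
	close(fd);

	if (ret < 0)
		return (int)ret;
	return ret == n ? 0 : -EIO;
}
```

A caller would pass the file offset of the probed function to register the
uprobe and 0 to tear it down again.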

Acked-by: Andrii Nakryiko 
Reviewed-by: Masami Hiramatsu (Google) 
Signed-off-by: Jiri Olsa 
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   | 123 +-
 .../selftests/bpf/prog_tests/uprobe_syscall.c |  67 ++
 2 files changed, 189 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 2a18bd320e92..b0132a342bb5 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "bpf_testmod.h"
 #include "bpf_testmod_kfunc.h"
 
@@ -358,6 +359,119 @@ static struct bin_attribute bin_attr_bpf_testmod_file __ro_after_init = {
.write = bpf_testmod_test_write,
 };
 
+/* bpf_testmod_uprobe sysfs attribute is so far enabled for x86_64 only,
+ * please see test_uretprobe_regs_change test
+ */
+#ifdef __x86_64__
+
+static int
+uprobe_ret_handler(struct uprobe_consumer *self, unsigned long func,
+  struct pt_regs *regs)
+
+{
+   regs->ax  = 0x12345678deadbeef;
+   regs->cx  = 0x87654321feebdaed;
+   regs->r11 = (u64) -1;
+   return true;
+}
+
+struct testmod_uprobe {
+   struct path path;
+   loff_t offset;
+   struct uprobe_consumer consumer;
+};
+
+static DEFINE_MUTEX(testmod_uprobe_mutex);
+
+static struct testmod_uprobe uprobe = {
+   .consumer.ret_handler = uprobe_ret_handler,
+};
+
+static int testmod_register_uprobe(loff_t offset)
+{
+   int err = -EBUSY;
+
+   if (uprobe.offset)
+   return -EBUSY;
+
+   mutex_lock(&testmod_uprobe_mutex);
+
+   if (uprobe.offset)
+   goto out;
+
+   err = kern_path("/proc/self/exe", LOOKUP_FOLLOW, &uprobe.path);
+   if (err)
+   goto out;
+
+   err = uprobe_register_refctr(d_real_inode(uprobe.path.dentry),
+offset, 0, &uprobe.consumer);
+   if (err)
+   path_put(&uprobe.path);
+   else
+   uprobe.offset = offset;
+
+out:
+   mutex_unlock(&testmod_uprobe_mutex);
+   return err;
+}
+
+static void testmod_unregister_uprobe(void)
+{
+   mutex_lock(&testmod_uprobe_mutex);
+
+   if (uprobe.offset) {
+   uprobe_unregister(d_real_inode(uprobe.path.dentry),
+ uprobe.offset, &uprobe.consumer);
+   uprobe.offset = 0;
+   }
+
+   mutex_unlock(&testmod_uprobe_mutex);
+}
+
+static ssize_t
+bpf_testmod_uprobe_write(struct file *file, struct kobject *kobj,
+struct bin_attribute *bin_attr,
+char *buf, loff_t off, size_t len)
+{
+   unsigned long offset = 0;
+   int err = 0;
+
+   if (kstrtoul(buf, 0, &offset))
+   return -EINVAL;
+
+   if (offset)
+   err = testmod_register_uprobe(offset);
+   else
+   testmod_unregister_uprobe();
+
+   return err ?: strlen(buf);
+}
+
+static struct bin_attribute bin_attr_bpf_testmod_uprobe_file __ro_after_init = {
+   .attr = { .name = "bpf_testmod_uprobe", .mode = 0666, },
+   .write = bpf_testmod_uprobe_write,
+};
+
+static int register_bpf_testmod_uprobe(void)
+{
+   return sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_uprobe_file);
+}
+
+static void unregister_bpf_testmod_uprobe(void)
+{
+   testmod_unregister_uprobe();
+   sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_uprobe_file);
+}
+
+#else
+static int register_bpf_testmod_uprobe(void)
+{
+   return 0;
+}
+
+static void unregister_bpf_testmod_uprobe(void) { }
+#endif
+
 BTF_KFUNCS_START(bpf_testmod_common_kfunc_ids)
 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_new, KF_ITER_NEW)
 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_next, KF_ITER_NEXT | KF_RET_NULL)
@@ -912,7 +1026,13 @@ static int bpf_testmod_init(void)
return -EINVAL;
sock = NULL;
mutex_init(&sock_lock);
-   return sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
+   ret = sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
+   if (ret < 0)
+   return ret;
+   ret = register_bpf_testmod_uprobe();
+   if (ret < 0)
+   return ret;
+   return 0;
 }
 
 static void bpf_testmod_exit(void)
@@ -927,6 +1047,7 @@ static void bpf_testmod_exit(void)
 
   

[PATCHv6 bpf-next 5/9] selftests/bpf: Add uretprobe syscall test for regs integrity

2024-05-21 Thread Jiri Olsa
Add uretprobe syscall test that compares register values before
and after the uretprobe is hit. It also compares the register
values seen from attached bpf program.

Acked-by: Andrii Nakryiko 
Reviewed-by: Masami Hiramatsu (Google) 
Signed-off-by: Jiri Olsa 
---
 tools/include/linux/compiler.h|   4 +
 .../selftests/bpf/prog_tests/uprobe_syscall.c | 163 ++
 .../selftests/bpf/progs/uprobe_syscall.c  |  15 ++
 3 files changed, 182 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/uprobe_syscall.c

diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index 8a63a9913495..6f7f22ac9da5 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -62,6 +62,10 @@
 #define __nocf_check __attribute__((nocf_check))
 #endif
 
+#ifndef __naked
+#define __naked __attribute__((__naked__))
+#endif
+
 /* Are two types/vars the same type (ignoring qualifiers)? */
 #ifndef __same_type
 # define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
new file mode 100644
index ..311ac19d8992
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+
+#ifdef __x86_64__
+
+#include 
+#include 
+#include 
+#include "uprobe_syscall.skel.h"
+
+__naked unsigned long uretprobe_regs_trigger(void)
+{
+   asm volatile (
+   "movq $0xdeadbeef, %rax\n"
+   "ret\n"
+   );
+}
+
+__naked void uretprobe_regs(struct pt_regs *before, struct pt_regs *after)
+{
+   asm volatile (
+   "movq %r15,   0(%rdi)\n"
+   "movq %r14,   8(%rdi)\n"
+   "movq %r13,  16(%rdi)\n"
+   "movq %r12,  24(%rdi)\n"
+   "movq %rbp,  32(%rdi)\n"
+   "movq %rbx,  40(%rdi)\n"
+   "movq %r11,  48(%rdi)\n"
+   "movq %r10,  56(%rdi)\n"
+   "movq  %r9,  64(%rdi)\n"
+   "movq  %r8,  72(%rdi)\n"
+   "movq %rax,  80(%rdi)\n"
+   "movq %rcx,  88(%rdi)\n"
+   "movq %rdx,  96(%rdi)\n"
+   "movq %rsi, 104(%rdi)\n"
+   "movq %rdi, 112(%rdi)\n"
+   "movq   $0, 120(%rdi)\n" /* orig_rax */
+   "movq   $0, 128(%rdi)\n" /* rip  */
+   "movq   $0, 136(%rdi)\n" /* cs   */
+   "pushf\n"
+   "pop %rax\n"
+   "movq %rax, 144(%rdi)\n" /* eflags   */
+   "movq %rsp, 152(%rdi)\n" /* rsp  */
+   "movq   $0, 160(%rdi)\n" /* ss   */
+
+   /* save 2nd argument */
+   "pushq %rsi\n"
+   "call uretprobe_regs_trigger\n"
+
+   /* save  return value and load 2nd argument pointer to rax */
+   "pushq %rax\n"
+   "movq 8(%rsp), %rax\n"
+
+   "movq %r15,   0(%rax)\n"
+   "movq %r14,   8(%rax)\n"
+   "movq %r13,  16(%rax)\n"
+   "movq %r12,  24(%rax)\n"
+   "movq %rbp,  32(%rax)\n"
+   "movq %rbx,  40(%rax)\n"
+   "movq %r11,  48(%rax)\n"
+   "movq %r10,  56(%rax)\n"
+   "movq  %r9,  64(%rax)\n"
+   "movq  %r8,  72(%rax)\n"
+   "movq %rcx,  88(%rax)\n"
+   "movq %rdx,  96(%rax)\n"
+   "movq %rsi, 104(%rax)\n"
+   "movq %rdi, 112(%rax)\n"
+   "movq   $0, 120(%rax)\n" /* orig_rax */
+   "movq   $0, 128(%rax)\n" /* rip  */
+   "movq   $0, 136(%rax)\n" /* cs   */
+
+   /* restore return value and 2nd argument */
+   "pop %rax\n"
+   "pop %rsi\n"
+
+   "movq %rax,  80(%rsi)\n"
+
+   "pushf\n"
+   "pop %rax\n"
+
+   "movq %rax, 144(%rsi)\n" /* eflags   */
+   "movq %rsp, 152(%rsi)\n" /* rsp  */
+   "movq   $0, 160(%rsi)\n" /* ss   */
+   "ret\n"
+);
+}
+
+static void test_uretprobe_regs_equal(void)
+{
+   struct uprobe_syscall *skel = NULL;
+   struct pt_regs before = {}, after = {};
+   unsigned long *pb = (unsigned long *) &before;
+   unsigned long *pa = (unsigned long *) &after;
+   unsigned long *pp;
+   unsigned int i, cnt;
+   int err;
+
+   skel = uprobe_syscall__open_and_load();
+   if (!ASSERT_OK_PTR(skel, "uprobe_syscall__open_and_load"))
+   goto cleanup;
+
+   err = uprobe_syscall__attach(skel);
+   if (!ASSERT_OK(err, "uprobe_syscall__attach"))
+   goto cleanup;
+
+   uretprobe_regs(&before, &after);
+
+   pp = (unsigned long *) &skel->bss->regs;
+   cnt = sizeof(before)/sizeof(*pb);
+
+   for (i = 0; i < cnt; i++) {

[PATCHv6 bpf-next 4/9] selftests/x86: Add return uprobe shadow stack test

2024-05-21 Thread Jiri Olsa
Adding return uprobe test for shadow stack and making sure it's
working properly. Borrowed some of the code from bpf selftests.

Signed-off-by: Jiri Olsa 
---
 .../testing/selftests/x86/test_shadow_stack.c | 145 ++
 1 file changed, 145 insertions(+)

diff --git a/tools/testing/selftests/x86/test_shadow_stack.c b/tools/testing/selftests/x86/test_shadow_stack.c
index 757e6527f67e..e3501b7e2ecc 100644
--- a/tools/testing/selftests/x86/test_shadow_stack.c
+++ b/tools/testing/selftests/x86/test_shadow_stack.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Define the ABI defines if needed, so people can run the tests
@@ -681,6 +682,144 @@ int test_32bit(void)
return !segv_triggered;
 }
 
+static int parse_uint_from_file(const char *file, const char *fmt)
+{
+   int err, ret;
+   FILE *f;
+
+   f = fopen(file, "re");
+   if (!f) {
+   err = -errno;
+   printf("failed to open '%s': %d\n", file, err);
+   return err;
+   }
+   err = fscanf(f, fmt, &ret);
+   if (err != 1) {
+   err = err == EOF ? -EIO : -errno;
+   printf("failed to parse '%s': %d\n", file, err);
+   fclose(f);
+   return err;
+   }
+   fclose(f);
+   return ret;
+}
+
+static int determine_uprobe_perf_type(void)
+{
+   const char *file = "/sys/bus/event_source/devices/uprobe/type";
+
+   return parse_uint_from_file(file, "%d\n");
+}
+
+static int determine_uprobe_retprobe_bit(void)
+{
+   const char *file = "/sys/bus/event_source/devices/uprobe/format/retprobe";
+
+   return parse_uint_from_file(file, "config:%d\n");
+}
+
+static ssize_t get_uprobe_offset(const void *addr)
+{
+   size_t start, end, base;
+   char buf[256];
+   bool found = false;
+   FILE *f;
+
+   f = fopen("/proc/self/maps", "r");
+   if (!f)
+   return -errno;
+
+   while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &base) == 4) {
+   if (buf[2] == 'x' && (uintptr_t)addr >= start && (uintptr_t)addr < end) {
+   found = true;
+   break;
+   }
+   }
+
+   fclose(f);
+
+   if (!found)
+   return -ESRCH;
+
+   return (uintptr_t)addr - start + base;
+}
+
+static __attribute__((noinline)) void uretprobe_trigger(void)
+{
+   asm volatile ("");
+}
+
+/*
+ * This test sets up a return uprobe, which is sensitive to shadow stack
+ * (crashes without extra fix). After executing the uretprobe we fail
+ * the test if we receive SIGSEGV, no crash means we're good.
+ *
+ * Helper functions above borrowed from bpf selftests.
+ */
+static int test_uretprobe(void)
+{
+   const size_t attr_sz = sizeof(struct perf_event_attr);
+   const char *file = "/proc/self/exe";
+   int bit, fd = 0, type, err = 1;
+   struct perf_event_attr attr;
+   struct sigaction sa = {};
+   ssize_t offset;
+
+   type = determine_uprobe_perf_type();
+   if (type < 0) {
+   if (type == -ENOENT)
+   printf("[SKIP]\tUretprobe test, uprobes are not available\n");
+   return 0;
+   }
+
+   offset = get_uprobe_offset(uretprobe_trigger);
+   if (offset < 0)
+   return 1;
+
+   bit = determine_uprobe_retprobe_bit();
+   if (bit < 0)
+   return 1;
+
+   sa.sa_sigaction = segv_gp_handler;
+   sa.sa_flags = SA_SIGINFO;
+   if (sigaction(SIGSEGV, &sa, NULL))
+   return 1;
+
+   /* Setup return uprobe through perf event interface. */
+   memset(&attr, 0, attr_sz);
+   attr.size = attr_sz;
+   attr.type = type;
+   attr.config = 1 << bit;
+   attr.config1 = (__u64) (unsigned long) file;
+   attr.config2 = offset;
+
+   fd = syscall(__NR_perf_event_open, &attr, 0 /* pid */, -1 /* cpu */,
+-1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
+   if (fd < 0)
+   goto out;
+
+   if (sigsetjmp(jmp_buffer, 1))
+   goto out;
+
+   ARCH_PRCTL(ARCH_SHSTK_ENABLE, ARCH_SHSTK_SHSTK);
+
+   /*
+* This either segfaults and goes through sigsetjmp above
+* or succeeds and we're good.
+*/
+   uretprobe_trigger();
+
+   printf("[OK]\tUretprobe test\n");
+   err = 0;
+
+out:
+   ARCH_PRCTL(ARCH_SHSTK_DISABLE, ARCH_SHSTK_SHSTK);
+   signal(SIGSEGV, SIG_DFL);
+   if (fd)
+   close(fd);
+   return err;
+}
+
 void segv_handler_ptrace(int signum, siginfo_t *si, void *uc)
 {
/* The SSP adjustment caused a segfault. */
@@ -867,6 +1006,12 @@ int main(int argc, char *argv[])
goto out;
}
 
+   if (test_uretprobe()) {
+   ret = 1;
+   printf("[FAIL]\turetprobe test\n");
+   goto out;
+   }
+
return ret;
 
 out:
-- 
2.45.0




[PATCHv6 bpf-next 3/9] uprobe: Add uretprobe syscall to speed up return probe

2024-05-21 Thread Jiri Olsa
Adding uretprobe syscall instead of trap to speed up return probe.

At the moment the uretprobe setup/path is:

  - install entry uprobe

  - when the uprobe is hit, it overwrites probed function's return address
on stack with address of the trampoline that contains breakpoint
instruction

  - the breakpoint trap code handles the uretprobe consumers execution and
jumps back to original return address

This patch replaces the above trampoline's breakpoint instruction with new
uretprobe syscall call. This syscall does exactly the same job as the trap
with some more extra work:

  - syscall trampoline must save original value for rax/r11/rcx registers
on stack - rax is set to syscall number and r11/rcx are changed and
used by syscall instruction

  - the syscall code reads the original values of those registers and
restores those values in task's pt_regs area

  - only caller from trampoline exposed in '[uprobes]' is allowed,
the process will receive SIGILL signal otherwise

Even with some extra work, using the uretprobes syscall shows speed
improvement (compared to using standard breakpoint):

  On Intel (11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz)

  current:
uretprobe-nop  :1.498 ± 0.000M/s
uretprobe-push :1.448 ± 0.001M/s
uretprobe-ret  :0.816 ± 0.001M/s

  with the fix:
uretprobe-nop  :1.969 ± 0.002M/s  < 31% speed up
uretprobe-push :1.910 ± 0.000M/s  < 31% speed up
uretprobe-ret  :0.934 ± 0.000M/s  < 14% speed up

  On AMD (AMD Ryzen 7 5700U)

  current:
uretprobe-nop  :0.778 ± 0.001M/s
uretprobe-push :0.744 ± 0.001M/s
uretprobe-ret  :0.540 ± 0.001M/s

  with the fix:
uretprobe-nop  :0.860 ± 0.001M/s  < 10% speed up
uretprobe-push :0.818 ± 0.001M/s  < 10% speed up
uretprobe-ret  :0.578 ± 0.000M/s  <  7% speed up
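
The percentages quoted next to each row follow from the ratio of the two
throughput figures. As a small worked sketch (speedup_percent() is a
hypothetical helper, not part of the benchmark):

```c
/*
 * Relative throughput gain in percent, given the M/s figures measured
 * before and after the change: (after / before - 1) * 100.
 */
static double speedup_percent(double before, double after)
{
	return (after / before - 1.0) * 100.0;
}
```

For the Intel uretprobe-nop row, speedup_percent(1.498, 1.969) comes out at
roughly 31.4, in line with the 31% noted above; the AMD uretprobe-ret row
gives roughly 7.0.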

The performance test spawns a thread that runs loop which triggers
uprobe with attached bpf program that increments the counter that
gets printed in results above.

The uprobe (and uretprobe) kind is determined by which instruction
is being patched with breakpoint instruction. That's also important
for uretprobes, because uprobe is installed for each uretprobe.

The performance test is part of bpf selftests:
  tools/testing/selftests/bpf/run_bench_uprobes.sh

Note at the moment uretprobe syscall is supported only for native
64-bit process, compat process still uses standard breakpoint.

Note that when shadow stack is enabled the uretprobe syscall returns
via iret, which is slower than return via sysret, but won't cause the
shadow stack violation.

Suggested-by: Andrii Nakryiko 
Reviewed-by: Oleg Nesterov 
Reviewed-by: Masami Hiramatsu (Google) 
Acked-by: Andrii Nakryiko 
Signed-off-by: Oleg Nesterov 
Signed-off-by: Jiri Olsa 
---
 arch/x86/include/asm/shstk.h |   2 +
 arch/x86/kernel/shstk.c  |   5 ++
 arch/x86/kernel/uprobes.c| 117 +++
 include/linux/uprobes.h  |   3 +
 kernel/events/uprobes.c  |  24 ---
 5 files changed, 144 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 896909f306e3..4cb77e004615 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -22,6 +22,7 @@ void shstk_free(struct task_struct *p);
 int setup_signal_shadow_stack(struct ksignal *ksig);
 int restore_signal_shadow_stack(void);
 int shstk_update_last_frame(unsigned long val);
+bool shstk_is_enabled(void);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
   unsigned long arg2) { return -EINVAL; }
@@ -33,6 +34,7 @@ static inline void shstk_free(struct task_struct *p) {}
 static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
 static inline int restore_signal_shadow_stack(void) { return 0; }
 static inline int shstk_update_last_frame(unsigned long val) { return 0; }
+static inline bool shstk_is_enabled(void) { return false; }
 #endif /* CONFIG_X86_USER_SHADOW_STACK */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 9797d4cdb78a..059685612362 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -588,3 +588,8 @@ int shstk_update_last_frame(unsigned long val)
ssp = get_user_shstk_addr();
return write_user_shstk_64((u64 __user *)ssp, (u64)val);
 }
+
+bool shstk_is_enabled(void)
+{
+   return features_enabled(ARCH_SHSTK_SHSTK);
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 6402fb3089d2..5a952c5ea66b 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -308,6 +309,122 @@ static int uprobe_init_insn(struct arch_uprobe *auprobe, struct insn *insn, bool
 }
 
 #ifdef CONFIG_X86_64
+
+asm (
+   ".pushsection .rodata\n"
+   ".global uretprobe_trampoline_entry\n"
+ 

[PATCHv6 bpf-next 2/9] uprobe: Wire up uretprobe system call

2024-05-21 Thread Jiri Olsa
Wiring up uretprobe system call, which comes in following changes.
We need to do the wiring before, because the uretprobe implementation
needs the syscall number.

Note at the moment uretprobe syscall is supported only for native
64-bit process.

Reviewed-by: Oleg Nesterov 
Reviewed-by: Masami Hiramatsu (Google) 
Acked-by: Andrii Nakryiko 
Signed-off-by: Jiri Olsa 
---
 arch/x86/entry/syscalls/syscall_64.tbl | 1 +
 include/linux/syscalls.h   | 2 ++
 include/uapi/asm-generic/unistd.h  | 5 -
 kernel/sys_ni.c| 2 ++
 4 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index cc78226ffc35..47dfea0a827c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -383,6 +383,7 @@
 459    common  lsm_get_self_attr       sys_lsm_get_self_attr
 460    common  lsm_set_self_attr       sys_lsm_set_self_attr
 461    common  lsm_list_modules        sys_lsm_list_modules
+462    64      uretprobe               sys_uretprobe
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e619ac10cd23..5318e0e76799 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -972,6 +972,8 @@ asmlinkage long sys_lsm_list_modules(u64 *ids, u32 *size, u32 flags);
 /* x86 */
 asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int on);
 
+asmlinkage long sys_uretprobe(void);
+
 /* pciconfig: alpha, arm, arm64, ia64, sparc */
 asmlinkage long sys_pciconfig_read(unsigned long bus, unsigned long dfn,
unsigned long off, unsigned long len,
diff --git a/include/uapi/asm-generic/unistd.h 
b/include/uapi/asm-generic/unistd.h
index 75f00965ab15..8a747cd1d735 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -842,8 +842,11 @@ __SYSCALL(__NR_lsm_set_self_attr, sys_lsm_set_self_attr)
 #define __NR_lsm_list_modules 461
 __SYSCALL(__NR_lsm_list_modules, sys_lsm_list_modules)
 
+#define __NR_uretprobe 462
+__SYSCALL(__NR_uretprobe, sys_uretprobe)
+
 #undef __NR_syscalls
-#define __NR_syscalls 462
+#define __NR_syscalls 463
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index faad00cce269..be6195e0d078 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -391,3 +391,5 @@ COND_SYSCALL(setuid16);
 
 /* restartable sequence */
 COND_SYSCALL(rseq);
+
+COND_SYSCALL(uretprobe);
-- 
2.45.0
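
As an illustration of what "wiring up" means here, the following is a small userspace model, not kernel code. It assumes a flat table indexed by syscall number, a stub that returns -ENOSYS (the role COND_SYSCALL plays until a real handler is linked in), and a hypothetical dispatch() helper. The names sys_ni_model, sys_uretprobe_model, table_without_impl, table_with_impl and dispatch are made up for this sketch; only the numbers 462 and 463 come from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Numbers taken from the patch; everything else is an illustrative model. */
#define NR_URETPROBE 462
#define NR_SYSCALLS  463	/* bumped from 462 so that slot 462 is in range */

typedef long (*syscall_fn)(void);

/* Stand-in for the "not implemented" fallback. */
static long sys_ni_model(void) { return -ENOSYS; }

/* Stand-in for the real handler that a later patch provides. */
static long sys_uretprobe_model(void) { return 0; }

/* Slot reserved, but no implementation linked in yet. */
static const syscall_fn table_without_impl[NR_SYSCALLS] = {
	[NR_URETPROBE] = sys_ni_model,
};

/* Same slot once the implementation exists. */
static const syscall_fn table_with_impl[NR_SYSCALLS] = {
	[NR_URETPROBE] = sys_uretprobe_model,
};

/* Hypothetical dispatcher: unknown or empty slots behave like ENOSYS. */
static long dispatch(const syscall_fn *table, long nr)
{
	assert(table);
	if (nr < 0 || nr >= NR_SYSCALLS || !table[nr])
		return -ENOSYS;
	return table[nr]();
}
```

In this model the number is usable as soon as the slot exists, which is why the wiring can land before the implementation: callers simply see -ENOSYS until the handler is provided.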




[PATCHv6 bpf-next 1/9] x86/shstk: Make return uprobe work with shadow stack

2024-05-21 Thread Jiri Olsa
Currently an application with shadow stack enabled will crash
if it sets up a return uprobe. The reason is that the uretprobe
kernel code changes the user space task's stack, but does not
update the shadow stack accordingly.

Add new functions to update values on the shadow stack and use
them in the uprobe code to keep the shadow stack in sync with
uretprobe changes to the user stack.

Fixes: 8b1c23543436 ("x86/shstk: Add return uprobe support")
Signed-off-by: Jiri Olsa 
---
 arch/x86/include/asm/shstk.h |  2 ++
 arch/x86/kernel/shstk.c  | 11 +++
 arch/x86/kernel/uprobes.c|  7 ++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 42fee8959df7..896909f306e3 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -21,6 +21,7 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, 
unsigned long clon
 void shstk_free(struct task_struct *p);
 int setup_signal_shadow_stack(struct ksignal *ksig);
 int restore_signal_shadow_stack(void);
+int shstk_update_last_frame(unsigned long val);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
   unsigned long arg2) { return -EINVAL; }
@@ -31,6 +32,7 @@ static inline unsigned long shstk_alloc_thread_stack(struct 
task_struct *p,
 static inline void shstk_free(struct task_struct *p) {}
 static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
 static inline int restore_signal_shadow_stack(void) { return 0; }
+static inline int shstk_update_last_frame(unsigned long val) { return 0; }
 #endif /* CONFIG_X86_USER_SHADOW_STACK */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 6f1e9883f074..9797d4cdb78a 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -577,3 +577,14 @@ long shstk_prctl(struct task_struct *task, int option, 
unsigned long arg2)
return wrss_control(true);
return -EINVAL;
 }
+
+int shstk_update_last_frame(unsigned long val)
+{
+   unsigned long ssp;
+
+   if (!features_enabled(ARCH_SHSTK_SHSTK))
+   return 0;
+
+   ssp = get_user_shstk_addr();
+   return write_user_shstk_64((u64 __user *)ssp, (u64)val);
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 6c07f6daaa22..6402fb3089d2 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -1076,8 +1076,13 @@ arch_uretprobe_hijack_return_addr(unsigned long 
trampoline_vaddr, struct pt_regs
return orig_ret_vaddr;
 
nleft = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, 
rasize);
-   if (likely(!nleft))
+   if (likely(!nleft)) {
+   if (shstk_update_last_frame(trampoline_vaddr)) {
+   force_sig(SIGSEGV);
+   return -1;
+   }
return orig_ret_vaddr;
+   }
 
if (nleft != rasize) {
pr_err("return address clobbered: pid=%d, %%sp=%#lx, 
%%ip=%#lx\n",
-- 
2.45.0
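
The failure mode described in this patch can be sketched with a tiny userspace model. This is only an illustration under simplifying assumptions: one return-address slot on the normal stack, one matching slot on the shadow stack, and a "return" that is allowed only when both agree. The struct task_model type and the functions hijack_return_addr and return_is_allowed are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of one frame: the two copies of the return address. */
struct task_model {
	uint64_t stack_ret;	/* slot on the regular user stack */
	uint64_t shstk_ret;	/* matching slot on the shadow stack */
	bool shstk_enabled;
};

/*
 * Replace the return address with the trampoline address and hand back
 * the original one. With sync_shstk == false this models the old
 * behaviour, where only the regular stack was rewritten.
 */
static uint64_t hijack_return_addr(struct task_model *t, uint64_t trampoline,
				   bool sync_shstk)
{
	uint64_t orig;

	assert(t);
	orig = t->stack_ret;
	t->stack_ret = trampoline;
	if (sync_shstk && t->shstk_enabled)
		t->shstk_ret = trampoline;
	return orig;
}

/* Model of the check done on return when shadow stack is active. */
static bool return_is_allowed(const struct task_model *t)
{
	return !t->shstk_enabled || t->stack_ret == t->shstk_ret;
}
```

With shadow stack enabled, calling hijack_return_addr() without the sync leaves the two copies different, which corresponds to the crash the commit message describes; with the sync both slots hold the trampoline address.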




[PATCHv6 bpf-next 0/9] uprobe: uretprobe speed up

2024-05-21 Thread Jiri Olsa
hi,
as part of the effort to speed up uprobes [0], this series optimizes
return uprobes by using a syscall instead of the trap on the
uretprobe trampoline.

The speed up depends on the type of instruction the uprobe is installed
on and on the specific HW type, please check patch 3 for details.

Patches 1-8 are based on bpf-next/master, but patches 2 and 3 also
apply to the linux-trace.git tree probes/for-next branch.
Patch 9 is based on man-pages master.

v6 changes:
- separate shadow stack fix for current uretprobe in patch 1
- skip shadow stack test when uprobe is not compiled in [Masami]
- fix retprobe with the shadow stack, using iret return when
  shadow stack is detected
- I kept the acks on patch 3, because the shadow stack change is
  minimal and the original code is almost untouched
- added shadow stack bpf selftest
- rebased man page

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  uretprobe_syscall

thanks,
jirka


Notes on checklist items in Documentation/process/adding-syscalls.rst:

- System Call Alternatives
  A new syscall seems like the best way here, because we just need
  to enter the kernel quickly with no extra argument processing,
  which we'd have to do if we decided to use another syscall.

- Designing the API: Planning for Extension
  The uretprobe syscall is very specific and most likely won't be
  extended in the future.

  At the moment it does not take any arguments, and even if it does
  in the future, it may only be called from the trampoline prepared
  by the kernel, so no user will be broken.

- Designing the API: Other Considerations
  N/A because uretprobe syscall does not return reference to kernel
  object.

- Proposing the API
  Wiring up of the uretprobe system call is in separate change,
  selftests and man page changes are part of the patchset.

- Generic System Call Implementation
  There's no CONFIG option for the new functionality because it
  keeps the same behaviour from the user POV.

- x86 System Call Implementation
  It's 64-bit syscall only.

- Compatibility System Calls (Generic)
  N/A uretprobe syscall has no arguments and is not supported
  for compat processes.

- Compatibility System Calls (x86)
  N/A uretprobe syscall is not supported for compat processes.

- System Calls Returning Elsewhere
  N/A.

- Other Details
  N/A.

- Testing
  Added new bpf selftests and ran LTP on top of this change.

- Man Page
  Attached.

- Do not call System Calls in the Kernel
  N/A.


[0] https://lore.kernel.org/bpf/ZeCXHKJ--iYYbmLj@krava/
---
Jiri Olsa (8):
  x86/shstk: Make return uprobe work with shadow stack
  uprobe: Wire up uretprobe system call
  uprobe: Add uretprobe syscall to speed up return probe
  selftests/x86: Add return uprobe shadow stack test
  selftests/bpf: Add uretprobe syscall test for regs integrity
  selftests/bpf: Add uretprobe syscall test for regs changes
  selftests/bpf: Add uretprobe syscall call from user space test
  selftests/bpf: Add uretprobe shadow stack test

 arch/x86/entry/syscalls/syscall_64.tbl  |   1 +
 arch/x86/include/asm/shstk.h|   4 +
 arch/x86/kernel/shstk.c |  16 
 arch/x86/kernel/uprobes.c   | 124 
-
 include/linux/syscalls.h|   2 +
 include/linux/uprobes.h |   3 +
 include/uapi/asm-generic/unistd.h   |   5 +-
 kernel/events/uprobes.c |  24 --
 kernel/sys_ni.c |   2 +
 tools/include/linux/compiler.h  |   4 +
 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c   | 123 
-
 tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c | 385 
+++
 tools/testing/selftests/bpf/progs/uprobe_syscall.c  |  15 
 tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c |  17 
 tools/testing/selftests/x86/test_shadow_stack.c | 145 
++
 15 files changed, 860 insertions(+), 10 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/uprobe_syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/uprobe_syscall_executed.c

Jiri Olsa (1):
  man2: Add uretprobe syscall page

 man/man2/uretprobe.2 | 50 ++
 1 file changed, 50 insertions(+)
 create mode 100644 man/man2/uretprobe.2



Re: [PATCHv5 bpf-next 6/8] x86/shstk: Add return uprobe support

2024-05-21 Thread Jiri Olsa
On Tue, May 21, 2024 at 01:31:53AM +, Edgecombe, Rick P wrote:
> On Mon, 2024-05-20 at 00:18 +0200, Jiri Olsa wrote:
> > anyway I think we can fix that in another way by using the optimized
> > trampoline,
> > but returning to the user space through iret when shadow stack is detected
> > (as I did in the first version, before you adjusted it to the sysret path).
> > 
> > we need to update the return address on stack only when returning through 
> > the
> > trampoline, but we can jump to original return address directly from syscall
> > through iret.. which is slower, but with shadow stack we don't care
> > 
> > basically the only change is adding the shstk_is_enabled check to the
> > following condition in SYSCALL_DEFINE0(uretprobe):
> > 
> > if (regs->sp != sp || shstk_is_enabled())
> > return regs->ax;
> 
> On the surface it sounds reasonable. Thanks.
> 
> And then I guess if tradeoffs are seen differently in the future, and we want 
> to
> enable the fast path for shadow stack we can go with your other solution. So
> this just simply fixes things functionally without much code.

yes, if we want to enable the fast path for shadow stack in the future
we'll need to remove that shstk_is_enabled check and push an extra frame
on the shadow stack

jirka
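
The condition quoted above can be read as a choice between two return paths. The following is a minimal userspace sketch of that decision only; it assumes, as the discussion states, that the iret path is taken when the stack pointer changed or when shadow stack is enabled. The enum and the function pick_return_path are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum ret_path {
	RET_FAST_SYSRET,	/* return through the trampoline via sysret */
	RET_SLOW_IRET,		/* jump to the original address via iret */
};

/*
 * Mirrors the shape of "regs->sp != sp || shstk_is_enabled()":
 * sp_before is the stack pointer saved on entry, sp_after the one
 * found after the uretprobe handlers ran.
 */
static enum ret_path pick_return_path(uint64_t sp_before, uint64_t sp_after,
				      bool shstk_enabled)
{
	if (sp_after != sp_before || shstk_enabled)
		return RET_SLOW_IRET;
	return RET_FAST_SYSRET;
}
```

The trade-off discussed in the thread is visible here: shadow stack users always take the slower path, in exchange for not having to push an extra frame on the shadow stack.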



Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-21 Thread Dmitry Baryshkov
On Tue, 21 May 2024 at 12:52,  wrote:
>
> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> > Qualcomm platforms have different sets of the firmware files, which
> > differ from platform to platform (and from board to board, due to the
> > embedded signatures). Rather than listing all the firmware files,
> > including full paths, in the DT, provide a way to determine firmware
> > path based on the root DT node compatible.
>
> Ok this looks quite over-engineered but necessary to handle the legacy,
> but I really think we should add a way to look for a board-specific path
> first and fallback to those SoC specific paths.

Again, CONFIG_FW_LOADER_USER_HELPER => delays.

>
> Neil
>
> >
> > Suggested-by: Arnd Bergmann 
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >   drivers/soc/qcom/Kconfig   |  5 +++
> >   drivers/soc/qcom/Makefile  |  1 +
> >   drivers/soc/qcom/qcom_fw_helper.c  | 86 
> > ++
> >   include/linux/soc/qcom/fw_helper.h | 10 +
> >   4 files changed, 102 insertions(+)
> >
> > diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> > index 5af33b0e3470..b663774d65f8 100644
> > --- a/drivers/soc/qcom/Kconfig
> > +++ b/drivers/soc/qcom/Kconfig
> > @@ -62,6 +62,11 @@ config QCOM_MDT_LOADER
> >   tristate
> >   select QCOM_SCM
> >
> > +config QCOM_FW_HELPER
> > + tristate "NONE FW HELPER"
> > + help
> > +   Helpers to return platform-specific location for the firmware files.
> > +
> >   config QCOM_OCMEM
> >   tristate "Qualcomm On Chip Memory (OCMEM) driver"
> >   depends on ARCH_QCOM
> > diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> > index ca0bece0dfff..e612bee5b955 100644
> > --- a/drivers/soc/qcom/Makefile
> > +++ b/drivers/soc/qcom/Makefile
> > @@ -6,6 +6,7 @@ obj-$(CONFIG_QCOM_GENI_SE) += qcom-geni-se.o
> >   obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
> >   obj-$(CONFIG_QCOM_GSBI) +=  qcom_gsbi.o
> >   obj-$(CONFIG_QCOM_MDT_LOADER)   += mdt_loader.o
> > +obj-$(CONFIG_QCOM_FW_HELPER) += qcom_fw_helper.o
> >   obj-$(CONFIG_QCOM_OCMEM)+= ocmem.o
> >   obj-$(CONFIG_QCOM_PDR_HELPERS)  += pdr_interface.o
> >   obj-$(CONFIG_QCOM_PMIC_GLINK)   += pmic_glink.o
> > diff --git a/drivers/soc/qcom/qcom_fw_helper.c 
> > b/drivers/soc/qcom/qcom_fw_helper.c
> > new file mode 100644
> > index ..13123c2514b8
> > --- /dev/null
> > +++ b/drivers/soc/qcom/qcom_fw_helper.c
> > @@ -0,0 +1,86 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Qualcomm Firmware loading data
> > + *
> > + * Copyright (C) 2024 Linaro Ltd
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +static DEFINE_MUTEX(qcom_fw_mutex);
> > +static const char *fw_path;
> > +
> > +static const struct of_device_id qcom_fw_paths[] = {
> > + /* device-specific entries */
> > + { .compatible = "thundercomm,db845c", .data = 
> > "qcom/sdm845/Thundercomm/db845c", },
> > + { .compatible = "qcom,qrb5165-rb5", .data = 
> > "qcom/sm8250/Thundercomm/RB5", },
> > + /* SoC default entries */
> > + { .compatible = "qcom,apq8016", .data = "qcom/apq8016", },
> > + { .compatible = "qcom,apq8096", .data = "qcom/apq8096", },
> > + { .compatible = "qcom,sdm845", .data = "qcom/sdm845", },
> > + { .compatible = "qcom,sm8250", .data = "qcom/sm8250", },
> > + { .compatible = "qcom,sm8350", .data = "qcom/sm8350", },
> > + { .compatible = "qcom,sm8450", .data = "qcom/sm8450", },
> > + { .compatible = "qcom,sm8550", .data = "qcom/sm8550", },
> > + { .compatible = "qcom,sm8650", .data = "qcom/sm8650", },
> > + {},
> > +};
> > +
> > +static int qcom_fw_ensure_init(void)
> > +{
> > + const struct of_device_id *match;
> > + struct device_node *root;
> > +
> > + if (fw_path)
> > + return 0;
> > +
> > + root = of_find_node_by_path("/");
> > + if (!root)
> > + return -ENODEV;
> > +
> > + match = of_match_node(qcom_fw_paths, root);
> > + of_node_put(root);
> > + if (!match || !match->data) {
> > + pr_notice("Platform not supported by qcom_fw_helper\n");
> > + return -ENODEV;
> > + }
> > +
> > + fw_path = match->data;
> > +
> > + return 0;
> > +}
> > +
> > +const char *qcom_get_board_fw(const char *firmware)
> > +{
> > + if (strchr(firmware, '/'))
> > + return kstrdup(firmware, GFP_KERNEL);
> > +
> > + scoped_guard(mutex, &qcom_fw_mutex) {
> > + if (!qcom_fw_ensure_init())
> > + return kasprintf(GFP_KERNEL, "%s/%s", fw_path, 
> > firmware);
> > + }
> > +
> > + return kstrdup(firmware, GFP_KERNEL);
> > +}
> > +EXPORT_SYMBOL_GPL(qcom_get_board_fw);
> > +
> > +const char *devm_qcom_get_board_fw(struct device *dev, const char 
> > *firmware)
> > +{
> > + if (strchr(firmware, '/'))
> > + return devm_kstrdup(dev, firmware, GFP_KERNEL);
> > +
> > + scoped_guard(mutex, 

Re: [PATCH 06/12] remoteproc: qcom_q6v5_pas: switch to mbn files by default

2024-05-21 Thread Dmitry Baryshkov
On Tue, 21 May 2024 at 12:49,  wrote:
>
> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> > We have been pushing userspace to use mbn files by default for ages.
> > As a preparation for making the firmware-name optional, make the driver
> > use .mbn instead of .mdt files by default.
>
> I think we should have a mechanism to fallback to .mdt since downstream
> uses split mdt on the devices filesystem.
>
> Perhaps only specify .firmware_name = "adsp" and add a list of allowed 
> extension
> it will try in a loop ?

Such loops can cause unnecessary delays if
CONFIG_FW_LOADER_USER_HELPER is enabled.
Since it is not possible to use the vendor's firmware partition as is
(you have to either bind-mount a subdir or use plenty of symlinks), one
might as well symlink the .mbn to the .mdt file.
Another option is to explicitly specify something like
'firmware-name = "./adsp.mdt";'

But yes, this whole series is a balance of pros and cons, as it was
discussed last week.

-- 
With best wishes
Dmitry
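
For reference, the extension loop suggested in the review could look roughly like the userspace sketch below. It is not proposed driver code: the probe callback stands in for whatever "does this firmware exist" check a driver would use, and the names pick_firmware, probe_fn and only_mdt_present are hypothetical. Each failed probe is one extra lookup, which is where the delay concern above comes from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns non-zero when a file with this name can be loaded. */
typedef int (*probe_fn)(const char *name);

/*
 * Try "<base>.<ext>" for each extension in order. On success the chosen
 * name is left in out and the index of the extension is returned.
 * Returns -1 if nothing matched or the name did not fit into out.
 */
static int pick_firmware(const char *base, const char *const *exts,
			 size_t n_exts, probe_fn probe,
			 char *out, size_t out_len)
{
	assert(base && exts && probe && out);

	for (size_t i = 0; i < n_exts; i++) {
		int len = snprintf(out, out_len, "%s.%s", base, exts[i]);

		if (len < 0 || (size_t)len >= out_len)
			return -1;
		if (probe(out))
			return (int)i;
	}
	return -1;
}

/* Example probe: pretend only the split .mdt image is installed. */
static int only_mdt_present(const char *name)
{
	return strcmp(name, "adsp.mdt") == 0;
}
```

With the extension list { "mbn", "mdt" } and the example probe, the first lookup fails and the second succeeds, so the caller ends up with "adsp.mdt" after one wasted attempt.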



Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-21 Thread neil . armstrong

On 21/05/2024 11:45, Dmitry Baryshkov wrote:

Qualcomm platforms have different sets of the firmware files, which
differ from platform to platform (and from board to board, due to the
embedded signatures). Rather than listing all the firmware files,
including full paths, in the DT, provide a way to determine firmware
path based on the root DT node compatible.


Ok this looks quite over-engineered but necessary to handle the legacy,
but I really think we should add a way to look for a board-specific path
first and fallback to those SoC specific paths.

Neil



Suggested-by: Arnd Bergmann 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/soc/qcom/Kconfig   |  5 +++
  drivers/soc/qcom/Makefile  |  1 +
  drivers/soc/qcom/qcom_fw_helper.c  | 86 ++
  include/linux/soc/qcom/fw_helper.h | 10 +
  4 files changed, 102 insertions(+)

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 5af33b0e3470..b663774d65f8 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -62,6 +62,11 @@ config QCOM_MDT_LOADER
tristate
select QCOM_SCM
  
+config QCOM_FW_HELPER

+   tristate "NONE FW HELPER"
+   help
+ Helpers to return platform-specific location for the firmware files.
+
  config QCOM_OCMEM
tristate "Qualcomm On Chip Memory (OCMEM) driver"
depends on ARCH_QCOM
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index ca0bece0dfff..e612bee5b955 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_QCOM_GENI_SE) +=   qcom-geni-se.o
  obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
  obj-$(CONFIG_QCOM_GSBI)   +=  qcom_gsbi.o
  obj-$(CONFIG_QCOM_MDT_LOADER) += mdt_loader.o
+obj-$(CONFIG_QCOM_FW_HELPER)   += qcom_fw_helper.o
  obj-$(CONFIG_QCOM_OCMEM)  += ocmem.o
  obj-$(CONFIG_QCOM_PDR_HELPERS)+= pdr_interface.o
  obj-$(CONFIG_QCOM_PMIC_GLINK) += pmic_glink.o
diff --git a/drivers/soc/qcom/qcom_fw_helper.c 
b/drivers/soc/qcom/qcom_fw_helper.c
new file mode 100644
index ..13123c2514b8
--- /dev/null
+++ b/drivers/soc/qcom/qcom_fw_helper.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Qualcomm Firmware loading data
+ *
+ * Copyright (C) 2024 Linaro Ltd
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static DEFINE_MUTEX(qcom_fw_mutex);
+static const char *fw_path;
+
+static const struct of_device_id qcom_fw_paths[] = {
+   /* device-specific entries */
+   { .compatible = "thundercomm,db845c", .data = 
"qcom/sdm845/Thundercomm/db845c", },
+   { .compatible = "qcom,qrb5165-rb5", .data = 
"qcom/sm8250/Thundercomm/RB5", },
+   /* SoC default entries */
+   { .compatible = "qcom,apq8016", .data = "qcom/apq8016", },
+   { .compatible = "qcom,apq8096", .data = "qcom/apq8096", },
+   { .compatible = "qcom,sdm845", .data = "qcom/sdm845", },
+   { .compatible = "qcom,sm8250", .data = "qcom/sm8250", },
+   { .compatible = "qcom,sm8350", .data = "qcom/sm8350", },
+   { .compatible = "qcom,sm8450", .data = "qcom/sm8450", },
+   { .compatible = "qcom,sm8550", .data = "qcom/sm8550", },
+   { .compatible = "qcom,sm8650", .data = "qcom/sm8650", },
+   {},
+};
+
+static int qcom_fw_ensure_init(void)
+{
+   const struct of_device_id *match;
+   struct device_node *root;
+
+   if (fw_path)
+   return 0;
+
+   root = of_find_node_by_path("/");
+   if (!root)
+   return -ENODEV;
+
+   match = of_match_node(qcom_fw_paths, root);
+   of_node_put(root);
+   if (!match || !match->data) {
+   pr_notice("Platform not supported by qcom_fw_helper\n");
+   return -ENODEV;
+   }
+
+   fw_path = match->data;
+
+   return 0;
+}
+
+const char *qcom_get_board_fw(const char *firmware)
+{
+   if (strchr(firmware, '/'))
+   return kstrdup(firmware, GFP_KERNEL);
+
+   scoped_guard(mutex, &qcom_fw_mutex) {
+   if (!qcom_fw_ensure_init())
+   return kasprintf(GFP_KERNEL, "%s/%s", fw_path, 
firmware);
+   }
+
+   return kstrdup(firmware, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(qcom_get_board_fw);
+
+const char *devm_qcom_get_board_fw(struct device *dev, const char *firmware)
+{
+   if (strchr(firmware, '/'))
+   return devm_kstrdup(dev, firmware, GFP_KERNEL);
+
+   scoped_guard(mutex, &qcom_fw_mutex) {
+   if (!qcom_fw_ensure_init())
+   return devm_kasprintf(dev, GFP_KERNEL, "%s/%s", 
fw_path, firmware);
+   }
+
+   return devm_kstrdup(dev, firmware, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(devm_qcom_get_board_fw);
+
+MODULE_DESCRIPTION("Firmware helpers for Qualcomm devices");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/soc/qcom/fw_helper.h 
b/include/linux/soc/qcom/fw_helper.h
new file mode 100644
index ..755645386bba
--- /dev/null
+++ 

Re: [PATCH 06/12] remoteproc: qcom_q6v5_pas: switch to mbn files by default

2024-05-21 Thread neil . armstrong

On 21/05/2024 11:45, Dmitry Baryshkov wrote:

We have been pushing userspace to use mbn files by default for ages.
As a preparation for making the firmware-name optional, make the driver
use .mbn instead of .mdt files by default.


I think we should have a mechanism to fallback to .mdt since downstream
uses split mdt on the devices filesystem.

Perhaps only specify .firmware_name = "adsp" and add a list of allowed extension
it will try in a loop ?

Neil



Signed-off-by: Dmitry Baryshkov 
---
  drivers/remoteproc/qcom_q6v5_pas.c | 76 +++---
  1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 54d8005d40a3..4694ec4f038d 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -812,7 +812,7 @@ static void adsp_remove(struct platform_device *pdev)
  
  static const struct adsp_data adsp_resource_init = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.ssr_name = "lpass",
@@ -822,7 +822,7 @@ static const struct adsp_data adsp_resource_init = {
  
  static const struct adsp_data sdm845_adsp_resource_init = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.load_state = "adsp",
@@ -833,7 +833,7 @@ static const struct adsp_data sdm845_adsp_resource_init = {
  
  static const struct adsp_data sm6350_adsp_resource = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -849,7 +849,7 @@ static const struct adsp_data sm6350_adsp_resource = {
  
  static const struct adsp_data sm6375_mpss_resource = {

.crash_reason_smem = 421,
-   .firmware_name = "modem.mdt",
+   .firmware_name = "modem.mbn",
.pas_id = 4,
.minidump_id = 3,
.auto_boot = false,
@@ -864,7 +864,7 @@ static const struct adsp_data sm6375_mpss_resource = {
  
  static const struct adsp_data sm8150_adsp_resource = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -879,7 +879,7 @@ static const struct adsp_data sm8150_adsp_resource = {
  
  static const struct adsp_data sm8250_adsp_resource = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -895,7 +895,7 @@ static const struct adsp_data sm8250_adsp_resource = {
  
  static const struct adsp_data sm8350_adsp_resource = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -911,7 +911,7 @@ static const struct adsp_data sm8350_adsp_resource = {
  
  static const struct adsp_data msm8996_adsp_resource = {

.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -925,7 +925,7 @@ static const struct adsp_data msm8996_adsp_resource = {
  
  static const struct adsp_data cdsp_resource_init = {

.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.ssr_name = "cdsp",
@@ -935,7 +935,7 @@ static const struct adsp_data cdsp_resource_init = {
  
  static const struct adsp_data sdm845_cdsp_resource_init = {

.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.load_state = "cdsp",
@@ -946,7 +946,7 @@ static const struct adsp_data sdm845_cdsp_resource_init = {
  
  static const struct adsp_data sm6350_cdsp_resource = {

.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -962,7 +962,7 @@ static const struct adsp_data sm6350_cdsp_resource = {
  
  static const struct adsp_data sm8150_cdsp_resource = {

.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -977,7 +977,7 @@ static const struct adsp_data sm8150_cdsp_resource = {
  
  static const struct adsp_data sm8250_cdsp_resource = {

.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = 

[PATCH 12/12] arm64: dts: qcom: apq8096-db820c: drop firmware-name properties

2024-05-21 Thread Dmitry Baryshkov
As the drivers default to loading the firmware from the board-specific
location, drop the firmware-name properties.

Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/apq8096-db820c.dts | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/apq8096-db820c.dts 
b/arch/arm64/boot/dts/qcom/apq8096-db820c.dts
index e8148b3d6c50..2c8a77401aa3 100644
--- a/arch/arm64/boot/dts/qcom/apq8096-db820c.dts
+++ b/arch/arm64/boot/dts/qcom/apq8096-db820c.dts
@@ -161,7 +161,6 @@ bluetooth {
 
 _pil {
status = "okay";
-   firmware-name = "qcom/apq8096/adsp.mbn";
 };
 
 _i2c1 {
@@ -253,7 +252,6 @@  {
 _pil {
status = "okay";
pll-supply = <_l12a_1p8>;
-   firmware-name = "qcom/apq8096/mba.mbn", "qcom/apq8096/modem.mbn";
 };
 
 _resin {

-- 
2.39.2




[PATCH 11/12] arm64: dts: qcom: apq8016-sbc: drop firmware-name properties

2024-05-21 Thread Dmitry Baryshkov
As the drivers default to loading the firmware from the board-specific
location, drop the firmware-name properties. In the case of the WCNSS
calibration data, drop the path to the file, retaining just the file
name.

Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/apq8016-sbc.dts | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts 
b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
index aba08424aa38..24779238cc18 100644
--- a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
+++ b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
@@ -260,8 +260,6 @@ _dsi0_out {
 
  {
status = "okay";
-
-   firmware-name = "qcom/apq8016/mba.mbn", "qcom/apq8016/modem.mbn";
 };
 
 _mem {
@@ -388,11 +386,10 @@ _mem {
 
  {
status = "okay";
-   firmware-name = "qcom/apq8016/wcnss.mbn";
 };
 
 _ctrl {
-   firmware-name = "qcom/apq8016/WCNSS_qcom_wlan_nv_sbc.bin";
+   firmware-name = "WCNSS_qcom_wlan_nv_sbc.bin";
 };
 
 _iris {

-- 
2.39.2




[PATCH 10/12] remoteproc: qcom_wcnss: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 7bb22fdb64e4..e0ffcaeca03d 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -279,6 +279,7 @@ config QCOM_WCNSS_PIL
depends on QCOM_SMEM
depends on QCOM_SYSMON || QCOM_SYSMON=n
depends on RPMSG_QCOM_GLINK || RPMSG_QCOM_GLINK=n
+   select QCOM_FW_HELPER
select QCOM_MDT_LOADER
select QCOM_PIL_INFO
select QCOM_RPROC_COMMON

-- 
2.39.2




[PATCH 09/12] remoteproc: qcom_wcnss: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/qcom_wcnss.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/remoteproc/qcom_wcnss.c b/drivers/remoteproc/qcom_wcnss.c
index 421a3943a90d..45fc578ae30b 100644
--- a/drivers/remoteproc/qcom_wcnss.c
+++ b/drivers/remoteproc/qcom_wcnss.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -555,8 +556,13 @@ static int wcnss_probe(struct platform_device *pdev)
if (ret < 0 && ret != -EINVAL)
return ret;
 
+   fw_name = qcom_get_board_fw(fw_name);
+   if (!fw_name)
+   return -ENOMEM;
+
rproc = devm_rproc_alloc(&pdev->dev, pdev->name, &wcnss_ops,
 fw_name, sizeof(*wcnss));
+   kfree(fw_name);
if (!rproc) {
dev_err(&pdev->dev, "unable to allocate remoteproc\n");
return -ENOMEM;

-- 
2.39.2




[PATCH 08/12] remoteproc: qcom_wcnss: switch to mbn files by default

2024-05-21 Thread Dmitry Baryshkov
We have been pushing userspace to use mbn files by default for ages.
As a preparation for making the firmware-name optional, make the driver
use .mbn instead of .mdt files by default.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/qcom_wcnss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/remoteproc/qcom_wcnss.c b/drivers/remoteproc/qcom_wcnss.c
index a7bb9da27029..421a3943a90d 100644
--- a/drivers/remoteproc/qcom_wcnss.c
+++ b/drivers/remoteproc/qcom_wcnss.c
@@ -32,7 +32,7 @@
 #include "qcom_wcnss.h"
 
 #define WCNSS_CRASH_REASON_SMEM422
-#define WCNSS_FIRMWARE_NAME"wcnss.mdt"
+#define WCNSS_FIRMWARE_NAME"wcnss.mbn"
 #define WCNSS_PAS_ID   6
 #define WCNSS_SSCTL_ID 0x13
 

-- 
2.39.2




[PATCH 07/12] remoteproc: qcom_q6v5_pas: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/Kconfig | 1 +
 drivers/remoteproc/qcom_q6v5_pas.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 884e1e69bbb6..7bb22fdb64e4 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -223,6 +223,7 @@ config QCOM_Q6V5_PAS
depends on RPMSG_QCOM_GLINK || RPMSG_QCOM_GLINK=n
depends on QCOM_AOSS_QMP || QCOM_AOSS_QMP=n
select MFD_SYSCON
+   select QCOM_FW_HELPER
select QCOM_PIL_INFO
select QCOM_MDT_LOADER
select QCOM_Q6V5_COMMON
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 4694ec4f038d..893fda54b598 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -705,11 +706,19 @@ static int adsp_probe(struct platform_device *pdev)
&dtb_fw_name);
if (ret < 0 && ret != -EINVAL)
return ret;
+
+   dtb_fw_name = devm_qcom_get_board_fw(&pdev->dev, dtb_fw_name);
+   if (!dtb_fw_name)
+   return -ENOMEM;
}
 
if (desc->minidump_id)
ops = &adsp_minidump_ops;
 
+   fw_name = qcom_get_board_fw(fw_name);
+   if (!fw_name)
+   return -ENOMEM;
+
rproc = devm_rproc_alloc(&pdev->dev, pdev->name, ops, fw_name, 
sizeof(*adsp));
 
if (!rproc) {

-- 
2.39.2




[PATCH 06/12] remoteproc: qcom_q6v5_pas: switch to mbn files by default

2024-05-21 Thread Dmitry Baryshkov
We have been pushing userspace to use mbn files by default for ages.
As a preparation for making the firmware-name optional, make the driver
use .mbn instead of .mdt files by default.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/qcom_q6v5_pas.c | 76 +++---
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 54d8005d40a3..4694ec4f038d 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -812,7 +812,7 @@ static void adsp_remove(struct platform_device *pdev)
 
 static const struct adsp_data adsp_resource_init = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.ssr_name = "lpass",
@@ -822,7 +822,7 @@ static const struct adsp_data adsp_resource_init = {
 
 static const struct adsp_data sdm845_adsp_resource_init = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.load_state = "adsp",
@@ -833,7 +833,7 @@ static const struct adsp_data sdm845_adsp_resource_init = {
 
 static const struct adsp_data sm6350_adsp_resource = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -849,7 +849,7 @@ static const struct adsp_data sm6350_adsp_resource = {
 
 static const struct adsp_data sm6375_mpss_resource = {
.crash_reason_smem = 421,
-   .firmware_name = "modem.mdt",
+   .firmware_name = "modem.mbn",
.pas_id = 4,
.minidump_id = 3,
.auto_boot = false,
@@ -864,7 +864,7 @@ static const struct adsp_data sm6375_mpss_resource = {
 
 static const struct adsp_data sm8150_adsp_resource = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -879,7 +879,7 @@ static const struct adsp_data sm8150_adsp_resource = {
 
 static const struct adsp_data sm8250_adsp_resource = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -895,7 +895,7 @@ static const struct adsp_data sm8250_adsp_resource = {
 
 static const struct adsp_data sm8350_adsp_resource = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -911,7 +911,7 @@ static const struct adsp_data sm8350_adsp_resource = {
 
 static const struct adsp_data msm8996_adsp_resource = {
.crash_reason_smem = 423,
-   .firmware_name = "adsp.mdt",
+   .firmware_name = "adsp.mbn",
.pas_id = 1,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -925,7 +925,7 @@ static const struct adsp_data msm8996_adsp_resource = {
 
 static const struct adsp_data cdsp_resource_init = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.ssr_name = "cdsp",
@@ -935,7 +935,7 @@ static const struct adsp_data cdsp_resource_init = {
 
 static const struct adsp_data sdm845_cdsp_resource_init = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.load_state = "cdsp",
@@ -946,7 +946,7 @@ static const struct adsp_data sdm845_cdsp_resource_init = {
 
 static const struct adsp_data sm6350_cdsp_resource = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -962,7 +962,7 @@ static const struct adsp_data sm6350_cdsp_resource = {
 
 static const struct adsp_data sm8150_cdsp_resource = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -977,7 +977,7 @@ static const struct adsp_data sm8150_cdsp_resource = {
 
 static const struct adsp_data sm8250_cdsp_resource = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   .firmware_name = "cdsp.mbn",
.pas_id = 18,
.auto_boot = true,
.proxy_pd_names = (char*[]){
@@ -992,7 +992,7 @@ static const struct adsp_data sm8250_cdsp_resource = {
 
 static const struct adsp_data sc8280xp_nsp0_resource = {
.crash_reason_smem = 601,
-   .firmware_name = "cdsp.mdt",
+   

[PATCH 05/12] remoteproc: qcom_q6v5_mss: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/Kconfig |  1 +
 drivers/remoteproc/qcom_q6v5_mss.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..884e1e69bbb6 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -202,6 +202,7 @@ config QCOM_Q6V5_MSS
depends on RPMSG_QCOM_GLINK || RPMSG_QCOM_GLINK=n
depends on QCOM_AOSS_QMP || QCOM_AOSS_QMP=n
select MFD_SYSCON
+   select QCOM_FW_HELPER
select QCOM_MDT_LOADER
select QCOM_PIL_INFO
select QCOM_Q6V5_COMMON
diff --git a/drivers/remoteproc/qcom_q6v5_mss.c 
b/drivers/remoteproc/qcom_q6v5_mss.c
index eeaae2505352..1ccd5bb92952 100644
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1990,8 +1991,13 @@ static int q6v5_probe(struct platform_device *pdev)
return ret;
}
 
+   mba_image = qcom_get_board_fw(mba_image);
+   if (!mba_image)
+   return -ENOMEM;
+
rproc = devm_rproc_alloc(&pdev->dev, pdev->name, &q6v5_ops,
 mba_image, sizeof(*qproc));
+   kfree(mba_image);
if (!rproc) {
dev_err(&pdev->dev, "failed to allocate rproc\n");
return -ENOMEM;
@@ -2011,6 +2017,10 @@ static int q6v5_probe(struct platform_device *pdev)
return ret;
}
 
+   qproc->hexagon_mdt_image = devm_qcom_get_board_fw(&pdev->dev, qproc->hexagon_mdt_image);
+   if (!qproc->hexagon_mdt_image)
+   return -ENOMEM;
+
platform_set_drvdata(pdev, qproc);
 
qproc->has_qaccept_regs = desc->has_qaccept_regs;

-- 
2.39.2




[PATCH 04/12] remoteproc: qcom_q6v5_mss: switch to mbn files by default

2024-05-21 Thread Dmitry Baryshkov
We have been pushing userspace to use mbn files by default for ages.
As a preparation for making the firmware-name optional, make the driver
use .mbn instead of .mdt files by default.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/remoteproc/qcom_q6v5_mss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/remoteproc/qcom_q6v5_mss.c 
b/drivers/remoteproc/qcom_q6v5_mss.c
index 1779fc890e10..eeaae2505352 100644
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -2003,7 +2003,7 @@ static int q6v5_probe(struct platform_device *pdev)
qproc = rproc->priv;
qproc->dev = &pdev->dev;
qproc->rproc = rproc;
-   qproc->hexagon_mdt_image = "modem.mdt";
+   qproc->hexagon_mdt_image = "modem.mbn";
ret = of_property_read_string_index(pdev->dev.of_node, "firmware-name",
1, &qproc->hexagon_mdt_image);
if (ret < 0 && ret != -EINVAL) {

-- 
2.39.2




[PATCH 03/12] soc: qcom: wcnss_ctrl: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/soc/qcom/Kconfig  | 1 +
 drivers/soc/qcom/wcnss_ctrl.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index b663774d65f8..3af3f15175e4 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -238,6 +238,7 @@ config QCOM_WCNSS_CTRL
tristate "Qualcomm WCNSS control driver"
depends on ARCH_QCOM || COMPILE_TEST
depends on RPMSG
+   select QCOM_FW_HELPER
help
  Client driver for the WCNSS_CTRL SMD channel, used to download nv
  firmware to a newly booted WCNSS chip.
diff --git a/drivers/soc/qcom/wcnss_ctrl.c b/drivers/soc/qcom/wcnss_ctrl.c
index 148bcbac332d..7d1a4536226a 100644
--- a/drivers/soc/qcom/wcnss_ctrl.c
+++ b/drivers/soc/qcom/wcnss_ctrl.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define WCNSS_REQUEST_TIMEOUT  (5 * HZ)
 #define WCNSS_CBC_TIMEOUT  (10 * HZ)
@@ -214,11 +215,19 @@ static int wcnss_download_nv(struct wcnss_ctrl *wcnss, 
bool *expect_cbc)
if (ret < 0 && ret != -EINVAL)
goto free_req;
 
+   nvbin = qcom_get_board_fw(nvbin);
+   if (!nvbin) {
+   ret = -ENOMEM;
+   goto free_req;
+   }
+
ret = request_firmware(&fw, nvbin, dev);
if (ret < 0) {
dev_err(dev, "Failed to load nv file %s: %d\n", nvbin, ret);
+   kfree(nvbin);
goto free_req;
}
+   kfree(nvbin);
 
data = fw->data;
left = fw->size;

-- 
2.39.2




[PATCH 02/12] wifi: wcn36xx: make use of QCOM_FW_HELPER

2024-05-21 Thread Dmitry Baryshkov
Make the driver use qcom_fw_helper to autodetect the path to the
calibration data file.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/net/wireless/ath/wcn36xx/Kconfig | 1 +
 drivers/net/wireless/ath/wcn36xx/main.c  | 5 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/wireless/ath/wcn36xx/Kconfig 
b/drivers/net/wireless/ath/wcn36xx/Kconfig
index 5832c7ef9352..90239c89676a 100644
--- a/drivers/net/wireless/ath/wcn36xx/Kconfig
+++ b/drivers/net/wireless/ath/wcn36xx/Kconfig
@@ -4,6 +4,7 @@ config WCN36XX
depends on MAC80211 && HAS_DMA
depends on QCOM_WCNSS_CTRL || QCOM_WCNSS_CTRL=n
depends on RPMSG || RPMSG=n
+   select QCOM_FW_HELPER
help
  This module adds support for wireless adapters based on
  Qualcomm Atheros WCN3660 and WCN3680 mobile chipsets.
diff --git a/drivers/net/wireless/ath/wcn36xx/main.c 
b/drivers/net/wireless/ath/wcn36xx/main.c
index e760d8002e09..8d25db81c1d0 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1609,6 +1610,10 @@ static int wcn36xx_probe(struct platform_device *pdev)
goto out_wq;
}
 
+   wcn->nv_file = devm_qcom_get_board_fw(wcn->dev, wcn->nv_file);
+   if (!wcn->nv_file)
+   return -ENOMEM;
+
wcn->smd_channel = qcom_wcnss_open_channel(wcnss, "WLAN_CTRL", 
wcn36xx_smd_rsp_process, hw);
if (IS_ERR(wcn->smd_channel)) {
wcn36xx_err("failed to open WLAN_CTRL channel\n");

-- 
2.39.2




[PATCH 01/12] soc: qcom: add firmware name helper

2024-05-21 Thread Dmitry Baryshkov
Qualcomm platforms have different sets of the firmware files, which
differ from platform to platform (and from board to board, due to the
embedded signatures). Rather than listing all the firmware files,
including full paths, in the DT, provide a way to determine firmware
path based on the root DT node compatible.
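
The lookup rule described above can be sketched in userspace C. This is only an illustration under stated assumptions: `resolve_board_fw` and its `board_path` argument are hypothetical names, with `board_path` standing in for whatever directory the helper derives from the root compatible (NULL when the platform is unknown).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace model of the name resolution.
 * Returns a heap-allocated string that the caller must free(),
 * or NULL if the allocation fails.
 */
char *resolve_board_fw(const char *board_path, const char *firmware)
{
	size_t len;
	char *out;

	/*
	 * A name that already contains a directory, or an unknown
	 * platform, leaves the firmware name unchanged.
	 */
	if (strchr(firmware, '/') || !board_path) {
		out = malloc(strlen(firmware) + 1);
		if (out)
			strcpy(out, firmware);
		return out;
	}

	/* Otherwise prefix the board-specific directory. */
	len = strlen(board_path) + 1 + strlen(firmware) + 1;
	out = malloc(len);
	if (out)
		snprintf(out, len, "%s/%s", board_path, firmware);
	return out;
}
```

In this model the result is always a fresh copy, so the caller frees it on every path, including the one where the name was not changed.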

Suggested-by: Arnd Bergmann 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/soc/qcom/Kconfig   |  5 +++
 drivers/soc/qcom/Makefile  |  1 +
 drivers/soc/qcom/qcom_fw_helper.c  | 86 ++
 include/linux/soc/qcom/fw_helper.h | 10 +
 4 files changed, 102 insertions(+)

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 5af33b0e3470..b663774d65f8 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -62,6 +62,11 @@ config QCOM_MDT_LOADER
tristate
select QCOM_SCM
 
+config QCOM_FW_HELPER
+   tristate "NONE FW HELPER"
+   help
+ Helpers to return platform-specific location for the firmware files.
+
 config QCOM_OCMEM
tristate "Qualcomm On Chip Memory (OCMEM) driver"
depends on ARCH_QCOM
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index ca0bece0dfff..e612bee5b955 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_QCOM_GENI_SE) +=   qcom-geni-se.o
 obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
 obj-$(CONFIG_QCOM_GSBI)+=  qcom_gsbi.o
 obj-$(CONFIG_QCOM_MDT_LOADER)  += mdt_loader.o
+obj-$(CONFIG_QCOM_FW_HELPER)   += qcom_fw_helper.o
 obj-$(CONFIG_QCOM_OCMEM)   += ocmem.o
 obj-$(CONFIG_QCOM_PDR_HELPERS) += pdr_interface.o
 obj-$(CONFIG_QCOM_PMIC_GLINK)  += pmic_glink.o
diff --git a/drivers/soc/qcom/qcom_fw_helper.c 
b/drivers/soc/qcom/qcom_fw_helper.c
new file mode 100644
index ..13123c2514b8
--- /dev/null
+++ b/drivers/soc/qcom/qcom_fw_helper.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Qualcomm Firmware loading data
+ *
+ * Copyright (C) 2024 Linaro Ltd
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static DEFINE_MUTEX(qcom_fw_mutex);
+static const char *fw_path;
+
+static const struct of_device_id qcom_fw_paths[] = {
+   /* device-specific entries */
+   { .compatible = "thundercomm,db845c", .data = 
"qcom/sdm845/Thundercomm/db845c", },
+   { .compatible = "qcom,qrb5165-rb5", .data = 
"qcom/sm8250/Thundercomm/RB5", },
+   /* SoC default entries */
+   { .compatible = "qcom,apq8016", .data = "qcom/apq8016", },
+   { .compatible = "qcom,apq8096", .data = "qcom/apq8096", },
+   { .compatible = "qcom,sdm845", .data = "qcom/sdm845", },
+   { .compatible = "qcom,sm8250", .data = "qcom/sm8250", },
+   { .compatible = "qcom,sm8350", .data = "qcom/sm8350", },
+   { .compatible = "qcom,sm8450", .data = "qcom/sm8450", },
+   { .compatible = "qcom,sm8550", .data = "qcom/sm8550", },
+   { .compatible = "qcom,sm8650", .data = "qcom/sm8650", },
+   {},
+};
+
+static int qcom_fw_ensure_init(void)
+{
+   const struct of_device_id *match;
+   struct device_node *root;
+
+   if (fw_path)
+   return 0;
+
+   root = of_find_node_by_path("/");
+   if (!root)
+   return -ENODEV;
+
+   match = of_match_node(qcom_fw_paths, root);
+   of_node_put(root);
+   if (!match || !match->data) {
+   pr_notice("Platform not supported by qcom_fw_helper\n");
+   return -ENODEV;
+   }
+
+   fw_path = match->data;
+
+   return 0;
+}
+
+const char *qcom_get_board_fw(const char *firmware)
+{
+   if (strchr(firmware, '/'))
+   return kstrdup(firmware, GFP_KERNEL);
+
+   scoped_guard(mutex, &qcom_fw_mutex) {
+   if (!qcom_fw_ensure_init())
+   return kasprintf(GFP_KERNEL, "%s/%s", fw_path, 
firmware);
+   }
+
+   return kstrdup(firmware, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(qcom_get_board_fw);
+
+const char *devm_qcom_get_board_fw(struct device *dev, const char *firmware)
+{
+   if (strchr(firmware, '/'))
+   return devm_kstrdup(dev, firmware, GFP_KERNEL);
+
+   scoped_guard(mutex, &qcom_fw_mutex) {
+   if (!qcom_fw_ensure_init())
+   return devm_kasprintf(dev, GFP_KERNEL, "%s/%s", 
fw_path, firmware);
+   }
+
+   return devm_kstrdup(dev, firmware, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(devm_qcom_get_board_fw);
+
+MODULE_DESCRIPTION("Firmware helpers for Qualcomm devices");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/soc/qcom/fw_helper.h 
b/include/linux/soc/qcom/fw_helper.h
new file mode 100644
index ..755645386bba
--- /dev/null
+++ b/include/linux/soc/qcom/fw_helper.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __QCOM_FW_HELPER_H__
+#define __QCOM_FW_HELPER_H__
+
+struct device;
+
+const char *qcom_get_board_fw(const char *firmware);
+const char *devm_qcom_get_board_fw(struct device *dev, 

[PATCH 00/12] arm64: qcom: autodetect firmware paths

2024-05-21 Thread Dmitry Baryshkov
This is a followup to the discussion during the Linaro Connect. Remove
most of the firmware-name properties from the board DT by using
root node compatible to detect firmware path.
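
As a rough illustration of detecting the path from the root node compatible, here is a userspace C sketch. It assumes the match favours the most specific (earliest listed) compatible string of the root node; `fw_path_entry`, `fw_paths` and `lookup_fw_path` are hypothetical names, and the table is only a small illustrative subset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a root compatible and its firmware directory. */
struct fw_path_entry {
	const char *compatible;
	const char *path;
};

/* Illustrative subset: one board-specific entry and two SoC defaults. */
static const struct fw_path_entry fw_paths[] = {
	{ "thundercomm,db845c", "qcom/sdm845/Thundercomm/db845c" },
	{ "qcom,sdm845", "qcom/sdm845" },
	{ "qcom,sm8250", "qcom/sm8250" },
};

/*
 * root_compat lists the root node compatibles, most specific first.
 * Returns the directory for the first compatible found in the table,
 * or NULL if the platform is not listed.
 */
const char *lookup_fw_path(const char *const *root_compat, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = 0; j < sizeof(fw_paths) / sizeof(fw_paths[0]); j++)
			if (!strcmp(root_compat[i], fw_paths[j].compatible))
				return fw_paths[j].path;
	return NULL;
}
```

With this ordering a board that lists both its own compatible and the SoC compatible resolves to the board directory, while a board known only by its SoC falls back to the SoC directory.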

The most obvious change is that the drivers now have to look for the
MBN firmware files by default, so this might break the case of the user
simply mounting vendor's firmware partition to /lib/firmware and
expecting it to work.

Also things are slightly more complex for the platforms like DB845c and
Qualcomm RB5. These platforms have generic SoC firmware in qcom/sdm845
and qcom/sm8250 and also the board-specific firmware at
qcom/sdm845/Thundercomm/DB845C and qcom/sm8250/Thundercomm/RB5
respectively. Making these boards follow the scheme would require
additional symlinks in the firmware dir.

+Link: qcom/sdm845/Thundercomm/db845c/a630_zap.mbn -> ../../a630_zap.mbn
+Link: qcom/sm8250/Thundercomm/RB5/a650_zap.mbn -> ../../a650_zap.mbn
+Link: qcom/sdm845/Thundercomm/db845c/adsp.mbn -> ../../adsp.mbn
+Link: qcom/sdm845/Thundercomm/db845c/adspr.jsn -> ../../adspr.jsn
+Link: qcom/sdm845/Thundercomm/db845c/adspua.jsn -> ../../adspua.jsn
+Link: qcom/sdm845/Thundercomm/db845c/cdsp.mbn -> ../../cdsp.mbn
+Link: qcom/sdm845/Thundercomm/db845c/cdspr.jsn -> ../../cdspr.jsn
+Link: qcom/sm8250/Thundercomm/RB5/adsp.mbn -> ../../adsp.mbn
+Link: qcom/sm8250/Thundercomm/RB5/adspr.jsn -> ../../adspr.jsn
+Link: qcom/sm8250/Thundercomm/RB5/adspua.jsn -> ../../adspua.jsn
+Link: qcom/sm8250/Thundercomm/RB5/cdsp.mbn -> ../../cdsp.mbn
+Link: qcom/sm8250/Thundercomm/RB5/cdspr.jsn -> ../../cdspr.jsn

Suggested-by: Arnd Bergmann 
Signed-off-by: Dmitry Baryshkov 
---
Dmitry Baryshkov (12):
  soc: qcom: add firmware name helper
  wifi: wcn36xx: make use of QCOM_FW_HELPER
  soc: qcom: wcnss_ctrl: make use of QCOM_FW_HELPER
  remoteproc: qcom_q6v5_mss: switch to mbn files by default
  remoteproc: qcom_q6v5_mss: make use of QCOM_FW_HELPER
  remoteproc: qcom_q6v5_pas: switch to mbn files by default
  remoteproc: qcom_q6v5_pas: make use of QCOM_FW_HELPER
  remoteproc: qcom_wcnss: switch to mbn files by default
  remoteproc: qcom_wcnss: make use of QCOM_FW_HELPER
  remoteproc: qcom_wcnss: make use of QCOM_FW_HELPER
  arm64: dts: qcom: apq8016-sbc: drop firmware-name properties
  arm64: dts: qcom: apq8096-db820c: drop firmware-name properties

 arch/arm64/boot/dts/qcom/apq8016-sbc.dts|  5 +-
 arch/arm64/boot/dts/qcom/apq8096-db820c.dts |  2 -
 drivers/net/wireless/ath/wcn36xx/Kconfig|  1 +
 drivers/net/wireless/ath/wcn36xx/main.c |  5 ++
 drivers/remoteproc/Kconfig  |  3 +
 drivers/remoteproc/qcom_q6v5_mss.c  | 12 +++-
 drivers/remoteproc/qcom_q6v5_pas.c  | 85 +++-
 drivers/remoteproc/qcom_wcnss.c |  8 ++-
 drivers/soc/qcom/Kconfig|  6 ++
 drivers/soc/qcom/Makefile   |  1 +
 drivers/soc/qcom/qcom_fw_helper.c   | 86 +
 drivers/soc/qcom/wcnss_ctrl.c   |  9 +++
 include/linux/soc/qcom/fw_helper.h  | 10 
 13 files changed, 187 insertions(+), 46 deletions(-)
---
base-commit: 632483ea8004edfadd035de36e1ab2c7c4f53158
change-id: 20240520-qcom-firmware-name-aeef265a753a

Best regards,
-- 
Dmitry Baryshkov 




Re: [PATCH v5 2/7] dt-bindings: remoteproc: Add compatibility for TEE support

2024-05-21 Thread Krzysztof Kozlowski
On 21/05/2024 10:09, Arnaud Pouliquen wrote:
> The "st,stm32mp1-m4-tee" compatible is utilized in a system configuration
> where the Cortex-M4 firmware is loaded by the Trusted execution Environment
> (TEE).
> For instance, this compatible is used in both the Linux and OP-TEE
> device-tree:
> - In OP-TEE, a node is defined in the device tree with the
>   st,stm32mp1-m4-tee to support signed remoteproc firmware.
>   Based on DT properties, OP-TEE authenticates, loads, starts, and stops
>   the firmware.
> - On Linux, when the compatibility is set, the Cortex-M resets should not
>   be declared in the device tree.
> 

Not tested.

Please use scripts/get_maintainer.pl to get a list of necessary people
and lists to CC. The command may give you outdated entries when run on
an older kernel, so please be sure to base your patches on a recent
Linux kernel.

Tools like b4 or scripts/get_maintainer.pl provide you with a proper list of
people, so fix your workflow. Tools might also fail if you work on some
ancient tree (don't, instead use mainline), work on fork of kernel
(don't, instead use mainline) or you ignore some maintainers (really
don't). Just use b4 and everything should be fine, although remember
about `b4 prep --auto-to-cc` if you added new patches to the patchset.

You missed at least the devicetree list (maybe more), so this won't be
tested by automated tooling. Performing review on untested code might be
a waste of time, thus I will skip this patch entirely till you follow
the process allowing the patch to be tested.

Please kindly resend and include all necessary To/Cc entries.

Best regards,
Krzysztof




Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-21 Thread Krzysztof Kozlowski
On 20/05/2024 17:11, Luca Weiss wrote:
> Hi Krzysztof
> 
> Ack, sounds good.
> 
> Maybe also from you, any opinion between these two binding styles?
> 
> So first using index of mboxes for the numbering, where for the known
> usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
> 
> The second variant is using mbox-names to get the correct channel-mbox
> mapping.
> 
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = <0>, < 13>, < 9>, < 19>;
> 
> vs.
> 
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = < 13>, < 9>, < 19>;
> +   mbox-names = "ipc-1", "ipc-2", "ipc-3";

Sorry, I don't get it: ipc-1 is the first mailbox, so why would there be <0>
in the first case? Anyway, the question is whether you need to know that some
mailbox is missing. But then it is weird to name them "ipc-1" etc.
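
For reference, the two lookup styles under discussion can be modelled in plain C. This is a sketch only, not driver code: `mbox_index_by_position` and `mbox_index_by_name` are hypothetical helpers, and the `present` array stands in for "this mboxes slot is not an empty <0> placeholder".

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Style 1: host N uses mboxes[N]; unused hosts keep an empty slot.
 * Returns the mboxes index, or -1 if the host has no mailbox.
 */
int mbox_index_by_position(const int *present, size_t count, unsigned int host)
{
	if (host >= count || !present[host])
		return -1;
	return (int)host;
}

/*
 * Style 2: host N is found by searching mbox-names for "ipc-N".
 * Returns the mboxes index, or -1 if the name is absent.
 */
int mbox_index_by_name(const char *const *names, size_t count, unsigned int host)
{
	char want[16];
	size_t i;

	snprintf(want, sizeof(want), "ipc-%u", host);
	for (i = 0; i < count; i++)
		if (!strcmp(names[i], want))
			return (int)i;
	return -1;
}
```

In the first style a missing mailbox is visible as a hole in the list; in the second it is simply a name that is not present.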

Best regards,
Krzysztof




[PATCH v5 4/7] remoteproc: core introduce rproc_set_rsc_table_on_start function

2024-05-21 Thread Arnaud Pouliquen
Split rproc_start() to prepare for updating the management of
the cache table on start, in support of firmware loading
by the TEE interface.
- create rproc_set_rsc_table_on_start() to address the management of
  the cache table in a specific function, as done in
  rproc_reset_rsc_table_on_stop().
- rename rproc_set_rsc_table() to rproc_set_rsc_table_on_attach()
- move rproc_reset_rsc_table_on_stop() to be close to the
  rproc_set_rsc_table_on_start() function
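
The cache-table handling on stop can be modelled in userspace C as below. This is a simplified sketch: `rproc_model` is a hypothetical stand-in holding only the three fields involved, and the freeing of the clean table is deliberately left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, pared-down stand-in for the fields used on stop. */
struct rproc_model {
	void *table_ptr;    /* resource table currently in use */
	void *cached_table; /* copy owned by the core, if any */
	size_t table_sz;    /* size of the table in bytes */
};

/* Returns 0 on success, -1 if the copy cannot be allocated. */
int reset_rsc_table_on_stop(struct rproc_model *r)
{
	/* A resource table was never retrieved: nothing to do. */
	if (!r->table_ptr)
		return 0;

	/*
	 * No cached copy means the processor was started by another
	 * entity, so duplicate the table that is currently in use.
	 */
	if (!r->cached_table) {
		r->cached_table = malloc(r->table_sz);
		if (!r->cached_table)
			return -1;
		memcpy(r->cached_table, r->table_ptr, r->table_sz);
	}

	/* Use the copy for the remainder of the shutdown. */
	r->table_ptr = r->cached_table;
	return 0;
}
```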

Suggested-by: Mathieu Poirier 
Signed-off-by: Arnaud Pouliquen 
---
 drivers/remoteproc/remoteproc_core.c | 116 ++-
 1 file changed, 62 insertions(+), 54 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c 
b/drivers/remoteproc/remoteproc_core.c
index f276956f2c5c..42bca01f3bde 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1264,18 +1264,9 @@ void rproc_resource_cleanup(struct rproc *rproc)
 }
 EXPORT_SYMBOL(rproc_resource_cleanup);
 
-static int rproc_start(struct rproc *rproc, const struct firmware *fw)
+static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct 
firmware *fw)
 {
struct resource_table *loaded_table;
-   struct device *dev = &rproc->dev;
-   int ret;
-
-   /* load the ELF segments to memory */
-   ret = rproc_load_segments(rproc, fw);
-   if (ret) {
-   dev_err(dev, "Failed to load program segments: %d\n", ret);
-   return ret;
-   }
 
/*
 * The starting device has been given the rproc->cached_table as the
@@ -1291,6 +1282,64 @@ static int rproc_start(struct rproc *rproc, const struct 
firmware *fw)
rproc->table_ptr = loaded_table;
}
 
+   return 0;
+}
+
+static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
+{
+   /* A resource table was never retrieved, nothing to do here */
+   if (!rproc->table_ptr)
+   return 0;
+
+   /*
+* If a cache table exists the remote processor was started by
+* the remoteproc core.  That cache table should be used for
+* the rest of the shutdown process.
+*/
+   if (rproc->cached_table)
+   goto out;
+
+   /*
+* If we made it here the remote processor was started by another
+* entity and a cache table doesn't exist.  As such make a copy of
+* the resource table currently used by the remote processor and
+* use that for the rest of the shutdown process.  The memory
+* allocated here is free'd in rproc_shutdown().
+*/
+   rproc->cached_table = kmemdup(rproc->table_ptr,
+ rproc->table_sz, GFP_KERNEL);
+   if (!rproc->cached_table)
+   return -ENOMEM;
+
+   /*
+* Since the remote processor is being switched off the clean table
+* won't be needed.  Allocated in rproc_set_rsc_table_on_start().
+*/
+   kfree(rproc->clean_table);
+
+out:
+   /*
+* Use a copy of the resource table for the remainder of the
+* shutdown process.
+*/
+   rproc->table_ptr = rproc->cached_table;
+   return 0;
+}
+
+static int rproc_start(struct rproc *rproc, const struct firmware *fw)
+{
+   struct device *dev = &rproc->dev;
+   int ret;
+
+   /* load the ELF segments to memory */
+   ret = rproc_load_segments(rproc, fw);
+   if (ret) {
+   dev_err(dev, "Failed to load program segments: %d\n", ret);
+   return ret;
+   }
+
+   rproc_set_rsc_table_on_start(rproc, fw);
+
ret = rproc_prepare_subdevices(rproc);
if (ret) {
dev_err(dev, "failed to prepare subdevices for %s: %d\n",
@@ -1450,7 +1499,7 @@ static int rproc_fw_boot(struct rproc *rproc, const 
struct firmware *fw)
return ret;
 }
 
-static int rproc_set_rsc_table(struct rproc *rproc)
+static int rproc_set_rsc_table_on_attach(struct rproc *rproc)
 {
struct resource_table *table_ptr;
struct device *dev = &rproc->dev;
@@ -1540,54 +1589,13 @@ static int rproc_reset_rsc_table_on_detach(struct rproc 
*rproc)
 
/*
 * The clean resource table is no longer needed.  Allocated in
-* rproc_set_rsc_table().
+* rproc_set_rsc_table_on_attach().
 */
kfree(rproc->clean_table);
 
return 0;
 }
 
-static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
-{
-   /* A resource table was never retrieved, nothing to do here */
-   if (!rproc->table_ptr)
-   return 0;
-
-   /*
-* If a cache table exists the remote processor was started by
-* the remoteproc core.  That cache table should be used for
-* the rest of the shutdown process.
-*/
-   if (rproc->cached_table)
-   goto out;
-
-   /*
-* If we made it here the remote processor was started by another
-* entity and a cache table doesn't exist.  As such make a 

[PATCH v5 6/7] remoteproc: stm32: Create sub-functions to request shutdown and release

2024-05-21 Thread Arnaud Pouliquen
To prepare for the support of TEE remoteproc, create sub-functions
that can be used in both cases, with and without remoteproc TEE support.

Signed-off-by: Arnaud Pouliquen 
---
 drivers/remoteproc/stm32_rproc.c | 84 +++-
 1 file changed, 51 insertions(+), 33 deletions(-)

diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 88623df7d0c3..8cd838df4e92 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -209,6 +209,54 @@ static int stm32_rproc_mbox_idx(struct rproc *rproc, const 
unsigned char *name)
return -EINVAL;
 }
 
+static void stm32_rproc_request_shutdown(struct rproc *rproc)
+{
+   struct stm32_rproc *ddata = rproc->priv;
+   int err, dummy_data, idx;
+
+   /* Request shutdown of the remote processor */
+   if (rproc->state != RPROC_OFFLINE && rproc->state != RPROC_CRASHED) {
+   idx = stm32_rproc_mbox_idx(rproc, STM32_MBX_SHUTDOWN);
+   if (idx >= 0 && ddata->mb[idx].chan) {
+   /* A dummy data is sent to allow to block on transmit. */
+   err = mbox_send_message(ddata->mb[idx].chan,
+   &dummy_data);
+   if (err < 0)
+   dev_warn(&rproc->dev, "warning: remote FW shutdown without ack\n");
+   }
+   }
+}
+
+static int stm32_rproc_release(struct rproc *rproc)
+{
+   struct stm32_rproc *ddata = rproc->priv;
+   unsigned int err = 0;
+
+   /* To allow platform Standby power mode, set remote proc Deep Sleep. */
+   if (ddata->pdds.map) {
+   err = regmap_update_bits(ddata->pdds.map, ddata->pdds.reg,
+ddata->pdds.mask, 1);
+   if (err) {
+   dev_err(&rproc->dev, "failed to set pdds\n");
+   return err;
+   }
+   }
+
+   /* Update coprocessor state to OFF if available. */
+   if (ddata->m4_state.map) {
+   err = regmap_update_bits(ddata->m4_state.map,
+ddata->m4_state.reg,
+ddata->m4_state.mask,
+M4_STATE_OFF);
+   if (err) {
+   dev_err(&rproc->dev, "failed to set copro state\n");
+   return err;
+   }
+   }
+
+   return 0;
+}
+
 static int stm32_rproc_prepare(struct rproc *rproc)
 {
struct device *dev = rproc->dev.parent;
@@ -519,17 +567,9 @@ static int stm32_rproc_detach(struct rproc *rproc)
 static int stm32_rproc_stop(struct rproc *rproc)
 {
struct stm32_rproc *ddata = rproc->priv;
-   int err, idx;
+   int err;
 
-   /* request shutdown of the remote processor */
-   if (rproc->state != RPROC_OFFLINE && rproc->state != RPROC_CRASHED) {
-   idx = stm32_rproc_mbox_idx(rproc, STM32_MBX_SHUTDOWN);
-   if (idx >= 0 && ddata->mb[idx].chan) {
-   err = mbox_send_message(ddata->mb[idx].chan, "detach");
-   if (err < 0)
-   dev_warn(&rproc->dev, "warning: remote FW shutdown without ack\n");
-   }
-   }
+   stm32_rproc_request_shutdown(rproc);
 
err = stm32_rproc_set_hold_boot(rproc, true);
if (err)
@@ -541,29 +581,7 @@ static int stm32_rproc_stop(struct rproc *rproc)
return err;
}
 
-   /* to allow platform Standby power mode, set remote proc Deep Sleep */
-   if (ddata->pdds.map) {
-   err = regmap_update_bits(ddata->pdds.map, ddata->pdds.reg,
-ddata->pdds.mask, 1);
-   if (err) {
-   dev_err(&rproc->dev, "failed to set pdds\n");
-   return err;
-   }
-   }
-
-   /* update coprocessor state to OFF if available */
-   if (ddata->m4_state.map) {
-   err = regmap_update_bits(ddata->m4_state.map,
-ddata->m4_state.reg,
-ddata->m4_state.mask,
-M4_STATE_OFF);
-   if (err) {
-   dev_err(&rproc->dev, "failed to set copro state\n");
-   return err;
-   }
-   }
-
-   return 0;
+   return stm32_rproc_release(rproc);
 }
 
 static void stm32_rproc_kick(struct rproc *rproc, int vqid)
-- 
2.25.1




[PATCH v5 1/7] remoteproc: Add TEE support

2024-05-21 Thread Arnaud Pouliquen
Add a remoteproc TEE (Trusted Execution Environment) driver
that will be probed by the TEE bus. If the associated trusted
application is supported on the secure side, this driver offers a client
interface to load firmware in the secure part.
The firmware can be authenticated by the secure trusted application.

Signed-off-by: Arnaud Pouliquen 
---
Update from V4:
- fix commit message,
- fix Kconfig typo,
- introduce tee_rproc_release_loaded_rsc_table function to release the
  resource table,
- reorder function variables in declaration in reverse ascending order,
- introduce try_module_get and module_put to prevent the module from being
  removed while in use,
- remove rsc_table field in tee_rproc structure,
- remove tee_rproc_find_loaded_rsc_table as it does not seem to correspond to
  the proper usage regarding the ops definition [1]. The resource table is
  loaded before being used,
- add the __force attribute when casting the type of the resource table to fix
  a build warning.

[1]https://elixir.bootlin.com/linux/latest/source/include/linux/remoteproc.h#L374
---
 drivers/remoteproc/Kconfig  |  10 +
 drivers/remoteproc/Makefile |   1 +
 drivers/remoteproc/tee_remoteproc.c | 429 
 include/linux/remoteproc.h  |   4 +
 include/linux/tee_remoteproc.h  |  99 +++
 5 files changed, 543 insertions(+)
 create mode 100644 drivers/remoteproc/tee_remoteproc.c
 create mode 100644 include/linux/tee_remoteproc.h

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..6c1c07202276 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -365,6 +365,16 @@ config XLNX_R5_REMOTEPROC
 
  It's safe to say N if not interested in using RPU r5f cores.
 
+
+config TEE_REMOTEPROC
+   tristate "Remoteproc support by a TEE application"
+   depends on OPTEE
+   help
+ Support a remote processor with a TEE application. The Trusted
+ Execution Context is responsible for loading the trusted firmware
+ image and managing the remote processor's lifecycle.
+ This can be either built-in or a loadable module.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 91314a9b43ce..fa8daebce277 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_RCAR_REMOTEPROC) += rcar_rproc.o
 obj-$(CONFIG_ST_REMOTEPROC)+= st_remoteproc.o
 obj-$(CONFIG_ST_SLIM_REMOTEPROC)   += st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)  += stm32_rproc.o
+obj-$(CONFIG_TEE_REMOTEPROC)   += tee_remoteproc.o
 obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
diff --git a/drivers/remoteproc/tee_remoteproc.c 
b/drivers/remoteproc/tee_remoteproc.c
new file mode 100644
index ..f13546628ec9
--- /dev/null
+++ b/drivers/remoteproc/tee_remoteproc.c
@@ -0,0 +1,429 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) STMicroelectronics 2024 - All Rights Reserved
+ * Author: Arnaud Pouliquen 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "remoteproc_internal.h"
+
+#define MAX_TEE_PARAM_ARRY_MEMBER  4
+
+/*
+ * Authentication of the firmware and load in the remote processor memory
+ *
+ * [in]  params[0].value.a:unique 32bit identifier of the remote processor
+ * [in] params[1].memref:  buffer containing the image of the buffer
+ */
+#define TA_RPROC_FW_CMD_LOAD_FW1
+
+/*
+ * Start the remote processor
+ *
+ * [in]  params[0].value.a:unique 32bit identifier of the remote processor
+ */
+#define TA_RPROC_FW_CMD_START_FW   2
+
+/*
+ * Stop the remote processor
+ *
+ * [in]  params[0].value.a:unique 32bit identifier of the remote processor
+ */
+#define TA_RPROC_FW_CMD_STOP_FW3
+
+/*
+ * Return the address of the resource table, or 0 if not found
+ * No check is done to verify that the address returned is accessible by
+ * the non secure context. If the resource table is loaded in a protected
+ * memory the access by the non secure context will lead to a data abort.
+ *
+ * [in]  params[0].value.a:unique 32bit identifier of the remote processor
+ * [out]  params[1].value.a:   32bit LSB resource table memory address
+ * [out]  params[1].value.b:   32bit MSB resource table memory address
+ * [out]  params[2].value.a:   32bit LSB resource table memory size
+ * [out]  params[2].value.b:   32bit MSB resource table memory size
+ */
+#define TA_RPROC_FW_CMD_GET_RSC_TABLE  4
+
+/*
+ * Return the address of the core dump
+ *
+ * [in]  params[0].value.a:unique 32bit identifier of the remote processor
+ * [out] params[1].memref: address of the core dump image if exist,
+ * else return Null
+ */
+#define 

[PATCH v5 7/7] remoteproc: stm32: Add support of an OP-TEE TA to load the firmware

2024-05-21 Thread Arnaud Pouliquen
The new TEE remoteproc device is used to manage remote firmware in a
secure, trusted context. The 'st,stm32mp1-m4-tee' compatible is
introduced to delegate the loading of the firmware to the trusted
execution context. In such cases, the firmware should be signed and
adhere to the image format defined by the TEE.
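
As an illustration of the delegation described above, here is a minimal
userspace sketch of choosing an operations table from the compatible string.
It is not the driver code: struct demo_ops, the demo_* functions and their
return values are hypothetical stand-ins for the kernel's struct rproc_ops
and callbacks, and strcmp() stands in for of_device_is_compatible().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct rproc_ops. */
struct demo_ops {
	const char *name;
	int (*start)(void);
	int (*stop)(void);
};

/* Dummy callbacks; the return values only identify which path ran. */
static int demo_local_start(void) { return 1; }
static int demo_local_stop(void)  { return 2; }
static int demo_tee_start(void)   { return 101; }
static int demo_tee_stop(void)    { return 102; }

static const struct demo_ops demo_local_ops = {
	.name  = "local",
	.start = demo_local_start,
	.stop  = demo_local_stop,
};

static const struct demo_ops demo_tee_ops = {
	.name  = "tee",
	.start = demo_tee_start,
	.stop  = demo_tee_stop,
};

/*
 * Pick the ops table for a compatible string.
 * Returns NULL when the string matches neither variant.
 */
const struct demo_ops *demo_select_ops(const char *compatible)
{
	assert(compatible != NULL);

	if (strcmp(compatible, "st,stm32mp1-m4-tee") == 0)
		return &demo_tee_ops;	/* firmware handled by the TEE */
	if (strcmp(compatible, "st,stm32mp1-m4") == 0)
		return &demo_local_ops;	/* firmware handled by Linux */
	return NULL;
}
```

Keeping two constant tables and selecting one at probe time means the rest
of the code can call through the table without testing for the TEE case at
every call site.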

Signed-off-by: Arnaud Pouliquen 
---
Update from V4:
- remove hard coded remote proc ID STM32_MP1_M4_PROC_ID, get the
  ID from the DT,
- replace find_loaded_rsc_table by get_loaded_rsc_table.
---
 drivers/remoteproc/stm32_rproc.c | 65 ++--
 1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 8cd838df4e92..f6f748814bf2 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "remoteproc_internal.h"
@@ -257,6 +258,19 @@ static int stm32_rproc_release(struct rproc *rproc)
return 0;
 }
 
+static int stm32_rproc_tee_stop(struct rproc *rproc)
+{
+   int err;
+
+   stm32_rproc_request_shutdown(rproc);
+
+   err = tee_rproc_stop(rproc);
+   if (err)
+   return err;
+
+   return stm32_rproc_release(rproc);
+}
+
 static int stm32_rproc_prepare(struct rproc *rproc)
 {
struct device *dev = rproc->dev.parent;
@@ -693,8 +707,20 @@ static const struct rproc_ops st_rproc_ops = {
.get_boot_addr  = rproc_elf_get_boot_addr,
 };
 
+static const struct rproc_ops st_rproc_tee_ops = {
+   .prepare= stm32_rproc_prepare,
+   .start  = tee_rproc_start,
+   .stop   = stm32_rproc_tee_stop,
+   .kick   = stm32_rproc_kick,
+   .load   = tee_rproc_load_fw,
+   .parse_fw   = tee_rproc_parse_fw,
+   .get_loaded_rsc_table = tee_rproc_get_loaded_rsc_table,
+
+};
+
 static const struct of_device_id stm32_rproc_match[] = {
-   { .compatible = "st,stm32mp1-m4" },
+   {.compatible = "st,stm32mp1-m4",},
+   {.compatible = "st,stm32mp1-m4-tee",},
{},
 };
 MODULE_DEVICE_TABLE(of, stm32_rproc_match);
@@ -853,17 +879,42 @@ static int stm32_rproc_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct stm32_rproc *ddata;
struct device_node *np = dev->of_node;
+   struct tee_rproc *trproc = NULL;
struct rproc *rproc;
unsigned int state;
+   u32 proc_id;
int ret;
 
ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(32));
if (ret)
return ret;
 
-   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata));
-   if (!rproc)
-   return -ENOMEM;
+   if (of_device_is_compatible(np, "st,stm32mp1-m4-tee")) {
+   /*
+* Delegate the firmware management to the secure context.
+* The firmware loaded has to be signed.
+*/
+   ret = of_property_read_u32(np, "st,proc-id", &proc_id);
+   if (ret) {
+   dev_err(dev, "failed to read st,proc-id property\n");
+   return ret;
+   }
+
+   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_tee_ops, NULL, sizeof(*ddata));
+   if (!rproc)
+   return -ENOMEM;
+
+   trproc = tee_rproc_register(dev, rproc, proc_id);
+   if (IS_ERR(trproc)) {
+   dev_err_probe(dev, PTR_ERR(trproc),
+ "signed firmware not supported by TEE\n");
+   return PTR_ERR(trproc);
+   }
+   } else {
+   rproc = devm_rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata));
+   if (!rproc)
+   return -ENOMEM;
+   }
 
ddata = rproc->priv;
 
@@ -915,6 +966,9 @@ static int stm32_rproc_probe(struct platform_device *pdev)
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
}
+   if (trproc)
+   tee_rproc_unregister(trproc);
+
return ret;
 }
 
@@ -935,6 +989,9 @@ static void stm32_rproc_remove(struct platform_device *pdev)
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
}
+   if (rproc->tee_interface)
+   tee_rproc_unregister(rproc->tee_interface);
+
 }
 
 static int stm32_rproc_suspend(struct device *dev)
-- 
2.25.1




[PATCH v5 2/7] dt-bindings: remoteproc: Add compatibility for TEE support

2024-05-21 Thread Arnaud Pouliquen
The "st,stm32mp1-m4-tee" compatible is used in a system configuration
where the Cortex-M4 firmware is loaded by the Trusted Execution Environment
(TEE).
For instance, this compatible is used in both the Linux and OP-TEE
device trees:
- In OP-TEE, a node is defined in the device tree with the
  st,stm32mp1-m4-tee compatible to support signed remoteproc firmware.
  Based on DT properties, OP-TEE authenticates, loads, starts, and stops
  the firmware.
- On Linux, when this compatible is set, the Cortex-M resets should not
  be declared in the device tree.
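
To make the constraints concrete, the following is a rough userspace sketch
of the rules the schema change in this patch appears to encode. It is only
an illustration: struct demo_node and demo_node_is_valid() are hypothetical
names, the node is reduced to a few booleans, and other required properties
such as reg are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, flattened view of the properties relevant to the rules. */
struct demo_node {
	const char *compatible;
	bool has_resets;		/* "resets" is present */
	bool has_reset_names;		/* "reset-names" is present */
	bool has_hold_boot_reset;	/* "reset-names" contains "hold_boot" */
	bool has_syscfg_holdboot;	/* "st,syscfg-holdboot" is present */
};

/* Return true when the node satisfies the simplified rules below. */
bool demo_node_is_valid(const struct demo_node *n)
{
	assert(n != NULL && n->compatible != NULL);

	if (strcmp(n->compatible, "st,stm32mp1-m4-tee") == 0) {
		/* TEE variant: resets and hold boot are owned by the TEE. */
		return !n->has_resets && !n->has_reset_names &&
		       !n->has_syscfg_holdboot;
	}

	if (strcmp(n->compatible, "st,stm32mp1-m4") == 0) {
		if (!n->has_resets)
			return false;
		/* Without a hold_boot reset, the syscfg property is needed. */
		if (!n->has_hold_boot_reset)
			return n->has_syscfg_holdboot;
		/* With a hold_boot reset, the syscfg property is forbidden. */
		return n->has_reset_names && !n->has_syscfg_holdboot;
	}

	return false;	/* unknown compatible */
}
```

In practice these checks are performed by the DT schema tooling, not by
driver code; the function only restates the if/then/else branches in a form
that is easy to read and test.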

Signed-off-by: Arnaud Pouliquen 
Reviewed-by: Rob Herring 
---
 .../bindings/remoteproc/st,stm32-rproc.yaml   | 51 ---
 1 file changed, 43 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
index 370af61d8f28..36ea54016b76 100644
--- a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
@@ -16,7 +16,12 @@ maintainers:
 
 properties:
   compatible:
-const: st,stm32mp1-m4
+enum:
+  - st,stm32mp1-m4
+  - st,stm32mp1-m4-tee
+description:
+  Use "st,stm32mp1-m4" for the Cortex-M4 coprocessor management by non-secure context
+  Use "st,stm32mp1-m4-tee" for the Cortex-M4 coprocessor management by secure context
 
   reg:
 description:
@@ -142,21 +147,41 @@ properties:
 required:
   - compatible
   - reg
-  - resets
 
 allOf:
   - if:
   properties:
-reset-names:
-  not:
-contains:
-  const: hold_boot
+compatible:
+  contains:
+const: st,stm32mp1-m4
 then:
+  if:
+properties:
+  reset-names:
+not:
+  contains:
+const: hold_boot
+  then:
+required:
+  - st,syscfg-holdboot
+  else:
+properties:
+  st,syscfg-holdboot: false
+required:
+  - reset-names
   required:
-- st,syscfg-holdboot
-else:
+- resets
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: st,stm32mp1-m4-tee
+then:
   properties:
 st,syscfg-holdboot: false
+reset-names: false
+resets: false
 
 additionalProperties: false
 
@@ -188,5 +213,15 @@ examples:
   st,syscfg-rsc-tbl = < 0x144 0x>;
   st,syscfg-m4-state = < 0x148 0x>;
 };
+  - |
+#include 
+m4@1000 {
+  compatible = "st,stm32mp1-m4-tee";
+  reg = <0x1000 0x4>,
+<0x3000 0x4>,
+<0x3800 0x1>;
+  st,syscfg-rsc-tbl = < 0x144 0x>;
+  st,syscfg-m4-state = < 0x148 0x>;
+};
 
 ...
-- 
2.25.1




[PATCH v5 0/7] Introduction of a remoteproc tee to load signed firmware

2024-05-21 Thread Arnaud Pouliquen
Main updates from the previous version [1]:
--

1) use rproc->table_ptr as the unique reference to the resource table
 --> update remoteproc_core.c to manage the resource table
 based on the new rproc->tee_interface field:
 - on start, get the resource table address from the TEE remoteproc instead
   of finding it in the firmware (ops choice to be confirmed)
 - on stop, unmap the resource table before updating the
   rproc->table_ptr pointer.

2) retrieve the TEE rproc identifier from the device tree instead of
   hardcoding it
 -->  add a new "st,proc-id" property in the device tree.

More details on the updates are listed in the commit messages.

[1] https://lore.kernel.org/linux-arm-kernel/20240115135249.296822-1-arnaud.pouliq...@foss.st.com/T/#m9ebb2e8f6d5e90f055827e4f227ce0877bc6d761

base-commit: c8d8f841e95bcc07ac8c5621fc171a24f1fd5cdb

Description of the feature:
--
This series proposes the implementation of a remoteproc tee driver to
communicate with a TEE trusted application responsible for authenticating
and loading the remoteproc firmware image in an Arm secure context.

1) Principle:

The remoteproc tee driver provides services to communicate with the OP-TEE
trusted application running in the Trusted Execution Environment (TEE).
The trusted application in TEE manages the remote processor lifecycle:

- authenticating and loading firmware images,
- isolating and securing the remote processor memories,
- supporting multi-firmware (e.g., TF-M + Zephyr on a Cortex-M33),
- managing the start and stop of the firmware by the TEE.

2) Format of the signed image:

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/ta/remoteproc/src/remoteproc_core.c#L18-L57

3) OP-TEE trusted application API:

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/ta/remoteproc/include/ta_remoteproc.h
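
As a small illustration of that API, the sketch below lists the command
identifiers documented in tee_remoteproc.c (patch 1/7) and shows how a
64-bit resource table address can be rebuilt from the 32-bit LSB/MSB pair
returned in params[1].value.a and params[1].value.b. The enum, struct
demo_value and the demo_* helpers are hypothetical names used for this
example only; they are not part of the TEE client API.

```c
#include <assert.h>
#include <stdint.h>

/* Command identifiers as documented in tee_remoteproc.c (patch 1/7). */
enum demo_ta_cmd {
	DEMO_TA_CMD_LOAD_FW       = 1,
	DEMO_TA_CMD_START_FW      = 2,
	DEMO_TA_CMD_STOP_FW       = 3,
	DEMO_TA_CMD_GET_RSC_TABLE = 4,
};

/* Hypothetical mirror of one TEE "value" parameter (two 32-bit fields). */
struct demo_value {
	uint32_t a;	/* 32-bit LSB */
	uint32_t b;	/* 32-bit MSB */
};

/* Rebuild a 64-bit quantity from its LSB (a) and MSB (b) halves. */
uint64_t demo_join_u64(struct demo_value v)
{
	return ((uint64_t)v.b << 32) | v.a;
}

/* Split a 64-bit quantity into the LSB/MSB pair used by the TA. */
struct demo_value demo_split_u64(uint64_t x)
{
	struct demo_value v = {
		.a = (uint32_t)(x & 0xffffffffu),
		.b = (uint32_t)(x >> 32),
	};

	return v;
}
```

The cast to uint64_t before the shift matters: shifting a 32-bit value by 32
bits is undefined in C, so the MSB half has to be widened first.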

4) OP-TEE signature script

Refer to:
https://github.com/OP-TEE/optee_os/blob/master/scripts/sign_rproc_fw.py

Example of usage:
sign_rproc_fw.py --in  --in  --out  --key ${OP-TEE_PATH}/keys/default.pem


5) Impact on User space Application

No sysfs impact. The user only needs to provide the signed firmware image
instead of the ELF image.


For more information about the implementation, a presentation is available here
(note that the format of the signed image has evolved between the presentation
and the integration in OP-TEE).

https://resources.linaro.org/en/resource/6c5bGvZwUAjX56fvxthxds

Arnaud Pouliquen (7):
  remoteproc: Add TEE support
  dt-bindings: remoteproc: Add compatibility for TEE support
  dt-bindings: remoteproc: Add processor identifier property
  remoteproc: core introduce rproc_set_rsc_table_on_start function
  remoteproc: core: support of the tee interface
  remoteproc: stm32: Create sub-functions to request shutdown and
release
  remoteproc: stm32: Add support of an OP-TEE TA to load the firmware

 .../bindings/remoteproc/st,stm32-rproc.yaml   |  58 ++-
 drivers/remoteproc/Kconfig|  10 +
 drivers/remoteproc/Makefile   |   1 +
 drivers/remoteproc/remoteproc_core.c  | 135 +++---
 drivers/remoteproc/stm32_rproc.c  | 149 --
 drivers/remoteproc/tee_remoteproc.c   | 429 ++
 include/linux/remoteproc.h|   4 +
 include/linux/tee_remoteproc.h|  99 
 8 files changed, 784 insertions(+), 101 deletions(-)
 create mode 100644 drivers/remoteproc/tee_remoteproc.c
 create mode 100644 include/linux/tee_remoteproc.h

-- 
2.25.1



