Re: [PATCH 4/8] riscv: ftrace: align patchable functions to 4 Byte boundary

2024-06-16 Thread Andy Chiu
Sorry for the noise,

On Mon, Jun 17, 2024 at 10:38 AM Andy Chiu  wrote:
>
> On Fri, Jun 14, 2024 at 3:09 AM Nathan Chancellor  wrote:
> >
> > Hi Andy,
> >
> > On Thu, Jun 13, 2024 at 03:11:09PM +0800, Andy Chiu wrote:
> > > > We are changing ftrace code patching in order to remove the dependency on
> > > > stop_machine() and enable kernel preemption. This requires us to align
> > > > function entries to a 4-byte boundary.
> > > >
> > > > However, -falign-functions alone on older versions of GCC was not strong
> > > > enough to align all functions. In fact, cold functions are not aligned
> > > > after turning on optimizations. We consider this a bug in GCC and
> > > > turn off guess-branch-probability as a workaround to align all functions.
> > >
> > > GCC bug id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345
> > >
> > > The option -fmin-function-alignment is able to align all functions
> > > properly on newer versions of gcc. So, we add a cc-option to test if
> > > the toolchain supports it.
> > >
> > > Suggested-by: Evgenii Shatokhin 
> > > Signed-off-by: Andy Chiu 
> > > ---
> > >  arch/riscv/Kconfig  | 1 +
> > >  arch/riscv/Makefile | 7 ++-
> > >  2 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > > index b94176e25be1..80b8d48e1e46 100644
> > > --- a/arch/riscv/Kconfig
> > > +++ b/arch/riscv/Kconfig
> > > @@ -203,6 +203,7 @@ config CLANG_SUPPORTS_DYNAMIC_FTRACE
> > >  config GCC_SUPPORTS_DYNAMIC_FTRACE
> > >   def_bool CC_IS_GCC
> > >   depends on $(cc-option,-fpatchable-function-entry=8)
> > > + depends on $(cc-option,-fmin-function-alignment=4) || !RISCV_ISA_C
> >
> > Please use CC_HAS_MIN_FUNCTION_ALIGNMENT (from arch/Kconfig), which
> > already checks for support for this option.
>
> Thanks for the suggestion!
>
> >
> > >  config HAVE_SHADOW_CALL_STACK
> > >   def_bool $(cc-option,-fsanitize=shadow-call-stack)
> > > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > > index 06de9d365088..74628ad8dcf8 100644
> > > --- a/arch/riscv/Makefile
> > > +++ b/arch/riscv/Makefile
> > > @@ -14,8 +14,13 @@ endif
> > >  ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> > >   LDFLAGS_vmlinux += --no-relax
> > >   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> > > +ifeq ($(CONFIG_CC_IS_CLANG),y)
> >
> > Same here, please invert this and use
> >
> >   ifdef CONFIG_CC_HAS_MIN_FUNCTION_ALIGNMENT
> >
> > like the main Makefile does.
>
> Hope this makes sense to you. I am going to add the following in riscv Kconfig:
>
> select FUNCTION_ALIGNMENT_4B if DYNAMIC_FTRACE && !RISCV_ISA_C

This should be:

select FUNCTION_ALIGNMENT_4B if DYNAMIC_FTRACE && RISCV_ISA_C

as RISCV_ISA_C == y means that there are 2B instructions. In this
case, functions can be non 4B aligned, so we need to enforce the
alignment requirement from the compiler.
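
To illustrate the point above, here is a minimal user-space sketch (not kernel code). The helper names and the addresses used with them are hypothetical. With the C extension enabled, a function that follows a 2-byte instruction may start at an address that is only 2-byte aligned unless the compiler pads it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a function entry sits on a 4-byte boundary. */
static bool entry_is_4b_aligned(uintptr_t addr)
{
	return (addr & 0x3) == 0;
}

/*
 * Hypothetical helper: next address at or after addr that satisfies a
 * power-of-two alignment. This is the padding a minimum function
 * alignment asks the compiler to apply.
 */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

For example, an entry at 0x1002 is not 4-byte aligned, and align_up(0x1002, 4) gives 0x1004.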

>
> So we will not need any of these
>
> >
> > > + cflags_ftrace_align := -falign-functions=4
> > > +else
> > > + cflags_ftrace_align := -fmin-function-alignment=4
> > > +endif
> > >  ifeq ($(CONFIG_RISCV_ISA_C),y)
> > > - CC_FLAGS_FTRACE := -fpatchable-function-entry=4
> > > > + CC_FLAGS_FTRACE := -fpatchable-function-entry=4 $(cflags_ftrace_align)
> > >  else
> > >   CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> > >  endif
> > >
> > > --
> > > 2.43.0
> > >
> > >
>
> Thanks,
> Andy



RE: [PATCH 1/2] vdpa: support set mac address from vdpa tool

2024-06-16 Thread Parav Pandit


> From: Jason Wang 
> Sent: Monday, June 17, 2024 7:18 AM
> 
> On Wed, Jun 12, 2024 at 2:30 PM Jiri Pirko  wrote:
> >
> > Wed, Jun 12, 2024 at 03:58:10AM CEST, k...@kernel.org wrote:
> > >On Tue, 11 Jun 2024 13:32:32 +0800 Cindy Lu wrote:
> > >> Add new UAPI to support the mac address from vdpa tool Function
> > >> vdpa_nl_cmd_dev_config_set_doit() will get the MAC address from the
> > >> vdpa tool and then set it to the device.
> > >>
> > >> The usage is: vdpa dev set name vdpa_name mac **:**:**:**:**:**
> > >
> > >Why don't you use devlink?
> >
> > Fair question. Why does vdpa-specific uapi even exist? To have
> > driver-specific uapi does not make any sense to me :/
> 
> It came with devlink first actually, but switched to a dedicated uAPI.
> 
> Parav(cced) may explain more here.
> 
Devlink configures the function-level MAC, which applies to all protocol
devices (vdpa, rdma, netdev, etc.).
The vdpa device-level MAC can be a different, additional address that
applies only to vdpa traffic.
Hence a dedicated uAPI was added.



Re: [PATCH 4/8] riscv: ftrace: align patchable functions to 4 Byte boundary

2024-06-16 Thread Andy Chiu
On Fri, Jun 14, 2024 at 3:09 AM Nathan Chancellor  wrote:
>
> Hi Andy,
>
> On Thu, Jun 13, 2024 at 03:11:09PM +0800, Andy Chiu wrote:
> > We are changing ftrace code patching in order to remove the dependency on
> > stop_machine() and enable kernel preemption. This requires us to align
> > function entries to a 4-byte boundary.
> >
> > However, -falign-functions alone on older versions of GCC was not strong
> > enough to align all functions. In fact, cold functions are not aligned
> > after turning on optimizations. We consider this a bug in GCC and
> > turn off guess-branch-probability as a workaround to align all functions.
> >
> > GCC bug id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345
> >
> > The option -fmin-function-alignment is able to align all functions
> > properly on newer versions of gcc. So, we add a cc-option to test if
> > the toolchain supports it.
> >
> > Suggested-by: Evgenii Shatokhin 
> > Signed-off-by: Andy Chiu 
> > ---
> >  arch/riscv/Kconfig  | 1 +
> >  arch/riscv/Makefile | 7 ++-
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index b94176e25be1..80b8d48e1e46 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -203,6 +203,7 @@ config CLANG_SUPPORTS_DYNAMIC_FTRACE
> >  config GCC_SUPPORTS_DYNAMIC_FTRACE
> >   def_bool CC_IS_GCC
> >   depends on $(cc-option,-fpatchable-function-entry=8)
> > + depends on $(cc-option,-fmin-function-alignment=4) || !RISCV_ISA_C
>
> Please use CC_HAS_MIN_FUNCTION_ALIGNMENT (from arch/Kconfig), which
> already checks for support for this option.

Thanks for the suggestion!

>
> >  config HAVE_SHADOW_CALL_STACK
> >   def_bool $(cc-option,-fsanitize=shadow-call-stack)
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index 06de9d365088..74628ad8dcf8 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -14,8 +14,13 @@ endif
> >  ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> >   LDFLAGS_vmlinux += --no-relax
> >   KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> > +ifeq ($(CONFIG_CC_IS_CLANG),y)
>
> Same here, please invert this and use
>
>   ifdef CONFIG_CC_HAS_MIN_FUNCTION_ALIGNMENT
>
> like the main Makefile does.

Hope this makes sense to you. I am going to add the following in riscv Kconfig:

select FUNCTION_ALIGNMENT_4B if DYNAMIC_FTRACE && !RISCV_ISA_C

So we will not need any of these

>
> > + cflags_ftrace_align := -falign-functions=4
> > +else
> > + cflags_ftrace_align := -fmin-function-alignment=4
> > +endif
> >  ifeq ($(CONFIG_RISCV_ISA_C),y)
> > - CC_FLAGS_FTRACE := -fpatchable-function-entry=4
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=4 $(cflags_ftrace_align)
> >  else
> >   CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> >  endif
> >
> > --
> > 2.43.0
> >
> >

Thanks,
Andy



Re: [PATCH 2/8] tracing: do not trace kernel_text_address()

2024-06-16 Thread Andy Chiu
On Thu, Jun 13, 2024 at 9:32 PM Steven Rostedt  wrote:
>
> On Thu, 13 Jun 2024 15:11:07 +0800
> Andy Chiu  wrote:
>
> > kernel_text_address() and __kernel_text_address() are called in
> > arch_stack_walk() of riscv. This results in excess amount of un-related
> > traces when the kernel is compiled with CONFIG_TRACE_IRQFLAGS. The
> > situation worsens when function_graph is active, as it calls
> > local_irq_save/restore in each function's entry/exit. This patch adds
> > both functions to notrace, so they won't show up on the trace records.
>
> I'd rather not add notrace just because something is noisy.
>
> You can always just add:
>
>  echo '*kernel_text_address' > /sys/kernel/tracing/set_ftrace_notrace
>
> and achieve the same result.

Sounds good, I am going to drop this patch for the next revision

>
> -- Steve

Thanks,
Andy



Re: [PATCH] vringh: add MODULE_DESCRIPTION()

2024-06-16 Thread Jason Wang
On Fri, May 17, 2024 at 9:57 AM Jeff Johnson  wrote:
>
> Fix the allmodconfig 'make W=1' issue:
>
> WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vhost/vringh.o
>
> Signed-off-by: Jeff Johnson 

Acked-by: Jason Wang 

Thanks

>




Re: [PATCH] virtio_net: Eliminate OOO packets during switching

2024-06-16 Thread Jason Wang
On Sat, Jun 15, 2024 at 6:05 AM Abhinav Jain  wrote:
>
> Disable the network device & turn off carrier before modifying the
> number of queue pairs.
> Process all the in-flight packets and then turn on carrier, followed
> by waking up all the queues on the network device.
>
> Signed-off-by: Abhinav Jain 
> ---
>  drivers/net/virtio_net.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 61a57d134544..d0a655a3b4c6 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3447,7 +3447,6 @@ static void virtnet_get_drvinfo(struct net_device *dev,
>
>  }
>
> -/* TODO: Eliminate OOO packets during switching */
>  static int virtnet_set_channels(struct net_device *dev,
> struct ethtool_channels *channels)
>  {
> @@ -3471,6 +3470,15 @@ static int virtnet_set_channels(struct net_device *dev,
> if (vi->rq[0].xdp_prog)
> return -EINVAL;
>
> +   /* Disable network device to prevent packet processing during
> +* the switch.
> +*/
> +   netif_tx_disable(dev);
> +   netif_carrier_off(dev);

Any reason we don't need to synchronize with NAPI here?

Thanks

> +
> +   /* Make certain that all in-flight packets are processed. */
> +   synchronize_net();
> +
> cpus_read_lock();
> err = virtnet_set_queues(vi, queue_pairs);
> if (err) {
> @@ -3482,7 +3490,12 @@ static int virtnet_set_channels(struct net_device *dev,
>
> netif_set_real_num_tx_queues(dev, queue_pairs);
> netif_set_real_num_rx_queues(dev, queue_pairs);
> - err:
> +
> +   /* Restart the network device */
> +   netif_carrier_on(dev);
> +   netif_tx_wake_all_queues(dev);
> +
> +err:
> return err;
>  }
>
> --
> 2.34.1
>




Re: [PATCH net-next V2] virtio-net: synchronize operstate with admin state on up/down

2024-06-16 Thread Jason Wang
On Thu, Jun 6, 2024 at 8:22 AM Jason Wang  wrote:
>
> On Fri, May 31, 2024 at 8:18 AM Jason Wang  wrote:
> >
> > On Thu, May 30, 2024 at 9:09 PM Michael S. Tsirkin  wrote:
> > >
> > > On Thu, May 30, 2024 at 06:29:51PM +0800, Jason Wang wrote:
> > > > > On Thu, May 30, 2024 at 2:10 PM Michael S. Tsirkin  wrote:
> > > > >
> > > > > On Thu, May 30, 2024 at 11:20:55AM +0800, Jason Wang wrote:
> > > > > > > This patch synchronizes operstate with admin state per RFC2863.
> > > > > > >
> > > > > > > This is done by trying to toggle the carrier upon open/close and
> > > > > > > synchronize with the config change work. This allows propagating
> > > > > > > status correctly to stacked devices like:
> > > > > >
> > > > > > ip link add link enp0s3 macvlan0 type macvlan
> > > > > > ip link set link enp0s3 down
> > > > > > ip link show
> > > > > >
> > > > > > Before this patch:
> > > > > >
> > > > > > > 3: enp0s3:  mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
> > > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
> > > > > > ..
> > > > > > > 5: macvlan0@enp0s3:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
> > > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
> > > > > >
> > > > > > After this patch:
> > > > > >
> > > > > > > 3: enp0s3:  mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
> > > > > > link/ether 00:00:05:00:00:09 brd ff:ff:ff:ff:ff:ff
> > > > > > ...
> > > > > > > 5: macvlan0@enp0s3:  mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
> > > > > > link/ether b2:a9:c5:04:da:53 brd ff:ff:ff:ff:ff:ff
> > > > > >
> > > > > > Cc: Venkat Venkatsubra 
> > > > > > Cc: Gia-Khanh Nguyen 
> > > > > > Reviewed-by: Xuan Zhuo 
> > > > > > Acked-by: Michael S. Tsirkin 
> > > > > > Signed-off-by: Jason Wang 
> > > > > > ---
> > > > > > Changes since V1:
> > > > > > - rebase
> > > > > > - add ack/review tags
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > ---
> > > > > > >  drivers/net/virtio_net.c | 94 +++-
> > > > > >  1 file changed, 63 insertions(+), 31 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > index 4a802c0ea2cb..69e4ae353c51 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -433,6 +433,12 @@ struct virtnet_info {
> > > > > >   /* The lock to synchronize the access to refill_enabled */
> > > > > >   spinlock_t refill_lock;
> > > > > >
> > > > > > + /* Is config change enabled? */
> > > > > > + bool config_change_enabled;
> > > > > > +
> > > > > > > + /* The lock to synchronize the access to config_change_enabled */
> > > > > > + spinlock_t config_change_lock;
> > > > > > +
> > > > > >   /* Work struct for config space updates */
> > > > > >   struct work_struct config_work;
> > > > > >
> > > > >
> > > > >
> > > > > But we already have dev->config_lock and dev->config_enabled.
> > > > >
> > > > > And it actually works better - instead of discarding config
> > > > > change events it defers them until enabled.
> > > > >
> > > >
> > > > Yes but then both virtio-net driver and virtio core can ask to enable
> > > > and disable and then we need some kind of synchronization which is
> > > > non-trivial.
> > >
> > > Well for core it happens on bring up path before driver works
> > > and later on tear down after it is gone.
> > > So I do not think they ever do it at the same time.
> >
> > For example, there could be a suspend/resume when the admin state is down.
> >
> > >
> > >
> > > > And device enabling on the core is different from bringing the device
> > > > up in the networking subsystem. Here we just delay to deal with the
> > > > config change interrupt on ndo_open(). (E.g try to ack announce is
> > > > meaningless when the device is down).
> > > >
> > > > Thanks
> > >
> > > another thing is that it is better not to re-read all config
> > > on link up if there was no config interrupt - less vm exits.
> >
> > Yes, but it should not matter much as it's done in the ndo_open().
>
> Michael, any more comments on this?

Gentle ping.

Thanks

>
> Please confirm if this patch is ok or not. If you prefer to reuse the
> config_disable() I can change it from a boolean to a counter that
> allows to be nested.
>
> Thanks
>
> >
> > Thanks
> >
> > >
> > > --
> > > MST
> > >




[PATCH v11 18/18] fgraph: Skip recording calltime/rettime if it is not needed

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Skip recording calltime and rettime if the fgraph_ops does not need them.
This is a performance optimization for fprobe. Since the fprobe user
(e.g. eBPF, ftrace) does not use these entries, recording timestamps in
fgraph is just overhead. So introduce the skip_timestamp flag, and if
all fgraph_ops set this flag, skip recording calltime and rettime.

Suggested-by: Jiri Olsa 
Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v11:
  - Simplify it to be symmetric on push and pop. (Thus the timestamp
getting place is a bit shifted.)
 Changes in v10:
  - Add likely() to skipping timestamp.
 Changes in v9:
  - Newly added.
---
 include/linux/ftrace.h |2 ++
 kernel/trace/fgraph.c  |   36 +---
 kernel/trace/fprobe.c  |1 +
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index d8a58b940d81..fabf1a0979d4 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1160,6 +1160,8 @@ struct fgraph_ops {
void*private;
trace_func_graph_ent_t  saved_func;
int idx;
+   /* If skip_timestamp is true, this does not record timestamps. */
+   boolskip_timestamp;
 };
 
 void *fgraph_reserve_data(int idx, int size_bytes);
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index d735a8c872bb..cf3ae59a436e 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -174,6 +174,7 @@ int ftrace_graph_active;
 
 static struct fgraph_ops *fgraph_array[FGRAPH_ARRAY_SIZE];
 static unsigned long fgraph_array_bitmask;
+static bool fgraph_skip_timestamp;
 
 /* LRU index table for fgraph_array */
 static int fgraph_lru_table[FGRAPH_ARRAY_SIZE];
@@ -557,7 +558,11 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func,
return -EBUSY;
}
 
-   calltime = trace_clock_local();
+   /* This is not really 'likely' but for keeping the least path to be faster. */
+   if (likely(fgraph_skip_timestamp))
+   calltime = 0LL;
+   else
+   calltime = trace_clock_local();
 
offset = READ_ONCE(current->curr_ret_stack);
ret_stack = RET_STACK(current, offset);
@@ -728,6 +733,12 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
*ret = ret_stack->ret;
trace->func = ret_stack->func;
trace->calltime = ret_stack->calltime;
+   /* This is not really 'likely' but for keeping the least path to be faster. */
+   if (likely(!trace->calltime))
+   trace->rettime = 0LL;
+   else
+   trace->rettime = trace_clock_local();
+
trace->overrun = atomic_read(&current->trace_overrun);
trace->depth = current->curr_ret_depth;
/*
@@ -788,7 +799,6 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
return (unsigned long)panic;
}
 
-   trace.rettime = trace_clock_local();
if (fregs)
ftrace_regs_set_instruction_pointer(fregs, ret);
 
@@ -1242,6 +1252,24 @@ static void ftrace_graph_disable_direct(bool disable_branch)
fgraph_direct_gops = &fgraph_stub;
 }
 
+static void update_fgraph_skip_timestamp(void)
+{
+   int i;
+
+   for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
+   struct fgraph_ops *gops = fgraph_array[i];
+
+   if (gops == &fgraph_stub)
+   continue;
+
+   if (!gops->skip_timestamp) {
+   fgraph_skip_timestamp = false;
+   return;
+   }
+   }
+   fgraph_skip_timestamp = true;
+}
+
 int register_ftrace_graph(struct fgraph_ops *gops)
 {
int command = 0;
@@ -1267,6 +1295,7 @@ int register_ftrace_graph(struct fgraph_ops *gops)
gops->idx = i;
 
ftrace_graph_active++;
+   update_fgraph_skip_timestamp();
 
if (ftrace_graph_active == 2)
ftrace_graph_disable_direct(true);
@@ -1298,6 +1327,7 @@ int register_ftrace_graph(struct fgraph_ops *gops)
ftrace_graph_active--;
gops->saved_func = NULL;
fgraph_lru_release_index(i);
+   update_fgraph_skip_timestamp();
}
 out:
mutex_unlock(&ftrace_lock);
@@ -1321,8 +1351,8 @@ void unregister_ftrace_graph(struct fgraph_ops *gops)
goto out;
 
fgraph_array[gops->idx] = &fgraph_stub;
-
ftrace_graph_active--;
+   update_fgraph_skip_timestamp();
 
if (!ftrace_graph_active)
command = FTRACE_STOP_FUNC_RET;
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index afa52d9816cf..24bb8edec8a3 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -345,6 +345,7 @@ NOKPROBE_SYMBOL(fprobe_return);
 static struct fgraph_ops fprobe_graph_ops = {
.entryfunc  = fprobe_entry,

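The flag aggregation done by update_fgraph_skip_timestamp() in the patch above can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel code: the struct ops type, the slot count of 4, and the use of NULL for empty slots (the patch compares against a stub entry instead) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4	/* assumed size, stands in for FGRAPH_ARRAY_SIZE */

struct ops {
	bool skip_timestamp;
};

/* Timestamps may be skipped only if every registered user opts out. */
static bool all_skip_timestamp(struct ops *slots[NR_SLOTS])
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		if (!slots[i])		/* empty slot, nothing registered */
			continue;
		if (!slots[i]->skip_timestamp)
			return false;	/* one user still needs calltime/rettime */
	}
	return true;
}
```

A single registered user that leaves skip_timestamp unset is enough to keep timestamps enabled for everyone, which is why the patch recomputes the global flag on every register and unregister.
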
[PATCH v11 17/18] Documentation: probes: Update fprobe on function-graph tracer

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Update fprobe documentation for the new fprobe on function-graph
tracer. This includes some behavior changes and the interface change
from pt_regs to ftrace_regs.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v2:
  - Update @fregs parameter explanation.
---
 Documentation/trace/fprobe.rst |   42 ++--
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/Documentation/trace/fprobe.rst b/Documentation/trace/fprobe.rst
index 196f52386aaa..f58bdc64504f 100644
--- a/Documentation/trace/fprobe.rst
+++ b/Documentation/trace/fprobe.rst
@@ -9,9 +9,10 @@ Fprobe - Function entry/exit probe
 Introduction
 
 
-Fprobe is a function entry/exit probe mechanism based on ftrace.
-Instead of using ftrace full feature, if you only want to attach callbacks
-on function entry and exit, similar to the kprobes and kretprobes, you can
+Fprobe is a function entry/exit probe mechanism based on the function-graph
+tracer.
+Instead of tracing all functions, if you want to attach callbacks on specific
+function entry and exit, similar to the kprobes and kretprobes, you can
 use fprobe. Compared with kprobes and kretprobes, fprobe gives faster
 instrumentation for multiple functions with single handler. This document
 describes how to use fprobe.
@@ -91,12 +92,14 @@ The prototype of the entry/exit callback function are as follows:
 
 .. code-block:: c
 
- int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
- void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
-Note that the @entry_ip is saved at function entry and passed to exit handler.
-If the entry callback function returns !0, the corresponding exit callback will be cancelled.
+Note that the @entry_ip is saved at function entry and passed to exit
+handler.
+If the entry callback function returns !0, the corresponding exit callback
+will be cancelled.
 
 @fp
 This is the address of `fprobe` data structure related to this handler.
@@ -112,12 +115,10 @@ If the entry callback function returns !0, the corresponding exit callback will
 This is the return address that the traced function will return to,
 somewhere in the caller. This can be used at both entry and exit.
 
-@regs
-This is the `pt_regs` data structure at the entry and exit. Note that
-the instruction pointer of @regs may be different from the @entry_ip
-in the entry_handler. If you need traced instruction pointer, you need
-to use @entry_ip. On the other hand, in the exit_handler, the instruction
-pointer of @regs is set to the current return address.
+@fregs
+This is the `ftrace_regs` data structure at the entry and exit. This
+includes the function parameters, or the return values. So user can
+access those values via appropriate `ftrace_regs_*` APIs.
 
 @entry_data
 This is a local storage to share the data between entry and exit handlers.
@@ -125,6 +126,17 @@ If the entry callback function returns !0, the corresponding exit callback will
 and `entry_data_size` field when registering the fprobe, the storage is
 allocated and passed to both `entry_handler` and `exit_handler`.
 
+Entry data size and exit handlers on the same function
+======================================================
+
+Since the entry data is passed via the per-task stack, which has limited size,
+the entry data size per probe is limited to `15 * sizeof(long)`. Note that
+when different fprobes probe the same function, this limit becomes smaller.
+The entry data size is aligned to `sizeof(long)`, and each fprobe that has an
+exit handler uses a `sizeof(long)` space on the stack, so you
+should keep the number of fprobes on the same function as small as
+possible.
+
 Share the callbacks with kprobes
 
 
@@ -165,8 +177,8 @@ This counter counts up when;
  - fprobe fails to take ftrace_recursion lock. This usually means that a function
which is traced by other ftrace users is called from the entry_handler.
 
- - fprobe fails to setup the function exit because of the shortage of rethook
-   (the shadow stack for hooking the function return.)
+ - fprobe fails to setup the function exit because of failing to allocate the
+   data buffer from the per-task shadow stack.
 
 The `fprobe::nmissed` field counts up in both cases. Therefore, the former
 skips both of entry and exit callback and the latter skips the exit

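As a rough illustration of the size rule described in the documentation added above, the following user-space sketch rounds a requested entry data size up to a multiple of sizeof(long) and checks it against the 15 * sizeof(long) limit. The macro and function names are hypothetical and are not the kernel's. How the limit shrinks when several fprobes share a function is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limit stated in the documentation; the macro name is made up here. */
#define MAX_ENTRY_DATA_SIZE (15 * sizeof(long))

/* Round the requested size up to a multiple of sizeof(long). */
static size_t entry_data_aligned_size(size_t size)
{
	return (size + sizeof(long) - 1) / sizeof(long) * sizeof(long);
}

/* True if the aligned size still fits within the per-probe limit. */
static bool entry_data_size_ok(size_t size)
{
	return entry_data_aligned_size(size) <= MAX_ENTRY_DATA_SIZE;
}
```

Because of the rounding, a request that is one byte over the limit is rejected even though the limit itself is accepted.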



[PATCH v11 16/18] selftests/ftrace: Add a test case for repeating register/unregister fprobe

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

This test case repeatedly defines and undefines the fprobe dynamic event to
ensure that fprobe does not cause any issues with such operations.

Signed-off-by: Masami Hiramatsu (Google) 
---
 .../test.d/dynevent/add_remove_fprobe_repeat.tc|   19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe_repeat.tc

diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe_repeat.tc b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe_repeat.tc
new file mode 100644
index ..b4ad09237e2a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe_repeat.tc
@@ -0,0 +1,19 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Generic dynamic event - Repeating add/remove fprobe events
+# requires: dynamic_events "f[:[/][]] [%return] []":README
+
+echo 0 > events/enable
+echo > dynamic_events
+
+PLACE=$FUNCTION_FORK
+REPEAT_TIMES=64
+
+for i in `seq 1 $REPEAT_TIMES`; do
+  echo "f:myevent $PLACE" >> dynamic_events
+  grep -q myevent dynamic_events
+  test -d events/fprobes/myevent
+  echo > dynamic_events
+done
+
+clear_trace




[PATCH v11 15/18] selftests: ftrace: Remove obsolete maxactive syntax check

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Since the fprobe event does not support maxactive anymore, stop
testing the maxactive syntax error checking.

Signed-off-by: Masami Hiramatsu (Google) 
---
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
index 61877d166451..c9425a34fae3 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
@@ -16,9 +16,7 @@ aarch64)
   REG=%r0 ;;
 esac
 
-check_error 'f^100 vfs_read'   # MAXACT_NO_KPROBE
-check_error 'f^1a111 vfs_read' # BAD_MAXACT
-check_error 'f^10 vfs_read'# MAXACT_TOO_BIG
+check_error 'f^100 vfs_read'   # BAD_MAXACT
 
 check_error 'f ^non_exist_func'# BAD_PROBE_ADDR (enoent)
 check_error 'f ^vfs_read+10'   # BAD_PROBE_ADDR




[PATCH v11 14/18] tracing/fprobe: Remove nr_maxactive from fprobe

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Remove deprecated fprobe::nr_maxactive. This makes fprobe events
reject the maxactive number.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v2:
  - Newly added.
---
 include/linux/fprobe.h  |2 --
 kernel/trace/trace_fprobe.c |   44 ++-
 2 files changed, 6 insertions(+), 40 deletions(-)

diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index 2d06bbd99601..a86b3e4df2a0 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -54,7 +54,6 @@ struct fprobe_hlist {
  * @nmissed: The counter for missing events.
  * @flags: The status flag.
  * @entry_data_size: The private data storage size.
- * @nr_maxactive: The max number of active functions. (*deprecated)
  * @entry_handler: The callback function for function entry.
  * @exit_handler: The callback function for function exit.
  * @hlist_array: The fprobe_hlist for fprobe search from IP hash table.
@@ -63,7 +62,6 @@ struct fprobe {
unsigned long   nmissed;
unsigned intflags;
size_t  entry_data_size;
-   int nr_maxactive;
 
fprobe_entry_cb entry_handler;
fprobe_exit_cb  exit_handler;
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 86cd6a8c806a..20ef5cd5d419 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -422,7 +422,6 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
   const char *event,
   const char *symbol,
   struct tracepoint *tpoint,
-  int maxactive,
   int nargs, bool is_return)
 {
struct trace_fprobe *tf;
@@ -442,7 +441,6 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
tf->fp.entry_handler = fentry_dispatcher;
 
tf->tpoint = tpoint;
-   tf->fp.nr_maxactive = maxactive;
 
ret = trace_probe_init(&tf->tp, event, group, false, nargs);
if (ret < 0)
@@ -1021,12 +1019,11 @@ static int __trace_fprobe_create(int argc, const char *argv[])
 *  FETCHARG:TYPE : use TYPE instead of unsigned long.
 */
struct trace_fprobe *tf = NULL;
-   int i, len, new_argc = 0, ret = 0;
+   int i, new_argc = 0, ret = 0;
bool is_return = false;
char *symbol = NULL;
const char *event = NULL, *group = FPROBE_EVENT_SYSTEM;
const char **new_argv = NULL;
-   int maxactive = 0;
char buf[MAX_EVENT_NAME_LEN];
char gbuf[MAX_EVENT_NAME_LEN];
char sbuf[KSYM_NAME_LEN];
@@ -1048,33 +1045,13 @@ static int __trace_fprobe_create(int argc, const char *argv[])
 
trace_probe_log_init("trace_fprobe", argc, argv);
 
-   event = strchr(&argv[0][1], ':');
-   if (event)
-   event++;
-
-   if (isdigit(argv[0][1])) {
-   if (event)
-   len = event - &argv[0][1] - 1;
-   else
-   len = strlen(&argv[0][1]);
-   if (len > MAX_EVENT_NAME_LEN - 1) {
-   trace_probe_log_err(1, BAD_MAXACT);
-   goto parse_error;
-   }
-   memcpy(buf, &argv[0][1], len);
-   buf[len] = '\0';
-   ret = kstrtouint(buf, 0, &maxactive);
-   if (ret || !maxactive) {
+   if (argv[0][1] != '\0') {
+   if (argv[0][1] != ':') {
+   trace_probe_log_set_index(0);
trace_probe_log_err(1, BAD_MAXACT);
goto parse_error;
}
-   /* fprobe rethook instances are iterated over via a list. The
-* maximum should stay reasonable.
-*/
-   if (maxactive > RETHOOK_MAXACTIVE_MAX) {
-   trace_probe_log_err(1, MAXACT_TOO_BIG);
-   goto parse_error;
-   }
+   event = &argv[0][2];
}
 
trace_probe_log_set_index(1);
@@ -1084,12 +1061,6 @@ static int __trace_fprobe_create(int argc, const char *argv[])
if (ret < 0)
goto parse_error;
 
-   if (!is_return && maxactive) {
-   trace_probe_log_set_index(0);
-   trace_probe_log_err(1, BAD_MAXACT_TYPE);
-   goto parse_error;
-   }
-
trace_probe_log_set_index(0);
if (event) {
ret = traceprobe_parse_event_name(&event, &group, gbuf,
@@ -1147,8 +1118,7 @@ static int __trace_fprobe_create(int argc, const char *argv[])
goto out;
 
/* setup a probe */
-   tf = alloc_trace_fprobe(group, event, symbol, tpoint, maxactive,
-   argc, is_return);
+   tf = alloc_trace_fprobe(group, event, symbol, tpoint, argc, 

[PATCH v11 13/18] fprobe: Rewrite fprobe on function-graph tracer

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Rewrite fprobe implementation on function-graph tracer.
Major API changes are:
 -  'nr_maxactive' field is deprecated.
 -  This depends on CONFIG_DYNAMIC_FTRACE_WITH_ARGS or
!CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS, and
CONFIG_HAVE_FUNCTION_GRAPH_FREGS. So it currently works only
on x86_64.
 -  Currently the entry data size is limited to 15 * sizeof(long).
 -  If too many fprobe exit handlers are set on the same
function, it will fail to probe.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v9:
  - Remove unneeded prototype of ftrace_regs_get_return_address().
  - Fix entry data address calculation.
  - Remove DIV_ROUND_UP() from hotpath.
 Changes in v8:
  - Use trace_func_graph_ret/ent_t for fgraph_ops.
  - Update CONFIG_FPROBE dependencies.
  - Add ftrace_regs_get_return_address() for each arch.
 Changes in v3:
  - Update for new reserve_data/retrieve_data API.
  - Fix internal push/pop on fgraph data logic so that it can
correctly save/restore the returning fprobes.
 Changes in v2:
  - Add more lockdep_assert_held(fprobe_mutex)
  - Use READ_ONCE() and WRITE_ONCE() for fprobe_hlist_node::fp.
  - Add NOKPROBE_SYMBOL() for the functions which are called from
entry/exit callbacks.
---
 arch/arm64/include/asm/ftrace.h |6 
 arch/loongarch/include/asm/ftrace.h |6 
 arch/powerpc/include/asm/ftrace.h   |6 
 arch/s390/include/asm/ftrace.h  |6 
 arch/x86/include/asm/ftrace.h   |6 
 include/linux/fprobe.h  |   53 ++-
 kernel/trace/Kconfig|8 
 kernel/trace/fprobe.c   |  638 +--
 lib/test_fprobe.c   |   45 --
 9 files changed, 529 insertions(+), 245 deletions(-)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 14ecb9a418d9..27e32f323048 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -132,6 +132,12 @@ ftrace_regs_get_frame_pointer(const struct ftrace_regs 
*fregs)
return fregs->fp;
 }
 
+static __always_inline unsigned long
+ftrace_regs_get_return_address(const struct ftrace_regs *fregs)
+{
+   return fregs->lr;
+}
+
 static __always_inline struct pt_regs *
 ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
 {
diff --git a/arch/loongarch/include/asm/ftrace.h 
b/arch/loongarch/include/asm/ftrace.h
index 1a73f35ea9af..c021aa3194f3 100644
--- a/arch/loongarch/include/asm/ftrace.h
+++ b/arch/loongarch/include/asm/ftrace.h
@@ -80,6 +80,12 @@ ftrace_regs_set_instruction_pointer(struct ftrace_regs 
*fregs, unsigned long ip)
 #define ftrace_regs_get_frame_pointer(fregs) \
((fregs)->regs.regs[22])
 
+static __always_inline unsigned long
+ftrace_regs_get_return_address(struct ftrace_regs *fregs)
+{
+   return *(unsigned long *)(fregs->regs.regs[1]);
+}
+
 #define ftrace_graph_func ftrace_graph_func
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
   struct ftrace_ops *op, struct ftrace_regs *fregs);
diff --git a/arch/powerpc/include/asm/ftrace.h 
b/arch/powerpc/include/asm/ftrace.h
index e6ff6834bf7e..2a2d070dd23c 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -75,6 +75,12 @@ ftrace_regs_get_instruction_pointer(struct ftrace_regs 
*fregs)
 #define ftrace_regs_query_register_offset(name) \
regs_query_register_offset(name)
 
+static __always_inline unsigned long
+ftrace_regs_get_return_address(struct ftrace_regs *fregs)
+{
+   return fregs->regs.link;
+}
+
 struct ftrace_ops;
 
 #define ftrace_graph_func ftrace_graph_func
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 0d9f6df21f81..7b80ff4d3386 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -84,6 +84,12 @@ ftrace_regs_get_frame_pointer(struct ftrace_regs *fregs)
return sp[0];   /* return backchain */
 }
 
+static __always_inline unsigned long
+ftrace_regs_get_return_address(const struct ftrace_regs *fregs)
+{
+   return fregs->regs.gprs[14];
+}
+
 #define arch_ftrace_fill_perf_regs(fregs, _regs)do {   \
(_regs)->psw.addr = (fregs)->regs.psw.addr; \
(_regs)->gprs[15] = (fregs)->regs.gprs[15]; \
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 1f4d1f7b19ed..8472ba394091 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -74,6 +74,12 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
 #define ftrace_regs_get_frame_pointer(fregs) \
frame_pointer(&(fregs)->regs)
 
+static __always_inline unsigned long
+ftrace_regs_get_return_address(struct ftrace_regs *fregs)
+{
+   return *(unsigned long *)ftrace_regs_get_stack_pointer(fregs);
+}
+
 struct ftrace_ops;
 #define ftrace_graph_func ftrace_graph_func
 void ftrace_graph_func(unsigned long ip, unsigned long 

[PATCH v11 12/18] ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Add a CONFIG_HAVE_FTRACE_GRAPH_FUNC kconfig option in addition to the
ftrace_graph_func macro check. With it, other features (e.g. FPROBE) that
need to access ftrace_regs from fgraph_ops::entryfunc() can be excluded
from the build when fgraph cannot pass a valid ftrace_regs.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v8:
  - Newly added.
---
 arch/arm64/Kconfig |1 +
 arch/loongarch/Kconfig |1 +
 arch/powerpc/Kconfig   |1 +
 arch/riscv/Kconfig |1 +
 arch/x86/Kconfig   |1 +
 kernel/trace/Kconfig   |5 +
 6 files changed, 10 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8691683d782e..e99a3fd53efd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -207,6 +207,7 @@ config ARM64
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_GUP_FAST
+   select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 0f1b2057507b..f1439c42c46a 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -126,6 +126,7 @@ config LOONGARCH
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
select HAVE_EXIT_THREAD
select HAVE_GUP_FAST
+   select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_ERROR_INJECTION
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c88c6d46a5bc..910118faedaa 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -239,6 +239,7 @@ config PPC
select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_GUP_FAST
+   select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_DESCRIPTORSif PPC64_ELF_ABI_V1
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 1904393bc399..83e8c8c64b99 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -130,6 +130,7 @@ config RISCV
select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && 
(CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
select HAVE_DYNAMIC_FTRACE_WITH_ARGS if HAVE_DYNAMIC_FTRACE
+   select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_GRAPH_FREGS
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4655b72e6d7..7213e27b5b2b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -228,6 +228,7 @@ config X86
select HAVE_EXIT_THREAD
select HAVE_GUP_FAST
select HAVE_FENTRY  if X86_64 || DYNAMIC_FTRACE
+   select HAVE_FTRACE_GRAPH_FUNC   if HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_FREGSif HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_GRAPH_TRACER   if X86_32 || (X86_64 && 
DYNAMIC_FTRACE)
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4a3dd81f749b..a1fa9cba0ef3 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -34,6 +34,11 @@ config HAVE_FUNCTION_GRAPH_TRACER
 config HAVE_FUNCTION_GRAPH_FREGS
bool
 
+config HAVE_FTRACE_GRAPH_FUNC
+   bool
+   help
+ True if ftrace_graph_func() is defined.
+
 config HAVE_DYNAMIC_FTRACE
bool
help




[PATCH v11 11/18] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Enable kprobe_multi feature if CONFIG_FPROBE is enabled. The pt_regs is
converted from ftrace_regs by ftrace_partial_regs(), thus some registers
may always return 0. But it should be enough for function entry (access
arguments) and exit (access return value).

Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Florent Revest 
---
 Changes in v9:
  - Avoid wasting memory for bpf_kprobe_multi_pt_regs when
CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y
---
 kernel/trace/bpf_trace.c |   27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f72b421abe9b..77fd63027286 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2602,7 +2602,7 @@ struct bpf_session_run_ctx {
void *data;
 };
 
-#if defined(CONFIG_FPROBE) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
+#ifdef CONFIG_FPROBE
 struct bpf_kprobe_multi_link {
struct bpf_link link;
struct fprobe fp;
@@ -2625,6 +2625,13 @@ struct user_syms {
char *buf;
 };
 
+#ifndef CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST
+static DEFINE_PER_CPU(struct pt_regs, bpf_kprobe_multi_pt_regs);
+#define bpf_kprobe_multi_pt_regs_ptr() this_cpu_ptr(&bpf_kprobe_multi_pt_regs)
+#else
+#define bpf_kprobe_multi_pt_regs_ptr() (NULL)
+#endif
+
 static int copy_user_syms(struct user_syms *us, unsigned long __user *usyms, 
u32 cnt)
 {
unsigned long __user usymbol;
@@ -2819,7 +2826,7 @@ static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx 
*ctx)
 
 static int
 kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
-  unsigned long entry_ip, struct pt_regs *regs,
+  unsigned long entry_ip, struct ftrace_regs *fregs,
   bool is_return, void *data)
 {
struct bpf_kprobe_multi_run_ctx run_ctx = {
@@ -2831,6 +2838,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link 
*link,
.entry_ip = entry_ip,
};
struct bpf_run_ctx *old_run_ctx;
+   struct pt_regs *regs;
int err;
 
if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
@@ -2841,6 +2849,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link 
*link,
 
migrate_disable();
rcu_read_lock();
+   regs = ftrace_partial_regs(fregs, bpf_kprobe_multi_pt_regs_ptr());
old_run_ctx = bpf_set_run_ctx(&run_ctx.session_ctx.run_ctx);
err = bpf_prog_run(link->link.prog, regs);
bpf_reset_run_ctx(old_run_ctx);
@@ -2857,15 +2866,11 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned 
long fentry_ip,
  unsigned long ret_ip, struct ftrace_regs *fregs,
  void *data)
 {
-   struct pt_regs *regs = ftrace_get_regs(fregs);
struct bpf_kprobe_multi_link *link;
int err;
 
-   if (!regs)
-   return 0;
-
link = container_of(fp, struct bpf_kprobe_multi_link, fp);
-   err = kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs, 
false, data);
+   err = kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), fregs, 
false, data);
return is_kprobe_session(link->link.prog) ? err : 0;
 }
 
@@ -2875,13 +2880,9 @@ kprobe_multi_link_exit_handler(struct fprobe *fp, 
unsigned long fentry_ip,
   void *data)
 {
struct bpf_kprobe_multi_link *link;
-   struct pt_regs *regs = ftrace_get_regs(fregs);
-
-   if (!regs)
-   return;
 
link = container_of(fp, struct bpf_kprobe_multi_link, fp);
-   kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs, true, 
data);
+   kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), fregs, true, 
data);
 }
 
 static int symbols_cmp_r(const void *a, const void *b, const void *priv)
@@ -3142,7 +3143,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr 
*attr, struct bpf_prog *pr
kvfree(cookies);
return err;
 }
-#else /* !CONFIG_FPROBE || !CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+#else /* !CONFIG_FPROBE */
 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog 
*prog)
 {
return -EOPNOTSUPP;




Re: [PATCH 1/2] vdpa: support set mac address from vdpa tool

2024-06-16 Thread Jason Wang
On Wed, Jun 12, 2024 at 2:30 PM Jiri Pirko  wrote:
>
> Wed, Jun 12, 2024 at 03:58:10AM CEST, k...@kernel.org wrote:
> >On Tue, 11 Jun 2024 13:32:32 +0800 Cindy Lu wrote:
> >> Add new UAPI to support the mac address from vdpa tool
> >> Function vdpa_nl_cmd_dev_config_set_doit() will get the
> >> MAC address from the vdpa tool and then set it to the device.
> >>
> >> The usage is: vdpa dev set name vdpa_name mac **:**:**:**:**:**
> >
> >Why don't you use devlink?
>
> Fair question. Why does vdpa-specific uapi even exist? To have
> driver-specific uapi Does not make any sense to me :/

It came with devlink first actually, but switched to a dedicated uAPI.

Parav(cced) may explain more here.

Thanks
>




[PATCH v11 10/18] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Allow fprobe events to be enabled with CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
With this change, fprobe events mostly use ftrace_regs instead of pt_regs.
Note that if the arch doesn't enable HAVE_PT_REGS_COMPAT_FTRACE_REGS,
fprobe events will not be able to be used from perf.
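
As an illustration of the ftrace_regs_get_kernel_stack_nth() helper that
this patch adds, the following is a minimal userspace sketch of its bounds
check: a stack slot is only read when it lies in the same
THREAD_SIZE-aligned block as the stack pointer. The names
mock_slot_on_same_stack() and MOCK_THREAD_SIZE are hypothetical, and the
16 KiB size is an assumed value, not taken from any particular arch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size for the sketch; the mask trick needs a power of two. */
#define MOCK_THREAD_SIZE 16384UL

static_assert((MOCK_THREAD_SIZE & (MOCK_THREAD_SIZE - 1)) == 0,
	      "MOCK_THREAD_SIZE must be a power of two");

/*
 * Return 1 when the nth unsigned long above sp is in the same
 * MOCK_THREAD_SIZE-aligned block as sp itself, 0 otherwise.
 */
static int mock_slot_on_same_stack(uintptr_t sp, unsigned int nth)
{
	uintptr_t slot = sp + (uintptr_t)nth * sizeof(unsigned long);
	uintptr_t mask = ~(uintptr_t)(MOCK_THREAD_SIZE - 1);

	return (slot & mask) == (sp & mask);
}
```

When the check fails, the helper in the patch returns 0 rather than
dereferencing the out-of-range address.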

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v9:
  - Copy store_trace_entry_data() as store_fprobe_entry_data() for
fprobe.
 Changes in v3:
  - Use ftrace_regs_get_return_value().
 Changes in v2:
  - Define ftrace_regs_get_kernel_stack_nth() for
!CONFIG_HAVE_REGS_AND_STACK_ACCESS_API.
 Changes from previous series: Update against the new series.
---
 include/linux/ftrace.h  |   17 ++
 kernel/trace/Kconfig|1 
 kernel/trace/trace_fprobe.c |  107 +--
 kernel/trace/trace_probe_tmpl.h |2 -
 4 files changed, 86 insertions(+), 41 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index d9a3723f987d..d8a58b940d81 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -255,6 +255,23 @@ static __always_inline bool ftrace_regs_has_args(struct 
ftrace_regs *fregs)
frame_pointer(&(fregs)->regs)
 #endif
 
+#ifdef CONFIG_HAVE_REGS_AND_STACK_ACCESS_API
+static __always_inline unsigned long
+ftrace_regs_get_kernel_stack_nth(struct ftrace_regs *fregs, unsigned int nth)
+{
+   unsigned long *stackp;
+
+   stackp = (unsigned long *)ftrace_regs_get_stack_pointer(fregs);
+   if (((unsigned long)(stackp + nth) & ~(THREAD_SIZE - 1)) ==
+   ((unsigned long)stackp & ~(THREAD_SIZE - 1)))
+   return *(stackp + nth);
+
+   return 0;
+}
+#else /* !CONFIG_HAVE_REGS_AND_STACK_ACCESS_API */
+#define ftrace_regs_get_kernel_stack_nth(fregs, nth)   (0L)
+#endif /* CONFIG_HAVE_REGS_AND_STACK_ACCESS_API */
+
 typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
  struct ftrace_ops *op, struct ftrace_regs *fregs);
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 15e340a865f5..4a3dd81f749b 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -680,7 +680,6 @@ config FPROBE_EVENTS
select TRACING
select PROBE_EVENTS
select DYNAMIC_EVENTS
-   depends on DYNAMIC_FTRACE_WITH_REGS
default y
help
  This allows user to add tracing events on the function entry and
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 273cdf3cf70c..86cd6a8c806a 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -133,7 +133,7 @@ static int
 process_fetch_insn(struct fetch_insn *code, void *rec, void *edata,
   void *dest, void *base)
 {
-   struct pt_regs *regs = rec;
+   struct ftrace_regs *fregs = rec;
unsigned long val;
int ret;
 
@@ -141,17 +141,17 @@ process_fetch_insn(struct fetch_insn *code, void *rec, 
void *edata,
/* 1st stage: get value from context */
switch (code->op) {
case FETCH_OP_STACK:
-   val = regs_get_kernel_stack_nth(regs, code->param);
+   val = ftrace_regs_get_kernel_stack_nth(fregs, code->param);
break;
case FETCH_OP_STACKP:
-   val = kernel_stack_pointer(regs);
+   val = ftrace_regs_get_stack_pointer(fregs);
break;
case FETCH_OP_RETVAL:
-   val = regs_return_value(regs);
+   val = ftrace_regs_get_return_value(fregs);
break;
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
case FETCH_OP_ARG:
-   val = regs_get_kernel_argument(regs, code->param);
+   val = ftrace_regs_get_argument(fregs, code->param);
break;
case FETCH_OP_EDATA:
val = *(unsigned long *)((unsigned long)edata + code->offset);
@@ -174,7 +174,7 @@ NOKPROBE_SYMBOL(process_fetch_insn)
 /* function entry handler */
 static nokprobe_inline void
 __fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
-   struct pt_regs *regs,
+   struct ftrace_regs *fregs,
struct trace_event_file *trace_file)
 {
struct fentry_trace_entry_head *entry;
@@ -188,41 +188,71 @@ __fentry_trace_func(struct trace_fprobe *tf, unsigned 
long entry_ip,
if (trace_trigger_soft_disabled(trace_file))
return;
 
-   dsize = __get_data_size(&tf->tp, regs, NULL);
+   dsize = __get_data_size(&tf->tp, fregs, NULL);
 
entry = trace_event_buffer_reserve(&fbuffer, trace_file,
   sizeof(*entry) + tf->tp.size + 
dsize);
if (!entry)
return;
 
-   fbuffer.regs = regs;
+   fbuffer.regs = ftrace_get_regs(fregs);
entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
entry->ip = entry_ip;
-   store_trace_args(&entry[1], &tf->tp, regs, NULL, 

[PATCH v11 09/18] tracing: Add ftrace_fill_perf_regs() for perf event

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Add ftrace_fill_perf_regs() which should be compatible with the
perf_fetch_caller_regs(). In other words, the pt_regs returned from the
ftrace_fill_perf_regs() must satisfy 'user_mode(regs) == false' and can be
used for stack tracing.
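
The requirement above can be sketched in userspace as follows. This is a
simplified model loosely following the x86 variant in this patch; the
mock_* types, MOCK_KERNEL_CS and mock_user_mode() are hypothetical
stand-ins, and the selector value is only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative kernel code selector: its low two bits are clear. */
#define MOCK_KERNEL_CS 0x10UL

struct mock_pt_regs {
	unsigned long ip, sp, cs, flags;
};

struct mock_ftrace_regs {
	struct mock_pt_regs regs;
};

/* In this mock, the privilege level lives in the low two bits of cs. */
static int mock_user_mode(const struct mock_pt_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/* Fill only what a stack trace needs and force a kernel-mode cs. */
static struct mock_pt_regs *
mock_fill_perf_regs(const struct mock_ftrace_regs *fregs,
		    struct mock_pt_regs *regs)
{
	assert(fregs != NULL && regs != NULL);
	regs->ip = fregs->regs.ip;
	regs->sp = fregs->regs.sp;
	regs->cs = MOCK_KERNEL_CS;
	regs->flags = 0;
	return regs;
}
```

Whatever cs the source structure held, the filled pt_regs always reports
kernel mode, which is the first of the two requirements.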

Signed-off-by: Masami Hiramatsu (Google) 
---
  Changes from previous series: NOTHING, just forward ported.
---
 arch/arm64/include/asm/ftrace.h   |7 +++
 arch/powerpc/include/asm/ftrace.h |7 +++
 arch/s390/include/asm/ftrace.h|5 +
 arch/x86/include/asm/ftrace.h |7 +++
 include/linux/ftrace.h|   31 +++
 5 files changed, 57 insertions(+)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 5cd587afab6d..14ecb9a418d9 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -143,6 +143,13 @@ ftrace_partial_regs(const struct ftrace_regs *fregs, 
struct pt_regs *regs)
return regs;
 }
 
+#define arch_ftrace_fill_perf_regs(fregs, _regs) do {  \
+   (_regs)->pc = (fregs)->pc;  \
+   (_regs)->regs[29] = (fregs)->fp;\
+   (_regs)->sp = (fregs)->sp;  \
+   (_regs)->pstate = PSR_MODE_EL1h;\
+   } while (0)
+
 int ftrace_regs_query_register_offset(const char *name);
 
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
diff --git a/arch/powerpc/include/asm/ftrace.h 
b/arch/powerpc/include/asm/ftrace.h
index 23d26f3afae4..e6ff6834bf7e 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -42,6 +42,13 @@ static __always_inline struct pt_regs 
*arch_ftrace_get_regs(struct ftrace_regs *
return fregs->regs.msr ? &fregs->regs : NULL;
 }
 
+#define arch_ftrace_fill_perf_regs(fregs, _regs) do {  \
+   (_regs)->result = 0;\
+   (_regs)->nip = (fregs)->regs.nip;   \
+   (_regs)->gpr[1] = (fregs)->regs.gpr[1]; \
+   asm volatile("mfmsr %0" : "=r" ((_regs)->msr)); \
+   } while (0)
+
 static __always_inline void
 ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
unsigned long ip)
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 9cdd48a46bf7..0d9f6df21f81 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -84,6 +84,11 @@ ftrace_regs_get_frame_pointer(struct ftrace_regs *fregs)
return sp[0];   /* return backchain */
 }
 
+#define arch_ftrace_fill_perf_regs(fregs, _regs)do {   \
+   (_regs)->psw.addr = (fregs)->regs.psw.addr; \
+   (_regs)->gprs[15] = (fregs)->regs.gprs[15]; \
+   } while (0)
+
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 /*
  * When an ftrace registered caller is tracing a function that is
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 669771ef3b5b..1f4d1f7b19ed 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -46,6 +46,13 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
return &fregs->regs;
 }
 
+#define arch_ftrace_fill_perf_regs(fregs, _regs) do {  \
+   (_regs)->ip = (fregs)->regs.ip; \
+   (_regs)->sp = (fregs)->regs.sp; \
+   (_regs)->cs = __KERNEL_CS;  \
+   (_regs)->flags = 0; \
+   } while (0)
+
 #define ftrace_regs_set_instruction_pointer(fregs, _ip)\
do { (fregs)->regs.ip = (_ip); } while (0)
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 8e5da4dfb669..d9a3723f987d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -193,6 +193,37 @@ ftrace_partial_regs(struct ftrace_regs *fregs, struct 
pt_regs *regs)
 
 #endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || 
CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
 
+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+
+/*
+ * Please define arch dependent pt_regs which compatible to the
+ * perf_arch_fetch_caller_regs() but based on ftrace_regs.
+ * This requires
+ *   - user_mode(_regs) returns false (always kernel mode).
+ *   - able to use the _regs for stack trace.
+ */
+#ifndef arch_ftrace_fill_perf_regs
+/* As same as perf_arch_fetch_caller_regs(), do nothing by default */
+#define arch_ftrace_fill_perf_regs(fregs, _regs) do {} while (0)
+#endif
+
+static __always_inline struct pt_regs *
+ftrace_fill_perf_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+   arch_ftrace_fill_perf_regs(fregs, regs);
+   return regs;
+}
+
+#else /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
+
+static __always_inline struct pt_regs *
+ftrace_fill_perf_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+   return &fregs->regs;
+}
+
+#endif
+
 /*
  * When true, the 

[PATCH v11 08/18] tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
This is for the eBPF which needs this to keep the same pt_regs interface
to access registers.
Thus when replacing the pt_regs with ftrace_regs in fprobes (which is
used by kprobe_multi eBPF event), this will be used.

If the architecture defines its own ftrace_regs, this copies partial
registers to pt_regs and returns it. If not, ftrace_regs is the same as
pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
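
The copying case can be sketched in userspace like this, modelled on the
arm64 variant in this patch. The mock_* structures are hypothetical,
simplified stand-ins for the kernel ones. Note that the memset() is an
addition of this sketch so that registers ftrace did not save read as 0;
the patch itself leaves those fields untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_pt_regs {
	unsigned long regs[31];
	unsigned long sp;
	unsigned long pc;
	unsigned long pstate;
};

struct mock_ftrace_regs {
	unsigned long regs[9];	/* argument/result registers only */
	unsigned long fp;
	unsigned long lr;
	unsigned long sp;
	unsigned long pc;
};

/* Copy the registers ftrace saved into a caller-provided pt_regs. */
static struct mock_pt_regs *
mock_partial_regs(const struct mock_ftrace_regs *fregs,
		  struct mock_pt_regs *regs)
{
	assert(fregs != NULL && regs != NULL);
	memset(regs, 0, sizeof(*regs));	/* sketch-only: unsaved regs read 0 */
	memcpy(regs->regs, fregs->regs, sizeof(fregs->regs));
	regs->regs[29] = fregs->fp;
	regs->regs[30] = fregs->lr;
	regs->sp = fregs->sp;
	regs->pc = fregs->pc;
	return regs;
}
```

The caller owns the destination buffer, so a user such as the
kprobe_multi code needs somewhere to put it.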

Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Florent Revest 
---
 Changes in v8:
  - Add the reason why this required in changelog.
 Changes from previous series: NOTHING, just forward ported.
---
 arch/arm64/include/asm/ftrace.h |   11 +++
 include/linux/ftrace.h  |   17 +
 2 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index dffaab3dd1f1..5cd587afab6d 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -132,6 +132,17 @@ ftrace_regs_get_frame_pointer(const struct ftrace_regs 
*fregs)
return fregs->fp;
 }
 
+static __always_inline struct pt_regs *
+ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+   memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
+   regs->sp = fregs->sp;
+   regs->pc = fregs->pc;
+   regs->regs[29] = fregs->fp;
+   regs->regs[30] = fregs->lr;
+   return regs;
+}
+
 int ftrace_regs_query_register_offset(const char *name);
 
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index fa578748f7d2..8e5da4dfb669 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -176,6 +176,23 @@ static __always_inline struct pt_regs 
*ftrace_get_regs(struct ftrace_regs *fregs
return arch_ftrace_get_regs(fregs);
 }
 
+#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
+   defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
+
+static __always_inline struct pt_regs *
+ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+   /*
+* If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
+* layout is the same as pt_regs. So always returns that address.
+* Since arch_ftrace_get_regs() will check some members and may return
+* NULL, we can not use it.
+*/
+   return &fregs->regs;
+}
+
+#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || 
CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
 /*
  * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
  * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.




[PATCH v11 07/18] fprobe: Use ftrace_regs in fprobe exit handler

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Change the fprobe exit handler to use ftrace_regs structure instead of
pt_regs. This also introduces HAVE_PT_REGS_TO_FTRACE_REGS_CAST, which means
that the memory layout of ftrace_regs is identical to that of pt_regs, so
one can be cast to the other. Fprobe now depends on this option (or on
!HAVE_DYNAMIC_FTRACE_WITH_ARGS).
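
A minimal userspace sketch of the layout guarantee, assuming ftrace_regs
simply wraps pt_regs as its only member. The mock_* names are
hypothetical. The size check mirrors the static_assert() added by this
patch; the offsetof() check is an extra guard used only in this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct mock_pt_regs {
	unsigned long gprs[16];
	unsigned long ip;
};

/* Wrapper with pt_regs as its first and only member. */
struct mock_ftrace_regs {
	struct mock_pt_regs regs;
};

static_assert(sizeof(struct mock_pt_regs) == sizeof(struct mock_ftrace_regs),
	      "layouts must have the same size for the cast to be valid");
static_assert(offsetof(struct mock_ftrace_regs, regs) == 0,
	      "pt_regs must sit at the start of ftrace_regs");

/* With identical layout, conversion is a pointer cast, not a copy. */
static struct mock_ftrace_regs *mock_pt_regs_to_fregs(struct mock_pt_regs *regs)
{
	return (struct mock_ftrace_regs *)regs;
}

static struct mock_pt_regs *mock_fregs_to_pt_regs(struct mock_ftrace_regs *fregs)
{
	return &fregs->regs;
}
```

Because no copy is made, both pointers refer to the same memory and a
write through one is visible through the other.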

Signed-off-by: Masami Hiramatsu (Google) 
---
  Changes in v3:
   - Use ftrace_regs_get_return_value()
  Changes from previous series: NOTHING, just forward ported.
---
 arch/loongarch/Kconfig  |1 +
 arch/s390/Kconfig   |1 +
 arch/x86/Kconfig|1 +
 include/linux/fprobe.h  |2 +-
 include/linux/ftrace.h  |6 ++
 kernel/trace/Kconfig|8 
 kernel/trace/bpf_trace.c|6 +-
 kernel/trace/fprobe.c   |3 ++-
 kernel/trace/trace_fprobe.c |6 +-
 lib/test_fprobe.c   |6 +++---
 samples/fprobe/fprobe_example.c |2 +-
 11 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 23014d5f0047..0f1b2057507b 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -119,6 +119,7 @@ config LOONGARCH
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_ARGS
+   select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EBPF_JIT
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 33688d43fd14..adc8f6620525 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -173,6 +173,7 @@ config S390
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_ARGS
+   select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5fc3a2997977..d4655b72e6d7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -218,6 +218,7 @@ config X86
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_DYNAMIC_FTRACE_WITH_ARGSif X86_64
+   select HAVE_PT_REGS_TO_FTRACE_REGS_CAST if X86_64
select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
select HAVE_SAMPLE_FTRACE_DIRECTif X86_64
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI  if X86_64
diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index ca64ee5e45d2..ef609bcca0f9 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -14,7 +14,7 @@ typedef int (*fprobe_entry_cb)(struct fprobe *fp, unsigned 
long entry_ip,
   void *entry_data);
 
 typedef void (*fprobe_exit_cb)(struct fprobe *fp, unsigned long entry_ip,
-  unsigned long ret_ip, struct pt_regs *regs,
+  unsigned long ret_ip, struct ftrace_regs *regs,
   void *entry_data);
 
 /**
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 85394b9fb630..fa578748f7d2 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -162,6 +162,12 @@ struct ftrace_regs {
 #define ftrace_regs_set_instruction_pointer(fregs, ip) do { } while (0)
 #endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
 
+#ifdef CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST
+
+static_assert(sizeof(struct pt_regs) == sizeof(struct ftrace_regs));
+
+#endif /* CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
 static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs 
*fregs)
 {
if (!fregs)
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 78b0da6fda1a..15e340a865f5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -57,6 +57,13 @@ config HAVE_DYNAMIC_FTRACE_WITH_ARGS
 This allows for use of ftrace_regs_get_argument() and
 ftrace_regs_get_stack_pointer().
 
+config HAVE_PT_REGS_TO_FTRACE_REGS_CAST
+   bool
+   help
+If this is set, the memory layout of the ftrace_regs data structure
+is the same as the pt_regs. So the pt_regs is possible to be casted
+to ftrace_regs.
+
 config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
bool
help
@@ -288,6 +295,7 @@ config FPROBE
bool "Kernel Function Probe (fprobe)"
depends on FUNCTION_TRACER
depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
+   depends on HAVE_PT_REGS_TO_FTRACE_REGS_CAST || 
!HAVE_DYNAMIC_FTRACE_WITH_ARGS
depends on HAVE_RETHOOK
select RETHOOK
default n
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 7e782a58ca6d..f72b421abe9b 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2871,10 +2871,14 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned 
long fentry_ip,
 
 static void
 

[PATCH v11 06/18] fprobe: Use ftrace_regs in fprobe entry handler

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, so that fprobe can be enabled
on arm64.

Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Florent Revest 
---
 Changes in v6:
  - Keep using SAVE_REGS flag to avoid breaking bpf kprobe-multi test.
---
 include/linux/fprobe.h  |2 +-
 kernel/trace/Kconfig|3 ++-
 kernel/trace/bpf_trace.c|   10 +++---
 kernel/trace/fprobe.c   |3 ++-
 kernel/trace/trace_fprobe.c |6 +-
 lib/test_fprobe.c   |4 ++--
 samples/fprobe/fprobe_example.c |2 +-
 7 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index f39869588117..ca64ee5e45d2 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -10,7 +10,7 @@
 struct fprobe;
 
 typedef int (*fprobe_entry_cb)(struct fprobe *fp, unsigned long entry_ip,
-  unsigned long ret_ip, struct pt_regs *regs,
+  unsigned long ret_ip, struct ftrace_regs *regs,
   void *entry_data);
 
 typedef void (*fprobe_exit_cb)(struct fprobe *fp, unsigned long entry_ip,
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 33fcfb36eca5..78b0da6fda1a 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -287,7 +287,7 @@ config DYNAMIC_FTRACE_WITH_ARGS
 config FPROBE
bool "Kernel Function Probe (fprobe)"
depends on FUNCTION_TRACER
-   depends on DYNAMIC_FTRACE_WITH_REGS
+   depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
depends on HAVE_RETHOOK
select RETHOOK
default n
@@ -672,6 +672,7 @@ config FPROBE_EVENTS
select TRACING
select PROBE_EVENTS
select DYNAMIC_EVENTS
+   depends on DYNAMIC_FTRACE_WITH_REGS
default y
help
  This allows user to add tracing events on the function entry and
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6249dac61701..7e782a58ca6d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2602,7 +2602,7 @@ struct bpf_session_run_ctx {
void *data;
 };
 
-#ifdef CONFIG_FPROBE
+#if defined(CONFIG_FPROBE) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
 struct bpf_kprobe_multi_link {
struct bpf_link link;
struct fprobe fp;
@@ -2854,12 +2854,16 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link 
*link,
 
 static int
 kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip,
- unsigned long ret_ip, struct pt_regs *regs,
+ unsigned long ret_ip, struct ftrace_regs *fregs,
  void *data)
 {
+   struct pt_regs *regs = ftrace_get_regs(fregs);
struct bpf_kprobe_multi_link *link;
int err;
 
+   if (!regs)
+   return 0;
+
link = container_of(fp, struct bpf_kprobe_multi_link, fp);
err = kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs, 
false, data);
return is_kprobe_session(link->link.prog) ? err : 0;
@@ -3134,7 +3138,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr 
*attr, struct bpf_prog *pr
kvfree(cookies);
return err;
 }
-#else /* !CONFIG_FPROBE */
+#else /* !CONFIG_FPROBE || !CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog 
*prog)
 {
return -EOPNOTSUPP;
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 9ff018245840..3d3789283873 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -46,7 +46,7 @@ static inline void __fprobe_handler(unsigned long ip, 
unsigned long parent_ip,
}
 
if (fp->entry_handler)
-   ret = fp->entry_handler(fp, ip, parent_ip, 
ftrace_get_regs(fregs), entry_data);
+   ret = fp->entry_handler(fp, ip, parent_ip, fregs, entry_data);
 
/* If entry_handler returns !0, nmissed is not counted. */
if (rh) {
@@ -182,6 +182,7 @@ static void fprobe_init(struct fprobe *fp)
fp->ops.func = fprobe_kprobe_handler;
else
fp->ops.func = fprobe_handler;
+
fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
 }
 
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 62e6a8f4aae9..b2c20d4fdfd7 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -338,12 +338,16 @@ NOKPROBE_SYMBOL(fexit_perf_func);
 #endif /* CONFIG_PERF_EVENTS */
 
 static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
-unsigned long ret_ip, struct pt_regs *regs,
+unsigned long ret_ip, struct ftrace_regs *fregs,
 void *entry_data)
 {
struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
+   struct 

[PATCH v11 05/18] function_graph: Pass ftrace_regs to retfunc

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Pass ftrace_regs to the fgraph_ops::retfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.
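
As a rough illustration of this calling convention (a userspace mock, not
kernel code; struct mock_ftrace_regs, mock_regs_get_return_value() and
demo_retfunc() are hypothetical names made up for this sketch), a return
callback would be expected to check for NULL before touching the registers:

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's opaque struct ftrace_regs. */
struct mock_ftrace_regs {
	unsigned long ip;
	unsigned long retval;
};

/* Accessor in the style of ftrace_regs_get_return_value(); mock only. */
static unsigned long
mock_regs_get_return_value(const struct mock_ftrace_regs *fregs)
{
	return fregs->retval;
}

/*
 * Return callback sketch: fregs may be NULL when registers are not
 * available, so it is checked before use.
 * Returns 1 and stores the return value in *out if registers were
 * available, 0 otherwise (*out is left untouched).
 */
static int demo_retfunc(const struct mock_ftrace_regs *fregs,
			unsigned long *out)
{
	if (!fregs)
		return 0;

	*out = mock_regs_get_return_value(fregs);
	return 1;
}
```

The point of the sketch is only the NULL check plus accessor pattern; the
real structure layout is architecture dependent and must not be accessed
directly.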

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v8:
  - Pass ftrace_regs to retfunc, instead of adding retregfunc.
 Changes in v6:
  - update to use ftrace_regs_get_return_value() because of reordering
patches.
 Changes in v3:
  - Update for new multiple fgraph.
  - Save the return address to instruction pointer in ftrace_regs.
---
 include/linux/ftrace.h   |3 ++-
 kernel/trace/fgraph.c|   16 +++-
 kernel/trace/ftrace.c|3 ++-
 kernel/trace/trace.h |3 ++-
 kernel/trace/trace_functions_graph.c |7 ---
 kernel/trace/trace_irqsoff.c |3 ++-
 kernel/trace/trace_sched_wakeup.c|3 ++-
 kernel/trace/trace_selftest.c|3 ++-
 8 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 9230af20c92e..85394b9fb630 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1069,7 +1069,8 @@ struct fgraph_ops;
 
 /* Type of the callback handlers for tracing function graph*/
 typedef void (*trace_func_graph_ret_t)(struct ftrace_graph_ret *,
-  struct fgraph_ops *); /* return */
+  struct fgraph_ops *,
+  struct ftrace_regs *); /* return */
 typedef int (*trace_func_graph_ent_t)(struct ftrace_graph_ent *,
  struct fgraph_ops *,
  struct ftrace_regs *); /* entry */
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 709f920da939..d735a8c872bb 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -297,7 +297,8 @@ static int entry_run(struct ftrace_graph_ent *trace, struct 
fgraph_ops *ops,
 }
 
 /* ftrace_graph_return set to this to tell some archs to run function graph */
-static void return_run(struct ftrace_graph_ret *trace, struct fgraph_ops *ops)
+static void return_run(struct ftrace_graph_ret *trace, struct fgraph_ops *ops,
+  struct ftrace_regs *fregs)
 {
 }
 
@@ -491,7 +492,8 @@ int ftrace_graph_entry_stub(struct ftrace_graph_ent *trace,
 }
 
 static void ftrace_graph_ret_stub(struct ftrace_graph_ret *trace,
- struct fgraph_ops *gops)
+ struct fgraph_ops *gops,
+ struct ftrace_regs *fregs)
 {
 }
 
@@ -787,6 +789,9 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, 
unsigned long frame_pointe
}
 
trace.rettime = trace_clock_local();
+   if (fregs)
+   ftrace_regs_set_instruction_pointer(fregs, ret);
+
 #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
trace.retval = ftrace_regs_get_return_value(fregs);
 #endif
@@ -796,7 +801,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, 
unsigned long frame_pointe
 #ifdef CONFIG_HAVE_STATIC_CALL
if (static_branch_likely(_do_direct)) {
if (test_bit(fgraph_direct_gops->idx, ))
-   static_call(fgraph_retfunc)(&trace, fgraph_direct_gops);
+   static_call(fgraph_retfunc)(&trace, fgraph_direct_gops, 
fregs);
} else
 #endif
{
@@ -806,7 +811,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, 
unsigned long frame_pointe
if (gops == _stub)
continue;
 
-   gops->retfunc(&trace, gops);
+   gops->retfunc(&trace, gops, fregs);
}
}
 
@@ -956,7 +961,8 @@ void ftrace_graph_sleep_time_control(bool enable)
  * Simply points to ftrace_stub, but with the proper protocol.
  * Defined by the linker script in linux/vmlinux.lds.h
  */
-void ftrace_stub_graph(struct ftrace_graph_ret *trace, struct fgraph_ops 
*gops);
+void ftrace_stub_graph(struct ftrace_graph_ret *trace, struct fgraph_ops *gops,
+  struct ftrace_regs *fregs);
 
 /* The callbacks that hook a function */
 trace_func_graph_ret_t ftrace_graph_return = ftrace_stub_graph;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 64d15428cffc..725a95b161a1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -840,7 +840,8 @@ static int profile_graph_entry(struct ftrace_graph_ent 
*trace,
 }
 
 static void profile_graph_return(struct ftrace_graph_ret *trace,
-struct fgraph_ops *gops)
+struct fgraph_ops *gops,
+struct ftrace_regs *fregs)
 {
struct ftrace_ret_stack *ret_stack;
struct ftrace_profile_stat *stat;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 2b718e448026..75c97b0515da 100644
--- 

[PATCH v11 04/18] function_graph: Replace fgraph_ret_regs with ftrace_regs

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Use ftrace_regs instead of fgraph_ret_regs for tracing the return value
in the function_graph tracer, to simplify the callback interface.

The CONFIG_HAVE_FUNCTION_GRAPH_RETVAL is also replaced by
CONFIG_HAVE_FUNCTION_GRAPH_FREGS.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v8:
  - Newly added.
---
 arch/arm64/Kconfig  |2 +-
 arch/arm64/include/asm/ftrace.h |   23 ++-
 arch/arm64/kernel/asm-offsets.c |   12 
 arch/arm64/kernel/entry-ftrace.S|   32 ++--
 arch/loongarch/Kconfig  |2 +-
 arch/loongarch/include/asm/ftrace.h |   24 ++--
 arch/loongarch/kernel/asm-offsets.c |   12 
 arch/loongarch/kernel/mcount.S  |   17 ++---
 arch/loongarch/kernel/mcount_dyn.S  |   14 +++---
 arch/riscv/Kconfig  |2 +-
 arch/riscv/include/asm/ftrace.h |   26 +-
 arch/riscv/kernel/mcount.S  |   24 +---
 arch/s390/Kconfig   |2 +-
 arch/s390/include/asm/ftrace.h  |   26 +-
 arch/s390/kernel/asm-offsets.c  |6 --
 arch/s390/kernel/mcount.S   |9 +
 arch/x86/Kconfig|2 +-
 arch/x86/include/asm/ftrace.h   |   22 ++
 arch/x86/kernel/ftrace_32.S |   15 +--
 arch/x86/kernel/ftrace_64.S |   17 +
 include/linux/ftrace.h  |   14 +++---
 kernel/trace/Kconfig|4 ++--
 kernel/trace/fgraph.c   |   21 +
 23 files changed, 122 insertions(+), 206 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5d91259ee7b5..8691683d782e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -210,7 +210,7 @@ config ARM64
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION
-   select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
+   select HAVE_FUNCTION_GRAPH_FREGS
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_GCC_PLUGINS
select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && \
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index dc9cf0bd2a4c..dffaab3dd1f1 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -126,6 +126,12 @@ ftrace_override_function_with_return(struct ftrace_regs 
*fregs)
fregs->pc = fregs->lr;
 }
 
+static __always_inline unsigned long
+ftrace_regs_get_frame_pointer(const struct ftrace_regs *fregs)
+{
+   return fregs->fp;
+}
+
 int ftrace_regs_query_register_offset(const char *name);
 
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
@@ -183,23 +189,6 @@ static inline bool arch_syscall_match_sym_name(const char 
*sym,
 
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-struct fgraph_ret_regs {
-   /* x0 - x7 */
-   unsigned long regs[8];
-
-   unsigned long fp;
-   unsigned long __unused;
-};
-
-static inline unsigned long fgraph_ret_regs_return_value(struct 
fgraph_ret_regs *ret_regs)
-{
-   return ret_regs->regs[0];
-}
-
-static inline unsigned long fgraph_ret_regs_frame_pointer(struct 
fgraph_ret_regs *ret_regs)
-{
-   return ret_regs->fp;
-}
 
 void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
   unsigned long frame_pointer);
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 81496083c041..81bb6704ff5a 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -200,18 +200,6 @@ int main(void)
   DEFINE(FTRACE_OPS_FUNC,  offsetof(struct ftrace_ops, func));
 #endif
   BLANK();
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-  DEFINE(FGRET_REGS_X0,offsetof(struct 
fgraph_ret_regs, regs[0]));
-  DEFINE(FGRET_REGS_X1,offsetof(struct 
fgraph_ret_regs, regs[1]));
-  DEFINE(FGRET_REGS_X2,offsetof(struct 
fgraph_ret_regs, regs[2]));
-  DEFINE(FGRET_REGS_X3,offsetof(struct 
fgraph_ret_regs, regs[3]));
-  DEFINE(FGRET_REGS_X4,offsetof(struct 
fgraph_ret_regs, regs[4]));
-  DEFINE(FGRET_REGS_X5,offsetof(struct 
fgraph_ret_regs, regs[5]));
-  DEFINE(FGRET_REGS_X6,offsetof(struct 
fgraph_ret_regs, regs[6]));
-  DEFINE(FGRET_REGS_X7,offsetof(struct 
fgraph_ret_regs, regs[7]));
-  DEFINE(FGRET_REGS_FP,offsetof(struct 
fgraph_ret_regs, fp));
-  DEFINE(FGRET_REGS_SIZE,  sizeof(struct fgraph_ret_regs));
-#endif
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
   DEFINE(FTRACE_OPS_DIRECT_CALL,   offsetof(struct ftrace_ops, 
direct_call));

[PATCH v11 03/18] function_graph: Pass ftrace_regs to entryfunc

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Pass ftrace_regs to the fgraph_ops::entryfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.

Signed-off-by: Masami Hiramatsu (Google) 
---
 Changes in v11:
  - Update for the latest for-next branch.
 Changes in v8:
  - Just pass ftrace_regs to the handler instead of adding a new
entryregfunc.
  - Update riscv ftrace_graph_func().
 Changes in v3:
  - Update for new multiple fgraph.
---
 arch/arm64/kernel/ftrace.c   |   20 +++-
 arch/loongarch/kernel/ftrace_dyn.c   |   10 +-
 arch/powerpc/kernel/trace/ftrace.c   |2 +
 arch/powerpc/kernel/trace/ftrace_64_pg.c |   10 --
 arch/riscv/kernel/ftrace.c   |   17 ++
 arch/x86/kernel/ftrace.c |   50 +-
 include/linux/ftrace.h   |   18 ---
 kernel/trace/fgraph.c|   23 --
 kernel/trace/ftrace.c|3 +-
 kernel/trace/trace.h |3 +-
 kernel/trace/trace_functions_graph.c |3 +-
 kernel/trace/trace_irqsoff.c |3 +-
 kernel/trace/trace_sched_wakeup.c|3 +-
 kernel/trace/trace_selftest.c|8 +++--
 14 files changed, 128 insertions(+), 45 deletions(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index a650f5e11fc5..bc647b725e6a 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -481,7 +481,25 @@ void prepare_ftrace_return(unsigned long self_addr, 
unsigned long *parent,
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
   struct ftrace_ops *op, struct ftrace_regs *fregs)
 {
-   prepare_ftrace_return(ip, &fregs->lr, fregs->fp);
+   unsigned long return_hooker = (unsigned long)&return_to_handler;
+   unsigned long frame_pointer = fregs->fp;
+   unsigned long *parent = &fregs->lr;
+   unsigned long old;
+
+   if (unlikely(atomic_read(&current->tracing_graph_pause)))
+   return;
+
+   /*
+* Note:
+* No protection against faulting at *parent, which may be seen
+* on other archs. It's unlikely on AArch64.
+*/
+   old = *parent;
+
+   if (!function_graph_enter_regs(old, ip, frame_pointer,
+  (void *)frame_pointer, fregs)) {
+   *parent = return_hooker;
+   }
 }
 #else
 /*
diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
b/arch/loongarch/kernel/ftrace_dyn.c
index bff058317062..966e0f7f7aca 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -243,8 +243,16 @@ void ftrace_graph_func(unsigned long ip, unsigned long 
parent_ip,
 {
struct pt_regs *regs = &fregs->regs;
unsigned long *parent = (unsigned long *)&regs->regs[1];
+   unsigned long return_hooker = (unsigned long)&return_to_handler;
+   unsigned long old;
+
+   if (unlikely(atomic_read(&current->tracing_graph_pause)))
+   return;
+
+   old = *parent;
 
-   prepare_ftrace_return(ip, (unsigned long *)parent);
+   if (!function_graph_enter_regs(old, ip, 0, parent, fregs))
+   *parent = return_hooker;
 }
 #else
 static int ftrace_modify_graph_caller(bool enable)
diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index d8d6b4fd9a14..a1a0e0b57662 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -434,7 +434,7 @@ void ftrace_graph_func(unsigned long ip, unsigned long 
parent_ip,
if (bit < 0)
goto out;
 
-   if (!function_graph_enter(parent_ip, ip, 0, (unsigned long *)sp))
+   if (!function_graph_enter_regs(parent_ip, ip, 0, (unsigned long *)sp, 
fregs))
parent_ip = ppc_function_entry(return_to_handler);
 
ftrace_test_recursion_unlock(bit);
diff --git a/arch/powerpc/kernel/trace/ftrace_64_pg.c 
b/arch/powerpc/kernel/trace/ftrace_64_pg.c
index 12fab1803bcf..4ae9eeb1c8f1 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_pg.c
+++ b/arch/powerpc/kernel/trace/ftrace_64_pg.c
@@ -800,7 +800,8 @@ int ftrace_disable_ftrace_graph_caller(void)
  * in current thread info. Return the address we want to divert to.
  */
 static unsigned long
-__prepare_ftrace_return(unsigned long parent, unsigned long ip, unsigned long 
sp)
+__prepare_ftrace_return(unsigned long parent, unsigned long ip, unsigned long 
sp,
+   struct ftrace_regs *fregs)
 {
unsigned long return_hooker;
int bit;
@@ -817,7 +818,7 @@ __prepare_ftrace_return(unsigned long parent, unsigned long 
ip, unsigned long sp
 
return_hooker = ppc_function_entry(return_to_handler);
 
-   if (!function_graph_enter(parent, ip, 0, (unsigned long *)sp))
+   if (!function_graph_enter_regs(parent, ip, 0, (unsigned long *)sp, 
fregs))
parent = return_hooker;

[PATCH v11 02/18] tracing: Rename ftrace_regs_return_value to ftrace_regs_get_return_value

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

Rename ftrace_regs_return_value to ftrace_regs_get_return_value to match
the other ftrace_regs_get/set_* APIs.

Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Mark Rutland 
---
 Changes in v6:
  - Moved to top of the series.
 Changes in v3:
  - Newly added.
---
 arch/loongarch/include/asm/ftrace.h |2 +-
 arch/powerpc/include/asm/ftrace.h   |2 +-
 arch/s390/include/asm/ftrace.h  |2 +-
 arch/x86/include/asm/ftrace.h   |2 +-
 include/linux/ftrace.h  |2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/loongarch/include/asm/ftrace.h 
b/arch/loongarch/include/asm/ftrace.h
index c0a682808e07..6f8517d59954 100644
--- a/arch/loongarch/include/asm/ftrace.h
+++ b/arch/loongarch/include/asm/ftrace.h
@@ -69,7 +69,7 @@ ftrace_regs_set_instruction_pointer(struct ftrace_regs 
*fregs, unsigned long ip)
regs_get_kernel_argument(&(fregs)->regs, n)
 #define ftrace_regs_get_stack_pointer(fregs) \
kernel_stack_pointer(&(fregs)->regs)
-#define ftrace_regs_return_value(fregs) \
+#define ftrace_regs_get_return_value(fregs) \
regs_return_value(&(fregs)->regs)
 #define ftrace_regs_set_return_value(fregs, ret) \
regs_set_return_value(&(fregs)->regs, ret)
diff --git a/arch/powerpc/include/asm/ftrace.h 
b/arch/powerpc/include/asm/ftrace.h
index 559560286e6d..23d26f3afae4 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -59,7 +59,7 @@ ftrace_regs_get_instruction_pointer(struct ftrace_regs *fregs)
regs_get_kernel_argument(&(fregs)->regs, n)
 #define ftrace_regs_get_stack_pointer(fregs) \
kernel_stack_pointer(&(fregs)->regs)
-#define ftrace_regs_return_value(fregs) \
+#define ftrace_regs_get_return_value(fregs) \
regs_return_value(&(fregs)->regs)
 #define ftrace_regs_set_return_value(fregs, ret) \
regs_set_return_value(&(fregs)->regs, ret)
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index fbadca645af7..de76c21eb4a3 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -83,7 +83,7 @@ ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
regs_get_kernel_argument(&(fregs)->regs, n)
 #define ftrace_regs_get_stack_pointer(fregs) \
kernel_stack_pointer(&(fregs)->regs)
-#define ftrace_regs_return_value(fregs) \
+#define ftrace_regs_get_return_value(fregs) \
regs_return_value(&(fregs)->regs)
 #define ftrace_regs_set_return_value(fregs, ret) \
regs_set_return_value(&(fregs)->regs, ret)
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 0152a81d9b4a..78f6a200e15b 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -56,7 +56,7 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
regs_get_kernel_argument(&(fregs)->regs, n)
 #define ftrace_regs_get_stack_pointer(fregs) \
kernel_stack_pointer(&(fregs)->regs)
-#define ftrace_regs_return_value(fregs) \
+#define ftrace_regs_get_return_value(fregs) \
regs_return_value(&(fregs)->regs)
 #define ftrace_regs_set_return_value(fregs, ret) \
regs_set_return_value(&(fregs)->regs, ret)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 3c8a19ea8f45..bf04b29f9da1 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -183,7 +183,7 @@ static __always_inline bool ftrace_regs_has_args(struct 
ftrace_regs *fregs)
regs_get_kernel_argument(ftrace_get_regs(fregs), n)
 #define ftrace_regs_get_stack_pointer(fregs) \
kernel_stack_pointer(ftrace_get_regs(fregs))
-#define ftrace_regs_return_value(fregs) \
+#define ftrace_regs_get_return_value(fregs) \
regs_return_value(ftrace_get_regs(fregs))
 #define ftrace_regs_set_return_value(fregs, ret) \
regs_set_return_value(ftrace_get_regs(fregs), ret)




[PATCH v11 01/18] tracing: Add a comment about ftrace_regs definition

2024-06-16 Thread Masami Hiramatsu (Google)
From: Masami Hiramatsu (Google) 

To clarify what will be expected on ftrace_regs, add a comment to the
architecture independent definition of the ftrace_regs.

Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Mark Rutland 
---
 Changes in v8:
  - Update that the saved registers depends on the context.
 Changes in v3:
  - Add instruction pointer
 Changes in v2:
  - newly added.
---
 include/linux/ftrace.h |   26 ++
 1 file changed, 26 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 845c2ab0bc1c..3c8a19ea8f45 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -117,6 +117,32 @@ extern int ftrace_enabled;
 
 #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
 
+/**
+ * ftrace_regs - ftrace partial/optimal register set
+ *
+ * ftrace_regs represents a group of registers which is used at the
+ * function entry and exit. There are three types of registers.
+ *
+ * - Registers for passing the parameters to callee, including the stack
+ *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
+ * - Registers for passing the return values to caller.
+ *   (e.g. rax and rdx on x86_64)
+ * - Registers for hooking the function call and return including the
+ *   frame pointer (the frame pointer is architecture/config dependent)
+ *   (e.g. rip, rbp and rsp for x86_64)
+ *
+ * Also, architecture dependent fields can be used for internal process.
+ * (e.g. orig_ax on x86_64)
+ *
+ * On the function entry, those registers will be restored except for
+ * the stack pointer, so that user can change the function parameters
+ * and instruction pointer (e.g. live patching.)
+ * On the function exit, only registers which is used for return values
+ * are restored.
+ *
+ * NOTE: user *must not* access regs directly, only do it via APIs, because
+ * the member can be changed according to the architecture.
+ */
 struct ftrace_regs {
struct pt_regs  regs;
 };




[PATCH v11 00/18] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-06-16 Thread Masami Hiramatsu (Google)
Hi,

Here is the 11th version of the series to re-implement the fprobe on
function-graph tracer. The previous version is;

https://lore.kernel.org/all/171509088006.162236.7227326999861366050.stgit@devnote2/

Most of the patches in the previous version (for multiple function graph
trace instances) are already merged via tracing/for-next. This version
is the remaining part, the fprobe implementation on fgraph. It basically
just moves onto the updated fgraph implementation, with no major changes.

Overview
--------
This series rewrites the fprobe on top of the function-graph tracer.
The purposes of this change are:

 1) Remove dependency of the rethook from fprobe so that we can reduce
   the return hook code and shadow stack.

 2) Make 'ftrace_regs' the common trace interface for the function
   boundary.

1) Currently we have 2 (or 3) different function return hook codes,
 the function-graph tracer and rethook (and legacy kretprobe).
 But since this is redundant and doubles the maintenance cost,
 I would like to unify those. From the user's viewpoint, function-
 graph tracer is very useful to grasp the execution path. For this
 purpose, it is hard to use the rethook in the function-graph
 tracer, but the opposite is possible. (Strictly speaking, kretprobe
 can not use it because it requires 'pt_regs' for historical reasons.)

2) Now the fprobe provides the 'pt_regs' for its handler, but that is
 wrong for the function entry and exit. Moreover, depending on the
 architecture, there is no way to accurately reproduce 'pt_regs'
 outside of interrupt or exception handlers. This means fprobe should
 not use 'pt_regs' because it does not use such exceptions.
 (Conversely, kprobe should use 'pt_regs' because it is an abstract
  interface of the software breakpoint exception.)

This series changes fprobe to use function-graph tracer for tracing
function entry and exit, instead of mixture of ftrace and rethook.
Unlike the rethook which is a per-task list of system-wide allocated
nodes, the function graph's ret_stack is a per-task shadow stack.
Thus it does not need to set 'nr_maxactive' (which is the number of
pre-allocated nodes).
Also the handlers will get the 'ftrace_regs' instead of 'pt_regs'.
Since eBPF multi_kprobe/multi_kretprobe events still use 'pt_regs' as
their register interface, this series converts 'ftrace_regs' to
'pt_regs' for them. Of course this conversion makes an incomplete
'pt_regs', so users must access only the registers for function
parameters or the return value.
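
To make "incomplete" concrete, here is a small userspace sketch of such a
conversion. Everything in it is an assumption for illustration: the two
mock structures and mock_partial_regs() are hypothetical and do not
reflect any real architecture layout. Only the fields that exist at a
function boundary are copied, and the rest stay zeroed:

```c
#include <string.h>

/* Hypothetical reduced register set saved at function entry/exit. */
struct mock_ftrace_regs {
	unsigned long args[6];
	unsigned long retval;
	unsigned long ip;
	unsigned long sp;
};

/* Hypothetical full register set, as an exception handler would save. */
struct mock_pt_regs {
	unsigned long args[6];
	unsigned long retval;
	unsigned long ip;
	unsigned long sp;
	unsigned long flags;      /* not available from mock_ftrace_regs */
	unsigned long scratch[4]; /* not available from mock_ftrace_regs */
};

/*
 * Fill *regs from *fregs. Fields with no counterpart in
 * mock_ftrace_regs are zeroed, so callers must only rely on the
 * arguments, return value, instruction pointer and stack pointer.
 */
static struct mock_pt_regs *
mock_partial_regs(const struct mock_ftrace_regs *fregs,
		  struct mock_pt_regs *regs)
{
	memset(regs, 0, sizeof(*regs));
	memcpy(regs->args, fregs->args, sizeof(regs->args));
	regs->retval = fregs->retval;
	regs->ip = fregs->ip;
	regs->sp = fregs->sp;
	return regs;
}
```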

Design
------
Instead of using ftrace's function entry hook directly, the new fprobe
is built on top of the function-graph's entry and return callbacks
with 'ftrace_regs'.

Since the fprobe requires access to 'ftrace_regs', the architecture
must support CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS and
CONFIG_HAVE_FTRACE_GRAPH_FUNC, which enable calling the function-graph
entry callback with 'ftrace_regs', and also
CONFIG_HAVE_FUNCTION_GRAPH_FREGS, which passes the ftrace_regs to
return_to_handler.

All fprobes share a single function-graph ops (which means they share a
common ftrace filter), similar to the kprobe-on-ftrace. This needs another
layer to find the corresponding fprobe in the common function-graph
callbacks, but has much better scalability, since the number of
registered function-graph ops is limited.

In the entry callback, the fprobe runs its entry_handler and saves the
address of 'fprobe' on the function-graph's shadow stack as data. The
return callback decodes the data to get the 'fprobe' address, and runs
the exit_handler.
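
The entry/return pairing above can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions: struct
mock_fprobe, struct mock_shadow_stack, mock_entry() and mock_return() are
hypothetical names, the stack depth is arbitrary, and the real shadow
stack stores encoded data words rather than raw pointers:

```c
#include <stddef.h>

#define SHADOW_DEPTH 16

/* Hypothetical probe object; counters stand in for the handlers. */
struct mock_fprobe {
	int entry_hits;
	int exit_hits;
};

/* Hypothetical per-task shadow stack: one slot per traced call in flight. */
struct mock_shadow_stack {
	struct mock_fprobe *slot[SHADOW_DEPTH];
	size_t top;
};

/*
 * Entry callback: run the entry handler and save the fprobe address.
 * Returns 0 on success, -1 if the stack is full (the call is then
 * simply not traced and no exit handler will run for it).
 */
static int mock_entry(struct mock_shadow_stack *st, struct mock_fprobe *fp)
{
	if (st->top == SHADOW_DEPTH)
		return -1;

	fp->entry_hits++;
	st->slot[st->top++] = fp;
	return 0;
}

/*
 * Return callback: recover the saved fprobe address and run the exit
 * handler. Returns the fprobe, or NULL if nothing was saved.
 */
static struct mock_fprobe *mock_return(struct mock_shadow_stack *st)
{
	struct mock_fprobe *fp;

	if (st->top == 0)
		return NULL;

	fp = st->slot[--st->top];
	fp->exit_hits++;
	return fp;
}
```

Because returns happen in the reverse order of entries, a plain stack is
enough here, and there is no pool of nodes to pre-allocate.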

The fprobe introduces two hash-tables, one is for entry callback which
searches fprobes related to the given function address passed by entry
callback. The other is for a return callback which checks if the given
'fprobe' data structure pointer is still valid. Note that it is
possible to unregister fprobe before the return callback runs. Thus
the address validation must be done before using it in the return
callback.
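
A minimal userspace sketch of the second table (the validity check) might
look like the following. All names here (mock_fprobe, valid_table,
mock_register() and so on) are hypothetical, and the sketch deliberately
ignores locking and RCU, which the real implementation would need:

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64

/* Hypothetical probe object with an embedded hash chain link. */
struct mock_fprobe {
	struct mock_fprobe *next;
	int exit_hits;
};

/* Table of currently registered probes, keyed by their address. */
static struct mock_fprobe *valid_table[NBUCKETS];

static size_t bucket_of(const struct mock_fprobe *fp)
{
	return ((uintptr_t)fp >> 4) % NBUCKETS;
}

static void mock_register(struct mock_fprobe *fp)
{
	size_t b = bucket_of(fp);

	fp->next = valid_table[b];
	valid_table[b] = fp;
}

static void mock_unregister(struct mock_fprobe *fp)
{
	struct mock_fprobe **pp = &valid_table[bucket_of(fp)];

	for (; *pp; pp = &(*pp)->next) {
		if (*pp == fp) {
			*pp = fp->next;
			return;
		}
	}
}

/* Compares pointer values only; the candidate is never dereferenced. */
static int mock_is_valid(const struct mock_fprobe *candidate)
{
	const struct mock_fprobe *it = valid_table[bucket_of(candidate)];

	for (; it; it = it->next)
		if (it == candidate)
			return 1;
	return 0;
}

/*
 * Return callback: 'saved' is the address stored at function entry.
 * It is only used after the table confirms it is still registered.
 * Returns 1 if the exit handler ran, 0 if the probe was gone.
 */
static int mock_return_callback(struct mock_fprobe *saved)
{
	if (!mock_is_valid(saved))
		return 0;

	saved->exit_hits++;
	return 1;
}
```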

Download
--------
This series can be applied against the ftrace/for-next branch in
linux-trace tree.

This series can also be found in the branch below.

https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=topic/fprobe-on-fgraph

Thank you,

---

Masami Hiramatsu (Google) (18):
  tracing: Add a comment about ftrace_regs definition
  tracing: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
  function_graph: Pass ftrace_regs to entryfunc
  function_graph: Replace fgraph_ret_regs with ftrace_regs
  function_graph: Pass ftrace_regs to retfunc
  fprobe: Use ftrace_regs in fprobe entry handler
  fprobe: Use ftrace_regs in fprobe exit handler
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  fprobe: Rewrite 

Re: [PATCH v7 1/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

2024-06-16 Thread Wei Yang
On Thu, Jun 13, 2024 at 07:34:16PM -0400, Steven Rostedt wrote:
>From: "Steven Rostedt (Google)" 
>
>In order to allow for requesting a memory region that can be used for
>things like pstore on multiple machines where the memory layout is not the
>same, add a new option to the kernel command line called "reserve_mem".
>
>The format is:  reserve_mem=nn:align:name
>
>Where it will find nn amount of memory at the given alignment of align.
>The name field is to allow another subsystem to retrieve where the memory
>was found. For example:
>
>  reserve_mem=12M:4096:oops ramoops.mem_name=oops
>
>Where ramoops.mem_name will tell ramoops that memory was reserved for it
>via the reserve_mem option and it can find it by calling:
>
>  if (reserve_mem_find_by_name("oops", &start, &size)) {
>   // start holds the start address and size holds the size given
>
>This is typically used for systems that do not wipe the RAM, and this
>command line will try to reserve the same physical memory on soft reboots.
>Note, it is not guaranteed to be the same location. For example, if KASLR
>places the kernel at the location of where the RAM reservation was from a
>previous boot, the new reservation will be at a different location.  Any
>subsystem using this feature must add a way to verify that the contents of
>the physical memory is from a previous boot, as there may be cases where
>the memory will not be located at the same location.
>
>Not all systems may work either. There could be bit flips if the reboot
>goes through the BIOS. Using kexec to reboot the machine is likely to
>have better results in such cases.
>
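
For readers who want to experiment with the nn:align:name format quoted
above, here is a userspace parsing sketch. It is not the code from the
patch: struct reserve_req, parse_size() and parse_reserve_mem() are
hypothetical names, the K/M/G handling only imitates what the kernel's
size parsing does, and the power-of-two check on the alignment is an
assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_SIZE 16

struct reserve_req {
	unsigned long long size;
	unsigned long long align;
	char name[NAME_SIZE];
};

/* Parse a number with an optional K/M/G suffix; *end points past it. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Parse "nn[KMG]:align:name". Returns 0 on success, -EINVAL otherwise. */
static int parse_reserve_mem(const char *arg, struct reserve_req *req)
{
	char *p;
	size_t len;

	req->size = parse_size(arg, &p);
	if (p == arg || *p != ':' || req->size == 0)
		return -EINVAL;

	arg = p + 1;
	req->align = parse_size(arg, &p);
	if (p == arg || *p != ':')
		return -EINVAL;
	/* Assumption of this sketch: alignment must be a power of two. */
	if (req->align == 0 || (req->align & (req->align - 1)))
		return -EINVAL;

	arg = p + 1;
	len = strlen(arg);
	if (len == 0 || len >= NAME_SIZE)
		return -EINVAL;
	memcpy(req->name, arg, len + 1);
	return 0;
}
```

With this sketch, the string "12M:4096:oops" from the example yields a
12 MiB request with 4096-byte alignment and the name "oops".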
>Link: https://lore.kernel.org/all/zjjvnzux3nzig...@kernel.org/
>
>Suggested-by: Mike Rapoport 
>Tested-by: Guilherme G. Piccoli 
>Signed-off-by: Steven Rostedt (Google) 
>---
> .../admin-guide/kernel-parameters.txt |  22 
> include/linux/mm.h|   2 +
> mm/memblock.c | 117 ++
> 3 files changed, 141 insertions(+)
>
>diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>b/Documentation/admin-guide/kernel-parameters.txt
>index b600df82669d..56e18b1a520d 100644
>--- a/Documentation/admin-guide/kernel-parameters.txt
>+++ b/Documentation/admin-guide/kernel-parameters.txt
>@@ -5710,6 +5710,28 @@
>   them.  If  is less than 0x1, the region
>   is assumed to be I/O ports; otherwise it is memory.
> 
>+  reserve_mem=[RAM]
>+  Format: nn[KMG]:<align>:<label>
>+  Reserve physical memory and label it with a name that
>+  other subsystems can use to access it. This is typically
>+  used for systems that do not wipe the RAM, and this 
>command
>+  line will try to reserve the same physical memory on
>+  soft reboots. Note, it is not guaranteed to be the same
>+  location. For example, if anything about the system 
>changes
>+  or if booting a different kernel. It can also fail if 
>KASLR
>+  places the kernel at the location of where the RAM 
>reservation
>+  was from a previous boot, the new reservation will be 
>at a
>+  different location.
>+  Any subsystem using this feature must add a way to 
>verify
>+  that the contents of the physical memory is from a 
>previous
>+  boot, as there may be cases where the memory will not be
>+  located at the same location.
>+
>+  The format is size:align:label for example, to request
>+  12 megabytes of 4096 alignment for ramoops:
>+
>+  reserve_mem=12M:4096:oops ramoops.mem_name=oops
>+
>   reservetop= [X86-32,EARLY]
>   Format: nn[KMG]
>   Reserves a hole at the top of the kernel virtual
>diff --git a/include/linux/mm.h b/include/linux/mm.h
>index 9849dfda44d4..077fb589b88a 100644
>--- a/include/linux/mm.h
>+++ b/include/linux/mm.h
>@@ -4263,4 +4263,6 @@ static inline bool pfn_is_unaccepted_memory(unsigned 
>long pfn)
> void vma_pgtable_walk_begin(struct vm_area_struct *vma);
> void vma_pgtable_walk_end(struct vm_area_struct *vma);
> 
>+int reserve_mem_find_by_name(const char *name, phys_addr_t *start, 
>phys_addr_t *size);
>+
> #endif /* _LINUX_MM_H */
>diff --git a/mm/memblock.c b/mm/memblock.c
>index d09136e040d3..b7b0e8c3868d 100644
>--- a/mm/memblock.c
>+++ b/mm/memblock.c
>@@ -2244,6 +2244,123 @@ void __init memblock_free_all(void)
>   totalram_pages_add(pages);
> }
> 
>+/* Keep a table to reserve named memory */
>+#define RESERVE_MEM_MAX_ENTRIES   8
>+#define RESERVE_MEM_NAME_SIZE 16
>+struct reserve_mem_table {
>+  charname[RESERVE_MEM_NAME_SIZE];
>+  phys_addr_t start;
>+  phys_addr_t   

Re: [PATCH] bpf/selftests: Fix __NR_uretprobe in uprobe_syscall test

2024-06-16 Thread Jiri Olsa
On Sun, Jun 16, 2024 at 01:19:11AM +0900, Masami Hiramatsu wrote:
> On Sun, 16 Jun 2024 00:19:20 +0900
> Masami Hiramatsu (Google)  wrote:
> 
> > On Fri, 14 Jun 2024 12:15:09 +0200
> > Jiri Olsa  wrote:
> > 
> > > Fixing the __NR_uretprobe number in uprobe_syscall test,
> > > because it changed due to merge conflict.
> > > 
> > 
> > Ah, it is not enough, since Stephen's change is just a temporary fix on
> > next tree. OK, Let me update it.
> 
> Hm, I thought I need to change all NR_uretprobe, but it makes NR_syscalls
> list sparse. This may need to be solved on linus tree in merge window,
> or I should merge (or rebase on) vfs-brauner tree before sending
> probes/for-next.
> 
> Steve, do you have any idea? we talked about conflict on next tree[0].
> 
> [0] https://lore.kernel.org/all/20240613114243.2a500...@canb.auug.org.au/

hi,
I have one more fix to send [1] for this, please let me know which tree
I should based that on

thanks,
jirka


[1] https://lore.kernel.org/bpf/ZmyZgzqsowkGyqmH@krava/

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> > > Signed-off-by: Jiri Olsa 
> > > ---
> > >  tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c 
> > > b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> > > index c8517c8f5313..bd8c75b620c2 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> > > @@ -216,7 +216,7 @@ static void test_uretprobe_regs_change(void)
> > >  }
> > >  
> > >  #ifndef __NR_uretprobe
> > > -#define __NR_uretprobe 463
> > > +#define __NR_uretprobe 467
> > >  #endif
> > >  
> > >  __naked unsigned long uretprobe_syscall_call_1(void)
> > > -- 
> > > 2.45.1
> > > 
> > 
> > 
> > -- 
> > Masami Hiramatsu (Google) 
> 
> 
> -- 
> Masami Hiramatsu (Google) 



Re: [PATCH v4 2/3] leds: sy7802: Add support for Silergy SY7802 flash LED controller

2024-06-16 Thread Markus Elfring
> The SY7802 is a current-regulated charge pump which can regulate two
> current levels for Flash and Torch modes.
>
> It is a high-current synchronous boost converter with 2-channel high
> side current sources. Each channel is able to deliver 900mA current.

Would you like to improve such a change description with imperative wording?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.10-rc3#n94


…
> +++ b/drivers/leds/flash/leds-sy7802.c
> @@ -0,0 +1,542 @@
…
> +static int sy7802_strobe_get(struct led_classdev_flash *fl_cdev, bool *state)
> +{
…
> + mutex_lock(&chip->mutex);
> + *state = !!(chip->fled_strobe_used & BIT(led->led_id));
> + mutex_unlock(&chip->mutex);
> +
> + return 0;
> +}
…

Would you be interested in applying a statement like
“guard(mutex)(&chip->mutex);”?
https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/mutex.h#L196
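
For anyone unfamiliar with the idea, scope-based locking can be imitated
in userspace with the GCC/Clang cleanup attribute. The following is only
an analogy under stated assumptions, not the kernel macro: DEMO_GUARD(),
demo_unlock(), struct demo_chip and demo_strobe_get() are hypothetical
names, and a pthread mutex stands in for the kernel mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device state protected by a mutex. */
struct demo_chip {
	pthread_mutex_t mutex;
	unsigned int strobe_used;
};

/* Cleanup handler: receives a pointer to the guarded variable. */
static void demo_unlock(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/*
 * Lock the mutex now and unlock it automatically when the enclosing
 * scope is left, whichever return path is taken.
 */
#define DEMO_GUARD(m) \
	pthread_mutex_t *demo_guard_ \
		__attribute__((cleanup(demo_unlock), unused)) = \
		(pthread_mutex_lock(m), (m))

static int demo_strobe_get(struct demo_chip *chip, unsigned int led_id,
			   bool *state)
{
	DEMO_GUARD(&chip->mutex);

	*state = !!(chip->strobe_used & (1u << led_id));
	return 0; /* the mutex is released here by the cleanup handler */
}
```

The benefit is that no unlock call can be forgotten on an early return,
which is the usual motivation for this style.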

Regards,
Markus



Re: [syzbot] [mm?] possible deadlock in __mmap_lock_do_trace_start_locking

2024-06-16 Thread Waiman Long

On 6/16/24 10:05, syzbot wrote:

syzbot has bisected this issue to:

commit 21c38a3bd4ee3fb7337d013a638302fb5e5f9dc2
Author: Jesper Dangaard Brouer 
Date:   Wed May 1 14:04:11 2024 +

 cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1669526198
start commit:   36534d3c5453 tcp: use signed arithmetic in tcp_rtx_probe0_..
git tree:   bpf
final oops: https://syzkaller.appspot.com/x/report.txt?x=1569526198
console output: https://syzkaller.appspot.com/x/log.txt?x=1169526198
kernel config:  https://syzkaller.appspot.com/x/.config?x=333ebe38d43c42e2
dashboard link: https://syzkaller.appspot.com/bug?extid=6ff90931779bcdfc840c
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1585acfa98
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17bdb7ee98

Reported-by: syzbot+6ff90931779bcdfc8...@syzkaller.appspotmail.com
Fixes: 21c38a3bd4ee ("cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


+static __always_inline
+unsigned long _cgroup_rstat_cpu_lock(raw_spinlock_t *cpu_lock, int cpu,
+				     struct cgroup *cgrp, const bool fast_path)
+{
+	unsigned long flags;
+	bool contended;
+
+	/*
+	 * The _irqsave() is needed because cgroup_rstat_lock is
+	 * spinlock_t which is a sleeping lock on PREEMPT_RT. Acquiring
+	 * this lock with the _irq() suffix only disables interrupts on
+	 * a non-PREEMPT_RT kernel. The raw_spinlock_t below disables
+	 * interrupts on both configurations. The _irqsave() ensures
+	 * that interrupts are always disabled and later restored.
+	 */
+	contended = !raw_spin_trylock_irqsave(cpu_lock, flags);
+	if (contended) {
+		if (fast_path)
+			trace_cgroup_rstat_cpu_lock_contended_fastpath(cgrp, cp>
+		else
+			trace_cgroup_rstat_cpu_lock_contended(cgrp, cpu, conten>
+
+		raw_spin_lock_irqsave(cpu_lock, flags);
+	}

I believe the problem may be caused by the fact that
trace_cgroup_rstat_cpu_lock_contended*() can be called with IRQs enabled.
I suggested earlier that IRQs be disabled before doing any trace
operation. See


https://lore.kernel.org/linux-mm/203fdb35-f4cf-4754-9709-3c024eeca...@redhat.com/

Doing so may resolve this possible deadlock.
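
The ordering suggested here can be sketched as a single-threaded
userspace model. This is an illustration only, not kernel code: a
boolean stands in for the CPU interrupt state, a counter simulates lock
contention, and every model_* name is hypothetical. The point it shows
is that once "IRQs" are disabled before the trylock, the trace call on
the contended path never runs with them enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Model state: a flag for the interrupt state, counters for trace calls. */
static bool model_irqs_disabled;
static int model_trace_calls;
static int model_trace_calls_irqs_on;

struct model_lock {
	bool held;
	int fail_trylocks;	/* upcoming trylock attempts that will fail */
};

/* Save the previous "IRQ" state and disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_disabled ? 1UL : 0UL;

	model_irqs_disabled = true;
	return flags;
}

static void model_irq_restore(unsigned long flags)
{
	model_irqs_disabled = (flags != 0);
}

static bool model_trylock(struct model_lock *lock)
{
	if (lock->fail_trylocks > 0) {
		lock->fail_trylocks--;
		return false;
	}
	lock->held = true;
	return true;
}

/* Stand-in for a tracepoint; records calls made with "IRQs" enabled. */
static void model_trace_contended(void)
{
	model_trace_calls++;
	if (!model_irqs_disabled)
		model_trace_calls_irqs_on++;
}

/* Disable "IRQs" first, then trylock, then trace only if contended. */
static unsigned long model_cpu_lock(struct model_lock *lock)
{
	unsigned long flags = model_irq_save();

	if (!model_trylock(lock)) {
		model_trace_contended();
		lock->held = true;	/* model: the other holder released */
	}
	return flags;
}

static void model_cpu_unlock(struct model_lock *lock, unsigned long flags)
{
	assert(lock->held);
	lock->held = false;
	model_irq_restore(flags);
}
```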

Cheers,
Longman




[PATCH] ARM: dts: qcom: msm8926-motorola-peregrine: Add accelerometer, magnetometer, regulator

2024-06-16 Thread André Apitzsch via B4 Relay
function = "gpio";
+		drive-strength = <2>;
+		bias-disable;
+		output-disable;
+	};
+
+	mag_reset_default: mag-reset-default-state {
+		pins = "gpio62";
+		function = "gpio";
+		drive-strength = <2>;
+		bias-disable;
+		output-high;
+	};
+
+	reg_lcd_default: reg-lcd-default-state {
+		pins = "gpio31", "gpio33";
+		function = "gpio";
+		drive-strength = <2>;
+		bias-disable;
+		output-high;
+	};
+
+	temp_alert_default: temp-alert-default-state {
+		pins = "gpio13";
+		function = "gpio";
+		drive-strength = <2>;
+		bias-disable;
+		output-disable;
+	};
+};
+
  {
extcon = <>;
dr_mode = "peripheral";

---
base-commit: c71189547381bb5f176c6b22a9edc3414f1837b9
change-id: 20240616-peregrine-6ec5e26b15ec

Best regards,
-- 
André Apitzsch 





Re: [syzbot] [mm?] possible deadlock in __mmap_lock_do_trace_start_locking

2024-06-16 Thread syzbot
syzbot has bisected this issue to:

commit 21c38a3bd4ee3fb7337d013a638302fb5e5f9dc2
Author: Jesper Dangaard Brouer 
Date:   Wed May 1 14:04:11 2024 +

cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1669526198
start commit:   36534d3c5453 tcp: use signed arithmetic in tcp_rtx_probe0_..
git tree:   bpf
final oops: https://syzkaller.appspot.com/x/report.txt?x=1569526198
console output: https://syzkaller.appspot.com/x/log.txt?x=1169526198
kernel config:  https://syzkaller.appspot.com/x/.config?x=333ebe38d43c42e2
dashboard link: https://syzkaller.appspot.com/bug?extid=6ff90931779bcdfc840c
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1585acfa98
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17bdb7ee98

Reported-by: syzbot+6ff90931779bcdfc8...@syzkaller.appspotmail.com
Fixes: 21c38a3bd4ee ("cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection