On 09/17, Jiri Olsa wrote:
>
> static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
...
> + if (!ignore && !ZERO_OR_NULL_PTR(ri)) {
> + /*
> + * The push_idx value has the final number of return consumers,
> + * and ri->consumers_cnt has
I don't see anything wrong after a quick glance, but I don't
really understand the UPROBE_HANDLER_IGNORE logic, see below.
On 09/17, Jiri Olsa wrote:
>
> + * UPROBE_HANDLER_IWANTMYCOOKIE
> + * - Store cookie and pass it to ret_handler (if defined).
Cough ;) yes it was me who used this name in the
After the commit 29c1c24a2707 ("function_graph: Fix up ftrace_graph_ret_addr()")
ftrace_graph_ret_addr() doesn't need to initialize "int i" at the start.
Signed-off-by: Oleg Nesterov
---
kernel/trace/fgraph.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
After the commit 29c1c24a2707 ("function_graph: Fix up ftrace_graph_ret_addr()")
ftrace_graph_ret_addr() doesn't need to initialize "int i" at the start.
While at it, move the declaration of "ret_stack" into the main loop.
Signed-off-by: Oleg Nesterov
---
Hi Liao,
On 09/14, Liao, Chang wrote:
>
> Hi, Oleg
>
> Kindly ping.
>
> This series have been pending for a month. Is thre any issue I overlook?
Well, I have already acked both patches.
Please resend them to Peter/Masami, with my acks included.
Oleg.
On 09/05, Andrii Nakryiko wrote:
>
> +static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
> +{
> + const vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE;
...
> + if (!vm_file || (vma->vm_flags & flags) != VM_MAYEXEC)
> + goto bail;
Not that thi
On 09/09, Andrii Nakryiko wrote:
>
> Similarly to how we SRCU-protect uprobe instance (and avoid refcounting
> it unnecessarily) when waiting for return probe hit, use hprobe approach
> to do the same with single-stepped uprobe. Same hprobe_* primitives are
> used. We also reuse ri_timer() callback
On 09/09, Andrii Nakryiko wrote:
>
> Currently put_uprobe() might trigger mutex_lock()/mutex_unlock(), which
> makes it unsuitable to be called from more restricted context like softirq.
>
> Let's make put_uprobe() agnostic to the context in which it is called,
> and use work queue to defer the mut
On 09/09, Jiri Olsa wrote:
>
> @@ -37,13 +37,16 @@ struct uprobe_consumer {
>* for the current process. If filter() is omitted or returns true,
>* UPROBE_HANDLER_REMOVE is effectively ignored.
>*/
> - int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
>
On 09/13, Jiri Olsa wrote:
>
> On Fri, Sep 13, 2024 at 12:57:51PM +0200, Oleg Nesterov wrote:
>
> > static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> > {
> > ...
> > struct return_instance *ri = NULL;
> > int push_idx = 0;
On 09/13, Jiri Olsa wrote:
>
> I'm not sure the realloc will help, I feel like we need to allocate return
> consumer for each called handler separately to be safe
How about something like the (pseudo) code below? Note that this way
we do not need uprobe->consumers_cnt. Note also that krealloc() sh
On 09/13, Jiri Olsa wrote:
>
> On Thu, Sep 12, 2024 at 06:20:29PM +0200, Oleg Nesterov wrote:
> > > +
> > > + if (rc == 0 && uc->ret_handler) {
> >
> > should we enter this block if uc->handler == NULL?
>
> yes, consumer can have ju
On 09/13, Jiri Olsa wrote:
>
> On Thu, Sep 12, 2024 at 06:35:39PM +0200, Oleg Nesterov wrote:
> > > list_for_each_entry_srcu(uc, &uprobe->consumers, cons_node,
> > >srcu_read_lock_held(&uprobes_srcu)) {
> > > + /
On 09/09, Jiri Olsa wrote:
>
> handle_uretprobe_chain(struct return_instance *ri, struct pt_regs *regs)
> {
> + struct return_consumer *ric = NULL;
> struct uprobe *uprobe = ri->uprobe;
> struct uprobe_consumer *uc;
> - int srcu_idx;
> + int srcu_idx, iter = 0;
>
> s
On 09/09, Jiri Olsa wrote:
>
> static void handler_chain(struct uprobe *uprobe, struct pt_regs *regs)
> {
> struct uprobe_consumer *uc;
> int remove = UPROBE_HANDLER_REMOVE;
> - bool need_prep = false; /* prepare return uprobe, when needed */
> + struct return_consumer *ric =
Now that xol_mapping has its own ->fault() method we no longer need
xol_area->pages[1] == NULL, we need a single page.
Signed-off-by: Oleg Nesterov
---
kernel/events/uprobes.c | 17 -
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/kernel/events/uprobes.c b/
by: Sven Schnelle
Closes: https://lore.kernel.org/all/yt9dy149vprr@linux.ibm.com/
Fixes: 223febc6e557 ("mm: add optional close() to struct vm_special_mapping")
Signed-off-by: Oleg Nesterov
---
kernel/events/uprobes.c | 21 -
1 file changed, 16 insertions(+), 5 deletions(
afterwards.
Signed-off-by: Oleg Nesterov
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 36 +++-
kernel/fork.c | 1 +
3 files changed, 21 insertions(+), 17 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 49
Damn, sorry for the spam.
So I am going to send the patch from
https://lore.kernel.org/all/20240904095449.ga28...@redhat.com/
To me it looks like a good cleanup regardless, and unless I am totally
confused it should fix the problem with use-after-free.
Oleg.
On 09/11, Oleg Nesterov wrote
I guess VM_SEALED could help, but it depends on CONFIG_64BIT
On 09/11, Oleg Nesterov wrote:
>
> On 09/03, Sven Schnelle wrote:
> >
> > +static void uprobe_clear_state(const struct vm_special_mapping *sm, struct
> > vm_area_struct *vma)
> > +{
> > + stru
On 09/03, Sven Schnelle wrote:
>
> +static void uprobe_clear_state(const struct vm_special_mapping *sm, struct
> vm_area_struct *vma)
> +{
> + struct xol_area *area = container_of(vma->vm_private_data, struct
> xol_area, xol_mapping);
> +
> + mutex_lock(&delayed_uprobe_lock);
> + dela
On 09/09, Jiri Olsa wrote:
>
> On Fri, Sep 06, 2024 at 09:18:15PM +0200, Oleg Nesterov wrote:
> >
> > And btw... Can bpftrace attach to the uprobe tp?
> >
> > # perf probe -x ./test -a func
> > Added new event:
> > probe_test:func (on f
On 09/08, Tianyi Liu wrote:
>
> On Mon, Sep 06, 2024 at 18:43:00AM +0800, Jiri Olsa wrote:
>
> > would you consider sending another version addressing Oleg's points
> > for changelog above?
>
> My pleasure, I'll resend the updated patch in a new thread.
>
> Based on previous discussions, `uprobe_pe
On 09/06, Jiri Olsa wrote:
>
> On Mon, Sep 02, 2024 at 03:22:25AM +0800, Tianyi Liu wrote:
> >
> > For now, please forget the original patch as we need a new solution ;)
>
> hi,
> any chance we could go with your fix until we find better solution?
Well, as I said from the very beginning I won't re
& current->mm != link->task->mm)
> + if (link->task && !same_thread_group(current, link->task))
> return 0;
plus the current check can return false negative if link->task->mm == NULL
Acked-by: Oleg Nesterov
I didn't have time to write a full reply yesterday.
On 09/03, Linus Torvalds wrote:
>
> On Tue, 3 Sept 2024 at 02:09, Oleg Nesterov wrote:
> >
> > but with or without this fix __create_xol_area() also needs
> >
> > area->xol_mapping.mremap = NULL;
&
I didn't have time to write a full reply yesterday.
On 09/03, Linus Torvalds wrote:
>
> On Tue, 3 Sept 2024 at 02:09, Oleg Nesterov wrote:
> >
> > but with or without this fix __create_xol_area() also needs
> >
> > area->xol_mapping.mremap = NULL;
&
On 09/03, Linus Torvalds wrote:
>
> On Tue, 3 Sept 2024 at 02:09, Oleg Nesterov wrote:
> >
> > but with or without this fix __create_xol_area() also needs
> >
> > area->xol_mapping.mremap = NULL;
>
> I think the whole thing needs to be zeroed out.
On 09/03, Andrii Nakryiko wrote:
>
> On Tue, Sep 3, 2024 at 10:27 AM Andrii Nakryiko
> wrote:
> >
> > On Sat, Aug 31, 2024 at 9:19 AM Oleg Nesterov wrote:
> > >
> > > But. Since you are going to send another version, may I ask you to add a
> > > no
On 09/03, Andrii Nakryiko wrote:
>
> On Sat, Aug 31, 2024 at 9:19 AM Oleg Nesterov wrote:
> >
> > I was thinking about another seq counter incremented in register(), so
> > that handler_chain() can detect the race with uprobe_register() and skip
> > unapply_uprob
On 09/03, Peter Zijlstra wrote:
>
> On Fri, Aug 30, 2024 at 12:24:01PM +0200, Oleg Nesterov wrote:
> > On 08/29, Andrii Nakryiko wrote:
> > >
> > > v3->v4:
> > > - added back consumer_rwsem into consumer_del(), which was accidentally
>
On 09/03, Sven Schnelle wrote:
>
> To prevent unitialized members, use kzalloc to allocate
> the xol area.
>
> Signed-off-by: Sven Schnelle
Acked-by: Oleg Nesterov
and since this looks easily exploitable, I'd sugest
Cc: sta...@vger.kernel.org
Fixes: b059a453b1cf1 ("
On 09/03, Sven Schnelle wrote:
>
> [ 44.505448]
> ==
> 20:37:27
> [3421/145075]
> [ 44.505455] BUG: KASAN: slab-use-after-free in
> special_mapping_close+0x9c
On 09/02, Oleg Nesterov wrote:
>
> And... I think that BPF has even more problems with filtering. Not sure,
> I'll try to write another test-case tomorrow.
See below. This test-case needs a one-liner patch at the end, but this is only
because I have no idea how to add
On 09/02, Tianyi Liu wrote:
>
> On Fri, Aug 30, 2024 at 18:12:41PM +0800, Oleg Nesterov wrote:
> >
> > - So I still think that the "right" fix should change the
> > bpf_prog_run_array_uprobe() paths somehow, but I know nothing
> > about b
On 08/30, Jiri Olsa wrote:
>
> with this change the probe will not get removed in the attached test,
> it'll get 2 hits, without this change just 1 hit
Thanks again for pointing out the subtle change in behaviour, but could
you add more details for me? ;)
I was going to read the test below today,
On 08/30, Andrii Nakryiko wrote:
>
> On Fri, Aug 30, 2024 at 1:21 PM Oleg Nesterov wrote:
> >
> > I'll probably write another email (too late for me today), but I agree
> > that "avoid register_rwsem in handler_chain" is obviously a good goal,
> &g
On 08/30, Oleg Nesterov wrote:
>
> - This patch won't fix all problems because uprobe_perf_filter()
> filters by mm, not by pid. The changelog/patch assumes that it
> is a "PID filter", but it is not.
>
> See
> https:
The whole discussion was very confusing (yes, I too contributed to the
confusion ;), let me try to summarise.
> U(ret)probes are designed to be filterable using the PID, which is the
> second parameter in the perf_event_open syscall. Currently, uprobe works
> well with the filtering, but uretprobe
Ah. we certainly misunderstand each other.
On 08/29, Jiri Olsa wrote:
>
> On Thu, Aug 29, 2024 at 05:20:33PM +0200, Oleg Nesterov wrote:
>
> SNIP
SNIP
> right.. if the event is not added by perf_trace_add on this cpu
> it won't go pass this point, so no problem for perf
On 08/28, Jiri Olsa wrote:
>
> On Tue, Aug 27, 2024 at 10:19:26PM +0200, Oleg Nesterov wrote:
> > Hmm. Really? In this case these 2 different consumers will have the
> > different
> > trace_event_call's, so
> >
> > // consumer f
Sorry for another reply, I just noticed I missed one part of your email...
On 08/27, Jiri Olsa wrote:
>
>-> uretprobe-hit
> handle_swbp
> uprobe_handle_trampoline
> handle_uretprobe_chain
> {
>
> for_each_uprobe_co
On 08/27, Jiri Olsa wrote:
>
> On Tue, Aug 27, 2024 at 12:29:38AM +0200, Oleg Nesterov wrote:
> >
> > So, can you reproduce the problem reported by Tianyi on your setup?
>
> yes, I can repduce the issue with uretprobe on top of perf event uprobe
...
&
On 08/27, Jiri Olsa wrote:
>
> On Tue, Aug 27, 2024 at 12:40:52PM +0200, Oleg Nesterov wrote:
> > static bool
> > uprobe_multi_link_filter(struct uprobe_consumer *con, struct mm_struct *mm)
> > {
> > struct bpf_uprobe *uprobe;
> > + struct task_struct
On 08/27, Jiri Olsa wrote:
>
> On Tue, Aug 27, 2024 at 12:08:39PM +0200, Jiri Olsa wrote:
> > >
> > > - if (link->task && current->mm != link->task->mm)
> > > + if (link->task && !same_thread_group(current, link->task))
> > >
> > > in uprobe_prog_run() to make "filter by *process*"
On 08/27, Jiri Olsa wrote:
>
> On Mon, Aug 26, 2024 at 01:57:52PM +0200, Oleg Nesterov wrote:
> >
> > So perhaps we need
> >
> > - if (link->task && current->mm != link->task->mm)
> > + if (link->task && !same
On 08/27, Jiri Olsa wrote:
>
> did you just bpftrace-ed bpftrace? ;-) on my setup I'm getting:
>
> [root@qemu ex]# ../bpftrace/build/src/bpftrace -e 'kprobe:uprobe_register {
> printf("%s\n", kstack); }'
> Attaching 1 probe...
>
> uprobe_register+1
so I guess you are on tip/perf/core whic
This is offtopic, sorry for the spam, but...
On 08/26, Jiri Olsa wrote:
>
> On Mon, Aug 26, 2024 at 01:57:52PM +0200, Oleg Nesterov wrote:
> >
> > Does bpftrace use bpf_uprobe_multi_link_attach/etc ? I guess not...
> > Then which userspace tool uses this code? ;)
>
&g
On 08/26, Jiri Olsa wrote:
>
> On Mon, Aug 26, 2024 at 01:57:52PM +0200, Oleg Nesterov wrote:
> > On 08/26, Jiri Olsa wrote:
> > >
> > > > But "perf-record -p" works as expected.
> > >
> > > I wonder it's because there's the
On 08/26, Oleg Nesterov wrote:
>
> Does bpftrace use bpf_uprobe_multi_link_attach/etc ? I guess not...
Given that the patch from Tianyi makes the difference, bpftrace should
trigger the "#ifdef CONFIG_BPF_EVENTS" code in __uprobe_perf_func(), and
bpf_prog_run_array_uprobe() sh
On 08/26, Jiri Olsa wrote:
>
> On Mon, Aug 26, 2024 at 12:40:18AM +0200, Oleg Nesterov wrote:
> > $ ./test &
> > $ bpftrace -p $! -e 'uprobe:./test:func { printf("%d\n", pid); }'
> >
> > I hope that the syntax of the 2nd command is corr
On 08/25, Oleg Nesterov wrote:
>
> At least I certainly disagree with "Fixes: c1ae5c75e103" ;)
>
> uretprobe_perf_func/etc was designed for perf, and afaics this code still
> works fine even if you run 2 perf-record's with -p PID1/PID2 at the same
> time.
>
&g
On 08/25, Oleg Nesterov wrote:
>
> At least I certainly disagree with "Fixes: c1ae5c75e103" ;)
>
> uretprobe_perf_func/etc was designed for perf, and afaics this code still
> works fine even if you run 2 perf-record's with -p PID1/PID2 at the same
> time.
Forgo
On 08/24, Tianyi Liu wrote:
>
> A key trigger here is that the two tracer instances (either `bpftrace` or
> `perf record`) must be running *simultaneously*. One of them should use
> PID1 as filter, while the other uses PID2.
Yes.
> Agreed, the implementation of uretprobe should be almost the same
On 08/23, Andrii Nakryiko wrote:
>
> This is a bit confusing, because even if the kernel-side uretprobe
> handler doesn't do the filtering by itself, uprobe subsystem shouldn't
> install breakpoints on processes which don't have uretprobe requested
> for (unless I'm missing something, of course).
On 07/03, Oleg Nesterov wrote:
>
> > /*
> > -* The NULL 'tsk' here ensures that any faults that occur here
> > -* will not be accounted to the task. 'mm' *is* current->mm,
> > -* but we treat this as a 'remote' acce
On 07/09, Andrii Nakryiko wrote:
>
> On Tue, Jul 9, 2024 at 11:49 AM Oleg Nesterov wrote:
> >
> > > Yep, that would be unfortunate (just like SIGILL sent when uretprobe
> > > detects "improper" stack pointer progression, for example),
> >
> >
On 07/08, Andrii Nakryiko wrote:
>
> On Sun, Jul 7, 2024 at 7:48 AM Oleg Nesterov wrote:
> >
> > And I forgot to mention...
> >
> > In any case __uprobe_unregister() can't ignore the error code from
> > register_for_each_vma(). If it fails to restore th
robed application.
On 07/05, Oleg Nesterov wrote:
>
> Tried to read this patch, but I fail to understand it. It looks
> obvioulsy wrong to me, see below.
>
> I tend to agree with the comments from Peter, but lets ignore them
> for the moment.
>
> On 07/01, Andrii Nakryiko wrote:
&g
On 07/01, Andrii Nakryiko wrote:
>
> --- a/include/linux/uprobes.h
> +++ b/include/linux/uprobes.h
> @@ -42,6 +42,11 @@ struct uprobe_consumer {
> enum uprobe_filter_ctx ctx,
> struct mm_struct *mm);
>
> + /* associated file offset o
Tried to read this patch, but I fail to understand it. It looks
obvioulsy wrong to me, see below.
I tend to agree with the comments from Peter, but lets ignore them
for the moment.
On 07/01, Andrii Nakryiko wrote:
>
> static void put_uprobe(struct uprobe *uprobe)
> {
> - if (refcount_dec_an
be written at @vaddr.
> *
> - * Called with mm->mmap_lock held for write.
> + * Called with mm->mmap_lock held for read or write.
> * Return 0 (success) or a negative errno.
Thanks,
Acked-by: Oleg Nesterov
I'll try to send the patch which explains the reasons for mmap_w
e treat this as a 'remote' access since
> + * it is essentially a kernel access to the memory.
>*/
> result = get_user_pages_remote(mm, vaddr, 1, FOLL_FORCE, &page, NULL);
OK, this makes it less confusing, so
Acked-by: Oleg Nesterov
-
On 06/25, Andrii Nakryiko wrote:
>
> On Tue, Jun 25, 2024 at 7:51 AM Oleg Nesterov wrote:
> >
> > Why?
> >
> > So far I don't understand this change. Quite possibly I missed something,
> > but in this case the changelog should explain the problem more cl
On 06/25, Masami Hiramatsu wrote:
>
> On Mon, 24 Jun 2024 17:21:34 -0700
> Andrii Nakryiko wrote:
>
> > Given unapply_uprobe() can call remove_breakpoint() which eventually
> > calls uprobe_write_opcode(), which can modify a set of memory pages and
> > expects mm->mmap_lock held for write, it need
Again, I'll try to read (at least this) patch later this week,
just one cosmetic nit for now...
On 06/24, Andrii Nakryiko wrote:
>
> + * UPROBE_REFCNT_GET constant is chosen such that it will *increment both*
> + * epoch and refcnt parts atomically with one atomic_add().
> + * UPROBE_REFCNT_PUT is
I don't think I can review, I forgot everything, but I'll try to read this
series when I have time to (try to) understand what it does...
On 06/24, Andrii Nakryiko wrote:
>
> Given unapply_uprobe() can call remove_breakpoint() which eventually
> calls uprobe_write_opcode(), which can modify a set
On 05/20, Andrii Nakryiko wrote:
>
> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao
> Signed-off-by: Andrii Nakryiko
> ---
> kernel/trace/trace_uprobe.c | 14 +-
> 1 file changed, 9 insertions(+), 5
68 matches
Mail list logo