From: Steven Rostedt
The per CPU "disabled" counter is used for the latency tracers and stack
tracers to make sure that their accounting isn't messed up by an NMI or
interrupt coming in and affecting the same CPU data. But the counter is an
atomic_t type. As it only needs to synchronize against t
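A minimal sketch of the direction this implies, assuming the counter only has to stay consistent against interrupts and NMIs on the same CPU: a local_t is sufficient and cheaper than a full atomic_t. The struct and function names below are illustrative, not the actual trace_array_cpu code.

#include <asm/local.h>

/* Illustrative stand-in for the per-CPU tracer data, not the real layout. */
struct example_cpu_data {
	local_t disabled;	/* was: atomic_t disabled */
};

static void example_account_start(struct example_cpu_data *data)
{
	/* local_inc() only needs to be safe against this CPU's IRQs/NMIs */
	local_inc(&data->disabled);
}

static void example_account_stop(struct example_cpu_data *data)
{
	local_dec(&data->disabled);
}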
From: Steven Rostedt
The branch tracer currently checks the per CPU "disabled" field to know if
tracing is enabled or not for the CPU. As the "disabled" value is not used
anymore to turn off tracing generically, use tracer_tracing_is_on_cpu()
instead.
Signed-off-by: Steven Rostedt (Google)
---
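A hedged sketch of what the branch tracer check could look like after this change; tracer_tracing_is_on_cpu() is the helper named above, everything else here is illustrative.

static void example_branch_probe(struct trace_array *tr, bool taken)
{
	int cpu = raw_smp_processor_id();

	/* Skip recording when the ring buffer is not writable on this CPU,
	 * instead of testing the old per-CPU "disabled" counter. */
	if (!tracer_tracing_is_on_cpu(tr, cpu))
		return;

	/* ... reserve an event and record the branch into the ring buffer ... */
}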
From: Steven Rostedt
The trace_array_cpu had a "buffer_page" field that was originally going to
be used as a backup page for the ring buffer. But the ring buffer has its
own way of reusing pages and this field was never used.
Remove it.
Signed-off-by: Steven Rostedt (Google)
---
kernel/trace/
From: Steven Rostedt
Add the function ring_buffer_record_is_on_cpu() that returns true if the
ring buffer for a given CPU is writable and false otherwise.
Also add tracer_tracing_is_on_cpu() to return if the ring buffer for a
given CPU is writable for a given trace_array.
Signed-off-by: Steven
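A rough sketch of how the two helpers likely relate, under the assumption that the trace_array level function simply delegates to the ring buffer one; the exact body in the patch may differ.

bool tracer_tracing_is_on_cpu(struct trace_array *tr, int cpu)
{
	if (tr->array_buffer.buffer)
		return ring_buffer_record_is_on_cpu(tr->array_buffer.buffer, cpu);
	return false;
}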
From: Steven Rostedt
The irqsoff tracer uses the per CPU "disabled" field to prevent corruption
of the accounting when it starts to trace an interrupts-disabled section, but there's
a slight race that could happen if for some reason it was called twice.
Use atomic_inc_return() instead.
Signed-off-by: Steve
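An illustrative sketch (not the actual irqsoff code) of the race and the fix: reading the counter and then incrementing it are two separate steps, so two racing callers can both observe zero; atomic_inc_return() folds the check into the increment.

struct example_irqsoff_data {
	atomic_t disabled;
};

static void example_start_timing(struct example_irqsoff_data *data)
{
	/*
	 * Racy pattern being replaced:
	 *	if (atomic_read(&data->disabled))	// both callers may see 0
	 *		return;
	 *	atomic_inc(&data->disabled);
	 */
	if (atomic_inc_return(&data->disabled) != 1)
		goto out;	/* someone else is already accounting */

	/* ... record the interrupts-off timestamp ... */
out:
	atomic_dec(&data->disabled);
}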
From: Steven Rostedt
The per CPU "disabled" value was the original way to disable tracing when
the tracing subsystem was first created. Today, the ring buffer
infrastructure has its own way to disable tracing. In fact, things have
changed so much since 2008 that many things ignore the disable flag
From: Steven Rostedt
The ignore_pid boolean on the per CPU data descriptor is updated at
sched_switch when a new task is scheduled in. If the new task is to be
ignored, it is set to true, otherwise it is set to false. The current task
should always have the correct value as it is updated when the
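A hypothetical sketch of the mechanism described, with illustrative names and a simplified signature: at sched_switch the per-CPU ignore_pid flag is recomputed for the incoming task, so the hot event paths only need a cheap per-CPU read.

static void example_pid_sched_switch(struct trace_array *tr,
				     struct trace_pid_list *pid_list,
				     struct task_struct *next)
{
	/* true when the incoming task is NOT in the filtered pid list */
	bool ignore = !trace_find_filtered_pid(pid_list, next->pid);

	this_cpu_write(tr->array_buffer.data->ignore_pid, ignore);
}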
Looking into allowing syscall events to fault and read user space, I found
that the use of the per CPU data "disabled" field was mostly obsolete.
This goes back to 2008 when the tracing subsystem was first created.
The "disabled" field was the only way to know if tracing was disabled or
not. But
From: Steven Rostedt
The mmiotracer referenced the per CPU array_buffer->data descriptor but
never actually used it. Remove the references to it.
Signed-off-by: Steven Rostedt (Google)
---
kernel/trace/trace_mmiotrace.c | 12 ++--
1 file changed, 2 insertions(+), 10 deletions(-)
diff
On Thu, May 01, 2025 at 04:57:30PM -0400, Steven Rostedt wrote:
> On Thu, 1 May 2025 13:14:11 -0700
> Namhyung Kim wrote:
>
> > Hi Steve,
> >
> > On Wed, Apr 30, 2025 at 09:32:07PM -0400, Steven Rostedt wrote:
>
> > > To solve this, when a per CPU event is created that has defer_callchain
> > >
gah, the cc list here is rotund...
On 5/2/25 09:38, Valentin Schneider wrote:
...
>> All of the paths to enter the kernel from userspace have some
>> SWITCH_TO_KERNEL_CR3 variant. If they didn't, the userspace that they
>> entered from could have attacked the kernel with Meltdown.
>>
>> I'm theori
From: Josh Poimboeuf
Make unwind_deferred_request() NMI-safe so tracers in NMI context can
call it to get the cookie immediately rather than having to do the fragile
"schedule irq work and then call unwind_deferred_request()" dance.
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Goo
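A minimal sketch of what NMI-safety typically requires here, with illustrative field and helper names (example_*): no locks, and a single compare-and-exchange to claim the "work already scheduled" state, so an NMI-context caller can race safely with task context.

struct example_unwind_info {
	u32			pending;
	struct callback_head	work;
};

static int example_deferred_request(struct example_unwind_info *info, u64 *cookie)
{
	u32 old = 0;

	*cookie = example_get_cookie(info);	/* hypothetical helper */

	/* Only the first requester schedules the task work. */
	if (!try_cmpxchg(&info->pending, &old, 1))
		return 1;	/* already pending, cookie still valid */

	/* TWA_NMI_CURRENT allows queueing task work from NMI context. */
	return task_work_add(current, &info->work, TWA_NMI_CURRENT);
}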
From: Josh Poimboeuf
If the task is not a user thread, there's no user stack to unwind.
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
kernel/events/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core
From: Steven Rostedt
To determine if a task is a kernel thread or not, it is more reliable to
use (current->flags & PF_KTHREAD) than to rely on current->mm being NULL.
That is because some kernel tasks (io_uring helpers) may have a mm field.
Link:
https://lore.kernel.org/linux-trace-kernel/2025
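An illustrative comparison of the two checks these entries touch (the helper name is made up): some kernel workers, such as the io_uring helpers mentioned above, can have a non-NULL mm, so testing the flag is the more reliable way to identify a kernel thread.

static bool example_is_kernel_thread(struct task_struct *task)
{
	/* Unreliable: kernel workers may carry a user mm. */
	/* return task->mm == NULL; */

	/* Reliable: PF_KTHREAD is set for kernel threads. */
	return task->flags & PF_KTHREAD;
}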
From: Josh Poimboeuf
Simplify the get_perf_callchain() user logic a bit. task_pt_regs()
should never be NULL.
Acked-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
kernel/events/callchain.c | 18 --
1 file changed, 8 insertions(+), 1
From: Josh Poimboeuf
get_perf_callchain() doesn't support cross-task unwinding for user space
stacks, so have it return NULL if both the crosstask and user arguments are
set.
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
kernel/events/callchain.c | 8
1 file c
From: Josh Poimboeuf
The 'init_nr' argument has double duty: it's used to initialize both the
number of contexts and the number of stack entries. That's confusing
and the callers always pass zero anyway. Hard code the zero.
Acked-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
Signed-off-by:
From: Steven Rostedt
Instead of using the callback_mutex to protect the link list of callbacks
in unwind_deferred_task_work(), use SRCU instead. This gets called every
time a task that has to record a requested stack trace exits.
This can happen for many tasks on several CPUs at the same
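A rough sketch of the read side this describes, with illustrative names for the SRCU struct and list head: tasks walk the callback list under SRCU, so many tasks on different CPUs can do it concurrently without contending on a mutex.

static LIST_HEAD(example_callbacks);
DEFINE_STATIC_SRCU(example_unwind_srcu);

static void example_deferred_task_work(struct unwind_stacktrace *trace, u64 cookie)
{
	struct unwind_work *work;
	int idx;

	idx = srcu_read_lock(&example_unwind_srcu);
	list_for_each_entry_srcu(work, &example_callbacks, list,
				 srcu_read_lock_held(&example_unwind_srcu))
		work->func(work, trace, cookie);
	srcu_read_unlock(&example_unwind_srcu, idx);
}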
From: Steven Rostedt
In order to know which registered callback requested a stacktrace for when
the task goes back to user space, add a bitmask for all registered
tracers. The bitmask is the size of a long, which means that on a 32 bit
machine, it can have at most 32 registered tracers, and on 64 bi
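An illustrative sketch of the bit-per-tracer idea (names and locking are simplified, not the patch's exact code): each registration claims one bit of a long-sized mask, giving at most 32 tracers on 32-bit kernels and 64 on 64-bit ones.

#define EXAMPLE_MAX_TRACERS	(sizeof(unsigned long) * BITS_PER_BYTE)

static unsigned long example_registered_mask;	/* protected by a mutex, elided */

static int example_register_tracer(struct unwind_work *work)
{
	unsigned long bit = find_first_zero_bit(&example_registered_mask,
						EXAMPLE_MAX_TRACERS);

	if (bit >= EXAMPLE_MAX_TRACERS)
		return -EBUSY;

	__set_bit(bit, &example_registered_mask);
	work->bit = bit;	/* hypothetical field for illustration */
	return 0;
}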
From: Josh Poimboeuf
Cache the results of the unwind to ensure the unwind is only performed
once, even when called by multiple tracers.
The cache's nr_entries gets cleared every time the task exits the kernel.
When a stacktrace is requested, nr_entries gets set to the number of
entries in the stac
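A sketch of the caching behavior described, with illustrative types and names: the first request after entering the kernel performs the unwind and fills nr_entries; subsequent requests reuse the cached trace, and nr_entries is cleared again when the task returns to user space.

struct example_unwind_cache {
	unsigned int	nr_entries;
	unsigned long	entries[EXAMPLE_MAX_ENTRIES];	/* size is illustrative */
};

static void example_get_user_trace(struct example_unwind_cache *cache,
				   struct unwind_stacktrace *trace)
{
	if (!cache->nr_entries) {
		/* First tracer to ask since the last return to user space. */
		cache->nr_entries = example_unwind_user(cache->entries,
							EXAMPLE_MAX_ENTRIES);
	}

	trace->entries = cache->entries;
	trace->nr = cache->nr_entries;
}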
From: Josh Poimboeuf
Add an interface for scheduling task work to unwind the user space stack
before returning to user space. This solves several problems for its
callers:
- Ensure the unwind happens in task context even if the caller may be
running in NMI or interrupt context.
- Avoid
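A hypothetical usage sketch of the interface (the callback signature and return-value handling are assumptions): a tracer in IRQ or NMI context requests a deferred user unwind, attaches the returned cookie to its kernel-side event, and the callback later delivers the user stack trace in task context before the return to user space.

static void example_unwind_callback(struct unwind_work *work,
				    struct unwind_stacktrace *trace,
				    u64 cookie)
{
	/* Match 'cookie' against previously emitted kernel events and
	 * append the user stack trace to them. */
}

static void example_event_handler(struct unwind_work *ework)
{
	u64 cookie;

	if (unwind_deferred_request(ework, &cookie) >= 0)
		example_emit_kernel_event(cookie);	/* hypothetical */
}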
From: Steven Rostedt
Add a function, which must be called from a faultable context, that
retrieves a user space stack trace. The function unwind_deferred_trace()
can be called by a tracer when a task is about to enter user space, or has
just come back from user space and has interrupts enabled
From: Josh Poimboeuf
get_segment_base() will be used by the unwind_user code, so make it
global and rename it so it doesn't conflict with a KVM function of the
same name.
As the function is no longer specific to perf, move it to ptrace.c as that
seems to be a better location for a generic functi
From: Josh Poimboeuf
Use ARCH_INIT_USER_COMPAT_FP_FRAME to describe how frame pointers are
unwound on x86, and implement the hooks needed to add the segment base
addresses. Enable HAVE_UNWIND_USER_COMPAT_FP if the system has compat
mode compiled in.
Signed-off-by: Josh Poimboeuf
Signed-off-by:
From: Josh Poimboeuf
Add optional support for user space compat mode frame pointer unwinding.
If supported, the arch needs to enable CONFIG_HAVE_UNWIND_USER_COMPAT_FP
and define ARCH_INIT_USER_COMPAT_FP_FRAME.
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
arch/Kconf
From: Josh Poimboeuf
Add optional support for user space frame pointer unwinding. If
supported, the arch needs to enable CONFIG_HAVE_UNWIND_USER_FP and
define ARCH_INIT_USER_FP_FRAME.
By encoding the frame offsets in struct unwind_user_frame, much of this
code can also be reused for future unwi
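A rough illustration of what such a frame description can look like (field names are guesses consistent with the text, not necessarily the patch's): the generic walker only needs the offsets of the caller's frame and of the saved return address and frame pointer.

struct example_user_frame {
	int	cfa_off;	/* frame pointer -> caller's canonical frame address */
	int	ra_off;		/* CFA -> return address */
	int	fp_off;		/* CFA -> caller's saved frame pointer */
	bool	use_fp;		/* walk the chain through the frame-pointer register */
};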
From: Josh Poimboeuf
Use ARCH_INIT_USER_FP_FRAME to describe how frame pointers are unwound
on x86, and enable CONFIG_HAVE_UNWIND_USER_FP accordingly so the
unwind_user interfaces can be used.
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
arch/x86/Kconfig
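Under the same illustrative layout as the sketch above, and assuming the classic x86-64 frame-pointer convention (saved %rbp at the frame pointer, return address one word above it), the x86 description would look roughly like this; the patch's exact macro may differ.

static const struct example_user_frame example_x86_64_fp_frame = {
	.cfa_off = 2 * sizeof(long),		/* %rbp + 16 is the caller's CFA */
	.ra_off  = -(int)sizeof(long),		/* return address at CFA - 8 */
	.fp_off  = -2 * (int)sizeof(long),	/* saved %rbp at CFA - 16 */
	.use_fp  = true,
};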
From: Josh Poimboeuf
Introduce a generic API for unwinding user stacks.
In order to expand user space unwinding to be able to handle more complex
scenarios, such as deferred unwinding and reading user space information,
create a generic interface that all architectures can use that support the
v
[ Shorten the Cc list to just those that maintain this ]
This series does not make any user space visible changes.
It only adds the necessary infrastructure of the deferred unwinder
and makes a few helpful cleanups to perf.
Based off of tip/master: 252d33c92dbc23bcc1e662a889787c09a02eeccc
Pet
On 02/05/25 06:53, Dave Hansen wrote:
> On 5/2/25 02:55, Valentin Schneider wrote:
>> My gripe with that was having two separate mechanisms
>> - super early entry around SWITCH_TO_KERNEL_CR3)
>> - later entry at context tracking
>
> What do you mean by "later entry"?
>
I meant the point at which t
On 02.05.25 17:30, Nico Pache wrote:
On Fri, May 2, 2025 at 9:27 AM Jann Horn wrote:
On Fri, May 2, 2025 at 5:19 PM David Hildenbrand wrote:
On 02.05.25 14:50, Jann Horn wrote:
On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
On 02.05.25 00:29, Nico Pache wrote:
On Wed, Apr 30, 2
On 02.05.25 17:24, Lorenzo Stoakes wrote:
On Fri, May 02, 2025 at 05:18:54PM +0200, David Hildenbrand wrote:
On 02.05.25 14:50, Jann Horn wrote:
On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
On 02.05.25 00:29, Nico Pache wrote:
On Wed, Apr 30, 2025 at 2:53 PM Jann Horn wrote:
On
On Fri, May 2, 2025 at 9:27 AM Jann Horn wrote:
>
> On Fri, May 2, 2025 at 5:19 PM David Hildenbrand wrote:
> >
> > On 02.05.25 14:50, Jann Horn wrote:
> > > On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
> > >> On 02.05.25 00:29, Nico Pache wrote:
> > >>> On Wed, Apr 30, 2025 at 2:53 P
On Fri, May 2, 2025 at 5:19 PM David Hildenbrand wrote:
>
> On 02.05.25 14:50, Jann Horn wrote:
> > On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
> >> On 02.05.25 00:29, Nico Pache wrote:
> >>> On Wed, Apr 30, 2025 at 2:53 PM Jann Horn wrote:
>
> On Mon, Apr 28, 2025 at 8:12
On Fri, May 02, 2025 at 05:18:54PM +0200, David Hildenbrand wrote:
> On 02.05.25 14:50, Jann Horn wrote:
> > On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
> > > On 02.05.25 00:29, Nico Pache wrote:
> > > > On Wed, Apr 30, 2025 at 2:53 PM Jann Horn wrote:
> > > > >
> > > > > On Mon, Apr
On Fri, May 02, 2025 at 07:33:55AM -0700, Dave Hansen wrote:
> On 5/2/25 04:22, Peter Zijlstra wrote:
> > On Wed, Apr 30, 2025 at 11:07:35AM -0700, Dave Hansen wrote:
> >
> >> Both AMD and Intel have hardware to do it. ARM CPUs do it too, I think.
> >> You can go buy the Intel hardware off the she
On 02.05.25 14:50, Jann Horn wrote:
On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
On 02.05.25 00:29, Nico Pache wrote:
On Wed, Apr 30, 2025 at 2:53 PM Jann Horn wrote:
On Mon, Apr 28, 2025 at 8:12 PM Nico Pache wrote:
Introduce the ability for khugepaged to collapse to different
On 5/2/25 04:22, Peter Zijlstra wrote:
> On Wed, Apr 30, 2025 at 11:07:35AM -0700, Dave Hansen wrote:
>
>> Both AMD and Intel have hardware to do it. ARM CPUs do it too, I think.
>> You can go buy the Intel hardware off the shelf today.
> To be fair, the Intel RAR thing is pretty horrific 🙁 Defini
On 2025/5/1 00:53, Alexei Starovoitov wrote:
> On Wed, Apr 30, 2025 at 8:55 AM Leon Hwang wrote:
>>
>>
>>
>> On 2025/4/30 20:43, Kafai Wan wrote:
>>> On Wed, Apr 30, 2025 at 10:46 AM Alexei Starovoitov
>>> wrote:
On Sat, Apr 26, 2025 at 9:00 AM KaFai Wan wrote:
>
>>
[...]
>>
>
On 5/2/25 02:55, Valentin Schneider wrote:
> My gripe with that was having two separate mechanisms
> - super early entry around SWITCH_TO_KERNEL_CR3)
> - later entry at context tracking
What do you mean by "later entry"?
All of the paths to enter the kernel from userspace have some
SWITCH_TO_KERN
On Fri, 02 May 2025 15:15:53 +0200
Paul Cacheux via B4 Relay wrote:
> From: Paul Cacheux
>
> The shared trace_probe_log variable can be accessed and modified
> by multiple processes using tracefs at the same time, this new
> mutex will guarantee it's always in a coherent state.
>
> There is no
Hello,
As reported in [1] a race exists in the shared trace probe log
used to build error messages. This can cause kernel crashes
when building the actual error message, but the race happens
even for non-error tracefs uses; it's just not visible.
The reproducer first reported, which is still crashing:
From: Paul Cacheux
The shared trace_probe_log variable can be accessed and modified
by multiple processes using tracefs at the same time; this new
mutex will guarantee it's always in a coherent state.
There is no guarantee that multiple errors happening at the same
time will each have the correc
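A minimal sketch of the locking pattern being added, with illustrative helper names: every access to the shared log state goes through one mutex, so concurrent tracefs users can no longer interleave their updates.

static DEFINE_MUTEX(example_probe_log_mutex);

static void example_probe_log_set(const char *subsystem, int argc,
				  const char **argv)
{
	mutex_lock(&example_probe_log_mutex);
	/* ... update the shared log: subsystem, argc, argv, error index ... */
	mutex_unlock(&example_probe_log_mutex);
}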
From: Paul Cacheux
Make sure trace_probe_log_clear is called in the tracing
eprobe code path, matching the trace_probe_log_init call.
Signed-off-by: Paul Cacheux
---
kernel/trace/trace_eprobe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/tr
On Fri, May 2, 2025 at 8:29 AM David Hildenbrand wrote:
> On 02.05.25 00:29, Nico Pache wrote:
> > On Wed, Apr 30, 2025 at 2:53 PM Jann Horn wrote:
> >>
> >> On Mon, Apr 28, 2025 at 8:12 PM Nico Pache wrote:
> >>> Introduce the ability for khugepaged to collapse to different mTHP sizes.
> >>> Wh
On Wed, Apr 30, 2025 at 11:07:35AM -0700, Dave Hansen wrote:
> Both AMD and Intel have hardware to do it. ARM CPUs do it too, I think.
> You can go buy the Intel hardware off the shelf today.
To be fair, the Intel RAR thing is pretty horrific :-( Definitely
sub-par compared to the AMD and ARM thi
On 30/04/25 13:00, Dave Hansen wrote:
> On 4/30/25 12:42, Steven Rostedt wrote:
>>> Look at the syscall code for instance:
>>>
SYM_CODE_START(entry_SYSCALL_64)
swapgs
movq	%rsp, PER_CPU_VAR(cpu_tss_rw + TSS_sp2)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
>>
On Thu, May 01, 2025 at 11:26:46PM +0200, Alejandro Colomar wrote:
> Hi Jiri,
>
> On Tue, Apr 22, 2025 at 10:45:41PM +0200, Alejandro Colomar wrote:
> > On Tue, Apr 22, 2025 at 04:01:56PM +0200, Jiri Olsa wrote:
> > > > > +is an alternative to breakpoint instructions
> > > > > +for triggering entr