Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-29 Thread Luis Claudio R. Goncalves
On Thu, Aug 29, 2024 at 08:40:25PM -0300, Luis Claudio R. Goncalves wrote: > On Tue, Aug 27, 2024 at 04:34:39PM +0200, Tomas Glozar wrote: > > On Mon, Aug 26, 2024 at 19:27, Steven Rostedt wrote: > > > > > > Yeah, I think I finally found the real issue. I don't think we need the > > > re

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-29 Thread Luis Claudio R. Goncalves
On Tue, Aug 27, 2024 at 04:34:39PM +0200, Tomas Glozar wrote: > On Mon, Aug 26, 2024 at 19:27, Steven Rostedt wrote: > > > > Yeah, I think I finally found the real issue. I don't think we need the ref > > counting. The problem is the creating and killing of the threads via the > > start and st

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-27 Thread Tomas Glozar
On Mon, Aug 26, 2024 at 19:27, Steven Rostedt wrote: > > Yeah, I think I finally found the real issue. I don't think we need the ref > counting. The problem is the creating and killing of the threads via the > start and stop callbacks. That's not their purpose. The purpose of stop > and start

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-26 Thread Steven Rostedt
On Mon, 26 Aug 2024 15:01:24 +0200 Tomas Glozar wrote: > > Before the reset, all but one of the tlat->kthread is NULL. Then it dawned > > on me that this is a global per CPU variable. It gets initialized when the > > tracer starts. If another program has the timerlat fd open when the > > trace

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-26 Thread Tomas Glozar
On Fri, Aug 23, 2024 at 20:51, Steven Rostedt wrote: > > Egad, I don't think this is even good enough. I noticed this in the trace > (adding kthread to the memset trace_printk): > ><...>-916 [003] . 134.227044: osnoise_workload_start: > memset 88823c435b28 for 00

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-23 Thread Steven Rostedt
On Fri, 23 Aug 2024 12:54:26 -0400 Steven Rostedt wrote: > > $ while true; do rtla timerlat top -u -q & PID=$!; sleep 5; \ > > kill -INT $PID; sleep 0.001; kill -TERM $PID; wait $PID; done > > The "kill -INT $PID" caused the write to osnoise_workload_start(), and then > after 1ms you do the "k

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-23 Thread Steven Rostedt
On Tue, 20 Aug 2024 15:00:01 +0200 tglo...@redhat.com wrote: > From: Tomas Glozar > > When running timerlat with a userspace workload (NO_OSNOISE_WORKLOAD), > NULL pointer dereference can be triggered by sending consequent SIGINT > and SIGTERM signals to the workload process. That then causes >

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-22 Thread Steven Rostedt
On Thu, 22 Aug 2024 08:20:52 -0300 "Luis Claudio R. Goncalves" wrote: > You mean the > > + if (!tlat_var->kthread) { > + /* the fd has been closed already */ > > bit or the kthread handling in rtla itself? > > As Tomas already said, thank you for testing and reviewing the sugg

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-22 Thread Steven Rostedt
On Thu, 22 Aug 2024 10:32:02 -0400 Steven Rostedt wrote: > > Yeah, it seems there might be multiple bugs in the user workload > > handling, the other NULL pointer dereference and refcount warning > > above might be related (but I have yet to reproduce it on an upstream > > kernel). I'm also going

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-22 Thread Steven Rostedt
On Thu, 22 Aug 2024 11:32:07 +0200 Tomas Glozar wrote: > On Wed, Aug 21, 2024 at 22:02, Steven Rostedt wrote: > > > > I'm able to reproduce this with the above. Unfortunately, I can still > > reproduce it after applying this patch :-( > > > > Thank you for looking at this. I was at first

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-22 Thread Luis Claudio R. Goncalves
On Wed, Aug 21, 2024 at 04:03:16PM -0400, Steven Rostedt wrote: > On Tue, 20 Aug 2024 15:00:01 +0200 > tglo...@redhat.com wrote: > > > From: Tomas Glozar > > > > When running timerlat with a userspace workload (NO_OSNOISE_WORKLOAD), > > NULL pointer dereference can be triggered by sending conseq

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-22 Thread Tomas Glozar
On Wed, Aug 21, 2024 at 22:02, Steven Rostedt wrote: > > I'm able to reproduce this with the above. Unfortunately, I can still > reproduce it after applying this patch :-( > Thank you for looking at this. I was at first not too sure about whether this is the proper fix, but after some discuss

Re: [PATCH] tracing/timerlat: Check tlat_var for NULL in timerlat_fd_release

2024-08-21 Thread Steven Rostedt
On Tue, 20 Aug 2024 15:00:01 +0200 tglo...@redhat.com wrote: > From: Tomas Glozar > > When running timerlat with a userspace workload (NO_OSNOISE_WORKLOAD), > NULL pointer dereference can be triggered by sending consequent SIGINT > and SIGTERM signals to the workload process. That then causes >