On Sun, 7 Jan 2024 13:42:39 +0100
Christian Brauner wrote:
>
> So, I tried to do an exploratory patch even though I promised myself not
> to do it. But hey...
>
> Some notes:
>
> * Permission handling for idmapped mounts is done completely in the
> VFS. That's the case for all filesystems that
On Fri, 5 Jan 2024 15:26:28 +0100
Christian Brauner wrote:
> On Wed, Jan 03, 2024 at 08:32:46PM -0500, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)"
> >
> > Instead of walking the dentries on mount/remount to update the gid values of
> >
From: "Steven Rostedt (Google)"
In order to apply a shortcut to skip over the current ctx->pos
immediately, by using the ei->entries array, the reading of that array
should be first. Moving the array reading before the linked list reading
will make the shortcut change diff nice
This fixes a bug with duplicate files being shown by 'ls'.
- Swap reading ei->entries with ei->children to make the next change
easier to read
- Add a "shortcut" in the ei->entries array to skip over already read
entries.
Steven Rostedt (Google) (4):
even
From: "Steven Rostedt (Google)"
As the ei->entries array is fixed for the duration of the eventfs_inode,
it can be used to skip over already read entries in eventfs_iterate().
That is, if ctx->pos is greater than zero, there's no reason in doing the
loop across the ei->
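The shortcut described in this snippet can be sketched in plain user-space C. This is only an illustration under stated assumptions: iterate_from(), its parameters and the string array are hypothetical stand-ins, not the eventfs_iterate() code, and the real function also has to account for things this sketch ignores. The point is that a fixed array lets iteration resume at the saved position instead of walking and counting past entries that were already read.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for resuming a directory read from a saved
 * position. "entries" plays the role of a fixed array, "*pos" the role
 * of the saved read position, and "out" a caller buffer of out_cap slots.
 * Returns the number of entries emitted on this call.
 */
static size_t iterate_from(const char *const *entries, size_t nr_entries,
			   size_t *pos, const char **out, size_t out_cap)
{
	size_t emitted = 0;

	/* Start directly at *pos: no loop over already-read entries. */
	for (size_t i = *pos; i < nr_entries && emitted < out_cap; i++) {
		out[emitted++] = entries[i];
		/* Advance the position only when an entry was emitted. */
		*pos = i + 1;
	}
	return emitted;
}
```

Calling it repeatedly with the same pos variable continues where the previous call stopped and returns 0 once the array is exhausted.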
From: "Steven Rostedt (Google)"
The ctx->pos was only updated when it added an entry, but the "skip to
current pos" check (c--) happened for every loop regardless of if the
entry was added or not. This inconsistency caused readdir to be incorrect.
It was due to:
From: "Steven Rostedt (Google)"
If ei->is_freed is set in eventfs_iterate(), it means that the directory
that is being iterated on is in the process of being freed. Just exit the
loop immediately when that is ever detected, and separate out the return
of the entry->callback() f
On Thu, 21 Dec 2023 12:58:13 -0500
Steven Rostedt wrote:
> On Thu, 21 Dec 2023 17:35:22 +
> Vincent Donnefort wrote:
>
> > @@ -5999,6 +6078,307 @@ int ring_buffer_subbuf_order_set(struct
> > trace_buffer *buffer, int order)
> > }
> > EXPORT_SYMBOL
tanya chundru
> ---
> Changes in v8:
> - Pass the structure and dereference the variables in TP_fast_assign as
> suggested by steve
> - Link to v7:
> https://lore.kernel.org/r/20231206-ftrace_support-v7-1-aca49a042...@quicinc.com
So this looks good from a tracing POV.
Reviewe
From: "Steven Rostedt (Google)"
Instead of walking the dentries on mount/remount to update the gid values of
all the dentries if a gid option is specified on mount, just update the root
inode. Add .getattr, .setattr, and .permissions on the tracefs inode
operations to update the perm
On Thu, 4 Jan 2024 01:48:37 +
Al Viro wrote:
> On Wed, Jan 03, 2024 at 08:32:46PM -0500, Steven Rostedt wrote:
>
> > + /* Get the tracefs root from the parent */
> > + inode = d_inode(dentry->d_parent);
> > + inode = d_inode(inode->i_sb->s_root);
>
On Thu, 4 Jan 2024 01:59:10 +
Al Viro wrote:
> On Wed, Jan 03, 2024 at 08:32:46PM -0500, Steven Rostedt wrote:
>
> > +static struct inode *instance_inode(struct dentry *parent, struct inode
> > *inode)
> > +{
> > + struct tracefs_inode *ti;
>
From: "Steven Rostedt (Google)"
The eventfs creates dynamically allocated dentries and inodes. Using the
dcache_readdir() logic for its own directory lookups requires hiding the
cursor of the dcache logic and playing games to allow the dcache_readdir()
to still have access to the cu
From: "Steven Rostedt (Google)"
The "lookup" parameter is a way to differentiate the call to
create_file/dir_dentry() from when it's just a lookup (no need to up the
dentry refcount) and accessed via a readdir (need to up the refcount).
But in reality, it just makes the c
ries in those cases.
Steven Rostedt (Google) (2):
eventfs: Remove "lookup" parameter from create_dir/file_dentry()
eventfs: Stop using dcache_readdir() for getdents()
fs/tracefs/event_inode.c | 241 ---
1 file changed, 80 insertions(+), 161 deletions(-)
From: "Steven Rostedt (Google)"
Instead of walking the dentries on mount/remount to update the gid values of
all the dentries if a gid option is specified on mount, just update the root
inode. Add .getattr, .setattr, and .permissions on the tracefs inode
operations to update the perm
On Wed, 3 Jan 2024 13:54:36 -0800
Linus Torvalds wrote:
> On Wed, 3 Jan 2024 at 11:57, Linus Torvalds
> wrote:
> >
> > Or, you know, you could do what I've told you to do at least TEN TIMES
> > already, which is to not mess with any of this, and just implement the
> > '->permission()' callback (
orage.googleapis.com/syzbot-assets/4f872267133f/vmlinux-453f5db0.xz
> kernel image:
> https://storage.googleapis.com/syzbot-assets/587572061791/bzImage-453f5db0.xz
>
> The issue was bisected to:
>
> commit 7e8358edf503e87236c8d07f69ef0ed846dd5112
> Author: Steven Rostedt (Google
On Wed, 3 Jan 2024 10:38:09 -0800
Linus Torvalds wrote:
> @@ -332,10 +255,8 @@ static int tracefs_apply_options(struct super_block *sb,
> bool remount)
> if (!remount || opts->opts & BIT(Opt_uid))
> inode->i_uid = opts->uid;
>
> - if (!remount || opts->opts & BIT(Opt_gi
On Wed, 3 Jan 2024 10:38:09 -0800
Linus Torvalds wrote:
> On Wed, 3 Jan 2024 at 10:12, Linus Torvalds
> wrote:
> >
> > Much better. Now eventfs looks more like a real filesystem, and less
> > like an eldritch horror monster that is parts of dcache tackled onto a
> > pseudo-filesystem.
>
> Oh,
On Wed, 3 Jan 2024 10:12:08 -0800
Linus Torvalds wrote:
> On Wed, 3 Jan 2024 at 07:24, Steven Rostedt wrote:
> >
> > Instead, just have eventfs have its own iterate_shared callback function
> > that will fill in the dent entries. This simplifies the code quite a bit.
From: "Steven Rostedt (Google)"
The eventfs creates dynamically allocated dentries and inodes. Using the
dcache_readdir() logic for its own directory lookups requires hiding the
cursor of the dcache logic and playing games to allow the dcache_readdir()
to still have access to the cu
loader argument comments*
> ...
>
> key1 = value1
> key2 = value2
> key3 = value3
> *bootloader argument comments*
> ...
>
> Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as comment to
> /proc/bootconfig")
> Signed-off-by: Zhenhua Huan
From: "Steven Rostedt (Google)"
A flag was needed to denote which eventfs_inode was the "events"
directory, so a bit was taken from the "nr_entries" field, as there's not
that many entries, and 2^30 is plenty. But the bit number for nr_entries
was not updat
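As a rough illustration of taking a flag bit out of a counter field, the layout below packs two one-bit flags and a 30-bit count into one 32-bit word. The struct name, the is_freed flag being in the same word, and the exact widths are assumptions made for this sketch; only the idea of an "events" flag sharing space with nr_entries, and the 2^30 bound, come from the text above.

```c
/*
 * Hypothetical bitfield layout (not copied from the kernel source).
 * 1 + 1 + 30 bits fit in a single 32-bit unsigned int on common ABIs.
 */
struct ei_flags_sketch {
	unsigned int is_freed:1;	/* assumed neighbouring flag */
	unsigned int is_events:1;	/* marks the "events" directory */
	unsigned int nr_entries:30;	/* counts up to 2^30 - 1 entries */
};

/* Largest count the 30-bit field can hold. */
#define SKETCH_MAX_ENTRIES ((1u << 30) - 1)
```

If the widths did not add up to the word size, the compiler would silently spill into a second word, which is one reason the field width has to be adjusted whenever a flag bit is added.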
From: "Steven Rostedt (Google)"
If a getdents() is called on the tracefs directory but does not get all
the files, it can leave a "cursor" dentry in the d_subdirs list of tracefs
dentry. This cursor dentry does not have a d_inode for it. Before
referencing tracefs_inode
On Tue, 02 Jan 2024 18:54:26 +0800
"Ubisectech Sirius" wrote:
> Dear concerned.
> Greetings!
> We are Ubisectech Sirius Team, the vulnerability lab of China
> ValiantSec. Recently, our team has discovered an issue in Linux kernel 6.7.
> technical details:
> 1. Vulnerability Description: BUG: unabl
unpoisoning ftrace_regs in ftrace_ops_list_func.
>
> Acked-by: Steven Rostedt (Google)
I'm taking my ack away for this change in favor of what I'm suggesting now.
> Reviewed-by: Alexander Potapenko
> Signed-off-by: Ilya Leoshkevich
> ---
> kernel/trace/ftrace.c | 1
Masami and Jiri,
This patch made it through all my tests. If I can get an Acked-by by
Sunday, I'll include it in my push to Linus (I have a couple of other fixes
to send him).
-- Steve
On Fri, 29 Dec 2023 11:51:34 -0500
Steven Rostedt wrote:
> From: "Steven Rostedt (Google)"
On Fri, 29 Dec 2023 13:40:50 -0500
Steven Rostedt wrote:
> I'm sending this to a wider audience, as I want to hear more
> feedback on this before I accept it.
>
I forgot to mention that this can be applied on top of:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-
oduced ioctl:
TRACE_MMAP_IOCTL_GET_READER. This will update the Meta-page reader ID to
point to the next reader containing unread data.
Link:
https://lore.kernel.org/linux-trace-kernel/20231221173523.3015715-3-vdonnef...@google.com
Signed-off-by: Vincent Donnefort
Signed-off-by: Steven Ros
I'm sending this to a wider audience, as I want to hear more
feedback on this before I accept it.
Vincent has been working on allowing the ftrace ring buffer to be
memory mapped into user space. This has been going on since
last year, where we talked at the 2022 Tracing Summit in London.
Vincen
://lore.kernel.org/linux-trace-kernel/20231221173523.3015715-2-vdonnef...@google.com
Signed-off-by: Vincent Donnefort
Signed-off-by: Steven Rostedt (Google)
---
include/linux/ring_buffer.h | 7 +
include/uapi/linux/trace_mmap.h | 29 +++
kernel/trace/ring_buffer.c | 382
From: "Steven Rostedt (Google)"
Masami Hiramatsu reported a memory leak in register_ftrace_direct() where
if the number of new entries added is large enough to cause two
allocations in the loop:
for (i = 0; i < size; i++) {
hlist_for_each_entry(entry, &
On Wed, 27 Dec 2023 21:38:25 +0900
"Masami Hiramatsu (Google)" wrote:
> From: Masami Hiramatsu (Google)
>
> If ftrace_register_direct() called with a large number of target
There's no function called "ftrace_register_direct()", I guess you meant
register_ftrace_direct()?
> functions (e.g. 65)
On Wed, 27 Dec 2023 07:57:08 +0900
Masami Hiramatsu (Google) wrote:
> On Tue, 26 Dec 2023 12:59:02 -0500
> Steven Rostedt wrote:
>
> > From: "Steven Rostedt (Google)"
> >
> > The tracefs file "buffer_percent" is to allow user space to set a
From: "Steven Rostedt (Google)"
If an application blocks on the snapshot or snapshot_raw files, expecting
to be woken up when a snapshot occurs, it will not happen. Or it may
happen with an unexpected result.
That result is that the application will be reading the main buffer
inst
From: "Steven Rostedt (Google)"
The tracefs file "buffer_percent" is to allow user space to set a
water-mark on how much of the tracing ring buffer needs to be filled in
order to wake up a blocked reader.
0 - is to wait until any data is in the buffer
1 - is to wait for 1
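A minimal user-space sketch of the water-mark test, using the comparison that appears in the full_hit() fragment quoted elsewhere in this listing, (dirty * 100) > (full * nr_pages). The function name watermark_hit() and its standalone signature are hypothetical; the kernel function takes different arguments and may handle edge cases this sketch does not.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true when a blocked reader should be
 * woken. "dirty" is the number of sub-buffers holding data, "nr_pages"
 * the total number of sub-buffers, "full" the buffer_percent value.
 * With full == 0 the test reduces to dirty > 0, i.e. any data at all
 * (at sub-buffer granularity in this sketch).
 */
static bool watermark_hit(size_t dirty, size_t nr_pages, unsigned int full)
{
	return (dirty * 100) > ((size_t)full * nr_pages);
}
```

Cross-multiplying avoids an integer division, so a buffer with 5 of 10 sub-buffers dirty does not hit a 50 percent mark, while 6 of 10 does.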
From: "Steven Rostedt (Google)"
It was reported that when mounting the tracefs file system with a gid
other than root, the ownership did not carry down to the eventfs directory
due to the dynamic nature of it.
A fix was done to solve this, but it had two issues.
(a) if the attr p
On Thu, 21 Dec 2023 17:35:22 +
Vincent Donnefort wrote:
> @@ -5999,6 +6078,307 @@ int ring_buffer_subbuf_order_set(struct trace_buffer
> *buffer, int order)
> }
> EXPORT_SYMBOL_GPL(ring_buffer_subbuf_order_set);
>
The kernel developers have agreed to allow loop variables to be declared
On Thu, 21 Dec 2023 17:35:22 +
Vincent Donnefort wrote:
> @@ -739,6 +747,22 @@ static __always_inline bool full_hit(struct trace_buffer
> *buffer, int cpu, int f
> return (dirty * 100) > (full * nr_pages);
> }
>
> +static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buff
On Thu, 21 Dec 2023 14:51:29 +
David Laight wrote:
> > I think 1kb units is perfectly fine (patch 15 changes to kb units). The
> > interface says it's to define the minimal size of the sub-buffer, not the
> > actual size.
>
> I didn't read that far through :-(
>
Well, this isn't a normal
On Thu, 21 Dec 2023 11:06:38 +0100
Alexander Graf wrote:
> Thanks a bunch for the super quick turnaround time for the fix! I can
> confirm that I'm no longer seeing the warning :)
>
> Tested-by: Alexander Graf
Thanks Alex,
>
>
> Do we need another similar patch for the kprobe self tests? T
On Thu, 21 Dec 2023 09:17:55 +
David Laight wrote:
> > Unfortunately, it has to be PAGE_SIZE (and for now it's a power of 2 to
> > make masking easy). It's used for splice and will also be used for memory
> > mapping with user space.
>
> Perhaps then the sysctl to set the size should be po
On Thu, 21 Dec 2023 09:26:21 +0900
Masami Hiramatsu (Google) wrote:
> > If the user specifies 3 via:
> >
> > echo 3 > buffer_subbuf_size_kb
> >
> > Then the sub-buffer size will round up to 4kb (on a 4kb page size system).
> >
> > If they specify:
> >
> > echo 6 > buffer_subbuf_size_kb
>
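The rounding in the quoted text (3 becomes 4kb on a 4kb page size system) can be sketched as follows. The helper names and the SKETCH_PAGE_KB constant are hypothetical, and the sketch assumes the sub-buffer size is always the page size shifted by an order, as another message in this listing says it is "for now". It does no bounds checking on the request.

```c
/* Assumption for the sketch: a 4kb page size system. */
#define SKETCH_PAGE_KB 4UL

/* Smallest order whose sub-buffer size is at least "kb" kilobytes. */
static unsigned int kb_to_order(unsigned long kb)
{
	unsigned int order = 0;

	/* Double the sub-buffer size until it covers the request. */
	while ((SKETCH_PAGE_KB << order) < kb)
		order++;
	return order;
}

/* Sub-buffer size in kilobytes for a given order. */
static unsigned long order_to_kb(unsigned int order)
{
	return SKETCH_PAGE_KB << order;
}
```

Under these assumptions, order_to_kb(kb_to_order(3)) gives 4, matching the first example in the quoted text.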
On Tue, 19 Dec 2023 17:21:23 -0800
Bixuan Cui wrote:
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index b99cd28c9815..02868bdc5999 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -395,7 +395,24 @@ TRACE_EVENT(mm_vmscan_writ
On Thu, 21 Dec 2023 01:34:56 +0900
Masami Hiramatsu (Google) wrote:
> On Tue, 19 Dec 2023 13:54:18 -0500
> Steven Rostedt wrote:
>
> > From: "Tzvetomir Stoyanov (VMware)"
> >
> > There are two approaches when changing the size of the ring buffer
> >
On Thu, 14 Dec 2023 19:45:02 +0100
Ahelenia Ziemiańska wrote:
> Otherwise we risk sleeping with the pipe locked for indeterminate
> lengths of time.
This change log is really lacking.
Why is this an issue?
> Link:
> https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekka
On Thu, 21 Dec 2023 01:23:14 +0900
Masami Hiramatsu (Google) wrote:
> On Tue, 19 Dec 2023 13:54:20 -0500
> Steven Rostedt wrote:
>
> > From: "Steven Rostedt (Google)"
> >
> > On failure to allocate ring buffer pages, the pointer to the CPU buffer
> >
From: "Steven Rostedt (Google)"
The synth_event_gen_test module can be built in, if someone wants to run
the tests at boot up and not have to load them.
The synth_event_gen_test_init() function creates and enables the synthetic
events and runs its tests.
The synth_event_gen
On Wed, 20 Dec 2023 10:50:17 -0500
Steven Rostedt wrote:
> From: "Steven Rostedt (Google)"
>
> Dongliang reported:
>
> I found that in the latest version, the nodes of tracefs have been
> changed to dynamically created.
>
> This has caused me to e
On Wed, 20 Dec 2023 13:49:30 +
Vincent Donnefort wrote:
> I meant, to only do in rb_wake_up_waiters()
>
> if (rbwork->is_cpu_buffer)
> rb_update_meta_page(cpu_buffer)
>
> And skip the meta-page update for the !is_cpu_buffer case?
Ah yeah, that works.
-- Steve
From: "Steven Rostedt (Google)"
Dongliang reported:
I found that in the latest version, the nodes of tracefs have been
changed to dynamically created.
This has caused me to encounter a problem where the gid I specified in
the mounting parameters cannot apply to all files,
On Wed, 20 Dec 2023 23:26:19 +0900
Masami Hiramatsu (Google) wrote:
> On Tue, 19 Dec 2023 13:54:17 -0500
> Steven Rostedt wrote:
>
> > +/**
> > + * ring_buffer_subbuf_order_set - set the size of ring buffer sub page.
> > + * @buffer: The ring_buffer to set the ne
On Wed, 20 Dec 2023 17:15:06 +0800
Dongliang Cui wrote:
> I found that in the latest version, the nodes of
> tracefs have been changed to dynamically created.
>
> This has caused me to encounter a problem where
> the gid I specified in the mounting parameters
> cannot apply to all files, as i
On Wed, 20 Dec 2023 13:06:06 +
Vincent Donnefort wrote:
> > @@ -771,10 +772,20 @@ static void rb_update_meta_page(struct
> > ring_buffer_per_cpu *cpu_buffer)
> > static void rb_wake_up_waiters(struct irq_work *work)
> > {
> > struct rb_irq_work *rbwork = container_of(work, struct rb_ir
From: "Steven Rostedt (Google)"
It's been 11 years since the ring_buffer_size() function was updated to
use the nr_pages from the buffer->buffers[cpu] structure instead of using
the buffer->nr_pages that no longer exists.
The comment in the code is more of what a change l
On Wed, 20 Dec 2023 08:48:02 +
David Laight wrote:
> From: Steven Rostedt
> > Sent: 19 December 2023 18:54
> > From: "Tzvetomir Stoyanov (VMware)"
> >
> > Currently the size of one sub buffer page is global for all buffers and
> > it is hard code
On Wed, 20 Dec 2023 18:48:43 +0900
Masami Hiramatsu (Google) wrote:
> > From: "Tzvetomir Stoyanov (VMware)"
> >
> > In order to introduce sub-buffer size per ring buffer, some internal
> > refactoring is needed. As ring_buffer_print_page_header() will depend on
> > the trace_buffer structure, i
From: "Steven Rostedt (Google)"
The comparisons to PAGE_SIZE were all converted to use the
buffer->subbuf_order, but the use of PAGE_MASK was missed.
Convert all the PAGE_MASK usages over to:
(PAGE_SIZE << cpu_buffer->buffer->subbuf_order) - 1
Fixes: TBD ("r
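To make the replacement expression concrete, here is a small user-space sketch of (PAGE_SIZE << subbuf_order) - 1 and how such a value isolates an offset inside a sub-buffer. SKETCH_PAGE_SIZE and both helper names are hypothetical; the sketch assumes 4K pages and sub-buffers aligned to their own size, and it does not show where the kernel applies the mask.

```c
/* Assumption for the sketch: 4K pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Low-bit mask covering every byte offset inside one sub-buffer. */
static unsigned long subbuf_mask(unsigned int subbuf_order)
{
	return (SKETCH_PAGE_SIZE << subbuf_order) - 1;
}

/* Offset of "addr" within its (naturally aligned) sub-buffer. */
static unsigned long subbuf_offset(unsigned long addr,
				   unsigned int subbuf_order)
{
	return addr & subbuf_mask(subbuf_order);
}
```

With order 0 the mask is 4095, the same low bits a single page covers. With order 1 it widens to 8191, so an offset that a page-sized mask would truncate is kept intact.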
On Tue, 19 Dec 2023 18:45:54 +
Vincent Donnefort wrote:
> The tracing ring-buffers can be stored on disk or sent to network
> without any copy via splice. However the latter doesn't allow real time
> processing of the traces. A solution is to give userspace direct access
> to the ring-buffer p
From: "Steven Rostedt (Google)"
Using page order for deciding what the size of the ring buffer sub buffers
are is exposing a bit too much of the implementation. Although the sub
buffers are only allocated in orders of pages, allow the user to specify
the minimum size of each sub-
From: "Steven Rostedt (Google)"
Add a self test that will write into the trace buffer with different trace
sub buffer order sizes.
Signed-off-by: Steven Rostedt (Google)
---
.../ftrace/test.d/00basic/ringbuffer_order.tc | 95 +++
1 file changed, 95 insertions(+)
c
From: "Steven Rostedt (Google)"
Add to the documentation how to use the buffer_subbuf_order file to change
the size and how it affects what events can be added to the ring buffer.
Signed-off-by: Steven Rostedt (Google)
---
Documentation/trace/ftrace.rst | 27
From: "Steven Rostedt (Google)"
The ring_buffer_subbuf_order_set() was creating ring_buffer_per_cpu
cpu_buffers with the new subbuffers with the updated order, and if they
all successfully were created, then the ring_buffer's per_cpu buffers
would be freed and replaced by the
From: "Steven Rostedt (Google)"
When updating the order of the sub buffers for the main buffer, make sure
that if the snapshot buffer exists, that it gets its order updated as
well.
Signed-off-by: Steven Rostedt (Google)
---
kernel/trace/tr
From: "Steven Rostedt (Google)"
The function ring_buffer_subbuf_order_set() just updated the sub-buffers
to the new size, but this also changes the size of the buffer in doing so.
As the size is determined by nr_pages * subbuf_size. If the subbuf_size is
increased without decreasing th
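The size relation in this snippet (total size = nr_pages * subbuf_size) suggests recomputing the sub-buffer count when the order changes. The sketch below is one way to do that arithmetic in user space. The helper names, the 4K page constant and the choice to round up are all assumptions for illustration; the truncated text does not show how the patch itself does it.

```c
#include <stddef.h>

/* Assumption for the sketch: 4K pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Size in bytes of one sub-buffer of the given order. */
static size_t sketch_subbuf_size(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/*
 * Number of sub-buffers needed at new_order to hold at least the same
 * total bytes as nr_pages sub-buffers did at old_order (rounds up).
 */
static size_t rescale_nr_pages(size_t nr_pages, unsigned int old_order,
			       unsigned int new_order)
{
	size_t total = nr_pages * sketch_subbuf_size(old_order);
	size_t new_size = sketch_subbuf_size(new_order);

	return (total + new_size - 1) / new_size;
}
```

For example, 8 sub-buffers of order 0 become 4 of order 1, keeping the total at 32K instead of doubling it.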
From: "Steven Rostedt (Google)"
Because the main buffer and the snapshot buffer need to be the same for
some tracers, otherwise it will fail and disable all tracing, the tracers
need to be stopped while updating the sub buffer sizes so that the tracers
see the main and snapshot buffer
From: "Steven Rostedt (Google)"
Now that the ring buffer specifies the size of its sub buffers, they all
need to be the same size. When doing a read, a swap is done with a spare
page. Make sure they are the same size before doing the swap, otherwise
the read will fail.
Signed-off-
From: "Steven Rostedt (Google)"
As all the subbuffer order (subbuffer sizes) must be the same throughout
the ring buffer, check the order of the buffers that are doing a CPU
buffer swap in ring_buffer_swap_cpu() to make sure they are the same.
If they are not the same, then fail to d
From: "Steven Rostedt (Google)"
On failure to allocate ring buffer pages, the pointer to the CPU buffer
pages is freed, but the pages that were allocated previously were not.
Make sure they are freed too.
Fixes: TBD ("tracing: Set new size of the ring buffer sub page")
S
race-devel/20211213094825.61876-5-tz.stoya...@gmail.com
Signed-off-by: Tzvetomir Stoyanov (VMware)
Signed-off-by: Steven Rostedt (Google)
---
kernel/trace/ring_buffer.c | 80 ++
1 file changed, 73 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/ke
_read_page() ]
Signed-off-by: Steven Rostedt (Google)
---
include/linux/ring_buffer.h | 11 ++--
kernel/trace/ring_buffer.c | 75
kernel/trace/ring_buffer_benchmark.c | 10 ++--
kernel/trace/trace.c | 34 +++--
4 files changed, 89
zvetomir Stoyanov (VMware)
Signed-off-by: Steven Rostedt (Google)
---
include/linux/ring_buffer.h | 4 ++
kernel/trace/ring_buffer.c | 73 +
kernel/trace/trace.c| 48
3 files changed, 125 insertions(+)
diff --git a/include/linux/ring
fer.
Link:
https://lore.kernel.org/linux-trace-devel/20211213094825.61876-3-tz.stoya...@gmail.com
Signed-off-by: Tzvetomir Stoyanov (VMware)
Signed-off-by: Steven Rostedt (Google)
---
include/linux/ring_buffer.h | 2 +-
kernel/trace/ring_buffer.c | 68 +--
inux-trace-devel/20211213094825.61876-2-tz.stoya...@gmail.com
Signed-off-by: Tzvetomir Stoyanov (VMware)
Signed-off-by: Steven Rostedt (Google)
---
kernel/trace/ring_buffer.c | 60 +++---
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/ke
Note, this has been on my todo list since the ring buffer was created back
in 2008.
Tzvetomir last worked on this in 2021 and I need to finally get it in.
His last series was:
https://lore.kernel.org/linux-trace-devel/20211213094825.61876-1-tz.stoya...@gmail.com/
With the description of:
On Tue, 19 Dec 2023 10:36:13 -0500
Steven Rostedt wrote:
> |-- interrupt event --|-- normal context event --|-- interrupt event --|
>
> ^^ ^
> || |
> ts is befo
On Tue, 19 Dec 2023 10:10:27 -0500
Steven Rostedt wrote:
> 1000 - interrupt event
> 2000 - normal context event
> 2100 - next normal context event
>
> Where we see the delta between the interrupt event and the normal context
> event was 1000. But if we just had it be delt
On Tue, 19 Dec 2023 23:37:10 +0900
Masami Hiramatsu (Google) wrote:
> Yeah the above works, but my question is, do we really need this
> really slow path? I mean;
>
> > if (w == write - event length) {
> > /* Nothing interrupted between A and C */
> > /*E*/ write_st
From: "Steven Rostedt (Google)"
The check_buffer() which checks the timestamps of the ring buffer
sub-buffer page, when enabled, only checks if the adding of deltas of the
events from the last absolute timestamp or the timestamp of the sub-buffer
page adds up to the current event.
Wh
From: "Steven Rostedt (Google)"
When the ring buffer timestamp verifier triggers, it dumps the content of
the sub-buffer. But currently it only dumps the timestamps and the offset
of the data as well as the deltas. It would be even more informative if
the event data also showed the
From: "Steven Rostedt (Google)"
Each event has a 27 bit timestamp delta that is used to hold the delta
from the last event. If the time between events is greater than 2^27, then
a timestamp is added that holds a 59 bit absolute timestamp.
Until a389d86f7fd09 ("ring-buffer: Hav
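The two field widths mentioned in this snippet can be illustrated with a short user-space sketch. The macro and function names are hypothetical, and the sketch only shows the bit arithmetic: whether a delta fits in 27 bits, and what survives when a 64-bit timestamp is stored in 59 bits. It says nothing about how the ring buffer encodes either value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DELTA_BITS 27	/* per the text: 27 bit timestamp delta */
#define SKETCH_ABS_BITS   59	/* per the text: 59 bit absolute timestamp */

/* True when "delta" can be stored in the 27-bit per-event field. */
static bool delta_fits(uint64_t delta)
{
	return delta < (UINT64_C(1) << SKETCH_DELTA_BITS);
}

/* The part of a 64-bit timestamp that a 59-bit field can hold. */
static uint64_t abs_ts_truncate(uint64_t ts)
{
	return ts & ((UINT64_C(1) << SKETCH_ABS_BITS) - 1);
}
```

Any delta for which delta_fits() returns false is the case where, per the text, a separate absolute timestamp has to be added.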
From: "Steven Rostedt (Google)"
To synchronize the timestamps with the ring buffer reservation, there are
two timestamps that are saved in the buffer meta data.
1. before_stamp
2. write_stamp
When the two are equal, the write_stamp is considered valid, as in, it may
be used to cal
From: "Steven Rostedt (Google)"
The check_buffer() which checks the timestamps of the ring buffer
sub-buffer page, when enabled, only checks if the adding of deltas of the
events from the last absolute timestamp or the timestamp of the sub-buffer
page adds up to the current event.
Wh
On Mon, 18 Dec 2023 17:01:06 -0500
Steven Rostedt wrote:
> @@ -3347,7 +3418,8 @@ static void check_buffer(struct ring_buffer_per_cpu
> *cpu_buffer,
> }
> }
> if ((full && ts > info->ts) ||
> - (!full && ts + info->del
On Mon, 18 Dec 2023 13:42:40 -0500
Steven Rostedt wrote:
> > >
> > > > static bool rb_time_cmp_and_update(rb_time_t *t, u64 expect, u64 set)
> > > > {
> > > > - return rb_time_cmpxchg(t, expect, set);
> > > > +#ifdef RB_T
From: "Steven Rostedt (Google)"
When the ring buffer timestamp verifier triggers, it dumps the content of
the sub-buffer. But currently it only dumps the timestamps and the offset
of the data as well as the deltas. It would be even more informative if
the event data also showed the
rb_time_cmpxchg() to work the same for both 64-bit
and 32-bit.
- Fixed reading t->time to use local64_read() and not READ_ONCE().
Steven Rostedt (Google) (2):
ring-buffer: Replace rb_time_cmpxchg() with rb_time_cmp_and_update()
ring-buffer: Remove 32bit timestamp logic
From: "Steven Rostedt (Google)"
There's only one place that performs a 64-bit cmpxchg for the timestamp
processing. The cmpxchg is only to set the write_stamp equal to the
before_stamp, and if it doesn't get set, then the next event will simply
be forced to add an absolute ti
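The single compare-and-exchange described here can be sketched with C11 atomics in user space. Everything named below (struct stamp_pair, write_stamp_valid(), sync_write_stamp()) is hypothetical and stands in for the kernel's rb_time_t helpers; the validity rule (write_stamp is valid when it equals before_stamp) is taken from another snippet in this listing.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of timestamps kept alongside a buffer. */
struct stamp_pair {
	_Atomic uint64_t before_stamp;
	_Atomic uint64_t write_stamp;
};

/* The write_stamp is considered valid only when the two are equal. */
static bool write_stamp_valid(struct stamp_pair *s)
{
	return atomic_load(&s->before_stamp) == atomic_load(&s->write_stamp);
}

/*
 * Try to set write_stamp equal to before_stamp, but only if write_stamp
 * still holds the value the caller last saw. On failure nothing is
 * retried: the stamps stay unequal, and the next event is expected to
 * fall back to an absolute timestamp.
 */
static bool sync_write_stamp(struct stamp_pair *s, uint64_t expect)
{
	uint64_t before = atomic_load(&s->before_stamp);

	return atomic_compare_exchange_strong(&s->write_stamp, &expect,
					      before);
}
```

The design point the snippet makes is that a failed exchange is harmless, which is why only one cmpxchg site is needed.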
From: "Steven Rostedt (Google)"
Each event has a 27 bit timestamp delta that is used to hold the delta
from the last event. If the time between events is greater than 2^27, then
a timestamp is added that holds a 59 bit absolute timestamp.
Until a389d86f7fd09 ("ring-buffer: Hav
On Mon, 18 Dec 2023 10:15:31 -0500
Steven Rostedt wrote:
> Basically I broke it into:
>
> 1. Remove workaround exposure from the main logic. (this patch)
> 2. Remove the workaround. (next patch).
>
> >
> > Isn't this part actual change?
>
> This part
On Mon, 18 Dec 2023 11:28:17 -0500
Steven Rostedt wrote:
> > Remove all references to sub-buffer and replace them with either bpage
> > or ring_buffer_page.
The user interface should not be changed.
But what I would like to have changed (and this will come after all other
On Mon, 18 Dec 2023 15:46:18 +
Vincent Donnefort wrote:
> The ability to change the ring-buffer page size was introduced
> previously. It also introduced the concept of a sub-buffer, that is, a contiguous
> virtual memory space which can now be bigger than the system page size
> (4K on most syst
From: "Steven Rostedt (Google)"
Normally, when the filter is enabled, a temporary buffer is created to
copy the event data into it to perform the filtering logic. If the filter
passes and the event should be recorded, then the event is copied from the
temporary buffer into the ring
On Mon, 18 Dec 2023 23:24:55 +0900
Masami Hiramatsu (Google) wrote:
> On Fri, 15 Dec 2023 11:55:13 -0500
> Steven Rostedt wrote:
>
> > From: "Steven Rostedt (Google)"
> >
> > There's only one place that performs a 64-bit cmpxchg for the timestamp
On Sun, 17 Dec 2023 17:10:45 +0900
Masami Hiramatsu (Google) wrote:
> > >> It exposes the following details which IMHO should be hidden or
> > >> configurable in a way that allows moving to a whole new mechanism
> > >> which will have significantly different characteristics in the
> > >> future:
>
On Fri, 15 Dec 2023 13:25:07 -0500
Mathieu Desnoyers wrote:
>
> I am not against exposing an ABI that allows userspace to alter the
> filter behavior. I disagree on the way you plan to expose the ABI.
These are no different than the knobs for sched_debug
>
> Exposing this option as an ABI in t