Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-03 Thread Beau Belgrave
On Tue, Jul 02, 2024 at 12:51:26PM -0400, Steven Rostedt wrote:
> On Tue, 2 Jul 2024 11:32:53 -0400
> Mathieu Desnoyers  wrote:
> 
> > If we use '*' for user events already, perhaps we'd want to consider
> > using the same range for the ring buffer ioctls ? Arguably one is
> > about instrumentation and the other is about ring buffer interaction
> > (data transport), but those are both related to tracing.
> 
> Yeah, but I still rather keep them separate.
> 
> Beau, care to send a patch adding an entry into that ioctl document for
> user events?
> 
> -- Steve

Sure thing, sent one out [1].

Thanks,
-Beau

1. 
https://lore.kernel.org/linux-doc/20240703222501.1547-1-be...@linux.microsoft.com/



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Mathieu Desnoyers

On 2024-07-02 12:51, Steven Rostedt wrote:

On Tue, 2 Jul 2024 11:32:53 -0400
Mathieu Desnoyers  wrote:


If we use '*' for user events already, perhaps we'd want to consider
using the same range for the ring buffer ioctls ? Arguably one is
about instrumentation and the other is about ring buffer interaction
(data transport), but those are both related to tracing.


Yeah, but I still rather keep them separate.


No objection, I'm OK either way.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Dmitry V. Levin
On Tue, Jul 02, 2024 at 11:18:07AM -0400, Steven Rostedt wrote:
> On Tue, 2 Jul 2024 10:36:03 -0400
> Mathieu Desnoyers  wrote:
> 
> > > I can send a patch this week to update it. Or feel free to send a patch
> > > yourself.  
> > 
> > You need to reserve an unused ioctl Code and Seq# range within:
> > 
> > Documentation/userspace-api/ioctl/ioctl-number.rst
> 
> Ug, it's been so long, I completely forgot about that file.
> 
> Thanks for catching this.
> 
> > 
> > Otherwise this duplicate will confuse all system call instrumentation
> > tooling.
> 
> Agreed, what if we did this then:
> 
> -- Steve
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
> b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index a141e8e65c5d..9a97030c6c8d 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -186,6 +186,7 @@ Code  Seq#Include File
>Comments
>  'Q'   alllinux/soundcard.h
>  'R'   00-1F  linux/random.h  
> conflict!
>  'R'   01 linux/rfkill.h  
> conflict!
> +'R'   20-2F  linux/trace_mmap.h
>  'R'   C0-DF  net/bluetooth/rfcomm.h
>  'R'   E0 uapi/linux/fsl_mc.h
>  'S'   alllinux/cdrom.h   
> conflict!

Just in case, I've checked the list of ioctls known to strace and can
confirm that there are no users of 'R' ioctl code in 0x20..0x2f range yet.

By the way, this file is definitely not up to date, the 'R' part of it
should have contained the following:

'R'   00-1F  uapi/linux/random.h conflict!
'R'   01-02  uapi/linux/rfkill.h conflict!
'R'   01-0D  uapi/misc/fastrpc.h conflict!
'R'   C0-DF  net/bluetooth/rfcomm.h
'R'   E0 uapi/linux/fsl_mc.h


-- 
ldv



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Steven Rostedt
On Tue, 2 Jul 2024 11:32:53 -0400
Mathieu Desnoyers  wrote:

> If we use '*' for user events already, perhaps we'd want to consider
> using the same range for the ring buffer ioctls ? Arguably one is
> about instrumentation and the other is about ring buffer interaction
> (data transport), but those are both related to tracing.

Yeah, but I still rather keep them separate.

Beau, care to send a patch adding an entry into that ioctl document for
user events?

-- Steve



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Dmitry V. Levin
On Tue, Jul 02, 2024 at 11:32:53AM -0400, Mathieu Desnoyers wrote:
[...]
> Note that user events also has this issue: the ioctl is not reserved in
> the ioctl-number.rst list. See include/uapi/linux/user_events.h:
> 
> #define DIAG_IOC_MAGIC '*'
> 
> /* Request to register a user_event */
> #define DIAG_IOCSREG _IOWR(DIAG_IOC_MAGIC, 0, struct user_reg *)
> 
> /* Request to delete a user_event */
> #define DIAG_IOCSDEL _IOW(DIAG_IOC_MAGIC, 1, char *)
> 
> /* Requests to unregister a user_event */
> #define DIAG_IOCSUNREG _IOW(DIAG_IOC_MAGIC, 2, struct user_unreg*)
> 
> Where '*' maps to Code 0x2A. Looking at the list I don't see any
> conflicts there, but we should definitely add it.
> 
> If we use '*' for user events already, perhaps we'd want to consider
> using the same range for the ring buffer ioctls ? Arguably one is
> about instrumentation and the other is about ring buffer interaction
> (data transport), but those are both related to tracing.

FWIW, I've checked the list of ioctls known to strace and can confirm that
currently there are no more users of 0x2a ioctl code besides these three
DIAG_* ioctls.


-- 
ldv



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Mathieu Desnoyers

On 2024-07-02 11:18, Steven Rostedt wrote:

On Tue, 2 Jul 2024 10:36:03 -0400
Mathieu Desnoyers  wrote:


I can send a patch this week to update it. Or feel free to send a patch
yourself.


You need to reserve an unused ioctl Code and Seq# range within:

Documentation/userspace-api/ioctl/ioctl-number.rst


Ug, it's been so long, I completely forgot about that file.

Thanks for catching this.



Otherwise this duplicate will confuse all system call instrumentation
tooling.


Agreed, what if we did this then:

-- Steve

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a141e8e65c5d..9a97030c6c8d 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -186,6 +186,7 @@ Code  Seq#Include File  
 Comments
  'Q'   alllinux/soundcard.h
  'R'   00-1F  linux/random.h  conflict!
  'R'   01 linux/rfkill.h  conflict!
+'R'   20-2F  linux/trace_mmap.h
  'R'   C0-DF  net/bluetooth/rfcomm.h
  'R'   E0 uapi/linux/fsl_mc.h
  'S'   alllinux/cdrom.h   conflict!
diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
index bd1066754220..c102ef35d11e 100644
--- a/include/uapi/linux/trace_mmap.h
+++ b/include/uapi/linux/trace_mmap.h
@@ -43,6 +43,6 @@ struct trace_buffer_meta {
__u64   Reserved2;
  };
  
-#define TRACE_MMAP_IOCTL_GET_READER		_IO('T', 0x1)

+#define TRACE_MMAP_IOCTL_GET_READER_IO('R', 0x20)


Note that user events also has this issue: the ioctl is not reserved in
the ioctl-number.rst list. See include/uapi/linux/user_events.h:

#define DIAG_IOC_MAGIC '*'

/* Request to register a user_event */
#define DIAG_IOCSREG _IOWR(DIAG_IOC_MAGIC, 0, struct user_reg *)

/* Request to delete a user_event */
#define DIAG_IOCSDEL _IOW(DIAG_IOC_MAGIC, 1, char *)

/* Requests to unregister a user_event */
#define DIAG_IOCSUNREG _IOW(DIAG_IOC_MAGIC, 2, struct user_unreg*)

Where '*' maps to Code 0x2A. Looking at the list I don't see any
conflicts there, but we should definitely add it.

If we use '*' for user events already, perhaps we'd want to consider
using the same range for the ring buffer ioctls ? Arguably one is
about instrumentation and the other is about ring buffer interaction
(data transport), but those are both related to tracing.

Thanks,

Mathieu

  
  #endif /* _TRACE_MMAP_H_ */


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Steven Rostedt
On Tue, 2 Jul 2024 10:36:03 -0400
Mathieu Desnoyers  wrote:

> > I can send a patch this week to update it. Or feel free to send a patch
> > yourself.  
> 
> You need to reserve an unused ioctl Code and Seq# range within:
> 
> Documentation/userspace-api/ioctl/ioctl-number.rst

Ug, it's been so long, I completely forgot about that file.

Thanks for catching this.

> 
> Otherwise this duplicate will confuse all system call instrumentation
> tooling.

Agreed, what if we did this then:

-- Steve

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a141e8e65c5d..9a97030c6c8d 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -186,6 +186,7 @@ Code  Seq#Include File  
 Comments
 'Q'   alllinux/soundcard.h
 'R'   00-1F  linux/random.h  conflict!
 'R'   01 linux/rfkill.h  conflict!
+'R'   20-2F  linux/trace_mmap.h
 'R'   C0-DF  net/bluetooth/rfcomm.h
 'R'   E0 uapi/linux/fsl_mc.h
 'S'   alllinux/cdrom.h   conflict!
diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
index bd1066754220..c102ef35d11e 100644
--- a/include/uapi/linux/trace_mmap.h
+++ b/include/uapi/linux/trace_mmap.h
@@ -43,6 +43,6 @@ struct trace_buffer_meta {
__u64   Reserved2;
 };
 
-#define TRACE_MMAP_IOCTL_GET_READER_IO('T', 0x1)
+#define TRACE_MMAP_IOCTL_GET_READER_IO('R', 0x20)
 
 #endif /* _TRACE_MMAP_H_ */



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Mathieu Desnoyers

On 2024-06-30 08:40, Steven Rostedt wrote:

On Sun, 30 Jun 2024 13:53:23 +0300
"Dmitry V. Levin"  wrote:


On Fri, May 10, 2024 at 03:04:32PM +0100, Vincent Donnefort wrote:
[...]

diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
index b682e9925539..bd1066754220 100644
--- a/include/uapi/linux/trace_mmap.h
+++ b/include/uapi/linux/trace_mmap.h
@@ -43,4 +43,6 @@ struct trace_buffer_meta {
__u64   Reserved2;
  };
  
+#define TRACE_MMAP_IOCTL_GET_READER		_IO('T', 0x1)

+


I'm sorry but among all the numbers this one was probably the least
fortunate choice because it collides with TCGETS on most of architectures.


Hmm, that is unfortunate.



For example, this is how strace output would look like when
TRACE_MMAP_IOCTL_GET_READER support is added:

$ strace -e ioctl stty
ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, 
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, 
c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0

Even though ioctl numbers are inherently not unique, TCGETS is
a very traditional one, so it would be great if you could change
TRACE_MMAP_IOCTL_GET_READER to avoid this collision.

Given that _IO('T', 0x1) is _IOC(_IOC_NONE, 'T', 0x1, 0),
something like _IOC(_IOC_NONE, 'T', 0x1, 0x1) should be OK.


Well, it may not be too late to update this as it hasn't been
officially released in 6.10 yet. It's still only in the -rc and the
library doesn't rely on this yet (I've been holding off until 6.10 was
officially released before releasing the library that uses it).

I can send a patch this week to update it. Or feel free to send a patch
yourself.


You need to reserve an unused ioctl Code and Seq# range within:

Documentation/userspace-api/ioctl/ioctl-number.rst

Otherwise this duplicate will confuse all system call instrumentation
tooling.

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-06-30 Thread Steven Rostedt
On Sun, 30 Jun 2024 13:53:23 +0300
"Dmitry V. Levin"  wrote:

> On Fri, May 10, 2024 at 03:04:32PM +0100, Vincent Donnefort wrote:
> [...]
> > diff --git a/include/uapi/linux/trace_mmap.h 
> > b/include/uapi/linux/trace_mmap.h
> > index b682e9925539..bd1066754220 100644
> > --- a/include/uapi/linux/trace_mmap.h
> > +++ b/include/uapi/linux/trace_mmap.h
> > @@ -43,4 +43,6 @@ struct trace_buffer_meta {
> > __u64   Reserved2;
> >  };
> >  
> > +#define TRACE_MMAP_IOCTL_GET_READER_IO('T', 0x1)
> > +  
> 
> I'm sorry but among all the numbers this one was probably the least
> fortunate choice because it collides with TCGETS on most of architectures.

Hmm, that is unfortunate.

> 
> For example, this is how strace output would look like when
> TRACE_MMAP_IOCTL_GET_READER support is added:
> 
> $ strace -e ioctl stty
> ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, 
> c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, 
> c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0
> 
> Even though ioctl numbers are inherently not unique, TCGETS is
> a very traditional one, so it would be great if you could change
> TRACE_MMAP_IOCTL_GET_READER to avoid this collision.
> 
> Given that _IO('T', 0x1) is _IOC(_IOC_NONE, 'T', 0x1, 0),
> something like _IOC(_IOC_NONE, 'T', 0x1, 0x1) should be OK.

Well, it may not be too late to update this as it hasn't been
officially released in 6.10 yet. It's still only in the -rc and the
library doesn't rely on this yet (I've been holding off until 6.10 was
officially released before releasing the library that uses it).

I can send a patch this week to update it. Or feel free to send a patch
yourself.

Thanks,

-- Steve



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-06-30 Thread Dmitry V. Levin
On Fri, May 10, 2024 at 03:04:32PM +0100, Vincent Donnefort wrote:
[...]
> diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
> index b682e9925539..bd1066754220 100644
> --- a/include/uapi/linux/trace_mmap.h
> +++ b/include/uapi/linux/trace_mmap.h
> @@ -43,4 +43,6 @@ struct trace_buffer_meta {
>   __u64   Reserved2;
>  };
>  
> +#define TRACE_MMAP_IOCTL_GET_READER  _IO('T', 0x1)
> +

I'm sorry but among all the numbers this one was probably the least
fortunate choice because it collides with TCGETS on most of architectures.

For example, this is how strace output would look like when
TRACE_MMAP_IOCTL_GET_READER support is added:

$ strace -e ioctl stty
ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, 
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, 
c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0

Even though ioctl numbers are inherently not unique, TCGETS is
a very traditional one, so it would be great if you could change
TRACE_MMAP_IOCTL_GET_READER to avoid this collision.

Given that _IO('T', 0x1) is _IOC(_IOC_NONE, 'T', 0x1, 0),
something like _IOC(_IOC_NONE, 'T', 0x1, 0x1) should be OK.


-- 
ldv



[PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-05-10 Thread Vincent Donnefort
Currently, user-space extracts data from the ring-buffer via splice,
which is handy for storage or network sharing. However, due to splice
limitations, it is imposible to do real-time analysis without a copy.

A solution for that problem is to let the user-space map the ring-buffer
directly.

The mapping is exposed via the per-CPU file trace_pipe_raw. The first
element of the mapping is the meta-page. It is followed by each
subbuffer constituting the ring-buffer, ordered by their unique page ID:

  * Meta-page -- include/uapi/linux/trace_mmap.h for a description
  * Subbuf ID 0
  * Subbuf ID 1
 ...

It is therefore easy to translate a subbuf ID into an offset in the
mapping:

  reader_id = meta->reader->id;
  reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;

When new data is available, the mapper must call a newly introduced ioctl:
TRACE_MMAP_IOCTL_GET_READER. This will update the Meta-page reader ID to
point to the next reader containing unread data.

Mapping will prevent snapshot and buffer size modifications.

CC: 
Signed-off-by: Vincent Donnefort 

diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
index b682e9925539..bd1066754220 100644
--- a/include/uapi/linux/trace_mmap.h
+++ b/include/uapi/linux/trace_mmap.h
@@ -43,4 +43,6 @@ struct trace_buffer_meta {
__u64   Reserved2;
 };
 
+#define TRACE_MMAP_IOCTL_GET_READER_IO('T', 0x1)
+
 #endif /* _TRACE_MMAP_H_ */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 233d1af39fff..a35e7f598233 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1191,6 +1191,12 @@ static void tracing_snapshot_instance_cond(struct 
trace_array *tr,
return;
}
 
+   if (tr->mapped) {
+   trace_array_puts(tr, "*** BUFFER MEMORY MAPPED ***\n");
+   trace_array_puts(tr, "*** Can not use snapshot (sorry) ***\n");
+   return;
+   }
+
local_irq_save(flags);
update_max_tr(tr, current, smp_processor_id(), cond_data);
local_irq_restore(flags);
@@ -1323,7 +1329,7 @@ static int tracing_arm_snapshot_locked(struct trace_array 
*tr)
lockdep_assert_held(_types_lock);
 
spin_lock(>snapshot_trigger_lock);
-   if (tr->snapshot == UINT_MAX) {
+   if (tr->snapshot == UINT_MAX || tr->mapped) {
spin_unlock(>snapshot_trigger_lock);
return -EBUSY;
}
@@ -6068,7 +6074,7 @@ static void tracing_set_nop(struct trace_array *tr)
 {
if (tr->current_trace == _trace)
return;
-   
+
tr->current_trace->enabled--;
 
if (tr->current_trace->reset)
@@ -8194,15 +8200,32 @@ tracing_buffers_splice_read(struct file *file, loff_t 
*ppos,
return ret;
 }
 
-/* An ioctl call with cmd 0 to the ring buffer file will wake up all waiters */
 static long tracing_buffers_ioctl(struct file *file, unsigned int cmd, 
unsigned long arg)
 {
struct ftrace_buffer_info *info = file->private_data;
struct trace_iterator *iter = >iter;
+   int err;
+
+   if (cmd == TRACE_MMAP_IOCTL_GET_READER) {
+   if (!(file->f_flags & O_NONBLOCK)) {
+   err = ring_buffer_wait(iter->array_buffer->buffer,
+  iter->cpu_file,
+  iter->tr->buffer_percent,
+  NULL, NULL);
+   if (err)
+   return err;
+   }
 
-   if (cmd)
-   return -ENOIOCTLCMD;
+   return ring_buffer_map_get_reader(iter->array_buffer->buffer,
+ iter->cpu_file);
+   } else if (cmd) {
+   return -ENOTTY;
+   }
 
+   /*
+* An ioctl call with cmd 0 to the ring buffer file will wake up all
+* waiters
+*/
mutex_lock(_types_lock);
 
/* Make sure the waiters see the new wait_index */
@@ -8214,6 +8237,76 @@ static long tracing_buffers_ioctl(struct file *file, 
unsigned int cmd, unsigned
return 0;
 }
 
+#ifdef CONFIG_TRACER_MAX_TRACE
+static int get_snapshot_map(struct trace_array *tr)
+{
+   int err = 0;
+
+   /*
+* Called with mmap_lock held. lockdep would be unhappy if we would now
+* take trace_types_lock. Instead use the specific
+* snapshot_trigger_lock.
+*/
+   spin_lock(>snapshot_trigger_lock);
+
+   if (tr->snapshot || tr->mapped == UINT_MAX)
+   err = -EBUSY;
+   else
+   tr->mapped++;
+
+   spin_unlock(>snapshot_trigger_lock);
+
+   /* Wait for update_max_tr() to observe iter->tr->mapped */
+   if (tr->mapped == 1)
+   synchronize_rcu();
+
+   return err;
+
+}
+static void put_snapshot_map(struct trace_array *tr)
+{
+   spin_lock(>snapshot_trigger_lock);
+   if (!WARN_ON(!tr->mapped))
+