Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-05 Thread Kalesh Singh
On Tue, Aug 04, 2020 at 02:09:13AM +0100, Al Viro wrote:
> On Mon, Aug 03, 2020 at 11:28:31PM +0100, Al Viro wrote:
> 
> > IOW, what the hell is that horror for?  You do realize, for example, that 
> > there's
> > such thing as dup(), right?  And dup2() as well.  And while we are at it, 
> > how
> > do you keep track of removals, considering the fact that you can stick a 
> > file
> > reference into SCM_RIGHTS datagram sent to yourself, close descriptors and 
> > an hour
> > later pick that datagram, suddenly getting descriptor back?
> > 
> > Besides, "I have no descriptors left" != "I can't be currently sitting in 
> > the middle
> > of syscall on that sucker"; close() does *NOT* terminate ongoing operations.
> > 
> > You are looking at the drastically wrong abstraction level.  Please, 
> > describe what
> > it is that you are trying to achieve.

Hi Al. Thank you for the comments. Ultimately what we need is to identify 
processes
that hold a file reference to the dma-buf. Unfortunately we can't use only
explicit dma_buf_get/dma_buf_put to track them because when an FD is being 
shared
between processes the file references are taken implicitly.

For example, on the sender side:
   unix_dgram_sendmsg -> send_scm -> __send_scm -> scm_fp_copy -> fget_raw
and on the receiver side:
   unix_dgram_recvmsg -> scm_recv -> scm_detach_fds -> __scm_install_fd -> 
get_file

I understand now that fd_install is not an appropriate abstraction level to 
track these.
Is there a more appropriate alternative where we could use to track these 
implicit file
references?

> _IF_ it's "who keeps a particularly long-lived sucker pinned", I would suggest
> fuser(1) run when you detect that kind of long-lived dmabuf.  With events 
> generated
> by their constructors and destructors, and detection of longevity done based 
> on
> that.
> 
> But that's only a semi-blind guess at the things you are trying to achieve; 
> please,
> describe what it really is.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-05 Thread Al Viro
On Tue, Aug 04, 2020 at 03:44:51PM +, Kalesh Singh wrote:

> Hi Al. Thank you for the comments. Ultimately what we need is to identify 
> processes
> that hold a file reference to the dma-buf. Unfortunately we can't use only
> explicit dma_buf_get/dma_buf_put to track them because when an FD is being 
> shared
> between processes the file references are taken implicitly.
> 
> For example, on the sender side:
>unix_dgram_sendmsg -> send_scm -> __send_scm -> scm_fp_copy -> fget_raw
> and on the receiver side:
>unix_dgram_recvmsg -> scm_recv -> scm_detach_fds -> __scm_install_fd -> 
> get_file
> 
> I understand now that fd_install is not an appropriate abstraction level to 
> track these.
> Is there a more appropriate alternative where we could use to track these 
> implicit file
> references?

There is no single lock that would stabilize the descriptor tables of all
processes.  And there's not going to be one, ever - it would be a contention
point from hell, since that would've been a system-wide lock that would have
to be taken by *ALL* syscalls modifying any descriptor table.  Not going to
happen, for obvious reasons.  Moreover, you would have to have fork(2) take
the same lock, since it does copy descriptor table.  And clone(2) either does
the same, or has the child share the descriptor table of parent.

What's more, a reference to struct file can bloody well survive without
a single descriptor refering to that file.  In the example you've mentioned
above, sender has ever right to close all descriptors it has sent.   Files
will stay opened as long as the references are held in the datagram; when
that datagram is received, the references will be inserted into recepient's
descriptor table.  At that point you again have descriptors refering to
that file, can do any IO on it, etc.

So "the set of processes that hold a file reference to the dma-buf" is
* inherently unstable, unless you are willing to freeze every
process in the system except for the one trying to find that set.
* can remain empty for any amount of time (hours, weeks, whatever),
only to get non-empty later, with syscalls affecting the object in question
done afterwards.

So... what were you going to do with that set if you could calculate it?
If it's really "how do we debug a leak?", it's one thing; in that case
I would suggest keeping track of creation/destruction of objects (not
gaining/dropping references - actual constructors and destructors) to
see what gets stuck around for too long and use fuser(1) to try and locate
the culprits if you see that something *was* living for too long.  "Try"
since the only reference might indeed have been stashed into an SCM_RIGHTS
datagram sitting in a queue of some AF_UNIX socket.  Note that "fuser
needs elevated priveleges" is not a strong argument - the ability to
do that sort of tracking does imply elevated priveleges anyway, and
having a root process taking requests along the lines of "gimme the
list of PIDs that have such-and-such dma_buf in their descriptor table"
is not much of an attack surface.

If you want to use it for something else, you'll need to describe that
intended use; there might be sane ways to do that, but it's hard to
come up with one without knowing what's being attempted...
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Daniel Vetter
On Tue, Aug 4, 2020 at 12:28 AM Al Viro  wrote:
>
> On Mon, Aug 03, 2020 at 09:22:53AM -0700, Suren Baghdasaryan wrote:
> > On Mon, Aug 3, 2020 at 9:12 AM Matthew Wilcox  wrote:
> > >
> > > On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote:
> > > > On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox  
> > > > wrote:
> > > > >
> > > > > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote:
> > > > > > +static void dma_buf_fd_install(int fd, struct file *filp)
> > > > > > +{
> > > > > > + trace_dma_buf_fd_ref_inc(current, filp);
> > > > > > +}
> > > > >
> > > > > You're adding a new file_operation in order to just add a new 
> > > > > tracepoint?
> > > > > NACK.
> > > >
> > > > Hi Matthew,
> > > > The plan is to attach a BPF to this tracepoint in order to track
> > > > dma-buf users. If you feel this is an overkill, what would you suggest
> > > > as an alternative?
> > >
> > > I'm sure BPF can attach to fd_install and filter on file->f_ops belonging
> > > to dma_buf, for example.
> >
> > Sounds like a workable solution. Will explore that direction. Thanks 
> > Matthew!
>
> No, it is not a solution at all.
>
> What kind of locking would you use?  With _any_ of those approaches.
>
> How would you use the information that is hopelessly out of 
> date/incoherent/whatnot
> at the very moment you obtain it?
>
> IOW, what the hell is that horror for?  You do realize, for example, that 
> there's
> such thing as dup(), right?  And dup2() as well.  And while we are at it, how
> do you keep track of removals, considering the fact that you can stick a file
> reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an 
> hour
> later pick that datagram, suddenly getting descriptor back?
>
> Besides, "I have no descriptors left" != "I can't be currently sitting in the 
> middle
> of syscall on that sucker"; close() does *NOT* terminate ongoing operations.
>
> You are looking at the drastically wrong abstraction level.  Please, describe 
> what
> it is that you are trying to achieve.

For added entertainment (since this is specifically about dma-buf) you
can stuff them into various gpu drivers, and convert to a native gpu
driver handle thing. That's actually the expected use case, first a
buffer sharing gets established with AF_UNIX, then both sides close
the dma-buf fd handle.

GPU drivers then internally cache the struct file so that we can hand
out the same (to avoid confusion when re-importing it on some other
driver), so for the case of dma-buf the "it's not actually an
installed fd anywhere for unlimited time" is actually the normal
use-case, not some odd corner.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Kalesh Singh
On Mon, Aug 03, 2020 at 11:32:39AM -0400, Steven Rostedt wrote:
> On Mon,  3 Aug 2020 14:47:19 +
> Kalesh Singh  wrote:
> 
> > +DECLARE_EVENT_CLASS(dma_buf_ref_template,
> > +
> > +   TP_PROTO(struct task_struct *task, struct file *filp),
> > +
> > +   TP_ARGS(task,  filp),
> > +
> > +   TP_STRUCT__entry(
> > +   __field(u32, tgid)
> > +   __field(u32, pid)
> 
> I only see "current" passed in as "task". Why are you recording the pid
> and tgid as these are available by the tracing infrastructure.
> 
> At least the pid is saved at every event. You can see the tgid when
> enabling the "record_tgid".
> 
>  # trace-cmd start -e all -O record_tgid
>  # trace-cmd show
> 
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 39750/39750   #P:8
> #
> #  _-=> irqs-off
> # / _=> need-resched
> #| / _---=> hardirq/softirq
> #|| / _--=> preempt-depth
> #||| / delay
> #   TASK-PIDTGID   CPU#  TIMESTAMP  FUNCTION
> #  | ||  |      | |
>trace-cmd-28284 (28284) [005]  240338.934671: sys_exit: NR 1 = 1
>  kworker/3:2-27891 (27891) [003] d... 240338.934671: timer_start: 
> timer=d643debd function=delayed_work_timer_fn expires=4535008893 
> [timeout=1981] cpu=3 idx=186 flags=I
>trace-cmd-28284 (28284) [005]  240338.934672: sys_write -> 0x1
>  kworker/3:2-27891 (27891) [003]  240338.934672: 
> workqueue_execute_end: work struct 8fddd403: function psi_avgs_work
>  kworker/3:2-27891 (27891) [003]  240338.934673: 
> workqueue_execute_start: work struct 111c941e: function 
> dbs_work_handler
>  kworker/3:2-27891 (27891) [003]  240338.934673: 
> workqueue_execute_end: work struct 111c941e: function dbs_work_handler
>  kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: 
> Start context switch
>  kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: End 
> context switch
> 
> -- Steve
> 
Thanks for the comments Steve. I'll remove the task arg.

> > +   __field(u64, size)
> > +   __field(s64, count)
> > +   __string(exp_name, dma_buffer(filp)->exp_name)
> > +   __string(name, dma_buffer(filp)->name ? dma_buffer(filp)->name 
> > : UNKNOWN)
> > +   __field(u64, i_ino)
> > +   ),
> > +
> > +   TP_fast_assign(
> > +   __entry->tgid = task->tgid;
> > +   __entry->pid = task->pid;
> > +   __entry->size = dma_buffer(filp)->size;
> > +   __entry->count = file_count(filp);
> > +   __assign_str(exp_name, dma_buffer(filp)->exp_name);
> > +   __assign_str(name, dma_buffer(filp)->name ? 
> > dma_buffer(filp)->name : UNKNOWN);
> > +   __entry->i_ino = filp->f_inode->i_ino;
> > +   ),
> > +
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Matthew Wilcox
On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote:
> +static void dma_buf_fd_install(int fd, struct file *filp)
> +{
> + trace_dma_buf_fd_ref_inc(current, filp);
> +}

You're adding a new file_operation in order to just add a new tracepoint?
NACK.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Matthew Wilcox
On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote:
> On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox  wrote:
> >
> > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote:
> > > +static void dma_buf_fd_install(int fd, struct file *filp)
> > > +{
> > > + trace_dma_buf_fd_ref_inc(current, filp);
> > > +}
> >
> > You're adding a new file_operation in order to just add a new tracepoint?
> > NACK.
> 
> Hi Matthew,
> The plan is to attach a BPF to this tracepoint in order to track
> dma-buf users. If you feel this is an overkill, what would you suggest
> as an alternative?

I'm sure BPF can attach to fd_install and filter on file->f_ops belonging
to dma_buf, for example.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Al Viro
On Mon, Aug 03, 2020 at 11:28:31PM +0100, Al Viro wrote:

> IOW, what the hell is that horror for?  You do realize, for example, that 
> there's
> such thing as dup(), right?  And dup2() as well.  And while we are at it, how
> do you keep track of removals, considering the fact that you can stick a file
> reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an 
> hour
> later pick that datagram, suddenly getting descriptor back?
> 
> Besides, "I have no descriptors left" != "I can't be currently sitting in the 
> middle
> of syscall on that sucker"; close() does *NOT* terminate ongoing operations.
> 
> You are looking at the drastically wrong abstraction level.  Please, describe 
> what
> it is that you are trying to achieve.

_IF_ it's "who keeps a particularly long-lived sucker pinned", I would suggest
fuser(1) run when you detect that kind of long-lived dmabuf.  With events 
generated
by their constructors and destructors, and detection of longevity done based on
that.

But that's only a semi-blind guess at the things you are trying to achieve; 
please,
describe what it really is.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-04 Thread Al Viro
On Mon, Aug 03, 2020 at 09:22:53AM -0700, Suren Baghdasaryan wrote:
> On Mon, Aug 3, 2020 at 9:12 AM Matthew Wilcox  wrote:
> >
> > On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote:
> > > On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox  wrote:
> > > >
> > > > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote:
> > > > > +static void dma_buf_fd_install(int fd, struct file *filp)
> > > > > +{
> > > > > + trace_dma_buf_fd_ref_inc(current, filp);
> > > > > +}
> > > >
> > > > You're adding a new file_operation in order to just add a new 
> > > > tracepoint?
> > > > NACK.
> > >
> > > Hi Matthew,
> > > The plan is to attach a BPF to this tracepoint in order to track
> > > dma-buf users. If you feel this is an overkill, what would you suggest
> > > as an alternative?
> >
> > I'm sure BPF can attach to fd_install and filter on file->f_ops belonging
> > to dma_buf, for example.
> 
> Sounds like a workable solution. Will explore that direction. Thanks Matthew!

No, it is not a solution at all.

What kind of locking would you use?  With _any_ of those approaches.

How would you use the information that is hopelessly out of 
date/incoherent/whatnot
at the very moment you obtain it?

IOW, what the hell is that horror for?  You do realize, for example, that 
there's
such thing as dup(), right?  And dup2() as well.  And while we are at it, how
do you keep track of removals, considering the fact that you can stick a file
reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an 
hour
later pick that datagram, suddenly getting descriptor back?

Besides, "I have no descriptors left" != "I can't be currently sitting in the 
middle
of syscall on that sucker"; close() does *NOT* terminate ongoing operations.

You are looking at the drastically wrong abstraction level.  Please, describe 
what
it is that you are trying to achieve.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events

2020-08-03 Thread Steven Rostedt
On Mon,  3 Aug 2020 14:47:19 +
Kalesh Singh  wrote:

> +DECLARE_EVENT_CLASS(dma_buf_ref_template,
> +
> + TP_PROTO(struct task_struct *task, struct file *filp),
> +
> + TP_ARGS(task,  filp),
> +
> + TP_STRUCT__entry(
> + __field(u32, tgid)
> + __field(u32, pid)

I only see "current" passed in as "task". Why are you recording the pid
and tgid as these are available by the tracing infrastructure.

At least the pid is saved at every event. You can see the tgid when
enabling the "record_tgid".

 # trace-cmd start -e all -O record_tgid
 # trace-cmd show

# tracer: nop
#
# entries-in-buffer/entries-written: 39750/39750   #P:8
#
#  _-=> irqs-off
# / _=> need-resched
#| / _---=> hardirq/softirq
#|| / _--=> preempt-depth
#||| / delay
#   TASK-PIDTGID   CPU#  TIMESTAMP  FUNCTION
#  | ||  |      | |
   trace-cmd-28284 (28284) [005]  240338.934671: sys_exit: NR 1 = 1
 kworker/3:2-27891 (27891) [003] d... 240338.934671: timer_start: 
timer=d643debd function=delayed_work_timer_fn expires=4535008893 
[timeout=1981] cpu=3 idx=186 flags=I
   trace-cmd-28284 (28284) [005]  240338.934672: sys_write -> 0x1
 kworker/3:2-27891 (27891) [003]  240338.934672: workqueue_execute_end: 
work struct 8fddd403: function psi_avgs_work
 kworker/3:2-27891 (27891) [003]  240338.934673: 
workqueue_execute_start: work struct 111c941e: function dbs_work_handler
 kworker/3:2-27891 (27891) [003]  240338.934673: workqueue_execute_end: 
work struct 111c941e: function dbs_work_handler
 kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: Start 
context switch
 kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: End 
context switch

-- Steve

> + __field(u64, size)
> + __field(s64, count)
> + __string(exp_name, dma_buffer(filp)->exp_name)
> + __string(name, dma_buffer(filp)->name ? dma_buffer(filp)->name 
> : UNKNOWN)
> + __field(u64, i_ino)
> + ),
> +
> + TP_fast_assign(
> + __entry->tgid = task->tgid;
> + __entry->pid = task->pid;
> + __entry->size = dma_buffer(filp)->size;
> + __entry->count = file_count(filp);
> + __assign_str(exp_name, dma_buffer(filp)->exp_name);
> + __assign_str(name, dma_buffer(filp)->name ? 
> dma_buffer(filp)->name : UNKNOWN);
> + __entry->i_ino = filp->f_inode->i_ino;
> + ),
> +
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel