Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Tue, Aug 04, 2020 at 02:09:13AM +0100, Al Viro wrote: > On Mon, Aug 03, 2020 at 11:28:31PM +0100, Al Viro wrote: > > > IOW, what the hell is that horror for? You do realize, for example, that > > there's > > such thing as dup(), right? And dup2() as well. And while we are at it, > > how > > do you keep track of removals, considering the fact that you can stick a > > file > > reference into SCM_RIGHTS datagram sent to yourself, close descriptors and > > an hour > > later pick that datagram, suddenly getting descriptor back? > > > > Besides, "I have no descriptors left" != "I can't be currently sitting in > > the middle > > of syscall on that sucker"; close() does *NOT* terminate ongoing operations. > > > > You are looking at the drastically wrong abstraction level. Please, > > describe what > > it is that you are trying to achieve. Hi Al. Thank you for the comments. Ultimately what we need is to identify processes that hold a file reference to the dma-buf. Unfortunately we can't use only explicit dma_buf_get/dma_buf_put to track them because when an FD is being shared between processes the file references are taken implicitly. For example, on the sender side: unix_dgram_sendmsg -> send_scm -> __send_scm -> scm_fp_copy -> fget_raw and on the receiver side: unix_dgram_recvmsg -> scm_recv -> scm_detach_fds -> __scm_install_fd -> get_file I understand now that fd_install is not an appropriate abstraction level to track these. Is there a more appropriate alternative where we could use to track these implicit file references? > _IF_ it's "who keeps a particularly long-lived sucker pinned", I would suggest > fuser(1) run when you detect that kind of long-lived dmabuf. With events > generated > by their constructors and destructors, and detection of longevity done based > on > that. > > But that's only a semi-blind guess at the things you are trying to achieve; > please, > describe what it really is. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Tue, Aug 04, 2020 at 03:44:51PM +, Kalesh Singh wrote: > Hi Al. Thank you for the comments. Ultimately what we need is to identify > processes > that hold a file reference to the dma-buf. Unfortunately we can't use only > explicit dma_buf_get/dma_buf_put to track them because when an FD is being > shared > between processes the file references are taken implicitly. > > For example, on the sender side: >unix_dgram_sendmsg -> send_scm -> __send_scm -> scm_fp_copy -> fget_raw > and on the receiver side: >unix_dgram_recvmsg -> scm_recv -> scm_detach_fds -> __scm_install_fd -> > get_file > > I understand now that fd_install is not an appropriate abstraction level to > track these. > Is there a more appropriate alternative where we could use to track these > implicit file > references? There is no single lock that would stabilize the descriptor tables of all processes. And there's not going to be one, ever - it would be a contention point from hell, since that would've been a system-wide lock that would have to be taken by *ALL* syscalls modifying any descriptor table. Not going to happen, for obvious reasons. Moreover, you would have to have fork(2) take the same lock, since it does copy descriptor table. And clone(2) either does the same, or has the child share the descriptor table of parent. What's more, a reference to struct file can bloody well survive without a single descriptor refering to that file. In the example you've mentioned above, sender has ever right to close all descriptors it has sent. Files will stay opened as long as the references are held in the datagram; when that datagram is received, the references will be inserted into recepient's descriptor table. At that point you again have descriptors refering to that file, can do any IO on it, etc. So "the set of processes that hold a file reference to the dma-buf" is * inherently unstable, unless you are willing to freeze every process in the system except for the one trying to find that set. * can remain empty for any amount of time (hours, weeks, whatever), only to get non-empty later, with syscalls affecting the object in question done afterwards. So... what were you going to do with that set if you could calculate it? If it's really "how do we debug a leak?", it's one thing; in that case I would suggest keeping track of creation/destruction of objects (not gaining/dropping references - actual constructors and destructors) to see what gets stuck around for too long and use fuser(1) to try and locate the culprits if you see that something *was* living for too long. "Try" since the only reference might indeed have been stashed into an SCM_RIGHTS datagram sitting in a queue of some AF_UNIX socket. Note that "fuser needs elevated priveleges" is not a strong argument - the ability to do that sort of tracking does imply elevated priveleges anyway, and having a root process taking requests along the lines of "gimme the list of PIDs that have such-and-such dma_buf in their descriptor table" is not much of an attack surface. If you want to use it for something else, you'll need to describe that intended use; there might be sane ways to do that, but it's hard to come up with one without knowing what's being attempted... ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Tue, Aug 4, 2020 at 12:28 AM Al Viro wrote: > > On Mon, Aug 03, 2020 at 09:22:53AM -0700, Suren Baghdasaryan wrote: > > On Mon, Aug 3, 2020 at 9:12 AM Matthew Wilcox wrote: > > > > > > On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote: > > > > On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox > > > > wrote: > > > > > > > > > > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote: > > > > > > +static void dma_buf_fd_install(int fd, struct file *filp) > > > > > > +{ > > > > > > + trace_dma_buf_fd_ref_inc(current, filp); > > > > > > +} > > > > > > > > > > You're adding a new file_operation in order to just add a new > > > > > tracepoint? > > > > > NACK. > > > > > > > > Hi Matthew, > > > > The plan is to attach a BPF to this tracepoint in order to track > > > > dma-buf users. If you feel this is an overkill, what would you suggest > > > > as an alternative? > > > > > > I'm sure BPF can attach to fd_install and filter on file->f_ops belonging > > > to dma_buf, for example. > > > > Sounds like a workable solution. Will explore that direction. Thanks > > Matthew! > > No, it is not a solution at all. > > What kind of locking would you use? With _any_ of those approaches. > > How would you use the information that is hopelessly out of > date/incoherent/whatnot > at the very moment you obtain it? > > IOW, what the hell is that horror for? You do realize, for example, that > there's > such thing as dup(), right? And dup2() as well. And while we are at it, how > do you keep track of removals, considering the fact that you can stick a file > reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an > hour > later pick that datagram, suddenly getting descriptor back? > > Besides, "I have no descriptors left" != "I can't be currently sitting in the > middle > of syscall on that sucker"; close() does *NOT* terminate ongoing operations. > > You are looking at the drastically wrong abstraction level. Please, describe > what > it is that you are trying to achieve. For added entertainment (since this is specifically about dma-buf) you can stuff them into various gpu drivers, and convert to a native gpu driver handle thing. That's actually the expected use case, first a buffer sharing gets established with AF_UNIX, then both sides close the dma-buf fd handle. GPU drivers then internally cache the struct file so that we can hand out the same (to avoid confusion when re-importing it on some other driver), so for the case of dma-buf the "it's not actually an installed fd anywhere for unlimited time" is actually the normal use-case, not some odd corner. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, Aug 03, 2020 at 11:32:39AM -0400, Steven Rostedt wrote: > On Mon, 3 Aug 2020 14:47:19 + > Kalesh Singh wrote: > > > +DECLARE_EVENT_CLASS(dma_buf_ref_template, > > + > > + TP_PROTO(struct task_struct *task, struct file *filp), > > + > > + TP_ARGS(task, filp), > > + > > + TP_STRUCT__entry( > > + __field(u32, tgid) > > + __field(u32, pid) > > I only see "current" passed in as "task". Why are you recording the pid > and tgid as these are available by the tracing infrastructure. > > At least the pid is saved at every event. You can see the tgid when > enabling the "record_tgid". > > # trace-cmd start -e all -O record_tgid > # trace-cmd show > > # tracer: nop > # > # entries-in-buffer/entries-written: 39750/39750 #P:8 > # > # _-=> irqs-off > # / _=> need-resched > #| / _---=> hardirq/softirq > #|| / _--=> preempt-depth > #||| / delay > # TASK-PIDTGID CPU# TIMESTAMP FUNCTION > # | || | | | >trace-cmd-28284 (28284) [005] 240338.934671: sys_exit: NR 1 = 1 > kworker/3:2-27891 (27891) [003] d... 240338.934671: timer_start: > timer=d643debd function=delayed_work_timer_fn expires=4535008893 > [timeout=1981] cpu=3 idx=186 flags=I >trace-cmd-28284 (28284) [005] 240338.934672: sys_write -> 0x1 > kworker/3:2-27891 (27891) [003] 240338.934672: > workqueue_execute_end: work struct 8fddd403: function psi_avgs_work > kworker/3:2-27891 (27891) [003] 240338.934673: > workqueue_execute_start: work struct 111c941e: function > dbs_work_handler > kworker/3:2-27891 (27891) [003] 240338.934673: > workqueue_execute_end: work struct 111c941e: function dbs_work_handler > kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: > Start context switch > kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: End > context switch > > -- Steve > Thanks for the comments Steve. I'll remove the task arg. > > + __field(u64, size) > > + __field(s64, count) > > + __string(exp_name, dma_buffer(filp)->exp_name) > > + __string(name, dma_buffer(filp)->name ? dma_buffer(filp)->name > > : UNKNOWN) > > + __field(u64, i_ino) > > + ), > > + > > + TP_fast_assign( > > + __entry->tgid = task->tgid; > > + __entry->pid = task->pid; > > + __entry->size = dma_buffer(filp)->size; > > + __entry->count = file_count(filp); > > + __assign_str(exp_name, dma_buffer(filp)->exp_name); > > + __assign_str(name, dma_buffer(filp)->name ? > > dma_buffer(filp)->name : UNKNOWN); > > + __entry->i_ino = filp->f_inode->i_ino; > > + ), > > + ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote: > +static void dma_buf_fd_install(int fd, struct file *filp) > +{ > + trace_dma_buf_fd_ref_inc(current, filp); > +} You're adding a new file_operation in order to just add a new tracepoint? NACK. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote: > On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox wrote: > > > > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote: > > > +static void dma_buf_fd_install(int fd, struct file *filp) > > > +{ > > > + trace_dma_buf_fd_ref_inc(current, filp); > > > +} > > > > You're adding a new file_operation in order to just add a new tracepoint? > > NACK. > > Hi Matthew, > The plan is to attach a BPF to this tracepoint in order to track > dma-buf users. If you feel this is an overkill, what would you suggest > as an alternative? I'm sure BPF can attach to fd_install and filter on file->f_ops belonging to dma_buf, for example. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, Aug 03, 2020 at 11:28:31PM +0100, Al Viro wrote: > IOW, what the hell is that horror for? You do realize, for example, that > there's > such thing as dup(), right? And dup2() as well. And while we are at it, how > do you keep track of removals, considering the fact that you can stick a file > reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an > hour > later pick that datagram, suddenly getting descriptor back? > > Besides, "I have no descriptors left" != "I can't be currently sitting in the > middle > of syscall on that sucker"; close() does *NOT* terminate ongoing operations. > > You are looking at the drastically wrong abstraction level. Please, describe > what > it is that you are trying to achieve. _IF_ it's "who keeps a particularly long-lived sucker pinned", I would suggest fuser(1) run when you detect that kind of long-lived dmabuf. With events generated by their constructors and destructors, and detection of longevity done based on that. But that's only a semi-blind guess at the things you are trying to achieve; please, describe what it really is. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, Aug 03, 2020 at 09:22:53AM -0700, Suren Baghdasaryan wrote: > On Mon, Aug 3, 2020 at 9:12 AM Matthew Wilcox wrote: > > > > On Mon, Aug 03, 2020 at 09:00:00AM -0700, Suren Baghdasaryan wrote: > > > On Mon, Aug 3, 2020 at 8:41 AM Matthew Wilcox wrote: > > > > > > > > On Mon, Aug 03, 2020 at 02:47:19PM +, Kalesh Singh wrote: > > > > > +static void dma_buf_fd_install(int fd, struct file *filp) > > > > > +{ > > > > > + trace_dma_buf_fd_ref_inc(current, filp); > > > > > +} > > > > > > > > You're adding a new file_operation in order to just add a new > > > > tracepoint? > > > > NACK. > > > > > > Hi Matthew, > > > The plan is to attach a BPF to this tracepoint in order to track > > > dma-buf users. If you feel this is an overkill, what would you suggest > > > as an alternative? > > > > I'm sure BPF can attach to fd_install and filter on file->f_ops belonging > > to dma_buf, for example. > > Sounds like a workable solution. Will explore that direction. Thanks Matthew! No, it is not a solution at all. What kind of locking would you use? With _any_ of those approaches. How would you use the information that is hopelessly out of date/incoherent/whatnot at the very moment you obtain it? IOW, what the hell is that horror for? You do realize, for example, that there's such thing as dup(), right? And dup2() as well. And while we are at it, how do you keep track of removals, considering the fact that you can stick a file reference into SCM_RIGHTS datagram sent to yourself, close descriptors and an hour later pick that datagram, suddenly getting descriptor back? Besides, "I have no descriptors left" != "I can't be currently sitting in the middle of syscall on that sucker"; close() does *NOT* terminate ongoing operations. You are looking at the drastically wrong abstraction level. Please, describe what it is that you are trying to achieve. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] dmabuf/tracing: Add dma-buf trace events
On Mon, 3 Aug 2020 14:47:19 + Kalesh Singh wrote: > +DECLARE_EVENT_CLASS(dma_buf_ref_template, > + > + TP_PROTO(struct task_struct *task, struct file *filp), > + > + TP_ARGS(task, filp), > + > + TP_STRUCT__entry( > + __field(u32, tgid) > + __field(u32, pid) I only see "current" passed in as "task". Why are you recording the pid and tgid as these are available by the tracing infrastructure. At least the pid is saved at every event. You can see the tgid when enabling the "record_tgid". # trace-cmd start -e all -O record_tgid # trace-cmd show # tracer: nop # # entries-in-buffer/entries-written: 39750/39750 #P:8 # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PIDTGID CPU# TIMESTAMP FUNCTION # | || | | | trace-cmd-28284 (28284) [005] 240338.934671: sys_exit: NR 1 = 1 kworker/3:2-27891 (27891) [003] d... 240338.934671: timer_start: timer=d643debd function=delayed_work_timer_fn expires=4535008893 [timeout=1981] cpu=3 idx=186 flags=I trace-cmd-28284 (28284) [005] 240338.934672: sys_write -> 0x1 kworker/3:2-27891 (27891) [003] 240338.934672: workqueue_execute_end: work struct 8fddd403: function psi_avgs_work kworker/3:2-27891 (27891) [003] 240338.934673: workqueue_execute_start: work struct 111c941e: function dbs_work_handler kworker/3:2-27891 (27891) [003] 240338.934673: workqueue_execute_end: work struct 111c941e: function dbs_work_handler kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: Start context switch kworker/3:2-27891 (27891) [003] d... 240338.934673: rcu_utilization: End context switch -- Steve > + __field(u64, size) > + __field(s64, count) > + __string(exp_name, dma_buffer(filp)->exp_name) > + __string(name, dma_buffer(filp)->name ? dma_buffer(filp)->name > : UNKNOWN) > + __field(u64, i_ino) > + ), > + > + TP_fast_assign( > + __entry->tgid = task->tgid; > + __entry->pid = task->pid; > + __entry->size = dma_buffer(filp)->size; > + __entry->count = file_count(filp); > + __assign_str(exp_name, dma_buffer(filp)->exp_name); > + __assign_str(name, dma_buffer(filp)->name ? > dma_buffer(filp)->name : UNKNOWN); > + __entry->i_ino = filp->f_inode->i_ino; > + ), > + ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel