2016-08-25 23:49 GMT+03:00 H. Peter Anvin <h...@zytor.com>: > On August 25, 2016 8:21:07 AM PDT, Dmitry Safonov <dsafo...@virtuozzo.com> > wrote: >>This patches set is cleanly RFC and is not supposed to be applied. >>Also for RFC time it builds only on x86_64. >> >>So, in a mail thread Oleg told that it would be worth to introduce >>vm_file >>for vdso mappings as currently uprobes can not be placed on vDSO VMAs >>[1]. >>In this patches set I introduce in-kernel filesystem for vdso files. >>After patches vDSO VMA now has inode and is just a private file >>mapping: >>7ffcc4b2b000-7ffcc4b2d000 r--p 00000000 00:00 0 >> [vvar] >>7ffcc4b2d000-7ffcc4b2f000 r-xp 00000000 00:09 18 >> [vdso] >> >>Then I introduce interface in uprobe_events to insert uprobes in vdso. >>FWIW: >> [~]# cd kernel/linux >> [linux]# readelf --syms arch/x86/entry/vdso/vdso64.so >>Symbol table '.dynsym' contains 11 entries: >> Num: Value Size Type Bind Vis Ndx Name >> 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND >> 1: 0000000000000470 0 SECTION LOCAL DEFAULT 8 >>2: 00000000000008d0 885 FUNC WEAK DEFAULT 12 >>clock_gettime@@LINUX_2.6 >>3: 0000000000000c50 472 FUNC GLOBAL DEFAULT 12 >>__vdso_gettimeofday@@LINUX_2.6 >>4: 0000000000000c50 472 FUNC WEAK DEFAULT 12 >>gettimeofday@@LINUX_2.6 >>5: 0000000000000e30 21 FUNC GLOBAL DEFAULT 12 >>__vdso_time@@LINUX_2.6 >> 6: 0000000000000e30 21 FUNC WEAK DEFAULT 12 time@@LINUX_2.6 >>7: 00000000000008d0 885 FUNC GLOBAL DEFAULT 12 >>__vdso_clock_gettime@@LINUX_2.6 >> 8: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.6 >>9: 0000000000000e50 41 FUNC GLOBAL DEFAULT 12 >>__vdso_getcpu@@LINUX_2.6 >>10: 0000000000000e50 41 FUNC WEAK DEFAULT 12 >>getcpu@@LINUX_2.6 >> [~]# cd /sys/kernel/debug/tracing/ >> [tracing]# echo 'p:clock_gettime :vdso:/64:0x8d0' > uprobe_events >> [tracing]# echo 'p:gettimeofday :vdso:/64:0xc50' >> uprobe_events >> [tracing]# echo 'p:time :vdso:/64:0xe30' >> uprobe_events >> [tracing]# echo 1 > events/uprobes/enable >> [tracing]# su test # it has UID=1001 >> [tracing]$ date >> Thu Aug 25 17:19:29 MSK 2016 >> [tracing]$ exit >> [tracing]# cat trace >> # tracer: nop >> # >> # entries-in-buffer/entries-written: 175/175 #P:4 >> # >> # _-----=> irqs-off >> # / _----=> need-resched >> # | / _---=> hardirq/softirq >> # || / _--=> preempt-depth >> # ||| / delay >> # TASK-PID CPU# |||| TIMESTAMP FUNCTION >> # | | | |||| | | >> bash-11560 [001] d... 316.470236: time: (0x7ffcacebae30) >> bash-11560 [001] d... 316.471436: gettimeofday: (0x7ffcacebac50) >> bash-11560 [001] d... 316.477550: time: (0x7ffcacebae30) >> bash-11560 [001] d... 316.477655: time: (0x7ffcacebae30) >> mktemp-11568 [001] d... 316.479589: gettimeofday: (0x7ffc603f0c50) >> date-11571 [001] d... 316.481890: clock_gettime: (0x7ffec9db58d0) >>[...] >> >>If this approach will be decided as fine, I will prepare a better >>version, >>fixing the following things: >>o put vdsofs in generic fs/* dir >>o support other archs and vdso blobs >>o remove BUG_ON()'s and UID==1001 check >>o remove extern's and use headers only >>o refactor code in create_trace_uprobe() >>o add some state to (struct trace_uprobe), so i.e., `cat uprobe_events` >>will >> print those uprobes as vdso-based >>o document this interface in Documentation/trace/uprobetracer.txt >>o prepare nice patches set? >> >>So, opinions? Is it worth to add something like this? >> >>[1]: https://lkml.org/lkml/2016/7/12/346 >> >>Dmitry Safonov (3): >> x86/vdso: create vdso file, use it for mapping >> uprobe: drop isdigit() check in create_trace_uprobe >> uprobe: add vdso support >> >>Cc: Oleg Nesterov <o...@redhat.com> >>Cc: Al Viro <v...@zeniv.linux.org.uk> >>Cc: Steven Rostedt <rost...@goodmis.org> >>Cc: Andy Lutomirski <l...@amacapital.net> >>Cc: Thomas Gleixner <t...@linutronix.de> >>Cc: Ingo Molnar <mi...@redhat.com> >>Cc: "H. Peter Anvin" <h...@zytor.com> >>Cc: x...@kernel.org >>Cc: Dmitry Safonov <0x7f454...@gmail.com> >> >>arch/x86/entry/vdso/vma.c | 148 >>++++++++++++++++++++++++++++++++++++++++++-- >> kernel/trace/trace_uprobe.c | 50 +++++++++++---- >> 2 files changed, 180 insertions(+), 18 deletions(-) > > I think there is a lot to be said for this idea. However, a private mapping > is definitely wrong for the vvar data; for the vdso code it could be > considered either way I suppose.
Thanks on your reply. As you could see, I preserved pure mapping of pfn for vvar: 7ffcc4b2b000-7ffcc4b2d000 r--p 00000000 00:00 0 [vvar] 7ffcc4b2d000-7ffcc4b2f000 r-xp 00000000 00:09 18 [vdso] (no inode number). I also think it would be useless to do the same to vvar as it has just data and there is no point in probing it. -- Dmitry