Hiramatsu-san, thank you for reviewing my patch.
Thus wrote Masami Hiramatsu ([email protected]): > Ah, this is a bit complicated. It seems to work with sched_switch event > as commit f04dec93466a ("tracing/eprobes: Fix reading of string fields"): > echo 'e:sw sched/sched_switch comm=$next_comm:string' > dynamic_events > # TASK-PID CPU# ||||| TIMESTAMP FUNCTION > # | | | ||||| | | > sh-162 [002] d..3. 54.027213: sw: (sched.sched_switch) > comm="swapper/2" > <idle>-0 [007] d..3. 54.034573: sw: (sched.sched_switch) > comm="rcu_preempt" > rcu_preempt-15 [007] d..3. 54.034589: sw: (sched.sched_switch) > comm="swapper/7" > Maybe comm is stored as a fixed string information in the event record? Yes, this example does not execute my change. > /sys/kernel/tracing # cat events/sched/sched_switch/format > name: sched_switch > ID: 254 > format: > field:unsigned short common_type; offset:0; size:2; > signed:0; > field:unsigned char common_flags; offset:2; size:1; > signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; > signed:0; > field:int common_pid; offset:4; size:4; signed:1; > field:char prev_comm[16]; offset:8; size:16; > signed:0; > field:pid_t prev_pid; offset:24; size:4; signed:1; > field:int prev_prio; offset:28; size:4; signed:1; > field:long prev_state; offset:32; size:8; signed:1; > field:char next_comm[16]; offset:40; size:16; > signed:0; > field:pid_t next_pid; offset:56; size:4; signed:1; > field:int next_prio; offset:60; size:4; signed:1; > But the filename is a pointer. > /sys/kernel/tracing # cat events/syscalls/sys_enter_openat/format > name: sys_enter_openat > ID: 705 > format: > field:unsigned short common_type; offset:0; size:2; > signed:0; > field:unsigned char common_flags; offset:2; size:1; > signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; > signed:0; > field:int common_pid; offset:4; size:4; signed:1; > field:int __syscall_nr; offset:8; size:4; signed:1; > field:int dfd; offset:16; size:8; signed:0; > field:const char * filename; offset:24; size:8; signed:0; > field:int flags; offset:32; size:8; signed:0; > field:umode_t mode; offset:40; size:8; signed:0; > field:__data_loc char[] __filename_val; offset:48; size:4; > signed:0; > In this case, the filename field should use __data_loc directly instead of > pointing data on the ring buffer. > Can you try > echo 'e syscalls.sys_enter_openat $__filename_val:string' > \ > /sys/kernel/tracing/dynamic_events > Instead? This field is working as expected. I still believe that the handling of FILTER_PTR_STRING is not correct. The pointer is stored in the ringbuffer as unsigned long and read as a char. This gives us a truncated pointer that cannot be dereferenced. > I think better solution is fixing sycall tracer. I would say that syscall trace is doing the right thing. The ringbuffer entry is a struct syscall_trace_enter, the syscall arguments are unsigned longs. They are written in ftrace_syscall_enter, this looks correct to me. A const char * syscall argument is using FILTER_PTR_STRING, the unsigned long argument from the ringbuffer is read as a char and then converted to a truncated pointer. Thanks, Martin
