On Tue, 22 Nov 2016, Steven Rostedt wrote:
> On Fri, 18 Nov 2016 08:55:24 -0800
> Andi Kleen <[email protected]> wrote:
> 
> > From: Andi Kleen <[email protected]>
> > 
> > ftrace has powerfull trigger functions. Intel PT on modern Intel CPUs
> > can trace execution flow.
> > 
> > For debugging I found it useful to disable the PT trace from ftrace 
> > triggers,
> > for example when specific kernel functions are hit, which indicate
> > a problem. Then we can see the exact execution trace up to this point.
> > 
> > This patch adds a "ptoff" ftrace trigger/function that disables the trace
> > on the current function. The PT trace still has to be set up with perf
> > 
> > % perf record -e intel_pt// -a ... &
> > % cd /sys/kernel/debug/tracing
> > % echo do_page_fault:ptoff > set_ftrace_filter
> > ...
> > % cd -
> > % kill %1
> > % perf script --itrace=i0ns
> > 
> > I only implemented local disabling. Enabling would be much more complicated
> > and require a black list of functions to avoid recursion. Global
> > disabling with IPIs would be possible, but also risk some deadlock
> > scenarios. Local disabling is very easy and can be done without
> > accessing any special state, so there are no such problems. It is
> > usually good enough for debugging purposes. The trace can be always
> > reenabled from perf.
> > 
> > This patch adds "ptoff" both as ftrace trigger and ftrace functions.
> > This makes it work from "set_ftrace_filter" and through the trigger
> > field of trace points.

This changelog reads like a fairy tale and certainly needs some care.

> > The PT driver exports a pt_disable() function for this that can be also
> > used for manual instrumentation.

> > diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
> > index 185c39fea2a0..5dc8ec658678 100644
> > --- a/Documentation/trace/ftrace.txt
> > +++ b/Documentation/trace/ftrace.txt
> > @@ -2549,6 +2549,11 @@ The following commands are supported:
> >    command, it only prints out the contents of the ring buffer for the
> >    CPU that executed the function that triggered the dump.
> >  
> > +- ptoff
> > +  When the function is hit disable Intel PT trace. The Intel PT
> > +  trace has to be set up earlier with perf record -a -e intel_pt// ...
> > +  This disables the trace on the current CPU only.

       When the function is hit, Intel PT trace is disabled on the current
       CPU; PT trace on other CPUs is unaffected.
       The Intel PT trace has to be set up earlier with: 
       perf record -e intel_pt// ...  

> > +/*
> > + * Disable the PT trace for debugging purposes.
> > + */
> > +void pt_disable(void)
> > +{
> > +   u64 val;
> > +
> > +   if (!boot_cpu_has(X86_FEATURE_INTEL_PT))
> > +           return;
> > +
> > +   rdmsrl_safe(MSR_IA32_RTIT_CTL, &val);

Why is this using rdmsrl_safe()? If that's required then the return value
should be checked and acted upon.

> > +   val &= ~RTIT_CTL_TRACEEN;
> > +   wrmsrl_safe(MSR_IA32_RTIT_CTL, val);

How does this interact with perf, the PT nmi handler, context switch etc.?
I can't see anything, but the implications of this need to be documented.

> > +}
> > +EXPORT_SYMBOL(pt_disable);

This export is not required for the trace trigger. This wants to be a
seperate patch, EXPORT_SYMBOl_GPL and an in tree user as any other export..

> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 4741ecdb9817..a408d288298b 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -1387,4 +1387,6 @@ int perf_event_exit_cpu(unsigned int cpu);
> >  #define perf_event_exit_cpu        NULL
> >  #endif
> >  
> > +void pt_disable(void);

This is the wrong header file to declare the function. Aside of that
pt_disable() is a too generic name. Something like x86_intel_pt_disable()
makes it clear what this is about.

Thanks,

        tglx

Reply via email to