On Thu, 29 Nov 2007 08:14:10 -0000 "Metzger, Markus T" <[EMAIL PROTECTED]> wrote:
> Support for Intel's last branch recording to ptrace. This gives debuggers
> access to this hardware feature and allows them to show an execution trace
> of the debugged application.
>
> Last branch recording (see section 18.5 in the Intel 64 and IA-32
> Architectures Software Developer's Manual) allows taking an execution
> trace of the running application without instrumentation. When a branch
> is executed, the hardware logs the source and destination address in a
> cyclic buffer given to it by the OS.
>
> This can be a great debugging aid. It shows you how exactly you got
> where you currently are without requiring you to do lots of single
> stepping and rerunning.
>
> This patch manages the various buffers, configures the trace
> hardware, disentangles the trace, and provides a user interface via
> ptrace. On the high-level design:
> - there is one optional trace buffer per thread_struct
> - upon a context switch, the trace hardware is reconfigured to either
>   disable tracing or to use the appropriate buffer for the new task.
> - tracing induces ~20% overhead as branch records are sent out on
>   the bus.
> - the hardware collects trace per processor. To disentangle the
>   traces for different tasks, we use separate buffers and reconfigure
>   the trace hardware.
> - the low-level data layout is configured at cpu initialization time
> - different processors use different branch record formats
>
> patch 1/2 contains the kernel changes
> patch 2/2 contains changes to the ptrace man pages
>

Is there any userspace code available which people can use to play with
this?

How do you envisage it being used in the long term? Do you expect any of
the standard performance tuning tools will be tweaked to understand this
feature and if so which ones? I'm generally wondering "how will developers
be using this in a year or two's time?"

Please cc Michael Kerrisk <[EMAIL PROTECTED]> on future versions of these
patches.

The patches were horridly wordwrapped.

Is there any likelihood that any other CPUs do now or will in the future
support any similar feature to this? If so, is an implementation which is
100% contained to arch/x86 appropriate?