Hi,

Steve reported an NMI stack overflow when running perf and function tracing together. Thomas found that we had 4 copies of struct perf_sample_data on-stack.
These patches reduce that to 2 copies and halve the size of the structure. So just for perf_sample_data we go from 4*384=1536 to 2*192=384 bytes of stack.

They also remove one struct pt_regs instance from __intel_pmu_pebs_event(); there is another instance in struct x86_perf_regs which I haven't yet managed to offload.

Perf seems to still work... :-)
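
For reference, the stack arithmetic above can be written out as a small standalone C sketch. The 384 and 192 byte sizes and the copy counts are the ones quoted in this mail; the macro and helper names are made up for illustration and do not exist in the kernel, and nothing here measures the real structure.

```c
#include <stddef.h>

/* Per-instance sizes as quoted in the mail; not measured here. */
#define SAMPLE_DATA_SIZE_BEFORE 384u
#define SAMPLE_DATA_SIZE_AFTER  192u

/* Hypothetical helper: stack bytes used by `copies` on-stack instances. */
static size_t sample_data_stack_bytes(size_t copies, size_t size)
{
	return copies * size;
}

/* Hypothetical helper: stack bytes saved by going from 4 to 2 copies
 * and from the old to the new per-instance size. */
static size_t sample_data_stack_saved(void)
{
	return sample_data_stack_bytes(4, SAMPLE_DATA_SIZE_BEFORE) -
	       sample_data_stack_bytes(2, SAMPLE_DATA_SIZE_AFTER);
}
```

That works out to 1152 bytes less stack for perf_sample_data alone, before counting the pt_regs instance.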