> > For comparison, can you give --per-thread a go prior to these patches
> > being applied?
>
> FWIW, I had a go with (an old) perf record on an arm64 system using
> --per-thread, and I see that no samples are recorded, which seems like a
> bug.
>
> With --per-thread, the slowdown was ~20%, whereas with the defaults it
> was > 400%.

I'm not sure what the point of the experiment is. It has to work with
reasonable overhead even without --per-thread.

FWIW, Alexey has already root-caused the problem, so there's no need to
restart the debugging.

-Andi

