Peter Zijlstra <[email protected]> wrote:

> On Wed, Jan 31, 2018 at 09:38:46PM -0800, Nadav Amit wrote:
>
>> I used ftrace to measure the execution time of flush_tlb_func_remote() on a
>> 2-socket Haswell machine, using a microbenchmark I wrote for some research
>> project.
>
> However cool ftrace is, it is _really_ bad for such uses. The cost of
> using ftrace is many many time higher than any change you could affect
> by this.
>
> A microbench and/or perf is what you should use for this.

Don’t expect to see the impact of a remote NUMA access, which costs a few tens of nanoseconds, in a microbenchmark. (And indeed I did not.) Each iteration of #PF - MADV_DONTNEED takes several microseconds, so the impact is lost in the noise. You are right that ftrace introduces overhead, but the variance is relatively low. If I stretch the struct to three cache lines, I see a 20ns overhead. Anyhow, I think this line of code got more than its fair share of attention.
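
For readers unfamiliar with the pattern, the following is a minimal sketch of a "#PF - MADV_DONTNEED" loop of the kind discussed above. It is an illustration only, not the benchmark used for the measurements in this thread: the function name bench_pf_dontneed and the single-page mapping are assumptions. It is also single-threaded, so it would not by itself cause remote TLB flushes; that would additionally need other threads of the same process running on other CPUs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail): returns the average
 * time in nanoseconds of one "write fault + MADV_DONTNEED" cycle on a
 * single anonymous page, or -1.0 on error.
 */
double bench_pf_dontneed(long iters)
{
    assert(iters > 0);

    long page = sysconf(_SC_PAGESIZE);
    volatile char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1.0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (long i = 0; i < iters; i++) {
        /* The page is not mapped at this point, so the write faults it in. */
        p[0] = 1;

        /* Drop the page again; the next write has to fault once more. */
        if (madvise((void *)p, (size_t)page, MADV_DONTNEED) != 0) {
            munmap((void *)p, (size_t)page);
            return -1.0;
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    munmap((void *)p, (size_t)page);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9 +
                (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

Calling bench_pf_dontneed() with a large iteration count from a small main() and printing the result gives the per-iteration cost. With each iteration costing on the order of microseconds, a difference of a few tens of nanoseconds is hard to separate from run-to-run noise in a loop like this.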

