On Wed, Jan 04, 2017 at 02:06:04PM +0900, Masami Hiramatsu wrote:
> On Tue, 3 Jan 2017 11:54:02 +0100
> Peter Zijlstra <pet...@infradead.org> wrote:

> > How many entries should one expect on that list? I spend quite a bit of
> > time reducing the cost of is_module_text_address() a while back and see
> > that both ftrace (which actually needs this to be fast) and now
> > kprobes have linear list walks in here.
>
> It depends on how many probes are used and optimized. However, in most
> cases, there should be one entry (unless user defines optimized probes
> over 32 on x86, from my experience, it is very rare case. :) )

OK, that's good :-)

> > I'm assuming the ftrace thing to be mostly empty, since I never saw it
> > on my benchmarks back then, but it is something Steve should look at I
> > suppose.
> >
> > Similarly, the changelog here should include some talk about worst case
> > costs.
>
> Would you have any good benchmark to measure it?

Not trivially so; what I did was cobble together a debugfs file that
measures the average of the PMI time in perf_sample_event_took(), and a
module that has a 10 deep callchain around a while(1) loop. Then perf
record with callchains for a few seconds.

Generating the callchain does the unwinder thing and ends up calling
is_kernel_address() lots.

The case I worked on was 0 modules vs 100+ modules in a distro build,
which was fairly obviously painful back then, since
is_module_text_address() used a linear lookup.

I'm not sure I still have all those bits, but I can dig around a bit if
you're interested.
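
To put rough numbers on the "linear lookup" cost mentioned above, here is a
minimal userspace sketch, not kernel code. It counts comparisons for a linear
walk over address ranges versus a binary search over the same ranges when they
are sorted. All names (struct text_range, fill_ranges(), linear_lookup(),
sorted_lookup()) are hypothetical, and binary search is only an assumption
picked for brevity; it is not meant to describe the data structure the kernel
uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one module's text range: [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/* Fill n sorted, non-overlapping 4 KiB ranges with 4 KiB gaps between them. */
static void fill_ranges(struct text_range *r, size_t n)
{
	assert(r != NULL);
	for (size_t i = 0; i < n; i++) {
		r[i].start = (uintptr_t)0x100000 + i * 0x2000;
		r[i].end = r[i].start + 0x1000;
	}
}

/* Linear walk: one comparison step per range, O(n) in the worst case. */
static int linear_lookup(const struct text_range *r, size_t n,
			 uintptr_t addr, size_t *steps)
{
	for (size_t i = 0; i < n; i++) {
		(*steps)++;
		if (addr >= r[i].start && addr < r[i].end)
			return 1;
	}
	return 0;
}

/* Binary search over the sorted ranges: O(log n) comparison steps. */
static int sorted_lookup(const struct text_range *r, size_t n,
			 uintptr_t addr, size_t *steps)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		(*steps)++;
		if (addr < r[mid].start)
			hi = mid;		/* addr lies below this range */
		else if (addr >= r[mid].end)
			lo = mid + 1;		/* addr lies above this range */
		else
			return 1;		/* addr is inside r[mid] */
	}
	return 0;
}
```

With 128 ranges, an address past the last range costs 128 steps in the linear
walk and 7 in the sorted search. Since the unwinder does such a lookup for
every frame of every sample, that per-lookup difference gets multiplied by the
callchain depth and the sample rate.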