When gcc estimates the size of a function for its inlining decisions, it apparently counts the contents of *all* sections. Since the kernel makes extensive use of sections for things other than code (e.g., the exception table and the bug table), the optimality of these decisions seems questionable to me.
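To make the pattern concrete, here is a minimal sketch of the kind of code I mean. It is not taken from the kernel: the macro DEMO_WARN_MARK(), the function demo_check() and the section name demo_bug_table are all hypothetical, and the record layout is only loosely modelled on a relative-pointer table entry. The asm statement places a single instruction in .text, while the remaining lines only emit a record into another section. As far as I understand the gcc documentation on the size of an asm, the estimate is based on the number of lines (or statement separators) in the asm string, so the directives would be counted as if they were instructions.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a WARN-style macro (an assumption for this
 * sketch, not the kernel's real implementation): exactly one instruction
 * ends up in .text, and a 4-byte record pointing back at it is emitted
 * into a separate, read-only section named "demo_bug_table".
 */
#define DEMO_WARN_MARK()					\
	__asm__ volatile(					\
		"1:\tnop\n\t"					\
		".pushsection demo_bug_table,\"a\"\n\t"		\
		".long 1b - .\n\t"				\
		".popsection")

/*
 * Tiny helper with roughly the shape of copy_overflow(): the only code
 * it adds to .text on the slow path is the single nop above, although
 * the asm string itself is four lines long.
 */
int demo_check(size_t size, size_t count)
{
	if (count > size) {
		DEMO_WARN_MARK();	/* record goes to demo_bug_table */
		return 0;		/* report the overflow */
	}
	return 1;			/* size is sufficient */
}
```

Building this with optimization enabled and comparing the objdump -d output of callers with and without the section directives in the macro is one way to check whether the inlining decision changes. The sketch assumes an ELF target and the GNU assembler syntax.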
The objtool sections may be the most extreme case, as these sections are discarded, yet their size still appears to be considered by the inlining heuristics. It may be beneficial not to consider (some of) the other sections either, as they do not affect code caching but only increase the kernel size.

To illustrate the issue, consider the function copy_overflow():

   0xffffffff819315e0 <+0>:	push   %rbp
   0xffffffff819315e1 <+1>:	mov    %rsi,%rdx
   0xffffffff819315e4 <+4>:	mov    %edi,%esi
   0xffffffff819315e6 <+6>:	mov    $0xffffffff820bc4b8,%rdi
   0xffffffff819315ed <+13>:	mov    %rsp,%rbp
   0xffffffff819315f0 <+16>:	callq  0xffffffff81089b70 <__warn_printk>
   0xffffffff819315f5 <+21>:	ud2
   0xffffffff819315f7 <+23>:	pop    %rbp
   0xffffffff819315f8 <+24>:	retq

This function seems to me to be a great candidate for inlining. Yet, in my 4.16 build (using gcc 7.2), I get 38 non-inlined instances of this function in vmlinux. Forcing CONFIG_STACK_VALIDATION to be disabled reduces the number of non-inlined instances to 35. Additionally removing the data that is saved in the __bug_table causes all instances of the function to be inlined.

Obviously this particular function can be marked __always_inline, but the inlining heuristics seem to me to be wrongly biased.

What do you think? Is there a way to make gcc ignore sections in its inlining heuristics?

Thanks,
Nadav