On Mon, 11 May 2015 19:57:22 +0200 Denys Vlasenko <dvlas...@redhat.com> wrote:
> With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously doesn't inline > very small functions we expect to be inlined. In particular, > with this config: http://busybox.net/~vda/kernel_config > there are more than a thousand copies of tiny spinlock-related functions: > > $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort > -rn | grep ' spin' > 473 000000000000000b t spin_unlock_irqrestore > 292 000000000000000b t spin_unlock > 215 000000000000000b t spin_lock > 134 000000000000000b t spin_unlock_irq > 130 000000000000000b t spin_unlock_bh > 120 000000000000000b t spin_lock_irq > 106 000000000000000b t spin_lock_bh > > Disassembly: > > ffffffff81004720 <spin_lock>: > ffffffff81004720: 55 push %rbp > ffffffff81004721: 48 89 e5 mov %rsp,%rbp > ffffffff81004724: e8 f8 4e e2 02 callq <_raw_spin_lock> > ffffffff81004729: 5d pop %rbp > ffffffff8100472a: c3 retq > > This patch fixes this via s/inline/__always_inline/ in spinlock.h. > This decreases vmlinux by about 30k: > > text data bss dec hex filename > 82375570 22255544 20627456 125258570 7774b4a vmlinux.before > 82335059 22255416 20627456 125217931 776ac8b vmlinux See also https://lkml.org/lkml/2015/4/23/598 ("enforce function inlining for hot functions"). Presumably Hagen didn't see the issue with spinlock functions. I wonder why not. I suppose we should get both these consolidated into a coherent whole. It's a bit irritating to have to do this: presumably gcc will get fixed and the huge sprinkling of __always_inline will become less and less relevant over time and people will have trouble distinguishing "real __always_inline which was put here for a purpose" from "dopey __always_inline to work around a short-term gcc glitch". __always_inline is one of those things where a usage site should always be commented, because it's near impossible to work out why someone chose to use it. Quick, tell me what's happening in include/linux/slab.h. Perhaps we should do /* * Comment goes here. It is very specific about gcc versions. */ #define inline_for_broken_gcc __always_inline and then use inline_for_broken_gcc everywhere. That way, the reason for the marker is self-explanatory and we can later hunt all these things down and remvoe them. Also, the inline_for_broken_gcc definition can be made dependent on particular gcc versions, which will allow us to easily keep an eye on the behaviour of later gcc versions. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/