On Monday, 24 February 2014 at 02:05:31 UTC, Walter Bright wrote:
1. It provides information to the compiler about runtime frequency that it cannot obtain otherwise. This is very useful information for generating better code.


2. Making it a hard requirement then means the user will have to put versioning in it. It becomes inherently non-portable. There is no way to predict what some other version of some other compiler on some other system will do.

I'm not sure why it would be impossible to inline in some cases; I've never hit that limitation with ICC. Like others, I would like unconditional and explicit optimization from the compiler.

3. In the end, the compiler should make the decision. Inlining does not always result in faster code, as I pointed out in another post.

Also, when I use "force inline" it's very often to force "not-inline", so that the same bit of code gets reused where the compiler would otherwise have inlined it.

Each optimization here is taken through a repeatable automated A-B test with 95% statistical significance on various inputs, and forcing inline/not-inline has been an effective tool to reduce the I-cache stress that plagues some very particular program areas that the compiler doesn't differentiate. Whether the inlining actually happened can be checked by looking at the assembly or the binary size afterwards.
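To make the inline/not-inline point concrete, here is a rough sketch of the pattern in C++. It assumes GCC/Clang-style attributes (ICC on Linux accepts the same spelling as far as I know; MSVC uses `__forceinline` and `__declspec(noinline)` instead). The names `mix`, `slow_fallback` and `process` are made up for illustration and are not from any real codebase.

```cpp
#include <cstdint>

// Assumption: GCC/Clang-style attributes. On other compilers the macros
// fall back to plain hints and the compiler decides on its own.
#if defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE __attribute__((noinline))
#else
#  define FORCE_INLINE inline
#  define NO_INLINE
#endif

// Hypothetical hot helper: tiny, so expanding it at each call site is cheap.
FORCE_INLINE std::uint32_t mix(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

// Hypothetical rarely-taken path: forced out of line so its body exists
// once in the binary instead of being copied into every caller.
NO_INLINE std::uint32_t slow_fallback(std::uint32_t x) {
    std::uint32_t acc = x;
    for (std::uint32_t i = 0; i < 8; ++i) {
        acc = mix(acc + i);
    }
    return acc;
}

// Caller: the common branch stays compact, the rare branch is a call.
std::uint32_t process(std::uint32_t x) {
    if (x & 1u) {
        return slow_fallback(x);
    }
    return mix(x);
}
```

Whether this helps at all is exactly what the A-B test has to decide; the attributes only guarantee the shape of the generated code, not that it is faster.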

I'm perfectly OK with the compiler doing what it wants when I don't tell it to inline or not. AFAIK the C/C++ inline keyword is mostly ignored by optimizing compilers; it's precisely a keyword that is both overused and meaningless.


Perhaps the lesson is the word 'inline' carries certain expectations with it, and the feature would be better positioned as something like:

    pragma(usage, often);
    pragma(usage, rare);

To me it's not so much about usage frequency as about I-cache misses. Some inlining can be nearly free (small I-cache working set), or very costly (the I-cache actively being the bottleneck through repeated misses due to a large working set).
