http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54965
--- Comment #3 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2012-10-18 10:47:51 UTC ---
(In reply to comment #2)
> void combine_conjoint_xor_ca_float ()
> {
>   combine_channel_t j = pd_combine_conjoint_xor, k =
>   pd_combine_conjoint_xor;
>   a[0] = k (0, b, 0, a[0]);
>   a[0] = k (0, b, 0, a[0]);
>   a[0] = k (0, b, 0, a[0]);
>   a[0] = j (0, c[0], 0, a[0]);
>   a[0] = k (0, c[0], 0, a[0]);
>   a[0] = k (0, c[0], 0, a[0]);
>   a[0] = k (0, c[0], 0, a[0]);
>
> you are using indirect function calls here, GCC in 4.6 is not smart enough
> to transform them to direct calls before inlining. Inlining of
> always-inline indirect function calls is not going to work reliably.

Does this only apply to GCC 4.6?

> Don't use always-inline or don't use indirect function calls to always-inline
> functions.

This looks like it might be really inconvenient. Pixman relies on this
functionality in a number of places by doing something like this:

    void always_inline per_pixel_operation_a(...)
    {
        ...
    }

    void always_inline per_pixel_operation_b(...)
    {
        ...
    }

    void always_inline big_function_template(..., per_pixel_operation_ptr foo)
    {
        ...
        /* do some calls to foo() in an inner loop */
        ...
    }

    void big_function_a(...)
    {
        big_function_template(..., per_pixel_operation_a);
    }

    void big_function_b(...)
    {
        big_function_template(..., per_pixel_operation_b);
    }

Needless to say, we want to be absolutely sure that the per-pixel operations
are always inlined. Otherwise performance gets really bad if the compiler
ever makes a bad inlining decision.

The same functionality can probably be achieved by replacing the
always_inline functions with macros. But the code becomes less readable,
more error-prone and somewhat more difficult to maintain.
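
For reference, the pattern described above could be written out as a compilable sketch along the following lines. This is not pixman's actual code: the `force_inline` macro definition, the `per_pixel_operation_ptr` signature, the 8-bit pixel type and the two operations (XOR and saturating add) are all assumptions picked only to make the example self-contained. Whether the calls through `foo` really end up inlined depends on the compiler version and optimization level, which is exactly the concern raised in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: forces inlining on GCC-compatible compilers,
 * and falls back to a plain "static inline" hint elsewhere. */
#if defined(__GNUC__)
#define force_inline static inline __attribute__((always_inline))
#else
#define force_inline static inline
#endif

/* Assumed signature for a per-pixel operation (one 8-bit channel). */
typedef uint8_t (*per_pixel_operation_ptr)(uint8_t src, uint8_t dst);

/* Illustrative operation A: bitwise XOR of source and destination. */
force_inline uint8_t per_pixel_operation_a(uint8_t src, uint8_t dst)
{
    return (uint8_t)(src ^ dst);
}

/* Illustrative operation B: saturating add, clamped to 255. */
force_inline uint8_t per_pixel_operation_b(uint8_t src, uint8_t dst)
{
    unsigned sum = (unsigned)src + (unsigned)dst;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* The "template": the operation is passed as a function pointer. Once the
 * template is inlined into a caller that passes a constant, the compiler
 * can in principle turn foo() into a direct call and inline it as well. */
force_inline void big_function_template(uint8_t *dst, const uint8_t *src,
                                        size_t n,
                                        per_pixel_operation_ptr foo)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = foo(src[i], dst[i]);   /* inner-loop call we want inlined */
}

/* Concrete instantiations of the template. */
void big_function_a(uint8_t *dst, const uint8_t *src, size_t n)
{
    big_function_template(dst, src, n, per_pixel_operation_a);
}

void big_function_b(uint8_t *dst, const uint8_t *src, size_t n)
{
    big_function_template(dst, src, n, per_pixel_operation_b);
}
```

Compiling this with optimization enabled and inspecting the generated assembly for `big_function_a` and `big_function_b` is one way to check whether a given compiler version leaves any call to the per-pixel operations in the loop.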