* Ingo Molnar | 2015-04-21 09:42:12 [+0200]:

Hey Ingo,

>So the thing is that allyesconfig turns on -Os:
>
>   CONFIG_CC_OPTIMIZE_FOR_SIZE=y

CONFIG_CC_OPTIMIZE_FOR_SIZE seems to have no effect. The only option that
makes a difference is CONFIG_OPTIMIZE_INLINING! But this is not a big surprise:
*disabling* CONFIG_OPTIMIZE_INLINING substitutes _all_ inlines with
__attribute__((always_inline)).
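
To make the mechanism concrete, here is a minimal user-space sketch of that
substitution. MY_CONFIG_OPTIMIZE_INLINING, my_inline and counter_inc are
hypothetical names chosen for illustration; the kernel redefines the plain
`inline` keyword in its compiler headers and checks further conditions that
are left out here.

```c
/*
 * Sketch only: MY_CONFIG_OPTIMIZE_INLINING and my_inline are hypothetical
 * stand-ins for the kernel's Kconfig symbol and its redefined `inline`.
 */
/* #define MY_CONFIG_OPTIMIZE_INLINING 1 */

#ifdef MY_CONFIG_OPTIMIZE_INLINING
/* Option enabled: `inline` stays a hint, gcc's heuristic decides. */
#define my_inline inline
#else
/* Option disabled: every function marked inline is forced inline. */
#define my_inline inline __attribute__((always_inline))
#endif

/* A one-line helper of the kind found in kernel headers (hypothetical). */
static my_inline int counter_inc(int v)
{
	return v + 1;
}
```

With the define commented out, as above, counter_inc() is always inlined into
its callers; with it enabled, gcc is free to emit an out-of-line copy in each
object file that uses it.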

"If unsure, say N." -> results in configurations with always_inline.


So I tested again. With CONFIG_OPTIMIZE_INLINING unset the result seems
fine:

              show_temp:  59 duplicates
               char2uni:  52 duplicates
               uni2char:  52 duplicates
               sd_probe:  49 duplicates
         sd_driver_init:  48 duplicates
         sd_driver_exit:  48 duplicates
 usb_serial_module_exit:  47 duplicates
                  [...]


We see ordinary "template" reuse of common driver code without renaming the
copied statics. But compiled with CONFIG_OPTIMIZE_INLINING=y the inline hint
is not respected by gcc:

            atomic_inc: 544 duplicates
       rcu_read_unlock: 453 duplicates
         rcu_read_lock: 383 duplicates
           get_dma_ops: 271 duplicates
arch_local_irq_restore: 258 duplicates
            atomic_dec: 215 duplicates
               kzalloc: 185 duplicates
      test_and_set_bit: 156 duplicates
         cpumask_check: 148 duplicates
          cpumask_next: 146 duplicates
              list_del: 131 duplicates
              kref_get: 126 duplicates
    test_and_clear_bit: 122 duplicates
                brelse: 122 duplicates
         schedule_work: 122 duplicates
   netif_tx_stop_queue: 115 duplicates
   atomic_dec_and_test: 107 duplicates
     dma_mapping_error: 105 duplicates
         list_del_init: 101 duplicates
      netif_stop_queue: 100 duplicates
 arch_local_save_flags:  98 duplicates
      tasklet_schedule:  76 duplicates
    clk_prepare_enable:  71 duplicates
       init_completion:  69 duplicates
         pskb_may_pull:  67 duplicates
                  [...]

Again, the gcc version used is "gcc (Debian 4.9.2-10) 4.9.2", so it is
neither outdated nor a legacy one. The inline heuristic seems really broken
for some parts. Is it possible that gcc is misled by inline assembler parts
which confuse the internal scoring system?

I suggest the following: I prepare a patch series for the most obvious
candidates, substituting inline with __always_inline (probably ~50
functions). Each subsystem maintainer can check and ACK the patch. This has the
benefit that for all other locations gcc is still responsible for the inlining
decision. Enforcing inlining via __always_inline for all inline-marked functions
is probably too hard!? In 2015 gcc is still not able to inline single-line
statements - that's strange.
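
As a rough illustration of what one such per-function change could look like,
here is a small sketch. It is not taken from any actual patch: my_always_inline
stands in for the kernel's __always_inline, and struct my_counter and
my_counter_inc are hypothetical.

```c
/* Assumption: stand-in for the kernel's __always_inline macro under gcc. */
#define my_always_inline inline __attribute__((always_inline))

/* Hypothetical data type, not actual kernel code. */
struct my_counter {
	int value;
};

/*
 * Before the change the helper would read:
 *   static inline void my_counter_inc(struct my_counter *c)
 * After the change gcc no longer gets to decide for this one function.
 */
static my_always_inline void my_counter_inc(struct my_counter *c)
{
	c->value++;
}
```

Only the annotated helper is forced inline; every other function marked inline
is still subject to gcc's own heuristic.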

Linus, ack?

Hagen