On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
> 
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.
> 
> Signed-off-by: Borislav Petkov <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Matt Mackall <[email protected]>
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>
> ---
>  arch/x86/crypto/chacha20_glue.c             |  2 +-
>  arch/x86/crypto/crc32c-intel_glue.c         |  3 ++-
>  arch/x86/include/asm/cmpxchg_32.h           |  2 +-
>  arch/x86/include/asm/cpufeature.h           | 32 +++--------------------------
>  arch/x86/include/asm/smp.h                  |  2 +-
>  arch/x86/kernel/cpu/amd.c                   |  2 +-
>  arch/x86/kernel/cpu/intel.c                 |  3 ++-
>  arch/x86/kernel/cpu/mtrr/generic.c          |  2 +-
>  arch/x86/kernel/cpu/mtrr/main.c             |  2 +-
>  arch/x86/kernel/cpu/perf_event_amd.c        |  4 ++--
>  arch/x86/kernel/cpu/perf_event_amd_uncore.c |  8 ++++----
>  arch/x86/kernel/fpu/init.c                  |  4 ++--
>  arch/x86/kernel/hw_breakpoint.c             |  3 ++-
>  arch/x86/kernel/vm86_32.c                   |  4 +++-
>  arch/x86/mm/setup_nx.c                      |  4 ++--
>  drivers/char/hw_random/via-rng.c            |  5 +++--
>  drivers/crypto/padlock-aes.c                |  2 +-
>  drivers/crypto/padlock-sha.c                |  3 ++-
>  fs/btrfs/disk-io.c                          |  2 +-
>  19 files changed, 35 insertions(+), 54 deletions(-)

Ok, 0day says this patch makes tiny not so tiny:

i386-tinyconfig vmlinux size:

+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| TOTAL | TEXT | DATA  | BSS |                                                                                      |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| +4646 |  +64 | +4096 |   0 | ab9976b5af96 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros       |
|   -32 |  -32 |     0 |   0 | 13e835020a02 x86/cpufeature: Cleanup get_cpu_cap()                                   |
|   +32 |  +32 |     0 |   0 | 3615f94f0486 x86/cpufeature: Move some of the scattered feature bits to x86_capabili |
|  +136 |  +32 |     0 |   0 | 506d983184f4 Merge branch 'tip-fpu-xsave' into rc2+                                  |
| +4782 |  +96 | +4096 |   0 | ALL COMMITS                                                                          |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+

Btw, thanks 0day!

The problem comes from static_cpu_has_safe() adding the alternatives and
fallback machinery. For example, before the patch, we had this at the
cpu_has_* testing sites:

        movl    boot_cpu_data+20, %eax  # MEM[(const long unsigned int *)&boot_cpu_data + 20B], D.19113
        testl   $2097152, %eax  #, D.19113
        je      .L166   #,

and now we get this:

#APP
# 449 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
# 511 "./arch/x86/include/asm/cpufeature.h" 1
        1: jmp .L166    #
2:
.skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
3:
.section .altinstructions,"a"
 .long 1b - .
 .long 4f - .
 .word 117      #
 .byte 3b - 1b
 .byte 5f - 4f
 .byte 3b - 2b
.previous
.section .altinstr_replacement,"ax"
4: jmp .L167    #
5:
.previous
.section .altinstructions,"a"
 .long 1b - .
 .long 0
 .word 21       #
 .byte 3b - 1b
 .byte 0
 .byte 0
.previous

# 0 "" 2
#NO_APP
        jmp     .L168   #
.L166:
        movl    $21, %eax       #,
        call    __static_cpu_has_safe   #
        testb   %al, %al        # D.19126
        je      .L167   #,
.L168:
#APP
# 453 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
#NO_APP

That gets spread among the .altinstructions, .altinstr_replacement, .text
etc. sections. .data grows too, probably because of the NOP padding :-\
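
For comparison, the old three-instruction sequence is just a load and a
bit test: 2097152 is 1 << 21, the same feature number 21 that the new
code passes to __static_cpu_has_safe(). A rough userspace sketch of that
test follows. The model_* names and the array size are made up for
illustration; this is not the kernel's struct or macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the capability words; not the kernel struct. */
#define MODEL_NCAPINTS 4

struct model_cpuinfo {
	uint32_t capability[MODEL_NCAPINTS];
};

/* Mark feature number `bit` as present. */
static inline void model_set_cap(struct model_cpuinfo *c, unsigned int bit)
{
	assert(bit < MODEL_NCAPINTS * 32);
	c->capability[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Load the word holding `bit` and test it: the movl/testl/je shape. */
static inline int model_cpu_has(const struct model_cpuinfo *c, unsigned int bit)
{
	assert(bit < MODEL_NCAPINTS * 32);
	return (c->capability[bit / 32] >> (bit % 32)) & 1u;
}
```

Nothing here needs patching at boot, which is why the old form costs no
extra section entries.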

                   text    data     bss     dec     hex filename
before:          644896  127436 1189384 1961716  1deef4 vmlinux
after:           645446  131532 1189384 1966362  1e011a vmlinux

        [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
before: [12] .altinstructions  PROGBITS        c10bdf48 0bef48 000680 00   A  0   0  1
after:  [12] .altinstructions  PROGBITS        c10bff48 0c0f48 0007d2 00   A  0   0  1

before: [13] .altinstr_replace PROGBITS        c10be5c8 0bf5c8 00016c 00  AX  0   0  1
after:  [13] .altinstr_replace PROGBITS        c10c071a 0c171a 0001ad 00  AX  0   0  1

before: [ 7] .data             PROGBITS        c1092000 093000 0132a0 00  WA  0   0 4096
after:  [ 7] .data             PROGBITS        c1093000 094000 0142a0 00  WA  0   0 4096
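
A quick cross-check of those sizes, as a sketch only: it assumes every
added .altinstructions record has the layout from the listing above (two
.long, one .word, three .byte, so 13 bytes). The helper names are made
up for illustration.

```c
#include <assert.h>

/* Record size taken from the listing: .long, .long, .word, 3 x .byte. */
enum { ALT_ENTRY_BYTES = 4 + 4 + 2 + 1 + 1 + 1 };

/* Growth of a section in bytes, given its Size before and after. */
static inline long section_growth(long before, long after)
{
	return after - before;
}

/* Number of whole records in `growth` bytes of .altinstructions. */
static inline long alt_entries_added(long growth)
{
	assert(growth % ALT_ENTRY_BYTES == 0);
	return growth / ALT_ENTRY_BYTES;
}
```

With the sizes above, .altinstructions grows by 0x7d2 - 0x680 = 338
bytes, i.e. 26 records, and .altinstr_replacement by 0x1ad - 0x16c = 65
bytes. That would be consistent with 13 converted sites, each emitting
two records and one 5-byte jmp. .data grows by exactly 0x1000 = 4096.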

So I'm wondering if we should make a config option which converts the
static_cpu_has* macros to boot_cpu_has()? That should slim down
the kernel even more, but such a kernel won't benefit from the speedup
of the static_cpu_has* stuff.
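
Roughly what I have in mind, as an untested userspace sketch.
CONFIG_X86_TINY_CPUFEATURE is a made-up symbol, and the model_* names
are stand-ins so this compiles outside the kernel; the real macros live
in arch/x86/include/asm/cpufeature.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical symbol; normally Kconfig would define it, not the source. */
#define CONFIG_X86_TINY_CPUFEATURE 1

/* Userspace stand-ins for the capability words and the plain bit test. */
static uint32_t model_caps[4];

static inline void model_set_cap(unsigned int bit)
{
	assert(bit < 4 * 32);
	model_caps[bit / 32] |= UINT32_C(1) << (bit % 32);
}

#define model_boot_cpu_has(bit) \
	((int)((model_caps[(bit) / 32] >> ((bit) % 32)) & 1u))

#ifdef CONFIG_X86_TINY_CPUFEATURE
/* Tiny build: no alternatives records, just the plain bit test. */
#define model_static_cpu_has(bit)	model_boot_cpu_has(bit)
#define model_static_cpu_has_safe(bit)	model_boot_cpu_has(bit)
#else
#error "the patched (alternatives) variant is not modelled in this sketch"
#endif
```

The tradeoff is the one described above: every site pays for a load and
a test at runtime instead of a patched jump.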

Josh, thoughts?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.