Re: [GIT PULL] x86/alternatives padding
On Wed, Mar 04, 2015 at 09:22:27PM +0100, Ingo Molnar wrote:
> So you could have a look at the detailed section dump itself via:
>
>   objdump -h vmlinux
>
> there .text will be the raw text and .alt* will be listed separately.
> The 'size' tool will add up executable sections IIRC, mixing these
> sections.

Right.

> .alt* is freed after init, so it's not really a kernel image size
> increase, right?

Exactly:

void free_initmem(void)
{
        free_init_pages("unused kernel",
                        (unsigned long)(&__init_begin),
                        (unsigned long)(&__init_end));
}

which are:

 69708: 81ee9000     0 NOTYPE  GLOBAL DEFAULT   16 __init_begin
 72679: 81ff9000     0 NOTYPE  GLOBAL DEFAULT   25 __init_end

and there's a bunch of stuff between 81ee9000 and 81ff9000:

  [Nr] Name               Type      Address   Offset    Size      EntSize  Flags  Link  Info  Align
  [17] .init.text         PROGBITS  81efe000  012fe000  00066c98           AX     0     0     16
  [18] .init.data         PROGBITS  81f65000  01365000  00086d18           WA     0     0     4096
  [19] .x86_cpu_dev.init  PROGBITS  81febd18  013ebd18  0018               A      0     0     8
  [20] .altinstructions   PROGBITS  81febd30  013ebd30  7e4b               A      0     0     1
  [21] .altinstr_replace  PROGBITS  81ff3b7b  013f3b7b  2044               AX     0     0     1
  [22] .iommu_table       PROGBITS  81ff5bc0  013f5bc0  00c8               A      0     0     8
  [23] .apicdrivers       PROGBITS  81ff5c88  013f5c88  0010               WA     0     0     8
  [24] .exit.text         PROGBITS  81ff5c98  013f5c98  2412               AX     0     0     1

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
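The claim that .alt* is freed with the rest of init memory can be checked against the numbers in the mail above. The following is a minimal sketch, not kernel code: the helper name in_init_range() and the macro names are made up for illustration, and the addresses and sizes are copied as printed in the symbol and section dumps.

```c
#include <stdint.h>

/* Values of __init_begin and __init_end as printed in the mail. */
#define INIT_BEGIN 0x81ee9000u
#define INIT_END   0x81ff9000u

/*
 * Hypothetical helper: returns 1 if the section [addr, addr + size)
 * lies entirely inside [INIT_BEGIN, INIT_END), 0 otherwise.
 * 64-bit arithmetic avoids any wrap-around in addr + size.
 */
static int in_init_range(uint64_t addr, uint64_t size)
{
	return addr >= INIT_BEGIN && addr + size <= INIT_END;
}
```

Called with the rows for .altinstructions (81febd30, size 7e4b) and .altinstr_replace (81ff3b7b, size 2044), the helper reports both as inside the range that free_initmem() hands to free_init_pages().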
Re: [GIT PULL] x86/alternatives padding
* Borislav Petkov wrote:

> On Wed, Mar 04, 2015 at 12:22:06PM +0100, Borislav Petkov wrote:
> > Well, kernel image doesn't change while vmlinux shows only a very small
> > .text increase of about 2K. I'm not sure yet why that happens though
> > because it shouldn't be the padding. Because we will have to do it
> > anyway, this patchset makes it automatic instead of by-hand, so to
> > speak.
> >
> > Let me bisect it and see which patch adds the increase.
>
> Doh, of course. I've added u8 padlen to the alternative instruction
> entry struct. For 2Kish alt sites in total, this explains the almost
> exact same increase in text size:
>
>     text     data      bss       dec     hex  filename
> 12290539  1595264  1085440  14971243  e4716b  vmlinux
>
> 338ea55579d1... x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define
>     text     data      bss       dec     hex  filename
> 12290539  1595264  1085440  14971243  e4716b  vmlinux
>
> db477a3386de... x86/alternatives: Cleanup DPRINTK macro
>     text     data      bss       dec     hex  filename
> 12290539  1595264  1085440  14971243  e4716b  vmlinux
>
> 4332195c5615... x86/alternatives: Add instruction padding
>     text     data      bss       dec     hex  filename
> 12293030  1595264  1085440  14973734  e47b26  vmlinux
>      ^^^

So you could have a look at the detailed section dump itself via:

  objdump -h vmlinux

there .text will be the raw text and .alt* will be listed separately.
The 'size' tool will add up executable sections IIRC, mixing these
sections.

.alt* is freed after init, so it's not really a kernel image size
increase, right?

Thanks,

	Ingo
Re: [GIT PULL] x86/alternatives padding
On Wed, Mar 04, 2015 at 12:22:06PM +0100, Borislav Petkov wrote:
> Well, kernel image doesn't change while vmlinux shows only a very small
> .text increase of about 2K. I'm not sure yet why that happens though
> because it shouldn't be the padding. Because we will have to do it
> anyway, this patchset makes it automatic instead of by-hand, so to
> speak.
>
> Let me bisect it and see which patch adds the increase.

Doh, of course. I've added u8 padlen to the alternative instruction
entry struct. For 2Kish alt sites in total, this explains the almost
exact same increase in text size:

    text     data      bss       dec     hex  filename
12290539  1595264  1085440  14971243  e4716b  vmlinux

338ea55579d1... x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define
    text     data      bss       dec     hex  filename
12290539  1595264  1085440  14971243  e4716b  vmlinux

db477a3386de... x86/alternatives: Cleanup DPRINTK macro
    text     data      bss       dec     hex  filename
12290539  1595264  1085440  14971243  e4716b  vmlinux

4332195c5615... x86/alternatives: Add instruction padding
    text     data      bss       dec     hex  filename
12293030  1595264  1085440  14973734  e47b26  vmlinux
     ^^^

Ok, that's sorted out now.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
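The arithmetic behind "2Kish alt sites explain the increase" can be sketched as follows. This is an illustration under stated assumptions, not the kernel's definition: the mail confirms only that a `u8 padlen` member was added to the alternative instruction entry struct. The struct names, the other field names, and the packed layout (a GCC/Clang attribute) are assumptions made so that one extra u8 costs exactly one byte per entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed entry layout before the change; field names are illustrative. */
struct alt_entry_old {
	int32_t  instr_offset;
	int32_t  repl_offset;
	uint16_t cpuid;
	uint8_t  instrlen;
	uint8_t  replacementlen;
} __attribute__((packed));

/* Same assumed layout with the one-byte padlen member from the mail. */
struct alt_entry_new {
	int32_t  instr_offset;
	int32_t  repl_offset;
	uint16_t cpuid;
	uint8_t  instrlen;
	uint8_t  replacementlen;
	uint8_t  padlen;
} __attribute__((packed));

/* Extra table bytes for nsites entries: nsites times the per-entry growth. */
static size_t table_growth(size_t nsites)
{
	return nsites * (sizeof(struct alt_entry_new) -
			 sizeof(struct alt_entry_old));
}
```

With the 2478 sites reported for x86_64 defconfig in the pull request, table_growth() gives 2478 bytes under these assumptions. That is close to the 12293030 - 12290539 = 2491 bytes by which the text column grows at the "Add instruction padding" commit; the site count of the build measured here may differ from the defconfig figure.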
Re: [GIT PULL] x86/alternatives padding
On Wed, Mar 04, 2015 at 08:32:21AM +0100, Ingo Molnar wrote:
> Just curious: did the kernel image size change before/after these
> changes? I.e. was any of the existing alternative instructions using
> sites coded sub-optimally, with a larger maximum instruction size
> allocated than strictly needed?
>
> At least some of your improvements made things more optimal -
> wondering at the total win, beyond the significant maintainability win
> that is.

Well, kernel image doesn't change while vmlinux shows only a very small
.text increase of about 2K. I'm not sure yet why that happens though
because it shouldn't be the padding. Because we will have to do it
anyway, this patchset makes it automatic instead of by-hand, so to
speak.

Let me bisect it and see which patch adds the increase.

4.0-rc1 with alternatives patchset:
===

Setup is 15644 bytes (padded to 15872 bytes).
System is 5855 kB
CRC f2669897
Kernel: arch/x86/boot/bzImage is ready  (#1)

    text     data      bss       dec     hex  filename
12292971  1595264  1085440  14973675  e47aeb  vmlinux

plain 4.0-rc1:
==

Setup is 15644 bytes (padded to 15872 bytes).
System is 5855 kB
CRC 7200607a
Kernel: arch/x86/boot/bzImage is ready  (#1)

    text     data      bss       dec     hex  filename
12290539  1595264  1085440  14971243  e4716b  vmlinux

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
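To make the "about 2K" figure concrete, the two size rows above can be compared directly. This is a small sketch with made-up names (struct size_row, row_consistent(), text_delta()); it only restates the numbers printed in the mail, on the usual reading that the dec column is the sum of text, data and bss.

```c
/* One row of the 'size vmlinux' output quoted in the mail. */
struct size_row {
	unsigned long text;
	unsigned long data;
	unsigned long bss;
	unsigned long dec;	/* total as printed in the dec column */
};

/* Returns 1 if the printed total equals text + data + bss. */
static int row_consistent(const struct size_row *r)
{
	return r->text + r->data + r->bss == r->dec;
}

/* Growth of the text column from one build to another, in bytes. */
static long text_delta(const struct size_row *before,
		       const struct size_row *after)
{
	return (long)after->text - (long)before->text;
}
```

Fed the plain 4.0-rc1 row and the row for 4.0-rc1 with the alternatives patchset, text_delta() yields 2432 bytes, while data and bss are identical in both rows.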
Re: [GIT PULL] x86/alternatives padding
* Borislav Petkov wrote:

> Hi guys,
>
> so this one has been long in the making and has been passing testing
> on a bunch of boxes and bitness here so maybe we should try to put it
> into the wider tip mix and see what happens. If all is well, great, if
> there's trouble which I haven't managed to trigger in my testing, we can
> remove it from tip/master until all issues are fixed.
>
> Btw, the last three patches are adjusting and improving perf bench a
> little as it includes memcpy/memset_64.S directly and this patchset
> breaks it with the changes otherwise.
>
> Please pull,
> thanks.
>
> ---
> The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539:
>
>   Linux 4.0-rc1 (2015-02-22 18:21:14 -0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git tags/alternatives_padding
>
> for you to fetch changes up to dfecb95cdfeaf7872d83a96bec3a606e9cd95c8d:
>
>   perf/bench: Add -r all so that you can run all mem* routines (2015-03-03 18:01:58 +0100)
>
> A more involved rework of the alternatives framework to be able to
> pad instructions and thus make using the alternatives macros more
> straightforward and without having to figure out old and new instruction
> sizes but have the toolchain figure that out for us.
>
> Furthermore, it optimizes JMPs used so that fetch and decode can be
> relieved with smaller versions of the JMPs, where possible.
>
> Some stats:
>
> x86_64 defconfig:
>
> Alternatives sites total:              2478
> Total padding added (in Bytes):        6051

Just curious: did the kernel image size change before/after these
changes? I.e. was any of the existing alternative instructions using
sites coded sub-optimally, with a larger maximum instruction size
allocated than strictly needed?

At least some of your improvements made things more optimal -
wondering at the total win, beyond the significant maintainability win
that is.

> The padding is currently done for:
>
> X86_FEATURE_ALWAYS
> X86_FEATURE_ERMS
> X86_FEATURE_LFENCE_RDTSC
> X86_FEATURE_MFENCE_RDTSC
> X86_FEATURE_SMAP
>
> This is with the latest version of the patchset. Of course, on each
> machine the alternatives sites actually being patched are a proper
> subset of the total number.
>
> Borislav Petkov (18):
>   x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define
>   x86/alternatives: Cleanup DPRINTK macro
>   x86/alternatives: Add instruction padding
>   x86/alternatives: Make JMPs more robust
>   x86/alternatives: Use optimized NOPs for padding
>   x86/lib/copy_page_64.S: Use generic ALTERNATIVE macro
>   x86/lib/copy_user_64.S: Convert to ALTERNATIVE_2
>   x86/smap: Use ALTERNATIVE macro
>   x86/entry_32: Convert X86_INVD_BUG to ALTERNATIVE macro
>   x86/lib/clear_page_64.S: Convert to ALTERNATIVE_2 macro
>   x86/asm: Use alternative_2() in rdtsc_barrier()
>   x86/asm: Cleanup prefetch primitives
>   x86/lib/memset_64.S: Convert to ALTERNATIVE_2 macro
>   x86/lib/memmove_64.S: Convert memmove() to ALTERNATIVE macro
>   x86/lib/memcpy_64.S: Convert memcpy to ALTERNATIVE_2 macro
>   perf/bench: Fix mem* routines usage after alternatives change
>   perf/bench: Carve out mem routine benchmarking
>   perf/bench: Add -r all so that you can run all mem* routines
>
>  arch/x86/include/asm/alternative-asm.h        |  43 ++-
>  arch/x86/include/asm/alternative.h            |  65 +++
>  arch/x86/include/asm/apic.h                   |   2 +-
>  arch/x86/include/asm/barrier.h                |   6 +-
>  arch/x86/include/asm/cpufeature.h             |  30 ++---
>  arch/x86/include/asm/processor.h              |  16 ++-
>  arch/x86/include/asm/smap.h                   |  30 ++---
>  arch/x86/kernel/alternative.c                 | 158 ++
>  arch/x86/kernel/cpu/amd.c                     |   5 +
>  arch/x86/kernel/entry_32.S                    |  12 +-
>  arch/x86/lib/clear_page_64.S                  |  66 +--
>  arch/x86/lib/copy_page_64.S                   |  37 ++
>  arch/x86/lib/copy_user_64.S                   |  46 ++--
>  arch/x86/lib/memcpy_64.S                      |  68 ---
>  arch/x86/lib/memmove_64.S                     |  19 +---
>  arch/x86/lib/memset_64.S                      |  61 --
>  arch/x86/um/asm/barrier.h                     |   4 +-
>  tools/perf/bench/mem-memcpy-x86-64-asm-def.h  |   6 +-
>  tools/perf/bench/mem-memcpy-x86-64-asm.S      |   2 -
>  tools/perf/bench/mem-memcpy.c                 | 128 +++--
>  tools/perf/bench/mem-memset-x86-64-asm-def.h  |   6 +-
>  tools/perf/bench/mem-memset-x86-64-asm.S      |   2 -
>  tools/perf/util/include/asm/alternative-asm.h |   1 +
>  23 files changed, 433 insertions(+), 380 deletions(-)

Pulled into tip:x86/asm, thanks Boris! (I made a few comments as replies to the