* Borislav Petkov <b...@alien8.de> wrote:

> Hi guys,
>
> so this one has been long in the making and has been passing testing
> on a bunch of boxes and bitness here so maybe we should try to put it
> into the wider tip mix and see what happens. If all is well, great, if
> there's trouble which I haven't managed to trigger in my testing, we can
> remove it from tip/master until all issues are fixed.
>
> Btw, the last three patches are adjusting and improving perf bench a
> little as it includes memcpy/memset_64.S directly and this patchset
> breaks it with the changes otherwise.
>
> Please pull,
> thanks.
>
> ---
> The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539:
>
>   Linux 4.0-rc1 (2015-02-22 18:21:14 -0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git tags/alternatives_padding
>
> for you to fetch changes up to dfecb95cdfeaf7872d83a96bec3a606e9cd95c8d:
>
>   perf/bench: Add -r all so that you can run all mem* routines (2015-03-03 18:01:58 +0100)
>
> ----------------------------------------------------------------
> A more involved rework of the alternatives framework to be able to
> pad instructions and thus make using the alternatives macros more
> straightforward and without having to figure out old and new instruction
> sizes but have the toolchain figure that out for us.
>
> Furthermore, it optimizes JMPs used so that fetch and decode can be
> relieved with smaller versions of the JMPs, where possible.
>
> Some stats:
>
> x86_64 defconfig:
>
> Alternatives sites total:               2478
> Total padding added (in Bytes):         6051

Just curious: did the kernel image size change before/after these
changes? I.e. was any of the existing alternative instructions using
sites coded sub-optimally, with a larger maximum instruction size
allocated than strictly needed?

At least some of your improvements made things more optimal - wondering
at the total win, beyond the significant maintainability win that is.

> The padding is currently done for:
>
> X86_FEATURE_ALWAYS
> X86_FEATURE_ERMS
> X86_FEATURE_LFENCE_RDTSC
> X86_FEATURE_MFENCE_RDTSC
> X86_FEATURE_SMAP
>
> This is with the latest version of the patchset. Of course, on each
> machine the alternatives sites actually being patched are a proper
> subset of the total number.
>
> ----------------------------------------------------------------
> Borislav Petkov (18):
>       x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define
>       x86/alternatives: Cleanup DPRINTK macro
>       x86/alternatives: Add instruction padding
>       x86/alternatives: Make JMPs more robust
>       x86/alternatives: Use optimized NOPs for padding
>       x86/lib/copy_page_64.S: Use generic ALTERNATIVE macro
>       x86/lib/copy_user_64.S: Convert to ALTERNATIVE_2
>       x86/smap: Use ALTERNATIVE macro
>       x86/entry_32: Convert X86_INVD_BUG to ALTERNATIVE macro
>       x86/lib/clear_page_64.S: Convert to ALTERNATIVE_2 macro
>       x86/asm: Use alternative_2() in rdtsc_barrier()
>       x86/asm: Cleanup prefetch primitives
>       x86/lib/memset_64.S: Convert to ALTERNATIVE_2 macro
>       x86/lib/memmove_64.S: Convert memmove() to ALTERNATIVE macro
>       x86/lib/memcpy_64.S: Convert memcpy to ALTERNATIVE_2 macro
>       perf/bench: Fix mem* routines usage after alternatives change
>       perf/bench: Carve out mem routine benchmarking
>       perf/bench: Add -r all so that you can run all mem* routines
>
>  arch/x86/include/asm/alternative-asm.h        |  43 ++++++-
>  arch/x86/include/asm/alternative.h            |  65 +++++++----
>  arch/x86/include/asm/apic.h                   |   2 +-
>  arch/x86/include/asm/barrier.h                |   6 +-
>  arch/x86/include/asm/cpufeature.h             |  30 ++---
>  arch/x86/include/asm/processor.h              |  16 ++-
>  arch/x86/include/asm/smap.h                   |  30 ++---
>  arch/x86/kernel/alternative.c                 | 158 ++++++++++++++++++++++----
>  arch/x86/kernel/cpu/amd.c                     |   5 +
>  arch/x86/kernel/entry_32.S                    |  12 +-
>  arch/x86/lib/clear_page_64.S                  |  66 +++++------
>  arch/x86/lib/copy_page_64.S                   |  37 ++----
>  arch/x86/lib/copy_user_64.S                   |  46 ++------
>  arch/x86/lib/memcpy_64.S                      |  68 ++++-------
>  arch/x86/lib/memmove_64.S                     |  19 +---
>  arch/x86/lib/memset_64.S                      |  61 ++++------
>  arch/x86/um/asm/barrier.h                     |   4 +-
>  tools/perf/bench/mem-memcpy-x86-64-asm-def.h  |   6 +-
>  tools/perf/bench/mem-memcpy-x86-64-asm.S      |   2 -
>  tools/perf/bench/mem-memcpy.c                 | 128 +++++++++++----------
>  tools/perf/bench/mem-memset-x86-64-asm-def.h  |   6 +-
>  tools/perf/bench/mem-memset-x86-64-asm.S      |   2 -
>  tools/perf/util/include/asm/alternative-asm.h |   1 +
>  23 files changed, 433 insertions(+), 380 deletions(-)

Pulled into tip:x86/asm, thanks Boris!

(I made a few comments as replies to the patches themselves, none
affected the quality of this tree so I pulled it.)

Thanks,

	Ingo