Re: [PATCH v9 04/38] x86/CPU/AMD: Add the Secure Memory Encryption CPU feature
On Tue, Jul 11, 2017 at 01:07:46AM -0400, Brian Gerst wrote:
> > If I make the scattered feature support conditional on CONFIG_X86_64
> > (based on comment below) then cpu_has() will always be false unless
> > CONFIG_X86_64 is enabled. So this won't need to be wrapped by the
> > #ifdef.
>
> If you change it to use cpu_feature_enabled(), gcc will see that it is
> disabled and eliminate the dead code at compile time.

Just do this:

	if (cpu_has(c, X86_FEATURE_SME)) {
		if (IS_ENABLED(CONFIG_X86_32)) {
			clear_cpu_cap(c, X86_FEATURE_SME);
		} else {
			u64 msr;

			/* Check if SME is enabled */
			rdmsrl(MSR_K8_SYSCFG, msr);
			if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
				clear_cpu_cap(c, X86_FEATURE_SME);
		}
	}

so that it is explicit that we disable it on 32-bit and we can save us
the ifdeffery elsewhere.

Thanks.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
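For background on the compile-time elimination both reviewers mention: cpu_feature_enabled() consults the build-time DISABLED_MASK* words, so a feature listed there folds to a constant zero and gcc drops the guarded branch entirely. A sketch of the shape such a disabled-features.h entry takes (names follow the existing DISABLE_* convention; treat the details as illustrative rather than the exact patch that was merged):

	/*
	 * Sketch: declare SME statically disabled on 32-bit builds so that
	 * cpu_feature_enabled(X86_FEATURE_SME) is compile-time false and
	 * the MSR-probing branch in early_init_amd() is dead code there.
	 */
	#ifdef CONFIG_X86_64
	# define DISABLE_SME	0
	#else
	# define DISABLE_SME	(1 << (X86_FEATURE_SME & 31))
	#endif

	/* X86_FEATURE_SME is bit 10 of word 7, so it joins word 7's mask. */
	#define DISABLED_MASK7	(DISABLE_SME)

With an entry like this in place, the runtime test Boris shows above and a cpu_feature_enabled()-based test both behave identically on 64-bit, while 32-bit builds carry no SME code at all.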
Re: [PATCH v9 04/38] x86/CPU/AMD: Add the Secure Memory Encryption CPU feature
On Mon, Jul 10, 2017 at 3:41 PM, Tom Lendacky wrote:
> On 7/8/2017 7:50 AM, Brian Gerst wrote:
>>
>> On Fri, Jul 7, 2017 at 9:38 AM, Tom Lendacky wrote:
>>>
>>> Update the CPU features to include identifying and reporting on the
>>> Secure Memory Encryption (SME) feature. SME is identified by CPUID
>>> 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>>> MSR_K8_SYSCFG). Only show the SME feature as available if reported by
>>> CPUID and enabled by BIOS.
>>>
>>> Reviewed-by: Borislav Petkov
>>> Signed-off-by: Tom Lendacky
>>> ---
>>>  arch/x86/include/asm/cpufeatures.h |  1 +
>>>  arch/x86/include/asm/msr-index.h   |  2 ++
>>>  arch/x86/kernel/cpu/amd.c          | 13 +++++++++++++
>>>  arch/x86/kernel/cpu/scattered.c    |  1 +
>>>  4 files changed, 17 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>>> index 2701e5f..2b692df 100644
>>> --- a/arch/x86/include/asm/cpufeatures.h
>>> +++ b/arch/x86/include/asm/cpufeatures.h
>>> @@ -196,6 +196,7 @@
>>>
>>>  #define X86_FEATURE_HW_PSTATE      ( 7*32+ 8) /* AMD HW-PState */
>>>  #define X86_FEATURE_PROC_FEEDBACK  ( 7*32+ 9) /* AMD ProcFeedbackInterface */
>>> +#define X86_FEATURE_SME            ( 7*32+10) /* AMD Secure Memory Encryption */
>>
>> Given that this feature is available only in long mode, this should be
>> added to disabled-features.h as disabled for 32-bit builds.
>
> I can add that. If the series needs a re-spin then I'll include this
> change in the series, otherwise I can send a follow-on patch to handle
> the feature for 32-bit builds if that works.
>
>>>  #define X86_FEATURE_INTEL_PPIN     ( 7*32+14) /* Intel Processor Inventory Number */
>>>  #define X86_FEATURE_INTEL_PT       ( 7*32+15) /* Intel Processor Trace */
>>>
>>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>>> index 18b1623..460ac01 100644
>>> --- a/arch/x86/include/asm/msr-index.h
>>> +++ b/arch/x86/include/asm/msr-index.h
>>> @@ -352,6 +352,8 @@
>>>  #define MSR_K8_TOP_MEM1                0xc001001a
>>>  #define MSR_K8_TOP_MEM2                0xc001001d
>>>  #define MSR_K8_SYSCFG                  0xc0010010
>>> +#define MSR_K8_SYSCFG_MEM_ENCRYPT_BIT  23
>>> +#define MSR_K8_SYSCFG_MEM_ENCRYPT      BIT_ULL(MSR_K8_SYSCFG_MEM_ENCRYPT_BIT)
>>>  #define MSR_K8_INT_PENDING_MSG         0xc0010055
>>>  /* C1E active bits in int pending message */
>>>  #define K8_INTP_C1E_ACTIVE_MASK        0x18000000
>>>
>>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>>> index bb5abe8..c47ceee 100644
>>> --- a/arch/x86/kernel/cpu/amd.c
>>> +++ b/arch/x86/kernel/cpu/amd.c
>>> @@ -611,6 +611,19 @@ static void early_init_amd(struct cpuinfo_x86 *c)
>>>          */
>>>         if (cpu_has_amd_erratum(c, amd_erratum_400))
>>>                 set_cpu_bug(c, X86_BUG_AMD_E400);
>>> +
>>> +       /*
>>> +        * BIOS support is required for SME. If BIOS has not enabled SME
>>> +        * then don't advertise the feature (set in scattered.c)
>>> +        */
>>> +       if (cpu_has(c, X86_FEATURE_SME)) {
>>> +               u64 msr;
>>> +
>>> +               /* Check if SME is enabled */
>>> +               rdmsrl(MSR_K8_SYSCFG, msr);
>>> +               if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
>>> +                       clear_cpu_cap(c, X86_FEATURE_SME);
>>> +       }
>>
>> This should be conditional on CONFIG_X86_64.
>
> If I make the scattered feature support conditional on CONFIG_X86_64
> (based on comment below) then cpu_has() will always be false unless
> CONFIG_X86_64 is enabled. So this won't need to be wrapped by the
> #ifdef.

If you change it to use cpu_feature_enabled(), gcc will see that it is
disabled and eliminate the dead code at compile time.
>>>  }
>>>
>>>  static void init_amd_k8(struct cpuinfo_x86 *c)
>>>
>>> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
>>> index 23c2350..05459ad 100644
>>> --- a/arch/x86/kernel/cpu/scattered.c
>>> +++ b/arch/x86/kernel/cpu/scattered.c
>>> @@ -31,6 +31,7 @@ struct cpuid_bit {
>>>         { X86_FEATURE_HW_PSTATE,        CPUID_EDX,  7, 0x80000007, 0 },
>>>         { X86_FEATURE_CPB,              CPUID_EDX,  9, 0x80000007, 0 },
>>>         { X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>>> +       { X86_FEATURE_SME,              CPUID_EAX,  0, 0x8000001f, 0 },
>>
>> This should also be conditional. We don't want to set this feature on
>> 32-bit, even if the processor has support.
>
> Can do. See comment above about re-spin vs. follow-on patch.
>
> Thanks,
> Tom

A followup patch will be OK if there is no code that will get confused
by the SME bit being present but not active.

--
Brian Gerst
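For illustration, making the scattered-bits table entry itself conditional, as Brian suggests here, could look like the sketch below. The disabled-features.h route shown earlier addresses the same concern at the cpu_feature_enabled() level, which is one reason a guard like this may end up unnecessary:

	/* Sketch only: skip the SME lookup entry on 32-bit builds so the
	 * scattered pass never sets the bit in the first place. */
	static const struct cpuid_bit cpuid_bits[] = {
		{ X86_FEATURE_HW_PSTATE,     CPUID_EDX,  7, 0x80000007, 0 },
	#ifdef CONFIG_X86_64
		{ X86_FEATURE_SME,           CPUID_EAX,  0, 0x8000001f, 0 },
	#endif
		{ 0, 0, 0, 0, 0 }
	};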
Re: [PATCH v9 07/38] x86/mm: Remove phys_to_virt() usage in ioremap()
On Mon, Jul 10, 2017 at 3:50 PM, Tom Lendacky wrote:
> On 7/8/2017 7:57 AM, Brian Gerst wrote:
>>
>> On Fri, Jul 7, 2017 at 9:39 AM, Tom Lendacky wrote:
>>>
>>> Currently there is a check if the address being mapped is in the ISA
>>> range (is_ISA_range()), and if it is, then phys_to_virt() is used to
>>> perform the mapping. When SME is active, the default is to add pagetable
>>> mappings with the encryption bit set unless specifically overridden. The
>>> resulting pagetable mapping from phys_to_virt() will result in a mapping
>>> that has the encryption bit set. With SME, the use of ioremap() is
>>> intended to generate pagetable mappings that do not have the encryption
>>> bit set through the use of the PAGE_KERNEL_IO protection value.
>>>
>>> Rather than special case the SME scenario, remove the ISA range check and
>>> usage of phys_to_virt() and have ISA range mappings continue through the
>>> remaining ioremap() path.
>>>
>>> Signed-off-by: Tom Lendacky
>>> ---
>>>  arch/x86/mm/ioremap.c |  7 +--
>>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>>
>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>> index 4c1b5fd..bfc3e2d 100644
>>> --- a/arch/x86/mm/ioremap.c
>>> +++ b/arch/x86/mm/ioremap.c
>>> @@ -13,6 +13,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>
>>>  #include
>>>  #include
>>> @@ -106,12 +107,6 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>>         }
>>>
>>>         /*
>>> -        * Don't remap the low PCI/ISA area, it's always mapped..
>>> -        */
>>> -       if (is_ISA_range(phys_addr, last_addr))
>>> -               return (__force void __iomem *)phys_to_virt(phys_addr);
>>> -
>>> -       /*
>>>          * Don't allow anybody to remap normal RAM that we're using..
>>>          */
>>>         pfn = phys_addr >> PAGE_SHIFT;
>>
>> Removing this also affects 32-bit, which is more likely to access
>> legacy devices in this range. Put in a check for SME instead
>
> I originally had a check for SME here in a previous version of the
> patch. Thomas Gleixner recommended removing the check so that the code
> path was always exercised regardless of the state of SME in order to
> better detect issues:
>
> http://marc.info/?l=linux-kernel&m=149803067811436&w=2
>
> Thanks,
> Tom

Looking a bit closer, this shortcut doesn't set the caching attributes.
So it's probably best to get rid of it anyways. Also note, there is a
corresponding check in iounmap().

--
Brian Gerst
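The corresponding check Brian mentions is the early return at the top of iounmap(), which pairs with the shortcut being removed here. Roughly, sketched from the ioremap.c of that era and lightly abridged:

	void iounmap(volatile void __iomem *addr)
	{
		/*
		 * __ioremap special-cased the low PCI/ISA range by returning
		 * an address in the kernel mapping instead of building a new
		 * vma, so iounmap() has nothing to tear down for it.
		 */
		if ((void __force *)addr >= phys_to_virt(ISA_START_ADDRESS) &&
		    (void __force *)addr < phys_to_virt(ISA_END_ADDRESS))
			return;

		/* ... normal vm_struct lookup and teardown follows ... */
	}

If the ioremap() side of the shortcut goes away, this early return becomes dead weight too, which is presumably why Brian flags it.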
Re: [PATCH v9 07/38] x86/mm: Remove phys_to_virt() usage in ioremap()
> On 7/8/2017 7:57 AM, Brian Gerst wrote:
>> On Fri, Jul 7, 2017 at 9:39 AM, Tom Lendacky wrote:
>>>
>>> Currently there is a check if the address being mapped is in the ISA
>>> range (is_ISA_range()), and if it is, then phys_to_virt() is used to
>>> perform the mapping. When SME is active, the default is to add pagetable
>>> mappings with the encryption bit set unless specifically overridden. The
>>> resulting pagetable mapping from phys_to_virt() will result in a mapping
>>> that has the encryption bit set. With SME, the use of ioremap() is
>>> intended to generate pagetable mappings that do not have the encryption
>>> bit set through the use of the PAGE_KERNEL_IO protection value.
>>>
>>> Rather than special case the SME scenario, remove the ISA range check and
>>> usage of phys_to_virt() and have ISA range mappings continue through the
>>> remaining ioremap() path.
>>>
>>> [...]
>>
>> Removing this also affects 32-bit, which is more likely to access
>> legacy devices in this range. Put in a check for SME instead
>
> I originally had a check for SME here in a previous version of the
> patch. Thomas Gleixner recommended removing the check so that the code
> path was always exercised regardless of the state of SME in order to
> better detect issues:
>
> http://marc.info/?l=linux-kernel&m=149803067811436&w=2
>
> Thanks,
> Tom

(provided you follow my recommendations to not set the SME feature bit
on 32-bit even when the processor supports it)

--
Brian Gerst
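For completeness, the alternative Brian floated ("put in a check for SME instead") would have looked roughly like this inside __ioremap_caller(), using the sme_active() predicate the series introduces. This is a sketch of the rejected variant, not actual patch code:

	/*
	 * Keep the low PCI/ISA fast path only when SME is inactive; with
	 * SME on, phys_to_virt() would hand back a mapping with the
	 * encryption bit set, which is wrong for MMIO.
	 */
	if (is_ISA_range(phys_addr, last_addr) && !sme_active())
		return (__force void __iomem *)phys_to_virt(phys_addr);

Per the marc.info link above, Thomas Gleixner argued against this shape precisely because it leaves the non-shortcut path unexercised on most systems.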
Re: [PATCH v9 04/38] x86/CPU/AMD: Add the Secure Memory Encryption CPU feature
On 7/8/2017 7:50 AM, Brian Gerst wrote:
> On Fri, Jul 7, 2017 at 9:38 AM, Tom Lendacky wrote:
>>
>> Update the CPU features to include identifying and reporting on the
>> Secure Memory Encryption (SME) feature. SME is identified by CPUID
>> 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>> MSR_K8_SYSCFG). Only show the SME feature as available if reported by
>> CPUID and enabled by BIOS.
>>
>> [...]
>>
>> +#define X86_FEATURE_SME            ( 7*32+10) /* AMD Secure Memory Encryption */
>
> Given that this feature is available only in long mode, this should be
> added to disabled-features.h as disabled for 32-bit builds.

I can add that. If the series needs a re-spin then I'll include this
change in the series, otherwise I can send a follow-on patch to handle
the feature for 32-bit builds if that works.

>> [...]
>>
>> +       /*
>> +        * BIOS support is required for SME. If BIOS has not enabled SME
>> +        * then don't advertise the feature (set in scattered.c)
>> +        */
>> +       if (cpu_has(c, X86_FEATURE_SME)) {
>> +               u64 msr;
>> +
>> +               /* Check if SME is enabled */
>> +               rdmsrl(MSR_K8_SYSCFG, msr);
>> +               if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
>> +                       clear_cpu_cap(c, X86_FEATURE_SME);
>> +       }
>
> This should be conditional on CONFIG_X86_64.

If I make the scattered feature support conditional on CONFIG_X86_64
(based on comment below) then cpu_has() will always be false unless
CONFIG_X86_64 is enabled. So this won't need to be wrapped by the
#ifdef.

>> [...]
>>
>> +       { X86_FEATURE_SME,              CPUID_EAX,  0, 0x8000001f, 0 },
>
> This should also be conditional. We don't want to set this feature on
> 32-bit, even if the processor has support.
Can do. See comment above about re-spin vs. follow-on patch.

Thanks,
Tom

>>         { 0, 0, 0, 0, 0 }
>> };
>
> --
> Brian Gerst
Re: [PATCH v9 00/38] x86: Secure Memory Encryption (AMD)
On 7/8/2017 4:24 AM, Ingo Molnar wrote:
> * Tom Lendacky wrote:
>
>> This patch series provides support for AMD's new Secure Memory
>> Encryption (SME) feature.
>
> I'm wondering, what's the typical performance hit to DRAM access
> latency when SME is enabled?

It's about an extra 10 cycles of DRAM latency when performing an
encryption or decryption operation.

> On that same note, if the performance hit is noticeable I'd expect SME
> to not be enabled in native kernels typically - but still it looks
> like a useful hardware

In some internal testing we've seen about 1.5% or less reduction in
performance. Of course it all depends on the workload: the number of
memory accesses, cache friendliness, etc.

> feature. Since it's controlled at the page table level, have you
> considered allowing SME-activated vmas via mmap(), even on kernels
> that are otherwise not using encrypted DRAM?

That is definitely something to consider as an additional SME-related
feature and something I can look into after this.

Thanks,
Tom

> One would think that putting encryption keys into such encrypted RAM
> regions would generally improve robustness against various physical
> space attacks that want to extract keys but don't have full control of
> the CPU.
>
> Thanks,
>
> 	Ingo
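Ingo's "controlled at the page table level" refers to the encryption (C) bit in each page-table entry: boot code reads the bit position from CPUID 0x8000001f and carries it in a mask, and marking a mapping encrypted is then just OR-ing that mask into the PTE. Conceptually, with names along the lines of what the series uses (a sketch, not the exact patch contents):

	/* sme_me_mask holds the C-bit, e.g. 1UL << 47 on first-generation
	 * SME parts; it is zero when SME is not active, so these helpers
	 * are no-ops on unaffected systems. */
	extern unsigned long sme_me_mask;

	#define __sme_set(x)	((x) | sme_me_mask)	/* encrypted mapping */
	#define __sme_clr(x)	((x) & ~sme_me_mask)	/* decrypted mapping */

Because the choice is per-PTE, per-vma encryption of the kind Ingo suggests is mechanically possible; the open question in the thread is only the mmap() policy around it.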
Re: [makedumpfile PATCH] Allow PFN_EXCLUDED to be tunable via command line option --exclude-threshold
On 07/07/2017 12:53 PM, Eric DeVolder wrote:

Hi Atsushi, please see below.
eric

On 07/07/2017 04:09 AM, Atsushi Kumagai wrote:
>> The PFN_EXCLUDED value is used to control at which point a run of
>> zeros in the bitmap (zeros denote excluded pages) is large enough to
>> warrant truncating the current output segment and to create a new
>> output segment (containing non-excluded pages), in an ELF dump. If the
>> run is smaller than PFN_EXCLUDED, then those excluded pages are still
>> output in the ELF dump, for the current output segment.
>>
>> By using smaller values of PFN_EXCLUDED, the resulting dump file size
>> can be made smaller by actually removing more excluded pages from the
>> resulting dump file.
>>
>> This patch adds the command line option --exclude-threshold= to
>> indicate the threshold. The default is 256, the legacy value of
>> PFN_EXCLUDED. The smallest value permitted is 1.
>>
>> Using an existing vmcore, this was tested by the following:
>>
>>  % makedumpfile -E -d31 --exclude-threshold=256 -x vmlinux vmcore newvmcore256
>>  % makedumpfile -E -d31 --exclude-threshold=4 -x vmlinux vmcore newvmcore4
>>
>> I utilize -d31 in order to exclude as many page types as possible,
>> resulting in [significantly] smaller file sizes than the original
>> vmcore.
>>
>>  -rwxrwx--- 1 edevolde edevolde 4034564096 Jun 27 10:24 vmcore
>>  -rw------- 1 edevolde edevolde  119808156 Jul  6 13:01 newvmcore256
>>  -rw------- 1 edevolde edevolde  100811276 Jul  6 13:08 newvmcore4
>>
>> The use of smaller values of PFN_EXCLUDED increases the number of
>> output segments (the 'Number of program headers' in the readelf
>> output) in the ELF dump file.
>
> How will you tune the value ?
> I'm not sure what is the benefit of the tunable PFN_EXCLUDED.
> If there is no regression caused by too many PT_LOAD entries,
> I think we can decide a concrete PFN_EXCLUDED.

Allow me to note two things prior to addressing the question.

Note that the value for PFN_EXCLUDED really should be in the range:

  1 <= PFN_EXCLUDED <= NUM_PAGES(largest segment)

Values larger than NUM_PAGES(largest segment) behave the same as
NUM_PAGES(largest segment) and simply prevent makedumpfile from ever
omitting excluded pages from the dump file.

Also note that the ELF header allows for a 16-bit e_phnum value for the
number of segments in the dump file. As it stands today, I doubt that
anybody has come close to reaching 65535 segments, but the combination
of larger and larger memories and the work we (Oracle) are doing to
further enhance the capabilities of makedumpfile leads me to believe we
will start to challenge this 65535 number. The ability to tune
PFN_EXCLUDED allows one to minimize file size while still staying
within ELF boundaries.

There are two ways in which having PFN_EXCLUDED as a tunable parameter
benefits the user.

The first benefit is that, when making PFN_EXCLUDED smaller,
makedumpfile has more opportunities to NOT write excluded pages to the
resulting dump file, thus obtaining a smaller overall dump file size.
Since a PT_LOAD header is smaller than a page, the penalty of more
segments will always be outweighed by a smaller file size. (In the
example I cite, the dump file was 18MB smaller with a PFN_EXCLUDED
value of 4 than with the default 256, in spite of the number of
segments increasing from 6 to 244.)

The second benefit is that, when making PFN_EXCLUDED larger, it allows
makedumpfile to continue to generate valid ELF dump files in the
presence of larger and larger memory systems.
Generally speaking, the goal is to minimize the size of dump files via
the exclusion of uninteresting pages (ie zero, free, etc), especially
as the size of memory continues to grow. As memory grows, there are
more and more of these uninteresting pages, and more opportunities for
makedumpfile to omit them (even at the current PFN_EXCLUDED value of
256). Furthermore, we are working on additional page exclusion
strategies that will create more and more opportunities for
makedumpfile to omit these pages from the dump file. And as
makedumpfile omits more and more pages from the dump file, the number
of segments needed increases.

By enabling a user to tune the value of PFN_EXCLUDED, we provide an
additional mechanism to balance the size of the ELF dump file with
respect to the size of memory.

It occurred to me that offering the option "--exclude-threshold=auto",
whereby a binary search on the second bitmap in the function
get_loads_dumpfile_cyclic() determines the optimum value of
PFN_EXCLUDED (optimum here meaning the smallest possible value while
still staying within 65535 segments, which would yield the smallest
possible dump file size for the given constraints), would be an
excellent feature to have?

eric

> The penalty for splitting a PT_LOAD is the size of a PT_LOAD header,
> so the best PFN_EXCLUDED is the minimum number which meets the
> condition below:
>
>   (size of PT_LOAD header) < (PFN_EXCLUDED * PAGE_SIZE)

I admit I don't quite understand, would you mind working through an
example or two? It seems t
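Two things may help make the threshold arithmetic concrete. On the break-even side: an Elf64_Phdr is 56 bytes against a 4096-byte page, so by the inequality above the minimum threshold of 1 already pays for the extra header; the practical ceiling is the 16-bit e_phnum limit, not raw file size. And the proposed "auto" mode could be bisected roughly as follows — a self-contained sketch under assumed names, where count_segments() stands in for a scan of makedumpfile's second bitmap in get_loads_dumpfile_cyclic() and the toy run lengths replace real bitmap data:

	#include <stddef.h>
	#include <stdint.h>
	#include <stdio.h>

	#define ELF_PHNUM_MAX 65535

	/* Toy model: lengths (in pages) of the excluded-page runs in a
	 * dump.  In makedumpfile these come from the second bitmap. */
	static const uint64_t runs[] = { 3, 900, 12, 70000, 256, 8, 4096 };

	/* Each run >= threshold splits a segment, adding one PT_LOAD. */
	static unsigned int count_segments(uint64_t threshold)
	{
		unsigned int segments = 1;

		for (size_t i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
			if (runs[i] >= threshold)
				segments++;
		return segments;
	}

	/* Binary-search the smallest threshold whose PT_LOAD count still
	 * fits e_phnum; smaller thresholds drop more pages, so the
	 * smallest feasible one yields the smallest file. */
	static uint64_t auto_exclude_threshold(uint64_t max_threshold)
	{
		uint64_t lo = 1, hi = max_threshold, best = max_threshold;

		while (lo <= hi) {
			uint64_t mid = lo + (hi - lo) / 2;

			if (count_segments(mid) <= ELF_PHNUM_MAX) {
				best = mid;	/* feasible: try smaller */
				hi = mid - 1;
			} else {
				lo = mid + 1;
			}
		}
		return best;
	}

	int main(void)
	{
		printf("auto threshold: %llu\n",
		       (unsigned long long)auto_exclude_threshold(1 << 20));
		return 0;
	}

The bisection works because the segment count is monotonically non-increasing in the threshold, so "fits within 65535 segments" flips exactly once as the threshold grows.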