[PATCH v10 00/38] x86: Secure Memory Encryption (AMD)

2017-07-17 Thread Tom Lendacky
This patch series provides support for AMD's new Secure Memory Encryption (SME)
feature.

SME can be used to mark individual pages of memory as encrypted through the
page tables. A page of memory that is marked encrypted will be automatically
decrypted when read from DRAM and will be automatically encrypted when
written to DRAM. Details on SME can found in the links below.

The SME feature is identified through a CPUID function and enabled through
the SYSCFG MSR. Once enabled, page table entries will determine how the
memory is accessed. If a page table entry has the memory encryption mask set,
then that memory will be accessed as encrypted memory. The memory encryption
mask (as well as other related information) is determined from settings
returned through the same CPUID function that identifies the presence of the
feature.

The approach that this patch series takes is to encrypt everything possible
starting early in the boot where the kernel is encrypted. Using the page
table macros the encryption mask can be incorporated into all page table
entries and page allocations. By updating the protection map, userspace
allocations are also marked encrypted. Certain data must be accounted for
as having been placed in memory before SME was enabled (EFI, initrd, etc.)
and accessed accordingly.

This patch series is a pre-cursor to another AMD processor feature called
Secure Encrypted Virtualization (SEV). The support for SEV will build upon
the SME support and will be submitted later. Details on SEV can be found
in the links below.

The following links provide additional detail:

AMD Memory Encryption whitepaper:
   
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
   http://support.amd.com/TechDocs/24593.pdf
   SME is section 7.10
   SEV is section 15.34

---

This patch series is based off of the master branch of tip:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

  Commit 5fcfb42b132c ("Merge branch 'linus'")

Source code is also available at https://github.com/codomania/tip/tree/sme-v10

Cc: 
Cc: Joerg Roedel 
Cc: 
Cc: 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 

Still to do:
- Kdump support, including using memremap() instead of ioremap_cache()

Changes since v9:
- Cleared SME feature capability for 32-bit builds
- Added a WARNing to the iounmap() path for ISA ranges to catch callers
  which did not use ioremap()

Changes since v8:
- Changed AMD IOMMU SME-related function name
- Updated the sme_encrypt_kernel() entry/exit code to address new warnings
  issued by objtool

Changes since v7:
- Fixed kbuild test robot failure related to pgprot_decrypted() macro
  usage for some non-x86 archs
- Moved calls to encrypt the kernel and retrieve the encryption mask
  from assembler (head_64.S) into C (head64.c)
- Removed use of phys_to_virt() in __ioremap_caller() when address is in
  the ISA range. Now regular ioremap() processing occurs.
- Two new, small patches:
  - Introduced a native_make_p4d() for use when CONFIG_PGTABLE_LEVELS is
not greater than 4
  - Introduced __nostackp GCC option to turn off stack protection on a
per function basis
- General code cleanup based on feedback

Changes since v6:
- Fixed the asm include file issue that caused build errors on other archs
- Rebased the CR3 register changes on top of Andy Lutomirski's patch
- Added a patch to clear the SME cpu feature if running as a PV guest under
  Xen
- Added a patch to obtain the AMD microcode level earlier in the boot
  instead of directly reading the MSR
- Refactor patch #8 ("x86/mm: Add support to enable SME in early boot
  processing") because the 5-level paging support moved the code into the
  new C-function __startup_64()
- Removed need to decrypt trampoline area in-place (set memory attributes
  before copying the trampoline code)
- General code cleanup based on feedback

Changes since v5:
- Added support for 5-level paging
- Added IOMMU support
- Created a generic asm/mem_encrypt.h in order to remove a bunch of
  #ifndef/#define entries
- Removed changes to the __va() macro and defined a function to return
  the true physical address in cr3
- Removed sysfs support as it was determined not to be needed
- General code cleanup based on feedback
- General cleanup of patch subjects and descriptions

Changes since v4:
- Re-worked mapping of setup data to not use a fixed list. Rather, check
  dynamically whether the requested early_memremap()/memremap() call
  needs to be mapped decrypted.
- Moved SME cpu feature into scattered features
- Moved some declarations into header files
- Cleared the encryption mask from the __PHYSICAL_MASK so that users
  of macros such as pmd_pfn_mask() don't have to worry/know about the
  encryption mask
- Updated some return types and values related to EFI and e820 functions
  so that an error could be returned
- During cpu shutdown, removed cache disabling and added a check for kexec

[PATCH v10 31/38] x86/mm, kexec: Allow kexec to be used with SME

2017-07-17 Thread Tom Lendacky
Provide support so that kexec can be used to boot a kernel when SME is
enabled.

Support is needed to allocate pages for kexec without encryption.  This
is needed in order to be able to reboot in the kernel in the same manner
as originally booted.

Additionally, when shutting down all of the CPUs we need to be sure to
flush the caches and then halt. This is needed when booting from a state
where SME was not active into a state where SME is active (or vice-versa).
Without these steps, it is possible for cache lines to exist for the same
physical location but tagged both with and without the encryption bit. This
can cause random memory corruption when caches are flushed depending on
which cacheline is written last.

Cc: 
Reviewed-by: Borislav Petkov 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/init.h  |  1 +
 arch/x86/include/asm/kexec.h |  8 
 arch/x86/include/asm/pgtable_types.h |  1 +
 arch/x86/kernel/machine_kexec_64.c   | 22 +-
 arch/x86/kernel/process.c| 17 +++--
 arch/x86/mm/ident_map.c  | 12 
 include/linux/kexec.h|  8 
 kernel/kexec_core.c  | 12 +++-
 8 files changed, 73 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 474eb8c..05c4aa0 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -7,6 +7,7 @@ struct x86_mapping_info {
unsigned long page_flag; /* page flag for PMD or PUD entry */
unsigned long offset;/* ident mapping offset */
bool direct_gbpages; /* PUD level 1GB page support */
+   unsigned long kernpg_flag;   /* kernel pagetable flag override */
 };
 
 int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 70ef205..e8183ac 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -207,6 +207,14 @@ struct kexec_entry64_regs {
uint64_t r15;
uint64_t rip;
 };
+
+extern int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages,
+  gfp_t gfp);
+#define arch_kexec_post_alloc_pages arch_kexec_post_alloc_pages
+
+extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages);
+#define arch_kexec_pre_free_pages arch_kexec_pre_free_pages
+
 #endif
 
 typedef void crash_vmclear_fn(void);
diff --git a/arch/x86/include/asm/pgtable_types.h 
b/arch/x86/include/asm/pgtable_types.h
index 32095af..830992f 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -213,6 +213,7 @@ enum page_cache_mode {
 #define PAGE_KERNEL__pgprot(__PAGE_KERNEL | _PAGE_ENC)
 #define PAGE_KERNEL_RO __pgprot(__PAGE_KERNEL_RO | _PAGE_ENC)
 #define PAGE_KERNEL_EXEC   __pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC)
+#define PAGE_KERNEL_EXEC_NOENC __pgprot(__PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RX __pgprot(__PAGE_KERNEL_RX | _PAGE_ENC)
 #define PAGE_KERNEL_NOCACHE__pgprot(__PAGE_KERNEL_NOCACHE | _PAGE_ENC)
 #define PAGE_KERNEL_LARGE  __pgprot(__PAGE_KERNEL_LARGE | _PAGE_ENC)
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index cb0a304..9cf8daa 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -87,7 +87,7 @@ static int init_transition_pgtable(struct kimage *image, 
pgd_t *pgd)
set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
}
pte = pte_offset_kernel(pmd, vaddr);
-   set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC));
+   set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC_NOENC));
return 0;
 err:
free_transition_pgtable(image);
@@ -115,6 +115,7 @@ static int init_pgtable(struct kimage *image, unsigned long 
start_pgtable)
.alloc_pgt_page = alloc_pgt_page,
.context= image,
.page_flag  = __PAGE_KERNEL_LARGE_EXEC,
+   .kernpg_flag= _KERNPG_TABLE_NOENC,
};
unsigned long mstart, mend;
pgd_t *level4p;
@@ -602,3 +603,22 @@ void arch_kexec_unprotect_crashkres(void)
 {
kexec_mark_crashkres(false);
 }
+
+int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
+{
+   /*
+* If SME is active we need to be sure that kexec pages are
+* not encrypted because when we boot to the new kernel the
+* pages won't be accessed encrypted (initially).
+*/
+   return set_memory_decrypted((unsigned long)vaddr, pages);
+}
+
+void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
+{
+   /*
+* If SME is active we need to reset the pages back to being
+* an encrypted mapping before freeing them.
+*/
+   set_memory_encrypted((unsigned long)vaddr, pages);
+}