Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
On 3/17/2017 9:32 AM, Tom Lendacky wrote: On 3/16/2017 3:04 PM, Tom Lendacky wrote: On 3/7/2017 8:59 AM, Borislav Petkov wrote: On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote: From: Tom Lendacky In order for memory pages to be properly mapped when SEV is active, we need to use the PAGE_KERNEL protection attribute as the base protection. This will insure that memory mapping of, e.g. ACPI tables, receives the proper mapping attributes. Signed-off-by: Tom Lendacky --- diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index c400ab5..481c999 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr, pcm = new_pcm; } + /* +* If the page being mapped is in memory and SEV is active then +* make sure the memory encryption attribute is enabled in the +* resulting mapping. +*/ prot = PAGE_KERNEL_IO; + if (sev_active() && page_is_mem(pfn)) Hmm, a resource tree walk per ioremap call. This could get expensive for ioremap-heavy workloads. __ioremap_caller() gets called here during boot 55 times so not a whole lot but I wouldn't be surprised if there were some nasty use cases which ioremap a lot. ... diff --git a/kernel/resource.c b/kernel/resource.c index 9b5f044..db56ba3 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn) } EXPORT_SYMBOL_GPL(page_is_ram); +/* + * This function returns true if the target memory is marked as + * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES). + */ +static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages) +{ +struct resource res; +unsigned long pfn, end_pfn; +u64 orig_end; +int ret = -1; + +res.start = (u64) start_pfn << PAGE_SHIFT; +res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1; +res.flags = IORESOURCE_MEM | IORESOURCE_BUSY; +orig_end = res.end; +while ((res.start < res.end) && +(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) { +pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT; +end_pfn = (res.end + 1) >> PAGE_SHIFT; +if (end_pfn > pfn) +ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; +if (ret) +break; +res.start = res.end + 1; +res.end = orig_end; +} +return ret; +} So the relevant difference between this one and walk_system_ram_range() is this: -ret = (*func)(pfn, end_pfn - pfn, arg); +ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; so it seems to me you can have your own *func() pointer which does that IORES_DESC_NONE comparison. And then you can define your own workhorse __walk_memory_range() which gets called by both walk_mem_range() and walk_system_ram_range() instead of almost duplicating them. And looking at walk_system_ram_res(), that one looks similar too except the pfn computation. But AFAICT the pfn/end_pfn things are computed from res.start and res.end so it looks to me like all those three functions are crying for unification... I'll take a look at what it takes to consolidate these with a pre-patch. Then I'll add the new support. It looks pretty straight forward to combine walk_iomem_res_desc() and walk_system_ram_res(). The walk_system_ram_range() function would fit easily into this, also, except for the fact that the callback function takes unsigned longs vs the u64s of the other functions. Is it worth modifying all of the callers of walk_system_ram_range() (which are only about 8 locations) to change the callback functions to accept u64s in order to consolidate the walk_system_ram_range() function, too? The more I dig, the more I find that the changes keep expanding. I'll leave walk_system_ram_range() out of the consolidation for now. Thanks, Tom Thanks, Tom Thanks, Tom ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
On 3/16/2017 3:04 PM, Tom Lendacky wrote: On 3/7/2017 8:59 AM, Borislav Petkov wrote: On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote: From: Tom Lendacky In order for memory pages to be properly mapped when SEV is active, we need to use the PAGE_KERNEL protection attribute as the base protection. This will insure that memory mapping of, e.g. ACPI tables, receives the proper mapping attributes. Signed-off-by: Tom Lendacky --- diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index c400ab5..481c999 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr, pcm = new_pcm; } + /* +* If the page being mapped is in memory and SEV is active then +* make sure the memory encryption attribute is enabled in the +* resulting mapping. +*/ prot = PAGE_KERNEL_IO; + if (sev_active() && page_is_mem(pfn)) Hmm, a resource tree walk per ioremap call. This could get expensive for ioremap-heavy workloads. __ioremap_caller() gets called here during boot 55 times so not a whole lot but I wouldn't be surprised if there were some nasty use cases which ioremap a lot. ... diff --git a/kernel/resource.c b/kernel/resource.c index 9b5f044..db56ba3 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn) } EXPORT_SYMBOL_GPL(page_is_ram); +/* + * This function returns true if the target memory is marked as + * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES). + */ +static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages) +{ +struct resource res; +unsigned long pfn, end_pfn; +u64 orig_end; +int ret = -1; + +res.start = (u64) start_pfn << PAGE_SHIFT; +res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1; +res.flags = IORESOURCE_MEM | IORESOURCE_BUSY; +orig_end = res.end; +while ((res.start < res.end) && +(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) { +pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT; +end_pfn = (res.end + 1) >> PAGE_SHIFT; +if (end_pfn > pfn) +ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; +if (ret) +break; +res.start = res.end + 1; +res.end = orig_end; +} +return ret; +} So the relevant difference between this one and walk_system_ram_range() is this: -ret = (*func)(pfn, end_pfn - pfn, arg); +ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; so it seems to me you can have your own *func() pointer which does that IORES_DESC_NONE comparison. And then you can define your own workhorse __walk_memory_range() which gets called by both walk_mem_range() and walk_system_ram_range() instead of almost duplicating them. And looking at walk_system_ram_res(), that one looks similar too except the pfn computation. But AFAICT the pfn/end_pfn things are computed from res.start and res.end so it looks to me like all those three functions are crying for unification... I'll take a look at what it takes to consolidate these with a pre-patch. Then I'll add the new support. It looks pretty straight forward to combine walk_iomem_res_desc() and walk_system_ram_res(). The walk_system_ram_range() function would fit easily into this, also, except for the fact that the callback function takes unsigned longs vs the u64s of the other functions. Is it worth modifying all of the callers of walk_system_ram_range() (which are only about 8 locations) to change the callback functions to accept u64s in order to consolidate the walk_system_ram_range() function, too? Thanks, Tom Thanks, Tom ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
On 3/7/2017 8:59 AM, Borislav Petkov wrote: On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote: From: Tom Lendacky In order for memory pages to be properly mapped when SEV is active, we need to use the PAGE_KERNEL protection attribute as the base protection. This will insure that memory mapping of, e.g. ACPI tables, receives the proper mapping attributes. Signed-off-by: Tom Lendacky --- diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index c400ab5..481c999 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr, pcm = new_pcm; } + /* +* If the page being mapped is in memory and SEV is active then +* make sure the memory encryption attribute is enabled in the +* resulting mapping. +*/ prot = PAGE_KERNEL_IO; + if (sev_active() && page_is_mem(pfn)) Hmm, a resource tree walk per ioremap call. This could get expensive for ioremap-heavy workloads. __ioremap_caller() gets called here during boot 55 times so not a whole lot but I wouldn't be surprised if there were some nasty use cases which ioremap a lot. ... diff --git a/kernel/resource.c b/kernel/resource.c index 9b5f044..db56ba3 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn) } EXPORT_SYMBOL_GPL(page_is_ram); +/* + * This function returns true if the target memory is marked as + * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES). + */ +static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages) +{ + struct resource res; + unsigned long pfn, end_pfn; + u64 orig_end; + int ret = -1; + + res.start = (u64) start_pfn << PAGE_SHIFT; + res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1; + res.flags = IORESOURCE_MEM | IORESOURCE_BUSY; + orig_end = res.end; + while ((res.start < res.end) && + (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) { + pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT; + end_pfn = (res.end + 1) >> PAGE_SHIFT; + if (end_pfn > pfn) + ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; + if (ret) + break; + res.start = res.end + 1; + res.end = orig_end; + } + return ret; +} So the relevant difference between this one and walk_system_ram_range() is this: - ret = (*func)(pfn, end_pfn - pfn, arg); + ret = (res.desc != IORES_DESC_NONE) ? 1 : 0; so it seems to me you can have your own *func() pointer which does that IORES_DESC_NONE comparison. And then you can define your own workhorse __walk_memory_range() which gets called by both walk_mem_range() and walk_system_ram_range() instead of almost duplicating them. And looking at walk_system_ram_res(), that one looks similar too except the pfn computation. But AFAICT the pfn/end_pfn things are computed from res.start and res.end so it looks to me like all those three functions are crying for unification... I'll take a look at what it takes to consolidate these with a pre-patch. Then I'll add the new support. Thanks, Tom ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV
On 3/7/2017 5:09 AM, Borislav Petkov wrote: On Thu, Mar 02, 2017 at 10:12:59AM -0500, Brijesh Singh wrote: From: Tom Lendacky When Secure Encrypted Virtualization (SEV) is active, BOOT data (such as EFI related data, setup data) is encrypted and needs to be accessed as such when mapped. Update the architecture override in early_memremap to keep the encryption attribute when mapping this data. This could also explain why persistent memory needs to be accessed decrypted with SEV. I'll add some comments about why persistent memory needs to be accessed decrypted (because the encryption key changes across reboots) for both SME and SEV. In general, what the difference in that aspect is in respect to SME. And I'd write that in the comment over the function. And not say "E820 areas are checked in making this determination." because that is visible but say *why* we need to check those ranges and determine access depending on their type. Will do. Thanks, Tom ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
On 3/16/2017 10:09 AM, Borislav Petkov wrote: On Thu, Mar 16, 2017 at 09:28:58AM -0500, Tom Lendacky wrote: Because there are differences between how SME and SEV behave (instruction fetches are always decrypted under SEV, DMA to an encrypted location is not supported under SEV, etc.) we need to determine which mode we are in so that things can be setup properly during boot. For example, if SEV is active the kernel will already be encrypted and so we don't perform that step or the trampoline area for bringing up an AP must be decrypted for SME but encrypted for SEV. So with SEV enabled, it seems to me a guest doesn't know anything about encryption and can run as if SME is disabled. So sme_active() will be false. And then the kernel can bypass all that code dealing with SME. So a guest should simply run like on baremetal with no SME, IMHO. Not quite. The guest still needs to understand about the encryption mask so that it can protect memory by setting the encryption mask in the pagetable entries. It can also decide when to share memory with the hypervisor by not setting the encryption mask in the pagetable entries. But then there's that part: "instruction fetches are always decrypted under SEV". What does that mean exactly? And how much of that code can "Instruction fetches are always decrypted under SEV" means that, regardless of how a virtual address is mapped, encrypted or decrypted, if an instruction fetch is performed by the CPU from that address it will always be decrypted. This is to prevent the hypervisor from injecting executable code into the guest since it would have to be valid encrypted instructions. be reused so that * SME on baremetal * SEV on guest use the same logic? There are many areas that use the same logic, but there are certain situations where we need to check between SME vs SEV (e.g. DMA operation setup or decrypting the trampoline area) and act accordingly. Thanks, Tom Having the larger SEV preparation part on the kvm host side is perfectly fine. But I'd like to keep kernel initialization paths clean. Thanks. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
On 3/16/2017 5:16 AM, Borislav Petkov wrote: On Fri, Mar 10, 2017 at 10:35:30AM -0600, Brijesh Singh wrote: We could update this patch to use the below logic: * CPUID(0) - Check for AuthenticAMD * CPID(1) - Check if under hypervisor * CPUID(0x8000) - Check for highest supported leaf * CPUID(0x801F).EAX - Check for SME and SEV support * rdmsr (MSR_K8_SYSCFG)[MemEncryptionModeEnc] - Check if SMEE is set Actually, it is still not clear to me *why* we need to do anything special wrt SEV in the guest. Lemme clarify: why can't the guest boot just like a normal Linux on baremetal and use the SME(!) detection code to set sme_enable and so on? IOW, I'd like to avoid all those checks whether we're running under hypervisor and handle all that like we're running on baremetal. Because there are differences between how SME and SEV behave (instruction fetches are always decrypted under SEV, DMA to an encrypted location is not supported under SEV, etc.) we need to determine which mode we are in so that things can be setup properly during boot. For example, if SEV is active the kernel will already be encrypted and so we don't perform that step or the trampoline area for bringing up an AP must be decrypted for SME but encrypted for SEV. The hypervisor check will provide that ability to determine how we handle things. Thanks, Tom ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
On 3/6/2017 6:03 PM, Bjorn Helgaas wrote: On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote: On 3/3/2017 2:42 PM, Bjorn Helgaas wrote: On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote: From: Tom Lendacky The use of ioremap will force the setup data to be mapped decrypted even though setup data is encrypted. Switch to using memremap which will be able to perform the proper mapping. How should callers decide whether to use ioremap() or memremap()? memremap() existed before SME and SEV, and this code is used even if SME and SEV aren't supported, so the rationale for this change should not need the decryption argument. When SME or SEV is active an ioremap() will remove the encryption bit from the pagetable entry when it is mapped. This allows MMIO, which doesn't support SME/SEV, to be performed successfully. So my take is that ioremap() should be used for MMIO and memremap() for pages in RAM. OK, thanks. The commit message should say something like "this is RAM, not MMIO, so we should map it with memremap(), not ioremap()". That's the part that determines whether the change is correct. You can mention the encryption part, too, but it's definitely secondary because the change has to make sense on its own, without SME/SEV. Ok, that makes sense, will do. The following commits (from https://github.com/codomania/tip/branches) all do basically the same thing so the changelogs (and summaries) should all be basically the same: cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data 91acb68b8333 x86/pci: Use memremap when walking setup data 4f687503e23f x86: Access the setup data through sysfs decrypted e90246b8c229 x86: Access the setup data through debugfs decrypted I would collect them all together and move them to the beginning of your series, since they don't depend on anything else. I'll do that. Also, change "x86/pci: " to "x86/PCI" so it matches the previous convention. Will do. Thanks, Tom Signed-off-by: Tom Lendacky Acked-by: Bjorn Helgaas --- arch/x86/pci/common.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index a4fdfa7..0b06670 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev) pa_data = boot_params.hdr.setup_data; while (pa_data) { - data = ioremap(pa_data, sizeof(*rom)); + data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB); I can't quite connect the dots here. ioremap() on x86 would do ioremap_nocache(). memremap(MEMREMAP_WB) would do arch_memremap_wb(), which is ioremap_cache(). Is making a cacheable mapping the important difference? The memremap(MEMREMAP_WB) will actually check to see if it can perform a __va(pa_data) in try_ram_remap() and then fallback to the arch_memremap_wb(). So it's actually the __va() vs the ioremap_cache() that is the difference. Thanks, Tom if (!data) return -ENOMEM; @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev) } } pa_data = data->next; - iounmap(data); + memunmap(data); } set_dma_domain_ops(dev); set_dev_domain_options(dev); ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
On 3/3/2017 2:42 PM, Bjorn Helgaas wrote: On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote: From: Tom Lendacky The use of ioremap will force the setup data to be mapped decrypted even though setup data is encrypted. Switch to using memremap which will be able to perform the proper mapping. How should callers decide whether to use ioremap() or memremap()? memremap() existed before SME and SEV, and this code is used even if SME and SEV aren't supported, so the rationale for this change should not need the decryption argument. When SME or SEV is active an ioremap() will remove the encryption bit from the pagetable entry when it is mapped. This allows MMIO, which doesn't support SME/SEV, to be performed successfully. So my take is that ioremap() should be used for MMIO and memremap() for pages in RAM. Signed-off-by: Tom Lendacky --- arch/x86/pci/common.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index a4fdfa7..0b06670 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev) pa_data = boot_params.hdr.setup_data; while (pa_data) { - data = ioremap(pa_data, sizeof(*rom)); + data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB); I can't quite connect the dots here. ioremap() on x86 would do ioremap_nocache(). memremap(MEMREMAP_WB) would do arch_memremap_wb(), which is ioremap_cache(). Is making a cacheable mapping the important difference? The memremap(MEMREMAP_WB) will actually check to see if it can perform a __va(pa_data) in try_ram_remap() and then fallback to the arch_memremap_wb(). So it's actually the __va() vs the ioremap_cache() that is the difference. Thanks, Tom if (!data) return -ENOMEM; @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev) } } pa_data = data->next; - iounmap(data); + memunmap(data); } set_dma_domain_ops(dev); set_dev_domain_options(dev); ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active
On 09/22/2016 02:11 PM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 02:04:27PM -0500, Tom Lendacky wrote: >> That's not what I mean here. If the BIOS sets the SMEE bit in the >> SYS_CFG msr then, even if the encryption bit is never used, there is >> still a reduction in physical address space. > > I thought that reduction is the reservation of bits for the SME mask. > > What other reduction is there? There is a reduction in physical address space for the SME mask and the bits used to aid in identifying the ASID associated with the memory request. This allows for the memory controller to determine the key to be used for the encryption operation (host/hypervisor key vs. an SEV guest key). Thanks, Tom > >> Transparent SME (TSME) will be a BIOS option that will result in the >> memory controller performing encryption no matter what. In this case >> all data will be encrypted without a reduction in physical address >> space. > > Now I'm confused: aren't we reducing the address space with the SME > mask? > > Or what reduction do you mean? > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active
On 09/22/2016 09:45 AM, Paolo Bonzini wrote: > > > On 22/09/2016 16:35, Borislav Petkov wrote: @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); pgd = efi_pgd; + flags = _PAGE_NX | _PAGE_RW; + if (sev_active) + flags |= _PAGE_ENC; >> So this is confusing me. There's this patch which says EFI data is >> accessed in the clear: >> >> https://lkml.kernel.org/r/2016083738.29880.6909.st...@tlendack-t1.amdoffice.net >> >> but now here it is encrypted when SEV is enabled. >> >> Do you mean, it is encrypted here because we're in the guest kernel? > > I suspect this patch is untested, and also wrong. :) Yes, it is untested but not sure that it is wrong... It all depends on how we add SEV support to the guest UEFI BIOS. My take would be to have the EFI data and ACPI tables encrypted. > > The main difference between the SME and SEV encryption, from the point > of view of the kernel, is that real-mode always writes unencrypted in > SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode > and learn about the C bit, so EFI boot data should be unprotected in SEV > guests. > > Because the firmware volume is written to high memory in encrypted form, > and because the PEI phase runs in 32-bit mode, the firmware code will be > encrypted; on the other hand, data that is placed in low memory for the > kernel can be unencrypted, thus limiting differences between SME and SEV. I like the idea of limiting the differences but it would leave the EFI data and ACPI tables exposed and able to be manipulated. > > Important: I don't know what you guys are doing for SEV and > Windows guests, but if you are doing something I would really > appreciate doing things in the open. If Linux and Windows end > up doing different things with EFI boot data, ACPI tables, etc. > it will be a huge pain. On the other hand, if we can enjoy > being first, that's great. We haven't discussed Windows guests under SEV yet, but as you say, we need to do things the same. Thanks, Tom > > In fact, I have suggested in the QEMU list that SEV guests should always > use UEFI; because BIOS runs in real-mode or 32-bit non-paging protected > mode, BIOS must always write encrypted data, which becomes painful in > the kernel. > > And regarding the above "important" point, all I know is that Microsoft > for sure will be happy to restrict SEV to UEFI guests. :) > > There are still some differences, mostly around the real mode trampoline > executed by the kernel, but they should be much smaller. > > Paolo > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active
On 09/22/2016 09:59 AM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 04:45:51PM +0200, Paolo Bonzini wrote: >> The main difference between the SME and SEV encryption, from the point >> of view of the kernel, is that real-mode always writes unencrypted in >> SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode >> and learn about the C bit, so EFI boot data should be unprotected in SEV >> guests. > > Actually, it is different: you can start fully encrypted in SME, see: > > https://lkml.kernel.org/r/2016083539.29880.96739.st...@tlendack-t1.amdoffice.net > > The last paragraph alludes to a certain transparent mode where you're > already encrypted and only certain pieces like EFI is not encrypted. I > think the aim is to have the transparent mode be the default one, which > makes most sense anyway. There is a new Transparent SME mode that is now part of the overall SME support, but I'm not alluding to that in the documentation at all. In TSME mode, everything that goes through the memory controller would be encrypted and that would include EFI data, etc. TSME would be enabled through a BIOS option, thus allowing legacy OSes to benefit. > > The EFI regions are unencrypted for obvious reasons and you need to > access them as such. > >> Because the firmware volume is written to high memory in encrypted >> form, and because the PEI phase runs in 32-bit mode, the firmware >> code will be encrypted; on the other hand, data that is placed in low >> memory for the kernel can be unencrypted, thus limiting differences >> between SME and SEV. > > When you run fully encrypted, you still need to access EFI tables in the > clear. That's why I'm confused about this patch here. This patch assumes that the EFI regions of a guest would be encrypted. Thanks, Tom > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active
On 09/22/2016 12:07 PM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 05:05:54PM +0200, Paolo Bonzini wrote: >> Which paragraph? > > "Linux relies on BIOS to set this bit if BIOS has determined that the > reduction in the physical address space as a result of enabling memory > encryption..." > > Basically, you can enable SME in the BIOS and you're all set. That's not what I mean here. If the BIOS sets the SMEE bit in the SYS_CFG msr then, even if the encryption bit is never used, there is still a reduction in physical address space. Transparent SME (TSME) will be a BIOS option that will result in the memory controller performing encryption no matter what. In this case all data will be encrypted without a reduction in physical address space. Thanks, Tom > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active
On 09/22/2016 09:35 AM, Borislav Petkov wrote: > On Mon, Aug 22, 2016 at 07:25:25PM -0400, Brijesh Singh wrote: >> From: Tom Lendacky >> >> EFI data is encrypted when the kernel is run under SEV. Update the >> page table references to be sure the EFI memory areas are accessed >> encrypted. >> >> Signed-off-by: Tom Lendacky >> --- >> arch/x86/platform/efi/efi_64.c | 14 -- >> 1 file changed, 12 insertions(+), 2 deletions(-) >> >> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c >> index 0871ea4..98363f3 100644 >> --- a/arch/x86/platform/efi/efi_64.c >> +++ b/arch/x86/platform/efi/efi_64.c >> @@ -213,7 +213,7 @@ void efi_sync_low_kernel_mappings(void) >> >> int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned >> num_pages) >> { >> -unsigned long pfn, text; >> +unsigned long pfn, text, flags; >> efi_memory_desc_t *md; >> struct page *page; >> unsigned npages; >> @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long >> pa_memmap, unsigned num_pages) >> efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); >> pgd = efi_pgd; >> >> +flags = _PAGE_NX | _PAGE_RW; >> +if (sev_active) >> +flags |= _PAGE_ENC; > > So this is confusing me. There's this patch which says EFI data is > accessed in the clear: > > https://lkml.kernel.org/r/2016083738.29880.6909.st...@tlendack-t1.amdoffice.net > > but now here it is encrypted when SEV is enabled. > > Do you mean, it is encrypted here because we're in the guest kernel? Yes, the idea is that the SEV guest will be running encrypted from the start, including the BIOS/UEFI, and so all of the EFI related data will be encrypted. Thanks, Tom > > Thanks. > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v1 18/28] crypto: add AMD Platform Security Processor driver
On 08/23/2016 02:14 AM, Herbert Xu wrote: > On Mon, Aug 22, 2016 at 07:27:22PM -0400, Brijesh Singh wrote: >> The driver to communicate with Secure Encrypted Virtualization (SEV) >> firmware running within the AMD secure processor providing a secure key >> management interface for SEV guests. >> >> Signed-off-by: Tom Lendacky >> Signed-off-by: Brijesh Singh > > This driver doesn't seem to hook into the Crypto API at all, is > there any reason why it should be in drivers/crypto? Yes, this needs to be cleaned up. The PSP and the CCP share the same PCI id, so this has to be integrated with the CCP. It could either be moved into the drivers/crypto/ccp directory or both the psp and ccp device specific support can be moved somewhere else leaving just the ccp crypto API related files in drivers/crypto/ccp. Thanks, Tom > > Thanks, > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel