Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page

2017-03-17 Thread Tom Lendacky

On 3/17/2017 9:32 AM, Tom Lendacky wrote:

On 3/16/2017 3:04 PM, Tom Lendacky wrote:

On 3/7/2017 8:59 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base
protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky 
---



diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem
*__ioremap_caller(resource_size_t phys_addr,
pcm = new_pcm;
}

+   /*
+* If the page being mapped is in memory and SEV is active then
+* make sure the memory encryption attribute is enabled in the
+* resulting mapping.
+*/
prot = PAGE_KERNEL_IO;
+   if (sev_active() && page_is_mem(pfn))


Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...


diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);

+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long
nr_pages)
+{
+struct resource res;
+unsigned long pfn, end_pfn;
+u64 orig_end;
+int ret = -1;
+
+res.start = (u64) start_pfn << PAGE_SHIFT;
+res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+orig_end = res.end;
+while ((res.start < res.end) &&
+(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+end_pfn = (res.end + 1) >> PAGE_SHIFT;
+if (end_pfn > pfn)
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+if (ret)
+break;
+res.start = res.end + 1;
+res.end = orig_end;
+}
+return ret;
+}


So the relevant difference between this one and walk_system_ram_range()
is this:

-ret = (*func)(pfn, end_pfn - pfn, arg);
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...


I'll take a look at what it takes to consolidate these with a pre-patch.
Then I'll add the new support.


It looks pretty straight forward to combine walk_iomem_res_desc() and
walk_system_ram_res(). The walk_system_ram_range() function would fit
easily into this, also, except for the fact that the callback function
takes unsigned longs vs the u64s of the other functions.  Is it worth
modifying all of the callers of walk_system_ram_range() (which are only
about 8 locations) to change the callback functions to accept u64s in
order to consolidate the walk_system_ram_range() function, too?


The more I dig, the more I find that the changes keep expanding. I'll
leave walk_system_ram_range() out of the consolidation for now.

Thanks,
Tom



Thanks,
Tom



Thanks,
Tom




___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page

2017-03-17 Thread Tom Lendacky

On 3/16/2017 3:04 PM, Tom Lendacky wrote:

On 3/7/2017 8:59 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky 
---



diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem
*__ioremap_caller(resource_size_t phys_addr,
pcm = new_pcm;
}

+   /*
+* If the page being mapped is in memory and SEV is active then
+* make sure the memory encryption attribute is enabled in the
+* resulting mapping.
+*/
prot = PAGE_KERNEL_IO;
+   if (sev_active() && page_is_mem(pfn))


Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...


diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);

+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long
nr_pages)
+{
+struct resource res;
+unsigned long pfn, end_pfn;
+u64 orig_end;
+int ret = -1;
+
+res.start = (u64) start_pfn << PAGE_SHIFT;
+res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+orig_end = res.end;
+while ((res.start < res.end) &&
+(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+end_pfn = (res.end + 1) >> PAGE_SHIFT;
+if (end_pfn > pfn)
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+if (ret)
+break;
+res.start = res.end + 1;
+res.end = orig_end;
+}
+return ret;
+}


So the relevant difference between this one and walk_system_ram_range()
is this:

-ret = (*func)(pfn, end_pfn - pfn, arg);
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...


I'll take a look at what it takes to consolidate these with a pre-patch.
Then I'll add the new support.


It looks pretty straight forward to combine walk_iomem_res_desc() and
walk_system_ram_res(). The walk_system_ram_range() function would fit
easily into this, also, except for the fact that the callback function
takes unsigned longs vs the u64s of the other functions.  Is it worth
modifying all of the callers of walk_system_ram_range() (which are only
about 8 locations) to change the callback functions to accept u64s in
order to consolidate the walk_system_ram_range() function, too?

Thanks,
Tom



Thanks,
Tom




___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page

2017-03-16 Thread Tom Lendacky

On 3/7/2017 8:59 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky 
---



diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t 
phys_addr,
pcm = new_pcm;
}

+   /*
+* If the page being mapped is in memory and SEV is active then
+* make sure the memory encryption attribute is enabled in the
+* resulting mapping.
+*/
prot = PAGE_KERNEL_IO;
+   if (sev_active() && page_is_mem(pfn))


Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...


diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);

+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+   struct resource res;
+   unsigned long pfn, end_pfn;
+   u64 orig_end;
+   int ret = -1;
+
+   res.start = (u64) start_pfn << PAGE_SHIFT;
+   res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+   res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+   orig_end = res.end;
+   while ((res.start < res.end) &&
+   (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+   pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+   end_pfn = (res.end + 1) >> PAGE_SHIFT;
+   if (end_pfn > pfn)
+   ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+   if (ret)
+   break;
+   res.start = res.end + 1;
+   res.end = orig_end;
+   }
+   return ret;
+}


So the relevant difference between this one and walk_system_ram_range()
is this:

-   ret = (*func)(pfn, end_pfn - pfn, arg);
+   ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...


I'll take a look at what it takes to consolidate these with a pre-patch. 
Then I'll add the new support.


Thanks,
Tom




___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV

2017-03-16 Thread Tom Lendacky

On 3/7/2017 5:09 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:12:59AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

When Secure Encrypted Virtualization (SEV) is active, BOOT data (such as
EFI related data, setup data) is encrypted and needs to be accessed as
such when mapped. Update the architecture override in early_memremap to
keep the encryption attribute when mapping this data.


This could also explain why persistent memory needs to be accessed
decrypted with SEV.


I'll add some comments about why persistent memory needs to be accessed
decrypted (because the encryption key changes across reboots) for both
SME and SEV.



In general, what the difference in that aspect is in respect to SME. And
I'd write that in the comment over the function. And not say "E820 areas
are checked in making this determination." because that is visible but
say *why* we need to check those ranges and determine access depending
on their type.


Will do.

Thanks,
Tom




___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active

2017-03-16 Thread Tom Lendacky

On 3/16/2017 10:09 AM, Borislav Petkov wrote:

On Thu, Mar 16, 2017 at 09:28:58AM -0500, Tom Lendacky wrote:

Because there are differences between how SME and SEV behave
(instruction fetches are always decrypted under SEV, DMA to an
encrypted location is not supported under SEV, etc.) we need to
determine which mode we are in so that things can be setup properly
during boot. For example, if SEV is active the kernel will already
be encrypted and so we don't perform that step or the trampoline area
for bringing up an AP must be decrypted for SME but encrypted for SEV.


So with SEV enabled, it seems to me a guest doesn't know anything about
encryption and can run as if SME is disabled. So sme_active() will be
false. And then the kernel can bypass all that code dealing with SME.

So a guest should simply run like on baremetal with no SME, IMHO.



Not quite. The guest still needs to understand about the encryption mask
so that it can protect memory by setting the encryption mask in the
pagetable entries.  It can also decide when to share memory with the
hypervisor by not setting the encryption mask in the pagetable entries.


But then there's that part: "instruction fetches are always decrypted
under SEV". What does that mean exactly? And how much of that code can


"Instruction fetches are always decrypted under SEV" means that,
regardless of how a virtual address is mapped, encrypted or decrypted,
if an instruction fetch is performed by the CPU from that address it
will always be decrypted. This is to prevent the hypervisor from
injecting executable code into the guest since it would have to be
valid encrypted instructions.


be reused so that

* SME on baremetal
* SEV on guest

use the same logic?


There are many areas that use the same logic, but there are certain
situations where we need to check between SME vs SEV (e.g. DMA operation
setup or decrypting the trampoline area) and act accordingly.

Thanks,
Tom



Having the larger SEV preparation part on the kvm host side is perfectly
fine. But I'd like to keep kernel initialization paths clean.

Thanks.


___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active

2017-03-16 Thread Tom Lendacky

On 3/16/2017 5:16 AM, Borislav Petkov wrote:

On Fri, Mar 10, 2017 at 10:35:30AM -0600, Brijesh Singh wrote:

We could update this patch to use the below logic:

 * CPUID(0) - Check for AuthenticAMD
 * CPID(1) - Check if under hypervisor
 * CPUID(0x8000) - Check for highest supported leaf
 * CPUID(0x801F).EAX - Check for SME and SEV support
 * rdmsr (MSR_K8_SYSCFG)[MemEncryptionModeEnc] - Check if SMEE is set


Actually, it is still not clear to me *why* we need to do anything
special wrt SEV in the guest.

Lemme clarify: why can't the guest boot just like a normal Linux on
baremetal and use the SME(!) detection code to set sme_enable and so
on? IOW, I'd like to avoid all those checks whether we're running under
hypervisor and handle all that like we're running on baremetal.


Because there are differences between how SME and SEV behave
(instruction fetches are always decrypted under SEV, DMA to an
encrypted location is not supported under SEV, etc.) we need to
determine which mode we are in so that things can be setup properly
during boot. For example, if SEV is active the kernel will already
be encrypted and so we don't perform that step or the trampoline area
for bringing up an AP must be decrypted for SME but encrypted for SEV.
The hypervisor check will provide that ability to determine how we
handle things.

Thanks,
Tom




___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data

2017-03-13 Thread Tom Lendacky

On 3/6/2017 6:03 PM, Bjorn Helgaas wrote:

On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote:

On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:

On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

The use of ioremap will force the setup data to be mapped decrypted even
though setup data is encrypted.  Switch to using memremap which will be
able to perform the proper mapping.


How should callers decide whether to use ioremap() or memremap()?

memremap() existed before SME and SEV, and this code is used even if
SME and SEV aren't supported, so the rationale for this change should
not need the decryption argument.


When SME or SEV is active an ioremap() will remove the encryption bit
from the pagetable entry when it is mapped.  This allows MMIO, which
doesn't support SME/SEV, to be performed successfully.  So my take is
that ioremap() should be used for MMIO and memremap() for pages in RAM.


OK, thanks.  The commit message should say something like "this is
RAM, not MMIO, so we should map it with memremap(), not ioremap()".
That's the part that determines whether the change is correct.

You can mention the encryption part, too, but it's definitely
secondary because the change has to make sense on its own, without
SME/SEV.



Ok, that makes sense, will do.


The following commits (from https://github.com/codomania/tip/branches)
all do basically the same thing so the changelogs (and summaries)
should all be basically the same:

  cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data
  91acb68b8333 x86/pci: Use memremap when walking setup data
  4f687503e23f x86: Access the setup data through sysfs decrypted
  e90246b8c229 x86: Access the setup data through debugfs decrypted

I would collect them all together and move them to the beginning of
your series, since they don't depend on anything else.


I'll do that.



Also, change "x86/pci: " to "x86/PCI" so it matches the previous
convention.


Will do.

Thanks,
Tom




Signed-off-by: Tom Lendacky 


Acked-by: Bjorn Helgaas 


---
arch/x86/pci/common.c |4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index a4fdfa7..0b06670 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)

pa_data = boot_params.hdr.setup_data;
while (pa_data) {
-   data = ioremap(pa_data, sizeof(*rom));
+   data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);


I can't quite connect the dots here.  ioremap() on x86 would do
ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
which is ioremap_cache().  Is making a cacheable mapping the important
difference?


The memremap(MEMREMAP_WB) will actually check to see if it can perform
a __va(pa_data) in try_ram_remap() and then fallback to the
arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
that is the difference.

Thanks,
Tom




if (!data)
return -ENOMEM;

@@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
}
}
pa_data = data->next;
-   iounmap(data);
+   memunmap(data);
}
set_dma_domain_ops(dev);
set_dev_domain_options(dev);


___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data

2017-03-03 Thread Tom Lendacky

On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:

On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

The use of ioremap will force the setup data to be mapped decrypted even
though setup data is encrypted.  Switch to using memremap which will be
able to perform the proper mapping.


How should callers decide whether to use ioremap() or memremap()?

memremap() existed before SME and SEV, and this code is used even if
SME and SEV aren't supported, so the rationale for this change should
not need the decryption argument.


When SME or SEV is active an ioremap() will remove the encryption bit
from the pagetable entry when it is mapped.  This allows MMIO, which
doesn't support SME/SEV, to be performed successfully.  So my take is
that ioremap() should be used for MMIO and memremap() for pages in RAM.




Signed-off-by: Tom Lendacky 
---
 arch/x86/pci/common.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index a4fdfa7..0b06670 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)

pa_data = boot_params.hdr.setup_data;
while (pa_data) {
-   data = ioremap(pa_data, sizeof(*rom));
+   data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);


I can't quite connect the dots here.  ioremap() on x86 would do
ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
which is ioremap_cache().  Is making a cacheable mapping the important
difference?


The memremap(MEMREMAP_WB) will actually check to see if it can perform
a __va(pa_data) in try_ram_remap() and then fallback to the
arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
that is the difference.

Thanks,
Tom




if (!data)
return -ENOMEM;

@@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
}
}
pa_data = data->next;
-   iounmap(data);
+   memunmap(data);
}
set_dma_domain_ops(dev);
set_dev_domain_options(dev);


___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active

2016-09-22 Thread Tom Lendacky
On 09/22/2016 02:11 PM, Borislav Petkov wrote:
> On Thu, Sep 22, 2016 at 02:04:27PM -0500, Tom Lendacky wrote:
>> That's not what I mean here.  If the BIOS sets the SMEE bit in the
>> SYS_CFG msr then, even if the encryption bit is never used, there is
>> still a reduction in physical address space.
> 
> I thought that reduction is the reservation of bits for the SME mask.
> 
> What other reduction is there?

There is a reduction in physical address space for the SME mask and the
bits used to aid in identifying the ASID associated with the memory
request. This allows for the memory controller to determine the key to
be used for the encryption operation (host/hypervisor key vs. an SEV
guest key).

Thanks,
Tom

> 
>> Transparent SME (TSME) will be a BIOS option that will result in the
>> memory controller performing encryption no matter what. In this case
>> all data will be encrypted without a reduction in physical address
>> space.
> 
> Now I'm confused: aren't we reducing the address space with the SME
> mask?
> 
> Or what reduction do you mean?
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active

2016-09-22 Thread Tom Lendacky
On 09/22/2016 09:45 AM, Paolo Bonzini wrote:
> 
> 
> On 22/09/2016 16:35, Borislav Petkov wrote:
 @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long 
 pa_memmap, unsigned num_pages)
efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
pgd = efi_pgd;
  
 +  flags = _PAGE_NX | _PAGE_RW;
 +  if (sev_active)
 +  flags |= _PAGE_ENC;
>> So this is confusing me. There's this patch which says EFI data is
>> accessed in the clear:
>>
>> https://lkml.kernel.org/r/2016083738.29880.6909.st...@tlendack-t1.amdoffice.net
>>
>> but now here it is encrypted when SEV is enabled.
>>
>> Do you mean, it is encrypted here because we're in the guest kernel?
> 
> I suspect this patch is untested, and also wrong. :)

Yes, it is untested but not sure that it is wrong...  It all depends on
how we add SEV support to the guest UEFI BIOS.  My take would be to have
the EFI data and ACPI tables encrypted.

> 
> The main difference between the SME and SEV encryption, from the point
> of view of the kernel, is that real-mode always writes unencrypted in
> SME and always writes encrypted in SEV.  But UEFI can run in 64-bit mode
> and learn about the C bit, so EFI boot data should be unprotected in SEV
> guests.
> 
> Because the firmware volume is written to high memory in encrypted form,
> and because the PEI phase runs in 32-bit mode, the firmware code will be
> encrypted; on the other hand, data that is placed in low memory for the
> kernel can be unencrypted, thus limiting differences between SME and SEV.

I like the idea of limiting the differences but it would leave the EFI
data and ACPI tables exposed and able to be manipulated.

> 
>   Important: I don't know what you guys are doing for SEV and
>   Windows guests, but if you are doing something I would really
>   appreciate doing things in the open.  If Linux and Windows end
>   up doing different things with EFI boot data, ACPI tables, etc.
>   it will be a huge pain.  On the other hand, if we can enjoy
>   being first, that's great.

We haven't discussed Windows guests under SEV yet, but as you say, we
need to do things the same.

Thanks,
Tom

> 
> In fact, I have suggested in the QEMU list that SEV guests should always
> use UEFI; because BIOS runs in real-mode or 32-bit non-paging protected
> mode, BIOS must always write encrypted data, which becomes painful in
> the kernel.
> 
> And regarding the above "important" point, all I know is that Microsoft
> for sure will be happy to restrict SEV to UEFI guests. :)
> 
> There are still some differences, mostly around the real mode trampoline
> executed by the kernel, but they should be much smaller.
> 
> Paolo
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active

2016-09-22 Thread Tom Lendacky
On 09/22/2016 09:59 AM, Borislav Petkov wrote:
> On Thu, Sep 22, 2016 at 04:45:51PM +0200, Paolo Bonzini wrote:
>> The main difference between the SME and SEV encryption, from the point
>> of view of the kernel, is that real-mode always writes unencrypted in
>> SME and always writes encrypted in SEV.  But UEFI can run in 64-bit mode
>> and learn about the C bit, so EFI boot data should be unprotected in SEV
>> guests.
> 
> Actually, it is different: you can start fully encrypted in SME, see:
> 
> https://lkml.kernel.org/r/2016083539.29880.96739.st...@tlendack-t1.amdoffice.net
> 
> The last paragraph alludes to a certain transparent mode where you're
> already encrypted and only certain pieces like EFI is not encrypted. I
> think the aim is to have the transparent mode be the default one, which
> makes most sense anyway.

There is a new Transparent SME mode that is now part of the overall
SME support, but I'm not alluding to that in the documentation at all.
In TSME mode, everything that goes through the memory controller would
be encrypted and that would include EFI data, etc.  TSME would be
enabled through a BIOS option, thus allowing legacy OSes to benefit.

> 
> The EFI regions are unencrypted for obvious reasons and you need to
> access them as such.
> 
>> Because the firmware volume is written to high memory in encrypted
>> form, and because the PEI phase runs in 32-bit mode, the firmware
>> code will be encrypted; on the other hand, data that is placed in low
>> memory for the kernel can be unencrypted, thus limiting differences
>> between SME and SEV.
> 
> When you run fully encrypted, you still need to access EFI tables in the
> clear. That's why I'm confused about this patch here.

This patch assumes that the EFI regions of a guest would be encrypted.

Thanks,
Tom

> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active

2016-09-22 Thread Tom Lendacky
On 09/22/2016 12:07 PM, Borislav Petkov wrote:
> On Thu, Sep 22, 2016 at 05:05:54PM +0200, Paolo Bonzini wrote:
>> Which paragraph?
> 
> "Linux relies on BIOS to set this bit if BIOS has determined that the
> reduction in the physical address space as a result of enabling memory
> encryption..."
> 
> Basically, you can enable SME in the BIOS and you're all set.

That's not what I mean here.  If the BIOS sets the SMEE bit in the
SYS_CFG msr then, even if the encryption bit is never used, there is
still a reduction in physical address space.

Transparent SME (TSME) will be a BIOS option that will result in the
memory controller performing encryption no matter what. In this case
all data will be encrypted without a reduction in physical address
space.

Thanks,
Tom

> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 09/28] x86/efi: Access EFI data as encrypted when SEV is active

2016-09-22 Thread Tom Lendacky
On 09/22/2016 09:35 AM, Borislav Petkov wrote:
> On Mon, Aug 22, 2016 at 07:25:25PM -0400, Brijesh Singh wrote:
>> From: Tom Lendacky 
>>
>> EFI data is encrypted when the kernel is run under SEV. Update the
>> page table references to be sure the EFI memory areas are accessed
>> encrypted.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/platform/efi/efi_64.c |   14 --
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
>> index 0871ea4..98363f3 100644
>> --- a/arch/x86/platform/efi/efi_64.c
>> +++ b/arch/x86/platform/efi/efi_64.c
>> @@ -213,7 +213,7 @@ void efi_sync_low_kernel_mappings(void)
>>  
>>  int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned 
>> num_pages)
>>  {
>> -unsigned long pfn, text;
>> +unsigned long pfn, text, flags;
>>  efi_memory_desc_t *md;
>>  struct page *page;
>>  unsigned npages;
>> @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long 
>> pa_memmap, unsigned num_pages)
>>  efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
>>  pgd = efi_pgd;
>>  
>> +flags = _PAGE_NX | _PAGE_RW;
>> +if (sev_active)
>> +flags |= _PAGE_ENC;
> 
> So this is confusing me. There's this patch which says EFI data is
> accessed in the clear:
> 
> https://lkml.kernel.org/r/2016083738.29880.6909.st...@tlendack-t1.amdoffice.net
> 
> but now here it is encrypted when SEV is enabled.
> 
> Do you mean, it is encrypted here because we're in the guest kernel?

Yes, the idea is that the SEV guest will be running encrypted from the
start, including the BIOS/UEFI, and so all of the EFI related data will
be encrypted.

Thanks,
Tom

> 
> Thanks.
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH v1 18/28] crypto: add AMD Platform Security Processor driver

2016-08-24 Thread Tom Lendacky


On 08/23/2016 02:14 AM, Herbert Xu wrote:
> On Mon, Aug 22, 2016 at 07:27:22PM -0400, Brijesh Singh wrote:
>> The driver to communicate with Secure Encrypted Virtualization (SEV)
>> firmware running within the AMD secure processor providing a secure key
>> management interface for SEV guests.
>>
>> Signed-off-by: Tom Lendacky 
>> Signed-off-by: Brijesh Singh 
> 
> This driver doesn't seem to hook into the Crypto API at all, is
> there any reason why it should be in drivers/crypto?

Yes, this needs to be cleaned up.  The PSP and the CCP share the same
PCI id, so this has to be integrated with the CCP. It could either
be moved into the drivers/crypto/ccp directory or both the psp and
ccp device specific support can be moved somewhere else leaving just
the ccp crypto API related files in drivers/crypto/ccp.

Thanks,
Tom

> 
> Thanks,
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel