Re: [PATCH v2 2/2] x86/mm/sme: fix the kdump kernel breakage on SME system when CONFIG_IMA_KEXEC=y

2024-09-09 Thread Tom Lendacky
On 8/29/24 05:40, Baoquan He wrote:
> Recently, it was reported that the kdump kernel is broken during bootup on
> SME systems when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
> can be traced back to commit b69a2afd5afc ("x86/kexec: Carry forward
> IMA measurement log on kexec"). It is just that nobody ever tested it on an
> SME system with CONFIG_IMA_KEXEC enabled.
> 
> --
>  ima: No TPM chip found, activating TPM-bypass!
>  Loading compiled-in module X.509 certificates
>  Loaded X.509 cert 'Build time autogenerated kernel key: 
> 18ae0bc7e79b64700122bb1d6a904b070fef2656'
>  ima: Allocated hash algorithm: sha256
>  Oops: general protection fault, probably for non-canonical address 
> 0xcfacfdfe6660003e:  [#1] PREEMPT SMP NOPTI
>  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
>  Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
>  RIP: 0010:ima_restore_measurement_list+0xdc/0x420
>  Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 
> 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 
> f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
>  RSP: 0018:c9053c80 EFLAGS: 00010286
>  RAX:  RBX: c9053d03 RCX: 
>  RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
>  RBP: c9053d80 R08:  R09: 82de1a88
>  R10: c9053da0 R11: 0003 R12: 01a4
>  R13: c9053df0 R14:  R15: 
>  FS:  () GS:88804020() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 7f2c744050e8 CR3: 80004110e000 CR4: 003506b0
>  Call Trace:
>   
>   ? show_trace_log_lvl+0x1b0/0x2f0
>   ? show_trace_log_lvl+0x1b0/0x2f0
>   ? ima_load_kexec_buffer+0x6e/0xf0
>   ? __die_body.cold+0x8/0x12
>   ? die_addr+0x3c/0x60
>   ? exc_general_protection+0x178/0x410
>   ? asm_exc_general_protection+0x26/0x30
>   ? ima_restore_measurement_list+0xdc/0x420
>   ? vprintk_emit+0x1f0/0x270
>   ? ima_load_kexec_buffer+0x6e/0xf0
>   ima_load_kexec_buffer+0x6e/0xf0
>   ima_init+0x52/0xb0
>   ? __pfx_init_ima+0x10/0x10
>   init_ima+0x26/0xc0
>   ? __pfx_init_ima+0x10/0x10
>   do_one_initcall+0x5b/0x300
>   do_initcalls+0xdf/0x100
>   ? __pfx_kernel_init+0x10/0x10
>   kernel_init_freeable+0x147/0x1a0
>   kernel_init+0x1a/0x140
>   ret_from_fork+0x34/0x50
>   ? __pfx_kernel_init+0x10/0x10
>   ret_from_fork_asm+0x1a/0x30
>   
>  Modules linked in:
>  ---[ end trace  ]---
>  RIP: 0010:ima_restore_measurement_list+0xdc/0x420
>  Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 
> 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 
> f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
>  RSP: 0018:c9053c80 EFLAGS: 00010286
>  RAX:  RBX: c9053d03 RCX: 
>  RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
>  RBP: c9053d80 R08:  R09: 82de1a88
>  R10: c9053da0 R11: 0003 R12: 01a4
>  R13: c9053df0 R14:  R15: 
>  FS:  () GS:88804020() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 7f2c744050e8 CR3: 80004110e000 CR4: 003506b0
>  Kernel panic - not syncing: Fatal exception
>  Kernel Offset: disabled
>  Rebooting in 10 seconds..
> 
> Debug printing shows that the stored address and size of the ima_kexec buffer
> are not decrypted correctly:
>  --
>  ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359
>  --
> 
> There are three pieces of setup_data info passed to the kexec/kdump kernel:
> SETUP_EFI, SETUP_IMA and SETUP_RNG_SEED. However, among them, only the
> ima_kexec buffer suffered from the incorrect decryption. Debugging shows it
> is caused by a code bug in early_memremap_is_setup_data(), where the range
> used to check for content embedded inside setup_data is calculated
> incorrectly.
> 
> The lengths of the efi data, rng_seed and ima_kexec payloads are 0x70, 0x20
> and 0x10 respectively, and the length of the setup_data header is 0x10. When
> checking whether an address is inside the embedded content of setup_data, the
> starting addresses of the efi data and rng_seed happen to land inside the
> wrongly calculated range, while the ima_kexec starting address unluckily does
> not pass the check, so it stays mapped encrypted and the error above occurs.
> 
> Fix the code bug so that the kexec/kdump kernel boots up successfully.
> 
> And also f

Re: [PATCH v2 1/2] x86/mm: rename the confusing local variable in early_memremap_is_setup_data()

2024-09-09 Thread Tom Lendacky
On 8/29/24 05:40, Baoquan He wrote:
> In early_memremap_is_setup_data(), the 'size' parameter that is passed in has
> the same name as the local variable inside the while loop. That is confusing
> and makes it easy to mix the two up when reading the code.
> 
> Rename the local variable 'size' inside the while loop to 'sd_size'.
> 
> Also add a similar local variable 'sd_size' in memremap_is_setup_data() to
> simplify the code. A later patch can use it as well.
> 
> Signed-off-by: Baoquan He 

Acked-by: Tom Lendacky 

> ---
>  arch/x86/mm/ioremap.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index aa7d279321ea..f1ee8822ddf1 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -640,7 +640,7 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
>  
>   paddr = boot_params.hdr.setup_data;
>   while (paddr) {
> - unsigned int len;
> + unsigned int len, sd_size;
>  
>   if (phys_addr == paddr)
>   return true;
> @@ -652,6 +652,8 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
>   return false;
>   }
>  
> + sd_size = sizeof(*data);
> +
>   paddr_next = data->next;
>   len = data->len;
>  
> @@ -662,7 +664,9 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
>  
>   if (data->type == SETUP_INDIRECT) {
>   memunmap(data);
> - data = memremap(paddr, sizeof(*data) + len,
> +
> + sd_size += len;
> + data = memremap(paddr, sd_size,
>   MEMREMAP_WB | MEMREMAP_DEC);
>   if (!data) {
>   pr_warn("failed to memremap indirect setup_data\n");
> @@ -701,7 +705,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
>  
>   paddr = boot_params.hdr.setup_data;
>   while (paddr) {
> - unsigned int len, size;
> + unsigned int len, sd_size;
>  
>   if (phys_addr == paddr)
>   return true;
> @@ -712,7 +716,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
>   return false;
>   }
>  
> - size = sizeof(*data);
> + sd_size = sizeof(*data);
>  
>   paddr_next = data->next;
>   len = data->len;
> @@ -723,9 +727,9 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
>   }
>  
>   if (data->type == SETUP_INDIRECT) {
> - size += len;
> + sd_size += len;
>   early_memunmap(data, sizeof(*data));
> - data = early_memremap_decrypted(paddr, size);
> + data = early_memremap_decrypted(paddr, sd_size);
>   if (!data) {
>   pr_warn("failed to early memremap indirect setup_data\n");
>   return false;
> @@ -739,7 +743,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
>   }
>   }
>  
> - early_memunmap(data, size);
> + early_memunmap(data, sd_size);
>  
>   if ((phys_addr > paddr) && (phys_addr < (paddr + len)))
>   return true;



Re: [PATCH] x86/mm/sme: fix the kdump kernel breakage on SME system when CONFIG_IMA_KEXEC=y

2024-08-27 Thread Tom Lendacky
On 8/27/24 08:52, Tom Lendacky wrote:
> On 8/26/24 22:19, Baoquan He wrote:
>> On 08/26/24 at 09:24am, Tom Lendacky wrote:
>>> On 8/25/24 21:44, Baoquan He wrote:
>>>> Recently, it was reported that the kdump kernel is broken during bootup on
>>>> SME systems when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
>>>> can be traced back to commit b69a2afd5afc ("x86/kexec: Carry forward
>>>> IMA measurement log on kexec"). It is just that nobody ever tested it on an
>>>> SME system with CONFIG_IMA_KEXEC enabled.
>>>>
>>>>
>>>> Fix the code bug so that the kexec/kdump kernel boots up successfully.
>>>>
>>>> Fixes: 8f716c9b5feb ("x86/mm: Add support to access boot related data in 
>>>> the clear")
>>>
>>> The check that was modified was added by:
>>> b3c72fc9a78e ("x86/boot: Introduce setup_indirect")
>>>
>>> The SETUP_INDIRECT patches seem to be the issue here.
>>
>> Hmm, I didn't check it carefully, thanks for adding this info. However,
>> after checking commit b3c72fc9a78e, I feel the added code was trying to
>> fix your original early_memremap_is_setup_data(). Even though the
>> SETUP_INDIRECT type of setup_data had been added, the original
>> early_memremap_is_setup_data() only checks the starting address and
>> the content of struct setup_data, which is obviously wrong.
> 
> IIRC, when this function was created, the value of "len" in setup_data
> included the length of "data", so the calculation was correct. Everything
> was contiguous in a setup_data element.
> 
>>
>> arch/x86/include/uapi/asm/setup_data.h:
>> /* extensible setup data list node */
>> struct setup_data {
>> __u64 next;
>> __u32 type;
>> __u32 len;
>> __u8 data[];
>> };
>>
>> As you can see, the zero-length array embeds the carried data, which is
>> expected to be adjacent to its carrier, the struct setup_data.
> 
> Right, and "len" is the length of that data. So paddr + len goes to the
> end of the overall setup_data.

Ah, I see what you're saying. "len" doesn't include the size of the
setup_data structure, only the data. If so, then, yes, adding a sizeof()
to the calculation in the if statement is correct.
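
A minimal sketch of that corrected check, reusing the variable names from the
quoted early_memremap_is_setup_data() loop (an illustration of the conclusion
above, not the literal patch):

	/*
	 * The embedded payload starts right after the 16-byte setup_data
	 * header, so the header size must be part of the upper bound. With
	 * the old "paddr + len" bound, the SETUP_IMA payload (len == 0x10)
	 * begins exactly at the bound and was never remapped decrypted,
	 * while the EFI (0x70) and rng_seed (0x20) payloads only passed
	 * because their len happened to be larger than the header.
	 */
	if ((phys_addr > paddr) &&
	    (phys_addr < (paddr + sizeof(struct setup_data) + len)))
		return true;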

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>>>



Re: [PATCH] x86/mm/sme: fix the kdump kernel breakage on SME system when CONFIG_IMA_KEXEC=y

2024-08-27 Thread Tom Lendacky
On 8/27/24 00:41, Dave Young wrote:
> On Tue, 27 Aug 2024 at 13:28, Baoquan He  wrote:
>>
>> On 08/26/24 at 09:24am, Tom Lendacky wrote:
>>> On 8/25/24 21:44, Baoquan He wrote:
>>>> Recently, it was reported that the kdump kernel is broken during bootup on
>>>> SME systems when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
>>>> can be traced back to commit b69a2afd5afc ("x86/kexec: Carry forward
>>>> IMA measurement log on kexec"). It is just that nobody ever tested it on an
>>>> SME system with CONFIG_IMA_KEXEC enabled.
>>>>

>>
>> I talked to Dave; he reminded me that people could mix up the passed-in
>> parameter 'size' and the local variable 'size' defined inside the while
>> loop, so I am not sure which 'size' you are referring to.
>>
> Baoquan, you are right, but I think I mistakenly read the code in
> memremap_is_setup_data instead of early_memremap_is_setup_data.  You

Ditto.

> can check the memremap_is_setup_data, no "size = sizeof (*data)",  so
> these two functions could both need fixes.
> 
> Otherwise it would be better to change the function-internal variable
> name; it could cause confusion even if the actual result is correct.

Yes, the use of size as a local variable while being passed in as a
parameter is very confusing.

Thanks,
Tom

> 
> Thanks
> Dave
> 



Re: [PATCH] x86/mm/sme: fix the kdump kernel breakage on SME system when CONFIG_IMA_KEXEC=y

2024-08-27 Thread Tom Lendacky
On 8/26/24 22:19, Baoquan He wrote:
> On 08/26/24 at 09:24am, Tom Lendacky wrote:
>> On 8/25/24 21:44, Baoquan He wrote:
>>> Recently, it was reported that the kdump kernel is broken during bootup on
>>> SME systems when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
>>> can be traced back to commit b69a2afd5afc ("x86/kexec: Carry forward
>>> IMA measurement log on kexec"). It is just that nobody ever tested it on an
>>> SME system with CONFIG_IMA_KEXEC enabled.
>>>
>>>
>>> Fix the code bug so that the kexec/kdump kernel boots up successfully.
>>>
>>> Fixes: 8f716c9b5feb ("x86/mm: Add support to access boot related data in 
>>> the clear")
>>
>> The check that was modified was added by:
>>  b3c72fc9a78e ("x86/boot: Introduce setup_indirect")
>>
>> The SETUP_INDIRECT patches seem to be the issue here.
> 
> Hmm, I didn't check it carefully, thanks for adding this info. However,
> after checking commit b3c72fc9a78e, I feel the added code was trying to
> fix your original early_memremap_is_setup_data(). Even though the
> SETUP_INDIRECT type of setup_data had been added, the original
> early_memremap_is_setup_data() only checks the starting address and
> the content of struct setup_data, which is obviously wrong.

IIRC, when this function was created, the value of "len" in setup_data
included the length of "data", so the calculation was correct. Everything
was contiguous in a setup_data element.

> 
> arch/x86/include/uapi/asm/setup_data.h:
> /* extensible setup data list node */
> struct setup_data {
> __u64 next;
> __u32 type;
> __u32 len;
> __u8 data[];
> };
> 
> As you can see, the zero-length array embeds the carried data, which is
> expected to be adjacent to its carrier, the struct setup_data.

Right, and "len" is the length of that data. So paddr + len goes to the
end of the overall setup_data.

Thanks,
Tom

> 
>>



Re: [PATCH] x86/mm/sme: fix the kdump kernel breakage on SME system when CONFIG_IMA_KEXEC=y

2024-08-26 Thread Tom Lendacky
On 8/25/24 21:44, Baoquan He wrote:
> Recently, it was reported that the kdump kernel is broken during bootup on
> SME systems when CONFIG_IMA_KEXEC=y. When debugging, I noticed this
> can be traced back to commit b69a2afd5afc ("x86/kexec: Carry forward
> IMA measurement log on kexec"). It is just that nobody ever tested it on an
> SME system with CONFIG_IMA_KEXEC enabled.
> 
> --
>  ima: No TPM chip found, activating TPM-bypass!
>  Loading compiled-in module X.509 certificates
>  Loaded X.509 cert 'Build time autogenerated kernel key: 
> 18ae0bc7e79b64700122bb1d6a904b070fef2656'
>  ima: Allocated hash algorithm: sha256
>  Oops: general protection fault, probably for non-canonical address 
> 0xcfacfdfe6660003e:  [#1] PREEMPT SMP NOPTI
>  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
>  Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
>  RIP: 0010:ima_restore_measurement_list+0xdc/0x420
>  Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 
> 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 
> f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
>  RSP: 0018:c9053c80 EFLAGS: 00010286
>  RAX:  RBX: c9053d03 RCX: 
>  RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
>  RBP: c9053d80 R08:  R09: 82de1a88
>  R10: c9053da0 R11: 0003 R12: 01a4
>  R13: c9053df0 R14:  R15: 
>  FS:  () GS:88804020() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 7f2c744050e8 CR3: 80004110e000 CR4: 003506b0
>  Call Trace:
>   
>   ? show_trace_log_lvl+0x1b0/0x2f0
>   ? show_trace_log_lvl+0x1b0/0x2f0
>   ? ima_load_kexec_buffer+0x6e/0xf0
>   ? __die_body.cold+0x8/0x12
>   ? die_addr+0x3c/0x60
>   ? exc_general_protection+0x178/0x410
>   ? asm_exc_general_protection+0x26/0x30
>   ? ima_restore_measurement_list+0xdc/0x420
>   ? vprintk_emit+0x1f0/0x270
>   ? ima_load_kexec_buffer+0x6e/0xf0
>   ima_load_kexec_buffer+0x6e/0xf0
>   ima_init+0x52/0xb0
>   ? __pfx_init_ima+0x10/0x10
>   init_ima+0x26/0xc0
>   ? __pfx_init_ima+0x10/0x10
>   do_one_initcall+0x5b/0x300
>   do_initcalls+0xdf/0x100
>   ? __pfx_kernel_init+0x10/0x10
>   kernel_init_freeable+0x147/0x1a0
>   kernel_init+0x1a/0x140
>   ret_from_fork+0x34/0x50
>   ? __pfx_kernel_init+0x10/0x10
>   ret_from_fork_asm+0x1a/0x30
>   
>  Modules linked in:
>  ---[ end trace  ]---
>  RIP: 0010:ima_restore_measurement_list+0xdc/0x420
>  Code: ff 48 c7 85 10 ff ff ff 00 00 00 00 48 c7 85 18 ff ff ff 00 00 00 00 
> 48 85 f6 0f 84 09 03 00 00 48 83 fa 17 0f 86 ff 02 00 00 <66> 83 3e 01 49 89 
> f4 0f 85 90 94 7d 00 48 83 7e 10 ff 0f 84 74 94
>  RSP: 0018:c9053c80 EFLAGS: 00010286
>  RAX:  RBX: c9053d03 RCX: 
>  RDX: e48066052d5df359 RSI: cfacfdfe6660003e RDI: cfacfdfe66600056
>  RBP: c9053d80 R08:  R09: 82de1a88
>  R10: c9053da0 R11: 0003 R12: 01a4
>  R13: c9053df0 R14:  R15: 
>  FS:  () GS:88804020() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 7f2c744050e8 CR3: 80004110e000 CR4: 003506b0
>  Kernel panic - not syncing: Fatal exception
>  Kernel Offset: disabled
>  Rebooting in 10 seconds..
> 
> Debug printing shows that the stored address and size of the ima_kexec buffer
> are not decrypted correctly:
>  --
>  ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359
>  --
> 
> There are three pieces of setup_data info passed to the kexec/kdump kernel:
> SETUP_EFI, SETUP_IMA and SETUP_RNG_SEED. However, among them, only the
> ima_kexec buffer suffered from the incorrect decryption. Debugging shows it
> is caused by a code bug in early_memremap_is_setup_data(), where the range
> used to check for content embedded inside setup_data is calculated
> incorrectly.
> 
> The lengths of the efi data, rng_seed and ima_kexec payloads are 0x70, 0x20
> and 0x10 respectively, and the length of the setup_data header is 0x10. When
> checking whether an address is inside the embedded content of setup_data, the
> starting addresses of the efi data and rng_seed happen to land inside the
> wrongly calculated range, while the ima_kexec starting address unluckily does
> not pass the check, so it stays mapped encrypted and the error above occurs.
> 
> Fix the code bug so that the kexec/kdump kernel boots up successfully.
> 
> Fixes: 8f716c9b5feb ("x86/mm: Add support to access boot related data in the 
> clear")

The check that was modified was added by:
b3c72fc9a78e ("x86/boot: Introduce setup_indirect")

The SETUP_INDIRECT patches seem to be the issue here.

> Signed-off-by: Baoquan He 
> ---
>  arch/x86/mm/ioremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH v9 3/3] x86/snp: Convert shared memory back to private on kexec

2024-06-28 Thread Tom Lendacky
On 6/27/24 23:27, Kalra, Ashish wrote:
> Hello Boris,
> 
> On 6/24/2024 10:59 PM, Borislav Petkov wrote:
>> On Mon, Jun 24, 2024 at 03:57:34PM -0500, Kalra, Ashish wrote:
>>> ...  Hence, added simple static functions make_pte_private() and
>>> set_pte_enc() to make use of the more optimized snp_set_memory_private() to
>>> use the GHCB instead of the MSR protocol. Additionally, make_pte_private()
>>> adds check for GHCB addresses during unshare_all_memory() loop.
>> IOW, what you're saying is: "Boris, what you're asking can't be done."
>>
>> Well, what *you're* asking - for me to maintain crap - can't be done either.
>> So this will stay where it is.
>>
>> Unless you make a genuine effort and refactor the code...
> 
> There is an issue with calling __set_clr_pte_enc() here for the
> _bss_decrypted section being made private again: when calling
> __set_clr_pte_enc() on _bss_decrypted section pages, clflush_cache_range()
> will fail because __va() on this physical range fails, as the _bss_decrypted
> section pages are not in the kernel direct map.
> 
> Hence, clflush_cache_range() in __set_clr_pte_enc() causes an implicit page
> state change which is not resolved, as below, and causes a fatal guest exit:
> 
> qemu-system-x86_64: warning: memory fault: GPA 0x400 size 0x1000 flags 
> 0x8 kvm_convert_memory start 0x400 size 0x1000 shared_to_private
> 
> ...
> 
> KVM: unknown exit reason 24 EAX= EBX= ECX= 
> EDX= ESI= EDI= EBP= ESP= EIP= 
> EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
> 
> ...
> 
> This is the reason why I had to pass the vaddr to set_pte_enc(), added here
> for the kexec code, so that I can use it for clflush_cache_range().
> 
> So for specific cases such as this, where we can't call __set_clr_pte_enc()
> on the _bss_decrypted section, we probably need a separate set_pte_enc().

You can probably add a va parameter, that when not NULL, is used for the
flush. If NULL, then use the __va() macro on the pa.
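
A minimal sketch of that suggestion (the function shape and names follow the
quoted patch and are assumptions, not the eventual implementation):

static void set_pte_enc(pte_t *kpte, int level, void *va)
{
	unsigned long size = page_level_size(level);

	/* Fall back to the direct-map address when the caller passes NULL. */
	if (!va)
		va = __va(pte_pfn(*kpte) << PAGE_SHIFT);

	clflush_cache_range(va, size);

	/* ... then set the encryption bit in *kpte as the patch already does ... */
}

Callers that know the page is not in the direct map (the _bss_decrypted case
above) pass the known virtual address; everyone else passes NULL.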

Thanks,
Tom

> 
> Thanks, Ashish
> 



Re: [PATCH v9 3/3] x86/snp: Convert shared memory back to private on kexec

2024-06-24 Thread Tom Lendacky
On 6/20/24 17:23, Ashish Kalra wrote:
> From: Ashish Kalra 
> 
> SNP guests allocate shared buffers to perform I/O. It is done by
> allocating pages normally from the buddy allocator and converting them
> to shared with set_memory_decrypted().
> 
> The second kernel has no idea what memory is converted this way. It only
> sees E820_TYPE_RAM.
> 
> Accessing shared memory via private mapping will cause unrecoverable RMP
> page-faults.
> 
> On kexec walk direct mapping and convert all shared memory back to
> private. It makes all RAM private again and second kernel may use it
> normally. Additionally for SNP guests convert all bss decrypted section
> pages back to private.
> 
> The conversion occurs in two steps: stopping new conversions and
> unsharing all memory. In the case of normal kexec, the stopping of
> conversions takes place while scheduling is still functioning. This
> allows for waiting until any ongoing conversions are finished. The
> second step is carried out when all CPUs except one are inactive and
> interrupts are disabled. This prevents any conflicts with code that may
> access shared memory.
> 
> Signed-off-by: Ashish Kalra 

The pr_debug() calls don't make a lot of sense (and one appears to be in
the wrong location given what it says vs what is done) and should
probably be removed.

Otherwise:

Reviewed-by: Tom Lendacky 

...

> + /* Check for GHCB for being part of a PMD range. */
> + if ((unsigned long)ghcb >= addr &&
> + (unsigned long)ghcb <= (addr + (pages * PAGE_SIZE))) {
> + /*
> +  * Ensure that the current cpu's GHCB is made private
> +  * at the end of unshared loop so that we continue to use the
> +  * optimized GHCB protocol and not force the switch to
> +  * MSR protocol till the very end.
> +  */
> + pr_debug("setting boot_ghcb to NULL for this cpu ghcb\n");
> + kexec_last_addr_to_make_private = addr;
> + return true;
> + }

...

> + /*
> +  * Switch to using the MSR protocol to change this cpu's
> +  * GHCB to private.
> +  * All the per-cpu GHCBs have been switched back to private,
> +  * so can't do any more GHCB calls to the hypervisor beyond
> +  * this point till the kexec kernel starts running.
> +  */
> + boot_ghcb = NULL;
> + sev_cfg.ghcbs_initialized = false;
> +
> + pr_debug("boot ghcb 0x%lx\n", kexec_last_addr_to_make_private);
> + pte = lookup_address(kexec_last_addr_to_make_private, &level);
> + size = page_level_size(level);
> + set_pte_enc(pte, level, (void 
> *)kexec_last_addr_to_make_private);
> + snp_set_memory_private(kexec_last_addr_to_make_private, (size / 
> PAGE_SIZE));
> + }
> +}



Re: [PATCH v9 2/3] x86/boot: Skip video memory access in the decompressor for SEV-ES/SNP

2024-06-24 Thread Tom Lendacky
On 6/20/24 17:23, Ashish Kalra wrote:
> From: Ashish Kalra 
> 
> Accessing guest video memory/RAM in the decompressor causes guest
> termination as the boot stage2 #VC handler for SEV-ES/SNP systems does
> not support MMIO handling.
> 
> This issue is observed during a SEV-ES/SNP guest kexec as kexec -c adds
> screen_info to the boot parameters passed to the second kernel, which
> causes console output to be dumped to both video and serial.
> 
> As the decompressor output gets cleared really fast, it is preferable to
> get the console output only on serial, hence, skip accessing the video
> RAM during decompressor stage to prevent guest termination.
> 
> Serial console output during decompressor stage works as boot stage2 #VC
> handler already supports handling port I/O.
> 
>   [ bp: Massage. ]
> 
> Suggested-by: Borislav Petkov (AMD) 
> Suggested-by: Thomas Lendacky 
> Signed-off-by: Ashish Kalra 
> Signed-off-by: Borislav Petkov (AMD) 
> Reviewed-by: Kuppuswamy Sathyanarayanan 
> 

Reviewed-by: Tom Lendacky 

> ---
>  arch/x86/boot/compressed/misc.c | 15 +++
>  1 file changed, 15 insertions(+)
> 



Re: [PATCHv11 18/19] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2024-06-14 Thread Tom Lendacky

On 6/13/24 09:56, Borislav Petkov wrote:

On Thu, Jun 13, 2024 at 04:41:00PM +0300, Kirill A. Shutemov wrote:

It is easy enough to do. See the patch below.


Thanks, will have a look.


But I am not sure if I can justify it properly. If someone doesn't really
need 5-level paging, disabling it at compile-time would save ~34K of
kernel code with the configuration.

Is it worth saving ~100 lines of code?


Well, it goes both ways: is it worth saving ~34K kernel text and for that make
the code a lot less conditional, more readable, contain less ugly ifdeffery,


Won't getting rid of the config option cause 5-level to be used by default 
on all platforms that support it? The no5lvl command line option would 
have to be used to get 4-level paging at that point.


Thanks,
Tom


...?





Re: [PATCH v4 0/9] x86/sev: KEXEC/KDUMP support for SEV-ES guests

2024-03-12 Thread Tom Lendacky

On 3/12/24 10:16, Vasant Karasulli wrote:

On Di 12-03-24 09:04:13, Tom Lendacky wrote:

On 3/11/24 15:32, Vasant k wrote:

Hi Tom,

 Right,  it just escaped my mind that the SNP uses the secrets page
to hand over APs to the next stage.  I will correct that in the next


Not quite... The MADT table lists the APs and the GHCB AP Create NAE event
is used to start the APs.


Alright. So AP Jump Table is not used like in the case of SEV-ES. Thanks,


Right. It can be, but we don't use that method in Linux.

Thanks,
Tom


I will keep the changes in the patch set exclusively for SEV-ES then.

- Vasant




Re: [PATCH v4 0/9] x86/sev: KEXEC/KDUMP support for SEV-ES guests

2024-03-12 Thread Tom Lendacky

On 3/11/24 15:32, Vasant k wrote:

Hi Tom,

Right,  it just escaped my mind that the SNP uses the secrets page
to hand over APs to the next stage.  I will correct that in the next


Not quite... The MADT table lists the APs and the GHCB AP Create NAE event 
is used to start the APs.


Thanks,
Tom


version.  Please let me know if you have any corrections or improvement
suggestions on the rest of the patchset.

Thanks,
Vasant





Re: [PATCH v4 0/9] x86/sev: KEXEC/KDUMP support for SEV-ES guests

2024-03-11 Thread Tom Lendacky

On 3/11/24 11:17, Vasant Karasulli wrote:

From: Vasant Karasulli 

Hi,


Hi Vasant,

SNP guest support has been incorporated into the kernel since this 
patchset was originally presented. SNP is also considered a guest with 
encrypted state (CC_ATTR_GUEST_STATE_ENCRYPT will return true), but it does 
not use the AP jump table. So this series needs to be adjusted so that the AP 
jump table is only used for SEV-ES guests.


Thanks,
Tom



here are changes to enable kexec/kdump in SEV-ES guests. The biggest
problem for supporting kexec/kdump under SEV-ES is to find a way to
hand the non-boot CPUs (APs) from one kernel to another.

Without SEV-ES the first kernel parks the CPUs in a HLT loop until
they get reset by the kexec'ed kernel via an INIT-SIPI-SIPI sequence.
For virtual machines the CPU reset is emulated by the hypervisor,
which sets the vCPU registers back to reset state.

This does not work under SEV-ES, because the hypervisor has no access
to the vCPU registers and can't make modifications to them. So an
SEV-ES guest needs to reset the vCPU itself and park it using the
AP-reset-hold protocol. Upon wakeup the guest needs to jump to
real-mode and to the reset-vector configured in the AP-Jump-Table.

The code to do this is the main part of this patch-set. It works by
placing code on the AP Jump-Table page itself to park the vCPU and for
jumping to the reset vector upon wakeup. The code on the AP Jump Table
runs in 16-bit protected mode with segment base set to the beginning
of the page. The AP Jump-Table is usually not within the first 1MB of
memory, so the code can't run in real-mode.

The AP Jump-Table is the best place to put the parking code, because
the memory is owned, but read-only by the firmware and writeable by
the OS. Only the first 4 bytes are used for the reset-vector, leaving
the rest of the page for code/data/stack to park a vCPU. The code
can't be in kernel memory because by the time the vCPU wakes up the
memory will be owned by the new kernel, which might have overwritten it
already.

The other patches add initial GHCB Version 2 protocol support, because
kexec/kdump need the MSR-based (without a GHCB) AP-reset-hold VMGEXIT,
which is a GHCB protocol version 2 feature.

The kexec'ed kernel is also entered via the decompressor and needs
MMIO support there, so this patch-set also adds MMIO #VC support to
the decompressor and support for handling CLFLUSH instructions.

Finally there is also code to disable kexec/kdump support at runtime
when the environment does not support it (e.g. no GHCB protocol
version 2 support or AP Jump Table over 4GB).

The diffstat looks big, but most of it is moving code for MMIO #VC
support around to make it available to the decompressor.

The previous version of this patch-set can be found here:

https://lore.kernel.org/lkml/20220127101044.13803-1-j...@8bytes.org/

Please review.

Thanks,
Vasant

Changes v3->v4:
 - Rebased to v6.8 kernel
- Applied review comments by Sean Christopherson
- Combined sev_es_setup_ap_jump_table() and sev_setup_ap_jump_table()
   into a single function which makes caching jump table address
   unnecessary
 - annotated struct sev_ap_jump_table_header with __packed attribute
- added code to set up real mode data segment at boot time instead of
   hardcoding the value.

Changes v2->v3:

- Rebased to v5.17-rc1
- Applied most review comments by Boris
- Use the name 'AP jump table' consistently
- Make kexec-disabling for unsupported guests x86-specific
- Cleanup and consolidate patches to detect GHCB v2 protocol
  support

Joerg Roedel (9):
   x86/kexec/64: Disable kexec when SEV-ES is active
   x86/sev: Save and print negotiated GHCB protocol version
   x86/sev: Set GHCB data structure version
   x86/sev: Setup code to park APs in the AP Jump Table
   x86/sev: Park APs on AP Jump Table with GHCB protocol version 2
   x86/sev: Use AP Jump Table blob to stop CPU
   x86/sev: Add MMIO handling support to boot/compressed/ code
   x86/sev: Handle CLFLUSH MMIO events
   x86/kexec/64: Support kexec under SEV-ES with AP Jump Table Blob

  arch/x86/boot/compressed/sev.c  |  45 +-
  arch/x86/include/asm/insn-eval.h|   1 +
  arch/x86/include/asm/realmode.h |   5 +
  arch/x86/include/asm/sev-ap-jumptable.h |  30 +
  arch/x86/include/asm/sev.h  |   7 +
  arch/x86/kernel/machine_kexec_64.c  |  12 +
  arch/x86/kernel/process.c   |   8 +
  arch/x86/kernel/sev-shared.c| 234 +-
  arch/x86/kernel/sev.c   | 372 +-
  arch/x86/lib/insn-eval-shared.c | 912 
  arch/x86/lib/insn-eval.c| 911 +--
  arch/x86/realmode/Makefile  |   9 +-
  arch/x86/realmode/rm/Makefile   |  11 +-
  arch/x86/realmode/rm/header.S   |   3 +
  arch/x86/realmode/rm/sev.S  |  85 

Re: [PATCH 2/2] x86/snp: Convert shared memory back to private on kexec

2024-02-22 Thread Tom Lendacky

On 2/22/24 04:50, Kirill A. Shutemov wrote:

On Wed, Feb 21, 2024 at 02:35:13PM -0600, Tom Lendacky wrote:

@@ -906,6 +917,206 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end)
set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
   }
+static inline bool pte_decrypted(pte_t pte)
+{
+   return cc_mkdec(pte_val(pte)) == pte_val(pte);
+}
+


This is duplicated in TDX code, arch/x86/coco/tdx/tdx.c, looks like
something that can go in a header file, maybe mem_encrypt.h.



I think  is a better fit.
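
The helper itself is tiny wherever it ends up; a sketch, shown in <asm/coco.h>
purely as an assumption (cc_mkdec() is declared there):

static inline bool pte_decrypted(pte_t pte)
{
	/* A PTE is decrypted if forcing the decrypted mask changes nothing. */
	return cc_mkdec(pte_val(pte)) == pte_val(pte);
}

Both the SNP and TDX kexec paths could then drop their local copies.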


+void snp_kexec_stop_conversion(bool crash)
+{
+   /* Stop new private<->shared conversions */
+   conversion_allowed = false;
+   crash_requested = crash;
+
+   /*
+* Make sure conversion_allowed is cleared before checking
+* conversions_in_progress.
+*/
+   barrier();


This should be smp_wmb().



Why?


IIUC, this is because conversions_in_progress can be set on another thread 
and so this needs an smp barrier. In this case, smp_wmb() just ends up 
being barrier(), but to me it is clearer this way. Just my opinion, though.
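
For context, a sketch of how the stop-conversion step reads with that change
(variable names are from the quoted patch; the wait loop and the crash special
case are assumptions based on the commit message, not the literal code):

void snp_kexec_stop_conversion(bool crash)
{
	/* Stop new private<->shared conversions. */
	conversion_allowed = false;
	crash_requested = crash;

	/*
	 * Make the clearing of conversion_allowed visible before checking
	 * conversions_in_progress, which other CPUs update. On x86 smp_wmb()
	 * compiles down to barrier(), but it documents the cross-CPU intent.
	 */
	smp_wmb();

	/* For normal kexec, wait for in-flight conversions to finish. */
	if (!crash) {
		while (atomic_read(&conversions_in_progress))
			cpu_relax();
	}
}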


Thanks,
Tom








Re: [PATCH 2/2] x86/snp: Convert shared memory back to private on kexec

2024-02-21 Thread Tom Lendacky

On 2/19/24 19:18, Ashish Kalra wrote:

From: Ashish Kalra 

SNP guests allocate shared buffers to perform I/O. It is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().

The second kernel has no idea what memory is converted this way. It only
sees E820_TYPE_RAM.

Accessing shared memory via private mapping will cause unrecoverable RMP
page-faults.

On kexec walk direct mapping and convert all shared memory back to
private. It makes all RAM private again and second kernel may use it
normally. Additionally for SNP guests convert all bss decrypted section
pages back to private and switch back ROM regions to shared so that
their revalidation does not fail during kexec kernel boot.

The conversion occurs in two steps: stopping new conversions and
unsharing all memory. In the case of normal kexec, the stopping of
conversions takes place while scheduling is still functioning. This
allows for waiting until any ongoing conversions are finished. The
second step is carried out when all CPUs except one are inactive and
interrupts are disabled. This prevents any conflicts with code that may
access shared memory.


It seems like this patch should be broken down into multiple patches,
with the final patch setting x86_platform.guest.enc_kexec_stop_conversion
and x86_platform.guest.enc_kexec_unshare_mem.




Signed-off-by: Ashish Kalra 
---
  arch/x86/include/asm/probe_roms.h |   1 +
  arch/x86/include/asm/sev.h|   8 ++
  arch/x86/kernel/probe_roms.c  |  16 +++
  arch/x86/kernel/sev.c | 211 ++
  arch/x86/mm/mem_encrypt_amd.c |  18 ++-
  5 files changed, 253 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/probe_roms.h b/arch/x86/include/asm/probe_roms.h
index 1c7f3815bbd6..d50b67dbff33 100644
--- a/arch/x86/include/asm/probe_roms.h
+++ b/arch/x86/include/asm/probe_roms.h
@@ -6,4 +6,5 @@ struct pci_dev;
  extern void __iomem *pci_map_biosrom(struct pci_dev *pdev);
  extern void pci_unmap_biosrom(void __iomem *rom);
  extern size_t pci_biosrom_size(struct pci_dev *pdev);
+extern void snp_kexec_unprep_rom_memory(void);
  #endif
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 5b4a1ce3d368..dd236d7e9407 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -81,6 +81,10 @@ extern void vc_no_ghcb(void);
  extern void vc_boot_ghcb(void);
  extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
  
+extern atomic_t conversions_in_progress;

+extern bool conversion_allowed;
+extern unsigned long pg_level_to_pfn(int level, pte_t *kpte, pgprot_t *ret_prot);
+
  /* PVALIDATE return codes */
  #define PVALIDATE_FAIL_SIZEMISMATCH   6
  
@@ -213,6 +217,8 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn

  void snp_accept_memory(phys_addr_t start, phys_addr_t end);
  u64 snp_get_unsupported_features(u64 status);
  u64 sev_get_status(void);
+void snp_kexec_unshare_mem(void);
+void snp_kexec_stop_conversion(bool crash);
  #else
  static inline void sev_es_ist_enter(struct pt_regs *regs) { }
  static inline void sev_es_ist_exit(void) { }
@@ -241,6 +247,8 @@ static inline int snp_issue_guest_request(u64 exit_code, 
struct snp_req_data *in
  static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
  static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
  static inline u64 sev_get_status(void) { return 0; }
+void snp_kexec_unshare_mem(void) {}
+static void snp_kexec_stop_conversion(bool crash) {}
  #endif
  
  #endif

diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 319fef37d9dc..457f1e5c8d00 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -177,6 +177,22 @@ size_t pci_biosrom_size(struct pci_dev *pdev)
  }
  EXPORT_SYMBOL(pci_biosrom_size);
  
+void snp_kexec_unprep_rom_memory(void)

+{
+   unsigned long vaddr, npages, sz;
+
+   /*
+* Switch back ROM regions to shared so that their validation
+* does not fail during kexec kernel boot.
+*/
+   vaddr = (unsigned long)__va(video_rom_resource.start);
+   sz = (system_rom_resource.end + 1) - video_rom_resource.start;
+   npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+   snp_set_memory_shared(vaddr, npages);
+}
+EXPORT_SYMBOL(snp_kexec_unprep_rom_memory);
+
  #define ROMSIGNATURE 0xaa55
  
  static int __init romsignature(const unsigned char *rom)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c67285824e82..765ab83129eb 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -23,6 +23,9 @@
  #include 
  #include 
  #include 
+#include 
+#include 
+#include 
  #include 
  
  #include 

@@ -40,6 +43,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #define DR7_RESET_VALUE0x400
  
@@ -71,6 +75,13 @@ static struct ghcb *boot_ghcb __section(".data");

  /* Bitmap of SEV features supported by the hyper

Re: kexec reboot failed due to commit 75d090fd167ac

2023-09-11 Thread Tom Lendacky

On 9/11/23 10:53, Kirill A. Shutemov wrote:

On Mon, Sep 11, 2023 at 10:33:01AM -0500, Tom Lendacky wrote:

On 9/11/23 09:57, Kirill A. Shutemov wrote:

On Mon, Sep 11, 2023 at 10:56:36PM +0800, Dave Young wrote:

early console in extract_kernel
input_data: 0x00807eb433a8
input_len: 0x00d26271
output: 0x00807b00
output_len: 0x04800c10
kernel_total_size: 0x03e28000
needed_size: 0x04a0
trampoline_32bit: 0x0009d000

Decompressing Linux... out of pgt_buf in 
arch/x86/boot/compressed/ident_map_64.c!?
pages->pgt_buf_offset: 0x6000
pages->pgt_buf_size: 0x6000


Error: kernel_ident_mapping_init() failed

It crashes on #PF due to stbl->nr_tables dereference in
efi_get_conf_table() called from init_unaccepted_memory().

I don't see anything special about stbl location: 0x775d6018.

One other bit of information: disabling 5-level paging also helps the
issue.

I will debug further.


The problem is not limited to unaccepted memory, it also triggers if we
reach efi_get_rsdp_addr() in the same setup.

I think we have several problems here.

- 6 pages for !RANDOMIZE_BASE is only enough for kernel, cmdline,
boot_data and setup_data if we assume that they are in different 1G
regions and do not cross the 1G boundaries. 4-level paging: 1 for PGD, 1
for PUD, 4 for PMD tables.

Looks like we never map EFI/ACPI memory explicitly.

It might work if kernel/cmdline/... are in single 1G and we have
spare pages to handle page faults.

- No spare memory to handle mapping for cc_info and cc_info->cpuid_phys;

- I didn't increase BOOT_INIT_PGT_SIZE when added 5-level paging support.
And if start pagetables from scratch ('else' case of 'if (p4d_offset...))
we run out of memory.

I believe similar logic would apply for BOOT_PGT_SIZE for RANDOMIZE_BASE=y
case.
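
As a back-of-the-envelope check of the 6-page count above, assuming 4-level
paging and that kernel, cmdline, boot_params and setup_data sit in four
distinct 1G regions inside a single 512G range, none crossing a 1G boundary:

	/*
	 * 1 PGD page                    - top level, covers the whole space
	 * 1 PUD page                    - the one 512G range holding all four regions
	 * 4 PMD pages (one per region)  - each maps one 1G region with 2M entries
	 * -----------------------------
	 * 6 pages, i.e. BOOT_INIT_PGT_SIZE / PAGE_SIZE for !RANDOMIZE_BASE
	 *
	 * Anything outside those regions (EFI config tables, ACPI, the cc_info
	 * blob) needs extra PMD/PUD pages, which is exactly what runs out here.
	 */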

I don't know what the right fix is here. We can increase the constants to be
enough to cover the existing cases, but it is very fragile. I am not sure I
saw all users. Some of them could be silently handled by the pagefault handler
in some setups. And it is hard to catch new users during code review.

Also I'm not sure why we need the pagefault handler there. It looks like it
just masks problems. I think everything has to be mapped explicitly.

Any comments?


There was a similar related issue around the cc_info blob that is captured
here: https://lore.kernel.org/lkml/20230601072043.24439-1-l...@redhat.com/

Personally, I'm a fan of mapping the EFI tables that will be passed to the
kexec/kdump kernel. To me, that seems to more closely match the valid
mappings for the tables when control is transferred to the OS from UEFI on
the initial boot.


I don't see how it would help if initialize_identity_maps() resets
pagetables. See 'else' case of 'if (p4d_offset...).


Ok, I see what you mean now.

Thanks,
Tom







Re: kexec reboot failed due to commit 75d090fd167ac

2023-09-11 Thread Tom Lendacky

On 9/11/23 09:57, Kirill A. Shutemov wrote:

On Mon, Sep 11, 2023 at 10:56:36PM +0800, Dave Young wrote:

early console in extract_kernel
input_data: 0x00807eb433a8
input_len: 0x00d26271
output: 0x00807b00
output_len: 0x04800c10
kernel_total_size: 0x03e28000
needed_size: 0x04a0
trampoline_32bit: 0x0009d000

Decompressing Linux... out of pgt_buf in 
arch/x86/boot/compressed/ident_map_64.c!?
pages->pgt_buf_offset: 0x6000
pages->pgt_buf_size: 0x6000


Error: kernel_ident_mapping_init() failed

It crashes on #PF due to stbl->nr_tables dereference in
efi_get_conf_table() called from init_unaccepted_memory().

I don't see anything special about stbl location: 0x775d6018.

One other bit of information: disabling 5-level paging also helps the
issue.

I will debug further.


The problem is not limited to unaccepted memory, it also triggers if we
reach efi_get_rsdp_addr() in the same setup.

I think we have several problems here.

- 6 pages for !RANDOMIZE_BASE is only enough for kernel, cmdline,
   boot_data and setup_data if we assume that they are in different 1G
   regions and do not cross the 1G boundaries. 4-level paging: 1 for PGD, 1
   for PUD, 4 for PMD tables.

   Looks like we never map EFI/ACPI memory explicitly.

   It might work if kernel/cmdline/... are in single 1G and we have
   spare pages to handle page faults.

- No spare memory to handle mapping for cc_info and cc_info->cpuid_phys;

- I didn't increase BOOT_INIT_PGT_SIZE when added 5-level paging support.
   And if start pagetables from scratch ('else' case of 'if (p4d_offset...))
   we run out of memory.

I believe similar logic would apply for BOOT_PGT_SIZE for RANDOMIZE_BASE=y
case.

I don't know what the right fix is here. We can increase the constants to be
enough to cover the existing cases, but it is very fragile. I am not sure I
saw all users. Some of them could be silently handled by the pagefault handler
in some setups. And it is hard to catch new users during code review.

Also I'm not sure why we need the pagefault handler there. It looks like it
just masks problems. I think everything has to be mapped explicitly.

Any comments?


There was a similar related issue around the cc_info blob that is captured 
here: https://lore.kernel.org/lkml/20230601072043.24439-1-l...@redhat.com/


Personally, I'm a fan of mapping the EFI tables that will be passed to the 
kexec/kdump kernel. To me, that seems to more closely match the valid 
mappings for the tables when control is transferred to the OS from UEFI on 
the initial boot.
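
Roughly, that direction would look like the sketch below, run while the first
kernel builds the identity page tables handed to the kexec/kdump kernel (the
helper name and placement are assumptions for illustration, not an actual
patch):

static int map_efi_config_tables(struct x86_mapping_info *info, pgd_t *pgd,
				 efi_system_table_64_t *systab)
{
	unsigned long start = (unsigned long)systab->tables;
	unsigned long end = start + systab->nr_tables * sizeof(efi_config_table_64_t);

	/*
	 * Identity-map the config table array itself; the individual tables it
	 * points to (ACPI RSDP, the CC blob, ...) would need the same treatment.
	 */
	return kernel_ident_mapping_init(info, pgd, round_down(start, PAGE_SIZE),
					 round_up(end, PAGE_SIZE));
}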


Thanks,
Tom







Re: [PATCH v2] x86/kexec: Add EFI config table identity mapping for kexec kernel

2023-08-02 Thread Tom Lendacky

On 8/2/23 04:39, Borislav Petkov wrote:

On Wed, Aug 02, 2023 at 04:22:54PM +0800, Tao Liu wrote:

Thanks for the patch! I have tested it on the lenovo machine in the
past few days, no issue found, so the patch tests OK.


Thanks for testing!

Mike, Tom, the below ok this way?


Short of figuring out how to map page accesses earlier through the 
boot_page_fault IDT routine, this seems reasonable.


Acked-by: Tom Lendacky 



---
From: "Borislav Petkov (AMD)" 
Date: Sun, 16 Jul 2023 20:22:20 +0200
Subject: [PATCH] x86/sev: Do not try to parse for the CC blob on non-AMD
  hardware

Tao Liu reported a boot hang on an Intel Atom machine due to an unmapped
EFI config table. The reason being that the CC blob which contains the
CPUID page for AMD SNP guests is parsed for before even checking
whether the machine runs on AMD hardware.

Usually that's not a problem on !AMD hw - it simply won't find the CC
blob's GUID and return. However, if any parts of the config table
pointers array is not mapped, the kernel will #PF very early in the
decompressor stage without any opportunity to recover.

Therefore, do a superficial CPUID check before poking for the CC blob.
This will fix the current issue on real hardware. It would also work as
a guest on a non-lying hypervisor.

For the lying hypervisor, the check is done again, *after* parsing the
CC blob as the real CPUID page will be present then.

Clear the #VC handler in case SEV-{ES,SNP} hasn't been detected, as
a precaution.

Fixes: c01fce9cef84 ("x86/compressed: Add SEV-SNP feature detection/setup")
Reported-by: Tao Liu 
Signed-off-by: Borislav Petkov (AMD) 
Tested-by: Tao Liu 
Cc: 
Link: https://lore.kernel.org/r/20230601072043.24439-1-l...@redhat.com
---
  arch/x86/boot/compressed/idt_64.c |  9 +++-
  arch/x86/boot/compressed/sev.c| 37 +--
  2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index 6debb816e83d..3cdf94b41456 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -63,7 +63,14 @@ void load_stage2_idt(void)
set_idt_entry(X86_TRAP_PF, boot_page_fault);
  
  #ifdef CONFIG_AMD_MEM_ENCRYPT

-   set_idt_entry(X86_TRAP_VC, boot_stage2_vc);
+   /*
+* Clear the second stage #VC handler in case guest types
+* needing #VC have not been detected.
+*/
+   if (sev_status & BIT(1))
+   set_idt_entry(X86_TRAP_VC, boot_stage2_vc);
+   else
+   set_idt_entry(X86_TRAP_VC, NULL);
  #endif
  
  	load_boot_idt(&boot_idt_desc);

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 09dc8c187b3c..c3e343bd4760 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -404,13 +404,46 @@ void sev_enable(struct boot_params *bp)
if (bp)
bp->cc_blob_address = 0;
  
+	/*

+* Do an initial SEV capability check before snp_init() which
+* loads the CPUID page and the same checks afterwards are done
+* without the hypervisor and are trustworthy.
+*
+* If the HV fakes SEV support, the guest will crash'n'burn
+* which is good enough.
+*/
+
+   /* Check for the SME/SEV support leaf */
+   eax = 0x8000;
+   ecx = 0;
+   native_cpuid(&eax, &ebx, &ecx, &edx);
+   if (eax < 0x801f)
+   return;
+
+   /*
+* Check for the SME/SEV feature:
+*   CPUID Fn8000_001F[EAX]
+*   - Bit 0 - Secure Memory Encryption support
+*   - Bit 1 - Secure Encrypted Virtualization support
+*   CPUID Fn8000_001F[EBX]
+*   - Bits 5:0 - Pagetable bit position used to indicate encryption
+*/
+   eax = 0x801f;
+   ecx = 0;
+   native_cpuid(&eax, &ebx, &ecx, &edx);
+   /* Check whether SEV is supported */
+   if (!(eax & BIT(1)))
+   return;
+
/*
 * Setup/preliminary detection of SNP. This will be sanity-checked
 * against CPUID/MSR values later.
 */
snp = snp_init(bp);
  
-	/* Check for the SME/SEV support leaf */

+   /* Now repeat the checks with the SNP CPUID table. */
+
+   /* Recheck the SME/SEV support leaf */
eax = 0x8000;
ecx = 0;
native_cpuid(&eax, &ebx, &ecx, &edx);
@@ -418,7 +451,7 @@ void sev_enable(struct boot_params *bp)
return;
  
  	/*

-* Check for the SME/SEV feature:
+* Recheck for the SME/SEV feature:
 *   CPUID Fn8000_001F[EAX]
 *   - Bit 0 - Secure Memory Encryption support
 *   - Bit 1 - Secure Encrypted Virtualization support




Re: [PATCH v2] x86/kexec: Add EFI config table identity mapping for kexec kernel

2023-07-07 Thread Tom Lendacky

On 7/7/23 03:22, Joerg Roedel wrote:

On Fri, Jul 07, 2023 at 12:23:59PM +0800, Baoquan He wrote:

I am wondering why we don't detect the cpu type and return early inside
sev_enable() if it's an Intel cpu.

We can't rely on CONFIG_AMD_MEM_ENCRYPT to decide if the code needs to be
executed or not, because we usually enable all of these options in distros.


Looking at the code in head_64.S, by the time sev_enable() runs the SEV
bit should already be set in sev_status. Maybe use that to detect
whether SEV is enabled and bail out early?


I think that is only if you enter on the 32-bit path. If invoked from EFI 
in 64-bit, efi64_stub_entry(), then I don't believe that sev_status will 
be set yet.


Before it can be determined if it is a non-AMD platform, the EFI config 
table has to be searched in order to find the CC blob table. Once that is 
found (or not found), then the checks for the platform are performed and 
sev_enable() will exit if not on an AMD platform.


I think it was an oversight to not add support for identity mapping the 
EFI config tables for kexec. Any features in the future that need to 
search for an EFI config table early like this will need the same.


Thanks,
Tom



Regards,





Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-24 Thread Tom Lendacky

On 9/24/21 4:51 AM, Borislav Petkov wrote:

On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:

On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:

On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:

Unless we find other way to guarantee RIP-relative access, we must use
fixup_pointer() to access any global variables.


Yah, I've asked compiler folks about any guarantees we have wrt
rip-relative addresses but it doesn't look good. Worst case, we'd have
to do the fixup_pointer() thing.

In the meantime, Tom and I did some more poking at this and here's a
diff ontop.

The direction being that we'll stick both the AMD and Intel
*cc_platform_has() call into cc_platform.c for which instrumentation
will be disabled so no issues with that.

And that will keep all that querying all together in a single file.


And still do cc_platform_has() calls in __startup_64() codepath?

It's broken.

Intel detection in cc_platform_has() relies on boot_cpu_data.x86_vendor
which is not initialized until early_cpu_init() in setup_arch(). Given
that X86_VENDOR_INTEL is 0 it leads to false-positive.


Yeah, Tom, I had the same question yesterday.

/me looks in his direction.



Yup, all the things we talked about.

But we also know that cc_platform_has() gets called a few other times 
before boot_cpu_data is initialized - so more false-positives. For 
cc_platform_has() to work properly, given the desire to consolidate the 
calls, IMO, Intel will have to come up with some early setting that can be 
enabled and checked in place of boot_cpu_data or else you live with 
false-positives.
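
To spell the false positive out, a sketch of the dispatch shape under
discussion (not the exact v3 code): boot_cpu_data lives in .bss and is all
zeroes until early_cpu_init() fills it in, and X86_VENDOR_INTEL happens to be
defined as 0.

bool cc_platform_has(enum cc_attr attr)
{
	/* This early, boot_cpu_data.x86_vendor == 0 == X86_VENDOR_INTEL. */
	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
		return intel_cc_platform_has(attr);

	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
		return amd_cc_platform_has(attr);

	return false;
}

So any call made before early_cpu_init() takes the Intel branch, whatever the
hardware actually is.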


Thanks,
Tom


:-)





Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-22 Thread Tom Lendacky

On 9/21/21 4:58 PM, Kirill A. Shutemov wrote:

On Tue, Sep 21, 2021 at 04:43:59PM -0500, Tom Lendacky wrote:

On 9/21/21 4:34 PM, Kirill A. Shutemov wrote:

On Tue, Sep 21, 2021 at 11:27:17PM +0200, Borislav Petkov wrote:

On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:

I still believe calling cc_platform_has() from __startup_64() is totally
broken as it lacks proper wrapping while accessing global variables.


Well, one of the issues on the AMD side was using boot_cpu_data too
early and the Intel side uses it too. Can you replace those checks with
is_tdx_guest() or whatever the helper's name was which would check
whether the kernel is running as a TDX guest, and see if that helps?


There's no need for the Intel check this early. Only AMD needs it. Maybe just
open-code them?


Any way you can put a gzipped/bzipped copy of your vmlinux file somewhere I
can grab it from and take a look at it?


You can find broken vmlinux and bzImage here:

https://drive.google.com/drive/folders/1n74vUQHOGebnF70Im32qLFY8iS3wvjIs?usp=sharing

Let me know when I can remove it.


Looking at everything, it is all RIP relative addressing, so those
accesses should be fine. Your image has the intel_cc_platform_has()
function, does it work if you remove that call? Because I think it may be
the early call into that function which looks like it has instrumentation
that uses %gs in __sanitizer_cov_trace_pc and %gs is not setup properly
yet. And since boot_cpu_data.x86_vendor will likely be zero this early it
will match X86_VENDOR_INTEL and call into that function.

8124f880 <intel_cc_platform_has>:
8124f880:   e8 bb 64 06 00          callq  812b5d40 <__fentry__>
8124f885:   e8 36 ca 42 00          callq  8167c2c0 <__sanitizer_cov_trace_pc>
8124f88a:   31 c0                   xor    %eax,%eax
8124f88c:   c3                      retq

8167c2c0 <__sanitizer_cov_trace_pc>:
8167c2c0:   65 8b 05 39 ad 9a 7e    mov    %gs:0x7e9aad39(%rip),%eax   # 27000 <__preempt_count>
8167c2c7:   89 c6                   mov    %eax,%esi
8167c2c9:   48 8b 0c 24             mov    (%rsp),%rcx
8167c2cd:   81 e6 00 01 00 00       and    $0x100,%esi
8167c2d3:   65 48 8b 14 25 40 70    mov    %gs:0x27040,%rdx

Thanks,
Tom







Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-21 Thread Tom Lendacky

On 9/21/21 4:34 PM, Kirill A. Shutemov wrote:

On Tue, Sep 21, 2021 at 11:27:17PM +0200, Borislav Petkov wrote:

On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:

I still believe calling cc_platform_has() from __startup_64() is totally
broken as it lacks proper wrapping while accessing global variables.


Well, one of the issues on the AMD side was using boot_cpu_data too
early and the Intel side uses it too. Can you replace those checks with
is_tdx_guest() or whatever the helper's name was which would check
whether the kernel is running as a TDX guest, and see if that helps?


There's no need for the Intel check this early. Only AMD needs it. Maybe just
open-code them?


Any way you can put a gzipped/bzipped copy of your vmlinux file somewhere 
I can grab it from and take a look at it?


Thanks,
Tom







Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-21 Thread Tom Lendacky

On 9/20/21 2:23 PM, Kirill A. Shutemov wrote:

On Wed, Sep 08, 2021 at 05:58:36PM -0500, Tom Lendacky wrote:

diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 470b20208430..eff4d19f9cb4 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -30,6 +30,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

  #include 
@@ -287,7 +288,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
unsigned long pgtable_area_len;
unsigned long decrypted_base;
  
-	if (!sme_active())

+   if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return;
  
  	/*


This change breaks boot for me (in KVM on an Intel host). It only reproduces
with allyesconfig. A more reasonable config works fine, but I didn't try to
find the exact cause in the config.


Looks like instrumentation during early boot. I worked with Boris offline 
to exclude arch/x86/kernel/cc_platform.c from some of the instrumentation 
and that allowed an allyesconfig to boot.


Thanks,
Tom



Conversion to cc_platform_has() in __startup_64() in 8/8 has the same
effect.

I believe it is caused by the sme_me_mask access from __startup_64() without
the fixup_pointer() magic. I think __startup_64() requires special treatment
and we should avoid cc_platform_has() there (or have a special version of
the helper). Note that only AMD requires these cc_platform_has() calls to
return true.





Re: [PATCH v3 0/8] Implement generic cc_platform_has() helper function

2021-09-09 Thread Tom Lendacky

On 9/9/21 2:32 AM, Christian Borntraeger wrote:



On 09.09.21 00:58, Tom Lendacky wrote:

This patch series provides a generic helper function, cc_platform_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new confidential computing technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.


Is there a tree somewhere?


I pushed it up to github:

https://github.com/AMDESE/linux/tree/prot-guest-has-v3

Thanks,
Tom



Also, a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been
created for powerpc to hold the out of line function.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 
Cc: Christoph Hellwig 

---

Patches based on:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
   4b93c544e90e ("thunderbolt: test: split up test cases in 
tb_test_credit_alloc_all")


Changes since v2:
- Changed the name from prot_guest_has() to cc_platform_has()
- Took the cc_platform_has() function out of line. Created two new files,
   cc_platform.c, in both x86 and ppc to implement the function. As a
   result, also changed the attribute defines into enums.
- Removed any received Reviewed-by's and Acked-by's given changes in this
   version.
- Added removal of new instances of mem_encrypt_active() usage in powerpc
   arch.
- Based on latest Linux tree to pick up powerpc changes related to the
   mem_encrypt_active() function.

Changes since v1:
- Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT
   in prep for use of prot_guest_has() by TDX.
- Added type includes to the protected_guest.h header file to prevent
   build errors outside of x86.
- Made amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Used amd_prot_guest_has() in place of checking sme_me_mask in the
   arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (8):
   x86/ioremap: Selectively build arch override encryption functions
   mm: Introduce a function to check for confidential computing features
   x86/sev: Add an x86 version of cc_platform_has()
   powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
   x86/sme: Replace occurrences of sme_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_active() with cc_platform_has()
   x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
   treewide: Replace the use of mem_encrypt_active() with
 cc_platform_has()

  arch/Kconfig |  3 +
  arch/powerpc/include/asm/mem_encrypt.h   |  5 --
  arch/powerpc/platforms/pseries/Kconfig   |  1 +
  arch/powerpc/platforms/pseries/Makefile  |  2 +
  arch/powerpc/platforms/pseries/cc_platform.c | 26 ++
  arch/powerpc/platforms/pseries/svm.c |  5 +-
  arch/s390/include/asm/mem_encrypt.h  |  2 -
  arch/x86/Kconfig |  1 +
  arch/x86/include/asm/io.h    |  8 ++
  arch/x86/include/asm/kexec.h |  2 +-
  arch/x86/include/asm/mem_encrypt.h   | 14 +---
  arch/x86/kernel/Makefile |  3 +
  arch/x86/kernel/cc_platform.c    | 21 +
  arch/x86/kernel/crash_dump_64.c  |  4 +-
  arch/x86/kernel/head64.c |  4 +-
  arch/x86/kernel/kvm.c    |  3 +-
  arch/x86/kernel/kvmclock.c   |  4 +-
  arch/x86/kernel/machine_kexec_64.c   | 19 +++--
  arch/x86/kernel/pci-swiotlb.c    |  9 +-
  arch/x86/kernel/relocate_kernel_64.S |  2 +-
  arch/x86/kernel/sev.c    |  6 +-
  arch/x86/kvm/svm/svm.c   |  3 +-
  arch/x86/mm/ioremap.c    | 18 ++--
  arch/x86/mm/mem_encrypt.c    | 57 +++--
  arch/x86/mm/mem_encrypt_identity.c   |  3 +-
  arch/x86/mm/pat/set_memory.c |  3 +-
  arch/x86/platform/efi/efi_64.c 

Re: [PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()

2021-09-09 Thread Tom Lendacky

On 9/9/21 2:25 AM, Christophe Leroy wrote:



On 9/8/21 10:58 PM, Tom Lendacky wrote:


diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h

index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
  #include 
-static inline bool mem_encrypt_active(void)
-{
-    return is_secure_guest();
-}
-
  static inline bool force_dma_unencrypted(struct device *dev)
  {
  return is_secure_guest();
diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c

index 87f001b4c4e4..c083ecbbae4d 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -8,6 +8,7 @@
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void)
  int set_memory_encrypted(unsigned long addr, int numpages)
  {
-    if (!mem_encrypt_active())
+    if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
  return 0;
  if (!PAGE_ALIGNED(addr))
@@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int 
numpages)

  int set_memory_decrypted(unsigned long addr, int numpages)
  {
-    if (!mem_encrypt_active())
+    if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
  return 0;
  if (!PAGE_ALIGNED(addr))


This change unnecessarily complicates the two functions. This is due to
cc_platform_has() being out of line. It should really remain inline.


Please see previous discussion(s) on this series for why the function is
implemented out of line and for the naming:

V1: https://lore.kernel.org/lkml/cover.1627424773.git.thomas.lenda...@amd.com/

V2: https://lore.kernel.org/lkml/cover.1628873970.git.thomas.lenda...@amd.com/

Thanks,
Tom



Before the change we got:

 <.set_memory_encrypted>:
    0:    7d 20 00 a6 mfmsr   r9
    4:    75 29 00 40 andis.  r9,r9,64
    8:    41 82 00 48 beq 50 <.set_memory_encrypted+0x50>
    c:    78 69 04 20 clrldi  r9,r3,48
   10:    2c 29 00 00 cmpdi   r9,0
   14:    40 82 00 4c bne 60 <.set_memory_encrypted+0x60>
   18:    7c 08 02 a6 mflr    r0
   1c:    7c 85 23 78 mr  r5,r4
   20:    78 64 85 02 rldicl  r4,r3,48,20
   24:    61 23 f1 34 ori r3,r9,61748
   28:    f8 01 00 10 std r0,16(r1)
   2c:    f8 21 ff 91 stdu    r1,-112(r1)
   30:    48 00 00 01 bl  30 <.set_memory_encrypted+0x30>
     30: R_PPC64_REL24    .ucall_norets
   34:    60 00 00 00 nop
   38:    38 60 00 00 li  r3,0
   3c:    38 21 00 70 addi    r1,r1,112
   40:    e8 01 00 10 ld  r0,16(r1)
   44:    7c 08 03 a6 mtlr    r0
   48:    4e 80 00 20 blr
   50:    38 60 00 00 li  r3,0
   54:    4e 80 00 20 blr
   60:    38 60 ff ea li  r3,-22
   64:    4e 80 00 20 blr

After the change we get:

 <.set_memory_encrypted>:
    0:    7c 08 02 a6 mflr    r0
    4:    fb c1 ff f0 std r30,-16(r1)
    8:    fb e1 ff f8 std r31,-8(r1)
    c:    7c 7f 1b 78 mr  r31,r3
   10:    38 60 00 00 li  r3,0
   14:    7c 9e 23 78 mr  r30,r4
   18:    f8 01 00 10 std r0,16(r1)
   1c:    f8 21 ff 81 stdu    r1,-128(r1)
   20:    48 00 00 01 bl  20 <.set_memory_encrypted+0x20>
     20: R_PPC64_REL24    .cc_platform_has
   24:    60 00 00 00 nop
   28:    2c 23 00 00 cmpdi   r3,0
   2c:    41 82 00 44 beq 70 <.set_memory_encrypted+0x70>
   30:    7b e9 04 20 clrldi  r9,r31,48
   34:    2c 29 00 00 cmpdi   r9,0
   38:    40 82 00 58 bne 90 <.set_memory_encrypted+0x90>
   3c:    38 60 00 00 li  r3,0
   40:    7f c5 f3 78 mr  r5,r30
   44:    7b e4 85 02 rldicl  r4,r31,48,20
   48:    60 63 f1 34 ori r3,r3,61748
   4c:    48 00 00 01 bl  4c <.set_memory_encrypted+0x4c>
     4c: R_PPC64_REL24    .ucall_norets
   50:    60 00 00 00 nop
   54:    38 60 00 00 li  r3,0
   58:    38 21 00 80 addi    r1,r1,128
   5c:    e8 01 00 10 ld  r0,16(r1)
   60:    eb c1 ff f0 ld  r30,-16(r1)
   64:    eb e1 ff f8 ld  r31,-8(r1)
   68:    7c 08 03 a6 mtlr    r0
   6c:    4e 80 00 20 blr
   70:    38 21 00 80 addi    r1,r1,128
   74:    38 60 00 00 li  r3,0
   78:    e8 01 00 10 ld  r0,16(r1)
   7c:    eb c1 ff f0 ld  r30,-16(r1)
   80:    eb e1 ff f8 ld  r31,-8(r1)
   84:    7c 08 03 a6 mtlr    r0
   88:    4e 80 00 20 blr
   90:    38 60 ff ea li  r3,-22
   94:    4b ff ff c4 b   58 <.set_memory_encrypted+0x58>





[PATCH v3 0/8] Implement generic cc_platform_has() helper function

2021-09-08 Thread Tom Lendacky
This patch series provides a generic helper function, cc_platform_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new confidential computing technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.
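
For a quick overview, the existing helpers map to cc_platform_has()
attributes as follows (summary of patches 5-8 in this series):

  sme_active()         -> cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)
  sev_active()         -> cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)
  sev_es_active()      -> cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)
  mem_encrypt_active() -> cc_platform_has(CC_ATTR_MEM_ENCRYPT)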

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them. Also,
a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been
created for powerpc to hold the out of line function.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 
Cc: Christoph Hellwig 

---

Patches based on:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
  4b93c544e90e ("thunderbolt: test: split up test cases in 
tb_test_credit_alloc_all")

Changes since v2:
- Changed the name from prot_guest_has() to cc_platform_has()
- Took the cc_platform_has() function out of line. Created two new files,
  cc_platform.c, in both x86 and ppc to implement the function. As a
  result, also changed the attribute defines into enums.
- Removed any received Reviewed-by's and Acked-by's given changes in this
  version.
- Added removal of new instances of mem_encrypt_active() usage in powerpc
  arch.
- Based on latest Linux tree to pick up powerpc changes related to the
  mem_encrypt_active() function.

Changes since v1:
- Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT
  in prep for use of prot_guest_has() by TDX.
- Added type includes to the protected_guest.h header file to prevent
  build errors outside of x86.
- Made amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Used amd_prot_guest_has() in place of checking sme_me_mask in the
  arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (8):
  x86/ioremap: Selectively build arch override encryption functions
  mm: Introduce a function to check for confidential computing features
  x86/sev: Add an x86 version of cc_platform_has()
  powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
  x86/sme: Replace occurrences of sme_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
  treewide: Replace the use of mem_encrypt_active() with
cc_platform_has()

 arch/Kconfig |  3 +
 arch/powerpc/include/asm/mem_encrypt.h   |  5 --
 arch/powerpc/platforms/pseries/Kconfig   |  1 +
 arch/powerpc/platforms/pseries/Makefile  |  2 +
 arch/powerpc/platforms/pseries/cc_platform.c | 26 ++
 arch/powerpc/platforms/pseries/svm.c |  5 +-
 arch/s390/include/asm/mem_encrypt.h  |  2 -
 arch/x86/Kconfig |  1 +
 arch/x86/include/asm/io.h|  8 ++
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/include/asm/mem_encrypt.h   | 14 +---
 arch/x86/kernel/Makefile |  3 +
 arch/x86/kernel/cc_platform.c| 21 +
 arch/x86/kernel/crash_dump_64.c  |  4 +-
 arch/x86/kernel/head64.c |  4 +-
 arch/x86/kernel/kvm.c|  3 +-
 arch/x86/kernel/kvmclock.c   |  4 +-
 arch/x86/kernel/machine_kexec_64.c   | 19 +++--
 arch/x86/kernel/pci-swiotlb.c|  9 +-
 arch/x86/kernel/relocate_kernel_64.S |  2 +-
 arch/x86/kernel/sev.c|  6 +-
 arch/x86/kvm/svm/svm.c   |  3 +-
 arch/x86/mm/ioremap.c| 18 ++--
 arch/x86/mm/mem_encrypt.c| 57 +++--
 arch/x86/mm/mem_encrypt_identity.c   |  3 +-
 arch/x86/mm/pat/set_memory.c |  3 +-
 arch/x86/platform/efi/efi_64.c   |  9 +-
 arch/x86/realmode/init.c |  8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  4 +-
 drivers/gpu/drm/drm_cache.c  |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c  |  6 +-
 drivers/iommu/amd/init.c |  7 +-
 drivers/iommu/amd/iommu.c|  3 +-
 drivers/iommu/amd/iommu_v2.c |  3 +-
 drivers/iommu/iommu.c|  3 +-
 fs/proc/vmcore.c |  6 +-
 include/linux/cc_platform.h  | 88 
 include/linux/mem_encrypt.h  

[PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()

2021-09-08 Thread Tom Lendacky
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with
the CC_ATTR_MEM_ENCRYPT attribute.

Remove the implementation of mem_encrypt_active() across all arches.

For s390, since the default implementation of the cc_platform_has()
matches the s390 implementation of mem_encrypt_active(), cc_platform_has()
does not need to be implemented in s390 (the config option
ARCH_HAS_CC_PLATFORM is not set).
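
Side by side, the two pieces that match (both appear in this series):

  /* include/linux/cc_platform.h, when ARCH_HAS_CC_PLATFORM is not set */
  static inline bool cc_platform_has(enum cc_attr attr) { return false; }

  /* the s390 helper removed below behaved identically */
  static inline bool mem_encrypt_active(void) { return false; }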

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: VMware Graphics 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/include/asm/mem_encrypt.h  | 5 -
 arch/powerpc/platforms/pseries/svm.c| 5 +++--
 arch/s390/include/asm/mem_encrypt.h | 2 --
 arch/x86/include/asm/mem_encrypt.h  | 5 -
 arch/x86/kernel/head64.c| 4 ++--
 arch/x86/mm/ioremap.c   | 4 ++--
 arch/x86/mm/mem_encrypt.c   | 2 +-
 arch/x86/mm/pat/set_memory.c| 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 drivers/gpu/drm/drm_cache.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++---
 drivers/iommu/amd/iommu.c   | 3 ++-
 drivers/iommu/amd/iommu_v2.c| 3 ++-
 drivers/iommu/iommu.c   | 3 ++-
 fs/proc/vmcore.c| 6 +++---
 include/linux/mem_encrypt.h | 4 
 kernel/dma/swiotlb.c| 4 ++--
 18 files changed, 31 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h
index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
 
 #include 
 
-static inline bool mem_encrypt_active(void)
-{
-   return is_secure_guest();
-}
-
 static inline bool force_dma_unencrypted(struct device *dev)
 {
return is_secure_guest();
diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c
index 87f001b4c4e4..c083ecbbae4d 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void)
 
 int set_memory_encrypted(unsigned long addr, int numpages)
 {
-   if (!mem_encrypt_active())
+   if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
 
if (!PAGE_ALIGNED(addr))
@@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int numpages)
 
 int set_memory_decrypted(unsigned long addr, int numpages)
 {
-   if (!mem_encrypt_active())
+   if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
 
if (!PAGE_ALIGNED(addr))
diff --git a/arch/s390/include/asm/mem_encrypt.h 
b/arch/s390/include/asm/mem_encrypt.h
index 2542cbf7e2d1..08a8b96606d7 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -4,8 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-static inline bool mem_encrypt_active(void) { return false; }
-
 int set_memory_encrypted(unsigned long addr, int numpages);
 int set_memory_decrypted(unsigned long addr, int numpages);
 
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 499440781b39..ed954aa5c448 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -98,11 +98,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { }
 
 extern char __start_bss_decrypted[], __end_bss_decrypted[], 
__start_bss_decrypted_unused[];
 
-static inline bool mem_encrypt_active(void)
-{
-   return sme_me_mask;
-}
-
 static inline u64 sme_get_me_mask(void)
 {
return sme_me_mask;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de01903c3735..f98c76a1d16c 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
@@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 * there is no need to zero it after changing the memory encryption
 * attribute.
 */
-   if (mem_encrypt_active()) {
+   if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index b59a5cbc6bc5..026031b3b782 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86

[PATCH v3 7/8] x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()

2021-09-08 Thread Tom Lendacky
Replace uses of sev_es_active() with the more generic cc_platform_has()
using CC_ATTR_GUEST_STATE_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_GUEST_STATE_ENCRYPT
can be updated, as required.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h |  2 --
 arch/x86/kernel/sev.c  |  6 +++---
 arch/x86/mm/mem_encrypt.c  | 14 --
 arch/x86/realmode/init.c   |  3 +--
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index f440eebeeb2c..499440781b39 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_es_active(void);
 bool amd_cc_platform_has(enum cc_attr attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
@@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_es_active(void) { return false; }
 static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; }
 
 static inline int __init
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a6895e440bc3..53a6837d354b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -11,7 +11,7 @@
 
 #include  /* For show_regs() */
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd)
int cpu;
u64 pfn;
 
-   if (!sev_es_active())
+   if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
return 0;
 
pflags = _PAGE_NX | _PAGE_RW;
@@ -774,7 +774,7 @@ void __init sev_es_init_vc_handling(void)
 
BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % 
PAGE_SIZE);
 
-   if (!sev_es_active())
+   if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
return;
 
if (!sev_es_check_cpu_features())
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 22d4e152a6de..47d571a2cd28 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -373,13 +373,6 @@ int __init early_set_memory_encrypted(unsigned long vaddr, 
unsigned long size)
  * up under SME the trampoline area cannot be encrypted, whereas under SEV
  * the trampoline area must be encrypted.
  */
-
-/* Needs to be called from non-instrumentable code */
-bool noinstr sev_es_active(void)
-{
-   return sev_status & MSR_AMD64_SEV_ES_ENABLED;
-}
-
 bool amd_cc_platform_has(enum cc_attr attr)
 {
switch (attr) {
@@ -393,7 +386,7 @@ bool amd_cc_platform_has(enum cc_attr attr)
return sev_status & MSR_AMD64_SEV_ENABLED;
 
case CC_ATTR_GUEST_STATE_ENCRYPT:
-   return sev_es_active();
+   return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 
default:
return false;
@@ -469,7 +462,7 @@ static void print_mem_encrypt_feature_info(void)
pr_cont(" SEV");
 
/* Encrypted Register State */
-   if (sev_es_active())
+   if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
pr_cont(" SEV-ES");
 
pr_cont("\n");
@@ -488,7 +481,8 @@ void __init mem_encrypt_init(void)
 * With SEV, we need to unroll the rep string I/O instructions,
 * but SEV-ES supports them through the #VC handler.
 */
-   if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && !sev_es_active())
+   if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
+   !cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
static_branch_enable(&sev_enable_key);
 
print_mem_encrypt_feature_info();
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index c878c5ee5a4c..4a3da7592b99 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -2,7 +2,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header 
*th)
if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
th->flags |= TH_FLAGS_SME_ACTIVE;
 
-   if (sev_es_active()) {
+   if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
/*
 * Skip the call to verify_cpu() in secondary_startup_64 as it
 * will cause #VC exceptions when the AP can't handle them yet.
-- 
2.33.0




[PATCH v3 6/8] x86/sev: Replace occurrences of sev_active() with cc_platform_has()

2021-09-08 Thread Tom Lendacky
Replace uses of sev_active() with the more generic cc_platform_has()
using CC_ATTR_GUEST_MEM_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_GUEST_MEM_ENCRYPT
can be updated, as required.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Ard Biesheuvel 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h |  2 --
 arch/x86/kernel/crash_dump_64.c|  4 +++-
 arch/x86/kernel/kvm.c  |  3 ++-
 arch/x86/kernel/kvmclock.c |  4 ++--
 arch/x86/kernel/machine_kexec_64.c |  4 ++--
 arch/x86/kvm/svm/svm.c |  3 ++-
 arch/x86/mm/ioremap.c  |  6 +++---
 arch/x86/mm/mem_encrypt.c  | 25 ++---
 arch/x86/platform/efi/efi_64.c |  9 +
 9 files changed, 29 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 8c4f0dfe63f9..f440eebeeb2c 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_active(void);
 bool sev_es_active(void);
 bool amd_cc_platform_has(enum cc_attr attr);
 
@@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; }
 
diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 045e82e8945b..a7f617a3981d 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
  unsigned long offset, int userbuf,
@@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char 
*buf, size_t csize,
 
 ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-   return read_from_oldmem(buf, count, ppos, 0, sev_active());
+   return read_from_oldmem(buf, count, ppos, 0,
+   cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a26643dc6bd6..509a578f56a0 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void)
 {
int cpu;
 
-   if (!sev_active())
+   if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return;
 
for_each_possible_cpu(cpu) {
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ad273e5861c1..fc3930c5db1b 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -16,9 +16,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 #include 
 #include 
 
@@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void)
 * hvclock is shared between the guest and the hypervisor, must
 * be mapped decrypted.
 */
-   if (sev_active()) {
+   if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
r = set_memory_decrypted((unsigned long) hvclock_mem,
 1UL << order);
if (r) {
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 7040c0fa921c..f5da4a18070a 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, 
pgd_t *pgd)
}
pte = pte_offset_kernel(pmd, vaddr);
 
-   if (sev_active())
+   if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
prot = PAGE_KERNEL_EXEC;
 
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
@@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long 
start_pgtable)
level4p = (pgd_t *)__va(start_pgtable);
clear_page(level4p);
 
-   if (sev_active()) {
+   if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
info.page_flag   |= _PAGE_ENC;
info.kernpg_flag |= _PAGE_ENC;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 69639f9624f5..eb3669154b48 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -457,7 +458,7 @@ static int has_svm(void)
return 0;
}
 
-   if (sev_active()) 

[PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-08 Thread Tom Lendacky
Replace uses of sme_active() with the more generic cc_platform_has()
using CC_ATTR_HOST_MEM_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_HOST_MEM_ENCRYPT
can be updated, as required.

This also replaces two usages of sev_active() that are really geared
towards detecting if SME is active.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Joerg Roedel 
Cc: Will Deacon 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/include/asm/mem_encrypt.h   |  2 --
 arch/x86/kernel/machine_kexec_64.c   | 15 ---
 arch/x86/kernel/pci-swiotlb.c|  9 -
 arch/x86/kernel/relocate_kernel_64.S |  2 +-
 arch/x86/mm/ioremap.c|  6 +++---
 arch/x86/mm/mem_encrypt.c| 15 +--
 arch/x86/mm/mem_encrypt_identity.c   |  3 ++-
 arch/x86/realmode/init.c |  5 +++--
 drivers/iommu/amd/init.c |  7 ---
 10 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 0a6e34b07017..11b7c06e2828 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page,
unsigned long page_list,
unsigned long start_address,
unsigned int preserve_context,
-   unsigned int sme_active);
+   unsigned int host_mem_enc_active);
 #endif
 
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 3d8a5e8b2e3f..8c4f0dfe63f9 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
 bool amd_cc_platform_has(enum cc_attr attr);
@@ -77,7 +76,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; }
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 131f30fdcfbd..7040c0fa921c 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image)
   (unsigned long)page_list,
   image->start,
   image->preserve_context,
-  sme_active());
+  
cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT));
 
 #ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
@@ -569,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
  */
 int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
 {
-   if (sev_active())
+   if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return 0;
 
/*
-* If SME is active we need to be sure that kexec pages are
-* not encrypted because when we boot to the new kernel the
+* If host memory encryption is active we need to be sure that kexec
+* pages are not encrypted because when we boot to the new kernel the
 * pages won't be accessed encrypted (initially).
 */
return set_memory_decrypted((unsigned long)vaddr, pages);
@@ -582,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int 
pages, gfp_t gfp)
 
 void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
 {
-   if (sev_active())
+   if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return;
 
/*
-* If SME is active we need to reset the pages back to being
-* an encrypted mapping before freeing them.
+* If host memory encryption is active we need to reset the pages back
+* to being an encrypted mapping before freeing them.
 */
set_memory_encrypted((unsigned long)vaddr, pages);
 }
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..814ab46a0dad 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -6,7 +6,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include 
 #include 
@@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void)
 

[PATCH v3 3/8] x86/sev: Add an x86 version of cc_platform_has()

2021-09-08 Thread Tom Lendacky
Introduce an x86 version of the cc_platform_has() function. This will be
used to replace vendor specific calls like sme_active(), sev_active(),
etc.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/mem_encrypt.h |  3 +++
 arch/x86/kernel/Makefile   |  3 +++
 arch/x86/kernel/cc_platform.c  | 21 +
 arch/x86/mm/mem_encrypt.c  | 21 +
 5 files changed, 49 insertions(+)
 create mode 100644 arch/x86/kernel/cc_platform.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4e001425..2b2a9639d8ae 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1513,6 +1513,7 @@ config AMD_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select INSTRUCTION_DECODER
select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
+   select ARCH_HAS_CC_PLATFORM
help
  Say yes to enable support for the encryption of system memory.
  This requires an AMD processor that supports Secure Memory
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 9c80c68d75b5..3d8a5e8b2e3f 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -13,6 +13,7 @@
 #ifndef __ASSEMBLY__
 
 #include 
+#include 
 
 #include 
 
@@ -53,6 +54,7 @@ void __init sev_es_init_vc_handling(void);
 bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
+bool amd_cc_platform_has(enum cc_attr attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
 
@@ -78,6 +80,7 @@ static inline void sev_es_init_vc_handling(void) { }
 static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
+static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; }
 
 static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 
0; }
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8f4e8fa6ed75..f91403a78594 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -147,6 +147,9 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER)+= 
unwind_frame.o
 obj-$(CONFIG_UNWINDER_GUESS)   += unwind_guess.o
 
 obj-$(CONFIG_AMD_MEM_ENCRYPT)  += sev.o
+
+obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
new file mode 100644
index ..3c9bacd3c3f3
--- /dev/null
+++ b/arch/x86/kernel/cc_platform.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Confidential Computing Platform Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#include 
+#include 
+#include 
+
+bool cc_platform_has(enum cc_attr attr)
+{
+   if (sme_me_mask)
+   return amd_cc_platform_has(attr);
+
+   return false;
+}
+EXPORT_SYMBOL_GPL(cc_platform_has);
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..18fe19916bc3 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -389,6 +390,26 @@ bool noinstr sev_es_active(void)
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 }
 
+bool amd_cc_platform_has(enum cc_attr attr)
+{
+   switch (attr) {
+   case CC_ATTR_MEM_ENCRYPT:
+   return sme_me_mask != 0;
+
+   case CC_ATTR_HOST_MEM_ENCRYPT:
+   return sme_active();
+
+   case CC_ATTR_GUEST_MEM_ENCRYPT:
+   return sev_active();
+
+   case CC_ATTR_GUEST_STATE_ENCRYPT:
+   return sev_es_active();
+
+   default:
+   return false;
+   }
+}
+
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {
-- 
2.33.0




[PATCH v3 2/8] mm: Introduce a function to check for confidential computing features

2021-09-08 Thread Tom Lendacky
In prep for other confidential computing technologies, introduce a generic
helper function, cc_platform_has(), that can be used to check for specific
active confidential computing attributes, like memory encryption. This is
intended to eliminate having to add multiple technology-specific checks to
the code (e.g. if (sev_active() || tdx_active())).
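
As a before/after sketch of the kind of call site this targets (the
tdx_active() helper and the body of the check are illustrative only):

  /* before: one check per technology */
  if (sev_active() || tdx_active())
          set_memory_decrypted(vaddr, npages);

  /* after: one generic attribute check */
  if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
          set_memory_decrypted(vaddr, npages);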

Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/Kconfig|  3 ++
 include/linux/cc_platform.h | 88 +
 2 files changed, 91 insertions(+)
 create mode 100644 include/linux/cc_platform.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 3743174da870..ca7c359e5da8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1234,6 +1234,9 @@ config RELR
 config ARCH_HAS_MEM_ENCRYPT
bool
 
+config ARCH_HAS_CC_PLATFORM
+   bool
+
 config HAVE_SPARSE_SYSCALL_NR
bool
help
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
new file mode 100644
index ..253f3ea66cd8
--- /dev/null
+++ b/include/linux/cc_platform.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Confidential Computing Platform Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _CC_PLATFORM_H
+#define _CC_PLATFORM_H
+
+#include 
+#include 
+
+/**
+ * enum cc_attr - Confidential computing attributes
+ *
+ * These attributes represent confidential computing features that are
+ * currently active.
+ */
+enum cc_attr {
+   /**
+* @CC_ATTR_MEM_ENCRYPT: Memory encryption is active
+*
+* The platform/OS is running with active memory encryption. This
+* includes running either as a bare-metal system or a hypervisor
+* and actively using memory encryption or as a guest/virtual machine
+* and actively using memory encryption.
+*
+* Examples include SME, SEV and SEV-ES.
+*/
+   CC_ATTR_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_HOST_MEM_ENCRYPT: Host memory encryption is active
+*
+* The platform/OS is running as a bare-metal system or a hypervisor
+* and actively using memory encryption.
+*
+* Examples include SME.
+*/
+   CC_ATTR_HOST_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active
+*
+* The platform/OS is running as a guest/virtual machine and actively
+* using memory encryption.
+*
+* Examples include SEV and SEV-ES.
+*/
+   CC_ATTR_GUEST_MEM_ENCRYPT,
+
+   /**
+* @CC_ATTR_GUEST_STATE_ENCRYPT: Guest state encryption is active
+*
+* The platform/OS is running as a guest/virtual machine and actively
+* using memory encryption and register state encryption.
+*
+* Examples include SEV-ES.
+*/
+   CC_ATTR_GUEST_STATE_ENCRYPT,
+};
+
+#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
+
+/**
+ * cc_platform_has() - Checks if the specified cc_attr attribute is active
+ * @attr: Confidential computing attribute to check
+ *
+ * The cc_platform_has() function will return an indicator as to whether the
+ * specified Confidential Computing attribute is currently active.
+ *
+ * Context: Any context
+ * Return:
+ * * TRUE  - Specified Confidential Computing attribute is active
+ * * FALSE - Specified Confidential Computing attribute is not active
+ */
+bool cc_platform_has(enum cc_attr attr);
+
+#else  /* !CONFIG_ARCH_HAS_CC_PLATFORM */
+
+static inline bool cc_platform_has(enum cc_attr attr) { return false; }
+
+#endif /* CONFIG_ARCH_HAS_CC_PLATFORM */
+
+#endif /* _CC_PLATFORM_H */
-- 
2.33.0




[PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-08 Thread Tom Lendacky
Introduce a powerpc version of the cc_platform_has() function. This will
be used to replace the powerpc mem_encrypt_active() implementation, so
the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
attribute.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/platforms/pseries/Kconfig   |  1 +
 arch/powerpc/platforms/pseries/Makefile  |  2 ++
 arch/powerpc/platforms/pseries/cc_platform.c | 26 
 3 files changed, 29 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c

diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..2e57391e0778 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -159,6 +159,7 @@ config PPC_SVM
select SWIOTLB
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select ARCH_HAS_CC_PLATFORM
help
 There are certain POWER platforms which support secure guests using
 the Protected Execution Facility, with the help of an Ultravisor
diff --git a/arch/powerpc/platforms/pseries/Makefile 
b/arch/powerpc/platforms/pseries/Makefile
index 4cda0ef87be0..41d8aee98da4 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -31,3 +31,5 @@ obj-$(CONFIG_FA_DUMP) += rtas-fadump.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
 obj-$(CONFIG_PPC_VAS)  += vas.o
+
+obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
diff --git a/arch/powerpc/platforms/pseries/cc_platform.c 
b/arch/powerpc/platforms/pseries/cc_platform.c
new file mode 100644
index ..e8021af83a19
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/cc_platform.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Confidential Computing Platform Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+
+bool cc_platform_has(enum cc_attr attr)
+{
+   switch (attr) {
+   case CC_ATTR_MEM_ENCRYPT:
+   return is_secure_guest();
+
+   default:
+   return false;
+   }
+}
+EXPORT_SYMBOL_GPL(cc_platform_has);
-- 
2.33.0




[PATCH v3 1/8] x86/ioremap: Selectively build arch override encryption functions

2021-09-08 Thread Tom Lendacky
In prep for other uses of the cc_platform_has() function besides AMD's
memory encryption support, selectively build the AMD memory encryption
architecture override functions only when CONFIG_AMD_MEM_ENCRYPT=y. These
functions are:
- early_memremap_pgprot_adjust()
- arch_memremap_can_ram_remap()

Additionally, routines that are only invoked by these architecture
override functions can also be conditionally built. These functions are:
- memremap_should_map_decrypted()
- memremap_is_efi_data()
- memremap_is_setup_data()
- early_memremap_is_setup_data()

And finally, phys_mem_access_encrypted() is conditionally built as well,
but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is
not set.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/io.h | 8 
 arch/x86/mm/ioremap.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 841a5d104afa..5c6a4af0b911 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -391,6 +391,7 @@ extern void arch_io_free_memtype_wc(resource_size_t start, 
resource_size_t size)
 #define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc
 #endif
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
 extern bool arch_memremap_can_ram_remap(resource_size_t offset,
unsigned long size,
unsigned long flags);
@@ -398,6 +399,13 @@ extern bool arch_memremap_can_ram_remap(resource_size_t 
offset,
 
 extern bool phys_mem_access_encrypted(unsigned long phys_addr,
  unsigned long size);
+#else
+static inline bool phys_mem_access_encrypted(unsigned long phys_addr,
+unsigned long size)
+{
+   return true;
+}
+#endif
 
 /**
  * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 60ade7dd71bd..ccff76cedd8f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -508,6 +508,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
memunmap((void *)((unsigned long)addr & PAGE_MASK));
 }
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * Examine the physical address to determine if it is an area of memory
  * that should be mapped decrypted.  If the memory is not part of the
@@ -746,7 +747,6 @@ bool phys_mem_access_encrypted(unsigned long phys_addr, 
unsigned long size)
return arch_memremap_can_ram_remap(phys_addr, size, 0);
 }
 
-#ifdef CONFIG_AMD_MEM_ENCRYPT
 /* Remap memory with encryption */
 void __init *early_memremap_encrypted(resource_size_t phys_addr,
  unsigned long size)
-- 
2.33.0




Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:55 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
>> +static inline bool prot_guest_has(unsigned int attr)
> 
> No real need to have this inline.  In fact I'd suggest we have the
> prototype in a common header so that everyone must implement it out
> of line.

I'll do the same thing I end up doing for x86.

Thanks,
Tom

> 



Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:52 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:22AM -0500, Tom Lendacky wrote:
>> While the name suggests this is intended mainly for guests, it will
>> also be used for host memory encryption checks in place of sme_active().
> 
> Which suggest that the name is not good to start with.  Maybe protected
> hardware, system or platform might be a better choice?
> 
>> +static inline bool prot_guest_has(unsigned int attr)
>> +{
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +if (sme_me_mask)
>> +return amd_prot_guest_has(attr);
>> +#endif
>> +
>> +return false;
>> +}
> 
> Shouldn't this be entirely out of line?

I did it as inline originally because the presence of the function will be
decided based on the ARCH_HAS_PROTECTED_GUEST config. For now, that is
only selected by the AMD memory encryption support, so if I went out of
line I could put in mem_encrypt.c. But with TDX wanting to also use it, it
would have to be in an always built file with some #ifdefs or in its own
file that is conditionally built based on the ARCH_HAS_PROTECTED_GUEST
setting (they've already tried building with ARCH_HAS_PROTECTED_GUEST=y
and AMD_MEM_ENCRYPT not set).

To take it out of line, I'm leaning towards the latter, creating a new
file that is built based on the ARCH_HAS_PROTECTED_GUEST setting.
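
Roughly this shape, as a sketch (names follow this version of the series
and may still change):

  # arch/Kconfig
  config ARCH_HAS_PROTECTED_GUEST
          bool

  # arch/x86/kernel/Makefile
  obj-$(CONFIG_ARCH_HAS_PROTECTED_GUEST) += protected_guest.o

so the out-of-line prot_guest_has() lives in a file that is only built when
an architecture selects the option, without AMD_MEM_ENCRYPT or TDX #ifdefs
in common code.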

> 
>> +/* 0x800 - 0x8ff reserved for AMD */
>> +#define PATTR_SME   0x800
>> +#define PATTR_SEV   0x801
>> +#define PATTR_SEV_ES0x802
> 
> Why do we need reservations for a purely in-kernel namespace?
> 
> And why are you overloading a brand new generic API with weird details
> of a specific implementation like this?

There was some talk about this on the mailing list where TDX and SEV may
need to be differentiated, so we wanted to reserve a range of values per
technology. I guess I can remove them until they are actually needed.

Thanks,
Tom

> 



Re: [PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:46 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:21AM -0500, Tom Lendacky wrote:
>> +#define PATTR_MEM_ENCRYPT   0   /* Encrypted memory */
>> +#define PATTR_HOST_MEM_ENCRYPT  1   /* Host encrypted 
>> memory */
>> +#define PATTR_GUEST_MEM_ENCRYPT 2   /* Guest encrypted 
>> memory */
>> +#define PATTR_GUEST_PROT_STATE  3   /* Guest encrypted 
>> state */
> 
> Please write an actual detailed explanation of what these mean, that
> is what implications it has on the kernel.

Will do.

Thanks,
Tom

> 



Re: [PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function

2021-08-17 Thread Tom Lendacky
On 8/17/21 5:24 AM, Borislav Petkov wrote:
> On Tue, Aug 17, 2021 at 12:22:33PM +0200, Borislav Petkov wrote:
>> This one wants to be part of the previous patch.
> 
> ... and the three following patches too - the treewide patch does a
> single atomic :) replacement and that's it.

Ok, I'll squash those all together.

Thanks,
Tom

> 



Re: [PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 5:02 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:25AM -0500, Tom Lendacky wrote:
>> diff --git a/arch/x86/kernel/machine_kexec_64.c 
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 8e7b517ad738..66ff788b79c9 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, 
>> pgd_t *pgd)
>>  }
>>  pte = pte_offset_kernel(pmd, vaddr);
>>  
>> -if (sev_active())
>> +if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
>>  prot = PAGE_KERNEL_EXEC;
>>  
>>  set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
>> @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned 
>> long start_pgtable)
>>  level4p = (pgd_t *)__va(start_pgtable);
>>  clear_page(level4p);
>>  
>> -if (sev_active()) {
>> +if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
>>  info.page_flag   |= _PAGE_ENC;
>>  info.kernpg_flag |= _PAGE_ENC;
>>  }
>> @@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
>>   */
>>  int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
>>  {
>> -if (sev_active())
>> +if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>>  return 0;
>>  
>>  /*
>> - * If SME is active we need to be sure that kexec pages are
>> - * not encrypted because when we boot to the new kernel the
>> + * If host memory encryption is active we need to be sure that kexec
>> + * pages are not encrypted because when we boot to the new kernel the
>>   * pages won't be accessed encrypted (initially).
>>   */
> 
> That hunk belongs logically into the previous patch which removes
> sme_active().

I was trying to keep the sev_active() changes separate... so even though
it's an SME thing, I kept it here. But I can move it to the previous
patch, it just might look strange.

> 
>>  return set_memory_decrypted((unsigned long)vaddr, pages);
>> @@ -583,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned 
>> int pages, gfp_t gfp)
>>  
>>  void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
>>  {
>> -if (sev_active())
>> +if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>>  return;
>>  
>>  /*
>> - * If SME is active we need to reset the pages back to being
>> - * an encrypted mapping before freeing them.
>> + * If host memory encryption is active we need to reset the pages back
>> + * to being an encrypted mapping before freeing them.
>>   */
>>  set_memory_encrypted((unsigned long)vaddr, pages);
>>  }
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e8ccab50ebf6..b69f5ac622d5 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -25,6 +25,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  #include 
>> @@ -457,7 +458,7 @@ static int has_svm(void)
>>  return 0;
>>  }
>>  
>> -if (sev_active()) {
>> +if (prot_guest_has(PATTR_SEV)) {
>>  pr_info("KVM is unsupported when running as an SEV guest\n");
>>  return 0;
> 
> Same question as for PATTR_SME. PATTR_GUEST_MEM_ENCRYPT should be enough.

Yup, I'll change them all.

> 
>> @@ -373,7 +373,7 @@ int __init early_set_memory_encrypted(unsigned long 
>> vaddr, unsigned long size)
>>   * up under SME the trampoline area cannot be encrypted, whereas under SEV
>>   * the trampoline area must be encrypted.
>>   */
>> -bool sev_active(void)
>> +static bool sev_active(void)
>>  {
>>  return sev_status & MSR_AMD64_SEV_ENABLED;
>>  }
>> @@ -382,7 +382,6 @@ static bool sme_active(void)
>>  {
>>  return sme_me_mask && !sev_active();
>>  }
>> -EXPORT_SYMBOL_GPL(sev_active);
> 
> Just get rid of it altogether.

Ok.

Thanks,
Tom

> 
> Thx.
> 



Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/15/21 9:39 AM, Borislav Petkov wrote:
> On Sun, Aug 15, 2021 at 08:53:31AM -0500, Tom Lendacky wrote:
>> It's not a cross-vendor thing as opposed to a KVM or other hypervisor
>> thing where the family doesn't have to be reported as AMD or HYGON.
> 
> What would be the use case? A HV starts a guest which is supposed to be
> encrypted using the AMD's confidential guest technology but the HV tells
> the guest that it is not running on an AMD SVM HV but something else?
> 
> Is that even an actual use case?
> 
> Or am I way off?
> 
> I know we have talked about this in the past but this still sounds
> insane.

Maybe the KVM folks have a better understanding of it...

I can change it to be an AMD/HYGON check...  although, I'll have to check
to see if any (very) early use of the function will work with that.

At a minimum, the check in arch/x86/kernel/head64.c will have to be
changed or removed. I'll take a closer look.
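
For illustration, the kind of check being discussed would be something like
the following (sketch only; amd_or_hygon_cpu() is a made-up name, and the
early head64.c path can't rely on boot_cpu_data that early):

  static bool amd_or_hygon_cpu(void)
  {
          return boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
                 boot_cpu_data.x86_vendor == X86_VENDOR_HYGON;
  }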

Thanks,
Tom

> 



Re: [PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 4:00 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:24AM -0500, Tom Lendacky wrote:
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index edc67ddf065d..5635ca9a1fbe 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
>>  struct boot_params *boot_data;
>>  unsigned long cmdline_paddr;
>>  
>> -if (!sme_active())
>> +if (!amd_prot_guest_has(PATTR_SME))
>>  return;
>>  
>>  /* Get the command line address before unmapping the real_mode_data */
>> @@ -164,7 +164,7 @@ void __init sme_map_bootdata(char *real_mode_data)
>>  struct boot_params *boot_data;
>>  unsigned long cmdline_paddr;
>>  
>> -if (!sme_active())
>> +if (!amd_prot_guest_has(PATTR_SME))
>>  return;
>>  
>>  __sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
>> @@ -378,7 +378,7 @@ bool sev_active(void)
>>  return sev_status & MSR_AMD64_SEV_ENABLED;
>>  }
>>  
>> -bool sme_active(void)
>> +static bool sme_active(void)
> 
> Just get rid of it altogether. Also, there's an
> 
> EXPORT_SYMBOL_GPL(sev_active);
> > which needs to go under the actual function. Here's a diff ontop:

Will do.

> 
> ---
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 5635ca9a1fbe..a3a2396362a5 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -364,8 +364,9 @@ int __init early_set_memory_encrypted(unsigned long 
> vaddr, unsigned long size)
>  /*
>   * SME and SEV are very similar but they are not the same, so there are
>   * times that the kernel will need to distinguish between SME and SEV. The
> - * sme_active() and sev_active() functions are used for this.  When a
> - * distinction isn't needed, the mem_encrypt_active() function can be used.
> + * PATTR_HOST_MEM_ENCRYPT and PATTR_GUEST_MEM_ENCRYPT flags to
> + * amd_prot_guest_has() are used for this. When a distinction isn't needed,
> + * the mem_encrypt_active() function can be used.
>   *
>   * The trampoline code is a good example for this requirement.  Before
>   * paging is activated, SME will access all memory as decrypted, but SEV
> @@ -377,11 +378,6 @@ bool sev_active(void)
>  {
>   return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
> -
> -static bool sme_active(void)
> -{
> - return sme_me_mask && !sev_active();
> -}
>  EXPORT_SYMBOL_GPL(sev_active);
>  
>  /* Needs to be called from non-instrumentable code */
> @@ -398,7 +394,7 @@ bool amd_prot_guest_has(unsigned int attr)
>  
>   case PATTR_SME:
>   case PATTR_HOST_MEM_ENCRYPT:
> - return sme_active();
> + return sme_me_mask && !sev_active();
>  
>   case PATTR_SEV:
>   case PATTR_GUEST_MEM_ENCRYPT:
> 
>>  {
>>  return sme_me_mask && !sev_active();
>>  }
>> @@ -428,7 +428,7 @@ bool force_dma_unencrypted(struct device *dev)
>>   * device does not support DMA to addresses that include the
>>   * encryption mask.
>>   */
>> -if (sme_active()) {
>> +if (amd_prot_guest_has(PATTR_SME)) {
> 
> So I'm not sure: you add PATTR_SME which you call with
> amd_prot_guest_has() and PATTR_HOST_MEM_ENCRYPT which you call with
> prot_guest_has() and they both end up being the same thing on AMD.
> 
> So why even bother with PATTR_SME?
> 
> This is only going to cause confusion later and I'd say let's simply use
> prot_guest_has(PATTR_HOST_MEM_ENCRYPT) everywhere...

Ok, I can do that. I was trying to ensure that anything that is truly SME
or SEV specific would be called out now.

I'm ok with letting the TDX folks make changes to these calls to be SME or
SEV specific, if necessary, later.
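
As a quick illustration of what that ends up looking like at a call site (a
sketch only, not code from this series), the SME branch of
force_dma_unencrypted() above would simply test the generic attribute; the
mask computation is roughly the existing body of that branch:

  /*
   * Sketch: use the generic PATTR_HOST_MEM_ENCRYPT attribute instead of
   * PATTR_SME. Return true if the device cannot DMA to addresses that
   * include the encryption mask.
   */
  if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT)) {
          u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
          u64 dma_dev_mask = min_not_zero(dev->coherent_dma_mask,
                                          dev->bus_dma_limit);

          if (dma_dev_mask <= dma_enc_mask)
                  return true;
  }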

Thanks,
Tom

> 



Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-17 Thread Tom Lendacky
On 8/17/21 3:35 AM, Borislav Petkov wrote:
> On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the prot_guest_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the PATTR_MEM_ENCRYPT
>> attribute.
>>
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/powerpc/include/asm/protected_guest.h | 30 ++
>>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>>  2 files changed, 31 insertions(+)
>>  create mode 100644 arch/powerpc/include/asm/protected_guest.h
>>
>> diff --git a/arch/powerpc/include/asm/protected_guest.h 
>> b/arch/powerpc/include/asm/protected_guest.h
>> new file mode 100644
>> index ..ce55c2c7e534
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/protected_guest.h
>> @@ -0,0 +1,30 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Protected Guest (and Host) Capability checks
>> + *
>> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
>> + *
>> + * Author: Tom Lendacky 
>> + */
>> +
>> +#ifndef _POWERPC_PROTECTED_GUEST_H
>> +#define _POWERPC_PROTECTED_GUEST_H
>> +
>> +#include 
>> +
>> +#ifndef __ASSEMBLY__
> 
> Same thing here. Pls audit the whole set whether those __ASSEMBLY__
> guards are really needed and remove them if not.

Will do.

Thanks,
Tom

> 
> Thx.
> 



Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-15 Thread Tom Lendacky

On 8/14/21 2:08 PM, Borislav Petkov wrote:

On Fri, Aug 13, 2021 at 11:59:22AM -0500, Tom Lendacky wrote:

diff --git a/arch/x86/include/asm/protected_guest.h 
b/arch/x86/include/asm/protected_guest.h
new file mode 100644
index ..51e4eefd9542
--- /dev/null
+++ b/arch/x86/include/asm/protected_guest.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _X86_PROTECTED_GUEST_H
+#define _X86_PROTECTED_GUEST_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+static inline bool prot_guest_has(unsigned int attr)
+{
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+   if (sme_me_mask)
+   return amd_prot_guest_has(attr);
+#endif
+
+   return false;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _X86_PROTECTED_GUEST_H */


I think this can be simplified more, diff ontop below:

- no need for the ifdeffery as amd_prot_guest_has() has versions for
both when CONFIG_AMD_MEM_ENCRYPT is set or not.


Ugh, yeah, not sure why I put that in for this version since I have the 
static inline for when CONFIG_AMD_MEM_ENCRYPT is not set.
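
For reference, the pair of declarations in asm/mem_encrypt.h that makes the
extra #ifdef redundant, abridged from the patches in this series:

  #ifdef CONFIG_AMD_MEM_ENCRYPT
  bool amd_prot_guest_has(unsigned int attr);
  #else   /* !CONFIG_AMD_MEM_ENCRYPT */
  static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
  #endif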




- the sme_me_mask check is pushed there too.

- and since this is vendor-specific, I'm checking the vendor bit. Yeah,
yeah, cross-vendor but I don't really believe that.


It's not so much a cross-vendor thing as a KVM or other hypervisor thing,
where the family doesn't have to be reported as AMD or HYGON. That's
why I made the if check test sme_me_mask. I think that is the safer way
to go.


Thanks,
Tom



---
diff --git a/arch/x86/include/asm/protected_guest.h 
b/arch/x86/include/asm/protected_guest.h
index 51e4eefd9542..8541c76d5da4 100644
--- a/arch/x86/include/asm/protected_guest.h
+++ b/arch/x86/include/asm/protected_guest.h
@@ -12,18 +12,13 @@
  
  #include 
  
-#ifndef __ASSEMBLY__

-
  static inline bool prot_guest_has(unsigned int attr)
  {
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-   if (sme_me_mask)
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+   boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
return amd_prot_guest_has(attr);
-#endif
  
  	return false;

  }
  
-#endif	/* __ASSEMBLY__ */

-
  #endif/* _X86_PROTECTED_GUEST_H */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index edc67ddf065d..5a0442a6f072 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -392,6 +392,9 @@ bool noinstr sev_es_active(void)
  
  bool amd_prot_guest_has(unsigned int attr)

  {
+   if (!sme_me_mask)
+   return false;
+
switch (attr) {
case PATTR_MEM_ENCRYPT:
return sme_me_mask != 0;





Re: [PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features

2021-08-14 Thread Tom Lendacky

On 8/14/21 1:32 PM, Borislav Petkov wrote:

On Fri, Aug 13, 2021 at 11:59:21AM -0500, Tom Lendacky wrote:

diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h
new file mode 100644
index ..43d4dde94793
--- /dev/null
+++ b/include/linux/protected_guest.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _PROTECTED_GUEST_H
+#define _PROTECTED_GUEST_H
+
+#ifndef __ASSEMBLY__

   ^

Do you really need that guard? It builds fine without it too. Or
something coming later does need it...?


No, I probably did it out of habit. I can remove it in the next version.

Thanks,
Tom







Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky

On 8/13/21 12:08 PM, Tom Lendacky wrote:

On 8/12/21 5:07 AM, Kirill A. Shutemov wrote:

On Wed, Aug 11, 2021 at 10:52:55AM -0500, Tom Lendacky wrote:

On 8/11/21 7:19 AM, Kirill A. Shutemov wrote:

On Tue, Aug 10, 2021 at 02:48:54PM -0500, Tom Lendacky wrote:

On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote:

...

Looking at the code again, now I *think* the reason is accessing a global
variable from __startup_64() inside TDX version of prot_guest_has().

__startup_64() is special. If you access any global variable you need to
use fixup_pointer(). See comment before __startup_64().

I'm not sure how you get away with accessing sme_me_mask directly from
there. Any clues? Maybe it's just luck and the compiler generates code just
right for your case, I donno.


Hmm... yeah, could be that the compiler is using rip-relative addressing
for it because it lives in the .data section?


I guess. It has to be fixed. It may break with a compiler upgrade or any
random change around the code.


I'll look at doing that separate from this series.



BTW, does it work with clang for you?


I haven't tried with clang, I'll check on that.


Just as an fyi, clang also uses rip relative addressing for those 
variables. No issues booting SME and SEV guests built with clang.


Thanks,
Tom



Thanks,
Tom







Re: [PATCH v2 00/12] Implement generic prot_guest_has() helper function

2021-08-13 Thread Tom Lendacky

On 8/13/21 11:59 AM, Tom Lendacky wrote:

This patch series provides a generic helper function, prot_guest_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new protected virtualization technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.


There are some patches related to PPC that added new calls to the 
mem_encrypt_active() function that are not yet in the tip tree. After the 
merge window, I'll need to send a v3 with those additional changes before 
this series can be applied.


Thanks,
Tom



Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky

On 8/12/21 5:07 AM, Kirill A. Shutemov wrote:

On Wed, Aug 11, 2021 at 10:52:55AM -0500, Tom Lendacky wrote:

On 8/11/21 7:19 AM, Kirill A. Shutemov wrote:

On Tue, Aug 10, 2021 at 02:48:54PM -0500, Tom Lendacky wrote:

On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote:

...

Looking at the code again, now I *think* the reason is accessing a global
variable from __startup_64() inside TDX version of prot_guest_has().

__startup_64() is special. If you access any global variable you need to
use fixup_pointer(). See comment before __startup_64().

I'm not sure how you get away with accessing sme_me_mask directly from
there. Any clues? Maybe it's just luck and the compiler generates code just right
for your case, I donno.


Hmm... yeah, could be that the compiler is using rip-relative addressing
for it because it lives in the .data section?


I guess. It has to be fixed. It may break with a compiler upgrade or any
random change around the code.


I'll look at doing that separate from this series.



BTW, does it work with clang for you?


I haven't tried with clang, I'll check on that.

Thanks,
Tom







[PATCH v2 12/12] s390/mm: Remove the now unused mem_encrypt_active() function

2021-08-13 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation. Since the default implementation of the
prot_guest_has() matches the s390 implementation of mem_encrypt_active(),
prot_guest_has() does not need to be implemented in s390 (the config
option ARCH_HAS_PROTECTED_GUEST is not set).
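
For clarity, the generic fallback that s390 picks up from
include/linux/protected_guest.h (patch 02/12 in this series) when
ARCH_HAS_PROTECTED_GUEST is not selected is simply:

  static inline bool prot_guest_has(unsigned int attr) { return false; }

which matches the behavior of the s390 mem_encrypt_active() stub removed
below.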

Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Signed-off-by: Tom Lendacky 
---
 arch/s390/include/asm/mem_encrypt.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/s390/include/asm/mem_encrypt.h 
b/arch/s390/include/asm/mem_encrypt.h
index 2542cbf7e2d1..08a8b96606d7 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -4,8 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-static inline bool mem_encrypt_active(void) { return false; }
-
 int set_memory_encrypted(unsigned long addr, int numpages);
 int set_memory_decrypted(unsigned long addr, int numpages);
 
-- 
2.32.0




[PATCH v2 11/12] powerpc/pseries/svm: Remove the now unused mem_encrypt_active() function

2021-08-13 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/include/asm/mem_encrypt.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h
index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
 
 #include 
 
-static inline bool mem_encrypt_active(void)
-{
-   return is_secure_guest();
-}
-
 static inline bool force_dma_unencrypted(struct device *dev)
 {
return is_secure_guest();
-- 
2.32.0




[PATCH v2 10/12] x86/sev: Remove the now unused mem_encrypt_active() function

2021-08-13 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Reviewed-by: Joerg Roedel 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 797146e0cd6b..94c089e9ea69 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -97,11 +97,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { }
 
 extern char __start_bss_decrypted[], __end_bss_decrypted[], 
__start_bss_decrypted_unused[];
 
-static inline bool mem_encrypt_active(void)
-{
-   return sme_me_mask;
-}
-
 static inline u64 sme_get_me_mask(void)
 {
return sme_me_mask;
-- 
2.32.0




[PATCH v2 08/12] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky
Replace occurrences of mem_encrypt_active() with calls to prot_guest_has()
with the PATTR_MEM_ENCRYPT attribute.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: VMware Graphics 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Dave Young 
Cc: Baoquan He 
Signed-off-by: Tom Lendacky 
---
 arch/x86/kernel/head64.c| 4 ++--
 arch/x86/mm/ioremap.c   | 4 ++--
 arch/x86/mm/mem_encrypt.c   | 5 ++---
 arch/x86/mm/pat/set_memory.c| 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 drivers/gpu/drm/drm_cache.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++---
 drivers/iommu/amd/iommu.c   | 3 ++-
 drivers/iommu/amd/iommu_v2.c| 3 ++-
 drivers/iommu/iommu.c   | 3 ++-
 fs/proc/vmcore.c| 6 +++---
 kernel/dma/swiotlb.c| 4 ++--
 13 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de01903c3735..cafed6456d45 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
@@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 * there is no need to zero it after changing the memory encryption
 * attribute.
 */
-   if (mem_encrypt_active()) {
+   if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 3ed0f28f12af..7f012fc1b600 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -694,7 +694,7 @@ static bool __init 
early_memremap_is_setup_data(resource_size_t phys_addr,
 bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size,
 unsigned long flags)
 {
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return true;
 
if (flags & MEMREMAP_ENC)
@@ -724,7 +724,7 @@ pgprot_t __init 
early_memremap_pgprot_adjust(resource_size_t phys_addr,
 {
bool encrypted_prot;
 
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return prot;
 
encrypted_prot = true;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 38dfa84b77a1..69aed9935b5e 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -364,8 +364,7 @@ int __init early_set_memory_encrypted(unsigned long vaddr, 
unsigned long size)
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this.  When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * sme_active() and sev_active() functions are used for this.
  *
  * The trampoline code is a good example for this requirement.  Before
  * paging is activated, SME will access all memory as decrypted, but SEV
@@ -451,7 +450,7 @@ void __init mem_encrypt_free_decrypted_mem(void)
 * The unused memory range was mapped decrypted, change the encryption
 * attribute from decrypted to encrypted before freeing it.
 */
-   if (mem_encrypt_active()) {
+   if (amd_prot_guest_has(PATTR_MEM_ENCRYPT)) {
r = set_memory_encrypted(vaddr, npages);
if (r) {
pr_warn("failed to free unused decrypted pages\n");
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index ad8a5c586a35..6925f2bb4be1 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1986,7 +1987,7 @@ static int __set_memory_enc_dec(unsigned long addr, int 
numpages, bool enc)
int ret;
 
/* Nothing to do if memory encryption is not active */
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return 0;
 
/* Should not be working on unaligned addresses */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 971c5b8e75dc..21c1e3056070 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "amdgpu.h"
 #include "amdgpu_irq.h"
@@ -1250,7 +1251,8 @@ static

[PATCH v2 00/12] Implement generic prot_guest_has() helper function

2021-08-13 Thread Tom Lendacky
This patch series provides a generic helper function, prot_guest_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new protected virtualization technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 

---

Patches based on:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
  0b52902cd2d9 ("Merge branch 'efi/urgent'")

Changes since v1:
- Move some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT
  in prep for use of prot_guest_has() by TDX.
- Add type includes to the protected_guest.h header file to prevent
  build errors outside of x86.
- Make amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Use amd_prot_guest_has() in place of checking sme_me_mask in the
  arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (12):
  x86/ioremap: Selectively build arch override encryption functions
  mm: Introduce a function to check for virtualization protection
features
  x86/sev: Add an x86 version of prot_guest_has()
  powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
  x86/sme: Replace occurrences of sme_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
  treewide: Replace the use of mem_encrypt_active() with
prot_guest_has()
  mm: Remove the now unused mem_encrypt_active() function
  x86/sev: Remove the now unused mem_encrypt_active() function
  powerpc/pseries/svm: Remove the now unused mem_encrypt_active()
function
  s390/mm: Remove the now unused mem_encrypt_active() function

 arch/Kconfig   |  3 ++
 arch/powerpc/include/asm/mem_encrypt.h |  5 --
 arch/powerpc/include/asm/protected_guest.h | 30 +++
 arch/powerpc/platforms/pseries/Kconfig |  1 +
 arch/s390/include/asm/mem_encrypt.h|  2 -
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/io.h  |  8 +++
 arch/x86/include/asm/kexec.h   |  2 +-
 arch/x86/include/asm/mem_encrypt.h | 13 +
 arch/x86/include/asm/protected_guest.h | 29 +++
 arch/x86/kernel/crash_dump_64.c|  4 +-
 arch/x86/kernel/head64.c   |  4 +-
 arch/x86/kernel/kvm.c  |  3 +-
 arch/x86/kernel/kvmclock.c |  4 +-
 arch/x86/kernel/machine_kexec_64.c | 19 +++
 arch/x86/kernel/pci-swiotlb.c  |  9 ++--
 arch/x86/kernel/relocate_kernel_64.S   |  2 +-
 arch/x86/kernel/sev.c  |  6 +--
 arch/x86/kvm/svm/svm.c |  3 +-
 arch/x86/mm/ioremap.c  | 18 +++
 arch/x86/mm/mem_encrypt.c  | 60 +++---
 arch/x86/mm/mem_encrypt_identity.c |  3 +-
 arch/x86/mm/pat/set_memory.c   |  3 +-
 arch/x86/platform/efi/efi_64.c |  9 ++--
 arch/x86/realmode/init.c   |  8 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +-
 drivers/gpu/drm/drm_cache.c|  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c|  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c|  6 +--
 drivers/iommu/amd/init.c   |  7 +--
 drivers/iommu/amd/iommu.c  |  3 +-
 drivers/iommu/amd/iommu_v2.c   |  3 +-
 drivers/iommu/iommu.c  |  3 +-
 fs/proc/vmcore.c   |  6 +--
 include/linux/mem_encrypt.h|  4 --
 include/linux/protected_guest.h| 40 +++
 kernel/dma/swiotlb.c   |  4 +-
 37 files changed, 232 insertions(+), 105 deletions(-)
 create mode 100644 arch/powerpc/include/asm/protected_guest.h
 create mode 100644 arch/x86/include/asm/protected_guest.h
 create mode 100644 include/linux/protected_guest.h

-- 
2.32.0




[PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky
Replace occurrences of sev_active() with the more generic prot_guest_has()
using PATTR_GUEST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c
where PATTR_SEV will be used. If future support is added for other memory
encryption technologies, the use of PATTR_GUEST_MEM_ENCRYPT can be
updated, as required, to use PATTR_SEV.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Ard Biesheuvel 
Reviewed-by: Joerg Roedel 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h |  2 --
 arch/x86/kernel/crash_dump_64.c|  4 +++-
 arch/x86/kernel/kvm.c  |  3 ++-
 arch/x86/kernel/kvmclock.c |  4 ++--
 arch/x86/kernel/machine_kexec_64.c | 16 
 arch/x86/kvm/svm/svm.c |  3 ++-
 arch/x86/mm/ioremap.c  |  6 +++---
 arch/x86/mm/mem_encrypt.c  | 15 +++
 arch/x86/platform/efi/efi_64.c |  9 +
 9 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 956338406cec..7e25de37c148 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_active(void);
 bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
 
@@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 045e82e8945b..0cfe35f03e67 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
  unsigned long offset, int userbuf,
@@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char 
*buf, size_t csize,
 
 ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-   return read_from_oldmem(buf, count, ppos, 0, sev_active());
+   return read_from_oldmem(buf, count, ppos, 0,
+   prot_guest_has(PATTR_GUEST_MEM_ENCRYPT));
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a26643dc6bd6..9d08ad2f3faa 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void)
 {
int cpu;
 
-   if (!sev_active())
+   if (!prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
return;
 
for_each_possible_cpu(cpu) {
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ad273e5861c1..f7ba78a23dcd 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -16,9 +16,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 #include 
 #include 
 
@@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void)
 * hvclock is shared between the guest and the hypervisor, must
 * be mapped decrypted.
 */
-   if (sev_active()) {
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
r = set_memory_decrypted((unsigned long) hvclock_mem,
 1UL << order);
if (r) {
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 8e7b517ad738..66ff788b79c9 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, 
pgd_t *pgd)
}
pte = pte_offset_kernel(pmd, vaddr);
 
-   if (sev_active())
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
prot = PAGE_KERNEL_EXEC;
 
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
@@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long 
start_pgtable)
level4p = (pgd_t *)__va(start_pgtable);
clear_page(level4p);
 
-   if (sev_active()) {
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
info.page_flag   |= _PAGE_ENC;
info.kernpg_flag |= _PAGE_ENC;
}
@@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
  */
 int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
 {
-   if (sev_active())
+   if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))

[PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features

2021-08-13 Thread Tom Lendacky
In prep for other protected virtualization technologies, introduce a
generic helper function, prot_guest_has(), that can be used to check
for specific protection attributes, like memory encryption. This is
intended to eliminate having to add multiple technology-specific checks
to the code (e.g. if (sev_active() || tdx_active())).
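
A hedged before/after sketch of that intent (tdx_active() is only the
hypothetical example from the sentence above, and the helper being guarded
is made up for illustration):

  /* Before: every call site grows a check per technology. */
  if (sev_active() || tdx_active())
          map_shared_buffers_unencrypted();

  /* After: one attribute-based check that any technology can implement. */
  if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
          map_shared_buffers_unencrypted();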

Reviewed-by: Joerg Roedel 
Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/Kconfig|  3 +++
 include/linux/protected_guest.h | 35 +
 2 files changed, 38 insertions(+)
 create mode 100644 include/linux/protected_guest.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 98db63496bab..bd4f60c581f1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1231,6 +1231,9 @@ config RELR
 config ARCH_HAS_MEM_ENCRYPT
bool
 
+config ARCH_HAS_PROTECTED_GUEST
+   bool
+
 config HAVE_SPARSE_SYSCALL_NR
bool
help
diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h
new file mode 100644
index ..43d4dde94793
--- /dev/null
+++ b/include/linux/protected_guest.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _PROTECTED_GUEST_H
+#define _PROTECTED_GUEST_H
+
+#ifndef __ASSEMBLY__
+
+#include 
+#include 
+
+#define PATTR_MEM_ENCRYPT  0   /* Encrypted memory */
+#define PATTR_HOST_MEM_ENCRYPT 1   /* Host encrypted memory */
+#define PATTR_GUEST_MEM_ENCRYPT 2   /* Guest encrypted memory */
+#define PATTR_GUEST_PROT_STATE 3   /* Guest encrypted state */
+
+#ifdef CONFIG_ARCH_HAS_PROTECTED_GUEST
+
+#include 
+
+#else  /* !CONFIG_ARCH_HAS_PROTECTED_GUEST */
+
+static inline bool prot_guest_has(unsigned int attr) { return false; }
+
+#endif /* CONFIG_ARCH_HAS_PROTECTED_GUEST */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _PROTECTED_GUEST_H */
-- 
2.32.0




[PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-13 Thread Tom Lendacky
Introduce a powerpc version of the prot_guest_has() function. This will
be used to replace the powerpc mem_encrypt_active() implementation, so
the implementation will initially only support the PATTR_MEM_ENCRYPT
attribute.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/include/asm/protected_guest.h | 30 ++
 arch/powerpc/platforms/pseries/Kconfig |  1 +
 2 files changed, 31 insertions(+)
 create mode 100644 arch/powerpc/include/asm/protected_guest.h

diff --git a/arch/powerpc/include/asm/protected_guest.h 
b/arch/powerpc/include/asm/protected_guest.h
new file mode 100644
index ..ce55c2c7e534
--- /dev/null
+++ b/arch/powerpc/include/asm/protected_guest.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _POWERPC_PROTECTED_GUEST_H
+#define _POWERPC_PROTECTED_GUEST_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+static inline bool prot_guest_has(unsigned int attr)
+{
+   switch (attr) {
+   case PATTR_MEM_ENCRYPT:
+   return is_secure_guest();
+
+   default:
+   return false;
+   }
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _POWERPC_PROTECTED_GUEST_H */
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..8ce5417d6feb 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -159,6 +159,7 @@ config PPC_SVM
select SWIOTLB
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select ARCH_HAS_PROTECTED_GUEST
help
 There are certain POWER platforms which support secure guests using
 the Protected Execution Facility, with the help of an Ultravisor
-- 
2.32.0




[PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function

2021-08-13 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Reviewed-by: Joerg Roedel 
Signed-off-by: Tom Lendacky 
---
 include/linux/mem_encrypt.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a91f89..ae4526389261 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -16,10 +16,6 @@
 
 #include 
 
-#else  /* !CONFIG_ARCH_HAS_MEM_ENCRYPT */
-
-static inline bool mem_encrypt_active(void) { return false; }
-
 #endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
-- 
2.32.0




[PATCH v2 07/12] x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky
Replace occurrences of sev_es_active() with the more generic
prot_guest_has() using PATTR_GUEST_PROT_STATE, except for in
arch/x86/kernel/sev*.c and arch/x86/mm/mem_encrypt*.c where PATTR_SEV_ES
will be used. If future support is added for other memory encryption
technologies, the use of PATTR_GUEST_PROT_STATE can be updated, as
required, to specifically use PATTR_SEV_ES.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h | 2 --
 arch/x86/kernel/sev.c  | 6 +++---
 arch/x86/mm/mem_encrypt.c  | 7 +++
 arch/x86/realmode/init.c   | 3 +--
 4 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 7e25de37c148..797146e0cd6b 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
@@ -74,7 +73,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
 static inline int __init
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a6895e440bc3..66a4ab9d95d7 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -11,7 +11,7 @@
 
 #include  /* For show_regs() */
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd)
int cpu;
u64 pfn;
 
-   if (!sev_es_active())
+   if (!prot_guest_has(PATTR_SEV_ES))
return 0;
 
pflags = _PAGE_NX | _PAGE_RW;
@@ -774,7 +774,7 @@ void __init sev_es_init_vc_handling(void)
 
BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % 
PAGE_SIZE);
 
-   if (!sev_es_active())
+   if (!prot_guest_has(PATTR_SEV_ES))
return;
 
if (!sev_es_check_cpu_features())
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 83bc928f529e..38dfa84b77a1 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -383,8 +383,7 @@ static bool sme_active(void)
return sme_me_mask && !sev_active();
 }
 
-/* Needs to be called from non-instrumentable code */
-bool noinstr sev_es_active(void)
+static bool sev_es_active(void)
 {
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 }
@@ -482,7 +481,7 @@ static void print_mem_encrypt_feature_info(void)
pr_cont(" SEV");
 
/* Encrypted Register State */
-   if (sev_es_active())
+   if (amd_prot_guest_has(PATTR_SEV_ES))
pr_cont(" SEV-ES");
 
pr_cont("\n");
@@ -501,7 +500,7 @@ void __init mem_encrypt_init(void)
 * With SEV, we need to unroll the rep string I/O instructions,
 * but SEV-ES supports them through the #VC handler.
 */
-   if (amd_prot_guest_has(PATTR_SEV) && !sev_es_active())
+   if (amd_prot_guest_has(PATTR_SEV) && !amd_prot_guest_has(PATTR_SEV_ES))
static_branch_enable(&sev_enable_key);
 
print_mem_encrypt_feature_info();
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 2109ae569c67..7711d0071f41 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -2,7 +2,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header 
*th)
if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
th->flags |= TH_FLAGS_SME_ACTIVE;
 
-   if (sev_es_active()) {
+   if (prot_guest_has(PATTR_GUEST_PROT_STATE)) {
/*
 * Skip the call to verify_cpu() in secondary_startup_64 as it
 * will cause #VC exceptions when the AP can't handle them yet.
-- 
2.32.0




[PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-08-13 Thread Tom Lendacky
Replace occurrences of sme_active() with the more generic prot_guest_has()
using PATTR_HOST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c
where PATTR_SME will be used. If future support is added for other memory
encryption technologies, the use of PATTR_HOST_MEM_ENCRYPT can be
updated, as required, to use PATTR_SME.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Joerg Roedel 
Cc: Will Deacon 
Reviewed-by: Joerg Roedel 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/include/asm/mem_encrypt.h   |  2 --
 arch/x86/kernel/machine_kexec_64.c   |  3 ++-
 arch/x86/kernel/pci-swiotlb.c|  9 -
 arch/x86/kernel/relocate_kernel_64.S |  2 +-
 arch/x86/mm/ioremap.c|  6 +++---
 arch/x86/mm/mem_encrypt.c| 10 +-
 arch/x86/mm/mem_encrypt_identity.c   |  3 ++-
 arch/x86/realmode/init.c |  5 +++--
 drivers/iommu/amd/init.c |  7 ---
 10 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 0a6e34b07017..11b7c06e2828 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page,
unsigned long page_list,
unsigned long start_address,
unsigned int preserve_context,
-   unsigned int sme_active);
+   unsigned int host_mem_enc_active);
 #endif
 
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index a46d47662772..956338406cec 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
@@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 131f30fdcfbd..8e7b517ad738 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image)
   (unsigned long)page_list,
   image->start,
   image->preserve_context,
-  sme_active());
+  prot_guest_has(PATTR_HOST_MEM_ENCRYPT));
 
 #ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..bd9a9cfbc9a2 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -6,7 +6,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include 
 #include 
@@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void)
swiotlb = 1;
 
/*
-* If SME is active then swiotlb will be set to 1 so that bounce
-* buffers are allocated and used for devices that do not support
-* the addressing range required for the encryption mask.
+* Set swiotlb to 1 so that bounce buffers are allocated and used for
+* devices that can't support DMA to encrypted memory.
 */
-   if (sme_active())
+   if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
swiotlb = 1;
 
return swiotlb;
diff --git a/arch/x86/kernel/relocate_kernel_64.S 
b/arch/x86/kernel/relocate_kernel_64.S
index c53271aebb64..c8fe74a28143 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -47,7 +47,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 * %rsi page_list
 * %rdx start address
 * %rcx preserve_context
-* %r8  sme_active
+* %r8  host_mem_enc_active
 */
 
/* Save the CPU context, used for jumping back */
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ccff76cedd8f..583afd54c7e1 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -703,7 +703,7 @@ bool arch_memremap_can_ram_remap(resource_size_t phys_a

[PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-13 Thread Tom Lendacky
Introduce an x86 version of the prot_guest_has() function. This will be
used in the more generic x86 code to replace vendor specific calls like
sev_active(), etc.

While the name suggests this is intended mainly for guests, it will
also be used for host memory encryption checks in place of sme_active().

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Reviewed-by: Joerg Roedel 
Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/mem_encrypt.h |  2 ++
 arch/x86/include/asm/protected_guest.h | 29 ++
 arch/x86/mm/mem_encrypt.c  | 25 ++
 include/linux/protected_guest.h|  5 +
 5 files changed, 62 insertions(+)
 create mode 100644 arch/x86/include/asm/protected_guest.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 421fa9e38c60..82e5fb713261 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1514,6 +1514,7 @@ config AMD_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select INSTRUCTION_DECODER
select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
+   select ARCH_HAS_PROTECTED_GUEST
help
  Say yes to enable support for the encryption of system memory.
  This requires an AMD processor that supports Secure Memory
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 9c80c68d75b5..a46d47662772 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -53,6 +53,7 @@ void __init sev_es_init_vc_handling(void);
 bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
+bool amd_prot_guest_has(unsigned int attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
 
@@ -78,6 +79,7 @@ static inline void sev_es_init_vc_handling(void) { }
 static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
+static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
 static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 
0; }
diff --git a/arch/x86/include/asm/protected_guest.h 
b/arch/x86/include/asm/protected_guest.h
new file mode 100644
index ..51e4eefd9542
--- /dev/null
+++ b/arch/x86/include/asm/protected_guest.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _X86_PROTECTED_GUEST_H
+#define _X86_PROTECTED_GUEST_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+static inline bool prot_guest_has(unsigned int attr)
+{
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+   if (sme_me_mask)
+   return amd_prot_guest_has(attr);
+#endif
+
+   return false;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _X86_PROTECTED_GUEST_H */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..edc67ddf065d 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -389,6 +390,30 @@ bool noinstr sev_es_active(void)
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 }
 
+bool amd_prot_guest_has(unsigned int attr)
+{
+   switch (attr) {
+   case PATTR_MEM_ENCRYPT:
+   return sme_me_mask != 0;
+
+   case PATTR_SME:
+   case PATTR_HOST_MEM_ENCRYPT:
+   return sme_active();
+
+   case PATTR_SEV:
+   case PATTR_GUEST_MEM_ENCRYPT:
+   return sev_active();
+
+   case PATTR_SEV_ES:
+   case PATTR_GUEST_PROT_STATE:
+   return sev_es_active();
+
+   default:
+   return false;
+   }
+}
+EXPORT_SYMBOL_GPL(amd_prot_guest_has);
+
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {
diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h
index 43d4dde94793..5ddef1b6a2ea 100644
--- a/include/linux/protected_guest.h
+++ b/include/linux/protected_guest.h
@@ -20,6 +20,11 @@
 #define PATTR_GUEST_MEM_ENCRYPT 2   /* Guest encrypted memory */
 #define PATTR_GUEST_PROT_STATE 3   /* Guest encrypted state */
 
+/* 0x800 - 0x8ff reserved for AMD */
+#define PATTR_SME  0x800
+#define PATTR_SEV  0x801
+#define PATTR_SEV_ES   0x802
+
 #ifdef CONFIG_ARCH_HAS_PROTECTED_GUEST
 
 #include 
-- 
2.32.0



[PATCH v2 01/12] x86/ioremap: Selectively build arch override encryption functions

2021-08-13 Thread Tom Lendacky
In prep for other uses of the prot_guest_has() function besides AMD's
memory encryption support, selectively build the AMD memory encryption
architecture override functions only when CONFIG_AMD_MEM_ENCRYPT=y. These
functions are:
- early_memremap_pgprot_adjust()
- arch_memremap_can_ram_remap()

Additionally, routines that are only invoked by these architecture
override functions can also be conditionally built. These functions are:
- memremap_should_map_decrypted()
- memremap_is_efi_data()
- memremap_is_setup_data()
- early_memremap_is_setup_data()

And finally, phys_mem_access_encrypted() is conditionally built as well,
but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is
not set.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/io.h | 8 
 arch/x86/mm/ioremap.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 841a5d104afa..5c6a4af0b911 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -391,6 +391,7 @@ extern void arch_io_free_memtype_wc(resource_size_t start, 
resource_size_t size)
 #define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc
 #endif
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
 extern bool arch_memremap_can_ram_remap(resource_size_t offset,
unsigned long size,
unsigned long flags);
@@ -398,6 +399,13 @@ extern bool arch_memremap_can_ram_remap(resource_size_t 
offset,
 
 extern bool phys_mem_access_encrypted(unsigned long phys_addr,
  unsigned long size);
+#else
+static inline bool phys_mem_access_encrypted(unsigned long phys_addr,
+unsigned long size)
+{
+   return true;
+}
+#endif
 
 /**
  * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 60ade7dd71bd..ccff76cedd8f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -508,6 +508,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
memunmap((void *)((unsigned long)addr & PAGE_MASK));
 }
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * Examine the physical address to determine if it is an area of memory
  * that should be mapped decrypted.  If the memory is not part of the
@@ -746,7 +747,6 @@ bool phys_mem_access_encrypted(unsigned long phys_addr, 
unsigned long size)
return arch_memremap_can_ram_remap(phys_addr, size, 0);
 }
 
-#ifdef CONFIG_AMD_MEM_ENCRYPT
 /* Remap memory with encryption */
 void __init *early_memremap_encrypted(resource_size_t phys_addr,
  unsigned long size)
-- 
2.32.0




Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-11 Thread Tom Lendacky
On 8/11/21 7:19 AM, Kirill A. Shutemov wrote:
> On Tue, Aug 10, 2021 at 02:48:54PM -0500, Tom Lendacky wrote:
>> On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>
>>>
>>> On 7/27/21 3:26 PM, Tom Lendacky wrote:
>>>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>>>> index de01903c3735..cafed6456d45 100644
>>>> --- a/arch/x86/kernel/head64.c
>>>> +++ b/arch/x86/kernel/head64.c
>>>> @@ -19,7 +19,7 @@
>>>>   #include 
>>>>   #include 
>>>>   #include 
>>>> -#include 
>>>> +#include 
>>>>   #include 
>>>>     #include 
>>>> @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long
>>>> physaddr,
>>>>    * there is no need to zero it after changing the memory encryption
>>>>    * attribute.
>>>>    */
>>>> -    if (mem_encrypt_active()) {
>>>> +    if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
>>>>   vaddr = (unsigned long)__start_bss_decrypted;
>>>>   vaddr_end = (unsigned long)__end_bss_decrypted;
>>>
>>>
>>> Since this change is specific to AMD, can you replace PATTR_MEM_ENCRYPT with
>>> prot_guest_has(PATTR_SME) || prot_guest_has(PATTR_SEV). It is not used in
>>> TDX.
>>
>> This is a direct replacement for now.
> 
> With current implementation of prot_guest_has() for TDX it breaks boot for
> me.
> 
> Looking at the code again, now I *think* the reason is accessing a global
> variable from __startup_64() inside TDX version of prot_guest_has().
> 
> __startup_64() is special. If you access any global variable you need to
> use fixup_pointer(). See comment before __startup_64().
> 
> I'm not sure how you get away with accessing sme_me_mask directly from
> there. Any clues? Maybe it's just luck and the compiler generates code just right
> for your case, I donno.

Hmm... yeah, could be that the compiler is using rip-relative addressing
for it because it lives in the .data section?

For the static variables in mem_encrypt_identity.c I did an assembler rip
relative LEA, but probably could have passed physaddr to sme_enable() and
used a fixup_pointer() style function, instead.
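
For readers following along, fixup_pointer() lives in
arch/x86/kernel/head64.c and simply rebases a pointer from the kernel's
link-time address to the early physical load address. A minimal sketch of
the pattern being discussed (the call site is illustrative, not code from
this thread):

  /* Roughly the helper from head64.c: */
  static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
  {
          return ptr - (void *)_text + (void *)physaddr;
  }

  /* e.g. inside __startup_64(physaddr, bp), before paging is final: */
  u64 *mask_ptr = fixup_pointer(&sme_me_mask, physaddr);

  if (*mask_ptr) {
          /* SME/SEV is active; act on it via the fixed-up pointer. */
  }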

> 
> A separate point is that TDX version of prot_guest_has() relies on
> cpu_feature_enabled() which is not ready at this point.

Does TDX have to do anything special to make memory able to be shared with
the hypervisor?  You might have to use something that is available earlier
than cpu_feature_enabled() in that case (should you eventually support
kvmclock).

> 
> I think __bss_decrypted fixup has to be done if sme_me_mask is non-zero.
> Or just do it uncoditionally because it's NOP for sme_me_mask == 0.

For SNP, we'll have to additionally call the HV to update the RMP to make
the memory shared. But that could also be done unconditionally since the
early_snp_set_memory_shared() routine will check for SNP before doing
anything.

Thanks,
Tom

> 
>> I think the change you're requesting
>> should be done as part of the TDX support patches so it's clear why it is
>> being changed.
>>
>> But, wouldn't TDX still need to do something with this shared/unencrypted
>> area, though? Or since it is shared, there's actually nothing you need to
>> do (the bss decrypted section exists even if CONFIG_AMD_MEM_ENCRYPT is not
>> configured)?
> 
> AFAICS, only kvmclock uses __bss_decrypted. We don't enable kvmclock in
> TDX at the moment. It may change in the future.
> 



Re: [PATCH 01/11] mm: Introduce a function to check for virtualization protection features

2021-08-11 Thread Tom Lendacky
On 8/11/21 9:53 AM, Kuppuswamy, Sathyanarayanan wrote:
> On 7/27/21 3:26 PM, Tom Lendacky wrote:
>> diff --git a/include/linux/protected_guest.h
>> b/include/linux/protected_guest.h
>> new file mode 100644
>> index ..f8ed7b72967b
>> --- /dev/null
>> +++ b/include/linux/protected_guest.h
>> @@ -0,0 +1,32 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Protected Guest (and Host) Capability checks
>> + *
>> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
>> + *
>> + * Author: Tom Lendacky
>> + */
>> +
>> +#ifndef _PROTECTED_GUEST_H
>> +#define _PROTECTED_GUEST_H
>> +
>> +#ifndef __ASSEMBLY__
> 
> Can you include headers for bool type and false definition?

Can do.

Thanks,
Tom

> 
> --- a/include/linux/protected_guest.h
> +++ b/include/linux/protected_guest.h
> @@ -12,6 +12,9 @@
> 
>  #ifndef __ASSEMBLY__
> 
> +#include 
> +#include 
> 
> Otherwise, I see following errors in multi-config auto testing.
> 
> include/linux/protected_guest.h:40:15: error: unknown type name 'bool'
> include/linux/protected_guest.h:40:63: error: 'false' undeclared (first
> use in this function)
> 



Re: [PATCH RFC 0/2] dma-pool: allow user to disable atomic pool

2021-08-11 Thread Tom Lendacky
On 8/10/21 9:23 PM, Baoquan He wrote:
> On 08/10/21 at 03:52pm, Tom Lendacky wrote:
>> On 8/5/21 1:54 AM, Baoquan He wrote:
>>> On 06/24/21 at 11:47am, Robin Murphy wrote:
>>>> On 2021-06-24 10:29, Baoquan He wrote:
>>>>> On 06/24/21 at 08:40am, Christoph Hellwig wrote:

...

> Looking at those related commits, the one below from David tells
> that the atomic dma pool is used when a device requires a non-blocking and
> unencrypted buffer. When I checked the system I borrowed, it's an AMD EPYC
> and SME is enabled. And it has many pci devices; as you can see, its 'lspci'
> output is 113 lines. But disabling the three atomic pools didn't
> trigger any error on that AMD system. Does it mean only specific devices
> need this atomic pool in the SME/SEV case? Should we add more
> details in the documentation or a code comment to make this clear?

It very well could be just the devices being used. Under SME (bare metal),
if a device supports 64-bit DMA, then bounce buffers aren't used and the
DMA can be performed directly to encrypted memory, so there is no need to
issue a set_memory_decrypted() call; I would assume it likely isn't
using the pool.
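
(A minimal illustration of "supports 64-bit DMA": a driver that sets a full
64-bit mask at probe time, so its DMA addresses can carry the encryption
bit and no bounce buffering is needed under SME. The probe function and
device are hypothetical; dma_set_mask_and_coherent() and DMA_BIT_MASK()
are the real APIs.)

  static int example_probe(struct pci_dev *pdev,
                           const struct pci_device_id *id)
  {
          int rc;

          /* Prefer 64-bit DMA; under SME this avoids SWIOTLB bouncing. */
          rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
          if (rc)
                  /* 32-bit-only devices need bounce buffers instead. */
                  rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

          return rc;
  }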

Under SEV, however, all DMA has to go through guest un-encrypted memory.
If you pass through a device that does dma_alloc_coherent() calls with
GFP_ATOMIC, then the pool will be needed.

Thanks,
Tom



Re: [PATCH RFC 0/2] dma-pool: allow user to disable atomic pool

2021-08-10 Thread Tom Lendacky
On 8/5/21 1:54 AM, Baoquan He wrote:
> On 06/24/21 at 11:47am, Robin Murphy wrote:
>> On 2021-06-24 10:29, Baoquan He wrote:
>>> On 06/24/21 at 08:40am, Christoph Hellwig wrote:
 So reduce the amount allocated.  But the pool is needed for proper
 operation on systems with memory encryption.  And please add the right
 maintainer or at least mailing list for the code you're touching next
 time.
>>>
>>> Oh, I thought it was a memory issue only, should have run
>>> ./scripts/get_maintainer.pl. sorry.
>>>
>>> About reducing the amount allocated, it may not help. Because on x86_64,
>>> kdump kernel doesn't put any page of memory into buddy allocator of DMA
>>> zone. Means it will definitely OOM for atomic_pool_dma initialization.
>>>
>>> Wondering in which case or on which device the atomic pool is needed on
>>> AMD system with mem encryption enabled. As we can see, the OOM will
>>> happen too in kdump kernel on Intel system, even though it's not
>>> necessary.
> 
> Sorry for very late response, and thank both for your comments.
> 
>>
>> Hmm, I think the Kconfig reshuffle has actually left a slight wrinkle here.
>> For DMA_DIRECT_REMAP=y we can assume an atomic pool is always needed, since
>> that was the original behaviour anyway. However the implications of
>> AMD_MEM_ENCRYPT=y are different - even if support is enabled, it still
>> should only be relevant if mem_encrypt_active(), so it probably does make
>> sense to have an additional runtime gate on that.
> 
>>
>> From a quick scan, use of dma_alloc_from_pool() already depends on
>> force_dma_unencrypted() so that's probably fine already, but I think we'd
>> need a bit of extra protection around dma_free_from_pool() to prevent
>> gen_pool_has_addr() dereferencing NULL if the pools are uninitialised, even
>> with your proposed patch as it is. Presumably nothing actually called
>> dma_direct_free() when you tested this?
> 
> Yes, enforcing the conditional check of force_dma_unencrypted() around
> dma_free_from_pool sounds reasonable, just as we have done in
> dma_alloc_from_pool().
> 
> I have tested this patchset on normal x86_64 systems and on one AMD system
> with SME support; disabling the atomic pool fixes the issue that there are no
> managed pages in the DMA zone, so requesting a page from the DMA zone causes
> an allocation failure. And even disabling the atomic pool in the 1st kernel
> didn't cause any problem on one AMD EPYC system which supports SME. I am not
> an expert in the DMA area, and wonder what the atomic pool is supposed to do
> on an SME/SEV system.

I think the atomic pool is used by the NVMe driver. My understanding is
that the driver will do a dma_alloc_coherent() from interrupt context, so it
needs to use GFP_ATOMIC. The pool was created because dma_alloc_coherent()
would perform a set_memory_decrypted() call, which can sleep. The pool
eliminates that issue (David can correct me if I got that wrong).
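
The pattern in question looks roughly like this (a hypothetical driver
fragment for illustration, not the actual NVMe code):

#include <linux/dma-mapping.h>
#include <linux/interrupt.h>

struct foo_dev {
	struct device *dev;	/* device doing the DMA */
};

/* Runs in hard-irq context, so sleeping is not allowed. Under SEV, a
 * GFP_ATOMIC coherent allocation can't go through set_memory_decrypted()
 * (which may sleep), so it has to be served from the pre-decrypted
 * atomic pool instead. */
static irqreturn_t foo_irq(int irq, void *data)
{
	struct foo_dev *fd = data;
	dma_addr_t dma_handle;
	void *buf;

	buf = dma_alloc_coherent(fd->dev, PAGE_SIZE, &dma_handle, GFP_ATOMIC);
	if (!buf)
		return IRQ_NONE;

	/* ... hand dma_handle to the hardware, free the buffer later with
	 * dma_free_coherent() from non-atomic context ... */
	return IRQ_HANDLED;
}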

Thanks,
Tom

> 
> Besides, even with the atomic pool disabled, the slub page allocation for
> dma-kmalloc also triggers a page allocation failure. So I have taken
> another approach to fix these; please check the v2 post. Disabling the
> atomic pool can still be a good-to-have change.
> 
> Thanks
> Baoquan
> 



Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-10 Thread Tom Lendacky
On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 7/27/21 3:26 PM, Tom Lendacky wrote:
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index de01903c3735..cafed6456d45 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -19,7 +19,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>   #include 
>>     #include 
>> @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long
>> physaddr,
>>    * there is no need to zero it after changing the memory encryption
>>    * attribute.
>>    */
>> -    if (mem_encrypt_active()) {
>> +    if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
>>   vaddr = (unsigned long)__start_bss_decrypted;
>>   vaddr_end = (unsigned long)__end_bss_decrypted;
> 
> 
> Since this change is specific to AMD, can you replace PATTR_MEM_ENCRYPT with
> prot_guest_has(PATTR_SME) || prot_guest_has(PATTR_SEV). It is not used in
> TDX.

This is a direct replacement for now. I think the change you're requesting
should be done as part of the TDX support patches so it's clear why it is
being changed.

But, wouldn't TDX still need to do something with this shared/unencrypted
area, though? Or since it is shared, there's actually nothing you need to
do (the bss decrpyted section exists even if CONFIG_AMD_MEM_ENCRYPT is not
configured)?

Thanks,
Tom

> 



Re: [PATCH 00/11] Implement generic prot_guest_has() helper function

2021-08-09 Thread Tom Lendacky
On 8/8/21 8:41 PM, Kuppuswamy, Sathyanarayanan wrote:
> Hi Tom,
> 
> On 7/27/21 3:26 PM, Tom Lendacky wrote:
>> This patch series provides a generic helper function, prot_guest_has(),
>> to replace the sme_active(), sev_active(), sev_es_active() and
>> mem_encrypt_active() functions.
>>
>> It is expected that as new protected virtualization technologies are
>> added to the kernel, they can all be covered by a single function call
>> instead of a collection of specific function calls all called from the
>> same locations.
>>
>> The powerpc and s390 patches have been compile tested only. Can the
>> folks copied on this series verify that nothing breaks for them.
> 
> With this patch set, select ARCH_HAS_PROTECTED_GUEST and set
> CONFIG_AMD_MEM_ENCRYPT=n, creates following error.
> 
> ld: arch/x86/mm/ioremap.o: in function `early_memremap_is_setup_data':
> arch/x86/mm/ioremap.c:672: undefined reference to `early_memremap_decrypted'
> 
> It looks like early_memremap_is_setup_data() is not protected with
> appropriate config.

Ok, thanks for finding that. I'll fix that.

Thanks,
Tom

> 
> 
>>
>> Cc: Andi Kleen 
>> Cc: Andy Lutomirski 
>> Cc: Ard Biesheuvel 
>> Cc: Baoquan He 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Borislav Petkov 
>> Cc: Christian Borntraeger 
>> Cc: Daniel Vetter 
>> Cc: Dave Hansen 
>> Cc: Dave Young 
>> Cc: David Airlie 
>> Cc: Heiko Carstens 
>> Cc: Ingo Molnar 
>> Cc: Joerg Roedel 
>> Cc: Maarten Lankhorst 
>> Cc: Maxime Ripard 
>> Cc: Michael Ellerman 
>> Cc: Paul Mackerras 
>> Cc: Peter Zijlstra 
>> Cc: Thomas Gleixner 
>> Cc: Thomas Zimmermann 
>> Cc: Vasily Gorbik 
>> Cc: VMware Graphics 
>> Cc: Will Deacon 
>>
>> ---
>>
>> Patches based on:
>>   
>>   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
>>    commit 79e920060fa7 ("Merge branch 'WIP/fixes'")
>>
>> Tom Lendacky (11):
>>    mm: Introduce a function to check for virtualization protection
>>  features
>>    x86/sev: Add an x86 version of prot_guest_has()
>>    powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
>>    x86/sme: Replace occurrences of sme_active() with prot_guest_has()
>>    x86/sev: Replace occurrences of sev_active() with prot_guest_has()
>>    x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
>>    treewide: Replace the use of mem_encrypt_active() with
>>  prot_guest_has()
>>    mm: Remove the now unused mem_encrypt_active() function
>>    x86/sev: Remove the now unused mem_encrypt_active() function
>>    powerpc/pseries/svm: Remove the now unused mem_encrypt_active()
>>  function
>>    s390/mm: Remove the now unused mem_encrypt_active() function
>>
>>   arch/Kconfig   |  3 ++
>>   arch/powerpc/include/asm/mem_encrypt.h |  5 --
>>   arch/powerpc/include/asm/protected_guest.h | 30 +++
>>   arch/powerpc/platforms/pseries/Kconfig |  1 +
>>   arch/s390/include/asm/mem_encrypt.h    |  2 -
>>   arch/x86/Kconfig   |  1 +
>>   arch/x86/include/asm/kexec.h   |  2 +-
>>   arch/x86/include/asm/mem_encrypt.h | 13 +
>>   arch/x86/include/asm/protected_guest.h | 27 ++
>>   arch/x86/kernel/crash_dump_64.c    |  4 +-
>>   arch/x86/kernel/head64.c   |  4 +-
>>   arch/x86/kernel/kvm.c  |  3 +-
>>   arch/x86/kernel/kvmclock.c |  4 +-
>>   arch/x86/kernel/machine_kexec_64.c | 19 +++
>>   arch/x86/kernel/pci-swiotlb.c  |  9 ++--
>>   arch/x86/kernel/relocate_kernel_64.S   |  2 +-
>>   arch/x86/kernel/sev.c  |  6 +--
>>   arch/x86/kvm/svm/svm.c |  3 +-
>>   arch/x86/mm/ioremap.c  | 16 +++---
>>   arch/x86/mm/mem_encrypt.c  | 60 +++---
>>   arch/x86/mm/mem_encrypt_identity.c |  3 +-
>>   arch/x86/mm/pat/set_memory.c   |  3 +-
>>   arch/x86/platform/efi/efi_64.c |  9 ++--
>>   arch/x86/rea

Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-09 Thread Tom Lendacky
On 8/2/21 7:42 AM, Christophe Leroy wrote:
> 
> 
> Le 28/07/2021 à 00:26, Tom Lendacky a écrit :
>> Replace occurrences of mem_encrypt_active() with calls to prot_guest_has()
>> with the PATTR_MEM_ENCRYPT attribute.
> 
> 
> What about
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210730114231.23445-1-will@kernel.org/ ?

Ah, looks like that just went into the PPC tree and isn't part of the tip
tree. I'll have to look into how to handle that one.

Thanks,
Tom

> 
> Christophe
> 
> 
>>
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: Borislav Petkov 
>> Cc: Dave Hansen 
>> Cc: Andy Lutomirski 
>> Cc: Peter Zijlstra 
>> Cc: David Airlie 
>> Cc: Daniel Vetter 
>> Cc: Maarten Lankhorst 
>> Cc: Maxime Ripard 
>> Cc: Thomas Zimmermann 
>> Cc: VMware Graphics 
>> Cc: Joerg Roedel 
>> Cc: Will Deacon 
>> Cc: Dave Young 
>> Cc: Baoquan He 
>> Signed-off-by: Tom Lendacky 
>> ---
>>   arch/x86/kernel/head64.c    | 4 ++--
>>   arch/x86/mm/ioremap.c   | 4 ++--
>>   arch/x86/mm/mem_encrypt.c   | 5 ++---
>>   arch/x86/mm/pat/set_memory.c    | 3 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
>>   drivers/gpu/drm/drm_cache.c | 4 ++--
>>   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++--
>>   drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++---
>>   drivers/iommu/amd/iommu.c   | 3 ++-
>>   drivers/iommu/amd/iommu_v2.c    | 3 ++-
>>   drivers/iommu/iommu.c   | 3 ++-
>>   fs/proc/vmcore.c    | 6 +++---
>>   kernel/dma/swiotlb.c    | 4 ++--
>>   13 files changed, 29 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index de01903c3735..cafed6456d45 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -19,7 +19,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>   #include 
>>     #include 
>> @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long
>> physaddr,
>>    * there is no need to zero it after changing the memory encryption
>>    * attribute.
>>    */
>> -    if (mem_encrypt_active()) {
>> +    if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
>>   vaddr = (unsigned long)__start_bss_decrypted;
>>   vaddr_end = (unsigned long)__end_bss_decrypted;
>>   for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index 0f2d5ace5986..5e1c1f5cbbe8 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -693,7 +693,7 @@ static bool __init
>> early_memremap_is_setup_data(resource_size_t phys_addr,
>>   bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned
>> long size,
>>    unsigned long flags)
>>   {
>> -    if (!mem_encrypt_active())
>> +    if (!prot_guest_has(PATTR_MEM_ENCRYPT))
>>   return true;
>>     if (flags & MEMREMAP_ENC)
>> @@ -723,7 +723,7 @@ pgprot_t __init
>> early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>   {
>>   bool encrypted_prot;
>>   -    if (!mem_encrypt_active())
>> +    if (!prot_guest_has(PATTR_MEM_ENCRYPT))
>>   return prot;
>>     encrypted_prot = true;
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 451de8e84fce..0f1533dbe81c 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -364,8 +364,7 @@ int __init early_set_memory_encrypted(unsigned long
>> vaddr, unsigned long size)
>>   /*
>>    * SME and SEV are very similar but they are not the same, so there are
>>    * times that the kernel will need to distinguish between SME and SEV.
>> The
>> - * sme_active() and sev_active() functions are used for this.  When a
>> - * distinction isn't needed, the mem_encrypt_active() function can be
>> used.
>> + * sme_active() and sev_active() functions are used for this.
>>    *
>>    * The 

Re: [PATCH 06/11] x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()

2021-08-09 Thread Tom Lendacky
On 8/2/21 5:45 AM, Joerg Roedel wrote:
> On Tue, Jul 27, 2021 at 05:26:09PM -0500, Tom Lendacky wrote:
>> @@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct 
>> trampoline_header *th)
>>  if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
>>  th->flags |= TH_FLAGS_SME_ACTIVE;
>>  
>> -if (sev_es_active()) {
>> +if (prot_guest_has(PATTR_GUEST_PROT_STATE)) {
>>  /*
>>   * Skip the call to verify_cpu() in secondary_startup_64 as it
>>   * will cause #VC exceptions when the AP can't handle them yet.
> 
> Not sure how TDX will handle AP booting; are you sure it needs this
> special setup as well? Otherwise a check for SEV-ES would be better
> instead of the generic PATTR_GUEST_PROT_STATE.

Yes, I'm not sure either. I figure that change can be made, if needed, as
part of the TDX support.

Thanks,
Tom

> 
> Regards,
> 
> Joerg
> 



Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-09 Thread Tom Lendacky
On 7/30/21 5:34 PM, Sean Christopherson wrote:
> On Tue, Jul 27, 2021, Tom Lendacky wrote:
>> @@ -451,7 +450,7 @@ void __init mem_encrypt_free_decrypted_mem(void)
>>   * The unused memory range was mapped decrypted, change the encryption
>>   * attribute from decrypted to encrypted before freeing it.
>>   */
>> -if (mem_encrypt_active()) {
>> +if (sme_me_mask) {
> 
> Any reason this uses sme_me_mask?  The helper it calls, 
> __set_memory_enc_dec(),
> uses prot_guest_has(PATTR_MEM_ENCRYPT) so I assume it's available?

Probably just a slip on my part. I was debating at one point calling the
helper vs. referencing the variables/functions directly in the
mem_encrypt.c file.

Thanks,
Tom

> 
>>  r = set_memory_encrypted(vaddr, npages);
>>  if (r) {
>>  pr_warn("failed to free unused decrypted pages\n");
> 



Re: [PATCH 02/11] x86/sev: Add an x86 version of prot_guest_has()

2021-07-29 Thread Tom Lendacky
On 7/28/21 8:22 AM, Christoph Hellwig wrote:
> On Tue, Jul 27, 2021 at 05:26:05PM -0500, Tom Lendacky via iommu wrote:
>> Introduce an x86 version of the prot_guest_has() function. This will be
>> used in the more generic x86 code to replace vendor specific calls like
>> sev_active(), etc.
>>
>> While the name suggests this is intended mainly for guests, it will
>> also be used for host memory encryption checks in place of sme_active().
>>
>> The amd_prot_guest_has() function does not use EXPORT_SYMBOL_GPL for the
>> same reasons previously stated when changing sme_active(), sev_active and
> 
> None of that applies here as none of the callers get pulled into
> random macros.  The only case of that is sme_me_mask through
> sme_mask, but that's not something this series replaces as far as I can
> tell.

Ok, let me make sure of that and I'll change to EXPORT_SYMBOL_GPL if
that's the case.

Thanks,
Tom

> 



Re: [PATCH 00/11] Implement generic prot_guest_has() helper function

2021-07-27 Thread Tom Lendacky
On 7/27/21 5:26 PM, Tom Lendacky wrote:
> This patch series provides a generic helper function, prot_guest_has(),
> to replace the sme_active(), sev_active(), sev_es_active() and
> mem_encrypt_active() functions.
> 
> It is expected that as new protected virtualization technologies are
> added to the kernel, they can all be covered by a single function call
> instead of a collection of specific function calls all called from the
> same locations.
> 
> The powerpc and s390 patches have been compile tested only. Can the
> folks copied on this series verify that nothing breaks for them.

I wanted to get this out before I head out on vacation at the end of the
week. I'll only be out for a week, but I won't be able to respond to any
feedback until I get back.

I'm still not a fan of the name prot_guest_has() because it is used for
some baremetal checks, but really haven't been able to come up with
anything better. So take it with a grain of salt where the sme_active()
calls are replaced by prot_guest_has().

Also, let me know if the treewide changes in patch #7 need to be further
split out by tree.

Thanks,
Tom

> 
> Cc: Andi Kleen 
> Cc: Andy Lutomirski 
> Cc: Ard Biesheuvel 
> Cc: Baoquan He 
> Cc: Benjamin Herrenschmidt 
> Cc: Borislav Petkov 
> Cc: Christian Borntraeger 
> Cc: Daniel Vetter 
> Cc: Dave Hansen 
> Cc: Dave Young 
> Cc: David Airlie 
> Cc: Heiko Carstens 
> Cc: Ingo Molnar 
> Cc: Joerg Roedel 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Thomas Zimmermann 
> Cc: Vasily Gorbik 
> Cc: VMware Graphics 
> Cc: Will Deacon 
> 
> ---
> 
> Patches based on:
>   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
>   commit 79e920060fa7 ("Merge branch 'WIP/fixes'")
> 
> Tom Lendacky (11):
>   mm: Introduce a function to check for virtualization protection
> features
>   x86/sev: Add an x86 version of prot_guest_has()
>   powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
>   x86/sme: Replace occurrences of sme_active() with prot_guest_has()
>   x86/sev: Replace occurrences of sev_active() with prot_guest_has()
>   x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
>   treewide: Replace the use of mem_encrypt_active() with
> prot_guest_has()
>   mm: Remove the now unused mem_encrypt_active() function
>   x86/sev: Remove the now unused mem_encrypt_active() function
>   powerpc/pseries/svm: Remove the now unused mem_encrypt_active()
> function
>   s390/mm: Remove the now unused mem_encrypt_active() function
> 
>  arch/Kconfig   |  3 ++
>  arch/powerpc/include/asm/mem_encrypt.h |  5 --
>  arch/powerpc/include/asm/protected_guest.h | 30 +++
>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>  arch/s390/include/asm/mem_encrypt.h|  2 -
>  arch/x86/Kconfig   |  1 +
>  arch/x86/include/asm/kexec.h   |  2 +-
>  arch/x86/include/asm/mem_encrypt.h | 13 +
>  arch/x86/include/asm/protected_guest.h | 27 ++
>  arch/x86/kernel/crash_dump_64.c|  4 +-
>  arch/x86/kernel/head64.c   |  4 +-
>  arch/x86/kernel/kvm.c  |  3 +-
>  arch/x86/kernel/kvmclock.c |  4 +-
>  arch/x86/kernel/machine_kexec_64.c | 19 +++
>  arch/x86/kernel/pci-swiotlb.c  |  9 ++--
>  arch/x86/kernel/relocate_kernel_64.S   |  2 +-
>  arch/x86/kernel/sev.c  |  6 +--
>  arch/x86/kvm/svm/svm.c |  3 +-
>  arch/x86/mm/ioremap.c  | 16 +++---
>  arch/x86/mm/mem_encrypt.c  | 60 +++---
>  arch/x86/mm/mem_encrypt_identity.c |  3 +-
>  arch/x86/mm/pat/set_memory.c   |  3 +-
>  arch/x86/platform/efi/efi_64.c |  9 ++--
>  arch/x86/realmode/init.c   |  8 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +-
>  drivers/gpu/drm/drm_cache.c|  4 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c|  4 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_msg.c|  6 +--
>  drivers/iommu/amd/init.c   |  7 +--
>  drivers/iommu/amd/iommu.c  |  3 +-
>  drivers/iommu/amd/iommu_v2.c   |  3 +-
>  drivers/iommu/iommu.c  |  3 +-
>  fs/proc/vmcore.c   |  6 +--
>  include/linux/mem_encrypt.h|  4 --
>  include/linux/protected_guest.h| 37 +
>  kernel/dma/swiotlb.c   |  4 +-
>  36 files changed, 218 insertions(+), 104 delet

[PATCH 10/11] powerpc/pseries/svm: Remove the now unused mem_encrypt_active() function

2021-07-27 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/include/asm/mem_encrypt.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mem_encrypt.h 
b/arch/powerpc/include/asm/mem_encrypt.h
index ba9dab07c1be..2f26b8fc8d29 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
 
 #include 
 
-static inline bool mem_encrypt_active(void)
-{
-   return is_secure_guest();
-}
-
 static inline bool force_dma_unencrypted(struct device *dev)
 {
return is_secure_guest();
-- 
2.32.0




[PATCH 09/11] x86/sev: Remove the now unused mem_encrypt_active() function

2021-07-27 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 797146e0cd6b..94c089e9ea69 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -97,11 +97,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { }
 
 extern char __start_bss_decrypted[], __end_bss_decrypted[], 
__start_bss_decrypted_unused[];
 
-static inline bool mem_encrypt_active(void)
-{
-   return sme_me_mask;
-}
-
 static inline u64 sme_get_me_mask(void)
 {
return sme_me_mask;
-- 
2.32.0




[PATCH 08/11] mm: Remove the now unused mem_encrypt_active() function

2021-07-27 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation.

Signed-off-by: Tom Lendacky 
---
 include/linux/mem_encrypt.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a91f89..ae4526389261 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -16,10 +16,6 @@
 
 #include 
 
-#else  /* !CONFIG_ARCH_HAS_MEM_ENCRYPT */
-
-static inline bool mem_encrypt_active(void) { return false; }
-
 #endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
-- 
2.32.0




[PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-07-27 Thread Tom Lendacky
Replace occurrences of mem_encrypt_active() with calls to prot_guest_has()
with the PATTR_MEM_ENCRYPT attribute.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: VMware Graphics 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Dave Young 
Cc: Baoquan He 
Signed-off-by: Tom Lendacky 
---
 arch/x86/kernel/head64.c| 4 ++--
 arch/x86/mm/ioremap.c   | 4 ++--
 arch/x86/mm/mem_encrypt.c   | 5 ++---
 arch/x86/mm/pat/set_memory.c| 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 drivers/gpu/drm/drm_cache.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++---
 drivers/iommu/amd/iommu.c   | 3 ++-
 drivers/iommu/amd/iommu_v2.c| 3 ++-
 drivers/iommu/iommu.c   | 3 ++-
 fs/proc/vmcore.c| 6 +++---
 kernel/dma/swiotlb.c| 4 ++--
 13 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de01903c3735..cafed6456d45 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
@@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 * there is no need to zero it after changing the memory encryption
 * attribute.
 */
-   if (mem_encrypt_active()) {
+   if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 0f2d5ace5986..5e1c1f5cbbe8 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -693,7 +693,7 @@ static bool __init 
early_memremap_is_setup_data(resource_size_t phys_addr,
 bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size,
 unsigned long flags)
 {
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return true;
 
if (flags & MEMREMAP_ENC)
@@ -723,7 +723,7 @@ pgprot_t __init 
early_memremap_pgprot_adjust(resource_size_t phys_addr,
 {
bool encrypted_prot;
 
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return prot;
 
encrypted_prot = true;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 451de8e84fce..0f1533dbe81c 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -364,8 +364,7 @@ int __init early_set_memory_encrypted(unsigned long vaddr, 
unsigned long size)
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this.  When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * sme_active() and sev_active() functions are used for this.
  *
  * The trampoline code is a good example for this requirement.  Before
  * paging is activated, SME will access all memory as decrypted, but SEV
@@ -451,7 +450,7 @@ void __init mem_encrypt_free_decrypted_mem(void)
 * The unused memory range was mapped decrypted, change the encryption
 * attribute from decrypted to encrypted before freeing it.
 */
-   if (mem_encrypt_active()) {
+   if (sme_me_mask) {
r = set_memory_encrypted(vaddr, npages);
if (r) {
pr_warn("failed to free unused decrypted pages\n");
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index ad8a5c586a35..6925f2bb4be1 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1986,7 +1987,7 @@ static int __set_memory_enc_dec(unsigned long addr, int 
numpages, bool enc)
int ret;
 
/* Nothing to do if memory encryption is not active */
-   if (!mem_encrypt_active())
+   if (!prot_guest_has(PATTR_MEM_ENCRYPT))
return 0;
 
/* Should not be working on unaligned addresses */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index abb928894eac..8407224717df 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "amdgpu.h"
 #include "amdgpu_irq.h"
@@ -1239,7 +1240,8 @@ static int amdgpu_pci_probe(struct pci_dev 

[PATCH 06/11] x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()

2021-07-27 Thread Tom Lendacky
Replace occurrences of sev_es_active() with the more generic
prot_guest_has() using PATTR_GUEST_PROT_STATE, except for in
arch/x86/kernel/sev*.c and arch/x86/mm/mem_encrypt*.c where PATTR_SEV_ES
will be used. If future support is added for other memory encryption
technologies, the use of PATTR_GUEST_PROT_STATE can be updated, as
required, to specifically use PATTR_SEV_ES.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h | 2 --
 arch/x86/kernel/sev.c  | 6 +++---
 arch/x86/mm/mem_encrypt.c  | 7 +++
 arch/x86/realmode/init.c   | 3 +--
 4 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 7e25de37c148..797146e0cd6b 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
@@ -74,7 +73,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
 static inline int __init
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a6895e440bc3..66a4ab9d95d7 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -11,7 +11,7 @@
 
 #include  /* For show_regs() */
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd)
int cpu;
u64 pfn;
 
-   if (!sev_es_active())
+   if (!prot_guest_has(PATTR_SEV_ES))
return 0;
 
pflags = _PAGE_NX | _PAGE_RW;
@@ -774,7 +774,7 @@ void __init sev_es_init_vc_handling(void)
 
BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % 
PAGE_SIZE);
 
-   if (!sev_es_active())
+   if (!prot_guest_has(PATTR_SEV_ES))
return;
 
if (!sev_es_check_cpu_features())
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index eb5cae93b238..451de8e84fce 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -383,8 +383,7 @@ static bool sme_active(void)
return sme_me_mask && !sev_active();
 }
 
-/* Needs to be called from non-instrumentable code */
-bool noinstr sev_es_active(void)
+static bool sev_es_active(void)
 {
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 }
@@ -482,7 +481,7 @@ static void print_mem_encrypt_feature_info(void)
pr_cont(" SEV");
 
/* Encrypted Register State */
-   if (sev_es_active())
+   if (amd_prot_guest_has(PATTR_SEV_ES))
pr_cont(" SEV-ES");
 
pr_cont("\n");
@@ -501,7 +500,7 @@ void __init mem_encrypt_init(void)
 * With SEV, we need to unroll the rep string I/O instructions,
 * but SEV-ES supports them through the #VC handler.
 */
-   if (amd_prot_guest_has(PATTR_SEV) && !sev_es_active())
+   if (amd_prot_guest_has(PATTR_SEV) && !amd_prot_guest_has(PATTR_SEV_ES))
static_branch_enable(&sev_enable_key);
 
print_mem_encrypt_feature_info();
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 2109ae569c67..7711d0071f41 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -2,7 +2,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header 
*th)
if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
th->flags |= TH_FLAGS_SME_ACTIVE;
 
-   if (sev_es_active()) {
+   if (prot_guest_has(PATTR_GUEST_PROT_STATE)) {
/*
 * Skip the call to verify_cpu() in secondary_startup_64 as it
 * will cause #VC exceptions when the AP can't handle them yet.
-- 
2.32.0




[PATCH 05/11] x86/sev: Replace occurrences of sev_active() with prot_guest_has()

2021-07-27 Thread Tom Lendacky
Replace occurrences of sev_active() with the more generic prot_guest_has()
using PATTR_GUEST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c
where PATTR_SEV will be used. If future support is added for other memory
encryption technologies, the use of PATTR_GUEST_MEM_ENCRYPT can be
updated, as required, to use PATTR_SEV.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Ard Biesheuvel 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h |  2 --
 arch/x86/kernel/crash_dump_64.c|  4 +++-
 arch/x86/kernel/kvm.c  |  3 ++-
 arch/x86/kernel/kvmclock.c |  4 ++--
 arch/x86/kernel/machine_kexec_64.c | 16 
 arch/x86/kvm/svm/svm.c |  3 ++-
 arch/x86/mm/ioremap.c  |  6 +++---
 arch/x86/mm/mem_encrypt.c  | 15 +++
 arch/x86/platform/efi/efi_64.c |  9 +
 9 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 956338406cec..7e25de37c148 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sev_active(void);
 bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
 
@@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 045e82e8945b..0cfe35f03e67 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
  unsigned long offset, int userbuf,
@@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char 
*buf, size_t csize,
 
 ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-   return read_from_oldmem(buf, count, ppos, 0, sev_active());
+   return read_from_oldmem(buf, count, ppos, 0,
+   prot_guest_has(PATTR_GUEST_MEM_ENCRYPT));
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a26643dc6bd6..9d08ad2f3faa 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void)
 {
int cpu;
 
-   if (!sev_active())
+   if (!prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
return;
 
for_each_possible_cpu(cpu) {
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ad273e5861c1..f7ba78a23dcd 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -16,9 +16,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 #include 
 #include 
 
@@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void)
 * hvclock is shared between the guest and the hypervisor, must
 * be mapped decrypted.
 */
-   if (sev_active()) {
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
r = set_memory_decrypted((unsigned long) hvclock_mem,
 1UL << order);
if (r) {
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 8e7b517ad738..66ff788b79c9 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, 
pgd_t *pgd)
}
pte = pte_offset_kernel(pmd, vaddr);
 
-   if (sev_active())
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT))
prot = PAGE_KERNEL_EXEC;
 
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
@@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long 
start_pgtable)
level4p = (pgd_t *)__va(start_pgtable);
clear_page(level4p);
 
-   if (sev_active()) {
+   if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) {
info.page_flag   |= _PAGE_ENC;
info.kernpg_flag |= _PAGE_ENC;
}
@@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void)
  */
 int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
 {
-   if (sev_active())
+   if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
return 0;
 
/*
-* If SME 

[PATCH 04/11] x86/sme: Replace occurrences of sme_active() with prot_guest_has()

2021-07-27 Thread Tom Lendacky
Replace occurrences of sme_active() with the more generic prot_guest_has()
using PATTR_HOST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c
where PATTR_SME will be used. If future support is added for other memory
encryption technologies, the use of PATTR_HOST_MEM_ENCRYPT can be
updated, as required, to use PATTR_SME.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Joerg Roedel 
Cc: Will Deacon 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/include/asm/mem_encrypt.h   |  2 --
 arch/x86/kernel/machine_kexec_64.c   |  3 ++-
 arch/x86/kernel/pci-swiotlb.c|  9 -
 arch/x86/kernel/relocate_kernel_64.S |  2 +-
 arch/x86/mm/ioremap.c|  6 +++---
 arch/x86/mm/mem_encrypt.c| 10 +-
 arch/x86/mm/mem_encrypt_identity.c   |  3 ++-
 arch/x86/realmode/init.c |  5 +++--
 drivers/iommu/amd/init.c |  7 ---
 10 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 0a6e34b07017..11b7c06e2828 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page,
unsigned long page_list,
unsigned long start_address,
unsigned int preserve_context,
-   unsigned int sme_active);
+   unsigned int host_mem_enc_active);
 #endif
 
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index a46d47662772..956338406cec 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void);
 void __init mem_encrypt_init(void);
 
 void __init sev_es_init_vc_handling(void);
-bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
 bool amd_prot_guest_has(unsigned int attr);
@@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct 
boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
-static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
 static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 131f30fdcfbd..8e7b517ad738 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image)
   (unsigned long)page_list,
   image->start,
   image->preserve_context,
-  sme_active());
+  prot_guest_has(PATTR_HOST_MEM_ENCRYPT));
 
 #ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..bd9a9cfbc9a2 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -6,7 +6,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include 
 #include 
@@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void)
swiotlb = 1;
 
/*
-* If SME is active then swiotlb will be set to 1 so that bounce
-* buffers are allocated and used for devices that do not support
-* the addressing range required for the encryption mask.
+* Set swiotlb to 1 so that bounce buffers are allocated and used for
+* devices that can't support DMA to encrypted memory.
 */
-   if (sme_active())
+   if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT))
swiotlb = 1;
 
return swiotlb;
diff --git a/arch/x86/kernel/relocate_kernel_64.S 
b/arch/x86/kernel/relocate_kernel_64.S
index c53271aebb64..c8fe74a28143 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -47,7 +47,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
 * %rsi page_list
 * %rdx start address
 * %rcx preserve_context
-* %r8  sme_active
+* %r8  host_mem_enc_active
 */
 
/* Save the CPU context, used for jumping back */
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 60ade7dd71bd..f899f02c0241 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -702,7 +702,7 @@ bool arch_memremap_can_ram_remap(resource_size_t phys_addr, 
unsigned long size,
   

[PATCH 11/11] s390/mm: Remove the now unused mem_encrypt_active() function

2021-07-27 Thread Tom Lendacky
The mem_encrypt_active() function has been replaced by prot_guest_has(),
so remove the implementation. Since the default implementation of
prot_guest_has() matches the s390 implementation of mem_encrypt_active(),
prot_guest_has() does not need to be implemented in s390 (the config
option ARCH_HAS_PROTECTED_GUEST is not set).

Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Signed-off-by: Tom Lendacky 
---
 arch/s390/include/asm/mem_encrypt.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/s390/include/asm/mem_encrypt.h 
b/arch/s390/include/asm/mem_encrypt.h
index 2542cbf7e2d1..08a8b96606d7 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -4,8 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-static inline bool mem_encrypt_active(void) { return false; }
-
 int set_memory_encrypted(unsigned long addr, int numpages);
 int set_memory_decrypted(unsigned long addr, int numpages);
 
-- 
2.32.0




[PATCH 03/11] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-07-27 Thread Tom Lendacky
Introduce a powerpc version of the prot_guest_has() function. This will
be used to replace the powerpc mem_encrypt_active() implementation, so
the implementation will initially only support the PATTR_MEM_ENCRYPT
attribute.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
 arch/powerpc/include/asm/protected_guest.h | 30 ++
 arch/powerpc/platforms/pseries/Kconfig |  1 +
 2 files changed, 31 insertions(+)
 create mode 100644 arch/powerpc/include/asm/protected_guest.h

diff --git a/arch/powerpc/include/asm/protected_guest.h 
b/arch/powerpc/include/asm/protected_guest.h
new file mode 100644
index ..ce55c2c7e534
--- /dev/null
+++ b/arch/powerpc/include/asm/protected_guest.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _POWERPC_PROTECTED_GUEST_H
+#define _POWERPC_PROTECTED_GUEST_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+static inline bool prot_guest_has(unsigned int attr)
+{
+   switch (attr) {
+   case PATTR_MEM_ENCRYPT:
+   return is_secure_guest();
+
+   default:
+   return false;
+   }
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _POWERPC_PROTECTED_GUEST_H */
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..8ce5417d6feb 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -159,6 +159,7 @@ config PPC_SVM
select SWIOTLB
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select ARCH_HAS_PROTECTED_GUEST
help
 There are certain POWER platforms which support secure guests using
 the Protected Execution Facility, with the help of an Ultravisor
-- 
2.32.0




[PATCH 02/11] x86/sev: Add an x86 version of prot_guest_has()

2021-07-27 Thread Tom Lendacky
Introduce an x86 version of the prot_guest_has() function. This will be
used in the more generic x86 code to replace vendor specific calls like
sev_active(), etc.

While the name suggests this is intended mainly for guests, it will
also be used for host memory encryption checks in place of sme_active().

The amd_prot_guest_has() function does not use EXPORT_SYMBOL_GPL for the
same reasons previously stated when changing sme_active(), sev_active and
sme_me_mask to EXPORT_SYMBOL:
  commit 87df26175e67 ("x86/mm: Unbreak modules that rely on external 
PAGE_KERNEL availability")
  commit 9d5f38ba6c82 ("x86/mm: Unbreak modules that use the DMA API")

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/mem_encrypt.h |  2 ++
 arch/x86/include/asm/protected_guest.h | 27 ++
 arch/x86/mm/mem_encrypt.c  | 25 
 include/linux/protected_guest.h|  5 +
 5 files changed, 60 insertions(+)
 create mode 100644 arch/x86/include/asm/protected_guest.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 49270655e827..e47213cbfc55 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1514,6 +1514,7 @@ config AMD_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select INSTRUCTION_DECODER
select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
+   select ARCH_HAS_PROTECTED_GUEST
help
  Say yes to enable support for the encryption of system memory.
  This requires an AMD processor that supports Secure Memory
diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index 9c80c68d75b5..a46d47662772 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -53,6 +53,7 @@ void __init sev_es_init_vc_handling(void);
 bool sme_active(void);
 bool sev_active(void);
 bool sev_es_active(void);
+bool amd_prot_guest_has(unsigned int attr);
 
 #define __bss_decrypted __section(".bss..decrypted")
 
@@ -78,6 +79,7 @@ static inline void sev_es_init_vc_handling(void) { }
 static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 static inline bool sev_es_active(void) { return false; }
+static inline bool amd_prot_guest_has(unsigned int attr) { return false; }
 
 static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 
0; }
diff --git a/arch/x86/include/asm/protected_guest.h 
b/arch/x86/include/asm/protected_guest.h
new file mode 100644
index ..b4a267dddf93
--- /dev/null
+++ b/arch/x86/include/asm/protected_guest.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _X86_PROTECTED_GUEST_H
+#define _X86_PROTECTED_GUEST_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+static inline bool prot_guest_has(unsigned int attr)
+{
+   if (sme_me_mask)
+   return amd_prot_guest_has(attr);
+
+   return false;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _X86_PROTECTED_GUEST_H */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..7d3b2c6f5f88 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -389,6 +390,30 @@ bool noinstr sev_es_active(void)
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 }
 
+bool amd_prot_guest_has(unsigned int attr)
+{
+   switch (attr) {
+   case PATTR_MEM_ENCRYPT:
+   return sme_me_mask != 0;
+
+   case PATTR_SME:
+   case PATTR_HOST_MEM_ENCRYPT:
+   return sme_active();
+
+   case PATTR_SEV:
+   case PATTR_GUEST_MEM_ENCRYPT:
+   return sev_active();
+
+   case PATTR_SEV_ES:
+   case PATTR_GUEST_PROT_STATE:
+   return sev_es_active();
+
+   default:
+   return false;
+   }
+}
+EXPORT_SYMBOL(amd_prot_guest_has);
+
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {
diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h
index f8ed7b72967b..7a7120abbb62 100644
--- a/include/linux/protected_guest.h
+++ b/include/linux/protected_guest.h
@@ -17,6 +17,11 @@
 #define PATTR_GUEST_MEM_ENCRYPT2   /* Guest encrypted 
memory */
 #define PATTR_GUEST_PROT_STATE 3   /* Guest encrypted state */
 
+/* 0x800 - 0x8ff reserved for AMD */
+#define PATTR_SME

[PATCH 01/11] mm: Introduce a function to check for virtualization protection features

2021-07-27 Thread Tom Lendacky
In prep for other protected virtualization technologies, introduce a
generic helper function, prot_guest_has(), that can be used to check
for specific protection attributes, like memory encryption. This is
intended to eliminate having to add multiple technology-specific checks
to the code (e.g. if (sev_active() || tdx_active())).

Co-developed-by: Andi Kleen 
Signed-off-by: Andi Kleen 
Co-developed-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Kuppuswamy Sathyanarayanan 

Signed-off-by: Tom Lendacky 
---
 arch/Kconfig|  3 +++
 include/linux/protected_guest.h | 32 
 2 files changed, 35 insertions(+)
 create mode 100644 include/linux/protected_guest.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 129df498a8e1..a47cf283f2ff 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1231,6 +1231,9 @@ config RELR
 config ARCH_HAS_MEM_ENCRYPT
bool
 
+config ARCH_HAS_PROTECTED_GUEST
+   bool
+
 config HAVE_SPARSE_SYSCALL_NR
bool
help
diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h
new file mode 100644
index ..f8ed7b72967b
--- /dev/null
+++ b/include/linux/protected_guest.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Protected Guest (and Host) Capability checks
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#ifndef _PROTECTED_GUEST_H
+#define _PROTECTED_GUEST_H
+
+#ifndef __ASSEMBLY__
+
+#define PATTR_MEM_ENCRYPT  0   /* Encrypted memory */
+#define PATTR_HOST_MEM_ENCRYPT 1   /* Host encrypted memory */
+#define PATTR_GUEST_MEM_ENCRYPT2   /* Guest encrypted 
memory */
+#define PATTR_GUEST_PROT_STATE 3   /* Guest encrypted state */
+
+#ifdef CONFIG_ARCH_HAS_PROTECTED_GUEST
+
+#include 
+
+#else  /* !CONFIG_ARCH_HAS_PROTECTED_GUEST */
+
+static inline bool prot_guest_has(unsigned int attr) { return false; }
+
+#endif /* CONFIG_ARCH_HAS_PROTECTED_GUEST */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _PROTECTED_GUEST_H */
-- 
2.32.0




[PATCH 00/11] Implement generic prot_guest_has() helper function

2021-07-27 Thread Tom Lendacky
This patch series provides a generic helper function, prot_guest_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new protected virtualization technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.

Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 

---

Patches based on:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
  commit 79e920060fa7 ("Merge branch 'WIP/fixes'")

Tom Lendacky (11):
  mm: Introduce a function to check for virtualization protection
features
  x86/sev: Add an x86 version of prot_guest_has()
  powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
  x86/sme: Replace occurrences of sme_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
  treewide: Replace the use of mem_encrypt_active() with
prot_guest_has()
  mm: Remove the now unused mem_encrypt_active() function
  x86/sev: Remove the now unused mem_encrypt_active() function
  powerpc/pseries/svm: Remove the now unused mem_encrypt_active()
function
  s390/mm: Remove the now unused mem_encrypt_active() function

 arch/Kconfig   |  3 ++
 arch/powerpc/include/asm/mem_encrypt.h |  5 --
 arch/powerpc/include/asm/protected_guest.h | 30 +++
 arch/powerpc/platforms/pseries/Kconfig |  1 +
 arch/s390/include/asm/mem_encrypt.h|  2 -
 arch/x86/Kconfig   |  1 +
 arch/x86/include/asm/kexec.h   |  2 +-
 arch/x86/include/asm/mem_encrypt.h | 13 +
 arch/x86/include/asm/protected_guest.h | 27 ++
 arch/x86/kernel/crash_dump_64.c|  4 +-
 arch/x86/kernel/head64.c   |  4 +-
 arch/x86/kernel/kvm.c  |  3 +-
 arch/x86/kernel/kvmclock.c |  4 +-
 arch/x86/kernel/machine_kexec_64.c | 19 +++
 arch/x86/kernel/pci-swiotlb.c  |  9 ++--
 arch/x86/kernel/relocate_kernel_64.S   |  2 +-
 arch/x86/kernel/sev.c  |  6 +--
 arch/x86/kvm/svm/svm.c |  3 +-
 arch/x86/mm/ioremap.c  | 16 +++---
 arch/x86/mm/mem_encrypt.c  | 60 +++---
 arch/x86/mm/mem_encrypt_identity.c |  3 +-
 arch/x86/mm/pat/set_memory.c   |  3 +-
 arch/x86/platform/efi/efi_64.c |  9 ++--
 arch/x86/realmode/init.c   |  8 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +-
 drivers/gpu/drm/drm_cache.c|  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c|  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c|  6 +--
 drivers/iommu/amd/init.c   |  7 +--
 drivers/iommu/amd/iommu.c  |  3 +-
 drivers/iommu/amd/iommu_v2.c   |  3 +-
 drivers/iommu/iommu.c  |  3 +-
 fs/proc/vmcore.c   |  6 +--
 include/linux/mem_encrypt.h|  4 --
 include/linux/protected_guest.h| 37 +
 kernel/dma/swiotlb.c   |  4 +-
 36 files changed, 218 insertions(+), 104 deletions(-)
 create mode 100644 arch/powerpc/include/asm/protected_guest.h
 create mode 100644 arch/x86/include/asm/protected_guest.h
 create mode 100644 include/linux/protected_guest.h

-- 
2.32.0




Re: [PATCH 04/12] x86/sev: Do not hardcode GHCB protocol version

2021-07-21 Thread Tom Lendacky
On 7/21/21 9:20 AM, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Introduce the sev_get_ghcb_proto_ver() which will return the negotiated
> GHCB protocol version and use it to set the version field in the GHCB.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  arch/x86/boot/compressed/sev.c | 5 +
>  arch/x86/kernel/sev-shared.c   | 5 -
>  arch/x86/kernel/sev.c  | 5 +
>  3 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 1a2e49730f8b..101e08c67296 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -119,6 +119,11 @@ static enum es_result vc_read_mem(struct es_em_ctxt 
> *ctxt,
>  /* Include code for early handlers */
>  #include "../../kernel/sev-shared.c"
>  
> +static u64 sev_get_ghcb_proto_ver(void)
> +{
> + return GHCB_PROTOCOL_MAX;
> +}
> +
>  static bool early_setup_sev_es(void)
>  {
>   if (!sev_es_negotiate_protocol(NULL))
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 73eeb5897d16..36eaac2773ed 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -28,6 +28,9 @@ struct sev_ghcb_protocol_info {
>   unsigned int vm_proto;
>  };
>  
> +/* Returns the negotiated GHCB Protocol version */
> +static u64 sev_get_ghcb_proto_ver(void);
> +
>  static bool __init sev_es_check_cpu_features(void)
>  {
>   if (!has_cpuflag(X86_FEATURE_RDRAND)) {
> @@ -122,7 +125,7 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb 
> *ghcb,
>   enum es_result ret;
>  
>   /* Fill in protocol and format specifiers */
> - ghcb->protocol_version = GHCB_PROTOCOL_MAX;
> + ghcb->protocol_version = sev_get_ghcb_proto_ver();

So this probably needs better clarification in the spec, but the GHCB
version field is for the GHCB structure layout. So if you don't plan to
use the XSS field that was added for version 2 of the layout, then you
should report the GHCB structure version as 1.
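
For illustration, a simplified, self-contained model of that distinction
(the constants and helper below are made up for the example, they are not
taken from the GHCB spec):

#include <stdio.h>

/* Model of the point above: the value written into the GHCB reflects the
 * *structure layout* in use, not the negotiated protocol version. */
enum { GHCB_LAYOUT_V1 = 1, GHCB_LAYOUT_V2 = 2 };

static unsigned int ghcb_layout_version(int uses_xss_field)
{
	/* Only claim layout v2 if a v2-only field (XSS) is actually used. */
	return uses_xss_field ? GHCB_LAYOUT_V2 : GHCB_LAYOUT_V1;
}

int main(void)
{
	printf("layout version without XSS: %u\n", ghcb_layout_version(0));
	printf("layout version with XSS:    %u\n", ghcb_layout_version(1));
	return 0;
}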

Thanks,
Tom

>   ghcb->ghcb_usage   = GHCB_DEFAULT_USAGE;
>  
>   ghcb_set_sw_exit_code(ghcb, exit_code);
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 8084bfd7cce1..5d3422e8b25e 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -498,6 +498,11 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb 
> *ghcb, struct es_em_ctxt
>  /* Negotiated GHCB protocol version */
>  static struct sev_ghcb_protocol_info ghcb_protocol_info __ro_after_init;
>  
> +static u64 sev_get_ghcb_proto_ver(void)
> +{
> + return ghcb_protocol_info.vm_proto;
> +}
> +
>  static noinstr void __sev_put_ghcb(struct ghcb_state *state)
>  {
>   struct sev_es_runtime_data *data;
> 



Re: [PATCH 3/4 V3] Remap the device table of IOMMU in encrypted manner for kdump

2018-06-21 Thread Tom Lendacky
On 6/21/2018 3:39 AM, Baoquan He wrote:
> On 06/21/18 at 01:42pm, lijiang wrote:
>> On 06/21/18 at 00:42, Tom Lendacky wrote:
>>> On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
>>>> In kdump mode, the kernel copies the IOMMU device table from the old
>>>> device table, which is encrypted when SME is enabled in the first
>>>> kernel. So we must remap it as encrypted memory so that it is
>>>> automatically decrypted when we read it.
>>>>
>>>> Signed-off-by: Lianbo Jiang 
>>>> ---
>>>> Some changes:
>>>> 1. add some comments
>>>> 2. clean compile warning.
>>>>
>>>>  drivers/iommu/amd_iommu_init.c | 15 ++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/amd_iommu_init.c 
>>>> b/drivers/iommu/amd_iommu_init.c
>>>> index 904c575..a20af4c 100644
>>>> --- a/drivers/iommu/amd_iommu_init.c
>>>> +++ b/drivers/iommu/amd_iommu_init.c
>>>> @@ -889,11 +889,24 @@ static bool copy_device_table(void)
>>>>}
>>>>  
>>>>old_devtb_phys = entry & PAGE_MASK;
>>>> +
>>>> +  /*
>>>> +   *  When sme enable in the first kernel, old_devtb_phys includes the
>>>> +   *  memory encryption mask(sme_me_mask), we must remove the memory
>>>> +   *  encryption mask to obtain the true physical address in kdump mode.
>>>> +   */
>>>> +  if (mem_encrypt_active() && is_kdump_kernel())
>>>> +  old_devtb_phys = __sme_clr(old_devtb_phys);
>>>> +
>>>
>>> You can probably just use "if (is_kdump_kernel())" here, since memory
>>> encryption is either on in both the first and second kernel or off in
>>> both the first and second kernel.  At which point __sme_clr() will do
>>> the proper thing.
>>>
>>> Actually, this needs to be done no matter what.  When doing either the
>>> ioremap_encrypted() or the memremap(), the physical address should not
>>> include the encryption bit/mask.
>>>
>>> Thanks,
>>> Tom
>>>
>> Thanks for your comments. If we don't remove the memory encryption mask, it
>> will return false because the 'old_devtb_phys >= 0x100000000ULL' check may
>> become true.
> 
> Lianbo, you may not have gotten what Tom suggested. Tom means that no matter
> what, encrypted or not in the 1st kernel, we need to get the pure physical
> address, and using the code below is always right for both cases.
> 
>   if (is_kdump_kernel())
>   old_devtb_phys = __sme_clr(old_devtb_phys);
> 
> And this is simpler. You can even add a one-line code comment saying
> something like "Physical address w/o encryption mask is needed here."

Even simpler, there's no need to even check for is_kdump_kernel().  The
__sme_clr() should always be done if the physical address is going to be
used for some form of io or memory remapping.

So you could just change the existing:

old_devtb_phys = entry & PAGE_MASK;

to:

old_devtb_phys = __sme_clr(entry) & PAGE_MASK;

Thanks,
Tom

>>
>> Lianbo
>>>>   if (old_devtb_phys >= 0x100000000ULL) {
>>>>pr_err("The address of old device table is above 4G, not 
>>>> trustworthy!\n");
>>>>return false;
>>>>}
>>>> -  old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>>>> +  old_devtb = (mem_encrypt_active() && is_kdump_kernel())
>>>> +  ? (__force void *)ioremap_encrypted(old_devtb_phys,
>>>> +  dev_table_size)
>>>> +  : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>>>> +
>>>>if (!old_devtb)
>>>>return false;
>>>>  
>>>>
>>



Re: [PATCH 3/4 V3] Remap the device table of IOMMU in encrypted manner for kdump

2018-06-20 Thread Tom Lendacky
On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
> In kdump mode, the kernel copies the IOMMU device table from the old
> device table, which is encrypted when SME is enabled in the first
> kernel. So we must remap it in an encrypted manner so that it is
> automatically decrypted when read.
> 
> Signed-off-by: Lianbo Jiang 
> ---
> Some changes:
> 1. add some comments
> 2. clean compile warning.
> 
>  drivers/iommu/amd_iommu_init.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 904c575..a20af4c 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -889,11 +889,24 @@ static bool copy_device_table(void)
>   }
>  
>   old_devtb_phys = entry & PAGE_MASK;
> +
> + /*
> +  * When SME is enabled in the first kernel, old_devtb_phys includes the
> +  * memory encryption mask (sme_me_mask); we must remove the mask to
> +  * obtain the true physical address in kdump mode.
> +  */
> + if (mem_encrypt_active() && is_kdump_kernel())
> + old_devtb_phys = __sme_clr(old_devtb_phys);
> +

You can probably just use "if (is_kdump_kernel())" here, since memory
encryption is either on in both the first and second kernel or off in
both the first and second kernel.  At which point __sme_clr() will do
the proper thing.

Actually, this needs to be done no matter what.  When doing either the
ioremap_encrypted() or the memremap(), the physical address should not
include the encryption bit/mask.
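
As a sketch, applying that to copy_device_table() would look roughly like
this (illustrative only, context trimmed):

        /*
         * Always strip the SME encryption mask so the remap helpers get a
         * pure physical address; __sme_clr() is a no-op when SME is off
         * because sme_me_mask is zero in that case.
         */
        old_devtb_phys = __sme_clr(entry) & PAGE_MASK;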

Thanks,
Tom

>   if (old_devtb_phys >= 0x100000000ULL) {
>   pr_err("The address of old device table is above 4G, not 
> trustworthy!\n");
>   return false;
>   }
> - old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
> + old_devtb = (mem_encrypt_active() && is_kdump_kernel())
> + ? (__force void *)ioremap_encrypted(old_devtb_phys,
> + dev_table_size)
> + : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
> +
>   if (!old_devtb)
>   return false;
>  
> 



Re: [PATCH 1/4 V3] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-06-20 Thread Tom Lendacky
On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second
> kernel by calling ioremap_encrypted().
> 
> Signed-off-by: Lianbo Jiang 
> ---
> Some changes:
> 1. remove the sme_active() check in __ioremap_caller().
> 2. put some logic into the early_memremap_pgprot_adjust() for
> early memremap.
> 
>  arch/x86/include/asm/io.h |  3 +++
>  arch/x86/mm/ioremap.c | 28 
>  2 files changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index f6e5b93..989d60b 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t 
> offset, unsigned long size);
>  #define ioremap_cache ioremap_cache
>  extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long 
> size, unsigned long prot_val);
>  #define ioremap_prot ioremap_prot
> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
> + unsigned long size);
> +#define ioremap_encrypted ioremap_encrypted
>  
>  /**
>   * ioremap -   map bus memory into CPU space
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545..e365fc4 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "physaddr.h"
>  
> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
> unsigned long size,
>   * caller shouldn't need to know that small detail.
>   */
>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> - unsigned long size, enum page_cache_mode pcm, void *caller)
> + unsigned long size, enum page_cache_mode pcm,
> + void *caller, bool encrypted)
>  {
>   unsigned long offset, vaddr;
>   resource_size_t last_addr;
> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t 
> phys_addr,
>* resulting mapping.
>*/
>   prot = PAGE_KERNEL_IO;
> - if (sev_active() && mem_flags.desc_other)
> + if ((sev_active() && mem_flags.desc_other) || encrypted)
>   prot = pgprot_encrypted(prot);
>  
>   switch (pcm) {
> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_nocache);
>  
> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL_GPL(ioremap_uc);
>  
> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>  void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wc);
>  
> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>  void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wt);
>  
> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long 
> size)
> +{
> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> + __builtin_return_address(0), true);
> +}
> +EXPORT_SYMBOL(ioremap_encrypted);
> +
>  void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_cache);
>  
> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, 
> unsigned long size,
>  {
>   return __ioremap_caller(phys_addr, size,
>   pgprot2cachemode(__pgprot(prot_val)),
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_prot);
>  
> @@ -688,6 +697,9 @@ pgprot_t __init 
> early_memremap_pgprot_adjust(resource_size_t phys_addr,
>   if (encrypted_prot && memremap_shoul

Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

2018-05-21 Thread Tom Lendacky
On 5/20/2018 10:45 PM, lijiang wrote:
> On 2018-05-17 at 21:45, lijiang wrote:
>> On 2018-05-15 at 21:31, Tom Lendacky wrote:
>>> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>>>> It is convenient to remap the old memory encrypted to the second kernel by
>>>> calling ioremap_encrypted().
>>>>
>>>> When SME is enabled on an AMD server, we also need to support kdump.
>>>> Because the memory is encrypted in the first kernel, we remap the old
>>>> memory as encrypted in the second (crash) kernel, and SME must also be
>>>> enabled in the second kernel, otherwise the old encrypted memory cannot
>>>> be decrypted. Simply changing the value of the C-bit on a page will not
>>>> automatically encrypt its existing contents, and any data in the page
>>>> prior to the C-bit modification will become unintelligible. A page of
>>>> memory that is marked encrypted will be automatically decrypted when read
>>>> from DRAM and will be automatically encrypted when written to DRAM.
>>>>
>>>> For kdump, it is necessary to distinguish whether the memory is
>>>> encrypted. Furthermore, we should also know which parts of the memory
>>>> are encrypted or decrypted. We will remap the memory appropriately for
>>>> the specific situation in order to tell the CPU how to handle the
>>>> data (encrypted or decrypted). For example, when SME is enabled and the
>>>> old memory is encrypted, we remap the old memory in an encrypted way,
>>>> which automatically decrypts the old encrypted memory when we read the
>>>> data through the remapped address.
>>>>
>>>>  ------------------------------------------------------
>>>> | first-kernel         | second-kernel | kdump support |
>>>> |     (mem_encrypt=on|off)             |   (yes|no)    |
>>>> |----------------------+---------------+---------------|
>>>> | on                   | on            | yes           |
>>>> | off                  | off           | yes           |
>>>> | on                   | off           | no            |
>>>> | off                  | on            | no            |
>>>> |______________________|_______________|_______________|
>>>>
>>>> Test tools:
>>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>>> Author: Lianbo Jiang 
>>>> Date:   Mon May 14 17:02:40 2018 +0800
>>>> Note: This patch can only dump vmcore in the case of SME enabled.
>>>>
>>>> crash-7.2.1: https://github.com/crash-utility/crash.git
>>>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>>>> Author: Dave Anderson 
>>>> Date:   Fri May 11 15:54:32 2018 -0400
>>>>
>>>> Test environment:
>>>> HP ProLiant DL385Gen10 AMD EPYC 7251
>>>> 8-Core Processor
>>>> 32768 MB memory
>>>> 600 GB disk space
>>>>
>>>> Linux 4.17-rc4:
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>> commit 75bc37fefc44 ("Linux 4.17-rc4")
>>>> Author: Linus Torvalds 
>>>> Date:   Sun May 6 16:57:38 2018 -1000
>>>>
>>>> Reference:
>>>> AMD64 Architecture Programmer's Manual
>>>> https://support.amd.com/TechDocs/24593.pdf
>>>>
>>>
>>> Have you also tested this with SEV?  It would be nice if the kdump
>>> changes you make work with both SME and SEV.
>>>
>> Thank you, Tom.
>> This is a great question. We originally planned to implement SEV in
>> subsequent patches, and we are also working on SEV at present.
>> Furthermore, we have another known issue: the system can't jump into the
>> second kernel when SME is enabled and kaslr is disabled in kdump mode.
>> It seems to be a complex problem, maybe related to kaslr and SME;
>> currently I'm not sure of the root cause, but we also plan to fix it.
>> Can you give me any advice about this issue?
>>
> Based on upstream 4.17-rc5, we have recently found a new issue: the system
> can't boot via kexec when SME is enabled and kaslr is disabled. Have you
> ever run into this issue? The two problems have similar reproduction
> scenarios.
> 
> CC Tom & Baoquan

I haven't encountered this issue.  Can you send the kernel config that you
are using?  And your kernel com

Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> When SME is enabled on an AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we remap the old memory as
> encrypted in the second (crash) kernel, and SME must also be enabled in the
> second kernel, otherwise the old encrypted memory cannot be decrypted.
> Simply changing the value of the C-bit on a page will not automatically
> encrypt its existing contents, and any data in the page prior to the C-bit
> modification will become unintelligible. A page of memory that is marked
> encrypted will be automatically decrypted when read from DRAM and will be
> automatically encrypted when written to DRAM.
> 
> For kdump, it is necessary to distinguish whether the memory is encrypted.
> Furthermore, we should also know which parts of the memory are encrypted or
> decrypted. We will remap the memory appropriately for the specific situation
> in order to tell the CPU how to handle the data (encrypted or unencrypted).
> For example, when SME is enabled and the old memory is encrypted, we remap
> the old memory in an encrypted way, which automatically decrypts the old
> encrypted memory when we read the data through the remapped address.
> 
>  ------------------------------------------------------
> | first-kernel         | second-kernel | kdump support |
> |     (mem_encrypt=on|off)             |   (yes|no)    |
> |----------------------+---------------+---------------|
> | on                   | on            | yes           |
> | off                  | off           | yes           |
> | on                   | off           | no            |
> | off                  | on            | no            |
> |______________________|_______________|_______________|
> 
> Signed-off-by: Lianbo Jiang 
> ---
>  arch/x86/include/asm/dmi.h  | 14 +-
>  arch/x86/kernel/acpi/boot.c |  8 
>  arch/x86/kernel/crash_dump_64.c | 27 +++
>  drivers/acpi/tables.c   | 14 +-
>  drivers/iommu/amd_iommu_init.c  |  9 -
>  fs/proc/vmcore.c| 36 +++-
>  include/linux/crash_dump.h  |  4 
>  kernel/kexec_core.c | 12 
>  8 files changed, 116 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
> index 0ab2ab2..a5663b4 100644
> --- a/arch/x86/include/asm/dmi.h
> +++ b/arch/x86/include/asm/dmi.h
> @@ -7,6 +7,10 @@
>  
>  #include 
>  #include 
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

I don't think you need all of the #ifdef stuff throughout this
patch.  Everything should work just fine without it.

> +#include 
> +#include 
> +#endif
>  
>  static __always_inline __init void *dmi_alloc(unsigned len)
>  {
> @@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
>  }
>  
>  /* Use early IO mappings for DMI because it's initialized early */
> -#define dmi_early_remap  early_memremap
> +static __always_inline __init void *dmi_early_remap(resource_size_t
> + phys_addr, unsigned long size)
> +{
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

Again, no need for the #ifdef here.  You should probably audit the
code for all of these and truly determine if they are really needed.

> + if (sme_active() && is_kdump_kernel())

Use of sme_active() here is good since under SEV, this area will be
encrypted.

> + return early_memremap_decrypted(phys_addr, size);
> +#endif
> + return early_memremap(phys_addr, size);

Instead of doing this, maybe it makes more sense to put this logic
somewhere in the early_memremap() path.  Possibly smarten up the
early_memremap_pgprot_adjust() function with some kdump kernel
related logic.  Not sure it's possible, but would be nice since you
have this logic in a couple of places.

> +}
>  #define dmi_early_unmap  early_memunmap
>  #define dmi_remap(_x, _l)memremap(_x, _l, MEMREMAP_WB)
>  #define dmi_unmap(_x)memunmap(_x)
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 3b20607..354ad66 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -48,6 +48,10 @@
>  #include 
>  #include 
>  #include 
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +#include 
> +#include 
> +#endif
>  
>  #include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
>  static int __initdata acpi_force = 0;
> @@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long 
> phys, unsigned long size)
>   if (!phys || !size)
>   return NULL;
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_active() && is_kdump_kernel())
> + return early_memremap_decrypted(phys, size);
> +#endif

Same as previous comment(s).

>   return early_memremap(phys, size);
>  }
>  
> diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
> index 4f2e077..2ef67fc 100644
> --- a/arch/x86/kernel/crash_dump_64.c
> +++ b/arch/

Re: [PATCH 1/2] add a function(ioremap_encrypted) for kdump when AMD sme enabled.

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel
> by calling ioremap_encrypted().
> 
> Signed-off-by: Lianbo Jiang 
> ---
>  arch/x86/include/asm/io.h |  2 ++
>  arch/x86/mm/ioremap.c | 25 +
>  2 files changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index f6e5b93..06d2a9f 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -192,6 +192,8 @@ extern void __iomem *ioremap_cache(resource_size_t 
> offset, unsigned long size);
>  #define ioremap_cache ioremap_cache
>  extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long 
> size, unsigned long prot_val);
>  #define ioremap_prot ioremap_prot
> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned 
> long size);
> +#define ioremap_encrypted ioremap_encrypted
>  
>  /**
>   * ioremap -   map bus memory into CPU space
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545..7a52d1e 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -131,7 +131,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
> unsigned long size,
>   * caller shouldn't need to know that small detail.
>   */
>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> - unsigned long size, enum page_cache_mode pcm, void *caller)
> + unsigned long size, enum page_cache_mode pcm,
> + void *caller, bool encrypted)
>  {
>   unsigned long offset, vaddr;
>   resource_size_t last_addr;
> @@ -199,7 +200,8 @@ static void __iomem *__ioremap_caller(resource_size_t 
> phys_addr,
>* resulting mapping.
>*/
>   prot = PAGE_KERNEL_IO;
> - if (sev_active() && mem_flags.desc_other)
> + if ((sev_active() && mem_flags.desc_other) ||
> + (encrypted && sme_active()))

You only need the encrypted check here.  If sme is not active,
then the pgprot_encrypted will basically be a no-op.

Also, extra indents.
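
In other words, the check can simply become the following (which is what
the later V3 of this patch ends up doing):

        prot = PAGE_KERNEL_IO;
        if ((sev_active() && mem_flags.desc_other) || encrypted)
                prot = pgprot_encrypted(prot);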

Thanks,
Tom

>   prot = pgprot_encrypted(prot);
>  
>   switch (pcm) {
> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_nocache);
>  
> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL_GPL(ioremap_uc);
>  
> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>  void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wc);
>  
> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>  void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wt);
>  
> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long 
> size)
> +{
> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> + __builtin_return_address(0), true);
> +}
> +EXPORT_SYMBOL(ioremap_encrypted);
> +
>  void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_cache);
>  
> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, 
> unsigned long size,
>  {
>   return __ioremap_caller(phys_addr, size,
>   pgprot2cachemode(__pgprot(prot_val)),
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_prot);
>  
> 



Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel by
> calling ioremap_encrypted().
> 
> When SME is enabled on an AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we remap the old memory as
> encrypted in the second (crash) kernel, and SME must also be enabled in the
> second kernel, otherwise the old encrypted memory cannot be decrypted.
> Simply changing the value of the C-bit on a page will not automatically
> encrypt its existing contents, and any data in the page prior to the C-bit
> modification will become unintelligible. A page of memory that is marked
> encrypted will be automatically decrypted when read from DRAM and will be
> automatically encrypted when written to DRAM.
> 
> For kdump, it is necessary to distinguish whether the memory is encrypted.
> Furthermore, we should also know which parts of the memory are encrypted or
> decrypted. We will remap the memory appropriately for the specific situation
> in order to tell the CPU how to handle the data (encrypted or decrypted).
> For example, when SME is enabled and the old memory is encrypted, we remap
> the old memory in an encrypted way, which automatically decrypts the old
> encrypted memory when we read the data through the remapped address.
> 
>  ------------------------------------------------------
> | first-kernel         | second-kernel | kdump support |
> |     (mem_encrypt=on|off)             |   (yes|no)    |
> |----------------------+---------------+---------------|
> | on                   | on            | yes           |
> | off                  | off           | yes           |
> | on                   | off           | no            |
> | off                  | on            | no            |
> |______________________|_______________|_______________|
> 
> Test tools:
> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
> Author: Lianbo Jiang 
> Date:   Mon May 14 17:02:40 2018 +0800
> Note: This patch can only dump vmcore in the case of SME enabled.
> 
> crash-7.2.1: https://github.com/crash-utility/crash.git
> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
> Author: Dave Anderson 
> Date:   Fri May 11 15:54:32 2018 -0400
> 
> Test environment:
> HP ProLiant DL385Gen10 AMD EPYC 7251
> 8-Core Processor
> 32768 MB memory
> 600 GB disk space
> 
> Linux 4.17-rc4:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> commit 75bc37fefc44 ("Linux 4.17-rc4")
> Author: Linus Torvalds 
> Date:   Sun May 6 16:57:38 2018 -1000
> 
> Reference:
> AMD64 Architecture Programmer's Manual
> https://support.amd.com/TechDocs/24593.pdf
> 

Have you also tested this with SEV?  It would be nice if the kdump
changes you make work with both SME and SEV.

Thanks,
Tom

> Lianbo Jiang (2):
>   add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>   support kdump when AMD secure memory encryption is active
> 
>  arch/x86/include/asm/dmi.h  | 14 +-
>  arch/x86/include/asm/io.h   |  2 ++
>  arch/x86/kernel/acpi/boot.c |  8 
>  arch/x86/kernel/crash_dump_64.c | 27 +++
>  arch/x86/mm/ioremap.c   | 25 +
>  drivers/acpi/tables.c   | 14 +-
>  drivers/iommu/amd_iommu_init.c  |  9 -
>  fs/proc/vmcore.c| 36 +++-
>  include/linux/crash_dump.h  |  4 
>  kernel/kexec_core.c | 12 
>  10 files changed, 135 insertions(+), 16 deletions(-)
> 



Re: kexec reboot fails with extra wbinvd introduced for AMD SME

2018-01-17 Thread Tom Lendacky
On 1/17/2018 8:29 PM, Dave Young wrote:
> On 01/17/18 at 06:14pm, Linus Torvalds wrote:
>> On Wed, Jan 17, 2018 at 5:47 PM, Dave Young  wrote:
>>>
>>> It does not work with just one wbinvd(), and it only works with
>>> removing the wbinvd() for me.  Tom's new post works for me as well
>>> since my cpu is an Intel i5-4200U.
>>
>> Intriguing.
>>
>> It's not like the wbinvd really should be that much of a deal.
>>
>> I think Tom's patch is fine and should be applied, but it does worry
>> me a bit that even a single wbinvd makes that much of a difference for
>> you. There is very little logical reason I can think of that a wbinvd
>> should make any difference what-so-ever on an i5-4200U.
>>
>> I wonder if you have some system issues, and wbinvd just happens to
>> trigger them. But I think we do wbinvd before a suspend-to-RAM too
>> (it's "ACPI_FLUSH_CPU_CACHE()" in the ACPI code). And the drm code
>> does "wbinvd_on_all_cpus()" which does a cross-call etc.
>>
>> Would you mind experimenting a bit with that wbinvd?
>>
>> In particular, what happens if you enable it (so it's not hidden by
>> the SME check), but you move it up to before interrupts are disabled?
> 
> Will play with it more.  Actually I found the hang seems to happen
> in the code of arch/x86/kernel/relocate_kernel_64.S; there is another
> wbinvd there as well.

The wbinvd in arch/x86/kernel/relocate_kernel_64.S is only performed if
SME is active, so that one won't be executed on an Intel chip.

Thanks,
Tom

> 
>>
>> I'm wondering if there is some issue with MCE generation and wbinvd
>> and whatever, and doing it when the CPU is down and interrupts are
>> disabled causes some system issue..
>>
>> Does anybody have any other ideas?
>>
>> Linus



Re: [PATCH] x86/mm: Rework wbinvd, hlt operation in stop_this_cpu()

2018-01-17 Thread Tom Lendacky
On 1/17/2018 5:41 PM, Tom Lendacky wrote:
> Some issues have been reported with the for loop in stop_this_cpu() that
> issues the 'wbinvd; hlt' sequence.  Reverting this sequence to halt()
> has been shown to resolve the issue.
> 
> However, the wbinvd is needed when running with SME.  The reason for the
> wbinvd is to prevent cache flush races between encrypted and non-encrypted
> entries that have the same physical address.  This can occur when
> kexec'ing from memory encryption active to inactive or vice-versa.  The
> important thing is to avoid memory references outside of kernel text
> (such as stack usage), so the native_*() functions are needed since
> they expand to inline asm sequences.  So instead of reverting the
> change, rework the sequence.
> 
> Move the wbinvd instruction outside of the for loop as native_wbinvd()
> and make its execution conditional on X86_FEATURE_SME.  In the for loop,
> change the asm 'wbinvd; hlt' sequence back to a halt sequence but use
> the native_halt() call.
> 
> Cc:  # 4.14.x
> Fixes: bba4ed011a52 ("x86/mm, kexec: Allow kexec to be used with SME")
> Reported-by: Dave Young 

Dave,

Can you test this and see if it resolves your issue?

Thanks,
Tom

> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/kernel/process.c |   25 +++--
>  1 file changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 63711fe..03408b9 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -379,19 +379,24 @@ void stop_this_cpu(void *dummy)
>   disable_local_APIC();
>   mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
>  
> + /*
> +  * Use wbinvd on processors that support SME. This provides support
> +  * for performing a successful kexec when going from SME inactive
> +  * to SME active (or vice-versa). The cache must be cleared so that
> +  * if there are entries with the same physical address, both with and
> +  * without the encryption bit, they don't race each other when flushed
> +  * and potentially end up with the wrong entry being committed to
> +  * memory.
> +  */
> + if (boot_cpu_has(X86_FEATURE_SME))
> + native_wbinvd();
>   for (;;) {
>   /*
> -  * Use wbinvd followed by hlt to stop the processor. This
> -  * provides support for kexec on a processor that supports
> -  * SME. With kexec, going from SME inactive to SME active
> -  * requires clearing cache entries so that addresses without
> -  * the encryption bit set don't corrupt the same physical
> -  * address that has the encryption bit set when caches are
> -  * flushed. To achieve this a wbinvd is performed followed by
> -  * a hlt. Even if the processor is not in the kexec/SME
> -  * scenario this only adds a wbinvd to a halting processor.
> +  * Use native_halt() so that memory contents don't change
> +  * (stack usage and variables) after possibly issuing the
> +  * native_wbinvd() above.
>*/
> - asm volatile("wbinvd; hlt" : : : "memory");
> + native_halt();
>   }
>  }
>  
> 



[PATCH] x86/mm: Rework wbinvd, hlt operation in stop_this_cpu()

2018-01-17 Thread Tom Lendacky
Some issues have been reported with the for loop in stop_this_cpu() that
issues the 'wbinvd; hlt' sequence.  Reverting this sequence to halt()
has been shown to resolve the issue.

However, the wbinvd is needed when running with SME.  The reason for the
wbinvd is to prevent cache flush races between encrypted and non-encrypted
entries that have the same physical address.  This can occur when
kexec'ing from memory encryption active to inactive or vice-versa.  The
important thing is to avoid memory references outside of kernel text
(such as stack usage), so the native_*() functions are needed since
they expand to inline asm sequences.  So instead of reverting the
change, rework the sequence.

Move the wbinvd instruction outside of the for loop as native_wbinvd()
and make its execution conditional on X86_FEATURE_SME.  In the for loop,
change the asm 'wbinvd; hlt' sequence back to a halt sequence but use
the native_halt() call.

Cc:  # 4.14.x
Fixes: bba4ed011a52 ("x86/mm, kexec: Allow kexec to be used with SME")
Reported-by: Dave Young 
Signed-off-by: Tom Lendacky 
---
 arch/x86/kernel/process.c |   25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 63711fe..03408b9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -379,19 +379,24 @@ void stop_this_cpu(void *dummy)
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 
+   /*
+* Use wbinvd on processors that support SME. This provides support
+* for performing a successful kexec when going from SME inactive
+* to SME active (or vice-versa). The cache must be cleared so that
+* if there are entries with the same physical address, both with and
+* without the encryption bit, they don't race each other when flushed
+* and potentially end up with the wrong entry being committed to
+* memory.
+*/
+   if (boot_cpu_has(X86_FEATURE_SME))
+   native_wbinvd();
for (;;) {
/*
-* Use wbinvd followed by hlt to stop the processor. This
-* provides support for kexec on a processor that supports
-* SME. With kexec, going from SME inactive to SME active
-* requires clearing cache entries so that addresses without
-* the encryption bit set don't corrupt the same physical
-* address that has the encryption bit set when caches are
-* flushed. To achieve this a wbinvd is performed followed by
-* a hlt. Even if the processor is not in the kexec/SME
-* scenario this only adds a wbinvd to a halting processor.
+* Use native_halt() so that memory contents don't change
+* (stack usage and variables) after possibly issuing the
+* native_wbinvd() above.
 */
-   asm volatile("wbinvd; hlt" : : : "memory");
+   native_halt();
}
 }
 




Re: kexec reboot fails with extra wbinvd introduced for AMD SME

2018-01-17 Thread Tom Lendacky
On 1/17/2018 2:01 PM, Tom Lendacky wrote:
> On 1/17/2018 1:42 PM, Linus Torvalds wrote:
>> On Tue, Jan 16, 2018 at 11:22 PM, Dave Young  wrote:
>>>
>>> For the kexec reboot hang, if I remove the wbinvd in stop_this_cpu()
>>> then kexec works fine. like this:
>>
>> Honestly, I think we should apply that patch regardless.
>>
>> Using 'wbinvd' should not be some "just because of random reasons".
>> There are CPU's with errata on wbinvd, and the thing in general is
>> slow and nasty.
>>
>> Doing the wbinvd in a loop sounds even stranger.
>>
>> If we're only doing it because of some SME issue, why isn't it
>> dependent on SME? And why is it inside that loop at all?
> 
> My original patches did check for X86_FEATURE_SME and only did the
> wbinvd if SME was supported (although still in the loop).  The general
> consensus was to just do the wbinvd no matter what and so it is as it is
> today.
> 
> It can probably be outside of the loop.  The issue I was seeing was
> memory corruption from the stack when using halt() with paravirt ops
> enabled.  So a native_halt() should be used.
> 
>>
>> Anyway, does it work for you if you just do the wbinvd() once, outside
>> the loop? Admittedly the loop shouldn't actually loop (hlt with
>> interrupts disabled), but who the hell knows.. Some of the errata
>> around SME have been about machine check exceptions or something.
> 
> I think that should work as long as it's a native_wbinvd() call and it
> can also be conditional on boot_cpu_has(X86_FEATURE_SME).
> 
> I'll do some testing.

Looks like everything is good with the suggested changes.  Patch to follow
shortly.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> See commit a68e5c94f7d3 ("x86, hotplug: Move WBINVD back outside the
>> play_dead loop") for another example where wbinvd was inside a loop
>> and apparently caused some odd issues.
>>
>>   Linus
>>



Re: kexec reboot fails with extra wbinvd introduced for AMD SME

2018-01-17 Thread Tom Lendacky
On 1/17/2018 1:42 PM, Linus Torvalds wrote:
> On Tue, Jan 16, 2018 at 11:22 PM, Dave Young  wrote:
>>
>> For the kexec reboot hang, if I remove the wbinvd in stop_this_cpu()
>> then kexec works fine. like this:
> 
> Honestly, I think we should apply that patch regardless.
> 
> Using 'wbinvd' should not be some "just because of random reasons".
> There are CPU's with errata on wbinvd, and the thing in general is
> slow and nasty.
> 
> Doing the wbinvd in a loop sounds even stranger.
> 
> If we're only doing it because of some SME issue, why isn't it
> dependent on SME? And why is it inside that loop at all?

My original patches did check for X86_FEATURE_SME and only did the
wbinvd if SME was supported (although still in the loop).  The general
consensus was to just do the wbinvd no matter what and so it is as it is
today.

It can probably be outside of the loop.  The issue I was seeing was
memory corruption from the stack when using halt() with paravirt ops
enabled.  So a native_halt() should be used.

> 
> Anyway, does it work for you if you just do the wbinvd() once, outside
> the loop? Admittedly the loop shouldn't actually loop (hlt with
> interrupts disabled), but who the hell knows.. Some of the errata
> around SME have been about machine check exceptions or something.

I think that should work as long as it's a native_wbinvd() call and it
can also be conditional on boot_cpu_has(X86_FEATURE_SME).

I'll do some testing.

Thanks,
Tom

> 
> See commit a68e5c94f7d3 ("x86, hotplug: Move WBINVD back outside the
> play_dead loop") for another example where wbinvd was inside a loop
> and apparently caused some odd issues.
> 
>   Linus
> 



Re: kexec reboot fails with extra wbinvd introduced for AMD SME

2018-01-17 Thread Tom Lendacky
On 1/17/2018 1:22 AM, Dave Young wrote:
> [Modify the subject since this is a new problem, original io vector
> issue has been fixed with one commit from Thomas]
> 
> Add more cc according to below old discussion:
> https://lkml.org/lkml/2017/7/27/574
> 
> Tom, I'm not sure why you finally did not dynamically run wbinvd?

That discussion was aimed at the wbinvd that was being performed
in arch/x86/kernel/relocate_kernel_64.S, which is dynamically
run based on a flag.

> On 01/04/18 at 11:15am, Dave Young wrote:
>> On 12/14/17 at 05:24pm, Dave Young wrote:
>>> On 12/13/17 at 11:57pm, Yu Chen wrote:
 On Wed, Dec 13, 2017 at 10:52:56AM +0800, Dave Young wrote:
> Hi,
>
> Kexec reboot and kdump has broken on my laptop for long time with
> 4.15.0-rc1+ kernels. With the patch below an early panic been fixed:
> https://patchwork.kernel.org/patch/10084289/
>
> But still can not get a successful reboot, it looked like graphic
> issue, but after bisecting the kernel, I got below:
>
> [dyoung@dhcp-*-* linux]$ git bisect good
> There are only 'skip'ped commits left to test.
> The first bad commit could be any of:
> 2db1f959d9dc16035f2eb44ed5fdb2789b754d6a
> 4900be83602b6be07366d3e69f756c1959f4169a
> We cannot bisect more!
>
> These two commits can no be reverted because of code conflicts, thus
> I reverted the whole series from Thomas (below commits), with those
> x86/vector changes reverted, kexec reboot works fine.
>
> Could you help to take a look, any thoughts?  I can do the test
> if you have some debug patch to try.
 Is it possible that the "second" kernel runs on non-zero CPU? If yes,
 what if some irqs are only delivered to cpu0? (use cpumask_of(0)
 directly)
>>>
>>> Thanks for the reply.
>>>
>>> For kdump, yes, for kexec, I'm not sure.  
>>>
>>> Here is some kexec kernel boot log:
>>> http://people.redhat.com/~ruyang/misc/kexec-regression.txt
>>>
>>> Copy the lockup call trace here:
>>> [   23.779285] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 
>>> 
>>> [   23.779285] Modules linked in: arc4 rtsx_pci_sdmmc i915 iwlmvm kvm_intel 
>>> mac8
>>> 0211 kvm irqbypass btusb btrtl btbcm intel_gtt btintel drm_kms_helper 
>>> snd_hda_in
>>> tel syscopyarea bluetooth iwlwifi snd_hda_codec snd_hwdep snd_hda_core 
>>> sysfillre
>>> ct snd_seq sysimgblt input_leds fb_sys_fops e1000e ecdh_generic cfg80211 
>>> snd_seq
>>> _device drm snd_pcm serio_raw ptp pcspkr thinkpad_acpi i2c_i801 snd_timer 
>>> rtsx_p
>>> ci pps_core snd soundcore rfkill video  
>>> 
>>> [   23.779307] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc3+ #378   
>>> 
>>> [   23.779308] Hardware name: LENOVO 20ARS1BJ02/20ARS1BJ02, BIOS GJET92WW 
>>> (2.42 
>>> ) 03/03/2017
>>> 
>>> [   23.779312] RIP: 0010:poll_idle+0x2f/0x5f
>>> 
>>> [   23.779313] RSP: 0018:81c03e80 EFLAGS: 0246  
>>> 
>>> [   23.779314] RAX: 81c0f4c0 RBX: 81c6db80 RCX: 
>>> 
>>> [   23.779315] RDX:  RSI: 81c6db80 RDI: 
>>> 88021f2201e8
>>> [   23.779316] RBP: 88021f2201e8 R08: 00349a65b7dd R09: 
>>> 88021f216db4
>>> [   23.779317] R10: 81c03e68 R11:  R12: 
>>> 
>>> [   23.779318] R13: 81c6db98 R14:  R15: 
>>> 000578a065b1
>>> [   23.779319] FS:  () GS:88021f20() 
>>> knlGS:0
>>> 000 
>>> 
>>> [   23.779320] CS:  0010 DS:  ES:  CR0: 80050033
>>> 
>>> [   23.779321] CR2: 7ffed1d0ee60 CR3: 00021ec0a006 CR4: 
>>> 001606b0
>>> [   23.779322] Call Trace:  
>>> 
>>> [   23.779328]  cpuidle_enter_state+0x6a/0x2c0  
>>> 
>>> [   23.779333]  do_idle+0x17b/0x1d0 
>>> 
>>> [   23.779335]  cpu_startup_entry+0x6f/0x80 
>>> 
>>> [   23.779338]  start_kernel+0x431/0x451
>>> 
>>> [   23.779342]  secondary_startup_64+0xa5/0xb0  
>>> 
>>> [   23.779344] Code: 00 fb 66 0f 1f 44 00 00 65 48 8b 04 25 40 c4 00 00 f0 
>>> 80 48
>>>  02 20 48 8b 08 83 e1 08 74 0d eb 12 f3 90 65 48 8b 04 25 40 c4 00 00 <48> 
>>> 8b 00
>>>  a8 08 74 ee 65 48 8b 04 25 40 c4 00 00 f0 80 60 02 df
>>>
>>
>> Following up on this issue: it seems another commit from Thomas partially
>> fixed it. kexec/kdump boot up successfully for me, but kexec after kexec
>> (2nd kexec reboot cycle) failed; the kernel hung early.
> 
> The above kexec reboot hang is another problem, so Thomas has fully
> fixed the previous report, thanks!
> 

[tip:x86/mm] x86/mm, kexec: Fix memory corruption with SME on successive kexecs

2017-07-30 Thread tip-bot for Tom Lendacky
Commit-ID:  4e237903f95db585b976e7311de2bfdaaf0f6e31
Gitweb: http://git.kernel.org/tip/4e237903f95db585b976e7311de2bfdaaf0f6e31
Author: Tom Lendacky 
AuthorDate: Fri, 28 Jul 2017 11:01:16 -0500
Committer:  Ingo Molnar 
CommitDate: Sun, 30 Jul 2017 12:09:12 +0200

x86/mm, kexec: Fix memory corruption with SME on successive kexecs

After issuing successive kexecs it was found that the SHA hash failed
verification when booting the kexec'd kernel.  When SME is enabled, the
change from using pages that were marked encrypted to now being marked as
not encrypted (through new identity-mapped page tables) results in memory
corruption if there are any cache entries for the previously encrypted
pages. This is because separate cache entries can exist for the same
physical location but tagged both with and without the encryption bit.

To prevent this, issue a wbinvd if SME is active before copying the pages
from the source location to the destination location to clear any possible
cache entry conflicts.

Signed-off-by: Tom Lendacky 
Cc: 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brijesh Singh 
Cc: Dave Young 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: 
http://lkml.kernel.org/r/e7fb8610af3a93e8f8ae6f214cd9249adc0df2b4.1501186516.git.thomas.lenda...@amd.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/kexec.h |  3 ++-
 arch/x86/kernel/machine_kexec_64.c   |  3 ++-
 arch/x86/kernel/relocate_kernel_64.S | 14 ++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index e8183ac..942c1f4 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -147,7 +147,8 @@ unsigned long
 relocate_kernel(unsigned long indirection_page,
unsigned long page_list,
unsigned long start_address,
-   unsigned int preserve_context);
+   unsigned int preserve_context,
+   unsigned int sme_active);
 #endif
 
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 9cf8daa..1f790cf 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -335,7 +335,8 @@ void machine_kexec(struct kimage *image)
image->start = relocate_kernel((unsigned long)image->head,
   (unsigned long)page_list,
   image->start,
-  image->preserve_context);
+  image->preserve_context,
+  sme_active());
 
 #ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
diff --git a/arch/x86/kernel/relocate_kernel_64.S 
b/arch/x86/kernel/relocate_kernel_64.S
index 98111b3..307d3ba 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -47,6 +47,7 @@ relocate_kernel:
 * %rsi page_list
 * %rdx start address
 * %rcx preserve_context
+* %r8  sme_active
 */
 
/* Save the CPU context, used for jumping back */
@@ -71,6 +72,9 @@ relocate_kernel:
pushq $0
popfq
 
+   /* Save SME active flag */
+   movq%r8, %r12
+
/*
 * get physical address of control page now
 * this is impossible after page table switch
@@ -132,6 +136,16 @@ identity_mapped:
/* Flush the TLB (needed?) */
movq%r9, %cr3
 
+   /*
+* If SME is active, there could be old encrypted cache line
+* entries that will conflict with the now unencrypted memory
+* used by kexec. Flush the caches before copying the kernel.
+*/
+   testq   %r12, %r12
+   jz 1f
+   wbinvd
+1:
+
movq%rcx, %r11
callswap_pages
 



[PATCH v2 1/2] x86/mm, kexec: Fix memory corruption with SME on successive kexecs

2017-07-28 Thread Tom Lendacky
After issuing successive kexecs it was found that the SHA hash failed
verification when booting the kexec'd kernel.  When SME is enabled, the
change from using pages that were marked encrypted to now being marked as
not encrypted (through new identity-mapped page tables) results in memory
corruption if there are any cache entries for the previously encrypted
pages. This is because separate cache entries can exist for the same
physical location but tagged both with and without the encryption bit.

To prevent this, issue a wbinvd if SME is active before copying the pages
from the source location to the destination location to clear any possible
cache entry conflicts.

Cc: 
Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/kexec.h |  3 ++-
 arch/x86/kernel/machine_kexec_64.c   |  3 ++-
 arch/x86/kernel/relocate_kernel_64.S | 14 ++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index e8183ac..942c1f4 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -147,7 +147,8 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
 relocate_kernel(unsigned long indirection_page,
unsigned long page_list,
unsigned long start_address,
-   unsigned int preserve_context);
+   unsigned int preserve_context,
+   unsigned int sme_active);
 #endif
 
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 9cf8daa..1f790cf 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -335,7 +335,8 @@ void machine_kexec(struct kimage *image)
image->start = relocate_kernel((unsigned long)image->head,
   (unsigned long)page_list,
   image->start,
-  image->preserve_context);
+  image->preserve_context,
+  sme_active());
 
 #ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
diff --git a/arch/x86/kernel/relocate_kernel_64.S 
b/arch/x86/kernel/relocate_kernel_64.S
index 98111b3..307d3ba 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -47,6 +47,7 @@ relocate_kernel:
 * %rsi page_list
 * %rdx start address
 * %rcx preserve_context
+* %r8  sme_active
 */
 
/* Save the CPU context, used for jumping back */
@@ -71,6 +72,9 @@ relocate_kernel:
pushq $0
popfq
 
+   /* Save SME active flag */
+   movq%r8, %r12
+
/*
 * get physical address of control page now
 * this is impossible after page table switch
@@ -132,6 +136,16 @@ identity_mapped:
/* Flush the TLB (needed?) */
movq%r9, %cr3
 
+   /*
+* If SME is active, there could be old encrypted cache line
+* entries that will conflict with the now unencrypted memory
+* used by kexec. Flush the caches before copying the kernel.
+*/
+   testq   %r12, %r12
+   jz 1f
+   wbinvd
+1:
+
movq%rcx, %r11
callswap_pages
 
-- 
1.9.1




[PATCH v2 0/2] x86: Secure Memory Encryption (SME) fixes 2017-07-26

2017-07-28 Thread Tom Lendacky
This patch series addresses some issues found during further testing of
Secure Memory Encryption (SME).

The following fixes are included in this update series:

- Fix a cache-related memory corruption when kexec is invoked in
  successive instances
- Remove the encryption mask from the protection properties returned
  by arch_apei_get_mem_attribute() when SME is active

---

This patch series is based off of the master branch of tip:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

  Commit 8333bcad393c ("Merge branch 'x86/asm'")

Cc: 

Changes since v1:
- Patch #1:
  - Only issue wbinvd if SME is active
- Patch #2:
  - Create a no encryption version of the PAGE_KERNEL protection type
and use that in arch_apei_get_mem_attribute()
- General comment and patch description clean up

Tom Lendacky (2):
  x86/mm, kexec: Fix memory corruption with SME on successive kexecs
  acpi, x86: Remove encryption mask from ACPI page protection type

 arch/x86/include/asm/acpi.h  | 11 ++-
 arch/x86/include/asm/kexec.h |  3 ++-
 arch/x86/include/asm/pgtable_types.h |  1 +
 arch/x86/kernel/machine_kexec_64.c   |  3 ++-
 arch/x86/kernel/relocate_kernel_64.S | 14 ++
 5 files changed, 25 insertions(+), 7 deletions(-)

-- 
1.9.1



