Re: [PATCH 2/2 v2] iommu: use the __iommu_attach_device() directly for deferred attach

2021-01-21 Thread lijiang
Hi, Christoph

On 2021-01-19 23:29, Christoph Hellwig wrote:
>> +int iommu_do_deferred_attach(struct device *dev,
>> + struct iommu_domain *domain)
> 
> I'd remove the "do_" from the name, it doesn't really add any value.
> 
OK.

>> +{
>> +    const struct iommu_ops *ops = domain->ops;
>> +
>> +    if (unlikely(ops->is_attach_deferred &&
>> +                 ops->is_attach_deferred(domain, dev)))
>> +            return __iommu_attach_device(domain, dev);
>> +
>> +    return 0;
> 
> Now that everyting is under the static key we don't really need to
> bother with the unlikely micro optimization, do we?
> 
Good point. I will remove it; otherwise it arguably should have been likely() instead.

>> +extern int iommu_do_deferred_attach(struct device *dev,
>> +struct iommu_domain *domain);
> 
> No need for the extern.
> 
Sounds good. I will remove the 'extern' when I post again.
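
For reference, a minimal sketch of how the helper could end up looking once the
comments above are addressed (renamed without "do_", no unlikely(), no extern in
the header); this is only an illustration, not the final patch:

int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
{
	const struct iommu_ops *ops = domain->ops;

	/* Replay the attach that was skipped in iommu_group_add_device() */
	if (ops->is_attach_deferred && ops->is_attach_deferred(domain, dev))
		return __iommu_attach_device(domain, dev);

	return 0;
}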

Thanks.
Lianbo


Re: [PATCH 1/2 v2] dma-iommu: use static-key to minimize the impact in the fast-path

2021-01-21 Thread lijiang
Hi, Christoph

Thanks for the comment.
On 2021-01-19 23:26, Christoph Hellwig wrote:
> On Tue, Jan 19, 2021 at 07:16:15PM +0800, Lianbo Jiang wrote:
>> +static DEFINE_STATIC_KEY_FALSE(__deferred_attach);
> Why the strange underscores?  Wouldn't iommu_deferred_attach_enabled

The variable is defined with the static keyword, which indicates that it is only
used within this file, and the underscore prefix makes that hint explicit.
Anyway, that is just my personal preference.

> be a better name?
> 
Wouldn't that be a rather long name, though?
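
For illustration, one way the renamed key could be defined and flipped on in
dma-iommu.c; the is_kdump_kernel() trigger and the iommu_dma_init() placement are
assumptions based on this series, not necessarily the final code:

static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);

static int iommu_dma_init(void)
{
	/* Deferred attach is only needed in the kdump kernel */
	if (is_kdump_kernel())
		static_branch_enable(&iommu_deferred_attach_enabled);

	return iova_cache_get();
}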

Thanks.
Lianbo


Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-18 Thread lijiang
On 2021-01-15 23:15, Robin Murphy wrote:
> On 2021-01-15 14:26, lijiang wrote:
>> Hi, Robin
>>
>> Thank you for the comment.
>>
>>> On 2021-01-13 01:29, Robin Murphy wrote:
>>> On 2021-01-05 07:52, lijiang wrote:
>>>>> On 2021-01-05 11:55, lijiang wrote:
>>>>> Hi,
>>>>>
>>>>> Also add Joerg to cc list.
>>>>>
>>>>
>>>> Also add more people to cc list, Jerry Snitselaar and Tom Lendacky.
>>>>
>>>> Thanks.
>>>>
>>>>> Thanks.
>>>>> Lianbo
>>>>> On 2020-12-26 13:39, Lianbo Jiang wrote:
>>>>>> Currently, domain attach is allowed to be deferred from the iommu
>>>>>> driver to the device driver, and when the iommu initializes, the
>>>>>> devices on the bus are scanned and the default groups are allocated.
>>>>>>
>>>>>> Due to the above changes, some devices could be added to the same
>>>>>> group as below:
>>>>>>
>>>>>> [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
>>>>>> [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
>>>>>> [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
>>>>>> [    3.874892] pci 0000:02:00.1: Adding to iommu group 17
>>>>>>
>>>>>> But attaching these devices is not allowed when a group has more
>>>>>> than one device, otherwise an error is returned. This conflicts
>>>>>> with the deferred attaching. Unfortunately, there are two devices
>>>>>> in the same group on my machine, for example:
>>>>>>
>>>>>> [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
>>>>>> [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
>>>>>> ...
>>>>>> [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
>>>>>> [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1
>>>>>>
>>>>>> This finally caused the tg3 driver to fail when it calls
>>>>>> dma_alloc_coherent() to allocate coherent memory in tg3_test_dma():
>>>>>>
>>>>>> [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
>>>>>> [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
>>>>>> [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
>>>>>> [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
>>>>>> [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
>>>>>> [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
>>>>>> [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
>>>>>> [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12
>>>>>>
>>>>>> In addition, similar situations also occur in other drivers such
>>>>>> as the bnxt_en driver. This can be reproduced easily in the kdump
>>>>>> kernel when SME is active.
>>>>>>
>>>>>> Add a check for the deferred attach in iommu_attach_device() and
>>>>>> allow attaching the deferred device regardless of how many devices
>>>>>> are in the group.
>>>
>>> Is this iommu_attach_device() call coming from iommu-dma? (if not, then 
>>> whoever's calling it probably shouldn't be)
>>>
>>
>> Yes, you are right, the iommu_attach_device call is coming from iommu-dma.
>>  
>>> Assuming so, then probably what should happen is to move the handling 
>>> currently in iommu_dma_deferred_attach() into the core so that it can call 
>>> __iommu_attach_device() directly - the intent is just to replay that exact 
>>> call skipped in iommu_group_add_device(), so the legacy external 
>>> iommu_attach_device() interface isn't really the right tool for the job
>>
>> Sounds good. I will check if this can work in various cases. If it's OK, I 
>> will post again.
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index f0305e6aac1b..5e7da902ac36 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -23,7 +23,6 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>>   #include 
>>     stru

Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-15 Thread lijiang
Hi, Robin

Thank you for the comment.

On 2021-01-13 01:29, Robin Murphy wrote:
> On 2021-01-05 07:52, lijiang wrote:
>> On 2021-01-05 11:55, lijiang wrote:
>>> Hi,
>>>
>>> Also add Joerg to cc list.
>>>
>>
>> Also add more people to cc list, Jerry Snitselaar and Tom Lendacky.
>>
>> Thanks.
>>
>>> Thanks.
>>> Lianbo
>>> On 2020-12-26 13:39, Lianbo Jiang wrote:
>>>> Currently, domain attach is allowed to be deferred from the iommu
>>>> driver to the device driver, and when the iommu initializes, the
>>>> devices on the bus are scanned and the default groups are allocated.
>>>>
>>>> Due to the above changes, some devices could be added to the same
>>>> group as below:
>>>>
>>>> [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
>>>> [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
>>>> [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
>>>> [    3.874892] pci 0000:02:00.1: Adding to iommu group 17
>>>>
>>>> But attaching these devices is not allowed when a group has more
>>>> than one device, otherwise an error is returned. This conflicts
>>>> with the deferred attaching. Unfortunately, there are two devices
>>>> in the same group on my machine, for example:
>>>>
>>>> [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
>>>> [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
>>>> ...
>>>> [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
>>>> [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1
>>>>
>>>> This finally caused the tg3 driver to fail when it calls
>>>> dma_alloc_coherent() to allocate coherent memory in tg3_test_dma():
>>>>
>>>> [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
>>>> [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
>>>> [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
>>>> [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
>>>> [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
>>>> [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
>>>> [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
>>>> [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12
>>>>
>>>> In addition, similar situations also occur in other drivers such
>>>> as the bnxt_en driver. This can be reproduced easily in the kdump
>>>> kernel when SME is active.
>>>>
>>>> Add a check for the deferred attach in iommu_attach_device() and
>>>> allow attaching the deferred device regardless of how many devices
>>>> are in the group.
> 
> Is this iommu_attach_device() call coming from iommu-dma? (if not, then 
> whoever's calling it probably shouldn't be)
> 

Yes, you are right, the iommu_attach_device call is coming from iommu-dma.
 
> Assuming so, then probably what should happen is to move the handling 
> currently in iommu_dma_deferred_attach() into the core so that it can call 
> __iommu_attach_device() directly - the intent is just to replay that exact 
> call skipped in iommu_group_add_device(), so the legacy external 
> iommu_attach_device() interface isn't really the right tool for the job 

Sounds good. I will check if this can work in various cases. If it's OK, I will 
post again.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f0305e6aac1b..5e7da902ac36 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -23,7 +23,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 struct iommu_dma_msi_page {
@@ -378,21 +377,6 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
return iova_reserve_iommu_regions(dev, domain);
 }
 
-static int iommu_dma_deferred_attach(struct device *dev,
-   struct iommu_domain *domain)
-{
-   const struct iommu_ops *ops = domain->ops;
-
-   if (!is_kdump_kernel())
-   return 0;
-
-   if (unlikely(ops->is_attach_deferred &&
-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
-
-   return 0;
-}
-
 /**
  * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
  *page flags.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..4fed1567b498 100644
--- a/driver
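
(The quoted diff is truncated in the archive.) For context, a rough sketch of
what the dma-iommu call sites could look like once the handling moves into the
core, combining the static key from patch 1/2 with the core helper; the wrapper
name and error code here are illustrative assumptions, not the literal patch:

static int iommu_dma_ensure_attached(struct device *dev,
				     struct iommu_domain *domain)
{
	/* Fast path: a single static-key check when deferred attach is off */
	if (static_branch_unlikely(&iommu_deferred_attach_enabled) &&
	    iommu_deferred_attach(dev, domain))
		return -ENODEV;

	return 0;
}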

Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-04 Thread lijiang
On 2021-01-05 11:55, lijiang wrote:
> Hi,
> 
> Also add Joerg to cc list.
> 

Also add more people to cc list, Jerry Snitselaar and Tom Lendacky.

Thanks.

> Thanks.
> Lianbo
> On 2020-12-26 13:39, Lianbo Jiang wrote:
>> Currently, domain attach is allowed to be deferred from the iommu
>> driver to the device driver, and when the iommu initializes, the
>> devices on the bus are scanned and the default groups are allocated.
>>
>> Due to the above changes, some devices could be added to the same
>> group as below:
>>
>> [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
>> [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
>> [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
>> [    3.874892] pci 0000:02:00.1: Adding to iommu group 17
>>
>> But attaching these devices is not allowed when a group has more
>> than one device, otherwise an error is returned. This conflicts
>> with the deferred attaching. Unfortunately, there are two devices
>> in the same group on my machine, for example:
>>
>> [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
>> [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
>> ...
>> [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
>> [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1
>>
>> This finally caused the tg3 driver to fail when it calls
>> dma_alloc_coherent() to allocate coherent memory in tg3_test_dma():
>>
>> [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
>> [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
>> [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
>> [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
>> [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
>> [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
>> [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
>> [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12
>>
>> In addition, similar situations also occur in other drivers such
>> as the bnxt_en driver. This can be reproduced easily in the kdump
>> kernel when SME is active.
>>
>> Add a check for the deferred attach in iommu_attach_device() and
>> allow attaching the deferred device regardless of how many devices
>> are in the group.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>>  drivers/iommu/iommu.c | 5 +++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index ffeebda8d6de..dccab7b133fb 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1967,8 +1967,11 @@ int iommu_attach_device(struct iommu_domain *domain, 
>> struct device *dev)
>>   */
>>   mutex_lock(&group->mutex);
>>   ret = -EINVAL;
>> - if (iommu_group_device_count(group) != 1)
>> + if (!iommu_is_attach_deferred(domain, dev) &&
>> +     iommu_group_device_count(group) != 1) {
>> +         dev_err_ratelimited(dev, "Group has more than one device\n");
>>           goto out_unlock;
>> + }
>>  
>>  ret = __iommu_attach_group(domain, group);
>>  
>>


Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-04 Thread lijiang
Hi,

Also add Joerg to cc list.

Thanks.
Lianbo
On 2020-12-26 13:39, Lianbo Jiang wrote:
> Currently, domain attach is allowed to be deferred from the iommu
> driver to the device driver, and when the iommu initializes, the
> devices on the bus are scanned and the default groups are allocated.
> 
> Due to the above changes, some devices could be added to the same
> group as below:
> 
> [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
> [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
> [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
> [    3.874892] pci 0000:02:00.1: Adding to iommu group 17
> 
> But attaching these devices is not allowed when a group has more
> than one device, otherwise an error is returned. This conflicts
> with the deferred attaching. Unfortunately, there are two devices
> in the same group on my machine, for example:
> 
> [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
> [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
> ...
> [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
> [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1
> 
> This finally caused the tg3 driver to fail when it calls
> dma_alloc_coherent() to allocate coherent memory in tg3_test_dma():
> 
> [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
> [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
> [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
> [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
> [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
> [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
> [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
> [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12
> 
> In addition, similar situations also occur in other drivers such
> as the bnxt_en driver. This can be reproduced easily in the kdump
> kernel when SME is active.
> 
> Add a check for the deferred attach in iommu_attach_device() and
> allow attaching the deferred device regardless of how many devices
> are in the group.
> 
> Signed-off-by: Lianbo Jiang 
> ---
>  drivers/iommu/iommu.c | 5 +++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ffeebda8d6de..dccab7b133fb 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1967,8 +1967,11 @@ int iommu_attach_device(struct iommu_domain *domain, 
> struct device *dev)
>*/
>   mutex_lock(&group->mutex);
>   ret = -EINVAL;
> - if (iommu_group_device_count(group) != 1)
> + if (!iommu_is_attach_deferred(domain, dev) &&
> +     iommu_group_device_count(group) != 1) {
> +         dev_err_ratelimited(dev, "Group has more than one device\n");
>           goto out_unlock;
> + }
>  
>   ret = __iommu_attach_group(domain, group);
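
For reference, a simplified sketch of the helper the check above relies on, as it
looks in drivers/iommu/iommu.c around this time (paraphrased, not quoted from the
tree):

static bool iommu_is_attach_deferred(struct iommu_domain *domain,
				     struct device *dev)
{
	if (domain->ops->is_attach_deferred)
		return domain->ops->is_attach_deferred(domain, dev);

	return false;
}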
>  
> 


Re: [PATCH v3 5/6] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-18 Thread lijiang
On 2019-07-19 01:47, Lendacky, Thomas wrote:
> On 7/17/19 10:28 PM, Thiago Jung Bauermann wrote:
>> Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't
>> appear in generic kernel code because it forces non-x86 architectures to
>> define the sev_active() function, which doesn't make a lot of sense.
>>
>> To solve this problem, add an x86 elfcorehdr_read() function to override
>> the generic weak implementation. To do that, it's necessary to make
>> read_from_oldmem() public so that it can be used outside of vmcore.c.
>>
>> Also, remove the export for sev_active() since it's only used in files that
>> won't be built as modules.
>>
>> Signed-off-by: Thiago Jung Bauermann 
> 
> Adding Lianbo and Baoquan, who recently worked on this, for their review.
> 

This change looks good to me.

Reviewed-by: Lianbo Jiang 

Thanks.
Lianbo

> Thanks,
> Tom
> 
>> ---
>>  arch/x86/kernel/crash_dump_64.c |  5 +
>>  arch/x86/mm/mem_encrypt.c   |  1 -
>>  fs/proc/vmcore.c|  8 
>>  include/linux/crash_dump.h  | 14 ++
>>  include/linux/mem_encrypt.h |  1 -
>>  5 files changed, 23 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/crash_dump_64.c 
>> b/arch/x86/kernel/crash_dump_64.c
>> index 22369dd5de3b..045e82e8945b 100644
>> --- a/arch/x86/kernel/crash_dump_64.c
>> +++ b/arch/x86/kernel/crash_dump_64.c
>> @@ -70,3 +70,8 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char 
>> *buf, size_t csize,
>>  {
>>  return __copy_oldmem_page(pfn, buf, csize, offset, userbuf, true);
>>  }
>> +
>> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> +{
>> +return read_from_oldmem(buf, count, ppos, 0, sev_active());
>> +}
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 7139f2f43955..b1e823441093 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -349,7 +349,6 @@ bool sev_active(void)
>>  {
>>  return sme_me_mask && sev_enabled;
>>  }
>> -EXPORT_SYMBOL(sev_active);
>>  
>>  /* Override for DMA direct allocation check - 
>> ARCH_HAS_FORCE_DMA_UNENCRYPTED */
>>  bool force_dma_unencrypted(struct device *dev)
>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>> index 57957c91c6df..ca1f20bedd8c 100644
>> --- a/fs/proc/vmcore.c
>> +++ b/fs/proc/vmcore.c
>> @@ -100,9 +100,9 @@ static int pfn_is_ram(unsigned long pfn)
>>  }
>>  
>>  /* Reads a page from the oldmem device from given offset. */
>> -static ssize_t read_from_oldmem(char *buf, size_t count,
>> -u64 *ppos, int userbuf,
>> -bool encrypted)
>> +ssize_t read_from_oldmem(char *buf, size_t count,
>> + u64 *ppos, int userbuf,
>> + bool encrypted)
>>  {
>>  unsigned long pfn, offset;
>>  size_t nr_bytes;
>> @@ -166,7 +166,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
>>   */
>>  ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>>  {
>> -return read_from_oldmem(buf, count, ppos, 0, sev_active());
>> +return read_from_oldmem(buf, count, ppos, 0, false);
>>  }
>>  
>>  /*
>> diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
>> index f774c5eb9e3c..4664fc1871de 100644
>> --- a/include/linux/crash_dump.h
>> +++ b/include/linux/crash_dump.h
>> @@ -115,4 +115,18 @@ static inline int vmcore_add_device_dump(struct 
>> vmcoredd_data *data)
>>  return -EOPNOTSUPP;
>>  }
>>  #endif /* CONFIG_PROC_VMCORE_DEVICE_DUMP */
>> +
>> +#ifdef CONFIG_PROC_VMCORE
>> +ssize_t read_from_oldmem(char *buf, size_t count,
>> + u64 *ppos, int userbuf,
>> + bool encrypted);
>> +#else
>> +static inline ssize_t read_from_oldmem(char *buf, size_t count,
>> +   u64 *ppos, int userbuf,
>> +   bool encrypted)
>> +{
>> +return -EOPNOTSUPP;
>> +}
>> +#endif /* CONFIG_PROC_VMCORE */
>> +
>>  #endif /* LINUX_CRASHDUMP_H */
>> diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
>> index 0c5b0ff9eb29..5c4a18a91f89 100644
>> --- a/include/linux/mem_encrypt.h
>> +++ b/include/linux/mem_encrypt.h
>> @@ -19,7 +19,6 @@
>>  #else   /* !CONFIG_ARCH_HAS_MEM_ENCRYPT */
>>  
>>  static inline bool mem_encrypt_active(void) { return false; }
>> -static inline bool sev_active(void) { return false; }
>>  
>>  #endif  /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
>>  
>>
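
To spell out the override pattern used in the patch above: the generic code keeps
a __weak default that assumes an unencrypted ELF header, and x86 supplies a strong
definition that consults sev_active(), so the SEV knowledge stays in arch code.
Both definitions are taken from the hunks quoted above:

/* fs/proc/vmcore.c: generic weak default */
ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
	return read_from_oldmem(buf, count, ppos, 0, false);
}

/* arch/x86/kernel/crash_dump_64.c: strong definition wins at link time */
ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
	return read_from_oldmem(buf, count, ppos, 0, sev_active());
}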

Re: [PATCH 3/4 v7] amd_iommu: remap the device table of IOMMU with the memory encryption mask for kdump

2018-09-26 Thread lijiang
On 2018-09-25 20:04, Joerg Roedel wrote:
> On Fri, Sep 07, 2018 at 04:18:04PM +0800, Lianbo Jiang wrote:
>> In kdump kernel, it will copy the device table of IOMMU from the old device
>> table, which is encrypted when SME is enabled in the first kernel. So we
>> have to remap the old device table with the memory encryption mask.
>>
>> Signed-off-by: Lianbo Jiang 
> 
> Please change the subject to:
> 
>   iommu/amd: Remap the device table of IOMMU with the memory encryption 
> mask for kdump
> 
> With that you can add my:
> 
> Acked-by: Joerg Roedel 
> 

Thank you, Joerg.
Sorry for my late reply. I will change the subject and resend this patch.

Thanks.
Lianbo

Re: [PATCH 0/4 v7] Support kdump for AMD secure memory encryption(SME)

2018-09-26 Thread lijiang
Also cc the maintainer and other reviewers. Thanks.

On 2018-09-26 13:52, lijiang wrote:
> On 2018-09-26 03:10, Lendacky, Thomas wrote:
>> On 09/07/2018 03:18 AM, Lianbo Jiang wrote:
>>> When SME is enabled on an AMD machine, we also need to support kdump. Because
>>> the memory is encrypted in the first kernel, we will remap the old memory
>>> into the kdump kernel for dumping data, and SME must also be enabled in the
>>> kdump kernel, otherwise the old memory cannot be decrypted.
>>>
>>> For kdump, it is necessary to distinguish whether the memory is encrypted.
>>> Furthermore, we should also know which part of the memory is encrypted or
>>> decrypted. We will appropriately remap the memory according to the specific
>>> situation in order to tell the CPU how to access the memory.
>>>
>>> As we know, a page of memory that is marked as encrypted will be
>>> automatically decrypted when read from DRAM, and will be automatically
>>> encrypted when written to DRAM. If the old memory is encrypted, we have to
>>> remap the old memory with the memory encryption mask, which will
>>> automatically decrypt the old memory when we read that data.
>>>
>>> For kdump (SME), there are two cases that are not supported:
>>>
>>>  ------------------------------------------------------------------
>>> |     first-kernel     |     second-kernel    | kdump support |
>>> | (mem_encrypt=on|off) | (mem_encrypt=on|off) |    (yes|no)   |
>>> |----------------------+----------------------+---------------|
>>> |          on          |          on          |      yes      |
>>> |          off         |          off         |      yes      |
>>> |          on          |          off         |      no       |
>>> |          off         |          on          |      no       |
>>> |______________________|______________________|_______________|
>>>
>>> 1. SME is enabled in the first kernel, but SME is disabled in kdump kernel
>>> In this case, because the old memory is encrypted, we can't decrypt the
>>> old memory.
>>>
>>> 2. SME is disabled in the first kernel, but SME is enabled in kdump kernel
>>> It is unnecessary to support this case: because the old memory is
>>> unencrypted, it can be dumped as usual and we don't need to enable SME in
>>> the kdump kernel. Moreover, if we had to support this scenario, it would
>>> increase the complexity of the code, since we would have to consider how to
>>> pass the SME flag from the first kernel to the kdump kernel, in order to let
>>> the kdump kernel know whether the old memory is encrypted.
>>>
>>> There are two methods to pass the SME flag to the kdump kernel. The first
>>> method is to modify the assembly code, which touches some common code, and
>>> the path is too long. The second method is to use the kexec tool, which
>>> would require the SME flag to be exported by the first kernel via "proc" or
>>> "sysfs"; kexec-tools would read the SME flag from "proc" or "sysfs" when
>>> loading the image, the SME flag would then be saved in boot_params, and we
>>> could properly remap the old memory according to the previously saved SME
>>> flag. But it is too expensive to do this.
>>>
>>> These patches are only for SME kdump; they don't support SEV kdump.
>>
>> Reviewed-by: Tom Lendacky 
>>
> 
> Thank you, Tom. I'm very glad that you took the time to review my patches and
> also gave me some advice to improve them.
> 
>> Just curious, are you planning to add SEV kdump support after this?
>>
> 
> Yes, we are planning to add SEV kdump support after this.
> And I would also welcome your review of the SEV kdump patches as well.
> 
>> Also, a question below...
>>
>>>
>>> Test tools:
>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>> Note: This patch can only dump vmcore in the case of SME enabled.
>>>
>>> crash-7.2.3: https://github.com/crash-utility/crash.git
>>> commit 001f77a05585 (Fix for Linux 4.19-rc1 and later kernels that contain
>>>  kernel commit 7290d58095712a89f845e1bca05334796dd49ed2)
>>>
>>> kexec-tools-2.0.17: 
>>> git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
>>> commit b9de21ef51a7 (kexec: fix for "Unhandled rela relocation: 
>>> R_X86_64_PLT32" error)
>>> Note:
>>> Before you load the kernel and initramfs for kdump, this 
>>> 

Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask

2018-09-05 Thread lijiang
On 2018-09-05 14:46, Dave Young wrote:
> [snip]
>>
>> As previously mentioned, there are also many differences between kexec and
>> kdump. In general, kexec needs to look at all of the available physical
>> memory, but kdump doesn't need to.
>>
>> For kexec, kexec-tools will read /sys/firmware/memmap and recreate the e820
>> ranges for the 2nd kernel. If that fails, it will use /proc/iomem.
>>
>> For kdump, kexec-tools will read /proc/iomem and recreate the e820 ranges
>> for the kdump kernel. BTW: we can not get the range of persistent memory
>> from /proc/iomem, so the e820 ranges don't contain the persistent memory in
>> the kdump kernel; this is the real reason why I need to strengthen the logic
>> of adjusting the memory encryption mask.
> 
> "persistent memory" is different, I think you meant about some reserved
> memory instead
> 
>>
>> If kexec-tools also used /sys/firmware/memmap for kdump (like kexec), the
>> kdump kernel could also work without a fix, but kexec-tools would have to be
>> modified. Are you sure that you want me to fix kexec-tools instead of the
>> kernel?
> 
> Yes, please fix kexec-tools to pass reserved ranges in e820, you will
> not need this patch then.
> 

This might be a kexec-tools bug; I have posted a patch for kexec-tools (please
check ke...@lists.infradead.org).
Thanks.

> Thanks
> Dave
> 

Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask

2018-09-05 Thread lijiang
On 2018-09-04 09:51, Dave Young wrote:
> On 09/04/18 at 09:29am, Dave Young wrote:
>> On 09/04/18 at 08:44am, Dave Young wrote:
>>> On 09/03/18 at 10:06pm, lijiang wrote:
>>>> On 2018-09-03 10:45, Dave Young wrote:
>>>>> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>>>>>> For kdump kernel, when SME is enabled, the acpi table and dmi table will 
>>>>>> need
>>>>>> to be remapped without the memory encryption mask. So we have to 
>>>>>> strengthen
>>>>>> the logic in early_memremap_pgprot_adjust(), which makes us have an 
>>>>>> opportunity
>>>>>> to adjust the memory encryption mask.
>>>>>>
>>>>>> Signed-off-by: Lianbo Jiang 
>>>>>> ---
>>>>>>  arch/x86/mm/ioremap.c | 9 -
>>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>>>>> index e01e6c695add..f9d9a39955f3 100644
>>>>>> --- a/arch/x86/mm/ioremap.c
>>>>>> +++ b/arch/x86/mm/ioremap.c
>>>>>> @@ -689,8 +689,15 @@ pgprot_t __init 
>>>>>> early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>>>>>  encrypted_prot = true;
>>>>>>  
>>>>>>  if (sme_active()) {
>>>>>> +/*
>>>>>> + * In kdump kernel, the acpi table and dmi table will 
>>>>>> need
>>>>>> + * to be remapped without the memory encryption mask. 
>>>>>> Here
>>>>>> + * we have to strengthen the logic to adjust the memory
>>>>>> + * encryption mask.
>>>>>
>>>>> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
>>>>> kernel, I'm not sure what is the difference, why need special handling
>>>>> for kdump. Can you add more explanations?
>>>>>
>>>>
>>>> Ok, i will use a dmi example to explain this issue.
>>>>
>>>> There are significant differences about E820 between the 1st kernel and 
>>>> kdump kernel. I pasted them at bottom.
>>>>
>>>> Firstly, we need to know how they are called.
>>>> __acpi_map_table() \                                                      / early_memremap_is_setup_data()
>>>>                     |-> early_memremap() -> early_memremap_pgprot_adjust() ->| memremap_is_efi_data()
>>>>  dmi_early_remap()  /                                                      \ memremap_should_map_decrypted() -> e820__get_entry_type()
>>>>
>>>> Secondly, we also need to understand the memremap_should_map_decrypted(), 
>>>> which is illustrated by the fake code.
>>>> static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>>>>   unsigned long size)
>>>> {
>>>>
>>>> /* code ... */
>>>>
>>>> switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
>>>> case E820_TYPE_RESERVED:
>>>> case E820_TYPE_ACPI:
>>>> case E820_TYPE_NVS:
>>>> case E820_TYPE_UNUSABLE:
>>>> /* For SEV, these areas are encrypted */
>>>> if (sev_active())
>>>> break;
>>>> /* Fallthrough */
>>>>
>>>> case E820_TYPE_PRAM:
>>>> /* For SME, these areas are decrypted */
>>>> return true;
>>>> default:
>>>> /* these areas are encrypted by default*/
>>>> break;
>>>> }
>>>>
>>>> return false;
>>>> }
>>>>
>>>> For the dmi case, the dmi base address is 0x6286b000 in my test machine.
>>>>
>>>> In the 1st kernel, the e820__get_entry_type() can get a valid entry and 
>>>> type by the dmi address, and we can also find the dmi base address from 
>>>> e820.
>>>> (see the 1st kernel log)
>>>> 0x6286b000 ∈ [mem 0x6286b000-0x6286efff]
>>>> So, these areas are decrypted according to the 
>>>> memremap_should_m

Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask

2018-09-03 Thread lijiang
On 2018-09-04 08:44, Dave Young wrote:
> On 09/03/18 at 10:06pm, lijiang wrote:
>> On 2018-09-03 10:45, Dave Young wrote:
>>> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>>>> For kdump kernel, when SME is enabled, the acpi table and dmi table will 
>>>> need
>>>> to be remapped without the memory encryption mask. So we have to strengthen
>>>> the logic in early_memremap_pgprot_adjust(), which makes us have an 
>>>> opportunity
>>>> to adjust the memory encryption mask.
>>>>
>>>> Signed-off-by: Lianbo Jiang 
>>>> ---
>>>>  arch/x86/mm/ioremap.c | 9 -
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>>> index e01e6c695add..f9d9a39955f3 100644
>>>> --- a/arch/x86/mm/ioremap.c
>>>> +++ b/arch/x86/mm/ioremap.c
>>>> @@ -689,8 +689,15 @@ pgprot_t __init 
>>>> early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>>>encrypted_prot = true;
>>>>  
>>>>if (sme_active()) {
>>>> +/*
>>>> + * In kdump kernel, the acpi table and dmi table will need
>>>> + * to be remapped without the memory encryption mask. Here
>>>> + * we have to strengthen the logic to adjust the memory
>>>> + * encryption mask.
>>>
>>> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
>>> kernel, I'm not sure what is the difference, why need special handling
>>> for kdump. Can you add more explanations?
>>>
>>
>> Ok, i will use a dmi example to explain this issue.
>>
>> There are significant differences about E820 between the 1st kernel and 
>> kdump kernel. I pasted them at bottom.
>>
>> Firstly, we need to know how they are called.
>> __acpi_map_table() \                                                      / early_memremap_is_setup_data()
>>                     |-> early_memremap() -> early_memremap_pgprot_adjust() ->| memremap_is_efi_data()
>>  dmi_early_remap()  /                                                      \ memremap_should_map_decrypted() -> e820__get_entry_type()
>>
>> Secondly, we also need to understand the memremap_should_map_decrypted(), 
>> which is illustrated by the fake code.
>> static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>>   unsigned long size)
>> {
>>
>> /* code ... */
>>
>> switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
>> case E820_TYPE_RESERVED:
>> case E820_TYPE_ACPI:
>> case E820_TYPE_NVS:
>> case E820_TYPE_UNUSABLE:
>> /* For SEV, these areas are encrypted */
>> if (sev_active())
>> break;
>> /* Fallthrough */
>>
>> case E820_TYPE_PRAM:
>> /* For SME, these areas are decrypted */
>> return true;
>> default:
>> /* these areas are encrypted by default*/
>> break;
>> }
>>
>> return false;
>> }
>>
>> For the dmi case, the dmi base address is 0x6286b000 in my test machine.
>>
>> In the 1st kernel, the e820__get_entry_type() can get a valid entry and type 
>> by the dmi address, and we can also find the dmi base address from e820.
>> (see the 1st kernel log)
>> 0x6286b000 ∈ [mem 0x6286b000-0x6286efff]
>> So, these areas are decrypted according to the 
>> memremap_should_map_decrypted().
>>
>> In kdump kernel, the dmi base address is still 0x6286b000, but we can not 
>> find the dmi base address from e820 any more. The e820__get_entry_type() can
>> not get a valid entry and type by the dmi base address, it will go into the 
>> default branch. That is to say, these areas become encrypted. In fact, these
>> areas are also decrypted, so we have to strengthen the logic of adjusting 
>> the memory encryption mask.
>>
>>
>> The 1st kernel log:
>>
>> [0.00] BIOS-provided physical RAM map:
>> [0.00] BIOS-e820: [mem 0x-0x0008bfff] usable
>> [0.00] BIOS-e820: [mem 0x0008c000-0x0009] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x000e-0x000f] 
>> reserved
>>

Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask

2018-09-03 Thread lijiang
On 2018-09-03 10:45, Dave Young wrote:
> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
>> to be remapped without the memory encryption mask. So we have to strengthen
>> the logic in early_memremap_pgprot_adjust(), which makes us have an 
>> opportunity
>> to adjust the memory encryption mask.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>>  arch/x86/mm/ioremap.c | 9 -
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index e01e6c695add..f9d9a39955f3 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -689,8 +689,15 @@ pgprot_t __init 
>> early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>  encrypted_prot = true;
>>  
>>  if (sme_active()) {
>> +/*
>> + * In kdump kernel, the acpi table and dmi table will need
>> + * to be remapped without the memory encryption mask. Here
>> + * we have to strengthen the logic to adjust the memory
>> + * encryption mask.
> 
> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
> kernel, I'm not sure what is the difference, why need special handling
> for kdump. Can you add more explanations?
> 

Ok, i will use a dmi example to explain this issue.

There are significant differences about E820 between the 1st kernel and kdump 
kernel. I pasted them at bottom.

Firstly, we need to know how they are called.
__acpi_map_table() \                                                      / early_memremap_is_setup_data()
                    |-> early_memremap() -> early_memremap_pgprot_adjust() ->| memremap_is_efi_data()
 dmi_early_remap()  /                                                      \ memremap_should_map_decrypted() -> e820__get_entry_type()

Secondly, we also need to understand the memremap_should_map_decrypted(), which 
is illustrated by the fake code.
static bool memremap_should_map_decrypted(resource_size_t phys_addr,
  unsigned long size)
{

/* code ... */

switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
case E820_TYPE_RESERVED:
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
/* For SEV, these areas are encrypted */
if (sev_active())
break;
/* Fallthrough */

case E820_TYPE_PRAM:
/* For SME, these areas are decrypted */
return true;
default:
/* these areas are encrypted by default*/
break;
}

return false;
}
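
For context, the surrounding logic in early_memremap_pgprot_adjust()
(arch/x86/mm/ioremap.c) is roughly the following in this era; this is a
simplified sketch, not the exact source, but it shows where the decision above
feeds in:

pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
					     unsigned long size, pgprot_t prot)
{
	bool encrypted_prot = true;

	if (!sme_active())
		return prot;

	/* Boot data handed over by the bootloader/firmware is not encrypted */
	if (early_memremap_is_setup_data(phys_addr, size) ||
	    memremap_is_efi_data(phys_addr, size))
		encrypted_prot = false;

	/* The e820-based decision illustrated by the fake code above */
	if (encrypted_prot && memremap_should_map_decrypted(phys_addr, size))
		encrypted_prot = false;

	return encrypted_prot ? pgprot_encrypted(prot) : pgprot_decrypted(prot);
}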

For the dmi case, the dmi base address is 0x6286b000 in my test machine.

In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by 
the dmi address, and we can also find the dmi base address from e820.
(see the 1st kernel log)
0x6286b000 ∈ [mem 0x6286b000-0x6286efff]
So, these areas are decrypted according to the memremap_should_map_decrypted().

In kdump kernel, the dmi base address is still 0x6286b000, but we can not find 
the dmi base address from e820 any more. The e820__get_entry_type() can
not get a valid entry and type by the dmi base address, it will go into the 
default branch. That is to say, these areas become encrypted. In fact, these
areas are also decrypted, so we have to strengthen the logic of adjusting the 
memory encryption mask.


The 1st kernel log:

[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0008bfff] usable
[0.00] BIOS-e820: [mem 0x0008c000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x29920fff] usable
[0.00] BIOS-e820: [mem 0x29921000-0x29921fff] reserved
[0.00] BIOS-e820: [mem 0x29922000-0x62256fff] usable
[0.00] BIOS-e820: [mem 0x62257000-0x62356fff] reserved
[0.00] BIOS-e820: [mem 0x62357000-0x6235cfff] ACPI data
[0.00] BIOS-e820: [mem 0x6235d000-0x623dbfff] usable
[0.00] BIOS-e820: [mem 0x623dc000-0x6261bfff] reserved
[0.00] BIOS-e820: [mem 0x6261c000-0x6263dfff] usable
[0.00] BIOS-e820: [mem 0x6263e000-0x6269dfff] reserved
[0.00] BIOS-e820: [mem 0x6269e000-0x627d6fff] usable
[0.00] BIOS-e820: [mem 0x627d7000-0x627e3fff] ACPI data
[0.00] BIOS-e820: [mem 0x627e4000-0x627e4fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x627e5000-0x627e8fff] ACPI data
[0.00] BIOS-e820: [mem 0x627e9000-0x627eafff] usable
[0.00] BIOS-e820: [mem 

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-08-15 Thread lijiang
On 2018-07-20 18:08, Boris Petkov wrote:
> On July 20, 2018 12:55:04 PM GMT+03:00, lijiang wrote:
>> Thanks for your advice, I will rewrite the log and send them again.
> 
> Do not send them again - explain the problem properly first!
> 
I have compared two solutions for handling the encrypted memory; solution A is
mine, solution B is Boris's.
Here we only talk about the case where SME is active in the first kernel, and we
only care about it being active in the kdump kernel too. For solution A and
solution B, these four cases are the same.
a. dump vmcore
   It is encrypted in the first kernel, and needs to be read out in the kdump kernel.
b. crash notes
   When dumping the vmcore, people usually need to read useful information from
   the notes, and the notes are also encrypted.
c. iommu device table
   It is allocated by the kernel, and its pointer needs to be filled into the MMIO
   of the AMD IOMMU. It is encrypted in the first kernel, and we need to read the
   old content to analyze it and get useful information.
d. mmio of amd iommu
   Registers reported by the AMD firmware; they are not RAM, and we don't encrypt
   them in either the first kernel or the kdump kernel.

Solution A:
   1. add a new bool parameter "encrypted" to __ioremap_caller()
      It is a low level function; it checks the newly added parameter, and if it's
      true and we are in the kdump kernel, it will remap the memory with the SME mask.
   2. add a new function ioremap_encrypted() that explicitly passes in a "true"
      value for "encrypted". For the above a, b, c, we will call ioremap_encrypted();
   3. adjust all existing ioremap wrapper functions, passing in "false" for
      "encrypted" so that they behave as before.

  ioremap_encrypted() \
  ioremap_cache()      |
  ioremap_prot()       |
  ioremap_wt()         |-> __ioremap_caller()
  ioremap_wc()         |
  ioremap_uc()         |
  ioremap_nocache()   /
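
A minimal sketch of what solution A's new wrapper might look like (illustrative;
the posted patch may differ in detail):

/* Map the range as encrypted, e.g. for old, SME-encrypted memory in kdump */
void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
{
	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
				__builtin_return_address(0), true);
}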

Solution B (Boris's suggestion):
   1. no need to add a new function (ioremap_encrypted); check whether the old
      memory is encrypted by comparing the address.
   2. add new functions to check whether the old memory is encrypted, one per case:
      a. dump vmcore
         bool addr_is_old_memory(unsigned long addr)
      b. crash notes
         bool addr_is_crash_notes(unsigned long addr)
      c. iommu device table
         bool addr_is_iommu_device_table(unsigned long addr)
   3. when remapping the memory, call all of the new functions and check whether
      the memory is encrypted in __ioremap_caller().
      __ioremap_caller() -> __ioremap_compute_prot() -> /-> addr_is_crash_notes()
                                                        |-> addr_is_iommu_device_table()
                                                        |-> addr_is_old_memory()
                                                        \-> ...
   For solution B, I made a draft patch for demonstration and pasted it at the
   bottom.


Conclusion:

  Solution A:
    advantages:
      1) It's very simple and very clean, with fewer code changes;
      2) It's easier to understand.
    disadvantages:
      1) Need to change the interface of __ioremap_caller() and adjust its
         wrapper functions;
      2) Need to call ioremap_encrypted() explicitly for vmcore/crash notes/
         device table reading.

  Solution B:
    advantages:
      1) No need to touch the interface;
      2) Automatically detects and handles it inside __ioremap_caller().
    disadvantages:
      1) Need to add a function for each kind of encrypted old memory reading;
      2) In the future, we would need to add these kinds of functions for Intel
         MKTME too;
      3) It's more complicated, with more code changes.

I personally prefer solution A, which is presented in the posted patches.
What do you think, Boris? And other reviewers?

Thanks,
Lianbo




Solution B is described by the following pseudo-code, for example:

   modified file: drivers/iommu/amd_iommu_init.c
   inline bool addr_is_iommu_device_table(unsigned long addr) {
struct amd_iommu *iommu;

/* search the addr */
for_each_iommu(iommu) {
lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
entry = (((u64) hi) << 32) + lo;
/* check whether it is an iommu device table address */
if (addr == entry) {
return true;
}
}
return false;
   }

   modified file: fs/proc/vmcore.c
   inline bool addr_is_crash_notes(unsigned long addr) {
   Elf64_Ehdr ehdr;
   
   /* code */

   rc = elfcorehdr_read((char *)&ehdr, sizeof(Elf64_Ehdr), &elfcorehdr_addr);
   elfcorebuf_sz_orig = sizeof(Elf64_Ehdr) + ehdr.e_phnum * sizeof(Elf64_Phdr);
   rc = elfcorehdr_read(elfcorebuf, elfcorebuf_sz_orig, &elfcorehdr_addr);
   ehdr_ptr = (Elf64_Ehdr*)(elfcorebuf + 1);

   /* search the addr */
   for (i = 0; i < ehdr_ptr->e_p
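
(The pseudo-code above is truncated in the archive.) A sketch of how solution B's
step 3 could tie these helpers together inside __ioremap_caller(); the helper
names come from the draft above and the whole function is hypothetical:

static pgprot_t __ioremap_compute_prot(resource_size_t phys_addr, pgprot_t prot)
{
	if (!sme_active() || !is_kdump_kernel())
		return prot;

	/* Old, encrypted memory of the first kernel must be mapped encrypted */
	if (addr_is_crash_notes(phys_addr) ||
	    addr_is_iommu_device_table(phys_addr) ||
	    addr_is_old_memory(phys_addr))
		return pgprot_encrypted(prot);

	/* Everything else (e.g. MMIO) keeps the caller's protection */
	return prot;
}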

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-20 Thread lijiang
On 2018-07-20 15:32, Borislav Petkov wrote:
> On Fri, Jul 20, 2018 at 01:23:04PM +0800, Dave Young wrote:
>>> Here, it doesn't need to dump MMIO space of the previous kernel, when the
>>> kdump kernel boot, the MMIO address will be remapped in decryption manners,
>>> but the MMIO address don't belong to the range of the crash reserved memory,
>>> for the kdump kernel, the MMIO space(address) and IOMMU device 
>>> table(address)
>>> are outside address, whereas, the IOMMU device table is encrypted in the 
>>> first
>>> kernel, the kdump kernel will need to copy the content of IOMMU device table
>>> from the first kernel when the kdump kernel boot, so the IOMMU device table 
>>> will
>>> be remapped in encryption manners.
> 
> -ENOFCKINGPARSE
> 
> I believe you're the only one who understands that humongous sentence.

Sorry for this.

> How about using a fullstop from time to time. And WTF is "encryption
> manners"?
> 
>>> So some of them require to be remapped in encryption manners, and 
>>> some(address)
>>> require to be remapped in decryption manners.
> 
>> There could be some misunderstanding here.
> 
> Hell yeah there's a misunderstanding!
> 
> Can you folks first relax, sit down and explain the whole problem in
> *plain* English using *simple* sentences. *Not* like the unparseable
> mess above. Use simple, declaratory sentences and don't even try to
> sound fancy:
> 
> "The first kernel boots. It's memory is encrypted... Now, the second
> kernel boots. It must do A because of B. In order to do A, it needs to
> do C. Because D..."
> 
> And so on. Explain what the problem is first. Then explain the proposed
> solution. Explain *why* it needs to be done this way.
> 
> When you've written your explanation, try to read it as someone who
> doesn't know kdump and *think* hard whether your explanation makes
> sense. If it doesn't, fix it and read it again. Rinse and repeat. Until
> it is clear to unenlightened readers too.
> 
Thanks for your advice, I will rewrite the log and send them again.

> It is about time this hurried throwing of half-baked patches at
> maintainers and seeing what sticks, stops!
> 

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-19 Thread lijiang
On 2018-07-14 01:08, Borislav Petkov wrote:
> On Mon, Jul 09, 2018 at 09:55:35PM +0800, lijiang wrote:
>> About this issue, i want to use an example to describe it.
>> /* drivers/iommu/amd_iommu_init.c */
>> static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
> 
> Those addresses come from the IVHD header which is an ACPI table. So the
> dump kernel can find that out too.

Sure, I see what you mean: we would have to find all of those addresses in order
to cover every case in the kdump kernel; those addresses might include MMIO
space, HPET, ACPI device table, ERST, and so on...

>> Obviously, the iommu mmio space is not encrypted, and the device
>> mmio space is outside kdump kernel. We know that the old memory is
>> encrypted, and the old memory is also outside kdump kernel. For the
>> current case, e820__get_entry_type() and walk_iomem_res_desc() can't
>> get the desired result, so we can't also decide whether encryption
>> or not according to this result(rules). If we want to know whether
>> encryption or not by deducing the address, we will need to read the
>> content of memory and have a reference value for comparison, then
>> what's a reference value? Sometimes we don't know that.
> 
> Again, if we don't know that how is the *caller* supposed to know
> whether the memory is encrypted or not? Because
> 
> "we" == "caller"
> 
> in the kdump kernel.
> 
> And the more important question is, why are we dumping MMIO space of the
> previous kernel *at* *all*? That doesn't make any sense to me.
> 
Sorry for my late reply.
Here, it doesn't need to dump MMIO space of the previous kernel, when the
kdump kernel boot, the MMIO address will be remapped in decryption manners,
but the MMIO address don't belong to the range of the crash reserved memory,
for the kdump kernel, the MMIO space(address) and IOMMU device table(address)
are outside address, whereas, the IOMMU device table is encrypted in the first
kernel, the kdump kernel will need to copy the content of IOMMU device table
from the first kernel when the kdump kernel boot, so the IOMMU device table will
be remapped in encryption manners.
So some of them require to be remapped in encryption manners, and some(address)
require to be remapped in decryption manners.


Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-09 Thread lijiang
On 2018-07-09 17:29, Borislav Petkov wrote:
> On Mon, Jul 09, 2018 at 02:28:11PM +0800, lijiang wrote:
>> Last week, I had tried many ways to do this work, but it looks
>> like that the ways of deducing address is not suitable to another
>> scenarios, such as mapping some devices mmio space, which are
>> unencrypted, and the device mmio space is outside kdump kernel like
>> the old memory, but the old memory is encrypted, we can't find the
>> general rules to decide when to encrypt or not.
> 
> If we can't find the "general rules" how is the caller supposed to know
> how to access the memory?
> 
About this issue, I want to use an example to describe it.
/* drivers/iommu/amd_iommu_init.c */
static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
{
if (!request_mem_region(address, end, "amd_iommu")) {
pr_err("AMD-Vi: Can not reserve memory region %llx-%llx for 
mmio\n",
address, end);
pr_err("AMD-Vi: This is a BIOS bug. Please contact your 
hardware vendor\n");
return NULL;
}

return (u8 __iomem *)ioremap_nocache(address, end);
}

[root@hp-dl385g10-03 linux]# cat /proc/iomem |grep -i crash
  7fa00-879ff : Crash kernel
[root@hp-dl385g10-03 linux]# cat /proc/iomem |grep amd_iommu
  9c80-9c87 : amd_iommu
  a980-a987 : amd_iommu
  b680-b687 : amd_iommu
  c380-c387 : amd_iommu
  d080-d087 : amd_iommu
  dd80-dd87 : amd_iommu
  ea80-ea87 : amd_iommu
  fd80-fd87 : amd_iommu

Obviously, the iommu mmio space is not encrypted, and the device mmio space is
outside the kdump kernel. We know that the old memory is encrypted, and the old
memory is also outside the kdump kernel. For the current case,
e820__get_entry_type() and walk_iomem_res_desc() can't get the desired result,
so we also can't decide whether to encrypt or not according to that result
(those rules). If we want to know whether to encrypt or not by deducing it from
the address, we will need to read the content of the memory and have a reference
value for comparison; then what would the reference value be? Sometimes we don't
know that.
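
To make that concrete, even an obvious candidate rule such as "is the address
inside the crash kernel reserved region" does not help, because both the
encrypted old memory and the unencrypted amd_iommu MMIO ranges shown above lie
outside that region. A hypothetical sketch of such a check, for illustration
only:

static bool addr_is_crash_reserved(resource_size_t addr)
{
	return addr >= crashk_res.start && addr <= crashk_res.end;
}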

Regards,
Lianbo

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-09 Thread lijiang
On 2018-07-03 19:44, lijiang wrote:
> On 2018-07-03 19:14, Borislav Petkov wrote:
>> On Tue, Jul 03, 2018 at 06:58:14PM +0800, lijiang wrote:
>>> For kdump, the elf header finally use the crash kernel reserved memory, it 
>>> is not an old memory.
>>
>> Lamme repeat my suggestion:
>>
>> So beef up the logic in __ioremap_caller() to figure out based on the
>> address whether to access the memory encrypted or not. In general, you
>> can deduce, based on the region you're mapping, whether you need to map
>> in encrypted or decrypted.
>>
>> For example:
>>
>> addr = elfcorehdr_addr;
>>
>> /* Read Elf header */
>> rc = elfcorehdr_read((char *)&ehdr, sizeof(Elf64_Ehdr), &addr);
>> if (rc < 0)
>> return rc;
>>
>> elfcorehdr_addr has that elfcorehdr address. So you can check which address
>> you're mapping and do:
>>
>> __ioremap_caller:
>>
>> ...
>>  prot = __ioremap_compute_prot(...);
>>
>> and that __ioremap_compute_prot() function which you will add will have
>> all that logic to determine encrypted or not by comparing addresses etc.
>>
>> Does that make more sense?
>>
> Thank you, Boris. Good idea, I will rethink about this issue and post it 
> again.
> 
Hi, Boris
Last week I tried many ways to do this work, but it looks like the approach of
deducing it from the address is not suitable for other scenarios, such as
mapping some devices' mmio space, which is unencrypted. The device mmio space is
outside the kdump kernel like the old memory, but the old memory is encrypted,
and we can't find a general rule to decide when to encrypt or not. So the best
way seems to be that the caller decides whether it cares about encryption or
not. What do you think?

> Regards,
> Lianbo
> 

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-03 Thread lijiang
On 2018-07-03 19:14, Borislav Petkov wrote:
> On Tue, Jul 03, 2018 at 06:58:14PM +0800, lijiang wrote:
>> For kdump, the elf header finally use the crash kernel reserved memory, it 
>> is not an old memory.
> 
> Lamme repeat my suggestion:
> 
> So beef up the logic in __ioremap_caller() to figure out based on the
> address whether to access the memory encrypted or not. In general, you
> can deduce, based on the region you're mapping, whether you need to map
> in encrypted or decrypted.
> 
> For example:
> 
> addr = elfcorehdr_addr;
> 
> /* Read Elf header */
> rc = elfcorehdr_read((char *)&ehdr, sizeof(Elf64_Ehdr), &addr);
> if (rc < 0)
> return rc;
> 
> elfcorehdr_addr has that elfcorehdr address. So you can check which address
> you're mapping and do:
> 
> __ioremap_caller:
> 
> ...
>   prot = __ioremap_compute_prot(...);
> 
> and that __ioremap_compute_prot() function which you will add will have
> all that logic to determine encrypted or not by comparing addresses etc.
> 
> Does that make more sense?
> 
Thank you, Boris. Good idea; I will rethink this issue and post it again.

Regards,
Lianbo

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-03 Thread lijiang
On 2018-07-03 17:39, Borislav Petkov wrote:
> On Tue, Jul 03, 2018 at 10:17:19AM +0800, lijiang wrote:
>> for example, the elfcorehdr. In fact, the elfcorehdr and notes
> 
> You mean this?
> 
>  ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
>  {
> -   return read_from_oldmem(buf, count, ppos, 0);
> +   return read_from_oldmem(buf, count, ppos, 0, sme_active());
> 
> That looks encrypted to me.
> 
The elf notes are in old memory, so they are encrypted.

But the elf header is in crash kernel reserved memory, which is unencrypted.
 ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-   return read_from_oldmem(buf, count, ppos, 0);
+   return read_from_oldmem(buf, count, ppos, 0, false);
 }

They call the same function (read_from_oldmem -> ioremap_cache), so it is hard
to properly remap the memory if we don't use the parameter to distinguish them.
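
A simplified sketch of the idea, assuming an "encrypted" flag is threaded down
the shared read path (remap_oldmem_page() is a hypothetical helper name;
ioremap_encrypted() is the new function proposed in this series):

static void __iomem *remap_oldmem_page(unsigned long pfn, bool encrypted)
{
	/* ELF notes: old, encrypted memory of the first kernel */
	if (encrypted)
		return ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE);

	/* ELF header: crash-reserved, unencrypted memory */
	return ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
}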

Regards,
Lianbo
>> call the same function(read_from_oldmem->ioremap_cache), in this case,
>> it is very difficult to properly remap the memory if the caller don't
>> care whether the memory is encrypted.
> 
> So beef up the logic in __ioremap_caller() to figure out based on the
> address whether to access the memory encrypted or not. You can find out
> the elfcorehdr address in the capture kernel.
> 

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-03 Thread lijiang
On 2018-07-03 10:17, lijiang wrote:
> On 2018-07-02 18:14, Borislav Petkov wrote:
>> On Mon, Jul 02, 2018 at 03:26:35PM +0800, Lianbo Jiang wrote:
>>> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
>>> unsigned long size,
>>>   * caller shouldn't need to know that small detail.
>>>   */
>>>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>> -   unsigned long size, enum page_cache_mode pcm, void *caller)
>>> +   unsigned long size, enum page_cache_mode pcm,
>>> +   void *caller, bool encrypted)
>>
>> So instead of sprinkling that @encrypted argument everywhere and then
>> setting it based on sme_active() ...
>>
>>>  {
>>> unsigned long offset, vaddr;
>>> resource_size_t last_addr;
>>> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t 
>>> phys_addr,
>>>  * resulting mapping.
>>>  */
>>> prot = PAGE_KERNEL_IO;
>>> -   if (sev_active() && mem_flags.desc_other)
>>> +   if ((sev_active() && mem_flags.desc_other) || encrypted)
>>
>> ... why can't you simply do your checks:
>>
>>  sme_active() && is_kdump_kernel()
>>
>> here so that __ioremap_caller() can automatically choose the proper
>> pgprot value when ioremapping the memory in the kdump kernel?
>>
>> And this way the callers don't even have to care whether the memory is
>> encrypted or not?
>>
> Thank you, Boris. I'm very glad to read your comments. That's a good idea, 
> but it has some
> unencrypted memory in kdump mode, for example, the elfcorehdr. In fact, the 
> elfcorehdr and
> notes call the same function(read_from_oldmem->ioremap_cache), in this case, 
> it is very
> difficult to properly remap the memory if the caller don't care whether the 
> memory is encrypted.
> 
Hi, Boris,

Let me describe this in more detail.

For kdump, the elf header ends up in the crash kernel reserved memory; it is
not old memory. When we load the crash kernel image, kexec-tools copies the
elf header from an encrypted memory region to an unencrypted memory region.
Since a page that is marked encrypted is automatically decrypted when read
from DRAM while SME is enabled, this copy produces the plaintext, that is,
the elf header becomes unencrypted data. So we need to remap the elfcorehdr
unencrypted in the kdump kernel, while the elf notes live in old memory and
remain encrypted. That is the real reason why we need the two different
functions (ioremap_cache/ioremap_encrypted).

If the elf core header were instead kept encrypted, we would need to know
which part of memory is allocated for the elfcorehdr in kernel space and
change its page attributes, which would probably also require changes to
more common code and to kexec-tools.
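
A minimal sketch of what that means at the mapping level in the kdump kernel
(the helper name is mine, only for illustration; the real patches open-code
this in the elfcorehdr and notes read paths):

static void __iomem *kdump_map_page(unsigned long pfn, bool encrypted)
{
	/*
	 * Old memory (e.g. the ELF notes) still carries the C-bit, so map
	 * it encrypted and let the hardware decrypt it on read.  The
	 * elfcorehdr sits in crash kernel reserved memory and is already
	 * plaintext, so a plain cached mapping is the right choice.
	 */
	return encrypted ? ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE)
			 : ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
}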

If you agree with this explanation, I will add it to the patch log and post
the series again.
Welcome to review my patches again.

Thanks.
Lianbo

> Regards,
> Lianbo>>> prot = pgprot_encrypted(prot);
>>>  
>>> switch (pcm) {
>>

Re: [PATCH 1/5 V5] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-07-02 Thread lijiang
On 2018-07-02 18:14, Borislav Petkov wrote:
> On Mon, Jul 02, 2018 at 03:26:35PM +0800, Lianbo Jiang wrote:
>> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
>> unsigned long size,
>>   * caller shouldn't need to know that small detail.
>>   */
>>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>> -unsigned long size, enum page_cache_mode pcm, void *caller)
>> +unsigned long size, enum page_cache_mode pcm,
>> +void *caller, bool encrypted)
> 
> So instead of sprinkling that @encrypted argument everywhere and then
> setting it based on sme_active() ...
> 
>>  {
>>  unsigned long offset, vaddr;
>>  resource_size_t last_addr;
>> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t 
>> phys_addr,
>>   * resulting mapping.
>>   */
>>  prot = PAGE_KERNEL_IO;
>> -if (sev_active() && mem_flags.desc_other)
>> +if ((sev_active() && mem_flags.desc_other) || encrypted)
> 
> ... why can't you simply do your checks:
> 
>   sme_active() && is_kdump_kernel()
> 
> here so that __ioremap_caller() can automatically choose the proper
> pgprot value when ioremapping the memory in the kdump kernel?
> 
> And this way the callers don't even have to care whether the memory is
> encrypted or not?
> 
Thank you, Boris. I'm very glad to read your comments. That's a good idea, but
there is some unencrypted memory in kdump mode, for example the elfcorehdr. In
fact, the elfcorehdr and the notes go through the same function
(read_from_oldmem -> ioremap_cache), so it is very difficult to remap the
memory properly if the caller doesn't care whether the memory is encrypted.

Regards,
Lianbo
>>  prot = pgprot_encrypted(prot);
>>  
>>  switch (pcm) {
> 

Re: [PATCH 2/4 V3] Allocate pages for kdump without encryption when SME is enabled

2018-06-21 Thread lijiang
On 2018-06-21 18:23, Baoquan He wrote:
> On 06/21/18 at 01:06pm, lijiang wrote:
>> On 2018-06-21 09:53, Baoquan He wrote:
>>> On 06/16/18 at 04:27pm, Lianbo Jiang wrote:
>>>> When SME is enabled in the first kernel, we will allocate pages
>>>> for kdump without encryption in order to be able to boot the
>>>> second kernel in the same manner as kexec, which helps to keep
>>>> the same code style.
>>>>
>>>> Signed-off-by: Lianbo Jiang 
>>>> ---
>>>>  kernel/kexec_core.c | 12 
>>>>  1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>>>> index 20fef1a..3c22a9b 100644
>>>> --- a/kernel/kexec_core.c
>>>> +++ b/kernel/kexec_core.c
>>>> @@ -471,6 +471,16 @@ static struct page 
>>>> *kimage_alloc_crash_control_pages(struct kimage *image,
>>>>}
>>>>}
>>>>  
>>>> +  if (pages) {
>>>> +  unsigned int count, i;
>>>> +
>>>> +  pages->mapping = NULL;
>>>> +  set_page_private(pages, order);
>>>> +  count = 1 << order;
>>>> +  for (i = 0; i < count; i++)
>>>> +  SetPageReserved(pages + i);
>>>
>>> I guess you might imitate the kexec case, however kexec get pages from
>>> buddy. Crash pages are reserved in memblock, these codes might make no 
>>> sense.
>>>
>> Thanks for your comments.
>> We have changed the attribute of pages, so the original attribute of pages 
>> will be
>> restored when they free.
> 
> Hmm, you can check what kimage_free() is doing, and where
> kimage->control_pages, dest_pages, unusable_pages is assigned. Do you
> know where these original attribute of pages comes from and they are
> used/needed in CRASH case, if you care about them?
> 
Originally, we wanted to keep an opportunity to restore the previous attributes
of the pages, which would be nicer if the pages were remembered in
'image->control_pages'. Removing this code is harmless for kdump, but it would
look strange, and someone might ask where the previous page attributes are
supposed to be restored.
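
For context, the page attribute in question is the encryption attribute. As
far as I remember, the x86 hooks are roughly the following (quoted from
memory, please double-check against the tree), which is why the post-alloc
call is paired with a pre-free call:

int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
{
	/* Make the kexec/kdump control pages unencrypted so the new kernel
	 * can access them before it sets up its own mappings. */
	return set_memory_decrypted((unsigned long)vaddr, pages);
}

void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
{
	/* Restore the encrypted mapping before handing the pages back. */
	set_memory_encrypted((unsigned long)vaddr, pages);
}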

Thanks.
>>
>>>> +  arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
>>>> +  }
>>>>return pages;
>>>>  }
>>>>  
>>>> @@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage 
>>>> *image,
>>>>result  = -ENOMEM;
>>>>goto out;
>>>>}
>>>> +  arch_kexec_post_alloc_pages(page_address(page), 1, 0);
>>>>ptr = kmap(page);
>>>>ptr += maddr & ~PAGE_MASK;
>>>>mchunk = min_t(size_t, mbytes,
>>>> @@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage 
>>>> *image,
>>>>result = copy_from_user(ptr, buf, uchunk);
>>>>kexec_flush_icache_page(page);
>>>>kunmap(page);
>>>> +  arch_kexec_pre_free_pages(page_address(page), 1);
>>>>if (result) {
>>>>result = -EFAULT;
>>>>goto out;
>>>> -- 
>>>> 2.9.5
>>>>
>>>>
>>>> ___
>>>> kexec mailing list
>>>> ke...@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/kexec

Re: [PATCH 3/4 V3] Remap the device table of IOMMU in encrypted manner for kdump

2018-06-21 Thread lijiang
On 2018-06-21 16:39, Baoquan He wrote:
> On 06/21/18 at 01:42pm, lijiang wrote:
>> On 2018-06-21 00:42, Tom Lendacky wrote:
>>> On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
>>>> In kdump mode, it will copy the device table of IOMMU from the old
>>>> device table, which is encrypted when SME is enabled in the first
>>>> kernel. So we must remap it in encrypted manner in order to be
>>>> automatically decrypted when we read.
>>>>
>>>> Signed-off-by: Lianbo Jiang 
>>>> ---
>>>> Some changes:
>>>> 1. add some comments
>>>> 2. clean compile warning.
>>>>
>>>>  drivers/iommu/amd_iommu_init.c | 15 ++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/amd_iommu_init.c 
>>>> b/drivers/iommu/amd_iommu_init.c
>>>> index 904c575..a20af4c 100644
>>>> --- a/drivers/iommu/amd_iommu_init.c
>>>> +++ b/drivers/iommu/amd_iommu_init.c
>>>> @@ -889,11 +889,24 @@ static bool copy_device_table(void)
>>>>}
>>>>  
>>>>old_devtb_phys = entry & PAGE_MASK;
>>>> +
>>>> +  /*
>>>> +   *  When sme enable in the first kernel, old_devtb_phys includes the
>>>> +   *  memory encryption mask(sme_me_mask), we must remove the memory
>>>> +   *  encryption mask to obtain the true physical address in kdump mode.
>>>> +   */
>>>> +  if (mem_encrypt_active() && is_kdump_kernel())
>>>> +  old_devtb_phys = __sme_clr(old_devtb_phys);
>>>> +
>>>
>>> You can probably just use "if (is_kdump_kernel())" here, since memory
>>> encryption is either on in both the first and second kernel or off in
>>> both the first and second kernel.  At which point __sme_clr() will do
>>> the proper thing.
>>>
>>> Actually, this needs to be done no matter what.  When doing either the
>>> ioremap_encrypted() or the memremap(), the physical address should not
>>> include the encryption bit/mask.
>>>
>>> Thanks,
>>> Tom
>>>
>> Thanks for your comments. If we don't remove the memory encryption mask, it 
>> will
>> return false because the 'old_devtb_phys >= 0x1ULL' may become true.
> 
> Lianbo, you may not get what Tom suggested. Tom means no matter what it
> is, encrypted or not in 1st kernel, we need get pure physicall address,
> and using below code is always right for both cases.
> 
>   if (is_kdump_kernel())
>   old_devtb_phys = __sme_clr(old_devtb_phys);
> 
Thank you, Baoquan. I understood what Tom said. I just want to explain why
removing the memory encryption mask is necessary before the check that
old_devtb_phys is below 4G.
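
A small sketch of what goes wrong without it (the helper name is made up,
only to illustrate the point; __sme_clr(x) is simply (x) & ~sme_me_mask):

static bool old_devtb_below_4g(u64 entry)
{
	/*
	 * With SME enabled in the first kernel, the old device table base
	 * address still has the C-bit set (bit 47 on these EPYC parts), so
	 * the raw value always compares as an address far above 4G even
	 * though the real physical address is fine.  Strip the mask before
	 * the range check.
	 */
	u64 old_devtb_phys = __sme_clr(entry & PAGE_MASK);

	return old_devtb_phys < 0x100000000ULL;
}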

Lianbo
> And this is simpler. You even can add one line of code comment to say
> like "Physical address w/o encryption mask is needed here."
>>
>> Lianbo
>>>>if (old_devtb_phys >= 0x1ULL) {
>>>>pr_err("The address of old device table is above 4G, not 
>>>> trustworthy!\n");
>>>>return false;
>>>>}
>>>> -  old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>>>> +  old_devtb = (mem_encrypt_active() && is_kdump_kernel())
>>>> +  ? (__force void *)ioremap_encrypted(old_devtb_phys,
>>>> +  dev_table_size)
>>>> +  : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);> +
>>>>if (!old_devtb)
>>>>return false;
>>>>  
>>>>
>>
>> ___
>> kexec mailing list
>> ke...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec

Re: [PATCH 1/4 V3] Add a function(ioremap_encrypted) for kdump when AMD sme enabled

2018-06-21 Thread lijiang
On 2018-06-21 00:00, Tom Lendacky wrote:
> On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
>> It is convenient to remap the old memory encrypted to the second
>> kernel by calling ioremap_encrypted().
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>> Some changes:
>> 1. remove the sme_active() check in __ioremap_caller().
>> 2. put some logic into the early_memremap_pgprot_adjust() for
>> early memremap.
>>
>>  arch/x86/include/asm/io.h |  3 +++
>>  arch/x86/mm/ioremap.c | 28 
>>  2 files changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
>> index f6e5b93..989d60b 100644
>> --- a/arch/x86/include/asm/io.h
>> +++ b/arch/x86/include/asm/io.h
>> @@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t 
>> offset, unsigned long size);
>>  #define ioremap_cache ioremap_cache
>>  extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long 
>> size, unsigned long prot_val);
>>  #define ioremap_prot ioremap_prot
>> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
>> +unsigned long size);
>> +#define ioremap_encrypted ioremap_encrypted
>>  
>>  /**
>>   * ioremap -   map bus memory into CPU space
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index c63a545..e365fc4 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -24,6 +24,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include "physaddr.h"
>>  
>> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
>> unsigned long size,
>>   * caller shouldn't need to know that small detail.
>>   */
>>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>> -unsigned long size, enum page_cache_mode pcm, void *caller)
>> +unsigned long size, enum page_cache_mode pcm,
>> +void *caller, bool encrypted)
>>  {
>>  unsigned long offset, vaddr;
>>  resource_size_t last_addr;
>> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t 
>> phys_addr,
>>   * resulting mapping.
>>   */
>>  prot = PAGE_KERNEL_IO;
>> -if (sev_active() && mem_flags.desc_other)
>> +if ((sev_active() && mem_flags.desc_other) || encrypted)
>>  prot = pgprot_encrypted(prot);
>>  
>>  switch (pcm) {
>> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, 
>> unsigned long size)
>>  enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>>  
>>  return __ioremap_caller(phys_addr, size, pcm,
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL(ioremap_nocache);
>>  
>> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, 
>> unsigned long size)
>>  enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>>  
>>  return __ioremap_caller(phys_addr, size, pcm,
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL_GPL(ioremap_uc);
>>  
>> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>>  void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>>  {
>>  return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL(ioremap_wc);
>>  
>> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>>  void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>>  {
>>  return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL(ioremap_wt);
>>  
>> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long 
>> size)
>> +{
>> +return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>> +__builtin_return_address(0), true);
>> +}
>> +EXPORT_SYMBOL(ioremap_encrypted);
>> +
>>  void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>>  {
>>  return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL(ioremap_cache);
>>  
>> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, 
>> unsigned long size,
>>  {
>>  return __ioremap_caller(phys_addr, size,
>>  pgprot2cachemode(__pgprot(prot_val)),
>> -__builtin_return_address(0));
>> +__builtin_return_address(0), false);
>>  }
>>  EXPORT_SYMBOL(ioremap_prot);
>>  
>> @@ -688,6 

Re: [PATCH 3/4 V3] Remap the device table of IOMMU in encrypted manner for kdump

2018-06-20 Thread lijiang
On 2018-06-21 00:42, Tom Lendacky wrote:
> On 6/16/2018 3:27 AM, Lianbo Jiang wrote:
>> In kdump mode, it will copy the device table of IOMMU from the old
>> device table, which is encrypted when SME is enabled in the first
>> kernel. So we must remap it in encrypted manner in order to be
>> automatically decrypted when we read.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>> Some changes:
>> 1. add some comments
>> 2. clean compile warning.
>>
>>  drivers/iommu/amd_iommu_init.c | 15 ++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 904c575..a20af4c 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -889,11 +889,24 @@ static bool copy_device_table(void)
>>  }
>>  
>>  old_devtb_phys = entry & PAGE_MASK;
>> +
>> +/*
>> + *  When sme enable in the first kernel, old_devtb_phys includes the
>> + *  memory encryption mask(sme_me_mask), we must remove the memory
>> + *  encryption mask to obtain the true physical address in kdump mode.
>> + */
>> +if (mem_encrypt_active() && is_kdump_kernel())
>> +old_devtb_phys = __sme_clr(old_devtb_phys);
>> +
> 
> You can probably just use "if (is_kdump_kernel())" here, since memory
> encryption is either on in both the first and second kernel or off in
> both the first and second kernel.  At which point __sme_clr() will do
> the proper thing.
> 
> Actually, this needs to be done no matter what.  When doing either the
> ioremap_encrypted() or the memremap(), the physical address should not
> include the encryption bit/mask.
> 
> Thanks,
> Tom
> 
Thanks for your comments. If we don't remove the memory encryption mask,
copy_device_table() will return false, because with the mask still set
old_devtb_phys looks like an address above 4G and the check triggers.

Lianbo
>>  if (old_devtb_phys >= 0x1ULL) {
>>  pr_err("The address of old device table is above 4G, not 
>> trustworthy!\n");
>>  return false;
>>  }
>> -old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>> +old_devtb = (mem_encrypt_active() && is_kdump_kernel())
>> +? (__force void *)ioremap_encrypted(old_devtb_phys,
>> +dev_table_size)
>> +: memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);> +
>>  if (!old_devtb)
>>  return false;
>>  
>>

Re: [PATCH 2/4 V3] Allocate pages for kdump without encryption when SME is enabled

2018-06-20 Thread lijiang
On 2018-06-21 09:53, Baoquan He wrote:
> On 06/16/18 at 04:27pm, Lianbo Jiang wrote:
>> When SME is enabled in the first kernel, we will allocate pages
>> for kdump without encryption in order to be able to boot the
>> second kernel in the same manner as kexec, which helps to keep
>> the same code style.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>>  kernel/kexec_core.c | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 20fef1a..3c22a9b 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -471,6 +471,16 @@ static struct page 
>> *kimage_alloc_crash_control_pages(struct kimage *image,
>>  }
>>  }
>>  
>> +if (pages) {
>> +unsigned int count, i;
>> +
>> +pages->mapping = NULL;
>> +set_page_private(pages, order);
>> +count = 1 << order;
>> +for (i = 0; i < count; i++)
>> +SetPageReserved(pages + i);
> 
> I guess you might imitate the kexec case, however kexec get pages from
> buddy. Crash pages are reserved in memblock, these codes might make no sense.
> 
Thanks for your comments.
We have changed the attributes of those pages, so the original attributes will
be restored when the pages are freed.

Thanks.
Lianbo
>> +arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
>> +}
>>  return pages;
>>  }
>>  
>> @@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage 
>> *image,
>>  result  = -ENOMEM;
>>  goto out;
>>  }
>> +arch_kexec_post_alloc_pages(page_address(page), 1, 0);
>>  ptr = kmap(page);
>>  ptr += maddr & ~PAGE_MASK;
>>  mchunk = min_t(size_t, mbytes,
>> @@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage 
>> *image,
>>  result = copy_from_user(ptr, buf, uchunk);
>>  kexec_flush_icache_page(page);
>>  kunmap(page);
>> +arch_kexec_pre_free_pages(page_address(page), 1);
>>  if (result) {
>>  result = -EFAULT;
>>  goto out;
>> -- 
>> 2.9.5
>>
>>
>> ___
>> kexec mailing list
>> ke...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec

Re: [PATCH 0/4 V3] Support kdump for AMD secure memory encryption(SME)

2018-06-20 Thread lijiang
On 2018-06-21 09:21, Baoquan He wrote:
> On 06/16/18 at 04:27pm, Lianbo Jiang wrote:
>> It is convenient to remap the old memory encrypted to the second kernel by
>> calling ioremap_encrypted().
>>
>> When sme enabled on AMD server, we also need to support kdump. Because
>> the memory is encrypted in the first kernel, we will remap the old memory
>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>> Because simply changing the value of a C-bit on a page will not
>> automatically encrypt the existing contents of a page, and any data in the
>> page prior to the C-bit modification will become unintelligible. A page of
>> memory that is marked encrypted will be automatically decrypted when read
>> from DRAM and will be automatically encrypted when written to DRAM.
>>
>> For the kdump, it is necessary to distinguish whether the memory is
>> encrypted. Furthermore, we should also know which part of the memory is
>> encrypted or decrypted. We will appropriately remap the memory according
>> to the specific situation in order to tell cpu how to deal with the
>> data(encrypted or decrypted). For example, when sme enabled, if the old
>> memory is encrypted, we will remap the old memory in encrypted way, which
>> will automatically decrypt the old memory encrypted when we read those data
>> from the remapping address.
>>
>>  --
>> | first-kernel | second-kernel | kdump support |
>> |  (mem_encrypt=on|off)|   (yes|no)|
>> |--+---+---|
>> | on   | on| yes   |
>> | off  | off   | yes   |
>> | on   | off   | no|
> 
> 
>> | off  | on| no|
> 
> It's not clear to me here. If 1st kernel sme is off, in 2nd kernel, when
> you remap the old memory with non-sme mode, why did it fail?
> 
Thank you, Baoquan.
For kdump, there are two cases that we don't need to support:

1. SME on (first kernel), but SME off (second kernel).
Because the old memory is encrypted, we can't decrypt it if SME is off in the
second (kdump) kernel.

2. SME off (first kernel), but SME on (second kernel).
This situation probably has no significance in an actual deployment, and
supporting it would also increase the complexity of the code. It is only
useful for testing, and since the old memory is unencrypted anyway, it is
probably unnecessary to support it.

Thanks.
Lianbo
> And please run scripts/get_maintainer.pl and add maintainers of
> component which is affected in patch to CC list.
Great! I forgot to CC the maintainers, thanks for your reminder.

Lianbo
> 
>> |__|___|___|
>>
>> This patch is only for SME kdump, it is not support SEV kdump.
>>
>> Test tools:
>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>> Author: Lianbo Jiang 
>> Date:   Mon May 14 17:02:40 2018 +0800
>> Note: This patch can only dump vmcore in the case of SME enabled.
>>
>> crash-7.2.1: https://github.com/crash-utility/crash.git
>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>> Author: Dave Anderson 
>> Date:   Fri May 11 15:54:32 2018 -0400
>>
>> Test environment:
>> HP ProLiant DL385Gen10 AMD EPYC 7251
>> 8-Core Processor
>> 32768 MB memory
>> 600 GB disk space
>>
>> Linux 4.17-rc7:
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>> commit b04e217704b7 ("Linux 4.17-rc7")
>> Author: Linus Torvalds 
>> Date:   Sun May 27 13:01:47 2018 -0700
>>
>> Reference:
>> AMD64 Architecture Programmer's Manual
>> https://support.amd.com/TechDocs/24593.pdf
>>
>> Some changes:
>> 1. remove the sme_active() check in __ioremap_caller().
>> 2. remove the '#ifdef' stuff throughout this patch.
>> 3. put some logic into the early_memremap_pgprot_adjust() and clean the
>> previous unnecessary changes, for example: arch/x86/include/asm/dmi.h,
>> arch/x86/kernel/acpi/boot.c, drivers/acpi/tables.c.
>> 4. add a new file and modify Makefile.
>> 5. clean compile warning in copy_device_table() and some compile error.
>> 6. split the original patch into four patches, it will be better for
>> review.
>>
>> Some known issues:
>> 1. about SME
>> Upstream kernel doesn't work when we use kexec in the follow command. The
>> system will hang.
>> (This issue doesn't matter with the kdump patch.)
>>
>> Reproduce steps:
>>  # kexec -l /boot/vmlinuz-4.17.0-rc7+ 
>> --initrd=/boot/initramfs-4.17.0-rc7+.img 
>> --command-line="root=/dev/mapper/rhel_hp--dl385g10--03-root ro 
>> mem_encrypt=on rd.lvm.lv=rhel_hp-dl385g10-03/root 
>> rd.lvm.lv=rhel_hp-dl385g10-03/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 
>> earlyprintk=serial debug nokaslr"
>>  # kexec -e (or reboot)
>>
>> The system will hang:
>> [ 1248.932239] kexec_core: Starting new kernel
>> early console in 

Re: [PATCH 4/4 V3] Help to dump the old memory encrypted into vmcore file

2018-06-19 Thread lijiang
On 2018-06-19 22:46, lijiang wrote:
> On 2018-06-19 11:16, Dave Young wrote:
>> On 06/16/18 at 04:27pm, Lianbo Jiang wrote:
>>> In kdump mode, we need to dump the old memory into vmcore file,
>>> if SME is enabled in the first kernel, we must remap the old
>>> memory in encrypted manner, which will be automatically decrypted
>>> when we read from DRAM. It helps to parse the vmcore for some tools.
>>>
>>> Signed-off-by: Lianbo Jiang 
>>> ---
>>> Some changes:
>>> 1. add a new file and modify Makefile.
>>> 2. remove some code in sev_active().
>>>
>>>  arch/x86/kernel/Makefile |  1 +
>>>  arch/x86/kernel/crash_dump_encrypt.c | 53 
>>> 
>>>  fs/proc/vmcore.c | 20 ++
>>>  include/linux/crash_dump.h   | 11 
>>>  4 files changed, 79 insertions(+), 6 deletions(-)
>>>  create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
>>>
>>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>>> index 02d6f5c..afb5bad 100644
>>> --- a/arch/x86/kernel/Makefile
>>> +++ b/arch/x86/kernel/Makefile
>>> @@ -96,6 +96,7 @@ obj-$(CONFIG_KEXEC_CORE)  += machine_kexec_$(BITS).o
>>>  obj-$(CONFIG_KEXEC_CORE)   += relocate_kernel_$(BITS).o crash.o
>>>  obj-$(CONFIG_KEXEC_FILE)   += kexec-bzimage64.o
>>>  obj-$(CONFIG_CRASH_DUMP)   += crash_dump_$(BITS).o
>>> +obj-$(CONFIG_AMD_MEM_ENCRYPT)  += crash_dump_encrypt.o
>>>  obj-y  += kprobes/
>>>  obj-$(CONFIG_MODULES)  += module.o
>>>  obj-$(CONFIG_DOUBLEFAULT)  += doublefault.o
>>> diff --git a/arch/x86/kernel/crash_dump_encrypt.c 
>>> b/arch/x86/kernel/crash_dump_encrypt.c
>>> new file mode 100644
>>> index 000..e44ef33
>>> --- /dev/null
>>> +++ b/arch/x86/kernel/crash_dump_encrypt.c
>>> @@ -0,0 +1,53 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Memory preserving reboot related code.
>>> + *
>>> + * Created by: Lianbo Jiang (liji...@redhat.com)
>>> + * Copyright (C) RedHat Corporation, 2018. All rights reserved
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +/**
>>> + * copy_oldmem_page_encrypted - copy one page from "oldmem encrypted"
>>> + * @pfn: page frame number to be copied
>>> + * @buf: target memory address for the copy; this can be in kernel address
>>> + * space or user address space (see @userbuf)
>>> + * @csize: number of bytes to copy
>>> + * @offset: offset in bytes into the page (based on pfn) to begin the copy
>>> + * @userbuf: if set, @buf is in user address space, use copy_to_user(),
>>> + * otherwise @buf is in kernel address space, use memcpy().
>>> + *
>>> + * Copy a page from "oldmem encrypted". For this page, there is no pte
>>> + * mapped in the current kernel. We stitch up a pte, similar to
>>> + * kmap_atomic.
>>> + */
>>> +
>>> +ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
>>> +   size_t csize, unsigned long offset, int userbuf)
>>> +{
>>> +   void  *vaddr;
>>> +
>>> +   if (!csize)
>>> +   return 0;
>>> +
>>> +   vaddr = (__force void *)ioremap_encrypted(pfn << PAGE_SHIFT,
>>> + PAGE_SIZE);
>>> +   if (!vaddr)
>>> +   return -ENOMEM;
>>> +
>>> +   if (userbuf) {
>>> +   if (copy_to_user((void __user *)buf, vaddr + offset, csize)) {
>>> +   iounmap((void __iomem *)vaddr);
>>> +   return -EFAULT;
>>> +   }
>>> +   } else
>>> +   memcpy(buf, vaddr + offset, csize);
>>> +
>>> +   set_iounmap_nonlazy();
>>> +   iounmap((void __iomem *)vaddr);
>>> +   return csize;
>>> +}
>>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>>> index a45f0af..5200266 100644
>>> --- a/fs/proc/vmcore.c
>>> +++ b/fs/proc/vmcore.c
>>> @@ -25,6 +25,8 @@
>>>  #include 
>>>  #include 
>>>  #include "internal.h"
>>> +#include 
>>> +#include 
>>>  
>>>  /* List representing chunks of contiguous memory areas and their offsets in
>>>   * vmcore file.
>&

Re: [PATCH 4/4 V3] Help to dump the old memory encrypted into vmcore file

2018-06-19 Thread lijiang
On 2018-06-19 11:16, Dave Young wrote:
> On 06/16/18 at 04:27pm, Lianbo Jiang wrote:
>> In kdump mode, we need to dump the old memory into vmcore file,
>> if SME is enabled in the first kernel, we must remap the old
>> memory in encrypted manner, which will be automatically decrypted
>> when we read from DRAM. It helps to parse the vmcore for some tools.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>> Some changes:
>> 1. add a new file and modify Makefile.
>> 2. remove some code in sev_active().
>>
>>  arch/x86/kernel/Makefile |  1 +
>>  arch/x86/kernel/crash_dump_encrypt.c | 53 
>> 
>>  fs/proc/vmcore.c | 20 ++
>>  include/linux/crash_dump.h   | 11 
>>  4 files changed, 79 insertions(+), 6 deletions(-)
>>  create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
>>
>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>> index 02d6f5c..afb5bad 100644
>> --- a/arch/x86/kernel/Makefile
>> +++ b/arch/x86/kernel/Makefile
>> @@ -96,6 +96,7 @@ obj-$(CONFIG_KEXEC_CORE)   += machine_kexec_$(BITS).o
>>  obj-$(CONFIG_KEXEC_CORE)+= relocate_kernel_$(BITS).o crash.o
>>  obj-$(CONFIG_KEXEC_FILE)+= kexec-bzimage64.o
>>  obj-$(CONFIG_CRASH_DUMP)+= crash_dump_$(BITS).o
>> +obj-$(CONFIG_AMD_MEM_ENCRYPT)   += crash_dump_encrypt.o
>>  obj-y   += kprobes/
>>  obj-$(CONFIG_MODULES)   += module.o
>>  obj-$(CONFIG_DOUBLEFAULT)   += doublefault.o
>> diff --git a/arch/x86/kernel/crash_dump_encrypt.c 
>> b/arch/x86/kernel/crash_dump_encrypt.c
>> new file mode 100644
>> index 000..e44ef33
>> --- /dev/null
>> +++ b/arch/x86/kernel/crash_dump_encrypt.c
>> @@ -0,0 +1,53 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + *  Memory preserving reboot related code.
>> + *
>> + *  Created by: Lianbo Jiang (liji...@redhat.com)
>> + *  Copyright (C) RedHat Corporation, 2018. All rights reserved
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/**
>> + * copy_oldmem_page_encrypted - copy one page from "oldmem encrypted"
>> + * @pfn: page frame number to be copied
>> + * @buf: target memory address for the copy; this can be in kernel address
>> + *  space or user address space (see @userbuf)
>> + * @csize: number of bytes to copy
>> + * @offset: offset in bytes into the page (based on pfn) to begin the copy
>> + * @userbuf: if set, @buf is in user address space, use copy_to_user(),
>> + *  otherwise @buf is in kernel address space, use memcpy().
>> + *
>> + * Copy a page from "oldmem encrypted". For this page, there is no pte
>> + * mapped in the current kernel. We stitch up a pte, similar to
>> + * kmap_atomic.
>> + */
>> +
>> +ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
>> +size_t csize, unsigned long offset, int userbuf)
>> +{
>> +void  *vaddr;
>> +
>> +if (!csize)
>> +return 0;
>> +
>> +vaddr = (__force void *)ioremap_encrypted(pfn << PAGE_SHIFT,
>> +  PAGE_SIZE);
>> +if (!vaddr)
>> +return -ENOMEM;
>> +
>> +if (userbuf) {
>> +if (copy_to_user((void __user *)buf, vaddr + offset, csize)) {
>> +iounmap((void __iomem *)vaddr);
>> +return -EFAULT;
>> +}
>> +} else
>> +memcpy(buf, vaddr + offset, csize);
>> +
>> +set_iounmap_nonlazy();
>> +iounmap((void __iomem *)vaddr);
>> +return csize;
>> +}
>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>> index a45f0af..5200266 100644
>> --- a/fs/proc/vmcore.c
>> +++ b/fs/proc/vmcore.c
>> @@ -25,6 +25,8 @@
>>  #include 
>>  #include 
>>  #include "internal.h"
>> +#include 
>> +#include 
>>  
>>  /* List representing chunks of contiguous memory areas and their offsets in
>>   * vmcore file.
>> @@ -86,7 +88,8 @@ static int pfn_is_ram(unsigned long pfn)
>>  
>>  /* Reads a page from the oldmem device from given offset. */
>>  static ssize_t read_from_oldmem(char *buf, size_t count,
>> -u64 *ppos, int userbuf)
>> +u64 *ppos, int userbuf,
>> +bool encrypted)
>>  {
>>  unsigned long pfn, offset;
>>  size_t nr_bytes;
>> @@ -108,8 +111,11 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
>>  if (pfn_is_ram(pfn) == 0)
>>  memset(buf, 0, nr_bytes);
>>  else {
>> -tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>> -offset, userbuf);
>> +tmp = encrypted ? copy_oldmem_page_encrypted(pfn,
>> +buf, nr_bytes, offset, userbuf)
>> +: copy_oldmem_page(pfn, buf, nr_bytes,
>> +   offset, userbuf);
>> +
>>  if (tmp < 0)
>>