Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-05-25 Thread Xunlei Pang
On 05/26/2017 at 10:49 AM, Dave Young wrote:
> Ccing Xunlei, he is reading the patches to see what needs to be done for
> kdump. There should still be several places to handle to make kdump work.
>
> On 05/18/17 at 07:01pm, Borislav Petkov wrote:
>> On Tue, Apr 18, 2017 at 04:22:12PM -0500, Tom Lendacky wrote:
>>> Add sysfs support for SME so that user-space utilities (kdump, etc.) can
>>> determine if SME is active.
>> But why do user-space tools need to know that?
>>
>> I mean, when we load the kdump kernel, we do it with the first kernel,
>> with the kexec_load() syscall, AFAICT. And that code does a lot of
>> things during that init, like machine_kexec_prepare()->init_pgtable() to
>> prepare the ident mapping of the second kernel, for example.
>>
>> What I'm aiming at is that the first kernel knows *exactly* whether SME
>> is enabled or not and doesn't need to tell the second one through some
>> sysfs entries - it can do that during loading.
>>
>> So I don't think we need any userspace things at all...
> If the kdump kernel can get the SME status from a hardware register then
> this patch should not be necessary and can be dropped.

Yes, I also agree with dropping this one.

Regards,
Xunlei



Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME

2017-05-25 Thread Xunlei Pang
On 04/19/2017 at 05:21 AM, Tom Lendacky wrote:
> Provide support so that kexec can be used to boot a kernel when SME is
> enabled.
>
> Support is needed to allocate pages for kexec without encryption.  This
> is needed in order to be able to reboot into the kernel in the same manner
> as originally booted.

Hi Tom,

Looks like kdump will break; I didn't see similar handling for the kdump
cases, see kernel: kimage_alloc_crash_control_pages(),
kimage_load_crash_segment(), etc.

We need to support kdump with SME: the kdump
kernel/initramfs/purgatory/elfcorehdr/etc. are all loaded into the reserved
memory (see crashkernel=X) by userspace kexec-tools. I think a
straightforward way would be to mark the whole reserved memory range as
unencrypted before loading all the kexec segments for kdump; I guess we can
handle this easily in arch_kexec_unprotect_crashkres(), as sketched below.
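
For illustration only, a minimal sketch of that idea, assuming the
set_memory_decrypted() helper introduced elsewhere in this series (not an
actual patch):

	void arch_kexec_unprotect_crashkres(void)
	{
		kexec_mark_crashkres(false);

		/* Hypothetical: clear the encryption attribute across the
		 * whole crashkernel window before userspace loads segments.
		 */
		if (sme_active())
			set_memory_decrypted(
				(unsigned long)__va(crashk_res.start),
				resource_size(&crashk_res) >> PAGE_SHIFT);
	}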

Moreover, now that the "elfcorehdr=X" region is left decrypted, it needs to
be remapped to the encrypted data.

Regards,
Xunlei

>
> Additionally, when shutting down all of the CPUs we need to be sure to
> flush the caches and then halt. This is needed when booting from a state
> where SME was not active into a state where SME is active (or vice-versa).
> Without these steps, it is possible for cache lines to exist for the same
> physical location but tagged both with and without the encryption bit. This
> can cause random memory corruption when caches are flushed depending on
> which cacheline is written last.
>
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/init.h  |1 +
>  arch/x86/include/asm/irqflags.h  |5 +
>  arch/x86/include/asm/kexec.h |8 
>  arch/x86/include/asm/pgtable_types.h |1 +
>  arch/x86/kernel/machine_kexec_64.c   |   35 +-
>  arch/x86/kernel/process.c|   26 +++--
>  arch/x86/mm/ident_map.c  |   11 +++
>  include/linux/kexec.h|   14 ++
>  kernel/kexec_core.c  |7 +++
>  9 files changed, 101 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
> index 737da62..b2ec511 100644
> --- a/arch/x86/include/asm/init.h
> +++ b/arch/x86/include/asm/init.h
> @@ -6,6 +6,7 @@ struct x86_mapping_info {
>   void *context;   /* context for alloc_pgt_page */
>   unsigned long pmd_flag;  /* page flag for PMD entry */
>   unsigned long offset;/* ident mapping offset */
> + unsigned long kernpg_flag;   /* kernel pagetable flag override */
>  };
>  
>  int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index ac7692d..38b5920 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -58,6 +58,11 @@ static inline __cpuidle void native_halt(void)
>   asm volatile("hlt": : :"memory");
>  }
>  
> +static inline __cpuidle void native_wbinvd_halt(void)
> +{
> + asm volatile("wbinvd; hlt" : : : "memory");
> +}
> +
>  #endif
>  
>  #ifdef CONFIG_PARAVIRT
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index 70ef205..e8183ac 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -207,6 +207,14 @@ struct kexec_entry64_regs {
>   uint64_t r15;
>   uint64_t rip;
>  };
> +
> +extern int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages,
> +gfp_t gfp);
> +#define arch_kexec_post_alloc_pages arch_kexec_post_alloc_pages
> +
> +extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages);
> +#define arch_kexec_pre_free_pages arch_kexec_pre_free_pages
> +
>  #endif
>  
>  typedef void crash_vmclear_fn(void);
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index ce8cb1c..0f326f4 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -213,6 +213,7 @@ enum page_cache_mode {
>  #define PAGE_KERNEL  __pgprot(__PAGE_KERNEL | _PAGE_ENC)
>  #define PAGE_KERNEL_RO   __pgprot(__PAGE_KERNEL_RO | _PAGE_ENC)
>  #define PAGE_KERNEL_EXEC __pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC)
> +#define PAGE_KERNEL_EXEC_NOENC   __pgprot(__PAGE_KERNEL_EXEC)
>  #define PAGE_KERNEL_RX   __pgprot(__PAGE_KERNEL_RX | _PAGE_ENC)
>  #define PAGE_KERNEL_NOCACHE  __pgprot(__PAGE_KERNEL_NOCACHE | _PAGE_ENC)
>  #define PAGE_KERNEL_LARGE__pgprot(__PAGE_KERNEL_LARGE | _PAGE_ENC)
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 085c3b3..11c0ca9 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -86,7 +86,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
>   set_pmd(pmd, __pmd(__pa(
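
For orientation, a hedged sketch of what the two arch hooks declared above
(arch_kexec_post_alloc_pages()/arch_kexec_pre_free_pages()) could do under
SME, again assuming the set_memory_{de,en}crypted() helpers from this
series — a sketch, not the verbatim patch body:

	/*
	 * Sketch only: make kexec control pages unencrypted right after
	 * allocation and re-encrypt them before they are freed, so both
	 * kernels agree on how these shared pages are mapped.
	 */
	int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
	{
		return set_memory_decrypted((unsigned long)vaddr, pages);
	}

	void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
	{
		set_memory_encrypted((unsigned long)vaddr, pages);
	}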

[Makedumpfile PATCH v4 0/2] Fix refiltering when kaslr enabled

2017-05-25 Thread Pratyush Anand
Hi All,

We came across another failure in makedumpfile when KASLR is enabled. This
failure occurs when we try re-filtering, i.e. when we try to erase a symbol
from a dumpfile which was copied/compressed from /proc/vmcore using
makedumpfile.

We have very limited symbol information in vmcoreinfo, so the symbols to be
erased may not be available in vmcoreinfo and we look for them in vmlinux.
However, a symbol address from vmlinux is a static address which differs
from the run-time address by the KASLR offset. Therefore, reading any
"virtual address of vmlinux" from the vmcore is not directly possible.

These patches find the runtime KASLR offset and then calculate the run-time
addresses of symbols read from vmlinux.
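
For illustration, with made-up numbers: if vmlinux has jiffies at the
static address 0xffffffff81e04e80 and vmcoreinfo reports
KERNELOFFSET=2a000000, the run-time address the patches compute is:

    0xffffffff81e04e80 + 0x2a000000 = 0xffffffffabe04e80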

Hatayama Daisuke also found some issues [1] when he was working with sadump
and virsh dumps of a non-KASLR kernel. Patch 2/2 of this series has been
improved to take care of those issues as well.

[1]http://lists.infradead.org/pipermail/kexec/2017-May/018833.html

Thanks

~Pratyush

v1->v2:
 - reading KERNELOFFSET from vmcoreinfo now instead of calculating it from
   _stext
v2->v3:
 - Fixed initialization of info->file_vmcoreinfo
 - Improved page_offset calculation logic to take care of different dump
   scenarios.
v3->v4:
 - Removed info->kaslr_offset write to VMCOREINFO



Pratyush Anand (2):
  makedumpfile: add runtime kaslr offset if it exists
  x86_64: calculate page_offset in case of re-filtering/sadump/virsh
dump

 arch/x86_64.c  | 72 --
 erase_info.c   |  1 +
 makedumpfile.c | 46 +
 makedumpfile.h | 16 +
 4 files changed, 128 insertions(+), 7 deletions(-)

-- 
2.9.3




[Makedumpfile PATCH v4 1/2] makedumpfile: add runtime kaslr offset if it exists

2017-05-25 Thread Pratyush Anand
If we have to erase a symbol from vmcore whose address is not present in
vmcoreinfo, then we need to pass vmlinux as well to get the symbol
address.
When KASLR is enabled, the virtual addresses of all kernel symbols are
randomized with an offset. vmlinux always has static addresses, but all the
arch-specific calculations are based on the run-time kernel addresses. So we
need a way to translate a symbol address from vmlinux to the kernel run-time
address.
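
The offset itself comes from the KERNELOFFSET= line the kernel exports in
vmcoreinfo; a made-up example of the line the new code parses:

    KERNELOFFSET=2a000000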

without this patch:
   # cat > scrub.conf << EOF
   [vmlinux]
   erase jiffies
   erase init_task.utime
   for tsk in init_task.tasks.next within task_struct:tasks
   erase tsk.utime
   endfor
   EOF

# makedumpfile --split -d 5 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}

readpage_kdump_compressed: pfn(f97ea) is excluded from vmcore.
readmem: type_addr: 1, addr:f97eaff8, size:8
vtop4_x86_64: Can't get pml4 (page_dir:f97eaff8).
readmem: Can't convert a virtual address(819f1284) to physical address.
readmem: type_addr: 0, addr:819f1284, size:390
check_release: Can't get the address of system_utsname.

After this patch check_release() is ok, and also we are able to erase
symbol from vmcore.

Signed-off-by: Pratyush Anand 
---
 arch/x86_64.c  | 36 
 erase_info.c   |  1 +
 makedumpfile.c | 46 ++
 makedumpfile.h | 16 
 4 files changed, 99 insertions(+)

diff --git a/arch/x86_64.c b/arch/x86_64.c
index e978a36f8878..fd2e8ac154d6 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -33,6 +33,42 @@ get_xen_p2m_mfn(void)
return NOT_FOUND_LONG_VALUE;
 }
 
+unsigned long
+get_kaslr_offset_x86_64(unsigned long vaddr)
+{
+   unsigned int i;
+   char buf[BUFSIZE_FGETS], *endp;
+
+   if (!info->kaslr_offset && info->file_vmcoreinfo) {
+   if (fseek(info->file_vmcoreinfo, 0, SEEK_SET) < 0) {
+   ERRMSG("Can't seek the vmcoreinfo file(%s). %s\n",
+   info->name_vmcoreinfo, strerror(errno));
+   return FALSE;
+   }
+
+   while (fgets(buf, BUFSIZE_FGETS, info->file_vmcoreinfo)) {
+   i = strlen(buf);
+   if (!i)
+   break;
+   if (buf[i - 1] == '\n')
+   buf[i - 1] = '\0';
+   if (strncmp(buf, STR_KERNELOFFSET,
+   strlen(STR_KERNELOFFSET)) == 0)
+   info->kaslr_offset =
+   strtoul(buf+strlen(STR_KERNELOFFSET),&endp,16);
+   }
+   }
+   if (vaddr >= __START_KERNEL_map &&
+   vaddr < __START_KERNEL_map + info->kaslr_offset)
+   return info->kaslr_offset;
+   else
+   /*
+* TODO: we need to check if it is vmalloc/vmmemmap/module
+* address, we will have different offset
+*/
+   return 0;
+}
+
 static int
 get_page_offset_x86_64(void)
 {
diff --git a/erase_info.c b/erase_info.c
index f2ba9149e93e..60abfa1a1adf 100644
--- a/erase_info.c
+++ b/erase_info.c
@@ -1088,6 +1088,7 @@ resolve_config_entry(struct config_entry *ce, unsigned long long base_vaddr,
ce->line, ce->name);
return FALSE;
}
+   ce->sym_addr += get_kaslr_offset(ce->sym_addr);
ce->type_name = get_symbol_type_name(ce->name,
DWARF_INFO_GET_SYMBOL_TYPE,
&ce->size, &ce->type_flag);
diff --git a/makedumpfile.c b/makedumpfile.c
index 301772a8820c..9babf1a07154 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3782,6 +3782,46 @@ free_for_parallel()
 }
 
 int
+find_kaslr_offsets()
+{
+   off_t offset;
+   unsigned long size;
+   int ret = FALSE;
+
+   get_vmcoreinfo(&offset, &size);
+
+   if (!(info->name_vmcoreinfo = strdup(FILENAME_VMCOREINFO))) {
+   MSG("Can't duplicate strings(%s).\n", FILENAME_VMCOREINFO);
+   return FALSE;
+   }
+   if (!copy_vmcoreinfo(offset, size))
+   goto out;
+
+   if (!open_vmcoreinfo("r"))
+   goto out;
+
+   unlink(info->name_vmcoreinfo);
+
+   /*
+* This arch specific function should update info->kaslr_offset. If
+* kaslr is not enabled then offset will be set to 0. arch specific
+* function might need to read from vmcoreinfo, therefore we have
+* called this function between open_vmcoreinfo() and
+* close_vmcoreinfo()
+*/
+   get_kaslr_offset(SYMBOL(_stext));
+
+   close_vmcoreinfo();
+
+   ret = TRUE;
+out:
+   free(info->name_vmcoreinfo);
+   info->name_vmcoreinfo = NULL;
+
+   return ret;
+}

[Makedumpfile PATCH v4 2/2] x86_64: calculate page_offset in case of re-filtering/sadump/virsh dump

2017-05-25 Thread Pratyush Anand
We do not call get_elf_info() in the case of re-filtering and sadump.
Therefore, we will not have any PT_LOAD in that case, and so we get:

get_page_offset_x86_64: Can't get any pt_load to calculate page offset.

However, we will have vmcoreinfo and vmlinux information in the case of
re-filtering. So we are able to find the KASLR offset, get the
page_offset_base address, and thus read the page offset as well.

If KASLR is not enabled and we also do not have a valid PT_LOAD to
calculate the page offset, then fall back to the old method of using a
fixed page offset based on the kernel version.

In the case of virsh dump, the virtual addresses in PT_LOAD are 0; ignore
such addresses for the page_offset calculation.

Suggested-by: HATAYAMA Daisuke 
Signed-off-by: Pratyush Anand 
---
 arch/x86_64.c | 36 +---
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/arch/x86_64.c b/arch/x86_64.c
index fd2e8ac154d6..18384a8dd684 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -75,17 +75,39 @@ get_page_offset_x86_64(void)
int i;
unsigned long long phys_start;
unsigned long long virt_start;
+   unsigned long page_offset_base;
+
+   if (info->kaslr_offset) {
+   page_offset_base = get_symbol_addr("page_offset_base");
+   page_offset_base += info->kaslr_offset;
+   if (!readmem(VADDR, page_offset_base, &info->page_offset,
+       sizeof(info->page_offset))) {
+       ERRMSG("Can't read page_offset_base.\n");
+       return FALSE;
+   }
+   return TRUE;
+   }
 
-   for (i = 0; get_pt_load(i, &phys_start, NULL, &virt_start, NULL); i++) {
-   if (virt_start < __START_KERNEL_map
-   && phys_start != NOT_PADDR) {
-   info->page_offset = virt_start - phys_start;
-   return TRUE;
+   if (get_num_pt_loads()) {
+   for (i = 0;
+   get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
+   i++) {
+   if (virt_start != NOT_KV_ADDR
+   && virt_start < __START_KERNEL_map
+   && phys_start != NOT_PADDR) {
+   info->page_offset = virt_start - phys_start;
+   return TRUE;
+   }
}
}
 
-   ERRMSG("Can't get any pt_load to calculate page offset.\n");
-   return FALSE;
+   if (info->kernel_version < KERNEL_VERSION(2, 6, 27)) {
+   info->page_offset = __PAGE_OFFSET_ORIG;
+   } else {
+   info->page_offset = __PAGE_OFFSET_2_6_27;
+   }
+
+   return TRUE;
 }
 
 int
-- 
2.9.3




Re: [Makedumpfile PATCH v3 1/2] makedumpfile: add runtime kaslr offset if it exists

2017-05-25 Thread Pratyush Anand



On Friday 26 May 2017 07:17 AM, Atsushi Kumagai wrote:

>>>> write_vmcoreinfo_data(void)
>>>> {
>>>> /*
>>>> +* write 1st kernel's KERNELOFFSET
>>>> +*/
>>>> +   if (info->kaslr_offset)
>>>> +   fprintf(info->file_vmcoreinfo, "%s%lx\n", STR_KERNELOFFSET,
>>>> +   info->kaslr_offset);
>>>
>>> When will this data written to the VMCOREINFO file be used?
>>> info->kaslr_offset is necessary for vmlinux but -x and -i are exclusive.
>>
>> This is what I thought:
>>
>> Let's say we have got a vmcore1 after re-filtering the original vmcore. Now, if we
>> would like to re-filter vmcore1 then we will need kaslr_offset again. So,
>> should we not write kaslr_offset in the vmcoreinfo of vmcore1 as well?
>
> write_vmcoreinfo_data() is called only for the -g option; it makes a
> VMCOREINFO file as a separate file, it doesn't overwrite VMCOREINFO in vmcore.

OK.. got it.

Will remove this function and send v4.


Thanks

~Pratyush



Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-05-25 Thread Dave Young
Ccing Xunlei, he is reading the patches to see what needs to be done for
kdump. There should still be several places to handle to make kdump work.

On 05/18/17 at 07:01pm, Borislav Petkov wrote:
> On Tue, Apr 18, 2017 at 04:22:12PM -0500, Tom Lendacky wrote:
> > Add sysfs support for SME so that user-space utilities (kdump, etc.) can
> > determine if SME is active.
> 
> But why do user-space tools need to know that?
> 
> I mean, when we load the kdump kernel, we do it with the first kernel,
> with the kexec_load() syscall, AFAICT. And that code does a lot of
> things during that init, like machine_kexec_prepare()->init_pgtable() to
> prepare the ident mapping of the second kernel, for example.
> 
> What I'm aiming at is that the first kernel knows *exactly* whether SME
> is enabled or not and doesn't need to tell the second one through some
> sysfs entries - it can do that during loading.
> 
> So I don't think we need any userspace things at all...

If the kdump kernel can get the SME status from a hardware register then
this patch should not be necessary and can be dropped.
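
For reference, a sketch of the sort of hardware check meant here: on AMD,
SME activation is visible in the SYSCFG MSR's MemEncryptionModEn bit (treat
the details as illustrative, per the APM; not code from this series):

	#define MSR_K8_SYSCFG			0xc0010010
	#define MSR_K8_SYSCFG_MEM_ENCRYPT_BIT	23

	static bool sme_enabled_in_hw(void)
	{
		u64 syscfg;

		/* Read the SYSCFG MSR and test the SME-enable bit. */
		rdmsrl(MSR_K8_SYSCFG, syscfg);
		return !!(syscfg & BIT_ULL(MSR_K8_SYSCFG_MEM_ENCRYPT_BIT));
	}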

Thanks
Dave



RE: [Makedumpfile PATCH v3 1/2] makedumpfile: add runtime kaslr offset if it exists

2017-05-25 Thread Atsushi Kumagai
>>> diff --git a/makedumpfile.c b/makedumpfile.c
>>> index 301772a8820c..4986d098d69a 100644
>>> --- a/makedumpfile.c
>>> +++ b/makedumpfile.c
>>> @@ -2099,6 +2099,13 @@ void
>>> write_vmcoreinfo_data(void)
>>> {
>>> /*
>>> +* write 1st kernel's KERNELOFFSET
>>> +*/
>>> +   if (info->kaslr_offset)
>>> +   fprintf(info->file_vmcoreinfo, "%s%lx\n", STR_KERNELOFFSET,
>>> +   info->kaslr_offset);
>>
>> When will this data written to VMCOREINFO file be used ?
>> info->kaslr_offset is necessary for vmlinux but -x and -i are exclusive.
>
>This is what I thought:
>
>Let's say we have got a vmcore1 after re-filtering the original vmcore. Now, if we
>would like to re-filter vmcore1 then we will need kaslr_offset again. So,
>should we not write kaslr_offset in the vmcoreinfo of vmcore1 as well?

write_vmcoreinfo_data() is called only for the -g option; it makes a
VMCOREINFO file as a separate file and doesn't overwrite the VMCOREINFO in
vmcore:

    if (info->flag_generate_vmcoreinfo)
        generate_vmcoreinfo()
            -> write_vmcoreinfo_data()

find_kaslr_offsets() doesn't refer to this separate VMCOREINFO file, so
writing STR_KERNELOFFSET in it is meaningless.
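
(For context, a typical invocation of that separate-file path, per the
makedumpfile documentation:

    # makedumpfile -g VMCOREINFO -x vmlinux

which only generates the VMCOREINFO file from vmlinux.)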


Thanks,
Atsushi Kumagai




RE: [Makedumpfile Patch] Fix get_kcore_dump_loads() error case

2017-05-25 Thread Atsushi Kumagai
>commit f10d1e2e94c50 introduced another bug while fixing a memory leak.
>Use braces with the if condition.

Thanks, I'll merge this into v1.6.2

Atsushi Kumagai

>Signed-off-by: Pratyush Anand 
>---
> elf_info.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/elf_info.c b/elf_info.c
>index 601d66e3f176..69b1719b020f 100644
>--- a/elf_info.c
>+++ b/elf_info.c
>@@ -893,9 +893,10 @@ int get_kcore_dump_loads(void)
>   if (p->phys_start == NOT_PADDR
>   || !is_phys_addr(p->virt_start))
>   continue;
>-  if (j >= loads)
>+  if (j >= loads) {
>   free(pls);
>   return FALSE;
>+  }
>
>   if (j == 0) {
>   offset_pt_load_memory = p->file_offset;
>--
>2.9.3




Re: [PATCH v5 29/32] x86/mm: Add support to encrypt the kernel in-place

2017-05-25 Thread Tom Lendacky

On 5/18/2017 7:46 AM, Borislav Petkov wrote:

> On Tue, Apr 18, 2017 at 04:21:49PM -0500, Tom Lendacky wrote:
>> Add the support to encrypt the kernel in-place. This is done by creating
>> new page mappings for the kernel - a decrypted write-protected mapping
>> and an encrypted mapping. The kernel is encrypted by copying it through
>> a temporary buffer.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/include/asm/mem_encrypt.h |6 +
>>  arch/x86/mm/Makefile   |2
>>  arch/x86/mm/mem_encrypt.c  |  262 
>>  arch/x86/mm/mem_encrypt_boot.S |  151 +
>>  4 files changed, 421 insertions(+)
>>  create mode 100644 arch/x86/mm/mem_encrypt_boot.S
>>
>> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
>> index b406df2..8f6f9b4 100644
>> --- a/arch/x86/include/asm/mem_encrypt.h
>> +++ b/arch/x86/include/asm/mem_encrypt.h
>> @@ -31,6 +31,12 @@ static inline u64 sme_dma_mask(void)
>> 	return ((u64)sme_me_mask << 1) - 1;
>>  }
>>
>> +void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
>> +		 unsigned long decrypted_kernel_vaddr,
>> +		 unsigned long kernel_len,
>> +		 unsigned long encryption_wa,
>> +		 unsigned long encryption_pgd);
>> +
>>  void __init sme_early_encrypt(resource_size_t paddr,
>> 			      unsigned long size);
>>  void __init sme_early_decrypt(resource_size_t paddr,
>> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
>> index 9e13841..0633142 100644
>> --- a/arch/x86/mm/Makefile
>> +++ b/arch/x86/mm/Makefile
>> @@ -38,3 +38,5 @@ obj-$(CONFIG_NUMA_EMU)	+= numa_emulation.o
>>  obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
>>  obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
>>  obj-$(CONFIG_RANDOMIZE_MEMORY)	+= kaslr.o
>> +
>> +obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt_boot.o
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 30b07a3..0ff41a4 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -24,6 +24,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  /*
>>   * Since SME related variables are set early in the boot process they must
>> @@ -216,8 +217,269 @@ void swiotlb_set_mem_attributes(void *vaddr, unsigned long size)
>> 	set_memory_decrypted((unsigned long)vaddr, size >> PAGE_SHIFT);
>>  }

>> +void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start,
>
> static

Yup.




>> +			  unsigned long end)
>> +{
>> +	unsigned long addr = start;
>> +	pgdval_t *pgd_p;
>> +
>> +	while (addr < end) {
>> +		unsigned long pgd_end;
>> +
>> +		pgd_end = (addr & PGDIR_MASK) + PGDIR_SIZE;
>> +		if (pgd_end > end)
>> +			pgd_end = end;
>> +
>> +		pgd_p = (pgdval_t *)pgd_base + pgd_index(addr);
>> +		*pgd_p = 0;


> Hmm, so this is a contiguous range from [start:end] which translates to
> 8-byte PGD pointers in the PGD page so you can simply memset that range,
> no?
>
> Instead of iterating over each one?


I guess I could do that, but this will probably only end up clearing a
single PGD entry anyway since it's highly doubtful the address range
would cross a 512GB boundary.
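
For concreteness, a minimal sketch of the memset() variant being discussed,
assuming end > start (a sketch, not Tom's patch):

	static void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start,
					 unsigned long end)
	{
		pgdval_t *pgd_p = (pgdval_t *)pgd_base + pgd_index(start);
		unsigned long entries = pgd_index(end - 1) - pgd_index(start) + 1;

		/* Clear every PGD entry covering [start, end) in one call. */
		memset(pgd_p, 0, entries * sizeof(*pgd_p));
	}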




>> +
>> +		addr = pgd_end;
>> +	}
>> +}
>> +
>> +#define PGD_FLAGS	_KERNPG_TABLE_NOENC
>> +#define PUD_FLAGS	_KERNPG_TABLE_NOENC
>> +#define PMD_FLAGS	(__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
>> +
>> +static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area,
>> +				     unsigned long vaddr, pmdval_t pmd_val)
>> +{
>> +	pgdval_t pgd, *pgd_p;
>> +	pudval_t pud, *pud_p;
>> +	pmdval_t pmd, *pmd_p;


> You should use the enclosing type, not the underlying one. I.e.,
>
> 	pgd_t *pgd;
> 	pud_t *pud;
> 	...
>
> and then the macros native_p*d_val(), p*d_offset() and so on. I say
> native_* because we don't want to have any paravirt nastyness here.
> I believe your previous version was using the proper interfaces.


I won't be able to use the p*d_offset() macros since they use __va()
and we're identity mapped during this time (which is why I would guess
the proposed changes for the 5-level pagetables in
arch/x86/kernel/head64.c, __startup_64, don't use these macros
either). I should be able to use the native_set_p*d() and others though,
I'll look into that.



> And the kernel has gotten 5-level pagetables support in
> the meantime, so this'll need to start at p4d AFAICT.
> arch/x86/mm/fault.c::dump_pagetable() looks like a good example to stare
> at.


Yeah, I accounted for that in the other parts of the code but I need
to do that here also.




>> +	pgd_p = (pgdval_t *)pgd_base + pgd_index(vaddr);
>> +	pgd = *pgd_p;
>> +	if (pgd) {
>> +		pud_p = (pudval_t *)(pgd & ~PTE_FLAGS_MASK);
>> +	} else {
>> +		pud_p = pgtable_area;
>> +		memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD);
>> +		pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
>> +
>> +		*pgd_p = (pgdval_t)pud_p + PGD_FLAGS

Re: [PATCH 1/2] sadump: set info->page_size before cache_init()

2017-05-25 Thread Pratyush Anand



On Tuesday 23 May 2017 08:22 AM, Hatayama, Daisuke wrote:

Currently, makedumpfile results in Segmentation fault on sadump dump
files as follows:

# LANG=C makedumpfile -f --message-level=31 -ld31 -x vmlinux 
./sadump_vmcore sadump_vmcore-ld31
sadump: read dump device as single partition
sadump: single partition configuration
page_size: 4096
sadump: timezone information is missing
Segmentation fault

By bisect, I found that this issue is caused by the following commit
that moves invocation of cache_init() in initial() a bit early:

# git bisect bad
8e2834bac4f62da3894da297f083068431be6d80 is the first bad commit
commit 8e2834bac4f62da3894da297f083068431be6d80
Author: Pratyush Anand 
Date:   Thu Mar 2 17:37:11 2017 +0900

[PATCH v3 2/7] initial(): call cache_init() a bit early

Call cache_init() before get_kcore_dump_loads(), because latter uses
cache_search().

Call path is like this :
get_kcore_dump_loads() -> process_dump_load() -> vaddr_to_paddr() ->
vtop4_x86_64() -> readmem() -> cache_search()

Signed-off-by: Pratyush Anand 

:100644 100644 6942047199deb09dd1fff2121e264584dbb05587 
3b8e9810468de26b0d8b73d456f0bd4f3d3aa2fe M  makedumpfile.c

At this point, on sadump vmcores, info->page_size has not been initialized
yet, so it is 0 and malloc() in cache_init() returns a chunk of 0 size. A
bit later, info->page_size is initialized to 4096, and later processing in
cache.c behaves as if the chunk size were 8 * 4096. This destroys objects
allocated after the chunk, resulting in the above Segmentation fault.
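
For illustration, a rough sketch of the failure mode described above; the
names are approximate, not the exact cache.c code:

	/* cache_init() sizes its buffer from info->page_size, which is
	 * still 0 on sadump at this point, so the CACHE_SIZE-page cache
	 * is backed by a 0-byte allocation, and every later write through
	 * it corrupts adjacent heap objects.
	 */
	buf = malloc(info->page_size * CACHE_SIZE);	/* 0 * 8 == 0 bytes */
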

To fix this issue, this commit moves setting info->page_size before
cache_init().

Signed-off-by: HATAYAMA Daisuke 
Cc: Pratyush Anand 


For 1/2
Reviewed-by: Pratyush Anand 



---
 makedumpfile.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 301772a..f300b19 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3878,6 +3878,9 @@ initial(void)
if (!get_value_for_old_linux())
return FALSE;

+   if (info->flag_sadump && !set_page_size(sadump_page_size()))
+   return FALSE;
+
if (!is_xen_memory() && !cache_init())
return FALSE;

@@ -3906,9 +3909,6 @@ initial(void)
return FALSE;
}

-   if (!set_page_size(sadump_page_size()))
-   return FALSE;
-
if (!sadump_initialize_bitmap_memory())
return FALSE;






[Makedumpfile Patch] Fix get_kcore_dump_loads() error case

2017-05-25 Thread Pratyush Anand
commit f10d1e2e94c50 introduced another bug while fixing a memory leak:
without braces, only the free(pls) call is guarded by the if condition and
the return FALSE runs unconditionally. Use braces with the if condition.

Signed-off-by: Pratyush Anand 
---
 elf_info.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/elf_info.c b/elf_info.c
index 601d66e3f176..69b1719b020f 100644
--- a/elf_info.c
+++ b/elf_info.c
@@ -893,9 +893,10 @@ int get_kcore_dump_loads(void)
if (p->phys_start == NOT_PADDR
|| !is_phys_addr(p->virt_start))
continue;
-   if (j >= loads)
+   if (j >= loads) {
free(pls);
return FALSE;
+   }
 
if (j == 0) {
offset_pt_load_memory = p->file_offset;
-- 
2.9.3

