Re: [PATCH] makedumpfile: ppc64: do page traversal if vmemmap_list not populated

2023-09-27 Thread Aditya Gupta
On Wed, Sep 27, 2023 at 06:32:44AM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2023/09/14 18:22, Aditya Gupta wrote:
> > Currently 'makedumpfile' fails to collect vmcore on upstream kernel,
> > with the errors:
> > 
> >  readpage_elf: Attempt to read non-existent page at 0x4000.
> >  readmem: type_addr: 0, addr:0, size:8
> >  get_vmemmap_list_info: Can't get vmemmap region addresses
> >  get_machdep_info_ppc64: Can't get vmemmap list info.
> > 
> > This occurs since makedumpfile depends on 'vmemmap_list' for translating
> > vmemmap addresses. But with below commit in Linux, vmemmap_list can be
> > empty, in case of Radix MMU on PowerPC64
> > 
> >  368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a
> >  different vmemmap handling function)
> > 
> > In case vmemmap_list is empty, then it's head is NULL, which causes
> > makedumpfile to fail with above error.
> > 
> > Since with above commit, 'vmemmap_list' is not populated (when MMU is
> > Radix MMU), kernel populates corresponding page table entries in kernel
> > page table. Hence, instead of depending on 'vmemmap_list' for address
> > translation for vmemmap addresses, do a kernel pagetable walk.
> > 
> > And since the pte can also be introduced at higher levels in the page
> > table, such as at PMD level, add hugepage support, by checking for
> > PAGE_PTE flag
> > 
> > Reported-by: Sachin Sant 
> > Signed-off-by: Aditya Gupta 
> 
> Thank you for the patch, applied.
> 
> https://github.com/makedumpfile/makedumpfile/commit/a34f017965583e89c4cb0b00117c200a6c191e54

Thanks for the update.

> 
> Sorry for the delay, exceptionally busy this month..

No issues :)

Thanks,
- Aditya Gupta

> 
> Thanks,
> Kazu
> 
> > ---
> >   arch/ppc64.c   | 111 ++---
> >   makedumpfile.h |   6 +++
> >   2 files changed, 84 insertions(+), 33 deletions(-)
> > 
> > diff --git a/arch/ppc64.c b/arch/ppc64.c
> > index 5e70acb51aba..9456b8b570c5 100644
> > --- a/arch/ppc64.c
> > +++ b/arch/ppc64.c
> > @@ -196,6 +196,10 @@ ppc64_vmemmap_init(void)
> > int psize, shift;
> > ulong head;
> >   
> > +   /* initialise vmemmap_list in case SYMBOL(vmemmap_list) is not found */
> > +   info->vmemmap_list = NULL;
> > +   info->vmemmap_cnt = 0;
> > +   
> > if ((SYMBOL(vmemmap_list) == NOT_FOUND_SYMBOL)
> > || (SYMBOL(mmu_psize_defs) == NOT_FOUND_SYMBOL)
> > || (SYMBOL(mmu_vmemmap_psize) == NOT_FOUND_SYMBOL)
> > @@ -216,15 +220,24 @@ ppc64_vmemmap_init(void)
> > return FALSE;
> > info->vmemmap_psize = 1 << shift;
> >   
> > -   if (!readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long)))
> > -   return FALSE;
> > -
> > /*
> > -* Get vmemmap list count and populate vmemmap regions info
> > -*/
> > -   info->vmemmap_cnt = get_vmemmap_list_info(head);
> > -   if (info->vmemmap_cnt == 0)
> > -   return FALSE;
> > +* vmemmap_list symbol can be missing or set to 0 in the kernel.
> > +* This would imply vmemmap region is mapped in the kernel pagetable.
> > +*
> > +* So, read vmemmap_list anyway, and use 'vmemmap_list' if it's not 
> > empty
> > +* (head != NULL), or we will do a kernel pagetable walk for vmemmap 
> > address
> > +* translation later
> > +**/
> > +   readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long));
> > +
> > +   if (head) {
> > +   /*
> > +* Get vmemmap list count and populate vmemmap regions info
> > +*/
> > +   info->vmemmap_cnt = get_vmemmap_list_info(head);
> > +   if (info->vmemmap_cnt == 0)
> > +   return FALSE;
> > +   }
> >   
> > info->flag_vmemmap = TRUE;
> > return TRUE;
> > @@ -347,29 +360,6 @@ ppc64_vmalloc_init(void)
> > return TRUE;
> >   }
> >   
> > -/*
> > - *  If the vmemmap address translation information is stored in the kernel,
> > - *  make the translation.
> > - */
> > -static unsigned long long
> > -ppc64_vmemmap_to_phys(unsigned long vaddr)
> > -{
> > -   int i;
> > -   ulong   offset;
> > -   unsigned long long paddr = NOT_PADDR;
> > -
> > -   for (i = 0; i < info->vmemmap_cnt; i++) {
> > -   if ((vaddr >= info->vmemmap_list[i].virt) && (vaddr <
> > -   (info->vmemmap_list[i].virt + info->vmemmap_psize))) {
> > -   offset = vaddr - info->vmemmap_list[i].virt;
> > -   paddr = info->vmemmap_list[i].phys + offset;
> > -   break;
> > -   }
> > -   }
> > -
> > -   return paddr;
> > -}
> > -
> >   static unsigned long long
> >   ppc64_vtop_level4(unsigned long vaddr)
> >   {
> > @@ -379,6 +369,8 @@ ppc64_vtop_level4(unsigned long vaddr)
> > unsigned long long pgd_pte, pud_pte;
> > unsigned long long pmd_pte, pte;
> > unsigned long long paddr = NOT_PADDR;
> > +   uint is_hugepage = 0;
> > +   uint pdshift;
> > uint swap = 0;
> >   
> > if (info->page_buf == NULL) {
> > @@ -413,6 

Re: [PATCH] makedumpfile: ppc64: do page traversal if vmemmap_list not populated

2023-09-26 Thread 萩尾 一仁
On 2023/09/14 18:22, Aditya Gupta wrote:
> Currently 'makedumpfile' fails to collect vmcore on upstream kernel,
> with the errors:
> 
>  readpage_elf: Attempt to read non-existent page at 0x4000.
>  readmem: type_addr: 0, addr:0, size:8
>  get_vmemmap_list_info: Can't get vmemmap region addresses
>  get_machdep_info_ppc64: Can't get vmemmap list info.
> 
> This occurs since makedumpfile depends on 'vmemmap_list' for translating
> vmemmap addresses. But with below commit in Linux, vmemmap_list can be
> empty, in case of Radix MMU on PowerPC64
> 
>  368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a
>  different vmemmap handling function)
> 
> In case vmemmap_list is empty, then it's head is NULL, which causes
> makedumpfile to fail with above error.
> 
> Since with above commit, 'vmemmap_list' is not populated (when MMU is
> Radix MMU), kernel populates corresponding page table entries in kernel
> page table. Hence, instead of depending on 'vmemmap_list' for address
> translation for vmemmap addresses, do a kernel pagetable walk.
> 
> And since the pte can also be introduced at higher levels in the page
> table, such as at PMD level, add hugepage support, by checking for
> PAGE_PTE flag
> 
> Reported-by: Sachin Sant 
> Signed-off-by: Aditya Gupta 

Thank you for the patch, applied.

https://github.com/makedumpfile/makedumpfile/commit/a34f017965583e89c4cb0b00117c200a6c191e54

Sorry for the delay, exceptionally busy this month..

Thanks,
Kazu

> ---
>   arch/ppc64.c   | 111 ++---
>   makedumpfile.h |   6 +++
>   2 files changed, 84 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/ppc64.c b/arch/ppc64.c
> index 5e70acb51aba..9456b8b570c5 100644
> --- a/arch/ppc64.c
> +++ b/arch/ppc64.c
> @@ -196,6 +196,10 @@ ppc64_vmemmap_init(void)
>   int psize, shift;
>   ulong head;
>   
> + /* initialise vmemmap_list in case SYMBOL(vmemmap_list) is not found */
> + info->vmemmap_list = NULL;
> + info->vmemmap_cnt = 0;
> + 
>   if ((SYMBOL(vmemmap_list) == NOT_FOUND_SYMBOL)
>   || (SYMBOL(mmu_psize_defs) == NOT_FOUND_SYMBOL)
>   || (SYMBOL(mmu_vmemmap_psize) == NOT_FOUND_SYMBOL)
> @@ -216,15 +220,24 @@ ppc64_vmemmap_init(void)
>   return FALSE;
>   info->vmemmap_psize = 1 << shift;
>   
> - if (!readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long)))
> - return FALSE;
> -
>   /*
> -  * Get vmemmap list count and populate vmemmap regions info
> -  */
> - info->vmemmap_cnt = get_vmemmap_list_info(head);
> - if (info->vmemmap_cnt == 0)
> - return FALSE;
> +  * vmemmap_list symbol can be missing or set to 0 in the kernel.
> +  * This would imply vmemmap region is mapped in the kernel pagetable.
> +  *
> +  * So, read vmemmap_list anyway, and use 'vmemmap_list' if it's not 
> empty
> +  * (head != NULL), or we will do a kernel pagetable walk for vmemmap 
> address
> +  * translation later
> +  **/
> + readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long));
> +
> + if (head) {
> + /*
> +  * Get vmemmap list count and populate vmemmap regions info
> +  */
> + info->vmemmap_cnt = get_vmemmap_list_info(head);
> + if (info->vmemmap_cnt == 0)
> + return FALSE;
> + }
>   
>   info->flag_vmemmap = TRUE;
>   return TRUE;
> @@ -347,29 +360,6 @@ ppc64_vmalloc_init(void)
>   return TRUE;
>   }
>   
> -/*
> - *  If the vmemmap address translation information is stored in the kernel,
> - *  make the translation.
> - */
> -static unsigned long long
> -ppc64_vmemmap_to_phys(unsigned long vaddr)
> -{
> - int i;
> - ulong   offset;
> - unsigned long long paddr = NOT_PADDR;
> -
> - for (i = 0; i < info->vmemmap_cnt; i++) {
> - if ((vaddr >= info->vmemmap_list[i].virt) && (vaddr <
> - (info->vmemmap_list[i].virt + info->vmemmap_psize))) {
> - offset = vaddr - info->vmemmap_list[i].virt;
> - paddr = info->vmemmap_list[i].phys + offset;
> - break;
> - }
> - }
> -
> - return paddr;
> -}
> -
>   static unsigned long long
>   ppc64_vtop_level4(unsigned long vaddr)
>   {
> @@ -379,6 +369,8 @@ ppc64_vtop_level4(unsigned long vaddr)
>   unsigned long long pgd_pte, pud_pte;
>   unsigned long long pmd_pte, pte;
>   unsigned long long paddr = NOT_PADDR;
> + uint is_hugepage = 0;
> + uint pdshift;
>   uint swap = 0;
>   
>   if (info->page_buf == NULL) {
> @@ -413,6 +405,13 @@ ppc64_vtop_level4(unsigned long vaddr)
>   if (!pgd_pte)
>   return NOT_PADDR;
>   
> + if (IS_HUGEPAGE(pgd_pte)) {
> + is_hugepage = 1;
> + pte = pgd_pte;
> + pdshift = info->l4_shift;
> + goto out;
> + }
> 

Re: [PATCH] makedumpfile: ppc64: do page traversal if vmemmap_list not populated

2023-09-14 Thread Sachin Sant


> On 14-Sep-2023, at 2:52 PM, Aditya Gupta  wrote:
> 
> Currently 'makedumpfile' fails to collect vmcore on upstream kernel,
> with the errors:
> 
>readpage_elf: Attempt to read non-existent page at 0x4000.
>readmem: type_addr: 0, addr:0, size:8
>get_vmemmap_list_info: Can't get vmemmap region addresses
>get_machdep_info_ppc64: Can't get vmemmap list info.
> 
> This occurs since makedumpfile depends on 'vmemmap_list' for translating
> vmemmap addresses. But with below commit in Linux, vmemmap_list can be
> empty, in case of Radix MMU on PowerPC64
> 
>368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a
>different vmemmap handling function)
> 
> In case vmemmap_list is empty, then it's head is NULL, which causes
> makedumpfile to fail with above error.
> 
> Since with above commit, 'vmemmap_list' is not populated (when MMU is
> Radix MMU), kernel populates corresponding page table entries in kernel
> page table. Hence, instead of depending on 'vmemmap_list' for address
> translation for vmemmap addresses, do a kernel pagetable walk.
> 
> And since the pte can also be introduced at higher levels in the page
> table, such as at PMD level, add hugepage support, by checking for
> PAGE_PTE flag
> 
> Reported-by: Sachin Sant 
> Signed-off-by: Aditya Gupta 
> —
Thanks for the fix. With this (and the corresponding kernel) fix applied,
I am able to capture crash dump.

Tested-by: Sachin Sant 

- Sachin

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] makedumpfile: ppc64: do page traversal if vmemmap_list not populated

2023-09-14 Thread Aditya Gupta
To be able to test this on a system with Radix MMU, the kernel needs to be
patched with given kernel patch, since makedumpfile requires
'cpu_features.mmu_features' to check if MMU was Radix MMU:

https://lore.kernel.org/linuxppc-dev/20230911091409.415662-1-adit...@linux.ibm.com/T/#u

Or, the vmlinux can be provided to makedumpfile, so that it finds the
'cur_cpu_spec.mmu_features' symbol, run makedumpfile with '-x' flag similar to
this:

makedumpfile -l -d 31 -x VMLINUX /proc/vmcore /var/crash/vmcore.kdump

Thanks,
Aditya Gupta


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec