Re: [PATCH] makedumpfile: call initial before use cache

2024-07-25 Thread  

On 2024/07/23 16:31, Lichen Liu wrote:
> Yes, it works!
> Thank you so much!

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/900190de6b67b2de410cfc8023c1b198a416ceb3

Kazu

> 
> On Mon, Jul 22, 2024 at 2:46 PM HAGIO KAZUHITO(萩尾 一仁)
>  wrote:
>>
>> On 2024/07/20 16:38, Lichen Liu wrote:
>>
>>>> This will work only for 4.19 and later kernels, but it might reduce the
>>>> number of users who hit the issue.  Does this work for you?
>>> That works for me because I'm testing for the 6.x kernel.
>>
>> Thanks for the check.  Is the issue a segmentation fault?
>> I've made a patch below, is this OK?
>>
>>
>>   From 084742cba5b81da563074454ab8c879e8e411cb0 Mon Sep 17 00:00:00 2001
>> From: Kazuhito Hagio 
>> Date: Mon, 22 Jul 2024 14:31:43 +0900
>> Subject: [PATCH] Workaround for segfault by "makedumpfile --mem-usage" on PPC64
>>
>> "makedumpfile --mem-usage /proc/kcore" can cause a segmentation fault on
>> PPC64, because the readmem() of the following code path uses cache
>> before it's initialized in initial().
>>
>>   show_mem_usage
>>     get_page_offset
>>       get_versiondep_info_ppc64
>>         readmem
>>     ...
>>     initial
>>       cache_init
>>
>> The get_page_offset() is needed to get vmcoreinfo from /proc/kcore data,
>> so we can avoid calling it when a vmcoreinfo exists in the ELF NOTE
>> segment of /proc/kcore, i.e. on Linux 4.19 and later.
>>
>> (Note: for older kernels, we will need another way to fix it.)
>>
>> Reported-by: Lichen Liu 
>> Signed-off-by: Kazuhito Hagio 
>> ---
>>makedumpfile.c | 12 ++--
>>1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/makedumpfile.c b/makedumpfile.c
>> index 5b347126db76..7d1dfcca50d8 100644
>> --- a/makedumpfile.c
>> +++ b/makedumpfile.c
>> @@ -12019,14 +12019,14 @@ int show_mem_usage(void)
>>  DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", 
>> vmcoreinfo);
>>  }
>>
>> -   if (!get_page_offset())
>> -   return FALSE;
>> +   if (!vmcoreinfo) {
>> +   if (!get_page_offset())
>> +   return FALSE;
>>
>> -   /* paddr_to_vaddr() on arm64 needs phys_base. */
>> -   if (!get_phys_base())
>> -   return FALSE;
>> +   /* paddr_to_vaddr() on arm64 needs phys_base. */
>> +   if (!get_phys_base())
>> +   return FALSE;
>>
>> -   if (!vmcoreinfo) {
>>  if (!get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
>>  return FALSE;
>>
>> --
>> 2.31.1
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] makedumpfile: call initial before use cache

2024-07-22 Thread  
On 2024/07/20 16:38, Lichen Liu wrote:

>> This will work only for 4.19 and later kernels, but it might reduce the
>> number of users who hit the issue.  Does this work for you?
> That works for me because I'm testing for the 6.x kernel.

Thanks for the check.  Is the issue a segmentation fault?
I've made a patch below, is this OK?


 From 084742cba5b81da563074454ab8c879e8e411cb0 Mon Sep 17 00:00:00 2001
From: Kazuhito Hagio 
Date: Mon, 22 Jul 2024 14:31:43 +0900
Subject: [PATCH] Workaround for segfault by "makedumpfile --mem-usage" on PPC64

"makedumpfile --mem-usage /proc/kcore" can cause a segmentation fault on
PPC64, because the readmem() of the following code path uses cache
before it's initialized in initial().

   show_mem_usage
     get_page_offset
       get_versiondep_info_ppc64
         readmem
     ...
     initial
       cache_init

The get_page_offset() is needed to get vmcoreinfo from /proc/kcore data,
so we can avoid calling it when a vmcoreinfo exists in the ELF NOTE
segment of /proc/kcore, i.e. on Linux 4.19 and later.

(Note: for older kernels, we will need another way to fix it.)

Reported-by: Lichen Liu 
Signed-off-by: Kazuhito Hagio 
---
  makedumpfile.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 5b347126db76..7d1dfcca50d8 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -12019,14 +12019,14 @@ int show_mem_usage(void)
DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", vmcoreinfo);
}
  
-   if (!get_page_offset())
-   return FALSE;
+   if (!vmcoreinfo) {
+   if (!get_page_offset())
+   return FALSE;
  
-   /* paddr_to_vaddr() on arm64 needs phys_base. */
-   if (!get_phys_base())
-   return FALSE;
+   /* paddr_to_vaddr() on arm64 needs phys_base. */
+   if (!get_phys_base())
+   return FALSE;
  
-   if (!vmcoreinfo) {
if (!get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
return FALSE;
  
-- 
2.31.1


Re: [PATCH] makedumpfile: make reserve_diskspace do nothing for flattened format

2024-07-19 Thread  
Hi Jiri,

sorry for the long delay.

On 2024/06/19 21:04, Jiri Bohac wrote:
> makedumpfile: make reserve_diskspace do nothing for flattened format
> 
> reserve_diskspace() is called by write_elf_header() to make sure there is
> always space to write the program header, even if writing other data fails
> because of ENOSPC.
> 
> This is harmful when writing the flattened format to STDOUT for two reasons:
> 
> First, it actually wastes disk space, because first the block of zeroes is
> sent to STDOUT by reserve_diskspace() and then the actual program header is
> sent, meant to overwrite the zeroes when the flattened format is rearranged.
> 
> Second, the algorithm used to read flattened format directly by the crash
> program does not cope with the flattened file containing two chunks meant for
> the same offset. It uses a binary search on a sorted array of flat_data
> headers to find the data in the flat file. It may return the zeroed chunk
> written by reserve_diskspace() near the beginning of the file instead of the
> actual ELF header located near the end of the flattened file.

Thank you for the patch, I found a vmcore that reproduced the issue:

$ makedumpfile -FEd 31 vmcore > dump.FEd31
$ crash vmlinux dump.FEd31
...
realloc: No such file or directory
cannot realloc resized ELF header buffer
$

and the patch fixed this, so

Acked-by: Kazuhito Hagio 

(Masa will apply the patch, please wait for a while.)

Thanks,
Kazu
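
(Illustration only, not part of the patch.)  The second problem quoted
below can be sketched with a small, self-contained example.  This is not
the crash utility's code: struct flat_chunk, find_chunk() and example_tbl
are hypothetical names, and the table layout is an assumption chosen to
show the effect.  A plain binary search over headers sorted by offset
returns whichever matching entry it lands on first, so when two chunks
claim the same offset the zero placeholder can win over the real header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one chunk header in a flattened file. */
struct flat_chunk {
	long long offset;    /* where the chunk belongs in the rearranged dumpfile */
	long long file_pos;  /* where the chunk's data sits in the flattened file */
	int zero_fill;       /* 1 for the placeholder written by reserve_diskspace() */
};

/*
 * Plain binary search on an array sorted by offset.  It returns the first
 * match it lands on, so with two chunks for the same offset the result
 * depends only on the array layout, not on which chunk was written last.
 */
static const struct flat_chunk *
find_chunk(const struct flat_chunk *tbl, size_t n, long long offset)
{
	long lo = 0, hi = (long)n - 1;

	assert(tbl != NULL);
	while (lo <= hi) {
		long mid = lo + (hi - lo) / 2;

		if (tbl[mid].offset == offset)
			return &tbl[mid];
		if (tbl[mid].offset < offset)
			lo = mid + 1;
		else
			hi = mid - 1;
	}
	return NULL;
}

/* Two chunks claim offset 0: the zero placeholder and the real ELF header. */
static const struct flat_chunk example_tbl[] = {
	{     0,    32, 1 },  /* zeroes, near the beginning of the file */
	{     0, 90000, 0 },  /* actual header, near the end of the file */
	{  4096,  4200, 0 },
	{  8192,  8400, 0 },
	{ 12288, 12600, 0 },
};
```

With this layout, looking up offset 0 yields the zero-filled entry.  With
the patch applied, the placeholder is never written in flattened mode, so
only one chunk per offset exists and the lookup is unambiguous.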


> 
> Fixes: e39216fce9f73759509ec158e39c289e6c211125 ("Make the incomplete dumpfile generated by ENOSPC error analyzable.")
> Signed-off-by: Jiri Bohac 
> 
> ---
>   makedumpfile.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index cadc596..9624c3f 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -5206,6 +5206,9 @@ reserve_diskspace(int fd, off_t start_offset, off_t end_offset, char *file_name)
>   
>   int ret = FALSE;
>   
> + if (info->flag_flatten)
> + return TRUE;
> +
>   assert(start_offset < end_offset);
>   buf_size = end_offset - start_offset;
>   


Re: [PATCH] makedumpfile: call initial before use cache

2024-07-17 Thread  
Hi Lichen,

sorry for the long delay.

On 2024/06/25 10:57, Lichen Liu wrote:
> Running 'makedumpfile --mem-usage /proc/kcore' dumps core on ppc64, because
> show_mem_usage()->get_page_offset()->get_versiondep_info_ppc64()
> ->readmem() uses the cache before it is initialized by initial().
> 
> Currently only ppc64 has this issue because only
> get_versiondep_info_ppc64() calls readmem().
> 
> Signed-off-by: Lichen Liu 
> ---
>   makedumpfile.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 5b34712..6a42264 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -12019,6 +12019,9 @@ int show_mem_usage(void)
>   DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", 
> vmcoreinfo);
>   }
>   
> + if (!initial())
> + return FALSE;
> +
>   if (!get_page_offset())
>   return FALSE;
>   
> @@ -12034,9 +12037,6 @@ int show_mem_usage(void)
>   return FALSE;
>   }
>   
> - if (!initial())
> - return FALSE;
> -
>   if (!open_dump_bitmap())
>   return FALSE;
>   

initial() needs to be called after set_kcore_vmcoreinfo() when there is 
no vmcoreinfo in the /proc/kcore ELF note.

So with the patch, "makedumpfile --mem-usage" fails on kernels that do 
not have a vmcoreinfo in the ELF note, e.g. the RHEL7 kernel:

   # makedumpfile-dev -f --mem-usage /proc/kcore
   exclude_free_page: Can't get necessary symbols for excluding free pages.

   makedumpfile Failed.


Probably readmem() should not be called before initial() in the first 
place.  I think it's the root cause, but I'm not sure how we can fix it.

A workaround I thought of is moving get_page_offset() and 
get_phys_base() into the !vmcoreinfo block.  These are needed only by 
set_kcore_vmcoreinfo(), so we can avoid calling them if there is a 
vmcoreinfo in the ELF note:

--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -12019,14 +12019,14 @@ int show_mem_usage(void)
 DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", 
vmcoreinfo);
 }

-   if (!get_page_offset())
-   return FALSE;
+   if (!vmcoreinfo) {
+   if (!get_page_offset())
+   return FALSE;

-   /* paddr_to_vaddr() on arm64 needs phys_base. */
-   if (!get_phys_base())
-   return FALSE;
+   /* paddr_to_vaddr() on arm64 needs phys_base. */
+   if (!get_phys_base())
+   return FALSE;

-   if (!vmcoreinfo) {
 if (!get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
 return FALSE;


This will work only for 4.19 and later kernels, but it might reduce the 
number of users who hit the issue.  Does this work for you?

Thanks,
Kazu
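
(Illustration only.)  The ordering issue and the effect of the workaround
can be sketched in a few lines.  This is not makedumpfile code: every
name ending in _sketch, plus cache_ready and read_via_cache(), is a
hypothetical stand-in, and the real functions do far more than this.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these names exist in makedumpfile. */
static int cache_ready;        /* set by initial_sketch(), like cache_init() */
static int reads_before_init;  /* reads issued while the cache is not ready */

/* Models readmem(): it relies on the cache having been set up. */
static int read_via_cache(void)
{
	if (!cache_ready)
		reads_before_init++;  /* the real code would touch uninitialized state */
	return 1;
}

/* Models get_page_offset() on ppc64, which ends up calling readmem(). */
static int get_page_offset_sketch(void)
{
	return read_via_cache();
}

/* Models initial(), which is where the cache gets initialized. */
static int initial_sketch(void)
{
	cache_ready = 1;
	return 1;
}

/*
 * Models the proposed flow of show_mem_usage(): the early address setup
 * runs only when vmcoreinfo was NOT found in the ELF note.
 * Returns the number of reads that happened before the cache was ready.
 */
static int show_mem_usage_sketch(int vmcoreinfo_in_note)
{
	cache_ready = 0;
	reads_before_init = 0;

	if (!vmcoreinfo_in_note) {
		if (!get_page_offset_sketch())
			return -1;
		/* get_phys_base() and the vmcoreinfo lookup would follow here */
	}

	if (!initial_sketch())
		return -1;

	assert(cache_ready);
	return reads_before_init;
}
```

In this model, show_mem_usage_sketch(1) reports no early read, while
show_mem_usage_sketch(0) still reports one, which matches the caveat
above that kernels without a vmcoreinfo in the ELF note are not covered.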


Re: enhance makedumpfile to dump process list from vmcore

2024-06-24 Thread  
Hi Dave,

thank you for your comments.

On 2024/06/24 10:52, Dave Young wrote:
> Hi Kazu,
> 
> On Wed, 19 Jun 2024 at 13:45, HAGIO KAZUHITO(萩尾 一仁)  
> wrote:
>>
>> On 2024/06/06 3:10, Prasad Koya wrote:
>>> Hi
>>>
>>> We use makedumpfile to read the dmesg buffer from the vmcore. We are
>>> working on adding a feature to makedumpfile to extract the "process
>>> list" from /proc/vmcore. Our main use case is to save the processes
>>> that were running at the time of kernel crash to a file on the
>>> permanent storage from the kdump kernel. We intend to read a handful
>>> of useful data for each running task: pid, parent pid, real parent
>>> pid, priority, nice value, RSS, VM size, command, utime, stime, OOM
>>> score.
>>>
>>> Process list from vmcore can probably be read using crash utility, but
>>> in our embedded system using makedumpfile to do that job makes sense
>>> for us.
>>>
>>> If such an option is useful for the general users of makedumpfile,
>>> we'd like to get more inputs and contribute to this feature.
>>
>> Hi Prasad,
>>
>> thank you for sharing the idea.  But sorry, currently we are not
>> interested in implementing such a big function in makedumpfile.
>>
>> The dmesg buffer is the most important piece of information, since it
>> alone can be enough to determine the cause of a crash, so the --dump-dmesg
>> option was implemented in makedumpfile early on.  But now we have the
>> handier vmcore-dmesg command in kexec-tools, so we don't intend to add
>> functions to makedumpfile other than making a dumpfile.
>>
>> I'm not sure whether these are doable, but I would suggest a few ideas
>> instead:
>> - make e.g. vmcore-process-list command like vmcore-dmesg.
> 
> Not sure if this is worth either.

I also don't think that this (dumping a process list in the 2nd kernel) is 
worth such a (probably) big effort, because we can make a dumpfile.

I assumed that they cannot create a dumpfile, because they know that the 
crash utility can read the process list from a vmcore.  I'm not sure why 
doing it in makedumpfile makes sense to them.

>   I guess it can be done with some
> crash tool scripts or a crash plugin?

Yes, agree.

> 
>> - use panic notifier to dump process list to the dmesg buffer before
>> kdump, and use vmcore-dmesg to get it.
> 
> It is not recommended to add these before kdump if people still want a
> reliable kdump.

Agreed.  A panic situation is unstable, and I don't intend to recommend 
this in general.  I just thought that if they cannot create a dumpfile, 
maybe they can make a module that uses the notifier, as a last resort.

Thanks,
Kazu


Re: enhance makedumpfile to dump process list from vmcore

2024-06-18 Thread  
On 2024/06/06 3:10, Prasad Koya wrote:
> Hi
> 
> We use makedumpfile to read the dmesg buffer from the vmcore. We are
> working on adding a feature to makedumpfile to extract the "process
> list" from /proc/vmcore. Our main use case is to save the processes
> that were running at the time of kernel crash to a file on the
> permanent storage from the kdump kernel. We intend to read a handful
> of useful data for each running task: pid, parent pid, real parent
> pid, priority, nice value, RSS, VM size, command, utime, stime, OOM
> score.
> 
> Process list from vmcore can probably be read using crash utility, but
> in our embedded system using makedumpfile to do that job makes sense
> for us.
> 
> If such an option is useful for the general users of makedumpfile,
> we'd like to get more inputs and contribute to this feature.

Hi Prasad,

thank you for sharing the idea.  But sorry, currently we are not 
interested in implementing such a big function in makedumpfile.

The dmesg buffer is the most important piece of information, since it 
alone can be enough to determine the cause of a crash, so the --dump-dmesg 
option was implemented in makedumpfile early on.  But now we have the 
handier vmcore-dmesg command in kexec-tools, so we don't intend to add 
functions to makedumpfile other than making a dumpfile.

I'm not sure whether these are doable, but I would suggest a few ideas 
instead:
- make a separate command, e.g. vmcore-process-list, like vmcore-dmesg.
- use a panic notifier to dump the process list to the dmesg buffer before 
kdump, and use vmcore-dmesg to get it.

Thanks,
Kazu


Re: [PATCH 2/2 makedumpfile] Fix wrong exclusion of Slab pages on Linux 6.10-rc1 and later

2024-06-12 Thread  

On 2024/06/07 15:34, HAGIO KAZUHITO(萩尾 一仁) wrote:
> * Required for kernel 6.10
> 
> Kernel commit 46df8e73a4a3 ("mm: free up PG_slab") moved the PG_slab
> flag from page.flags into page._mapcount (slab.__page_type), and
> introduced NUMBER(PAGE_SLAB_MAPCOUNT_VALUE) entry into vmcoreinfo.
> 
> Without the patch, "makedumpfile -d 8" option wrongly excludes Slab
> pages and crash cannot open the dumpfile with an error like this:
> 
>$ crash --kaslr auto vmlinux dumpfile
>...
>please wait... (gathering task table data)
>crash: page excluded: kernel virtual address: 909980440270 type: "xa_node.slots[off]"
> 
> Signed-off-by: Kazuhito Hagio 

Applied the two patches for 6.9 and 6.10.
https://github.com/makedumpfile/makedumpfile/commit/985e575253f1c2de8d6876cfe685c68a24ee06e1
https://github.com/makedumpfile/makedumpfile/commit/bad2a7c4fa75d37a41578441468584963028bdda

Thanks,
Kazu
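
(Illustration only.)  The page-type test used by the patch quoted above
can be tried in isolation.  The sketch below is self-contained and makes
two assumptions that are not taken from this thread: the base mask is
0xf0000000 and the slab mapcount value is 0xffffefff, as in the kernel's
page-type encoding around 6.9/6.10.  makedumpfile itself reads the
*_MAPCOUNT_VALUE numbers from vmcoreinfo instead of hard-coding them, and
has_page_type() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch; see the note above. */
#define SKETCH_PAGE_TYPE_BASE       0xf0000000u
#define SKETCH_SLAB_MAPCOUNT_VALUE  0xffffefffu  /* ~0x00001000 */

/*
 * Hypothetical helper: a page has the given type when all bits of the
 * base mask are set in _mapcount and the type's own bit is cleared.
 * The type bit is recovered by inverting the value from vmcoreinfo.
 */
static int has_page_type(uint32_t mapcount, uint32_t type_mapcount_value)
{
	uint32_t type_bit = ~type_mapcount_value;

	/* exactly one bit identifies the type */
	assert(type_bit != 0 && (type_bit & (type_bit - 1)) == 0);

	return (mapcount & (SKETCH_PAGE_TYPE_BASE | type_bit))
		== SKETCH_PAGE_TYPE_BASE;
}
```

Under these assumptions a _mapcount of 0xffffefff is recognized as a slab
page, while 0xffffffff (no type set) and 0 (an ordinary mapped page) are
not.  The same shape of check applies to the hugetlb value from the 6.9
patch.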


[PATCH 2/2 makedumpfile] Fix wrong exclusion of Slab pages on Linux 6.10-rc1 and later

2024-06-07 Thread  
* Required for kernel 6.10

Kernel commit 46df8e73a4a3 ("mm: free up PG_slab") moved the PG_slab
flag from page.flags into page._mapcount (slab.__page_type), and
introduced NUMBER(PAGE_SLAB_MAPCOUNT_VALUE) entry into vmcoreinfo.

Without the patch, "makedumpfile -d 8" option wrongly excludes Slab
pages and crash cannot open the dumpfile with an error like this:

  $ crash --kaslr auto vmlinux dumpfile
  ...
  please wait... (gathering task table data)
  crash: page excluded: kernel virtual address: 909980440270 type: "xa_node.slots[off]"

Signed-off-by: Kazuhito Hagio 
---
NOTE: This patch can be applied on top of "[PATCH makedumpfile] Fix
failure of hugetlb pages exclusion on Linux 6.9 and later".

 makedumpfile.c | 24 +++-
 makedumpfile.h |  6 +++---
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index a6cadc9..cc39f73 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -275,13 +275,26 @@ isHugetlb(unsigned long dtor)
   && (SYMBOL(free_huge_page) == dtor));
 }
 
+static inline int
+isSlab(unsigned long flags, unsigned int _mapcount)
+{
+   /* Linux 6.10 and later */
+   if (NUMBER(PAGE_SLAB_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
+   unsigned int PG_slab = ~NUMBER(PAGE_SLAB_MAPCOUNT_VALUE);
+   if ((_mapcount & (PAGE_TYPE_BASE | PG_slab)) == PAGE_TYPE_BASE)
+   return TRUE;
+   }
+
+   return flags & (1UL << NUMBER(PG_slab));
+}
+
 static int
 isOffline(unsigned long flags, unsigned int _mapcount)
 {
if (NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE) == NOT_FOUND_NUMBER)
return FALSE;
 
-   if (flags & (1UL << NUMBER(PG_slab)))
+   if (isSlab(flags, _mapcount))
return FALSE;
 
if (_mapcount == (int)NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE))
@@ -2977,6 +2990,7 @@ read_vmcoreinfo(void)
READ_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
READ_NUMBER("PAGE_HUGETLB_MAPCOUNT_VALUE", PAGE_HUGETLB_MAPCOUNT_VALUE);
READ_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE", PAGE_OFFLINE_MAPCOUNT_VALUE);
+   READ_NUMBER("PAGE_SLAB_MAPCOUNT_VALUE", PAGE_SLAB_MAPCOUNT_VALUE);
READ_NUMBER("phys_base", phys_base);
READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
 
@@ -6043,7 +6057,7 @@ static int
 page_is_buddy_v3(unsigned long flags, unsigned int _mapcount,
unsigned long private, unsigned int _count)
 {
-   if (flags & (1UL << NUMBER(PG_slab)))
+   if (isSlab(flags, _mapcount))
return FALSE;
 
if (_mapcount == (int)NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE))
@@ -6618,7 +6632,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 */
else if ((info->dump_level & DL_EXCLUDE_CACHE)
&& is_cache_page(flags)
-   && !isPrivate(flags) && !isAnon(mapping, flags)) {
+   && !isPrivate(flags) && !isAnon(mapping, flags, _mapcount)) {
pfn_counter = &pfn_cache;
}
/*
@@ -6626,7 +6640,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 */
else if ((info->dump_level & DL_EXCLUDE_CACHE_PRI)
&& is_cache_page(flags)
-   && !isAnon(mapping, flags)) {
+   && !isAnon(mapping, flags, _mapcount)) {
if (isPrivate(flags))
pfn_counter = &pfn_cache_private;
else
@@ -6638,7 +6652,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 *  - hugetlbfs pages
 */
else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
-&& (isAnon(mapping, flags) || isHugetlb(compound_dtor))) {
+&& (isAnon(mapping, flags, _mapcount) || isHugetlb(compound_dtor))) {
pfn_counter = &pfn_user;
}
/*
diff --git a/makedumpfile.h b/makedumpfile.h
index f08c49f..6b43a8b 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -161,9 +161,8 @@ test_bit(int nr, unsigned long addr)
 #define isSwapBacked(flags)test_bit(NUMBER(PG_swapbacked), flags)
 #define isHWPOISON(flags)  (test_bit(NUMBER(PG_hwpoison), flags) \
&& (NUMBER(PG_hwpoison) != NOT_FOUND_NUMBER))
-#define isSlab(flags)  test_bit(NUMBER(PG_slab), flags)
-#define isAnon(mapping, flags) (((unsigned long)mapping & PAGE_MAPPING_ANON) != 0 \
-   && !isSlab(flags))
+#define isAnon(mapping, flags, _mapcount) \
+   (((unsigned long)mapping & PAGE_MAPPING_ANON) != 0 && !isSlab(flags, _mapcount))
 
 #define PAGE_TYPE_BASE (0xf000)
 
@@ -2259,6 +2258,7 @@ struct number_table {
longPAGE_BUDDY_MAPCOUNT_VALUE;
longPAGE_HUGETLB_MAPCOUNT_VALUE;
longPAGE_OFFLINE_MAPCOUNT_VALUE;
+   long

[PATCH makedumpfile] Fix failure of hugetlb pages exclusion on Linux 6.9 and later

2024-05-30 Thread  
* Required for kernel 6.9

Kernel commit d99e3140a4d3 ("mm: turn folio_test_hugetlb into a
PageType") moved the PG_hugetlb flag from folio._flags_1 into
page._mapcount and introduced NUMBER(PAGE_HUGETLB_MAPCOUNT_VALUE) entry
into vmcoreinfo.

Without the patch, "makedumpfile -d 8" cannot exclude hugetlb pages.

Signed-off-by: Kazuhito Hagio 
---
 makedumpfile.c | 22 --
 makedumpfile.h |  3 +++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index d7f1dd41d2ca..a6cadc9092c9 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -2975,6 +2975,7 @@ read_vmcoreinfo(void)
READ_SRCFILE("pud_t", pud_t);
 
READ_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
+   READ_NUMBER("PAGE_HUGETLB_MAPCOUNT_VALUE", PAGE_HUGETLB_MAPCOUNT_VALUE);
READ_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE", PAGE_OFFLINE_MAPCOUNT_VALUE);
READ_NUMBER("phys_base", phys_base);
READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
@@ -6510,6 +6511,9 @@ __exclude_unnecessary_pages(unsigned long mem_map,
_count  = UINT(pcache + OFFSET(page._refcount));
mapping = ULONG(pcache + OFFSET(page.mapping));
 
+   if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
+   _mapcount = UINT(pcache + OFFSET(page._mapcount));
+
compound_order = 0;
compound_dtor = 0;
/*
@@ -6520,6 +6524,22 @@ __exclude_unnecessary_pages(unsigned long mem_map,
if ((index_pg < PGMM_CACHED - 1) && isCompoundHead(flags)) {
unsigned char *addr = pcache + SIZE(page);
 
+   /*
+* Linux 6.9 and later kernels use _mapcount value for hugetlb pages.
+* See kernel commit d99e3140a4d3.
+*/
+   if (NUMBER(PAGE_HUGETLB_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
+   unsigned long _flags_1 = ULONG(addr + OFFSET(page.flags));
+   unsigned int PG_hugetlb = ~NUMBER(PAGE_HUGETLB_MAPCOUNT_VALUE);
+
+   compound_order = _flags_1 & 0xff;
+
+   if ((_mapcount & (PAGE_TYPE_BASE | PG_hugetlb)) == PAGE_TYPE_BASE)
+   compound_dtor = IS_HUGETLB;
+
+   goto check_order;
+   }
+
/*
 * Linux 6.6 and later.  Kernels that have PG_hugetlb 
should also
 * have the compound order in the low byte of 
folio._flags_1.
@@ -6564,8 +6584,6 @@ check_order:
if (OFFSET(page.compound_head) != NOT_FOUND_STRUCTURE)
compound_head = ULONG(pcache + 
OFFSET(page.compound_head));
 
-   if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
-   _mapcount = UINT(pcache + OFFSET(page._mapcount));
if (OFFSET(page.private) != NOT_FOUND_STRUCTURE)
private = ULONG(pcache + OFFSET(page.private));
 
diff --git a/makedumpfile.h b/makedumpfile.h
index 75b66ceaba21..f08c49fc73be 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -165,6 +165,8 @@ test_bit(int nr, unsigned long addr)
 #define isAnon(mapping, flags) (((unsigned long)mapping & PAGE_MAPPING_ANON) 
!= 0 \
&& !isSlab(flags))
 
+#define PAGE_TYPE_BASE (0xf000)
+
 #define PTOB(X)(((unsigned long long)(X)) << 
PAGESHIFT())
 #define BTOP(X)(((unsigned long long)(X)) >> 
PAGESHIFT())
 
@@ -2255,6 +2257,7 @@ struct number_table {
longPG_hugetlb;
 
longPAGE_BUDDY_MAPCOUNT_VALUE;
+   longPAGE_HUGETLB_MAPCOUNT_VALUE;
longPAGE_OFFLINE_MAPCOUNT_VALUE;
longSECTION_SIZE_BITS;
longMAX_PHYSMEM_BITS;
-- 
2.31.1



Re: [PATCH makedumpfile] Make sbindir configurable

2024-04-26 Thread  
On 2024/04/24 11:20, Coiby Xu wrote:
> Fedora is going to unify bin and sbin, and the /usr/sbin directory will
> become a symlink to bin [1]. So make sbindir configurable to support this case.
> 
> [1] https://fedoraproject.org/wiki/Changes/Unify_bin_and_sbin
> 
> Signed-off-by: Coiby Xu 

Thank you for the patch, but sorry, we will be back on May 7 and then 
start to work on patches etc., so please wait for a while.

Thanks,
Kazu

> ---
>   Makefile | 6 --
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 55c9c7a..0cd2b03 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -101,6 +101,8 @@ LINK_TEST_PROG="int main() { return 0; }"
>   LIBS := $(LIBS) $(call try-run,\
>   echo $(LINK_TEST_PROG) | $(CC) -o "$$TMP" -x c - -lebl,-lebl,)
>   
> +sbindir ?= /usr/sbin
> +
>   all: makedumpfile
>   
>   $(OBJ_PART): $(SRC_PART)
> @@ -126,8 +128,8 @@ clean:
>   rm -f $(OBJ) $(OBJ_PART) $(OBJ_ARCH) makedumpfile makedumpfile.8 makedumpfile.conf.5
>   
>   install:
> - install -m 755 -d ${DESTDIR}/usr/sbin ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
> - install -m 755 -t ${DESTDIR}/usr/sbin makedumpfile $(VPATH)makedumpfile-R.pl
> + install -m 755 -d ${DESTDIR}/${sbindir} ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
> + install -m 755 -t ${DESTDIR}/${sbindir} makedumpfile $(VPATH)makedumpfile-R.pl
>   install -m 644 -t ${DESTDIR}/usr/share/man/man8 makedumpfile.8
>   install -m 644 -t ${DESTDIR}/usr/share/man/man5 makedumpfile.conf.5
>   mkdir -p ${DESTDIR}/usr/share/makedumpfile/eppic_scripts


[PATCH makedumpfile] Update maintainers

2024-04-22 Thread  
Add Masamitsu Yamazaki (NEC) as co-maintainer.

Signed-off-by: Kazuhito Hagio 
---
 README | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README b/README
index b9d8dc3dcff4..6f22af725114 100644
--- a/README
+++ b/README
@@ -266,6 +266,7 @@
 
 * BUG REPORT
   If finding some bugs, please send the information to the following:
+  Masamitsu Yamazaki 
   Kazuhito Hagio 
   kexec-ml 
 
-- 
2.31.1



[ANNOUNCE] makedumpfile 1.7.5

2024-04-11 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.5.
Thank you everyone for your help to maintain the tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
https://github.com/makedumpfile/makedumpfile/releases

New features:
- Support for kernels up to v6.8 (x86_64)
- Support for printk caller_id by --dump-dmesg option

Commits since 1.7.4:
c266469 [v1.7.5] Update version (Kazuhito Hagio)
94241fd [PATCH] ppc64: get vmalloc start address from vmcoreinfo (Aditya Gupta)
71ac00c [PATCH] ppc64: read cur_mmu_type from vmcoreinfo (Aditya Gupta)
48bb1e0 [PATCH] add PRINTK_CALLER id support to --dump-dmesg option (Edward Chron)
6f8325d [PATCH v2 2/2] s390x: uncouple virtual and physical address spaces (Alexander Gordeev)
7bb90b7 [PATCH 1/2] s390x: fix virtual vs physical address confusion (Alexander Gordeev)

Description of makedumpfile:
makedumpfile is a tool for creating a dumpfile from /proc/vmcore. It
filters out pages that are unnecessary for analysis and compresses the
remaining pages, in order to reduce the size of the dumpfile and the
time needed to create it.
https://github.com/makedumpfile/makedumpfile

Thanks,
Kazu


Re: [PATCH] makedumpfile: ppc64: get vmalloc start address from vmcoreinfo

2024-03-18 Thread  
On 2024/03/18 17:26, Aditya Gupta wrote:
> Hi,
> The commit removing 'vmap_area_list' is now merged in Linux mainline tree.
>      commit:     55c49fee57af99f3c663e69dedc5b85e691bbe50
>      mm/vmalloc: remove vmap_area_list

Applied with this commit id and the fix.
https://github.com/makedumpfile/makedumpfile/commit/94241fd2feed059227a243618f2acc6aabf366e8

Thanks,
Kazu

> 
> Any comments on this patch ?
> 
> Thanks,
> 
> Aditya Gupta
> 
> On 24/02/24 00:33, Aditya Gupta wrote:
>> Below error was noticed when running makedumpfile on linux-next kernel
>> crash (linux-next tag next-20240121):
>>
>>  ...
>>  Checking for memory holes : [100.0 %] | readpage_elf: Attempt to read non-existent page at 0xc.
>>  [ 17.551718] kdump.sh[404]: readmem: type_addr: 0, addr:c00c, size:16384
>>  [ 17.551793] kdump.sh[404]: __exclude_unnecessary_pages: Can't read the buffer of struct page.
>>  [ 17.551864] kdump.sh[404]: create_2nd_bitmap: Can't exclude unnecessary pages.
>>  [ 17.562632] kdump.sh[404]: The kernel version is not supported.
>>  [ 17.562708] kdump.sh[404]: The makedumpfile operation may be incomplete.
>>  [ 17.562773] kdump.sh[404]: makedumpfile Failed.
>>  [ 17.564335] kdump[406]: saving vmcore failed, _exitcode:1
>>
>> Above error was due to 'vmap_area_list' and 'vmlist' symbols missing
>> from the vmcore.
>>
>> 'vmap_area_list' was removed in the linux kernel with below commit:
>>
>>  commit 378eb24a0658dd922b29524e0ce35c6c43f56cba
>>   mm/vmalloc: remove vmap_area_list
>>
>> Subsequently the commit also introduced 'VMALLOC_START' in vmcoreinfo to
>> get base address of vmalloc area, instead of depending on 
>> 'vmap_area_list'
>>
>> Hence if 'VMALLOC_START' symbol is there in vmcoreinfo:
>>    1. Set vmalloc_start based on 'VMALLOC_START'
>>    2. Don't error if vmap_area_list/vmlist are not defined
>>
>> Reported-by: Sachin Sant 
>> Signed-off-by: Aditya Gupta 
>> ---
>>   arch/ppc64.c   | 19 +--
>>   makedumpfile.c |  3 ++-
>>   makedumpfile.h |  6 +++---
>>   3 files changed, 18 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/ppc64.c b/arch/ppc64.c
>> index 96c357cb0335..bb62e2cd199a 100644
>> --- a/arch/ppc64.c
>> +++ b/arch/ppc64.c
>> @@ -568,7 +568,9 @@ get_machdep_info_ppc64(void)
>>   /*
>>    * Get vmalloc_start value from either vmap_area_list or vmlist.
>>    */
>> -    if ((SYMBOL(vmap_area_list) != NOT_FOUND_SYMBOL)
>> +    if (NUMBER(vmalloc_start) != NOT_FOUND_SYMBOL) {
>> +    vmalloc_start = NUMBER(vmalloc_start);
>> +    } else if ((SYMBOL(vmap_area_list) != NOT_FOUND_SYMBOL)
>>   && (OFFSET(vmap_area.va_start) != NOT_FOUND_STRUCTURE)
>>   && (OFFSET(vmap_area.list) != NOT_FOUND_STRUCTURE)) {
>>   if (!readmem(VADDR, SYMBOL(vmap_area_list) + 
>> OFFSET(list_head.next),
>> @@ -684,11 +686,16 @@ vaddr_to_paddr_ppc64(unsigned long vaddr)
>>   if ((SYMBOL(vmap_area_list) == NOT_FOUND_SYMBOL)
>>   || (OFFSET(vmap_area.va_start) == NOT_FOUND_STRUCTURE)
>>   || (OFFSET(vmap_area.list) == NOT_FOUND_STRUCTURE)) {
>> -    if ((SYMBOL(vmlist) == NOT_FOUND_SYMBOL)
>> -    || (OFFSET(vm_struct.addr) == NOT_FOUND_STRUCTURE)) {
>> -    ERRMSG("Can't get info for vmalloc translation.\n");
>> -    return NOT_PADDR;
>> -    }
>> +    /*
>> + * Don't depend on vmap_area_list/vmlist if vmalloc_start is 
>> set in
>> + * vmcoreinfo, in that case proceed without error
>> + */
>> +    if (NUMBER(vmalloc_start) == NOT_FOUND_NUMBER)
>> +    if ((SYMBOL(vmlist) == NOT_FOUND_SYMBOL)
>> +    || (OFFSET(vm_struct.addr) == NOT_FOUND_STRUCTURE)) {
>> +    ERRMSG("Can't get info for vmalloc translation.\n");
>> +    return NOT_PADDR;
>> +    }
>>   }
>>   return ppc64_vtop_level4(vaddr);
>> diff --git a/makedumpfile.c b/makedumpfile.c
>> index b004b93fecb7..b6c63fad15f3 100644
>> --- a/makedumpfile.c
>> +++ b/makedumpfile.c
>> @@ -2978,6 +2978,8 @@ read_vmcoreinfo(void)
>>   READ_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE", 
>> PAGE_OFFLINE_MAPCOUNT_VALUE);
>>   READ_NUMBER("phys_base", phys_base);
>>   READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
>> +
>> +    READ_NUMBER_UNSIGNED("VMALLOC_START", vmalloc_start);
>>   #ifdef __aarch64__
>>   READ_NUMBER("VA_BITS", VA_BITS);
>>   READ_NUMBER("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
>> @@ -2989,7 +2991,6 @@ read_vmcoreinfo(void)
>>   READ_NUMBER("VA_BITS", va_bits);
>>   READ_NUMBER_UNSIGNED("phys_ram_base", phys_ram_base);
>>   READ_NUMBER_UNSIGNED("PAGE_OFFSET", page_offset);
>> -    READ_NUMBER_UNSIGNED("VMALLOC_START", vmalloc_start);
>>   READ_NUMBER_UNSIGNED("VMALLOC_END", vmalloc_end);
>>   READ_NUMBER_UNSIGNED("VMEMMAP_START", vmemmap_start);
>>   READ_NUMBER_UNSIGNED("VMEMMAP_END", vmemmap_end);
>> diff --git a/makedumpfile.h 

Re: [PATCH] makedumpfile: ppc64: get vmalloc start address from vmcoreinfo

2024-02-27 Thread  
Hi Aditya,

thanks for the patch.

On 2024/02/24 4:03, Aditya Gupta wrote:
> Below error was noticed when running makedumpfile on linux-next kernel
> crash (linux-next tag next-20240121):
> 
>  ...
>  Checking for memory holes : [100.0 %] | readpage_elf: Attempt to read non-existent page at 0xc.
>  [ 17.551718] kdump.sh[404]: readmem: type_addr: 0, addr:c00c, size:16384
>  [ 17.551793] kdump.sh[404]: __exclude_unnecessary_pages: Can't read the buffer of struct page.
>  [ 17.551864] kdump.sh[404]: create_2nd_bitmap: Can't exclude unnecessary pages.
>  [ 17.562632] kdump.sh[404]: The kernel version is not supported.
>  [ 17.562708] kdump.sh[404]: The makedumpfile operation may be incomplete.
>  [ 17.562773] kdump.sh[404]: makedumpfile Failed.
>  [ 17.564335] kdump[406]: saving vmcore failed, _exitcode:1
> 
> Above error was due to 'vmap_area_list' and 'vmlist' symbols missing
> from the vmcore.
> 
> 'vmap_area_list' was removed in the linux kernel with below commit:
> 
>  commit 378eb24a0658dd922b29524e0ce35c6c43f56cba
>   mm/vmalloc: remove vmap_area_list
> 
> Subsequently the commit also introduced 'VMALLOC_START' in vmcoreinfo to
> get base address of vmalloc area, instead of depending on 'vmap_area_list'
> 
> Hence if 'VMALLOC_START' symbol is there in vmcoreinfo:
>1. Set vmalloc_start based on 'VMALLOC_START'
>2. Don't error if vmap_area_list/vmlist are not defined
> 
> Reported-by: Sachin Sant 
> Signed-off-by: Aditya Gupta 
> ---
>   arch/ppc64.c   | 19 +--
>   makedumpfile.c |  3 ++-
>   makedumpfile.h |  6 +++---
>   3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/ppc64.c b/arch/ppc64.c
> index 96c357cb0335..bb62e2cd199a 100644
> --- a/arch/ppc64.c
> +++ b/arch/ppc64.c
> @@ -568,7 +568,9 @@ get_machdep_info_ppc64(void)
>   /*
>* Get vmalloc_start value from either vmap_area_list or vmlist.
>*/
> - if ((SYMBOL(vmap_area_list) != NOT_FOUND_SYMBOL)
> + if (NUMBER(vmalloc_start) != NOT_FOUND_SYMBOL) {

I will change this NOT_FOUND_SYMBOL to NOT_FOUND_NUMBER when applying;
otherwise makedumpfile will fail for a dumpfile from a kernel without
the corresponding kernel patch, correct?

Apart from that, the patch looks good to me.  I will apply it, with the
kernel version added to the commit log, after the kernel patch gets merged.

Thanks,
Kazu
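
For readers following the review comment above, here is a minimal sketch of why the choice of sentinel matters. It assumes that a missing symbol is represented as 0 and a missing vmcoreinfo number as -1; both values are written out locally as assumptions and are not copied from makedumpfile.h. choose_vmalloc_start() and sentinel_demo() are hypothetical helpers, not functions in the tree.

```c
#include <assert.h>

/* Assumed sentinel values, for illustration only; check makedumpfile.h. */
#define SKETCH_NOT_FOUND_SYMBOL 0UL
#define SKETCH_NOT_FOUND_NUMBER ((unsigned long)-1)

/*
 * Hypothetical helper: return the VMALLOC_START number when vmcoreinfo
 * carried one, otherwise the value obtained by the legacy lookup
 * (vmap_area_list or vmlist). 'sentinel' is the "not found" marker the
 * caller compares against.
 */
static unsigned long
choose_vmalloc_start(unsigned long number, unsigned long fallback,
		     unsigned long sentinel)
{
	if (number != sentinel)
		return number;		/* treat the vmcoreinfo value as valid */
	return fallback;		/* older kernel: use the legacy lookup */
}

/* Small demonstration of the two comparisons; call it from main() or a test. */
static void sentinel_demo(void)
{
	const unsigned long absent = SKETCH_NOT_FOUND_NUMBER;
	const unsigned long legacy = 0x1000UL;	/* made-up fallback value */

	/* Matching sentinel: the absent number is rejected. */
	assert(choose_vmalloc_start(absent, legacy,
				    SKETCH_NOT_FOUND_NUMBER) == legacy);
	/* Mismatched sentinel: the absent marker is used as an address. */
	assert(choose_vmalloc_start(absent, legacy,
				    SKETCH_NOT_FOUND_SYMBOL) == absent);
}
```

Under these assumptions, comparing a number against the symbol sentinel lets the "absent" marker through as if it were a real vmalloc start address, which is the failure on unpatched kernels that the comment above asks about.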


> + vmalloc_start = NUMBER(vmalloc_start);
> + } else if ((SYMBOL(vmap_area_list) != NOT_FOUND_SYMBOL)
>   && (OFFSET(vmap_area.va_start) != NOT_FOUND_STRUCTURE)
>   && (OFFSET(vmap_area.list) != NOT_FOUND_STRUCTURE)) {
>   if (!readmem(VADDR, SYMBOL(vmap_area_list) + 
> OFFSET(list_head.next),
> @@ -684,11 +686,16 @@ vaddr_to_paddr_ppc64(unsigned long vaddr)
>   if ((SYMBOL(vmap_area_list) == NOT_FOUND_SYMBOL)
>   || (OFFSET(vmap_area.va_start) == NOT_FOUND_STRUCTURE)
>   || (OFFSET(vmap_area.list) == NOT_FOUND_STRUCTURE)) {
> - if ((SYMBOL(vmlist) == NOT_FOUND_SYMBOL)
> - || (OFFSET(vm_struct.addr) == NOT_FOUND_STRUCTURE)) {
> - ERRMSG("Can't get info for vmalloc translation.\n");
> - return NOT_PADDR;
> - }
> + /*
> +  * Don't depend on vmap_area_list/vmlist if vmalloc_start is 
> set in
> +  * vmcoreinfo, in that case proceed without error
> +  */
> + if (NUMBER(vmalloc_start) == NOT_FOUND_NUMBER)
> + if ((SYMBOL(vmlist) == NOT_FOUND_SYMBOL)
> + || (OFFSET(vm_struct.addr) == 
> NOT_FOUND_STRUCTURE)) {
> + ERRMSG("Can't get info for vmalloc 
> translation.\n");
> + return NOT_PADDR;
> + }
>   }
>   
>   return ppc64_vtop_level4(vaddr);
> diff --git a/makedumpfile.c b/makedumpfile.c
> index b004b93fecb7..b6c63fad15f3 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -2978,6 +2978,8 @@ read_vmcoreinfo(void)
>   READ_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE", PAGE_OFFLINE_MAPCOUNT_VALUE);
>   READ_NUMBER("phys_base", phys_base);
>   READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
> +
> + READ_NUMBER_UNSIGNED("VMALLOC_START", vmalloc_start);
>   #ifdef __aarch64__
>   READ_NUMBER("VA_BITS", VA_BITS);
>   READ_NUMBER("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
> @@ -2989,7 +2991,6 @@ read_vmcoreinfo(void)
>   READ_NUMBER("VA_BITS", va_bits);
>   READ_NUMBER_UNSIGNED("phys_ram_base", phys_ram_base);
>   READ_NUMBER_UNSIGNED("PAGE_OFFSET", page_offset);
> - READ_NUMBER_UNSIGNED("VMALLOC_START", vmalloc_start);
>   READ_NUMBER_UNSIGNED("VMALLOC_END", vmalloc_end);
>   READ_NUMBER_UNSIGNED("VMEMMAP_START", vmemmap_start);
>   READ_NUMBER_UNSIGNED("VMEMMAP_END", vmemmap_end);
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 59c83e1d9df3..4021c5af2a34 

Re: [PATCH] makedumpfile: ppc64: read cur_mmu_type from vmcoreinfo

2024-02-27 Thread  
On 2024/02/23 17:39, Aditya Gupta wrote:
> Currently makedumpfile depends on reading the 'cur_cpu_spec' kernel
> symbol to get the current MMU type on PowerPC64.
> 
> The disadvantage with this approach was that it depends on bit '0x40'
> ('MMU_FTR_TYPE_RADIX') being set in 'cur_cpu_spec->mmu_features',
> which implies kernel developers have to be careful of modifying
> MMU_FTR_* defines
> 
> Instead a more stable approach was suggested by contributors in
> https://lore.kernel.org/linuxppc-dev/87v8c3m70t.fsf@mail.lhotse/, to
> pass information about the MMU type in vmcoreinfo itself, instead of
> depending on the MMU_FTR_* defines
> 
> This was implemented in linux kernel in:
>  commit 36e826b568e4 ("powerpc/vmcore: Add MMU information to vmcoreinfo")
> 
> With this commit, if RADIX_MMU is there in the vmcoreinfo, we prefer it
> to get current mmu type, instead of 'cur_cpu_spec'.
> On older kernels, where RADIX_MMU number is not there, makedumpfile will
> simply fall back to using 'cur_cpu_spec'.
> 
> The earlier defines for 'RADIX_MMU' have been renamed to 'MMU_TYPE_RADIX'
> which avoids conflict with the vmcoreinfo string 'RADIX_MMU', as well as
> being more clear about the value 0x40 with a comment about MMU_FTR_TYPE_RADIX
> 
> Signed-off-by: Aditya Gupta 

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/71ac00c8a3464608ac19f7c89d7220073d7374a9

Kazu
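
The order of checks described in the commit message (prefer the RADIX_MMU number from vmcoreinfo, fall back to cur_cpu_spec on older kernels) can be sketched as below. This is a simplified illustration, not the applied code: detect_mmu_type() is a hypothetical helper, the "absent" marker is an assumed value, and the readmem() calls are replaced by plain parameters.

```c
#include <assert.h>

#define MMU_TYPE_STD	0x0
#define MMU_TYPE_RADIX	0x40	/* same bit as MMU_FTR_TYPE_RADIX, per the patch */

/* Assumed "number absent" marker; illustrative, not copied from makedumpfile.h. */
#define SKETCH_NOT_FOUND_NUMBER (-1L)

/*
 * Hypothetical helper mirroring the order of checks in the commit message.
 *   radix_mmu:     RADIX_MMU number from vmcoreinfo, or the absent marker
 *   have_cpu_spec: non-zero if cur_cpu_spec->mmu_features could be read
 *   mmu_features:  the value read from the dump (ignored otherwise)
 */
static int
detect_mmu_type(long radix_mmu, int have_cpu_spec, unsigned long mmu_features)
{
	/* The sketch only expects the marker or a non-negative number. */
	assert(radix_mmu == SKETCH_NOT_FOUND_NUMBER || radix_mmu >= 0);

	if (radix_mmu != SKETCH_NOT_FOUND_NUMBER)
		return (radix_mmu == 1) ? MMU_TYPE_RADIX : MMU_TYPE_STD;

	if (have_cpu_spec)
		return (int)(mmu_features & MMU_TYPE_RADIX);

	return MMU_TYPE_STD;
}
```

Note that when the vmcoreinfo number is present it wins even if mmu_features would say otherwise, so the result no longer depends on the MMU_FTR_* bit layout.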

> ---
>   arch/ppc64.c   | 15 ++-
>   makedumpfile.c |  1 +
>   makedumpfile.h |  9 ++---
>   3 files changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/ppc64.c b/arch/ppc64.c
> index 96c357cb0335..3b4f91981f71 100644
> --- a/arch/ppc64.c
> +++ b/arch/ppc64.c
> @@ -250,7 +250,7 @@ ppc64_vmalloc_init(void)
>   /*
>* 64K pagesize
>*/
> - if (info->cur_mmu_type & RADIX_MMU) {
> + if (info->cur_mmu_type & MMU_TYPE_RADIX) {
>   info->l1_index_size = PTE_INDEX_SIZE_RADIX_64K;
>   info->l2_index_size = PMD_INDEX_SIZE_RADIX_64K;
>   info->l3_index_size = PUD_INDEX_SIZE_RADIX_64K;
> @@ -300,7 +300,7 @@ ppc64_vmalloc_init(void)
>   /*
>* 4K pagesize
>*/
> - if (info->cur_mmu_type & RADIX_MMU) {
> + if (info->cur_mmu_type & MMU_TYPE_RADIX) {
>   info->l1_index_size = PTE_INDEX_SIZE_RADIX_4K;
>   info->l2_index_size = PMD_INDEX_SIZE_RADIX_4K;
>   info->l3_index_size = PUD_INDEX_SIZE_RADIX_4K;
> @@ -635,14 +635,19 @@ get_versiondep_info_ppc64()
>* On PowerISA 3.0 based server processors, a kernel can run with
>* radix MMU or standard MMU. Get the current MMU type.
>*/
> - info->cur_mmu_type = STD_MMU;
> - if ((SYMBOL(cur_cpu_spec) != NOT_FOUND_SYMBOL)
> + info->cur_mmu_type = MMU_TYPE_STD;
> +
> + if (NUMBER(RADIX_MMU) != NOT_FOUND_SYMBOL) {
> + if (NUMBER(RADIX_MMU) == 1) {
> + info->cur_mmu_type = MMU_TYPE_RADIX;
> + }
> + } else if ((SYMBOL(cur_cpu_spec) != NOT_FOUND_SYMBOL)
>   && (OFFSET(cpu_spec.mmu_features) != NOT_FOUND_STRUCTURE)) {
>   if (readmem(VADDR, SYMBOL(cur_cpu_spec), &cur_cpu_spec,
>   sizeof(cur_cpu_spec))) {
>   if (readmem(VADDR, cur_cpu_spec + 
> OFFSET(cpu_spec.mmu_features),
>   &mmu_features, sizeof(mmu_features)))
> - info->cur_mmu_type = mmu_features & RADIX_MMU;
> + info->cur_mmu_type = mmu_features & 
> MMU_TYPE_RADIX;
>   }
>   }
>   
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 3705bdd93deb..1bd7305f49ca 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -2987,6 +2987,7 @@ read_vmcoreinfo(void)
>   #endif
>   
>   READ_NUMBER("HUGETLB_PAGE_DTOR", HUGETLB_PAGE_DTOR);
> + READ_NUMBER("RADIX_MMU", RADIX_MMU);
>   
>   return TRUE;
>   }
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 3ed3ba551d96..a7b344974636 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -747,12 +747,13 @@ unsigned long get_kvbase_arm64(void);
>   /*
>* Supported MMU types
>*/
> -#define STD_MMU 0x0
> +#define MMU_TYPE_STD 0x0
>   /*
>* The flag bit for radix MMU in cpu_spec.mmu_features
> - * in the kernel. Use the same flag here.
> + * in the kernel (MMU_FTR_TYPE_RADIX).
> + * Use the same flag here.
>*/
> -#define RADIX_MMU   0x40
> +#define MMU_TYPE_RADIX   0x40
>   
>   
>   #define PGD_MASK_L4 \
> @@ -2258,6 +2259,8 @@ struct number_table {
>   unsigned long kernel_link_addr;
>   unsigned long va_kernel_pa_offset;
>   #endif
> +
> + unsigned long RADIX_MMU;
>   };
>   
>   struct srcfile_table {
___
kexec mailing list
kexec@lists.infradead.org

Re: [PATCH] makedumpfile add dmesg PRINTK_CALLER id support

2024-01-23 Thread  
On 2024/01/11 15:55, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2023/12/29 5:33, Edward Chron wrote:
>> Submission to Project: makedumpfile
>> Component: dmesg
>> Files: printk.c makedumpfile.c makedumpfile.h
>> Code level patch applied against: 1.7.4++ - latest code pulled from
>>   https://github.com/makedumpfile/makedumpfile
>> makedumpfile Issue #13
>>   https://github.com/makedumpfile/makedumpfile/issues/13
>> Project Owner: Kazuhito Hagio 
>> Revision: #1 on 2023/12/15 per Kazu ensure spacing of dmesg output
>>  matches dmesg -S and new dmesg caller_id
>>  output format (space after timestamp)
>> Revision: #2 on 2023/12/21 per Kazu use NOT_FOUND_STRUCTURE for caller_id
>> Revision: #3 on 2023/12/21 streamline code for printing caller_id
>> Revision: #4 on 2023/12/24 per Kazu make code consistent, drop the
>>  CALLER_ID_SIZE not needed.
> 
> Thanks for the revision.  Looks good, will watch the util-linux status.

Thanks for the information that dmesg got the caller_id support.

Applied with a few tweaks.
https://github.com/makedumpfile/makedumpfile/commit/48bb1e089056

Thanks,
Kazu


Re: [PATCH] makedumpfile add dmesg PRINTK_CALLER id support

2024-01-10 Thread  
On 2023/12/29 5:33, Edward Chron wrote:
> Submission to Project: makedumpfile
> Component: dmesg
> Files: printk.c makedumpfile.c makedumpfile.h
> Code level patch applied against: 1.7.4++ - latest code pulled from
>  https://github.com/makedumpfile/makedumpfile
> makedumpfile Issue #13
>  https://github.com/makedumpfile/makedumpfile/issues/13
> Project Owner: Kazuhito Hagio 
> Revision: #1 on 2023/12/15 per Kazu ensure spacing of dmesg output
> matches dmesg -S and new dmesg caller_id
> output format (space after timestamp)
> Revision: #2 on 2023/12/21 per Kazu use NOT_FOUND_STRUCTURE for caller_id
> Revision: #3 on 2023/12/21 streamline code for printing caller_id
> Revision: #4 on 2023/12/24 per Kazu make code consistent, drop the
> CALLER_ID_SIZE not needed.

Thanks for the revision.  Looks good, will watch the util-linux status.

Thanks,
Kazu

> 
> Add support so that dmesg entries include the optional Linux Kernel
> debug CONFIG option PRINTK_CALLER which adds an optional dmesg field
> that contains the Thread Id or CPU Id that is issuing the printk to
> add the message to the kernel ring buffer. If enabled, this CONFIG
> option makes debugging simpler as dmesg entries for a specific
> thread or CPU can be recognized.
> 
> The dmesg command supports printing the PRINTK_CALLER field. The
> old syslog format (dmesg -S) and recently support was added for dmesg
> using /dev/kmsg interface.
> 
> The additional field provided by PRINTK_CALLER is only present
> if it was configured for the Linux kernel on the running system. The
> PRINTK_CALLER is a debug option and not configured by default so the
> dmesg output will only change for those kernels where the option was
> configured when the kernel was built. For users who went to the
> trouble to configure PRINTK_CALLER and have the extra field available
> for debugging, having dmesg print the field is very helpful and so
> it would be very useful to add makedumpfile support for it.
> 
> Size of the PRINTK_CALLER field is determined by the maximum number
> tasks that can be run on the system which is limited by the value of
> /proc/sys/kernel/pid_max as pid values are from 0 to value - 1.
> This value determines the number of id digits needed by the caller id.
> The PRINTK_CALLER field is printed as T<id> for a Task Id or C<id>
> for a CPU Id for a printk in CPU context. The values are left space
> padded and enclosed in parentheses such as:
>   [T123]   or   [ C16]
> For dmesg command the PRINTK_CALLER field when present is the last
> field before the dmesg text so it makes sense to use the same format.
> 
> Signed-off-by: Ivan Delalande 
> Signed-off-by: Edward Chron 
> ---
>   makedumpfile.c | 25 +
>   makedumpfile.h |  6 ++
>   printk.c   | 18 ++
>   3 files changed, 49 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a6ec9d4..48ca11f 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -28,6 +28,10 @@
>   #include 
>   #include 
>   
> +#define PID_CHARS_MAX 16 /* Max Number of PID characters */
> +#define PID_CHARS_DEFAULT 8  /* Default number of PID characters */
> +
> +
>   struct symbol_table symbol_table;
>   struct size_table   size_table;
>   struct offset_table offset_table;
> @@ -2118,6 +2122,7 @@ module_end:
>   SIZE_INIT(printk_info, "printk_info");
>   OFFSET_INIT(printk_info.ts_nsec, "printk_info", "ts_nsec");
>   OFFSET_INIT(printk_info.text_len, "printk_info", "text_len");
> + OFFSET_INIT(printk_info.caller_id, "printk_info", "caller_id");
>   
>   OFFSET_INIT(atomic_long_t.counter, "atomic_long_t", "counter");
>   
> @@ -2133,6 +2138,7 @@ module_end:
>   OFFSET_INIT(printk_log.ts_nsec, "printk_log", "ts_nsec");
>   OFFSET_INIT(printk_log.len, "printk_log", "len");
>   OFFSET_INIT(printk_log.text_len, "printk_log", "text_len");
> + OFFSET_INIT(printk_log.caller_id, "printk_log", "caller_id");
>   } else {
>   info->flag_use_printk_ringbuffer = FALSE;
>   info->flag_use_printk_log = FALSE;
> @@ -2462,6 +2468,7 @@ write_vmcoreinfo_data(void)
>   
>   WRITE_MEMBER_OFFSET("printk_info.ts_nsec", printk_info.ts_nsec);
>   WRITE_MEMBER_OFFSET("printk_info.text_len", 
> printk_info.text_len);
> + WRITE_MEMBER_OFFSET("printk_info.caller_id", 
> printk_info.caller_id);
>   
>   WRITE_MEMBER_OFFSET("atomic_long_t.counter", 
> atomic_long_t.counter);
>   
> @@ -2470,6 +2477,7 @@ write_vmcoreinfo_data(void)
>   WRITE_MEMBER_OFFSET("printk_log.ts_nsec", printk_log.ts_nsec);
>   WRITE_MEMBER_OFFSET("printk_log.len", printk_log.len);
>   WRITE_MEMBER_OFFSET("printk_log.text_len", printk_log.text_len);
> + WRITE_MEMBER_OFFSET("printk_log.caller_id", 

Re: [PATCH] makedumpfile add dmesg PRINTK_CALLER id support

2023-12-24 Thread  
On 2023/12/22 8:46, Edward Chron wrote:
> Submission to Project: makedumpfile
> Component: dmesg
> Files: printk.c makedumpfile.c makedumpfile.h
> Code level patch applied against: 1.7.4++ - latest code pulled from
>  https://github.com/makedumpfile/makedumpfile
> makedumpfile Issue #13
>  https://github.com/makedumpfile/makedumpfile/issues/13
> Project Owner: Kazuhito Hagio 
> Revision: #1 on 2023/12/15 per Kazu ensure spacing of dmesg output
> matches dmesg -S and new dmesg caller_id
> output format (space after timestamp)
> Revision: #2 on 2023/12/21 per Kazu use NOT_FOUND_STRUCTURE for caller_id
> Revision: #3 on 2023/12/21 streamline code for printing caller_id
> 
> Add support so that dmesg entries include the optional Linux Kernel
> debug CONFIG option PRINTK_CALLER which adds an optional dmesg field
> that contains the Thread Id or CPU Id that is issuing the printk to
> add the message to the kernel ring buffer. If enabled, this CONFIG
> option makes debugging simpler as dmesg entries for a specific
> thread or CPU can be recognized.
> 
> The dmesg command supports printing the PRINTK_CALLER field. The
> old syslog format (dmesg -S) and recently support was added for dmesg
> using /dev/kmsg interface.
> 
> The additional field provided by PRINTK_CALLER is only present
> if it was configured for the Linux kernel on the running system. The
> PRINTK_CALLER is a debug option and not configured by default so the
> dmesg output will only change for those kernels where the option was
> configured when the kernel was built. For users who went to the
> trouble to configure PRINTK_CALLER and have the extra field available
> for debugging, having dmesg print the field is very helpful and so
> it would be very useful to add makedumpfile support for it.
> 
> Size of the PRINTK_CALLER field is determined by the maximum number
> tasks that can be run on the system which is limited by the value of
> /proc/sys/kernel/pid_max as pid values are from 0 to value - 1.
> This value determines the number of id digits needed by the caller id.
> The PRINTK_CALLER field is printed as T<id> for a Task Id or C<id>
> for a CPU Id for a printk in CPU context. The values are left space
> padded and enclosed in parentheses such as:
>   [T123]   or   [ C16]
> For dmesg command the PRINTK_CALLER field when present is the last
> field before the dmesg text so it makes sense to use the same format.
> 
> Signed-off-by: Ivan Delalande 
> Signed-off-by: Edward Chron 

Thank you for the update.

It looks like a few fixes are needed, but I can fix them when merging,
so there is no need to update.  And I will apply this when the upstream
dmesg gets the PRINTK_CALLER support.


BTW, I tried the dmesg of this version [1].  In my environment its caller
field width is 7, which differs from makedumpfile's 8.  Is this intended?

# cat /proc/sys/kernel/pid_max
4194304
# ./dmesg

[ 7971.766029] [T109202] XFS (sda2): Ending clean mount
[ 8700.508829] [T1053218] XFS (sda2): Unmounting Filesystem ...

# cat dmesg-makedumpfile

[ 7971.766029] [ T109202] XFS (sda2): Ending clean mount
[ 8700.508829] [T1053218] XFS (sda2): Unmounting Filesystem ...

Personally I would prefer 8 for dmesg as well, because recent kernels
allow 7 digits (about 4 million) for PID_MAX_LIMIT [2], plus one
character for "T" or "C".

[1] https://github.com/util-linux/util-linux/pull/2647/commits/7e485b720638
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/threads.h#n34
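
The width arithmetic discussed above (7 digits for the largest PID plus one character for "T" or "C") can be sketched as follows. The helper names are hypothetical and this is not the code from either makedumpfile or util-linux; it only reproduces the padding seen in the two outputs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print n (n == 0 needs one digit). */
static int decimal_digits(unsigned long n)
{
	int digits = 1;

	while (n >= 10) {
		n /= 10;
		digits++;
	}
	return digits;
}

/* Field width: digits of the largest PID (pid_max - 1) plus one for T or C. */
static int caller_field_width(unsigned long pid_max)
{
	return decimal_digits(pid_max ? pid_max - 1 : 0) + 1;
}

/* Write "[<padding><type><id>]" right-aligned in 'width' columns; returns out. */
static char *
format_caller(char *out, size_t outlen, int width, char type, unsigned int id)
{
	char idbuf[16];	/* one type character, up to 10 digits, and the NUL */

	assert(out != NULL && outlen > 0);
	snprintf(idbuf, sizeof(idbuf), "%c%u", type, id);
	snprintf(out, outlen, "[%*s]", width, idbuf);
	return out;
}
```

With pid_max 4194304 the computed width is 8, which gives "[ T109202]" and "[T1053218]" as in the makedumpfile output; a width of 7 gives "[T109202]" and lets the longer id overflow the field, as in the dmesg output.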


The following comments are just my memo to be fixed when merging.

> ---
>   makedumpfile.c | 25 +
>   makedumpfile.h | 22 ++
>   printk.c   | 18 ++
>   3 files changed, 65 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a6ec9d4..eaa8c4d 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -28,6 +28,10 @@
>   #include 
>   #include 
>   
> +#define PID_CHARS_MAX 16 /* Max Number of PID characters */
> +#define PID_CHARS_DEFAULT 8  /* Default number of PID characters */
> +
> +
>   struct symbol_table symbol_table;
>   struct size_table   size_table;
>   struct offset_table offset_table;
> @@ -2118,6 +2122,7 @@ module_end:
>   SIZE_INIT(printk_info, "printk_info");
>   OFFSET_INIT(printk_info.ts_nsec, "printk_info", "ts_nsec");
>   OFFSET_INIT(printk_info.text_len, "printk_info", "text_len");
> + OFFSET_INIT(printk_info.caller_id, "printk_info", "caller_id");
>   
>   OFFSET_INIT(atomic_long_t.counter, "atomic_long_t", "counter");
>   
> @@ -2133,6 +2138,7 @@ module_end:
>   OFFSET_INIT(printk_log.ts_nsec, "printk_log", "ts_nsec");
>   OFFSET_INIT(printk_log.len, "printk_log", "len");
>   OFFSET_INIT(printk_log.text_len, "printk_log", "text_len");
> + OFFSET_INIT(printk_info.caller_id, "printk_log", "caller_id");

This should be 

Re: [PATCH] makedumpfile add dmesg PRINTK_CALLER id support

2023-12-20 Thread  
On 2023/12/14 17:02, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2023/12/13 19:13, Edward Chron wrote:
>> Submission to Project: makedumpfile
>> Component: dmesg
>> Files: printk.c makedumpfile.c makedumpfile.h
>> Code level patch applied against: 1.7.4++ - latest code pulled from
>>   https://github.com/makedumpfile/makedumpfile
>> makedumpfile Issue #13
>>   https://github.com/makedumpfile/makedumpfile/issues/13
>>
>> Add support so that dmesg entries include the optional Linux Kernel
>> debug CONFIG option PRINTK_CALLER which adds an optional dmesg field
>> that contains the Thread Id or CPU Id that is issuing the printk to
>> add the message to the kernel ring buffer. If enabled, this CONFIG
>> option makes debugging simpler as dmesg entries for a specific
>> thread or CPU can be recognized.
>>
>> The dmesg command supports printing the PRINTK_CALLER field. The
>> old syslog format (dmesg -S) and recently support was added for dmesg
>> using /dev/kmsg interface.
> 
> Hi Edward, thanks for the patch.
> 
> How can I try the dmesg using /dev/kmsg with the caller support?

ok, I found this:
https://github.com/util-linux/util-linux/pull/2647

and thank you for the test output files:
https://github.com/makedumpfile/makedumpfile/issues/13

According to the output files, you may have already fixed these, but I
have some comments below.

> I tried the latest one [1] and "dmesg -S" prints them but "dmesg" doesn't.
> 
> # ./dmesg -S
> [11894.954745] [T59093] XFS (sdb2): Unmounting Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.572717] [T59101] XFS (sdb2): Mounting V5 Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.647821] [T59101] XFS (sdb2): Ending clean mount
> # ./dmesg
> [11894.954745] XFS (sdb2): Unmounting Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.572717] XFS (sdb2): Mounting V5 Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.647821] XFS (sdb2): Ending clean mount
> 
> I would like to make the log format consistent with "dmesg" output.
> 
> # ./makedumpfile --dump-dmesg /proc/kcore a
> 
> The dmesg log is saved to a.
> 
> makedumpfile Completed.
> [root@t110h ~]# tail -n 3 a
> [11894.954745][  T59093] XFS (sdb2): Unmounting Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.572717][  T59101] XFS (sdb2): Mounting V5 Filesystem 
> a41e38db-e035-4995-8276-763f499a33df
> [11897.647821][  T59101] XFS (sdb2): Ending clean mount
> 
> [1] https://github.com/util-linux/util-linux  (top: be59729281c6)
> 
> Thanks,
> Kazu
> 
>>
>> The additional field provided by PRINTK_CALLER is only present
>> if it was configured for the Linux kernel on the running system. The
>> PRINTK_CALLER is a debug option and not configured by default so the
>> dmesg output will only change for those kernels where the option was
>> configured when the kernel was built. For users who went to the
>> trouble to configure PRINTK_CALLER and have the extra field available
>> for debugging, having dmesg print the field is very helpful and so
>> it would be very useful to add makedumpfile support for it.
>>
>> Size of the PRINTK_CALLER field is determined by the maximum number
>> tasks that can be run on the system which is limited by the value of
>> /proc/sys/kernel/pid_max as pid values are from 0 to value - 1.
>> This value determines the number of id digits needed by the caller id.
>> The PRINTK_CALLER field is printed as T<id> for a Task Id or C<id>
>> for a CPU Id for a printk in CPU context. The values are left space
>> padded and enclosed in parentheses such as:
>>[T123]   or   [ C16]
>> For dmesg command the PRINTK_CALLER field when present is the last
>> field before the dmesg text so it makes sense to use the same format.
>>
>> Signed-off-by: Ivan Delalande 
>> Signed-off-by: Edward Chron 
>> ---
>>makedumpfile.c | 35 ++-
>>makedumpfile.h | 22 ++
>>printk.c   | 29 -
>>3 files changed, 84 insertions(+), 2 deletions(-)
>>
>> diff --git a/makedumpfile.c b/makedumpfile.c
>> index a6ec9d4..5172ee2 100644
>> --- a/makedumpfile.c
>> +++ b/makedumpfile.c
>> @@ -2118,6 +2118,7 @@ module_end:
>>  SIZE_INIT(printk_info, "printk_info");
>>  OFFSET_INIT(printk_info.ts_nsec, "printk_info", "ts_nsec");
>>  OFFSET_INIT(printk_info.text_len, "printk_info", "text_len");
>> +OFFSET_INIT(printk_info.caller_id, "printk_info"

Re: [PATCH] makedumpfile add dmesg PRINTK_CALLER id support

2023-12-14 Thread  
On 2023/12/13 19:13, Edward Chron wrote:
> Submission to Project: makedumpfile
> Component: dmesg
> Files: printk.c makedumpfile.c makedumpfile.h
> Code level patch applied against: 1.7.4++ - latest code pulled from
>  https://github.com/makedumpfile/makedumpfile
> makedumpfile Issue #13
>  https://github.com/makedumpfile/makedumpfile/issues/13
> 
> Add support so that dmesg entries include the optional Linux Kernel
> debug CONFIG option PRINTK_CALLER which adds an optional dmesg field
> that contains the Thread Id or CPU Id that is issuing the printk to
> add the message to the kernel ring buffer. If enabled, this CONFIG
> option makes debugging simpler as dmesg entries for a specific
> thread or CPU can be recognized.
> 
> The dmesg command supports printing the PRINTK_CALLER field. The
> old syslog format (dmesg -S) and recently support was added for dmesg
> using /dev/kmsg interface.

Hi Edward, thanks for the patch.

How can I try the dmesg using /dev/kmsg with the caller support?
I tried the latest one [1] and "dmesg -S" prints them but "dmesg" doesn't.

# ./dmesg -S
[11894.954745] [T59093] XFS (sdb2): Unmounting Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.572717] [T59101] XFS (sdb2): Mounting V5 Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.647821] [T59101] XFS (sdb2): Ending clean mount
# ./dmesg
[11894.954745] XFS (sdb2): Unmounting Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.572717] XFS (sdb2): Mounting V5 Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.647821] XFS (sdb2): Ending clean mount

I would like to make the log format consistent with "dmesg" output.

# ./makedumpfile --dump-dmesg /proc/kcore a

The dmesg log is saved to a.

makedumpfile Completed.
[root@t110h ~]# tail -n 3 a
[11894.954745][  T59093] XFS (sdb2): Unmounting Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.572717][  T59101] XFS (sdb2): Mounting V5 Filesystem 
a41e38db-e035-4995-8276-763f499a33df
[11897.647821][  T59101] XFS (sdb2): Ending clean mount

[1] https://github.com/util-linux/util-linux  (top: be59729281c6)
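
To make the layout difference above concrete, here is a minimal sketch that builds one log line in the "dmesg -S" style, with a space between the timestamp and the caller field. It assumes the kernel marks CPU-context callers by setting the top bit of caller_id; format_dmesg_line() and CALLER_CPU_FLAG are hypothetical names, and the format string is an illustration, not the one used in printk.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the top bit of caller_id marks a CPU id rather than a task id. */
#define CALLER_CPU_FLAG 0x80000000u

/*
 * Build "[sec.usec] [<caller>] <text>" into out and return out.
 * 'width' is the minimum width of the caller field inside the brackets.
 */
static char *
format_dmesg_line(char *out, size_t outlen, unsigned long long ts_nsec,
		  unsigned int caller_id, int width, const char *text)
{
	char idbuf[16];
	unsigned long long sec = ts_nsec / 1000000000ULL;
	unsigned long long usec = (ts_nsec % 1000000000ULL) / 1000ULL;

	assert(out != NULL && outlen > 0 && text != NULL);

	/* Decode the caller: 'C' plus CPU number, or 'T' plus task id. */
	snprintf(idbuf, sizeof(idbuf), "%c%u",
		 (caller_id & CALLER_CPU_FLAG) ? 'C' : 'T',
		 caller_id & ~CALLER_CPU_FLAG);

	/* Note the space after the timestamp, matching "dmesg -S". */
	snprintf(out, outlen, "[%5llu.%06llu] [%*s] %s",
		 sec, usec, width, idbuf, text);
	return out;
}
```

Dropping the space in the format string and using a wider field reproduces the "[11894.954745][  T59093]" style shown in the makedumpfile output above.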

Thanks,
Kazu

> 
> The additional field provided by PRINTK_CALLER is only present
> if it was configured for the Linux kernel on the running system. The
> PRINTK_CALLER is a debug option and not configured by default so the
> dmesg output will only change for those kernels where the option was
> configured when the kernel was built. For users who went to the
> trouble to configure PRINTK_CALLER and have the extra field available
> for debugging, having dmesg print the field is very helpful and so
> it would be very useful to add makedumpfile support for it.
> 
> Size of the PRINTK_CALLER field is determined by the maximum number
> tasks that can be run on the system which is limited by the value of
> /proc/sys/kernel/pid_max as pid values are from 0 to value - 1.
> This value determines the number of id digits needed by the caller id.
> The PRINTK_CALLER field is printed as T<id> for a Task Id or C<id>
> for a CPU Id for a printk in CPU context. The values are left space
> padded and enclosed in parentheses such as:
>   [T123]   or   [ C16]
> For dmesg command the PRINTK_CALLER field when present is the last
> field before the dmesg text so it makes sense to use the same format.
> 
> Signed-off-by: Ivan Delalande 
> Signed-off-by: Edward Chron 
> ---
>   makedumpfile.c | 35 ++-
>   makedumpfile.h | 22 ++
>   printk.c   | 29 -
>   3 files changed, 84 insertions(+), 2 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a6ec9d4..5172ee2 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -2118,6 +2118,7 @@ module_end:
>   SIZE_INIT(printk_info, "printk_info");
>   OFFSET_INIT(printk_info.ts_nsec, "printk_info", "ts_nsec");
>   OFFSET_INIT(printk_info.text_len, "printk_info", "text_len");
> + OFFSET_INIT(printk_info.caller_id, "printk_info", "caller_id");
>   
>   OFFSET_INIT(atomic_long_t.counter, "atomic_long_t", "counter");
>   
> @@ -2133,6 +2134,7 @@ module_end:
>   OFFSET_INIT(printk_log.ts_nsec, "printk_log", "ts_nsec");
>   OFFSET_INIT(printk_log.len, "printk_log", "len");
>   OFFSET_INIT(printk_log.text_len, "printk_log", "text_len");
> + OFFSET_INIT(printk_info.caller_id, "printk_log", "caller_id");
>   } else {
>   info->flag_use_printk_ringbuffer = FALSE;
>   info->flag_use_printk_log = FALSE;
> @@ -2462,6 +2464,7 @@ write_vmcoreinfo_data(void)
>   
>   WRITE_MEMBER_OFFSET("printk_info.ts_nsec", printk_info.ts_nsec);
>   WRITE_MEMBER_OFFSET("printk_info.text_len", 
> printk_info.text_len);
> + WRITE_MEMBER_OFFSET("printk_info.caller_id", 
> printk_info.caller_id);
>   
>   WRITE_MEMBER_OFFSET("atomic_long_t.counter", 
> atomic_long_t.counter);
>   
> 

Re: [PATCH v2 2/2] s390x: uncouple virtual and physical address spaces

2023-12-05 Thread  
On 2023/12/06 0:01, Alexander Gordeev wrote:
> Rework vaddr_to_paddr() and paddr_to_vaddr() macros to reflect
> the future uncoupling of physical and virtual address spaces in
> kernel. Existing versions are not affected.
> 
> Signed-off-by: Alexander Gordeev 

Thank you for the v2, applied.

Kazu

> ---
>   arch/s390x.c   | 134 -
>   makedumpfile.h |  11 +++-
>   2 files changed, 142 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390x.c b/arch/s390x.c
> index a01f164..4a993be 100644
> --- a/arch/s390x.c
> +++ b/arch/s390x.c
> @@ -59,6 +59,69 @@
>   #define rsg_offset(x, y)(rsg_index( x, y) * sizeof(unsigned long))
>   #define pte_offset(x)   (pte_index(x) * sizeof(unsigned long))
>   
> +#define LOWCORE_SIZE 0x2000
> +
> +#define OS_INFO_VERSION_MAJOR    1
> +#define OS_INFO_VERSION_MINOR    1
> +
> +#define OS_INFO_VMCOREINFO       0
> +#define OS_INFO_REIPL_BLOCK      1
> +#define OS_INFO_FLAGS_ENTRY      2
> +#define OS_INFO_RESERVED         3
> +#define OS_INFO_IDENTITY_BASE    4
> +#define OS_INFO_KASLR_OFFSET     5
> +#define OS_INFO_KASLR_OFF_PHYS   6
> +#define OS_INFO_VMEMMAP          7
> +#define OS_INFO_AMODE31_START    8
> +#define OS_INFO_AMODE31_END      9
> +
> +struct os_info_entry {
> + union {
> + uint64_t addr;
> + uint64_t val;
> + };
> + uint64_t size;
> + uint32_t csum;
> +} __attribute__((packed));
> +
> +struct os_info {
> + uint64_t magic;
> + uint32_t csum;
> + uint16_t version_major;
> + uint16_t version_minor;
> + uint64_t crashkernel_addr;
> + uint64_t crashkernel_size;
> + struct os_info_entry entry[10];
> + uint8_t reserved[3864];
> +} __attribute__((packed));
> +
> +#define S390X_LC_OS_INFO 0x0e18
> +
> +struct s390_ops {
> + unsigned long long  (*virt_to_phys)(unsigned long vaddr);
> + unsigned long   (*phys_to_virt)(unsigned long long paddr);
> +};
> +
> +static unsigned long long vaddr_to_paddr_s390x_legacy(unsigned long vaddr);
> +static unsigned long long vaddr_to_paddr_s390x_vr(unsigned long vaddr);
> +static unsigned long paddr_to_vaddr_s390x_legacy(unsigned long long paddr);
> +static unsigned long paddr_to_vaddr_s390x_vr(unsigned long long paddr);
> +
> +struct s390_ops s390_ops = {
> + .virt_to_phys = vaddr_to_paddr_s390x_legacy,
> + .phys_to_virt = paddr_to_vaddr_s390x_legacy,
> +};
> +
> +unsigned long long vaddr_to_paddr_s390x(unsigned long vaddr)
> +{
> + return s390_ops.virt_to_phys(vaddr);
> +}
> +
> +unsigned long paddr_to_vaddr_s390x(unsigned long long paddr)
> +{
> + return s390_ops.phys_to_virt(paddr);
> +}
> +
>   int
>   set_s390x_max_physmem_bits(void)
>   {
> @@ -88,12 +151,53 @@ set_s390x_max_physmem_bits(void)
>   return FALSE;
>   }
>   
> +static int s390x_init_vm(void)
> +{
> + struct os_info os_info;
> + ulong addr;
> +
> + if (!readmem(PADDR, S390X_LC_OS_INFO, &addr,
> + sizeof(addr)) || !addr) {
> + ERRMSG("Can't get s390x os_info ptr.\n");
> + return FALSE;
> + }
> +
> + if (addr == 0)
> + return TRUE;
> +
> + if (!readmem(PADDR, addr, &os_info, offsetof(struct os_info, 
> reserved))) {
> + ERRMSG("Can't get os_info header.\n");
> + return FALSE;
> + }
> +
> + if (!os_info.entry[OS_INFO_KASLR_OFFSET].val)
> + return TRUE;
> +
> + MSG("The -vr kernel detected.\n");
> +
> + info->identity_map_base   = os_info.entry[OS_INFO_IDENTITY_BASE].val;
> + info->kvbase  = os_info.entry[OS_INFO_KASLR_OFFSET].val;
> + info->__kaslr_offset_phys = os_info.entry[OS_INFO_KASLR_OFF_PHYS].val;
> + info->vmemmap_start   = os_info.entry[OS_INFO_VMEMMAP].val;
> + info->amode31_start   = os_info.entry[OS_INFO_AMODE31_START].val;
> + info->amode31_end = os_info.entry[OS_INFO_AMODE31_END].val;
> +
> + s390_ops.virt_to_phys   = vaddr_to_paddr_s390x_vr;
> + s390_ops.phys_to_virt   = paddr_to_vaddr_s390x_vr;
> +
> + return TRUE;
> +}
> +
> +
>   int
>   get_machdep_info_s390x(void)
>   {
>   unsigned long vmalloc_start;
>   char *term_str = getenv("TERM");
>   
> + if (!s390x_init_vm())
> + return FALSE;
> +
>   if (term_str && strcmp(term_str, "dumb") == 0)
>   /* '\r' control character is ignored on "dumb" terminal. */
>   flag_ignore_r_char = 1;
> @@ -295,8 +399,8 @@ vtop_s390x(unsigned long vaddr)
>   return paddr;
>   }
>   
> -unsigned long long
> -vaddr_to_paddr_s390x(unsigned long vaddr)
> +static unsigned long long
> +vaddr_to_paddr_s390x_legacy(unsigned long vaddr)
>   {
>   unsigned long long paddr;
>   
> @@ -320,6 +424,32 @@ vaddr_to_paddr_s390x(unsigned long vaddr)
>   

Re: [PATCH 2/2] s390x: uncouple virtual and physical address spaces

2023-12-04 Thread  
On 2023/11/29 21:50, Alexander Gordeev wrote:
> Rework vaddr_to_paddr() and paddr_to_vaddr() macros to reflect
> the future uncoupling of physical and virtual address spaces in
> kernel. Existing versions are not affected.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>   arch/s390x.c   | 134 -
>   makedumpfile.c |   2 +
>   makedumpfile.h |  12 -
>   3 files changed, 145 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390x.c b/arch/s390x.c
> index a01f164..8ba2267 100644
> --- a/arch/s390x.c
> +++ b/arch/s390x.c
> @@ -59,6 +59,69 @@
>   #define rsg_offset(x, y) (rsg_index( x, y) * sizeof(unsigned long))
>   #define pte_offset(x)   (pte_index(x) * sizeof(unsigned long))
>   
> +#define LOWCORE_SIZE 0x2000
> +
> +#define OS_INFO_VERSION_MAJOR    1
> +#define OS_INFO_VERSION_MINOR    1
> +
> +#define OS_INFO_VMCOREINFO       0
> +#define OS_INFO_REIPL_BLOCK      1
> +#define OS_INFO_FLAGS_ENTRY      2
> +#define OS_INFO_RESERVED         3
> +#define OS_INFO_IDENTITY_BASE    4
> +#define OS_INFO_KASLR_OFFSET     5
> +#define OS_INFO_KASLR_OFF_PHYS   6
> +#define OS_INFO_VMEMMAP          7
> +#define OS_INFO_AMODE31_START    8
> +#define OS_INFO_AMODE31_END      9
> +
> +struct os_info_entry {
> + union {
> + __u64   addr;
> + __u64   val;
> + };
> + __u64   size;
> + __u32   csum;
> +} __attribute__((packed));
> +
> +struct os_info {
> + __u64   magic;
> + __u32   csum;
> + __u16   version_major;
> + __u16   version_minor;
> + __u64   crashkernel_addr;
> + __u64   crashkernel_size;
> + struct  os_info_entry entry[10];
> + __u8    reserved[3864];
> +} __attribute__((packed));

Are the __u* types defined on s390x?  At least in my test, make with 
TARGET=s390xx on x86_64 fails.

$ make LINKTYPE=dynamic TARGET=s390xx
...
cc  -g -O2 -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-D_LARGEFILE64_SOURCE -D__s390x__ -U__x86_64__ -c -o ./arch/s390x.o 
arch/s390x.c
arch/s390x.c:80:3: error: unknown type name '__u64'
__u64 addr;
^
arch/s390x.c:81:3: error: unknown type name '__u64'
__u64 val;
^
...
make: *** [Makefile:111: arch/s390x.o] Error 1

("s390xx" is strange but Makefile needs it somehow...)
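
One way to sidestep the missing __u* types is to spell the fields with the fixed-width types from <stdint.h>. This is a sketch under that assumption, not the fix that was eventually applied. The field names and array sizes are copied from the quoted patch; the compile-time size checks are added only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the quoted patch, but using <stdint.h> types so the
 * file also builds where the kernel-style __u* typedefs are missing. */
struct os_info_entry {
	union {
		uint64_t addr;
		uint64_t val;
	};
	uint64_t size;
	uint32_t csum;
} __attribute__((packed));

struct os_info {
	uint64_t magic;
	uint32_t csum;
	uint16_t version_major;
	uint16_t version_minor;
	uint64_t crashkernel_addr;
	uint64_t crashkernel_size;
	struct os_info_entry entry[10];
	uint8_t  reserved[3864];
} __attribute__((packed));

/* Illustrative checks: 8 + 8 + 4 bytes per entry, a 232-byte header
 * (32 bytes of fixed fields plus 10 entries), one 4 KiB page in total. */
_Static_assert(sizeof(struct os_info_entry) == 20, "entry is 20 bytes");
_Static_assert(offsetof(struct os_info, reserved) == 232, "header is 232 bytes");
_Static_assert(sizeof(struct os_info) == 4096, "os_info fills 4096 bytes");
```

With this layout the readmem() length used in the patch, offsetof(struct os_info, reserved), covers exactly the header and the ten entries.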

> +
> +#define S390X_LC_OS_INFO 0x0e18
> +
> +struct s390_ops {
> + unsigned long long  (*virt_to_phys)(unsigned long vaddr);
> + unsigned long   (*phys_to_virt)(unsigned long long paddr);
> +};
> +
> +static unsigned long long vaddr_to_paddr_s390x_legacy(unsigned long vaddr);
> +static unsigned long long vaddr_to_paddr_s390x_vr(unsigned long vaddr);
> +static unsigned long paddr_to_vaddr_s390x_legacy(unsigned long long paddr);
> +static unsigned long paddr_to_vaddr_s390x_vr(unsigned long long paddr);
> +
> +struct s390_ops s390_ops = {
> + .virt_to_phys = vaddr_to_paddr_s390x_legacy,
> + .phys_to_virt = paddr_to_vaddr_s390x_legacy,
> +};
> +
> +unsigned long long vaddr_to_paddr_s390x(unsigned long vaddr)
> +{
> + return s390_ops.virt_to_phys(vaddr);
> +}
> +
> +unsigned long paddr_to_vaddr_s390x(unsigned long long paddr)
> +{
> + return s390_ops.phys_to_virt(paddr);
> +}
> +
>   int
>   set_s390x_max_physmem_bits(void)
>   {
> @@ -88,12 +151,53 @@ set_s390x_max_physmem_bits(void)
>   return FALSE;
>   }
>   
> +static int s390x_init_vm(void)
> +{
> + struct os_info os_info;
> + ulong addr;
> +
> + if (!readmem(PADDR, S390X_LC_OS_INFO, &addr,
> + sizeof(addr)) || !addr) {
> + ERRMSG("Can't get s390x os_info ptr.\n");
> + return FALSE;
> + }
> +
> + if (addr == 0)
> + return TRUE;
> +
> + if (!readmem(PADDR, addr, &os_info, offsetof(struct os_info, reserved))) {
> + ERRMSG("Can't get os_info header.\n");
> + return FALSE;
> + }
> +
> + if (!os_info.entry[OS_INFO_KASLR_OFFSET].val)
> + return TRUE;
> +
> + MSG("The -vr kernel detected.\n");
> +
> + info->identity_map_base   = os_info.entry[OS_INFO_IDENTITY_BASE].val;
> + info->kvbase  = os_info.entry[OS_INFO_KASLR_OFFSET].val;
> + info->__kaslr_offset_phys = os_info.entry[OS_INFO_KASLR_OFF_PHYS].val;
> + info->vmemmap_start   = os_info.entry[OS_INFO_VMEMMAP].val;
> + info->amode31_start   = os_info.entry[OS_INFO_AMODE31_START].val;
> + info->amode31_end = os_info.entry[OS_INFO_AMODE31_END].val;
> +
> + s390_ops.virt_to_phys   = vaddr_to_paddr_s390x_vr;
> + s390_ops.phys_to_virt   = paddr_to_vaddr_s390x_vr;
> +
> + return TRUE;
> +}
> +
> +
>   int
>   get_machdep_info_s390x(void)
>   {
>   unsigned long vmalloc_start;
>   char *term_str = getenv("TERM");
>   
> + if (!s390x_init_vm())
> + return FALSE;
> +
>   if (term_str && strcmp(term_str, "dumb") == 0)
>   /* '\r' control character is ignored on "dumb" terminal. */
>   
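
The dispatch pattern in the quoted patch (a default ops table that an init routine may repoint once a newer kernel layout is detected) can be sketched in isolation as below. All demo_* names are hypothetical, and the offset-based translation is only a stand-in: the real vaddr_to_paddr_s390x_vr() body is not shown in this message.

```c
#include <stdint.h>

/* Hypothetical base address recorded at init time. */
static uint64_t demo_identity_base;

/* Stand-in for the legacy translation: virtual == physical. */
static uint64_t demo_v2p_legacy(uint64_t vaddr)
{
	return vaddr;
}

/* Stand-in for the newer translation: subtract the recorded base. */
static uint64_t demo_v2p_offset(uint64_t vaddr)
{
	return vaddr - demo_identity_base;
}

struct demo_ops {
	uint64_t (*virt_to_phys)(uint64_t vaddr);
};

/* Legacy handler is the default, as in the quoted s390_ops initializer. */
static struct demo_ops demo_ops = {
	.virt_to_phys = demo_v2p_legacy,
};

/* Public entry point: callers never see which handler is active. */
uint64_t demo_vaddr_to_paddr(uint64_t vaddr)
{
	return demo_ops.virt_to_phys(vaddr);
}

/* Switch handlers only when the (hypothetical) marker value is non-zero,
 * so existing kernels keep the legacy behaviour. */
void demo_init_vm(uint64_t identity_base)
{
	if (!identity_base)
		return;
	demo_identity_base = identity_base;
	demo_ops.virt_to_phys = demo_v2p_offset;
}
```

The point of the indirection is that callers of the public function need no changes when the second translation is introduced.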

[ANNOUNCE] makedumpfile 1.7.4

2023-11-05 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.4.
Thank you everyone for your help to maintain the tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
   https://github.com/makedumpfile/makedumpfile/releases

New features:
- Support for kernels up to v6.6
- Support for riscv64 architecture

Commits since 1.7.3:
3bc3b3e [v1.7.4] Update version (Kazuhito Hagio)
9661c1d [PATCH] Fix failure of compound page exclusion on Linux 6.6 and later (Kazuhito Hagio)
aee7f3b [PATCH 2/2] riscv64: Correct the pfn_start for flatmem (Song Shuai)
f777afb [PATCH 1/2] Add riscv64 support (Song Shuai)
a34f017 [PATCH] ppc64: do page traversal if vmemmap_list not populated (Aditya Gupta)
f23bb94 [PATCH] Support struct module_memory on Linux 6.4 and later (Kazuhito Hagio)
cad0d11 [PATCH] Add debugging information for DWARF information retrieval (Kazuhito Hagio)

Description of makedumpfile:
The makedumpfile is a tool for creating a dumpfile from /proc/vmcore
with filtering out unnecessary pages for analysis and compressing the
remaining pages, in order to shorten the size of the dumpfile and the
time of creating it.
https://github.com/makedumpfile/makedumpfile

Thanks,
Kazu
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH makedumpfile V2 0/2] Add riscv64 support for makedumpfile

2023-10-10 Thread  
On 2023/10/10 23:12, Song Shuai wrote:
> Changes since V1:
> https://lore.kernel.org/kexec/20230927111822.180630-1-songshuaish...@tinylab.org/
> 
> - fix a typo in Patch2's commit-msg
> - adjust some indentation in Patch1

Thank you, but I had already applied the v1 patches with fixes on my end:
https://github.com/makedumpfile/makedumpfile/compare/a34f017...aee7f3b

I should have sent this link, sorry about that.

Thanks,
Kazu

> 
> 
> These 2 patches add riscv64 support for makedumpfile:
> 
> Patch1 - Add riscv64 support
> ===
> 
> This patch adds support for riscv64 in makedumpfile.
> It implements the "vtop" for kernel memory regions
> and supports Sv39/Sv48/Sv57 page modes for RV64.
> 
> 
> Patch2 - riscv64: Correct the pfn_start for flatmem
> ==
> 
> This patch temporarily fixes an issue seen in the FLATMEM tests,
> as the commit-msg says:
>
>  To let info->max_mapnr indicate the direct max PFN and then
>  make the kdump header's max_mapnr_64 correct, riscv64 port
>  didn't define ARCH_PFN_OFFSET.
>  
>  As for FLATMEM type, the pfn region of mem_map_data should
>  be adjusted to start from info->phys_base instead of zero.
> 
> 
> Tests
> =
> 
> With these 2 patches, the following tests had passed in RV64 Qemu virt 
> machine:
> 
> Preparation:
> ---
> 
> 1. build kernel with FLATMEM and SPARSE memory models
> 2. boot kernel with 3 different page-modes by setting nov4l/nov5l in cmdline
> 3. panic kernel
> 
> Tests:
> -
> 
> 1. create kdump-compressed file via this command
> - `/mnt/mkdf_f -d31 -f -c /proc/vmcore /mnt/dump.file1`
> - or with `--vtop` option to translate some typical addresses (like:
>   kernel_link_addr, vmalloc_start, page_offset)
> 
> 2. start crash with kdump file and do some VTOPs
> 
> 
> A test log:
> ---
> 
> # With the Sv57 and SPARSE_EXTREME kernel
> # vtop the vmalloc start address -- 0xff20
> 
> 
> # /mnt/mkdf_f  --vtop 0xff20 -d31 -f --non-mmap -c /proc/vmcore 
> /mnt/dump.file1
> 
> Translating virtual address ff20 to physical address.
> VIRTUAL   PHYSICAL
> ff20  80087000
> 
> Copying data  : [100.0 %] |
> eta: 0s
> 
> The dumpfile is saved to /mnt/dump.file1.
> 
> makedumpfile Completed.
> 
> # sudo ../crash/crash /home/song/9_linux/linux/00_rv_def/vmlinux 
> /tmp/hello/dump.file1
> ...
>KERNEL: /home/song/9_linux/linux/00_rv_def/vmlinux
>  DUMPFILE: /tmp/hello/dump.file1  [PARTIAL DUMP]
>  CPUS: 2
>  DATE: Wed Sep 27 18:37:45 CST 2023
>UPTIME: 00:00:18
> LOAD AVERAGE: 0.00, 0.00, 0.00
> TASKS: 55
>  NODENAME: (none)
>   RELEASE: 6.6.0-rc1-7-g22bfc766389c
>   VERSION: #1 SMP Mon Sep 25 19:29:05 CST 2023
>   MACHINE: riscv64  (unknown Mhz)
>MEMORY: 511.8 MB
> PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>   PID: 1
>   COMMAND: "sh"
>  TASK: ff6e  [THREAD_INFO: ff6e]
>   CPU: 1
> STATE: TASK_RUNNING (PANIC)
> 
> crash> vtop 0xff20
> VIRTUAL   PHYSICAL
> ff20  80087000
> 
>PGD: 814fa900 => 20010c01
>P4D: 80043000 => 20025401
>PUD: 80095000 => 20025801
>PMD: 80096000 => 20026001
>PTE: 80098000 => 20021ce7
>   PAGE: 80087000
> 
>PTE PHYSICAL  FLAGS
> 20021ce7  80087000  (PRESENT|READ|WRITE|GLOBAL|ACCESSED|DIRTY)
> 
>PAGE   PHYSICAL  MAPPING   INDEX CNT FLAGS
> ff1c020021c0 8008700000  1 0  // same as the 
> makedumpfile's vtop
> 
> Song Shuai (2):
>Add riscv64 support
>riscv64: Correct the pfn_start for flatmem
> 
>   Makefile   |   2 +-
>   arch/riscv64.c | 219 +
>   makedumpfile.c |  18 
>   makedumpfile.h | 107 
>   4 files changed, 345 insertions(+), 1 deletion(-)
>   create mode 100644 arch/riscv64.c
> 
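
The page-table lines in the crash "vtop" output above can be checked by hand. The sketch below assumes the standard RISC-V PTE layout (flag bits in the low byte, PPN starting at bit 10, 4 KiB pages, 9 index bits per level); the rv_* names are invented for this illustration and are not taken from arch/riscv64.c.

```c
#include <stdint.h>

#define RV_PAGE_SHIFT     12                  /* 4 KiB base pages */
#define RV_VPN_BITS       9                   /* index bits per table level */
#define RV_PTE_PPN_SHIFT  10                  /* PPN starts at bit 10 */
#define RV_PTE_PPN_MASK   ((1ULL << 44) - 1)  /* PPN occupies bits 53:10 */

/* Low-order PTE flag bits. */
enum {
	RV_PTE_V = 1 << 0,  /* present/valid */
	RV_PTE_R = 1 << 1,
	RV_PTE_W = 1 << 2,
	RV_PTE_X = 1 << 3,
	RV_PTE_U = 1 << 4,
	RV_PTE_G = 1 << 5,
	RV_PTE_A = 1 << 6,
	RV_PTE_D = 1 << 7,
};

/* Table index for a given level; level 0 is the last (PTE) level.
 * Sv39 walks levels 2..0, Sv48 walks 3..0, Sv57 walks 4..0. */
static inline unsigned int rv_table_index(uint64_t vaddr, int level)
{
	return (vaddr >> (RV_PAGE_SHIFT + RV_VPN_BITS * level)) & 0x1ff;
}

/* Physical address of the page (or next-level table) a PTE points to. */
static inline uint64_t rv_pte_to_paddr(uint64_t pte)
{
	return ((pte >> RV_PTE_PPN_SHIFT) & RV_PTE_PPN_MASK) << RV_PAGE_SHIFT;
}
```

Applied to the log, the PTE value 20021ce7 decodes to page 80087000, and the PGD entry 20010c01 decodes to 80043000, which is the P4D table address printed on the next line.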


Re: [PATCH makedumpfile 0/2] Add riscv64 support for makedumpfile

2023-10-10 Thread  
On 2023/10/07 11:27, Song Shuai wrote:
> 
> 
> On 2023/10/3 12:22, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> Hi,
>>
>> thank you for the patch.
>>
>> On 2023/09/27 20:18, Song Shuai wrote:
>>> These 2 patches add riscv64 support for makedumpfile:
>>>
>>> Patch1 - Add riscv64 support
>>> ===
>>>
>>> This patch adds support for riscv64 in makedumpfile.
>>> It implements the "vtop" for kernel memory regions
>>> and supports Sv39/Sv48/Sv57 page modes for RV64.
>>
>> Could I have a log of makedumpfile with --message-level 31 option for
>> reference? e.g.
>>     makedumpfile -c -d 31 --message-level 31 vmcore dumpfile > mkdf.log
>>
>> (IIRC the kexec mail list doesn't accept attached files, so please send
>> it off-list.)
> 
> Sorry for the late reply,
> 
> here is the log for the Sv57 and SPARSE_EXTREME kernel:
> 
> https://termbin.com/zcf9:
> 
> and the log for FLATMEM
> 
> https://termbin.com/t89k

Thank you for the information.

> 
>>
>>>
>>>
>>> Patch2 - riscv64: Correct the pfn_start for flatmem
>>> ==
>>>
>>> This patch temporarily fixes an issue seen in the FLATMEM tests,
>>> as the commit-msg says:
>>>   To let info->max_mapnr indicte the direct max PFN and then
>>
>> This means "indicate", right?
>>
> Right, would fix it if you're ok with the Patch2.

The patches look good, so I applied them with that fix and several 
indentation adjustments.

Thanks,
Kazu

> 
>> Thanks,
>> Kazu
>>
>>>   make the kdump header's max_mapnr_64 correct, riscv64 port
>>>   didn't define ARCH_PFN_OFFSET.
>>>   As for FLATMEM type, the pfn region of mem_map_data should
>>>   be adjusted to start from info->phys_base instead of zero.
>>>
>>> I have not considered or tested other arches, so I simply
>>> check for __riscv64__ instead of ARCH_PFN_OFFSET. Maybe we can 
>>> improve it.
>>>
>>>
>>> Tests
>>> =
>>>
>>> With these 2 patches, the following tests had passed in RV64 Qemu 
>>> virt machine:
>>>
>>> Preparation:
>>> ---
>>>
>>> 1. build kernel with FLATMEM and SPARSE memory models
>>> 2. boot kernel with 3 different page-modes by setting nov4l/nov5l in 
>>> cmdline
>>> 3. panic kernel
>>>
>>> Tests:
>>> -
>>>
>>> 1. create kdump-compressed file via this command
>>>  - `/mnt/mkdf_f -d31 -f -c /proc/vmcore /mnt/dump.file1`
>>>  - or with `--vtop` option to translate some typical addresses 
>>> (like:
>>>    kernel_link_addr, vmalloc_start, page_offset)
>>>
>>> 2. start crash with kdump file and do some VTOPs
>>>
>>>
>>> A test log:
>>> ---
>>>
>>> # With the Sv57 and SPARSE_EXTREME kernel
>>> # vtop the vmalloc start address -- 0xff20
>>>
>>>
>>> # /mnt/mkdf_f  --vtop 0xff20 -d31 -f --non-mmap -c 
>>> /proc/vmcore /mnt/dump.file1
>>>
>>> Translating virtual address ff20 to physical address.
>>> VIRTUAL   PHYSICAL
>>> ff20  80087000
>>>
>>> Copying data  : [100.0 %] |
>>> eta: 0s
>>>
>>> The dumpfile is saved to /mnt/dump.file1.
>>>
>>> makedumpfile Completed.
>>>
>>> # sudo ../crash/crash /home/song/9_linux/linux/00_rv_def/vmlinux 
>>> /tmp/hello/dump.file1
>>> ...
>>>     KERNEL: /home/song/9_linux/linux/00_rv_def/vmlinux
>>>   DUMPFILE: /tmp/hello/dump.file1  [PARTIAL DUMP]
>>>   CPUS: 2
>>>   DATE: Wed Sep 27 18:37:45 CST 2023
>>>     UPTIME: 00:00:18
>>> LOAD AVERAGE: 0.00, 0.00, 0.00
>>>  TASKS: 55
>>>   NODENAME: (none)
>>>    RELEASE: 6.6.0-rc1-7-g22bfc766389c
>>>    VERSION: #1 SMP Mon Sep 25 19:29:05 CST 2023
>>>    MACHINE: riscv64  (unknown Mhz)
>>>     MEMORY: 511.8 MB
>>>  PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>>    PID: 1
>>>    COMMAND: "sh"
>>>   TASK: ff6e  [THREAD_INFO: ff6e]
>>>    CPU: 1
>>>  STATE: TASK_RUNN

Re: [PATCH makedumpfile 0/2] Add riscv64 support for makedumpfile

2023-10-02 Thread  
Hi,

thank you for the patch.

On 2023/09/27 20:18, Song Shuai wrote:
> These 2 patches add riscv64 support for makedumpfile:
> 
> Patch1 - Add riscv64 support
> ===
> 
> This patch adds support for riscv64 in makedumpfile.
> It implements the "vtop" for kernel memory regions
> and supports Sv39/Sv48/Sv57 page modes for RV64.

Could I have a log of makedumpfile with --message-level 31 option for 
reference? e.g.
   makedumpfile -c -d 31 --message-level 31 vmcore dumpfile > mkdf.log

(IIRC the kexec mail list doesn't accept attached files, so please send 
it off-list.)

> 
> 
> Patch2 - riscv64: Correct the pfn_start for flatmem
> ==
> 
> This patch temporarily fixes an issue seen in the FLATMEM tests,
> as the commit-msg says:
>
>  To let info->max_mapnr indicte the direct max PFN and then

This means "indicate", right?

Thanks,
Kazu

>  make the kdump header's max_mapnr_64 correct, riscv64 port
>  didn't define ARCH_PFN_OFFSET.
>  
>  As for FLATMEM type, the pfn region of mem_map_data should
>  be adjusted to start from info->phys_base instead of zero.
> 
> I have not considered or tested other arches, so I simply
> check for __riscv64__ instead of ARCH_PFN_OFFSET. Maybe we can improve it.
> 
> 
> Tests
> =
> 
> With these 2 patches, the following tests had passed in RV64 Qemu virt 
> machine:
> 
> Preparation:
> ---
> 
> 1. build kernel with FLATMEM and SPARSE memory models
> 2. boot kernel with 3 different page-modes by setting nov4l/nov5l in cmdline
> 3. panic kernel
> 
> Tests:
> -
> 
> 1. create kdump-compressed file via this command
> - `/mnt/mkdf_f -d31 -f -c /proc/vmcore /mnt/dump.file1`
> - or with `--vtop` option to translate some typical addresses (like:
>   kernel_link_addr, vmalloc_start, page_offset)
> 
> 2. start crash with kdump file and do some VTOPs
> 
> 
> A test log:
> ---
> 
> # With the Sv57 and SPARSE_EXTREME kernel
> # vtop the vmalloc start address -- 0xff20
> 
> 
> # /mnt/mkdf_f  --vtop 0xff20 -d31 -f --non-mmap -c /proc/vmcore 
> /mnt/dump.file1
> 
> Translating virtual address ff20 to physical address.
> VIRTUAL   PHYSICAL
> ff20  80087000
> 
> Copying data  : [100.0 %] |
> eta: 0s
> 
> The dumpfile is saved to /mnt/dump.file1.
> 
> makedumpfile Completed.
> 
> # sudo ../crash/crash /home/song/9_linux/linux/00_rv_def/vmlinux 
> /tmp/hello/dump.file1
> ...
>KERNEL: /home/song/9_linux/linux/00_rv_def/vmlinux
>  DUMPFILE: /tmp/hello/dump.file1  [PARTIAL DUMP]
>  CPUS: 2
>  DATE: Wed Sep 27 18:37:45 CST 2023
>UPTIME: 00:00:18
> LOAD AVERAGE: 0.00, 0.00, 0.00
> TASKS: 55
>  NODENAME: (none)
>   RELEASE: 6.6.0-rc1-7-g22bfc766389c
>   VERSION: #1 SMP Mon Sep 25 19:29:05 CST 2023
>   MACHINE: riscv64  (unknown Mhz)
>MEMORY: 511.8 MB
> PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>   PID: 1
>   COMMAND: "sh"
>  TASK: ff6e  [THREAD_INFO: ff6e]
>   CPU: 1
> STATE: TASK_RUNNING (PANIC)
> 
> crash> vtop 0xff20
> VIRTUAL   PHYSICAL
> ff20  80087000
> 
>PGD: 814fa900 => 20010c01
>P4D: 80043000 => 20025401
>PUD: 80095000 => 20025801
>PMD: 80096000 => 20026001
>PTE: 80098000 => 20021ce7
>   PAGE: 80087000
> 
>PTE PHYSICAL  FLAGS
> 20021ce7  80087000  (PRESENT|READ|WRITE|GLOBAL|ACCESSED|DIRTY)
> 
>PAGE   PHYSICAL  MAPPING   INDEX CNT FLAGS
> ff1c020021c0 8008700000  1 0  // same as the 
> makedumpfile's vtop
> 
> 
> Song Shuai (2):
>Add riscv64 support
>riscv64: Correct the pfn_start for flatmem
> 
>   Makefile   |   2 +-
>   arch/riscv64.c | 219 +
>   makedumpfile.c |  18 
>   makedumpfile.h | 107 
>   4 files changed, 345 insertions(+), 1 deletion(-)
>   create mode 100644 arch/riscv64.c
> 
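
The FLATMEM adjustment described for Patch2 amounts to starting the single mem_map region at the PFN of phys_base rather than at PFN 0. A minimal sketch of that arithmetic follows; the demo_* names, the 4 KiB page size, and the way pfn_end is derived from a memory size are assumptions for illustration, not the code from the patch.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12  /* assumed 4 KiB pages */

struct demo_mem_map {
	uint64_t pfn_start;
	uint64_t pfn_end;   /* exclusive */
};

/* FLATMEM has one contiguous mem_map. When the architecture does not
 * define ARCH_PFN_OFFSET, the region has to begin at the PFN of the
 * first byte of RAM (phys_base) instead of at zero. */
static struct demo_mem_map demo_flatmem_range(uint64_t phys_base,
					      uint64_t mem_size)
{
	struct demo_mem_map mm;

	mm.pfn_start = phys_base >> DEMO_PAGE_SHIFT;
	mm.pfn_end   = (phys_base + mem_size) >> DEMO_PAGE_SHIFT;
	return mm;
}
```

For a hypothetical machine with RAM based at 0x80000000 and 512 MiB of memory, this gives PFNs 0x80000 up to (but not including) 0xa0000; starting at zero instead would describe half a million page frames that do not exist.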


Re: [PATCH] makedumpfile: ppc64: do page traversal if vmemmap_list not populated

2023-09-27 Thread  
On 2023/09/14 18:22, Aditya Gupta wrote:
> Currently 'makedumpfile' fails to collect vmcore on upstream kernel,
> with the errors:
> 
>  readpage_elf: Attempt to read non-existent page at 0x4000.
>  readmem: type_addr: 0, addr:0, size:8
>  get_vmemmap_list_info: Can't get vmemmap region addresses
>  get_machdep_info_ppc64: Can't get vmemmap list info.
> 
> This occurs since makedumpfile depends on 'vmemmap_list' for translating
> vmemmap addresses. But with below commit in Linux, vmemmap_list can be
> empty, in case of Radix MMU on PowerPC64
> 
>  368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a
>  different vmemmap handling function)
> 
> If vmemmap_list is empty, its head is NULL, which causes
> makedumpfile to fail with above error.
> 
> Since with above commit, 'vmemmap_list' is not populated (when MMU is
> Radix MMU), kernel populates corresponding page table entries in kernel
> page table. Hence, instead of depending on 'vmemmap_list' for address
> translation for vmemmap addresses, do a kernel pagetable walk.
> 
> And since the pte can also be introduced at higher levels in the page
> table, such as at PMD level, add hugepage support, by checking for
> PAGE_PTE flag
> 
> Reported-by: Sachin Sant 
> Signed-off-by: Aditya Gupta 

Thank you for the patch, applied.

https://github.com/makedumpfile/makedumpfile/commit/a34f017965583e89c4cb0b00117c200a6c191e54

Sorry for the delay, exceptionally busy this month..

Thanks,
Kazu
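
The behaviour described in the commit message (use the vmemmap_list entries when the list is populated, otherwise fall back to a kernel page-table walk) can be sketched as follows. The list lookup mirrors the ppc64_vmemmap_to_phys() loop visible in the quoted diff; the demo_* names and the walker callback are hypothetical stand-ins, not makedumpfile code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOT_PADDR UINT64_MAX

struct demo_vmemmap {
	uint64_t virt;  /* start of one vmemmap backing region */
	uint64_t phys;  /* physical address it maps to */
};

/* Hypothetical page-table walker, supplied by the caller. */
typedef uint64_t (*demo_walk_fn)(uint64_t vaddr);

/* Linear search of the recorded regions, each psize bytes long. */
static uint64_t demo_vmemmap_to_phys(const struct demo_vmemmap *list,
				     size_t cnt, uint64_t psize,
				     uint64_t vaddr)
{
	for (size_t i = 0; i < cnt; i++) {
		if (vaddr >= list[i].virt && vaddr < list[i].virt + psize)
			return list[i].phys + (vaddr - list[i].virt);
	}
	return DEMO_NOT_PADDR;
}

/* An empty list (cnt == 0) means the kernel mapped vmemmap in its page
 * table instead, so translation goes through the walker. */
static uint64_t demo_translate(const struct demo_vmemmap *list, size_t cnt,
			       uint64_t psize, uint64_t vaddr,
			       demo_walk_fn walk)
{
	if (cnt == 0)
		return walk(vaddr);
	return demo_vmemmap_to_phys(list, cnt, psize, vaddr);
}

/* Stand-in walker used only to exercise the fallback path. */
static uint64_t demo_walk_stub(uint64_t vaddr)
{
	return vaddr & 0xfff;
}
```

The design choice worth noting is that an empty list is treated as a normal condition rather than an error, which is what lets the same binary handle both Hash and Radix MMU kernels.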

> ---
>   arch/ppc64.c   | 111 ++---
>   makedumpfile.h |   6 +++
>   2 files changed, 84 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/ppc64.c b/arch/ppc64.c
> index 5e70acb51aba..9456b8b570c5 100644
> --- a/arch/ppc64.c
> +++ b/arch/ppc64.c
> @@ -196,6 +196,10 @@ ppc64_vmemmap_init(void)
>   int psize, shift;
>   ulong head;
>   
> + /* initialise vmemmap_list in case SYMBOL(vmemmap_list) is not found */
> + info->vmemmap_list = NULL;
> + info->vmemmap_cnt = 0;
> + 
>   if ((SYMBOL(vmemmap_list) == NOT_FOUND_SYMBOL)
>   || (SYMBOL(mmu_psize_defs) == NOT_FOUND_SYMBOL)
>   || (SYMBOL(mmu_vmemmap_psize) == NOT_FOUND_SYMBOL)
> @@ -216,15 +220,24 @@ ppc64_vmemmap_init(void)
>   return FALSE;
>   info->vmemmap_psize = 1 << shift;
>   
> - if (!readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long)))
> - return FALSE;
> -
>   /*
> -  * Get vmemmap list count and populate vmemmap regions info
> -  */
> - info->vmemmap_cnt = get_vmemmap_list_info(head);
> - if (info->vmemmap_cnt == 0)
> - return FALSE;
> +  * vmemmap_list symbol can be missing or set to 0 in the kernel.
> +  * This would imply vmemmap region is mapped in the kernel pagetable.
> +  *
> +  * So, read vmemmap_list anyway, and use 'vmemmap_list' if it's not empty
> +  * (head != NULL), or we will do a kernel pagetable walk for vmemmap address
> +  * translation later
> +  **/
> + readmem(VADDR, SYMBOL(vmemmap_list), &head, sizeof(unsigned long));
> +
> + if (head) {
> + /*
> +  * Get vmemmap list count and populate vmemmap regions info
> +  */
> + info->vmemmap_cnt = get_vmemmap_list_info(head);
> + if (info->vmemmap_cnt == 0)
> + return FALSE;
> + }
>   
>   info->flag_vmemmap = TRUE;
>   return TRUE;
> @@ -347,29 +360,6 @@ ppc64_vmalloc_init(void)
>   return TRUE;
>   }
>   
> -/*
> - *  If the vmemmap address translation information is stored in the kernel,
> - *  make the translation.
> - */
> -static unsigned long long
> -ppc64_vmemmap_to_phys(unsigned long vaddr)
> -{
> - int i;
> - ulong   offset;
> - unsigned long long paddr = NOT_PADDR;
> -
> - for (i = 0; i < info->vmemmap_cnt; i++) {
> - if ((vaddr >= info->vmemmap_list[i].virt) && (vaddr <
> - (info->vmemmap_list[i].virt + info->vmemmap_psize))) {
> - offset = vaddr - info->vmemmap_list[i].virt;
> - paddr = info->vmemmap_list[i].phys + offset;
> - break;
> - }
> - }
> -
> - return paddr;
> -}
> -
>   static unsigned long long
>   ppc64_vtop_level4(unsigned long vaddr)
>   {
> @@ -379,6 +369,8 @@ ppc64_vtop_level4(unsigned long vaddr)
>   unsigned long long pgd_pte, pud_pte;
>   unsigned long long pmd_pte, pte;
>   unsigned long long paddr = NOT_PADDR;
> + uint is_hugepage = 0;
> + uint pdshift;
>   uint swap = 0;
>   
>   if (info->page_buf == NULL) {
> @@ -413,6 +405,13 @@ ppc64_vtop_level4(unsigned long vaddr)
>   if (!pgd_pte)
>   return NOT_PADDR;
>   
> + if (IS_HUGEPAGE(pgd_pte)) {
> + is_hugepage = 1;
> + pte = pgd_pte;
> + pdshift = info->l4_shift;
> + goto out;
> + }
> +
>   

Re: [PATCH v2 4/9] mm: vmalloc: Remove global vmap_area_root rb-tree

2023-09-07 Thread  
On 2023/09/08 13:43, Baoquan He wrote:
> On 09/08/23 at 01:51am, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> On 2023/09/07 18:58, Baoquan He wrote:
>>> On 09/07/23 at 11:39am, Uladzislau Rezki wrote:
>>>> On Thu, Sep 07, 2023 at 10:17:39AM +0800, Baoquan He wrote:
>>>>> Add Kazu and Lianbo to CC, and kexec mailing list
>>>>>
>>>>> On 08/29/23 at 10:11am, Uladzislau Rezki (Sony) wrote:
>>>>>> Store allocated objects in a separate nodes. A va->va_start
>>>>>> address is converted into a correct node where it should
>>>>>> be placed and resided. An addr_to_node() function is used
>>>>>> to do a proper address conversion to determine a node that
>>>>>> contains a VA.
>>>>>>
>>>>>> Such approach balances VAs across nodes as a result an access
>>>>>> becomes scalable. Number of nodes in a system depends on number
>>>>>> of CPUs divided by two. The density factor in this case is 1/2.
>>>>>>
>>>>>> Please note:
>>>>>>
>>>>>> 1. As of now allocated VAs are bound to a node-0. It means the
>>>>>>  patch does not give any difference comparing with a current
>>>>>>  behavior;
>>>>>>
>>>>>> 2. The global vmap_area_lock, vmap_area_root are removed as there
>>>>>>  is no need in it anymore. The vmap_area_list is still kept and
>>>>>>  is _empty_. It is exported for a kexec only;
>>>>>
>>>>> I haven't taken a test, while accessing all nodes' busy tree to get
>>>>> va of the lowest address could severely impact kcore reading efficiency
>>>>> on system with many vmap nodes. People doing live debugging via
>>>>> /proc/kcore will get a little surprise.
>>>>>
>>>>>
>>>>> An empty vmap_area_list will break the makedumpfile utility, and the
>>>>> Crash utility could be impacted too. I checked the makedumpfile code; it
>>>>> relies on vmap_area_list to deduce the vmalloc_start value.
>>>>>
>>>> It is left part and i hope i fix it in v3. The problem here is
>>>> we can not give an opportunity to access to vmap internals from
>>>> outside. This is just not correct, i.e. you are not allowed to
>>>> access the list directly.
>>>
>>> Right. Thanks for the fix in v3, that is a relief of makedumpfile and
>>> crash.
>>>
>>> Hi Kazu,
>>>
>>> Meanwhile, I am thinking if we should evaluate the necessity of
>>> vmap_area_list in makedumpfile and Crash. In makedumpfile, we just use
>>> vmap_area_list to deduce VMALLOC_START. Wondering if we can export
>>> VMALLOC_START directly. Surely, the lowest va->va_start in vmap_area_list
>>> is a tighter low boundary of vmalloc area and can reduce unnecessary
>>> scanning below the lowest va. Not sure if this is the reason people
>>> decided to export vmap_area_list.
>>
>> The kernel commit acd99dbf5402 introduced the original vmlist entry to
>> vmcoreinfo, but there is no information about why it did not export
>> VMALLOC_START directly.
>>
>> If VMALLOC_START is exported directly to vmcoreinfo, I think it would be
>> enough for makedumpfile.
> 
> Thanks for confirmation, Kazu.
> 
> Then, below draft patch should be enough to export VMALLOC_START
> instead, and remove vmap_area_list. 

Also, the following entries can be removed.

 VMCOREINFO_OFFSET(vmap_area, va_start);
 VMCOREINFO_OFFSET(vmap_area, list);

Thanks,
Kazu

> In order to get the base address of
> vmalloc area, constructing a vmap_area_list from multiple busy-tree
> seems not worth.
> 
> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst 
> b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> index 599e8d3bcbc3..3cb1ea09ff26 100644
> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> @@ -65,11 +65,11 @@ Defines the beginning of the text section. In general, 
> _stext indicates
>   the kernel start address. Used to convert a virtual address from the
>   direct kernel map to a physical address.
>   
> -vmap_area_list
> ---
> +VMALLOC_START
> +-
>   
> -Stores the virtual area list. makedumpfile gets the vmalloc start value
> -from this variable and its value is necessary for vmalloc translation.
> +Stores the base address of vmalloc area. makedumpfile gets this value and
> +its value is neces

Re: [PATCH v2 4/9] mm: vmalloc: Remove global vmap_area_root rb-tree

2023-09-07 Thread  
On 2023/09/07 18:58, Baoquan He wrote:
> On 09/07/23 at 11:39am, Uladzislau Rezki wrote:
>> On Thu, Sep 07, 2023 at 10:17:39AM +0800, Baoquan He wrote:
>>> Add Kazu and Lianbo to CC, and kexec mailing list
>>>
>>> On 08/29/23 at 10:11am, Uladzislau Rezki (Sony) wrote:
>>>> Store allocated objects in a separate nodes. A va->va_start
>>>> address is converted into a correct node where it should
>>>> be placed and resided. An addr_to_node() function is used
>>>> to do a proper address conversion to determine a node that
>>>> contains a VA.
>>>>
>>>> Such approach balances VAs across nodes as a result an access
>>>> becomes scalable. Number of nodes in a system depends on number
>>>> of CPUs divided by two. The density factor in this case is 1/2.
>>>>
>>>> Please note:
>>>>
>>>> 1. As of now allocated VAs are bound to a node-0. It means the
>>>>  patch does not give any difference comparing with a current
>>>>  behavior;
>>>>
>>>> 2. The global vmap_area_lock, vmap_area_root are removed as there
>>>>  is no need in it anymore. The vmap_area_list is still kept and
>>>>  is _empty_. It is exported for a kexec only;
>>>
>>> I haven't taken a test, while accessing all nodes' busy tree to get
>>> va of the lowest address could severely impact kcore reading efficiency
>>> on system with many vmap nodes. People doing live debugging via
>>> /proc/kcore will get a little surprise.
>>>
>>>
>>> An empty vmap_area_list will break the makedumpfile utility, and the
>>> Crash utility could be impacted too. I checked the makedumpfile code; it
>>> relies on vmap_area_list to deduce the vmalloc_start value.
>>>
>> It is left part and i hope i fix it in v3. The problem here is
>> we can not give an opportunity to access to vmap internals from
>> outside. This is just not correct, i.e. you are not allowed to
>> access the list directly.
> 
> Right. Thanks for the fix in v3, that is a relief of makedumpfile and
> crash.
> 
> Hi Kazu,
> 
> Meanwhile, I am thinking if we should evaluate the necessity of
> vmap_area_list in makedumpfile and Crash. In makedumpfile, we just use
> vmap_area_list to deduce VMALLOC_START. Wondering if we can export
> VMALLOC_START directly. Surely, the lowest va->va_start in vmap_area_list
> is a tighter low boundary of vmalloc area and can reduce unnecessary
> scanning below the lowest va. Not sure if this is the reason people
> decided to export vmap_area_list.

The kernel commit acd99dbf5402 introduced the original vmlist entry to 
vmcoreinfo, but there is no information about why it did not export 
VMALLOC_START directly.

If VMALLOC_START is exported directly to vmcoreinfo, I think it would be 
enough for makedumpfile.

Thanks,
Kazu
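
To make the trade-off concrete, here is a sketch of the two ways a dump tool could obtain the vmalloc start: deduce it from the lowest va_start on a list, or take a directly exported value. The demo_* types and functions are hypothetical and do not reflect makedumpfile's actual code or the final vmcoreinfo format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of a vmap area list. */
struct demo_vmap_area {
	uint64_t va_start;
	const struct demo_vmap_area *next;
};

/* Deduce the start from the list: the lowest va_start is a lower bound
 * of the vmalloc area in use. Returns 0 when the list is empty, which is
 * the failure mode discussed above. */
static int demo_start_from_list(const struct demo_vmap_area *head,
				uint64_t *out)
{
	const struct demo_vmap_area *va;

	if (!head)
		return 0;

	*out = head->va_start;
	for (va = head->next; va; va = va->next) {
		if (va->va_start < *out)
			*out = va->va_start;
	}
	return 1;
}

/* Prefer a directly exported value (non-zero) and fall back to the list
 * for kernels that do not export it. */
static int demo_get_vmalloc_start(uint64_t exported,
				  const struct demo_vmap_area *head,
				  uint64_t *out)
{
	if (exported) {
		*out = exported;
		return 1;
	}
	return demo_start_from_list(head, out);
}
```

With a direct export, the empty-list case no longer matters, at the cost of a slightly looser lower bound than the lowest allocated va_start.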


Re: [Crash-utility PATCH V2] RISCV64: Add KASLR support

2023-08-20 Thread  
On 2023/08/18 18:50, Song Shuai wrote:
> From: Song Shuai 
> 
> This patch adds KASLR support for Crash to analyze KASLR-ed vmcore
> since RISC-V Linux is already sufficiently prepared for KASLR [1].
> 
> With this patch, even if the Crash '--kaslr' option is not set or Linux
> CONFIG_RANDOMIZE_BASE is not configured, the 'derive_kaslr_offset()'
> function will always work to calculate 'kt->relocate' which serves to
> update the kernel virtual address.
> 
> Testing in QEMU rv64 virt, the kernel log printed the kernel offset:
> 
> [  121.214447] SMP: stopping secondary CPUs
> [  121.215445] Kernel Offset: 0x37c0 from 0x8000
> [  121.216312] Starting crashdump kernel...
> [  121.216585] Will call new kernel at 9480 from hart id 0
> [  121.216834] FDT image at 9c7fd000
> [  121.216982] Bye...
> 
> Running crash with `-d 1` option and without `--kaslr` option,
> we get the right `kt->relocate` and kernel link addr:
> 
> $ ../crash/crash -d 1 vmlinux vmcore_kaslr_0815
> ...
> KASLR:
>_stext from vmlinux: 80002000
>_stext from vmcoreinfo: b7c02000
>relocate: 37c0 (892MB)
> vmemmap : 0xff1c - 0xff20
> vmalloc : 0xff20 - 0xff60
> mudules : 0x3952f000 - 0xb7c0
> lowmem  : 0xff60 -
> kernel link addr: 0xb7c0
> ...
>KERNEL: /home/song/9_linux/linux/00_rv_kaslr/vmlinux
>  DUMPFILE: /tmp/hello/vmcore_kaslr_0815
>  CPUS: 2
>  DATE: Tue Aug 15 16:36:15 CST 2023
>UPTIME: 00:02:01
> LOAD AVERAGE: 0.40, 0.23, 0.09
> TASKS: 63
>  NODENAME: stage4.fedoraproject.org
>   RELEASE: 6.5.0-rc3-8-gad18dee423ac
>   VERSION: #17 SMP Tue Aug 15 14:41:12 CST 2023
>   MACHINE: riscv64  (unknown Mhz)
>MEMORY: 511.8 MB
> PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>   PID: 160
>   COMMAND: "bash"
>  TASK: ff600152bac0  [THREAD_INFO: ff600152bac0]
>   CPU: 1
> STATE: TASK_RUNNING (PANIC)
> crash>
> 
> [1]: 
> https://lore.kernel.org/linux-riscv/20230722123850.634544-1-alexgh...@rivosinc.com/
> 
> Signed-off-by: Song Shuai 
> Reviewed-by: Guo Ren 
> 
> ---
> Changes since V1:
> https://lore.kernel.org/linux-riscv/20230815104800.705753-1-songshuaish...@tinylab.org/
>- supplement the output of my Crash test in the commit-msg
>- add the Reviewed-by from Guo
> 
> ---
>   main.c|  2 +-
>   riscv64.c | 11 +++
>   symbols.c |  4 ++--
>   3 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/main.c b/main.c
> index b278c22..0c6e595 100644
> --- a/main.c
> +++ b/main.c
> @@ -228,7 +228,7 @@ main(int argc, char **argv)
>   } else if (STREQ(long_options[option_index].name, 
> "kaslr")) {
>   if (!machine_type("X86_64") &&
>   !machine_type("ARM64") && 
> !machine_type("X86") &&
> - !machine_type("S390X"))
> + !machine_type("S390X") && 
> !machine_type("RISCV64"))
>   error(INFO, "--kaslr not valid "
>   "with this machine type.\n");
>   else if (STREQ(optarg, "auto"))
> diff --git a/riscv64.c b/riscv64.c
> index a02f75a..288c7ae 100644
> --- a/riscv64.c
> +++ b/riscv64.c
> @@ -378,6 +378,9 @@ static void riscv64_get_va_range(struct machine_specific 
> *ms)
>   } else
>   goto error;
>   
> + if ((kt->flags2 & KASLR) && (kt->flags & RELOC_SET))
> + ms->kernel_link_addr += (kt->relocate * -1);
> +
>   /*
>* From Linux 5.13, the kernel mapping is moved to the last 2GB
>* of the address space, modules use the 2GB memory range right
> @@ -1360,6 +1363,14 @@ riscv64_init(int when)
>   
>   machdep->verify_paddr = generic_verify_paddr;
>   machdep->ptrs_per_pgd = PTRS_PER_PGD;
> +
> + /*
> +  * Even if CONFIG_RANDOMIZE_BASE is not configured,
> +  * derive_kaslr_offset() should work and set
> +  * kt->relocate to 0
> +  */
> + if (!kt->relocate && !(kt->flags2 & (RELOC_AUTO|KASLR)))
> + kt->flags2 |= (RELOC_AUTO|KASLR);
>   break;
>   
>   case PRE_GDB:
> diff --git a/symbols.c b/symbols.c
> index 876be7a..8e8b4c3 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -629,7 +629,7 @@ kaslr_init(void)
>   char *string;
>   
>   if ((!machine_type("X86_64") && !machine_type("ARM64") && 
> !machine_type("X86") &&
> - !machine_type("S390X")) || (kt->flags & RELOC_SET))
> + !machine_type("S390X") && !machine_type("RISCV64")) || (kt->flags & 
> RELOC_SET))
>   return;
>   
>   if (!kt->vmcoreinfo._stext_SYMBOL &&
> @@ -795,7 +795,7 @@ store_symbols(bfd *abfd, int dynamic, void *minisyms, 

Re: [Crash-utility PATCH V2] RISCV64: Use va_kernel_pa_offset in VTOP()

2023-08-14 Thread  
On 2023/08/04 18:15, Song Shuai wrote:
> Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use
> PUD/P4D/PGD pages for the linear mapping") changes phys_ram_base from
> the physical start of the kernel to the actual start of the DRAM.
> 
> Crash's VTOP() still uses phys_ram_base and kernel_map.virt_addr
> to translate kernel virtual addresses, which makes Crash fail to start
> with Linux v6.4 and later versions.
> 
> Let Linux export kernel_map.va_kernel_pa_offset in v6.5 and backported
> v6.4.0 stable, so Crash can use "va_kernel_pa_offset" to translate the
> kernel virtual address in VTOP() correctly.
> 
> Signed-off-by: Song Shuai 
> ---
> Changes since V1:
>- remove unnecessary first kernel version check as Kazu suggested
>- amend the commit-msg as Alex suggested

Thanks for the v2.

Acked-by: Kazuhito Hagio 

Another way would be to check the vmcoreinfo entry before the kernel 
version, but either is fine with me.

Thanks,
Kazu

> ---
>   defs.h|  4 ++--
>   riscv64.c | 23 +++
>   2 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/defs.h b/defs.h
> index 5ee60f1..c07e6d7 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -3662,8 +3662,7 @@ typedef signed int s32;
>   ulong _X = X; \
>   (THIS_KERNEL_VERSION >= LINUX(5,13,0) && \
>   (_X) >= machdep->machspec->kernel_link_addr) ? \
> - (((unsigned long)(_X)-(machdep->machspec->kernel_link_addr)) + \
> -  machdep->machspec->phys_base): \
> + ((unsigned long)(_X)-(machdep->machspec->va_kernel_pa_offset)): \
>   (((unsigned long)(_X)-(machdep->kvbase)) + \
>    machdep->machspec->phys_base); \
>   })
> @@ -7021,6 +7020,7 @@ struct machine_specific {
>   ulong modules_vaddr;
>   ulong modules_end;
>   ulong kernel_link_addr;
> + ulong va_kernel_pa_offset;
>   
>   ulong _page_present;
>   ulong _page_read;
> diff --git a/riscv64.c b/riscv64.c
> index 6b9a688..7b5dd3d 100644
> --- a/riscv64.c
> +++ b/riscv64.c
> @@ -418,6 +418,28 @@ error:
>   error(FATAL, "cannot get vm layout\n");
>   }
>   
> +static void
> +riscv64_get_va_kernel_pa_offset(struct machine_specific *ms)
> +{
> + unsigned long kernel_version = riscv64_get_kernel_version();
> +
> + /*
> +  * Since Linux v6.4 phys_base is not the physical start of the kernel,
> +  * trying to use "va_kernel_pa_offset" to determine the offset between
> +  * kernel virtual and physical addresses.
> +  */
> + if (kernel_version >= LINUX(6,4,0)) {
> + char *string;
> + if ((string = 
> pc->read_vmcoreinfo("NUMBER(va_kernel_pa_offset)"))) {
> + ms->va_kernel_pa_offset = htol(string, QUIET, NULL);
> + free(string);
> + } else
> + error(FATAL, "cannot read va_kernel_pa_offset\n");
> + }
> + else
> + ms->va_kernel_pa_offset = ms->kernel_link_addr - ms->phys_base;
> +}
> +
>   static int
>   riscv64_is_kvaddr(ulong vaddr)
>   {
> @@ -1352,6 +1374,7 @@ riscv64_init(int when)
>   riscv64_get_struct_page_size(machdep->machspec);
>   riscv64_get_va_bits(machdep->machspec);
>   riscv64_get_va_range(machdep->machspec);
> + riscv64_get_va_kernel_pa_offset(machdep->machspec);
>   
>   pt_level_alloc(&machdep->pgd, "cannot malloc pgd space.");
>   pt_level_alloc(&machdep->machspec->p4d, "cannot malloc p4d space.");
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [Crash-utility] RISCV64: Use va_kernel_pa_offset in VTOP()

2023-08-03 Thread  
On 2023/07/24 13:06, Song Shuai wrote:
> Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use
> PUD/P4D/PGD pages for the linear mapping") changes the
> phys_ram_base from the kernel_map.phys_addr to the start of DRAM.
> 
> Crash's VTOP() still uses phys_ram_base and kernel_map.virt_addr
> to translate kernel virtual addresses, which makes Crash fail to start
> with Linux v6.4 and later versions.
> 
> Let Linux export kernel_map.va_kernel_pa_offset in v6.5 and Crash can
> use "va_kernel_pa_offset" to translate the kernel virtual address in
> VTOP() correctly.
> 
> Signed-off-by: Song Shuai 
> ---
> You can check/test the Linux changes from this link:
> https://github.com/sugarfillet/linux/commits/6.5-rc3-crash
> 
> And I'll send the Linux changes to riscv/for-next If you're ok with this 
> patch.
> ---
>   defs.h|  4 ++--
>   riscv64.c | 22 ++
>   2 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/defs.h b/defs.h
> index 358f365..46b9857 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -3662,8 +3662,7 @@ typedef signed int s32;
>   ulong _X = X; \
>   (THIS_KERNEL_VERSION >= LINUX(5,13,0) && \
>   (_X) >= machdep->machspec->kernel_link_addr) ? \
> - (((unsigned long)(_X)-(machdep->machspec->kernel_link_addr)) + \
> -  machdep->machspec->phys_base): \
> + ((unsigned long)(_X)-(machdep->machspec->va_kernel_pa_offset)): \
>   (((unsigned long)(_X)-(machdep->kvbase)) + \
>    machdep->machspec->phys_base); \
>   })
> @@ -7021,6 +7020,7 @@ struct machine_specific {
>   ulong modules_vaddr;
>   ulong modules_end;
>   ulong kernel_link_addr;
> + ulong va_kernel_pa_offset;
>   
>   ulong _page_present;
>   ulong _page_read;
> diff --git a/riscv64.c b/riscv64.c
> index 6b9a688..b9e50b4 100644
> --- a/riscv64.c
> +++ b/riscv64.c
> @@ -418,6 +418,27 @@ error:
>   error(FATAL, "cannot get vm layout\n");
>   }
>   
> +static void
> +riscv64_get_va_kernel_pa_offset(struct machine_specific *ms)
> +{
> + unsigned long kernel_version = riscv64_get_kernel_version();
> +
> + /*
> +  * va_kernel_pa_offset is defined in Linux kernel since 6.5.
> +  */
> + if (kernel_version >= LINUX(6,5,0)) {

The kernel patches look accepted, so for the crash patch detail,

I think this first version check is not necessary, we can just use the 
vmcoreinfo entry if available.  With it, backporting the kernel patches 
to e.g. 6.4.0 will also be supported.

Thanks,
Kazu

> + char *string;
> + if ((string = 
> pc->read_vmcoreinfo("NUMBER(va_kernel_pa_offset)"))) {
> + ms->va_kernel_pa_offset = htol(string, QUIET, NULL);
> + free(string);
> + } else
> + error(FATAL, "cannot read va_kernel_pa_offset\n");
> + } else if (kernel_version >= LINUX(6,4,0))
> + error(FATAL, "cannot determine va_kernel_pa_offset since Linux 
> 6.4\n");
> + else
> + ms->va_kernel_pa_offset = ms->kernel_link_addr - ms->phys_base;
> +}
> +
>   static int
>   riscv64_is_kvaddr(ulong vaddr)
>   {
> @@ -1352,6 +1373,7 @@ riscv64_init(int when)
>   riscv64_get_struct_page_size(machdep->machspec);
>   riscv64_get_va_bits(machdep->machspec);
>   riscv64_get_va_range(machdep->machspec);
> + riscv64_get_va_kernel_pa_offset(machdep->machspec);
>   
>   pt_level_alloc(&machdep->pgd, "cannot malloc pgd space.");
>   pt_level_alloc(&machdep->machspec->p4d, "cannot malloc p4d space.");


Re: [Crash-utility] RISCV64: Use va_kernel_pa_offset in VTOP()

2023-07-24 Thread  
On 2023/07/24 17:44, Song Shuai wrote:
> On 2023/7/24 15:41, Conor Dooley wrote:
>> Hey,
>>
>> On Mon, Jul 24, 2023 at 12:06:49PM +0800, Song Shuai wrote:
>>> Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use
>>> PUD/P4D/PGD pages for the linear mapping") changes the
>>> phys_ram_base from the kernel_map.phys_addr to the start of DRAM.
>>>
>>> Crash's VTOP() still uses phys_ram_base and kernel_map.virt_addr
>>> to translate kernel virtual addresses, which makes Crash fail to start
>>> with Linux v6.4 and later versions.
>>>
>>> Let Linux export kernel_map.va_kernel_pa_offset in v6.5 and Crash can
>>> use "va_kernel_pa_offset" to translate the kernel virtual address in
>>> VTOP() correctly.
>>>
>>> Signed-off-by: Song Shuai 
>>> ---
>>> You can check/test the Linux changes from this link:
>>> https://github.com/sugarfillet/linux/commits/6.5-rc3-crash
>>>
>>> And I'll send the Linux changes to riscv/for-next If you're ok with 
>>> this patch.
>>
>> If you want this to go into 6.5, you'll need to send it for riscv/fixes
>> instead. It sounds like a fix for this would need to go into 6.4 too,
>> no?
> You're right, that should be riscv/fixes for 6.5, and this issue also 
> needs to be fixed in 6.4 stable.
> 
> How about waiting for Crash guys' comments on the introduction of the 
> "va_kernel_pa_offset" in vmcoreinfo
> and then determine which stable version should be taken in the first 
> "if" of kernel_version.

I don't have any specific comment on this, it looks necessary and if 
it's accepted in vmcoreinfo, then we can accept a crash patch for it.

Thanks,
Kazu


[ANNOUNCE] makedumpfile 1.7.3

2023-04-25 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.3.
Thank you everyone for your help to maintain the tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
   https://github.com/makedumpfile/makedumpfile/releases

New features:
- Support kernels up to v6.3 (x86_64)
- Support sadump with 5-level paging

Commits since 1.7.2:
ecc9355 [v1.7.3] Update version (Kazuhito Hagio)
446c4f6 [PATCH 2/2] Fix failure of compound page exclusion on Linux 6.3 and later (Kazuhito Hagio)
9e95fc2 [PATCH 1/2] Move offset preparation for compound page into initial() (Kazuhito Hagio)
8e8b881 [PATCH 2/2] eppic: Fix a warning about redefining ERRMSG (Petr Tesarik)
f8ac914 [PATCH 1/2] eppic: Fix incompatible pointer type warnings (Petr Tesarik)
58553ad [PATCH] sadump: fix failure of reading memory when 5-level paging is enabled (Daisuke Hatayama (Fujitsu))
5f17bdd [PATCH] Fix wrong exclusion of slab pages on Linux 6.2-rc1 (Kazuhito Hagio)
42955c0 [PATCH] IMPLEMENTAION: Add a description of the flattened format (Kazuhito Hagio)
f1d84a5 [PATCH] Makefile: Remove version from /usr/share/makedumpfile (Leonidas Spyropoulos)
2f53c3a [PATCH] Mark start of 1.7.3 development phase with version 1.7.2++ (Kazuhito Hagio)


Description of makedumpfile:
The makedumpfile is a tool for creating a dumpfile from /proc/vmcore
with filtering out unnecessary pages for analysis and compressing the
remaining pages, in order to shorten the size of the dumpfile and the
time of creating it.
https://github.com/makedumpfile/makedumpfile


Re: [PATCH v1 0/2] makedumpfile: Fix a couple of warnings

2023-04-16 Thread  
On 2023/04/16 1:02, Petr Tesarik wrote:
> Although none of the warnings is a real issue, they clutter the
> build output, making it easier to overlook another, real issue.

Agree, applied the two patches.
https://github.com/makedumpfile/makedumpfile/compare/58553ad...8e8b881

Thanks!
Kazu

> 
> Petr Tesarik (2):
>eppic: Fix incompatible pointer type warnings
>eppic: Fix a warning about redefining ERRMSG
> 
>   extension_eppic.c | 4 ++--
>   extension_eppic.h | 1 +
>   2 files changed, 3 insertions(+), 2 deletions(-)
> 


Re: [PATCH] makedumpfile, sadump: fix failure of reading memory when 5-level paging is enabled

2023-03-29 Thread  
On 2023/03/29 21:44, Daisuke Hatayama (Fujitsu) wrote:
> makedumpfile fails as follows for memory dumps collected by sadump
> when 5-level paging is enabled on the corresponding systems:
> 
>  # makedumpfile -l -d 31 -x ./vmlinux ./dump.sadump dump.sadump-ld31
>  __vtop4_x86_64: Can't get a valid pgd.
>  ...snip...
>  __vtop4_x86_64: Can't get a valid pgd.
>  calc_kaslr_offset: failed to calculate kaslr_offset and phys_base; 
> default to 0
>  __vtop4_x86_64: Can't get a valid pgd.
>  readmem: Can't convert a virtual address(82fce960) to physical 
> address.
>  readmem: type_addr: 0, addr:82fce960, size:1024
>  cpu_online_mask_init: Can't read cpu_online_mask memory.
> 
>  makedumpfile Failed.
> 
> This is because 5-level paging support has not been done yet for
> sadump; the work of the 5-level paging support was done by the commit
> 30a3214a7193e94c551c0cebda5918a72a35c589 (PATCH 4/4 arch/x86_64: Add
> 5-level paging support) but that was focused on the core part only.
> 
> Having said that, most of the work has already been done in the
> commit. What needs to be newly added for sadump is just how to check
> if 5-level paging is enabled for a given memory dump.
> 
> For that purpose, let's refer to CR4.LA57, bit 12 of CR4, representing
> whether 5-level paging is enabled or not. We can do this because
> memory dumps collected by sadump have SMRAM as note information and
> they include CR4 together with the other control registers.
> 
> Signed-off-by: HATAYAMA Daisuke 
> ---
>   sadump_info.c | 4 
>   1 file changed, 4 insertions(+)
> 
> diff --git a/sadump_info.c b/sadump_info.c
> index adfa8dc..2c44068 100644
> --- a/sadump_info.c
> +++ b/sadump_info.c
> @@ -1362,6 +1362,7 @@ static int linux_banner_sanity_check(ulong cr3)
>   #define PTI_USER_PGTABLE_BIT    (info->page_shift)
>   #define PTI_USER_PGTABLE_MASK   (1 << PTI_USER_PGTABLE_BIT)
>   #define CR3_PCID_MASK   0xFFFull
> +#define CR4_LA57 (1 << 12)
>   int
>   calc_kaslr_offset(void)
>   {
> @@ -1397,6 +1398,8 @@ calc_kaslr_offset(void)
>   else
>   cr3 = smram.Cr3 & ~CR3_PCID_MASK;
>   
> + NUMBER(pgtable_l5_enabled) = !!(smram.Cr4 & CR4_LA57);
> +
>   /* Convert virtual address of IDT table to physical address */
>   idtr_paddr = vtop4_x86_64_pagetable(idtr, cr3);
>   if (idtr_paddr == NOT_PADDR) {
> @@ -1417,6 +1420,7 @@ calc_kaslr_offset(void)
>   
>   DEBUG_MSG("sadump: idtr=%" PRIx64 "\n", idtr);
>   DEBUG_MSG("sadump: cr3=%" PRIx64 "\n", cr3);
> + DEBUG_MSG("sadump: cr4=%" PRIx32 "\n", smram.Cr4);
>   DEBUG_MSG("sadump: idtr(phys)=%" PRIx64 "\n", idtr_paddr);
>   DEBUG_MSG("sadump: devide_error(vmlinux)=%lx\n",
> divide_error_vmlinux);

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/58553ad03187f0cf208d6c4a0dc026c6338e5edd

(oh, "devide" is there..)

Thanks,
Kazu


Re: [RFC][nvdimm][crash] pmem memmap dump support

2023-03-07 Thread  
On 2023/03/07 11:49, lizhij...@fujitsu.com wrote:
> On 07/03/2023 10:05, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> On 2023/02/23 15:24, lizhij...@fujitsu.com wrote:
>>> Hello folks,
>>>
>>> This mail raises a pmem memmap dump requirement and possible solutions, but 
>>> they are all still premature.
>>> I really hope you can provide some feedback.
>>>
>>> pmem memmap can also be called pmem metadata here.
>>>
>>> ### Background and motivate overview ###
>>> ---
>>> Crash dump is an important feature for trouble shooting of kernel. It is 
>>> the final way to chase what
>>> happened at the kernel panic, slowdown, and so on. It is the most important 
>>> tool for customer support.
>>> However, a part of data on pmem is not included in crash dump, it may cause 
>>> difficulty to analyze
>>> trouble around pmem (especially Filesystem-DAX).
>>>
>>>
>>> A pmem namespace in "fsdax" or "devdax" mode requires allocation of 
>>> per-page metadata[1]. The allocation
>>> can be drawn from either mem(system memory) or dev(pmem device), see `ndctl 
>>> help create-namespace` for
>>> more details. In fsdax, struct page array becomes very important, it is one 
>>> of the key data to find
>>> status of reverse map.
>>>
>>> So, when metadata was stored in pmem, even pmem's per-page metadata will 
>>> not be dumped. That means
>>> troubleshooters are unable to check more details about pmem from the 
>>> dumpfile.
>>>
>>> ### Make pmem memmap dump support ###
>>> ---
>>> Our goal is that whether metadata is stored on mem or pmem, its metadata 
>>> can be dumped and then the
>>> crash-utilities can read more details about the pmem. Of course, this 
>>> feature can be enabled/disabled.
>>>
>>> First, based on our previous investigation, according to the location of 
>>> metadata and the scope of
>>> dump, we can divide it into the following four cases: A, B, C, D.
>>> It should be noted that although we mentioned case A&B below, we do not 
>>> want these two cases to be
>>> part of this feature, because dumping the entire pmem will consume a lot of 
>>> space, and more importantly,
>>> it may contain user sensitive data.
>>>
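>>> +-------------+---------------------+
>>> |             |  metadata location  |
>>> | dump scope  +----------+----------+
>>> |             |   mem    |   PMEM   |
>>> +-------------+----------+----------+
>>> | entire pmem |    A     |    B     |
>>> +-------------+----------+----------+
>>> | metadata    |    C     |    D     |
>>> +-------------+----------+----------+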
>>>
>>> Case A&B: unsupported
>>> - Only the regions listed in PT_LOAD in vmcore are dumpable. This can be 
>>> resolved by adding the pmem
>>> region into vmcore's PT_LOADs in kexec-tools.
>>> - For makedumpfile which will assume that all page objects of the entire 
>>> region described in PT_LOADs
>>> are readable, and then skips/excludes the specific page according to its 
>>> attributes. But in the case
>>> of pmem, 1st kernel only allocates page objects for the namespaces of pmem, 
>>> so makedumpfile will throw
>>> errors[2] when specific -d options are specified.
>>> Accordingly, we should make makedumpfile to ignore these errors if it's 
>>> pmem region.
>>>
>>> Because these above cases are not in our goal, we must consider how to 
>>> prevent the data part of pmem
>>> from reading by the dump application(makedumpfile).
>>>
>>> Case C: native supported
>>> metadata is stored in mem, and the entire mem/ram is dumpable.
>>>
>>> Case D: unsupported && need your input
>>> To support this situation, the makedumpfile needs to know the location of 
>>> metadata for each pmem
>>> namespace and the address and size of metadata in the pmem [start, end)
>>>
>>> We have thought of a few possible options:
>>>
>>> 1) In the 2nd kernel, with the help of the information from 
>>> /sys/bus/nd/devices/{namespaceX.Y, daxX.Y, pfnX.Y}
>>> exported by pmem drivers, makedumpfile is able to calculate the address and 
>>> size of metadata
>>> 2) In the 1st kernel, add a new symbol to the vmcore. The symbol is 
>>> associated with the layout of
>>> each namespace. The makedumpfile reads th

Re: [RFC][nvdimm][crash] pmem memmap dump support

2023-03-06 Thread  
On 2023/02/23 15:24, lizhij...@fujitsu.com wrote:
> Hello folks,
> 
> This mail raises a pmem memmap dump requirement and possible solutions, but 
> they are all still premature.
> I really hope you can provide some feedback.
> 
> pmem memmap can also be called pmem metadata here.
> 
> ### Background and motivate overview ###
> ---
> Crash dump is an important feature for trouble shooting of kernel. It is the 
> final way to chase what
> happened at the kernel panic, slowdown, and so on. It is the most important 
> tool for customer support.
> However, a part of data on pmem is not included in crash dump, it may cause 
> difficulty to analyze
> trouble around pmem (especially Filesystem-DAX).
> 
> 
> A pmem namespace in "fsdax" or "devdax" mode requires allocation of per-page 
> metadata[1]. The allocation
> can be drawn from either mem(system memory) or dev(pmem device), see `ndctl 
> help create-namespace` for
> more details. In fsdax, struct page array becomes very important, it is one 
> of the key data to find
> status of reverse map.
> 
> So, when metadata was stored in pmem, even pmem's per-page metadata will not 
> be dumped. That means
> troubleshooters are unable to check more details about pmem from the dumpfile.
> 
> ### Make pmem memmap dump support ###
> ---
> Our goal is that whether metadata is stored on mem or pmem, its metadata can 
> be dumped and then the
> crash-utilities can read more details about the pmem. Of course, this feature 
> can be enabled/disabled.
> 
> First, based on our previous investigation, according to the location of 
> metadata and the scope of
> dump, we can divide it into the following four cases: A, B, C, D.
> It should be noted that although we mentioned case A&B below, we do not want 
> these two cases to be
> part of this feature, because dumping the entire pmem will consume a lot of 
> space, and more importantly,
> it may contain user sensitive data.
> 
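> +-------------+---------------------+
> |             |  metadata location  |
> | dump scope  +----------+----------+
> |             |   mem    |   PMEM   |
> +-------------+----------+----------+
> | entire pmem |    A     |    B     |
> +-------------+----------+----------+
> | metadata    |    C     |    D     |
> +-------------+----------+----------+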
> 
> Case A&B: unsupported
> - Only the regions listed in PT_LOAD in vmcore are dumpable. This can be 
> resolved by adding the pmem
> region into vmcore's PT_LOADs in kexec-tools.
> - For makedumpfile which will assume that all page objects of the entire 
> region described in PT_LOADs
> are readable, and then skips/excludes the specific page according to its 
> attributes. But in the case
> of pmem, 1st kernel only allocates page objects for the namespaces of pmem, 
> so makedumpfile will throw
> errors[2] when specific -d options are specified.
> Accordingly, we should make makedumpfile to ignore these errors if it's pmem 
> region.
> 
> Because these above cases are not in our goal, we must consider how to 
> prevent the data part of pmem
> from reading by the dump application(makedumpfile).
> 
> Case C: native supported
> metadata is stored in mem, and the entire mem/ram is dumpable.
> 
> Case D: unsupported && need your input
> To support this situation, the makedumpfile needs to know the location of 
> metadata for each pmem
> namespace and the address and size of metadata in the pmem [start, end)
> 
> We have thought of a few possible options:
> 
> 1) In the 2nd kernel, with the help of the information from 
> /sys/bus/nd/devices/{namespaceX.Y, daxX.Y, pfnX.Y}
> exported by pmem drivers, makedumpfile is able to calculate the address and 
> size of metadata
> 2) In the 1st kernel, add a new symbol to the vmcore. The symbol is 
> associated with the layout of
> each namespace. The makedumpfile reads the symbol and figures out the address 
> and size of the metadata.

Hi Zhijian,

sorry, probably I don't understand enough, but do these mean that
  1. /proc/vmcore exports pmem regions with PT_LOADs, which contain
 unreadable ones, and
  2. makedumpfile gets to know the readable regions somehow?

Then /proc/vmcore with pmem cannot be captured by other commands,
e.g. cp command?

Thanks,
Kazu

> 3) others ?
> 
> But then we found that we have always ignored a user case, that is, the user 
> could save the dumpfile
> to the pmem. Neither of these two options can solve this problem, because the 
> pmem drivers will
> re-initialize the metadata during the pmem drivers loading process, which 
> leads to the metadata
> we dumped is inconsistent with the metadata at the moment of the crash 
> happening.
> Simply, can we just disable the pmem directly in 2nd kernel so that previous 
> metadata will not be
> destroyed? But this operation will bring us inconvenience that 2nd kernel 
> doesn’t allow user storing
> dumpfile on the filesystem/partition based on pmem.
> 
> So here I hope you can provide some ideas about this feature/requirement 

Re: [PATCH makedumpfile] Fix incorrect exclusion of slab pages on Linux 6.2-rc1

2022-12-21 Thread  
On 2022/12/21 11:06, HAGIO KAZUHITO(萩尾 一仁) wrote:
> From: Kazuhito Hagio 
> 
> * Required for kernel 6.2
> 
> Kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to
> allow larger rcu_head"), which is contained in Linux 6.2-rc1 and later,
> made the offset of slab.slabs equal to page.mapping's one.  As a result,
> "makedumpfile -d 8", which should exclude user data, excludes some slab
> pages incorrectly because isAnon() returns true when slab.slabs is an
> odd number.  With such dumpfiles, crash can fail to start session with
> an error like this:
> 
># crash vmlinux dumpfile
>...
>crash: page excluded: kernel virtual address: 8fa047ac2fe8 type: 
> "xa_node shift"
> 
> Make isAnon() check that the page is not slab to fix this.
> 
> Signed-off-by: Kazuhito Hagio 

Applied.
https://github.com/makedumpfile/makedumpfile/commit/5f17bdd2128998a3eeeb4521d136a19fadb6

Thanks,
Kazu


Re: [Crash-utility][PATCH V4 1/9] Add RISCV64 framework code support

2022-12-21 Thread  
On 2022/11/09 18:01, Xianting Tian wrote:
>> On the kernel side, some relevant kernel patches got ack,  it seems
>> they won't  change anymore.
>>
>> And the V4 looks good to me, so: Ack.
> Thanks, the Linux kernel RISC-V maintainer still hasn't applied the kernel patch,  
> let's wait.

Now I see 649d6b1019a2 ("RISC-V: Add arch_crash_save_vmcoreinfo")
in the mainline, so applied the crash patchset.

https://github.com/crash-utility/crash/compare/88a4910d95d4...0d5ad129252a

Thanks,
Kazu


[PATCH makedumpfile] Fix incorrect exclusion of slab pages on Linux 6.2-rc1

2022-12-20 Thread  
From: Kazuhito Hagio 

* Required for kernel 6.2

Kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to
allow larger rcu_head"), which is contained in Linux 6.2-rc1 and later,
made the offset of slab.slabs equal to page.mapping's one.  As a result,
"makedumpfile -d 8", which should exclude user data, excludes some slab
pages incorrectly because isAnon() returns true when slab.slabs is an
odd number.  With such dumpfiles, crash can fail to start session with
an error like this:

  # crash vmlinux dumpfile
  ...
  crash: page excluded: kernel virtual address: 8fa047ac2fe8 type: "xa_node 
shift"

Make isAnon() check that the page is not slab to fix this.

Signed-off-by: Kazuhito Hagio 
---
 makedumpfile.c | 6 +++---
 makedumpfile.h | 9 +++--
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index ff821ebd3eb0..f40368364cf3 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -6502,7 +6502,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 */
else if ((info->dump_level & DL_EXCLUDE_CACHE)
&& is_cache_page(flags)
-   && !isPrivate(flags) && !isAnon(mapping)) {
+   && !isPrivate(flags) && !isAnon(mapping, flags)) {
pfn_counter = &pfn_cache;
}
/*
@@ -6510,7 +6510,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 */
else if ((info->dump_level & DL_EXCLUDE_CACHE_PRI)
&& is_cache_page(flags)
-   && !isAnon(mapping)) {
+   && !isAnon(mapping, flags)) {
if (isPrivate(flags))
pfn_counter = &pfn_cache_private;
else
@@ -6522,7 +6522,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 *  - hugetlbfs pages
 */
else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
-&& (isAnon(mapping) || isHugetlb(compound_dtor))) {
+&& (isAnon(mapping, flags) || isHugetlb(compound_dtor))) {
pfn_counter = &pfn_user;
}
/*
diff --git a/makedumpfile.h b/makedumpfile.h
index 70a1a91d66be..21dec7d1145c 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -161,12 +161,9 @@ test_bit(int nr, unsigned long addr)
 #define isSwapBacked(flags)    test_bit(NUMBER(PG_swapbacked), flags)
 #define isHWPOISON(flags)  (test_bit(NUMBER(PG_hwpoison), flags) \
&& (NUMBER(PG_hwpoison) != NOT_FOUND_NUMBER))
-
-static inline int
-isAnon(unsigned long mapping)
-{
-   return ((unsigned long)mapping & PAGE_MAPPING_ANON) != 0;
-}
+#define isSlab(flags)  test_bit(NUMBER(PG_slab), flags)
+#define isAnon(mapping, flags) (((unsigned long)mapping & PAGE_MAPPING_ANON) != 0 \
+   && !isSlab(flags))
 
 #define PTOB(X)        (((unsigned long long)(X)) << PAGESHIFT())
 #define BTOP(X)        (((unsigned long long)(X)) >> PAGESHIFT())
-- 
2.31.1



Re: [PATCH makedumpfile] Handle __mips64 as __mips64__ to avoid build failure

2022-11-23 Thread  
On 2022/11/24 9:50, HAGIO KAZUHITO(萩尾 一仁) wrote:
> From: Fabrice Fontaine 
> 
> Handle __mips64 as __mips64__ to avoid the following build failure:
> 
> makedumpfile.c: In function 'is_kvaddr':
> makedumpfile.c:1613:39: error: 'KVBASE' undeclared (first use in this 
> function)
> return (addr >= (unsigned long long)(KVBASE));
>  ^~
> 
> Fixes:
>- 
> http://autobuild.buildroot.org/results/94824fa8baa8edb99a5ca245e5561e0c4e430638

makedumpfile has to use only the "__arch__" style to enable TARGET build,
e.g. "make TARGET=mips64" on an x86_64 machine.

Your build environment has "-D__mips64el__", so does this work for you?

--- a/Makefile
+++ b/Makefile
@@ -24,7 +24,8 @@ endif
  ARCH := $(shell echo ${TARGET}  | sed -e s/i.86/x86/ -e s/sun4u/sparc64/ \
-e s/arm.*/arm/ -e s/sa110/arm/ \
-e s/s390x/s390/ -e s/parisc64/parisc/ \
-   -e s/ppc64/powerpc64/ -e s/ppc/powerpc32/)
+   -e s/ppc64/powerpc64/ -e s/ppc/powerpc32/ \
+   -e s/mips64el/mips64/)

  CROSS :=
  ifneq ($(TARGET), $(HOST_ARCH))

Thanks,
Kazu


> 
> Signed-off-by: Fabrice Fontaine 
> ---
>arch/mips64.c  | 2 +-
>makedumpfile.h | 6 +++---
>2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/mips64.c b/arch/mips64.c
> index ab45b6e..fd987b0 100644
> --- a/arch/mips64.c
> +++ b/arch/mips64.c
> @@ -16,7 +16,7 @@
> * along with this program; if not, write to the Free Software
> * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> */
> -#ifdef __mips64__
> +#if defined(__mips64__) || defined(__mips64)
>
>#include "../print_info.h"
>#include "../elf_info.h"
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 70a1a91..3842f9c 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -963,7 +963,7 @@ typedef unsigned long pgd_t;
>
>#endif  /* sparc64 */
>
> -#ifdef __mips64__ /* mips64 */
> +#if defined(__mips64__) || defined(__mips64) /* mips64 */
>#define KVBASE PAGE_OFFSET
>
>#ifndef _XKPHYS_START_ADDR
> @@ -1204,7 +1204,7 @@ unsigned long long vaddr_to_paddr_sparc64(unsigned long 
> vaddr);
>#define arch_crashkernel_mem_size()    stub_false()
>#endif /* sparc64 */
>
> -#ifdef __mips64__ /* mips64 */
> +#if defined(__mips64__) || defined(__mips64) /* mips64 */
>int get_phys_base_mips64(void);
>int get_machdep_info_mips64(void);
>int get_versiondep_info_mips64(void);
> @@ -2364,7 +2364,7 @@ int get_xen_info_ia64(void);
>#define get_xen_info_arch(X) FALSE
>#endif /* sparc64 */
>
> -#ifdef __mips64__ /* mips64 */
> +#if defined(__mips64__) || defined(__mips64) /* mips64 */
>#define kvtop_xen(X)   FALSE
>#define get_xen_basic_info_arch(X) FALSE
>#define get_xen_info_arch(X) FALSE


[PATCH makedumpfile] Handle __mips64 as __mips64__ to avoid build failure

2022-11-23 Thread  
From: Fabrice Fontaine 

Handle __mips64 as __mips64__ to avoid the following build failure:

makedumpfile.c: In function 'is_kvaddr':
makedumpfile.c:1613:39: error: 'KVBASE' undeclared (first use in this function)
   return (addr >= (unsigned long long)(KVBASE));
^~

Fixes:
  - 
http://autobuild.buildroot.org/results/94824fa8baa8edb99a5ca245e5561e0c4e430638

Signed-off-by: Fabrice Fontaine 
---
  arch/mips64.c  | 2 +-
  makedumpfile.h | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/mips64.c b/arch/mips64.c
index ab45b6e..fd987b0 100644
--- a/arch/mips64.c
+++ b/arch/mips64.c
@@ -16,7 +16,7 @@
   * along with this program; if not, write to the Free Software
   * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
   */
-#ifdef __mips64__
+#if defined(__mips64__) || defined(__mips64)
  
  #include "../print_info.h"
  #include "../elf_info.h"
diff --git a/makedumpfile.h b/makedumpfile.h
index 70a1a91..3842f9c 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -963,7 +963,7 @@ typedef unsigned long pgd_t;
  
  #endif  /* sparc64 */
  
-#ifdef __mips64__ /* mips64 */
+#if defined(__mips64__) || defined(__mips64) /* mips64 */
  #define KVBASE PAGE_OFFSET
  
  #ifndef _XKPHYS_START_ADDR
@@ -1204,7 +1204,7 @@ unsigned long long vaddr_to_paddr_sparc64(unsigned long 
vaddr);
  #define arch_crashkernel_mem_size()   stub_false()
  #endif /* sparc64 */
  
-#ifdef __mips64__ /* mips64 */
+#if defined(__mips64__) || defined(__mips64) /* mips64 */
  int get_phys_base_mips64(void);
  int get_machdep_info_mips64(void);
  int get_versiondep_info_mips64(void);
@@ -2364,7 +2364,7 @@ int get_xen_info_ia64(void);
  #define get_xen_info_arch(X) FALSE
  #endif /* sparc64 */
  
-#ifdef __mips64__ /* mips64 */
+#if defined(__mips64__) || defined(__mips64) /* mips64 */
  #define kvtop_xen(X)  FALSE
  #define get_xen_basic_info_arch(X) FALSE
  #define get_xen_info_arch(X) FALSE
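
The guard change in the patch above can be illustrated with a minimal sketch. As the build failure suggests, a toolchain may predefine only one spelling of the macro, so the guard tests both. BUILD_IS_MIPS64, build_is_mips64() and build_arch_label() are hypothetical names used only for this illustration; they are not part of makedumpfile.

```c
/*
 * A guard that tests __mips64__ alone is skipped when the compiler
 * predefines only __mips64, so everything inside it (for example KVBASE)
 * silently disappears.  Testing both spellings avoids that.
 * BUILD_IS_MIPS64 is a hypothetical helper macro for this sketch.
 */
#if defined(__mips64__) || defined(__mips64)
#define BUILD_IS_MIPS64 1
#else
#define BUILD_IS_MIPS64 0
#endif

/* Returns 1 when either spelling of the macro was predefined, else 0. */
int build_is_mips64(void)
{
    return BUILD_IS_MIPS64;
}

/* Returns a short label describing which guard branch was compiled in. */
const char *build_arch_label(void)
{
    return build_is_mips64() ? "mips64" : "not mips64";
}
```

Running "cpp -dM /dev/null | grep -i mips" with the target compiler is one way to check which of the two macros it actually predefines.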


Re: [PATCH] Makefile: Remove version from /usr/share/makedumpfile

2022-10-26 Thread  
On 2022/10/24 11:25, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2022/10/21 19:24, Leonidas Spyropoulos wrote:
>> Version-specific paths don't make sense at
>> /usr/share/makedumpfile. This assumes you will have only one version
>> installed, which makes sense on a normal system, and devs can always
>> specify a different DESTDIR per version.
>>
>> Fixes: #10
>>
>> Signed-off-by: Leonidas Spyropoulos 
> 
> Thanks for the patch.
> 
> I agree.
> 
> The patch [1] introduced the directory with ${VERSION}, but makedumpfile
> has backward compatibility and the directory does not have any
> version-dependent data, so I don't see any reason for it.  Also I didn't
> find any discussion in the list archive.
> 
> I will merge this in a few days if there are no objections.

Applied.
https://github.com/makedumpfile/makedumpfile/commit/f1d84a5d69d81bc7a89aefae504be88df1e50693

Thanks,
Kazu


Re: [PATCH] Makefile: Remove version from /usr/share/makedumpfile

2022-10-23 Thread  
On 2022/10/21 19:24, Leonidas Spyropoulos wrote:
> Version-specific paths don't make sense at
> /usr/share/makedumpfile. This assumes you will have only one version
> installed, which makes sense on a normal system, and devs can always
> specify a different DESTDIR per version.
> 
> Fixes: #10
> 
> Signed-off-by: Leonidas Spyropoulos 

Thanks for the patch.

I agree.

The patch [1] introduced the directory with ${VERSION}, but makedumpfile
has backward compatibility and the directory does not have any
version-dependent data, so I don't see any reason for it.  Also I didn't
find any discussion in the list archive.

I will merge this in a few days if there are no objections.

Thanks,
Kazu

[1] 
https://github.com/makedumpfile/makedumpfile/commit/41e1ccfcd57736047a5c52d8096fbcaa255146ec


> ---
>   Makefile | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 548e5b7..f6ecbe2 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -130,6 +130,6 @@ install:
>   install -m 755 -t ${DESTDIR}/usr/sbin makedumpfile 
> $(VPATH)makedumpfile-R.pl
>   install -m 644 -t ${DESTDIR}/usr/share/man/man8 makedumpfile.8
>   install -m 644 -t ${DESTDIR}/usr/share/man/man5 makedumpfile.conf.5
> - mkdir -p ${DESTDIR}/usr/share/makedumpfile-${VERSION}/eppic_scripts
> - install -m 644 -D $(VPATH)makedumpfile.conf 
> ${DESTDIR}/usr/share/makedumpfile-${VERSION}/makedumpfile.conf.sample
> - install -m 644 -t 
> ${DESTDIR}/usr/share/makedumpfile-${VERSION}/eppic_scripts/ 
> $(VPATH)eppic_scripts/*
> + mkdir -p ${DESTDIR}/usr/share/makedumpfile/eppic_scripts
> + install -m 644 -D $(VPATH)makedumpfile.conf 
> ${DESTDIR}/usr/share/makedumpfile/makedumpfile.conf.sample
> + install -m 644 -t ${DESTDIR}/usr/share/makedumpfile/eppic_scripts/ 
> $(VPATH)eppic_scripts/*


Re: [Crash-utility][PATCH V4 1/9] Add RISCV64 framework code support

2022-10-20 Thread  
On 2022/10/21 11:42, Xianting Tian wrote:
> 
> On 2022/10/21 10:17 AM, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> On 2022/10/20 10:50, Xianting Tian wrote:
>>
>>> diff --git a/README b/README
>>> index 5abbce1..d589e72 100644
>>> --- a/README
>>> +++ b/README
>>> @@ -37,7 +37,7 @@
>>>  These are the current prerequisites:
>>>  o  At this point, x86, ia64, x86_64, ppc64, ppc, arm, arm64, alpha, 
>>> mips,
>>> - mips64, s390 and s390x-based kernels are supported.  Other 
>>> architectures
>>> + mips64, riscv64, s390 and s390x-based kernels are supported.  Other 
>>> architectures
>>>     may be addressed in the future.
>> Sentences in the README are wrapped within 80 characters, I will change
> 
> thanks,
> 
> Do you need me to send V5 patch set to fix this?

No, I will amend these when applying.

Thanks,
Kazu

> 
>> this to:
>>
>> + mips64, riscv64, s390 and s390x-based kernels are supported.  Other
>> + architectures may be addressed in the future.
>>
>>>  o  One size fits all -- the utility can be run on any Linux kernel 
>>> version
>>> @@ -98,6 +98,8 @@
>>>     arm64 dumpfiles may be built by typing "make target=ARM64".
>>>  o  On an x86_64 host, an x86_64 binary that can be used to analyze
>>>     ppc64le dumpfiles may be built by typing "make target=PPC64".
>>> +  o  On an x86_64 host, an x86_64 binary that can be used to analyze
>>> + riscv64 dumpfiles may be built by typing "make target=RISCV64".
>>>  Traditionally when vmcores are compressed via the makedumpfile(8) 
>>> facility
>>>  the libz compression library is used, and by default the crash utility
>>
>>> diff --git a/help.c b/help.c
>>> index 99214c1..253c71b 100644
>>> --- a/help.c
>>> +++ b/help.c
>>> @@ -9512,7 +9512,7 @@ char *README[] = {
>>>    "  These are the current prerequisites: ",
>>>    "",
>>>    "  o  At this point, x86, ia64, x86_64, ppc64, ppc, arm, arm64, alpha, 
>>> mips,",
>>> -" mips64, s390 and s390x-based kernels are supported.  Other 
>>> architectures",
>>> +" mips64, riscv64, s390 and s390x-based kernels are supported.  Other 
>>> architectures",
>>>    " may be addressed in the future.",
>>>    "",
>>>    "  o  One size fits all -- the utility can be run on any Linux kernel 
>>> version",
>> Same as above.
>>
>> And help.c lacks this part, will add:
>>
>> @@ -9572,6 +9572,8 @@ README_ENTER_DIRECTORY,
>>    " arm64 dumpfiles may be built by typing \"make target=ARM64\".",
>>    "  o  On an x86_64 host, an x86_64 binary that can be used to analyze",
>>    " ppc64le dumpfiles may be built by typing \"make target=PPC64\".",
>> +"  o  On an x86_64 host, an x86_64 binary that can be used to analyze",
>> +" riscv64 dumpfiles may be built by typing \"make target=RISCV64\".",
>>    "",
>>    "  Traditionally when vmcores are compressed via the makedumpfile(8) 
>> facility",
>>    "  the libz compression library is used, and by default the crash 
>> utility",
>>
>>
>> With these, the v4 crash patch set looks good to me.
>>
>> Acked-by: Kazuhito Hagio 
>>
>> Thanks,
>> Kazu


Re: [Crash-utility][PATCH V4 1/9] Add RISCV64 framework code support

2022-10-20 Thread  
On 2022/10/20 10:50, Xianting Tian wrote:

> diff --git a/README b/README
> index 5abbce1..d589e72 100644
> --- a/README
> +++ b/README
> @@ -37,7 +37,7 @@
> These are the current prerequisites:
>   
> o  At this point, x86, ia64, x86_64, ppc64, ppc, arm, arm64, alpha, mips,
> - mips64, s390 and s390x-based kernels are supported.  Other architectures
> + mips64, riscv64, s390 and s390x-based kernels are supported.  Other 
> architectures
>may be addressed in the future.

Sentences in the README are wrapped within 80 characters, I will change
this to:

+ mips64, riscv64, s390 and s390x-based kernels are supported.  Other
+ architectures may be addressed in the future.

>   
> o  One size fits all -- the utility can be run on any Linux kernel version
> @@ -98,6 +98,8 @@
>arm64 dumpfiles may be built by typing "make target=ARM64".
> o  On an x86_64 host, an x86_64 binary that can be used to analyze
>ppc64le dumpfiles may be built by typing "make target=PPC64".
> +  o  On an x86_64 host, an x86_64 binary that can be used to analyze
> + riscv64 dumpfiles may be built by typing "make target=RISCV64".
>   
> Traditionally when vmcores are compressed via the makedumpfile(8) facility
> the libz compression library is used, and by default the crash utility


> diff --git a/help.c b/help.c
> index 99214c1..253c71b 100644
> --- a/help.c
> +++ b/help.c
> @@ -9512,7 +9512,7 @@ char *README[] = {
>   "  These are the current prerequisites: ",
>   "",
>   "  o  At this point, x86, ia64, x86_64, ppc64, ppc, arm, arm64, alpha, 
> mips,",
> -" mips64, s390 and s390x-based kernels are supported.  Other 
> architectures",
> +" mips64, riscv64, s390 and s390x-based kernels are supported.  Other 
> architectures",
>   " may be addressed in the future.",
>   "",
>   "  o  One size fits all -- the utility can be run on any Linux kernel 
> version",

Same as above.

And help.c lacks this part, will add:

@@ -9572,6 +9572,8 @@ README_ENTER_DIRECTORY,
  " arm64 dumpfiles may be built by typing \"make target=ARM64\".",
  "  o  On an x86_64 host, an x86_64 binary that can be used to analyze",
  " ppc64le dumpfiles may be built by typing \"make target=PPC64\".",
+"  o  On an x86_64 host, an x86_64 binary that can be used to analyze",
+" riscv64 dumpfiles may be built by typing \"make target=RISCV64\".",
  "",
  "  Traditionally when vmcores are compressed via the makedumpfile(8) 
facility",
  "  the libz compression library is used, and by default the crash utility",


With these, the v4 crash patch set looks good to me.

Acked-by: Kazuhito Hagio 

Thanks,
Kazu


Re: [PATCH V3 1/2] RISC-V: Add arch_crash_save_vmcoreinfo support

2022-10-19 Thread  
On 2022/10/19 12:17, Xianting Tian wrote:

>>>>>>> +    if (IS_ENABLED(CONFIG_64BIT)) {
>>>>>>> +#ifdef CONFIG_KASAN
>>>>>>> +    vmcoreinfo_append_str("NUMBER(KASAN_SHADOW_START)=0x%lx\n", 
>>>>>>> KASAN_SHADOW_START);
>>>>>>> +    vmcoreinfo_append_str("NUMBER(KASAN_SHADOW_END)=0x%lx\n", 
>>>>>>> KASAN_SHADOW_END);
>>>>>>> +#endif
>>>>>>> +    vmcoreinfo_append_str("NUMBER(KERNEL_LINK_ADDR)=0x%lx\n", 
>>>>>>> KERNEL_LINK_ADDR);
>>>>>>> +    vmcoreinfo_append_str("NUMBER(ADDRESS_SPACE_END)=0x%lx\n", 
>>>>>>> ADDRESS_SPACE_END);
>>>>>> Seems this is the first ARCH where kasan and kernel link/bpf space are
>>>>>> added to dump and analyze. Just curious, have you got code change to
>>>>>> make use of them to do dumping and analyze?
>>>>> KASAN_SHADOW_START is not used, KERNEL_LINK_ADDR is used in the crash 
>>>>> patch set:
>>>>> https://patchwork.kernel.org/project/linux-riscv/cover/20220813031753.3097720-1-xianting.t...@linux.alibaba.com/
>>>> Oh, I would say please no. Sometime we got tons of objection when adding a
>>>> necessary one, we definitely should not add one for possible future
>>>> use.
>>>>
>>>> For this kind of newly added one, we need get ack from
>>>> makedumpfile/crash utility maintainer so that we know they are necessary
>>>> to have. At least they don't oppose.
>>> Hi Kazu, Li Jiang
>>>
>>> Could you help comment whether we need KASAN_SHADOW_START and 
>>> KERNEL_LINK_ADDR area export for vmcore from crash point of view?
>>>
>>> In my crash patch set, I don't use KASAN_SHADOW_START,
>>> and only get the value of KERNEL_LINK_ADDR, not really use it.
>>> https://patchwork.kernel.org/project/linux-riscv/cover/20220813031753.3097720-1-xianting.t...@linux.alibaba.com/
>> In your crash patch set, KERNEL_LINK_ADDR is used in VTOP() and looks
>> necessary to me.
>>
>> The others (KASAN_SHADOW_START, KASAN_SHADOW_END and ADDRESS_SPACE_END)
>> are not currently used.  It may be better to add them when they are
>> really used.
> 
> I am very sorry, I missed it; KERNEL_LINK_ADDR is used indeed.
> 
> KASAN_SHADOW_START is not used, so I don't need to send the crash patch set
> again. I only need to remove KASAN_SHADOW_END in the kernel patch set.

I see that your v4 kernel patch set does not have ADDRESS_SPACE_END,
so it seems this part and the related ones will need to change on the
crash side.

 if ((string = pc->read_vmcoreinfo("NUMBER(ADDRESS_SPACE_END)"))) {
 ms->address_space_end = htol(string, QUIET, NULL);
 free(string);
 } else
 goto error;
...
error:
 error(FATAL, "cannot get vm layout\n");

Thanks,
Kazu
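
The point above, that a reader which treats a missing vmcoreinfo entry as fatal breaks once the kernel stops exporting it, can be sketched as follows. This is only an illustration under stated assumptions: vmcoreinfo_number() is a hypothetical helper, not a crash or makedumpfile function, and it works on a plain newline-separated text of "NUMBER(<name>)=<value>" entries as produced by the vmcoreinfo_append_str() calls quoted above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" in a newline-separated vmcoreinfo text.
 * Returns 1 and stores the value on success, 0 when the entry is absent,
 * so the caller can decide whether a missing entry is fatal or optional.
 * vmcoreinfo_number() is a hypothetical helper for this sketch.
 */
int vmcoreinfo_number(const char *text, const char *name,
                      unsigned long long *out)
{
    char key[128];
    size_t keylen;
    const char *line = text;
    int n;

    n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
    if (n < 0 || (size_t)n >= sizeof(key))
        return 0;               /* key did not fit in the buffer */
    keylen = (size_t)n;

    while (line && *line) {
        if (strncmp(line, key, keylen) == 0) {
            char *end;
            /* Base 0 accepts both decimal and 0x-prefixed values. */
            unsigned long long v = strtoull(line + keylen, &end, 0);

            if (end == line + keylen)
                return 0;       /* no digits after '=' */
            *out = v;
            return 1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return 0;
}
```

With such a helper, a caller could fall back to a default or skip the dependent feature when the function returns 0, instead of jumping straight to a fatal error.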


[ANNOUNCE] makedumpfile 1.7.2

2022-10-19 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.2.
Thank you everyone for your help to maintain the tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
   https://github.com/makedumpfile/makedumpfile/releases

New features:
- LoongArch64 architecture support
- Support kernels up to v6.0 (x86_64)

Commits since 1.7.1:
9fefc68 [v1.7.2] Update version (Kazuhito Hagio)
e9ffebb [PATCH] README: Add a FAQ about public key and adjustments (Kazuhito Hagio)
a213694 [PATCH] xen: Fix wrong free issue in init_xen_crash_info() (Dietmar Hahn)
787a23e [PATCH] Add initial LoongArch64 support (Youling Tang)
36cdfd9 [PATCH] mips64: Replace hardcoded values with macros (Chetan Kankotiya)
09b5c87 [PATCH] Makefile: Avoid installing files in /etc (Guilherme G. Piccoli)
6d0d95e [PATCH] Avoid false-positive mem_section validation with vmlinux (Kazuhito Hagio)
4a4a2c2 [PATCH] Mark start of 1.7.2 development phase with version 1.7.1++ (Kazuhito Hagio)

Description of makedumpfile:
The makedumpfile is a tool for creating a dumpfile from /proc/vmcore
with filtering out unnecessary pages for analysis and compressing the
remaining pages, in order to shorten the size of the dumpfile and the
time of creating it.
https://github.com/makedumpfile/makedumpfile

Thanks,
Kazu


Re: [PATCH V3 1/2] RISC-V: Add arch_crash_save_vmcoreinfo support

2022-10-18 Thread  
On 2022/10/19 10:50, Xianting Tian wrote:
> On 2022/10/18 6:03 PM, Baoquan He wrote:
>> On 10/18/22 at 05:25pm, Xianting Tian wrote:
>>> On 2022/10/18 5:10 PM, Baoquan He wrote:
>>>> On 10/18/22 at 04:17pm, Xianting Tian wrote:
>>>>> Add arch_crash_save_vmcoreinfo(), which exports VM layout(MODULES, 
>>>>> VMALLOC,
>>>>> VMEMMAP and KERNEL_LINK_ADDR ranges), va bits and ram base for vmcore.
>>>>>
>>>>> Default pagetable levels and PAGE_OFFSET aren't same for different kernel
>>>>> version as below. For pagetable levels, it sets sv57 by default and falls
>>>>> back to setting sv48 at boot time if sv57 is not supported by the 
>>>>> hardware.
>>>>>
>>>>> For ram base, the default value is 0x8020 for qemu riscv64 env and,
>>>>> for example, is 0x20 on the XuanTie 910 CPU.
>>>>>
>>>>>    * Linux Kernel 5.18 ~
>>>>>    *  PGTABLE_LEVELS = 5
>>>>>    *  PAGE_OFFSET = 0xff60
>>>>>    * Linux Kernel 5.17 ~
>>>>>    *  PGTABLE_LEVELS = 4
>>>>>    *  PAGE_OFFSET = 0xaf80
>>>>>    * Linux Kernel 4.19 ~
>>>>>    *  PGTABLE_LEVELS = 3
>>>>>    *  PAGE_OFFSET = 0xffe0
>>>>>
>>>>> Since these configurations change from time to time and version to 
>>>>> version,
>>>>> it is preferable to export them via vmcoreinfo than to change the crash's
>>>>> code frequently, it can simplify the development of crash tool.
>>>>>
>>>>> Signed-off-by: Xianting Tian 
>>>>> ---
>>>>>    arch/riscv/kernel/Makefile |  1 +
>>>>>    arch/riscv/kernel/crash_core.c | 29 +
>>>>>    2 files changed, 30 insertions(+)
>>>>>    create mode 100644 arch/riscv/kernel/crash_core.c
>>>>>
>>>>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>>>>> index db6e4b1294ba..4cf303a779ab 100644
>>>>> --- a/arch/riscv/kernel/Makefile
>>>>> +++ b/arch/riscv/kernel/Makefile
>>>>> @@ -81,6 +81,7 @@ obj-$(CONFIG_KGDB)    += kgdb.o
>>>>>    obj-$(CONFIG_KEXEC_CORE)    += kexec_relocate.o crash_save_regs.o 
>>>>> machine_kexec.o
>>>>>    obj-$(CONFIG_KEXEC_FILE)    += elf_kexec.o machine_kexec_file.o
>>>>>    obj-$(CONFIG_CRASH_DUMP)    += crash_dump.o
>>>>> +obj-$(CONFIG_CRASH_CORE)    += crash_core.o
>>>>>    obj-$(CONFIG_JUMP_LABEL)    += jump_label.o
>>>>> diff --git a/arch/riscv/kernel/crash_core.c 
>>>>> b/arch/riscv/kernel/crash_core.c
>>>>> new file mode 100644
>>>>> index ..8d7f5ff108da
>>>>> --- /dev/null
>>>>> +++ b/arch/riscv/kernel/crash_core.c
>>>>> @@ -0,0 +1,29 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>> +
>>>>> +#include 
>>>>> +#include 
>>>>> +
>>>>> +void arch_crash_save_vmcoreinfo(void)
>>>>> +{
>>>>> +    VMCOREINFO_NUMBER(VA_BITS);
>>>>> +    VMCOREINFO_NUMBER(phys_ram_base);
>>>>> +
>>>>> +    vmcoreinfo_append_str("NUMBER(PAGE_OFFSET)=0x%lx\n", PAGE_OFFSET);
>>>>> +    vmcoreinfo_append_str("NUMBER(VMALLOC_START)=0x%lx\n", 
>>>>> VMALLOC_START);
>>>>> +    vmcoreinfo_append_str("NUMBER(VMALLOC_END)=0x%lx\n", VMALLOC_END);
>>>>> +    vmcoreinfo_append_str("NUMBER(VMEMMAP_START)=0x%lx\n", 
>>>>> VMEMMAP_START);
>>>>> +    vmcoreinfo_append_str("NUMBER(VMEMMAP_END)=0x%lx\n", VMEMMAP_END);
>>>>> +#ifdef CONFIG_64BIT
>>>>> +    vmcoreinfo_append_str("NUMBER(MODULES_VADDR)=0x%lx\n", 
>>>>> MODULES_VADDR);
>>>>> +    vmcoreinfo_append_str("NUMBER(MODULES_END)=0x%lx\n", MODULES_END);
>>>>> +#endif
>>>>> +
>>>>> +    if (IS_ENABLED(CONFIG_64BIT)) {
>>>>> +#ifdef CONFIG_KASAN
>>>>> +    vmcoreinfo_append_str("NUMBER(KASAN_SHADOW_START)=0x%lx\n", 
>>>>> KASAN_SHADOW_START);
>>>>> +    vmcoreinfo_append_str("NUMBER(KASAN_SHADOW_END)=0x%lx\n", 
>>>>> KASAN_SHADOW_END);
>>>>> +#endif
>>>>> +    vmcoreinfo_append_str("NUMBER(KERNEL_LINK_ADDR)=0x%lx\n", 
>>>>> KERNEL_LINK_ADDR);
>>>>> +    vmcoreinfo_append_str("NUMBER(ADDRESS_SPACE_END)=0x%lx\n", 
>>>>> ADDRESS_SPACE_END);
>>>> Seems this is the first ARCH where kasan and kernel link/bpf space are
>>>> added to dump and analyze. Just curious, have you got code change to
>>>> make use of them to do dumping and analyze?
>>> KASAN_SHADOW_START is not used, KERNEL_LINK_ADDR is used in the crash patch 
>>> set:
>>> https://patchwork.kernel.org/project/linux-riscv/cover/20220813031753.3097720-1-xianting.t...@linux.alibaba.com/
>> Oh, I would say please no. Sometime we got tons of objection when adding an
>> necessary one, we definitely should not add one for possible future
>> use.
>>
>> For this kind of newly added one, we need get ack from
>> makedumpfile/crash utility maintainer so that we know they are necessary
>> to have. At least they don't oppose.
> 
> Hi Kazu, Li Jiang
> 
> Could you help comment whether we need KASAN_SHADOW_START and 
> KERNEL_LINK_ADDR area export for vmcore from crash point of view?
> 
> In my crash patch set, I don't use KASAN_SHADOW_START,
> and only get the value of KERNEL_LINK_ADDR, not really use it.
> https://patchwork.kernel.org/project/linux-riscv/cover/20220813031753.3097720-1-xianting.t...@linux.alibaba.com/

In your crash 

Re: [PATCH] makedumpfile: xen: Fix get_xen_basic_info_x86_64: Can't get the symbol of xenheap_phys_end.

2022-09-27 Thread  
On 2022/09/27 18:13, dietmar.h...@fujitsu.com wrote:
> From: HAGIO KAZUHITO(萩尾 一仁)   wrote Tuesday, September 
> 27, 2022 9:58 AM
>>
>> On 2022/09/26 16:24, dietmar.h...@fujitsu.com wrote:
>>> Hi,
>>> I have a Linux-dom0 running with Xen. The extraction of the vmcore via
>>> makedumpfile shows the message:
>>> get_xen_basic_info_x86_64: Can't get the symbol of xenheap_phys_end.
>>>
>>> The commit 2651d571 changed the behaviour of init_xen_crash_info().
>>> With
>>> -   return TRUE;
>>> +   ret = TRUE;
>>> +
>>> +out_error:
>>> +   free(buf);
>>> the buffer is released but it's still used because of
>>> info->xen_crash_info.com = buf;
>>> This leads to random data in the buffer and later to the mentioned
>>> error.
>>
>> Thank you for the report and patch, I completely missed that at review.
>>
>>>
>>> With the change back the memory is not released.
>>> But I'm not familiar enough with code to decide where to do this.
>>
>> I've tweaked the patch, does this work for you?
> 
> Yes, much better.
> My test cases are working.

Thanks for testing, applied.
https://github.com/makedumpfile/makedumpfile/commit/a2136943b1f173d2bf7efffc29542556e38aa564

Thanks,
Kazu


Re: [PATCH] makedumpfile: xen: Fix get_xen_basic_info_x86_64: Can't get the symbol of xenheap_phys_end.

2022-09-27 Thread  
On 2022/09/26 16:24, dietmar.h...@fujitsu.com wrote:
> Hi,
> I have a Linux-dom0 running with Xen. The extraction of the vmcore via
> makedumpfile shows the message:
> get_xen_basic_info_x86_64: Can't get the symbol of xenheap_phys_end.
> 
> The commit 2651d571 changed the behaviour of init_xen_crash_info().
> With
> -   return TRUE;
> +   ret = TRUE;
> +
> +out_error:
> +   free(buf);
> the buffer is released but it's still used because of
> info->xen_crash_info.com = buf;
> This leads to random data in the buffer and later to the mentioned
> error.

Thank you for the report and patch, I completely missed that at review.

> 
> With the change back the memory is not released.
> But I'm not familiar enough with code to decide where to do this.

I've tweaked the patch, does this work for you?

Thanks,
Kazu

--
 From d2c336e0c1bb765675056ca942a884014c257f9a Mon Sep 17 00:00:00 2001
Subject: [PATCH] xen: Fix wrong free issue in init_xen_crash_info()

From: Dietmar Hahn 

The commit 2651d5719a21 ("[PATCH 11/14] fix memory leak in
init_xen_crash_info()") changed the behaviour of the function and the
buf variable is always released, but it's still used later when
returning TRUE.  Without the patch, this leads to random data in the
buffer and later to the following error:

   get_xen_basic_info_x86_64: Can't get the symbol of xenheap_phys_end.

Fixes: 2651d5719a21 ("[PATCH 11/14] fix memory leak in init_xen_crash_info()")
Signed-off-by: Dietmar Hahn 
Signed-off-by: Kazuhito Hagio 
---
  makedumpfile.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 65d1c7c2f02c..ff821ebd3eb0 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -9668,7 +9668,6 @@ init_xen_crash_info(void)
  {
off_t   offset_xen_crash_info;
unsigned long   size_xen_crash_info;
-   int ret = FALSE;
void *buf;
  
get_xen_crash_info(&offset_xen_crash_info, &size_xen_crash_info);
@@ -9710,11 +9709,11 @@ init_xen_crash_info(void)
else
info->xen_crash_info_v = 0;
  
-   ret = TRUE;
+   return TRUE;
  
  out_error:
free(buf);
-   return ret;
+   return FALSE;
  }
  
  int
@@ -12377,6 +12376,8 @@ out:
free(info->dump_header);
if (info->splitting_info != NULL)
free(info->splitting_info);
+   if (info->xen_crash_info.com != NULL)
+   free(info->xen_crash_info.com);
if (info->p2m_mfn_frame_list != NULL)
free(info->p2m_mfn_frame_list);
if (info->page_buf != NULL)
-- 
2.31.1
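
The ownership rule behind this fix can be shown in isolation. The sketch below is not the makedumpfile code: struct dump_info, init_crash_info() and cleanup_info() are hypothetical stand-ins, and the "read failure" is simulated by passing a NULL source. It only illustrates the pattern the patch restores, where the success path hands the buffer over to a long-lived structure and only the error path and the final cleanup free it.

```c
#include <stdlib.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

/* Hypothetical stand-in for the long-lived info structure. */
struct dump_info {
    void *crash_info;   /* owned by the struct after a successful init */
};

/*
 * On success the buffer is handed over to info and must NOT be freed here;
 * only the error path releases it.
 */
int init_crash_info(struct dump_info *info, const void *src, size_t size)
{
    void *buf;

    if (size == 0)
        return TRUE;        /* nothing to load */

    buf = malloc(size);
    if (buf == NULL)
        return FALSE;

    if (src == NULL)
        goto out_error;     /* simulated read failure */

    memcpy(buf, src, size);
    info->crash_info = buf; /* ownership moves to info */
    return TRUE;

out_error:
    free(buf);
    return FALSE;
}

/* Final cleanup releases whatever init handed over. */
void cleanup_info(struct dump_info *info)
{
    free(info->crash_info); /* free(NULL) is a no-op */
    info->crash_info = NULL;
}
```

Freeing the buffer on the success path as well, as the earlier leak fix did, leaves info->crash_info dangling, which matches the random data described in the report.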


Re: [PATCH] makedumpfile: Add initial LoongArch64 support

2022-09-20 Thread  
On 2022/09/19 13:47, Youling Tang wrote:
> Patch adds support for LoongArch64 in makedumpfile. It takes care of
> translation for the vmalloc, module and directly mapped kernel memory
> regions. Currently we only support 3-level 16K pages and VA_BITS as 48.
> 
> The changes were tested on a LoongArch64 Loongson-3A5000 processor.
> The dump compression and filtering (for all dump levels 1,2,4,8,16
> and 31) tests are successful.
> 
> Signed-off-by: Youling Tang 

Looks good, applied after adjusting a few indents:
https://github.com/makedumpfile/makedumpfile/commit/787a23ebc9d948f036aefb044f94e54fa5af

Thanks,
Kazu

> ---
> Note: kexec/kdump support patch see link [1]:
> [1] Link: 
> https://lore.kernel.org/loongarch/1663210426-15446-1-git-send-email-tangyoul...@loongson.cn/T/#t
> 
>   Makefile   |   2 +-
>   arch/loongarch64.c | 113 +
>   makedumpfile.h |  58 +++
>   3 files changed, 172 insertions(+), 1 deletion(-)
>   create mode 100644 arch/loongarch64.c
> 
> diff --git a/Makefile b/Makefile
> index 370a97c..e07f466 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -47,7 +47,7 @@ endif
>   SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h 
> sadump_info.h
>   SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c 
> cache.c tools.c printk.c detect_cycle.c
>   OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> -SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c 
> arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c
> +SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c 
> arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c 
> arch/loongarch64.c
>   OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
>   
>   LIBS = -ldw -lbz2 -ldl -lelf -lz
> diff --git a/arch/loongarch64.c b/arch/loongarch64.c
> new file mode 100644
> index 000..42a02ab
> --- /dev/null
> +++ b/arch/loongarch64.c
> @@ -0,0 +1,113 @@
> +/*
> + * loongarch64.c
> + *
> + * Copyright (C) 2022 Loongson Technology Corporation Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +#ifdef __loongarch64__
> +
> +#include "../print_info.h"
> +#include "../elf_info.h"
> +#include "../makedumpfile.h"
> +
> +int
> +get_phys_base_loongarch64(void)
> +{
> + info->phys_base = 0ULL;
> +
> + DEBUG_MSG("phys_base: %lx\n", info->phys_base);
> +
> + return TRUE;
> +}
> +
> +int
> +get_machdep_info_loongarch64(void)
> +{
> + info->section_size_bits = _SECTION_SIZE_BITS;
> +
> + /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> + if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> + info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> + else
> + info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> +
> + /* Check if we can get SECTION_SIZE_BITS from vmcoreinfo */
> + if (NUMBER(SECTION_SIZE_BITS) != NOT_FOUND_NUMBER)
> + info->section_size_bits = NUMBER(SECTION_SIZE_BITS);
> + else
> + info->section_size_bits = _SECTION_SIZE_BITS;
> +
> + DEBUG_MSG("max_physmem_bits : %ld\n", info->max_physmem_bits);
> + DEBUG_MSG("section_size_bits: %ld\n", info->section_size_bits);
> +
> + return TRUE;
> +}
> +
> +int
> +get_versiondep_info_loongarch64(void)
> +{
> + info->page_offset  = _PAGE_OFFSET;
> +
> + DEBUG_MSG("page_offset : %lx\n", info->page_offset);
> +
> + return TRUE;
> +}
> +
> +unsigned long long
> +vaddr_to_paddr_loongarch64(unsigned long vaddr)
> +{
> + unsigned long long paddr = NOT_PADDR;
> + pgd_t *pgda, pgdv;
> + pmd_t *pmda, pmdv;
> + pte_t *ptea, ptev;
> +
> + if (vaddr >= _XKPRANGE && vaddr < _XKVRANGE)
> + return vaddr & ((1ULL << MAX_PHYSMEM_BITS()) - 1);
> +
> + if (SYMBOL(swapper_pg_dir) == NOT_FOUND_SYMBOL) {
> + ERRMSG("Can't get the symbol of swapper_pg_dir.\n");
> + return NOT_PADDR;
> + }
> +
> + pgda = pgd_offset(SYMBOL(swapper_pg_dir), vaddr);
> + if (!readmem(VADDR, (unsigned long long)pgda, &pgdv, sizeof(pgdv))) {
> + ERRMSG("Can't read pgd\n");
> + return NOT_PADDR;
> + }
> +
> + pmda = pmd_offset(&pgdv, vaddr);
> + if (!readmem(VADDR, (unsigned long long)pmda, &pmdv, sizeof(pmdv))) {
> + ERRMSG("Can't read pmd\n");
> + return NOT_PADDR;
> + }
> +
> + if (pmdv & _PAGE_HUGE) {
> + paddr = (pmdv & PMD_MASK) + (vaddr & (PMD_SIZE - 1));
> +  

Re: [PATCH] makedumpfile: mips64: Replace hardcoded values with macros

2022-09-15 Thread  
On 2022/09/14 19:13, Chetan Kankotiya wrote:
> Replace hardcoded values of PAGE_OFFSET, XKPHS start and end address,
> and _MAX_PHYSMEM_BITS with macros that may differ based on kernel-defined
> values for different MIPS SoCs. We can override these macros to align
> with kernel values at compile time.
> 
> Signed-off-by: Chetan Kankotiya 

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/36cdfd91d7870649c186457c5051f5c435ab199e

Kazu

> ---
>   arch/mips64.c  |  4 ++--
>   makedumpfile.h | 17 +
>   2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/mips64.c b/arch/mips64.c
> index d541c3e..ab45b6e 100644
> --- a/arch/mips64.c
> +++ b/arch/mips64.c
> @@ -52,7 +52,7 @@ get_machdep_info_mips64(void)
>   int
>   get_versiondep_info_mips64(void)
>   {
> - info->page_offset  = 0x9800ULL;
> + info->page_offset  = _PAGE_OFFSET;
>   
>   DEBUG_MSG("page_offset : %lx\n", info->page_offset);
>   
> @@ -79,7 +79,7 @@ vaddr_to_paddr_mips64(unsigned long vaddr)
>   /*
>* XKPHYS
>*/
> - if (vaddr >= 0x9000ULL && vaddr < 0xc000ULL)
> + if (vaddr >= _XKPHYS_START_ADDR && vaddr < _XKPHYS_END_ADDR)
>   return vaddr & ((1ULL << MAX_PHYSMEM_BITS()) - 1);
>   
>   if (SYMBOL(swapper_pg_dir) == NOT_FOUND_SYMBOL) {
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 2084ed5..49b9242 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -965,8 +965,25 @@ typedef unsigned long pgd_t;
>   
>   #ifdef __mips64__ /* mips64 */
>   #define KVBASE  PAGE_OFFSET
> +
> +#ifndef _XKPHYS_START_ADDR
> +#define _XKPHYS_START_ADDR   0x9000ULL /* 
> _LOONGSON_XKPHYS_START_ADDR */
> +#endif
> +
> +#ifndef _XKPHYS_END_ADDR
> +#define _XKPHYS_END_ADDR 0xc000ULL /* 
> _LOONGSON_XKPHYS_END_ADDR */
> +#endif
> +
> +#ifndef _PAGE_OFFSET
> +#define _PAGE_OFFSET 0x9800ULL
> +#endif
> +
>   #define _SECTION_SIZE_BITS  (28)
> +
> +#ifndef _MAX_PHYSMEM_BITS
>   #define _MAX_PHYSMEM_BITS   (48)
> +#endif
> +
>   #define _PAGE_PRESENT   (1 << 0)
>   #define _PAGE_HUGE  (1 << 4)
>   
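
The two ideas in this patch, defaults that can be overridden at compile time and the masking of a directly mapped address, can be sketched on their own. All EXAMPLE_* names and the numeric values below are placeholders chosen for this illustration; they are not taken from the patch, whose constants are not fully legible in this archive.

```c
#include <stdint.h>

/*
 * Overridable defaults: pass e.g. -DEXAMPLE_MAX_PHYSMEM_BITS=40 to the
 * compiler to replace a value without editing the source.
 * All EXAMPLE_* names and values are placeholders for this sketch.
 */
#ifndef EXAMPLE_XKPHYS_START
#define EXAMPLE_XKPHYS_START 0x9000000000000000ULL
#endif

#ifndef EXAMPLE_XKPHYS_END
#define EXAMPLE_XKPHYS_END 0xc000000000000000ULL
#endif

#ifndef EXAMPLE_MAX_PHYSMEM_BITS
#define EXAMPLE_MAX_PHYSMEM_BITS 48
#endif

#define EXAMPLE_NOT_PADDR UINT64_MAX

/*
 * Translate a directly mapped address by masking off the high bits;
 * anything outside the window is reported as not translatable here
 * (a real implementation would fall through to a page table walk).
 */
uint64_t example_direct_map_to_paddr(uint64_t vaddr)
{
    if (vaddr >= EXAMPLE_XKPHYS_START && vaddr < EXAMPLE_XKPHYS_END)
        return vaddr & ((1ULL << EXAMPLE_MAX_PHYSMEM_BITS) - 1);
    return EXAMPLE_NOT_PADDR;
}
```

Because every default sits behind #ifndef, a build for a different SoC can supply its own values on the compiler command line, which is the flexibility the commit message describes.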


Re: [PATCH V3 0/9] Support RISCV64 arch and common commands

2022-09-12 Thread  
On 2022/09/13 11:11, Xianting Tian wrote:
> 
> On 2022/9/9 5:21 PM, lijiang wrote:
>> But anyway, these issues can be improved with later patches, so for the v3: 
>> ACK.
> 
> Thanks Li Jiang for the comments and ACK,
> 
> Now I have received two ACKs, from you and Kazu. Could I know when this
> patch set can be merged to the mainline?

Usually a crash patch is merged when the required or targeted kernel patch
is merged into mainline.  If the kernel patch for vmcoreinfo is merged in
the 6.1 merge window, the patch set will also be merged at that time.

Thanks,
Kazu


Re: [Crash-utility][PATCH V2 0/9] Support RISCV64 arch and common commands

2022-08-04 Thread  
On 2022/08/01 13:30, Xianting Tian wrote:
> This series of patches is for the Crash-utility tool; it makes the crash
> tool support the RISCV64 arch and the common commands (*, bt, p, rd, mod,
> log, set, struct, task, dis, help -r, help -m, and so on).
> 
> To make the crash tool work normally for the RISCV64 arch, we need a Linux
> kernel patch (waiting to be applied), which exports the kernel virtual
> memory layout, va_bits and phys_ram_base to vmcoreinfo; it can simplify the
> development of the crash tool.
> 
> The Linux kernel patch set:
> https://lore.kernel.org/linux-riscv/20220726093729.1231867-1-xianting.t...@linux.alibaba.com/
>   
> This series of patches are tested on QEMU RISCV64 env and SoC platform of
> T-head Xuantie 910 RISCV64 CPU.
> 
> 
> Some test examples are listed below:
> 
> ... ...
>KERNEL: vmlinux
>  DUMPFILE: vmcore
>  CPUS: 1
>  DATE: Fri Jul 15 10:24:25 CST 2022
>UPTIME: 00:00:33
> LOAD AVERAGE: 0.05, 0.01, 0.00
> TASKS: 41
>  NODENAME: buildroot
>   RELEASE: 5.18.9
>   VERSION: #30 SMP Fri Jul 15 09:47:03 CST 2022
>   MACHINE: riscv64  (unknown Mhz)
>MEMORY: 1 GB
> PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>   PID: 113
>   COMMAND: "sh"
>  TASK: ff6002269600  [THREAD_INFO: ff6002269600]
>   CPU: 0
> STATE: TASK_RUNNING (PANIC)
> 
> crash>
> 
> crash> p mem_map
> mem_map = $1 = (struct page *) 0xff603effbf00
> 
> crash> p /x *(struct page *) 0xff603effbf00
> $5 = {
>flags = 0x1000,
>{
>  {
>{
>  lru = {
>next = 0xff603effbf08,
>prev = 0xff603effbf08
>  },
>  {
>__filler = 0xff603effbf08,
>mlock_count = 0x3effbf08
>  }
>},
>mapping = 0x0,
>index = 0x0,
>private = 0x0
>  },
>... ...
> 
> crash> mod
>   MODULE   NAME BASE SIZE  OBJECT FILE
> 0113e740  nvme_core  01133000  98304  (not loaded)  
> [CONFIG_KALLSYMS]
> 011542c0  nvme   0114c000  61440  (not loaded)  
> [CONFIG_KALLSYMS]
> 
> crash> rd 0113e740 8
> 0113e740:   810874f8   .t..
> 0113e750:  011542c8 726f635f656d766e   .B..nvme_cor
> 0113e760:  0065    e...
> 0113e770:      
> 
> crash> vtop 0113e740
> VIRTUAL   PHYSICAL
> 0113e740  8254d740
> 
> PGD: 810e9ff8 => 2001
>P4D:  => 2fffec01
>PUD: 5605c2957470 => 20949801
>PMD: 7fff7f1750c0 => 20947401
> PTE: 0 => 209534e7
>   PAGE: 8254d000
> 
>PTE PHYSICAL  FLAGS
> 209534e7  8254d000  (PRESENT|READ|WRITE|GLOBAL|ACCESSED|DIRTY)
> 
>PAGE   PHYSICAL  MAPPING   INDEX CNT FLAGS
> ff603f0777d8 8254d00000  1 0
> 
> crash> bt
> PID: 113  TASK: ff600226c200  CPU: 0COMMAND: "sh"
>   #0 [ff2010333b90] riscv_crash_save_regs at 800078f8
>   #1 [ff2010333cf0] panic at 806578c6
>   #2 [ff2010333d50] sysrq_reset_seq_param_set at 8038c03c
>   #3 [ff2010333da0] __handle_sysrq at 8038c604
>   #4 [ff2010333e00] write_sysrq_trigger at 8038cae4
>   #5 [ff2010333e20] proc_reg_write at 801b7ee8
>   #6 [ff2010333e40] vfs_write at 80152bb2
>   #7 [ff2010333e80] ksys_write at 80152eda
>   #8 [ff2010333ed0] sys_write at 80152f52
> 
> ---
> Changes V1 -> V2:
>   1, Do the below fixes based on HAGIO KAZUHITO's comments:
>  Fix build warnings,
>  Use MACRO for Linux version,
>  Add description of x86_64 binary for riscv64 in README,
>  Fix build error for the "sticky" target for build on x86_64,
>  Fix the mixed indent.
>   2, Add 'help -m/M' support patch to this patch set.
>   3, Support native compiling approach, which means the host OS distro
>  is also a riscv64 (lp64d) Linux, based on Yixun Lan's comments.
>   4, Use __riscv and __riscv_xlen instead of __riscv64__ based on Yixun Lan's 
> comments.

Thank you for the v2, looks good to me.
So with the kernel patch set merged,

Acked-by: Kazuhito Hagio 

(no need to add this tag to the patches)

Thanks,
Kazu

> 
> Xianting Tian (9):
>Add RISCV64 framework code support
>RISCV64: Make crash tool enter command line and support some commands
>RISCV64: Add 'dis' command support
>RISCV64: Add 'irq' command support
>RISCV64: Add 'bt' command support
>RISCV64: Add 'help -r' command support
>RISCV64: Add 'help -m/M' command support
>RISCV64: Add 'mach' command support
>RISCV64: Add the implementation of symbol verify
> 
>   Makefile

Re: [Crash-utility] EXT: RE: crash: read error on type: "memory section root table"

2022-07-26 Thread  
On 2022/07/22 21:04, Agrain Patrick wrote:
> Hello,
> 
> Back to this topic.
> 
> I upgraded our system with the kexec-tools from Centos 8 Stream, based on 
> kexec 2.0.24 and makedumpfile 1.7.1.
> We are still facing errors when using 'makedumpfile -c'.
> 
> Removing the '-c' gives a better success/failure ratio, but sometimes the crash 
> file cannot be read by the crash tool.
> 
> Referring to Hagio's remark below concerning the sync, I added a sync 
> operation before the call to makedumpfile (just after the ext4 mount of 
> the required partitions) and added a second call to sync after makedumpfile 
> returns.
> In that configuration, the crash file can be read by the crash tool (in all 
> cases so far).

Good, it seems the second sync works.  I would also suggest unmounting the
filesystem cleanly before reboot, which includes syncing to disk, if you don't
do that already.

Thanks,
Kazu
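
As an illustration of the point about syncing, here is a minimal C sketch of
flushing a single file to disk before the system reboots. The helper name
write_and_flush() is hypothetical and is not part of makedumpfile or
kexec-tools; it only shows the standard write()/fsync() pattern, which has a
similar effect for one file as running "sync" after the dump is written.

```c
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of makedumpfile): write a buffer to
 * 'path' and force it to stable storage before returning.
 * Returns 0 on success, -1 on failure.
 */
int write_and_flush(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {		/* treat short-circuit cases as errors */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* fsync() returns only after the file data has been handed to the device. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

Without the fsync() step, data may still sit in the page cache when the
machine reboots, which would be consistent with a dumpfile that is shorter
than the offsets recorded in it.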

> 
> Thanks for your help.
> Best regards,
> Patrick Agrain
> 
> -Original Message-
> From: Crash-utility  On behalf of Agrain Patrick
> Sent: Wednesday, 6 April 2022 17:48
> To: Discussion list for crash utility usage, maintenance and development 
> ; kexec@lists.infradead.org
> Subject: Re: [Crash-utility] EXT: RE: crash: read error on type: "memory 
> section root table"
> 
> 
> 
> -Original Message-
> From: HAGIO KAZUHITO(萩尾 一仁)
> Sent: Wednesday, 6 April 2022 09:48
> To: Agrain Patrick 
> Cc: Discussion list for crash utility usage, maintenance and development 
> ; kexec@lists.infradead.org
> Subject: RE: EXT: RE: crash: read error on type: "memory section root table"
> 
> -Original Message-
>> Hello,
>>
>> Suggested trace above gives following information after a crash -d 8 command:
>> <...>
>> kernel NR_CPUS: 2
>> > 56017b542648>
>> 
>> read_diskdump: paddr/pfn: 12925820/12925 -> cache physical page:
>> 12925000
>> GETBUF(328 -> 0)
>> FREEBUF(0)
>> GETBUF(328 -> 0)
>> FREEBUF(0)
>> PAGESIZE=4096
>> mem_section_size = 16384
>> NR_SECTION_ROOTS = 2048
>> NR_MEM_SECTIONS = 524288
>> SECTIONS_PER_ROOT = 256
>> SECTION_ROOT_MASK = 0xff
>> PAGES_PER_SECTION = 32768
>> > 7ffd1b6bb000>
>> 
>> read_diskdump: paddr/pfn: 12926db0/12926 -> cache physical page:
>> 12926000
>> > 16384, (FOE), 56017da26fd0>
>> 
>> read_diskdump: paddr/pfn: 3f7fc000/3f7fc -> cache physical page:
>> 3f7fc000
>> crash: PAG3 - errno=2 r=0 pd.size=49
>> read_diskdump: READ_ERROR: cannot cache page: 3f7fc000
>> crash: read error: kernel virtual address: 904c7f7fc000  type: "memory 
>> section root table"
> 
> hmm, r=0 means end of file, can you check again whether pd.offset exceeds the 
> dumpfile size?  If so, somehow the dumpfile is shorter than expected.
> 
> Indeed, the offset points outside the dumpfile:
> Get:
> crash: PAG3 - errno=2 r=0 pd.size=52 pd.offset=168956485 with a dumpfile
> 164820 -rw-r--r--.  1 root root  168775680  6 avril 17:23 
> crashdump--20220406-1713
> 
> And another one:
> Get:
> crash: PAG3 - errno=2 r=0 pd.size=49 pd.offset=215640649 with a dumpfile
> 209984 -rw-r--r--.  1 root root  215023616  1 avril 10:58 
> crashdump-585.000-20220401-1054
> 
> I think a RHEL-based kexec-tools does "sync" after makedumpfile, but can you 
> check?
> 
> Actually, we are executing the makedumpfile in a script designated as init 
> file for the second kernel. Therefore, we do not perform the sync as per 
> core_collector.
> 
> Thanks,
> Kazu
> 
> Best regards,
> Patrick
> 
> --
> Crash-utility mailing list
> crash-util...@redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility
> Contribution Guidelines: https://github.com/crash-utility/crash/wiki


RE: [PATCH] Makefile: Avoid installing files in /etc

2022-05-11 Thread  
-Original Message-
> On 10/05/2022 22:50, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > [...]
> >
> > Fair enough, and as far as I've checked:
> >
> > - The makedumpfile.conf.sample file does not need to be in /etc, because
> > makedumpfile does not have any default path for it and reads a config
> > file only when specified with --config option.
> > (and IMO it's better to place such a sample config file, used rarely,
> > in /usr/share.)
> >
> > - Fedora/RHEL kexec-tools packaging does not use "make install" and have
> > their own install command, at least this patch will not affect them:
> > https://src.fedoraproject.org/rpms/kexec-tools/blob/main/f/kexec-tools.spec#_225
> >
> > - Debian/Ubuntu makedumpfile packages apparently do not have the file:
> > https://packages.debian.org/sid/amd64/makedumpfile/filelist
> >
> > On the whole I will accept this.
> >
> >> Notice that this patch intentionally skips
> >> the change for the .spec file, which aims specific distros, by creating
> >> RPM packages.
> >
> > However, the .spec file depends on "make install", so I will add this:
> >
> > diff --git a/makedumpfile.spec b/makedumpfile.spec
> > index ef619b8c8af9..fd9efa0639cc 100644
> > --- a/makedumpfile.spec
> > +++ b/makedumpfile.spec
> > @@ -25,7 +25,6 @@ make LINKTYPE=dynamic
> >  %install
> >  rm -rf %{buildroot}
> >  mkdir -p %{buildroot}/usr/sbin
> > -mkdir -p %{buildroot}/etc
> >  mkdir -p %{buildroot}/usr/share/man/man5
> >  mkdir -p %{buildroot}/usr/share/man/man8
> >  mkdir -p %{buildroot}/usr/share/%{name}-%{version}/eppic-scripts/
> > @@ -35,11 +34,11 @@ make install DESTDIR=%{buildroot}
> >  rm -rf %{buildroot}
> >
> >  %files
> > -/etc/makedumpfile.conf.sample
> >  /usr/sbin/makedumpfile
> >  /usr/sbin/makedumpfile-R.pl
> >  /usr/share/man/man5/makedumpfile.conf.5.gz
> >  /usr/share/man/man8/makedumpfile.8.gz
> > +/usr/share/%{name}-%{version}/makedumpfile.conf.sample
> >  /usr/share/%{name}-%{version}/eppic_scripts/
> >
> >  %changelog
> >
> >> [...]
> > The creation of ${DESTDIR}/etc is also not needed, will remove it and merge.
> >
> > install:
> > -   install -m 755 -d ${DESTDIR}/usr/sbin ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8 ${DESTDIR}/etc
> > +   install -m 755 -d ${DESTDIR}/usr/sbin ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
> >     install -m 755 -t ${DESTDIR}/usr/sbin makedumpfile $(VPATH)makedumpfile-R.pl
> >
> > Please let me know if any problem.
> >
> > Thanks,
> > Kazu
> >
> 
> Hi Kazu, this is perfect - thanks a bunch for the great analysis. Feel
> free to merge with your changes, it's much appreciated =)

Thanks, applied with the changes.
https://github.com/makedumpfile/makedumpfile/commit/09b5c879b9f787c52f1963555d8d46127c457f2a

Kazu



RE: [PATCH] Makefile: Avoid installing files in /etc

2022-05-10 Thread  
-Original Message-
> Some Linux distros rely on having files in /etc (Debian, et al.),
> others in *not* having them in such directory (like Arch Linux).
> Fact is that the former distros have no issues in installing files
> elsewhere, whereas Arch has issues installing files in /etc,
> especially if such files are already present there when the user tries to
> reinstall the packages - it fails. Arch packaging doesn't try to be as smart as
> dpkg, for example (which compares config files to determine if the user changed them).
> 
> With all of that said, this patch moves the sample conf file to
> /usr/share, a move that is well-tolerated in all distros and shouldn't
> cause regressions in packaging. Also, if some Linux distribution likes
> the idea of adding files in /etc, they can tune it in their packaging
> configuration scripts for makedumpfile, but we shouldn't have that
> as default in the Makefile.

Fair enough, and as far as I've checked:

- The makedumpfile.conf.sample file does not need to be in /etc, because
makedumpfile does not have any default path for it and reads a config
file only when specified with --config option.
(and IMO it's better to place such a sample config file, used rarely,
in /usr/share.)

- Fedora/RHEL kexec-tools packaging does not use "make install" and has
its own install commands, so at least this patch will not affect it:
https://src.fedoraproject.org/rpms/kexec-tools/blob/main/f/kexec-tools.spec#_225

- Debian/Ubuntu makedumpfile packages apparently do not have the file:
https://packages.debian.org/sid/amd64/makedumpfile/filelist

On the whole I will accept this.

> Notice that this patch intentionally skips
> the change for the .spec file, which aims specific distros, by creating
> RPM packages.

However, the .spec file depends on "make install", so I will add this:

diff --git a/makedumpfile.spec b/makedumpfile.spec
index ef619b8c8af9..fd9efa0639cc 100644
--- a/makedumpfile.spec
+++ b/makedumpfile.spec
@@ -25,7 +25,6 @@ make LINKTYPE=dynamic
 %install
 rm -rf %{buildroot}
 mkdir -p %{buildroot}/usr/sbin
-mkdir -p %{buildroot}/etc
 mkdir -p %{buildroot}/usr/share/man/man5
 mkdir -p %{buildroot}/usr/share/man/man8
 mkdir -p %{buildroot}/usr/share/%{name}-%{version}/eppic-scripts/
@@ -35,11 +34,11 @@ make install DESTDIR=%{buildroot}
 rm -rf %{buildroot}
 
 %files
-/etc/makedumpfile.conf.sample
 /usr/sbin/makedumpfile
 /usr/sbin/makedumpfile-R.pl
 /usr/share/man/man5/makedumpfile.conf.5.gz
 /usr/share/man/man8/makedumpfile.8.gz
+/usr/share/%{name}-%{version}/makedumpfile.conf.sample
 /usr/share/%{name}-%{version}/eppic_scripts/
 
 %changelog

> 
> Cc: Coiby Xu 
> Cc: Kazuhito Hagio 
> Cc: Leonidas Spyropoulos 
> Signed-off-by: Guilherme G. Piccoli 
> ---
>  Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index cc6b0120aa7d..014be8110836 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -130,6 +130,6 @@ install:
>   install -m 755 -t ${DESTDIR}/usr/sbin makedumpfile 
> $(VPATH)makedumpfile-R.pl
>   install -m 644 -t ${DESTDIR}/usr/share/man/man8 makedumpfile.8
>   install -m 644 -t ${DESTDIR}/usr/share/man/man5 makedumpfile.conf.5
> - install -m 644 -D $(VPATH)makedumpfile.conf 
> ${DESTDIR}/etc/makedumpfile.conf.sample
>   mkdir -p ${DESTDIR}/usr/share/makedumpfile-${VERSION}/eppic_scripts
> + install -m 644 -D $(VPATH)makedumpfile.conf 
> ${DESTDIR}/usr/share/makedumpfile-${VERSION}/makedumpfile.conf.sample
>   install -m 644 -t 
> ${DESTDIR}/usr/share/makedumpfile-${VERSION}/eppic_scripts/ 
> $(VPATH)eppic_scripts/*
> --
> 2.36.0

The creation of ${DESTDIR}/etc is also not needed, so I will remove it and merge.

install:
-   install -m 755 -d ${DESTDIR}/usr/sbin ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8 ${DESTDIR}/etc
+   install -m 755 -d ${DESTDIR}/usr/sbin ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
    install -m 755 -t ${DESTDIR}/usr/sbin makedumpfile $(VPATH)makedumpfile-R.pl

Please let me know if there is any problem.

Thanks,
Kazu




RE: [PATCH makedumpfile] Avoid false-positive mem_section validation with vmlinux

2022-04-26 Thread  
Hi Pingfan, Philipp,

Thank you for reviewing and testing this, applied.
https://github.com/makedumpfile/makedumpfile/commit/6d0d95ecc04a70f8448d562ff0fbbae237f5c929

Kazu

-Original Message-
> On Thu, Apr 21, 2022 at 7:58 AM HAGIO KAZUHITO(萩尾 一仁)
>  wrote:
> >
> > Currently get_mem_section() validates if SYMBOL(mem_section) is the address
> > of the mem_section array first.  But there was a report that the first
> > validation wrongly returned TRUE with -x vmlinux and SPARSEMEM_EXTREME
> > (4.15+) on s390x.  This leads to crash failing startup with the following
> > seek error:
> >
> >   crash: seek error: kernel virtual address: 67fffc2800  type: "memory 
> > section root table"
> >
> > Skip the first validation when satisfying the conditions.
> >
> > Reported-by: Dave Wysochanski 
> > Signed-off-by: Kazuhito Hagio 
> > ---
> >  makedumpfile.c | 31 +++
> >  1 file changed, 31 insertions(+)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index a2f45c84cee3..65d1c7c2f02c 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -3698,6 +3698,22 @@ validate_mem_section(unsigned long *mem_sec,
> > return ret;
> >  }
> >
> > +/*
> > + * SYMBOL(mem_section) varies with the combination of memory model and
> > + * its source:
> > + *
> > + * SPARSEMEM
> > + *   vmcoreinfo: address of mem_section root array
> > + *   -x vmlinux: address of mem_section root array
> > + *
> > + * SPARSEMEM_EXTREME v1
> > + *   vmcoreinfo: address of mem_section root array
> > + *   -x vmlinux: address of mem_section root array
> > + *
> > + * SPARSEMEM_EXTREME v2 (with 83e3c48729d9 and a0b1280368d1) 4.15+
> > + *   vmcoreinfo: address of mem_section root array
> > + *   -x vmlinux: address of pointer to mem_section root array
> > + */
> >  static int
> >  get_mem_section(unsigned int mem_section_size, unsigned long *mem_maps,
> > unsigned int num_section)
> > @@ -3710,12 +3726,27 @@ get_mem_section(unsigned int mem_section_size, 
> > unsigned long *mem_maps,
> > strerror(errno));
> > return FALSE;
> > }
> > +
> > +   /*
> > +* There was a report that the first validation wrongly returned TRUE
> > +* with -x vmlinux and SPARSEMEM_EXTREME v2 on s390x, so skip it.
> > +* However, leave the fallback validation as it is for the -i option.
> > +*/
> > +   if (is_sparsemem_extreme() && info->name_vmlinux) {
> > +   unsigned long flag = 0;
> > +   if (get_symbol_type_name("mem_section", DWARF_INFO_GET_SYMBOL_TYPE,
> > +   NULL, &flag)
> > +   && !(flag & TYPE_ARRAY))
> > +   goto skip_1st_validation;
> > +   }
> > +
> > ret = validate_mem_section(mem_sec, SYMBOL(mem_section),
> >mem_section_size, mem_maps, num_section);
> >
> > if (!ret && is_sparsemem_extreme()) {
> > unsigned long mem_section_ptr;
> >
> > +skip_1st_validation:
> > if (!readmem(VADDR, SYMBOL(mem_section), &mem_section_ptr,
> >  sizeof(mem_section_ptr)))
> > goto out;
> > --
> > 2.27.0
> >
> I discussed this with Kazu off-list, and with his kind help I now understand
> why he dropped V1.
> 
> Hence,
> Reviewed-by: Pingfan Liu 


RE: [PATCH makedumpfile] Avoid false-positive mem_section validation with vmlinux

2022-04-25 Thread  
-Original Message-
> On Mon, Apr 25, 2022 at 8:48 AM HAGIO KAZUHITO(萩尾 一仁)
>  wrote:
> >
> > Hi Pingfan,
> >
> > -Original Message-
> > > On Wed, Apr 20, 2022 at 11:58:29PM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > > > Currently get_mem_section() validates if SYMBOL(mem_section) is the 
> > > > address
> > > > of the mem_section array first.  But there was a report that the first
> > > > validation wrongly returned TRUE with -x vmlinux and SPARSEMEM_EXTREME
> > > > (4.15+) on s390x.  This leads to crash failing startup with the following
> > > > seek error:
> > > >
> > > >   crash: seek error: kernel virtual address: 67fffc2800  type: "memory 
> > > > section root table"
> > > >
> > > > Skip the first validation when satisfying the conditions.
> > > >
> > >
> > > I still prefer your V1, which was discussed internally; its logic was
> > > more straightforward. I suggest a slight change to your V1 that folds
> > > the "-x vmlinux" logic into is_sparsemem_extreme().
> > >
> > > What about the following: (not tested yet, if it is good, I can test it)
> >
> > Thanks for your review and suggestion.
> >
> > The purpose of my patch is to distinguish between SPARSEMEM_EXTREME
> > v1 and v2, i.e. whether it has 83e3c48729d9 or not.
> >
> 
> Not sure about dwarf, but is it possible to utilize the array length
> info in is_sparsemem_extreme()?
> 
> For SPARSEMEM_EXTREME,
>  #ifdef CONFIG_SPARSEMEM_EXTREME
> extern struct mem_section *mem_section[NR_SECTION_ROOTS];
>  #else
>  extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>  #endif
> 
> And if DWARF_INFO_GET_SYMBOL_ARRAY_LENGTH works, then there is a big
> gap between "NR_SECTION_ROOTS * 8-bytes" and "sizeof(struct
> mem_section) * NR_SECTION_ROOTS * SECTIONS_PER_ROOT"

Hmm, sorry, I don't quite get your point.  The current is_sparsemem_extreme()
already uses that value to determine whether it's SPARSEMEM_EXTREME or not,
and it does the same thing with vmlinux, too.

> > >   if ((ARRAY_LENGTH(mem_section)
> > > -  == divideup(NR_MEM_SECTIONS(), _SECTIONS_PER_ROOT_EXTREME()))
> > > - || (ARRAY_LENGTH(mem_section) == NOT_FOUND_STRUCTURE))
> > > - return TRUE;

if (SYMBOL(mem_section) != NOT_FOUND_SYMBOL)
SYMBOL_ARRAY_LENGTH_INIT(mem_section, "mem_section");

Thanks,
Kazu



RE: [PATCH makedumpfile] Avoid false-positive mem_section validation with vmlinux

2022-04-24 Thread  
Hi Pingfan,

-Original Message-
> On Wed, Apr 20, 2022 at 11:58:29PM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > Currently get_mem_section() validates if SYMBOL(mem_section) is the address
> > of the mem_section array first.  But there was a report that the first
> > validation wrongly returned TRUE with -x vmlinux and SPARSEMEM_EXTREME
> > (4.15+) on s390x.  This leads to crash failing startup with the following
> > seek error:
> >
> >   crash: seek error: kernel virtual address: 67fffc2800  type: "memory 
> > section root table"
> >
> > Skip the first validation when satisfying the conditions.
> >
> 
> I still prefer your V1, which was discussed internally; its logic was
> more straightforward. I suggest a slight change to your V1 that folds
> the "-x vmlinux" logic into is_sparsemem_extreme().
> 
> What about the following: (not tested yet, if it is good, I can test it)

Thanks for your review and suggestion.

The purpose of my patch is to distinguish between SPARSEMEM_EXTREME
v1 and v2, i.e. whether it has 83e3c48729d9 or not.

is_sparsemem_extreme() has to return whether it's SPARSEMEM_EXTREME or
SPARSEMEM, so unfortunately it's not suitable to use that logic in
the function.
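
To illustrate the v1/v2 distinction, here is a minimal, self-contained C
sketch. It is not makedumpfile code: root_array_addr() and SYM_IS_ARRAY are
hypothetical names, and an in-process memcpy() stands in for readmem(). It
only shows why an extra dereference is needed when the mem_section symbol
taken from vmlinux is a pointer (v2) rather than the root array itself (v1).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag, standing in for the DWARF "symbol is an array" bit. */
#define SYM_IS_ARRAY 0x1UL

/*
 * Return the address of the mem_section root array.
 * sym_addr:  address of the mem_section symbol
 * sym_flags: SYM_IS_ARRAY if the symbol itself is the root array
 */
uintptr_t root_array_addr(uintptr_t sym_addr, unsigned long sym_flags)
{
	uintptr_t root;

	if (sym_flags & SYM_IS_ARRAY)
		return sym_addr;	/* v1: the symbol is the root array */

	/* v2: the symbol is a pointer, so read the root address through it. */
	memcpy(&root, (const void *)sym_addr, sizeof(root));
	return root;
}
```

In this sketch the caller decides from the symbol's type, not from the memory
model, whether the dereference is needed, which mirrors the reason given
above for keeping that decision out of is_sparsemem_extreme().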

Thanks,
Kazu

> 
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a2f45c8..3ebe372 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -2240,11 +2240,18 @@ int
>  is_sparsemem_extreme(void)
>  {
>   if ((ARRAY_LENGTH(mem_section)
> -  == divideup(NR_MEM_SECTIONS(), _SECTIONS_PER_ROOT_EXTREME()))
> - || (ARRAY_LENGTH(mem_section) == NOT_FOUND_STRUCTURE))
> - return TRUE;
> - else
> +  == divideup(NR_MEM_SECTIONS(), _SECTIONS_PER_ROOT_EXTREME())) {
> +  return TRUE;
> + } else if (info->name_vmlinux) {
> + unsigned long flag = 0;
> +
> + if (get_symbol_type_name("mem_section", DWARF_INFO_GET_SYMBOL_TYPE,
> + NULL, &flag)
> + && !(flag & TYPE_ARRAY))
> + return TRUE;
> + } else {
>   return FALSE;
> + }
>  }
> 
>  int
> @@ -3704,28 +3711,24 @@ get_mem_section(unsigned int mem_section_size, 
> unsigned long *mem_maps,
>  {
>   int ret = FALSE;
>   unsigned long *mem_sec = NULL;
> + unsigned long mem_section_ptr = SYMBOL(mem_section);
> 
>   if ((mem_sec = malloc(mem_section_size)) == NULL) {
>   ERRMSG("Can't allocate memory for the mem_section. %s\n",
>   strerror(errno));
>   return FALSE;
>   }
> - ret = validate_mem_section(mem_sec, SYMBOL(mem_section),
> -mem_section_size, mem_maps, num_section);
> -
> - if (!ret && is_sparsemem_extreme()) {
> - unsigned long mem_section_ptr;
> 
> + if (is_sparsemem_extreme()) {
>   if (!readmem(VADDR, SYMBOL(mem_section), &mem_section_ptr,
>sizeof(mem_section_ptr)))
>   goto out;
> + }
> 
> - ret = validate_mem_section(mem_sec, mem_section_ptr,
> - mem_section_size, mem_maps, num_section);
> -
> - if (!ret)
> + ret = validate_mem_section(mem_sec, mem_section_ptr,
> + mem_section_size, mem_maps, num_section);
> + if (!ret)
>   ERRMSG("Could not validate mem_section.\n");
> - }
>  out:
>   if (mem_sec != NULL)
>   free(mem_sec);
> --
> 
> Thanks,
> 
>   Pingfan
> 
> > Reported-by: Dave Wysochanski 
> > Signed-off-by: Kazuhito Hagio 
> > ---
> >  makedumpfile.c | 31 +++
> >  1 file changed, 31 insertions(+)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index a2f45c84cee3..65d1c7c2f02c 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -3698,6 +3698,22 @@ validate_mem_section(unsigned long *mem_sec,
> > return ret;
> >  }
> >
> > +/*
> > + * SYMBOL(mem_section) varies with the combination of memory model and
> > + * its source:
> > + *
> > + * SPARSEMEM
> > + *   vmcoreinfo: address of mem_section root array
> > + *   -x vmlinux: address of mem_section root array
> > + *
> > + * SPARSEMEM_EXTREME v1
> > + *   vmcoreinfo: address of mem_section root array
> > + *   -x vmlinux: address of mem_section root array
> > + *
> > + * SPARSEMEM_EXTREME v2 (with 83e3c48729d9 and a0b1280368d1) 4.15+
> > + *   vmcoreinfo: addre

[PATCH makedumpfile] Avoid false-positive mem_section validation with vmlinux

2022-04-20 Thread  
Currently get_mem_section() validates if SYMBOL(mem_section) is the address
of the mem_section array first.  But there was a report that the first
validation wrongly returned TRUE with -x vmlinux and SPARSEMEM_EXTREME
(4.15+) on s390x.  This leads to crash failing startup with the following
seek error:

  crash: seek error: kernel virtual address: 67fffc2800  type: "memory section 
root table"

Skip the first validation when satisfying the conditions.

Reported-by: Dave Wysochanski 
Signed-off-by: Kazuhito Hagio 
---
 makedumpfile.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index a2f45c84cee3..65d1c7c2f02c 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3698,6 +3698,22 @@ validate_mem_section(unsigned long *mem_sec,
return ret;
 }
 
+/*
+ * SYMBOL(mem_section) varies with the combination of memory model and
+ * its source:
+ *
+ * SPARSEMEM
+ *   vmcoreinfo: address of mem_section root array
+ *   -x vmlinux: address of mem_section root array
+ *
+ * SPARSEMEM_EXTREME v1
+ *   vmcoreinfo: address of mem_section root array
+ *   -x vmlinux: address of mem_section root array
+ *
+ * SPARSEMEM_EXTREME v2 (with 83e3c48729d9 and a0b1280368d1) 4.15+
+ *   vmcoreinfo: address of mem_section root array
+ *   -x vmlinux: address of pointer to mem_section root array
+ */
 static int
 get_mem_section(unsigned int mem_section_size, unsigned long *mem_maps,
unsigned int num_section)
@@ -3710,12 +3726,27 @@ get_mem_section(unsigned int mem_section_size, unsigned long *mem_maps,
strerror(errno));
return FALSE;
}
+
+   /*
+* There was a report that the first validation wrongly returned TRUE
+* with -x vmlinux and SPARSEMEM_EXTREME v2 on s390x, so skip it.
+* However, leave the fallback validation as it is for the -i option.
+*/
+   if (is_sparsemem_extreme() && info->name_vmlinux) {
+   unsigned long flag = 0;
+   if (get_symbol_type_name("mem_section", DWARF_INFO_GET_SYMBOL_TYPE,
+   NULL, &flag)
+   && !(flag & TYPE_ARRAY))
+   goto skip_1st_validation;
+   }
+
ret = validate_mem_section(mem_sec, SYMBOL(mem_section),
   mem_section_size, mem_maps, num_section);
 
if (!ret && is_sparsemem_extreme()) {
unsigned long mem_section_ptr;
 
+skip_1st_validation:
                if (!readmem(VADDR, SYMBOL(mem_section), &mem_section_ptr,
 sizeof(mem_section_ptr)))
goto out;
-- 
2.27.0



RE: [ANNOUNCE] makedumpfile 1.7.1

2022-04-19 Thread  
-Original Message-
> On 18/04/2022 06:03, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > Hi,
> >
> > I'm pleased to announce the release of makedumpfile 1.7.1.
> > Thank you everyone for your help to maintain this tool.
> >
> > Download:
> > The latest makedumpfile can be downloaded from the following URL.
> >https://github.com/makedumpfile/makedumpfile/releases
> >
> Hi Hagio,
> 
> I maintain the package in Arch Linux and noticed that you are signing
> the git tag but not the commit message. Would you be able to sign both
> next time so I can switch to validate the source using your public key?

sure, I will try next time.

> Arch Linux packaging tool (pacman) supports this natively and it would
> help maintain the chain of trust. It would also be useful to mention
> your public key in the README (along with other's PGP keys who might be
> tagging releases)

ok, will add it too.

Thanks,
Kazu



[ANNOUNCE] makedumpfile 1.7.1

2022-04-18 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.1.
Thank you everyone for your help to maintain this tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
  https://github.com/makedumpfile/makedumpfile/releases

New features:
- Cycle detection for printk log_buf
- A lot of code hardening
- Support kernels up to v5.17 (x86_64)
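
As a rough illustration of what cycle detection means when walking a chain of
records, here is a minimal C sketch using Floyd's tortoise-and-hare method.
This is an assumption for illustration only: it is not the code from the
commits listed below, the algorithm used in makedumpfile may differ, and
walk_has_cycle() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Follow next[] starting at 'start' until 'end' is reached.
 * Returns 1 if the walk runs into a cycle before reaching 'end', 0 otherwise.
 * Every next[i] is assumed to be a valid index into next[] or equal to 'end'.
 */
int walk_has_cycle(const size_t *next, size_t start, size_t end)
{
	size_t slow = start, fast = start;

	while (fast != end) {
		fast = next[fast];		/* fast pointer: first step */
		if (fast == end)
			break;
		fast = next[fast];		/* fast pointer: second step */
		slow = next[slow];		/* slow pointer: one step */
		if (fast != end && fast == slow)
			return 1;		/* pointers met inside a loop */
	}
	return 0;
}
```

A corrupted log buffer whose records link back to an earlier record would
otherwise make a naive walk loop forever; with a check like this the walk can
stop and report the problem instead.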

Special thanks to Philipp Rudo (again), who has worked on hardening
makedumpfile.  This release contains the remainder of the work,
mainly removal of variable-length arrays.
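
For readers unfamiliar with this kind of hardening, the following is a small
hypothetical C sketch of the general pattern, not code taken from
makedumpfile: a buffer that would have been declared as a variable-length
array on the stack is allocated on the heap instead, so an oversized request
fails with a reportable error rather than overflowing the stack.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: count the zero bytes in a copy of 'src'.
 * A VLA version would declare "unsigned char buf[len];", which can
 * overflow the stack for a large 'len' with no way to detect it.
 * Returns the count, or -1 if the allocation fails.
 */
long count_zero_bytes(const unsigned char *src, size_t len)
{
	unsigned char *buf;
	long zeros = 0;
	size_t i;

	buf = malloc(len ? len : 1);	/* heap allocation; failure is detectable */
	if (buf == NULL)
		return -1;

	memcpy(buf, src, len);
	for (i = 0; i < len; i++)
		if (buf[i] == 0)
			zeros++;

	free(buf);
	return zeros;
}
```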

Commits since v1.7.0
7684d26 [PATCH] Support newer kernels up to v5.17 (Kazuhito Hagio)
0ea9257 [PATCH] check that order of free pages falls within valid range (Alexander Egorenkov)
3318c51 [PATCH] omit unnecessary calls to print_progress() (Philipp Rudo)
5035c08 [PATCH] print error when reading with unsupported compression (Philipp Rudo)
68d120b [PATCH v2 3/3] use cycle detection when parsing the prink log_buf (Philipp Rudo)
e1d2e53 [PATCH v2 2/3] use pointer arithmetics for dump_dmesg (Philipp Rudo)
feae3d1 [PATCH v2 1/3] add generic cycle detection (Philipp Rudo)
64b1c7f [PATCH] Fixes for rpmbuild testing (Kazuhito Hagio)
2169de6 [PATCH v2] Simplify the generation of man pages (Leonidas Spyropoulos)
59b1726 [PATCH] sadump, kaslr: fix failure of calculating kaslr_offset (HATAYAMA Daisuke)
e459edc [PATCH] remove variable length array in get_machdep_info_x86_64() (Kazuhito Hagio)
d2ffef8 [PATCH 15/15] remove variable length array in writeout_multiple_dumpfiles() (Philipp Rudo)
704d5cb [PATCH 14/15] remove variable length array in write_kdump_pages_and_bitmap_cyclic() (Philipp Rudo)
1424682 [PATCH 13/15] remove variable length array in write_kdump_pages_cyclic() (Philipp Rudo)
c977617 [PATCH 12/15] remove variable length array in write_elf_load_segment() (Philipp Rudo)
e2134cd [PATCH 11/15] remove variable length array in copy_bitmap_file() (Philipp Rudo)
67f7a2b [PATCH 10/15] remove variable length array in __exclude_unnecessary_pages() (Philipp Rudo)
c556030 [PATCH 09/15] remove variable length array in exclude_zero_pages_cyclic() (Philipp Rudo)
be7ec0c [PATCH 08/15] remove variable length array in create_1st_bitmap_file() (Philipp Rudo)
26c061c [PATCH 07/15] remove variable length array in copy_1st_bitmap_from_memory() (Philipp Rudo)
b3a938f [PATCH 06/15] remove variable length array in check_and_modify_multiple_kdump_headers() (Philipp Rudo)
cc5392d [PATCH 05/15] remove variable length array in get_mm_discontigmem() (Philipp Rudo)
00a1b4d [PATCH 04/15] remove variable length array in readpage_kdump_compressed_parallel() (Philipp Rudo)
64b5b29 [PATCH 03/15] remove variable length array in readpage_kdump_compressed() (Philipp Rudo)
ffc64f0 [PATCH 02/15] remove variable length array in get_dom0_mapnr() (Philipp Rudo)
f68ac34 [PATCH 01/15] sadump: remove variable length array sadump_copy_1st_bitmap_from_memory() (Philipp Rudo)


Description of makedumpfile:
makedumpfile is a tool that creates a dumpfile from /proc/vmcore, filtering
out pages that are unnecessary for analysis and compressing the remaining
pages, in order to reduce the size of the dumpfile and the time needed to
create it.
https://github.com/makedumpfile/makedumpfile





RE: [PATCH v3 1/1] check that order of free pages falls within valid range

2022-04-11 Thread  
-Original Message-
> This change makes __exclude_unnecessary_pages() more robust by
> verifying that the order of a free page is valid before computing the size
> of its memory block in the buddy system.
> 
> The order of a free page cannot be larger than (MAX_ORDER - 1) because
> the array 'zone.free_area' is of size MAX_ORDER.
> 
> This situation is reproducible with some s390x dumps:
> 
> __exclude_unnecessary_pages: Invalid free page order: pfn=2690c0, order=52, 
> max order=8
> 
> References:
> - https://listman.redhat.com/archives/crash-utility/2021-September/009204.html
> - https://www.kernel.org/doc/gorman/html/understand/understand009.html
> 
> Signed-off-by: Alexander Egorenkov 
> ---
>  makedumpfile.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 2ef345879524..a6c2a4934ff9 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -6457,6 +6457,11 @@ __exclude_unnecessary_pages(unsigned long mem_map,
>   else if ((info->dump_level & DL_EXCLUDE_FREE)
>   && info->page_is_buddy
>   && info->page_is_buddy(flags, _mapcount, private, _count)) {
> + if (private >= ARRAY_LENGTH(zone.free_area)) {
> + MSG("WARNING: Invalid free page order: pfn=%llx, order=%lu, max order=%lu\n",
> + pfn, private, ARRAY_LENGTH(zone.free_area) - 1);
> + continue;
> + }
>   nr_pages = 1 << private;
>   pfn_counter = &pfn_free;
>   }
> --
> 2.34.1

Thanks for the update.
Applied with the available check for ARRAY_LENGTH(zone.free_area).
https://github.com/makedumpfile/makedumpfile/commit/0ea92577d93c93af701101f22ebd0096202c8085

Thanks,
Kazu
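
To illustrate the check discussed above, here is a minimal standalone sketch, not the applied commit. The names buddy_block_pages() and free_area_len are made up for this example; free_area_len stands in for ARRAY_LENGTH(zone.free_area). The check also matters for a reason beyond the array bound: shifting by a count such as 52 is undefined behaviour in C when the shifted type is a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper, not makedumpfile code: compute the number of pages
 * in a buddy block of the given order. Valid orders are
 * 0 .. free_area_len - 1. Returns 1 and stores the page count on success,
 * returns 0 and leaves *nr_pages untouched if the order is invalid.
 */
static int buddy_block_pages(unsigned long order, unsigned long free_area_len,
			     unsigned long *nr_pages)
{
	/* The shift below is only defined for counts below the word size. */
	assert(free_area_len > 0 &&
	       free_area_len <= sizeof(unsigned long) * CHAR_BIT);

	if (order >= free_area_len) {
		fprintf(stderr,
			"WARNING: Invalid free page order: order=%lu, max order=%lu\n",
			order, free_area_len - 1);
		return 0;
	}

	*nr_pages = 1UL << order;
	return 1;
}
```

A caller would skip the page (or abort, depending on policy) when the helper returns 0, and only use nr_pages when it returns 1.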




RE: [PATCH v2 1/1] check that order of free pages falls within valid range

2022-04-11 Thread  
-Original Message-
> 
> Hi Kazu,
> 
> HAGIO KAZUHITO(萩尾 一仁)  writes:
> 
> > Hi Alex,
> >
> > thanks for the patch,
> >
> > -Original Message-
> >> Alexander Egorenkov  writes:
> >>
> >> > This change makes __exclude_unnecessary_pages() more robust by
> >> > verifying that the order of a free page is valid before computing the size
> >> > of its memory block in the buddy system.
> >> >
> >> > The order of a free page cannot be larger than (MAX_ORDER - 1) because
> >> > the array 'zone.free_area' is of size MAX_ORDER.
> >> >
> >> > This situation is reproducible with some s390x dumps:
> >> >
> >> > __exclude_unnecessary_pages: Invalid free page order: pfn=2690c0, order=52, max order=8
> >> >
> >> > References:
> >> > - https://listman.redhat.com/archives/crash-utility/2021-September/009204.html
> >> > - https://www.kernel.org/doc/gorman/html/understand/understand009.html
> >> >
> >> > Signed-off-by: Alexander Egorenkov 
> >> > ---
> >> >  makedumpfile.c | 6 ++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/makedumpfile.c b/makedumpfile.c
> >> > index 2ef345879524..56aa026e7b34 100644
> >> > --- a/makedumpfile.c
> >> > +++ b/makedumpfile.c
> >> > @@ -6457,6 +6457,12 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> >> >  else if ((info->dump_level & DL_EXCLUDE_FREE)
> >> >  && info->page_is_buddy
> >> >  && info->page_is_buddy(flags, _mapcount, private, _count)) {
> >> > +if (private >= ARRAY_LENGTH(zone.free_area)) {
> >> > +ERRMSG("Invalid free page order: pfn=%llx, order=%lu, max order=%lu\n",
> >> > +   pfn, private, ARRAY_LENGTH(zone.free_area) - 1);
> >> > +free(page_cache);
> >> > +return FALSE;
> >> > +}
> >> >  nr_pages = 1 << private;
> >> >  pfn_counter = &pfn_free;
> >> >  }
> >> > --
> >> > 2.34.1
> >> >
> >> >
> >>
> >> I found out when this can happen.
> >>
> >> If e.g. a driver calls free_pages() and gives an order > max page order,
> >> then __free_one_page() stores the given invalid page order in the
> >> 'private' member of struct page and gives it back to the buddy
> >> allocator.
> >>
> >> This is what actually happened in the dump I used to reproduce this issue
> >> with makedumpfile.
> >
> > Good catch, though I could not reproduce it so far..
> >
> > but I wonder whether we have no other choice than returning FALSE?
> > in other words, can't we skip (include) the invalid page with a
> > warning message?
> >
> > As I said before, I think that capturing more pages than expected
> > will be better than not capturing a dump, and that is "robust"
> > against unexpected values.
> 
> I chose failing over a recovery attempt because I was not sure
> that it won't have negative effects on other pages.

Ah, I had forgotten that the original issue was an infinite loop, so you
chose to fail in order to avoid a similar situation with negative effects
on other pages, right?  I understand.

Hmm, but I think stopping makedumpfile on an invalid value is too strict;
it might still be possible to complete the dump.  If it has
negative effects, let's fix it again.

> Your suggestion is to replace ERRMSG with a warning and then just
> continue? I can rework the patch.

Yes, please.

Thanks,
Kazu
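
The policy agreed on here (warn, keep the suspect page, carry on) can be sketched as follows. This is a simplified illustration under stated assumptions, not makedumpfile's loop: scan_free_pages(), struct scan_result and the flat array of orders are all hypothetical, and the real code walks page descriptors by pfn.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type for this sketch. */
struct scan_result {
	unsigned long excluded_pages;	/* pages dropped from the dump as free */
	unsigned long kept_suspect;	/* entries kept because the order was invalid */
};

/*
 * Walk a list of free-block orders. A valid order excludes 2^order pages;
 * an invalid one only triggers a warning and the scan continues, so the
 * dump still completes with a few more pages than strictly necessary.
 */
static struct scan_result scan_free_pages(const unsigned long *orders, size_t n,
					  unsigned long free_area_len)
{
	struct scan_result r = { 0, 0 };

	assert(free_area_len > 0 &&
	       free_area_len <= sizeof(unsigned long) * CHAR_BIT);

	for (size_t i = 0; i < n; i++) {
		if (orders[i] >= free_area_len) {
			fprintf(stderr,
				"WARNING: Invalid free page order: order=%lu, max order=%lu\n",
				orders[i], free_area_len - 1);
			r.kept_suspect++;
			continue;	/* keep the page in the dump */
		}
		r.excluded_pages += 1UL << orders[i];
	}
	return r;
}
```

The alternative discussed earlier in the thread would return an error at the first invalid entry, which loses the whole dump because of a single corrupted value.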




RE: [PATCH v2 1/1] check that order of free pages falls within valid range

2022-04-10 Thread  
Hi Alex,

thanks for the patch,

-Original Message-
> Alexander Egorenkov  writes:
> 
> > This change makes __exclude_unnecessary_pages() more robust by
> > verifying that the order of a free page is valid before computing the size
> > of its memory block in the buddy system.
> >
> > The order of a free page cannot be larger than (MAX_ORDER - 1) because
> > the array 'zone.free_area' is of size MAX_ORDER.
> >
> > This situation is reproducible with some s390x dumps:
> >
> > __exclude_unnecessary_pages: Invalid free page order: pfn=2690c0, order=52, max order=8
> >
> > References:
> > - https://listman.redhat.com/archives/crash-utility/2021-September/009204.html
> > - https://www.kernel.org/doc/gorman/html/understand/understand009.html
> >
> > Signed-off-by: Alexander Egorenkov 
> > ---
> >  makedumpfile.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index 2ef345879524..56aa026e7b34 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -6457,6 +6457,12 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> > else if ((info->dump_level & DL_EXCLUDE_FREE)
> > && info->page_is_buddy
> > && info->page_is_buddy(flags, _mapcount, private, _count)) {
> > +   if (private >= ARRAY_LENGTH(zone.free_area)) {
> > +   ERRMSG("Invalid free page order: pfn=%llx, order=%lu, max order=%lu\n",
> > +  pfn, private, ARRAY_LENGTH(zone.free_area) - 1);
> > +   free(page_cache);
> > +   return FALSE;
> > +   }
> > nr_pages = 1 << private;
> > pfn_counter = &pfn_free;
> > }
> > --
> > 2.34.1
> >
> >
> 
> I found out when this can happen.
> 
> If e.g. a driver calls free_pages() and gives an order > max page order,
> then __free_one_page() stores the given invalid page order in the
> 'private' member of struct page and gives it back to the buddy
> allocator.
> 
> This is what actually happened in the dump I used to reproduce this issue
> with makedumpfile.

Good catch, though I have not been able to reproduce it so far.

But I wonder whether we really have no choice other than returning FALSE.
In other words, can't we skip (i.e. include) the invalid page with a
warning message?

As I said before, I think that capturing more pages than expected
will be better than not capturing a dump, and that is "robust"
against unexpected values.

Thanks,
Kazu




RE: [PATCH 2/2] makedumpfile: break loop after last dumpable page

2022-04-07 Thread  
-Original Message-
> Hi Kazu,
> 
> On Thu, 7 Apr 2022 06:43:00 +
> HAGIO KAZUHITO(萩尾 一仁)  wrote:
> 
> > -Original Message-
> > > Once the last dumpable page was processed there is no need to finish the
> > > loop to the last page. Thus exit early to improve performance.
> > >
> > > Signed-off-by: Philipp Rudo 
> > > ---
> > >  makedumpfile.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/makedumpfile.c b/makedumpfile.c
> > > index 2ef3458..c944d0e 100644
> > > --- a/makedumpfile.c
> > > +++ b/makedumpfile.c
> > > @@ -8884,6 +8884,12 @@ write_kdump_pages_cyclic(struct cache_data 
> > > *cd_header, struct cache_data *cd_pag
> > >
> > >   for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> > >
> > > + /*
> > > +  * All dumpable pages have been processed. No need to continue.
> > > +  */
> > > + if (num_dumped == info->num_dumpable)
> > > + break;
> >
> > This patch is likely to increase the possibility of failure to capture
> > /proc/kcore, although this is an unofficial functionality...
> >
> >   # makedumpfile -ld31 /proc/kcore kcore.snap
> >   # crash vmlinux kcore.snap
> >   ...
> > >   crash: page incomplete: kernel virtual address: 916fbeffed00  type: "pglist node_id"
> > >
> > > In cyclic mode, makedumpfile first calculates only info->num_dumpable [1] and
> > > frees the used bitmap, and later creates 2nd bitmap again [2] at this time.
> > >
> > >   create_dumpfile
> > >     create_dump_bitmap
> > >       info->num_dumpable = get_num_dumpable_cyclic()  <<-- [1]
> > >     writeout_dumpfile
> > >       write_kdump_pages_and_bitmap_cyclic
> > >         foreach cycle
> > >           create_2nd_bitmap  <<-- [2]
> > >           write_kdump_pages_cyclic
> >
> > So with live system, num_dumped can exceed info->num_dumpable.
> > If it stops at info->num_dumpable, some necessary data can be missed.
> 
> thanks for the explanation! I haven't considered that case and assumed
> info->num_dumpable is constant.
> 
> > Capturing /proc/kcore is still fragile and not considered enough, but
> > sometimes useful... when I want to capture a snapshot of memory.
> >
> > (the bitmap is allocated as block, so probably it's working as some buffer?)
> >
> > So I will merge the 1/2 patch, but personally would not like to merge
> > this patch.  How necessary is this?
> 
> I don't think this patch is absolutely necessary. It's only a small
> performance improvement but shouldn't have a huge impact. If you like
> you can drop the patch.

OK, I'll drop it for now.

Applied the 1/2 patch.
https://github.com/makedumpfile/makedumpfile/commit/3318c51c3c60ed192519cef6f6e72178151f88a8

Thanks!
Kazu




RE: [PATCH 2/2] makedumpfile: break loop after last dumpable page

2022-04-07 Thread  
-Original Message-
> -Original Message-
> > Once the last dumpable page was processed there is no need to finish the
> > loop to the last page. Thus exit early to improve performance.
> >
> > Signed-off-by: Philipp Rudo 
> > ---
> >  makedumpfile.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index 2ef3458..c944d0e 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -8884,6 +8884,12 @@ write_kdump_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_pag
> >
> > for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> >
> > +   /*
> > +* All dumpable pages have been processed. No need to continue.
> > +*/
> > +   if (num_dumped == info->num_dumpable)
> > +   break;
> 
> This patch is likely to increase the possibility of failure to capture
> /proc/kcore, although this is an unofficial functionality...
> 
>   # makedumpfile -ld31 /proc/kcore kcore.snap
>   # crash vmlinux kcore.snap
>   ...
>   crash: page incomplete: kernel virtual address: 916fbeffed00  type: "pglist node_id"
> 
> In cyclic mode, makedumpfile first calculates only info->num_dumpable [1] and
> frees the used bitmap, and later creates 2nd bitmap again [2] at this time.
> 
>   create_dumpfile
>     create_dump_bitmap
>       info->num_dumpable = get_num_dumpable_cyclic()  <<-- [1]
>     writeout_dumpfile
>       write_kdump_pages_and_bitmap_cyclic
>         foreach cycle
>           create_2nd_bitmap  <<-- [2]
>           write_kdump_pages_cyclic
> 
> So with live system, num_dumped can exceed info->num_dumpable.
> If it stops at info->num_dumpable, some necessary data can be missed.
> 
> Capturing /proc/kcore is still fragile and not considered enough, but
> sometimes useful... when I want to capture a snapshot of memory.
> 
> (the bitmap is allocated as block, so probably it's working as some buffer?)

sorry, mistake.
(page descriptors are aligned by block, ...)

> 
> So I will merge the 1/2 patch, but personally would not like to merge
> this patch.  How necessary is this?
> 
> Thanks,
> Kazu
> 
> 



RE: [PATCH 2/2] makedumpfile: break loop after last dumpable page

2022-04-07 Thread  
-Original Message-
> Once the last dumpable page was processed there is no need to finish the
> loop to the last page. Thus exit early to improve performance.
> 
> Signed-off-by: Philipp Rudo 
> ---
>  makedumpfile.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 2ef3458..c944d0e 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -8884,6 +8884,12 @@ write_kdump_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_pag
> 
>   for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> 
> + /*
> +  * All dumpable pages have been processed. No need to continue.
> +  */
> + if (num_dumped == info->num_dumpable)
> + break;

This patch is likely to increase the possibility of failure to capture
/proc/kcore, although this is an unofficial functionality...

  # makedumpfile -ld31 /proc/kcore kcore.snap
  # crash vmlinux kcore.snap
  ...
  crash: page incomplete: kernel virtual address: 916fbeffed00  type: "pglist node_id"

In cyclic mode, makedumpfile first calculates only info->num_dumpable [1] and
frees the bitmap it used, and later creates the 2nd bitmap again [2] while
writing each cycle.

  create_dumpfile
    create_dump_bitmap
      info->num_dumpable = get_num_dumpable_cyclic()  <<-- [1]
    writeout_dumpfile
      write_kdump_pages_and_bitmap_cyclic
        foreach cycle
          create_2nd_bitmap  <<-- [2]
          write_kdump_pages_cyclic

So on a live system, num_dumped can exceed info->num_dumpable.
If the loop stops at info->num_dumpable, some necessary data can be missed.

Capturing /proc/kcore is still fragile and has not been thought through
enough, but it is sometimes useful, e.g. when I want to capture a snapshot
of memory.

(the bitmap is allocated as block, so probably it's working as some buffer?)

So I will merge the 1/2 patch, but personally would not like to merge
this patch.  How necessary is this?

Thanks,
Kazu
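
The race described above can be shown with a toy model. This is only a sketch under assumptions: the two byte arrays stand in for the bitmaps built at [1] and [2], and count_dumpable() and write_pages() are hypothetical names, not makedumpfile functions.

```c
#include <assert.h>
#include <stddef.h>

/* Count dumpable entries; stands in for the estimate taken at [1]. */
static size_t count_dumpable(const unsigned char *bitmap, size_t n)
{
	size_t count = 0;

	assert(bitmap != NULL);
	for (size_t pfn = 0; pfn < n; pfn++)
		if (bitmap[pfn])
			count++;
	return count;
}

/*
 * Walk the bitmap created at [2] the way the write loop does and return
 * the number of pages written. With early_break set, the loop stops as
 * soon as num_dumped reaches the earlier estimate.
 */
static size_t write_pages(const unsigned char *bitmap, size_t n,
			  size_t num_dumpable, int early_break)
{
	size_t num_dumped = 0;

	assert(bitmap != NULL);
	for (size_t pfn = 0; pfn < n; pfn++) {
		if (early_break && num_dumped == num_dumpable)
			break;
		if (!bitmap[pfn])
			continue;
		num_dumped++;	/* the page would be read and written here */
	}
	return num_dumped;
}
```

If the second bitmap has gained dumpable pages since the estimate was taken, the early-break variant writes fewer pages than the bitmap marks as dumpable, which is the kind of gap that can surface later as an incomplete page in crash.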




RE: [PATCH 1/2] makedumpfile: omit unnecessary calls to print_progress

2022-04-07 Thread  
Hi Philipp,

-Original Message-
> Check first if a page is dumpable before printing the progress. Otherwise
> there is the chance that num_dumped % per == 0 at the beginning of a
> block of undumpable pages. In that case num_dumped isn't updated before
> the next dumpable page and thus print_progress is called for every page
> in that block.
> 
> This is especially annoying when the block is after the last dumpable
> page and thus num_dumped == info->num_dumpable. In that case
> print_progress will bypass its check to only print the progress once every
> second and thus flood the console with unnecessary prints. This can lead
> to a severe decrease in performance especially when the console is in
> line mode.

Good catch and improvement, will merge this.

Thanks,
Kazu

> 
> Signed-off-by: Philipp Rudo 
> ---
>  makedumpfile.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 14556db..2ef3458 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -8884,16 +8884,16 @@ write_kdump_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_pag
> 
>   for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> 
> - if ((num_dumped % per) == 0)
> - print_progress(PROGRESS_COPY, num_dumped, info->num_dumpable, &ts_start);
> -
>   /*
>    * Check the excluded page.
>    */
>   if (!is_dumpable(info->bitmap2, pfn, cycle))
>   continue;
> 
> + if ((num_dumped % per) == 0)
> + print_progress(PROGRESS_COPY, num_dumped, info->num_dumpable, &ts_start);
>   num_dumped++;
> +
>   if (!read_pfn(pfn, buf))
>   goto out;
> 
> --
> 2.35.1
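
The effect of moving the progress check can be counted with a small model. This is a sketch only: progress_calls() is a hypothetical helper, a byte array stands in for the bitmap, and a counter replaces the real print_progress() call.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how often the progress callback would fire while walking a bitmap.
 * check_first == 0 models the old order (progress test before the
 * dumpable test); check_first == 1 models the patched order.
 */
static size_t progress_calls(const unsigned char *bitmap, size_t n,
			     size_t per, int check_first)
{
	size_t num_dumped = 0, calls = 0;

	assert(per > 0);
	for (size_t pfn = 0; pfn < n; pfn++) {
		if (!check_first && (num_dumped % per) == 0)
			calls++;	/* old: fires even for excluded pages */
		if (!bitmap[pfn])
			continue;
		if (check_first && (num_dumped % per) == 0)
			calls++;	/* new: fires only for dumpable pages */
		num_dumped++;
	}
	return calls;
}
```

With two dumpable pages followed by six excluded ones and per = 2, the old order fires once for the first page and then once for every excluded page, because num_dumped stays at a multiple of per; the patched order fires once.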



RE: EXT: RE: crash: read error on type: "memory section root table"

2022-04-06 Thread  
-Original Message-
> -Original Message-
> > Hello,
> >
> > Suggested trace above gives following information after a crash -d 8 command:
> > <...>
> > kernel NR_CPUS: 2
> > 
> > 
> > read_diskdump: paddr/pfn: 12925820/12925 -> cache physical page: 12925000
> > GETBUF(328 -> 0)
> > FREEBUF(0)
> > GETBUF(328 -> 0)
> > FREEBUF(0)
> > PAGESIZE=4096
> > mem_section_size = 16384
> > NR_SECTION_ROOTS = 2048
> > NR_MEM_SECTIONS = 524288
> > SECTIONS_PER_ROOT = 256
> > SECTION_ROOT_MASK = 0xff
> > PAGES_PER_SECTION = 32768
> > 
> > 
> > read_diskdump: paddr/pfn: 12926db0/12926 -> cache physical page: 12926000
> >  > (FOE), 56017da26fd0>
> > 
> > read_diskdump: paddr/pfn: 3f7fc000/3f7fc -> cache physical page: 3f7fc000
> > crash: PAG3 - errno=2 r=0 pd.size=49
> > read_diskdump: READ_ERROR: cannot cache page: 3f7fc000
> > crash: read error: kernel virtual address: 904c7f7fc000  type: "memory section root table"
> 
> hmm, r=0 means end of file, can you check again whether pd.offset exceeds
> the dumpfile size?  If so, somehow the dumpfile is shorter than expected.
> 
> I think a RHEL-based kexec-tools does "sync" after makedumpfile, but
> can you check?

> > Note 2: The debug message of makedumpfile reports 'Write bytes : 17364943', but the file is ~5MB for the '-d 31' option.

This also looks like the same situation.

Does the cp command always work on your machine to capture /proc/vmcore?
e.g. with a RHEL-based kexec-tools:

  core_collector cp

The size of the vmcore should then be almost the same as the memory size.

Thanks,
Kazu




RE: EXT: RE: crash: read error on type: "memory section root table"

2022-04-06 Thread  
-Original Message-
> Hello,
> 
> Suggested trace above gives following information after a crash -d 8 command:
> <...>
> kernel NR_CPUS: 2
> 
> 
> read_diskdump: paddr/pfn: 12925820/12925 -> cache physical page: 12925000
> GETBUF(328 -> 0)
> FREEBUF(0)
> GETBUF(328 -> 0)
> FREEBUF(0)
> PAGESIZE=4096
> mem_section_size = 16384
> NR_SECTION_ROOTS = 2048
> NR_MEM_SECTIONS = 524288
> SECTIONS_PER_ROOT = 256
> SECTION_ROOT_MASK = 0xff
> PAGES_PER_SECTION = 32768
> 
> 
> read_diskdump: paddr/pfn: 12926db0/12926 -> cache physical page: 12926000
>  (FOE), 56017da26fd0>
> 
> read_diskdump: paddr/pfn: 3f7fc000/3f7fc -> cache physical page: 3f7fc000
> crash: PAG3 - errno=2 r=0 pd.size=49
> read_diskdump: READ_ERROR: cannot cache page: 3f7fc000
> crash: read error: kernel virtual address: 904c7f7fc000  type: "memory section root table"

hmm, r=0 means end of file, can you check again whether pd.offset exceeds
the dumpfile size?  If so, somehow the dumpfile is shorter than expected.

I think a RHEL-based kexec-tools does "sync" after makedumpfile, but
can you check?

Thanks,
Kazu
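
The check asked for above boils down to simple arithmetic on the file size. The following is a hedged sketch, not code from crash or makedumpfile: page_data_in_file() is a made-up name, and offset and size play the roles of pd.offset and pd.size. A read() that starts at or beyond the end of a file returns 0, which is consistent with the "r=0" in the debug output.

```c
#include <assert.h>	/* for the quick checks in the usage note */
#include <stdint.h>

/*
 * Hypothetical check: does page data that starts at `offset` and spans
 * `size` bytes fit inside a file of `file_size` bytes? Written so that
 * the arithmetic cannot overflow. Returns 1 if it fits, 0 otherwise.
 */
static int page_data_in_file(uint64_t file_size, uint64_t offset, uint64_t size)
{
	if (offset > file_size)
		return 0;
	return size <= file_size - offset;
}
```

For example, assert(page_data_in_file(100, 51, 49)) holds, while page_data_in_file(100, 52, 49) returns 0 because the last byte would lie one past the end of the file. The file size itself could be taken from stat() or from ls -l on the dumpfile.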



RE: crash: read error on type: "memory section root table"

2022-03-29 Thread  
-Original Message-
> Hello,
> 
> Sorry to cross post on both ML, I'm not sure which one would be the most suitable.
> 
> Issue on analysis with crash-7.3.1 on a Centos 8 machine:
> crash: read error: kernel virtual address: 8f4fff7fc000  type: "memory section root table"
> 
> Crash machine has a Rocky Linux 8.5 based kernel with following config options:
> - CONFIG_RANDOMIZE_BASE=y
> - CONFIG_RANDOMIZE_MEMORY=y
> - CONFIG_SPARSEMEM_MANUAL=y
> - CONFIG_SPARSEMEM=y
> - CONFIG_SPARSEMEM_EXTREME=y
> - CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> - CONFIG_KEXEC_CORE=y
> - CONFIG_KEXEC=y
> - CONFIG_KEXEC_FILE=y
> 
> Kexec-tools package is from Centos Stream repo: kexec-tools-2.0.20-68.el8.2.5ale.x86_64
> 
> /proc/vmcore is packaged with :
> /sbin/makedumpfile -D -d 0 -c --message-level 15 /proc/vmcore /tmpd/crashdump-${linux_ver}-${date_time}
> 
> At kernel panic, I get:
> Dumping memory to crash partition
> This may take a while, please wait...
> makedumpfile: version 1.7.0 (released on 8 Nov 2021)
> command line: /sbin/makedumpfile -D -d 0 -c --message-level 15 /proc/vmcore /tmpd/crashdump--20220329-1538
> 
> sadump: does not have partition header
> sadump: read dump device as unknown format
> sadump: unknown format
>    phys_start phys_end   virt_start virt_end
> LOAD[ 0]  800  9a2c000 8a40 8be2c000
> LOAD[ 1]   10 3b00 8f4fc010 8f4ffb00
> LOAD[ 2] 3d80 3e341000 8f4ffd80 8f4ffe341000
> LOAD[ 3] 3ed7b000 3eee2000 8f4ffed7b000 8f4ffeee2000
> LOAD[ 4] 3f63a000 3f80 8f4fff63a000 8f4fff80
> Linux kdump
> VMCOREINFO   :
>   OSRELEASE=4.18.0-348.12.2.el8_5-ale
>   PAGESIZE=4096
> page_size    : 4096
>   SYMBOL(init_uts_ns)=8b653600
>   SYMBOL(node_online_map)=8b7630a8
>   SYMBOL(swapper_pg_dir)=8b64c000
>   SYMBOL(_stext)=8a40
>   SYMBOL(vmap_area_list)=8b6a47a0
>   SYMBOL(mem_map)=8bd25828
>   SYMBOL(contig_page_data)=8b726600
>   SYMBOL(mem_section)=8f4fff7fc000

Hmm, I don't think I've ever seen a system that has both mem_map and mem_section, but
it looks like makedumpfile works fine, i.e. it correctly recognizes the memory model
as SPARSEMEM_EXTREME.

>   LENGTH(mem_section)=2048
>   SIZE(mem_section)=16
>   OFFSET(mem_section.section_mem_map)=0
>   SIZE(page)=64
>   SIZE(pglist_data)=5696
>   SIZE(zone)=1216
>   SIZE(free_area)=72
>   SIZE(list_head)=16
>   SIZE(nodemask_t)=8
>   OFFSET(page.flags)=0
>   OFFSET(page._refcount)=52
>   OFFSET(page.mapping)=24
>   OFFSET(page.lru)=8
>   OFFSET(page._mapcount)=48
>   OFFSET(page.private)=40
>   OFFSET(page.compound_dtor)=16
>   OFFSET(page.compound_order)=17
>   OFFSET(page.compound_head)=8
>   OFFSET(pglist_data.node_zones)=0
>   OFFSET(pglist_data.nr_zones)=4944
>   OFFSET(pglist_data.node_start_pfn)=4952
>   OFFSET(pglist_data.node_spanned_pages)=4968
>   OFFSET(pglist_data.node_id)=4976
>   OFFSET(zone.free_area)=192
>   OFFSET(zone.vm_stat)=1104
>   OFFSET(zone.spanned_pages)=96
>   OFFSET(free_area.free_list)=0
>   OFFSET(list_head.next)=0
>   OFFSET(list_head.prev)=8
>   OFFSET(vmap_area.va_start)=0
>   OFFSET(vmap_area.list)=40
>   LENGTH(zone.free_area)=11
>   SYMBOL(log_buf)=8b67d3c0
>   SYMBOL(log_buf_len)=8b67d3bc
>   SYMBOL(log_first_idx)=8bd1a3d8
>   SYMBOL(clear_idx)=8bd1a3a4
>   SYMBOL(log_next_idx)=8bd1a3c8
>   SIZE(printk_log)=16
>   OFFSET(printk_log.ts_nsec)=0
>   OFFSET(printk_log.len)=8
>   OFFSET(printk_log.text_len)=10
>   OFFSET(printk_log.dict_len)=12
>   LENGTH(free_area.free_list)=4
>   NUMBER(NR_FREE_PAGES)=0
>   NUMBER(PG_lru)=5
>   NUMBER(PG_private)=12
>   NUMBER(PG_swapcache)=9
>   NUMBER(PG_swapbacked)=18
>   NUMBER(PG_slab)=8
>   NUMBER(PG_head_mask)=32768
>   NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
>   NUMBER(HUGETLB_PAGE_DTOR)=2
>   NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
>   SYMBOL(alcatel_dump_info)=8b647000
>   NUMBER(phys_base)=-37748736
>   SYMBOL(init_top_pgt)=8b64c000
>   NUMBER(pgtable_l5_enabled)=0
>   KERNELOFFSET=940
>   NUMBER(KERNEL_IMAGE_SIZE)=1073741824
>   NUMBER(sme_mask)=0
>   CRASHTIME=1648561077
> 
> phys_base    : fdc0 (vmcoreinfo)
> 
> max_mapnr    : 3f800
> There is enough free memory to be done in one cycle.
> 
> Buffer size for the cyclic mode: 65024
> page_offset  : 8f4fc000 (pt_load)
> num of NODEs : 1
> Memory type  : SPARSEMEM_EX
> 
>    mem_map    pfn_start  pfn_end
> mem_map[   0] 8f4ffa00    0 8000
> mem_map[   1] 8f4ffa20 8000    1
> mem_map[   2] 8f4ffa40    1    18000
> mem_map[   3] 8f4ffa60    18000    2
> mem_map[   4] 8f4ffa80    2    28000
> mem_map[   5] 8f4ffaa0 

RE: [PATCH v2 0/4] makedumpfile: harden parsing of old prink buffer

2022-03-17 Thread  
-Original Message-
> On Wed, Mar 16, 2022 at 9:17 AM David Wysochanski  wrote:
> >
> > On Mon, Mar 14, 2022 at 12:04 PM Philipp Rudo  wrote:
> > >
> > > Hi,
> > >
> > > dumping the dmesg can cause an endless loop for the old printk mechanism (>
> > > v3.5.0 and < v5.10.0) when the log_buf got corrupted. This series fixes those
> > > cases by adding a cycle detection. The cycle detection is implemented in a
> > > generic way so that it can be reused in other parts of makedumpfile.
> > >
> > > Thanks
> > > Philipp
> > >
> > > v2:
> > > * Rename 'idx' to 'ptr'
> > > * Also print the non-loop part when a cycle was detected. Such a
> > >   situation can happen when log_buf wrapped around in the kernel
> > >   (log_first_idx != 0) and the corruption occurred on an
> > >   idx < log_first_idx.
> > > * Add patch 4 fixing a bug independent from the memory corruption but
> > >   found while investigating it.
> > >
> > > Philipp Rudo (4):
> > >   makedumpfile: add generic cycle detection
> > >   makedumpfile: use pointer arithmetics for dump_dmesg
> > >   makedumpfile: use cycle detection when parsing the prink log_buf
> > >   makedumpfile: print error when reading with unsupported compression
> > >
> > >  Makefile   |   2 +-
> > >  detect_cycle.c |  99 +
> > >  detect_cycle.h |  40 +++
> > >  makedumpfile.c | 131 -
> > >  4 files changed, 247 insertions(+), 25 deletions(-)
> > >  create mode 100644 detect_cycle.c
> > >  create mode 100644 detect_cycle.h
> > >
> > > --
> > > 2.35.1
> > >
> >
> > Thanks for doing v2.  Reviewing / testing this now...
> 
> You can add
> Reviewed-and-tested-by: Dave Wysochanski 

Thank you Philipp and Dave, for the improvement.

Applied with the small changes I sent.

Thanks,
Kazu


> 
> I tested this patchset against a large set of vmcores comparing output
> of "makedumpfile --dump-dmesg" with existing makedumpfile
> (kexec-tools-2.0.20-46.el8_4.3.x86_64) with the latest upstream plus
> these patches.  No difference in output was seen.
> 
> As advertised, this handles the loop condition when log_buf is
> corrupted.  And with the v2 version of patch 3, the dmesg output is
> the same as "crash log" on the same vmcore.  Also verified patch #4
> works as advertised - thanks for including a better error message
> there for users.
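
For readers unfamiliar with cycle detection, a generic version can be sketched as below. This uses Floyd's tortoise-and-hare on a chain of indices and is an assumption-laden illustration: it is not claimed to be the algorithm or interface used in detect_cycle.c, and chain_has_cycle() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * next[i] is the index that follows i; `end` marks the regular end of the
 * chain. Returns 1 if walking from `start` never reaches `end`, else 0.
 * Every value reached must be < n or equal to `end`.
 */
static int chain_has_cycle(const size_t *next, size_t n, size_t start,
			   size_t end)
{
	size_t slow = start, fast = start;

	for (;;) {
		/* The hare advances two steps per round. */
		for (int step = 0; step < 2; step++) {
			if (fast == end)
				return 0;
			assert(fast < n);
			fast = next[fast];
		}
		/* The tortoise advances one step along the same path. */
		slow = next[slow];
		if (fast != end && fast == slow)
			return 1;	/* the hare lapped the tortoise */
	}
}
```

In a corrupted log buffer, a record length that points back to an earlier record creates exactly such a loop, and a walker without this kind of check would never terminate.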


RE: [PATCH v2 4/4] makedumpfile: print error when reading with unsupported compression

2022-03-17 Thread  
-Original Message-
> Currently makedumpfile only checks if the required compression algorithm
> was enabled during build when compressing a dump but not when reading
> from one. This can lead to situations where one version of makedumpfile
> creates the dump using a compression algorithm that another version of
> makedumpfile doesn't support. When the second version now tries to, e.g.
> extract the dmesg from the dump it will fail with an error similar to
> 
>   # makedumpfile --dump-dmesg vmcore dmesg.txt
>   __vtop4_x86_64: Can't get a valid pgd.
>   readmem: Can't convert a virtual address(92e18284) to physical address.
>   readmem: type_addr: 0, addr:92e18284, size:390
>   check_release: Can't get the address of system_utsname.
> 
>   makedumpfile Failed.
> 
> That's because readpage_kdump_compressed{_parallel} does not return
> with an error if the page it is trying to read is compressed with an
> unsupported compression algorithm. Thus readmem copies random data from
> the (uninitialized) cachebuf to its caller, which causes the error
> above.
> 
> Fix this by checking if the required compression algorithm is supported
> in readpage_kdump_compressed{_parallel} and print a proper error message
> if it isn't.
> 
> Reported-by: Dave Wysochanski 
> Signed-off-by: Philipp Rudo 
> ---
>  makedumpfile.c | 56 ++
>  1 file changed, 48 insertions(+), 8 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index b7ac999..56f3b6c 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -865,9 +865,14 @@ readpage_kdump_compressed(unsigned long long paddr, void *bufptr)
>   ERRMSG("Uncompress failed: %d\n", ret);
>   goto out_error;
>   }
> + } else if ((pd.flags & DUMP_DH_COMPRESSED_LZO)) {
>  #ifdef USELZO
> - } else if (info->flag_lzo_support
> -&& (pd.flags & DUMP_DH_COMPRESSED_LZO)) {
> + if (!info->flag_lzo_support) {
> + ERRMSG("lzo compression unsupported\n");
> + out = FALSE;

I removed those "out = FALSE;" lines because it's already initialized to FALSE.

Thanks,
Kazu

> + goto out_error;
> + }
> +
>   retlen = info->page_size;
>   ret = lzo1x_decompress_safe((unsigned char *)buf, pd.size,
>   (unsigned char *)bufptr, &retlen,
> @@ -876,9 +881,14 @@ readpage_kdump_compressed(unsigned long long paddr, void 
> *bufptr)
>   ERRMSG("Uncompress failed: %d\n", ret);
>   goto out_error;
>   }
> +#else
> + ERRMSG("lzo compression unsupported\n");
> + ERRMSG("Try `make USELZO=on` when building.\n");
> + out = FALSE;
> + goto out_error;
>  #endif
> -#ifdef USESNAPPY
>   } else if ((pd.flags & DUMP_DH_COMPRESSED_SNAPPY)) {
> +#ifdef USESNAPPY
> 
>   ret = snappy_uncompressed_length(buf, pd.size, (size_t *)&retlen);
>   if (ret != SNAPPY_OK) {
> @@ -891,14 +901,24 @@ readpage_kdump_compressed(unsigned long long paddr, void *bufptr)
>   ERRMSG("Uncompress failed: %d\n", ret);
>   goto out_error;
>   }
> +#else
> + ERRMSG("snappy compression unsupported\n");
> + ERRMSG("Try `make USESNAPPY=on` when building.\n");
> + out = FALSE;
> + goto out_error;
>  #endif
> -#ifdef USEZSTD
>   } else if ((pd.flags & DUMP_DH_COMPRESSED_ZSTD)) {
> +#ifdef USEZSTD
>   ret = ZSTD_decompress(bufptr, info->page_size, buf, pd.size);
>   if (ZSTD_isError(ret) || (ret != info->page_size)) {
>   ERRMSG("Uncompress failed: %d\n", ret);
>   goto out_error;
>   }
> +#else
> + ERRMSG("zstd compression unsupported\n");
> + ERRMSG("Try `make USEZSTD=on` when building.\n");
> + out = FALSE;
> + goto out_error;
>  #endif
>   }
> 
> @@ -964,9 +984,14 @@ readpage_kdump_compressed_parallel(int fd_memory, unsigned long long paddr,
>   ERRMSG("Uncompress failed: %d\n", ret);
>   goto out_error;
>   }
> + } else if ((pd.flags & DUMP_DH_COMPRESSED_LZO)) {
>  #ifdef USELZO
> - } else if (info->flag_lzo_support
> -&& (pd.flags & DUMP_DH_COMPRESSED_LZO)) {
> + if (!info->flag_lzo_support) {
> + ERRMSG("lzo compression unsupported\n");
> + out = FALSE;
> + goto out_error;
> + }
> +
>   retlen = info->page_size;
>   ret = lzo1x_decompress_safe((unsigned char *)buf, pd.size,
>   (unsigned char *)bufptr, &retlen,
> @@ -975,9 +1000,14 @@ 
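
The structural idea of the patch, moving the flag test outside the #ifdef so that a compiled-out algorithm yields an error instead of falling through, can be shown in isolation. Everything named here is hypothetical: the flag bits, the HAVE_ALGO_B switch and decode_page() are made up for this sketch and do not match makedumpfile's definitions.

```c
#include <assert.h>	/* for the quick checks in the usage note */
#include <stdio.h>

/* Hypothetical flag bits, for illustration only. */
#define PAGE_COMPRESSED_A 0x1u
#define PAGE_COMPRESSED_B 0x2u
/* HAVE_ALGO_B is deliberately left undefined: algorithm B is not built in. */

/*
 * Returns 1 if this build can decode a page with the given flags,
 * 0 otherwise. The flag test sits outside the #ifdef, so a page that
 * needs a compiled-out algorithm is reported instead of being skipped.
 */
static int decode_page(unsigned int flags)
{
	if (flags & PAGE_COMPRESSED_A) {
		/* decompression with algorithm A would happen here */
		return 1;
	} else if (flags & PAGE_COMPRESSED_B) {
#ifdef HAVE_ALGO_B
		/* decompression with algorithm B would happen here */
		return 1;
#else
		fprintf(stderr, "algorithm B compression unsupported\n");
		return 0;
#endif
	}
	/* uncompressed page: nothing to decode */
	return 1;
}
```

In this build, assert(decode_page(PAGE_COMPRESSED_B) == 0) holds. With the test inside the #ifdef, as before the patch, the same page would have reached the final return and been treated as successfully read.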

RE: [PATCH v2 2/4] makedumpfile: use pointer arithmetics for dump_dmesg

2022-03-17 Thread  
-Original Message-
> When parsing the printk buffer for the old printk mechanism (> v3.5.0 and
> < v5.10.0) a log entry is currently specified by the offset into the
> buffer where the entry starts. Change this to use pointers instead.
> This is done in preparation for using the new cycle detection mechanism.
> 
> Signed-off-by: Philipp Rudo 
> ---
>  makedumpfile.c | 29 +
>  1 file changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 7ed9756..9a05c96 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -5482,13 +5482,10 @@ dump_log_entry(char *logptr, int fp, const char *file_name)
>   * get log record by index; idx must point to valid message.
>   */
>  static char *
> -log_from_idx(unsigned int idx, char *logbuf)
> +log_from_ptr(char *logptr, char *logbuf)
>  {
> - char *logptr;
>   unsigned int msglen;
> 
> - logptr = logbuf + idx;
> -
>   /*
>* A length == 0 record is the end of buffer marker.
>* Wrap around and return the message at the start of
> @@ -5502,14 +5499,13 @@ log_from_idx(unsigned int idx, char *logbuf)
>   return logptr;
>  }
> 
> -static long
> -log_next(unsigned int idx, char *logbuf)
> +static void *
> +log_next(void *_logptr, void *_logbuf)
>  {
> - char *logptr;
> + char *logptr = _logptr;
> + char *logbuf = _logbuf;
>   unsigned int msglen;
> 
> - logptr = logbuf + idx;
> -

I added the following change to log_from_ptr() and log_next() to avoid
- assigning to the argument (possible but unusual)
- the unnecessary local variables

--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5494,16 +5494,14 @@ log_from_ptr(char *logptr, char *logbuf)
 
msglen = USHORT(logptr + OFFSET(printk_log.len));
if (!msglen)
-   logptr = logbuf;
+   return logbuf;
 
return logptr;
 }
 
 static void *
-log_next(void *_logptr, void *_logbuf)
+log_next(void *logptr, void *logbuf)
 {
-   char *logptr = _logptr;
-   char *logbuf = _logbuf;
unsigned int msglen;
 
/*

Thanks,
Kazu

>   /*
>* A length == 0 record is the end of buffer marker. Wrap around and
>* read the message at the start of the buffer as *this* one, and
> @@ -5519,10 +5515,10 @@ log_next(unsigned int idx, char *logbuf)
>   msglen = USHORT(logptr + OFFSET(printk_log.len));
>   if (!msglen) {
>   msglen = USHORT(logbuf + OFFSET(printk_log.len));
> - return msglen;
> + return logbuf + msglen;
>   }
> 
> - return idx + msglen;
> + return logptr + msglen;
>  }
> 
>  int
> @@ -5530,11 +5526,12 @@ dump_dmesg()
>  {
>   int log_buf_len, length_log, length_oldlog, ret = FALSE;
>   unsigned long index, log_buf, log_end;
> - unsigned int idx, log_first_idx, log_next_idx;
> + unsigned int log_first_idx, log_next_idx;
>   unsigned long long first_idx_sym;
>   unsigned long log_end_2_6_24;
>   unsigned  log_end_2_6_25;
>   char *log_buffer = NULL, *log_ptr = NULL;
> + char *ptr;
> 
>   /*
>* log_end has been changed to "unsigned" since linux-2.6.25.
> @@ -5681,13 +5678,13 @@ dump_dmesg()
>   ERRMSG("Can't open output file.\n");
>   goto out;
>   }
> - idx = log_first_idx;
> - while (idx != log_next_idx) {
> - log_ptr = log_from_idx(idx, log_buffer);
> + ptr = log_buffer + log_first_idx;
> + while (ptr != log_buffer + log_next_idx) {
> + log_ptr = log_from_ptr(ptr, log_buffer);
>   if (!dump_log_entry(log_ptr, info->fd_dumpfile,
>   info->name_dumpfile))
>   goto out;
> - idx = log_next(idx, log_buffer);
> + ptr = log_next(ptr, log_buffer);
>   }
>   if (!close_files_for_creating_dumpfile())
>   goto out;
> --
> 2.35.1
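
For readers following the diff, the pointer-based traversal can be sketched
on its own. This is a simplified illustration, not the makedumpfile code:
the record layout (a 16-bit length at offset 0, standing in for
printk_log.len) and the helpers rec_len() and count_records() are
assumptions made for the example. Only the wraparound rule (a record with
length 0 means "continue at the start of the buffer") is taken from the
patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: each record starts with its total length (16 bit). */
static uint16_t rec_len(const char *rec)
{
	uint16_t len;

	memcpy(&len, rec, sizeof(len));
	return len;
}

/* Return the record to dump; a length == 0 record is the wrap marker. */
static const char *log_from_ptr(const char *logptr, const char *logbuf)
{
	if (!rec_len(logptr))
		return logbuf;
	return logptr;
}

/* Return a pointer to the record that follows the one at logptr. */
static const char *log_next(const char *logptr, const char *logbuf)
{
	uint16_t len = rec_len(logptr);

	if (!len)	/* wrap marker: skip the record at the buffer start */
		return logbuf + rec_len(logbuf);
	return logptr + len;
}

/* Walk from offset first to offset next and count the records visited. */
static int count_records(const char *logbuf, size_t buflen,
			 size_t first, size_t next)
{
	const char *ptr = logbuf + first;
	int n = 0;

	assert(first < buflen && next < buflen);
	while (ptr != logbuf + next) {
		(void)log_from_ptr(ptr, logbuf);	/* record that would be dumped */
		n++;
		ptr = log_next(ptr, logbuf);
	}
	return n;
}
```

Note that this loop has the same weakness the next patch in the series
addresses: a corrupted length field can keep it from ever reaching the end
pointer.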

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH 3/3] makedumpfile: use cycle detection when parsing the prink log_buf

2022-03-11 Thread  
-Original Message-
> Hi Dave,
> 
> On Wed, 9 Mar 2022 03:48:12 -0500
> David Wysochanski  wrote:
> 
> > On Mon, Mar 7, 2022 at 12:23 PM Philipp Rudo  wrote:
> > >
> > > The old printk mechanism (> v3.5.0 and < v5.10.0) had a fixed size
> > > buffer (log_buf) that contains all messages. The location for the next
> > > message is stored in log_next_idx. In case the log_buf runs full
> > > log_next_idx wraps around and starts overwriting old messages at the
> > > beginning of the buffer. The wraparound is denoted by a message with
> > > msg->len == 0.
> > >
> > > Following the behavior described above blindly in makedumpfile is
> > > dangerous as e.g. a memory corruption could overwrite (parts of) the
> > > log_buf. If the corruption adds a message with msg->len == 0 this leads
> > > to an endless loop when dumping the dmesg with makedumpfile appending
> > > the messages up to the corruption over and over again to the output file
> > > > until the file system is full. Fix this by using cycle detection and abort
> > > > once a cycle is detected.
> > >
> > > While at it also verify that the index is within the log_buf and thus
> > > guard against corruptions with msg->len != 0.
> > >
> > > Fixes: 36c2458 ("[PATCH] --dump-dmesg fix for post 3.5 kernels.")
> > > Reported-by: Audra Mitchell 
> > > Suggested-by: Dave Wysochanski 
> > > Signed-off-by: Philipp Rudo 
> > > ---
> > >  makedumpfile.c | 42 --
> > >  1 file changed, 40 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/makedumpfile.c b/makedumpfile.c
> > > index edf128b..2738d16 100644
> > > --- a/makedumpfile.c
> > > +++ b/makedumpfile.c
> > > @@ -15,6 +15,7 @@
> > >   */
> > >  #include "makedumpfile.h"
> > >  #include "print_info.h"
> > > +#include "detect_cycle.h"
> > >  #include "dwarf_info.h"
> > >  #include "elf_info.h"
> > >  #include "erase_info.h"
> > > @@ -5528,10 +5529,11 @@ dump_dmesg()
> > > unsigned long index, log_buf, log_end;
> > > unsigned int log_first_idx, log_next_idx;
> > > unsigned long long first_idx_sym;
> > > +   struct detect_cycle *dc = NULL;
> > > unsigned long log_end_2_6_24;
> > > unsigned  log_end_2_6_25;
> > > char *log_buffer = NULL, *log_ptr = NULL;
> > > -   char *idx;
> > > +   char *idx, *next_idx;
> > >
> >
> > Would be clearer to call the above "next_ptr" rather than "next_idx"
> > (as far as I know 'index' refers to 32-bit quantities).
> > Same comment about the "idx" variable, maybe "ptr"?
> 
> Hmm... I stuck with idx as the kernel uses the same name. In my
> opinion using the same name makes it easier to see that both variables
> contain the same "quantity" even when the implementation is slightly
> different (in the kernel idx is the offset in the log_buf just like it
> was in makedumpfile before patch 2). But my opinion isn't very strong
> on the naming. So if the consensus is to rename the variable, I'm open
> to doing it.
> 
> @Kazu: Do you have a preference here?

Personally I think it will be more readable to use "*ptr" for pointers
in this case, as Dave says.

Thanks,
Kazu

> 
> Same for your comments in patch 2.
> 
> > > /*
> > >  * log_end has been changed to "unsigned" since linux-2.6.25.
> > > @@ -5679,12 +5681,47 @@ dump_dmesg()
> > > goto out;
> > > }
> > > idx = log_buffer + log_first_idx;
> > > +   dc = dc_init(idx, log_buffer, log_next);
> > > while (idx != log_buffer + log_next_idx) {
> > > log_ptr = log_from_idx(idx, log_buffer);
> > > if (!dump_log_entry(log_ptr, info->fd_dumpfile,
> > > info->name_dumpfile))
> > > goto out;
> > > -   idx = log_next(idx, log_buffer);
> > > +   if (dc_next(dc, (void **) _idx)) {
> > > +   unsigned long len;
> > > +   char *first;
> > > +
> > > +   /* Clear everything we have already 
> > > written... */
> > > +   ftruncate(info->fd_dumpfile, 0);
> > > +   lseek(info->fd_dumpfile, 0, SEEK_SET);
> > > +
> >
> > I'm not sure I understand why you're doing this.
> 
> That's because in every pass of the loop the entry is written to the
> output file. So most likely the output file already contains more
> than needed. Thus we somehow need to trim the file to the end of
> the cycle. Thus I decided to go brute force and simply clear all
> content from the file and write all we need a second time.
> 
> In our very specific case there are 2704 lines to the corruption. That
> means that the function has already written 6798 (4095 + 2704 - 1) lines
> till it detects the cycle.
> 
> > > +   /* ...and only write up to the 
> > > corruption. */
> > > +  

RE: [PATCH v2 1/1] Simplify the generation of man pages

2022-03-08 Thread  
-Original Message-
> Use `sed` to simplify the man pages generation. Keep the .in files
> intact during make and generate the actual man pages with sed.
> Additionally package tools already gz the man pages during install so it
> doesn't really need to do that during make and it breaks reproducibility
> of the package due to timestamps on files.
> 
> Motivation: https://reproducible-builds.org
> 
> Signed-off-by: Leonidas Spyropoulos 

Thank you for the good improvement!

Applied with adding the man files to .gitignore:
https://github.com/makedumpfile/makedumpfile/commit/2169de66ecd4504a3e69e0be0330f492f966ce5e

(and thank you Guilherme for your review and test.)

Kazu




RE: [PATCH] sadump, kaslr: fix failure of calculating kaslr_offset

2022-01-25 Thread  
-Original Message-
> On kernels v5.8 or later, makedumpfile fails for memory dumps in the
> sadump-related formats as follows:
> 
> # makedumpfile -f -l -d 31 -x ./vmlinux /dev/sdd4 /root/vmcore-ld31
> __vtop4_x86_64: Can't get a valid pud_pte.
> ...110 lines of the same message...
> __vtop4_x86_64: Can't get a valid pud_pte.
> calc_kaslr_offset: failed to calculate kaslr_offset and phys_base; default to 0
> readmem: type_addr: 1, addr:85411858, size:8
> __vtop4_x86_64: Can't get pgd (page_dir:85411858).
> readmem: Can't convert a virtual address(059be980) to physical address.
> readmem: type_addr: 0, addr:059be980, size:1024
> cpu_online_mask_init: Can't read cpu_online_mask memory.
> 
> makedumpfile Failed.
> 
> This is caused by the kernel commit 9d06c4027f21 ("x86/entry: Convert
> Divide Error to IDTENTRY") that renamed divide_error to
> asm_exc_divide_error, breaking logic for calculating kaslr offset.
> 
> Fix this by adding initialization of asm_exc_divide_error.
> 
> Signed-off-by: HATAYAMA Daisuke 
> ---
>  makedumpfile.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a51bdaf..7ed9756 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -1667,6 +1667,8 @@ get_symbol_info(void)
>   SYMBOL_INIT(cur_cpu_spec, "cur_cpu_spec");
> 
>   SYMBOL_INIT(divide_error, "divide_error");
> + if (SYMBOL(divide_error) == NOT_FOUND_SYMBOL)
> + SYMBOL_INIT(divide_error, "asm_exc_divide_error");
>   SYMBOL_INIT(idt_table, "idt_table");
>   SYMBOL_INIT(saved_command_line, "saved_command_line");
>   SYMBOL_INIT(pti_init, "pti_init");
> --
> 2.31.1

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/59b1726fbcc251155140c8a1972384498fee4daf

Kazu
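
The fix follows a common pattern: try the old symbol name first and fall
back to the renamed one when it is missing. A minimal standalone sketch of
that pattern is shown below. It does not use makedumpfile's SYMBOL_INIT()
machinery; struct sym, lookup(), lookup_first() and the value chosen for
NOT_FOUND_SYMBOL are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOT_FOUND_SYMBOL 0UL	/* sentinel used by this sketch only */

/* Hypothetical symbol table entry. */
struct sym {
	const char *name;
	unsigned long addr;
};

/* Look up one name; returns NOT_FOUND_SYMBOL if it is absent. */
static unsigned long lookup(const struct sym *tab, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].addr;
	return NOT_FOUND_SYMBOL;
}

/* Try each candidate name in order and return the first match. */
static unsigned long lookup_first(const struct sym *tab, size_t n,
				  const char *const *names, size_t n_names)
{
	size_t i;

	assert(names != NULL);
	for (i = 0; i < n_names; i++) {
		unsigned long addr = lookup(tab, n, names[i]);

		if (addr != NOT_FOUND_SYMBOL)
			return addr;
	}
	return NOT_FOUND_SYMBOL;
}
```

Called with the candidates "divide_error" and "asm_exc_divide_error", in
that order, the same caller works for symbol tables from before and after
the rename.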




RE: [RFC PATCH] makedumpfile: add userinfo elf section 0/4]

2021-12-14 Thread  
Hi Ivan,

sorry for the delay.

-Original Message-
> This patchset proposes a feature to add user-specific information along
> with the dumpfile. One use case could be platform build information
> containing debug file addresses, sources of the daily build, some
> platform-related information and more. It greatly simplifies the
> debugging process, since an engineer doesn't need to spend time
> finding out which platform the core file is related to, where to get
> the debugging counterpart and so on. As user info can be added along with
> the dumpfile, it frees the platform from creating a tar archive, spending
> double the space, or using specific tools.

Hmm, currently I don't think it is worth giving makedumpfile such a
memo-like feature, given the extra code and complexity it would bring.
If makedumpfile can access that helpful data in the 2nd kernel, the kdump
tool will be able to place it next to the dumpfile.  There will then be
two files to handle, but I'm not sure what the problem with that is.

Thanks,
Kazu

> Since this information is sorted and
> selected by user it can't be standardized and should be placed in some
> special generic section. This patchset roughly proposes the variant when
> user information is placed/retrieved as subsection of note elf
> section.
> 
> Ivan Khoronzhuk (4):
>   makedumpfile: rename check_dump_file() on check_file_is_writable()
>   elf: add new "userinfo" ELF section to traverse debug information
>   elf_info: make int note_descsz() and offset_next_note() public
>   elf: add ability to read the userinfo data from note segment
> 
>  elf_info.c |   4 +-
>  elf_info.h |   5 +
>  makedumpfile.c | 393 -
>  makedumpfile.h |  12 ++
>  4 files changed, 404 insertions(+), 10 deletions(-)
> 
> --
> 2.20.1



[ANNOUNCE] makedumpfile 1.7.0

2021-11-07 Thread  
Hi,

I'm pleased to announce the release of makedumpfile 1.7.0.
Thank you everyone for your help to maintain the tool.

Download:
The latest makedumpfile can be downloaded from the following URL.
  https://github.com/makedumpfile/makedumpfile/releases

New features:
- Zstandard (zstd) compression support
- New -L option to limit output file size
- Support of kernels up to v5.15 (x86_64)

Special thanks to Philipp Rudo, who has been working on hardening
makedumpfile.  This release contains a part of the work, such as
buffer overflow fixes, removal of strcpy() and so on.  We will continue
this work in the next release as well.
Note that for some reasons the work was done internally, so please
check out the commits if you are interested.

Commits since v1.6.9:
06ef8e2 [v1.7.0] Update version (Kazuhito Hagio)
4541b18 [PATCH] Support newer kernels up to v5.15 (Kazuhito Hagio)
a32da03 [PATCH 14/14] fix type mismatch in parse_line() (Philipp Rudo)
d155b76 [PATCH 13/14] fix out of range comparison in count_bits() (Philipp Rudo)
247afd6 [PATCH 12/14] fix potential dereference of NULL in is_cyclic_region() (Philipp Rudo)
2651d57 [PATCH 11/14] fix memory leak in init_xen_crash_info() (Philipp Rudo)
4577898 [PATCH 10/14] fix compiler warning in ordinal() (Philipp Rudo)
35e31db [PATCH 09/14] remove use of strcpy() in __load_module_symbol() (Philipp Rudo)
faea305 [PATCH 08/14] remove use of strcpy() in strip_beginning_whitespace() (Philipp Rudo)
cc11fcd [PATCH 07/14] remove use of strcpy() in eta_to_human_short() (Philipp Rudo)
a854064 [PATCH 06/14] remove use of strcpy() in open_dump_bitmap() (Philipp Rudo)
6b69167 [PATCH 05/14] make sure sc.sc_buf is initalized (Philipp Rudo)
d813c3d [PATCH 04/14] fix buffer overflow in init_save_control() (Philipp Rudo)
38c1428 [PATCH 03/14] fix buffer overflow in DumpInfo in read_vmcoreinfo_basic_info() (Philipp Rudo)
d5c95fc [PATCH 02/14] fix buffer overflow on stack in copy_vmcoreinfo() (Philipp Rudo)
4ce9d2c [PATCH 01/14] Makefile: add DEBUG option (Philipp Rudo)
afd0a6d [PATCH] Add Zstandard compression support (Tao Liu)
bccc39c [PATCH v2] support --partial-dmesg on 5.10+ kernels (Ivan Delalande)
5684d4c [PATCH 2/2] Fix --dry-run for incomplete dumps (Philipp Rudo)
6e53b9d [PATCH 1/2] Fix bad file descriptor error when using --dry-run (Philipp Rudo)
efef023 [PATCH] Cleanup: Remove verbose names and code in calculate_log_buf_len() (Kazuhito Hagio)
f0cfa86 [PATCH v2 3/3] Add -L option to limit output file size (Benjamin Poirier)
2f5697e [PATCH v2 2/3] Make --dump-dmesg option use write_and_check_space() (Benjamin Poirier)
047b921 [PATCH v2 1/3] Fix off-by-one error when checking cache_size (Benjamin Poirier)
9df519d [PATCH] arm: fix page_offset determination (Grzegorz Jaszczyk)
9a6f589 [PATCH] check for invalid physical address of /proc/kcore when making ELF dumpfile (Tao Liu)
38d921a [PATCH] check for invalid physical address of /proc/kcore when finding max_paddr (Coiby Xu)
6464568 [PATCH] Increase SECTION_MAP_LAST_BIT to 5 (Kazuhito Hagio)
0bb2f15 [PATCH] Mark start of 1.7.0 development phase with version 1.6.9++ (Kazuhito Hagio)

Description of makedumpfile:
The makedumpfile is a tool for creating a dumpfile from /proc/vmcore
with filtering out unnecessary pages for analysis and compressing the
remaining pages, in order to shorten the size of the dumpfile and the
time of creating it.
https://github.com/makedumpfile/makedumpfile

Thanks,
Kazu



RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-22 Thread  
Hi Tao Liu,

Merged them into a patch and applied:
https://github.com/makedumpfile/makedumpfile/commit/afd0a6db2a0543217f8e46955a1b44b71f7e7ef3

Thanks,
Kazu

> -Original Message-
> Hi Tao Liu,
> 
> -Original Message-
> > Hello Kazu,
> >
> > Sorry for the late reply.
> >
> > On Fri, Sep 17, 2021 at 07:03:50AM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > > -Original Message-
> > > > -Original Message-
> > > > > > > > This patch set adds ZSTD compression support to makedumpfile. 
> > > > > > > > With ZSTD compression
> > > > > > > > support, the vmcore dump size and time consumption can have a 
> > > > > > > > better balance than
> > > > > > > > zlib/lzo/snappy.
> > > > > > > >
> > > > > > > > How to build:
> > > > > > > >
> > > > > > > >   Build using make:
> > > > > > > > $ make USEZSTD=on
> > > > > > > >
> > > > > > > > Performance Comparison:
> > > > > > > >
> > > > > > > >   How to measure
> > > > > > > >
> > > > > > > > I took a x86_64 machine which had 4T memory, and the 
> > > > > > > > compression level
> > > > > > > > range from (-3 to 4) for ZSTD, as well as zlib/lzo/snappy 
> > > > > > > > compression.
> > > > > > > > All testing was done by makedumpfile single thread mode.
> > > > > > > >
> > > > > > > > As for compression performance testing, in order to avoid 
> > > > > > > > the performance
> > > > > > > > bottle neck of disk I/O, I used the following makedumpfile 
> > > > > > > > cmd, which took
> > > > > > > > lzo compression as an example. "--dry-run" will not write 
> > > > > > > > any data to disk,
> > > > > > > > "--show-stat" will output the vmcore size after 
> > > > > > > > compression, and the time
> > > > > > > > consumption can be collected from the output logs.
> > > > > > > >
> > > > > > > > $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run 
> > > > > > > > --show-stat
> > > > > > > >
> > > > > > > >
> > > > > > > > > As for decompression performance testing, I only tested the (-d 31) case,
> > > > > > > > > because the vmcore size of the (-d 0) case is too big to fit on the disk. In
> > > > > > > > > addition, reading an oversized file from disk will hit the disk I/O
> > > > > > > > > bottleneck.
> > > > > > > >
> > > > > > > > I triggered a kernel crash and collected a vmcore. Then I 
> > > > > > > > converted the
> > > > > > > > vmcore into specific compression format using the following 
> > > > > > > > makedumpfile
> > > > > > > > cmd, which would get a lzo format vmcore as an example:
> > > > > > > >
> > > > > > > > $ makedumpfile -l vmcore vmcore.lzo
> > > > > > > >
> > > > > > > > After all vmcores were ready, I used the following cmd to 
> > > > > > > > perform the
> > > > > > > > decompression, the time consumption can be collected from 
> > > > > > > > the logs.
> > > > > > > >
> > > > > > > > $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> > > > > > > >
> > > > > > > >
> > > > > > > >   Result charts
> > > > > > > >
> > > > > > > > For compression:
> > > > > > > >
> > > > > > > > makedumpfile -d31   |  makedumpfile 
> > > > > > > > -d0
> > > > > > > > Compression timevmcore size |  Compression 
> > > > 

RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-21 Thread  
Hi Tao Liu,

-Original Message-
> Hello Kazu,
> 
> Sorry for the late reply.
> 
> On Fri, Sep 17, 2021 at 07:03:50AM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > -Original Message-
> > > -Original Message-
> > > > > > > This patch set adds ZSTD compression support to makedumpfile. 
> > > > > > > With ZSTD compression
> > > > > > > support, the vmcore dump size and time consumption can have a 
> > > > > > > better balance than
> > > > > > > zlib/lzo/snappy.
> > > > > > >
> > > > > > > How to build:
> > > > > > >
> > > > > > >   Build using make:
> > > > > > > $ make USEZSTD=on
> > > > > > >
> > > > > > > Performance Comparison:
> > > > > > >
> > > > > > >   How to measure
> > > > > > >
> > > > > > > I took a x86_64 machine which had 4T memory, and the 
> > > > > > > compression level
> > > > > > > range from (-3 to 4) for ZSTD, as well as zlib/lzo/snappy 
> > > > > > > compression.
> > > > > > > All testing was done by makedumpfile single thread mode.
> > > > > > >
> > > > > > > As for compression performance testing, in order to avoid the 
> > > > > > > performance
> > > > > > > bottle neck of disk I/O, I used the following makedumpfile 
> > > > > > > cmd, which took
> > > > > > > lzo compression as an example. "--dry-run" will not write any 
> > > > > > > data to disk,
> > > > > > > "--show-stat" will output the vmcore size after compression, 
> > > > > > > and the time
> > > > > > > consumption can be collected from the output logs.
> > > > > > >
> > > > > > > $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run 
> > > > > > > --show-stat
> > > > > > >
> > > > > > >
> > > > > > > > As for decompression performance testing, I only tested the (-d 31) case,
> > > > > > > > because the vmcore size of the (-d 0) case is too big to fit on the disk. In
> > > > > > > > addition, reading an oversized file from disk will hit the disk I/O
> > > > > > > > bottleneck.
> > > > > > >
> > > > > > > I triggered a kernel crash and collected a vmcore. Then I 
> > > > > > > converted the
> > > > > > > vmcore into specific compression format using the following 
> > > > > > > makedumpfile
> > > > > > > cmd, which would get a lzo format vmcore as an example:
> > > > > > >
> > > > > > > $ makedumpfile -l vmcore vmcore.lzo
> > > > > > >
> > > > > > > After all vmcores were ready, I used the following cmd to 
> > > > > > > perform the
> > > > > > > decompression, the time consumption can be collected from the 
> > > > > > > logs.
> > > > > > >
> > > > > > > $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> > > > > > >
> > > > > > >
> > > > > > >   Result charts
> > > > > > >
> > > > > > > For compression:
> > > > > > >
> > > > > > > >         makedumpfile -d31              |  makedumpfile -d0
> > > > > > > >         Compression time  vmcore size  |  Compression time  vmcore size
> > > > > > > > zstd-3  325.516446        5285179595   |  8205.452248       51715430204
> > > > > > > > zstd-2  332.069432        5319726604   |  8057.381371       51732062793
> > > > > > > > zstd-1  309.942773        5730516274   |  8138.060786       52136191571
> > > > > > > > zstd0   439.773076        4673

RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-17 Thread  
-Original Message-
> -Original Message-
> > > > > This patch set adds ZSTD compression support to makedumpfile. With 
> > > > > ZSTD compression
> > > > > support, the vmcore dump size and time consumption can have a better 
> > > > > balance than
> > > > > zlib/lzo/snappy.
> > > > >
> > > > > How to build:
> > > > >
> > > > >   Build using make:
> > > > > $ make USEZSTD=on
> > > > >
> > > > > Performance Comparison:
> > > > >
> > > > >   How to measure
> > > > >
> > > > > I took a x86_64 machine which had 4T memory, and the compression 
> > > > > level
> > > > > range from (-3 to 4) for ZSTD, as well as zlib/lzo/snappy 
> > > > > compression.
> > > > > All testing was done by makedumpfile single thread mode.
> > > > >
> > > > > As for compression performance testing, in order to avoid the 
> > > > > performance
> > > > > bottle neck of disk I/O, I used the following makedumpfile cmd, 
> > > > > which took
> > > > > lzo compression as an example. "--dry-run" will not write any 
> > > > > data to disk,
> > > > > "--show-stat" will output the vmcore size after compression, and 
> > > > > the time
> > > > > consumption can be collected from the output logs.
> > > > >
> > > > > $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run --show-stat
> > > > >
> > > > >
> > > > > > As for decompression performance testing, I only tested the (-d 31) case,
> > > > > > because the vmcore size of the (-d 0) case is too big to fit on the disk. In
> > > > > > addition, reading an oversized file from disk will hit the disk I/O
> > > > > > bottleneck.
> > > > >
> > > > > I triggered a kernel crash and collected a vmcore. Then I 
> > > > > converted the
> > > > > vmcore into specific compression format using the following 
> > > > > makedumpfile
> > > > > cmd, which would get a lzo format vmcore as an example:
> > > > >
> > > > > $ makedumpfile -l vmcore vmcore.lzo
> > > > >
> > > > > After all vmcores were ready, I used the following cmd to perform 
> > > > > the
> > > > > decompression, the time consumption can be collected from the 
> > > > > logs.
> > > > >
> > > > > $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> > > > >
> > > > >
> > > > >   Result charts
> > > > >
> > > > > For compression:
> > > > >
> > > > > >         makedumpfile -d31              |  makedumpfile -d0
> > > > > >         Compression time  vmcore size  |  Compression time  vmcore size
> > > > > > zstd-3  325.516446        5285179595   |  8205.452248       51715430204
> > > > > > zstd-2  332.069432        5319726604   |  8057.381371       51732062793
> > > > > > zstd-1  309.942773        5730516274   |  8138.060786       52136191571
> > > > > > zstd0   439.773076        4673859661   |  8873.059963       50993669657
> > > > > > zstd1   406.68036         4700959521   |  8259.417132       51036900055
> > > > > > zstd2   397.195643        4699263608   |  8230.308291       51030410942
> > > > > > zstd3   436.491632        4673306398   |  8803.970103       51043393637
> > > > > > zstd4   543.363928        4668419304   |  8991.240244       51058088514
> > > > > > zlib    561.217381        8514803195   |  14381.755611      78199283893
> > > > > > lzo     248.175953        16696411879  |  6057.528781       90020895741
> > > > > > snappy  231.868312        11782236674  |  5290.919894       245661288355
> > > > >
> > > > > For decompression:
> > > > >
> > > > > >         makedumpfile -d31
> > > > > >         decompress time   vmcore size
> > > > > > zstd-3  477.543396        5289373448
> > > > > > zstd-2  478.034534        5327454123
> > > > > > zstd-1  459.066807        5748037931
> > > > > > zstd0   561.687525        4680009013
> > > > > > zstd1   547.248917        4706358547
> > > > > > zstd2   544.219758        4704780719
> > > > > > zstd3   555.726343        4680009013
> > > > > > zstd4   558.031721        4675545933
> > > > > > zlib    630.965426        8555376229
> > > > > > lzo     427.292107        16849457649
> > > > > > snappy  446.542806        11841407957
> > > > >
> > > > >   Discussion
> > > > >
> > > > > For zstd range from -3 to 4, compression level 2 (ZSTD_dfast) has
> > > > > the best time consumption and vmcore dump size balance.
> > > >
> > > > Do you have a result of -d 1 compression test?  I think -d 0 is not
> > > > practical, I would like to see a -d 1 result of such a large vmcore.
> > > >
> > >
> > > No, I haven't tested the -d 1 case. I have returned the machine which used
> > > for performance testing, I will 

RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-16 Thread  
-Original Message-
> > > > This patch set adds ZSTD compression support to makedumpfile. With ZSTD 
> > > > compression
> > > > support, the vmcore dump size and time consumption can have a better 
> > > > balance than
> > > > zlib/lzo/snappy.
> > > >
> > > > How to build:
> > > >
> > > >   Build using make:
> > > > $ make USEZSTD=on
> > > >
> > > > Performance Comparison:
> > > >
> > > >   How to measure
> > > >
> > > > I took a x86_64 machine which had 4T memory, and the compression 
> > > > level
> > > > range from (-3 to 4) for ZSTD, as well as zlib/lzo/snappy 
> > > > compression.
> > > > All testing was done by makedumpfile single thread mode.
> > > >
> > > > As for compression performance testing, in order to avoid the 
> > > > performance
> > > > bottle neck of disk I/O, I used the following makedumpfile cmd, 
> > > > which took
> > > > lzo compression as an example. "--dry-run" will not write any data 
> > > > to disk,
> > > > "--show-stat" will output the vmcore size after compression, and 
> > > > the time
> > > > consumption can be collected from the output logs.
> > > >
> > > > $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run --show-stat
> > > >
> > > >
> > > > > As for decompression performance testing, I only tested the (-d 31) case,
> > > > > because the vmcore size of the (-d 0) case is too big to fit on the disk. In
> > > > > addition, reading an oversized file from disk will hit the disk I/O
> > > > > bottleneck.
> > > >
> > > > I triggered a kernel crash and collected a vmcore. Then I converted 
> > > > the
> > > > vmcore into specific compression format using the following 
> > > > makedumpfile
> > > > cmd, which would get a lzo format vmcore as an example:
> > > >
> > > > $ makedumpfile -l vmcore vmcore.lzo
> > > >
> > > > After all vmcores were ready, I used the following cmd to perform 
> > > > the
> > > > decompression, the time consumption can be collected from the logs.
> > > >
> > > > $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> > > >
> > > >
> > > >   Result charts
> > > >
> > > > For compression:
> > > >
> > > > >         makedumpfile -d31              |  makedumpfile -d0
> > > > >         Compression time  vmcore size  |  Compression time  vmcore size
> > > > > zstd-3  325.516446        5285179595   |  8205.452248       51715430204
> > > > > zstd-2  332.069432        5319726604   |  8057.381371       51732062793
> > > > > zstd-1  309.942773        5730516274   |  8138.060786       52136191571
> > > > > zstd0   439.773076        4673859661   |  8873.059963       50993669657
> > > > > zstd1   406.68036         4700959521   |  8259.417132       51036900055
> > > > > zstd2   397.195643        4699263608   |  8230.308291       51030410942
> > > > > zstd3   436.491632        4673306398   |  8803.970103       51043393637
> > > > > zstd4   543.363928        4668419304   |  8991.240244       51058088514
> > > > > zlib    561.217381        8514803195   |  14381.755611      78199283893
> > > > > lzo     248.175953        16696411879  |  6057.528781       90020895741
> > > > > snappy  231.868312        11782236674  |  5290.919894       245661288355
> > > >
> > > > For decompression:
> > > >
> > > > makedumpfile -d31
> > > > decompress timevmcore size
> > > > zstd-3  477.543396 5289373448
> > > > zstd-2  478.034534 5327454123
> > > > zstd-1  459.066807 5748037931
> > > > zstd0   561.687525 4680009013
> > > > zstd1   547.248917 4706358547
> > > > zstd2   544.219758 4704780719
> > > > zstd3   555.726343 4680009013
> > > > zstd4   558.031721 4675545933
> > > > zlib630.965426 8555376229
> > > > > lzo     427.292107 16849457649
> > > > > snappy  446.542806 11841407957
> > > >
> > > >   Discussion
> > > >
> > > > For zstd range from -3 to 4, compression level 2 (ZSTD_dfast) has
> > > > the best time consumption and vmcore dump size balance.
> > >
> > > Do you have a result of -d 1 compression test?  I think -d 0 is not
> > > practical, I would like to see a -d 1 result of such a large vmcore.
> > >
> >
> > No, I haven't tested the -d 1 case. I have returned the machine which used
> > for performance testing, I will borrow and test on it again, please wait for
> > a while...
> 
> Thanks, it would be helpful.
> 
> >
> > > And just out of curiosity, what version of zstd are you using?
> > > When I tested zstd last time, compression level 1 was faster than 2, iirc.
> > >
> >
> > The OS running on the machine is fedora34, 

RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-16 Thread  
Hi Tao Liu,

-Original Message-
> Hi Kazu,
> 
> Thanks for reviewing the patchset!
> 
> On Tue, Sep 14, 2021 at 07:04:24AM +, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > Hi Tao Liu,
> >
> > Thanks for the patchset!
> >
> > -Original Message-
> > > This patch set adds ZSTD compression support to makedumpfile. With ZSTD 
> > > compression
> > > support, the vmcore dump size and time consumption can have a better 
> > > balance than
> > > zlib/lzo/snappy.
> > >
> > > How to build:
> > >
> > >   Build using make:
> > > $ make USEZSTD=on
> > >
> > > Performance Comparison:
> > >
> > >   How to measure
> > >
> > > I took a x86_64 machine which had 4T memory, and the compression level
> > > range from (-3 to 4) for ZSTD, as well as zlib/lzo/snappy compression.
> > > All testing was done by makedumpfile single thread mode.
> > >
> > > As for compression performance testing, in order to avoid the 
> > > performance
> > > bottle neck of disk I/O, I used the following makedumpfile cmd, which 
> > > took
> > > lzo compression as an example. "--dry-run" will not write any data to 
> > > disk,
> > > "--show-stat" will output the vmcore size after compression, and the 
> > > time
> > > consumption can be collected from the output logs.
> > >
> > > $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run --show-stat
> > >
> > >
> > > > As for decompression performance testing, I only tested the (-d 31) case,
> > > > because the vmcore size of the (-d 0) case is too big to fit on the disk. In
> > > > addition, reading an oversized file from disk will hit the disk I/O
> > > > bottleneck.
> > >
> > > I triggered a kernel crash and collected a vmcore. Then I converted 
> > > the
> > > vmcore into specific compression format using the following 
> > > makedumpfile
> > > cmd, which would get a lzo format vmcore as an example:
> > >
> > > $ makedumpfile -l vmcore vmcore.lzo
> > >
> > > After all vmcores were ready, I used the following cmd to perform the
> > > decompression, the time consumption can be collected from the logs.
> > >
> > > $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> > >
> > >
> > >   Result charts
> > >
> > > For compression:
> > >
> > >          makedumpfile -d31              |  makedumpfile -d0
> > >          Compression time  vmcore size  |  Compression time  vmcore size
> > > zstd-3   325.516446        5285179595   |   8205.452248      51715430204
> > > zstd-2   332.069432        5319726604   |   8057.381371      51732062793
> > > zstd-1   309.942773        5730516274   |   8138.060786      52136191571
> > > zstd0    439.773076        4673859661   |   8873.059963      50993669657
> > > zstd1    406.68036         4700959521   |   8259.417132      51036900055
> > > zstd2    397.195643        4699263608   |   8230.308291      51030410942
> > > zstd3    436.491632        4673306398   |   8803.970103      51043393637
> > > zstd4    543.363928        4668419304   |   8991.240244      51058088514
> > > zlib     561.217381        8514803195   |  14381.755611      78199283893
> > > lzo      248.175953        16696411879  |   6057.528781      90020895741
> > > snappy   231.868312        11782236674  |   5290.919894      245661288355
> > >
> > > For decompression:
> > >
> > >          makedumpfile -d31
> > >          decompress time  vmcore size
> > > zstd-3   477.543396       5289373448
> > > zstd-2   478.034534       5327454123
> > > zstd-1   459.066807       5748037931
> > > zstd0    561.687525       4680009013
> > > zstd1    547.248917       4706358547
> > > zstd2    544.219758       4704780719
> > > zstd3    555.726343       4680009013
> > > zstd4    558.031721       467554593

RE: [PATCH 00/11] makedumpfile: Add zstd support for makedumpfile

2021-09-14 Thread  
Hi Tao Liu,

Thanks for the patchset!

-Original Message-
> This patch set adds ZSTD compression support to makedumpfile. With ZSTD
> compression support, the vmcore dump size and time consumption can have a
> better balance than zlib/lzo/snappy.
> 
> How to build:
> 
>   Build using make:
> $ make USEZSTD=on
> 
> Performance Comparison:
> 
>   How to measure
> 
> I took an x86_64 machine which had 4T memory, and the compression levels
> ranged from -3 to 4 for ZSTD, as well as zlib/lzo/snappy compression.
> All testing was done in makedumpfile single thread mode.
> 
> As for compression performance testing, in order to avoid the performance
> bottleneck of disk I/O, I used the following makedumpfile cmd, which took
> lzo compression as an example. "--dry-run" will not write any data to disk,
> "--show-stat" will output the vmcore size after compression, and the time
> consumption can be collected from the output logs.
> 
> $ makedumpfile -d 0 -l /proc/kcore vmcore --dry-run --show-stat
> 
> 
> As for decompression performance testing, I only tested the (-d 31) case,
> because the vmcore size of the (-d 0) case is too big to fit on the disk.
> In addition, reading an oversized file from disk would hit the disk I/O
> bottleneck.
> 
> I triggered a kernel crash and collected a vmcore. Then I converted the
> vmcore into specific compression format using the following makedumpfile
> cmd, which would get a lzo format vmcore as an example:
> 
> $ makedumpfile -l vmcore vmcore.lzo
> 
> After all vmcores were ready, I used the following cmd to perform the
> decompression, the time consumption can be collected from the logs.
> 
> $ makedumpfile -F vmcore.lzo --dry-run --show-stat
> 
> 
>   Result charts
> 
> For compression:
> 
>          makedumpfile -d31              |  makedumpfile -d0
>          Compression time  vmcore size  |  Compression time  vmcore size
> zstd-3   325.516446        5285179595   |   8205.452248      51715430204
> zstd-2   332.069432        5319726604   |   8057.381371      51732062793
> zstd-1   309.942773        5730516274   |   8138.060786      52136191571
> zstd0    439.773076        4673859661   |   8873.059963      50993669657
> zstd1    406.68036         4700959521   |   8259.417132      51036900055
> zstd2    397.195643        4699263608   |   8230.308291      51030410942
> zstd3    436.491632        4673306398   |   8803.970103      51043393637
> zstd4    543.363928        4668419304   |   8991.240244      51058088514
> zlib     561.217381        8514803195   |  14381.755611      78199283893
> lzo      248.175953        16696411879  |   6057.528781      90020895741
> snappy   231.868312        11782236674  |   5290.919894      245661288355
> 
> For decompression:
> 
>          makedumpfile -d31
>          decompress time  vmcore size
> zstd-3   477.543396       5289373448
> zstd-2   478.034534       5327454123
> zstd-1   459.066807       5748037931
> zstd0    561.687525       4680009013
> zstd1    547.248917       4706358547
> zstd2    544.219758       4704780719
> zstd3    555.726343       4680009013
> zstd4    558.031721       4675545933
> zlib     630.965426       8555376229
> lzo      427.292107       16849457649
> snappy   446.542806       11841407957
> 
>   Discussion
> 
> For zstd range from -3 to 4, compression level 2 (ZSTD_dfast) has
> the best time consumption and vmcore dump size balance.

Do you have a result of -d 1 compression test?  I think -d 0 is not
practical, I would like to see a -d 1 result of such a large vmcore.

And just out of curiosity, what version of zstd are you using?
When I tested zstd last time, compression level 1 was faster than 2, iirc.

btw, ZSTD_dfast is an enum of ZSTD_strategy, not for compression level?
(no need to update for now, I will review later)

Thanks,
Kazu

> 
> For zstd2/zlib/lzo/snappy, zstd2 has the smallest vmcore size, also
> the best time consumption and vmcore dump size balance.
> 
> Tao Liu (11):
>   Add dump header for zstd.
>   Add command-line processing for zstd
>   Add zstd build support
>   Notify zstd unsupporting when disabled
>   Add single thread zstd compression processing
>   Add parallel threads zstd compression processing
>   Add single thread zstd uncompression processing
>   Add parallel threads zstd uncompression processing
>   Add zstd help message
>   Add zstd manual description
>   Add zstd README description
> 
>  Makefile   |   5 +++
>  README |   5 ++-
>  diskdump_mod.h |   1 +
>  makedumpfile.8 |   7 ++--
>  makedumpfile.c | 101 

RE: [PATCH 1/1] fix left bit-shift overflow in __exclude_unnecessary_pages()

2021-09-01 Thread  
-Original Message-
> > -Original Message-
> >> Whenever the variables compound_order or private become greater than
> >> 31, left bit-shift of 1 overflows, and nr_pages becomes zero. If nr_pages
> >> becomes 0 and pages are being excluded at the end of the PFN loop, the
> >> else branch of the last if statement is entered and pfn is decremented by
> >> 1 because nr_pages is 0. Finally, this causes the loop variable pfn to
> >> be assigned the same value as before when the next loop iteration begins
> >> which results in an infinite loop.
> >>
> >> This issue appeared on s390 64bit architecture with a dump of 16GB RAM.
> >
> > The patch looks good to me, but just out of curiosity, when do the
> > compound_order or private become greater than 31 on s390?
> >
> > Thanks,
> > Kazu
> >
> 
> I added some debug statements and this is what I got:
> 
> compound_order 0
> compound_order 1
> compound_order 2
> compound_order 3
> compound_order 4
> compound_order 5
> compound_order 6
> compound_order 7
> compound_order 8
> private 0
> private 1
> private 2
> private 3
> private 4
> private 5
> private 52
> private 6
> private 7
> private 8
> 
> It seems that not compound_order but private is at fault here and
> triggers the bug. Not sure yet what that exactly means and whether we
> have here another bug which triggers this one :/

Hmm, so makedumpfile will exclude many pages wrongly with the patch?
Excluding pages wrongly is better than failing with an infinite loop,
but not better than including pages wrongly, because it might lose
necessary data for investigation.

So I think we should have a sanity check also for the private.  AFAIK,
the private value (buddy allocator's order) should be less than MAX_ORDER.
If this is correct, we can use LENGTH(zone.free_area) in vmcoreinfo.

Thanks,
Kazu
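
The sanity check suggested above could look roughly like the sketch below.
This is only an illustration under stated assumptions: buddy_nr_pages() is a
hypothetical helper, not makedumpfile code, and max_order stands in for the
value the mail proposes to take from LENGTH(zone.free_area) in vmcoreinfo.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not makedumpfile code: compute the number of pages
 * covered by a buddy block of the given order, refusing implausible orders
 * instead of shifting by them.
 */
static bool buddy_nr_pages(unsigned long order, unsigned long max_order,
			   unsigned long *nr_pages)
{
	/* Reject orders the buddy allocator cannot produce. */
	if (order >= max_order)
		return false;
	/* Also keep the shift itself well-defined for unsigned long. */
	if (order >= sizeof(unsigned long) * CHAR_BIT)
		return false;

	*nr_pages = 1UL << order;
	return true;
}
```

With a check of this kind, a value such as the "private 52" from the debug
output would be rejected, and the caller could treat the page as not being a
free buddy block rather than excluding a bogus range.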

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH 1/1] fix left bit-shift overflow in __exclude_unnecessary_pages()

2021-09-01 Thread  
Hi Alex,
+cc kexec list (the right one for makedumpfile patch)

-Original Message-
> Whenever the variables compound_order or private become greater than
> 31, left bit-shift of 1 overflows, and nr_pages becomes zero. If nr_pages
> becomes 0 and pages are being excluded at the end of the PFN loop, the
> else branch of the last if statement is entered and pfn is decremented by
> 1 because nr_pages is 0. Finally, this causes the loop variable pfn to
> be assigned the same value as before when the next loop iteration begins
> which results in an infinite loop.
> 
> This issue appeared on s390 64bit architecture with a dump of 16GB RAM.

The patch looks good to me, but just out of curiosity, when do the
compound_order or private become greater than 31 on s390?

Thanks,
Kazu
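
For reference, the width problem can be sketched with shifts that are all
well-defined C. Shifting a 32-bit int by 32 or more is undefined behaviour,
so the loss of the upper bits is modelled here with an explicit cast. Both
helper names are hypothetical and not part of makedumpfile.

```c
#include <stdint.h>

/* Hypothetical helper: page count for a given order, computed in 64 bits. */
static uint64_t nr_pages_64(unsigned int order)
{
	/* Valid for order < 64, since 1ULL is at least 64 bits wide. */
	return 1ULL << order;
}

/* Models what remains of that value when only the low 32 bits are kept. */
static uint32_t nr_pages_truncated_32(unsigned int order)
{
	return (uint32_t)nr_pages_64(order);
}
```

For an order of 32, nr_pages_64() gives 0x100000000 while the truncated
variant gives 0, which corresponds to the zero nr_pages described in the
patch.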

> 
> This is a simple program to demonstrate the primary issue:
> 
> void main(void)
> {
> unsigned long long n;
> unsigned long m;
> 
> m = 32;
> n = 1 << m;
> fprintf(stderr, "%llx\n", n);
> n = 1UL << m;
> fprintf(stderr, "%llx\n", n);
> }
> 
> Signed-off-by: Alexander Egorenkov 
> ---
>  makedumpfile.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index c063267f15bb..863840b13608 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -6210,7 +6210,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
>   if (OFFSET(page.private) != NOT_FOUND_STRUCTURE)
>   private = ULONG(pcache + OFFSET(page.private));
> 
> - nr_pages = 1 << compound_order;
> + nr_pages = 1UL << compound_order;
>   pfn_counter = NULL;
> 
>   /*
> @@ -6227,7 +6227,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
>   else if ((info->dump_level & DL_EXCLUDE_FREE)
>   && info->page_is_buddy
>   && info->page_is_buddy(flags, _mapcount, private, _count)) {
> - nr_pages = 1 << private;
> + nr_pages = 1UL << private;
>   pfn_counter = &pfn_free;
>   }
>   /*
> --
> 2.31.1
> 
> --
> Crash-utility mailing list
> crash-util...@redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility



RE: [PATCH v2] makedumpfile: support --partial-dmesg on 5.10+ kernels

2021-08-31 Thread  
-Original Message-
> The new printk ringbuffer implementation added in kernel v5.10 with
> commit 896fbe20b4e2 ("printk: use the lockless ringbuffer") also
> exported a new vmcore symbol "clear_seq" to keep track where dmesg had
> been cleared like "clear_idx" on previous versions. Use it in
> dump_lockless_dmesg when --partial-dmesg is passed to find and start
> dumping messages only from that index.
> 
> On v5.13, commit 7d7a23a91c91 ("printk: use seqcount_latch for
> clear_seq") converted it from a simple value, and so an additional step
> is required to find and retrieve the sequence at the right offset in the
> latched_seq structure it now uses.
> 
> Signed-off-by: Ivan Delalande 

The v2 patch tested OK on 5.12 and 5.14.  Applied.
https://github.com/makedumpfile/makedumpfile/commit/bccc39c147fea2b29f98f9973614c7a6697a065d

Thanks,
Kazu

> ---
>  makedumpfile.c | 12 
>  makedumpfile.h |  6 ++
>  printk.c   | 19 +++
>  3 files changed, 37 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index b1b3b42..ac17f19 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -1555,6 +1555,7 @@ get_symbol_info(void)
>   SYMBOL_INIT(pgdat_list, "pgdat_list");
>   SYMBOL_INIT(contig_page_data, "contig_page_data");
>   SYMBOL_INIT(prb, "prb");
> + SYMBOL_INIT(clear_seq, "clear_seq");
>   SYMBOL_INIT(log_buf, "log_buf");
>   SYMBOL_INIT(log_buf_len, "log_buf_len");
>   SYMBOL_INIT(log_end, "log_end");
> @@ -2000,6 +2001,9 @@ get_structure_info(void)
>   OFFSET_INIT(printk_info.text_len, "printk_info", "text_len");
> 
>   OFFSET_INIT(atomic_long_t.counter, "atomic_long_t", "counter");
> +
> + SIZE_INIT(latched_seq, "latched_seq");
> + OFFSET_INIT(latched_seq.val, "latched_seq", "val");
>   } else if (SIZE(printk_log) != NOT_FOUND_STRUCTURE) {
>   /*
>* In kernel 3.11-rc4 the log structure name was renamed
> @@ -2231,6 +2235,7 @@ write_vmcoreinfo_data(void)
>   WRITE_SYMBOL("pgdat_list", pgdat_list);
>   WRITE_SYMBOL("contig_page_data", contig_page_data);
>   WRITE_SYMBOL("prb", prb);
> + WRITE_SYMBOL("clear_seq", clear_seq);
>   WRITE_SYMBOL("log_buf", log_buf);
>   WRITE_SYMBOL("log_buf_len", log_buf_len);
>   WRITE_SYMBOL("log_end", log_end);
> @@ -2266,6 +2271,7 @@ write_vmcoreinfo_data(void)
>   WRITE_STRUCTURE_SIZE("printk_ringbuffer", printk_ringbuffer);
>   WRITE_STRUCTURE_SIZE("prb_desc", prb_desc);
>   WRITE_STRUCTURE_SIZE("printk_info", printk_info);
> + WRITE_STRUCTURE_SIZE("latched_seq", latched_seq);
>   } else if (info->flag_use_printk_log)
>   WRITE_STRUCTURE_SIZE("printk_log", printk_log);
>   else
> @@ -2335,6 +2341,8 @@ write_vmcoreinfo_data(void)
>   WRITE_MEMBER_OFFSET("printk_info.text_len", 
> printk_info.text_len);
> 
>   WRITE_MEMBER_OFFSET("atomic_long_t.counter", 
> atomic_long_t.counter);
> +
> + WRITE_MEMBER_OFFSET("latched_seq.val", latched_seq.val);
>   } else if (info->flag_use_printk_log) {
>   WRITE_MEMBER_OFFSET("printk_log.ts_nsec", printk_log.ts_nsec);
>   WRITE_MEMBER_OFFSET("printk_log.len", printk_log.len);
> @@ -2676,6 +2684,7 @@ read_vmcoreinfo(void)
>   READ_SYMBOL("pgdat_list", pgdat_list);
>   READ_SYMBOL("contig_page_data", contig_page_data);
>   READ_SYMBOL("prb", prb);
> + READ_SYMBOL("clear_seq", clear_seq);
>   READ_SYMBOL("log_buf", log_buf);
>   READ_SYMBOL("log_buf_len", log_buf_len);
>   READ_SYMBOL("log_end", log_end);
> @@ -2784,6 +2793,9 @@ read_vmcoreinfo(void)
>   READ_MEMBER_OFFSET("printk_info.text_len", 
> printk_info.text_len);
> 
>   READ_MEMBER_OFFSET("atomic_long_t.counter", 
> atomic_long_t.counter);
> +
> + READ_STRUCTURE_SIZE("latched_seq", latched_seq);
> + READ_MEMBER_OFFSET("latched_seq.val", latched_seq.val);
>   } else if (SIZE(printk_log) != NOT_FOUND_STRUCTURE) {
>   info->flag_use_printk_ringbuffer = FALSE;
>   info->flag_use_printk_log = TRUE;
> diff --git a/makedumpfile.h b/makedumpfile.h
> index ca50a89..bd9e2f6 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1656,6 +1656,7 @@ struct symbol_table {
>   unsigned long long  pgdat_list;
>   unsigned long long  contig_page_data;
>   unsigned long long  prb;
> + unsigned long long  clear_seq;
>   unsigned long long  log_buf;
>   unsigned long long  log_buf_len;
>   unsigned long long  log_end;
> @@ -1749,6 +1750,7 @@ struct size_table {
>   longprintk_ringbuffer;
>   longprb_desc;
>   longprintk_info;
> + longlatched_seq;
> 
>   /*
>* for Xen extraction
> @@ -1973,6 +1975,10 @@ struct offset_table {
>   long counter;
>   } atomic_long_t;
> 
> + 

RE: [PATCH] makedumpfile: support --partial-dmesg on 5.10+ kernels

2021-08-17 Thread  
Hi Ivan,

-Original Message-
> The new printk ringbuffer implementation added in kernel v5.10 with
> commit 896fbe20b4e2 ("printk: use the lockless ringbuffer") also
> exported a new vmcore symbol "clear_seq" to keep track where dmesg had
> been cleared like "clear_idx" on previous versions. Use it in
> dump_lockless_dmesg when --partial-dmesg is passed to find and start
> dumping messages only from that index.

Thanks for the patch.

Kernel commit 7d7a23a91c91 ("printk: use seqcount_latch for clear_seq")
changed the clear_seq, the patch does not work correctly on 5.13+ kernels.

But the commit exported these values to vmcoreinfo:

SIZE(latched_seq)=24
OFFSET(latched_seq.val)=8

Can it support also 5.13+ with these?

Thanks,
Kazu
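
A rough sketch of how a value might be picked out of a latched_seq copy with
those two numbers. It is not the code from the v2 patch. Assumptions: the
latch sequence counter is a 4-byte unsigned integer at offset 0, its low bit
selects which of the two 8-byte val[] copies is current (as the kernel's
latch reader does), and the buffer has the host's byte order. The helper name
is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not makedumpfile code. buf holds a raw copy of
 * struct latched_seq (SIZE(latched_seq) bytes); val_offset is
 * OFFSET(latched_seq.val).
 */
static uint64_t latched_seq_value(const unsigned char *buf, size_t val_offset)
{
	uint32_t latch;
	uint64_t val;

	/* Assumed layout: sequence counter in the first 4 bytes. */
	memcpy(&latch, buf, sizeof(latch));
	/* The low bit of the counter selects val[0] or val[1]. */
	memcpy(&val, buf + val_offset + (latch & 1) * sizeof(val), sizeof(val));
	return val;
}
```

With SIZE(latched_seq)=24 and OFFSET(latched_seq.val)=8, the two candidate
values sit at byte offsets 8 and 16 of the copy.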

> 
> Signed-off-by: Ivan Delalande 
> ---
>  makedumpfile.c |  3 +++
>  makedumpfile.h |  1 +
>  printk.c   | 10 ++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index b1b3b42..caf4d12 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -1555,6 +1555,7 @@ get_symbol_info(void)
>   SYMBOL_INIT(pgdat_list, "pgdat_list");
>   SYMBOL_INIT(contig_page_data, "contig_page_data");
>   SYMBOL_INIT(prb, "prb");
> + SYMBOL_INIT(clear_seq, "clear_seq");
>   SYMBOL_INIT(log_buf, "log_buf");
>   SYMBOL_INIT(log_buf_len, "log_buf_len");
>   SYMBOL_INIT(log_end, "log_end");
> @@ -2231,6 +2232,7 @@ write_vmcoreinfo_data(void)
>   WRITE_SYMBOL("pgdat_list", pgdat_list);
>   WRITE_SYMBOL("contig_page_data", contig_page_data);
>   WRITE_SYMBOL("prb", prb);
> + WRITE_SYMBOL("clear_seq", clear_seq);
>   WRITE_SYMBOL("log_buf", log_buf);
>   WRITE_SYMBOL("log_buf_len", log_buf_len);
>   WRITE_SYMBOL("log_end", log_end);
> @@ -2676,6 +2678,7 @@ read_vmcoreinfo(void)
>   READ_SYMBOL("pgdat_list", pgdat_list);
>   READ_SYMBOL("contig_page_data", contig_page_data);
>   READ_SYMBOL("prb", prb);
> + READ_SYMBOL("clear_seq", clear_seq);
>   READ_SYMBOL("log_buf", log_buf);
>   READ_SYMBOL("log_buf_len", log_buf_len);
>   READ_SYMBOL("log_end", log_end);
> diff --git a/makedumpfile.h b/makedumpfile.h
> index ca50a89..c57ac7a 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1656,6 +1656,7 @@ struct symbol_table {
>   unsigned long long  pgdat_list;
>   unsigned long long  contig_page_data;
>   unsigned long long  prb;
> + unsigned long long  clear_seq;
>   unsigned long long  log_buf;
>   unsigned long long  log_buf_len;
>   unsigned long long  log_end;
> diff --git a/printk.c b/printk.c
> index e8501c7..a160a5f 100644
> --- a/printk.c
> +++ b/printk.c
> @@ -145,6 +145,7 @@ dump_record(struct prb_map *m, unsigned long id)
>  int
>  dump_lockless_dmesg(void)
>  {
> + unsigned long long clear_seq;
>   unsigned long head_id;
>   unsigned long tail_id;
>   unsigned long kaddr;
> @@ -216,6 +217,15 @@ dump_lockless_dmesg(void)
>   OFFSET(atomic_long_t.counter));
>   head_id = ULONG(m.desc_ring + OFFSET(prb_desc_ring.head_id) +
>   OFFSET(atomic_long_t.counter));
> + if (info->flag_partial_dmesg && SYMBOL(clear_seq) != NOT_FOUND_SYMBOL) {
> + if (!readmem(VADDR, SYMBOL(clear_seq), &clear_seq,
> +  sizeof(clear_seq))) {
> + ERRMSG("Can't get clear_seq.\n");
> + goto out_text_data;
> + }
> + tail_id = head_id - head_id % m.desc_ring_count +
> +   clear_seq % m.desc_ring_count;
> + }
> 
>   if (!open_dump_file()) {
>   ERRMSG("Can't open output file.\n");
> --
> 2.32.0



RE: [PATCH 0/2] makedumpfile: fix two bugs with --dry-run

2021-08-15 Thread  
-Original Message-
> Hi,
> 
> While playing with the --dry-run option I noticed two bugs. You can find the
> fixes below.
> 
> Thanks
> Philipp
> 
> Philipp Rudo (2):
>   makedumpfile: Fix bad file descriptor error when using --dry-run
>   makedumpfile: Fix --dry-run for incomplete dumps
> 
>  makedumpfile.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> --
> 2.31.1

Thank you for the fixes, applied.
https://github.com/makedumpfile/makedumpfile/compare/efef023...5684d4c

Kazu




RE: [PATCH v2 makedumpfile 0/3] Add option to limit file size

2021-07-29 Thread  
-Original Message-
> See feature description in patch "Add -L option to limit output file
> size"
> 
> v1:
>   http://lists.infradead.org/pipermail/kexec/2021-June/022728.html
> v2:
>   Add patches 1, 2 to support dmesg limit with different kernel
>   versions.
> 
>   Stricter parsing of -L option value.
> 
>   Instead of using RLIMIT_FSIZE, use write_bytes or SEEK_CUR to
>   enforce limit. This better integrates with the -F option for
>   stdout output.

Sorry for the delay, and thank you for the update and fix.

Note that the v2 patchset cannot be built except for x86_64, so I moved
the memparse() from sadump_info.c to tools.c.  Otherwise, looks good.

https://github.com/makedumpfile/makedumpfile/compare/9df519d...f0cfa86

Thanks,
Kazu
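
As an illustration of enforcing a size limit by counting written bytes
instead of using RLIMIT_FSIZE, a minimal sketch follows. It is not the code
that was applied: write_limited() and its parameters are hypothetical, and
reporting EFBIG is only chosen here to mirror the errno that the v1 patch
checked for.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write buf to fd while tracking a running byte count
 * in *written, and refuse (errno = EFBIG) any write that would take the
 * total past limit. A negative limit means "no limit".
 */
static bool write_limited(int fd, const void *buf, size_t len,
			  off_t *written, off_t limit)
{
	const char *p = buf;

	if (limit >= 0 &&
	    (*written > limit || (off_t)len > limit - *written)) {
		errno = EFBIG;
		return false;
	}

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return false;
		}
		p += n;
		len -= (size_t)n;
		*written += n;
	}
	return true;
}
```

Because the count is kept by the caller rather than by the kernel, the same
check works when the output is a pipe or stdout, which is the situation the
-F option creates.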




RE: Compiling makedumpfile from source

2021-07-27 Thread  
-Original Message-
> Hi kazuhito san,
> 
> Just following up on my last email.
> Sincere apologies for asking for your time.
> I want to specifically understand what the "user data" section is and
> what it means to exclude it from the dump.

"user data" are anonymous pages or huge pages.
https://github.com/makedumpfile/makedumpfile/blob/master/makedumpfile.c#L6224

Please consult the __exclude_unnecessary_pages() function above for
what conditions correspond to the type of page.

Thanks,
Kazu
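
To relate the page types to the -d option, the dump_level bits can be decoded
as in the sketch below. The bit values follow the dump_level table in the
makedumpfile man page; the enum names and the helper are illustrative only
and are not the identifiers used in makedumpfile.c.

```c
/* dump_level (-d) bits; names here are illustrative, not makedumpfile's. */
enum {
	DL_ZERO      = 1,	/* zero pages */
	DL_CACHE     = 2,	/* non-private cache pages */
	DL_CACHE_PRI = 4,	/* private cache pages */
	DL_USER_DATA = 8,	/* user data pages */
	DL_FREE      = 16	/* free pages */
};

/* Hypothetical helper: does this dump_level exclude the given page type? */
static int dump_level_excludes(int dump_level, int page_type_bit)
{
	return (dump_level & page_type_bit) != 0;
}
```

For example, -d 31 sets all five bits and so excludes user data, while -d 23
(16 + 4 + 2 + 1) keeps user data pages in the dump.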

> 
> Thank you very much.
> 
> Best Regards,
> Manty
> 
> On Fri, Jul 9, 2021 at 10:45 AM manty kuma  wrote:
> >
> > Hi Kazuhito san,
> >
> > I am looking to better understand the sections being filtered out
> > with each of the following options.
> >
> > Zero page:
> > Pages that are empty. Ignoring these pages won't have any impact on 
> > analysis.
> >
> > non-private cache and private cache:
> > What exactly are these sections of memory? Just a one-line overview
> > about them is sufficient.
> > (My understanding was that cache is not part of RAM. Is this cache
> > something else? Like some bookkeeping data maintained by the kernel?)
> >
> >
> > user data:
> > Are these sections of the memory for the user space processes/memory
> > sections allocated using malloc?
> > My understanding is that If I exclude this section, gcore would not
> > work. Is my understanding correct?
> > I expected this section to be big. But in fact excluding this did not
> > have much impact on the dump size.
> >
> > free page:
> > unallocated pages. Since they are not allocated. filtering them out
> > won't have any impact on dump analysis.
> > Please correct me if I am wrong.
> >
> >
> > If there is already some place that explains what these sections
> > filter out, please just drop the reference to them and i will look
> > into it.
> > Thank you very much in advance!
> >
> > Manty
> >
> > On Thu, Jul 8, 2021 at 8:17 PM manty kuma  wrote:
> > >
> > > Sorry. I am not sure how but I completely missed this email.
> > > Yes, /tmp was not available in my env. I just did mkdir before
> > > executing `makedumpfile` and it is now working well.
> > > Thank you very much.
> > >
> > > On Mon, Jun 28, 2021 at 4:07 PM HAGIO KAZUHITO(萩尾 一仁)
> > >  wrote:
> > > >
> > > > -Original Message-
> > > > > Hi Kazuhito san,
> > > > >
> > > > > I am getting the following error when trying to use makedumpfile 
> > > > > utility.
> > > > >
> > > > > > copy_vmcoreinfo: Can't open the vmcoreinfo 
> > > > > > file(/tmp/vmcoreinfoLUQc25). No such file or directory.
> > > > > > makedumpfile Failed
> > > > >
> > > > > In your setup how are you providing the vmcoreinfo file? In my case it
> > > > > is checking /tmp/vmcoreinfoLUQc25
> > > > > Who generates this file?
> > > >
> > > > Generally, vmcoreinfo is copied from vmcore's ELF note to 
> > > > /tmp/vmcoreinfoXX
> > > > by makedumpfile, please see copy_vmcoreinfo().  So no need to provide 
> > > > explicitly.
> > > >
> > > > Is there the /tmp directory on your environment?
> > > >
> > > > Kazu
> > > >
> > > >


RE: [PATCH] makedumpfile/arm: fix page_offset determination

2021-07-06 Thread  
-Original Message-
> When the CONFIG_STRICT_KERNEL_RWX is enabled for the armv7 kernel the
> _stext is aligned to 1< wrong page_offset determination.
> 
> Suit mask used for page_offset in a way that it will allow to correctly
> determine page_offset regardless of CONFIG_STRICT_KERNEL_RWX and
> CONFIG_ARM_LPAE settings.
> 
> Signed-off-by: Grzegorz Jaszczyk 

Thanks, applied.
https://github.com/makedumpfile/makedumpfile/commit/9df519d2f4d6d6362fd878a1e0f0890e73166055

Kazu

> ---
>  arch/arm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm.c b/arch/arm.c
> index 33536fc..7ae9cb1 100644
> --- a/arch/arm.c
> +++ b/arch/arm.c
> @@ -80,7 +80,7 @@ get_phys_base_arm(void)
>  int
>  get_machdep_info_arm(void)
>  {
> - info->page_offset = SYMBOL(_stext) & 0xUL;
> + info->page_offset = SYMBOL(_stext) & 0xffc0UL;
> 
>   /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
>   if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> --
> 2.29.0




RE: Compiling makedumpfile from source

2021-06-28 Thread  
-Original Message-
> Hi Kazuhito san,
> 
> I am getting the following error when trying to use makedumpfile utility.
> 
> > copy_vmcoreinfo: Can't open the vmcoreinfo file(/tmp/vmcoreinfoLUQc25). No 
> > such file or directory.
> > makedumpfile Failed
> 
> In your setup how are you providing the vmcoreinfo file? In my case it
> is checking /tmp/vmcoreinfoLUQc25
> Who generates this file?

Generally, vmcoreinfo is copied from vmcore's ELF note to /tmp/vmcoreinfoXX
by makedumpfile, please see copy_vmcoreinfo().  So no need to provide 
explicitly.

Is there the /tmp directory on your environment?

Kazu




RE: [makedumpfile PATCH] check for invalid physical address of /proc/kcore when make ELF format dumpfile

2021-06-21 Thread  
-Original Message-
> Previously when executing makedumpfile with -E option against /proc/kcore,
> makedumpfile will fail:
> 
> $ sudo makedumpfile -E -d 31 /proc/kcore vmcore
> ...
> write_elf_load_segment: Can't convert physaddr() to an
> offset.
> 
> makedumpfile Failed.
> 
> It's because /proc/kcore contains PT_LOAD program headers which have
> physaddr(). With -E option, makedumpfile will try to convert
> the physaddr to an offset and fails.
> 
> In this patch, let's skip the PT_LOAD program headers which have such 
> physaddr.
> 
> Signed-off-by: Tao Liu 

Thanks, applied with adjusting to the previous similar patch.
https://github.com/makedumpfile/makedumpfile/commit/9a6f589d99dcef114c89fde992157f5467028c8f

Kazu




RE: [PATCH makedumpfile] Add -L option to limit output file size

2021-06-17 Thread  
-Original Message-
> This option can be used to ensure that a certain amount of free space is
> preserved. It is useful when the output of makedumpfile is on the root
> filesystem and some services fail to start at boot if there is no space
> left.

Thanks for the patch, the option seems nice!
Some comments inline.

> 
> Signed-off-by: Benjamin Poirier 
> ---
>  makedumpfile.8 | 12 +---
>  makedumpfile.c | 50 --
>  makedumpfile.h |  2 ++
>  print_info.c   |  3 +++
>  4 files changed, 58 insertions(+), 9 deletions(-)
> 
> diff --git a/makedumpfile.8 b/makedumpfile.8
> index 313a41c..9a90f0e 100644
> --- a/makedumpfile.8
> +++ b/makedumpfile.8
> @@ -157,9 +157,10 @@ will be effective if you specify domain-0's 
> \fIvmlinux\fR with \-x option.
>  Then the pages are excluded only from domain-0.
>  .br
>  If specifying multiple dump_levels with the delimiter ',', makedumpfile 
> retries
> -to create a \fIDUMPFILE\fR by other dump_level when "No space on device" 
> error
> -happens. For example, if dump_level is "11,31" and makedumpfile fails
> -by dump_level 11, makedumpfile retries it by dump_level 31.
> +to create \fIDUMPFILE\fR using the next dump_level when the size of a 
> dumpfile
> +exceeds the limit specified with '-L' or when when a "No space on device" 
> error
> +happens. For example, if dump_level is "11,31" and makedumpfile fails with
> +dump_level 11, makedumpfile retries with dump_level 31.
>  .br
>  .B Example:
>  .br
> @@ -221,6 +222,11 @@ Here is the all combinations of the bits.
>  30 |  |   X   |   X   |  X   |  X
>  31 |  X   |   X   |   X   |  X   |  X
> 
> +.TP
> +\fB\-L\fR \fISIZE\fR
> +Limit the size of the output file to \fISIZE\fR bytes. An incomplete
> +\fIDUMPFILE\fR or \fILOGFILE\fR is written if the size would otherwise exceed
> +\fISIZE\fR.
> 
>  .TP
>  \fB\-E\fR
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 894c88e..8a443c6 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -22,6 +22,8 @@
>  #include "cache.h"
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1383,6 +1385,28 @@ open_dump_file(void)
>   return FALSE;
>   }
>   info->fd_dumpfile = fd;
> +
> + if (info->size_limit != -1) {
> + struct sigaction act = {
> + .sa_handler = SIG_IGN,
> + };
> + struct rlimit rlim = {
> + .rlim_cur = info->size_limit,
> + .rlim_max = info->size_limit,
> + };
> +
> + if (sigaction(SIGXFSZ, &act, NULL)) {
> + ERRMSG("Can't ignore SIGXFSZ. %s\n", strerror(errno));
> + return FALSE;
> + }
> +
> + if (setrlimit(RLIMIT_FSIZE, &rlim)) {
> + ERRMSG("Can't set file size limit(%lld). %s\n",
> +(long long) info->size_limit, strerror(errno));
> + return FALSE;
> + }
> + }
> +
>   return TRUE;
>  }
> 
> @@ -4729,7 +4753,7 @@ write_and_check_space(int fd, void *buf, size_t 
> buf_size, char *file_name)
>   written_size += status;
>   continue;
>   }
> - if (errno == ENOSPC)
> + if (errno == ENOSPC || errno == EFBIG)
>   info->flag_nospace = TRUE;
>   MSG("\nCan't write the dump file(%s). %s\n",
>   file_name, strerror(errno));
> @@ -5308,7 +5332,7 @@ dump_log_entry(char *logptr, int fp)
>   for (i = 0, p = msg; i < text_len; i++, p++) {
>   if (bufp - buf >= sizeof(buf) - buf_need) {
>   if (write(info->fd_dumpfile, buf, bufp - buf) < 0)
> - return FALSE;
> + goto out_fail;
>   bufp = buf;
>   }
> 
> @@ -5322,10 +5346,13 @@ dump_log_entry(char *logptr, int fp)
> 
>   *bufp++ = '\n';
> 
> - if (write(info->fd_dumpfile, buf, bufp - buf) < 0)
> - return FALSE;
> - else
> + if (write(info->fd_dumpfile, buf, bufp - buf) >= 0)
>   return TRUE;
> +
> +out_fail:
> + ERRMSG("\nCan't write the log file(%s). %s\n", info->name_dumpfile,
> +strerror(errno));
> + return FALSE;
>  }

This change in dump_log_entry() looks ok, but the function supports only
Linux 3.5 to 5.9 kernels.  The other kernels are supported by the other
function or path:

dump_dmesg()
...
if ((SYMBOL(prb) != NOT_FOUND_SYMBOL))
return dump_lockless_dmesg();  <<-- 5.10 and newer
...
if (info->kernel_version < KERNEL_VERSION(3, 5, 0)) {
...
if (write(info->fd_dumpfile, log_buffer, length_log) < 0) <<-- 
3.4 and older
goto out;

Could you add support for those kernels, if you need the -L option for
the --dump-dmesg option?

But it seems that write() with 
