Re: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
On Wed, Nov 13, 2019 at 07:36:01PM +, Kazuhito Hagio wrote:
> I think I've fixed the ELF issues which I could reproduce:
> - wrong statistics
> - e_phnum overflow
>
> If you still see any problems with the latest makedumpfile,
> please let me know.
>
> Thanks,
> Kazu

It's taken me a little while to get this stuff into our production systems,
but things are looking much better so far. Since I pushed out the fixes,
I've seen no new corrupted dumps.

I'll keep an eye on things and let you know if anything new comes up.

thanks!

Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
Hi Dave,

I think I've fixed the ELF issues which I could reproduce:
- wrong statistics
- e_phnum overflow

If you still see any problems with the latest makedumpfile,
please let me know.

Thanks,
Kazu
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report()) (Kazuhito Hagio)
Hi Dave,

> -Original Message-
> > > I think this will be one of the causes, and had a look at how
> > > we can fix it. If you get a vmcore where this pattern occurs,
> > > you can try this tree:
> > > https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
> > >
> > > Then, the crash utility also needs a patch to support a dumpfile
> > > that has more than 64k program headers:
> > > https://github.com/k-hagio/crash/tree/support-extended-elf
> >
> > It is possible that the issue occurs on general systems if they have
> > large memory, so I'm going to proceed with those patches.
>
> Hi Kazu,
>
> Do you want me to go ahead with the crash utility patch? It looks
> safe enough to apply, and I did test it to make sure there were no
> ill-effects with sample ELF dumpfiles.

Oh, thank you for your attention and testing.

I'm dropping the ELF32 parts of them, because I think they will not be
used in the future. (I estimate that the theoretical minimum memory size
at which makedumpfile could use the extended numbering is 64GB+256MB on
a 4k page system.)

I will let you know when it is ready.

Thanks!
Kazu
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report()) (Kazuhito Hagio)
- Original Message -
> Date: Thu, 7 Nov 2019 16:12:06 +
> From: Kazuhito Hagio
> To: Dave Jones
> Cc: "kexec@lists.infradead.org"
> Subject: RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by
> zero in print_report())
> Message-ID: <4ae2dc15ac0b8543882a74ea0d43dbec03594...@bpxm09gp.gisp.nec.co.jp>
>
> [...]
>
> It is possible that the issue occurs on general systems if they have
> large memory, so I'm going to proceed with those patches.

Hi Kazu,

Do you want me to go ahead with the crash utility patch? It looks
safe enough to apply, and I did test it to make sure there were no
ill-effects with sample ELF dumpfiles.

Thanks,
Dave
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
Hi,

> -Original Message-
> > > > There are some other failure cases with non-null data, so maybe
> > > > there's >1 bug here.
> > > > I've not seen an obvious pattern to this. eg...
> > > >
> > > > https://pastebin.com/2uM4sBCF
> > > >
> > >
> > > As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> > > (i.e. num_loads_dumpfile > 65535):
> >
> > Oh, good catch. These are 256GB machines, so after discarding
> > everything, that explains why we end up with so many sections.
> > This also explains why it sometimes works I think, when the discarding
> > manages to get the total nr headers <64k.

I also could reproduce this issue on a system with 192GB memory.
The note was actually overwritten by the following program headers.

-
num_loads_dumpfile=76318  # more than 64k
ehdr64.e_phnum=10783      # overflowed
note.p_offset=0x93708 .p_filesz=0x2958  # The note data is at 0x93708
note cd_header->offset=0x40
...
head->off= 90040 load.p_addr= 44552e000 .p_off= ed270060 ...
           ^ # these headers overwrote the note data.
head->off= a0040 load.p_addr= 44563 .p_off= ed272060 ...
...
The dumpfile is saved to dump.Ed25.devel.

makedumpfile Completed.

# readelf -a dump.Ed25.devel
...
  Number of program headers: 10783
...
Displaying notes found at file offset 0x00093708 with length 0x2958:
  Owner  Data size  Description
         0x0007     Unknown note type: (0xdbce6060)
   description data: 00 00 7a 39 fff2 ff8a

# ../crash vmlinux dump.Ed25.devel

WARNING: possibly corrupt Elf64_Nhdr: n_namesz: 4185522176 n_descsz: 3 n_type: f4000
...
WARNING: cannot read linux_banner string
crash: vmlinux and dump.Ed25.devel do not match!
-

> I think this will be one of the causes, and had a look at how
> we can fix it. If you get a vmcore where this pattern occurs,
> you can try this tree:
> https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
>
> Then, the crash utility also needs a patch to support a dumpfile
> that has more than 64k program headers:
> https://github.com/k-hagio/crash/tree/support-extended-elf

These trees look to work well, though need more tests and tweaks.

-
# readelf -a dump.Ed25.test
...
  Number of program headers: 65535 (76319)  <<-- note + loads
...
Displaying notes found at file offset 0x00413748 with length 0x2958:
  Owner  Data size  Description
  CORE   0x0150     NT_PRSTATUS (prstatus structure)
  CORE   0x0150     NT_PRSTATUS (prstatus structure)
  CORE   0x0150     NT_PRSTATUS (prstatus structure)
...

# ../crash-test vmlinux dump.Ed25.test

crash-test> help -D
vmcore_data:
  flags: c0 (KDUMP_LOCAL|KDUMP_ELF64)
  ndfd: 3
  ofp: 3141560
  header_size: 4284576
  num_pt_load_segments: 76318  <<-- loads
  pt_load_segment[0]:
-

It is possible that the issue occurs on general systems if they have
large memory, so I'm going to proceed with those patches.

Thanks,
Kazu
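The numbers in the debug output above can be checked by hand. The following is a minimal sketch, not makedumpfile code: the helper names are made up for illustration, and it assumes the note data sits directly after a program header table sized from the (truncated) e_phnum, which is my reading of the log rather than something stated in the thread. The 64-byte ELF64 header and 56-byte program header sizes match the readelf output quoted elsewhere in this thread.

```c
#include <stdint.h>

/* Sizes fixed by the ELF64 format (see the readelf header dump in this thread). */
#define EHDR64_SIZE 64u /* size of Elf64_Ehdr; program headers start here (0x40) */
#define PHDR64_SIZE 56u /* size of one Elf64_Phdr */

/*
 * Hypothetical helper: the value that lands in the 16-bit e_phnum field
 * when it is assigned "1 + num_loads" (one PT_NOTE plus the PT_LOADs).
 * The conversion to uint16_t wraps modulo 65536 without any warning.
 */
static uint16_t stored_phnum(uint32_t num_loads)
{
    return (uint16_t)(1u + num_loads);
}

/*
 * Hypothetical helper: file offset of the first byte after a program
 * header table that holds phnum entries.
 */
static uint64_t end_of_phdr_table(uint64_t phnum)
{
    return EHDR64_SIZE + phnum * PHDR64_SIZE;
}
```

With num_loads_dumpfile=76318, stored_phnum() gives 10783 and end_of_phdr_table(10783) gives 0x93708, matching ehdr64.e_phnum and note.p_offset in the log. The full table of 76319 headers actually ends at 0x413708, so the headers written past 0x93708 land on top of the note.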
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
> -Original Message-
> > > There are some other failure cases with non-null data, so maybe
> > > there's >1 bug here.
> > > I've not seen an obvious pattern to this. eg...
> > >
> > > https://pastebin.com/2uM4sBCF
> > >
> >
> > As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> > (i.e. num_loads_dumpfile > 65535):
>
> Oh, good catch. These are 256GB machines, so after discarding
> everything, that explains why we end up with so many sections.
> This also explains why it sometimes works I think, when the discarding
> manages to get the total nr headers <64k.

I think this will be one of the causes, and I had a look at how
we can fix it. If you get a vmcore where this pattern occurs,
you can try this tree:
https://github.com/k-hagio/makedumpfile/tree/support-extended-elf

Then, the crash utility also needs a patch to support a dumpfile
that has more than 64k program headers:
https://github.com/k-hagio/crash/tree/support-extended-elf

Thanks,
Kazu
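For readers wondering what "support a dumpfile that has more than 64k program headers" involves on the reading side: elf(5) says that when the real count does not fit, e_phnum holds PN_XNUM (0xffff) and the real count is stored in the sh_info member of the first section header. Below is a minimal sketch of that lookup. It is not taken from either tree above; real_phnum() is a hypothetical name, and the caller is assumed to have already read the ELF header and section header 0.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the real number of program headers.
 *
 * ehdr  - the ELF64 file header
 * shdr0 - section header index 0, or NULL if the file has none
 *
 * Per elf(5), e_phnum == PN_XNUM means the real count lives in
 * shdr0->sh_info. Returns 0 for a file that claims PN_XNUM but
 * provides no section header to hold the count.
 */
static uint64_t real_phnum(const Elf64_Ehdr *ehdr, const Elf64_Shdr *shdr0)
{
    if (ehdr->e_phnum != PN_XNUM)
        return ehdr->e_phnum;   /* ordinary case: the field is the count */
    if (shdr0 == NULL)
        return 0;               /* malformed: nowhere to read the count from */
    return shdr0->sh_info;      /* extended numbering */
}
```

A reader that skips this check and trusts e_phnum directly would see at most 65535 entries and miss the rest of the PT_LOAD segments.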
Re: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
On Thu, Oct 17, 2019 at 08:55:54PM +, Kazuhito Hagio wrote:

> > I'll rework things so that it redirects to a file instead of dmesg, but
> > it's going to take me a while to get that deployed and tested.
>
> If your hosts have enough space, there is another way: you can
> use cp for /proc/vmcore and use makedumpfile after reboot.
> For example:
>
> # cp --sparse=always /proc/vmcore vmcore.cp
> reboot
> # makedumpfile -E -d 31 --message-level 31 --cyclic-buffer 4096 vmcore.cp dump.Ed31

I did try something like this (but without the --sparse flag). It took
around 90 minutes to dump a 256GB core in my test, which isn't going to be
viable for our production hosts where I'm seeing the corruption problems.
I've also been trying, unsuccessfully, to replicate it on an isolated
machine with similar specifications.

I'll give the sparse flag a try, though if memory is full enough to
panic-on-oom (which seems to be one common trigger for this issue), things
might not be quite as sparse as I hope.

> where the --cyclic-buffer option is needed to behave like in the 2nd kernel
> on one of your hosts:
> [ 13.341818] Buffer size for the cyclic mode: 4194304
>
> The captured vmcore.cp may be useful for trying a next patch first.

We had similar thoughts ;)

Dave
RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
> -Original Message-
> On Wed, Oct 09, 2019 at 08:03:51PM +, Kazuhito Hagio wrote:
>
> > In this case, was the "makedumpfile Completed." message emitted?
> > It looks like the buffer of program headers was not written to the file..
> >
> > Anyway, a debugging patch attached below.
>
> Our kdump tooling redirects makedumpfile output to dmesg, and unfortunately
> this debug patch produces so much info it filled the ring buffer, so we
> didn't catch the beginning.

Ah, if makedumpfile makes more than 64k program headers, the debug log
will be more than 8MB. I should have told you this..

> I'll rework things so that it redirects to a file instead of dmesg, but
> it's going to take me a while to get that deployed and tested.

If your hosts have enough space, there is another way: you can
use cp for /proc/vmcore and use makedumpfile after reboot.
For example:

# cp --sparse=always /proc/vmcore vmcore.cp
reboot
# makedumpfile -E -d 31 --message-level 31 --cyclic-buffer 4096 vmcore.cp dump.Ed31

where the --cyclic-buffer option is needed to behave like in the 2nd kernel
on one of your hosts:
[ 13.341818] Buffer size for the cyclic mode: 4194304

The captured vmcore.cp may be useful for trying a next patch first.

Thanks,
Kazu
Re: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
On Wed, Oct 09, 2019 at 08:03:51PM +, Kazuhito Hagio wrote:

> In this case, was the "makedumpfile Completed." message emitted?
> It looks like the buffer of program headers was not written to the file..
>
> Anyway, a debugging patch attached below.

Our kdump tooling redirects makedumpfile output to dmesg, and unfortunately
this debug patch produces so much info it filled the ring buffer, so we
didn't catch the beginning.

I'll rework things so that it redirects to a file instead of dmesg, but
it's going to take me a while to get that deployed and tested.

Dave
Re: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
On Wed, Oct 09, 2019 at 08:03:51PM +, Kazuhito Hagio wrote:

> > 0x 0x 0
> > NULL 0x 0x 0x
> > 0x 0x 0
> >
>
> In this case, was the "makedumpfile Completed." message emitted?
> It looks like the buffer of program headers was not written to the file..

Our logging infra didn't capture the makedumpfile output. I've fixed
that up, so hopefully next time..

> Anyway, a debugging patch attached below.
>
> > There are some other failure cases with non-null data, so maybe
> > there's >1 bug here.
> > I've not seen an obvious pattern to this. eg...
> >
> > https://pastebin.com/2uM4sBCF
> >
>
> As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> (i.e. num_loads_dumpfile > 65535):

Oh, good catch. These are 256GB machines, so after discarding
everything, that explains why we end up with so many sections.
This also explains why it sometimes works I think, when the discarding
manages to get the total nr headers <64k.

> > I'll put your patch on some of the affected hosts and see if this
> > changes behaviour in any way.
>
> If you can try the patch below, which includes the previous patch,
> please show me:
> - the debugging output of makedumpfile
> - readelf -a vmcore
> - ls -ls vmcore

Will take me a few days (travelling right now), but hopefully by the
time I get back we'll have some data.

thanks for looking into this.

Dave
makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
Hi Dave,

Thank you for the information.

> -Original Message-
> Common case seems to be:
>
> ELF Header:
>   Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
>   Class: ELF64
>   Data: 2's complement, little endian
>   Version: 1 (current)
>   OS/ABI: UNIX - System V
>   ABI Version: 0
>   Type: CORE (Core file)
>   Machine: Advanced Micro Devices X86-64
>   Version: 0x1
>   Entry point address: 0x0
>   Start of program headers: 64 (bytes into file)
>   Start of section headers: 0 (bytes into file)
>   Flags: 0x0
>   Size of this header: 64 (bytes)
>   Size of program headers: 56 (bytes)
>   Number of program headers: 23881
>   Size of section headers: 0 (bytes)
>   Number of section headers: 0
>   Section header string table index: 0
>
> There are no sections in this file.
>
> There are no sections to group in this file.
>
> Program Headers:
>   Type Offset VirtAddr PhysAddr
>        FileSiz MemSiz Flags Align
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
> ...
>   NULL 0x 0x 0x
>        0x 0x 0
>   NULL 0x 0x 0x
>        0x 0x 0
>

In this case, was the "makedumpfile Completed." message emitted?
It looks like the buffer of program headers was not written to the file..

Anyway, a debugging patch attached below.

> There are some other failure cases with non-null data, so maybe
> there's >1 bug here.
> I've not seen an obvious pattern to this. eg...
>
> https://pastebin.com/2uM4sBCF
>

As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
(i.e. num_loads_dumpfile > 65535):

	/*
	 * Get the PT_LOAD number of the dumpfile.
	 */
	if (!(num_loads_dumpfile = get_loads_dumpfile_cyclic())) {
		ERRMSG("Can't get a number of PT_LOAD.\n");
		goto out;
	}

	if (is_elf64_memory()) { /* ELF64 */
		if (!get_elf64_ehdr(info->fd_memory,
				    info->name_memory, &ehdr64)) {
			ERRMSG("Can't get ehdr64.\n");
			goto out;
		}
		/*
		 * PT_NOTE(1) + PT_LOAD(1+)
		 */
		ehdr64.e_phnum = 1 + num_loads_dumpfile;

because e_phnum is uint16_t and the last LOAD of the dumpfile doesn't
reach up to the one of /proc/vmcore at all.

  LOAD 0x726029d4 0x88037ba1 0x00037ba1  <<-- paddr
       0x001c5000 0x004a9000 RWE0

[ 12.743942] LOAD[ 6]1 408000  <<-- phys_end

If that is the case, it seems that we need to set it to PN_XNUM (0xffff)
and have an entry in the section header table according to elf(5)..

> I'll put your patch on some of the affected hosts and see if this
> changes behaviour in any way.

If you can try the patch below, which includes the previous patch,
please show me:
- the debugging output of makedumpfile
- readelf -a vmcore
- ls -ls vmcore

Thanks,
Kazu

diff --git a/makedumpfile.c b/