On Tue, 2013-05-28 at 20:36 +0200, Jan Kratochvil wrote: > On Mon, 27 May 2013 16:05:46 +0200, Mark Wielaard wrote: > > Since it doesn't actually free the struct itself, lets call it > > clear_r_debug_info to avoid confusion. > > Done.
Thanks. > > Reporting the full file names was arguably wrong to begin with. I don't > > think we should restore that, but if people do feel this is a regression we > > could add a guesser to unstrip.c itself (but I don't think we should). > > The full pathnames were clearly a bug for this patchset. Yes, I agree, and I don't think anybody should rely on them, but maybe something did anyway. I am fine with calling that broken behavior though. > But when we talk about it in general, as a new patchset, I rather find a bug > dropping the path, when core file already contains the absolute pathname why > to hide it from the user? Right, if we can guarantee that path is the real one and not just an arbitrary guess. > But when we talk about some possible further extensions we should rather > forget about the whole link map and just use NT_FILE. Or one can just already > display the full pathnames easily (not in eu-readelf): > $ readelf -n test-core.core > Start End Page Offset > 0x0000000000400000 0x0000000000402000 0x0000000000000000 > /home/jkratoch/redhat/elfutils-libregr/test-core-main > 0x00000037ef400000 0x00000037ef420000 0x0000000000000000 > /usr/lib64/ld-2.16.so Yes that is nice, we should have support for that too, in the future, in some separate patch. Found the ELF NOTE description here: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2aa362c49c314a98fb9aebbd7760a461667bac05 Don't know if there is a more "official" documentation. Does gdb/gcore also add this NT_FILE note now? > > Document that count gets updated on success. > > The report_module helper is gone now. > > > The clearing of r_debug_info on failure is a nice because it prevents > > accidental memory leaks, but also confused me on first read of the patch > > (because I forgot the comment when reading the later code). > > The report_module helper is gone now. That is a nice cleanup. > > > /* We return the number of modules we found if we found any. > > > If we found none, we return -1 instead of 0 if there was an > > > - error rather than just nothing found. If link_map handling > > > - failed, we still have the sniffed modules. */ > > > - return sniffed == 0 || listed > sniffed ? listed : sniffed; > > > + error rather than just nothing found. */ > > > + return sniffed || listed >= 0 ? listed + sniffed : listed; > > > > What about the case of sniffed == 0 && listed == 0? > > Should that return -1? > > It did not return -1 before so why we should now? There may exist valid core > file which just does not contain any ELF module (=does not contain any > identifiable ELF module). > > A different case is when there is some real unexpected failure processing the > data structures, in such case the function still returns -1, either be an > early return -1 call or due to LISTED == -1 (from report_r_debug() > returning -1). Aha. Sorry, I misread the comment "if there was an error". So if there is no error and nothing found, just report zero. OK. > > > + /* Try to match up DYN_VADDR against L_LD as found in link map. > > > + Segments sniffing may guess invalid address as the first read-only > > > memory > > > + mapping may not be dumped to the core file (if ELF headers are not > > > dumped) > > > + and the ELF header is dumped first with the read/write mapping of > > > the same > > > + file at higher addresses. */ > > > + if (r_debug_info != NULL) > > > + for (const struct r_debug_info_module *module = r_debug_info->module; > > > + module != NULL; module = module->next) > > > + if (module_start <= module->l_ld && module->l_ld < module_end) > > > + { > > > + GElf_Addr fixup = module->l_ld - dyn_vaddr; > [...] > > fixup might need a comment. > > added: > /* L_LD read from link map must be right while DYN_VADDR is unsafe. > Therefore subtract DYN_VADDR and add L_LD to get a possibly > corrective displacement for all addresses computed so far. */ > > > It wasn't immediately clear to me. The fixup is the difference between the > > PT_LOAD address that the link-map know is the correct one and the PT_DYNAMIC > > address (of which there can be only one). And module_start points to the > > PT_LOAD address that was just gotten from the phdrs. Right? > > Right. Thanks. > > Actually now that I do understand, that is kind of what the comment above > > already says. I just didn't immediately made the connection with the > > offset/fixup being static. Is fixup always positive? > > At least in the supplied testcase it is negative. > > In fact I did not spent time thinking if it is positive or negative, > I was following the logic which address is safe and which unsafe above. OK, then I am wondering about the first check before the fixup. > + if (module_start <= module->l_ld && module->l_ld < module_end) Should the fixup not always be calculated and then the next check takes care of everything? > + GElf_Addr fixup = module->l_ld - dyn_vaddr; > + if ((fixup & (dwfl->segment_align - 1)) == 0 > + && module_start + fixup <= module->l_ld > + && module->l_ld < module_end + fixup) Since the fixup could just be zero? > > BTW. Why does link-map set module->name to the empty string when no name > > (NULL) was found? Just convenience like here, so no NULL check is > > needed, or is the empty name significant? > > 'name' is a GCC zero-length array, I am sorry but I cannot make it NULL: > struct r_debug_info_module > { > [...] > char name[0]; > }; Ah, OK then. > > > diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c > > > @@ -352,9 +355,108 @@ report_r_debug (uint_fast8_t elfclass, uint_fast8_t > > > elfdata, > > > /* We have to find the file's phdrs to compute along with l_addr > > > what its runtime address boundaries are. */ > > > > > > - // XXX hook for sysroot > > > - mod = INTUSE(dwfl_report_elf) (dwfl, basename (name), > > > - name, -1, l_addr, true); > > > + Dwfl_Module *mod = NULL; > > > + if (iterations == 1 && dwfl->executable_for_core != NULL) > > > + { > > > + /* Find the main executable. > > > + FIXME: Try to also find it via build-id. */ > > > + name = dwfl->executable_for_core; > > > + } > > > > To solve that FIXME we should do somewhat the same as is done below with > > the "inlined dwfl_report_elf" snippet? And we already do try to open the > > executable if we cannot read the phdrs. Maybe this can be factored out a > > bit? > > In fact when we do not have any executable filename or we failed to open the > executable filename the only thing we can do is currently already done by the > second pass of dwfl_segment_report_module calls with the r_debug_info hints. > > This code was there before I implemented the second pass even after the link > map pass succeeded (found >= 1 shared libraries). > > So I have removed the FIXME comment and simplified it all a bit. > > > > + if (iterations != 2 && name != NULL) > > > + { > > > + /* Find a shared library or main executable. Do not try to > > > + find a file for vDSO (where ITERATIONS equals 2). */ > > > > I couldn't find where the iteration == 1 => exe, iteration == 2 => vDSO > > comes from. Shouldn't the second check be iterations > 2 ? > > As discussed on #elfutils this ITERATIONS == 2 condition was unsafe, while > currently vDSO is the 2nd entry with glibc at least with older or non-Linux > core files 2nd entry is a regular shared library. > > One could make some name-based exceptions ("linux-vdso.so.1") like in > > http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-glibc-vdso-workaround.patch > > I found also this case is now obsoleted by the second > dwfl_segment_report_module pass, finding the vDSO did not do anything else > than just to call dwfl_segment_report_module for its segment. All this certainly makes things a lot cleaner. Thanks. > Fun is this all is now worthless - only for backward compatibility which has > discutable value for core files - as it has been obsoleted by NT_FILE; > probably, not sure, NT_FILE contains the list of any mapped files, not just > the files really opened as shared libraries. It certainly isn't worthless! We still would like to handle old/incomplete core files from older systems (on newer systems). We don't support NT_FILE yet. There are other ways core files are generated than with the latest kernel version, like with gdb/gcore, which we also like to handle. > Attaching significantly simplified patch. > commit cd584cc822842862923670ebc44d40bf55485a69 > Author: Jan Kratochvil <[email protected]> > Date: Thu May 23 20:19:16 2013 +0200 > > Use DT_DEBUG library search first. > > libdwfl/ > 2013-05-28 Jan Kratochvil <[email protected]> > > * argp-std.c (parse_opt) <ARGP_KEY_SUCCESS> <opt->core> <opt->e>: Set > executable_for_core before calling dwfl_core_file_report. > * core-file.c (clear_r_debug_info): New function. > (dwfl_core_file_report): Move raw segments reporting lower. New > variable r_debug_info, pass it to dwfl_segment_report_module. Call > clear_r_debug_info in the end. Return sum of LISTED and SNIFFED. > * dwfl_module_build_id.c (check_notes): Move into > __libdwfl_find_elf_build_id. > (__libdwfl_find_build_id): Rename to ... > (__libdwfl_find_elf_build_id): ... here. Add parameters build_id_bits, > build_id_elfaddr and build_id_len. Verify MOD vs. ELF. > (__libdwfl_find_elf_build_id) (check_notes): Remove parameters mod and > set, rename data_vaddr to data_elfaddr. Do not call found_build_id. > (__libdwfl_find_elf_build_id): Update the check_notes caller, do not > adjust its data_elfaddr parameter. > (__libdwfl_find_build_id): New wrapper of __libdwfl_find_elf_build_id. > * dwfl_segment_report_module.c (dwfl_segment_report_module): New > parameter r_debug_info. New variable name_is_final. Adjust addresses > according to R_DEBUG_INFO->MODULE. Check conflicts against DWFL. > Do not overwrite NAME by SONAME if NAME_IS_FINAL. > * libdwflP.h (__libdwfl_find_elf_build_id): New declaration. > (struct r_debug_info_module, struct r_debug_info): New definitions. > (dwfl_segment_report_module, dwfl_link_map_report): Add parameter > r_debug_info. > * link_map.c: Include fcntl.h. > (report_r_debug): Add parameter r_debug_info, describe it in the > function comment. Delete dwfl_addrmodule call and its dependent code. > Verify build-id before calling dwfl_report_elf, also supply > executable_for_core to it. Store r_debug_info->module info when > appropriate. > (dwfl_link_map_report): Add parameter r_debug_info. New variable > in_ok. Try to read IN from EXECUTABLE_FOR_CORE. Update report_r_debug > caller parameters. > > tests/ > 2013-05-28 Jan Kratochvil <[email protected]> > > * Makefile.am (EXTRA_DIST): Add test-core-lib.so.bz2, > test-core.core.bz2 and test-core.exec.bz2. > * run-addrname-test.sh: New test for these files. > * run-unstrip-n.sh: Update expected output. New test for these files. > * test-core-lib.so.bz2: New file. > * test-core.core.bz2: New file. > * test-core.exec.bz2: New file. > > Signed-off-by: Jan Kratochvil <[email protected]> Looks good to me. But please merge this commit and the code movement commit into one, since those are really one change. Thanks, Mark _______________________________________________ elfutils-devel mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/elfutils-devel
