Re: [Rpm-maint] [rpm-software-management/rpm] debugedit: Fix missing relocation of .debug_types section. (#1323)

Mark Wielaard Fri, 31 Jul 2020 09:12:14 -0700

Hi Florian,

On Fri, 2020-07-31 at 14:24 +0200, Florian Festi wrote:
> Mark can you have a look at this, please?


Sure. But somehow I never got this email through the mailinglist, so I
don't know how to reply so that my comments show up on that pull. Also,
it seems there are various force pushes that mess up the commit
ids/patches. But here is a review inline:

> > * [NFC] debugedit: Reindent edit_dwarf2().

This is basically:

diff --git a/tools/debugedit.c b/tools/debugedit.c
index 9f8dcd0fb..dcacb23b7 100644
--- a/tools/debugedit.c
+++ b/tools/debugedit.c
@@ -2100,8 +2100,9 @@ edit_dwarf2 (DSO *dso)
       return 1;
     }
 
-  if (debug_sections[DEBUG_INFO].data != NULL)
-    {
+  if (debug_sections[DEBUG_INFO].data == NULL)
+    return 0;
+
   unsigned char *ptr, *endcu, *endsec;
   uint32_t value;
   htab_t abbrev;
@@ -2480,7 +2481,6 @@ edit_dwarf2 (DSO *dso)
          macro_sec = macro_sec->next;
        }
     }
-    }
 
   return 0;
 }

Which is logically correct, but I don't know why you would do that.
It unnecessarily changes ~400 source lines, causing it to make it
slightly harder to track when which change was made by whom for what.
If it is for the same lines that are moved from edit_dwarf2 () to
edit_info () below then it is fine, but if you do it in a separate
commit maybe also factor out edit_info () already? To make the next
commit easier to read.

> >   * debugedit: Fix missing relocation of .debug_types section.

This will only work for executables or shared librareis, not for
(ET_REL) object files (or kernel modules) because those come with more
than 1 (comdat) .debug_types section. This is probably fine. But if you
expect that .debug_types will also appear in relocatable files, then
you might want to look at what edit_dwarf2() does for .debug_macro,
which might also appear multiple times.

You do include a testcase for the relocatable object case:

+# ===
+# Make sure -fdebug-types-section has updated strings in partial linked object.
+# ===
+AT_SETUP([debugedit .debug_types partial])
+AT_KEYWORDS([debugtypes] [debugedit])
+RPM_DEBUGEDIT_SETUP
+
+AT_DATA([expout],
+[fieldname1
+])
+
+AT_CHECK([[debugedit -b $(pwd) -d /foo/bar/baz ./foobarbaz.part.o]])
+AT_CHECK([[
+readelf --debug-dump=info ./foobarbaz.part.o \
+       | sed -n '/Abbrev Number:.*DW_TAG_type_unit/,/^$/p' \
+        | sed -n 's/^.*> *DW_AT_name *:.* \(fieldname.\)$/\1/p'
+]],[0],[expout])
+
+AT_CLEANUP

Which is great, thanks for adding that (and the single object files and
exectuable testcases). The tests look good, but won't catch the
multiple .debug_types sections case (and only test indirect strings in
.debug_str, not the small strings embedded in .debug_info).

I would suggest extending the testcase a little to have multiple larger
structs, plus some small field names. e.g. add another struct in
foobar.h:

struct types_section
{
  char t;
  int n1;
};

which then gets referenced in both foo.c and bar.c as

struct types_section debug_var_foo;

and

struct types_section debug_var_bar;

That way you get at least two .debug_type (comdat) sections and one
with "static" strings.

> >  * *M* tools/debugedit.c

> diff --git a/tools/debugedit.c b/tools/debugedit.c
> index dcacb23b7..f635c0c62 100644
> --- a/tools/debugedit.c
> +++ b/tools/debugedit.c
> @@ -1459,7 +1459,7 @@ edit_dwarf2_line (DSO *dso)
>     adjustments needed in the debug_list data structures. Returns true
>     if line_table needs to be rewrite either the dir or file paths. */
>  static bool
> -read_dwarf2_line (DSO *dso, uint32_t off, char *comp_dir)
> +read_dwarf2_line (DSO *dso, int info_scn_ix, uint32_t off, char *comp_dir)
>  {
>    unsigned char *ptr, *dir;
>    unsigned char **dirt;
> @@ -1470,7 +1470,7 @@ read_dwarf2_line (DSO *dso, uint32_t off, char 
> *comp_dir)
>    if (get_line_table (dso, off, &table) == false
>        || table == NULL)
>      {
> -      if (table != NULL)
> +      if (table != NULL && info_scn_ix == DEBUG_INFO)
>         error (0, 0, ".debug_line offset 0x%x referenced multiple times",
>                off);
>        return false;

Maybe we can just remove that warning. It isn't really "illegal" to
reference a debug line table multiple times. All we really care about
is that we only process it once (and only process those that are
actually used).

> @@ -1644,7 +1644,8 @@ find_new_list_offs (struct debug_lines *lines, size_t 
> idx)
>     abbrev data is consumed. In phase zero data is collected, in phase one
>     data might be replaced/updated.  */
>  static unsigned char *
> -edit_attributes (DSO *dso, unsigned char *ptr, struct abbrev_tag *t, int 
> phase)
> +edit_attributes (DSO *dso, int info_scn_ix, unsigned char *ptr,
> +                struct abbrev_tag *t, int phase)
>  {
>    int i;
>    uint32_t list_offs;
> @@ -1942,7 +1943,7 @@ edit_attributes (DSO *dso, unsigned char *ptr, struct 
> abbrev_tag *t, int phase)
>       separately (at the end of phase zero after all CUs have been
>       scanned in dwarf2_edit). */
>    if (phase == 0 && found_list_offs
> -      && read_dwarf2_line (dso, list_offs, comp_dir))
> +      && read_dwarf2_line (dso, info_scn_ix, list_offs, comp_dir))
>      need_stmt_update = true;
>  
>    free (comp_dir);

OK. But if we take out the warning in read_dwarf2_line () then we don't
have to pass through the info_scn_ix.

> @@ -1964,6 +1965,114 @@ line_rel_cmp (const void *a, const void *b)
>    return 0;
>  }
>  
> +static int
> +edit_info (DSO *dso, int phase, int scn_ix)
> +{
> [...]
>
> +  return 0;
> +}
> +
>  static int
>  edit_dwarf2 (DSO *dso)
>  {
> @@ -2103,12 +2212,10 @@ edit_dwarf2 (DSO *dso)
>    if (debug_sections[DEBUG_INFO].data == NULL)
>      return 0;
>  
> -  unsigned char *ptr, *endcu, *endsec;
> -  uint32_t value;
> -  htab_t abbrev;
> -  struct abbrev_tag tag, *t;
> +  unsigned char *ptr, *endsec;
>    int phase;
>    bool info_rel_updated = false;
> +  bool types_rel_updated = false;
>    bool macro_rel_updated = false;
>  
>    for (phase = 0; phase < 2; phase++)
> @@ -2120,99 +2227,21 @@ edit_dwarf2 (DSO *dso)
>           && !need_stmt_update)
>         break;
>  
> -      ptr = debug_sections[DEBUG_INFO].data;
> -      setup_relbuf(dso, &debug_sections[DEBUG_INFO], &reltype);
> [...]
> +      rel_updated = false;
> +      if (edit_info (dso, phase, DEBUG_TYPES))
>         return 1;
> -               }
> -
> -             ptr = edit_attributes (dso, ptr, t, phase);
> -             if (ptr == NULL)
> -               break;
> -           }
>  
> -         htab_delete (abbrev);
> -       }
> -
> -      /* Remember whether any .debug_info relocations might need
> +      /* Remember whether any .debug_types relocations might need
>          to be updated. */
> -      info_rel_updated = rel_updated;
> +      types_rel_updated = rel_updated;
>  
>        /* We might have to recalculate/rewrite the debug_line
>          section.  We need to do that before going into phase one

If I am reading this right, this moves the phase zero and phase one
parsing of .debug_info and now .debug_types into edit_info () from
edit_dwarf2 () calling it twice and keeping track of whether the
relocations need to be updated. This should be fine.

> @@ -2471,6 +2500,8 @@ edit_dwarf2 (DSO *dso)
>    /* Update any relocations addends we might have touched. */
>    if (info_rel_updated)
>      update_rela_data (dso, &debug_sections[DEBUG_INFO]);
> +  if (types_rel_updated)
> +    update_rela_data (dso, &debug_sections[DEBUG_TYPES]);
>  
>    if (macro_rel_updated)
>      {

This looks OK. But just before this the sections that have been changed
are marked "dirty", you probably want to mark DEBUG_TYPES also dirty if
it has been updated in any way (otherwise it isn't guaranteed the data
is written to disk, although it often will be).

One other tricky thing is tracking and updating of DW_AT_ might also
need to track whether the DW_AT_stmt_list. I think you are handling
this correctly, because the new edit_info () calls edit_attributes ()
which should take care of that. But the current testcase doesn't have
debug type units with different stmt_list indexes.

To make sure you test the case where there are multiple debug line
table offsets in your .debug_type sections, you might want to add
something like the following to bar.c:

struct large_struct
{
  int integer;
  char character;
};

struct large_struct large;

That way the type units have different stmt indexes.

Cheers,

Mark
_______________________________________________
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint

Re: [Rpm-maint] [rpm-software-management/rpm] debugedit: Fix missing relocation of .debug_types section. (#1323)

Reply via email to