On 29/10/17(Sun) 17:27, Mark Johnston wrote:
> On Sat, Oct 28, 2017 at 04:05:23PM +0200, Martin Pieuchot wrote:
> > Diff below makes ctfconv(1) aware of forward declarations and merge them
> > with the corresponding type as soon as they are known.
> > 
> > This reduces the number of type entries in a kernel CTF section by ~4K.
> > 
> > After applying this diff, grepping for "(0 bytes)" in the ctfdump(1)
> > output points to code that generally needs a cleanup.
> > 
> > ok?
> > 
> > [...]
> > @@ -299,12 +324,6 @@ it_cmp(struct itype *a, struct itype *b)
> >     if ((diff = (a->it_type - b->it_type)) != 0)
> >             return diff;
> >  
> > -   if ((diff = (a->it_size - b->it_size)) != 0)
> > -           return diff;
> > -
> > -   if ((diff = (a->it_nelems - b->it_nelems)) != 0)
> > -           return diff;
> > -
> >     /* Match by name */
> >     if (!(a->it_flags & ITF_ANON) && !(b->it_flags & ITF_ANON))
> >             return strcmp(it_name(a), it_name(b));
> 
> I don't understand the motivation for this part of the change.
> it_merge() copies the it_size and it_nelems fields, so we ought to be
> able to merge the two types regardless.

This change is needed because the internal data structure used to store
types is a rbtree and forward declarations do not contain any size or
nelems.
So the order of the checks is important because it will determine how
the tree is ordered.  Now we want to find the forward definition when
checking for the real type, and the other way around.  That's why this
"optimization" must go.

>                                         Moreover, what if two CUs
> define identically named structs? With this patch, we will merge
> them even when they have different sizes or member counts.

Such code would indeed be a problem.  But that's a different problem,
this diff doesn't make things worse in that regard.

> Note though that the existing checks in it_cmp() are not sufficient.
> For instance, ctfconv(1) currently will merge the following structs:
> 
> struct a { int foo; };
> struct a { u_int bar; };
> 
> Fixing this in general is non-trivial.

It's not particularity complicated, however I'm not sure it's worth it. 

First of all because it's not common to have to structure of the same
name in the same program, but more importantly, because I don't know
how I should use that data.  Which definition should the debugger tool
pick when you do:

print *(struct a *)0xffff8860

I think it would be more useful to emit a warning when a different
structure with the same name is found. 

Reply via email to