On Fri, Apr 24, 2026 at 05:53:47PM -0700, Song Liu wrote:
> On Wed, Apr 22, 2026 at 9:04 PM Josh Poimboeuf <[email protected]> wrote:
> >
> > Rewrite the symbol correlation code, using a tiered list of
> > deterministic strategies in a loop.  For duplicately named symbols, each
> > tier applies a filter with the goal of finding a 1:1 deterministic
> > correlation between the original and patched version of the symbol.
> >
> > Overall this works much better than the existing algorithm, particularly
> > with LTO kernels.
> 
> I found it is hard to follow all the matching algorithms here. Could you
> please add some examples for each case: different levels in find_twin(),
> match in find_twin_suffixed(), and match in find_twin_positional()?

Ack.

> Also a few nitpicks below.
> 
> > Signed-off-by: Josh Poimboeuf <[email protected]>
> > ---
> [...]
> > +static struct symbol *find_twin(struct elfs *e, struct symbol *sym1)
> > +{
> > +       struct symbol *name_last = NULL, *scope_last = NULL,
> > +                     *file_last = NULL, *csum_last = NULL;
> > +       unsigned int name_orig = 0, name_patched = 0;
> > +       unsigned int scope_orig = 0, scope_patched = 0;
> > +       unsigned int file_orig = 0, file_patched = 0;
> > +       unsigned int csum_orig = 0, csum_patched = 0;
> > +       struct symbol *sym2, *match = NULL;
> > +
> > +       /* Count orig candidates */
> > +       for_each_sym_by_demangled_name(e->orig, sym1->demangled_name, sym2) 
> > {
> > +               if (sym2->twin || sym1->type != sym2->type || 
> > dont_correlate(sym2) ||
> > +                   (!maybe_same_file(sym1, sym2)))
> >                         continue;
> >
> > -               count++;
> > -               result = sym2;
> > +               /* Level 1: name match (widest filter)  */
> > +               name_orig++;
> > +
> > +               /* Level 2: scope (scope changes allowed) */
> > +               if (is_tu_local_sym(sym1) != is_tu_local_sym(sym2))
> 
> is_tu_local_sym(sym1) is called many times, shall we add a variable
> for it?

Unless it's actually a performance issue I'd prefer not to add yet
another bit to struct symbol.

> 
> > +                       continue;
> > +               scope_orig++;
> > +
> > +               /* Level 3: file (scope changes disallowed) */
> > +               if (!same_file(sym1, sym2))
> > +                       continue;
> > +               file_orig++;
> > +
> > +               /* Level 4: checksum (unchanged symbols) */
> > +               if (sym1->len != sym2->len || !sym1->csum.checksum ||
> > +                   sym1->csum.checksum != sym2->csum.checksum)
> > +                       continue;
> > +               csum_orig++;
> >         }
> >
> > -       if (count > 1) {
> > -               ERROR("Multiple (%d) correlation candidates for %s", count, 
> > sym->name);
> > -               return -1;
> > +       /* Count patched candidates */
> > +       for_each_sym_by_demangled_name(e->patched, sym1->demangled_name, 
> > sym2) {
> > +               if (sym2->twin || sym1->type != sym2->type || 
> > dont_correlate(sym2))
> > +                       continue;
> > +
> > +               /* Level 1 */
> > +               if (!maybe_same_file(sym1, sym2))
> > +                       continue;
> 
> This for_each_sym_by_demangled_name() has two "if () continue" while the
> first one has one. Maybe keep them the same (just for symmetry)?

Oops, asymmetry not intended there, thanks.

-- 
Josh

Reply via email to