kwk marked 8 inline comments as done.
kwk added a comment.

In D67390#1685484 <https://reviews.llvm.org/D67390#1685484>, @labath wrote:

> This looks fairly ok to me, but it could use a more explicit test of the 
> symbol uniqueing code. Right, now I believe the two tests you added check 
> that the symbols are _not_ uniqued. (They're also a bit hard to follow due to 
> the whole
>
>   objcopy business). Could you create a test with a yaml file which will 
> contain various edge cases relevant to this code. I'm thinking of stuff like 
> "a dynsym and a symtab symbol at the same address, but a different name", "a 
> dynsym and symtab symbols with identical names, but different addresses", 
> etc. Then just run "image dump symtab" on that file to check what we have 
> parsed?


I'll give my best to implement this today.

> I am also surprised that you weren't able to just use a Section* (instead of 
> the name) for the uniqueing. I'd expect that all symbols (even those from the 
> separate object file) should be resolved to the sections in the main object. 
> I see that this isn't the case, but I am surprised that this isn't causing 
> any problems. Anyway, as things seem to be working as they are now, we can 
> leave that for another day.

Okay.

> In D67390#1685313 <https://reviews.llvm.org/D67390#1685313>, @kwk wrote:
> 
>> Before I used the bare symbol name with stripped `@VERSION` suffix. Now I've 
>> changed the implementation of `NamedELFSymbol` to include the `@VERSION` 
>> suffix and the tests pass.
> 
> 
> Interesting. I'm pretty sure that the symbol count is irrelevant for that 
> test (it just wants to know if it is there), so we can change that if needed. 
> However, having the uniqueing include the `@VERSION` sounds right to me, so 
> if that makes the test happy too then, great.

Yes, I hoped so. Thank you. Please await another revision of this patch with 
the tests requested.



================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ELFHeader.cpp:371-373
+  r.st_name = st_name;
+  return elf::ELFSymbol::operator==(r) &&
+         st_name_string == rhs.st_name_string;
----------------
labath wrote:
> kwk wrote:
> > clayborg wrote:
> > > I would almost just manually compare only the things we care about here. 
> > > Again, what about st_shndx when comparing a symbol from the main symbol 
> > > table and one from the .gnu_debugdata symbol table. Are those section 
> > > indexes guaranteed to match? I would think not. 
> > @clayborg I explicitly only compare what we care about and therefore always 
> > set the name index to  be the same.
> I'll echo @clayborg here. This business with copying the ELFSymbol and 
> clearing some fields is confusing. Do you even need the ELFSymbol::operator== 
> for anything? If not I'd just delete that, and have the derived version 
> compare all fields.
> 
> Also, it would be nice if the operator== and hash function definitions were 
> next to each other. Can you just forward declare the std::hash stuff in the 
> header, and have the implementation be next to this?
> I'll echo @clayborg here. This business with copying the ELFSymbol and 
> clearing some fields is confusing.

I've cleared up the documentation now and it is exactly the way I like it. 
Every entity deals with it's own business (respects its own fields when 
comparing). I find it pretty dirty to compare fields from the base struct in a 
derived one. The way I compare fields from the base struct is minimally 
invasive.

> Do you even need the ELFSymbol::operator== for anything?

Yes, when you want to compare ELFSymbols. I know that I don't do that 
explicitly but I the function only deals with fields from the entity itself and 
they don't spill into any derived structure (with the exception of explicitly 
ignoring fields).

> If not I'd just delete that, and have the derived version compare all fields.

No because I call it explcitly from the derived NamedELFSymbol.

> Also, it would be nice if the operator== and hash function definitions were 
> next to each other. Can you just forward declare the std::hash stuff in the 
> header, and have the implementation be next to this?

I've found a compromise that is even more appealing to me. The ELFSymbol and 
NamedELFSymbol structs now have a `hash` function which contains the 
implementation next to the one of `operator==()`. This `hash` is called in the 
specialization which remains in the same location as before; the reason being 
that I didn't find a way do define something in the `std::` namespace when I'm 
in the `elf::` namespace.




================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ELFHeader.h:446
+    std::size_t h2 = std::hash<const char *>()(s.st_name_string.AsCString());
+    std::size_t h3 = std::hash<const char 
*>()(s.st_section_name_string.AsCString());
+    return llvm::hash_combine(h1, h2, h3);
----------------
jankratochvil wrote:
> I find better to rather define `std::hash<ConstString>` (or provide 
> `ConstString::Hasher` which I do in my DWZ patchset).
Once your DWZ patchset arrives, please let me know and we can change it.


================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2205
+        needle.st_section_name_string = 
ConstString(symbol_section_sp->GetName());
+    if (unique_elf_symbols.find(needle) == unique_elf_symbols.end()) {
+      symtab->AddSymbol(dc_symbol);
----------------
labath wrote:
> jankratochvil wrote:
> > labath wrote:
> > > something like `if (unique_elf_symbols.insert(needle).second)` would be 
> > > more efficient, as you don't need to mess with the map twice.
> > I would unconditionally add all the symbols to `std::vector<NamedElfSymbol> 
> > unique_elf_symbols` (with `unique_elf_symbols.reserve` in advance for the 
> > sym of `.symtab`+`.dynsym` sizes) and then after processing both `.symtab` 
> > and `.dynsym` and can `llvm::sort(unique_elf_symbols)` and add to `symtab` 
> > only those which are unique. I believe it will be much faster, one could 
> > benchmark it.
> sounds like a good idea.
@jankratochvil @labath yes, this sound like a good idea for *performance* 
improvement but honestly, I need to get this patch done first in order to make 
any progress with minidebuginfo. I hope you don't mind when I take this task to 
another patch, okay?


================
Comment at: lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2650
+    // A unique set of ELF symbols added to the symtab
+    UniqueElfSymbolColl unique_elf_symbols({});
     SectionList *section_list = module_sp->GetSectionList();
----------------
labath wrote:
> what's wrong with the old-fashioned `UniqueElfSymbolColl unique_elf_symbols;` 
> ?
Apparently nothing now :) . But before I had troubles because of some deleted 
default constructor. Thanks for spotting this and bringing it back to my 
attention.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67390/new/

https://reviews.llvm.org/D67390



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to