Hi,

I've found an O(N^2) loop in the libdw library (N - number of symbols in the library). The dwfl_module_getsym function is called N times, but it calls resolve_symbol function each time and it loops through the number of symbols again.

So `readelf --debug-dump=info shared_object.so` command is very slow on large input objects. As an example, it takes several minutes to analyse libOpenImageIO.so.1.1.11 (27mb), but readelf from binutils takes just several seconds to do it.

The output of sprof for libelf:

  %        cumulative   self                                    self
 time     seconds       seconds    calls us/call      name
 42.16   28.42           28.42        1858018814     0.02 gelf_getsymshndx
 39.49   55.04           26.62        1798636744     0.01 gelf_getshdr
 18.35   67.41           12.37        1798635186     0.01 elf_getscn
 0.00     67.41           0.00          1562 0.00         elf_nextscn
 0.00     67.41           0.00          123 0.00         elf_ndxscn
 0.00     67.41           0.00          61 0.00         elf_strptr
 0.00     67.41           0.00          9 0.00         elf_getdata
 0.00     67.41           0.00          7 0.00         gelf_getphdr
 ...


The output of sprof for libdw:

  %         cumulative   self                             self
 time      seconds       seconds     calls          us/call name
 60.97    69.93           69.93         0 0.00          dwfl_module_getsym
 26.25    100.04         30.11         0               0.00 search_table
6.73 107.76 7.72 0 0.00 dwfl_adjusted_address 4.87 113.34 5.58 0 0.00 dwfl_adjusted_st_value 0.61 114.04 0.70 0 0.00 dwfl_adjusted_aux_sym_addr
 0.10      114.16         0.12           861603     0.14 dwarf_getattrs
0.10 114.27 0.11 0 0.00 __libdw_form_val_len
 0.05      114.33         0.06           0 0.00          __libdw_find_attr
 0.03      114.37         0.04           0 0.00          lookup
0.00 114.67 0.00 95891 0.00 dwfl_module_addrsym
 ...


The gelf_getsymshndx, gelf_getshdr and elf_getscn functions from libelf are called 2 billion times and it takes about 60% of total execution time. Another 26% of execution time are taken by the search_table function that is called by the dwfl_module_addrsym and loops through the number of symbols too. The dwfl_module_addrsym is called for each symbol. So it also takes O(N^2).

How can dwfl_module_getsym and dwfl_module_addrsym functions may be optimised to not to loop through all symbols in the library in order to find information for just one of them? This can make the library to be 10 times faster.

Thank you.

--
Andrey Ponomarenko, ROSA Lab.

_______________________________________________
elfutils-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/elfutils-devel

Reply via email to