On 12/12/2013 09:07 AM, Petr Machata wrote:
> Josh Stone <[email protected]> writes:
>
>> struct Dwarf_CU
>> {
>> @@ -283,7 +285,9 @@ struct Dwarf_CU
>>    size_t type_offset;
>>    uint64_t type_sig8;
>>
>> -  /* Hash table for the abbreviations.  */
>> +  /* Simple array for the abbreviations with low codes.  */
>> +  Dwarf_Abbrev *abbrev_array[CU_ABBREV_ARRAY_SIZE];
>
> This blows up Dwarf_CU from 100-odd bytes to something like half the
> page; I'm not entirely fond of that.  Especially since the hash table
> that follows is an on-demand-growing structure, having half a page
> reserved (possibly on stack) just in case seems wasteful.
>
> I'm looking into some debuginfo files.  libc has an average of 28
> abbreviations per CU (with 108 being the most), libstdc++ an average of
> 77 (with 108 the most), gcc 88 (142), libboost_python 206 (290), vmlinux
> 112 (185).  So reserving space for 256 seems overly generous.  I'd be
> fine with a growable heap-allocated array capped at 256.  Hopefully that
> would still be helpful performance-wise.
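[For concreteness, here is a rough sketch of the growable, heap-allocated
cache capped at 256 that Petr describes.  All of the names below (struct
abbrev_cache, CU_ABBREV_ARRAY_MAX, the helper functions) are hypothetical
illustrations, not existing libdw fields or API; codes beyond the cap would
still fall through to the existing hash table.]

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct Dwarf_Abbrev Dwarf_Abbrev;  /* Real definition lives in libdwP.h.  */

#define CU_ABBREV_ARRAY_MAX 256

struct abbrev_cache
{
  Dwarf_Abbrev **array;   /* Heap-allocated, indexed directly by code.  */
  size_t size;            /* Current length, always <= CU_ABBREV_ARRAY_MAX.  */
};

/* Remember ABBREV under CODE.  Returns false when CODE is over the cap
   (or on allocation failure), meaning the caller should put it in the
   hash table instead.  */
static bool
abbrev_cache_insert (struct abbrev_cache *cache, unsigned int code,
                     Dwarf_Abbrev *abbrev)
{
  if (code >= CU_ABBREV_ARRAY_MAX)
    return false;

  if (code >= cache->size)
    {
      /* Grow geometrically, but never past the cap.  */
      size_t new_size = cache->size == 0 ? 32 : cache->size * 2;
      while (new_size <= code)
        new_size *= 2;
      if (new_size > CU_ABBREV_ARRAY_MAX)
        new_size = CU_ABBREV_ARRAY_MAX;

      Dwarf_Abbrev **p = realloc (cache->array, new_size * sizeof *p);
      if (p == NULL)
        return false;
      memset (p + cache->size, 0, (new_size - cache->size) * sizeof *p);
      cache->array = p;
      cache->size = new_size;
    }

  cache->array[code] = abbrev;
  return true;
}

static Dwarf_Abbrev *
abbrev_cache_lookup (const struct abbrev_cache *cache, unsigned int code)
{
  return code < cache->size ? cache->array[code] : NULL;
}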
OK, I'll experiment with less generous static sizes on the heap, as well as
a dynamically realloc'd size (still capped, though).

I also had the idea that lookup can sometimes skip the modulus, which was by
far the hottest instruction in lookup(), accounting for ~72% of its time.
That basically means moving my "is-it-a-small-code" branch into the hash
lookup itself, so everything would stay in the hash table, hopefully just a
bit faster.  More things to try, anyway...
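[Roughly the idea, as a simplified sketch.  This is not the actual
Dwarf_abbrev_hash code in libdw; the real lookup's probing scheme and entry
layout differ, and the struct/function names here are made up for
illustration.]

#include <stddef.h>

typedef struct Dwarf_Abbrev Dwarf_Abbrev;  /* Real definition lives in libdwP.h.  */

struct hash_entry
{
  unsigned int code;      /* Abbrev code doubles as the hash key; 0 = empty.  */
  Dwarf_Abbrev *abbrev;
};

struct hash_table
{
  size_t size;
  struct hash_entry *table;
};

static Dwarf_Abbrev *
abbrev_lookup (const struct hash_table *ht, unsigned int code)
{
  /* The "is-it-a-small-code" branch: when CODE is already smaller than
     the table size, CODE % SIZE == CODE, so the expensive division can
     be skipped on the common path.  */
  size_t idx = code < ht->size ? code : code % ht->size;

  /* Linear probing, kept division-free, purely for illustration.  */
  for (size_t probes = 0; probes < ht->size; ++probes)
    {
      const struct hash_entry *e = &ht->table[idx];
      if (e->code == code)
        return e->abbrev;
      if (e->code == 0)   /* Empty slot: CODE is not in the table.  */
        return NULL;
      if (++idx == ht->size)
        idx = 0;
    }
  return NULL;
}

[Everything still lives in the one hash table; the only change is avoiding
the division when the code already fits within the table size.]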
