On Mon, 2014-12-15 at 22:42 +0100, Mark Wielaard wrote:
> On Mon, 2014-12-15 at 12:18 -0800, Josh Stone wrote:
> > On Fedora 21, this appears to be slightly faster, although pretty close
> > to noise levels.  Mark, can you see if this helps the performance slip
> > on your el7 system?
>
> It is slightly faster ~0.5 secs on ~55 secs.

Wait, I wasn't testing on an idle system. One of the cores was pretty
busy (with running a fuzzer...). I retested both the original
(mjw/pending) and your patch with nothing else eating cpu. Now (best of
3) original was 54.90 vs patched 53.90. So a whole second won.

> >    /* Unrolling 0 like uleb128 didn't prove to benefit optimization.  */
> > -  for (unsigned int i = 0; i < len_leb128 (acc) && *addrp < end; ++i)
> > +  const size_t max = __libdw_max_len_leb128 (*addrp, end);
> > +  for (size_t i = 0; i < max; ++i)
> >      get_sleb128_step (acc, *addrp, i);
> >    /* Other implementations set VALUE to INT_MAX in this
> >       case.  So we better do this as well.  */
>
> Unrolling this does seem to give an addition ~0.2 seconds win.

Adding unrolling now (same idle system, best out of 3) gives me 54.28.
So it does seem like another slight 0.6 second win.

Cheers,

Mark
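
For readers without the tree at hand, here is a rough standalone sketch
of the idea in the quoted hunk: compute the loop bound once (the smaller
of the bytes left in the buffer and the longest possible encoding), so
the loop condition is a plain counter compare. This is not the libdw
code. max_len_leb128 and read_sleb128 are hypothetical names, and the
64-bit result type and the INT64_MAX return for an unterminated value
are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on the bytes to read.  A 64-bit
   value needs at most ceil (64 / 7) = 10 LEB128 bytes; clamp that to
   what is left in the buffer.  */
static size_t
max_len_leb128 (const unsigned char *addr, const unsigned char *end)
{
  const size_t type_len = (sizeof (int64_t) * 8 + 6) / 7;
  const size_t avail = (size_t) (end - addr);
  return avail < type_len ? avail : type_len;
}

/* Hypothetical decoder: read one signed LEB128 value at *ADDRP and
   advance *ADDRP past the bytes consumed.  */
static int64_t
read_sleb128 (const unsigned char **addrp, const unsigned char *end)
{
  assert (*addrp <= end);

  uint64_t acc = 0;
  /* The bound is computed once, so each iteration only tests I.  */
  const size_t max = max_len_leb128 (*addrp, end);
  for (size_t i = 0; i < max; ++i)
    {
      const unsigned char b = *(*addrp)++;
      const unsigned int shift = (unsigned int) i * 7;
      acc |= (uint64_t) (b & 0x7f) << shift;
      if ((b & 0x80) == 0)
        {
          /* Last byte: sign-extend from bit 6 if there is room.  */
          if ((b & 0x40) != 0 && shift + 7 < 64)
            acc |= ~(uint64_t) 0 << (shift + 7);
          return (int64_t) acc;
        }
    }

  /* Ran out of bytes before the terminating byte (assumed fallback
     value for this sketch).  */
  return INT64_MAX;
}
```

With a fixed, small bound like this the compiler is also free to unroll
the loop; whether that pays off is what the timings above try to
measure.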
