On 5/2/2018 9:05 AM, Jakub Narebski wrote:
Derrick Stolee <sto...@gmail.com> writes:
     For a copy of the Linux repository, we measured the following
     performance improvements:

     git merge-base v3.3 v4.5

     Before: 234 ms
      After: 208 ms
      Rel %: -11%

     git merge-base v4.3 v4.5

     Before: 102 ms
      After:  83 ms
      Rel %: -19%

     The experiments above were chosen to demonstrate that we are
     improving the filtering of the merge-base set. In the first
     example, more time is spent walking the history to find the
     set of merge bases before the remove_redundant() call. The
     starting commits are closer together in the second example,
     therefore more time is spent in remove_redundant(). The relative
     change in performance differs as expected.
Nice.

I was not expecting as much performance improvements as we got for
--contains tests because remove_redundant() is a final step in longer
process, dominated by man calculations.  Still, nothing to sneeze about.

One reason these numbers are not too surprising is that remove_redundant() can demonstrate quadratic behavior. It is calculating pair-wise reachability by starting a walk at each of the candidates (in the worst case). In typical cases, the first walk marks many of the other candidates as redundant and we don't need to start walks from those commits.

A possible optimization could be to sort the candidates by descending generation so we find the first walk is likely to mark the rest as redundant. But this may already be the case if the candidates are added to the list in order of "discovery" which is already simulating this behavior.

Thanks,
-Stolee

Reply via email to