On 7/29/2018 6:33 AM, Nguyễn Thái Ngọc Duy wrote:
This series speeds up unpack_trees() a bit by using cache-tree.
unpack-trees could be split into three big parts:

- the actual tree unpacking and running n-way merging
- update worktree, which could be expensive depending on how much I/O
   is involved
- repair cache-tree

This series focuses on the first part alone and could give 700%
speedup (best case possible scenario, real life ones probably not that
impressive).

It also shows that repairing the cache-tree is kinda expensive. I have
an idea of reusing cache-tree from the original index, but I'll leave
that to Ben or others to try out and see if it helps at all.

v2 fixes the comments from Junio, adds more performance tracing and
reduces the cost of adding index entries.

Nguyễn Thái Ngọc Duy (4):
   unpack-trees.c: add performance tracing
   unpack-trees: optimize walking same trees with cache-tree
   unpack-trees: reduce malloc in cache-tree walk
   unpack-trees: cheaper index update when walking by cache-tree

  cache-tree.c   |   2 +
  cache.h        |   1 +
  read-cache.c   |   3 +-
  unpack-trees.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++++-
  unpack-trees.h |   1 +
  5 files changed, 166 insertions(+), 2 deletions(-)


I ran "git checkout" on a large repo and averaged the results of 3 runs. This clearly demonstrates the benefit of the optimized unpack_trees(), since even the final "diff-index" is essentially a 3rd call to unpack_trees().

baseline        new     
----------------------------------------------------------------------
0.535510167     0.556558733     s: read cache .git/index
0.3057373       0.3147105       s: initialize name hash
0.0184082       0.023558433     s: preload index
0.086910967     0.089085967     s: refresh index
7.889590767     2.191554433     s: unpack trees
0.120760833     0.131941267     s: update worktree after a merge
2.2583504       2.572663167     s: repair cache-tree
0.8916137       0.959495233     s: write index, changed mask = 28
3.405199233     0.2710663       s: unpack trees
0.000999667     0.0021554       s: update worktree after a merge
3.4063306       0.273318333     s: diff-index
16.9524923      9.462943133     s: git command: 'c:\git-sdk-64\usr\src\git\git.exe' checkout

The first call to unpack_trees() saves 72%.
The 2nd and 3rd calls each save 92%.
Total time savings for the entire command were 44%.
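
For anyone who wants to re-check those percentages, here is a minimal sketch that recomputes them from the table above. The numbers are copied from the table; the names (timings, percent_saved, savings) are made up for illustration and are not part of the series or of git.

```python
# Timings in seconds, copied from the table above as (baseline, new).
# The dictionary keys are illustrative labels, not trace output.
timings = {
    "unpack trees (1st call)": (7.889590767, 2.191554433),
    "unpack trees (2nd call)": (3.405199233, 0.2710663),
    "diff-index (3rd call)": (3.4063306, 0.273318333),
    "git checkout (total)": (16.9524923, 9.462943133),
}


def percent_saved(baseline, new):
    """Return the share of the baseline time that is no longer spent."""
    return (baseline - new) / baseline * 100.0


savings = {name: percent_saved(b, n) for name, (b, n) in timings.items()}

for name, pct in savings.items():
    print(f"{name:26s} {pct:5.1f}% saved")
```

Note that "percent saved" is not the same measure as the "speedup" in the cover letter: a 72% saving on the first call corresponds to the call running roughly 3.6 times as fast.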

In the performance game of whack-a-mole, that call to repair cache-tree is now looking quite expensive...

Ben
