Re: [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup
On Tue, Feb 25, 2014 at 06:43:24AM +0700, Duy Nguyen wrote: On Mon, Feb 24, 2014 at 11:21 PM, Kirill Smelkov k...@mns.spb.ru wrote: Hello up there. Here go combine-diff speedup patches in form of first reworking diff tree-walker to work in general case - when a commit have several parents, not only one - we are traversing all 1+nparent trees in parallel. Then we are taking advantage of the new diff tree-walker for speeding up combine-diff, which for linux.git results in ~14 times speedup. I think there is another use case for this n-tree walker (but I'm not entirely sure yet as I haven't really read the series). In git-log (either with pathspec or --patch) we basically do this diff HEAD^ HEAD diff HEAD^^ HEAD^ diff HEAD^^^ HEAD^^ diff HEAD HEAD^^^ ... so except HEAD (and the last commit), all commits' tree will be read/diff'd twice. With n-tree walker I think we may be able to diff them in batch to reduce extra processing: commit lists are split into 16-commit blocks where 16 trees are fed to the new tree walker at the same time. I hope it would make git-log a bit faster (especially for -S). Maybe not much. Thanks for commenting. Unfortunately, as it is now, no, and I doubt savings will be significant. The real speedup comes from the fact that for combined diff, we can omit recursing into subdirectories, if we know some diff D(commit,parent_i) is empty. Let me quote myself from http://article.gmane.org/gmane.comp.version-control.git/242217 On Sun, Feb 16, 2014 at 12:08:29PM +0400, Kirill Smelkov wrote: On Fri, Feb 14, 2014 at 09:37:00AM -0800, Junio C Hamano wrote: I wonder if this machinery can be reused for log -m as well (or perhaps you do that already?). After all, by performing a single parallel scan, you are gathering all the necessary information to let you pretend that you did N pairwise diff-tree. Unfortunately, as it is now, no, and let me explain why: The reason that is not true, is that we omit recursing into directories, if we know D(A,some-parent) for that path is empty. That means we don't calculate D(A,any-other-parents) for that path and subpaths. More structured description is that combined diff and log -m, which could be though as all diffs D(A,Pi) are different things: - the combined diff is D(A,B) generalization based on ^ (sets intersection) operator, and - log -m, aka all diffs is D(A,B) generalization based on v (sets union) operator. Intersection means, we can omit calculating parts from other sets, if we know some set does not have an element (remember don't recurse into subdirectories?), and unioning does not have this property. It does so happen, that ^ case (combine-diff) is more interesting, because in the end it allows to see new information - the diff a merge itself introduces. log -m does not have this property and is no more interesting to what plain diff(HEAD,HEAD^n) can provide - in other words it's just a convenience. Now, the diff tree-walker could be generalized once more, to allow clients specify, which diffs combination operator to use - intersection or unioning, but I doubt that for unioning case that would add significant speedup - we can't reduce any diff generation based on another diff and the only saving is that we traverse resulting commit tree once, but for some cases that could be maybe slower, say if result and some parents don't have a path and some parent does, we'll be recursing into that path and do more work compared to plain D(A,Pi) for Pi that lacks the path. In short: it could be generalized more, if needed, but I propose we first establish the ground with generalizing to just combine-diff. besides D(HEAD~, HEAD) D(HEAD~2, HEAD~) ... D(HEAD~{n}, HEAD~{n-1}) is different even from log -m case as now there is no single commit with several parents. On a related note, while developing this n-tree walker, I've learned that it is important to load trees in correct order. Quoting patch 18: - t1tree = fill_tree_descriptor(t1, old); - t2tree = fill_tree_descriptor(t2, new); + /* +* load parents first, as they are probably already cached. +* +* ( log_tree_diff() parses commit-parent before calling here via +* diff_tree_sha1(parent, commit) ) +*/ + for (i = 0; i nparent; ++i) + tptree[i] = fill_tree_descriptor(tp[i], parents_sha1[i]); + ttree = fill_tree_descriptor(t, sha1); so it loads parent's tree first. If we change this to be the other way, i.e. load commit's tree first, and then parent's tree, there will be up to 4% slowdown for whole plain `git log` (without -c). So maybe what could be done to speedup plain log is for diff tree-walker to populate some form of recently-loaded trees while walking, and drop trees from will not-be used anymore commits - e.g. after doing HEAD~..HEAD for next diff for HEAD~~..HEAD~ HEAD~
[PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup
Hello up there. Here go combine-diff speedup patches in form of first reworking diff tree-walker to work in general case - when a commit have several parents, not only one - we are traversing all 1+nparent trees in parallel. Then we are taking advantage of the new diff tree-walker for speeding up combine-diff, which for linux.git results in ~14 times speedup. This is the second posting for the whole series - sent here patches should go instead of already-in-pu ks/diff-tree-more and ks/tree-diff-nway into ks/tree-diff-nway - patches are related and seeing them all at once is more logical to me. I've tried to do my homework based on review feedback and the changes compared to v1 are: - fixed last-minute thinko/bug last time introduced on my side (sorry) with opt-pathchange manipulation in __diff_tree_sha1() - we were forgetting to restore opt-pathchange, which led to incorrect log -c (merges _and_ plain diff-tree) output; This time, I've verified several times, log output stays really the same. - direct use of alloca() changed to portability wrappers xalloca/xalloca_free which gracefully degrade to xmalloc/free on systems, where alloca is not available (see new patch 17). - i = 0; do { ... } while (++i nparent) is back to usual looping for (i = 0; i nparent; ++), as I've re-measured timings and the difference is negligible. ( Initially, when I was fighting for every cycle it made sense, but real no-slowdown turned out to be related to avoiding mallocs, load trees in correct order and reducing register pressure. ) - S_IFXMIN_NEQ definition moved out to cache.h, to have all modes registry in one place; - diff_tree() becomes static (new patch 13), as nobody is using it outside tree-diff.c (and is later renamed to __diff_tree_sha1); - p0 - first_parent; corrected comments about how emit_diff_first_parent_only behaves; not changed: - low-level helpers are still named with __ prefix as, imho, that is the best convention to name such helpers, without sacrificing signal/noise ratio. All of them are now static though. Signoffs were left intact, if a patch was already applied to pu with one, and had not changed. Please apply and thanks, Kirill P.S. Sorry for the delay - I was very busy. Kirill Smelkov (19): combine-diff: move show_log_first logic/action out of paths scanning combine-diff: move changed-paths scanning logic into its own function tree-diff: no need to manually verify that there is no mode change for a path tree-diff: no need to pass match to skip_uninteresting() tree-diff: show_tree() is not needed tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: rename compare_tree_entry - tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: simplify tree_entry_pathcmp tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: diff_tree() should now be static tree-diff: rework diff_tree interface to be sha1 based tree-diff: no need to call full diff_tree_sha1 from show_path() tree-diff: reuse base str(buf) memory on sub-tree recursion Portable alloca for Git tree-diff: rework diff_tree() to generate diffs for multiparent cases as well combine-diff: speed it up, by using multiparent diff tree-walker directly Makefile | 6 + cache.h | 15 ++ combine-diff.c| 170 +++--- config.mak.uname | 10 +- configure.ac | 8 + diff.c| 2 + diff.h| 12 +- git-compat-util.h | 8 + tree-diff.c | 666 +++--- 9 files changed, 724 insertions(+), 173 deletions(-) -- 1.9.rc1.181.g641f458 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup
On Mon, Feb 24, 2014 at 11:21 PM, Kirill Smelkov k...@mns.spb.ru wrote: Hello up there. Here go combine-diff speedup patches in form of first reworking diff tree-walker to work in general case - when a commit have several parents, not only one - we are traversing all 1+nparent trees in parallel. Then we are taking advantage of the new diff tree-walker for speeding up combine-diff, which for linux.git results in ~14 times speedup. I think there is another use case for this n-tree walker (but I'm not entirely sure yet as I haven't really read the series). In git-log (either with pathspec or --patch) we basically do this diff HEAD^ HEAD diff HEAD^^ HEAD^ diff HEAD^^^ HEAD^^ diff HEAD HEAD^^^ ... so except HEAD (and the last commit), all commits' tree will be read/diff'd twice. With n-tree walker I think we may be able to diff them in batch to reduce extra processing: commit lists are split into 16-commit blocks where 16 trees are fed to the new tree walker at the same time. I hope it would make git-log a bit faster (especially for -S). Maybe not much. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html