On Wed, 9 Oct 2013, Nguyễn Thái Ngọc Duy wrote:

> I know I still have a lot of holes to plug, but this was more
> interesting because we could see some encouraging numbers.
> Unfortunately the result is disappointing. Maybe I did it in a stupid
> way and need to restart with a totally different way.
> 
> "rev-list --objects" on v2 takes 4 secs, v4 with current walker 11s
> and the new walker 16s (worst!).

Clear indication that something is wrong if the intermediate step of 
converting to the canonical object representation is faster than your 
attempt at a native pv4 walker.

However here's the numbers I get here after a fresh clone of git.git

$ time git revlist --objects --all > /dev/null

real    0m2.619s
user    0m2.577s
sys     0m0.033s
$ mkdir orig
$ mv .git/objects/pack/pack-* orig/
$ ../../git/test-packv4 orig/pack-*.pack .git/objects/pack/pack-foo.pack
Scanning objects: 100% (162785/162785), done.
Writing objects: 100% (162785/162785), done.
$ time git rev-list --objects --all > /dev/null

real    0m6.210s
user    0m6.140s
sys     0m0.027s

[installing git with your latest patches applied here]

$ time git rev-list --objects --all > /dev/null

real    0m20.409s
user    0m20.337s
sys     0m0.029s

So... I get even *worse* numbers relative to v2 than you do!

Now let's mitigate the deep delta chaining effect in the tree encoding:

$ rm .git/objects/pack/pack-foo.*
$ ../../git/test-packv4 --min-tree-copy=50 orig/pack-*.pack 
.git/objects/pack/pack-foo.pack
Scanning objects: 100% (162785/162785), done.
Writing objects: 100% (162785/162785), done.
$ time git rev-list --objects --all > /dev/null

real    0m9.451s
user    0m9.393s
sys     0m0.036s

Using --min-tree-copy=50 produces a pack which is still smaller than 
pack v2 but any tree copy sequence must refer to a minimum of 50 
entries.  This significantly reduces the CPU usage in decode_entries() 
by reducing the needless chaining effect I mentioned here:

http://article.gmane.org/gmane.comp.version-control.git/234975

Now let's keep that pack and back out your patches.

[installing git with your latest patches reverted here]

$ time git rev-list --objects --all > /dev/null

real    0m3.751s
user    0m3.711s
sys     0m0.029s

So I shoved almost 2.5 seconds of runtime here.  Not the half of the 
original runtime, but still significant.

Let's push the chaining theory even further:

$ rm .git/objects/pack/pack-foo.*
$ ../../git/test-packv4 --min-tree-copy=100 orig/pack-*.pack 
.git/objects/pack/pack-foo.pack
Scanning objects: 100% (162785/162785), done.
Writing objects: 100% (162785/162785), done.
$ time git rev-list --objects --all > /dev/null

real    0m3.445s
user    0m3.406s
sys     0m0.029s

With --min-tree-copy=100 the resulting pack gets somewhat larger than 
the v2 equivalent, but still smaller if we take into account the size of 
both the pack and index files.  However the runtime benefit is no more 
significant.

So, there are 2 conclusions here:

1: The current tree delta euristic is inefficient for pack v4.

2- Something must be very wrong in your latest patches as they make it
   close to 3 times more expensive than without them.

> The only bit I haven't done is avoid checking if a tree is already
> examined, if so do not bother with copy sequences referring to it.
> That should cut down the number of decode_entries but not sure how
> much because there's no relation between tree traversing order and how
> copy sequences are made.

I'm sure it would help mitigate the deep delta chaining effect as well.

> Maybe we could make an exception and allow the tree walker to pass
> pv4_tree_cache* directly to decode_entries so it does not need to do
> the first lookup every time..
> 
> Suggestions?

I'll try to have a look at your patches in more details soon.


Nicolas

Reply via email to