On 4/3/2018 9:06 AM, Jeff King wrote:
On Tue, Apr 03, 2018 at 08:00:54AM -0400, Derrick Stolee wrote:

There are several commit-graph walks that require loading many commits
but never walk the trees reachable from those commits. However, the
current logic in parse_commit() requires the root tree to be loaded.
This only uses lookup_tree(), but when reading commits from the commit-
graph file, the hashcpy() to load the root tree hash and the time spent
checking the object cache take more time than parsing the rest of the
commit.

In this patch series, all direct references to accessing the 'tree'
member of struct commit are replaced instead by one of the following
methods:

        struct tree *get_commit_tree(struct commit *)
        struct object_id *get_commit_tree_oid(struct commit *)
This seems like a pretty sane thing to do. There may still be a few
parts of the code, notably fsck, that are reliant on a "struct object"
having been instantiated to determine whether an object is "used".  I
tried to clean that up around the time of c2d17b3b6e (fsck: check
HAS_OBJ more consistently, 2017-01-16), but I won't be surprised if
there's still some hidden assumptions.

I peeked at the fsck.c parts of patch 2, and it looks like we may be OK,
since you use get_commit_tree() during the walk.

This replacement was assisted by a Coccinelle script, but the 'tree'
member is overloaded in other types, so the script gave false-positives
that were removed from the diff.
That catches any existing in-tree callers, but not any topics in flight.
We could add the Coccinelle script and re-run it to catch any future
ones. But perhaps simpler still: can we rename the "tree" member to
"maybe_tree" or something, since nobody should be accessing it directly?
That will give us a compile error if an older topic is merged.

That's a good idea. I can reorg in v2 to rename 'tree' to 'maybe_tree' (and add an explicit comment that 'maybe_tree' could be NULL, so don't reference it directly). The check that the rename patch is correct is simply "does it compile?" Then Coccinelle could do the transfer of "c->maybe_tree" to "get_commit_tree(c)" safely, and can be added to the scripts.

I guess one caveat is to not include the mutators (c->maybe_tree = ...), but that's probably something Coccinelle can do.


On the Linux repository, performance tests were run for the following
command:

     git log --graph --oneline -1000

Before: 0.83s
After:  0.65s
Rel %: -21.6%
Neat.

Not strictly related, but I happened across another thing today that
might be worth investigating here. We allocate "struct commit" in these
nice big allocation blocks. But then for every commit we allocate at
least one "struct commit_list" to store the parent pointer.

I was looking at this from a memory consumption angle (I found a
pathological repository full of single-parent commits, and this wastes
an extra 16 bytes plus malloc overhead for every 64-byte "struct
commit").

But I wonder if it could also have non-negligible overhead in calling
malloc() for your case, since you've optimized out so much of the rest
of the work.

That is a pattern I'm seeing: we strip out one bit and something else shows up as a hot spot. This game of whack-a-mole is quite productive.

I'm not sure what the exact solution would be, but I imagine something
like variable-sized "struct commit"s with the parent pointers embedded,
with some kind of flag to indicate the number of parents (and probably
some fallback to break out to a linked list for extreme cases of more
than 2 parents).  It may end up pretty invasive, though, as there's a
lot of open-coded traversals of that parent list.

Anyway, not anything to do with this patch, but food for thought as you
micro-optimize these traversals.

One other thing that I've been thinking about is that 'struct commit' is so much bigger than the other structs in 'union any_object'. This means that the object cache, which I think creates blocks of 'union any_object' for memory-alignment reasons, is overly bloated. This would be especially true when walking many more trees than commits.

Perhaps there are other memory concerns in that case (such as cached buffers) that the 'union any_object' is not a concern, but it is worth thinking about as we brainstorm how to reduce the parent-list memory.

Thanks,
-Stolee

Reply via email to