On 2/28/2018 1:37 AM, Jeff King wrote:
On Tue, Feb 27, 2018 at 03:16:58PM -0800, Junio C Hamano wrote:

This code comes originally form 454fbbcde3 (git-rev-list: allow missing
objects when the parent is marked UNINTERESTING, 2005-07-10). But later,
in aeeae1b771 (revision traversal: allow UNINTERESTING objects to be
missing, 2009-01-27), we marked dealt with calling parse_object() on the
parents more directly.

So what I wonder is whether this code is simply redundant and can go
away entirely. That would save the has_object_file() call in all cases.
Hmm, interesting. I forgot all what I did around this area, but you
are right.
I'll leave it to Stolee whether he wants to dig into removing the
has_object_file() call. I think it would do the right thing, but the
most interesting bit would be how it impacts the timings.

This patch was so small that I could understand the full implication of my change.

I'm still very unfamiliar with the object walking machinery in revision.c. There are a lot of inter-dependencies that are taking time for me to understand and to feel confident that I won't cause a side effect in a special case. I do appreciate that Junio added a test in aeeae1b771.

I'll make a note to revisit this area after I have a better grasp of the object walk code, but I will not try removing the has_object_file() call now.


There's a similar case for trees. ...
though technically the existing code allows _missing_ trees, but
not on corrupt ones.
True, but the intention of these "do not care too much about missing
stuff while marking uninteresting" effort is aligned better with
ignoring corrupt ones, too, I would think, as "missing" in that
sentence is in fact about "not availble", and stuff that exists in
corrupt form is still not available anyway.  So I do not think it
makes a bad change to start allowing corrupt ones.
Agreed. Here it is in patch form, though as we both said, it probably
doesn't matter that much in practice. So I'd be OK dropping it out of
a sense of conservatism.

-- >8 --
Subject: [PATCH] mark_tree_contents_uninteresting: drop has_object check

It's generally acceptable for UNINTERESTING objects in a
traversal to be unavailable (e.g., see aeeae1b771). When
marking trees UNINTERESTING, we access the object database
twice: once to check if the object is missing (and return
quietly if it is), and then again to actually parse it.

We can instead just try to parse; if that fails, we can then
return quietly. That halves the effort we spend on locating
the object.

Note that this isn't _exactly_ the same as the original
behavior, as the parse failure could be due to other
problems than a missing object: it could be corrupted, in
which case the original code would have died. But the new
behavior is arguably better, as it covers the object being
unavailable for any reason. We'll also still issue a warning
to stderr in such a case.

Signed-off-by: Jeff King <p...@peff.net>
---
  revision.c | 5 +----
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/revision.c b/revision.c
index 5ce9b93baa..221d62c52b 100644
--- a/revision.c
+++ b/revision.c
@@ -51,12 +51,9 @@ static void mark_tree_contents_uninteresting(struct tree 
*tree)
  {
        struct tree_desc desc;
        struct name_entry entry;
-       struct object *obj = &tree->object;
- if (!has_object_file(&obj->oid))
+       if (parse_tree_gently(tree, 1) < 0)
                return;
-       if (parse_tree(tree) < 0)
-               die("bad tree %s", oid_to_hex(&obj->oid));
init_tree_desc(&desc, tree->buffer, tree->size);
        while (tree_entry(&desc, &entry)) {

Reply via email to