When we delete a ref, we have to rewrite the entire
packed-refs file. We take this opportunity to "curate" the
packed-refs file and drop any entries that are crufty or
broken.
Dropping broken entries (e.g., with bogus names, or ones
that point to missing objects) is actively a bad idea, as it
means that we lose any notion that the data was there in the
first place. Aside from the general hackiness that we might
lose any information about ref "foo" while deleting an
unrelated ref "bar", this may seriously hamper any attempts
by the user at recovering from the corruption in "foo".
They will lose the sha1 and name of "foo"; the exact pointer
may still be useful even if they recover missing objects
from a different copy of the repository. But worse, once the
ref is gone, there is no trace of the corruption. A
follow-up "git prune" may delete objects, even though it
would otherwise bail when seeing corruption.
We could just drop the "broken" bits from
curate_packed_refs, and continue to drop the "crufty" bits:
refs whose loose counterpart exists in the filesystem. This
is not wrong to do, and it does have the advantage that we
may write out a slightly smaller packed-refs file. But it
has two disadvantages:
1. It is a potential source of races or mistakes with
respect to these refs that are otherwise unrelated to
the operation. To my knowledge, there aren't any active
problems in this area, but it seems like an unnecessary
risk.
2. We have to spend time looking up the matching loose
refs for every item in the packed-refs file. If you
have a large number of packed refs that do not change,
that outweights the benefit from writing out a smaller
packed-refs file (it doesn't get smaller, and you do a
bunch of directory traversal to find that out).
Signed-off-by: Jeff King
---
I'll admit my argument against curate_packed_refs is a bit hand-wavy. I
won't be _too_ sad if somebody insists on cutting this back to just
keeping "broken" refs around, and still curating the "crufty" ones.
refs.c | 67 +
t/t5312-prune-corruption.sh | 2 +-
2 files changed, 2 insertions(+), 67 deletions(-)
diff --git a/refs.c b/refs.c
index 7f0e7be..47e4e53 100644
--- a/refs.c
+++ b/refs.c
@@ -2621,68 +2621,10 @@ int pack_refs(unsigned int flags)
return 0;
}
-/*
- * If entry is no longer needed in packed-refs, add it to the string
- * list pointed to by cb_data. Reasons for deleting entries:
- *
- * - Entry is broken.
- * - Entry is overridden by a loose ref.
- * - Entry does not point at a valid object.
- *
- * In the first and third cases, also emit an error message because these
- * are indications of repository corruption.
- */
-static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data)
-{
- struct string_list *refs_to_delete = cb_data;
-
- if (entry->flag & REF_ISBROKEN) {
- /* This shouldn't happen to packed refs. */
- error("%s is broken!", entry->name);
- string_list_append(refs_to_delete, entry->name);
- return 0;
- }
- if (!has_sha1_file(entry->u.value.sha1)) {
- unsigned char sha1[20];
- int flags;
-
- if (read_ref_full(entry->name, 0, sha1, &flags))
- /* We should at least have found the packed ref. */
- die("Internal error");
- if ((flags & REF_ISSYMREF) || !(flags & REF_ISPACKED)) {
- /*
-* This packed reference is overridden by a
-* loose reference, so it is OK that its value
-* is no longer valid; for example, it might
-* refer to an object that has been garbage
-* collected. For this purpose we don't even
-* care whether the loose reference itself is
-* invalid, broken, symbolic, etc. Silently
-* remove the packed reference.
-*/
- string_list_append(refs_to_delete, entry->name);
- return 0;
- }
- /*
-* There is no overriding loose reference, so the fact
-* that this reference doesn't refer to a valid object
-* indicates some kind of repository corruption.
-* Report the problem, then omit the reference from
-* the output.
-*/
- error("%s does not point to a valid object!", entry->name);
- string_list_append(refs_to_delete, entry->name);
- return 0;
- }
-
- return 0;
-}
-
int repack_without_refs(struct string_list *refnames, struct strbuf *err)
{
struct ref_dir *packed;
- struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
-