Re: git fsck: unreachable vs. dangling

Michael J Gruber Tue, 14 Apr 2015 01:50:53 -0700

Junio C Hamano venit, vidit, dixit 14.04.2015 10:05:
> On Tue, Apr 14, 2015 at 12:16 AM, Sebastian Schuberth
> <sschube...@gmail.com> wrote:
>> Hi,
>>
>> reading through the fsck docs [1] I'm having a hard time understanding
>> what the difference between "unreachable" and "dangling" objects are.
>>
>> By example, suppose I have a commit A that is the tip of exactly one
>> branch (and no tag or other ref points to A). If I delete that branch,
>> is A now dangling, or unreachable, or both?
> 
> Suppose that branch consists of two commits, A and A^.
> When you lose that branch (git branch -D that-branch),
> both A and A^ become unreachable. So are trees and
> blobs that appear only in A and A^ and nowhere else;
> they are also unreachable.
> 
> A dangling object is an unreachable object that cannot be
> made reachable by any way other than pointing at it
> directly with a ref. A^ is not dangling, because you can
> make it reachable by pointing A (the tip of the original
> branch you just lost) with a ref. A on the other hand is
> dangling (if you had a tag object that points at A that
> you lost, then A is merely unreachable but not dangling,
> because you can point at that tag with a ref and make A
> reachable).
>


This terminology is established, but misleading if you try to
understanding things from the word and the technical tree structure:

"to dangle" means "to hang loosely".

So, in the description above, "A^ dangles from A loosely" because it
hangs from A (you can reach it from A) but loosely, because it would
"drop" if A gets dropped and A is "likely" to be dropped (because it is
unreachable by refs). But A^ is not dangling in our terminology.

If you *reverse the arrows*, i.e. consider A^ pointing to A, it becomes
more apparent that A is dangling: it is an unreferenced leaf node.

But really, we're switching directions of the arrows again and again
when, on the one hand, we talk about refs pointing to commits, commits
pointing to parent commits (which they are, of course) but then, on the
other hand, use terms like "root" and "dangling" which make sense only
when you think of the oppositely oriented tree.

Maybe we should introduce the terms DAG and reverse DAG (GAD?) to make
this clearer. Our DAG is technically oriented from commits to their
parents all the way "up" to the root.

Visually, we often orient it from the root towards child commits.

When we do a revision walk, do we walk forward or backwards?

I'd say users think in the ordering in which they create commits (from
root to children), let's call that DAG. It explains "root" and "dangle"
and such, and a revision walk is a walk backwards in that orientation.

Technically, our acyclic graph is oriented the other way, a revision
walk walks forward in that orientation - it couldn't walk any other way
without extensive searching. This orientation (GAD?) explains "points
to", be it "ref to rev" or "commit to parent".

Michael

P.S.: The more mathematically inclined will notice that we can have more
than 2 orientations if our graph is not connected... But I'd say we have
one "technical" orientation (commit->parent) and the opposite and can
forget about others.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git fsck: unreachable vs. dangling

Reply via email to