On 07/10/2015 12:06 AM, Junio C Hamano wrote:
> David Turner <dtur...@twopensource.com> writes:
> 
>> OK, here's my current best idea:
>>
>> 1. A "pseudoref" is an all-caps file in $GIT_DIR/ that always contains
>> at least a SHA1.  CHERRY_PICK_HEAD and REVERT_HEAD are examples. Because
>> HEAD might be a symbolic ref, it is not a pseudoref. 
>>
>> Refs backends do not manage pseudorefs.  Instead, when a pseudoref (an
>> all-caps ref containing no slashes) is requested (e.g. git rev-parse
>> FETCH_HEAD) the generic refs code checks for the existence of that
>> file and if it exists, returns immediately without hitting the backend.
>> The generic code will refuse to allow updates to pseudorefs.
>>
>> 2. The pluggable refs backend manages all refs other than HEAD.
>>
>> 3. The "files" backend always manages HEAD.  This allows for a reflog
>> and for HEAD to be a symbolic ref.
>>
>> The major complication here is ref transactions -- what if there's a
>> transaction that wants to update e.g. both HEAD and refs/heads/master?
> 
> An update to the current branch (e.g. "git commit") does involve at
> least update to the reflog of HEAD, the current branch somewhere in
> refs/heads/ and its log, so it is not "what if" but is a norm [*1*].

The updating of reflogs for symbolic refs in general, and particularly
that of HEAD, is not done very cleanly. You can see the code in
`commit_ref_update()` (some of it helpfully commented to be a "Special
hack"):

* If a reference is modified through a symbolic ref, the symbolic ref is
locked rather than the reference itself.
* If a reference is modified directly, and HEAD points at it, then the
HEAD reflog is amended without locking HEAD.

Aside from the lack of proper locking, which could result in races with
other processes, we also have the problem that the same reference that
is being changed via one of these implicit updates could *also* be
changed directly in the same transaction. Such an update would evade the
`ref_update_reject_duplicates()` check.

Previously my thinking was that the locking should be done differently:
when the transaction is being processed, extra ref_update records could
be created for the extra reference(s) that have to be modified, then
these could be handled more straightforwardly. So supposing that HEAD
points at refs/heads/master,

* An update of HEAD would be turned into a reflog update and also add a
synthetic update to refs/heads/master.
* An update of refs/heads/master would add a synthetic update to the
HEAD reflog.

The first point would obviously apply to any updates via symbolic refs.
The second one should too, though this is a case that we currently punt
on to avoid the need to do reverse symbolic ref lookups.
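
To make the two rules above concrete, here is a rough sketch in Python
rather than C, just to keep it short. Everything in it (`RefUpdate`,
`expand_updates()`, `reject_duplicates()`) is a hypothetical name made up
for illustration, not existing Git code, and it only follows one level
of symbolic ref. The point is that once the synthetic updates are
explicit records, an ordinary duplicate check sees them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefUpdate:
    refname: str
    new_sha1: str
    reflog_only: bool = False  # write a reflog entry, leave the ref value alone
    synthetic: bool = False    # added by the expansion, not by the caller


def expand_updates(updates, symrefs):
    """Return `updates` plus the synthetic updates implied by symbolic refs.

    `symrefs` maps a symbolic ref to its target, e.g.
    {"HEAD": "refs/heads/master"} (assumed to be known up front).
    """
    # Reverse lookup: target -> symbolic refs pointing at it.
    pointing_at = {}
    for sym, target in symrefs.items():
        pointing_at.setdefault(target, []).append(sym)

    expanded = []
    for u in updates:
        if u.refname in symrefs:
            # Update through a symbolic ref: the symref only gets a reflog
            # entry, and the real update lands on its target.
            expanded.append(RefUpdate(u.refname, u.new_sha1, reflog_only=True))
            expanded.append(
                RefUpdate(symrefs[u.refname], u.new_sha1, synthetic=True))
        else:
            expanded.append(u)
            # Direct update: every symref pointing here gets a reflog entry.
            for sym in pointing_at.get(u.refname, []):
                expanded.append(RefUpdate(sym, u.new_sha1,
                                          reflog_only=True, synthetic=True))
    return expanded


def reject_duplicates(updates):
    """Raise ValueError if any refname occurs more than once."""
    seen = set()
    for u in updates:
        if u.refname in seen:
            raise ValueError(f"multiple updates for ref '{u.refname}'")
        seen.add(u.refname)
```

With HEAD pointing at refs/heads/master, a transaction that names both
HEAD and refs/heads/master expands to two records for each of them, so
`reject_duplicates()` refuses it instead of letting the implicit update
slip through.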

>> It may be the case that this never happens; I have not actually audited
>> the code to figure it out.  If someone knows for sure that it does not
>> happen, please say so. But assuming it does happen, here's my idea:
>>
>> If the refs backend is the files backend, we can simply treat HEAD like
>> any other ref.
>>
>> If the refs backend is different, then the refs code needs to hold a
>> files-backend transaction for HEAD, which it will commit immediately
>> after the other transaction succeeds.  We can stick a pointer to the
>> extra transaction in the generic struct ref_transaction, which (as
>> Michael Haggerty suggests) specific backends will extend.
>>
>> A failure to commit either transaction will be reported as a failure,
>> and we'll give an additional inconsistent state warning if the main
>> transaction succeeds but the HEAD transaction fails.
> 
> Yeah, I was thinking along those lines, too.  Thanks for clearly
> writing it down.
> 
>> What do other folks think?
> 
> Me too ;-)

I don't have an answer right now, and I have to get on an airplane in a
few hours so I can't think hard about it at the moment. But let me also
braindump another vague plan that I have had for a long time:
overlayable reference storage schemes. Think of the way that loose refs
are currently overlaid on top of packed refs. I think it might be useful
to support overlaying more generally.

In this particular case there could be a workspace-local reference
storage that only handles HEAD and perhaps some of the other
pseudoreferences. That could be overlaid onto loose reference storage
(which would then only concern itself with references under "refs/"),
which would in turn be overlaid onto packed refs. The workspace-local
reference storage layer would have evil special-cased code for dealing
with the references that live outside of "refs/".
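
As a toy illustration of the read side of such a stack (again Python,
and again all names are hypothetical: `DictStore` and `OverlayStore` are
not Git APIs, and the `handles` predicate is my assumption about how a
layer would declare which references it is responsible for):

```python
class DictStore:
    """Toy reference store backed by a dict, standing in for a real backend."""

    def __init__(self, name, refs, handles):
        self.name = name
        self.refs = dict(refs)
        self.handles = handles  # predicate: does this layer own `refname`?

    def read(self, refname):
        return self.refs.get(refname)


class OverlayStore:
    """Resolve a reference by asking each layer in turn, topmost first."""

    def __init__(self, layers):
        self.layers = list(layers)

    def read(self, refname):
        for layer in self.layers:
            if not layer.handles(refname):
                continue
            value = layer.read(refname)
            if value is not None:
                return value  # an upper layer shadows the ones below it
        return None


def under_refs(refname):
    return refname.startswith("refs/")


# Workspace-local layer on top of loose refs on top of packed refs.
workspace = DictStore("workspace", {"HEAD": "ref: refs/heads/master"},
                      handles=lambda name: not under_refs(name))
loose = DictStore("loose", {"refs/heads/master": "1" * 40},
                  handles=under_refs)
packed = DictStore("packed", {"refs/heads/master": "0" * 40,
                              "refs/tags/v1": "2" * 40},
                   handles=under_refs)
store = OverlayStore([workspace, loose, packed])
```

Here `store.read("refs/heads/master")` comes from the loose layer, which
shadows the packed value, `store.read("refs/tags/v1")` falls through to
packed refs, and `store.read("HEAD")` never reaches the layers that only
know about "refs/".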

A `ref_transaction_commit()` would be broken into phases: first each of
the stacked backends would be asked to verify that the transaction is
possible and acquire any necessary locks, then each backend would get
the final "commit" command.
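
A minimal sketch of that phasing, under the same caveats as before
(Python for brevity; `Layer`, `LayerError` and `transaction_commit()`
are invented for this example and the in-memory "locks" only stand in
for real lock files):

```python
class LayerError(Exception):
    """Raised by a layer that cannot take part in the transaction."""


class Layer:
    def __init__(self, name, handles):
        self.name = name
        self.handles = handles  # predicate: does this layer own `refname`?
        self.refs = {}
        self.locked = set()

    def prepare(self, updates):
        """Phase 1: verify the updates this layer owns and lock them."""
        mine = {r: v for r, v in updates.items() if self.handles(r)}
        for refname in mine:
            if refname in self.locked:
                raise LayerError(f"{self.name}: cannot lock {refname}")
        self.locked.update(mine)
        return mine

    def commit(self, mine):
        """Phase 2: write the new values and release the locks."""
        self.refs.update(mine)
        self.locked.difference_update(mine)

    def abort(self, mine):
        """Release locks taken in phase 1 without writing anything."""
        self.locked.difference_update(mine)


def transaction_commit(layers, updates):
    prepared = []
    try:
        # Phase 1: every layer must agree before anything is written.
        for layer in layers:
            prepared.append((layer, layer.prepare(updates)))
    except LayerError:
        for layer, mine in prepared:
            layer.abort(mine)
        raise
    # Phase 2: all layers hold their locks, so tell each one to commit.
    for layer, mine in prepared:
        layer.commit(mine)
```

If the backend for "refs/" cannot lock refs/heads/master, the
workspace-local layer releases its lock on HEAD and nothing is written
in either layer. This sketch does not cover a failure during phase 2
itself.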

This construct would make it easy for different backends to share the
same implementation for HEAD (and potentially other workspace-local)
references, by simply layering that one storage mechanism on top of
their own.

That would probably be overengineering if it were only used to deal with
HEAD, but I think it is a nice general mechanism that could have other
applications.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
