Re: [PATCH v4 13/21] refs: resolve symbolic refs first

2016-02-18 Thread Michael Haggerty
On 02/18/2016 01:29 AM, David Turner wrote:
> On Fri, 2016-02-12 at 15:09 +0100, Michael Haggerty wrote:
>> On 02/05/2016 08:44 PM, David Turner wrote:
>>> Before committing ref updates, split symbolic ref updates into two
>>> parts: an update to the underlying ref, and a log-only update to the
>>> symbolic ref.  This ensures that both references are locked correctly
>>> while their reflogs are updated.
>>>
>>> It is still possible to confuse git by concurrent updates, since the
>>> splitting of symbolic refs does not happen under lock. So a symbolic ref
>>> could be replaced by a plain ref in the middle of this operation, which
>>> would lead to reflog discontinuities and missed old-ref checks.
>>
>> This patch is doing too much at once for my little brain to follow.
>>
>> My first hangup is the change to setting RESOLVE_REF_NO_RECURSE
>> unconditionally in lock_ref_sha1_basic(). I count five callers of that
>> function and see no justification for why the change is OK in the
>> context of each caller. Here are some thoughts:
>>
>> * The call from files_create_symref() sets REF_NODEREF, so it is
>> unaffected by this change.
> 
> Yes.
> 
>> * The call from files_transaction_commit() is preceded by a call to
>> dereference_symrefs(), which I assume effectively replaces the need for
>> RESOLVE_REF_NO_RECURSE.
> 
> Yes.
> 
>> * There are two calls from files_rename_ref(). Why is it OK to do
>> without RESOLVE_REF_NO_RECURSE there?
>>
>>   * For the oldrefname call, I suppose the justification is the "(flag &
>> REF_ISSYMREF)" check earlier in the function. (But does this introduce a
>> significant TOCTOU race?)
> 
> The refs code as a whole seems likely to have TOCTOU issues. In
> general, anywhere we check/set flag & REF_ISSYMREF without holding a
> lock, we have a potential problem.  I haven't generally tried to handle
> these cases, since they're not presently handled.  

I agree that we don't do so well here, though I think that most races
would result in reading/writing a ref that was pointed to by the symref
a moment ago, which is usually indistinguishable to the user from their
update having gone through the moment before the symref was updated. So
I don't think your change makes this bit of code significantly worse.

> The central problem with this area of the code is that commit interacts
> so intimately with the locking machinery.  I understand some of why
> it's done that way.  In particular, your change to ref locking to not
> hold lots of open files was a big win for us at Twitter.  But this
> means that it's hard to deal with cross-backend ref updates: you want
> to hold multiple locks, and backends don't have the machinery for it.
> 
> We could add backend hooks to specifically lock and unlock refs. Then
> the backend commit code would just be handed a bundle of locked refs
> and would commit them.  This might be hairy, but it could fix the
> TOCTOU problems.  So, first lock the outer refs, then split out updates
> for any which are symbolic refs, and lock those. Finally, commit all
> updates (split by backend).

As chance would have it, for an internal GitHub project I've implemented
hooks that can be called *during* a ref transaction. The hooks can, for
example, take arbitrary actions between the time that the ref locks are
all acquired and the time that the updates start to be committed. I
didn't submit this code upstream because I didn't think that it would
benefit other users, but maybe it would be useful for implementing
split-backend reference transaction commits. E.g., the primary reference
transaction could run the secondary backend's commit while holding the
locks for the primary backend references.
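
To make the idea concrete, here is a minimal sketch of a three-phase
commit with a hook that runs while all locks are held. It is a toy
in-memory model, not the internal code mentioned above and not git's
refs API: every name in it (toy_ref, toy_update, toy_commit,
toy_locked_hook and the two example hooks) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in git. */
struct toy_ref {
	const char *name;
	int value;	/* stand-in for the object id */
	int locked;	/* stand-in for a held lock file */
};

struct toy_update {
	struct toy_ref *ref;
	int new_value;
};

/* Called after every lock is held and before any update is written. */
typedef int (*toy_locked_hook)(void *cb_data);

static int toy_commit(struct toy_update *updates, size_t nr,
		      toy_locked_hook hook, void *cb_data)
{
	size_t i, locked = 0;
	int ret = -1;

	/* Phase 1: acquire every lock; give up if one is already held. */
	for (i = 0; i < nr; i++) {
		if (updates[i].ref->locked)
			goto unlock;
		updates[i].ref->locked = 1;
		locked++;
	}

	/* Phase 2: run the hook, e.g. a secondary backend's commit. */
	if (hook && hook(cb_data))
		goto unlock;

	/* Phase 3: write the updates while the locks are still held. */
	for (i = 0; i < nr; i++)
		updates[i].ref->value = updates[i].new_value;
	ret = 0;

unlock:
	/* Release only the locks this call acquired. */
	for (i = 0; i < locked; i++)
		updates[i].ref->locked = 0;
	return ret;
}

/* Example hook: checks that the primary ref is locked when it runs. */
static int toy_secondary_commit(void *cb_data)
{
	struct toy_ref *primary = cb_data;

	assert(primary->locked);
	return 0;
}

/* Example hook that vetoes the whole transaction. */
static int toy_failing_hook(void *cb_data)
{
	(void)cb_data;
	return -1;
}
```

In this sketch a failing hook aborts the primary commit before anything
is written, which is the property a split-backend commit would probably
want; whether the real code could offer the same guarantee is an open
question.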

Let me think about it.

I don't think this is urgent though. The current code is not
significantly racy in mainstream usage scenarios, right?

> One downside of this is that right now, the backend API is relatively
> close to the front-end, and this would leak what should be an
> implementation detail.  But maybe this is necessary to knit multiple
> backends together.  
> 
> But I'm not sure that this is necessary right now, because I'm not sure
> that I'm actually making TOCTOU issues much worse. 

Agreed.

> [...]
> That's a legit complaint.  The problem, as you note, is that doing some
> of these steps completely independently doesn't work.  But I'll try
> splitting out what I can.

Thanks!

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 13/21] refs: resolve symbolic refs first

2016-02-17 Thread David Turner
On Fri, 2016-02-12 at 15:09 +0100, Michael Haggerty wrote:
> On 02/05/2016 08:44 PM, David Turner wrote:
> > Before committing ref updates, split symbolic ref updates into two
> > parts: an update to the underlying ref, and a log-only update to the
> > symbolic ref.  This ensures that both references are locked correctly
> > while their reflogs are updated.
> >
> > It is still possible to confuse git by concurrent updates, since the
> > splitting of symbolic refs does not happen under lock. So a symbolic ref
> > could be replaced by a plain ref in the middle of this operation, which
> > would lead to reflog discontinuities and missed old-ref checks.
> 
> This patch is doing too much at once for my little brain to follow.
> 
> My first hangup is the change to setting RESOLVE_REF_NO_RECURSE
> unconditionally in lock_ref_sha1_basic(). I count five callers of that
> function and see no justification for why the change is OK in the
> context of each caller. Here are some thoughts:
> 
> * The call from files_create_symref() sets REF_NODEREF, so it is
> unaffected by this change.

Yes.

> * The call from files_transaction_commit() is preceded by a call to
> dereference_symrefs(), which I assume effectively replaces the need for
> RESOLVE_REF_NO_RECURSE.

Yes.

> * There are two calls from files_rename_ref(). Why is it OK to do
> without RESOLVE_REF_NO_RECURSE there?
>
>   * For the oldrefname call, I suppose the justification is the "(flag &
> REF_ISSYMREF)" check earlier in the function. (But does this introduce a
> significant TOCTOU race?)

The refs code as a whole seems likely to have TOCTOU issues. In
general, anywhere we check/set flag & REF_ISSYMREF without holding a
lock, we have a potential problem.  I haven't generally tried to handle
these cases, since they're not presently handled.  

The central problem with this area of the code is that commit interacts
so intimately with the locking machinery.  I understand some of why
it's done that way.  In particular, your change to ref locking to not
hold lots of open files was a big win for us at Twitter.  But this
means that it's hard to deal with cross-backend ref updates: you want
to hold multiple locks, and backends don't have the machinery for it.

We could add backend hooks to specifically lock and unlock refs. Then
the backend commit code would just be handed a bundle of locked refs
and would commit them.  This might be hairy, but it could fix the
TOCTOU problems.  So, first lock the outer refs, then split out updates
for any which are symbolic refs, and lock those. Finally, commit all
updates (split by backend).
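
As a rough illustration of that ordering (lock the outer ref, split the
update under that lock, lock the referent, then commit), here is a toy
in-memory sketch. It is not the patch's code: toy_ref, toy_lock and
toy_update are hypothetical names, only one level of symref is followed,
and a single integer stands in for the object id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; these types are not git's. */
struct toy_ref {
	const char *name;
	struct toy_ref *target;	/* non-NULL means this is a symbolic ref */
	int value;		/* stand-in for the object id */
	int log_entries;	/* stand-in for the reflog length */
	int locked;
};

/* Take the lock, or fail if somebody else already holds it. */
static int toy_lock(struct toy_ref *ref)
{
	if (ref->locked)
		return -1;
	ref->locked = 1;
	return 0;
}

static int toy_update(struct toy_ref *ref, int new_value)
{
	struct toy_ref *inner = NULL;

	/* Step 1: lock the outer ref before looking at what it is. */
	if (toy_lock(ref))
		return -1;

	/* Steps 2 and 3: split under that lock, then lock the referent. */
	if (ref->target) {
		if (toy_lock(ref->target)) {
			ref->locked = 0;
			return -1;
		}
		inner = ref->target;
	}

	/* Step 4: commit with every needed lock held. */
	assert(ref->locked && (!inner || inner->locked));
	if (inner) {
		inner->value = new_value;	/* real update */
		inner->log_entries++;
		ref->log_entries++;		/* log-only update */
		inner->locked = 0;
	} else {
		ref->value = new_value;
		ref->log_entries++;
	}
	ref->locked = 0;
	return 0;
}
```

Because the symref check happens after the outer lock is taken, nothing
can swap the symref for a plain ref between the check and the write in
this model. The cost is the one described above: the locking steps
become visible to whoever drives the backends.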

One downside of this is that right now, the backend API is relatively
close to the front-end, and this would leak what should be an
implementation detail.  But maybe this is necessary to knit multiple
backends together.  

But I'm not sure that this is necessary right now, because I'm not sure
that I'm actually making TOCTOU issues much worse. 

>   * For the newrefname call, I suppose it's because the code a little
> higher up tries to delete any existing reference with that name. It
> looks to me like the old code was slightly broken: if newrefname was an
> unborn symbolic reference, then: read_ref_full() would fail;
> delete_ref() would be skipped; lock_ref_sha1_basic() would lock the
> *referred-to* reference; the referred-to reference would be overwritten
> instead of newrefname. So it could be that here REF_NODEREF indirectly
> fixes a bug?

Yes, that's correct.  These two appear to be separable, so I'll make
this an independent patch (and add a test for that case).

> * The last call, from files_reflog_expire(), is also questionable 
> before your patch. If refname is a symref, then the function is 
> expiring the reflog of the symref. But (before this patch) it locks 
> not the symref but its referent. 

I can also separate this one.

> This was discussed at some length before on the mailing list [1], and
> the conclusion was that the current behavior is wrong, but for
> backwards compatibility reasons it would be safest to change it to
> locking *both* the symref and its referent.

Yes, that would be the right thing to do.  But for the reasons I
discuss above, that requires a serious change in the way that backends
work.  

> If possible, it would be better to split this patch up into several: the
> first few would each add the REF_NODEREF flag at one callsite, with a
> careful justification of why that is OK. Once all the callsites (except
> the one in files_transaction_commit()) have been changed, then the last
> patch could add the dereference_symrefs() machinery and change the last
> callsite.
>
> (I'm not certain that those steps are actually doable independently,
> given that REF_NODEREF has other effects besides setting
> RESOLVE_REF_NO_RECURSE.)
>
> I'm not just being pedantic here. The patch as written is really too big
> to review effectively.

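That's a legit complaint.  The problem, as you note, is that doing some
of these steps completely independently doesn't work.  But I'll try
splitting out what I can.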

Re: [PATCH v4 13/21] refs: resolve symbolic refs first

2016-02-12 Thread Michael Haggerty
On 02/05/2016 08:44 PM, David Turner wrote:
> Before committing ref updates, split symbolic ref updates into two
> parts: an update to the underlying ref, and a log-only update to the
> symbolic ref.  This ensures that both references are locked correctly
> while their reflogs are updated.
> 
> It is still possible to confuse git by concurrent updates, since the
> splitting of symbolic refs does not happen under lock. So a symbolic ref
> could be replaced by a plain ref in the middle of this operation, which
> would lead to reflog discontinuities and missed old-ref checks.

This patch is doing too much at once for my little brain to follow.

My first hangup is the change to setting RESOLVE_REF_NO_RECURSE
unconditionally in lock_ref_sha1_basic(). I count five callers of that
function and see no justification for why the change is OK in the
context of each caller. Here are some thoughts:

* The call from files_create_symref() sets REF_NODEREF, so it is
unaffected by this change.

* The call from files_transaction_commit() is preceded by a call to
dereference_symrefs(), which I assume effectively replaces the need for
RESOLVE_REF_NO_RECURSE.

* There are two calls from files_rename_ref(). Why is it OK to do
without RESOLVE_REF_NO_RECURSE there?

  * For the oldrefname call, I suppose the justification is the "(flag &
REF_ISSYMREF)" check earlier in the function. (But does this introduce a
significant TOCTOU race?)

  * For the newrefname call, I suppose it's because the code a little
higher up tries to delete any existing reference with that name. It
looks to me like the old code was slightly broken: if newrefname was an
unborn symbolic reference, then: read_ref_full() would fail;
delete_ref() would be skipped; lock_ref_sha1_basic() would lock the
*referred-to* reference; the referred-to reference would be overwritten
instead of newrefname. So it could be that here REF_NODEREF indirectly
fixes a bug?
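
The sequence above can be sketched with a toy ref table. This is only an
illustration of the reasoning, not the files backend: toy_ref, toy_find,
toy_read and toy_lock_name are hypothetical names, and the model follows
at most one level of symref.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; these are not git's data structures. */
struct toy_ref {
	const char *name;
	const char *symref_target;	/* NULL for a plain ref */
};

static const struct toy_ref *toy_find(const struct toy_ref *refs,
				      size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(refs[i].name, name))
			return &refs[i];
	return NULL;
}

/*
 * Returns 1 if `name` can be read: it is a plain ref, or a symref
 * whose target exists. An unborn symref reads as missing.
 */
static int toy_read(const struct toy_ref *refs, size_t nr,
		    const char *name)
{
	const struct toy_ref *ref = toy_find(refs, nr, name);

	if (!ref)
		return 0;
	if (!ref->symref_target)
		return 1;
	return toy_find(refs, nr, ref->symref_target) != NULL;
}

/* Returns the name that a lock taken on `name` ends up protecting. */
static const char *toy_lock_name(const struct toy_ref *refs, size_t nr,
				 const char *name, int noderef)
{
	const struct toy_ref *ref = toy_find(refs, nr, name);

	if (!noderef && ref && ref->symref_target)
		return ref->symref_target;
	return name;
}
```

With a table that holds only an unborn symref, toy_read() reports the
name as missing (so a delete step would be skipped), while a
dereferencing lock lands on the target name rather than on the symref
itself. Passing noderef keeps the lock on the name that was asked for.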

* The last call, from files_reflog_expire(), is also questionable before
your patch. If refname is a symref, then the function is expiring the
reflog of the symref. But (before this patch) it locks not the symref
but its referent. This was discussed at some length before on the
mailing list [1], and the conclusion was that the current behavior is
wrong, but for backwards compatibility reasons it would be safest to
change it to locking *both* the symref and its referent.

If possible, it would be better to split this patch up into several: the
first few would each add the REF_NODEREF flag at one callsite, with a
careful justification of why that is OK. Once all the callsites (except
the one in files_transaction_commit()) have been changed, then the last
patch could add the dereference_symrefs() machinery and change the last
callsite.

(I'm not certain that those steps are actually doable independently,
given that REF_NODEREF has other effects besides setting
RESOLVE_REF_NO_RECURSE.)

I'm not just being pedantic here. The patch as written is really too big
to review effectively.

Michael

[1]
http://thread.gmane.org/gmane.comp.version-control.git/263552/focus=263555

-- 
Michael Haggerty
mhag...@alum.mit.edu



[PATCH v4 13/21] refs: resolve symbolic refs first

2016-02-05 Thread David Turner
Before committing ref updates, split symbolic ref updates into two
parts: an update to the underlying ref, and a log-only update to the
symbolic ref.  This ensures that both references are locked correctly
while their reflogs are updated.

It is still possible to confuse git by concurrent updates, since the
splitting of symbolic refs does not happen under lock. So a symbolic ref
could be replaced by a plain ref in the middle of this operation, which
would lead to reflog discontinuities and missed old-ref checks.

Signed-off-by: David Turner 
---
 refs.c   |  69 +++
 refs/files-backend.c | 132 ++-
 refs/refs-internal.h |   8 
 3 files changed, 145 insertions(+), 64 deletions(-)

diff --git a/refs.c b/refs.c
index 283a5ec..227c018 100644
--- a/refs.c
+++ b/refs.c
@@ -1152,6 +1152,71 @@ int refs_init_db(struct strbuf *err, int shared)
 	return the_refs_backend->init_db(err, shared);
 }
 
+/*
+ * Special case for symbolic refs when REF_NODEREF is not turned on.
+ * Dereference them here, mark them REF_LOG_ONLY, and add an update
+ * for the underlying ref.
+ */
+static int dereference_symrefs(struct ref_transaction *transaction,
+			       struct strbuf *err)
+{
+	int i;
+	int nr = transaction->nr;
+
+	for (i = 0; i < nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		const char *resolved;
+		unsigned char sha1[20];
+		int resolve_flags = 0;
+		int mustexist = update->flags & REF_HAVE_OLD &&
+			!is_null_sha1(update->old_sha1);
+		int deleting = (update->flags & REF_HAVE_NEW) &&
+			is_null_sha1(update->new_sha1);
+
+		if (mustexist)
+			resolve_flags |= RESOLVE_REF_READING;
+		if (deleting)
+			resolve_flags |= RESOLVE_REF_ALLOW_BAD_NAME |
+				RESOLVE_REF_NO_RECURSE;
+
+		if (strcmp(update->refname, "HEAD"))
+			update->flags |= REF_IS_NOT_HEAD;
+
+		resolved = resolve_ref_unsafe(update->refname, resolve_flags,
+					      sha1, &update->type);
+		if (!resolved) {
+			/*
+			 * We may notice this breakage later and die
+			 * with a sensible error message
+			 */
+			update->type |= REF_ISBROKEN;
+			continue;
+		}
+
+		hashcpy(update->read_sha1, sha1);
+
+		if (update->flags & REF_NODEREF ||
+		    !(update->type & REF_ISSYMREF))
+			continue;
+
+		/* Create a new transaction for the underlying ref */
+		if (ref_transaction_update(transaction,
+					   resolved,
+					   update->new_sha1,
+					   (update->flags & REF_HAVE_OLD) ?
+					   update->old_sha1 : NULL,
+					   update->flags & ~REF_IS_NOT_HEAD,
+					   update->msg, err))
+			return -1;
+
+		/* Make the symbolic ref update non-recursive */
+		update->flags |= REF_LOG_ONLY | REF_NODEREF;
+		update->flags &= ~REF_HAVE_OLD;
+	}
+
+	return 0;
+}
+
 int ref_transaction_commit(struct ref_transaction *transaction,
 			   struct strbuf *err)
 {
@@ -1168,6 +1233,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		return 0;
 	}
 
+	ret = dereference_symrefs(transaction, err);
+	if (ret)
+		goto done;
+
 	if (get_affected_refnames(transaction, &affected_refnames, err)) {
 		ret = TRANSACTION_GENERIC_ERROR;
 		goto done;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fdcdc7..d4f9040 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -7,7 +7,6 @@
 
 struct ref_lock {
 	char *ref_name;
-	char *orig_ref_name;
 	struct lock_file *lk;
 	struct object_id old_oid;
 };
@@ -1857,7 +1856,6 @@ static void unlock_ref(struct ref_lock *lock)
 	if (lock->lk)
 		rollback_lock_file(lock->lk);
 	free(lock->ref_name);
-	free(lock->orig_ref_name);
 	free(lock);
 }
 
@@ -1913,6 +1911,7 @@ static int remove_empty_directories(struct strbuf *path)
  */
 static struct ref_lock *lock_ref_sha1_basic(const char *refname,
 					    const unsigned char *old_sha1,
+					    const unsigned char *read_sha1,
 					    const struct string_list *extras,