Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2013-01-21 Thread Drew Northup
On Sat, Jan 5, 2013 at 11:12 AM, Jeff King  wrote:
> On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:
>
>> The general approach is to setup a transaction and either
>> commit or abort it.  A transaction can be setup by renaming
>> an appropriately setup directory to the "ref.lock" name.  If
>> the rename succeeds, the transaction is begun.  Any actor can
>> abort the transaction (up until it is committed) by simply
>> deleting the "ref.lock" directory, so it is not at risk of
>> going stale.
>
> Deleting a directory is not atomic, as you first have to remove the
> contents, putting it into a potentially inconsistent state. I'll assume
> you deal with that later...

I know I'm a bit late to the dance here, but FWIW the apparent atomicity
of directory deletion depends largely upon the filesystem and VFS API in
use. It is not unheard of for a delete operation to actually consist of
moving the reference to the item's own allocation marker into a
"trashcan" to be cleaned up later.
In other words, I'd not advise planning on directory deletes always
being atomic, nor on them never being atomic.

-- 
-Drew Northup
--
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2013-01-05 Thread Jeff King
On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:

> The general approach is to setup a transaction and either 
> commit or abort it.  A transaction can be setup by renaming 
> an appropriately setup directory to the "ref.lock" name.  If 
> the rename succeeds, the transaction is begun.  Any actor can 
> abort the transaction (up until it is committed) by simply 
> deleting the "ref.lock" directory, so it is not at risk of 
> going stale.

Deleting a directory is not atomic, as you first have to remove the
contents, putting it into a potentially inconsistent state. I'll assume
you deal with that later...

> One important piece of the transaction is the use of uuids.  
> The uuids provide a mechanism to tie the atomic commit pieces 
> to the transactions and thus to prevent long-sleeping processes 
> from inadvertently performing actions which could be out of 
> date when they finally wake up.

Has this been a problem for you in practice? Avoiding this is one of the
reasons that git does not take out long locks; instead, it takes the
lock only at the moment it is ready to write, and aborts if it has been
updated since the longer-term operation began. This has its own problems
(you might do a lot of work only to have your operation aborted), but I
am not sure that your proposal improves on that.

Your proposal does sound like it could potentially improve robustness
when killing stale transactions (i.e., you know that the transaction
originator will never wake up and think it still has the lock). But
again, is that a problem in practice? Git typically holds ref locks for
a few syscalls. If you are conservative about leaving potentially stale
locks in place (e.g., give them a few minutes to complete before
assuming they are now bogus), you will not run into that problem.

The more conservative you are about treating a lock as stale, of course,
the less performant you will be in the face of stale locks. But since
they're the exception, that isn't a big problem in practice (at least it
has not been for me).

> In each case, the atomic 
> commit piece is the renaming of a file.   For the create and 
> update pieces, a file is renamed from the "ref.lock" dir to 
> the "ref" file resulting in an update to the sha for the ref.

I think we've had problems with cross-directory renames on some
filesystems, but I don't recall the details. I know that Coda does not
like cross-directory links, but cross-directory renames are OK (and in
fact we fall back to the latter when the former does not work).

Ah, here we go: 5723fe7 (Avoid cross-directory renames and linking on
object creation, 2008-06-14). Looks like NFS is the culprit.

> In the case of a delete, the actor may verify that "ref" 
> currently contains the sha to "prune" if it needs to, and 
> then renames the "ref" file to "ref.lock/uuid/delete". On 
> success, the ref was deleted.
> 
> Whether successful or not, the actor may now simply delete 
> the "ref.lock" directory, clearing the way for a new 
> transaction.  Any other actor may delete this directory at 
> any time also, likely either on conflict (if they are 
> attempting to initiate a transaction), or after a grace 
> period just to cleanup the FS.  Any actor may also safely 
> cleanup the tmp directories, preferably also after a grace 
> period.

Hmm. So what happens to the "delete" file when the ref.lock directory is
being deleted? Presumably deleting the ref.lock directory means doing it
recursively (which is non-atomic). But then why are we keeping the
delete file at all, if we're just about to remove it?

What happens if another process wants to cancel a transaction that is
partially done? That is, the ref.lock directory is created, but it
contains the uuid subdir? It sounds like it's OK to just delete from it,
and the original process will then fail at its rename?

> One neat part about this scheme is that I believe it would be 
> backwards compatible with the current locking mechanism since 
> the transaction directory will simply appear to be a lock to 
> older clients.  And the old lock file should continue to lock 
> out these newer transactions.

Yeah, I don't see anything that would prevent that. The current code
only cares about open("$ref.lock", O_EXCL). But that should be correct
and atomic with respect to the renamed directories. You are depending on
atomic directory renames. Those aren't used anywhere yet in git, as far
as I know. So that may run into problems (but there's no reason this
system couldn't be optional for filesystems that are more capable, and
other systems could fall back to the straight-file locking).
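
To make that compatibility argument concrete, here is a minimal sketch
(illustrative only, not git's actual lock code) of an old-style locker;
open() with O_CREAT|O_EXCL fails with EEXIST whether "$ref.lock" is an
old-style lock file or a renamed transaction directory:

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>

  /*
   * Hypothetical old-style locker: take "$ref.lock" with O_CREAT|O_EXCL.
   * If a newer client has already renamed a transaction directory to
   * "$ref.lock", this open fails with EEXIST, so the two schemes
   * exclude each other.
   */
  static int take_old_style_lock(const char *ref_lock_path)
  {
          int fd = open(ref_lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
          if (fd < 0) {
                  if (errno == EEXIST)
                          fprintf(stderr, "lock or transaction held: %s\n",
                                  ref_lock_path);
                  return -1;
          }
          return fd;  /* caller writes the new sha1, then renames to "$ref" */
  }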


So in response to your question, no, I don't see any real showstoppers
here. And unlike the "all refs are files in a directory" scheme, it's
confined to writing, which solves the readdir() atomicity questions I
had there.

I still can't say I'm super excited about it, just because it seems like
a solution in search of a problem that I have not experienced myself.
But 

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2013-01-04 Thread Martin Fick
On Friday, January 04, 2013 10:52:43 am Pyeron, Jason J CTR (US) wrote:
> > From: Martin Fick
> > Sent: Thursday, January 03, 2013 6:53 PM
> > 
> > Any thoughts on this idea?  Is it flawed?  I am trying
> > to write it up in a more formal generalized manner and
> > was hoping to get at least one "it seems sane" before
> > I do.
> 
> If you are assuming that atomic renames, etc. are
> available, then you should identify a test case and a
> degrade operation path when it is not available.

Thanks, sounds reasonable.  Were you thinking of a runtime 
test case that would be run before every transaction?  I 
was anticipating a per-repo config option called 
something like "core.locks = recoverable" that would be 
needed to turn them on.  I was thinking that this was 
something that server sites could test in advance on 
their repos and then enable it for them.  Maybe a git-lock 
tool with a --test-recoverable option?

-Martin


> > 
> > On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> > > On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> > > > It concerns me that git uses any locking at all,
> > > > even for refs since it has the potential to leave
> > > > around stale locks.
> > > > ...
> > > > [a previous not so great attempt to fix this]
> > > > ...
> > > 
> > > I may have finally figured out a working loose ref
> > > update mechanism which I think can avoid stale
> > > locks. Unfortunately it requires atomic directory
> > > renames and universally unique identifiers (uuids).
> > > These may be no-go criteria?  But I figure it is
> > > worth at least exploring this idea because of the
> > > potential benefits?
> > > 
> > > The general approach is to setup a transaction and
> > > either commit or abort it.  A transaction can be
> > > setup by renaming an appropriately setup directory
> > > to the "ref.lock" name.  If the rename succeeds, the
> > > transaction is begun.  Any actor can abort the
> > > transaction (up until it is committed) by simply
> > > deleting the "ref.lock" directory, so it is not at
> > > risk of going stale.  However, once the actor who
> > > sets up the transaction commits it, deleting the
> > > "ref.lock" directory simply aids in cleaning it up
> > > for the next transaction (instead of aborting it).
> > > 
> > > One important piece of the transaction is the use of
> > > uuids. The uuids provide a mechanism to tie the
> > > atomic commit pieces to the transactions and thus to
> > > prevent long-sleeping processes from inadvertently
> > > performing actions which could be out of date when
> > > they finally wake up.  In each case, the atomic
> > > commit piece is the renaming of a file.   For the
> > > create and update pieces, a file is renamed from the
> > > "ref.lock" dir to the "ref" file resulting in an
> > > update to the sha for the ref. However, in the
> > > delete case, the "ref" file is instead renamed to
> > > end up in the "ref.lock" directory resulting in a
> > > delete of the ref.  This scheme does not affect the
> > > way refs are read today.
> > > 
> > > To prepare for a transaction, an actor first
> > > generates a uuid (an exercise I will delay for now).
> > > Next, a tmp directory named after the uuid is
> > > generated in the parent directory for the ref to be
> > > updated, perhaps something like:  ".lock_uuid". In
> > > this directory is placed either a file or a
> > > directory named after the uuid, something like:
> > > ".lock_uuid/,uuid".  In the case of a create or an
> > > update, the new sha is written to this file.  In the
> > > case of a delete, it is a directory.
> > > 
> > > Once the tmp directory is setup, the initiating actor
> > > attempts to start the transaction by renaming the tmp
> > > directory to "ref.lock".  If the rename fails, the
> > > update fails. If the rename succeeds, the actor can
> > > then attempt to commit the transaction (before
> > > another actor aborts it).
> > > 
> > > In the case of a create, the actor verifies that
> > > "ref" does not currently exist, and then renames the
> > > now named "ref.lock/uuid" file to "ref". On success,
> > > the ref was created.
> > > 
> > > In the case of an update, the actor verifies that
> > > "ref" currently contains the old sha, and then also
> > > renames the now named "ref.lock/uuid" file to "ref".
> > > On success, the ref was updated.
> > > 
> > > In the case of a delete, the actor may verify that
> > > "ref" currently contains the sha to "prune" if it
> > > needs to, and then renames the "ref" file to
> > > "ref.lock/uuid/delete". On success, the ref was
> > > deleted.
> > > 
> > > Whether successful or not, the actor may now simply
> > > delete the "ref.lock" directory, clearing the way
> > > for a new transaction.  Any other actor may delete
> > > this directory at any time also, likely either on
> > > conflict (if they are attempting to initiate a
> > > transaction), or after a grace period just to
> > > cleanup the FS.  Any

RE: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2013-01-04 Thread Pyeron, Jason J CTR (US)
> From: Martin Fick
> Sent: Thursday, January 03, 2013 6:53 PM
> 
> Any thoughts on this idea?  Is it flawed?  I am trying to
> write it up in a more formal generalized manner and was
> hoping to get at least one "it seems sane" before I do.

If you are assuming that atomic renames, etc. are available, then you should 
identify a test case and a degrade operation path when it is not available.

> 
> Thanks,
> 
> -Martin
> 
> On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> > On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> > > It concerns me that git uses any locking at all, even
> > > for refs since it has the potential to leave around
> > > stale locks.
> > > ...
> > > [a previous not so great attempt to fix this]
> > > ...
> >
> > I may have finally figured out a working loose ref update
> > mechanism which I think can avoid stale locks.
> > Unfortunately it requires atomic directory renames and
> > universally unique identifiers (uuids).  These may be
> > no-go criteria?  But I figure it is worth at least
> > exploring this idea because of the potential benefits?
> >
> > The general approach is to setup a transaction and either
> > commit or abort it.  A transaction can be setup by
> > renaming an appropriately setup directory to the
> > "ref.lock" name.  If the rename succeeds, the transaction
> > is begun.  Any actor can abort the transaction (up until
> > it is committed) by simply deleting the "ref.lock"
> > directory, so it is not at risk of going stale.  However,
> > once the actor who sets up the transaction commits it,
> > deleting the "ref.lock" directory simply aids in cleaning
> > it up for the next transaction (instead of aborting it).
> >
> > One important piece of the transaction is the use of
> > uuids. The uuids provide a mechanism to tie the atomic
> > commit pieces to the transactions and thus to prevent
> > long-sleeping processes from inadvertently performing
> > actions which could be out of date when they finally wake
> > up.  In each case, the atomic commit piece is the
> > renaming of a file.   For the create and update pieces, a
> > file is renamed from the "ref.lock" dir to the "ref" file
> > resulting in an update to the sha for the ref. However,
> > in the delete case, the "ref" file is instead renamed to
> > end up in the "ref.lock" directory resulting in a delete
> > of the ref.  This scheme does not affect the way refs are
> > read today.
> >
> > To prepare for a transaction, an actor first generates a
> > uuid (an exercise I will delay for now).  Next, a tmp
> > directory named after the uuid is generated in the parent
> > directory for the ref to be updated, perhaps something
> > like:  ".lock_uuid". In this directory is placed either a
> > file or a directory named after the uuid, something like:
> > ".lock_uuid/,uuid".  In the case of a create or an
> > update, the new sha is written to this file.  In the case
> > of a delete, it is a directory.
> >
> > Once the tmp directory is setup, the initiating actor
> > attempts to start the transaction by renaming the tmp
> > directory to "ref.lock".  If the rename fails, the update
> > fails. If the rename succeeds, the actor can then attempt
> > to commit the transaction (before another actor aborts
> > it).
> >
> > In the case of a create, the actor verifies that "ref"
> > does not currently exist, and then renames the now named
> > "ref.lock/uuid" file to "ref". On success, the ref was
> > created.
> >
> > In the case of an update, the actor verifies that "ref"
> > currently contains the old sha, and then also renames the
> > now named "ref.lock/uuid" file to "ref". On success, the
> > ref was updated.
> >
> > In the case of a delete, the actor may verify that "ref"
> > currently contains the sha to "prune" if it needs to, and
> > then renames the "ref" file to "ref.lock/uuid/delete". On
> > success, the ref was deleted.
> >
> > Whether successful or not, the actor may now simply delete
> > the "ref.lock" directory, clearing the way for a new
> > transaction.  Any other actor may delete this directory at
> > any time also, likely either on conflict (if they are
> > attempting to initiate a transaction), or after a grace
> > period just to cleanup the FS.  Any actor may also safely
> > cleanup the tmp directories, preferably also after a grace
> > period.
> >
> > One neat part about this scheme is that I believe it would
> > be backwards compatible with the current locking
> > mechanism since the transaction directory will simply
> > appear to be a lock to older clients.  And the old lock
> > file should continue to lock out these newer
> > transactions.
> >
> > Due to this backwards compatibility, I believe that this
> > could be incrementally employed today without affecting
> > very much.  It could be deployed in place of any updates
> > which only hold ref.locks to update the loose ref.  So
> > for example I think it could replace step 4a below from
> > Michael Haggerty's description of today

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2013-01-03 Thread Martin Fick
Any thoughts on this idea?  Is it flawed?  I am trying to 
write it up in a more formal generalized manner and was 
hoping to get at least one "it seems sane" before I do.

Thanks,

-Martin

On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> > It concerns me that git uses any locking at all, even
> > for refs since it has the potential to leave around
> > stale locks.
> > ...
> > [a previous not so great attempt to fix this]
> > ...
> 
> I may have finally figured out a working loose ref update
> mechanism which I think can avoid stale locks. 
> Unfortunately it requires atomic directory renames and
> universally unique identifiers (uuids).  These may be
> no-go criteria?  But I figure it is worth at least
> exploring this idea because of the potential benefits?
> 
> The general approach is to setup a transaction and either
> commit or abort it.  A transaction can be setup by
> renaming an appropriately setup directory to the
> "ref.lock" name.  If the rename succeeds, the transaction
> is begun.  Any actor can abort the transaction (up until
> it is committed) by simply deleting the "ref.lock"
> directory, so it is not at risk of going stale.  However,
> once the actor who sets up the transaction commits it,
> deleting the "ref.lock" directory simply aids in cleaning
> it up for the next transaction (instead of aborting it).
> 
> One important piece of the transaction is the use of
> uuids. The uuids provide a mechanism to tie the atomic
> commit pieces to the transactions and thus to prevent
> long-sleeping processes from inadvertently performing
> actions which could be out of date when they finally wake
> up.  In each case, the atomic commit piece is the
> renaming of a file.   For the create and update pieces, a
> file is renamed from the "ref.lock" dir to the "ref" file
> resulting in an update to the sha for the ref. However,
> in the delete case, the "ref" file is instead renamed to
> end up in the "ref.lock" directory resulting in a delete
> of the ref.  This scheme does not affect the way refs are
> read today.
> 
> To prepare for a transaction, an actor first generates a
> uuid (an exercise I will delay for now).  Next, a tmp
> directory named after the uuid is generated in the parent
> directory for the ref to be updated, perhaps something
> like:  ".lock_uuid". In this directory is placed either a
> file or a directory named after the uuid, something like:
> ".lock_uuid/,uuid".  In the case of a create or an
> update, the new sha is written to this file.  In the case
> of a delete, it is a directory.
> 
> Once the tmp directory is setup, the initiating actor
> attempts to start the transaction by renaming the tmp
> directory to "ref.lock".  If the rename fails, the update
> fails. If the rename succeeds, the actor can then attempt
> to commit the transaction (before another actor aborts
> it).
> 
> In the case of a create, the actor verifies that "ref"
> does not currently exist, and then renames the now named
> "ref.lock/uuid" file to "ref". On success, the ref was
> created.
> 
> In the case of an update, the actor verifies that "ref"
> currently contains the old sha, and then also renames the
> now named "ref.lock/uuid" file to "ref". On success, the
> ref was updated.
> 
> In the case of a delete, the actor may verify that "ref"
> currently contains the sha to "prune" if it needs to, and
> then renames the "ref" file to "ref.lock/uuid/delete". On
> success, the ref was deleted.
> 
> Whether successful or not, the actor may now simply delete
> the "ref.lock" directory, clearing the way for a new
> transaction.  Any other actor may delete this directory at
> any time also, likely either on conflict (if they are
> attempting to initiate a transaction), or after a grace
> period just to cleanup the FS.  Any actor may also safely
> cleanup the tmp directories, preferably also after a grace
> period.
> 
> One neat part about this scheme is that I believe it would
> be backwards compatible with the current locking
> mechanism since the transaction directory will simply
> appear to be a lock to older clients.  And the old lock
> file should continue to lock out these newer
> transactions.
> 
> Due to this backwards compatibility, I believe that this
> could be incrementally employed today without affecting
> very much.  It could be deployed in place of any updates
> which only hold ref.locks to update the loose ref.  So
> for example I think it could replace step 4a below from
> Michael Haggerty's description of today's loose ref
> pruning during
> 
> ref packing:
> > * Pack references:
> ...
> 
> > 4. prune_refs(): for each ref in the ref_to_prune list,
> > 
> > call  prune_ref():
> > a. Lock the reference using lock_ref_sha1(),
> > verifying that the recorded SHA1 is still valid.  If
> > it is, unlink the loose reference file then free
> > the lock; otherwise leave the loose reference file
> > untouched

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-31 Thread Martin Fick
On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> It concerns me that git uses any locking at all, even for
> refs since it has the potential to leave around stale
> locks.
> ...
> [a previous not so great attempt to fix this]
> ...

I may have finally figured out a working loose ref update 
mechanism which I think can avoid stale locks.  Unfortunately 
it requires atomic directory renames and universally unique 
identifiers (uuids).  These may be no-go criteria?  But I 
figure it is worth at least exploring this idea because of the 
potential benefits?

The general approach is to setup a transaction and either 
commit or abort it.  A transaction can be setup by renaming 
an appropriately setup directory to the "ref.lock" name.  If 
the rename succeeds, the transaction is begun.  Any actor can 
abort the transaction (up until it is committed) by simply 
deleting the "ref.lock" directory, so it is not at risk of 
going stale.  However, once the actor who sets up the 
transaction commits it, deleting the "ref.lock" directory 
simply aids in cleaning it up for the next transaction 
(instead of aborting it).

One important piece of the transaction is the use of uuids.  
The uuids provide a mechanism to tie the atomic commit pieces 
to the transactions and thus to prevent long-sleeping processes 
from inadvertently performing actions which could be out of 
date when they finally wake up.  In each case, the atomic 
commit piece is the renaming of a file.   For the create and 
update pieces, a file is renamed from the "ref.lock" dir to 
the "ref" file resulting in an update to the sha for the ref.  
However, in the delete case, the "ref" file is instead renamed 
to end up in the "ref.lock" directory resulting in a delete 
of the ref.  This scheme does not affect the way refs are read 
today.

To prepare for a transaction, an actor first generates a uuid 
(an exercise I will delay for now).  Next, a tmp directory 
named after the uuid is generated in the parent directory for 
the ref to be updated, perhaps something like:  ".lock_uuid".  
In this directory is placed either a file or a directory named 
after the uuid, something like: ".lock_uuid/,uuid".  In the 
case of a create or an update, the new sha is written to this 
file.  In the case of a delete, it is a directory.  

Once the tmp directory is setup, the initiating actor 
attempts to start the transaction by renaming the tmp 
directory to "ref.lock".  If the rename fails, the update 
fails. If the rename succeeds, the actor can then attempt to 
commit the transaction (before another actor aborts it). 
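
As a rough sketch of these prepare-and-begin steps for an update (path
construction and error handling are illustrative only; the uuid is
assumed to have been generated already):

  #include <stdio.h>
  #include <sys/stat.h>

  /*
   * Prepare a transaction for updating "ref" to new_sha1 and try to
   * begin it by renaming the tmp directory to "ref.lock".
   */
  static int begin_update(const char *ref, const char *uuid,
                          const char *new_sha1)
  {
          char tmpdir[4096], shafile[4096], lockdir[4096];
          FILE *f;

          snprintf(tmpdir, sizeof(tmpdir), "%s.lock_%s", ref, uuid);
          snprintf(shafile, sizeof(shafile), "%s/%s", tmpdir, uuid);
          snprintf(lockdir, sizeof(lockdir), "%s.lock", ref);

          if (mkdir(tmpdir, 0777) < 0)
                  return -1;
          f = fopen(shafile, "w");
          if (!f)
                  return -1;
          fprintf(f, "%s\n", new_sha1);   /* the proposed new value */
          fclose(f);

          /*
           * The atomic "begin": rename the tmp dir to "ref.lock".  If
           * another transaction already holds "ref.lock" (a non-empty
           * directory, or an old-style lock file), the rename fails
           * and so does the update.
           */
          return rename(tmpdir, lockdir);
  }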

In the case of a create, the actor verifies that "ref" does 
not currently exist, and then renames the now named 
"ref.lock/uuid" file to "ref". On success, the ref was 
created.

In the case of an update, the actor verifies that "ref" 
currently contains the old sha, and then also renames the now 
named "ref.lock/uuid" file to "ref". On success, the ref was 
updated.

In the case of a delete, the actor may verify that "ref" 
currently contains the sha to "prune" if it needs to, and 
then renames the "ref" file to "ref.lock/uuid/delete". On 
success, the ref was deleted.
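
Continuing the sketch above, each commit case is then a single rename;
the pre-checks on "ref" described in the text are omitted here for
brevity:

  #include <stdio.h>

  /* Commit a create or update: move the sha1 file out of the
   * transaction directory onto the ref itself (atomic rename). */
  static int commit_create_or_update(const char *ref, const char *uuid)
  {
          char from[4096];
          snprintf(from, sizeof(from), "%s.lock/%s", ref, uuid);
          return rename(from, ref);
  }

  /* Commit a delete: move the ref file into the transaction
   * directory, removing the ref in one atomic step. */
  static int commit_delete(const char *ref, const char *uuid)
  {
          char to[4096];
          snprintf(to, sizeof(to), "%s.lock/%s/delete", ref, uuid);
          return rename(ref, to);
  }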

Whether successful or not, the actor may now simply delete 
the "ref.lock" directory, clearing the way for a new 
transaction.  Any other actor may delete this directory at 
any time also, likely either on conflict (if they are 
attempting to initiate a transaction), or after a grace 
period just to cleanup the FS.  Any actor may also safely 
cleanup the tmp directories, preferably also after a grace 
period.

One neat part about this scheme is that I believe it would be 
backwards compatible with the current locking mechanism since 
the transaction directory will simply appear to be a lock to 
older clients.  And the old lock file should continue to lock 
out these newer transactions.

Due to this backwards compatibility, I believe that this 
could be incrementally employed today without affecting very 
much.  It could be deployed in place of any updates which 
only hold ref.locks to update the loose ref.  So for example 
I think it could replace step 4a below from Michael 
Haggerty's description of today's loose ref pruning during 
ref packing:

> * Pack references:
...
> 4. prune_refs(): for each ref in the ref_to_prune list,
> call  prune_ref():
>
> a. Lock the reference using lock_ref_sha1(), 
> verifying that the recorded SHA1 is still valid.  If it
> is, unlink the loose reference file then free the lock;
> otherwise leave the loose reference file untouched.

I think it would also therefore be able to replace the loose 
ref locking in Michael's new ref-packing scheme as well as 
the locking in Michael's new ref deletion scheme (again steps 
4):

> * Delete reference foo:
...
>   4. Delete loose ref for "foo":
> 
>  a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> 
>  b. Unlink $GIT_DIR/refs/heads/foo if it is unchanged.
>  If it is changed, leave

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-30 Thread Martin Fick
On Saturday, December 29, 2012 03:18:49 pm Martin Fick wrote:
> Jeff King  wrote:
> >On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:
> >> My idea is based on using filenames to store sha1s
> >> instead of file contents.  To do this, the sha1 of
> >> a ref would be stored in a file in a directory named
> >> after the loose ref.  I believe this would then make
> >> it possible to have lockless atomic ref updates by
> >> renaming the file.
> >> 
> >> To more fully illustrate the idea, imagine that any
> >> file (except for the null file) in the directory will
> >> represent the value of the ref with its name, then the
> >> following transitions can represent atomic state
> >> changes to a refs
> >
> >> value and existence:
> >Hmm. So basically you are relying on atomic rename() to
> >move the value around within a directory, rather than
> >using write to move it around within a file. Atomic
> >rename is usually something we have on local filesystems
> >(and I think we rely on it elsewhere). Though I would not be
> >surprised if it is not atomic on all networked filesystems
> >(though it is on NFS, at least).
> 
> Yes.  I assume this is OK because doesn't git already rely
> on atomic renames?  For example to rename the new
> packed-refs file to unlock it?
> 
> ...
> 
> >> 3) To create a ref, it must be renamed from the null
> >> file (sha ...) to the new value just as if it were
> >> being updated from any other value, but there is one
> >> extra condition: before renaming the null file, a full
> >> directory scan must be done to ensure that the null
> >> file is the only file in the directory (this condition
> >> exists because creating the directory and null file
> >> cannot be atomic unless the filesystem supports atomic
> >> directory renames, an expectation git does not
> >> currently make).  I am not sure how this compares to
> >> today's approach, but including the setup costs
> >> (described below), I suspect it is slower.
> >
> >Hmm. mkdir is atomic. So wouldn't it be sufficient to
> >just mkdir and create the correct sha1 file?
> 
> But then a process could mkdir and die leaving a stale
> empty dir with no reliable recovery mechanism.
> 
> 
> Unfortunately, I think I see another flaw though! :( I
> should have known that I cannot separate an important
> check from its state transitioning action.  The following
> could happen:
> 
>  A does mkdir
>  A creates null file
>  A checks dir -> no other files
>  B checks dir -> no other files
>  A renames null file to abcd
>  C creates second null file
>  B renames second null file to defg
> 
> One way to fix this is to rely on directory renames, but I
> believe this is something git does not want to require of
> every FS? If we did, we could Change #3 to be:
> 
> 3) To create a ref, it must be renamed from the null file
> (sha ...) to the new value just as if it were being
> updated from any other value. (No more scan)
> 
> Then, with reliable directory renames, a process could do
> what you suggested to a temporary directory, mkdir +
> create null file, then rename the temporary dir to
> refname.  This would prevent duplicate null files.  With
> a grace period, the temporary dirs could be cleaned up in
> case a process dies before the rename.  This is your
> approach with reliable recovery.

The whole null file can go away if we use directory renames.  
Make #3:

3) To create a ref, create a temporary directory containing a 
file named after the sha1 of the ref to be created and rename 
the directory to the name of the ref to create.  If the 
rename fails, the create fails.  If the rename succeeds, the 
create succeeds.

With a grace period, the temporary dirs could be cleaned up 
in case a process dies before the rename.
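
A minimal sketch of this revised create step, assuming atomic directory
renames (the tmp directory name and error handling are illustrative):

  #include <stdio.h>
  #include <sys/stat.h>

  /*
   * Create "ref" by building a temporary directory that already
   * contains an (empty) file named after the sha1 the ref should
   * point at, then renaming the whole directory to the ref name.
   * If the rename fails (e.g. the ref already exists), the create
   * fails; if it succeeds, the create succeeds.
   */
  static int create_ref(const char *tmpdir, const char *ref,
                        const char *new_sha1)
  {
          char sha1_file[4096];
          FILE *f;

          if (mkdir(tmpdir, 0777) < 0)
                  return -1;
          snprintf(sha1_file, sizeof(sha1_file), "%s/%s", tmpdir, new_sha1);
          f = fopen(sha1_file, "w");
          if (!f)
                  return -1;
          fclose(f);
          return rename(tmpdir, ref);
  }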

-Martin


Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Martin Fick
Jeff King  wrote:

>On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:
>> My idea is based on using filenames to store sha1s instead of 
>> file contents.  To do this, the sha1 of a ref would be 
>> stored in a file in a directory named after the loose ref.  I 
>> believe this would then make it possible to have lockless 
>> atomic ref updates by renaming the file.
>> 
>> To more fully illustrate the idea, imagine that any file 
>> (except for the null file) in the directory will represent the 
>> value of the ref with its name, then the following 
>> transitions can represent atomic state changes to a refs 
>> value and existence:
>
>Hmm. So basically you are relying on atomic rename() to move the value
>around within a directory, rather than using write to move it around
>within a file. Atomic rename is usually something we have on local
>filesystems (and I think we rely on it elsewhere). Though I would not be
>surprised if it is not atomic on all networked filesystems (though it is
>on NFS, at least).

Yes.  I assume this is OK because doesn't git already rely on atomic renames?  
For example to rename the new packed-refs file to unlock it?

...

>> 3) To create a ref, it must be renamed from the null file (sha 
>> ...) to the new value just as if it were being updated 
>> from any other value, but there is one extra condition: 
>> before renaming the null file, a full directory scan must be 
>> done to ensure that the null file is the only file in the 
>> directory (this condition exists because creating the 
>> directory and null file cannot be atomic unless the filesystem 
>> supports atomic directory renames, an expectation git does 
>> not currently make).  I am not sure how this compares to 
>> today's approach, but including the setup costs (described 
>> below), I suspect it is slower.
>
>Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
>create the correct sha1 file?

But then a process could mkdir and die leaving a stale empty dir with no 
reliable recovery mechanism.


Unfortunately, I think I see another flaw though! :( I should have known that I 
cannot separate an important check from its state transitioning action.  The 
following could happen:

 A does mkdir
 A creates null file
 A checks dir -> no other files 
 B checks dir -> no other files
 A renames null file to abcd
 C creates second null file 
 B renames second null file to defg

One way to fix this is to rely on directory renames, but I believe this is 
something git does not want to require of every FS? If we did, we could change 
#3 to be:

3) To create a ref, it must be renamed from the null file (sha ...) to the 
new value just as if it were being updated from any other value. (No more scan)

Then, with reliable directory renames, a process could do what you suggested to 
a temporary directory, mkdir + create null file, then rename the temporary dir 
to refname.  This would prevent duplicate null files.  With a grace period, the 
temporary dirs could be cleaned up in case a process dies before the rename.  
This is your approach with reliable recovery.


>> I don't know how this new scheme could be made to work with 
>> the current scheme, it seems like perhaps new git releases 
>> could be made to understand both the old and the new, and a 
>> config option could be used to tell it which method to write 
>> new refs with.  Since in this new scheme ref directory names 
>> would conflict with old ref filenames, this would likely 
>> prevent both schemes from erroneously being used 
>> simultaneously (so they shouldn't corrupt each other), except 
>> for the fact that refs can be nested in directories which 
>> confuses things a bit.  I am not sure what a good solution to 
>> this is?
>
>I think you would need to bump core.repositoryformatversion, and just
>never let old versions of git access the repository directly. Not the
>end of the world, but it certainly increases deployment effort. If we
>were going to do that, it would probably make sense to think about
>solving the D/F conflict issues at the same time (i.e., start calling
>"refs/heads/foo" in the filesystem "refs.d/heads.d/foo.ref" so that it
>cannot conflict with "refs.d/heads.d/foo.d/bar.ref").

Wouldn't you want to use a non-legal ref character instead of a dot? And without 
locks, we free up more of the ref namespace too, I think? (Refs could end in 
".lock".)

-Martin

Employee of Qualcomm Innovation Center, Inc., which is a member of Code Aurora 
Forum


Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Martin Fick
Jeff King  wrote:

>On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:
>
>> Hmm, actually I believe that with a small modification to the 
>> semantics described here it would be possible to make multi 
>> repo/branch commits work.   Simply allow the ref filename to 
>> be locked by a transaction by appending the transaction ID to 
>> the filename.  So if transaction 123 wants to lock master 
>> which points currently to abcde, then it will move 
>> master/abcde to master/abcde_123.  If transaction 123 is 
>> designed so that any process can commit/complete/abort it 
>> without requiring any locks which can go stale, then this ref 
>> lock will never go stale either (easy as long as it writes 
>> all its proposed updates somewhere upfront and has atomic 
>> semantics for starting, committing and aborting).  On commit, 
>> the ref lock gets updated to its new value: master/newsha and 
>> on abort it gets unlocked: master/abcde.
>
>Hmm. I thought our goal was to avoid locks? Isn't this just locking by
>another name?

It is a lock, but it is a lock with an owner: the transaction.  If the 
transaction has reliable recovery semantics, then the lock will be recoverable 
also.  This is possible if we have lock ownership (the transaction) which does 
not exist today for the ref locks.  With good lock ownership we gain the 
ability to reliably delete locks for a specific owner without the risk of 
deleting the lock when held by another owner (putting the owner in the filename 
is "good", while putting the owner in the filecontents is not).   Lastly, for 
reliable recovery of stale locks we need the ability to determine when an owner 
has abandoned a lock.  I believe that the transaction semantics laid out below 
give this.
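
To sketch what such owner-carrying locks could look like (assuming the
filenames-as-values layout from earlier in the thread; all names here
are illustrative):

  #include <stdio.h>

  /*
   * Lock the ref (a directory whose single file is named after the
   * current value) for transaction txn_id by renaming
   * <ref_dir>/<old_sha1> to <ref_dir>/<old_sha1>_<txn_id>.  The
   * rename fails if the ref no longer points at old_sha1 or is
   * already locked, and the lock always names its owner.
   */
  static int lock_ref_for_txn(const char *ref_dir, const char *old_sha1,
                              const char *txn_id)
  {
          char from[4096], to[4096];
          snprintf(from, sizeof(from), "%s/%s", ref_dir, old_sha1);
          snprintf(to, sizeof(to), "%s/%s_%s", ref_dir, old_sha1, txn_id);
          return rename(from, to);
  }

  /* On commit, the locked entry moves to the new value; on abort it
   * would move back to the bare old value, i.e. be unlocked. */
  static int commit_locked_ref(const char *ref_dir, const char *old_sha1,
                               const char *txn_id, const char *new_sha1)
  {
          char from[4096], to[4096];
          snprintf(from, sizeof(from), "%s/%s_%s", ref_dir, old_sha1, txn_id);
          snprintf(to, sizeof(to), "%s/%s", ref_dir, new_sha1);
          return rename(from, to);
  }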


>I guess your point is to have no locks in the "normal" case, and have
>locked transactions as an optional add-on?

Basically.  If we design the transaction into the git semantics we could ensure 
that it is recoverable and we should not need to expose these reflocks outside 
of the transaction APIs.

To illustrate a simple transaction approach (borrowing some of Shawn's ideas), 
we could designate a directory to hold transaction files *1.  To prepare a 
transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new 
(in a stable sorted order based on repo:ref to prevent deadlocks).  This is not 
a state change and thus this file could be deleted by any process at any time 
(preferably after a long grace period).

If file renames are atomic on the filesystem holding the transaction files then 
1, 2, 3 below will be atomic state changes.  It does not matter who performs 
state transitions 2 or 3.  It does not matter who implements the work following 
any of the 3 transitions; many processes could attempt the work in parallel (so 
could a human).
 
1) To start the transaction, rename the id.new file to id.  If the rename 
fails, start over if desired/still possible.  On success, ref locks for each 
entry should be acquired in listed order (to prevent deadlocks), using 
transaction id and oldvalue.  It is never legal to unlock a ref in this state 
(because a block could cause the unlock to be delayed until the commit phase).  
However, it is legal for any process to transition to abort at any time from 
this state, perhaps because of a failure to acquire a lock (held by another 
transaction), and definitely if a ref has changed (is no longer oldvalue).

2) To abort the transaction, rename the id file to id.abort.  This should only 
ever fail if commit was achieved first.  Once in this state, any process 
may/should unlock any ref locks belonging to this transaction id.  Once all 
refs are unlocked, id.abort may be deleted (it could be deleted earlier, but 
then cleanup will take longer).

3) To commit the transaction, rename the file to id.commit.  This should only 
ever fail if abort was achieved first. This transition should never be done 
until every listed ref is locked by the current transaction id.  Once in this 
phase, all refs may/should be moved to their new values and unlocked by any 
process. Once all refs are unlocked, id.commit may be deleted. 
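
A minimal sketch of the three rename-based state transitions (directory
layout and file naming are illustrative):

  #include <stdio.h>

  static int txn_rename(const char *dir, const char *id,
                        const char *from_suffix, const char *to_suffix)
  {
          char from[4096], to[4096];
          snprintf(from, sizeof(from), "%s/%s%s", dir, id, from_suffix);
          snprintf(to, sizeof(to), "%s/%s%s", dir, id, to_suffix);
          return rename(from, to);
  }

  /* 1) Start: id.new -> id (fails if the transaction was already started). */
  static int txn_start(const char *dir, const char *id)
  {
          return txn_rename(dir, id, ".new", "");
  }

  /* 2) Abort: id -> id.abort (fails only if the commit rename won first). */
  static int txn_abort(const char *dir, const char *id)
  {
          return txn_rename(dir, id, "", ".abort");
  }

  /* 3) Commit: id -> id.commit (only once every listed ref is locked). */
  static int txn_commit(const char *dir, const char *id)
  {
          return txn_rename(dir, id, "", ".commit");
  }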

Since any process attempting any of the work in these transactions could block 
at any time for an indefinite amount of time, these processes may wake after 
the transaction is aborted or committed and the transaction files are cleaned 
up.  I believe that in these cases the only actions which could succeed by 
these waking processes is the ref locking action.  All such abandoned ref locks 
may/should be unlocked by any process.  This last rule means that no 
transaction ids should ever be reused.

-Martin


*1 We may want to adapt the simple model illustrated above to use git 
mechanisms such as refs to hold transaction info instead of files in a 
directory, and git submodule files to hold the list of refs to update.  

Employee of Qualcomm Innovation Center, Inc., which is a member of Code Aurora 
Forum

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Jeff King
On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:

> Hmm, actually I believe that with a small modification to the 
> semantics described here it would be possible to make multi 
> repo/branch commits work.   Simply allow the ref filename to 
> be locked by a transaction by appending the transaction ID to 
> the filename.  So if transaction 123 wants to lock master 
> which points currently to abcde, then it will move 
> master/abcde to master/abcde_123.  If transaction 123 is 
> designed so that any process can commit/complete/abort it 
> without requiring any locks which can go stale, then this ref 
> lock will never go stale either (easy as long as it writes 
> all its proposed updates somewhere upfront and has atomic 
> semantics for starting, committing and aborting).  On commit, 
> the ref lock gets updated to its new value: master/newsha and 
> on abort it gets unlocked: master/abcde.

Hmm. I thought our goal was to avoid locks? Isn't this just locking by
another name?

I guess your point is to have no locks in the "normal" case, and have
locked transactions as an optional add-on?

-Peff


Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Jeff King
On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:

> For a single user repo this is not a big deal, the lock can 
> always be cleaned up manually (and it is a rare occurrence).  
> However, in a multi user server environment, possibly even 
> from multiple hosts over a shared filesystem such as NFS, 
> stale locks could lead to serious downtime and risky recovery 
> (since it is currently hard to figure out if a lock really is 
> stale).  Even though stale locks are probably rare even today 
> in the larger shared repo case, as git scales to even larger 
> shared repositories, this will eventually become more of a 
> problem *1.  Naturally, this has me thinking that git should 
> possibly consider moving towards a lockless design for refs 
> in the long term.

FWIW, I am involved in cleaning up stale locks for a very large git
hosting site. It actually happens surprisingly little. I think it is
mostly because git holds actual locks for a very short period of time
(just enough to check that the value is unchanged from when we started a
lengthy operation, and then atomically write the new value).

So I agree it would be cool (and maybe open up new realms of
scalability) for git to be lockless, but in my experience, this isn't
that pressing a problem (and any solutions are not going to be backwards
compatible, so there is going to be a high deployment cost).

> My idea is based on using filenames to store sha1s instead of 
> file contents.  To do this, the sha1 of a ref would be 
> stored in a file in a directory named after the loose ref.  I 
> believe this would then make it possible to have lockless 
> atomic ref updates by renaming the file.
> 
> To more fully illustrate the idea, imagine that any file 
> (except for the null file) in the directory will represent the 
> value of the ref with its name, then the following 
> transitions can represent atomic state changes to a refs 
> value and existence:

Hmm. So basically you are relying on atomic rename() to move the value
around within a directory, rather than using write to move it around
within a file. Atomic rename is usually something we have on local
filesystems (and I think we rely on it elsewhere). Though I would not be
surprised if it is not atomic on all networked filesystems (though it is
on NFS, at least).

> 1) To update the value from a known value to a new value 
> atomically, simply rename the file to the new value.  This 
> operation should only succeed if the file exists and is still 
> named old value before the rename.  This should even be 
> faster than today's approach, especially on remote filesystems 
> since it would require only 1 round trip in the success case 
> instead of 3!

OK. Makes sense.
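
For illustration, a minimal sketch of that update under the proposed
directory-per-ref layout; rename() fails with ENOENT if the ref no
longer carries the expected old value, so the check and the write are a
single atomic step (names are illustrative):

  #include <errno.h>
  #include <stdio.h>

  /*
   * Update a ref (stored as a directory whose single file is named
   * after the current sha1) from old_sha1 to new_sha1.  Returns 0 on
   * success, 1 if the ref no longer had the expected old value, and
   * -1 on other errors.
   */
  static int update_ref_by_rename(const char *ref_dir,
                                  const char *old_sha1,
                                  const char *new_sha1)
  {
          char from[4096], to[4096];
          snprintf(from, sizeof(from), "%s/%s", ref_dir, old_sha1);
          snprintf(to, sizeof(to), "%s/%s", ref_dir, new_sha1);
          if (rename(from, to) < 0)
                  return errno == ENOENT ? 1 : -1;
          return 0;
  }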

> 2) To delete the ref, simply delete the filename representing 
> the current value of the ref.  This ensures that you are 
> deleting the ref from a specific value.  I am not sure if git 
> needs to be able to delete refs without knowing their values?  
> If so, this would require reading the value and looping until 
> the delete succeeds, this may be a bit slow for a constantly 
> updated ref, but likely a rare situation (and not likely 
> worse than trying to acquire the ref-lock today).  Overall, 
> this again would likely be faster than today's approach.

We do sometimes delete without knowing the value. In most cases we would
not want to do this, but for some "force"-type commands, we do. You
would actually have the same problem with updating above, as we
sometimes update with the intent to overwrite whatever is there.

> 3) To create a ref, it must be renamed from the null file (sha 
> ...) to the new value just as if it were being updated 
> from any other value, but there is one extra condition: 
> before renaming the null file, a full directory scan must be 
> done to ensure that the null file is the only file in the 
> directory (this condition exists because creating the 
> directory and null file cannot be atomic unless the filesystem 
> supports atomic directory renames, an expectation git does 
> not currently make).  I am not sure how this compares to 
> today's approach, but including the setup costs (described 
> below), I suspect it is slower.

Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
create the correct sha1 file?  A simultaneous creator would fail on the
mkdir and abort. A simultaneous reader might see the directory, but it
would either see it as empty, or with the correct file. In the former
case, it would treat that the same as if the directory did not exist.
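
A minimal sketch of that suggested creation path (illustrative only):

  #include <stdio.h>
  #include <sys/stat.h>

  /*
   * Create a ref by making its directory and dropping in the (empty)
   * file named after the initial sha1.  A simultaneous creator fails
   * on the mkdir (EEXIST) and aborts; a simultaneous reader sees
   * either an empty directory (treated as "no such ref") or the
   * sha1 file.
   */
  static int create_ref_with_mkdir(const char *ref_dir, const char *sha1)
  {
          char sha1_file[4096];
          FILE *f;

          if (mkdir(ref_dir, 0777) < 0)
                  return -1;
          snprintf(sha1_file, sizeof(sha1_file), "%s/%s", ref_dir, sha1);
          f = fopen(sha1_file, "w");
          if (!f)
                  return -1;
          fclose(f);
          return 0;
  }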

Speaking of which, you did not cover reading at all, but it would have
to be:

  dh = opendir(ref);
  if (!dh) {
          if (errno == ENOENT)
                  return 0; /* no such ref */
          else
                  return error("couldn't read ref");
  }

  while ((ent = readdir(dh))) {
          if (ent->d_name[0] == '.')
                  /*
                   * skip "." and "..", and leave room for annot

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-28 Thread Martin Fick
On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> On Wednesday, December 26, 2012 01:24:39 am Michael Haggerty wrote:
> > ... lots of discussion about ref locking...
> 
> It concerns me that git uses any locking at all, even for
> refs since it has the potential to leave around stale
> locks.
> 
> For a single user repo this is not a big deal, the lock
> can always be cleaned up manually (and it is a rare
> occurrence). However, in a multi user server environment,
> possibly even from multiple hosts over a shared
> filesystem such as NFS, stale locks could lead to serious
> downtime and risky recovery (since it is currently hard
> to figure out if a lock really is stale).  Even though
> stale locks are probably rare even today in the larger
> shared repo case, as git scales to even larger shared
> repositories, this will eventually become more of a
> problem *1.  Naturally, this has me thinking that git
> should possibly consider moving towards a lockless design
> for refs in the long term.
> 
> I realize this is hard and that git needs to support many
> different filesystems with different semantics.  I had an
> idea I think may be close to a functional lockless design
> for loose refs (one piece at a time) that I thought I
> should propose, just to get the ball rolling, even if it
> is just going to be found to be flawed (I realize that
> history suggests that such schemes usually are).  I hope
> that it does not make use of any semantics which are not
> currently expected from git of filesystems.  I think it
> relies only on the ability to rename a file atomically,
> and the ability to scan the contents of a directory
> reliably to detect the "ordered" existence of files.
> 
> My idea is based on using filenames to store sha1s instead
> of file contents.  To do this, the sha1 of a ref
> would be stored in a file in a directory named after the
> loose ref.  I believe this would then make it possible to
> have lockless atomic ref updates by renaming the file.
> 
> To more fully illustrate the idea, imagine that any file
> (except for the null file) in the directory will represent
> the value of the ref with its name, then the following
> transitions can represent atomic state changes to a refs
> value and existence:
> 
> 1) To update the value from a known value to a new value
> atomically, simply rename the file to the new value.  This
> operation should only succeed if the file exists and is
> still named old value before the rename.  This should
> even be faster than today's approach, especially on
> remote filesystems since it would require only 1 round
> trip in the success case instead of 3!
> 
> 2) To delete the ref, simply delete the filename
> representing the current value of the ref.  This ensures
> that you are deleting the ref from a specific value.  I
> am not sure if git needs to be able to delete refs
> without knowing their values? If so, this would require
> reading the value and looping until the delete succeeds,
> this may be a bit slow for a constantly updated ref, but
> likely a rare situation (and not likely worse than trying
> to acquire the ref-lock today).  Overall, this again
> would likely be faster than today's approach.
> 
> 3) To create a ref, it must be renamed from the null file
> (sha ...) to the new value just as if it were being
> updated from any other value, but there is one extra
> condition: before renaming the null file, a full
> directory scan must be done to ensure that the null file
> is the only file in the directory (this condition exists
> because creating the directory and null file cannot be
> atomic unless the filesystem supports atomic directory
> renames, an expectation git does not currently make).  I
> am not sure how this compares to today's approach, but
> including the setup costs (described below), I suspect it
> is slower.
> 
> While this outlines the state changes, some additional
> operations may be needed to setup some starting conditions
> and to clean things up. But these operations could/should
> be performed by any process/thread and would not cause
> any state changes to the ref existence or value.  For
> example, when creating a ref, the ref directory would
> need to be created and the null file needs to be created.
>  Whenever a null file is detected in the directory at the
> same time as another file, it should be deleted.  
> Whenever the directory is empty, it may be deleted
> (perhaps after a grace period to reduce retries during
> ref creation unless the process just deleted the ref).
> 
> I don't know how this new scheme could be made to work
> with the current scheme, it seems like perhaps new git
> releases could be made to understand both the old and the
> new, and a config option could be used to tell it which
> method to write new refs with.  Since in this new scheme
> ref directory names would conflict with old ref
> filenames, this would likely prevent both schemes from
> erroneously being used
>