Re: [PATCH] git: add --no-optional-locks option
On Wed, Sep 27, 2017 at 07:20:05PM +0530, Kaartic Sivaraam wrote: > On Wed, 2017-09-27 at 02:40 -0400, Jeff King wrote: > > On Sun, Sep 24, 2017 at 01:08:41PM +0530, Kaartic Sivaraam wrote: > > > > > > > > So, if I get that correctly "git status --no-optional-locks" is a way to > > > get > > > the "current" status without updating the on disk index file? > > > > It's actually "git --no-optional-locks status", since it's a git-wide > > option (it's just that "status" is the only one who respects it for > > now). > > Thanks for correcting the command. Now let me ask my (corrected) > question again! So, if I get that correctly "git --no-optional- > locks status" is a way to get the "current" status *without updating* > the on disk index file? Yes. -Peff
Re: [PATCH] git: add --no-optional-locks option
On Wed, 2017-09-27 at 02:40 -0400, Jeff King wrote: > On Sun, Sep 24, 2017 at 01:08:41PM +0530, Kaartic Sivaraam wrote: > > > > > So, if I get that correctly "git status --no-optional-locks" is a way to get > > the "current" status without updating the on disk index file? > > It's actually "git --no-optional-locks status", since it's a git-wide > option (it's just that "status" is the only one who respects it for > now). Thanks for correcting the command. Now let me ask my (corrected) question again! So, if I get that correctly "git --no-optional- locks status" is a way to get the "current" status *without updating* the on disk index file? --- Kaartic
Re: [PATCH] git: add --no-optional-locks option
On Mon, Sep 25, 2017 at 11:51:31AM -0700, Stefan Beller wrote: > > diff --git a/read-cache.c b/read-cache.c > > index 65f4fe8375..fc1ba122a3 100644 > > --- a/read-cache.c > > +++ b/read-cache.c > > @@ -1563,7 +1563,8 @@ static int read_index_extension(struct index_state > > *istate, > > > > int hold_locked_index(struct lock_file *lk, int lock_flags) > > { > > - return hold_lock_file_for_update(lk, get_index_file(), lock_flags); > > + return hold_lock_file_for_update_timeout(lk, get_index_file(), > > +lock_flags, 500); > > } > > > > int read_index(struct index_state *istate) > > > > though I think there are a few sites which manually call > > hold_lock_file_for_update() on the index that would need similar > > treatment. > > uh, too bad. The patch above looks really promising, though. :) There are probably only a handful of other callers, and they'd just need to swap out s/update/&_timeout/. So it really is pretty trivial. > > I suspect it would work OK in practice, unless your index is so big that > > 500ms isn't enough. The user may also see minor stalls when there's lock > > contention. I'm not sure how annoying that would be. > > There is only one way to find out. Though we don't want to volunteer > all users into this experiment, I'd presume. Yes. One of the nice things about the optional-locks approach is that it only affects callers who specify the option. And the general idea has gotten a year of testing in Visual Studio, which makes me feel good about it. > Regarding larger indexes, I wonder if we can adapt the 500ms > to the repo size. At first I thought the abbreviation length could be > a good proxy to determine the maximum waiting time, but now I am > not so sure any more. I think madness that way lies. -Peff
Re: [PATCH] git: add --no-optional-locks option
On Sun, Sep 24, 2017 at 01:08:41PM +0530, Kaartic Sivaraam wrote: > On Thursday 21 September 2017 10:02 AM, Jeff King wrote: > > Some tools like IDEs or fancy editors may periodically run > > commands like "git status" in the background to keep track > > of the state of the repository. > > I might be missing something, shouldn't the IDEs be encouraged to use > libgit2 instead? I thought it was meant for these use cases. GitHub Desktop, at least, has actually moved _away_ from libgit2. I think there were a number of reasons. Some match what Dscho said regarding Visual Studio. But I think a main one is just that libgit2 doesn't quite match regular git for feature parity or pace of development. So you're stuck shelling out to regular Git half the time anyway, and now you get to handle dependencies on _two_ systems. :) > Note: I assume getting the status through libgit2 doesn't create an > index.lock file. I don't know if that's the case or not. > > This patch implements the option 3. > > So, if I get that correctly "git status --no-optional-locks" is a way to get > the "current" status without updating the on disk index file? It's actually "git --no-optional-locks status", since it's a git-wide option (it's just that "status" is the only one who respects it for now). > > +`GIT_OPTIONAL_LOCKS`:: > > + If set to `0`, Git will avoid performing any operations which > > + require taking a lock and which are not required to complete the > > + requested operation. > > The above sentence seems to be a little confusing. How about, > >If set to `0`, Git will complete the requested operation without >performing the optional sub-operations that require taking a lock. Yeah, my original is pretty clunky. I ended up with: If set to `0`, Git will complete any requested operation without performing any optional sub-operations that require taking a lock. which swaps out "the" for "any". Thanks. -Peff
Re: [PATCH] git: add --no-optional-locks option
On 9/25/2017 12:17 PM, Johannes Schindelin wrote: Hi Kaartic, On Sun, 24 Sep 2017, Kaartic Sivaraam wrote: On Thursday 21 September 2017 10:02 AM, Jeff King wrote: Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the state of the repository. I might be missing something, shouldn't the IDEs be encouraged to use libgit2 instead? I thought it was meant for these use cases. There are pros and cons. Visual Studio moved away from libgit2 e.g. to support SSH (back then, libgit2 did not support spawning processes, I have no idea whether that changed in the meantime). There were other issues besides feature parity. The big one for VS was that it moved the git computations into a separate process and address space. You can't easily read/modify/write a 500MB .git/index file into memory (with however many copies of the data that that creates) in a 32-bit process.
Re: [PATCH] git: add --no-optional-locks option
>> There might be another option to cope with the situation: >> >> 4. Teach all commands to spinlock / busywait shortly for important >> locks instead of giving up. In that case git-status rewriting >> the index ought to be fine? > > We do have all the infrastructure in place to do a reasonable busywait > with backoff. I think the patch is roughly just: > > diff --git a/read-cache.c b/read-cache.c > index 65f4fe8375..fc1ba122a3 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -1563,7 +1563,8 @@ static int read_index_extension(struct index_state > *istate, > > int hold_locked_index(struct lock_file *lk, int lock_flags) > { > - return hold_lock_file_for_update(lk, get_index_file(), lock_flags); > + return hold_lock_file_for_update_timeout(lk, get_index_file(), > +lock_flags, 500); > } > > int read_index(struct index_state *istate) > > though I think there are a few sites which manually call > hold_lock_file_for_update() on the index that would need similar > treatment. uh, too bad. The patch above looks really promising, though. :) > > I suspect it would work OK in practice, unless your index is so big that > 500ms isn't enough. The user may also see minor stalls when there's lock > contention. I'm not sure how annoying that would be. There is only one way to find out. Though we don't want to volunteer all users into this experiment, I'd presume. Regarding larger indexes, I wonder if we can adapt the 500ms to the repo size. At first I thought the abbreviation length could be a good proxy to determine the maximum waiting time, but now I am not so sure any more. Stefan
Re: [PATCH] git: add --no-optional-locks option
On Mon, Sep 25, 2017 at 06:10:12PM +0200, Johannes Schindelin wrote: > On Thu, 21 Sep 2017, Jeff King wrote: > > > Johannes, this is an adaptation of your 67e5ce7f63 (status: offer *not* > > to lock the index and update it, 2016-08-12). Folks working on GitHub > > Desktop complained to me that it's only available on Windows. :) > > Okay, so this is trying to help me by upstreaming a patch from Git for > Windows? > > If so: thanks! Yes, in a round-about way. :) > The changes, in particular the different semantics, will guarantee that I > have to work on the consumer's side here, though, leaving me even less > time for the Git mailing list, so I will need a lot more help with > upstreaming patches. I think you can get away with doing nothing for the time being. The two patches can co-exist in the GfW codebase[1], and people using the existing option can switch over at their leisure. Eventually you may decide to revert 67e5ce7f63. -Peff [1] Modulo some trivial conflict resolution when the two are merged.
Re: [PATCH] git: add --no-optional-locks option
Hi Kaartic, On Sun, 24 Sep 2017, Kaartic Sivaraam wrote: > On Thursday 21 September 2017 10:02 AM, Jeff King wrote: > > Some tools like IDEs or fancy editors may periodically run commands > > like "git status" in the background to keep track of the state of the > > repository. > > I might be missing something, shouldn't the IDEs be encouraged to use > libgit2 instead? I thought it was meant for these use cases. There are pros and cons. Visual Studio moved away from libgit2 e.g. to support SSH (back then, libgit2 did not support spawning processes, I have no idea whether that changed in the meantime). Ciao, Johannes
Re: [PATCH] git: add --no-optional-locks option
Hi Peff, On Thu, 21 Sep 2017, Jeff King wrote: > Johannes, this is an adaptation of your 67e5ce7f63 (status: offer *not* > to lock the index and update it, 2016-08-12). Folks working on GitHub > Desktop complained to me that it's only available on Windows. :) Okay, so this is trying to help me by upstreaming a patch from Git for Windows? If so: thanks! The changes, in particular the different semantics, will guarantee that I have to work on the consumer's side here, though, leaving me even less time for the Git mailing list, so I will need a lot more help with upstreaming patches. Ciao, Dscho
Re: [PATCH] git: add --no-optional-locks option
On Thursday 21 September 2017 10:02 AM, Jeff King wrote: > Some tools like IDEs or fancy editors may periodically run > commands like "git status" in the background to keep track > of the state of the repository. I might be missing something, shouldn't the IDEs be encouraged to use libgit2 instead? I thought it was meant for these use cases. Note: I assume getting the status through libgit2 doesn't create an index.lock file. >3. Ask "status" not to lock or write the index. > > This is easy to implement. The big downside is that any > work done in refreshing the index for such a call is > lost when the process exits. So a background process > may end up re-hashing a changed file multiple times > until the user runs a command that does an index > refresh themselves. > > This patch implements the option 3. So, if I get that correctly "git status --no-optional-locks" is a way to get the "current" status without updating the on disk index file? > +`GIT_OPTIONAL_LOCKS`:: > + If set to `0`, Git will avoid performing any operations which > + require taking a lock and which are not required to complete the > + requested operation. The above sentence seems to be a little hard to interpret. How about, If set to `0`, Git will complete the requested operation without performing the optional sub-operations that require taking a lock. --- Kaartic
Re: [PATCH] git: add --no-optional-locks option
On Fri, Sep 22, 2017 at 02:41:06PM -0700, Stefan Beller wrote: > > Ah, thanks. I _thought_ we could already do that but when I went looking > > for the standard --recursive option I couldn't find it. > > Thanks for checking for submodules there. > > I personally prefer --recurse-submodules despite the longer name, > because a plain recursive option can mean anything in a sufficiently > complex program such as Git (recurse into the tree (c.f. ls-tree), or > for the algorithm used (c.f. merge, diff) or yet another dimension > I did not think of). Yeah, I agree that's a better name (mentioning --recursive is probably just showing my general ignorance of submodules). > > So yes, I would think we would want this option to apply recursively in > > that case, even when we cross repository boundaries. > > Regarding the actual patch which is heavily aimed at coping with IDEs > despite the command line being used, I wonder how many IDEs pass > --ignore-submodules and recurse themselves (if needed). Reason for > my suspicion is [1] which does pay attention to submodules: > > >Our application calls status including the following flags: > >--porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none > > [1] > https://public-inbox.org/git/2bbb1d0f-ae06-1878-d185-112bd51f7...@gmail.com/ Yeah, I think that's probably Visual Studio (which is what Johannes presumably wrote the original patch for). GitHub Desktop does not currently pass it, though perhaps should consider doing so (though of course the user could tweak their config, too). > There might be another option to cope with the situation: > > 4. Teach all commands to spinlock / busywait shortly for important > locks instead of giving up. In that case git-status rewriting > the index ought to be fine? We do have all the infrastructure in place to do a reasonable busywait with backoff. I think the patch is roughly just: diff --git a/read-cache.c b/read-cache.c index 65f4fe8375..fc1ba122a3 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1563,7 +1563,8 @@ static int read_index_extension(struct index_state *istate, int hold_locked_index(struct lock_file *lk, int lock_flags) { - return hold_lock_file_for_update(lk, get_index_file(), lock_flags); + return hold_lock_file_for_update_timeout(lk, get_index_file(), +lock_flags, 500); } int read_index(struct index_state *istate) though I think there are a few sites which manually call hold_lock_file_for_update() on the index that would need similar treatment. I suspect it would work OK in practice, unless your index is so big that 500ms isn't enough. The user may also see minor stalls when there's lock contention. I'm not sure how annoying that would be. -Peff
Re: [PATCH] git: add --no-optional-locks option
On Fri, Sep 22, 2017 at 2:25 PM, Jeff King wrote: > On Fri, Sep 22, 2017 at 01:09:32PM -0700, Stefan Beller wrote: > >> On Thu, Sep 21, 2017 at 9:25 PM, Jeff King wrote: >> >> > >> > But imagine that "git status" learns to recurse into submodules and run >> > "git status" inside them. Surely we would want the submodule repos to >> > also avoid taking any unnecessary locks? >> >> You can teach Git to recurse into submodules already at home, >> just 'git config status.submoduleSummary none'. ;) >> >> It occurs to me that the config name is badly choosen, as it stores >> an argument for git status --ignore-submodules[=mode] > > Ah, thanks. I _thought_ we could already do that but when I went looking > for the standard --recursive option I couldn't find it. Thanks for checking for submodules there. I personally prefer --recurse-submodules despite the longer name, because a plain recursive option can mean anything in a sufficiently complex program such as Git (recurse into the tree (c.f. ls-tree), or for the algorithm used (c.f. merge, diff) or yet another dimension I did not think of). > So yes, I would think we would want this option to apply recursively in > that case, even when we cross repository boundaries. Regarding the actual patch which is heavily aimed at coping with IDEs despite the command line being used, I wonder how many IDEs pass --ignore-submodules and recurse themselves (if needed). Reason for my suspicion is [1] which does pay attention to submodules: >Our application calls status including the following flags: >--porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none [1] https://public-inbox.org/git/2bbb1d0f-ae06-1878-d185-112bd51f7...@gmail.com/ There might be another option to cope with the situation: 4. Teach all commands to spinlock / busywait shortly for important locks instead of giving up. In that case git-status rewriting the index ought to be fine? Thanks, Stefan
Re: [PATCH] git: add --no-optional-locks option
On Fri, Sep 22, 2017 at 01:09:32PM -0700, Stefan Beller wrote: > On Thu, Sep 21, 2017 at 9:25 PM, Jeff King wrote: > > > > > But imagine that "git status" learns to recurse into submodules and run > > "git status" inside them. Surely we would want the submodule repos to > > also avoid taking any unnecessary locks? > > You can teach Git to recurse into submodules already at home, > just 'git config status.submoduleSummary none'. ;) > > It occurs to me that the config name is badly choosen, as it stores > an argument for git status --ignore-submodules[=mode] Ah, thanks. I _thought_ we could already do that but when I went looking for the standard --recursive option I couldn't find it. So yes, I would think we would want this option to apply recursively in that case, even when we cross repository boundaries. -Peff
Re: [PATCH] git: add --no-optional-locks option
On Thu, Sep 21, 2017 at 9:25 PM, Jeff King wrote: > > But imagine that "git status" learns to recurse into submodules and run > "git status" inside them. Surely we would want the submodule repos to > also avoid taking any unnecessary locks? You can teach Git to recurse into submodules already at home, just 'git config status.submoduleSummary none'. ;) It occurs to me that the config name is badly choosen, as it stores an argument for git status --ignore-submodules[=mode]
Re: [PATCH] git: add --no-optional-locks option
On Fri, Sep 22, 2017 at 01:42:10AM -0500, Daniel Santos wrote: > > But taking the index lock may conflict with other operations > > in the repository. Especially ones that the user is doing > > themselves, which _aren't_ opportunistic. In other words, > > "git status" knows how to back off when somebody else is > > holding the lock, but other commands don't know that status > > would be happy to drop the lock if somebody else wanted it. > > Interestingly, this usually slaps me when performing an _interactive_ > rebase. It occurred to me that if I'm performing an interaction > operation, it doesn't seem unreasonable for git wait up to 125ms or so > for the lock and then prompting the user to ask if they want to continue > waiting for the lock. Yes, lock timeouts would help with this situation, though not eliminate it entirely (and we've been adding some lock timeouts in a few places for related situations). We generally avoid prompting, especially when a command can just be reissued. It sounds like "git rebase" gets into a funny state if it cannot grab the index lock. That's something that should be fixed. Even if it bails, you should be able to move forward with "rebase --continue" or similar. > That is not necessarily the case. I don't actually know git on the > inside, but I would ask you to consider a read-write lock and a hybrid > of one and three. > > I don't know what dotlocks are, but I'm certain that you can implement a > rw lock using lock files and no other IPC, although it does increase the > complexity. The way this works is that `git status' acquires a read > lock and does its thing. If it has real changes, instead of discarding > them it attempts to upgrade to a write lock. If that fails, you throw > it away, otherwise you write them and release. By dotlocks, I just mean our strategy of creating O_EXCL "index.lock" files. You're right that it can be done using two such locks, and communicating via the lockfile contents. So my "impossible" was overstating it. I do stand by my contention that it's much more complex than the existing scheme. :) More importantly, though, it changes the locking contract completely between old and new versions (and between other implementations). There's probably only a small minority of users who might use two implementations to simultaneously access the same repository, but it is a use case we try to support. I think we'd have to require a repository-version bump to tell programs to use the new scheme. -Peff
Re: [PATCH] git: add --no-optional-locks option
On Fri, Sep 22, 2017 at 07:22:28AM -0400, Jeff Hostetler wrote: > > > I don't think we should pass this environment variable to remote > > > repositories. It should be listed in local_repo_env[] in environment.c. > > > > I'm not sure I agree. This is really about the context in which the > > command is executing, not anything about the particular repository > > you're operating on. > > > > For fetch/push operations that touch a remote, I doubt it would matter > > either way (and anyway, those often cross network boundaries that don't > > propagate environment variables anyway). > > > > But imagine that "git status" learns to recurse into submodules and run > > "git status" inside them. Surely we would want the submodule repos to > > also avoid taking any unnecessary locks? > > > > -Peff > > > > https://github.com/git-for-windows/git/commit/ff63b51c22389139a864eb2e565c6cdc5a30f061 > > https://github.com/git-for-windows/git/pull/1004/commits/45bad66192352481acbc826f11d90c8928b39a7a > > We should compare this with what we did in Git for Windows last fall. > I guess those commits didn't get pushed upstream. Right. I think you missed the initial message in the thread that explains how this is an expanded version of ff63b51c22. :) I didn't know about the environment thing in 45bad66192, though[1]. That makes me even more confident that this is the right approach. -Peff [1] Sorry for not doing my homework more carefully on the existing solution. GitHub Desktop ran into the same situation and pointed me at ff63b51c22. I extrapolated the rest of it on my own. ;)
Re: [PATCH] git: add --no-optional-locks option
On 9/22/2017 12:25 AM, Jeff King wrote: On Thu, Sep 21, 2017 at 08:25:50PM +0200, Johannes Sixt wrote: +`GIT_OPTIONAL_LOCKS`:: + If set to `0`, Git will avoid performing any operations which + require taking a lock and which are not required to complete the + requested operation. For example, this will prevent `git status` + from refreshing the index as a side effect. This is useful for + processes running in the background which do not want to cause + lock contention with other operations on the repository. + Defaults to `1`. >> I don't think we should pass this environment variable to remote repositories. It should be listed in local_repo_env[] in environment.c. I'm not sure I agree. This is really about the context in which the command is executing, not anything about the particular repository you're operating on. For fetch/push operations that touch a remote, I doubt it would matter either way (and anyway, those often cross network boundaries that don't propagate environment variables anyway). But imagine that "git status" learns to recurse into submodules and run "git status" inside them. Surely we would want the submodule repos to also avoid taking any unnecessary locks? -Peff https://github.com/git-for-windows/git/commit/ff63b51c22389139a864eb2e565c6cdc5a30f061 https://github.com/git-for-windows/git/pull/1004/commits/45bad66192352481acbc826f11d90c8928b39a7a We should compare this with what we did in Git for Windows last fall. I guess those commits didn't get pushed upstream. We added '--no-lock-index' to keep status from locking the index during status and effectively being read-only. This helped with problems with Visual Studio similar to the ones being described for KDevelop. Jeff
Re: [PATCH] git: add --no-optional-locks option
On 09/20/2017 11:32 PM, Jeff King wrote: > Johannes, this is an adaptation of your 67e5ce7f63 (status: offer *not* > to lock the index and update it, 2016-08-12). Folks working on GitHub > Desktop complained to me that it's only available on Windows. :) > > I expanded the scope a bit to let us give the same treatment to more > commands in the long run. I'd also be OK with just cherry-picking your > patch to non-Windows Git if you don't find my reasoning below > compelling. But I think we need _something_ like this, as the other > solutions I could come up with don't seem very promising. > > -Peff > > -- >8 -- > Some tools like IDEs or fancy editors may periodically run > commands like "git status" in the background to keep track > of the state of the repository. Some of these commands may > refresh the index and write out the result in an > opportunistic way: if they can get the index lock, then they > update the on-disk index with any updates they find. And if > not, then their in-core refresh is lost and just has to be > recomputed by the next caller. > > But taking the index lock may conflict with other operations > in the repository. Especially ones that the user is doing > themselves, which _aren't_ opportunistic. In other words, > "git status" knows how to back off when somebody else is > holding the lock, but other commands don't know that status > would be happy to drop the lock if somebody else wanted it. Interestingly, this usually slaps me when performing an _interactive_ rebase. It occurred to me that if I'm performing an interaction operation, it doesn't seem unreasonable for git wait up to 125ms or so for the lock and then prompting the user to ask if they want to continue waiting for the lock. > There are a couple possible solutions: > > 1. Have some kind of "pseudo-lock" that allows other > commands to tell status that they want the lock. > > This is likely to be complicated and error-prone to > implement (and maybe even impossible with just > dotlocks to work from, as it requires some > inter-process communication). > > 2. Avoid background runs of commands like "git status" > that want to do opportunistic updates, preferring > instead plumbing like diff-files, etc. > > This is awkward for a couple of reasons. One is that > "status --porcelain" reports a lot more about the > repository state than is available from individual > plumbing commands. And two is that we actually _do_ > want to see the refreshed index. We just don't want to > take a lock or write out the result. Whereas commands > like diff-files expect us to refresh the index > separately and write it to disk so that they can depend > on the result. But that write is exactly what we're > trying to avoid. > > 3. Ask "status" not to lock or write the index. > > This is easy to implement. The big downside is that any > work done in refreshing the index for such a call is > lost when the process exits. So a background process > may end up re-hashing a changed file multiple times > until the user runs a command that does an index > refresh themselves. That is not necessarily the case. I don't actually know git on the inside, but I would ask you to consider a read-write lock and a hybrid of one and three. I don't know what dotlocks are, but I'm certain that you can implement a rw lock using lock files and no other IPC, although it does increase the complexity. The way this works is that `git status' acquires a read lock and does its thing. If it has real changes, instead of discarding them it attempts to upgrade to a write lock. If that fails, you throw it away, otherwise you write them and release. In order to implement rw locks with only lock files, "off the cuff" I say you have a single "lock list" file that should never be deleted and a "lock lock" file that is held in order to read or modify the list. The format of the lock list would have a pair of 32-bit wrapping modification counts (or versions) at the top -- one for modifications to the lock list its self and another for modifications to the underlying data (i.e., the number of times a write lock has been acquired). This header is followed by entries something like this: 'r' if waiting for a read lock 'R' if actively reading 'w' if waiting for write lock 'W' if actively writing The pid If active, the version of the data at the time lock acquired or zero. Time began waiting or time lock acquired An operation of 'r' or 'w' means that you are waiting and upper case means that you are active. is the version of the data at the time the lock became active and writers increment it when they acquire a lock. You wait with file alteration notification on the lock list (if there is any doubt based upon timestamp precision then you can examine the lock list version). When you want to read o
Re: [PATCH] git: add --no-optional-locks option
On Thu, Sep 21, 2017 at 08:25:50PM +0200, Johannes Sixt wrote: > > +`GIT_OPTIONAL_LOCKS`:: > > + If set to `0`, Git will avoid performing any operations which > > + require taking a lock and which are not required to complete the > > + requested operation. For example, this will prevent `git status` > > + from refreshing the index as a side effect. This is useful for > > + processes running in the background which do not want to cause > > + lock contention with other operations on the repository. > > + Defaults to `1`. > > I don't think we should pass this environment variable to remote > repositories. It should be listed in local_repo_env[] in environment.c. I'm not sure I agree. This is really about the context in which the command is executing, not anything about the particular repository you're operating on. For fetch/push operations that touch a remote, I doubt it would matter either way (and anyway, those often cross network boundaries that don't propagate environment variables anyway). But imagine that "git status" learns to recurse into submodules and run "git status" inside them. Surely we would want the submodule repos to also avoid taking any unnecessary locks? -Peff
Re: [PATCH] git: add --no-optional-locks option
Am 21.09.2017 um 06:32 schrieb Jeff King: diff --git a/Documentation/git.txt b/Documentation/git.txt index 6e3a6767e5..8dd3ae05ae 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -159,6 +159,10 @@ foo.bar= ...`) sets `foo.bar` to the empty string which ` git config Add "icase" magic to all pathspec. This is equivalent to setting the `GIT_ICASE_PATHSPECS` environment variable to `1`. +--no-optional-locks:: + Do not perform optional operations that require locks. This is + equivalent to setting the `GIT_OPTIONAL_LOCKS` to `0`. + GIT COMMANDS @@ -697,6 +701,15 @@ of clones and fetches. which feed potentially-untrusted URLS to git commands. See linkgit:git-config[1] for more details. +`GIT_OPTIONAL_LOCKS`:: + If set to `0`, Git will avoid performing any operations which + require taking a lock and which are not required to complete the + requested operation. For example, this will prevent `git status` + from refreshing the index as a side effect. This is useful for + processes running in the background which do not want to cause + lock contention with other operations on the repository. + Defaults to `1`. I don't think we should pass this environment variable to remote repositories. It should be listed in local_repo_env[] in environment.c. -- Hannes
Re: [PATCH] git: add --no-optional-locks option
Jeff King writes: > I admit that's just adding more hand-waving to the pile. But I don't > think it really _hurts_ to leave that door open (aside from making the > documentation a bit wishy-washy). And it helps because callers would > pick up the new features automatically, without having to learn about a > new option. Oh, I do agree it is a good idea to leave that door open, so that scripts that rely on today's --no-optional-locks option will not have to be updated when a similar feature (like the "ref compaction" example you mentioned) against which "let's disable when we are not the primary process, in order to keep the interference to the minimum" would be something we would want to say. The option being added here should cover these future needs. > And I think that's really what this option is. It is less about the > caller asking for some specific behavior, and more about them telling > Git about the context in which it's executing so it can make intelligent > decisions. Yes, indeed.
Re: [PATCH] git: add --no-optional-locks option
On Thu, Sep 21, 2017 at 01:55:58PM +0900, Junio C Hamano wrote: > The phrase 'optional lock' does not answer this question clearly, > though: does it make sense to extend the coverage of this option in > the future to things more than the "opportunistic update to the > index file"? > > If the answer is no, then having 'index' instead of 'lock' in the > name of the option would make more sense (and 'opportunistic' over > 'optional', too), because what the change is about is to allow other > processes that are directly interacting with the user to update the > index, and 'lock' being hindrance is merely an implementation > detail. The comment on the "test" in the log message mentions as if > it were a short-coming that it does not check the lock but checks > if the index is written, but I think that is testing what matters > and preferable than testing "did we lock and then unlock it?" > > On the other hand, if the answer is yes, then I am curious what > other things this may extend to, and if these other things are also > opportunistic optimizations. I left it intentionally vague exactly because I thought we might want to leave room for the answer to change to "yes" eventually. For instance, imagine that we had a ref storage format that required periodic compaction, and readers might sometimes choose to compact in order to save future readers from repeating some work they've done. If that compaction means holding a lock even for a brief period, I think it would fall under this option. I admit that's just adding more hand-waving to the pile. But I don't think it really _hurts_ to leave that door open (aside from making the documentation a bit wishy-washy). And it helps because callers would pick up the new features automatically, without having to learn about a new option. And I think that's really what this option is. It is less about the caller asking for some specific behavior, and more about them telling Git about the context in which it's executing so it can make intelligent decisions. And in that sense, something descriptive like --background-process perhaps would be a better name. Except that I couldn't come up with a name that isn't confusing (certainly --background-process implies to me that Git would itself run in the background, which makes no sense here). I also considered something like "--read-only" to tell Git that we should avoid writing to the repository. But that's really not what this does. It just avoids writes that may cause contention, not all writes. I also considered using the word "opportunistic" in the option name, but decided it was too long and hard to spell. So there. I am open to a better name, but I could not come up with one. > Thanks (and sorry for not being Johannes ;-). You lack his rugged good looks, but your review was still welcome. -Peff
Re: [PATCH] git: add --no-optional-locks option
Jeff King writes: > Johannes, this is an adaptation of your 67e5ce7f63 (status: offer *not* > to lock the index and update it, 2016-08-12). Folks working on GitHub > Desktop complained to me that it's only available on Windows. :) > > I expanded the scope a bit to let us give the same treatment to more > commands in the long run. I'd also be OK with just cherry-picking your > patch to non-Windows Git if you don't find my reasoning below > compelling. But I think we need _something_ like this, as the other > solutions I could come up with don't seem very promising. The phrase 'optional lock' does not answer this question clearly, though: does it make sense to extend the coverage of this option in the future to things more than the "opportunistic update to the index file"? If the answer is no, then having 'index' instead of 'lock' in the name of the option would make more sense (and 'opportunistic' over 'optional', too), because what the change is about is to allow other processes that are directly interacting with the user to update the index, and 'lock' being hindrance is merely an implementation detail. The comment on the "test" in the log message mentions as if it were a short-coming that it does not check the lock but checks if the index is written, but I think that is testing what matters and preferable than testing "did we lock and then unlock it?" On the other hand, if the answer is yes, then I am curious what other things this may extend to, and if these other things are also opportunistic optimizations. Other than that, I think this change (including the part that this is done globally and down to subprocesses as needed) makes sense. Thanks (and sorry for not being Johannes ;-).
[PATCH] git: add --no-optional-locks option
Johannes, this is an adaptation of your 67e5ce7f63 (status: offer *not* to lock the index and update it, 2016-08-12). Folks working on GitHub Desktop complained to me that it's only available on Windows. :) I expanded the scope a bit to let us give the same treatment to more commands in the long run. I'd also be OK with just cherry-picking your patch to non-Windows Git if you don't find my reasoning below compelling. But I think we need _something_ like this, as the other solutions I could come up with don't seem very promising. -Peff -- >8 -- Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the state of the repository. Some of these commands may refresh the index and write out the result in an opportunistic way: if they can get the index lock, then they update the on-disk index with any updates they find. And if not, then their in-core refresh is lost and just has to be recomputed by the next caller. But taking the index lock may conflict with other operations in the repository. Especially ones that the user is doing themselves, which _aren't_ opportunistic. In other words, "git status" knows how to back off when somebody else is holding the lock, but other commands don't know that status would be happy to drop the lock if somebody else wanted it. There are a couple possible solutions: 1. Have some kind of "pseudo-lock" that allows other commands to tell status that they want the lock. This is likely to be complicated and error-prone to implement (and maybe even impossible with just dotlocks to work from, as it requires some inter-process communication). 2. Avoid background runs of commands like "git status" that want to do opportunistic updates, preferring instead plumbing like diff-files, etc. This is awkward for a couple of reasons. One is that "status --porcelain" reports a lot more about the repository state than is available from individual plumbing commands. And two is that we actually _do_ want to see the refreshed index. We just don't want to take a lock or write out the result. Whereas commands like diff-files expect us to refresh the index separately and write it to disk so that they can depend on the result. But that write is exactly what we're trying to avoid. 3. Ask "status" not to lock or write the index. This is easy to implement. The big downside is that any work done in refreshing the index for such a call is lost when the process exits. So a background process may end up re-hashing a changed file multiple times until the user runs a command that does an index refresh themselves. This patch implements the option 3. The idea (and the test) is largely stolen from a Git for Windows patch by Johannes Schindelin, 67e5ce7f63 (status: offer *not* to lock the index and update it, 2016-08-12). The twist here is that instead of making this an option to "git status", it becomes a "git" option and matching environment variable. The reason there is two-fold: 1. An environment variable is carried through to sub-processes. And whether an invocation is a background process or not should apply to the whole process tree. So you could do "git --no-optional-locks foo", and if "foo" is a script or alias that calls "status", you'll still get the effect. 2. There may be other programs that want the same treatment. I've punted here on finding more callers to convert, since "status" is the obvious one to call as a repeated background job. But "git diff"'s opportunistic refresh of the index may be a good candidate. The test is taken from 67e5ce7f63, and it's worth repeating Johannes's explanation: Note that the regression test added in this commit does not *really* verify that no index.lock file was written; that test is not possible in a portable way. Instead, we verify that .git/index is rewritten *only* when `git status` is run without `--no-optional-locks`. Signed-off-by: Jeff King --- Documentation/git.txt | 13 + builtin/commit.c | 5 - cache.h | 6 ++ environment.c | 5 + git.c | 4 t/t7508-status.sh | 10 ++ 6 files changed, 42 insertions(+), 1 deletion(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 6e3a6767e5..8dd3ae05ae 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -159,6 +159,10 @@ foo.bar= ...`) sets `foo.bar` to the empty string which ` git config Add "icase" magic to all pathspec. This is equivalent to setting the `GIT_ICASE_PATHSPECS` environment variable to `1`. +--no-optional-locks:: + Do not perform optional operations that require locks. This is + equivalent to setting the `GIT_OPTIONAL_LOCKS` to `0`. + GIT COMMANDS @@ -697,6 +701,15 @@ of clones