Re: [PATCH v6 5/6] grep: enable recurse-submodules to work on objects
Am 01.12.2016 um 02:28 schrieb Brandon Williams: + git init "su:b" && Don't do that. Colons in file names won't work on Windows. -- Hannes
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Thu, Dec 01, 2016 at 08:09:16AM +0100, Johannes Sixt wrote: > Am 01.12.2016 um 00:40 schrieb Jeff King: > > 20813 access("su:b/../.git/modules/su:b/refs", X_OK) = 0 > > Side note: where does this weirdness "su:b" come from? (I haven't looked at > any of the patches, yet, let alone tested.) Colons in file or directory > names won't work on Windows (in the way one would expect). It's explicitly used in the test, I assume to check that the recursive grep is not confused into treating the name as a tree-ish. I think it would have to either be marked with a prereq on Windows, or modified to do the whole thing in-index (and use "grep --cached"). -Peff
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
Am 01.12.2016 um 00:40 schrieb Jeff King: 20813 access("su:b/../.git/modules/su:b/refs", X_OK) = 0 Side note: where does this weirdness "su:b" come from? (I haven't looked at any of the patches, yet, let alone tested.) Colons in file or directory names won't work on Windows (in the way one would expect). -- Hannes
Re: 'git repack' and repack.writeBitmaps=true with kept packs
On Wed, Nov 30, 2016 at 04:15:33PM -0800, Steven Noonan wrote: > It seems like it's behaving as though I've provided > --pack-kept-objects. In order to ensure the .bitmap is created, it > repacks everything, including everything in existing .pack files (not > respecting .keep). But then it's not deleting the old .pack file > (oddly, respecting .keep). Right, that's exactly what's happening. The bitmaps require a completely reachable set inside the pack, so if you omit some objects that are in .keep packs, we cannot generate the bitmap. So we have to either disable bitmaps, or pack the kept objects. By default, we do the latter (and I'll explain why in a minute). We can't delete the .keep packfiles because we don't know for sure that we've included all of their contents in the new pack (not to mention that somebody asked to keep them, and we don't know why; we should respect that). > What I'd expect it to do here is ignore the 'repack.writeBitmaps = > true' value if there's a .keep that needs to be respected. Is this not > a correct assumption? In practice, I think that ends up worse. The .keep files are used by receive-pack as lockfiles for incoming pushes. So imagine you kick off a full repack just as somebody is pushing, and repack sees the temporary .keep file. Your options are: 1. Disable bitmaps, leaving the repository with no bitmaps at all until the next repack (because you're deleting the old bitmaps along with the old, non-kept pack). 2. Duplicate the newly pushed objects in the pack (if they're even reachable; you're also racing to see the ref updates). Now you have bitmaps, but you're wasting a bit of space to store the racy push twice (and it goes away next time you repack). If you're running a Git server which depends on bitmaps for good performance, then (2) is much better. And that's the default. If you want to override it, you can pass --no-pack-kept-objects, or set repack.packKeptObjects to false. I think the documentation for --pack-kept-objects could be a bit more clear for this case. It doesn't mention the default value, nor that what you really want with "-b" is probably "--no-pack-kept-objects". -Peff
Re: [PATCH] xdiff: Do not enable XDL_FAST_HASH by default
On Wed, Nov 30, 2016 at 11:26:43PM -0500, Anders Kaseorg wrote: > > So I suspect a better strategy in general is to just override the > > uname_* variables when cross-compiling. > > The specific case of an i386 userspace on an x86_64 kernel is important > independently of the general cross compilation problem (in fact, the words > “cross compilation” may not even really apply here). And I don’t think > one should have to manually tweak the build for this setup, especially > since the compiler already has the needed information. Ah, I mistook that you were really cross-compiling x86-64 from i386, in which case you'd generally have to set CC, etc for the cross-compile chain. I agree this is a much more subtle case, and it's nice for it to work out of the box. > diff --git a/Makefile b/Makefile > index f53fcc90d..c237d4f91 100644 > --- a/Makefile > +++ b/Makefile > @@ -341,7 +341,6 @@ all:: > # Define XDL_FAST_HASH to use an alternative line-hashing method in > # the diff algorithm. It gives a nice speedup if your processor has > # fast unaligned word loads. Does NOT work on big-endian systems! > -# Enabled by default on x86_64. This is a nice incremental step in the sense that people can still enable it if they want to in order to time it, play with it, etc. But given what we know, I wonder if the help text here should warn people. Or I guess we could move straight to dropping it entirely. Here's what that patch might look like (I retimed it just be sure, and was sad to see that it really _is_ making some cases faster. But I still think slower-but-predictable is a better default). I didn't drop uname_M here. If we go this route, I think it would make sense to do that in a separate patch on top, with your commit message explaining why it is a bad idea versus using compiler-defined macros. -- >8 -- Subject: [PATCH] xdiff: drop XDL_FAST_HASH The xdiff code hashes every line of both sides of a diff, and then compares those hashes to find duplicates. The overall performance depends both on how fast we can compute the hashes, but also on how many hash collisions we see. The idea of XDL_FAST_HASH is to speed up the hash computation. But the generated hashes have worse collision behavior. This means that in some cases it speeds diffs up (running "git log -p" on git.git improves by ~8% with it), but in others it can slow things down. One pathological case saw over a 100x slowdown[1]. There may be a better hash function that covers both properties, but in the meantime we are better off with the original hash. It's slightly slower in the common case, but it has fewer surprising pathological cases. [1] http://public-inbox.org/git/20141222041944.ga...@peff.net/ Signed-off-by: Jeff King--- Makefile | 9 - config.mak.uname | 3 -- xdiff/xutils.c | 106 --- 3 files changed, 118 deletions(-) diff --git a/Makefile b/Makefile index f53fcc90d..f61076997 100644 --- a/Makefile +++ b/Makefile @@ -338,11 +338,6 @@ all:: # # Define NATIVE_CRLF if your platform uses CRLF for line endings. # -# Define XDL_FAST_HASH to use an alternative line-hashing method in -# the diff algorithm. It gives a nice speedup if your processor has -# fast unaligned word loads. Does NOT work on big-endian systems! -# Enabled by default on x86_64. -# # Define GIT_USER_AGENT if you want to change how git identifies itself during # network interactions. The default is "git/$(GIT_VERSION)". # @@ -1485,10 +1480,6 @@ ifndef NO_MSGFMT_EXTENDED_OPTIONS MSGFMT += --check --statistics endif -ifneq (,$(XDL_FAST_HASH)) - BASIC_CFLAGS += -DXDL_FAST_HASH -endif - ifdef GMTIME_UNRELIABLE_ERRORS COMPAT_OBJS += compat/gmtime.o BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS diff --git a/config.mak.uname b/config.mak.uname index b232908f8..447f36ac2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -17,9 +17,6 @@ endif # because maintaining the nesting to match is a pain. If # we had "elif" things would have been much nicer... -ifeq ($(uname_M),x86_64) - XDL_FAST_HASH = YesPlease -endif ifeq ($(uname_S),OSF1) # Need this for u_short definitions et al BASIC_CFLAGS += -D_OSF_SOURCE diff --git a/xdiff/xutils.c b/xdiff/xutils.c index 027192a1c..04d7b32e4 100644 --- a/xdiff/xutils.c +++ b/xdiff/xutils.c @@ -264,110 +264,6 @@ static unsigned long xdl_hash_record_with_whitespace(char const **data, return ha; } -#ifdef XDL_FAST_HASH - -#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x)) - -#define ONEBYTES REPEAT_BYTE(0x01) -#define NEWLINEBYTES REPEAT_BYTE(0x0a) -#define HIGHBITS REPEAT_BYTE(0x80) - -/* Return the high bit set in the first byte that is a zero */ -static inline unsigned long has_zero(unsigned long a) -{ - return ((a - ONEBYTES) & ~a) & HIGHBITS; -} - -static inline long count_masked_bytes(unsigned long mask) -{ - if (sizeof(long) == 8) { -
[PATCH] xdiff: Do not enable XDL_FAST_HASH by default
Although XDL_FAST_HASH computes hashes slightly faster on some architectures, its collision characteristics are much worse, resulting in some pathological diffs running over 100x slower (http://public-inbox.org/git/20141222041944.ga...@peff.net/). Furthermore, it was being enabled when ‘uname -m’ returns x86_64, even if we are cross-compiling for a different architecture. This mistake was also causing the Debian build reproducibility test to fail (https://tests.reproducible-builds.org/debian/index_variations.html). Future architecture-specific definitions should be based on compiler macros such as __x86_64__ rather than uname. Signed-off-by: Anders Kaseorg--- [Oops, also resending for Thomas’s new email address. Sorry for the spam.] On Wed, 30 Nov 2016, Jeff King wrote: > However, I think this might be the tip of the iceberg. There are lots of > Makefile knobs whose defaults are tweaked based on uname output. This > one caught you because you are cross-compiling across architectures, but > in theory you could cross-compile for FreeBSD from Linux, or whatever. > > So I suspect a better strategy in general is to just override the > uname_* variables when cross-compiling. The specific case of an i386 userspace on an x86_64 kernel is important independently of the general cross compilation problem (in fact, the words “cross compilation” may not even really apply here). And I don’t think one should have to manually tweak the build for this setup, especially since the compiler already has the needed information. > All that being said, I actually think an easier fix for this particular > case might be to drop XDL_FAST_HASH entirely. Works for me. Anders Makefile | 1 - config.mak.uname | 5 - 2 files changed, 6 deletions(-) diff --git a/Makefile b/Makefile index f53fcc90d..c237d4f91 100644 --- a/Makefile +++ b/Makefile @@ -341,7 +341,6 @@ all:: # Define XDL_FAST_HASH to use an alternative line-hashing method in # the diff algorithm. It gives a nice speedup if your processor has # fast unaligned word loads. Does NOT work on big-endian systems! -# Enabled by default on x86_64. # # Define GIT_USER_AGENT if you want to change how git identifies itself during # network interactions. The default is "git/$(GIT_VERSION)". diff --git a/config.mak.uname b/config.mak.uname index b232908f8..2831a68c3 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -1,10 +1,8 @@ # Platform specific Makefile tweaks based on uname detection uname_S := $(shell sh -c 'uname -s 2>/dev/null || echo not') -uname_M := $(shell sh -c 'uname -m 2>/dev/null || echo not') uname_O := $(shell sh -c 'uname -o 2>/dev/null || echo not') uname_R := $(shell sh -c 'uname -r 2>/dev/null || echo not') -uname_P := $(shell sh -c 'uname -p 2>/dev/null || echo not') uname_V := $(shell sh -c 'uname -v 2>/dev/null || echo not') ifdef MSVC @@ -17,9 +15,6 @@ endif # because maintaining the nesting to match is a pain. If # we had "elif" things would have been much nicer... -ifeq ($(uname_M),x86_64) - XDL_FAST_HASH = YesPlease -endif ifeq ($(uname_S),OSF1) # Need this for u_short definitions et al BASIC_CFLAGS += -D_OSF_SOURCE -- 2.11.0
Re: [PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
On Wed, Nov 30, 2016 at 05:28:29PM -0800, Brandon Williams wrote: > +/* > + * Determine if a submodule has been populated at a given 'path' > + */ > +int is_submodule_populated(const char *path) > +{ > + int ret = 0; > + struct stat st; > + char *gitdir = xstrfmt("%s/.git", path); > + > + if (!stat(gitdir, )) > + ret = 1; > + > + free(gitdir); > + return ret; > +} I don't know if it's worth changing or not, but this could be a bit shorter: int is_submodule_populated(const char *path) { return !access(mkpath("%s/.git", path), F_OK); } There is a file_exists() helper, but it uses lstat(), which I think you don't want (because you'd prefer to bail on a broken .git symlink). But access(F_OK) does what you want, I think. mkpath() is generally an unsafe function because it uses a static buffer, but it's handy and safe for handing values to syscalls like this. I say "I don't know if it's worth it" because what you've written is fine, and while more lines, it's fairly obvious and safe. -Peff
[PATCH] xdiff: Do not enable XDL_FAST_HASH by default
Although XDL_FAST_HASH computes hashes slightly faster on some architectures, its collision characteristics are much worse, resulting in some pathological diffs running over 100x slower (http://public-inbox.org/git/20141222041944.ga...@peff.net/). Furthermore, it was being enabled when ‘uname -m’ returns x86_64, even if we are cross-compiling for a different architecture. This mistake was also causing the Debian build reproducibility test to fail (https://tests.reproducible-builds.org/debian/index_variations.html). Future architecture-specific definitions should be based on compiler macros such as __x86_64__ rather than uname. Signed-off-by: Anders Kaseorg--- On Wed, 30 Nov 2016, Jeff King wrote: > However, I think this might be the tip of the iceberg. There are lots of > Makefile knobs whose defaults are tweaked based on uname output. This > one caught you because you are cross-compiling across architectures, but > in theory you could cross-compile for FreeBSD from Linux, or whatever. > > So I suspect a better strategy in general is to just override the > uname_* variables when cross-compiling. The specific case of an i386 userspace on an x86_64 kernel is important independently of the general cross compilation problem (in fact, the words “cross compilation” may not even really apply here). And I don’t think one should have to manually tweak the build for this setup, especially since the compiler already has the needed information. > All that being said, I actually think an easier fix for this particular > case might be to drop XDL_FAST_HASH entirely. Works for me. Anders Makefile | 1 - config.mak.uname | 5 - 2 files changed, 6 deletions(-) diff --git a/Makefile b/Makefile index f53fcc90d..c237d4f91 100644 --- a/Makefile +++ b/Makefile @@ -341,7 +341,6 @@ all:: # Define XDL_FAST_HASH to use an alternative line-hashing method in # the diff algorithm. It gives a nice speedup if your processor has # fast unaligned word loads. Does NOT work on big-endian systems! -# Enabled by default on x86_64. # # Define GIT_USER_AGENT if you want to change how git identifies itself during # network interactions. The default is "git/$(GIT_VERSION)". diff --git a/config.mak.uname b/config.mak.uname index b232908f8..2831a68c3 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -1,10 +1,8 @@ # Platform specific Makefile tweaks based on uname detection uname_S := $(shell sh -c 'uname -s 2>/dev/null || echo not') -uname_M := $(shell sh -c 'uname -m 2>/dev/null || echo not') uname_O := $(shell sh -c 'uname -o 2>/dev/null || echo not') uname_R := $(shell sh -c 'uname -r 2>/dev/null || echo not') -uname_P := $(shell sh -c 'uname -p 2>/dev/null || echo not') uname_V := $(shell sh -c 'uname -v 2>/dev/null || echo not') ifdef MSVC @@ -17,9 +15,6 @@ endif # because maintaining the nesting to match is a pain. If # we had "elif" things would have been much nicer... -ifeq ($(uname_M),x86_64) - XDL_FAST_HASH = YesPlease -endif ifeq ($(uname_S),OSF1) # Need this for u_short definitions et al BASIC_CFLAGS += -D_OSF_SOURCE -- 2.11.0
Re: [PATCH v6 0/6] recursively grep across submodules
On Wed, Nov 30, 2016 at 05:28:28PM -0800, Brandon Williams wrote: > v6 fixes a race condition which existed in the 'is_submodule_populated' > function. Instead of calling 'resolve_gitdir' to check for the existance of a > .git file/directory, use 'stat'. 'resolve_gitdir' calls 'chdir' which can > affect other running threads trying to load thier files into a buffer in > memory. This one passes my stress-test for t7814 (though I imagine you already knew that). I tried to think of things that could go wrong by using a simple stat() instead of resolve_gitdir(). They should only differ when ".git" for some reason does not point to a git repository. My initial thought is that this might be more vocal about errors, because the child process will complain. But actually, the original would already die if the ".git" file is funny, so we were pretty vocal already. I also wondered whether the sub-process might skip a bogus ".git" file and keep looking upward in the filesystem tree (which would confusingly end up back in the super-project!). But it looks like we bail hard when we see a ".git" file but it's bogus. Which is probably a good thing in general for submodules. I'm not sure any of that is actually even worth worrying about, as such a setup is broken by definition. I just wanted to think it through as a devil's advocate, and even that seems pretty reasonable. -Peff
Re: [PATCH] difftool.c: mark a file-local symbol with static
On Thu, Dec 01, 2016 at 01:18:35AM +, Ramsay Jones wrote: > >> I forgot, we ended up reversing course later and silencing them: > >> > >> http://public-inbox.org/git/20140505052117.gc6...@sigill.intra.peff.net/ > >> > >> By the rationale of that conversation, we should be doing: > >> > >> warning("%s", ""); > >> > >> here. > > > > I forgot too. Thanks for digging up that thread. > > Yes, I blamed wt-status.c:227 and came up with commit 7d7d68022 > as well. > > So, by the same rationale, we should remove -Wno-format-zero-length > from DEVELOPER_CFLAGS. yes? I don't have a preference on which direction we go, but yes, right now we are in an awkward middle ground. We should do one of: 1. Drop -Wno-format-zero-length from DEVELOPER_CFLAGS and make sure future patches to do not violate it. 2. Declare warning("") as OK. I still think the warning is silly, but (1) has value in that it produces the least surprise and annoyance to various people building Git. -Peff
Re: [PATCH] Define XDL_FAST_HASH when building *for* (not *on*) x86_64
[resend; the original had an outdated address for Thomas, and I would definitely like his blessing before removing XDL_FAST_HASH]. On Wed, Nov 30, 2016 at 10:04:07PM -0500, Anders Kaseorg wrote: > Previously, XDL_FAST_HASH was defined when ‘uname -m’ returns x86_64, > even if we are cross-compiling for a different architecture. Check > the __x86_64__ compiler macro instead. > > In addition to fixing the cross compilation bug, this is needed to > pass the Debian build reproducibility test > (https://tests.reproducible-builds.org/debian/index_variations.html). I don't think this is a good approach to fix it. Right now XDL_FAST_HASH is a Makefile knob that can be turned by the user, and can be used either to explicitly enable or explicitly disable the feature. With your patch, building with "make XDL_FAST_HASH=Yes" will still explicitly enable it, but "make XDL_FAST_HASH=" would no longer disable it (because even if unset, the compiler would turn it on when it sees __x86_64__). And being able to turn it off is important; more on that in a second. So I think if we wanted to auto-detect based on __x86_64__, we'd probably need to be able to set it to "auto" or something, and then #if defined(XDL_FAST_HASH_AUTO) && __x86_64__ #define XDL_FAST_HASH #endif or something. However, I think this might be the tip of the iceberg. There are lots of Makefile knobs whose defaults are tweaked based on uname output. This one caught you because you are cross-compiling across architectures, but in theory you could cross-compile for FreeBSD from Linux, or whatever. So I suspect a better strategy in general is to just override the uname_* variables when cross-compiling. All that being said, I actually think an easier fix for this particular case might be to drop XDL_FAST_HASH entirely. It computes the hashes slightly faster, but its collision characteristics are much worse. About 2 years ago I ran across a pathological diff that ran over 100x slower with XDL_FAST_HASH: http://public-inbox.org/git/20141222041944.ga...@peff.net/ The discussion veered into whether we should have a randomized hash secured against DoS attacks. I played around with some alternatives, but never found anything quite as fast for the "normal" case. And having disabled XDL_FAST_HASH on GitHub's servers, it wasn't a big priority for me. I'd be happy if somebody wanted to investigate other hash functions further. But barring that, I think we should drop XDL_FAST_HASH (or at the very least stop turning it on by default) in the meantime. It's just not a good tradeoff. -Peff
Re: [PATCH] Define XDL_FAST_HASH when building *for* (not *on*) x86_64
On Wed, Nov 30, 2016 at 10:04:07PM -0500, Anders Kaseorg wrote: > Previously, XDL_FAST_HASH was defined when ‘uname -m’ returns x86_64, > even if we are cross-compiling for a different architecture. Check > the __x86_64__ compiler macro instead. > > In addition to fixing the cross compilation bug, this is needed to > pass the Debian build reproducibility test > (https://tests.reproducible-builds.org/debian/index_variations.html). I don't think this is a good approach to fix it. Right now XDL_FAST_HASH is a Makefile knob that can be turned by the user, and can be used either to explicitly enable or explicitly disable the feature. With your patch, building with "make XDL_FAST_HASH=Yes" will still explicitly enable it, but "make XDL_FAST_HASH=" would no longer disable it (because even if unset, the compiler would turn it on when it sees __x86_64__). And being able to turn it off is important; more on that in a second. So I think if we wanted to auto-detect based on __x86_64__, we'd probably need to be able to set it to "auto" or something, and then #if defined(XDL_FAST_HASH_AUTO) && __x86_64__ #define XDL_FAST_HASH #endif or something. However, I think this might be the tip of the iceberg. There are lots of Makefile knobs whose defaults are tweaked based on uname output. This one caught you because you are cross-compiling across architectures, but in theory you could cross-compile for FreeBSD from Linux, or whatever. So I suspect a better strategy in general is to just override the uname_* variables when cross-compiling. All that being said, I actually think an easier fix for this particular case might be to drop XDL_FAST_HASH entirely. It computes the hashes slightly faster, but its collision characteristics are much worse. About 2 years ago I ran across a pathological diff that ran over 100x slower with XDL_FAST_HASH: http://public-inbox.org/git/20141222041944.ga...@peff.net/ The discussion veered into whether we should have a randomized hash secured against DoS attacks. I played around with some alternatives, but never found anything quite as fast for the "normal" case. And having disabled XDL_FAST_HASH on GitHub's servers, it wasn't a big priority for me. I'd be happy if somebody wanted to investigate other hash functions further. But barring that, I think we should drop XDL_FAST_HASH (or at the very least stop turning it on by default) in the meantime. It's just not a good tradeoff. -Peff
[PATCH] Define XDL_FAST_HASH when building *for* (not *on*) x86_64
Previously, XDL_FAST_HASH was defined when ‘uname -m’ returns x86_64, even if we are cross-compiling for a different architecture. Check the __x86_64__ compiler macro instead. In addition to fixing the cross compilation bug, this is needed to pass the Debian build reproducibility test (https://tests.reproducible-builds.org/debian/index_variations.html). Signed-off-by: Anders Kaseorg--- config.mak.uname | 5 - xdiff/xutils.c | 4 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index b232908f8..2831a68c3 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -1,10 +1,8 @@ # Platform specific Makefile tweaks based on uname detection uname_S := $(shell sh -c 'uname -s 2>/dev/null || echo not') -uname_M := $(shell sh -c 'uname -m 2>/dev/null || echo not') uname_O := $(shell sh -c 'uname -o 2>/dev/null || echo not') uname_R := $(shell sh -c 'uname -r 2>/dev/null || echo not') -uname_P := $(shell sh -c 'uname -p 2>/dev/null || echo not') uname_V := $(shell sh -c 'uname -v 2>/dev/null || echo not') ifdef MSVC @@ -17,9 +15,6 @@ endif # because maintaining the nesting to match is a pain. If # we had "elif" things would have been much nicer... -ifeq ($(uname_M),x86_64) - XDL_FAST_HASH = YesPlease -endif ifeq ($(uname_S),OSF1) # Need this for u_short definitions et al BASIC_CFLAGS += -D_OSF_SOURCE diff --git a/xdiff/xutils.c b/xdiff/xutils.c index 027192a1c..f46351fe4 100644 --- a/xdiff/xutils.c +++ b/xdiff/xutils.c @@ -264,6 +264,10 @@ static unsigned long xdl_hash_record_with_whitespace(char const **data, return ha; } +#ifdef __x86_64__ +#define XDL_FAST_HASH +#endif + #ifdef XDL_FAST_HASH #define REPEAT_BYTE(x) ((~0ul / 0xff) * (x)) -- 2.11.0
[PATCH v6 4/6] grep: optionally recurse into submodules
Allow grep to recognize submodules and recursively search for patterns in each submodule. This is done by forking off a process to recursively call grep on each submodule. The top level --super-prefix option is used to pass a path to the submodule which can in turn be used to prepend to output or in pathspec matching logic. Recursion only occurs for submodules which have been initialized and checked out by the parent project. If a submodule hasn't been initialized and checked out it is simply skipped. In order to support the existing multi-threading infrastructure in grep, output from each child process is captured in a strbuf so that it can be later printed to the console in an ordered fashion. To limit the number of theads that are created, each child process has half the number of threads as its parents (minimum of 1), otherwise we potentailly have a fork-bomb. Signed-off-by: Brandon Williams--- Documentation/git-grep.txt | 5 + builtin/grep.c | 300 ++--- git.c | 2 +- t/t7814-grep-recurse-submodules.sh | 99 4 files changed, 386 insertions(+), 20 deletions(-) create mode 100755 t/t7814-grep-recurse-submodules.sh diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 0ecea6e..17aa1ba 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -26,6 +26,7 @@ SYNOPSIS [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] + [--recurse-submodules] [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | ...] [--] [...] @@ -88,6 +89,10 @@ OPTIONS mechanism. Only useful when searching files in the current directory with `--no-index`. +--recurse-submodules:: + Recursively search in each submodule that has been initialized and + checked out in the repository. + -a:: --text:: Process binary files as if they were text. diff --git a/builtin/grep.c b/builtin/grep.c index 8887b6a..dca0be6 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -18,12 +18,20 @@ #include "quote.h" #include "dir.h" #include "pathspec.h" +#include "submodule.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), NULL }; +static const char *super_prefix; +static int recurse_submodules; +static struct argv_array submodule_options = ARGV_ARRAY_INIT; + +static int grep_submodule_launch(struct grep_opt *opt, +const struct grep_source *gs); + #define GREP_NUM_THREADS_DEFAULT 8 static int num_threads; @@ -174,7 +182,10 @@ static void *run(void *arg) break; opt->output_priv = w; - hit |= grep_source(opt, >source); + if (w->source.type == GREP_SOURCE_SUBMODULE) + hit |= grep_submodule_launch(opt, >source); + else + hit |= grep_source(opt, >source); grep_source_clear_data(>source); work_done(w); } @@ -300,6 +311,10 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, if (opt->relative && opt->prefix_length) { quote_path_relative(filename + tree_name_len, opt->prefix, ); strbuf_insert(, 0, filename, tree_name_len); + } else if (super_prefix) { + strbuf_add(, filename, tree_name_len); + strbuf_addstr(, super_prefix); + strbuf_addstr(, filename + tree_name_len); } else { strbuf_addstr(, filename); } @@ -328,10 +343,13 @@ static int grep_file(struct grep_opt *opt, const char *filename) { struct strbuf buf = STRBUF_INIT; - if (opt->relative && opt->prefix_length) + if (opt->relative && opt->prefix_length) { quote_path_relative(filename, opt->prefix, ); - else + } else { + if (super_prefix) + strbuf_addstr(, super_prefix); strbuf_addstr(, filename); + } #ifndef NO_PTHREADS if (num_threads) { @@ -378,31 +396,260 @@ static void run_pager(struct grep_opt *opt, const char *prefix) exit(status); } -static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int cached) +static void compile_submodule_options(const struct grep_opt *opt, + const struct pathspec *pathspec, + int cached, int untracked, + int opt_exclude, int use_index, + int pattern_type_arg) +{ + struct grep_pat *pattern; + int i; + + if (recurse_submodules) + argv_array_push(_options, "--recurse-submodules"); + + if (cached) + argv_array_push(_options,
[PATCH v6 6/6] grep: search history of moved submodules
If a submodule was renamed at any point since it's inception then if you were to try and grep on a commit prior to the submodule being moved, you wouldn't be able to find a working directory for the submodule since the path in the past is different from the current path. This patch teaches grep to find the .git directory for a submodule in the parents .git/modules/ directory in the event the path to the submodule in the commit that is being searched differs from the state of the currently checked out commit. If found, the child process that is spawned to grep the submodule will chdir into its gitdir instead of a working directory. In order to override the explicit setting of submodule child process's gitdir environment variable (which was introduced in '10f5c526') `GIT_DIR_ENVIORMENT` needs to be pushed onto child process's env_array. This allows the searching of history from a submodule's gitdir, rather than from a working directory. Signed-off-by: Brandon Williams--- builtin/grep.c | 20 +-- t/t7814-grep-recurse-submodules.sh | 41 ++ 2 files changed, 59 insertions(+), 2 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 5918a26..2c727ef 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -547,6 +547,7 @@ static int grep_submodule_launch(struct grep_opt *opt, name = gs->name; prepare_submodule_repo_env(_array); + argv_array_push(_array, GIT_DIR_ENVIRONMENT); /* Add super prefix */ argv_array_pushf(, "--super-prefix=%s%s/", @@ -615,8 +616,23 @@ static int grep_submodule(struct grep_opt *opt, const unsigned char *sha1, { if (!is_submodule_initialized(path)) return 0; - if (!is_submodule_populated(path)) - return 0; + if (!is_submodule_populated(path)) { + /* +* If searching history, check for the presense of the +* submodule's gitdir before skipping the submodule. +*/ + if (sha1) { + const struct submodule *sub = + submodule_from_path(null_sha1, path); + if (sub) + path = git_path("modules/%s", sub->name); + + if (!(is_directory(path) && is_git_directory(path))) + return 0; + } else { + return 0; + } + } #ifndef NO_PTHREADS if (num_threads) { diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index 9e93fe7..0507771 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -186,6 +186,47 @@ test_expect_success 'grep recurse submodule colon in name' ' test_cmp expect actual ' +test_expect_success 'grep history with moved submoules' ' + git init parent && + test_when_finished "rm -rf parent" && + echo "foobar" >parent/file && + git -C parent add file && + git -C parent commit -m "add file" && + + git init sub && + test_when_finished "rm -rf sub" && + echo "foobar" >sub/file && + git -C sub add file && + git -C sub commit -m "add file" && + + git -C parent submodule add ../sub dir/sub && + git -C parent commit -m "add submodule" && + + cat >expect <<-\EOF && + dir/sub/file:foobar + file:foobar + EOF + git -C parent grep -e "foobar" --recurse-submodules >actual && + test_cmp expect actual && + + git -C parent mv dir/sub sub-moved && + git -C parent commit -m "moved submodule" && + + cat >expect <<-\EOF && + file:foobar + sub-moved/file:foobar + EOF + git -C parent grep -e "foobar" --recurse-submodules >actual && + test_cmp expect actual && + + cat >expect <<-\EOF && + HEAD^:dir/sub/file:foobar + HEAD^:file:foobar + EOF + git -C parent grep -e "foobar" --recurse-submodules HEAD^ >actual && + test_cmp expect actual +' + test_incompatible_with_recurse_submodules () { test_expect_success "--recurse-submodules and $1 are incompatible" " -- 2.8.0.rc3.226.g39d4020
[PATCH v6 3/6] grep: add submodules as a grep source type
Add `GREP_SOURCE_SUBMODULE` as a grep_source type and cases for this new type in the various switch statements in grep.c. When initializing a grep_source with type `GREP_SOURCE_SUBMODULE` the identifier can either be NULL (to indicate that the working tree will be used) or a SHA1 (the REV of the submodule to be grep'd). If the identifier is a SHA1 then we want to fall through to the `GREP_SOURCE_SHA1` case to handle the copying of the SHA1. Signed-off-by: Brandon Williams--- grep.c | 16 +++- grep.h | 1 + 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index 1194d35..0dbdc1d 100644 --- a/grep.c +++ b/grep.c @@ -1735,12 +1735,23 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type, case GREP_SOURCE_FILE: gs->identifier = xstrdup(identifier); break; + case GREP_SOURCE_SUBMODULE: + if (!identifier) { + gs->identifier = NULL; + break; + } + /* +* FALL THROUGH +* If the identifier is non-NULL (in the submodule case) it +* will be a SHA1 that needs to be copied. +*/ case GREP_SOURCE_SHA1: gs->identifier = xmalloc(20); hashcpy(gs->identifier, identifier); break; case GREP_SOURCE_BUF: gs->identifier = NULL; + break; } } @@ -1760,6 +1771,7 @@ void grep_source_clear_data(struct grep_source *gs) switch (gs->type) { case GREP_SOURCE_FILE: case GREP_SOURCE_SHA1: + case GREP_SOURCE_SUBMODULE: free(gs->buf); gs->buf = NULL; gs->size = 0; @@ -1831,8 +1843,10 @@ static int grep_source_load(struct grep_source *gs) return grep_source_load_sha1(gs); case GREP_SOURCE_BUF: return gs->buf ? 0 : -1; + case GREP_SOURCE_SUBMODULE: + break; } - die("BUG: invalid grep_source type"); + die("BUG: invalid grep_source type to load"); } void grep_source_load_driver(struct grep_source *gs) diff --git a/grep.h b/grep.h index 5856a23..267534c 100644 --- a/grep.h +++ b/grep.h @@ -161,6 +161,7 @@ struct grep_source { GREP_SOURCE_SHA1, GREP_SOURCE_FILE, GREP_SOURCE_BUF, + GREP_SOURCE_SUBMODULE, } type; void *identifier; -- 2.8.0.rc3.226.g39d4020
[PATCH v6 2/6] submodules: load gitmodules file from commit sha1
teach submodules to load a '.gitmodules' file from a commit sha1. This enables the population of the submodule_cache to be based on the state of the '.gitmodules' file from a particular commit. Signed-off-by: Brandon Williams--- cache.h| 2 ++ config.c | 8 submodule-config.c | 6 +++--- submodule-config.h | 3 +++ submodule.c| 12 submodule.h| 1 + 6 files changed, 25 insertions(+), 7 deletions(-) diff --git a/cache.h b/cache.h index 1be6526..559a461 100644 --- a/cache.h +++ b/cache.h @@ -1690,6 +1690,8 @@ extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); +extern int git_config_from_blob_sha1(config_fn_t fn, const char *name, +const unsigned char *sha1, void *data); extern void git_config_push_parameter(const char *text); extern int git_config_from_parameters(config_fn_t fn, void *data); extern void git_config(config_fn_t fn, void *); diff --git a/config.c b/config.c index 83fdecb..4d78e72 100644 --- a/config.c +++ b/config.c @@ -1214,10 +1214,10 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ return do_config_from(, fn, data); } -static int git_config_from_blob_sha1(config_fn_t fn, -const char *name, -const unsigned char *sha1, -void *data) +int git_config_from_blob_sha1(config_fn_t fn, + const char *name, + const unsigned char *sha1, + void *data) { enum object_type type; char *buf; diff --git a/submodule-config.c b/submodule-config.c index 098085b..8b9a2ef 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -379,9 +379,9 @@ static int parse_config(const char *var, const char *value, void *data) return ret; } -static int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, - unsigned char *gitmodules_sha1, - struct strbuf *rev) +int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, + unsigned char *gitmodules_sha1, + struct strbuf *rev) { int ret = 0; diff --git a/submodule-config.h b/submodule-config.h index d05c542..78584ba 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -29,6 +29,9 @@ const struct submodule *submodule_from_name(const unsigned char *commit_sha1, const char *name); const struct submodule *submodule_from_path(const unsigned char *commit_sha1, const char *path); +extern int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, + unsigned char *gitmodules_sha1, + struct strbuf *rev); void submodule_free(void); #endif /* SUBMODULE_CONFIG_H */ diff --git a/submodule.c b/submodule.c index f336ca9..8516ab0 100644 --- a/submodule.c +++ b/submodule.c @@ -198,6 +198,18 @@ void gitmodules_config(void) } } +void gitmodules_config_sha1(const unsigned char *commit_sha1) +{ + struct strbuf rev = STRBUF_INIT; + unsigned char sha1[20]; + + if (gitmodule_sha1_from_commit(commit_sha1, sha1, )) { + git_config_from_blob_sha1(submodule_config, rev.buf, + sha1, NULL); + } + strbuf_release(); +} + /* * Determine if a submodule has been initialized at a given 'path' */ diff --git a/submodule.h b/submodule.h index 6ec5f2f..9203d89 100644 --- a/submodule.h +++ b/submodule.h @@ -37,6 +37,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt, const char *path); int submodule_config(const char *var, const char *value, void *cb); void gitmodules_config(void); +extern void gitmodules_config_sha1(const unsigned char *commit_sha1); extern int is_submodule_initialized(const char *path); extern int is_submodule_populated(const char *path); int parse_submodule_update_strategy(const char *value, -- 2.8.0.rc3.226.g39d4020
[PATCH v6 5/6] grep: enable recurse-submodules to work on objects
Teach grep to recursively search in submodules when provided with a object. This allows grep to search a submodule based on the state of the submodule that is present in a commit of the super project. When grep is provided with a object, the name of the object is prefixed to all output. In order to provide uniformity of output between the parent and child processes the option `--parent-basename` has been added so that the child can preface all of it's output with the name of the parent's object instead of the name of the commit SHA1 of the submodule. This changes output from the command `git grep -e. -l --recurse-submodules HEAD` from: HEAD:file :sub/file to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams--- Documentation/git-grep.txt | 13 - builtin/grep.c | 76 --- t/t7814-grep-recurse-submodules.sh | 103 - tree-walk.c| 28 ++ 4 files changed, 211 insertions(+), 9 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 17aa1ba..71f32f3 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -26,7 +26,7 @@ SYNOPSIS [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] - [--recurse-submodules] + [--recurse-submodules] [--parent-basename ] [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | ...] [--] [...] @@ -91,7 +91,16 @@ OPTIONS --recurse-submodules:: Recursively search in each submodule that has been initialized and - checked out in the repository. + checked out in the repository. When used in combination with the +option the prefix of all submodule output will be the name of + the parent project's object. + +--parent-basename :: + For internal use only. In order to produce uniform output with the + --recurse-submodules option, this option can be used to provide the + basename of a parent's object to a submodule so the submodule + can prefix its output with the parent's name rather than the SHA1 of + the submodule. -a:: --text:: diff --git a/builtin/grep.c b/builtin/grep.c index dca0be6..5918a26 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -19,6 +19,7 @@ #include "dir.h" #include "pathspec.h" #include "submodule.h" +#include "submodule-config.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), @@ -28,6 +29,7 @@ static char const * const grep_usage[] = { static const char *super_prefix; static int recurse_submodules; static struct argv_array submodule_options = ARGV_ARRAY_INIT; +static const char *parent_basename; static int grep_submodule_launch(struct grep_opt *opt, const struct grep_source *gs); @@ -534,19 +536,53 @@ static int grep_submodule_launch(struct grep_opt *opt, { struct child_process cp = CHILD_PROCESS_INIT; int status, i; + const char *end_of_base; + const char *name; struct work_item *w = opt->output_priv; + end_of_base = strchr(gs->name, ':'); + if (gs->identifier && end_of_base) + name = end_of_base + 1; + else + name = gs->name; + prepare_submodule_repo_env(_array); /* Add super prefix */ argv_array_pushf(, "--super-prefix=%s%s/", super_prefix ? super_prefix : "", -gs->name); +name); argv_array_push(, "grep"); + /* +* Add basename of parent project +* When performing grep on a tree object the filename is prefixed +* with the object's name: 'tree-name:filename'. In order to +* provide uniformity of output we want to pass the name of the +* parent project's object name to the submodule so the submodule can +* prefix its output with the parent's name and not its own SHA1. +*/ + if (gs->identifier && end_of_base) + argv_array_pushf(, "--parent-basename=%.*s", +(int) (end_of_base - gs->name), +gs->name); + /* Add options */ - for (i = 0; i < submodule_options.argc; i++) + for (i = 0; i < submodule_options.argc; i++) { + /* +* If there is a tree identifier for the submodule, add the +* rev after adding the submodule options but before the +* pathspecs. To do this we listen for the '--' and insert the +* sha1 before pushing the '--' onto the child process argv +* array. +*/ + if (gs->identifier && + !strcmp("--", submodule_options.argv[i])) { + argv_array_push(, sha1_to_hex(gs->identifier)); +
[PATCH v6 1/6] submodules: add helper functions to determine presence of submodules
Add two helper functions to submodules.c. `is_submodule_initialized()` checks if a submodule has been initialized at a given path and `is_submodule_populated()` check if a submodule has been checked out at a given path. Signed-off-by: Brandon Williams--- submodule.c | 39 +++ submodule.h | 2 ++ 2 files changed, 41 insertions(+) diff --git a/submodule.c b/submodule.c index 6f7d883..f336ca9 100644 --- a/submodule.c +++ b/submodule.c @@ -198,6 +198,45 @@ void gitmodules_config(void) } } +/* + * Determine if a submodule has been initialized at a given 'path' + */ +int is_submodule_initialized(const char *path) +{ + int ret = 0; + const struct submodule *module = NULL; + + module = submodule_from_path(null_sha1, path); + + if (module) { + char *key = xstrfmt("submodule.%s.url", module->name); + char *value = NULL; + + ret = !git_config_get_string(key, ); + + free(value); + free(key); + } + + return ret; +} + +/* + * Determine if a submodule has been populated at a given 'path' + */ +int is_submodule_populated(const char *path) +{ + int ret = 0; + struct stat st; + char *gitdir = xstrfmt("%s/.git", path); + + if (!stat(gitdir, )) + ret = 1; + + free(gitdir); + return ret; +} + int parse_submodule_update_strategy(const char *value, struct submodule_update_strategy *dst) { diff --git a/submodule.h b/submodule.h index d9e197a..6ec5f2f 100644 --- a/submodule.h +++ b/submodule.h @@ -37,6 +37,8 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt, const char *path); int submodule_config(const char *var, const char *value, void *cb); void gitmodules_config(void); +extern int is_submodule_initialized(const char *path); +extern int is_submodule_populated(const char *path); int parse_submodule_update_strategy(const char *value, struct submodule_update_strategy *dst); const char *submodule_strategy_to_string(const struct submodule_update_strategy *s); -- 2.8.0.rc3.226.g39d4020
[PATCH v6 0/6] recursively grep across submodules
v6 fixes a race condition which existed in the 'is_submodule_populated' function. Instead of calling 'resolve_gitdir' to check for the existance of a .git file/directory, use 'stat'. 'resolve_gitdir' calls 'chdir' which can affect other running threads trying to load thier files into a buffer in memory. Thanks to Stefan and Jeff for help debugging this problem. Brandon Williams (6): submodules: add helper functions to determine presence of submodules submodules: load gitmodules file from commit sha1 grep: add submodules as a grep source type grep: optionally recurse into submodules grep: enable recurse-submodules to work on objects grep: search history of moved submodules Documentation/git-grep.txt | 14 ++ builtin/grep.c | 386 ++--- cache.h| 2 + config.c | 8 +- git.c | 2 +- grep.c | 16 +- grep.h | 1 + submodule-config.c | 6 +- submodule-config.h | 3 + submodule.c| 51 + submodule.h| 3 + t/t7814-grep-recurse-submodules.sh | 241 +++ tree-walk.c| 28 +++ 13 files changed, 730 insertions(+), 31 deletions(-) create mode 100755 t/t7814-grep-recurse-submodules.sh --- interdiff based on 'bw/grep-recurse-submodules' diff --git a/submodule.c b/submodule.c index 062e58b..8516ab0 100644 --- a/submodule.c +++ b/submodule.c @@ -239,9 +239,10 @@ int is_submodule_initialized(const char *path) int is_submodule_populated(const char *path) { int ret = 0; + struct stat st; char *gitdir = xstrfmt("%s/.git", path); - if (resolve_gitdir(gitdir)) + if (!stat(gitdir, )) ret = 1; free(gitdir); -- 2.8.0.rc3.226.g39d4020
[PATCH] Documentation/install-webdoc.sh: quote a potentially unsafe shell expansion
Found via shellcheck In Documentation/install-webdoc.sh line 21: mkdir -p $(dirname "$T/$h") ^-- SC2046: Quote this to prevent word splitting. -- -Austin GPG: 14FB D7EA A041 937B From 1050538f252d22311185065ab8837c71b17003fb Mon Sep 17 00:00:00 2001 From: Austin EnglishDate: Wed, 30 Nov 2016 19:21:25 -0600 Subject: [PATCH] Documentation/install-webdoc.sh: quote a potentially unsafe shell expansion Signed-off-by: Austin English --- Documentation/install-webdoc.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/install-webdoc.sh b/Documentation/install-webdoc.sh index ed8b4ff..5fb2dc5 100755 --- a/Documentation/install-webdoc.sh +++ b/Documentation/install-webdoc.sh @@ -18,7 +18,7 @@ do else echo >&2 "# install $h $T/$h" rm -f "$T/$h" - mkdir -p $(dirname "$T/$h") + mkdir -p "$(dirname "$T/$h")" cp "$h" "$T/$h" fi done -- 2.7.3
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Stefan Beller wrote: > > Oh interesting, I wonder if there is a way to not have to perform a > > chdir since taking a lock to lstat wouldn't be ideal. > > I think we could rewrite is_submodule_populated to be > > int is_submodule_populated_cheap_with_no_chdir(char *path) > { > return stat(path + ".git") > } > > i.e. just take the presence of the .git file/dir as a hint to run > the child process? I like this approach, its a quick (thread-safe) check to see if the submodule is interesting. If there happens to be an error with the submodule's .git file/directory then the child process will fail out. -- Brandon Williams
Re: [PATCH] difftool.c: mark a file-local symbol with static
On 30/11/16 23:46, Junio C Hamano wrote: > Jeff Kingwrites: > >> I forgot, we ended up reversing course later and silencing them: >> >> http://public-inbox.org/git/20140505052117.gc6...@sigill.intra.peff.net/ >> >> By the rationale of that conversation, we should be doing: >> >> warning("%s", ""); >> >> here. > > I forgot too. Thanks for digging up that thread. Yes, I blamed wt-status.c:227 and came up with commit 7d7d68022 as well. So, by the same rationale, we should remove -Wno-format-zero-length from DEVELOPER_CFLAGS. yes? ATB, Ramsay Jones
[PATCH] transport-helper: drop broken "unchanged" feature
Commit f8ec916 ("Allow helpers to report in "list" command that the ref is unchanged", 2009-11-17) allowed a remote helper to report that a ref is unchanged even if it did not know the contents of a ref. However, that commit also made Git wrongly assume that such a remote ref always has the contents of the local ref of the same name. (Git also cannot assume that the remote ref has the value of the current destination local ref, or any other ref, since the previous import/fetch could have been made using a different refspec.) Drop that assumption. Signed-off-by: Jonathan Tan--- Here is a patch that would remove the assumption (if it is indeed wrong). Documentation/gitremote-helpers.txt | 3 +++ transport-helper.c | 43 - 2 files changed, 3 insertions(+), 43 deletions(-) diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt index 9e8681f..c862339 100644 --- a/Documentation/gitremote-helpers.txt +++ b/Documentation/gitremote-helpers.txt @@ -393,6 +393,9 @@ attributes are defined. 'unchanged':: This ref is unchanged since the last import or fetch, although the helper cannot necessarily determine what value that produced. + Git may import or fetch this ref anyway, because it does not + keep a record of the last values corresponding to the refs of a + specific remote. OPTIONS --- diff --git a/transport-helper.c b/transport-helper.c index 91aed35..6ab8e2f 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -392,9 +392,6 @@ static int fetch_with_fetch(struct transport *transport, for (i = 0; i < nr_heads; i++) { const struct ref *posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; - strbuf_addf(, "fetch %s %s\n", oid_to_hex(>old_oid), posn->symref ? posn->symref : posn->name); @@ -492,9 +489,6 @@ static int fetch_with_import(struct transport *transport, for (i = 0; i < nr_heads; i++) { posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; - strbuf_addf(, "import %s\n", posn->symref ? posn->symref : posn->name); sendline(data, ); @@ -531,8 +525,6 @@ static int fetch_with_import(struct transport *transport, for (i = 0; i < nr_heads; i++) { char *private, *name; posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; name = posn->symref ? posn->symref : posn->name; if (data->refspecs) private = apply_refspecs(data->refspecs, data->refspec_nr, name); @@ -649,21 +641,12 @@ static int fetch(struct transport *transport, int nr_heads, struct ref **to_fetch) { struct helper_data *data = transport->data; - int i, count; if (process_connect(transport, 0)) { do_take_over(transport); return transport->fetch(transport, nr_heads, to_fetch); } - count = 0; - for (i = 0; i < nr_heads; i++) - if (!(to_fetch[i]->status & REF_STATUS_UPTODATE)) - count++; - - if (!count) - return 0; - if (data->check_connectivity && data->transport_options.check_self_contained_and_connected) set_helper_option(transport, "check-connectivity", "true"); @@ -1009,23 +992,6 @@ static int push_refs(struct transport *transport, return -1; } - -static int has_attribute(const char *attrs, const char *attr) { - int len; - if (!attrs) - return 0; - - len = strlen(attr); - for (;;) { - const char *space = strchrnul(attrs, ' '); - if (len == space - attrs && !strncmp(attrs, attr, len)) - return 1; - if (!*space) - return 0; - attrs = space + 1; - } -} - static struct ref *get_refs_list(struct transport *transport, int for_push) { struct helper_data *data = transport->data; @@ -1067,15 +1033,6 @@ static struct ref *get_refs_list(struct transport *transport, int for_push) (*tail)->symref = xstrdup(buf.buf + 1); else if (buf.buf[0] != '?') get_oid_hex(buf.buf, &(*tail)->old_oid); - if (eon) { - if (has_attribute(eon + 1, "unchanged")) { - (*tail)->status |= REF_STATUS_UPTODATE; - if (read_ref((*tail)->name, -(*tail)->old_oid.hash) < 0) - die(_("Could not read ref %s"), -
[BUG?] Remote helper's wrong assumption of unchanged ref
Commit f8ec916 ("Allow helpers to report in "list" command that the ref is unchanged", 2009-11-17) allowed a remote helper to report that a ref is unchanged even if it did not know the contents of a ref. However, that commit also made Git assume that such a remote ref has the contents of the local ref of the same name. If I'm not missing anything, this assumption seems wrong; the attached test illustrates one case of local edits being made after cloning with default parameters. The original e-mail thread [1] seems to indicate that this feature is meant for a remote helper with no Git-specific code (which is possible if it supports "import" but not "fetch" - in this case, it would not deal with SHA-1s at all) to nevertheless indicate "unchanged", most likely to support optimizations on the client side. But it seems to me that Git cannot perform this optimization. In other words, it should just ignore "unchanged". If this makes sense, I'll prepare a patch to do this. [1] "[PATCH 00/13] Native and foreign helpers"--- git-remote-testgit.sh | 8 +++- t/t5801-remote-helpers.sh | 40 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh index 752c763..6357868 100755 --- a/git-remote-testgit.sh +++ b/git-remote-testgit.sh @@ -52,7 +52,13 @@ do echo ;; list) - git for-each-ref --format='? %(refname)' 'refs/heads/' + if test -n "$GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX" + then + git for-each-ref --format='? %(refname)' 'refs/heads/' | + sed "/${GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX}/s/$/ unchanged/" + else + git for-each-ref --format='? %(refname)' 'refs/heads/' + fi head=$(git symbolic-ref HEAD) echo "@$head HEAD" echo diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh index 362b158..4a48f2b 100755 --- a/t/t5801-remote-helpers.sh +++ b/t/t5801-remote-helpers.sh @@ -301,4 +301,44 @@ test_expect_success 'fetch url' ' compare_refs server HEAD local FETCH_HEAD ' +test_expect_success 'setup remote repository and divergent clone' ' + git init s2 && + ( + cd s2 && + test_commit M1 && + git checkout -b mybranch && + test_commit B1 + ) && + git clone "testgit::${PWD}/s2" divergent && + + ( + cd divergent && + git checkout master && + test_commit M2 && + git checkout mybranch && + test_commit B2 + ) +' + +test_expect_success 'fetch with unchanged claims' ' + rm -rf local && + cp -r divergent local && + + # No unchanged branches + + GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX=ABCDE git -C local fetch && + compare_refs s2 M1 local refs/remotes/origin/master && + compare_refs s2 B1 local refs/remotes/origin/mybranch && + + # One unchanged branch + + GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX=mybranch git -C local fetch && + compare_refs s2 M1 local refs/remotes/origin/master && + + # I (Jonathan Tan) would expect refs/remotes/origin/mybranch to be B1, + # but it is B2. + test_must_fail compare_refs s2 B1 local refs/remotes/origin/mybranch && + compare_refs local B2 local refs/remotes/origin/mybranch +' + test_done -- 2.8.0.rc3.226.g39d4020
Grant.
Hello, A donation has been made out to you and your family from my foundation. For details, send an E-mail now to: castrat...@gmail.com Regards
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 04:06:05PM -0800, Brandon Williams wrote: > On 11/30, Jeff King wrote: > > So I think there is some other chdir(). I'm not sure if there is an easy > > way to get a backtrace on every call to chdir() in every thread. I'm > > sure somebody more clever than me could figure out how to make gdb do it > > automatically, but it might be workable manually. I think the chdir was > > in the main thread. > > > > -Peff > > Yeah maybe we're missing something else... > > How did you run strace with your stress script? It's hidden in the patch I sent a moment ago, but basically just "strace -o foo.out" will dump the trace in the trash directory. After the stress script runs, you can "cat fail/trash*/foo.out". > > (gdb) bt > > #0 chdir () at ../sysdeps/unix/syscall-template.S:84 > > #1 0x555fe259 in real_path_internal (path=0x559f6b30 > > "su:b/../.git/modules/su:b", die_on_error=1) > > at abspath.c:84 > > #2 0x555fe48a in real_path (path=0x559f6b30 > > "su:b/../.git/modules/su:b") at abspath.c:135 > > #3 0x556d09e6 in read_gitfile_gently (path=0x559f6ac0 > > "su:b/.git", return_error_code=0x0) > > at setup.c:555 > > #4 0x556d19cf in resolve_gitdir (suspect=0x559f6ac0 > > "su:b/.git") at setup.c:1021 > > #5 0x556e7e34 in is_submodule_populated (path=0x559f5ec8 > > "su:b") at submodule.c:244 > > #6 0x555a0f05 in grep_submodule (opt=0x7fffd8b0, sha1=0x0, > > filename=0x559f5ec8 "su:b", > > path=0x559f5ec8 "su:b") at builtin/grep.c:619 > > #7 0x555a12ac in grep_cache (opt=0x7fffd8b0, > > pathspec=0x7fffd880, cached=0) at builtin/grep.c:700 > > #8 0x555a36cb in cmd_grep (argc=0, argv=0x7fffdf40, > > prefix=0x0) at builtin/grep.c:1257 > > #9 0x5556603b in run_builtin (p=0x559b3ad8, > > argc=4, argv=0x7fffdf40) at git.c:373 > > #10 0x555662bc in handle_builtin (argc=4, argv=0x7fffdf40) at > > git.c:572 > > #11 0x5556641a in run_argv (argcp=0x7fffddfc, > > argv=0x7fffddf0) at git.c:630 > > #12 0x555665a8 in cmd_main (argc=4, argv=0x7fffdf40) at > > git.c:702 > > #13 0x555fde47 in main (argc=7, argv=0x7fffdf28) at > > common-main.c:40 > > > > So is_submodule_populated() needs to take a lock. But what's really > > gross is that the _other_ threads need to lock just to call lstat(). > > Presumably it could be done as a reader/writer type of lock where many > > "reader" threads can take the "I need to lstat()" lock simultaneously, > > but block when an "I'm going to chdir()" writer holds it. > > Oh interesting, I wonder if there is a way to not have to perform a > chdir since taking a lock to lstat wouldn't be ideal. I don't think so. It comes from real_path(), which needs to either chdir(), or start interpreting symbolic links itself (and madness that way lies). I think with a reader/writer lock as I described it wouldn't be too bad. The common case would pay only the locking cost and not ever block, since submodules are rare (and they're super-heavyweight to descend into anyway). I think putting it at the individual lstat() would be way too low, but probably you could do it right before calling grep_source(). It may even be possible to do some of the submodule work ahead of time while holding grep_lock(). > Thanks for helping out with this! I wasn't planning on it, but this turned into an intriguing puzzle. ;) -Peff
'git repack' and repack.writeBitmaps=true with kept packs
I have some unexpected behavior with 'git repack' on git 2.10.2 and 2.11.0. $ cat /etc/gitconfig [pack] writeBitmapHashCache = true [repack] writeBitmaps = true $ touch objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.keep $ find objects objects objects/pack objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.keep objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.idx objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.bitmap objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.pack objects/info objects/info/packs $ git repack -Adfl Counting objects: 16321, done. Delta compression using up to 24 threads. Compressing objects: 100% (16118/16118), done. Writing objects: 100% (16321/16321), done. Reusing bitmaps: 110, done. Selecting bitmap commits: 3257, done. Building bitmaps: 100% (137/137), done. Total 16321 (delta 11568), reused 4507 (delta 0) $ du -sh objects/pack/pack* 100Kobjects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.bitmap 524Kobjects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.idx 512 objects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.keep 4.3Mobjects/pack/pack-3841d81123a96cedeb3c1bd7acf7e29bfba26639.pack 100Kobjects/pack/pack-6043151bee7bd61bdae6e3a2ba0f13cd1b0277af.bitmap 524Kobjects/pack/pack-6043151bee7bd61bdae6e3a2ba0f13cd1b0277af.idx 5.8Mobjects/pack/pack-6043151bee7bd61bdae6e3a2ba0f13cd1b0277af.pack It seems like it's behaving as though I've provided --pack-kept-objects. In order to ensure the .bitmap is created, it repacks everything, including everything in existing .pack files (not respecting .keep). But then it's not deleting the old .pack file (oddly, respecting .keep). What I'd expect it to do here is ignore the 'repack.writeBitmaps = true' value if there's a .keep that needs to be respected. Is this not a correct assumption?
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
> Oh interesting, I wonder if there is a way to not have to perform a > chdir since taking a lock to lstat wouldn't be ideal. I think we could rewrite is_submodule_populated to be int is_submodule_populated_cheap_with_no_chdir(char *path) { return stat(path + ".git") } i.e. just take the presence of the .git file/dir as a hint to run the child process?
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Jeff King wrote: > On Wed, Nov 30, 2016 at 06:59:52PM -0500, Jeff King wrote: > > > So I think there is some other chdir(). I'm not sure if there is an easy > > way to get a backtrace on every call to chdir() in every thread. I'm > > sure somebody more clever than me could figure out how to make gdb do it > > automatically, but it might be workable manually. I think the chdir was > > in the main thread. > > Ah, that turned out to be quite easy. The culprit is probably: > > (gdb) bt > #0 chdir () at ../sysdeps/unix/syscall-template.S:84 > #1 0x555fe259 in real_path_internal (path=0x559f6b30 > "su:b/../.git/modules/su:b", die_on_error=1) > at abspath.c:84 > #2 0x555fe48a in real_path (path=0x559f6b30 > "su:b/../.git/modules/su:b") at abspath.c:135 > #3 0x556d09e6 in read_gitfile_gently (path=0x559f6ac0 > "su:b/.git", return_error_code=0x0) > at setup.c:555 > #4 0x556d19cf in resolve_gitdir (suspect=0x559f6ac0 "su:b/.git") > at setup.c:1021 > #5 0x556e7e34 in is_submodule_populated (path=0x559f5ec8 "su:b") > at submodule.c:244 > #6 0x555a0f05 in grep_submodule (opt=0x7fffd8b0, sha1=0x0, > filename=0x559f5ec8 "su:b", > path=0x559f5ec8 "su:b") at builtin/grep.c:619 > #7 0x555a12ac in grep_cache (opt=0x7fffd8b0, > pathspec=0x7fffd880, cached=0) at builtin/grep.c:700 > #8 0x555a36cb in cmd_grep (argc=0, argv=0x7fffdf40, prefix=0x0) > at builtin/grep.c:1257 > #9 0x5556603b in run_builtin (p=0x559b3ad8, > argc=4, argv=0x7fffdf40) at git.c:373 > #10 0x555662bc in handle_builtin (argc=4, argv=0x7fffdf40) at > git.c:572 > #11 0x5556641a in run_argv (argcp=0x7fffddfc, > argv=0x7fffddf0) at git.c:630 > #12 0x555665a8 in cmd_main (argc=4, argv=0x7fffdf40) at git.c:702 > #13 0x555fde47 in main (argc=7, argv=0x7fffdf28) at > common-main.c:40 > > So is_submodule_populated() needs to take a lock. But what's really > gross is that the _other_ threads need to lock just to call lstat(). > Presumably it could be done as a reader/writer type of lock where many > "reader" threads can take the "I need to lstat()" lock simultaneously, > but block when an "I'm going to chdir()" writer holds it. > > -Peff Oh interesting, I wonder if there is a way to not have to perform a chdir since taking a lock to lstat wouldn't be ideal. Thanks for helping out with this! -- Brandon Williams
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Jeff King wrote: > So I think there is some other chdir(). I'm not sure if there is an easy > way to get a backtrace on every call to chdir() in every thread. I'm > sure somebody more clever than me could figure out how to make gdb do it > automatically, but it might be workable manually. I think the chdir was > in the main thread. > > -Peff Yeah maybe we're missing something else... How did you run strace with your stress script? -- Brandon Williams
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 06:59:52PM -0500, Jeff King wrote: > So I think there is some other chdir(). I'm not sure if there is an easy > way to get a backtrace on every call to chdir() in every thread. I'm > sure somebody more clever than me could figure out how to make gdb do it > automatically, but it might be workable manually. I think the chdir was > in the main thread. Ah, that turned out to be quite easy. The culprit is probably: (gdb) bt #0 chdir () at ../sysdeps/unix/syscall-template.S:84 #1 0x555fe259 in real_path_internal (path=0x559f6b30 "su:b/../.git/modules/su:b", die_on_error=1) at abspath.c:84 #2 0x555fe48a in real_path (path=0x559f6b30 "su:b/../.git/modules/su:b") at abspath.c:135 #3 0x556d09e6 in read_gitfile_gently (path=0x559f6ac0 "su:b/.git", return_error_code=0x0) at setup.c:555 #4 0x556d19cf in resolve_gitdir (suspect=0x559f6ac0 "su:b/.git") at setup.c:1021 #5 0x556e7e34 in is_submodule_populated (path=0x559f5ec8 "su:b") at submodule.c:244 #6 0x555a0f05 in grep_submodule (opt=0x7fffd8b0, sha1=0x0, filename=0x559f5ec8 "su:b", path=0x559f5ec8 "su:b") at builtin/grep.c:619 #7 0x555a12ac in grep_cache (opt=0x7fffd8b0, pathspec=0x7fffd880, cached=0) at builtin/grep.c:700 #8 0x555a36cb in cmd_grep (argc=0, argv=0x7fffdf40, prefix=0x0) at builtin/grep.c:1257 #9 0x5556603b in run_builtin (p=0x559b3ad8, argc=4, argv=0x7fffdf40) at git.c:373 #10 0x555662bc in handle_builtin (argc=4, argv=0x7fffdf40) at git.c:572 #11 0x5556641a in run_argv (argcp=0x7fffddfc, argv=0x7fffddf0) at git.c:630 #12 0x555665a8 in cmd_main (argc=4, argv=0x7fffdf40) at git.c:702 #13 0x555fde47 in main (argc=7, argv=0x7fffdf28) at common-main.c:40 So is_submodule_populated() needs to take a lock. But what's really gross is that the _other_ threads need to lock just to call lstat(). Presumably it could be done as a reader/writer type of lock where many "reader" threads can take the "I need to lstat()" lock simultaneously, but block when an "I'm going to chdir()" writer holds it. -Peff
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 06:46:36PM -0500, Jeff King wrote: > > For now the work around could be to just pass "-C " to the child > > process instead of relying on run-command to chdir. > > Yeah, that would push it after the exec. I just don't understand why > that would be necessary. Hmm. It still seems to fail, even with the workaround (the patch in run-command is there to make sure there's not some other call that we're not catching): diff --git a/builtin/grep.c b/builtin/grep.c index 2c727ef49..3323a3e7f 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -553,6 +553,7 @@ static int grep_submodule_launch(struct grep_opt *opt, argv_array_pushf(, "--super-prefix=%s%s/", super_prefix ? super_prefix : "", name); + argv_array_pushl(, "-C", gs->path, NULL); argv_array_push(, "grep"); /* @@ -586,7 +587,6 @@ static int grep_submodule_launch(struct grep_opt *opt, } cp.git_cmd = 1; - cp.dir = gs->path; /* * Capture output to output buffer and check the return code from the diff --git a/run-command.c b/run-command.c index 5a4dbb66d..d040f4f77 100644 --- a/run-command.c +++ b/run-command.c @@ -393,6 +393,8 @@ int start_command(struct child_process *cmd) close(cmd->out); } + if (cmd->dir && git_env_bool("GIT_NO_CHDIR", 0)) + die("temporarily disallowing chdir"); if (cmd->dir && chdir(cmd->dir)) die_errno("exec '%s': cd to '%s' failed", cmd->argv[0], cmd->dir); diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index 050777186..591ff74ed 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -175,7 +175,7 @@ test_expect_success 'grep recurse submodule colon in name' ' fi:le:foobar su:b/fi:le:foobar EOF - git -C parent grep -e "foobar" --recurse-submodules >actual && + GIT_NO_CHDIR=1 strace -o foo.out -f git -C parent grep -e "foobar" --recurse-submodules >actual && test_cmp expect actual && cat >expect <<-\EOF && So I think there is some other chdir(). I'm not sure if there is an easy way to get a backtrace on every call to chdir() in every thread. I'm sure somebody more clever than me could figure out how to make gdb do it automatically, but it might be workable manually. I think the chdir was in the main thread. -Peff
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Jeff King wrote: > On Wed, Nov 30, 2016 at 03:42:48PM -0800, Brandon Williams wrote: > > > > where 20813 and 20867 are two threads of the main process. One is doing > > > the lstat and the other calls chdir at the same moment. > > > > Yeah so it looks like the start_command function calls chdir. Which > > means any uses of the run-command interface are not thread safe > > That seems crazy. The chdir should be happening on the child side of the > fork (and looking at the code, it seems to be the case). And on the > Windows side, without fork, it's an option to the spawn call, which > makes sense. > > > For now the work around could be to just pass "-C " to the child > > process instead of relying on run-command to chdir. > > Yeah, that would push it after the exec. I just don't understand why > that would be necessary. > > -Peff You're right, I jumped the gun. That doesn't seem to fix the problem (as I'm still seeing the same failure). -- Brandon Williams
Re: [PATCH] difftool.c: mark a file-local symbol with static
Jeff Kingwrites: > I forgot, we ended up reversing course later and silencing them: > > http://public-inbox.org/git/20140505052117.gc6...@sigill.intra.peff.net/ > > By the rationale of that conversation, we should be doing: > > warning("%s", ""); > > here. I forgot too. Thanks for digging up that thread.
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 03:42:48PM -0800, Brandon Williams wrote: > > where 20813 and 20867 are two threads of the main process. One is doing > > the lstat and the other calls chdir at the same moment. > > Yeah so it looks like the start_command function calls chdir. Which > means any uses of the run-command interface are not thread safe That seems crazy. The chdir should be happening on the child side of the fork (and looking at the code, it seems to be the case). And on the Windows side, without fork, it's an option to the spawn call, which makes sense. > For now the work around could be to just pass "-C " to the child > process instead of relying on run-command to chdir. Yeah, that would push it after the exec. I just don't understand why that would be necessary. -Peff
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Jeff King wrote: > On Wed, Nov 30, 2016 at 06:32:04PM -0500, Jeff King wrote: > > > On Wed, Nov 30, 2016 at 03:28:23PM -0800, Brandon Williams wrote: > > > > > So I couldn't find a race condition in the code. I tracked the problem > > > to grep_source_load_file which attempts to run lstat on the file so that > > > it can read it into a buffer. The lstat call fails with ENOENT (which > > > conveniently is skipped by the if statement which calls error_errno). So > > > for some reason the file cannot be found and read into memory resulting > > > in nothing being grep'ed for that particular file (since the buffer is > > > NULL). > > > > That's definitely weird. Is it possible that any of the underlying calls > > from another thread are using chdir()? I think realpath() make do that > > behind the scenes, and there may be others. > > > > A full strace from a failing case would be interesting reading. In > > theory we should be able to get that by running the stress script for > > long enough. :) > > Actually, it failed pretty much immediately. I guess the extra stracing > changes the timing to make the problem _more_ likely. > > And indeed, I see: > > 20867 lstat("fi:le", > 20813 <... read resumed> "", 232) = 0 > 20871 futex(0x558cdec8b164, FUTEX_WAIT_PRIVATE, 7, NULL > 20813 close(7 > 20870 <... futex resumed> ) = 0 > 20869 lstat(".gitmodules", > 20813 <... close resumed> ) = 0 > 20865 set_robust_list(0x7f1df92579e0, 24 > 20813 lstat("su:b/../.git/modules/su:b/commondir", 0x7ffecc8b3ac0) = -1 > ENOENT (No such file or directory) > 20865 <... set_robust_list resumed> ) = 0 > 20868 set_robust_list(0x7f1df7a549e0, 24 > 20813 access("su:b/../.git/modules/su:b/objects", X_OK) = 0 > 20813 access("su:b/../.git/modules/su:b/refs", X_OK) = 0 > 20813 stat("su:b/../.git/modules/su:b", {st_mode=S_IFDIR|0755, st_size=280, > ...}) = 0 > 20813 getcwd("/var/ram/git-stress/root-4/trash > directory.t7814-grep-recurse-submodules/parent", 129) = 80 > 20869 <... lstat resumed> {st_mode=S_IFREG|0644, st_size=47, ...}) = 0 > 20813 chdir("su:b/../.git/modules/su:b") = 0 > 20869 open(".gitmodules", O_RDONLY > 20813 getcwd( > 20867 <... lstat resumed> 0x7f1df8254cf0) = -1 ENOENT (No such file or > directory) > > where 20813 and 20867 are two threads of the main process. One is doing > the lstat and the other calls chdir at the same moment. > > -Peff Yeah so it looks like the start_command function calls chdir. Which means any uses of the run-command interface are not thread safe For now the work around could be to just pass "-C " to the child process instead of relying on run-command to chdir. -- Brandon Williams
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 3:40 PM, Jeff Kingwrote: > On Wed, Nov 30, 2016 at 06:32:04PM -0500, Jeff King wrote: > >> On Wed, Nov 30, 2016 at 03:28:23PM -0800, Brandon Williams wrote: >> >> > So I couldn't find a race condition in the code. I tracked the problem >> > to grep_source_load_file which attempts to run lstat on the file so that >> > it can read it into a buffer. The lstat call fails with ENOENT (which >> > conveniently is skipped by the if statement which calls error_errno). So >> > for some reason the file cannot be found and read into memory resulting >> > in nothing being grep'ed for that particular file (since the buffer is >> > NULL). >> >> That's definitely weird. Is it possible that any of the underlying calls >> from another thread are using chdir()? I think realpath() make do that >> behind the scenes, and there may be others. >> >> A full strace from a failing case would be interesting reading. In >> theory we should be able to get that by running the stress script for >> long enough. :) > > Actually, it failed pretty much immediately. I guess the extra stracing > changes the timing to make the problem _more_ likely. > > And indeed, I see: > > 20867 lstat("fi:le", > 20813 <... read resumed> "", 232) = 0 > 20871 futex(0x558cdec8b164, FUTEX_WAIT_PRIVATE, 7, NULL > 20813 close(7 > 20870 <... futex resumed> ) = 0 > 20869 lstat(".gitmodules", > 20813 <... close resumed> ) = 0 > 20865 set_robust_list(0x7f1df92579e0, 24 > 20813 lstat("su:b/../.git/modules/su:b/commondir", 0x7ffecc8b3ac0) = -1 > ENOENT (No such file or directory) > 20865 <... set_robust_list resumed> ) = 0 > 20868 set_robust_list(0x7f1df7a549e0, 24 > 20813 access("su:b/../.git/modules/su:b/objects", X_OK) = 0 > 20813 access("su:b/../.git/modules/su:b/refs", X_OK) = 0 > 20813 stat("su:b/../.git/modules/su:b", {st_mode=S_IFDIR|0755, st_size=280, > ...}) = 0 > 20813 getcwd("/var/ram/git-stress/root-4/trash > directory.t7814-grep-recurse-submodules/parent", 129) = 80 > 20869 <... lstat resumed> {st_mode=S_IFREG|0644, st_size=47, ...}) = 0 > 20813 chdir("su:b/../.git/modules/su:b") = 0 > 20869 open(".gitmodules", O_RDONLY > 20813 getcwd( > 20867 <... lstat resumed> 0x7f1df8254cf0) = -1 ENOENT (No such file or > directory) > > where 20813 and 20867 are two threads of the main process. One is doing > the lstat and the other calls chdir at the same moment. > Lessons learned here: The run-command API is not thread safe when used with setting the directory. If you need to run a thing in a threaded environment run git -C ... such that the child chdirs. Are there any other threaded environments that run things with .dir set?
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 06:32:04PM -0500, Jeff King wrote: > On Wed, Nov 30, 2016 at 03:28:23PM -0800, Brandon Williams wrote: > > > So I couldn't find a race condition in the code. I tracked the problem > > to grep_source_load_file which attempts to run lstat on the file so that > > it can read it into a buffer. The lstat call fails with ENOENT (which > > conveniently is skipped by the if statement which calls error_errno). So > > for some reason the file cannot be found and read into memory resulting > > in nothing being grep'ed for that particular file (since the buffer is > > NULL). > > That's definitely weird. Is it possible that any of the underlying calls > from another thread are using chdir()? I think realpath() make do that > behind the scenes, and there may be others. > > A full strace from a failing case would be interesting reading. In > theory we should be able to get that by running the stress script for > long enough. :) Actually, it failed pretty much immediately. I guess the extra stracing changes the timing to make the problem _more_ likely. And indeed, I see: 20867 lstat("fi:le", 20813 <... read resumed> "", 232) = 0 20871 futex(0x558cdec8b164, FUTEX_WAIT_PRIVATE, 7, NULL 20813 close(7 20870 <... futex resumed> ) = 0 20869 lstat(".gitmodules", 20813 <... close resumed> ) = 0 20865 set_robust_list(0x7f1df92579e0, 24 20813 lstat("su:b/../.git/modules/su:b/commondir", 0x7ffecc8b3ac0) = -1 ENOENT (No such file or directory) 20865 <... set_robust_list resumed> ) = 0 20868 set_robust_list(0x7f1df7a549e0, 24 20813 access("su:b/../.git/modules/su:b/objects", X_OK) = 0 20813 access("su:b/../.git/modules/su:b/refs", X_OK) = 0 20813 stat("su:b/../.git/modules/su:b", {st_mode=S_IFDIR|0755, st_size=280, ...}) = 0 20813 getcwd("/var/ram/git-stress/root-4/trash directory.t7814-grep-recurse-submodules/parent", 129) = 80 20869 <... lstat resumed> {st_mode=S_IFREG|0644, st_size=47, ...}) = 0 20813 chdir("su:b/../.git/modules/su:b") = 0 20869 open(".gitmodules", O_RDONLY 20813 getcwd( 20867 <... lstat resumed> 0x7f1df8254cf0) = -1 ENOENT (No such file or directory) where 20813 and 20867 are two threads of the main process. One is doing the lstat and the other calls chdir at the same moment. -Peff
Re: [RFC/PATCH v3 01/16] Add initial external odb support
On Wed, Nov 30, 2016 at 03:30:09PM -0800, Junio C Hamano wrote: > Christian Couderwrites: > > > From: Jeff King > > > > Signed-off-by: Christian Couder > > --- > > By the time the series loses RFC prefix, we'd need to see the above > three lines straightened out a bit more, e.g. a real message and a > more proper sign-off sequence. Actually, I would assume that this patch would go away altogether and get folded into commits that get us closer to the end state. But I haven't read the series carefully yet, so maybe it really does work better with this as a base. I am perfectly OK dropping my commit count by one and getting "based on a patch by Peff" in a cover letter or elsewhere. -Peff
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On Wed, Nov 30, 2016 at 03:28:23PM -0800, Brandon Williams wrote: > So I couldn't find a race condition in the code. I tracked the problem > to grep_source_load_file which attempts to run lstat on the file so that > it can read it into a buffer. The lstat call fails with ENOENT (which > conveniently is skipped by the if statement which calls error_errno). So > for some reason the file cannot be found and read into memory resulting > in nothing being grep'ed for that particular file (since the buffer is > NULL). That's definitely weird. Is it possible that any of the underlying calls from another thread are using chdir()? I think realpath() make do that behind the scenes, and there may be others. A full strace from a failing case would be interesting reading. In theory we should be able to get that by running the stress script for long enough. :) -Peff
Re: [RFC/PATCH v3 01/16] Add initial external odb support
Christian Couderwrites: > From: Jeff King > > Signed-off-by: Christian Couder > --- By the time the series loses RFC prefix, we'd need to see the above three lines straightened out a bit more, e.g. a real message and a more proper sign-off sequence. > +static struct odb_helper *find_or_create_helper(const char *name, int len) > +{ > + struct odb_helper *o; > + > + for (o = helpers; o; o = o->next) > + if (!strncmp(o->name, name, len) && !o->name[len]) > + return o; > + > + o = odb_helper_new(name, len); > + *helpers_tail = o; > + helpers_tail = >next; > + > + return o; > +} > + > +static int external_odb_config(const char *var, const char *value, void > *data) > +{ > + struct odb_helper *o; > + const char *key, *dot; > + > + if (!skip_prefix(var, "odb.", )) > + return 0; > + dot = strrchr(key, '.'); > + if (!dot) > + return 0; Is this something Peff wrote long time ago? I find it surprising that he would write this without using parse_config_key(). > +struct odb_helper_cmd { > + struct argv_array argv; > + struct child_process child; > +}; > + > +static void prepare_helper_command(struct argv_array *argv, const char *cmd, > +const char *fmt, va_list ap) > +{ > + struct strbuf buf = STRBUF_INIT; > + > + strbuf_addstr(, cmd); > + strbuf_addch(, ' '); > + strbuf_vaddf(, fmt, ap); > + > + argv_array_push(argv, buf.buf); > + strbuf_release(); This concatenates the cmdname (like "get") and its parameters into a single string (e.g. "get 454cb6bd52a4de614a3633e4f547af03d5c3b640") and then places the resulting string as the sole element in argv[] array, which I find somewhat unusual. It at least deserves a comment in front of the function that the callers are responsible to ensure that the result of vaddf(fmt, ap) is properly shell-quoted. Otherwise it is inviting quoting errors in the future (even if there is no such errors in the current code and command set, i.e. a 40-hex object name that "get" subcommand takes happens to require no special shell-quoting consideration). > +int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, > + int fd) > +{ > + struct odb_helper_object *obj; > + struct odb_helper_cmd cmd; > + unsigned long total_got; > + git_zstream stream; > + int zret = Z_STREAM_END; > + git_SHA_CTX hash; > + unsigned char real_sha1[20]; > + > + obj = odb_helper_lookup(o, sha1); > + if (!obj) > + return -1; > + > + if (odb_helper_start(o, , "get %s", sha1_to_hex(sha1)) < 0) > + return -1; > + > + memset(, 0, sizeof(stream)); > + git_inflate_init(); > + git_SHA1_Init(); > + total_got = 0; > + > + for (;;) { > + unsigned char buf[4096]; > + int r; > + > + r = xread(cmd.child.out, buf, sizeof(buf)); > + if (r < 0) { > + error("unable to read from odb helper '%s': %s", > + o->name, strerror(errno)); > + close(cmd.child.out); > + odb_helper_finish(o, ); > + git_inflate_end(); > + return -1; > + } > + if (r == 0) > + break; > + > + write_or_die(fd, buf, r); > + > + stream.next_in = buf; > + stream.avail_in = r; > + do { > + unsigned char inflated[4096]; > + unsigned long got; > + > + stream.next_out = inflated; > + stream.avail_out = sizeof(inflated); > + zret = git_inflate(, Z_SYNC_FLUSH); > + got = sizeof(inflated) - stream.avail_out; > + > + git_SHA1_Update(, inflated, got); > + /* skip header when counting size */ > + if (!total_got) { > + const unsigned char *p = memchr(inflated, '\0', > got); Does this assume that a single xread() that can result in a short-read would not return in the middle of "header", and if so, is that a safe assumption to make? > + if (p) > + got -= p - inflated + 1; > + else > + got = 0; > + } > + total_got += got; > + } while (stream.avail_in && zret == Z_OK); > + } > + > + close(cmd.child.out); > + git_inflate_end(); > + git_SHA1_Final(real_sha1, ); > + if (odb_helper_finish(o, )) > + return -1; > + if (zret != Z_STREAM_END) { > + warning("bad zlib data from odb helper '%s' for %s", > + o->name,
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/30, Brandon Williams wrote: > On 11/29, Jeff King wrote: > > On Tue, Nov 29, 2016 at 01:37:59AM -0500, Jeff King wrote: > > > > > 2. Grep threads doing more complicated stuff that needs to take a > > > lock. You might try building with -fsanitize=thread to see if it > > > turns up anything. > > > > I tried this and it didn't find anything useful. It complains about > > multiple threads calling want_color() at the same time, which you can > > silence with something like: > > > > diff --git a/builtin/grep.c b/builtin/grep.c > > index 2c727ef49..d48846f40 100644 > > --- a/builtin/grep.c > > +++ b/builtin/grep.c > > @@ -207,6 +207,12 @@ static void start_threads(struct grep_opt *opt) > > { > > int i; > > > > + /* > > +* trigger want_color() for its side effect of caching the result; > > +* otherwise the threads will fight over setting the cache > > +*/ > > + want_color(GIT_COLOR_AUTO); > > + > > pthread_mutex_init(_mutex, NULL); > > pthread_mutex_init(_read_mutex, NULL); > > pthread_mutex_init(_attr_mutex, NULL); > > > > But the problem persists even with that patch, so it is something else. > > It may still be a threading problem; -fsanitize=thread isn't perfect. I > > also couldn't get the stress-test to fail when compiled with it. But > > that may simply mean that the timing of the resulting binary is changed > > enough not to trigger the issue. > > > > -Peff > > With you're stress script I'm able to see the failures. The interesting > thing is that the entry missing is always from the non-submodule file. So I couldn't find a race condition in the code. I tracked the problem to grep_source_load_file which attempts to run lstat on the file so that it can read it into a buffer. The lstat call fails with ENOENT (which conveniently is skipped by the if statement which calls error_errno). So for some reason the file cannot be found and read into memory resulting in nothing being grep'ed for that particular file (since the buffer is NULL). Maybe there is something wrong with the way I wrote the tests themselves? So I could try rewriting them. Any other ideas? -- Brandon Williams
[RFC/PATCH v3 08/16] t0400: add test for external odb write support
Signed-off-by: Christian Couder--- t/t0400-external-odb.sh | 8 1 file changed, 8 insertions(+) diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh index 6c6da5cf4f..3c868cad4c 100755 --- a/t/t0400-external-odb.sh +++ b/t/t0400-external-odb.sh @@ -66,4 +66,12 @@ test_expect_success 'helper can add objects to alt repo' ' test "$size" -eq "$alt_size" ' +test_expect_success 'commit adds objects to alt repo' ' + test_config odb.magic.command "$HELPER" && + test_commit three && + hash3=$(git ls-tree HEAD | grep three.t | cut -f1 | cut -d\ -f3) && + content=$(cd alt-repo && git show "$hash3") && + test "$content" = "three" +' + test_done -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 04/16] t0400: add 'put' command to odb-helper script
Signed-off-by: Christian Couder--- t/t0400-external-odb.sh | 14 ++ 1 file changed, 14 insertions(+) diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh index fe85413725..0f1bb97f38 100755 --- a/t/t0400-external-odb.sh +++ b/t/t0400-external-odb.sh @@ -7,6 +7,10 @@ test_description='basic tests for external object databases' ALT_SOURCE="$PWD/alt-repo/.git" export ALT_SOURCE write_script odb-helper <<\EOF +die() { + printf >&2 "%s\n" "$@" + exit 1 +} GIT_DIR=$ALT_SOURCE; export GIT_DIR case "$1" in have) @@ -16,6 +20,16 @@ have) get) cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#') ;; +put) + sha1="$2" + size="$3" + kind="$4" + writen=$(git hash-object -w -t "$kind" --stdin) + test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'" + ;; +*) + die "unknown command '$1'" + ;; esac EOF HELPER="\"$PWD\"/odb-helper" -- 2.11.0.rc2.37.geb49ca6
Re: [PATCH] difftool.c: mark a file-local symbol with static
On Wed, Nov 30, 2016 at 10:37:40PM +, Ramsay Jones wrote: > > Anyway. Those are all options, but I don't think there is any problem > > with sticking with warning("") for now. It is not the first part of the > > code that tickles the format-zero-length warning. > > Hmm, well I have been building git for some time with the warning > enabled without problem. > > [I can see a few examples of 'status_printf_ln(s, c, "%s", "");' in > git-grep output, so ...] > > I am not suggesting any change here, and was just curious why it > seemed to be unnecessary now (I knew it was necessary at one time). I forgot, we ended up reversing course later and silencing them: http://public-inbox.org/git/20140505052117.gc6...@sigill.intra.peff.net/ By the rationale of that conversation, we should be doing: warning("%s", ""); here. -Peff
Re: "git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"
Junio C Hamanoforgot to Cc: the author of the most relevant change to the issue, d426430e6e ("pathspec: warn on empty strings as pathspec", 2016-06-22). > Kevin Daudt writes: > >> On Wed, Nov 30, 2016 at 12:31:49PM -0800, Peter Urda wrote: >>> After upgrading to version 2.11.0 I am getting a warning about empty >>> strings as pathspecs while using 'patch' >>> >>> - Ran 'git add -p .' from the root of my git repository. >>> >>> - I was able to normally stage my changes, but was presented with a >>> "warning: empty strings as pathspecs will be made invalid in upcoming >>> releases. please use . instead if you meant to match all paths" >>> message. >>> >>> - I expected no warning message since I included a "." with my original >>> command. >>> >>> I believe that I should not be seeing this warning message as I >>> included the requested "." pathspec. > > Yes, this seems to be caused by pathspec.c::prefix_pathspec() > overwriting the original pathspec "." into "". The callchain > looks like this: > > builtin/add.c::interactive_add() > -> parse_pathspec() > passes argv[] that has "." to the caller, > receives pathspec whose pathspec->items[].original > is supposed to point at the unmolested original, > but prefix_pathspec() munges "." into "" > -> run_add_interactive() > which runs "git add--interactive" with > pathspec->items[].original as pathspecs > > > Perhaps this would work it around, but there should be a better way > to fix it (like, making sure that what we call "original" indeed > stays "original"). > > builtin/add.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/builtin/add.c b/builtin/add.c > index e8fb80b36e..137097192d 100644 > --- a/builtin/add.c > +++ b/builtin/add.c > @@ -167,9 +167,18 @@ int run_add_interactive(const char *revision, const char > *patch_mode, > if (revision) > argv_array_push(, revision); > argv_array_push(, "--"); > - for (i = 0; i < pathspec->nr; i++) > + for (i = 0; i < pathspec->nr; i++) { > /* pass original pathspec, to be re-parsed */ > + if (!*pathspec->items[i].original) { > + /* > + * work around a misfeature in parse_pathspecs() > + * that munges "." into "". > + */ > + argv_array_push(, "."); > + continue; > + } > argv_array_push(, pathspec->items[i].original); > + } > > status = run_command_v_opt(argv.argv, RUN_GIT_CMD); > argv_array_clear(); > @@ -180,7 +189,7 @@ int interactive_add(int argc, const char **argv, const > char *prefix, int patch) > { > struct pathspec pathspec; > > - parse_pathspec(, 0, > + parse_pathspec(, 0, > PATHSPEC_PREFER_FULL | > PATHSPEC_SYMLINK_LEADING_PATH | > PATHSPEC_PREFIX_ORIGIN,
Re: [PATCH] difftool.c: mark a file-local symbol with static
On 30/11/16 21:25, Jeff King wrote: > On Wed, Nov 30, 2016 at 12:40:25PM -0800, Junio C Hamano wrote: > >> Ramsay Joneswrites: >> >>> [I have fixed my config.mak file now, so I don't see the warning >>> anymore! Having -Wno-format-zero-length in DEVELOPER_CFLAGS, or >>> not, is a separate matter.] >> >> I suspect that 658df95a4a ("add DEVELOPER makefile knob to check for >> acknowledged warnings", 2016-02-25) took it from me (namely, Make >> script in my 'todo' branch). In turn, I added it to my set of flags >> in order to squelch this exact warning, so... > > For anybody interested in the history, we started using this when > status_printf() got the format attribute. Relevant patch and discussion: > > > http://public-inbox.org/git/20130710002328.gc19...@sigill.intra.peff.net/T/#u Ah, thank you! I've been trying to remember this for the last hour or so ... (I misremembered something about gettext, and went looking in the wrong direction ;-) ). > We went with disabling the warning because it really is wrong. It makes > an assumption that calling a format function with an empty string > doesn't do anything, but that's only true of the stock printf functions. > Our custom functions _do_ have a side effect in this case. Yes, I agree, gcc is making an unwarranted assumption. > The other options are to have a special function for "print a warning > (or status) line with no content". Or to teach those functions to handle > newlines specially. We've often discussed that you should be able to do: > > warning("foo\nbar"); > > and have it print: > > warning: foo > warning: bar > > That's useful in itself, and would probably make cases like this easier > to handle, too. But it's a pretty big change. Another option would be to > just teach formatting functions to handle a single "\n" as a synonym for > the empty string (or even detect trailing newlines and avoid appending > our own in that case). That would mean you could do: > > warning("\n"); > > to print a blank line. That's arguably more obvious about the intent to > a reader (I say arguably because the new behavior _is_ subtle if you > happen to know that warning() usually appends a newline). Yes, I remember the discussion now and agree that this could cause problems. > Anyway. Those are all options, but I don't think there is any problem > with sticking with warning("") for now. It is not the first part of the > code that tickles the format-zero-length warning. Hmm, well I have been building git for some time with the warning enabled without problem. [I can see a few examples of 'status_printf_ln(s, c, "%s", "");' in git-grep output, so ...] I am not suggesting any change here, and was just curious why it seemed to be unnecessary now (I knew it was necessary at one time). ATB, Ramsay Jones
Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
Christian Couderwrites: > For now there should be one odb ref per blob. Each ref name should be > refs/odbs// where is the sha1 of the blob stored > in the external odb named . > > These odb refs should all point to a blob that should be stored in the > Git repository and contain information about the blob stored in the > external odb. This information can be specific to the external odb. > The repos can then share this information using commands like: > > `git fetch origin "refs/odbs//*:refs/odbs//*"` Unless this is designed to serve only a handful of blobs, I cannot see how this design would scale successfully. I notice you wrote "For now" at the beginning, but what is the envisioned way this will evolve in the future?
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
Stefan Bellerwrites: > On Wed, Nov 30, 2016 at 1:39 PM, Junio C Hamano wrote: >> Stefan Beller writes: >> >>> git relocate-git-dir (--into-workingtree|--into-gitdir) \ >> >> I am not sure if you meant this as a submodule-specific subcommand >> or more general helper. "into-workingtree" suggests to me that it >> is submodule specific, so I'll base my response on that assumption. >> >> Would there ever be a situation where you already have submodule >> repositories in the right place (according to the more modern >> practice, to keep them in .git/modules/ of superproject) and want to >> move them to embed them in worktrees of submodules? I do not think >> of any. > > "Hi, I made a mistake by using submodules. I don't want to use > them any more, I rather want to: > A) make it a separate git repo again and I'll keep them in sync myself > B) ... " OK, I can buy that. Thanks.
[RFC/PATCH v3 12/16] lib-httpd: add upload.sh
This cgi will be used to upload objects to, or to delete objects from, an apache web server. This way the apache server can work as an external object database. Signed-off-by: Christian Couder--- t/lib-httpd.sh| 1 + t/lib-httpd/upload.sh | 45 + 2 files changed, 46 insertions(+) create mode 100644 t/lib-httpd/upload.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 2e659a8ee2..d80b004549 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -132,6 +132,7 @@ prepare_httpd() { cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH" install_script broken-smart-http.sh install_script error.sh + install_script upload.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh new file mode 100644 index 00..172be0f73f --- /dev/null +++ b/t/lib-httpd/upload.sh @@ -0,0 +1,45 @@ +#!/bin/sh + +# In part from http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file + +FILES_DIR="www/files" + +OLDIFS="$IFS" +IFS='&' +set -- $QUERY_STRING +IFS="$OLDIFS" + +while test $# -gt 0 +do +key=${1%=*} +val=${1#*=} + +case "$key" in + "sha1") sha1="$val" ;; + "type") type="$val" ;; + "size") size="$val" ;; + "delete") delete=1 ;; + *) echo >&2 "unknown key '$key'" ;; +esac + +shift +done + +case "$REQUEST_METHOD" in + POST) +if test "$delete" = "1" +then + rm -f "$FILES_DIR/$sha1-$size-$type" +else + mkdir -p "$FILES_DIR" + cat >"$FILES_DIR/$sha1-$size-$type" +fi + +echo 'Status: 204 No Content' +echo +;; + + *) +echo 'Status: 405 Method Not Allowed' +echo +esac -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 02/16] external odb foreach
From: Jeff King--- external-odb.c | 14 ++ external-odb.h | 6 ++ odb-helper.c | 15 +++ odb-helper.h | 4 4 files changed, 39 insertions(+) diff --git a/external-odb.c b/external-odb.c index 1ccfa99a01..42978a3298 100644 --- a/external-odb.c +++ b/external-odb.c @@ -113,3 +113,17 @@ int external_odb_fetch_object(const unsigned char *sha1) return -1; } + +int external_odb_for_each_object(each_external_object_fn fn, void *data) +{ + struct odb_helper *o; + + external_odb_init(); + + for (o = helpers; o; o = o->next) { + int r = odb_helper_for_each_object(o, fn, data); + if (r) + return r; + } + return 0; +} diff --git a/external-odb.h b/external-odb.h index 2397477684..cea8570a49 100644 --- a/external-odb.h +++ b/external-odb.h @@ -5,4 +5,10 @@ const char *external_odb_root(void); int external_odb_has_object(const unsigned char *sha1); int external_odb_fetch_object(const unsigned char *sha1); +typedef int (*each_external_object_fn)(const unsigned char *sha1, + enum object_type type, + unsigned long size, + void *data); +int external_odb_for_each_object(each_external_object_fn, void *); + #endif /* EXTERNAL_ODB_H */ diff --git a/odb-helper.c b/odb-helper.c index 244bc86792..2db59caa53 100644 --- a/odb-helper.c +++ b/odb-helper.c @@ -237,3 +237,18 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, return 0; } + +int odb_helper_for_each_object(struct odb_helper *o, + each_external_object_fn fn, + void *data) +{ + int i; + for (i = 0; i < o->have_nr; i++) { + struct odb_helper_object *obj = >have[i]; + int r = fn(obj->sha1, obj->type, obj->size, data); + if (r) + return r; + } + + return 0; +} diff --git a/odb-helper.h b/odb-helper.h index 0f704f9452..8c3916d215 100644 --- a/odb-helper.h +++ b/odb-helper.h @@ -1,6 +1,8 @@ #ifndef ODB_HELPER_H #define ODB_HELPER_H +#include "external-odb.h" + struct odb_helper { const char *name; const char *cmd; @@ -21,5 +23,7 @@ struct odb_helper *odb_helper_new(const char *name, int namelen); int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1); int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, int fd); +int odb_helper_for_each_object(struct odb_helper *o, + each_external_object_fn, void *); #endif /* ODB_HELPER_H */ -- 2.11.0.rc2.37.geb49ca6
Re: "git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"
Kevin Daudtwrites: > On Wed, Nov 30, 2016 at 12:31:49PM -0800, Peter Urda wrote: >> After upgrading to version 2.11.0 I am getting a warning about empty >> strings as pathspecs while using 'patch' >> >> - Ran 'git add -p .' from the root of my git repository. >> >> - I was able to normally stage my changes, but was presented with a >> "warning: empty strings as pathspecs will be made invalid in upcoming >> releases. please use . instead if you meant to match all paths" >> message. >> >> - I expected no warning message since I included a "." with my original >> command. >> >> I believe that I should not be seeing this warning message as I >> included the requested "." pathspec. Yes, this seems to be caused by pathspec.c::prefix_pathspec() overwriting the original pathspec "." into "". The callchain looks like this: builtin/add.c::interactive_add() -> parse_pathspec() passes argv[] that has "." to the caller, receives pathspec whose pathspec->items[].original is supposed to point at the unmolested original, but prefix_pathspec() munges "." into "" -> run_add_interactive() which runs "git add--interactive" with pathspec->items[].original as pathspecs Perhaps this would work it around, but there should be a better way to fix it (like, making sure that what we call "original" indeed stays "original"). builtin/add.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index e8fb80b36e..137097192d 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -167,9 +167,18 @@ int run_add_interactive(const char *revision, const char *patch_mode, if (revision) argv_array_push(, revision); argv_array_push(, "--"); - for (i = 0; i < pathspec->nr; i++) + for (i = 0; i < pathspec->nr; i++) { /* pass original pathspec, to be re-parsed */ + if (!*pathspec->items[i].original) { + /* +* work around a misfeature in parse_pathspecs() +* that munges "." into "". +*/ + argv_array_push(, "."); + continue; + } argv_array_push(, pathspec->items[i].original); + } status = run_command_v_opt(argv.argv, RUN_GIT_CMD); argv_array_clear(); @@ -180,7 +189,7 @@ int interactive_add(int argc, const char **argv, const char *prefix, int patch) { struct pathspec pathspec; - parse_pathspec(, 0, + parse_pathspec(, 0, PATHSPEC_PREFER_FULL | PATHSPEC_SYMLINK_LEADING_PATH | PATHSPEC_PREFIX_ORIGIN,
[RFC/PATCH v3 07/16] external-odb: accept only blobs for now
Signed-off-by: Christian Couder--- external-odb.c | 4 1 file changed, 4 insertions(+) diff --git a/external-odb.c b/external-odb.c index bb70fe3298..6dd7b2548b 100644 --- a/external-odb.c +++ b/external-odb.c @@ -133,6 +133,10 @@ int external_odb_write_object(const void *buf, unsigned long len, { struct odb_helper *o; + /* For now accept only blobs */ + if (strcmp(type, "blob")) + return 1; + external_odb_init(); for (o = helpers; o; o = o->next) { -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 11/16] lib-httpd: pass config file to start_httpd()
This makes it possible to start an apache web server with different config files. This will be used in a later patch to pass a config file that makes apache store external objects. Signed-off-by: Christian Couder--- t/lib-httpd.sh | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 435a37465a..2e659a8ee2 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -171,12 +171,14 @@ prepare_httpd() { } start_httpd() { + APACHE_CONF_FILE=${1-apache.conf} + prepare_httpd >&3 2>&4 trap 'code=$?; stop_httpd; (exit $code); die' EXIT "$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \ - -f "$TEST_PATH/apache.conf" $HTTPD_PARA \ + -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA \ -c "Listen 127.0.0.1:$LIB_HTTPD_PORT" -k start \ >&3 2>&4 if test $? -ne 0 @@ -191,7 +193,7 @@ stop_httpd() { trap 'die' EXIT "$LIB_HTTPD_PATH" -d "$HTTPD_ROOT_PATH" \ - -f "$TEST_PATH/apache.conf" $HTTPD_PARA -k stop + -f "$TEST_PATH/$APACHE_CONF_FILE" $HTTPD_PARA -k stop } test_http_push_nonff () { -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 03/16] t0400: use --batch-all-objects to get all objects
Signed-off-by: Christian Couder--- t/t0400-external-odb.sh | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh index 2b016173a0..fe85413725 100755 --- a/t/t0400-external-odb.sh +++ b/t/t0400-external-odb.sh @@ -10,9 +10,7 @@ write_script odb-helper <<\EOF GIT_DIR=$ALT_SOURCE; export GIT_DIR case "$1" in have) - git rev-list --all --objects | - cut -d' ' -f1 | - git cat-file --batch-check | + git cat-file --batch-check --batch-all-objects | awk '{print $1 " " $3 " " $2}' ;; get) -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 13/16] lib-httpd: add list.sh
This cgi script can list Git objects that have been uploaded as files to an apache web server. This script can also retrieve the content of each of these files. This will help make apache work as an external object database. Signed-off-by: Christian Couder--- t/lib-httpd.sh | 1 + t/lib-httpd/list.sh | 34 ++ 2 files changed, 35 insertions(+) create mode 100644 t/lib-httpd/list.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index d80b004549..f31ea261f5 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -133,6 +133,7 @@ prepare_httpd() { install_script broken-smart-http.sh install_script error.sh install_script upload.sh + install_script list.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh new file mode 100644 index 00..a54402558f --- /dev/null +++ b/t/lib-httpd/list.sh @@ -0,0 +1,34 @@ +#!/bin/sh + +FILES_DIR="www/files" + +OLDIFS="$IFS" +IFS='&' +set -- $QUERY_STRING +IFS="$OLDIFS" + +while test $# -gt 0 +do +key=${1%=*} +val=${1#*=} + +case "$key" in + "sha1") sha1="$val" ;; + *) echo >&2 "unknown key '$key'" ;; +esac + +shift +done + +echo 'Status: 200 OK' +echo + +if test -d "$FILES_DIR" +then +if test -n "$sha1" +then + cat "$FILES_DIR/$sha1"-* +else + ls "$FILES_DIR" | tr '-' ' ' +fi +fi -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 10/16] Add t0410 to test external ODB transfer
Signed-off-by: Christian Couder--- t/t0410-transfer-e-odb.sh | 136 ++ 1 file changed, 136 insertions(+) create mode 100755 t/t0410-transfer-e-odb.sh diff --git a/t/t0410-transfer-e-odb.sh b/t/t0410-transfer-e-odb.sh new file mode 100755 index 00..868b55db94 --- /dev/null +++ b/t/t0410-transfer-e-odb.sh @@ -0,0 +1,136 @@ +#!/bin/sh + +test_description='basic tests for transfering external ODBs' + +. ./test-lib.sh + +ORIG_SOURCE="$PWD/.git" +export ORIG_SOURCE + +ALT_SOURCE1="$PWD/alt-repo1/.git" +export ALT_SOURCE1 +write_script odb-helper1 <<\EOF +die() { + printf >&2 "%s\n" "$@" + exit 1 +} +GIT_DIR=$ALT_SOURCE1; export GIT_DIR +case "$1" in +have) + git cat-file --batch-check --batch-all-objects | + awk '{print $1 " " $3 " " $2}' + ;; +get) + cat "$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#') + ;; +put) + sha1="$2" + size="$3" + kind="$4" + writen=$(git hash-object -w -t "$kind" --stdin) + test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'" + ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$ORIG_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit + GIT_DIR=$ORIG_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash" + ;; +*) + die "unknown command '$1'" + ;; +esac +EOF +HELPER1="\"$PWD\"/odb-helper1" + +OTHER_SOURCE="$PWD/.git" +export OTHER_SOURCE + +ALT_SOURCE2="$PWD/alt-repo2/.git" +export ALT_SOURCE2 +write_script odb-helper2 <<\EOF +die() { + printf >&2 "%s\n" "$@" + exit 1 +} +GIT_DIR=$ALT_SOURCE2; export GIT_DIR +case "$1" in +have) + GIT_DIR=$OTHER_SOURCE git for-each-ref --format='%(objectname)' refs/odbs/magic/ | GIT_DIR=$OTHER_SOURCE xargs git show + ;; +get) + OBJ_FILE="$GIT_DIR"/objects/$(echo $2 | sed 's#..#&/#') + if ! test -f "$OBJ_FILE" + then + # "Download" the missing object by copying it from alt-repo1 + OBJ_DIR=$(echo $2 | sed 's/\(..\).*/\1/') + OBJ_BASE=$(basename "$OBJ_FILE") + ALT_OBJ_DIR1="$ALT_SOURCE1/objects/$OBJ_DIR" + ALT_OBJ_DIR2="$ALT_SOURCE2/objects/$OBJ_DIR" + mkdir -p "$ALT_OBJ_DIR2" || die "Could not mkdir '$ALT_OBJ_DIR2'" + OBJ_SRC="$ALT_OBJ_DIR1/$OBJ_BASE" + cp "$OBJ_SRC" "$ALT_OBJ_DIR2" || + die "Could not cp '$OBJ_SRC' into '$ALT_OBJ_DIR2'" + fi + cat "$OBJ_FILE" || die "Could not cat '$OBJ_FILE'" + ;; +put) + sha1="$2" + size="$3" + kind="$4" + writen=$(git hash-object -w -t "$kind" --stdin) + test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'" + ref_hash=$(echo "$sha1 $size $kind" | GIT_DIR=$OTHER_SOURCE GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit + GIT_DIR=$OTHER_SOURCE git update-ref refs/odbs/magic/"$sha1" "$ref_hash" + ;; +*) + die "unknown command '$1'" + ;; +esac +EOF +HELPER2="\"$PWD\"/odb-helper2" + +test_expect_success 'setup first alternate repo' ' + git init alt-repo1 && + test_commit zero && + git config odb.magic.command "$HELPER1" +' + +test_expect_success 'setup other repo and its alternate repo' ' + git init other-repo && + git init alt-repo2 && + (cd other-repo && +git remote add origin .. && +git pull origin master && +git checkout master && +git log) +' + +test_expect_success 'new blobs are put in first object store' ' + test_commit one && + hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\ -f3) && + content=$(cd alt-repo1 && git show "$hash1") && + test "$content" = "one" && + test_commit two && + hash2=$(git ls-tree HEAD | grep two.t | cut -f1 | cut -d\ -f3) && + content=$(cd alt-repo1 && git show "$hash2") && + test "$content" = "two" +' + +test_expect_success 'other repo gets the blobs from object store' ' + (cd other-repo && +git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" && +test_must_fail git cat-file blob "$hash1" && +test_must_fail git cat-file blob "$hash2" && +git config odb.magic.command "$HELPER2" && +git cat-file blob "$hash1" && +git cat-file blob "$hash2" + ) +' + +test_expect_success 'other repo gets everything else' ' + (cd other-repo && +git fetch origin && +content=$(git show "$hash1") && +test "$content" = "one" && +content=$(git show "$hash2") && +test "$content" = "two") +' + +test_done -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 09/16] Add GIT_NO_EXTERNAL_ODB env variable
Signed-off-by: Christian Couder--- cache.h| 9 + environment.c | 4 external-odb.c | 6 ++ sha1_file.c| 3 +++ 4 files changed, 22 insertions(+) diff --git a/cache.h b/cache.h index b419b9b7ce..503b618a1f 100644 --- a/cache.h +++ b/cache.h @@ -422,6 +422,7 @@ static inline enum object_type object_type(unsigned int mode) #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES" #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS" #define GIT_REPLACE_REF_BASE_ENVIRONMENT "GIT_REPLACE_REF_BASE" +#define NO_EXTERNAL_ODB_ENVIRONMENT "GIT_NO_EXTERNAL_ODB" #define GITATTRIBUTES_FILE ".gitattributes" #define INFOATTRIBUTES_FILE "info/attributes" #define ATTRIBUTE_MACRO_PREFIX "[attr]" @@ -698,6 +699,14 @@ void reset_shared_repository(void); extern int check_replace_refs; extern char *git_replace_ref_base; +/* + * Do external odbs need to be used this run? This variable is + * initialized to true unless $GIT_NO_EXTERNAL_ODB is set, but it + * maybe set to false by some commands that do not want external + * odbs to be active. + */ +extern int use_external_odb; + extern int fsync_object_files; extern int core_preload_index; extern int core_apply_sparse_checkout; diff --git a/environment.c b/environment.c index 0935ec696e..8aecdd0544 100644 --- a/environment.c +++ b/environment.c @@ -47,6 +47,7 @@ const char *excludes_file; enum auto_crlf auto_crlf = AUTO_CRLF_FALSE; int check_replace_refs = 1; char *git_replace_ref_base; +int use_external_odb = 1; enum eol core_eol = EOL_UNSET; enum safe_crlf safe_crlf = SAFE_CRLF_WARN; unsigned whitespace_rule_cfg = WS_DEFAULT_RULE; @@ -120,6 +121,7 @@ const char * const local_repo_env[] = { INDEX_ENVIRONMENT, NO_REPLACE_OBJECTS_ENVIRONMENT, GIT_REPLACE_REF_BASE_ENVIRONMENT, + NO_EXTERNAL_ODB_ENVIRONMENT, GIT_PREFIX_ENVIRONMENT, GIT_SUPER_PREFIX_ENVIRONMENT, GIT_SHALLOW_FILE_ENVIRONMENT, @@ -185,6 +187,8 @@ static void setup_git_env(void) replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT); git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base : "refs/replace/"); + if (getenv(NO_EXTERNAL_ODB_ENVIRONMENT)) + use_external_odb = 0; namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT)); namespace_len = strlen(namespace); shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT); diff --git a/external-odb.c b/external-odb.c index 6dd7b2548b..a980fbfbf2 100644 --- a/external-odb.c +++ b/external-odb.c @@ -63,6 +63,9 @@ int external_odb_has_object(const unsigned char *sha1) { struct odb_helper *o; + if (!use_external_odb) + return 0; + external_odb_init(); for (o = helpers; o; o = o->next) @@ -133,6 +136,9 @@ int external_odb_write_object(const void *buf, unsigned long len, { struct odb_helper *o; + if (!use_external_odb) + return 1; + /* For now accept only blobs */ if (strcmp(type, "blob")) return 1; diff --git a/sha1_file.c b/sha1_file.c index 3532c1c598..92f1244205 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -556,6 +556,9 @@ void prepare_external_alt_odb(void) static int linked_external; const char *path; + if (!use_external_odb) + return; + if (linked_external) return; -- 2.11.0.rc2.37.geb49ca6
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
On Wed, Nov 30, 2016 at 1:39 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> git relocate-git-dir (--into-workingtree|--into-gitdir) \ > > I am not sure if you meant this as a submodule-specific subcommand > or more general helper. "into-workingtree" suggests to me that it > is submodule specific, so I'll base my response on that assumption. > > Would there ever be a situation where you already have submodule > repositories in the right place (according to the more modern > practice, to keep them in .git/modules/ of superproject) and want to > move them to embed them in worktrees of submodules? I do not think > of any. "Hi, I made a mistake by using submodules. I don't want to use them any more, I rather want to: A) make it a separate git repo again and I'll keep them in sync myself B) ... " "I abuse submodules for what git-LFS was designed for, and the submodule is on a different mount point, please keep the git directory also at that mount point". Not sure I agree these problems and the proposed solutions are beautiful, but that is what people may think of as a fast hack? > > If there is no such situation, I do not think we want a verb that is > direction-neutral (e.g. "move" or "relocate") with two options. > Rather we would want "git submodule unembed-git-dir" or something > like that. So when we want to have a generic function in C ("relocate_gitdir") for both worktree and submodules, the recursive flag is not supposed to invoke a submodule specific helper, but a generic helper. Alternatively we make the function not as generic and claim the recursive part is submodule specific and we can happily call "git submodule [un]embed-git-dir" recursively.
CVSImport - spaces in CVS path
I'm in the process of moving an entire collection of cvs modules into git. I'm working in Mac Yosemite. Everything is working fine except for one thing. A couple of the CVS modules have spaces in the paths. Below is what my command line looks like. When the path has spaces I've tried putting it in single and double quotes and using escape characters. None of that matters. I always get a message that it can't read dir/dir/path and then messages that it can't find the modules "with" or "spaces". git cvsimport -v -d :pserver:MYLOGIN:/usr/local/cvsroot/ dir/dir/PathWithNoSpaces/dir git cvsimport -v -d :pserver:MYLOGIN:/usr/local/cvsroot/ dir/dir/path with spaces/dir -- View this message in context: http://git.661346.n2.nabble.com/CVSImport-spaces-in-CVS-path-tp7657459.html Sent from the git mailing list archive at Nabble.com.
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
Stefan Bellerwrites: > git relocate-git-dir (--into-workingtree|--into-gitdir) \ I am not sure if you meant this as a submodule-specific subcommand or more general helper. "into-workingtree" suggests to me that it is submodule specific, so I'll base my response on that assumption. Would there ever be a situation where you already have submodule repositories in the right place (according to the more modern practice, to keep them in .git/modules/ of superproject) and want to move them to embed them in worktrees of submodules? I do not think of any. If there is no such situation, I do not think we want a verb that is direction-neutral (e.g. "move" or "relocate") with two options. Rather we would want "git submodule unembed-git-dir" or something like that.
Re: [PATCH v2] tag, branch, for-each-ref: add --ignore-case for sorting and filtering
Nguyễn Thái Ngọc Duywrites: > This option makes sorting ignore case, which is great when you have > branches named bug-12-do-something, Bug-12-do-some-more and > BUG-12-do-what and want to group them together. Sorting externally may > not be an option because we lose coloring and column layout from > git-branch and git-tag. > > The same could be said for filtering, but it's probably less important > because you can always go with the ugly pattern [bB][uU][gG]-* if you're > desperate. But of course --ignore-case is of course much easier. > You can't have case-sensitive filtering and case-insensitive sorting (or > the other way around) with this though. But who would want that? I do not feel uncomfortable declaring that the answer to that question is "nobody" ;-) > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > v2 has a different approach, and I think it's a better one even with > that unanswered question above. Yeah, I think this would be easier to use. > @@ -512,15 +512,6 @@ static void print_ref_list(struct ref_filter *filter, > struct ref_sorting *sortin > if (filter->verbose) > maxwidth = calc_maxwidth(, strlen(remote_prefix)); > > - /* > - * If no sorting parameter is given then we default to sorting > - * by 'refname'. This would give us an alphabetically sorted > - * array with the 'HEAD' ref at the beginning followed by > - * local branches 'refs/heads/...' and finally remote-tacking > - * branches 'refs/remotes/...'. > - */ > - if (!sorting) > - sorting = ref_default_sorting(); So it is now a BUG() to give sorting==NULL to this function, which is OK and I do not think we even need an assert() for it (i.e. the code with the patch looks good). > @@ -744,6 +739,16 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > if ((filter.kind & FILTER_REFS_BRANCHES) && filter.detached) > filter.kind |= FILTER_REFS_DETACHED_HEAD; > filter.name_patterns = argv; > + /* > + * If no sorting parameter is given then we default to sorting > + * by 'refname'. This would give us an alphabetically sorted > + * array with the 'HEAD' ref at the beginning followed by > + * local branches 'refs/heads/...' and finally remote-tacking > + * branches 'refs/remotes/...'. > + */ > + if (!sorting) > + sorting = ref_default_sorting(); > + sorting->ignore_case = icase; > print_ref_list(, sorting); > print_columns(, colopts, NULL); > string_list_clear(, 0); ... and the fallback is moved to the caller, which makes sense. > diff --git a/ref-filter.c b/ref-filter.c > index f5f7a70..bd98010 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -1231,8 +1231,14 @@ static int commit_contains(struct ref_filter *filter, > struct commit *commit) > * matches a pattern "refs/heads/mas") or a wildcard (e.g. the same ref > * matches "refs/heads/mas*", too). > */ > -static int match_pattern(const char **patterns, const char *refname) > +static int match_pattern(const struct ref_filter *filter, const char > *refname) > { > + const char **patterns = filter->name_patterns; > + unsigned flags = 0; > + > + if (filter->ignore_case) > + flags |= WM_CASEFOLD; > + Ahh, OK. My reading stuttered when seeing "sorting and filtering" in the option description for "git tag", but this makes perfect sense. Good job. > @@ -1255,9 +1261,15 @@ static int match_pattern(const char **patterns, const > char *refname) > * matches a pattern "refs/heads/" but not "refs/heads/m") or a > * wildcard (e.g. the same ref matches "refs/heads/m*", too). > */ > -static int match_name_as_path(const char **pattern, const char *refname) > +static int match_name_as_path(const struct ref_filter *filter, const char > *refname) > { > + const char **pattern = filter->name_patterns; > int namelen = strlen(refname); > + unsigned flags = WM_PATHNAME; > + > + if (filter->ignore_case) > + flags |= WM_CASEFOLD; > + Likewise. Very simple and nicely done. > @@ -1536,18 +1548,20 @@ static int cmp_ref_sorting(struct ref_sorting *s, > struct ref_array_item *a, stru > struct atom_value *va, *vb; > int cmp; > cmp_type cmp_type = used_atom[s->atom].type; > + int (*cmp_fn)(const char *, const char *); > > get_ref_atom_value(a, s->atom, ); > get_ref_atom_value(b, s->atom, ); > + cmp_fn = s->ignore_case ? strcasecmp : strcmp; > if (s->version) > cmp = versioncmp(va->s, vb->s); > else if (cmp_type == FIELD_STR) > - cmp = strcmp(va->s, vb->s); > + cmp = cmp_fn(va->s, vb->s); > else { > if (va->ul < vb->ul) > cmp = -1; > else
Re: "git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"
On Wed, Nov 30, 2016 at 12:31:49PM -0800, Peter Urda wrote: > After upgrading to version 2.11.0 I am getting a warning about empty > strings as pathspecs while using 'patch' > > - Ran 'git add -p .' from the root of my git repository. > > - I was able to normally stage my changes, but was presented with a > "warning: empty strings as pathspecs will be made invalid in upcoming > releases. please use . instead if you meant to match all paths" > message. > > - I expected no warning message since I included a "." with my original > command. > > I believe that I should not be seeing this warning message as I > included the requested "." pathspec. > > ~ Peter Urda > > http://urda.cc I can reproduce this. Note that it only happens when you specify '-p'. Without the --patch option, the warning does not appear.
Re: [PATCH] difftool.c: mark a file-local symbol with static
On Wed, Nov 30, 2016 at 12:40:25PM -0800, Junio C Hamano wrote: > Ramsay Joneswrites: > > > [I have fixed my config.mak file now, so I don't see the warning > > anymore! Having -Wno-format-zero-length in DEVELOPER_CFLAGS, or > > not, is a separate matter.] > > I suspect that 658df95a4a ("add DEVELOPER makefile knob to check for > acknowledged warnings", 2016-02-25) took it from me (namely, Make > script in my 'todo' branch). In turn, I added it to my set of flags > in order to squelch this exact warning, so... For anybody interested in the history, we started using this when status_printf() got the format attribute. Relevant patch and discussion: http://public-inbox.org/git/20130710002328.gc19...@sigill.intra.peff.net/T/#u We went with disabling the warning because it really is wrong. It makes an assumption that calling a format function with an empty string doesn't do anything, but that's only true of the stock printf functions. Our custom functions _do_ have a side effect in this case. The other options are to have a special function for "print a warning (or status) line with no content". Or to teach those functions to handle newlines specially. We've often discussed that you should be able to do: warning("foo\nbar"); and have it print: warning: foo warning: bar That's useful in itself, and would probably make cases like this easier to handle, too. But it's a pretty big change. Another option would be to just teach formatting functions to handle a single "\n" as a synonym for the empty string (or even detect trailing newlines and avoid appending our own in that case). That would mean you could do: warning("\n"); to print a blank line. That's arguably more obvious about the intent to a reader (I say arguably because the new behavior _is_ subtle if you happen to know that warning() usually appends a newline). Anyway. Those are all options, but I don't think there is any problem with sticking with warning("") for now. It is not the first part of the code that tickles the format-zero-length warning. -Peff
[RFC/PATCH v3 15/16] odb-helper: add 'store_plain_objects' to 'struct odb_helper'
This adds a configuration option odb..plainObjects and the corresponding boolean variable called 'store_plain_objects' in 'struct odb_helper' to make it possible for external object databases to store object as plain objects instead of Git objects. The existing odb_helper_fetch_object() is renamed odb_helper_fetch_git_object() and a new odb_helper_fetch_plain_object() is introduce to deal with external objects that are not in Git format. Signed-off-by: Christian Couder--- external-odb.c | 2 + odb-helper.c | 113 - odb-helper.h | 1 + 3 files changed, 114 insertions(+), 2 deletions(-) diff --git a/external-odb.c b/external-odb.c index a980fbfbf2..af55377281 100644 --- a/external-odb.c +++ b/external-odb.c @@ -36,6 +36,8 @@ static int external_odb_config(const char *var, const char *value, void *data) if (!strcmp(key, "command")) return git_config_string(>cmd, var, value); + if (!strcmp(key, "plainobjects")) + o->store_plain_objects = git_config_bool(var, value); return 0; } diff --git a/odb-helper.c b/odb-helper.c index 7b7de7380f..6b9fb7927a 100644 --- a/odb-helper.c +++ b/odb-helper.c @@ -153,8 +153,107 @@ int odb_helper_has_object(struct odb_helper *o, const unsigned char *sha1) return !!odb_helper_lookup(o, sha1); } -int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, - int fd) +static int odb_helper_fetch_plain_object(struct odb_helper *o, +const unsigned char *sha1, +int fd) +{ + struct odb_helper_object *obj; + struct odb_helper_cmd cmd; + unsigned long total_got = 0; + + char hdr[32]; + int hdrlen; + + int ret = Z_STREAM_END; + unsigned char compressed[4096]; + git_zstream stream; + git_SHA_CTX hash; + unsigned char real_sha1[20]; + + obj = odb_helper_lookup(o, sha1); + if (!obj) + return -1; + + if (odb_helper_start(o, , 0, "get %s", sha1_to_hex(sha1)) < 0) + return -1; + + /* Set it up */ + git_deflate_init(, zlib_compression_level); + stream.next_out = compressed; + stream.avail_out = sizeof(compressed); + git_SHA1_Init(); + + /* First header.. */ + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj->type), obj->size) + 1; + stream.next_in = (unsigned char *)hdr; + stream.avail_in = hdrlen; + while (git_deflate(, 0) == Z_OK) + ; /* nothing */ + git_SHA1_Update(, hdr, hdrlen); + + for (;;) { + unsigned char buf[4096]; + int r; + + r = xread(cmd.child.out, buf, sizeof(buf)); + if (r < 0) { + error("unable to read from odb helper '%s': %s", + o->name, strerror(errno)); + close(cmd.child.out); + odb_helper_finish(o, ); + git_deflate_end(); + return -1; + } + if (r == 0) + break; + + total_got += r; + + /* Then the data itself.. */ + stream.next_in = (void *)buf; + stream.avail_in = r; + do { + unsigned char *in0 = stream.next_in; + ret = git_deflate(, Z_FINISH); + git_SHA1_Update(, in0, stream.next_in - in0); + write_or_die(fd, compressed, stream.next_out - compressed); + stream.next_out = compressed; + stream.avail_out = sizeof(compressed); + } while (ret == Z_OK); + } + + close(cmd.child.out); + if (ret != Z_STREAM_END) { + warning("bad zlib data from odb helper '%s' for %s", + o->name, sha1_to_hex(sha1)); + return -1; + } + ret = git_deflate_end_gently(); + if (ret != Z_OK) { + warning("deflateEnd on object %s from odb helper '%s' failed (%d)", + sha1_to_hex(sha1), o->name, ret); + return -1; + } + git_SHA1_Final(real_sha1, ); + if (hashcmp(sha1, real_sha1)) { + warning("sha1 mismatch from odb helper '%s' for %s (got %s)", + o->name, sha1_to_hex(sha1), sha1_to_hex(real_sha1)); + return -1; + } + if (odb_helper_finish(o, )) + return -1; + if (total_got != obj->size) { + warning("size mismatch from odb helper '%s' for %s (%lu != %lu)", + o->name, sha1_to_hex(sha1), total_got, obj->size); + return -1; + } + + return 0; +} + +static int
[RFC/PATCH v3 01/16] Add initial external odb support
From: Jeff KingSigned-off-by: Christian Couder --- Makefile| 2 + cache.h | 9 ++ external-odb.c | 115 +++ external-odb.h | 8 ++ odb-helper.c| 239 odb-helper.h| 25 + sha1_file.c | 64 ++--- t/t0400-external-odb.sh | 48 ++ 8 files changed, 495 insertions(+), 15 deletions(-) create mode 100644 external-odb.c create mode 100644 external-odb.h create mode 100644 odb-helper.c create mode 100644 odb-helper.h create mode 100755 t/t0400-external-odb.sh diff --git a/Makefile b/Makefile index f53fcc90d7..88e78da886 100644 --- a/Makefile +++ b/Makefile @@ -747,6 +747,7 @@ LIB_OBJS += ewah/ewah_bitmap.o LIB_OBJS += ewah/ewah_io.o LIB_OBJS += ewah/ewah_rlw.o LIB_OBJS += exec_cmd.o +LIB_OBJS += external-odb.o LIB_OBJS += fetch-pack.o LIB_OBJS += fsck.o LIB_OBJS += gettext.o @@ -779,6 +780,7 @@ LIB_OBJS += notes-cache.o LIB_OBJS += notes-merge.o LIB_OBJS += notes-utils.o LIB_OBJS += object.o +LIB_OBJS += odb-helper.o LIB_OBJS += pack-bitmap.o LIB_OBJS += pack-bitmap-write.o LIB_OBJS += pack-check.o diff --git a/cache.h b/cache.h index a50a61a197..b419b9b7ce 100644 --- a/cache.h +++ b/cache.h @@ -885,6 +885,12 @@ const char *git_path_shallow(void); extern const char *sha1_file_name(const unsigned char *sha1); /* + * Like sha1_file_name, but return the filename within a specific alternate + * object directory. Shares the same static buffer with sha1_file_name. + */ +extern const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1); + +/* * Return the name of the (local) packfile with the specified sha1 in * its name. The return value is a pointer to memory that is * overwritten each time this function is called. @@ -1135,6 +1141,8 @@ extern int do_check_packed_object_crc; extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type); +extern int create_object_tmpfile(struct strbuf *tmp, const char *filename); +extern void close_sha1_file(int fd); extern int finalize_object_file(const char *tmpfile, const char *filename); extern int has_sha1_pack(const unsigned char *sha1); @@ -1409,6 +1417,7 @@ extern void read_info_alternates(const char * relative_base, int depth); extern char *compute_alternate_path(const char *path, struct strbuf *err); typedef int alt_odb_fn(struct alternate_object_database *, void *); extern int foreach_alt_odb(alt_odb_fn, void*); +extern void prepare_external_alt_odb(void); /* * Allocate a "struct alternate_object_database" but do _not_ actually diff --git a/external-odb.c b/external-odb.c new file mode 100644 index 00..1ccfa99a01 --- /dev/null +++ b/external-odb.c @@ -0,0 +1,115 @@ +#include "cache.h" +#include "external-odb.h" +#include "odb-helper.h" + +static struct odb_helper *helpers; +static struct odb_helper **helpers_tail = + +static struct odb_helper *find_or_create_helper(const char *name, int len) +{ + struct odb_helper *o; + + for (o = helpers; o; o = o->next) + if (!strncmp(o->name, name, len) && !o->name[len]) + return o; + + o = odb_helper_new(name, len); + *helpers_tail = o; + helpers_tail = >next; + + return o; +} + +static int external_odb_config(const char *var, const char *value, void *data) +{ + struct odb_helper *o; + const char *key, *dot; + + if (!skip_prefix(var, "odb.", )) + return 0; + dot = strrchr(key, '.'); + if (!dot) + return 0; + + o = find_or_create_helper(key, dot - key); + key = dot + 1; + + if (!strcmp(key, "command")) + return git_config_string(>cmd, var, value); + + return 0; +} + +static void external_odb_init(void) +{ + static int initialized; + + if (initialized) + return; + initialized = 1; + + git_config(external_odb_config, NULL); +} + +const char *external_odb_root(void) +{ + static const char *root; + if (!root) + root = git_pathdup("objects/external"); + return root; +} + +int external_odb_has_object(const unsigned char *sha1) +{ + struct odb_helper *o; + + external_odb_init(); + + for (o = helpers; o; o = o->next) + if (odb_helper_has_object(o, sha1)) + return 1; + return 0; +} + +int external_odb_fetch_object(const unsigned char *sha1) +{ + struct odb_helper *o; + const char *path; + + if (!external_odb_has_object(sha1)) + return -1; + + path = sha1_file_name_alt(external_odb_root(), sha1); + safe_create_leading_directories_const(path); + prepare_external_alt_odb(); + + for (o = helpers; o; o = o->next) { + struct strbuf tmpfile =
[RFC/PATCH v3 14/16] lib-httpd: add apache-e-odb.conf
This is an apache config file to test external object databases. It uses the upload.sh and list.sh cgi that have been added previously to make apache store external objects. Signed-off-by: Christian Couder--- t/lib-httpd/apache-e-odb.conf | 214 ++ 1 file changed, 214 insertions(+) create mode 100644 t/lib-httpd/apache-e-odb.conf diff --git a/t/lib-httpd/apache-e-odb.conf b/t/lib-httpd/apache-e-odb.conf new file mode 100644 index 00..19a1540c82 --- /dev/null +++ b/t/lib-httpd/apache-e-odb.conf @@ -0,0 +1,214 @@ +ServerName dummy +PidFile httpd.pid +DocumentRoot www +LogFormat "%h %l %u %t \"%r\" %>s %b" common +CustomLog access.log common +ErrorLog error.log + + LoadModule log_config_module modules/mod_log_config.so + + + LoadModule alias_module modules/mod_alias.so + + + LoadModule cgi_module modules/mod_cgi.so + + + LoadModule env_module modules/mod_env.so + + + LoadModule rewrite_module modules/mod_rewrite.so + + + LoadModule version_module modules/mod_version.so + + + LoadModule headers_module modules/mod_headers.so + + + +LockFile accept.lock + + + + + LoadModule auth_module modules/mod_auth.so + + + += 2.1> + + LoadModule auth_basic_module modules/mod_auth_basic.so + + + LoadModule authn_file_module modules/mod_authn_file.so + + + LoadModule authz_user_module modules/mod_authz_user.so + + + LoadModule authz_host_module modules/mod_authz_host.so + + + += 2.4> + + LoadModule authn_core_module modules/mod_authn_core.so + + + LoadModule authz_core_module modules/mod_authz_core.so + + + LoadModule access_compat_module modules/mod_access_compat.so + + + LoadModule mpm_prefork_module modules/mod_mpm_prefork.so + + + LoadModule unixd_module modules/mod_unixd.so + + + +PassEnv GIT_VALGRIND +PassEnv GIT_VALGRIND_OPTIONS +PassEnv GNUPGHOME +PassEnv ASAN_OPTIONS +PassEnv GIT_TRACE +PassEnv GIT_CONFIG_NOSYSTEM + +Alias /dumb/ www/ +Alias /auth/dumb/ www/auth/dumb/ + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + SetEnv GIT_COMMITTER_NAME "Custom User" + SetEnv GIT_COMMITTER_EMAIL cus...@example.com + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + SetEnv GIT_NAMESPACE ns + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + Header set Set-Cookie name=value + + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + +ScriptAlias /upload/ upload.sh/ +ScriptAlias /list/ list.sh/ + + Options FollowSymlinks + + + Options ExecCGI + + + Options ExecCGI + + + Options ExecCGI + + +RewriteEngine on +RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301] +RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302] +RewriteRule ^/smart-redir-auth/(.*)$ /auth/smart/$1 [R=301] +RewriteRule ^/smart-redir-limited/(.*)/info/refs$ /smart/$1/info/refs [R=301] +RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 [R=302] + +RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 [R=302] +RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302] + +# Apache 2.2 does not understand , so we use RewriteCond. +# And as RewriteCond does not allow testing for non-matches, we match +# the desired case first (one has abra, two has cadabra), and let it +# pass by marking the RewriteRule as [L], "last rule, do not process +# any other matching RewriteRules after this"), and then have another +# RewriteRule that matches all other cases and lets them fail via '[F]', +# "fail the request". +RewriteCond %{HTTP:x-magic-one} =abra +RewriteCond %{HTTP:x-magic-two} =cadabra +RewriteRule ^/smart_headers/.* - [L] +RewriteRule ^/smart_headers/.* - [F] + + +LoadModule ssl_module modules/mod_ssl.so + +SSLCertificateFile httpd.pem +SSLCertificateKeyFile httpd.pem +SSLRandomSeed startup file:/dev/urandom 512 +SSLRandomSeed connect file:/dev/urandom 512 +SSLSessionCache none +SSLMutex file:ssl_mutex +SSLEngine On + + + + AuthType Basic + AuthName "git-auth" + AuthUserFile passwd + Require valid-user + + + + AuthType Basic + AuthName "git-auth" + AuthUserFile passwd + Require valid-user + + + + AuthType Basic + AuthName "git-auth" + AuthUserFile passwd + Require valid-user + + +RewriteCond %{QUERY_STRING} service=git-receive-pack [OR] +RewriteCond %{REQUEST_URI} /git-receive-pack$ +RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes] + + + Order Deny,Allow + Deny from env=AUTHREQUIRED + + AuthType Basic + AuthName "Git Access" + AuthUserFile passwd + Require valid-user + Satisfy Any + + + + LoadModule dav_module modules/mod_dav.so + LoadModule dav_fs_module
[RFC/PATCH v3 16/16] t0420: add test with HTTP external odb
This tests that an apache web server can be used as an external object database and store files in their native format instead of converting them to a Git object. Signed-off-by: Christian Couder--- t/t0420-transfer-http-e-odb.sh | 118 + 1 file changed, 118 insertions(+) create mode 100755 t/t0420-transfer-http-e-odb.sh diff --git a/t/t0420-transfer-http-e-odb.sh b/t/t0420-transfer-http-e-odb.sh new file mode 100755 index 00..9fb84877b5 --- /dev/null +++ b/t/t0420-transfer-http-e-odb.sh @@ -0,0 +1,118 @@ +#!/bin/sh + +test_description='tests for transfering external objects to an HTTPD server' + +. ./test-lib.sh + +# If we don't specify a port, the current test number will be used +# which will not work as it is less than 1024, so it can only be used by root. +LIB_HTTPD_PORT=$(expr ${this_test#t} + 12000) + +. "$TEST_DIRECTORY"/lib-httpd.sh + +start_httpd apache-e-odb.conf + +# odb helper script must see this +export HTTPD_URL + +write_script odb-http-helper <<\EOF +die() { + printf >&2 "%s\n" "$@" + exit 1 +} +echo >&2 "odb-http-helper args:" "$@" +case "$1" in +have) + list_url="$HTTPD_URL/list/" + curl "$list_url" || + die "curl '$list_url' failed" + ;; +get) + get_url="$HTTPD_URL/list/?sha1=$2" + curl "$get_url" || + die "curl '$get_url' failed" + ;; +put) + sha1="$2" + size="$3" + kind="$4" + upload_url="$HTTPD_URL/upload/?sha1=$sha1=$size=$kind" + curl --data-binary @- --include "$upload_url" >out || + die "curl '$upload_url' failed" + ref_hash=$(echo "$sha1 $size $kind" | GIT_NO_EXTERNAL_ODB=1 git hash-object -w -t blob --stdin) || exit + git update-ref refs/odbs/magic/"$sha1" "$ref_hash" + ;; +*) + die "unknown command '$1'" + ;; +esac +EOF +HELPER="\"$PWD\"/odb-http-helper" + + +test_expect_success 'setup repo with a root commit and the helper' ' + test_commit zero && + git config odb.magic.command "$HELPER" && + git config odb.magic.plainObjects "true" +' + +test_expect_success 'setup another repo from the first one' ' + git init other-repo && + (cd other-repo && +git remote add origin .. && +git pull origin master && +git checkout master && +git log) +' + +UPLOADFILENAME="hello_apache_upload.txt" + +UPLOAD_URL="$HTTPD_URL/upload/?sha1=$UPLOADFILENAME=123=blob" + +test_expect_success 'can upload a file' ' + echo "Hello Apache World!" >hello_to_send.txt && + echo "How are you?" >>hello_to_send.txt && + curl --data-binary @hello_to_send.txt --include "$UPLOAD_URL" >out_upload +' + +LIST_URL="$HTTPD_URL/list/" + +test_expect_success 'can list uploaded files' ' + curl --include "$LIST_URL" >out_list && + grep "$UPLOADFILENAME" out_list +' + +test_expect_success 'can delete uploaded files' ' + curl --data "delete" --include "$UPLOAD_URL=1" >out_delete && + curl --include "$LIST_URL" >out_list2 && + ! grep "$UPLOADFILENAME" out_list2 +' + +FILES_DIR="httpd/www/files" + +test_expect_success 'new blobs are transfered to the http server' ' + test_commit one && + hash1=$(git ls-tree HEAD | grep one.t | cut -f1 | cut -d\ -f3) && + echo "$hash1-4-blob" >expected && + ls "$FILES_DIR" >actual && + test_cmp expected actual +' + +test_expect_success 'blobs can be retrieved from the http server' ' + git cat-file blob "$hash1" && + git log -p >expected +' + +test_expect_success 'update other repo from the first one' ' + (cd other-repo && +git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*" && +test_must_fail git cat-file blob "$hash1" && +git config odb.magic.command "$HELPER" && +git config odb.magic.plainObjects "true" && +git cat-file blob "$hash1" && +git pull origin master) +' + +stop_httpd + +test_done -- 2.11.0.rc2.37.geb49ca6
[RFC/PATCH v3 06/16] external odb: add write support
Signed-off-by: Christian Couder--- external-odb.c | 15 +++ external-odb.h | 2 ++ odb-helper.c | 41 + odb-helper.h | 3 +++ sha1_file.c| 2 ++ 5 files changed, 59 insertions(+), 4 deletions(-) diff --git a/external-odb.c b/external-odb.c index 42978a3298..bb70fe3298 100644 --- a/external-odb.c +++ b/external-odb.c @@ -127,3 +127,18 @@ int external_odb_for_each_object(each_external_object_fn fn, void *data) } return 0; } + +int external_odb_write_object(const void *buf, unsigned long len, + const char *type, unsigned char *sha1) +{ + struct odb_helper *o; + + external_odb_init(); + + for (o = helpers; o; o = o->next) { + int r = odb_helper_write_object(o, buf, len, type, sha1); + if (r <= 0) + return r; + } + return 1; +} diff --git a/external-odb.h b/external-odb.h index cea8570a49..55d291d1cf 100644 --- a/external-odb.h +++ b/external-odb.h @@ -10,5 +10,7 @@ typedef int (*each_external_object_fn)(const unsigned char *sha1, unsigned long size, void *data); int external_odb_for_each_object(each_external_object_fn, void *); +int external_odb_write_object(const void *buf, unsigned long len, + const char *type, unsigned char *sha1); #endif /* EXTERNAL_ODB_H */ diff --git a/odb-helper.c b/odb-helper.c index 2db59caa53..7b7de7380f 100644 --- a/odb-helper.c +++ b/odb-helper.c @@ -33,9 +33,10 @@ static void prepare_helper_command(struct argv_array *argv, const char *cmd, strbuf_release(); } -__attribute__((format (printf,3,4))) +__attribute__((format (printf,4,5))) static int odb_helper_start(struct odb_helper *o, struct odb_helper_cmd *cmd, + int use_stdin, const char *fmt, ...) { va_list ap; @@ -52,7 +53,10 @@ static int odb_helper_start(struct odb_helper *o, cmd->child.argv = cmd->argv.argv; cmd->child.use_shell = 1; - cmd->child.no_stdin = 1; + if (use_stdin) + cmd->child.in = -1; + else + cmd->child.no_stdin = 1; cmd->child.out = -1; if (start_command(>child) < 0) { @@ -109,7 +113,7 @@ static void odb_helper_load_have(struct odb_helper *o) return; o->have_valid = 1; - if (odb_helper_start(o, , "have") < 0) + if (odb_helper_start(o, , 0, "have") < 0) return; fh = xfdopen(cmd.child.out, "r"); @@ -164,7 +168,7 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, if (!obj) return -1; - if (odb_helper_start(o, , "get %s", sha1_to_hex(sha1)) < 0) + if (odb_helper_start(o, , 0, "get %s", sha1_to_hex(sha1)) < 0) return -1; memset(, 0, sizeof(stream)); @@ -252,3 +256,32 @@ int odb_helper_for_each_object(struct odb_helper *o, return 0; } + +int odb_helper_write_object(struct odb_helper *o, + const void *buf, unsigned long len, + const char *type, unsigned char *sha1) +{ + struct odb_helper_cmd cmd; + + if (odb_helper_start(o, , 1, "put %s %lu %s", +sha1_to_hex(sha1), len, type) < 0) + return -1; + + do { + int w = xwrite(cmd.child.in, buf, len); + if (w < 0) { + error("unable to write to odb helper '%s': %s", + o->name, strerror(errno)); + close(cmd.child.in); + close(cmd.child.out); + odb_helper_finish(o, ); + return -1; + } + len -= w; + } while (len > 0); + + close(cmd.child.in); + close(cmd.child.out); + odb_helper_finish(o, ); + return 0; +} diff --git a/odb-helper.h b/odb-helper.h index 8c3916d215..af31cc27d5 100644 --- a/odb-helper.h +++ b/odb-helper.h @@ -25,5 +25,8 @@ int odb_helper_fetch_object(struct odb_helper *o, const unsigned char *sha1, int fd); int odb_helper_for_each_object(struct odb_helper *o, each_external_object_fn, void *); +int odb_helper_write_object(struct odb_helper *o, + const void *buf, unsigned long len, + const char *type, unsigned char *sha1); #endif /* ODB_HELPER_H */ diff --git a/sha1_file.c b/sha1_file.c index 6d68157e30..3532c1c598 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3320,6 +3320,8 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign * it out into .git/objects/??/?{38} file. */
[RFC/PATCH v3 00/16] Add initial experimental external ODB support
Goal Git can store its objects only in the form of loose objects in separate files or packed objects in a pack file. To be able to better handle some kind of objects, for example big blobs, it would be nice if Git could store its objects in other object databases (ODB). To do that, this patch series makes it possible to register commands, using "odb..command" config variables, to access external ODBs where objects can be stored and retrieved. External ODBs should be able to tranfer information about the blobs they store. This patch series shows how this is possible using kind of replace refs. Design ~~ * Registered command Each registered command manages access to one external ODB and will be called the following ways: - " have": the command should output the sha1, size and type of all the objects the external ODB contains, one object per line. - " get ": the command should then read from the external ODB the content of the object corresponding to and output it on stdout. - " put ": the command should then read from stdin an object and store it in the external ODB. * Transfer To tranfer information about the blobs stored in external ODB, some special refs, called "odb ref", similar as replace refs, are used. For now there should be one odb ref per blob. Each ref name should be refs/odbs// where is the sha1 of the blob stored in the external odb named . These odb refs should all point to a blob that should be stored in the Git repository and contain information about the blob stored in the external odb. This information can be specific to the external odb. The repos can then share this information using commands like: `git fetch origin "refs/odbs//*:refs/odbs//*"` * External object database This RFC patch series shows in the tests: - how to use another git repository as an external ODB - how to use an http server as an external ODB Design discussion about performance ~~~ Yeah, it is not efficient to fork/exec a command to just read or write one object to or from the external ODB. Batch calls and/or using a daemon and/or RPC should be used instead to be able to store regular objects in an external ODB. But for now the external ODB would be all about really big files, where the cost of a fork+exec should not matter much. If we later want to extend usage of external ODBs, yeah we will probably need to design other mechanisms. Here are some related explanations from Peff: {{{ Because this "external odb" essentially acts as a git alternate, we would hit it only when we couldn't find an object through regular means. Git would then make the object available in the usual on-disk format (probably as a loose object). So in most processes, we would not need to consult the odb command at all. And when we do, the first thing would be to get its "have" list, which would at most run once per process. So the per-object cost is really calling "get", and my assumption there was that the cost of actually retrieving the object over the network would dwarf the fork/exec cost. I also waffled on having git cache the output of " have" in some fast-lookup format to save even the single fork/exec. But I figured that was something that could be added later if needed. You'll note that this is sort of a "fault-in" model. Another model would be to treat external odb updates similar to fetches. I.e., we touch the network only during a special update operation, and then try to work locally with whatever the external odb has. IMHO this policy could actually be up to the external odb itself (i.e., its "have" command could serve from a local cache if it likes). }}} Implementation ~~ * Mechanism to call the registered commands This series adds a set of function in external-odb.{c,h} that are called by the rest of Git to manage all the external ODBs. These functions use 'struct odb_helper' and its associated functions defined in odb-helper.{c,h} to talk to the different external ODBs by launching the configured "odb..command" commands and writing to or reading from them. The tests in this series creates an odb-helper script that is registered using the "odb.magic.command" config variable, and then called to read from and write to the external ODB. * ODB refs For now odb ref management is only implemented in a registered command in t0410, but maybe this or some parts of it could be done by Git itself. When a new blob is added to an external odb, its sha1, size and type are writen in another new blob and the odb ref is created. When the list of existing blobs is requested from the external odb, the content of the blobs pointed to by the odb refs can also be used by the odb to claim that it can get the objects. When a blob is actually requested from the external odb, it can use the content stored in the blobs pointed to by the odb refs to get the actual blobs and then pass them. Highlevel view of the patches in the series
[RFC/PATCH v3 05/16] t0400: add test for 'put' command
Signed-off-by: Christian Couder--- t/t0400-external-odb.sh | 9 + 1 file changed, 9 insertions(+) diff --git a/t/t0400-external-odb.sh b/t/t0400-external-odb.sh index 0f1bb97f38..6c6da5cf4f 100755 --- a/t/t0400-external-odb.sh +++ b/t/t0400-external-odb.sh @@ -57,4 +57,13 @@ test_expect_success 'helper can retrieve alt objects' ' test_cmp expect actual ' +test_expect_success 'helper can add objects to alt repo' ' + hash=$(echo "Hello odb!" | git hash-object -w -t blob --stdin) && + test -f .git/objects/$(echo $hash | sed "s#..#&/#") && + size=$(git cat-file -s "$hash") && + git cat-file blob "$hash" | ./odb-helper put "$hash" "$size" blob && + alt_size=$(cd alt-repo && git cat-file -s "$hash") && + test "$size" -eq "$alt_size" +' + test_done -- 2.11.0.rc2.37.geb49ca6
Grant.
Hello, A donation has been made out to you and your family from my foundation. For details, send an E-mail now to: castrat...@gmail.com Regards
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
Duy Nguyenwrites: > On Wed, Nov 23, 2016 at 2:22 AM, Stefan Beller wrote: >> +/* >> + * Migrate the given submodule (and all its submodules recursively) from >> + * having its git directory within the working tree to the git dir nested >> + * in its superprojects git dir under modules/. >> + */ >> +void migrate_submodule_gitdir(const char *prefix, const char *path, >> + int recursive) > > Submodules and worktrees seem to have many things in common. The first > one is this. "git worktree move" on a worktree that contains > submodules .git also benefits from something like this [1]. I suggest > you move this function to some neutral place and maybe rename it to > relocate_gitdir() or something. Yeah, good suggestion (including name; first round used "intern" I had trouble with, then "embed" which was OK-ish, but probably "relocate" is better choice. If anything, what Stefan's series adds is a command to un-embed embedded one). > It probably should take a bit flag instead of "recursive" here. One > thing I would need is the ability to tell this function "I have moved > all these .git dirs already (because I move whole worktree in one > operation), here are the old and new locations of them, fix them up!". > In other words, no rename() could be optionally skipped. Thanks two of you for working well together ;-)
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
On Wed, Nov 30, 2016 at 10:04 AM, Stefan Bellerwrote: >> Submodules and worktrees seem to have many things in common. > > Yes. :) I moved the code to dir.{c,h} and renamed it to a more generic "move_gitdir", but then I am thinking further about how to align worktrees and submodules here, specifically the recursive part. Whenever submodules and recursion is involved so far we spawned off a child process to take care of the issue in the submodule itself. Here we followed the same pattern and called the submodule--helper to embed the git dirs in the submodule recursively. As this function is not supposed to be submodule specific anymore, I'd rather not want to have the recursive childrens code live inside the submodule helper, so we also need a neutral helper for moving git directories? So there are a couple ways forward: * We declare the "recursive" flag to be submodule specific and keep the recursion in the submodule helper * the recursive flag is generic enough and we invent a plumbing helper. Analogous to the submodule--helper, that was originally invented as a C aid for the git-submodule shell script, we could have a "git--helper" or just "git plumbing ", though that sounds like reinventing the wheel as traditionally we'd just have a top level command to do a specific thing. (side note: Some parts of git-rev-parse could go into such a plumbing command) For now I'd think to declare recursion in this function to be submodule specific. Stefan
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
On Wed, Nov 30, 2016 at 12:51 PM, Junio C Hamanowrote: > Duy Nguyen writes: > >> On Wed, Nov 23, 2016 at 2:22 AM, Stefan Beller wrote: >>> +/* >>> + * Migrate the given submodule (and all its submodules recursively) from >>> + * having its git directory within the working tree to the git dir nested >>> + * in its superprojects git dir under modules/. >>> + */ >>> +void migrate_submodule_gitdir(const char *prefix, const char *path, >>> + int recursive) >> >> Submodules and worktrees seem to have many things in common. The first >> one is this. "git worktree move" on a worktree that contains >> submodules .git also benefits from something like this [1]. I suggest >> you move this function to some neutral place and maybe rename it to >> relocate_gitdir() or something. > > Yeah, good suggestion (including name; first round used "intern" I > had trouble with, then "embed" which was OK-ish, but probably > "relocate" is better choice. If anything, what Stefan's series adds > is a command to un-embed embedded one). Heh, good counter perspective. I thought about embedding it "into the superprojects git directory" whereas your un-embed sounds like you'd assume embedding is targetd towards the working dir. relocate as a fancy name for move sounds like it expects 2 arguments (source and destination), but maybe we could go with: git relocate-git-dir (--into-workingtree|--into-gitdir) \ [--recurse-submodules] \ [--only-fix-gitfile-links-and-core-setting-as-I-moved-it-myself-already] \ [--dryrun] [--verbose] [--unsafe-move] No need to have a pathspec here IMHO. Later it could have another flag to specify the "main" worktree or such as well.
Re: [PATCH] difftool.c: mark a file-local symbol with static
Ramsay Joneswrites: > [I have fixed my config.mak file now, so I don't see the warning > anymore! Having -Wno-format-zero-length in DEVELOPER_CFLAGS, or > not, is a separate matter.] I suspect that 658df95a4a ("add DEVELOPER makefile knob to check for acknowledged warnings", 2016-02-25) took it from me (namely, Make script in my 'todo' branch). In turn, I added it to my set of flags in order to squelch this exact warning, so...
"git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"
After upgrading to version 2.11.0 I am getting a warning about empty strings as pathspecs while using 'patch' - Ran 'git add -p .' from the root of my git repository. - I was able to normally stage my changes, but was presented with a "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths" message. - I expected no warning message since I included a "." with my original command. I believe that I should not be seeing this warning message as I included the requested "." pathspec. ~ Peter Urda http://urda.cc
Re: [PATCH] difftool.c: mark a file-local symbol with static
On 30/11/16 11:07, Johannes Schindelin wrote: > Hi Ramsay, > > On Tue, 29 Nov 2016, Ramsay Jones wrote: >> Also, due to a problem in my config.mak file on Linux (a commented >> out line that had a line continuation '\', gr!), gcc issued a >> warning, thus: >> >> builtin/difftool.c: In function ‘run_dir_diff’: >> builtin/difftool.c:568:13: warning: zero-length gnu_printf format string >> [-Wformat-zero-length] >>warning(""); >>^ >> I am not sure why -Wno-format-zero-length is set in DEVELOPER_CFLAGS, >> but do you really need to space the output with an an 'empty' >> "warning:" line? (Just curious). > > That `warning("");` comes from a straight-forward port of this line (see > https://github.com/git/git/blob/v2.11.0/git-difftool.perl#L425): > > $errmsg .= "warning:\n"; Ah, OK, so it is being used to 'space out' the (possibly multiple) 'Both files modified' warning(s) followed by a 2-line warning summary. Hmm, I am not sure the 'Working tree file has been left' (after each pair of files) part of the message is adding much ... > I could see two possible ways out: > > - warning("%s", ""); (ugly!) > > - do away with the "prefix every line with warning:" convention and simply > have a multi-line `warning(_("...\n...\n"), ...)` > > What do you think? I think, for now, it is probably best to do nothing. ;-) Until you have replaced the 'legacy' difftool, it would be best not to alter the output from the tool, so that the tests can be used unaltered on both. This could be addressed (if necessary) after you complete your series. [I have fixed my config.mak file now, so I don't see the warning anymore! Having -Wno-format-zero-length in DEVELOPER_CFLAGS, or not, is a separate matter.] ATB, Ramsay Jones
Re: What's cooking in git.git (Nov 2016, #06; Mon, 28)
On 11/29, Jeff King wrote: > On Tue, Nov 29, 2016 at 01:37:59AM -0500, Jeff King wrote: > > > 2. Grep threads doing more complicated stuff that needs to take a > > lock. You might try building with -fsanitize=thread to see if it > > turns up anything. > > I tried this and it didn't find anything useful. It complains about > multiple threads calling want_color() at the same time, which you can > silence with something like: > > diff --git a/builtin/grep.c b/builtin/grep.c > index 2c727ef49..d48846f40 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -207,6 +207,12 @@ static void start_threads(struct grep_opt *opt) > { > int i; > > + /* > + * trigger want_color() for its side effect of caching the result; > + * otherwise the threads will fight over setting the cache > + */ > + want_color(GIT_COLOR_AUTO); > + > pthread_mutex_init(_mutex, NULL); > pthread_mutex_init(_read_mutex, NULL); > pthread_mutex_init(_attr_mutex, NULL); > > But the problem persists even with that patch, so it is something else. > It may still be a threading problem; -fsanitize=thread isn't perfect. I > also couldn't get the stress-test to fail when compiled with it. But > that may simply mean that the timing of the resulting binary is changed > enough not to trigger the issue. > > -Peff With you're stress script I'm able to see the failures. The interesting thing is that the entry missing is always from the non-submodule file. -- Brandon Williams
Re: gitconfig includes
On Wed, Nov 30, 2016 at 2:06 PM, Jeff Kingwrote: > > I'm not sure what your script does exactly, but in general I think the > right thing for most scripts is _not_ to use a specific-file option > like --global. > > If the script is looking up a config value on behalf of a user, it > probably makes sense for it to use the normal config lookup procedure > (system, global, repo, command-line), which also enables includes by > default. That would make it consistent with internal git config > lookups (e.g., user.name probably only ever appears in global config, > but you _can_ override it at the repo level if you want to). This is intended for git newbies (and big company => infinite supply of them), and also allows them to conveniently nuke the repo and start from a fresh copy, so it makes sense to make the script inspect/tweak the global settings. If knowing git "well enough" was an assumed requirement, I'd definitely do the normal thing. > I know that's mostly orthogonal to what we're discussing, but I'd feel > more convinced that enabling "--includes" with "--global" is useful if > I thought that "--global" was useful in the first place outside of a > few narrow debugging cases. Ok. Perhaps I overestimated the utility of --global anyway, given the above... -- ((x=>x(x))(x=>x(x))) Eli Barzilay: http://barzilay.org/ Maze is Life!
Re: gitconfig includes
On Wed, Nov 30, 2016 at 01:54:35PM -0500, Eli Barzilay wrote: > I don't have any strong opinion, but FWIW, the use case I have for this > is as follows: I sync my ~/.gitconfig between my own machine and a work > machine. On the work machine though, I like people to have work emails, > and I wrote some scripts that verify that. For my case, I added an > include of a ~/.gitconfig.more which is not synced, and has values that > override the ones in ~/.gitconfig. Since I'm the one who also wrote > that script, I just added an "--includes" to the check so it won't barf > on my setup, but had it not been my script I'd be stuck. I'm not sure what your script does exactly, but in general I think the right thing for most scripts is _not_ to use a specific-file option like --global. If the script is looking up a config value on behalf of a user, it probably makes sense for it to use the normal config lookup procedure (system, global, repo, command-line), which also enables includes by default. That would make it consistent with internal git config lookups (e.g., user.name probably only ever appears in global config, but you _can_ override it at the repo level if you want to). I know that's mostly orthogonal to what we're discussing, but I'd feel more convinced that enabling "--includes" with "--global" is useful if I thought that "--global" was useful in the first place outside of a few narrow debugging cases. -Peff
Re: gitconfig includes
On Tue, Nov 29, 2016 at 4:50 PM, Junio C Hamanowrote: > Jeff King writes: > >> I think it's arguable whether "--global" should behave the same. > > I know you know this and I am writing this message for others. > > I admit that I wondered if "a single file" ought to cover these > short-hand notations like --global and --local while re-reading the > log message of 9b25a0b52 (config: add include directive, > 2012-02-06). In other words, I agree that it used to be arguable > before we released v1.7.10. > > It no longer is arguable simply due to backward compatibilty. The > ship has long sailed. I don't have any strong opinion, but FWIW, the use case I have for this is as follows: I sync my ~/.gitconfig between my own machine and a work machine. On the work machine though, I like people to have work emails, and I wrote some scripts that verify that. For my case, I added an include of a ~/.gitconfig.more which is not synced, and has values that override the ones in ~/.gitconfig. Since I'm the one who also wrote that script, I just added an "--includes" to the check so it won't barf on my setup, but had it not been my script I'd be stuck. This is all a "FWIW" -- in case anyone thinks about use cases for a possible (future) change of the default. -- ((x=>x(x))(x=>x(x))) Eli Barzilay: http://barzilay.org/ Maze is Life!
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Jakub Narębskiwrites: >> My original "create a file in libexec/git-core/" was simple, did the job >> reliably, and worked also for testing. >> >> It is a pity that you two gentlemen shot it down for being inelegant. And >> ever since, we try to find a solution that is as simple, works as >> reliably, also for testing, *and* appeases your tastes. > > I just would like to note that existence of file is used for both > git-daemon and gitweb (the latter following the git-daemon example). > > So there is a precedent for the use of this mechanism. I think you are thinking about git-daemon-export-ok (for 'git daemon') and $GITWEB_EXPORT_OK file (for 'gitweb'). You do realize that it is apples-and-oranges [*1*] to take these as analogous to what Dscho is trying to do, don't you? First of all, these are to control access to each repository on the server side; the presence of the file is checked in each repository. What Dscho wants is to control the behaviour of an installation of Git as a whole, no matter which repository is being accessed [*2*, *3*]. More importantly, did you notice that git-daemon-export-ok predates the configuration mechanism by a large margin? The "does the file exist?" check done in a87e8be2ae ("Add a "git-daemon" that listens on a TCP port", 2005-07-13) is a relic from the past [*4*], and 32f4aaccaa ("gitweb: export options", 2006-09-17) added GITWEB_EXPORT_OK to mimic it, also long time ago [*5*]. They are not something you would want to mimic in new programs these days. Besides, $GIT_EXEC_PATH is where you place git subcommands. Who in the right mind considers it even remotely sane to design a system where you have to throw in a file that is not a command to /usr/bin to control the behaviour of your system? [*6*] So the "precedent" is irrelevant in the first place, and even if it were relevant, it is a bad piece of advice to mimic it. [Footnote] *1* Or is it apples-and-pineapples these days? *2* Not that I agree with that desire, if I understand him correctly from his description against the approach based on an environment variable. If a user has multiple installations and not even aware of which one of them s/he is currently using, a mechanism that affects only one of them (instead of consistently affecting all of them) would lead to more confusion, I would think. *3* If such hermetically configured independent installations are desirable, etc/gitconfig aka "git config --system" is a more appropriate thing to use, and you do not need to do repository discovery before you can read it. *4* If we had config mechanism, we would have used it just like we use daemon.* variables to control what services are enabled for each repository. *5* By that time, the config mechanism did already exist, so the GITWEB_EXPORT_OK could have been a per-repository configuration, but "gitweb" had another excuse to deviate from the norm. "Is this repository visible?" was done during repository listing and the script did not want to run "git config" in each and every repository-like directory it encountered in File::Find::find(). *6* And I do not think $GIT_EXEC_PATH vs /usr/bin is apples-and-oranges analogy.
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
On Wed, Nov 30, 2016 at 3:14 AM, Duy Nguyenwrote: > On Wed, Nov 23, 2016 at 2:22 AM, Stefan Beller wrote: >> +/* >> + * Migrate the given submodule (and all its submodules recursively) from >> + * having its git directory within the working tree to the git dir nested >> + * in its superprojects git dir under modules/. >> + */ >> +void migrate_submodule_gitdir(const char *prefix, const char *path, >> + int recursive) > > Submodules and worktrees seem to have many things in common. Yes. :) > The first > one is this. "git worktree move" on a worktree that contains > submodules .git also benefits from something like this [1]. That patch is a sensible approach. :) (By checking all files to not be submodules a worktree would not run into problems like a127331cd81, mv: allow moving nested submodules) > I suggest > you move this function to some neutral place and maybe rename it to > relocate_gitdir() or something. ok tell me where this neutral place is found? (I'd prefer to not clobber it into cache.h *the* most neutral place in git) Maybe dir.{c,h} ? > > It probably should take a bit flag instead of "recursive" here. One > thing I would need is the ability to tell this function "I have moved > all these .git dirs already (because I move whole worktree in one > operation), here are the old and new locations of them, fix them up!". > In other words, no rename() could be optionally skipped. In the non-main working trees you'd also have a .git file linking to the actual git dir and you'd only have to fix them up instead of moving. > > [1] > https://public-inbox.org/git/20161128094319.16176-11-pclo...@gmail.com/T/#u > >> +{ >> + char *old_git_dir; >> + const char *new_git_dir; >> + const struct submodule *sub; >> + >> + old_git_dir = xstrfmt("%s/.git", path); >> + if (read_gitfile(old_git_dir)) >> + /* If it is an actual gitfile, it doesn't need migration. */ >> + goto out; >> + >> + sub = submodule_from_path(null_sha1, path); >> + if (!sub) >> + die(_("Could not lookup name for submodule '%s'"), >> + path); >> + >> + new_git_dir = git_common_path("modules/%s", sub->name); > > Why doesn't git_path() work here? This would make "modules" shared > between worktrees, even though it's not normally. That inconsistency > could cause trouble. I thought that was a long term goal? (I actually think about reviving the series you sent out a few weeks ago to make worktree and submodules work well together) So for that we'd want to have at least the object store shared across all worktrees. > >> + if (safe_create_leading_directories_const(new_git_dir) < 0) >> + die(_("could not create directory '%s'"), new_git_dir); >> + >> + if (!prefix) >> + prefix = get_super_prefix(); >> + printf("Migrating git directory of %s%s from\n'%s' to\n'%s'\n", >> + prefix ? prefix : "", path, >> + real_path(old_git_dir), new_git_dir); >> + >> + if (rename(old_git_dir, new_git_dir) < 0) >> + die_errno(_("Could not migrate git directory from '%s' to >> '%s'"), >> + old_git_dir, new_git_dir); >> + >> + connect_work_tree_and_git_dir(path, new_git_dir); > > Another thing in common is, both submodules and worktrees use some > form of textual symlinks. You need to fix up some here. But if this > submodule has multiple worktreee, there may be some "symlinks" in > .git/worktrees which would need fixing up as well. We could signal that via one of the flag bits? (e.g. FIXUP_WORKTREE_SYMLINKS ) > > You don't have to do the fix up thing right away, but I think we > should at least make sure we leave no dangling links behind (by > die()ing early if we find a .git dir we can't handle yet) > -- > Duy
[PATCH v2 1/1] convert: git cherry-pick -Xrenormalize did not work
From: Torsten BögershausenWorking with a repo that used to be all CRLF. At some point it was changed to all LF, with `text=auto` in .gitattributes. Trying to cherry-pick a commit from before the switchover fails: $ git cherry-pick -Xrenormalize fatal: CRLF would be replaced by LF in [path] Commit 65237284 "unify the "auto" handling of CRLF" introduced a regression: Whenever crlf_action is CRLF_TEXT_XXX and not CRLF_AUTO_XXX, SAFE_CRLF_RENORMALIZE was feed into check_safe_crlf(). This is wrong because here everything else than SAFE_CRLF_WARN is treated as SAFE_CRLF_FAIL. Call check_safe_crlf() only if checksafe is SAFE_CRLF_WARN or SAFE_CRLF_FAIL. Reported-by: Eevee (Lexy Munroe) Signed-off-by: Torsten Bögershausen --- convert.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/convert.c b/convert.c index be91358..f8e4dfe 100644 --- a/convert.c +++ b/convert.c @@ -281,13 +281,13 @@ static int crlf_to_git(const char *path, const char *src, size_t len, /* * If the file in the index has any CR in it, do not convert. * This is the new safer autocrlf handling. + - unless we want to renormalize in a merge or cherry-pick */ - if (checksafe == SAFE_CRLF_RENORMALIZE) - checksafe = SAFE_CRLF_FALSE; - else if (has_cr_in_index(path)) + if ((checksafe != SAFE_CRLF_RENORMALIZE) && has_cr_in_index(path)) convert_crlf_into_lf = 0; } - if (checksafe && len) { + if ((checksafe == SAFE_CRLF_WARN || + (checksafe == SAFE_CRLF_FAIL)) && len) { struct text_stat new_stats; memcpy(_stats, , sizeof(new_stats)); /* simulate "git add" */ -- 2.10.0
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hello, W dniu 28.11.2016 o 18:34, Johannes Schindelin pisze: > My original "create a file in libexec/git-core/" was simple, did the job > reliably, and worked also for testing. > > It is a pity that you two gentlemen shot it down for being inelegant. And > ever since, we try to find a solution that is as simple, works as > reliably, also for testing, *and* appeases your tastes. I just would like to note that existence of file is used for both git-daemon and gitweb (the latter following the git-daemon example). So there is a precedent for the use of this mechanism. Best, -- Jakub Narębski
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Wed, Nov 30, 2016 at 01:30:47PM +0100, Johannes Schindelin wrote: > On Tue, 29 Nov 2016, Jeff King wrote: > > > On Tue, Nov 29, 2016 at 09:36:55PM +0100, Johannes Schindelin wrote: > > > > > So the suggestion by both you and Peff, to use an environment variable, > > > which is either global, or requires the user to set it manually per > > > session, is simply not a good idea at all. > > > > No, my suggestion was to use config and have the test suite use an > > environment variable to test both cases (preferably automatically, > > without the user having to do anything). > > > > I do not see how that fails to cover all of your use cases. > > Oh, so the suggestion is to have *both* a config *and* an environment > variable. That is not elegant. No, that is not at all what I said. I was going to explain myself again, but I do not see what good it would do, as clearly my point did not come across in the other three emails. And then you would just complain that I am making work for you. So whatever. I do not care about your difftool topic at all. Do whatever you like (which hey, I already said before, too). -Peff
[PATCH v2] tag, branch, for-each-ref: add --ignore-case for sorting and filtering
This option makes sorting ignore case, which is great when you have branches named bug-12-do-something, Bug-12-do-some-more and BUG-12-do-what and want to group them together. Sorting externally may not be an option because we lose coloring and column layout from git-branch and git-tag. The same could be said for filtering, but it's probably less important because you can always go with the ugly pattern [bB][uU][gG]-* if you're desperate. You can't have case-sensitive filtering and case-insensitive sorting (or the other way around) with this though. But who would want that? Signed-off-by: Nguyễn Thái Ngọc Duy--- v2 has a different approach, and I think it's a better one even with that unanswered question above. Documentation/git-branch.txt | 4 Documentation/git-for-each-ref.txt | 3 +++ Documentation/git-tag.txt | 4 builtin/branch.c | 23 ++- builtin/for-each-ref.c | 5 - builtin/tag.c | 4 ref-filter.c | 28 +--- ref-filter.h | 2 ++ t/t3203-branch-output.sh | 22 ++ t/t7004-tag.sh | 20 10 files changed, 98 insertions(+), 17 deletions(-) diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt index 1fe7344..5516a47 100644 --- a/Documentation/git-branch.txt +++ b/Documentation/git-branch.txt @@ -118,6 +118,10 @@ OPTIONS default to color output. Same as `--color=never`. +-i:: +--ignore-case:: + Sorting and filtering branches are case insensitive. + --column[=]:: --no-column:: Display branch listing in columns. See configuration variable diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index f57e69b..6d22974 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -79,6 +79,9 @@ OPTIONS Only list refs which contain the specified commit (HEAD if not specified). +--ignore-case:: + Sorting and filtering refs are case insensitive. + FIELD NAMES --- diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt index 80019c5..76cfe40 100644 --- a/Documentation/git-tag.txt +++ b/Documentation/git-tag.txt @@ -108,6 +108,10 @@ OPTIONS variable if it exists, or lexicographic order otherwise. See linkgit:git-config[1]. +-i:: +--ignore-case:: + Sorting and filtering tags are case insensitive. + --column[=]:: --no-column:: Display tag listing in columns. See configuration variable diff --git a/builtin/branch.c b/builtin/branch.c index 60cc5c8..36e0a21 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -512,15 +512,6 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin if (filter->verbose) maxwidth = calc_maxwidth(, strlen(remote_prefix)); - /* -* If no sorting parameter is given then we default to sorting -* by 'refname'. This would give us an alphabetically sorted -* array with the 'HEAD' ref at the beginning followed by -* local branches 'refs/heads/...' and finally remote-tacking -* branches 'refs/remotes/...'. -*/ - if (!sorting) - sorting = ref_default_sorting(); ref_array_sort(sorting, ); for (i = 0; i < array.nr; i++) @@ -645,6 +636,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) const char *new_upstream = NULL; enum branch_track track; struct ref_filter filter; + int icase = 0; static struct ref_sorting *sorting = NULL, **sorting_tail = struct option options[] = { @@ -686,6 +678,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) OPTION_CALLBACK, 0, "points-at", _at, N_("object"), N_("print only branches of the object"), 0, parse_opt_object_name }, + OPT_BOOL('i', "ignore-case", , N_("sorting and filtering are case insensitive")), OPT_END(), }; @@ -723,6 +716,8 @@ int cmd_branch(int argc, const char **argv, const char *prefix) if (filter.abbrev == -1) filter.abbrev = DEFAULT_ABBREV; + filter.ignore_case = icase; + finalize_colopts(, -1); if (filter.verbose) { if (explicitly_enable_column(colopts)) @@ -744,6 +739,16 @@ int cmd_branch(int argc, const char **argv, const char *prefix) if ((filter.kind & FILTER_REFS_BRANCHES) && filter.detached) filter.kind |= FILTER_REFS_DETACHED_HEAD; filter.name_patterns = argv; + /* +* If no sorting parameter is given then we default to sorting +* by 'refname'. This would give us an alphabetically
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Junio, On Tue, 29 Nov 2016, Junio C Hamano wrote: > Johannes Schindelinwrites: > > > So the suggestion by both you and Peff, to use an environment variable, > > which is either global, or requires the user to set it manually per > > session, is simply not a good idea at all. > > As I already said, I do not have a strong preference between config > and env. I raised the env as a possible alternative that you can > think about its pros and cons, and as I already said, if you thought > and your concluded that config would work better for your needs, > that is fine by me. The env flat out fails, on the grounds of not integrating nicely into a Git for Windows installer. The config kinda works now. But for what price. It stole 4 hours I did not have. When the libexec/git-core/use-builtin-difftool solution took me a grand total of half an hour to devise, implement and test. And you know what? I still do not really see what is so bad about it. And I still see what is bad about the config "solution": it *creates* a chicken-and-egg problem with the order of config reading vs running scripts. It *creates* problems for requiring to spawn a `git config` call because reading the config in-process would mess up the global variables and environment *beyond repair*, making it *impossible* to even spawn the git-legacy-difftool Perl script. In short: the config setting now works. But it is ugly as hell. I wish I never had listened to you. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Tue, 29 Nov 2016, Jeff King wrote: > On Tue, Nov 29, 2016 at 09:36:55PM +0100, Johannes Schindelin wrote: > > > So the suggestion by both you and Peff, to use an environment variable, > > which is either global, or requires the user to set it manually per > > session, is simply not a good idea at all. > > No, my suggestion was to use config and have the test suite use an > environment variable to test both cases (preferably automatically, > without the user having to do anything). > > I do not see how that fails to cover all of your use cases. Oh, so the suggestion is to have *both* a config *and* an environment variable. That is not elegant. Ciao, Dscho
Re: [PATCHv2 4/4] submodule: add embed-git-dir function
On Wed, Nov 23, 2016 at 2:22 AM, Stefan Bellerwrote: > +/* > + * Migrate the given submodule (and all its submodules recursively) from > + * having its git directory within the working tree to the git dir nested > + * in its superprojects git dir under modules/. > + */ > +void migrate_submodule_gitdir(const char *prefix, const char *path, > + int recursive) Submodules and worktrees seem to have many things in common. The first one is this. "git worktree move" on a worktree that contains submodules .git also benefits from something like this [1]. I suggest you move this function to some neutral place and maybe rename it to relocate_gitdir() or something. It probably should take a bit flag instead of "recursive" here. One thing I would need is the ability to tell this function "I have moved all these .git dirs already (because I move whole worktree in one operation), here are the old and new locations of them, fix them up!". In other words, no rename() could be optionally skipped. [1] https://public-inbox.org/git/20161128094319.16176-11-pclo...@gmail.com/T/#u > +{ > + char *old_git_dir; > + const char *new_git_dir; > + const struct submodule *sub; > + > + old_git_dir = xstrfmt("%s/.git", path); > + if (read_gitfile(old_git_dir)) > + /* If it is an actual gitfile, it doesn't need migration. */ > + goto out; > + > + sub = submodule_from_path(null_sha1, path); > + if (!sub) > + die(_("Could not lookup name for submodule '%s'"), > + path); > + > + new_git_dir = git_common_path("modules/%s", sub->name); Why doesn't git_path() work here? This would make "modules" shared between worktrees, even though it's not normally. That inconsistency could cause trouble. > + if (safe_create_leading_directories_const(new_git_dir) < 0) > + die(_("could not create directory '%s'"), new_git_dir); > + > + if (!prefix) > + prefix = get_super_prefix(); > + printf("Migrating git directory of %s%s from\n'%s' to\n'%s'\n", > + prefix ? prefix : "", path, > + real_path(old_git_dir), new_git_dir); > + > + if (rename(old_git_dir, new_git_dir) < 0) > + die_errno(_("Could not migrate git directory from '%s' to > '%s'"), > + old_git_dir, new_git_dir); > + > + connect_work_tree_and_git_dir(path, new_git_dir); Another thing in common is, both submodules and worktrees use some form of textual symlinks. You need to fix up some here. But if this submodule has multiple worktreee, there may be some "symlinks" in .git/worktrees which would need fixing up as well. You don't have to do the fix up thing right away, but I think we should at least make sure we leave no dangling links behind (by die()ing early if we find a .git dir we can't handle yet) -- Duy
Re: [PATCH] difftool.c: mark a file-local symbol with static
Hi Ramsay, On Tue, 29 Nov 2016, Ramsay Jones wrote: > If you need to re-roll your 'js/difftool-builtin' branch, could > you please squash this into the relevant patch. Fixed. Thanks! > Also, due to a problem in my config.mak file on Linux (a commented > out line that had a line continuation '\', gr!), gcc issued a > warning, thus: > > builtin/difftool.c: In function ‘run_dir_diff’: > builtin/difftool.c:568:13: warning: zero-length gnu_printf format string > [-Wformat-zero-length] >warning(""); >^ > I am not sure why -Wno-format-zero-length is set in DEVELOPER_CFLAGS, > but do you really need to space the output with an an 'empty' > "warning:" line? (Just curious). That `warning("");` comes from a straight-forward port of this line (see https://github.com/git/git/blob/v2.11.0/git-difftool.perl#L425): $errmsg .= "warning:\n"; I could see two possible ways out: - warning("%s", ""); (ugly!) - do away with the "prefix every line with warning:" convention and simply have a multi-line `warning(_("...\n...\n"), ...)` What do you think? Dscho