Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
On Fri, 30 Oct 2015 at 22:46:19, Jeff King wrote: > On Fri, Oct 30, 2015 at 02:31:28PM -0700, Junio C Hamano wrote: > > > Lukas Fleischerwrites: > > > > > 1. There does not seem to be a way to pass configuration parameters to > > >git-shell commands. Right now, the only way to work around this seems > > >to write a wrapper script around git-shell that catches > > >git-receive-pack commands and executes something like > > > > > >git -c receive.hideRefs=[...] receive-pack [...] > > > > > >instead of forwarding those commands to git-shell. > > > > This part we have never discussed in the thread, I think. Why do > > you need to override, instead of having these in the repository's > > config files? > > > > Is it because a repository may host multiple pseudo repositories in > > the form of "namespaces" but they must share the same config file, > > and you would want to customize per "namespace"? > > Yes. As I said in the original thread, I want to set receive.hideRefs to hide everything outside the current namespace, i.e. something equivalent to git -c receive.hideRefs='refs/' -c receive.hideRefs="!refs/namespaces/$foo" receive-pack /some/path if receive.hideRefs would work with absolute (unstripped) namespaces. > > For that we may want to enhance the [include] mechanism. Something > > like > > > > [include "namespace=foo"] > > path = /path/to/foo/specific/config.txt > > > > [include "namespace=bar"] > > path = /path/to/bar/specific/config.txt > > > > Cc'ing Peff as we have discussed this kind of conditional inclusion > > in the past... > That would work but it would still be very cumbersome. Imagine that there is a single repository with 10 pseudo repositories inside. You really don't want to create a config file and a indirection in the main configuration for each of these pseudo repositories, just to build a configuration equivalent to the single line I described above. > [...] > I am slightly confused, though, where the namespace is set in such a > git-shell example. I have no really used ref namespaces myself, but my > understanding is that they have to come from the environment. You can > similarly set config through the environment. I don't think we've ever > publicized that, but it is how "git -c" works. E.g.: > > $ git -c alias.foo='!env' -c another.option=true foo | grep GIT_ > GIT_CONFIG_PARAMETERS='alias.foo='\!'env' 'another.option=true' > [...] Yes, the Git namespace is passed through the environment by setting GIT_NAMESPACE and GIT_CONFIG_PARAMETERS is exactly what I was looking for! Thanks! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
Lukas Fleischerwrites: > 1. There does not seem to be a way to pass configuration parameters to >git-shell commands. Right now, the only way to work around this seems >to write a wrapper script around git-shell that catches >git-receive-pack commands and executes something like > >git -c receive.hideRefs=[...] receive-pack [...] > >instead of forwarding those commands to git-shell. This part we have never discussed in the thread, I think. Why do you need to override, instead of having these in the repository's config files? Is it because a repository may host multiple pseudo repositories in the form of "namespaces" but they must share the same config file, and you would want to customize per "namespace"? For that we may want to enhance the [include] mechanism. Something like [include "namespace=foo"] path = /path/to/foo/specific/config.txt [include "namespace=bar"] path = /path/to/bar/specific/config.txt Cc'ing Peff as we have discussed this kind of conditional inclusion in the past... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
On Fri, Oct 30, 2015 at 02:31:28PM -0700, Junio C Hamano wrote: > Lukas Fleischerwrites: > > > 1. There does not seem to be a way to pass configuration parameters to > >git-shell commands. Right now, the only way to work around this seems > >to write a wrapper script around git-shell that catches > >git-receive-pack commands and executes something like > > > >git -c receive.hideRefs=[...] receive-pack [...] > > > >instead of forwarding those commands to git-shell. > > This part we have never discussed in the thread, I think. Why do > you need to override, instead of having these in the repository's > config files? > > Is it because a repository may host multiple pseudo repositories in > the form of "namespaces" but they must share the same config file, > and you would want to customize per "namespace"? > > For that we may want to enhance the [include] mechanism. Something > like > > [include "namespace=foo"] > path = /path/to/foo/specific/config.txt > > [include "namespace=bar"] > path = /path/to/bar/specific/config.txt > > Cc'ing Peff as we have discussed this kind of conditional inclusion > in the past... Yeah, that sort of conditional matching is exactly what I had intended for the "subsection" of include to be. We just haven't come up with a good condition to act as our first use case. :) I am happy with any syntax that does not paint us into a corner (and your example above looks fine, assuming we could later add other keys on the left-hand of the "="). I am slightly confused, though, where the namespace is set in such a git-shell example. I have no really used ref namespaces myself, but my understanding is that they have to come from the environment. You can similarly set config through the environment. I don't think we've ever publicized that, but it is how "git -c" works. E.g.: $ git -c alias.foo='!env' -c another.option=true foo | grep GIT_ GIT_CONFIG_PARAMETERS='alias.foo='\!'env' 'another.option=true' I think it is very particular that you single-quote each item, though: $ GIT_CONFIG_PARAMETERS=foo.bar=true git config foo.bar error: bogus format in GIT_CONFIG_PARAMETERS fatal: unable to parse command-line config $ GIT_CONFIG_PARAMETERS="'foo.bar=true'" git config foo.bar true So we may want to make it a little more friendly before truly recommending it as an interface, but I don't think there is any conceptual problem with doing so. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
On Wed, Oct 28, 2015 at 08:00:45AM +0100, Lukas Fleischer wrote: > My original question remains: Do we want to continue supporting things > like transfer.hideRefs=.have (which currently magically hides all refs > outside the current namespace)? For 100% backwards compatibility, we > would have to. On the other hand, one could consider the current > behavior a bug and one could argue that it is weird enough that probably > nobody (apart from me) relies on it right now. If we decide to keep it > anyway, I think it should be documented. I don't think that hiding ".have" refs at that level is especially useful. I do not use namespaces, but I do use alternates extensively, and that is the original source of these ".have" refs. But filtering them at the advertisement layer is very inefficient, as it is expensive to get the list in the first place (we spawn ls-remote, which spawns upload-pack in the alternate!). So we'd want to prevent that process much earlier. I have an unpublished patch to specially disable alternates advertisement entirely (i.e., adding a new boolean config, receive.advertiseAlternates). In my case, it is because the alternates repositories have huge numbers of refs (sometimes ranging into the gigabytes) and the performance hit on even loading that packed-refs file is too large. I suppose that behavior _could_ be triggered by ".have" appearing in the hiderefs config, though (i.e., before accessing the alternate, check ref_is_hidden(".have")). That seems a bit too subtle to me, though. > Another patch I have in my patch queue adds support for a whitelist mode > to hideRefs. There are several ways to implement that: > > 1. Make transfer.hideRefs='' hide all refs (it currently does not). The >user can then whitelist refs explicitly using negative patterns >below that rule. This is how my current implementation works. Using >the empty string seemed most natural since hideRefs matches prefixes >and every string has the empty string as a prefix. If that seems too >weird, we could probably special case something like >transfer.hideRefs='*' instead. > > 2. Detect whether hideRefs only contains negative patterns. Switch to >whitelist mode ("hide by default") in that case. > > 3. Add another option to switch between "hide by default" and "show by >default". > > I personally prefer the first option. Any other opinions? I am just a bystander and would not use this myself, but I think the 1st is the least ugly. I am not sure why ignoring "refs/" does not work, though (it does not catch ".have", of course, but I think that is a feature; there are a finite set of pseudo-refs, so you can ignore those, too, if you want). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
On Tue, 27 Oct 2015 at 19:18:26, Junio C Hamano wrote: > [...] > When I asked 'Is transfer.hiderefs insufficient?', I wasn't > expecting it to be usable out of box. It was a suggestion to build > on top of it, instead of adding a parallel support for something > specific to namespaces. > Agreed, and I do have a couple of patches to improve hideRefs. I still have some questions before submitting them, though. See below. > For example, if the problem is that you cannot tell ref_is_hidden() > what namespace the ref is from because it is called after running > strip_namespace(), perhaps you can find a way to have the original > "namespaced ref" specified on transfer.hiderefs and match them? > Then in repository for project A, namespaced refs for project B can > be excluded by specifying refs/namespaces/B/* on transfer.hiderefs. > > Perhaps along the lines of this? > [...] My original question remains: Do we want to continue supporting things like transfer.hideRefs=.have (which currently magically hides all refs outside the current namespace)? For 100% backwards compatibility, we would have to. On the other hand, one could consider the current behavior a bug and one could argue that it is weird enough that probably nobody (apart from me) relies on it right now. If we decide to keep it anyway, I think it should be documented. Another patch I have in my patch queue adds support for a whitelist mode to hideRefs. There are several ways to implement that: 1. Make transfer.hideRefs='' hide all refs (it currently does not). The user can then whitelist refs explicitly using negative patterns below that rule. This is how my current implementation works. Using the empty string seemed most natural since hideRefs matches prefixes and every string has the empty string as a prefix. If that seems too weird, we could probably special case something like transfer.hideRefs='*' instead. 2. Detect whether hideRefs only contains negative patterns. Switch to whitelist mode ("hide by default") in that case. 3. Add another option to switch between "hide by default" and "show by default". I personally prefer the first option. Any other opinions? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
Lukas Fleischerwrites: > Another patch I have in my patch queue adds support for a whitelist mode > to hideRefs. There are several ways to implement that: > > 1. Make transfer.hideRefs='' hide all refs (it currently does not). The Hmph, that even sounds like a bug. parse_hide_refs_config() does not seem to reject ref[] whose length is zero, and ref_is_hidden() would just check "starts_with(refname, match)" with an empty string as "match", so I would naively have expected that to work already. Ahh, there is "if refname[len] is at the end or slash boundary" check after that. You're right--you'd need to tweak that one for it to work. >user can then whitelist refs explicitly using negative patterns >below that rule. This is how my current implementation works. That sounds like a good way to go. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
Lukas Fleischerwrites: > 2. transfer.hideRefs and receive.hideRefs do not seem to work with Git >namespaces in general. show_ref_cb() replaces each ref outside the >current namespace with ".have" before passing it to show_ref() which >in turn performs the ref_is_hidden() check. This has the nice side >effect that receive.hideRefs=.have does exactly what I want, however >it also means that hideRefs feature does not allow for excluding only >specific tags outside the current namespace. Is that intended? Can we >rely on Git always looking for ".have" in the hideRefs list in this >case? When I asked 'Is transfer.hiderefs insufficient?', I wasn't expecting it to be usable out of box. It was a suggestion to build on top of it, instead of adding a parallel support for something specific to namespaces. For example, if the problem is that you cannot tell ref_is_hidden() what namespace the ref is from because it is called after running strip_namespace(), perhaps you can find a way to have the original "namespaced ref" specified on transfer.hiderefs and match them? Then in repository for project A, namespaced refs for project B can be excluded by specifying refs/namespaces/B/* on transfer.hiderefs. Perhaps along the lines of this? builtin/receive-pack.c | 9 + 1 file changed, 9 insertions(+) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index bcb624b..db0a99d 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -221,6 +221,15 @@ static void show_ref(const char *path, const unsigned char *sha1) static int show_ref_cb(const char *path, const struct object_id *oid, int flag, void *unused) { + const char *ns = get_git_namespace(); + + /* +* Give the "hiderefs" mechanism a chance to inspect and +* reject the namespaced ref itself. +*/ + if (ns[0] && ref_is_hidden(path)) + return 0; + path = strip_namespace(path); /* * Advertise refs outside our current namespace as ".have" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
On Tue, 27 Oct 2015 at 06:59:11, Lukas Fleischer wrote: > [...] > On second thought, it might be possible to overwrite the value of > transfer.hiderefs using the -c command line option. If we combine that > with the negative patterns supported by hiderefs, we might get a > solution that is clean and that avoids race conditions. I will check > whether that works with git-http-backend as well and will report back. > > Thanks for the pointer! Using receive.hideRefs seems to work but there are two minor issues: 1. There does not seem to be a way to pass configuration parameters to git-shell commands. Right now, the only way to work around this seems to write a wrapper script around git-shell that catches git-receive-pack commands and executes something like git -c receive.hideRefs=[...] receive-pack [...] instead of forwarding those commands to git-shell. How about allowing to overwrite configuration parameters via an environment variable? Has that been discussed before? 2. transfer.hideRefs and receive.hideRefs do not seem to work with Git namespaces in general. show_ref_cb() replaces each ref outside the current namespace with ".have" before passing it to show_ref() which in turn performs the ref_is_hidden() check. This has the nice side effect that receive.hideRefs=.have does exactly what I want, however it also means that hideRefs feature does not allow for excluding only specific tags outside the current namespace. Is that intended? Can we rely on Git always looking for ".have" in the hideRefs list in this case? Should the documentation be updated? Regards, Lukas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
Right now, we always advertise all refs as ".have", even those outside the current namespace. This leads to problems when trying to push to a repository with a huge number of namespaces from a slow connection. Add a configuration option receive.advertiseAllRefs that can be used to determine whether refs outside the current namespace should be advertised or not. Signed-off-by: Lukas Fleischer--- We are using Git namespaces to store a huge number of (virtual) repositories inside a shared repository. While the blobs in the virtual repositories are fairly similar, they do not share any refs, so advertising any refs outside the current namespace is undesirable. See the discussion on [1] for details. Note that this patch is just a draft: I didn't do any testing, apart from checking that it compiles. I would like to hear some opinions before sending a polished version. Is our use case considered common enough to justify the inclusion of such a configuration option in mainline? Are there suggestions for a better name for the option? Ideally, it should contain the word "namespace" but I could not come up with something sensible that is short enough. [1] https://lists.archlinux.org/pipermail/aur-general/2015-October/031596.html Documentation/config.txt | 6 ++ builtin/receive-pack.c | 31 +-- 2 files changed, 27 insertions(+), 10 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 315f271..aa101a7 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2201,6 +2201,12 @@ receive.advertiseAtomic:: capability to its clients. If you don't want to this capability to be advertised, set this variable to false. +receive.advertiseAllRefs:: + By default, git-receive-pack will advertise all refs, even those + outside the current namespace, so that the client can use them to + minimize data transfer. If you only want to advertise refs from the + active namespace to be advertised, set this variable to false. + receive.autogc:: By default, git-receive-pack will run "git-gc --auto" after receiving data from git-push and updating refs. You can stop diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index e6b93d0..ea9a820 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -41,6 +41,7 @@ static struct strbuf fsck_msg_types = STRBUF_INIT; static int receive_unpack_limit = -1; static int transfer_unpack_limit = -1; static int advertise_atomic_push = 1; +static int advertise_all_refs = 1; static int unpack_limit = 100; static int report_status; static int use_sideband; @@ -190,6 +191,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb) return 0; } + if (strcmp(var, "receive.advertiseallrefs") == 0) { + advertise_all_refs = git_config_bool(var, value); + return 0; + } + return git_default_config(var, value, cb); } @@ -222,16 +228,21 @@ static void show_ref(const char *path, const unsigned char *sha1) static int show_ref_cb(const char *path, const struct object_id *oid, int flag, void *unused) { path = strip_namespace(path); - /* -* Advertise refs outside our current namespace as ".have" -* refs, so that the client can use them to minimize data -* transfer but will otherwise ignore them. This happens to -* cover ".have" that are thrown in by add_one_alternate_ref() -* to mark histories that are complete in our alternates as -* well. -*/ - if (!path) - path = ".have"; + if (!path) { + if (advertise_all_refs) { + /* +* Advertise refs outside our current namespace as +* ".have" refs, so that the client can use them to +* minimize data transfer but will otherwise ignore +* them. This happens to cover ".have" that are thrown +* in by add_one_alternate_ref() to mark histories that +* are complete in our alternates as well. +*/ + path = ".have"; + } else { + return 0; + } + } show_ref(path, oid->hash); return 0; } -- 2.6.2 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace
Is there a reason why transfer.hiderefs is not sufficient? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html