Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-31 Thread Lukas Fleischer
On Fri, 30 Oct 2015 at 22:46:19, Jeff King wrote:
> On Fri, Oct 30, 2015 at 02:31:28PM -0700, Junio C Hamano wrote:
> 
> > Lukas Fleischer  writes:
> > 
> > > 1. There does not seem to be a way to pass configuration parameters to
> > >git-shell commands. Right now, the only way to work around this seems
> > >to write a wrapper script around git-shell that catches
> > >git-receive-pack commands and executes something like
> > >
> > >git -c receive.hideRefs=[...] receive-pack [...]
> > >
> > >instead of forwarding those commands to git-shell.
> > 
> > This part we have never discussed in the thread, I think.  Why do
> > you need to override, instead of having these in the repository's
> > config files?
> > 
> > Is it because a repository may host multiple pseudo repositories in
> > the form of "namespaces" but they must share the same config file,
> > and you would want to customize per "namespace"?
> > 

Yes. As I said in the original thread, I want to set receive.hideRefs to
hide everything outside the current namespace, i.e. something equivalent
to

git -c receive.hideRefs='refs/' -c receive.hideRefs="!refs/namespaces/$foo" 
receive-pack /some/path

if receive.hideRefs would work with absolute (unstripped) namespaces.

> > For that we may want to enhance the [include] mechanism.  Something
> > like
> > 
> >   [include "namespace=foo"]
> >   path = /path/to/foo/specific/config.txt
> > 
> >   [include "namespace=bar"]
> >   path = /path/to/bar/specific/config.txt
> > 
> > Cc'ing Peff as we have discussed this kind of conditional inclusion
> > in the past...
> 

That would work but it would still be very cumbersome. Imagine that
there is a single repository with 10 pseudo repositories inside. You
really don't want to create a config file and a indirection in the main
configuration for each of these pseudo repositories, just to build a
configuration equivalent to the single line I described above.

> [...]
> I am slightly confused, though, where the namespace is set in such a
> git-shell example. I have no really used ref namespaces myself, but my
> understanding is that they have to come from the environment. You can
> similarly set config through the environment. I don't think we've ever
> publicized that, but it is how "git -c" works. E.g.:
> 
>   $ git -c alias.foo='!env' -c another.option=true foo | grep GIT_
>   GIT_CONFIG_PARAMETERS='alias.foo='\!'env' 'another.option=true'
> [...]

Yes, the Git namespace is passed through the environment by setting
GIT_NAMESPACE and GIT_CONFIG_PARAMETERS is exactly what I was looking
for! Thanks!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-30 Thread Junio C Hamano
Lukas Fleischer  writes:

> 1. There does not seem to be a way to pass configuration parameters to
>git-shell commands. Right now, the only way to work around this seems
>to write a wrapper script around git-shell that catches
>git-receive-pack commands and executes something like
>
>git -c receive.hideRefs=[...] receive-pack [...]
>
>instead of forwarding those commands to git-shell.

This part we have never discussed in the thread, I think.  Why do
you need to override, instead of having these in the repository's
config files?

Is it because a repository may host multiple pseudo repositories in
the form of "namespaces" but they must share the same config file,
and you would want to customize per "namespace"?

For that we may want to enhance the [include] mechanism.  Something
like

[include "namespace=foo"]
path = /path/to/foo/specific/config.txt

[include "namespace=bar"]
path = /path/to/bar/specific/config.txt

Cc'ing Peff as we have discussed this kind of conditional inclusion
in the past...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-30 Thread Jeff King
On Fri, Oct 30, 2015 at 02:31:28PM -0700, Junio C Hamano wrote:

> Lukas Fleischer  writes:
> 
> > 1. There does not seem to be a way to pass configuration parameters to
> >git-shell commands. Right now, the only way to work around this seems
> >to write a wrapper script around git-shell that catches
> >git-receive-pack commands and executes something like
> >
> >git -c receive.hideRefs=[...] receive-pack [...]
> >
> >instead of forwarding those commands to git-shell.
> 
> This part we have never discussed in the thread, I think.  Why do
> you need to override, instead of having these in the repository's
> config files?
> 
> Is it because a repository may host multiple pseudo repositories in
> the form of "namespaces" but they must share the same config file,
> and you would want to customize per "namespace"?
> 
> For that we may want to enhance the [include] mechanism.  Something
> like
> 
>   [include "namespace=foo"]
>   path = /path/to/foo/specific/config.txt
> 
>   [include "namespace=bar"]
>   path = /path/to/bar/specific/config.txt
> 
> Cc'ing Peff as we have discussed this kind of conditional inclusion
> in the past...

Yeah, that sort of conditional matching is exactly what I had intended
for the "subsection" of include to be. We just haven't come up with a
good condition to act as our first use case. :)

I am happy with any syntax that does not paint us into a corner (and
your example above looks fine, assuming we could later add other keys on
the left-hand of the "=").

I am slightly confused, though, where the namespace is set in such a
git-shell example. I have no really used ref namespaces myself, but my
understanding is that they have to come from the environment. You can
similarly set config through the environment. I don't think we've ever
publicized that, but it is how "git -c" works. E.g.:

  $ git -c alias.foo='!env' -c another.option=true foo | grep GIT_
  GIT_CONFIG_PARAMETERS='alias.foo='\!'env' 'another.option=true'

I think it is very particular that you single-quote each item, though:

  $ GIT_CONFIG_PARAMETERS=foo.bar=true git config foo.bar
  error: bogus format in GIT_CONFIG_PARAMETERS
  fatal: unable to parse command-line config

  $ GIT_CONFIG_PARAMETERS="'foo.bar=true'" git config foo.bar
  true

So we may want to make it a little more friendly before truly
recommending it as an interface, but I don't think there is any
conceptual problem with doing so.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-28 Thread Jeff King
On Wed, Oct 28, 2015 at 08:00:45AM +0100, Lukas Fleischer wrote:

> My original question remains: Do we want to continue supporting things
> like transfer.hideRefs=.have (which currently magically hides all refs
> outside the current namespace)? For 100% backwards compatibility, we
> would have to. On the other hand, one could consider the current
> behavior a bug and one could argue that it is weird enough that probably
> nobody (apart from me) relies on it right now. If we decide to keep it
> anyway, I think it should be documented.

I don't think that hiding ".have" refs at that level is especially
useful. I do not use namespaces, but I do use alternates extensively,
and that is the original source of these ".have" refs. But filtering
them at the advertisement layer is very inefficient, as it is expensive
to get the list in the first place (we spawn ls-remote, which spawns
upload-pack in the alternate!). So we'd want to prevent that process
much earlier.

I have an unpublished patch to specially disable alternates
advertisement entirely (i.e., adding a new boolean config,
receive.advertiseAlternates). In my case, it is because the alternates
repositories have huge numbers of refs (sometimes ranging into the
gigabytes) and the performance hit on even loading that packed-refs file
is too large.

I suppose that behavior _could_ be triggered by ".have" appearing in the
hiderefs config, though (i.e., before accessing the alternate, check
ref_is_hidden(".have")). That seems a bit too subtle to me, though.

> Another patch I have in my patch queue adds support for a whitelist mode
> to hideRefs. There are several ways to implement that:
> 
> 1. Make transfer.hideRefs='' hide all refs (it currently does not). The
>user can then whitelist refs explicitly using negative patterns
>below that rule. This is how my current implementation works. Using
>the empty string seemed most natural since hideRefs matches prefixes
>and every string has the empty string as a prefix. If that seems too
>weird, we could probably special case something like
>transfer.hideRefs='*' instead.
> 
> 2. Detect whether hideRefs only contains negative patterns. Switch to
>whitelist mode ("hide by default") in that case.
> 
> 3. Add another option to switch between "hide by default" and "show by
>default".
> 
> I personally prefer the first option. Any other opinions?

I am just a bystander and would not use this myself, but I think the 1st
is the least ugly. I am not sure why ignoring "refs/" does not work,
though (it does not catch ".have", of course, but I think that is a
feature; there are a finite set of pseudo-refs, so you can ignore those,
too, if you want).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-28 Thread Lukas Fleischer
On Tue, 27 Oct 2015 at 19:18:26, Junio C Hamano wrote:
> [...]
> When I asked 'Is transfer.hiderefs insufficient?', I wasn't
> expecting it to be usable out of box.  It was a suggestion to build
> on top of it, instead of adding a parallel support for something
> specific to namespaces.
> 

Agreed, and I do have a couple of patches to improve hideRefs. I still
have some questions before submitting them, though. See below.

> For example, if the problem is that you cannot tell ref_is_hidden()
> what namespace the ref is from because it is called after running
> strip_namespace(), perhaps you can find a way to have the original
> "namespaced ref" specified on transfer.hiderefs and match them?
> Then in repository for project A, namespaced refs for project B can
> be excluded by specifying refs/namespaces/B/* on transfer.hiderefs.
> 
> Perhaps along the lines of this?
> [...]

My original question remains: Do we want to continue supporting things
like transfer.hideRefs=.have (which currently magically hides all refs
outside the current namespace)? For 100% backwards compatibility, we
would have to. On the other hand, one could consider the current
behavior a bug and one could argue that it is weird enough that probably
nobody (apart from me) relies on it right now. If we decide to keep it
anyway, I think it should be documented.

Another patch I have in my patch queue adds support for a whitelist mode
to hideRefs. There are several ways to implement that:

1. Make transfer.hideRefs='' hide all refs (it currently does not). The
   user can then whitelist refs explicitly using negative patterns
   below that rule. This is how my current implementation works. Using
   the empty string seemed most natural since hideRefs matches prefixes
   and every string has the empty string as a prefix. If that seems too
   weird, we could probably special case something like
   transfer.hideRefs='*' instead.

2. Detect whether hideRefs only contains negative patterns. Switch to
   whitelist mode ("hide by default") in that case.

3. Add another option to switch between "hide by default" and "show by
   default".

I personally prefer the first option. Any other opinions?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-28 Thread Junio C Hamano
Lukas Fleischer  writes:

> Another patch I have in my patch queue adds support for a whitelist mode
> to hideRefs. There are several ways to implement that:
>
> 1. Make transfer.hideRefs='' hide all refs (it currently does not). The

Hmph, that even sounds like a bug.  parse_hide_refs_config() does
not seem to reject ref[] whose length is zero, and ref_is_hidden()
would just check "starts_with(refname, match)" with an empty string
as "match", so I would naively have expected that to work already.

Ahh, there is "if refname[len] is at the end or slash boundary"
check after that.  You're right--you'd need to tweak that one for it
to work.

>user can then whitelist refs explicitly using negative patterns
>below that rule. This is how my current implementation works.

That sounds like a good way to go.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-27 Thread Junio C Hamano
Lukas Fleischer  writes:

> 2. transfer.hideRefs and receive.hideRefs do not seem to work with Git
>namespaces in general. show_ref_cb() replaces each ref outside the
>current namespace with ".have" before passing it to show_ref() which
>in turn performs the ref_is_hidden() check. This has the nice side
>effect that receive.hideRefs=.have does exactly what I want, however
>it also means that hideRefs feature does not allow for excluding only
>specific tags outside the current namespace. Is that intended? Can we
>rely on Git always looking for ".have" in the hideRefs list in this
>case?

When I asked 'Is transfer.hiderefs insufficient?', I wasn't
expecting it to be usable out of box.  It was a suggestion to build
on top of it, instead of adding a parallel support for something
specific to namespaces.

For example, if the problem is that you cannot tell ref_is_hidden()
what namespace the ref is from because it is called after running
strip_namespace(), perhaps you can find a way to have the original
"namespaced ref" specified on transfer.hiderefs and match them?
Then in repository for project A, namespaced refs for project B can
be excluded by specifying refs/namespaces/B/* on transfer.hiderefs.

Perhaps along the lines of this?

 builtin/receive-pack.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index bcb624b..db0a99d 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -221,6 +221,15 @@ static void show_ref(const char *path, const unsigned char 
*sha1)
 
 static int show_ref_cb(const char *path, const struct object_id *oid, int 
flag, void *unused)
 {
+   const char *ns = get_git_namespace();
+
+   /*
+* Give the "hiderefs" mechanism a chance to inspect and
+* reject the namespaced ref itself.
+*/
+   if (ns[0] && ref_is_hidden(path))
+   return 0;
+
path = strip_namespace(path);
/*
 * Advertise refs outside our current namespace as ".have"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-27 Thread Lukas Fleischer
On Tue, 27 Oct 2015 at 06:59:11, Lukas Fleischer wrote:
> [...]
> On second thought, it might be possible to overwrite the value of
> transfer.hiderefs using the -c command line option. If we combine that
> with the negative patterns supported by hiderefs, we might get a
> solution that is clean and that avoids race conditions. I will check
> whether that works with git-http-backend as well and will report back.
> 
> Thanks for the pointer!

Using receive.hideRefs seems to work but there are two minor issues:

1. There does not seem to be a way to pass configuration parameters to
   git-shell commands. Right now, the only way to work around this seems
   to write a wrapper script around git-shell that catches
   git-receive-pack commands and executes something like
   
   git -c receive.hideRefs=[...] receive-pack [...]
   
   instead of forwarding those commands to git-shell. How about allowing
   to overwrite configuration parameters via an environment variable?
   Has that been discussed before?

2. transfer.hideRefs and receive.hideRefs do not seem to work with Git
   namespaces in general. show_ref_cb() replaces each ref outside the
   current namespace with ".have" before passing it to show_ref() which
   in turn performs the ref_is_hidden() check. This has the nice side
   effect that receive.hideRefs=.have does exactly what I want, however
   it also means that hideRefs feature does not allow for excluding only
   specific tags outside the current namespace. Is that intended? Can we
   rely on Git always looking for ".have" in the hideRefs list in this
   case? Should the documentation be updated?

Regards,
Lukas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-26 Thread Lukas Fleischer
Right now, we always advertise all refs as ".have", even those outside
the current namespace. This leads to problems when trying to push to a
repository with a huge number of namespaces from a slow connection.

Add a configuration option receive.advertiseAllRefs that can be used to
determine whether refs outside the current namespace should be
advertised or not.

Signed-off-by: Lukas Fleischer 
---
We are using Git namespaces to store a huge number of (virtual)
repositories inside a shared repository. While the blobs in the virtual
repositories are fairly similar, they do not share any refs, so
advertising any refs outside the current namespace is undesirable. See
the discussion on [1] for details.

Note that this patch is just a draft: I didn't do any testing, apart
from checking that it compiles. I would like to hear some opinions
before sending a polished version.

Is our use case considered common enough to justify the inclusion of
such a configuration option in mainline?

Are there suggestions for a better name for the option? Ideally, it
should contain the word "namespace" but I could not come up with
something sensible that is short enough.

[1] https://lists.archlinux.org/pipermail/aur-general/2015-October/031596.html

 Documentation/config.txt |  6 ++
 builtin/receive-pack.c   | 31 +--
 2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 315f271..aa101a7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2201,6 +2201,12 @@ receive.advertiseAtomic::
capability to its clients. If you don't want to this capability
to be advertised, set this variable to false.
 
+receive.advertiseAllRefs::
+   By default, git-receive-pack will advertise all refs, even those
+   outside the current namespace, so that the client can use them to
+   minimize data transfer. If you only want to advertise refs from the
+   active namespace to be advertised, set this variable to false.
+
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
receiving data from git-push and updating refs.  You can stop
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e6b93d0..ea9a820 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -41,6 +41,7 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_all_refs = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
@@ -190,6 +191,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertiseallrefs") == 0) {
+   advertise_all_refs = git_config_bool(var, value);
+   return 0;
+   }
+
return git_default_config(var, value, cb);
 }
 
@@ -222,16 +228,21 @@ static void show_ref(const char *path, const unsigned 
char *sha1)
 static int show_ref_cb(const char *path, const struct object_id *oid, int 
flag, void *unused)
 {
path = strip_namespace(path);
-   /*
-* Advertise refs outside our current namespace as ".have"
-* refs, so that the client can use them to minimize data
-* transfer but will otherwise ignore them. This happens to
-* cover ".have" that are thrown in by add_one_alternate_ref()
-* to mark histories that are complete in our alternates as
-* well.
-*/
-   if (!path)
-   path = ".have";
+   if (!path) {
+   if (advertise_all_refs) {
+   /*
+* Advertise refs outside our current namespace as
+* ".have" refs, so that the client can use them to
+* minimize data transfer but will otherwise ignore
+* them. This happens to cover ".have" that are thrown
+* in by add_one_alternate_ref() to mark histories that
+* are complete in our alternates as well.
+*/
+   path = ".have";
+   } else {
+   return 0;
+   }
+   }
show_ref(path, oid->hash);
return 0;
 }
-- 
2.6.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] receive-pack: allow for hiding refs outside the namespace

2015-10-26 Thread Junio C Hamano
Is there a reason why transfer.hiderefs is not sufficient?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html