Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Junio C Hamano
Jeff King writes:

> I get that you may have two different keys to go with two different
> identities on a remote system. But I'm not sure I understand why
> "sending" or "receiving" is the right way to split those up. Wouldn't
> you also sometimes want to fetch from repository X? IOW, wouldn't you
> want to tie identity "A" to repository "X", and "B" to repository "Y"?
>
>> So now I just have a GIT_SSH_COMMAND that dispatches to different keys
>> depending on the operation, as noted in the commit message, and I can
>> assure you that without that logic it doesn't work.
>
> You mentioned host aliases later, which is the solution I've seen in the
> wild. And then you can map each remote to a different host alias.

Yup, I do agree that it is exactly the established solution for this
kind of situation.


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Ævar Arnfjörð Bjarmason

On Thu, Jan 04 2018, Jeff King jotted:

> On Thu, Jan 04, 2018 at 11:10:17AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> That's badly explained, sorry, when I say "push" I mean "push and/or
>> pull".
>>
>> I don't know about Github, but on Gitlab when you provision a deploy key
>> and associate it with a repo it must be *globally* rw or ro, there's no
>> way to say on a per-repo basis whether it should be rw or ro.
>>
>> I have a job that's fetching a bunch of repos to review code in them
>> (for auditing purposes). It then commits the results of that review to
>> other git repos.
>>
>> Thus I want to have a ro key to all those reviewed repos, but rw keys to
>> the audit repo itself (and it'll also pull with the rw key).
>
> OK, that part makes sense to me.
>
> But I'm not sure how your patch solves it. When you "git fetch" on the
> audit repo, wouldn't your GIT_SSH_RECEIVE_COMMAND kick in and use the
> wrong key? What am I missing?

I add both the ro and rw key to some projects. Those are a tiny subset
of the overall number.


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Jeff King
On Thu, Jan 04, 2018 at 11:10:17AM +0100, Ævar Arnfjörð Bjarmason wrote:

> That's badly explained, sorry, when I say "push" I mean "push and/or
> pull".
> 
> I don't know about Github, but on Gitlab when you provision a deploy key
> and associate it with a repo it must be *globally* rw or ro, there's no
> way to say on a per-repo basis whether it should be rw or ro.
> 
> I have a job that's fetching a bunch of repos to review code in them
> (for auditing purposes). It then commits the results of that review to
> other git repos.
> 
> Thus I want to have a ro key to all those reviewed repos, but rw keys to
> the audit repo itself (and it'll also pull with the rw key).

OK, that part makes sense to me.

But I'm not sure how your patch solves it. When you "git fetch" on the
audit repo, wouldn't your GIT_SSH_RECEIVE_COMMAND kick in and use the
wrong key? What am I missing?

-Peff


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Ævar Arnfjörð Bjarmason

On Thu, Jan 04 2018, Jeff King jotted:

> On Thu, Jan 04, 2018 at 01:08:28AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Hopefully this is clearer, and depending on how the rest of the
>> discussion goes I'll submit v2 with something like this in the commit
>> message:
>>
>> SSH keys A and B are known to the remote service, and used to identify
>> two different users.
>>
>> A can only push to repository X, and B can only fetch from repository Y.
>>
>> Thus, if you have a script that does:
>>
>> GIT_SSH_COMMAND="ssh -i A -i B" git ...
>>
>> It'll always fail for pulling from X, and pushing to Y. Supply:
>>
>> GIT_SSH_COMMAND="ssh -i B -i A" git ...
>>
>> And now pulling will work, but pushing won't.
>
> I get that you may have two different keys to go with two different
> identities on a remote system. But I'm not sure I understand why
> "sending" or "receiving" is the right way to split those up. Wouldn't
> you also sometimes want to fetch from repository X? IOW, wouldn't you
> want to tie identity "A" to repository "X", and "B" to repository "Y"?

That's badly explained, sorry, when I say "push" I mean "push and/or
pull".

I don't know about Github, but on Gitlab when you provision a deploy key
and associate it with a repo it must be *globally* rw or ro, there's no
way to say on a per-repo basis whether it should be rw or ro.

I have a job that's fetching a bunch of repos to review code in them
(for auditing purposes). It then commits the results of that review to
other git repos.

Thus I want to have a ro key to all those reviewed repos, but rw keys to
the audit repo itself (and it'll also pull with the rw key).

Hence this patch, I thought *maybe* others would be interested in this
since it seems to me to be an easy thing to run into with these ssh-key
based hosting providers, but maybe not.

>> So now I just have a GIT_SSH_COMMAND that dispatches to different keys
>> depending on the operation, as noted in the commit message, and I can
>> assure you that without that logic it doesn't work.
>
> You mentioned host aliases later, which is the solution I've seen in the
> wild. And then you can map each remote to a different host alias.


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-03 Thread Jeff King
On Thu, Jan 04, 2018 at 01:08:28AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Hopefully this is clearer, and depending on how the rest of the
> discussion goes I'll submit v2 with something like this in the commit
> message:
> 
> SSH keys A and B are known to the remote service, and used to identify
> two different users.
> 
> A can only push to repository X, and B can only fetch from repository Y.
> 
> Thus, if you have a script that does:
> 
> GIT_SSH_COMMAND="ssh -i A -i B" git ...
> 
> It'll always fail for pulling from X, and pushing to Y. Supply:
> 
> GIT_SSH_COMMAND="ssh -i B -i A" git ...
> 
> And now pulling will work, but pushing won't.

I get that you may have two different keys to go with two different
identities on a remote system. But I'm not sure I understand why
"sending" or "receiving" is the right way to split those up. Wouldn't
you also sometimes want to fetch from repository X? IOW, wouldn't you
want to tie identity "A" to repository "X", and "B" to repository "Y"?

> So now I just have a GIT_SSH_COMMAND that dispatches to different keys
> depending on the operation, as noted in the commit message, and I can
> assure you that without that logic it doesn't work.

You mentioned host aliases later, which is the solution I've seen in the
wild. And then you can map each remote to a different host alias.

-Peff
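
For reference, the host-alias setup mentioned above might look roughly
like the sketch below. The alias names (review-ro, audit-rw), the host
gitlab.example.com, the key paths and the repository paths are all made
up for illustration; Host, HostName, User, IdentityFile and
IdentitiesOnly are standard ssh_config keywords. The snippet writes to a
scratch file rather than touching ~/.ssh/config.

```shell
#!/bin/sh
# Sketch only: aliases, host name and key paths are hypothetical.
scratch=$(mktemp -d)
cat >"$scratch/ssh_config" <<'EOF'
# Read-only identity, for repositories that are only fetched.
Host review-ro
    HostName gitlab.example.com
    User git
    IdentityFile ~/.ssh/id_review_ro
    IdentitiesOnly yes

# Read-write identity, for the audit repository.
Host audit-rw
    HostName gitlab.example.com
    User git
    IdentityFile ~/.ssh/id_audit_rw
    IdentitiesOnly yes
EOF

# Each remote then names an alias instead of the real host, e.g.:
#   git remote set-url origin review-ro:group/reviewed.git
#   git remote set-url origin audit-rw:group/audit.git
echo "wrote $scratch/ssh_config"
```

With IdentitiesOnly set, ssh should offer only the key named for that
alias, so the server never sees the "wrong" identity first.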


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-03 Thread Ævar Arnfjörð Bjarmason

On Wed, Jan 03 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason writes:
>
>> This is useful for talking to systems such as Github or Gitlab that
>> identify user accounts (or deploy keys) by ssh keys. Normally, ssh
>> could do this itself by supplying multiple keys via -i, but that trick
>> doesn't work on these systems as the connection will have already been
>> accepted when the "wrong" key gets rejected.
>
> You need to explain this a lot better than the above.
>
> I am sure systems such as Github have more than dozens of users who
> push over ssh and these users identify themselves by which key to
> use when establishing connection just fine (presumably by using a
> "Host" entry for the github URL in ~/.ssh/config), and presumably we
> are not sending "wrong" keys over there.  So there needs to be a lot
> more clear description of the problem you are trying to solve in the
> first place.

Hopefully this is clearer, and depending on how the rest of the
discussion goes I'll submit v2 with something like this in the commit
message:

SSH keys A and B are known to the remote service, and used to identify
two different users.

A can only push to repository X, and B can only fetch from repository Y.

Thus, if you have a script that does:

GIT_SSH_COMMAND="ssh -i A -i B" git ...

It'll always fail for pulling from X, and pushing to Y. Supply:

GIT_SSH_COMMAND="ssh -i B -i A" git ...

And now pulling will work, but pushing won't.

If you were to do, where C is a completely unknown key:

GIT_SSH_COMMAND="ssh -i C -i A" git push X ...

It would work, since ssh wouldn't get far enough in the key negotiation
to drop you into a shell. This is the case you had in mind, but is
unrelated to the problem I'm trying to address.

I tested this on a Gitlab instance, but as far as I know this property
is going to be intrinsic to anything that uses ssh in this way,
i.e. once you get past the step where the server says "this key is OK"
and drops you into a shell, it's not going to retry the whole
negotiation with another key just because the command you ran exited
with non-zero.

So now I just have a GIT_SSH_COMMAND that dispatches to different keys
depending on the operation, as noted in the commit message, and I can
assure you that without that logic it doesn't work.

I thought that use-case might be useful enough to be natively supported,
since right now you either need to hack it up like that, or perform
similar hacks with url/pushurl and ssh host aliases in your config.
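
As an illustration of the "dispatch on the operation" workaround
described above, here is a minimal shell sketch. It is not the script
actually in use; it mirrors the Perl wrapper from the original patch
message. The function names are hypothetical, the key paths are the
placeholders from that example, and matching on the last argument
(rather than the second) is an assumption based on the remote command
being the final argument in the GIT_TRACE output.

```shell
#!/bin/sh
# Sketch only. Key paths are placeholders; function names are hypothetical.
RO_KEY=/some/ro/key
RW_KEY=/some/rw/key

# Print the identity file matching a remote command, or fail if the
# command is neither an upload-pack nor a receive-pack invocation.
pick_identity() {
    case "$1" in
    "git-upload-pack "*)  printf '%s\n' "$RO_KEY" ;;
    "git-receive-pack "*) printf '%s\n' "$RW_KEY" ;;
    *)                    return 1 ;;
    esac
}

# Run ssh with the identity chosen from the last argument; if nothing
# matches (e.g. the "-G" probe), run ssh with the arguments unchanged.
ssh_dispatch() {
    last=
    for last in "$@"; do :; done
    if key=$(pick_identity "$last"); then
        exec ssh -i "$key" "$@"
    fi
    exec ssh "$@"
}
```

Saved as a script whose final line is ssh_dispatch "$@", this could be
pointed to by GIT_SSH_COMMAND. A custom --upload-pack or --receive-pack
command would not match these patterns and would fall through to plain
ssh.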


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-03 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason writes:

> This is useful for talking to systems such as Github or Gitlab that
> identify user accounts (or deploy keys) by ssh keys. Normally, ssh
> could do this itself by supplying multiple keys via -i, but that trick
> doesn't work on these systems as the connection will have already been
> accepted when the "wrong" key gets rejected.

You need to explain this a lot better than the above.  

I am sure systems such as Github have more than dozens of users who
push over ssh and these users identify themselves by which key to
use when establishing connection just fine (presumably by using a
"Host" entry for the github URL in ~/.ssh/config), and presumably we
are not sending "wrong" keys over there.  So there needs to be a lot
more clear description of the problem you are trying to solve in the
first place.


[RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-03 Thread Ævar Arnfjörð Bjarmason
Amend the long-standing logic for overriding the ssh command with
GIT_SSH or GIT_SSH_COMMAND to also support
e.g. GIT_SSH_SEND_COMMAND. The new specific send/receive variables
take priority over the old ones, and fall back to the older ones if
they exist.
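
The lookup order described above can be sketched in shell as follows.
This models only the *_COMMAND environment variables named in this
message; the config keys and plain GIT_SSH handling touched by the diff
are left out, and the function name pick_ssh_command is hypothetical.

```shell
#!/bin/sh
# pick_ssh_command DIRECTION, where DIRECTION is "send" or "receive".
# Prints the command that would be chosen, or fails if none is set.
# A variable that is set but empty still counts, as with getenv().
pick_ssh_command() {
    case "$1" in
    send)
        if [ -n "${GIT_SSH_SEND_COMMAND+set}" ]; then
            printf '%s\n' "$GIT_SSH_SEND_COMMAND"
            return 0
        fi
        ;;
    receive)
        if [ -n "${GIT_SSH_RECEIVE_COMMAND+set}" ]; then
            printf '%s\n' "$GIT_SSH_RECEIVE_COMMAND"
            return 0
        fi
        ;;
    esac
    # No direction-specific variable: fall back to the generic one.
    if [ -n "${GIT_SSH_COMMAND+set}" ]; then
        printf '%s\n' "$GIT_SSH_COMMAND"
        return 0
    fi
    return 1
}
```

For example, with GIT_SSH_COMMAND and GIT_SSH_SEND_COMMAND both set, a
push would use the latter while a fetch would still use the former.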

This is useful for talking to systems such as Github or Gitlab that
identify user accounts (or deploy keys) by ssh keys. Normally, ssh
could do this itself by supplying multiple keys via -i, but that trick
doesn't work on these systems as the connection will have already been
accepted when the "wrong" key gets rejected.

This new feature is redundant to and less general than setting the
GIT_SSH_COMMAND to the path of a script that's going to dispatch to
ssh depending on what the second argument to the script is, e.g.:

$ cat ./git-ssh-command
#!/usr/bin/perl
if ($ARGV[1] =~ /^git-upload-pack /) {
   system qw(ssh -i /some/ro/key) => @ARGV;
} elsif ($ARGV[1] =~ /^git-receive-pack /) {
   system qw(ssh -i /some/rw/key) => @ARGV;
} else { ... }
$ GIT_TRACE=1 GIT_SSH_COMMAND="./git-ssh-command" git fetch
10:22:39.415684 git.c:344   trace: built-in: git 'fetch'
10:22:39.432192 run-command.c:627   trace: run_command: './git-ssh-command' '-G' 'g...@github.com'
10:22:39.434156 run-command.c:627   trace: run_command: './git-ssh-command' 'g...@github.com' 'git-upload-pack '\''git/git.git'\'''
Warning: Identity file /some/ro/key not accessible: No such file or directory.

However, I feel that this is a common enough case to be worth
supporting explicitly, and such a script will also need to deal with
arbitrary arguments fed via git-fetch's --upload-pack="...", and
git-push's corresponding --receive-pack argument.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

I'm not 100% sure about this one myself, but am leaning towards
inclusion for the reasons explained above, and the patch is trivial
enough that I think we can discuss whether it's worthwhile without
test / documentation.

 builtin/fetch-pack.c |  2 +-
 builtin/send-pack.c  |  3 ++-
 connect.c            | 21 ++++++++++++++++++---
 connect.h            |  2 ++
 4 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f9..dae10f8419 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -189,7 +189,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (args.diag_url)
 			flags |= CONNECT_DIAG_URL;
 		conn = git_connect(fd, dest, args.uploadpack,
-				   flags);
+				   flags | CONNECT_RECEIVE);
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5fb..2374d2b29c 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -252,8 +252,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 		fd[0] = 0;
 		fd[1] = 1;
 	} else {
+		int flags = args.verbose ? CONNECT_VERBOSE : 0;
 		conn = git_connect(fd, dest, receivepack,
-			args.verbose ? CONNECT_VERBOSE : 0);
+			flags | CONNECT_SEND);
 	}
 
 	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
diff --git a/connect.c b/connect.c
index c3a014c5ba..2a35924292 100644
--- a/connect.c
+++ b/connect.c
@@ -774,13 +774,23 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
 	return protocol;
 }
 
-static const char *get_ssh_command(void)
+static const char *get_ssh_command(int flags)
 {
 	const char *ssh;
 
+	if (flags & CONNECT_SEND && (ssh = getenv("GIT_SSH_SEND_COMMAND")))
+		return ssh;
+	else if (flags & CONNECT_RECEIVE && (ssh = getenv("GIT_SSH_RECEIVE_COMMAND")))
+		return ssh;
 	if ((ssh = getenv("GIT_SSH_COMMAND")))
 		return ssh;
 
+	if (flags & CONNECT_SEND &&
+	    !git_config_get_string_const("core.sshsendcommand", &ssh))
+		return ssh;
+	else if (flags & CONNECT_RECEIVE &&
+	    !git_config_get_string_const("core.sshreceivecommand", &ssh))
+		return ssh;
 	if (!git_config_get_string_const("core.sshcommand", &ssh))
 		return ssh;
 
@@ -997,7 +1007,7 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 	if (looks_like_command_line_option(ssh_host))
 		die("strange hostname '%s' blocked", ssh_host);
 
-	ssh = get_ssh_command();
+	ssh = get_ssh_command(flags);
 	if (ssh) {
 		variant = determine_ssh_variant(ssh, 1);
 	} else {
@@ -1008,7 +1018,12 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 		 */
 		conn->use_shell = 0;
 
-		ssh =