Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-12 Thread Jeff King
On Tue, Mar 06, 2018 at 07:29:02AM +0100, Jeff King wrote:

> > We want to do better (e.g. see [1]) but that's a bigger change than
> > the initial protocol v2.
> > 
> > As Brandon explained it to me, we really do want to use stateless-rpc
> > semantics by default, since that's just better for maintainability.
> > Instead of having two protocols, one that is sane and one that
> > struggles to hoist that into stateless-rpc, there would be one
> > stateless baseline plus capabilities to make use of state.
> 
> Yes, I think that would be a nice end-game. It just wasn't clear to me
> where we'd be in the interim.

After some more thinking about this, and a little chatting with Brandon
at the contrib summit, I'm willing to soften my position on this.

Basically I was concerned about this as a regression where git-over-ssh
would stop working in a few corner cases. And it would cease to be
available as an escape hatch for those cases where http wouldn't work.

But we may be OK in this "interim" period (before unified
stateful-negotiation bits are added back) because v2 would not yet be
the default. So the ssh cases can't regress without flipping the v2
switch manually, and any escape hatch would continue to work by flipping
back to v1 anyway.

So it's probably OK to continue experimenting in this direction and see
how often it's a problem in practice.

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-05 Thread Jeff King
On Mon, Mar 05, 2018 at 01:36:49PM -0800, Jonathan Nieder wrote:

> > I agree that would be a lot more pleasant for adding protocol features.
> > But I just worry that the stateful protocols get a lot less efficient.
> > I'm having trouble coming up with an easy reproduction, but my
> > recollection is that http has some nasty corner cases, because each
> > round of "have" lines sent to the server has to summarize the previous
> > conversation. So you can get a case where the client's requests keep
> > getting bigger and bigger during the negotiation (and eventually getting
> > large enough to cause problems).
> 
> That's not so much a corner case as just how negotiation works over
> http.

Sure. What I meant more was "there are corner cases where it gets out of
control and doesn't work".

I have had to give the advice in the past "if your fetch over http
doesn't work, try it over ssh". If we change the ssh protocol to be
stateless, too, then that closes that escape hatch.

I haven't had to give that advice for a while, though. Maybe tweaks to
the parameters or just larger buffers have made the problem go away over
the years?

> We want to do better (e.g. see [1]) but that's a bigger change than
> the initial protocol v2.
> 
> As Brandon explained it to me, we really do want to use stateless-rpc
> semantics by default, since that's just better for maintainability.
> Instead of having two protocols, one that is sane and one that
> struggles to hoist that into stateless-rpc, there would be one
> stateless baseline plus capabilities to make use of state.

Yes, I think that would be a nice end-game. It just wasn't clear to me
where we'd be in the interim.

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-05 Thread Jonathan Nieder
Hi,

Jeff King wrote:

> I agree that would be a lot more pleasant for adding protocol features.
> But I just worry that the stateful protocols get a lot less efficient.
> I'm having trouble coming up with an easy reproduction, but my
> recollection is that http has some nasty corner cases, because each
> round of "have" lines sent to the server has to summarize the previous
> conversation. So you can get a case where the client's requests keep
> getting bigger and bigger during the negotiation (and eventually getting
> large enough to cause problems).

That's not so much a corner case as just how negotiation works over
http.

We want to do better (e.g. see [1]) but that's a bigger change than
the initial protocol v2.

As Brandon explained it to me, we really do want to use stateless-rpc
semantics by default, since that's just better for maintainability.
Instead of having two protocols, one that is sane and one that
struggles to hoist that into stateless-rpc, there would be one
stateless baseline plus capabilities to make use of state.

For example, it would be nice to have a capability to remember
negotiation state between rounds, to get around exactly the problem
you're describing when using a stateful protocol.  Stateless backends
would just not advertise such a capability.  But doing that without [1]
still sort of feels like a cop-out.  If we can get a reasonable
baseline using ideas like [1] and then have a capability to keep
server-side state as icing on the cake instead of having a negotiation
process that only really makes sense when you have server-side state,
then that would be even better.

> If anything, I wish we could push the http protocol in a more stateful
> direction with something like websockets. But I suspect that's an
> unrealistic dream, just because not everybody's http setup (proxies,
> etc) will be able to handle that.

Agreed.  I think we have to continue to deal with stateless-rpc
semantics, at least for the near future.

Jonathan

[1] https://public-inbox.org/git/20180227054638.gb65...@aiede.svl.corp.google.com/


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-05 Thread Jeff King
On Mon, Mar 05, 2018 at 10:43:21AM -0800, Brandon Williams wrote:

> In the current protocol http has a lot of additional stuff that's had to
> be done to it to get it to work with a protocol that was designed to be
> stateful first.  What I want is for the protocol to be designed
> stateless first so that http functions essentially the same as ssh or
> file or git transports and we don't have to do any hackery to get it to
> work.  This also makes it very simple to implement a new feature in the
> protocol because you only need to think about implementing it once
> instead of twice like you kind of have to do with v0.  So in the most
> recent series everything is a chain of request/response pairs even in
> the non-http cases.

I agree that would be a lot more pleasant for adding protocol features.
But I just worry that the stateful protocols get a lot less efficient.
I'm having trouble coming up with an easy reproduction, but my
recollection is that http has some nasty corner cases, because each
round of "have" lines sent to the server has to summarize the previous
conversation. So you can get a case where the client's requests keep
getting bigger and bigger during the negotiation (and eventually getting
large enough to cause problems).
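
To make the growth concrete, here is a rough model of it. This is only a sketch: it assumes the worst case, where a stateless client has to repeat every earlier "have" line in each round, and the batch size of 32 is an arbitrary number picked for the illustration, not something taken from fetch-pack.

```python
# Illustrative model only: the batch size is an assumption, not a value
# taken from git's fetch-pack.
HAVE_LINE_BYTES = len("0032have " + "0" * 40 + "\n")  # one pkt-line "have"

def request_sizes(rounds, haves_per_round, stateless):
    """Return the approximate bytes of "have" lines sent in each round."""
    sizes = []
    for round_no in range(1, rounds + 1):
        # A stateless client re-sends every earlier "have"; a stateful one
        # sends only the new batch because the server remembers the rest.
        haves = haves_per_round * (round_no if stateless else 1)
        sizes.append(haves * HAVE_LINE_BYTES)
    return sizes

stateful = request_sizes(rounds=5, haves_per_round=32, stateless=False)
stateless = request_sizes(rounds=5, haves_per_round=32, stateless=True)
print(stateful)   # constant per round
print(stateless)  # grows linearly with the round number
```

In this model the per-round request grows linearly and the total sent grows quadratically with the number of rounds, which is the shape of the problem even if the real numbers differ.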

If anything, I wish we could push the http protocol in a more stateful
direction with something like websockets. But I suspect that's an
unrealistic dream, just because not everybody's http setup (proxies,
etc) will be able to handle that.

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-05 Thread Brandon Williams
On 03/02, Jeff King wrote:
> On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote:
> 
> > I think this is the price of extending the protocol in a backward
> > compatible way.  If we don't want to be backwards compatible (allowing
> > for graceful fallback to v1) then we could design this differently.
> > Even so we're not completely out of luck just yet.
> > 
> > Back when I introduced the GIT_PROTOCOL side-channel I was able to
> > demonstrate that arbitrary data could be sent to the server and it would
> > only respect the stuff it knows about.  This means that we can do a
> > follow up to v2 at some point to introduce an optimization where we can
> > stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
> > if the server supports it.
> 
> If that's our end-game, it does make me wonder if we'd be happier just
> jumping to that at first. Before you started the v2 protocol work, I had
> a rough patch series passing what I called "early capabilities". The
> idea was to let the client speak a few optional capabilities before the
> ref advertisement, and be ready for the server to ignore them
> completely. That doesn't clean up all the warts with the v0 protocol,
> but it handles the major one (allowing more efficient ref
> advertisements).

I didn't really want to get to that just yet, simply because I want to
try and keep the scope of this smaller while still being able to fix
most of the issues we have with v0.

> I dunno. There's a lot more going on here in v2 and I'm not sure I've
> fully digested it.

I tried to keep it similar enough to v0 such that it wouldn't be that
big of a leap (small steps).  For example negotiation is really done the
same as it is in v0 during fetch (a next step would be to actually
improve that).  We can definitely talk about all this in more detail
later this week too.

> 
> > The great thing about this is that from the POV of the git-client, it
> > doesn't care if it's speaking using the git://, ssh://, file://, or
> > http:// transport; it's all the same protocol.  In my next re-roll I'll
> > even drop the "# service" bit from the http server response and then the
> > responses will truly be identical in all cases.
> 
> This part has me a little confused still. The big difference between
> http and the other protocols is that the other ones are full-duplex, and
> http is a series of stateless request/response pairs.
> 
> Are the other protocols becoming stateless request/response pairs, too?
> Or will they be "the same protocol" only in the sense of using the same
> transport?
> 
> (There are a lot of reasons not to like the stateless pair thing; it has
> some horrid corner cases during want/have negotiation).

Junio made a comment on the Spec in the most recent version of the
series about how I state that v2 is stateless and "MUST NOT" rely on
state being stored on the server side.  In reality I think this needs to
be tweaked a bit because when you do have a full-duplex connection you
probably want to use that to reduce the amount of data that you send
in some cases.

In the current protocol http has a lot of additional stuff that's had to
be done to it to get it to work with a protocol that was designed to be
stateful first.  What I want is for the protocol to be designed
stateless first so that http functions essentially the same as ssh or
file or git transports and we don't have to do any hackery to get it to
work.  This also makes it very simple to implement a new feature in the
protocol because you only need to think about implementing it once
instead of twice like you kind of have to do with v0.  So in the most
recent series everything is a chain of request/response pairs even in
the non-http cases.

In a previous version of the series I had each command being able to
last any number of rounds and having a 'stateless' capability indicating
if the command needed to be run stateless.  I didn't think that was a
good design because by default you are still designing the stateful
thing first and the http (stateless) case can be an afterthought.  So
instead maybe we'll need commands which can benefit from state to have a
'stateful' feature that can be advertised when a full-duplex connection
is possible.  This still gives you the opportunity to not advertise that
and have the same behavior over ssh as http.  I actually remember
hearing someone talk about how they would like to allow for ssh
connections to their server and just have it be a proxy for http and
this would enable that.
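
As a rough sketch of how a client might act on such an advertisement (nothing here is in the series; the capability name 'stateful' is only the working name used above, and the function is hypothetical):

```python
def choose_negotiation_mode(advertised, full_duplex):
    """Pick how to run a command that can benefit from server-side state.

    `advertised` is the set of capability keys the server sent.  The
    "stateful" key is a hypothetical name, not part of any released spec.
    """
    if full_duplex and "stateful" in advertised:
        return "stateful"
    # Baseline: behave exactly as over http, one request/response at a time.
    return "stateless"

print(choose_negotiation_mode({"ls-refs", "stateful"}, full_duplex=True))
print(choose_negotiation_mode({"ls-refs"}, full_duplex=True))
```

A server that proxies ssh to http would simply leave the capability out, and the client would fall back to the stateless baseline.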

-- 
Brandon Williams


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-03-02 Thread Jeff King
On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote:

> I think this is the price of extending the protocol in a backward
> compatible way.  If we don't want to be backwards compatible (allowing
> for graceful fallback to v1) then we could design this differently.
> Even so we're not completely out of luck just yet.
> 
> Back when I introduced the GIT_PROTOCOL side-channel I was able to
> demonstrate that arbitrary data could be sent to the server and it would
> only respect the stuff it knows about.  This means that we can do a
> follow up to v2 at some point to introduce an optimization where we can
> stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
> if the server supports it.

If that's our end-game, it does make me wonder if we'd be happier just
jumping to that at first. Before you started the v2 protocol work, I had
a rough patch series passing what I called "early capabilities". The
idea was to let the client speak a few optional capabilities before the
ref advertisement, and be ready for the server to ignore them
completely. That doesn't clean up all the warts with the v0 protocol,
but it handles the major one (allowing more efficient ref
advertisements).

I dunno. There's a lot more going on here in v2 and I'm not sure I've
fully digested it.

> The great thing about this is that from the POV of the git-client, it
> doesn't care if it's speaking using the git://, ssh://, file://, or
> http:// transport; it's all the same protocol.  In my next re-roll I'll
> even drop the "# service" bit from the http server response and then the
> responses will truly be identical in all cases.

This part has me a little confused still. The big difference between
http and the other protocols is that the other ones are full-duplex, and
http is a series of stateless request/response pairs.

Are the other protocols becoming stateless request/response pairs, too?
Or will they be "the same protocol" only in the sense of using the same
transport?

(There are a lot of reasons not to like the stateless pair thing; it has
some horrid corner cases during want/have negotiation).

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-27 Thread Brandon Williams
On 02/27, Jonathan Tan wrote:
> On Fri, 23 Feb 2018 13:33:15 -0800
> Brandon Williams  wrote:
> 
> > On 02/21, Jonathan Tan wrote:
> > > As someone who is implementing the server side of protocol V2 in JGit, I
> > > now have a bit more insight into this :-)
> > > 
> > > First of all, I used to not have a strong opinion on the existence of a
> > > new endpoint, but now I think that it's better to *not* have git-serve.
> > > As it is, as far as I can tell, upload-pack also needs to support (and
> > > does support, as of the end of this patch set) protocol v2 anyway, so it
> > > might be better to merely upgrade upload-pack.
> > 
> > Having it allows for easier testing and the easy ability to make it a
> > true endpoint when we want to.  As of right now, git-serve isn't an
> > endpoint as you can't issue requests there via http-backend or
> > git-daemon.
> 
> Is git-serve planned to be a new endpoint?
> 
> If yes, I now don't think it's a good idea - it's an extra burden to
> reimplementors without much benefit (to have a new endpoint that does
> the same things as upload-pack).

I'm still going to include it, with the potential for it to become an
endpoint if we so choose (it isn't now), because when we start to
introduce more things to v2 (push or other commands we haven't dreamed
up yet) it just makes more sense to contact an endpoint that doesn't
explicitly say what it does.

> 
> If not, I don't think that easier testing makes it worth having an extra
> binary. Couldn't the same tests be done by running upload-pack directly?

It's a builtin and not a new binary, and yes, it makes testing much easier
because it assumes v2 from the start instead of v0.

-- 
Brandon Williams


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-27 Thread Jonathan Tan
On Fri, 23 Feb 2018 13:33:15 -0800
Brandon Williams  wrote:

> On 02/21, Jonathan Tan wrote:
> > As someone who is implementing the server side of protocol V2 in JGit, I
> > now have a bit more insight into this :-)
> > 
> > First of all, I used to not have a strong opinion on the existence of a
> > new endpoint, but now I think that it's better to *not* have git-serve.
> > As it is, as far as I can tell, upload-pack also needs to support (and
> > does support, as of the end of this patch set) protocol v2 anyway, so it
> > might be better to merely upgrade upload-pack.
> 
> Having it allows for easier testing and the easy ability to make it a
> true endpoint when we want to.  As of right now, git-serve isn't an
> endpoint as you can't issue requests there via http-backend or
> git-daemon.

Is git-serve planned to be a new endpoint?

If yes, I now don't think it's a good idea - it's an extra burden to
reimplementors without much benefit (to have a new endpoint that does
the same things as upload-pack).

If not, I don't think that easier testing makes it worth having an extra
binary. Couldn't the same tests be done by running upload-pack directly?


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-23 Thread Brandon Williams
On 02/22, Jeff King wrote:
> On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote:
> 
> > +In protocol v2 communication is command oriented.  When first contacting a
> > +server a list of capabilities will be advertised.  Some of these capabilities
> > +will be commands which a client can request be executed.  Once a command
> > +has completed, a client can reuse the connection and request that other
> > +commands be executed.
> 
> If I understand this correctly, we'll potentially have a lot more
> round-trips between the client and server (one per "command"). And for
> git-over-http, each one will be its own HTTP request?
> 
> We've traditionally tried to minimize HTTP requests, but I guess it's
> not too bad if we can keep the connection open in most cases. Then we
> just suffer some extra framing bytes, but we don't have to re-establish
> the TCP connection each time.
> 
> I do wonder if the extra round trips will be noticeable in high-latency
> conditions. E.g., if I'm 200ms away, converting the current
> ref-advertisement spew to "capabilities, then the client asks for refs,
> then we spew the refs" is going to cost an extra 200ms, even if the
> fetch just ends up being a noop. I'm not sure how bad that is in the
> grand scheme of things (after all, the TCP handshake involves some
> round-trips, too).

I think this is the price of extending the protocol in a backward
compatible way.  If we don't want to be backwards compatible (allowing
for graceful fallback to v1) then we could design this differently.
Even so we're not completely out of luck just yet.

Back when I introduced the GIT_PROTOCOL side-channel I was able to
demonstrate that arbitrary data could be sent to the server and it would
only respect the stuff it knows about.  This means that we can do a
follow up to v2 at some point to introduce an optimization where we can
stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
if the server supports it.

> 
> > + Capability Advertisement
> > +--------------------------
> > +
> > +A server which decides to communicate (based on a request from a client)
> > +using protocol version 2, notifies the client by sending a version string
> > +in its initial response followed by an advertisement of its capabilities.
> > +Each capability is a key with an optional value.  Clients must ignore all
> > +unknown keys.  Semantics of unknown values are left to the definition of
> > +each key.  Some capabilities will describe commands which can be requested
> > +to be executed by the client.
> > +
> > +capability-advertisement = protocol-version
> > +  capability-list
> > +  flush-pkt
> > +
> > +protocol-version = PKT-LINE("version 2" LF)
> > +capability-list = *capability
> > +capability = PKT-LINE(key[=value] LF)
> > +
> > +key = 1*CHAR
> > +value = 1*CHAR
> > +CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> > +
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +command-request = command
> > + capability-list
> > + (command-args)
> > + flush-pkt
> > +command = PKT-LINE("command=" key LF)
> > +command-args = delim-pkt
> > +  *arg
> > +arg = 1*CHAR
> 
> For a single stateful TCP connection like git:// or git-over-ssh, the
> client would get the capabilities once and then issue a series of
> commands. For git-over-http, how does it work?
> 
> The client speaks first in HTTP, so we'd first make a request to get
> just the capabilities from the server? And then proceed from there with
> a series of requests, assuming that the capabilities for each server we
> subsequently contact are the same? That's probably reasonable (and
> certainly the existing http protocol makes that capabilities
> assumption).
> 
> I don't see any documentation on how this all works with http. But

I can add in a bit for the initial request when using http, but the rest
of it should function the same.

> reading patch 34, it looks like we just do the usual
> service=git-upload-pack request (with the magic request for v2), and
> then the server would send us capabilities. Which follows my line of
> thinking in the paragraph above.

Yes this is exactly how it should work.  First we make an info/refs
request and if the server speaks v2 then instead of a refs request we
should get back a capability listing.  Then subsequent requests are made
assuming the capabilities are the same like we've done with the
existing protocol.
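
For illustration, the client side of reading that capability listing could look roughly like the following. It is a sketch written against the grammar in the patch, not the code in the series; the function names and the sample agent value are made up, and only the "version 2" line, 'ls-refs' and the agent capability come from the document.

```python
def read_pkt_lines(data):
    """Yield payloads from a pkt-line stream; None marks a flush-pkt."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:          # "0000" flush-pkt ends the message
            yield None
            pos += 4
            continue
        if length < 4:           # other special packets are not expected here
            raise ValueError("unexpected special packet")
        yield data[pos + 4:pos + length]
        pos += length

def parse_advertisement(data):
    """Parse a v2 capability advertisement into a {key: value-or-None} dict."""
    lines = read_pkt_lines(data)
    if next(lines, None) != b"version 2\n":
        raise ValueError("server did not answer with protocol v2")
    caps = {}
    for payload in lines:
        if payload is None:
            break
        key, _, value = payload.rstrip(b"\n").decode().partition("=")
        caps[key] = value or None
    return caps

# Hypothetical advertisement; the agent value is made up for the demo.
sample = b"000eversion 2\n" b"000cls-refs\n" b"0013agent=git-demo\n" b"0000"
print(parse_advertisement(sample))
```

A client that gets anything other than the version line first would treat the response as the old ref advertisement and fall back.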

The great thing about this is that from the POV of the git-client, it
doesn't care if it's speaking using the git://, ssh://, file://, or
http:// transport; it's all the same protocol.  In my next re-roll I'll
even drop the "# service" bit from the http server response and then 

Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-23 Thread Brandon Williams
On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:49 -0800
> Brandon Williams  wrote:
> 
> >  .gitignore                              |   1 +
> >  Documentation/technical/protocol-v2.txt | 114 +++
> >  Makefile                                |   2 +
> >  builtin.h                               |   1 +
> >  builtin/serve.c                         |  30
> >  git.c                                   |   1 +
> >  serve.c                                 | 250
> >  serve.h                                 |  15 ++
> >  t/t5701-git-serve.sh                    |  60
> >  9 files changed, 474 insertions(+)
> >  create mode 100644 Documentation/technical/protocol-v2.txt
> >  create mode 100644 builtin/serve.c
> >  create mode 100644 serve.c
> >  create mode 100644 serve.h
> >  create mode 100755 t/t5701-git-serve.sh
> 
> As someone who is implementing the server side of protocol V2 in JGit, I
> now have a bit more insight into this :-)
> 
> First of all, I used to not have a strong opinion on the existence of a
> new endpoint, but now I think that it's better to *not* have git-serve.
> As it is, as far as I can tell, upload-pack also needs to support (and
> does support, as of the end of this patch set) protocol v2 anyway, so it
> might be better to merely upgrade upload-pack.

Having it allows for easier testing and the easy ability to make it a
true endpoint when we want to.  As of right now, git-serve isn't an
endpoint as you can't issue requests there via http-backend or
git-daemon.

> 
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +command-request = command
> > + capability-list
> > + (command-args)
> 
> If you are stating that this is optional, write "*1command-args". (RFC
> 5234 also supports square brackets, but "*1" is already used in
> pack-protocol.txt and http-protocol.txt.)
> 
> > + flush-pkt
> > +command = PKT-LINE("command=" key LF)
> > +command-args = delim-pkt
> > +  *arg
> > +arg = 1*CHAR
> 
> arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

-- 
Brandon Williams


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-22 Thread Jeff King
On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote:

> +In protocol v2 communication is command oriented.  When first contacting a
> +server a list of capabilities will be advertised.  Some of these capabilities
> +will be commands which a client can request be executed.  Once a command
> +has completed, a client can reuse the connection and request that other
> +commands be executed.

If I understand this correctly, we'll potentially have a lot more
round-trips between the client and server (one per "command"). And for
git-over-http, each one will be its own HTTP request?

We've traditionally tried to minimize HTTP requests, but I guess it's
not too bad if we can keep the connection open in most cases. Then we
just suffer some extra framing bytes, but we don't have to re-establish
the TCP connection each time.

I do wonder if the extra round trips will be noticeable in high-latency
conditions. E.g., if I'm 200ms away, converting the current
ref-advertisement spew to "capabilities, then the client asks for refs,
then we spew the refs" is going to cost an extra 200ms, even if the
fetch just ends up being a noop. I'm not sure how bad that is in the
grand scheme of things (after all, the TCP handshake involves some
round-trips, too).
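
Spelled out as arithmetic (a back-of-the-envelope sketch that ignores connection setup and assumes one round trip for the v0 advertisement and two for v2):

```python
def noop_fetch_ms(rtt_ms, round_trips):
    """Time spent waiting on the network for a fetch that transfers nothing."""
    return rtt_ms * round_trips

RTT_MS = 200                                   # the "200ms away" case above
v0 = noop_fetch_ms(RTT_MS, round_trips=1)      # request -> ref advertisement
v2 = noop_fetch_ms(RTT_MS, round_trips=2)      # capabilities, then ls-refs
print(v0, v2, v2 - v0)
```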

> + Capability Advertisement
> +--------------------------
> +
> +A server which decides to communicate (based on a request from a client)
> +using protocol version 2, notifies the client by sending a version string
> +in its initial response followed by an advertisement of its capabilities.
> +Each capability is a key with an optional value.  Clients must ignore all
> +unknown keys.  Semantics of unknown values are left to the definition of
> +each key.  Some capabilities will describe commands which can be requested
> +to be executed by the client.
> +
> +capability-advertisement = protocol-version
> +capability-list
> +flush-pkt
> +
> +protocol-version = PKT-LINE("version 2" LF)
> +capability-list = *capability
> +capability = PKT-LINE(key[=value] LF)
> +
> +key = 1*CHAR
> +value = 1*CHAR
> +CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> +
> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +command-request = command
> +   capability-list
> +   (command-args)
> +   flush-pkt
> +command = PKT-LINE("command=" key LF)
> +command-args = delim-pkt
> +*arg
> +arg = 1*CHAR

For a single stateful TCP connection like git:// or git-over-ssh, the
client would get the capabilities once and then issue a series of
commands. For git-over-http, how does it work?

The client speaks first in HTTP, so we'd first make a request to get
just the capabilities from the server? And then proceed from there with
a series of requests, assuming that the capabilities for each server we
subsequently contact are the same? That's probably reasonable (and
certainly the existing http protocol makes that capabilities
assumption).

I don't see any documentation on how this all works with http. But
reading patch 34, it looks like we just do the usual
service=git-upload-pack request (with the magic request for v2), and
then the server would send us capabilities. Which follows my line of
thinking in the paragraph above.

-Peff


Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-21 Thread Jonathan Tan
On Tue,  6 Feb 2018 17:12:49 -0800
Brandon Williams  wrote:

>  .gitignore                              |   1 +
>  Documentation/technical/protocol-v2.txt | 114 +++
>  Makefile                                |   2 +
>  builtin.h                               |   1 +
>  builtin/serve.c                         |  30
>  git.c                                   |   1 +
>  serve.c                                 | 250
>  serve.h                                 |  15 ++
>  t/t5701-git-serve.sh                    |  60
>  9 files changed, 474 insertions(+)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100755 t/t5701-git-serve.sh

As someone who is implementing the server side of protocol V2 in JGit, I
now have a bit more insight into this :-)

First of all, I used to not have a strong opinion on the existence of a
new endpoint, but now I think that it's better to *not* have git-serve.
As it is, as far as I can tell, upload-pack also needs to support (and
does support, as of the end of this patch set) protocol v2 anyway, so it
might be better to merely upgrade upload-pack.

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +command-request = command
> +   capability-list
> +   (command-args)

If you are stating that this is optional, write "*1command-args". (RFC
5234 also supports square brackets, but "*1" is already used in
pack-protocol.txt and http-protocol.txt.)

> +   flush-pkt
> +command = PKT-LINE("command=" key LF)
> +command-args = delim-pkt
> +*arg
> +arg = 1*CHAR

arg should be wrapped in PKT-LINE, I think, and terminated by an LF.
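
To show what that would mean on the wire, here is a small sketch that frames a command-request with every arg as its own LF-terminated pkt-line. It reflects the suggestion above rather than the grammar as posted; the helper names are made up, and "symrefs" is only a placeholder argument.

```python
FLUSH_PKT = b"0000"   # end of message
DELIM_PKT = b"0001"   # separates capabilities from command args

def pkt_line(text):
    """Frame one line: 4 hex digits of total length (header included) + payload."""
    payload = text.encode() + b"\n"
    return b"%04x" % (len(payload) + 4) + payload

def command_request(command, capabilities=(), args=()):
    """Build command, capability-list, optional delim-pkt + args, flush-pkt."""
    out = [pkt_line("command=" + command)]
    out += [pkt_line(cap) for cap in capabilities]
    if args:
        out.append(DELIM_PKT)
        out += [pkt_line(arg) for arg in args]
    out.append(FLUSH_PKT)
    return b"".join(out)

# "symrefs" is a placeholder argument chosen for the example.
print(command_request("ls-refs", args=["symrefs"]))
```

With the args framed this way, a reader can consume the whole request with the same pkt-line loop it already uses for the capability list.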


[PATCH v3 12/35] serve: introduce git-serve

2018-02-06 Thread Brandon Williams
Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams 
---
 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 114 +++
 Makefile                                |   2 +
 builtin.h                               |   1 +
 builtin/serve.c                         |  30
 git.c                                   |   1 +
 serve.c                                 | 250
 serve.h                                 |  15 ++
 t/t5701-git-serve.sh                    |  60
 9 files changed, 474 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..f87372f9b
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,114 @@
+ Git Wire Protocol, Version 2
+==============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line (as there will be a single
+    capability per pkt-line)
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+  * Designed with http and stateless-rpc in mind.  With clear flush
+    semantics the http remote helper can simply act as a proxy.
+
+ Detailed Design
+=================
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will be advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-----------------
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+--------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+                               capability-list
+                               flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*CHAR
+    value = 1*CHAR
+    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments.  There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+    command-request = command
+                      capability-list
+                      (command-args)
+                      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args =