Re: [PATCH v3 12/35] serve: introduce git-serve
On Tue, Mar 06, 2018 at 07:29:02AM +0100, Jeff King wrote: > > We want to do better (e.g. see [1]) but that's a bigger change than > > the initial protocol v2. > > > > As Brandon explained it to me, we really do want to use stateless-rpc > > semantics by default, since that's just better for maintainability. > > Instead of having two protocols, one that is sane and one that > > struggles to hoist that into stateless-rpc, there would be one > > stateless baseline plus capabilities to make use of state. > > Yes, I think that would be a nice end-game. It just wasn't clear to me > where we'd be in the interim. After some more thinking about this, and a little chatting with Brandon at the contrib summit, I'm willing to soften my position on this. Basically I was concerned about this as a regression where git-over-ssh would stop working in a few corner cases. And it would cease to be available as an escape hatch for those cases where http wouldn't work. But we may be OK in this "interim" period (before unified stateful-negotiation bits are added back) because v2 would not yet be the default. So the ssh cases can't regress without flipping the v2 switch manually, and any escape hatch would continue to work by flipping back to v1 anyway. So it's probably OK to continue experimenting in this direction and see how often it's a problem in practice. -Peff
Re: [PATCH v3 12/35] serve: introduce git-serve
On Mon, Mar 05, 2018 at 01:36:49PM -0800, Jonathan Nieder wrote: > > I agree that would be a lot more pleasant for adding protocol features. > > But I just worry that the stateful protocols get a lot less efficient. > > I'm having trouble coming up with an easy reproduction, but my > > recollection is that http has some nasty corner cases, because each > > round of "have" lines sent to the server has to summarize the previous > > conversation. So you can get a case where the client's requests keep > > getting bigger and bigger during the negotiation (and eventually getting > > large enough to cause problems). > > That's not so much a corner case as just how negotiation works over > http. Sure. What I meant more was "there are corner cases where it gets out of control and doesn't work". I have had to give the advice in the past "if your fetch over http doesn't work, try it over ssh". If we change the ssh protocol to be stateless, too, then that closes that escape hatch. I haven't had to give that advice for a while, though. Maybe tweaks to the parameters or just larger buffers have made the problem go away over the years? > We want to do better (e.g. see [1]) but that's a bigger change than > the initial protocol v2. > > As Brandon explained it to me, we really do want to use stateless-rpc > semantics by default, since that's just better for maintainability. > Instead of having two protocols, one that is sane and one that > struggles to hoist that into stateless-rpc, there would be one > stateless baseline plus capabilities to make use of state. Yes, I think that would be a nice end-game. It just wasn't clear to me where we'd be in the interim. -Peff
Re: [PATCH v3 12/35] serve: introduce git-serve
Hi, Jeff King wrote: > I agree that would be a lot more pleasant for adding protocol features. > But I just worry that the stateful protocols get a lot less efficient. > I'm having trouble coming up with an easy reproduction, but my > recollection is that http has some nasty corner cases, because each > round of "have" lines sent to the server has to summarize the previous > conversation. So you can get a case where the client's requests keep > getting bigger and bigger during the negotiation (and eventually getting > large enough to cause problems). That's not so much a corner case as just how negotiation works over http. We want to do better (e.g. see [1]) but that's a bigger change than the initial protocol v2. As Brandon explained it to me, we really do want to use stateless-rpc semantics by default, since that's just better for maintainability. Instead of having two protocols, one that is sane and one that struggles to hoist that into stateless-rpc, there would be one stateless baseline plus capabilities to make use of state. For example, it would be nice to have a capability to remember negotiation state between rounds, to get around exactly the problem you're describing when using a stateful protocol. Stateless backends would just not advertise such a capability. But doing that without [1] still sort of feels like a cop-out. If we can get a reasonable baseline using ideas like [1] and then have a capability to keep server-side state as icing on the cake instead of having a negotiation process that only really makes sense when you have server-side state, then that would be even better. > If anything, I wish we could push the http protocol in a more stateful > direction with something like websockets. But I suspect that's an > unrealistic dream, just because not everybody's http setup (proxies, > etc) will be able to handle that. Agreed. I think we have to continue to deal with stateless-rpc semantics, at least for the near future. 
Jonathan

[1] https://public-inbox.org/git/20180227054638.gb65...@aiede.svl.corp.google.com/
Re: [PATCH v3 12/35] serve: introduce git-serve
On Mon, Mar 05, 2018 at 10:43:21AM -0800, Brandon Williams wrote: > In the current protocol http has a lot of additional stuff that's had to > be done to it to get it to work with a protocol that was designed to be > stateful first. What I want is for the protocol to be designed > stateless first so that http functions essentially the same as ssh or > file or git transports and we don't have to do any hackery to get it to > work. This also makes it very simple to implement a new feature in the > protocol because you only need to think about implementing it once > instead of twice like you kind of have to do with v0. So in the most > recent series everything is a chain of request/response pairs even in > the non-http cases. I agree that would be a lot more pleasant for adding protocol features. But I just worry that the stateful protocols get a lot less efficient. I'm having trouble coming up with an easy reproduction, but my recollection is that http has some nasty corner cases, because each round of "have" lines sent to the server has to summarize the previous conversation. So you can get a case where the client's requests keep getting bigger and bigger during the negotiation (and eventually getting large enough to cause problems). If anything, I wish we could push the http protocol in a more stateful direction with something like websockets. But I suspect that's an unrealistic dream, just because not everybody's http setup (proxies, etc) will be able to handle that. -Peff
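The growth described above can be illustrated with a toy model. This is a sketch, not Git's negotiation code: it assumes a fixed batch of 32 "have" lines per round, 50 bytes per line (a 4-byte pkt-line length, "have ", a 40-hex object id and LF), and the simplification that a stateless request repeats every earlier "have" line. The function name `request_sizes` is made up for this example.

```python
HAVE_LINE_BYTES = 50  # assumed: 4-byte length + "have " + 40 hex digits + LF


def request_sizes(rounds, batch, stateless):
    """Return the bytes of "have" lines sent in each negotiation round."""
    sizes = []
    sent_so_far = 0
    for _ in range(rounds):
        # Simplified model: a stateless request repeats everything said so
        # far, while a stateful connection only sends the new batch.
        lines = sent_so_far + batch if stateless else batch
        sizes.append(lines * HAVE_LINE_BYTES)
        sent_so_far += batch
    return sizes


if __name__ == "__main__":
    print("stateless:", request_sizes(5, 32, stateless=True))
    print("stateful: ", request_sizes(5, 32, stateless=False))
```

Under these assumptions the stateless request grows linearly with the round number while the stateful one stays flat, which is the effect a capability for remembering negotiation state would be meant to avoid.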
Re: [PATCH v3 12/35] serve: introduce git-serve
On 03/02, Jeff King wrote: > On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote: > > > I think this is the price of extending the protocol in a backward > > compatible way. If we don't want to be backwards compatible (allowing > > for graceful fallback to v1) then we could design this differently. > > Even so we're not completely out of luck just yet. > > > > Back when I introduced the GIT_PROTOCOL side-channel I was able to > > demonstrate that arbitrary data could be sent to the server and it would > > only respect the stuff it knows about. This means that we can do a > > follow up to v2 at some point to introduce an optimization where we can > > stuff a request into GIT_PROTOCOL and short-circuit the first round-trip > > if the server supports it. > > If that's our end-game, it does make me wonder if we'd be happier just > jumping to that at first. Before you started the v2 protocol work, I had > a rough patch series passing what I called "early capabilities". The > idea was to let the client speak a few optional capabilities before the > ref advertisement, and be ready for the server to ignore them > completely. That doesn't clean up all the warts with the v0 protocol, > but it handles the major one (allowing more efficient ref > advertisements). I didn't really want to get to that just yet, simply because I want to try and keep the scope of this smaller while still being able to fix most of the issues we have with v0. > I dunno. There's a lot more going on here in v2 and I'm not sure I've > fully digested it. I tried to keep it similar enough to v0 such that it wouldn't be that big of a leap (small steps). For example negotiation is really done the same as it is in v0 during fetch (a next step would be to actually improve that). We can definitely talk about all this in more detail later this week too. 
> > > The great thing about this is that from the POV of the git-client, it > > doesn't care if its speaking using the git://, ssh://, file://, or > > http:// transport; it's all the same protocol. In my next re-roll I'll > > even drop the "# service" bit from the http server response and then the > > responses will truly be identical in all cases. > > This part has me a little confused still. The big difference between > http and the other protocols is that the other ones are full-duplex, and > http is a series of stateless request/response pairs. > > Are the other protocols becoming stateless request/response pairs, too? > Or will they be "the same protocol" only in the sense of using the same > transport? > > (There are a lot of reasons not to like the stateless pair thing; it has > some horrid corner cases during want/have negotiation). Junio made a comment on the Spec in the most recent version of the series about how I state that v2 is stateless and "MUST NOT" rely on state being stored on the server side. In reality I think this needs to be tweaked a bit because when you do have a full-duplex connection you may probably want to use that to reduce the amount of data that you send in some cases. In the current protocol http has a lot of additional stuff that's had to be done to it to get it to work with a protocol that was designed to be stateful first. What I want is for the protocol to be designed stateless first so that http functions essentially the same as ssh or file or git transports and we don't have to do any hackery to get it to work. This also makes it very simple to implement a new feature in the protocol because you only need to think about implementing it once instead of twice like you kind of have to do with v0. So in the most recent series everything is a chain of request/response pairs even in the non-http cases. 
In a previous version of the series I had each command being able to last any number of rounds and having a 'stateless' capability indicating if the command needed to be run stateless. I didn't think that was a good design because by default you are still designing the stateful thing first and the http (stateless) case can be an afterthought. So instead maybe we'll need commands which can benefit from state to have a 'stateful' feature that can be advertised when a full-duplex connection is possible. This still gives you the opportunity to not advertise that and have the same behavior over ssh as http. I actually remember hearing someone talk about how they would like to allow for ssh connections to their server and just have it be a proxy for http and this would enable that. -- Brandon Williams
Re: [PATCH v3 12/35] serve: introduce git-serve
On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote: > I think this is the price of extending the protocol in a backward > compatible way. If we don't want to be backwards compatible (allowing > for graceful fallback to v1) then we could design this differently. > Even so we're not completely out of luck just yet. > > Back when I introduced the GIT_PROTOCOL side-channel I was able to > demonstrate that arbitrary data could be sent to the server and it would > only respect the stuff it knows about. This means that we can do a > follow up to v2 at some point to introduce an optimization where we can > stuff a request into GIT_PROTOCOL and short-circuit the first round-trip > if the server supports it. If that's our end-game, it does make me wonder if we'd be happier just jumping to that at first. Before you started the v2 protocol work, I had a rough patch series passing what I called "early capabilities". The idea was to let the client speak a few optional capabilities before the ref advertisement, and be ready for the server to ignore them completely. That doesn't clean up all the warts with the v0 protocol, but it handles the major one (allowing more efficient ref advertisements). I dunno. There's a lot more going on here in v2 and I'm not sure I've fully digested it. > The great thing about this is that from the POV of the git-client, it > doesn't care if its speaking using the git://, ssh://, file://, or > http:// transport; it's all the same protocol. In my next re-roll I'll > even drop the "# service" bit from the http server response and then the > responses will truly be identical in all cases. This part has me a little confused still. The big difference between http and the other protocols is that the other ones are full-duplex, and http is a series of stateless request/response pairs. Are the other protocols becoming stateless request/response pairs, too? Or will they be "the same protocol" only in the sense of using the same transport? 
(There are a lot of reasons not to like the stateless pair thing; it has some horrid corner cases during want/have negotiation). -Peff
Re: [PATCH v3 12/35] serve: introduce git-serve
On 02/27, Jonathan Tan wrote:
> On Fri, 23 Feb 2018 13:33:15 -0800
> Brandon Williams wrote:
>
> > On 02/21, Jonathan Tan wrote:
> > > As someone who is implementing the server side of protocol V2 in JGit, I
> > > now have a bit more insight into this :-)
> > >
> > > First of all, I used to not have a strong opinion on the existence of a
> > > new endpoint, but now I think that it's better to *not* have git-serve.
> > > As it is, as far as I can tell, upload-pack also needs to support (and
> > > does support, as of the end of this patch set) protocol v2 anyway, so it
> > > might be better to merely upgrade upload-pack.
> >
> > Having it allows for easier testing and the easy ability to make it a
> > true endpoint when we want to. As of right now, git-serve isn't an
> > endpoint as you can't issue requests there via http-backend or
> > git-daemon.
>
> Is git-serve planned to be a new endpoint?
>
> If yes, I now don't think it's a good idea - it's an extra burden to
> reimplementors without much benefit (to have a new endpoint that does
> the same things as upload-pack).

I'm still going to include it, with the potential for it to become an endpoint if we so choose (it isn't now), because when we start to introduce more things to v2 (push or other commands we haven't dreamed up yet) it just makes more sense to contact an endpoint that doesn't explicitly say what it does.

>
> If not, I don't think that easier testing makes it worth having an extra
> binary. Couldn't the same tests be done by running upload-pack directly?

It's a builtin, not a new binary, and yes, it makes testing much easier because it assumes v2 from the start instead of v0.

--
Brandon Williams
Re: [PATCH v3 12/35] serve: introduce git-serve
On Fri, 23 Feb 2018 13:33:15 -0800
Brandon Williams wrote:

> On 02/21, Jonathan Tan wrote:
> > As someone who is implementing the server side of protocol V2 in JGit, I
> > now have a bit more insight into this :-)
> >
> > First of all, I used to not have a strong opinion on the existence of a
> > new endpoint, but now I think that it's better to *not* have git-serve.
> > As it is, as far as I can tell, upload-pack also needs to support (and
> > does support, as of the end of this patch set) protocol v2 anyway, so it
> > might be better to merely upgrade upload-pack.
>
> Having it allows for easier testing and the easy ability to make it a
> true endpoint when we want to. As of right now, git-serve isn't an
> endpoint as you can't issue requests there via http-backend or
> git-daemon.

Is git-serve planned to be a new endpoint?

If yes, I now don't think it's a good idea - it's an extra burden to reimplementors without much benefit (to have a new endpoint that does the same things as upload-pack).

If not, I don't think that easier testing makes it worth having an extra binary. Couldn't the same tests be done by running upload-pack directly?
Re: [PATCH v3 12/35] serve: introduce git-serve
On 02/22, Jeff King wrote: > On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote: > > > +In protocol v2 communication is command oriented. When first contacting a > > +server a list of capabilities will advertised. Some of these capabilities > > +will be commands which a client can request be executed. Once a command > > +has completed, a client can reuse the connection and request that other > > +commands be executed. > > If I understand this correctly, we'll potentially have a lot more > round-trips between the client and server (one per "command"). And for > git-over-http, each one will be its own HTTP request? > > We've traditionally tried to minimize HTTP requests, but I guess it's > not too bad if we can keep the connection open in most cases. Then we > just suffer some extra framing bytes, but we don't have to re-establish > the TCP connection each time. > > I do wonder if the extra round trips will be noticeable in high-latency > conditions. E.g., if I'm 200ms away, converting the current > ref-advertisement spew to "capabilities, then the client asks for refs, > then we spew the refs" is going to cost an extra 200ms, even if the > fetch just ends up being a noop. I'm not sure how bad that is in the > grand scheme of things (after all, the TCP handshake involves some > round-trips, too). I think this is the price of extending the protocol in a backward compatible way. If we don't want to be backwards compatible (allowing for graceful fallback to v1) then we could design this differently. Even so we're not completely out of luck just yet. Back when I introduced the GIT_PROTOCOL side-channel I was able to demonstrate that arbitrary data could be sent to the server and it would only respect the stuff it knows about. This means that we can do a follow up to v2 at some point to introduce an optimization where we can stuff a request into GIT_PROTOCOL and short-circuit the first round-trip if the server supports it. 
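The "only respect the stuff it knows about" behaviour can be sketched as follows. This is an illustration in Python rather than Git's C code. It assumes the side-channel value is a colon-separated list of `key[=value]` entries; the `KNOWN_KEYS` set, the function name and the `request` key are hypothetical and only stand in for whatever a follow-up to v2 might define.

```python
KNOWN_KEYS = {"version"}  # keys this hypothetical server understands


def parse_git_protocol(value):
    """Parse a GIT_PROTOCOL-style string, silently dropping unknown keys."""
    recognized = {}
    for entry in value.split(":"):
        if not entry:
            continue  # tolerate empty entries such as a trailing colon
        key, _, val = entry.partition("=")
        if key in KNOWN_KEYS:
            recognized[key] = val
    return recognized


if __name__ == "__main__":
    # An older server sees the extra entry and simply ignores it.
    print(parse_git_protocol("version=2:request=ls-refs"))
```

A server that later learns the extra key could act on it and skip the first round-trip, while older servers would keep answering with the plain capability advertisement.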
> > > + Capability Advertisement > > +-- > > + > > +A server which decides to communicate (based on a request from a client) > > +using protocol version 2, notifies the client by sending a version string > > +in its initial response followed by an advertisement of its capabilities. > > +Each capability is a key with an optional value. Clients must ignore all > > +unknown keys. Semantics of unknown values are left to the definition of > > +each key. Some capabilities will describe commands which can be requested > > +to be executed by the client. > > + > > +capability-advertisement = protocol-version > > + capability-list > > + flush-pkt > > + > > +protocol-version = PKT-LINE("version 2" LF) > > +capability-list = *capability > > +capability = PKT-LINE(key[=value] LF) > > + > > +key = 1*CHAR > > +value = 1*CHAR > > +CHAR = 1*(ALPHA / DIGIT / "-" / "_") > > + > > +A client then responds to select the command it wants with any particular > > +capabilities or arguments. There is then an optional section where the > > +client can provide any command specific parameters or queries. > > + > > +command-request = command > > + capability-list > > + (command-args) > > + flush-pkt > > +command = PKT-LINE("command=" key LF) > > +command-args = delim-pkt > > + *arg > > +arg = 1*CHAR > > For a single stateful TCP connection like git:// or git-over-ssh, the > client would get the capabilities once and then issue a series of > commands. For git-over-http, how does it work? > > The client speaks first in HTTP, so we'd first make a request to get > just the capabilities from the server? And then proceed from there with > a series of requests, assuming that the capabilities for each server we > subsequently contact are the same? That's probably reasonable (and > certainly the existing http protocol makes that capabilities > assumption). > > I don't see any documentation on how this all works with http. 
> But

I can add in a bit for the initial request when using http, but the rest of it should function the same.

> reading patch 34, it looks like we just do the usual
> service=git-upload-pack request (with the magic request for v2), and
> then the server would send us capabilities. Which follows my line of
> thinking in the paragraph above.

Yes, this is exactly how it should work. First we make an info/refs request, and if the server speaks v2 then instead of a ref advertisement we should get back a capability listing. Then subsequent requests are made assuming the capabilities are the same, like we've done with the existing protocol.

The great thing about this is that from the POV of the git-client, it doesn't care if it's speaking using the git://, ssh://, file://, or http:// transport; it's all the same protocol. In my next re-roll I'll even drop the "# service" bit from the http server response and then
Re: [PATCH v3 12/35] serve: introduce git-serve
On 02/21, Jonathan Tan wrote:
> On Tue, 6 Feb 2018 17:12:49 -0800
> Brandon Williams wrote:
>
> >  .gitignore                              |   1
> >  Documentation/technical/protocol-v2.txt | 114
> >  Makefile                                |   2
> >  builtin.h                               |   1
> >  builtin/serve.c                         |  30
> >  git.c                                   |   1
> >  serve.c                                 | 250
> >  serve.h                                 |  15
> >  t/t5701-git-serve.sh                    |  60
> >  9 files changed, 474 insertions(+)
> >  create mode 100644 Documentation/technical/protocol-v2.txt
> >  create mode 100644 builtin/serve.c
> >  create mode 100644 serve.c
> >  create mode 100644 serve.h
> >  create mode 100755 t/t5701-git-serve.sh
>
> As someone who is implementing the server side of protocol V2 in JGit, I
> now have a bit more insight into this :-)
>
> First of all, I used to not have a strong opinion on the existence of a
> new endpoint, but now I think that it's better to *not* have git-serve.
> As it is, as far as I can tell, upload-pack also needs to support (and
> does support, as of the end of this patch set) protocol v2 anyway, so it
> might be better to merely upgrade upload-pack.

Having it allows for easier testing and the easy ability to make it a true endpoint when we want to. As of right now, git-serve isn't an endpoint as you can't issue requests there via http-backend or git-daemon.

>
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments. There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +    command-request = command
> > +                      capability-list
> > +                      (command-args)
>
> If you are stating that this is optional, write "*1command-args". (RFC
> 5234 also supports square brackets, but "*1" is already used in
> pack-protocol.txt and http-protocol.txt.)
>
> > +                      flush-pkt
> > +    command = PKT-LINE("command=" key LF)
> > +    command-args = delim-pkt
> > +                   *arg
> > +    arg = 1*CHAR
>
> arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

--
Brandon Williams
Re: [PATCH v3 12/35] serve: introduce git-serve
On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote: > +In protocol v2 communication is command oriented. When first contacting a > +server a list of capabilities will advertised. Some of these capabilities > +will be commands which a client can request be executed. Once a command > +has completed, a client can reuse the connection and request that other > +commands be executed. If I understand this correctly, we'll potentially have a lot more round-trips between the client and server (one per "command"). And for git-over-http, each one will be its own HTTP request? We've traditionally tried to minimize HTTP requests, but I guess it's not too bad if we can keep the connection open in most cases. Then we just suffer some extra framing bytes, but we don't have to re-establish the TCP connection each time. I do wonder if the extra round trips will be noticeable in high-latency conditions. E.g., if I'm 200ms away, converting the current ref-advertisement spew to "capabilities, then the client asks for refs, then we spew the refs" is going to cost an extra 200ms, even if the fetch just ends up being a noop. I'm not sure how bad that is in the grand scheme of things (after all, the TCP handshake involves some round-trips, too). > + Capability Advertisement > +-- > + > +A server which decides to communicate (based on a request from a client) > +using protocol version 2, notifies the client by sending a version string > +in its initial response followed by an advertisement of its capabilities. > +Each capability is a key with an optional value. Clients must ignore all > +unknown keys. Semantics of unknown values are left to the definition of > +each key. Some capabilities will describe commands which can be requested > +to be executed by the client. 
> + > +capability-advertisement = protocol-version > +capability-list > +flush-pkt > + > +protocol-version = PKT-LINE("version 2" LF) > +capability-list = *capability > +capability = PKT-LINE(key[=value] LF) > + > +key = 1*CHAR > +value = 1*CHAR > +CHAR = 1*(ALPHA / DIGIT / "-" / "_") > + > +A client then responds to select the command it wants with any particular > +capabilities or arguments. There is then an optional section where the > +client can provide any command specific parameters or queries. > + > +command-request = command > + capability-list > + (command-args) > + flush-pkt > +command = PKT-LINE("command=" key LF) > +command-args = delim-pkt > +*arg > +arg = 1*CHAR For a single stateful TCP connection like git:// or git-over-ssh, the client would get the capabilities once and then issue a series of commands. For git-over-http, how does it work? The client speaks first in HTTP, so we'd first make a request to get just the capabilities from the server? And then proceed from there with a series of requests, assuming that the capabilities for each server we subsequently contact are the same? That's probably reasonable (and certainly the existing http protocol makes that capabilities assumption). I don't see any documentation on how this all works with http. But reading patch 34, it looks like we just do the usual service=git-upload-pack request (with the magic request for v2), and then the server would send us capabilities. Which follows my line of thinking in the paragraph above. -Peff
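The capability advertisement grammar quoted above can be exercised with a small parser. This is a minimal sketch, assuming the usual pkt-line framing (four hex digits giving the total length including the length field itself, with "0000" as the flush-pkt). The helper names are made up for this example, and the `agent=example-1` capability value is only illustrative.

```python
FLUSH_PKT = b"0000"


def pkt_line(payload):
    """Frame a text payload as a pkt-line: 4 hex digits of length, then data."""
    data = payload.encode()
    return b"%04x" % (len(data) + 4) + data


def parse_advertisement(stream):
    """Return {key: value-or-None} from a v2 capability advertisement."""
    pos = 0
    lines = []
    while True:
        length = int(stream[pos:pos + 4], 16)
        if length == 0:  # flush-pkt marks the end of the advertisement
            break
        lines.append(stream[pos + 4:pos + length].decode().rstrip("\n"))
        pos += length
    if not lines or lines[0] != "version 2":
        raise ValueError("not a protocol v2 advertisement")
    caps = {}
    for line in lines[1:]:
        key, sep, value = line.partition("=")
        caps[key] = value if sep else None
    return caps


if __name__ == "__main__":
    advert = (pkt_line("version 2\n") + pkt_line("ls-refs\n")
              + pkt_line("agent=example-1\n") + FLUSH_PKT)
    print(parse_advertisement(advert))
```

A client following the spec text would look up only the keys it understands in the returned mapping and ignore the rest. Over http, the same bytes would presumably arrive as the body of the initial info/refs response.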
Re: [PATCH v3 12/35] serve: introduce git-serve
On Tue, 6 Feb 2018 17:12:49 -0800
Brandon Williams wrote:

>  .gitignore                              |   1
>  Documentation/technical/protocol-v2.txt | 114
>  Makefile                                |   2
>  builtin.h                               |   1
>  builtin/serve.c                         |  30
>  git.c                                   |   1
>  serve.c                                 | 250
>  serve.h                                 |  15
>  t/t5701-git-serve.sh                    |  60
>  9 files changed, 474 insertions(+)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100755 t/t5701-git-serve.sh

As someone who is implementing the server side of protocol V2 in JGit, I now have a bit more insight into this :-)

First of all, I used to not have a strong opinion on the existence of a new endpoint, but now I think that it's better to *not* have git-serve. As it is, as far as I can tell, upload-pack also needs to support (and does support, as of the end of this patch set) protocol v2 anyway, so it might be better to merely upgrade upload-pack.

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments. There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +                      capability-list
> +                      (command-args)

If you are stating that this is optional, write "*1command-args". (RFC 5234 also supports square brackets, but "*1" is already used in pack-protocol.txt and http-protocol.txt.)

> +                      flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +                   *arg
> +    arg = 1*CHAR

arg should be wrapped in PKT-LINE, I think, and terminated by an LF.
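To make the two suggestions concrete, here is a sketch of how a command-request might be framed if command-args is optional and every arg is its own LF-terminated pkt-line. It is an illustration under those assumptions, not the patch's implementation; the function names are invented, and the `symrefs` argument is a hypothetical example value.

```python
FLUSH_PKT = b"0000"
DELIM_PKT = b"0001"


def pkt_line(payload):
    """Frame a text payload as a pkt-line: 4 hex digits of length, then data."""
    data = payload.encode()
    return b"%04x" % (len(data) + 4) + data


def command_request(command, capabilities=(), args=()):
    """Build command, capability-list, optional command-args, flush-pkt."""
    out = [pkt_line("command=" + command + "\n")]
    out += [pkt_line(cap + "\n") for cap in capabilities]
    if args:  # the command-args section is optional ("*1command-args")
        out.append(DELIM_PKT)
        out += [pkt_line(arg + "\n") for arg in args]
    out.append(FLUSH_PKT)
    return b"".join(out)


if __name__ == "__main__":
    print(command_request("ls-refs", args=["symrefs"]))
    print(command_request("ls-refs"))
```

Wrapping each arg in a pkt-line lets a reader split arguments by length prefix instead of scanning for separators, which matches how the capability lines are already framed.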
[PATCH v3 12/35] serve: introduce git-serve
Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current wire protocol. The intention is that it will be a simpler, less wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial ref advertisement. In its place a server will export a list of capabilities and commands which it supports in a capability advertisement. A client can then request that a particular command be executed by providing a number of capabilities and command specific parameters. At the completion of a command, a client can request that another command be executed or can terminate the connection by sending a flush packet.

Signed-off-by: Brandon Williams
---
 .gitignore                              |   1
 Documentation/technical/protocol-v2.txt | 114
 Makefile                                |   2
 builtin.h                               |   1
 builtin/serve.c                         |  30
 git.c                                   |   1
 serve.c                                 | 250
 serve.h                                 |  15
 t/t5701-git-serve.sh                    |  60
 9 files changed, 474 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..f87372f9b
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,114 @@
+ Git Wire Protocol, Version 2
+=============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol. Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line (as there will be a single
+    capability per pkt-line)
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+  * Designed with http and stateless-rpc in mind. With clear flush
+    semantics the http remote helper can simply act as a proxy.
+
+ Detailed Design
+================
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented. When first contacting a
+server a list of capabilities will advertised. Some of these capabilities
+will be commands which a client can request be executed. Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+----------------
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+-------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value. Clients must ignore all
+unknown keys. Semantics of unknown values are left to the definition of
+each key. Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+                               capability-list
+                               flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*CHAR
+    value = 1*CHAR
+    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments. There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+    command-request = command
+                      capability-list
+                      (command-args)
+                      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args =