Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Stefan Beller
On Thu, Feb 1, 2018 at 12:37 PM, Randall S. Becker
 wrote:

>> I think that it would be too much of a change to up to 1MB lines at the
>> moment so I'm planning on leaving it right where it is :)
>
> In for a kilo, in for a tonne. Once we're way up there, it's not a problem
> or much of a difference. :)

What benefit does a larger buffer have?

I outlined the negatives above (large static buffer, issues with
progress meter).

And it seems to me that Brandon wants to keep this series as small as possible
w.r.t. bait for endless discussions and only deliver innovation, that solves the
immediate needs. Are there issues with too small buffers? (Can you link to
the performance measurements or an analysis?)


RE: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Randall S. Becker
On February 1, 2018 3:08 PM, Brandon Williams wrote:
> On 02/01, Randall S. Becker wrote:
> > On February 1, 2018 1:58 PM, Stefan Beller wrote:
> > > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler
> > > 
> > > wrote:
> > > >
> > > >
> > > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > > >>
> > > >> Introduce git-serve, the base server for protocol version 2.
> > > >>
> > > >> Protocol version 2 is intended to be a replacement for Git's
> > > >> current wire protocol.  The intention is that it will be a
> > > >> simpler, less wasteful protocol which can evolve over time.
> > > >>
> > > >> Protocol version 2 improves upon version 1 by eliminating the
> > > >> initial ref advertisement.  In its place a server will export a
> > > >> list of capabilities and commands which it supports in a
> > > >> capability advertisement.  A client can then request that a
> > > >> particular command be executed by providing a number of
> > > >> capabilities and command specific parameters.  At the completion
> > > >> of a command, a client can request that another command be
> > > >> executed or can terminate the connection by sending a flush packet.
> > > >>
> > > >> Signed-off-by: Brandon Williams 
> > > >> ---
> > > >>   .gitignore  |   1 +
> > > >>   Documentation/technical/protocol-v2.txt |  91 
> > > >>   Makefile|   2 +
> > > >>   builtin.h   |   1 +
> > > >>   builtin/serve.c |  30 
> > > >>   git.c   |   1 +
> > > >>   serve.c | 239
> > > >> 
> > > >>   serve.h |  15 ++
> > > >>   8 files changed, 380 insertions(+)
> > > >>   create mode 100644 Documentation/technical/protocol-v2.txt
> > > >>   create mode 100644 builtin/serve.c
> > > >>   create mode 100644 serve.c
> > > >>   create mode 100644 serve.h
> > > >>
> > > >> diff --git a/.gitignore b/.gitignore index 833ef3b0b..2d0450c26
> > > >> 100644
> > > >> --- a/.gitignore
> > > >> +++ b/.gitignore
> > > >> @@ -140,6 +140,7 @@
> > > >>   /git-rm
> > > >>   /git-send-email
> > > >>   /git-send-pack
> > > >> +/git-serve
> > > >>   /git-sh-i18n
> > > >>   /git-sh-i18n--envsubst
> > > >>   /git-sh-setup
> > > >> diff --git a/Documentation/technical/protocol-v2.txt
> > > >> b/Documentation/technical/protocol-v2.txt
> > > >> new file mode 100644
> > > >> index 0..b87ba3816
> > > >> --- /dev/null
> > > >> +++ b/Documentation/technical/protocol-v2.txt
> > > >> @@ -0,0 +1,91 @@
> > > >> + Git Wire Protocol, Version 2
> > > >> +==
> > > >> +
> > > >> +This document presents a specification for a version 2 of Git's
> > > >> +wire protocol.  Protocol v2 will improve upon v1 in the following
> ways:
> > > >> +
> > > >> +  * Instead of multiple service names, multiple commands will be
> > > >> +supported by a single service.
> > > >> +  * Easily extendable as capabilities are moved into their own
section
> > > >> +of the protocol, no longer being hidden behind a NUL byte and
> > > >> +limited by the size of a pkt-line (as there will be a single
> > > >> +capability per pkt-line).
> > > >> +  * Separate out other information hidden behind NUL bytes (e.g.
> agent
> > > >> +string as a capability and symrefs can be requested using
> > > >> + 'ls-refs')
> > > >> +  * Reference advertisement will be omitted unless explicitly
> > > >> + requested
> > > >> +  * ls-refs command to explicitly request some refs
> > > >> +
> > > >> + Detailed Design
> > > >> +=
> > > >> +
> > > >> +A client can request to speak protocol v2 by sending `version=2`
> > > >> +in the side-channel `GIT_PROTOCOL` in the initial request to the
> server.
> > > >> +
> > > >> +In protocol v2 communication is command oriented.  When first
> > > >> +contacting
> > > >> a
> > > >> +server a list of capabilities will advertised.  Some of these
> > > >> capabilities
> > > >> +will be commands which a client can request be executed.  Once a
> > > >> +command has completed, a client can reuse the connection and
> > > >> +request that other commands be executed.
> > > >> +
> > > >> + Special Packets
> > > >> +-
> > > >> +
> > > >> +In protocol v2 these special packets will have the following
> semantics:
> > > >> +
> > > >> +  * '' Flush Packet (flush-pkt) - indicates the end of a
> > > >> + message
> > > >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of
> > > >> + a message
> > > >
> > > >
> > > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > > following, right?
> > >
> > > No, the length was including the length field, so 0005 would
> > > indicate that there is one byte following, (+4 bytes of "0005"
> > > included)
> > >
> > > > Does this change that and/or prevent 1 byte packets?  (Not sure 

Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread 'Brandon Williams'
On 02/01, Randall S. Becker wrote:
> On February 1, 2018 1:58 PM, Stefan Beller wrote:
> > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler 
> > wrote:
> > >
> > >
> > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > >>
> > >> Introduce git-serve, the base server for protocol version 2.
> > >>
> > >> Protocol version 2 is intended to be a replacement for Git's current
> > >> wire protocol.  The intention is that it will be a simpler, less
> > >> wasteful protocol which can evolve over time.
> > >>
> > >> Protocol version 2 improves upon version 1 by eliminating the initial
> > >> ref advertisement.  In its place a server will export a list of
> > >> capabilities and commands which it supports in a capability
> > >> advertisement.  A client can then request that a particular command
> > >> be executed by providing a number of capabilities and command
> > >> specific parameters.  At the completion of a command, a client can
> > >> request that another command be executed or can terminate the
> > >> connection by sending a flush packet.
> > >>
> > >> Signed-off-by: Brandon Williams 
> > >> ---
> > >>   .gitignore  |   1 +
> > >>   Documentation/technical/protocol-v2.txt |  91 
> > >>   Makefile|   2 +
> > >>   builtin.h   |   1 +
> > >>   builtin/serve.c |  30 
> > >>   git.c   |   1 +
> > >>   serve.c | 239
> > >> 
> > >>   serve.h |  15 ++
> > >>   8 files changed, 380 insertions(+)
> > >>   create mode 100644 Documentation/technical/protocol-v2.txt
> > >>   create mode 100644 builtin/serve.c
> > >>   create mode 100644 serve.c
> > >>   create mode 100644 serve.h
> > >>
> > >> diff --git a/.gitignore b/.gitignore
> > >> index 833ef3b0b..2d0450c26 100644
> > >> --- a/.gitignore
> > >> +++ b/.gitignore
> > >> @@ -140,6 +140,7 @@
> > >>   /git-rm
> > >>   /git-send-email
> > >>   /git-send-pack
> > >> +/git-serve
> > >>   /git-sh-i18n
> > >>   /git-sh-i18n--envsubst
> > >>   /git-sh-setup
> > >> diff --git a/Documentation/technical/protocol-v2.txt
> > >> b/Documentation/technical/protocol-v2.txt
> > >> new file mode 100644
> > >> index 0..b87ba3816
> > >> --- /dev/null
> > >> +++ b/Documentation/technical/protocol-v2.txt
> > >> @@ -0,0 +1,91 @@
> > >> + Git Wire Protocol, Version 2
> > >> +==
> > >> +
> > >> +This document presents a specification for a version 2 of Git's wire
> > >> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> > >> +
> > >> +  * Instead of multiple service names, multiple commands will be
> > >> +supported by a single service.
> > >> +  * Easily extendable as capabilities are moved into their own section
> > >> +of the protocol, no longer being hidden behind a NUL byte and
> > >> +limited by the size of a pkt-line (as there will be a single
> > >> +capability per pkt-line).
> > >> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> > >> +string as a capability and symrefs can be requested using
> > >> + 'ls-refs')
> > >> +  * Reference advertisement will be omitted unless explicitly
> > >> + requested
> > >> +  * ls-refs command to explicitly request some refs
> > >> +
> > >> + Detailed Design
> > >> +=
> > >> +
> > >> +A client can request to speak protocol v2 by sending `version=2` in
> > >> +the side-channel `GIT_PROTOCOL` in the initial request to the server.
> > >> +
> > >> +In protocol v2 communication is command oriented.  When first
> > >> +contacting
> > >> a
> > >> +server a list of capabilities will advertised.  Some of these
> > >> capabilities
> > >> +will be commands which a client can request be executed.  Once a
> > >> +command has completed, a client can reuse the connection and request
> > >> +that other commands be executed.
> > >> +
> > >> + Special Packets
> > >> +-
> > >> +
> > >> +In protocol v2 these special packets will have the following semantics:
> > >> +
> > >> +  * '' Flush Packet (flush-pkt) - indicates the end of a message
> > >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a
> > >> + message
> > >
> > >
> > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > following, right?
> > 
> > No, the length was including the length field, so 0005 would indicate that
> > there is one byte following, (+4 bytes of "0005" included)
> > 
> > > Does this change that and/or prevent 1 byte packets?  (Not sure if it
> > > is likely, but the odd-tail of a packfile might get sent in a 0001
> > > line, right?)  Or is it that 0001 is only special during the V2
> > > negotiation stuff, but not during the packfile transmission?
> > 
> > 0001 is invalid in the current protocol v0.
> > 
> > >
> > > (I'm not 

Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Brandon Williams
On 02/01, Jeff Hostetler wrote:
> 
> 
> On 2/1/2018 1:57 PM, Stefan Beller wrote:
> > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler  
> > wrote:
> > > 
> > > 
> > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > > > 
> > > > Introduce git-serve, the base server for protocol version 2.
> [...]
> > > > + Special Packets
> > > > +-
> > > > +
> > > > +In protocol v2 these special packets will have the following semantics:
> > > > +
> > > > +  * '' Flush Packet (flush-pkt) - indicates the end of a message
> > > > +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a 
> > > > message
> > > 
> > > 
> > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > following, right?
> > 
> > No, the length was including the length field, so 0005 would indicate that
> > there is one byte following, (+4 bytes of "0005" included)
> 
> d'oh.  right.  thanks!
> 
> > > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > > while we're at it ?   That would let us have 1MB buffers if that would
> > > help with large packfiles.
> > 
> > AFAICT there is a static allocation of one pkt-line (of maximum size),
> > such that the code can read in a full packet and then process it.
> > If we'd increase the packet size we'd need the static buffer to be 1MB,
> > which sounds good for my developer machine. But I suspect it may be
> > too much for people using git on embedded devices?
> 
> I got burned by that static buffer once upon a time when I wanted
> to have 2 streams going at the same time.  Hopefully, we can move
> that into the new reader structure at some point (if it isn't already).

Yeah the reader struct could easily be extended to take in the
buffer to read the data into.  Because I'm not trying to do any of that
atm I decided to have it default to using the static buffer, but it
would be as simple as changing the reader->buffer variable to use a
different buffer.

> 
> > 
> > pack files larger than 64k are put into multiple pkt-lines, which is
> > not a big deal, as the overhead of 4bytes per 64k is negligible.
> > (also there is progress information in the side channel, which
> > would come in as a special packet in between real packets,
> > such that every 64k transmitted you can update your progress
> > meter; Not sure I feel strongly on fewer progress updates)
> > 
> > >   Granted, we're throttled by the network,
> > > so it might not matter.  Would it be interesting to have a 5 digit
> > > prefix with parts of the high bits of first digit being flags ?
> > > Or is this too radical of a change?
> > 
> > What would the flags be for?
> > 
> > As an alternative we could put the channel number in one byte,
> > such that we can have a side channel not just while streaming the
> > pack but all the time. (Again, not sure if that buys a lot for us)
> > 
> 
> Delimiters like the 0001 and the side channel are a couple of
> ideas, but I was just thinking out loud.  And right, I'm not sure
> it gets us much right now.
> 
> Jeff

-- 
Brandon Williams


RE: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Randall S. Becker
On February 1, 2018 1:58 PM, Stefan Beller wrote:
> On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler 
> wrote:
> >
> >
> > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> >>
> >> Introduce git-serve, the base server for protocol version 2.
> >>
> >> Protocol version 2 is intended to be a replacement for Git's current
> >> wire protocol.  The intention is that it will be a simpler, less
> >> wasteful protocol which can evolve over time.
> >>
> >> Protocol version 2 improves upon version 1 by eliminating the initial
> >> ref advertisement.  In its place a server will export a list of
> >> capabilities and commands which it supports in a capability
> >> advertisement.  A client can then request that a particular command
> >> be executed by providing a number of capabilities and command
> >> specific parameters.  At the completion of a command, a client can
> >> request that another command be executed or can terminate the
> >> connection by sending a flush packet.
> >>
> >> Signed-off-by: Brandon Williams 
> >> ---
> >>   .gitignore  |   1 +
> >>   Documentation/technical/protocol-v2.txt |  91 
> >>   Makefile|   2 +
> >>   builtin.h   |   1 +
> >>   builtin/serve.c |  30 
> >>   git.c   |   1 +
> >>   serve.c | 239
> >> 
> >>   serve.h |  15 ++
> >>   8 files changed, 380 insertions(+)
> >>   create mode 100644 Documentation/technical/protocol-v2.txt
> >>   create mode 100644 builtin/serve.c
> >>   create mode 100644 serve.c
> >>   create mode 100644 serve.h
> >>
> >> diff --git a/.gitignore b/.gitignore
> >> index 833ef3b0b..2d0450c26 100644
> >> --- a/.gitignore
> >> +++ b/.gitignore
> >> @@ -140,6 +140,7 @@
> >>   /git-rm
> >>   /git-send-email
> >>   /git-send-pack
> >> +/git-serve
> >>   /git-sh-i18n
> >>   /git-sh-i18n--envsubst
> >>   /git-sh-setup
> >> diff --git a/Documentation/technical/protocol-v2.txt
> >> b/Documentation/technical/protocol-v2.txt
> >> new file mode 100644
> >> index 0..b87ba3816
> >> --- /dev/null
> >> +++ b/Documentation/technical/protocol-v2.txt
> >> @@ -0,0 +1,91 @@
> >> + Git Wire Protocol, Version 2
> >> +==
> >> +
> >> +This document presents a specification for a version 2 of Git's wire
> >> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> >> +
> >> +  * Instead of multiple service names, multiple commands will be
> >> +supported by a single service.
> >> +  * Easily extendable as capabilities are moved into their own section
> >> +of the protocol, no longer being hidden behind a NUL byte and
> >> +limited by the size of a pkt-line (as there will be a single
> >> +capability per pkt-line).
> >> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> >> +string as a capability and symrefs can be requested using
> >> + 'ls-refs')
> >> +  * Reference advertisement will be omitted unless explicitly
> >> + requested
> >> +  * ls-refs command to explicitly request some refs
> >> +
> >> + Detailed Design
> >> +=
> >> +
> >> +A client can request to speak protocol v2 by sending `version=2` in
> >> +the side-channel `GIT_PROTOCOL` in the initial request to the server.
> >> +
> >> +In protocol v2 communication is command oriented.  When first
> >> +contacting
> >> a
> >> +server a list of capabilities will advertised.  Some of these
> >> capabilities
> >> +will be commands which a client can request be executed.  Once a
> >> +command has completed, a client can reuse the connection and request
> >> +that other commands be executed.
> >> +
> >> + Special Packets
> >> +-
> >> +
> >> +In protocol v2 these special packets will have the following semantics:
> >> +
> >> +  * '' Flush Packet (flush-pkt) - indicates the end of a message
> >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a
> >> + message
> >
> >
> > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > following, right?
> 
> No, the length was including the length field, so 0005 would indicate that
> there is one byte following, (+4 bytes of "0005" included)
> 
> > Does this change that and/or prevent 1 byte packets?  (Not sure if it
> > is likely, but the odd-tail of a packfile might get sent in a 0001
> > line, right?)  Or is it that 0001 is only special during the V2
> > negotiation stuff, but not during the packfile transmission?
> 
> 0001 is invalid in the current protocol v0.
> 
> >
> > (I'm not against having this delimiter -- I think it is useful, but
> > just curious if will cause problems elsewhere.)
> >
> > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > while we're at it ?   That would let us have 1MB buffers if that would
> > help 

Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Jeff Hostetler



On 2/1/2018 1:57 PM, Stefan Beller wrote:

On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler  wrote:



On 1/2/2018 7:18 PM, Brandon Williams wrote:


Introduce git-serve, the base server for protocol version 2.

[...]

+ Special Packets
+-
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message



Previously, a 0001 pkt-line meant that there was 1 byte of data
following, right?


No, the length was including the length field, so 0005 would indicate that
there is one byte following, (+4 bytes of "0005" included)


d'oh.  right.  thanks!


Should we also consider increasing the pkt-line limit to 5 hex-digits
while we're at it ?   That would let us have 1MB buffers if that would
help with large packfiles.


AFAICT there is a static allocation of one pkt-line (of maximum size),
such that the code can read in a full packet and then process it.
If we'd increase the packet size we'd need the static buffer to be 1MB,
which sounds good for my developer machine. But I suspect it may be
too much for people using git on embedded devices?


I got burned by that static buffer once upon a time when I wanted
to have 2 streams going at the same time.  Hopefully, we can move
that into the new reader structure at some point (if it isn't already).



pack files larger than 64k are put into multiple pkt-lines, which is
not a big deal, as the overhead of 4bytes per 64k is negligible.
(also there is progress information in the side channel, which
would come in as a special packet in between real packets,
such that every 64k transmitted you can update your progress
meter; Not sure I feel strongly on fewer progress updates)


  Granted, we're throttled by the network,
so it might not matter.  Would it be interesting to have a 5 digit
prefix with parts of the high bits of first digit being flags ?
Or is this too radical of a change?


What would the flags be for?

As an alternative we could put the channel number in one byte,
such that we can have a side channel not just while streaming the
pack but all the time. (Again, not sure if that buys a lot for us)



Delimiters like the 0001 and the side channel are a couple of
ideas, but I was just thinking out loud.  And right, I'm not sure
it gets us much right now.

Jeff


Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Stefan Beller
On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler  wrote:
>
>
> On 1/2/2018 7:18 PM, Brandon Williams wrote:
>>
>> Introduce git-serve, the base server for protocol version 2.
>>
>> Protocol version 2 is intended to be a replacement for Git's current
>> wire protocol.  The intention is that it will be a simpler, less
>> wasteful protocol which can evolve over time.
>>
>> Protocol version 2 improves upon version 1 by eliminating the initial
>> ref advertisement.  In its place a server will export a list of
>> capabilities and commands which it supports in a capability
>> advertisement.  A client can then request that a particular command be
>> executed by providing a number of capabilities and command specific
>> parameters.  At the completion of a command, a client can request that
>> another command be executed or can terminate the connection by sending a
>> flush packet.
>>
>> Signed-off-by: Brandon Williams 
>> ---
>>   .gitignore  |   1 +
>>   Documentation/technical/protocol-v2.txt |  91 
>>   Makefile|   2 +
>>   builtin.h   |   1 +
>>   builtin/serve.c |  30 
>>   git.c   |   1 +
>>   serve.c | 239
>> 
>>   serve.h |  15 ++
>>   8 files changed, 380 insertions(+)
>>   create mode 100644 Documentation/technical/protocol-v2.txt
>>   create mode 100644 builtin/serve.c
>>   create mode 100644 serve.c
>>   create mode 100644 serve.h
>>
>> diff --git a/.gitignore b/.gitignore
>> index 833ef3b0b..2d0450c26 100644
>> --- a/.gitignore
>> +++ b/.gitignore
>> @@ -140,6 +140,7 @@
>>   /git-rm
>>   /git-send-email
>>   /git-send-pack
>> +/git-serve
>>   /git-sh-i18n
>>   /git-sh-i18n--envsubst
>>   /git-sh-setup
>> diff --git a/Documentation/technical/protocol-v2.txt
>> b/Documentation/technical/protocol-v2.txt
>> new file mode 100644
>> index 0..b87ba3816
>> --- /dev/null
>> +++ b/Documentation/technical/protocol-v2.txt
>> @@ -0,0 +1,91 @@
>> + Git Wire Protocol, Version 2
>> +==
>> +
>> +This document presents a specification for a version 2 of Git's wire
>> +protocol.  Protocol v2 will improve upon v1 in the following ways:
>> +
>> +  * Instead of multiple service names, multiple commands will be
>> +supported by a single service.
>> +  * Easily extendable as capabilities are moved into their own section
>> +of the protocol, no longer being hidden behind a NUL byte and
>> +limited by the size of a pkt-line (as there will be a single
>> +capability per pkt-line).
>> +  * Separate out other information hidden behind NUL bytes (e.g. agent
>> +string as a capability and symrefs can be requested using 'ls-refs')
>> +  * Reference advertisement will be omitted unless explicitly requested
>> +  * ls-refs command to explicitly request some refs
>> +
>> + Detailed Design
>> +=
>> +
>> +A client can request to speak protocol v2 by sending `version=2` in the
>> +side-channel `GIT_PROTOCOL` in the initial request to the server.
>> +
>> +In protocol v2 communication is command oriented.  When first contacting
>> a
>> +server a list of capabilities will advertised.  Some of these
>> capabilities
>> +will be commands which a client can request be executed.  Once a command
>> +has completed, a client can reuse the connection and request that other
>> +commands be executed.
>> +
>> + Special Packets
>> +-
>> +
>> +In protocol v2 these special packets will have the following semantics:
>> +
>> +  * '' Flush Packet (flush-pkt) - indicates the end of a message
>> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
>
>
> Previously, a 0001 pkt-line meant that there was 1 byte of data
> following, right?

No, the length was including the length field, so 0005 would indicate that
there is one byte following, (+4 bytes of "0005" included)

> Does this change that and/or prevent 1 byte
> packets?  (Not sure if it is likely, but the odd-tail of a packfile
> might get sent in a 0001 line, right?)  Or is it that 0001 is only
> special during the V2 negotiation stuff, but not during the packfile
> transmission?

0001 is invalid in the current protocol v0.

>
> (I'm not against having this delimiter -- I think it is useful, but
> just curious if will cause problems elsewhere.)
>
> Should we also consider increasing the pkt-line limit to 5 hex-digits
> while we're at it ?   That would let us have 1MB buffers if that would
> help with large packfiles.

AFAICT there is a static allocation of one pkt-line (of maximum size),
such that the code can read in a full packet and then process it.
If we'd increase the packet size we'd need the static buffer to be 1MB,
which sounds good for my developer machine. But I suspect it may be
too much 

Re: [PATCH 11/26] serve: introduce git-serve

2018-02-01 Thread Jeff Hostetler



On 1/2/2018 7:18 PM, Brandon Williams wrote:

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams 
---
  .gitignore  |   1 +
  Documentation/technical/protocol-v2.txt |  91 
  Makefile|   2 +
  builtin.h   |   1 +
  builtin/serve.c |  30 
  git.c   |   1 +
  serve.c | 239 
  serve.h |  15 ++
  8 files changed, 380 insertions(+)
  create mode 100644 Documentation/technical/protocol-v2.txt
  create mode 100644 builtin/serve.c
  create mode 100644 serve.c
  create mode 100644 serve.h

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
  /git-rm
  /git-send-email
  /git-send-pack
+/git-serve
  /git-sh-i18n
  /git-sh-i18n--envsubst
  /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 0..b87ba3816
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,91 @@
+ Git Wire Protocol, Version 2
+==
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+supported by a single service.
+  * Easily extendable as capabilities are moved into their own section
+of the protocol, no longer being hidden behind a NUL byte and
+limited by the size of a pkt-line (as there will be a single
+capability per pkt-line).
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+
+ Detailed Design
+=
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message


Previously, a 0001 pkt-line meant that there was 1 byte of data
following, right?  Does this change that and/or prevent 1 byte
packets?  (Not sure if it is likely, but the odd-tail of a packfile
might get sent in a 0001 line, right?)  Or is it that 0001 is only
special during the V2 negotiation stuff, but not during the packfile
transmission?

(I'm not against having this delimiter -- I think it is useful, but
just curious if will cause problems elsewhere.)

Should we also consider increasing the pkt-line limit to 5 hex-digits
while we're at it ?   That would let us have 1MB buffers if that would
help with large packfiles.  Granted, we're throttled by the network,
so it might not matter.  Would it be interesting to have a 5 digit
prefix with parts of the high bits of first digit being flags ?
Or is this too radical of a change?



+
+ Capability Advertisement
+--
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+capability-advertisement = protocol-version
+  

Re: [PATCH 11/26] serve: introduce git-serve

2018-01-09 Thread Brandon Williams
On 01/09, Jonathan Tan wrote:
> On Tue, 9 Jan 2018 14:16:42 -0800
> Brandon Williams  wrote:
> 
> > All good documentation changes.
> 
> Thanks!
> 
> > > > +   /*
> > > > +* Function called when a client requests the capability as a 
> > > > command.
> > > > +* The command request will be provided to the function via 
> > > > 'keys', the
> > > > +* capabilities requested, and 'args', the command specific 
> > > > parameters.
> > > > +*
> > > > +* This field should be NULL for capabilities which are not 
> > > > commands.
> > > > +*/
> > > > +   int (*command)(struct repository *r,
> > > > +  struct argv_array *keys,
> > > > +  struct argv_array *args);
> > > 
> > > Looking at the code below, I see that the command is not executed unless
> > > advertise returns true - this means that a command cannot be both
> > > supported and unadvertised. Would this be too restrictive? For example,
> > > this would disallow a gradual across-multiple-servers rollout in which
> > > we allow but not advertise a capability, and then after some time,
> > > advertise the capability.
> > 
> > One way to change this would be to just add another function to the
> > struct which is called to check if the command is allowed, instead of
> > relying on the same function to do that for both advertise and
> > allow...though I don't see a big win for allowing a command but not
> > advertising it.
> 
> My rationale for allowing a command but not advertising it is in the
> paragraph above (that you quoted), but if that is insufficient
> rationale, then I agree that we don't need to do this.

I have no issues with adding that functionality, i don't really feel
that strongly one way or another.  Just seemed like additional work for
not much gain right now, key being right now.  It very well may be worth
it for the use case you specified.  If so I can definitely make the
change.

> 
> > > If we change this, then the value parameter of advertise can be
> > > mandatory instead of optional.
> > 
> > I don't see how this fixes the issue you bring up.
> 
> This is a consequence, not a fix - if we were to do as I suggested, then
> we no longer need to invoke advertise to check whether something is
> advertised except when we are advertising them, in which case "value"
> never needs to be NULL.

Oh I understand what you are trying to explain, yes you're right.

-- 
Brandon Williams


Re: [PATCH 11/26] serve: introduce git-serve

2018-01-09 Thread Jonathan Tan
On Tue, 9 Jan 2018 14:16:42 -0800
Brandon Williams  wrote:

> All good documentation changes.

Thanks!

> > > + /*
> > > +  * Function called when a client requests the capability as a command.
> > > +  * The command request will be provided to the function via 'keys', the
> > > +  * capabilities requested, and 'args', the command specific parameters.
> > > +  *
> > > +  * This field should be NULL for capabilities which are not commands.
> > > +  */
> > > + int (*command)(struct repository *r,
> > > +struct argv_array *keys,
> > > +struct argv_array *args);
> > 
> > Looking at the code below, I see that the command is not executed unless
> > advertise returns true - this means that a command cannot be both
> > supported and unadvertised. Would this be too restrictive? For example,
> > this would disallow a gradual across-multiple-servers rollout in which
> > we allow but not advertise a capability, and then after some time,
> > advertise the capability.
> 
> One way to change this would be to just add another function to the
> struct which is called to check if the command is allowed, instead of
> relying on the same function to do that for both advertise and
> allow...though I don't see a big win for allowing a command but not
> advertising it.

My rationale for allowing a command but not advertising it is in the
paragraph above (that you quoted), but if that is insufficient
rationale, then I agree that we don't need to do this.

> > If we change this, then the value parameter of advertise can be
> > mandatory instead of optional.
> 
> I don't see how this fixes the issue you bring up.

This is a consequence, not a fix - if we were to do as I suggested, then
we no longer need to invoke advertise to check whether something is
advertised except when we are advertising them, in which case "value"
never needs to be NULL.


Re: [PATCH 11/26] serve: introduce git-serve

2018-01-09 Thread Brandon Williams
On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:13 -0800
> Brandon Williams  wrote:
> 
> > diff --git a/Documentation/technical/protocol-v2.txt 
> > b/Documentation/technical/protocol-v2.txt
> > new file mode 100644
> > index 0..b87ba3816
> > --- /dev/null
> > +++ b/Documentation/technical/protocol-v2.txt
> 
> I'll review the documentation later, once there is some consensus that
> the overall design is OK. (Or maybe there already is consensus?)
> 
> > diff --git a/builtin/serve.c b/builtin/serve.c
> > new file mode 100644
> > index 0..bb726786a
> > --- /dev/null
> > +++ b/builtin/serve.c
> > @@ -0,0 +1,30 @@
> > +#include "cache.h"
> > +#include "builtin.h"
> > +#include "parse-options.h"
> > +#include "serve.h"
> > +
> > +static char const * const grep_usage[] = {
> 
> Should be serve_usage.
> 
> > diff --git a/serve.c b/serve.c
> > new file mode 100644
> > index 0..da8127775
> > --- /dev/null
> > +++ b/serve.c
> 
> [snip]
> 
> > +struct protocol_capability {
> > +   const char *name; /* capability name */
> 
> Maybe document as:
> 
>   The name of the capability. The server uses this name when advertising
>   this capability, and the client uses this name to invoke the command
>   corresponding to this capability.
> 
> > +   /*
> > +* Function queried to see if a capability should be advertised.
> > +* Optionally a value can be specified by adding it to 'value'.
> > +*/
> > +   int (*advertise)(struct repository *r, struct strbuf *value);
> 
> Document what happens when value is appended to. For example:
> 
>   ... If value is appended to, the server will advertise this capability
>   as = instead of .
> 

All good documentation changes.

> > +   /*
> > +* Function called when a client requests the capability as a command.
> > +* The command request will be provided to the function via 'keys', the
> > +* capabilities requested, and 'args', the command specific parameters.
> > +*
> > +* This field should be NULL for capabilities which are not commands.
> > +*/
> > +   int (*command)(struct repository *r,
> > +  struct argv_array *keys,
> > +  struct argv_array *args);
> 
> Looking at the code below, I see that the command is not executed unless
> advertise returns true - this means that a command cannot be both
> supported and unadvertised. Would this be too restrictive? For example,
> this would disallow a gradual across-multiple-servers rollout in which
> we allow but not advertise a capability, and then after some time,
> advertise the capability.

One way to change this would be to just add another function to the
struct which is called to check if the command is allowed, instead of
relying on the same function to do that for both advertise and
allow...though I don't see a big win for allowing a command but not
advertising it.

> 
> If we change this, then the value parameter of advertise can be
> mandatory instead of optional.

I don't see how this fixes the issue you bring up.

-- 
Brandon Williams


Re: [PATCH 11/26] serve: introduce git-serve

2018-01-09 Thread Jonathan Tan
On Tue,  2 Jan 2018 16:18:13 -0800
Brandon Williams  wrote:

> diff --git a/Documentation/technical/protocol-v2.txt 
> b/Documentation/technical/protocol-v2.txt
> new file mode 100644
> index 0..b87ba3816
> --- /dev/null
> +++ b/Documentation/technical/protocol-v2.txt

I'll review the documentation later, once there is some consensus that
the overall design is OK. (Or maybe there already is consensus?)

> diff --git a/builtin/serve.c b/builtin/serve.c
> new file mode 100644
> index 0..bb726786a
> --- /dev/null
> +++ b/builtin/serve.c
> @@ -0,0 +1,30 @@
> +#include "cache.h"
> +#include "builtin.h"
> +#include "parse-options.h"
> +#include "serve.h"
> +
> +static char const * const grep_usage[] = {

Should be serve_usage.

> diff --git a/serve.c b/serve.c
> new file mode 100644
> index 0..da8127775
> --- /dev/null
> +++ b/serve.c

[snip]

> +struct protocol_capability {
> + const char *name; /* capability name */

Maybe document as:

  The name of the capability. The server uses this name when advertising
  this capability, and the client uses this name to invoke the command
  corresponding to this capability.

> + /*
> +  * Function queried to see if a capability should be advertised.
> +  * Optionally a value can be specified by adding it to 'value'.
> +  */
> + int (*advertise)(struct repository *r, struct strbuf *value);

Document what happens when value is appended to. For example:

  ... If value is appended to, the server will advertise this capability
  as = instead of .

> + /*
> +  * Function called when a client requests the capability as a command.
> +  * The command request will be provided to the function via 'keys', the
> +  * capabilities requested, and 'args', the command specific parameters.
> +  *
> +  * This field should be NULL for capabilities which are not commands.
> +  */
> + int (*command)(struct repository *r,
> +struct argv_array *keys,
> +struct argv_array *args);

Looking at the code below, I see that the command is not executed unless
advertise returns true - this means that a command cannot be both
supported and unadvertised. Would this be too restrictive? For example,
this would disallow a gradual across-multiple-servers rollout in which
we allow but not advertise a capability, and then after some time,
advertise the capability.

If we change this, then the value parameter of advertise can be
mandatory instead of optional.