Re: [PATCH 00/10] pack v4 UI support

2013-09-26 Thread Nicolas Pitre
On Thu, 26 Sep 2013, Duy Nguyen wrote:

> On Thu, Sep 26, 2013 at 11:51 AM, Nicolas Pitre  wrote:
> >> Multi-base tree support is not part of "packv4" capability. Let's see
> >> if such support comes before the series is merged to master. If so we
> >> can drop that line from protocol-capabilities.txt. Otherwise a new
> >> capability can be added for multi-base trees.
> >
> > What is that for?  Multi-base trees are not created yet, but the code to
> > parse them is already there.  So I don't see the point of having a
> > capability in the protocol for this.
> 
> pv4_get_tree() can. index-pack cannot.

Hmmm... right.

I think this ought to be part of the "packv4" capability nevertheless. 
This is not going into the official git branch right away so there is 
still time to implement it.

> >> Another capability could be added for sending the actual number of
> >> objects in a thin pack for more accurate display in index-pack. Low
> >> priority in my opinion.
> >
> > That just cannot be communicated during capability exchange.  This
> > number is known only after object enumeration.  Hence my suggestion of a
> > ['T', 'H', 'I', 'N', htonl()] special header
> > prepended to the actual pack on the wire.  And that has to be decided
> > before formalizing the pack v4 capability.  That makes it a somewhat
> > higher priority.
> 
> The capability is to let the server know the client understands ['T',
> 'H', 'I', 'N' ..]. The server can choose not to send it later, but
> that's ok. Hence the new capability. I'm somewhat reluctant to do it
> because of peeking code in fetch-pack and receive-pack and some
> refactoring may be needed first. But I could certainly put it on
> higher priority.

Here's what I'm suggesting as a start (untested):

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e3eb5fc..04e5ae1 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -792,9 +792,10 @@ static struct command *read_head_info(void)
return commands;
 }
 
-static const char *parse_pack_header(struct pack_header *hdr)
+static const char *parse_pack_header(struct pack_header *hdr,
+struct thin_header *thin_hdr)
 {
-   switch (read_pack_header(0, hdr)) {
+   switch (read_pack_header(0, hdr, thin_hdr)) {
case PH_ERROR_EOF:
return "eof before pack header was fully read";
 
@@ -817,23 +818,25 @@ static const char *pack_lockfile;
 static const char *unpack(int err_fd)
 {
struct pack_header hdr;
+   struct thin_header thin_hdr;
const char *hdr_err;
-   char hdr_arg[38];
+   char hdr_arg[50];
int fsck_objects = (receive_fsck_objects >= 0
? receive_fsck_objects
: transfer_fsck_objects >= 0
? transfer_fsck_objects
: 0);
 
-   hdr_err = parse_pack_header(&hdr);
+   hdr_err = parse_pack_header(&hdr, &thin_hdr);
if (hdr_err) {
if (err_fd > 0)
close(err_fd);
return hdr_err;
}
snprintf(hdr_arg, sizeof(hdr_arg),
-   "--pack_header=%"PRIu32",%"PRIu32,
-   ntohl(hdr.hdr_version), ntohl(hdr.hdr_entries));
+   "--pack_header=%"PRIu32",%"PRIu32",%"PRIu32,
+   ntohl(hdr.hdr_version), ntohl(hdr.hdr_entries),
+   ntohl(thin_hdr.hdr_entries));
 
if (ntohl(hdr.hdr_entries) < unpack_limit) {
int code, i = 0;
diff --git a/fetch-pack.c b/fetch-pack.c
index 6684348..d86e5d1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -715,12 +715,14 @@ static int get_pack(struct fetch_pack_args *args,
*hdr_arg = 0;
if (!args->keep_pack && unpack_limit) {
struct pack_header header;
+   struct thin_header thin_hdr;
 
-   if (read_pack_header(demux.out, &header))
+   if (read_pack_header(demux.out, &header, &thin_hdr))
die("protocol error: bad pack header");
snprintf(hdr_arg, sizeof(hdr_arg),
-"--pack_header=%"PRIu32",%"PRIu32,
-ntohl(header.hdr_version), ntohl(header.hdr_entries));
+"--pack_header=%"PRIu32",%"PRIu32",%"PRIu32,
+ntohl(header.hdr_version), ntohl(header.hdr_entries),
+ntohl(thin_hdr.hdr_entries));
if (ntohl(header.hdr_entries) < unpack_limit)
do_keep = 0;
else
diff --git a/pack.h b/pack.h
index ccefdbe..974b860 100644
--- a/pack.h
+++ b/pack.h
@@ -16,6 +16,18 @@ struct pack_header {
 };
 
 /*
+ * Pack v4 header prefix for thin packs, indicating the actual number
+ * of objects being transmitted.  Expected before the actual pack header
+ * on the wire and not stored on disk upon reception.  This is not
+ * included in the compu

Re: [PATCH 00/10] pack v4 UI support

2013-09-25 Thread Duy Nguyen
On Thu, Sep 26, 2013 at 11:51 AM, Nicolas Pitre  wrote:
>> Multi-base tree support is not part of "packv4" capability. Let's see
>> if such support comes before the series is merged to master. If so we
>> can drop that line from protocol-capabilities.txt. Otherwise a new
>> capability can be added for multi-base trees.
>
> What is that for?  Multi-base trees are not created yet, but the code to
> parse them is already there.  So I don't see the point of having a
> capability in the protocol for this.

pv4_get_tree() can. index-pack cannot.

>> Another capability could be added for sending the actual number of
>> objects in a thin pack for more accurate display in index-pack. Low
>> priority in my opinion.
>
> That just cannot be communicated during capability exchange.  This
> number is known only after object enumeration.  Hence my suggestion of a
> ['T', 'H', 'I', 'N', htonl()] special header
> prepended to the actual pack on the wire.  And that has to be decided
> before formalizing the pack v4 capability.  That makes it a somewhat
> higher priority.

The capability is to let the server know the client understands ['T',
'H', 'I', 'N' ..]. The server can choose not to send it later, but
that's ok. Hence the new capability. I'm somewhat reluctant to do it
because of peeking code in fetch-pack and receive-pack and some
refactoring may be needed first. But I could certainly put it on
higher priority.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/10] pack v4 UI support

2013-09-25 Thread Nicolas Pitre
On Thu, 26 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> About the "packv4" capability in git protocol. I considered a more
> generic cap "packver=ver[,ver[,ver...]]" that represents all supported
> versions, with the order of preference from the first ver (most preferred)
> to the last (least preferred). But it adds more parsing code
> and frankly I don't think we'll have pack v5 in the next five years.
> So I dropped it.

Agreed.  In fact the receiver should only indicate the highest pack 
version it is ready to accept.  We'll have to support the sending and 
receiving of any previous pack versions forever anyway.

> Multi-base tree support is not part of "packv4" capability. Let's see
> if such support comes before the series is merged to master. If so we
> can drop that line from protocol-capabilities.txt. Otherwise a new
> capability can be added for multi-base trees.

What is that for?  Multi-base trees are not created yet, but the code to 
parse them is already there.  So I don't see the point of having a 
capability in the protocol for this.

> Another capability could be added for sending the actual number of
> objects in a thin pack for more accurate display in index-pack. Low
> priority in my opinion.

That just cannot be communicated during capability exchange.  This 
number is known only after object enumeration.  Hence my suggestion of a 
['T', 'H', 'I', 'N', htonl()] special header 
prepended to the actual pack on the wire.  And that has to be decided 
before formalizing the pack v4 capability.  That makes it a somewhat 
higher priority.

> There's also the pack conversion issue. Suppose the client requests v4
> but, for performance purposes, the server sends v2. Should index-pack
> convert the received pack to v4? My answer is no because there's no
> way we can do it without saving v2 first. And if we have to save v2
> anyway, pack-objects (or repack) will do the conversion better. We can
> adjust git-gc to maybe repack more often when the number of v2 packs
> is over a limit, or just repack v2 packs into one v4 pack.

Yeah... those are questions that certainly can wait.  Only 
experimentation will tell what is best, and that won't be conclusive 
until the core git code learns to parse pack v4 data natively.

I'll try to review your patches soon.


Nicolas


[PATCH 00/10] pack v4 UI support

2013-09-25 Thread Nguyễn Thái Ngọc Duy
This adds

 - clone, fetch and repack learn --pack-version (backends also learn
   new options, check out the patches for details)

 - core.preferredPackVersion to specify the default pack version for
   fetch, clone, repack and the receiver of push.
   
   This config is supposed to be used by porcelain commands only (with
   the exception of receive-pack). All other plumbing will default to
   pack v2 forever.

 - git and remote helper protocols learn about packv4 capabilities

I think the UI changes for pack v4 are done with this series. Well,
git-bundle needs --pack-version too but it's not part of core
user interaction. The remaining work for v4 would be optimization
(that includes multi-base tree support) and more v4 tests. Let me know
if I missed anything here.

I'm aware that repack is being rewritten in C, so it'll cause
conflicts when np/pack-v4 is re-pulled to 'pu' again. I suggest that
the new tests in t7700 are marked failed at the merge. We can add
--pack-version to the C-based repack and enable the tests later.

About the "packv4" capability in git protocol. I considered a more
generic cap "packver=ver[,ver[,ver...]]" that represents all supported
versions, with the order of preference from the first ver (most preferred)
to the last (least preferred). But it adds more parsing code
and frankly I don't think we'll have pack v5 in the next five years.
So I dropped it.

Multi-base tree support is not part of "packv4" capability. Let's see
if such support comes before the series is merged to master. If so we
can drop that line from protocol-capabilities.txt. Otherwise a new
capability can be added for multi-base trees.

Another capability could be added for sending the actual number of
objects in a thin pack for more accurate display in index-pack. Low
priority in my opinion.

There's also the pack conversion issue. Suppose the client requests v4
but, for performance purposes, the server sends v2. Should index-pack
convert the received pack to v4? My answer is no because there's no
way we can do it without saving v2 first. And if we have to save v2
anyway, pack-objects (or repack) will do the conversion better. We can
adjust git-gc to maybe repack more often when the number of v2 packs
is over a limit, or just repack v2 packs into one v4 pack.

Nguyễn Thái Ngọc Duy (10):
  test-dump: new test program to examine binary data
  config: add core.preferredPackVersion
  upload-pack: new capability to send pack v4
  fetch: new option to set preferred pack version for transfer
  clone: new option to set preferred pack version for transfer
  fetch: pack v4 support on smart http
  receive-pack: request for packv4 if it's the preferred version
  send-pack: support pack v4
  repack: add --pack-version and fall back to core.preferredPackVersion
  count-objects: report pack v4 usage

 .gitignore|  1 +
 Documentation/config.txt  |  5 
 Documentation/fetch-options.txt   |  5 
 Documentation/git-clone.txt   |  4 +++
 Documentation/git-count-objects.txt   |  4 +++
 Documentation/git-fetch-pack.txt  |  4 +++
 Documentation/git-repack.txt  |  6 +++-
 Documentation/gitremote-helpers.txt   |  3 ++
 Documentation/technical/protocol-capabilities.txt | 14 +
 Makefile  |  1 +
 builtin/clone.c   | 19 
 builtin/count-objects.c   | 23 +--
 builtin/fetch-pack.c  |  7 +
 builtin/fetch.c   | 10 +++
 builtin/receive-pack.c|  3 +-
 cache.h   |  1 +
 config.c  |  5 
 environment.c |  1 +
 fetch-pack.c  |  3 ++
 fetch-pack.h  |  1 +
 git-repack.sh |  8 +-
 remote-curl.c | 14 -
 send-pack.c   |  5 
 send-pack.h   |  1 +
 t/t5510-fetch.sh  | 13 +
 t/t5516-fetch-push.sh | 12 
 t/t5551-http-fetch.sh |  9 ++
 t/t5601-clone.sh  | 12 
 t/t7700-repack.sh | 35 +++
 test-dump.c (new) | 27 +
 transport-helper.c|  3 ++
 transport.c   |  4 +++
 transport.h   |  5 
 upload-pack.c | 16 +--
 34 files changed, 274 insertions(+