[PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Jonathan Tan
When a user fetches:
 - at least one up-to-date ref and at least one non-up-to-date ref,
 - using HTTP with protocol v0 (or something else that uses the fetch
   command of a remote helper)
some refs might not be updated after the fetch.

This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
info in output parameter", 2018-06-28) which allowed transports to
report the refs that they have fetched in a new out-parameter
"fetched_refs". If they do so, transport_fetch_refs() makes this
information available to its caller.

Users of "fetched_refs" rely on the following 3 properties:
 (1) it is the complete list of refs that was passed to
 transport_fetch_refs(),
 (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
 relevant), and
 (3) it has updated OIDs if ref-in-want was used (introduced after
 989b8c4452).

In an effort to satisfy (1), whenever transport_fetch_refs()
filters the refs sent to the transport, it re-adds the filtered refs to
whatever the transport supplies before returning it to the user.
However, the implementation in 989b8c4452 unconditionally re-adds the
filtered refs without checking if the transport refrained from reporting
anything in "fetched_refs" (which it is allowed to do), resulting in an
incomplete list, no longer satisfying (1).

An earlier effort to resolve this [1] solved the issue by readding the
filtered refs only if the transport did not refrain from reporting in
"fetched_refs", but after further discussion, it seems that the better
solution is to revert the API change that introduced "fetched_refs".
This API change was first suggested as part of a ref-in-want
implementation that allowed for ref patterns and, thus, there could be
drastic differences between the input refs and the refs actually fetched
[2]; we eventually decided to only allow exact ref names, but this API
change remained even though its necessity was decreased.

Therefore, revert this API change by reverting commit 989b8c4452, and
make receive_wanted_refs() update the OIDs in the sought array (like how
update_shallow() updates shallow information in the sought array)
instead. A test is also included to show that the user-visible bug
discussed at the beginning of this commit message no longer exists.

[1] https://public-inbox.org/git/20180801171806.ga122...@google.com/
[2] 
https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/

Signed-off-by: Jonathan Tan 
---
I now think that it's better to revert the API change introducing
"fetched_refs" (or as Peff describes it, "this whole 'return the fetched
refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to
have covered all of Peff's and Junio's questions in the commit message.

As for Brandon's question:

> I haven't thought too much about what we would need to do in the event
> we add patterns to ref-in-want, but couldn't we possible mutate the
> input list again in this case and just simply add the resulting refs to
> the input list?

If we support ref patterns, we would need to support deletion of refs,
not just addition (because a ref might have existed in the initial ref
advertisement, but not when the packfile is delivered). But it should
be possible to add a flag stating "don't use this" to the ref, and
document that transport_fetch_refs() can append additional refs to the
tail of the input list. Upon hindsight, maybe this should have been the
original API change instead of the "fetched_refs" mechanism.
---
 builtin/clone.c |  4 ++--
 builtin/fetch.c | 28 
 fetch-object.c  |  2 +-
 fetch-pack.c| 30 +++---
 t/t5551-http-fetch-smart.sh | 18 ++
 transport-helper.c  |  6 ++
 transport-internal.h|  9 +
 transport.c | 34 ++
 transport.h |  3 +--
 9 files changed, 50 insertions(+), 84 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 5c439f139..76f7db47e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1156,7 +1156,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
if (!is_local && !complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs, NULL);
+   transport_fetch_refs(transport, mapped_refs);
 
remote_head = find_ref_by_name(refs, "HEAD");
remote_head_points_at =
@@ -1198,7 +1198,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs, NULL);
+   transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
   

Re: [PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Brandon Williams
On 08/01, Jonathan Tan wrote:
> When a user fetches:
>  - at least one up-to-date ref and at least one non-up-to-date ref,
>  - using HTTP with protocol v0 (or something else that uses the fetch
>command of a remote helper)
> some refs might not be updated after the fetch.
> 
> This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
> info in output parameter", 2018-06-28) which allowed transports to
> report the refs that they have fetched in a new out-parameter
> "fetched_refs". If they do so, transport_fetch_refs() makes this
> information available to its caller.
> 
> Users of "fetched_refs" rely on the following 3 properties:
>  (1) it is the complete list of refs that was passed to
>  transport_fetch_refs(),
>  (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
>  relevant), and
>  (3) it has updated OIDs if ref-in-want was used (introduced after
>  989b8c4452).
> 
> In an effort to satisfy (1), whenever transport_fetch_refs()
> filters the refs sent to the transport, it re-adds the filtered refs to
> whatever the transport supplies before returning it to the user.
> However, the implementation in 989b8c4452 unconditionally re-adds the
> filtered refs without checking if the transport refrained from reporting
> anything in "fetched_refs" (which it is allowed to do), resulting in an
> incomplete list, no longer satisfying (1).
> 
> An earlier effort to resolve this [1] solved the issue by readding the
> filtered refs only if the transport did not refrain from reporting in
> "fetched_refs", but after further discussion, it seems that the better
> solution is to revert the API change that introduced "fetched_refs".
> This API change was first suggested as part of a ref-in-want
> implementation that allowed for ref patterns and, thus, there could be
> drastic differences between the input refs and the refs actually fetched
> [2]; we eventually decided to only allow exact ref names, but this API
> change remained even though its necessity was decreased.
> 
> Therefore, revert this API change by reverting commit 989b8c4452, and
> make receive_wanted_refs() update the OIDs in the sought array (like how
> update_shallow() updates shallow information in the sought array)
> instead. A test is also included to show that the user-visible bug
> discussed at the beginning of this commit message no longer exists.
> 
> [1] https://public-inbox.org/git/20180801171806.ga122...@google.com/
> [2] 
> https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/
> 
> Signed-off-by: Jonathan Tan 
> ---
> I now think that it's better to revert the API change introducing
> "fetched_refs" (or as Peff describes it, "this whole 'return the fetched
> refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to
> have covered all of Peff's and Junio's questions in the commit message.
> 
> As for Brandon's question:
> 
> > I haven't thought too much about what we would need to do in the event
> > we add patterns to ref-in-want, but couldn't we possible mutate the
> > input list again in this case and just simply add the resulting refs to
> > the input list?
> 
> If we support ref patterns, we would need to support deletion of refs,
> not just addition (because a ref might have existed in the initial ref
> advertisement, but not when the packfile is delivered). But it should
> be possible to add a flag stating "don't use this" to the ref, and
> document that transport_fetch_refs() can append additional refs to the
> tail of the input list. Upon hindsight, maybe this should have been the
> original API change instead of the "fetched_refs" mechanism.

Thanks for getting this out, it looks good to me.  If we end up adding
patterns to ref-in-want then we can explore what changes would need to
be made then, I expect we may need to do a bit more work on the whole
fetching stack to get what we'd want in that case (because we would want
to avoid this issue again).

-- 
Brandon Williams


Re: [PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Junio C Hamano
Brandon Williams  writes:

> ..., I expect we may need to do a bit more work on the whole
> fetching stack to get what we'd want in that case (because we would want
> to avoid this issue again).

Amen.  Thanks all.


Re: [PATCH] fetch-pack: unify ref in and out param

2018-08-02 Thread Jeff King
On Wed, Aug 01, 2018 at 01:13:20PM -0700, Jonathan Tan wrote:

> When a user fetches:
>  - at least one up-to-date ref and at least one non-up-to-date ref,
>  - using HTTP with protocol v0 (or something else that uses the fetch
>command of a remote helper)
> some refs might not be updated after the fetch.
> 
> This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
> info in output parameter", 2018-06-28) which allowed transports to
> report the refs that they have fetched in a new out-parameter
> "fetched_refs". If they do so, transport_fetch_refs() makes this
> information available to its caller.
> 
> Users of "fetched_refs" rely on the following 3 properties:
>  (1) it is the complete list of refs that was passed to
>  transport_fetch_refs(),
>  (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
>  relevant), and
>  (3) it has updated OIDs if ref-in-want was used (introduced after
>  989b8c4452).
> [...]

Thanks, this is a very clear and well-organized commit message. It
answers my questions, and I agree with the general notion of "we can
figure out the right API for ref patterns later" approach.

>  builtin/clone.c |  4 ++--
>  builtin/fetch.c | 28 
>  fetch-object.c  |  2 +-
>  fetch-pack.c| 30 +++---
>  t/t5551-http-fetch-smart.sh | 18 ++
>  transport-helper.c  |  6 ++
>  transport-internal.h|  9 +
>  transport.c | 34 ++
>  transport.h |  3 +--

The patch itself looks sane to me, and obviously fixes the problem. I
cannot offhand think of any reason that munging the existing list would
be a problem (though it has been a while since I have dealt with this
code, so take that with the appropriate grain of salt).

-Peff