Re: [PATCH] transport: report refs only if transport does
On Tue, Jul 31, 2018 at 04:23:43PM -0700, Jonathan Tan wrote: > > > Because transport_fetch_refs() filters the refs sent to the transport, > > > it cannot just report the transport's result directly, but first needs > > > to readd the excluded refs, pretending that they are fetched. However, > > > this results in a wrong result if the transport did not report the refs > > > that they have fetched in "fetched_refs" - the excluded refs would be > > > added and reported, presenting an incomplete picture to the caller. > > > > This part leaves me confused. If we are not fetching them, then why do > > we need to pretend that they are fetched? > > The short answer is that we need: > (1) the complete list of refs that was passed to > transport_fetch_refs(), > (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) with updated OIDs if ref-in-want was used. > > The fetched_refs out param already fulfils (2) and (3), and this patch > makes it fulfil (1). As for calling them fetched_refs, perhaps that is a > misnomer, but they do appear in FETCH_HEAD even though they are not > truly fetched. Thanks for this explanation. It does make more sense to me now, and I agree that a lot of my confusion was from calling it "fetched_refs" (and the comment saying "reported as fetched, but not actually fetched"). > Which raises the question...if completeness is so important, why not > reuse the input list of refs and document that transport_fetch_refs() > can mutate the input list? You ask the same question below, so I'll put > the answer after quoting your paragraph. > [...] Thanks, the answer here was enlightening as well. I see you posted a patch to go back to mutating the list, and that seems reasonable to me. I'm fine with a separate "out" list, too. Its purpose and expectations just need to be reflected in the name (and possibly in a comment). -Peff
Re: [PATCH] transport: report refs only if transport does
On 07/31, Jonathan Tan wrote: > > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > > > > > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > > > 2018-06-28) allows transports to report the refs that they have fetched > > > in a new out-parameter "fetched_refs". If they do so, > > > transport_fetch_refs() makes this information available to its caller. > > > > > > Because transport_fetch_refs() filters the refs sent to the transport, > > > it cannot just report the transport's result directly, but first needs > > > to readd the excluded refs, pretending that they are fetched. However, > > > this results in a wrong result if the transport did not report the refs > > > that they have fetched in "fetched_refs" - the excluded refs would be > > > added and reported, presenting an incomplete picture to the caller. > > > > This part leaves me confused. If we are not fetching them, then why do > > we need to pretend that they are fetched? > > The short answer is that we need: > (1) the complete list of refs that was passed to > transport_fetch_refs(), > (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if > relevant), and > (3) with updated OIDs if ref-in-want was used. > > The fetched_refs out param already fulfils (2) and (3), and this patch > makes it fulfil (1). As for calling them fetched_refs, perhaps that is a > misnomer, but they do appear in FETCH_HEAD even though they are not > truly fetched. > > Which raises the question...if completeness is so important, why not > reuse the input list of refs and document that transport_fetch_refs() > can mutate the input list? You ask the same question below, so I'll put > the answer after quoting your paragraph. > > > I think I am showing my lack of understanding about the reason for this > > whole "return the fetched refs" scheme from 989b8c4452, and probably > > reading the rest of that series would make it more clear. But from the > > perspective of somebody digging into history and finding just this > > commit, it probably needs to lay out a little more of the reasoning. > > I think it's because 989b8c4452 is based on my earlier work [1] which > also had a fetched_refs out param. Its main reason is to enable the > invoker of transport_fetch_refs() to specify ref patterns (as you can > see in a later commit in the same patch set [2]) - and if we specify > patterns, the invoker of transport_fetch_refs() needs the resulting refs > (which are provided through fetched_refs). > > In the version that made it to master, however, there was some debate > about whether ref patterns need to be allowed. In the end, ref patterns > were not allowed [3], but the fetched_refs out param was still left in. > > I think that reverting the API might work, but am on the fence about it. > It would reduce the number of questions about the code (and would > probably automatically fix the issue that I was fixing in the first If you believe the API is difficult to work with (which given this bug it is) then perhaps we go with your suggestion and revert the API back to only providing a list of input refs and having the fetch operation mutate that input list. > place), but if we were to revert the API and then decide that we do want > ref patterns in "want-ref" (or expand transport_fetch_refs in some > similar way), we would need to revert our revert, causing code churn. I haven't thought too much about what we would need to do in the event we add patterns to ref-in-want, but couldn't we possible mutate the input list again in this case and just simply add the resulting refs to the input list? > > [1] > https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/ > [2] > https://public-inbox.org/git/eef2b77d88df0db08e4a1505b06e0af2d40143d5.1485381677.git.jonathanta...@google.com/ > [3] https://public-inbox.org/git/20180620213235.10952-1-bmw...@google.com/ -- Brandon Williams
Re: [PATCH] transport: report refs only if transport does
> What leaves me even more confused is that the entire log message > does not make it clear what the end-user observable problem the > patch is trying to solve. > > Is this "we sometimes follow and sometimes fail to follow refs while > fetching"? Does it affect all protocol versions and transports, or > only just selected few (and if so which ones)? Normally I would respond by creating a new patch with the answer in its commit message, but I'm now not sure about whether it's better to revert back to the non-"fetched_refs" API entirely (as I explained in the reply to Peff I just sent [1]), so I'll answer your questions here for now: - Yes. We fail to follow when we fetch at least one ref that is up-to-date and one ref that is not, and when we're using the "fetch" command in a remote helper (for example, HTTP protocol v0). - I haven't checked exhaustively, but as far as I know, affects HTTP protocol v0, and does not affect anything using connect or stateless-connect (e.g. HTTP protocol v2, ssh). When I create a new patch, I'll also include these answers in its commit message. [1] https://public-inbox.org/git/20180731232343.184463-1-jonathanta...@google.com/
Re: [PATCH] transport: report refs only if transport does
> On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > > > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > > 2018-06-28) allows transports to report the refs that they have fetched > > in a new out-parameter "fetched_refs". If they do so, > > transport_fetch_refs() makes this information available to its caller. > > > > Because transport_fetch_refs() filters the refs sent to the transport, > > it cannot just report the transport's result directly, but first needs > > to readd the excluded refs, pretending that they are fetched. However, > > this results in a wrong result if the transport did not report the refs > > that they have fetched in "fetched_refs" - the excluded refs would be > > added and reported, presenting an incomplete picture to the caller. > > This part leaves me confused. If we are not fetching them, then why do > we need to pretend that they are fetched? The short answer is that we need: (1) the complete list of refs that was passed to transport_fetch_refs(), (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if relevant), and (3) with updated OIDs if ref-in-want was used. The fetched_refs out param already fulfils (2) and (3), and this patch makes it fulfil (1). As for calling them fetched_refs, perhaps that is a misnomer, but they do appear in FETCH_HEAD even though they are not truly fetched. Which raises the question...if completeness is so important, why not reuse the input list of refs and document that transport_fetch_refs() can mutate the input list? You ask the same question below, so I'll put the answer after quoting your paragraph. > I think I am showing my lack of understanding about the reason for this > whole "return the fetched refs" scheme from 989b8c4452, and probably > reading the rest of that series would make it more clear. But from the > perspective of somebody digging into history and finding just this > commit, it probably needs to lay out a little more of the reasoning. I think it's because 989b8c4452 is based on my earlier work [1] which also had a fetched_refs out param. Its main reason is to enable the invoker of transport_fetch_refs() to specify ref patterns (as you can see in a later commit in the same patch set [2]) - and if we specify patterns, the invoker of transport_fetch_refs() needs the resulting refs (which are provided through fetched_refs). In the version that made it to master, however, there was some debate about whether ref patterns need to be allowed. In the end, ref patterns were not allowed [3], but the fetched_refs out param was still left in. I think that reverting the API might work, but am on the fence about it. It would reduce the number of questions about the code (and would probably automatically fix the issue that I was fixing in the first place), but if we were to revert the API and then decide that we do want ref patterns in "want-ref" (or expand transport_fetch_refs in some similar way), we would need to revert our revert, causing code churn. [1] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/ [2] https://public-inbox.org/git/eef2b77d88df0db08e4a1505b06e0af2d40143d5.1485381677.git.jonathanta...@google.com/ [3] https://public-inbox.org/git/20180620213235.10952-1-bmw...@google.com/
Re: [PATCH] transport: report refs only if transport does
Jeff King writes: > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > >> Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", >> 2018-06-28) allows transports to report the refs that they have fetched >> in a new out-parameter "fetched_refs". If they do so, >> transport_fetch_refs() makes this information available to its caller. >> >> Because transport_fetch_refs() filters the refs sent to the transport, >> it cannot just report the transport's result directly, but first needs >> to readd the excluded refs, pretending that they are fetched. However, >> this results in a wrong result if the transport did not report the refs >> that they have fetched in "fetched_refs" - the excluded refs would be >> added and reported, presenting an incomplete picture to the caller. > > This part leaves me confused. If we are not fetching them, then why do > we need to pretend that they are fetched? What leaves me even more confused is that the entire log message does not make it clear what the end-user observable problem the patch is trying to solve. Is this "we sometimes follow and sometimes fail to follow refs while fetching"? Does it affect all protocol versions and transports, or only just selected few (and if so which ones)? In minds of those who reported an issue and wrote the fix, the issue may be fresh, but let's write the commit log message for ourselves 6 months down the road. Thanks.
Re: [PATCH] transport: report refs only if transport does
On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote: > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", > 2018-06-28) allows transports to report the refs that they have fetched > in a new out-parameter "fetched_refs". If they do so, > transport_fetch_refs() makes this information available to its caller. > > Because transport_fetch_refs() filters the refs sent to the transport, > it cannot just report the transport's result directly, but first needs > to readd the excluded refs, pretending that they are fetched. However, > this results in a wrong result if the transport did not report the refs > that they have fetched in "fetched_refs" - the excluded refs would be > added and reported, presenting an incomplete picture to the caller. This part leaves me confused. If we are not fetching them, then why do we need to pretend that they are fetched? I think I am showing my lack of understanding about the reason for this whole "return the fetched refs" scheme from 989b8c4452, and probably reading the rest of that series would make it more clear. But from the perspective of somebody digging into history and finding just this commit, it probably needs to lay out a little more of the reasoning. > Thanks for the reproduction recipe, Peff. Here's a fix. It can be > reproduced with something using a remote helper's fetch command (and not > using "connect" or "stateless-connect"), fetching at least one ref that > requires a ref update and at least one that does not (as you can see > from the included test). Ah, that explains why I couldn't reproduce it with another repository; I was using a direct git-upload-pack fetch, which wouldn't trigger the remote helper code. > diff --git a/transport.c b/transport.c > index fdd813f684..2a2415d79c 100644 > --- a/transport.c > +++ b/transport.c > @@ -1230,17 +1230,18 @@ int transport_fetch_refs(struct transport *transport, > struct ref *refs, > struct ref **heads = NULL; > struct ref *nop_head = NULL, **nop_tail = _head; > struct ref *rm; > + struct ref *fetched_by_transport = NULL; > > for (rm = refs; rm; rm = rm->next) { > nr_refs++; > if (rm->peer_ref && > !is_null_oid(>old_oid) && > !oidcmp(>peer_ref->old_oid, >old_oid)) { > - /* > - * These need to be reported as fetched, but we don't > - * actually need to fetch them. > - */ > if (fetched_refs) { > + /* > + * These may need to be reported as fetched, > + * but we don't actually need to fetch them. > + */ So it's really this comment that leaves me the most puzzled. > @@ -1264,10 +1265,25 @@ int transport_fetch_refs(struct transport *transport, > struct ref *refs, > heads[nr_heads++] = rm; > } > > - rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs); > - if (fetched_refs && nop_head) { > - *nop_tail = *fetched_refs; > - *fetched_refs = nop_head; > + rc = transport->vtable->fetch(transport, nr_heads, heads, > + fetched_refs ? _by_transport : > NULL); > + if (fetched_refs) { > + if (fetched_by_transport) { > + /* > + * The transport reported its fetched refs. Pretend > + * that we also fetched the ones that we didn't need to > + * fetch. > + */ > + *nop_tail = fetched_by_transport; > + *fetched_refs = nop_head; > + } else if (!fetched_by_transport) { > + /* > + * The transport didn't report its fetched refs, so > + * this function will not report them either. We have > + * no use for nop_head. > + */ > + free_refs(nop_head); > + } This part makes sense to me based on the description (and on the assumption that reporting those nop refs is useful in the first place ;) ). So I think your fix here is probably the right thing, but I'm just left confused by the background a bit. -Peff