Re: [PATCH] fetch-pack: do not reset in_vain on non-novel acks

2016-09-22 Thread Junio C Hamano
Jonathan Tan writes:

> The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
> give up after getting too many "ack continue"", 2006-05-24) to stop ref
> negotiation if a number of consecutive "have"s have been sent with no
> corresponding new acks. A use case (as described in that commit) is the
> scenario in which the local repository has more roots than the remote
> repository.

To those who know what the mechanism is about, the above is
sufficient to refresh their memory, but to others, a brief
explanation of _why_ it is a good idea to stop is needed to
understand what you are trying to achieve with this change.

It may help to add something like "This will stop the client from
digging too deep into an irrelevant side branch in vain without ever
finding a common ancestor." before "A use case is ...", perhaps?
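
As a rough illustration of the give-up counter described above, here is
a minimal sketch in C. It is a simplification and not the code in
fetch-pack.c: the function name, the got_new_ack array and the limit
passed in are all made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch only: got_new_ack[i] is nonzero when the i-th "have" drew a
 * new ack from the other side. Returns how many "have"s were sent
 * before the client ran out of commits or gave up because
 * max_in_vain consecutive "have"s went unanswered.
 */
static size_t haves_sent_before_stop(const int *got_new_ack, size_t n,
				     int max_in_vain)
{
	int in_vain = 0;
	size_t sent = 0;

	assert(max_in_vain > 0);
	for (size_t i = 0; i < n; i++) {
		sent++;
		if (got_new_ack[i])
			in_vain = 0;	/* progress: start counting afresh */
		else if (++in_vain >= max_in_vain)
			break;		/* dug too deep without finding anything */
	}
	return sent;
}
```

With a limit of 3, a client that gets one ack and then nothing stops
after the fourth "have" instead of walking the whole side branch.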

By the way, you made me run "git show -W f061e5f" and then compare
it with "less fetch-pack.c"; I am kind of surprised to see that
find_common() has grown quite a bit over the years.

> However, during a negotiation in which stateless RPCs are used,
> MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
> above and others) because in each new request, the client has to inform
> the server of objects it already has and knows the server has (to remind
> the server of the state), which the server then acks.

Hmph.  So the problem you are trying to solve is that the current
code sees that the other side said 'yeah, that is a common commit'
by giving us ACK common, and resets the in_vain counter, when in
fact we haven't made _any_ progress at that point.
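
To make the failure mode concrete, here is a toy model in C of a
stateless negotiation. It is only a sketch under simplifying
assumptions (a fixed number of replayed commits and a fixed batch of
unrecognized "have"s per round); the function and parameter names are
hypothetical and do not appear in fetch-pack.c.

```c
#include <assert.h>

/*
 * Sketch only. Every round the client replays `replayed` commits that
 * are already known to be common (the server acks them again) and
 * sends `batch` fresh "have"s that the server does not recognize.
 * Any ack resets the counter, as in the code before the patch.
 * Returns the round in which the client gives up, or -1 if it never
 * does within `rounds` rounds.
 */
static int round_client_gives_up(int rounds, int replayed, int batch,
				 int max_in_vain)
{
	int in_vain = 0;

	assert(rounds >= 0 && replayed >= 0 && batch >= 0);
	for (int r = 1; r <= rounds; r++) {
		in_vain += batch;	/* fresh "have"s, none acked */
		if (replayed > 0)
			in_vain = 0;	/* repeat ack wipes the count */
		if (in_vain > max_in_vain)
			return r;
	}
	return -1;
}
```

In this model a single replayed commit per round is enough to keep the
client from ever giving up, while the same negotiation without replayed
commits stops after a few rounds.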

> Make fetch-pack only consider novel acks (acks for objects for which the
> client has never received an ack before in this session) as new acks for
> the purpose of MAX_IN_VAIN.

Makes sense.

Just a hint, because you are relatively new to the project.
Whenever you are tempted to say "In other words...", "That
means...", or to elaborate further in parentheses, it pays to stop and
think whether you can do without whatever you said before that.  In
the above paragraph and in the comment in the patch, the newly
invented term "novel ack" is used exactly once.  Because it is newly
invented, you have to explain what you want it to mean, but there is
no need to introduce it at all.  "Make fetch-pack only consider acks for
objects for which no earlier acks have been seen ..." is equally
readable and does not burden the readers with "Ah, the author
introduced a new term 'novel ack', so I need to remember that this
is the definition of the word when I see it mentioned next time".

> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 85e77af..1141e3c 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -428,10 +428,18 @@ static int find_common(struct fetch_pack_args *args,
>   const char *hex = sha1_to_hex(result_sha1);
>   packet_buf_write(&req_buf, "have %s\n", hex);
>   state_len = req_buf.len;
> - }
> + /*
> +  * Reset in_vain because this
> +  * ack is a novel ack (that is,
> +  * an ack for this commit has
> +  * not been seen).
> +  */

Side note.  Having to wrap the multi-line comment like this is a
sign that the loop got a bit too big to fit in brain.  We may want
to see if there is a way to reduce the complexity by introducing a
helper function or something.
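
One possible shape for such a helper, sketched in C. This is an
assumption about how the decision could be factored out, not something
that exists in fetch-pack.c: the helper name, the local enum and the
already_common parameter (standing in for the COMMON flag test on the
commit) are all hypothetical.

```c
#include <assert.h>

/* Local stand-in for the ack kinds named in this thread. */
enum sketch_ack { SKETCH_ACK_continue, SKETCH_ACK_common, SKETCH_ACK_ready };

/*
 * Sketch only: decide whether an ack counts as progress and should
 * reset in_vain. In a stateless RPC session an "ACK common" for a
 * commit that was already marked common is merely the server echoing
 * state the client replayed, so it does not count. Everything else
 * resets the counter, as the patch under review does.
 */
static int ack_resets_in_vain(int stateless_rpc, enum sketch_ack ack,
			      int already_common)
{
	assert(already_common == 0 || already_common == 1);
	if (stateless_rpc && ack == SKETCH_ACK_common)
		return !already_common;
	return 1;
}
```

The caller would then need only a single "if (ack_resets_in_vain(...))
in_vain = 0;" inside the loop, which keeps the explanatory comment out
of the deeply indented block.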

> + in_vain = 0;
> + } else if (!args->stateless_rpc
> +|| ack != ACK_common)
> + in_vain = 0;

It is a bit hard to read this hunk without pre-context.  The
original reads like so:


	...
	case ACK_common:
	case ACK_ready:
	case ACK_continue: {
		struct commit *commit =
			lookup_commit(result_sha1);
		if (!commit)
			die("invalid commit %s", sha1_to_hex(result_sha1));
		if (args->stateless_rpc
		 && ack == ACK_common
		 && !(commit->object.flags & COMMON)) {

Here, they told us that this is a common ancestor by giving us "ACK
common", and this is not a response to our attempt to prime a new
incarnation of the stateless server.  It is curious that only ACK_common
is checked, but it is OK because --stateless requires multi-ack and
ACK_continue is not used.

/* 

[PATCH] fetch-pack: do not reset in_vain on non-novel acks

2016-09-22 Thread Jonathan Tan
The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
give up after getting too many "ack continue"", 2006-05-24) to stop ref
negotiation if a number of consecutive "have"s have been sent with no
corresponding new acks. A use case (as described in that commit) is the
scenario in which the local repository has more roots than the remote
repository.

However, during a negotiation in which stateless RPCs are used,
MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
above and others) because in each new request, the client has to inform
the server of objects it already has and knows the server has (to remind
the server of the state), which the server then acks.

Make fetch-pack only consider novel acks (acks for objects for which the
client has never received an ack before in this session) as new acks for
the purpose of MAX_IN_VAIN.

Signed-off-by: Jonathan Tan 
---
 fetch-pack.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 85e77af..1141e3c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -428,10 +428,18 @@ static int find_common(struct fetch_pack_args *args,
 						const char *hex = sha1_to_hex(result_sha1);
 						packet_buf_write(&req_buf, "have %s\n", hex);
 						state_len = req_buf.len;
-					}
+						/*
+						 * Reset in_vain because this
+						 * ack is a novel ack (that is,
+						 * an ack for this commit has
+						 * not been seen).
+						 */
+						in_vain = 0;
+					} else if (!args->stateless_rpc
+						   || ack != ACK_common)
+						in_vain = 0;
 					mark_common(commit, 0, 1);
 					retval = 0;
-					in_vain = 0;
 					got_continue = 1;
 					if (ack == ACK_ready) {
 						clear_prio_queue(&rev_list);
-- 
2.8.0.rc3.226.g39d4020