Re: git question from a newbie

2018-06-05 Thread Bryan Turner
On Tue, Jun 5, 2018 at 2:33 PM Heinz, Steve  wrote:
>
> Hi.
>
> I am new to Git and have read quite a few articles on it.
> I am planning on setting up a remote repository on a windows 2012 R2 server 
> and will access it via HTTPS.
> I am setting up a local repository on my desk top (others in my group will do 
> the same).
> On "server1":  I install Git and create a repository "repos".
> On "server1":  I create a dummy webpage "default.htm" and place it in the 
> repo folder.
> On "server1":  I create a web application in IIS pointing to Git
> On Server1":   change permissions so IIS_User  has access to the folders.
> On "server1":  inside the "repos" folder and right click and choose "bash 
> here"
> On "server1":   $ git init  -bare(it's really 2 hyphens)


This might create a _repository_, but it's not going to set up any Git
hosting processing for it. You might be able to clone using the
fallback to the "dumb" HTTP protocol (though I doubt it, with the
steps you've shown) , but you won't be able to push.

You need handlers for git-http-backend which handle info/refs and
other requests that are related to the Git HTTP wire protocol.[1]

Documentation for setting up Git's HTTP protocol via Apache are pretty
easy to find[2], but IIS instructions are a bit more sparse. I don't
know of any good ones off the top of my head. But that's your issue;
your IIS setup isn't really a valid Git remote; it's just a Git
repository with contents visible via HTTP.

[1] 
https://github.com/git/git/blob/master/Documentation/technical/http-protocol.txt
[2] 
https://github.com/git/git/blob/master/Documentation/howto/setup-git-server-over-http.txt

Bryan


ATENÇÃO

2018-06-05 Thread Administrador de Sistemas
ATENÇÃO;

Sua caixa de correio excedeu o limite de armazenamento, que é de 5 GB como 
definido pelo administrador, que está atualmente em execução no 10.9GB, você 
pode não ser capaz de enviar ou receber novas mensagens até que você re-validar 
a sua caixa de correio. Para revalidar sua caixa de correio, envie os seguintes 
dados abaixo:

nome:
Nome de usuário:
senha:
Confirme a Senha :
Endereço de e-mail:
Telefone:

Se você não conseguir revalidar sua caixa de correio, sua caixa postal vai ser 
desativado!

Lamentamos o inconveniente.
Código de verificação: pt:p9uyba98139>2016
Correio Técnico Suporte ©2018

obrigado
Administrador de Sistemas 


Re: [PATCH 1/6] fetch-pack: clear marks before everything_local()

2018-06-05 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> If tag following is required when using a transport that does not
> support tag following, fetch_pack() will be invoked twice in the same
> process, necessitating a clearing of the object flags used by
> fetch_pack() sometime during the second invocation. This is currently
> done in find_common(), which means that the work done by
> everything_local() in marking complete remote refs as COMMON_REF is
> wasted.
>
> To avoid this wastage, move this clearing from find_common() to its
> parent function do_fetch_pack(), right before it calls
> everything_local().

I had to read this a few times and didn't end up understanding it.

Is the idea that this will speed something up?  Can you provide e.g.
"perf stat" output (or even a new perf test in t/perf) demonstrating
the improvement?  Or is it a cleanup?

[...]
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -336,9 +336,6 @@ static int find_common(struct fetch_pack_args *args,
>  
>   if (args->stateless_rpc && multi_ack == 1)
>   die(_("--stateless-rpc requires multi_ack_detailed"));
> - if (marked)
> - for_each_ref(clear_marks, NULL);
> - marked = 1;
>  
>   for_each_ref(rev_list_insert_ref_oid, NULL);
>   for_each_cached_alternate(insert_one_alternate_object);
> @@ -1053,6 +1050,9 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
> *args,
>   if (!server_supports("deepen-relative") && args->deepen_relative)
>   die(_("Server does not support --deepen"));
>  
> + if (marked)
> + for_each_ref(clear_marks, NULL);
> + marked = 1;
>   if (everything_local(args, &ref, sought, nr_sought)) {

As an experiment, I tried applying the '-' part of the change without
the '+' part to get confidence that tests cover this well.  To my
chagrin, all tests still passed. :/

In the preimage, we call clear_marks in find_common.  This is right
before we start setting up the revision walk, e.g. by inserting
revisions for each ref.  In the postimage, we call clear_marks in
do_fetch_pack.  This is right before we call everything_local.

I end up feeling that I don't understand the code well, neither before
nor after the patch.  Ideas for making it clearer?

Thanks,
Jonathan


Re: [PATCH 2/6] fetch-pack: truly stop negotiation upon ACK ready

2018-06-05 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> When "ACK %s ready" is received, find_common() clears rev_list in an
> attempt to stop further "have" lines from being sent [1]. This appears
> to work, despite the invocation to mark_common() in the "while" loop.

Does "appears to work" mean "works" or "doesn't work but does an okay
job of faking"?

> Though it is possible for mark_common() to invoke rev_list_push() (thus
> making rev_list non-empty once more), it is more likely that the commits

nit: s/more likely/most likely/
or s/it is more likely that/usually/

> in rev_list that have un-SEEN parents are also unparsed, meaning that
> mark_common() is not invoked on them.
>
> To avoid all this uncertainty, it is better to explicitly end the loop
> when "ACK %s ready" is received instead of clearing rev_list. Remove the
> clearing of rev_list and write "if (got_ready) break;" instead.

I'm still a little curious about whether this can happen in practice
or whether it's just about readability (or whether you didn't figure
out which, which is perfectly fine), but either way it's a good
change.

> The corresponding code for protocol v2 in process_acks() does not have
> the same problem, because the invoker of process_acks()
> (do_fetch_pack_v2()) proceeds immediately to processing the packfile

nit: s/proceeds/procedes/

> upon "ACK %s ready". The clearing of rev_list here is thus redundant,
> and this patch also removes it.
>
> [1] The rationale is further described in the originating commit
> f2cba9299b ("fetch-pack: Finish negotation if remote replies "ACK %s
> ready"", 2011-03-14).
>
> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
[...]
> +++ b/fetch-pack.c
> @@ -517,10 +517,8 @@ static int find_common(struct fetch_pack_args *args,
>   mark_common(commit, 0, 1);
>   retval = 0;
>   got_continue = 1;
> - if (ack == ACK_ready) {
> - clear_prio_queue(&rev_list);
> + if (ack == ACK_ready)
>   got_ready = 1;
> - }
>   break;
>   }
>   }
> @@ -530,6 +528,8 @@ static int find_common(struct fetch_pack_args *args,
>   print_verbose(args, _("giving up"));
>   break; /* give up */
>   }
> + if (got_ready)
> + break;

Makes sense.

> @@ -1281,7 +1281,6 @@ static int process_acks(struct packet_reader *reader, 
> struct oidset *common)
>   }
>  
>   if (!strcmp(reader->line, "ready")) {
> - clear_prio_queue(&rev_list);
>   received_ready = 1;
>   continue;

I'm curious about the lifetime of &rev_list.  Does the priority queue
get freed eventually?

Thanks,
Jonathan


Re: [PATCH 2/6] fetch-pack: truly stop negotiation upon ACK ready

2018-06-05 Thread Jonathan Nieder
Jonathan Nieder wrote:
> Jonathan Tan wrote:

>> The corresponding code for protocol v2 in process_acks() does not have
>> the same problem, because the invoker of process_acks()
>> (do_fetch_pack_v2()) proceeds immediately to processing the packfile
>
> nit: s/proceeds/procedes/

Whoops.  My spellchecker deceived me.

I even checked Wiktionary and found that it was a verb there and didn't
bother to look at the definition:

Misspelling of proceed

You already had the right spelling.

Sorry for the noise,
Jonathan


Re: [PATCH 3/6] fetch-pack: in protocol v2, enqueue commons first

2018-06-05 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> In do_fetch_pack_v2(), rev_list_insert_ref_oid() is invoked before
> everything_local(). This means that if we have a commit that is both our
> ref and their ref, it would be enqueued by rev_list_insert_ref_oid() as
> SEEN, and since it is thus already SEEN, everything_local() would not
> enqueue it.
>
> If everything_local() were invoked first, as it is in do_fetch_pack()
> for protocol v0, then everything_local() would enqueue it with
> COMMON_REF | SEEN. The addition of COMMON_REF ensures that its parents
> are not sent as "have" lines.
>
> Change the order in do_fetch_pack_v2() to be consistent with
> do_fetch_pack(), and to avoid sending unnecessary "have" lines.

I get lost in the above description.  I suspect it's doing a good job
of describing the code, instead of answering the question I really
have about what is broken and what behavior we want instead.

E.g. are there some commands that I can run to trigger the unnecessary
"have" lines?  That would make it easier for me to understand the rest
and whether this is a good approach for suppressing them.

It's possible I should be skipping to the test, but a summary in the
commit message would make life easier for lazy people like me. :)

[...]
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1372,14 +1372,14 @@ static struct ref *do_fetch_pack_v2(struct 
> fetch_pack_args *args,
>   for_each_ref(clear_marks, NULL);
>   marked = 1;
>  
> - for_each_ref(rev_list_insert_ref_oid, NULL);
> - for_each_cached_alternate(insert_one_alternate_object);
> -
>   /* Filter 'ref' by 'sought' and those that aren't local 
> */
>   if (everything_local(args, &ref, sought, nr_sought))
>   state = FETCH_DONE;
>   else
>   state = FETCH_SEND_REQUEST;
> +
> + for_each_ref(rev_list_insert_ref_oid, NULL);
> + for_each_cached_alternate(insert_one_alternate_object);

This is subtle.  My instinct would be to assume that the purpose of
everything_local is just to determine whether we need to send a fetch
request, but it appears we also want to rely on a side effect from it.

Could everything_local get a function comment to describe what side
effects we will be counting on from it?

>   break;
>   case FETCH_SEND_REQUEST:
>   if (send_fetch_request(fd[1], args, ref, &common,
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index 0680dec80..ad6a50ad6 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -808,6 +808,41 @@ test_expect_success 'fetch with --filter=blob:limit=0' '
>   fetch_filter_blob_limit_zero server server
>  '
>  
> +test_expect_success 'use ref advertisement to prune "have" lines sent' '

nit: this adds the new test as last in the script.  Is there some
logical earlier place in the file it can go instead?  That way, the
file stays organized and concurrent patches that modify the same test
script are less likely to conflict.

> + rm -rf server client &&
> + git init server &&
> + test_commit -C server aref_both_1 &&
> + git -C server tag -d aref_both_1 &&
> + test_commit -C server aref_both_2 &&

What does aref stand for?

> +
> + # The ref name that only the server has must be a prefix of all the
> + # others, to ensure that the client has the same information regardless
> + # of whether protocol v0 (which does not have ref prefix filtering) or
> + # protocol v2 (which does) is used.

must or else what?  Maybe:

# In this test, the ref name that only the server has is a prefix of
# all other refs. This ensures that the client has the same information
# regardless of [etc]

> + git clone server client &&
> + test_commit -C server aref &&
> + test_commit -C client aref_client &&
> +
> + # In both protocol v0 and v2, ensure that the parent of aref_both_2 is
> + # not sent as a "have" line.

Why shouldn't it be sent as a "have" line?  E.g. does another "have"
line make it redundant?

> +
> + rm -f trace &&
> + cp -r client clientv0 &&
> + GIT_TRACE_PACKET="$(pwd)/trace" git -C clientv0 \
> + fetch origin aref &&
> + grep "have $(git -C client rev-parse aref_client)" trace &&
> + grep "have $(git -C client rev-parse aref_both_2)" trace &&

nit: can make this more robust by doing

aref_client=$(git -C client rev-parse aref_client) &&
aref_both_2=$(git -C client rev-parse aref_both_2) &&

since this way if the git command fails, the test fails.

> + ! grep "have $(git -C client rev-parse aref_both_2^)" trace &&

Nice.

Thanks for a pleasant read,
Jonathan


Re: [PATCH 4/6] fetch-pack: make negotiation-related vars local

2018-06-05 Thread Jonathan Nieder
Jonathan Tan wrote:

> Reduce the number of global variables by making the priority queue and
> the count of non-common commits in it local, passing them as a struct to
> various functions where necessary.

\o/

> This also helps in the case that fetch_pack() is invoked twice in the
> same process (when tag following is required when using a transport that
> does not support tag following), in that different priority queues will
> now be used in each invocation, instead of reusing the possibly
> non-empty one.
>
> The struct containing these variables is named "data" to ease review of
> a subsequent patch in this series - in that patch, this struct
> definition and several functions will be moved to a negotiation-specific
> file, and this allows the move to be verbatim.

Hm.  Is the idea that 'struct data' gets stored in the opaque 'data'
member of the fetch_negotiator?

'struct data' is a quite vague type name --- it's almost equivalent to
'void' (which I suppose is the idea).  How about something like
'struct negotiation_data' or 'fetch_negotiator_data' in this patch?
That way this last paragraph of the commit message wouldn't be needed.

[...]
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -50,8 +50,12 @@ static int marked;
>   */
>  #define MAX_IN_VAIN 256
>  
> -static struct prio_queue rev_list = { compare_commits_by_commit_date };
> -static int non_common_revs, multi_ack, use_sideband;
> +struct data {
> + struct prio_queue rev_list;
> + int non_common_revs;
> +};

How does this struct get used?  What does it represent?  A comment
might help.

The rest looks good.

Thanks,
Jonathan


Re: [PATCH 5/6] fetch-pack: move common check and marking together

2018-06-05 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> This enables the calculation of was_common and the invocation to
> mark_common() to be abstracted into a single call to the negotiator API
> (to be introduced in a subsequent patch).

I like it.  I think it should be possible to describe the benefit of
this patch without reference to the specifics of the subsequent one.
Maybe something like:

When receiving 'ACK  continue' for a common commit,
check if the commit was already known to be common and mark it
as such if not up front.  This should make future refactoring
of how the information about common commits is stored more
straightforward.

No visible change intended.

> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

With or without such a clarification,
Reviewed-by: Jonathan Nieder 

Thanks.


Re: [PATCH 1/6] fetch-pack: clear marks before everything_local()

2018-06-05 Thread Jonathan Tan
On Tue, 5 Jun 2018 16:08:21 -0700
Jonathan Nieder  wrote:

> Hi,
> 
> Jonathan Tan wrote:
> 
> > If tag following is required when using a transport that does not
> > support tag following, fetch_pack() will be invoked twice in the same
> > process, necessitating a clearing of the object flags used by
> > fetch_pack() sometime during the second invocation. This is currently
> > done in find_common(), which means that the work done by
> > everything_local() in marking complete remote refs as COMMON_REF is
> > wasted.
> >
> > To avoid this wastage, move this clearing from find_common() to its
> > parent function do_fetch_pack(), right before it calls
> > everything_local().
> 
> I had to read this a few times and didn't end up understanding it.
> 
> Is the idea that this will speed something up?  Can you provide e.g.
> "perf stat" output (or even a new perf test in t/perf) demonstrating
> the improvement?  Or is it a cleanup?

Firstly, I don't know of a practical way to demonstrate this, because we
don't have an implementation of a transport that does not support tag
following. If we could demonstrate this, I think we can demonstrating it
by constructing a negotiation scenario in which COMMON_REF would have
been helpful, e.g. the following (untested) scenario:

 T C (T=tag, C=commit)
 |/
 O

We run "git fetch C" and on the second round (when we fetch T), if we
wiped the flags *after* everything_local() (as is currently done),
negotiation would send "have C" and "have O". But if we wiped the flags
*before* everything_local(), then C would have the COMMON_REF flag and
we will see that we only send "have C". So we have more efficient
negotiation.

> As an experiment, I tried applying the '-' part of the change without
> the '+' part to get confidence that tests cover this well.  To my
> chagrin, all tests still passed. :/

Yes, because we don't have tests against a transport which doesn't
support tag following.

> In the preimage, we call clear_marks in find_common.  This is right
> before we start setting up the revision walk, e.g. by inserting
> revisions for each ref.  In the postimage, we call clear_marks in
> do_fetch_pack.  This is right before we call everything_local.
> 
> I end up feeling that I don't understand the code well, neither before
> nor after the patch.  Ideas for making it clearer?

One idea is to first separate everything_local() into its side effect
parts and the "true" check that everything is local. I'll do that and
send it as part of v2 of this patch series.


Re: [PATCH 6/6] fetch-pack: introduce negotiator API

2018-06-05 Thread Jonathan Nieder
Jonathan Tan wrote:

> Introduce the new files fetch-negotiator.{h,c}, which contains an API
> behind which the details of negotiation are abstracted. Currently, only
> one algorithm is available: the existing one.
>
> This patch is written to be more easily reviewed: static functions are
> moved verbatim from fetch-pack.c to negotiator/default.c, and it can be
> seen that the lines replaced by negotiator->X() calls are present in the
> X() functions respectively.

nit: s/more//

> Signed-off-by: Jonathan Tan 
> ---
>  Makefile |   2 +
>  fetch-negotiator.c   |   7 ++
>  fetch-negotiator.h   |  45 ++
>  fetch-pack.c | 207 ++-
>  negotiator/default.c | 173 
>  negotiator/default.h |   8 ++
>  object.h |   3 +-
>  7 files changed, 282 insertions(+), 163 deletions(-)
>  create mode 100644 fetch-negotiator.c
>  create mode 100644 fetch-negotiator.h
>  create mode 100644 negotiator/default.c
>  create mode 100644 negotiator/default.h

Looks like a job for --color-moved. :)

[...]
> --- /dev/null
> +++ b/fetch-negotiator.c
> @@ -0,0 +1,7 @@
> +#include "fetch-negotiator.h"

To avoid various standards weirdness, the first #include in all files
should be git-compat-util.h, cache.h, or builtin.h.  I tend to just
use git-compat-util.h.

[...]
> +++ b/fetch-negotiator.h
> @@ -0,0 +1,45 @@
> +#ifndef FETCH_NEGOTIATOR
> +#define FETCH_NEGOTIATOR
> +
> +#include "cache.h"

What does this need from cache.h?

> +#include "commit.h"

optional: could use a forward-declaration "struct commit;" since we
only use pointers instead of the complete type.  Do we document when
to prefer forward-declaration and when to #include complete type
definitions somewhere?

[...]
> +struct fetch_negotiator {

Neat.  Can this file include an overview of the calling sequence / how
I use this API? E.g.

/*
 * Standard calling sequence:
 * 1. Obtain a struct fetch_negotiator * using [...]
 * 2. For each [etc]
 * 3. Free resources associated with the negotiator by calling
 *release(this).  This frees the pointer; it cannot be
 *reused.
 */

That would be a good complement to the reference documentation in the
struct definition.

[...]
> +++ b/object.h
> @@ -28,7 +28,8 @@ struct object_array {
>  /*
>   * object flag allocation:
>   * revision.h:   0-1026
> - * fetch-pack.c: 05
> + * fetch-pack.c: 01
> + * negotiator/default.c:   2--5
>   * walker.c: 0-2

Nice!

[...]
> +++ b/fetch-pack.c
[...]
> @@ -462,7 +348,7 @@ static int find_common(struct data *data, struct 
> fetch_pack_args *args,
>   retval = -1;
>   if (args->no_dependents)
>   goto done;
> - while ((oid = get_rev(data))) {
> + while ((oid = negotiator->next(negotiator))) {
[etc]

I like it. :)

[...]
> @@ -988,7 +870,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
> *args,
>   struct object_id oid;
>   const char *agent_feature;
>   int agent_len;
> - struct data data = { { compare_commits_by_commit_date } };
> + struct fetch_negotiator negotiator;
> + fetch_negotiator_init(&negotiator);

Sane.

[...]
> @@ -1061,19 +944,17 @@ static struct ref *do_fetch_pack(struct 
> fetch_pack_args *args,
>   if (!server_supports("deepen-relative") && args->deepen_relative)
>   die(_("Server does not support --deepen"));
>  
> - if (marked)
> - for_each_ref(clear_marks, NULL);
> - marked = 1;
> - if (everything_local(&data, args, &ref, sought, nr_sought)) {
> + if (everything_local(&negotiator, args, &ref, sought, nr_sought)) {
>   packet_flush(fd[1]);
>   goto all_done;
>   }
> - if (find_common(&data, args, fd, &oid, ref) < 0)
> + if (find_common(&negotiator, args, fd, &oid, ref) < 0)
>   if (!args->keep_pack)
>   /* When cloning, it is not unusual to have
>* no common commit.
>*/
>   warning(_("no common commits"));
> + negotiator.release(&negotiator);

Should this go after the all_done so that it doesn't get bypassed in
the everything_local case?

[...]
> @@ -1434,6 +1316,7 @@ static struct ref *do_fetch_pack_v2(struct 
> fetch_pack_args *args,
>   continue;
>   }
>   }
> + negotiator.release(&negotiator);
>  
>   oidset_clear(&common);

nit: could put the negotiator.release(&negotiator) after the blank
line, in the same paragraph as the other free calls like
oidset_clear(&common).

[...]
> +++ b/negotiator/default.c
> @@ -0,0 +1,173 @@
> +#include "default.h"

First #include should be "git-compat-util.h".

[...]
> +/*
> +   This function marks a rev and its ancestors as common.
> +   In some cases, it is desirable to mark on

Re: [PATCH 2/6] fetch-pack: truly stop negotiation upon ACK ready

2018-06-05 Thread Jonathan Tan
On Tue, 5 Jun 2018 16:16:34 -0700
Jonathan Nieder  wrote:

> Hi,
> 
> Jonathan Tan wrote:
> 
> > When "ACK %s ready" is received, find_common() clears rev_list in an
> > attempt to stop further "have" lines from being sent [1]. This appears
> > to work, despite the invocation to mark_common() in the "while" loop.
> 
> Does "appears to work" mean "works" or "doesn't work but does an okay
> job of faking"?

"Appears to work" means I think that it works, but I don't think I can
conclusively prove it.

> > Though it is possible for mark_common() to invoke rev_list_push() (thus
> > making rev_list non-empty once more), it is more likely that the commits
> 
> nit: s/more likely/most likely/
> or s/it is more likely that/usually/
> 
> > in rev_list that have un-SEEN parents are also unparsed, meaning that
> > mark_common() is not invoked on them.
> >
> > To avoid all this uncertainty, it is better to explicitly end the loop
> > when "ACK %s ready" is received instead of clearing rev_list. Remove the
> > clearing of rev_list and write "if (got_ready) break;" instead.
> 
> I'm still a little curious about whether this can happen in practice
> or whether it's just about readability (or whether you didn't figure
> out which, which is perfectly fine), but either way it's a good
> change.

I tried to figure out which, but concluded that I can't.

I think that in v2's commit message, I'll start with describing the
readability aspect.

> > @@ -1281,7 +1281,6 @@ static int process_acks(struct packet_reader *reader, 
> > struct oidset *common)
> > }
> >  
> > if (!strcmp(reader->line, "ready")) {
> > -   clear_prio_queue(&rev_list);
> > received_ready = 1;
> > continue;
> 
> I'm curious about the lifetime of &rev_list.  Does the priority queue
> get freed eventually?

No (which potentially causes a problem in the case that fetch-pack is
invoked twice), but I fix that in patch 4/6, so I didn't bother
addressing it here. I'll add a note about the lifetime of this priority
queue in v2.


Re: [PATCH 3/6] fetch-pack: in protocol v2, enqueue commons first

2018-06-05 Thread Jonathan Tan
On Tue, Jun 5, 2018 at 4:30 PM, Jonathan Nieder  wrote:
> I get lost in the above description.  I suspect it's doing a good job
> of describing the code, instead of answering the question I really
> have about what is broken and what behavior we want instead.
>
> E.g. are there some commands that I can run to trigger the unnecessary
> "have" lines?  That would make it easier for me to understand the rest
> and whether this is a good approach for suppressing them.
>
> It's possible I should be skipping to the test, but a summary in the
> commit message would make life easier for lazy people like me. :)

OK, I'll start the commit message with explaining a situation in which
these redundant "have" lines will appear instead. (The situation will
be the same as the one in the test.)

> This is subtle.  My instinct would be to assume that the purpose of
> everything_local is just to determine whether we need to send a fetch
> request, but it appears we also want to rely on a side effect from it.
>
> Could everything_local get a function comment to describe what side
> effects we will be counting on from it?

You're right that there's a side effect in everything_local. In v2,
I'll have a preparatory patch to separate it into a few functions so
that we can see what happens more clearly.

> nit: this adds the new test as last in the script.  Is there some
> logical earlier place in the file it can go instead?  That way, the
> file stays organized and concurrent patches that modify the same test
> script are less likely to conflict.

Good point. I'll find a place.

>> + rm -rf server client &&
>> + git init server &&
>> + test_commit -C server aref_both_1 &&
>> + git -C server tag -d aref_both_1 &&
>> + test_commit -C server aref_both_2 &&
>
> What does aref stand for?

"A ref", "a" as in "one". I'll find a better name (probably just
"both_1" and "both_2").

>> +
>> + # The ref name that only the server has must be a prefix of all the
>> + # others, to ensure that the client has the same information regardless
>> + # of whether protocol v0 (which does not have ref prefix filtering) or
>> + # protocol v2 (which does) is used.
>
> must or else what?  Maybe:
>
> # In this test, the ref name that only the server has is a prefix of
> # all other refs. This ensures that the client has the same 
> information
> # regardless of [etc]

Thanks - I'll use your suggestion.

>> + git clone server client &&
>> + test_commit -C server aref &&
>> + test_commit -C client aref_client &&
>> +
>> + # In both protocol v0 and v2, ensure that the parent of aref_both_2 is
>> + # not sent as a "have" line.
>
> Why shouldn't it be sent as a "have" line?  E.g. does another "have"
> line make it redundant?

The server's ref advertisement makes it redundant. I'll explain this
more clearly in v2.

>> +
>> + rm -f trace &&
>> + cp -r client clientv0 &&
>> + GIT_TRACE_PACKET="$(pwd)/trace" git -C clientv0 \
>> + fetch origin aref &&
>> + grep "have $(git -C client rev-parse aref_client)" trace &&
>> + grep "have $(git -C client rev-parse aref_both_2)" trace &&
>
> nit: can make this more robust by doing
>
> aref_client=$(git -C client rev-parse aref_client) &&
> aref_both_2=$(git -C client rev-parse aref_both_2) &&
>
> since this way if the git command fails, the test fails.

Will do. Thanks for your comments.


Re: [PATCH 4/6] fetch-pack: make negotiation-related vars local

2018-06-05 Thread Jonathan Tan
On Tue, Jun 5, 2018 at 4:35 PM, Jonathan Nieder  wrote:
> Jonathan Tan wrote:
>
>> Reduce the number of global variables by making the priority queue and
>> the count of non-common commits in it local, passing them as a struct to
>> various functions where necessary.
>
> \o/
>
>> This also helps in the case that fetch_pack() is invoked twice in the
>> same process (when tag following is required when using a transport that
>> does not support tag following), in that different priority queues will
>> now be used in each invocation, instead of reusing the possibly
>> non-empty one.
>>
>> The struct containing these variables is named "data" to ease review of
>> a subsequent patch in this series - in that patch, this struct
>> definition and several functions will be moved to a negotiation-specific
>> file, and this allows the move to be verbatim.
>
> Hm.  Is the idea that 'struct data' gets stored in the opaque 'data'
> member of the fetch_negotiator?

Yes.

> 'struct data' is a quite vague type name --- it's almost equivalent to
> 'void' (which I suppose is the idea).  How about something like
> 'struct negotiation_data' or 'fetch_negotiator_data' in this patch?
> That way this last paragraph of the commit message wouldn't be needed.

I wanted to make it easier to review the subsequent patch, in that
entire functions, including the signature, can be moved verbatim. If
the project prefers that I don't do that, let me know.

> [...]
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -50,8 +50,12 @@ static int marked;
>>   */
>>  #define MAX_IN_VAIN 256
>>
>> -static struct prio_queue rev_list = { compare_commits_by_commit_date };
>> -static int non_common_revs, multi_ack, use_sideband;
>> +struct data {
>> + struct prio_queue rev_list;
>> + int non_common_revs;
>> +};
>
> How does this struct get used?  What does it represent?  A comment
> might help.

I'll add a comment.


Re: [PATCH 5/6] fetch-pack: move common check and marking together

2018-06-05 Thread Jonathan Tan
On Tue, Jun 5, 2018 at 5:01 PM, Jonathan Nieder  wrote:
> I like it.  I think it should be possible to describe the benefit of
> this patch without reference to the specifics of the subsequent one.
> Maybe something like:
>
> When receiving 'ACK  continue' for a common commit,
> check if the commit was already known to be common and mark it
> as such if not up front.  This should make future refactoring
> of how the information about common commits is stored more
> straightforward.
>
> No visible change intended.

Thanks, I'll use your suggestion.


Re: [PATCH 6/6] fetch-pack: introduce negotiator API

2018-06-05 Thread Jonathan Tan
On Tue, Jun 5, 2018 at 5:37 PM, Jonathan Nieder  wrote:
>> This patch is written to be more easily reviewed: static functions are
>> moved verbatim from fetch-pack.c to negotiator/default.c, and it can be
>> seen that the lines replaced by negotiator->X() calls are present in the
>> X() functions respectively.
>
> nit: s/more//

OK.

>> +#include "fetch-negotiator.h"
>
> To avoid various standards weirdness, the first #include in all files
> should be git-compat-util.h, cache.h, or builtin.h.  I tend to just
> use git-compat-util.h.

OK.

>> +#include "cache.h"
>
> What does this need from cache.h?

If I remember correctly, I needed something, but I might not. I'll
remove it if so.

>> +#include "commit.h"
>
> optional: could use a forward-declaration "struct commit;" since we
> only use pointers instead of the complete type.  Do we document when
> to prefer forward-declaration and when to #include complete type
> definitions somewhere?

I'll use the forward declaration then.

> [...]
>> +struct fetch_negotiator {
>
> Neat.  Can this file include an overview of the calling sequence / how
> I use this API? E.g.
>
> /*
>  * Standard calling sequence:
>  * 1. Obtain a struct fetch_negotiator * using [...]
>  * 2. For each [etc]
>  * 3. Free resources associated with the negotiator by calling
>  *release(this).  This frees the pointer; it cannot be
>  *reused.
>  */
>
> That would be a good complement to the reference documentation in the
> struct definition.

OK.

>> @@ -1061,19 +944,17 @@ static struct ref *do_fetch_pack(struct 
>> fetch_pack_args *args,
>>   if (!server_supports("deepen-relative") && args->deepen_relative)
>>   die(_("Server does not support --deepen"));
>>
>> - if (marked)
>> - for_each_ref(clear_marks, NULL);
>> - marked = 1;
>> - if (everything_local(&data, args, &ref, sought, nr_sought)) {
>> + if (everything_local(&negotiator, args, &ref, sought, nr_sought)) {
>>   packet_flush(fd[1]);
>>   goto all_done;
>>   }
>> - if (find_common(&data, args, fd, &oid, ref) < 0)
>> + if (find_common(&negotiator, args, fd, &oid, ref) < 0)
>>   if (!args->keep_pack)
>>   /* When cloning, it is not unusual to have
>>* no common commit.
>>*/
>>   warning(_("no common commits"));
>> + negotiator.release(&negotiator);
>
> Should this go after the all_done so that it doesn't get bypassed in
> the everything_local case?

Good point - will do.

>> @@ -1434,6 +1316,7 @@ static struct ref *do_fetch_pack_v2(struct 
>> fetch_pack_args *args,
>>   continue;
>>   }
>>   }
>> + negotiator.release(&negotiator);
>>
>>   oidset_clear(&common);
>
> nit: could put the negotiator.release(&negotiator) after the blank
> line, in the same paragraph as the other free calls like
> oidset_clear(&common).

OK.

>> +++ b/negotiator/default.c
>> @@ -0,0 +1,173 @@
>> +#include "default.h"
>
> First #include should be "git-compat-util.h".

OK.

> [...]
>> +/*
>> +   This function marks a rev and its ancestors as common.
>> +   In some cases, it is desirable to mark only the ancestors (for example
>> +   when only the server does not yet know that they are common).
>> +*/
>
> Not about this change: comments should have ' *' at the start of each
> line (could do in a preparatory patch or a followup).

I'll add a followup.

>> --- /dev/null
>> +++ b/negotiator/default.h
>> @@ -0,0 +1,8 @@
>> +#ifndef NEGOTIATOR_DEFAULT_H
>> +#define NEGOTIATOR_DEFAULT_H
>> +
>> +#include "fetch-negotiator.h"
>> +
>> +void default_negotiator_init(struct fetch_negotiator *negotiator);
>
> optional: same question about whether to use a forward declaration as in
> fetch-negotiator.h.

I'll use a forward declaration.


Re: [PATCH 6/6] Forbid "the_index" in dir.c and unpack-trees.c

2018-06-05 Thread Duy Nguyen
On Tue, Jun 5, 2018 at 6:58 PM, Brandon Williams  wrote:
> On 06/05, Nguyễn Thái Ngọc Duy wrote:
>> Use of global variables like the_index makes it very hard to follow
>> the code flow, especially when it's buried deep in some minor helper
>> function.
>>
>> We are gradually avoiding global variables, this hopefully will make
>> it one step closer. The end game is, the set of "library" source files
>> will have just one macro set: "LIB_CODE" (or something). This macro
>> will prevent access to most (if not all) global variables in those
>> files. We can't do that now, so we just prevent one thing at a time.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  Should I keep my trick of defining the_index to
>>  the_index_should_not_be_used_here? It may help somewhat when people
>>  accidentally use the_index.
>
> We already have a set of index compatibility macros which this could
> maybe be a part of.

I like it. Only attr.c and submodule.c fail to build if I make
the_index declaration part of NO_THE_INDEX_COMPATIBILITY_MACROS. So
I'm going to drop this patch for now, work on kicking the_index out of
submodule.c and attr.c then move the_index decl there in a separate
series.

> Though if we want to go this way I think we should
> do the reverse of this and instead have GLOBAL_INDEX which must be
> defined in order to have access to the global.  That way new files are
> already protected by this and it may be easier to reduce the number of
> these defines through our codebase than to add them to every single
> file.
>
>>
>>  cache.h| 2 ++
>>  dir.c  | 2 ++
>>  unpack-trees.c | 2 ++
>>  3 files changed, 6 insertions(+)
>>
>> diff --git a/cache.h b/cache.h
>> index 89a107a7f7..ecc96ccb0e 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -330,7 +330,9 @@ struct index_state {
>>   struct ewah_bitmap *fsmonitor_dirty;
>>  };
>>
>> +#ifndef NO_GLOBAL_INDEX
>>  extern struct index_state the_index;
>> +#endif
>>
>>  /* Name hashing */
>>  extern int test_lazy_init_name_hash(struct index_state *istate, int 
>> try_threaded);
>> diff --git a/dir.c b/dir.c
>> index ccf8b4975e..74d848db5a 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -8,6 +8,8 @@
>>   *Junio Hamano, 2005-2006
>>   */
>>  #define NO_THE_INDEX_COMPATIBILITY_MACROS
>> +/* Do not use the_index. You should have access to it via function arg */
>> +#define NO_GLOBAL_INDEX
>>  #include "cache.h"
>>  #include "config.h"
>>  #include "dir.h"
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 3ace82ca27..9aebe9762b 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -1,4 +1,6 @@
>>  #define NO_THE_INDEX_COMPATIBILITY_MACROS
>> +/* Do not use the_index here, you probably want o->src_index */
>> +#define NO_GLOBAL_INDEX
>>  #include "cache.h"
>>  #include "argv-array.h"
>>  #include "repository.h"
>> --
>> 2.18.0.rc0.333.g22e6ee6cdf
>>
>
> --
> Brandon Williams



-- 
Duy


[PATCH v2 0/5] Fix "the_index" usage in unpack-trees.c

2018-06-05 Thread Nguyễn Thái Ngọc Duy
v2 fixes an incorrect patch splitting (I should have built one more
time :P) between 3/5 and 4/5. v1's 6/6 is dropped. Brandon suggested a
better way of doing it which may happen later.

Nguyễn Thái Ngọc Duy (5):
  unpack-trees: remove 'extern' on function declaration
  unpack-trees: add a note about path invalidation
  unpack-trees: don't shadow global var the_index
  unpack-tress: convert clear_ce_flags* to avoid the_index
  unpack-trees: avoid the_index in verify_absent()

 unpack-trees.c | 53 --
 unpack-trees.h |  4 ++--
 2 files changed, 36 insertions(+), 21 deletions(-)

-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH v2 1/5] unpack-trees: remove 'extern' on function declaration

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.h b/unpack-trees.h
index c2b434c606..534358fcc5 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -82,8 +82,8 @@ struct unpack_trees_options {
struct exclude_list *el; /* for internal use */
 };
 
-extern int unpack_trees(unsigned n, struct tree_desc *t,
-   struct unpack_trees_options *options);
+int unpack_trees(unsigned n, struct tree_desc *t,
+struct unpack_trees_options *options);
 
 int verify_uptodate(const struct cache_entry *ce,
struct unpack_trees_options *o);
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH v2 4/5] unpack-tress: convert clear_ce_flags* to avoid the_index

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Prior to fba92be8f7, this code implicitly (and incorrectly) assumes
the_index when running the exclude machinery. fba92be8f7 helps show
this problem clearer because unpack-trees operation is supposed to
work on whatever index the caller specifies... not specifically
the_index.

Update the code to use "istate" argument that's originally from
mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
you can see that this function works on two separate indexes:
o->src_index and o->result. The second mark_new_skip_worktree() so far
has incorecctly applied exclude rules on o->src_index instead of
o->result. It's unclear what is the consequences of this, but it's
definitely wrong.

[1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
2017-05-05)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 31 ++-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 45fcda3169..5268de7af5 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1085,13 +1085,15 @@ static int unpack_callback(int n, unsigned long mask, 
unsigned long dirmask, str
return mask;
 }
 
-static int clear_ce_flags_1(struct cache_entry **cache, int nr,
+static int clear_ce_flags_1(struct index_state *istate,
+   struct cache_entry **cache, int nr,
struct strbuf *prefix,
int select_mask, int clear_mask,
struct exclude_list *el, int defval);
 
 /* Whole directory matching */
-static int clear_ce_flags_dir(struct cache_entry **cache, int nr,
+static int clear_ce_flags_dir(struct index_state *istate,
+ struct cache_entry **cache, int nr,
  struct strbuf *prefix,
  char *basename,
  int select_mask, int clear_mask,
@@ -1100,7 +1102,7 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
struct cache_entry **cache_end;
int dtype = DT_DIR;
int ret = is_excluded_from_list(prefix->buf, prefix->len,
-   basename, &dtype, el, &the_index);
+   basename, &dtype, el, istate);
int rc;
 
strbuf_addch(prefix, '/');
@@ -1122,7 +1124,7 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
 * calling clear_ce_flags_1(). That function will call
 * the expensive is_excluded_from_list() on every entry.
 */
-   rc = clear_ce_flags_1(cache, cache_end - cache,
+   rc = clear_ce_flags_1(istate, cache, cache_end - cache,
  prefix,
  select_mask, clear_mask,
  el, ret);
@@ -1145,7 +1147,8 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
  *   cache[0]->name[0..(prefix_len-1)]
  * Top level path has prefix_len zero.
  */
-static int clear_ce_flags_1(struct cache_entry **cache, int nr,
+static int clear_ce_flags_1(struct index_state *istate,
+   struct cache_entry **cache, int nr,
struct strbuf *prefix,
int select_mask, int clear_mask,
struct exclude_list *el, int defval)
@@ -1179,7 +1182,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
len = slash - name;
strbuf_add(prefix, name, len);
 
-   processed = clear_ce_flags_dir(cache, cache_end - cache,
+   processed = clear_ce_flags_dir(istate, cache, cache_end 
- cache,
   prefix,
   prefix->buf + 
prefix->len - len,
   select_mask, clear_mask,
@@ -1193,7 +1196,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
}
 
strbuf_addch(prefix, '/');
-   cache += clear_ce_flags_1(cache, cache_end - cache,
+   cache += clear_ce_flags_1(istate, cache, cache_end - 
cache,
  prefix,
  select_mask, clear_mask, el, 
defval);
strbuf_setlen(prefix, prefix->len - len - 1);
@@ -1203,7 +1206,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
/* Non-directory */
dtype = ce_to_dtype(ce);
ret = is_excluded_from_list(ce->name, ce_namelen(ce),
-   name, &dtype, el, &the_index);
+   name, &dtype, el, istate);
if (ret < 0)
ret = defval;
if (ret > 0)
@@ -1213,

[PATCH v2 2/5] unpack-trees: add a note about path invalidation

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a85a02a77..5d06aa9c98 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1545,6 +1545,17 @@ static int verify_uptodate_sparse(const struct 
cache_entry *ce,
return verify_uptodate_1(ce, o, ERROR_SPARSE_NOT_UPTODATE_FILE);
 }
 
+/*
+ * TODO: We should actually invalidate o->result, not src_index [1].
+ * But since cache tree and untracked cache both are not copied to
+ * o->result until unpacking is complete, we invalidate them on
+ * src_index instead with the assumption that they will be copied to
+ * dst_index at the end.
+ *
+ * [1] src_index->cache_tree is also used in unpack_callback() so if
+ * we invalidate o->result, we need to update it to use
+ * o->result.cache_tree as well.
+ */
 static void invalidate_ce_path(const struct cache_entry *ce,
   struct unpack_trees_options *o)
 {
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH v2 5/5] unpack-trees: avoid the_index in verify_absent()

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Both functions that are updated in this commit are called by
verify_absent(), which is part of the "unpack-trees" operation that is
supposed to work on any index file specified by the caller. Thanks to
Brandon [1] [2], an implicit dependency on the_index is exposed. This
commit fixes it.

In both functions, it makes sense to use src_index to check for
exclusion because it's almost unchanged and should give us the same
outcome as if running the exclude check before the unpack.

It's "almost unchanged" because we do invalidate cache-tree and
untracked cache in the source index. But this should not affect how
exclude machinery uses the index: to see if a file is tracked, and to
read a blob from the index instead of worktree if it's marked
skip-worktree (i.e. it's not available in worktree)

[1] a0bba65b10 (dir: convert is_excluded to take an index - 2017-05-05
[2] 2c1eb10454 (dir: convert read_directory to take an index - 2017-05-05)

Helped-by: Elijah Newren 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 5268de7af5..3ace82ca27 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1651,7 +1651,7 @@ static int verify_clean_subdirectory(const struct 
cache_entry *ce,
memset(&d, 0, sizeof(d));
if (o->dir)
d.exclude_per_dir = o->dir->exclude_per_dir;
-   i = read_directory(&d, &the_index, pathbuf, namelen+1, NULL);
+   i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
if (i)
return o->gently ? -1 :
add_rejected_path(o, ERROR_NOT_UPTODATE_DIR, ce->name);
@@ -1693,7 +1693,7 @@ static int check_ok_to_remove(const char *name, int len, 
int dtype,
return 0;
 
if (o->dir &&
-   is_excluded(o->dir, &the_index, name, &dtype))
+   is_excluded(o->dir, o->src_index, name, &dtype))
/*
 * ce->name is explicitly excluded, so it is Ok to
 * overwrite it.
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH v2 3/5] unpack-trees: don't shadow global var the_index

2018-06-05 Thread Nguyễn Thái Ngọc Duy
This function mark_new_skip_worktree() has an argument named the_index
which is also the name of a global variable. While they have different
types (the global the_index is not a pointer) mistakes can easily
happen and it's also confusing for readers. Rename the function
argument to something other than the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 5d06aa9c98..45fcda3169 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1231,7 +1231,7 @@ static int clear_ce_flags(struct cache_entry **cache, int 
nr,
  * Set/Clear CE_NEW_SKIP_WORKTREE according to $GIT_DIR/info/sparse-checkout
  */
 static void mark_new_skip_worktree(struct exclude_list *el,
-  struct index_state *the_index,
+  struct index_state *istate,
   int select_flag, int skip_wt_flag)
 {
int i;
@@ -1240,8 +1240,8 @@ static void mark_new_skip_worktree(struct exclude_list 
*el,
 * 1. Pretend the narrowest worktree: only unmerged entries
 * are checked out
 */
-   for (i = 0; i < the_index->cache_nr; i++) {
-   struct cache_entry *ce = the_index->cache[i];
+   for (i = 0; i < istate->cache_nr; i++) {
+   struct cache_entry *ce = istate->cache[i];
 
if (select_flag && !(ce->ce_flags & select_flag))
continue;
@@ -1256,8 +1256,7 @@ static void mark_new_skip_worktree(struct exclude_list 
*el,
 * 2. Widen worktree according to sparse-checkout file.
 * Matched entries will have skip_wt_flag cleared (i.e. "in")
 */
-   clear_ce_flags(the_index->cache, the_index->cache_nr,
-  select_flag, skip_wt_flag, el);
+   clear_ce_flags(istate->cache, istate->cache_nr, select_flag, 
skip_wt_flag, el);
 }
 
 static int verify_absent(const struct cache_entry *,
-- 
2.18.0.rc0.333.g22e6ee6cdf



Re: [PATCH 01/10] t: add tool to translate hash-related values

2018-06-05 Thread Torsten Bögershausen
Some style nite inline

On Mon, Jun 04, 2018 at 11:52:20PM +, brian m. carlson wrote:
> Add a test function helper, test_translate, that will produce its first
> argument if the hash in use is SHA-1 and the second if its argument is
> NewHash.  Implement a mode that can read entries from a file as well for
> reusability across tests.
> 
> For the moment, use the length of the empty blob to determine the hash
> in use.  In the future, we can change this code so that it can use the
> configuration and learn about the difference in input, output, and
> on-disk formats.
> 
> Implement two basic lookup charts, one for common invalid or synthesized
> object IDs, and one for various facts about the hash function in use.
> 
> Signed-off-by: brian m. carlson 
> ---
>  t/test-lib-functions.sh | 40 
>  t/translate/hash-info   |  9 +
>  t/translate/oid | 15 +++
>  3 files changed, 64 insertions(+)
>  create mode 100644 t/translate/hash-info
>  create mode 100644 t/translate/oid
> 
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 2b2181dca0..0e7067460b 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1147,3 +1147,43 @@ depacketize () {
>   }
>   '
>  }
> +
> +test_translate_f_ () {
> + local file="$TEST_DIRECTORY/translate/$2" &&

Unless I'm wrong, we don't use the "local" keyword ?

> + perl -e '

The bare "perl" is better spelled as "$PERL_PATH"


> + $delim = "\t";
> + ($hoidlen, $file, $arg) = @ARGV;
> + open($fh, "<", $file) or die "open: $!";
> + while (<$fh>) {
> + # Allow specifying other delimiters.
> + $delim = $1 if /^#!\sdelimiter\s(.)/;
> + next if /^#/;
> + @fields = split /$delim/, $_, 3;
> + if ($fields[0] eq $arg) {
> + print($hoidlen == 40 ? $fields[1] : $fields[2]);
> + last;
> + }
> + }
> + ' "$1" "$file" "$3"
> +}
> +
> +# Without -f, print the first argument if we are using SHA-1 and the second 
> if
> +# we're using NewHash.
> +# With -f FILE ARG, read the (by default) tab-delimited file from
> +# t/translate/FILE, finding the first field matching ARG and printing either 
> the
> +# second or third field depending on the hash in use.
> +test_translate () {
> + local hoidlen=$(printf "%s" "$EMPTY_BLOB" | wc -c) &&
> + if [ "$1" = "-f" ]

Style nit, please avoid [] and use test:
if test "$1" = "-f"

And more [] below

> + then
> + shift &&
> + test_translate_f_ "$hoidlen" "$@"
> + else
> + if [ "$hoidlen" -eq 40 ]
> + then
> + printf "%s" "$1"
> + else
> + printf "%s" "$2"
> + fi
> + fi
> +}
> diff --git a/t/translate/hash-info b/t/translate/hash-info
> new file mode 100644
> index 00..36cbd9a8eb
> --- /dev/null
> +++ b/t/translate/hash-info
> @@ -0,0 +1,9 @@
> +# Various facts about the hash algorithm in use for easy access in tests.
> +#
> +# Several aliases are provided for easy recall.
> +rawsz20  32
> +oidlen   20  32
> +hexsz40  64
> +hexoidlen40  64
> +zero 
> 
> +zero-oid 
> 
> diff --git a/t/translate/oid b/t/translate/oid
> new file mode 100644
> index 00..8de0fd64af
> --- /dev/null
> +++ b/t/translate/oid
> @@ -0,0 +1,15 @@
> +# These are some common invalid and partial object IDs used in tests.
> +001  0001
> 0001
> +002  0002
> 0002
> +003  0003
> 0003
> +004  0004
> 0004
> +005  0005
> 0005
> +006  0006
> 0006
> +007  0007
> 0007
> +# All zeros or Fs missing one or two hex segments.
> +zero-1   000 
> 000
> +zero-2   00  
> 0

<    1   2