Re: [PATCH v2 1/2] commit: fix --short and --porcelain options

2018-05-01 Thread Junio C Hamano
Samuel Lijin  writes:

> Mark the commitable flag in the wt_status object in the call to
> `wt_status_collect()`, instead of in `wt_longstatus_print_updated()`,
> and simplify the logic in the latter function to take advantage of the
> logic shifted to the former. This means that callers do not need to use
> `wt_longstatus_print_updated()` to collect the `commitable` flag;
> calling `wt_status_collect()` is sufficient.
>
> As a result, invoking `git commit` with `--short` or `--porcelain`
> (which imply `--dry-run`, but previously returned an inconsistent error
> code inconsistent with dry run behavior) correctly returns status code
> zero when there is something to commit. This fixes two bugs documented
> in the test suite.


Hmm, I couldn't quite get what the above two paragraphs were trying
to say, but I think I figured out by looking at wt_status.c before
applying this patch, so let me see if I correctly understand what
this patch is about by thinking a bit aloud.

There are only two assignments to s->commitable in wt-status.c; one
happens in wt_longstatus_print_updated(), when the function notices
there is even one record to be shown (i.e. there is an "updated"
path) and the other in show_merge_in_progress() which is called by
wt_longstatus_prpint_state().  The latter codepath happens when we
are in a merge and there is no remaining conflicted paths (the code
allows the contents to be committed to be identical to HEAD).  Both
are called from wt_longstatus_print(), which in turn is called by
wt_status_print().

The implication of the above observation is that we do not set
commitable bit (by the way, shouldn't we spell it with two 'T's?)
if we are not doing the long format status.  The title hints at it
but "fix" is too vague.  It would be easier to understand if it
began like this (i.e. state problem clearly first, before outlining
the solution):

[PATCH 1/2] commit: fix exit status under --short/--porcelain options

In wt-status.c, s->commitable bit is set only in the
codepaths reachable from wt_status_print() when output
format is STATUS_FORMAT_LONG as a side effect of printing
bits of status.  Consequently, when running with --short and
--porcelain options, the bit is not set to reflect if there
is anything to be committed, and "git commit --short" or
"--porcelain" (both of which imply "--dry-run") failed to
signal if there is anything to commit with its exit status.

Instead, update s->commitable bit in wt_status_collect(),
regardless of the output format. ...

Is that what is going on here?  Yours made it sound as if moving the
code to _collect() was done for the sake of moving code around and
simplifying the logic, and bugfix fell out of the move merely as a
side effect, which probably was the source of my confusion.

> +static void wt_status_mark_commitable(struct wt_status *s) {
> + int i;
> +
> + for (i = 0; i < s->change.nr; i++) {
> + struct wt_status_change_data *d = (s->change.items[i]).util;
> +
> + if (d->index_status && d->index_status != DIFF_STATUS_UNMERGED) 
> {
> + s->commitable = 1;
> + return;
> + }
> + }
> +}

I am not sure if this is sufficient.  From a cursory look of the
existing code (and vague recollection in my ageing brain ;-), I
think we say it is committable if

 (1) when not merging, there is something to show in the "to be
 committed" section (i.e. there must be something changed since
 HEAD in the index).

 (2) when merging, no conflicting paths remain (i.e. change.nr being
 zero is fine).

So it is unclear to me how you are dealing with (2) under "--short"
option, which does not call show_merge_in_progress() to catch that
case.


subscription (was: (no subject))

2018-05-01 Thread Eric Wong
Daniel Villeneuve  wrote:
> subscribe

That line should be "subscribe git" and it needs to be
sent to majord...@vger.kernel.org , not this list.


Re: [PATCH] revisions.txt: expand tabs to spaces in diagram

2018-05-01 Thread Junio C Hamano
Martin Ågren  writes:

> The diagram renders fine in AsciiDoc before and after this patch.
> Asciidoctor, on the other hand, ignores the tabs entirely, which results
> in different indentation for different lines. The graph illustration
> earlier in the document already uses spaces instead of a tab.

Ouch.  We might want to teach Documentation/.gitattributes that
indent-with-tab is unwelcome in that directory, after making this
fix (and possibly similar ones to other files).


Re: [PATCH 0/3] Add --progress and --dissociate to git submodule

2018-05-01 Thread Junio C Hamano
Casey Fitzpatrick  writes:

> These patches add --progress and --dissociate options to git submodule.
>
> The --progress option existed beforehand, but only for the update command and
> it was left undocumented.
>
> Both add and update submodule commands supported --reference, but not its pair
> option --dissociate which allows for independent clones rather than depending
> on the reference.
>
> This is a resubmission with comments from Stefan Beller and Eric Sunshine
> addressed.

Readers would really appreciate it if these are prepared with
format-patch with -v$N option.  Unless they read faster than you
post patches, they will find a few messages identically titled, all
unread in their mailbox, and it is not always easy to tell which
ones are the latest.

Thanks.


Re: git-submodule is missing --dissociate option

2018-05-01 Thread Junio C Hamano
Stefan Beller  writes:

>> As far as I am aware this can be worked around with 'git repack -a'
>> and manual removal of the objects/info/alternates file afterward.
>> Though I don't know if this results in a less speedy clone than
>> dissociate would.
>
> That is an interesting workaround!

It's not just "workaround", but actually is an implementation of
that exact option, no?


Re: git-submodule is missing --dissociate option

2018-05-01 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Mon, Apr 30, 2018 at 1:30 PM, Casey Fitzpatrick  wrote:
>> It also seems to be missing "--progress", and I imagine others.
>> Perhaps submodule add/update should be reworked to automatically
>> accept all the options that clone would?
>
> --progress is not missing, but I see that it isn't documented. It was
> added in 72c5f88311 ("clone: pass --progress decision to recursive
> submodules", 2016-09-22). What you're suggesting makes sense, but as
> shown in that commit it's not easy for it to happen automatically,
> there's a lot of boilerplate involved.
>
> But since you're interested you can see how to add new options with
> that patch, it should be easy for anyone not experienced with the
> codebase, it's all just boilerplate + adding a test.

I think it is going in the right direction overall, but a few corner
cases may need special attention.  "add" may be adding a new module
that locally originates, in which case "clone" is not relevant.
Similarly, for "update"options for "clone" are not relevant unless
it is the very initial one.


Re: What's cooking in git.git (Apr 2018, #04; Mon, 30)

2018-05-01 Thread Junio C Hamano
Johannes Schindelin  writes:

> So the problem you found is not a problem with *my* branch, of course, as
> I did not fork off of ...

Correct; there is no blame on you with the choice of the base.  It
was my mistake that I didn't check if the series could be queueable
on the maintenance track that is one release older than the current
'maint'.

As I wrote elsewhere, for a low-impact and ralatively old issue like
this, it is OK for a fix to use supporting code that only exists in
more recent codebase and become unmergeable to anything older than
the concurrent 'maint' track as of the time when the fix is written.
Even though it is sometimes nicer if the fix were written to be
mergeable to codebase near the point where the issue originates, it
is often not worth doing so if it requires bending backwards to
refrain from using a newer and more convenient facility.



Re: [PATCH 2/6] t1406: prepare for the refs code to fail with BUG()

2018-05-01 Thread Junio C Hamano
Duy Nguyen  writes:

> On Mon, Apr 30, 2018 at 12:17 AM, Johannes Schindelin
>  wrote:
>> t1406 specifically verifies that certain code paths fail with a BUG: ...
>> message.
>>
>> In the upcoming commit, we will convert that message to be generated via
>> BUG() instead of die("BUG: ..."), which implies SIGABRT instead of a
>> regular exit code.
>
> On the other hand, SIGABRT on linux creates core dumps. And on some
> setup (like mine) core dumps may be redirected to some central place
> via /proc/sys/kernel/core_pattern. I think systemd does it too but I
> didn't check.
>
> This moving to SIGABRT when we know it _will_ happen when running the
> test suite will accumulate core dumps over time and not cleaned up by
> the test suite. Maybe keeping die("BUG: here is a good compromise.

I do not think it is.  At regular runtime, we _do_ want Git to dump
core if it triggers BUG() condition, whose point is to mark
conditions that should never happen.

As discussed in this thread, tests that use t/helper/ executables
that try to trickle BUG() codepath to ensure that these "should
never happen" conditions are caught do need to deal with it.  If
dumping core is undesirable, tweaking BUG() implementation so that
it becomes die("BUG: ...") *ONLY* when the caller knows what it is
doing (e.g. running t/helper/ commands) is probably a good idea.
Perhaps GIT_TEST_OPTS can gain one feature "--bug-no-abort" and set
an environment variable so that implementation of BUG() can notice,
or something.

When we are testing normal parts of Git outside t/helper/, we never
want to hit BUG().  Aborting and dumping core when that happens is
an desirable outcome.  From that point of view, the idea in 1/6 of
this patch series to annotate test_must_fail and say "we know this
one is going to hit BUG()" is a sound one.  The implementation in
1/6 to treat SIGABRT as an acceptable failure needs to be replaced
to instead use the above mechanism you would use to tell BUG() not
to abort but die with message to arrange that to happen before
running the git command (most likely something from t/helper/) under
test_must_fail ok=sigabrt; and then those who regularly break their
Git being tested (read: us devs) and hit BUG() could instead set the
environment variable (i.e. internal implementation detail) manually
in their environment to turn these BUG()s into die("BUG:...)s while
testing their early mistakes if they do not want core (of course,
you could just do "ulimit -c", and that may be simpler solution of
your "testing Git contaminates central core depot" issue).








[GSoC] Info: new blog series of my work on Git GSoC '18

2018-05-01 Thread Pratik Karki
Hello mentors,

As promised in my proposal, I've started
to write a blog series of GSoC '18 with Git. The initial blog is up.
You can find it here[1]. The initial one is just to get started and
from next iterations, I'll start detailing of my work towards converting
rebase to builtin.

[1]: https://prertik.github.io/categories/git/

Cheers,
Pratik Karki


Re: [PATCH v4 1/2] blame: prevent error if range ends past end of file

2018-05-01 Thread Junio C Hamano
Isabella Stephens  writes:

> This is the existing behaviour. -L10,-20 for example will blame the
> first 10 lines of a file, it will not fail. My patch doesn't change
> this. The case I am discussing is -L,-20 which at the moment blames
> the first line of the file. Trying to go backwards from the start of
> a file should be considered invalid, in my opinion, however I don't
> feel strongly about it - I don't expect this case is common in 
> practice.

I tend to think that -L,-20 is a sloppy spelling of -L1,-20
(i.e. anything omitted gets reasonable default, and for something
that specifies both ends, i.e. ",", the beginning and
the end of the file would be such reasonable default, respectively),
and as such I would imagine that the user would expect the same
behaviour as -L1,-20.  If the longhand version gives only the first
line (i.e. show up to 20 lines ending at line #1), I'd say that
sounds sensible.

Or does -L1,-20 show nothing and consider the input invalid?  If so,
then sure, -L,-20 should also be an invalid input.



[no subject]

2018-05-01 Thread Daniel Villeneuve

subscribe


[PATCH v2] format-patch: make cover letters always text/plain

2018-05-01 Thread brian m. carlson
When formatting a series of patches using --attach and --cover-letter,
the cover letter lacks the closing MIME boundary, violating RFC 2046.
Certain clients, such as Thunderbird, discard the message body in such a
case.

Since the cover letter is just one part and sending it as
multipart/mixed is not very useful, always emit it as text/plain,
avoiding the boundary problem altogether.

Reported-by: Patrick Hemmer 
Signed-off-by: brian m. carlson 
---
Changes from v1:
* Switch from rm -r to rm -fr.

 builtin/log.c   | 2 +-
 log-tree.c  | 7 ---
 log-tree.h  | 3 ++-
 t/t4014-format-patch.sh | 9 +
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 71f68a3e4f..24868ed070 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1019,7 +1019,7 @@ static void make_cover_letter(struct rev_info *rev, int 
use_stdout,
open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", 
rev, quiet))
return;
 
-   log_write_email_headers(rev, head, _subject, _8bit_cte);
+   log_write_email_headers(rev, head, _subject, _8bit_cte, 
0);
 
for (i = 0; !need_8bit_cte && i < nr; i++) {
const char *buf = get_commit_buffer(list[i], NULL);
diff --git a/log-tree.c b/log-tree.c
index d1c0bedf24..9f5eb346a4 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -362,7 +362,8 @@ void fmt_output_email_subject(struct strbuf *sb, struct 
rev_info *opt)
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 const char **extra_headers_p,
-int *need_8bit_cte_p)
+int *need_8bit_cte_p,
+int maybe_multipart)
 {
const char *extra_headers = opt->extra_headers;
const char *name = oid_to_hex(opt->zero_commit ?
@@ -385,7 +386,7 @@ void log_write_email_headers(struct rev_info *opt, struct 
commit *commit,
   opt->ref_message_ids->items[i].string);
graph_show_oneline(opt->graph);
}
-   if (opt->mime_boundary) {
+   if (opt->mime_boundary && maybe_multipart) {
static char subject_buffer[1024];
static char buffer[1024];
struct strbuf filename =  STRBUF_INIT;
@@ -610,7 +611,7 @@ void show_log(struct rev_info *opt)
 
if (cmit_fmt_is_mail(opt->commit_format)) {
log_write_email_headers(opt, commit, _headers,
-   _8bit_cte);
+   _8bit_cte, 1);
ctx.rev = opt;
ctx.print_email_subject = 1;
} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
diff --git a/log-tree.h b/log-tree.h
index deba035187..e668628074 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -27,7 +27,8 @@ void format_decorations_extended(struct strbuf *sb, const 
struct commit *commit,
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 const char **extra_headers_p,
-int *need_8bit_cte_p);
+int *need_8bit_cte_p,
+int maybe_multipart);
 void load_ref_decorations(struct decoration_filter *filter, int flags);
 
 #define FORMAT_PATCH_NAME_MAX 64
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 482112ca33..6ea08fd5e9 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -1661,6 +1661,15 @@ test_expect_success 'format-patch --base with --attach' '
test_write_lines 1 2 >expect &&
test_cmp expect actual
 '
+test_expect_success 'format-patch --attach cover-letter only is non-multipart' 
'
+   test_when_finished "rm -fr patches" &&
+   git format-patch -o patches --cover-letter --attach=mimemime 
--base=HEAD~ -1 &&
+   ! egrep "^--+mimemime" patches/*.patch &&
+   egrep "^--+mimemime$" patches/0001*.patch >output &&
+   test_line_count = 2 output &&
+   egrep "^--+mimemime--$" patches/0001*.patch >output &&
+   test_line_count = 1 output
+'
 
 test_expect_success 'format-patch --pretty=mboxrd' '
sp=" " &&


Re: js/no-pager-shorthand [

2018-05-01 Thread Junio C Hamano
Eric Sunshine  writes:

> Although paged output is generally nice, I frequently find myself
> typing --no-pager numerous times during the day, so I, for one,
> welcome having a short option (and would be sad to see this patch
> retracted). As with Hannes, I too find -P a reasonably intuitive and
> easy-to-remember short option.

Yeah, we often see in other people's tools that the -X option means
opposite of the -x option, and even though "git --paginate" may not
have short-and-sweet "git -p" short-hand (or does it?) it would be
natural to expect "-p" to mean that, and I do not find it a stretch
for "-P" to mean opposite of it by extension.

... goes and looks ...

Ah, "-p" does mean "--paginate", it seems.  So I am not opposed to
resurrecting the patch with s/N/P/, if that is what people want.  On
the other hand, I am perfectly OK if the patch stays retracted.


Re: What's cooking in git.git (Apr 2018, #04; Mon, 30)

2018-05-01 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Mon, Apr 30, 2018 at 5:25 AM, Junio C Hamano  wrote:
>> * js/rebase-i-clean-msg-after-fixup-continue (2018-04-30) 4 commits
>>   - rebase --skip: clean up commit message after a failed fixup/squash
>>   - sequencer: always commit without editing when asked for
>>   - rebase -i: Handle "combination of  commits" with GETTEXT_POISON
>>   - rebase -i: demonstrate bugs with fixup!/squash! commit messages
>
>>   "git rebase -i" sometimes left intermediate "# This is a
>>   combination of N commits" message meant for the human consumption
>>   inside an editor in the final result in certain corner cases, which
>>   has been fixed.
>
>>   Will merge to 'next'.
>
> This topic branches off from v2.16.3.  However, its last patch uses
> the sequencer's parse_head() function, which was only added in
> v2.17.0-rc0~110^2~6 (sequencer: try to commit without forking 'git
> commit', 2017-11-24), in topic 'pw/sequencer-in-process-commit',
> leading to compilation errors.

For a low-impact and ralatively old issue like this, it is OK for a
fix to use supporting code that only exists in more recent codebase
and become unmergeable to anything older than the concurrent 'maint'
track as of the time when the fix is written.  Even though it is
sometimes nicer if the fix were written to be mergeable to codebase
near the point where the issue originates, it is often not worth
doing so if it requires bending backwards to refrain from using a
newer and more convenient facility.

Thanks for catching my wishful thinking and carelessness before it
propagated to 'next'.  I try to check tips of individual topic
branches and also the integratoin result, but sometimes I get
trusting too much and get sloppy when dealing with too many topics
in flight.


[PATCH 1/3] submodule: clean up subsititions in script

2018-05-01 Thread Casey Fitzpatrick
'recommend_shallow' and 'jobs' variables do not need quotes (they never contain
spaces) and do not require any additional prefix, therefore remove the
unnecessary subsitition.

'progress' is a boolean value. Treat it like the other boolean values in the
script by using a substitution.

Signed-off-by: Casey Fitzpatrick 
---
 git-submodule.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 24914963c..262547968 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -465,7 +465,7 @@ cmd_update()
GIT_QUIET=1
;;
--progress)
-   progress="--progress"
+   progress=1
;;
-i|--init)
init=1
@@ -542,14 +542,14 @@ cmd_update()
 
{
git submodule--helper update-clone ${GIT_QUIET:+--quiet} \
-   ${progress:+"$progress"} \
+   ${progress:+"--progress"} \
${wt_prefix:+--prefix "$wt_prefix"} \
${prefix:+--recursive-prefix "$prefix"} \
${update:+--update "$update"} \
${reference:+"$reference"} \
${depth:+--depth "$depth"} \
-   ${recommend_shallow:+"$recommend_shallow"} \
-   ${jobs:+$jobs} \
+   $recommend_shallow \
+   $jobs \
"$@" || echo "#unmatched" $?
} | {
err=
-- 
2.17.0.1.ge0414f29c.dirty



[PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.

Signed-off-by: Casey Fitzpatrick 
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..a85b30419 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate)
+   argv_array_push(, "--dissociate");
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1305,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1321,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissociate");
if (suc->depth)
argv_array_push(>args, suc->depth);
 
@@ -1575,6 +1583,8 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
   

[PATCH 2/3] submodule: add --progress option to add command

2018-05-01 Thread Casey Fitzpatrick
The '--progress' was introduced in 72c5f88311d (clone: pass --progress
decision to recursive submodules, 2016-09-22) to fix the progress reporting
of the clone command. Also add the progress option to the 'submodule add'
command. The update command already supports the progress flag, but it
is not documented.

Signed-off-by: Casey Fitzpatrick 
---
 Documentation/git-submodule.txt |  7 +++
 git-submodule.sh|  5 -
 t/t7400-submodule-basic.sh  | 16 
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 71c5618e8..d1ebe32e8 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -239,6 +239,13 @@ OPTIONS
 --quiet::
Only print error messages.
 
+--progress::
+   This option is only valid for add and update commands.
+   Progress status is reported on the standard error stream
+   by default when it is attached to a terminal, unless -q
+   is specified. This flag forces progress status even if the
+   standard error stream is not directed to a terminal.
+
 --all::
This option is only valid for the deinit command. Unregister all
submodules in the working tree.
diff --git a/git-submodule.sh b/git-submodule.sh
index 262547968..39c241a1d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -117,6 +117,9 @@ cmd_add()
-q|--quiet)
GIT_QUIET=1
;;
+   --progress)
+   progress=1
+   ;;
--reference)
case "$2" in '') usage ;; esac
reference_path=$2
@@ -255,7 +258,7 @@ or you are unsure what this means choose another name with 
the '--name' option."
eval_gettextln "Reactivating local git 
directory for submodule '\$sm_name'."
fi
fi
-   git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix 
"$wt_prefix" --path "$sm_path" --name "$sm_name" --url "$realrepo" 
${reference:+"$reference"} ${depth:+"$depth"} || exit
+   git submodule--helper clone ${GIT_QUIET:+--quiet} 
${progress:+"--progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
"$sm_name" --url "$realrepo" ${reference:+"$reference"} ${depth:+"$depth"} || 
exit
(
sanitize_submodule_env
cd "$sm_path" &&
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index a39e69a3e..b5b4922ab 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -126,6 +126,22 @@ test_expect_success 'submodule add' '
test_cmp empty untracked
 '
 
+test_create_repo parent &&
+test_commit -C parent one
+
+test_expect_success 'redirected submodule add does not show progress' '
+   git -C addtest submodule add "file://$submodurl/parent" 
submod-redirected \
+   2>err &&
+   ! grep % err &&
+   test_i18ngrep ! "Checking connectivity" err
+'
+
+test_expect_success 'redirected submodule add --progress does show progress' '
+   git -C addtest submodule add --progress "file://$submodurl/parent" \
+   submod-redirected-progress 2>err && \
+   grep % err
+'
+
 test_expect_success 'submodule add to .gitignored path fails' '
(
cd addtest-ignore &&
-- 
2.17.0.1.ge0414f29c.dirty



[PATCH 0/3] Add --progress and --dissociate to git submodule

2018-05-01 Thread Casey Fitzpatrick
These patches add --progress and --dissociate options to git submodule.

The --progress option existed beforehand, but only for the update command and
it was left undocumented.

Both add and update submodule commands supported --reference, but not its pair
option --dissociate which allows for independent clones rather than depending
on the reference.

This is a resubmission with comments from Stefan Beller and Eric Sunshine
addressed.

Casey Fitzpatrick (3):
  submodule: clean up subsititions in script
  submodule: add --progress option to add command
  submodule: add --dissociate option to add/update commands

 Documentation/git-submodule.txt | 17 -
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 21 -
 t/t7400-submodule-basic.sh  | 16 
 t/t7408-submodule-reference.sh  | 17 +
 5 files changed, 78 insertions(+), 9 deletions(-)

-- 
2.17.0.1.ge0414f29c.dirty



Re: [PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Just noticed I missed the other 'test_must_fail'. Resubmitting in a few moments.

On Tue, May 1, 2018 at 8:27 PM, Casey Fitzpatrick  wrote:
> Add --dissociate option to add and update commands, both clone helper commands
> that already have the --reference option --dissociate pairs with.
>
> Signed-off-by: Casey Fitzpatrick 
> ---
>  Documentation/git-submodule.txt | 10 +-
>  builtin/submodule--helper.c | 16 +---
>  git-submodule.sh| 10 +-
>  t/t7408-submodule-reference.sh  | 17 +
>  4 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index d1ebe32e8..a75b95043 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -369,7 +369,15 @@ the submodule itself.
> this option will be passed to the linkgit:git-clone[1] command.
>  +
>  *NOTE*: Do *not* use this option unless you have read the note
> -for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
> +for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
> +options carefully.
> +
> +--dissociate::
> +   This option is only valid for add and update commands.  These
> +   commands sometimes need to clone a remote repository. In this case,
> +   this option will be passed to the linkgit:git-clone[1] command.
> ++
> +*NOTE*: see the NOTE for the `--reference` option.
>
>  --recursive::
> This option is only valid for foreach, update, status and sync 
> commands.
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 6ba8587b6..a85b30419 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
> const char *prefix)
>  }
>
>  static int clone_submodule(const char *path, const char *gitdir, const char 
> *url,
> -  const char *depth, struct string_list *reference,
> +  const char *depth, struct string_list *reference, 
> int dissociate,
>int quiet, int progress)
>  {
> struct child_process cp = CHILD_PROCESS_INIT;
> @@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
> *gitdir, const char *url
> argv_array_pushl(, "--reference",
>  item->string, NULL);
> }
> +   if (dissociate)
> +   argv_array_push(, "--dissociate");
> if (gitdir && *gitdir)
> argv_array_pushl(, "--separate-git-dir", gitdir, 
> NULL);
>
> @@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
> char *p, *path = NULL, *sm_gitdir;
> struct strbuf sb = STRBUF_INIT;
> struct string_list reference = STRING_LIST_INIT_NODUP;
> +   int dissociate = 0;
> char *sm_alternate = NULL, *error_strategy = NULL;
>
> struct option module_clone_options[] = {
> @@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
> OPT_STRING_LIST(0, "reference", ,
>N_("repo"),
>N_("reference repository")),
> +   OPT_BOOL(0, "dissociate", ,
> +  N_("use --reference only while cloning")),
> OPT_STRING(0, "depth", ,
>N_("string"),
>N_("depth for shallow clones")),
> @@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
>
> prepare_possible_alternates(name, );
>
> -   if (clone_submodule(path, sm_gitdir, url, depth, ,
> +   if (clone_submodule(path, sm_gitdir, url, depth, , 
> dissociate,
> quiet, progress))
> die(_("clone of '%s' into submodule path '%s' 
> failed"),
> url, path);
> @@ -1300,6 +1305,7 @@ struct submodule_update_clone {
> int quiet;
> int recommend_shallow;
> struct string_list references;
> +   int dissociate;
> const char *depth;
> const char *recursive_prefix;
> const char *prefix;
> @@ -1315,7 +1321,7 @@ struct submodule_update_clone {
> int failed_clones_nr, failed_clones_alloc;
>  };
>  #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
> -   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
> +   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
> NULL, NULL, NULL, \
> STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
>
> @@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
> cache_entry *ce,
> for_each_string_list_item(item, >references)
>

[PATCH v2 3/3] {fetch,upload}-pack: support filter in protocol v2

2018-05-01 Thread Jonathan Tan
The fetch-pack/upload-pack protocol v2 was developed independently of
the filter parameter (used in partial fetches), thus it did not include
support for it. Add support for the filter parameter.

Like in the legacy protocol, the server advertises and supports "filter"
only if uploadpack.allowfilter is configured.

Like in the legacy protocol, the client continues with a warning if
"--filter" is specified, but the server does not advertise it.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/protocol-v2.txt |  9 +++
 fetch-pack.c| 23 +-
 t/t5702-protocol-v2.sh  | 98 +
 upload-pack.c   | 15 +++-
 4 files changed, 140 insertions(+), 5 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 136179d7d..38d24fd2b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -290,6 +290,15 @@ included in the clients request as well as the potential 
addition of the
Cannot be used with "deepen", but can be used with
"deepen-since".
 
+If the 'filter' feature is advertised, the following argument can be
+included in the client's request:
+
+filter 
+   Request that various objects from the packfile be omitted
+   using one of several filtering techniques. These are intended
+   for use with partial clone and partial fetch operations. See
+   `rev-list` for possible "filter-spec" values.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
diff --git a/fetch-pack.c b/fetch-pack.c
index f93723fec..3ed40aa46 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1191,14 +1191,29 @@ static int send_fetch_request(int fd_out, const struct 
fetch_pack_args *args,
else if (is_repository_shallow() || args->deepen)
die(_("Server does not support shallow requests"));
 
+   /* Add filter */
+   if (server_supports_feature("fetch", "filter", 0) &&
+   args->filter_options.choice) {
+   print_verbose(args, _("Server supports filter"));
+   packet_buf_write(_buf, "filter %s",
+args->filter_options.filter_spec);
+   } else if (args->filter_options.choice) {
+   warning("filtering not recognized by server, ignoring");
+   }
+
/* add wants */
add_wants(wants, _buf);
 
-   /* Add all of the common commits we've found in previous rounds */
-   add_common(_buf, common);
+   if (args->no_dependents) {
+   packet_buf_write(_buf, "done");
+   ret = 1;
+   } else {
+   /* Add all of the common commits we've found in previous rounds 
*/
+   add_common(_buf, common);
 
-   /* Add initial haves */
-   ret = add_haves(_buf, haves_to_send, in_vain);
+   /* Add initial haves */
+   ret = add_haves(_buf, haves_to_send, in_vain);
+   }
 
/* Send request */
packet_buf_flush(_buf);
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 0ead3..858d3bb2d 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -215,6 +215,104 @@ test_expect_success 'upload-pack respects config using 
protocol v2' '
test -f server/.git/hookout
 '
 
+test_expect_success 'setup filter tests' '
+   rm -rf server client &&
+   git init server &&
+
+   # 1 commit to create a file, and 1 commit to modify it
+   test_commit -C server message1 a.txt &&
+   test_commit -C server message2 a.txt &&
+   git -C server config protocol.version 2 &&
+   git -C server config uploadpack.allowfilter 1 &&
+   git -C server config uploadpack.allowanysha1inwant 1 &&
+   git -C server config protocol.version 2
+'
+
+test_expect_success 'partial clone' '
+   GIT_TRACE_PACKET="$(pwd)/trace" git -c protocol.version=2 \
+   clone --filter=blob:none "file://$(pwd)/server" client &&
+   grep "version 2" trace &&
+
+   # Ensure that the old version of the file is missing
+   git -C client rev-list master --quiet --objects --missing=print \
+   >observed.oids &&
+   grep "$(git -C server rev-parse message1:a.txt)" observed.oids &&
+
+   # Ensure that client passes fsck
+   git -C client fsck
+'
+
+test_expect_success 'dynamically fetch missing object' '
+   rm "$(pwd)/trace" &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   cat-file -p $(git -C server rev-parse message1:a.txt) &&
+   grep "version 2" trace
+'
+
+test_expect_success 'partial fetch' '
+   rm -rf client "$(pwd)/trace" &&
+   git init client &&
+   SERVER="file://$(pwd)/server" &&
+   test_config -C client extensions.partialClone "$SERVER" &&

[PATCH v2 0/3] Supporting partial clones in protocol v2

2018-05-01 Thread Jonathan Tan
> This is a bit difficult to read and there is no reason why we would need
> to read the entire upload_pack_config to determine if we need to filter
> things (we will need to read the config if cmd "fetch" is requested
> though).  Instead it may be better to do the following:
> 
>   if (value) {
> strbuf_addstr(value, "shallow");
> if (repo_config_get(r, "uplaodpack.filter"))
>   strbuf_addstr(value, " filter");
>   }
> 
> This way its easier to read and you only are reading the required value
> from the config.

Thanks, Brandon. I went ahead and used repo_config_get_bool(), and
indeed it works.

Removing the call to git_config() from there exposed another issue in
that configs were not read if upload-pack was used with protocol v2, so
I inserted a patch in the middle addressing that. While writing that
patch, I noticed that uploadpack.packobjectshook couldn't take filenames
with spaces, which I think is due to prepare_shell_cmd() in
run-command.c not quoting properly. Adding single quotes around "%s"
worked, but made other tests fail. Instead of continuing down that
rabbit hole, I just made the uploadpack.packobjectshook not have any
spaces, just like in t5544.

Jonathan Tan (3):
  upload-pack: fix error message typo
  upload-pack: read config when serving protocol v2
  {fetch,upload}-pack: support filter in protocol v2

 Documentation/technical/protocol-v2.txt |   9 ++
 fetch-pack.c|  23 -
 t/t5701-git-serve.sh|  14 +++
 t/t5702-protocol-v2.sh  | 112 
 upload-pack.c   |  19 +++-
 5 files changed, 171 insertions(+), 6 deletions(-)

-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH v2 2/3] upload-pack: read config when serving protocol v2

2018-05-01 Thread Jonathan Tan
The upload-pack code paths never call git_config() with
upload_pack_config() when protocol v2 is used, causing options like
uploadpack.packobjectshook to not take effect. Ensure that this function
is called.

Signed-off-by: Jonathan Tan 
---
 t/t5702-protocol-v2.sh | 14 ++
 upload-pack.c  |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 56f7c3c32..0ead3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -201,6 +201,20 @@ test_expect_success 'ref advertisment is filtered during 
fetch using protocol v2
! grep "refs/tags/three" log
 '
 
+test_expect_success 'upload-pack respects config using protocol v2' '
+   git init server &&
+   write_script server/.git/hook <<-\EOF &&
+   touch hookout
+   "$@"
+   EOF
+   test_commit -C server one &&
+
+   test_config_global uploadpack.packobjectshook ./hook &&
+   test ! -f server/.git/hookout &&
+   GIT_TRACE=/tmp/y git -c protocol.version=2 clone "file://$(pwd)/server" 
client &&
+   test -f server/.git/hookout
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
diff --git a/upload-pack.c b/upload-pack.c
index c4456bb88..113edd32d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1376,6 +1376,8 @@ int upload_pack_v2(struct repository *r, struct 
argv_array *keys,
enum fetch_state state = FETCH_PROCESS_ARGS;
struct upload_pack_data data;
 
+   git_config(upload_pack_config, NULL);
+
upload_pack_data_init();
use_sideband = LARGE_PACKET_MAX;
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH v2 1/3] upload-pack: fix error message typo

2018-05-01 Thread Jonathan Tan
Fix a typo in an error message.

Also, this line was introduced in 3145ea957d2c ("upload-pack: introduce
fetch server command", 2018-03-15), which did not contain a test for the
case which causes this error to be printed, so introduce a test.

Signed-off-by: Jonathan Tan 
---
 t/t5701-git-serve.sh | 14 ++
 upload-pack.c|  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 72d7bc562..1b4b13cc2 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -173,4 +173,18 @@ test_expect_success 'symrefs parameter' '
test_cmp actual expect
 '
 
+test_expect_success 'unexpected lines are not allowed in fetch request' '
+   git init server &&
+
+   test-pkt-line pack >in <<-EOF &&
+   command=fetch
+   0001
+   this-is-not-a-command
+   
+   EOF
+
+   test_must_fail git -C server serve --stateless-rpc /dev/null 2>err 
&&
+   grep "unexpected line: .this-is-not-a-command." err
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 87b4d32a6..c4456bb88 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1252,7 +1252,7 @@ static void process_args(struct packet_reader *request,
}
 
/* ignore unknown lines maybe? */
-   die("unexpect line: '%s'", arg);
+   die("unexpected line: '%s'", arg);
}
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 2/3] submodule: add --progress option to add command

2018-05-01 Thread Casey Fitzpatrick
The '--progress' was introduced in 72c5f88311d (clone: pass --progress
decision to recursive submodules, 2016-09-22) to fix the progress reporting
of the clone command. Also add the progress option to the 'submodule add'
command. The update command already supports the progress flag, but it
is not documented.

Signed-off-by: Casey Fitzpatrick 
---
 Documentation/git-submodule.txt |  7 +++
 git-submodule.sh|  5 -
 t/t7400-submodule-basic.sh  | 16 
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 71c5618e8..d1ebe32e8 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -239,6 +239,13 @@ OPTIONS
 --quiet::
Only print error messages.
 
+--progress::
+   This option is only valid for add and update commands.
+   Progress status is reported on the standard error stream
+   by default when it is attached to a terminal, unless -q
+   is specified. This flag forces progress status even if the
+   standard error stream is not directed to a terminal.
+
 --all::
This option is only valid for the deinit command. Unregister all
submodules in the working tree.
diff --git a/git-submodule.sh b/git-submodule.sh
index 262547968..39c241a1d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -117,6 +117,9 @@ cmd_add()
-q|--quiet)
GIT_QUIET=1
;;
+   --progress)
+   progress=1
+   ;;
--reference)
case "$2" in '') usage ;; esac
reference_path=$2
@@ -255,7 +258,7 @@ or you are unsure what this means choose another name with 
the '--name' option."
eval_gettextln "Reactivating local git 
directory for submodule '\$sm_name'."
fi
fi
-   git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix 
"$wt_prefix" --path "$sm_path" --name "$sm_name" --url "$realrepo" 
${reference:+"$reference"} ${depth:+"$depth"} || exit
+   git submodule--helper clone ${GIT_QUIET:+--quiet} 
${progress:+"--progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
"$sm_name" --url "$realrepo" ${reference:+"$reference"} ${depth:+"$depth"} || 
exit
(
sanitize_submodule_env
cd "$sm_path" &&
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index a39e69a3e..b5b4922ab 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -126,6 +126,22 @@ test_expect_success 'submodule add' '
test_cmp empty untracked
 '
 
+test_create_repo parent &&
+test_commit -C parent one
+
+test_expect_success 'redirected submodule add does not show progress' '
+   git -C addtest submodule add "file://$submodurl/parent" 
submod-redirected \
+   2>err &&
+   ! grep % err &&
+   test_i18ngrep ! "Checking connectivity" err
+'
+
+test_expect_success 'redirected submodule add --progress does show progress' '
+   git -C addtest submodule add --progress "file://$submodurl/parent" \
+   submod-redirected-progress 2>err && \
+   grep % err
+'
+
 test_expect_success 'submodule add to .gitignored path fails' '
(
cd addtest-ignore &&
-- 
2.17.0.1.ge0414f29c.dirty



[PATCH 3/3] submodule: add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.

Signed-off-by: Casey Fitzpatrick 
---
 Documentation/git-submodule.txt | 10 +-
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 10 +-
 t/t7408-submodule-reference.sh  | 17 +
 4 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index d1ebe32e8..a75b95043 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -369,7 +369,15 @@ the submodule itself.
this option will be passed to the linkgit:git-clone[1] command.
 +
 *NOTE*: Do *not* use this option unless you have read the note
-for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
+for linkgit:git-clone[1]'s `--reference`, `--shared`, and `--dissociate`
+options carefully.
+
+--dissociate::
+   This option is only valid for add and update commands.  These
+   commands sometimes need to clone a remote repository. In this case,
+   this option will be passed to the linkgit:git-clone[1] command.
++
+*NOTE*: see the NOTE for the `--reference` option.
 
 --recursive::
This option is only valid for foreach, update, status and sync commands.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6ba8587b6..a85b30419 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1056,7 +1056,7 @@ static int module_deinit(int argc, const char **argv, 
const char *prefix)
 }
 
 static int clone_submodule(const char *path, const char *gitdir, const char 
*url,
-  const char *depth, struct string_list *reference,
+  const char *depth, struct string_list *reference, 
int dissociate,
   int quiet, int progress)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1075,6 +1075,8 @@ static int clone_submodule(const char *path, const char 
*gitdir, const char *url
argv_array_pushl(, "--reference",
 item->string, NULL);
}
+   if (dissociate)
+   argv_array_push(, "--dissociate");
if (gitdir && *gitdir)
argv_array_pushl(, "--separate-git-dir", gitdir, NULL);
 
@@ -1190,6 +1192,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
char *p, *path = NULL, *sm_gitdir;
struct strbuf sb = STRBUF_INIT;
struct string_list reference = STRING_LIST_INIT_NODUP;
+   int dissociate = 0;
char *sm_alternate = NULL, *error_strategy = NULL;
 
struct option module_clone_options[] = {
@@ -1208,6 +1211,8 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
OPT_STRING_LIST(0, "reference", ,
   N_("repo"),
   N_("reference repository")),
+   OPT_BOOL(0, "dissociate", ,
+  N_("use --reference only while cloning")),
OPT_STRING(0, "depth", ,
   N_("string"),
   N_("depth for shallow clones")),
@@ -1247,7 +1252,7 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
 
prepare_possible_alternates(name, );
 
-   if (clone_submodule(path, sm_gitdir, url, depth, ,
+   if (clone_submodule(path, sm_gitdir, url, depth, , 
dissociate,
quiet, progress))
die(_("clone of '%s' into submodule path '%s' failed"),
url, path);
@@ -1300,6 +1305,7 @@ struct submodule_update_clone {
int quiet;
int recommend_shallow;
struct string_list references;
+   int dissociate;
const char *depth;
const char *recursive_prefix;
const char *prefix;
@@ -1315,7 +1321,7 @@ struct submodule_update_clone {
int failed_clones_nr, failed_clones_alloc;
 };
 #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
-   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \
+   SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
NULL, NULL, NULL, \
STRING_LIST_INIT_DUP, 0, NULL, 0, 0}
 
@@ -1442,6 +1448,8 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
for_each_string_list_item(item, >references)
argv_array_pushl(>args, "--reference", 
item->string, NULL);
}
+   if (suc->dissociate)
+   argv_array_push(>args, "--dissociate");
if (suc->depth)
argv_array_push(>args, suc->depth);
 
@@ -1575,6 +1583,8 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
   

[PATCH 1/3] submodule: clean up subsititions in script

2018-05-01 Thread Casey Fitzpatrick
'recommend_shallow' and 'jobs' variables do not need quotes (they never contain
spaces) and do not require any additional prefix, therefore remove the
unnecessary subsitition.

'progress' is a boolean value. Treat it like the other boolean values in the
script by using a substitution.
---
 git-submodule.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 24914963c..262547968 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -465,7 +465,7 @@ cmd_update()
GIT_QUIET=1
;;
--progress)
-   progress="--progress"
+   progress=1
;;
-i|--init)
init=1
@@ -542,14 +542,14 @@ cmd_update()
 
{
git submodule--helper update-clone ${GIT_QUIET:+--quiet} \
-   ${progress:+"$progress"} \
+   ${progress:+"--progress"} \
${wt_prefix:+--prefix "$wt_prefix"} \
${prefix:+--recursive-prefix "$prefix"} \
${update:+--update "$update"} \
${reference:+"$reference"} \
${depth:+--depth "$depth"} \
-   ${recommend_shallow:+"$recommend_shallow"} \
-   ${jobs:+$jobs} \
+   $recommend_shallow \
+   $jobs \
"$@" || echo "#unmatched" $?
} | {
err=
-- 
2.17.0.1.ge0414f29c.dirty



[PATCH 0/3] Add --progress and --dissociate to git submodule

2018-05-01 Thread Casey Fitzpatrick
These patches add --progress and --dissociate options to git submodule.

The --progress option existed beforehand, but only for the update command and
it was left undocumented.

Both add and update submodule commands supported --reference, but not its pair
option --dissociate which allows for independent clones rather than depending
on the reference.

This is a resubmission with comments from Stefan Beller and Eric Sunshine
addressed.

Casey Fitzpatrick (3):
  submodule: clean up subsititions in script
  submodule: add --progress option to add command
  submodule: add --dissociate option to add/update commands

 Documentation/git-submodule.txt | 17 -
 builtin/submodule--helper.c | 16 +---
 git-submodule.sh| 21 -
 t/t7400-submodule-basic.sh  | 16 
 t/t7408-submodule-reference.sh  | 17 +
 5 files changed, 78 insertions(+), 9 deletions(-)

-- 
2.17.0.1.ge0414f29c.dirty



[PATCH v2 16/42] Update struct index_state to use struct object_id

2018-05-01 Thread brian m. carlson
Adjust struct index_state to use struct object_id instead of unsigned
char [20].

Signed-off-by: brian m. carlson 
---
 cache.h  |  2 +-
 read-cache.c | 16 
 t/helper/test-dump-split-index.c |  2 +-
 unpack-trees.c   |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index f06737fb78..37d081b8e4 100644
--- a/cache.h
+++ b/cache.h
@@ -324,7 +324,7 @@ struct index_state {
 drop_cache_tree : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
-   unsigned char sha1[20];
+   struct object_id oid;
struct untracked_cache *untracked;
uint64_t fsmonitor_last_update;
struct ewah_bitmap *fsmonitor_dirty;
diff --git a/read-cache.c b/read-cache.c
index f47666b975..9dbaeeec43 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1806,7 +1806,7 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
-   hashcpy(istate->sha1, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
+   hashcpy(istate->oid.hash, (const unsigned char *)hdr + mmap_size - 
the_hash_algo->rawsz);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1902,10 +1902,10 @@ int read_index_from(struct index_state *istate, const 
char *path,
base_oid_hex = oid_to_hex(_index->base_oid);
base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
ret = do_read_index(split_index->base, base_path, 1);
-   if (hashcmp(split_index->base_oid.hash, split_index->base->sha1))
+   if (oidcmp(_index->base_oid, _index->base->oid))
die("broken index, expect %s in %s, got %s",
base_oid_hex, base_path,
-   sha1_to_hex(split_index->base->sha1));
+   oid_to_hex(_index->base->oid));
 
freshen_shared_index(base_path, 0);
merge_base_index(istate);
@@ -2194,7 +2194,7 @@ static int verify_index_from(const struct index_state 
*istate, const char *path)
if (n != the_hash_algo->rawsz)
goto out;
 
-   if (hashcmp(istate->sha1, hash))
+   if (hashcmp(istate->oid.hash, hash))
goto out;
 
close(fd);
@@ -2373,7 +2373,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
return -1;
}
 
-   if (ce_flush(, newfd, istate->sha1))
+   if (ce_flush(, newfd, istate->oid.hash))
return -1;
if (close_tempfile_gently(tempfile)) {
error(_("could not close '%s'"), tempfile->filename.buf);
@@ -2497,10 +2497,10 @@ static int write_shared_index(struct index_state 
*istate,
return ret;
}
ret = rename_tempfile(temp,
- git_path("sharedindex.%s", 
sha1_to_hex(si->base->sha1)));
+ git_path("sharedindex.%s", 
oid_to_hex(>base->oid)));
if (!ret) {
-   hashcpy(si->base_oid.hash, si->base->sha1);
-   clean_shared_index_files(sha1_to_hex(si->base->sha1));
+   oidcpy(>base_oid, >base->oid);
+   clean_shared_index_files(oid_to_hex(>base->oid));
}
 
return ret;
diff --git a/t/helper/test-dump-split-index.c b/t/helper/test-dump-split-index.c
index 754e9bb624..63c689d6ee 100644
--- a/t/helper/test-dump-split-index.c
+++ b/t/helper/test-dump-split-index.c
@@ -14,7 +14,7 @@ int cmd__dump_split_index(int ac, const char **av)
int i;
 
do_read_index(_index, av[1], 1);
-   printf("own %s\n", sha1_to_hex(the_index.sha1));
+   printf("own %s\n", oid_to_hex(_index.oid));
si = the_index.split_index;
if (!si) {
printf("not a split index\n");
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..038ef7b926 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1287,7 +1287,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
o->result.split_index = o->src_index->split_index;
if (o->result.split_index)
o->result.split_index->refcount++;
-   hashcpy(o->result.sha1, o->src_index->sha1);
+   oidcpy(>result.oid, >src_index->oid);
o->merge_size = len;
mark_all_ce_unused(o->src_index);
 


[PATCH v2 04/42] packfile: remove unused member from struct pack_entry

2018-05-01 Thread brian m. carlson
The sha1 member in struct pack_entry is unused except for one instance
in which we store a value in it.  Since nobody ever reads this value,
don't bother to compute it and remove the member from struct pack_entry.

Signed-off-by: brian m. carlson 
---
 cache.h| 1 -
 packfile.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/cache.h b/cache.h
index be9fab43ec..37d3f0104b 100644
--- a/cache.h
+++ b/cache.h
@@ -1572,7 +1572,6 @@ struct pack_window {
 
 struct pack_entry {
off_t offset;
-   unsigned char sha1[20];
struct packed_git *p;
 };
 
diff --git a/packfile.c b/packfile.c
index 0bc67d0e00..5c219d0229 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1833,7 +1833,6 @@ static int fill_pack_entry(const unsigned char *sha1,
return 0;
e->offset = offset;
e->p = p;
-   hashcpy(e->sha1, sha1);
return 1;
 }
 


[PATCH v2 08/42] packfile: abstract away hash constant values

2018-05-01 Thread brian m. carlson
There are several instances of the constant 20 and 20-based values in
the packfile code.  Abstract away dependence on SHA-1 by using the
values from the_hash_algo instead.

Use unsigned values for temporary constants to provide the compiler with
more information about what kinds of values it should expect.

Signed-off-by: brian m. carlson 
---
 packfile.c | 66 ++
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/packfile.c b/packfile.c
index 84acd405e0..b7bc4eab17 100644
--- a/packfile.c
+++ b/packfile.c
@@ -84,6 +84,7 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
uint32_t version, nr, i, *index;
int fd = git_open(path);
struct stat st;
+   const unsigned int hashsz = the_hash_algo->rawsz;
 
if (fd < 0)
return -1;
@@ -92,7 +93,7 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
return -1;
}
idx_size = xsize_t(st.st_size);
-   if (idx_size < 4 * 256 + 20 + 20) {
+   if (idx_size < 4 * 256 + hashsz + hashsz) {
close(fd);
return error("index file %s is too small", path);
}
@@ -129,11 +130,11 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
/*
 * Total size:
 *  - 256 index entries 4 bytes each
-*  - 24-byte entries * nr (20-byte sha1 + 4-byte offset)
-*  - 20-byte SHA1 of the packfile
-*  - 20-byte SHA1 file checksum
+*  - 24-byte entries * nr (object ID + 4-byte offset)
+*  - hash of the packfile
+*  - file checksum
 */
-   if (idx_size != 4*256 + nr * 24 + 20 + 20) {
+   if (idx_size != 4*256 + nr * (hashsz + 4) + hashsz + hashsz) {
munmap(idx_map, idx_size);
return error("wrong index v1 file size in %s", path);
}
@@ -142,16 +143,16 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
 * Minimum size:
 *  - 8 bytes of header
 *  - 256 index entries 4 bytes each
-*  - 20-byte sha1 entry * nr
+*  - object ID entry * nr
 *  - 4-byte crc entry * nr
 *  - 4-byte offset entry * nr
-*  - 20-byte SHA1 of the packfile
-*  - 20-byte SHA1 file checksum
+*  - hash of the packfile
+*  - file checksum
 * And after the 4-byte offset table might be a
 * variable sized table containing 8-byte entries
 * for offsets larger than 2^31.
 */
-   unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
+   unsigned long min_size = 8 + 4*256 + nr*(hashsz + 4 + 4) + 
hashsz + hashsz;
unsigned long max_size = min_size;
if (nr)
max_size += (nr - 1)*8;
@@ -444,10 +445,11 @@ static int open_packed_git_1(struct packed_git *p)
 {
struct stat st;
struct pack_header hdr;
-   unsigned char sha1[20];
-   unsigned char *idx_sha1;
+   unsigned char hash[GIT_MAX_RAWSZ];
+   unsigned char *idx_hash;
long fd_flag;
ssize_t read_result;
+   const unsigned hashsz = the_hash_algo->rawsz;
 
if (!p->index_data && open_pack_index(p))
return error("packfile %s index unavailable", p->pack_name);
@@ -507,15 +509,15 @@ static int open_packed_git_1(struct packed_git *p)
 " while index indicates %"PRIu32" objects",
 p->pack_name, ntohl(hdr.hdr_entries),
 p->num_objects);
-   if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
+   if (lseek(p->pack_fd, p->pack_size - hashsz, SEEK_SET) == -1)
return error("end of packfile %s is unavailable", p->pack_name);
-   read_result = read_in_full(p->pack_fd, sha1, sizeof(sha1));
+   read_result = read_in_full(p->pack_fd, hash, hashsz);
if (read_result < 0)
return error_errno("error reading from %s", p->pack_name);
-   if (read_result != sizeof(sha1))
+   if (read_result != hashsz)
return error("packfile %s signature is unavailable", 
p->pack_name);
-   idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
-   if (hashcmp(sha1, idx_sha1))
+   idx_hash = ((unsigned char *)p->index_data) + p->index_size - hashsz * 
2;
+   if (hashcmp(hash, idx_hash))
return error("packfile %s does not match index", p->pack_name);
return 0;
 }
@@ -530,7 +532,7 @@ static int open_packed_git(struct packed_git *p)
 
 static int in_window(struct 

[PATCH v2 20/42] dir: convert struct untracked_cache_dir to object_id

2018-05-01 Thread brian m. carlson
Convert the exclude_sha1 member of struct untracked_cache_dir and rename
it to exclude_oid.  Eliminate several hard-coded integral constants, and
update a function name that referred to SHA-1.

Signed-off-by: brian m. carlson 
---
 dir.c| 23 ---
 dir.h|  5 +++--
 t/helper/test-dump-untracked-cache.c |  2 +-
 3 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/dir.c b/dir.c
index 63a917be45..06f4c4a8bf 100644
--- a/dir.c
+++ b/dir.c
@@ -1240,11 +1240,11 @@ static void prep_exclude(struct dir_struct *dir,
(!untracked || !untracked->valid ||
 /*
  * .. and .gitignore does not exist before
- * (i.e. null exclude_sha1). Then we can skip
+ * (i.e. null exclude_oid). Then we can skip
  * loading .gitignore, which would result in
  * ENOENT anyway.
  */
-!is_null_sha1(untracked->exclude_sha1))) {
+!is_null_oid(>exclude_oid))) {
/*
 * dir->basebuf gets reused by the traversal, but we
 * need fname to remain unchanged to ensure the src
@@ -1275,9 +1275,9 @@ static void prep_exclude(struct dir_struct *dir,
 * order, though, if you do that.
 */
if (untracked &&
-   hashcmp(oid_stat.oid.hash, untracked->exclude_sha1)) {
+   oidcmp(_stat.oid, >exclude_oid)) {
invalidate_gitignore(dir->untracked, untracked);
-   hashcpy(untracked->exclude_sha1, oid_stat.oid.hash);
+   oidcpy(>exclude_oid, _stat.oid);
}
dir->exclude_stack = stk;
current = stk->baselen;
@@ -2622,9 +2622,10 @@ static void write_one_dir(struct untracked_cache_dir 
*untracked,
stat_data_to_disk(_data, >stat_data);
strbuf_add(>sb_stat, _data, sizeof(stat_data));
}
-   if (!is_null_sha1(untracked->exclude_sha1)) {
+   if (!is_null_oid(>exclude_oid)) {
ewah_set(wd->sha1_valid, i);
-   strbuf_add(>sb_sha1, untracked->exclude_sha1, 20);
+   strbuf_add(>sb_sha1, untracked->exclude_oid.hash,
+  the_hash_algo->rawsz);
}
 
intlen = encode_varint(untracked->untracked_nr, intbuf);
@@ -2825,16 +2826,16 @@ static void read_stat(size_t pos, void *cb)
ud->valid = 1;
 }
 
-static void read_sha1(size_t pos, void *cb)
+static void read_oid(size_t pos, void *cb)
 {
struct read_data *rd = cb;
struct untracked_cache_dir *ud = rd->ucd[pos];
-   if (rd->data + 20 > rd->end) {
+   if (rd->data + the_hash_algo->rawsz > rd->end) {
rd->data = rd->end + 1;
return;
}
-   hashcpy(ud->exclude_sha1, rd->data);
-   rd->data += 20;
+   hashcpy(ud->exclude_oid.hash, rd->data);
+   rd->data += the_hash_algo->rawsz;
 }
 
 static void load_oid_stat(struct oid_stat *oid_stat, const unsigned char *data,
@@ -2917,7 +2918,7 @@ struct untracked_cache *read_untracked_extension(const 
void *data, unsigned long
ewah_each_bit(rd.check_only, set_check_only, );
rd.data = next + len;
ewah_each_bit(rd.valid, read_stat, );
-   ewah_each_bit(rd.sha1_valid, read_sha1, );
+   ewah_each_bit(rd.sha1_valid, read_oid, );
next = rd.data;
 
 done:
diff --git a/dir.h b/dir.h
index b0758b82a2..de66be9f4e 100644
--- a/dir.h
+++ b/dir.h
@@ -3,6 +3,7 @@
 
 /* See Documentation/technical/api-directory-listing.txt */
 
+#include "cache.h"
 #include "strbuf.h"
 
 struct dir_entry {
@@ -118,8 +119,8 @@ struct untracked_cache_dir {
/* all data except 'dirs' in this struct are good */
unsigned int valid : 1;
unsigned int recurse : 1;
-   /* null SHA-1 means this directory does not have .gitignore */
-   unsigned char exclude_sha1[20];
+   /* null object ID means this directory does not have .gitignore */
+   struct object_id exclude_oid;
char name[FLEX_ARRAY];
 };
 
diff --git a/t/helper/test-dump-untracked-cache.c 
b/t/helper/test-dump-untracked-cache.c
index d7c55c2355..bd92fb305a 100644
--- a/t/helper/test-dump-untracked-cache.c
+++ b/t/helper/test-dump-untracked-cache.c
@@ -23,7 +23,7 @@ static void dump(struct untracked_cache_dir *ucd, struct 
strbuf *base)
len = base->len;
strbuf_addf(base, "%s/", ucd->name);
printf("%s %s", base->buf,
-  sha1_to_hex(ucd->exclude_sha1));
+  oid_to_hex(>exclude_oid));
if (ucd->recurse)
fputs(" recurse", stdout);
if (ucd->check_only)


[PATCH v2 19/42] commit: convert uses of get_sha1_hex to get_oid_hex

2018-05-01 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 commit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index ca474a7c11..9617f85caa 100644
--- a/commit.c
+++ b/commit.c
@@ -331,7 +331,7 @@ int parse_commit_buffer(struct commit *item, const void 
*buffer, unsigned long s
if (tail <= bufptr + tree_entry_len + 1 || memcmp(bufptr, "tree ", 5) ||
bufptr[tree_entry_len] != '\n')
return error("bogus commit object %s", 
oid_to_hex(>object.oid));
-   if (get_sha1_hex(bufptr + 5, parent.hash) < 0)
+   if (get_oid_hex(bufptr + 5, ) < 0)
return error("bad tree pointer in commit %s",
 oid_to_hex(>object.oid));
item->tree = lookup_tree();
@@ -343,7 +343,7 @@ int parse_commit_buffer(struct commit *item, const void 
*buffer, unsigned long s
struct commit *new_parent;
 
if (tail <= bufptr + parent_entry_len + 1 ||
-   get_sha1_hex(bufptr + 7, parent.hash) ||
+   get_oid_hex(bufptr + 7, ) ||
bufptr[parent_entry_len] != '\n')
return error("bad parents in commit %s", 
oid_to_hex(>object.oid));
bufptr += parent_entry_len + 1;


[PATCH v2 03/42] Remove unused member in struct object_context

2018-05-01 Thread brian m. carlson
The tree member of struct object_context is unused except in one place
where we write to it.  Since there are no users of this member, remove
it.

Signed-off-by: brian m. carlson 
---
 cache.h | 1 -
 sha1-name.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/cache.h b/cache.h
index af6693f352..be9fab43ec 100644
--- a/cache.h
+++ b/cache.h
@@ -1306,7 +1306,6 @@ static inline int hex2chr(const char *s)
 #define FALLBACK_DEFAULT_ABBREV 7
 
 struct object_context {
-   unsigned char tree[20];
unsigned mode;
/*
 * symlink_path is only used by get_tree_entry_follow_symlinks,
diff --git a/sha1-name.c b/sha1-name.c
index 5b93bf8da3..7043652a24 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1698,7 +1698,6 @@ static int get_oid_with_context_1(const char *name,
   name, len);
}
}
-   hashcpy(oc->tree, tree_oid.hash);
if (flags & GET_OID_RECORD_PATH)
oc->path = xstrdup(filename);
 


[PATCH v2 24/42] diff: specify abbreviation size in terms of the_hash_algo

2018-05-01 Thread brian m. carlson
Instead of using hard-coded 40 constants, refer to the_hash_algo for the
current hash size.

Signed-off-by: brian m. carlson 
---
 diff.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 314c57e3c0..b1666b9b2d 100644
--- a/diff.c
+++ b/diff.c
@@ -3897,13 +3897,14 @@ static void fill_metainfo(struct strbuf *msg,
*must_show_header = 0;
}
if (one && two && oidcmp(>oid, >oid)) {
-   int abbrev = o->flags.full_index ? 40 : DEFAULT_ABBREV;
+   const unsigned hexsz = the_hash_algo->hexsz;
+   int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
 
if (o->flags.binary) {
mmfile_t mf;
if ((!fill_mmfile(, one) && 
diff_filespec_is_binary(one)) ||
(!fill_mmfile(, two) && 
diff_filespec_is_binary(two)))
-   abbrev = 40;
+   abbrev = hexsz;
}
strbuf_addf(msg, "%s%sindex %s..%s", line_prefix, set,
diff_abbrev_oid(>oid, abbrev),
@@ -4138,6 +4139,11 @@ void diff_setup_done(struct diff_options *options)
  DIFF_FORMAT_NAME_STATUS |
  DIFF_FORMAT_CHECKDIFF |
  DIFF_FORMAT_NO_OUTPUT;
+   /*
+* This must be signed because we're comparing against a potentially
+* negative value.
+*/
+   const int hexsz = the_hash_algo->hexsz;
 
if (options->set_default)
options->set_default(options);
@@ -4218,8 +4224,8 @@ void diff_setup_done(struct diff_options *options)
 */
read_cache();
}
-   if (40 < options->abbrev)
-   options->abbrev = 40; /* full */
+   if (hexsz < options->abbrev)
+   options->abbrev = hexsz; /* full */
 
/*
 * It does not make sense to show the first hit we happened
@@ -4797,8 +4803,8 @@ int diff_opt_parse(struct diff_options *options,
options->abbrev = strtoul(arg, NULL, 10);
if (options->abbrev < MINIMUM_ABBREV)
options->abbrev = MINIMUM_ABBREV;
-   else if (40 < options->abbrev)
-   options->abbrev = 40;
+   else if (the_hash_algo->hexsz < options->abbrev)
+   options->abbrev = the_hash_algo->hexsz;
}
else if ((argcount = parse_long_opt("src-prefix", av, ))) {
options->a_prefix = optarg;


[PATCH v2 21/42] http: eliminate hard-coded constants

2018-05-01 Thread brian m. carlson
Use the_hash_algo to find the right size for parsing pack names.

Signed-off-by: brian m. carlson 
---
 http.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/http.c b/http.c
index 3034d10b68..312a5e1833 100644
--- a/http.c
+++ b/http.c
@@ -2047,7 +2047,8 @@ int http_get_info_packs(const char *base_url, struct 
packed_git **packs_head)
int ret = 0, i = 0;
char *url, *data;
struct strbuf buf = STRBUF_INIT;
-   unsigned char sha1[20];
+   unsigned char hash[GIT_MAX_RAWSZ];
+   const unsigned hexsz = the_hash_algo->hexsz;
 
end_url_with_slash(, base_url);
strbuf_addstr(, "objects/info/packs");
@@ -2063,13 +2064,13 @@ int http_get_info_packs(const char *base_url, struct 
packed_git **packs_head)
switch (data[i]) {
case 'P':
i++;
-   if (i + 52 <= buf.len &&
+   if (i + hexsz + 12 <= buf.len &&
starts_with(data + i, " pack-") &&
-   starts_with(data + i + 46, ".pack\n")) {
-   get_sha1_hex(data + i + 6, sha1);
-   fetch_and_setup_pack_index(packs_head, sha1,
+   starts_with(data + i + hexsz + 6, ".pack\n")) {
+   get_sha1_hex(data + i + 6, hash);
+   fetch_and_setup_pack_index(packs_head, hash,
  base_url);
-   i += 51;
+   i += hexsz + 11;
break;
}
default:


[PATCH v2 23/42] upload-pack: replace use of several hard-coded constants

2018-05-01 Thread brian m. carlson
Update several uses of hard-coded 40-based constants to use either
the_hash_algo or GIT_MAX_HEXSZ, as appropriate.  Replace a combined use
of oid_to_hex and memcpy with oid_to_hex_r, which not only avoids the
need for a constant, but is more efficient.  Make use of parse_oid_hex
to eliminate the need for constants and simplify the code at the same
time.  Update some comments to no longer refer to SHA-1 as well.

Signed-off-by: brian m. carlson 
---
 upload-pack.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 6261d4fab3..e9adbe2f87 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -450,7 +450,7 @@ static int get_common_commits(void)
break;
default:
got_common = 1;
-   memcpy(last_hex, oid_to_hex(), 41);
+   oid_to_hex_r(last_hex, );
if (multi_ack == 2)
packet_write_fmt(1, "ACK %s common\n", 
last_hex);
else if (multi_ack)
@@ -492,7 +492,7 @@ static int do_reachable_revlist(struct child_process *cmd,
"rev-list", "--stdin", NULL,
};
struct object *o;
-   char namebuf[42]; /* ^ + SHA-1 + LF */
+   char namebuf[GIT_MAX_HEXSZ + 2]; /* ^ + hash + LF */
int i;
 
cmd->argv = argv;
@@ -561,15 +561,17 @@ static int get_reachable_list(struct object_array *src,
struct child_process cmd = CHILD_PROCESS_INIT;
int i;
struct object *o;
-   char namebuf[42]; /* ^ + SHA-1 + LF */
+   char namebuf[GIT_MAX_HEXSZ + 2]; /* ^ + hash + LF */
+   const unsigned hexsz = the_hash_algo->hexsz;
 
if (do_reachable_revlist(, src, reachable) < 0)
return -1;
 
-   while ((i = read_in_full(cmd.out, namebuf, 41)) == 41) {
+   while ((i = read_in_full(cmd.out, namebuf, hexsz + 1)) == hexsz + 1) {
struct object_id sha1;
+   const char *p;
 
-   if (namebuf[40] != '\n' || get_oid_hex(namebuf, ))
+   if (parse_oid_hex(namebuf, , ) || *p != '\n')
break;
 
o = lookup_object(sha1.hash);
@@ -820,11 +822,9 @@ static void receive_needs(void)
continue;
}
if (!skip_prefix(line, "want ", ) ||
-   get_oid_hex(arg, _buf))
+   parse_oid_hex(arg, _buf, ))
die("git upload-pack: protocol error, "
-   "expected to get sha, not '%s'", line);
-
-   features = arg + 40;
+   "expected to get object ID, not '%s'", line);
 
if (parse_feature_request(features, "deepen-relative"))
deepen_relative = 1;


[PATCH v2 29/42] merge: convert empty tree constant to the_hash_algo

2018-05-01 Thread brian m. carlson
To avoid dependency on a particular hash algorithm, convert a use of
EMPTY_TREE_SHA1_HEX to use the_hash_algo->empty_tree instead.  Since
both branches now use oid_to_hex, condense the if statement into a
ternary.

Signed-off-by: brian m. carlson 
---
 merge.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/merge.c b/merge.c
index f06a4773d4..5186cb6156 100644
--- a/merge.c
+++ b/merge.c
@@ -11,10 +11,7 @@
 
 static const char *merge_argument(struct commit *commit)
 {
-   if (commit)
-   return oid_to_hex(>object.oid);
-   else
-   return EMPTY_TREE_SHA1_HEX;
+   return oid_to_hex(commit ? >object.oid : 
the_hash_algo->empty_tree);
 }
 
 int index_has_changes(struct strbuf *sb)


[PATCH v2 30/42] sequencer: convert one use of EMPTY_TREE_SHA1_HEX

2018-05-01 Thread brian m. carlson
Convert one use of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.

Signed-off-by: brian m. carlson 
---
 sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index 5e3a50fafc..75ed9676fe 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1481,7 +1481,7 @@ static int do_pick_commit(enum todo_command command, 
struct commit *commit,
unborn = get_oid("HEAD", );
if (unborn)
oidcpy(, the_hash_algo->empty_tree);
-   if (index_differs_from(unborn ? EMPTY_TREE_SHA1_HEX : "HEAD",
+   if (index_differs_from(unborn ? empty_tree_oid_hex() : "HEAD",
   NULL, 0))
return error_dirty_index(opts);
}


[PATCH v2 40/42] Update shell scripts to compute empty tree object ID

2018-05-01 Thread brian m. carlson
Several of our shell scripts hard-code the object ID of the empty tree.
To avoid any problems when changing hashes, compute this value on
startup of the script.  For performance, store the value in a variable
and reuse it throughout the life of the script.

Signed-off-by: brian m. carlson 
---
 git-filter-branch.sh   | 4 +++-
 git-rebase--interactive.sh | 4 +++-
 templates/hooks--pre-commit.sample | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 64f21547c1..ccceaf19a7 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -11,6 +11,8 @@
 # The following functions will also be available in the commit filter:
 
 functions=$(cat << \EOF
+EMPTY_TREE=$(git hash-object -t tree /dev/null)
+
 warn () {
echo "$*" >&2
 }
@@ -46,7 +48,7 @@ git_commit_non_empty_tree()
 {
if test $# = 3 && test "$1" = $(git rev-parse "$3^{tree}"); then
map "$3"
-   elif test $# = 1 && test "$1" = 
4b825dc642cb6eb9a060e54bf8d69288fbee4904; then
+   elif test $# = 1 && test "$1" = $EMPTY_TREE; then
:
else
git commit-tree "$@"
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 9947e6265f..e5ae58a07e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -81,6 +81,8 @@ rewritten_pending="$state_dir"/rewritten-pending
 # and leaves CR at the end instead.
 cr=$(printf "\015")
 
+empty_tree=$(git hash-object -t tree /dev/null)
+
 strategy_args=${strategy:+--strategy=$strategy}
 test -n "$strategy_opts" &&
 eval '
@@ -238,7 +240,7 @@ is_empty_commit() {
die "$(eval_gettext "\$sha1: not a commit that can be picked")"
}
ptree=$(git rev-parse -q --verify "$1"^^{tree} 2>/dev/null) ||
-   ptree=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   ptree=$empty_tree
test "$tree" = "$ptree"
 }
 
diff --git a/templates/hooks--pre-commit.sample 
b/templates/hooks--pre-commit.sample
index 68d62d5446..6a75641638 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -12,7 +12,7 @@ then
against=HEAD
 else
# Initial commit: diff against an empty tree object
-   against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+   against=$(git hash-object -t tree /dev/null)
 fi
 
 # If you want to allow non-ASCII filenames set this variable to true.


[PATCH v2 28/42] builtin/merge: switch tree functions to use object_id

2018-05-01 Thread brian m. carlson
The read_empty and reset_hard functions are static and their callers
have already changed to use struct object_id, so convert them as well.
To avoid dependency on the hash algorithm in use, switch from using
EMPTY_TREE_SHA1_HEX to using empty_tree_oid_hex.

Signed-off-by: brian m. carlson 
---
 builtin/merge.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9db5a2cf16..7084bcfdea 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -280,7 +280,7 @@ static int save_state(struct object_id *stash)
return rc;
 }
 
-static void read_empty(unsigned const char *sha1, int verbose)
+static void read_empty(const struct object_id *oid, int verbose)
 {
int i = 0;
const char *args[7];
@@ -290,15 +290,15 @@ static void read_empty(unsigned const char *sha1, int 
verbose)
args[i++] = "-v";
args[i++] = "-m";
args[i++] = "-u";
-   args[i++] = EMPTY_TREE_SHA1_HEX;
-   args[i++] = sha1_to_hex(sha1);
+   args[i++] = empty_tree_oid_hex();
+   args[i++] = oid_to_hex(oid);
args[i] = NULL;
 
if (run_command_v_opt(args, RUN_GIT_CMD))
die(_("read-tree failed"));
 }
 
-static void reset_hard(unsigned const char *sha1, int verbose)
+static void reset_hard(const struct object_id *oid, int verbose)
 {
int i = 0;
const char *args[6];
@@ -308,7 +308,7 @@ static void reset_hard(unsigned const char *sha1, int 
verbose)
args[i++] = "-v";
args[i++] = "--reset";
args[i++] = "-u";
-   args[i++] = sha1_to_hex(sha1);
+   args[i++] = oid_to_hex(oid);
args[i] = NULL;
 
if (run_command_v_opt(args, RUN_GIT_CMD))
@@ -324,7 +324,7 @@ static void restore_state(const struct object_id *head,
if (is_null_oid(stash))
return;
 
-   reset_hard(head->hash, 1);
+   reset_hard(head, 1);
 
args[2] = oid_to_hex(stash);
 
@@ -1297,7 +1297,7 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
if (remoteheads->next)
die(_("Can merge only exactly one commit into empty 
head"));
remote_head_oid = >item->object.oid;
-   read_empty(remote_head_oid->hash, 0);
+   read_empty(remote_head_oid, 0);
update_ref("initial pull", "HEAD", remote_head_oid, NULL, 0,
   UPDATE_REFS_DIE_ON_ERR);
goto done;


[PATCH v2 41/42] add--interactive: compute the empty tree value

2018-05-01 Thread brian m. carlson
The interactive add script hard-codes the object ID of the empty tree.
To avoid any problems when changing hashes, compute this value when used
and cache it for any future uses.

Signed-off-by: brian m. carlson 
---
 git-add--interactive.perl | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index c1f52e457f..36f38ced90 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -205,8 +205,15 @@ my $status_head = sprintf($status_fmt, __('staged'), 
__('unstaged'), __('path'))
}
 }
 
-sub get_empty_tree {
-   return '4b825dc642cb6eb9a060e54bf8d69288fbee4904';
+{
+   my $empty_tree;
+   sub get_empty_tree {
+   return $empty_tree if defined $empty_tree;
+
+   $empty_tree = run_cmd_pipe(qw(git hash-object -t tree 
/dev/null));
+   chomp $empty_tree;
+   return $empty_tree;
+   }
 }
 
 sub get_diff_reference {


[PATCH v2 38/42] dir: use the_hash_algo for empty blob object ID

2018-05-01 Thread brian m. carlson
To ensure that we are hash algorithm agnostic, use the_hash_algo to look
up the object ID for the empty blob instead of using the empty_tree_oid
variable.

Signed-off-by: brian m. carlson 
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 06f4c4a8bf..e879c34c2e 100644
--- a/dir.c
+++ b/dir.c
@@ -828,7 +828,7 @@ static int add_excludes(const char *fname, const char 
*base, int baselen,
if (size == 0) {
if (oid_stat) {
fill_stat_data(_stat->stat, );
-   oidcpy(_stat->oid, _blob_oid);
+   oidcpy(_stat->oid, 
the_hash_algo->empty_blob);
oid_stat->valid = 1;
}
close(fd);


[PATCH v2 22/42] revision: replace use of hard-coded constants

2018-05-01 Thread brian m. carlson
Replace two uses of the hard-coded constant 40 with references to
the_hash_algo.

Signed-off-by: brian m. carlson 
---
 revision.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index ce0e7b71f2..daf7fe6ff4 100644
--- a/revision.c
+++ b/revision.c
@@ -1751,6 +1751,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
const char *arg = argv[0];
const char *optarg;
int argcount;
+   const unsigned hexsz = the_hash_algo->hexsz;
 
/* pseudo revision arguments */
if (!strcmp(arg, "--all") || !strcmp(arg, "--branches") ||
@@ -2038,8 +2039,8 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
revs->abbrev = strtoul(optarg, NULL, 10);
if (revs->abbrev < MINIMUM_ABBREV)
revs->abbrev = MINIMUM_ABBREV;
-   else if (revs->abbrev > 40)
-   revs->abbrev = 40;
+   else if (revs->abbrev > hexsz)
+   revs->abbrev = hexsz;
} else if (!strcmp(arg, "--abbrev-commit")) {
revs->abbrev_commit = 1;
revs->abbrev_commit_given = 1;


[PATCH v2 25/42] builtin/receive-pack: avoid hard-coded constants for push certs

2018-05-01 Thread brian m. carlson
Use the GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ macros instead of hard-coding
the constants 20 and 40.  Switch one use of 20 with a format specifier
for a hex value to use the hex constant instead, as the original appears
to have been a typo.

At this point, avoid converting the hard-coded use of SHA-1 to use
the_hash_algo.  SHA-1, even if not collision resistant, is secure in the
context in which it is used here, and the hash algorithm of the repo
need not match what is used here.  When we adopt a new hash algorithm,
we can simply adopt the new algorithm wholesale here, as the nonce is
opaque and its length and validity are entirely controlled by the
server.  Consequently, defer updating this code until that point.

Signed-off-by: brian m. carlson 
---
 builtin/receive-pack.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4b68a28e92..6501d6b6cf 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -454,21 +454,21 @@ static void hmac_sha1(unsigned char *out,
/* RFC 2104 2. (6) & (7) */
git_SHA1_Init();
git_SHA1_Update(, k_opad, sizeof(k_opad));
-   git_SHA1_Update(, out, 20);
+   git_SHA1_Update(, out, GIT_SHA1_RAWSZ);
git_SHA1_Final(out, );
 }
 
 static char *prepare_push_cert_nonce(const char *path, timestamp_t stamp)
 {
struct strbuf buf = STRBUF_INIT;
-   unsigned char sha1[20];
+   unsigned char sha1[GIT_SHA1_RAWSZ];
 
strbuf_addf(, "%s:%"PRItime, path, stamp);
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release();
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%"PRItime"-%.*s", stamp, GIT_SHA1_HEXSZ, 
sha1_to_hex(sha1));
return strbuf_detach(, NULL);
 }
 


[PATCH v2 26/42] sha1-file: add functions for hex empty tree and blob OIDs

2018-05-01 Thread brian m. carlson
Oftentimes, we'll want to refer to an empty tree or empty blob by its
hex name without having to call oid_to_hex or explicitly refer to
the_hash_algo.  Add helper functions that format these values into
static buffers and return them for easy use.

Signed-off-by: brian m. carlson 
---
 cache.h |  3 +++
 sha1-file.c | 12 
 2 files changed, 15 insertions(+)

diff --git a/cache.h b/cache.h
index 37d081b8e4..7115770fa6 100644
--- a/cache.h
+++ b/cache.h
@@ -1049,6 +1049,9 @@ static inline int is_empty_tree_oid(const struct 
object_id *oid)
return !oidcmp(oid, the_hash_algo->empty_tree);
 }
 
+const char *empty_tree_oid_hex(void);
+const char *empty_blob_oid_hex(void);
+
 /* set default permissions by passing mode arguments to open(2) */
 int git_mkstemps_mode(char *pattern, int suffix_len, int mode);
 int git_mkstemp_mode(char *pattern, int mode);
diff --git a/sha1-file.c b/sha1-file.c
index 4328c61285..11598b43eb 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -100,6 +100,18 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
},
 };
 
+const char *empty_tree_oid_hex(void)
+{
+   static char buf[GIT_MAX_HEXSZ + 1];
+   return oid_to_hex_r(buf, the_hash_algo->empty_tree);
+}
+
+const char *empty_blob_oid_hex(void)
+{
+   static char buf[GIT_MAX_HEXSZ + 1];
+   return oid_to_hex_r(buf, the_hash_algo->empty_blob);
+}
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want


[PATCH v2 34/42] builtin/reset: convert use of EMPTY_TREE_SHA1_BIN

2018-05-01 Thread brian m. carlson
Convert the last use of EMPTY_TREE_SHA1_BIN to use a direct copy from
the_hash_algo->empty_tree to avoid a dependency on a given hash
algorithm.

Signed-off-by: brian m. carlson 
---
 builtin/reset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index 7f1c3f02a3..a862c70fab 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -314,7 +314,7 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
unborn = !strcmp(rev, "HEAD") && get_oid("HEAD", );
if (unborn) {
/* reset on unborn branch: treat as reset to empty tree */
-   hashcpy(oid.hash, EMPTY_TREE_SHA1_BIN);
+   oidcpy(, the_hash_algo->empty_tree);
} else if (!pathspec.nr) {
struct commit *commit;
if (get_oid_committish(rev, ))


[PATCH v2 31/42] submodule: convert several uses of EMPTY_TREE_SHA1_HEX

2018-05-01 Thread brian m. carlson
Convert several uses of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.

Signed-off-by: brian m. carlson 
---
 submodule.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index 9a50168b23..ee7eea4877 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1567,7 +1567,7 @@ static void submodule_reset_index(const char *path)
   get_super_prefix_or_empty(), path);
argv_array_pushl(, "read-tree", "-u", "--reset", NULL);
 
-   argv_array_push(, EMPTY_TREE_SHA1_HEX);
+   argv_array_push(, empty_tree_oid_hex());
 
if (run_command())
die("could not reset submodule index");
@@ -1659,9 +1659,9 @@ int submodule_move_head(const char *path,
argv_array_push(, "-m");
 
if (!(flags & SUBMODULE_MOVE_HEAD_FORCE))
-   argv_array_push(, old_head ? old_head : 
EMPTY_TREE_SHA1_HEX);
+   argv_array_push(, old_head ? old_head : 
empty_tree_oid_hex());
 
-   argv_array_push(, new_head ? new_head : EMPTY_TREE_SHA1_HEX);
+   argv_array_push(, new_head ? new_head : empty_tree_oid_hex());
 
if (run_command()) {
ret = -1;


[PATCH v2 32/42] wt-status: convert two uses of EMPTY_TREE_SHA1_HEX

2018-05-01 Thread brian m. carlson
Convert two uses of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.

Signed-off-by: brian m. carlson 
---
 wt-status.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 50815e5faf..e44115b3be 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -603,7 +603,7 @@ static void wt_status_collect_changes_index(struct 
wt_status *s)
 
init_revisions(, NULL);
memset(, 0, sizeof(opt));
-   opt.def = s->is_initial ? EMPTY_TREE_SHA1_HEX : s->reference;
+   opt.def = s->is_initial ? empty_tree_oid_hex() : s->reference;
setup_revisions(0, NULL, , );
 
rev.diffopt.flags.override_submodule_config = 1;
@@ -981,7 +981,7 @@ static void wt_longstatus_print_verbose(struct wt_status *s)
rev.diffopt.ita_invisible_in_index = 1;
 
memset(, 0, sizeof(opt));
-   opt.def = s->is_initial ? EMPTY_TREE_SHA1_HEX : s->reference;
+   opt.def = s->is_initial ? empty_tree_oid_hex() : s->reference;
setup_revisions(0, NULL, , );
 
rev.diffopt.output_format |= DIFF_FORMAT_PATCH;


[PATCH v2 36/42] cache-tree: use is_empty_tree_oid

2018-05-01 Thread brian m. carlson
When comparing an object ID against that of the empty tree, use the
is_empty_tree_oid function to ensure that we abstract over the hash
algorithm properly.  In addition, this is more readable than a plain
oidcmp.

Signed-off-by: brian m. carlson 
---
 cache-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 8c7e1258a4..25663825b5 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -385,7 +385,7 @@ static int update_one(struct cache_tree *it,
/*
 * "sub" can be an empty tree if all subentries are i-t-a.
 */
-   if (contains_ita && !oidcmp(oid, _tree_oid))
+   if (contains_ita && is_empty_tree_oid(oid))
continue;
 
strbuf_grow(, entlen + 100);


[PATCH v2 39/42] sha1_file: only expose empty object constants through git_hash_algo

2018-05-01 Thread brian m. carlson
There really isn't any case in which we want to expose the constants for
empty trees and blobs outside of using the hash algorithm abstraction.
Make these constants static and stop exposing the defines in cache.h.
Remove the constants which are no longer in use.

Signed-off-by: brian m. carlson 
---
 cache.h | 16 
 sha1-file.c | 13 +++--
 2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/cache.h b/cache.h
index 7115770fa6..ffe145c539 100644
--- a/cache.h
+++ b/cache.h
@@ -1013,22 +1013,6 @@ static inline void oidread(struct object_id *oid, const 
unsigned char *hash)
memcpy(oid->hash, hash, the_hash_algo->rawsz);
 }
 
-
-#define EMPTY_TREE_SHA1_HEX \
-   "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
-#define EMPTY_TREE_SHA1_BIN_LITERAL \
-"\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \
-"\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04"
-extern const struct object_id empty_tree_oid;
-#define EMPTY_TREE_SHA1_BIN (empty_tree_oid.hash)
-
-#define EMPTY_BLOB_SHA1_HEX \
-   "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
-#define EMPTY_BLOB_SHA1_BIN_LITERAL \
-   "\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \
-   "\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91"
-extern const struct object_id empty_blob_oid;
-
 static inline int is_empty_blob_sha1(const unsigned char *sha1)
 {
return !hashcmp(sha1, the_hash_algo->empty_blob->hash);
diff --git a/sha1-file.c b/sha1-file.c
index 4acbf8ee08..bf6c8da3ff 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -35,12 +35,21 @@
 /* The maximum size for an object header. */
 #define MAX_HEADER_LEN 32
 
+
+#define EMPTY_TREE_SHA1_BIN_LITERAL \
+"\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \
+"\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04"
+
+#define EMPTY_BLOB_SHA1_BIN_LITERAL \
+   "\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \
+   "\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91"
+
 const unsigned char null_sha1[GIT_MAX_RAWSZ];
 const struct object_id null_oid;
-const struct object_id empty_tree_oid = {
+static const struct object_id empty_tree_oid = {
EMPTY_TREE_SHA1_BIN_LITERAL
 };
-const struct object_id empty_blob_oid = {
+static const struct object_id empty_blob_oid = {
EMPTY_BLOB_SHA1_BIN_LITERAL
 };
 


[PATCH v2 37/42] sequencer: use the_hash_algo for empty tree object ID

2018-05-01 Thread brian m. carlson
To ensure that we are hash algorithm agnostic, use the_hash_algo to look
up the object ID for the empty tree instead of using the empty_tree_oid
variable.

Signed-off-by: brian m. carlson 
---
 sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index 75ed9676fe..a487411fde 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1120,7 +1120,7 @@ static int try_to_commit(struct strbuf *msg, const char 
*author,
 
if (!(flags & ALLOW_EMPTY) && !oidcmp(current_head ?
  _head->tree->object.oid :
- _tree_oid, )) {
+ the_hash_algo->empty_tree, 
)) {
res = 1; /* run 'git commit' to display error message */
goto out;
}


[PATCH v2 35/42] sha1_file: convert cached object code to struct object_id

2018-05-01 Thread brian m. carlson
Convert the code that looks up cached objects to use struct object_id.
Adjust the lookup for empty trees to use the_hash_algo.  Note that we
don't need to be concerned about the hard-coded object ID in the
empty_tree object since we never use it.

Signed-off-by: brian m. carlson 
---
 sha1-file.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/sha1-file.c b/sha1-file.c
index 11598b43eb..4acbf8ee08 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -119,7 +119,7 @@ const char *empty_blob_oid_hex(void)
  * application).
  */
 static struct cached_object {
-   unsigned char sha1[20];
+   struct object_id oid;
enum object_type type;
void *buf;
unsigned long size;
@@ -127,22 +127,22 @@ static struct cached_object {
 static int cached_object_nr, cached_object_alloc;
 
 static struct cached_object empty_tree = {
-   EMPTY_TREE_SHA1_BIN_LITERAL,
+   { EMPTY_TREE_SHA1_BIN_LITERAL },
OBJ_TREE,
"",
0
 };
 
-static struct cached_object *find_cached_object(const unsigned char *sha1)
+static struct cached_object *find_cached_object(const struct object_id *oid)
 {
int i;
struct cached_object *co = cached_objects;
 
for (i = 0; i < cached_object_nr; i++, co++) {
-   if (!hashcmp(co->sha1, sha1))
+   if (!oidcmp(>oid, oid))
return co;
}
-   if (!hashcmp(sha1, empty_tree.sha1))
+   if (!oidcmp(oid, the_hash_algo->empty_tree))
return _tree;
return NULL;
 }
@@ -1260,7 +1260,7 @@ int oid_object_info_extended(const struct object_id *oid, 
struct object_info *oi
oi = _oi;
 
if (!(flags & OBJECT_INFO_SKIP_CACHED)) {
-   struct cached_object *co = find_cached_object(real->hash);
+   struct cached_object *co = find_cached_object(real);
if (co) {
if (oi->typep)
*(oi->typep) = co->type;
@@ -1369,7 +1369,7 @@ int pretend_object_file(void *buf, unsigned long len, 
enum object_type type,
struct cached_object *co;
 
hash_object_file(buf, len, type_name(type), oid);
-   if (has_sha1_file(oid->hash) || find_cached_object(oid->hash))
+   if (has_sha1_file(oid->hash) || find_cached_object(oid))
return 0;
ALLOC_GROW(cached_objects, cached_object_nr + 1, cached_object_alloc);
co = _objects[cached_object_nr++];
@@ -1377,7 +1377,7 @@ int pretend_object_file(void *buf, unsigned long len, 
enum object_type type,
co->type = type;
co->buf = xmalloc(len);
memcpy(co->buf, buf, len);
-   hashcpy(co->sha1, oid->hash);
+   oidcpy(>oid, oid);
return 0;
 }
 


[PATCH v2 33/42] builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX

2018-05-01 Thread brian m. carlson
Convert one use of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.

Signed-off-by: brian m. carlson 
---
 builtin/receive-pack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 6501d6b6cf..b6b3bc1334 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -968,7 +968,7 @@ static const char *push_to_deploy(unsigned char *sha1,
return "Working directory has unstaged changes";
 
/* diff-index with either HEAD or an empty tree */
-   diff_index[4] = head_has_history() ? "HEAD" : EMPTY_TREE_SHA1_HEX;
+   diff_index[4] = head_has_history() ? "HEAD" : empty_tree_oid_hex();
 
child_process_init();
child.argv = diff_index;


[PATCH v2 42/42] merge-one-file: compute empty blob object ID

2018-05-01 Thread brian m. carlson
This script hard-codes the object ID of the empty blob.  To avoid any
problems when changing hashes, compute this value by calling git
hash-object.

Signed-off-by: brian m. carlson 
---
 git-merge-one-file.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 9879c59395..f6d9852d2f 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -120,7 +120,7 @@ case "${1:-.}${2:-.}${3:-.}" in
case "$1" in
'')
echo "Added $4 in both, but differently."
-   orig=$(git unpack-file e69de29bb2d1d6434b8b29ae775ad8c2e48c5391)
+   orig=$(git unpack-file $(git hash-object /dev/null))
;;
*)
echo "Auto-merging $4"


[PATCH v2 00/42] object_id part 13

2018-05-01 Thread brian m. carlson
This is the thirteenth series of patches to convert to struct object_id
and the_hash_algo.

The series adds an oidread function to read object IDs from a buffer,
removes unused structure members (which therefore don't require
conversion), converts various functions to struct object_id, and
improves usage of the_hash_algo.  It also makes empty_blob_oid and
empty_tree_oid static, exposed only through the hash algorithm
abstraction, and updates all the hard-coded instances of the empty blob
and empty tree object IDs in scripts (excepting the testsuite).

Changes from v1:
* Add missing sign-off.
* Removed unneeded braces from init_pack_info.
* Express 51 in terms of the_hash_algo->hexsz.
* Fix comments referring to SHA-1.
* Update commit messages as suggested.
* Add and use empty_tree_oid_hex and empty_blob_oid_hex.

brian m. carlson (42):
  cache: add a function to read an object ID from a buffer
  server-info: remove unused members from struct pack_info
  Remove unused member in struct object_context
  packfile: remove unused member from struct pack_entry
  packfile: convert has_sha1_pack to object_id
  sha1-file: convert freshen functions to object_id
  packfile: convert find_pack_entry to object_id
  packfile: abstract away hash constant values
  pack-objects: abstract away hash algorithm
  pack-redundant: abstract away hash algorithm
  tree-walk: avoid hard-coded 20 constant
  tree-walk: convert get_tree_entry_follow_symlinks to object_id
  fsck: convert static functions to struct object_id
  submodule-config: convert structures to object_id
  split-index: convert struct split_index to object_id
  Update struct index_state to use struct object_id
  pack-redundant: convert linked lists to use struct object_id
  index-pack: abstract away hash function constant
  commit: convert uses of get_sha1_hex to get_oid_hex
  dir: convert struct untracked_cache_dir to object_id
  http: eliminate hard-coded constants
  revision: replace use of hard-coded constants
  upload-pack: replace use of several hard-coded constants
  diff: specify abbreviation size in terms of the_hash_algo
  builtin/receive-pack: avoid hard-coded constants for push certs
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  merge: convert empty tree constant to the_hash_algo
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  sha1_file: convert cached object code to struct object_id
  cache-tree: use is_empty_tree_oid
  sequencer: use the_hash_algo for empty tree object ID
  dir: use the_hash_algo for empty blob object ID
  sha1_file: only expose empty object constants through git_hash_algo
  Update shell scripts to compute empty tree object ID
  add--interactive: compute the empty tree value
  merge-one-file: compute empty blob object ID

 builtin/am.c |  8 +--
 builtin/count-objects.c  |  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  3 +-
 builtin/merge.c  | 14 ++---
 builtin/pack-objects.c   | 32 +--
 builtin/pack-redundant.c | 62 +++--
 builtin/prune-packed.c   |  2 +-
 builtin/receive-pack.c   |  8 +--
 builtin/reset.c  |  2 +-
 builtin/rev-parse.c  |  4 +-
 cache-tree.c |  4 +-
 cache.h  | 28 --
 commit.c |  4 +-
 diff.c   | 20 ---
 dir.c| 25 -
 dir.h|  5 +-
 fsck.c   | 20 +++
 git-add--interactive.perl| 11 +++-
 git-filter-branch.sh |  4 +-
 git-merge-one-file.sh|  2 +-
 git-rebase--interactive.sh   |  4 +-
 http.c   | 13 ++---
 merge.c  |  5 +-
 packfile.c   | 79 ++-
 packfile.h   |  4 +-
 read-cache.c | 34 ++--
 resolve-undo.c   |  2 +-
 revision.c   |  7 +--
 sequencer.c  |  4 +-
 server-info.c|  9 +---
 sha1-file.c  | 81 +---
 sha1-name.c  |  5 +-
 split-index.c| 10 ++--
 split-index.h|  4 +-
 submodule-config.c   | 66 +++
 submodule-config.h   |  7 +--
 submodule.c 

[PATCH v2 12/42] tree-walk: convert get_tree_entry_follow_symlinks to object_id

2018-05-01 Thread brian m. carlson
Since the only caller of this function already uses struct object_id,
update get_tree_entry_follow_symlinks to use it in parameters and
internally.

Signed-off-by: brian m. carlson 
---
 sha1-name.c |  4 ++--
 tree-walk.c | 16 
 tree-walk.h |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 7043652a24..7c2d08a202 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1685,8 +1685,8 @@ static int get_oid_with_context_1(const char *name,
if (new_filename)
filename = new_filename;
if (flags & GET_OID_FOLLOW_SYMLINKS) {
-   ret = 
get_tree_entry_follow_symlinks(tree_oid.hash,
-   filename, oid->hash, >symlink_path,
+   ret = get_tree_entry_follow_symlinks(_oid,
+   filename, oid, >symlink_path,
>mode);
} else {
ret = get_tree_entry(_oid, filename, oid,
diff --git a/tree-walk.c b/tree-walk.c
index 27797c5406..8f5090862b 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -488,7 +488,7 @@ int traverse_trees(int n, struct tree_desc *t, struct 
traverse_info *info)
 struct dir_state {
void *tree;
unsigned long size;
-   unsigned char sha1[20];
+   struct object_id oid;
 };
 
 static int find_tree_entry(struct tree_desc *t, const char *name, struct 
object_id *result, unsigned *mode)
@@ -576,7 +576,7 @@ int get_tree_entry(const struct object_id *tree_oid, const 
char *name, struct ob
  * See the code for enum follow_symlink_result for a description of
  * the return values.
  */
-enum follow_symlinks_result get_tree_entry_follow_symlinks(unsigned char 
*tree_sha1, const char *name, unsigned char *result, struct strbuf 
*result_path, unsigned *mode)
+enum follow_symlinks_result get_tree_entry_follow_symlinks(struct object_id 
*tree_oid, const char *name, struct object_id *result, struct strbuf 
*result_path, unsigned *mode)
 {
int retval = MISSING_OBJECT;
struct dir_state *parents = NULL;
@@ -589,7 +589,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
init_tree_desc(, NULL, 0UL);
strbuf_addstr(, name);
-   hashcpy(current_tree_oid.hash, tree_sha1);
+   oidcpy(_tree_oid, tree_oid);
 
while (1) {
int find_result;
@@ -609,11 +609,11 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
ALLOC_GROW(parents, parents_nr + 1, parents_alloc);
parents[parents_nr].tree = tree;
parents[parents_nr].size = size;
-   hashcpy(parents[parents_nr].sha1, root.hash);
+   oidcpy([parents_nr].oid, );
parents_nr++;
 
if (namebuf.buf[0] == '\0') {
-   hashcpy(result, root.hash);
+   oidcpy(result, );
retval = FOUND;
goto done;
}
@@ -663,7 +663,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
/* We could end up here via a symlink to dir/.. */
if (namebuf.buf[0] == '\0') {
-   hashcpy(result, parents[parents_nr - 1].sha1);
+   oidcpy(result, [parents_nr - 1].oid);
retval = FOUND;
goto done;
}
@@ -677,7 +677,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
if (S_ISDIR(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_oid.hash);
+   oidcpy(result, _tree_oid);
retval = FOUND;
goto done;
}
@@ -687,7 +687,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
  1 + first_slash - namebuf.buf);
} else if (S_ISREG(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_oid.hash);
+   oidcpy(result, _tree_oid);
retval = FOUND;
} else {
retval = NOT_DIR;
diff --git a/tree-walk.h b/tree-walk.h
index 4617deeb0e..805f58f00f 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -64,7 +64,7 @@ enum follow_symlinks_result {
   */
 };
 
-enum follow_symlinks_result get_tree_entry_follow_symlinks(unsigned char 
*tree_sha1, const char *name, unsigned char 

[PATCH v2 15/42] split-index: convert struct split_index to object_id

2018-05-01 Thread brian m. carlson
Convert the base_sha1 member of struct split_index to use struct
object_id and rename it base_oid.  Include cache.h to make the structure
visible.

Signed-off-by: brian m. carlson 
---
 builtin/rev-parse.c  |  4 ++--
 read-cache.c | 22 +++---
 split-index.c| 10 +-
 split-index.h|  4 +++-
 t/helper/test-dump-split-index.c |  2 +-
 5 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 36b2087782..55c0b90441 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -887,8 +887,8 @@ int cmd_rev_parse(int argc, const char **argv, const char 
*prefix)
if (read_cache() < 0)
die(_("Could not read the index"));
if (the_index.split_index) {
-   const unsigned char *sha1 = 
the_index.split_index->base_sha1;
-   const char *path = 
git_path("sharedindex.%s", sha1_to_hex(sha1));
+   const struct object_id *oid = 
_index.split_index->base_oid;
+   const char *path = 
git_path("sharedindex.%s", oid_to_hex(oid));
strbuf_reset();
puts(relative_path(path, prefix, ));
}
diff --git a/read-cache.c b/read-cache.c
index 10f1c6bb8a..f47666b975 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1878,7 +1878,7 @@ int read_index_from(struct index_state *istate, const 
char *path,
uint64_t start = getnanotime();
struct split_index *split_index;
int ret;
-   char *base_sha1_hex;
+   char *base_oid_hex;
char *base_path;
 
/* istate->initialized covers both .git/index and .git/sharedindex.xxx 
*/
@@ -1889,7 +1889,7 @@ int read_index_from(struct index_state *istate, const 
char *path,
trace_performance_since(start, "read cache %s", path);
 
split_index = istate->split_index;
-   if (!split_index || is_null_sha1(split_index->base_sha1)) {
+   if (!split_index || is_null_oid(_index->base_oid)) {
post_read_index_from(istate);
return ret;
}
@@ -1899,12 +1899,12 @@ int read_index_from(struct index_state *istate, const 
char *path,
else
split_index->base = xcalloc(1, sizeof(*split_index->base));
 
-   base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-   base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
+   base_oid_hex = oid_to_hex(_index->base_oid);
+   base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
ret = do_read_index(split_index->base, base_path, 1);
-   if (hashcmp(split_index->base_sha1, split_index->base->sha1))
+   if (hashcmp(split_index->base_oid.hash, split_index->base->sha1))
die("broken index, expect %s in %s, got %s",
-   base_sha1_hex, base_path,
+   base_oid_hex, base_path,
sha1_to_hex(split_index->base->sha1));
 
freshen_shared_index(base_path, 0);
@@ -2499,7 +2499,7 @@ static int write_shared_index(struct index_state *istate,
ret = rename_tempfile(temp,
  git_path("sharedindex.%s", 
sha1_to_hex(si->base->sha1)));
if (!ret) {
-   hashcpy(si->base_sha1, si->base->sha1);
+   hashcpy(si->base_oid.hash, si->base->sha1);
clean_shared_index_files(sha1_to_hex(si->base->sha1));
}
 
@@ -2554,13 +2554,13 @@ int write_locked_index(struct index_state *istate, 
struct lock_file *lock,
if (!si || alternate_index_output ||
(istate->cache_changed & ~EXTMASK)) {
if (si)
-   hashclr(si->base_sha1);
+   oidclr(>base_oid);
ret = do_write_locked_index(istate, lock, flags);
goto out;
}
 
if (getenv("GIT_TEST_SPLIT_INDEX")) {
-   int v = si->base_sha1[0];
+   int v = si->base_oid.hash[0];
if ((v & 15) < 6)
istate->cache_changed |= SPLIT_INDEX_ORDERED;
}
@@ -2575,7 +2575,7 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
 
temp = mks_tempfile(git_path("sharedindex_XX"));
if (!temp) {
-   hashclr(si->base_sha1);
+   oidclr(>base_oid);
ret = do_write_locked_index(istate, lock, flags);
goto out;
}
@@ -2595,7 +2595,7 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
/* Freshen the shared index only if the split-index was written */
if (!ret && 

[PATCH v2 13/42] fsck: convert static functions to struct object_id

2018-05-01 Thread brian m. carlson
Convert two static functions to use struct object_id and parse_oid_hex,
instead of relying on harcoded 20 and 40-based constants.

Signed-off-by: brian m. carlson 
---
 fsck.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fsck.c b/fsck.c
index 9218c2a643..768011f812 100644
--- a/fsck.c
+++ b/fsck.c
@@ -711,30 +711,31 @@ static int fsck_ident(const char **ident, struct object 
*obj, struct fsck_option
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
unsigned long size, struct fsck_options *options)
 {
-   unsigned char tree_sha1[20], sha1[20];
+   struct object_id tree_oid, oid;
struct commit_graft *graft;
unsigned parent_count, parent_line_count = 0, author_count;
int err;
const char *buffer_begin = buffer;
+   const char *p;
 
if (verify_headers(buffer, size, >object, options))
return -1;
 
if (!skip_prefix(buffer, "tree ", ))
return report(options, >object, FSCK_MSG_MISSING_TREE, 
"invalid format - expected 'tree' line");
-   if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+   if (parse_oid_hex(buffer, _oid, ) || *p != '\n') {
err = report(options, >object, FSCK_MSG_BAD_TREE_SHA1, 
"invalid 'tree' line format - bad sha1");
if (err)
return err;
}
-   buffer += 41;
+   buffer = p + 1;
while (skip_prefix(buffer, "parent ", )) {
-   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+   if (parse_oid_hex(buffer, , ) || *p != '\n') {
err = report(options, >object, 
FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
if (err)
return err;
}
-   buffer += 41;
+   buffer = p + 1;
parent_line_count++;
}
graft = lookup_commit_graft(>object.oid);
@@ -773,7 +774,7 @@ static int fsck_commit_buffer(struct commit *commit, const 
char *buffer,
if (err)
return err;
if (!commit->tree) {
-   err = report(options, >object, FSCK_MSG_BAD_TREE, 
"could not load commit's tree %s", sha1_to_hex(tree_sha1));
+   err = report(options, >object, FSCK_MSG_BAD_TREE, 
"could not load commit's tree %s", oid_to_hex(_oid));
if (err)
return err;
}
@@ -799,11 +800,12 @@ static int fsck_commit(struct commit *commit, const char 
*data,
 static int fsck_tag_buffer(struct tag *tag, const char *data,
unsigned long size, struct fsck_options *options)
 {
-   unsigned char sha1[20];
+   struct object_id oid;
int ret = 0;
const char *buffer;
char *to_free = NULL, *eol;
struct strbuf sb = STRBUF_INIT;
+   const char *p;
 
if (data)
buffer = data;
@@ -834,12 +836,12 @@ static int fsck_tag_buffer(struct tag *tag, const char 
*data,
ret = report(options, >object, FSCK_MSG_MISSING_OBJECT, 
"invalid format - expected 'object' line");
goto done;
}
-   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+   if (parse_oid_hex(buffer, , ) || *p != '\n') {
ret = report(options, >object, FSCK_MSG_BAD_OBJECT_SHA1, 
"invalid 'object' line format - bad sha1");
if (ret)
goto done;
}
-   buffer += 41;
+   buffer = p + 1;
 
if (!skip_prefix(buffer, "type ", )) {
ret = report(options, >object, 
FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");


[PATCH v2 06/42] sha1-file: convert freshen functions to object_id

2018-05-01 Thread brian m. carlson
Convert the various functions for freshening objects and
has_loose_object_nonlocal to use struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/pack-objects.c |  2 +-
 cache.h|  2 +-
 sha1-file.c| 36 ++--
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4bdae5a1d8..907e112331 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1012,7 +1012,7 @@ static int want_object_in_pack(const struct object_id 
*oid,
int want;
struct list_head *pos;
 
-   if (!exclude && local && has_loose_object_nonlocal(oid->hash))
+   if (!exclude && local && has_loose_object_nonlocal(oid))
return 0;
 
/*
diff --git a/cache.h b/cache.h
index 37d3f0104b..f06737fb78 100644
--- a/cache.h
+++ b/cache.h
@@ -1275,7 +1275,7 @@ extern int has_object_file_with_flags(const struct 
object_id *oid, int flags);
  * with the specified name.  This function does not respect replace
  * references.
  */
-extern int has_loose_object_nonlocal(const unsigned char *sha1);
+extern int has_loose_object_nonlocal(const struct object_id *oid);
 
 extern void assert_oid_type(const struct object_id *oid, enum object_type 
expect);
 
diff --git a/sha1-file.c b/sha1-file.c
index 77ccaab928..1617e25495 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -709,42 +709,42 @@ int check_and_freshen_file(const char *fn, int freshen)
return 1;
 }
 
-static int check_and_freshen_local(const unsigned char *sha1, int freshen)
+static int check_and_freshen_local(const struct object_id *oid, int freshen)
 {
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(the_repository, , sha1);
+   sha1_file_name(the_repository, , oid->hash);
 
return check_and_freshen_file(buf.buf, freshen);
 }
 
-static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
+static int check_and_freshen_nonlocal(const struct object_id *oid, int freshen)
 {
struct alternate_object_database *alt;
prepare_alt_odb(the_repository);
for (alt = the_repository->objects->alt_odb_list; alt; alt = alt->next) 
{
-   const char *path = alt_sha1_path(alt, sha1);
+   const char *path = alt_sha1_path(alt, oid->hash);
if (check_and_freshen_file(path, freshen))
return 1;
}
return 0;
 }
 
-static int check_and_freshen(const unsigned char *sha1, int freshen)
+static int check_and_freshen(const struct object_id *oid, int freshen)
 {
-   return check_and_freshen_local(sha1, freshen) ||
-  check_and_freshen_nonlocal(sha1, freshen);
+   return check_and_freshen_local(oid, freshen) ||
+  check_and_freshen_nonlocal(oid, freshen);
 }
 
-int has_loose_object_nonlocal(const unsigned char *sha1)
+int has_loose_object_nonlocal(const struct object_id *oid)
 {
-   return check_and_freshen_nonlocal(sha1, 0);
+   return check_and_freshen_nonlocal(oid, 0);
 }
 
-static int has_loose_object(const unsigned char *sha1)
+static int has_loose_object(const struct object_id *oid)
 {
-   return check_and_freshen(sha1, 0);
+   return check_and_freshen(oid, 0);
 }
 
 static void mmap_limit_check(size_t length)
@@ -1661,15 +1661,15 @@ static int write_loose_object(const struct object_id 
*oid, char *hdr,
return finalize_object_file(tmp_file.buf, filename.buf);
 }
 
-static int freshen_loose_object(const unsigned char *sha1)
+static int freshen_loose_object(const struct object_id *oid)
 {
-   return check_and_freshen(sha1, 1);
+   return check_and_freshen(oid, 1);
 }
 
-static int freshen_packed_object(const unsigned char *sha1)
+static int freshen_packed_object(const struct object_id *oid)
 {
struct pack_entry e;
-   if (!find_pack_entry(the_repository, sha1, ))
+   if (!find_pack_entry(the_repository, oid->hash, ))
return 0;
if (e.p->freshened)
return 1;
@@ -1689,7 +1689,7 @@ int write_object_file(const void *buf, unsigned long len, 
const char *type,
 * it out into .git/objects/??/?{38} file.
 */
write_object_file_prepare(buf, len, type, oid, hdr, );
-   if (freshen_packed_object(oid->hash) || freshen_loose_object(oid->hash))
+   if (freshen_packed_object(oid) || freshen_loose_object(oid))
return 0;
return write_loose_object(oid, hdr, hdrlen, buf, len, 0);
 }
@@ -1708,7 +1708,7 @@ int hash_object_file_literally(const void *buf, unsigned 
long len,
 
if (!(flags & HASH_WRITE_OBJECT))
goto cleanup;
-   if (freshen_packed_object(oid->hash) || freshen_loose_object(oid->hash))
+   if (freshen_packed_object(oid) || freshen_loose_object(oid))
goto cleanup;
status = write_loose_object(oid, header, hdrlen, buf, len, 

[PATCH v2 09/42] pack-objects: abstract away hash algorithm

2018-05-01 Thread brian m. carlson
Instead of using hard-coded instances of the constant 20, use
the_hash_algo to look up the correct constant.

Signed-off-by: brian m. carlson 
---
 builtin/pack-objects.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 907e112331..f014523613 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -264,6 +264,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
enum object_type type;
void *buf;
struct git_istream *st = NULL;
+   const unsigned hashsz = the_hash_algo->rawsz;
 
if (!usable_delta) {
if (entry->type == OBJ_BLOB &&
@@ -320,7 +321,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
dheader[pos] = ofs & 127;
while (ofs >>= 7)
dheader[--pos] = 128 | (--ofs & 127);
-   if (limit && hdrlen + sizeof(dheader) - pos + datalen + 20 >= 
limit) {
+   if (limit && hdrlen + sizeof(dheader) - pos + datalen + hashsz 
>= limit) {
if (st)
close_istream(st);
free(buf);
@@ -332,19 +333,19 @@ static unsigned long write_no_reuse_object(struct 
hashfile *f, struct object_ent
} else if (type == OBJ_REF_DELTA) {
/*
 * Deltas with a base reference contain
-* an additional 20 bytes for the base sha1.
+* additional bytes for the base object ID.
 */
-   if (limit && hdrlen + 20 + datalen + 20 >= limit) {
+   if (limit && hdrlen + hashsz + datalen + hashsz >= limit) {
if (st)
close_istream(st);
free(buf);
return 0;
}
hashwrite(f, header, hdrlen);
-   hashwrite(f, entry->delta->idx.oid.hash, 20);
-   hdrlen += 20;
+   hashwrite(f, entry->delta->idx.oid.hash, hashsz);
+   hdrlen += hashsz;
} else {
-   if (limit && hdrlen + datalen + 20 >= limit) {
+   if (limit && hdrlen + datalen + hashsz >= limit) {
if (st)
close_istream(st);
free(buf);
@@ -376,6 +377,7 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
unsigned char header[MAX_PACK_OBJECT_HEADER],
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
+   const unsigned hashsz = the_hash_algo->rawsz;
 
if (entry->delta)
type = (allow_ofs_delta && entry->delta->idx.offset) ?
@@ -411,7 +413,7 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
dheader[pos] = ofs & 127;
while (ofs >>= 7)
dheader[--pos] = 128 | (--ofs & 127);
-   if (limit && hdrlen + sizeof(dheader) - pos + datalen + 20 >= 
limit) {
+   if (limit && hdrlen + sizeof(dheader) - pos + datalen + hashsz 
>= limit) {
unuse_pack(_curs);
return 0;
}
@@ -420,16 +422,16 @@ static off_t write_reuse_object(struct hashfile *f, 
struct object_entry *entry,
hdrlen += sizeof(dheader) - pos;
reused_delta++;
} else if (type == OBJ_REF_DELTA) {
-   if (limit && hdrlen + 20 + datalen + 20 >= limit) {
+   if (limit && hdrlen + hashsz + datalen + hashsz >= limit) {
unuse_pack(_curs);
return 0;
}
hashwrite(f, header, hdrlen);
-   hashwrite(f, entry->delta->idx.oid.hash, 20);
-   hdrlen += 20;
+   hashwrite(f, entry->delta->idx.oid.hash, hashsz);
+   hdrlen += hashsz;
reused_delta++;
} else {
-   if (limit && hdrlen + datalen + 20 >= limit) {
+   if (limit && hdrlen + datalen + hashsz >= limit) {
unuse_pack(_curs);
return 0;
}
@@ -752,7 +754,7 @@ static off_t write_reused_pack(struct hashfile *f)
die_errno("unable to seek in reused packfile");
 
if (reuse_packfile_offset < 0)
-   reuse_packfile_offset = reuse_packfile->pack_size - 20;
+   reuse_packfile_offset = reuse_packfile->pack_size - 
the_hash_algo->rawsz;
 
total = to_write = reuse_packfile_offset - sizeof(struct pack_header);
 
@@ -1438,7 +1440,7 @@ static void check_object(struct object_entry *entry)
if (reuse_delta && !entry->preferred_base)
base_ref = use_pack(p, _curs,
  

[PATCH v2 27/42] builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo

2018-05-01 Thread brian m. carlson
Convert several uses of EMPTY_TREE_SHA1_BIN to use the_hash_algo
and struct object_id instead.

Signed-off-by: brian m. carlson 
---
 builtin/am.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index d834f9e62b..206af06a18 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1542,7 +1542,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
char *their_tree_name;
 
if (get_oid("HEAD", _tree) < 0)
-   hashcpy(our_tree.hash, EMPTY_TREE_SHA1_BIN);
+   oidcpy(_tree, the_hash_algo->empty_tree);
 
if (build_fake_ancestor(state, index_path))
return error("could not build fake ancestor");
@@ -2042,7 +2042,7 @@ static void am_skip(struct am_state *state)
am_rerere_clear();
 
if (get_oid("HEAD", ))
-   hashcpy(head.hash, EMPTY_TREE_SHA1_BIN);
+   oidcpy(, the_hash_algo->empty_tree);
 
if (clean_index(, ))
die(_("failed to clean index"));
@@ -2105,11 +2105,11 @@ static void am_abort(struct am_state *state)
curr_branch = resolve_refdup("HEAD", 0, _head, NULL);
has_curr_head = curr_branch && !is_null_oid(_head);
if (!has_curr_head)
-   hashcpy(curr_head.hash, EMPTY_TREE_SHA1_BIN);
+   oidcpy(_head, the_hash_algo->empty_tree);
 
has_orig_head = !get_oid("ORIG_HEAD", _head);
if (!has_orig_head)
-   hashcpy(orig_head.hash, EMPTY_TREE_SHA1_BIN);
+   oidcpy(_head, the_hash_algo->empty_tree);
 
clean_index(_head, _head);
 


[PATCH v2 14/42] submodule-config: convert structures to object_id

2018-05-01 Thread brian m. carlson
Convert struct submodule and struct parse_config_parameter to use struct
object_id.  Adjust the functions which take members of these structures
as arguments to also use struct object_id.  Include cache.h into
submodule-config.h to make struct object_id visible.

Signed-off-by: brian m. carlson 
---
 submodule-config.c | 66 +++---
 submodule-config.h |  7 ++---
 2 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 3f2075764f..5537c88727 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -44,7 +44,7 @@ static int config_path_cmp(const void *unused_cmp_data,
const struct submodule_entry *b = entry_or_key;
 
return strcmp(a->config->path, b->config->path) ||
-  hashcmp(a->config->gitmodules_sha1, b->config->gitmodules_sha1);
+  oidcmp(>config->gitmodules_oid, >config->gitmodules_oid);
 }
 
 static int config_name_cmp(const void *unused_cmp_data,
@@ -56,7 +56,7 @@ static int config_name_cmp(const void *unused_cmp_data,
const struct submodule_entry *b = entry_or_key;
 
return strcmp(a->config->name, b->config->name) ||
-  hashcmp(a->config->gitmodules_sha1, b->config->gitmodules_sha1);
+  oidcmp(>config->gitmodules_oid, >config->gitmodules_oid);
 }
 
 static struct submodule_cache *submodule_cache_alloc(void)
@@ -109,17 +109,17 @@ void submodule_cache_free(struct submodule_cache *cache)
free(cache);
 }
 
-static unsigned int hash_sha1_string(const unsigned char *sha1,
-const char *string)
+static unsigned int hash_oid_string(const struct object_id *oid,
+   const char *string)
 {
-   return memhash(sha1, 20) + strhash(string);
+   return memhash(oid->hash, the_hash_algo->rawsz) + strhash(string);
 }
 
 static void cache_put_path(struct submodule_cache *cache,
   struct submodule *submodule)
 {
-   unsigned int hash = hash_sha1_string(submodule->gitmodules_sha1,
-submodule->path);
+   unsigned int hash = hash_oid_string(>gitmodules_oid,
+   submodule->path);
struct submodule_entry *e = xmalloc(sizeof(*e));
hashmap_entry_init(e, hash);
e->config = submodule;
@@ -129,8 +129,8 @@ static void cache_put_path(struct submodule_cache *cache,
 static void cache_remove_path(struct submodule_cache *cache,
  struct submodule *submodule)
 {
-   unsigned int hash = hash_sha1_string(submodule->gitmodules_sha1,
-submodule->path);
+   unsigned int hash = hash_oid_string(>gitmodules_oid,
+   submodule->path);
struct submodule_entry e;
struct submodule_entry *removed;
hashmap_entry_init(, hash);
@@ -142,8 +142,8 @@ static void cache_remove_path(struct submodule_cache *cache,
 static void cache_add(struct submodule_cache *cache,
  struct submodule *submodule)
 {
-   unsigned int hash = hash_sha1_string(submodule->gitmodules_sha1,
-submodule->name);
+   unsigned int hash = hash_oid_string(>gitmodules_oid,
+   submodule->name);
struct submodule_entry *e = xmalloc(sizeof(*e));
hashmap_entry_init(e, hash);
e->config = submodule;
@@ -151,14 +151,14 @@ static void cache_add(struct submodule_cache *cache,
 }
 
 static const struct submodule *cache_lookup_path(struct submodule_cache *cache,
-   const unsigned char *gitmodules_sha1, const char *path)
+   const struct object_id *gitmodules_oid, const char *path)
 {
struct submodule_entry *entry;
-   unsigned int hash = hash_sha1_string(gitmodules_sha1, path);
+   unsigned int hash = hash_oid_string(gitmodules_oid, path);
struct submodule_entry key;
struct submodule key_config;
 
-   hashcpy(key_config.gitmodules_sha1, gitmodules_sha1);
+   oidcpy(_config.gitmodules_oid, gitmodules_oid);
key_config.path = path;
 
hashmap_entry_init(, hash);
@@ -171,14 +171,14 @@ static const struct submodule *cache_lookup_path(struct 
submodule_cache *cache,
 }
 
 static struct submodule *cache_lookup_name(struct submodule_cache *cache,
-   const unsigned char *gitmodules_sha1, const char *name)
+   const struct object_id *gitmodules_oid, const char *name)
 {
struct submodule_entry *entry;
-   unsigned int hash = hash_sha1_string(gitmodules_sha1, name);
+   unsigned int hash = hash_oid_string(gitmodules_oid, name);
struct submodule_entry key;
struct submodule key_config;
 
-   hashcpy(key_config.gitmodules_sha1, gitmodules_sha1);
+   

[PATCH v2 18/42] index-pack: abstract away hash function constant

2018-05-01 Thread brian m. carlson
The code for reading certain pack v2 offsets had a hard-coded 5
representing the number of uint32_t words that we needed to skip over.
Specify this value in terms of a value from the_hash_algo.

Signed-off-by: brian m. carlson 
---
 builtin/index-pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 78e66b9986..9d6d19525a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1543,12 +1543,13 @@ static void read_v2_anomalous_offsets(struct packed_git 
*p,
 {
const uint32_t *idx1, *idx2;
uint32_t i;
+   const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t);
 
/* The address of the 4-byte offset table */
idx1 = (((const uint32_t *)p->index_data)
+ 2 /* 8-byte header */
+ 256 /* fan out */
-   + 5 * p->num_objects /* 20-byte SHA-1 table */
+   + hashwords * p->num_objects /* object ID table */
+ p->num_objects /* CRC32 table */
);
 


[PATCH v2 05/42] packfile: convert has_sha1_pack to object_id

2018-05-01 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it has_object_pack for consistency with has_object_file.

Signed-off-by: brian m. carlson 
---
 builtin/count-objects.c | 2 +-
 builtin/fsck.c  | 2 +-
 builtin/prune-packed.c  | 2 +-
 diff.c  | 2 +-
 packfile.c  | 4 ++--
 packfile.h  | 2 +-
 revision.c  | 2 +-
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index b054713e1a..d51e2ce1ec 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -66,7 +66,7 @@ static int count_loose(const struct object_id *oid, const 
char *path, void *data
else {
loose_size += on_disk_bytes(st);
loose++;
-   if (verbose && has_sha1_pack(oid->hash))
+   if (verbose && has_object_pack(oid))
packed_loose++;
}
return 0;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 087360a675..3eb82ac44f 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -227,7 +227,7 @@ static void check_reachable_object(struct object *obj)
if (!(obj->flags & HAS_OBJ)) {
if (is_promisor_object(>oid))
return;
-   if (has_sha1_pack(obj->oid.hash))
+   if (has_object_pack(>oid))
return; /* it is in pack - forget about it */
printf("missing %s %s\n", printable_type(obj),
describe_object(obj));
diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index 419238171d..4ff525e50f 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -25,7 +25,7 @@ static int prune_object(const struct object_id *oid, const 
char *path,
 {
int *opts = data;
 
-   if (!has_sha1_pack(oid->hash))
+   if (!has_object_pack(oid))
return 0;
 
if (*opts & PRUNE_PACKED_DRY_RUN)
diff --git a/diff.c b/diff.c
index 1289df4b1f..314c57e3c0 100644
--- a/diff.c
+++ b/diff.c
@@ -3472,7 +3472,7 @@ static int reuse_worktree_file(const char *name, const 
struct object_id *oid, in
 * objects however would tend to be slower as they need
 * to be individually opened and inflated.
 */
-   if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(oid->hash))
+   if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
return 0;
 
/*
diff --git a/packfile.c b/packfile.c
index 5c219d0229..e65f943664 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1854,10 +1854,10 @@ int find_pack_entry(struct repository *r, const 
unsigned char *sha1, struct pack
return 0;
 }
 
-int has_sha1_pack(const unsigned char *sha1)
+int has_object_pack(const struct object_id *oid)
 {
struct pack_entry e;
-   return find_pack_entry(the_repository, sha1, );
+   return find_pack_entry(the_repository, oid->hash, );
 }
 
 int has_pack_index(const unsigned char *sha1)
diff --git a/packfile.h b/packfile.h
index a92c0b241c..14ca34bcbd 100644
--- a/packfile.h
+++ b/packfile.h
@@ -136,7 +136,7 @@ extern const struct packed_git *has_packed_and_bad(const 
unsigned char *sha1);
  */
 extern int find_pack_entry(struct repository *r, const unsigned char *sha1, 
struct pack_entry *e);
 
-extern int has_sha1_pack(const unsigned char *sha1);
+extern int has_object_pack(const struct object_id *oid);
 
 extern int has_pack_index(const unsigned char *sha1);
 
diff --git a/revision.c b/revision.c
index b42c836d7a..ce0e7b71f2 100644
--- a/revision.c
+++ b/revision.c
@@ -3086,7 +3086,7 @@ enum commit_action get_commit_action(struct rev_info 
*revs, struct commit *commi
 {
if (commit->object.flags & SHOWN)
return commit_ignore;
-   if (revs->unpacked && has_sha1_pack(commit->object.oid.hash))
+   if (revs->unpacked && has_object_pack(>object.oid))
return commit_ignore;
if (commit->object.flags & UNINTERESTING)
return commit_ignore;


[PATCH v2 07/42] packfile: convert find_pack_entry to object_id

2018-05-01 Thread brian m. carlson
Convert find_pack_entry and the static function fill_pack_entry to take
pointers to struct object_id.

Signed-off-by: brian m. carlson 
---
 packfile.c  | 12 ++--
 packfile.h  |  2 +-
 sha1-file.c |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/packfile.c b/packfile.c
index e65f943664..84acd405e0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1805,7 +1805,7 @@ struct packed_git *find_sha1_pack(const unsigned char 
*sha1,
 
 }
 
-static int fill_pack_entry(const unsigned char *sha1,
+static int fill_pack_entry(const struct object_id *oid,
   struct pack_entry *e,
   struct packed_git *p)
 {
@@ -1814,11 +1814,11 @@ static int fill_pack_entry(const unsigned char *sha1,
if (p->num_bad_objects) {
unsigned i;
for (i = 0; i < p->num_bad_objects; i++)
-   if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+   if (!hashcmp(oid->hash, p->bad_object_sha1 + 20 * i))
return 0;
}
 
-   offset = find_pack_entry_one(sha1, p);
+   offset = find_pack_entry_one(oid->hash, p);
if (!offset)
return 0;
 
@@ -1836,7 +1836,7 @@ static int fill_pack_entry(const unsigned char *sha1,
return 1;
 }
 
-int find_pack_entry(struct repository *r, const unsigned char *sha1, struct 
pack_entry *e)
+int find_pack_entry(struct repository *r, const struct object_id *oid, struct 
pack_entry *e)
 {
struct list_head *pos;
 
@@ -1846,7 +1846,7 @@ int find_pack_entry(struct repository *r, const unsigned 
char *sha1, struct pack
 
list_for_each(pos, >objects->packed_git_mru) {
struct packed_git *p = list_entry(pos, struct packed_git, mru);
-   if (fill_pack_entry(sha1, e, p)) {
+   if (fill_pack_entry(oid, e, p)) {
list_move(>mru, >objects->packed_git_mru);
return 1;
}
@@ -1857,7 +1857,7 @@ int find_pack_entry(struct repository *r, const unsigned 
char *sha1, struct pack
 int has_object_pack(const struct object_id *oid)
 {
struct pack_entry e;
-   return find_pack_entry(the_repository, oid->hash, );
+   return find_pack_entry(the_repository, oid, );
 }
 
 int has_pack_index(const unsigned char *sha1)
diff --git a/packfile.h b/packfile.h
index 14ca34bcbd..782029ed07 100644
--- a/packfile.h
+++ b/packfile.h
@@ -134,7 +134,7 @@ extern const struct packed_git *has_packed_and_bad(const 
unsigned char *sha1);
  * Iff a pack file in the given repository contains the object named by sha1,
  * return true and store its location to e.
  */
-extern int find_pack_entry(struct repository *r, const unsigned char *sha1, 
struct pack_entry *e);
+extern int find_pack_entry(struct repository *r, const struct object_id *oid, 
struct pack_entry *e);
 
 extern int has_object_pack(const struct object_id *oid);
 
diff --git a/sha1-file.c b/sha1-file.c
index 1617e25495..4328c61285 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1268,7 +1268,7 @@ int oid_object_info_extended(const struct object_id *oid, 
struct object_info *oi
}
 
while (1) {
-   if (find_pack_entry(the_repository, real->hash, ))
+   if (find_pack_entry(the_repository, real, ))
break;
 
if (flags & OBJECT_INFO_IGNORE_LOOSE)
@@ -1281,7 +1281,7 @@ int oid_object_info_extended(const struct object_id *oid, 
struct object_info *oi
/* Not a loose object; someone else may have just packed it. */
if (!(flags & OBJECT_INFO_QUICK)) {
reprepare_packed_git(the_repository);
-   if (find_pack_entry(the_repository, real->hash, ))
+   if (find_pack_entry(the_repository, real, ))
break;
}
 
@@ -1669,7 +1669,7 @@ static int freshen_loose_object(const struct object_id 
*oid)
 static int freshen_packed_object(const struct object_id *oid)
 {
struct pack_entry e;
-   if (!find_pack_entry(the_repository, oid->hash, ))
+   if (!find_pack_entry(the_repository, oid, ))
return 0;
if (e.p->freshened)
return 1;


[PATCH v2 02/42] server-info: remove unused members from struct pack_info

2018-05-01 Thread brian m. carlson
The head member of struct pack_info is completely unused and the
nr_heads member is used only in one place, which is an assignment.  This
member was last usefully used in 3e15c67c90 (server-info: throw away T
computation as well, 2005-12-04).

Since this structure member is not useful, remove it.

Signed-off-by: brian m. carlson 
---
 server-info.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/server-info.c b/server-info.c
index 83460ec0d6..7ce6dcd67b 100644
--- a/server-info.c
+++ b/server-info.c
@@ -92,8 +92,6 @@ static struct pack_info {
int old_num;
int new_num;
int nr_alloc;
-   int nr_heads;
-   unsigned char (*head)[20];
 } **info;
 static int num_pack;
 static const char *objdir;
@@ -225,12 +223,9 @@ static void init_pack_info(const char *infofile, int force)
else
stale = 1;
 
-   for (i = 0; i < num_pack; i++) {
-   if (stale) {
+   for (i = 0; i < num_pack; i++)
+   if (stale)
info[i]->old_num = -1;
-   info[i]->nr_heads = 0;
-   }
-   }
 
/* renumber them */
QSORT(info, num_pack, compare_info);


[PATCH v2 17/42] pack-redundant: convert linked lists to use struct object_id

2018-05-01 Thread brian m. carlson
Convert struct llist_item and the rest of the linked list code to use
struct object_id.  Add a use of GIT_MAX_HEXSZ to avoid a dependency on a
hard-coded constant.

Signed-off-by: brian m. carlson 
---
 builtin/pack-redundant.c | 50 +---
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index 0fe1ff3cb7..0494dceff7 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -20,7 +20,7 @@ static int load_all_packs, verbose, alt_odb;
 
 struct llist_item {
struct llist_item *next;
-   const unsigned char *sha1;
+   const struct object_id *oid;
 };
 static struct llist {
struct llist_item *front;
@@ -90,14 +90,14 @@ static struct llist * llist_copy(struct llist *list)
return ret;
 
new_item = ret->front = llist_item_get();
-   new_item->sha1 = list->front->sha1;
+   new_item->oid = list->front->oid;
 
old_item = list->front->next;
while (old_item) {
prev = new_item;
new_item = llist_item_get();
prev->next = new_item;
-   new_item->sha1 = old_item->sha1;
+   new_item->oid = old_item->oid;
old_item = old_item->next;
}
new_item->next = NULL;
@@ -108,10 +108,10 @@ static struct llist * llist_copy(struct llist *list)
 
 static inline struct llist_item *llist_insert(struct llist *list,
  struct llist_item *after,
-  const unsigned char *sha1)
+ const struct object_id *oid)
 {
struct llist_item *new_item = llist_item_get();
-   new_item->sha1 = sha1;
+   new_item->oid = oid;
new_item->next = NULL;
 
if (after != NULL) {
@@ -131,21 +131,21 @@ static inline struct llist_item *llist_insert(struct 
llist *list,
 }
 
 static inline struct llist_item *llist_insert_back(struct llist *list,
-  const unsigned char *sha1)
+  const struct object_id *oid)
 {
-   return llist_insert(list, list->back, sha1);
+   return llist_insert(list, list->back, oid);
 }
 
 static inline struct llist_item *llist_insert_sorted_unique(struct llist *list,
-   const unsigned char *sha1, struct llist_item *hint)
+   const struct object_id *oid, struct llist_item *hint)
 {
struct llist_item *prev = NULL, *l;
 
l = (hint == NULL) ? list->front : hint;
while (l) {
-   int cmp = hashcmp(l->sha1, sha1);
+   int cmp = oidcmp(l->oid, oid);
if (cmp > 0) { /* we insert before this entry */
-   return llist_insert(list, prev, sha1);
+   return llist_insert(list, prev, oid);
}
if (!cmp) { /* already exists */
return l;
@@ -154,11 +154,11 @@ static inline struct llist_item 
*llist_insert_sorted_unique(struct llist *list,
l = l->next;
}
/* insert at the end */
-   return llist_insert_back(list, sha1);
+   return llist_insert_back(list, oid);
 }
 
 /* returns a pointer to an item in front of sha1 */
-static inline struct llist_item * llist_sorted_remove(struct llist *list, 
const unsigned char *sha1, struct llist_item *hint)
+static inline struct llist_item * llist_sorted_remove(struct llist *list, 
const struct object_id *oid, struct llist_item *hint)
 {
struct llist_item *prev, *l;
 
@@ -166,7 +166,7 @@ static inline struct llist_item * 
llist_sorted_remove(struct llist *list, const
l = (hint == NULL) ? list->front : hint;
prev = NULL;
while (l) {
-   int cmp = hashcmp(l->sha1, sha1);
+   int cmp = oidcmp(l->oid, oid);
if (cmp > 0) /* not in list, since sorted */
return prev;
if (!cmp) { /* found */
@@ -201,7 +201,7 @@ static void llist_sorted_difference_inplace(struct llist *A,
b = B->front;
 
while (b) {
-   hint = llist_sorted_remove(A, b->sha1, hint);
+   hint = llist_sorted_remove(A, b->oid, hint);
b = b->next;
}
 }
@@ -268,9 +268,11 @@ static void cmp_two_packs(struct pack_list *p1, struct 
pack_list *p2)
/* cmp ~ p1 - p2 */
if (cmp == 0) {
p1_hint = llist_sorted_remove(p1->unique_objects,
-   p1_base + p1_off, p1_hint);
+   (const struct object_id *)(p1_base + 
p1_off),
+   p1_hint);
p2_hint = llist_sorted_remove(p2->unique_objects,
-   p1_base 

[PATCH v2 10/42] pack-redundant: abstract away hash algorithm

2018-05-01 Thread brian m. carlson
Instead of using hard-coded instances of the constant 20, use
the_hash_algo to look up the correct constant.

Signed-off-by: brian m. carlson 
---
 builtin/pack-redundant.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index 354478a127..0fe1ff3cb7 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -252,13 +252,14 @@ static void cmp_two_packs(struct pack_list *p1, struct 
pack_list *p2)
unsigned long p1_off = 0, p2_off = 0, p1_step, p2_step;
const unsigned char *p1_base, *p2_base;
struct llist_item *p1_hint = NULL, *p2_hint = NULL;
+   const unsigned int hashsz = the_hash_algo->rawsz;
 
p1_base = p1->pack->index_data;
p2_base = p2->pack->index_data;
p1_base += 256 * 4 + ((p1->pack->index_version < 2) ? 4 : 8);
p2_base += 256 * 4 + ((p2->pack->index_version < 2) ? 4 : 8);
-   p1_step = (p1->pack->index_version < 2) ? 24 : 20;
-   p2_step = (p2->pack->index_version < 2) ? 24 : 20;
+   p1_step = hashsz + ((p1->pack->index_version < 2) ? 4 : 0);
+   p2_step = hashsz + ((p2->pack->index_version < 2) ? 4 : 0);
 
while (p1_off < p1->pack->num_objects * p1_step &&
   p2_off < p2->pack->num_objects * p2_step)
@@ -359,13 +360,14 @@ static size_t sizeof_union(struct packed_git *p1, struct 
packed_git *p2)
size_t ret = 0;
unsigned long p1_off = 0, p2_off = 0, p1_step, p2_step;
const unsigned char *p1_base, *p2_base;
+   const unsigned int hashsz = the_hash_algo->rawsz;
 
p1_base = p1->index_data;
p2_base = p2->index_data;
p1_base += 256 * 4 + ((p1->index_version < 2) ? 4 : 8);
p2_base += 256 * 4 + ((p2->index_version < 2) ? 4 : 8);
-   p1_step = (p1->index_version < 2) ? 24 : 20;
-   p2_step = (p2->index_version < 2) ? 24 : 20;
+   p1_step = hashsz + ((p1->index_version < 2) ? 4 : 0);
+   p2_step = hashsz + ((p2->index_version < 2) ? 4 : 0);
 
while (p1_off < p1->num_objects * p1_step &&
   p2_off < p2->num_objects * p2_step)
@@ -558,7 +560,7 @@ static struct pack_list * add_pack(struct packed_git *p)
 
base = p->index_data;
base += 256 * 4 + ((p->index_version < 2) ? 4 : 8);
-   step = (p->index_version < 2) ? 24 : 20;
+   step = the_hash_algo->rawsz + ((p->index_version < 2) ? 4 : 0);
while (off < p->num_objects * step) {
llist_insert_back(l.all_objects, base + off);
off += step;


[PATCH v2 11/42] tree-walk: avoid hard-coded 20 constant

2018-05-01 Thread brian m. carlson
Use the_hash_algo to look up the length of our current hash instead of
hard-coding the value 20.

Signed-off-by: brian m. carlson 
---
 tree-walk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tree-walk.c b/tree-walk.c
index e11b3063af..27797c5406 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -105,7 +105,7 @@ static void entry_extract(struct tree_desc *t, struct 
name_entry *a)
 static int update_tree_entry_internal(struct tree_desc *desc, struct strbuf 
*err)
 {
const void *buf = desc->buffer;
-   const unsigned char *end = desc->entry.oid->hash + 20;
+   const unsigned char *end = desc->entry.oid->hash + the_hash_algo->rawsz;
unsigned long size = desc->size;
unsigned long len = end - (const unsigned char *)buf;
 


[PATCH v2 01/42] cache: add a function to read an object ID from a buffer

2018-05-01 Thread brian m. carlson
In various places throughout the codebase, we need to read data into a
struct object_id from a pack or other unsigned char buffer.  Add an
inline function that does this based on the current hash algorithm in
use, and use it in several places.

Signed-off-by: brian m. carlson 
---
 cache-tree.c   | 2 +-
 cache.h| 5 +
 resolve-undo.c | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 6a555f4d43..8c7e1258a4 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -523,7 +523,7 @@ static struct cache_tree *read_one(const char **buffer, 
unsigned long *size_p)
if (0 <= it->entry_count) {
if (size < rawsz)
goto free_return;
-   memcpy(it->oid.hash, (const unsigned char*)buf, rawsz);
+   oidread(>oid, (const unsigned char *)buf);
buf += rawsz;
size -= rawsz;
}
diff --git a/cache.h b/cache.h
index 77b7acebb6..af6693f352 100644
--- a/cache.h
+++ b/cache.h
@@ -1008,6 +1008,11 @@ static inline void oidclr(struct object_id *oid)
memset(oid->hash, 0, GIT_MAX_RAWSZ);
 }
 
+static inline void oidread(struct object_id *oid, const unsigned char *hash)
+{
+   memcpy(oid->hash, hash, the_hash_algo->rawsz);
+}
+
 
 #define EMPTY_TREE_SHA1_HEX \
"4b825dc642cb6eb9a060e54bf8d69288fbee4904"
diff --git a/resolve-undo.c b/resolve-undo.c
index aed95b4b35..fc5b3b83d9 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -90,7 +90,7 @@ struct string_list *resolve_undo_read(const char *data, 
unsigned long size)
continue;
if (size < rawsz)
goto error;
-   memcpy(ui->oid[i].hash, (const unsigned char *)data, 
rawsz);
+   oidread(>oid[i], (const unsigned char *)data);
size -= rawsz;
data += rawsz;
}


Re: [PATCH] format-patch: make cover letters always text/plain

2018-05-01 Thread brian m. carlson
On Mon, Apr 30, 2018 at 09:53:33PM -0400, Eric Sunshine wrote:
> On Mon, Apr 30, 2018 at 8:02 PM, brian m. carlson
>  wrote:
> > When formatting a series of patches using --attach and --cover-letter,
> > the cover letter lacks the closing MIME boundary, violating RFC 2046.
> > Certain clients, such as Thunderbird, discard the message body in such a
> > case.
> >
> > Since the cover letter is just one part and sending it as
> > multipart/mixed is not very useful, always emit it as text/plain,
> > avoiding the boundary problem altogether.
> >
> > Reported-by: Patrick Hemmer 
> > Signed-off-by: brian m. carlson 
> > ---
> > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> > @@ -1661,6 +1661,15 @@ test_expect_success 'format-patch --base with 
> > --attach' '
> > +test_expect_success 'format-patch --attach cover-letter only is 
> > non-multipart' '
> > +   test_when_finished "rm -r patches" &&
> > +   git format-patch -o patches --cover-letter --attach=mimemime 
> > --base=HEAD~ -1 &&
> 
> Nit: "rm -rf" would be a bit more robust against git-format-patch
> somehow crashing before creating the "patches" directory.

Sure, I can reroll with that change.  I had considered doing that, but
decided against it.  I hadn't thought of resilience against a failed git
format-patch, though.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 08/41] packfile: abstract away hash constant values

2018-05-01 Thread brian m. carlson
On Tue, May 01, 2018 at 12:22:43PM +0200, Duy Nguyen wrote:
> On Mon, Apr 23, 2018 at 11:39:18PM +, brian m. carlson wrote:
> > There are several instances of the constant 20 and 20-based values in
> > the packfile code.  Abstract away dependence on SHA-1 by using the
> > values from the_hash_algo instead.
> 
> While we're abstracting away 20. There's the only 20 left in
> sha1_file.c that should also be gone. But I guess you could do that
> later since you need to rename fill_sha1_path to
> fill_loose_object_path or something.

I'm already working on knocking those out.

> > @@ -507,15 +509,15 @@ static int open_packed_git_1(struct packed_git *p)
> >  " while index indicates %"PRIu32" objects",
> >  p->pack_name, ntohl(hdr.hdr_entries),
> >  p->num_objects);
> > -   if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
> > +   if (lseek(p->pack_fd, p->pack_size - hashsz, SEEK_SET) == -1)
> > return error("end of packfile %s is unavailable", p->pack_name);
> > -   read_result = read_in_full(p->pack_fd, sha1, sizeof(sha1));
> > +   read_result = read_in_full(p->pack_fd, hash, hashsz);
> > if (read_result < 0)
> > return error_errno("error reading from %s", p->pack_name);
> > -   if (read_result != sizeof(sha1))
> > +   if (read_result != hashsz)
> > return error("packfile %s signature is unavailable", 
> > p->pack_name);
> > -   idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
> > -   if (hashcmp(sha1, idx_sha1))
> > +   idx_hash = ((unsigned char *)p->index_data) + p->index_size - hashsz * 
> > 2;
> > +   if (hashcmp(hash, idx_hash))
> 
> Since the hash size is abstracted away, shouldn't this hashcmp become
> oidcmp? (which still does not do the right thing, but at least it's
> one less place to worry about)

Unfortunately, I can't, because it's not an object ID.  I think the
decision was made to not transform non-object ID hashes into struct
object_id, which makes sense.  I suppose we could have an equivalent
struct hash or something for those other uses.

> Same comment for other hashcmp in this patch.
> 
> > @@ -675,7 +677,8 @@ struct packed_git *add_packed_git(const char *path, 
> > size_t path_len, int local)
> > p->pack_size = st.st_size;
> > p->pack_local = local;
> > p->mtime = st.st_mtime;
> > -   if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
> > +   if (path_len < the_hash_algo->hexsz ||
> > +   get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->sha1))
> 
> get_sha1_hex looks out of place when we start going with
> the_hash_algo. Maybe change to get_oid_hex() too.

I believe this is the pack hash, which isn't an object ID.  I will
transform it to be called something other than "sha1" and allocate more
memory for it in a future series, though.

> > @@ -1678,10 +1683,10 @@ int bsearch_pack(const struct object_id *oid, const 
> > struct packed_git *p, uint32
> >  
> > index_lookup = index_fanout + 4 * 256;
> > if (p->index_version == 1) {
> > -   index_lookup_width = 24;
> > +   index_lookup_width = hashsz + 4;
> 
> You did good research to spot this 24 constant ;-)

Thanks.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH v4 0/3] Optional sub hierarchy for remote tags

2018-05-01 Thread Jacob Keller
On Tue, May 1, 2018 at 4:24 PM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> I also agree, I'd prefer if we aim for the mapping to be something
>> which works for all refs in the future, even if such support isn't
>> added now, which is why i've proposed using "refs/remote//" so
>> that a tag would go from
>>
>> refs/tags/v1.7
>>
>> to
>>
>> refs/remote//tags/v1.7
>>
>> Ideally, we could work to update "refs/remotes/" to go to
>> "refs/remote//heads" as well.
>
> This is *not* imcompatible with having refs/remote-tags/* as an
> interim solution.

Sure. I'm just proposing that we pick a name that all the refs can move to now.

>
> We'll have to support refs/remotes// anyway long after
> we start using refs/remote//heads/ by (1) switching
> the fetch refspecs newer "git clone" writes to the latter format,

Ofcourse we'll have to support this, and i didn't mean to imply we wouldn't.

I was just hoping to avoid having even more places to check in the future.

> and (2) extending the dwim table to try both formats.  Having Wink's
> solution as an interim step adds one more entry to (2) but the
> machinery is already there.  And it does not change (1), either.
>

Sure, we could. And yes, we have to do (1), which means we have to do
(2) anyways. But we can still pick something which is more easily
expandable than refs/remotes/ was.

Thanks,
Jake


Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-01 Thread Elijah Newren
On Tue, May 1, 2018 at 4:11 PM, Junio C Hamano  wrote:
> Elijah Newren  writes:
>
>> I'm not certain what the default should be, but I do believe that it
>> should be consistent between these commands.  I lean towards
>> considering break detection being on by default a good thing, but
>> there are some interesting issues to address:
>>   - there is no knob to adjust break detection for status, only for
>> diff/log/show/etc.
>>   - folks have a knob to turn break detection on (for diff/log/show),
>> but not one to turn it off
>>   - for status, break detection makes no sense if rename detection is off.
>>   - for diff/log/show, break detection provides almost no value if
>> rename detection is off (only a dissimilarity index), suggesting that
>> if user turns rename detection off and doesn't explicitly ask for
>> break detection, then it's a waste of time to have it be on
>>   - merge-recursive would break horribly right now if someone turned
>> break detection on for it.  Turning it on might be a good long term
>> goal, but it's not an easy change.
>
> Many of the issues in the above list are surmountable.  A new option
> could be added to "status" to enable break or "diff" family to
> disable it if we really wanted to.  A new "rewritten" state can be
> added alongside with "modified" to "status" output.
>
> A more serious issue around "-B" is this one:
>
> https://public-inbox.org/git/xmqqegqaahnh@gitster.dls.corp.google.com/
>
> Even though the message is back from 2015 and asks users not to use
> -B/-M together for anything critical "for now", the issue has not
> been resolved and the same bug remains with us in the current code.
>
> In the longer term, I suspect that it might make sense to have an
> option to let users choose among "I do not want to have anything to
> do with -B", "I always want -B when I ask for -M" and "I always want
> -B whether I ask for -M".  But unfortunately the latter two with the
> current codebase is an unacceptably risky/broken choice.

Very interesting; I didn't know that break detection and rename
detection were unsafe to use together.

I also just realized that in addition to status being inconsistent
with diff/log/show, it was also inconsistent with itself -- it handles
staged and unstaged changes differently.
(wt_status_collect_changes_worktree() had different settings for break
detection than wt_status_collect_changes_index().)

Eckhard, can you add some comments to your commit message mentioning
the email pointed to by Junio about break detection and rename
detection being unsafe to use together, as well as the inconsistencies
in break detection between different commands?  That may help future
readers understand why break detection was turned off for
wt_status_collect_changes_index().


Re: [PATCH 01/41] cache: add a function to read an object ID from a buffer

2018-05-01 Thread brian m. carlson
On Tue, May 01, 2018 at 11:36:03AM +0200, Duy Nguyen wrote:
> On Mon, Apr 23, 2018 at 11:39:11PM +, brian m. carlson wrote:
> > diff --git a/cache.h b/cache.h
> > index bbaf5c349a..4bca177cf3 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1008,6 +1008,11 @@ static inline void oidclr(struct object_id *oid)
> > memset(oid->hash, 0, GIT_MAX_RAWSZ);
> >  }
> >  
> > +static inline void oidread(struct object_id *oid, const unsigned char 
> > *hash)
> > +{
> > +   memcpy(oid->hash, hash, the_hash_algo->rawsz);
> 
> If performance is a concern, should we go with GIT_MAX_RAWSZ instead
> of the_hash_algo->rawsz which gives the compiler some more to bypass
> actual memcpy function and generate copy code directly?

I don't think we can do that.  If we have both NewHash and SHA-1
compiled in and are using SHA-1, GIT_MAX_RAWSZ will be 32, but we may
only have 20 bytes that are valid to read.

> If it is not a performance problem, should we avoid inline and move
> the implementation somewhere?

I would like to make it as fast as possible if we can, especially since
hashcpy is inline.  If you have concerns about performance, I can add a
patch in a future series that does some sort of macro if we're using gcc
that does something like the following:

  ({
int rawsz = the_hash_algo->rawsz;
if (rawsz != GIT_SHA1_RAWSZ && rawsz != GIT_MAX_RAWSZ)
  __builtin_trap(); /* never reached */
rawsz;
   })

And then use that instead of the_hash_algo->rawsz in performance
sensitive paths.  That would mean the compiler would know that it was
only one of those two values and it could optimize better.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH v4 0/3] Optional sub hierarchy for remote tags

2018-05-01 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> As a workaround for that maybe we'll need something like:
>
>   [remote "gbenchmark"]
> url = g...@github.com:google/benchmark
> fetch = +refs/heads/*:refs/remotes/gbenchmark/*
> fetch = +refs/tags/*:refs/remote-tags/gbenchmark/*
>   tagStyle = remote
> tagopt = --no-tags

Good thinking.  In the longer term we would probably want to
deprecate tagopt that was invented in a very lazy way (it was
originally meant to hold any random string that we can insert on the
shell command that invokes "git fetch", which obviously is not a
good idea in the production code) and replace it with something more
"controlled", and the above looks like a good improvement to Wink's
proposed change.

> Or whatever, i.e. something where only the new version will fetch the
> tags and ignore the tagopt option (which I never liked anyway). It's a
> hack, but at least you don't end up with crap it your ref namespace by
> flip-flopping between versions.


Re: [RFC PATCH v4 0/3] Optional sub hierarchy for remote tags

2018-05-01 Thread Junio C Hamano
Jacob Keller  writes:

> I also agree, I'd prefer if we aim for the mapping to be something
> which works for all refs in the future, even if such support isn't
> added now, which is why i've proposed using "refs/remote//" so
> that a tag would go from
>
> refs/tags/v1.7
>
> to
>
> refs/remote//tags/v1.7
>
> Ideally, we could work to update "refs/remotes/" to go to
> "refs/remote//heads" as well.

This is *not* imcompatible with having refs/remote-tags/* as an
interim solution.  

We'll have to support refs/remotes// anyway long after
we start using refs/remote//heads/ by (1) switching
the fetch refspecs newer "git clone" writes to the latter format,
and (2) extending the dwim table to try both formats.  Having Wink's
solution as an interim step adds one more entry to (2) but the
machinery is already there.  And it does not change (1), either.



Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-01 Thread Junio C Hamano
Elijah Newren  writes:

> I'm not certain what the default should be, but I do believe that it
> should be consistent between these commands.  I lean towards
> considering break detection being on by default a good thing, but
> there are some interesting issues to address:
>   - there is no knob to adjust break detection for status, only for
> diff/log/show/etc.
>   - folks have a knob to turn break detection on (for diff/log/show),
> but not one to turn it off
>   - for status, break detection makes no sense if rename detection is off.
>   - for diff/log/show, break detection provides almost no value if
> rename detection is off (only a dissimilarity index), suggesting that
> if user turns rename detection off and doesn't explicitly ask for
> break detection, then it's a waste of time to have it be on
>   - merge-recursive would break horribly right now if someone turned
> break detection on for it.  Turning it on might be a good long term
> goal, but it's not an easy change.

Many of the issues in the above list are surmountable.  A new option
could be added to "status" to enable break or "diff" family to
disable it if we really wanted to.  A new "rewritten" state can be
added alongside with "modified" to "status" output.

A more serious issue around "-B" is this one:

https://public-inbox.org/git/xmqqegqaahnh@gitster.dls.corp.google.com/

Even though the message is back from 2015 and asks users not to use
-B/-M together for anything critical "for now", the issue has not
been resolved and the same bug remains with us in the current code.

In the longer term, I suspect that it might make sense to have an
option to let users choose among "I do not want to have anything to
do with -B", "I always want -B when I ask for -M" and "I always want
-B whether I ask for -M".  But unfortunately the latter two with the
current codebase is an unacceptably risky/broken choice.

>
> So, where does that leave us?  My opinion is
>   - these commands should be consistent.  Eckhard's patch makes them so.
>   - we might want to move towards break detection being on as the
> default.  That's a couple patch series (one for everything but
> merge-recursive, and a separate much bigger series for
> merge-recursive).
>
> But I can see others saying we should leave things inconsistent until
> we can fix the other commands to use break detection as the default.
> So...thoughts?
>
> Elijah


Re: [PATCH 2/2] {fetch,upload}-pack: support filter in protocol v2

2018-05-01 Thread Brandon Williams
On 05/01, Jonathan Tan wrote:
> The fetch-pack/upload-pack protocol v2 was developed independently of
> the filter parameter (used in partial fetches), thus it did not include
> support for it. Add support for the filter parameter.
> 
> Like in the legacy protocol, the server advertises and supports "filter"
> only if uploadpack.allowfilter is configured.
> 
> Like in the legacy protocol, the client continues with a warning if
> "--filter" is specified, but the server does not advertise it.
> 
> Signed-off-by: Jonathan Tan 
> ---
>  Documentation/technical/protocol-v2.txt |  9 +++
>  fetch-pack.c| 23 +-
>  t/t5702-protocol-v2.sh  | 97 +
>  upload-pack.c   | 12 ++-
>  4 files changed, 135 insertions(+), 6 deletions(-)
> 

> @@ -1428,7 +1434,9 @@ int upload_pack_v2(struct repository *r, struct 
> argv_array *keys,
>  int upload_pack_advertise(struct repository *r,
> struct strbuf *value)
>  {
> - if (value)
> - strbuf_addstr(value, "shallow");
> + git_config(upload_pack_config, NULL);
> + if (value) {
> + strbuf_addf(value, "%sshallow", allow_filter ? "filter " : "");
> + }

This is a bit difficult to read and there is no reason why we would need
to read the entire upload_pack_config to determine if we need to filter
things (we will need to read the config if cmd "fetch" is requested
though).  Instead it may be better to do the following:

  if (value) {
strbuf_addstr(value, "shallow");
if (repo_config_get(r, "uplaodpack.filter"))
  strbuf_addstr(value, " filter");
  }

This way its easier to read and you only are reading the required value
from the config.

>   return 1;
>  }
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 

-- 
Brandon Williams


Re: [PATCH 1/2] upload-pack: fix error message typo

2018-05-01 Thread Jonathan Tan
On Tue,  1 May 2018 15:22:20 -0700
Jonathan Tan  wrote:

> +test_expect_success 'unexpected lines are not allowed in fetch request' '
> + git init server &&
> +
> + # Custom request that tries to filter even though it is not advertised.

Oops...I saw this copy-and-paste error right after I sent the e-mails.
I'll remove this in the next reroll (if there is one).

> + test-pkt-line pack >in <<-EOF &&
> + command=fetch
> + 0001
> + this-is-not-a-command
> + 
> + EOF
> +
> + test_must_fail git -C server serve --stateless-rpc /dev/null 2>err 
> &&
> + grep "unexpected line: .this-is-not-a-command." err
> +'


[PATCH 2/2] {fetch,upload}-pack: support filter in protocol v2

2018-05-01 Thread Jonathan Tan
The fetch-pack/upload-pack protocol v2 was developed independently of
the filter parameter (used in partial fetches), thus it did not include
support for it. Add support for the filter parameter.

Like in the legacy protocol, the server advertises and supports "filter"
only if uploadpack.allowfilter is configured.

Like in the legacy protocol, the client continues with a warning if
"--filter" is specified, but the server does not advertise it.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/protocol-v2.txt |  9 +++
 fetch-pack.c| 23 +-
 t/t5702-protocol-v2.sh  | 97 +
 upload-pack.c   | 12 ++-
 4 files changed, 135 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 136179d7d..38d24fd2b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -290,6 +290,15 @@ included in the clients request as well as the potential 
addition of the
Cannot be used with "deepen", but can be used with
"deepen-since".
 
+If the 'filter' feature is advertised, the following argument can be
+included in the client's request:
+
+filter 
+   Request that various objects from the packfile be omitted
+   using one of several filtering techniques. These are intended
+   for use with partial clone and partial fetch operations. See
+   `rev-list` for possible "filter-spec" values.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
diff --git a/fetch-pack.c b/fetch-pack.c
index f93723fec..3ed40aa46 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1191,14 +1191,29 @@ static int send_fetch_request(int fd_out, const struct 
fetch_pack_args *args,
else if (is_repository_shallow() || args->deepen)
die(_("Server does not support shallow requests"));
 
+   /* Add filter */
+   if (server_supports_feature("fetch", "filter", 0) &&
+   args->filter_options.choice) {
+   print_verbose(args, _("Server supports filter"));
+   packet_buf_write(_buf, "filter %s",
+args->filter_options.filter_spec);
+   } else if (args->filter_options.choice) {
+   warning("filtering not recognized by server, ignoring");
+   }
+
/* add wants */
add_wants(wants, _buf);
 
-   /* Add all of the common commits we've found in previous rounds */
-   add_common(_buf, common);
+   if (args->no_dependents) {
+   packet_buf_write(_buf, "done");
+   ret = 1;
+   } else {
+   /* Add all of the common commits we've found in previous rounds 
*/
+   add_common(_buf, common);
 
-   /* Add initial haves */
-   ret = add_haves(_buf, haves_to_send, in_vain);
+   /* Add initial haves */
+   ret = add_haves(_buf, haves_to_send, in_vain);
+   }
 
/* Send request */
packet_buf_flush(_buf);
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 56f7c3c32..834ccc6f2 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -201,6 +201,103 @@ test_expect_success 'ref advertisment is filtered during 
fetch using protocol v2
! grep "refs/tags/three" log
 '
 
+test_expect_success 'setup filter tests' '
+   git init server &&
+
+   # 1 commit to create a file, and 1 commit to modify it
+   test_commit -C server message1 a.txt &&
+   test_commit -C server message2 a.txt &&
+   git -C server config protocol.version 2 &&
+   git -C server config uploadpack.allowfilter 1 &&
+   git -C server config uploadpack.allowanysha1inwant 1 &&
+   git -C server config protocol.version 2
+'
+
+test_expect_success 'partial clone' '
+   GIT_TRACE_PACKET="$(pwd)/trace" git -c protocol.version=2 \
+   clone --filter=blob:none "file://$(pwd)/server" client &&
+   grep "version 2" trace &&
+
+   # Ensure that the old version of the file is missing
+   git -C client rev-list master --quiet --objects --missing=print \
+   >observed.oids &&
+   grep "$(git -C server rev-parse message1:a.txt)" observed.oids &&
+
+   # Ensure that client passes fsck
+   git -C client fsck
+'
+
+test_expect_success 'dynamically fetch missing object' '
+   rm "$(pwd)/trace" &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   cat-file -p $(git -C server rev-parse message1:a.txt) &&
+   grep "version 2" trace
+'
+
+test_expect_success 'partial fetch' '
+   rm -rf client "$(pwd)/trace" &&
+   git init client &&
+   SERVER="file://$(pwd)/server" &&
+   test_config -C client extensions.partialClone "$SERVER" &&
+
+   

[PATCH 0/2] Supporting partial clones in protocol v2

2018-05-01 Thread Jonathan Tan
This patch set is built on "next". (I had to build it on "next" because
bw/protocol-v2 is built on v2.16, which does not have support for
partial clones.)

Partial clones and protocol v2 were built independently, so here is a
patch to support partial clones in protocol v2.

One thing I am a little unhappy about is the fact that the upload-pack
config has to be read in multiple places, but perhaps there is no choice
about that.

Jonathan Tan (2):
  upload-pack: fix error message typo
  {fetch,upload}-pack: support filter in protocol v2

 Documentation/technical/protocol-v2.txt |  9 +++
 fetch-pack.c| 23 +-
 t/t5701-git-serve.sh| 15 
 t/t5702-protocol-v2.sh  | 97 +
 upload-pack.c   | 14 +++-
 5 files changed, 151 insertions(+), 7 deletions(-)

-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 1/2] upload-pack: fix error message typo

2018-05-01 Thread Jonathan Tan
Fix a typo in an error message.

Also, this line was introduced in 3145ea957d2c ("upload-pack: introduce
fetch server command", 2018-03-15), which did not contain a test for the
case which causes this error to be printed, so introduce a test.

Signed-off-by: Jonathan Tan 
---
 t/t5701-git-serve.sh | 15 +++
 upload-pack.c|  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 72d7bc562..abbe5b19e 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -173,4 +173,19 @@ test_expect_success 'symrefs parameter' '
test_cmp actual expect
 '
 
+test_expect_success 'unexpected lines are not allowed in fetch request' '
+   git init server &&
+
+   # Custom request that tries to filter even though it is not advertised.
+   test-pkt-line pack >in <<-EOF &&
+   command=fetch
+   0001
+   this-is-not-a-command
+   
+   EOF
+
+   test_must_fail git -C server serve --stateless-rpc /dev/null 2>err 
&&
+   grep "unexpected line: .this-is-not-a-command." err
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 87b4d32a6..c4456bb88 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1252,7 +1252,7 @@ static void process_args(struct packet_reader *request,
}
 
/* ignore unknown lines maybe? */
-   die("unexpect line: '%s'", arg);
+   die("unexpected line: '%s'", arg);
}
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



Re: [PATCH 02/13] object: add repository argument to create_object

2018-05-01 Thread Eric Sunshine
On Tue, May 1, 2018 at 5:33 PM, Stefan Beller  wrote:
> Add a repository argument to allow the callers of create_object
> to be more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.

This is only patch 2, and patch 1 did not use a macro trick to catch
callers passing the wrong argument.

> Add the cocci patch that converted the callers.

No cocci patch anywhere in sight.

> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 


[PATCH 08/13] alloc: add repository argument to alloc_object_node

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c  | 2 +-
 cache.h  | 3 ++-
 object.c | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/alloc.c b/alloc.c
index 290250e3595..f031ce422d9 100644
--- a/alloc.c
+++ b/alloc.c
@@ -73,7 +73,7 @@ void *alloc_tag_node_the_repository(void)
 }
 
 static struct alloc_state object_state;
-void *alloc_object_node(void)
+void *alloc_object_node_the_repository(void)
 {
struct object *obj = alloc_node(_state, sizeof(union 
any_object));
obj->type = OBJ_NONE;
diff --git a/cache.h b/cache.h
index 32f340cde59..2d60359a964 100644
--- a/cache.h
+++ b/cache.h
@@ -1772,7 +1772,8 @@ extern void *alloc_tree_node_the_repository(void);
 extern void *alloc_commit_node_the_repository(void);
 #define alloc_tag_node(r) alloc_tag_node_##r()
 extern void *alloc_tag_node_the_repository(void);
-extern void *alloc_object_node(void);
+#define alloc_object_node(r) alloc_object_node_##r()
+extern void *alloc_object_node_the_repository(void);
 extern void alloc_report(void);
 extern unsigned int alloc_commit_index(void);
 
diff --git a/object.c b/object.c
index a6202d11292..7d36323445b 100644
--- a/object.c
+++ b/object.c
@@ -180,7 +180,7 @@ struct object *lookup_unknown_object(const unsigned char 
*sha1)
struct object *obj = lookup_object(sha1);
if (!obj)
obj = create_object(the_repository, sha1,
-   alloc_object_node());
+   alloc_object_node(the_repository));
return obj;
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 04/13] alloc: add repository argument to alloc_blob_node

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c | 2 +-
 blob.c  | 2 +-
 cache.h | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/alloc.c b/alloc.c
index 12afadfacdd..6c5c376a25a 100644
--- a/alloc.c
+++ b/alloc.c
@@ -49,7 +49,7 @@ static inline void *alloc_node(struct alloc_state *s, size_t 
node_size)
 }
 
 static struct alloc_state blob_state;
-void *alloc_blob_node(void)
+void *alloc_blob_node_the_repository(void)
 {
struct blob *b = alloc_node(_state, sizeof(struct blob));
b->object.type = OBJ_BLOB;
diff --git a/blob.c b/blob.c
index 85c2143f299..9e64f301895 100644
--- a/blob.c
+++ b/blob.c
@@ -9,7 +9,7 @@ struct blob *lookup_blob(const struct object_id *oid)
struct object *obj = lookup_object(oid->hash);
if (!obj)
return create_object(the_repository, oid->hash,
-alloc_blob_node());
+alloc_blob_node(the_repository));
return object_as_type(obj, OBJ_BLOB, 0);
 }
 
diff --git a/cache.h b/cache.h
index 3a4d80e92bf..2258e611275 100644
--- a/cache.h
+++ b/cache.h
@@ -1764,7 +1764,8 @@ int decode_85(char *dst, const char *line, int linelen);
 void encode_85(char *buf, const unsigned char *data, int bytes);
 
 /* alloc.c */
-extern void *alloc_blob_node(void);
+#define alloc_blob_node(r) alloc_blob_node_##r()
+extern void *alloc_blob_node_the_repository(void);
 extern void *alloc_tree_node(void);
 extern void *alloc_commit_node(void);
 extern void *alloc_tag_node(void);
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 06/13] alloc: add repository argument to alloc_commit_node

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c   | 2 +-
 blame.c   | 2 +-
 cache.h   | 3 ++-
 commit.c  | 2 +-
 merge-recursive.c | 2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/alloc.c b/alloc.c
index 2c8d1430758..9e2b897ec1d 100644
--- a/alloc.c
+++ b/alloc.c
@@ -88,7 +88,7 @@ unsigned int alloc_commit_index(void)
return count++;
 }
 
-void *alloc_commit_node(void)
+void *alloc_commit_node_the_repository(void)
 {
struct commit *c = alloc_node(_state, sizeof(struct commit));
c->object.type = OBJ_COMMIT;
diff --git a/blame.c b/blame.c
index dfa24473dc6..ba9b18e7542 100644
--- a/blame.c
+++ b/blame.c
@@ -161,7 +161,7 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
 
read_cache();
time();
-   commit = alloc_commit_node();
+   commit = alloc_commit_node(the_repository);
commit->object.parsed = 1;
commit->date = now;
parent_tail = >parents;
diff --git a/cache.h b/cache.h
index 1717d07a2c5..bf6e8c87d83 100644
--- a/cache.h
+++ b/cache.h
@@ -1768,7 +1768,8 @@ void encode_85(char *buf, const unsigned char *data, int 
bytes);
 extern void *alloc_blob_node_the_repository(void);
 #define alloc_tree_node(r) alloc_tree_node_##r()
 extern void *alloc_tree_node_the_repository(void);
-extern void *alloc_commit_node(void);
+#define alloc_commit_node(r) alloc_commit_node_##r()
+extern void *alloc_commit_node_the_repository(void);
 extern void *alloc_tag_node(void);
 extern void *alloc_object_node(void);
 extern void alloc_report(void);
diff --git a/commit.c b/commit.c
index 9106acf0aad..a9a43e79bae 100644
--- a/commit.c
+++ b/commit.c
@@ -51,7 +51,7 @@ struct commit *lookup_commit(const struct object_id *oid)
struct object *obj = lookup_object(oid->hash);
if (!obj)
return create_object(the_repository, oid->hash,
-alloc_commit_node());
+alloc_commit_node(the_repository));
return object_as_type(obj, OBJ_COMMIT, 0);
 }
 
diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624da..6dac8908648 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -98,7 +98,7 @@ static struct tree *shift_tree_object(struct tree *one, 
struct tree *two,
 
 static struct commit *make_virtual_commit(struct tree *tree, const char 
*comment)
 {
-   struct commit *commit = alloc_commit_node();
+   struct commit *commit = alloc_commit_node(the_repository);
 
set_merge_remote_desc(commit, comment, (struct object *)commit);
commit->tree = tree;
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 03/13] object: add repository argument to grow_object_hash

2018-05-01 Thread Stefan Beller
From: Jonathan Nieder 

Add a repository argument to allow the caller of grow_object_hash to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/object.c b/object.c
index 933921e35c9..a6202d11292 100644
--- a/object.c
+++ b/object.c
@@ -116,7 +116,8 @@ struct object *lookup_object(const unsigned char *sha1)
  * power of 2 (but at least 32).  Copy the existing values to the new
  * hash map.
  */
-static void grow_object_hash(void)
+#define grow_object_hash(r) grow_object_hash_##r()
+static void grow_object_hash_the_repository(void)
 {
int i;
/*
@@ -147,7 +148,7 @@ void *create_object_the_repository(const unsigned char 
*sha1, void *o)
hashcpy(obj->oid.hash, sha1);
 
if (the_repository->parsed_objects->obj_hash_size - 1 <= 
the_repository->parsed_objects->nr_objs * 2)
-   grow_object_hash();
+   grow_object_hash(the_repository);
 
insert_obj_hash(obj, the_repository->parsed_objects->obj_hash,
the_repository->parsed_objects->obj_hash_size);
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 11/13] object: allow grow_object_hash to handle arbitrary repositories

2018-05-01 Thread Stefan Beller
Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/object.c b/object.c
index ddf4b7b196e..43954fadf93 100644
--- a/object.c
+++ b/object.c
@@ -116,27 +116,27 @@ struct object *lookup_object(const unsigned char *sha1)
  * power of 2 (but at least 32).  Copy the existing values to the new
  * hash map.
  */
-#define grow_object_hash(r) grow_object_hash_##r()
-static void grow_object_hash_the_repository(void)
+static void grow_object_hash(struct repository *r)
 {
int i;
/*
 * Note that this size must always be power-of-2 to match hash_obj
 * above.
 */
-   int new_hash_size = the_repository->parsed_objects->obj_hash_size < 32 
? 32 : 2 * the_repository->parsed_objects->obj_hash_size;
+   int new_hash_size = r->parsed_objects->obj_hash_size < 32 ? 32 : 2 * 
r->parsed_objects->obj_hash_size;
struct object **new_hash;
 
new_hash = xcalloc(new_hash_size, sizeof(struct object *));
-   for (i = 0; i < the_repository->parsed_objects->obj_hash_size; i++) {
-   struct object *obj = 
the_repository->parsed_objects->obj_hash[i];
+   for (i = 0; i < r->parsed_objects->obj_hash_size; i++) {
+   struct object *obj = r->parsed_objects->obj_hash[i];
+
if (!obj)
continue;
insert_obj_hash(obj, new_hash, new_hash_size);
}
-   free(the_repository->parsed_objects->obj_hash);
-   the_repository->parsed_objects->obj_hash = new_hash;
-   the_repository->parsed_objects->obj_hash_size = new_hash_size;
+   free(r->parsed_objects->obj_hash);
+   r->parsed_objects->obj_hash = new_hash;
+   r->parsed_objects->obj_hash_size = new_hash_size;
 }
 
 void *create_object_the_repository(const unsigned char *sha1, void *o)
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 09/13] alloc: add repository argument to alloc_report

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c | 2 +-
 cache.h | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/alloc.c b/alloc.c
index f031ce422d9..28b85b22144 100644
--- a/alloc.c
+++ b/alloc.c
@@ -105,7 +105,7 @@ static void report(const char *name, unsigned int count, 
size_t size)
 #define REPORT(name, type) \
 report(#name, name##_state.count, name##_state.count * sizeof(type) >> 10)
 
-void alloc_report(void)
+void alloc_report_the_repository(void)
 {
REPORT(blob, struct blob);
REPORT(tree, struct tree);
diff --git a/cache.h b/cache.h
index 2d60359a964..01cc207d218 100644
--- a/cache.h
+++ b/cache.h
@@ -1774,7 +1774,8 @@ extern void *alloc_commit_node_the_repository(void);
 extern void *alloc_tag_node_the_repository(void);
 #define alloc_object_node(r) alloc_object_node_##r()
 extern void *alloc_object_node_the_repository(void);
-extern void alloc_report(void);
+#define alloc_report(r) alloc_report_##r()
+extern void alloc_report_the_repository(void);
 extern unsigned int alloc_commit_index(void);
 
 /* pkt-line.c */
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-01 Thread Stefan Beller
We have to convert all of the alloc functions at once, because alloc_report
uses a funky macro for reporting. It is better for the sake of mechanical
conversion to convert multiple functions at once rather than changing the
structure of the reporting function.

We record all memory allocation in alloc.c, and free them in
clear_alloc_state, which is called for all repositories except
the_repository.

Signed-off-by: Stefan Beller 
---
 alloc.c   | 69 ++-
 alloc.h   | 21 +++
 blame.c   |  1 +
 blob.c|  1 +
 cache.h   | 16 ---
 commit.c  |  1 +
 merge-recursive.c |  1 +
 object.c  | 24 ++---
 object.h  | 10 ++-
 repository.c  |  4 +--
 tag.c |  1 +
 tree.c|  1 +
 12 files changed, 104 insertions(+), 46 deletions(-)
 create mode 100644 alloc.h

diff --git a/alloc.c b/alloc.c
index 277dadd221b..66a3d07ba2d 100644
--- a/alloc.c
+++ b/alloc.c
@@ -4,10 +4,11 @@
  * Copyright (C) 2006 Linus Torvalds
  *
  * The standard malloc/free wastes too much space for objects, partly because
- * it maintains all the allocation infrastructure (which isn't needed, since
- * we never free an object descriptor anyway), but even more because it ends
+ * it maintains all the allocation infrastructure, but even more because it 
ends
  * up with maximal alignment because it doesn't know what the object alignment
  * for the new allocation is.
+ *
+ * TODO: Combine this with mem-pool?
  */
 #include "cache.h"
 #include "object.h"
@@ -30,8 +31,25 @@ struct alloc_state {
int count; /* total number of nodes allocated */
int nr;/* number of nodes left in current allocation */
void *p;   /* first free node in current allocation */
+
+   /* bookkeeping of allocations */
+   void **slabs;
+   int slab_nr, slab_alloc;
 };
 
+void *allocate_alloc_state(void)
+{
+   return xcalloc(1, sizeof(struct alloc_state));
+}
+
+void clear_alloc_state(struct alloc_state *s)
+{
+   while (s->slab_nr > 0) {
+   s->slab_nr--;
+   free(s->slabs[s->slab_nr]);
+   }
+}
+
 static inline void *alloc_node(struct alloc_state *s, size_t node_size)
 {
void *ret;
@@ -45,54 +63,56 @@ static inline void *alloc_node(struct alloc_state *s, 
size_t node_size)
ret = s->p;
s->p = (char *)s->p + node_size;
memset(ret, 0, node_size);
+
+   ALLOC_GROW(s->slabs, s->slab_nr + 1, s->slab_alloc);
+   s->slabs[s->slab_nr++] = ret;
+
return ret;
 }
 
-static struct alloc_state blob_state;
-void *alloc_blob_node_the_repository(void)
+struct alloc_state the_repository_blob_state;
+void *alloc_blob_node(struct repository *r)
 {
-   struct blob *b = alloc_node(_state, sizeof(struct blob));
+   struct blob *b = alloc_node(r->parsed_objects->blob_state, 
sizeof(struct blob));
b->object.type = OBJ_BLOB;
return b;
 }
 
-static struct alloc_state tree_state;
-void *alloc_tree_node_the_repository(void)
+struct alloc_state the_repository_tree_state;
+void *alloc_tree_node(struct repository *r)
 {
-   struct tree *t = alloc_node(_state, sizeof(struct tree));
+   struct tree *t = alloc_node(r->parsed_objects->tree_state, 
sizeof(struct tree));
t->object.type = OBJ_TREE;
return t;
 }
 
-static struct alloc_state tag_state;
-void *alloc_tag_node_the_repository(void)
+struct alloc_state the_repository_tag_state;
+void *alloc_tag_node(struct repository *r)
 {
-   struct tag *t = alloc_node(_state, sizeof(struct tag));
+   struct tag *t = alloc_node(r->parsed_objects->tag_state, sizeof(struct 
tag));
t->object.type = OBJ_TAG;
return t;
 }
 
-static struct alloc_state object_state;
-void *alloc_object_node_the_repository(void)
+struct alloc_state the_repository_object_state;
+void *alloc_object_node(struct repository *r)
 {
-   struct object *obj = alloc_node(_state, sizeof(union 
any_object));
+   struct object *obj = alloc_node(r->parsed_objects->object_state, 
sizeof(union any_object));
obj->type = OBJ_NONE;
return obj;
 }
 
-static struct alloc_state commit_state;
-
-unsigned int alloc_commit_index_the_repository(void)
+unsigned int alloc_commit_index(struct repository *r)
 {
-   static unsigned int count;
-   return count++;
+   return r->parsed_objects->commit_count++;
 }
 
-void *alloc_commit_node_the_repository(void)
+struct alloc_state the_repository_commit_state;
+void *alloc_commit_node(struct repository *r)
 {
-   struct commit *c = alloc_node(_state, sizeof(struct commit));
+   struct commit *c = alloc_node(r->parsed_objects->commit_state, 
sizeof(struct commit));
c->object.type = OBJ_COMMIT;
-   c->index = alloc_commit_index(the_repository);
+   c->index = alloc_commit_index(r);
return c;
 }
 
@@ -103,9 +123,10 @@ static void 

[PATCH 10/13] alloc: add repository argument to alloc_commit_index

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c  | 4 ++--
 cache.h  | 3 ++-
 object.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/alloc.c b/alloc.c
index 28b85b22144..277dadd221b 100644
--- a/alloc.c
+++ b/alloc.c
@@ -82,7 +82,7 @@ void *alloc_object_node_the_repository(void)
 
 static struct alloc_state commit_state;
 
-unsigned int alloc_commit_index(void)
+unsigned int alloc_commit_index_the_repository(void)
 {
static unsigned int count;
return count++;
@@ -92,7 +92,7 @@ void *alloc_commit_node_the_repository(void)
 {
struct commit *c = alloc_node(_state, sizeof(struct commit));
c->object.type = OBJ_COMMIT;
-   c->index = alloc_commit_index();
+   c->index = alloc_commit_index(the_repository);
return c;
 }
 
diff --git a/cache.h b/cache.h
index 01cc207d218..0e6c5dd5639 100644
--- a/cache.h
+++ b/cache.h
@@ -1776,7 +1776,8 @@ extern void *alloc_tag_node_the_repository(void);
 extern void *alloc_object_node_the_repository(void);
 #define alloc_report(r) alloc_report_##r()
 extern void alloc_report_the_repository(void);
-extern unsigned int alloc_commit_index(void);
+#define alloc_commit_index(r) alloc_commit_index_##r()
+extern unsigned int alloc_commit_index_the_repository(void);
 
 /* pkt-line.c */
 void packet_trace_identity(const char *prog);
diff --git a/object.c b/object.c
index 7d36323445b..ddf4b7b196e 100644
--- a/object.c
+++ b/object.c
@@ -162,7 +162,7 @@ void *object_as_type(struct object *obj, enum object_type 
type, int quiet)
return obj;
else if (obj->type == OBJ_NONE) {
if (type == OBJ_COMMIT)
-   ((struct commit *)obj)->index = alloc_commit_index();
+   ((struct commit *)obj)->index = 
alloc_commit_index(the_repository);
obj->type = type;
return obj;
}
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 12/13] object: allow create_object to handle arbitrary repositories

2018-05-01 Thread Stefan Beller
Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object.c | 12 ++--
 object.h |  3 +--
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/object.c b/object.c
index 43954fadf93..fd27cf54faa 100644
--- a/object.c
+++ b/object.c
@@ -139,7 +139,7 @@ static void grow_object_hash(struct repository *r)
r->parsed_objects->obj_hash_size = new_hash_size;
 }
 
-void *create_object_the_repository(const unsigned char *sha1, void *o)
+void *create_object(struct repository *r, const unsigned char *sha1, void *o)
 {
struct object *obj = o;
 
@@ -147,12 +147,12 @@ void *create_object_the_repository(const unsigned char 
*sha1, void *o)
obj->flags = 0;
hashcpy(obj->oid.hash, sha1);
 
-   if (the_repository->parsed_objects->obj_hash_size - 1 <= 
the_repository->parsed_objects->nr_objs * 2)
-   grow_object_hash(the_repository);
+   if (r->parsed_objects->obj_hash_size - 1 <= r->parsed_objects->nr_objs 
* 2)
+   grow_object_hash(r);
 
-   insert_obj_hash(obj, the_repository->parsed_objects->obj_hash,
-   the_repository->parsed_objects->obj_hash_size);
-   the_repository->parsed_objects->nr_objs++;
+   insert_obj_hash(obj, r->parsed_objects->obj_hash,
+   r->parsed_objects->obj_hash_size);
+   r->parsed_objects->nr_objs++;
return obj;
 }
 
diff --git a/object.h b/object.h
index d1869dbc502..5ef6ce1ea96 100644
--- a/object.h
+++ b/object.h
@@ -93,8 +93,7 @@ extern struct object *get_indexed_object(unsigned int);
  */
 struct object *lookup_object(const unsigned char *sha1);
 
-#define create_object(r, s, o) create_object_##r(s, o)
-extern void *create_object_the_repository(const unsigned char *sha1, void 
*obj);
+extern void *create_object(struct repository *r, const unsigned char *sha1, 
void *obj);
 
 void *object_as_type(struct object *obj, enum object_type type, int quiet);
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 05/13] alloc: add repository argument to alloc_tree_node

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c | 2 +-
 cache.h | 3 ++-
 tree.c  | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/alloc.c b/alloc.c
index 6c5c376a25a..2c8d1430758 100644
--- a/alloc.c
+++ b/alloc.c
@@ -57,7 +57,7 @@ void *alloc_blob_node_the_repository(void)
 }
 
 static struct alloc_state tree_state;
-void *alloc_tree_node(void)
+void *alloc_tree_node_the_repository(void)
 {
struct tree *t = alloc_node(_state, sizeof(struct tree));
t->object.type = OBJ_TREE;
diff --git a/cache.h b/cache.h
index 2258e611275..1717d07a2c5 100644
--- a/cache.h
+++ b/cache.h
@@ -1766,7 +1766,8 @@ void encode_85(char *buf, const unsigned char *data, int 
bytes);
 /* alloc.c */
 #define alloc_blob_node(r) alloc_blob_node_##r()
 extern void *alloc_blob_node_the_repository(void);
-extern void *alloc_tree_node(void);
+#define alloc_tree_node(r) alloc_tree_node_##r()
+extern void *alloc_tree_node_the_repository(void);
 extern void *alloc_commit_node(void);
 extern void *alloc_tag_node(void);
 extern void *alloc_object_node(void);
diff --git a/tree.c b/tree.c
index 63730e3fb46..58cf19b4fa8 100644
--- a/tree.c
+++ b/tree.c
@@ -197,7 +197,7 @@ struct tree *lookup_tree(const struct object_id *oid)
struct object *obj = lookup_object(oid->hash);
if (!obj)
return create_object(the_repository, oid->hash,
-alloc_tree_node());
+alloc_tree_node(the_repository));
return object_as_type(obj, OBJ_TREE, 0);
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 02/13] object: add repository argument to create_object

2018-05-01 Thread Stefan Beller
Add a repository argument to allow the callers of create_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Add the cocci patch that converted the callers.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 blob.c   | 4 +++-
 commit.c | 3 ++-
 object.c | 5 +++--
 object.h | 3 ++-
 tag.c| 3 ++-
 tree.c   | 3 ++-
 6 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/blob.c b/blob.c
index fa2ab4f7a74..85c2143f299 100644
--- a/blob.c
+++ b/blob.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "blob.h"
+#include "repository.h"
 
 const char *blob_type = "blob";
 
@@ -7,7 +8,8 @@ struct blob *lookup_blob(const struct object_id *oid)
 {
struct object *obj = lookup_object(oid->hash);
if (!obj)
-   return create_object(oid->hash, alloc_blob_node());
+   return create_object(the_repository, oid->hash,
+alloc_blob_node());
return object_as_type(obj, OBJ_BLOB, 0);
 }
 
diff --git a/commit.c b/commit.c
index ca474a7c112..9106acf0aad 100644
--- a/commit.c
+++ b/commit.c
@@ -50,7 +50,8 @@ struct commit *lookup_commit(const struct object_id *oid)
 {
struct object *obj = lookup_object(oid->hash);
if (!obj)
-   return create_object(oid->hash, alloc_commit_node());
+   return create_object(the_repository, oid->hash,
+alloc_commit_node());
return object_as_type(obj, OBJ_COMMIT, 0);
 }
 
diff --git a/object.c b/object.c
index 503c73e2d1f..933921e35c9 100644
--- a/object.c
+++ b/object.c
@@ -138,7 +138,7 @@ static void grow_object_hash(void)
the_repository->parsed_objects->obj_hash_size = new_hash_size;
 }
 
-void *create_object(const unsigned char *sha1, void *o)
+void *create_object_the_repository(const unsigned char *sha1, void *o)
 {
struct object *obj = o;
 
@@ -178,7 +178,8 @@ struct object *lookup_unknown_object(const unsigned char 
*sha1)
 {
struct object *obj = lookup_object(sha1);
if (!obj)
-   obj = create_object(sha1, alloc_object_node());
+   obj = create_object(the_repository, sha1,
+   alloc_object_node());
return obj;
 }
 
diff --git a/object.h b/object.h
index 84380b2b4d5..d1869dbc502 100644
--- a/object.h
+++ b/object.h
@@ -93,7 +93,8 @@ extern struct object *get_indexed_object(unsigned int);
  */
 struct object *lookup_object(const unsigned char *sha1);
 
-extern void *create_object(const unsigned char *sha1, void *obj);
+#define create_object(r, s, o) create_object_##r(s, o)
+extern void *create_object_the_repository(const unsigned char *sha1, void 
*obj);
 
 void *object_as_type(struct object *obj, enum object_type type, int quiet);
 
diff --git a/tag.c b/tag.c
index 3d37c1bd251..7150b759d66 100644
--- a/tag.c
+++ b/tag.c
@@ -93,7 +93,8 @@ struct tag *lookup_tag(const struct object_id *oid)
 {
struct object *obj = lookup_object(oid->hash);
if (!obj)
-   return create_object(oid->hash, alloc_tag_node());
+   return create_object(the_repository, oid->hash,
+alloc_tag_node());
return object_as_type(obj, OBJ_TAG, 0);
 }
 
diff --git a/tree.c b/tree.c
index 1c68ea586bd..63730e3fb46 100644
--- a/tree.c
+++ b/tree.c
@@ -196,7 +196,8 @@ struct tree *lookup_tree(const struct object_id *oid)
 {
struct object *obj = lookup_object(oid->hash);
if (!obj)
-   return create_object(oid->hash, alloc_tree_node());
+   return create_object(the_repository, oid->hash,
+alloc_tree_node());
return object_as_type(obj, OBJ_TREE, 0);
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 07/13] alloc: add repository argument to alloc_tag_node

2018-05-01 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 alloc.c | 2 +-
 cache.h | 3 ++-
 tag.c   | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/alloc.c b/alloc.c
index 9e2b897ec1d..290250e3595 100644
--- a/alloc.c
+++ b/alloc.c
@@ -65,7 +65,7 @@ void *alloc_tree_node_the_repository(void)
 }
 
 static struct alloc_state tag_state;
-void *alloc_tag_node(void)
+void *alloc_tag_node_the_repository(void)
 {
struct tag *t = alloc_node(_state, sizeof(struct tag));
t->object.type = OBJ_TAG;
diff --git a/cache.h b/cache.h
index bf6e8c87d83..32f340cde59 100644
--- a/cache.h
+++ b/cache.h
@@ -1770,7 +1770,8 @@ extern void *alloc_blob_node_the_repository(void);
 extern void *alloc_tree_node_the_repository(void);
 #define alloc_commit_node(r) alloc_commit_node_##r()
 extern void *alloc_commit_node_the_repository(void);
-extern void *alloc_tag_node(void);
+#define alloc_tag_node(r) alloc_tag_node_##r()
+extern void *alloc_tag_node_the_repository(void);
 extern void *alloc_object_node(void);
 extern void alloc_report(void);
 extern unsigned int alloc_commit_index(void);
diff --git a/tag.c b/tag.c
index 7150b759d66..02ef4eaafc0 100644
--- a/tag.c
+++ b/tag.c
@@ -94,7 +94,7 @@ struct tag *lookup_tag(const struct object_id *oid)
struct object *obj = lookup_object(oid->hash);
if (!obj)
return create_object(the_repository, oid->hash,
-alloc_tag_node());
+alloc_tag_node(the_repository));
return object_as_type(obj, OBJ_TAG, 0);
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 00/13] object store: alloc

2018-05-01 Thread Stefan Beller
This applies on top of sb/oid-object-info and is the logical continuum of
the series that it builds on; this brings the object store into more of
Gits code, removing global state, such that reasoning about the state of
the in-memory representation of the repository is easier.

My original plan was to convert lookup_commit_graft as the next series,
which would be similar to lookup_replace_object, as in sb/object-store-replace.
The grafts and shallow mechanism are very close to each other, such that
they need to be converted at the same time, both depending on the
"parsed object store" that is introduced in this commit.

The next series will then convert code in {object/blob/tree/commit/tag}.c
hopefully finishing the lookup_* functions.

I also debated if it is worth converting alloc.c via this patch series
or if it might make more sense to use the new mem-pool by Jameson[1].

I vaguely wonder about the performance impact, as the object allocation
code seemed to be relevant in the past.

[1] https://public-inbox.org/git/20180430153122.243976-1-jam...@microsoft.com/

Any comments welcome,
Thanks,
Stefan

Jonathan Nieder (1):
  object: add repository argument to grow_object_hash

Stefan Beller (12):
  repository: introduce object parser field
  object: add repository argument to create_object
  alloc: add repository argument to alloc_blob_node
  alloc: add repository argument to alloc_tree_node
  alloc: add repository argument to alloc_commit_node
  alloc: add repository argument to alloc_tag_node
  alloc: add repository argument to alloc_object_node
  alloc: add repository argument to alloc_report
  alloc: add repository argument to alloc_commit_index
  object: allow grow_object_hash to handle arbitrary repositories
  object: allow create_object to handle arbitrary repositories
  alloc: allow arbitrary repositories for alloc functions

 alloc.c   | 69 +++
 alloc.h   | 21 +++
 blame.c   |  3 +-
 blob.c|  5 ++-
 cache.h   |  9 -
 commit.c  |  4 +-
 merge-recursive.c |  3 +-
 object.c  | 93 +--
 object.h  | 18 -
 repository.c  |  7 
 repository.h  | 11 +-
 tag.c |  4 +-
 tree.c|  4 +-
 13 files changed, 182 insertions(+), 69 deletions(-)
 create mode 100644 alloc.h

-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 01/13] repository: introduce object parser field

2018-05-01 Thread Stefan Beller
Git's object access code can be thought of as containing two layers:
the raw object store provides access to raw object content, while the
higher level obj_hash code parses raw objects and keeps track of
parenthood and other object relationships using 'struct object'.
Keeping these layers separate should make it easier to find relevant
functions and to change the implementation of one without having to
touch the other.

Add an object_parser field to 'struct repository' to prepare obj_hash
to be handled per repository.  Callers still only use the_repository
for now --- later patches will adapt them to handle arbitrary
repositories.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object.c | 63 +---
 object.h |  8 +++
 repository.c |  7 ++
 repository.h | 11 -
 4 files changed, 65 insertions(+), 24 deletions(-)

diff --git a/object.c b/object.c
index 5044d08e96c..503c73e2d1f 100644
--- a/object.c
+++ b/object.c
@@ -8,17 +8,14 @@
 #include "object-store.h"
 #include "packfile.h"
 
-static struct object **obj_hash;
-static int nr_objs, obj_hash_size;
-
 unsigned int get_max_object_index(void)
 {
-   return obj_hash_size;
+   return the_repository->parsed_objects->obj_hash_size;
 }
 
 struct object *get_indexed_object(unsigned int idx)
 {
-   return obj_hash[idx];
+   return the_repository->parsed_objects->obj_hash[idx];
 }
 
 static const char *object_type_strings[] = {
@@ -90,15 +87,16 @@ struct object *lookup_object(const unsigned char *sha1)
unsigned int i, first;
struct object *obj;
 
-   if (!obj_hash)
+   if (!the_repository->parsed_objects->obj_hash)
return NULL;
 
-   first = i = hash_obj(sha1, obj_hash_size);
-   while ((obj = obj_hash[i]) != NULL) {
+   first = i = hash_obj(sha1,
+the_repository->parsed_objects->obj_hash_size);
+   while ((obj = the_repository->parsed_objects->obj_hash[i]) != NULL) {
if (!hashcmp(sha1, obj->oid.hash))
break;
i++;
-   if (i == obj_hash_size)
+   if (i == the_repository->parsed_objects->obj_hash_size)
i = 0;
}
if (obj && i != first) {
@@ -107,7 +105,8 @@ struct object *lookup_object(const unsigned char *sha1)
 * that we do not need to walk the hash table the next
 * time we look for it.
 */
-   SWAP(obj_hash[i], obj_hash[first]);
+   SWAP(the_repository->parsed_objects->obj_hash[i],
+the_repository->parsed_objects->obj_hash[first]);
}
return obj;
 }
@@ -124,19 +123,19 @@ static void grow_object_hash(void)
 * Note that this size must always be power-of-2 to match hash_obj
 * above.
 */
-   int new_hash_size = obj_hash_size < 32 ? 32 : 2 * obj_hash_size;
+   int new_hash_size = the_repository->parsed_objects->obj_hash_size < 32 
? 32 : 2 * the_repository->parsed_objects->obj_hash_size;
struct object **new_hash;
 
new_hash = xcalloc(new_hash_size, sizeof(struct object *));
-   for (i = 0; i < obj_hash_size; i++) {
-   struct object *obj = obj_hash[i];
+   for (i = 0; i < the_repository->parsed_objects->obj_hash_size; i++) {
+   struct object *obj = 
the_repository->parsed_objects->obj_hash[i];
if (!obj)
continue;
insert_obj_hash(obj, new_hash, new_hash_size);
}
-   free(obj_hash);
-   obj_hash = new_hash;
-   obj_hash_size = new_hash_size;
+   free(the_repository->parsed_objects->obj_hash);
+   the_repository->parsed_objects->obj_hash = new_hash;
+   the_repository->parsed_objects->obj_hash_size = new_hash_size;
 }
 
 void *create_object(const unsigned char *sha1, void *o)
@@ -147,11 +146,12 @@ void *create_object(const unsigned char *sha1, void *o)
obj->flags = 0;
hashcpy(obj->oid.hash, sha1);
 
-   if (obj_hash_size - 1 <= nr_objs * 2)
+   if (the_repository->parsed_objects->obj_hash_size - 1 <= 
the_repository->parsed_objects->nr_objs * 2)
grow_object_hash();
 
-   insert_obj_hash(obj, obj_hash, obj_hash_size);
-   nr_objs++;
+   insert_obj_hash(obj, the_repository->parsed_objects->obj_hash,
+   the_repository->parsed_objects->obj_hash_size);
+   the_repository->parsed_objects->nr_objs++;
return obj;
 }
 
@@ -431,8 +431,8 @@ void clear_object_flags(unsigned flags)
 {
int i;
 
-   for (i=0; i < obj_hash_size; i++) {
-   struct object *obj = obj_hash[i];
+   for (i=0; i < the_repository->parsed_objects->obj_hash_size; i++) {
+   struct object *obj = 
the_repository->parsed_objects->obj_hash[i];
if (obj)
   

Re: [PATCH 2/2] submodule: Add --dissociate option to add/update commands

2018-05-01 Thread Casey Fitzpatrick
Thanks, I will clean up the braces and commit message.

I have to disagree with the 's/reference/dissociate/' comments. It
appears this section of option descriptions mostly copies from the
descriptions given by 'git clone -h', which outputs:
--reference reference repository
--reference-if-able 
  reference repository
--dissociate  use --reference only while cloning
It is perhaps not the best description, but it means to say when
--dissociate is used --reference is only in play for reducing network
transfer, not keeping alternates.

Those expansions *are* certainly off-putting, they make a long line
even more difficult to parse. I just searched that file for ":+" for
those types of expressions and I think a patch to fix them would be
fairly short. I'll look into making that cleanup patch.



On Tue, May 1, 2018 at 4:25 PM, Eric Sunshine  wrote:
> On Tue, May 1, 2018 at 2:09 PM, Casey Fitzpatrick  wrote:
>> submodule: Add --dissociate option to add/update commands
>
> s/Add/add/
>
>> Add --dissociate option to add and update commands, both clone helper 
>> commands
>> that already have the --reference option --dissociate pairs with.
>> Add documentation.
>> Add tests.
>>
>> Signed-off-by: Casey Fitzpatrick 
>> ---
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> @@ -1075,6 +1075,9 @@ static int clone_submodule(const char *path, const 
>> char *gitdir, const char *url
>> +   if (dissociate) {
>> +   argv_array_push(, "--dissociate");
>> +   }
>
> Style: drop unnecessary braces
>
>> @@ -1208,6 +1212,8 @@ static int module_clone(int argc, const char **argv, 
>> const char *prefix)
>> +   OPT_BOOL(0, "dissociate", ,
>> +  N_("use --reference only while cloning")),
>
> s/reference/dissociate/
>
>> @@ -1575,6 +1584,8 @@ static int update_clone(int argc, const char **argv, 
>> const char *prefix)
>> +   OPT_BOOL(0, "dissociate", ,
>> +  N_("use --reference only while cloning")),
>
> s/reference/dissociate/
>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> +   --dissociate)
>> +   dissociate="--dissociate"
>> @@ -258,7 +262,7 @@ or you are unsure what this means choose another name 
>> with the '--name' option."
>> -   git submodule--helper clone ${GIT_QUIET:+--quiet} 
>> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
>> "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${depth:+"$depth"} 
>> || exit
>> +   git submodule--helper clone ${GIT_QUIET:+--quiet} 
>> ${progress:+"$progress"} --prefix "$wt_prefix" --path "$sm_path" --name 
>> "$sm_name" --url "$realrepo" ${reference:+"$reference"} 
>> ${dissociate:+"$dissociate"} ${depth:+"$depth"} || exit
>
> I realize that you're just following existing practice in this script,
> but it's a bit off-putting to see expansions for the new --progress
> and --dissociate options being done via unnecessarily complex
> ${foobar:+"$foobar"} when the simpler $foobar would work just as well.
>
> Just a comment; not necessarily a request for change. (A separate
> preparatory cleanup patch which simplifies the existing complex
> expansion expressions would be welcome but could also be considered
> outside the scope of this patch series.)
>
>> @@ -493,6 +497,9 @@ cmd_update()
>> +   --dissociate)
>> +   dissociate="--dissociate"
>> +   ;;
>> @@ -550,6 +557,7 @@ cmd_update()
>> ${reference:+"$reference"} \
>> +   ${dissociate:+"$dissociate"} \
>> ${depth:+--depth "$depth"} \


Re: [PATCH 1/2] submodule: Add --progress option to add command

2018-05-01 Thread Casey Fitzpatrick
Thanks, I'll clean it up based on your comments. I based the tests
from t5606-clone-options.sh; I'm not sure why now but I thought I
needed that clone -o thing from there, but it appears useless.



On Tue, May 1, 2018 at 2:41 PM, Stefan Beller  wrote:
> On Tue, May 1, 2018 at 11:09 AM, Casey Fitzpatrick  wrote:
>> --progress was missing from add command, was only in update.
>> Add --progress where it makes sense (both clone helper commands).
>> Add missing documentation entry.
>> Add test.
>
> Maybe:
>   The '--progress' was introduced in 72c5f88311d (clone: pass --progress
>   decision to recursive submodules, 2016-09-22) to fix the progress reporting
>   of the clone command. Also add the progress option to the 'submodule add'
>   command. The update command already support the progress flag, but it
>   is not documented.
>
>> @@ -117,6 +117,9 @@ cmd_add()
>> -q|--quiet)
>> GIT_QUIET=1
>> ;;
>> +   --progress)
>> +   progress="--progress"
>> +   ;;
>
> The code looks good, though unlike the other commands progress is a
> boolean decision.
>
> If we want to support --no-progress as well, we can do so by adding
> --no-progress)
> progress="--no-progress"
> ;;
> and then the submodule helper ought to cope with it.
>
>
>> +test_expect_success 'setup test repo' '
>> +   mkdir parent &&
>> +   (cd parent && git init &&
>> +echo one >file && git add file &&
>> +git commit -m one)
>> +'
>
> Coding style:
> (
> parens open on its own line, and its contents
> are indented by a tab.
>
> Instead of coding this yourself, you could write the
> test as:
>
> test_create_repo parent &&
> test_create_commit -C parent one
>
>> +test_expect_success 'clone -o' '
>
> What are we testing here? I do not quite see the connection to
> testing --progress here.
>
>> +   git clone -o foo parent clone-o &&
>> +   (cd clone-o && git rev-parse --verify refs/remotes/foo/master)
>
>
>> +test_expect_success 'redirected submodule add does not show progress' '
>> +   (
>> +   cd addtest &&
>
>
>
>> +   git submodule add "file://$submodurl/parent" 
>> submod-redirected \
>> +   >out 2>err &&
>> +   ! grep % err &&
>
> We're grepping for percent to see that there is no progress. At first I 
> thought
> the percent sign might be a special character, had to research it to know it's
> meant literally. TiL!
>
>> +   test_i18ngrep ! "Checking connectivity" err
>
> Makes sense.
>
>> +   )
>
> We could omit the extra shell by using
>
> git -C addtest submodule add "file://$... \
> >../out 2>../err &&
>
> Why do we need 'out' ?
>
>> +test_expect_success 'redirected submodule add --progress does show 
>> progress' '
>> +   (
>> +   cd addtest &&
>> +   git submodule add --progress "file://$submodurl/parent" \
>> +   submod-redirected-progress >out 2>err && \
>> +   grep % err
>> +   )
>> +'
>
> Thanks for writing these tests!
> Stefan


  1   2   3   >