[no subject]

2018-02-25 Thread Alfred Chow




Good Day,

I am Mr. Alfred Cheuk Yu Chow, the Director for Credit & Marketing Chong
Hing Bank, Hong Kong, Chong Hing Bank Center, 24 Des Voeux Road Central,
Hong Kong. I have a business proposal of $ 38,980,369.00.

All confirmable documents to back up the claims will be made available
to you prior to your acceptance and as soon as I receive your return
mail.

Best Regards.







[no subject]

2018-02-25 Thread Alfred Chow




Good Day,

I am Mr. Alfred Cheuk Yu Chow, the Director for Credit & Marketing
Chong Hing Bank, Hong Kong, Chong Hing Bank Centre, 24 Des Voeux Road
Central, Hong Kong. I have a business proposal of  $38,980,369.00.

All confirmable documents to back up the claims will be made available
to you prior to your acceptance and as soon as I receive your return
mail.

Email me for more details:

Best Regards.







[no subject]

2018-02-25 Thread Alfred Chow




Good Day,

I am Mr. Alfred Cheuk Yu Chow, the Director for Credit & Marketing
Chong Hing Bank, Hong Kong, Chong Hing Bank Centre, 24 Des Voeux Road
Central, Hong Kong. I have a business proposal of  $38,980,369.00.

All confirmable documents to back up the claims will be made available
to you prior to your acceptance and as soon as I receive your return
mail.

Email me for more details:

Best Regards.







Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-25 Thread Jeff King
On Sat, Feb 24, 2018 at 04:18:36PM +0100, Lars Schneider wrote:

> > We always use the in-repo contents when generating 'diff'.  I think
> > by "attribute to be used in diff", what you are reallying after is
> > to convert the in-repo contents to that encoding _BEFORE_ running
> > 'diff' on it.  E.g. in-repo UTF-16 that can have NUL bytes all over
> > the place will not diff well with the xdiff machinery, but if you
> > first convert it to UTF-8 and have xdiff work on it, you can get
> > reasonable result out of it.  It is unclear what encoding you want
> > your final diff output in (it is equally valid in such a set-up to
> > desire your patch output in UTF-16 or UTF-8), but assuming that you
> > want UTF-8 in your patch output, perhaps we do not have to break
> > gitk users by hijacking the 'encoding' attribute.  Instead what you
> > want is a single bit that says between in-repo or working tree which
> > representation should be given to the xdiff machinery.
> 
> I fear that we could confuse users with an additional knob/bit that
> defines what we diff against. Git always diff'ed against in-repo 
> content and I feel it should stay that way.

Well, except for textconv. You can already do this:

  echo "foo diff=utf16" >.gitattributes
  git config diff.utf16.textconv 'iconv -f utf16 -t utf8'

We could make that easier to use and much more efficient by:

  1. Allowing a special syntax for textconv filters that kicks off an
 internal iconv.

  2. Providing baked-in config for utf16.

The patch below provides a sketch. But I think Torsten raised a good
point that you might want the encoding conversion to be independent of
other diff characteristics (so, e.g., you might say "this is utf16 but
once converted treat it like C code for finding funcnames, etc").

---
diff --git a/diff.c b/diff.c
index 21c3838b25..04032e059c 100644
--- a/diff.c
+++ b/diff.c
@@ -5968,6 +5968,21 @@ struct diff_filepair *diff_unmerge(struct diff_options 
*options, const char *pat
return pair;
 }
 
+static char *iconv_textconv(const char *encoding, struct diff_filespec *spec,
+   size_t *outsize)
+{
+   char *ret;
+   int outsize_int; /* this really should be a size_t */
+
+   if (diff_populate_filespec(spec, 0))
+   die("unable to load content for %s", spec->path);
+   ret = reencode_string_len(spec->data, spec->size,
+ "utf-8", /* should be log_output_encoding? */
+ encoding, _int);
+   *outsize = outsize_int;
+   return ret;
+}
+
 static char *run_textconv(const char *pgm, struct diff_filespec *spec,
size_t *outsize)
 {
@@ -5978,6 +5993,9 @@ static char *run_textconv(const char *pgm, struct 
diff_filespec *spec,
struct strbuf buf = STRBUF_INIT;
int err = 0;
 
+   if (skip_prefix(pgm, "iconv:", ))
+   return iconv_textconv(pgm, spec, outsize);
+
temp = prepare_temp_file(spec->path, spec);
*arg++ = pgm;
*arg++ = temp->name;
diff --git a/userdiff.c b/userdiff.c
index dbfb4e13cd..48fa7e8bdd 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -161,6 +161,7 @@ IPATTERN("css",
 "-?[_a-zA-Z][-_a-zA-Z0-9]*" /* identifiers */
 "|-?[0-9]+|\\#[0-9a-fA-F]+" /* numbers */
 ),
+{ "utf16", NULL, -1, { NULL, 0 }, NULL, "iconv:utf16" },
 { "default", NULL, -1, { NULL, 0 } },
 };
 #undef PATTERNS


Re: [PATCH] revision.c: reduce object database queries

2018-02-25 Thread Jeff King
On Sun, Feb 25, 2018 at 08:30:48PM -0500, Jeff King wrote:

> > diff --git a/revision.c b/revision.c
> > index 5ce9b93..bc7def5 100644
> > --- a/revision.c
> > +++ b/revision.c
> > @@ -113,7 +113,8 @@ void mark_parents_uninteresting(struct commit *commit)
> >  * it is popped next time around, we won't be trying
> >  * to parse it and get an error.
> >  */
> > -   if (!has_object_file(>object.oid))
> > +   if (!commit->object.parsed &&
> > +   !has_object_file(>object.oid))
> > commit->object.parsed = 1;
> 
> We don't actually need the object contents at all right here. This is
> just faking the "parsed" flag for later so that calls to parse_object()
> don't barf.
> 
> This code comes originally form 454fbbcde3 (git-rev-list: allow missing
> objects when the parent is marked UNINTERESTING, 2005-07-10). But later,
> in aeeae1b771 (revision traversal: allow UNINTERESTING objects to be
> missing, 2009-01-27), we marked dealt with calling parse_object() on the
> parents more directly.
> 
> So what I wonder is whether this code is simply redundant and can go
> away entirely. That would save the has_object_file() call in all cases.

There's a similar case for trees. In mark_tree_contents_uninteresting()
we do:

   if (!has_object_file(>oid))
return;
   if (parse_tree(tree) < 0)
die("bad tree %s", oid_to_hex(>oid));

which seems wasteful. Probably this could be:

  if (parse_tree_gently(tree, 1) < 0)
return; /* missing uninteresting trees ok */

though technically the existing code allows _missing_ trees, but
not on corrupt ones.

I guess this is perhaps less interesting, because we only mark trees
directly fed from the pending array, not every tree of commits that we
traverse. Though if you had a really gigantic tree, it might be
measurable.

-Peff


Re: [PATCH] revision.c: reduce object database queries

2018-02-25 Thread Jeff King
On Sat, Feb 24, 2018 at 08:34:56PM -0500, Derrick Stolee wrote:

> In mark_parents_uninteresting(), we check for the existence of an
> object file to see if we should treat a commit as parsed. The result
> is to set the "parsed" bit on the commit.
> 
> Modify the condition to only check has_object_file() if the result
> would change the parsed bit.
> 
> When a local branch is different from its upstream ref, "git status"
> will compute ahead/behind counts. This uses paint_down_to_common()
> and hits mark_parents_uninteresting(). On a copy of the Linux repo
> with a local instance of "master" behind the remote branch
> "origin/master" by ~60,000 commits, we find the performance of
> "git status" went from 1.42 seconds to 1.32 seconds, for a relative
> difference of -7.0%.

This looks like an obvious and easy optimization, and I'm OK with this
patch.

But looking at the existing code, it makes me wonder a few things:

> diff --git a/revision.c b/revision.c
> index 5ce9b93..bc7def5 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -113,7 +113,8 @@ void mark_parents_uninteresting(struct commit *commit)
>* it is popped next time around, we won't be trying
>* to parse it and get an error.
>*/
> - if (!has_object_file(>object.oid))
> + if (!commit->object.parsed &&
> + !has_object_file(>object.oid))
>   commit->object.parsed = 1;

We don't actually need the object contents at all right here. This is
just faking the "parsed" flag for later so that calls to parse_object()
don't barf.

This code comes originally form 454fbbcde3 (git-rev-list: allow missing
objects when the parent is marked UNINTERESTING, 2005-07-10). But later,
in aeeae1b771 (revision traversal: allow UNINTERESTING objects to be
missing, 2009-01-27), we marked dealt with calling parse_object() on the
parents more directly.

So what I wonder is whether this code is simply redundant and can go
away entirely. That would save the has_object_file() call in all cases.
We'd stop setting the fake commit->object.parsed flag, which in theory
means we'd make repeated failed calls to parse_commit_gently() if we
visited the same commit over and over. I wonder if add_parents_to_list()
should skip already-UNINTERESTING parents, which would mean we only try
to access each missing candidate once (OTOH missing ones are probably
rare enough that it doesn't make much difference either way).

Probably the fake commit->object.parsed flag isn't hurting anything in
practice, but it seems pretty nasty to lie about having loaded the
commit in the global store. E.g., imagine a follow-up traversal in the
same process in which that commit _isn't_ UNINTERESTING, and we'd
erroneously treat it as an available commit with no parents.

-Peff


[PATCH v2 33/36] sha1_file: convert read_sha1_file to struct object_id

2018-02-25 Thread brian m. carlson
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file.  Do the same for read_sha1_file_extended.

Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id.  Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(, E2, E3)

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(, E2, E3, E4)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)

Signed-off-by: brian m. carlson 
---
 apply.c  |  4 ++--
 archive.c|  2 +-
 bisect.c |  3 ++-
 blame.c  |  8 
 builtin/cat-file.c   | 14 --
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  4 ++--
 builtin/fmt-merge-msg.c  |  2 +-
 builtin/grep.c   |  2 +-
 builtin/index-pack.c |  5 +++--
 builtin/log.c|  2 +-
 builtin/merge-tree.c |  5 +++--
 builtin/mktag.c  |  2 +-
 builtin/notes.c  |  6 +++---
 builtin/pack-objects.c   | 16 ++--
 builtin/reflog.c |  2 +-
 builtin/tag.c|  4 ++--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c |  2 +-
 builtin/verify-commit.c  |  2 +-
 bundle.c |  2 +-
 cache.h  | 10 +-
 combine-diff.c   |  2 +-
 commit.c |  6 +++---
 config.c |  2 +-
 diff.c   |  2 +-
 dir.c|  2 +-
 entry.c  |  2 +-
 fast-import.c|  6 +++---
 fsck.c   |  2 +-
 grep.c   |  2 +-
 http-push.c  |  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 ++--
 merge-blobs.c|  4 ++--
 merge-recursive.c|  4 ++--
 notes-cache.c|  2 +-
 notes-merge.c|  2 +-
 notes.c  |  8 
 object.c |  2 +-
 read-cache.c |  4 ++--
 ref-filter.c |  2 +-
 remote-testsvn.c |  4 ++--
 rerere.c |  4 ++--
 sha1_file.c  | 20 ++--
 streaming.c  |  2 +-
 submodule-config.c   |  2 +-
 tag.c|  4 ++--
 tree-walk.c  |  4 ++--
 tree.c   |  2 +-
 xdiff-interface.c|  2 +-
 51 files changed, 104 insertions(+), 103 deletions(-)

diff --git a/apply.c b/apply.c
index 40a368b315..dd3cf99b7c 100644
--- a/apply.c
+++ b/apply.c
@@ -3180,7 +3180,7 @@ static int apply_binary(struct apply_state *state,
unsigned long size;
char *result;
 
-   result = read_sha1_file(oid.hash, , );
+   result = read_object_file(, , );
if (!result)
return error(_("the necessary postimage %s for "
   "'%s' cannot be read"),
@@ -3242,7 +3242,7 @@ static int read_blob_object(struct strbuf *buf, const 
struct object_id *oid, uns
unsigned long sz;
char *result;
 
-   result = read_sha1_file(oid->hash, , );
+   result = read_object_file(oid, , );
if (!result)
return -1;
/* XXX read_sha1_file NUL-terminates */
diff --git a/archive.c b/archive.c
index 1ab62d426e..93ab175b0b 100644
--- a/archive.c
+++ b/archive.c
@@ -72,7 +72,7 @@ void *object_file_to_archive(const struct archiver_args *args,
const struct commit *commit = args->convert ? args->commit : NULL;
 
path += args->baselen;
-   buffer = read_sha1_file(oid->hash, type, sizep);
+   buffer = read_object_file(oid, type, sizep);
if (buffer && S_ISREG(mode)) {
struct strbuf buf = STRBUF_INIT;
size_t size = 0;
diff --git a/bisect.c b/bisect.c
index f6d05bd66f..ad395bb2b8 100644
--- a/bisect.c
+++ b/bisect.c
@@ -132,7 +132,8 @@ static void show_list(const char *debug, int counted, int 
nr,
unsigned flags = commit->object.flags;
enum object_type type;
unsigned long size;
-   char *buf = read_sha1_file(commit->object.oid.hash, , 
);
+   char *buf = read_object_file(>object.oid, ,
+);
const char *subject_start;
int subject_len;
 
diff --git a/blame.c b/blame.c
index 3f9bd29615..e5ed80083e 100644
--- a/blame.c
+++ b/blame.c
@@ -297,8 +297,8 @@ static void 

[PATCH v2 31/36] tree-walk: convert tree entry functions to object_id

2018-02-25 Thread brian m. carlson
Convert get_tree_entry and find_tree_entry to take pointers to struct
object_id.

Signed-off-by: brian m. carlson 
---
 archive.c  |  4 ++--
 blame.c|  6 ++
 builtin/rm.c   |  2 +-
 builtin/update-index.c |  2 +-
 line-log.c |  3 +--
 match-trees.c  |  6 +++---
 merge-recursive.c  | 12 ++--
 notes.c|  2 +-
 sha1_name.c|  7 +++
 tree-walk.c| 20 ++--
 tree-walk.h|  2 +-
 11 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/archive.c b/archive.c
index da62b2f541..1ab62d426e 100644
--- a/archive.c
+++ b/archive.c
@@ -397,8 +397,8 @@ static void parse_treeish_arg(const char **argv,
unsigned int mode;
int err;
 
-   err = get_tree_entry(tree->object.oid.hash, prefix,
-tree_oid.hash, );
+   err = get_tree_entry(>object.oid, prefix, _oid,
+);
if (err || !S_ISDIR(mode))
die("current working directory is untracked");
 
diff --git a/blame.c b/blame.c
index c1df10cdd2..3f9bd29615 100644
--- a/blame.c
+++ b/blame.c
@@ -80,7 +80,7 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
struct object_id blob_oid;
unsigned mode;
 
-   if (!get_tree_entry(commit_oid->hash, path, blob_oid.hash, 
) &&
+   if (!get_tree_entry(commit_oid, path, _oid, ) &&
oid_object_info(_oid, NULL) == OBJ_BLOB)
return;
}
@@ -502,9 +502,7 @@ static int fill_blob_sha1_and_mode(struct blame_origin 
*origin)
 {
if (!is_null_oid(>blob_oid))
return 0;
-   if (get_tree_entry(origin->commit->object.oid.hash,
-  origin->path,
-  origin->blob_oid.hash, >mode))
+   if (get_tree_entry(>commit->object.oid, origin->path, 
>blob_oid, >mode))
goto error_out;
if (oid_object_info(>blob_oid, NULL) != OBJ_BLOB)
goto error_out;
diff --git a/builtin/rm.c b/builtin/rm.c
index 4a2fcca27b..974a7ef5bf 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -178,7 +178,7 @@ static int check_local_mod(struct object_id *head, int 
index_only)
 * way as changed from the HEAD.
 */
if (no_head
-|| get_tree_entry(head->hash, name, oid.hash, )
+|| get_tree_entry(head, name, , )
 || ce->ce_mode != create_ce_mode(mode)
 || oidcmp(>oid, ))
staged_changes = 1;
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d282..9625d1e10a 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -592,7 +592,7 @@ static struct cache_entry *read_one_ent(const char *which,
int size;
struct cache_entry *ce;
 
-   if (get_tree_entry(ent->hash, path, oid.hash, )) {
+   if (get_tree_entry(ent, path, , )) {
if (which)
error("%s: not in %s branch.", path, which);
return NULL;
diff --git a/line-log.c b/line-log.c
index 545ad0f28b..700121eb92 100644
--- a/line-log.c
+++ b/line-log.c
@@ -501,8 +501,7 @@ static void fill_blob_sha1(struct commit *commit, struct 
diff_filespec *spec)
unsigned mode;
struct object_id oid;
 
-   if (get_tree_entry(commit->object.oid.hash, spec->path,
-  oid.hash, ))
+   if (get_tree_entry(>object.oid, spec->path, , ))
die("There is no path %s in the commit", spec->path);
fill_filespec(spec, , 1, mode);
 
diff --git a/match-trees.c b/match-trees.c
index 0ca99d5162..10d6c39d47 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -269,7 +269,7 @@ void shift_tree(const struct object_id *hash1,
if (!*del_prefix)
return;
 
-   if (get_tree_entry(hash2->hash, del_prefix, shifted->hash, 
))
+   if (get_tree_entry(hash2, del_prefix, shifted, ))
die("cannot find path %s in tree %s",
del_prefix, oid_to_hex(hash2));
return;
@@ -296,12 +296,12 @@ void shift_tree_by(const struct object_id *hash1,
unsigned candidate = 0;
 
/* Can hash2 be a tree at shift_prefix in tree hash1? */
-   if (!get_tree_entry(hash1->hash, shift_prefix, sub1.hash, ) &&
+   if (!get_tree_entry(hash1, shift_prefix, , ) &&
S_ISDIR(mode1))
candidate |= 1;
 
/* Can hash1 be a tree at shift_prefix in tree hash2? */
-   if (!get_tree_entry(hash2->hash, shift_prefix, sub2.hash, ) &&
+   if (!get_tree_entry(hash2, shift_prefix, , ) &&
S_ISDIR(mode2))
candidate |= 2;
 
diff --git 

[PATCH v2 25/36] Convert remaining callers of sha1_object_info_extended to object_id

2018-02-25 Thread brian m. carlson
Convert the remaining caller of sha1_object_info_extended to use struct
object_id.  Introduce temporaries, which will be removed later, since
there is a dependency loop between sha1_object_info_extended and
lookup_replace_object_extended.  This allows us to convert the code in a
piecemeal fashion instead of all at once.

Signed-off-by: brian m. carlson 
---
 sha1_file.c | 14 +++---
 streaming.c |  5 -
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index fa69d86309..19995766b6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1231,6 +1231,9 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
lookup_replace_object(sha1) :
sha1;
int already_retried = 0;
+   struct object_id realoid;
+
+   hashcpy(realoid.hash, real);
 
if (is_null_sha1(real))
return -1;
@@ -1295,7 +1298,7 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
rtype = packed_object_info(e.p, e.offset, oi);
if (rtype < 0) {
mark_bad_packed_object(e.p, real);
-   return sha1_object_info_extended(real, oi, 0);
+   return sha1_object_info_extended(realoid.hash, oi, 0);
} else if (oi->whence == OI_PACKED) {
oi->u.packed.offset = e.offset;
oi->u.packed.pack = e.p;
@@ -1323,13 +1326,16 @@ int sha1_object_info(const unsigned char *sha1, 
unsigned long *sizep)
 static void *read_object(const unsigned char *sha1, enum object_type *type,
 unsigned long *size)
 {
+   struct object_id oid;
struct object_info oi = OBJECT_INFO_INIT;
void *content;
oi.typep = type;
oi.sizep = size;
oi.contentp = 
 
-   if (sha1_object_info_extended(sha1, , 0) < 0)
+   hashcpy(oid.hash, sha1);
+
+   if (sha1_object_info_extended(oid.hash, , 0) < 0)
return NULL;
return content;
 }
@@ -1723,9 +1729,11 @@ int force_object_loose(const struct object_id *oid, 
time_t mtime)
 
 int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
+   struct object_id oid;
if (!startup_info->have_repository)
return 0;
-   return sha1_object_info_extended(sha1, NULL,
+   hashcpy(oid.hash, sha1);
+   return sha1_object_info_extended(oid.hash, NULL,
 flags | OBJECT_INFO_SKIP_CACHED) >= 0;
 }
 
diff --git a/streaming.c b/streaming.c
index be85507922..042d6082e8 100644
--- a/streaming.c
+++ b/streaming.c
@@ -111,10 +111,13 @@ static enum input_source istream_source(const unsigned 
char *sha1,
 {
unsigned long size;
int status;
+   struct object_id oid;
+
+   hashcpy(oid.hash, sha1);
 
oi->typep = type;
oi->sizep = 
-   status = sha1_object_info_extended(sha1, oi, 0);
+   status = sha1_object_info_extended(oid.hash, oi, 0);
if (status < 0)
return stream_error;
 


[PATCH v2 21/36] builtin/mktree: convert to struct object_id

2018-02-25 Thread brian m. carlson
Convert this file to use struct object_id.  Modify one use of
get_sha1_hex into parse_oid_hex; this is safe since we get the data from
a strbuf.

Signed-off-by: brian m. carlson 
---
 builtin/mktree.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 8dd9f52f77..d6ca19d835 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -10,13 +10,13 @@
 
 static struct treeent {
unsigned mode;
-   unsigned char sha1[20];
+   struct object_id oid;
int len;
char name[FLEX_ARRAY];
 } **entries;
 static int alloc, used;
 
-static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
+static void append_to_tree(unsigned mode, struct object_id *oid, char *path)
 {
struct treeent *ent;
size_t len = strlen(path);
@@ -26,7 +26,7 @@ static void append_to_tree(unsigned mode, unsigned char 
*sha1, char *path)
FLEX_ALLOC_MEM(ent, name, path, len);
ent->mode = mode;
ent->len = len;
-   hashcpy(ent->sha1, sha1);
+   oidcpy(>oid, oid);
 
ALLOC_GROW(entries, used + 1, alloc);
entries[used++] = ent;
@@ -54,7 +54,7 @@ static void write_tree(struct object_id *oid)
for (i = 0; i < used; i++) {
struct treeent *ent = entries[i];
strbuf_addf(, "%o %s%c", ent->mode, ent->name, '\0');
-   strbuf_add(, ent->sha1, 20);
+   strbuf_add(, ent->oid.hash, the_hash_algo->rawsz);
}
 
write_object_file(buf.buf, buf.len, tree_type, oid);
@@ -69,11 +69,12 @@ static const char *mktree_usage[] = {
 static void mktree_line(char *buf, size_t len, int nul_term_line, int 
allow_missing)
 {
char *ptr, *ntr;
+   const char *p;
unsigned mode;
enum object_type mode_type; /* object type derived from mode */
enum object_type obj_type; /* object type derived from sha */
char *path, *to_free = NULL;
-   unsigned char sha1[20];
+   struct object_id oid;
 
ptr = buf;
/*
@@ -85,9 +86,8 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
die("input format error: %s", buf);
ptr = ntr + 1; /* type */
ntr = strchr(ptr, ' ');
-   if (!ntr || buf + len <= ntr + 40 ||
-   ntr[41] != '\t' ||
-   get_sha1_hex(ntr + 1, sha1))
+   if (!ntr || parse_oid_hex(ntr + 1, , ) ||
+   *p != '\t')
die("input format error: %s", buf);
 
/* It is perfectly normal if we do not have a commit from a submodule */
@@ -116,12 +116,12 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
}
 
/* Check the type of object identified by sha1 */
-   obj_type = sha1_object_info(sha1, NULL);
+   obj_type = sha1_object_info(oid.hash, NULL);
if (obj_type < 0) {
if (allow_missing) {
; /* no problem - missing objects are presumed to be of 
the right type */
} else {
-   die("entry '%s' object %s is unavailable", path, 
sha1_to_hex(sha1));
+   die("entry '%s' object %s is unavailable", path, 
oid_to_hex());
}
} else {
if (obj_type != mode_type) {
@@ -131,11 +131,11 @@ static void mktree_line(char *buf, size_t len, int 
nul_term_line, int allow_miss
 * because the new tree entry will never be correct.
 */
die("entry '%s' object %s is a %s but specified type 
was (%s)",
-   path, sha1_to_hex(sha1), typename(obj_type), 
typename(mode_type));
+   path, oid_to_hex(), typename(obj_type), 
typename(mode_type));
}
}
 
-   append_to_tree(mode, sha1, path);
+   append_to_tree(mode, , path);
free(to_free);
 }
 


[PATCH v2 05/36] resolve-undo: convert struct resolve_undo_info to object_id

2018-02-25 Thread brian m. carlson
Convert the sha1 member of this struct to be an array of struct
object_id instead.  This change is needed to convert find_unique_abbrev.

Signed-off-by: brian m. carlson 
---
 builtin/ls-files.c | 2 +-
 resolve-undo.c | 8 
 resolve-undo.h | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 2fc836e330..9df66ba307 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -271,7 +271,7 @@ static void show_ru_info(const struct index_state *istate)
if (!ui->mode[i])
continue;
printf("%s%06o %s %d\t", tag_resolve_undo, ui->mode[i],
-  find_unique_abbrev(ui->sha1[i], abbrev),
+  find_unique_abbrev(ui->oid[i].hash, abbrev),
   i + 1);
write_name(path);
}
diff --git a/resolve-undo.c b/resolve-undo.c
index b40f3173d3..109c658a85 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -24,7 +24,7 @@ void record_resolve_undo(struct index_state *istate, struct 
cache_entry *ce)
if (!lost->util)
lost->util = xcalloc(1, sizeof(*ui));
ui = lost->util;
-   hashcpy(ui->sha1[stage - 1], ce->oid.hash);
+   oidcpy(>oid[stage - 1], >oid);
ui->mode[stage - 1] = ce->ce_mode;
 }
 
@@ -44,7 +44,7 @@ void resolve_undo_write(struct strbuf *sb, struct string_list 
*resolve_undo)
for (i = 0; i < 3; i++) {
if (!ui->mode[i])
continue;
-   strbuf_add(sb, ui->sha1[i], 20);
+   strbuf_add(sb, ui->oid[i].hash, the_hash_algo->rawsz);
}
}
 }
@@ -89,7 +89,7 @@ struct string_list *resolve_undo_read(const char *data, 
unsigned long size)
continue;
if (size < 20)
goto error;
-   hashcpy(ui->sha1[i], (const unsigned char *)data);
+   hashcpy(ui->oid[i].hash, (const unsigned char *)data);
size -= 20;
data += 20;
}
@@ -145,7 +145,7 @@ int unmerge_index_entry_at(struct index_state *istate, int 
pos)
struct cache_entry *nce;
if (!ru->mode[i])
continue;
-   nce = make_cache_entry(ru->mode[i], ru->sha1[i],
+   nce = make_cache_entry(ru->mode[i], ru->oid[i].hash,
   name, i + 1, 0);
if (matched)
nce->ce_flags |= CE_MATCHED;
diff --git a/resolve-undo.h b/resolve-undo.h
index 46306455ed..87291904bd 100644
--- a/resolve-undo.h
+++ b/resolve-undo.h
@@ -3,7 +3,7 @@
 
 struct resolve_undo_info {
unsigned int mode[3];
-   unsigned char sha1[3][20];
+   struct object_id oid[3];
 };
 
 extern void record_resolve_undo(struct index_state *, struct cache_entry *);


[PATCH v2 28/36] builtin/notes: convert static functions to object_id

2018-02-25 Thread brian m. carlson
Convert the remaining static functions to take pointers to struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/notes.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 39304ba743..08ab9d7130 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -118,11 +118,11 @@ static int list_each_note(const struct object_id 
*object_oid,
return 0;
 }
 
-static void copy_obj_to_fd(int fd, const unsigned char *sha1)
+static void copy_obj_to_fd(int fd, const struct object_id *oid)
 {
unsigned long size;
enum object_type type;
-   char *buf = read_sha1_file(sha1, , );
+   char *buf = read_sha1_file(oid->hash, , );
if (buf) {
if (size)
write_or_die(fd, buf, size);
@@ -162,7 +162,7 @@ static void write_commented_object(int fd, const struct 
object_id *object)
 }
 
 static void prepare_note_data(const struct object_id *object, struct note_data 
*d,
-   const unsigned char *old_note)
+   const struct object_id *old_note)
 {
if (d->use_editor || !d->given) {
int fd;
@@ -457,7 +457,7 @@ static int add(int argc, const char **argv, const char 
*prefix)
oid_to_hex());
}
 
-   prepare_note_data(, , note ? note->hash : NULL);
+   prepare_note_data(, , note);
if (d.buf.len || allow_empty) {
write_note_data(, _note);
if (add_note(t, , _note, combine_notes_overwrite))
@@ -602,7 +602,7 @@ static int append_edit(int argc, const char **argv, const 
char *prefix)
t = init_notes_check(argv[0], NOTES_INIT_WRITABLE);
note = get_note(t, );
 
-   prepare_note_data(, , edit && note ? note->hash : NULL);
+   prepare_note_data(, , edit && note ? note : NULL);
 
if (note && !edit) {
/* Append buf to previous note contents */


[PATCH v2 19/36] sha1_file: convert check_sha1_signature to struct object_id

2018-02-25 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it check_object_signature.  Introduce temporaries to convert the return
values of lookup_replace_object and lookup_replace_object_extended into
struct object_id.

The temporaries are needed because in order to convert
lookup_replace_object, open_istream needs to be converted, and
open_istream needs check_sha1_signature to be converted, causing a loop
of dependencies.  The temporaries will be removed in a future patch.

Signed-off-by: brian m. carlson 
---
 builtin/fast-export.c |  2 +-
 builtin/index-pack.c  |  2 +-
 builtin/mktag.c   |  5 -
 cache.h   |  2 +-
 object.c  | 10 --
 pack-check.c  |  4 ++--
 sha1_file.c   | 12 ++--
 7 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 796d0cd66c..293a6615fa 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -240,7 +240,7 @@ static void export_blob(const struct object_id *oid)
buf = read_sha1_file(oid->hash, , );
if (!buf)
die ("Could not read blob %s", oid_to_hex(oid));
-   if (check_sha1_signature(oid->hash, buf, size, typename(type)) 
< 0)
+   if (check_object_signature(oid, buf, size, typename(type)) < 0)
die("sha1 mismatch in blob %s", oid_to_hex(oid));
object = parse_object_buffer(oid, type, size, buf, );
}
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 5c7ab47c36..e0a776f1ac 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1377,7 +1377,7 @@ static void fix_unresolved_deltas(struct hashfile *f)
if (!base_obj->data)
continue;
 
-   if (check_sha1_signature(d->oid.hash, base_obj->data,
+   if (check_object_signature(>oid, base_obj->data,
base_obj->size, typename(type)))
die(_("local object %s is corrupt"), 
oid_to_hex(>oid));
base_obj->obj = append_obj_to_pack(f, d->oid.hash,
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 65bb41e3cd..810b24bef3 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -27,8 +27,11 @@ static int verify_object(const struct object_id *oid, const 
char *expected_type)
const unsigned char *repl = lookup_replace_object(oid->hash);
 
if (buffer) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
if (type == type_from_string(expected_type))
-   ret = check_sha1_signature(repl, buffer, size, 
expected_type);
+   ret = check_object_signature(, buffer, size, 
expected_type);
free(buffer);
}
return ret;
diff --git a/cache.h b/cache.h
index 8a9055f4e7..f29ff43bbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1236,7 +1236,7 @@ extern void *map_sha1_file(const unsigned char *sha1, 
unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
-extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned 
long size, const char *type);
+extern int check_object_signature(const struct object_id *oid, void *buf, 
unsigned long size, const char *type);
 
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
diff --git a/object.c b/object.c
index 9e6f9ff20b..c63f02a40f 100644
--- a/object.c
+++ b/object.c
@@ -255,7 +255,10 @@ struct object *parse_object(const struct object_id *oid)
if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
(!obj && has_object_file(oid) &&
 sha1_object_info(oid->hash, NULL) == OBJ_BLOB)) {
-   if (check_sha1_signature(repl, NULL, 0, NULL) < 0) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
+   if (check_object_signature(, NULL, 0, NULL) < 0) {
error("sha1 mismatch %s", oid_to_hex(oid));
return NULL;
}
@@ -265,7 +268,10 @@ struct object *parse_object(const struct object_id *oid)
 
buffer = read_sha1_file(oid->hash, , );
if (buffer) {
-   if (check_sha1_signature(repl, buffer, size, typename(type)) < 
0) {
+   struct object_id reploid;
+   hashcpy(reploid.hash, repl);
+
+   if (check_object_signature(, buffer, size, 
typename(type)) < 0) {
free(buffer);
error("sha1 mismatch %s", sha1_to_hex(repl));
return NULL;
diff --git a/pack-check.c b/pack-check.c
index 403a572567..80ef7c1c63 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -126,7 +126,7 @@ 

[PATCH v2 20/36] streaming: convert open_istream to use struct object_id

2018-02-25 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 archive-tar.c  | 2 +-
 archive-zip.c  | 2 +-
 builtin/index-pack.c   | 2 +-
 builtin/pack-objects.c | 2 +-
 sha1_file.c| 2 +-
 streaming.c| 6 +++---
 streaming.h| 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index fd622eacc0..7a0d31d847 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -119,7 +119,7 @@ static int stream_blocked(const struct object_id *oid)
char buf[BLOCKSIZE];
ssize_t readlen;
 
-   st = open_istream(oid->hash, , , NULL);
+   st = open_istream(oid, , , NULL);
if (!st)
return error("cannot stream blob %s", oid_to_hex(oid));
for (;;) {
diff --git a/archive-zip.c b/archive-zip.c
index 5841a6ceb6..18b951b732 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -337,7 +337,7 @@ static int write_zip_entry(struct archiver_args *args,
 
if (S_ISREG(mode) && type == OBJ_BLOB && !args->convert &&
size > big_file_threshold) {
-   stream = open_istream(oid->hash, , , NULL);
+   stream = open_istream(oid, , , NULL);
if (!stream)
return error("cannot stream blob %s",
 oid_to_hex(oid));
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index e0a776f1ac..a0ca525e99 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -771,7 +771,7 @@ static int check_collison(struct object_entry *entry)
 
memset(, 0, sizeof(data));
data.entry = entry;
-   data.st = open_istream(entry->idx.oid.hash, , , NULL);
+   data.st = open_istream(>idx.oid, , , NULL);
if (!data.st)
return -1;
if (size != entry->size || type != entry->type)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..ef63aa5878 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -267,7 +267,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
if (!usable_delta) {
if (entry->type == OBJ_BLOB &&
entry->size > big_file_threshold &&
-   (st = open_istream(entry->idx.oid.hash, , , 
NULL)) != NULL)
+   (st = open_istream(>idx.oid, , , NULL)) != 
NULL)
buf = NULL;
else {
buf = read_sha1_file(entry->idx.oid.hash, ,
diff --git a/sha1_file.c b/sha1_file.c
index 64f0905799..5dc85122c3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -799,7 +799,7 @@ int check_object_signature(const struct object_id *oid, 
void *map,
return oidcmp(oid, _oid) ? -1 : 0;
}
 
-   st = open_istream(oid->hash, _type, , NULL);
+   st = open_istream(oid, _type, , NULL);
if (!st)
return -1;
 
diff --git a/streaming.c b/streaming.c
index 5892b50bd8..be85507922 100644
--- a/streaming.c
+++ b/streaming.c
@@ -130,14 +130,14 @@ static enum input_source istream_source(const unsigned 
char *sha1,
}
 }
 
-struct git_istream *open_istream(const unsigned char *sha1,
+struct git_istream *open_istream(const struct object_id *oid,
 enum object_type *type,
 unsigned long *size,
 struct stream_filter *filter)
 {
struct git_istream *st;
struct object_info oi = OBJECT_INFO_INIT;
-   const unsigned char *real = lookup_replace_object(sha1);
+   const unsigned char *real = lookup_replace_object(oid->hash);
enum input_source src = istream_source(real, type, );
 
if (src < 0)
@@ -507,7 +507,7 @@ int stream_blob_to_fd(int fd, const struct object_id *oid, 
struct stream_filter
ssize_t kept = 0;
int result = -1;
 
-   st = open_istream(oid->hash, , , filter);
+   st = open_istream(oid, , , filter);
if (!st) {
if (filter)
free_stream_filter(filter);
diff --git a/streaming.h b/streaming.h
index 73c1d156b3..32f4626771 100644
--- a/streaming.h
+++ b/streaming.h
@@ -8,7 +8,7 @@
 /* opaque */
 struct git_istream;
 
-extern struct git_istream *open_istream(const unsigned char *, enum 
object_type *, unsigned long *, struct stream_filter *);
+extern struct git_istream *open_istream(const struct object_id *, enum 
object_type *, unsigned long *, struct stream_filter *);
 extern int close_istream(struct git_istream *);
 extern ssize_t read_istream(struct git_istream *, void *, size_t);
 


[PATCH v2 07/36] ref-filter: convert grab_objectname to struct object_id

2018-02-25 Thread brian m. carlson
This is necessary in order to convert find_unique_abbrev.

Signed-off-by: brian m. carlson 
---
 ref-filter.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f9e25aea7a..9cbd92611c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -737,18 +737,18 @@ static void *get_obj(const struct object_id *oid, struct 
object **obj, unsigned
return buf;
 }
 
-static int grab_objectname(const char *name, const unsigned char *sha1,
+static int grab_objectname(const char *name, const struct object_id *oid,
   struct atom_value *v, struct used_atom *atom)
 {
if (starts_with(name, "objectname")) {
if (atom->u.objectname.option == O_SHORT) {
-   v->s = xstrdup(find_unique_abbrev(sha1, 
DEFAULT_ABBREV));
+   v->s = xstrdup(find_unique_abbrev(oid->hash, 
DEFAULT_ABBREV));
return 1;
} else if (atom->u.objectname.option == O_FULL) {
-   v->s = xstrdup(sha1_to_hex(sha1));
+   v->s = xstrdup(oid_to_hex(oid));
return 1;
} else if (atom->u.objectname.option == O_LENGTH) {
-   v->s = xstrdup(find_unique_abbrev(sha1, 
atom->u.objectname.length));
+   v->s = xstrdup(find_unique_abbrev(oid->hash, 
atom->u.objectname.length));
return 1;
} else
die("BUG: unknown %%(objectname) option");
@@ -775,7 +775,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
v->s = xstrfmt("%lu", sz);
}
else if (deref)
-   grab_objectname(name, obj->oid.hash, v, _atom[i]);
+   grab_objectname(name, >oid, v, _atom[i]);
}
 }
 
@@ -1439,7 +1439,7 @@ static void populate_value(struct ref_array_item *ref)
v->s = xstrdup(buf + 1);
}
continue;
-   } else if (!deref && grab_objectname(name, 
ref->objectname.hash, v, atom)) {
+   } else if (!deref && grab_objectname(name, >objectname, v, 
atom)) {
continue;
} else if (!strcmp(name, "HEAD")) {
if (atom->u.head && !strcmp(ref->refname, atom->u.head))


[PATCH v2 26/36] sha1_file: convert sha1_object_info* to object_id

2018-02-25 Thread brian m. carlson
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names.  Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:

@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(, E2)

@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)

@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(, E2, E3)

@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
---
 archive-tar.c|  2 +-
 archive-zip.c|  2 +-
 blame.c  |  4 ++--
 builtin/cat-file.c   | 14 +++---
 builtin/fast-export.c|  2 +-
 builtin/fetch.c  |  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/ls-tree.c|  2 +-
 builtin/mktree.c |  2 +-
 builtin/pack-objects.c   |  7 +++
 builtin/prune.c  |  2 +-
 builtin/replace.c| 10 +-
 builtin/tag.c|  4 ++--
 builtin/unpack-objects.c |  2 +-
 cache.h  |  6 +++---
 diff.c   |  2 +-
 fast-import.c| 10 +-
 list-objects-filter.c|  2 +-
 object.c |  2 +-
 pack-bitmap-write.c  |  3 +--
 packfile.c   |  4 ++--
 reachable.c  |  2 +-
 refs.c   |  2 +-
 remote.c |  2 +-
 sequencer.c  |  3 ++-
 sha1_file.c  | 22 +++---
 sha1_name.c  | 12 ++--
 streaming.c  |  2 +-
 submodule.c  |  2 +-
 tag.c|  2 +-
 31 files changed, 69 insertions(+), 70 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 7a0d31d847..3563bcb9f2 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -276,7 +276,7 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path, pathlen);
 
if (S_ISREG(mode) && !args->convert &&
-   sha1_object_info(oid->hash, ) == OBJ_BLOB &&
+   oid_object_info(oid, ) == OBJ_BLOB &&
size > big_file_threshold)
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
diff --git a/archive-zip.c b/archive-zip.c
index 18b951b732..6b20bce4d1 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -325,7 +325,7 @@ static int write_zip_entry(struct archiver_args *args,
compressed_size = 0;
buffer = NULL;
} else if (S_ISREG(mode) || S_ISLNK(mode)) {
-   enum object_type type = sha1_object_info(oid->hash, );
+   enum object_type type = oid_object_info(oid, );
 
method = 0;
attr2 = S_ISLNK(mode) ? ((mode | 0777) << 16) :
diff --git a/blame.c b/blame.c
index 1fc22b304b..c1df10cdd2 100644
--- a/blame.c
+++ b/blame.c
@@ -81,7 +81,7 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
unsigned mode;
 
if (!get_tree_entry(commit_oid->hash, path, blob_oid.hash, 
) &&
-   sha1_object_info(blob_oid.hash, NULL) == OBJ_BLOB)
+   oid_object_info(_oid, NULL) == OBJ_BLOB)
return;
}
 
@@ -506,7 +506,7 @@ static int fill_blob_sha1_and_mode(struct blame_origin 
*origin)
   origin->path,
   origin->blob_oid.hash, >mode))
goto error_out;
-   if (sha1_object_info(origin->blob_oid.hash, NULL) != OBJ_BLOB)
+   if (oid_object_info(>blob_oid, NULL) != OBJ_BLOB)
goto error_out;
return 0;
  error_out:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index cf9ea5c796..3264bada39 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -77,7 +77,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
switch (opt) {
case 't':
oi.typename = 
-   if (sha1_object_info_extended(oid.hash, , flags) < 0)
+   if (oid_object_info_extended(, , flags) < 0)
die("git cat-file: could not get object info");
if (sb.len) {
printf("%s\n", sb.buf);
@@ -88,7 +88,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
 
case 's':
oi.sizep = 
-   if (sha1_object_info_extended(oid.hash, , flags) < 0)
+   if (oid_object_info_extended(, , flags) < 0)
die("git cat-file: could not get object info");
printf("%lu\n", size);
return 0;
@@ -116,7 +116,7 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
/* else fallthrough */
 
case 'p':
-   type = 

[PATCH v2 16/36] archive: convert sha1_file_to_archive to struct object_id

2018-02-25 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it object_file_to_archive.

Signed-off-by: brian m. carlson 
---
 archive-tar.c |  2 +-
 archive-zip.c |  4 ++--
 archive.c | 10 +-
 archive.h |  8 
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 24b1ccef3a..fd622eacc0 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -281,7 +281,7 @@ static int write_tar_entry(struct archiver_args *args,
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
enum object_type type;
-   buffer = sha1_file_to_archive(args, path, oid->hash, old_mode, 
, );
+   buffer = object_file_to_archive(args, path, oid, old_mode, 
, );
if (!buffer)
return error("cannot read %s", oid_to_hex(oid));
} else {
diff --git a/archive-zip.c b/archive-zip.c
index e2e5513c03..5841a6ceb6 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -344,8 +344,8 @@ static int write_zip_entry(struct archiver_args *args,
flags |= ZIP_STREAM;
out = buffer = NULL;
} else {
-   buffer = sha1_file_to_archive(args, path, oid->hash, 
mode,
- , );
+   buffer = object_file_to_archive(args, path, oid, mode,
+   , );
if (!buffer)
return error("cannot read %s",
 oid_to_hex(oid));
diff --git a/archive.c b/archive.c
index 4942b5632b..da62b2f541 100644
--- a/archive.c
+++ b/archive.c
@@ -63,16 +63,16 @@ static void format_subst(const struct commit *commit,
free(to_free);
 }
 
-void *sha1_file_to_archive(const struct archiver_args *args,
-  const char *path, const unsigned char *sha1,
-  unsigned int mode, enum object_type *type,
-  unsigned long *sizep)
+void *object_file_to_archive(const struct archiver_args *args,
+const char *path, const struct object_id *oid,
+unsigned int mode, enum object_type *type,
+unsigned long *sizep)
 {
void *buffer;
const struct commit *commit = args->convert ? args->commit : NULL;
 
path += args->baselen;
-   buffer = read_sha1_file(sha1, type, sizep);
+   buffer = read_sha1_file(oid->hash, type, sizep);
if (buffer && S_ISREG(mode)) {
struct strbuf buf = STRBUF_INIT;
size_t size = 0;
diff --git a/archive.h b/archive.h
index 741991bfb6..1f9954f7cd 100644
--- a/archive.h
+++ b/archive.h
@@ -39,9 +39,9 @@ extern int write_archive_entries(struct archiver_args *args, 
write_archive_entry
 extern int write_archive(int argc, const char **argv, const char *prefix, 
const char *name_hint, int remote);
 
 const char *archive_format_from_filename(const char *filename);
-extern void *sha1_file_to_archive(const struct archiver_args *args,
- const char *path, const unsigned char *sha1,
- unsigned int mode, enum object_type *type,
- unsigned long *sizep);
+extern void *object_file_to_archive(const struct archiver_args *args,
+   const char *path, const struct object_id 
*oid,
+   unsigned int mode, enum object_type *type,
+   unsigned long *sizep);
 
 #endif /* ARCHIVE_H */


[PATCH v2 27/36] builtin/fmt-merge-msg: convert remaining code to object_id

2018-02-25 Thread brian m. carlson
We were using the util pointer, which is a pointer to void, as an
unsigned char pointer.  The pointer actually points to a struct
origin_data, which has a struct object_id as its first member, which in
turn has an unsigned char array as its first member, so this was valid.
Since we want to convert this to struct object_id, simply change the
pointer we're using.

Signed-off-by: brian m. carlson 
---
 builtin/fmt-merge-msg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 8e8a15ea4a..0bc0dda99c 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -485,10 +485,10 @@ static void fmt_merge_msg_sigs(struct strbuf *out)
struct strbuf tagbuf = STRBUF_INIT;
 
for (i = 0; i < origins.nr; i++) {
-   unsigned char *sha1 = origins.items[i].util;
+   struct object_id *oid = origins.items[i].util;
enum object_type type;
unsigned long size, len;
-   char *buf = read_sha1_file(sha1, , );
+   char *buf = read_sha1_file(oid->hash, , );
struct strbuf sig = STRBUF_INIT;
 
if (!buf || type != OBJ_TAG)


[PATCH v2 09/36] wt-status: convert struct wt_status_state to object_id

2018-02-25 Thread brian m. carlson
Convert the various *_sha1 members to use struct object_id instead.

Signed-off-by: brian m. carlson 
---
 wt-status.c | 12 ++--
 wt-status.h |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 98e558a70b..698fa1d42a 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1349,7 +1349,7 @@ static void show_cherry_pick_in_progress(struct wt_status 
*s,
const char *color)
 {
status_printf_ln(s, color, _("You are currently cherry-picking commit 
%s."),
-   find_unique_abbrev(state->cherry_pick_head_sha1, 
DEFAULT_ABBREV));
+   find_unique_abbrev(state->cherry_pick_head_oid.hash, 
DEFAULT_ABBREV));
if (s->hints) {
if (has_unmerged(s))
status_printf_ln(s, color,
@@ -1368,7 +1368,7 @@ static void show_revert_in_progress(struct wt_status *s,
const char *color)
 {
status_printf_ln(s, color, _("You are currently reverting commit %s."),
-find_unique_abbrev(state->revert_head_sha1, 
DEFAULT_ABBREV));
+find_unique_abbrev(state->revert_head_oid.hash, 
DEFAULT_ABBREV));
if (s->hints) {
if (has_unmerged(s))
status_printf_ln(s, color,
@@ -1489,9 +1489,9 @@ static void wt_status_get_detached_from(struct 
wt_status_state *state)
} else
state->detached_from =
xstrdup(find_unique_abbrev(cb.noid.hash, 
DEFAULT_ABBREV));
-   hashcpy(state->detached_sha1, cb.noid.hash);
+   oidcpy(>detached_oid, );
state->detached_at = !get_oid("HEAD", ) &&
-!hashcmp(oid.hash, state->detached_sha1);
+!oidcmp(, >detached_oid);
 
free(ref);
strbuf_release();
@@ -1550,13 +1550,13 @@ void wt_status_get_state(struct wt_status_state *state,
} else if (!stat(git_path_cherry_pick_head(), ) &&
!get_oid("CHERRY_PICK_HEAD", )) {
state->cherry_pick_in_progress = 1;
-   hashcpy(state->cherry_pick_head_sha1, oid.hash);
+   oidcpy(>cherry_pick_head_oid, );
}
wt_status_check_bisect(NULL, state);
if (!stat(git_path_revert_head(), ) &&
!get_oid("REVERT_HEAD", )) {
state->revert_in_progress = 1;
-   hashcpy(state->revert_head_sha1, oid.hash);
+   oidcpy(>revert_head_oid, );
}
 
if (get_detached_from)
diff --git a/wt-status.h b/wt-status.h
index 3f84d5c29f..c7ef173284 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -116,9 +116,9 @@ struct wt_status_state {
char *branch;
char *onto;
char *detached_from;
-   unsigned char detached_sha1[20];
-   unsigned char revert_head_sha1[20];
-   unsigned char cherry_pick_head_sha1[20];
+   struct object_id detached_oid;
+   struct object_id revert_head_oid;
+   struct object_id cherry_pick_head_oid;
 };
 
 size_t wt_status_locate_end(const char *s, size_t len);


[PATCH v2 11/36] http-walker: convert struct object_request to use struct object_id

2018-02-25 Thread brian m. carlson
Convert struct object_request to use struct object_id by updating the
definition and applying the following semantic patch, plus the standard
object_id transforms:

@@
struct object_request E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct object_request *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson 
---
 http-walker.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 07c2b1af82..f506f394ac 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -22,7 +22,7 @@ enum object_request_state {
 
 struct object_request {
struct walker *walker;
-   unsigned char sha1[20];
+   struct object_id oid;
struct alt_base *repo;
enum object_request_state state;
struct http_object_request *req;
@@ -56,7 +56,7 @@ static void start_object_request(struct walker *walker,
struct active_request_slot *slot;
struct http_object_request *req;
 
-   req = new_http_object_request(obj_req->repo->base, obj_req->sha1);
+   req = new_http_object_request(obj_req->repo->base, obj_req->oid.hash);
if (req == NULL) {
obj_req->state = ABORTED;
return;
@@ -82,7 +82,7 @@ static void finish_object_request(struct object_request 
*obj_req)
return;
 
if (obj_req->req->rename == 0)
-   walker_say(obj_req->walker, "got %s\n", 
sha1_to_hex(obj_req->sha1));
+   walker_say(obj_req->walker, "got %s\n", 
oid_to_hex(_req->oid));
 }
 
 static void process_object_response(void *callback_data)
@@ -129,7 +129,7 @@ static int fill_active_slot(struct walker *walker)
list_for_each_safe(pos, tmp, head) {
obj_req = list_entry(pos, struct object_request, node);
if (obj_req->state == WAITING) {
-   if (has_sha1_file(obj_req->sha1))
+   if (has_sha1_file(obj_req->oid.hash))
obj_req->state = COMPLETE;
else {
start_object_request(walker, obj_req);
@@ -148,7 +148,7 @@ static void prefetch(struct walker *walker, unsigned char 
*sha1)
 
newreq = xmalloc(sizeof(*newreq));
newreq->walker = walker;
-   hashcpy(newreq->sha1, sha1);
+   hashcpy(newreq->oid.hash, sha1);
newreq->repo = data->alt;
newreq->state = WAITING;
newreq->req = NULL;
@@ -481,13 +481,13 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
 
list_for_each(pos, head) {
obj_req = list_entry(pos, struct object_request, node);
-   if (!hashcmp(obj_req->sha1, sha1))
+   if (!hashcmp(obj_req->oid.hash, sha1))
break;
}
if (obj_req == NULL)
return error("Couldn't find request for %s in the queue", hex);
 
-   if (has_sha1_file(obj_req->sha1)) {
+   if (has_sha1_file(obj_req->oid.hash)) {
if (obj_req->req != NULL)
abort_http_object_request(obj_req->req);
abort_object_request(obj_req);
@@ -541,7 +541,7 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
} else if (req->zret != Z_STREAM_END) {
walker->corrupt_object_found++;
ret = error("File %s (%s) corrupt", hex, req->url);
-   } else if (hashcmp(obj_req->sha1, req->real_sha1)) {
+   } else if (hashcmp(obj_req->oid.hash, req->real_sha1)) {
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;


[PATCH v2 36/36] convert: convert to struct object_id

2018-02-25 Thread brian m. carlson
Convert convert.c to struct object_id.  Add a use of the_hash_algo to
replace hard-coded constants and change a strbuf_add to a strbuf_addstr
to avoid another hard-coded constant.

Note that a strict conversion using the hexsz constant would cause
problems in the future if the internal and user-visible hash algorithms
differed, as anticipated by the hash function transition plan.

Signed-off-by: brian m. carlson 
---
 convert.c | 12 ++--
 convert.h |  2 +-
 entry.c   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/convert.c b/convert.c
index cc562f6509..c480097a2a 100644
--- a/convert.c
+++ b/convert.c
@@ -914,7 +914,7 @@ static int ident_to_worktree(const char *path, const char 
*src, size_t len,
to_free = strbuf_detach(buf, NULL);
hash_object_file(src, len, "blob", );
 
-   strbuf_grow(buf, len + cnt * 43);
+   strbuf_grow(buf, len + cnt * (the_hash_algo->hexsz + 3));
for (;;) {
/* step 1: run to the next '$' */
dollar = memchr(src, '$', len);
@@ -1510,7 +1510,7 @@ struct ident_filter {
struct stream_filter filter;
struct strbuf left;
int state;
-   char ident[45]; /* ": x40 $" */
+   char ident[GIT_MAX_HEXSZ + 5]; /* ": x40 $" */
 };
 
 static int is_foreign_ident(const char *str)
@@ -1635,12 +1635,12 @@ static struct stream_filter_vtbl ident_vtbl = {
ident_free_fn,
 };
 
-static struct stream_filter *ident_filter(const unsigned char *sha1)
+static struct stream_filter *ident_filter(const struct object_id *oid)
 {
struct ident_filter *ident = xmalloc(sizeof(*ident));
 
xsnprintf(ident->ident, sizeof(ident->ident),
- ": %s $", sha1_to_hex(sha1));
+ ": %s $", oid_to_hex(oid));
strbuf_init(>left, 0);
ident->filter.vtbl = _vtbl;
ident->state = 0;
@@ -1655,7 +1655,7 @@ static struct stream_filter *ident_filter(const unsigned 
char *sha1)
  * Note that you would be crazy to set CRLF, smuge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const char *path, const unsigned char 
*sha1)
+struct stream_filter *get_stream_filter(const char *path, const struct 
object_id *oid)
 {
struct conv_attrs ca;
struct stream_filter *filter = NULL;
@@ -1668,7 +1668,7 @@ struct stream_filter *get_stream_filter(const char *path, 
const unsigned char *s
return NULL;
 
if (ca.ident)
-   filter = ident_filter(sha1);
+   filter = ident_filter(oid);
 
if (output_eol(ca.crlf_action) == EOL_CRLF)
filter = cascade_filter(filter, lf_to_crlf_filter());
diff --git a/convert.h b/convert.h
index 65ab3e5167..2e9b4f49cc 100644
--- a/convert.h
+++ b/convert.h
@@ -93,7 +93,7 @@ extern int would_convert_to_git_filter_fd(const char *path);
 
 struct stream_filter; /* opaque */
 
-extern struct stream_filter *get_stream_filter(const char *path, const 
unsigned char *);
+extern struct stream_filter *get_stream_filter(const char *path, const struct 
object_id *);
 extern void free_stream_filter(struct stream_filter *);
 extern int is_null_stream_filter(struct stream_filter *);
 
diff --git a/entry.c b/entry.c
index 3f21d2c913..49b83f4a97 100644
--- a/entry.c
+++ b/entry.c
@@ -266,7 +266,7 @@ static int write_entry(struct cache_entry *ce,
 
if (ce_mode_s_ifmt == S_IFREG) {
struct stream_filter *filter = get_stream_filter(ce->name,
-ce->oid.hash);
+>oid);
if (filter &&
!streaming_write_entry(ce, path, filter,
   state, to_tempfile,


[PATCH v2 08/36] strbuf: convert strbuf_add_unique_abbrev to use struct object_id

2018-02-25 Thread brian m. carlson
Convert the declaration and definition of strbuf_add_unique_abbrev to
make it take a pointer to struct object_id.  Predeclare the struct in
strbuf.h, as cache.h includes strbuf.h before it declares the struct,
and otherwise the struct declaration would have the wrong scope.

Apply the following semantic patch, along with the standard object_id
transforms, to adjust the callers:

@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2.hash, E3);
+ strbuf_add_unique_abbrev(E1, , E3);

@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2->hash, E3);
+ strbuf_add_unique_abbrev(E1, E2, E3);

Signed-off-by: brian m. carlson 
---
 builtin/checkout.c | 2 +-
 builtin/fetch.c| 8 
 builtin/tag.c  | 2 +-
 merge-recursive.c  | 2 +-
 pretty.c   | 8 
 strbuf.c   | 4 ++--
 strbuf.h   | 8 +++-
 submodule.c| 4 ++--
 transport.c| 4 ++--
 wt-status.c| 6 +++---
 10 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 87182aa018..bc68d423e5 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -720,7 +720,7 @@ static int add_pending_uninteresting_ref(const char 
*refname,
 static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
 {
strbuf_addstr(sb, "  ");
-   strbuf_add_unique_abbrev(sb, commit->object.oid.hash, DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(sb, >object.oid, DEFAULT_ABBREV);
strbuf_addch(sb, ' ');
if (!parse_commit(commit))
pp_commit_easy(CMIT_FMT_ONELINE, commit, sb);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8ee998ea2e..3b8988e51d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -697,9 +697,9 @@ static int update_local_ref(struct ref *ref,
if (in_merge_bases(current, updated)) {
struct strbuf quickref = STRBUF_INIT;
int r;
-   strbuf_add_unique_abbrev(, current->object.oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "..");
-   strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(>new_oid);
@@ -712,9 +712,9 @@ static int update_local_ref(struct ref *ref,
} else if (force || ref->force) {
struct strbuf quickref = STRBUF_INIT;
int r;
-   strbuf_add_unique_abbrev(, current->object.oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >object.oid, 
DEFAULT_ABBREV);
strbuf_addstr(, "...");
-   strbuf_add_unique_abbrev(, ref->new_oid.hash, 
DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(, >new_oid, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(>new_oid);
diff --git a/builtin/tag.c b/builtin/tag.c
index 8885e21ddc..99dba2d907 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -289,7 +289,7 @@ static void create_reflog_msg(const struct object_id *oid, 
struct strbuf *sb)
strbuf_addstr(sb, rla);
} else {
strbuf_addstr(sb, "tag: tagging ");
-   strbuf_add_unique_abbrev(sb, oid->hash, DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(sb, oid, DEFAULT_ABBREV);
}
 
strbuf_addstr(sb, " (");
diff --git a/merge-recursive.c b/merge-recursive.c
index c886f2e9cd..f58f13957e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -228,7 +228,7 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
strbuf_addf(>obuf, "virtual %s\n",
merge_remote_util(commit)->name);
else {
-   strbuf_add_unique_abbrev(>obuf, commit->object.oid.hash,
+   strbuf_add_unique_abbrev(>obuf, >object.oid,
 DEFAULT_ABBREV);
strbuf_addch(>obuf, ' ');
if (parse_commit(commit) != 0)
diff --git a/pretty.c b/pretty.c
index f7ce490230..34fe891fc0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -549,7 +549,7 @@ static void add_merge_info(const struct 
pretty_print_context *pp,
struct object_id *oidp = >item->object.oid;
strbuf_addch(sb, ' ');
if (pp->abbrev)
-   strbuf_add_unique_abbrev(sb, oidp->hash, pp->abbrev);
+   strbuf_add_unique_abbrev(sb, oidp, pp->abbrev);
else
strbuf_addstr(sb, oid_to_hex(oidp));
parent = parent->next;
@@ 

[PATCH v2 10/36] Convert find_unique_abbrev* to struct object_id

2018-02-25 Thread brian m. carlson
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/blame.c|  2 +-
 builtin/branch.c   |  2 +-
 builtin/checkout.c |  6 +++---
 builtin/describe.c |  4 ++--
 builtin/log.c  |  4 ++--
 builtin/ls-files.c |  4 ++--
 builtin/ls-tree.c  |  4 ++--
 builtin/merge.c|  6 +++---
 builtin/name-rev.c |  2 +-
 builtin/receive-pack.c |  8 
 builtin/reset.c|  2 +-
 builtin/rev-list.c |  2 +-
 builtin/rev-parse.c|  2 +-
 builtin/show-branch.c  |  2 +-
 builtin/show-ref.c |  4 ++--
 builtin/tag.c  |  6 --
 builtin/worktree.c |  4 ++--
 cache.h|  6 +++---
 combine-diff.c |  4 ++--
 diff.c |  2 +-
 log-tree.c | 12 ++--
 ref-filter.c   |  4 ++--
 sequencer.c|  2 +-
 sha1_name.c| 12 ++--
 strbuf.c   |  2 +-
 tag.c  |  4 ++--
 transport.c|  2 +-
 wt-status.c|  6 +++---
 28 files changed, 61 insertions(+), 59 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 005f55aaa2..21b496517a 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -499,7 +499,7 @@ static int read_ancestry(const char *graft_file)
 
 static int update_auto_abbrev(int auto_abbrev, struct blame_origin *suspect)
 {
-   const char *uniq = find_unique_abbrev(suspect->commit->object.oid.hash,
+   const char *uniq = find_unique_abbrev(>commit->object.oid,
  auto_abbrev);
int len = strlen(uniq);
if (auto_abbrev < len)
diff --git a/builtin/branch.c b/builtin/branch.c
index 8dcc2ed058..659deb36f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -273,7 +273,7 @@ static int delete_branches(int argc, const char **argv, int 
force, int kinds,
   bname.buf,
   (flags & REF_ISBROKEN) ? "broken"
   : (flags & REF_ISSYMREF) ? target
-  : find_unique_abbrev(oid.hash, DEFAULT_ABBREV));
+  : find_unique_abbrev(, DEFAULT_ABBREV));
}
delete_branch_config(bname.buf);
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index bc68d423e5..e6ee955671 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -405,10 +405,10 @@ static void describe_detached_head(const char *msg, 
struct commit *commit)
pp_commit_easy(CMIT_FMT_ONELINE, commit, );
if (print_sha1_ellipsis()) {
fprintf(stderr, "%s %s... %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV), sb.buf);
} else {
fprintf(stderr, "%s %s %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV), sb.buf);
}
strbuf_release();
 }
@@ -778,7 +778,7 @@ static void suggest_reattach(struct commit *commit, struct 
rev_info *revs)
" git branch  %s\n\n",
/* Give ngettext() the count */
lost),
-   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV));
+   find_unique_abbrev(>object.oid, 
DEFAULT_ABBREV));
 }
 
 /*
diff --git a/builtin/describe.c b/builtin/describe.c
index c428984706..dc5c3aa9f9 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -285,7 +285,7 @@ static void append_name(struct commit_name *n, struct 
strbuf *dst)
 
 static void append_suffix(int depth, const struct object_id *oid, struct 
strbuf *dst)
 {
-   strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, 
abbrev));
+   strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid, abbrev));
 }
 
 static void describe_commit(struct object_id *oid, struct strbuf *dst)
@@ -383,7 +383,7 @@ static void describe_commit(struct object_id *oid, struct 
strbuf *dst)
if (!match_cnt) {
struct object_id *cmit_oid = >object.oid;
if (always) {
-   strbuf_add_unique_abbrev(dst, cmit_oid->hash, abbrev);
+   strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
if (suffix)
strbuf_addstr(dst, suffix);
return;
diff --git a/builtin/log.c b/builtin/log.c
index 32a744bfd2..411339c1ed 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1873,12 +1873,12 @@ static void print_commit(char sign, struct commit 
*commit, int verbose,
 {
if (!verbose) {
fprintf(file, "%c %s\n", sign,
-  

[PATCH v2 24/36] packfile: convert unpack_entry to struct object_id

2018-02-25 Thread brian m. carlson
Convert unpack_entry and read_object to use struct object_id.
---
 packfile.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/packfile.c b/packfile.c
index 954e1fca11..e3b1db7dd5 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1452,7 +1452,7 @@ struct unpack_entry_stack_ent {
unsigned long size;
 };
 
-static void *read_object(const unsigned char *sha1, enum object_type *type,
+static void *read_object(const struct object_id *oid, enum object_type *type,
 unsigned long *size)
 {
struct object_info oi = OBJECT_INFO_INIT;
@@ -1461,7 +1461,7 @@ static void *read_object(const unsigned char *sha1, enum 
object_type *type,
oi.sizep = size;
oi.contentp = 
 
-   if (sha1_object_info_extended(sha1, , 0) < 0)
+   if (sha1_object_info_extended(oid->hash, , 0) < 0)
return NULL;
return content;
 }
@@ -1501,11 +1501,11 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
struct revindex_entry *revidx = find_pack_revindex(p, 
obj_offset);
off_t len = revidx[1].offset - obj_offset;
if (check_pack_crc(p, _curs, obj_offset, len, 
revidx->nr)) {
-   const unsigned char *sha1 =
-   nth_packed_object_sha1(p, revidx->nr);
+   struct object_id oid;
+   nth_packed_object_oid(, p, revidx->nr);
error("bad packed object CRC for %s",
- sha1_to_hex(sha1));
-   mark_bad_packed_object(p, sha1);
+ oid_to_hex());
+   mark_bad_packed_object(p, oid.hash);
data = NULL;
goto out;
}
@@ -1588,16 +1588,16 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
 * of a corrupted pack, and is better than failing 
outright.
 */
struct revindex_entry *revidx;
-   const unsigned char *base_sha1;
+   struct object_id base_oid;
revidx = find_pack_revindex(p, obj_offset);
if (revidx) {
-   base_sha1 = nth_packed_object_sha1(p, 
revidx->nr);
+   nth_packed_object_oid(_oid, p, revidx->nr);
error("failed to read delta base object %s"
  " at offset %"PRIuMAX" from %s",
- sha1_to_hex(base_sha1), 
(uintmax_t)obj_offset,
+ oid_to_hex(_oid), 
(uintmax_t)obj_offset,
  p->pack_name);
-   mark_bad_packed_object(p, base_sha1);
-   base = read_object(base_sha1, , 
_size);
+   mark_bad_packed_object(p, base_oid.hash);
+   base = read_object(_oid, , 
_size);
external_base = base;
}
}


[PATCH v2 30/36] streaming: convert istream internals to struct object_id

2018-02-25 Thread brian m. carlson
Convert the various open_istream variants to take a pointer to struct
object_id.  Introduce a temporary, which will be removed later, to work
around the fact that lookup_replace_object still returns a pointer to
unsigned char.

Signed-off-by: brian m. carlson 
---
 streaming.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/streaming.c b/streaming.c
index 344678e95f..3f4be1ea2c 100644
--- a/streaming.c
+++ b/streaming.c
@@ -14,7 +14,7 @@ enum input_source {
 
 typedef int (*open_istream_fn)(struct git_istream *,
   struct object_info *,
-  const unsigned char *,
+  const struct object_id *,
   enum object_type *);
 typedef int (*close_istream_fn)(struct git_istream *);
 typedef ssize_t (*read_istream_fn)(struct git_istream *, char *, size_t);
@@ -27,7 +27,7 @@ struct stream_vtbl {
 #define open_method_decl(name) \
int open_istream_ ##name \
(struct git_istream *st, struct object_info *oi, \
-const unsigned char *sha1, \
+const struct object_id *oid, \
 enum object_type *type)
 
 #define close_method_decl(name) \
@@ -142,13 +142,16 @@ struct git_istream *open_istream(const struct object_id 
*oid,
struct object_info oi = OBJECT_INFO_INIT;
const unsigned char *real = lookup_replace_object(oid->hash);
enum input_source src = istream_source(real, type, );
+   struct object_id realoid;
+
+   hashcpy(realoid.hash, real);
 
if (src < 0)
return NULL;
 
st = xmalloc(sizeof(*st));
-   if (open_istream_tbl[src](st, , real, type)) {
-   if (open_istream_incore(st, , real, type)) {
+   if (open_istream_tbl[src](st, , , type)) {
+   if (open_istream_incore(st, , , type)) {
free(st);
return NULL;
}
@@ -338,7 +341,7 @@ static struct stream_vtbl loose_vtbl = {
 
 static open_method_decl(loose)
 {
-   st->u.loose.mapped = map_sha1_file(sha1, >u.loose.mapsize);
+   st->u.loose.mapped = map_sha1_file(oid->hash, >u.loose.mapsize);
if (!st->u.loose.mapped)
return -1;
if ((unpack_sha1_header(>z,
@@ -489,7 +492,7 @@ static struct stream_vtbl incore_vtbl = {
 
 static open_method_decl(incore)
 {
-   st->u.incore.buf = read_sha1_file_extended(sha1, type, >size, 0);
+   st->u.incore.buf = read_sha1_file_extended(oid->hash, type, >size, 
0);
st->u.incore.read_ptr = 0;
st->vtbl = _vtbl;
 


[PATCH v2 32/36] sha1_file: convert read_object_with_reference to object_id

2018-02-25 Thread brian m. carlson
Convert read_object_with_reference to take pointers to struct object_id.
Update the internals of the function accordingly.

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c |  2 +-
 builtin/grep.c |  4 ++--
 builtin/pack-objects.c |  2 +-
 cache.h|  4 ++--
 fast-import.c  | 15 ---
 sha1_file.c| 20 ++--
 tree-walk.c|  9 -
 7 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 3264bada39..17382f2fa4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -159,7 +159,7 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 * fall-back to the usual case.
 */
}
-   buf = read_object_with_reference(oid.hash, exp_type, , 
NULL);
+   buf = read_object_with_reference(, exp_type, , NULL);
break;
 
default:
diff --git a/builtin/grep.c b/builtin/grep.c
index 3ca4ac80d8..7c2843a492 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -445,7 +445,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
object = parse_object_or_die(oid, oid_to_hex(oid));
 
grep_read_lock();
-   data = read_object_with_reference(object->oid.hash, tree_type,
+   data = read_object_with_reference(>oid, tree_type,
  , NULL);
grep_read_unlock();
 
@@ -607,7 +607,7 @@ static int grep_object(struct grep_opt *opt, const struct 
pathspec *pathspec,
int hit, len;
 
grep_read_lock();
-   data = read_object_with_reference(obj->oid.hash, tree_type,
+   data = read_object_with_reference(>oid, tree_type,
  , NULL);
grep_read_unlock();
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f8148eb9d5..16d6069e16 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1351,7 +1351,7 @@ static void add_preferred_base(struct object_id *oid)
if (window <= num_preferred_base++)
return;
 
-   data = read_object_with_reference(oid->hash, tree_type, , 
tree_oid.hash);
+   data = read_object_with_reference(oid, tree_type, , _oid);
if (!data)
return;
 
diff --git a/cache.h b/cache.h
index b965a073f2..c482fb92fe 100644
--- a/cache.h
+++ b/cache.h
@@ -1431,10 +1431,10 @@ extern int df_name_compare(const char *name1, int len1, 
int mode1, const char *n
 extern int name_compare(const char *name1, size_t len1, const char *name2, 
size_t len2);
 extern int cache_name_stage_compare(const char *name1, int len1, int stage1, 
const char *name2, int len2, int stage2);
 
-extern void *read_object_with_reference(const unsigned char *sha1,
+extern void *read_object_with_reference(const struct object_id *oid,
const char *required_type,
unsigned long *size,
-   unsigned char *sha1_ret);
+   struct object_id *oid_ret);
 
 extern struct object *peel_to_type(const char *name, int namelen,
   struct object *o, enum object_type);
diff --git a/fast-import.c b/fast-import.c
index 71b60f9068..004a9c9f99 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2583,8 +2583,9 @@ static void note_change_n(const char *p, struct branch 
*b, unsigned char *old_fa
oidcpy(_oid, _oe->idx.oid);
} else if (!get_oid(p, _oid)) {
unsigned long size;
-   char *buf = read_object_with_reference(commit_oid.hash,
-   commit_type, , commit_oid.hash);
+   char *buf = read_object_with_reference(_oid,
+  commit_type, ,
+  _oid);
if (!buf || size < 46)
die("Not a valid commit: %s", p);
free(buf);
@@ -2653,9 +2654,8 @@ static void parse_from_existing(struct branch *b)
unsigned long size;
char *buf;
 
-   buf = read_object_with_reference(b->oid.hash,
-commit_type, ,
-b->oid.hash);
+   buf = read_object_with_reference(>oid, commit_type, ,
+>oid);
parse_from_commit(b, buf, size);
free(buf);
}
@@ -2732,8 +2732,9 @@ static struct hash_list *parse_merge(unsigned int *count)
oidcpy(>oid, >idx.oid);
} else if (!get_oid(from, >oid)) {

[PATCH v2 29/36] tree-walk: convert get_tree_entry_follow_symlinks internals to object_id

2018-02-25 Thread brian m. carlson
Convert the internals of this function to use struct object_id.  This is
one of the last remaining callers of read_sha1_file_extended that has
not been converted yet.

Signed-off-by: brian m. carlson 
---
 tree-walk.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 63a87ed666..521defdaa2 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -583,14 +583,14 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
struct dir_state *parents = NULL;
size_t parents_alloc = 0;
size_t i, parents_nr = 0;
-   unsigned char current_tree_sha1[20];
+   struct object_id current_tree_oid;
struct strbuf namebuf = STRBUF_INIT;
struct tree_desc t;
int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
 
init_tree_desc(, NULL, 0UL);
strbuf_addstr(, name);
-   hashcpy(current_tree_sha1, tree_sha1);
+   hashcpy(current_tree_oid.hash, tree_sha1);
 
while (1) {
int find_result;
@@ -599,22 +599,22 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
if (!t.buffer) {
void *tree;
-   unsigned char root[20];
+   struct object_id root;
unsigned long size;
-   tree = read_object_with_reference(current_tree_sha1,
+   tree = read_object_with_reference(current_tree_oid.hash,
  tree_type, ,
- root);
+ root.hash);
if (!tree)
goto done;
 
ALLOC_GROW(parents, parents_nr + 1, parents_alloc);
parents[parents_nr].tree = tree;
parents[parents_nr].size = size;
-   hashcpy(parents[parents_nr].sha1, root);
+   hashcpy(parents[parents_nr].sha1, root.hash);
parents_nr++;
 
if (namebuf.buf[0] == '\0') {
-   hashcpy(result, root);
+   hashcpy(result, root.hash);
retval = FOUND;
goto done;
}
@@ -671,14 +671,14 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 
/* Look up the first (or only) path component in the tree. */
find_result = find_tree_entry(, namebuf.buf,
- current_tree_sha1, mode);
+ current_tree_oid.hash, mode);
if (find_result) {
goto done;
}
 
if (S_ISDIR(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_sha1);
+   hashcpy(result, current_tree_oid.hash);
retval = FOUND;
goto done;
}
@@ -688,7 +688,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
  1 + first_slash - namebuf.buf);
} else if (S_ISREG(*mode)) {
if (!remainder) {
-   hashcpy(result, current_tree_sha1);
+   hashcpy(result, current_tree_oid.hash);
retval = FOUND;
} else {
retval = NOT_DIR;
@@ -714,7 +714,7 @@ enum follow_symlinks_result 
get_tree_entry_follow_symlinks(unsigned char *tree_s
 */
retval = DANGLING_SYMLINK;
 
-   contents = read_sha1_file(current_tree_sha1, ,
+   contents = read_sha1_file(current_tree_oid.hash, ,
  _len);
 
if (!contents)


[PATCH v2 15/36] archive: convert write_archive_entry_fn_t to object_id

2018-02-25 Thread brian m. carlson
Convert the write_archive_entry_fn_t type to use a pointer to struct
object_id.  Convert various static functions in the tar and zip
archivers also.

Signed-off-by: brian m. carlson 
---
 archive-tar.c | 28 ++--
 archive-zip.c | 16 
 archive.c | 12 ++--
 archive.h |  2 +-
 4 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index c6ed96ee74..24b1ccef3a 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -111,7 +111,7 @@ static void write_trailer(void)
  * queues up writes, so that all our write(2) calls write exactly one
  * full block; pads writes to RECORDSIZE
  */
-static int stream_blocked(const unsigned char *sha1)
+static int stream_blocked(const struct object_id *oid)
 {
struct git_istream *st;
enum object_type type;
@@ -119,9 +119,9 @@ static int stream_blocked(const unsigned char *sha1)
char buf[BLOCKSIZE];
ssize_t readlen;
 
-   st = open_istream(sha1, , , NULL);
+   st = open_istream(oid->hash, , , NULL);
if (!st)
-   return error("cannot stream blob %s", sha1_to_hex(sha1));
+   return error("cannot stream blob %s", oid_to_hex(oid));
for (;;) {
readlen = read_istream(st, buf, sizeof(buf));
if (readlen <= 0)
@@ -218,7 +218,7 @@ static void prepare_header(struct archiver_args *args,
 }
 
 static void write_extended_header(struct archiver_args *args,
- const unsigned char *sha1,
+ const struct object_id *oid,
  const void *buffer, unsigned long size)
 {
struct ustar_header header;
@@ -226,14 +226,14 @@ static void write_extended_header(struct archiver_args 
*args,
memset(, 0, sizeof(header));
*header.typeflag = TYPEFLAG_EXT_HEADER;
mode = 0100666;
-   xsnprintf(header.name, sizeof(header.name), "%s.paxheader", 
sha1_to_hex(sha1));
+   xsnprintf(header.name, sizeof(header.name), "%s.paxheader", 
oid_to_hex(oid));
prepare_header(args, , mode, size);
write_blocked(, sizeof(header));
write_blocked(buffer, size);
 }
 
 static int write_tar_entry(struct archiver_args *args,
-  const unsigned char *sha1,
+  const struct object_id *oid,
   const char *path, size_t pathlen,
   unsigned int mode)
 {
@@ -257,7 +257,7 @@ static int write_tar_entry(struct archiver_args *args,
mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
} else {
return error("unsupported file mode: 0%o (SHA1: %s)",
-mode, sha1_to_hex(sha1));
+mode, oid_to_hex(oid));
}
if (pathlen > sizeof(header.name)) {
size_t plen = get_path_prefix(path, pathlen,
@@ -268,7 +268,7 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path + plen + 1, rest);
} else {
xsnprintf(header.name, sizeof(header.name), "%s.data",
- sha1_to_hex(sha1));
+ oid_to_hex(oid));
strbuf_append_ext_header(_header, "path",
 path, pathlen);
}
@@ -276,14 +276,14 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.name, path, pathlen);
 
if (S_ISREG(mode) && !args->convert &&
-   sha1_object_info(sha1, ) == OBJ_BLOB &&
+   sha1_object_info(oid->hash, ) == OBJ_BLOB &&
size > big_file_threshold)
buffer = NULL;
else if (S_ISLNK(mode) || S_ISREG(mode)) {
enum object_type type;
-   buffer = sha1_file_to_archive(args, path, sha1, old_mode, 
, );
+   buffer = sha1_file_to_archive(args, path, oid->hash, old_mode, 
, );
if (!buffer)
-   return error("cannot read %s", sha1_to_hex(sha1));
+   return error("cannot read %s", oid_to_hex(oid));
} else {
buffer = NULL;
size = 0;
@@ -292,7 +292,7 @@ static int write_tar_entry(struct archiver_args *args,
if (S_ISLNK(mode)) {
if (size > sizeof(header.linkname)) {
xsnprintf(header.linkname, sizeof(header.linkname),
- "see %s.paxheader", sha1_to_hex(sha1));
+ "see %s.paxheader", oid_to_hex(oid));
strbuf_append_ext_header(_header, "linkpath",
 buffer, size);
} else
@@ -308,7 +308,7 @@ static int write_tar_entry(struct archiver_args *args,
prepare_header(args, , 

[PATCH v2 18/36] sha1_file: convert read_loose_object to use struct object_id

2018-02-25 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 builtin/fsck.c |  2 +-
 cache.h|  4 ++--
 sha1_file.c| 10 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 9981db2263..57509a4eac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -513,7 +513,7 @@ static struct object *parse_loose_object(const struct 
object_id *oid,
unsigned long size;
int eaten;
 
-   if (read_loose_object(path, oid->hash, , , ) < 0)
+   if (read_loose_object(path, oid, , , ) < 0)
return NULL;
 
if (!contents && type != OBJ_BLOB)
diff --git a/cache.h b/cache.h
index 09cd994bcc..8a9055f4e7 100644
--- a/cache.h
+++ b/cache.h
@@ -1241,14 +1241,14 @@ extern int check_sha1_signature(const unsigned char 
*sha1, void *buf, unsigned l
 extern int finalize_object_file(const char *tmpfile, const char *filename);
 
 /*
- * Open the loose object at path, check its sha1, and return the contents,
+ * Open the loose object at path, check its hash, and return the contents,
  * type, and size. If the object is a blob, then "contents" may return NULL,
  * to allow streaming of large blobs.
  *
  * Returns 0 on success, negative on error (details may be written to stderr).
  */
 int read_loose_object(const char *path,
- const unsigned char *expected_sha1,
+ const struct object_id *expected_oid,
  enum object_type *type,
  unsigned long *size,
  void **contents);
diff --git a/sha1_file.c b/sha1_file.c
index c1d06c6109..69e8d27773 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2176,7 +2176,7 @@ static int check_stream_sha1(git_zstream *stream,
 }
 
 int read_loose_object(const char *path,
- const unsigned char *expected_sha1,
+ const struct object_id *expected_oid,
  enum object_type *type,
  unsigned long *size,
  void **contents)
@@ -2208,19 +2208,19 @@ int read_loose_object(const char *path,
}
 
if (*type == OBJ_BLOB) {
-   if (check_stream_sha1(, hdr, *size, path, expected_sha1) 
< 0)
+   if (check_stream_sha1(, hdr, *size, path, 
expected_oid->hash) < 0)
goto out;
} else {
-   *contents = unpack_sha1_rest(, hdr, *size, 
expected_sha1);
+   *contents = unpack_sha1_rest(, hdr, *size, 
expected_oid->hash);
if (!*contents) {
error("unable to unpack contents of %s", path);
git_inflate_end();
goto out;
}
-   if (check_sha1_signature(expected_sha1, *contents,
+   if (check_sha1_signature(expected_oid->hash, *contents,
 *size, typename(*type))) {
error("sha1 mismatch for %s (expected %s)", path,
- sha1_to_hex(expected_sha1));
+ oid_to_hex(expected_oid));
free(*contents);
goto out;
}


[PATCH v2 17/36] builtin/index-pack: convert struct ref_delta_entry to object_id

2018-02-25 Thread brian m. carlson
Convert this struct to use a member of type object_id.  Convert various
static functions as well.

Signed-off-by: brian m. carlson 
---
 builtin/index-pack.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..5c7ab47c36 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -58,7 +58,7 @@ struct ofs_delta_entry {
 };
 
 struct ref_delta_entry {
-   unsigned char sha1[20];
+   struct object_id oid;
int obj_no;
 };
 
@@ -671,18 +671,18 @@ static void find_ofs_delta_children(off_t offset,
*last_index = last;
 }
 
-static int compare_ref_delta_bases(const unsigned char *sha1,
-  const unsigned char *sha2,
+static int compare_ref_delta_bases(const struct object_id *oid1,
+  const struct object_id *oid2,
   enum object_type type1,
   enum object_type type2)
 {
int cmp = type1 - type2;
if (cmp)
return cmp;
-   return hashcmp(sha1, sha2);
+   return oidcmp(oid1, oid2);
 }
 
-static int find_ref_delta(const unsigned char *sha1, enum object_type type)
+static int find_ref_delta(const struct object_id *oid, enum object_type type)
 {
int first = 0, last = nr_ref_deltas;
 
@@ -691,7 +691,7 @@ static int find_ref_delta(const unsigned char *sha1, enum 
object_type type)
struct ref_delta_entry *delta = _deltas[next];
int cmp;
 
-   cmp = compare_ref_delta_bases(sha1, delta->sha1,
+   cmp = compare_ref_delta_bases(oid, >oid,
  type, 
objects[delta->obj_no].type);
if (!cmp)
return next;
@@ -704,11 +704,11 @@ static int find_ref_delta(const unsigned char *sha1, enum 
object_type type)
return -first-1;
 }
 
-static void find_ref_delta_children(const unsigned char *sha1,
+static void find_ref_delta_children(const struct object_id *oid,
int *first_index, int *last_index,
enum object_type type)
 {
-   int first = find_ref_delta(sha1, type);
+   int first = find_ref_delta(oid, type);
int last = first;
int end = nr_ref_deltas - 1;
 
@@ -717,9 +717,9 @@ static void find_ref_delta_children(const unsigned char 
*sha1,
*last_index = -1;
return;
}
-   while (first > 0 && !hashcmp(ref_deltas[first - 1].sha1, sha1))
+   while (first > 0 && !oidcmp(_deltas[first - 1].oid, oid))
--first;
-   while (last < end && !hashcmp(ref_deltas[last + 1].sha1, sha1))
+   while (last < end && !oidcmp(_deltas[last + 1].oid, oid))
++last;
*first_index = first;
*last_index = last;
@@ -991,7 +991,7 @@ static struct base_data *find_unresolved_deltas_1(struct 
base_data *base,
  struct base_data *prev_base)
 {
if (base->ref_last == -1 && base->ofs_last == -1) {
-   find_ref_delta_children(base->obj->idx.oid.hash,
+   find_ref_delta_children(>obj->idx.oid,
>ref_first, >ref_last,
OBJ_REF_DELTA);
 
@@ -1075,7 +1075,7 @@ static int compare_ref_delta_entry(const void *a, const 
void *b)
const struct ref_delta_entry *delta_a = a;
const struct ref_delta_entry *delta_b = b;
 
-   return hashcmp(delta_a->sha1, delta_b->sha1);
+   return oidcmp(_a->oid, _b->oid);
 }
 
 static void resolve_base(struct object_entry *obj)
@@ -1141,7 +1141,7 @@ static void parse_pack_objects(unsigned char *hash)
ofs_delta++;
} else if (obj->type == OBJ_REF_DELTA) {
ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, 
ref_deltas_alloc);
-   hashcpy(ref_deltas[nr_ref_deltas].sha1, 
ref_delta_oid.hash);
+   oidcpy(_deltas[nr_ref_deltas].oid, _delta_oid);
ref_deltas[nr_ref_deltas].obj_no = i;
nr_ref_deltas++;
} else if (!data) {
@@ -1373,14 +1373,14 @@ static void fix_unresolved_deltas(struct hashfile *f)
 
if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
continue;
-   base_obj->data = read_sha1_file(d->sha1, , 
_obj->size);
+   base_obj->data = read_sha1_file(d->oid.hash, , 
_obj->size);
if (!base_obj->data)
continue;
 
-   if (check_sha1_signature(d->sha1, base_obj->data,
+   if (check_sha1_signature(d->oid.hash, base_obj->data,
base_obj->size, typename(type)))
-   die(_("local object %s 

[PATCH v2 14/36] builtin/mktag: convert to struct object_id

2018-02-25 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 builtin/mktag.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/mktag.c b/builtin/mktag.c
index beb552847b..65bb41e3cd 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -18,13 +18,13 @@
 /*
  * We refuse to tag something we can't verify. Just because.
  */
-static int verify_object(const unsigned char *sha1, const char *expected_type)
+static int verify_object(const struct object_id *oid, const char 
*expected_type)
 {
int ret = -1;
enum object_type type;
unsigned long size;
-   void *buffer = read_sha1_file(sha1, , );
-   const unsigned char *repl = lookup_replace_object(sha1);
+   void *buffer = read_sha1_file(oid->hash, , );
+   const unsigned char *repl = lookup_replace_object(oid->hash);
 
if (buffer) {
if (type == type_from_string(expected_type))
@@ -38,8 +38,8 @@ static int verify_tag(char *buffer, unsigned long size)
 {
int typelen;
char type[20];
-   unsigned char sha1[20];
-   const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb;
+   struct object_id oid;
+   const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb, *p;
size_t len;
 
if (size < 84)
@@ -52,11 +52,11 @@ static int verify_tag(char *buffer, unsigned long size)
if (memcmp(object, "object ", 7))
return error("char%d: does not start with \"object \"", 0);
 
-   if (get_sha1_hex(object + 7, sha1))
+   if (parse_oid_hex(object + 7, , ))
return error("char%d: could not get SHA1 hash", 7);
 
/* Verify type line */
-   type_line = object + 48;
+   type_line = p + 1;
if (memcmp(type_line - 1, "\ntype ", 6))
return error("char%d: could not find \"\\ntype \"", 47);
 
@@ -80,8 +80,8 @@ static int verify_tag(char *buffer, unsigned long size)
type[typelen] = 0;
 
/* Verify that the object matches */
-   if (verify_object(sha1, type))
-   return error("char%d: could not verify object %s", 7, 
sha1_to_hex(sha1));
+   if (verify_object(, type))
+   return error("char%d: could not verify object %s", 7, 
oid_to_hex());
 
/* Verify the tag-name: we don't allow control characters or spaces in 
it */
tag_line += 4;


[PATCH v2 23/36] sha1_file: convert retry_bad_packed_offset to struct object_id

2018-02-25 Thread brian m. carlson
Signed-off-by: brian m. carlson 
---
 packfile.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index 7dbe8739d1..954e1fca11 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1095,13 +1095,13 @@ static int retry_bad_packed_offset(struct packed_git 
*p, off_t obj_offset)
 {
int type;
struct revindex_entry *revidx;
-   const unsigned char *sha1;
+   struct object_id oid;
revidx = find_pack_revindex(p, obj_offset);
if (!revidx)
return OBJ_BAD;
-   sha1 = nth_packed_object_sha1(p, revidx->nr);
-   mark_bad_packed_object(p, sha1);
-   type = sha1_object_info(sha1, NULL);
+   nth_packed_object_oid(, p, revidx->nr);
+   mark_bad_packed_object(p, oid.hash);
+   type = sha1_object_info(oid.hash, NULL);
if (type <= OBJ_NONE)
return OBJ_BAD;
return type;


[PATCH v2 22/36] sha1_file: convert assert_sha1_type to object_id

2018-02-25 Thread brian m. carlson
Convert this function to take a pointer to struct object_id and rename
it to assert_oid_type.

Signed-off-by: brian m. carlson 
---
 builtin/commit-tree.c | 2 +-
 cache.h   | 2 +-
 commit.c  | 2 +-
 sha1_file.c   | 8 
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
index e5bdf57b1e..ecf42191da 100644
--- a/builtin/commit-tree.c
+++ b/builtin/commit-tree.c
@@ -58,7 +58,7 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
usage(commit_tree_usage);
if (get_oid_commit(argv[i], ))
die("Not a valid object name %s", argv[i]);
-   assert_sha1_type(oid.hash, OBJ_COMMIT);
+   assert_oid_type(, OBJ_COMMIT);
new_parent(lookup_commit(), );
continue;
}
diff --git a/cache.h b/cache.h
index f29ff43bbd..b1a117e404 100644
--- a/cache.h
+++ b/cache.h
@@ -1275,7 +1275,7 @@ extern int has_object_file_with_flags(const struct 
object_id *oid, int flags);
  */
 extern int has_loose_object_nonlocal(const unsigned char *sha1);
 
-extern void assert_sha1_type(const unsigned char *sha1, enum object_type 
expect);
+extern void assert_oid_type(const struct object_id *oid, enum object_type 
expect);
 
 /* Helper to check and "touch" a file */
 extern int check_and_freshen_file(const char *fn, int freshen);
diff --git a/commit.c b/commit.c
index e8a49b9c84..b9f028976f 100644
--- a/commit.c
+++ b/commit.c
@@ -1517,7 +1517,7 @@ int commit_tree_extended(const char *msg, size_t msg_len,
int encoding_is_utf8;
struct strbuf buffer;
 
-   assert_sha1_type(tree->hash, OBJ_TREE);
+   assert_oid_type(tree, OBJ_TREE);
 
if (memchr(msg, '\0', msg_len))
return error("a NUL byte in commit log message not allowed.");
diff --git a/sha1_file.c b/sha1_file.c
index 5dc85122c3..fa69d86309 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1966,13 +1966,13 @@ int read_pack_header(int fd, struct pack_header *header)
return 0;
 }
 
-void assert_sha1_type(const unsigned char *sha1, enum object_type expect)
+void assert_oid_type(const struct object_id *oid, enum object_type expect)
 {
-   enum object_type type = sha1_object_info(sha1, NULL);
+   enum object_type type = sha1_object_info(oid->hash, NULL);
if (type < 0)
-   die("%s is not a valid object", sha1_to_hex(sha1));
+   die("%s is not a valid object", oid_to_hex(oid));
if (type != expect)
-   die("%s is not a valid '%s' object", sha1_to_hex(sha1),
+   die("%s is not a valid '%s' object", oid_to_hex(oid),
typename(expect));
 }
 


[PATCH v2 13/36] replace_object: convert struct replace_object to object_id

2018-02-25 Thread brian m. carlson
Convert the two members of this struct to be instances of struct
object_id.  Adjust the various functions in this file accordingly.

Signed-off-by: brian m. carlson 
---
 replace_object.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/replace_object.c b/replace_object.c
index 3e49965d05..232e8b8550 100644
--- a/replace_object.c
+++ b/replace_object.c
@@ -8,8 +8,8 @@
  * sha1.
  */
 static struct replace_object {
-   unsigned char original[20];
-   unsigned char replacement[20];
+   struct object_id original;
+   struct object_id replacement;
 } **replace_object;
 
 static int replace_object_alloc, replace_object_nr;
@@ -17,7 +17,7 @@ static int replace_object_alloc, replace_object_nr;
 static const unsigned char *replace_sha1_access(size_t index, void *table)
 {
struct replace_object **replace = table;
-   return replace[index]->original;
+   return replace[index]->original.hash;
 }
 
 static int replace_object_pos(const unsigned char *sha1)
@@ -29,7 +29,7 @@ static int replace_object_pos(const unsigned char *sha1)
 static int register_replace_object(struct replace_object *replace,
   int ignore_dups)
 {
-   int pos = replace_object_pos(replace->original);
+   int pos = replace_object_pos(replace->original.hash);
 
if (0 <= pos) {
if (ignore_dups)
@@ -59,14 +59,14 @@ static int register_replace_ref(const char *refname,
const char *hash = slash ? slash + 1 : refname;
struct replace_object *repl_obj = xmalloc(sizeof(*repl_obj));
 
-   if (strlen(hash) != 40 || get_sha1_hex(hash, repl_obj->original)) {
+   if (get_oid_hex(hash, _obj->original)) {
free(repl_obj);
warning("bad replace ref name: %s", refname);
return 0;
}
 
/* Copy sha1 from the read ref */
-   hashcpy(repl_obj->replacement, oid->hash);
+   oidcpy(_obj->replacement, oid);
 
/* Register new object */
if (register_replace_object(repl_obj, 1))
@@ -113,7 +113,7 @@ const unsigned char *do_lookup_replace_object(const 
unsigned char *sha1)
 
pos = replace_object_pos(cur);
if (0 <= pos)
-   cur = replace_object[pos]->replacement;
+   cur = replace_object[pos]->replacement.hash;
} while (0 <= pos);
 
return cur;


[PATCH v2 00/36] object_id part 12

2018-02-25 Thread brian m. carlson
This is the twelfth in a series of patches to convert various parts of
the code to struct object_id.

The primary changes here are to the sha1_file code, find_unique_abbrev,
and lookup_replace_object.  Due to the circular nature of converting the
latter, there are several places where we briefly insert temporaries to
allow a multi-patch conversion.

Also included is a constant for object header length allocation.  This
isn't strictly related to the series, but it makes for a nice cleanup of
hard-coded constants, which is good for maintainability.

This series further decreases the number of hard-coded constants in the
code.  After this series, there are only 172 instances of the constant
"20" in the code (for any purpose) and 98 instances of "40".

Changes from v1:
* Rebase onto master.

brian m. carlson (36):
  bulk-checkin: convert index_bulk_checkin to struct object_id
  builtin/write-tree: convert to struct object_id
  cache-tree: convert write_*_as_tree to object_id
  cache-tree: convert remnants to struct object_id
  resolve-undo: convert struct resolve_undo_info to object_id
  tree: convert read_tree_recursive to struct object_id
  ref-filter: convert grab_objectname to struct object_id
  strbuf: convert strbuf_add_unique_abbrev to use struct object_id
  wt-status: convert struct wt_status_state to object_id
  Convert find_unique_abbrev* to struct object_id
  http-walker: convert struct object_request to use struct object_id
  send-pack: convert remaining functions to struct object_id
  replace_object: convert struct replace_object to object_id
  builtin/mktag: convert to struct object_id
  archive: convert write_archive_entry_fn_t to object_id
  archive: convert sha1_file_to_archive to struct object_id
  builtin/index-pack: convert struct ref_delta_entry to object_id
  sha1_file: convert read_loose_object to use struct object_id
  sha1_file: convert check_sha1_signature to struct object_id
  streaming: convert open_istream to use struct object_id
  builtin/mktree: convert to struct object_id
  sha1_file: convert assert_sha1_type to object_id
  sha1_file: convert retry_bad_packed_offset to struct object_id
  packfile: convert unpack_entry to struct object_id
  Convert remaining callers of sha1_object_info_extended to object_id
  sha1_file: convert sha1_object_info* to object_id
  builtin/fmt-merge-msg: convert remaining code to object_id
  builtin/notes: convert static functions to object_id
  tree-walk: convert get_tree_entry_follow_symlinks internals to
object_id
  streaming: convert istream internals to struct object_id
  tree-walk: convert tree entry functions to object_id
  sha1_file: convert read_object_with_reference to object_id
  sha1_file: convert read_sha1_file to struct object_id
  Convert lookup_replace_object to struct object_id
  sha1_file: introduce a constant for max header length
  convert: convert to struct object_id

 apply.c  |   4 +-
 archive-tar.c|  28 
 archive-zip.c|  18 ++---
 archive.c|  32 -
 archive.h|  10 +--
 bisect.c |   3 +-
 blame.c  |  18 +++--
 builtin/am.c |   8 +--
 builtin/blame.c  |   2 +-
 builtin/branch.c |   2 +-
 builtin/cat-file.c   |  30 +
 builtin/checkout.c   |  12 ++--
 builtin/commit-tree.c|   2 +-
 builtin/describe.c   |   4 +-
 builtin/difftool.c   |   2 +-
 builtin/fast-export.c|   8 +--
 builtin/fetch.c  |  10 +--
 builtin/fmt-merge-msg.c  |   4 +-
 builtin/fsck.c   |   4 +-
 builtin/grep.c   |   6 +-
 builtin/index-pack.c |  43 ++--
 builtin/log.c|   8 +--
 builtin/ls-files.c   |   4 +-
 builtin/ls-tree.c|   8 +--
 builtin/merge-tree.c |   5 +-
 builtin/merge.c  |   8 +--
 builtin/mktag.c  |  20 +++---
 builtin/mktree.c |  24 +++
 builtin/name-rev.c   |   2 +-
 builtin/notes.c  |  14 ++--
 builtin/pack-objects.c   |  27 
 builtin/prune.c  |   2 +-
 builtin/receive-pack.c   |   8 +--
 builtin/reflog.c |   2 +-
 builtin/replace.c|  10 +--
 builtin/reset.c  |   2 +-
 builtin/rev-list.c   |   2 +-
 builtin/rev-parse.c  |   2 +-
 builtin/rm.c |   2 +-
 builtin/show-branch.c|   2 +-
 builtin/show-ref.c   |   4 +-
 builtin/tag.c|  16 +++--
 builtin/unpack-file.c|   2 +-
 builtin/unpack-objects.c |   4 +-
 builtin/update-index.c   |   2 +-
 builtin/verify-commit.c  |   2 +-
 builtin/worktree.c   |   4 +-
 builtin/write-tree.c |   6 +-
 bulk-checkin.c   |  18 ++---
 bulk-checkin.h   |   2 +-
 bundle.c |   2 +-
 cache-tree.c |  36 +-
 cache-tree.h |   4 +-
 cache.h  |  42 ++--
 combine-diff.c   |   6 +-
 commit.c |   8 +--
 config.c |   2 +-
 convert.c 

[PATCH v2 06/36] tree: convert read_tree_recursive to struct object_id

2018-02-25 Thread brian m. carlson
Convert the callback functions for read_tree_recursive to take a pointer
to struct object_id.

Signed-off-by: brian m. carlson 
---
 archive.c  |  8 
 builtin/checkout.c |  4 ++--
 builtin/log.c  |  2 +-
 builtin/ls-tree.c  |  8 
 merge-recursive.c  |  2 +-
 tree.c | 14 +++---
 tree.h |  2 +-
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/archive.c b/archive.c
index 0b7b62af0c..e664cdb624 100644
--- a/archive.c
+++ b/archive.c
@@ -198,7 +198,7 @@ static int write_directory(struct archiver_context *c)
return ret ? -1 : 0;
 }
 
-static int queue_or_write_archive_entry(const unsigned char *sha1,
+static int queue_or_write_archive_entry(const struct object_id *oid,
struct strbuf *base, const char *filename,
unsigned mode, int stage, void *context)
 {
@@ -224,14 +224,14 @@ static int queue_or_write_archive_entry(const unsigned 
char *sha1,
 
if (check_attr_export_ignore(check))
return 0;
-   queue_directory(sha1, base, filename,
+   queue_directory(oid->hash, base, filename,
mode, stage, c);
return READ_TREE_RECURSIVE;
}
 
if (write_directory(c))
return -1;
-   return write_archive_entry(sha1, base->buf, base->len, filename, mode,
+   return write_archive_entry(oid->hash, base->buf, base->len, filename, 
mode,
   stage, context);
 }
 
@@ -303,7 +303,7 @@ static const struct archiver *lookup_archiver(const char 
*name)
return NULL;
 }
 
-static int reject_entry(const unsigned char *sha1, struct strbuf *base,
+static int reject_entry(const struct object_id *oid, struct strbuf *base,
const char *filename, unsigned mode,
int stage, void *context)
 {
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 191b96c49c..87182aa018 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -66,7 +66,7 @@ static int post_checkout_hook(struct commit *old, struct 
commit *new,
 
 }
 
-static int update_some(const unsigned char *sha1, struct strbuf *base,
+static int update_some(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
int len;
@@ -78,7 +78,7 @@ static int update_some(const unsigned char *sha1, struct 
strbuf *base,
 
len = base->len + strlen(pathname);
ce = xcalloc(1, cache_entry_size(len));
-   hashcpy(ce->oid.hash, sha1);
+   oidcpy(>oid, oid);
memcpy(ce->name, base->buf, base->len);
memcpy(ce->name + base->len, pathname, len - base->len);
ce->ce_flags = create_ce_flags(0) | CE_UPDATE;
diff --git a/builtin/log.c b/builtin/log.c
index 94ee177d56..32a744bfd2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -541,7 +541,7 @@ static int show_tag_object(const struct object_id *oid, 
struct rev_info *rev)
return 0;
 }
 
-static int show_tree_object(const unsigned char *sha1,
+static int show_tree_object(const struct object_id *oid,
struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ef965408e8..c613dd7b82 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -60,7 +60,7 @@ static int show_recursive(const char *base, int baselen, 
const char *pathname)
return 0;
 }
 
-static int show_tree(const unsigned char *sha1, struct strbuf *base,
+static int show_tree(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, int stage, void *context)
 {
int retval = 0;
@@ -94,7 +94,7 @@ static int show_tree(const unsigned char *sha1, struct strbuf 
*base,
char size_text[24];
if (!strcmp(type, blob_type)) {
unsigned long size;
-   if (sha1_object_info(sha1, ) == OBJ_BAD)
+   if (sha1_object_info(oid->hash, ) == 
OBJ_BAD)
xsnprintf(size_text, sizeof(size_text),
  "BAD");
else
@@ -103,11 +103,11 @@ static int show_tree(const unsigned char *sha1, struct 
strbuf *base,
} else
xsnprintf(size_text, sizeof(size_text), "-");
printf("%06o %s %s %7s\t", mode, type,
-  find_unique_abbrev(sha1, abbrev),
+  find_unique_abbrev(oid->hash, abbrev),
   size_text);
} else
printf("%06o %s %s\t", mode, type,
-  find_unique_abbrev(sha1, abbrev));
+   

[PATCH v2 02/36] builtin/write-tree: convert to struct object_id

2018-02-25 Thread brian m. carlson
This is needed to convert parts of the cache-tree code.

Signed-off-by: brian m. carlson 
---
 builtin/write-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/write-tree.c b/builtin/write-tree.c
index bd0a78aa3c..299a121531 100644
--- a/builtin/write-tree.c
+++ b/builtin/write-tree.c
@@ -19,7 +19,7 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
 {
int flags = 0, ret;
const char *prefix = NULL;
-   unsigned char sha1[20];
+   struct object_id oid;
const char *me = "git-write-tree";
struct option write_tree_options[] = {
OPT_BIT(0, "missing-ok", , N_("allow missing objects"),
@@ -38,10 +38,10 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
argc = parse_options(argc, argv, unused_prefix, write_tree_options,
 write_tree_usage, 0);
 
-   ret = write_cache_as_tree(sha1, flags, prefix);
+   ret = write_cache_as_tree(oid.hash, flags, prefix);
switch (ret) {
case 0:
-   printf("%s\n", sha1_to_hex(sha1));
+   printf("%s\n", oid_to_hex());
break;
case WRITE_TREE_UNREADABLE_INDEX:
die("%s: error reading the index", me);


[PATCH v2 04/36] cache-tree: convert remnants to struct object_id

2018-02-25 Thread brian m. carlson
Convert the remaining portions of cache-tree.c to use struct object_id.
Convert several instances of 20 to use the_hash_algo instead.

Signed-off-by: brian m. carlson 
---
 cache-tree.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index ba07a8067e..601dd1bb1f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -320,7 +320,7 @@ static int update_one(struct cache_tree *it,
struct cache_tree_sub *sub = NULL;
const char *path, *slash;
int pathlen, entlen;
-   const unsigned char *sha1;
+   const struct object_id *oid;
unsigned mode;
int expected_missing = 0;
int contains_ita = 0;
@@ -338,7 +338,7 @@ static int update_one(struct cache_tree *it,
die("cache-tree.c: '%.*s' in '%s' not found",
entlen, path + baselen, path);
i += sub->count;
-   sha1 = sub->cache_tree->oid.hash;
+   oid = >cache_tree->oid;
mode = S_IFDIR;
contains_ita = sub->cache_tree->entry_count < 0;
if (contains_ita) {
@@ -347,19 +347,19 @@ static int update_one(struct cache_tree *it,
}
}
else {
-   sha1 = ce->oid.hash;
+   oid = >oid;
mode = ce->ce_mode;
entlen = pathlen - baselen;
i++;
}
 
-   if (is_null_sha1(sha1) ||
-   (mode != S_IFGITLINK && !missing_ok && 
!has_sha1_file(sha1))) {
+   if (is_null_oid(oid) ||
+   (mode != S_IFGITLINK && !missing_ok && 
!has_object_file(oid))) {
strbuf_release();
if (expected_missing)
return -1;
return error("invalid object %06o %s for '%.*s'",
-   mode, sha1_to_hex(sha1), entlen+baselen, path);
+   mode, oid_to_hex(oid), entlen+baselen, path);
}
 
/*
@@ -385,12 +385,12 @@ static int update_one(struct cache_tree *it,
/*
 * "sub" can be an empty tree if all subentries are i-t-a.
 */
-   if (contains_ita && !hashcmp(sha1, EMPTY_TREE_SHA1_BIN))
+   if (contains_ita && !oidcmp(oid, _tree_oid))
continue;
 
strbuf_grow(, entlen + 100);
strbuf_addf(, "%o %.*s%c", mode, entlen, path + baselen, 
'\0');
-   strbuf_add(, sha1, 20);
+   strbuf_add(, oid->hash, the_hash_algo->rawsz);
 
 #if DEBUG
fprintf(stderr, "cache-tree update-one %o %.*s\n",
@@ -401,7 +401,7 @@ static int update_one(struct cache_tree *it,
if (repair) {
struct object_id oid;
hash_object_file(buffer.buf, buffer.len, tree_type, );
-   if (has_sha1_file(oid.hash))
+   if (has_object_file())
oidcpy(>oid, );
else
to_invalidate = 1;
@@ -465,7 +465,7 @@ static void write_one(struct strbuf *buffer, struct 
cache_tree *it,
 #endif
 
if (0 <= it->entry_count) {
-   strbuf_add(buffer, it->oid.hash, 20);
+   strbuf_add(buffer, it->oid.hash, the_hash_algo->rawsz);
}
for (i = 0; i < it->subtree_nr; i++) {
struct cache_tree_sub *down = it->down[i];
@@ -520,11 +520,11 @@ static struct cache_tree *read_one(const char **buffer, 
unsigned long *size_p)
goto free_return;
buf++; size--;
if (0 <= it->entry_count) {
-   if (size < 20)
+   if (size < the_hash_algo->rawsz)
goto free_return;
hashcpy(it->oid.hash, (const unsigned char*)buf);
-   buf += 20;
-   size -= 20;
+   buf += the_hash_algo->rawsz;
+   size -= the_hash_algo->rawsz;
}
 
 #if DEBUG


[PATCH v2 12/36] send-pack: convert remaining functions to struct object_id

2018-02-25 Thread brian m. carlson
Convert the remaining function, feed_object, to use struct object_id.

Signed-off-by: brian m. carlson 
---
 send-pack.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 2112d3b27a..7350af8883 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -37,14 +37,14 @@ int option_parse_push_signed(const struct option *opt,
die("bad %s argument: %s", opt->long_name, arg);
 }
 
-static void feed_object(const unsigned char *sha1, FILE *fh, int negative)
+static void feed_object(const struct object_id *oid, FILE *fh, int negative)
 {
-   if (negative && !has_sha1_file(sha1))
+   if (negative && !has_sha1_file(oid->hash))
return;
 
if (negative)
putc('^', fh);
-   fputs(sha1_to_hex(sha1), fh);
+   fputs(oid_to_hex(oid), fh);
putc('\n', fh);
 }
 
@@ -89,13 +89,13 @@ static int pack_objects(int fd, struct ref *refs, struct 
oid_array *extra, struc
 */
po_in = xfdopen(po.in, "w");
for (i = 0; i < extra->nr; i++)
-   feed_object(extra->oid[i].hash, po_in, 1);
+   feed_object(>oid[i], po_in, 1);
 
while (refs) {
if (!is_null_oid(>old_oid))
-   feed_object(refs->old_oid.hash, po_in, 1);
+   feed_object(>old_oid, po_in, 1);
if (!is_null_oid(>new_oid))
-   feed_object(refs->new_oid.hash, po_in, 0);
+   feed_object(>new_oid, po_in, 0);
refs = refs->next;
}
 


[PATCH v2 35/36] sha1_file: introduce a constant for max header length

2018-02-25 Thread brian m. carlson
There were several instances of 32 sprinkled throughout this file, all
of which were used for allocating a buffer to store the header of an
object.  Introduce a constant, MAX_HEADER_LEN, for this purpose.

Note that this constant is slightly larger than required; the longest
possible header is 28 (7 for "commit", 1 for a space, 20 for a 63-bit
length in decimal, and 1 for the NUL).  However, the overallocation
should not cause any problems, so leave it as it is.

Signed-off-by: brian m. carlson 
---
 sha1_file.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index c41fbe2f01..4d4f9248c7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -30,6 +30,9 @@
 #include "packfile.h"
 #include "fetch-object.h"
 
+/* The maximum size for an object header. */
+#define MAX_HEADER_LEN 32
+
 const unsigned char null_sha1[GIT_MAX_RAWSZ];
 const struct object_id null_oid;
 const struct object_id empty_tree_oid = {
@@ -791,7 +794,7 @@ int check_object_signature(const struct object_id *oid, 
void *map,
enum object_type obj_type;
struct git_istream *st;
git_hash_ctx c;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen;
 
if (map) {
@@ -1150,7 +1153,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
unsigned long mapsize;
void *map;
git_zstream stream;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
struct strbuf hdrbuf = STRBUF_INIT;
unsigned long size_scratch;
 
@@ -1512,7 +1515,7 @@ static int write_buffer(int fd, const void *buf, size_t 
len)
 int hash_object_file(const void *buf, unsigned long len, const char *type,
 struct object_id *oid)
 {
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
write_object_file_prepare(buf, len, type, oid, hdr, );
return 0;
@@ -1667,7 +1670,7 @@ static int freshen_packed_object(const unsigned char 
*sha1)
 int write_object_file(const void *buf, unsigned long len, const char *type,
  struct object_id *oid)
 {
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
 
/* Normally if we have it in the pack then we do not bother writing
@@ -1687,7 +1690,7 @@ int hash_object_file_literally(const void *buf, unsigned 
long len,
int hdrlen, status = 0;
 
/* type string, SP, %lu of the length plus NUL must fit this */
-   hdrlen = strlen(type) + 32;
+   hdrlen = strlen(type) + MAX_HEADER_LEN;
header = xmalloc(hdrlen);
write_object_file_prepare(buf, len, type, oid, header, );
 
@@ -1707,7 +1710,7 @@ int force_object_loose(const struct object_id *oid, 
time_t mtime)
void *buf;
unsigned long len;
enum object_type type;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
int hdrlen;
int ret;
 
@@ -2189,7 +2192,7 @@ int read_loose_object(const char *path,
void *map = NULL;
unsigned long mapsize;
git_zstream stream;
-   char hdr[32];
+   char hdr[MAX_HEADER_LEN];
 
*contents = NULL;
 


[PATCH v2 01/36] bulk-checkin: convert index_bulk_checkin to struct object_id

2018-02-25 Thread brian m. carlson
Convert the index_bulk_checkin function, and the static functions it
calls, to use pointers to struct object_id.

Signed-off-by: brian m. carlson 
---
 bulk-checkin.c | 18 +-
 bulk-checkin.h |  2 +-
 sha1_file.c|  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8bcd1c8665..6f5ee25bff 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -60,17 +60,17 @@ static void finish_bulk_checkin(struct bulk_checkin_state 
*state)
reprepare_packed_git();
 }
 
-static int already_written(struct bulk_checkin_state *state, unsigned char 
sha1[])
+static int already_written(struct bulk_checkin_state *state, struct object_id 
*oid)
 {
int i;
 
/* The object may already exist in the repository */
-   if (has_sha1_file(sha1))
+   if (has_sha1_file(oid->hash))
return 1;
 
/* Might want to keep the list sorted */
for (i = 0; i < state->nr_written; i++)
-   if (!hashcmp(state->written[i]->oid.hash, sha1))
+   if (!oidcmp(>written[i]->oid, oid))
return 1;
 
/* This is a new object we need to keep */
@@ -186,7 +186,7 @@ static void prepare_to_stream(struct bulk_checkin_state 
*state,
 }
 
 static int deflate_to_pack(struct bulk_checkin_state *state,
-  unsigned char result_sha1[],
+  struct object_id *result_oid,
   int fd, size_t size,
   enum object_type type, const char *path,
   unsigned flags)
@@ -236,17 +236,17 @@ static int deflate_to_pack(struct bulk_checkin_state 
*state,
if (lseek(fd, seekback, SEEK_SET) == (off_t) -1)
return error("cannot seek back");
}
-   the_hash_algo->final_fn(result_sha1, );
+   the_hash_algo->final_fn(result_oid->hash, );
if (!idx)
return 0;
 
idx->crc32 = crc32_end(state->f);
-   if (already_written(state, result_sha1)) {
+   if (already_written(state, result_oid)) {
hashfile_truncate(state->f, );
state->offset = checkpoint.offset;
free(idx);
} else {
-   hashcpy(idx->oid.hash, result_sha1);
+   oidcpy(>oid, result_oid);
ALLOC_GROW(state->written,
   state->nr_written + 1,
   state->alloc_written);
@@ -255,11 +255,11 @@ static int deflate_to_pack(struct bulk_checkin_state 
*state,
return 0;
 }
 
-int index_bulk_checkin(unsigned char *sha1,
+int index_bulk_checkin(struct object_id *oid,
   int fd, size_t size, enum object_type type,
   const char *path, unsigned flags)
 {
-   int status = deflate_to_pack(, sha1, fd, size, type,
+   int status = deflate_to_pack(, oid, fd, size, type,
 path, flags);
if (!state.plugged)
finish_bulk_checkin();
diff --git a/bulk-checkin.h b/bulk-checkin.h
index fbd40fc98c..a85527318b 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -4,7 +4,7 @@
 #ifndef BULK_CHECKIN_H
 #define BULK_CHECKIN_H
 
-extern int index_bulk_checkin(unsigned char sha1[],
+extern int index_bulk_checkin(struct object_id *oid,
  int fd, size_t size, enum object_type type,
  const char *path, unsigned flags);
 
diff --git a/sha1_file.c b/sha1_file.c
index 826d7a0ae3..c1d06c6109 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1892,7 +1892,7 @@ static int index_stream(struct object_id *oid, int fd, 
size_t size,
enum object_type type, const char *path,
unsigned flags)
 {
-   return index_bulk_checkin(oid->hash, fd, size, type, path, flags);
+   return index_bulk_checkin(oid, fd, size, type, path, flags);
 }
 
 int index_fd(struct object_id *oid, int fd, struct stat *st,


[PATCH v2 34/36] Convert lookup_replace_object to struct object_id

2018-02-25 Thread brian m. carlson
Convert both the argument and the return value to be pointers to struct
object_id.  Update the callers and their internals to deal with the new
type.  Remove several temporaries which are no longer needed.

Signed-off-by: brian m. carlson 
---
 builtin/mktag.c  |  7 ++-
 cache.h  |  8 
 object.c | 14 --
 replace_object.c | 14 +++---
 sha1_file.c  | 44 
 streaming.c  | 16 +---
 6 files changed, 42 insertions(+), 61 deletions(-)

diff --git a/builtin/mktag.c b/builtin/mktag.c
index cfb777b3c8..9f5a50a8fd 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -24,14 +24,11 @@ static int verify_object(const struct object_id *oid, const 
char *expected_type)
enum object_type type;
unsigned long size;
void *buffer = read_object_file(oid, , );
-   const unsigned char *repl = lookup_replace_object(oid->hash);
+   const struct object_id *repl = lookup_replace_object(oid);
 
if (buffer) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
if (type == type_from_string(expected_type))
-   ret = check_object_signature(, buffer, size, 
expected_type);
+   ret = check_object_signature(repl, buffer, size, 
expected_type);
free(buffer);
}
return ret;
diff --git a/cache.h b/cache.h
index 970d6edd04..a70c52b22f 100644
--- a/cache.h
+++ b/cache.h
@@ -1197,7 +1197,7 @@ static inline void *read_object_file(const struct 
object_id *oid, enum object_ty
  * This internal function is only declared here for the benefit of
  * lookup_replace_object().  Please do not call it directly.
  */
-extern const unsigned char *do_lookup_replace_object(const unsigned char 
*sha1);
+extern const struct object_id *do_lookup_replace_object(const struct object_id 
*oid);
 
 /*
  * If object sha1 should be replaced, return the replacement object's
@@ -1205,11 +1205,11 @@ extern const unsigned char 
*do_lookup_replace_object(const unsigned char *sha1);
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
  */
-static inline const unsigned char *lookup_replace_object(const unsigned char 
*sha1)
+static inline const struct object_id *lookup_replace_object(const struct 
object_id *oid)
 {
if (!check_replace_refs)
-   return sha1;
-   return do_lookup_replace_object(sha1);
+   return oid;
+   return do_lookup_replace_object(oid);
 }
 
 /* Read and unpack an object file into memory, write memory to an object file 
*/
diff --git a/object.c b/object.c
index ea1a6f18e8..4e1c065d55 100644
--- a/object.c
+++ b/object.c
@@ -244,7 +244,7 @@ struct object *parse_object(const struct object_id *oid)
unsigned long size;
enum object_type type;
int eaten;
-   const unsigned char *repl = lookup_replace_object(oid->hash);
+   const struct object_id *repl = lookup_replace_object(oid);
void *buffer;
struct object *obj;
 
@@ -255,10 +255,7 @@ struct object *parse_object(const struct object_id *oid)
if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
(!obj && has_object_file(oid) &&
 oid_object_info(oid, NULL) == OBJ_BLOB)) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
-   if (check_object_signature(, NULL, 0, NULL) < 0) {
+   if (check_object_signature(repl, NULL, 0, NULL) < 0) {
error("sha1 mismatch %s", oid_to_hex(oid));
return NULL;
}
@@ -268,12 +265,9 @@ struct object *parse_object(const struct object_id *oid)
 
buffer = read_object_file(oid, , );
if (buffer) {
-   struct object_id reploid;
-   hashcpy(reploid.hash, repl);
-
-   if (check_object_signature(, buffer, size, 
typename(type)) < 0) {
+   if (check_object_signature(repl, buffer, size, typename(type)) 
< 0) {
free(buffer);
-   error("sha1 mismatch %s", sha1_to_hex(repl));
+   error("sha1 mismatch %s", oid_to_hex(repl));
return NULL;
}
 
diff --git a/replace_object.c b/replace_object.c
index 232e8b8550..336357394d 100644
--- a/replace_object.c
+++ b/replace_object.c
@@ -92,16 +92,16 @@ static void prepare_replace_object(void)
 #define MAXREPLACEDEPTH 5
 
 /*
- * If a replacement for object sha1 has been set up, return the
+ * If a replacement for object oid has been set up, return the
  * replacement object's name (replaced recursively, if necessary).
- * The return value is either sha1 or a pointer to a
+ * The return value is either oid or a pointer to a
  * permanently-allocated value.  This function always 

[PATCH v2 03/36] cache-tree: convert write_*_as_tree to object_id

2018-02-25 Thread brian m. carlson
Convert write_index_as_tree and write_cache_as_tree to use struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/am.c |  8 
 builtin/merge.c  |  2 +-
 builtin/write-tree.c |  2 +-
 cache-tree.c | 10 +-
 cache-tree.h |  4 ++--
 sequencer.c  |  4 ++--
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 6661edc162..220c5deed8 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1546,7 +1546,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
discard_cache();
read_cache_from(index_path);
 
-   if (write_index_as_tree(orig_tree.hash, _index, index_path, 0, 
NULL))
+   if (write_index_as_tree(_tree, _index, index_path, 0, NULL))
return error(_("Repository lacks necessary blobs to fall back 
on 3-way merge."));
 
say(state, stdout, _("Using index info to reconstruct a base tree..."));
@@ -1571,7 +1571,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
return error(_("Did you hand edit your patch?\n"
"It does not apply to blobs recorded in its 
index."));
 
-   if (write_index_as_tree(their_tree.hash, _index, index_path, 0, 
NULL))
+   if (write_index_as_tree(_tree, _index, index_path, 0, NULL))
return error("could not write tree");
 
say(state, stdout, _("Falling back to patching base and 3-way 
merge..."));
@@ -1622,7 +1622,7 @@ static void do_commit(const struct am_state *state)
if (run_hook_le(NULL, "pre-applypatch", NULL))
exit(1);
 
-   if (write_cache_as_tree(tree.hash, 0, NULL))
+   if (write_cache_as_tree(, 0, NULL))
die(_("git write-tree failed to write a tree"));
 
if (!get_oid_commit("HEAD", )) {
@@ -2001,7 +2001,7 @@ static int clean_index(const struct object_id *head, 
const struct object_id *rem
if (fast_forward_to(head_tree, head_tree, 1))
return -1;
 
-   if (write_cache_as_tree(index.hash, 0, NULL))
+   if (write_cache_as_tree(, 0, NULL))
return -1;
 
index_tree = parse_tree_indirect();
diff --git a/builtin/merge.c b/builtin/merge.c
index 92ba99a1a5..861b170468 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -638,7 +638,7 @@ static int read_tree_trivial(struct object_id *common, 
struct object_id *head,
 
 static void write_tree_trivial(struct object_id *oid)
 {
-   if (write_cache_as_tree(oid->hash, 0, NULL))
+   if (write_cache_as_tree(oid, 0, NULL))
die(_("git write-tree failed to write a tree"));
 }
 
diff --git a/builtin/write-tree.c b/builtin/write-tree.c
index 299a121531..c9d3c544e7 100644
--- a/builtin/write-tree.c
+++ b/builtin/write-tree.c
@@ -38,7 +38,7 @@ int cmd_write_tree(int argc, const char **argv, const char 
*unused_prefix)
argc = parse_options(argc, argv, unused_prefix, write_tree_options,
 write_tree_usage, 0);
 
-   ret = write_cache_as_tree(oid.hash, flags, prefix);
+   ret = write_cache_as_tree(, flags, prefix);
switch (ret) {
case 0:
printf("%s\n", oid_to_hex());
diff --git a/cache-tree.c b/cache-tree.c
index c52e4303df..ba07a8067e 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -599,7 +599,7 @@ static struct cache_tree *cache_tree_find(struct cache_tree 
*it, const char *pat
return it;
 }
 
-int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, 
const char *index_path, int flags, const char *prefix)
+int write_index_as_tree(struct object_id *oid, struct index_state 
*index_state, const char *index_path, int flags, const char *prefix)
 {
int entries, was_valid;
struct lock_file lock_file = LOCK_INIT;
@@ -640,19 +640,19 @@ int write_index_as_tree(unsigned char *sha1, struct 
index_state *index_state, co
ret = WRITE_TREE_PREFIX_ERROR;
goto out;
}
-   hashcpy(sha1, subtree->oid.hash);
+   oidcpy(oid, >oid);
}
else
-   hashcpy(sha1, index_state->cache_tree->oid.hash);
+   oidcpy(oid, _state->cache_tree->oid);
 
 out:
rollback_lock_file(_file);
return ret;
 }
 
-int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix)
+int write_cache_as_tree(struct object_id *oid, int flags, const char *prefix)
 {
-   return write_index_as_tree(sha1, _index, get_index_file(), flags, 
prefix);
+   return write_index_as_tree(oid, _index, get_index_file(), flags, 
prefix);
 }
 
 static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
diff --git a/cache-tree.h b/cache-tree.h
index f7b9cab7ee..cfd5328cc9 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -47,8 +47,8 @@ int update_main_cache_tree(int);
 #define 

[PATCH 1/2] wildmatch: add interface for precompiling wildmatch() patterns

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Add the scaffolding necessary for precompiling wildmatch()
patterns.

There is currently no point in doing this with the wildmatch()
function we have now, since it can't make any use of precompiling the
pattern.

But adding this interface and making use of it will make it easy to
refactor the wildmatch() function to parse the pattern into opcodes as
some glob() implementations do, or to drop an alternate wildmatch()
backend in which trades parsing slowness for faster matching, such as
the PCRE v2 conversion function that understands the wildmatch()
syntax.

It's very unlikely that we'll remove the wildmatch() function as a
convenience wrapper even if we end up requiring a separate compilation
step in some future implementation. There are a lot of one-shot
wildmatches in the codebase, in that case most likely wildmatch() will
be kept around as a shorthand for wildmatch_{compile,match,free}().

I modeled this interface on the PCRE v2 interface. I didn't go with a
glob(3) & globfree(3)-like interface because that would require every
wildmatch() user to pass a dummy parameter, which I got rid of in
55d3426929 ("wildmatch: remove unused wildopts parameter",
2017-06-22).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 wildmatch.c | 25 +
 wildmatch.h | 11 +++
 2 files changed, 36 insertions(+)

diff --git a/wildmatch.c b/wildmatch.c
index d074c1be10..032f339391 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -276,3 +276,28 @@ int wildmatch(const char *pattern, const char *text, 
unsigned int flags)
 {
return dowild((const uchar*)pattern, (const uchar*)text, flags);
 }
+
+struct wildmatch_compiled *wildmatch_compile(const char *pattern,
+unsigned int flags)
+{
+   struct wildmatch_compiled *wildmatch_compiled = xmalloc(
+   sizeof(struct wildmatch_compiled));
+   wildmatch_compiled->pattern = xstrdup(pattern);
+   wildmatch_compiled->flags = flags;
+
+   return wildmatch_compiled;
+}
+
+int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
+   const char *text)
+{
+   return wildmatch(wildmatch_compiled->pattern, text,
+wildmatch_compiled->flags);
+}
+
+void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled)
+{
+   if (wildmatch_compiled)
+   free((void *)wildmatch_compiled->pattern);
+   free(wildmatch_compiled);
+}
diff --git a/wildmatch.h b/wildmatch.h
index b8c826aa68..2fc00e0ca0 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -10,5 +10,16 @@
 #define WM_ABORT_ALL -1
 #define WM_ABORT_TO_STARSTAR -2
 
+struct wildmatch_compiled {
+   const char *pattern;
+   unsigned int flags;
+};
+
 int wildmatch(const char *pattern, const char *text, unsigned int flags);
+struct wildmatch_compiled *wildmatch_compile(const char *pattern,
+unsigned int flags);
+int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
+   const char *text);
+void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled);
+
 #endif
-- 
2.15.1.424.g9478a66081



[PATCH 2/2] wildmatch: use the new precompiling wildmatch()

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Make some limited use of the wildmatch() interface for "precompiling"
patterns, although the current implementation does not do this yet.

The main hot codepath in dir.c is not being touched yet, but some
other invocations where we repeatedly match the same glob against
multiple strings have been migrated.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/name-rev.c | 7 ++-
 builtin/replace.c  | 7 ---
 config.c   | 8 ++--
 refs.c | 7 ---
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 9e088ebd11..c75ac8d9af 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -128,14 +128,19 @@ static void name_rev(struct commit *commit,
 static int subpath_matches(const char *path, const char *filter)
 {
const char *subpath = path;
+   struct wildmatch_compiled *wildmatch_compiled =
+   wildmatch_compile(filter, 0);
 
while (subpath) {
-   if (!wildmatch(filter, subpath, 0))
+   if (!wildmatch_match(wildmatch_compiled, subpath)) {
+   wildmatch_free(wildmatch_compiled);
return subpath - path;
+   }
subpath = strchr(subpath, '/');
if (subpath)
subpath++;
}
+   wildmatch_free(wildmatch_compiled);
return -1;
 }
 
diff --git a/builtin/replace.c b/builtin/replace.c
index 83d3235721..9be72f2b7b 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -32,7 +32,7 @@ enum replace_format {
 };
 
 struct show_data {
-   const char *pattern;
+   struct wildmatch_compiled *wildmatch_compiled;
enum replace_format format;
 };
 
@@ -41,7 +41,7 @@ static int show_reference(const char *refname, const struct 
object_id *oid,
 {
struct show_data *data = cb_data;
 
-   if (!wildmatch(data->pattern, refname, 0)) {
+   if (!wildmatch_match(data->wildmatch_compiled, refname)) {
if (data->format == REPLACE_FORMAT_SHORT)
printf("%s\n", refname);
else if (data->format == REPLACE_FORMAT_MEDIUM)
@@ -70,7 +70,7 @@ static int list_replace_refs(const char *pattern, const char 
*format)
 
if (pattern == NULL)
pattern = "*";
-   data.pattern = pattern;
+   data.wildmatch_compiled = wildmatch_compile(pattern, 0);
 
if (format == NULL || *format == '\0' || !strcmp(format, "short"))
data.format = REPLACE_FORMAT_SHORT;
@@ -84,6 +84,7 @@ static int list_replace_refs(const char *pattern, const char 
*format)
format);
 
for_each_replace_ref(show_reference, (void *));
+   wildmatch_free(data.wildmatch_compiled);
 
return 0;
 }
diff --git a/config.c b/config.c
index b0c20e6cb8..0f595de971 100644
--- a/config.c
+++ b/config.c
@@ -210,6 +210,7 @@ static int include_by_gitdir(const struct config_options 
*opts,
int ret = 0, prefix;
const char *git_dir;
int already_tried_absolute = 0;
+   struct wildmatch_compiled *wildmatch_compiled = NULL;
 
if (opts->git_dir)
git_dir = opts->git_dir;
@@ -237,8 +238,10 @@ static int include_by_gitdir(const struct config_options 
*opts,
goto done;
}
 
-   ret = !wildmatch(pattern.buf + prefix, text.buf + prefix,
-icase ? WM_CASEFOLD : 0);
+   if (!wildmatch_compiled)
+   wildmatch_compiled = wildmatch_compile(pattern.buf + prefix,
+  icase ? WM_CASEFOLD : 0);
+   ret = !wildmatch_match(wildmatch_compiled, text.buf + prefix);
 
if (!ret && !already_tried_absolute) {
/*
@@ -257,6 +260,7 @@ static int include_by_gitdir(const struct config_options 
*opts,
 done:
strbuf_release();
strbuf_release();
+   wildmatch_free(wildmatch_compiled);
return ret;
 }
 
diff --git a/refs.c b/refs.c
index 20ba82b434..c631793d1e 100644
--- a/refs.c
+++ b/refs.c
@@ -213,7 +213,7 @@ char *resolve_refdup(const char *refname, int resolve_flags,
 
 /* The argument to filter_refs */
 struct ref_filter {
-   const char *pattern;
+   struct wildmatch_compiled *code;
each_ref_fn *fn;
void *cb_data;
 };
@@ -291,7 +291,7 @@ static int filter_refs(const char *refname, const struct 
object_id *oid,
 {
struct ref_filter *filter = (struct ref_filter *)data;
 
-   if (wildmatch(filter->pattern, refname, 0))
+   if (wildmatch_match(filter->code, refname))
return 0;
return filter->fn(refname, oid, flags, filter->cb_data);
 }
@@ -454,12 +454,13 @@ int for_each_glob_ref_in(each_ref_fn fn, const char 
*pattern,
strbuf_addch(_pattern, '*');
}
 
-   filter.pattern = real_pattern.buf;
+   filter.code = wildmatch_compile(real_pattern.buf, 0);
filter.fn = 

[PATCH 0/2] wildmatch precompilation interface

2018-02-25 Thread Ævar Arnfjörð Bjarmason
My recently landed wildmatch test series was in preperation for some
more major wildmatch changes.

Here's another series that prepares for bigger changes in wildmatch,
it adds an interface to pre-compile the patterns. Right now there's no
point in doing this, and it's harmless since none of the codepaths are
that performance sensitive, but down the line this'll save us time as
we'll be able to skip re-parsing the pattern each time with a better
wildmatch backend.

Ævar Arnfjörð Bjarmason (2):
  wildmatch: add interface for precompiling wildmatch() patterns
  wildmatch: use the new precompiling wildmatch()

 builtin/name-rev.c |  7 ++-
 builtin/replace.c  |  7 ---
 config.c   |  8 ++--
 refs.c |  7 ---
 wildmatch.c| 25 +
 wildmatch.h| 11 +++
 6 files changed, 56 insertions(+), 9 deletions(-)

-- 
2.15.1.424.g9478a66081



Re: [PATCH v3 0/6] Fix initializing the_hash_algo

2018-02-25 Thread brian m. carlson
On Sun, Feb 25, 2018 at 06:18:34PM +0700, Nguyễn Thái Ngọc Duy wrote:
> v3 refines v2 a bit more:
> 
> - fix configure typo (and stray words in commit message)
> - use repo_set_hash_algo() instead of reassigning the_hash_algo
> - compare hash algos by format_id
> - catch NULL hash algo, report nicely and suggest GIT_HASH_FIXUP
> 
> The last point makes me much happier about keeping this workaround
> around until we are confident we can live without it. Interdiff

This looks sane to me.

Reviewed-by: brian m. carlson 
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 5/5] Revert "repository: pre-initialize hash algo pointer"

2018-02-25 Thread brian m. carlson
On Sun, Feb 25, 2018 at 10:29:29AM +0700, Duy Nguyen wrote:
> When I set unknown hash algo here, I think some test failed
> mysteriously because it used rawsz field (which has value zero), it
> didn't match some expectation, the code went to the error handling
> path, which eventually failed with some error message, but it's not
> obvious that the problem was rawsz being zero and back tracking that
> took me some time.
> 
> With NULL hash_algo, any dereferencing fails immediately with a nice
> stack trace. Another reason to push me towards NULL hash algo is, even
> if we prefer nice messages over segmentation faults, we can't avoid it
> completely anyway (empty_tree and empty_blob are still NULL in unknown
> hash algo and will cause segfaults). Might as well make things
> consistent and always segfault.

That makes sense.  Thanks for the explanation.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 06/13] git-send-email: unconditionally use Net::{SMTP,Domain}

2018-02-25 Thread Eric Sunshine
On Sun, Feb 25, 2018 at 2:46 PM, Ævar Arnfjörð Bjarmason
 wrote:
> The Net::SMTP and Net::Domain were both first released with perl
> v5.7.3[1], since my d48b284183 ("perl: bump the required Perl version
> to 5.8 from 5.6.[21]", 2010-09-24) we've depended on 5.8, so there's
> no reason to conditionally require this anymore.

s/this/them/ or s/this/these/

> This conditional loading was initially added in
> 87840620fd ("send-email: only 'require' instead of 'use' Net::SMTP",
> 2006-06-01) for Net::SMTP and 134550fe21 ("git-send-email.perl - try
> to give real name of the calling host to HELO/EHLO", 2010-03-14) for
> Net::Domain, both of which predate the hard dependency on 5.8.
>
> Since they're guaranteed to be installed now let's "use" them
> instead. The cost of loading them both is trivial given what
> git-send-email does (~15ms on my system), and it's better to not defer
> any potential loading errors until runtime.
>
> This patch is better viewed with -w, which shows that the only change
> in the last two hunks is removing the "if eval" wrapper block.
>
> 1. $ parallel 'corelist {}' ::: Net::{SMTP,Domain}
>Data for 2015-02-14
>Net::SMTP was first released with perl v5.7.3
>
>Data for 2015-02-14
>Net::Domain was first released with perl v5.7.3
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 


Re: [PATCH v2 04/13] gitweb: hard-depend on the Digest::MD5 5.8 module

2018-02-25 Thread Eric Sunshine
On Sun, Feb 25, 2018 at 2:46 PM, Ævar Arnfjörð Bjarmason
 wrote:
> Since my d48b284183 ("perl: bump the required Perl version to 5.8 from
> 5.6.[21]", 2010-09-24), we've depended on 5.8, so there's no reason to
> conditionally require Digest::MD5 anymore. It was released with perl
> v5.7.3[1]
>
> The initial introduction of the dependency in
> e9fdd74e53 ("gitweb: (gr)avatar support", 2009-06-30) says as much,
> this also undoes part of the later 2e9c8789b7 ("gitweb: Mention
> optional Perl modules in INSTALL", 2011-02-04) since gitweb will
> always be run on at least 5.8, so there's no need to mention
> Digest::MD5 as a required module in the documentation, let's instead
> say that we require perl 5.8.
>
> As with the preceding Net::{SMTP,Domain} change we now unconditionally
> "use" the module instead.

This is patch 4/13 but the Net::{SMTP,Domain} change doesn't come
until 6/13, so saying "preceding" is confusing.

> 1. $ corelist Digest::MD5
>Data for 2015-02-14
>Digest::MD5 was first released with perl v5.7.3
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 


Re: [PATCH v8 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-02-25 Thread Eric Sunshine
On Sat, Feb 24, 2018 at 11:28 AM,   wrote:
> UTF supports lossless conversion round tripping and conversions between
> UTF and other encodings are mostly round trip safe as Unicode aims to be
> a superset of all other character encodings. However, certain encodings
> (e.g. SHIFT-JIS) are known to have round trip issues [1].
>
> Add 'core.checkRoundtripEncoding', which contains a comma separated
> list of encodings, to define for what encodings Git should check the
> conversion round trip if they are used in the 'working-tree-encoding'
> attribute.
>
> Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.
>
> [1] 
> https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode
>
> Signed-off-by: Lars Schneider 
> ---
> diff --git a/convert.c b/convert.c
> @@ -289,6 +289,39 @@ static void trace_encoding(const char *context, const 
> char *path,
> +static int check_roundtrip(const char* enc_name)
> +{
> +   /*
> +* check_roundtrip_encoding contains a string of space and/or
> +* comma separated encodings (eg. "UTF-16, ASCII, CP1125").
> +* Search for the given encoding in that string.
> +*/
> +   const char *found = strcasestr(check_roundtrip_encoding, enc_name);
> +   const char *next = found + strlen(enc_name);

Is this pointer arithmetic undefined behavior (according to the C
standard) if NULL is returned by strcasestr()? If so, how comfortable
are we with this? Perhaps if you add an 'if' into the mix, you can
avoid it:

if (found) {
const char *next = found + strlen(enc_name);
return ...long complicated expression...;
}
return false;

> +   int len = strlen(check_roundtrip_encoding);
> +   return (found && (
> +   /*
> +* check that the found encoding is at the
> +* beginning of check_roundtrip_encoding or
> +* that it is prefixed with a space or comma
> +*/
> +   found == check_roundtrip_encoding || (
> +   found > check_roundtrip_encoding &&
> +   (*(found-1) == ' ' || *(found-1) == ',')

Although the 'found > check_roundtrip_encoding' expression is
effectively dead code in this case, it does document that the
programmer didn't just blindly use '*(found-1)' without taking
boundary conditions into consideration.

(It's dead code because, at this point, we know that strcasestr()
didn't return NULL and we know that 'found' doesn't equal
'check_roundtrip_encoding', therefore it _must_ be greater than
'check_roundtrip_encoding'.)

Just an observation; not a request to remove the dead code since it
has some documentation value.

> +   )
> +   ) && (
> +   /*
> +* check that the found encoding is at the
> +* end of check_roundtrip_encoding or
> +* that it is suffixed with a space or comma
> +*/
> +   next == check_roundtrip_encoding + len || (
> +   next < check_roundtrip_encoding + len &&
> +   (*next == ' ' || *next == ',')
> +   )
> +   ));
> +}
> @@ -366,6 +399,47 @@ static int encode_to_git(const char *path, const char 
> *src, size_t src_len,
> +   if ((conv_flags & CONV_WRITE_OBJECT) && check_roundtrip(enc->name)) {
> +   re_src = reencode_string_len(dst, dst_len,
> +enc->name, default_encoding,
> +_src_len);
> +
> +   trace_printf("Checking roundtrip encoding for %s...\n", 
> enc->name);
> +   trace_encoding("reencoded source", path, enc->name,
> +  re_src, re_src_len);
> +
> +   if (!re_src || src_len != re_src_len ||
> +   memcmp(src, re_src, src_len)) {
> +   const char* msg = _("encoding '%s' from %s to %s and "
> +   "back is not the same");
> +   die(msg, path, enc->name, default_encoding);

Last two arguments need to be swapped.

> +   }
> +
> +   free(re_src);
> +   }
> +
> strbuf_attach(buf, dst, dst_len, dst_len + 1);
> return 1;
>  }
> diff --git a/t/t0028-working-tree-encoding.sh 
> b/t/t0028-working-tree-encoding.sh
> @@ -225,4 +225,45 @@ test_expect_success 'error if encoding garbage is 
> already in Git' '
> +test_expect_success 'check roundtrip encoding' '
> +   text="hallo there!\nroundtrip test here!" &&
> +   printf "$text" | iconv -f UTF-8 -t SHIFT-JIS >roundtrip.shift &&
> +   printf "$text" | iconv -f UTF-8 -t UTF-16 >roundtrip.utf16 &&
> +  

[PATCH v2 12/13] Makefile: add NO_PERL_CPAN_FALLBACKS knob

2018-02-25 Thread Ævar Arnfjörð Bjarmason
From: Todd Zullinger 

We include some perl modules which are not part of the core perl
install, as a convenience.  This allows us to rely on those modules in
our perl-based tools and scripts without requiring users to install the
modules from CPAN or their operating system packages.

Users whose operating system provides these modules and packagers of Git
often don't want to ship or use these bundled modules.  Allow these
users to set NO_PERL_CPAN_FALLBACKS to avoid installing the bundled
modules.

Signed-off-by: Todd Zullinger 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile | 8 
 1 file changed, 8 insertions(+)

diff --git a/Makefile b/Makefile
index 7b97699864..518c5f6be0 100644
--- a/Makefile
+++ b/Makefile
@@ -296,6 +296,12 @@ all::
 #
 # Define NO_PERL if you do not want Perl scripts or libraries at all.
 #
+# Define NO_PERL_CPAN_FALLBACKS if you do not want to install bundled
+# copies of CPAN modules that serve as a fallback in case the modules
+# are not available on the system. This option is intended for
+# distributions that want to use their packaged versions of Perl
+# modules, instead of the fallbacks shipped with Git.
+#
 # Define PYTHON_PATH to the path of your Python binary (often /usr/bin/python
 # but /usr/bin/python2.7 on some platforms).
 #
@@ -2305,8 +2311,10 @@ LIB_CPAN_GEN := $(patsubst 
perl/%.pm,perl/build/lib/%.pm,$(LIB_CPAN))
 
 ifndef NO_PERL
 all:: $(LIB_PERL_GEN)
+ifndef NO_PERL_CPAN_FALLBACKS
 all:: $(LIB_CPAN_GEN)
 endif
+endif
 
 perl/build/lib/%.pm: perl/%.pm
$(QUIET_GEN)mkdir -p $(dir $@) && \
-- 
2.15.1.424.g9478a66081



[PATCH v2 13/13] perl Git::LoadCPAN: emit better errors under NO_PERL_CPAN_FALLBACKS

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Before my 20d2a30f8f ("Makefile: replace perl/Makefile.PL with simple
make rules", 2017-12-10) on an OS package that removed the
private-Error.pm copy we carried around manually removing the OS's
Error.pm would yield:

$ git add -p
Can't locate Error.pm in @INC (you may need to install the Error module) 
[...]

Now, before this change we'll instead emit this more cryptic error:

$ git add -p
BUG: '/usr/share/perl5/Git/FromCPAN' should be a directory! at 
/usr/share/perl5/Git/Error.pm line 36.

This is a confusing error. Now if the new NO_PERL_CPAN_FALLBACKS
option is specified and we can't find the module we'll instead emit:

$ /tmp/git/bin/git add -p
BUG: The 'Error' module is not here, but NO_PERL_CPAN_FALLBACKS was set!

[...]

Where [...] is the lengthy explanation seen in the change below, which
explains what the potential breakage is, and how to fix it.

The reason for checking @@NO_PERL_CPAN_FALLBACKS@@] against the empty
string in Perl is as opposed to checking for a boolean value is that
that's (as far as I can tell) make's idea of a string that's set, and
e.g. NO_PERL_CPAN_FALLBACKS=0 is enough to set NO_PERL_CPAN_FALLBACKS.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 INSTALL  | 11 ---
 Makefile |  5 -
 perl/Git/LoadCPAN.pm | 33 -
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/INSTALL b/INSTALL
index 808e07b659..60e515eaf7 100644
--- a/INSTALL
+++ b/INSTALL
@@ -88,9 +88,9 @@ Issues of note:
export GIT_EXEC_PATH PATH GITPERLLIB
 
  - By default (unless NO_PERL is provided) Git will ship various perl
-   scripts & libraries it needs. However, for simplicity it doesn't
-   use the ExtUtils::MakeMaker toolchain to decide where to place the
-   perl libraries. Depending on the system this can result in the perl
+   scripts. However, for simplicity it doesn't use the
+   ExtUtils::MakeMaker toolchain to decide where to place the perl
+   libraries. Depending on the system this can result in the perl
libraries not being where you'd like them if they're expected to be
used by things other than Git itself.
 
@@ -102,6 +102,11 @@ Issues of note:
Will result in e.g. perllibdir=/usr/share/perl/5.26.1 on Debian,
perllibdir=/usr/share/perl5 (which we'd use by default) on CentOS.
 
+ - Unless NO_PERL is provided Git will ship various perl libraries it
+   needs. Distributors of Git will usually want to set
+   NO_PERL_CPAN_FALLBACKS if NO_PERL is not provided to use their own
+   copies of the CPAN modules Git needs.
+
  - Git is reasonably self-sufficient, but does depend on a few external
programs and libraries.  Git can be used without most of them by adding
the approriate "NO_=YesPlease" to the make command line or
diff --git a/Makefile b/Makefile
index 518c5f6be0..4e0cdb3ca4 100644
--- a/Makefile
+++ b/Makefile
@@ -2314,11 +2314,14 @@ all:: $(LIB_PERL_GEN)
 ifndef NO_PERL_CPAN_FALLBACKS
 all:: $(LIB_CPAN_GEN)
 endif
+NO_PERL_CPAN_FALLBACKS_SQ = $(subst ','\'',$(NO_PERL_CPAN_FALLBACKS))
 endif
 
 perl/build/lib/%.pm: perl/%.pm
$(QUIET_GEN)mkdir -p $(dir $@) && \
-   sed -e 's|@@LOCALEDIR@@|$(localedir_SQ)|g' < $< > $@
+   sed -e 's|@@LOCALEDIR@@|$(localedir_SQ)|g' \
+   -e 's|@@NO_PERL_CPAN_FALLBACKS@@|$(NO_PERL_CPAN_FALLBACKS_SQ)|g' \
+   < $< > $@
 
 perl/build/man/man3/Git.3pm: perl/Git.pm
$(QUIET_GEN)mkdir -p $(dir $@) && \
diff --git a/perl/Git/LoadCPAN.pm b/perl/Git/LoadCPAN.pm
index 229c1ee87d..e5585e75e8 100644
--- a/perl/Git/LoadCPAN.pm
+++ b/perl/Git/LoadCPAN.pm
@@ -19,13 +19,25 @@ attempt to load C from the OS, and if that 
doesn't work
 will fall back on C shipped with Git itself.
 
 Usually distributors will not ship with Git's Git::FromCPAN tree at
-all, preferring to use their own packaging of CPAN modules instead.
+all via the C option, preferring to use their
+own packaging of CPAN modules instead.
 
 This module is only intended to be used for code shipping in the
 C repository. Use it for anything else at your peril!
 
 =cut
 
+# NO_PERL_CPAN_FALLBACKS_STR evades the sed search-replace from the
+# Makefile, and allows for detecting whether the module is loaded from
+# perl/Git as opposed to perl/build/Git, which is useful for one-off
+# testing without having Error.pm et al installed.
+use constant NO_PERL_CPAN_FALLBACKS_STR => '@@' . 'NO_PERL_CPAN_FALLBACKS' . 
'@@';
+use constant NO_PERL_CPAN_FALLBACKS => (
+   q[@@NO_PERL_CPAN_FALLBACKS@@] ne ''
+   and
+   q[@@NO_PERL_CPAN_FALLBACKS@@] ne NO_PERL_CPAN_FALLBACKS_STR
+);
+
 sub import {
shift;
my $caller = caller;
@@ -45,6 +57,25 @@ sub import {
} or do {
my $error = $@ || "Zombie Error";
 
+   if (NO_PERL_CPAN_FALLBACKS) {
+   chomp(my $error = sprintf <<'THEY_PROMISED', $module);
+BUG: The '%s' module is not here, but NO_PERL_CPAN_FALLBACKS was set!
+

[PATCH v2 04/13] gitweb: hard-depend on the Digest::MD5 5.8 module

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Since my d48b284183 ("perl: bump the required Perl version to 5.8 from
5.6.[21]", 2010-09-24), we've depended on 5.8, so there's no reason to
conditionally require Digest::MD5 anymore. It was released with perl
v5.7.3[1]

The initial introduction of the dependency in
e9fdd74e53 ("gitweb: (gr)avatar support", 2009-06-30) says as much,
this also undoes part of the later 2e9c8789b7 ("gitweb: Mention
optional Perl modules in INSTALL", 2011-02-04) since gitweb will
always be run on at least 5.8, so there's no need to mention
Digest::MD5 as a required module in the documentation, let's instead
say that we require perl 5.8.

As with the preceding Net::{SMTP,Domain} change we now unconditionally
"use" the module instead.

1. $ corelist Digest::MD5
   Data for 2015-02-14
   Digest::MD5 was first released with perl v5.7.3

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 gitweb/INSTALL |  3 +--
 gitweb/gitweb.perl | 17 -
 2 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/gitweb/INSTALL b/gitweb/INSTALL
index 408f2859d3..a58e6b3c44 100644
--- a/gitweb/INSTALL
+++ b/gitweb/INSTALL
@@ -29,12 +29,11 @@ Requirements
 
 
  - Core git tools
- - Perl
+ - Perl 5.8
  - Perl modules: CGI, Encode, Fcntl, File::Find, File::Basename.
  - web server
 
 The following optional Perl modules are required for extra features
- - Digest::MD5 - for gravatar support
  - CGI::Fast and FCGI - for running gitweb as FastCGI script
  - HTML::TagCloud - for fancy tag cloud in project list view
  - HTTP::Date or Time::ParseDate - to support If-Modified-Since for feeds
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 2417057f2b..2594a4badb 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -20,6 +20,8 @@ use Fcntl ':mode';
 use File::Find qw();
 use File::Basename qw(basename);
 use Time::HiRes qw(gettimeofday tv_interval);
+use Digest::MD5 qw(md5_hex);
+
 binmode STDOUT, ':utf8';
 
 if (!defined($CGI::VERSION) || $CGI::VERSION < 4.08) {
@@ -490,7 +492,6 @@ our %feature = (
# Currently available providers are gravatar and picon.
# If an unknown provider is specified, the feature is disabled.
 
-   # Gravatar depends on Digest::MD5.
# Picon currently relies on the indiana.edu database.
 
# To enable system wide have in $GITWEB_CONFIG
@@ -1166,18 +1167,8 @@ sub configure_gitweb_features {
our @snapshot_fmts = gitweb_get_feature('snapshot');
@snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
 
-   # check that the avatar feature is set to a known provider name,
-   # and for each provider check if the dependencies are satisfied.
-   # if the provider name is invalid or the dependencies are not met,
-   # reset $git_avatar to the empty string.
our ($git_avatar) = gitweb_get_feature('avatar');
-   if ($git_avatar eq 'gravatar') {
-   $git_avatar = '' unless (eval { require Digest::MD5; 1; });
-   } elsif ($git_avatar eq 'picon') {
-   # no dependencies
-   } else {
-   $git_avatar = '';
-   }
+   $git_avatar = '' unless $git_avatar =~ /^(?:gravatar|picon)$/s;
 
our @extra_branch_refs = gitweb_get_feature('extra-branch-refs');
@extra_branch_refs = filter_and_validate_refs (@extra_branch_refs);
@@ -2167,7 +2158,7 @@ sub gravatar_url {
my $size = shift;
$avatar_cache{$email} ||=
"//www.gravatar.com/avatar/" .
-   Digest::MD5::md5_hex($email) . "?s=";
+   md5_hex($email) . "?s=";
return $avatar_cache{$email} . $size;
 }
 
-- 
2.15.1.424.g9478a66081



[PATCH v2 07/13] perl: update our ancient copy of Error.pm

2018-02-25 Thread Ævar Arnfjörð Bjarmason
The Error.pm shipped with Git as a fallback if there was no Error.pm
on the system was released in April 2006. There's been dozens of
releases since then, the latest at August 7, 2017. Let's update to
that.

I don't know of anything we need from this new release or which this
fixes. This change is simply a matter of keeping up with
upstream. Before this users who'd install git via their package system
would get an up-to-date Error.pm, but if it's installed from source
they'd get one more than a decade old.

This undoes a local hack we'd accumulated in 96bc4de85c ("Eliminate
Scalar::Util usage from private-Error.pm", 2006-07-26), it's been
redundant since my d48b284183 ("perl: bump the required Perl version
to 5.8 from 5.6.[21]", 2010-09-24).

This also undoes 3a51467b94 ("Typo fix: replacing it's -> its",
2013-04-13). This is the Nth time I find that some upstream code of
ours (in contrib/, in sha1dc/ and now in perl/ ...) has diverged from
upstream because of some tree-wide typo fixing. Let's not do those
fixes against upstream projects, it's more valuable that we have a 1=1
mapping to upstream than to fix typos in docs we never even generate
from this code. If someone wants to fix typos in them fine, but they
should do it with a patch to upstream which git.git can then
incorporate.

The upstream code doesn't cleanly pass a --check, so I'm adding a
.gitattributes file for similar reasons as done for sha1dc in
5d184f468e ("sha1dc: ignore indent-with-non-tab whitespace
violations", 2017-06-06).

The updated source was retrieved from
https://fastapi.metacpan.org/source/SHLOMIF/Error-0.17025/lib/Error.pm

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git/FromCPAN/.gitattributes |   1 +
 perl/Git/FromCPAN/Error.pm   | 296 +--
 2 files changed, 256 insertions(+), 41 deletions(-)
 create mode 100644 perl/Git/FromCPAN/.gitattributes

diff --git a/perl/Git/FromCPAN/.gitattributes b/perl/Git/FromCPAN/.gitattributes
new file mode 100644
index 00..8b64fc5e22
--- /dev/null
+++ b/perl/Git/FromCPAN/.gitattributes
@@ -0,0 +1 @@
+/Error.pm whitespace=-blank-at-eof
diff --git a/perl/Git/FromCPAN/Error.pm b/perl/Git/FromCPAN/Error.pm
index 6098135ae2..f9c36e9e98 100644
--- a/perl/Git/FromCPAN/Error.pm
+++ b/perl/Git/FromCPAN/Error.pm
@@ -12,10 +12,12 @@
 package Error;
 
 use strict;
+use warnings;
+
 use vars qw($VERSION);
 use 5.004;
 
-$VERSION = "0.15009";
+$VERSION = "0.17025";
 
 use overload (
'""'   =>   'stringify',
@@ -32,21 +34,35 @@ $Error::THROWN = undef; # last error thrown, a 
workaround until die $ref works
 my $LAST;  # Last error created
 my %ERROR; # Last error associated with package
 
-sub throw_Error_Simple
+sub _throw_Error_Simple
 {
 my $args = shift;
 return Error::Simple->new($args->{'text'});
 }
 
-$Error::ObjectifyCallback = \_Error_Simple;
+$Error::ObjectifyCallback = \&_throw_Error_Simple;
 
 
 # Exported subs are defined in Error::subs
 
+use Scalar::Util ();
+
 sub import {
 shift;
+my @tags = @_;
 local $Exporter::ExportLevel = $Exporter::ExportLevel + 1;
-Error::subs->import(@_);
+
+@tags = grep {
+   if( $_ eq ':warndie' ) {
+  Error::WarnDie->import();
+  0;
+   }
+   else {
+  1;
+   }
+} @tags;
+
+Error::subs->import(@tags);
 }
 
 # I really want to use last for the name of this method, but it is a keyword
@@ -107,10 +123,6 @@ sub stacktrace {
 $text;
 }
 
-# Allow error propagation, ie
-#
-# $ber->encode(...) or
-#return Error->prior($ber)->associate($ldap);
 
 sub associate {
 my $err = shift;
@@ -130,6 +142,7 @@ sub associate {
 return;
 }
 
+
 sub new {
 my $self = shift;
 my($pkg,$file,$line) = caller($Error::Depth);
@@ -246,6 +259,10 @@ sub value {
 
 package Error::Simple;
 
+use vars qw($VERSION);
+
+$VERSION = "0.17025";
+
 @Error::Simple::ISA = qw(Error);
 
 sub new {
@@ -288,14 +305,6 @@ use vars qw(@EXPORT_OK @ISA %EXPORT_TAGS);
 
 @ISA = qw(Exporter);
 
-
-sub blessed {
-   my $item = shift;
-   local $@; # don't kill an outer $@
-   ref $item and eval { $item->can('can') };
-}
-
-
 sub run_clauses ($$$\@) {
 my($clauses,$err,$wantarray,$result) = @_;
 my $code = undef;
@@ -314,16 +323,17 @@ sub run_clauses ($$$\@) {
my $pkg = $catch->[$i];
unless(defined $pkg) {
#except
-   splice(@$catch,$i,2,$catch->[$i+1]->());
+   splice(@$catch,$i,2,$catch->[$i+1]->($err));
$i -= 2;
next CATCHLOOP;
}
-   elsif(blessed($err) && $err->isa($pkg)) {
+   elsif(Scalar::Util::blessed($err) && $err->isa($pkg)) {
$code = $catch->[$i+1];
while(1) {
my $more = 0;
-   local($Error::THROWN);
+   

[PATCH v2 11/13] perl: move the perl/Git/FromCPAN tree to perl/FromCPAN

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Move the CPAN modules that have lived under perl/Git/FromCPAN since my
20d2a30f8f ("Makefile: replace perl/Makefile.PL with simple make
rules", 2017-12-10) to perl/FromCPAN.

A subsequent change will teach the Makefile to only install these
copies of CPAN modules if a flag that distro packagers would like to
set isn't set. Due to how the wildcard globbing is being done it's
much easier to accomplish that if they're moved to their own
directory.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile| 3 +++
 perl/{Git => }/FromCPAN/.gitattributes  | 0
 perl/{Git => }/FromCPAN/Error.pm| 0
 perl/{Git => }/FromCPAN/Mail/Address.pm | 0
 perl/Git/LoadCPAN.pm| 5 ++---
 5 files changed, 5 insertions(+), 3 deletions(-)
 rename perl/{Git => }/FromCPAN/.gitattributes (100%)
 rename perl/{Git => }/FromCPAN/Error.pm (100%)
 rename perl/{Git => }/FromCPAN/Mail/Address.pm (100%)

diff --git a/Makefile b/Makefile
index c56fdc14ca..7b97699864 100644
--- a/Makefile
+++ b/Makefile
@@ -2300,9 +2300,12 @@ po/build/locale/%/LC_MESSAGES/git.mo: po/%.po
 
 LIB_PERL := $(wildcard perl/Git.pm perl/Git/*.pm perl/Git/*/*.pm 
perl/Git/*/*/*.pm)
 LIB_PERL_GEN := $(patsubst perl/%.pm,perl/build/lib/%.pm,$(LIB_PERL))
+LIB_CPAN := $(wildcard perl/FromCPAN/*.pm perl/FromCPAN/*/*.pm)
+LIB_CPAN_GEN := $(patsubst perl/%.pm,perl/build/lib/%.pm,$(LIB_CPAN))
 
 ifndef NO_PERL
 all:: $(LIB_PERL_GEN)
+all:: $(LIB_CPAN_GEN)
 endif
 
 perl/build/lib/%.pm: perl/%.pm
diff --git a/perl/Git/FromCPAN/.gitattributes b/perl/FromCPAN/.gitattributes
similarity index 100%
rename from perl/Git/FromCPAN/.gitattributes
rename to perl/FromCPAN/.gitattributes
diff --git a/perl/Git/FromCPAN/Error.pm b/perl/FromCPAN/Error.pm
similarity index 100%
rename from perl/Git/FromCPAN/Error.pm
rename to perl/FromCPAN/Error.pm
diff --git a/perl/Git/FromCPAN/Mail/Address.pm b/perl/FromCPAN/Mail/Address.pm
similarity index 100%
rename from perl/Git/FromCPAN/Mail/Address.pm
rename to perl/FromCPAN/Mail/Address.pm
diff --git a/perl/Git/LoadCPAN.pm b/perl/Git/LoadCPAN.pm
index 1568c177fe..229c1ee87d 100644
--- a/perl/Git/LoadCPAN.pm
+++ b/perl/Git/LoadCPAN.pm
@@ -16,8 +16,7 @@ source.
 Therefore the L namespace shipped with Git contains
 wrapper modules like C that will first
 attempt to load C from the OS, and if that doesn't work
-will fall back on C shipped with Git
-itself.
+will fall back on C shipped with Git itself.
 
 Usually distributors will not ship with Git's Git::FromCPAN tree at
 all, preferring to use their own packaging of CPAN modules instead.
@@ -52,7 +51,7 @@ sub import {
my $Git_LoadCPAN_pm_root = 
File::Basename::dirname($Git_LoadCPAN_pm_path) || die "BUG: Can't figure out 
lib/Git dirname from '$Git_LoadCPAN_pm_path'!";
 
require File::Spec;
-   my $Git_pm_FromCPAN_root = 
File::Spec->catdir($Git_LoadCPAN_pm_root, 'FromCPAN');
+   my $Git_pm_FromCPAN_root = 
File::Spec->catdir($Git_LoadCPAN_pm_root, '..', 'FromCPAN');
die "BUG: '$Git_pm_FromCPAN_root' should be a directory!" 
unless -d $Git_pm_FromCPAN_root;
 
local @INC = ($Git_pm_FromCPAN_root, @INC);
-- 
2.15.1.424.g9478a66081



[PATCH v2 08/13] perl: update our copy of Mail::Address

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
23, 2018). Like the preceding Error.pm update this is done simply to
keep up-to-date with upstream, and as can be shown from the diff
there's no functional changes.

The updated source was retrieved from
https://fastapi.metacpan.org/source/MARKOV/MailTools-2.20/lib/Mail/Address.pm

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git/FromCPAN/Mail/Address.pm | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
index 13b2ff7d05..683d490b2b 100644
--- a/perl/Git/FromCPAN/Mail/Address.pm
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -1,10 +1,14 @@
-# Copyrights 1995-2017 by [Mark Overmeer ].
+# Copyrights 1995-2018 by [Mark Overmeer].
 #  For other contributors see ChangeLog.
 # See the manual pages for details on the licensing terms.
 # Pod stripped from pm file by OODoc 2.02.
+# This code is part of the bundle MailTools.  Meta-POD processed with
+# OODoc into POD and HTML manual-pages.  See README.md for Copyright.
+# Licensed under the same terms as Perl itself.
+
 package Mail::Address;
 use vars '$VERSION';
-$VERSION = '2.19';
+$VERSION = '2.20';
 
 use strict;
 
-- 
2.15.1.424.g9478a66081



[PATCH v2 10/13] perl: generalize the Git::LoadCPAN facility

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Change the two wrappers that load from CPAN (local OS) or our own copy
to do so via the same codepath.

I added the Error.pm wrapper in 20d2a30f8f ("Makefile: replace
perl/Makefile.PL with simple make rules", 2017-12-10), and shortly
afterwards Matthieu Moy added a wrapper for Mail::Address in
bd869f67b9 ("send-email: add and use a local copy of Mail::Address",
2018-01-05).

His loader was simpler since Mail::Address doesn't have an "import"
method, but didn't do the same sanity checking; For example, a missing
FromCPAN directory (which OS packages are likely not to have) wouldn't
be explicitly warned about as a "BUG: ...".

Update both to use a common implementation based on the previous
Error.pm loader. Which has been amended to take the module to load as
parameter, as well as whether or not that module has an import
method.

This loader should be generic enough to handle almost all CPAN modules
out there, some use some crazy loading magic and wouldn't like being
wrapped like this, but that would be immediately obvious, and we'd
find out right away since the module wouldn't work at all.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git/LoadCPAN.pm  | 74 +++
 perl/Git/LoadCPAN/Error.pm| 44 +++
 perl/Git/LoadCPAN/Mail/Address.pm | 22 +++-
 3 files changed, 82 insertions(+), 58 deletions(-)
 create mode 100644 perl/Git/LoadCPAN.pm

diff --git a/perl/Git/LoadCPAN.pm b/perl/Git/LoadCPAN.pm
new file mode 100644
index 00..1568c177fe
--- /dev/null
+++ b/perl/Git/LoadCPAN.pm
@@ -0,0 +1,74 @@
+package Git::LoadCPAN;
+use 5.008;
+use strict;
+use warnings;
+
+=head1 NAME
+
+Git::LoadCPAN - Wrapper for loading modules from the CPAN (OS) or Git's own 
copy
+
+=head1 DESCRIPTION
+
+The Perl code in Git depends on some modules from the CPAN, but we
+don't want to make those a hard requirement for anyone building from
+source.
+
+Therefore the L namespace shipped with Git contains
+wrapper modules like C that will first
+attempt to load C from the OS, and if that doesn't work
+will fall back on C shipped with Git
+itself.
+
+Usually distributors will not ship with Git's Git::FromCPAN tree at
+all, preferring to use their own packaging of CPAN modules instead.
+
+This module is only intended to be used for code shipping in the
+C repository. Use it for anything else at your peril!
+
+=cut
+
+sub import {
+   shift;
+   my $caller = caller;
+   my %args = @_;
+   my $module = exists $args{module} ? delete $args{module} : die "BUG: 
Expected 'module' parameter!";
+   my $import = exists $args{import} ? delete $args{import} : die "BUG: 
Expected 'import' parameter!";
+   die "BUG: Too many arguments!" if keys %args;
+
+   # Foo::Bar to Foo/Bar.pm
+   my $package_pm = $module;
+   $package_pm =~ s[::][/]g;
+   $package_pm .= '.pm';
+
+   eval {
+   require $package_pm;
+   1;
+   } or do {
+   my $error = $@ || "Zombie Error";
+
+   my $Git_LoadCPAN_pm_path = $INC{"Git/LoadCPAN.pm"} || die "BUG: 
Should have our own path from %INC!";
+
+   require File::Basename;
+   my $Git_LoadCPAN_pm_root = 
File::Basename::dirname($Git_LoadCPAN_pm_path) || die "BUG: Can't figure out 
lib/Git dirname from '$Git_LoadCPAN_pm_path'!";
+
+   require File::Spec;
+   my $Git_pm_FromCPAN_root = 
File::Spec->catdir($Git_LoadCPAN_pm_root, 'FromCPAN');
+   die "BUG: '$Git_pm_FromCPAN_root' should be a directory!" 
unless -d $Git_pm_FromCPAN_root;
+
+   local @INC = ($Git_pm_FromCPAN_root, @INC);
+   require $package_pm;
+   };
+
+   if ($import) {
+   no strict 'refs';
+   *{"${caller}::import"} = sub {
+   shift;
+   use strict 'refs';
+   unshift @_, $module;
+   goto &{"${module}::import"};
+   };
+   use strict 'refs';
+   }
+}
+
+1;
diff --git a/perl/Git/LoadCPAN/Error.pm b/perl/Git/LoadCPAN/Error.pm
index 3513fe745b..c6d2c45d80 100644
--- a/perl/Git/LoadCPAN/Error.pm
+++ b/perl/Git/LoadCPAN/Error.pm
@@ -2,45 +2,9 @@ package Git::LoadCPAN::Error;
 use 5.008;
 use strict;
 use warnings;
-
-=head1 NAME
-
-Git::LoadCPAN::Error - Wrapper for the L module, in case it's not 
installed
-
-=head1 DESCRIPTION
-
-Wraps the import function for the L module.
-
-This module is only intended to be used for code shipping in the
-C repository. Use it for anything else at your peril!
-
-=cut
-
-sub import {
-shift;
-my $caller = caller;
-
-eval {
-   require Error;
-   1;
-} or do {
-   my $error = $@ || "Zombie Error";
-
-   my $Git_Error_pm_path = $INC{"Git/LoadCPAN/Error.pm"} || die "BUG: 
Should have our own path from %INC!";
-
-   require File::Basename;
-   my $Git_Error_pm_root = 

[PATCH v2 05/13] Git.pm: hard-depend on the File::{Temp,Spec} modules

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Since my d48b284183 ("perl: bump the required Perl version to 5.8 from
5.6.[21]", 2010-09-24), we've depended on 5.8, so there's no reason to
conditionally require File::Temp and File::Spec anymore. They were
first released with perl versions v5.6.1 and 5.00405, respectively.

This code was originally added in c14c8ceb13 ("Git.pm: Make File::Spec
and File::Temp requirement lazy", 2008-08-15), presumably to make
Git.pm work on 5.6.0.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git.pm | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 7ec16e6dde..151b0e7144 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -11,6 +11,8 @@ use 5.008;
 use strict;
 use warnings;
 
+use File::Temp ();
+use File::Spec ();
 
 BEGIN {
 
@@ -190,7 +192,6 @@ sub repository {
};
 
if ($dir) {
-   _verify_require();
File::Spec->file_name_is_absolute($dir) or $dir = 
$opts{Directory} . '/' . $dir;
$opts{Repository} = abs_path($dir);
 
@@ -1289,8 +1290,6 @@ sub temp_release {
 sub _temp_cache {
my ($self, $name) = _maybe_self(@_);
 
-   _verify_require();
-
my $temp_fd = \$TEMP_FILEMAP{$name};
if (defined $$temp_fd and $$temp_fd->opened) {
if ($TEMP_FILES{$$temp_fd}{locked}) {
@@ -1324,11 +1323,6 @@ sub _temp_cache {
$$temp_fd;
 }
 
-sub _verify_require {
-   eval { require File::Temp; require File::Spec; };
-   $@ and throw Error::Simple($@);
-}
-
 =item temp_reset ( FILEHANDLE )
 
 Truncates and resets the position of the C.
-- 
2.15.1.424.g9478a66081



[PATCH v2 02/13] Git.pm: remove redundant "use strict" from sub-package

2018-02-25 Thread Ævar Arnfjörð Bjarmason
In Perl the "use strict/warnings" pragmas are lexical, thus there's no
reason to do:

package Foo;
use strict;
package Bar;
use strict;
$x = 5;

To satisfy the desire that the undeclared $x variable will be spotted
at compile-time. It's enough to include the first "use strict".

This functionally changes nothing, but makes a subsequent change where
"use warnings" will be added to Git.pm less confusing and less
verbose, since as with "strict" we'll only need to do that at the top
of the file.

Changes code initially added in a6065b548f ("Git.pm: Try to support
ActiveState output pipe", 2006-06-25).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git.pm | 1 -
 1 file changed, 1 deletion(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 9d60d7948b..99e5d943af 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -1692,7 +1692,6 @@ sub DESTROY {
 # Pipe implementation for ActiveState Perl.
 
 package Git::activestate_pipe;
-use strict;
 
 sub TIEHANDLE {
my ($class, @params) = @_;
-- 
2.15.1.424.g9478a66081



[PATCH v2 03/13] Git.pm: add the "use warnings" pragma

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Amend Git.pm to load the "warnings" pragma like the rest of the code
in perl/ in addition to the existing "strict" pragma. This is
considered the bare minimum best practice in Perl.

Ever since this code was introduced in b1edc53d06 ("Introduce
Git.pm (v4)", 2006-06-24) it's only been using "strict", not
"warnings".

This leaves contrib/buildsystems/Generators/{QMake,VCproj}.pm and
contrib/mw-to-git/Git/Mediawiki.pm without "use warnings". Amending
those would be a sensible follow-up change, but I don't have an easy
way to test those so I'm not changing them.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/perl/Git.pm b/perl/Git.pm
index 99e5d943af..7ec16e6dde 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -9,6 +9,7 @@ package Git;
 
 use 5.008;
 use strict;
+use warnings;
 
 
 BEGIN {
-- 
2.15.1.424.g9478a66081



[PATCH v2 09/13] perl: move CPAN loader wrappers to another namespace

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Move the Git::Error and Git::Mail::Address wrappers to the
Git::LoadCPAN::Loader::* namespace, e.g. Git::LoadCPAN::Error. That
module will then either load Error from CPAN (if installed on the OS),
or use Git::FromCPAN::Error.

When I added the Error wrapper in 20d2a30f8f ("Makefile: replace
perl/Makefile.PL with simple make rules", 2017-12-10) I didn't think
about how confusing it would be to have these modules sitting in the
same tree as our normal modules. Let's put these all into
Git::{Load,From}CPAN::* to clearly distinguish them from the rest.

This also makes things a bit less confusing since there was already a
Git::Error namespace ever since 8b9150e3e3 ("Git.pm: Handle failed
commands' output", 2006-06-24).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 contrib/examples/git-difftool.perl  | 2 +-
 git-send-email.perl | 4 ++--
 perl/Git.pm | 2 +-
 perl/Git/{ => LoadCPAN}/Error.pm| 8 
 perl/Git/{ => LoadCPAN}/Mail/Address.pm | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)
 rename perl/Git/{ => LoadCPAN}/Error.pm (78%)
 rename perl/Git/{ => LoadCPAN}/Mail/Address.pm (69%)

diff --git a/contrib/examples/git-difftool.perl 
b/contrib/examples/git-difftool.perl
index fb0fd0b84b..b2ea80f9ed 100755
--- a/contrib/examples/git-difftool.perl
+++ b/contrib/examples/git-difftool.perl
@@ -13,7 +13,7 @@
 use 5.008;
 use strict;
 use warnings;
-use Git::Error qw(:try);
+use Git::LoadCPAN::Error qw(:try);
 use File::Basename qw(dirname);
 use File::Copy;
 use File::Find;
diff --git a/git-send-email.perl b/git-send-email.perl
index d5a4826a1c..1ec22c5ef3 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -26,13 +26,13 @@ use Text::ParseWords;
 use Term::ANSIColor;
 use File::Temp qw/ tempdir tempfile /;
 use File::Spec::Functions qw(catdir catfile);
-use Git::Error qw(:try);
+use Git::LoadCPAN::Error qw(:try);
 use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
-use Git::Mail::Address;
 use Net::Domain ();
 use Net::SMTP ();
+use Git::LoadCPAN::Mail::Address;
 
 Getopt::Long::Configure qw/ pass_through /;
 
diff --git a/perl/Git.pm b/perl/Git.pm
index 151b0e7144..9f246c7988 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -104,7 +104,7 @@ increase notwithstanding).
 
 
 use Carp qw(carp croak); # but croak is bad - throw instead
-use Git::Error qw(:try);
+use Git::LoadCPAN::Error qw(:try);
 use Cwd qw(abs_path cwd);
 use IPC::Open2 qw(open2);
 use Fcntl qw(SEEK_SET SEEK_CUR);
diff --git a/perl/Git/Error.pm b/perl/Git/LoadCPAN/Error.pm
similarity index 78%
rename from perl/Git/Error.pm
rename to perl/Git/LoadCPAN/Error.pm
index 09bbc97390..3513fe745b 100644
--- a/perl/Git/Error.pm
+++ b/perl/Git/LoadCPAN/Error.pm
@@ -1,11 +1,11 @@
-package Git::Error;
+package Git::LoadCPAN::Error;
 use 5.008;
 use strict;
 use warnings;
 
 =head1 NAME
 
-Git::Error - Wrapper for the L module, in case it's not installed
+Git::LoadCPAN::Error - Wrapper for the L module, in case it's not 
installed
 
 =head1 DESCRIPTION
 
@@ -26,13 +26,13 @@ sub import {
 } or do {
my $error = $@ || "Zombie Error";
 
-   my $Git_Error_pm_path = $INC{"Git/Error.pm"} || die "BUG: Should have 
our own path from %INC!";
+   my $Git_Error_pm_path = $INC{"Git/LoadCPAN/Error.pm"} || die "BUG: 
Should have our own path from %INC!";
 
require File::Basename;
my $Git_Error_pm_root = File::Basename::dirname($Git_Error_pm_path) || 
die "BUG: Can't figure out lib/Git dirname from '$Git_Error_pm_path'!";
 
require File::Spec;
-   my $Git_pm_FromCPAN_root = File::Spec->catdir($Git_Error_pm_root, 
'FromCPAN');
+   my $Git_pm_FromCPAN_root = File::Spec->catdir($Git_Error_pm_root, '..', 
'FromCPAN');
die "BUG: '$Git_pm_FromCPAN_root' should be a directory!" unless -d 
$Git_pm_FromCPAN_root;
 
local @INC = ($Git_pm_FromCPAN_root, @INC);
diff --git a/perl/Git/Mail/Address.pm b/perl/Git/LoadCPAN/Mail/Address.pm
similarity index 69%
rename from perl/Git/Mail/Address.pm
rename to perl/Git/LoadCPAN/Mail/Address.pm
index 2ce3e84670..879c2f5cd1 100644
--- a/perl/Git/Mail/Address.pm
+++ b/perl/Git/LoadCPAN/Mail/Address.pm
@@ -1,11 +1,11 @@
-package Git::Mail::Address;
+package Git::LoadCPAN::Mail::Address;
 use 5.008;
 use strict;
 use warnings;
 
 =head1 NAME
 
-Git::Mail::Address - Wrapper for the L module, in case it's not 
installed
+Git::LoadCPAN::Mail::Address - Wrapper for the L module, in 
case it's not installed
 
 =head1 DESCRIPTION
 
-- 
2.15.1.424.g9478a66081



[PATCH v2 00/13] various perl fixes

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Addresses comments against v1, and ships with a new
NO_PERL_CPAN_FALLBACKS knob (thanks Todd!) which distributions can
turn on to not get our FromCPAN copies. Details below.

Todd Zullinger (1):
  Makefile: add NO_PERL_CPAN_FALLBACKS knob

Some of this was split off into into my "perl: move the
perl/Git/FromCPAN tree to perl/FromCPAN", and I added "This option is
intended[...]" to the Makefile documentation.

Ævar Arnfjörð Bjarmason (12):
  perl: *.pm files should not have the executable bit

JN: Rephrased commit message.

  Git.pm: remove redundant "use strict" from sub-package
  Git.pm: add the "use warnings" pragma

These are both new, something I noticed and seems sensible to fix
while we're at it.

  gitweb: hard-depend on the Digest::MD5 5.8 module
  Git.pm: hard-depend on the File::{Temp,Spec} modules
  git-send-email: unconditionally use Net::{SMTP,Domain}

These all "use" the modules now, instead of using "require".

JN: Removed a comment from the gitweb code that isn't needed anymore.

Clarifications / fixes to the commit messages. 

  perl: update our ancient copy of Error.pm

Rephrased commit message.

  perl: update our copy of Mail::Address

Actually ships with the working version of Mail::Address now (oops!),
which makes the patch much smaller, and requires less explanation.

  perl: move CPAN loader wrappers to another namespace

JN: Commit message phrasing.

  perl: generalize the Git::LoadCPAN facility

JN: Commit message phrasing, and s//\t/g.

  perl: move the perl/Git/FromCPAN tree to perl/FromCPAN
  perl Git::LoadCPAN: emit better errors under NO_PERL_CPAN_FALLBACKS

These are both new. There's now a NO_PERL_CPAN_FALLBACKS option for
distributors to turn on to stop us from installing these CPAN
fallbacks.

13/13 updates the docs, and emits a better error than what we emit now
in master if the CPAN module we expect from the OS isn't installed.

 INSTALL |  11 +-
 Makefile|  16 +-
 contrib/examples/git-difftool.perl  |   2 +-
 git-send-email.perl |  28 ++-
 gitweb/INSTALL  |   3 +-
 gitweb/gitweb.perl  |  17 +-
 perl/FromCPAN/.gitattributes|   1 +
 perl/{Git => }/FromCPAN/Error.pm| 296 +++-
 perl/{Git => }/FromCPAN/Mail/Address.pm |   8 +-
 perl/Git.pm |  14 +-
 perl/Git/Error.pm   |  46 -
 perl/Git/LoadCPAN.pm| 104 +++
 perl/Git/LoadCPAN/Error.pm  |  10 ++
 perl/Git/LoadCPAN/Mail/Address.pm   |  10 ++
 perl/Git/Mail/Address.pm|  24 ---
 15 files changed, 432 insertions(+), 158 deletions(-)
 create mode 100644 perl/FromCPAN/.gitattributes
 rename perl/{Git => }/FromCPAN/Error.pm (72%)
 rename perl/{Git => }/FromCPAN/Mail/Address.pm (96%)
 delete mode 100644 perl/Git/Error.pm
 create mode 100644 perl/Git/LoadCPAN.pm
 create mode 100644 perl/Git/LoadCPAN/Error.pm
 create mode 100644 perl/Git/LoadCPAN/Mail/Address.pm
 delete mode 100755 perl/Git/Mail/Address.pm

-- 
2.15.1.424.g9478a66081



[PATCH v2 01/13] perl: *.pm files should not have the executable bit

2018-02-25 Thread Ævar Arnfjörð Bjarmason
The Git::Mail::Address file added in bd869f67b9 ("send-email: add and
use a local copy of Mail::Address", 2018-01-05) had the executable bit
set. That bit should not be set for *.pm files. It breaks nothing but
it is redundant and confusing as none of the other files have it and
these files are never executed as stand-alone programs.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 perl/Git/Mail/Address.pm | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 mode change 100755 => 100644 perl/Git/Mail/Address.pm

diff --git a/perl/Git/Mail/Address.pm b/perl/Git/Mail/Address.pm
old mode 100755
new mode 100644
-- 
2.15.1.424.g9478a66081



[PATCH v2 06/13] git-send-email: unconditionally use Net::{SMTP,Domain}

2018-02-25 Thread Ævar Arnfjörð Bjarmason
The Net::SMTP and Net::Domain were both first released with perl
v5.7.3[1], since my d48b284183 ("perl: bump the required Perl version
to 5.8 from 5.6.[21]", 2010-09-24) we've depended on 5.8, so there's
no reason to conditionally require this anymore.

This conditional loading was initially added in
87840620fd ("send-email: only 'require' instead of 'use' Net::SMTP",
2006-06-01) for Net::SMTP and 134550fe21 ("git-send-email.perl - try
to give real name of the calling host to HELO/EHLO", 2010-03-14) for
Net::Domain, both of which predate the hard dependency on 5.8.

Since they're guaranteed to be installed now let's "use" them
instead. The cost of loading them both is trivial given what
git-send-email does (~15ms on my system), and it's better to not defer
any potential loading errors until runtime.

This patch is better viewed with -w, which shows that the only change
in the last two hunks is removing the "if eval" wrapper block.

1. $ parallel 'corelist {}' ::: Net::{SMTP,Domain}
   Data for 2015-02-14
   Net::SMTP was first released with perl v5.7.3

   Data for 2015-02-14
   Net::Domain was first released with perl v5.7.3

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 git-send-email.perl | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index bbf4deaa0d..d5a4826a1c 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -31,6 +31,8 @@ use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
 use Git::Mail::Address;
+use Net::Domain ();
+use Net::SMTP ();
 
 Getopt::Long::Configure qw/ pass_through /;
 
@@ -1143,10 +1145,8 @@ sub valid_fqdn {
 sub maildomain_net {
my $maildomain;
 
-   if (eval { require Net::Domain; 1 }) {
-   my $domain = Net::Domain::domainname();
-   $maildomain = $domain if valid_fqdn($domain);
-   }
+   my $domain = Net::Domain::domainname();
+   $maildomain = $domain if valid_fqdn($domain);
 
return $maildomain;
 }
@@ -1154,17 +1154,15 @@ sub maildomain_net {
 sub maildomain_mta {
my $maildomain;
 
-   if (eval { require Net::SMTP; 1 }) {
-   for my $host (qw(mailhost localhost)) {
-   my $smtp = Net::SMTP->new($host);
-   if (defined $smtp) {
-   my $domain = $smtp->domain;
-   $smtp->quit;
+   for my $host (qw(mailhost localhost)) {
+   my $smtp = Net::SMTP->new($host);
+   if (defined $smtp) {
+   my $domain = $smtp->domain;
+   $smtp->quit;
 
-   $maildomain = $domain if valid_fqdn($domain);
+   $maildomain = $domain if valid_fqdn($domain);
 
-   last if $maildomain;
-   }
+   last if $maildomain;
}
}
 
-- 
2.15.1.424.g9478a66081



RE: [PATCH 3/8] perl: generalize the Git::LoadCPAN facility

2018-02-25 Thread Randall S. Becker
On February 25, 2018 1:57 PM, Ævar Arnfjörð Bjarmason wrote:
> On Wed, Feb 14 2018, Jonathan Nieder jotted:
> 
> > Ævar Arnfjörð Bjarmason wrote:
> >
> >> Change the two wrappers to load from CPAN (local OS) or our own copy
> >> to do so via the same codepath.
> >
> > nit: I think with s/to load/that load/ this will be easier to read.
> >
> >> I added the Error.pm wrapper in 20d2a30f8f ("Makefile: replace
> >> perl/Makefile.PL with simple make rules", 2017-12-10), and shortly
> >> afterwards Matthieu Moy added a wrapper for Mail::Address in
> >> bd869f67b9 ("send-email: add and use a local copy of Mail::Address",
> >> 2018-01-05).
> >>
> >> His was simpler since Mail::Address doesn't have an "import" method,
> >> but didn't do the same sanity checking, e.g. a missing FromCPAN
> >> directory (which OS packages are likely not to have) wouldn't be
> >> explicitly warned about.
> >
> > I'm having trouble parsing this.  Mail::Address didn't do the same
> > sanity checking or his didn't?
> >
> > The comma before e.g. should be a period or semicolon, since it's
> > starting a new independent clause.
> >
> >> Now both use a modification of the previously Error.pm-specific
> >> codepath, which has been amended to take the module to load as
> >> parameter, as well as whether or not that module has an import method.
> >
> > Does "now" mean before this patch or after this patch?  Usually commit
> > messages describe the status quo without the patch in the present
> > tense and the change the patch will make in the imperative.
> > So this could say:
> >
> > Update both to use a common implementation based on the
> previous
> > Error.pm loader.
> 
> All good feeedback, thanks. Incorporated into v2 which I'm about to submit.
> 
> > [...]
> >> +++ b/perl/Git/LoadCPAN.pm
> >> @@ -0,0 +1,74 @@
> > [...]
> >> +The Perl code in Git depends on some modules from the CPAN, but we
> >> +don't want to make those a hard requirement for anyone building from
> >> +source.
> >
> > not about this patch: have we considered making it a hard requirement?
> > Both Mail::Address and Error.pm are pretty widely available, and I
> > wonder if we could make the instructions in the INSTALL file say that
> > they are required dependencies to simplify things.
> 
> I can't remember when, but at some point this was discussed on list, and the
> consensus was that us using perl should be kept as a non-invasive
> implementation detail that would be as small of a pain as possible for users.

That would include the platform I'm maintaining, where perl is currently pretty 
handcuffed and blindfolded (including completion code misinterprets). CPAN 
isn't currently an option, but might be soon.

> It's easy for distros to package these modules, but for users building from
> source who know nothing about perl it can be a pain.

Know perl I do. Use it not, can I. ;-)

> I also think it's very useful to avoid the side-discussion about not using 
> some
> useful CPAN module in the future just because it's not widely used, but
> would be perfect for some use-case of ours.
> 
> > I admit part of my bias here is coming from the distro world, where we
> > have to do extra work to get rid of the FromCPAN embedded copies and
> > would be happier not to have to.
> 
> I think there's a very good argument to be made for inverting the
> NO_PERL_CPAN_FALLBACKS logic, but my soon to be submitted v2 keeps it
> off by default.

Cool, thanks.

Cheers,
Randall

-- Brief whoami:
  NonStop developer since approximately NonStop(2112884442)
  UNIX developer since approximately 421664400
-- In my real life, I talk too much.





Re: [PATCH 1/8] perl: *.pm files should not have the executable bit

2018-02-25 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>
>> The Git::Mail::Address file added in bd869f67b9 ("send-email: add and
>> use a local copy of Mail::Address", 2018-01-05) had the executable bit
>> set, this should not be the case with *.pm files, it breaks nothing,
>> but is redundant and confusing as none of the other files have it, and
>> it's never executed as a stand-alone program.
>
> Needs a period somewhere to break up the long sentence with comma
> splices.  How about:
>
>   The Git::Mail::Address file added in bd869f67b9 ("send-email: add and
>   use a local copy of Mail::Address", 2018-01-05) had the executable bit
>   set. That bit should not be set for *.pm files. It breaks nothing but
>   but it is redundant and confusing as none of the other files have it
>   and these files are never executed as stand-alone programs.

Thanks, used this with s/but but/but/

>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>
> With or without such a tweak,
> Reviewed-by: Jonathan Nieder 
>
> Thanks.



Re: [PATCH 3/8] perl: generalize the Git::LoadCPAN facility

2018-02-25 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Ævar Arnfjörð Bjarmason wrote:
>
>> Change the two wrappers to load from CPAN (local OS) or our own copy
>> to do so via the same codepath.
>
> nit: I think with s/to load/that load/ this will be easier to read.
>
>> I added the Error.pm wrapper in 20d2a30f8f ("Makefile: replace
>> perl/Makefile.PL with simple make rules", 2017-12-10), and shortly
>> afterwards Matthieu Moy added a wrapper for Mail::Address in
>> bd869f67b9 ("send-email: add and use a local copy of Mail::Address",
>> 2018-01-05).
>>
>> His was simpler since Mail::Address doesn't have an "import" method,
>> but didn't do the same sanity checking, e.g. a missing FromCPAN
>> directory (which OS packages are likely not to have) wouldn't be
>> explicitly warned about.
>
> I'm having trouble parsing this.  Mail::Address didn't do the same
> sanity checking or his didn't?
>
> The comma before e.g. should be a period or semicolon, since it's
> starting a new independent clause.
>
>> Now both use a modification of the previously Error.pm-specific
>> codepath, which has been amended to take the module to load as
>> parameter, as well as whether or not that module has an import method.
>
> Does "now" mean before this patch or after this patch?  Usually
> commit messages describe the status quo without the patch in the
> present tense and the change the patch will make in the imperative.
> So this could say:
>
>   Update both to use a common implementation based on the previous
>   Error.pm loader.

All good feeedback, thanks. Incorporated into v2 which I'm about to
submit.

> [...]
>> +++ b/perl/Git/LoadCPAN.pm
>> @@ -0,0 +1,74 @@
> [...]
>> +The Perl code in Git depends on some modules from the CPAN, but we
>> +don't want to make those a hard requirement for anyone building from
>> +source.
>
> not about this patch: have we considered making it a hard requirement?
> Both Mail::Address and Error.pm are pretty widely available, and I
> wonder if we could make the instructions in the INSTALL file say that
> they are required dependencies to simplify things.

I can't remember when, but at some point this was discussed on list, and
the consensus was that us using perl should be kept as a non-invasive
implementation detail that would be as small of a pain as possible for
users.

It's easy for distros to package these modules, but for users building
from source who know nothing about perl it can be a pain.

I also think it's very useful to avoid the side-discussion about not
using some useful CPAN module in the future just because it's not widely
used, but would be perfect for some use-case of ours.

> I admit part of my bias here is coming from the distro world, where we
> have to do extra work to get rid of the FromCPAN embedded copies and
> would be happier not to have to.

I think there's a very good argument to be made for inverting the
NO_PERL_CPAN_FALLBACKS logic, but my soon to be submitted v2 keeps it
off by default.

> [...]
>> +Usually OS's will not ship with Git's Git::FromCPAN tree at all,
>> +preferring to use their own packaging of CPAN modules instead.
>
> nit: I think the plural of OS is OSes, or something like
> "distributors" or "operating systems".

Thanks.

> [...]
>> +eval {
>> +require $package_pm;
>> +1;
>> +} or do {
>
> also not about this patch: this mixed tabs/spacing formatting feels a
> bit unusual.  I don't know if it's idiomatic for perl, and if it is
> then no complaints; it just stood out a little.  Can
> Documentation/CodingGuidelines say something about the preferred
> indentation in Perl to avoid having to think about such questions?

Thanks, that sucked. Changed to \t. I still haven't gotten around to
hacking my $EDITOR settings for git.git (like for C & SH).

>> --- a/perl/Git/LoadCPAN/Error.pm
>> +++ b/perl/Git/LoadCPAN/Error.pm
>> @@ -2,45 +2,9 @@ package Git::LoadCPAN::Error;
>>  use 5.008;
>>  use strict;
>>  use warnings;
>> +use Git::LoadCPAN (
>> +module => 'Error',
>> +import => 1,
>> +);
>
> Nice!
>
> Thanks and hope that helps,
> Jonathan


ref-filter: how to improve the code

2018-02-25 Thread Оля Тележная
Hi everyone,
I am trying to remove cat-file formatting part and reuse same
functionality from ref-filter.
I have a problem that cat-file sometimes needs to continue running
even if the request is broken, while in ref-filter we invoke die() in
many places everywhere during formatting process. I write this email
because I want to discuss how to implement the solution better.

ref-filter has 2 functions which could be interesting for us:
format_ref_array_item() and show_ref_array_item(). I guess it's a good
idea to print everything in show_ref_array_item(), including all
errors. In that case all current users of ref-filter will invoke
show_ref_array_item() (as they did it before), and cat-file could use
format_ref_array_item() and work with the result in its own logic.

I tried to replace all die("...") with `return error("...")` and
finally exit(), but actual problem is that we print "error:..."
instead of "fatal:...", and it looks funny.
The draft of the code is here: https://github.com/telezhnaya/git/commits/p2

One of the easiest solutions is to add strbuf parameter for errors to
all functions that we use during formatting process, fill it in and
print in show_ref_array_item() if necessary. What do you think about
this idea?

Another way is to change the resulting error message, print current
message with "error" prefix and then print something like "fatal:
could not format the output". It is easier but I am not sure that it's
a good idea to change the interface.

If you have another ideas - please share them with me. It is really
important step to make formatting logic more general and easier to
reuse.

Thanks!
Olga


[PATCH v1.1] t: prevent '-x' tracing from interfering with test helpers' stderr

2018-02-25 Thread SZEDER Gábor
Running a test script with '-x' turns on 'set -x' tracing, the output
of which is normally sent to stderr.  This causes a lot of
test failures, because many tests redirect and verify the stderr
of shell functions, most frequently that of 'test_must_fail'.
These issues were worked around somewhat in d88785e424 (test-lib: set
BASH_XTRACEFD automatically, 2016-05-11), so at least we could
reliably run tests with '-x' tracing under a Bash version supporting
BASH_XTRACEFD, i.e. v4.1 and later.

Futhermore, redirecting the stderr of test helper functions like
'test_must_fail' or 'test_expect_code' is the cause of a different
issue as well.  If these functions detect something unexpected, they
will write their error messages intended to the user to thier stderr.
However, if their stderr is redirected in order to save and verify the
stderr of the tested git command invoked in the function, then the
function's error messages will be redirected as well.  Consequently,
those messages won't reach the user, making the test's verbose output
less useful.

This patch makes it safe to redirect and verify the stderr of those
test helper functions which are meant to run the tested command given
as argument, even when running tests with '-x' and /bin/sh.  This is
achieved through a couple of file descriptor redirections:

  - Duplicate stderr of the tested command executed in the test helper
function from the function's fd 7 (see next point), to ensure that
the tested command's error messages go to a different fd than the
'-x' trace of the commands executed in the function or the
function's error messages.

  - Duplicate the test helper function's fd 7 from the function's
original stderr, meaning that, after taking a detour through fd 7,
the error messages of the tested command do end up on the
function's original stderr.

  - Duplicate stderr of the test helper function from fd 4, i.e. the
fd connected to the test script's original stderr and the fd used
for BASH_XTRACEFD.  This ensures that the '-x' trace of the
commands executed in the function

  - doesn't go to the function's original stderr, so it won't mess
with callers who want to save and verify the tested command's
stderr.

  - does go to the same fd independently from the shell running
the test script, be it /bin/sh, an older Bash without
BASH_XTRACEFD, or a more recent Bash already supporting
BASH_XTRACEFD.

Furthermore, this also makes sure that the function's error
messages go to this fd 4, meaning that the user will be able to
see them even if the function's stderr is redirected in the test.

  - Specify the latter two redirections above in the test helper
function's definition, so they are performed every time the
function is invoked, without the need to modify the callsites of
the function.

Perform these redirections in those test helper functions which can be
expected to have their stderr redirected, i.e. in the functions
'test_must_fail', 'test_might_fail', 'test_expect_code', 'test_env',
'nongit', 'test_terminal' and 'perl'.  Note that 'test_might_fail',
'test_env', and 'nongit' are not involved in any test failures when
running tests with '-x' and /bin/sh.

The other test helper functions are left unchanged, because they
either don't run commands specified as their arguments, or redirecting
their stderr wouldn't make sense, or both.

With this change the number of failures when running the test suite
with '-x' tracing and /bin/sh goes down from 340 failed tests in 43
test scripts to 22 failed tests in 6 scripts (or 23 in 7, if the
system (OSX) uses an older Bash version without BASH_XTRACEFD to run
't9903-bash-prompt.sh').

Signed-off-by: SZEDER Gábor 
---

Changes:

  - Duplicate from/to fd 7 instead of fd 9 in 'test_terminal'.

  - Talk about the issue that redirecting stderr of test helper
functions affect their error messages as well, and how this patch
resolves that issue as well.

 t/lib-terminal.sh   |  4 ++--
 t/test-lib-functions.sh | 24 
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/t/lib-terminal.sh b/t/lib-terminal.sh
index cd220e378e..e3809dcead 100644
--- a/t/lib-terminal.sh
+++ b/t/lib-terminal.sh
@@ -9,8 +9,8 @@ test_terminal () {
echo >&4 "test_terminal: need to declare TTY prerequisite"
return 127
fi
-   perl "$TEST_DIRECTORY"/test-terminal.perl "$@"
-}
+   perl "$TEST_DIRECTORY"/test-terminal.perl "$@" 2>&7
+} 7>&2 2>&4
 
 test_lazy_prereq TTY '
test_have_prereq PERL &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 67b5994afb..37eb34044a 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -621,7 +621,7 @@ test_must_fail () {
_test_ok=
;;
esac
-   "$@"
+   "$@" 2>&7
exit_code=$?
if test 

[PATCH] perf: use GIT_PERF_REPEAT_COUNT=3 by default even without config file

2018-02-25 Thread René Scharfe
9ba95ed23c (perf/run: update get_var_from_env_or_config() for
subsections) stopped setting a default value for GIT_PERF_REPEAT_COUNT
if no perf config file is present, because get_var_from_env_or_config
returns early in that case.

Fix it by setting the default value after calling this function.  Its
fifth parameter is not used for any other variable, so remove the
associated code.

Signed-off-by: Rene Scharfe 
---
 t/perf/run | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/t/perf/run b/t/perf/run
index 1a100d6134..213da5d6b9 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -106,29 +106,27 @@ get_var_from_env_or_config () {
conf_sec="$2"
conf_var="$3"
conf_opts="$4" # optional
-   # $5 can be set to a default value
 
# Do nothing if the env variable is already set
eval "test -z \"\${$env_var+x}\"" || return
 
test -z "$GIT_PERF_CONFIG_FILE" && return
 
# Check if the variable is in the config file
if test -n "$GIT_PERF_SUBSECTION"
then
var="$conf_sec.$GIT_PERF_SUBSECTION.$conf_var"
conf_value=$(git config $conf_opts -f "$GIT_PERF_CONFIG_FILE" 
"$var") &&
eval "$env_var=\"$conf_value\"" && return
fi
var="$conf_sec.$conf_var"
conf_value=$(git config $conf_opts -f "$GIT_PERF_CONFIG_FILE" "$var") &&
-   eval "$env_var=\"$conf_value\"" && return
-
-   test -n "${5+x}" && eval "$env_var=\"$5\""
+   eval "$env_var=\"$conf_value\""
 }
 
 run_subsection () {
-   get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf" "repeatCount" 
"--int" 3
+   get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf" "repeatCount" 
"--int"
+   : ${GIT_PERF_REPEAT_COUNT:=3}
export GIT_PERF_REPEAT_COUNT
 
get_var_from_env_or_config "GIT_PERF_DIRS_OR_REVS" "perf" "dirsOrRevs"
-- 
2.16.2


Re: [PATCH v8 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-25 Thread Lars Schneider

> On 25 Feb 2018, at 04:52, Eric Sunshine  wrote:
> 
> On Sat, Feb 24, 2018 at 11:27 AM,   wrote:
>> If the endianness is not defined in the encoding name, then let's
>> be strict and require a BOM to avoid any encoding confusion. The
>> is_missing_required_utf_bom() function returns true if a required BOM
>> is missing.
>> 
>> The Unicode standard instructs to assume big-endian if there in no BOM
>> for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
>> in HTML5 recommends to assume little-endian to "deal with deployed
>> content" [3]. Strictly requiring a BOM seems to be the safest option
>> for content in Git.
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>> diff --git a/utf8.h b/utf8.h
>> @@ -79,4 +79,20 @@ void strbuf_utf8_align(struct strbuf *buf, align_type 
>> position, unsigned int wid
>> +/*
>> + * If the endianness is not defined in the encoding name, then we
>> + * require a BOM. The function returns true if a required BOM is missing.
>> + *
>> + * The Unicode standard instructs to assume big-endian if there
>> + * in no BOM for UTF-16/32 [1][2]. However, the W3C/WHATWG
>> + * encoding standard used in HTML5 recommends to assume
>> + * little-endian to "deal with deployed content" [3].
> 
> Perhaps you could tack on to the comment here the final bit of
> explanation from the commit message which ties these conflicting
> recommendations together. In particular:
> 
>Therefore, strictly requiring a BOM seems to be the
>safest option for content in Git.

Agreed. I'll change it.

Thanks,
Lars

Re: [PATCH v8 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-25 Thread Lars Schneider

> On 25 Feb 2018, at 04:41, Eric Sunshine  wrote:
> 
> On Sat, Feb 24, 2018 at 11:27 AM,   wrote:
>> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
>> or UTF-32LE a BOM must not be used [1]. The function returns true if
>> this is the case.
>> 
>> [1] http://unicode.org/faq/utf_bom.html#bom10
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>> diff --git a/utf8.c b/utf8.c
>> @@ -538,6 +538,30 @@ char *reencode_string_len(const char *in, int insz,
>> +int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
>> +{
>> +   return (
>> + (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE")) &&
>> + (has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
>> +  has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
>> +   ) || (
>> + (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE")) &&
>> + (has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
>> +  has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
>> +   );
>> +}
> 
> Is this interpretation correct? When I read [1], I interpret it as
> saying that no BOM _of any sort_ should be present when the encoding
> is declared as one of UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE.

Correct!

> This
> code, on the other hand, only checks for BOMs corresponding to the
> declared size (16 or 32 bits).

Hmm. Interesting thought. You are saying that my code won't complain if
a document declared as UTF-16LE has a UTF32-LE BOM, correct? I would say
this is correct behavior in context of this function. This function assumes
that the document is proper UTF-16/UTF-16BE/UTF-16LE but it is wrongly
declared with respect to its BOM in the .gitattributes. Would this
comment make it more clear to you?

/*
 * If a data stream is declared as UTF-16BE or UTF-16LE, then a UTF-16
 * BOM must not be used [1]. The same applies for the UTF-32 
equivalents.
 * The function returns true if this rule is violated.
 *
 * [1] http://unicode.org/faq/utf_bom.html#bom10
 */


I think what you are referring to is a different class of error and
would therefore warrant its own checker function. Would you agree?


> I suppose the intention of [1] is to detect a mismatch between the
> declared encoding and how the stream is actually encoded. The check
> implemented here will fail to detect a mismatch between, say, declared
> encoding UTF-16BE and actual encoding UTF-32BE.

As stated above the intention is to detect wrong BOMs! I think we cannot 
detect the "declared as UTF-16BE but actually UTF-32BE" error.

Consider this:

printf "test" | iconv -f UTF-8 -t UTF-32BE | iconv -f UTF-16BE -t UTF-8 | od -c
000   \0   t  \0   e  \0   s  \0   t
010

In the first step we "encode" the string to UTF-32BE and then we "decode" it as
UTF-16BE. The result is valid although not correct. Does this make sense?

Thanks,
Lars




[PATCH v3 0/6] Fix initializing the_hash_algo

2018-02-25 Thread Nguyễn Thái Ngọc Duy
v3 refines v2 a bit more:

- fix configure typo (and stray words in commit message)
- use repo_set_hash_algo() instead of reassigning the_hash_algo
- compare hash algos by format_id
- catch NULL hash algo, report nicely and suggest GIT_HASH_FIXUP

The last point makes me much happier about keeping this workaround
around until we are confident we can live without it. Interdiff

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 1144458140..2c75f28b41 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -331,21 +331,24 @@ static const char *open_pack_file(const char *pack_name)
 
 static void prepare_hash_algo(uint32_t pack_version)
 {
+   int pack_algo_id;
const struct git_hash_algo *pack_algo;
 
switch (pack_version) {
case 2:
case 3:
-   pack_algo = _algos[GIT_HASH_SHA1];
+   pack_algo_id = GIT_HASH_SHA1;
break;
default:
-   die("BUG: how to determine hash algo for new version?");
+   die("BUG: how to determine hash algo for version %d?",
+   pack_version);
}
 
-   if (!the_hash_algo) /* running without repo */
-   the_hash_algo = pack_algo;
+   if (!repo_has_valid_hash_algo(the_repository)) /* running without repo 
*/
+   repo_set_hash_algo(the_repository, pack_algo_id);
 
-   if (the_hash_algo != pack_algo)
+   pack_algo = _algos[pack_algo_id];
+   if (the_hash_algo->format_id != pack_algo->format_id)
die(_("incompatible hash algorithm, "
  "configured for %s but the pack file needs %s"),
the_hash_algo->name, pack_algo->name);
diff --git a/cache.h b/cache.h
index 6b97138264..55b31e9756 100644
--- a/cache.h
+++ b/cache.h
@@ -53,7 +53,7 @@ struct object_id {
unsigned char hash[GIT_MAX_RAWSZ];
 };
 
-#define the_hash_algo the_repository->hash_algo
+#define the_hash_algo repo_get_hash_algo(the_repository)
 
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
 #define DTYPE(de)  ((de)->d_type)
diff --git a/diff.c b/diff.c
index f5755425b6..5f28d84b87 100644
--- a/diff.c
+++ b/diff.c
@@ -3997,15 +3997,15 @@ static void run_diff(struct diff_filepair *p, struct 
diff_options *o)
 
/*
 * NEEDSWORK: When running in no-index mode (and no repo is
-* found, thus no hash algo conifugred), fall back to SHA-1
+* found, thus no hash algo configured), fall back to SHA-1
 * hashing (which is used by diff_fill_oid_info below) to
 * avoid regression in diff output.
 *
 * In future, perhaps we can allow the user to specify their
 * hash algorithm from command line in this mode.
 */
-   if (o->flags.no_index && !the_hash_algo)
-   the_hash_algo = _algos[GIT_HASH_SHA1];
+   if (o->flags.no_index && !repo_has_valid_hash_algo(the_repository))
+   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
 
diff_fill_oid_info(one);
diff_fill_oid_info(two);
diff --git a/repository.h b/repository.h
index 0329e40c7f..40bd49611f 100644
--- a/repository.h
+++ b/repository.h
@@ -107,4 +107,20 @@ extern void repo_clear(struct repository *repo);
  */
 extern int repo_read_index(struct repository *repo);
 
+static inline const struct git_hash_algo *repo_get_hash_algo(
+   const struct repository *repo)
+{
+   if (!repo->hash_algo)
+   die("BUG: hash_algo is not initialized!\n%s",
+   _("You can work around this by setting environment"
+ " variable GIT_HASH_FIXUP=1.\n"
+ "Please report this to git@vger.kernel.org"));
+   return repo->hash_algo;
+}
+
+static inline int repo_has_valid_hash_algo(const struct repository *repo)
+{
+   return repo->hash_algo != NULL;
+}
+
 #endif /* REPOSITORY_H */

Nguyễn Thái Ngọc Duy (6):
  setup.c: initialize the_repository correctly in all cases
  sha1_file.c: keep a good name for "unknown" hash_algos[UNKNOWN]
  cache.h: make the_hash_algo read-only
  index-pack: check (and optionally set) hash algo based on input file
  diff.c: initialize hash algo when running in --no-index mode
  Revert "repository: pre-initialize hash algo pointer"

 builtin/index-pack.c | 29 -
 builtin/init-db.c|  3 ++-
 cache.h  |  5 +++--
 common-main.c| 10 ++
 diff.c   | 12 
 path.c   |  2 +-
 repository.c |  2 +-
 repository.h | 16 
 setup.c  |  5 -
 sha1_file.c  |  2 +-
 t/helper/test-dump-split-index.c |  2 ++
 11 files changed, 80 insertions(+), 8 deletions(-)

-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 1/6] setup.c: initialize the_repository correctly in all cases

2018-02-25 Thread Nguyễn Thái Ngọc Duy
There are many ways for any command to access a git repository:

- most of them will try to discover the .git dir via
  setup_git_directory() and friends

- the server side programs already know where the repo is and prepare
  with enter_repo()

- special commands that deal with repo creation (init/clone) use
  init_db() once the new repo is ready for access.

- somebody accesses $GIT_DIR before any of above functions are called
  and accidentally sets up a git repository by set_git_dir() alone

"the_repository" is partially set up via set_git_dir() at some point
in all four cases. The hash algorithm though is configured later after
.git/config is read.

So far proper repo initialization is done only for the first case [1].
The second case is not covered (but that's fine [3]). The third case
was found and worked around in [2]. The fourth case is a buggy one,
which should be fixed already by jk/no-looking-at-dotgit-outside-repo
and never happens again.

This patch makes sure all cases initialize the hash algorithm in
the_repository correctly. Both second and third cases must run
check_repo_format() before "entering" it. Eventually we probably just
rename this function to init_repo() or something.

[1] 78a6766802 (Integrate hash algorithm support with repo setup -
2017-11-12)

[2] e26f7f19b6 (repository: pre-initialize hash algo pointer -
2018-01-19)

[3] the reason server side is still running ok with no hash algo before
[2] is because the programs that use enter_repo() do very
little (and unlikely to hash anything) then spawn a new
program (like pack-objects or upload-archive) to do the heavy
lifting. These programs already use setup_git_directory() or the
gently version

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/init-db.c | 3 ++-
 cache.h   | 3 ++-
 path.c| 2 +-
 setup.c   | 5 -
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index c9b7946bad..d119d9906b 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -9,6 +9,7 @@
 #include "builtin.h"
 #include "exec_cmd.h"
 #include "parse-options.h"
+#include "repository.h"
 
 #ifndef DEFAULT_GIT_TEMPLATE_DIR
 #define DEFAULT_GIT_TEMPLATE_DIR "/usr/share/git-core/templates"
@@ -369,7 +370,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 * config file, so this will not fail.  What we are catching
 * is an attempt to reinitialize new repository with an old tool.
 */
-   check_repository_format();
+   check_repository_format(the_repository);
 
reinit = create_default_files(template_dir, original_git_dir);
 
diff --git a/cache.h b/cache.h
index 21fbcc2414..6b97138264 100644
--- a/cache.h
+++ b/cache.h
@@ -894,6 +894,7 @@ extern int repository_format_precious_objects;
 extern char *repository_format_partial_clone;
 extern const char *core_partial_clone_filter_default;
 
+struct repository;
 struct repository_format {
int version;
int precious_objects;
@@ -926,7 +927,7 @@ int verify_repository_format(const struct repository_format 
*format,
  * set_git_dir() before calling this, and use it only for "are we in a valid
  * repo?".
  */
-extern void check_repository_format(void);
+extern void check_repository_format(struct repository *);
 
 #define MTIME_CHANGED  0x0001
 #define CTIME_CHANGED  0x0002
diff --git a/path.c b/path.c
index da8b655730..a544252198 100644
--- a/path.c
+++ b/path.c
@@ -827,7 +827,7 @@ const char *enter_repo(const char *path, int strict)
 
if (is_git_directory(".")) {
set_git_dir(".");
-   check_repository_format();
+   check_repository_format(the_repository);
return path;
}
 
diff --git a/setup.c b/setup.c
index c5d55dcee4..a82103832e 100644
--- a/setup.c
+++ b/setup.c
@@ -1180,11 +1180,14 @@ int git_config_perm(const char *var, const char *value)
return -(i & 0666);
 }
 
-void check_repository_format(void)
+/* optionally configure "repo" to the correct format */
+void check_repository_format(struct repository *repo)
 {
struct repository_format repo_fmt;
check_repository_format_gently(get_git_dir(), _fmt, NULL);
startup_info->have_repository = 1;
+   if (repo)
+   repo_set_hash_algo(repo, repo_fmt.hash_algo);
 }
 
 /*
-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 2/6] sha1_file.c: keep a good name for "unknown" hash_algos[UNKNOWN]

2018-02-25 Thread Nguyễn Thái Ngọc Duy
This is mostly for displaying the hash algorithm name when we report
errors. Printing "unknown" with '%s' is much better than '(null)' in
glibc printf version (and probably could crash if other implementations
do not check for NULL pointer)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 sha1_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 826d7a0ae3..4b266b1c68 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -71,7 +71,7 @@ static void git_hash_unknown_final(unsigned char *hash, 
git_hash_ctx *ctx)
 
 const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
{
-   NULL,
+   "unknown",
0x,
0,
0,
-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 4/6] index-pack: check (and optionally set) hash algo based on input file

2018-02-25 Thread Nguyễn Thái Ngọc Duy
After 454253f059 (builtin/index-pack: improve hash function abstraction
- 2018-02-01), index-pack uses the_hash_algo for hashing. If "git
index-pack" is executed without a repository, we do not know what hash
algorithm to be used and the_hash_algo in theory could be undefined.

Since there should be some information about the hash algorithm in the
input pack file, we can initialize the correct hash algorithm with that
if the_hash_algo is not yet initialized. This assumes that pack files
with new hash algorithm MUST step up pack version.

While at there, make sure the hash algorithm requested by the pack file
and configured by the repository (if we're running with a repo) are
consistent.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/index-pack.c | 29 -
 repository.h |  5 +
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..2c75f28b41 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -326,10 +326,34 @@ static const char *open_pack_file(const char *pack_name)
output_fd = -1;
nothread_data.pack_fd = input_fd;
}
-   the_hash_algo->init_fn(_ctx);
return pack_name;
 }
 
+static void prepare_hash_algo(uint32_t pack_version)
+{
+   int pack_algo_id;
+   const struct git_hash_algo *pack_algo;
+
+   switch (pack_version) {
+   case 2:
+   case 3:
+   pack_algo_id = GIT_HASH_SHA1;
+   break;
+   default:
+   die("BUG: how to determine hash algo for version %d?",
+   pack_version);
+   }
+
+   if (!repo_has_valid_hash_algo(the_repository)) /* running without repo 
*/
+   repo_set_hash_algo(the_repository, pack_algo_id);
+
+   pack_algo = _algos[pack_algo_id];
+   if (the_hash_algo->format_id != pack_algo->format_id)
+   die(_("incompatible hash algorithm, "
+ "configured for %s but the pack file needs %s"),
+   the_hash_algo->name, pack_algo->name);
+}
+
 static void parse_pack_header(void)
 {
struct pack_header *hdr = fill(sizeof(struct pack_header));
@@ -341,6 +365,9 @@ static void parse_pack_header(void)
die(_("pack version %"PRIu32" unsupported"),
ntohl(hdr->hdr_version));
 
+   prepare_hash_algo(ntohl(hdr->hdr_version));
+   the_hash_algo->init_fn(_ctx);
+
nr_objects = ntohl(hdr->hdr_entries);
use(sizeof(struct pack_header));
 }
diff --git a/repository.h b/repository.h
index 5092df3700..02c695bc74 100644
--- a/repository.h
+++ b/repository.h
@@ -113,4 +113,9 @@ static inline const struct git_hash_algo 
*repo_get_hash_algo(
return repo->hash_algo;
 }
 
+static inline int repo_has_valid_hash_algo(const struct repository *repo)
+{
+   return repo->hash_algo != NULL;
+}
+
 #endif /* REPOSITORY_H */
-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 3/6] cache.h: make the_hash_algo read-only

2018-02-25 Thread Nguyễn Thái Ngọc Duy
By returning an R-value in the_hash_algo we make sure people can't
accidentally change hash algorithm with

the_hash_algo = _algos[something];

and go with repo_set_hash_algo() instead. Of course they can still do

the_repository->hash_algo = ...

but that is more obvious and easily caught in review.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache.h  | 2 +-
 repository.h | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 6b97138264..55b31e9756 100644
--- a/cache.h
+++ b/cache.h
@@ -53,7 +53,7 @@ struct object_id {
unsigned char hash[GIT_MAX_RAWSZ];
 };
 
-#define the_hash_algo the_repository->hash_algo
+#define the_hash_algo repo_get_hash_algo(the_repository)
 
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
 #define DTYPE(de)  ((de)->d_type)
diff --git a/repository.h b/repository.h
index 0329e40c7f..5092df3700 100644
--- a/repository.h
+++ b/repository.h
@@ -107,4 +107,10 @@ extern void repo_clear(struct repository *repo);
  */
 extern int repo_read_index(struct repository *repo);
 
+static inline const struct git_hash_algo *repo_get_hash_algo(
+   const struct repository *repo)
+{
+   return repo->hash_algo;
+}
+
 #endif /* REPOSITORY_H */
-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 6/6] Revert "repository: pre-initialize hash algo pointer"

2018-02-25 Thread Nguyễn Thái Ngọc Duy
This reverts commit e26f7f19b6c7485f04234946a59ab8f4fd21d6d1. The root
problem, git clone not setting up the_hash_algo, has been fixed in the
previous patch.

Since this is a dangerous move and could potentially break stuff after
release (and leads to workaround like the reverted commit), the
workaround technically remains, but is hidden behind a new environment
variable GIT_HASH_FIXUP. This should let the users continue to use git
while we fix the problem. This variable can be deleted after one or two
releases.

test-dump-split-index.c needs to call setup_git_directory() after this
to configure hash algorithm before parsing the index file. It's not the
best way (i.e. check index file version) but it's a test tool, this is
good enough.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 common-main.c| 10 ++
 repository.c |  2 +-
 repository.h |  5 +
 t/helper/test-dump-split-index.c |  2 ++
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/common-main.c b/common-main.c
index 6a689007e7..fbfa98c3b8 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "exec_cmd.h"
 #include "attr.h"
+#include "repository.h"
 
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
@@ -40,5 +41,14 @@ int main(int argc, const char **argv)
 
restore_sigpipe_to_default();
 
+   /*
+* Temporary recovery measure while hashing code is being
+* refactored. This variable should be gone after everybody
+* has used the_hash_algo in one or two releases and nobody
+* complains anything.
+*/
+   if (getenv("GIT_HASH_FIXUP"))
+   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
return cmd_main(argc, argv);
 }
diff --git a/repository.c b/repository.c
index 4ffbe9bc94..0d715f4fdb 100644
--- a/repository.c
+++ b/repository.c
@@ -5,7 +5,7 @@
 
 /* The main repository */
 static struct repository the_repo = {
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 
_algos[GIT_HASH_SHA1], 0, 0
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, NULL, 
0, 0
 };
 struct repository *the_repository = _repo;
 
diff --git a/repository.h b/repository.h
index 02c695bc74..40bd49611f 100644
--- a/repository.h
+++ b/repository.h
@@ -110,6 +110,11 @@ extern int repo_read_index(struct repository *repo);
 static inline const struct git_hash_algo *repo_get_hash_algo(
const struct repository *repo)
 {
+   if (!repo->hash_algo)
+   die("BUG: hash_algo is not initialized!\n%s",
+   _("You can work around this by setting environment"
+ " variable GIT_HASH_FIXUP=1.\n"
+ "Please report this to git@vger.kernel.org"));
return repo->hash_algo;
 }
 
diff --git a/t/helper/test-dump-split-index.c b/t/helper/test-dump-split-index.c
index e44430b699..b759fe3fc1 100644
--- a/t/helper/test-dump-split-index.c
+++ b/t/helper/test-dump-split-index.c
@@ -12,6 +12,8 @@ int cmd_main(int ac, const char **av)
struct split_index *si;
int i;
 
+   setup_git_directory();
+
do_read_index(_index, av[1], 1);
printf("own %s\n", sha1_to_hex(the_index.sha1));
si = the_index.split_index;
-- 
2.16.1.435.g8f24da2e1a



[PATCH v3 5/6] diff.c: initialize hash algo when running in --no-index mode

2018-02-25 Thread Nguyễn Thái Ngọc Duy
Our "git diff" command supports running as a standalone tool. In this
code path, we try to hash the file content but after
18e2588e11 (sha1_file: switch uses of SHA-1 to the_hash_algo -
2018-02-01), there is a chance that the_hash_algo (required by
index_path) may still be uninitialized if no repository is found.

Executing index_path() when the_hash_algo is NULL (or points to unknown
algo) either crashes or dies. Let's make it a bit safer by explicitly
falling back to SHA-1 (so that the diff output remains the same as
before, compared to the alternative that we simply do not hash).

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 diff.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/diff.c b/diff.c
index 21c3838b25..5f28d84b87 100644
--- a/diff.c
+++ b/diff.c
@@ -3995,6 +3995,18 @@ static void run_diff(struct diff_filepair *p, struct 
diff_options *o)
return;
}
 
+   /*
+* NEEDSWORK: When running in no-index mode (and no repo is
+* found, thus no hash algo configured), fall back to SHA-1
+* hashing (which is used by diff_fill_oid_info below) to
+* avoid regression in diff output.
+*
+* In future, perhaps we can allow the user to specify their
+* hash algorithm from command line in this mode.
+*/
+   if (o->flags.no_index && !repo_has_valid_hash_algo(the_repository))
+   repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
diff_fill_oid_info(one);
diff_fill_oid_info(two);
 
-- 
2.16.1.435.g8f24da2e1a



Re: [PATCH v4 00/12] rebase -i: offer to recreate merge commits

2018-02-25 Thread Jacob Keller
On Fri, Feb 23, 2018 at 4:35 AM, Johannes Schindelin
 wrote:
> Changes since v3:
>
> - fixed a grammar error in "introduce the `merge` command"'s commit message.
>
> - fixed a couple of resource leaks in safe_append() and do_reset(), pointed
>   out by Eric Sunshine.
>

The interdiff seems incorrect for such a small list of changes, it
appears like large sections of code added by this series appear in the
interdiff without subtractions from the previous versions? Is all that
code new to v3? If not, I'd suspect you accidentally diffed between
the wrong points.

Thanks,
Jake