Re: [PATCH] [GSOC] remove_temporary_files(): reimplement using iterators

2017-03-31 Thread Jeff King
On Sat, Apr 01, 2017 at 01:37:16AM -0400, Jeff King wrote:

> We use that strbuf for the prefix-comparison, too, but the way it is
> done is rather confusing. AFAICT, we could just be comparing against
> "packtmp + strlen(packdir) + 1". Though it would be simpler still to
> make "packtmp" just the basename, rather than the full path.

Something like the patch below, which would then free you up to replace
"buf" there with the diter->path.

diff --git a/builtin/repack.c b/builtin/repack.c
index 677bc7c81..06a14d687 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -48,7 +48,7 @@ static int repack_config(const char *var, const char *value, 
void *cb)
 static void remove_temporary_files(void)
 {
struct strbuf buf = STRBUF_INIT;
-   size_t dirlen, prefixlen;
+   size_t dirlen;
DIR *dir;
struct dirent *e;
 
@@ -56,14 +56,11 @@ static void remove_temporary_files(void)
if (!dir)
return;
 
-   /* Point at the slash at the end of ".../objects/pack/" */
-   dirlen = strlen(packdir) + 1;
-   strbuf_addstr(, packtmp);
-   /* Hold the length of  ".tmp-%d-pack-" */
-   prefixlen = buf.len - dirlen;
+   strbuf_addf(, "%s/", packdir);
+   dirlen = buf.len;
 
while ((e = readdir(dir))) {
-   if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
+   if (!starts_with(e->d_name, packtmp))
continue;
strbuf_setlen(, dirlen);
strbuf_addstr(, e->d_name);
@@ -216,7 +213,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
die(_(incremental_bitmap_conflict_error));
 
packdir = mkpathdup("%s/pack", get_object_directory());
-   packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
+   packtmp = mkpathdup(".tmp-%d-pack", (int)getpid());
 
sigchain_push_common(remove_pack_on_signal);
 
@@ -274,7 +271,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
if (delta_base_offset)
argv_array_push(,  "--delta-base-offset");
 
-   argv_array_push(, packtmp);
+   argv_array_pushf(, "%s/%s", packdir, packtmp);
 
cmd.git_cmd = 1;
cmd.out = -1;
@@ -372,8 +369,8 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
int exists = 0;
fname = mkpathdup("%s/pack-%s%s",
packdir, item->string, exts[ext].name);
-   fname_old = mkpathdup("%s-%s%s",
-   packtmp, item->string, exts[ext].name);
+   fname_old = mkpathdup("%s/%s-%s%s",
+   packdir, packtmp, item->string, 
exts[ext].name);
if (!stat(fname_old, )) {
statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | 
S_IWOTH);
chmod(fname_old, statbuffer.st_mode);


Re: [PATCH] [GSOC] remove_temporary_files(): reimplement using iterators

2017-03-31 Thread Jeff King
On Sat, Apr 01, 2017 at 03:24:58AM +0300, Robert Stanca wrote:

> @@ -49,12 +51,7 @@ static void remove_temporary_files(void)
>  {
>   struct strbuf buf = STRBUF_INIT;
>   size_t dirlen, prefixlen;
> - DIR *dir;
> - struct dirent *e;
> -
> - dir = opendir(packdir);
> - if (!dir)
> - return;
> + struct dir_iterator *diter = dir_iterator_begin(packdir);
>  
>   /* Point at the slash at the end of ".../objects/pack/" */
>   dirlen = strlen(packdir) + 1;
> @@ -62,14 +59,13 @@ static void remove_temporary_files(void)
>   /* Hold the length of  ".tmp-%d-pack-" */
>   prefixlen = buf.len - dirlen;
>  
> - while ((e = readdir(dir))) {
> - if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
> + while (dir_iterator_advance(diter) == ITER_OK) {
> + if (strncmp(diter->relative_path, buf.buf + dirlen, prefixlen))
>   continue;
>   strbuf_setlen(, dirlen);
> - strbuf_addstr(, e->d_name);
> + strbuf_addstr(, diter->relative_path);
>   unlink(buf.buf);
>   }
> - closedir(dir);

I think you could actually clean this code up more. The dir_iterator
already does this strbuf magic to hold the full path, so you should be
able to just run "unlink(iter->path.buf)", get rid of the extra strbuf
entirely.

We use that strbuf for the prefix-comparison, too, but the way it is
done is rather confusing. AFAICT, we could just be comparing against
"packtmp + strlen(packdir) + 1". Though it would be simpler still to
make "packtmp" just the basename, rather than the full path.

I do agree with the point Junio raised, though, that this loop isn't
recursive, but dir_iterator is. I think there was talk elsewhere of
giving it more options, so perhaps it could be taught a non-recursive
version. Until we have that, though, I'm not sure this is a good spot to
convert.

-Peff


Re: Bug in "git am" when the body starts with spaces

2017-03-31 Thread Jeff King
On Fri, Mar 31, 2017 at 05:52:00PM -0700, Linus Torvalds wrote:

> Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo:
> handle in-body header continuations").
> 
> The continuation logic is oddly complex, and I can't follow the logic.
> But it is completely broken in how it thinks empty lines are somehow
> "continuations".

I think the continuation logic is OK. The problem is that the newline is
never fed to check_inbody_header() at all. So the next line its state
machine sees is the indented line, which looks like a continuation.

It seems to me that ignoring that newline is a bug, and causes other
problems. For instance, if you have a blank line and then another
header-looking thing (not a continuation) after, it is respected. For
instance, running mailinfo on:

  From ...mbox header...
  From: Email Author 
  Subject: email subject
  Date: whatever

  From: in-body author 

  Subject: in-body subject

  And the rest of the message.

will use "in-body subject" as the subject, though I'd have thought we
should stop parsing in-body headers as soon as we see the first in-body
blank line (the one after the in-body "From").

The blank line gets eaten by the check at the beginning of
handle_commit_msg(), right _before_ we pass it off to the
check_inbody_header() function.

It seems like that should be more like:

diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..6d89781eb 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -757,8 +757,10 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
strbuf *line)
assert(!mi->filter_stage);
 
if (mi->header_stage) {
-   if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
+   if (!line->len || (line->len == 1 && line->buf[0] == '\n')) {
+   mi->header_stage = 0;
return 0;
+   }
}
 
if (mi->use_inbody_headers && mi->header_stage) {


But that breaks a bunch of tests. It looks like the loop in handle_body
always starts by feeding the initial header-separator (the real headers,
not the in-body ones) to the various parsers. So that first newline
makes us trigger "no more in-body headers" before we even start parsing
any lines. Oops.

So doing this:

@@ -960,7 +962,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf 
*line)
goto handle_body_out;
}
 
-   do {
+   while (!strbuf_getwholeline(line, mi->input, '\n')) {
/* process any boundary lines */
if (*(mi->content_top) && is_multipart_boundary(mi, line)) {
/* flush any leftover */
@@ -1013,7 +1015,7 @@ static void handle_body(struct mailinfo *mi, struct 
strbuf *line)
 
if (mi->input_error)
break;
-   } while (!strbuf_getwholeline(line, mi->input, '\n'));
+   }
 
flush_inbody_header_accum(mi);
 

on top fixes that. But that still breaks a test that has blank lines at
the beginning of the message, before the in-body header. So probably the
state-machine needs an extra state: sucking up pre-inbody-header
newlines.

-Peff


Re: [PATCH v2 00/20] Separate `ref_cache` into a separate module

2017-03-31 Thread Michael Haggerty
On 03/31/2017 06:01 PM, Junio C Hamano wrote:
> Michael Haggerty  writes:
> 
>> This version literally only contains a few commit message changes and
>> one minor comment changes relative to v1. The code is identical. I
>> wasn't sure whether it is even worth sending this patch series to the
>> ML again; Junio, if you'd prefer I just send a link to a published
>> branch in such cases, please let me know.
> 
> The review on the list is not about letting me pick it up, but is
> about giving reviewing contributors a chance to comment.  I think
> for a series this important ;-) it is good that you are giving it
> multiple exposures so that people who were offline the last time can
> have a chance to look at it, even if the update is minimum.
> 
>> This patch series is also available from my GitHub fork [2] as branch
>> "separate-ref-cache". These patches depend on Duy's
>> nd/files-backend-git-dir branch.
> 
> I am getting the impression that the files-backend thing as well as
> this topic are ready for 'next'.  Please stop me if I missed something
> in these topics (especially the other one) that needs updating
> before that happens.

I don't know of any remaining problems with these two patch series
(aside from a couple of cosmetic problems that I just pointed out about
v7 of nd/files-backend-git-dir). I'm pretty confident about both of them.

Duy, have you looked over my patch series? Since you've been working in
the area, your feedback would be very welcome, if you have the time for it.

Michael



Re: [PATCH v7 00/28] Remove submodule from files-backend.c

2017-03-31 Thread Michael Haggerty
On 03/26/2017 04:42 AM, Nguyễn Thái Ngọc Duy wrote:
> v7 is mostly about style changes except the one bug in
> test-ref-store.c, missing setup_git_directory().
> 
> There's one new patch, 03/28, which maps to the "if (!refs)" deletion
> in the interdiff.
> 
> The one comment from v6 I haven't addressed in v7 is whether to delete
> REF_STORE_READ. But if it is deleted, I think I'm doing it in
> nd/worktree-kill-parse-ref, which is partly about cleanup refs code
> anyway, to avoid another re-roll in this series.

I reviewed v7 pretty carefully, and send a couple of minor comments. But
with or without changes, it looks good to me and the whole series is

Reviewed-by: Michael Haggerty 

Thanks!

Michael



Re: [PATCH v7 23/28] files-backend: avoid ref api targetting main ref store

2017-03-31 Thread Michael Haggerty
Nits: s/targetting/targeting/ in the subject line.

On 03/26/2017 04:42 AM, Nguyễn Thái Ngọc Duy wrote:
> A small step towards making files-backend works as a non-main ref store
> using the newly added store-aware API.

s/works/work/

> For the record, `join` and `nm` on refs.o and files-backend.o tell me
> that files-backend no longer uses functions that defaults to
> get_main_ref_store().

s/defaults/default/

> I'm not yet comfortable at the idea of removing
> files_assert_main_repository() (or converting REF_STORE_MAIN to
> REF_STORE_WRITE). More staring and testing is required before that can
> happen. Well, except peel_ref(). I'm pretty sure that function is safe.
> [...]

Michael



Re: [PATCH v7 22/28] refs: new transaction related ref-store api

2017-03-31 Thread Michael Haggerty
On 03/26/2017 04:42 AM, Nguyễn Thái Ngọc Duy wrote:
> The transaction struct now takes a ref store at creation and will
> operate on that ref store alone.

Having worked downstream of this patch series for a while, I started to
wonder why a `ref_store` has to be baked into a `ref_transaction` at
creation. I haven't noticed any technical reason why it must be so.

If we expected to need to virtualize `ref_transaction_begin()` sometime,
then of course your design would be preferable.

It surprised be, because until now `ref_transaction` has been a plain
old data structure unconnected with a particular `ref_store`. I would
have expected `ref_transaction_commit()` to gain a more general sibling,
`refs_ref_transaction_commit()` [*], that takes an additional `ref_store
*` object argument, and for `ref_transaction_commit()` to call that
function. It would feel slightly more natural to me, given that
`transaction_commit` is a virtual method of the `ref_store` class, for
`refs_ref_transaction_commit()` to take an explicit `refs` pointer
rather than having that pointer buried in the `ref_transaction`.

I think this is mostly an aesthetic issue (about which opinions can
legitimately differ) rather than a concrete problem. I haven't yet had
any difficulties working with your interface. But I wanted to mention my
observation anyway. As far as I'm concerned, you don't need to change it.

> [...]

Michael

[*] The name could obviously be improved, but you get the idea.



[PATCH] contrib/git-resurrect.sh: do not write \t for HT in sed scripts

2017-03-31 Thread Junio C Hamano
Just like we did in 0d1d6e50 ("t/t7003: replace \t with literal tab
in sed expression", 2010-08-12), avoid writing "\t" for HT in sed
scripts, which is not portable.

Signed-off-by: Junio C Hamano 
---
 * This should hopefully kill the last instances of the same issue
   to make sure people do not copy-and-paste this unportable
   construct.

 contrib/git-resurrect.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/git-resurrect.sh b/contrib/git-resurrect.sh
index c364dda696..3b78ffd079 100755
--- a/contrib/git-resurrect.sh
+++ b/contrib/git-resurrect.sh
@@ -24,13 +24,13 @@ n,dry-rundon't recreate the branch"
 . git-sh-setup
 
 search_reflog () {
-sed -ne 's~^\([^ ]*\) .*\tcheckout: moving from '"$1"' .*~\1~p' \
+   sed -ne 's~^\([^ ]*\) .*checkout: moving from '"$1"' .*~\1~p' \
 < "$GIT_DIR"/logs/HEAD
 }
 
 search_reflog_merges () {
git rev-parse $(
-   sed -ne 's~^[^ ]* \([^ ]*\) .*\tmerge '"$1"':.*~\1^2~p' \
+   sed -ne 's~^[^ ]* \([^ ]*\) .*  merge '"$1"':.*~\1^2~p' \
< "$GIT_DIR"/logs/HEAD
)
 }
-- 
2.12.2-752-g2215051a9e



Re: [PATCH] name-hash: fix buffer overrun

2017-03-31 Thread Junio C Hamano
Junio C Hamano  writes:

> Ah, of course. Avoid GNUism to spell HT as "\t" in a sed script.

Here is what I replaced the original patch with.  Let's see how well
it fares with Travis tonight.

-- >8 --
From: Kevin Willford 
Date: Fri, 31 Mar 2017 17:32:14 +
Subject: [PATCH] name-hash: fix buffer overrun

Add check for the end of the entries for the thread partition.
Add test for lazy init name hash with specific directory structure

The lazy init hash name was causing a buffer overflow when the last
entry in the index was multiple folder deep with parent folders that
did not have any files in them.

This adds a test for the boundary condition of the thread partitions
with the folder structure that was triggering the buffer overflow.

The fix was to check if it is the last entry for the thread partition
in the handle_range_dir and not try to use the next entry in the cache.

Signed-off-by: Kevin Willford 
Signed-off-by: Johannes Schindelin 
Signed-off-by: Jeff Hostetler 
Signed-off-by: Junio C Hamano 
---
 name-hash.c |  4 +++-
 t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100755 t/t3008-ls-files-lazy-init-name-hash.sh

diff --git a/name-hash.c b/name-hash.c
index cac313c78d..39309efb7f 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -342,7 +342,9 @@ static int handle_range_dir(
 * Scan forward in the index array for index entries having the same
 * path prefix (that are also in this directory).
 */
-   if (strncmp(istate->cache[k_start + 1]->name, prefix->buf, prefix->len) 
> 0)
+   if (k_start + 1 >= k_end)
+   k = k_end;
+   else if (strncmp(istate->cache[k_start + 1]->name, prefix->buf, 
prefix->len) > 0)
k = k_start + 1;
else if (strncmp(istate->cache[k_end - 1]->name, prefix->buf, 
prefix->len) == 0)
k = k_end;
diff --git a/t/t3008-ls-files-lazy-init-name-hash.sh 
b/t/t3008-ls-files-lazy-init-name-hash.sh
new file mode 100755
index 00..971975bff4
--- /dev/null
+++ b/t/t3008-ls-files-lazy-init-name-hash.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='Test the lazy init name hash with various folder structures'
+
+. ./test-lib.sh
+
+test_expect_success 'no buffer overflow in lazy_init_name_hash' '
+   (
+   test_seq 2000 | sed "s/^/a_/"
+   echo b/b/b
+   test_seq 2000 | sed "s/^/c_/"
+   test_seq 50 | sed "s/^/d_/" | tr "\n" "/"; echo d
+   ) |
+   sed -e "s/^/100644 $EMPTY_BLOB  /" |
+   git update-index --index-info &&
+   test-lazy-init-name-hash -m
+'
+
+test_done
-- 
2.12.2-752-g2215051a9e



Re: [PATCH v7 19/28] files-backend: replace submodule_allowed check in files_downcast()

2017-03-31 Thread Michael Haggerty
On 03/26/2017 04:42 AM, Nguyễn Thái Ngọc Duy wrote:
> files-backend.c is unlearning submodules. Instead of having a specific
> check for submodules to see what operation is allowed, files backend
> now takes a set of flags at init. Each operation will check if the
> required flags is present before performing.
> 
> For now we have four flags: read, write and odb access. Main ref store
> has all flags, obviously, while submodule stores are read-only and have
> access to odb (*).
> 
> The "main" flag stays because many functions in the backend calls
> frontend ones without a ref store, so these functions always target the
> main ref store. Ideally the flag should be gone after ref-store-aware
> api is in place and used by backends.
> 
> (*) Submodule code needs for_each_ref. Try take REF_STORE_ODB flag
> out. At least t3404 would fail. The "have access to odb" in submodule is
> a bit hacky since we don't know from he whether add_submodule_odb() has
> been called.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  refs.c   | 15 +++---
>  refs/files-backend.c | 81 
> +---
>  refs/refs-internal.h |  9 +-
>  3 files changed, 71 insertions(+), 34 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index d72b48a430..241b4227b2 100644
> --- a/refs.c
> +++ b/refs.c
> [...]
> @@ -1481,7 +1486,9 @@ struct ref_store *get_ref_store(const char *submodule)
>   return NULL;
>   }
>  
> - refs = ref_store_init(submodule_sb.buf);
> + /* pretend that add_submodule_odb() has been called */

The word "pretend" implies that the thing that follows is not true,
whereas we hope that it *is* true. It would be better to say "assume".

> + refs = ref_store_init(submodule_sb.buf,
> +   REF_STORE_READ | REF_STORE_ODB);
>   register_submodule_ref_store(refs, submodule);
>  
>   strbuf_release(_sb);
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 490f05a6f4..d97a924860 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> [...]
> @@ -994,13 +997,17 @@ static struct ref_store *files_ref_store_create(const 
> char *gitdir)
>  }
>  
>  /*
> - * Die if refs is for a submodule (i.e., not for the main repository).
> - * caller is used in any necessary error messages.
> + * Die if refs is not the main ref store. caller is used in any
> + * necessary error messages.
>   */
>  static void files_assert_main_repository(struct files_ref_store *refs,
>const char *caller)
>  {
> - /* This function is to be fixed up in the next patch */
> + if (refs->store_flags & REF_STORE_MAIN)
> + return;
> +
> + die("BUG: unallowed operation (%s), only works "
> + "on main ref store\n", caller);

"Unallowed" isn't really a word; one would say "disallowed". But it
might sound better to say

BUG: operation %s only allowed for main ref store

>  }
>  
>  /*
> @@ -1009,9 +1016,9 @@ static void files_assert_main_repository(struct 
> files_ref_store *refs,
>   * files_ref_store is for a submodule (i.e., not for the main
>   * repository). caller is used in any necessary error messages.
>   */
> -static struct files_ref_store *files_downcast(
> - struct ref_store *ref_store, int submodule_allowed,
> - const char *caller)
> +static struct files_ref_store *files_downcast(struct ref_store *ref_store,
> +   unsigned int required_flags,
> +   const char *caller)

The docstring for this function needs to be updated; it still talks
about the old `submodule_allowed` parameter.

>  {
>   struct files_ref_store *refs;
>  
> @@ -1021,8 +1028,9 @@ static struct files_ref_store *files_downcast(
>  
>   refs = (struct files_ref_store *)ref_store;
>  
> - if (!submodule_allowed)
> - files_assert_main_repository(refs, caller);
> + if ((refs->store_flags & required_flags) != required_flags)
> + die("BUG: unallowed operation (%s), requires %x, has %x\n",
> + caller, required_flags, refs->store_flags);

Same comment about "unallowed". Maybe

BUG: operation %s requires abilities 0x%x but only have 0x%x

>  
>   return refs;
>  }
> [...]

Michael



Re: [PATCH] [GSOC] prune_worktrees(): reimplement with dir_iterator

2017-03-31 Thread Junio C Hamano
Robert Stanca  writes:

> Replaces recursive traversing of opendir with dir_iterator

The original code for this one, and also the other one, is not
recursive traversing.  This one enumerates all the _direct_
subdirectories of ".git/worktrees", and feeds them to
prune_worktree() without recursing.  The other one scans all the
files _directly_ underneath ".git/objects/pack" to find the ones
that begin with the packtmp prefix, and unlinks them without
recursing.  Neither of them touches anything in subdirectory of
".git/worktrees/" nor ".git/objects/pack/".

Using dir_iterator() to make it recursive is not just overkill but
is a positively wrong change, isn't it?


Re: [PATCH] Documentation: improve git ls-files -s manpage entry

2017-03-31 Thread Junio C Hamano
Mostyn Bramley-Moore  writes:

> List the fields in order of appearance in the command output.
>
> Signed-off-by: Mostyn Bramley-Moore 
> ---
>  Documentation/git-ls-files.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 1cab703..a9149fc 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -57,7 +57,8 @@ OPTIONS
>  
>  -s::
>  --stage::
> - Show staged contents' object name, mode bits and stage number in the 
> output.
> + Show staged contents' mode bits, hash, stage number as well as the
> + object name in the output.

This is not an improvement, as the word "hash" is merely a
colloquial way to say what is officially known as "object name".

If the patch were only to swap "mode bits" and "object name" without
doing anything else, it may have been an improvement, though.  It is
implied that the path the output line talks about is shown, and that
is why we do not say "mode bits, object name, stage number and the
pathname".

>  
>  --directory::
>   If a whole directory is classified as "other", show just its


Preparing for the Upcoming Removal of UCB Utilities from the Next Version of Solaris

2017-03-31 Thread Jeffrey Walton
Preparing for the Upcoming Removal of UCB Utilities from the Next
Version of Solaris,
https://blogs.oracle.com/partnertech/entry/preparing_for_the_upcoming_removal
.

Sorry to keep beating the Solaris horse. Oracle charges forks for
simple updates, like security bug fixes, and updates to cURL and Git.
I think its important that folks have alternatives available to them.

Jeff


[PATCH] [GSOC] prune_worktrees(): reimplement with dir_iterator

2017-03-31 Thread Robert Stanca
Replaces recursive traversing of opendir with dir_iterator

Signed-off-by: Robert Stanca 
---
 builtin/worktree.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 9993ded..7cfd78c 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -10,6 +10,8 @@
 #include "refs.h"
 #include "utf8.h"
 #include "worktree.h"
+#include "iterator.h"
+#include "dir-iterator.h"
 
 static const char * const worktree_usage[] = {
N_("git worktree add []  []"),
@@ -91,30 +93,25 @@ static void prune_worktrees(void)
 {
struct strbuf reason = STRBUF_INIT;
struct strbuf path = STRBUF_INIT;
-   DIR *dir = opendir(git_path("worktrees"));
-   struct dirent *d;
+   struct dir_iterator *diter = dir_iterator_begin(git_path("worktrees"));
int ret;
-   if (!dir)
-   return;
-   while ((d = readdir(dir)) != NULL) {
-   if (is_dot_or_dotdot(d->d_name))
-   continue;
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
strbuf_reset();
-   if (!prune_worktree(d->d_name, ))
+   if (!prune_worktree(diter->relative_path, ))
continue;
if (show_only || verbose)
printf("%s\n", reason.buf);
if (show_only)
continue;
strbuf_reset();
-   strbuf_addstr(, git_path("worktrees/%s", d->d_name));
+   strbuf_addstr(, git_path("worktrees/%s", 
diter->relative_path));
ret = remove_dir_recursively(, 0);
if (ret < 0 && errno == ENOTDIR)
ret = unlink(path.buf);
if (ret)
error_errno(_("failed to remove '%s'"), path.buf);
}
-   closedir(dir);
if (!show_only)
rmdir(git_path("worktrees"));
strbuf_release();
-- 
2.7.4



Re: Bug in "git am" when the body starts with spaces

2017-03-31 Thread Linus Torvalds
Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo:
handle in-body header continuations").

The continuation logic is oddly complex, and I can't follow the logic.
But it is completely broken in how it thinks empty lines are somehow
"continuations".

Jonathan?

 Linus

On Fri, Mar 31, 2017 at 5:24 PM, Linus Torvalds
 wrote:
>
> I think the reason is that the "header continuation line" logic kicks
> in because the lines in the body start with spaces, but that's
> entirely incorrect, since
>
>  (a) we're not in an email header
>
>  (b) there's an empty line in between anyway, so no way are those body
> lines continuation lines.
>
> I didn't check how far back this goes, I guess I'll do that next. But
> I thought I'd report it here first in case somebody else goes "ahhh".


[PATCH] Documentation: improve git ls-files -s manpage entry

2017-03-31 Thread Mostyn Bramley-Moore
List the fields in order of appearance in the command output.

Signed-off-by: Mostyn Bramley-Moore 
---
 Documentation/git-ls-files.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 1cab703..a9149fc 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -57,7 +57,8 @@ OPTIONS
 
 -s::
 --stage::
-   Show staged contents' object name, mode bits and stage number in the 
output.
+   Show staged contents' mode bits, hash, stage number as well as the
+   object name in the output.
 
 --directory::
If a whole directory is classified as "other", show just its
-- 
2.9.3



[PATCH] [GSOC] remove_temporary_files(): reimplement using iterators

2017-03-31 Thread Robert Stanca
Replaces recursive traversing of opendir with dir_iterator

Signed-off-by: Robert Stanca 
---
 builtin/repack.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 677bc7c81..dba465814 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -7,6 +7,8 @@
 #include "strbuf.h"
 #include "string-list.h"
 #include "argv-array.h"
+#include "iterator.h"
+#include "dir-iterator.h"
 
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
@@ -49,12 +51,7 @@ static void remove_temporary_files(void)
 {
struct strbuf buf = STRBUF_INIT;
size_t dirlen, prefixlen;
-   DIR *dir;
-   struct dirent *e;
-
-   dir = opendir(packdir);
-   if (!dir)
-   return;
+   struct dir_iterator *diter = dir_iterator_begin(packdir);
 
/* Point at the slash at the end of ".../objects/pack/" */
dirlen = strlen(packdir) + 1;
@@ -62,14 +59,13 @@ static void remove_temporary_files(void)
/* Hold the length of  ".tmp-%d-pack-" */
prefixlen = buf.len - dirlen;
 
-   while ((e = readdir(dir))) {
-   if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (strncmp(diter->relative_path, buf.buf + dirlen, prefixlen))
continue;
strbuf_setlen(, dirlen);
-   strbuf_addstr(, e->d_name);
+   strbuf_addstr(, diter->relative_path);
unlink(buf.buf);
}
-   closedir(dir);
strbuf_release();
 }
 
-- 
2.12.2.575.gb14f27f91.dirty



Bug in "git am" when the body starts with spaces

2017-03-31 Thread Linus Torvalds
Try applying the attached patch with

   git am 0001-Test-patch.patch

in the git repository.

At least for me, it results in a very odd commit that has one single
line in the commit message:

Test patch This should go in the body not in the subject line

which is obviously bogus.

I think the reason is that the "header continuation line" logic kicks
in because the lines in the body start with spaces, but that's
entirely incorrect, since

 (a) we're not in an email header

 (b) there's an empty line in between anyway, so no way are those body
lines continuation lines.

I didn't check how far back this goes, I guess I'll do that next. But
I thought I'd report it here first in case somebody else goes "ahhh".

Linus
From ad65cf7ba97ac071da1f845ec854165e7bf1efdf Mon Sep 17 00:00:00 2001
From: Linus Torvalds 
Date: Fri, 31 Mar 2017 17:18:16 -0700
Subject: [PATCH] Test patch example

Subject: [PATCH] Test patch

  This should go in the body
  not in the subject line
---
 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index 9b36068ac..9f36c149b 100644
--- a/Makefile
+++ b/Makefile
@@ -1,3 +1,4 @@
+
 # The default target of this Makefile is...
 all::
 
-- 
2.12.2.401.g5d4234a49



Re: [PATCH v2 2/2] push: propagate push-options with --recurse-submodules

2017-03-31 Thread Jonathan Nieder
Brandon Williams wrote:

> Teach push --recurse-submodules to propagate push-options recursively to
> the pushes performed in the submodules.

Some time in the future we may want "push --recurse-submodules" to do a
dry run pass before doing the final push, so that if it is known that
some of the pushes wouldn't succeed (e.g. due to not being
fast-forward, or the server not being reachable, or the server not
supporting push options) then git could stop early instead of some
succeeding and some failing.

But that's a larger and separate change from this one.  Users of push
--recurse-submodules today know they are effectively asking for
multiple pushes that are not guaranteed to succeed or fail together.

> Signed-off-by: Brandon Williams 
> ---
>  submodule.c | 13 +++--
>  submodule.h |  1 +
>  t/t5545-push-options.sh | 39 +++
>  transport.c |  1 +
>  4 files changed, 52 insertions(+), 2 deletions(-)

For what it's worth,
Reviewed-by: Jonathan Nieder 


Re: [PATCH v2 1/2] push: unmark a local variable as static

2017-03-31 Thread Brandon Williams
On 03/31, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Also, clear the push_options string_list to
> > prevent memory leaking.
> 
> That's not a real leak, right?  Is the motivation to make it not show up
> in valgrind output?

well depends on what you classify as a leak.  I guess since the program
will exit soon its not an issue, but there is still memory that was
alloc'd that wasn't freed before the variable pointing to it went out of
scope. :D
> 
> At first it didn't look related to the other change but then the
> connection dawned on me.
> 
> > Signed-off-by: Brandon Williams 
> > ---
> >  builtin/push.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> For what it's worth,
> Reviewed-by: Jonathan Nieder 

-- 
Brandon Williams


Re: [PATCH v2 1/2] push: unmark a local variable as static

2017-03-31 Thread Jonathan Nieder
Brandon Williams wrote:

> Also, clear the push_options string_list to
> prevent memory leaking.

That's not a real leak, right?  Is the motivation to make it not show up
in valgrind output?

At first it didn't look related to the other change but then the
connection dawned on me.

> Signed-off-by: Brandon Williams 
> ---
>  builtin/push.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

For what it's worth,
Reviewed-by: Jonathan Nieder 


[RFC PATCH] git-news: obtain latest news for your favorite VCS

2017-03-31 Thread Stefan Beller
Today when a user is interested in news regarding the Git, their favorite
version control, it is a challenging task to find out what is actually
happening.  If the user is using a Git in a distribution that packaged
it nicely for them, they may find outdated information in their
distribution using the distributions choice of relaying news.
As Git is a fast paced project, this is not the desired behavior for
most users. They rather prefer the latest news.

One could argue that it is OK to expect the user to read the "What's
cooking" emails or even release notes to get a grasp of the latest
development of Git, but these news do not quite catch the lively
atmosphere of the mailing list with all its polite flame wars and
opinionated arguments on why a patch is written the way it ends up in
the release notes.

It is time to fix the root case of this important user facing problem in
a sensible way that doesn't confuse the user.

Invent a new sub command 'git-news', which presents the latest news
in a way capturing all the interesting tid-bits of the mailing list as
well as the surrounding ecosystem.

As we're rushing the solution of this important problem, we did not
consider alternatives, such that we end up solving it in a backwards
compatible, but forwards incompatible way; clearly nobody in the future
wants to add another command that points to different news source. This is
why we can take the generic name 'git-news', instead of using the rather
specialized 'git-rev-news' command that is harder to remember and type.

Signed-off-by: Stefan Beller 
---
 .gitignore| 1 +
 Documentation/git.txt | 2 ++
 Makefile  | 1 +
 command-list.txt  | 1 +
 git-news.sh   | 4 
 5 files changed, 9 insertions(+)
 create mode 100755 git-news.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b7..b2d9c1161f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -98,6 +98,7 @@
 /git-mktag
 /git-mktree
 /git-name-rev
+/git-news
 /git-mv
 /git-notes
 /git-p4
diff --git a/Documentation/git.txt b/Documentation/git.txt
index ecc1bb4bd7..f4629aa39b 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -35,6 +35,8 @@ manual page gives you an overview of the command-line command 
syntax.
 A formatted and hyperlinked copy of the latest Git documentation
 can be viewed at `https://git.github.io/htmldocs/git.html`.
 
+A more entertaining section of news can be obtained via `git-news`.
+
 ifdef::stalenotes[]
 [NOTE]
 
diff --git a/Makefile b/Makefile
index 9f8b35ad41..32e50f30a2 100644
--- a/Makefile
+++ b/Makefile
@@ -519,6 +519,7 @@ SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
+SCRIPT_SH += git-news.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
diff --git a/command-list.txt b/command-list.txt
index a1fad28fd8..150b287ada 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -91,6 +91,7 @@ git-mktag   plumbingmanipulators
 git-mktree  plumbingmanipulators
 git-mv  mainporcelain   worktree
 git-name-revplumbinginterrogators
+git-newsmainporcelain
 git-notes   mainporcelain
 git-p4  foreignscminterface
 git-pack-objectsplumbingmanipulators
diff --git a/git-news.sh b/git-news.sh
new file mode 100755
index 00..1707dc633e
--- /dev/null
+++ b/git-news.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+#
+
+/usr/bin/sensible-browser https://git.github.io/rev_news/
-- 
2.12.2.576.g7be6e4ba40.dirty



[PATCH v2 2/2] push: propagate push-options with --recurse-submodules

2017-03-31 Thread Brandon Williams
Teach push --recurse-submodules to propagate push-options recursively to
the pushes performed in the submodules.

Signed-off-by: Brandon Williams 
---
 submodule.c | 13 +++--
 submodule.h |  1 +
 t/t5545-push-options.sh | 39 +++
 transport.c |  1 +
 4 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index c52d6634c..de444be61 100644
--- a/submodule.c
+++ b/submodule.c
@@ -782,7 +782,9 @@ int find_unpushed_submodules(struct sha1_array *commits,
return needs_pushing->nr;
 }
 
-static int push_submodule(const char *path, int dry_run)
+static int push_submodule(const char *path,
+ const struct string_list *push_options,
+ int dry_run)
 {
if (add_submodule_odb(path))
return 1;
@@ -793,6 +795,12 @@ static int push_submodule(const char *path, int dry_run)
if (dry_run)
argv_array_push(, "--dry-run");
 
+   if (push_options && push_options->nr) {
+   const struct string_list_item *item;
+   for_each_string_list_item(item, push_options)
+   argv_array_pushf(, "--push-option=%s",
+item->string);
+   }
prepare_submodule_repo_env(_array);
cp.git_cmd = 1;
cp.no_stdin = 1;
@@ -807,6 +815,7 @@ static int push_submodule(const char *path, int dry_run)
 
 int push_unpushed_submodules(struct sha1_array *commits,
 const char *remotes_name,
+const struct string_list *push_options,
 int dry_run)
 {
int i, ret = 1;
@@ -818,7 +827,7 @@ int push_unpushed_submodules(struct sha1_array *commits,
for (i = 0; i < needs_pushing.nr; i++) {
const char *path = needs_pushing.items[i].string;
fprintf(stderr, "Pushing submodule '%s'\n", path);
-   if (!push_submodule(path, dry_run)) {
+   if (!push_submodule(path, push_options, dry_run)) {
fprintf(stderr, "Unable to push submodule '%s'\n", 
path);
ret = 0;
}
diff --git a/submodule.h b/submodule.h
index 8a8bc49dc..0e26430fd 100644
--- a/submodule.h
+++ b/submodule.h
@@ -92,6 +92,7 @@ extern int find_unpushed_submodules(struct sha1_array 
*commits,
struct string_list *needs_pushing);
 extern int push_unpushed_submodules(struct sha1_array *commits,
const char *remotes_name,
+   const struct string_list *push_options,
int dry_run);
 extern void connect_work_tree_and_git_dir(const char *work_tree, const char 
*git_dir);
 extern int parallel_submodules(void);
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 97065e62b..32c513c78 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -142,6 +142,45 @@ test_expect_success 'push options work properly across 
http' '
test_cmp expect actual
 '
 
+test_expect_success 'push options and submodules' '
+   test_when_finished "rm -rf parent" &&
+   test_when_finished "rm -rf parent_upstream" &&
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   cp -r upstream parent_upstream &&
+   test_commit -C upstream one &&
+
+   test_create_repo parent &&
+   git -C parent remote add up ../parent_upstream &&
+   test_commit -C parent one &&
+   git -C parent push --mirror up &&
+
+   git -C parent submodule add ../upstream workbench &&
+   git -C parent commit -m "add submoule" &&
+
+   test_commit -C parent/workbench two &&
+   git -C parent add workbench &&
+   git -C parent commit -m "update workbench" &&
+
+   git -C parent push \
+   --push-option=asdf --push-option="more structured text" \
+   --recurse-submodules=on-demand up master &&
+
+   git -C upstream rev-parse --verify master >expect &&
+   git -C parent/workbench rev-parse --verify master >actual &&
+   test_cmp expect actual &&
+
+   git -C parent_upstream rev-parse --verify master >expect &&
+   git -C parent rev-parse --verify master >actual &&
+   test_cmp expect actual &&
+
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options &&
+   test_cmp expect parent_upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect parent_upstream/.git/hooks/post-receive.push_options
+'
+
 stop_httpd
 
 test_done
diff --git a/transport.c b/transport.c
index 417ed7f19..64e60b635 

[PATCH v2 1/2] push: unmark a local variable as static

2017-03-31 Thread Brandon Williams
There isn't any obvious reason for the 'struct string_list push_options'
and 'struct string_list_item *item' to be marked as static, so unmark
them as being static.  Also, clear the push_options string_list to
prevent memory leaking.

Signed-off-by: Brandon Williams 
---
 builtin/push.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/push.c b/builtin/push.c
index 5c22e9f2e..a597759d8 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -510,8 +510,8 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
-   static struct string_list push_options = STRING_LIST_INIT_DUP;
-   static struct string_list_item *item;
+   struct string_list push_options = STRING_LIST_INIT_DUP;
+   const struct string_list_item *item;
 
struct option options[] = {
OPT__VERBOSITY(),
@@ -584,6 +584,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
die(_("push options must not have new line 
characters"));
 
rc = do_push(repo, flags, _options);
+   string_list_clear(_options, 0);
if (rc == -1)
usage_with_options(push_usage, options);
else
-- 
2.12.2.564.g063fe858b8-goog



[PATCH v2 0/2] propagate push-options

2017-03-31 Thread Brandon Williams
v2 addresses Jonathan's comments as well as adds an additional patch to unmark
a local variable as static.

Brandon Williams (2):
  push: unmark a local variable as static
  push: propagate push-options with --recurse-submodules

 builtin/push.c  |  5 +++--
 submodule.c | 13 +++--
 submodule.h |  1 +
 t/t5545-push-options.sh | 39 +++
 transport.c |  1 +
 5 files changed, 55 insertions(+), 4 deletions(-)

-- 
2.12.2.564.g063fe858b8-goog



Re: [PATCH] push: propagate push-options with --recurse-submodules

2017-03-31 Thread Brandon Williams
On 03/31, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Teach push --recurse-submodules to propagate push-options recursively to
> > the pushes performed in the submodules.
> 
> Sounds like a good change.
> 
> [...]
> > +++ b/submodule.c
> [...]
> > @@ -793,6 +794,12 @@ static int push_submodule(const char *path, int 
> > dry_run)
> > if (dry_run)
> > argv_array_push(, "--dry-run");
> >  
> > +   if (push_options && push_options->nr) {
> > +   static struct string_list_item *item;
> 
> Why static?  It would be simpler (and would use less bss) to let this
> pointer be an automatic variable instead.

Oops, definitely shouldn't be static! Thanks for catching that.

> 
> > +   for_each_string_list_item(item, push_options)
> > +   argv_array_pushf(, "--push-option=%s",
> > +item->string);
> > +   }
> > prepare_submodule_repo_env(_array);
> > cp.git_cmd = 1;
> > cp.no_stdin = 1;
> > @@ -807,7 +814,8 @@ static int push_submodule(const char *path, int dry_run)
> >  
> >  int push_unpushed_submodules(struct sha1_array *commits,
> >  const char *remotes_name,
> > -int dry_run)
> > +int dry_run,
> > +const struct string_list *push_options)
> 
> nit: I wonder if dry_run should stay as the last argument.  That would
> make it closer to the idiom of have a flag word as last argument.

Yeah I can flip the order.

> 
> [...]
> > +++ b/t/t5545-push-options.sh
> > @@ -142,6 +142,45 @@ test_expect_success 'push options work properly across 
> > http' '
> > test_cmp expect actual
> >  '
> >  
> > +test_expect_success 'push options and submodules' '
> 
> Yay!
> 
> What happens if the upstream of the parent repo supports push options
> but the upstream of the child repo doesn't?  What should happen?

push would fail since the children are pushed first.

> 
> Thanks and hope that helps,
> Jonathan

-- 
Brandon Williams


Re: [PATCH] push: propagate push-options with --recurse-submodules

2017-03-31 Thread Jonathan Nieder
Hi,

Brandon Williams wrote:

> Teach push --recurse-submodules to propagate push-options recursively to
> the pushes performed in the submodules.

Sounds like a good change.

[...]
> +++ b/submodule.c
[...]
> @@ -793,6 +794,12 @@ static int push_submodule(const char *path, int dry_run)
>   if (dry_run)
>   argv_array_push(, "--dry-run");
>  
> + if (push_options && push_options->nr) {
> + static struct string_list_item *item;

Why static?  It would be simpler (and would use less bss) to let this
pointer be an automatic variable instead.

> + for_each_string_list_item(item, push_options)
> + argv_array_pushf(, "--push-option=%s",
> +  item->string);
> + }
>   prepare_submodule_repo_env(_array);
>   cp.git_cmd = 1;
>   cp.no_stdin = 1;
> @@ -807,7 +814,8 @@ static int push_submodule(const char *path, int dry_run)
>  
>  int push_unpushed_submodules(struct sha1_array *commits,
>const char *remotes_name,
> -  int dry_run)
> +  int dry_run,
> +  const struct string_list *push_options)

nit: I wonder if dry_run should stay as the last argument.  That would
make it closer to the idiom of have a flag word as last argument.

[...]
> +++ b/t/t5545-push-options.sh
> @@ -142,6 +142,45 @@ test_expect_success 'push options work properly across 
> http' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'push options and submodules' '

Yay!

What happens if the upstream of the parent repo supports push options
but the upstream of the child repo doesn't?  What should happen?

Thanks and hope that helps,
Jonathan


Re: [PATCH] name-hash: fix buffer overrun

2017-03-31 Thread Junio C Hamano
On Fri, Mar 31, 2017 at 2:18 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Will queue with ...
>>
>>>  name-hash.c |  4 +++-
>>>  t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
>>>  2 files changed, 22 insertions(+), 1 deletion(-)
>>>  create mode 100644 t/t3008-ls-files-lazy-init-name-hash.sh
>>
>> ... this thing fixed by "chmod +x" (otherwise the tests won't start).
>>
>> Thanks.
>
> Also, https://travis-ci.org/git/git/jobs/217303927 seems to say that
> MacOS is not happy with this change.

Ah, of course. Avoid GNUism to spell HT as "\t" in a sed script.


[PATCH 2/2] diff: recurse into nested submodules for inline diff

2017-03-31 Thread Stefan Beller
When fd47ae6a5b (diff: teach diff to display submodule difference with an
inline diff, 2016-08-31) was introduced, we did not think of recursing
into nested submodules.

When showing the inline diff for submodules, automatically recurse
into nested submodules as well with inline submodule diffs.

Signed-off-by: Stefan Beller 
---
 submodule.c  |  3 +-
 t/t4060-diff-submodule-option-diff-format.sh | 41 
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index d84109908f..471ca9ce7d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -553,7 +553,8 @@ void show_submodule_inline_diff(FILE *f, const char *path,
cp.no_stdin = 1;
 
/* TODO: other options may need to be passed here. */
-   argv_array_push(, "diff");
+   argv_array_pushl(, "diff", "--submodule=diff", NULL);
+
argv_array_pushf(, "--line-prefix=%s", line_prefix);
if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
argv_array_pushf(, "--src-prefix=%s%s/",
diff --git a/t/t4060-diff-submodule-option-diff-format.sh 
b/t/t4060-diff-submodule-option-diff-format.sh
index d4a3ffa69c..33ec26d755 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -775,4 +775,45 @@ test_expect_success 'diff --submodule=diff with moved 
nested submodule HEAD' '
test_cmp expected actual
 '
 
+test_expect_success 'diff --submodule=diff recurses into nested submodules' '
+   cat >expected <<-EOF &&
+   Submodule sm2 contains modified content
+   Submodule sm2 a5a65c9..280969a:
+   diff --git a/sm2/.gitmodules b/sm2/.gitmodules
+   new file mode 100644
+   index 000..3a816b8
+   --- /dev/null
+   +++ b/sm2/.gitmodules
+   @@ -0,0 +1,3 @@
+   +[submodule "nested"]
+   +   path = nested
+   +   url = ../sm2
+   Submodule nested 000...b55928c (new submodule)
+   diff --git a/sm2/nested/file b/sm2/nested/file
+   new file mode 100644
+   index 000..ca281f5
+   --- /dev/null
+   +++ b/sm2/nested/file
+   @@ -0,0 +1 @@
+   +nested content
+   diff --git a/sm2/nested/foo8 b/sm2/nested/foo8
+   new file mode 100644
+   index 000..db9916b
+   --- /dev/null
+   +++ b/sm2/nested/foo8
+   @@ -0,0 +1 @@
+   +foo8
+   diff --git a/sm2/nested/foo9 b/sm2/nested/foo9
+   new file mode 100644
+   index 000..9c3b4f6
+   --- /dev/null
+   +++ b/sm2/nested/foo9
+   @@ -0,0 +1 @@
+   +foo9
+   EOF
+   git diff --submodule=diff >actual 2>err &&
+   test_must_be_empty err &&
+   test_cmp expected actual
+'
+
 test_done
-- 
2.12.2.576.g7be6e4ba40.dirty



[PATCH 0/2] Re: git diff --submodule=diff fails with submodules in a submodule

2017-03-31 Thread Stefan Beller
I came up with a patch that fixes the issue locally.
(It is the first patch, such that we can track the first patch down to maint)

While doing so, I noticed 2 issues:
* when having nested submodules, we probably want to have --submodule=diff
  to recurse into the nested submodules, so pass on the option.
  (2nd patch)
* the tests in t4060 hardcode hashes literally. I do not have a patch for that
  yet, but I will send patches later. (c.f. "sanitize_output" in t4202)
  
Thanks,
Stefan

Stefan Beller (2):
  diff: submodule inline diff to initialize env array.
  diff: recurse into nested submodules for inline diff

 submodule.c  |  4 +-
 t/t4060-diff-submodule-option-diff-format.sh | 70 
 2 files changed, 73 insertions(+), 1 deletion(-)

-- 
2.12.2.576.g7be6e4ba40.dirty



[PATCH 1/2] diff: submodule inline diff to initialize env array.

2017-03-31 Thread Stefan Beller
David reported:
> When I try to run `git diff --submodule=diff` in a submodule which has
> it's own submodules that have changes I get the error: fatal: bad
> object.

This happens, because we do not properly initialize the environment
in which the diff is run in the submodule. That means we inherit the
environment from the main process, which sets environment variables.
(Apparently we do set environment variables which we do not set
when not in a submodules, i.e. the .git directory is linked)

This commit, just like fd47ae6a5b (diff: teach diff to display
submodule difference with an inline diff, 2016-08-31) introduces bad
test code (i.e. hard coded hash values), which will be cleanup up in
a later patch.

Reported-by: David Parrish 
Signed-off-by: Stefan Beller 
---
 submodule.c  |  1 +
 t/t4060-diff-submodule-option-diff-format.sh | 29 
 2 files changed, 30 insertions(+)

diff --git a/submodule.c b/submodule.c
index c52d6634c5..d84109908f 100644
--- a/submodule.c
+++ b/submodule.c
@@ -576,6 +576,7 @@ void show_submodule_inline_diff(FILE *f, const char *path,
if (!(dirty_submodule & DIRTY_SUBMODULE_MODIFIED))
argv_array_push(, oid_to_hex(new));
 
+   prepare_submodule_repo_env(_array);
if (run_command())
fprintf(f, "(diff failed)\n");
 
diff --git a/t/t4060-diff-submodule-option-diff-format.sh 
b/t/t4060-diff-submodule-option-diff-format.sh
index 7e23b55ea4..d4a3ffa69c 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -746,4 +746,33 @@ test_expect_success 'diff --submodule=diff with .git file' 
'
test_cmp expected actual
 '
 
+test_expect_success 'setup nested submodule' '
+   git submodule add -f ./sm2 &&
+   git commit -a -m "add sm2" &&
+   git -C sm2 submodule add ../sm2 nested &&
+   git -C sm2 commit -a -m "nested sub"
+'
+
+test_expect_success 'move nested submodule HEAD' '
+   echo "nested content" >sm2/nested/file &&
+   git -C sm2/nested add file &&
+   git -C sm2/nested commit --allow-empty -m "new HEAD"
+'
+
+test_expect_success 'diff --submodule=diff with moved nested submodule HEAD' '
+   cat >expected <<-EOF &&
+   Submodule nested a5a65c9..b55928c:
+   diff --git a/nested/file b/nested/file
+   new file mode 100644
+   index 000..ca281f5
+   --- /dev/null
+   +++ b/nested/file
+   @@ -0,0 +1 @@
+   +nested content
+   EOF
+   git -C sm2 diff --submodule=diff >actual 2>err &&
+   test_must_be_empty err &&
+   test_cmp expected actual
+'
+
 test_done
-- 
2.12.2.576.g7be6e4ba40.dirty



[PATCH] push: propagate push-options with --recurse-submodules

2017-03-31 Thread Brandon Williams
Teach push --recurse-submodules to propagate push-options recursively to
the pushes performed in the submodules.

Signed-off-by: Brandon Williams 
---
 submodule.c | 14 +++---
 submodule.h |  3 ++-
 t/t5545-push-options.sh | 39 +++
 transport.c |  3 ++-
 4 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/submodule.c b/submodule.c
index c52d6634c..3d6d5e62d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -782,7 +782,8 @@ int find_unpushed_submodules(struct sha1_array *commits,
return needs_pushing->nr;
 }
 
-static int push_submodule(const char *path, int dry_run)
+static int push_submodule(const char *path, int dry_run,
+ const struct string_list *push_options)
 {
if (add_submodule_odb(path))
return 1;
@@ -793,6 +794,12 @@ static int push_submodule(const char *path, int dry_run)
if (dry_run)
argv_array_push(, "--dry-run");
 
+   if (push_options && push_options->nr) {
+   static struct string_list_item *item;
+   for_each_string_list_item(item, push_options)
+   argv_array_pushf(, "--push-option=%s",
+item->string);
+   }
prepare_submodule_repo_env(_array);
cp.git_cmd = 1;
cp.no_stdin = 1;
@@ -807,7 +814,8 @@ static int push_submodule(const char *path, int dry_run)
 
 int push_unpushed_submodules(struct sha1_array *commits,
 const char *remotes_name,
-int dry_run)
+int dry_run,
+const struct string_list *push_options)
 {
int i, ret = 1;
struct string_list needs_pushing = STRING_LIST_INIT_DUP;
@@ -818,7 +826,7 @@ int push_unpushed_submodules(struct sha1_array *commits,
for (i = 0; i < needs_pushing.nr; i++) {
const char *path = needs_pushing.items[i].string;
fprintf(stderr, "Pushing submodule '%s'\n", path);
-   if (!push_submodule(path, dry_run)) {
+   if (!push_submodule(path, dry_run, push_options)) {
fprintf(stderr, "Unable to push submodule '%s'\n", 
path);
ret = 0;
}
diff --git a/submodule.h b/submodule.h
index 8a8bc49dc..036ce4c5c 100644
--- a/submodule.h
+++ b/submodule.h
@@ -92,7 +92,8 @@ extern int find_unpushed_submodules(struct sha1_array 
*commits,
struct string_list *needs_pushing);
 extern int push_unpushed_submodules(struct sha1_array *commits,
const char *remotes_name,
-   int dry_run);
+   int dry_run,
+   const struct string_list *push_options);
 extern void connect_work_tree_and_git_dir(const char *work_tree, const char 
*git_dir);
 extern int parallel_submodules(void);
 
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 97065e62b..32c513c78 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -142,6 +142,45 @@ test_expect_success 'push options work properly across 
http' '
test_cmp expect actual
 '
 
+test_expect_success 'push options and submodules' '
+   test_when_finished "rm -rf parent" &&
+   test_when_finished "rm -rf parent_upstream" &&
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   cp -r upstream parent_upstream &&
+   test_commit -C upstream one &&
+
+   test_create_repo parent &&
+   git -C parent remote add up ../parent_upstream &&
+   test_commit -C parent one &&
+   git -C parent push --mirror up &&
+
+   git -C parent submodule add ../upstream workbench &&
+   git -C parent commit -m "add submoule" &&
+
+   test_commit -C parent/workbench two &&
+   git -C parent add workbench &&
+   git -C parent commit -m "update workbench" &&
+
+   git -C parent push \
+   --push-option=asdf --push-option="more structured text" \
+   --recurse-submodules=on-demand up master &&
+
+   git -C upstream rev-parse --verify master >expect &&
+   git -C parent/workbench rev-parse --verify master >actual &&
+   test_cmp expect actual &&
+
+   git -C parent_upstream rev-parse --verify master >expect &&
+   git -C parent rev-parse --verify master >actual &&
+   test_cmp expect actual &&
+
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options &&
+   test_cmp expect parent_upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect 

Re: Very promising results with libpcre2

2017-03-31 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> That enables the new JIT support in pcre v2:
>
>   s/iterrx fixed   prx
> rx  2.19--  -33%  -44%
> fixed   1.47   49%--  -17%
> prx 1.22   79%   20%--

The numbers with JIT does look "interesting".  

I couldn't quite tell if there are major incompatibilities in the
regex language itself between two versions from their documentation,
but assuming that there isn't (modulo bugfixes and enhancements) and
assuming that we are going to use their standard matcher, it may be
OK to just use the newer one without linking both.



Very promising results with libpcre2

2017-03-31 Thread Ævar Arnfjörð Bjarmason
The recent libpcre2 got me interested in seeing what the difference in
v1 and v2 was.

So I hacked up a *very basic* patch for libpcre2 that passes all
tests, but obviously isn't ready for inclusion (I searched/replaced
all the v1 usage with v2). I'm not even bothering sending this to the
list since discussing the patch itself isn't the point:

https://github.com/avar/git/commit/414647d88dd9c5

Before that patch, running a test[1] on linux.git where I grep the
whole tree for a fixed string / simple regex with POSIX regexes & PCRE
(all the greps match the same few lines) gives:

  s/iterrx   prx fixed
rx  2.20--   -2%  -34%
prx 2.172%--  -32%
fixed   1.46   51%   48%--

I.e. fixed string is fastest, and both POSIX regcomp() & pcre v1 are
~30% slower than that, with no difference in performance between the
two. Now with my patch above with pcre v2 there's a notable
performance difference:

  s/iterrx   prx fixed
rx  2.18--  -16%  -33%
prx 1.84   19%--  -20%
fixed   1.47   48%   25%--

We've gone from ~30% slower to ~20% slower for PCRE with v2. But now
let's test that with this patch:

https://github.com/avar/git/commit/4b7e5da3606c0b9b12025437de8005f5fa07ff54

That enables the new JIT support in pcre v2:

  s/iterrx fixed   prx
rx  2.19--  -33%  -44%
fixed   1.47   49%--  -17%
prx 1.22   79%   20%--

Now it's PCRE that's 20% faster than our currently fastest grep
codepath that searches for a fixed string, and in absolute terms it's
around 50% faster than the current PCRE implementation.

This is on Debian testing with both PCRE libraries installed via
packages, 8.35 & 10.22 for v1 and v2, respectively. Both are the
second-latest[2] point releases for their respective versions.

As far as turning this into a patch goes there's a few open questions:

* PCRE itself supports linking to v1 and v2 in the same program just
fine. Should we provide the possibility to link to both, or just make
the user choose? If these performance numbers hold up preferring v2 is
definitely better.

* The JIT is supposedly a bit slower if you're not doing a lot of
matching, although I doubt this matters in practice, but whether to
use it & a few other options could be controlled by some config/CLI
option. I think it probably makes sense just to always use it if it's
there pending some cases where it makes performance worse in practice

As an aside I started looking into this because I'm interested in
eventually hacking up something that makes every user-facing
regcomp()/regexec() we have now (e.g. log -G) accept PCRE as well.

How do do this in all cases isn't very obvious, we could just have
some global config option, but there's lots of stuff like e.g.
"^{/} & "git show :/" that takes regexes, and e.g.
how to pass a /i flag to some of these isn't obvious at all.

The solution I'm leaning towards is to just make stuff like thath only
work under PCRE, via the native (?:) facility. E.g.
this now works:

git grep -P '(?xi: h e l l o)'

1. PF=~/g/git/ perl -MBenchmark=cmpthese -wE 'cmpthese(20, { fixed =>
sub { system "$ENV{PF}git grep -F avarasu >/dev/null" }, rx => sub {
system "$ENV{PF}git grep avara?su >/dev/null" }, prx => sub { system
"$ENV{PF}git grep -P avara?su >/dev/null" } })'
2. https://ftp.pcre.org/pub/pcre/


Re: [PATCH] name-hash: fix buffer overrun

2017-03-31 Thread Junio C Hamano
Junio C Hamano  writes:

> Will queue with ...
>
>>  name-hash.c |  4 +++-
>>  t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
>>  2 files changed, 22 insertions(+), 1 deletion(-)
>>  create mode 100644 t/t3008-ls-files-lazy-init-name-hash.sh
>
> ... this thing fixed by "chmod +x" (otherwise the tests won't start).
>
> Thanks.

Also, https://travis-ci.org/git/git/jobs/217303927 seems to say that
MacOS is not happy with this change.


Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Junio C Hamano
Before we forget...


-- >8 --
From: Jeff King 

When sha1_loose_object_info() finds that a loose object file cannot
be stat(2)ed or mmap(2)ed, it returns -1 to signal an error to the
caller.  However, if it found that the loose object file is corrupt
and the object data cannot be used from it, it forgets to return -1.

This can confuse the caller, who may be lead to mistakenly think
that there is a loose object and possibly gets an incorrect type and
size from the function.  The SHA-1 collision detection codepath, for
example, when it gets an object over the wire and tries to see the
data is the same as what is available as a loose object locally, can
get confused when the loose object is correupted due to this bug.

---
 sha1_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 71063890ff..368c89b5c3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
if (status && oi->typep)
*oi->typep = status;
strbuf_release();
-   return 0;
+   return (status < 0) ? status : 0;
 }
 
 int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
*oi, unsigned flags)


[PATCH] remote.[ch]: parse_push_cas_option() can be static

2017-03-31 Thread Junio C Hamano
Since 068c77a5 ("builtin/send-pack.c: use parse_options API",
2015-08-19), there is no external user of this helper function.

Signed-off-by: Junio C Hamano 
---
 remote.c | 2 +-
 remote.h | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/remote.c b/remote.c
index 68901b0070..fc2df99b81 100644
--- a/remote.c
+++ b/remote.c
@@ -2178,7 +2178,7 @@ static struct push_cas *add_cas_entry(struct 
push_cas_option *cas,
return entry;
 }
 
-int parse_push_cas_option(struct push_cas_option *cas, const char *arg, int 
unset)
+static int parse_push_cas_option(struct push_cas_option *cas, const char *arg, 
int unset)
 {
const char *colon;
struct push_cas *entry;
diff --git a/remote.h b/remote.h
index 02d66ceff5..19a4905da6 100644
--- a/remote.h
+++ b/remote.h
@@ -260,7 +260,6 @@ struct push_cas_option {
 };
 
 extern int parseopt_push_cas_option(const struct option *, const char *arg, 
int unset);
-extern int parse_push_cas_option(struct push_cas_option *, const char *arg, 
int unset);
 
 extern int is_empty_cas(const struct push_cas_option *);
 void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
-- 
2.12.2-750-g6626ddbf7f



Re: [PATCH v3] http.postbuffer: allow full range of ssize_t values

2017-03-31 Thread Junio C Hamano
David Turner  writes:

> +static int git_parse_ssize_t(const char *value, ssize_t *ret)
> +{
> + ssize_t tmp;
> + if (!git_parse_signed(value, , 
> maximum_signed_value_of_type(ssize_t)))
> + return 0;
> + *ret = tmp;
> + return 1;
> +}
> +
>  NORETURN
>  static void die_bad_number(const char *name, const char *value)
>  {
> @@ -892,6 +901,14 @@ unsigned long git_config_ulong(const char *name, const 
> char *value)
>   return ret;
>  }
>  
> +ssize_t git_config_ssize_t(const char *name, const char *value)
> +{
> + ssize_t ret;
> + if (!git_parse_ssize_t(value, ))
> + die_bad_number(name, value);
> + return ret;
> +}
> +

Thanks.


Re: Add configuration options for some commonly used command-line options

2017-03-31 Thread Brandon McCaig
On Mon, Mar 20, 2017 at 02:18:01PM -0400, Jeff King wrote:
> I think we've had similar proposals in the form of an
> environment variable like "GIT_PLUMBING" (and your "command",
> which I do like syntactically, would probably just end up
> setting such an environment variable anyway).

For reference, Mercurial has long had HGPLAIN with a similar
purpose. See `hg help scripting'. I think that this is a
generally good idea to adopt. Albeit, I think that Mercurial
considers its command line a scriptable API, and HGPLAIN skips
over any customization of output. That may not be the exact
intention of this, but something related nevertheless.

Instead of a subcommand I think that a command line --option
would make better sense. It would only even save a few characters
in *nix shells where the variable can be inlined, but could be
more practical for MS Windows users that would need separate
commands to manage the variable.

Regards,


-- 
Brandon McCaig  
Castopulence Software 
Blog 
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'



signature.asc
Description: Digital signature


Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows

2017-03-31 Thread Jakub Narębski
W dniu 31.03.2017 o 14:38, Torsten Bögershausen pisze:
> On 30.03.17 21:35, Jakub Narębski wrote:
>> Hello,
>>
>> Recently I had to work on a project which uses legacy 8-bit encoding
>> (namely cp1250 encoding) instead of utf-8 for text files (LaTeX
>> documents).  My terminal, that is Git Bash from Git for Windows is set
>> up for utf-8.
>>
>> I wanted for "git diff" and friends to return something sane on said
>> utf-8 terminal, instead of mojibake.  There is 'encoding'
>> gitattribute... but it works only for GUI ('git gui', that is).
>>
>> Therefore I have (ab)used textconv facility to convert from cp1250 of
>> file encoding to utf-8 encoding of console.
>>
>> I have set the following in .gitattributes file:
>>
>>   ## LaTeX documents in cp1250 encoding
>>   *.tex text diff=mylatex
>>
>> The 'mylatex' driver is defined as:
>>
>>   [diff "mylatex"]
>> xfuncname = "^(((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>> wordRegex = "[a-zA-Z]+|[{}]|.|[^\\{}[:space:]]+"
>> textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t 
>> utf-8
>> cachetextconv = true
>>
>> And everything would be all right... if not the fact that Git appends
>> spurious ^M to added lines in the `git diff` output.  Files use CRLF
>> end-of-line convention (the native MS Windows one).
>>
>>   $ git diff test.tex
>>   diff --git a/test.tex b/test.tex
>>   index 029646e..250ab16 100644
>>   --- a/test.tex
>>   +++ b/test.tex
>>   @@ -1,4 +1,4 @@
>>   -\documentclass{article}
>>   +\documentclass{mwart}^M
>>   
>>\usepackage[cp1250]{inputenc}
>>\usepackage{polski}
>>
>> What gives?  Why there is this ^M tacked on the end of added lines,
>> while it is not present in deleted lines, nor in content lines?
>>
>> Puzzled.
>>
>> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
>> that it doesn't supports in core `encoding` attribute together with
>> having `i18n.outputEncoding`.
>
> Is there a chance to give us a receipt how to reproduce it?
> A complete test script or ?
> (I don't want to speculate, if the invocation of iconv is the problem,
>  where stdout is not in "binary mode", or however this is called under 
> Windows)

I'm sorry, I though I posted whole recipe, but I missed some details
in the above description of the case.

First, files are stored on filesystem using CRLF eol (DOS end-of-line
convention).  Due to `core.autocrlf` they are converted to LF in blobs,
that is in the index and in the repository.

Second, a textconv with filter preserving end-of-line needs to be
configured.  I have used `iconv`, but I suspect that the problem would
happen also for `cat`.

In the .gitattributes file, or .git/info/attributes add, for example:

  *.tex text diff=myconv

In the .git/config configure the textconv filter, for example:

  [diff "myconv"]
 textconv  = iconv.exe -f cp1250 -t utf-8

Create a file which filename matches the attribute line, and which
uses CRLF end of line convention, and add it to Git (adding it to
the index):

  $ printf "foo\r\n" >foo.tex
  $ git add foo.tex

Modify file (also with CRLF):

  $ printf "bar\r\n" >foo.tex

Check the difference

  $ git diff foo.tex

HTH
-- 
Jakub Narębski



Re: [PATCH] failure with diff --submodule=diff with moved nested submodule HEAD

2017-03-31 Thread Junio C Hamano
Stefan Beller  writes:

> This fails reliable for me.
>
> Signed-off-by: Stefan Beller 
> ---

I take it that this is a WIP for the testing half of the patch you
are working on (i.e. you are not just throwing a patch to document
known breakage to be fixed later, in which case these would have
been _expect_failure).

Thanks.

>  t/t4060-diff-submodule-option-diff-format.sh | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/t/t4060-diff-submodule-option-diff-format.sh 
> b/t/t4060-diff-submodule-option-diff-format.sh
> index 7e23b55ea4..89bced3484 100755
> --- a/t/t4060-diff-submodule-option-diff-format.sh
> +++ b/t/t4060-diff-submodule-option-diff-format.sh
> @@ -746,4 +746,20 @@ test_expect_success 'diff --submodule=diff with .git 
> file' '
>   test_cmp expected actual
>  '
>  
> +test_expect_success 'setup nested submodule' '
> + git submodule add -f ./sm2 &&
> + git commit -a -m "add sm2" &&
> + git -C sm2 submodule add ../sm2 &&
> + git -C sm2 commit -a -m "nested sub"
> +'
> +
> +test_expect_success 'move nested submodule HEAD' '
> + git -C sm2/sm2 commit --allow-empty -m "new HEAD"
> +'
> +
> +test_expect_success 'diff --submodule=diff with moved nested submodule HEAD' 
> '
> + git -C sm2 diff --submodule=diff >actual 2>err &&
> + test_must_be_empty err
> +'
> +
>  test_done


Re: [PATCH] name-hash: fix buffer overrun

2017-03-31 Thread Junio C Hamano
g...@jeffhostetler.com writes:

> From: Kevin Willford 
>
> Add check for the end of the entries for the thread partition.
> Add test for lazy init name hash with specific directory structure
>
> The lazy init hash name was causing a buffer overflow when the last
> entry in the index was multiple folder deep with parent folders that
> did not have any files in them.
>
> This adds a test for the boundary condition of the thread partitions
> with the folder structure that was triggering the buffer overflow.
>
> The fix was to check if it is the last entry for the thread partition
> in the handle_range_dir and not try to use the next entry in the cache.
>
> Signed-off-by: Kevin Willford 
> Signed-off-by: Johannes Schindelin 
> Signed-off-by: Jeff Hostetler 
>
> ---

Will queue with ...

>  name-hash.c |  4 +++-
>  t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 t/t3008-ls-files-lazy-init-name-hash.sh

... this thing fixed by "chmod +x" (otherwise the tests won't start).

Thanks.


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Junio C Hamano
Michael J Gruber  writes:

>>What problem are you solving?  
>
> Sorry, I forgot about that change and failed to mention it.
>
> It makes no difference in the non-debug case which cares about the
> Boolean only. In the debug case, I want to distinguish between
> annotated and lightweight tags, just like describe --debug
> does. By adding 1 via deref and passing this down, I know that an
> annotated tag gets the value 2, a lightweight tag 1 and everything
> else 0, just like describe --tags.

If you want to only affect debug display, perhaps you should start
with a patch like the attached, before adding any debug code as a
preparatory step.  Then add your debugging thing, _WITHOUT_ the
increment in name_rev(), that uses from_tag to choose between
lightweight and annotated as a separate step.

When we decide that it would make sense to give precedence to
annotated ones over lightweight ones in is_better_name(), the
comparison can be further tweaked to actually compare values of the
from_tag thing in *name and the current candidate.  That would have
to be a separate step, as it changes the semantics (I suspect it
would be a better change but it may not be).

How does that sound?

-- >8 --
Subject: name-rev: allow to tell annotated and lightweight tags apart

We do not use this feature yet, but from_tag that is passed around
and kept in the rev_name structure now takes three values, instead
of a boolean "did this come from refs/tags/ hierarchy?".  A new
value '2' is "this is an annotated tag that came from refs/tags/
hierarchy".

---
 builtin/name-rev.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index bf7ed015ae..fe2d306e7c 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -41,7 +41,7 @@ static int is_better_name(struct rev_name *name,
 * We know that at least one of them is a non-tag at this point.
 * favor a tag over a non-tag.
 */
-   if (name->from_tag != from_tag)
+   if (!!name->from_tag != !!from_tag)
return from_tag;
 
/*
@@ -247,8 +247,11 @@ static int name_ref(const char *path, const struct 
object_id *oid, int flags, vo
}
if (o && o->type == OBJ_COMMIT) {
struct commit *commit = (struct commit *)o;
-   int from_tag = starts_with(path, "refs/tags/");
-
+   int from_tag;
+   if (starts_with(path, "refs/tags/"))
+   from_tag = 1 + deref;
+   else
+   from_tag = 0;
if (taggerdate == ULONG_MAX)
taggerdate = ((struct commit *)o)->date;
path = name_ref_abbrev(path, can_abbreviate_output);


Re: [GSoC] Proposal: turn git-add--interactive.perl into a builtin

2017-03-31 Thread Daniel Ferreira (theiostream)
Well, Google requires me to have a draft on a Google Doc anyway for
the proposal, and I am unsure who exactly it will reach. Since it *is*
part of the discussion regarding my proposal, I suppose it is worth
posting here for anyone to comment:
https://docs.google.com/document/d/1dvF2PNRQvvZ351jCdKzOLs7tzaDqhR7ci7TDgzYQg9I/edit?usp=sharing.

-- Daniel.

On Fri, Mar 31, 2017 at 2:07 AM, Daniel Ferreira (theiostream)
 wrote:
> Hi Stefan & Johannes,
>
> Thank you for the precious feedback on the proposal. I don't see much
> sense in sending a full "v2" of it and have you read it all over
> again, so I'll just answer to your comments directly.
>
> Also, although the GSoC website allows me to send a "proposal draft"
> to you through the website, since I've already sent it here that
> shouldn't be necessary, correct? I intend to use it just to send the
> final thing.
>
> On Wed, Mar 29, 2017 at 9:01 PM, Johannes Schindelin
>  wrote:
>> On Tue, 28 Mar 2017, Stefan Beller wrote:
>>
>>> On Sat, Mar 25, 2017 at 8:15 PM, Daniel Ferreira (theiostream)
>>>  wrote:
>>>
>>> > SYNOPSIS
>>> > There are many advantages to converting parts of git that are still
>>> > scripts to C builtins, among which execution speed, improved
>>> > compatibility and code deduplication.
>>>
>>> agreed.
>>
>> I would even add portability. But yeah, speed is a big thing. I am an
>> extensive user of `git add -p` (which is backed by
>> git-add--interactive.perl) and it is slow as molasses on Windows, just
>> because it is a Perl script (and the Perl interpreter needs to emulate
>> POSIX functionality that is frequently not even needed, such as: copying
>> all memory and reopening all file descriptors in a fork() call only to
>> exec() git.exe right away, tossing all of the diligently work into the
>> dustbin).
>
> Thanks for this example – it hadn't come to my mind since I don't use
> Git on Windows. I'll be sure to complement the synopsis with it. :)
>
>>
>>> > FEASIBILITY
>>> >
>>> > There was only one discussion regarding the feasibility of its porting
>>> > (https://public-inbox.org/git/CAP8UFD2PcBsU6=FK4OHVrB7E98ycohS_0pYcbCBar=of1hl...@mail.gmail.com/).
>>> > It resulted in a consensus that doing it would be a task too large –
>>> > although interesting – for GSoC 2015 based on the amount of its lines
>>> > of code. It is, however, only a few lines larger than
>>> > git-rebase--interactive, which has been considered an appropriate
>>> > idea. As such, it looks like a possible project for three months of
>>> > full-time work.
>>>
>>> ok, it sounds a challenging project. (currently counting 1750 lines of
>>> code). Scrolling over the source code, there are quite a couple of
>>> functions, where the direct equivalent in C springs to mind.
>>>
>>> run_cmd_pipe -> see run-command.h
>>> unquote_path -> unquote_c_style ?
>>> refresh -> update_index_if_able()
>>> list_modified -> iterate over "const struct cache_entry *ce = 
>>> active_cache[i];"
>
> Thank you for these functions. I don't think I will be able to specify
> them in detail as part of the projected timeline (e.g. "June 1:
> convert calls to refresh() to use update_index_if_able()") already
> because there is not enough time prior to the proposal deadline to
> study their behavior in detail, and I like to avoid talking about
> things I don't fully understand. Although I think I can cite them as
> examples for a thesis I had put elsewhere in the proposal that "Git
> APIs in Perl already have functional equivalents in C".
>
> Also, they will be great for my early investigation stage into
> git-add--interactive. :) Once more, thanks for having listed them.
>
>> Yes, I think it would be more important to acquaint oneself with the
>> idiosynchracies of Git's internal "API" than to get familiar with Perl:
>> interpreting what obscure Perl code does is something I would gladly do as
>> a mentor.
>
> That's really nice! I usually don't get stuck when trying to
> understand code in languages I'm not too well acquainted with, but I
> figured getting more familiar with Perl would speed development up.
> But it does make sense that this "prior to May 4" might be better
> invested learning about git's internals than Perl.
>
> Question: do you suggest any pending bugfix to git-add--interactive or
> to something related that might give some useful knowledge in advance?
> (for the pre-code period). My microproject involves playing with the
> dir_iterator interface, which is a great exercise in code refactoring
> but really does not teach me too much about Git's architecture.
>
> Even if you do not have an answer to this, I'm pretty sure I'll keep
> this commitment to submitting some patch series somehow related to
> git-add before GSoC begins, especially after this comment from
> Johannes.
>
>>
>>> > PROJECTED TIMELINE
>>> > - Prior to May 4
>>> > -- Refine my basic knowledge of Perl
>>> > -- Craft one or two small patches to some of 

Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Jeff King
On Fri, Mar 31, 2017 at 11:19:54AM -0700, Junio C Hamano wrote:

> > Er, no, that's totally wrong. "status' may be holding the type. It
> > should really be:
> >
> >   return status < 0 ? status : 0;
> 
> Sounds more like it.  The only caller will say "ah, that object is
> not available to us---let's try packs again", which is exactly what
> we want to happen.

Right. Callers cannot distinguish between "did not have it" and
"corrupted", but that is no different than the rest of the sha1-file
interface.

> There is another bug in the codepath: the assignment to *oi->typep
> in the pre-context must guard against negative status value.  By
> returning an error correctly like you do above, that bug becomes
> more or less irrelevant, though.

I think that is intentional. OBJ_BAD is -1 (though the callers are
assuming that error() == OBJ_BAD). And that type gets relayed through
sha1_object_info() as the combined type/status return value.

-Peff


[PATCH v2] git-gui: Error on systems with TK < 8.6.0

2017-03-31 Thread Peter van der Does
Using git-gui on systems that run a TK version below 8.6.0 results in a
crash when checking for the current theme.

Catch the error on those systems and use a different command to check
for the current theme.

Signed-off-by: Peter van der Does 
---
 lib/themed.tcl | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/themed.tcl b/lib/themed.tcl
index 351a712..bb4e8f2 100644
--- a/lib/themed.tcl
+++ b/lib/themed.tcl
@@ -248,7 +248,11 @@ proc tspinbox {w args} {
 proc ttext {w args} {
global use_ttk
if {$use_ttk} {
-   switch -- [ttk::style theme use] {
+   # Handle either current Tk or older versions of 8.5
+   if {[catch {set theme [ttk::style theme use]}]} {
+   set theme  $::ttk::currentTheme
+   }
+   switch -- $theme {
"vista" - "xpnative" {
lappend args -highlightthickness 0 -borderwidth 0
}
-- 
2.12.2


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Junio C Hamano
Junio C Hamano  writes:

> Michael J Gruber  writes:
>
>>>The only case that this change may make a difference I can think of
>>>is when you have a tag object pointed at from outside refs/tags
>>>(e.g. refs/heads/foo is a tag object); if you are trying to change
>>>the definition of "from_tag" from the current "Is the tip inside
>>>refs/tags/?" to "Is the tip either inside refs/tags/ or is it a tag
>>>object anywhere?", that may be a good change (I didn't think things
>>>through, though), but that shouldn't be hidden inside a commit that
>>>claims to only add support for debugging.
>>>
>>>What problem are you solving?  
>>
>> Sorry, I forgot about that change and failed to mention it.
>>
>> It makes no difference in the non-debug case which cares about the
>> Boolean only. In the debug case, I want to distinguish between
>> annotated and lightweight tags, just like describe --debug does. By
>> adding 1 via deref and passing this down, I know that an annotated tag
>> gets the value 2, a lightweight tag 1 and everything else 0, just like
>> describe --tags.
>
> So it sounds like you meant to do something else, and the
> implementation is wrong for that something else (i.e. it wouldn't do
> the right thing for a tag object outside refs/tags/, with or without
> the "--debug" option passed).

The damage seems worse, but I may be misreading the code.

is_better_name() compares name->from_tag and from_tag numerically,
because it was designed to take a boolean view of that variable.
Now, an artificially bumped 2 gets compared with name->from_tag that
may be 1 and gets different priority.  That artificially inflated
value may be propagated to name->from_tag when the current tip is
judged as a better basis for naming the object.

If this change is only for debugging, perhaps inside if(data->debug)
you added, instead of looking at from_tag, you can look at both
from_tag and deref to choose which prio-nmes to show, without
butchering the value in from_tag variable to affect the existing
code that is exercised with or without --debug?


Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Mar 31, 2017 at 01:45:15PM -0400, Jeff King wrote:
>
>> I suspect this may improve things, but I haven't dug deeper to see if
>> there are unwanted side effects, or if there are other spots that need
>> similar treatment.
>> 
>> diff --git a/sha1_file.c b/sha1_file.c
>> index 43990dec7..38411f90b 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char 
>> *sha1,
>>  if (status && oi->typep)
>>  *oi->typep = status;
>>  strbuf_release();
>> -return 0;
>> +return status;
>>  }
>>  
>>  int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
>> *oi, unsigned flags)
>
> Er, no, that's totally wrong. "status' may be holding the type. It
> should really be:
>
>   return status < 0 ? status : 0;

Sounds more like it.  The only caller will say "ah, that object is
not available to us---let's try packs again", which is exactly what
we want to happen.

There is another bug in the codepath: the assignment to *oi->typep
in the pre-context must guard against negative status value.  By
returning an error correctly like you do above, that bug becomes
more or less irrelevant, though.





Re: [PATCH] status: show in-progress info for short status

2017-03-31 Thread Junio C Hamano
Michael J Gruber  writes:

> Ordinary (long) status shows information about bisect, revert, am,
> rebase, cherry-pick in progress, and so does git-prompt.sh. status
> --short currently shows none of this information.
>
> Introduce an `--inprogress` argument to git status so that, when used with
> `--short --branch`, in-progress information is shown next to the branch
> information. Just like `--branch`, this comes with a config option.
>
> The wording for the in-progress information is taken over from
> git-prompt.sh.
>
> Signed-off-by: Michael J Gruber 

I haven't formed an opinion on the feature itself, or the way it is
triggered, so I won't comment on them.  I just say --porcelain (any
version) may (or may not) want to be extended in backward compatible
way (but again I haven't formed an opinion on the issue--I just know
and say there is an issue there that needs to be considered at this
point).

> diff --git a/t/t7512-status-help.sh b/t/t7512-status-help.sh
> index 458608cc1e..103e006249 100755
> --- a/t/t7512-status-help.sh
> +++ b/t/t7512-status-help.sh
> @@ -74,7 +74,6 @@ test_expect_success 'prepare for rebase conflicts' '
>  
>  
>  test_expect_success 'status when rebase in progress before resolving 
> conflicts' '
> - test_when_finished "git rebase --abort" &&
>   ONTO=$(git rev-parse --short HEAD^^) &&
>   test_must_fail git rebase HEAD^ --onto HEAD^^ &&
>   cat >expected < @@ -96,6 +95,15 @@ EOF
>   test_i18ncmp expected actual
>  '
>  
> +test_expect_success 'short status when rebase in progress' '
> + test_when_finished "git rebase --abort" &&
> + cat >expected < +## HEAD (no branch); REBASE-m
> +UU main.txt
> +EOF
> + git status --untracked-files=no --short --branch --inprogress >actual &&
> + test_i18ncmp expected actual
> +'

This is not a good way to structure the test.  If the one in the
previous hunk is what creates a conflicted state by running
"rebase", check the status output from within that test, after the
conflicting "rebase" fails and other things the existing test checks
are tested.  That way, you do not have to worry about this new check
getting confused if the previous one fails in the middle.

Likewise for the most (if not all---I didn't check very carefully)
of the remaining hunks in this test script.


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Junio C Hamano
Michael J Gruber  writes:

>>The only case that this change may make a difference I can think of
>>is when you have a tag object pointed at from outside refs/tags
>>(e.g. refs/heads/foo is a tag object); if you are trying to change
>>the definition of "from_tag" from the current "Is the tip inside
>>refs/tags/?" to "Is the tip either inside refs/tags/ or is it a tag
>>object anywhere?", that may be a good change (I didn't think things
>>through, though), but that shouldn't be hidden inside a commit that
>>claims to only add support for debugging.
>>
>>What problem are you solving?  
>
> Sorry, I forgot about that change and failed to mention it.
>
> It makes no difference in the non-debug case which cares about the
> Boolean only. In the debug case, I want to distinguish between
> annotated and lightweight tags, just like describe --debug does. By
> adding 1 via deref and passing this down, I know that an annotated tag
> gets the value 2, a lightweight tag 1 and everything else 0, just like
> describe --tags.

So it sounds like you meant to do something else, and the
implementation is wrong for that something else (i.e. it wouldn't do
the right thing for a tag object outside refs/tags/, with or without
the "--debug" option passed).

>>> @@ -236,7 +273,6 @@ static int name_ref(const char *path, const
>>struct object_id *oid, int flags, vo
>>> }
>>>  
>>> add_to_tip_table(oid->hash, path, can_abbreviate_output);
>>> -
>>> while (o && o->type == OBJ_TAG) {
>>> struct tag *t = (struct tag *) o;
>>> if (!t->tagged)
>>
>>This is a patch noise we can do without.
>>
>>Thanks.


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Michael J Gruber
Am 31. März 2017 18:52:16 MESZ schrieb Junio C Hamano :
>Michael J Gruber  writes:
>
>> Currently, `git describe --contains --debug` does not create any
>debug
>> output because it does not pass the flag down to `git name-rev`,
>which
>> does not know that flag.
>>
>> Teach the latter that flag, so that the former can pass it down (in
>> the following commit).
>>
>> The output is patterned after that of `git describe --debug`, with
>the
>> following differences:
>>
>> describe loops over all args to describe, then over all possible
>> descriptions; name-rev does it the other way round. Therefore, we
>need
>> to amend each possible description by the arg that it is for (and we
>> leave out the "searching to describe" header).
>>
>> The date cut-off for name-rev kicks in way more often than the
>candidate
>> number cut-off of describe, so we do not clutter the output with the
>> cut-off.
>>
>> Signed-off-by: Michael J Gruber 
>> ---
>
>>  static void name_rev(struct commit *commit,
>>  const char *tip_name, unsigned long taggerdate,
>>  int generation, int distance, int from_tag,
>> -int deref)
>> +int deref, struct name_ref_data *data)
>>  {
>>  struct rev_name *name = (struct rev_name *)commit->util;
>>  struct commit_list *parents;
>> @@ -75,6 +88,7 @@ static void name_rev(struct commit *commit,
>>  
>>  if (deref) {
>>  tip_name = xstrfmt("%s^0", tip_name);
>> +from_tag += 1;
>
>Why this change?  I didn't see it explained in the proposed log
>message.  "deref" is true only when our immediate caller is the one
>that inspected the object at the tip and found it to be a tag object
>(i.e. not a lightweight tag or a branch).  from_tag is about "is the
>tip within refs/tags/ hierarchy?  Yes/No?" and such a caller will
>set it appropriately when calling us.  This function just passes it
>down when it recursively calls itself.  
>
>We shouldn't be mucking with that value ourselves here, should we?
>
>The only case that this change may make a difference I can think of
>is when you have a tag object pointed at from outside refs/tags
>(e.g. refs/heads/foo is a tag object); if you are trying to change
>the definition of "from_tag" from the current "Is the tip inside
>refs/tags/?" to "Is the tip either inside refs/tags/ or is it a tag
>object anywhere?", that may be a good change (I didn't think things
>through, though), but that shouldn't be hidden inside a commit that
>claims to only add support for debugging.
>
>What problem are you solving?  

Sorry, I forgot about that change and failed to mention it.

It makes no difference in the non-debug case which cares about the Boolean 
only. In the debug case, I want to distinguish between annotated and 
lightweight tags, just like describe --debug does. By adding 1 via deref and 
passing this down, I know that an annotated tag gets the value 2, a lightweight 
tag 1 and everything else 0, just like describe --tags.

>> @@ -236,7 +273,6 @@ static int name_ref(const char *path, const
>struct object_id *oid, int flags, vo
>>  }
>>  
>>  add_to_tip_table(oid->hash, path, can_abbreviate_output);
>> -
>>  while (o && o->type == OBJ_TAG) {
>>  struct tag *t = (struct tag *) o;
>>  if (!t->tagged)
>
>This is a patch noise we can do without.
>
>Thanks.



[PATCH] failure with diff --submodule=diff with moved nested submodule HEAD

2017-03-31 Thread Stefan Beller
This fails reliable for me.

Signed-off-by: Stefan Beller 
---
 t/t4060-diff-submodule-option-diff-format.sh | 16 
 1 file changed, 16 insertions(+)

diff --git a/t/t4060-diff-submodule-option-diff-format.sh 
b/t/t4060-diff-submodule-option-diff-format.sh
index 7e23b55ea4..89bced3484 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -746,4 +746,20 @@ test_expect_success 'diff --submodule=diff with .git file' 
'
test_cmp expected actual
 '
 
+test_expect_success 'setup nested submodule' '
+   git submodule add -f ./sm2 &&
+   git commit -a -m "add sm2" &&
+   git -C sm2 submodule add ../sm2 &&
+   git -C sm2 commit -a -m "nested sub"
+'
+
+test_expect_success 'move nested submodule HEAD' '
+   git -C sm2/sm2 commit --allow-empty -m "new HEAD"
+'
+
+test_expect_success 'diff --submodule=diff with moved nested submodule HEAD' '
+   git -C sm2 diff --submodule=diff >actual 2>err &&
+   test_must_be_empty err
+'
+
 test_done
-- 
2.12.2.511.g2abb8caf66



Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Jeff King
On Fri, Mar 31, 2017 at 01:45:15PM -0400, Jeff King wrote:

> I suspect this may improve things, but I haven't dug deeper to see if
> there are unwanted side effects, or if there are other spots that need
> similar treatment.
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index 43990dec7..38411f90b 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char 
> *sha1,
>   if (status && oi->typep)
>   *oi->typep = status;
>   strbuf_release();
> - return 0;
> + return status;
>  }
>  
>  int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
> *oi, unsigned flags)

Er, no, that's totally wrong. "status' may be holding the type. It
should really be:

  return status < 0 ? status : 0;

-Peff


Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Jeff King
On Fri, Mar 31, 2017 at 10:35:06AM -0700, Junio C Hamano wrote:

> Lars Schneider  writes:
> 
> > Hi,
> >
> > I just got a report with the following output after a "git fetch" operation
> > using Git 2.11.0.windows.3 [1]:
> >
> > remote: Counting objects: 5922, done.
> > remote: Compressing objects: 100% (14/14), done.
> > error: inflate: data stream error (unknown compression method)
> > error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
> > fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
> > fatal: index-pack failed
> >
> > I would be really surprised if we discovered a SHA1 collision in a 
> > production
> > repo. My guess is that this is somehow triggered by a network issue (see 
> > data
> > stream error). Any tips how to debug this?
> 
> Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c
> to help you identify which one of identical 5 messages is firing.
> 
> My guess would be that the code saw an object that came over the
> wire, hashed it to learn that its object name is
> 6acd8f279a8b20311665f41134579b7380970446, noticed that it locally
> already has the object with the same name, and tried to make sure
> they have identical contents (otherwise, what came over the wire is
> a successful second preimage attack).  But your loose object on disk
> you already had was corrupt and didn't inflate correctly when
> builtin/index-pack.c::compare_objects() or check_collision() tried
> to.  The code saw no data, or truncated data, or
> whatever---something different from what the other data that hashed
> down to 6acd8..., and reported a collision when there is no
> collision.

My guess is the one in compare_objects(). The "unable to unpack" error
comes from sha1_loose_object_info(). We'd normally then follow up with
read_sha1_file(), which would generate its own set of errors.

But if open_istream() got a bogus value for the object size (but didn't
correctly report an error), then it would probably quietly return 0
bytes from read_istream() later.

I suspect this may improve things, but I haven't dug deeper to see if
there are unwanted side effects, or if there are other spots that need
similar treatment.

diff --git a/sha1_file.c b/sha1_file.c
index 43990dec7..38411f90b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
if (status && oi->typep)
*oi->typep = status;
strbuf_release();
-   return 0;
+   return status;
 }
 
 int sha1_object_info_extended(const unsigned char *sha1, struct object_info 
*oi, unsigned flags)

-Peff


Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Junio C Hamano
Lars Schneider  writes:

> Hi,
>
> I just got a report with the following output after a "git fetch" operation
> using Git 2.11.0.windows.3 [1]:
>
> remote: Counting objects: 5922, done.
> remote: Compressing objects: 100% (14/14), done.
> error: inflate: data stream error (unknown compression method)
> error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
> fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
> fatal: index-pack failed
>
> I would be really surprised if we discovered a SHA1 collision in a production
> repo. My guess is that this is somehow triggered by a network issue (see data
> stream error). Any tips how to debug this?

Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c
to help you identify which one of identical 5 messages is firing.

My guess would be that the code saw an object that came over the
wire, hashed it to learn that its object name is
6acd8f279a8b20311665f41134579b7380970446, noticed that it locally
already has the object with the same name, and tried to make sure
they have identical contents (otherwise, what came over the wire is
a successful second preimage attack).  But your loose object on disk
you already had was corrupt and didn't inflate correctly when
builtin/index-pack.c::compare_objects() or check_collision() tried
to.  The code saw no data, or truncated data, or
whatever---something different from what the other data that hashed
down to 6acd8..., and reported a collision when there is no
collision.



[PATCH] name-hash: fix buffer overrun

2017-03-31 Thread git
From: Kevin Willford 

Add check for the end of the entries for the thread partition.
Add test for lazy init name hash with specific directory structure

The lazy init hash name was causing a buffer overflow when the last
entry in the index was multiple folder deep with parent folders that
did not have any files in them.

This adds a test for the boundary condition of the thread partitions
with the folder structure that was triggering the buffer overflow.

The fix was to check if it is the last entry for the thread partition
in the handle_range_dir and not try to use the next entry in the cache.

Signed-off-by: Kevin Willford 
Signed-off-by: Johannes Schindelin 
Signed-off-by: Jeff Hostetler 

---
 name-hash.c |  4 +++-
 t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 t/t3008-ls-files-lazy-init-name-hash.sh

diff --git a/name-hash.c b/name-hash.c
index cac313c..39309ef 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -342,7 +342,9 @@ static int handle_range_dir(
 * Scan forward in the index array for index entries having the same
 * path prefix (that are also in this directory).
 */
-   if (strncmp(istate->cache[k_start + 1]->name, prefix->buf, prefix->len) 
> 0)
+   if (k_start + 1 >= k_end)
+   k = k_end;
+   else if (strncmp(istate->cache[k_start + 1]->name, prefix->buf, 
prefix->len) > 0)
k = k_start + 1;
else if (strncmp(istate->cache[k_end - 1]->name, prefix->buf, 
prefix->len) == 0)
k = k_end;
diff --git a/t/t3008-ls-files-lazy-init-name-hash.sh 
b/t/t3008-ls-files-lazy-init-name-hash.sh
new file mode 100644
index 000..e46a11b
--- /dev/null
+++ b/t/t3008-ls-files-lazy-init-name-hash.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='Test the lazy init name hash with various folder structures'
+
+. ./test-lib.sh
+
+test_expect_success 'no buffer overflow in lazy_init_name_hash' '
+   (
+   test_seq 2000 | sed "s/^/a_/"
+   echo b/b/b
+   test_seq 2000 | sed "s/^/c_/"
+   test_seq 50 | sed "s/^/d_/" | tr "\n" "/"; echo d
+   ) |
+   sed "s/^/100644 $EMPTY_BLOB\t/" |
+   git update-index --index-info &&
+   test-lazy-init-name-hash -m
+'
+
+test_done
-- 
2.9.3



Re: git diff --submodule=diff fails with submodules in a submodule

2017-03-31 Thread Jacob Keller
On Fri, Mar 31, 2017 at 10:07 AM, Stefan Beller  wrote:
> +cc Jacob, who implemented --submodule=diff
>
> On Fri, Mar 31, 2017 at 8:40 AM, David Parrish  wrote:
>> When I try to run `git diff --submodule=diff` in a submodule which has
>> it's own submodules that have changes I get the error: fatal: bad
>> object
>
> Thanks for the bug report!
>
>> Let me know if you need an example reproduce the issue.
>
> I could reproduce it when playing around locally with a submodule in
> submodules. I think sub-submodule needs to have its HEAD moved from
> the recorded commit.
>
> Thanks,
> Stefan

Hmm. An example reproduction would be helpful. Ideally in the form of
a test ;) But otherwise whatever helps. I will try to look at this,
but I'm  busy for a few days.

Thanks,
Jake


[PATCH] name-hash: fix buffer overrun

2017-03-31 Thread git
From: Jeff Hostetler 

Fix buffer overrun in handle_range_dir() when the final entry
in the index was the only file in the last directory, such as
"a/b/foo.txt". The look ahead (k_start + 1) was invalid since
(k_start + 1) == k_end.

This bug was introduced by Jeff in "jh/memihash-opt" which was
recently merged into master.

Kevin Willford (1):
  name-hash: fix buffer overrun

 name-hash.c |  4 +++-
 t/t3008-ls-files-lazy-init-name-hash.sh | 19 +++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 t/t3008-ls-files-lazy-init-name-hash.sh

-- 
2.9.3



Re: SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Jeff King
On Fri, Mar 31, 2017 at 06:05:17PM +0200, Lars Schneider wrote:

> I just got a report with the following output after a "git fetch" operation
> using Git 2.11.0.windows.3 [1]:
> 
> remote: Counting objects: 5922, done.
> remote: Compressing objects: 100% (14/14), done.
> error: inflate: data stream error (unknown compression method)
> error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
> fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
> fatal: index-pack failed
> 
> I would be really surprised if we discovered a SHA1 collision in a production
> repo. My guess is that this is somehow triggered by a network issue (see data
> stream error). Any tips how to debug this?

I'd be surprised, too. :)

I'm not sure that inflate error actually comes from the network pack.
The "unable to unpack $sha1 header" message actually comes from
sha1_object_info_loose(). Which means we're failing to read our _local_
version of 6acd8f279a8b, which is an object we believe is coming in from
the network.

And that would explain the false-positive collision. We computed the
sha1 on something come from the network. We believe we have an object
with that sha1 already (but it's corrupted), and then when we compared
the real bytes to the corrupted bytes, they didn't match.

We should be able to confirm that by running "git fsck" on the local
repo, which I'd expect to complain about the loose object.

But what I find disturbing there is that we did not notice the failure
when accessing the loose object. It seems to have been lost in the call
chain. I think the problem is that sha1_loose_object_info() may report
errors in two ways: returning -1 if it did not find the object, or
putting OBJ_BAD into the type field if it found a corrupt object. But
callers are not aware of the second one.

I think it should probably return -1 for a corruption, too, and act as
if we don't have the object (because we effectively don't).

-Peff


[PATCH v3] http.postbuffer: allow full range of ssize_t values

2017-03-31 Thread David Turner
Unfortunately, in order to push some large repos, the http postbuffer
must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
we just malloc a larger buffer.

Signed-off-by: David Turner 
---

This version fixes the definition of git_parse_ssize_t to return int.

Sorry for the sloppiness.

 cache.h  |  1 +
 config.c | 17 +
 http.c   |  4 ++--
 http.h   |  2 +-
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index fbdf7a815a..5e6747dbb4 100644
--- a/cache.h
+++ b/cache.h
@@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
 extern int64_t git_config_int64(const char *, const char *);
 extern unsigned long git_config_ulong(const char *, const char *);
+extern ssize_t git_config_ssize_t(const char *, const char *);
 extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
 extern int git_config_maybe_bool(const char *, const char *);
diff --git a/config.c b/config.c
index 1a4d85537b..de5b155a4e 100644
--- a/config.c
+++ b/config.c
@@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned long *ret)
return 1;
 }
 
+static int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+   ssize_t tmp;
+   if (!git_parse_signed(value, , 
maximum_signed_value_of_type(ssize_t)))
+   return 0;
+   *ret = tmp;
+   return 1;
+}
+
 NORETURN
 static void die_bad_number(const char *name, const char *value)
 {
@@ -892,6 +901,14 @@ unsigned long git_config_ulong(const char *name, const 
char *value)
return ret;
 }
 
+ssize_t git_config_ssize_t(const char *name, const char *value)
+{
+   ssize_t ret;
+   if (!git_parse_ssize_t(value, ))
+   die_bad_number(name, value);
+   return ret;
+}
+
 int git_parse_maybe_bool(const char *value)
 {
if (!value)
diff --git a/http.c b/http.c
index 96d84bbed3..22f8167ba2 100644
--- a/http.c
+++ b/http.c
@@ -19,7 +19,7 @@ long int git_curl_ipresolve;
 #endif
 int active_requests;
 int http_is_verbose;
-size_t http_post_buffer = 16 * LARGE_PACKET_MAX;
+ssize_t http_post_buffer = 16 * LARGE_PACKET_MAX;
 
 #if LIBCURL_VERSION_NUM >= 0x070a06
 #define LIBCURL_CAN_HANDLE_AUTH_ANY
@@ -331,7 +331,7 @@ static int http_options(const char *var, const char *value, 
void *cb)
}
 
if (!strcmp("http.postbuffer", var)) {
-   http_post_buffer = git_config_int(var, value);
+   http_post_buffer = git_config_ssize_t(var, value);
if (http_post_buffer < LARGE_PACKET_MAX)
http_post_buffer = LARGE_PACKET_MAX;
return 0;
diff --git a/http.h b/http.h
index 02bccb7b0c..f7bd3b26b0 100644
--- a/http.h
+++ b/http.h
@@ -111,7 +111,7 @@ extern struct curl_slist *http_copy_default_headers(void);
 extern long int git_curl_ipresolve;
 extern int active_requests;
 extern int http_is_verbose;
-extern size_t http_post_buffer;
+extern ssize_t http_post_buffer;
 extern struct credential http_auth;
 
 extern char curl_errorstr[CURL_ERROR_SIZE];
-- 
2.11.GIT



Re: [PATCH] git-gui: Error on systems with TK < 8.6.0

2017-03-31 Thread Junio C Hamano
Peter van der Does  writes:

> Using git-gui on systems that run a TK version below 8.6.0 results in a
> crash when checking for the current theme.
>
> Catch the error on those systems and use a different command to check
> for the current theme.
> ---

Needs sign-off.  Also if you can make the patch against the git-gui
project (the upstream project for this part of our tree, which has
this file at lib/themed.tcl, not at git-gui/lib/themed.tcl) and send
it to its maintainer (Cc'ed), it would be great.

Thanks.

>  git-gui/lib/themed.tcl | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/git-gui/lib/themed.tcl b/git-gui/lib/themed.tcl
> index 351a712c8..bb4e8f25e 100644
> --- a/git-gui/lib/themed.tcl
> +++ b/git-gui/lib/themed.tcl
> @@ -248,7 +248,11 @@ proc tspinbox {w args} {
>  proc ttext {w args} {
>   global use_ttk
>   if {$use_ttk} {
> - switch -- [ttk::style theme use] {
> + # Handle either current Tk or older versions of 8.5
> + if {[catch {set theme [ttk::style theme use]}]} {
> + set theme  $::ttk::currentTheme
> + }
> + switch -- $theme {
>   "vista" - "xpnative" {
>   lappend args -highlightthickness 0 -borderwidth > 0
>   }


Re: [PATCH v2 01/20] get_ref_dir(): don't call read_loose_refs() for "refs/bisect"

2017-03-31 Thread Stefan Beller
On Fri, Mar 31, 2017 at 7:10 AM, Michael Haggerty  wrote:
> Since references under "refs/bisect/" are per-worktree, they have to
> be sought in the worktree rather than in the main repository. But
> since loose references are found by traversing directories, the
> reference iterator won't even get the idea to look for a
> "refs/bisect/" directory in the worktree if there is not a directory
> with that name in the main repository. Thus `get_ref_dir()` manually
> inserts a dir_entry for "refs/bisect/" whenever it reads the entry for
> "refs/".
>
> The current code then immediately calls `read_loose_refs()` on that
> directory. But since the dir_entry is created with its `incomplete`
> flag set, any traversal that gets to this point will read the
> directory automatically. So there is no need to call
> `read_loose_refs()` explicitly; the lazy mechanism suffices.
>
> And in fact, the attempt to `read_loose_refs()` was broken anyway.
> That function needs its `dirname` argument to have a trailing `/`
> character, but the invocation here was passing it "refs/bisect"
> without a trailing slash. So `read_loose_refs()` would read
> `$GIT_DIR/refs/bisect" correctly, but if it found an entry "foo" in
> that directory, it would try to read "$GIT_DIR/refs/bisectfoo".
> Normally it wouldn't find anything at that path, but the failure was
> canceled out because `get_ref_dir()` *also* forgot to reset the
> `REF_INCOMPLETE` bit on the dir_entry. So the read was attempted again
> when it was accessed, via the lazy mechanism, and this time the read
> was done correctly.

With all this retry logic going on, it is hard to add a test demonstrating
this is correct now. The description makes sense though.

Thanks,
Stefan


Re: git diff --submodule=diff fails with submodules in a submodule

2017-03-31 Thread Stefan Beller
+cc Jacob, who implemented --submodule=diff

On Fri, Mar 31, 2017 at 8:40 AM, David Parrish  wrote:
> When I try to run `git diff --submodule=diff` in a submodule which has
> it's own submodules that have changes I get the error: fatal: bad
> object

Thanks for the bug report!

> Let me know if you need an example reproduce the issue.

I could reproduce it when playing around locally with a submodule in
submodules. I think sub-submodule needs to have its HEAD moved from
the recorded commit.

Thanks,
Stefan


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Junio C Hamano
Junio C Hamano  writes:

> The only case that this change may make a difference I can think of
> is when you have a tag object pointed at from outside refs/tags
> (e.g. refs/heads/foo is a tag object); if you are trying to change
> the definition of "from_tag" from the current "Is the tip inside
> refs/tags/?" to "Is the tip either inside refs/tags/ or is it a tag
> object anywhere?", that may be a good change (I didn't think things
> through, though), but that shouldn't be hidden inside a commit that
> claims to only add support for debugging.

And if that "a tag object outside refs/tags/" is what you are
solving, I think a better place to do it is in name_ref().  

Instead of saying "from_tag is true iff it starts with refs/tags/",
you'd say "... or deref is set to true, because we know that the
original was a tag object in that case".


> What problem are you solving?  
>
>> @@ -236,7 +273,6 @@ static int name_ref(const char *path, const struct 
>> object_id *oid, int flags, vo
>>  }
>>  
>>  add_to_tip_table(oid->hash, path, can_abbreviate_output);
>> -
>>  while (o && o->type == OBJ_TAG) {
>>  struct tag *t = (struct tag *) o;
>>  if (!t->tagged)
>
> This is a patch noise we can do without.
>
> Thanks.


Re: [PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Junio C Hamano
Michael J Gruber  writes:

> Currently, `git describe --contains --debug` does not create any debug
> output because it does not pass the flag down to `git name-rev`, which
> does not know that flag.
>
> Teach the latter that flag, so that the former can pass it down (in
> the following commit).
>
> The output is patterned after that of `git describe --debug`, with the
> following differences:
>
> describe loops over all args to describe, then over all possible
> descriptions; name-rev does it the other way round. Therefore, we need
> to amend each possible description by the arg that it is for (and we
> leave out the "searching to describe" header).
>
> The date cut-off for name-rev kicks in way more often than the candidate
> number cut-off of describe, so we do not clutter the output with the
> cut-off.
>
> Signed-off-by: Michael J Gruber 
> ---

>  static void name_rev(struct commit *commit,
>   const char *tip_name, unsigned long taggerdate,
>   int generation, int distance, int from_tag,
> - int deref)
> + int deref, struct name_ref_data *data)
>  {
>   struct rev_name *name = (struct rev_name *)commit->util;
>   struct commit_list *parents;
> @@ -75,6 +88,7 @@ static void name_rev(struct commit *commit,
>  
>   if (deref) {
>   tip_name = xstrfmt("%s^0", tip_name);
> + from_tag += 1;

Why this change?  I didn't see it explained in the proposed log
message.  "deref" is true only when our immediate caller is the one
that inspected the object at the tip and found it to be a tag object
(i.e. not a lightweight tag or a branch).  from_tag is about "is the
tip within refs/tags/ hierarchy?  Yes/No?" and such a caller will
set it appropriately when calling us.  This function just passes it
down when it recursively calls itself.  

We shouldn't be mucking with that value ourselves here, should we?

The only case that this change may make a difference I can think of
is when you have a tag object pointed at from outside refs/tags
(e.g. refs/heads/foo is a tag object); if you are trying to change
the definition of "from_tag" from the current "Is the tip inside
refs/tags/?" to "Is the tip either inside refs/tags/ or is it a tag
object anywhere?", that may be a good change (I didn't think things
through, though), but that shouldn't be hidden inside a commit that
claims to only add support for debugging.

What problem are you solving?  

> @@ -236,7 +273,6 @@ static int name_ref(const char *path, const struct 
> object_id *oid, int flags, vo
>   }
>  
>   add_to_tip_table(oid->hash, path, can_abbreviate_output);
> -
>   while (o && o->type == OBJ_TAG) {
>   struct tag *t = (struct tag *) o;
>   if (!t->tagged)

This is a patch noise we can do without.

Thanks.



Lieber Freund,

2017-03-31 Thread Dr.John Martin
Barclays Bank PLC
28 High Street
Nottinghamshire
Vereinigtes Königreich
NG1 2bd


Lieber Freund,


Mein Name ist Doz k. Martins und ich bin der chief Officer der internationalen 
Transaktionen von der Barclays Bank London, Vereinigtes Königreich.

Ich entdeckte einer Summe von £ 16,5 Millionen (sechzehn Millionen fünf hundert 
tausend Pfund Sterling) in einem Konto, dass gehört, die vor 10 Jahren 
gestorben.

Ich Suche Ihre Zustimmung an Sie als die nächsten Angehörigen zu präsentieren / 
Will Beneficiary auf diesen Betrag geschätzt auf £ 16,5 Millionen Pfund. Alles, 
was ich jetzt brauche ist Ihre ehrliche Zusammenarbeit, Diskretion und 
Vertrauen zu ermöglichen, sehen diese Transaktion durchgeführt.


Ich brauche dieses Geld von meiner Bank an Ihre Bank zu übertragen, so dass wir 
beide sich das Geld teilen werden und ich brauche die Übertragung dringend 
getan werden.

Bitte geben Sie mir die folgenden, wie wir 7 Tage Zeit haben, führen Sie es 
durch. Dies ist sehr dringend bitte. - durch diese e-Mail 
address;mdrjoh...@gmail.com reagieren.


1. vollständiger Name:
2. Direkte Handynummer:
3. Ihre Kontaktadresse:


Doz k.Martins


Re: Terrible bad performance for it blame --date=iso -C -C master --

2017-03-31 Thread Samuel Lijin
Hi Ulrich,

Is there any chance you could share the repo where this is coming from?

This is actually something a colleague and I are looking into seeing
if we can crunch out some performance gains since -C -C isn't
threaded.

Sam

On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano  wrote:
> "Ulrich Windl"  writes:
>
>> I was running "vc-annotate" in Emacs for a file from a large
>> repository (>4 files, a big percentage being binary, about 10
>> commits). For the first file the result was presented rather soon, but
>> for a second file the command did not finish even after about 10
>> minutes!
>>
>> The file in question is a rather short text file (124 kB), and
>> according to git log it has one commit.
>>
>> While being bored, I did an strace of the command to find out that a
>> huge number of files is inspected.
>
> With -C -C the user (vc-annotate?) is asking to inspect huge number
> of files, to find if the contents of the file (except for the part
> that came from its own previous version) came from other existing
> files.  So this is very much expected.
>
> It might not be a bad idea to teach "blame" not to pay attention to
> any path that is marked as "-diff" (e.g. binary files) when trying
> to see if remaining contents appeared by borrowing from them.  We do
> not have that heuristics (yet).


SHA1 collision in production repo?! (probably not)

2017-03-31 Thread Lars Schneider
Hi,

I just got a report with the following output after a "git fetch" operation
using Git 2.11.0.windows.3 [1]:

remote: Counting objects: 5922, done.
remote: Compressing objects: 100% (14/14), done.
error: inflate: data stream error (unknown compression method)
error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
fatal: index-pack failed

I would be really surprised if we discovered a SHA1 collision in a production
repo. My guess is that this is somehow triggered by a network issue (see data
stream error). Any tips how to debug this?

Thanks,
Lars

[1] Git for Windows build based on Git v2.11.0


Re: [PATCH v2 00/20] Separate `ref_cache` into a separate module

2017-03-31 Thread Junio C Hamano
Michael Haggerty  writes:

> This version literally only contains a few commit message changes and
> one minor comment changes relative to v1. The code is identical. I
> wasn't sure whether it is even worth sending this patch series to the
> ML again; Junio, if you'd prefer I just send a link to a published
> branch in such cases, please let me know.

The review on the list is not about letting me pick it up, but is
about giving reviewing contributors a chance to comment.  I think
for a series this important ;-) it is good that you are giving it
multiple exposures so that people who were offline the last time can
have a chance to look at it, even if the update is minimum.

> This patch series is also available from my GitHub fork [2] as branch
> "separate-ref-cache". These patches depend on Duy's
> nd/files-backend-git-dir branch.

I am getting the impression that the files-backend thing as well as
this topic are ready for 'next'.  Please stop me if I missed something
in these topics (especially the other one) that needs updating
before that happens.

Thanks.


RE: [PATCH] http.postbuffer: make a size_t

2017-03-31 Thread David Turner
> -Original Message-
> From: Torsten Bögershausen [mailto:tbo...@web.de]
> Sent: Friday, March 31, 2017 12:22 AM
> To: David Turner ; git@vger.kernel.org
> Subject: Re: [PATCH] http.postbuffer: make a size_t
> 
> 
> 
> On 30/03/17 22:29, David Turner wrote:
> > Unfortunately, in order to push some large repos, the http postbuffer
> > must sometimes exceed two gigabytes.  On a 64-bit system, this is OK:
> > we just malloc a larger buffer.
> >
> > Signed-off-by: David Turner 
> > ---
> >  cache.h  |  1 +
> >  config.c | 17 +
> >  http.c   |  2 +-
> >  3 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/cache.h b/cache.h
> > index fbdf7a815a..a8c1b65db0 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1900,6 +1900,7 @@ extern int git_parse_maybe_bool(const char *);
> > extern int git_config_int(const char *, const char *);  extern int64_t
> > git_config_int64(const char *, const char *);  extern unsigned long
> > git_config_ulong(const char *, const char *);
> > +extern size_t git_config_size_t(const char *, const char *);
> >  extern int git_config_bool_or_int(const char *, const char *, int *);
> > extern int git_config_bool(const char *, const char *);  extern int
> > git_config_maybe_bool(const char *, const char *); diff --git
> > a/config.c b/config.c index 1a4d85537b..7b706cf27a 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -834,6 +834,15 @@ int git_parse_ulong(const char *value, unsigned
> long *ret)
> > return 1;
> >  }
> >
> > +static size_t git_parse_size_t(const char *value, unsigned long *ret)
> > +{
> > +   size_t tmp;
> > +   if (!git_parse_signed(value, ,
> maximum_unsigned_value_of_type(size_t)))
> > +   return 0;
> > +   *ret = tmp;
> > +   return 1;
> > +}
> What is the return value here ?
> Isn't it a size_t we want ?

Yeah, that should return an int, since it's just "parsed" vs "unparsed".

> (There was a recent discussion about "unsigned long" vs "size_t", which
>   are the same on many systems, but not under Win64) Would the following
> work ?
> 
> static int git_parse_size_t(const char *value, size_t *ret) {
>   if (!git_parse_signed(value, ret,
> maximum_unsigned_value_of_type(size_t)))
>   return 0;
>   return 1;
> }
> 
> []
> > +size_t git_config_size_t(const char *name, const char *value) {
> > +   unsigned long ret;
> > +   if (!git_parse_size_t(value, ))
> > +   die_bad_number(name, value);
> > +   return ret;
> > +}
> > +
> Same here:
> size_t git_config_size_t(const char *name, const char *value) {
>   size_t ret;
>   if (!git_parse_size_t(value, ))
>   die_bad_number(name, value);
>   return ret;
> }

In v2 I decided to make this a ssize_t, since it was previously a signed int.  
I know we're not using negative values right now, but since nobody has 2**63 
bytes of ram anyway, it's probably safe to keep it signed to keep our options 
open.


Re: Terrible bad performance for it blame --date=iso -C -C master --

2017-03-31 Thread Junio C Hamano
"Ulrich Windl"  writes:

> I was running "vc-annotate" in Emacs for a file from a large
> repository (>4 files, a big percentage being binary, about 10
> commits). For the first file the result was presented rather soon, but
> for a second file the command did not finish even after about 10
> minutes!
>
> The file in question is a rather short text file (124 kB), and
> according to git log it has one commit.
>
> While being bored, I did an strace of the command to find out that a
> huge number of files is inspected.

With -C -C the user (vc-annotate?) is asking to inspect huge number
of files, to find if the contents of the file (except for the part
that came from its own previous version) came from other existing
files.  So this is very much expected.

It might not be a bad idea to teach "blame" not to pay attention to
any path that is marked as "-diff" (e.g. binary files) when trying
to see if remaining contents appeared by borrowing from them.  We do
not have that heuristics (yet).


Re: [PATCH v3 00/20] object_id part 7

2017-03-31 Thread Junio C Hamano
Thanks for these patches.  I didn't see anything questionable in
them (except a very minor thing I sent comments on separately).

Will replace.  Thanks.


Re: [PATCH v3 14/20] sha1-array: convert internal storage for struct sha1_array to object_id

2017-03-31 Thread Junio C Hamano
"brian m. carlson"  writes:

>   for (i = 0; i < array->nr; i++) {
> - strbuf_addstr(_hexs, sha1_to_hex(array->sha1[i]));
> + strbuf_addstr(_hexs, oid_to_hex(array->oid + i));

As I said in the previous round (in my comment on the one that
corresponds to the next patch, which has been updated in this
round), this converts E1[E2] to E1 + E2.

> @@ -621,7 +621,7 @@ static void bisect_rev_setup(struct rev_info *revs, const 
> char *prefix,
>   argv_array_pushf(_argv, bad_format, oid_to_hex(current_bad_oid));
>   for (i = 0; i < good_revs.nr; i++)
>   argv_array_pushf(_argv, good_format,
> -  sha1_to_hex(good_revs.sha1[i]));
> +  oid_to_hex(good_revs.oid + i));

Likewise.

> @@ -715,9 +715,9 @@ static struct commit **get_bad_and_good_commits(int 
> *rev_nr)
>   int i, n = 0;
>  
>   ALLOC_ARRAY(rev, 1 + good_revs.nr);
> - rev[n++] = get_commit_reference(current_bad_oid->hash);
> + rev[n++] = get_commit_reference(current_bad_oid);
>   for (i = 0; i < good_revs.nr; i++)
> - rev[n++] = get_commit_reference(good_revs.sha1[i]);
> + rev[n++] = get_commit_reference(good_revs.oid + i);
>   *rev_nr = n;

Likewise.

> @@ -53,9 +53,9 @@ int sha1_array_for_each_unique(struct sha1_array *array,
>  
>   for (i = 0; i < array->nr; i++) {
>   int ret;
> - if (i > 0 && !hashcmp(array->sha1[i], array->sha1[i-1]))
> + if (i > 0 && !oidcmp(array->oid + i, array->oid + i - 1))
>   continue;

Likewise.

> diff --git a/shallow.c b/shallow.c
> index 11f7dde9d9..dc7b67a294 100644
> --- a/shallow.c
> +++ b/shallow.c
> @@ -273,7 +273,7 @@ static int write_shallow_commits_1(struct strbuf *out, 
> int use_pack_protocol,
>   if (!extra)
>   return data.count;
>   for (i = 0; i < extra->nr; i++) {
> - strbuf_addstr(out, sha1_to_hex(extra->sha1[i]));
> + strbuf_addstr(out, oid_to_hex(extra->oid + i));
>   strbuf_addch(out, '\n');
>   data.count++;
>   }

Likewise.

> @@ -417,13 +417,13 @@ void clear_shallow_info(struct shallow_info *info)
>  
>  void remove_nonexistent_theirs_shallow(struct shallow_info *info)
>  {
> - unsigned char (*sha1)[20] = info->shallow->sha1;
> + struct object_id *oid = info->shallow->oid;
>   int i, dst;
>   trace_printf_key(_shallow, "shallow: 
> remove_nonexistent_theirs_shallow\n");
>   for (i = dst = 0; i < info->nr_theirs; i++) {
>   if (i != dst)
>   info->theirs[dst] = info->theirs[i];
> - if (has_sha1_file(sha1[info->theirs[i]]))
> + if (has_object_file(oid + info->theirs[i]))
>   dst++;
>   }
>   info->nr_theirs = dst;

Likewise.

It is so minor that there is no point rerolling the whole thing only
for these, though.

Thanks.



git diff --submodule=diff fails with submodules in a submodule

2017-03-31 Thread David Parrish
When I try to run `git diff --submodule=diff` in a submodule which has
it's own submodules that have changes I get the error: fatal: bad
object

Let me know if you need an example reproduce the issue.

David


Re: ttk error when starting git gui

2017-03-31 Thread Konstantin Khomoutov
On Fri, 31 Mar 2017 13:29:04 +0200
"Jessie Hernandez"  wrote:

> > On Fri, 31 Mar 2017 09:53:38 +0200
> > "Jessie Hernandez"  wrote:
> >
> > [...]
> >> >> It's possible to have ttk with 8.5 as well (say, here on Debian
> >> >> 8.5 ships with ttk enabled).
> >> >>
> >> >> A proper patch would be
> >> >>
> >> >> -set default_config(gui.usettk) 1
> >> >> +set default_config(gui.usettk) [namespace exists ::ttk]
> >> >>
> >> >> Could you please test it on your system?
> >> >>
> >> >
> >> > Yeah that seems to work.
> >> > Thanks for this.
> >> >
> >> > I have tried it with git 2.12.2
> > [...]
> >> I spoke to soon. :(
> >> It does not work. I forgot to do a install when testing.
> >> Sorry for the confusion.
> >
> > Care to elaborate on the exact error, please?
> >
> 
> The error I get is the following
> 
> Error in startup script: wrong # args: should be "ttk::style theme
> use theme" while executing
> "ttk::style theme use"
> (procedure "ttext" line 4)
> invoked from within
> "ttext $ui_workdir -background white -foreground black \
>   -borderwidth 0 \
>   -width 20 -height 10 \
>   -wrap none \
>   -takefocus 1 -highlightthickness 1\
>   ..."
> (file "/usr/libexec/git-core/git-gui" line 3190)

Too bad: the git-gui relies on the [ttk::style theme use] command --
that is, the two-argument invocation of [ttk::style] -- to return
the theme being currently in use, and this feature was implemented
on 2008-05-27 [1] by Pat Thoyts.

Judging from the output of

  fossil descendants e83b7dd29ddae998f96538584afb518849ac1e2c

the first Tk release to have this change was 8.6b2.

So the proper fix appears to be more involved:

 set default_config(gui.usettk) \
   [expr {[package vcompare [info patchlevel] 8.6b2] >= 0}]

(The slash+newline sequence is not needed -- it's here mostly
for pretty-printing.)

1. http://core.tcl.tk/tk/info/e83b7dd29ddae998


[PATCH v2 13/20] refs: handle "refs/bisect/" in `loose_fill_ref_dir()`

2017-03-31 Thread Michael Haggerty
That "refs/bisect/" has to be handled specially when filling the
ref_cache for loose references is a peculiarity of the files backend,
and the ref-cache code shouldn't need to know about it. So move this
code to the callback function, `loose_fill_ref_dir()`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 15 +++
 refs/ref-cache.c | 16 
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index e4d78393ac..7b5f5c1240 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -508,6 +508,21 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
strbuf_release();
strbuf_release();
closedir(d);
+
+   /*
+* Manually add refs/bisect, which, being per-worktree, might
+* not appear in the directory listing for refs/ in the main
+* repo.
+*/
+   if (!strcmp(dirname, "refs/")) {
+   int pos = search_ref_dir(dir, "refs/bisect/", 12);
+
+   if (pos < 0) {
+   struct ref_entry *child_entry = create_dir_entry(
+   dir->cache, "refs/bisect/", 12, 1);
+   add_entry_to_dir(dir, child_entry);
+   }
+   }
 }
 
 static struct ref_dir *get_loose_refs(struct files_ref_store *refs)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 7f247b9170..0e0c13 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -26,22 +26,6 @@ struct ref_dir *get_ref_dir(struct ref_entry *entry)
die("BUG: incomplete ref_store without fill_ref_dir 
function");
 
dir->cache->fill_ref_dir(dir->cache->ref_store, dir, 
entry->name);
-
-   /*
-* Manually add refs/bisect, which, being
-* per-worktree, might not appear in the directory
-* listing for refs/ in the main repo.
-*/
-   if (!strcmp(entry->name, "refs/")) {
-   int pos = search_ref_dir(dir, "refs/bisect/", 12);
-   if (pos < 0) {
-   struct ref_entry *child_entry;
-   child_entry = create_dir_entry(dir->cache,
-  "refs/bisect/",
-  12, 1);
-   add_entry_to_dir(dir, child_entry);
-   }
-   }
entry->flag &= ~REF_INCOMPLETE;
}
return dir;
-- 
2.11.0



[PATCH v2 18/20] commit_packed_refs(): use reference iteration

2017-03-31 Thread Michael Haggerty
Use reference iteration rather than do_for_each_entry_in_dir() in the
definition of commit_packed_refs().

Note that an internal consistency check that was previously done in
`write_packed_entry_fn()` is not there anymore. This is actually an
improvement:

The old error message was emitted when there is an entry in the
packed-ref cache that is not `REF_KNOWS_PEELED`, and when we attempted
to peel the reference, the result was `PEEL_INVALID`,
`PEEL_IS_SYMREF`, or `PEEL_BROKEN`. Since a packed ref cannot be a
symref, `PEEL_IS_SYMREF` and `PEEL_BROKEN` can be ruled out. So we're
left with `PEEL_INVALID`.

An entry without `REF_KNOWS_PEELED` can get into the packed-refs cache
in the following two ways:

* The reference was read from a `packed-refs` file that didn't have
  the `fully-peeled` attribute. In that case, we *don't want* to emit
  an error, because the broken value is presumably a stale value of
  the reference that is now masked by a loose version of the same
  reference (which we just don't happen to be packing this time). This
  is a perfectly legitimate situation and doesn't indicate that the
  repository is corrupt. The old code incorrectly emits an error
  message in this case. (It was probably never reported as a bug
  because this scenario is rare.)

* The reference was a loose reference that was just added to the
  packed ref cache by `files_packed_refs()` via
  `pack_if_possible_fn()` in preparation for being packed. The latter
  function refuses to pack a reference for which
  `entry_resolves_to_object()` returns false, and otherwise calls
  `peel_entry()` itself and checks the return value. So an entry added
  this way should always have `REF_KNOWS_PEELED` and shouldn't trigger
  the error message in either the old code or the new.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 38 +-
 1 file changed, 17 insertions(+), 21 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 736a6c9ff7..0ea42826c8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1293,30 +1293,15 @@ static struct ref_lock *lock_ref_sha1_basic(struct 
files_ref_store *refs,
  * Write an entry to the packed-refs file for the specified refname.
  * If peeled is non-NULL, write it as the entry's peeled value.
  */
-static void write_packed_entry(FILE *fh, char *refname, unsigned char *sha1,
-  unsigned char *peeled)
+static void write_packed_entry(FILE *fh, const char *refname,
+  const unsigned char *sha1,
+  const unsigned char *peeled)
 {
fprintf_or_die(fh, "%s %s\n", sha1_to_hex(sha1), refname);
if (peeled)
fprintf_or_die(fh, "^%s\n", sha1_to_hex(peeled));
 }
 
-/*
- * An each_ref_entry_fn that writes the entry to a packed-refs file.
- */
-static int write_packed_entry_fn(struct ref_entry *entry, void *cb_data)
-{
-   enum peel_status peel_status = peel_entry(entry, 0);
-
-   if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
-   error("internal error: %s is not a valid packed reference!",
- entry->name);
-   write_packed_entry(cb_data, entry->name, entry->u.value.oid.hash,
-  peel_status == PEEL_PEELED ?
-  entry->u.value.peeled.hash : NULL);
-   return 0;
-}
-
 /*
  * Lock the packed-refs file for writing. Flags is passed to
  * hold_lock_file_for_update(). Return 0 on success. On errors, set
@@ -1362,9 +1347,10 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
 {
struct packed_ref_cache *packed_ref_cache =
get_packed_ref_cache(refs);
-   int error = 0;
+   int ok, error = 0;
int save_errno = 0;
FILE *out;
+   struct ref_iterator *iter;
 
files_assert_main_repository(refs, "commit_packed_refs");
 
@@ -1376,8 +1362,18 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
die_errno("unable to fdopen packed-refs descriptor");
 
fprintf_or_die(out, "%s", PACKED_REFS_HEADER);
-   do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
-write_packed_entry_fn, out);
+
+   iter = cache_ref_iterator_begin(packed_ref_cache->cache, NULL, 0);
+   while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
+   struct object_id peeled;
+   int peel_error = ref_iterator_peel(iter, );
+
+   write_packed_entry(out, iter->refname, iter->oid->hash,
+  peel_error ? NULL : peeled.hash);
+   }
+
+   if (ok != ITER_DONE)
+   die("error while iterating over references");
 
if (commit_lock_file(packed_ref_cache->lock)) {
save_errno = errno;
-- 
2.11.0



[PATCH v2 17/20] cache_ref_iterator_begin(): make function smarter

2017-03-31 Thread Michael Haggerty
Change `cache_ref_iterator_begin()` to take two new arguments:

* `prefix` -- to iterate only over references with the specified
  prefix.

* `prime_dir` -- to "prime" (i.e., pre-load) the cache before starting
  the iteration.

The new functionality makes it possible for
`files_ref_iterator_begin()` to be made more ignorant of the internals
of `ref_cache`, and `find_containing_dir()` and `prime_ref_dir()` to
be made private.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 44 +---
 refs/ref-cache.c | 38 ++
 refs/ref-cache.h | 27 +--
 3 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index fefc29433a..736a6c9ff7 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1083,7 +1083,6 @@ static struct ref_iterator *files_ref_iterator_begin(
const char *prefix, unsigned int flags)
 {
struct files_ref_store *refs;
-   struct ref_dir *loose_dir, *packed_dir;
struct ref_iterator *loose_iter, *packed_iter;
struct files_ref_iterator *iter;
struct ref_iterator *ref_iterator;
@@ -1109,41 +1108,24 @@ static struct ref_iterator *files_ref_iterator_begin(
 * condition if loose refs are migrated to the packed-refs
 * file by a simultaneous process, but our in-memory view is
 * from before the migration. We ensure this as follows:
-* First, we call prime_ref_dir(), which pre-reads the loose
-* references for the subtree into the cache. (If they've
-* already been read, that's OK; we only need to guarantee
-* that they're read before the packed refs, not *how much*
-* before.) After that, we call get_packed_ref_cache(), which
-* internally checks whether the packed-ref cache is up to
-* date with what is on disk, and re-reads it if not.
+* First, we call start the loose refs iteration with its
+* `prime_ref` argument set to true. This causes the loose
+* references in the subtree to be pre-read into the cache.
+* (If they've already been read, that's OK; we only need to
+* guarantee that they're read before the packed refs, not
+* *how much* before.) After that, we call
+* get_packed_ref_cache(), which internally checks whether the
+* packed-ref cache is up to date with what is on disk, and
+* re-reads it if not.
 */
 
-   loose_dir = get_loose_ref_dir(refs);
-
-   if (prefix && *prefix)
-   loose_dir = find_containing_dir(loose_dir, prefix, 0);
-
-   if (loose_dir) {
-   prime_ref_dir(loose_dir);
-   loose_iter = cache_ref_iterator_begin(loose_dir);
-   } else {
-   /* There's nothing to iterate over. */
-   loose_iter = empty_ref_iterator_begin();
-   }
+   loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
+ prefix, 1);
 
iter->packed_ref_cache = get_packed_ref_cache(refs);
acquire_packed_ref_cache(iter->packed_ref_cache);
-   packed_dir = get_packed_ref_dir(iter->packed_ref_cache);
-
-   if (prefix && *prefix)
-   packed_dir = find_containing_dir(packed_dir, prefix, 0);
-
-   if (packed_dir) {
-   packed_iter = cache_ref_iterator_begin(packed_dir);
-   } else {
-   /* There's nothing to iterate over. */
-   packed_iter = empty_ref_iterator_begin();
-   }
+   packed_iter = cache_ref_iterator_begin(iter->packed_ref_cache->cache,
+  prefix, 0);
 
iter->iter0 = overlay_ref_iterator_begin(loose_iter, packed_iter);
iter->flags = flags;
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 38d4c31985..b3a30350d7 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -177,8 +177,17 @@ static struct ref_dir *search_for_subdir(struct ref_dir 
*dir,
return get_ref_dir(entry);
 }
 
-struct ref_dir *find_containing_dir(struct ref_dir *dir,
-   const char *refname, int mkdir)
+/*
+ * If refname is a reference name, find the ref_dir within the dir
+ * tree that should hold refname. If refname is a directory name
+ * (i.e., it ends in '/'), then return that ref_dir itself. dir must
+ * represent the top-level directory and must already be complete.
+ * Sort ref_dirs and recurse into subdirectories as necessary. If
+ * mkdir is set, then create any missing directories; otherwise,
+ * return NULL if the desired directory cannot be found.
+ */
+static struct ref_dir *find_containing_dir(struct ref_dir *dir,
+  const char *refname, int mkdir)
 {
const char *slash;
for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, 
'/')) {
@@ 

[PATCH v2 14/20] do_for_each_entry_in_dir(): eliminate `offset` argument

2017-03-31 Thread Michael Haggerty
It was never used.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c |  4 ++--
 refs/ref-cache.c |  6 +++---
 refs/ref-cache.h | 11 +--
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 7b5f5c1240..0ff5df6b46 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1390,7 +1390,7 @@ static int commit_packed_refs(struct files_ref_store 
*refs)
 
fprintf_or_die(out, "%s", PACKED_REFS_HEADER);
do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
-0, write_packed_entry_fn, out);
+write_packed_entry_fn, out);
 
if (commit_lock_file(packed_ref_cache->lock)) {
save_errno = errno;
@@ -1584,7 +1584,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
lock_packed_refs(refs, LOCK_DIE_ON_ERROR);
cbdata.packed_refs = get_packed_refs(refs);
 
-   do_for_each_entry_in_dir(get_loose_refs(refs), 0,
+   do_for_each_entry_in_dir(get_loose_refs(refs),
 pack_if_possible_fn, );
 
if (commit_packed_refs(refs))
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 0e0c13..38d4c31985 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -307,18 +307,18 @@ static void sort_ref_dir(struct ref_dir *dir)
dir->sorted = dir->nr = i;
 }
 
-int do_for_each_entry_in_dir(struct ref_dir *dir, int offset,
+int do_for_each_entry_in_dir(struct ref_dir *dir,
 each_ref_entry_fn fn, void *cb_data)
 {
int i;
assert(dir->sorted == dir->nr);
-   for (i = offset; i < dir->nr; i++) {
+   for (i = 0; i < dir->nr; i++) {
struct ref_entry *entry = dir->entries[i];
int retval;
if (entry->flag & REF_DIR) {
struct ref_dir *subdir = get_ref_dir(entry);
sort_ref_dir(subdir);
-   retval = do_for_each_entry_in_dir(subdir, 0, fn, 
cb_data);
+   retval = do_for_each_entry_in_dir(subdir, fn, cb_data);
} else {
retval = fn(entry, cb_data);
}
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index ed51e80d88..6eecdf4276 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -258,13 +258,12 @@ struct ref_iterator *cache_ref_iterator_begin(struct 
ref_dir *dir);
 typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data);
 
 /*
- * Call fn for each reference in dir that has index in the range
- * offset <= index < dir->nr.  Recurse into subdirectories that are in
- * that index range, sorting them before iterating.  This function
- * does not sort dir itself; it should be sorted beforehand.  fn is
- * called for all references, including broken ones.
+ * Call `fn` for each reference in `dir`. Recurse into subdirectories,
+ * sorting them before iterating. This function does not sort `dir`
+ * itself; it should be sorted beforehand. `fn` is called for all
+ * references, including broken ones.
  */
-int do_for_each_entry_in_dir(struct ref_dir *dir, int offset,
+int do_for_each_entry_in_dir(struct ref_dir *dir,
 each_ref_entry_fn fn, void *cb_data);
 
 /*
-- 
2.11.0



[PATCH v2 20/20] do_for_each_entry_in_dir(): delete function

2017-03-31 Thread Michael Haggerty
Its only remaining caller was itself.

Signed-off-by: Michael Haggerty 
---
 refs/ref-cache.c | 21 -
 refs/ref-cache.h | 11 ---
 2 files changed, 32 deletions(-)

diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index b3a30350d7..6059362f1d 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -316,27 +316,6 @@ static void sort_ref_dir(struct ref_dir *dir)
dir->sorted = dir->nr = i;
 }
 
-int do_for_each_entry_in_dir(struct ref_dir *dir,
-each_ref_entry_fn fn, void *cb_data)
-{
-   int i;
-   assert(dir->sorted == dir->nr);
-   for (i = 0; i < dir->nr; i++) {
-   struct ref_entry *entry = dir->entries[i];
-   int retval;
-   if (entry->flag & REF_DIR) {
-   struct ref_dir *subdir = get_ref_dir(entry);
-   sort_ref_dir(subdir);
-   retval = do_for_each_entry_in_dir(subdir, fn, cb_data);
-   } else {
-   retval = fn(entry, cb_data);
-   }
-   if (retval)
-   return retval;
-   }
-   return 0;
-}
-
 /*
  * Load all of the refs from `dir` (recursively) into our in-memory
  * cache.
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index 5e7a918ac0..ffdc54f3f0 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -251,17 +251,6 @@ struct ref_iterator *cache_ref_iterator_begin(struct 
ref_cache *cache,
  const char *prefix,
  int prime_dir);
 
-typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data);
-
-/*
- * Call `fn` for each reference in `dir`. Recurse into subdirectories,
- * sorting them before iterating. This function does not sort `dir`
- * itself; it should be sorted beforehand. `fn` is called for all
- * references, including broken ones.
- */
-int do_for_each_entry_in_dir(struct ref_dir *dir,
-each_ref_entry_fn fn, void *cb_data);
-
 /*
  * Peel the entry (if possible) and return its new peel_status.  If
  * repeel is true, re-peel the entry even if there is an old peeled
-- 
2.11.0



[PATCH v2 10/20] ref-cache: introduce a new type, ref_cache

2017-03-31 Thread Michael Haggerty
For now, it just wraps a `ref_entry *` that points at the root of the
tree. Soon it will hold more information.

Add two new functions, `create_ref_cache()` and `free_ref_cache()`.
Make `free_ref_entry()` private.

Change files-backend to use this type to hold its caches.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 28 +---
 refs/ref-cache.c | 16 +++-
 refs/ref-cache.h | 15 ++-
 3 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f07e93d522..a7d912ae39 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -44,7 +44,7 @@ static int entry_resolves_to_object(struct ref_entry *entry)
 }
 
 struct packed_ref_cache {
-   struct ref_entry *root;
+   struct ref_cache *cache;
 
/*
 * Count of references to the data structure in this instance,
@@ -79,7 +79,7 @@ struct files_ref_store {
char *gitcommondir;
char *packed_refs_path;
 
-   struct ref_entry *loose;
+   struct ref_cache *loose;
struct packed_ref_cache *packed;
 };
 
@@ -101,7 +101,7 @@ static void acquire_packed_ref_cache(struct 
packed_ref_cache *packed_refs)
 static int release_packed_ref_cache(struct packed_ref_cache *packed_refs)
 {
if (!--packed_refs->referrers) {
-   free_ref_entry(packed_refs->root);
+   free_ref_cache(packed_refs->cache);
stat_validity_clear(_refs->validity);
free(packed_refs);
return 1;
@@ -125,7 +125,7 @@ static void clear_packed_ref_cache(struct files_ref_store 
*refs)
 static void clear_loose_ref_cache(struct files_ref_store *refs)
 {
if (refs->loose) {
-   free_ref_entry(refs->loose);
+   free_ref_cache(refs->loose);
refs->loose = NULL;
}
 }
@@ -387,11 +387,12 @@ static struct packed_ref_cache 
*get_packed_ref_cache(struct files_ref_store *ref
 
refs->packed = xcalloc(1, sizeof(*refs->packed));
acquire_packed_ref_cache(refs->packed);
-   refs->packed->root = create_dir_entry(refs, "", 0, 0);
+   refs->packed->cache = create_ref_cache(refs);
+   refs->packed->cache->root->flag &= ~REF_INCOMPLETE;
f = fopen(packed_refs_file, "r");
if (f) {
stat_validity_update(>packed->validity, 
fileno(f));
-   read_packed_refs(f, get_ref_dir(refs->packed->root));
+   read_packed_refs(f, 
get_ref_dir(refs->packed->cache->root));
fclose(f);
}
}
@@ -400,7 +401,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct 
files_ref_store *ref
 
 static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache 
*packed_ref_cache)
 {
-   return get_ref_dir(packed_ref_cache->root);
+   return get_ref_dir(packed_ref_cache->cache->root);
 }
 
 static struct ref_dir *get_packed_refs(struct files_ref_store *refs)
@@ -515,14 +516,19 @@ static struct ref_dir *get_loose_refs(struct 
files_ref_store *refs)
 * are about to read the only subdirectory that can
 * hold references:
 */
-   refs->loose = create_dir_entry(refs, "", 0, 0);
+   refs->loose = create_ref_cache(refs);
+
+   /* We're going to fill the top level ourselves: */
+   refs->loose->root->flag &= ~REF_INCOMPLETE;
+
/*
-* Create an incomplete entry for "refs/":
+* Add an incomplete entry for "refs/" (to be filled
+* lazily):
 */
-   add_entry_to_dir(get_ref_dir(refs->loose),
+   add_entry_to_dir(get_ref_dir(refs->loose->root),
 create_dir_entry(refs, "refs/", 5, 1));
}
-   return get_ref_dir(refs->loose);
+   return get_ref_dir(refs->loose->root);
 }
 
 /*
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 4274a43a9b..bf911028c8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -63,9 +63,17 @@ struct ref_entry *create_ref_entry(const char *refname,
return ref;
 }
 
+struct ref_cache *create_ref_cache(struct files_ref_store *refs)
+{
+   struct ref_cache *ret = xcalloc(1, sizeof(*ret));
+
+   ret->root = create_dir_entry(refs, "", 0, 1);
+   return ret;
+}
+
 static void clear_ref_dir(struct ref_dir *dir);
 
-void free_ref_entry(struct ref_entry *entry)
+static void free_ref_entry(struct ref_entry *entry)
 {
if (entry->flag & REF_DIR) {
/*
@@ -77,6 +85,12 @@ void free_ref_entry(struct ref_entry *entry)
free(entry);
 }
 
+void free_ref_cache(struct ref_cache *cache)
+{
+   free_ref_entry(cache->root);
+   free(cache);
+}
+
 /*
  * Clear and free all entries in dir, recursively.
  */
diff --git a/refs/ref-cache.h 

[PATCH v2 11/20] refs: record the ref_store in ref_cache, not ref_dir

2017-03-31 Thread Michael Haggerty
Instead of keeping a pointer to the ref_store in every ref_dir entry,
store it once in `struct ref_cache`, and change `struct ref_dir` to
include a pointer to its containing `ref_cache` instead. This makes it
easier to add to the information that is accessible from a `ref_dir`
without increasing the size of every `ref_dir` instance.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c |  6 +++---
 refs/ref-cache.c | 12 +++-
 refs/ref-cache.h |  9 ++---
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a7d912ae39..78572e55a0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -433,7 +433,7 @@ static void add_packed_ref(struct files_ref_store *refs,
  */
 void read_loose_refs(const char *dirname, struct ref_dir *dir)
 {
-   struct files_ref_store *refs = dir->ref_store;
+   struct files_ref_store *refs = dir->cache->ref_store;
DIR *d;
struct dirent *de;
int dirnamelen = strlen(dirname);
@@ -469,7 +469,7 @@ void read_loose_refs(const char *dirname, struct ref_dir 
*dir)
} else if (S_ISDIR(st.st_mode)) {
strbuf_addch(, '/');
add_entry_to_dir(dir,
-create_dir_entry(refs, refname.buf,
+create_dir_entry(dir->cache, 
refname.buf,
  refname.len, 1));
} else {
if (!refs_resolve_ref_unsafe(>base,
@@ -526,7 +526,7 @@ static struct ref_dir *get_loose_refs(struct 
files_ref_store *refs)
 * lazily):
 */
add_entry_to_dir(get_ref_dir(refs->loose->root),
-create_dir_entry(refs, "refs/", 5, 1));
+create_dir_entry(refs->loose, "refs/", 5, 1));
}
return get_ref_dir(refs->loose->root);
 }
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index bf911028c8..96da094788 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -36,7 +36,7 @@ struct ref_dir *get_ref_dir(struct ref_entry *entry)
int pos = search_ref_dir(dir, "refs/bisect/", 12);
if (pos < 0) {
struct ref_entry *child_entry;
-   child_entry = create_dir_entry(dir->ref_store,
+   child_entry = create_dir_entry(dir->cache,
   "refs/bisect/",
   12, 1);
add_entry_to_dir(dir, child_entry);
@@ -67,7 +67,8 @@ struct ref_cache *create_ref_cache(struct files_ref_store 
*refs)
 {
struct ref_cache *ret = xcalloc(1, sizeof(*ret));
 
-   ret->root = create_dir_entry(refs, "", 0, 1);
+   ret->ref_store = refs;
+   ret->root = create_dir_entry(ret, "", 0, 1);
return ret;
 }
 
@@ -104,13 +105,14 @@ static void clear_ref_dir(struct ref_dir *dir)
dir->entries = NULL;
 }
 
-struct ref_entry *create_dir_entry(struct files_ref_store *ref_store,
+struct ref_entry *create_dir_entry(struct ref_cache *cache,
   const char *dirname, size_t len,
   int incomplete)
 {
struct ref_entry *direntry;
+
FLEX_ALLOC_MEM(direntry, name, dirname, len);
-   direntry->u.subdir.ref_store = ref_store;
+   direntry->u.subdir.cache = cache;
direntry->flag = REF_DIR | (incomplete ? REF_INCOMPLETE : 0);
return direntry;
 }
@@ -181,7 +183,7 @@ static struct ref_dir *search_for_subdir(struct ref_dir 
*dir,
 * therefore, create an empty record for it but mark
 * the record complete.
 */
-   entry = create_dir_entry(dir->ref_store, subdirname, len, 0);
+   entry = create_dir_entry(dir->cache, subdirname, len, 0);
add_entry_to_dir(dir, entry);
} else {
entry = dir->entries[entry_index];
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index da5388c136..83051854ff 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -3,6 +3,9 @@
 
 struct ref_cache {
struct ref_entry *root;
+
+   /* A pointer to the files_ref_store whose cache this is: */
+   struct files_ref_store *ref_store;
 };
 
 /*
@@ -66,8 +69,8 @@ struct ref_dir {
 */
int sorted;
 
-   /* A pointer to the files_ref_store that contains this ref_dir. */
-   struct files_ref_store *ref_store;
+   /* The ref_cache containing this entry: */
+   struct ref_cache *cache;
 
struct ref_entry **entries;
 };
@@ -161,7 +164,7 @@ struct ref_dir *get_ref_dir(struct ref_entry *entry);
  * dirname is the name of the directory with a trailing slash (e.g.,
  * "refs/heads/") 

[PATCH v2 19/20] files_pack_refs(): use reference iteration

2017-03-31 Thread Michael Haggerty
Use reference iteration rather than do_for_each_entry_in_dir() in the
definition of files_pack_refs().

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 143 +--
 1 file changed, 60 insertions(+), 83 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0ea42826c8..8420d0fc5b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -32,17 +32,6 @@ static int ref_resolves_to_object(const char *refname,
return 1;
 }
 
-/*
- * Return true if the reference described by entry can be resolved to
- * an object in the database; otherwise, emit a warning and return
- * false.
- */
-static int entry_resolves_to_object(struct ref_entry *entry)
-{
-   return ref_resolves_to_object(entry->name,
- >u.value.oid, entry->flag);
-}
-
 struct packed_ref_cache {
struct ref_cache *cache;
 
@@ -548,11 +537,6 @@ static struct ref_cache *get_loose_ref_cache(struct 
files_ref_store *refs)
return refs->loose;
 }
 
-static struct ref_dir *get_loose_ref_dir(struct files_ref_store *refs)
-{
-   return get_ref_dir(get_loose_ref_cache(refs)->root);
-}
-
 /*
  * Return the ref_entry for the given refname from the packed
  * references.  If it does not exist, return NULL.
@@ -1411,65 +1395,6 @@ struct ref_to_prune {
char name[FLEX_ARRAY];
 };
 
-struct pack_refs_cb_data {
-   unsigned int flags;
-   struct ref_dir *packed_refs;
-   struct ref_to_prune *ref_to_prune;
-};
-
-/*
- * An each_ref_entry_fn that is run over loose references only.  If
- * the loose reference can be packed, add an entry in the packed ref
- * cache.  If the reference should be pruned, also add it to
- * ref_to_prune in the pack_refs_cb_data.
- */
-static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
-{
-   struct pack_refs_cb_data *cb = cb_data;
-   enum peel_status peel_status;
-   struct ref_entry *packed_entry;
-   int is_tag_ref = starts_with(entry->name, "refs/tags/");
-
-   /* Do not pack per-worktree refs: */
-   if (ref_type(entry->name) != REF_TYPE_NORMAL)
-   return 0;
-
-   /* ALWAYS pack tags */
-   if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref)
-   return 0;
-
-   /* Do not pack symbolic or broken refs: */
-   if ((entry->flag & REF_ISSYMREF) || !entry_resolves_to_object(entry))
-   return 0;
-
-   /* Add a packed ref cache entry equivalent to the loose entry. */
-   peel_status = peel_entry(entry, 1);
-   if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
-   die("internal error peeling reference %s (%s)",
-   entry->name, oid_to_hex(>u.value.oid));
-   packed_entry = find_ref_entry(cb->packed_refs, entry->name);
-   if (packed_entry) {
-   /* Overwrite existing packed entry with info from loose entry */
-   packed_entry->flag = REF_ISPACKED | REF_KNOWS_PEELED;
-   oidcpy(_entry->u.value.oid, >u.value.oid);
-   } else {
-   packed_entry = create_ref_entry(entry->name, 
entry->u.value.oid.hash,
-   REF_ISPACKED | 
REF_KNOWS_PEELED, 0);
-   add_ref_entry(cb->packed_refs, packed_entry);
-   }
-   oidcpy(_entry->u.value.peeled, >u.value.peeled);
-
-   /* Schedule the loose reference for pruning if requested. */
-   if ((cb->flags & PACK_REFS_PRUNE)) {
-   struct ref_to_prune *n;
-   FLEX_ALLOC_STR(n, name, entry->name);
-   hashcpy(n->sha1, entry->u.value.oid.hash);
-   n->next = cb->ref_to_prune;
-   cb->ref_to_prune = n;
-   }
-   return 0;
-}
-
 enum {
REMOVE_EMPTY_PARENTS_REF = 0x01,
REMOVE_EMPTY_PARENTS_REFLOG = 0x02
@@ -1559,21 +1484,73 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB,
   "pack_refs");
-   struct pack_refs_cb_data cbdata;
-
-   memset(, 0, sizeof(cbdata));
-   cbdata.flags = flags;
+   struct ref_iterator *iter;
+   struct ref_dir *packed_refs;
+   int ok;
+   struct ref_to_prune *refs_to_prune = NULL;
 
lock_packed_refs(refs, LOCK_DIE_ON_ERROR);
-   cbdata.packed_refs = get_packed_refs(refs);
+   packed_refs = get_packed_refs(refs);
+
+   iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
+   while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
+   /*
+* If the loose reference can be packed, add an entry
+* in the packed ref cache. If the reference should be
+* pruned, also add it to refs_to_prune.
+*/
+   struct ref_entry *packed_entry;
+   

[PATCH v2 12/20] ref-cache: use a callback function to fill the cache

2017-03-31 Thread Michael Haggerty
It is a leveling violation for `ref_cache` to know about
`files_ref_store` or that it should call `read_loose_refs()` to lazily
fill cache directories. So instead, have its constructor take as an
argument a callback function that it should use for lazy-filling, and
change `files_ref_store` to supply a pointer to function
`read_loose_refs` (renamed to `loose_fill_ref_dir`) when creating the
ref cache for its loose refs.

This means that we can generify the type of the back-pointer in
`struct ref_cache` from `files_ref_store` to `ref_store`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 10 ++
 refs/ref-cache.c | 12 +++-
 refs/ref-cache.h | 29 +
 3 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 78572e55a0..e4d78393ac 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -387,7 +387,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct 
files_ref_store *ref
 
refs->packed = xcalloc(1, sizeof(*refs->packed));
acquire_packed_ref_cache(refs->packed);
-   refs->packed->cache = create_ref_cache(refs);
+   refs->packed->cache = create_ref_cache(>base, NULL);
refs->packed->cache->root->flag &= ~REF_INCOMPLETE;
f = fopen(packed_refs_file, "r");
if (f) {
@@ -431,9 +431,11 @@ static void add_packed_ref(struct files_ref_store *refs,
  * (without recursing).  dirname must end with '/'.  dir must be the
  * directory entry corresponding to dirname.
  */
-void read_loose_refs(const char *dirname, struct ref_dir *dir)
+static void loose_fill_ref_dir(struct ref_store *ref_store,
+  struct ref_dir *dir, const char *dirname)
 {
-   struct files_ref_store *refs = dir->cache->ref_store;
+   struct files_ref_store *refs =
+   files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
DIR *d;
struct dirent *de;
int dirnamelen = strlen(dirname);
@@ -516,7 +518,7 @@ static struct ref_dir *get_loose_refs(struct 
files_ref_store *refs)
 * are about to read the only subdirectory that can
 * hold references:
 */
-   refs->loose = create_ref_cache(refs);
+   refs->loose = create_ref_cache(>base, loose_fill_ref_dir);
 
/* We're going to fill the top level ourselves: */
refs->loose->root->flag &= ~REF_INCOMPLETE;
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 96da094788..7f247b9170 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -4,9 +4,6 @@
 #include "ref-cache.h"
 #include "../iterator.h"
 
-/* FIXME: This declaration shouldn't be here */
-void read_loose_refs(const char *dirname, struct ref_dir *dir);
-
 void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry)
 {
ALLOC_GROW(dir->entries, dir->nr + 1, dir->alloc);
@@ -25,7 +22,10 @@ struct ref_dir *get_ref_dir(struct ref_entry *entry)
assert(entry->flag & REF_DIR);
dir = >u.subdir;
if (entry->flag & REF_INCOMPLETE) {
-   read_loose_refs(entry->name, dir);
+   if (!dir->cache->fill_ref_dir)
+   die("BUG: incomplete ref_store without fill_ref_dir 
function");
+
+   dir->cache->fill_ref_dir(dir->cache->ref_store, dir, 
entry->name);
 
/*
 * Manually add refs/bisect, which, being
@@ -63,11 +63,13 @@ struct ref_entry *create_ref_entry(const char *refname,
return ref;
 }
 
-struct ref_cache *create_ref_cache(struct files_ref_store *refs)
+struct ref_cache *create_ref_cache(struct ref_store *refs,
+  fill_ref_dir_fn *fill_ref_dir)
 {
struct ref_cache *ret = xcalloc(1, sizeof(*ret));
 
ret->ref_store = refs;
+   ret->fill_ref_dir = fill_ref_dir;
ret->root = create_dir_entry(ret, "", 0, 1);
return ret;
 }
diff --git a/refs/ref-cache.h b/refs/ref-cache.h
index 83051854ff..ed51e80d88 100644
--- a/refs/ref-cache.h
+++ b/refs/ref-cache.h
@@ -1,11 +1,27 @@
 #ifndef REFS_REF_CACHE_H
 #define REFS_REF_CACHE_H
 
+struct ref_dir;
+
+/*
+ * If this ref_cache is filled lazily, this function is used to load
+ * information into the specified ref_dir (shallow or deep, at the
+ * option of the ref_store). dirname includes a trailing slash.
+ */
+typedef void fill_ref_dir_fn(struct ref_store *ref_store,
+struct ref_dir *dir, const char *dirname);
+
 struct ref_cache {
struct ref_entry *root;
 
-   /* A pointer to the files_ref_store whose cache this is: */
-   struct files_ref_store *ref_store;
+   /* A pointer to the ref_store whose cache this is: */
+   struct ref_store *ref_store;
+
+   /*
+* Function used (if necessary) to lazily-fill cache. May be
+* NULL.
+*/

[PATCH v2 08/20] ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`

2017-03-31 Thread Michael Haggerty
This function's visibility is about to be increased, so give it a more
distinctive name.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6768c8c86b..b4c11afadf 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -413,7 +413,7 @@ static struct ref_entry *find_ref_entry(struct ref_dir 
*dir, const char *refname
  * empty by the removal.  dir must represent the top-level directory
  * and must already be complete.
  */
-static int remove_entry(struct ref_dir *dir, const char *refname)
+static int remove_entry_from_dir(struct ref_dir *dir, const char *refname)
 {
int refname_len = strlen(refname);
int entry_index;
@@ -2338,7 +2338,7 @@ static int repack_without_refs(struct files_ref_store 
*refs,
 
/* Remove refnames from the cache */
for_each_string_list_item(refname, refnames)
-   if (remove_entry(packed, refname->string) != -1)
+   if (remove_entry_from_dir(packed, refname->string) != -1)
removed = 1;
if (!removed) {
/*
-- 
2.11.0



[PATCH v2 16/20] get_loose_ref_cache(): new function

2017-03-31 Thread Michael Haggerty
Extract a new function, `get_loose_ref_cache()`, from
get_loose_ref_dir(). The function returns the `ref_cache` for the
loose refs of a `files_ref_store`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0a16f6196c..fefc29433a 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -525,7 +525,7 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
 }
 
-static struct ref_dir *get_loose_ref_dir(struct files_ref_store *refs)
+static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs)
 {
if (!refs->loose) {
/*
@@ -545,7 +545,12 @@ static struct ref_dir *get_loose_ref_dir(struct 
files_ref_store *refs)
add_entry_to_dir(get_ref_dir(refs->loose->root),
 create_dir_entry(refs->loose, "refs/", 5, 1));
}
-   return get_ref_dir(refs->loose->root);
+   return refs->loose;
+}
+
+static struct ref_dir *get_loose_ref_dir(struct files_ref_store *refs)
+{
+   return get_ref_dir(get_loose_ref_cache(refs)->root);
 }
 
 /*
-- 
2.11.0



[PATCH v2 06/20] ref-cache: rename `add_ref()` to `add_ref_entry()`

2017-03-31 Thread Michael Haggerty
This function's visibility is about to be increased, so give it a more
distinctive name.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index cad56efb04..4d579cbdac 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -455,7 +455,7 @@ static int remove_entry(struct ref_dir *dir, const char 
*refname)
  * subdirectories as necessary.  dir must represent the top-level
  * directory.  Return 0 on success.
  */
-static int add_ref(struct ref_dir *dir, struct ref_entry *ref)
+static int add_ref_entry(struct ref_dir *dir, struct ref_entry *ref)
 {
dir = find_containing_dir(dir, ref->name, 1);
if (!dir)
@@ -994,7 +994,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
if (peeled == PEELED_FULLY ||
(peeled == PEELED_TAGS && starts_with(refname, 
"refs/tags/")))
last->flag |= REF_KNOWS_PEELED;
-   add_ref(dir, last);
+   add_ref_entry(dir, last);
continue;
}
if (last &&
@@ -1116,7 +1116,7 @@ static void add_packed_ref(struct files_ref_store *refs,
 
if (!packed_ref_cache->lock)
die("internal error: packed refs not locked");
-   add_ref(get_packed_ref_dir(packed_ref_cache),
+   add_ref_entry(get_packed_ref_dir(packed_ref_cache),
create_ref_entry(refname, sha1, REF_ISPACKED, 1));
 }
 
@@ -2179,7 +2179,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, 
void *cb_data)
} else {
packed_entry = create_ref_entry(entry->name, 
entry->u.value.oid.hash,
REF_ISPACKED | 
REF_KNOWS_PEELED, 0);
-   add_ref(cb->packed_refs, packed_entry);
+   add_ref_entry(cb->packed_refs, packed_entry);
}
oidcpy(_entry->u.value.peeled, >u.value.peeled);
 
-- 
2.11.0



[PATCH v2 00/20] Separate `ref_cache` into a separate module

2017-03-31 Thread Michael Haggerty
This is v2 of this patch series. Thanks to Peff, Junio, Stefan and
Ævar for their comments about v1 [1].

This version literally only contains a few commit message changes and
one minor comment changes relative to v1. The code is identical. I
wasn't sure whether it is even worth sending this patch series to the
ML again; Junio, if you'd prefer I just send a link to a published
branch in such cases, please let me know.

* I picked up the "area:" prefixes that Junio added to a couple of
  commit subject lines.

* I clarified the justification for keeping a pointer to the
  `ref_store` in `ref_dir`.

* I added a long blurb about the removal of an internal consistency
  check in the commit message for

  [18/20] commit_packed_refs(): use reference iteration

* I changed "loose refs cache" to "loose ref cache" in a comment in

  [19/20] files_pack_refs(): use reference iteration

This patch series is also available from my GitHub fork [2] as branch
"separate-ref-cache". These patches depend on Duy's
nd/files-backend-git-dir branch.

[1] http://public-inbox.org/git/cover.1490026594.git.mhag...@alum.mit.edu/
[2] https://github.com/mhagger/git

Michael Haggerty (20):
  get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
  refs_read_raw_ref(): new function
  refs_ref_iterator_begin(): new function
  refs_verify_refname_available(): implement once for all backends
  refs_verify_refname_available(): use function in more places
  ref-cache: rename `add_ref()` to `add_ref_entry()`
  ref-cache: rename `find_ref()` to `find_ref_entry()`
  ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`
  refs: split `ref_cache` code into separate files
  ref-cache: introduce a new type, ref_cache
  refs: record the ref_store in ref_cache, not ref_dir
  ref-cache: use a callback function to fill the cache
  refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
  do_for_each_entry_in_dir(): eliminate `offset` argument
  get_loose_ref_dir(): function renamed from get_loose_refs()
  get_loose_ref_cache(): new function
  cache_ref_iterator_begin(): make function smarter
  commit_packed_refs(): use reference iteration
  files_pack_refs(): use reference iteration
  do_for_each_entry_in_dir(): delete function

 Makefile |1 +
 refs.c   |  111 -
 refs.h   |2 +-
 refs/files-backend.c | 1229 +++---
 refs/ref-cache.c |  523 +
 refs/ref-cache.h |  267 +++
 refs/refs-internal.h |   22 +-
 7 files changed, 1066 insertions(+), 1089 deletions(-)
 create mode 100644 refs/ref-cache.c
 create mode 100644 refs/ref-cache.h

-- 
2.11.0



[PATCH v2 04/20] refs_verify_refname_available(): implement once for all backends

2017-03-31 Thread Michael Haggerty
It turns out that we can now implement
`refs_verify_refname_available()` based on the other virtual
functions, so there is no need for it to be defined at the backend
level. Instead, define it once in `refs.c` and remove the
`files_backend` definition.

Signed-off-by: Michael Haggerty 
---
 refs.c   | 85 ++--
 refs.h   |  2 +-
 refs/files-backend.c | 39 +---
 refs/refs-internal.h |  7 -
 4 files changed, 92 insertions(+), 41 deletions(-)

diff --git a/refs.c b/refs.c
index aeb704ab92..2cf531b9c7 100644
--- a/refs.c
+++ b/refs.c
@@ -5,6 +5,7 @@
 #include "cache.h"
 #include "hashmap.h"
 #include "lockfile.h"
+#include "iterator.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
 #include "object.h"
@@ -1661,11 +1662,91 @@ int ref_transaction_commit(struct ref_transaction 
*transaction,
 
 int refs_verify_refname_available(struct ref_store *refs,
  const char *refname,
- const struct string_list *extra,
+ const struct string_list *extras,
  const struct string_list *skip,
  struct strbuf *err)
 {
-   return refs->be->verify_refname_available(refs, refname, extra, skip, 
err);
+   const char *slash;
+   const char *extra_refname;
+   struct strbuf dirname = STRBUF_INIT;
+   struct strbuf referent = STRBUF_INIT;
+   struct object_id oid;
+   unsigned int type;
+   struct ref_iterator *iter;
+   int ok;
+   int ret = -1;
+
+   /*
+* For the sake of comments in this function, suppose that
+* refname is "refs/foo/bar".
+*/
+
+   assert(err);
+
+   strbuf_grow(, strlen(refname) + 1);
+   for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, 
'/')) {
+   /* Expand dirname to the new prefix, not including the trailing 
slash: */
+   strbuf_add(, refname + dirname.len, slash - refname - 
dirname.len);
+
+   /*
+* We are still at a leading dir of the refname (e.g.,
+* "refs/foo"; if there is a reference with that name,
+* it is a conflict, *unless* it is in skip.
+*/
+   if (skip && string_list_has_string(skip, dirname.buf))
+   continue;
+
+   if (!refs_read_raw_ref(refs, dirname.buf, oid.hash, , 
)) {
+   strbuf_addf(err, "'%s' exists; cannot create '%s'",
+   dirname.buf, refname);
+   goto cleanup;
+   }
+
+   if (extras && string_list_has_string(extras, dirname.buf)) {
+   strbuf_addf(err, "cannot process '%s' and '%s' at the 
same time",
+   refname, dirname.buf);
+   goto cleanup;
+   }
+   }
+
+   /*
+* We are at the leaf of our refname (e.g., "refs/foo/bar").
+* There is no point in searching for a reference with that
+* name, because a refname isn't considered to conflict with
+* itself. But we still need to check for references whose
+* names are in the "refs/foo/bar/" namespace, because they
+* *do* conflict.
+*/
+   strbuf_addstr(, refname + dirname.len);
+   strbuf_addch(, '/');
+
+   iter = refs_ref_iterator_begin(refs, dirname.buf, 0,
+  DO_FOR_EACH_INCLUDE_BROKEN);
+   while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
+   if (skip &&
+   string_list_has_string(skip, iter->refname))
+   continue;
+
+   strbuf_addf(err, "'%s' exists; cannot create '%s'",
+   iter->refname, refname);
+   ok = ref_iterator_abort(iter);
+   goto cleanup;
+   }
+
+   if (ok != ITER_DONE)
+   die("BUG: error while iterating over references");
+
+   extra_refname = find_descendant_ref(dirname.buf, extras, skip);
+   if (extra_refname)
+   strbuf_addf(err, "cannot process '%s' and '%s' at the same 
time",
+   refname, extra_refname);
+   else
+   ret = 0;
+
+cleanup:
+   strbuf_release();
+   strbuf_release();
+   return ret;
 }
 
 int refs_for_each_reflog(struct ref_store *refs, each_ref_fn fn, void *cb_data)
diff --git a/refs.h b/refs.h
index 49e97d7d5f..07cf4cd41b 100644
--- a/refs.h
+++ b/refs.h
@@ -97,7 +97,7 @@ int read_ref(const char *refname, unsigned char *sha1);
 
 int refs_verify_refname_available(struct ref_store *refs,
  const char *refname,
- const struct string_list *extra,
+ const struct string_list *extras,
   

[PATCH v2 09/20] refs: split `ref_cache` code into separate files

2017-03-31 Thread Michael Haggerty
The `ref_cache` code is currently too tightly coupled to
`files-backend`, making the code harder to understand and making it
awkward for new code to use `ref_cache` (as we indeed have planned).
Start loosening that coupling by splitting `ref_cache` into a separate
module.

This commit moves code, adds declarations, and changes the visibility
of some functions, but doesn't change any code.

The modules are still too tightly coupled, but the situation will be
improved in subsequent commits.

Signed-off-by: Michael Haggerty 
---
 Makefile |   1 +
 refs/files-backend.c | 736 +--
 refs/ref-cache.c | 512 +++
 refs/ref-cache.h | 251 ++
 4 files changed, 767 insertions(+), 733 deletions(-)
 create mode 100644 refs/ref-cache.c
 create mode 100644 refs/ref-cache.h

diff --git a/Makefile b/Makefile
index 5f3844e33e..2f30580cde 100644
--- a/Makefile
+++ b/Makefile
@@ -807,6 +807,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/files-backend.o
 LIB_OBJS += refs/iterator.o
+LIB_OBJS += refs/ref-cache.o
 LIB_OBJS += ref-filter.o
 LIB_OBJS += remote.o
 LIB_OBJS += replace_object.o
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b4c11afadf..f07e93d522 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1,6 +1,7 @@
 #include "../cache.h"
 #include "../refs.h"
 #include "refs-internal.h"
+#include "ref-cache.h"
 #include "../iterator.h"
 #include "../dir-iterator.h"
 #include "../lockfile.h"
@@ -13,509 +14,6 @@ struct ref_lock {
struct object_id old_oid;
 };
 
-struct ref_entry;
-
-/*
- * Information used (along with the information in ref_entry) to
- * describe a single cached reference.  This data structure only
- * occurs embedded in a union in struct ref_entry, and only when
- * (ref_entry->flag & REF_DIR) is zero.
- */
-struct ref_value {
-   /*
-* The name of the object to which this reference resolves
-* (which may be a tag object).  If REF_ISBROKEN, this is
-* null.  If REF_ISSYMREF, then this is the name of the object
-* referred to by the last reference in the symlink chain.
-*/
-   struct object_id oid;
-
-   /*
-* If REF_KNOWS_PEELED, then this field holds the peeled value
-* of this reference, or null if the reference is known not to
-* be peelable.  See the documentation for peel_ref() for an
-* exact definition of "peelable".
-*/
-   struct object_id peeled;
-};
-
-struct files_ref_store;
-
-/*
- * Information used (along with the information in ref_entry) to
- * describe a level in the hierarchy of references.  This data
- * structure only occurs embedded in a union in struct ref_entry, and
- * only when (ref_entry.flag & REF_DIR) is set.  In that case,
- * (ref_entry.flag & REF_INCOMPLETE) determines whether the references
- * in the directory have already been read:
- *
- * (ref_entry.flag & REF_INCOMPLETE) unset -- a directory of loose
- * or packed references, already read.
- *
- * (ref_entry.flag & REF_INCOMPLETE) set -- a directory of loose
- * references that hasn't been read yet (nor has any of its
- * subdirectories).
- *
- * Entries within a directory are stored within a growable array of
- * pointers to ref_entries (entries, nr, alloc).  Entries 0 <= i <
- * sorted are sorted by their component name in strcmp() order and the
- * remaining entries are unsorted.
- *
- * Loose references are read lazily, one directory at a time.  When a
- * directory of loose references is read, then all of the references
- * in that directory are stored, and REF_INCOMPLETE stubs are created
- * for any subdirectories, but the subdirectories themselves are not
- * read.  The reading is triggered by get_ref_dir().
- */
-struct ref_dir {
-   int nr, alloc;
-
-   /*
-* Entries with index 0 <= i < sorted are sorted by name.  New
-* entries are appended to the list unsorted, and are sorted
-* only when required; thus we avoid the need to sort the list
-* after the addition of every reference.
-*/
-   int sorted;
-
-   /* A pointer to the files_ref_store that contains this ref_dir. */
-   struct files_ref_store *ref_store;
-
-   struct ref_entry **entries;
-};
-
-/*
- * Bit values for ref_entry::flag.  REF_ISSYMREF=0x01,
- * REF_ISPACKED=0x02, REF_ISBROKEN=0x04 and REF_BAD_NAME=0x08 are
- * public values; see refs.h.
- */
-
-/*
- * The field ref_entry->u.value.peeled of this value entry contains
- * the correct peeled value for the reference, which might be
- * null_sha1 if the reference is not a tag or if it is broken.
- */
-#define REF_KNOWS_PEELED 0x10
-
-/* ref_entry represents a directory of references */
-#define REF_DIR 0x20
-
-/*
- * Entry has not yet been read from disk (used only for REF_DIR
- * entries representing loose references)

[PATCH v2 15/20] get_loose_ref_dir(): function renamed from get_loose_refs()

2017-03-31 Thread Michael Haggerty
The new name is more analogous to `get_packed_ref_dir()`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0ff5df6b46..0a16f6196c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -525,7 +525,7 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
}
 }
 
-static struct ref_dir *get_loose_refs(struct files_ref_store *refs)
+static struct ref_dir *get_loose_ref_dir(struct files_ref_store *refs)
 {
if (!refs->loose) {
/*
@@ -1113,7 +1113,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 * date with what is on disk, and re-reads it if not.
 */
 
-   loose_dir = get_loose_refs(refs);
+   loose_dir = get_loose_ref_dir(refs);
 
if (prefix && *prefix)
loose_dir = find_containing_dir(loose_dir, prefix, 0);
@@ -1584,7 +1584,7 @@ static int files_pack_refs(struct ref_store *ref_store, 
unsigned int flags)
lock_packed_refs(refs, LOCK_DIE_ON_ERROR);
cbdata.packed_refs = get_packed_refs(refs);
 
-   do_for_each_entry_in_dir(get_loose_refs(refs),
+   do_for_each_entry_in_dir(get_loose_ref_dir(refs),
 pack_if_possible_fn, );
 
if (commit_packed_refs(refs))
-- 
2.11.0



[PATCH v2 02/20] refs_read_raw_ref(): new function

2017-03-31 Thread Michael Haggerty
Extract a new function from `refs_resolve_ref_unsafe()`. It will be
useful elsewhere.

Signed-off-by: Michael Haggerty 
---
 refs.c   | 11 +--
 refs/refs-internal.h |  4 
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 77a39f8b17..0ed6c3c7a4 100644
--- a/refs.c
+++ b/refs.c
@@ -1326,6 +1326,13 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
return refs_for_each_rawref(get_main_ref_store(), fn, cb_data);
 }
 
+int refs_read_raw_ref(struct ref_store *ref_store,
+ const char *refname, unsigned char *sha1,
+ struct strbuf *referent, unsigned int *type)
+{
+   return ref_store->be->read_raw_ref(ref_store, refname, sha1, referent, 
type);
+}
+
 /* This function needs to return a meaningful errno on failure */
 const char *refs_resolve_ref_unsafe(struct ref_store *refs,
const char *refname,
@@ -1362,8 +1369,8 @@ const char *refs_resolve_ref_unsafe(struct ref_store 
*refs,
for (symref_count = 0; symref_count < SYMREF_MAXDEPTH; symref_count++) {
unsigned int read_flags = 0;
 
-   if (refs->be->read_raw_ref(refs, refname,
-  sha1, _refname, _flags)) {
+   if (refs_read_raw_ref(refs, refname,
+ sha1, _refname, _flags)) {
*flags |= read_flags;
if (errno != ENOENT || (resolve_flags & 
RESOLVE_REF_READING))
return NULL;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 690498698e..6ee9f20dbc 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -165,6 +165,10 @@ struct ref_update {
const char refname[FLEX_ARRAY];
 };
 
+int refs_read_raw_ref(struct ref_store *ref_store,
+ const char *refname, unsigned char *sha1,
+ struct strbuf *referent, unsigned int *type);
+
 /*
  * Add a ref_update with the specified properties to transaction, and
  * return a pointer to the new object. This function does not verify
-- 
2.11.0



[PATCH v2 07/20] ref-cache: rename `find_ref()` to `find_ref_entry()`

2017-03-31 Thread Michael Haggerty
This function's visibility is about to be increased, so give it a more
distinctive name.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4d579cbdac..6768c8c86b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -385,7 +385,7 @@ static struct ref_dir *find_containing_dir(struct ref_dir 
*dir,
  * and recursing into subdirectories as necessary.  If the name is not
  * found or it corresponds to a directory entry, return NULL.
  */
-static struct ref_entry *find_ref(struct ref_dir *dir, const char *refname)
+static struct ref_entry *find_ref_entry(struct ref_dir *dir, const char 
*refname)
 {
int entry_index;
struct ref_entry *entry;
@@ -1227,7 +1227,7 @@ static struct ref_dir *get_loose_refs(struct 
files_ref_store *refs)
 static struct ref_entry *get_packed_ref(struct files_ref_store *refs,
const char *refname)
 {
-   return find_ref(get_packed_refs(refs), refname);
+   return find_ref_entry(get_packed_refs(refs), refname);
 }
 
 /*
@@ -2171,7 +2171,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, 
void *cb_data)
if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
die("internal error peeling reference %s (%s)",
entry->name, oid_to_hex(>u.value.oid));
-   packed_entry = find_ref(cb->packed_refs, entry->name);
+   packed_entry = find_ref_entry(cb->packed_refs, entry->name);
if (packed_entry) {
/* Overwrite existing packed entry with info from loose entry */
packed_entry->flag = REF_ISPACKED | REF_KNOWS_PEELED;
-- 
2.11.0



[PATCH v2 05/20] refs_verify_refname_available(): use function in more places

2017-03-31 Thread Michael Haggerty
Change `lock_raw_ref()` and `lock_ref_sha1_basic()` to use
`refs_verify_refname_available()` instead of
`verify_refname_available_dir()`. This means that those callsites now
check for conflicts with all references rather than just packed refs,
but the performance cost shouldn't be significant (and will be
regained later).

These were the last callers of `verify_refname_available_dir()`, so
also delete that (very complicated) function.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 171 ---
 1 file changed, 11 insertions(+), 160 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 447f2767f3..cad56efb04 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -738,152 +738,6 @@ static struct ref_iterator 
*cache_ref_iterator_begin(struct ref_dir *dir)
return ref_iterator;
 }
 
-struct nonmatching_ref_data {
-   const struct string_list *skip;
-   const char *conflicting_refname;
-};
-
-static int nonmatching_ref_fn(struct ref_entry *entry, void *vdata)
-{
-   struct nonmatching_ref_data *data = vdata;
-
-   if (data->skip && string_list_has_string(data->skip, entry->name))
-   return 0;
-
-   data->conflicting_refname = entry->name;
-   return 1;
-}
-
-/*
- * Return 0 if a reference named refname could be created without
- * conflicting with the name of an existing reference in dir.
- * See verify_refname_available for more information.
- */
-static int verify_refname_available_dir(const char *refname,
-   const struct string_list *extras,
-   const struct string_list *skip,
-   struct ref_dir *dir,
-   struct strbuf *err)
-{
-   const char *slash;
-   const char *extra_refname;
-   int pos;
-   struct strbuf dirname = STRBUF_INIT;
-   int ret = -1;
-
-   /*
-* For the sake of comments in this function, suppose that
-* refname is "refs/foo/bar".
-*/
-
-   assert(err);
-
-   strbuf_grow(, strlen(refname) + 1);
-   for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, 
'/')) {
-   /* Expand dirname to the new prefix, not including the trailing 
slash: */
-   strbuf_add(, refname + dirname.len, slash - refname - 
dirname.len);
-
-   /*
-* We are still at a leading dir of the refname (e.g.,
-* "refs/foo"; if there is a reference with that name,
-* it is a conflict, *unless* it is in skip.
-*/
-   if (dir) {
-   pos = search_ref_dir(dir, dirname.buf, dirname.len);
-   if (pos >= 0 &&
-   (!skip || !string_list_has_string(skip, 
dirname.buf))) {
-   /*
-* We found a reference whose name is
-* a proper prefix of refname; e.g.,
-* "refs/foo", and is not in skip.
-*/
-   strbuf_addf(err, "'%s' exists; cannot create 
'%s'",
-   dirname.buf, refname);
-   goto cleanup;
-   }
-   }
-
-   if (extras && string_list_has_string(extras, dirname.buf) &&
-   (!skip || !string_list_has_string(skip, dirname.buf))) {
-   strbuf_addf(err, "cannot process '%s' and '%s' at the 
same time",
-   refname, dirname.buf);
-   goto cleanup;
-   }
-
-   /*
-* Otherwise, we can try to continue our search with
-* the next component. So try to look up the
-* directory, e.g., "refs/foo/". If we come up empty,
-* we know there is nothing under this whole prefix,
-* but even in that case we still have to continue the
-* search for conflicts with extras.
-*/
-   strbuf_addch(, '/');
-   if (dir) {
-   pos = search_ref_dir(dir, dirname.buf, dirname.len);
-   if (pos < 0) {
-   /*
-* There was no directory "refs/foo/",
-* so there is nothing under this
-* whole prefix. So there is no need
-* to continue looking for conflicting
-* references. But we need to continue
-* looking for conflicting extras.
-*/
-   dir = NULL;
-   } else {
-  

[PATCH v2 01/20] get_ref_dir(): don't call read_loose_refs() for "refs/bisect"

2017-03-31 Thread Michael Haggerty
Since references under "refs/bisect/" are per-worktree, they have to
be sought in the worktree rather than in the main repository. But
since loose references are found by traversing directories, the
reference iterator won't even get the idea to look for a
"refs/bisect/" directory in the worktree if there is not a directory
with that name in the main repository. Thus `get_ref_dir()` manually
inserts a dir_entry for "refs/bisect/" whenever it reads the entry for
"refs/".

The current code then immediately calls `read_loose_refs()` on that
directory. But since the dir_entry is created with its `incomplete`
flag set, any traversal that gets to this point will read the
directory automatically. So there is no need to call
`read_loose_refs()` explicitly; the lazy mechanism suffices.

And in fact, the attempt to `read_loose_refs()` was broken anyway.
That function needs its `dirname` argument to have a trailing `/`
character, but the invocation here was passing it "refs/bisect"
without a trailing slash. So `read_loose_refs()` would read
`$GIT_DIR/refs/bisect" correctly, but if it found an entry "foo" in
that directory, it would try to read "$GIT_DIR/refs/bisectfoo".
Normally it wouldn't find anything at that path, but the failure was
canceled out because `get_ref_dir()` *also* forgot to reset the
`REF_INCOMPLETE` bit on the dir_entry. So the read was attempted again
when it was accessed, via the lazy mechanism, and this time the read
was done correctly.

This code has been broken since it was first introduced.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4242486118..e7a075e864 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -191,8 +191,6 @@ static struct ref_dir *get_ref_dir(struct ref_entry *entry)
   "refs/bisect/",
   12, 1);
add_entry_to_dir(dir, child_entry);
-   read_loose_refs("refs/bisect",
-   _entry->u.subdir);
}
}
entry->flag &= ~REF_INCOMPLETE;
-- 
2.11.0



[PATCH v2 03/20] refs_ref_iterator_begin(): new function

2017-03-31 Thread Michael Haggerty
Extract a new function from `do_for_each_ref()`. It will be useful
elsewhere.

Signed-off-by: Michael Haggerty 
---
 refs.c   | 15 +--
 refs/refs-internal.h | 11 +++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 0ed6c3c7a4..aeb704ab92 100644
--- a/refs.c
+++ b/refs.c
@@ -1230,6 +1230,18 @@ int head_ref(each_ref_fn fn, void *cb_data)
return head_ref_submodule(NULL, fn, cb_data);
 }
 
+struct ref_iterator *refs_ref_iterator_begin(
+   struct ref_store *refs,
+   const char *prefix, int trim, int flags)
+{
+   struct ref_iterator *iter;
+
+   iter = refs->be->iterator_begin(refs, prefix, flags);
+   iter = prefix_ref_iterator_begin(iter, prefix, trim);
+
+   return iter;
+}
+
 /*
  * Call fn for each reference in the specified submodule for which the
  * refname begins with prefix. If trim is non-zero, then trim that
@@ -1247,8 +1259,7 @@ static int do_for_each_ref(struct ref_store *refs, const 
char *prefix,
if (!refs)
return 0;
 
-   iter = refs->be->iterator_begin(refs, prefix, flags);
-   iter = prefix_ref_iterator_begin(iter, prefix, trim);
+   iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
 
return do_for_each_ref_iterator(iter, fn, cb_data);
 }
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 6ee9f20dbc..545989ae7f 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -335,6 +335,17 @@ struct ref_iterator *empty_ref_iterator_begin(void);
  */
 int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
 
+/*
+ * Return an iterator that goes over each reference in `refs` for
+ * which the refname begins with prefix. If trim is non-zero, then
+ * trim that many characters off the beginning of each refname. flags
+ * can be DO_FOR_EACH_INCLUDE_BROKEN to include broken references in
+ * the iteration.
+ */
+struct ref_iterator *refs_ref_iterator_begin(
+   struct ref_store *refs,
+   const char *prefix, int trim, int flags);
+
 /*
  * A callback function used to instruct merge_ref_iterator how to
  * interleave the entries from iter0 and iter1. The function should
-- 
2.11.0



[PATCH] status: show in-progress info for short status

2017-03-31 Thread Michael J Gruber
Ordinary (long) status shows information about bisect, revert, am,
rebase, cherry-pick in progress, and so does git-prompt.sh. status
--short currently shows none of this information.

Introduce an `--inprogress` argument to git status so that, when used with
`--short --branch`, in-progress information is shown next to the branch
information. Just like `--branch`, this comes with a config option.

The wording for the in-progress information is taken over from
git-prompt.sh.

Signed-off-by: Michael J Gruber 
---
When used with --porcelain, this gives an easy way to amend the 
powerline-gitstatus
prompt for example (in use locally here), and certainly others.

I don't touch --porcelain=v2 - that one reads in-progress state but does not 
seem
to display it. I don't know which parts are supposed to be stable for v2. 

 Documentation/config.txt |  4 +++
 Documentation/git-status.txt |  5 
 builtin/commit.c | 13 ++
 t/t7512-status-help.sh   | 58 
 wt-status.c  | 32 +---
 wt-status.h  |  1 +
 6 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..3adc65f9d3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2957,6 +2957,10 @@ status.branch::
Set to true to enable --branch by default in linkgit:git-status[1].
The option --no-branch takes precedence over this variable.
 
+status.inprogress::
+   Set to true to enable --inprogress by default in linkgit:git-status[1].
+   The option --no-inprogress takes precedence over this variable.
+
 status.displayCommentPrefix::
If set to true, linkgit:git-status[1] will insert a comment
prefix before each output line (starting with
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index ba873657cf..4fac073247 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -32,6 +32,11 @@ OPTIONS
 --branch::
Show the branch and tracking info even in short-format.
 
+-p::
+--inprogress::
+   When showing branch and tracking info in short-format,
+   show in-progress information (BISECTING, MERGING etc.), too.
+
 --porcelain[=]::
Give the output in an easy-to-parse format for scripts.
This is similar to the short output, but will remain stable
diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc513..6176dd2c2f 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1105,8 +1105,10 @@ static const char *read_commit_message(const char *name)
 static struct status_deferred_config {
enum wt_status_format status_format;
int show_branch;
+   int show_inprogress;
 } status_deferred_config = {
STATUS_FORMAT_UNSPECIFIED,
+   -1, /* unspecified */
-1 /* unspecified */
 };
 
@@ -1133,6 +1135,10 @@ static void finalize_deferred_config(struct wt_status *s)
s->show_branch = status_deferred_config.show_branch;
if (s->show_branch < 0)
s->show_branch = 0;
+   if (use_deferred_config && s->show_inprogress < 0)
+   s->show_inprogress = status_deferred_config.show_inprogress;
+   if (s->show_inprogress < 0)
+   s->show_inprogress = 0;
 }
 
 static int parse_and_validate_options(int argc, const char *argv[],
@@ -1291,6 +1297,10 @@ static int git_status_config(const char *k, const char 
*v, void *cb)
status_deferred_config.show_branch = git_config_bool(k, v);
return 0;
}
+   if (!strcmp(k, "status.inprogress")) {
+   status_deferred_config.show_inprogress = git_config_bool(k, v);
+   return 0;
+   }
if (!strcmp(k, "status.color") || !strcmp(k, "color.status")) {
s->use_color = git_config_colorbool(k, v);
return 0;
@@ -1339,6 +1349,8 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
N_("show status concisely"), STATUS_FORMAT_SHORT),
OPT_BOOL('b', "branch", _branch,
 N_("show branch information")),
+   OPT_BOOL('p', "inprogress", _inprogress,
+N_("show in-progress information")),
{ OPTION_CALLBACK, 0, "porcelain", _format,
  N_("version"), N_("machine-readable output"),
  PARSE_OPT_OPTARG, opt_parse_porcelain },
@@ -1612,6 +1624,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
OPT_SET_INT(0, "short", _format, N_("show status 
concisely"),
STATUS_FORMAT_SHORT),
OPT_BOOL(0, "branch", _branch, N_("show branch 
information")),
+   OPT_BOOL(0, "inprogress", _inprogress, N_("show 
in-progress information")),
OPT_SET_INT(0, "porcelain", _format,
   

[PATCH v3 3/4] name-rev: provide debug output

2017-03-31 Thread Michael J Gruber
Currently, `git describe --contains --debug` does not create any debug
output because it does not pass the flag down to `git name-rev`, which
does not know that flag.

Teach the latter that flag, so that the former can pass it down (in
the following commit).

The output is patterned after that of `git describe --debug`, with the
following differences:

describe loops over all args to describe, then over all possible
descriptions; name-rev does it the other way round. Therefore, we need
to amend each possible description by the arg that it is for (and we
leave out the "searching to describe" header).

The date cut-off for name-rev kicks in way more often than the candidate
number cut-off of describe, so we do not clutter the output with the
cut-off.

Signed-off-by: Michael J Gruber 
---
 Documentation/git-name-rev.txt |  5 
 builtin/name-rev.c | 64 +-
 2 files changed, 56 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-name-rev.txt b/Documentation/git-name-rev.txt
index e8e68f528c..fd78ee86e8 100644
--- a/Documentation/git-name-rev.txt
+++ b/Documentation/git-name-rev.txt
@@ -42,6 +42,11 @@ OPTIONS
 --all::
List all commits reachable from all refs
 
+--debug::
+   Verbosely display information about the searching strategy
+   being employed to standard error.  The symbolic name will still
+   be printed to standard out.
+
 --stdin::
Transform stdin by substituting all the 40-character SHA-1
hexes (say $hex) with "$hex ($rev_name)".  When used with
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index bf7ed015ae..51302a79ba 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -18,6 +18,10 @@ typedef struct rev_name {
 
 static long cutoff = LONG_MAX;
 
+static const char *prio_names[] = {
+   N_("head"), N_("lightweight"), N_("annotated"),
+};
+
 /* How many generations are maximally preferred over _one_ merge traversal? */
 #define MERGE_TRAVERSAL_WEIGHT 65535
 
@@ -59,10 +63,19 @@ static int is_better_name(struct rev_name *name,
return 0;
 }
 
+struct name_ref_data {
+   int tags_only;
+   int name_only;
+   int debug;
+   struct string_list ref_filters;
+   struct string_list exclude_filters;
+   struct object_array *revs;
+};
+
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
int generation, int distance, int from_tag,
-   int deref)
+   int deref, struct name_ref_data *data)
 {
struct rev_name *name = (struct rev_name *)commit->util;
struct commit_list *parents;
@@ -75,6 +88,7 @@ static void name_rev(struct commit *commit,
 
if (deref) {
tip_name = xstrfmt("%s^0", tip_name);
+   from_tag += 1;
 
if (generation)
die("generation: %d, but deref?", generation);
@@ -87,6 +101,36 @@ static void name_rev(struct commit *commit,
} else if (is_better_name(name, tip_name, taggerdate,
  generation, distance, from_tag)) {
 copy_data:
+   if (data->debug) {
+   int i;
+   static int label_width = -1;
+   static int name_width = -1;
+   if (label_width < 0) {
+   int w;
+   for (i = 0; i < ARRAY_SIZE(prio_names); i++) {
+   w = strlen(_(prio_names[i]));
+   if (label_width < w)
+   label_width = w;
+   }
+   }
+   if (name_width < 0) {
+   int w;
+   for (i = 0; i < data->revs->nr; i++) {
+   w = strlen(data->revs->objects[i].name);
+   if (name_width < w)
+   name_width = w;
+   }
+   }
+   for (i = 0; i < data->revs->nr; i++)
+   if (!oidcmp(>object.oid,
+   >revs->objects[i].item->oid))
+   fprintf(stderr, " %-*s %8d %-*s 
%s~%d\n",
+   label_width,
+   _(prio_names[from_tag]),
+   distance, name_width,
+   data->revs->objects[i].name,
+   tip_name, generation);
+   }
name->tip_name = tip_name;
name->taggerdate = taggerdate;
name->generation = generation;
@@ -112,11 +156,11 @@ 

[PATCH v3 4/4] describe: pass --debug down to name-rev

2017-03-31 Thread Michael J Gruber
Now that name-rev knows --debug, pass that flag down to name-rev when we
call it for doing describe --contains.

Signed-off-by: Michael J Gruber 
---
 builtin/describe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/describe.c b/builtin/describe.c
index a5cd8c513f..30196793f0 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -486,6 +486,8 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
 NULL);
if (always)
argv_array_push(, "--always");
+   if (debug)
+   argv_array_push(, "--debug");
if (!all) {
argv_array_push(, "--tags");
for_each_string_list_item(item, )
-- 
2.12.2.739.gfc3eb97820



[PATCH v3 2/4] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-31 Thread Michael J Gruber
From: Junio C Hamano 

"git name-rev" assigned a phony "far in the future" date to tips of
refs that are not pointing at tag objects, and favored names based
on a ref with the oldest date.  This made it almost impossible for
an unannotated tags and branches to be counted as a viable base,
which was especially problematic when the command is run with the
"--tags" option.  If an unannotated tag that points at an ancient
commit and an annotated tag that points at a much newer commit
reaches the commit that is being named, the old unannotated tag was
ignored.

Update the "taggerdate" field of the rev-name structure, which is
initialized from the tip of ref, to have the committer date if the
object at the tip of ref is a commit, not a tag, so that we can
optionally take it into account when doing "is this name better?"
comparison logic.

When "name-rev" is run without the "--tags" option, the general
expectation is still to name the commit based on a tag if possible,
but use non-tag refs as fallback, and tiebreak among these non-tag
refs by favoring names with shorter hops from the tip.  The use of a
phony "far in the future" date in the original code was an effective
way to ensure this expectation is held: a non-tag tip gets the same
"far in the future" timestamp, giving precedence to tags, and among
non-tag tips, names with shorter hops are preferred over longer
hops, without taking the "taggerdate" into account.  As we are
taking over the "taggerdate" field to store the committer date for
tips with commits:

 (1) keep the original logic when comparing names based on two refs
 both of which are from refs/tags/;

 (2) favoring a name based on a ref in refs/tags/ hierarchy over
 a ref outside the hierarchy;

 (3) between two names based on a ref both outside refs/tags/, give
 precedence to a name with shorter hops and use "taggerdate"
 only to tie-break.

A change to t4202 is a natural consequence.  The test creates a
commit on a branch "side" and points at it with an unannotated tag
"refs/tags/side-2".  The original code couldn't decide which one to
favor at all, and gave a name based on a branch (simply because
refs/heads/side sorts earlier than refs/tags/side-2).  Because the
updated logic is taught to favor refs in refs/tags/ hierarchy, the
the test is updated to expect to see tags/side-2 instead.

Signed-off-by: Junio C Hamano 
---

 * I am sure others can add better heurisitics in is_better_name()
   for comparing names based on non-tag refs, and I do not mind
   people disagreeing with the logic that this patch happens to
   implement.  This is done primarily to illustrate the value of
   using a separate helper function is_better_name() instead of
   open-coding the selection logic in name_rev() function.
Signed-off-by: Michael J Gruber 
---
 builtin/name-rev.c | 53 -
 t/t4202-log.sh |  2 +-
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 7890f826ce..bf7ed015ae 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -13,6 +13,7 @@ typedef struct rev_name {
unsigned long taggerdate;
int generation;
int distance;
+   int from_tag;
 } rev_name;
 
 static long cutoff = LONG_MAX;
@@ -24,16 +25,43 @@ static int is_better_name(struct rev_name *name,
  const char *tip_name,
  unsigned long taggerdate,
  int generation,
- int distance)
+ int distance,
+ int from_tag)
 {
-   return (name->taggerdate > taggerdate ||
-   (name->taggerdate == taggerdate &&
-name->distance > distance));
+   /*
+* When comparing names based on tags, prefer names
+* based on the older tag, even if it is farther away.
+*/
+   if (from_tag && name->from_tag)
+   return (name->taggerdate > taggerdate ||
+   (name->taggerdate == taggerdate &&
+name->distance > distance));
+
+   /*
+* We know that at least one of them is a non-tag at this point.
+* favor a tag over a non-tag.
+*/
+   if (name->from_tag != from_tag)
+   return from_tag;
+
+   /*
+* We are now looking at two non-tags.  Tiebreak to favor
+* shorter hops.
+*/
+   if (name->distance != distance)
+   return name->distance > distance;
+
+   /* ... or tiebreak to favor older date */
+   if (name->taggerdate != taggerdate)
+   return name->taggerdate > taggerdate;
+
+   /* keep the current one if we cannot decide */
+   return 0;
 }
 
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
-   int generation, int distance,
+

  1   2   >