[PATCH] use strbuf_add_unique_abbrev() for adding short hashes

2016-08-06 Thread René Scharfe
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer.  This is shorter and a bit more efficient.

Signed-off-by: Rene Scharfe 
---
 builtin/checkout.c |  3 +--
 pretty.c   | 13 ++---
 transport.c| 11 ---
 3 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 27c1a05..0a7d24c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -703,8 +703,7 @@ static int add_pending_uninteresting_ref(const char 
*refname,
 static void describe_one_orphan(struct strbuf *sb, struct commit *commit)
 {
strbuf_addstr(sb, "  ");
-   strbuf_addstr(sb,
-   find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(sb, commit->object.oid.hash, DEFAULT_ABBREV);
strbuf_addch(sb, ' ');
if (!parse_commit(commit))
pp_commit_easy(CMIT_FMT_ONELINE, commit, sb);
diff --git a/pretty.c b/pretty.c
index 9fa42c2..9609afb 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1143,8 +1143,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
strbuf_addstr(sb, diff_get_color(c->auto_color, 
DIFF_RESET));
return 1;
}
-   strbuf_addstr(sb, find_unique_abbrev(commit->object.oid.hash,
-c->pretty_ctx->abbrev));
+   strbuf_add_unique_abbrev(sb, commit->object.oid.hash,
+c->pretty_ctx->abbrev);
strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
return 1;
@@ -1154,8 +1154,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
case 't':   /* abbreviated tree hash */
if (add_again(sb, &c->abbrev_tree_hash))
return 1;
-   strbuf_addstr(sb, 
find_unique_abbrev(commit->tree->object.oid.hash,
-c->pretty_ctx->abbrev));
+   strbuf_add_unique_abbrev(sb, commit->tree->object.oid.hash,
+c->pretty_ctx->abbrev);
c->abbrev_tree_hash.len = sb->len - c->abbrev_tree_hash.off;
return 1;
case 'P':   /* parent hashes */
@@ -1171,9 +1171,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
for (p = commit->parents; p; p = p->next) {
if (p != commit->parents)
strbuf_addch(sb, ' ');
-   strbuf_addstr(sb, find_unique_abbrev(
-   p->item->object.oid.hash,
-   c->pretty_ctx->abbrev));
+   strbuf_add_unique_abbrev(sb, p->item->object.oid.hash,
+c->pretty_ctx->abbrev);
}
c->abbrev_parent_hashes.len = sb->len -
  c->abbrev_parent_hashes.off;
diff --git a/transport.c b/transport.c
index 4ba48b0..557f399 100644
--- a/transport.c
+++ b/transport.c
@@ -321,11 +321,6 @@ static void print_ref_status(char flag, const char 
*summary, struct ref *to, str
}
 }
 
-static const char *status_abbrev(unsigned char sha1[20])
-{
-   return find_unique_abbrev(sha1, DEFAULT_ABBREV);
-}
-
 static void print_ok_ref_status(struct ref *ref, int porcelain)
 {
if (ref->deletion)
@@ -340,7 +335,8 @@ static void print_ok_ref_status(struct ref *ref, int 
porcelain)
char type;
const char *msg;
 
-   strbuf_addstr(&quickref, status_abbrev(ref->old_oid.hash));
+   strbuf_add_unique_abbrev(&quickref, ref->old_oid.hash,
+DEFAULT_ABBREV);
if (ref->forced_update) {
strbuf_addstr(&quickref, "...");
type = '+';
@@ -350,7 +346,8 @@ static void print_ok_ref_status(struct ref *ref, int 
porcelain)
type = ' ';
msg = NULL;
}
-   strbuf_addstr(&quickref, status_abbrev(ref->new_oid.hash));
+   strbuf_add_unique_abbrev(&quickref, ref->new_oid.hash,
+DEFAULT_ABBREV);
 
print_ref_status(type, quickref.buf, ref, ref->peer_ref, msg, 
porcelain);
strbuf_release(&quickref);
-- 
2.9.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Rename detection within in files WAS: [PATCH 2/6] t7408: merge short tests, factor out testing method

2016-08-07 Thread René Scharfe

Am 06.08.2016 um 01:26 schrieb Stefan Beller:

When moving code around, we usually get large chunks of text. If the contributor
is not 100% trustworthy, we need to review all the code without much intelectual
joy. Essentially the reviewer is just making sure the parts of the text are the
same.

I'd like to propose a new addition to the diff format that makes this use case
easier. The idea is to mark up lines that were just moved around in the file
instead of adding and removing them.

Currently we have 3 characters that
are allowed to start a line within a hunk:
' ' to indicate context
'+' to add a line
'-' to remove a line

I'd propose to add the following characters:
'*' which is the same as '+', but it indicates that the line was moved
 from somewhere else without change.
'X' The same as '-', with the addition that this line was moved to a different
 place without change.

The patch below uses these new '*' and 'X'. Each hunk that makes use of these
additions, is followed other sections, [moved-from, moved-to] that indicate
where the corresponding line is.


Interesting idea. It should be easy to convert the result into a regular 
unified diff for consumption with patch(1) or git am/apply by replacing 
the new flags with + and - and removing the moved-* hunks.


Your example ignores whitespace changes at the start of the line and 
within it, the added "-C $working_dir", s/expected/expect/; is this all 
intended?  Only a single blank line was moved verbatim.


The moved-from and moved-to hunks make this diff quite verbose.

If multiple lines from different sources are moved to the same hunk then 
you'd get multiple moved-from hunks following that single destination, 
right?  (Same with lines moved from a single hunk to multiple 
destinations and moved-to.)


But does it even warrent a new format? It's a display problem; the 
necessary information is already in the diffs we have today.  A 
graphical diff viewer could connect moved blocks with lines, like 
http://www.araxis.com/merge/ does in its side-by-side view.  A 
Thunderbird extension (or a bookmarklet or browser extendiion for 
webmail users) could do that for an email-based workflow.


Still, what about adding information about moved lines as an extended 
header (like that index line)?  Line numbers are included in hunk 
headers and can serve as orientation.  A reader would have to do some 
mental arithmetic (ugh), but incompatible format changes would be 
avoided.  For your example it should look something like this:


move from t/t7408-submodule-reference.sh:52,1
move to t/t7408-submodule-reference.sh:22,1



There are multiple things to tackle when going for such an addition:
* How to present this to the user (it's covered in this email)
* how to find the renamed lines algorithmically.
   (there are already approaches to that, e.g. 
https://github.com/stefanbeller/duplo
   which is http://duplo.sourceforge.net/ with no substantial additions)

Any comments welcome,

Thanks,
Stefan

---
  t/t7408-submodule-reference.sh | 50 +++---
  1 file changed, 15 insertions(+), 29 deletions(-), 6 moved lines

diff --git a/t/t7408-submodule-reference.sh b/t/t7408-submodule-reference.sh
index afcc629..1416cbd 100755
--- a/t/t7408-submodule-reference.sh
+++ b/t/t7408-submodule-reference.sh
@@ -10,6 +10,16 @@ base_dir=$(pwd)

  U=$base_dir/UPLOAD_LOG

+test_alternate_usage()
+{
+   alternates_file=$1
+   working_dir=$2
+   test_line_count = 1 $alternates_file &&
*   echo "0 objects, 0 kilobytes" >expect &&
*   git -C $working_dir count-objects >current &&
*   diff expect current
+}
+


Post-image line 22.


  test_expect_success 'preparing first repository' '
test_create_repo A &&
(
@@ move-source 42,6 @@ test_expect_success 'that reference gets used with add' '
  test_expect_success 'that reference gets used with add' '
(
cd super/sub &&
X   echo "0 objects, 0 kilobytes" > expected &&
X   git count-objects > current &&
X   diff expected current
)
  '
@@ -42,44 +52,20 @@ test_expect_success 'preparing superproject' '
)
  '

-test_expect_success 'submodule add --reference' '
+test_expect_success 'submodule add --reference uses alternates' '
(
cd super &&
git submodule add --reference ../B "file://$base_dir/A" sub &&
git commit -m B-super-added
-   )
-'
-


Pre-image line 52.


-test_expect_success 'after add: existence of info/alternates' '
-   test_line_count = 1 super/.git/modules/sub/objects/info/alternates
-'
-
-test_expect_success 'that reference gets used with add' '
-   (
-   cd super/sub &&
X   echo "0 objects, 0 kilobytes" > expected &&
X   git count-objects > current &&
X   diff expected current
-   )
-'
-
-test_expect_success 'cloning superproject' '
-   git

[PATCH] correct FLEXPTR_* example in comment

2016-08-13 Thread René Scharfe
This section is about "The FLEXPTR_* variants", so use FLEXPTR_ALLOC_STR
in the example.

Signed-off-by: Rene Scharfe 
---
 git-compat-util.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 590bfdd..f52e00b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -815,7 +815,7 @@ extern FILE *fopen_for_writing(const char *path);
  * you can do:
  *
  *   struct foo *f;
- *   FLEX_ALLOC_STR(f, name, src);
+ *   FLEXPTR_ALLOC_STR(f, name, src);
  *
  * and "name" will point to a block of memory after the struct, which will be
  * freed along with the struct (but the pointer can be repointed anywhere).
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] mailinfo: recycle strbuf in check_header()

2016-08-13 Thread René Scharfe
handle_message_id() duplicates the contents of the strbuf that is passed
to it.  Its only caller proceeds to release the strbuf immediately after
that.  Reuse it instead and make that change of object ownership more
obvious by inlining this short function.

Signed-off-by: Rene Scharfe 
---
 mailinfo.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 9f19ca1..e19abe3 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -179,12 +179,6 @@ static void handle_content_type(struct mailinfo *mi, 
struct strbuf *line)
}
 }
 
-static void handle_message_id(struct mailinfo *mi, const struct strbuf *line)
-{
-   if (mi->add_message_id)
-   mi->message_id = strdup(line->buf);
-}
-
 static void handle_content_transfer_encoding(struct mailinfo *mi,
 const struct strbuf *line)
 {
@@ -495,7 +489,8 @@ static int check_header(struct mailinfo *mi,
len = strlen("Message-Id: ");
strbuf_add(&sb, line->buf + len, line->len - len);
decode_header(mi, &sb);
-   handle_message_id(mi, &sb);
+   if (mi->add_message_id)
+   mi->message_id = strbuf_detach(&sb, NULL);
ret = 1;
goto check_header_out;
}
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] commit: introduce set_merge_remote_desc()

2016-08-13 Thread René Scharfe
Add a helper function for allocating, populating and attaching struct
merge_remote_desc to a commit and use it consistently.  It allocates the
necessary memory in a single block.

commit.c::get_merge_parent() forgot to check for memory allocation
failures of strdup(3).

merge-recursive.c::make_virtual_commit() didn't duplicate the string for
the name member, even though one of it's callers (indirectly through
get_ref()) may pass the result of oid_to_hex(), i.e. a static buffer.

Signed-off-by: Rene Scharfe 
---
 commit.c  | 18 +++---
 commit.h  |  2 ++
 merge-recursive.c |  5 +
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/commit.c b/commit.c
index 71a360d..372b200 100644
--- a/commit.c
+++ b/commit.c
@@ -1576,6 +1576,15 @@ int commit_tree_extended(const char *msg, size_t msg_len,
return result;
 }
 
+void set_merge_remote_desc(struct commit *commit,
+  const char *name, struct object *obj)
+{
+   struct merge_remote_desc *desc;
+   FLEXPTR_ALLOC_STR(desc, name, name);
+   desc->obj = obj;
+   commit->util = desc;
+}
+
 struct commit *get_merge_parent(const char *name)
 {
struct object *obj;
@@ -1585,13 +1594,8 @@ struct commit *get_merge_parent(const char *name)
return NULL;
obj = parse_object(oid.hash);
commit = (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
-   if (commit && !commit->util) {
-   struct merge_remote_desc *desc;
-   desc = xmalloc(sizeof(*desc));
-   desc->obj = obj;
-   desc->name = strdup(name);
-   commit->util = desc;
-   }
+   if (commit && !commit->util)
+   set_merge_remote_desc(commit, name, obj);
return commit;
 }
 
diff --git a/commit.h b/commit.h
index 23ae0c1..84bb507 100644
--- a/commit.h
+++ b/commit.h
@@ -365,6 +365,8 @@ struct merge_remote_desc {
const char *name;
 };
 #define merge_remote_util(commit) ((struct merge_remote_desc 
*)((commit)->util))
+extern void set_merge_remote_desc(struct commit *commit,
+ const char *name, struct object *obj);
 
 /*
  * Given "name" from the command line to merge, find the commit object
diff --git a/merge-recursive.c b/merge-recursive.c
index e5243c2..e349126 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -73,12 +73,9 @@ static struct tree *shift_tree_object(struct tree *one, 
struct tree *two,
 static struct commit *make_virtual_commit(struct tree *tree, const char 
*comment)
 {
struct commit *commit = alloc_commit_node();
-   struct merge_remote_desc *desc = xmalloc(sizeof(*desc));
 
-   desc->name = comment;
-   desc->obj = (struct object *)commit;
+   set_merge_remote_desc(commit, comment, (struct object *)commit);
commit->tree = tree;
-   commit->util = desc;
commit->object.parsed = 1;
return commit;
 }
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] correct FLEXPTR_* example in comment

2016-08-13 Thread René Scharfe

Am 13.08.2016 um 11:09 schrieb Jeff King:

On Sat, Aug 13, 2016 at 11:01:21AM +0200, René Scharfe wrote:


This section is about "The FLEXPTR_* variants", so use FLEXPTR_ALLOC_STR
in the example.


Oops, yeah. Your patch is clearly an improvement.

Since this is obviously an easy mistake to make (using one form rather
than the other), I wondered if the compiler would catch it.

I think it would catch an accidental use of FLEXPTR instead of FLEX,
because it involves an attempted assignment of an array. But I don't
think we would catch the reverse; we'd just write the data directly on
top of the pointer. That would probably crash immediately at runtime, so
if you exercise the code at all in tests, it is OK. But something to be
aware of.

I suppose it could assert(sizeof((x)->flexname) == FLEX_ALLOC) or
something, but I'm not sure if it is worth worrying about.


A compilation error or warning would be nice.  I scratched my head quite 
a bit before figuring out that the example was wrong.  Copy&paste from a 
reputable source must be correct, right? :)


René

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] commit: introduce set_merge_remote_desc()

2016-08-13 Thread René Scharfe

Am 13.08.2016 um 11:23 schrieb Jeff King:

On Sat, Aug 13, 2016 at 11:14:31AM +0200, René Scharfe wrote:


Add a helper function for allocating, populating and attaching struct
merge_remote_desc to a commit and use it consistently.  It allocates the
necessary memory in a single block.

commit.c::get_merge_parent() forgot to check for memory allocation
failures of strdup(3).

merge-recursive.c::make_virtual_commit() didn't duplicate the string for
the name member, even though one of it's callers (indirectly through
get_ref()) may pass the result of oid_to_hex(), i.e. a static buffer.


It seems like you've buried the interesting part here. This isn't just
for cleanup, but a bugfix that the oids in our virtual commits might get
overwritten by subsequent actions.

It seems like that should be the subject and beginning of the commit
message.  And then the fix is to allocate, and by the way we can do so
easily with this nice new helper. :)


Bugs are usually hidden, so why not hide fixes? ;-)


diff --git a/commit.c b/commit.c
index 71a360d..372b200 100644
--- a/commit.c
+++ b/commit.c
@@ -1576,6 +1576,15 @@ int commit_tree_extended(const char *msg, size_t msg_len,
return result;
  }

+void set_merge_remote_desc(struct commit *commit,
+  const char *name, struct object *obj)
+{
+   struct merge_remote_desc *desc;
+   FLEXPTR_ALLOC_STR(desc, name, name);
+   desc->obj = obj;
+   commit->util = desc;
+}


I don't think there is any reason to prefer FLEXPTR_ALLOC over
FLEX_ALLOC, unless your struct interface is constrained by non-flex
users (that's why it is necessary for "struct exclude", for example,
which sometimes needs to carry its own string and sometimes not).

Using FLEX_ALLOC saves a few bytes per struct, and avoids an extra
pointer indirection when accessing the data.

Since it looks like you touch all of the allocations here, I think they
would both be happy as a regular flex array.


Good idea.

So let's turn this dish into a full menu:

  commit: use xstrdup() in get_merge_parent()
  commit: factor out set_merge_remote_desc()
  merge-recursive: fix verbose output for multiple base trees
  commit: use FLEX_ARRAY in struct merge_remote_desc

 commit.c   | 18 +++---
 commit.h   |  4 +++-
 merge-recursive.c  |  5 +
 t/t3030-merge-recursive.sh | 18 ++
 4 files changed, 33 insertions(+), 12 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/4] commit: use xstrdup() in get_merge_parent()

2016-08-13 Thread René Scharfe
Handle allocation errors for the name member just like we already do
for the struct merge_remote_desc itself.

Signed-off-by: Rene Scharfe 
---
 commit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit.c b/commit.c
index 71a360d..ccd232a 100644
--- a/commit.c
+++ b/commit.c
@@ -1589,7 +1589,7 @@ struct commit *get_merge_parent(const char *name)
struct merge_remote_desc *desc;
desc = xmalloc(sizeof(*desc));
desc->obj = obj;
-   desc->name = strdup(name);
+   desc->name = xstrdup(name);
commit->util = desc;
}
return commit;
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] commit: factor out set_merge_remote_desc()

2016-08-13 Thread René Scharfe
Export a helper function for allocating, populating and attaching a
merge_remote_desc to a commit.

Signed-off-by: Rene Scharfe 
---
 commit.c | 19 ---
 commit.h |  2 ++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/commit.c b/commit.c
index ccd232a..8bad713 100644
--- a/commit.c
+++ b/commit.c
@@ -1576,6 +1576,16 @@ int commit_tree_extended(const char *msg, size_t msg_len,
return result;
 }
 
+void set_merge_remote_desc(struct commit *commit,
+  const char *name, struct object *obj)
+{
+   struct merge_remote_desc *desc;
+   desc = xmalloc(sizeof(*desc));
+   desc->obj = obj;
+   desc->name = xstrdup(name);
+   commit->util = desc;
+}
+
 struct commit *get_merge_parent(const char *name)
 {
struct object *obj;
@@ -1585,13 +1595,8 @@ struct commit *get_merge_parent(const char *name)
return NULL;
obj = parse_object(oid.hash);
commit = (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
-   if (commit && !commit->util) {
-   struct merge_remote_desc *desc;
-   desc = xmalloc(sizeof(*desc));
-   desc->obj = obj;
-   desc->name = xstrdup(name);
-   commit->util = desc;
-   }
+   if (commit && !commit->util)
+   set_merge_remote_desc(commit, name, obj);
return commit;
 }
 
diff --git a/commit.h b/commit.h
index 23ae0c1..84bb507 100644
--- a/commit.h
+++ b/commit.h
@@ -365,6 +365,8 @@ struct merge_remote_desc {
const char *name;
 };
 #define merge_remote_util(commit) ((struct merge_remote_desc 
*)((commit)->util))
+extern void set_merge_remote_desc(struct commit *commit,
+ const char *name, struct object *obj);
 
 /*
  * Given "name" from the command line to merge, find the commit object
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] merge-recursive: fix verbose output for multiple base trees

2016-08-13 Thread René Scharfe
One of the indirect callers of make_virtual_commit() passes the result of
oid_to_hex() as the name, i.e. a pointer to a static buffer.  Since the
function uses that string pointer directly in building a struct
merge_remote_desc, multiple entries can end up sharing the same name
inadvertently.

Fix that by calling set_merge_remote_desc(), which creates a copy of the
string, instead of building the struct by hand.

Signed-off-by: Rene Scharfe 
---
 merge-recursive.c  |  5 +
 t/t3030-merge-recursive.sh | 18 ++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index e5243c2..e349126 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -73,12 +73,9 @@ static struct tree *shift_tree_object(struct tree *one, 
struct tree *two,
 static struct commit *make_virtual_commit(struct tree *tree, const char 
*comment)
 {
struct commit *commit = alloc_commit_node();
-   struct merge_remote_desc *desc = xmalloc(sizeof(*desc));
 
-   desc->name = comment;
-   desc->obj = (struct object *)commit;
+   set_merge_remote_desc(commit, comment, (struct object *)commit);
commit->tree = tree;
-   commit->util = desc;
commit->object.parsed = 1;
return commit;
 }
diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
index f7b0e59..470f334 100755
--- a/t/t3030-merge-recursive.sh
+++ b/t/t3030-merge-recursive.sh
@@ -660,4 +660,22 @@ test_expect_success 'merging with triple rename across D/F 
conflict' '
git merge other
 '
 
+test_expect_success 'merge-recursive remembers the names of all base trees' '
+   git reset --hard HEAD &&
+
+   # more trees than static slots used by oid_to_hex()
+   for commit in $c0 $c2 $c4 $c5 $c6 $c7
+   do
+   git rev-parse "$commit^{tree}"
+   done >trees &&
+
+   # ignore the return code -- it only fails because the input is weird
+   test_must_fail git -c merge.verbosity=5 merge-recursive $(cat trees) -- 
$c1 $c3 >out &&
+
+   # merge-recursive prints in reverse order, but we do not care
+   sort expect &&
+   sed -n "s/^virtual //p" out | sort >actual &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] commit: use FLEX_ARRAY in struct merge_remote_desc

2016-08-13 Thread René Scharfe
Convert the name member of struct merge_remote_desc to a FLEX_ARRAY and
use FLEX_ALLOC_STR to build the struct.  This halves the number of
memory allocations, saves the storage for a pointer and avoids an
indirection when reading the name.

Suggested-by: Jeff King 
Signed-off-by: Rene Scharfe 
---
 commit.c | 3 +--
 commit.h | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/commit.c b/commit.c
index 8bad713..ba6dee3 100644
--- a/commit.c
+++ b/commit.c
@@ -1580,9 +1580,8 @@ void set_merge_remote_desc(struct commit *commit,
   const char *name, struct object *obj)
 {
struct merge_remote_desc *desc;
-   desc = xmalloc(sizeof(*desc));
+   FLEX_ALLOC_STR(desc, name, name);
desc->obj = obj;
-   desc->name = xstrdup(name);
commit->util = desc;
 }
 
diff --git a/commit.h b/commit.h
index 84bb507..32e1a11 100644
--- a/commit.h
+++ b/commit.h
@@ -362,7 +362,7 @@ extern void for_each_mergetag(each_mergetag_fn fn, struct 
commit *commit, void *
 
 struct merge_remote_desc {
struct object *obj; /* the named object, could be a tag */
-   const char *name;
+   char name[FLEX_ARRAY];
 };
 #define merge_remote_util(commit) ((struct merge_remote_desc 
*)((commit)->util))
 extern void set_merge_remote_desc(struct commit *commit,
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] receive-pack: use FLEX_ALLOC_MEM in queue_command()

2016-08-13 Thread René Scharfe
Use the macro FLEX_ALLOC_MEM instead of open-coding it.  This shortens
and simplifies the code a bit.

Signed-off-by: Rene Scharfe 
---
 builtin/receive-pack.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 92e1213..011db00 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1478,11 +1478,9 @@ static struct command **queue_command(struct command 
**tail,
 
refname = line + 82;
reflen = linelen - 82;
-   cmd = xcalloc(1, st_add3(sizeof(struct command), reflen, 1));
+   FLEX_ALLOC_MEM(cmd, ref_name, refname, reflen);
hashcpy(cmd->old_sha1, old_sha1);
hashcpy(cmd->new_sha1, new_sha1);
-   memcpy(cmd->ref_name, refname, reflen);
-   cmd->ref_name[reflen] = '\0';
*tail = cmd;
return &cmd->next;
 }
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] correct FLEXPTR_* example in comment

2016-08-13 Thread René Scharfe

Am 13.08.2016 um 11:09 schrieb Jeff King:

On Sat, Aug 13, 2016 at 11:01:21AM +0200, René Scharfe wrote:


This section is about "The FLEXPTR_* variants", so use FLEXPTR_ALLOC_STR
in the example.


Oops, yeah. Your patch is clearly an improvement.

Since this is obviously an easy mistake to make (using one form rather
than the other), I wondered if the compiler would catch it.

I think it would catch an accidental use of FLEXPTR instead of FLEX,
because it involves an attempted assignment of an array.


Gcc 5.4 reports "invalid use of flexible array member".


But I don't
think we would catch the reverse; we'd just write the data directly on
top of the pointer. That would probably crash immediately at runtime, so
if you exercise the code at all in tests, it is OK. But something to be
aware of.


No hint at compile time, segfaults at runtime.


I suppose it could assert(sizeof((x)->flexname) == FLEX_ALLOC) or
something, but I'm not sure if it is worth worrying about.


You can't use sizeof with an actual flexible array.  It only works if 
FLEX_ARRAY is defined as 1 (for platforms without native support), and 
perhaps also if it's 0.


offsetof(struct x, arr) == sizeof(struct x) won't work either because of 
padding.


I have no idea what you can do with a flexible array that would throw a 
compile error when done with a pointer.


René
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 7/7] blame: actually use the diff opts parsed from the command line

2016-08-23 Thread René Scharfe

Am 22.08.2016 um 13:22 schrieb Michael Haggerty:

"git blame" already parsed generic diff options from the command line
via diff_opt_parse(), but instead of passing the resulting xdl_opts to
xdi_diff(), it sent its own xdl_opts, which only reflected the values of
the self-parsed options "-w" and "--minimal". Instead, rely on
diff_opt_parse() to parse all of the diff options, including "-w" and
"--minimal", and pass the resulting xdl_opts to xdi_diff().


Sounds useful: It allows more fine-grained control over which whitespace 
changes to ignore and which diff algorithm to use.  There is a bit of 
overlap (e.g. with -b meaning show blank boundaries vs. ignore 
whitespace changes), but with your patch blame's own options still take 
precedence, so there should be no unpleasant surprises.



Signed-off-by: Michael Haggerty 
---
Somebody who knows more about how diff operations are configured
should please review this. I'm not certain that the change as
implemented won't have other unwanted side-effects, though of course
I checked that the test suite runs correctly.


I don't qualify, but I'll comment anyway..


 builtin/blame.c|  11 ++--
 t/t4059-diff-indent.sh | 160 +
 2 files changed, 165 insertions(+), 6 deletions(-)
 create mode 100755 t/t4059-diff-indent.sh


This new test doesn't call git blame.  Does it belong to a different 
commit?  And shouldn't the change to blame.c stand on its own, outside 
of this series?



diff --git a/builtin/blame.c b/builtin/blame.c
index 7ec7823..cde2d15 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -48,11 +48,12 @@ static int show_root;
 static int reverse;
 static int blank_boundary;
 static int incremental;
-static int xdl_opts;
 static int abbrev = -1;
 static int no_whole_file_rename;
 static int show_progress;

+static struct rev_info revs;
+
 static struct date_mode blame_date_mode = { DATE_ISO8601 };
 static size_t blame_date_width;

@@ -137,11 +138,12 @@ struct progress_info {
 static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
  xdl_emit_hunk_consume_func_t hunk_func, void *cb_data)
 {
-   xpparam_t xpp = {0};
+   xpparam_t xpp;
xdemitconf_t xecfg = {0};
xdemitcb_t ecb = {NULL};

-   xpp.flags = xdl_opts;
+   memset(&xpp, 0, sizeof(xpp));
+   xpp.flags = revs.diffopt.xdl_opts;


Why call memset instead of using a static initializer?  The intent of 
this patch is just to change the .flags assignment, isn't it?



xecfg.hunk_func = hunk_func;
ecb.priv = cb_data;
return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
@@ -2517,7 +2519,6 @@ static int blame_move_callback(const struct option 
*option, const char *arg, int

 int cmd_blame(int argc, const char **argv, const char *prefix)
 {
-   struct rev_info revs;
const char *path;
struct scoreboard sb;
struct origin *o;
@@ -2548,8 +2549,6 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
OPT_BIT('l', NULL, &output_option, N_("Show long commit SHA1 
(Default: off)"), OUTPUT_LONG_OBJECT_NAME),
OPT_BIT('s', NULL, &output_option, N_("Suppress author name and 
timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("Show author email 
instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
-   OPT_BIT('w', NULL, &xdl_opts, N_("Ignore whitespace 
differences"), XDF_IGNORE_WHITESPACE),
-   OPT_BIT(0, "minimal", &xdl_opts, N_("Spend extra cycles to find 
better match"), XDF_NEED_MINIMAL),


This removes -w and --minimal from blame's short help; diff options 
should be mentioned somehow in exchange for that loss.  Or perhaps they 
should be mentioned in git-rev-list(1)?  (git blame -h points to 
git-rev-list(1) already.)


Documentation/git-blame.txt needs an update as well.


OPT_STRING('S', NULL, &revs_file, N_("file"), N_("Use revisions from 
 instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("Use 
's contents as the final image")),
{ OPTION_CALLBACK, 'C', NULL, &opt, N_("score"), N_("Find line 
copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback },


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] p3400: make test script executable

2016-08-28 Thread René Scharfe
Signed-off-by: Rene Scharfe 
---
This script was added by v2.10.0-rc0~3^2.

 t/perf/p3400-rebase.sh | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 t/perf/p3400-rebase.sh

diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh
old mode 100644
new mode 100755
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] compat: move strdup(3) replacement to its own file

2016-09-03 Thread René Scharfe
Move our implementation of strdup(3) out of compat/nedmalloc/ and allow
it to be used independently from USE_NED_ALLOCATOR.  This reduces the
difference of our copy of nedmalloc from the original, making it easier
to update, and allows for easier testing and reusing of our version of
strdup().

Signed-off-by: Rene Scharfe 
---
 Makefile | 17 ++---
 compat/nedmalloc/nedmalloc.c | 16 
 compat/strdup.c  | 11 +++
 git-compat-util.h|  8 
 4 files changed, 33 insertions(+), 19 deletions(-)
 create mode 100644 compat/strdup.c

diff --git a/Makefile b/Makefile
index d96ecb7..7f18492 100644
--- a/Makefile
+++ b/Makefile
@@ -296,6 +296,11 @@ all::
 # Define USE_NED_ALLOCATOR if you want to replace the platforms default
 # memory allocators with the nedmalloc allocator written by Niall Douglas.
 #
+# Define OVERRIDE_STRDUP to override the libc version of strdup(3).
+# This is necessary when using a custom allocator in order to avoid
+# crashes due to allocation and free working on different 'heaps'.
+# It's defined automatically if USE_NED_ALLOCATOR is set.
+#
 # Define NO_REGEX if you have no or inferior regex support in your C library.
 #
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
@@ -1456,8 +1461,14 @@ ifdef NATIVE_CRLF
 endif
 
 ifdef USE_NED_ALLOCATOR
-   COMPAT_CFLAGS += -Icompat/nedmalloc
-   COMPAT_OBJS += compat/nedmalloc/nedmalloc.o
+   COMPAT_CFLAGS += -Icompat/nedmalloc
+   COMPAT_OBJS += compat/nedmalloc/nedmalloc.o
+   OVERRIDE_STRDUP = YesPlease
+endif
+
+ifdef OVERRIDE_STRDUP
+   COMPAT_CFLAGS += -DOVERRIDE_STRDUP
+   COMPAT_OBJS += compat/strdup.o
 endif
 
 ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
@@ -2029,7 +2040,7 @@ endif
 
 ifdef USE_NED_ALLOCATOR
 compat/nedmalloc/nedmalloc.sp compat/nedmalloc/nedmalloc.o: EXTRA_CPPFLAGS = \
-   -DNDEBUG -DOVERRIDE_STRDUP -DREPLACE_SYSTEM_ALLOCATOR
+   -DNDEBUG -DREPLACE_SYSTEM_ALLOCATOR
 compat/nedmalloc/nedmalloc.sp: SPARSE_FLAGS += -Wno-non-pointer-null
 endif
 
diff --git a/compat/nedmalloc/nedmalloc.c b/compat/nedmalloc/nedmalloc.c
index 2d4ef59..1cc31c3 100644
--- a/compat/nedmalloc/nedmalloc.c
+++ b/compat/nedmalloc/nedmalloc.c
@@ -948,22 +948,6 @@ void **nedpindependent_comalloc(nedpool *p, size_t elems, 
size_t *sizes, void **
return ret;
 }
 
-#ifdef OVERRIDE_STRDUP
-/*
- * This implementation is purely there to override the libc version, to
- * avoid a crash due to allocation and free on different 'heaps'.
- */
-char *strdup(const char *s1)
-{
-   size_t len = strlen(s1) + 1;
-   char *s2 = malloc(len);
-
-   if (s2)
-   memcpy(s2, s1, len);
-   return s2;
-}
-#endif
-
 #if defined(__cplusplus)
 }
 #endif
diff --git a/compat/strdup.c b/compat/strdup.c
new file mode 100644
index 000..f3fb978
--- /dev/null
+++ b/compat/strdup.c
@@ -0,0 +1,11 @@
+#include "../git-compat-util.h"
+
+char *gitstrdup(const char *s1)
+{
+   size_t len = strlen(s1) + 1;
+   char *s2 = malloc(len);
+
+   if (s2)
+   memcpy(s2, s1, len);
+   return s2;
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index db89ba7..93f6f23 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -663,6 +663,14 @@ void *gitmemmem(const void *haystack, size_t haystacklen,
 const void *needle, size_t needlelen);
 #endif
 
+#ifdef OVERRIDE_STRDUP
+#ifdef strdup
+#undef strdup
+#endif
+#define strdup gitstrdup
+char *gitstrdup(const char *s);
+#endif
+
 #ifdef NO_GETPAGESIZE
 #define getpagesize() sysconf(_SC_PAGESIZE)
 #endif
-- 
2.10.0



[PATCH] introduce hex2chr() for converting two hexadecimal digits to a character

2016-09-03 Thread René Scharfe
Add and use a helper function that decodes the char value of two
hexadecimal digits.  It returns a negative number on error, avoids
running over the end of the given string and doesn't shift negative
values.

Signed-off-by: Rene Scharfe 
---
 cache.h  | 10 ++
 hex.c| 12 ++--
 pkt-line.c   | 23 ++-
 pretty.c | 13 +
 ref-filter.c | 20 +---
 url.c| 21 +
 6 files changed, 21 insertions(+), 78 deletions(-)

diff --git a/cache.h b/cache.h
index b780a91..b0dae4b 100644
--- a/cache.h
+++ b/cache.h
@@ -1139,6 +1139,16 @@ static inline unsigned int hexval(unsigned char c)
return hexval_table[c];
 }
 
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+   int val = hexval(s[0]);
+   return (val < 0) ? val : (val << 4) | hexval(s[1]);
+}
+
 /* Convert to/from hex/sha1 representation */
 #define MINIMUM_ABBREV minimum_abbrev
 #define DEFAULT_ABBREV default_abbrev
diff --git a/hex.c b/hex.c
index 9619b67..ab2610e 100644
--- a/hex.c
+++ b/hex.c
@@ -39,16 +39,8 @@ int get_sha1_hex(const char *hex, unsigned char *sha1)
 {
int i;
for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
-   unsigned int val;
-   /*
-* hex[1]=='\0' is caught when val is checked below,
-* but if hex[0] is NUL we have to avoid reading
-* past the end of the string:
-*/
-   if (!hex[0])
-   return -1;
-   val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-   if (val & ~0xff)
+   int val = hex2chr(hex);
+   if (val < 0)
return -1;
*sha1++ = val;
hex += 2;
diff --git a/pkt-line.c b/pkt-line.c
index 62fdb37..30489c6 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -172,27 +172,8 @@ static int get_packet_data(int fd, char **src_buf, size_t 
*src_size,
 
 static int packet_length(const char *linelen)
 {
-   int n;
-   int len = 0;
-
-   for (n = 0; n < 4; n++) {
-   unsigned char c = linelen[n];
-   len <<= 4;
-   if (c >= '0' && c <= '9') {
-   len += c - '0';
-   continue;
-   }
-   if (c >= 'a' && c <= 'f') {
-   len += c - 'a' + 10;
-   continue;
-   }
-   if (c >= 'A' && c <= 'F') {
-   len += c - 'A' + 10;
-   continue;
-   }
-   return -1;
-   }
-   return len;
+   int val = hex2chr(linelen);
+   return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
 int packet_read(int fd, char **src_buf, size_t *src_len,
diff --git a/pretty.c b/pretty.c
index 9609afb..9788bd8 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1065,7 +1065,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
const struct commit *commit = c->commit;
const char *msg = c->message;
struct commit_list *p;
-   int h1, h2;
+   int ch;
 
/* these are independent of the commit */
switch (placeholder[0]) {
@@ -1089,14 +1089,11 @@ static size_t format_commit_one(struct strbuf *sb, /* 
in UTF-8 */
return 1;
case 'x':
/* %x00 == NUL, %x0a == LF, etc. */
-   if (0 <= (h1 = hexval_table[0xff & placeholder[1]]) &&
-   h1 <= 16 &&
-   0 <= (h2 = hexval_table[0xff & placeholder[2]]) &&
-   h2 <= 16) {
-   strbuf_addch(sb, (h1<<4)|h2);
-   return 3;
-   } else
+   ch = hex2chr(placeholder + 1);
+   if (ch < 0)
return 0;
+   strbuf_addch(sb, ch);
+   return 3;
case 'w':
if (placeholder[1] == '(') {
unsigned long width = 0, indent1 = 0, indent2 = 0;
diff --git a/ref-filter.c b/ref-filter.c
index bc551a7..9adbb8a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1576,24 +1576,6 @@ void ref_array_sort(struct ref_sorting *sorting, struct 
ref_array *array)
qsort(array->items, array->nr, sizeof(struct ref_array_item *), 
compare_refs);
 }
 
-static int hex1(char ch)
-{
-   if ('0' <= ch && ch <= '9')
-   return ch - '0';
-   else if ('a' <= ch && ch <= 'f')
-   return ch - 'a' + 10;
-   else if ('A' <= ch && ch <= 'F')
-   return ch - 'A' + 10;
-   return -1;
-}
-static int hex2(const char *cp)
-{
-   if (cp[0] && cp[1])
-   return (hex1(cp[0]) << 4) | hex1(cp[1]);
-   else
-   return -1;
-}
-
 static void append_literal(const char *cp, const char *ep, st

Re: [PATCH] compat: move strdup(3) replacement to its own file

2016-09-06 Thread René Scharfe

Am 04.09.2016 um 09:46 schrieb Johannes Schindelin:

Hi René,

I imagine you Cc:ed me because the nedmalloc stuff came in via the Windows
port, contributed by Marius (who is no longer active on the Git project
because it works well enough for him)?


Kind of; it's also a follow-up to the recent discussion you started 
about compiler warnings in that code.



On Sat, 3 Sep 2016, René Scharfe wrote:


Move our implementation of strdup(3) out of compat/nedmalloc/ and allow
it to be used independently from USE_NED_ALLOCATOR.  This reduces the
difference of our copy of nedmalloc from the original, making it easier
to update, and allows for easier testing and reusing of our version of
strdup().


I would like to suggest an additional paragraph to explain why we do not
need to #include "git-compat-util.h" in nedmalloc from now on:

Please note that nedmalloc never actually uses strdup() itself,
therefore we need not enforce gitstrdup() usage in nedmalloc.c.


Well, OK.  I think the missing point is that the original nedmalloc 
doesn't come with strdup() and doesn't need it.  Only _users_ of 
nedmalloc need it.  Marius added it in nedmalloc.c, but strdup.c is a 
better place for it.


René


Re: [PATCH] introduce hex2chr() for converting two hexadecimal digits to a character

2016-09-06 Thread René Scharfe

Am 04.09.2016 um 09:49 schrieb Johannes Schindelin:

Hi René,

On Sat, 3 Sep 2016, René Scharfe wrote:


Add and use a helper function that decodes the char value of two
hexadecimal digits.  It returns a negative number on error, avoids
running over the end of the given string and doesn't shift negative
values.


I like it! Maybe stress a little bit why this is a good change? Like, DRY
up code, makes the code safer (bt avoiding shifting negative values)?


 6 files changed, 21 insertions(+), 78 deletions(-)


Very, very nice!


That's the main reason: Consistency.  It's intended to be a safe, easy 
to use and reasonably fast replacement for those other (lengthy) 
variations.


René


Re: [RFC/PATCH 01/17] diff: move line ending check into emit_hunk_header

2016-09-13 Thread René Scharfe

Am 13.09.2016 um 06:45 schrieb Stefan Beller:

In a later patch, I want to propose an option to detect&color
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This patch moves code that is conceptually part of
emit_hunk_header, but was using output in fn_out_consume,
back to emit_hunk_header.

Signed-off-by: Stefan Beller 
---
 diff.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index cc8e812..aa50b2d 100644
--- a/diff.c
+++ b/diff.c
@@ -610,6 +610,9 @@ static void emit_hunk_header(struct emit_callback *ecbdata,
}

strbuf_add(&msgbuf, line + len, org_len - len);
+   if (line[org_len - 1] != '\n')
+   strbuf_addch(&msgbuf, '\n');
+


Using strbuf_complete_line() would be nicer.


emit_line(ecbdata->opt, "", "", msgbuf.buf, msgbuf.len);
strbuf_release(&msgbuf);
 }
@@ -1247,8 +1250,6 @@ static void fn_out_consume(void *priv, char *line, 
unsigned long len)
len = sane_truncate_line(ecbdata, line, len);
find_lno(line, ecbdata);
emit_hunk_header(ecbdata, line, len);
-   if (line[len-1] != '\n')
-   putc('\n', o->file);
return;
}






[PATCH] strbuf: use valid pointer in strbuf_remove()

2016-09-13 Thread René Scharfe
The fourth argument of strbuf_splice() is passed to memcpy(3), which is
not supposed to handle NULL pointers.  Let's be extra careful and use a
valid empty string instead.  It even shortens the source code. :)

Signed-off-by: Rene Scharfe 
---
 strbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index f3bd571..b839be4 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -187,7 +187,7 @@ void strbuf_insert(struct strbuf *sb, size_t pos, const 
void *data, size_t len)
 
 void strbuf_remove(struct strbuf *sb, size_t pos, size_t len)
 {
-   strbuf_splice(sb, pos, len, NULL, 0);
+   strbuf_splice(sb, pos, len, "", 0);
 }
 
 void strbuf_add(struct strbuf *sb, const void *data, size_t len)
-- 
2.10.0


[PATCH] checkout: constify parameters of checkout_stage() and checkout_merged()

2016-09-13 Thread René Scharfe
Document the fact that checkout_stage() and checkout_merged() don't
change the objects passed to them by adding the modifier const.

Signed-off-by: Rene Scharfe 
---
 builtin/checkout.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 8672d07..afbff3e 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -154,8 +154,8 @@ static int check_stages(unsigned stages, const struct 
cache_entry *ce, int pos)
return 0;
 }
 
-static int checkout_stage(int stage, struct cache_entry *ce, int pos,
- struct checkout *state)
+static int checkout_stage(int stage, const struct cache_entry *ce, int pos,
+ const struct checkout *state)
 {
while (pos < active_nr &&
   !strcmp(active_cache[pos]->name, ce->name)) {
@@ -169,7 +169,7 @@ static int checkout_stage(int stage, struct cache_entry 
*ce, int pos,
return error(_("path '%s' does not have their version"), 
ce->name);
 }
 
-static int checkout_merged(int pos, struct checkout *state)
+static int checkout_merged(int pos, const struct checkout *state)
 {
struct cache_entry *ce = active_cache[pos];
const char *path = ce->name;
-- 
2.10.0



[PATCH] unpack-trees: pass checkout state explicitly to check_updates()

2016-09-13 Thread René Scharfe
Add a parameter for the struct checkout variable to check_updates()
instead of using a static global variable.  Passing it explicitly makes
object ownership and usage more easily apparent.  And we get rid of a
static variable; those can be problematic in library-like code.

Signed-off-by: Rene Scharfe 
---
 unpack-trees.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 11c37fb..74d6dd4 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -218,8 +218,8 @@ static void unlink_entry(const struct cache_entry *ce)
schedule_dir_for_removal(ce->name, ce_namelen(ce));
 }
 
-static struct checkout state;
-static int check_updates(struct unpack_trees_options *o)
+static int check_updates(struct unpack_trees_options *o,
+const struct checkout *state)
 {
unsigned cnt = 0, total = 0;
struct progress *progress = NULL;
@@ -264,7 +264,7 @@ static int check_updates(struct unpack_trees_options *o)
display_progress(progress, ++cnt);
ce->ce_flags &= ~CE_UPDATE;
if (o->update && !o->dry_run) {
-   errs |= checkout_entry(ce, &state, NULL);
+   errs |= checkout_entry(ce, state, NULL);
}
}
}
@@ -1094,6 +1094,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
int i, ret;
static struct cache_entry *dfc;
struct exclude_list el;
+   struct checkout state;
 
if (len > MAX_UNPACK_TREES)
die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
@@ -1239,7 +1240,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
}
 
o->src_index = NULL;
-   ret = check_updates(o) ? (-2) : 0;
+   ret = check_updates(o, &state) ? (-2) : 0;
if (o->dst_index) {
if (!ret) {
if (!o->result.cache_tree)
-- 
2.10.0



[PATCH] sha1_file: use llist_mergesort() for sorting packs

2016-09-13 Thread René Scharfe
Sort the linked list of packs directly using llist_mergesort() instead
of building an array, calling qsort(3) and fixing up the list pointers.
This is shorter and less complicated.

Signed-off-by: Rene Scharfe 
---
Peff: Or do you have other plans, e.g. to replace packed_git with
packed_git_mru completely?

 sha1_file.c | 39 +++
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 472ccb2..66dccaa 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -25,6 +25,7 @@
 #include "dir.h"
 #include "mru.h"
 #include "list.h"
+#include "mergesort.h"
 
 #ifndef O_NOATIME
 #if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
@@ -1380,10 +1381,20 @@ static void prepare_packed_git_one(char *objdir, int 
local)
strbuf_release(&path);
 }
 
+static void *get_next_packed_git(const void *p)
+{
+   return ((const struct packed_git *)p)->next;
+}
+
+static void set_next_packed_git(void *p, void *next)
+{
+   ((struct packed_git *)p)->next = next;
+}
+
 static int sort_pack(const void *a_, const void *b_)
 {
-   struct packed_git *a = *((struct packed_git **)a_);
-   struct packed_git *b = *((struct packed_git **)b_);
+   const struct packed_git *a = a_;
+   const struct packed_git *b = b_;
int st;
 
/*
@@ -1410,28 +1421,8 @@ static int sort_pack(const void *a_, const void *b_)
 
 static void rearrange_packed_git(void)
 {
-   struct packed_git **ary, *p;
-   int i, n;
-
-   for (n = 0, p = packed_git; p; p = p->next)
-   n++;
-   if (n < 2)
-   return;
-
-   /* prepare an array of packed_git for easier sorting */
-   ary = xcalloc(n, sizeof(struct packed_git *));
-   for (n = 0, p = packed_git; p; p = p->next)
-   ary[n++] = p;
-
-   qsort(ary, n, sizeof(struct packed_git *), sort_pack);
-
-   /* link them back again */
-   for (i = 0; i < n - 1; i++)
-   ary[i]->next = ary[i + 1];
-   ary[n - 1]->next = NULL;
-   packed_git = ary[0];
-
-   free(ary);
+   packed_git = llist_mergesort(packed_git, get_next_packed_git,
+set_next_packed_git, sort_pack);
 }
 
 static void prepare_packed_git_mru(void)
-- 
2.10.0



[PATCH] xdiff: fix merging of hunks with -W context and -u context

2016-09-14 Thread René Scharfe
If the function context for a hunk (with -W) reaches the beginning of
the next hunk then we need to merge these two -- otherwise we'd show
some lines twice, which looks strange and even confuses git apply.  We
already do this checking and merging in xdl_emit_diff(), but forget to
consider regular context (with -u or -U).

Fix that by merging hunks already if function context of the first one
touches or overlaps regular context of the second one.

Signed-off-by: Rene Scharfe 
---
 t/t4051-diff-function-context.sh | 25 +
 xdiff/xemit.c|  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/t/t4051-diff-function-context.sh b/t/t4051-diff-function-context.sh
index b79b877..6154acb 100755
--- a/t/t4051-diff-function-context.sh
+++ b/t/t4051-diff-function-context.sh
@@ -67,6 +67,15 @@ test_expect_success 'setup' '
commit_and_tag long_common_tail file.c &&
 
git checkout initial &&
+   cat "$dir/hello.c" "$dir/dummy.c" >file.c &&
+   commit_and_tag hello_dummy file.c &&
+
+   # overlap function context of 1st change and -u context of 2nd change
+   grep -v "delete me from hello" <"$dir/hello.c" >file.c &&
+   sed 2p <"$dir/dummy.c" >>file.c &&
+   commit_and_tag changed_hello_dummy file.c &&
+
+   git checkout initial &&
grep -v "delete me from hello" file.c.new &&
mv file.c.new file.c &&
cat "$dir/appended1.c" >>file.c &&
@@ -179,4 +188,20 @@ test_expect_success ' context does not include other 
functions' '
test $(grep -c "^[ +-].*Begin" changed_hello_appended.diff) -le 2
 '
 
+check_diff changed_hello_dummy 'changed two consecutive functions'
+
+test_expect_success ' context includes begin' '
+   grep "^ .*Begin of hello" changed_hello_dummy.diff &&
+   grep "^ .*Begin of dummy" changed_hello_dummy.diff
+'
+
+test_expect_success ' context includes end' '
+   grep "^ .*End of hello" changed_hello_dummy.diff &&
+   grep "^ .*End of dummy" changed_hello_dummy.diff
+'
+
+test_expect_success ' overlapping hunks are merged' '
+   test $(grep -c "^@@" changed_hello_dummy.diff) -eq 1
+'
+
 test_done
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index b52b4b9..7389ce4 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -239,7 +239,7 @@ int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, 
xdemitcb_t *ecb,
if (xche->next) {
long l = XDL_MIN(xche->next->i1,
 xe->xdf1.nrec - 1);
-   if (l <= e1 ||
+   if (l - xecfg->ctxlen <= e1 ||
get_func_line(xe, xecfg, NULL, l, e1) < 0) {
xche = xche->next;
goto post_context_calculation;
-- 
2.10.0



[PATCH] contrib/coccinelle: fix semantic patch for oid_to_hex_r()

2016-09-15 Thread René Scharfe
Both sha1_to_hex_r() and oid_to_hex_r() take two parameters, so use two
expressions in the semantic patch for transforming calls of the former
to the latter one.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/object_id.cocci | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/coccinelle/object_id.cocci 
b/contrib/coccinelle/object_id.cocci
index 8ccdbb5..0307624 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -23,16 +23,16 @@ expression E1;
 + oid_to_hex(E1)
 
 @@
-expression E1;
+expression E1, E2;
 @@
-- sha1_to_hex_r(E1.hash)
-+ oid_to_hex_r(&E1)
+- sha1_to_hex_r(E1, E2.hash)
++ oid_to_hex_r(E1, &E2)
 
 @@
-expression E1;
+expression E1, E2;
 @@
-- sha1_to_hex_r(E1->hash)
-+ oid_to_hex_r(E1)
+- sha1_to_hex_r(E1, E2->hash)
++ oid_to_hex_r(E1, E2)
 
 @@
 expression E1;
-- 
2.10.0



[PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls.  This makes the intent clearer and avoids
potential issues with printf format specifiers.

02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
this patch covers eleven more.

A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.

Signed-off-by: Rene Scharfe 
---
Silly question: Is there a natural language that uses percent signs
as letters or e.g. instead of commas? :)

 builtin/fmt-merge-msg.c | 2 +-
 builtin/merge.c | 2 +-
 builtin/submodule--helper.c | 5 +++--
 contrib/coccinelle/strbuf.cocci | 5 +
 merge-recursive.c   | 2 +-
 remote.c| 8 
 wt-status.c | 6 +++---
 7 files changed, 18 insertions(+), 12 deletions(-)
 create mode 100644 contrib/coccinelle/strbuf.cocci

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index ac84e99..dc2e9e4 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -395,7 +395,7 @@ static void shortlog(const char *name,
 
for (i = 0; i < subjects.nr; i++)
if (i >= limit)
-   strbuf_addf(out, "  ...\n");
+   strbuf_addstr(out, "  ...\n");
else
strbuf_addf(out, "  %s\n", subjects.items[i].string);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index 0ae099f..a8b57c7 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -940,7 +940,7 @@ static void write_merge_state(struct commit_list 
*remoteheads)
 
strbuf_reset(&buf);
if (fast_forward == FF_NO)
-   strbuf_addf(&buf, "no-ff");
+   strbuf_addstr(&buf, "no-ff");
write_file_buf(git_path_merge_mode(), buf.buf, buf.len);
 }
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 9d79f19..ad23155 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -859,8 +859,9 @@ static int update_clone_get_next_task(struct child_process 
*child,
ce = suc->failed_clones[index];
if (!prepare_to_clone_next_submodule(ce, child, suc, err)) {
suc->current ++;
-   strbuf_addf(err, "BUG: submodule considered for 
cloning,"
-   "doesn't need cloning any more?\n");
+   strbuf_addstr(err, "BUG: submodule considered for "
+  "cloning, doesn't need cloning "
+  "any more?\n");
return 0;
}
p = xmalloc(sizeof(*p));
diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
new file mode 100644
index 000..7932d48
--- /dev/null
+++ b/contrib/coccinelle/strbuf.cocci
@@ -0,0 +1,5 @@
+@@
+expression E1, E2;
+@@
+- strbuf_addf(E1, E2);
++ strbuf_addstr(E1, E2);
diff --git a/merge-recursive.c b/merge-recursive.c
index e349126..d2b191b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -206,7 +206,7 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
find_unique_abbrev(commit->object.oid.hash,
DEFAULT_ABBREV));
if (parse_commit(commit) != 0)
-   strbuf_addf(&o->obuf, _("(bad commit)\n"));
+   strbuf_addstr(&o->obuf, _("(bad commit)\n"));
else {
const char *title;
const char *msg = get_commit_buffer(commit, NULL);
diff --git a/remote.c b/remote.c
index d29850a..ad6c542 100644
--- a/remote.c
+++ b/remote.c
@@ -2073,7 +2073,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
_("Your branch is based on '%s', but the upstream is 
gone.\n"),
base);
if (advice_status_hints)
-   strbuf_addf(sb,
+   strbuf_addstr(sb,
_("  (use \"git branch --unset-upstream\" to 
fixup)\n"));
} else if (!ours && !theirs) {
strbuf_addf(sb,
@@ -2086,7 +2086,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
   ours),
base, ours);
if (advice_status_hints)
-   strbuf_addf(sb,
+   strbuf_addstr(sb,
_("  (use \"git push\" to publish your local 
commits)\n"));
} else if (!ours) {
strbuf_addf(sb,
@@ -2097,7 +2097,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
   theirs),
base, theirs);
if (advice_status_hints)
-   strbuf_addf(sb,
+   s

[PATCH] add coccicheck make target

2016-09-15 Thread René Scharfe
Provide a simple way to run Coccinelle against all source files, in the
form of a Makefile target.  Running "make coccicheck" applies each
.cocci file in contrib/coccinelle/ on all source files.  It generates
a .patch file for each .cocci file, containing the actual changes for
effecting the transformations described by the semantic patches.

Non-empty .patch files are reported.  They can be applied to the work
tree using "patch -p0", but should be checked to e.g. make sure they
don't screw up formatting or create circular references.

Coccinelle's diagnostic output (stderr) is piped into .log files.

Linux has a much more elaborate make target of the same name; let's
start nice and easy.

Signed-off-by: Rene Scharfe 
---
 Makefile | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Makefile b/Makefile
index 7f18492..74b2788 100644
--- a/Makefile
+++ b/Makefile
@@ -461,6 +461,7 @@ CURL_CONFIG = curl-config
 PTHREAD_LIBS = -lpthread
 PTHREAD_CFLAGS =
 GCOV = gcov
+SPATCH = spatch
 
 export TCL_PATH TCLTK_PATH
 
@@ -2307,6 +2308,18 @@ check: common-cmds.h
exit 1; \
fi
 
+C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
+%.cocci.patch: %.cocci $(C_SOURCES)
+   @echo '' SPATCH $<; \
+   for f in $(C_SOURCES); do \
+   $(SPATCH) --sp-file $< $$f; \
+   done >$@ 2>$@.log; \
+   if test -s $@; \
+   then \
+   echo '' SPATCH result: $@; \
+   fi
+coccicheck: $(patsubst %.cocci,%.cocci.patch,$(wildcard 
contrib/coccinelle/*.cocci))
+
 ### Installation rules
 
 ifneq ($(filter /%,$(firstword $(template_dir))),)
@@ -2498,6 +2511,7 @@ clean: profile-clean coverage-clean
$(RM) -r $(GIT_TARNAME) .doc-tmp-dir
$(RM) $(GIT_TARNAME).tar.gz git-core_$(GIT_VERSION)-*.tar.gz
$(RM) $(htmldocs).tar.gz $(manpages).tar.gz
+   $(RM) contrib/coccinelle/*.cocci.patch*
$(MAKE) -C Documentation/ clean
 ifndef NO_PERL
$(MAKE) -C gitweb clean
-- 
2.10.0



Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe

Am 15.09.2016 um 21:38 schrieb Jeff King:

On Thu, Sep 15, 2016 at 12:25:43PM -0700, Junio C Hamano wrote:


Silly question: Is there a natural language that uses percent signs
as letters or e.g. instead of commas? :)


I don't know, but if they do, they'd better get used to escaping them.
:)


I do not know either, but I am curious where that question comes
from.  I stared at this patch for a few minutes but couldn't guess.


My initial thought is that the next step after picking this low-hanging
fruit would be to find cases where the strings do not contain "%", and
thus we do not have to care about formatting. But a case like:

  strbuf_addf(&buf, "this does not have any percents!", foo);

is simply broken (albeit in a way that we ignore foo, so it's just ugly
code, not a real bug).

So I dunno. I too am curious.


Take this for example:

-   strbuf_addf(&o->obuf, _("(bad commit)\n"));
+   strbuf_addstr(&o->obuf, _("(bad commit)\n"));

If there's a language that uses percent signs instead of parens or as 
regular letters, then they need to be escaped in the translated string 
before, but not after the patch.  As I wrote: silly.


René


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe
Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Take this for example:
>>
>> -strbuf_addf(&o->obuf, _("(bad commit)\n"));
>> +strbuf_addstr(&o->obuf, _("(bad commit)\n"));
>>
>> If there's a language that uses percent signs instead of parens or as
>> regular letters, then they need to be escaped in the translated string
>> before, but not after the patch.  As I wrote: silly.
> 
> Ahh, OK, so "This use of addf only has format part and nothing else,
> hence the format part can be taken as-is" which is the Coccinelle rule
> used to produce this patch is incomplete and always needs manual
> inspection, in case the format part wanted to give a literal % in
> the output.  E.g. it is a bug to convert this
> 
>   strbuf_addf(&buf, _("this is 100%% wrong!"));
> 
> to
> 
>   strbuf_addstr(&buf, _("this is 100%% wrong!"));

Right.  Such strings seem to be quite rare in practice, though. 

> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
> have some comment to warn the person who builds *.patch file to look
> for % in E2, or something?

Something like this?

---
 contrib/coccinelle/strbuf.cocci | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 7932d48..3f535ca 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -1,3 +1,5 @@
+// Careful, this is not fully equivalent: "%" is no longer treated
+// specially.  Check for "%%", "%m" etc. in the format string (E2).
 @@
 expression E1, E2;
 @@
-- 
2.10.0





Re: Two bugs in --pretty with %C(auto)

2016-09-17 Thread René Scharfe
Am 17.09.2016 um 14:51 schrieb Anatoly Borodin:
> Hi All!
> 
> First bug:
> 
>   git log -3 --pretty='%C(cyan)%C(auto)%h%C(auto)%d %s'
> 
> prints %h with the default color (normal yellow), but
> 
>   git log -3 --pretty='%C(bold cyan)%C(auto)%h%C(auto)%d %s'
> 
> shows %h with bold yellow, as if only the color was reset, but not
> the attributes (blink, ul, reverse also work this way). %d and %s are
> printed with the right color both times.
> 
> Second bug, maybe related to the first one:
> 
>   git log -3 --pretty='%C(bold cyan)%h%C(auto)%d %s %an %h %h %s'
> 
> The first line looks as expected. Well, almost: the '(' of %d is bold
> yellow.
> 
> The second line looks like this:
> 
> * %h, %s, %an with bold cyan;
> * %h with bold yellow;
> * %h with normal yellow and %s with normal white (default colors).
> 
> PS git version 2.9.2

Well, in both cases you could add %Creset before %C(auto) to get what
you want.

I'm not sure how just how automatic %C(auto) is supposed to be, but you
expected it do emit the reset for you, right?  Sounds reasonable to me.
The following patch implements that behavior.

Duy, what do you think?

-- >8 --
Subject: pretty: let %C(auto) reset all attributes

Reset colors and attributes upon %C(auto) to enable full automatic
control over them; otherwise attributes like bold or reverse could
still be in effect from previous %C placeholders.

Signed-off-by: Rene Scharfe 
---
 pretty.c   | 2 ++
 t/t6006-rev-list-format.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/pretty.c b/pretty.c
index 9788bd8..493edb0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1072,6 +1072,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
case 'C':
if (starts_with(placeholder + 1, "(auto)")) {
c->auto_color = want_color(c->pretty_ctx->color);
+   if (c->auto_color)
+   strbuf_addstr(sb, GIT_COLOR_RESET);
return 7; /* consumed 7 bytes, "C(auto)" */
} else {
int ret = parse_color(sb, placeholder, c);
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index a1dcdb8..f6020cd 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -225,7 +225,7 @@ test_expect_success '%C(auto,...) respects --color=auto 
(stdout not tty)' '
 
 test_expect_success '%C(auto) respects --color' '
git log --color --format="%C(auto)%H" -1 >actual &&
-   printf "\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
+   printf "\\033[m\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
test_cmp expect actual
 '
 
-- 
2.10.0




Re: Two bugs in --pretty with %C(auto)

2016-09-18 Thread René Scharfe

Am 18.09.2016 um 14:30 schrieb Anatoly Borodin:

On Sat, Sep 17, 2016 at 8:25 PM, René Scharfe  wrote:

I'm not sure how just how automatic %C(auto) is supposed to be, but you
expected it do emit the reset for you, right?  Sounds reasonable to me.


I don't see a good reason not to do so. Spare some bytes?..


The states for colors and attributes are separate; that's how the bold 
attribute bled into your the auto-colored parts of your output.  You 
could use that property to specify e.g. "give me automatic coloring, but 
reverse it".


This only works by accident now, I think.  Full resets are emitted after 
many placeholders, so attributes don't reach very far in practice.  We'd 
have to be more careful with these full resets if we'd want attributes 
to cover multiple placeholders.  An automatic reset at the start of 
%C(auto) would go into the opposite direction.


René


Re: [PATCH v3 0/4] submodule config lookup API

2015-05-21 Thread René Scharfe

Am 21.05.2015 um 19:06 schrieb Heiko Voigt:

diff --git a/submodule-config.h b/submodule-config.h
index 9061e4e..58afc83 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -24,6 +24,6 @@ const struct submodule *submodule_from_name(const unsigned 
char *commit_sha1,
const char *name);
  const struct submodule *submodule_from_path(const unsigned char *commit_sha1,
const char *path);
-void submodule_free(void);
+void submodule_free();


Why this change?  With void it matches the function's definition.

René

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] grep: use regcomp() for icase search with non-ascii patterns

2015-07-06 Thread René Scharfe

Am 06.07.2015 um 14:42 schrieb Nguyễn Thái Ngọc Duy:

Noticed-by: Plamen Totev 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
  grep.c | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/grep.c b/grep.c
index b58c7c6..48db15a 100644
--- a/grep.c
+++ b/grep.c
@@ -378,7 +378,7 @@ static void free_pcre_regexp(struct grep_pat *p)
  }
  #endif /* !USE_LIBPCRE */

-static int is_fixed(const char *s, size_t len)
+static int is_fixed(const char *s, size_t len, int ignore_icase)
  {
size_t i;

@@ -391,6 +391,13 @@ static int is_fixed(const char *s, size_t len)
for (i = 0; i < len; i++) {
if (is_regex_special(s[i]))
return 0;
+   /*
+* The builtin substring search can only deal with case
+* insensitivity in ascii range. If there is something outside
+* of that range, fall back to regcomp.
+*/
+   if (ignore_icase && (unsigned char)s[i] >= 128)
+   return 0;


How about "isascii(s[i])"?


}

return 1;
@@ -398,18 +405,19 @@ static int is_fixed(const char *s, size_t len)

  static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
  {
+   int ignore_icase = opt->regflags & REG_ICASE || p->ignore_case;
int err;

p->word_regexp = opt->word_regexp;
p->ignore_case = opt->ignore_case;


Using p->ignore_case before this line, as in initialization of the new 
variable ignore_icase above, changes the meaning.




-   if (opt->fixed || is_fixed(p->pattern, p->patternlen))
+   if (opt->fixed || is_fixed(p->pattern, p->patternlen, ignore_icase))
p->fixed = 1;
else
p->fixed = 0;

if (p->fixed) {
-   if (opt->regflags & REG_ICASE || p->ignore_case)
+   if (ignore_case)


ignore_icase instead?  ignore_case is for the config variable 
core.ignorecase.  Tricky.



p->kws = kwsalloc(tolower_trans_tbl);
else
p->kws = kwsalloc(NULL);



So the optimization before this patch was that if a string was searched 
for without -F then it would be treated as a fixed string anyway unless 
it contained regex special characters.  Searching for fixed strings 
using the kwset functions is faster than using regcomp and regexec, 
which makes the exercise worthwhile.


Your patch disables the optimization if non-ASCII characters are 
searched for because kwset handles case transformations only for ASCII 
chars.


Another consequence of this limitation is that -Fi (explicit 
case-insensitive fixed-string search) doesn't work properly with 
non-ASCII chars neither.  How can we handle this one?  Fall back to 
regcomp by escaping all special characters?  Or at least warn?


Tests would be nice. :)

René

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git@vger.kernel.org

2015-07-10 Thread René Scharfe
Signed-off-by: Rene Scharfe 
---
GIT_TEST_CHAIN_LINT complains about the missing &&s and is enabled
by default now.

 t/perf/p5310-pack-bitmaps.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index f8ed857..de2a224 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -39,14 +39,14 @@ test_expect_success 'create partial bitmap state' '
 
# now kill off all of the refs and pretend we had
# just the one tip
-   rm -rf .git/logs .git/refs/* .git/packed-refs
-   git update-ref HEAD $cutoff
+   rm -rf .git/logs .git/refs/* .git/packed-refs &&
+   git update-ref HEAD $cutoff &&
 
# and then repack, which will leave us with a nice
# big bitmap pack of the "old" history, and all of
# the new history will be loose, as if it had been pushed
# up incrementally and exploded via unpack-objects
-   git repack -Ad
+   git repack -Ad &&
 
# and now restore our original tip, as if the pushes
# had happened
-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git@vger.kernel.org

2015-07-11 Thread René Scharfe

Am 10.07.2015 um 22:50 schrieb Jeff King:

Thanks, this definitely is a problem, but we already have a fix in the
sb/p5310-and-chain topic. I thought that had been merged-up, but it
looks like it is only in "next" right now.


All the better.  And I see it's in master now.

René

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] diff: parse ws-error-highlight option more strictly

2015-07-11 Thread René Scharfe
Check if a matched token is followed by a delimiter before advancing the
pointer arg.  This avoids accepting composite words like "allnew" or
"defaultcontext".

Signed-off-by: Rene Scharfe 
---
 diff.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 87b16d5..0f17ec5 100644
--- a/diff.c
+++ b/diff.c
@@ -3653,7 +3653,12 @@ static void enable_patch_output(int *fmt) {
 
 static int parse_one_token(const char **arg, const char *token)
 {
-   return skip_prefix(*arg, token, arg) && (!**arg || **arg == ',');
+   const char *rest;
+   if (skip_prefix(*arg, token, &rest) && (!*rest || *rest == ',')) {
+   *arg = rest;
+   return 1;
+   }
+   return 0;
 }
 
 static int parse_ws_error_highlight(struct diff_options *opt, const char *arg)
-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] introduce CHECKOUT_INIT

2016-09-22 Thread René Scharfe
Add a static initializer for struct checkout and use it throughout the
code base.  It's shorter, avoids a memset(3) call and makes sure the
base_dir member is initialized to a valid (empty) string.

Signed-off-by: Rene Scharfe 
---
 apply.c  | 4 +---
 builtin/checkout-index.c | 2 +-
 builtin/checkout.c   | 3 +--
 cache.h  | 1 +
 unpack-trees.c   | 4 +---
 5 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/apply.c b/apply.c
index e327021..b03d274 100644
--- a/apply.c
+++ b/apply.c
@@ -3334,10 +3334,8 @@ static void prepare_fn_table(struct apply_state *state, 
struct patch *patch)
 static int checkout_target(struct index_state *istate,
   struct cache_entry *ce, struct stat *st)
 {
-   struct checkout costate;
+   struct checkout costate = CHECKOUT_INIT;
 
-   memset(&costate, 0, sizeof(costate));
-   costate.base_dir = "";
costate.refresh_cache = 1;
costate.istate = istate;
if (checkout_entry(ce, &costate, NULL) || lstat(ce->name, st))
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 92c6967..30a49d9 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -16,7 +16,7 @@ static int checkout_stage; /* default to checkout stage0 */
 static int to_tempfile;
 static char topath[4][TEMPORARY_FILENAME_LENGTH + 1];
 
-static struct checkout state;
+static struct checkout state = CHECKOUT_INIT;
 
 static void write_tempfile_record(const char *name, const char *prefix)
 {
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 9941abc..4c86272 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -239,7 +239,7 @@ static int checkout_paths(const struct checkout_opts *opts,
  const char *revision)
 {
int pos;
-   struct checkout state;
+   struct checkout state = CHECKOUT_INIT;
static char *ps_matched;
struct object_id rev;
struct commit *head;
@@ -352,7 +352,6 @@ static int checkout_paths(const struct checkout_opts *opts,
return 1;
 
/* Now we are committed to check them out */
-   memset(&state, 0, sizeof(state));
state.force = 1;
state.refresh_cache = 1;
state.istate = &the_index;
diff --git a/cache.h b/cache.h
index d0494c8..5d9116c 100644
--- a/cache.h
+++ b/cache.h
@@ -1354,6 +1354,7 @@ struct checkout {
 not_new:1,
 refresh_cache:1;
 };
+#define CHECKOUT_INIT { NULL, "" }
 
 #define TEMPORARY_FILENAME_LENGTH 25
 extern int checkout_entry(struct cache_entry *ce, const struct checkout 
*state, char *topath);
diff --git a/unpack-trees.c b/unpack-trees.c
index 3db3f02..ea6bdd2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1094,12 +1094,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
int i, ret;
static struct cache_entry *dfc;
struct exclude_list el;
-   struct checkout state;
+   struct checkout state = CHECKOUT_INIT;
 
if (len > MAX_UNPACK_TREES)
die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-   memset(&state, 0, sizeof(state));
-   state.base_dir = "";
state.force = 1;
state.quiet = 1;
state.refresh_cache = 1;
-- 
2.10.0



[PATCH] git-gui: stop using deprecated merge syntax

2016-09-24 Thread René Scharfe
Starting with v2.5.0 git merge can handle FETCH_HEAD internally and
warns when it's called like 'git merge  HEAD ' because
that syntax is deprecated.  Use this feature in git-gui and get rid of
that warning.

Signed-off-by: Rene Scharfe 
---
Tested only _very_ lightly!

 git-gui/lib/merge.tcl | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/git-gui/lib/merge.tcl b/git-gui/lib/merge.tcl
index 460d32f..5ab6f8f 100644
--- a/git-gui/lib/merge.tcl
+++ b/git-gui/lib/merge.tcl
@@ -112,12 +112,7 @@ method _start {} {
close $fh
set _last_merged_branch $branch
 
-   set cmd [list git]
-   lappend cmd merge
-   lappend cmd --strategy=recursive
-   lappend cmd [git fmt-merge-msg <[gitdir FETCH_HEAD]]
-   lappend cmd HEAD
-   lappend cmd $name
+   set cmd [list git merge --strategy=recursive FETCH_HEAD]
 
ui_status [mc "Merging %s and %s..." $current_branch $stitle]
set cons [console::new [mc "Merge"] "merge $stitle"]
-- 
2.10.0



[PATCH 1/2] add COPY_ARRAY

2016-09-25 Thread René Scharfe
Add COPY_ARRAY, a safe and convenient helper for copying arrays,
complementing ALLOC_ARRAY and REALLOC_ARRAY.  Users just specify source,
destination and the number of elements; the size of an element is
inferred automatically.

It checks if the multiplication of size and element count overflows.
The inferred size is passed first to st_mult, which allows the division
there to be done at compilation time.

As a basic type safety check it makes sure the sizes of source and
destination elements are the same.  That's evaluated at compilation time
as well.

COPY_ARRAY is safe to use with NULL as source pointer iff 0 elements are
to be copied.  That convention is used in some cases for initializing
arrays.  Raw memcpy(3) does not support it -- compilers are allowed to
assume that only valid pointers are passed to it and can optimize away
NULL checks after such a call.

Signed-off-by: Rene Scharfe 
---
 git-compat-util.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 37cce07..91775ce 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -801,6 +801,14 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), 
(alloc)))
 
+#define COPY_ARRAY(dst, src, n) copy_array((dst), (src), (n), sizeof(*(dst)) + 
\
+   BUILD_ASSERT_OR_ZERO(sizeof(*(dst)) == sizeof(*(src
+static inline void copy_array(void *dst, const void *src, size_t n, size_t 
size)
+{
+   if (n)
+   memcpy(dst, src, st_mult(size, n));
+}
+
 /*
  * These functions help you allocate structs with flex arrays, and copy
  * the data directly into the array. For example, if you had:
-- 
2.10.0



[PATCH 6/7] use COPY_ARRAY

2016-09-25 Thread René Scharfe
Add a semantic patch for converting certain calls of memcpy(3) to
COPY_ARRAY() and apply that transformation to the code base.  The result
is
 shorter and safer code.  For now only consider calls where source and
destination have the same type, or in other words: easy cases.

Signed-off-by: Rene Scharfe 
---
 builtin/mv.c   |  2 +-
 commit.c   |  2 +-
 contrib/coccinelle/array.cocci | 26 ++
 pack-revindex.c|  2 +-
 pathspec.c |  3 +--
 split-index.c  |  6 ++
 6 files changed, 32 insertions(+), 9 deletions(-)
 create mode 100644 contrib/coccinelle/array.cocci

diff --git a/builtin/mv.c b/builtin/mv.c
index 446a316..2f43877 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -26,7 +26,7 @@ static const char **internal_copy_pathspec(const char *prefix,
int i;
const char **result;
ALLOC_ARRAY(result, count + 1);
-   memcpy(result, pathspec, count * sizeof(const char *));
+   COPY_ARRAY(result, pathspec, count);
result[count] = NULL;
for (i = 0; i < count; i++) {
int length = strlen(result[i]);
diff --git a/commit.c b/commit.c
index ba6dee3..aada266 100644
--- a/commit.c
+++ b/commit.c
@@ -931,7 +931,7 @@ static int remove_redundant(struct commit **array, int cnt)
}
 
/* Now collect the result */
-   memcpy(work, array, sizeof(*array) * cnt);
+   COPY_ARRAY(work, array, cnt);
for (i = filled = 0; i < cnt; i++)
if (!redundant[i])
array[filled++] = work[i];
diff --git a/contrib/coccinelle/array.cocci b/contrib/coccinelle/array.cocci
new file mode 100644
index 000..2d7f25d
--- /dev/null
+++ b/contrib/coccinelle/array.cocci
@@ -0,0 +1,26 @@
+@@
+type T;
+T *dst;
+T *src;
+expression n;
+@@
+- memcpy(dst, src, n * sizeof(*dst));
++ COPY_ARRAY(dst, src, n);
+
+@@
+type T;
+T *dst;
+T *src;
+expression n;
+@@
+- memcpy(dst, src, n * sizeof(*src));
++ COPY_ARRAY(dst, src, n);
+
+@@
+type T;
+T *dst;
+T *src;
+expression n;
+@@
+- memcpy(dst, src, n * sizeof(T));
++ COPY_ARRAY(dst, src, n);
diff --git a/pack-revindex.c b/pack-revindex.c
index 96d51c3..6bc7c94 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -107,7 +107,7 @@ static void sort_revindex(struct revindex_entry *entries, 
unsigned n, off_t max)
 * we have to move it back from the temporary storage.
 */
if (from != entries)
-   memcpy(entries, tmp, n * sizeof(*entries));
+   COPY_ARRAY(entries, tmp, n);
free(tmp);
free(pos);
 
diff --git a/pathspec.c b/pathspec.c
index 24e0dd5..49a5360 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -485,8 +485,7 @@ void copy_pathspec(struct pathspec *dst, const struct 
pathspec *src)
 {
*dst = *src;
ALLOC_ARRAY(dst->items, dst->nr);
-   memcpy(dst->items, src->items,
-  sizeof(struct pathspec_item) * dst->nr);
+   COPY_ARRAY(dst->items, src->items, dst->nr);
 }
 
 void clear_pathspec(struct pathspec *pathspec)
diff --git a/split-index.c b/split-index.c
index 3c75d4b..35da553 100644
--- a/split-index.c
+++ b/split-index.c
@@ -83,8 +83,7 @@ void move_cache_to_base_index(struct index_state *istate)
si->base->timestamp = istate->timestamp;
ALLOC_GROW(si->base->cache, istate->cache_nr, si->base->cache_alloc);
si->base->cache_nr = istate->cache_nr;
-   memcpy(si->base->cache, istate->cache,
-  sizeof(*istate->cache) * istate->cache_nr);
+   COPY_ARRAY(si->base->cache, istate->cache, istate->cache_nr);
mark_base_index_entries(si->base);
for (i = 0; i < si->base->cache_nr; i++)
si->base->cache[i]->ce_flags &= ~CE_UPDATE_IN_BASE;
@@ -141,8 +140,7 @@ void merge_base_index(struct index_state *istate)
istate->cache   = NULL;
istate->cache_alloc = 0;
ALLOC_GROW(istate->cache, istate->cache_nr, istate->cache_alloc);
-   memcpy(istate->cache, si->base->cache,
-  sizeof(*istate->cache) * istate->cache_nr);
+   COPY_ARRAY(istate->cache, si->base->cache, istate->cache_nr);
 
si->nr_deletions = 0;
si->nr_replacements = 0;
-- 
2.10.0



Re: [PATCH 6/7] use COPY_ARRAY

2016-09-25 Thread René Scharfe

Ha, can't count.  It should be [PATCH 2/2] of course.

René



Re: [PATCH 1/2] add COPY_ARRAY

2016-09-25 Thread René Scharfe

Am 25.09.2016 um 09:41 schrieb Jeff King:

On Sun, Sep 25, 2016 at 09:15:42AM +0200, René Scharfe wrote:

It checks if the multiplication of size and element count overflows.
The inferred size is passed first to st_mult, which allows the division
there to be done at compilation time.


I wonder if this actually stops any real overflows. My goal with
ALLOC_ARRAY, etc, was to catch these at the malloc stage (which is the
really dangerous part, because we don't want to under-allocate). So the
first hunk of your patch is:

ALLOC_ARRAY(result, count + 1);
-   memcpy(result, pathspec, count * sizeof(const char *));
+   COPY_ARRAY(result, pathspec, count);

which clearly cannot trigger the st_mult() check, because we would have
done so in the ALLOC_ARRAY call[1].

Other calls are not so obvious, but in general I would expect the
allocation step to be doing this check. If we missed one, then it's
possible that this macro could detect it and prevent a problem. But it
seems like the wrong time to check. The allocation is buggy, and we'd
have to just get lucky to be using COPY_ARRAY(). And I don't even mean
"lucky that we switched to COPY_ARRAY from memcpy for this callsite".
There are lots of sites that allocate and then fill the array one by
one, without ever computing the full size again. So allocation is the
only sensible place to enforce integer overflow.

So I'm not sold on this providing any real integer overflow safety. But
I do otherwise like it, as it drops the extra "sizeof" which has to
repeat either the variable name or the type).


Well, yes, checking if the destination object is big enough requires 
calculating the size of the source object and thus should include a 
check for multiplication overflow already.  Note the word "should". :) 
If that's the case then a smart compiler could elide the duplicate 
check.  Keeping the check in COPY_ARRAY reduces the assumptions we have 
make about its use as well as any previous checks and doesn't cost much.


René


[PATCH] gitignore: ignore output files of coccicheck make target

2016-09-27 Thread René Scharfe

Signed-off-by: Rene Scharfe 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 05cb58a..f370ba0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -207,6 +207,7 @@
 /tags
 /TAGS
 /cscope*
+/contrib/coccinelle/*.patch*
 *.obj
 *.lib
 *.res
--
2.10.0



[PATCH 1/2] use strbuf_addstr() instead of strbuf_addf() with "%s", part 2

2016-09-27 Thread René Scharfe
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls.  This is shorter and makes the intent clearer.

bc57b9c0cc5a123365a922fa1831177e3fd607ed already converted three cases,
this patch covers two more.

A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.

Signed-off-by: Rene Scharfe 
---
 builtin/submodule--helper.c | 2 +-
 contrib/coccinelle/strbuf.cocci | 6 ++
 submodule.c | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index e3fdc0a..444ec06 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -753,7 +753,7 @@ static int prepare_to_clone_next_submodule(const struct 
cache_entry *ce,
if (suc->recursive_prefix)
strbuf_addf(&sb, "%s/%s", suc->recursive_prefix, 
ce->name);
else
-   strbuf_addf(&sb, "%s", ce->name);
+   strbuf_addstr(&sb, ce->name);
strbuf_addf(out, _("Skipping unmerged submodule %s"), sb.buf);
strbuf_addch(out, '\n');
goto cleanup;
diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 7932d48..4b7553f 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -3,3 +3,9 @@ expression E1, E2;
 @@
 - strbuf_addf(E1, E2);
 + strbuf_addstr(E1, E2);
+
+@@
+expression E1, E2;
+@@
+- strbuf_addf(E1, "%s", E2);
++ strbuf_addstr(E1, E2);
diff --git a/submodule.c b/submodule.c
index 0ef2ff4..dcc5ce3 100644
--- a/submodule.c
+++ b/submodule.c
@@ -396,7 +396,7 @@ static void show_submodule_header(FILE *f, const char *path,
find_unique_abbrev(one->hash, DEFAULT_ABBREV));
if (!fast_backward && !fast_forward)
strbuf_addch(&sb, '.');
-   strbuf_addf(&sb, "%s", find_unique_abbrev(two->hash, DEFAULT_ABBREV));
+   strbuf_addstr(&sb, find_unique_abbrev(two->hash, DEFAULT_ABBREV));
if (message)
strbuf_addf(&sb, " %s%s\n", message, reset);
else
-- 
2.10.0



[PATCH 2/2] use strbuf_add_unique_abbrev() for adding short hashes, part 2

2016-09-27 Thread René Scharfe
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer.  This is shorter and a bit more efficient.

1eb47f167d65d1d305b9c196a1bb40eb96117cb1 already converted six cases,
this patch covers three more.

A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/strbuf.cocci | 6 ++
 diff.c  | 2 +-
 submodule.c | 2 +-
 wt-status.c | 3 +--
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 4b7553f..1e24298 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -9,3 +9,9 @@ expression E1, E2;
 @@
 - strbuf_addf(E1, "%s", E2);
 + strbuf_addstr(E1, E2);
+
+@@
+expression E1, E2, E3;
+@@
+- strbuf_addstr(E1, find_unique_abbrev(E2, E3));
++ strbuf_add_unique_abbrev(E1, E2, E3);
diff --git a/diff.c b/diff.c
index a178ed3..be11e4e 100644
--- a/diff.c
+++ b/diff.c
@@ -3109,7 +3109,7 @@ static void fill_metainfo(struct strbuf *msg,
}
strbuf_addf(msg, "%s%sindex %s..", line_prefix, set,
find_unique_abbrev(one->oid.hash, abbrev));
-   strbuf_addstr(msg, find_unique_abbrev(two->oid.hash, abbrev));
+   strbuf_add_unique_abbrev(msg, two->oid.hash, abbrev);
if (one->mode == two->mode)
strbuf_addf(msg, " %06o", one->mode);
strbuf_addf(msg, "%s\n", reset);
diff --git a/submodule.c b/submodule.c
index dcc5ce3..8cf40ea 100644
--- a/submodule.c
+++ b/submodule.c
@@ -396,7 +396,7 @@ static void show_submodule_header(FILE *f, const char *path,
find_unique_abbrev(one->hash, DEFAULT_ABBREV));
if (!fast_backward && !fast_forward)
strbuf_addch(&sb, '.');
-   strbuf_addstr(&sb, find_unique_abbrev(two->hash, DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(&sb->hash, two, DEFAULT_ABBREV);
if (message)
strbuf_addf(&sb, " %s%s\n", message, reset);
else
diff --git a/wt-status.c b/wt-status.c
index 9628c1d..99d1b0a 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1383,8 +1383,7 @@ static int grab_1st_switch(unsigned char *osha1, unsigned 
char *nsha1,
if (!strcmp(cb->buf.buf, "HEAD")) {
/* HEAD is relative. Resolve it to the right reflog entry. */
strbuf_reset(&cb->buf);
-   strbuf_addstr(&cb->buf,
- find_unique_abbrev(nsha1, DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(&cb->buf, nsha1, DEFAULT_ABBREV);
}
return 1;
 }
-- 
2.10.0




Re: [PATCH] gitignore: ignore output files of coccicheck make target

2016-09-27 Thread René Scharfe
Am 27.09.2016 um 21:52 schrieb Jakub Narębski:
> W dniu 27.09.2016 o 21:01, René Scharfe pisze:
>> Signed-off-by: Rene Scharfe 
>> ---
>>  .gitignore | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/.gitignore b/.gitignore
>> index 05cb58a..f370ba0 100644
>> --- a/.gitignore
>> +++ b/.gitignore
> 
> Wouldn't it be better to have this in contrib/coccinelle/.gitignore?

True.

-- >8 --
Subject: [PATCH v2] gitignore: ignore output files of coccicheck make target

Helped-by: Jakub Narębski 
Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/.gitignore | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 contrib/coccinelle/.gitignore

diff --git a/contrib/coccinelle/.gitignore b/contrib/coccinelle/.gitignore
new file mode 100644
index 000..d3f2964
--- /dev/null
+++ b/contrib/coccinelle/.gitignore
@@ -0,0 +1 @@
+*.patch*
-- 
2.10.0



Re: [PATCH 2/2] use strbuf_add_unique_abbrev() for adding short hashes, part 2

2016-09-27 Thread René Scharfe
Am 27.09.2016 um 22:28 schrieb Junio C Hamano:
> René Scharfe  writes:
>> diff --git a/submodule.c b/submodule.c
>> index dcc5ce3..8cf40ea 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -396,7 +396,7 @@ static void show_submodule_header(FILE *f, const char 
>> *path,
>>  find_unique_abbrev(one->hash, DEFAULT_ABBREV));
>>  if (!fast_backward && !fast_forward)
>>  strbuf_addch(&sb, '.');
>> -strbuf_addstr(&sb, find_unique_abbrev(two->hash, DEFAULT_ABBREV));
>> +strbuf_add_unique_abbrev(&sb->hash, two, DEFAULT_ABBREV);
> 
> I wonder how could this change come out of this definition:
> 
> @@
> expression E1, E2, E3;
> @@
> - strbuf_addstr(E1, find_unique_abbrev(E2, E3));
> + strbuf_add_unique_abbrev(E1, E2, E3);

Impossible.  I added "->hash" manually during a rebase (merging
a0d12c44, wrongly).  Good catch, thanks!

Seeing proof of skipping compile-testing I wonder what else I do
forget in my daily life. :-|  I'll better go to sleep now..

Fixup patch, generated by reverting the diff, re-adding the
semantic patch and using coccicheck; compiles and survives make
test:
---
 submodule.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index 8cf40ea..bb06b60 100644
--- a/submodule.c
+++ b/submodule.c
@@ -396,7 +396,7 @@ static void show_submodule_header(FILE *f, const char *path,
find_unique_abbrev(one->hash, DEFAULT_ABBREV));
if (!fast_backward && !fast_forward)
strbuf_addch(&sb, '.');
-   strbuf_add_unique_abbrev(&sb->hash, two, DEFAULT_ABBREV);
+   strbuf_add_unique_abbrev(&sb, two->hash, DEFAULT_ABBREV);
if (message)
strbuf_addf(&sb, " %s%s\n", message, reset);
else
-- 
2.10.0



[PATCH 2/3] use QSORT

2016-09-29 Thread René Scharfe
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT.  The resulting code is
shorter and supports empty arrays with NULL pointers.

Signed-off-by: Rene Scharfe 
---
Freshly generated using coccicheck, compiles, survives make test.

 bisect.c |  2 +-
 builtin/describe.c   |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fmt-merge-msg.c  |  6 ++
 builtin/index-pack.c |  8 +++-
 builtin/mktree.c |  2 +-
 builtin/name-rev.c   |  3 +--
 builtin/pack-objects.c   |  7 +++
 builtin/remote.c |  3 +--
 diff.c   |  6 +++---
 diffcore-delta.c |  5 +
 diffcore-order.c |  2 +-
 diffcore-rename.c|  2 +-
 dir.c|  4 ++--
 fast-import.c|  4 ++--
 fetch-pack.c |  2 +-
 help.c   | 15 +--
 line-log.c   |  2 +-
 pack-bitmap-write.c  |  3 +--
 pack-check.c |  2 +-
 pack-write.c |  3 +--
 pathspec.c   |  3 +--
 ref-filter.c |  2 +-
 refs/files-backend.c |  2 +-
 server-info.c|  2 +-
 sh-i18n--envsubst.c  |  2 +-
 sha1-array.c |  2 +-
 string-list.c|  2 +-
 t/helper/test-dump-untracked-cache.c |  6 ++
 tree.c   |  3 +--
 30 files changed, 44 insertions(+), 65 deletions(-)

diff --git a/bisect.c b/bisect.c
index 6f512c2..21bc6da 100644
--- a/bisect.c
+++ b/bisect.c
@@ -215,7 +215,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
array[cnt].distance = distance;
cnt++;
}
-   qsort(array, cnt, sizeof(*array), compare_commit_dist);
+   QSORT(array, cnt, compare_commit_dist);
for (p = list, i = 0; i < cnt; i++) {
char buf[100]; /* enough for dist=%d */
struct object *obj = &(array[i].commit->object);
diff --git a/builtin/describe.c b/builtin/describe.c
index 8a25abe..01490a1 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -352,7 +352,7 @@ static void describe(const char *arg, int last_one)
oid_to_hex(oid));
}
 
-   qsort(all_matches, match_cnt, sizeof(all_matches[0]), compare_pt);
+   QSORT(all_matches, match_cnt, compare_pt);
 
if (gave_up_on) {
commit_list_insert_by_date(gave_up_on, &list);
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index c0652a7..1e815b5 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -347,7 +347,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 * Handle files below a directory first, in case they are all deleted
 * and the directory changes to a file or symlink.
 */
-   qsort(q->queue, q->nr, sizeof(q->queue[0]), depth_first);
+   QSORT(q->queue, q->nr, depth_first);
 
for (i = 0; i < q->nr; i++) {
struct diff_filespec *ospec = q->queue[i]->one;
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index dc2e9e4..4976967 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -315,12 +315,10 @@ static void add_people_info(struct strbuf *out,
struct string_list *committers)
 {
if (authors->nr)
-   qsort(authors->items,
- authors->nr, sizeof(authors->items[0]),
+   QSORT(authors->items, authors->nr,
  cmp_string_list_util_as_integral);
if (committers->nr)
-   qsort(committers->items,
- committers->nr, sizeof(committers->items[0]),
+   QSORT(committers->items, committers->nr,
  cmp_string_list_util_as_integral);
 
credit_people(out, authors, 'a');
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4a8b4ae..7657d0a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1190,10 +1190,8 @@ static void resolve_deltas(void)
return;
 
/* Sort deltas by base SHA1/offset for fast searching */
-   qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
- compare_ofs_delta_entry);
-   qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
- compare_ref_delta_entry);
+   QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
+   QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
if (verbose || show_resolving_progress)
progress = start_progress(_("Resolving deltas"),
@@ -1356,7 +1354,7 @@ stat

[PATCH 3/3] remove unnecessary check before QSORT

2016-09-29 Thread René Scharfe
Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.

Signed-off-by: Rene Scharfe 
---
 builtin/fmt-merge-msg.c| 10 --
 contrib/coccinelle/qsort.cocci | 18 ++
 sh-i18n--envsubst.c|  3 +--
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 4976967..efab62f 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -314,12 +314,10 @@ static void add_people_info(struct strbuf *out,
struct string_list *authors,
struct string_list *committers)
 {
-   if (authors->nr)
-   QSORT(authors->items, authors->nr,
- cmp_string_list_util_as_integral);
-   if (committers->nr)
-   QSORT(committers->items, committers->nr,
- cmp_string_list_util_as_integral);
+   QSORT(authors->items, authors->nr,
+ cmp_string_list_util_as_integral);
+   QSORT(committers->items, committers->nr,
+ cmp_string_list_util_as_integral);
 
credit_people(out, authors, 'a');
credit_people(out, committers, 'c');
diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
index a094e7c..22b93a9 100644
--- a/contrib/coccinelle/qsort.cocci
+++ b/contrib/coccinelle/qsort.cocci
@@ -17,3 +17,21 @@ expression nmemb, compar;
 @@
 - qsort(base, nmemb, sizeof(T), compar);
 + QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb)
+QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 0)
+QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 1)
+QSORT(base, nmemb, compar);
diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c
index 3637a2a..c3a2b5a 100644
--- a/sh-i18n--envsubst.c
+++ b/sh-i18n--envsubst.c
@@ -230,8 +230,7 @@ cmp_string (const void *pstr1, const void *pstr2)
 static inline void
 string_list_sort (string_list_ty *slp)
 {
-  if (slp->nitems > 0)
-QSORT(slp->item, slp->nitems, cmp_string);
+  QSORT(slp->item, slp->nitems, cmp_string);
 }
 
 /* Test whether a sorted string list contains a given string.  */
-- 
2.10.0



[PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.

Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks.  Using the macro avoids such surprises.

Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/qsort.cocci | 19 +++
 git-compat-util.h  |  8 
 2 files changed, 27 insertions(+)
 create mode 100644 contrib/coccinelle/qsort.cocci

diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
new file mode 100644
index 000..a094e7c
--- /dev/null
+++ b/contrib/coccinelle/qsort.cocci
@@ -0,0 +1,19 @@
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(*base), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(base[0]), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+type T;
+T *base;
+expression nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(T), compar);
++ QSORT(base, nmemb, compar);
diff --git a/git-compat-util.h b/git-compat-util.h
index 8aab0c3..d7ed137 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #define qsort git_qsort
 #endif
 
+#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar)
+static void inline sane_qsort(void *base, size_t nmemb, size_t size,
+ int(*compar)(const void *, const void *))
+{
+   if (nmemb > 1)
+   qsort(base, nmemb, size, compar);
+}
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.10.0



Re: Two bugs in --pretty with %C(auto)

2016-09-29 Thread René Scharfe
Am 17.09.2016 um 20:25 schrieb René Scharfe:
> diff --git a/pretty.c b/pretty.c
> index 9788bd8..493edb0 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -1072,6 +1072,8 @@ static size_t format_commit_one(struct strbuf *sb, /* 
> in UTF-8 */
>   case 'C':
>   if (starts_with(placeholder + 1, "(auto)")) {
>   c->auto_color = want_color(c->pretty_ctx->color);
> + if (c->auto_color)
> + strbuf_addstr(sb, GIT_COLOR_RESET);
>   return 7; /* consumed 7 bytes, "C(auto)" */
>   } else {
>   int ret = parse_color(sb, placeholder, c);

We could optimize this a bit (see below).  I can't think of a downside;
someone adding a prefix would be responsible for adding a reset as well
if needed, right?

-- >8 --
Subject: [PATCH] pretty: avoid adding reset for %C(auto) if output is empty

We emit an escape sequence for resetting color and attribute for
%C(auto) to make sure automatic coloring is displayed as intended.
Stop doing that if the output strbuf is empty, i.e. when %C(auto)
appears at the start of the format string, because then there is no
need for a reset and we save a few bytes in the output.

Signed-off-by: Rene Scharfe 
---
Reverts the change to t6006, so we'd need another test for this.
Anatoly? :)

 pretty.c   | 2 +-
 t/t6006-rev-list-format.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/pretty.c b/pretty.c
index 493edb0..25efbca 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1072,7 +1072,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
case 'C':
if (starts_with(placeholder + 1, "(auto)")) {
c->auto_color = want_color(c->pretty_ctx->color);
-   if (c->auto_color)
+   if (c->auto_color && sb->len)
strbuf_addstr(sb, GIT_COLOR_RESET);
return 7; /* consumed 7 bytes, "C(auto)" */
} else {
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index f6020cd..a1dcdb8 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -225,7 +225,7 @@ test_expect_success '%C(auto,...) respects --color=auto 
(stdout not tty)' '
 
 test_expect_success '%C(auto) respects --color' '
git log --color --format="%C(auto)%H" -1 >actual &&
-   printf "\\033[m\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
+   printf "\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
test_cmp expect actual
 '
 
-- 
2.10.0



Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>> size of the array elements and supports the convention of initializing
>> empty arrays with a NULL pointer, which we use in some places.
>>
>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>> an element count of zero -- and allows the compiler to optimize away any
>> following NULL checks.  Using the macro avoids such surprises.
>>
>> Add a semantic patch as well to demonstrate the macro's usage and to
>> automate the transformation of trivial cases.
>>
>> Signed-off-by: Rene Scharfe 
>> ---
>>  contrib/coccinelle/qsort.cocci | 19 +++
>>  git-compat-util.h  |  8 
>>  2 files changed, 27 insertions(+)
>>  create mode 100644 contrib/coccinelle/qsort.cocci
> 
> The direct calls to qsort(3) that this series leaves behind are
> interesting.
> 
> 1. builtin/index-pack.c has this:
> 
>   if (1 < opts->anomaly_nr)
>   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
> cmp_uint32);
> 
> where opts->anomaly is coming from pack.h:
> 
> struct pack_idx_option {
> unsigned flags;
> ...
> int anomaly_alloc, anomaly_nr;
> uint32_t *anomaly;
> };
> 
> I cannot quite see how the automated conversion misses it?  It's not
> like base and nmemb are type-restricted in the rule (they are both
> just "expression"s).
> 
> 2. builtin/shortlog.c has this:
> 
>   qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
> log->summary ? compare_by_counter : compare_by_list);
> 
> where log->list is coming from shortlog.h:
> 
> struct shortlog {
> struct string_list list;
> };
> 
> and string-list.h says:
> 
> struct string_list {
> struct string_list_item *items;
> unsigned int nr, alloc;
> ...
> };
> 
> which seems to be a good candidate for this rule:
> 
> type T;
> T *base;
> expression nmemb, compar;
> @@
> - qsort(base, nmemb, sizeof(T), compar);
> + QSORT(base, nmemb, compar);
> 
> if we take "T == struct string_list_item".

Transformations for these two are generated if we pass --all-includes
to spatch.  So let's do that.

-- >8 --
Subject: [PATCH] coccicheck: use --all-includes by default

Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default.  This option lets it consider
header files which would otherwise be ignored.  That's important for
some rules that rely on type information.  It doubles the duration of
coccicheck, however.

Signed-off-by: Rene Scharfe 
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 1aad150..d15bf8d 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,7 @@ SPATCH = spatch
 export TCL_PATH TCLTK_PATH
 
 SPARSE_FLAGS =
+SPATCH_FLAGS = --all-includes
 
 
 
@@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
done >$@ 2>$@.log; \
if test -s $@; \
then \
-- 
2.10.0



Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 01:21 schrieb René Scharfe:
> Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
>> René Scharfe  writes:
>>
>>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>>> size of the array elements and supports the convention of initializing
>>> empty arrays with a NULL pointer, which we use in some places.
>>>
>>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>>> an element count of zero -- and allows the compiler to optimize away any
>>> following NULL checks.  Using the macro avoids such surprises.
>>>
>>> Add a semantic patch as well to demonstrate the macro's usage and to
>>> automate the transformation of trivial cases.
>>>
>>> Signed-off-by: Rene Scharfe 
>>> ---
>>>  contrib/coccinelle/qsort.cocci | 19 +++
>>>  git-compat-util.h  |  8 
>>>  2 files changed, 27 insertions(+)
>>>  create mode 100644 contrib/coccinelle/qsort.cocci
>>
>> The direct calls to qsort(3) that this series leaves behind are
>> interesting.
>>
>> 1. builtin/index-pack.c has this:
>>
>>  if (1 < opts->anomaly_nr)
>>  qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
>> cmp_uint32);
>>
>> where opts->anomaly is coming from pack.h:
>>
>> struct pack_idx_option {
>> unsigned flags;
>> ...
>> int anomaly_alloc, anomaly_nr;
>> uint32_t *anomaly;
>> };
>>
>> I cannot quite see how the automated conversion misses it?  It's not
>> like base and nmemb are type-restricted in the rule (they are both
>> just "expression"s).
>>
>> 2. builtin/shortlog.c has this:
>>
>>  qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
>>log->summary ? compare_by_counter : compare_by_list);
>>
>> where log->list is coming from shortlog.h:
>>
>> struct shortlog {
>> struct string_list list;
>> };
>>
>> and string-list.h says:
>>
>> struct string_list {
>> struct string_list_item *items;
>> unsigned int nr, alloc;
>> ...
>> };
>>
>> which seems to be a good candidate for this rule:
>>
>> type T;
>> T *base;
>> expression nmemb, compar;
>> @@
>> - qsort(base, nmemb, sizeof(T), compar);
>> + QSORT(base, nmemb, compar);
>>
>> if we take "T == struct string_list_item".
> 
> Transformations for these two are generated if we pass --all-includes
> to spatch.  So let's do that.

And here's the result:

-- >8 --
Subject: [PATCH] use QSORT, part 2

Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.

Signed-off-by: Rene Scharfe 
---
Squashable.

 builtin/index-pack.c | 3 +--
 builtin/shortlog.c   | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7657d0a..0a27bab 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git 
*p,
opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]);
}
 
-   if (1 < opts->anomaly_nr)
-   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
cmp_uint32);
+   QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32);
 }
 
 static void read_idx_option(struct pack_idx_option *opts, const char 
*pack_name)
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 25fa8a6..ba0e115 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log)
struct strbuf sb = STRBUF_INIT;
 
if (log->sort_by_number)
-   qsort(log->list.items, log->list.nr, sizeof(struct 
string_list_item),
+   QSORT(log->list.items, log->list.nr,
  log->summary ? compare_by_counter : compare_by_list);
for (i = 0; i < log->list.nr; i++) {
const struct string_list_item *item = &log->list.items[i];
-- 
2.10.0




Re: [PATCH 1/3] add QSORT

2016-10-01 Thread René Scharfe
Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> 3. builtin/show-branch.c does this:
> 
> qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
>   compare_ref_name);
> 
> where ref_name[] is a file-scope global:
> 
> static char *ref_name[MAX_REVS + 1];
> 
> and top and bottom are plain integers.  The sizeof() does not take
> the size of *base, so it is understandable that this does not get
> automatically converted.
> 
> It seems that some calls to this function _could_ send the same top
> and bottom, asking for 0 element array to be sorted, by the way.

It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

This optimization is dangerous when combined with the convention of
using a NULL pointer for empty arrays.  Diagnosing an affected NULL
check is probably quite hard -- it's right there in the code after all
and not all compilers remove it.

builtin/show-branch.c never passes NULL, so it's not affected by that
hazard.  We can (and should, IMHO) still use QSORT there for
consistency and convenience, though:

-- >8 --
Subject: [PATCH] show-branch: use QSORT

Shorten the code by using QSORT instead of calling qsort(3) directly,
as the former determines the element size automatically and checks if
there are at least two elements to sort already.

Signed-off-by: Rene Scharfe 
---
 builtin/show-branch.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 623ca56..974f340 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -353,8 +353,7 @@ static int compare_ref_name(const void *a_, const void *b_)
 
 static void sort_ref_range(int bottom, int top)
 {
-   qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
- compare_ref_name);
+   QSORT(ref_name + bottom, top - bottom, compare_ref_name);
 }
 
 static int append_ref(const char *refname, const struct object_id *oid,
@@ -540,8 +539,7 @@ static void append_one_rev(const char *av)
if (saved_matches == ref_name_cnt &&
ref_name_cnt < MAX_REVS)
error(_("no matching refs with %s"), av);
-   if (saved_matches + 1 < ref_name_cnt)
-   sort_ref_range(saved_matches, ref_name_cnt);
+   sort_ref_range(saved_matches, ref_name_cnt);
return;
}
die("bad sha1 reference %s", av);
-- 
2.10.0




Re: [PATCH 2/6] sha1_file: always allow relative paths to alternates

2016-10-02 Thread René Scharfe
Am 30.09.2016 um 21:36 schrieb Jeff King:
> We adjust the test script here to demonstrate that this now
> works. Unfortunately, we can't demonstrate that the
> duplicate is suppressed, since it has no user-visible
> behavior (it's just one less place for our object lookups to
> go). But you can verify it manually via gdb, with something
> like:
> 
> for i in a b c; do
> git init --bare $i
> blob=$(echo $i | git -C $i hash-object -w --stdin)
> done
> echo "../../b/objects" >a/objects/info/alternates
> echo "../../c/objects" >>a/objects/info/alternates
> echo "../../c/objects" >b/objects/info/alternates
> gdb --args git cat-file -e $blob
> 
> After prepare_alt_odb() runs, we have only a single copy of
> "/path/to/c/objects/" in the alt_odb list.

A better way would be to provide a UI for that.  We could easily bolt
it to the side of count-objects like in the patch below.  Feels a bit
hackish, though.

Per-ODB counting would be nice, but a lot more involved, I guess
(didn't try).


diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..b2afe36 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -73,6 +73,12 @@ static int count_cruft(const char *basename, const char 
*path, void *data)
return 0;
 }
 
+static int print_alt_odb(struct alternate_object_database *alt, void *data)
+{
+   puts(alt->base);
+   return 0;
+}
+
 static char const * const count_objects_usage[] = {
N_("git count-objects [-v] [-H | --human-readable]"),
NULL
@@ -81,10 +87,13 @@ static char const * const count_objects_usage[] = {
 int cmd_count_objects(int argc, const char **argv, const char *prefix)
 {
int human_readable = 0;
+   int list_alt_odb = 0;
struct option opts[] = {
OPT__VERBOSE(&verbose, N_("be verbose")),
OPT_BOOL('H', "human-readable", &human_readable,
 N_("print sizes in human readable format")),
+   OPT_BOOL(0, "list-alternates", &list_alt_odb,
+N_("print list of alternate object databases")),
OPT_END(),
};
 
@@ -92,6 +101,12 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
/* we do not take arguments other than flags for now */
if (argc)
usage_with_options(count_objects_usage, opts);
+
+   if (list_alt_odb) {
+   foreach_alt_odb(print_alt_odb, NULL);
+   return 0;
+   }
+
if (verbose) {
report_garbage = real_report_garbage;
report_linked_checkout_garbage();



Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-10-02 Thread René Scharfe
Am 15.09.2016 um 23:39 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
>>> René Scharfe  writes:
>>>
>>>> Take this for example:
>>>>
>>>> -  strbuf_addf(&o->obuf, _("(bad commit)\n"));
>>>> +  strbuf_addstr(&o->obuf, _("(bad commit)\n"));
>>>>
>>>> If there's a language that uses percent signs instead of parens or as
>>>> regular letters, then they need to be escaped in the translated string
>>>> before, but not after the patch.  As I wrote: silly.
>>>
>>> Ahh, OK, so "This use of addf only has format part and nothing else,
>>> hence the format part can be taken as-is" which is the Coccinelle rule
>>> used to produce this patch is incomplete and always needs manual
>>> inspection, in case the format part wanted to give a literal % in
>>> the output.  E.g. it is a bug to convert this
>>>
>>> strbuf_addf(&buf, _("this is 100%% wrong!"));
>>>
>>> to
>>>
>>> strbuf_addstr(&buf, _("this is 100%% wrong!"));
>>
>> Right.  Such strings seem to be quite rare in practice, though. 
>>
>>> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
>>> have some comment to warn the person who builds *.patch file to look
>>> for % in E2, or something?
>>
>> Something like this?
> 
> Yup, with something like that I would understdood where that
> puzzling question came from.

Here's something better than a comment:

-- >8 --
Subject: [PATCH] coccicheck: make transformation for strbuf_addf(sb, "...") 
more precise

We can replace strbuf_addf() calls that just add a simple string with
calls to strbuf_addstr() to make the intent clearer.  We need to be
careful if that string contains printf format specifications like %%,
though, as a simple replacement would change the output.

Add checks to the semantic patch to make sure we only perform the
transformation if the second argument is a string constant (possibly
translated) that doesn't contain any percent signs.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/strbuf.cocci | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 1e24298..63995f2 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -1,8 +1,31 @@
+@ strbuf_addf_with_format_only @
+expression E;
+constant fmt;
 @@
-expression E1, E2;
+  strbuf_addf(E,
+(
+  fmt
+|
+  _(fmt)
+)
+  );
+
+@ script:python @
+fmt << strbuf_addf_with_format_only.fmt;
 @@
-- strbuf_addf(E1, E2);
-+ strbuf_addstr(E1, E2);
+cocci.include_match("%" not in fmt)
+
+@ extends strbuf_addf_with_format_only @
+@@
+- strbuf_addf
++ strbuf_addstr
+  (E,
+(
+  fmt
+|
+  _(fmt)
+)
+  );
 
 @@
 expression E1, E2;
-- 
2.10.0


Re: [PATCH] git-gui: stop using deprecated merge syntax

2016-10-03 Thread René Scharfe

Am 03.10.2016 um 10:30 schrieb Pat Thoyts:

The only problem I see here is that generally git-gui tries to continue
to work with older versions of git as well. So adding a guard using the
git-version procedure should maintain that backwards compatibility.


Makes sense for a stand-alone tool.


I suggest:

From c2716458f05893ca88c05ce211a295a330e74590 Mon Sep 17 00:00:00 2001
From:  René Scharfe 
Date: Sat, 24 Sep 2016 13:30:22 +0200
Subject: [PATCH] git-gui: stop using deprecated merge syntax

Starting with v2.5.0 git merge can handle FETCH_HEAD internally and
warns when it's called like 'git merge  HEAD ' because
that syntax is deprecated.  Use this feature in git-gui and get rid of
that warning.

Tested-by: Johannes Sixt 
Reviewed-by: Stefan Beller 
Signed-off-by: Rene Scharfe 
Signed-off-by: Pat Thoyts 


OK, but perhaps move me from From: to Original-patch-by: as the version 
check is a big enough change in itself.  Or add a separate commit for 
it.  Or at least mention that you added the check in the commit message.


Thanks,
René


Re: [PATCH 1/3] add QSORT

2016-10-03 Thread René Scharfe

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault is 
to be expected in such cases, and thus NULL checks can be removed safely.



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays 
with NULL wouldn't segfault anymore, and thus become harder to identify 
as the programming errors they are.


The intention is to support NULL pointers only for empty arrays (in 
addition to valid pointers).  That we also support NULL pointers for 
arrays with a single member might be considered to be the result of a 
premature optimization, but it should be safe -- the compiler won't 
remove checks unexpectedly.


Does that make sense (it's getting late here, so my logic might already 
be resting..)?


René


Re: [PATCH 1/3] add QSORT

2016-10-04 Thread René Scharfe

Am 04.10.2016 um 07:28 schrieb Kevin Bracey:

On 04/10/2016 01:00, René Scharfe wrote:

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault
is to be expected in such cases, and thus NULL checks can be removed
safely.


Possibly true in practice.

But technically wrong by the C standard - behaviour is undefined if the
qsort pointer is invalid. You can't formally expect the defined
behaviour of a segfault when sending NULL into qsort. (Hell, maybe the
qsort has its own NULL check and silently returns!


A qsort(3) implementation that doesn't segfault is inconvenient, but 
still safe.  I'm more concerned about NULL checks being removed from our 
code.



So if it's not a program error for array to be NULL and nmemb to be zero
in your code, and you want a diagnostic for array=NULL, nmemb non-zero,
I think you should put that diagnostic into sane_qsort as an assert or
something, not rely on qsort's undefined behaviour being a segfault.

sane_qsort(blah)
{
 if (nmemb >= 1) {
 assert(array);
 qsort(array, nmemb, ...);
 }
}

Can't invoke undefined behaviour from NULL without triggering the
assert. (Could still have other invalid pointers, of course).


We could do that, but I think it's not necessary.  We'd get a segfault 
when accessing the sorted array anyway.  (If we don't look at the data 
after sorting then we can get rid of the sorting step altogether.)



Usually I am on the side of "no NULL checks", as I make the assumption
that we will get a segfault as soon as NULL pointers are used, and those
are generally easy to diagnose. But seeing a compiler invoking this sort
of new trickery due to invoking undefined behaviour is making me more
nervous about doing so...


I was shocked a bit myself when I learned about this, but let's not 
panic. :)



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays
with NULL wouldn't segfault anymore, and thus become harder to
identify as the programming errors they are.

Well, you get the print. Although I guess you're worrying about the
second if being real code, not a debugging check.


Yes, but the optimization is valid: If nmemb > 0 then array can only be 
NULL if we have a bug, and then we'd get a segfault eventually.  So such 
checks can be removed safely.



I must say, this is quite a courageous new optimisation from GCC. It
strikes me as finding a language lawyer loophole that seems to have been
intended for something else (mapping library functions directly onto
CISCy CPU intrinsics), and using it to invent a whole new optimisation
that seems more likely to trigger bugs than optimise any significant
amount of code in a desirable way.


Yeah, and the bugs triggered are quite obscure in this case.  But having 
richer type information and thus restricting the range of possible 
values for variables *can* enable useful optimizations.



Doubly weird as there's no (standard) language support for this. I don't
know how you'd define "my_qsort" that triggered the same optimisations.


The nonnull attribute is a GCC extension, but it's also supported by clang:

  http://clang.llvm.org/docs/AttributeReference.html#nonnull-gnu-nonnull

I don't know if other compilers support it as well, or if there are 
efforts underway to standardize it.



I've seen similar
library-knowledge-without-any-way-to-reproduce-in-user-code
optimisations like "malloc returns a new pointer that doesn't alias with
anything existing" (and no way to reproduce the optimisation with
my_malloc_wrapper). But those seemed to have a clear performance
benefit, without any obvious traps. Doubtful about this one.


Still we have to deal with it..

So let's summarize; here's the effect of a raw qsort(3) call:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  is skipped
0 >0  no   qsort  is skipped
1  0  no   qsort  is skipped (bad!)
1 >0  yes  qsort  is skipped

Here's what the current implementation (nmemb > 1) does:

array == NULL  nmemb  bug  QSORT

Re: [PATCH 16/18] count-objects: report alternates via verbose mode

2016-10-05 Thread René Scharfe

Am 03.10.2016 um 22:36 schrieb Jeff King:

There's no way to get the list of alternates that git
computes internally; our tests only infer it based on which
objects are available. In addition to testing, knowing this
list may be helpful for somebody debugging their alternates
setup.

Let's add it to the "count-objects -v" output. We could give
it a separate flag, but there's not really any need.
"count-objects -v" is already a debugging catch-all for the
object database, its output is easily extensible to new data
items, and printing the alternates is not expensive (we
already had to find them to count the objects).


Good idea.


Signed-off-by: Jeff King 
---
 Documentation/git-count-objects.txt |  5 +
 builtin/count-objects.c | 10 ++
 t/t5613-info-alternate.sh   | 10 ++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/git-count-objects.txt 
b/Documentation/git-count-objects.txt
index 2ff3568..cb9b4d2 100644
--- a/Documentation/git-count-objects.txt
+++ b/Documentation/git-count-objects.txt
@@ -38,6 +38,11 @@ objects nor valid packs
 +
 size-garbage: disk space consumed by garbage files, in KiB (unless -H is
 specified)
++
+alternate: absolute path of alternate object databases; may appear
+multiple times, one line per path. Note that if the path contains
+non-printable characters, it may be surrounded by double-quotes and
+contain C-style backslashed escape sequences.

 -H::
 --human-readable::
diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..a700409 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -8,6 +8,7 @@
 #include "dir.h"
 #include "builtin.h"
 #include "parse-options.h"
+#include "quote.h"

 static unsigned long garbage;
 static off_t size_garbage;
@@ -73,6 +74,14 @@ static int count_cruft(const char *basename, const char 
*path, void *data)
return 0;
 }

+static int print_alternate(struct alternate_object_database *alt, void *data)
+{
+   printf("alternate: ");
+   quote_c_style(alt->path, NULL, stdout, 0);
+   putchar('\n');
+   return 0;
+}


Yeah, quoting paths makes sense.


+
 static char const * const count_objects_usage[] = {
N_("git count-objects [-v] [-H | --human-readable]"),
NULL
@@ -140,6 +149,7 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
printf("prune-packable: %lu\n", packed_loose);
printf("garbage: %lu\n", garbage);
printf("size-garbage: %s\n", garbage_buf.buf);
+   foreach_alt_odb(print_alternate, NULL);
strbuf_release(&loose_buf);
strbuf_release(&pack_buf);
strbuf_release(&garbage_buf);
diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh
index b393613..74f6770 100755
--- a/t/t5613-info-alternate.sh
+++ b/t/t5613-info-alternate.sh
@@ -39,6 +39,16 @@ test_expect_success 'preparing third repository' '
)
 '

+test_expect_success 'count-objects shows the alternates' '
+   cat >expect <<-EOF &&
+   alternate: $(pwd)/B/.git/objects
+   alternate: $(pwd)/A/.git/objects
+   EOF
+   git -C C count-objects -v >actual &&
+   grep ^alternate: actual >actual.alternates &&
+   test_cmp expect actual.alternates
+'
+
 # Note: These tests depend on the hard-coded value of 5 as "too deep". We start
 # the depth at 0 and count links, not repositories, so in a chain like:
 #



Re: [PATCH 0/18] alternate object database cleanups

2016-10-05 Thread René Scharfe

Am 03.10.2016 um 22:33 schrieb Jeff King:

This series is the result of René nerd-sniping me with the claim that we
could "easily" teach count-objects to print out the list of alternates
in:

  http://public-inbox.org/git/c27dc1a4-3c7a-2866-d9d8-f5d3eb161...@web.de/


1. Send crappy patch
2. 
3. PROFIT!!!

Sometimes it works. :)

Thank you!
René


Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors

2016-10-05 Thread René Scharfe

Am 03.10.2016 um 22:34 schrieb Jeff King:

When we add a new alternate to the list, we try to normalize
out any redundant "..", etc. However, we do not look at the
return value of normalize_path_copy(), and will happily
continue with a path that could not be normalized. Worse,
the normalizing process is done in-place, so we are left
with whatever half-finished working state the normalizing
function was in.

Fortunately, this cannot cause us to read past the end of
our buffer, as that working state will always leave the
NUL from the original path in place. And we do tend to
notice problems when we check is_directory() on the path.
But you can see the nonsense that we feed to is_directory
with an entry like:

  this/../../is/../../way/../../too/../../deep/../../to/../../resolve

in your objects/info/alternates, which yields:

  error: object directory
  
/to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve
  does not exist; check .git/objects/info/alternates.

We can easily fix this just by checking the return value.
But that makes it hard to generate a good error message,
since we're normalizing in-place and our input value has
been overwritten by cruft.

Instead, let's provide a strbuf helper that does an in-place
normalize, but restores the original contents on error. This
uses a second buffer under the hood, which is slightly less
efficient, but this is not a performance-critical code path.


Hmm, in-place functions are quite rare in the strbuf collection.  It 
looks like a good fit for the two callers and makes sense in general, 
though.


Re: [PATCH 2/2] use strbuf_add_unique_abbrev() for adding short hashes, part 2

2016-10-07 Thread René Scharfe

Am 07.10.2016 um 02:46 schrieb Jeff King:

On Tue, Sep 27, 2016 at 09:11:58PM +0200, René Scharfe wrote:


Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer.  This is shorter and a bit more efficient.
[...]
diff --git a/diff.c b/diff.c
index a178ed3..be11e4e 100644
--- a/diff.c
+++ b/diff.c
@@ -3109,7 +3109,7 @@ static void fill_metainfo(struct strbuf *msg,
}
strbuf_addf(msg, "%s%sindex %s..", line_prefix, set,
find_unique_abbrev(one->oid.hash, abbrev));
-   strbuf_addstr(msg, find_unique_abbrev(two->oid.hash, abbrev));
+   strbuf_add_unique_abbrev(msg, two->oid.hash, abbrev);
if (one->mode == two->mode)
strbuf_addf(msg, " %06o", one->mode);
strbuf_addf(msg, "%s\n", reset);


This one is an interesting case, and maybe a good example of why blind
coccinelle usage can have some pitfalls. :)


Thank you for paying attention. :)  In general I agree that the 
surrounding code of such changes should be checked; the issue at hand 
could be part of a bigger problem.



We get rid of the strbuf_addstr(), but notice that we leave untouched
the find_unique_abbrev() call immediately above. There was a symmetry to
the two that has been lost.

Probably either:

  strbuf_addf(msg, "%s%sindex %s..%s", line_prefix, set
find_unique_abbrev(one->oid.hash, abbrev),
find_unique_abbrev(two->oid.hash, abbrev));

or:

  strbuf_addf(msg, "%s%sindex ", line_prefix, set);
  strbuf_add_unique_abbrev(msg, one->oid.hash, abbrev);
  strbuf_addstr(msg, "..");
  strbuf_add_unique_abbrev(msg, two->oid.hash, abbrev);

would be a more appropriate refactoring. The problem is in the original
patch (which also lacks symmetry; either this predates the multi-buffer
find_unique_abbrev, or the original author didn't know about it), but I
think your refactor makes it slightly worse.


I still think the automatically generated patch is a net win, but we 
shouldn't stop there.



I noticed because I have another series which touches these lines, and
it wants to symmetrically swap out find_unique_abbrev for something
else. :) I don't think it's a big enough deal to switch now (and I've
already rebased my series which will touch these lines), but I wanted to
mention it as a thing to watch out for as we do more of these kinds of
automated transformations.


OK, then I'll wait for that series to land.


--- a/submodule.c
+++ b/submodule.c
@@ -396,7 +396,7 @@ static void show_submodule_header(FILE *f, const char *path,
find_unique_abbrev(one->hash, DEFAULT_ABBREV));
if (!fast_backward && !fast_forward)
strbuf_addch(&sb, '.');
-   strbuf_addstr(&sb, find_unique_abbrev(two->hash, DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(&sb->hash, two, DEFAULT_ABBREV);


This one is a similar situation, I think.


Yes, and there are some more.  Will take a look.

I don't know how to crack printf-style formats using semantic patches. 
It's easy for fixed formats (silly example):


- strbuf_addf(sb, "%s%s", a, b);
+ strbuf_addf(sb, "%s", a);
+ strbuf_addf(sb, "%s", b);

But how to do that for arbitrary formats?  Probably not worth it..

René


Re: [PATCH] remote.c: free previous results when looking for a ref match

2016-10-08 Thread René Scharfe

Am 08.10.2016 um 01:58 schrieb Stefan Beller:

Signed-off-by: Stefan Beller 
---
 remote.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/remote.c b/remote.c
index ad6c542..5f9afb4 100644
--- a/remote.c
+++ b/remote.c
@@ -833,6 +833,8 @@ static int match_name_with_pattern(const char *key, const 
char *name,
strbuf_add(&sb, value, vstar - value);
strbuf_add(&sb, name + klen, namelen - klen - ksuffixlen);
strbuf_addstr(&sb, vstar + 1);
+   if (*result)
+   free(*result);


free(3) can handle NULL pointers; this check is not necessary.

Is it wise to release memory for callers?  I'd expect them to be 
responsible for that.  Some of them can pass uninitialized pointers; 
this is not allowed anymore after the change.



*result = strbuf_detach(&sb, NULL);
}
return ret;
@@ -1262,6 +1264,8 @@ static char *get_ref_match(const struct refspec *rs, int 
rs_nr, const struct ref
 */
if (!send_mirror && !starts_with(ref->name, "refs/heads/"))
return NULL;
+   if (name)
+   free(name);


Again, this check is not necessary.  If I read the code correctly the 
pointer could be uninitialized at that point, though, causing free(3) to 
crash.



name = xstrdup(ref->name);
}
if (ret_pat)





[PATCH] remove unnecessary NULL check before free(3)

2016-10-08 Thread René Scharfe
free(3) handles NULL pointers just fine.  Add a semantic patch for
removing unnecessary NULL checks before calling this function, and
apply it on the code base.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/free.cocci | 5 +
 parse-options-cb.c| 3 +--
 2 files changed, 6 insertions(+), 2 deletions(-)
 create mode 100644 contrib/coccinelle/free.cocci

diff --git a/contrib/coccinelle/free.cocci b/contrib/coccinelle/free.cocci
new file mode 100644
index 000..e282131
--- /dev/null
+++ b/contrib/coccinelle/free.cocci
@@ -0,0 +1,5 @@
+@@
+expression E;
+@@
+- if (E)
+  free(E);
diff --git a/parse-options-cb.c b/parse-options-cb.c
index b5d9209..b7d8f7d 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -211,8 +211,7 @@ int parse_opt_passthru(const struct option *opt, const char 
*arg, int unset)
if (recreate_opt(&sb, opt, arg, unset) < 0)
return -1;
 
-   if (*opt_value)
-   free(*opt_value);
+   free(*opt_value);
 
*opt_value = strbuf_detach(&sb, NULL);
 
-- 
2.10.0



[PATCH] use strbuf_add_unique_abbrev() for adding short hashes, part 3

2016-10-08 Thread René Scharfe
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer.  This is shorter in most cases and a bit more efficient.

The changes here are not easily handled by a semantic patch because
they involve removing temporary variables and deconstructing format
strings for strbuf_addf().

Signed-off-by: Rene Scharfe 
---
 merge-recursive.c |  6 +++---
 pretty.c  | 12 +---
 submodule.c   |  7 +++
 wt-status.c   | 10 --
 4 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 5200d5c..9041c2f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -202,9 +202,9 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
strbuf_addf(&o->obuf, "virtual %s\n",
merge_remote_util(commit)->name);
else {
-   strbuf_addf(&o->obuf, "%s ",
-   find_unique_abbrev(commit->object.oid.hash,
-   DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(&o->obuf, commit->object.oid.hash,
+DEFAULT_ABBREV);
+   strbuf_addch(&o->obuf, ' ');
if (parse_commit(commit) != 0)
strbuf_addstr(&o->obuf, _("(bad commit)\n"));
else {
diff --git a/pretty.c b/pretty.c
index 25efbca..0c31495 100644
--- a/pretty.c
+++ b/pretty.c
@@ -544,15 +544,13 @@ static void add_merge_info(const struct 
pretty_print_context *pp,
strbuf_addstr(sb, "Merge:");
 
while (parent) {
-   struct commit *p = parent->item;
-   const char *hex = NULL;
+   struct object_id *oidp = &parent->item->object.oid;
+   strbuf_addch(sb, ' ');
if (pp->abbrev)
-   hex = find_unique_abbrev(p->object.oid.hash, 
pp->abbrev);
-   if (!hex)
-   hex = oid_to_hex(&p->object.oid);
+   strbuf_add_unique_abbrev(sb, oidp->hash, pp->abbrev);
+   else
+   strbuf_addstr(sb, oid_to_hex(oidp));
parent = parent->next;
-
-   strbuf_addf(sb, " %s", hex);
}
strbuf_addch(sb, '\n');
 }
diff --git a/submodule.c b/submodule.c
index 2de06a3..476f60b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -392,10 +392,9 @@ static void show_submodule_header(FILE *f, const char 
*path,
}
 
 output_header:
-   strbuf_addf(&sb, "%s%sSubmodule %s %s..", line_prefix, meta, path,
-   find_unique_abbrev(one->hash, DEFAULT_ABBREV));
-   if (!fast_backward && !fast_forward)
-   strbuf_addch(&sb, '.');
+   strbuf_addf(&sb, "%s%sSubmodule %s ", line_prefix, meta, path);
+   strbuf_add_unique_abbrev(&sb, one->hash, DEFAULT_ABBREV);
+   strbuf_addstr(&sb, (fast_backward || fast_forward) ? ".." : "...");
strbuf_add_unique_abbrev(&sb, two->hash, DEFAULT_ABBREV);
if (message)
strbuf_addf(&sb, " %s%s\n", message, reset);
diff --git a/wt-status.c b/wt-status.c
index 99d1b0a..ca5c45f 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1110,7 +1110,6 @@ static void abbrev_sha1_in_line(struct strbuf *line)
split = strbuf_split_max(line, ' ', 3);
if (split[0] && split[1]) {
unsigned char sha1[20];
-   const char *abbrev;
 
/*
 * strbuf_split_max left a space. Trim it and re-add
@@ -1118,9 +1117,10 @@ static void abbrev_sha1_in_line(struct strbuf *line)
 */
strbuf_trim(split[1]);
if (!get_sha1(split[1]->buf, sha1)) {
-   abbrev = find_unique_abbrev(sha1, DEFAULT_ABBREV);
strbuf_reset(split[1]);
-   strbuf_addf(split[1], "%s ", abbrev);
+   strbuf_add_unique_abbrev(split[1], sha1,
+DEFAULT_ABBREV);
+   strbuf_addch(split[1], ' ');
strbuf_reset(line);
for (i = 0; split[i]; i++)
strbuf_addbuf(line, split[i]);
@@ -1343,10 +1343,8 @@ static char *get_branch(const struct worktree *wt, const 
char *path)
else if (starts_with(sb.buf, "refs/"))
;
else if (!get_sha1_hex(sb.buf, sha1)) {
-   const char *abbrev;
-   abbrev = find_unique_abbrev(sha1, DEFAULT_ABBREV);
strbuf_reset(&sb);
-   strbuf_addstr(&sb, abbrev);
+   strbuf_add_unique_abbrev(&sb, sha1, DEFAULT_ABBREV);
} else if (!strcmp(sb.buf, "detached HEAD")) /* rebase */
goto got_nothing;
else/* bisect */
-- 
2.10.1



Re: %C(auto) not working as expected

2016-10-08 Thread René Scharfe

Am 09.10.2016 um 07:43 schrieb Tom Hale:

$ ~/repo/git/git --version
git version 2.10.0.GIT

The `git-log` man page says:

`auto` alone (i.e.  %C(auto)) will turn on auto coloring on the next
placeholders until the color is switched again.

In this example:

http://i.imgur.com/y3yLxk7.png

I turn on auto colouring for green, but it seems that this is not being
respected when piped through `cat`.

Am I misunderstanding the manual?


So this colors the ref name decoration and the short hash:

$ git log -n1 --format="%C(auto)%d %C(auto)%Cgreen%h"

And this only colors the short hash:

$ git log -n1 --format="%C(auto)%d %C(auto)%Cgreen%h" | cat

%C(auto) respects the color-related configuration settings; that's 
mentioned on the man page for git log in the section on %C(...).  You 
probably have color.ui=auto or color.diff=auto in your config, which 
means that output to terminals is to be colored, but output to files and 
pipes isn't.  You could override that setting e.g. by adding the command 
line option --color=always.


%Cgreen emits color codes unconditionally.  %C(auto,green) would respect 
the config settings.


Your second %C(auto) has no effect because it is overridden by the 
immediately following %Cgreen.


You may want to add a %Creset at the end to avoid colors bleeding out 
over the end of the line.  You can see that happening e.g. with:


$ git log -n3 --format="normal %C(green)green" | cat

Without cat bash seems to add a reset automatically.

René



Re: %C(auto) not working as expected

2016-10-09 Thread René Scharfe
Am 09.10.2016 um 12:04 schrieb Tom Hale:
> On 2016-10-09 13:47, René Scharfe wrote:
> 
>> %Cgreen emits color codes unconditionally.  %C(auto,green) would respect
>> the config settings.
> 
> Thanks, I've never seen the (,) syntax documented before!

Both the prefix "auto," for terminal-detection and "%C(auto)" for
choosing colors automatically are mentioned in the manpage for git log
(from Documentation/pretty-formats.txt):

- '%C(...)': color specification, as described in color.branch.* config option;
  adding `auto,` at the beginning will emit color only when colors are
  enabled for log output (by `color.diff`, `color.ui`, or `--color`, and
  respecting the `auto` settings of the former if we are going to a
  terminal). `auto` alone (i.e. `%C(auto)`) will turn on auto coloring
  on the next placeholders until the color is switched again.

> What's strange is that this works:
>   %C(auto,green bold)
> but
>   %C(auto,green,bold)
> does not.

Looking at the code that's not surprising; colors and attributes are
interpreted as a space-separated list.  The prefix "auto," is handled
specially.  For a user it may look strange, admittedly.

Supporting "auto " as well would be easy.  Supporting it in such a way
that it can be mixed freely with colors and attributes as in
%C(bold auto green) would be a bit harder.  Could this lead to confusion
between the auto for terminal-detection and the one for automatic color
selection?

The documentation cited above says the color specification was explained
together with the color.branch.* config option, but that part only says
(from Documentation/config.txt):

color.branch.::
Use customized color for branch coloration. `` is one of
`current` (the current branch), `local` (a local branch),
`remote` (a remote-tracking branch in refs/remotes/),
`upstream` (upstream tracking branch), `plain` (other
refs).

It really is described earlier in the same file, in the Values section
(a fitting place, I think).  Here's just the first sentence:

color::
   The value for a variable that takes a color is a list of
   colors (at most two, one for foreground and one for background)
   and attributes (as many as you want), separated by spaces.

Patch below.  Does it help a little?

> Also:
> Given it's very rare to want only part of a string to emit colour codes,
> if something like "bold" carries through until a "no-bold", why doesn't
> "auto" do the same thing?

No state is kept for "auto,".  Attributes and colors are switched
separately by terminals, that's why e.g. bold stays in effect through
a color change -- unless you specify an attribute change as well.

Offering a way to enable terminal-detection for all color codes of a
format would be useful, but using the existing "auto," prefix for that
would be a behaviour change that could surprise users.

René


-- >8 --
Subject: [PATCH] pretty: fix document reference for color specification

Signed-off-by: Rene Scharfe 
---
 Documentation/pretty-formats.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index a942d57..89e3bc6 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -166,7 +166,8 @@ endif::git-rev-list[]
 - '%Cgreen': switch color to green
 - '%Cblue': switch color to blue
 - '%Creset': reset color
-- '%C(...)': color specification, as described in color.branch.* config option;
+- '%C(...)': color specification, as described under Values, color in the
+  "CONFIGURATION FILE" section of linkgit:git-config[1];
   adding `auto,` at the beginning will emit color only when colors are
   enabled for log output (by `color.diff`, `color.ui`, or `--color`, and
   respecting the `auto` settings of the former if we are going to a
-- 
2.10.1



Re: [PATCH] pretty: respect color settings for %C placeholders

2016-10-10 Thread René Scharfe

Am 10.10.2016 um 17:15 schrieb Jeff King:

On Mon, Oct 10, 2016 at 10:28:18AM -0400, Jeff King wrote:


We could add some new tag to change the behavior of all following %C
tags. Something like %C(tty) maybe (probably a bad name), then
discourage the use if "%C(auto" for terminal detection?


Yeah, adding a "%C(enable-auto-color)" or something would be backwards
compatible and less painful than using "%C(auto)" everywhere. I do
wonder if anybody actually _wants_ the "always show color, even if
--no-color" behavior. I'm having trouble thinking of a good use for it.

IOW, I'm wondering if anyone would disagree that the current behavior is
simply buggy. Reading the thread at:

  http://public-inbox.org/git/7v4njkmq07@alter.siamese.dyndns.org/

I don't really see any compelling reason against it (there was some
question of which config to use, but we already answered that with
"%C(auto)", and use the value from the pretty_ctx).


So here's what a patch to do that would look like. I admit that "I can't
think of a good use" does not mean there _isn't_ one, but perhaps by
posting this, it might provoke other people to think on it, too. And if
nobody can come up with, maybe it's a good idea.


Color tags that respect the config and the --color option make the most 
sense to me as well.


Nevertheless a possible counterpoint: Coloring is used in commands that 
are intended for human consumption.  Most of the time there is no need 
to to conserve the output in a file.  But even then, and of course with 
pipes, once you look at it using less -R or grep you still get nice 
colored lines and not a monochrome wall of text.



-- >8 --
Subject: pretty: respect color settings for %C placeholders

The color placeholders have traditionally been
unconditional, showing colors even when git is not otherwise
configured to do so. This was not so bad for their original
use, which was on the command-line (and the user could
decide at that moment whether to add colors or not). But
these days we have configured formats via pretty.*, and
those should operate in multiple contexts.

In 3082517 (log --format: teach %C(auto,black) to respect
color config, 2012-12-17), we gave an extended placeholder
that could be used to accomplish this. But it's rather
clunky to use, because you have to specify it individually
for each color (and their matching resets) in the format.
We shied away from just switching the default to auto,
because it is technically breaking backwards compatibility.

However, there's not really a use case for unconditional
colors. The most plausible reason you would want them
unconditional is to redirect "git log" output to a file. But
there, the right answer is --color=always, as it does the
right thing both with custom user-format colors and
git-generated colors.

So let's switch to the more useful default. In the
off-chance that somebody really does find a use for
unconditional colors without wanting to enable the rest of
git's colors, we can provide %C(always,...) to enable the
old behavior.

Signed-off-by: Jeff King 
---
The tests unsurprisingly needed updating, as we're breaking the old
behavior. The diff is easier to read read with "-w".

 Documentation/pretty-formats.txt | 13 +++---
 pretty.c | 19 +---
 t/t6006-rev-list-format.sh   | 94 
 3 files changed, 70 insertions(+), 56 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index a942d57..7aa1a8b 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -167,11 +167,14 @@ endif::git-rev-list[]
 - '%Cblue': switch color to blue
 - '%Creset': reset color
 - '%C(...)': color specification, as described in color.branch.* config option;
-  adding `auto,` at the beginning will emit color only when colors are
-  enabled for log output (by `color.diff`, `color.ui`, or `--color`, and
-  respecting the `auto` settings of the former if we are going to a
-  terminal). `auto` alone (i.e. `%C(auto)`) will turn on auto coloring
-  on the next placeholders until the color is switched again.
+  By default, colors are shown only when enabled for log output (by
+  `color.diff`, `color.ui`, or `--color`, and respecting the `auto`
+  settings of the former if we are going to a terminal). `%C(auto,...)`
+  is accepted as a historical synonym for the default. Specifying
+  `%C(always,...) will show the colors always, even when colors are not
+  otherwise enabled (to enable this behavior for the whole format, use
+  `--color=always`). `auto` alone (i.e. `%C(auto)`) will turn on auto
+  coloring on the next placeholders until the color is switched again.
 - '%m': left (`<`), right (`>`) or boundary (`-`) mark
 - '%n': newline
 - '%%': a raw '%'
diff --git a/pretty.c b/pretty.c
index 25efbca..73e58b5 100644
--- a/pretty.c
+++ b/pretty.c
@@ -965,22 +965,31 @@ static size_t parse_color(struct strbuf *sb, /* in UTF-8 
*/

if (!

Re: [PATCH] pretty: respect color settings for %C placeholders

2016-10-10 Thread René Scharfe

Am 10.10.2016 um 19:42 schrieb Jeff King:

On Mon, Oct 10, 2016 at 07:09:12PM +0200, René Scharfe wrote:


diff --git a/pretty.c b/pretty.c
index 25efbca..73e58b5 100644
--- a/pretty.c
+++ b/pretty.c
@@ -965,22 +965,31 @@ static size_t parse_color(struct strbuf *sb, /* in UTF-8 
*/

if (!end)
return 0;
-   if (skip_prefix(begin, "auto,", &begin)) {
+
+   if (!skip_prefix(begin, "always,", &begin)) {
if (!want_color(c->pretty_ctx->color))
return end - placeholder + 1;
}


Shouldn't we have an "else" here?


I'm not sure what you mean; can you write it out?


> -  if (skip_prefix(begin, "auto,", &begin)) {
> +
> +  if (!skip_prefix(begin, "always,", &begin)) {
>if (!want_color(c->pretty_ctx->color))
>return end - placeholder + 1;
>}

else {  /* here */

> +  /* this is a historical noop */
> +  skip_prefix(begin, "auto,", &begin);

}

Otherwise "always,auto," would be allowed and mean the same as 
"always,", which just seems wrong.  Not a biggie.



-   if (skip_prefix(placeholder + 1, "red", &rest))
+   if (skip_prefix(placeholder + 1, "red", &rest) &&
+   want_color(c->pretty_ctx->color))
strbuf_addstr(sb, GIT_COLOR_RED);
-   else if (skip_prefix(placeholder + 1, "green", &rest))
+   else if (skip_prefix(placeholder + 1, "green", &rest) &&
+want_color(c->pretty_ctx->color))
strbuf_addstr(sb, GIT_COLOR_GREEN);
-   else if (skip_prefix(placeholder + 1, "blue", &rest))
+   else if (skip_prefix(placeholder + 1, "blue", &rest) &&
+want_color(c->pretty_ctx->color))
strbuf_addstr(sb, GIT_COLOR_BLUE);
-   else if (skip_prefix(placeholder + 1, "reset", &rest))
+   else if (skip_prefix(placeholder + 1, "reset", &rest) &&
+want_color(c->pretty_ctx->color))
strbuf_addstr(sb, GIT_COLOR_RESET);
return rest - placeholder;
 }


Perhaps it's a funtion like add_color(sb, ctx, color) or similar would be
nice?


I actually wrote it that way first (I called it "maybe_add_color()"),
but it felt a little funny to have a separate function that people might
be tempted to reuse (the right solution is generally to check
want_color() early as above, but we can't do that here because we have
to find the end of each placeholder).


OK.  A variable then?  Lazy pseudo-code:

if (RED)
color = red;
else if (GREEN)
...

if (want_color())
strbuf_addstr(sb, color);


What I have here is a little funny, too, though, as it keeps trying
other color names if it finds "red" but want_color() returns 0.


Oh, missed that somehow.. O_o


Re: [PATCH] use strbuf_add_unique_abbrev() for adding short hashes, part 3

2016-10-10 Thread René Scharfe
Am 10.10.2016 um 02:00 schrieb Jeff King:
> On Sat, Oct 08, 2016 at 05:38:47PM +0200, René Scharfe wrote:
> 
>> Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
>> instead of taking detours through find_unique_abbrev() and its static
>> buffer.  This is shorter in most cases and a bit more efficient.
>>
>> The changes here are not easily handled by a semantic patch because
>> they involve removing temporary variables and deconstructing format
>> strings for strbuf_addf().
> 
> Yeah, the thing to look for here is whether the sha1 variable holds the
> same value at both times.
> 
> These all look OK to me. Mild rambling below.
> 
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index 5200d5c..9041c2f 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -202,9 +202,9 @@ static void output_commit_title(struct merge_options *o, 
>> struct commit *commit)
>>  strbuf_addf(&o->obuf, "virtual %s\n",
>>  merge_remote_util(commit)->name);
>>  else {
>> -strbuf_addf(&o->obuf, "%s ",
>> -find_unique_abbrev(commit->object.oid.hash,
>> -DEFAULT_ABBREV));
>> +strbuf_add_unique_abbrev(&o->obuf, commit->object.oid.hash,
>> + DEFAULT_ABBREV);
>> +strbuf_addch(&o->obuf, ' ');
> 
> I've often wondered whether a big strbuf_addf() is more efficient than
> several strbuf_addstrs. It amortizes the size-checks, certainly, though
> those are probably not very big. It shouldn't matter much for amortizing
> the cost of malloc, as it's very unlikely to have a case where:
> 
>   strbuf_addf("%s%s", foo, bar);
> 
> would require one malloc, but:
> 
>   strbuf_addstr(foo);
>   strbuf_addstr(bar);
> 
> would require two (one of the strings would have to be around the same
> size as the ALLOC_GROW() doubling).
> 
> So it probably doesn't matter much in practice (not that most of these
> cases are very performance sensitive anyway). Mostly just something I've
> pondered while tweaking strbuf invocations.

Good question.  ALLOC_GROW() doesn't double exactly, but indeed the
number of reallocations depends on the size of the added pieces.  I
always thought of strbuf_addf() as an expensive function for
convenience, but never timed it.

Numbers vary a bit, but here's what I get for the crude test program
at the end:

$ time t/helper/test-strbuf strbuf_addf 123 123456789012345678901234567890
123123456789012345678901234567890

real0m0.168s
user0m0.164s
sys 0m0.000s
$ time t/helper/test-strbuf strbuf_addstr 123 123456789012345678901234567890
123123456789012345678901234567890

real0m0.141s
user0m0.140s
sys 0m0.000s

Just a data-point, but it confirms my bias, so I stop here. :)

> Just thinking aloud, I've also wondered if strbuf_addoid() would be
> handy.  We already have oid_to_hex_r(). In fact, you could write it
> already as:
> 
>   strbuf_add_unique_abbrev(sb, oidp->hash, 0);
> 
> but that is probably not helping maintainability. ;)

Yes, a function for adding full hashes without using a static
variable is useful for library functions that may end up being
called often or in parallel.  I'd call it strbuf_add_hash,
though.



diff --git a/Makefile b/Makefile
index 1aad150..ad5e98c 100644
--- a/Makefile
+++ b/Makefile
@@ -628,6 +628,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
 TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sha1-array
 TEST_PROGRAMS_NEED_X += test-sigchain
+TEST_PROGRAMS_NEED_X += test-strbuf
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-submodule-config
 TEST_PROGRAMS_NEED_X += test-subprocess
diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
new file mode 100644
index 000..177f3e2
--- /dev/null
+++ b/t/helper/test-strbuf.c
@@ -0,0 +1,25 @@
+#include "cache.h"
+
+int cmd_main(int argc, const char **argv)
+{
+   struct strbuf sb = STRBUF_INIT;
+   int i = 100;
+
+   if (argc == 4) {
+   if (!strcmp(argv[1], "strbuf_addf")) {
+   while (i--) {
+   strbuf_release(&sb);
+   strbuf_addf(&sb, "%s%s", argv[2], argv[3]);
+   }
+   }
+   if (!strcmp(argv[1], "strbuf_addstr")) {
+   while (i--) {
+   strbuf_release(&sb);
+   strbuf_addstr(&sb, argv[2]);
+   strbuf_addstr(&sb, argv[3]);
+   }
+   }
+   puts(sb.buf);
+   }
+   return 0;
+}



Re: %C(auto) not working as expected

2016-10-10 Thread René Scharfe
Am 10.10.2016 um 01:46 schrieb Jeff King:
>> diff --git a/Documentation/pretty-formats.txt 
>> b/Documentation/pretty-formats.txt
>> index a942d57..89e3bc6 100644
>> --- a/Documentation/pretty-formats.txt
>> +++ b/Documentation/pretty-formats.txt
>> @@ -166,7 +166,8 @@ endif::git-rev-list[]
>>  - '%Cgreen': switch color to green
>>  - '%Cblue': switch color to blue
>>  - '%Creset': reset color
>> -- '%C(...)': color specification, as described in color.branch.* config 
>> option;
>> +- '%C(...)': color specification, as described under Values, color in the
>> +  "CONFIGURATION FILE" section of linkgit:git-config[1];
> 
> I like the intent here, though I found "Values, color" hard to parse (it
> was not immediately clear that you mean "the color paragraph of the
> Values section", as commas are already being used in that sentence for
> the parenthetical phrase).
> 
> I'm not sure how to say that succinctly, as we are four levels deep
> (git-config -> configuration file -> values -> color). Too bad there is
> no good link syntax for it. Maybe:
> 
>   ...color specification, as described in linkgit:git-config[1] (see the
>   paragraph on colors in the "Values" section, under "CONFIGURATION
>   FILE")
> 
> or something.

That's better.  Or we could remove the "color" part and trust readers to
find the right paragraph in the Values sub-section?

-- '%C(...)': color specification, as described in color.branch.* config option;
+- '%C(...)': color specification, as described under Values in the
+  "CONFIGURATION FILE" section of linkgit:git-config[1];


[PATCH v2] pretty: fix document link for color specification

2016-10-10 Thread René Scharfe
Helped-by: Jeff King 
Signed-off-by: Rene Scharfe 
---
Removed confusing and unnecessary reference to the "colors" paragraph.

 Documentation/pretty-formats.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index a942d57..89e3bc6 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -166,7 +166,8 @@ endif::git-rev-list[]
 - '%Cgreen': switch color to green
 - '%Cblue': switch color to blue
 - '%Creset': reset color
-- '%C(...)': color specification, as described in color.branch.* config option;
+- '%C(...)': color specification, as described under Values in the
+  "CONFIGURATION FILE" section of linkgit:git-config[1];
   adding `auto,` at the beginning will emit color only when colors are
   enabled for log output (by `color.diff`, `color.ui`, or `--color`, and
   respecting the `auto` settings of the former if we are going to a
-- 
2.10.1



[PATCH] cocci: avoid self-references in object_id transformations

2016-10-15 Thread René Scharfe
The object_id functions oid_to_hex, oid_to_hex_r, oidclr, oidcmp, and
oidcpy are defined as wrappers of their legacy counterparts sha1_to_hex,
sha1_to_hex_r, hashclr, hashcmp, and hashcpy, respectively.  Make sure
that the Coccinelle transformations for converting legacy function calls
are not applied to these wrappers themselves, which would result in
tautological declarations.

We can remove the added guards once the object_id functions stand on
their own.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/object_id.cocci | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/contrib/coccinelle/object_id.cocci 
b/contrib/coccinelle/object_id.cocci
index 0307624..09afdbf 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -17,10 +17,13 @@ expression E1;
 + oid_to_hex(&E1)
 
 @@
+identifier f != oid_to_hex;
 expression E1;
 @@
+  f(...) {...
 - sha1_to_hex(E1->hash)
 + oid_to_hex(E1)
+  ...}
 
 @@
 expression E1, E2;
@@ -29,10 +32,13 @@ expression E1, E2;
 + oid_to_hex_r(E1, &E2)
 
 @@
+identifier f != oid_to_hex_r;
 expression E1, E2;
 @@
+   f(...) {...
 - sha1_to_hex_r(E1, E2->hash)
 + oid_to_hex_r(E1, E2)
+  ...}
 
 @@
 expression E1;
@@ -41,10 +47,13 @@ expression E1;
 + oidclr(&E1)
 
 @@
+identifier f != oidclr;
 expression E1;
 @@
+  f(...) {...
 - hashclr(E1->hash)
 + oidclr(E1)
+  ...}
 
 @@
 expression E1, E2;
@@ -53,10 +62,13 @@ expression E1, E2;
 + oidcmp(&E1, &E2)
 
 @@
+identifier f != oidcmp;
 expression E1, E2;
 @@
+  f(...) {...
 - hashcmp(E1->hash, E2->hash)
 + oidcmp(E1, E2)
+  ...}
 
 @@
 expression E1, E2;
@@ -77,10 +89,13 @@ expression E1, E2;
 + oidcpy(&E1, &E2)
 
 @@
+identifier f != oidcpy;
 expression E1, E2;
 @@
+  f(...) {...
 - hashcpy(E1->hash, E2->hash)
 + oidcpy(E1, E2)
+  ...}
 
 @@
 expression E1, E2;
-- 
2.10.1



[PATCH] avoid pointer arithmetic involving NULL in FLEX_ALLOC_MEM

2016-10-15 Thread René Scharfe
Calculating offsets involving a NULL pointer is undefined.  It works in
practice (for now?), but we should not rely on it.  Allocate first and
then simply refer to the flexible array member by its name instead of
performing pointer arithmetic up front.  The resulting code is slightly
shorter, easier to read and doesn't rely on undefined behaviour.

NB: The cast to a (non-const) void pointer is necessary to keep support
for flexible array members declared as const.

Signed-off-by: Rene Scharfe 
---
This patch allows the test suite to largely pass (t7063 still fails)
for clang 3.8 with -fsanitize=undefined and -DNO_UNALIGNED_LOADS.

 git-compat-util.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 43718da..f964e36 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -851,8 +851,9 @@ static inline void copy_array(void *dst, const void *src, 
size_t n, size_t size)
  * times, and it must be assignable as an lvalue.
  */
 #define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
-   (x) = NULL; /* silence -Wuninitialized for offset calculation */ \
-   (x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char 
*)(x), (buf), (len)); \
+   size_t flex_array_len_ = (len); \
+   (x) = xcalloc(1, st_add3(sizeof(*(x)), flex_array_len_, 1)); \
+   memcpy((void *)(x)->flexname, (buf), flex_array_len_); \
 } while (0)
 #define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
(x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
-- 
2.10.1



Re: [PATCH] avoid pointer arithmetic involving NULL in FLEX_ALLOC_MEM

2016-10-16 Thread René Scharfe
Am 15.10.2016 um 19:13 schrieb Jeff King:
> On Sat, Oct 15, 2016 at 06:23:11PM +0200, René Scharfe wrote:
> 
>> Calculating offsets involving a NULL pointer is undefined.  It works in
>> practice (for now?), but we should not rely on it.  Allocate first and
>> then simply refer to the flexible array member by its name instead of
>> performing pointer arithmetic up front.  The resulting code is slightly
>> shorter, easier to read and doesn't rely on undefined behaviour.
> 
> Yeah, this NULL computation is pretty nasty. I recall trying to get rid
> of it, but I think it is impossible to do so portably while still using
> the generic xalloc_flex() helper.

The only way I see is to pass the type to the macro explicitly (because
typeof is an extention), and that would make call sites ugly.

>>  #define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
>> -(x) = NULL; /* silence -Wuninitialized for offset calculation */ \
>> -(x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char 
>> *)(x), (buf), (len)); \
>> +size_t flex_array_len_ = (len); \
>> +(x) = xcalloc(1, st_add3(sizeof(*(x)), flex_array_len_, 1)); \
>> +memcpy((void *)(x)->flexname, (buf), flex_array_len_); \
> 
> This looks correct. I wondered at first why you bothered with
> flex_array_len, but it is to avoid evaluating the "len" parameter
> multiple times.

Right; we could drop that feature of the original macros and require
users to pass length expressions that don't have side effects -- all of
them already do that anyway.  But let's keep it in this round; it just
costs one extra line.

>>  } while (0)
>>  #define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
>>  (x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
> 
> Now that xalloc_flex() has only this one caller remaining, perhaps it
> should just be inlined here, too, for simplicity.

-- >8 --
Subject: [PATCH 2/1] inline xalloc_flex() into FLEXPTR_ALLOC_MEM

Allocate and copy directly in FLEXPTR_ALLOC_MEM and remove the now
unused helper function xalloc_flex().  The resulting code is shorter
and the offset arithmetic is a bit simpler.

Suggested-by: Jeff King 
Signed-off-by: Rene Scharfe 
---
 git-compat-util.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index f964e36..49ca28c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -856,7 +856,9 @@ static inline void copy_array(void *dst, const void *src, 
size_t n, size_t size)
memcpy((void *)(x)->flexname, (buf), flex_array_len_); \
 } while (0)
 #define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
-   (x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+   size_t flex_array_len_ = (len); \
+   (x) = xcalloc(1, st_add3(sizeof(*(x)), flex_array_len_, 1)); \
+   memcpy((x) + 1, (buf), flex_array_len_); \
(x)->ptrname = (void *)((x)+1); \
 } while(0)
 #define FLEX_ALLOC_STR(x, flexname, str) \
@@ -864,14 +866,6 @@ static inline void copy_array(void *dst, const void *src, 
size_t n, size_t size)
 #define FLEXPTR_ALLOC_STR(x, ptrname, str) \
FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
 
-static inline void *xalloc_flex(size_t base_len, size_t offset,
-   const void *src, size_t src_len)
-{
-   unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
-   memcpy(ret + offset, src, src_len);
-   return ret;
-}
-
 static inline char *xstrdup_or_null(const char *str)
 {
return str ? xstrdup(str) : NULL;
-- 
2.10.1


Re: [PATCH] cocci: avoid self-references in object_id transformations

2016-10-17 Thread René Scharfe
Am 17.10.2016 um 20:08 schrieb Junio C Hamano:
> ... oops.  Totally unrelated to this patch, but I see these in
> strbuf.cocci.patch (this is at the tip of 'pu'), which are total
> nonsense.  Perhaps I am running a way-stale spatch?  It claims to be
> "spatch version 1.0.0-rc19 with Python support and with PCRE support"
> 
> --- date.c
> +++ /tmp/cocci-output-21568-bd3448-date.c
> @@ -179,7 +179,7 @@ const char *show_date(unsigned long time
>  
>   if (mode->type == DATE_UNIX) {
>   strbuf_reset(&timebuf);
> - strbuf_addf(&timebuf, "%lu", time);
> + strbuf_addstr(&timebuf, time);
>   return timebuf.buf;
>   }
>  
> --- log-tree.c
> +++ /tmp/cocci-output-21608-b02087-log-tree.c
> @@ -400,7 +400,7 @@ void log_write_email_headers(struct rev_
>   extra_headers = subject_buffer;
>  
>   if (opt->numbered_files)
> - strbuf_addf(&filename, "%d", opt->nr);
> + strbuf_addstr(&filename, opt->nr);
>   else
>   fmt_output_commit(&filename, commit, opt);
>   snprintf(buffer, sizeof(buffer) - 1,

I get these instead with 6513eabcbcbcfa684d4bb2d57f61c662b870b5ca on
Debian testing with its "spatch version 1.0.4 with Python support and
with PCRE support", which look legit:

--- sequencer.c
+++ /tmp/cocci-output-40365-db7a71-sequencer.c
@@ -159,7 +159,7 @@ int sequencer_remove_state(struct replay
free(opts->xopts[i]);
free(opts->xopts);
 
-   strbuf_addf(&dir, "%s", get_dir(opts));
+   strbuf_addstr(&dir, get_dir(opts));
remove_dir_recursively(&dir, 0);
strbuf_release(&dir);
 
--- builtin/branch.c
+++ /tmp/cocci-output-40858-a86d1a-branch.c
@@ -316,7 +316,7 @@ static char *build_format(struct ref_fil
 
if (filter->verbose) {
strbuf_addf(&local, 
"%%(align:%d,left)%%(refname:strip=2)%%(end)", maxwidth);
-   strbuf_addf(&local, "%s", branch_get_color(BRANCH_COLOR_RESET));
+   strbuf_addstr(&local, branch_get_color(BRANCH_COLOR_RESET));
strbuf_addf(&local, " %%(objectname:short=7) ");
 
if (filter->verbose > 1)


Re: [PATCH] convert: mark a file-local symbol static

2016-10-17 Thread René Scharfe

Am 17.10.2016 um 22:07 schrieb Junio C Hamano:

Ramsay Jones  writes:


Heh, I actually have the following in my config.mak already:

extra-clean: clean
find . -iname '*.o' -exec rm {} \;

But for some reason I _always_ type 'make clean' and then, to top
it off, I _always_ type the 'find' command by hand (I have no idea
why) :-D


"git clean -x" anybody?


This removes config.mak as well.

René




Re: git checkout crashes after server being updated to Debian X86_64

2016-10-18 Thread René Scharfe

Am 18.10.2016 um 17:17 schrieb Raffael Reichelt:

Hello!

I have a serious problem with git, After my provider had updated to a
X86_64 architecture git crashes with various memory-related errors.
This is happening remote when pushing to the repository from my local
machine as well as trying it on a shell on the server itself.

This are the error-messages:

fatal: Out of memory, realloc failed
fatal: recursion detected in die handler
fatal: recursion detected in die handler

or
fatal: unable to create threaded lstat
fatal: recursion detected in die handler
or
fatal: unable to create threaded lstat
*** Error in `git': double free or corruption (fasttop): 0x00a8ade0 ***
fatal: recursion detected in die handler
Aborted

It’s obviously not a problem of the repository - happens with all of
them. I think it is also not a question of size - happens with a 80M
Repository as well as with a 500M one.

Any way: did a

git fsck
Prüfe Objekt-Verzeichnisse: 100% (256/256), Fertig.
Prüfe Objekte: 100% (56305/56305), Fertig.

git gc --auto --prune=today —aggressive
git repack

Additionally I played around some config parameters  so my config now looks 
like:
[http]
postbuffer = 524288000
[pack]
threads = 1
deltaCacheSize = 128m
packSizeLimit = 128m
windowMemory = 128m
[core]
packedGitLimit = 128m
packedGitWindowSize = 128m
repositoryformatversion = 0
filemode = true
bare = true

I am running
git version 2.1.4

on
Linux infongp-de65 3.14.0-ui16196-uiabi1-infong-amd64 #1 SMP Debian 
3.14.73-2~ui80+4 (2016-07-13) x86_64 GNU/Linux

Anyone out there to help me getting out of this trouble?


Git 2.1.4 is the version that comes with Debian stable according to 
https://packages.debian.org/jessie/git, so I guess using a more recent 
version is not a reasonable option.


What do "file $(which git)" and "ulimit -a" return?  Do you have an 
x86-64 binary and no unnecessarily low limits set?


René



Re: Drastic jump in the time required for the test suite

2016-10-20 Thread René Scharfe
Am 20.10.2016 um 13:02 schrieb Duy Nguyen:
> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin
>  wrote:
>> Hi Junio,
>>
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for a portable software).
>>
>> My builds of `pu` now time out, after running for 3h straight in the VM
>> dedicated to perform the daily routine of building and testing the git.git
>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>> tests in 2.6h. That is quite the jump.
> 
> I'm just curious, will running git.exe from WSL [1] help speed things
> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
> WSL is quite fast.
> 
> I'm pretty sure the test suite would need some adaptation, but if the
> speedup is significant, maybe it's worth spending time on.
> 
> [1] https://news.ycombinator.com/item?id=12748395

I get this on WSL with prove -j8:

Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 
3731.85 csys = 4040.15 CPU)

And this for a run on Debian inside a Hyper-V VM on the same system:

Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr  1.06 sys + 39.70 cusr 
25.82 csys = 71.39 CPU)

All tests pass on master.

René



Re: Drastic jump in the time required for the test suite

2016-10-21 Thread René Scharfe
Am 21.10.2016 um 15:10 schrieb Matthieu Moy:
> René Scharfe  writes:
> 
>> I get this on WSL with prove -j8:
>>
>> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 
>> cusr 3731.85 csys = 4040.15 CPU)
>>
>> And this for a run on Debian inside a Hyper-V VM on the same system:
>>
>> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr  1.06 sys + 39.70 cusr 
>> 25.82 csys = 71.39 CPU)
> 
> What about the same without WSL on windows?

Files=755, Tests=14894, 1774 wallclock secs ( 9.31 usr  9.58 sys + 821.87 cusr 
1961.23 csys = 2801.99 CPU)

René



Re: Drastic jump in the time required for the test suite

2016-10-21 Thread René Scharfe

Am 21.10.2016 um 12:59 schrieb Duy Nguyen:

On Thu, Oct 20, 2016 at 11:40 PM, René Scharfe  wrote:

I get this on WSL with prove -j8:

Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 
3731.85 csys = 4040.15 CPU)

And this for a run on Debian inside a Hyper-V VM on the same system:

Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr  1.06 sys + 39.70 cusr 
25.82 csys = 71.39 CPU)

All tests pass on master.


Thank you for doing this. 10 times slower is probably not worth
following up (though absolute numbers still look amazing, you have
some beefy machine there).


Thanks, but it's not too impressive: Xeon E3-1231 v3, Crucial BX100 SSD, 
8GB RAM.  I wonder how much faster a brand-new CPU with more RAM and a 
PCIe SSD would be..


René



[PATCH] hex: use unsigned index for ring buffer

2016-10-23 Thread René Scharfe
Overflow is defined for unsigned integers, but not for signed ones.
Make the ring buffer index in sha1_to_hex() unsigned to be on the
safe side.

Signed-off-by: Rene Scharfe 
---
Hard to trigger, but probably even harder to diagnose once someone
somehow manages to do it on some uncommon architecture.

 hex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hex.c b/hex.c
index ab2610e..8c6c189 100644
--- a/hex.c
+++ b/hex.c
@@ -76,7 +76,7 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)
 
 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static unsigned int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
 }
-- 
2.10.1



Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-23 Thread René Scharfe

Am 23.10.2016 um 11:11 schrieb Jeff King:

On Sun, Oct 23, 2016 at 11:00:48AM +0200, René Scharfe wrote:


Overflow is defined for unsigned integers, but not for signed ones.
Make the ring buffer index in sha1_to_hex() unsigned to be on the
safe side.

Signed-off-by: Rene Scharfe 
---
Hard to trigger, but probably even harder to diagnose once someone
somehow manages to do it on some uncommon architecture.


Indeed. If we are worried about overflow, we would also want to assume
that it wraps at a multiple of 4, but that is probably a sane
assumption.


Hmm, I can't think of a way to violate this assumption except with 
unsigned integers that are only a single bit wide.  That would be a 
weird machine.  Are there other possibilities?



diff --git a/hex.c b/hex.c
index ab2610e..8c6c189 100644
--- a/hex.c
+++ b/hex.c
@@ -76,7 +76,7 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)

 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static unsigned int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
 }


I wonder if just truncating bufno would be conceptually simpler (albeit
longer):

  bufno++;
  bufno &= 3;
  return sha1_to_hex_r(hexbuffer[bufno], sha1);

You could also write the second line like:

  bufno %= ARRAY_SIZE(hexbuffer);

which is less magical (right now the set of buffers must be a power of
2). I expect the compiler could turn that into a bitmask itself.


Expelling magic is a good idea.  And indeed, at least gcc, clang and icc 
on x86-64 are smart enough to use an AND instead of dividing 
(https://godbolt.org/g/rFPpzF).


But gcc also adds a sign extension (cltq/cdqe) if we store the truncated 
value, unlike the other two compilers.  I don't see why -- the bit mask 
operation enforces a value between 0 and 3 (inclusive) and the upper 
bits of eax are zeroed automatically, so the cltq is effectively a noop.


Using size_t gets us rid of the extra instruction and is the right type 
anyway.  It would suffice on its own, hmm..



I'm fine with any of the options. I guess you'd want a similar patch for
find_unique_abbrev on top of jk/no-looking-at-dotgit-outside-repo.


Actually I'd want you to want to amend your series yourself. ;)  Maybe I 
can convince Coccinelle to handle that issue for us.


And there's also path.c::get_pathname().  That's enough cases to justify 
adding a macro, I'd say:


---
 cache.h | 3 +++
 hex.c   | 4 ++--
 path.c  | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 05ecb88..8bb4918 100644
--- a/cache.h
+++ b/cache.h
@@ -555,6 +555,9 @@ extern int daemonize(void);
} \
} while (0)

+#define NEXT_RING_ITEM(array, index) \
+   (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
+
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
diff --git a/hex.c b/hex.c
index ab2610e..5e711b9 100644
--- a/hex.c
+++ b/hex.c
@@ -76,9 +76,9 @@ char *oid_to_hex_r(char *buffer, const struct 
object_id *oid)


 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static size_t bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   return sha1_to_hex_r(NEXT_RING_ITEM(hexbuffer, bufno), sha1);
 }

 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index a8e7295..60dba6a 100644
--- a/path.c
+++ b/path.c
@@ -24,8 +24,8 @@ static struct strbuf *get_pathname(void)
static struct strbuf pathname_array[4] = {
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
-   static int index;
-   struct strbuf *sb = &pathname_array[3 & ++index];
+   static size_t index;
+   struct strbuf *sb = &NEXT_RING_ITEM(pathname_array, index);
strbuf_reset(sb);
return sb;
 }
--
2.10.1




Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread René Scharfe
Am 24.10.2016 um 19:27 schrieb Junio C Hamano:
> Junio C Hamano  writes:
> 
>>> I think it would be preferable to just fix it inline in each place.
>>
>> Yeah, probably.
>>
>> My initial reaction to this was
>>
>>  char *sha1_to_hex(const unsigned char *sha1)
>>  {
>> -static int bufno;
>> +static unsigned int bufno;
>>  static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>>  return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>>
>> "ah, we do not even need bufno as uint; it could be ushort or even
>> uchar".  If this were a 256 element ring buffer and the index were
>> uchar, we wouldn't even be having this discussion, and "3 &" is a
>> way to get a fake type that is a 2-bit unsigned integer that wraps
>> around when incremented.
>>
>> But being explicit, especially when we know that we can rely on the
>> fact that the compilers are usually intelligent enough, is a good
>> idea, I would think.
>>
>> Isn't size_t often wider than uint, by the way?  It somehow makes me
>> feel dirty to use it when we know we only care about the bottom two
>> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
>> but I may be simply superstitious in this case.  I dunno.
> 
> If we are doing the wrap-around ourselves, I suspect that the index
> should stay "int" (not even unsigned), as that is supposed to be the
> most natural and performant type on the architecture.  Would it
> still result in better code to use size_t instead?

Right.

Gcc emits an extra cltq instruction for x86-64 (Convert Long To Quad,
i.e. 32-bit to 64-bit) if we wrap explicitly and still use an int in
sha1_to_hex().  It doesn't if we use an unsigned int or size_t.  It
also doesn't for get_pathname().  I guess updating the index variable
only after use makes the difference there.

But I think we can ignore that; it's just an extra cycle.  I only
even noticed it when verifying that "% 4" is turned into "& 3"..
Clang and icc don't add the cltq, by the way.

So how about this?  It gets rid of magic number 3 and works for array
size that's not a power of two.  And as a nice side effect it can't
trigger a signed overflow anymore.

Just adding "unsigned" looks more attractive to me for some reason.
Perhaps I stared enough at the code to get used to the magic values
there..

René

---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e..845b01a 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index a8e7295..52d889c 100644
--- a/path.c
+++ b/path.c
@@ -25,7 +25,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = &pathname_array[3 & ++index];
+   struct strbuf *sb = &pathname_array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }
-- 
2.10.1


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-26 Thread René Scharfe

Am 25.10.2016 um 20:28 schrieb Junio C Hamano:

Jeff King  writes:


On Mon, Oct 24, 2016 at 04:53:50PM -0700, Junio C Hamano wrote:


So how about this?  It gets rid of magic number 3 and works for array
size that's not a power of two.  And as a nice side effect it can't
trigger a signed overflow anymore.


Looks good to me.  Peff?


Any of the variants discussed in this thread is fine by me.


OK, here is what I'll queue then.
I assumed that René wants to sign it off ;-).


Actually I didn't sign-off on purpose originally.  But OK, let's keep
the version below.  I just feel strangely sad seeing that concise magic
go.  Nevermind.

Signed-off-by: Rene Scharfe 


-- >8 --
From: René Scharfe 
Date: Sun, 23 Oct 2016 19:57:30 +0200
Subject: [PATCH] hex: make wraparound of the index into ring-buffer explicit

Overflow is defined for unsigned integers, but not for signed ones.

We could make the ring-buffer index in sha1_to_hex() and
get_pathname() unsigned to be on the safe side to resolve this, but
let's make it explicit that we are wrapping around at whatever the
number of elements the ring-buffer has.  The compiler is smart enough
to turn modulus into bitmask for these codepaths that use
ring-buffers of a size that is a power of 2.

Signed-off-by: René Scharfe 
Signed-off-by: Junio C Hamano 
---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e498..845b01a874 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }

 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index fe3c4d96c6..9bfaeda207 100644
--- a/path.c
+++ b/path.c
@@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = &pathname_array[3 & ++index];
+   struct strbuf *sb = &pathname_array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }



[PATCH] valgrind: support test helpers

2016-10-27 Thread René Scharfe
Tests run with --valgrind call git commands through a wrapper script
that invokes valgrind on them.  This script (valgrind.sh) is in turn
invoked through symlinks created for each command in t/valgrind/bin/.

Since e6e7530d (test helpers: move test-* to t/helper/ subdirectory)
these symlinks have been broken for test helpers -- they point to the
old locations in the root of the build directory.  Fix that by teaching
the code for creating the links about the new location of the binaries,
and do the same in the wrapper script to allow it to find its payload.

Signed-off-by: Rene Scharfe 
---
 t/test-lib.sh  |  9 -
 t/valgrind/valgrind.sh | 12 ++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index b859db6..a724181 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -809,7 +809,14 @@ then
return;
 
base=$(basename "$1")
-   symlink_target=$GIT_BUILD_DIR/$base
+   case "$base" in
+   test-*)
+   symlink_target="$GIT_BUILD_DIR/t/helper/$base"
+   ;;
+   *)
+   symlink_target="$GIT_BUILD_DIR/$base"
+   ;;
+   esac
# do not override scripts
if test -x "$symlink_target" &&
test ! -d "$symlink_target" &&
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
index 4215303..669ebaf 100755
--- a/t/valgrind/valgrind.sh
+++ b/t/valgrind/valgrind.sh
@@ -1,11 +1,19 @@
 #!/bin/sh
 
 base=$(basename "$0")
+case "$base" in
+test-*)
+   program="$GIT_VALGRIND/../../t/helper/$base"
+   ;;
+*)
+   program="$GIT_VALGRIND/../../$base"
+   ;;
+esac
 
 TOOL_OPTIONS='--leak-check=no'
 
 test -z "$GIT_VALGRIND_ENABLED" &&
-exec "$GIT_VALGRIND"/../../"$base" "$@"
+exec "$program" "$@"
 
 case "$GIT_VALGRIND_MODE" in
 memcheck-fast)
@@ -29,4 +37,4 @@ exec valgrind -q --error-exitcode=126 \
--log-fd=4 \
--input-fd=4 \
$GIT_VALGRIND_OPTIONS \
-   "$GIT_VALGRIND"/../../"$base" "$@"
+   "$program" "$@"
-- 
2.10.1


Re: [PATCH] valgrind: support test helpers

2016-10-29 Thread René Scharfe
Am 28.10.2016 um 14:50 schrieb Junio C Hamano:
> Hmph.  I somehow thought this was supposed to have been fixed by
> 503e224180 ("t/test-lib.sh: fix running tests with --valgrind",
> 2016-07-11) already.

Its title seems to indicate that intention.  Probably the quickest test
script that calls a helper is t0009-prio-queue.sh, and without my patch
it reports something like this, unfortunately:

  expecting success:
  test-prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
  test_cmp expect actual
  
  ./t0009-prio-queue.sh: 4: eval: test-prio-queue: not found
  not ok 1 - basic ordering
  #
  #   test-prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
  #   test_cmp expect actual
  #

René


Re: [PATCH] valgrind: support test helpers

2016-10-29 Thread René Scharfe

Am 28.10.2016 um 10:51 schrieb Johannes Schindelin:

On Fri, 28 Oct 2016, René Scharfe wrote:

Signed-off-by: Rene Scharfe 


Apart from the missing accent ("é") in your SOB: ACK.


I sign off without accent out of habit, to avoid display problems -- 
better have a plain "e" than something like "" or worse.


But only now I realized that I can fix at least my end by using an UTF-8 
locale (e.g. LANG=C.UTF-8 instead of LANG=C) -- together with PuTTY and 
its settings Window, Translation, Remote character set: UTF-8 and 
Connection, Data, Terminal-type-string: linux, which I already had.  My 
terminal just got boosted into the 21st century, woohoo! ;)


René


[PATCH] commit: simplify building parents list

2016-10-29 Thread René Scharfe
Push pptr down into the FROM_MERGE branch of the if/else statement,
where it's actually used, and call commit_list_append() for appending
elements instead of playing tricks with commit_list_insert().  Call
copy_commit_list() in the amend branch instead of open-coding it.  Don't
bother setting pptr in the final branch as it's not used thereafter.

Signed-off-by: Rene Scharfe 
---
 builtin/commit.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index a2baa6e..8976c3d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1642,7 +1642,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
const char *index_file, *reflog_msg;
char *nl;
unsigned char sha1[20];
-   struct commit_list *parents = NULL, **pptr = &parents;
+   struct commit_list *parents = NULL;
struct stat statbuf;
struct commit *current_head = NULL;
struct commit_extra_header *extra = NULL;
@@ -1688,20 +1688,18 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
if (!reflog_msg)
reflog_msg = "commit (initial)";
} else if (amend) {
-   struct commit_list *c;
-
if (!reflog_msg)
reflog_msg = "commit (amend)";
-   for (c = current_head->parents; c; c = c->next)
-   pptr = &commit_list_insert(c->item, pptr)->next;
+   parents = copy_commit_list(current_head->parents);
} else if (whence == FROM_MERGE) {
struct strbuf m = STRBUF_INIT;
FILE *fp;
int allow_fast_forward = 1;
+   struct commit_list **pptr = &parents;
 
if (!reflog_msg)
reflog_msg = "commit (merge)";
-   pptr = &commit_list_insert(current_head, pptr)->next;
+   pptr = commit_list_append(current_head, pptr);
fp = fopen(git_path_merge_head(), "r");
if (fp == NULL)
die_errno(_("could not open '%s' for reading"),
@@ -1712,7 +1710,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
parent = get_merge_parent(m.buf);
if (!parent)
die(_("Corrupt MERGE_HEAD file (%s)"), m.buf);
-   pptr = &commit_list_insert(parent, pptr)->next;
+   pptr = commit_list_append(parent, pptr);
}
fclose(fp);
strbuf_release(&m);
@@ -1729,7 +1727,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
reflog_msg = (whence == FROM_CHERRY_PICK)
? "commit (cherry-pick)"
: "commit";
-   pptr = &commit_list_insert(current_head, pptr)->next;
+   commit_list_insert(current_head, &parents);
}
 
/* Finally, get the commit message */
-- 
2.10.2



[PATCH] sha1_name: make wraparound of the index into ring-buffer explicit

2016-11-01 Thread René Scharfe
Overflow is defined for unsigned integers, but not for signed ones.
Wrap around explicitly for the new ring-buffer in find_unique_abbrev()
as we did in bb84735c for the ones in sha1_to_hex() and get_pathname(),
thus avoiding signed overflows and getting rid of the magic number 3.

Signed-off-by: Rene Scharfe 
---
 sha1_name.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sha1_name.c b/sha1_name.c
index 06409a3..73a915f 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -510,7 +510,8 @@ const char *find_unique_abbrev(const unsigned char *sha1, 
int len)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   char *hex = hexbuffer[3 & ++bufno];
+   char *hex = hexbuffer[bufno];
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
find_unique_abbrev_r(hex, sha1, len);
return hex;
 }
-- 
2.10.2



[PATCH RESEND] cocci: avoid self-references in object_id transformations

2016-11-01 Thread René Scharfe
The object_id functions oid_to_hex, oid_to_hex_r, oidclr, oidcmp, and
oidcpy are defined as wrappers of their legacy counterparts sha1_to_hex,
sha1_to_hex_r, hashclr, hashcmp, and hashcpy, respectively.  Make sure
that the Coccinelle transformations for converting legacy function calls
are not applied to these wrappers themselves, which would result in
tautological declarations.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/object_id.cocci | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/contrib/coccinelle/object_id.cocci 
b/contrib/coccinelle/object_id.cocci
index 0307624..09afdbf 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -17,10 +17,13 @@ expression E1;
 + oid_to_hex(&E1)
 
 @@
+identifier f != oid_to_hex;
 expression E1;
 @@
+  f(...) {...
 - sha1_to_hex(E1->hash)
 + oid_to_hex(E1)
+  ...}
 
 @@
 expression E1, E2;
@@ -29,10 +32,13 @@ expression E1, E2;
 + oid_to_hex_r(E1, &E2)
 
 @@
+identifier f != oid_to_hex_r;
 expression E1, E2;
 @@
+   f(...) {...
 - sha1_to_hex_r(E1, E2->hash)
 + oid_to_hex_r(E1, E2)
+  ...}
 
 @@
 expression E1;
@@ -41,10 +47,13 @@ expression E1;
 + oidclr(&E1)
 
 @@
+identifier f != oidclr;
 expression E1;
 @@
+  f(...) {...
 - hashclr(E1->hash)
 + oidclr(E1)
+  ...}
 
 @@
 expression E1, E2;
@@ -53,10 +62,13 @@ expression E1, E2;
 + oidcmp(&E1, &E2)
 
 @@
+identifier f != oidcmp;
 expression E1, E2;
 @@
+  f(...) {...
 - hashcmp(E1->hash, E2->hash)
 + oidcmp(E1, E2)
+  ...}
 
 @@
 expression E1, E2;
@@ -77,10 +89,13 @@ expression E1, E2;
 + oidcpy(&E1, &E2)
 
 @@
+identifier f != oidcpy;
 expression E1, E2;
 @@
+  f(...) {...
 - hashcpy(E1->hash, E2->hash)
 + oidcpy(E1, E2)
+  ...}
 
 @@
 expression E1, E2;
-- 
2.10.2



[PATCH 0/3] string-list: make string_list_sort() reentrant

2016-12-01 Thread René Scharfe
Use qsort_s() from C11 Annex K to make string_list_sort() safer, in
particular when called from parallel threads.

  compat: add qsort_s()
  add QSORT_S
  string-list: use QSORT_S in string_list_sort()

 Makefile  | 10 
 compat/qsort_s.c  | 69 +++
 git-compat-util.h | 11 +
 string-list.c | 13 ---
 4 files changed, 95 insertions(+), 8 deletions(-)
 create mode 100644 compat/qsort_s.c

-- 
2.11.0



[PATCH 1/3] compat: add qsort_s()

2016-12-01 Thread René Scharfe
The function qsort_s() was introduced with C11 Annex K; it provides the
ability to pass a context pointer to the comparison function, supports
the convention of using a NULL pointer for an empty array and performs a
few safety checks.

Add an implementation based on compat/qsort.c for platforms that lack a
native standards-compliant qsort_s() (i.e. basically everyone).  It
doesn't perform the full range of possible checks: It uses size_t
instead of rsize_t and doesn't check nmemb and size against RSIZE_MAX
because we probably don't have the restricted size type defined.  For
the same reason it returns int instead of errno_t.

Signed-off-by: Rene Scharfe 
---
 Makefile  | 10 
 compat/qsort_s.c  | 69 +++
 git-compat-util.h |  6 +
 3 files changed, 85 insertions(+)
 create mode 100644 compat/qsort_s.c

diff --git a/Makefile b/Makefile
index f53fcc90d..2245fd95d 100644
--- a/Makefile
+++ b/Makefile
@@ -279,6 +279,9 @@ all::
 # is a simplified version of the merge sort used in glibc. This is
 # recommended if Git triggers O(n^2) behavior in your platform's qsort().
 #
+# Define HAVE_QSORT_S if your platform provides a qsort_s() that's
+# compatible with the one described in C11 Annex K.
+#
 # Define UNRELIABLE_FSTAT if your system's fstat does not return the same
 # information on a not yet closed file that lstat would return for the same
 # file after it was closed.
@@ -1423,6 +1426,13 @@ ifdef INTERNAL_QSORT
COMPAT_CFLAGS += -DINTERNAL_QSORT
COMPAT_OBJS += compat/qsort.o
 endif
+
+ifdef HAVE_QSORT_S
+   COMPAT_CFLAGS += -DHAVE_QSORT_S
+else
+   COMPAT_OBJS += compat/qsort_s.o
+endif
+
 ifdef RUNTIME_PREFIX
COMPAT_CFLAGS += -DRUNTIME_PREFIX
 endif
diff --git a/compat/qsort_s.c b/compat/qsort_s.c
new file mode 100644
index 0..52d1f0a73
--- /dev/null
+++ b/compat/qsort_s.c
@@ -0,0 +1,69 @@
+#include "../git-compat-util.h"
+
+/*
+ * A merge sort implementation, simplified from the qsort implementation
+ * by Mike Haertel, which is a part of the GNU C Library.
+ * Added context pointer, safety checks and return value.
+ */
+
+static void msort_with_tmp(void *b, size_t n, size_t s,
+  int (*cmp)(const void *, const void *, void *),
+  char *t, void *ctx)
+{
+   char *tmp;
+   char *b1, *b2;
+   size_t n1, n2;
+
+   if (n <= 1)
+   return;
+
+   n1 = n / 2;
+   n2 = n - n1;
+   b1 = b;
+   b2 = (char *)b + (n1 * s);
+
+   msort_with_tmp(b1, n1, s, cmp, t, ctx);
+   msort_with_tmp(b2, n2, s, cmp, t, ctx);
+
+   tmp = t;
+
+   while (n1 > 0 && n2 > 0) {
+   if (cmp(b1, b2, ctx) <= 0) {
+   memcpy(tmp, b1, s);
+   tmp += s;
+   b1 += s;
+   --n1;
+   } else {
+   memcpy(tmp, b2, s);
+   tmp += s;
+   b2 += s;
+   --n2;
+   }
+   }
+   if (n1 > 0)
+   memcpy(tmp, b1, n1 * s);
+   memcpy(b, t, (n - n2) * s);
+}
+
+int git_qsort_s(void *b, size_t n, size_t s,
+   int (*cmp)(const void *, const void *, void *), void *ctx)
+{
+   const size_t size = st_mult(n, s);
+   char buf[1024];
+
+   if (!n)
+   return 0;
+   if (!b || !cmp)
+   return -1;
+
+   if (size < sizeof(buf)) {
+   /* The temporary array fits on the small on-stack buffer. */
+   msort_with_tmp(b, n, s, cmp, buf, ctx);
+   } else {
+   /* It's somewhat large, so malloc it.  */
+   char *tmp = xmalloc(size);
+   msort_with_tmp(b, n, s, cmp, tmp, ctx);
+   free(tmp);
+   }
+   return 0;
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index 87237b092..d25f0bd4c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -988,6 +988,12 @@ static inline void sane_qsort(void *base, size_t nmemb, 
size_t size,
qsort(base, nmemb, size, compar);
 }
 
+#ifndef HAVE_QSORT_S
+int git_qsort_s(void *base, size_t nmemb, size_t size,
+   int (*compar)(const void *, const void *, void *), void *ctx);
+#define qsort_s git_qsort_s
+#endif
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.11.0



[PATCH 2/3] add QSORT_S

2016-12-01 Thread René Scharfe
Add the macro QSORT_S, a convenient wrapper for qsort_s() that infers
the size of the array elements and dies on error.

Basically all possible errors are programming mistakes (passing NULL as
base of a non-empty array, passing NULL as comparison function,
out-of-bounds accesses), so terminating the program should be acceptable
for most callers.

Signed-off-by: Rene Scharfe 
---
 git-compat-util.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index d25f0bd4c..b707dd880 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -994,6 +994,11 @@ int git_qsort_s(void *base, size_t nmemb, size_t size,
 #define qsort_s git_qsort_s
 #endif
 
+#define QSORT_S(base, n, compar, ctx) do { \
+   if (qsort_s((base), (n), sizeof(*(base)), compar, ctx)) \
+   die("BUG: qsort_s() failed");   \
+} while (0)
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.11.0



[PATCH 3/3] string-list: use QSORT_S in string_list_sort()

2016-12-01 Thread René Scharfe
Pass the comparison function to cmp_items() via the context parameter of
qsort_s() instead of using a global variable.  That allows calling
string_list_sort() from multiple parallel threads.

Signed-off-by: Rene Scharfe 
---
 string-list.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/string-list.c b/string-list.c
index 8c83cac18..45016ad86 100644
--- a/string-list.c
+++ b/string-list.c
@@ -211,21 +211,18 @@ struct string_list_item *string_list_append(struct 
string_list *list,
list->strdup_strings ? xstrdup(string) : (char 
*)string);
 }
 
-/* Yuck */
-static compare_strings_fn compare_for_qsort;
-
-/* Only call this from inside string_list_sort! */
-static int cmp_items(const void *a, const void *b)
+static int cmp_items(const void *a, const void *b, void *ctx)
 {
+   compare_strings_fn cmp = ctx;
const struct string_list_item *one = a;
const struct string_list_item *two = b;
-   return compare_for_qsort(one->string, two->string);
+   return cmp(one->string, two->string);
 }
 
 void string_list_sort(struct string_list *list)
 {
-   compare_for_qsort = list->cmp ? list->cmp : strcmp;
-   QSORT(list->items, list->nr, cmp_items);
+   QSORT_S(list->items, list->nr, cmp_items,
+   list->cmp ? list->cmp : strcmp);
 }
 
 struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
-- 
2.11.0



Re: [PATCH 1/3] compat: add qsort_s()

2016-12-01 Thread René Scharfe
Am 01.12.2016 um 17:26 schrieb René Scharfe:
> The function qsort_s() was introduced with C11 Annex K; it provides the
> ability to pass a context pointer to the comparison function, supports
> the convention of using a NULL pointer for an empty array and performs a
> few safety checks.
> 
> Add an implementation based on compat/qsort.c for platforms that lack a
> native standards-compliant qsort_s() (i.e. basically everyone).  It
> doesn't perform the full range of possible checks: It uses size_t
> instead of rsize_t and doesn't check nmemb and size against RSIZE_MAX
> because we probably don't have the restricted size type defined.  For
> the same reason it returns int instead of errno_t.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  Makefile  | 10 
>  compat/qsort_s.c  | 69 
> +++
>  git-compat-util.h |  6 +
>  3 files changed, 85 insertions(+)
>  create mode 100644 compat/qsort_s.c

And here's the diff for compat/qsort_s.c with copy detection (-C -C):
---
 compat/{qsort.c => qsort_s.c} | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/compat/qsort.c b/compat/qsort_s.c
similarity index 62%
copy from compat/qsort.c
copy to compat/qsort_s.c
index 7d071afb7..52d1f0a73 100644
--- a/compat/qsort.c
+++ b/compat/qsort_s.c
@@ -3,11 +3,12 @@
 /*
  * A merge sort implementation, simplified from the qsort implementation
  * by Mike Haertel, which is a part of the GNU C Library.
+ * Added context pointer, safety checks and return value.
  */
 
 static void msort_with_tmp(void *b, size_t n, size_t s,
-  int (*cmp)(const void *, const void *),
-  char *t)
+  int (*cmp)(const void *, const void *, void *),
+  char *t, void *ctx)
 {
char *tmp;
char *b1, *b2;
@@ -21,13 +22,13 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
b1 = b;
b2 = (char *)b + (n1 * s);
 
-   msort_with_tmp(b1, n1, s, cmp, t);
-   msort_with_tmp(b2, n2, s, cmp, t);
+   msort_with_tmp(b1, n1, s, cmp, t, ctx);
+   msort_with_tmp(b2, n2, s, cmp, t, ctx);
 
tmp = t;
 
while (n1 > 0 && n2 > 0) {
-   if (cmp(b1, b2) <= 0) {
+   if (cmp(b1, b2, ctx) <= 0) {
memcpy(tmp, b1, s);
tmp += s;
b1 += s;
@@ -44,19 +45,25 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
memcpy(b, t, (n - n2) * s);
 }
 
-void git_qsort(void *b, size_t n, size_t s,
-  int (*cmp)(const void *, const void *))
+int git_qsort_s(void *b, size_t n, size_t s,
+   int (*cmp)(const void *, const void *, void *), void *ctx)
 {
const size_t size = st_mult(n, s);
char buf[1024];
 
+   if (!n)
+   return 0;
+   if (!b || !cmp)
+   return -1;
+
if (size < sizeof(buf)) {
/* The temporary array fits on the small on-stack buffer. */
-   msort_with_tmp(b, n, s, cmp, buf);
+   msort_with_tmp(b, n, s, cmp, buf, ctx);
} else {
/* It's somewhat large, so malloc it.  */
char *tmp = xmalloc(size);
-   msort_with_tmp(b, n, s, cmp, tmp);
+   msort_with_tmp(b, n, s, cmp, tmp, ctx);
free(tmp);
}
+   return 0;
 }



Re: [PATCH 1/3] compat: add qsort_s()

2016-12-01 Thread René Scharfe

Am 01.12.2016 um 21:19 schrieb Jeff King:

On Thu, Dec 01, 2016 at 12:14:42PM -0800, Junio C Hamano wrote:


Jeff King  writes:


To make matters more fun, apparently[1] there are multiple variants of
qsort_r with different argument orders. _And_ apparently Microsoft
defines qsort_s, but it's not quite the same thing. But all of that can
be dealt with by having more specific flags (HAVE_GNU_QSORT_R, etc).


AFAIU it went like this:

// FreeBSD 5.0 (2003)
void qsort_r(void *base, size_t nmemb, size_t size,
void *context,
int (*compar)(void *context, const void *x, const void *y));

// Microsoft Visual Studio 2005
void qsort_s(void *base, size_t nmemb, size_t size,
int (*compar)(void *context, const void *x, const void *y),
void *context);

// glibc 2.8 (2008)
void qsort_r(void *base, size_t nmemb, size_t size,
int (*compar)(const void *x, const void *y, void *context),
void *context);

// C11 Annex K (2011)
errno_t qsort_s(void *base, rsize_t nmemb, rsize_t size,
int (*compar)(const void *x, const void *y, void *context),
void *context);


It just seems like we should be able to do a better job of using the
system qsort in many cases.


Sure, platform-specific implementations can be shorter.


If we were to go that route, perhaps we shouldn't have HAVE_QSORT_S
so that Microsoft folks won't define it by mistake (instead perhaps
call it HAVE_ISO_QSORT_S or something).


OK.


I like your suggestion in general.  The body of git_qsort_s() on
systems without ISO_QSORT_S can do

 - GNU qsort_r() without any change in the parameters,

 - Microsoft qsort_s() with parameter reordered, or

 - Apple/BSD qsort_r() with parameter reordered.

and that would cover the major platforms.


Yes.

However, for MSys INTERNAL_QSORT is defined for some reason, so the 
platform's qsort(3) is not used there; I guess the same reason applies 
to qsort_s().  If it doesn't then an implementation may want to convert 
a call to the invalid parameter handler (which may show a dialog 
offering to Retry, Continue or Abort) into a non-zero return value.



Eh, wait.  BSD and Microsoft have paramters reordered in the
callback comparison function.  I suspect that would not fly very
well.


You can hack around it by passing a wrapper callback that flips the
arguments. Since we have a "void *" data pointer, that would point to a
struct holding the "real" callback and chaining to the original data
pointer.

It does incur the cost of an extra level of indirection for each
comparison, though (not just for each qsort call).


Indeed.  We'd need a perf test to measure that overhead before we could 
determine if that's a problem, though.



You could do it as zero-cost if you were willing to turn the comparison
function definition into a macro.


Ugh.  That either requires changing the signature of qsort_s() based on 
the underlying native function as well, or using a void pointer to pass 
the comparison function, no?  Let's not do that, at least not without a 
good reason.


René


Re: [PATCH 1/3] compat: add qsort_s()

2016-12-12 Thread René Scharfe

Am 01.12.2016 um 21:22 schrieb Junio C Hamano:

Junio C Hamano  writes:


Eh, wait.  BSD and Microsoft have paramters reordered in the
callback comparison function.  I suspect that would not fly very
well.


Hmm.  We could do it like this, which may not be too bad.


It's kinda cool to have a bespoke compatibility layer for major 
platforms, but the more I think about it the less I can answer why we 
would want that.  Safety, reliability and performance can't be good 
reasons -- if our fallback function lacks in these regards then we have 
to improve it in any case.


Text size could be a valid reason, but the full function only adds a bit 
more than 2KB to the unstripped git binary.


The flip side is we'd build an ifdef maze that's harder to read and a 
lot more difficult to test.


What do we get in return for that additional complexity?


#if APPLE_QSORT_R
struct apple_qsort_adapter {
int (*user_cmp)(const void *, const void *, void *);
void *user_ctx;
}

static int apple_qsort_adapter_cmp(void *ctx, const void *a, const void *b)
{
struct apple_qsort_adapter *wrapper_ctx = ctx;
return wrapper_ctx->user_cmp(a, b, wrapper_ctx->user_ctx);
}
#endif

int git_qsort_s(void *b, size_t n, size_t s,
   int (*cmp)(const void *, const void *, void *), void *ctx)
{
if (!n)
return 0;
if (!b || !cmp)
return -1;
#if GNU_QSORT_R
qsort_r(b, n, s, cmp, ctx);
#elif APPLE_QSORT_R
{
struct appple_qsort_adapter a = { cmp, ctx };
qsort_r(b, n, s, &a, appple_qsort_adapter_cmp);
}
#endif


Nit: The fallback for non-GNU, non-Apple systems is missing here, but 
the idea is illustrated clearly enough.



  return 0;
}





<    1   2   3   4   5   6   7   8   9   10   >