Re: [PATCH v2] range-diff: don't segfault with mode-only changes

2019-10-09 Thread Uwe Kleine-König
On Tue, Oct 08, 2019 at 06:38:43PM +0100, Thomas Gummerer wrote:
> In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11)
> the 'parse_git_diff_header' function was made public and useable by
> callers outside of apply.c.
> 
> However it was missed that its (then) only caller, 'find_header' did
> some error handling, and completing 'struct patch' appropriately.
> 
> range-diff then started using this function, and tried to handle this
> appropriately itself, but fell short in some cases.  This in turn
> would lead to range-diff segfaulting when there are mode-only changes
> in a range.
> 
> Move the error handling and completing of the struct into the
> 'parse_git_diff_header' function, so other callers can take advantage
> of it.  This fixes the segfault in 'git range-diff'.
> 
> Reported-by: Uwe Kleine-König 
> Signed-off-by: Thomas Gummerer 

This patch also makes git work again for the originally problematic
usecase.

Tested-by: Uwe Kleine-König 

Thanks for your quick reaction to my bug report,
Uwe Kleine-König

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |


Re: [PATCH v2] range-diff: don't segfault with mode-only changes

2019-10-08 Thread Johannes Schindelin
Hi Thomas,

On Tue, 8 Oct 2019, Thomas Gummerer wrote:

> In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11)
> the 'parse_git_diff_header' function was made public and useable by
> callers outside of apply.c.
>
> However it was missed that its (then) only caller, 'find_header' did
> some error handling, and completing 'struct patch' appropriately.
>
> range-diff then started using this function, and tried to handle this
> appropriately itself, but fell short in some cases.  This in turn
> would lead to range-diff segfaulting when there are mode-only changes
> in a range.
>
> Move the error handling and completing of the struct into the
> 'parse_git_diff_header' function, so other callers can take advantage
> of it.  This fixes the segfault in 'git range-diff'.
>
> Reported-by: Uwe Kleine-König 

Acked-by: Johannes Schindelin 

Ciao,
Dscho

> Signed-off-by: Thomas Gummerer 
> ---
>
> Thanks Junio and Dscho for your reviews.  I decided to lift the whole
> error handling behaviour from find_header into parse_git_diff_header,
> instead of just filling the two names with xstrdup(def_name) if
> (!old_name && !new_name && !!def_name).  I think the additional
> information presented there can be useful.  For example we would have
> gotten some "error: git diff header lacks filename information"
> instead of a segfault for the problem described in
> https://public-inbox.org/git/20191002141615.gb17...@kitsune.suse.cz/T/#me576615d7a151cf2ed46186c482fbd88f9959914.
>
> Dscho, I didn't re-use your test case here as I had already written
> one, and think what I have is slightly nicer in that it follows what
> most other range-diff tests do in using the fast-exported history.  It
> also expands the test coverage slightly, as we currently don't have
> any coverage of the mode-change header, but will with this test.
>
> The downside is of course that the fast export script is harder to
> understand than the test you had, at least for me, but I think the
> tradeoff of having the additional test coverage, and having it similar
> to the rest of the test script is worth it.  If you strongly prefer
> your test though I'm not going to be unhappy to use that :)
>
>  apply.c| 43 +-
>  t/t3206-range-diff.sh  | 40 +++
>  t/t3206/history.export | 31 +-
>  3 files changed, 92 insertions(+), 22 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index 57a61f2881..f8a046a6a5 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -1361,11 +1361,32 @@ int parse_git_diff_header(struct strbuf *root,
>   if (check_header_line(*linenr, patch))
>   return -1;
>   if (res > 0)
> - return offset;
> + goto done;
>   break;
>   }
>   }
>
> +done:
> + if (!patch->old_name && !patch->new_name) {
> + if (!patch->def_name) {
> + error(Q_("git diff header lacks filename information 
> when removing "
> +  "%d leading pathname component (line %d)",
> +  "git diff header lacks filename information 
> when removing "
> +  "%d leading pathname components (line %d)",
> +  parse_hdr_state.p_value),
> +   parse_hdr_state.p_value, *linenr);
> + return -128;
> + }
> + patch->old_name = xstrdup(patch->def_name);
> + patch->new_name = xstrdup(patch->def_name);
> + }
> + if ((!patch->new_name && !patch->is_delete) ||
> + (!patch->old_name && !patch->is_new)) {
> + error(_("git diff header lacks filename information "
> + "(line %d)"), *linenr);
> + return -128;
> + }
> + patch->is_toplevel_relative = 1;
>   return offset;
>  }
>
> @@ -1546,26 +1567,6 @@ static int find_header(struct apply_state *state,
>   return -128;
>   if (git_hdr_len <= len)
>   continue;
> - if (!patch->old_name && !patch->new_name) {
> - if (!patch->def_name) {
> - error(Q_("git diff header lacks 
> filename information when removing &q

[PATCH v2] range-diff: don't segfault with mode-only changes

2019-10-08 Thread Thomas Gummerer
In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11)
the 'parse_git_diff_header' function was made public and useable by
callers outside of apply.c.

However it was missed that its (then) only caller, 'find_header' did
some error handling, and completing 'struct patch' appropriately.

range-diff then started using this function, and tried to handle this
appropriately itself, but fell short in some cases.  This in turn
would lead to range-diff segfaulting when there are mode-only changes
in a range.

Move the error handling and completing of the struct into the
'parse_git_diff_header' function, so other callers can take advantage
of it.  This fixes the segfault in 'git range-diff'.

Reported-by: Uwe Kleine-König 
Signed-off-by: Thomas Gummerer 
---

Thanks Junio and Dscho for your reviews.  I decided to lift the whole
error handling behaviour from find_header into parse_git_diff_header,
instead of just filling the two names with xstrdup(def_name) if
(!old_name && !new_name && !!def_name).  I think the additional
information presented there can be useful.  For example we would have
gotten some "error: git diff header lacks filename information"
instead of a segfault for the problem described in
https://public-inbox.org/git/20191002141615.gb17...@kitsune.suse.cz/T/#me576615d7a151cf2ed46186c482fbd88f9959914.

Dscho, I didn't re-use your test case here as I had already written
one, and think what I have is slightly nicer in that it follows what
most other range-diff tests do in using the fast-exported history.  It
also expands the test coverage slightly, as we currently don't have
any coverage of the mode-change header, but will with this test.

The downside is of course that the fast export script is harder to
understand than the test you had, at least for me, but I think the
tradeoff of having the additional test coverage, and having it similar
to the rest of the test script is worth it.  If you strongly prefer
your test though I'm not going to be unhappy to use that :)

 apply.c| 43 +-
 t/t3206-range-diff.sh  | 40 +++
 t/t3206/history.export | 31 +-
 3 files changed, 92 insertions(+), 22 deletions(-)

diff --git a/apply.c b/apply.c
index 57a61f2881..f8a046a6a5 100644
--- a/apply.c
+++ b/apply.c
@@ -1361,11 +1361,32 @@ int parse_git_diff_header(struct strbuf *root,
if (check_header_line(*linenr, patch))
return -1;
if (res > 0)
-   return offset;
+   goto done;
break;
}
}
 
+done:
+   if (!patch->old_name && !patch->new_name) {
+   if (!patch->def_name) {
+   error(Q_("git diff header lacks filename information 
when removing "
+"%d leading pathname component (line %d)",
+"git diff header lacks filename information 
when removing "
+"%d leading pathname components (line %d)",
+parse_hdr_state.p_value),
+ parse_hdr_state.p_value, *linenr);
+   return -128;
+   }
+   patch->old_name = xstrdup(patch->def_name);
+   patch->new_name = xstrdup(patch->def_name);
+   }
+   if ((!patch->new_name && !patch->is_delete) ||
+   (!patch->old_name && !patch->is_new)) {
+   error(_("git diff header lacks filename information "
+   "(line %d)"), *linenr);
+   return -128;
+   }
+   patch->is_toplevel_relative = 1;
return offset;
 }
 
@@ -1546,26 +1567,6 @@ static int find_header(struct apply_state *state,
return -128;
if (git_hdr_len <= len)
continue;
-   if (!patch->old_name && !patch->new_name) {
-   if (!patch->def_name) {
-   error(Q_("git diff header lacks 
filename information when removing "
-   "%d leading pathname 
component (line %d)",
-   "git diff header lacks 
filename information when removing "
-   "%d leading pathname 
components (line %d)",
-   state->p_value),
-  

Re: [PATCH v2] upload-pack commit graph segfault fix

2019-09-12 Thread Taylor Blau
On Thu, Sep 12, 2019 at 10:41:22AM -0400, Jeff King wrote:
> Here's a re-roll of my "disable commit graph more gently" fix. Versus
> v1:

Thanks for sending a re-roll. I deployed this out to all of our servers
running git at GitHub, and it seems to be working fine.

For what it's worth, I don't have 'core.commitGraph' enabled on any of
those servers for now, but I'll turn it back on shortly.

>   - I've included a preparatory patch that speeds up
> prepare_commit_graph(). It's not strictly related, but there's a
> textual dependency. It could be easily spun off to its own series.
>
>   - a comment points out that the ordering of the "disable" check is
> important
>
>   - disable_commit_graph() now works on a repository struct
>
>   - typo fixes in the test comments
>
>   [1/2]: commit-graph: bump DIE_ON_LOAD check to actual load-time
>   [2/2]: upload-pack: disable commit graph more gently for shallow traversal
>
>  commit-graph.c| 18 +++---
>  commit-graph.h|  6 ++
>  repository.h  |  3 +++
>  t/t5500-fetch-pack.sh | 38 ++
>  upload-pack.c |  2 +-
>  5 files changed, 63 insertions(+), 4 deletions(-)
>
> -Peff

Thanks,
Taylor


Re: [PATCH v2] upload-pack commit graph segfault fix

2019-09-12 Thread Jeff King
On Thu, Sep 12, 2019 at 10:41:22AM -0400, Jeff King wrote:

> Here's a re-roll of my "disable commit graph more gently" fix. Versus
> v1:
> 
>   - I've included a preparatory patch that speeds up
> prepare_commit_graph(). It's not strictly related, but there's a
> textual dependency. It could be easily spun off to its own series.

I _didn't_ include here the other speedup from this thread:

  https://public-inbox.org/git/20190912011137.ga23...@sigill.intra.peff.net/

That's worth doing, but is totally independent (conceptually and
textually).

-Peff


[PATCH v2] upload-pack commit graph segfault fix

2019-09-12 Thread Jeff King
Here's a re-roll of my "disable commit graph more gently" fix. Versus
v1:

  - I've included a preparatory patch that speeds up
prepare_commit_graph(). It's not strictly related, but there's a
textual dependency. It could be easily spun off to its own series.

  - a comment points out that the ordering of the "disable" check is
important

  - disable_commit_graph() now works on a repository struct

  - typo fixes in the test comments

  [1/2]: commit-graph: bump DIE_ON_LOAD check to actual load-time
  [2/2]: upload-pack: disable commit graph more gently for shallow traversal

 commit-graph.c| 18 +++---
 commit-graph.h|  6 ++
 repository.h  |  3 +++
 t/t5500-fetch-pack.sh | 38 ++
 upload-pack.c |  2 +-
 5 files changed, 63 insertions(+), 4 deletions(-)

-Peff


Re: git name-rev segfault

2019-07-29 Thread Tamas Papp

Thanks for letting me know and for the corrections too.


Cheers,

tamas


On 7/29/19 9:50 PM, Jeff King wrote:

On Mon, Jul 29, 2019 at 04:19:47PM +0200, Tamas Papp wrote:


Generate 100k file into a repository:

#!/bin/bash

rm -rf .git test.file
git init
git config user.email a@b
git config user.name c

time for i in {1..10}
do
   [ $((i % 2)) -eq 1 ] && echo $i>test.file || echo 0 >test.file
   git add test.file

   git commit -m "$i committed"

done

I lost patience kicking off two hundred thousand processes. Try this:

   for i in {1..10}
   do
 echo "commit HEAD"
 echo "committer c  $i +"
 echo "data <
Run git on it:

$ git name-rev a20f6989b75fa63ec6259a988e38714e1f5328a0

Anybody who runs your script will get a different sha1 because of the
change in timestamps. I guess this is HEAD, though. I also needed to
have an actual tag to find. So:

   git tag old-tag HEAD~9
   git name-rev HEAD

segfaults for me.


Could you coment on it?

This is a known issue. The algorithm used by name-rev is recursive, and
you can run out of stack space in some deep cases. There's more
discussion this thread:

   https://public-inbox.org/git/6a4cbbee-ffc6-739b-d649-079ba0143...@grubix.eu/

including some patches that document the problem with an expected
failure in our test suite. Nobody has actually rewritten the C code yet,
though.

-Peff


Re: git name-rev segfault

2019-07-29 Thread Jeff King
On Mon, Jul 29, 2019 at 04:19:47PM +0200, Tamas Papp wrote:

> Generate 100k file into a repository:
> 
> #!/bin/bash
> 
> rm -rf .git test.file
> git init
> git config user.email a@b
> git config user.name c
> 
> time for i in {1..10}
> do
>   [ $((i % 2)) -eq 1 ] && echo $i>test.file || echo 0 >test.file
>   git add test.file
> 
>   git commit -m "$i committed"
> 
> done

I lost patience kicking off two hundred thousand processes. Try this:

  for i in {1..10}
  do
echo "commit HEAD"
echo "committer c  $i +"
echo "data < Run git on it:
> 
> $ git name-rev a20f6989b75fa63ec6259a988e38714e1f5328a0

Anybody who runs your script will get a different sha1 because of the
change in timestamps. I guess this is HEAD, though. I also needed to
have an actual tag to find. So:

  git tag old-tag HEAD~9
  git name-rev HEAD

segfaults for me.

> Could you coment on it?

This is a known issue. The algorithm used by name-rev is recursive, and
you can run out of stack space in some deep cases. There's more
discussion this thread:

  https://public-inbox.org/git/6a4cbbee-ffc6-739b-d649-079ba0143...@grubix.eu/

including some patches that document the problem with an expected
failure in our test suite. Nobody has actually rewritten the C code yet,
though.

-Peff


git name-rev segfault

2019-07-29 Thread Tamas Papp

hi All,


We ran into a segfault with all version of command line git.

We are able to reproduce this issue by this way:


Generate 100k file into a repository:

#!/bin/bash

rm -rf .git test.file
git init
git config user.email a@b
git config user.name c

time for i in {1..10}
do
  [ $((i % 2)) -eq 1 ] && echo $i>test.file || echo 0 >test.file
  git add test.file

  git commit -m "$i committed"

done


Run git on it:

$ git name-rev a20f6989b75fa63ec6259a988e38714e1f5328a0



Could you coment on it?


Thank you,

tamas



Re: git segfault in tag verify (patch included)

2019-07-16 Thread Junio C Hamano
Jeff King  writes:

> Something like:
>
>   for (line = buf; *line; line = next) {
>   next = strchrnul(line, '\n');
>
>   ... do stuff ...
>   /* find a space within the line */
>   space = memchr(line, ' ', next - line);
>   }

I am not sure about the memchr() thing, but "prepare next that is
constant thru the iteration, and update line to it at the end of
each iteration" is a robust pattern to follow.  I like it.



Re: git segfault in tag verify (patch included)

2019-07-16 Thread Jeff King
On Tue, Jul 16, 2019 at 11:20:48AM -0700, Junio C Hamano wrote:

> That is this thing.
> 
> static void parse_gpg_output(struct signature_check *sigc)
> {
> const char *buf = sigc->gpg_status;
> const char *line, *next;
> int i, j;
> int seen_exclusive_status = 0;
> 
> /* Iterate over all lines */
> for (line = buf; *line; line = strchrnul(line+1, '\n')) {
> while (*line == '\n')
> line++;
> /* Skip lines that don't start with GNUPG status */
> if (!skip_prefix(line, "[GNUPG:] ", &line))
> continue;
> 
> If the GPG output ends with a trailing blank line, we skip and get
> to the terminating NUL, then find that it does not begin with
> the "[GNUPG:] " prefix, and hit the continue.  We try to scan and
> look for LF (or stop at the end of the string) for the next round,
> starting at one past where we are, which is already the terminating
> NUL.  Ouch.
> 
> Good finding.

Yeah. The patch below looks fine, and I do not see any other
out-of-bounds issues in the code (there is a similar "next + 1" in the
inner loop, but it checks for an empty line right beforehand already).

I find these kind of "+1" pointer increments without constraint checking
are error-prone.  I suspect this whole loop could be a bit easier to
follow by finding the next line boundary at the start of the loop, and
jumping to it in the for-loop advancement. And then within the loop, we
know the boundaries of the line (right now, for example, we issue a
strchrnul() looking for a space that might unexpectedly cross line
boundaries).

Something like:

  for (line = buf; *line; line = next) {
next = strchrnul(line, '\n');

... do stuff ...
/* find a space within the line */
space = memchr(line, ' ', next - line);
  }

or even:

  for (line = buf; *line; line += len) {
size_t len = strchrnul(line, '\n') - line;
...
space = memchr(line, ' ', linelen);
  }

but it may not be worth the churn. It's just as likely to introduce a
new bug. ;)

I've also found working with strbuf_getline() to be very pleasant for a
lot of these cases (in which you get a real per-line NUL-terminated
string), but of course it implies an extra copy.

-Peff


Re: git segfault in tag verify (patch included)

2019-07-16 Thread Steven Roberts
Thanks. I hope this works ok for you (see attached).

On Tue, Jul 16, 2019 at 11:20 AM Junio C Hamano  wrote:
>
> Steven Roberts  writes:
>
> > I believe I have found an off-by-one error in git.
> >
> > Please see https://marc.info/?l=openbsd-ports&m=156326783610123&w=2
>
> That is this thing.
>
> static void parse_gpg_output(struct signature_check *sigc)
> {
> const char *buf = sigc->gpg_status;
> const char *line, *next;
> int i, j;
> int seen_exclusive_status = 0;
>
> /* Iterate over all lines */
> for (line = buf; *line; line = strchrnul(line+1, '\n')) {
> while (*line == '\n')
> line++;
> /* Skip lines that don't start with GNUPG status */
> if (!skip_prefix(line, "[GNUPG:] ", &line))
> continue;
>
> If the GPG output ends with a trailing blank line, we skip and get
> to the terminating NUL, then find that it does not begin with
> the "[GNUPG:] " prefix, and hit the continue.  We try to scan and
> look for LF (or stop at the end of the string) for the next round,
> starting at one past where we are, which is already the terminating
> NUL.  Ouch.
>
> Good finding.
>
> We need your sign-off (see Documentation/SubmittingPatches).
>
> Thanks.
>
>
> -- >8 --
> From: Steven Roberts 
> Subject: gpg-interface: do not scan past the end of buffer
>
> If the GPG output ends with trailing blank lines, after skipping
> them over inside the loop to find the terminating NUL at the end,
> the loop ends up looking for the next line, starting past the end.
>
> ---
>  gpg-interface.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/gpg-interface.c b/gpg-interface.c
> index 8ed274533f..eb55d46ea4 100644
> --- a/gpg-interface.c
> +++ b/gpg-interface.c
> @@ -116,6 +116,9 @@ static void parse_gpg_output(struct signature_check *sigc)
> for (line = buf; *line; line = strchrnul(line+1, '\n')) {
> while (*line == '\n')
> line++;
> +   if (!*line)
> +   break;
> +
> /* Skip lines that don't start with GNUPG status */
> if (!skip_prefix(line, "[GNUPG:] ", &line))
> continue;
>


-- 
Steven Roberts | https://www.fenderq.com/
From d48814273a50cf0b293148cc40a6a5cc7c13686e Mon Sep 17 00:00:00 2001
From: Steven Roberts 
Date: Tue, 16 Jul 2019 11:40:46 -0700
Subject: [PATCH] gpg-interface: do not scan past the end of buffer

Signed-off-by: Steven Roberts 
---
 gpg-interface.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gpg-interface.c b/gpg-interface.c
index 8ed274533f..775475131d 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -116,6 +116,11 @@ static void parse_gpg_output(struct signature_check *sigc)
 	for (line = buf; *line; line = strchrnul(line+1, '\n')) {
 		while (*line == '\n')
 			line++;
+
+		/* Break out of trailing '\n' */
+		if (!*line)
+			break;
+
 		/* Skip lines that don't start with GNUPG status */
 		if (!skip_prefix(line, "[GNUPG:] ", &line))
 			continue;
-- 
2.21.0



Re: git segfault in tag verify (patch included)

2019-07-16 Thread Junio C Hamano
Steven Roberts  writes:

> I believe I have found an off-by-one error in git.
>
> Please see https://marc.info/?l=openbsd-ports&m=156326783610123&w=2

That is this thing.

static void parse_gpg_output(struct signature_check *sigc)
{
const char *buf = sigc->gpg_status;
const char *line, *next;
int i, j;
int seen_exclusive_status = 0;

/* Iterate over all lines */
for (line = buf; *line; line = strchrnul(line+1, '\n')) {
while (*line == '\n')
line++;
/* Skip lines that don't start with GNUPG status */
if (!skip_prefix(line, "[GNUPG:] ", &line))
continue;

If the GPG output ends with a trailing blank line, we skip and get
to the terminating NUL, then find that it does not begin with
the "[GNUPG:] " prefix, and hit the continue.  We try to scan and
look for LF (or stop at the end of the string) for the next round,
starting at one past where we are, which is already the terminating
NUL.  Ouch.

Good finding.

We need your sign-off (see Documentation/SubmittingPatches).

Thanks.


-- >8 --
From: Steven Roberts 
Subject: gpg-interface: do not scan past the end of buffer

If the GPG output ends with trailing blank lines, after skipping
them over inside the loop to find the terminating NUL at the end,
the loop ends up looking for the next line, starting past the end.

---
 gpg-interface.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gpg-interface.c b/gpg-interface.c
index 8ed274533f..eb55d46ea4 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -116,6 +116,9 @@ static void parse_gpg_output(struct signature_check *sigc)
for (line = buf; *line; line = strchrnul(line+1, '\n')) {
while (*line == '\n')
line++;
+   if (!*line)
+   break;
+
/* Skip lines that don't start with GNUPG status */
if (!skip_prefix(line, "[GNUPG:] ", &line))
continue;



git segfault in tag verify (patch included)

2019-07-16 Thread Steven Roberts
Hi,

I believe I have found an off-by-one error in git.

Please see https://marc.info/?l=openbsd-ports&m=156326783610123&w=2


Re: new segfault in master (6a6c0f10a70a6eb1)

2019-05-11 Thread Duy Nguyen
On Sun, May 12, 2019 at 6:02 AM Jeff King  wrote:
>
> On Sat, May 11, 2019 at 06:31:20PM -0400, Jeff King wrote:
>
> > On Sat, May 11, 2019 at 08:57:11PM +, Eric Wong wrote:
> >
> > > This test-tool submodule segfault seems new.  Noticed it while
> > > checking dmesg for other things.
> >
> > Yeah, I hadn't seen it before. It's almost certainly the expect_failure
> > added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
> > 2018-10-25), since otherwise we'd be complaining of a test failure.
> >
> > I know we don't expect this to do the right thing yet, but it seems like
> > there's still a bug, since the test seems to think we should output a
> > nice message (and it's possible that the segfault can be triggered from
> > non-test-tool code, too).
> >
> > +cc the author.
>
> Actually, the plot thickens. That test _used to_ correctly expect
> failure (well, sort of -- it greps for the string with %s, which is
> wrong!). But then more recently in d9b8b8f896 (submodule-config.c: use
> repo_get_oid for reading .gitmodules, 2019-04-16), it started actually
> doing the lookup in the correct repo. And that started the segfault,
> because nobody has actually loaded the index for the submodule.
>
> I don't think this can be triggered outside of test-tool. There are
> four ways to get to config_from_gitmodules():
>
>   - via repo_read_gitmodules(), which explicitly loads the index
>
>   - via print_config_from_gitmodules(). This is called from
> submodule--helper, but only with the_repository as the argument (and
> I _think_ that the_repository->index is never NULL, because we point
> it at the_index).
>
>   - via fetch_config_from_gitmodules(), which always passes
> the_repository
>
>   - via update_clone_config_from_gitmodules(), likewise
>
> But regardless, I think it makes sense to load the index on demand when
> we need it here, which makes Antonio's original test pass (like the
> patch below).
>
> The segfault ultimately comes from repo_get_oid(); we feed it
> ":.gitmodules" and it blindly looks at repo->index. It's probably worth
> it being a bit more defensive and just returning "no such entry" if
> there's no index to look at (it could also load on demand, I guess, but
> it seems like too low a level to be making that kind of decision).
>
> I'm out of time for now, but I'll look into cleaning this up and writing
> a real commit message later.
>
> ---
> diff --git a/submodule-config.c b/submodule-config.c
> index 4264ee216f..ad2444bcec 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -630,7 +630,8 @@ static void config_from_gitmodules(config_fn_t fn, struct 
> repository *repo, void
> file = repo_worktree_path(repo, GITMODULES_FILE);
> if (file_exists(file)) {
> config_source.file = file;
> -   } else if (repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0 ||
> +   } else if ((repo_read_index(repo) >= 0 &&
> +   repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0) 
> ||
>repo_get_oid(repo, GITMODULES_HEAD, &oid) >= 0) {
> config_source.blob = oidstr = 
> xstrdup(oid_to_hex(&oid));
> if (repo != the_repository)
> diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
> index fcc0fb82d8..ad28e93880 100755
> --- a/t/t7411-submodule-config.sh
> +++ b/t/t7411-submodule-config.sh
> @@ -243,18 +243,14 @@ test_expect_success 'reading nested submodules config' '
> )
>  '
>
> -# When this test eventually passes, before turning it into
> -# test_expect_success, remember to replace the test_i18ngrep below with
> -# a "test_must_be_empty warning" to be sure that the warning is actually
> -# removed from the code.
> -test_expect_failure 'reading nested submodules config when .gitmodules is 
> not in the working tree' '
> +test_expect_success 'reading nested submodules config when .gitmodules is 
> not in the working tree' '

I did miss this test. Yeah your fix makes sense.

> test_when_finished "git -C super/submodule checkout .gitmodules" &&
> (cd super &&
> echo "./nested_submodule" >expect &&
> rm submodule/.gitmodules &&
> test-tool submodule-nested-repo-config \
> submodule submodule.nested_submodule.url >actual 
> 2>warning &&
> -   test_i18ngrep "nested submodules without %s in the working 
> tree are not supported yet" warning &&
> +   test_must_be_empty warning &&
> test_cmp expect actual
> )
>  '



-- 
Duy


Re: new segfault in master (6a6c0f10a70a6eb1)

2019-05-11 Thread Jeff King
On Sat, May 11, 2019 at 06:31:20PM -0400, Jeff King wrote:

> On Sat, May 11, 2019 at 08:57:11PM +, Eric Wong wrote:
> 
> > This test-tool submodule segfault seems new.  Noticed it while
> > checking dmesg for other things.
> 
> Yeah, I hadn't seen it before. It's almost certainly the expect_failure
> added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
> 2018-10-25), since otherwise we'd be complaining of a test failure.
> 
> I know we don't expect this to do the right thing yet, but it seems like
> there's still a bug, since the test seems to think we should output a
> nice message (and it's possible that the segfault can be triggered from
> non-test-tool code, too).
> 
> +cc the author.

Actually, the plot thickens. That test _used to_ correctly expect
failure (well, sort of -- it greps for the string with %s, which is
wrong!). But then more recently in d9b8b8f896 (submodule-config.c: use
repo_get_oid for reading .gitmodules, 2019-04-16), it started actually
doing the lookup in the correct repo. And that started the segfault,
because nobody has actually loaded the index for the submodule.

I don't think this can be triggered outside of test-tool. There are
four ways to get to config_from_gitmodules():

  - via repo_read_gitmodules(), which explicitly loads the index

  - via print_config_from_gitmodules(). This is called from
submodule--helper, but only with the_repository as the argument (and
I _think_ that the_repository->index is never NULL, because we point
it at the_index).

  - via fetch_config_from_gitmodules(), which always passes
the_repository

  - via update_clone_config_from_gitmodules(), likewise

But regardless, I think it makes sense to load the index on demand when
we need it here, which makes Antonio's original test pass (like the
patch below).

The segfault ultimately comes from repo_get_oid(); we feed it
":.gitmodules" and it blindly looks at repo->index. It's probably worth
it being a bit more defensive and just returning "no such entry" if
there's no index to look at (it could also load on demand, I guess, but
it seems like too low a level to be making that kind of decision).

I'm out of time for now, but I'll look into cleaning this up and writing
a real commit message later.

---
diff --git a/submodule-config.c b/submodule-config.c
index 4264ee216f..ad2444bcec 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -630,7 +630,8 @@ static void config_from_gitmodules(config_fn_t fn, struct 
repository *repo, void
file = repo_worktree_path(repo, GITMODULES_FILE);
if (file_exists(file)) {
config_source.file = file;
-   } else if (repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0 ||
+   } else if ((repo_read_index(repo) >= 0 &&
+   repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0) ||
   repo_get_oid(repo, GITMODULES_HEAD, &oid) >= 0) {
config_source.blob = oidstr = xstrdup(oid_to_hex(&oid));
if (repo != the_repository)
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index fcc0fb82d8..ad28e93880 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -243,18 +243,14 @@ test_expect_success 'reading nested submodules config' '
)
 '
 
-# When this test eventually passes, before turning it into
-# test_expect_success, remember to replace the test_i18ngrep below with
-# a "test_must_be_empty warning" to be sure that the warning is actually
-# removed from the code.
-test_expect_failure 'reading nested submodules config when .gitmodules is not 
in the working tree' '
+test_expect_success 'reading nested submodules config when .gitmodules is not 
in the working tree' '
test_when_finished "git -C super/submodule checkout .gitmodules" &&
(cd super &&
echo "./nested_submodule" >expect &&
rm submodule/.gitmodules &&
test-tool submodule-nested-repo-config \
submodule submodule.nested_submodule.url >actual 
2>warning &&
-   test_i18ngrep "nested submodules without %s in the working tree 
are not supported yet" warning &&
+   test_must_be_empty warning &&
test_cmp expect actual
)
 '


Re: new segfault in master (6a6c0f10a70a6eb1)

2019-05-11 Thread Jeff King
On Sat, May 11, 2019 at 08:57:11PM +, Eric Wong wrote:

> This test-tool submodule segfault seems new.  Noticed it while
> checking dmesg for other things.

Yeah, I hadn't seen it before. It's almost certainly the expect_failure
added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
2018-10-25), since otherwise we'd be complaining of a test failure.

I know we don't expect this to do the right thing yet, but it seems like
there's still a bug, since the test seems to think we should output a
nice message (and it's possible that the segfault can be triggered from
non-test-tool code, too).

+cc the author.

> There's also "name-rev HEAD~4000" (bottom), which is old, I think...
> [...]
> Looks like a stack overflow:

Yeah, this one is old and expected. It's also in an expect_failure. The
CI we run things through at GitHub complains if there are any segfaults,
and I hacked around it with the patch below.

I sort of assumed nobody else cared, since they hadn't mentioned it. But
we could do something similar. Though note in my version the default is
"do not run the test", and we'd maybe want to flip it the other way (and
also break up the setup step so that the succeeding test actually runs).

-- >8 --
Subject: [PATCH] t6120: mark a failing test with SEGFAULT_OK prereq

Upstream recently added a test of name-rev on a deep repository, which
shows that its recursive algorithm can blow out the stack and segfault.
We have several such tests already, but the twist here is that it's
expect_failure. So we _know_ it's going to segfault and Git's test suite
is OK with that, until the problem is fixed.

But our CI is not so forgiving. If it sees any segfault at all, it
interrupts the test run and declares the whole thing a failure.

Let's just skip this test by adding a prerequisite that isn't filled.
It's not telling us anything interesting. And if it ever gets fixed
upstream, that will cause a conflict and we can start running it.

Note that we also have to skip the test after it, which relies on the
state set up by the first one. This isn't a big deal, as it's not
testing code that we're likely to change ourselves.

Signed-off-by: Jeff King 
---
 t/t6120-describe.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..9d98b95ba6 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,7 +310,7 @@ test_expect_success 'describe ignoring a borken submodule' '
grep broken out
 '
 
-test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
+test_expect_failure ULIMIT_STACK_SIZE,SEGFAULT_OK 'name-rev works in a deep 
repo' '
i=1 &&
while test $i -lt 8000
do
@@ -331,7 +331,7 @@ EOF"
test_cmp expect actual
 '
 
-test_expect_success ULIMIT_STACK_SIZE 'describe works in a deep repo' '
+test_expect_success ULIMIT_STACK_SIZE,SEGFAULT_OK 'describe works in a deep 
repo' '
git tag -f far-far-away HEAD~7999 &&
echo "far-far-away" >expect &&
git describe --tags --abbrev=0 HEAD~4000 >actual &&
-- 
2.21.0.1388.g2b1efd806f



new segfault in master (6a6c0f10a70a6eb1)

2019-05-11 Thread Eric Wong
This test-tool submodule segfault seems new.  Noticed it while
checking dmesg for other things.
There's also "name-rev HEAD~4000" (bottom), which is old, I think...

Core was generated by `$WT/t/helper/test-tool submodule-nested-repo-config 
submodule sub'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x559b12746174 in get_oid_with_context_1 (
repo=repo@entry=0x7ffc5de3cf30, 
name=name@entry=0x559b1280a882 ":.gitmodules", flags=flags@entry=0, 
prefix=prefix@entry=0x0, oid=oid@entry=0x7ffc5de3ce80, 
oc=oc@entry=0x7ffc5de3ce20) at sha1-name.c:1840
1840if (!repo->index->cache)
(gdb) #0  0x559b12746174 in get_oid_with_context_1 (
repo=repo@entry=0x7ffc5de3cf30, 
name=name@entry=0x559b1280a882 ":.gitmodules", flags=flags@entry=0, 
prefix=prefix@entry=0x0, oid=oid@entry=0x7ffc5de3ce80, 
oc=oc@entry=0x7ffc5de3ce20) at sha1-name.c:1840
#1  0x559b12746dc3 in get_oid_with_context (oc=0x7ffc5de3ce20, 
oid=0x7ffc5de3ce80, flags=0, str=str@entry=0x559b1280a882 ":.gitmodules", 
repo=repo@entry=0x7ffc5de3cf30) at sha1-name.c:1946
#2  repo_get_oid (r=r@entry=0x7ffc5de3cf30, 
name=name@entry=0x559b1280a882 ":.gitmodules", 
oid=oid@entry=0x7ffc5de3ce80) at sha1-name.c:1595
#3  0x559b12753447 in config_from_gitmodules (
fn=fn@entry=0x559b127534b0 , 
repo=repo@entry=0x7ffc5de3cf30, data=0x559b145802c0)
at submodule-config.c:633
#4  0x559b12754664 in print_config_from_gitmodules (
repo=repo@entry=0x7ffc5de3cf30, key=)
at submodule-config.c:742
#5  0x559b126dca4f in cmd__submodule_nested_repo_config (
argc=, argv=0x7ffc5de3d290)
at t/helper/test-submodule-nested-repo-config.c:27
#6  0x559b126d367f in cmd_main (argc=3, argv=0x7ffc5de3d290)
at t/helper/test-tool.c:109
#7  0x559b126d337a in main (argc=4, argv=0x7ffc5de3d288)
at common-main.c:50
(gdb) quit


Looks like a stack overflow:

Core was generated by `$WT/git name-rev HEAD~4000'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f4df1896d6a in _int_malloc (
av=av@entry=0x7f4df1bb7b00 , bytes=bytes@entry=33)
at malloc.c:3444
(gdb) #0  0x7f4df1896d6a in _int_malloc (
av=av@entry=0x7f4df1bb7b00 , bytes=bytes@entry=33)
at malloc.c:3444
#1  0x7f4df1897b68 in malloc_check (sz=32, caller=)
at hooks.c:295
#2  0x5642d1240f51 in do_xmalloc (size=size@entry=32, 
gentle=gentle@entry=0) at wrapper.c:60
#3  0x5642d12410e7 in xmalloc (size=size@entry=32) at wrapper.c:87
#4  0x5642d10c75b1 in name_rev (commit=0x5642d2357bb0, 
tip_name=tip_name@entry=0x5642d245e7c0 "master", 
taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1044, 
distance=distance@entry=1044, from_tag=from_tag@entry=0, deref=0)
at builtin/name-rev.c:103
#5  0x5642d10c74e3 in name_rev (commit=, 
tip_name=tip_name@entry=0x5642d245e7c0 "master", 
taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1043, 
distance=distance@entry=1043, from_tag=from_tag@entry=0, deref=0)
at builtin/name-rev.c:138

...

#1047 0x5642d10c74e3 in name_rev (commit=, 
tip_name=tip_name@entry=0x5642d245e7c0 "master", 
taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1, 
distance=distance@entry=1, from_tag=from_tag@entry=0, deref=0)
at builtin/name-rev.c:138
#1048 0x5642d10c74e3 in name_rev (commit=commit@entry=0x5642d22e6420, 
tip_name=0x5642d245e7c0 "master", taggerdate=taggerdate@entry=1000799900, 
generation=generation@entry=0, distance=distance@entry=0, 
from_tag=from_tag@entry=0, deref=0) at builtin/name-rev.c:138
#1049 0x5642d10c7889 in name_ref (path=, 
oid=0x5642d2465bd8, flags=, cb_data=)
at builtin/name-rev.c:276
#1050 0x5642d11d5074 in do_for_each_repo_ref_iterator (
r=0x5642d1546de0 , iter=0x5642d245da20, 
fn=fn@entry=0x5642d11cab10 , 
cb_data=cb_data@entry=0x7ffc71fffa90) at refs/iterator.c:418
#1051 0x5642d11cc7eb in do_for_each_ref (refs=, 
prefix=prefix@entry=0x5642d12a0450 "", 
fn=fn@entry=0x5642d10c75f0 , trim=trim@entry=0, 
flags=flags@entry=0, cb_data=cb_data@entry=0x7ffc71fffb40) at refs.c:1496
#1052 0x5642d11cd488 in refs_for_each_ref (cb_data=0x7ffc71fffb40, 
fn=0x5642d10c75f0 , refs=) at refs.c:1502
#1053 for_each_ref (fn=fn@entry=0x5642d10c75f0 , 
cb_data=cb_data@entry=0x7ffc71fffb40) at refs.c:1507
#1054 0x5642d10c7ec5 in cmd_name_rev (argc=, 
argv=0x7ffc71fffb40, prefix=) at builtin/name-rev.c:490
#1055 0x5642d1070d68 in run_builtin (argv=, 
argc=, p=) at git.c:444
#1056 handle_builtin (argc=2, argv=0x7ffc72000ac0) at git.c:675
#1057 0x5642d1071d1e in run_argv (argv=0x7ffc72000840, 
argcp=0x7ffc7200084c) at git.c:742
#1058 cmd_main (argc=, argv=) at git.c:876
#1059 0x5642d107094a in main (argc=3, argv=0x7ffc72000ab8)
at common-main.c:50
(gdb) quit


[PATCH v3 3/8] commit-graph: fix segfault on e.g. "git status"

2019-03-25 Thread Ævar Arnfjörð Bjarmason
When core.commitGraph=true is set, various common commands now consult
the commit graph. Because the commit-graph code is very trusting of
its input data, it's possibly to construct a graph that'll cause an
immediate segfault on e.g. "status" (and e.g. "log", "blame", ...). In
some other cases where git immediately exits with a cryptic error
about the graph being broken.

The root cause of this is that while the "commit-graph verify"
sub-command exhaustively verifies the graph, other users of the graph
simply trust the graph, and will e.g. deference data found at certain
offsets as pointers, causing segfaults.

This change does the bare minimum to ensure that we don't segfault in
the common fill_commit_in_graph() codepath called by
e.g. setup_revisions(), to do this instrument the "commit-graph
verify" tests to always check if "status" would subsequently
segfault. This fixes the following tests which would previously
segfault:

not ok 50 - detect low chunk count
not ok 51 - detect missing OID fanout chunk
not ok 52 - detect missing OID lookup chunk
not ok 53 - detect missing commit data chunk

Those happened because with the commit-graph enabled setup_revisions()
would eventually call fill_commit_in_graph(), where e.g.
g->chunk_commit_data is used early as an offset (and will be
0x0). With this change we get far enough to detect that the graph is
broken, and show an error instead. E.g.:

$ git status; echo $?
error: commit-graph is missing the Commit Data chunk
1

That also sucks, we should *warn* and not hard-fail "status" just
because the commit-graph is corrupt, but fixing is left to a follow-up
change.

A side-effect of changing the reporting from graph_report() to error()
is that we now have an "error: " prefix for these even for
"commit-graph verify". Pseudo-diff before/after:

$ git commit-graph verify
-commit-graph is missing the Commit Data chunk
+error: commit-graph is missing the Commit Data chunk

Changing that is OK. Various errors it emits now early on are prefixed
with "error: ", moving these over and changing the output doesn't
break anything.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 commit-graph.c  | 43 -
 t/t5318-commit-graph.sh |  3 ++-
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 47e9be0a3a..f8201d888b 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -112,6 +112,36 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
return ret;
 }
 
+static int verify_commit_graph_lite(struct commit_graph *g)
+{
+   /*
+* Basic validation shared between parse_commit_graph()
+* which'll be called every time the graph is used, and the
+* much more expensive verify_commit_graph() used by
+* "commit-graph verify".
+*
+* There should only be very basic checks here to ensure that
+* we don't e.g. segfault in fill_commit_in_graph(), but
+* because this is a very hot codepath nothing that e.g. loops
+* over g->num_commits, or runs a checksum on the commit-graph
+* itself.
+*/
+   if (!g->chunk_oid_fanout) {
+   error("commit-graph is missing the OID Fanout chunk");
+   return 1;
+   }
+   if (!g->chunk_oid_lookup) {
+   error("commit-graph is missing the OID Lookup chunk");
+   return 1;
+   }
+   if (!g->chunk_commit_data) {
+   error("commit-graph is missing the Commit Data chunk");
+   return 1;
+   }
+
+   return 0;
+}
+
 struct commit_graph *parse_commit_graph(void *graph_map, int fd,
size_t graph_size)
 {
@@ -233,6 +263,9 @@ struct commit_graph *parse_commit_graph(void *graph_map, 
int fd,
last_chunk_offset = chunk_offset;
}
 
+   if (verify_commit_graph_lite(graph))
+   return NULL;
+
return graph;
 }
 
@@ -1089,15 +1122,7 @@ int verify_commit_graph(struct repository *r, struct 
commit_graph *g)
return 1;
}
 
-   verify_commit_graph_error = 0;
-
-   if (!g->chunk_oid_fanout)
-   graph_report("commit-graph is missing the OID Fanout chunk");
-   if (!g->chunk_oid_lookup)
-   graph_report("commit-graph is missing the OID Lookup chunk");
-   if (!g->chunk_commit_data)
-   graph_report("commit-graph is missing the Commit Data chunk");
-
+   verify_commit_graph_error = verify_commit_graph_lite(g);
if (verify_commit_graph_error)
return verify_commit_graph_error;
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index ce3

[PATCH v3 0/8] commit-graph: segfault & other fixes for broken graphs

2019-03-25 Thread Ævar Arnfjörð Bjarmason
This v3 fixes issues raised by Ramsay in
https://public-inbox.org/git/1908832c-8dd0-377e-917b-acb33b002...@ramsayjones.plus.com/

While I was at it I changed a die("Whatevs") to
die("whatevs"). I.e. start with lower-case as discussed in other
places on-list recently.

The range-diff looks scarier than this v2..v3 really is. There's no
code changes aside from the s/0/NULL/ in one place, but marking a
couple of functions as "static" required moving them around, and
removing their entry from the corresponding *.h file.

Ævar Arnfjörð Bjarmason (8):
  commit-graph tests: split up corrupt_graph_and_verify()
  commit-graph tests: test a graph that's too small
  commit-graph: fix segfault on e.g. "git status"
  commit-graph: don't early exit(1) on e.g. "git status"
  commit-graph: don't pass filename to load_commit_graph_one_fd_st()
  commit-graph verify: detect inability to read the graph
  commit-graph write: don't die if the existing graph is corrupt
  commit-graph: improve & i18n error messages

 builtin/commit-graph.c  |  23 +--
 commit-graph.c  | 132 +++-
 commit-graph.h  |   4 +-
 commit.h|   6 ++
 t/t5318-commit-graph.sh |  42 +++--
 5 files changed, 153 insertions(+), 54 deletions(-)

Range-diff:
1:  2f8ba0adf8 = 1:  83ff92a39d commit-graph tests: split up 
corrupt_graph_and_verify()
2:  800b17edde = 2:  b9170c35e6 commit-graph tests: test a graph that's too 
small
3:  7083ab81c7 ! 3:  daf38a9af7 commit-graph: fix segfault on e.g. "git status"
@@ -58,20 +58,10 @@
  --- a/commit-graph.c
  +++ b/commit-graph.c
 @@
-   last_chunk_offset = chunk_offset;
-   }
- 
-+  if (verify_commit_graph_lite(graph))
-+  return NULL;
-+
-   return graph;
+   return ret;
  }
  
-@@
- #define GENERATION_ZERO_EXISTS 1
- #define GENERATION_NUMBER_EXISTS 2
- 
-+int verify_commit_graph_lite(struct commit_graph *g)
++static int verify_commit_graph_lite(struct commit_graph *g)
 +{
 +  /*
 +   * Basic validation shared between parse_commit_graph()
@@ -101,9 +91,19 @@
 +  return 0;
 +}
 +
- int verify_commit_graph(struct repository *r, struct commit_graph *g)
+ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
+   size_t graph_size)
  {
-   uint32_t i, cur_fanout_pos = 0;
+@@
+   last_chunk_offset = chunk_offset;
+   }
+ 
++  if (verify_commit_graph_lite(graph))
++  return NULL;
++
+   return graph;
+ }
+ 
 @@
return 1;
}
@@ -122,18 +122,6 @@
return verify_commit_graph_error;
  
 
- diff --git a/commit-graph.h b/commit-graph.h
- --- a/commit-graph.h
- +++ b/commit-graph.h
-@@
-   struct string_list *commit_hex,
-   int append, int report_progress);
- 
-+int verify_commit_graph_lite(struct commit_graph *g);
- int verify_commit_graph(struct repository *r, struct commit_graph *g);
- 
- void close_commit_graph(struct repository *);
-
  diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
  --- a/t/t5318-commit-graph.sh
  +++ b/t/t5318-commit-graph.sh
4:  d00564ae89 ! 4:  3d7d8c4deb commit-graph: don't early exit(1) on e.g. "git 
status"
@@ -29,7 +29,15 @@
 passed to that function for the the "graph file %s is too small" error
 message.
 
+This leaves load_commit_graph_one() unused by everything except the
+internal prepare_commit_graph_one() function, so let's mark it as
+"static". If someone needs it in the future we can remove the "static"
+attribute. I could also rewrite its sole remaining
+user ("prepare_commit_graph_one()") to use
+load_commit_graph_one_fd_st() instead, but let's leave it at this.
+
 Signed-off-by: Ævar Arnfjörð Bjarmason 
+Signed-off-by: Ramsay Jones 
 
  diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
  --- a/builtin/commit-graph.c
@@ -131,7 +139,7 @@
close(fd);
 -  die(_("graph file %s is too small"), graph_file);
 +  error(_("graph file %s is too small"), graph_file);
-+  return 0;
++  return NULL;
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
ret = parse_commit_graph(graph_map, fd, graph_size);
@@ -143,9 +151,11 @@
}
  
return ret;
+@@
+   return graph;
  }
  
-+struct commit_graph *load_commit_graph_one(const char *graph_file)
++static struct commit_graph *load_commit_graph_one(

Re: [PATCH v2 0/8] commit-graph: segfault & other fixes for broken graphs

2019-03-15 Thread Ævar Arnfjörð Bjarmason


On Fri, Mar 15 2019, Eric Sunshine wrote:

> On Thu, Mar 14, 2019 at 5:47 PM Ævar Arnfjörð Bjarmason
>  wrote:
>> I fixed a test to avoid the pattern a0a630192d
>> (t/check-non-portable-shell: detect "FOO=bar shell_func", 2018-07-13)
>> tests for. The new pattern is more obvious.
>>
>> As an aside I don't get how that doesn't work as intended from the
>> commit message of that commit & its series.
>>
>> $ cat /tmp/x.sh
>> sayit() { echo "saying '$SAY'"; };
>> SAY=hi sayit; sayit;
>> $ sh /tmp/x.sh
>> saying 'hi'
>> saying ''
>>
>> I get the same thing on bash, dash, NetBSD ksh, Solaris's shell &
>> AIX's. I.e. it's explicitly not assigning $SAY for the duration of the
>> shell [...]
>
> The shells you tested may all behave "sanely" given that input, but
> not all shells do. For instance, on MacOS, the factory-supplied bash
> 3.2.57 behaves in the "broken" way in 'sh' compatibility mode:
>
> $ cat /tmp/x.sh
> sayit() { echo "saying '$SAY'"; };
> SAY=hi sayit; sayit;
>
> $ /bin/sh /tmp/x.sh
> saying 'hi'
> saying 'hi'
>
> $ /bin/bash /tmp/x.sh
> saying 'hi'
> saying ''

Thanks. Also I found (with help from Freenode ##POSIX) that this part of
the spec[1] talks about it:

BEGIN QUOTE
Variable assignments shall be performed as follows:
[...]

* If the command name is a function that is not a standard utility
  implemented as a function, variable assignments shall affect the
  current execution environment during the execution of the
  function. It is unspecified:

  - Whether or not the variable assignments persist after the
completion of the function

  - Whether or not the variables gain the export attribute during
the execution of the function

  - Whether or not export attributes gained as a result of the
variable assignments persist after the completion of the
function (if variable assignments persist after the completion
of the function)
END QUOTE

I.e. this is undefined behavior that just happened to work on the
various shells I tested. Makes sense, was just wondering if I'd entirely
missed out on what behavior was, turns out it's just fairly rare.

FWIW My bash 5.0.2 on Debian behaves as I noted in both sh and bash
mode.

1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html


Re: [PATCH v2 0/8] commit-graph: segfault & other fixes for broken graphs

2019-03-15 Thread Eric Sunshine
On Thu, Mar 14, 2019 at 5:47 PM Ævar Arnfjörð Bjarmason
 wrote:
> I fixed a test to avoid the pattern a0a630192d
> (t/check-non-portable-shell: detect "FOO=bar shell_func", 2018-07-13)
> tests for. The new pattern is more obvious.
>
> As an aside I don't get how that doesn't work as intended from the
> commit message of that commit & its series.
>
> $ cat /tmp/x.sh
> sayit() { echo "saying '$SAY'"; };
> SAY=hi sayit; sayit;
> $ sh /tmp/x.sh
> saying 'hi'
> saying ''
>
> I get the same thing on bash, dash, NetBSD ksh, Solaris's shell &
> AIX's. I.e. it's explicitly not assigning $SAY for the duration of the
> shell [...]

The shells you tested may all behave "sanely" given that input, but
not all shells do. For instance, on MacOS, the factory-supplied bash
3.2.57 behaves in the "broken" way in 'sh' compatibility mode:

$ cat /tmp/x.sh
sayit() { echo "saying '$SAY'"; };
SAY=hi sayit; sayit;

$ /bin/sh /tmp/x.sh
saying 'hi'
saying 'hi'

$ /bin/bash /tmp/x.sh
saying 'hi'
saying ''


[PATCH v2 3/8] commit-graph: fix segfault on e.g. "git status"

2019-03-14 Thread Ævar Arnfjörð Bjarmason
When core.commitGraph=true is set, various common commands now consult
the commit graph. Because the commit-graph code is very trusting of
its input data, it's possibly to construct a graph that'll cause an
immediate segfault on e.g. "status" (and e.g. "log", "blame", ...). In
some other cases where git immediately exits with a cryptic error
about the graph being broken.

The root cause of this is that while the "commit-graph verify"
sub-command exhaustively verifies the graph, other users of the graph
simply trust the graph, and will e.g. deference data found at certain
offsets as pointers, causing segfaults.

This change does the bare minimum to ensure that we don't segfault in
the common fill_commit_in_graph() codepath called by
e.g. setup_revisions(), to do this instrument the "commit-graph
verify" tests to always check if "status" would subsequently
segfault. This fixes the following tests which would previously
segfault:

not ok 50 - detect low chunk count
not ok 51 - detect missing OID fanout chunk
not ok 52 - detect missing OID lookup chunk
not ok 53 - detect missing commit data chunk

Those happened because with the commit-graph enabled setup_revisions()
would eventually call fill_commit_in_graph(), where e.g.
g->chunk_commit_data is used early as an offset (and will be
0x0). With this change we get far enough to detect that the graph is
broken, and show an error instead. E.g.:

$ git status; echo $?
error: commit-graph is missing the Commit Data chunk
1

That also sucks, we should *warn* and not hard-fail "status" just
because the commit-graph is corrupt, but fixing is left to a follow-up
change.

A side-effect of changing the reporting from graph_report() to error()
is that we now have an "error: " prefix for these even for
"commit-graph verify". Pseudo-diff before/after:

$ git commit-graph verify
-commit-graph is missing the Commit Data chunk
+error: commit-graph is missing the Commit Data chunk

Changing that is OK. Various errors it emits now early on are prefixed
with "error: ", moving these over and changing the output doesn't
break anything.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 commit-graph.c  | 43 -
 commit-graph.h  |  1 +
 t/t5318-commit-graph.sh |  3 ++-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 47e9be0a3a..980fbf47ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -233,6 +233,9 @@ struct commit_graph *parse_commit_graph(void *graph_map, 
int fd,
last_chunk_offset = chunk_offset;
}
 
+   if (verify_commit_graph_lite(graph))
+   return NULL;
+
return graph;
 }
 
@@ -1075,6 +1078,36 @@ static void graph_report(const char *fmt, ...)
 #define GENERATION_ZERO_EXISTS 1
 #define GENERATION_NUMBER_EXISTS 2
 
+int verify_commit_graph_lite(struct commit_graph *g)
+{
+   /*
+* Basic validation shared between parse_commit_graph()
+* which'll be called every time the graph is used, and the
+* much more expensive verify_commit_graph() used by
+* "commit-graph verify".
+*
+* There should only be very basic checks here to ensure that
+* we don't e.g. segfault in fill_commit_in_graph(), but
+* because this is a very hot codepath nothing that e.g. loops
+* over g->num_commits, or runs a checksum on the commit-graph
+* itself.
+*/
+   if (!g->chunk_oid_fanout) {
+   error("commit-graph is missing the OID Fanout chunk");
+   return 1;
+   }
+   if (!g->chunk_oid_lookup) {
+   error("commit-graph is missing the OID Lookup chunk");
+   return 1;
+   }
+   if (!g->chunk_commit_data) {
+   error("commit-graph is missing the Commit Data chunk");
+   return 1;
+   }
+
+   return 0;
+}
+
 int verify_commit_graph(struct repository *r, struct commit_graph *g)
 {
uint32_t i, cur_fanout_pos = 0;
@@ -1089,15 +1122,7 @@ int verify_commit_graph(struct repository *r, struct 
commit_graph *g)
return 1;
}
 
-   verify_commit_graph_error = 0;
-
-   if (!g->chunk_oid_fanout)
-   graph_report("commit-graph is missing the OID Fanout chunk");
-   if (!g->chunk_oid_lookup)
-   graph_report("commit-graph is missing the OID Lookup chunk");
-   if (!g->chunk_commit_data)
-   graph_report("commit-graph is missing the Commit Data chunk");
-
+   verify_commit_graph_error = verify_commit_graph_lite(g);
if (verify_commit_graph_error)
return verify_commit_graph_error;
 
diff --git a/com

[PATCH v2 0/8] commit-graph: segfault & other fixes for broken graphs

2019-03-14 Thread Ævar Arnfjörð Bjarmason
See the v1 cover letter for details:
https://public-inbox.org/git/20190221223753.20070-1-ava...@gmail.com/

I'd forgotten this after 2.21 was released.

This addresses all the comments on v1 and rebases it. A range-diff is
below. I also improved 7/8's commit message a bit.

I fixed a test to avoid the pattern a0a630192d
(t/check-non-portable-shell: detect "FOO=bar shell_func", 2018-07-13)
tests for. The new pattern is more obvious.

As an aside I don't get how that doesn't work as intended from the
commit message of that commit & its series.

$ cat /tmp/x.sh 
sayit() { echo "saying '$SAY'"; };
SAY=hi sayit; sayit;
$ sh /tmp/x.sh
saying 'hi'
saying ''

I get the same thing on bash, dash, NetBSD ksh, Solaris's shell &
AIX's. I.e. it's explicitly not assigning $SAY for the duration of the
shell as this would do:

$ cat /tmp/y.sh 
sayit() { echo "saying '$SAY'"; };
SAY=hi; sayit; sayit;
$ sh /tmp/y.sh
saying 'hi'
saying 'hi'

Which is the impression I get from the commit message.

Ævar Arnfjörð Bjarmason (8):
  commit-graph tests: split up corrupt_graph_and_verify()
  commit-graph tests: test a graph that's too small
  commit-graph: fix segfault on e.g. "git status"
  commit-graph: don't early exit(1) on e.g. "git status"
  commit-graph: don't pass filename to load_commit_graph_one_fd_st()
  commit-graph verify: detect inability to read the graph
  commit-graph write: don't die if the existing graph is corrupt
  commit-graph: improve & i18n error messages

 builtin/commit-graph.c  |  23 +--
 commit-graph.c  | 132 +++-
 commit-graph.h  |   4 ++
 commit.h|   6 ++
 t/t5318-commit-graph.sh |  42 +++--
 5 files changed, 154 insertions(+), 53 deletions(-)

Range-diff:
1:  9d318d5106 ! 1:  2f8ba0adf8 commit-graph tests: split up 
corrupt_graph_and_verify()
@@ -49,7 +49,7 @@
 -  test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
 -  cp $objdir/info/commit-graph commit-graph-backup &&
printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" 
conv=notrunc &&
-   dd of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" count=0 &&
+   dd of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" if=/dev/null &&
generate_zero_bytes $(($orig_size - $zero_pos)) 
>>"$objdir/info/commit-graph" &&
 -  test_must_fail git commit-graph verify 2>test_err &&
 -  grep -v "^+" test_err >err &&
2:  73849add5e = 2:  800b17edde commit-graph tests: test a graph that's too 
small
3:  6bfce758e1 = 3:  7083ab81c7 commit-graph: fix segfault on e.g. "git status"
4:  ac07ff415e = 4:  d00564ae89 commit-graph: don't early exit(1) on e.g. "git 
status"
5:  b2dd394cc7 = 5:  25ee185bf7 commit-graph: don't pass filename to 
load_commit_graph_one_fd_st()
6:  9987149e5c ! 6:  7619b46987 commit-graph verify: detect inability to read 
the graph
@@ -37,16 +37,10 @@
  
  }
  
-+test_expect_success 'detect permission problem' '
++test_expect_success POSIXPERM,SANITY 'detect permission problem' '
 +  corrupt_graph_setup &&
 +  chmod 000 $objdir/info/commit-graph &&
-+
-+  # Skip as root, or in other cases (odd fs or OS) where a
-+  # "chmod 000 file" does not yield EACCES on e.g. "cat file"
-+  if ! test -r $objdir/info/commit-graph
-+  then
-+  corrupt_graph_verify "Could not open"
-+  fi
++  corrupt_graph_verify "Could not open"
 +'
 +
  test_expect_success 'detect too small' '
7:  0e35b12a1a ! 7:  17ee4fc050 commit-graph write: don't die if the existing 
graph is corrupt
@@ -18,6 +18,10 @@
 use_commit_graph=1 as seen in 177722b344 ("commit: integrate commit
 graph with commit parsing", 2018-04-10).
 
+Not using the old graph at all slows down the writing of the new graph
+by some small amount, but is a sensible way to prevent an error in the
+existing commit-graph from spreading.
+
 Just fixing the current issue would be likely to result in code that's
 inadvertently broken in the future. New code might use the
 commit-graph at a distance. To detect such cases introduce a
@@ -36,7 +40,12 @@
 corruption.
 
 This might need to be re-visited if we learn to write the commit-graph
-incrementally.
+incrementally, but probably not. Hopefully we'll just start by finding

[PATCH 3/8] commit-graph: fix segfault on e.g. "git status"

2019-02-21 Thread Ævar Arnfjörð Bjarmason
When core.commitGraph=true is set, various common commands now consult
the commit graph. Because the commit-graph code is very trusting of
its input data, it's possibly to construct a graph that'll cause an
immediate segfault on e.g. "status" (and e.g. "log", "blame", ...). In
some other cases where git immediately exits with a cryptic error
about the graph being broken.

The root cause of this is that while the "commit-graph verify"
sub-command exhaustively verifies the graph, other users of the graph
simply trust the graph, and will e.g. deference data found at certain
offsets as pointers, causing segfaults.

This change does the bare minimum to ensure that we don't segfault in
the common fill_commit_in_graph() codepath called by
e.g. setup_revisions(), to do this instrument the "commit-graph
verify" tests to always check if "status" would subsequently
segfault. This fixes the following tests which would previously
segfault:

not ok 50 - detect low chunk count
not ok 51 - detect missing OID fanout chunk
not ok 52 - detect missing OID lookup chunk
not ok 53 - detect missing commit data chunk

Those happened because with the commit-graph enabled setup_revisions()
would eventually call fill_commit_in_graph(), where e.g.
g->chunk_commit_data is used early as an offset (and will be
0x0). With this change we get far enough to detect that the graph is
broken, and show an error instead. E.g.:

$ git status; echo $?
error: commit-graph is missing the Commit Data chunk
1

That also sucks, we should *warn* and not hard-fail "status" just
because the commit-graph is corrupt, but fixing is left to a follow-up
change.

A side-effect of changing the reporting from graph_report() to error()
is that we now have an "error: " prefix for these even for
"commit-graph verify". Pseudo-diff before/after:

$ git commit-graph verify
-commit-graph is missing the Commit Data chunk
+error: commit-graph is missing the Commit Data chunk

Changing that is OK. Various errors it emits now early on are prefixed
with "error: ", moving these over and changing the output doesn't
break anything.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 commit-graph.c  | 43 -
 commit-graph.h  |  1 +
 t/t5318-commit-graph.sh |  3 ++-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 47e9be0a3a..980fbf47ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -233,6 +233,9 @@ struct commit_graph *parse_commit_graph(void *graph_map, 
int fd,
last_chunk_offset = chunk_offset;
}
 
+   if (verify_commit_graph_lite(graph))
+   return NULL;
+
return graph;
 }
 
@@ -1075,6 +1078,36 @@ static void graph_report(const char *fmt, ...)
 #define GENERATION_ZERO_EXISTS 1
 #define GENERATION_NUMBER_EXISTS 2
 
+int verify_commit_graph_lite(struct commit_graph *g)
+{
+   /*
+* Basic validation shared between parse_commit_graph()
+* which'll be called every time the graph is used, and the
+* much more expensive verify_commit_graph() used by
+* "commit-graph verify".
+*
+* There should only be very basic checks here to ensure that
+* we don't e.g. segfault in fill_commit_in_graph(), but
+* because this is a very hot codepath nothing that e.g. loops
+* over g->num_commits, or runs a checksum on the commit-graph
+* itself.
+*/
+   if (!g->chunk_oid_fanout) {
+   error("commit-graph is missing the OID Fanout chunk");
+   return 1;
+   }
+   if (!g->chunk_oid_lookup) {
+   error("commit-graph is missing the OID Lookup chunk");
+   return 1;
+   }
+   if (!g->chunk_commit_data) {
+   error("commit-graph is missing the Commit Data chunk");
+   return 1;
+   }
+
+   return 0;
+}
+
 int verify_commit_graph(struct repository *r, struct commit_graph *g)
 {
uint32_t i, cur_fanout_pos = 0;
@@ -1089,15 +1122,7 @@ int verify_commit_graph(struct repository *r, struct 
commit_graph *g)
return 1;
}
 
-   verify_commit_graph_error = 0;
-
-   if (!g->chunk_oid_fanout)
-   graph_report("commit-graph is missing the OID Fanout chunk");
-   if (!g->chunk_oid_lookup)
-   graph_report("commit-graph is missing the OID Lookup chunk");
-   if (!g->chunk_commit_data)
-   graph_report("commit-graph is missing the Commit Data chunk");
-
+   verify_commit_graph_error = verify_commit_graph_lite(g);
if (verify_commit_graph_error)
return verify_commit_graph_error;
 
diff --git a/com

[PATCH 0/8] commit-graph: segfault & other fixes for broken graphs

2019-02-21 Thread Ævar Arnfjörð Bjarmason
This one isn't for 2.21.0, these issues are already in v2.20.0, I just
spotted them when looking at & fixing the commit-graph test broken on
NetBSD[1].

This series touches (among other things) some of the same test code,
but merges cleanly with that "dd" fix, and it's not required for
testing it (unless you're on NetBSD).

Instrumenting the commit-graph tests reveals that if you have a broken
commit-graph git will either segfault or abort early with cryptic
errors on such basic commands as "status".

Furthemore, if our graph is corrupt and we've set
core.commitGraph=true we can't even write a new commit-graph, because
writing a new one has grown to implicitly depend on reading the old
one!

This series fixes all of that.

1. https://public-inbox.org/git/20190221192849.6581-3-ava...@gmail.com/

Ævar Arnfjörð Bjarmason (8):
  commit-graph tests: split up corrupt_graph_and_verify()
  commit-graph tests: test a graph that's too small
  commit-graph: fix segfault on e.g. "git status"
  commit-graph: don't early exit(1) on e.g. "git status"
  commit-graph: don't pass filename to load_commit_graph_one_fd_st()
  commit-graph verify: detect inability to read the graph
  commit-graph write: don't die if the existing graph is corrupt
  commit-graph: improve & i18n error messages

 builtin/commit-graph.c  |  23 +--
 commit-graph.c  | 132 +++-
 commit-graph.h  |   4 ++
 commit.h|   6 ++
 t/t5318-commit-graph.sh |  48 +--
 5 files changed, 160 insertions(+), 53 deletions(-)

-- 
2.21.0.rc0.258.g878e2cd30e



Re: [PATCH v3] list-objects.c: don't segfault for missing cmdline objects

2018-12-05 Thread Junio C Hamano
Matthew DeVore  writes:

> When a command is invoked with both --exclude-promisor-objects,
> --objects-edge-aggressive, and a missing object on the command line,
> the rev_info.cmdline array could get a NULL pointer for the value of
> an 'item' field. Prevent dereferencing of a NULL pointer in that
> situation.
>
> Properly handle --ignore-missing. If it is not passed, die when an
> object is missing. Otherwise, just silently ignore it.
>
> Signed-off-by: Matthew DeVore 

Thanks for an update.  Will replace.

> ---
>  revision.c   |  2 ++
>  t/t0410-partial-clone.sh | 16 ++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/revision.c b/revision.c
> index 13e0519c02..293303b67d 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1729,6 +1729,8 @@ int handle_revision_arg(const char *arg_, struct 
> rev_info *revs, int flags, unsi
>   if (!cant_be_filename)
>   verify_non_filename(revs->prefix, arg);
>   object = get_reference(revs, arg, &oid, flags ^ local_flags);
> + if (!object)
> + return revs->ignore_missing ? 0 : -1;
>   add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>   add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>   free(oc.path);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index ba3887f178..169f7f10a7 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -349,7 +349,7 @@ test_expect_success 'rev-list stops traversal at promisor 
> commit, tree, and blob
>   grep $(git -C repo rev-parse bar) out  # sanity check that some walking 
> was done
>  '
>  
> -test_expect_success 'rev-list accepts missing and promised objects on 
> command line' '
> +test_expect_success 'rev-list dies for missing objects on cmd line' '
>   rm -rf repo &&
>   test_create_repo repo &&
>   test_commit -C repo foo &&
> @@ -366,7 +366,19 @@ test_expect_success 'rev-list accepts missing and 
> promised objects on command li
>  
>   git -C repo config core.repositoryformatversion 1 &&
>   git -C repo config extensions.partialclone "arbitrary string" &&
> - git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" 
> "$TREE" "$BLOB"
> +
> + for OBJ in "$COMMIT" "$TREE" "$BLOB"; do
> + test_must_fail git -C repo rev-list --objects \
> + --exclude-promisor-objects "$OBJ" &&
> + test_must_fail git -C repo rev-list --objects-edge-aggressive \
> + --exclude-promisor-objects "$OBJ" &&
> +
> + # Do not die or crash when --ignore-missing is passed.
> + git -C repo rev-list --ignore-missing --objects \
> + --exclude-promisor-objects "$OBJ" &&
> + git -C repo rev-list --ignore-missing --objects-edge-aggressive 
> \
> + --exclude-promisor-objects "$OBJ"
> + done
>  '
>  
>  test_expect_success 'gc repacks promisor objects separately from 
> non-promisor objects' '


[PATCH v3] list-objects.c: don't segfault for missing cmdline objects

2018-12-05 Thread Matthew DeVore
When a command is invoked with both --exclude-promisor-objects,
--objects-edge-aggressive, and a missing object on the command line,
the rev_info.cmdline array could get a NULL pointer for the value of
an 'item' field. Prevent dereferencing of a NULL pointer in that
situation.

Properly handle --ignore-missing. If it is not passed, die when an
object is missing. Otherwise, just silently ignore it.

Signed-off-by: Matthew DeVore 
---
 revision.c   |  2 ++
 t/t0410-partial-clone.sh | 16 ++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index 13e0519c02..293303b67d 100644
--- a/revision.c
+++ b/revision.c
@@ -1729,6 +1729,8 @@ int handle_revision_arg(const char *arg_, struct rev_info 
*revs, int flags, unsi
if (!cant_be_filename)
verify_non_filename(revs->prefix, arg);
object = get_reference(revs, arg, &oid, flags ^ local_flags);
+   if (!object)
+   return revs->ignore_missing ? 0 : -1;
add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
free(oc.path);
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..169f7f10a7 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -349,7 +349,7 @@ test_expect_success 'rev-list stops traversal at promisor 
commit, tree, and blob
grep $(git -C repo rev-parse bar) out  # sanity check that some walking 
was done
 '
 
-test_expect_success 'rev-list accepts missing and promised objects on command 
line' '
+test_expect_success 'rev-list dies for missing objects on cmd line' '
rm -rf repo &&
test_create_repo repo &&
test_commit -C repo foo &&
@@ -366,7 +366,19 @@ test_expect_success 'rev-list accepts missing and promised 
objects on command li
 
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "arbitrary string" &&
-   git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" 
"$TREE" "$BLOB"
+
+   for OBJ in "$COMMIT" "$TREE" "$BLOB"; do
+   test_must_fail git -C repo rev-list --objects \
+   --exclude-promisor-objects "$OBJ" &&
+   test_must_fail git -C repo rev-list --objects-edge-aggressive \
+   --exclude-promisor-objects "$OBJ" &&
+
+   # Do not die or crash when --ignore-missing is passed.
+   git -C repo rev-list --ignore-missing --objects \
+   --exclude-promisor-objects "$OBJ" &&
+   git -C repo rev-list --ignore-missing --objects-edge-aggressive 
\
+   --exclude-promisor-objects "$OBJ"
+   done
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor 
objects' '
-- 
2.20.0.rc1.387.gf8505762e3-goog



Re: [PATCH v2] list-objects.c: don't segfault for missing cmdline objects

2018-10-28 Thread Junio C Hamano
Matthew DeVore  writes:

> When a command is invoked with both --exclude-promisor-objects,
> --objects-edge-aggressive, and a missing object on the command line,
> the rev_info.cmdline array could get a NULL pointer for the value of
> an 'item' field. Prevent dereferencing of a NULL pointer in that
> situation.

Thanks.

> There are a few other places in the code where rev_info.cmdline is read
> and the code doesn't handle NULL objects, but I couldn't prove to myself
> that any of them needed to change except this one (since it may not
> actually be possible to reach the other code paths with
> rev_info.cmdline[] set to NULL).
>
> Signed-off-by: Matthew DeVore 
> ---
>  list-objects.c   | 3 ++-
>  t/t0410-partial-clone.sh | 6 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/list-objects.c b/list-objects.c
> index c41cc80db5..27ed2c6cab 100644
> --- a/list-objects.c
> +++ b/list-objects.c
> @@ -245,7 +245,8 @@ void mark_edges_uninteresting(struct rev_info *revs, 
> show_edge_fn show_edge)
>   for (i = 0; i < revs->cmdline.nr; i++) {
>   struct object *obj = revs->cmdline.rev[i].item;
>   struct commit *commit = (struct commit *)obj;
> - if (obj->type != OBJ_COMMIT || !(obj->flags & 
> UNINTERESTING))
> + if (!obj || obj->type != OBJ_COMMIT ||
> + !(obj->flags & UNINTERESTING))
>   continue;
>   mark_tree_uninteresting(revs->repo,
>   get_commit_tree(commit));
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index ba3887f178..e52291e674 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -366,7 +366,11 @@ test_expect_success 'rev-list accepts missing and 
> promised objects on command li
>  
>   git -C repo config core.repositoryformatversion 1 &&
>   git -C repo config extensions.partialclone "arbitrary string" &&
> - git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" 
> "$TREE" "$BLOB"
> +
> + git -C repo rev-list --objects \
> + --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" &&
> + git -C repo rev-list --objects-edge-aggressive \
> + --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB"
>  '
>  
>  test_expect_success 'gc repacks promisor objects separately from 
> non-promisor objects' '


[PATCH v2] list-objects.c: don't segfault for missing cmdline objects

2018-10-25 Thread Matthew DeVore
When a command is invoked with both --exclude-promisor-objects,
--objects-edge-aggressive, and a missing object on the command line,
the rev_info.cmdline array could get a NULL pointer for the value of
an 'item' field. Prevent dereferencing of a NULL pointer in that
situation.

There are a few other places in the code where rev_info.cmdline is read
and the code doesn't handle NULL objects, but I couldn't prove to myself
that any of them needed to change except this one (since it may not
actually be possible to reach the other code paths with
rev_info.cmdline[] set to NULL).

Signed-off-by: Matthew DeVore 
---
 list-objects.c   | 3 ++-
 t/t0410-partial-clone.sh | 6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index c41cc80db5..27ed2c6cab 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -245,7 +245,8 @@ void mark_edges_uninteresting(struct rev_info *revs, 
show_edge_fn show_edge)
for (i = 0; i < revs->cmdline.nr; i++) {
struct object *obj = revs->cmdline.rev[i].item;
struct commit *commit = (struct commit *)obj;
-   if (obj->type != OBJ_COMMIT || !(obj->flags & 
UNINTERESTING))
+   if (!obj || obj->type != OBJ_COMMIT ||
+   !(obj->flags & UNINTERESTING))
continue;
mark_tree_uninteresting(revs->repo,
get_commit_tree(commit));
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..e52291e674 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -366,7 +366,11 @@ test_expect_success 'rev-list accepts missing and promised 
objects on command li
 
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "arbitrary string" &&
-   git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" 
"$TREE" "$BLOB"
+
+   git -C repo rev-list --objects \
+   --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" &&
+   git -C repo rev-list --objects-edge-aggressive \
+   --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB"
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor 
objects' '
-- 
2.19.1.568.g152ad8e336-goog



[PATCH 0/3] Fix gc segfault

2018-10-16 Thread Jameson Miller via GitGitGadget
In 9ac3f0e (pack-objects: fix performance issues on packing large deltas,
2018-07-22), a mutex was introduced that is used to guard the call to set
the delta size. This commit added code to initialize it, but at an incorrect
spot: in init_threaded_search(), while the call to oe_set_delta_size() (and
hence to packing_data_lock()) can happen in the call chain check_object() <- 
get_object_details() <-prepare_pack() <- cmd_pack_objects(), which is long
before theprepare_pack() function calls ll_find_deltas() (which initializes
the threaded search).

Another tell-tale that the mutex was initialized in an incorrect spot is
that the function to initialize it lives in builtin/, while the code that
uses the mutex is defined in a libgit.a header file.

Let's use a more appropriate function: prepare_packing_data(), which not
only lives in libgit.a, but has to be called before thepacking_data struct
is used that contains that mutex.

Johannes Schindelin (3):
  Fix typo 'detla' -> 'delta'
  pack-objects (mingw): demonstrate a segmentation fault with large
deltas
  pack-objects (mingw): initialize `packing_data` mutex in the correct
spot

 builtin/pack-objects.c|  1 -
 pack-objects.c|  3 +++
 pack-objects.h|  2 +-
 t/t5321-pack-large-objects.sh | 32 
 4 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100755 t/t5321-pack-large-objects.sh


base-commit: 5a0cc8aca797dbd7d2be3b67458ff880ed45cddf
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-47%2Fjamill%2Ffix-gc-segfault-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-47/jamill/fix-gc-segfault-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/47
-- 
gitgitgadget


Re: [BUG] Segfault in "git submodule"

2018-10-01 Thread Raymond Jennings
I instructed downstream to update their repository.
On Mon, Oct 1, 2018 at 2:31 PM Raymond Jennings  wrote:
>
> Yes, git 2.16.4 to be exact.
>
> I upgraded to 2.19 after ~arch keywording the package on gentoo and
> that fixed it.
> On Mon, Oct 1, 2018 at 12:19 PM Stefan Beller  wrote:
> >
> > On Sat, Sep 29, 2018 at 9:43 AM Raymond Jennings  wrote:
> > >
> > > I have a repo, but it appears to be specific to staging area state.
> > > It only segfaults when I have a certain file deleted.
> > >
> > > Where do you want me to upload it?
> > > On Sat, Sep 29, 2018 at 8:34 AM Duy Nguyen  wrote:
> > > >
> > > > On Sat, Sep 29, 2018 at 5:31 PM Ævar Arnfjörð Bjarmason
> > > >  wrote:
> > > > > > #1  refs_resolve_ref_unsafe (refs=0x0,
> > > > > > refname=refname@entry=0x55e863062253 "HEAD",
> > > > > > resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> > > > > > flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
> > > >
> > > > refs is NULL. It looks like somebody fails to get the right submodule
> > > > ref store (or tries to get it but fails to check if it may return
> > > > NULL)
> >
> > This is spot on.
> >
> > Raymond, are you on Git v2.16.0 by any chance?
> > (and if now which version are you on).
> >
> > I suspect 2.16, as that is a version of Git, in which there happens to be
> > a call into the refs subsystem in submodule--helper.c in line 624.
> >
> > Is it possible to upgrade Git (to v2.18.0 or later) or cherry-pick
> > 74b6bda32f (submodule: check for NULL return of get_submodule_ref_store(),
> > 2018-03-28) into your copy of Git?
> >
> > Thanks,
> > Stefan


Re: [BUG] Segfault in "git submodule"

2018-10-01 Thread Raymond Jennings
Yes, git 2.16.4 to be exact.

I upgraded to 2.19 after ~arch keywording the package on gentoo and
that fixed it.
On Mon, Oct 1, 2018 at 12:19 PM Stefan Beller  wrote:
>
> On Sat, Sep 29, 2018 at 9:43 AM Raymond Jennings  wrote:
> >
> > I have a repo, but it appears to be specific to staging area state.
> > It only segfaults when I have a certain file deleted.
> >
> > Where do you want me to upload it?
> > On Sat, Sep 29, 2018 at 8:34 AM Duy Nguyen  wrote:
> > >
> > > On Sat, Sep 29, 2018 at 5:31 PM Ævar Arnfjörð Bjarmason
> > >  wrote:
> > > > > #1  refs_resolve_ref_unsafe (refs=0x0,
> > > > > refname=refname@entry=0x55e863062253 "HEAD",
> > > > > resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> > > > > flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
> > >
> > > refs is NULL. It looks like somebody fails to get the right submodule
> > > ref store (or tries to get it but fails to check if it may return
> > > NULL)
>
> This is spot on.
>
> Raymond, are you on Git v2.16.0 by any chance?
> (and if now which version are you on).
>
> I suspect 2.16, as that is a version of Git, in which there happens to be
> a call into the refs subsystem in submodule--helper.c in line 624.
>
> Is it possible to upgrade Git (to v2.18.0 or later) or cherry-pick
> 74b6bda32f (submodule: check for NULL return of get_submodule_ref_store(),
> 2018-03-28) into your copy of Git?
>
> Thanks,
> Stefan


Re: [BUG] Segfault in "git submodule"

2018-10-01 Thread Stefan Beller
On Sat, Sep 29, 2018 at 9:43 AM Raymond Jennings  wrote:
>
> I have a repo, but it appears to be specific to staging area state.
> It only segfaults when I have a certain file deleted.
>
> Where do you want me to upload it?
> On Sat, Sep 29, 2018 at 8:34 AM Duy Nguyen  wrote:
> >
> > On Sat, Sep 29, 2018 at 5:31 PM Ævar Arnfjörð Bjarmason
> >  wrote:
> > > > #1  refs_resolve_ref_unsafe (refs=0x0,
> > > > refname=refname@entry=0x55e863062253 "HEAD",
> > > > resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> > > > flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
> >
> > refs is NULL. It looks like somebody fails to get the right submodule
> > ref store (or tries to get it but fails to check if it may return
> > NULL)

This is spot on.

Raymond, are you on Git v2.16.0 by any chance?
(and if now which version are you on).

I suspect 2.16, as that is a version of Git, in which there happens to be
a call into the refs subsystem in submodule--helper.c in line 624.

Is it possible to upgrade Git (to v2.18.0 or later) or cherry-pick
74b6bda32f (submodule: check for NULL return of get_submodule_ref_store(),
2018-03-28) into your copy of Git?

Thanks,
Stefan


Re: [BUG] Segfault in "git submodule"

2018-09-29 Thread Raymond Jennings
I have a repo, but it appears to be specific to staging area state.
It only segfaults when I have a certain file deleted.

Where do you want me to upload it?
On Sat, Sep 29, 2018 at 8:34 AM Duy Nguyen  wrote:
>
> On Sat, Sep 29, 2018 at 5:31 PM Ævar Arnfjörð Bjarmason
>  wrote:
> > > #1  refs_resolve_ref_unsafe (refs=0x0,
> > > refname=refname@entry=0x55e863062253 "HEAD",
> > > resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> > > flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
>
> refs is NULL. It looks like somebody fails to get the right submodule
> ref store (or tries to get it but fails to check if it may return
> NULL)
> --
> Duy


Re: [BUG] Segfault in "git submodule"

2018-09-29 Thread Duy Nguyen
On Sat, Sep 29, 2018 at 5:31 PM Ævar Arnfjörð Bjarmason
 wrote:
> > #1  refs_resolve_ref_unsafe (refs=0x0,
> > refname=refname@entry=0x55e863062253 "HEAD",
> > resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> > flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493

refs is NULL. It looks like somebody fails to get the right submodule
ref store (or tries to get it but fails to check if it may return
NULL)
-- 
Duy


Re: [BUG] Segfault in "git submodule"

2018-09-29 Thread Ævar Arnfjörð Bjarmason


On Sat, Sep 29 2018, Raymond Jennings wrote:

> [New LWP 19644]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `git submodule--helper status'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  refs_read_raw_ref (type=, referent=,
> oid=, refname=, ref_store= out>) at refs.c:1451
> 1451 return ref_store->be->read_raw_ref(ref_store, refname, oid,
> referent, type);
> #0  refs_read_raw_ref (type=, referent=,
> oid=, refname=, ref_store= out>) at refs.c:1451
> #1  refs_resolve_ref_unsafe (refs=0x0,
> refname=refname@entry=0x55e863062253 "HEAD",
> resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
> flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
> #2  0x55e862fcad5c in refs_read_ref_full (flags=0x7ffdc834b1bc,
> oid=0x7ffdc834b1c0, resolve_flags=1, refname=0x55e863062253 "HEAD",
> refs=) at refs.c:224
> #3  refs_head_ref (refs=, fn=fn@entry=0x55e862f25fb0
> , cb_data=cb_data@entry=0x7ffdc834b300) at
> refs.c:1314
> #4  0x55e862f292a2 in status_submodule (flags=0, prefix= out>, ce_flags=0, ce_oid=0x55e86468d4d4, path=0x55e86468d4e8
> "lpc-doc") at builtin/submodule--helper.c:624
> #5  status_submodule_cb (cb_data=0x7ffdc834b240,
> list_item=0x55e86468d490) at builtin/submodule--helper.c:665
> #6  for_each_listed_submodule (cb_data=, fn= out>, list=) at builtin/submodule--helper.c:404
> #7  module_status (argc=, argv=,
> prefix=) at builtin/submodule--helper.c:698
> #8  0x55e862eaec95 in run_builtin (argv=,
> argc=, p=) at git.c:346
> #9  handle_builtin (argc=, argv=) at git.c:554
> #10 0x55e862eaf985 in run_argv (argv=0x7ffdc834be20,
> argcp=0x7ffdc834be2c) at git.c:606
> #11 cmd_main (argc=, argv=) at git.c:683
> #12 0x55e862eae96a in main (argc=3, argv=0x7ffdc834c078) at 
> common-main.c:43

Can you consistently reproduce this? Maybe Stefan can make some sense of
this, but it would be really useful if you could compile git from the
master branch with CFLAGS="-O0 -g" and run gdb with that and paste that
backtrace, and even better if you're in a position to share a copy of
the repository where this happens.


[BUG] Segfault in "git submodule"

2018-09-29 Thread Raymond Jennings
[New LWP 19644]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `git submodule--helper status'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  refs_read_raw_ref (type=, referent=,
oid=, refname=, ref_store=) at refs.c:1451
1451 return ref_store->be->read_raw_ref(ref_store, refname, oid,
referent, type);
#0  refs_read_raw_ref (type=, referent=,
oid=, refname=, ref_store=) at refs.c:1451
#1  refs_resolve_ref_unsafe (refs=0x0,
refname=refname@entry=0x55e863062253 "HEAD",
resolve_flags=resolve_flags@entry=1, oid=oid@entry=0x7ffdc834b1c0,
flags=flags@entry=0x7ffdc834b1bc) at refs.c:1493
#2  0x55e862fcad5c in refs_read_ref_full (flags=0x7ffdc834b1bc,
oid=0x7ffdc834b1c0, resolve_flags=1, refname=0x55e863062253 "HEAD",
refs=) at refs.c:224
#3  refs_head_ref (refs=, fn=fn@entry=0x55e862f25fb0
, cb_data=cb_data@entry=0x7ffdc834b300) at
refs.c:1314
#4  0x55e862f292a2 in status_submodule (flags=0, prefix=, ce_flags=0, ce_oid=0x55e86468d4d4, path=0x55e86468d4e8
"lpc-doc") at builtin/submodule--helper.c:624
#5  status_submodule_cb (cb_data=0x7ffdc834b240,
list_item=0x55e86468d490) at builtin/submodule--helper.c:665
#6  for_each_listed_submodule (cb_data=, fn=, list=) at builtin/submodule--helper.c:404
#7  module_status (argc=, argv=,
prefix=) at builtin/submodule--helper.c:698
#8  0x55e862eaec95 in run_builtin (argv=,
argc=, p=) at git.c:346
#9  handle_builtin (argc=, argv=) at git.c:554
#10 0x55e862eaf985 in run_argv (argv=0x7ffdc834be20,
argcp=0x7ffdc834be2c) at git.c:606
#11 cmd_main (argc=, argv=) at git.c:683
#12 0x55e862eae96a in main (argc=3, argv=0x7ffdc834c078) at common-main.c:43


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Ævar Arnfjörð Bjarmason


On Fri, Sep 21 2018, Jeff King wrote:

> On Sat, Sep 22, 2018 at 01:37:17AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > Thanks, both of you ;-).  I was aware of the issue and proposed fix
>> > but forgot about it when merging things down to 'master'.  Sorry
>> > about that.
>> 
>> Just a follow-up question, in your merge commit you just pushed to
>> "next" you say:
>> 
>> Recent update broke the reachability algorithm when tags pointing
>> at objects that are not commit were involved, which has been fixed.
>> 
>> And in Derrick's commit message it says:
>> 
>> [...]but incorrectly assumed that all objects provided were commits[...]
>> 
>> I just wanted to double check (without having the time to dig myself at
>> this point) whether this bug was understood & tested for, or whether the
>> case I had was just /also/ fixed for unexpected reasons.
>> 
>> I.e. in my upthread test case I have two annotated tags pointing at
>> commits, whereas the merge to "next" says "when tags pointing at objects
>> that are not commit were involved", which I I assume means say annotated
>> tags pointing at blobs..., but that's not what I had.
>> 
>> Wasn't this just a bug fix that had nothing to do with tags not pointing
>> to commits, but just ones where we had the simple case of tags pointing
>> to commits, but they just weren't peeled?
>> 
>> I'm hoping for a "Junio skimmed the fix and wrote a merge message that
>> wasn't quite accurate" here, but maybe that's not the case and something
>> might be missing (e.g. missing test code).
>
> I think it's a combination of the merge message being slightly
> inaccurate, and you slightly misreading it. :)
>
> I think by "tags pointing", Junio meant "tag refs". Which of course,
> often point at tag objects, but can also point at trees, etc.
>
> But the problem is not limited to tag refs. I think it's a problem with
> any "want" that is a non-commit. So really any ref pointing to a
> non-commit is a problem. But of course tags are the likely way for that
> to happen, since refs/heads is generally limited to commits.
>
> So in short, yeah, the bug was triggered by fetching any annotated tag.

Thanks for clearing that up.


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Jeff King
On Sat, Sep 22, 2018 at 01:37:17AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Thanks, both of you ;-).  I was aware of the issue and proposed fix
> > but forgot about it when merging things down to 'master'.  Sorry
> > about that.
> 
> Just a follow-up question, in your merge commit you just pushed to
> "next" you say:
> 
> Recent update broke the reachability algorithm when tags pointing
> at objects that are not commit were involved, which has been fixed.
> 
> And in Derrick's commit message it says:
> 
> [...]but incorrectly assumed that all objects provided were commits[...]
> 
> I just wanted to double check (without having the time to dig myself at
> this point) whether this bug was understood & tested for, or whether the
> case I had was just /also/ fixed for unexpected reasons.
> 
> I.e. in my upthread test case I have two annotated tags pointing at
> commits, whereas the merge to "next" says "when tags pointing at objects
> that are not commit were involved", which I I assume means say annotated
> tags pointing at blobs..., but that's not what I had.
> 
> Wasn't this just a bug fix that had nothing to do with tags not pointing
> to commits, but just ones where we had the simple case of tags pointing
> to commits, but they just weren't peeled?
> 
> I'm hoping for a "Junio skimmed the fix and wrote a merge message that
> wasn't quite accurate" here, but maybe that's not the case and something
> might be missing (e.g. missing test code).

I think it's a combination of the merge message being slightly
inaccurate, and you slightly misreading it. :)

I think by "tags pointing", Junio meant "tag refs". Which of course,
often point at tag objects, but can also point at trees, etc.

But the problem is not limited to tag refs. I think it's a problem with
any "want" that is a non-commit. So really any ref pointing to a
non-commit is a problem. But of course tags are the likely way for that
to happen, since refs/heads is generally limited to commits.

So in short, yeah, the bug was triggered by fetching any annotated tag.

-Peff


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Ævar Arnfjörð Bjarmason


On Fri, Sep 21 2018, Junio C Hamano wrote:

> Derrick Stolee  writes:
>
>> On 9/21/2018 10:40 AM, Ævar Arnfjörð Bjarmason wrote:
>>> On Fri, Sep 21 2018, Derrick Stolee wrote:
>>>
>>>>
>>>> This error was reported by Peff [1] and fixed in [2], but as stated
>>>> [3] I was waiting for more review before sending a v3. I'll send that
>>>> v3 shortly, responding to the feedback so far.
>>>>
>>>> -Stolee
>>>>
>>>> [1]
>>>> https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u
>>>>
>>>> [2] https://public-inbox.org/git/pull.39.git.gitgitgad...@gmail.com/
>>>>
>>>> [3]
>>>> https://public-inbox.org/git/8d6061de-1654-577c-40c6-211dbd03a...@gmail.com/
>>> Thanks and sorry for the duplicate report. I can confirm that applying
>>> the v2 of that fixes the segfault for the test case I posted.
>>
>> Thanks for the report! You are a good dogfooder.
>
> Thanks, both of you ;-).  I was aware of the issue and proposed fix
> but forgot about it when merging things down to 'master'.  Sorry
> about that.

Just a follow-up question, in your merge commit you just pushed to
"next" you say:

Recent update broke the reachability algorithm when tags pointing
at objects that are not commit were involved, which has been fixed.

And in Derrick's commit message it says:

[...]but incorrectly assumed that all objects provided were commits[...]

I just wanted to double check (without having the time to dig myself at
this point) whether this bug was understood & tested for, or whether the
case I had was just /also/ fixed for unexpected reasons.

I.e. in my upthread test case I have two annotated tags pointing at
commits, whereas the merge to "next" says "when tags pointing at objects
that are not commit were involved", which I I assume means say annotated
tags pointing at blobs..., but that's not what I had.

Wasn't this just a bug fix that had nothing to do with tags not pointing
to commits, but just ones where we had the simple case of tags pointing
to commits, but they just weren't peeled?

I'm hoping for a "Junio skimmed the fix and wrote a merge message that
wasn't quite accurate" here, but maybe that's not the case and something
might be missing (e.g. missing test code).


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Junio C Hamano
Derrick Stolee  writes:

> On 9/21/2018 10:40 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Sep 21 2018, Derrick Stolee wrote:
>>
>>>
>>> This error was reported by Peff [1] and fixed in [2], but as stated
>>> [3] I was waiting for more review before sending a v3. I'll send that
>>> v3 shortly, responding to the feedback so far.
>>>
>>> -Stolee
>>>
>>> [1]
>>> https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u
>>>
>>> [2] https://public-inbox.org/git/pull.39.git.gitgitgad...@gmail.com/
>>>
>>> [3]
>>> https://public-inbox.org/git/8d6061de-1654-577c-40c6-211dbd03a...@gmail.com/
>> Thanks and sorry for the duplicate report. I can confirm that applying
>> the v2 of that fixes the segfault for the test case I posted.
>
> Thanks for the report! You are a good dogfooder.

Thanks, both of you ;-).  I was aware of the issue and proposed fix
but forgot about it when merging things down to 'master'.  Sorry
about that.


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Derrick Stolee

On 9/21/2018 10:40 AM, Ævar Arnfjörð Bjarmason wrote:

On Fri, Sep 21 2018, Derrick Stolee wrote:



This error was reported by Peff [1] and fixed in [2], but as stated
[3] I was waiting for more review before sending a v3. I'll send that
v3 shortly, responding to the feedback so far.

-Stolee

[1]
https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u

[2] https://public-inbox.org/git/pull.39.git.gitgitgad...@gmail.com/

[3]
https://public-inbox.org/git/8d6061de-1654-577c-40c6-211dbd03a...@gmail.com/

Thanks and sorry for the duplicate report. I can confirm that applying
the v2 of that fixes the segfault for the test case I posted.


Thanks for the report! You are a good dogfooder.

Thanks,

-Stolee



Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Ævar Arnfjörð Bjarmason


On Fri, Sep 21 2018, Derrick Stolee wrote:

> On 9/21/2018 10:30 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Sep 21 2018, Junio C Hamano wrote:
>>
>>> * ds/reachable (2018-08-28) 19 commits
>>>(merged to 'next' on 2018-08-28 at b1634b371d)
>>>   + commit-reach: correct accidental #include of C file
>>>(merged to 'next' on 2018-08-22 at 17f3275afb)
>>>   + commit-reach: use can_all_from_reach
>>>   + commit-reach: make can_all_from_reach... linear
>>>   + commit-reach: replace ref_newer logic
>>>   + test-reach: test commit_contains
>>>   + test-reach: test can_all_from_reach_with_flags
>>>   + test-reach: test reduce_heads
>>>   + test-reach: test get_merge_bases_many
>>>   + test-reach: test is_descendant_of
>>>   + test-reach: test in_merge_bases
>>>   + test-reach: create new test tool for ref_newer
>>>   + commit-reach: move can_all_from_reach_with_flags
>>>   + upload-pack: generalize commit date cutoff
>>>   + upload-pack: refactor ok_to_give_up()
>>>   + upload-pack: make reachable() more generic
>>>   + commit-reach: move commit_contains from ref-filter
>>>   + commit-reach: move ref_newer from remote.c
>>>   + commit.h: remove method declarations
>>>   + commit-reach: move walk methods from commit.c
>>>
>>>   The code for computing history reachability has been shuffled,
>>>   obtained a bunch of new tests to cover them, and then being
>>>   improved.
>> There's a segfault now in master when fetching because of 4fbcca4eff
>> ("commit-reach: make can_all_from_reach... linear", 2018-07-20). I have
>> not had time to debug this, or found an easy isolated test case, but
>> this script will reliably make it segfault for me:
>>
>>  #!/bin/sh
>>
>>  git_dot_git=/home/avar/g/git
>>
>>  old=v0.99
>>  new=v0.99.1
>>
>>  rm -rf /tmp/srv
>>  rm -rf /tmp/cln
>>  git init --bare /tmp/srv
>>  git init --bare /tmp/cln
>>  $git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
>> $old:refs/whatever/ref
>>  $git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln fetch 
>> file:///tmp/srv 'refs/*:refs/*'
>>  $git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
>> $new:refs/whatever/ref
>>  if GIT_TRACE=1 $git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln 
>> fetch file:///tmp/srv 'refs/*:refs/*'
>>  then
>>  exit 0
>>  else
>>  exit 1
>>  fi
>>
>> I.e. you need to push the v0.99 tag to its own repo, fetch that from
>> another one, then push v0.99.1, fetch everything, and you'll get a
>> segfault:
>>
>>  remote: Resolving deltas: 100% (187/187), completed with 48 local 
>> objects.
>>  To file:///tmp/srv
>> d6602ec519..f25a265a34  v0.99.1 -> refs/whatever/ref
>>  14:26:44.505787 git.c:415   trace: built-in: git fetch 
>> file:///tmp/srv 'refs/*:refs/*'
>>  14:26:44.506708 run-command.c:637   trace: run_command: unset 
>> GIT_DIR GIT_IMPLICIT_WORK_TREE GIT_PREFIX; 'git-upload-pack '\''/tmp/srv'\'''
>>  14:26:44.508831 git.c:415   trace: built-in: git 
>> upload-pack /tmp/srv
>>  14:26:44.509953 run-command.c:637   trace: run_command: git 
>> rev-list --objects --stdin --not --all --quiet
>>  Segmentation fault
>>  fatal: The remote end hung up unexpectedly
>>
>> Same with refs/tags/ref b.t.w., not just refs/whatever/ref, I just was
>> initially testing this for some follow-up work on my series for checking
>> how fetching to various non-heads/tags namespaces works.
>
> This error was reported by Peff [1] and fixed in [2], but as stated
> [3] I was waiting for more review before sending a v3. I'll send that
> v3 shortly, responding to the feedback so far.
>
> -Stolee
>
> [1]
> https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u
>
> [2] https://public-inbox.org/git/pull.39.git.gitgitgad...@gmail.com/
>
> [3]
> https://public-inbox.org/git/8d6061de-1654-577c-40c6-211dbd03a...@gmail.com/

Thanks and sorry for the duplicate report. I can confirm that applying
the v2 of that fixes the segfault for the test case I posted.


Re: Segfault in master due to 4fbcca4eff

2018-09-21 Thread Derrick Stolee

On 9/21/2018 10:30 AM, Ævar Arnfjörð Bjarmason wrote:

On Fri, Sep 21 2018, Junio C Hamano wrote:


* ds/reachable (2018-08-28) 19 commits
   (merged to 'next' on 2018-08-28 at b1634b371d)
  + commit-reach: correct accidental #include of C file
   (merged to 'next' on 2018-08-22 at 17f3275afb)
  + commit-reach: use can_all_from_reach
  + commit-reach: make can_all_from_reach... linear
  + commit-reach: replace ref_newer logic
  + test-reach: test commit_contains
  + test-reach: test can_all_from_reach_with_flags
  + test-reach: test reduce_heads
  + test-reach: test get_merge_bases_many
  + test-reach: test is_descendant_of
  + test-reach: test in_merge_bases
  + test-reach: create new test tool for ref_newer
  + commit-reach: move can_all_from_reach_with_flags
  + upload-pack: generalize commit date cutoff
  + upload-pack: refactor ok_to_give_up()
  + upload-pack: make reachable() more generic
  + commit-reach: move commit_contains from ref-filter
  + commit-reach: move ref_newer from remote.c
  + commit.h: remove method declarations
  + commit-reach: move walk methods from commit.c

  The code for computing history reachability has been shuffled,
  obtained a bunch of new tests to cover them, and then being
  improved.

There's a segfault now in master when fetching because of 4fbcca4eff
("commit-reach: make can_all_from_reach... linear", 2018-07-20). I have
not had time to debug this, or found an easy isolated test case, but
this script will reliably make it segfault for me:

 #!/bin/sh

 git_dot_git=/home/avar/g/git

 old=v0.99
 new=v0.99.1

 rm -rf /tmp/srv
 rm -rf /tmp/cln
 git init --bare /tmp/srv
 git init --bare /tmp/cln
 $git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
$old:refs/whatever/ref
 $git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln fetch 
file:///tmp/srv 'refs/*:refs/*'
 $git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
$new:refs/whatever/ref
 if GIT_TRACE=1 $git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln fetch 
file:///tmp/srv 'refs/*:refs/*'
 then
 exit 0
 else
 exit 1
 fi

I.e. you need to push the v0.99 tag to its own repo, fetch that from
another one, then push v0.99.1, fetch everything, and you'll get a
segfault:

 remote: Resolving deltas: 100% (187/187), completed with 48 local objects.
 To file:///tmp/srv
d6602ec519..f25a265a34  v0.99.1 -> refs/whatever/ref
 14:26:44.505787 git.c:415   trace: built-in: git fetch 
file:///tmp/srv 'refs/*:refs/*'
 14:26:44.506708 run-command.c:637   trace: run_command: unset GIT_DIR 
GIT_IMPLICIT_WORK_TREE GIT_PREFIX; 'git-upload-pack '\''/tmp/srv'\'''
 14:26:44.508831 git.c:415   trace: built-in: git upload-pack 
/tmp/srv
 14:26:44.509953 run-command.c:637   trace: run_command: git rev-list 
--objects --stdin --not --all --quiet
 Segmentation fault
 fatal: The remote end hung up unexpectedly

Same with refs/tags/ref b.t.w., not just refs/whatever/ref, I just was
initially testing this for some follow-up work on my series for checking
how fetching to various non-heads/tags namespaces works.


This error was reported by Peff [1] and fixed in [2], but as stated [3] 
I was waiting for more review before sending a v3. I'll send that v3 
shortly, responding to the feedback so far.


-Stolee

[1] 
https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u


[2] https://public-inbox.org/git/pull.39.git.gitgitgad...@gmail.com/

[3] 
https://public-inbox.org/git/8d6061de-1654-577c-40c6-211dbd03a...@gmail.com/




Segfault in master due to 4fbcca4eff

2018-09-21 Thread Ævar Arnfjörð Bjarmason


On Fri, Sep 21 2018, Junio C Hamano wrote:

> * ds/reachable (2018-08-28) 19 commits
>   (merged to 'next' on 2018-08-28 at b1634b371d)
>  + commit-reach: correct accidental #include of C file
>   (merged to 'next' on 2018-08-22 at 17f3275afb)
>  + commit-reach: use can_all_from_reach
>  + commit-reach: make can_all_from_reach... linear
>  + commit-reach: replace ref_newer logic
>  + test-reach: test commit_contains
>  + test-reach: test can_all_from_reach_with_flags
>  + test-reach: test reduce_heads
>  + test-reach: test get_merge_bases_many
>  + test-reach: test is_descendant_of
>  + test-reach: test in_merge_bases
>  + test-reach: create new test tool for ref_newer
>  + commit-reach: move can_all_from_reach_with_flags
>  + upload-pack: generalize commit date cutoff
>  + upload-pack: refactor ok_to_give_up()
>  + upload-pack: make reachable() more generic
>  + commit-reach: move commit_contains from ref-filter
>  + commit-reach: move ref_newer from remote.c
>  + commit.h: remove method declarations
>  + commit-reach: move walk methods from commit.c
>
>  The code for computing history reachability has been shuffled,
>  obtained a bunch of new tests to cover them, and then being
>  improved.

There's a segfault now in master when fetching because of 4fbcca4eff
("commit-reach: make can_all_from_reach... linear", 2018-07-20). I have
not had time to debug this, or found an easy isolated test case, but
this script will reliably make it segfault for me:

#!/bin/sh

git_dot_git=/home/avar/g/git

old=v0.99
new=v0.99.1

rm -rf /tmp/srv
rm -rf /tmp/cln
git init --bare /tmp/srv
git init --bare /tmp/cln
$git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
$old:refs/whatever/ref
$git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln fetch file:///tmp/srv 
'refs/*:refs/*'
$git_dot_git/git --exec-path=$git_dot_git push file:///tmp/srv 
$new:refs/whatever/ref
if GIT_TRACE=1 $git_dot_git/git --exec-path=$git_dot_git -C /tmp/cln fetch 
file:///tmp/srv 'refs/*:refs/*'
then
exit 0
else
exit 1
fi

I.e. you need to push the v0.99 tag to its own repo, fetch that from
another one, then push v0.99.1, fetch everything, and you'll get a
segfault:

remote: Resolving deltas: 100% (187/187), completed with 48 local objects.
To file:///tmp/srv
   d6602ec519..f25a265a34  v0.99.1 -> refs/whatever/ref
14:26:44.505787 git.c:415   trace: built-in: git fetch 
file:///tmp/srv 'refs/*:refs/*'
14:26:44.506708 run-command.c:637   trace: run_command: unset GIT_DIR 
GIT_IMPLICIT_WORK_TREE GIT_PREFIX; 'git-upload-pack '\''/tmp/srv'\'''
14:26:44.508831 git.c:415   trace: built-in: git upload-pack 
/tmp/srv
14:26:44.509953 run-command.c:637   trace: run_command: git rev-list 
--objects --stdin --not --all --quiet
Segmentation fault
fatal: The remote end hung up unexpectedly

Same with refs/tags/ref b.t.w., not just refs/whatever/ref, I just was
initially testing this for some follow-up work on my series for checking
how fetching to various non-heads/tags namespaces works.


[PATCHv2 0/2] Fix rerere segfault with specially crafted merge

2018-09-11 Thread Elijah Newren
This series documents and fixes a rerere segfault (dating back to
git-2.7.0) when a merge strategy makes the last entry in the index be
at stage 1.

Changes since last version:
  - In my last submission I only had the code fix and no testcase; I
even commented in the commit message that the bug "cannot be
triggered in the current codebase" and mentioned that I was fixing
it just because an exotic external merge strategy could trigger
the bug.  I realized later that it is triggerable without any
exotic external merge strategies; built-in ones can do it.

Elijah Newren (2):
  t4200: demonstrate rerere segfault on specially crafted merge
  rerere: avoid buffer overrun

 rerere.c  |  2 +-
 t/t4200-rerere.sh | 29 +
 2 files changed, 30 insertions(+), 1 deletion(-)

-- 
2.19.0.2.gdbd064c81f



[PATCHv2 1/2] t4200: demonstrate rerere segfault on specially crafted merge

2018-09-11 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t4200-rerere.sh | 29 +
 1 file changed, 29 insertions(+)

diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 65da74c766..f9294b7677 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -577,4 +577,33 @@ test_expect_success 'multiple identical conflicts' '
count_pre_post 0 0
 '
 
+test_expect_success 'setup simple stage 1 handling' '
+   test_create_repo stage_1_handling &&
+   (
+   cd stage_1_handling &&
+
+   test_seq 1 10 >original &&
+   git add original &&
+   git commit -m original &&
+
+   git checkout -b A master &&
+   git mv original A &&
+   git commit -m "rename to A" &&
+
+   git checkout -b B master &&
+   git mv original B &&
+   git commit -m "rename to B"
+   )
+'
+
+test_expect_failure 'test simple stage 1 handling' '
+   (
+   cd stage_1_handling &&
+
+   git config rerere.enabled true &&
+   git checkout A^0 &&
+   test_must_fail git merge B^0
+   )
+'
+
 test_done
-- 
2.19.0.2.gdbd064c81f



Re: [PATCH 2/3] t/helper/test-delta: segfault on OOB access

2018-08-29 Thread Jeff King
On Wed, Aug 29, 2018 at 05:46:21PM -0400, Jeff King wrote:

> > I think even with ASAN, you'd still need read_in_full() or an mmap()
> > wrapper that fiddles with the ASAN shadow, because mmap() always maps
> > whole pages:
> > 
> > $ cat mmap-read-asan-blah.c
> > #include 
> > #include 
> > int main(void) {
> >   volatile char *p = mmap(NULL, 1, PROT_READ|PROT_WRITE,
> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
> >   p[200] = 1;
> > }
> > $ gcc -o mmap-read-asan-blah mmap-read-asan-blah.c -fsanitize=address
> > $ ./mmap-read-asan-blah
> > $
> 
> Yeah, I was just trying to run your tests with ASan and couldn't
> convince it to complain. I also tried MSan, but no luck.
> 
> > But that aside, you do have a point about having some custom hack for
> > a single patch.
> 
> I'm also not sure how portable it is. Looks like we have a Windows
> wrapper for getpagesize(), but I don't see any other uses of mprotect().

Actually, there is no real need for this test helper to use mmap. I
suppose we could swap it out for malloc + read_in_full() and let ASan
(or even valgrind) handle the tricky parts.

-Peff


Re: [PATCH 2/3] t/helper/test-delta: segfault on OOB access

2018-08-29 Thread Jeff King
On Wed, Aug 29, 2018 at 11:40:41PM +0200, Jann Horn wrote:

> > If we want to detect this kind of thing in tests, we should probably be
> > relying on tools like ASan, which would cover all mmaps.
> >
> > It would be nice if there was a low-cost way to detect this in
> > production use, but it looks like this replaces mmap with
> > read_in_full(), which I think is a non-starter for most uses.
> 
> I think even with ASAN, you'd still need read_in_full() or an mmap()
> wrapper that fiddles with the ASAN shadow, because mmap() always maps
> whole pages:
> 
> $ cat mmap-read-asan-blah.c
> #include 
> #include 
> int main(void) {
>   volatile char *p = mmap(NULL, 1, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>   p[200] = 1;
> }
> $ gcc -o mmap-read-asan-blah mmap-read-asan-blah.c -fsanitize=address
> $ ./mmap-read-asan-blah
> $

Yeah, I was just trying to run your tests with ASan and couldn't
convince it to complain. I also tried MSan, but no luck.

> But that aside, you do have a point about having some custom hack for
> a single patch.

I'm also not sure how portable it is. Looks like we have a Windows
wrapper for getpagesize(), but I don't see any other uses of mprotect().

-Peff


Re: [PATCH 2/3] t/helper/test-delta: segfault on OOB access

2018-08-29 Thread Jann Horn
On Wed, Aug 29, 2018 at 11:34 PM Jeff King  wrote:
>
> On Wed, Aug 29, 2018 at 10:58:56PM +0200, Jann Horn wrote:
>
> > This ensures that any attempts to access memory directly after the input
> > buffer or delta buffer in a delta test will cause a segmentation fault.
> >
> > Inspired by vsftpd.
>
> Neat trick, but it seems funny to protect this one buffer in
> non-production code. Obviously you were interested in demonstrating the
> issue for your tests, but do we want to carry this all the time?
>
> If we want to detect this kind of thing in tests, we should probably be
> relying on tools like ASan, which would cover all mmaps.
>
> It would be nice if there was a low-cost way to detect this in
> production use, but it looks like this replaces mmap with
> read_in_full(), which I think is a non-starter for most uses.

I think even with ASAN, you'd still need read_in_full() or an mmap()
wrapper that fiddles with the ASAN shadow, because mmap() always maps
whole pages:

$ cat mmap-read-asan-blah.c
#include 
#include 
int main(void) {
  volatile char *p = mmap(NULL, 1, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
  p[200] = 1;
}
$ gcc -o mmap-read-asan-blah mmap-read-asan-blah.c -fsanitize=address
$ ./mmap-read-asan-blah
$

But that aside, you do have a point about having some custom hack for
a single patch.


Re: [PATCH 2/3] t/helper/test-delta: segfault on OOB access

2018-08-29 Thread Jeff King
On Wed, Aug 29, 2018 at 10:58:56PM +0200, Jann Horn wrote:

> This ensures that any attempts to access memory directly after the input
> buffer or delta buffer in a delta test will cause a segmentation fault.
> 
> Inspired by vsftpd.

Neat trick, but it seems funny to protect this one buffer in
non-production code. Obviously you were interested in demonstrating the
issue for your tests, but do we want to carry this all the time?

If we want to detect this kind of thing in tests, we should probably be
relying on tools like ASan, which would cover all mmaps.

It would be nice if there was a low-cost way to detect this in
production use, but it looks like this replaces mmap with
read_in_full(), which I think is a non-starter for most uses.

-Peff


[PATCH 2/3] t/helper/test-delta: segfault on OOB access

2018-08-29 Thread Jann Horn
This ensures that any attempts to access memory directly after the input
buffer or delta buffer in a delta test will cause a segmentation fault.

Inspired by vsftpd.

Signed-off-by: Jann Horn 
---
 t/helper/test-delta.c | 78 +--
 1 file changed, 53 insertions(+), 25 deletions(-)

diff --git a/t/helper/test-delta.c b/t/helper/test-delta.c
index 34c725924..64d0ec902 100644
--- a/t/helper/test-delta.c
+++ b/t/helper/test-delta.c
@@ -16,45 +16,73 @@
 static const char usage_str[] =
"test-tool delta (-d|-p)   ";
 
-int cmd__delta(int argc, const char **argv)
+/*
+ * We want to detect OOB reads behind the resulting buffer, even in non-ASAN
+ * builds. This helper reads some data into memory, aligns the *end* of the
+ * buffer on a page boundary, and reserves the next virtual page. This ensures
+ * that a single-byte OOB access segfaults.
+ */
+static void *map_with_adjacent_trailing_guard(const char *path,
+ unsigned long *sizep)
 {
int fd;
struct stat st;
-   void *from_buf, *data_buf, *out_buf;
-   unsigned long from_size, data_size, out_size;
+   unsigned long page_size = getpagesize();
+   unsigned long padded_size, padding;
 
-   if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
-   fprintf(stderr, "usage: %s\n", usage_str);
-   return 1;
+   fd = open(path, O_RDONLY);
+   if (fd < 0) {
+   perror(path);
+   return NULL;
}
+   if (fstat(fd, &st)) {
+   perror(path);
+   close(fd);
+   return NULL;
+   }
+   *sizep = st.st_size;
 
-   fd = open(argv[2], O_RDONLY);
-   if (fd < 0 || fstat(fd, &st)) {
-   perror(argv[2]);
-   return 1;
+   /* pad in front for alignment and add trailing page */
+   padded_size = ((page_size-1) + st.st_size + page_size) & ~(page_size-1);
+   padding = padded_size - (st.st_size + page_size);
+
+   char *mapping = mmap(NULL, padded_size, PROT_READ|PROT_WRITE,
+MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
+   if (mapping == MAP_FAILED ||
+   mprotect(mapping + padded_size - page_size, page_size, PROT_NONE)) {
+   perror("mmap");
+   close(fd);
+   return NULL;
}
-   from_size = st.st_size;
-   from_buf = mmap(NULL, from_size, PROT_READ, MAP_PRIVATE, fd, 0);
-   if (from_buf == MAP_FAILED) {
-   perror(argv[2]);
+   if (read_in_full(fd, mapping + padding, st.st_size) != st.st_size) {
+   perror("read_in_full");
+   munmap(mapping, padded_size);
close(fd);
-   return 1;
+   return NULL;
}
+   mprotect(mapping, padded_size - page_size, PROT_READ);
close(fd);
+   return mapping + padding;
+}
 
-   fd = open(argv[3], O_RDONLY);
-   if (fd < 0 || fstat(fd, &st)) {
-   perror(argv[3]);
+int cmd__delta(int argc, const char **argv)
+{
+   int fd;
+   void *from_buf, *data_buf, *out_buf;
+   unsigned long from_size, data_size, out_size;
+
+   if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
+   fprintf(stderr, "usage: %s\n", usage_str);
return 1;
}
-   data_size = st.st_size;
-   data_buf = mmap(NULL, data_size, PROT_READ, MAP_PRIVATE, fd, 0);
-   if (data_buf == MAP_FAILED) {
-   perror(argv[3]);
-   close(fd);
+
+   from_buf = map_with_adjacent_trailing_guard(argv[2], &from_size);
+   if (from_buf == NULL)
+   return 1;
+
+   data_buf = map_with_adjacent_trailing_guard(argv[3], &data_size);
+   if (data_buf == NULL)
return 1;
-   }
-   close(fd);
 
if (argv[1][1] == 'd')
out_buf = diff_delta(from_buf, from_size,
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog



Re: [BUG] 'git ls-files --no-exclude' segfault & co

2018-08-06 Thread Jeff King
On Mon, Aug 06, 2018 at 06:07:06PM +0200, SZEDER Gábor wrote:

> 'git ls-files' has the options '--exclude', '--exclude-from',
> '--exclude-per-directory', and '--exclude-standard', which all work
> fine.  However, it also allows the negated version of all these four
> options, and they definitely don't work very well:
> 
>   $ git ls-files --no-exclude
>   Segmentation fault
>   $ git ls-files --no-exclude-from
>   warning: unable to access '(null)': Bad address
>   fatal: cannot use (null) as an exclude file
> 
> And '--no-exclude-standard' has the same effect as
> '--exclude-standard', because its parseopt callback function
> option_parse_exclude_standard() doesn't bother to look at its 'unset'
> parameter.

I think --exclude-per-directory is fine, since it uses OPT_STRING().
Using "--no-exclude-per-directory" just means we'll cancel any
previously found option.

In an ideal world we'd perhaps do something useful with the negated
forms for the others, but I don't think the underlying code is set up to
do that (i.e., how do you undo "setup_standard_excludes()"). Possibly we
could switch to setting a flag (which could then be cleared), and
resolve the flags after parsing the whole command line. But often
options with callbacks list this have subtle user-visible timing effects
(e.g., that the command line options need to take effect in the order
they were found).

So unless somebody actually wants these negated forms to do something
useful, it's probably not worth the trouble and risk of regression.  But
we obviously should at least disallow them explicitly rather than
segfaulting.

I thought adding PARSE_OPT_NONEG like this would work:

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 88bb2019ad..9adee62358 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -547,15 +547,16 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
N_("show resolve-undo information")),
{ OPTION_CALLBACK, 'x', "exclude", &exclude_list, N_("pattern"),
N_("skip files matching pattern"),
-   0, option_parse_exclude },
+   PARSE_OPT_NONEG, option_parse_exclude },
{ OPTION_CALLBACK, 'X', "exclude-from", &dir, N_("file"),
N_("exclude patterns are read from "),
-   0, option_parse_exclude_from },
+   PARSE_OPT_NONEG, option_parse_exclude_from },
OPT_STRING(0, "exclude-per-directory", &dir.exclude_per_dir, 
N_("file"),
N_("read additional per-directory exclude patterns in 
")),
{ OPTION_CALLBACK, 0, "exclude-standard", &dir, NULL,
N_("add the standard git exclusions"),
-   PARSE_OPT_NOARG, option_parse_exclude_standard },
+   PARSE_OPT_NOARG | PARSE_OPT_NONEG,
+   option_parse_exclude_standard },
OPT_SET_INT_F(0, "full-name", &prefix_len,
  N_("make the output relative to the project top 
directory"),
  0, PARSE_OPT_NONEG),

But it actually does something quite interesting. Because of the NONEG
flag, we know that the user cannot mean "--no-exclude" itself. So our
liberal prefix-matching kicks in, and we treat it as
--no-exclude-per-directory. I.e., it becomes a silent noop and the user
gets no warning.

I think parse_options() may be overly liberal here, and we might want to
change that. But in the interim, it probably makes sense to just detect
the "unset" case in the callbacks, report an error, and return -1.

-Peff


[BUG] 'git ls-files --no-exclude' segfault & co

2018-08-06 Thread SZEDER Gábor
'git ls-files' has the options '--exclude', '--exclude-from',
'--exclude-per-directory', and '--exclude-standard', which all work
fine.  However, it also allows the negated version of all these four
options, and they definitely don't work very well:

  $ git ls-files --no-exclude
  Segmentation fault
  $ git ls-files --no-exclude-from
  warning: unable to access '(null)': Bad address
  fatal: cannot use (null) as an exclude file

And '--no-exclude-standard' has the same effect as
'--exclude-standard', because its parseopt callback function
option_parse_exclude_standard() doesn't bother to look at its 'unset'
parameter.


Re: BUG: Segfault on "git pull" on "bad object HEAD"

2018-07-11 Thread Junio C Hamano
Jeff King  writes:

> So I feel like the right answer here is probably this:
>
> diff --git a/wt-status.c b/wt-status.c
> index d1c05145a4..5fcaa3d0f8 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -2340,7 +2340,16 @@ int has_uncommitted_changes(int ignore_submodules)
>   if (ignore_submodules)
>   rev_info.diffopt.flags.ignore_submodules = 1;
>   rev_info.diffopt.flags.quick = 1;
> +
>   add_head_to_pending(&rev_info);
> + if (!rev_info.pending.nr) {
> + /*
> +  * We have no head (or it's corrupt), but the index is not
> +  * unborn; declare it as uncommitted changes.
> +  */
> + return 1;
> + }
> +
>   diff_setup_done(&rev_info.diffopt);
>   result = run_diff_index(&rev_info, 1);
>   return diff_result_code(&rev_info.diffopt, result);
>
> That does quietly paper over the corruption, but it does the
> conservative thing, and a follow-up "git status" would yield "bad
> object: HEAD".

Sounds quite sensible.



Re: BUG: Segfault on "git pull" on "bad object HEAD"

2018-07-11 Thread Duy Nguyen
On Wed, Jul 11, 2018 at 1:02 PM Ævar Arnfjörð Bjarmason
 wrote:
>
> This segfaults, but should print an error instead, have a repo with a
> corrupt HEAD:
>
> (
> rm -rf /tmp/git &&
> git clone --single-branch --branch todo g...@github.com:git/git.git 
> /tmp/git &&
> echo  
> >/tmp/git/.git/refs/heads/todo &&
> git -C /tmp/git pull
> )
>
> On this repository e.g. "git log" will print "fatal: bad object HEAD",
> but for some reason "git pull" makes it this far:
>
> $ git pull
> Segmentation fault
>
> The immediate reason is that in run_diff_index() we have this:
>
> ent = revs->pending.objects;
>
> And that in this case that's NULL:

Probably because add_head_to_pending() in has_uncommitted_change()
does not add anything to the "pending" list because HEAD is broken.

I think if we make add_head_to_pending() return a boolean, then we can
check that if no HEAD is added, there's no point to run_diff_index and
has_uncommitted_changes() can return 0 immediately.

A new BUG() could still be added in run_diff_index() though, to check
if revs->pending.nr is non-zero before attempting to access
revs->pending.objects.

>
> (gdb) bt
> #0  0x5565993f in run_diff_index (revs=0x7fffcb90, cached=1) 
> at diff-lib.c:524
> #1  0x557633da in has_uncommitted_changes (ignore_submodules=1) 
> at wt-status.c:2345
> #2  0x557634c9 in require_clean_work_tree (action=0x55798f18 
> "pull with rebase", hint=0x55798efb "please commit or stash them.", 
> ignore_submodules=1, gently=0) at wt-status.c:2370
> #3  0x555dbdee in cmd_pull (argc=0, argv=0x7fffd868, 
> prefix=0x0) at builtin/pull.c:885
> #4  0x5556c9da in run_builtin (p=0x55a2de50 , 
> argc=1, argv=0x7fffd868) at git.c:417
> #5  0x5556cce2 in handle_builtin (argc=1, argv=0x7fffd868) at 
> git.c:633
> #6  0x5556ce8a in run_argv (argcp=0x7fffd71c, 
> argv=0x7fffd710) at git.c:685
> #7  0x5556d03f in cmd_main (argc=1, argv=0x7fffd868) at 
> git.c:762
> #8  0x55611786 in main (argc=3, argv=0x7fffd858) at 
> common-main.c:45
> (gdb) p revs
> $4 = (struct rev_info *) 0x7fffcb90
> (gdb) p revs->pending
> $5 = {nr = 0, alloc = 0, objects = 0x0}
> (gdb)
>
> This has been an issue since at least v2.8.0 (didn't test back
> further). I'm not familiar with the status / diff code, so I'm not sure
> where the assertion should be added.
>
> This came up in the wild due to a user with a corrupt repo (don't know
> how it got corrupt) trying "git pull" and seeing git segfault.



-- 
Duy


Re: BUG: Segfault on "git pull" on "bad object HEAD"

2018-07-11 Thread Jeff King
On Wed, Jul 11, 2018 at 01:00:57PM +0200, Ævar Arnfjörð Bjarmason wrote:

> This segfaults, but should print an error instead, have a repo with a
> corrupt HEAD:
> 
> (
> rm -rf /tmp/git &&
> git clone --single-branch --branch todo g...@github.com:git/git.git 
> /tmp/git &&
> echo  
> >/tmp/git/.git/refs/heads/todo &&
> git -C /tmp/git pull
> )

It took me a minute to reproduce this. It needs "pull --rebase" if you
don't have that setup in your config.

> The immediate reason is that in run_diff_index() we have this:
> 
>   ent = revs->pending.objects;
> 
> And that in this case that's NULL:
> 
> (gdb) bt
> #0  0x5565993f in run_diff_index (revs=0x7fffcb90, cached=1) 
> at diff-lib.c:524
> #1  0x557633da in has_uncommitted_changes (ignore_submodules=1) 
> at wt-status.c:2345

These two are the interesting functions. has_uncommitted_changes() calls
add_head_to_pending(). So it could realize then that there is no valid
HEAD to compare against.

But as you note, it's run_diff_index() that blindly dereferences
revs->pending.objects without seeing if it's non-empty. Normally
setup_revisions() would barf on a bad object, but the manual
add_head_to_pending() quietly returns (as it must for some cases, like
unborn branches).

So I feel like the right answer here is probably this:

diff --git a/wt-status.c b/wt-status.c
index d1c05145a4..5fcaa3d0f8 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -2340,7 +2340,16 @@ int has_uncommitted_changes(int ignore_submodules)
if (ignore_submodules)
rev_info.diffopt.flags.ignore_submodules = 1;
rev_info.diffopt.flags.quick = 1;
+
add_head_to_pending(&rev_info);
+   if (!rev_info.pending.nr) {
+   /*
+* We have no head (or it's corrupt), but the index is not
+* unborn; declare it as uncommitted changes.
+*/
+   return 1;
+   }
+
diff_setup_done(&rev_info.diffopt);
result = run_diff_index(&rev_info, 1);
return diff_result_code(&rev_info.diffopt, result);

That does quietly paper over the corruption, but it does the
conservative thing, and a follow-up "git status" would yield "bad
object: HEAD".

I do worry that other callers of run_diff_index() might have similar
problems, though. Grepping around, the other callers seem to fall into
one of three categories:

 - they resolve the object themselves and put it in the pending list
   (and often fallback to the empty tree, which is more or less what the
   patch above is doing)

 - they resolve the object themselves and avoid calling run_diff_index()
   if it's not valid

 - they use setup_revisions(), which will barf on the broken object

So I think this may be sufficient. We probably should also add an
assertion to run_diff_index(), since that's better than segfaulting.

-Peff


BUG: Segfault on "git pull" on "bad object HEAD"

2018-07-11 Thread Ævar Arnfjörð Bjarmason
This segfaults, but should print an error instead, have a repo with a
corrupt HEAD:

(
rm -rf /tmp/git &&
git clone --single-branch --branch todo g...@github.com:git/git.git 
/tmp/git &&
echo  
>/tmp/git/.git/refs/heads/todo &&
git -C /tmp/git pull
)

On this repository e.g. "git log" will print "fatal: bad object HEAD",
but for some reason "git pull" makes it this far:

$ git pull
Segmentation fault

The immediate reason is that in run_diff_index() we have this:

ent = revs->pending.objects;

And that in this case that's NULL:

(gdb) bt
#0  0x5565993f in run_diff_index (revs=0x7fffcb90, cached=1) at 
diff-lib.c:524
#1  0x557633da in has_uncommitted_changes (ignore_submodules=1) at 
wt-status.c:2345
#2  0x557634c9 in require_clean_work_tree (action=0x55798f18 
"pull with rebase", hint=0x55798efb "please commit or stash them.", 
ignore_submodules=1, gently=0) at wt-status.c:2370
#3  0x555dbdee in cmd_pull (argc=0, argv=0x7fffd868, 
prefix=0x0) at builtin/pull.c:885
#4  0x5556c9da in run_builtin (p=0x55a2de50 , 
argc=1, argv=0x7fffd868) at git.c:417
#5  0x5556cce2 in handle_builtin (argc=1, argv=0x7fffd868) at 
git.c:633
#6  0x5556ce8a in run_argv (argcp=0x7fffd71c, 
argv=0x7fffd710) at git.c:685
#7  0x5556d03f in cmd_main (argc=1, argv=0x7fffd868) at 
git.c:762
#8  0x55611786 in main (argc=3, argv=0x7fffd858) at 
common-main.c:45
(gdb) p revs
$4 = (struct rev_info *) 0x7fffcb90
(gdb) p revs->pending
$5 = {nr = 0, alloc = 0, objects = 0x0}
(gdb)

This has been an issue since at least v2.8.0 (didn't test back
further). I'm not familiar with the status / diff code, so I'm not sure
where the assertion should be added.

This came up in the wild due to a user with a corrupt repo (don't know
how it got corrupt) trying "git pull" and seeing git segfault.


Re: BUG: rev-parse segfault with invalid input

2018-05-23 Thread Todd Zullinger
Hi,

Elijah Newren wrote:
> Thanks for the detailed report.  This apparently goes back to
> git-1.6.0 with commit 2122f8b963d4 ("rev-parse: Add support for the ^!
> and ^@ syntax", 2008-07-26).  We aren't checking that the commit from
> lookup_commit_reference() is non-NULL before proceeding.  Looks like
> it's simple to fix.  I'll send a patch shortly...

Thanks Elijah!  I thought it was likely to be a simple fix.
But I also don't know the area well and that kept me from
being too ambitious about suggesting a fix or the difficulty
of one. :)

-- 
Todd
~~
I believe in the noble, aristocratic art of doing absolutely nothing.
And someday, I hope to be in a position where I can do even less.



Re: BUG: rev-parse segfault with invalid input

2018-05-23 Thread Elijah Newren
On Wed, May 23, 2018 at 12:52 PM, Todd Zullinger  wrote:
> Hi,
>
> Certain invalid input causes git rev-parse to crash rather
> than return a 'fatal: ambiguous argument ...' error.
>
> This was reported against the Fedora git package:
>
> https://bugzilla.redhat.com/1581678
>
> Simple reproduction recipe and analysis, from the bug:
>
> $ git init
> Initialized empty Git repository in /tmp/t/.git/
> $ git rev-parse ^@
> Segmentation fault (core dumped)
>
> gdb) break lookup_commit_reference
> Breakpoint 1 at 0x55609f00: lookup_commit_reference. (3 locations)
> (gdb) r
> Starting program: /usr/bin/git rev-parse 
> \^@
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
>
> Breakpoint 1, lookup_commit_reference (oid=oid@entry=0x7fffd550) at 
> commit.c:34
> 34  return lookup_commit_reference_gently(oid, 0);
> (gdb) finish
> Run till exit from #0  lookup_commit_reference 
> (oid=oid@entry=0x7fffd550) at commit.c:34
> try_parent_shorthands (arg=0x7fffdd44 'f' ) at 
> builtin/rev-parse.c:314
> 314 include_parents = 1;
> Value returned is $1 = (struct commit *) 0x0
> (gdb) c
>
> (gdb) c
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> try_parent_shorthands (arg=0x7fffdd44 'f' ) at 
> builtin/rev-parse.c:345
> 345 for (parents = commit->parents, parent_number = 1;
> (gdb) l 336,+15
> 336 commit = lookup_commit_reference(&oid);
> 337 if (exclude_parent &&
> 338 exclude_parent > commit_list_count(commit->parents)) {
> 339 *dotdot = '^';
> 340 return 0;
> 341 }
> 342
> 343 if (include_rev)
> 344 show_rev(NORMAL, &oid, arg);
> 345 for (parents = commit->parents, parent_number = 1;
> 346  parents;
> 347  parents = parents->next, parent_number++) {
> 348 char *name = NULL;
> 349
> 350 if (exclude_parent && parent_number != 
> exclude_parent)
> 351 continue;
>
> Looks like a null pointer check is missing.
>
> This occurs on master and as far back as 1.8.3.1 (what's in
> RHEL-6, I didn't try to test anything older).  Only a string
> with 40 valid hex characters and ^@, @-, of ^!  seems to
> trigger it.

Thanks for the detailed report.  This apparently goes back to
git-1.6.0 with commit 2122f8b963d4 ("rev-parse: Add support for the ^!
and ^@ syntax", 2008-07-26).  We aren't checking that the commit from
lookup_commit_reference() is non-NULL before proceeding.  Looks like
it's simple to fix.  I'll send a patch shortly...


BUG: rev-parse segfault with invalid input

2018-05-23 Thread Todd Zullinger
Hi,

Certain invalid input causes git rev-parse to crash rather
than return a 'fatal: ambiguous argument ...' error.

This was reported against the Fedora git package:

https://bugzilla.redhat.com/1581678

Simple reproduction recipe and analysis, from the bug:

$ git init
Initialized empty Git repository in /tmp/t/.git/
$ git rev-parse ^@
Segmentation fault (core dumped)

gdb) break lookup_commit_reference
Breakpoint 1 at 0x55609f00: lookup_commit_reference. (3 locations)
(gdb) r
Starting program: /usr/bin/git rev-parse 
\^@
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, lookup_commit_reference (oid=oid@entry=0x7fffd550) at 
commit.c:34
34  return lookup_commit_reference_gently(oid, 0);
(gdb) finish
Run till exit from #0  lookup_commit_reference 
(oid=oid@entry=0x7fffd550) at commit.c:34
try_parent_shorthands (arg=0x7fffdd44 'f' ) at 
builtin/rev-parse.c:314
314 include_parents = 1;
Value returned is $1 = (struct commit *) 0x0
(gdb) c

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
try_parent_shorthands (arg=0x7fffdd44 'f' ) at 
builtin/rev-parse.c:345
345 for (parents = commit->parents, parent_number = 1;
(gdb) l 336,+15
336 commit = lookup_commit_reference(&oid);
337 if (exclude_parent &&
338 exclude_parent > commit_list_count(commit->parents)) {
339 *dotdot = '^';
340 return 0;
341 }
342 
343 if (include_rev)
344 show_rev(NORMAL, &oid, arg);
345 for (parents = commit->parents, parent_number = 1;
346  parents;
347  parents = parents->next, parent_number++) {
348 char *name = NULL;
349 
350 if (exclude_parent && parent_number != 
exclude_parent)
351 continue;

Looks like a null pointer check is missing.

This occurs on master and as far back as 1.8.3.1 (what's in
RHEL-6, I didn't try to test anything older).  Only a string
with 40 valid hex characters and ^@, @-, of ^!  seems to
trigger it.

-- 
Todd
~~
I don't mind arguing with myself. It's when I lose that it bothers me.
-- Richard Powers



Re: [PATCH 0/2] fix a segfault in get_main_ref_store()

2018-05-18 Thread Johannes Schindelin
Hi Peff,

On Fri, 18 May 2018, Jeff King wrote:

> I stumbled across a BUG() today. But interestingly, in the current tip
> of master it actually segfaults instead! This fixes the segfault (back
> into a BUG(), and then fixes the caller to avoid the BUG() in the first
> place).
> 
>   [1/2]: get_main_ref_store: BUG() when outside a repository
>   [2/2]: config: die when --blob is used outside a repository
> 
>  builtin/config.c   | 3 +++
>  refs.c | 3 +++
>  t/t1307-config-blob.sh | 4 
>  3 files changed, 10 insertions(+)

Both patches look obviously correct to me.

Thanks,
Dscho


[PATCH 0/2] fix a segfault in get_main_ref_store()

2018-05-18 Thread Jeff King
I stumbled across a BUG() today. But interestingly, in the current tip
of master it actually segfaults instead! This fixes the segfault (back
into a BUG(), and then fixes the caller to avoid the BUG() in the first
place).

  [1/2]: get_main_ref_store: BUG() when outside a repository
  [2/2]: config: die when --blob is used outside a repository

 builtin/config.c   | 3 +++
 refs.c | 3 +++
 t/t1307-config-blob.sh | 4 
 3 files changed, 10 insertions(+)

-Peff


Re: [PATCH] git-svn: control destruction order to avoid segfault

2018-01-30 Thread Junio C Hamano
Eric Wong  writes:

> Todd Zullinger  wrote:
>> I'm running the tests with and without your patch as well.
>> So far I've run t9128 300 times with the patch and no
>> failures.  Without it, it's failed 3 times in only a few
>> dozen runs.  That's promising.
>
> Thanks for confirming it works on other systems.
> Pull request and patch below:
>
> The following changes since commit 5be1f00a9a701532232f57958efab4be8c959a29:
>
>   First batch after 2.16 (2018-01-23 13:21:10 -0800)
>
> are available in the Git repository at:
>
>   git://bogomips.org/git-svn.git svn-branch-segfault
>
> for you to fetch changes up to 2784b8d68faca823489949cbc69ead2f296cfc07:
>
>   git-svn: control destruction order to avoid segfault (2018-01-29 23:12:00 
> +)
>
> ----
> Eric Wong (1):
>   git-svn: control destruction order to avoid segfault
>
>  git-svn.perl | 5 +
>  1 file changed, 5 insertions(+)

Thanks.  I'd actually apply this as a patch instead of pullilng, as
I suspect you'd want it in 'maint' as well, though.


> -8<-
> Subject: [PATCH] git-svn: control destruction order to avoid segfault
>
> It seems necessary to control destruction ordering to avoid a
> segfault with SVN 1.9.5 when using "git svn branch".
> I've also reported the problem against libsvn-perl to Debian
> [Bug #888791], but releasing the SVN::Client instance can be
> beneficial anyways to save memory.
>
> ref: https://bugs.debian.org/888791
> Tested-by: Todd Zullinger 
> Reported-by: brian m. carlson 
> Signed-off-by: Eric Wong 
> ---
>  git-svn.perl | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index 76a75d0b3d..a6b6c3e40c 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -1200,6 +1200,11 @@ sub cmd_branch {
>   $ctx->copy($src, $rev, $dst)
>   unless $_dry_run;
>  
> + # Release resources held by ctx before creating another SVN::Ra
> + # so destruction is orderly.  This seems necessary with SVN 1.9.5
> + # to avoid segfaults.
> + $ctx = undef;
> +
>   $gs->fetch_all;
>  }


[PATCH] git-svn: control destruction order to avoid segfault

2018-01-29 Thread Eric Wong
Todd Zullinger  wrote:
> I'm running the tests with and without your patch as well.
> So far I've run t9128 300 times with the patch and no
> failures.  Without it, it's failed 3 times in only a few
> dozen runs.  That's promising.

Thanks for confirming it works on other systems.
Pull request and patch below:

The following changes since commit 5be1f00a9a701532232f57958efab4be8c959a29:

  First batch after 2.16 (2018-01-23 13:21:10 -0800)

are available in the Git repository at:

  git://bogomips.org/git-svn.git svn-branch-segfault

for you to fetch changes up to 2784b8d68faca823489949cbc69ead2f296cfc07:

  git-svn: control destruction order to avoid segfault (2018-01-29 23:12:00 
+)


Eric Wong (1):
  git-svn: control destruction order to avoid segfault

 git-svn.perl | 5 +
 1 file changed, 5 insertions(+)

-8<-
Subject: [PATCH] git-svn: control destruction order to avoid segfault

It seems necessary to control destruction ordering to avoid a
segfault with SVN 1.9.5 when using "git svn branch".
I've also reported the problem against libsvn-perl to Debian
[Bug #888791], but releasing the SVN::Client instance can be
beneficial anyways to save memory.

ref: https://bugs.debian.org/888791
Tested-by: Todd Zullinger 
Reported-by: brian m. carlson 
Signed-off-by: Eric Wong 
---
 git-svn.perl | 5 +
 1 file changed, 5 insertions(+)

diff --git a/git-svn.perl b/git-svn.perl
index 76a75d0b3d..a6b6c3e40c 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -1200,6 +1200,11 @@ sub cmd_branch {
$ctx->copy($src, $rev, $dst)
unless $_dry_run;
 
+   # Release resources held by ctx before creating another SVN::Ra
+   # so destruction is orderly.  This seems necessary with SVN 1.9.5
+   # to avoid segfaults.
+   $ctx = undef;
+
$gs->fetch_all;
 }
 
-- 
EW


Re: [PATCH] mailinfo: avoid segfault when can't open files

2018-01-24 Thread Jeff King
On Wed, Jan 24, 2018 at 01:51:31PM -0300, Juan F. Codagnone wrote:

> > As for the patch itself, it looks correct but I saw two style nits:
> 
> Thanks for the detailed review! I'm sorry about the style nits. I
> focused on the tabs and braces. Next time I will take additional
> attention.

No problem, and thank you for fixing the bug (also, I think this is your
first patch, so welcome to git. :) ).

> I'll be resubmitting  the patch taking into account your remarks.

It looks good to me.

-Peff


[PATCH v1] mailinfo: avoid segfault when can't open files

2018-01-24 Thread Juan F. Codagnone
If  or  files can't be opened, then mailinfo() returns an
error before it even initializes mi->p_hdr_data or mi->s_hdr_data.
When cmd_mailinfo() then calls clear_mailinfo(), we dereference the
NULL pointers trying to free their contents.

Signed-off-by: Juan F. Codagnone 
Reviewed-by: Jeff King 
---
 mailinfo.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index a89db22ab..d04142ccc 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1167,11 +1167,13 @@ void clear_mailinfo(struct mailinfo *mi)
strbuf_release(&mi->inbody_header_accum);
free(mi->message_id);
 
-   for (i = 0; mi->p_hdr_data[i]; i++)
-   strbuf_release(mi->p_hdr_data[i]);
+   if (mi->p_hdr_data)
+   for (i = 0; mi->p_hdr_data[i]; i++)
+   strbuf_release(mi->p_hdr_data[i]);
free(mi->p_hdr_data);
-   for (i = 0; mi->s_hdr_data[i]; i++)
-   strbuf_release(mi->s_hdr_data[i]);
+   if (mi->s_hdr_data)
+   for (i = 0; mi->s_hdr_data[i]; i++)
+   strbuf_release(mi->s_hdr_data[i]);
free(mi->s_hdr_data);
 
while (mi->content < mi->content_top) {
-- 
2.14.3



Re: [PATCH] mailinfo: avoid segfault when can't open files

2018-01-24 Thread Juan F. Codagnone
On Wed, Jan 24, 2018 at 1:02 AM, Jeff King  wrote:
>
> On Tue, Jan 23, 2018 at 11:54:17PM -0300, Juan F. Codagnone wrote:
...
>
>
> But given the lack of callers, it may not be worth the effort. So I'm OK
> with this solution. It may be worth giving an abbreviated version of the
> above explanation in the commit message. Perhaps:
>
>   If  or  files can't be opened, then mailinfo() returns an
>   error before it even initializes mi->p_hdr_data or mi->s_hdr_data.
>   When cmd_mailinfo() then calls clear_mailinfo(), we dereference the
>   NULL pointers trying to free their contents.
>
> As for the patch itself, it looks correct but I saw two style nits:


Thanks for the detailed review! I'm sorry about the style nits. I
focused on the tabs and braces. Next time I will take additional
attention.
I'll be resubmitting  the patch taking into account your remarks.


Re: [PATCH] mailinfo: avoid segfault when can't open files

2018-01-23 Thread Jeff King
On Tue, Jan 23, 2018 at 11:54:17PM -0300, Juan F. Codagnone wrote:

> If  or  files can't be opened, clear_mailinfo crash as
> it follows NULL pointers.
> 
> Can be reproduced using `git mailinfo . .`

Thanks for finding this.

Looking at the offending code and your solution, it looks like:

  1. mailinfo() sometimes allocates mi->p_hdr_data and mi->s_hdr_data
 and sometimes not, depending on how far we get before seeing an
 error. The caller cannot know whether we did so or not based on
 seeing an error return, but most call clear_mailinfo() if it wants
 to avoid a leak.

  2. There are two callers of mailinfo(). git-am simply dies on an
 error, and so is unaffected. But git-mailinfo unconditionally calls
 clear_mailinfo() before returning, regardless of the return code.

  3. When we get to clear_mailinfo(), the arrays are either populated or
 are NULL. We know they're initialized to NULL because of
 setup_mailinfo(), which zeroes the whole struct.

So I think your fix does the right thing. I do think this is a pretty
awkward interface, and it would be less error-prone if either[1]:

  a. we bumped the allocation of these arrays up in mailinfo() so
 that they were simply always initialized. This fixes the bug in
 clear_mailinfo(), but also any other function which looks at the
 mailinfo struct (though I don't think there are any such cases).

  b. we had mailinfo() clean up after itself on error, so that it was
 always in a de-initialized state.

But given the lack of callers, it may not be worth the effort. So I'm OK
with this solution. It may be worth giving an abbreviated version of the
above explanation in the commit message. Perhaps:

  If  or  files can't be opened, then mailinfo() returns an
  error before it even initializes mi->p_hdr_data or mi->s_hdr_data.
  When cmd_mailinfo() then calls clear_mailinfo(), we dereference the
  NULL pointers trying to free their contents.

As for the patch itself, it looks correct but I saw two style nits:

> diff --git a/mailinfo.c b/mailinfo.c
> index a89db22ab..035abbbf5 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -1167,11 +1167,13 @@ void clear_mailinfo(struct mailinfo *mi)
>   strbuf_release(&mi->inbody_header_accum);
>   free(mi->message_id);
>  
> - for (i = 0; mi->p_hdr_data[i]; i++)
> - strbuf_release(mi->p_hdr_data[i]);
> + if(mi->p_hdr_data != NULL)
> + for (i = 0; mi->p_hdr_data[i]; i++)
> + strbuf_release(mi->p_hdr_data[i]);

We usually say "if (" with an extra space. And we generally just check
pointers for their truth value. So:

  if (mi->p_hdr_data) {
for (i = 0; ...)

-Peff

[1] Actually, it seems a little funny that we use xcalloc() here at all,
since the size is determined by a compile-time constant. Why not
just put an array directly into the struct and let it get zeroed
with the rest of the struct? That sidesteps the question of whether
we need to clear() after an error return, but it would fix this bug. :)


[PATCH] mailinfo: avoid segfault when can't open files

2018-01-23 Thread Juan F. Codagnone
If  or  files can't be opened, clear_mailinfo crash as
it follows NULL pointers.

Can be reproduced using `git mailinfo . .`

Signed-off-by: Juan F. Codagnone 
---
 mailinfo.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index a89db22ab..035abbbf5 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1167,11 +1167,13 @@ void clear_mailinfo(struct mailinfo *mi)
strbuf_release(&mi->inbody_header_accum);
free(mi->message_id);
 
-   for (i = 0; mi->p_hdr_data[i]; i++)
-   strbuf_release(mi->p_hdr_data[i]);
+   if(mi->p_hdr_data != NULL)
+   for (i = 0; mi->p_hdr_data[i]; i++)
+   strbuf_release(mi->p_hdr_data[i]);
free(mi->p_hdr_data);
-   for (i = 0; mi->s_hdr_data[i]; i++)
-   strbuf_release(mi->s_hdr_data[i]);
+   if(mi->s_hdr_data != NULL)
+   for (i = 0; mi->s_hdr_data[i]; i++)
+   strbuf_release(mi->s_hdr_data[i]);
free(mi->s_hdr_data);
 
while (mi->content < mi->content_top) {
-- 
2.14.3



[PATCH] bisect: fix a regression causing a segfault

2018-01-03 Thread Ævar Arnfjörð Bjarmason
In 7c117184d7 ("bisect: fix off-by-one error in
`best_bisection_sorted()`", 2017-11-05) the more careful logic dealing
with freeing p->next in 50e62a8e70 ("rev-list: implement
--bisect-all", 2007-10-22) was removed.

Restore the more careful check to avoid segfaulting. Ideally this
would come with a test case, but we don't have steps to reproduce
this, only a backtrace from gdb pointing to this being the issue.

Reported-by: Yasushi SHOJI 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 bisect.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/bisect.c b/bisect.c
index 0fca17c02b..2f3008b078 100644
--- a/bisect.c
+++ b/bisect.c
@@ -229,8 +229,10 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
if (i < cnt - 1)
p = p->next;
}
-   free_commit_list(p->next);
-   p->next = NULL;
+   if (p) {
+   free_commit_list(p->next);
+   p->next = NULL;
+   }
strbuf_release(&buf);
free(array);
return list;
-- 
2.15.1.424.g9478a66081



Re: Segfault

2018-01-02 Thread Andrew Tsykhonya
I made:
$git stash
$git checkout 
$git stash pop - crash here. changes were applied, but index.lock was
not removed.

The issue appeared only once, I haven't seen it before and cannot
reproduce it now.

Thanks, updated git to 2.15.1.

2017-12-31 10:59 GMT+02:00 Eric Sunshine :
> On Fri, Dec 29, 2017 at 4:04 AM, Andrew Tsykhonya
>  wrote:
>> git stash pop resulted in crash:
>> /usr/local/Cellar/git/2.10.2/libexec/git-core/git-stash: line 470:
>> 14946 Segmentation fault: 11 git merge-recursive $b_tree -- $c_tree
>> $w_tree
>> although, the changes have been applied successfully.
>
> Thanks for the problem report. Can you come up with a reproduction
> recipe in order to help debug the problem?
>
> Also, what happens if you update to a newer version of Git? You appear
> to be running 2.10.2, whereas the latest version in Homebrew seems to
> be 2.15.1.


Re: Segfault

2017-12-31 Thread Eric Sunshine
On Fri, Dec 29, 2017 at 4:04 AM, Andrew Tsykhonya
 wrote:
> git stash pop resulted in crash:
> /usr/local/Cellar/git/2.10.2/libexec/git-core/git-stash: line 470:
> 14946 Segmentation fault: 11 git merge-recursive $b_tree -- $c_tree
> $w_tree
> although, the changes have been applied successfully.

Thanks for the problem report. Can you come up with a reproduction
recipe in order to help debug the problem?

Also, what happens if you update to a newer version of Git? You appear
to be running 2.10.2, whereas the latest version in Homebrew seems to
be 2.15.1.


Segfault

2017-12-29 Thread Andrew Tsykhonya
git stash pop resulted in crash:
/usr/local/Cellar/git/2.10.2/libexec/git-core/git-stash: line 470:
14946 Segmentation fault: 11 git merge-recursive $b_tree -- $c_tree
$w_tree

although, the changes have been applied successfully.


[PATCH v2 2/2] grep: fix segfault under -P + PCRE2 <=10.30 + (*NO_JIT)

2017-11-23 Thread Ævar Arnfjörð Bjarmason
Fix a bug in the compilation of PCRE2 patterns under JIT (the most
common runtime configuration). Any pattern with a (*NO_JIT) verb would
segfault in any currently released PCRE2 version:

$ git grep -P '(*NO_JIT)hi.*there'
Segmentation fault

That this segfaulted was a bug in PCRE2 itself, after reporting it[1]
on pcre-dev it's been fixed in a yet-to-be-released version of
PCRE (presumably released first as 10.31). Now it'll die with:

$ git grep -P '(*NO_JIT)hi.*there'
fatal: pcre2_jit_match failed with error code -45: bad JIT option

But the cause of the bug is in our own code dating back to my
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01).

As explained at more length in the comment being added here, it isn't
sufficient to just check pcre2_config() to see whether the JIT should
be used, pcre2_pattern_info() also has to be asked.

This is something I discovered myself when fiddling around with PCRE2
verbs in patterns passed to git. I don't expect that any user of git
has encountered this given the obscurity of passing PCRE2 verbs
through to the library, along with the relative obscurity of (*NO_JIT)
itself.

1. "How am I supposed to use PCRE2 JIT in the face of (*NO_JIT) ?"
   ( &
https://lists.exim.org/lurker/thread/20171123.101502.7f0d38ca.en.html)
   on the pcre-dev mailing list

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Incorporates feedback from Eric & Simon. Thanks both. I also amended
the commit message / comment to note that this was also a bug in PCRE2
upstream, which has been fixed after I reported it.

 grep.c  | 26 ++
 t/t7810-grep.sh |  6 ++
 2 files changed, 32 insertions(+)

diff --git a/grep.c b/grep.c
index d0b9b6cdfa..e8ae0b5d8f 100644
--- a/grep.c
+++ b/grep.c
@@ -477,6 +477,8 @@ static void compile_pcre2_pattern(struct grep_pat *p, const 
struct grep_opt *opt
int options = PCRE2_MULTILINE;
const uint8_t *character_tables = NULL;
int jitret;
+   int patinforet;
+   size_t jitsizearg;
 
assert(opt->pcre2);
 
@@ -511,6 +513,30 @@ static void compile_pcre2_pattern(struct grep_pat *p, 
const struct grep_opt *opt
jitret = pcre2_jit_compile(p->pcre2_pattern, 
PCRE2_JIT_COMPLETE);
if (jitret)
die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", 
p->pattern, jitret);
+
+   /*
+* The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
+* tells us whether the library itself supports JIT,
+* but to see whether we're going to be actually using
+* JIT we need to extract PCRE2_INFO_JITSIZE from the
+* pattern *after* we do pcre2_jit_compile() above.
+*
+* This is because if the pattern contains the
+* (*NO_JIT) verb (see pcre2syntax(3))
+* pcre2_jit_compile() will exit early with 0. If we
+* then proceed to call pcre2_jit_match() further down
+* the line instead of pcre2_match() we'll either
+* segfault (pre PCRE 10.31) or run into a fatal error
+* (post PCRE2 10.31)
+*/
+   patinforet = pcre2_pattern_info(p->pcre2_pattern, 
PCRE2_INFO_JITSIZE, &jitsizearg);
+   if (patinforet)
+   BUG("pcre2_pattern_info() failed: %d", patinforet);
+   if (jitsizearg == 0) {
+   p->pcre2_jit_on = 0;
+   return;
+   }
+
p->pcre2_jit_stack = pcre2_jit_stack_create(1, 1024 * 1024, 
NULL);
if (!p->pcre2_jit_stack)
die("Couldn't allocate PCRE2 JIT stack");
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 2a6679c2f5..c8ff50cc30 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1110,6 +1110,12 @@ test_expect_success PCRE 'grep -P pattern' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE2 "grep -P with (*NO_JIT) doesn't error out" '
+   git grep -P "(*NO_JIT)\p{Ps}.*?\p{Pe}" hello.c >actual &&
+   test_cmp expected actual
+
+'
+
 test_expect_success !PCRE 'grep -P pattern errors without PCRE' '
test_must_fail git grep -P "foo.*bar"
 '
-- 
2.15.0.403.gc27cc4dac6



Re: [PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-23 Thread Ævar Arnfjörð Bjarmason

On Wed, Nov 22 2017, Eric Sunshine jotted:

> On Wed, Nov 22, 2017 at 8:36 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Fix a bug in the compilation of PCRE2 patterns under JIT (the most
>> common runtime configuration), any pattern with a (*NO_JIT) verb would
>> segfault. This bug dates back to my 94da9193a6 ("grep: add support for
>> PCRE v2", 2017-06-01):
>>
>> $ git grep -P '(*NO_JIT)hi.*there'
>> Segmentation fault
>>
>> As explained ad more length in the comment being added here it isn't
>
> s/ad/at/
> s/here/here,/

Thanks. I'll let this sit for a bit and submit a v2 soon. There's also
an upstream fix in pcre2 to prevent the segfault that'll be in future
versions & I'm going to note in the amended commit message.

>> sufficient to just check pcre2_config() to see whether the JIT should
>> be used, pcre2_pattern_info() also has to be asked.
>>
>> This is something I discovered myself when fiddling around with PCRE2
>> verbs in patterns passed to git. I don't expect that any user of git
>> has encountered this given the obscurity of passing PCRE2 verbs
>> through to the library, along with the relative obscurity of (*NO_JIT)
>> itself.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 


Re: [PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-23 Thread Simon Ruderich
On Wed, Nov 22, 2017 at 01:36:30PM +, Ævar Arnfjörð Bjarmason wrote:
> +  *
> +  * This is because if the pattern contains the
> +  * (*NO_JIT) verb (see pcre2syntax(3))
> +  * pcre2_jit_compile() will exit early with 0. If we
> +  * then proceed to call pcre2_jit_match() further down
> +  * the line instead of pcre2_match() we'll segfault.
> +  */
> + patinforet = pcre2_pattern_info(p->pcre2_pattern, 
> PCRE2_INFO_JITSIZE, &jitsizearg);
> + if (patinforet)
> + die("BUG: The patinforet variable should be 0 after the 
> pcre2_pattern_info() call, not %d",
> + patinforet);

I think BUG() should be used here, and maybe shorten the error
message:

BUG("pcre2_pattern_info() failed: %d", patinforet);

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


Re: [PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-22 Thread Eric Sunshine
On Wed, Nov 22, 2017 at 8:36 AM, Ævar Arnfjörð Bjarmason
 wrote:
> Fix a bug in the compilation of PCRE2 patterns under JIT (the most
> common runtime configuration), any pattern with a (*NO_JIT) verb would
> segfault. This bug dates back to my 94da9193a6 ("grep: add support for
> PCRE v2", 2017-06-01):
>
> $ git grep -P '(*NO_JIT)hi.*there'
> Segmentation fault
>
> As explained ad more length in the comment being added here it isn't

s/ad/at/
s/here/here,/

> sufficient to just check pcre2_config() to see whether the JIT should
> be used, pcre2_pattern_info() also has to be asked.
>
> This is something I discovered myself when fiddling around with PCRE2
> verbs in patterns passed to git. I don't expect that any user of git
> has encountered this given the obscurity of passing PCRE2 verbs
> through to the library, along with the relative obscurity of (*NO_JIT)
> itself.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 


[PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-22 Thread Ævar Arnfjörð Bjarmason
Fix a bug in the compilation of PCRE2 patterns under JIT (the most
common runtime configuration), any pattern with a (*NO_JIT) verb would
segfault. This bug dates back to my 94da9193a6 ("grep: add support for
PCRE v2", 2017-06-01):

$ git grep -P '(*NO_JIT)hi.*there'
Segmentation fault

As explained ad more length in the comment being added here it isn't
sufficient to just check pcre2_config() to see whether the JIT should
be used, pcre2_pattern_info() also has to be asked.

This is something I discovered myself when fiddling around with PCRE2
verbs in patterns passed to git. I don't expect that any user of git
has encountered this given the obscurity of passing PCRE2 verbs
through to the library, along with the relative obscurity of (*NO_JIT)
itself.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c  | 25 +
 t/t7810-grep.sh |  6 ++
 2 files changed, 31 insertions(+)

diff --git a/grep.c b/grep.c
index d0b9b6cdfa..f3139e867c 100644
--- a/grep.c
+++ b/grep.c
@@ -477,6 +477,8 @@ static void compile_pcre2_pattern(struct grep_pat *p, const 
struct grep_opt *opt
int options = PCRE2_MULTILINE;
const uint8_t *character_tables = NULL;
int jitret;
+   int patinforet;
+   size_t jitsizearg;
 
assert(opt->pcre2);
 
@@ -511,6 +513,29 @@ static void compile_pcre2_pattern(struct grep_pat *p, 
const struct grep_opt *opt
jitret = pcre2_jit_compile(p->pcre2_pattern, 
PCRE2_JIT_COMPLETE);
if (jitret)
die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", 
p->pattern, jitret);
+
+   /*
+* The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
+* tells us whether the library itself supports JIT,
+* but to see whether we're going to be actually using
+* JIT we need to extract PCRE2_INFO_JITSIZE from the
+* pattern *after* we do pcre2_jit_compile() above.
+*
+* This is because if the pattern contains the
+* (*NO_JIT) verb (see pcre2syntax(3))
+* pcre2_jit_compile() will exit early with 0. If we
+* then proceed to call pcre2_jit_match() further down
+* the line instead of pcre2_match() we'll segfault.
+*/
+   patinforet = pcre2_pattern_info(p->pcre2_pattern, 
PCRE2_INFO_JITSIZE, &jitsizearg);
+   if (patinforet)
+   die("BUG: The patinforet variable should be 0 after the 
pcre2_pattern_info() call, not %d",
+   patinforet);
+   if (jitsizearg == 0) {
+   p->pcre2_jit_on = 0;
+   return;
+   }
+
p->pcre2_jit_stack = pcre2_jit_stack_create(1, 1024 * 1024, 
NULL);
if (!p->pcre2_jit_stack)
die("Couldn't allocate PCRE2 JIT stack");
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 2a6679c2f5..c8ff50cc30 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1110,6 +1110,12 @@ test_expect_success PCRE 'grep -P pattern' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE2 "grep -P with (*NO_JIT) doesn't error out" '
+   git grep -P "(*NO_JIT)\p{Ps}.*?\p{Pe}" hello.c >actual &&
+   test_cmp expected actual
+
+'
+
 test_expect_success !PCRE 'grep -P pattern errors without PCRE' '
test_must_fail git grep -P "foo.*bar"
 '
-- 
2.15.0.403.gc27cc4dac6



[PATCH v5 1/6] environment.c: fix potential segfault by get_git_common_dir()

2017-04-24 Thread Nguyễn Thái Ngọc Duy
setup_git_env() must be called before this function to initialize
git_common_dir so that it returns a non NULL string. And it must return
a non NULL string or segfault can happen because all callers expect so.

It does not do so explicitly though and depends on get_git_dir() being
called first (which will guarantee setup_git_env()). Avoid this
dependency and call setup_git_env() by itself.

test-ref-store.c will hit this problem because it's very lightweight,
just enough initialization to exercise refs code, and get_git_dir() will
never be called until get_worktrees() is, which uses get_git_common_dir
and hits a segfault.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 environment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/environment.c b/environment.c
index 42dc3106d2..2986ee7200 100644
--- a/environment.c
+++ b/environment.c
@@ -214,6 +214,8 @@ const char *get_git_dir(void)
 
 const char *get_git_common_dir(void)
 {
+   if (!git_dir)
+   setup_git_env();
return git_common_dir;
 }
 
-- 
2.11.0.157.gd943d85



Re: [PATCH] pathspec: fix segfault in clear_pathspec

2017-04-10 Thread Duy Nguyen
On Sat, Apr 8, 2017 at 2:29 AM, Brandon Williams  wrote:
> In 'clear_pathspec()' the incorrect index parameter is used to bound an
> inner-loop which is used to free a 'struct attr_match' value field.
> Using the incorrect index parameter (in addition to being incorrect)
> occasionally causes segmentation faults when attempting to free an
> invalid pointer.  Fix this by using the correct index parameter 'i'.
>
> Signed-off-by: Brandon Williams 
> ---
>  pathspec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/pathspec.c b/pathspec.c
> index 303efda83..69ef86b85 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -724,7 +724,7 @@ void clear_pathspec(struct pathspec *pathspec)
> free(pathspec->items[i].match);
> free(pathspec->items[i].original);
>
> -   for (j = 0; j < pathspec->items[j].attr_match_nr; j++)
> +   for (j = 0; j < pathspec->items[i].attr_match_nr; j++)

Ouch. Perhaps this is a good time to rename 'j' to something better?
attr_idx or attr_index, maybe.

> free(pathspec->items[i].attr_match[j].value);
> free(pathspec->items[i].attr_match);
>
> --
> 2.12.2.715.g7642488e1d-goog
>



-- 
Duy


Re: [PATCH] pathspec: fix segfault in clear_pathspec

2017-04-07 Thread Stefan Beller
On Fri, Apr 7, 2017 at 12:29 PM, Brandon Williams  wrote:
> In 'clear_pathspec()' the incorrect index parameter is used to bound an
> inner-loop which is used to free a 'struct attr_match' value field.
> Using the incorrect index parameter (in addition to being incorrect)
> occasionally causes segmentation faults when attempting to free an
> invalid pointer.  Fix this by using the correct index parameter 'i'.

This was introduced via b0db704652 (pathspec: allow querying for
attributes, 2017-03-13), and it seems there was no other topics
in flight since then or at the time.  So the review failed to spot it
and not some other weird root cause.

Thanks,
Stefan


[PATCH] pathspec: fix segfault in clear_pathspec

2017-04-07 Thread Brandon Williams
In 'clear_pathspec()' the incorrect index parameter is used to bound an
inner-loop which is used to free a 'struct attr_match' value field.
Using the incorrect index parameter (in addition to being incorrect)
occasionally causes segmentation faults when attempting to free an
invalid pointer.  Fix this by using the correct index parameter 'i'.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pathspec.c b/pathspec.c
index 303efda83..69ef86b85 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -724,7 +724,7 @@ void clear_pathspec(struct pathspec *pathspec)
free(pathspec->items[i].match);
free(pathspec->items[i].original);
 
-   for (j = 0; j < pathspec->items[j].attr_match_nr; j++)
+   for (j = 0; j < pathspec->items[i].attr_match_nr; j++)
free(pathspec->items[i].attr_match[j].value);
free(pathspec->items[i].attr_match);
 
-- 
2.12.2.715.g7642488e1d-goog



RE: Segfault on git for Windows

2017-04-05 Thread Rémi Galan Alfonso
> Hi Rémi,
> 
> On Wed, 5 Apr 2017, Rémi Galan Alfonso wrote:
> 
> > At $DAYWORK, the code is versionned under SVN. Since I haven't used SVN
> > before and try to have a clean and bisectable history, I installed git
> > with the intent to manage my code locally before pushing to SVN when I'm
> > satisfied (I haven't tries git-svn because I have never used it and
> > would like to avoid screwing up the SVN repo by some mistake).
> > 
> > So first to setup the local repo, I wanted to add all of the code files.
> > So I first ran at the root of the repo:
> >   $ git add ./**.cpp
> > Which is quite a big amount of files (partly because of external
> > dependencies which would have been smart to exclude, but it's done).
> >   $ find -type f -name "**.cpp" | wc -l
> >   8676
> > This command worked (return status is 0 and no error message).
> > 
> > However following `git add **.hpp` and `git status` segfault with no
> > additional message:
> >   $ git status
> >   Segmentation fault
> 
> I think this is the problem identified in
> https://github.com/git-for-windows/git/issues/.
> 
> To verify, you could try the snapshot (with the proposed fix) hosted here:
> http://wingit.blob.core.windows.net/files/index.html

Thanks, it fixed it. Sorry for the noise.
Rémi


Re: Segfault on git for Windows

2017-04-05 Thread Johannes Schindelin
Hi Rémi,

On Wed, 5 Apr 2017, Rémi Galan Alfonso wrote:

> At $DAYWORK, the code is versionned under SVN. Since I haven't used SVN
> before and try to have a clean and bisectable history, I installed git
> with the intent to manage my code locally before pushing to SVN when I'm
> satisfied (I haven't tries git-svn because I have never used it and
> would like to avoid screwing up the SVN repo by some mistake).
> 
> So first to setup the local repo, I wanted to add all of the code files.
> So I first ran at the root of the repo:
>   $ git add ./**.cpp
> Which is quite a big amount of files (partly because of external
> dependencies which would have been smart to exclude, but it's done).
>   $ find -type f -name "**.cpp" | wc -l
>   8676
> This command worked (return status is 0 and no error message).
> 
> However following `git add **.hpp` and `git status` segfault with no
> additional message:
>   $ git status
>   Segmentation fault

I think this is the problem identified in
https://github.com/git-for-windows/git/issues/.

To verify, you could try the snapshot (with the proposed fix) hosted here:
http://wingit.blob.core.windows.net/files/index.html

Ciao,
Johannes

Segfault on git for Windows

2017-04-05 Thread Rémi Galan Alfonso
Hello,

I am unsure whether it's Windows only or not, so I'm sending here and
CCing Dscho.

At $DAYWORK, the code is versionned under SVN. Since I haven't used
SVN before and try to have a clean and bisectable history, I installed
git with the intent to manage my code locally before pushing to SVN
when I'm satisfied (I haven't tries git-svn because I have never used
it and would like to avoid screwing up the SVN repo by some mistake).

So first to setup the local repo, I wanted to add all of the code
files. So I first ran at the root of the repo:
  $ git add ./**.cpp
Which is quite a big amount of files (partly because of external
dependencies which would have been smart to exclude, but it's
done).
  $ find -type f -name "**.cpp" | wc -l
  8676
This command worked (return status is 0 and no error message).

However following `git add **.hpp` and `git status` segfault with no
additional message:
  $ git status
  Segmentation fault
I didn't try to test other commands (`git diff --cached` works
though).

$ git --version
git version 2.12.2.windows.1

This seems to be reproducible.

Sadly I won't be able to share the repo where this happens.

FWIW:
  $ git fsck
  notice: HEAD points to an unborn branch (master)
  Checking object directories: 100% (256/256), done.
  notice: No default references

Thanks,
Rémi Galan Alfonso


[PATCH v4 1/5] environment.c: fix potential segfault by get_git_common_dir()

2017-04-04 Thread Nguyễn Thái Ngọc Duy
setup_git_env() must be called before this function to initialize
git_common_dir so that it returns a non NULL string. And it must return
a non NULL string or segfault can happen because all callers expect so.

It does not do so explicitly though and depends on get_git_dir() being
called first (which will guarantee setup_git_env()). Avoid this
dependency and call setup_git_env() by itself.

test-ref-store.c will hit this problem because it's very lightweight,
just enough initialization to exercise refs code, and get_git_dir() will
never be called until get_worktrees() is, which uses get_git_common_dir
and hits a segfault.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 environment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/environment.c b/environment.c
index 42dc3106d2..2986ee7200 100644
--- a/environment.c
+++ b/environment.c
@@ -214,6 +214,8 @@ const char *get_git_dir(void)
 
 const char *get_git_common_dir(void)
 {
+   if (!git_dir)
+   setup_git_env();
return git_common_dir;
 }
 
-- 
2.11.0.157.gd943d85



Re: [PATCHv2] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-21 Thread Johannes Schindelin
Hi Gábor,

On Sat, 18 Mar 2017, SZEDER Gábor wrote:

> 'git {log,diff,...} -S<...> --pickaxe-regex' can segfault as a result of
> out-of-bounds memory reads.
> 
> diffcore-pickaxe.c:contains() looks for all matches of the given regex
> in a buffer in a loop, advancing the buffer pointer to the end of the
> last match in each iteration.  When we switched to REG_STARTEND in
> b7d36ffca (regex: use regexec_buf(), 2016-09-21), we started passing
> the size of that buffer to the regexp engine, too.  Unfortunately,
> this buffer size is never updated on subsequent iterations, and as the
> buffer pointer advances on each iteration, this "bufptr+bufsize"
> points past the end of the buffer.  This results in segmentation
> fault, if that memory can't be accessed.  In case of 'git log' it can
> also result in erroneously listed commits, if the memory past the end
> of buffer is accessible and happens to contain data matching the
> regex.
> 
> Reduce the buffer size on each iteration as the buffer pointer is
> advanced, thus maintaining the correct end of buffer location.
> Furthermore, make sure that the buffer pointer is not dereferenced in
> the control flow statements when we already reached the end of the
> buffer.
> 
> The new test is flaky, I've never seen it fail on my Linux box even
> without the fix, but this is expected according to db5dfa3 (regex:
> -G feeds a non NUL-terminated string to regexec() and fails,
> 2016-09-21).  However, it did fail on Travis CI with the first (and
> incomplete) version of the fix, and based on that commit message I
> would expect the new test without the fix to fail most of the time on
> Windows.

Thank you for catching and fixing this. On Windows, I indeed get a
segmentation fault for the new test case without the patched code, and the
patch indeed fixes this.

ACK,
Dscho

Re: [PATCH v3 1/4] environment.c: fix potential segfault by get_git_common_dir()

2017-03-19 Thread Duy Nguyen
On Sun, Mar 19, 2017 at 12:54 AM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> setup_git_env() must be called before this function to initialize
>> git_common_dir so that it returns a non NULL string. And it must return
>> a non NULL string or segfault can happen because all callers expect so.
>>
>> Normally if somebody has called get_git_dir(), or set_git_dir() then
>> setup_git_env() is already called. But if you do setup_git_directory()
>> at top dir (which skips set_git_dir) and never call get_git_dir, you'll
>> get NULL here.
>
> Hmph, and the solution for the problem not being "so let's make sure
> get_git_dir() is called even when the command is started at the top
> directory" is because...?

-EHARDTOPARSE. There's a hidden dependency between get_git_dir() and
get_git_common_dir() which is not good. If we lazily call
set_git_env(), make sure we do it lazily but consistently at all
relevant function calls (i.e. including get_git_common_dir).

Alternatively (I was thinking of this but didn't really follow up
because this was side issue) we should make sure setup_git_env() is
always called at the end of setup_git_dir...() and remove the laziness
in get_git_dir(). This may be more in line of recent attempts to catch
repo access without calling setup_git_directory..() first. But sadly I
haven't read Jeff's series, so I can't say whether it's true.
-- 
Duy


Re: [PATCH] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-18 Thread Junio C Hamano
Junio C Hamano  writes:

> Interestingly, the new test fails (with the patch) under prove but
> not when run from the shell (i.e. "cd t && sh t4062-diff-pickaxe.sh").

Sorry, false alarm.


Re: [PATCH] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-18 Thread Junio C Hamano
Interestingly, the new test fails (with the patch) under prove but
not when run from the shell (i.e. "cd t && sh t4062-diff-pickaxe.sh").



[PATCHv2] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-18 Thread SZEDER Gábor
'git {log,diff,...} -S<...> --pickaxe-regex' can segfault as a result
of out-of-bounds memory reads.

diffcore-pickaxe.c:contains() looks for all matches of the given regex
in a buffer in a loop, advancing the buffer pointer to the end of the
last match in each iteration.  When we switched to REG_STARTEND in
b7d36ffca (regex: use regexec_buf(), 2016-09-21), we started passing
the size of that buffer to the regexp engine, too.  Unfortunately,
this buffer size is never updated on subsequent iterations, and as the
buffer pointer advances on each iteration, this "bufptr+bufsize"
points past the end of the buffer.  This results in segmentation
fault, if that memory can't be accessed.  In case of 'git log' it can
also result in erroneously listed commits, if the memory past the end
of buffer is accessible and happens to contain data matching the
regex.

Reduce the buffer size on each iteration as the buffer pointer is
advanced, thus maintaining the correct end of buffer location.
Furthermore, make sure that the buffer pointer is not dereferenced in
the control flow statements when we already reached the end of the
buffer.

The new test is flaky, I've never seen it fail on my Linux box even
without the fix, but this is expected according to db5dfa3 (regex:
-G feeds a non NUL-terminated string to regexec() and fails,
2016-09-21).  However, it did fail on Travis CI with the first (and
incomplete) version of the fix, and based on that commit message I
would expect the new test without the fix to fail most of the time on
Windows.

Signed-off-by: SZEDER Gábor 
---

Changes since v1:

 diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
 index 03f84b714..341529b5a 100644
 --- a/diffcore-pickaxe.c
 +++ b/diffcore-pickaxe.c
 @@ -81,12 +81,12 @@ static unsigned int contains(mmfile_t *mf, regex_t 
*regexp, kwset_t kws)
regmatch_t regmatch;
int flags = 0;
  
 -  while (*data &&
 +  while (sz && *data &&
   !regexec_buf(regexp, data, sz, 1, ®match, flags)) {
flags |= REG_NOTBOL;
data += regmatch.rm_eo;
sz -= regmatch.rm_eo;
 -  if (*data && regmatch.rm_so == regmatch.rm_eo) {
 +  if (sz && *data && regmatch.rm_so == regmatch.rm_eo) {
data++;
sz--;
}

 diffcore-pickaxe.c  | 7 +--
 t/t4062-diff-pickaxe.sh | 5 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 9795ca1c1..341529b5a 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -81,12 +81,15 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, 
kwset_t kws)
regmatch_t regmatch;
int flags = 0;
 
-   while (*data &&
+   while (sz && *data &&
   !regexec_buf(regexp, data, sz, 1, ®match, flags)) {
flags |= REG_NOTBOL;
data += regmatch.rm_eo;
-   if (*data && regmatch.rm_so == regmatch.rm_eo)
+   sz -= regmatch.rm_eo;
+   if (sz && *data && regmatch.rm_so == regmatch.rm_eo) {
data++;
+   sz--;
+   }
cnt++;
}
 
diff --git a/t/t4062-diff-pickaxe.sh b/t/t4062-diff-pickaxe.sh
index f0bf50bda..7c4903f49 100755
--- a/t/t4062-diff-pickaxe.sh
+++ b/t/t4062-diff-pickaxe.sh
@@ -19,4 +19,9 @@ test_expect_success '-G matches' '
test 4096-zeroes.txt = "$(cat out)"
 '
 
+test_expect_success '-S --pickaxe-regex' '
+   git diff --name-only -S0 --pickaxe-regex HEAD^ >out &&
+   verbose test 4096-zeroes.txt = "$(cat out)"
+'
+
 test_done
-- 
2.12.0.377.g15f6ffe90


Re: [PATCH v3 1/4] environment.c: fix potential segfault by get_git_common_dir()

2017-03-18 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> setup_git_env() must be called before this function to initialize
> git_common_dir so that it returns a non NULL string. And it must return
> a non NULL string or segfault can happen because all callers expect so.
>
> Normally if somebody has called get_git_dir(), or set_git_dir() then
> setup_git_env() is already called. But if you do setup_git_directory()
> at top dir (which skips set_git_dir) and never call get_git_dir, you'll
> get NULL here.

Hmph, and the solution for the problem not being "so let's make sure
get_git_dir() is called even when the command is started at the top
directory" is because...?

> test-ref-store.c will hit this problem because it's very lightweight,
> just enough initialization to exercise refs code, and get_git_dir() will
> never be called until get_worktrees() is, which uses get_git_common_dir().
> ---

Missing sign-off.



>  environment.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/environment.c b/environment.c
> index 42dc3106d2..2986ee7200 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -214,6 +214,8 @@ const char *get_git_dir(void)
>  
>  const char *get_git_common_dir(void)
>  {
> + if (!git_dir)
> + setup_git_env();
>   return git_common_dir;
>  }


Re: [PATCH] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-18 Thread SZEDER Gábor
On Sat, Mar 18, 2017 at 4:12 PM, SZEDER Gábor  wrote:
> Make sure that the buffer size is reduced on each iteration as the
> buffer pointer is advanced, thus maintaining the correct end of buffer
> location.
>
> The new test is flaky, I've never seen it fail on my Linux box, but
> this is expected according to db5dfa331 (regex: -G feeds a
> non NUL-terminated string to regexec() and fails, 2016-09-21).  And
> based on that commit message I would expect the new test without the
> fix to fail reliably on Windows.
>
> Signed-off-by: SZEDER Gábor 
> ---
>
>  diffcore-pickaxe.c  | 5 -
>  t/t4062-diff-pickaxe.sh | 5 +
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 9795ca1c1..03f84b714 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -85,8 +85,11 @@ static unsigned int contains(mmfile_t *mf, regex_t 
> *regexp, kwset_t kws)
>!regexec_buf(regexp, data, sz, 1, ®match, flags)) {
> flags |= REG_NOTBOL;
> data += regmatch.rm_eo;
> -   if (*data && regmatch.rm_so == regmatch.rm_eo)
> +   sz -= regmatch.rm_eo;
> +   if (*data && regmatch.rm_so == regmatch.rm_eo) {
> data++;
> +   sz--;
> +   }
> cnt++;
> }
>
> diff --git a/t/t4062-diff-pickaxe.sh b/t/t4062-diff-pickaxe.sh
> index f0bf50bda..7c4903f49 100755
> --- a/t/t4062-diff-pickaxe.sh
> +++ b/t/t4062-diff-pickaxe.sh
> @@ -19,4 +19,9 @@ test_expect_success '-G matches' '
> test 4096-zeroes.txt = "$(cat out)"
>  '
>
> +test_expect_success '-S --pickaxe-regex' '
> +   git diff --name-only -S0 --pickaxe-regex HEAD^ >out &&
> +   verbose test 4096-zeroes.txt = "$(cat out)"
> +'
> +
>  test_done

Hang on, this new test does fail because of a segfault _with_ the fix
on Travis 64bit Linux and OSX builds.

Oh, well.


[PATCH] pickaxe: fix segfault with '-S<...> --pickaxe-regex'

2017-03-18 Thread SZEDER Gábor
'git {log,diff,...} -S<...> --pickaxe-regex' can segfault as a result
of out-of-bounds memory reads.

diffcore-pickaxe.c:contains() looks for all matches of the given regex
in a buffer in a loop, advancing the buffer pointer to the end of the
last match in each iteration.  When we switched to REG_STARTEND in
b7d36ffca (regex: use regexec_buf(), 2016-09-21), we started passing
the size of that buffer to the regexp engine, too.  Unfortunately,
this buffer size is never updated on subsequent iterations, and as the
buffer pointer advances on each iteration, this "bufptr+bufsize"
points past the end of the buffer.  This results in segmentation
fault, if that memory can't be accessed.  In case of 'git log' it can
also result in erroneously listed commits, if the memory past the end
of buffer is accessible and happens to contain data matching the
regex.

Make sure that the buffer size is reduced on each iteration as the
buffer pointer is advanced, thus maintaining the correct end of buffer
location.

The new test is flaky, I've never seen it fail on my Linux box, but
this is expected according to db5dfa331 (regex: -G feeds a
non NUL-terminated string to regexec() and fails, 2016-09-21).  And
based on that commit message I would expect the new test without the
fix to fail reliably on Windows.

Signed-off-by: SZEDER Gábor 
---

 diffcore-pickaxe.c  | 5 -
 t/t4062-diff-pickaxe.sh | 5 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 9795ca1c1..03f84b714 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -85,8 +85,11 @@ static unsigned int contains(mmfile_t *mf, regex_t *regexp, 
kwset_t kws)
   !regexec_buf(regexp, data, sz, 1, ®match, flags)) {
flags |= REG_NOTBOL;
data += regmatch.rm_eo;
-   if (*data && regmatch.rm_so == regmatch.rm_eo)
+   sz -= regmatch.rm_eo;
+   if (*data && regmatch.rm_so == regmatch.rm_eo) {
data++;
+   sz--;
+   }
cnt++;
}
 
diff --git a/t/t4062-diff-pickaxe.sh b/t/t4062-diff-pickaxe.sh
index f0bf50bda..7c4903f49 100755
--- a/t/t4062-diff-pickaxe.sh
+++ b/t/t4062-diff-pickaxe.sh
@@ -19,4 +19,9 @@ test_expect_success '-G matches' '
test 4096-zeroes.txt = "$(cat out)"
 '
 
+test_expect_success '-S --pickaxe-regex' '
+   git diff --name-only -S0 --pickaxe-regex HEAD^ >out &&
+   verbose test 4096-zeroes.txt = "$(cat out)"
+'
+
 test_done
-- 
2.12.0.377.g15f6ffe90



[PATCH v3 1/4] environment.c: fix potential segfault by get_git_common_dir()

2017-03-18 Thread Nguyễn Thái Ngọc Duy
setup_git_env() must be called before this function to initialize
git_common_dir so that it returns a non NULL string. And it must return
a non NULL string or segfault can happen because all callers expect so.

Normally if somebody has called get_git_dir(), or set_git_dir() then
setup_git_env() is already called. But if you do setup_git_directory()
at top dir (which skips set_git_dir) and never call get_git_dir, you'll
get NULL here.

test-ref-store.c will hit this problem because it's very lightweight,
just enough initialization to exercise refs code, and get_git_dir() will
never be called until get_worktrees() is, which uses get_git_common_dir().
---
 environment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/environment.c b/environment.c
index 42dc3106d2..2986ee7200 100644
--- a/environment.c
+++ b/environment.c
@@ -214,6 +214,8 @@ const char *get_git_dir(void)
 
 const char *get_git_common_dir(void)
 {
+   if (!git_dir)
+   setup_git_env();
return git_common_dir;
 }
 
-- 
2.11.0.157.gd943d85



Re: [PATCH] run-command: fix segfault when cleaning forked async process

2017-03-17 Thread Brandon Williams
On 03/17, Jeff King wrote:
> Callers of the run-command API may mark a child as
> "clean_on_exit"; it gets added to a list and killed when the
> main process dies.  Since commit 46df6906f
> (execv_dashed_external: wait for child on signal death,
> 2017-01-06), we respect an extra "wait_after_clean" flag,
> which we expect to find in the child_process struct.
> 
> When Git is built with NO_PTHREADS, we start "struct
> async" processes by forking rather than spawning a thread.
> The resulting processes get added to the cleanup list but
> they don't have a child_process struct, and the cleanup
> function ends up dereferencing NULL.
> 
> We should notice this case and assume that the processes do
> not need to be waited for (i.e., the same behavior they had
> before 46df6906f).
> 
> Reported-by: Brandon Williams 
> Signed-off-by: Jeff King 
> ---
> This is a regression in v2.12.0, though there is no hurry to get it into
> v2.12.1 unless your grep patches go in, too. Without them you can't
> actually build with NO_PTHREADS anyway.
> 
> However, applied directly on top of 46df6906f (which predates the build
> breakage), you can easily see the test failures with NO_PTHREADS and
> that this fixes them.
> 
>  run-command.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/run-command.c b/run-command.c
> index 5227f78ae..574b81d3e 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -48,7 +48,7 @@ static void cleanup_children(int sig, int in_signal)
>  
>   kill(p->pid, sig);
>  
> - if (p->process->wait_after_clean) {
> + if (p->process && p->process->wait_after_clean) {
>   p->next = children_to_wait_for;
>   children_to_wait_for = p;
>   } else {

Looks good to me! Thanks for tracking that down so quickly.  I'm glad it
was a quick fix.

-- 
Brandon Williams


Re: [PATCH] run-command: fix segfault when cleaning forked async process

2017-03-17 Thread Jonathan Nieder
Jeff King wrote:

> Reported-by: Brandon Williams 
> Signed-off-by: Jeff King 

Reviewed-by: Jonathan Nieder 

Thanks.


[PATCH] run-command: fix segfault when cleaning forked async process

2017-03-17 Thread Jeff King
Callers of the run-command API may mark a child as
"clean_on_exit"; it gets added to a list and killed when the
main process dies.  Since commit 46df6906f
(execv_dashed_external: wait for child on signal death,
2017-01-06), we respect an extra "wait_after_clean" flag,
which we expect to find in the child_process struct.

When Git is built with NO_PTHREADS, we start "struct
async" processes by forking rather than spawning a thread.
The resulting processes get added to the cleanup list but
they don't have a child_process struct, and the cleanup
function ends up dereferencing NULL.

We should notice this case and assume that the processes do
not need to be waited for (i.e., the same behavior they had
before 46df6906f).

Reported-by: Brandon Williams 
Signed-off-by: Jeff King 
---
This is a regression in v2.12.0, though there is no hurry to get it into
v2.12.1 unless your grep patches go in, too. Without them you can't
actually build with NO_PTHREADS anyway.

However, applied directly on top of 46df6906f (which predates the build
breakage), you can easily see the test failures with NO_PTHREADS and
that this fixes them.

 run-command.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index 5227f78ae..574b81d3e 100644
--- a/run-command.c
+++ b/run-command.c
@@ -48,7 +48,7 @@ static void cleanup_children(int sig, int in_signal)
 
kill(p->pid, sig);
 
-   if (p->process->wait_after_clean) {
+   if (p->process && p->process->wait_after_clean) {
p->next = children_to_wait_for;
children_to_wait_for = p;
} else {
-- 
2.12.0.660.gf842b44fd


  1   2   3   4   >