Re: What's cooking in git.git (Aug 2016, #08; Wed, 24)

2016-08-30 Thread Jacob Keller
On Tue, Aug 30, 2016 at 10:10 PM, Stefan Beller  wrote:
>>
>> * sb/submodule-clone-rr (2016-08-17) 8 commits
>>  - clone: recursive and reference option triggers submodule alternates
>>  - clone: implement optional references
>>  - clone: clarify option_reference as required
>>  - clone: factor out checking for an alternate path
>>  - submodule--helper update-clone: allow multiple references
>>  - submodule--helper module-clone: allow multiple references
>>  - t7408: merge short tests, factor out testing method
>>  - t7408: modernize style
>>
>>  I spotted a last-minute bug in v5, which is not a very good sign
>>  (it shows that nobody is reviewing).  Any more comments?
>>
>>
>
> Jacob Keller reviewed that series as announced in
>
> https://public-inbox.org/git/CA+P7+xpE=GoFWfdzmT+k=zku8+yjeh-aomsfutjjjwfha1h...@mail.gmail.com/#t
>
> and concluded it is fine as in
>
> https://public-inbox.org/git/ca+p7+xrokr0zgidqfuvpn+-j_wdjkauropcnpgvjzhafc12...@mail.gmail.com/
>
> Thanks,
> Stefan

Yep.

Thanks,
Jake


Re: [PATCHv5 8/8] clone: recursive and reference option triggers submodule alternates

2016-08-30 Thread Jacob Keller
On Tue, Aug 30, 2016 at 10:04 PM, Stefan Beller  wrote:
> On Wed, Aug 24, 2016 at 4:37 PM, Jacob Keller  wrote:
>
>> Yes that seems reasonable.
>>
>> Thanks,
>> Jake
>
> I reviewed all your comments and you seem to be ok with including this
> series as it is queued currently?
>
> Thanks,
> Stefan

Yea, based on what's in Junio's tree, I think we've squashed in all the
suggested changes, unless there have been more suggestions I missed.

Regards,
Jake


Re: Reducing CPU load on git server

2016-08-30 Thread Jeff King
On Mon, Aug 29, 2016 at 03:41:59PM -0700, W. David Jarvis wrote:

> We have an open support thread with the GHE support folks, but the
> only feedback we've gotten so far on this subject is that they believe
> our CPU load is being driven by the quantity of fetch operations (an
> average of 16 fetch requests per second during normal business hours,
> so 4 requests per second per subscriber box). About 4,000 fetch
> requests on our main repository per day.

Hmm. That might well be, but I'd have to see the numbers to say more. At
this point I think we're exhausting what is useful to talk about on the
Git list.  I'll point GHE Support at this thread, which might help your
conversation with them (and they might pull me in behind the scenes).

I'll try to answer any Git-specific questions here, though.

> > None of those operations is triggered by client fetches. You'll see
> > "rev-list" for a variety of operations, so that's hard to pinpoint. But
> > I'm surprised that "prune" is a common one for you. It is run as part of
> > the post-push, but I happen to know that the version that ships on GHE
> > is optimized to use bitmaps, and to avoid doing any work when there are
> > no loose objects that need pruning in the first place.
> 
> Would regular force-pushing trigger prune operations? Our engineering
> body loves to force-push.

No, it shouldn't make a difference. For stock git, "prune" will only be
run occasionally as part of "git gc". On GitHub Enterprise, every push
kicks off a "sync" job that moves objects from a specific fork into
storage shared by all of the related forks. So GHE will run prune more
often than stock git would, but force-pushing wouldn't have any effect
on that.

There are also some custom patches to optimize prune on GHE, so it
shouldn't generally be very expensive. Unless perhaps for some reason
the reachability bitmaps on your repository aren't performing very well.

You could try something like comparing:

  time git rev-list --objects --all >/dev/null

and

  time git rev-list --objects --all --use-bitmap-index >/dev/null

on your server. The second should be a lot faster. If it's not, that may
be an indication that Git could be doing a better job of selecting
bitmap commits (that code is not GitHub-specific at all).

> >> I might be misunderstanding this, but if the subscriber is already "up
> >> to date" modulo a single updated ref tip, then this problem shouldn't
> >> occur, right? Concretely: if ref A is built off of ref B, and the
> >> subscriber already has B when it requests A, that shouldn't cause this
> >> behavior, but it would cause this behavior if the subscriber didn't
> >> have B when it requested A.
> >
> > Correct. So this shouldn't be a thing you are running into now, but it's
> > something that might be made worse if you switch to fetching only single
> > refs.
> 
> But in theory if we were always up-to-date (since we'd always fetch
> any updated ref) we wouldn't run into this problem? We could have a
> cron job to ensure that we run a full git fetch every once in a while
> but I'd hope that if this was written properly we'd almost always have
> the most recent commit for any dependency ref.

It's a little more complicated than that. What you're really going for
is letting git reuse on-disk deltas when serving fetches. But depending
on when the last repack was run, we might be cobbling together the fetch
from multiple packs on disk, in which case there will still be some
delta search. In my experience that's not _generally_ a big deal,
though. Small fetches don't have that many deltas to find.

> Our primary repository is fairly busy. It has about 1/3 the commits of
> Linux and about 1/3 the refs, but seems otherwise to be on the same
> scale. And, of course, it both hasn't been around for as long as Linux
> has and has been experiencing exponential growth, which means its
> current activity is higher than it has ever been before -- might put
> it on a similar scale to Linux's current activity.

Most of the work for repacking scales with the number of total objects
(not quite linearly, though).  For torvalds/linux (and its forks),
that's around 16 million objects. You might try "git count-objects -v"
on your server for comparison (but do it in the "network.git" directory,
as that's the shared object storage).

-Peff


Re: git blame [was: Reducing CPU load on git server]

2016-08-30 Thread Jeff King
On Tue, Aug 30, 2016 at 12:46:20PM +0200, Jakub Narębski wrote:

> W dniu 29.08.2016 o 23:31, Jeff King pisze:
> 
> > Blame-tree is a GitHub-specific command (it feeds the main repository
> > view page), and is a known CPU hog. There's more clever caching for that
> > coming down the pipe, but it's not shipped yet.
> 
> I wonder if having support for 'git blame ' in Git core would
> be something interesting to Git users.  I once tried to implement it,
> but it went nowhere.  Would it be hard to implement?

I think there's some interest; I have received a few off-list emails
over the years about it. There was some preliminary discussion long ago:

  http://public-inbox.org/git/20110302164031.ga18...@sigill.intra.peff.net/

The code that runs on GitHub is available in my fork of git. I haven't
submitted it upstream because there are some lingering issues. I
mentioned them on-list in the first few items of:

  http://public-inbox.org/git/20130318121243.gc14...@sigill.intra.peff.net/

That code is in the jk/blame-tree branch of https://github.com/peff/git
if you are interested in addressing them (note that I haven't touched
that code in a few years except for rebasing it forward, so it may have
bitrotted a little).

Here's a snippet from an off-list conversation I had with Dennis (cc'd)
in 2014 (I think he uses that blame-tree code as part of a custom git
web interface):

> The things I think it needs are:
> 
>   1. The max-depth patches need to be reconciled with Duy's pathspec
>  work upstream. The current implementation works only for tree
>  diffs, and is not really part of the pathspec at all.
> 
>   2. Docs/output formats for blame-tree need to be friendlier, as you
>  noticed.
> 
>   3. Blame-tree does not use revision pathspecs at all. This makes it
>  take way longer than it could, because it does not prune away side
>  branches deep in history that affect only paths whose blame we have
>  already found. But the current pathspec code is so slow that using
>  it outweighs the pruning benefit.
> 
>  I have a series, which I just pushed up to jk/faster-blame-tree,
>  which tries to improve this.  But it's got a lot of new, untested
>  code itself (we are not even running it at GitHub yet). It's also
>  based on v1.9.4; I think there are going to be a lot of conflicts
>  with the combine-tree work done in v2.0.
> 
> [...]
> 
> I also think it would probably make sense for blame-tree to support the
> same output formats as git-blame (e.g., to have an identical --porcelain
> mode, to have a reasonable human-readable format by default, etc).

That's all I could dig out of my archives. I'd be happy if somebody
wanted to pick it up and run with it. Polishing for upstream has been on
my list for several years now, but there's usually something more
important (or interesting) to work on at any given moment.

You might also look at how GitLab does it. I've never talked to them
about it, and as far as I know they do not use blame-tree.

-Peff


Re: [PATCH v1] pack-protocol: fix maximum pkt-line size

2016-08-30 Thread Stefan Beller
On Mon, Aug 29, 2016 at 10:55 AM,   wrote:
> From: Lars Schneider 
>
> According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
> pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
> therefore the pkt-line data component must not exceed 65516 bytes.
>
> Signed-off-by: Lars Schneider 
> ---
>
> This patch was part of my "Git filter protocol" series [1]. Stefan
> suggested to submit this patch individually as it is not strictly
> required in the series [2].
>
> Thanks,
> Lars
>
>
> [1] 
> http://public-inbox.org/git/20160825110752.31581-1-larsxschnei...@gmail.com/
> [2] 
> http://public-inbox.org/git/cagz79kajn-yf28+ls2jyqss6jt-oj2jw-saxqn-oe5xaxsy...@mail.gmail.com/
>
>
>  Documentation/technical/protocol-common.txt | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/technical/protocol-common.txt 
> b/Documentation/technical/protocol-common.txt
> index bf30167..ecedb34 100644
> --- a/Documentation/technical/protocol-common.txt
> +++ b/Documentation/technical/protocol-common.txt
> @@ -67,9 +67,9 @@ with non-binary data the same whether or not they contain 
> the trailing
>  LF (stripping the LF if present, and not complaining when it is
>  missing).
>
> -The maximum length of a pkt-line's data component is 65520 bytes.
> -Implementations MUST NOT send pkt-line whose length exceeds 65524
> -(65520 bytes of payload + 4 bytes of length data).
> +The maximum length of a pkt-line's data component is 65516 bytes.
> +Implementations MUST NOT send pkt-line whose length exceeds 65520
> +(65516 bytes of payload + 4 bytes of length data).
>
>  Implementations SHOULD NOT send an empty pkt-line ("0004").
>

This looks good,

Thanks,
Stefan

> --
> 2.9.2
>


Re: What's cooking in git.git (Aug 2016, #08; Wed, 24)

2016-08-30 Thread Stefan Beller
>
> * sb/submodule-clone-rr (2016-08-17) 8 commits
>  - clone: recursive and reference option triggers submodule alternates
>  - clone: implement optional references
>  - clone: clarify option_reference as required
>  - clone: factor out checking for an alternate path
>  - submodule--helper update-clone: allow multiple references
>  - submodule--helper module-clone: allow multiple references
>  - t7408: merge short tests, factor out testing method
>  - t7408: modernize style
>
>  I spotted a last-minute bug in v5, which is not a very good sign
>  (it shows that nobody is reviewing).  Any more comments?
>
>

Jacob Keller reviewed that series as announced in

https://public-inbox.org/git/CA+P7+xpE=GoFWfdzmT+k=zku8+yjeh-aomsfutjjjwfha1h...@mail.gmail.com/#t

and concluded it is fine as in

https://public-inbox.org/git/ca+p7+xrokr0zgidqfuvpn+-j_wdjkauropcnpgvjzhafc12...@mail.gmail.com/

Thanks,
Stefan


[PATCH 3/3] diff-highlight: avoid highlighting combined diffs

2016-08-30 Thread Jeff King
The algorithm in diff-highlight only understands how to look
at two sides of a diff; it cannot correctly handle combined
diffs with multiple preimages. Often highlighting does not
trigger at all for these diffs because the line counts do
not match up.  E.g., if we see:

  - ours
   -theirs
  ++resolved

we would not bother highlighting; it naively looks like a
single line went away, and then a separate hunk added
another single line.

But of course there are exceptions. E.g., if the other side
deleted the line, we might see:

  - ours
  ++resolved

which looks like we dropped " ours" and added "+resolved".
This is only a small highlighting glitch (we highlight the
space and the "+" along with the content), but it's also the
tip of the iceberg. Even if we learned to find the true
content here (by noticing we are in a 3-way combined diff
and marking _two_ characters from the front of the line as
uninteresting), there are other more complicated cases where
we really do need to handle a 3-way hunk.

Let's just punt for now; we can recognize combined diffs by
the presence of extra "@" symbols in the hunk header, and
treat them as non-diff content.

Signed-off-by: Jeff King 
---
 contrib/diff-highlight/diff-highlight|  2 +-
 contrib/diff-highlight/t/t9400-diff-highlight.sh | 37 
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/contrib/diff-highlight/diff-highlight 
b/contrib/diff-highlight/diff-highlight
index 9280c88..81bd804 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -36,7 +36,7 @@ $SIG{PIPE} = 'DEFAULT';
 while (<>) {
if (!$in_hunk) {
print;
-   $in_hunk = /^$GRAPH*$COLOR*\@/;
+   $in_hunk = /^$GRAPH*$COLOR*\@\@ /;
}
elsif (/^$GRAPH*$COLOR*-/) {
push @removed, $_;
diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh 
b/contrib/diff-highlight/t/t9400-diff-highlight.sh
index 7d034aa..64dd9f7 100755
--- a/contrib/diff-highlight/t/t9400-diff-highlight.sh
+++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
@@ -254,4 +254,41 @@ test_expect_success 'diff-highlight works with the --graph 
option' '
test_cmp graph.exp graph.act
 '
 
+# Most combined diffs won't meet diff-highlight's line-number filter. So we
+# create one here where one side drops a line and the other modifies it. That
+# should result in a diff like:
+#
+#- modified content
+#++resolved content
+#
+# which naively looks like one side added "+resolved".
+test_expect_success 'diff-highlight ignores combined diffs' '
+   echo "content" >file &&
+   git add file &&
+   git commit -m base &&
+
+   >file &&
+   git commit -am master &&
+
+   git checkout -b other HEAD^ &&
+   echo "modified content" >file &&
+   git commit -am other &&
+
+   test_must_fail git merge master &&
+   echo "resolved content" >file &&
+   git commit -am resolved &&
+
+   cat >expect <<-\EOF &&
+   --- a/file
+   +++ b/file
+   @@@ -1,1 -1,0 +1,1 @@@
+   - modified content
+   ++resolved content
+   EOF
+
+   git show -c | "$DIFF_HIGHLIGHT" >actual.raw &&
+   sed -n "/^---/,\$p" <actual.raw >actual &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.10.0.rc2.125.gcfb3d08


Re: [PATCHv5 8/8] clone: recursive and reference option triggers submodule alternates

2016-08-30 Thread Stefan Beller
On Wed, Aug 24, 2016 at 4:37 PM, Jacob Keller  wrote:

> Yes that seems reasonable.
>
> Thanks,
> Jake

I reviewed all your comments and you seem to be ok with including this
series as it is queued currently?

Thanks,
Stefan


[PATCH 1/3] diff-highlight: ignore test cruft

2016-08-30 Thread Jeff King
These are the same as in the normal t/.gitignore, with the
exception of ".prove", as our Makefile does not support it.

Signed-off-by: Jeff King 
---
 contrib/diff-highlight/t/.gitignore | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 contrib/diff-highlight/t/.gitignore

diff --git a/contrib/diff-highlight/t/.gitignore 
b/contrib/diff-highlight/t/.gitignore
new file mode 100644
index 0000000..7dcbb23
--- /dev/null
+++ b/contrib/diff-highlight/t/.gitignore
@@ -0,0 +1,2 @@
+/trash directory*
+/test-results
-- 
2.10.0.rc2.125.gcfb3d08



[PATCH 2/3] diff-highlight: add multi-byte tests

2016-08-30 Thread Jeff King
Now that we have a test suite for diff highlight, we can
show off the improvements from 8d00662 (diff-highlight: do
not split multibyte characters, 2015-04-03).

While we're at it, we can also add another case that
_doesn't_ work: combining code points are treated as their
own unit, which means that we may stick colors between them
and the character they are modifying (with the result that
the color is not shown in an xterm, though it's possible
that other terminals err the other way, and show the color
but not the accent).  There's no fix here, but let's
document it as a failure.

Signed-off-by: Jeff King 
---
 contrib/diff-highlight/t/t9400-diff-highlight.sh | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh 
b/contrib/diff-highlight/t/t9400-diff-highlight.sh
index e42232d..7d034aa 100755
--- a/contrib/diff-highlight/t/t9400-diff-highlight.sh
+++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
@@ -207,7 +207,41 @@ test_expect_failure 'diff-highlight highlights mismatched 
hunk size' '
EOF
 '
 
-# TODO add multi-byte test
+# These two code points share the same leading byte in UTF-8 representation;
+# a naive byte-wise diff would highlight only the second byte.
+#
+#   - U+00f3 ("o" with acute)
+o_accent=$(printf '\303\263')
+#   - U+00f8 ("o" with stroke)
+o_stroke=$(printf '\303\270')
+
+test_expect_success 'diff-highlight treats multibyte utf-8 as a unit' '
+   echo "unic${o_accent}de" >a &&
+   echo "unic${o_stroke}de" >b &&
+   dh_test a b <<-EOF
+   @@ -1 +1 @@
+   -unic${CW}${o_accent}${CR}de
+   +unic${CW}${o_stroke}${CR}de
+   EOF
+'
+
+# Unlike the UTF-8 above, these are combining code points which are meant
+# to modify the character preceding them:
+#
+#   - U+0301 (combining acute accent)
+combine_accent=$(printf '\314\201')
+#   - U+0302 (combining circumflex)
+combine_circum=$(printf '\314\202')
+
+test_expect_failure 'diff-highlight treats combining code points as a unit' '
+   echo "unico${combine_accent}de" >a &&
+   echo "unico${combine_circum}de" >b &&
+   dh_test a b <<-EOF
+   @@ -1 +1 @@
+   -unic${CW}o${combine_accent}${CR}de
+   +unic${CW}o${combine_circum}${CR}de
+   EOF
+'
 
 test_expect_success 'diff-highlight works with the --graph option' '
dh_test_setup_history &&
-- 
2.10.0.rc2.125.gcfb3d08



Re: [PATCH v4 0/3] diff-highlight: add support for --graph option

2016-08-30 Thread Jeff King
On Tue, Aug 30, 2016 at 07:07:11AM -0700, Brian Henderson wrote:

> On Mon, Aug 29, 2016 at 02:37:46PM -0700, Junio C Hamano wrote:
> > Brian Henderson  writes:
> >
> > > How does this look?
> > >
> > > Drawing the graph helped me a lot in figuring out what I was
> > > actually testing. thanks!
> >
> > Yeah, I also am pleased to see the picture of what is being tested
> > in the test script.
> >
> > With your sign-off, they would have been almost perfect ;-).
> 
> doh. fixed.
> 
> I left the subject as v4, probably mostly because I have this weird aversion 
> to
> increasing version numbers :) but I justified it by thinking that the actual
> patch set isn't changing, I just added the sign-off (and updated the commit
> messages per Jeff.) Hope that's ok.

Thanks. Here are a few patches to go on top. The first one could
arguably be squashed into your first patch (and I don't mind if Junio
wants to do so while applying, but I don't think it's worth you
re-sending).

The second one fleshes out the test scripts a bit, now that we have them
(yay!).

And the third fixes a bug that was reported to me off-list. I held back
because it touches the same lines as your topic (and as a bonus, I was
now able to write a test for it). It could be its own topic branch that
graduates separately, but seeing as it's contrib, I don't mind one big
diff-highlight potpourri topic if it makes things simpler.

  [1/3]: diff-highlight: ignore test cruft
  [2/3]: diff-highlight: add multi-byte tests
  [3/3]: diff-highlight: avoid highlighting combined diffs

-Peff


Re: [PATCH v6 12/13] convert: add filter..process option

2016-08-30 Thread Torsten Bögershausen
On Tue, Aug 30, 2016 at 03:23:10PM -0700, Junio C Hamano wrote:
> Lars Schneider  writes:
> 
> > On 30 Aug 2016, at 20:59, Junio C Hamano  wrote:
> >> 
> >>> "abort" could be ambiguous because it could be read as "abort only
> >>> this file". "abort-all" would work, though. Would you prefer to see
> >>> "error" replaced by "abort" and "error-all" by "abort-all"?
> >> 
> >> No.
> >> 
> >> I was primarily reacting to "-all" part, so anything that ends with
> >> "-all" is equally ugly from my point of view and not an improvement.
> >> 
> >> As I said, "error-all" as long as other reviewers are happy with is
> >> OK by me, too.
> >
> > Now, I see your point. How about "error-and-stop" or "error-stop"
> > instead of "error-all"?
> 
> Not really.  I was primarily reacting to having "error" and
> "error-all", that share the same prefix.  Changing "-all" suffix to
> "-stop" does not make all that difference to me.  But in any case,
> as long as other reviewers are happy with it, it is OK by me, too.
> 
> > Good argument. However, my intention here was to mimic the v1 filter
> > mechanism.
> 
> I am not making any argument and in no way against the chosen
> behaviour.  That "intention here" that was revealed after two
> iterations is something we would want to see as the reasoning
> behind the choice of the final behaviour recorded somewhere,
> and now I drew it out of you, I achieved what I set out to do
> initially ;-)
> 
> > I am not sure I understand your last sentence. Just to be clear:
> > Would you prefer it, if Git would just close the pipe to the filter process
> > on Git exit and leave the filter running?
> 
> As a potential filter writer, I would probably feel it the easiest
> to handle if there is no signal involved.  My job would be to read
> from the command pipe, react to it, and when the command pipe gives
> me EOF, I am expected to exit myself.  That would be simple enough.
> 
> >> I meant it as primarily an example people can learn from when they
> >> want to write their own.
> >
> > I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose 
> > already.
> 
> I would expect them to peek at contrib/, but do you seriously expect
> people to comb through t/ directory?

How about a filter written in C, and placed in ./t/helper/ ?
At least I feel that a filter in C could be a starting point
for others who prefer, depending on the task the filter is going to do,
to use shell scripts, perl, python or any other high-level language.

A test case where data cannot be filtered would be a minimum.
As Jakub pointed out, you can use iconv with good and bad data.


[PATCH 2/2] color_parse_mem: initialize "struct color" temporary

2016-08-30 Thread Jeff King
Compiling color.c with gcc 6.2.0 using -O3 produces some
-Wmaybe-uninitialized false positives:

color.c: In function ‘color_parse_mem’:
color.c:189:10: warning: ‘bg.blue’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
   out += xsnprintf(out, len, "%c8;2;%d;%d;%d", type,
  ^~~
  c->red, c->green, c->blue);
  ~~
color.c:208:15: note: ‘bg.blue’ was declared here
  struct color bg = { COLOR_UNSPECIFIED };
   ^~
[ditto for bg.green, bg.red, fg.blue, etc]

This is doubly confusing, because the declaration shows it
being initialized! Even though we do not explicitly
initialize the color components, an incomplete initializer
sets the unmentioned members to zero.

What the warning doesn't show is that we later do this:

  struct color c;
  if (!parse_color(&c, ...)) {
  if (fg.type == COLOR_UNSPECIFIED)
fg = c;
  ...
  }

gcc is clever enough to realize that a struct assignment
from an uninitialized variable taints the destination. But
unfortunately it's _not_ clever enough to realize that we
only look at those members when type is set to COLOR_RGB, in
which case they are always initialized.

With -O2, gcc does not look into parse_color() and must
assume that "c" emerges fully initialized. With -O3, it
inlines parse_color(), and learns just enough to get
confused.

We can silence the false positive by initializing the
temporary "c". This also future-proofs us against
violating the type assumptions (the result would probably
still be buggy, but in a deterministic way).

Signed-off-by: Jeff King 
---
Of course it's possible that I am wrong and gcc is right, but I just
don't see it (and that was the real reason I dug; I don't care _that_
much about -O3 warnings, but I wanted to see if gcc had found a real
bug).

 color.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/color.c b/color.c
index 81c2676..1b95e6b 100644
--- a/color.c
+++ b/color.c
@@ -215,7 +215,7 @@ int color_parse_mem(const char *value, int value_len, char 
*dst)
/* [fg [bg]] [attr]... */
while (len > 0) {
const char *word = ptr;
-   struct color c;
+   struct color c = { COLOR_UNSPECIFIED };
int val, wordlen = 0;
 
while (len > 0 && !isspace(word[wordlen])) {
-- 
2.10.0.rc2.125.gcfb3d08


[PATCH 1/2] error_errno: use constant return similar to error()

2016-08-30 Thread Jeff King
Commit e208f9c (make error()'s constant return value more
visible, 2012-12-15) introduced some macro trickery to make
the constant return from error() more visible to callers,
which in turn can help gcc produce better warnings (and
possibly even better code).

Later, fd1d672 (usage.c: add warning_errno() and
error_errno(), 2016-05-08) introduced another variant, and
subsequent commits converted some uses of error() to
error_errno(), losing the magic from e208f9c for those
sites.

As a result, compiling vcs-svn/svndiff.c with "gcc -O3"
produces -Wmaybe-uninitialized false positives (at least
with gcc 6.2.0). Let's give error_errno() the same
treatment, which silences these warnings.

Signed-off-by: Jeff King 
---
 git-compat-util.h | 1 +
 usage.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index db89ba7..2ad45b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -436,6 +436,7 @@ static inline int const_error(void)
return -1;
 }
 #define error(...) (error(__VA_ARGS__), const_error())
+#define error_errno(...) (error_errno(__VA_ARGS__), const_error())
 #endif
 
 extern void set_die_routine(NORETURN_PTR void (*routine)(const char *err, 
va_list params));
diff --git a/usage.c b/usage.c
index 1dad03f..0efa3fa 100644
--- a/usage.c
+++ b/usage.c
@@ -148,6 +148,7 @@ void NORETURN die_errno(const char *fmt, ...)
va_end(params);
 }
 
+#undef error_errno
 int error_errno(const char *fmt, ...)
 {
char buf[1024];
-- 
2.10.0.rc2.125.gcfb3d08



[PATCH 0/2] squelch some "gcc -O3 -Wmaybe-uninitialized" warnings

2016-08-30 Thread Jeff King
I happened to be compiling git with -O3 today, and noticed we generate
some warnings about uninitialized variables (I actually compile with
-Wall, but the only false positives I saw were these).

Here are patches to squelch them.

  [1/2]: error_errno: use constant return similar to error()
  [2/2]: color_parse_mem: initialize "struct color" temporary

-Peff


Re: git submodules implementation question

2016-08-30 Thread Uma Srinivasan
On Tue, Aug 30, 2016 at 10:53 AM, Junio C Hamano  wrote:
> Uma Srinivasan  writes:
>
>> I think the following fix is still needed to is_submodule_modified():
>>
>> strbuf_addf(&buf, "%s/.git", path);
>> git_dir = read_gitfile(buf.buf);
>> if (!git_dir) {
>> git_dir = buf.buf;
>>  ==>   if (!is_git_directory(git_dir)) {
>>  ==> die("Corrupted .git dir in submodule %s", path);
>>  ==>   }
>> }
>
> If it is so important that git_dir is a valid Git
> repository after
>
> git_dir = read_gitfile(buf.buf);
> if (!git_dir)
> git_dir = buf.buf;
>
> is done to determine what "git_dir" to use, it seems to me that it
> does not make much sense to check ONLY dir/.git that is a directory
> and leave .git/modules/$name that dir/.git file points at unchecked.
>

Okay.

> But there is much bigger problem with the above addition, I think.
> There also can be a case where dir/ does not even have ".git" in it.
> A submodule the user is not interested in will just have an empty
> directory there, and immediately after the above three lines I
> reproduced above, we have this
>
> if (!is_directory(git_dir)) {
> strbuf_release(&buf);
> return 0;
> }
>

Good to know about this use case.

> The added check will break the use case.  If anything, that check,
> if this code needs to verify that "git_dir" points at a valid Git
> repository, should come _after_ that.

Okay.

>
> Shouldn't "git-status --porcelain" run in the submodule notice that
> it is not a valid repository and quickly die anyway?  Why should we
> even check before spawning that process in the first place?

I don't see any reason to disagree with this.

>
> I might suggest to update prepare_submodule_repo_env() so that the
> spawned process will *NOT* have to guess where the working tree and
> repository by exporting GIT_DIR (set to "git_dir" we discover above)
> and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
> working tree of the submodule).  That would stop the "git status" to
> guess (and fooled by a corrupted dir/.git that is not a git
> repository).
>

Here's where I am struggling with my lack of knowledge of git
internals and the implementation particularly in the context of how
environment variables are passed from the parent to the child process.

Are you suggesting that we set up the child process environment array
(the "out" argument) in prepare_submodule_repo_env() to include
GIT_DIR_ENVIRONMENT and GIT_WORK_TREE_ENVIRONMENT in addition to
CONFIG_DATA_ENVIRONMENT that is there now?  Can I use the strcmp and
argv_array_push() calls to do this? There are several callers for this
routine prepare_submodule_repo_env(). Would the caller pass the git
dir and work tree as arguments to this routine or would the caller set
up the environment variables as needed? Is there any documentation or
other example code where similar passing of env. vars to a child
process is being done?

Sorry for all these questions but I'm still a novice as far as the
code is concerned.

Thanks for your help.

Uma


Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality

2016-08-30 Thread Junio C Hamano
Junio C Hamano  writes:

> Two functions with the same name reading from the same format, even
> when they expect to produce the same result in different internal
> format, without even being aware of each other is a bad enough
> "regression" in maintainability of the code.  One of them not even
> using sq_dequote() helper and reinventing is even worse.

So, here is what I did as a lunch-break-hack.  The "same result in
different internal format" calls for a fairly light-weight mechanism
to convey the necessary information that can be shared by callers
with different needs, and I chose string-list for that.

Totally untested, but parse_key_value_squoted() would be a good
foundation to build on.  The caller in "am" deliberately wants to be
strict (e.g. it wants to error out when the user mucked with the
file, so it insists on the three variables to appear in a known
order and rejects input that violates its assumption), but the
function does not mind if other callers want to be more lenient.

-- >8 --
From: Junio C Hamano 
Date: Tue, 30 Aug 2016 12:36:42 -0700
Subject: [PATCH] am: refactor read_author_script()

By splitting the part that reads from a file and the part that
parses the variable definitions from the contents, make the latter
more reusable in the future.

Signed-off-by: Junio C Hamano 
---
 builtin/am.c | 103 ++-
 1 file changed, 45 insertions(+), 58 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 739b34d..b36d1f0 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -28,6 +28,7 @@
 #include "rerere.h"
 #include "prompt.h"
 #include "mailinfo.h"
+#include "string-list.h"
 
 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
@@ -258,38 +259,29 @@ static int read_state_file(struct strbuf *sb, const 
struct am_state *state,
 }
 
 /**
- * Reads a KEY=VALUE shell variable assignment from `fp`, returning the VALUE
- * as a newly-allocated string. VALUE must be a quoted string, and the KEY must
- * match `key`. Returns NULL on failure.
- *
- * This is used by read_author_script() to read the GIT_AUTHOR_* variables from
- * the author-script.
+ * Take a series of KEY='VALUE' lines where the VALUE part is
+ * sq-quoted, and append <KEY, VALUE> at the end of the string list
  */
-static char *read_shell_var(FILE *fp, const char *key)
+static int parse_key_value_squoted(char *buf, struct string_list *list)
 {
-   struct strbuf sb = STRBUF_INIT;
-   const char *str;
-
-   if (strbuf_getline_lf(&sb, fp))
-   goto fail;
-
-   if (!skip_prefix(sb.buf, key, &str))
-   goto fail;
-
-   if (!skip_prefix(str, "=", &str))
-   goto fail;
-
-   strbuf_remove(&sb, 0, str - sb.buf);
-
-   str = sq_dequote(sb.buf);
-   if (!str)
-   goto fail;
-
-   return strbuf_detach(&sb, NULL);
-
-fail:
-   strbuf_release(&sb);
-   return NULL;
+   while (*buf) {
+   struct string_list_item *item;
+   char *np;
+   char *cp = strchr(buf, '=');
+   if (!cp)
+   return -1;
+   np = strchrnul(cp, '\n');
+   *cp++ = '\0';
+   item = string_list_append(list, buf);
+
+   buf = np + (*np == '\n');
+   *np = '\0';
+   cp = sq_dequote(cp);
+   if (!cp)
+   return -1;
+   item->util = xstrdup(cp);
+   }
+   return 0;
 }
 
 /**
@@ -311,44 +303,39 @@ static char *read_shell_var(FILE *fp, const char *key)
 static int read_author_script(struct am_state *state)
 {
const char *filename = am_path(state, "author-script");
-   FILE *fp;
+   struct strbuf buf = STRBUF_INIT;
+   struct string_list kv = STRING_LIST_INIT_DUP;
+   int retval = -1; /* assume failure */
+   int fd;
 
assert(!state->author_name);
assert(!state->author_email);
assert(!state->author_date);
 
-   fp = fopen(filename, "r");
-   if (!fp) {
+   fd = open(filename, O_RDONLY);
+   if (fd < 0) {
if (errno == ENOENT)
return 0;
die_errno(_("could not open '%s' for reading"), filename);
}
+   strbuf_read(&buf, fd, 0);
+   close(fd);
+   if (parse_key_value_squoted(buf.buf, &kv))
+   goto finish;
 
-   state->author_name = read_shell_var(fp, "GIT_AUTHOR_NAME");
-   if (!state->author_name) {
-   fclose(fp);
-   return -1;
-   }
-
-   state->author_email = read_shell_var(fp, "GIT_AUTHOR_EMAIL");
-   if (!state->author_email) {
-   fclose(fp);
-   return -1;
-   }
-
-   state->author_date = read_shell_var(fp, "GIT_AUTHOR_DATE");
-   if (!state->author_date) {
-   fclose(fp);
-   return -1;
-   }
-
-   if (fgetc(fp) != EOF) {
-   fclose(fp);
-   ret

Re: [PATCH v6 12/13] convert: add filter.<driver>.process option

2016-08-30 Thread Junio C Hamano
Lars Schneider  writes:

> On 30 Aug 2016, at 20:59, Junio C Hamano  wrote:
>> 
>>> "abort" could be ambiguous because it could be read as "abort only
>>> this file". "abort-all" would work, though. Would you prefer to see
>>> "error" replaced by "abort" and "error-all" by "abort-all"?
>> 
>> No.
>> 
>> I was primarily reacting to "-all" part, so anything that ends with
>> "-all" is equally ugly from my point of view and not an improvement.
>> 
>> As I said, "error-all" as long as other reviewers are happy with is
>> OK by me, too.
>
> Now, I see your point. How about "error-and-stop" or "error-stop"
> instead of "error-all"?

Not really.  I was primarily reacting to having "error" and
"error-all", that share the same prefix.  Changing "-all" suffix to
"-stop" does not make all that difference to me.  But in any case,
as long as other reviewers are happy with it, it is OK by me, too.

> Good argument. However, my intention here was to mimic the v1 filter
> mechanism.

I am not making any argument and in no way against the chosen
behaviour.  That "intention here" that was revealed after two
iterations is something we would want to see as the reasoning
behind the choice of the final behaviour recorded somewhere,
and now I drew it out of you, I achieved what I set out to do
initially ;-)

> I am not sure I understand your last sentence. Just to be clear:
> Would you prefer it, if Git would just close the pipe to the filter process
> on Git exit and leave the filter running?

As a potential filter writer, I would probably feel it the easiest
to handle if there is no signal involved.  My job would be to read
from the command pipe, react to it, and when the command pipe gives
me EOF, I am expected to exit myself.  That would be simple enough.

>> I meant it as primarily an example people can learn from when they
>> want to write their own.
>
> I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose 
> already.

I would expect them to peek at contrib/, but do you seriously expect
people to comb through t/ directory?


Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 30.08.2016 um 19:52 schrieb Johannes Schindelin:
>> Right, but that is exactly what I wanted to avoid, because it is rather
>> inelegant to strdup() strings just so that we do not have to record what
>> to free() and what not to free().
>
> Please, excuse, but when I have to choose what is more "elegant":
>
>  1. strdup() sometimes so that I can later free() always
>  2. use sequencer_entrust()
>
> I would choose 1. at all times.

Let's not bring elegance or aesthetics in.  It is like asking if
garbage collected systems are elegant.  I do not think it is a valid
question.  GC is a good pragmatic compromise after (1) declaring
that keeping track of allocation is too cumbersome and programmers'
time is better spent elsewhere, and (2) making sure hiccups caused
by GC can be minimized to keep the latency down to acceptable
levels.  There are good compromises and bad ones.

Does the proposed entrust() thing qualify as a good pragmatic
compromise like GC does?

> Particularly in this case: parsing options does not sound like a
> major drain of resources, neither CPU- nor memory-wise.

If entrust() does not have any CPU- or memory-cost, then comparing
it with having to strdup() sometimes might be a useful comparison.

But the entrust() thing has the allocation cost in order to track of
what needs to be done at runtime, and just like strdup() needs to be
done on things that are being borrowed, entrust() needs to be
avoided on things that are being borrowed (and the caller needs to
be sure not to entrust() the same piece of memory twice), so it does
not reduce the cost on programmers' and readers' mental burden--the
programmer always has to know which piece of memory is borrowed and
which are owned by the subsystem.

So...


Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Jakub Narębski
W dniu 30.08.2016 o 19:52, Johannes Schindelin pisze:
> Hi Kuba,
> 
> On Tue, 30 Aug 2016, Jakub Narębski wrote:
> 
>> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:
>>
>>> The sequencer reads options from disk and stores them in its struct
>>> for use during sequencer's operations.
>>>
>>> With this patch, the memory is released afterwards, plugging a
>>> memory leak.
>>>
>>> Signed-off-by: Johannes Schindelin 
>>> ---
>>>  sequencer.c | 13 ++---
>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/sequencer.c b/sequencer.c
>>> index b5be0f9..8d79091 100644
>>> --- a/sequencer.c
>>> +++ b/sequencer.c
>>> @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct 
>>> replay_opts *opts)
>>> free(opts->owned[i]);
>>> free(opts->owned);
>>>  
>>> +   free(opts->xopts);
>>> +
>>
>> This looks like independent change, not related to using the
>> sequencer_entrust() to store options read from disk in replay_opts
>> struct to be able to free memory afterwards.
>>
>> I guess you wanted to avoid one line changes...
> 
> Actually, it is not an independent change, but it free()s memory that has
> been allocated while reading the options, as the commit message says ;-)
> 
>>> @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const 
>>> char *value, void *data)
>>
>> Sidenote: this patch would be easier to read if lines were reordered
>> as below, but I don't think any slider heuristics could help achieve
>> that automatically.  Also, the patch might be invalid...
>>
>>> opts->allow_ff = git_config_bool_or_int(key, value, 
>>> &error_flag);
>>> else if (!strcmp(key, "options.mainline"))
>>> opts->mainline = git_config_int(key, value);
>>> -   else if (!strcmp(key, "options.strategy"))
>>> +   else if (!strcmp(key, "options.strategy")) {
>>> git_config_string(&opts->strategy, key, value);
>>> +   sequencer_entrust(opts, (char *) opts->strategy);
>>
>> I wonder if the ability to free strings dup-ed by git_config_string()
>> be something that is part of replay_opts, or rather remove_sequencer_state(),
>> that is a list of
>>
>>  free(opts->strategy);
>>  free(opts->gpg_sign);
> 
> That is not necessarily possible because the way sequencer works, the
> options may have not actually be read from the file, but may be populated
> by the caller (in which case we do not necessarily want to require
> strdup()ing the strings just so that the sequencer can clean stuff up
> afterwards).

I guess from cursory browsing through the Git code that _currently_
they are only read from the config file, where git_config_string()
strdup's them, isn't it?  And we want to prepare for the future, where
the caller would prepare replay_opts, and the caller would be responsible
for freeing data if necessary?

Would there be any sane situation where some of data should be owned
by caller (and freed by caller), and some of data should be owned by
sequencer library API (and freed in remove_sequencer_state())?  If
not, perhaps *_entrust() mechanism is overthinking it, and we simply
need 'is_strdup' boolean flag or something like that...

>> The *_entrust() mechanism is more generic, but do we use this general-ness?
>> Well, it could be xstrdup or git_config_string doing entrust'ing...
> 
> Right, but that is exactly what I wanted to avoid, because it is rather
> inelegant to strdup() strings just so that we do not have to record what
> to free() and what not to free().

Maybe inelegant, but it might be easier than inventing and implementing
*_entrust() mechanism, like Hannes wrote.

> 
> BTW I have no objection at all to generalize this sequencer_entrust()
> mechanism further (read: to other, similar use cases), should it withstand
> the test of time.

Yeah, that's my take on it too.

-- 
Jakub Narębski



Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests

2016-08-30 Thread Ævar Arnfjörð Bjarmason
On Tue, Aug 30, 2016 at 10:51 PM, Jeff King  wrote:
> On Tue, Aug 30, 2016 at 10:48:19PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >  -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ 
>> > directory.t[0-9]*)))
>> >  +failed:
>> >  +  @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
>> >  +  grep -l '^failed [1-9]' $$(ls -t *.counts | \
>> >  +  sed 
>> > 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | \
>> >  +  sed -n 's/-[0-9]*\.counts$$/.sh/p') && \
>> >  +  test -z "$$failed" || $(MAKE) $$failed
>> >
>> >   prove: pre-clean $(TEST_LINT)
>> > @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
>> > $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
>>
>> I don't at all mind this solution to the problem, if it works for that's 
>> cool.
>>
>> But FWIW something you may have missed is that you can just use
>> prove(1) for this, which is why I initially patched git.git to support
>> TAP, so I didn't have to implement stuff like this.
>
> Heh. I think each iteration of this patch will be destined to have
> somebody[1] point Johannes at prove. ;)
>
> (But I really do recommend prove if you can use it).
>
> -Peff
>
> [1] http://public-inbox.org/git/20160630063725.gc15...@sigill.intra.peff.net/

Sorry about that, I see it's been mentioned already. My only excuse is
that I don't know how to operate my E-Mail client :)


Apply Today

2016-08-30 Thread Loan Firm



Dear Sir/Madam,

We give out urgent loan for business and personal purpose with 3% intrest rate 
applicable to all amount.

Kindly get back to us via email: loa...@foxmail.com for further details on how 
to apply.


Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests

2016-08-30 Thread Jeff King
On Tue, Aug 30, 2016 at 10:48:19PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >  -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ 
> > directory.t[0-9]*)))
> >  +failed:
> >  +  @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
> >  +  grep -l '^failed [1-9]' $$(ls -t *.counts | \
> >  +  sed 
> > 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | \
> >  +  sed -n 's/-[0-9]*\.counts$$/.sh/p') && \
> >  +  test -z "$$failed" || $(MAKE) $$failed
> >
> >   prove: pre-clean $(TEST_LINT)
> > @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
> > $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
> 
> I don't at all mind this solution to the problem, if it works for that's cool.
> 
> But FWIW something you may have missed is that you can just use
> prove(1) for this, which is why I initially patched git.git to support
> TAP, so I didn't have to implement stuff like this.

Heh. I think each iteration of this patch will be destined to have
somebody[1] point Johannes at prove. ;)

(But I really do recommend prove if you can use it).

-Peff

[1] http://public-inbox.org/git/20160630063725.gc15...@sigill.intra.peff.net/


Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests

2016-08-30 Thread Ævar Arnfjörð Bjarmason
On Mon, Aug 29, 2016 at 3:46 PM, Johannes Schindelin
 wrote:
> While developing patch series, it is a good practice to run the test
> suite from time to time, just to make sure that obvious bugs are caught
> early.  With complex patch series, it is common to run `make -j15 -k
> test`, i.e.  run the tests in parallel and *not* stop at the first
> failing test but continue. This has the advantage of identifying
> possibly multiple problems in one big test run.
>
> It is particularly important to reduce the turn-around time thusly on
> Windows, where the test suite spends 45 minutes on the computer on which
> this patch was developed.
>
> It is the most convenient way to determine which tests failed after
> running the entire test suite, in parallel, to look for left-over "trash
> directory.t*" subdirectories in the t/ subdirectory. However, as was
> pointed out by Jeff King, those directories might live outside t/ when
> overridden using the --root= option, to which the Makefile
> has no access. The next best method is to grep explicitly for failed
> tests in the test-results/ directory, which the Makefile *can* access.
>
> This patch automates the process of determining which tests failed
> previously and re-running them.
>
> Note that we need to be careful to inspect only the *newest* entries in
> test-results/: this directory contains files of the form
> t<name>-<number>.counts and is only removed wholesale when running the
> *entire* test suite, not when running individual tests. We ensure that
> with a little sed magic on `ls -t`'s output that simply skips lines
> when the file name was seen earlier.
>
> Signed-off-by: Johannes Schindelin 
> ---
>
> The patch is unfortunately no longer as trivial as before, but it
> now also works with --root=..., i.e. when the user overrode the
> location of the trash directories.
>
> Published-As: https://github.com/dscho/git/releases/tag/failing-tests-v2
> Fetch-It-Via: git fetch https://github.com/dscho/git failing-tests-v2
> Interdiff vs v1:
>
>  diff --git a/t/Makefile b/t/Makefile
>  index c402a9ec..8aa6a72 100644
>  --- a/t/Makefile
>  +++ b/t/Makefile
>  @@ -35,7 +35,12 @@ all: $(DEFAULT_TEST_TARGET)
>   test: pre-clean $(TEST_LINT)
> $(MAKE) aggregate-results-and-cleanup
>
>  -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ 
> directory.t[0-9]*)))
>  +failed:
>  +  @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
>  +  grep -l '^failed [1-9]' $$(ls -t *.counts | \
>  +  sed 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') 
> | \
>  +  sed -n 's/-[0-9]*\.counts$$/.sh/p') && \
>  +  test -z "$$failed" || $(MAKE) $$failed
>
>   prove: pre-clean $(TEST_LINT)
> @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' 
> $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)

I don't at all mind this solution to the problem, if it works for that's cool.

But FWIW something you may have missed is that you can just use
prove(1) for this, which is why I initially patched git.git to support
TAP, so I didn't have to implement stuff like this.

I.e.:

$ prove --state=save t[0-9]*.sh
$ prove --state=failed,save t[0-9]*.sh

Does exactly what you're trying to do here with existing tools we support.

I.e. your new target could just be implemented in terms of calling prove.

Check out its man page, it may have other stuff you want to use, e.g.
you can run the slowest tests first etc.
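For what it's worth, the effect of the `ls -t | sed ...` pipeline in the quoted Makefile rule (keep only the newest `*.counts` file per test script, then map it back to a `.sh` name) is easy to model; the file names below are invented:

```python
import re

def newest_failed_candidates(counts_newest_first):
    """Given *.counts names sorted newest-first (as `ls -t` emits them),
    keep only the first (newest) entry per test script and map it to the
    .sh name, mimicking the sed expressions in the Makefile rule."""
    seen, scripts = set(), []
    for name in counts_newest_first:
        # t<name>-<number>.counts -> t<name>
        test = re.sub(r"-[0-9]+\.counts$", "", name)
        if test not in seen:
            seen.add(test)
            scripts.append(test + ".sh")
    return scripts

files = ["t0001-init-200.counts",
         "t7408-submodule-reference-150.counts",
         "t0001-init-100.counts"]
print(newest_failed_candidates(files))
# ['t0001.sh'-style names: newest counts file per test wins
```

The older `t0001-init-100.counts` is skipped because a newer counts file for the same script was already seen; the surviving names are what `$(MAKE) $$failed` re-runs.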


Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Johannes Sixt

Am 30.08.2016 um 19:52 schrieb Johannes Schindelin:

Right, but that is exactly what I wanted to avoid, because it is rather
inelegant to strdup() strings just so that we do not have to record what
to free() and what not to free().


Please, excuse, but when I have to choose what is more "elegant":

 1. strdup() sometimes so that I can later free() always
 2. use sequencer_entrust()

I would choose 1. at all times.

Particularly in this case: parsing options does not sound like a major 
drain of resources, neither CPU- nor memory-wise.


-- Hannes



Re: [PATCH v6 12/13] convert: add filter.<driver>.process option

2016-08-30 Thread Jakub Narębski
Hello,

W dniu 30.08.2016 o 20:59, Junio C Hamano pisze:
> Lars Schneider  writes:
[...]

>> The filter can exit right after the "error-all". If the filter does
>> not exit then Git will kill the filter. I'll add this to the docs.
> 
> OK.

Is it explicit kill, or implicit kill: close pipe and end process?

>> "abort" could be ambiguous because it could be read as "abort only
>> this file". "abort-all" would work, though. Would you prefer to see
>> "error" replaced by "abort" and "error-all" by "abort-all"?
> 
> No.
> 
> I was primarily reacting to "-all" part, so anything that ends with
> "-all" is equally ugly from my point of view and not an improvement.
> 
> As I said, "error-all" as long as other reviewers are happy with is
> OK by me, too.

I'm rather partial to "abort" instead of "error-all", or "quit"/"exit"
(but I prefer "abort" or "bail-out"), as it better reflects what this
response is about: ending the filter process.
 
>> A filter that dies during communication or does not adhere to the protocol
>> is a faulty filter. Feeding the faulty filter after restart with the same 
>> blob would likely cause the same error. 
> 
> Why does Git restart it and continue with the rest of the blobs
> then?  Is there a reason to believe it may produce correct results
> for other blobs if you run it again?

I guess the idea is that single filter can be run on different types
of blobs, and it could fail on some types (some files) and not others.
Like LFS-type filter failing on files with size larger than 2 GiB,
or iconv-like filter to convert UTF-16 to UTF-8 failing on invalid
byte sequences.

>> B) If we communicate "shutdown" to the filter then we need to give the
>>filter some time to perform the exit before the filter is killed on
>>Git exit. I wasn't able to come up with a good answer how long Git 
>>should wait for the exit.
> 
> Would that be an issue?  A filter is buggy if told to shutdown,
> ignores the request and hangs around forever.  I do not think we
> even need to kill it.  It is not Git's problem.

I think the idea was for Git to request shutdown, and filter to tell
when it finished (or just exiting, and Git getting SIGCHLD, I guess).
It is hard to tell how much to wait, for example for filter process
to send "sync" command, or finish unmounting, or something...

> I personally would be reluctant to become a filter process writer if
> the system it will be talking to can kill it without giving it a
> chance to do a graceful exit, but perhaps that is just me.  I don't
> know if closing the pipe going there where you are having Git to
> kill the process on the other end is any more involved than what you
> have without extra patches.

Isn't it the same problem with v1 filters being killed on Git process
exit?  Unless v2 filter wants to do something _after_ writing output
to Git, it should be safe... unless Git process is killed, but it
is not different from filter being stray-killed.

 +Please note that you cannot use an existing `filter.<driver>.clean`
 +or `filter.<driver>.smudge` command with `filter.<driver>.process`
 +because the former two use a different inter process communication
 +protocol than the latter one.
>>>
>>> Would it be a useful sample program we can ship in contrib/ if you
>>> created a "filter adapter" that reads these two configuration
>>> variables and act as a filter.<driver>.process?
>>
>> You mean a v2 filter that would use v1 filters under the hood?
>> If we would drop v1, then this would be useful. Otherwise I don't
>> see any need for such a thing.
> 
> I meant it as primarily an example people can learn from when they
> want to write their own.

Errr... if it were any v1 filter, it would be useless (as bad or
worse performance than ordinary v1 clean or smudge filter).  It
might make sense if v1 filter and v2 wrapper were written in the
same [dynamic] language, and wrapper could just load / require
filter as a function / procedure, [compile it], and use it.
For example smudge/clean filter in Perl (e.g. rot13), and v2 wrapper
in Perl too.

Best,
-- 
Jakub Narębski 
 



Re: [PATCH v6 12/13] convert: add filter.<driver>.process option

2016-08-30 Thread Lars Schneider

On 30 Aug 2016, at 20:59, Junio C Hamano  wrote:
> 
>> "abort" could be ambiguous because it could be read as "abort only
>> this file". "abort-all" would work, though. Would you prefer to see
>> "error" replaced by "abort" and "error-all" by "abort-all"?
> 
> No.
> 
> I was primarily reacting to "-all" part, so anything that ends with
> "-all" is equally ugly from my point of view and not an improvement.
> 
> As I said, "error-all" as long as other reviewers are happy with is
> OK by me, too.

Now, I see your point. How about "error-and-stop" or "error-stop"
instead of "error-all"?


>> A filter that dies during communication or does not adhere to the protocol
>> is a faulty filter. Feeding the faulty filter after restart with the same 
>> blob would likely cause the same error. 
> 
> Why does Git restart it and continue with the rest of the blobs
> then?  Is there a reason to believe it may produce correct results
> for other blobs if you run it again?

Good argument. However, my intention here was to mimic the v1 filter
mechanism. If a filter fails then Git will run the same v1 filter on the
>> next file that needs to be filtered. If `filter.<driver>.required=false`
then a failure is even a legitimate result.


>> B) If we communicate "shutdown" to the filter then we need to give the
>>   filter some time to perform the exit before the filter is killed on
>>   Git exit. I wasn't able to come up with a good answer how long Git 
>>   should wait for the exit.
> 
> Would that be an issue?  A filter is buggy if told to shutdown,
> ignores the request and hangs around forever.  I do not think we
> even need to kill it.  It is not Git's problem.
> I personally would be reluctant to become a filter process writer if
> the system it will be talking to can kill it without giving it a
> chance to do a graceful exit, but perhaps that is just me.  I don't
> know if closing the pipe going there where you are having Git to
> kill the process on the other end is any more involved than what you
> have without extra patches.

I am not sure I understand your last sentence. Just to be clear:
Would you prefer it, if Git would just close the pipe to the filter process
on Git exit and leave the filter running? I think I wouldn't even need
to close the pipe explicitly. The pipe would be closed automatically when
git exits or dies, right? The filter could detect EOF and exit on its own.
That would be OK with me.


 +Please note that you cannot use an existing `filter.<driver>.clean`
 +or `filter.<driver>.smudge` command with `filter.<driver>.process`
 +because the former two use a different inter process communication
 +protocol than the latter one.
>>> 
>>> Would it be a useful sample program we can ship in contrib/ if you
>>> created a "filter adapter" that reads these two configuration
>>> variables and act as a filter.<driver>.process?
>> 
>> You mean a v2 filter that would use v1 filters under the hood?
>> If we would drop v1, then this would be useful. Otherwise I don't
>> see any need for such a thing.
> 
> I meant it as primarily an example people can learn from when they
> want to write their own.

I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose 
already.


Thanks,
Lars


FW: GIT Support Partners

2016-08-30 Thread Scott Sobstad
Hi- I'm wondering if anyone could suggest a GIT support partner(s)?  The 
community is great, but I'm looking for a more personalized GIT support 
experience.  

Thanks!

-Scott Sobstad

Scott Sobstad
Director-Application Support,TSG
JDA Software
20700 Swenson Dr,
Waukesha, WI 53186 / United States
1.262.317.2185 / office
1.480.308.3000 / worldwide
scott.sobs...@jda.com
visit us / jda.com 




[PATCH] git-send-email: Add ability to cc: any "trailers" from commit message

2016-08-30 Thread Joe Perches
Many commits have various forms of trailers similar to
 "Acked-by: Name <address>" and "Reported-by: Name <address>"

Add the ability to cc these trailers when using git send-email.

This can be suppressed with --suppress-cc=trailers.

Signed-off-by: Joe Perches 
---
 Documentation/git-send-email.txt | 10 ++
 git-send-email.perl  | 16 +++-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 642d0ef..999c842 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -278,9 +278,10 @@ Automating
the value of `sendemail.identity`.
 
 --[no-]signed-off-by-cc::
-   If this is set, add emails found in Signed-off-by: or Cc: lines to the
-   cc list. Default is the value of `sendemail.signedoffbycc` configuration
-   value; if that is unspecified, default to --signed-off-by-cc.
+   If this is set, add emails found in Signed-off-by: or Cc: or any other
+   trailer -by: lines to the cc list. Default is the value of
+   `sendemail.signedoffbycc` configuration value; if that is unspecified,
+   default to --signed-off-by-cc.
 
 --[no-]cc-cover::
If this is set, emails found in Cc: headers in the first patch of
@@ -307,8 +308,9 @@ Automating
   patch body (commit message) except for self (use 'self' for that).
 - 'sob' will avoid including anyone mentioned in Signed-off-by lines except
for self (use 'self' for that).
+- 'trailers' will avoid including anyone mentioned in any "-by:" lines.
 - 'cccmd' will avoid running the --cc-cmd.
-- 'body' is equivalent to 'sob' + 'bodycc'
+- 'body' is equivalent to 'sob' + 'bodycc' + 'trailers'
 - 'all' will suppress all auto cc values.
 --
 +
diff --git a/git-send-email.perl b/git-send-email.perl
index da81be4..255465a 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -84,7 +84,7 @@ git send-email --dump-aliases
 --identity* Use the sendemail. options.
 --to-cmd  * Email To: via ` \$patch_path`
 --cc-cmd  * Email Cc: via ` \$patch_path`
---suppress-cc * author, self, sob, cc, cccmd, body, 
bodycc, all.
+--suppress-cc * author, self, sob, cc, cccmd, body, 
bodycc, trailers, all.
 --[no-]cc-cover* Email Cc: addresses in the cover letter.
 --[no-]to-cover* Email To: addresses in the cover letter.
 --[no-]signed-off-by-cc* Send to Signed-off-by: addresses. Default 
on.
@@ -431,13 +431,13 @@ my(%suppress_cc);
 if (@suppress_cc) {
foreach my $entry (@suppress_cc) {
die "Unknown --suppress-cc field: '$entry'\n"
-   unless $entry =~ 
/^(?:all|cccmd|cc|author|self|sob|body|bodycc)$/;
+   unless $entry =~ 
/^(?:all|cccmd|cc|author|self|sob|body|bodycc|trailers)$/;
$suppress_cc{$entry} = 1;
}
 }
 
 if ($suppress_cc{'all'}) {
-   foreach my $entry (qw (cccmd cc author self sob body bodycc)) {
+   foreach my $entry (qw (cccmd cc author self sob body bodycc trailers)) {
$suppress_cc{$entry} = 1;
}
delete $suppress_cc{'all'};
@@ -448,7 +448,7 @@ $suppress_cc{'self'} = $suppress_from if defined 
$suppress_from;
 $suppress_cc{'sob'} = !$signed_off_by_cc if defined $signed_off_by_cc;
 
 if ($suppress_cc{'body'}) {
-   foreach my $entry (qw (sob bodycc)) {
+   foreach my $entry (qw (sob bodycc trailers)) {
$suppress_cc{$entry} = 1;
}
delete $suppress_cc{'body'};
@@ -1545,7 +1545,7 @@ foreach my $t (@files) {
# Now parse the message body
while(<$fh>) {
$message .=  $_;
-   if (/^(Signed-off-by|Cc): (.*)$/i) {
+   if (/^(Signed-off-by|Cc|[^\s]+[_-]by): (.*)$/i) {
chomp;
my ($what, $c) = ($1, $2);
chomp $c;
@@ -1555,6 +1555,12 @@ foreach my $t (@files) {
} else {
next if $suppress_cc{'sob'} and $what =~ 
/Signed-off-by/i;
next if $suppress_cc{'bodycc'} and $what =~ 
/Cc/i;
+   next if $suppress_cc{'trailers'} and $what !~ 
/Signed-off-by/i && $what =~ /by$/i;
+   }
+   if ($c !~ /.+@.+/) {
+   printf("(body) Ignoring %s from line '%s'\n",
+  $what, $_) unless $quiet;
+   next;
}
push @cc, $c;
printf("(body) Adding cc: %s from line '%s'\n",
-- 
2.10.0.rc2.1.gb2aa91d
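The matching in the hunks above can be exercised in isolation; here is a rough Python model of the patch's logic (the trailer lines and addresses are examples, not from a real commit):

```python
import re

# Same shape as the patch's /^(Signed-off-by|Cc|[^\s]+[_-]by): (.*)$/i
TRAILER_RE = re.compile(r"^(Signed-off-by|Cc|\S+[_-]by): (.*)$", re.IGNORECASE)

def cc_candidates(body, suppress=()):
    """Return addresses picked up from trailer lines, honoring the
    sob/bodycc/trailers suppression categories of the patch."""
    ccs = []
    for line in body.splitlines():
        m = TRAILER_RE.match(line)
        if not m:
            continue
        what, addr = m.groups()
        if "sob" in suppress and re.search(r"Signed-off-by", what, re.I):
            continue
        if "bodycc" in suppress and re.fullmatch(r"Cc", what, re.I):
            continue
        # 'trailers' suppresses any "-by:" line that is not Signed-off-by
        if ("trailers" in suppress
                and not re.search(r"Signed-off-by", what, re.I)
                and re.search(r"by$", what, re.I)):
            continue
        if not re.search(r".+@.+", addr):
            continue  # no address on the line: ignore it
        ccs.append(addr)
    return ccs

body = ("Signed-off-by: A <a@example.com>\n"
        "Reported-by: B <b@example.com>\n"
        "Acked-by: C\n")  # no address: ignored
print(cc_candidates(body))                         # both addressed trailers
print(cc_candidates(body, suppress={"trailers"}))  # only the S-o-b line
```

Note that `Cc` does not end in "by", so --suppress-cc=trailers leaves Cc: lines alone, matching the patch's `$what !~ /Signed-off-by/i && $what =~ /by$/i` test.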



Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

2016-08-30 Thread Pranit Bauva
Hey Junio,

On Wed, Aug 31, 2016 at 1:03 AM, Junio C Hamano  wrote:
> Pranit Bauva  writes:
>
>> This is a very tricky one. I have purposely not included this after a
>> lot of testing. I have hand tested with the original git and with this
>> branch. The reason why anyone wouldn't be able to catch this is
>> because its not covered in the test suite. I am including a patch with
>> this as an attachment (because I am behind a proxy right now but don't
>> worry I will include this as a commit in the next series). The
>> original behaviour of git does not clean the bisect state when this
>> situation occurs.
>
> "We sometimes clean and we sometimes don't and this follows the
> original" may be a valid justification but it is not a very useful
> explanation.  The real issue is if not cleaning is intended (and if
> so why; otherwise, if it is clear that it is simply forgotten, we
> can just fix it in the series as a follow-up step).

I think we do want to recover from this "bad merge base" state and for
that not cleaning up is essential. The original behaviour of git seems
natural to me.

> If not cleaning in some cases (but not others) is the right thing,
> at least there needs an in-code comment to warn others against
> "fixing" the lack of cleanups (e.g. "don't clean state here, because
> the caller still wants to see what state we were for this and that
> reason").

I will mention it in the comments.

 + if (bisect_next_check(terms, terms->term_good.buf))
 + return -1;
>>>
>>> Mental note.  The "autostart" in the original is gone.  Perhaps it
>>> is done by next_check in this code, but we haven't seen that yet.
>>
>> This will be added back again in a coming patch[1].
>
> In other words, this series is broken at this step and the breakage
> stay with the codebase until a later step?

Broken though it passes the test suite.

> I do not know if reordering the patches in the series is enough to
> fix that, or if it is even worth to avoid such a temporary breakage.
> It depends on how serious the circularity is, but I would understand
> if it is too hard and not worth the effort (I think in a very early
> review round some people advised against the bottom-up rewrite
> because they anticipated this exact reason).  At least the omission
> (and later resurrection) needs to be mentioned in the log so that
> people understand what is going on when they later need to bisect.

bisect_autostart() call from bisect_next() was introduced in the
earliest version of git-bisect in commit 8cc6a0831, but it isn't
really a big deal and I think it would be OK to skip it for a
little while, since the series fixes it in the end anyway. It is
important to mention this in the commit message and I will do it.

Regards,
Pranit Bauva


Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

2016-08-30 Thread Junio C Hamano
Pranit Bauva  writes:

> This is a very tricky one. I have purposely not included this after a
> lot of testing. I have hand tested with the original git and with this
> branch. The reason why anyone wouldn't be able to catch this is
> because its not covered in the test suite. I am including a patch with
> this as an attachment (because I am behind a proxy right now but don't
> worry I will include this as a commit in the next series). The
> original behaviour of git does not clean the bisect state when this
> situation occurs.

"We sometimes clean and we sometimes don't and this follows the
original" may be a valid justification but it is not a very useful
explanation.  The real issue is if not cleaning is intended (and if
so why; otherwise, if it is clear that it is simply forgotten, we
can just fix it in the series as a follow-up step).

If not cleaning in some cases (but not others) is the right thing,
at least there needs an in-code comment to warn others against
"fixing" the lack of cleanups (e.g. "don't clean state here, because
the caller still wants to see what state we were for this and that
reason").

>>> + if (bisect_next_check(terms, terms->term_good.buf))
>>> + return -1;
>>
>> Mental note.  The "autostart" in the original is gone.  Perhaps it
>> is done by next_check in this code, but we haven't seen that yet.
>
> This will be added back again in a coming patch[1].

In other words, this series is broken at this step and the breakage
stay with the codebase until a later step?

I do not know if reordering the patches in the series is enough to
fix that, or if it is even worth to avoid such a temporary breakage.
It depends on how serious the circularity is, but I would understand
if it is too hard and not worth the effort (I think in a very early
review round some people advised against the bottom-up rewrite
because they anticipated this exact reason).  At least the omission
(and later resurrection) needs to be mentioned in the log so that
people understand what is going on when they later need to bisect.

Thanks.


Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests

2016-08-30 Thread Junio C Hamano
Jeff King  writes:

> Hmm, interesting. Your approach seems reasonable, but I have to wonder
> if writing the pid in the first place is sane.
>
> I started to write up my reasoning in this email, but realized it was
> rapidly becoming the content of a commit message. So here is that
> commit.

Sounds sensible; if this makes Dscho's "which ones failed in the
previous run" simpler, that is even better ;-)



Re: git am and duplicate signatures

2016-08-30 Thread Junio C Hamano
Joe Perches  writes:

> On Tue, 2016-08-30 at 11:17 -0700, Junio C Hamano wrote:
>> Joe Perches  writes:
>> 
>> > 
>> > On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote:
>> > > 
>> > > Maybe something like traces or chains.
>> > Or "taggers" or "tagged-bys"
>> I am afraid that you are way too late; the ship has already sailed a
>> few years ago, if not earlier, I think.
>
> What's the ship's name?  Is it footers or trailers?

I think we casually call them footers (as they are counter-part to
"headers"), or trailers (probably more official as that is half of
the name of the subsystem that is supposed to deal with them).



Re: [PATCH v6 12/13] convert: add filter.<driver>.process option

2016-08-30 Thread Junio C Hamano
Lars Schneider  writes:

>> This part of the document is well-written to help filter-writers.
>
> Thanks!

Don't thank me; thank yourself and reviewers of the previous rounds.


> The filter can exit right after the "error-all". If the filter does
> not exit then Git will kill the filter. I'll add this to the docs.

OK.

> "abort" could be ambiguous because it could be read as "abort only
> this file". "abort-all" would work, though. Would you prefer to see
> "error" replaced by "abort" and "error-all" by "abort-all"?

No.

I was primarily reacting to "-all" part, so anything that ends with
"-all" is equally ugly from my point of view and not an improvement.

As I said, "error-all" as long as other reviewers are happy with is
OK by me, too.

> A filter that dies during communication or does not adhere to the protocol
> is a faulty filter. Feeding the faulty filter after restart with the same 
> blob would likely cause the same error. 

Why does Git restart it and continue with the rest of the blobs
then?  Is there a reason to believe it may produce correct results
for other blobs if you run it again?

> B) If we communicate "shutdown" to the filter then we need to give the
>filter some time to perform the exit before the filter is killed on
>Git exit. I wasn't able to come up with a good answer how long Git 
>should wait for the exit.

Would that be an issue?  A filter is buggy if told to shutdown,
ignores the request and hangs around forever.  I do not think we
even need to kill it.  It is not Git's problem.

I personally would be reluctant to become a filter process writer if
the system it will be talking to can kill it without giving it a
chance to do a graceful exit, but perhaps that is just me.  I don't
know if closing the pipe going there where you are having Git to
kill the process on the other end is any more involved than what you
have without extra patches.

>>> +Please note that you cannot use an existing `filter.<driver>.clean`
>>> +or `filter.<driver>.smudge` command with `filter.<driver>.process`
>>> +because the former two use a different inter process communication
>>> +protocol than the latter one.
>> 
>> Would it be a useful sample program we can ship in contrib/ if you
>> created a "filter adapter" that reads these two configuration
>> variables and act as a filter.<driver>.process?
>
> You mean a v2 filter that would use v1 filters under the hood?
> If we would drop v1, then this would be useful. Otherwise I don't
> see any need for such a thing.

I meant it as primarily an example people can learn from when they
want to write their own.


Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

2016-08-30 Thread Pranit Bauva
Hey Junio,

On Tue, Aug 30, 2016 at 11:55 PM, Pranit Bauva  wrote:
>
>>> @@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int 
>>> *rev_nr)
>>>   return rev;
>>>  }
>>>
>>> -static void handle_bad_merge_base(void)
>>> +static int handle_bad_merge_base(void)
>>>  {
>>>   if (is_expected_rev(current_bad_oid)) {
>>>   char *bad_hex = oid_to_hex(current_bad_oid);
>>> @@ -750,17 +756,18 @@ static void handle_bad_merge_base(void)
>>>   "between %s and [%s].\n"),
>>>   bad_hex, term_bad, term_good, bad_hex, 
>>> good_hex);
>>>   }
>>> - exit(3);
>>> + return 3;
>>>   }
>>>
>>>   fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n"
>>>   "git bisect cannot work properly in this case.\n"
>>>   "Maybe you mistook %s and %s revs?\n"),
>>>   term_good, term_bad, term_good, term_bad);
>>> - exit(1);
>>> + bisect_clean_state();
>>> + return 1;
>>
>> What is the logic behind this function sometimes clean the state,
>> and some other times do not, when it makes an error-return?  We see
>> above that "return 3" codepath leaves the state behind.
>>
>> Either you forgot a necessary clean_state in "return 3" codepath,
>> or you forgot to document why the distinction exists in the in-code
>> comment for the function.  I cannot tell which, but I am leaning
>> towards guessing that it is the former.
>
> This is a very tricky one. I have purposely not included this after a
> lot of testing. I have hand tested with the original git and with this
> branch. The reason why anyone wouldn't be able to catch this is
> because its not covered in the test suite. I am including a patch with
> this as an attachment (because I am behind a proxy right now but don't
> worry I will include this as a commit in the next series). The
> original behaviour of git does not clean the bisect state when this
> situation occurs. On another note which you might have missed that
> bisect_clean_state() is purposely put before return 1 which is covered
> by the test suite. You can try removing it and see that there is a
> broken test. I was thinking of including the tests after the whole
> conversion but now I think including this before will make the
> conversion easier to review.

The patch which I included as an attachment before will fail for
different reasons if you apply it on master branch. To test it on
master branch use the current attachment. Again sorry I couldn't
include this in the email itself.

Regards,
Pranit Bauva


out3
Description: Binary data


Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const char 
> *value, void *data)
>   opts->allow_ff = git_config_bool_or_int(key, value, 
> &error_flag);
>   else if (!strcmp(key, "options.mainline"))
>   opts->mainline = git_config_int(key, value);
> - else if (!strcmp(key, "options.strategy"))
> + else if (!strcmp(key, "options.strategy")) {
>   git_config_string(&opts->strategy, key, value);
> - else if (!strcmp(key, "options.gpg-sign"))
> + sequencer_entrust(opts, (char *) opts->strategy);
> + }
> + else if (!strcmp(key, "options.gpg-sign")) {
>   git_config_string(&opts->gpg_sign, key, value);
> + sequencer_entrust(opts, (char *) opts->gpg_sign);
> + }
>   else if (!strcmp(key, "options.strategy-option")) {
>   ALLOC_GROW(opts->xopts, opts->xopts_nr + 1, opts->xopts_alloc);
> - opts->xopts[opts->xopts_nr++] = xstrdup(value);
> + opts->xopts[opts->xopts_nr++] =
> + sequencer_entrust(opts, xstrdup(value));
>   } else
>   return error(_("Invalid key: %s"), key);

Hmm.

I would have expected a call to sequencer_opts_clear(&opts) once the
machinery is done with the options structure, and among these places
where an old value in opts->field is overwritten by a new one would
get

free(opt->field); opt->field = ... new value ...;

Perhaps there was a good reason to do it this way (one valid reason
may be that there is _no_ good place to declare that opts is now
done and it is safe to call sequencer_opts_clear() on it), but this
looks backwards from the way things are usually done.



Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data

2016-08-30 Thread Junio C Hamano
Jakub Narębski  writes:

> In my personal opinion 'set_me_free_after_use' is not the best name,
> but I unfortunately do not have a better proposal.  Maybe 'entrust_ptr',
> or 'entrusted_data' / 'entrusted_ptr' / 'entrusted'?

Is this to accumulate to-be-freed pointers?

I think we often call a local variable that points at a piece of
memory to be freed "to_free", and that is an appropriate name for
what this function is trying to do.

It is a bit surprising that the careless memory management in this
codepath leaks only the dumb pieces of memory (as opposed to
pointers to structures like string list that needs _clear()
functions, in which case we cannot get away with list of
to-be-freed).  I guess we were somewhat lucky ;-)


Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

2016-08-30 Thread Pranit Bauva
Hey Junio,

Sorry for a late reply.

On Fri, Aug 26, 2016 at 2:00 AM, Junio C Hamano  wrote:
> Pranit Bauva  writes:
>
>> A lot of parts of bisect.c uses exit() and these signals are then
>> trapped in the `bisect_start` function. Since the shell script ceases
>> its existence it would be necessary to convert those exit() calls to
>> return statements so that errors can be reported efficiently in C code.
>
> Is efficiency really an issue?  I think the real reason is that it
> would make it impossible for the callers to handle errors, if you do
> not convert and let the error codepaths exit().

I think I put the word "efficiently" wrongly over here. Will omit it.

>> @@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int 
>> *rev_nr)
>>   return rev;
>>  }
>>
>> -static void handle_bad_merge_base(void)
>> +static int handle_bad_merge_base(void)
>>  {
>>   if (is_expected_rev(current_bad_oid)) {
>>   char *bad_hex = oid_to_hex(current_bad_oid);
>> @@ -750,17 +756,18 @@ static void handle_bad_merge_base(void)
>>   "between %s and [%s].\n"),
>>   bad_hex, term_bad, term_good, bad_hex, 
>> good_hex);
>>   }
>> - exit(3);
>> + return 3;
>>   }
>>
>>   fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n"
>>   "git bisect cannot work properly in this case.\n"
>>   "Maybe you mistook %s and %s revs?\n"),
>>   term_good, term_bad, term_good, term_bad);
>> - exit(1);
>> + bisect_clean_state();
>> + return 1;
>
> What is the logic behind this function sometimes clean the state,
> and some other times do not, when it makes an error-return?  We see
> above that "return 3" codepath leaves the state behind.
>
> Either you forgot a necessary clean_state in "return 3" codepath,
> or you forgot to document why the distinction exists in the in-code
> comment for the function.  I cannot tell which, but I am leaning
> towards guessing that it is the former.

This is a very tricky one. I have purposely not included this after a
lot of testing. I have hand tested with the original git and with this
branch. The reason why anyone wouldn't be able to catch this is
because its not covered in the test suite. I am including a patch with
this as an attachment (because I am behind a proxy right now but don't
worry I will include this as a commit in the next series). The
original behaviour of git does not clean the bisect state when this
situation occurs. On another note which you might have missed that
bisect_clean_state() is purposely put before return 1 which is covered
by the test suite. You can try removing it and see that there is a
broken test. I was thinking of including the tests after the whole
conversion but now I think including this before will make the
conversion easier to review.

>> -static void check_good_are_ancestors_of_bad(const char *prefix, int 
>> no_checkout)
>> +static int check_good_are_ancestors_of_bad(const char *prefix, int 
>> no_checkout)
>>  {
>>   char *filename = git_pathdup("BISECT_ANCESTORS_OK");
>>   struct stat st;
>> - int fd;
>> + int fd, res = 0;
>>
>>   if (!current_bad_oid)
>>   die(_("a %s revision is needed"), term_bad);
>
> Can you let it die yere?

Not really. I should change it to return error().

>> @@ -873,8 +890,11 @@ static void check_good_are_ancestors_of_bad(const char 
>> *prefix, int no_checkout)
>> filename);
>>   else
>>   close(fd);
>> +
>> + return 0;
>>   done:
>>   free(filename);
>> + return 0;
>>  }
>
> Who owns "filename"?  The first "return 0" leaves it unfreed, and
> when "goto done" is done, it is freed.
>
> The above two may indicate that "perhaps 'retval + goto finish'
> pattern?" is a really relevant suggestion for the earlier steps in
> this series.

Yes.

>>   if (!all) {
>>   fprintf(stderr, _("No testable commit found.\n"
>>   "Maybe you started with bad path parameters?\n"));
>> - exit(4);
>> + return 4;
>>   }
>>
>>   bisect_rev = revs.commits->item->object.oid.hash;
>>
>>   if (!hashcmp(bisect_rev, current_bad_oid->hash)) {
>> - exit_if_skipped_commits(tried, current_bad_oid);
>> + res = exit_if_skipped_commits(tried, current_bad_oid);
>> + if (res)
>> + return res;
>> +
>>   printf("%s is the first %s commit\n", sha1_to_hex(bisect_rev),
>>   term_bad);
>>   show_diff_tree(prefix, revs.commits->item);
>>   /* This means the bisection process succeeded. */
>> - exit(10);
>> + return 10;
>>   }
>>
>>   nr = all - reaches - 1;
>> @@ -1005,7 +1033,11 @@ int bisect_next_all(const char *prefix, int 
>> no_checkout)
>> "Bisecting: %d revisions left to t

Re: git am and duplicate signatures

2016-08-30 Thread Joe Perches
On Tue, 2016-08-30 at 11:17 -0700, Junio C Hamano wrote:
> Joe Perches  writes:
> 
> > 
> > On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote:
> > > 
> > > Maybe something like traces or chains.
> > Or "taggers" or "tagged-bys"
> I am afraid that you are way too late; the ship has already sailed a
> few years ago, if not earlier, I think.

What's the ship's name?  Is it footers or trailers?



Re: Notation for current branch?

2016-08-30 Thread Junio C Hamano
ryenus  writes:

> For now the best use case I can think of is with git-reflog, e.g.,
> the meaning of `git reflog HEAD` and `git reflog feature-branch`
> are quite different, even if I'm currently on the feature-branch,
> especially when I want to track the rebase histories (if any).

"git reflog" gives you the reflog of HEAD in numbered format
(HEAD@{1}, HEAD@{2}, etc.), and "git reflog HEAD@{now}" is an
interesting way to tell it "I want that same HEAD reflog, but in
timed rather than numbered format".

"git reflog @{0}" and "git reflog @{now}" give you the reflog of the
current branch in these formats.



Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Tue, 30 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> > +static char **read_author_script(void)
> >> > +{
> >> > +struct strbuf script = STRBUF_INIT;
> >> > +int i, count = 0;
> >> > +char *p, *p2, **env;
> >> > +size_t env_size;
> >> > +
> >> > +if (strbuf_read_file(&script, rebase_path_author_script(), 256) 
> >> > <= 0)
> >> > +return NULL;
> >> > +
> >> > +for (p = script.buf; *p; p++)
> >> > +if (skip_prefix(p, "'''", (const char **)&p2))
> >> > +strbuf_splice(&script, p - script.buf, p2 - p, 
> >> > "'", 1);
> >> > +else if (*p == '\'')
> >> > +strbuf_splice(&script, p-- - script.buf, 1, "", 
> >> > 0);
> >> > +else if (*p == '\n') {
> >> > +*p = '\0';
> >> > +count++;
> >> > +}
> >> 
> >> Hmph.  What is this loop doing?  Is it decoding a sq-quoted buffer
> >> or something?  Don't we have a helper function to do that?
> >
> > Well, it is not just decoding an sq-quoted buffer, but several lines with
> > definitions we sq-quoted ourselves, individually.
> >
> > The quote.[ch] code currently has no code to dequote lines individually.
> 
> There is a function with exactly the same name in builtin/am.c and I
> assume that it is reading from a file with the same format, which
> uses a helper to read one variable line at a time.  Hopefully that
> can be refactored so that more parsing is shared between the two
> users.
> 
> Two functions with the same name reading from the same format, even
> when they expect to produce the same result in different internal
> format, without even being aware of each other is a bad enough
> "regression" in maintainability of the code.  One of them not even
> using sq_dequote() helper and reinventing is even worse.

First of all, builtin/am's read_author_script() really, really, really
only wants to read stuff into the am_state struct.

That alone is already so incompatible that I do not think it can be
repaired.

Further, builtin/am's read_author_script() reads *only* the
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE lines (opening the
file three times for the task). And then it *requires* the corresponding
values to be sq-quoted.

I do not have a use case for this myself, but rebase -i explicitly eval's
the author script, so it is conceivable that some enterprisey user makes
use of this feature to set other environment variables. The thing is that
rebase -i cannot necessarily expect all of those files in its state
directory to be under tight control -- in stark contrast to git-am.

I could imagine that this could be fixed by abstracting the functionality
more, and use your favored sq_dequote() (which may actually fail in case
of an enterprisey usage as outlined above), and adapting builtin/am's
read_author_script() to make use of that refactored approach.

This is a huge task, make no mistake, in particular because we need to
ensure compatibility with non-standard usage. So I do not think I can
tackle that any time soon. Certainly not as part of an iteration of this
here patch series.
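For the curious, the format in question reduced to a standalone
dequoter (git's production helper is sq_dequote() in quote.c; this
in-place sketch only handles a single well-formed value):

```c
#include <string.h>

/* In-place dequote of one sq-quoted value: strip the wrapping single
 * quotes and turn each '\'' escape back into a literal quote. */
static char *sq_dequote_sketch(char *s)
{
	char *src = s, *dst = s;

	if (*src++ != '\'')
		return NULL;
	for (;;) {
		if (*src == '\'') {
			src++;
			if (!strncmp(src, "\\''", 3)) {
				*dst++ = '\'';	/* '\'' -> ' */
				src += 3;
			} else if (!*src) {
				*dst = '\0';	/* closing quote */
				return s;
			} else
				return NULL;	/* text after closing quote */
		} else if (!*src) {
			return NULL;	/* unterminated */
		} else {
			*dst++ = *src++;
		}
	}
}
```

So `'Jane O'\''Brien'` comes back as `Jane O'Brien`, and anything
malformed is rejected rather than half-parsed.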

Ciao,
Dscho


Re: git am and duplicate signatures

2016-08-30 Thread Junio C Hamano
Joe Perches  writes:

> On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote:
>> Maybe something like traces or chains.
>
> Or "taggers" or "tagged-bys"

I am afraid that you are way too late; the ship has already sailed a
few years ago, if not earlier, I think.



Re: Notation for current branch?

2016-08-30 Thread ryenus
On 30 August 2016 at 03:49, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>>> What's wrong with simply using 'HEAD' for scripting?
>>
>> When you want to display the current branch to the user, e.g. when
>> scripting a shell prompt or similar use
>
> Wait.  Even if a hypothetical version of Git understood @@ as "the
> current branch", how would you use that notation, instead of HEAD,
> to decide what to "display the current branch to the user"?
>
> You obviously will not be doing
>
> echo @@
>
> You would be doing something around "git symbolic-ref" at the very
> least, and wouldn't you be feeding HEAD to that symbolic-ref anyway
> while doing so?
>

For now the best use case I can think of is with git-reflog, e.g.,
the meaning of `git reflog HEAD` and `git reflog feature-branch`
are quite different, even if I'm currently on the feature-branch,
especially when I want to track the rebase histories (if any).

If there's a notation for the current branch then I don't have to
find the name of the current branch, then copy/paste it.
However, in this case, maybe git-reflog can have a `-b` option to
show the reflog of the current branch (if on any)?

- ryenus


Re: [PATCH 2/6] pull: make code more similar to the shell script again

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > +static int require_clean_work_tree(const char *action, const char *hint,
> > +   int gently)
> >  {
> > struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file));
> > -   int do_die = 0;
> > +   int err = 0;
> >  
> > hold_locked_index(lock_file, 0);
> > refresh_cache(REFRESH_QUIET);
> > @@ -376,20 +377,26 @@ static void die_on_unclean_work_tree(void)
> > rollback_lock_file(lock_file);
> >  
> > if (has_unstaged_changes()) {
> > -   error(_("Cannot pull with rebase: You have unstaged changes."));
> > -   do_die = 1;
> > +   error(_("Cannot %s: You have unstaged changes."), action);
> > ...
> > if (!autostash)
> > -   die_on_unclean_work_tree();
> > +   require_clean_work_tree("pull with rebase",
> > +   "Please commit or stash them.", 0);
> >  
> > if (get_rebase_fork_point(rebase_fork_point, repo, *refspecs))
> > hashclr(rebase_fork_point);
> 
> Splicing an English/C phrase 'pull with rebase' into a
> _("localizable %s string") makes the life of i18n team hard.

Hrm.

> Can we do this differently?

Sure, but not at this stage. Because...

> If you are eventually going to expose this function as public API, I
> think the right approach would be to enumerate the possible error
> conditions this function can diagnose and return them to the caller,
> i.e.
> 
> #define WT_STATUS_DIRTY_WORKTREE 01
> #define WT_STATUS_DIRTY_INDEX    02
> 
> static int require_clean_work_tree(void)
> {
>   int status = 0;
>   ...
> if (has_unstaged_changes())
>   status |= WT_STATUS_DIRTY_WORKTREE;
>   if (has_uncommitted_changes())
>   status |= WT_STATUS_DIRTY_INDEX;
>   return status;
> }
> 
> Then die_on_unclean_work_tree() can be made as a thin-wrapper that
> calls it and shows the pull-specific error message.

This sounds like a good plan, if involved. At this stage, I am really
unwilling to introduce such extensive changes, for fear of introducing
regressions.

I will keep it in mind and make those changes, once Git for Windows
v2.10.0 is out.

Ciao,
Dscho


Re: git submodules implementation question

2016-08-30 Thread Junio C Hamano
Uma Srinivasan  writes:

> I think the following fix is still needed to is_submodule_modified():
>
> strbuf_addf(&buf, "%s/.git", path);
> git_dir = read_gitfile(buf.buf);
> if (!git_dir) {
> git_dir = buf.buf;
>  ==>   if (!is_git_directory(git_dir)) {
>  ==> die("Corrupted .git dir in submodule %s", path);
>  ==>   }
> }

If it is so important that git_dir is a valid Git
repository after

git_dir = read_gitfile(buf.buf);
if (!git_dir)
git_dir = buf.buf;

is done to determine what "git_dir" to use, it seems to me that it
does not make much sense to check ONLY dir/.git that is a directory
and leave .git/modules/$name that dir/.git file points at unchecked.

But there is much bigger problem with the above addition, I think.
There also can be a case where dir/ does not even have ".git" in it.
A submodule the user is not interested in will just have an empty
directory there, and immediately after the above three lines I
reproduced above, we have this

if (!is_directory(git_dir)) {
strbuf_release(&buf);
return 0;
}

The added check will break the use case.  If anything, that check,
if this code needs to verify that "git_dir" points at a valid Git
repository, should come _after_ that.

Shouldn't "git-status --porcelain" run in the submodule notice that
it is not a valid repository and quickly die anyway?  Why should we
even check before spawning that process in the first place?

I might suggest to update prepare_submodule_repo_env() so that the
spawned process will *NOT* have to guess where the working tree and
repository are, by exporting GIT_DIR (set to "git_dir" we discover above)
and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
working tree of the submodule).  That would stop "git status" from
guessing (and from being fooled by a corrupted dir/.git that is not a
git repository).
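(A standalone sketch of what that would amount to -- the static buffer
and the helper name are illustrative only; git's own code would go
through the child_process env machinery rather than a raw array:)

```c
#include <stdio.h>
#include <stddef.h>

/* Build an explicit environment for the spawned "git status" so it
 * does not have to discover its repository. */
static int make_child_env(const char *env[], size_t n, const char *git_dir)
{
	static char gitdir[4096];

	if (n < 3)
		return -1;
	snprintf(gitdir, sizeof(gitdir), "GIT_DIR=%s", git_dir);
	env[0] = gitdir;
	env[1] = "GIT_WORK_TREE=.";
	env[2] = NULL;
	return 0;
}
```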



Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Johannes Schindelin
Hi Kuba,

On Tue, 30 Aug 2016, Jakub Narębski wrote:

> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:
> 
> > The sequencer reads options from disk and stores them in its struct
> > for use during sequencer's operations.
> > 
> > With this patch, the memory is released afterwards, plugging a
> > memory leak.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/sequencer.c b/sequencer.c
> > index b5be0f9..8d79091 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct 
> > replay_opts *opts)
> > free(opts->owned[i]);
> > free(opts->owned);
> >  
> > +   free(opts->xopts);
> > +
> 
> This looks like independent change, not related to using the
> sequencer_entrust() to store options read from disk in replay_opts
> struct to be able to free memory afterwards.
> 
> I guess you wanted to avoid one line changes...

Actually, it is not an independent change, but it free()s memory that has
been allocated while reading the options, as the commit message says ;-)

> > @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const 
> > char *value, void *data)
> 
> Sidenote: this patch would be easier to read if lines were reordered
> as below, but I don't think any slider heuristics could help achieve
> that automatically.  Also, the patch might be invalid...
> 
> > opts->allow_ff = git_config_bool_or_int(key, value, 
> > &error_flag);
> > else if (!strcmp(key, "options.mainline"))
> > opts->mainline = git_config_int(key, value);
> > -   else if (!strcmp(key, "options.strategy"))
> > +   else if (!strcmp(key, "options.strategy")) {
> > git_config_string(&opts->strategy, key, value);
> > +   sequencer_entrust(opts, (char *) opts->strategy);
> 
> I wonder if the ability to free strings dup-ed by git_config_string()
> be something that is part of replay_opts, or rather remove_sequencer_state(),
> that is a list of
> 
>   free(opts->strategy);
>   free(opts->gpg_sign);

That is not necessarily possible because of the way the sequencer
works: the options may not actually have been read from the file, but
may be populated by the caller (in which case we do not necessarily
want to require strdup()ing the strings just so that the sequencer can
clean things up afterwards).

> Though... free(NULL) is nop as per standard, but can we rely on it?

We can, and we do.

> The *_entrust() mechanism is more generic, but do we use this general-ness?
> Well, it could be xstrdup or git_config_string doing entrust'ing...

Right, but that is exactly what I wanted to avoid, because it is rather
inelegant to strdup() strings just so that we do not have to record what
to free() and what not to free().

BTW I have no objection at all to generalize this sequencer_entrust()
mechanism further (read: to other, similar use cases), should it withstand
the test of time.

Ciao,
Johannes

Re: git am and duplicate signatures

2016-08-30 Thread Joe Perches
On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote:
> Maybe something like traces or chains.

Or "taggers" or "tagged-bys"



Re: git submodules implementation question

2016-08-30 Thread Uma Srinivasan
Thanks for the patch. Unfortunately, it doesn't help in my case as it
invokes the is_submodule_modified() routine which you didn't modify.
Here's my call trace

#0  is_submodule_modified (path=path@entry=0x17c2f08 "groc", ignore_untracked=0)
at submodule.c:939
#1  0x004aa4dc in match_stat_with_submodule (
diffopt=diffopt@entry=0x7fffde18, ce=ce@entry=0x17c2eb0,
st=st@entry=0x7fffd840, ce_option=ce_option@entry=0,
dirty_submodule=dirty_submodule@entry=0x7fffd83c) at diff-lib.c:81
#2  0x004ab4f5 in run_diff_files (revs=revs@entry=0x7fffd920,
option=option@entry=0) at diff-lib.c:217
#3  0x0054c0d4 in wt_status_collect_changes_worktree
(s=s@entry=0x7de280 )
at wt-status.c:559
#4  0x0054ecf6 in wt_status_collect (s=s@entry=0x7de280 )
at wt-status.c:678
#5  0x00422171 in cmd_status (argc=,
argv=,
prefix=0x0) at builtin/commit.c:1390
#6  0x00405abe in run_builtin (argv=,
argc=,
p=) at git.c:352
#7  handle_builtin (argc=1, argv=0x7fffe570) at git.c:551
#8  0x00405dd8 in run_argv (argv=0x7fffe320, argcp=0x7fffe32c)
at git.c:606
#9  cmd_main (argc=1, argc@entry=2, argv=0x7fffe570,
argv@entry=0x7fffe568)
at git.c:678
#10 0x00405060 in main (argc=2, argv=0x7fffe568) at common-main.c:40


I think the following fix is still needed to is_submodule_modified():

strbuf_addf(&buf, "%s/.git", path);
git_dir = read_gitfile(buf.buf);
if (!git_dir) {
git_dir = buf.buf;
 ==>   if (!is_git_directory(git_dir)) {
 ==> die("Corrupted .git dir in submodule %s", path);
 ==>   }
}


Thanks,
Uma

On Mon, Aug 29, 2016 at 11:23 PM, Jacob Keller  wrote:
> On Mon, Aug 29, 2016 at 11:09 PM, Jacob Keller  wrote:
>> On Mon, Aug 29, 2016 at 5:12 PM, Uma Srinivasan  
>> wrote:
>>> This is great! Thanks Jake. If you happen to have the patch ID it
>>> would be helpful.
>>>
>>> Uma
>>>
>>
>> http://public-inbox.org/git/1472236108.28343.5.ca...@intel.com/
>
>
> Actually correct patch is
> http://public-inbox.org/git/20160825233243.30700-6-jacob.e.kel...@intel.com/
>
> Thanks,
> Jake


Re: git am and duplicate signatures

2016-08-30 Thread Joe Perches
On Tue, 2016-08-30 at 10:34 -0700, Junio C Hamano wrote:
> Joe Perches  writes:
> > On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote:
> > > 
> > > Support for more generic footers was supposed to come when the
> > > "interpret-trailers" topic started, but the author of the topic
> > > seems to have lost interest before the mechanism has become ready to
> > > be integrated in the workflow commands like "am", "commit", "rebase"
> > > etc., which is unfortunate.
> > I think adding at least an option to git send-email
> > allowing auto-cc's for all
> > "-by: [name] "
> > lines in the commit log would be useful.
> > 
> > Today, only "Signed-off-by" and "CC" lines are
> > added to cc's.
> > 
> > I've always called these lines "-by:" lines
> > "signatures", but perhaps there's a better name.
> I think we casually call them footers (as they are counter-part to
> "headers"), or trailers.

I think they are neither footers, which would relate
more to the email headers, nor trailers.

Maybe something like traces or chains.

btw: I submitted this awhile ago

http://www.spinics.net/lists/git/msg162269.html



Re: [PATCH 07/22] sequencer: future-proof read_populate_todo()

2016-08-30 Thread Johannes Schindelin
Hi Kuba,

On Tue, 30 Aug 2016, Jakub Narębski wrote:

> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:
> 
> > Over the next commits, we will work on improving the sequencer to the
> > point where it can process the edit script of an interactive rebase. To
> > that end, we will need to teach the sequencer to read interactive
> > rebase's todo file. In preparation, we consolidate all places where
> > that todo file is needed to call a function that we will later extend.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 18 +++---
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> > 
> > diff --git a/sequencer.c b/sequencer.c
> > index 8d79091..982b6e9 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -32,6 +32,11 @@ static const char *get_dir(const struct replay_opts 
> > *opts)
> > return git_path_seq_dir();
> >  }
> >  
> > +static const char *get_todo_path(const struct replay_opts *opts)
> > +{
> > +   return git_path_todo_file();
> > +}
> 
> I guess that in the future commit the return value of get_todo_path()
> would change depending on what sequencer is used for, cherry-pick or
> interactive rebase, that is, contents of replay_opts...

Right.

> >  static int is_rfc2822_line(const char *buf, int len)
> >  {
> > int i;
> > @@ -772,25 +777,24 @@ static int parse_insn_buffer(char *buf, struct commit_list **todo_list,
> >  static int read_populate_todo(struct commit_list **todo_list,
> > struct replay_opts *opts)
> >  {
> > +   const char *todo_file = get_todo_path(opts);
> 
> ...and that's why you have added this temporary variable here, to
> not repeat get_todo_path(opts) calculations...

... and to repeat only 9 characters instead of 19...

> > -   fd = open(git_path_todo_file(), O_RDONLY);
> > +   fd = open(todo_file, O_RDONLY);
> > if (fd < 0)
> > -   return error_errno(_("Could not open %s"),
> > -  git_path_todo_file());
> > +   return error_errno(_("Could not open %s"), todo_file);
> 
> ... So that's why it is s/git_path_todo_file()/todo_file/ replacement,
> and not simply...
> 
> > @@ -1064,7 +1068,7 @@ static int sequencer_continue(struct replay_opts 
> > *opts)
> >  {
> > struct commit_list *todo_list = NULL;
> >  
> > -   if (!file_exists(git_path_todo_file()))
> > +   if (!file_exists(get_todo_path(opts)))
> 
> ...the s/git_path_todo_file()/git_todo_path(opts)/, isn't it?

Correct.

> Looks good; though I have not checked if all calling sites were converted.

Thanks for the review!
Johannes

Re: git am and duplicate signatures

2016-08-30 Thread Junio C Hamano
Joe Perches  writes:

> On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote:
>> Support for more generic footers was supposed to come when the
>> "interpret-trailers" topic started, but the author of the topic
>> seems to have lost interest before the mechanism has become ready to
>> be integrated in the workflow commands like "am", "commit", "rebase"
>> etc., which is unfortunate.
>
> I think adding at least an option to git send-email
> allowing auto-cc's for all
>   "-by: [name] "
> lines in the commit log would be useful.
>
> Today, only "Signed-off-by" and "CC" lines are
> added to cc's.
>
> I've always called these lines "-by:" lines
> "signatures", but perhaps there's a better name.

I think we casually call them footers (as they are counter-part to
"headers"), or trailers.


Re: git am and duplicate signatures

2016-08-30 Thread Junio C Hamano
Joe Perches  writes:

> (adding lkml)
>
> On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote:
>> Joe Perches  writes:
>> > git-am -s will avoid duplicating the last signature
>> > in a patch.
>> > 
>> > But given a developer creates a patch, send it around for
>> > acks/other signoffs, collects signatures and then does
>> > a git am -s on a different branch, this sort of sign-off
>> > chain is possible:
>> > 
>> >Signed-off-by: Original Developer 
>> >Acked-by: Random Developer 
>> >Signed-off-by: Original Developer 
>> Both correct and allowing the earlier one duplicated as long as
>> there is somebody/something else in between is deliberate.
>
> linux-kernel has a script (scripts/checkpatch.pl) that
> looks for duplicate signatures (-by: [name] )
>
> Should the last Signed-off-by: in the commit log be
> excluded from this check?

That is left for the kernel folks to decide, but excluding only "the
last" does not make much sense to me.  If you look for only "two
consecutive same signatures" and barf, that would be in line with
what we have been shooting for to support the above "original then
random then back to original" example you gave us above.
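The rule described above (flag only two consecutive identical signatures, so the "original, random, original" chain stays legal) can be sketched as a plain pipeline. This is a hypothetical checkpatch-style filter for illustration, not anything git itself runs:

```shell
# uniq -d prints a line only when it is immediately repeated,
# which matches the "two consecutive same signatures" rule.
printf '%s\n' \
        'Signed-off-by: Original Developer' \
        'Acked-by: Random Developer' \
        'Signed-off-by: Original Developer' \
        'Signed-off-by: Original Developer' |
grep -E '^[A-Za-z-]+-by:' | uniq -d
# -> Signed-off-by: Original Developer
```

The first and third lines are identical but not adjacent, so only the repeated fourth line is flagged.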




Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality

2016-08-30 Thread Junio C Hamano
Johannes Schindelin  writes:

>> > +static char **read_author_script(void)
>> > +{
>> > +  struct strbuf script = STRBUF_INIT;
>> > +  int i, count = 0;
>> > +  char *p, *p2, **env;
>> > +  size_t env_size;
>> > +
>> > +  if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
>> > +  return NULL;
>> > +
>> > +  for (p = script.buf; *p; p++)
>> > +  if (skip_prefix(p, "'''", (const char **)&p2))
>> > +  strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
>> > +  else if (*p == '\'')
>> > +  strbuf_splice(&script, p-- - script.buf, 1, "", 0);
>> > +  else if (*p == '\n') {
>> > +  *p = '\0';
>> > +  count++;
>> > +  }
>> 
>> Hmph.  What is this loop doing?  Is it decoding a sq-quoted buffer
>> or something?  Don't we have a helper function to do that?
>
> Well, it is not just decoding an sq-quoted buffer, but several lines with
> definitions we sq-quoted ourselves, individually.
>
> The quote.[ch] code currently has no code to dequote lines individually.

There is a function with exactly the same name in builtin/am.c and I
assume that it is reading from a file with the same format, which
uses a helper to read one variable line at a time.  Hopefully that
can be refactored so that more parsing is shared between the two
users.

Two functions with the same name reading from the same format, even
when they expect to produce the same result in different internal
format, without even being aware of each other is a bad enough
"regression" in maintainability of the code.  One of them not even
using sq_dequote() helper and reinventing is even worse.
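For reference, the author-script being read here stores one sq-quoted assignment per line, with an embedded quote escaped as '\''. Undoing that for a single line can be sketched in shell; this illustrates the format the loop above decodes, not the actual sq_dequote() implementation:

```shell
# Undo sq-quoting for one "NAME='value'" line: strip the outer
# quotes, then collapse the '\'' escape back to a literal quote.
dequote_line () {
        sed "s/^[^=]*='//; s/'\$//; s/'\\\\''/'/g"
}
echo "GIT_AUTHOR_NAME='O'\''Brien'" | dequote_line
# -> O'Brien
```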


Re: git am and duplicate signatures

2016-08-30 Thread Joe Perches
On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote:
> Support for more generic footers was supposed to come when the
> "interpret-trailers" topic started, but the author of the topic
> seems to have lost interest before the mechanism has become ready to
> be integrated in the workflow commands like "am", "commit", "rebase"
> etc., which is unfortunate.

I think adding at least an option to git send-email
allowing auto-cc's for all
"-by: [name] "
lines in the commit log would be useful.

Today, only "Signed-off-by" and "CC" lines are
added to cc's.

I've always called these lines "-by:" lines
"signatures", but perhaps there's a better name.

Any preference?

from git send-email --help
       --suppress-cc=<category>
   Specify an additional category of recipients to suppress the
   auto-cc of:

   ·   author will avoid including the patch author

   ·   self will avoid including the sender

   ·   cc will avoid including anyone mentioned in Cc lines in the
   patch header except for self (use self for that).

   ·   bodycc will avoid including anyone mentioned in Cc lines in the
   patch body (commit message) except for self (use self for
   that).

   ·   sob will avoid including anyone mentioned in Signed-off-by
   lines except for self (use self for that).

   ·   cccmd will avoid running the --cc-cmd.

   ·   body is equivalent to sob + bodycc

   ·   all will suppress all auto cc values.
> > 
> > sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP
> Yes, I think this is one of the warts we talked about getting rid of
> but haven't got around to it.  It is there because "format-patch -s"
> was incorrectly written to dedup Signed-off-by: from anywhere in its
> early implementation and to keep the same behaviour.  We should drop
> that flag from append_signoff() function.
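The proposal above boils down to cc'ing every "*-by:" trailer instead of only Signed-off-by and Cc; as plain text processing, that selection is just the following (a hypothetical sketch, not send-email's actual parser):

```shell
# Pick out every "-by:" trailer, whatever its prefix; Cc lines
# are already handled by send-email and are excluded here.
printf '%s\n' \
        'Signed-off-by: Original Developer' \
        'Acked-by: Random Developer' \
        'Cc: Someone Else' |
grep -Ei '^[a-z][a-z-]*-by:'
# -> Signed-off-by: Original Developer
# -> Acked-by: Random Developer
```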


Re: [PATCH v2 08/14] sequencer: lib'ify read_and_refresh_cache()

2016-08-30 Thread Junio C Hamano
Johannes Schindelin  writes:

>> With the current set of callers, a caller that notices an error from
>> this function will immediately exit without doing any further
>> damage.
>> 
>> So in that sense, this is a "safe" conversion.
>> 
>> But is it a sensible conversion?  When the caller wants to do
>> anything else (e.g. clean-up and try something else, perhaps read
>> the index again), the caller can't, as the index is still locked,
>> because even though the code knows that the lock will not be
>> released until the process exit, it chose to return error without
>> releasing the lock.
>
> It depends what the caller wants to do. The case about which I care most
> is when some helpful advice should be printed (see e.g. 3be18b4 (t5520:
> verify that `pull --rebase` shows the helpful advice when failing,
> 2016-07-26)). Those callers do not need to care, as the atexit() handler
> will clean up the lock file.
>
> However, I am sympathetic to your angle, even if I do not expect any such
> caller to arise anytime soon.

We just fixed a similar "why are we allowing the 'if the_index
hasn't been read, read unconditionally from $GIT_INDEX_FILE'" that is
reached by a codepath that is specifically designed to read from a
temporary index file while reviewing a separate topic, and that is
where my reaction "this is not very helpful for other callers" comes
from.


Re: [PATCH v2 10/14] sequencer: lib'ify read_populate_opts()

2016-08-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> I still think that it is a serious mistake for library functions to die().
> But I have no time to take care of git_parse_source() right now.

I think we already agreed on both points during the previous round
of review ;-)


Re: git am and duplicate signatures

2016-08-30 Thread Joe Perches
(adding lkml)

On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote:
> Joe Perches  writes:
> > git-am -s will avoid duplicating the last signature
> > in a patch.
> > 
> > But given a developer creates a patch, send it around for
> > acks/other signoffs, collects signatures and then does
> > a git am -s on a different branch, this sort of sign-off
> > chain is possible:
> > 
> > Signed-off-by: Original Developer 
> > Acked-by: Random Developer 
> > Signed-off-by: Original Developer 
> Both correct and allowing the earlier one duplicated as long as
> there is somebody/something else in between is deliberate.

linux-kernel has a script (scripts/checkpatch.pl) that
looks for duplicate signatures (-by: [name] )

Should the last Signed-off-by: in the commit log be
excluded from this check?

> > Should there be an option to avoid duplicate signatures
> > in a sequence where an author can git-am the same patch?
> I dunno.  The way "Signed-off-by" is handled is designed
> specifically to support the meaning of that footer, namely to record
> where it originated and whose hands it passed, used in the kernel
> and Git land.  Other projects certainly may have need for footers
> that denote different things that want different semantics (e.g. Who
> authored it and who cheered on it), but that is outside the scope of
> the "Signed-off-by" supported by "am -s" and "commit -s".
> 
> Support for more generic footers was supposed to come when the
> "interpret-trailers" topic started, but the author of the topic
> seems to have lost interest before the mechanism has become ready to
> be integrated in the workflow commands like "am", "commit", "rebase"
> etc., which is unfortunate.
> 
> > 
> > sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP
> Yes, I think this is one of the warts we talked about getting rid of
> but haven't got around to it.  It is there because "format-patch -s"
> was incorrectly written to dedup Signed-off-by: from anywhere in its
> early implementation and to keep the same behaviour.  We should drop
> that flag from append_signoff() function.


Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-30 Thread Junio C Hamano
Lars Schneider  writes:

> True, but applying rot13 (via tr ...) on 20+ MB takes quite a bit of
> time. That's why I came up with the 1M SP in between.

Ah, OK, that makes sense (it may not necessarily make it a good
decision, but it certainly explains it).

> However, I realized that testing a large amount of data is not really
> necessary for the final series. A single packet is 64k. A 500k pseudo random
> test file should be sufficient. This will make the test way simpler.

;-)  Thanks.
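For context, rot13 via tr is presumably what a helper like the rot13.sh used in the test does (an assumption; the script's contents are not shown here). The rotation is its own inverse, which is what makes it convenient for smudge/clean round-trip tests:

```shell
# Rotate letters by 13 places; applying the filter twice
# restores the original input.
rot13 () {
        tr 'A-Za-z' 'N-ZA-Mn-za-m'
}
echo 'Large Test File' | rot13          # -> Ynetr Grfg Svyr
echo 'Large Test File' | rot13 | rot13  # -> Large Test File
```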


Re: git am and duplicate signatures

2016-08-30 Thread Junio C Hamano
Joe Perches  writes:

> git-am -s will avoid duplicating the last signature
> in a patch.
>
> But given a developer creates a patch, send it around for
> acks/other signoffs, collects signatures and then does
> a git am -s on a different branch, this sort of sign-off
> chain is possible:
>
>   Signed-off-by: Original Developer 
>   Acked-by: Random Developer 
>   Signed-off-by: Original Developer 

Both correct and allowing the earlier one duplicated as long as
there is somebody/something else in between is deliberate.

> Should there be an option to avoid duplicate signatures
> in a sequence where an author can git-am the same patch?

I dunno.  The way "Signed-off-by" is handled is designed
specifically to support the meaning of that footer, namely to record
where it originated and whose hands it passed, used in the kernel
and Git land.  Other projects certainly may have need for footers
that denote different things that want different semantics (e.g. Who
authored it and who cheered on it), but that is outside the scope of
the "Signed-off-by" supported by "am -s" and "commit -s".

Support for more generic footers was supposed to come when the
"interpret-trailers" topic started, but the author of the topic
seems to have lost interest before the mechanism has become ready to
be integrated in the workflow commands like "am", "commit", "rebase"
etc., which is unfortunate.

> sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP

Yes, I think this is one of the warts we talked about getting rid of
but haven't got around to it.  It is there because "format-patch -s"
was incorrectly written to dedup Signed-off-by: from anywhere in its
early implementation and to keep the same behaviour.  We should drop
that flag from append_signoff() function.


Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-30 Thread Jeff King
On Tue, Aug 30, 2016 at 01:41:59PM +0200, Lars Schneider wrote:

> >> +  git checkout -- test test.t test.i &&
> >> +
> >> +  mkdir generated-test-data &&
> >> +  for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
> >> +  do
> >> +  RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
> >> +  ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
> > 
> > In earlier iteration of loop with lower $i, what guarantees that
> > some bytes survive "tr -dc"?
> 
> Nothing really, good catch! The seed "end" produces as first character
> always a "S" which would survive "tr -dc". However, that is clunky. I
> will always set "1" as first character in $RANDOM_STRING to mitigate
> the problem.

It seems odd that you would generate a larger set of random bytes and
then throw most of them away (about 1 in 5, on average). So you don't
actually know how long your inputs are, and you're wasting time
generating bytes which are discarded.

The goal looks like it is just to clean up the string to only-ASCII
characters. Perhaps converting to to base64 or hex would be conceptually
simpler? Like:

  test-genrandom ... |
  perl -pe 's/./hex(ord($&))/sge'

> > I do not quite get the point of this complexity.  You are using
> > exactly the same seed "end" every time, so in the first round you
> > have 1M of SP, letter '1', letter 'S' (from the genrandom), then
> > in the second round you have 1M of SP, letter '1', letter 'S' and
> > letter 'p' (the last two from the genrandom), and go on.  Is it
> > significant for the purpose of your test that the cruft inserted
> > between the repetition of 1M of SP gets longer by one byte but they
> > all share the same prefix (e.g. "1S", "1Sp", "1SpZ", "1SpZT",
> > ... are what you insert between a large run of spaces)?
> 
> The pktline packets have a constant size. If the cruft between 1M of SP 
> has a constant size as well then the generated packets for the test data
> would repeat themselves. That's why I increased the length after every 1M
> of SP.
> 
> However, I realized that this test complexity is not necessary. I'll
> simplify it in the next round.

I was also confused by this, and wondered if other patterns (perhaps
using a single longer genrandom) might be applicable. Simplification (or
explanation in comments about what properties the content _needs_ to
have) would be welcome. :)

-Peff


Re: [PATCH v6 12/13] convert: add filter.<driver>.process option

2016-08-30 Thread Lars Schneider

> On 30 Aug 2016, at 00:21, Junio C Hamano  wrote:
> 
> larsxschnei...@gmail.com writes:
> 
>> +In case the filter cannot or does not want to process the content,
>> +it is expected to respond with an "error" status. Depending on the
>> +`filter.<driver>.required` flag Git will interpret that as error
>> +but it will not stop or restart the filter process.
>> +
>> +packet:  git< status=error\n
>> +packet:  git< 
>> +
>> +
>> +In case the filter cannot or does not want to process the content
>> +as well as any future content for the lifetime of the Git process,
>> +it is expected to respond with an "error-all" status. Depending on
>> +the `filter.<driver>.required` flag Git will interpret that as error
>> +but it will not stop or restart the filter process.
>> +
>> +packet:  git< status=error-all\n
>> +packet:  git< 
>> +
> 
> This part of the document is well-written to help filter-writers.

Thanks!


> One thing that was unclear from the above to me, when read as a
> potential filter-writer, is when I am supposed to exit(2).  After I
> tell Git with error-all (I would have called it "abort", but that's
> OK) that I desire no further communication, am I free to go?  Or do
> I wait until Git somehow disconnects (perhaps by closing the packet
> stream I have been reading)?

The filter can exit right after the "error-all". If the filter does
not exit then Git will kill the filter. I'll add this to the docs.

"abort" could be ambiguous because it could be read as "abort only
this file". "abort-all" would work, though. Would you prefer to see
"error" replaced by "abort" and "error-all" by "abort-all"?


>> +If the filter dies during the communication or does not adhere to
>> +the protocol then Git will stop the filter process and restart it
>> +with the next file that needs to be processed.
> 
> Hmph, is there a reason not to retry a half-converted-and-failed
> blob with the fresh process?  Note that this is not "you must do it
> that way", and it is not even "I think doing so may be a better
> idea".  I merely want to know the reason behind this decision.

A filter that dies during communication or does not adhere to the protocol
is a faulty filter. Feeding the faulty filter after restart with the same 
blob would likely cause the same error. 

There are legitimate reasons for retries. E.g. if the filter communicates
with the network. In these cases I expect the filter to handle the retry
logic. Git just writes to and reads from pipes. I don't expect frequent
problems in that area. Plus the existing filter mechanism has no retry
either.

Later on we could easily add a "retry" capability if we deem it necessary,
though.


>> +After the filter has processed a blob it is expected to wait for
>> +the next "key=value" list containing a command. When the Git process
>> +terminates, it will send a kill signal to the filter in that stage.
> 
> The "kill" may not be very nice.  As Git side _knows_ that the
> filter is waiting for the next command, having an explicit
> "shutdown" command would give the filter a chance to implement a
> clean exit--it may have some housekeeping tasks it wants to perform
> once it is done.  The "explicit shutdown" could just be "the pipe
> gets closed", so from the implementation point of view there may not
> be anything you need to further add to this patch (after all, when
> we exit, the pipes to them would be closed), but the shutdown
> protocol and the expectation on the behaviour of filter processes
> would need to be documented.

I implemented a shutdown command in v4 [1][2] but dropped it in v5 after
a discussion with Peff [3].

[1] http://public-inbox.org/git/20160803164225.46355-8-larsxschnei...@gmail.com/
[2] http://public-inbox.org/git/20160803164225.46355-13-larsxschnei...@gmail.com/
[3] http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/

My main reasons to drop it:

A) There is no central place in Git that could execute code *after*
   all filter operations are complete and *before* Git exits. Therefore,
   I had to add a "clean_on_exit_handler()" to "run-command" [1]. This
   change made this series even larger and therefore harder to review.

B) If we communicate "shutdown" to the filter then we need to give the
   filter some time to perform the exit before the filter is killed on
   Git exit. I wasn't able to come up with a good answer how long Git 
   should wait for the exit.

Do you think I should resurrect the "shutdown" patch?


>> +If a `filter.<driver>.clean` or `filter.<driver>.smudge` command
>> +is configured then these commands always take precedence over
>> +a configured `filter.<driver>.process` command.
> 
> It may make more sense to give precedence to the .process (which is
> a late-comer) if defined, ignoring .clean and .smudge, than the
> other way around.

I agree.


>> +Please note that you cannot use an existing `filter..cle

Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read

2016-08-30 Thread Junio C Hamano
Johannes Schindelin  writes:

>> > Scatch that.  That would not work in a freshly created repository
>> > before doing any "git add".  An empty index is a normal state,...
>
> Alas, that does not work, either. If no .git/index exists, read_index()
> will not set the "initialized" flag.

Exactly


git am and duplicate signatures

2016-08-30 Thread Joe Perches
git-am -s will avoid duplicating the last signature
in a patch.

But given a developer creates a patch, send it around for
acks/other signoffs, collects signatures and then does
a git am -s on a different branch, this sort of sign-off
chain is possible:

Signed-off-by: Original Developer 
Acked-by: Random Developer 
Signed-off-by: Original Developer 

Should there be an option to avoid duplicate signatures
in a sequence where an author can git-am the same patch?

sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP

sequencer.c:void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag)

but

builtin/commit.c:   append_signoff(&sb, ignore_non_trailer(&sb), 0);

doesn't have an optional use mechanism available.



Re: [PATCH 07/22] sequencer: future-proof read_populate_todo()

2016-08-30 Thread Jakub Narębski
W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:

> Over the next commits, we will work on improving the sequencer to the
> point where it can process the edit script of an interactive rebase. To
> that end, we will need to teach the sequencer to read interactive
> rebase's todo file. In preparation, we consolidate all places where
> that todo file is needed to call a function that we will later extend.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 8d79091..982b6e9 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -32,6 +32,11 @@ static const char *get_dir(const struct replay_opts *opts)
>   return git_path_seq_dir();
>  }
>  
> +static const char *get_todo_path(const struct replay_opts *opts)
> +{
> + return git_path_todo_file();
> +}

I guess that in the future commit the return value of get_todo_path()
would change depending on what sequencer is used for, cherry-pick or
interactive rebase, that is, contents of replay_opts...

> +
>  static int is_rfc2822_line(const char *buf, int len)
>  {
>   int i;
> @@ -772,25 +777,24 @@ static int parse_insn_buffer(char *buf, struct commit_list **todo_list,
>  static int read_populate_todo(struct commit_list **todo_list,
>   struct replay_opts *opts)
>  {
> + const char *todo_file = get_todo_path(opts);

...and that's why you have added this temporary variable here, to
not repeat get_todo_path(opts) calculations...

>   struct strbuf buf = STRBUF_INIT;
>   int fd, res;
>  
> - fd = open(git_path_todo_file(), O_RDONLY);
> + fd = open(todo_file, O_RDONLY);
>   if (fd < 0)
> - return error_errno(_("Could not open %s"),
> -git_path_todo_file());
> + return error_errno(_("Could not open %s"), todo_file);

... So that's why it is s/git_path_todo_file()/todo_file/ replacement,
and not simply...

>   if (strbuf_read(&buf, fd, 0) < 0) {
>   close(fd);
>   strbuf_release(&buf);
> - return error(_("Could not read %s."), git_path_todo_file());
> + return error(_("Could not read %s."), todo_file);
>   }
>   close(fd);
>  
>   res = parse_insn_buffer(buf.buf, todo_list, opts);
>   strbuf_release(&buf);
>   if (res)
> - return error(_("Unusable instruction sheet: %s"),
> - git_path_todo_file());
> + return error(_("Unusable instruction sheet: %s"), todo_file);
>   return 0;
>  }
>  
> @@ -1064,7 +1068,7 @@ static int sequencer_continue(struct replay_opts *opts)
>  {
>   struct commit_list *todo_list = NULL;
>  
> - if (!file_exists(git_path_todo_file()))
> + if (!file_exists(get_todo_path(opts)))

...the s/git_path_todo_file()/git_todo_path(opts)/, isn't it?

>   return continue_single_pick();
>   if (read_populate_opts(opts) ||
>   read_populate_todo(&todo_list, opts))
> 

Looks good; though I have not checked if all calling sites were converted.

Good work,
-- 
Jakub Narębski



Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes

2016-08-30 Thread Torsten Bögershausen

> 
> diff --git a/sha1_file.c b/sha1_file.c
> index d5e1121..759991e 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1485,7 +1485,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
>  
>  int git_open_noatime(const char *name)

Hm, should the function then be renamed into

git_open_noatime_cloexec()

>  {
> - static int sha1_file_open_flag = O_NOATIME;
> + static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
>  
>   for (;;) {
>   int fd;
> 
> 
> 


Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options

2016-08-30 Thread Jakub Narębski
W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:

> The sequencer reads options from disk and stores them in its struct
> for use during sequencer's operations.
> 
> With this patch, the memory is released afterwards, plugging a
> memory leak.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index b5be0f9..8d79091 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct replay_opts *opts)
>   free(opts->owned[i]);
>   free(opts->owned);
>  
> + free(opts->xopts);
> +

This looks like independent change, not related to using the
sequencer_entrust() to store options read from disk in replay_opts
struct to be able to free memory afterwards.

I guess you wanted to avoid one line changes...

>   strbuf_addf(&dir, "%s", get_dir(opts));
>   remove_dir_recursively(&dir, 0);
>   strbuf_release(&dir);
> @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const char *value, void *data)

Sidenote: this patch would be easier to read if lines were reordered
as below, but I don't think any slider heuristics could help achieve
that automatically.  Also, the patch might be invalid...

>   opts->allow_ff = git_config_bool_or_int(key, value, 
> &error_flag);
>   else if (!strcmp(key, "options.mainline"))
>   opts->mainline = git_config_int(key, value);
> - else if (!strcmp(key, "options.strategy"))
> + else if (!strcmp(key, "options.strategy")) {
>   git_config_string(&opts->strategy, key, value);
> + sequencer_entrust(opts, (char *) opts->strategy);

I wonder if the ability to free strings dup-ed by git_config_string()
be something that is part of replay_opts, or rather remove_sequencer_state(),
that is a list of

free(opts->strategy);
free(opts->gpg_sign);

And of course

for (i = 0; i < opts->xopts_nr; i++)
        free(opts->xopts[i]);
free(opts->xopts);

Though... free(NULL) is nop as per standard, but can we rely on it?
If it is a problem, we can create xfree(ptr) being if(ptr)free(ptr);

The *_entrust() mechanism is more generic, but do we use this general-ness?
Well, it could be xstrdup or git_config_string doing entrust'ing...


> + }
> - else if (!strcmp(key, "options.gpg-sign"))
> + else if (!strcmp(key, "options.gpg-sign")) {
>   git_config_string(&opts->gpg_sign, key, value);
> + sequencer_entrust(opts, (char *) opts->gpg_sign);
> + }
>   else if (!strcmp(key, "options.strategy-option")) {
>   ALLOC_GROW(opts->xopts, opts->xopts_nr + 1, opts->xopts_alloc);
> - opts->xopts[opts->xopts_nr++] = xstrdup(value);
> + opts->xopts[opts->xopts_nr++] =
> + sequencer_entrust(opts, xstrdup(value));

Nice.

>   } else
>   return error(_("Invalid key: %s"), key);
>  
> 



Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Tue, 30 Aug 2016, Johannes Schindelin wrote:

> On Mon, 29 Aug 2016, Junio C Hamano wrote:
> 
> > Junio C Hamano  writes:
> > 
> > > I am not sure if it should be left as the responsibility of the
> > > caller (i.e. check the_index.initialized to bark at a caller that
> > > forgets to read from an index) ...
> > 
> > Scatch that.  That would not work in a freshly created repository
> > before doing any "git add".  An empty index is a normal state, so it
> > would not just be annoying to warn "You called me without reading
> > the index" but is simply wrong.
> 
> Fine. I changed it to assert that the_index.initialized was set.

Alas, that does not work, either. If no .git/index exists, read_index()
will not set the "initialized" flag.

So it turns out that I can either get distracted in a major way, or drop
the patch. I opt for the latter.

Ciao,
Dscho


Re: how to show a merge graph?

2016-08-30 Thread Michael J Gruber
Duy Nguyen venit, vidit, dixit 30.08.2016 15:10:
> I want to see a "git log --oneline --graph" with all non-merge commits
> removed, but history is rewritten so that the merge commits represent
> the entire topics and are shown to have all the parents of the base
> commits. e.g. if the full graph is
> 
> *   8118403 Merge commit 'bbb2437'
> |\
> | * bbb2437 3p
> | * dfde6b9 2p
> * | 9c0aeb2 2
> |/
> * 8db1c06 1
> 
> I just want to see
> 
> *   8118403 Merge commit 'bbb2437'
> |\
> | |
> |/
> * 8db1c06 1
> 
> I had a quick look at rev-list-options.txt but couldn't find anything
> that does that (--simplify-merges looks different), and revision.c is
> not that familiar to me to figure this out by myself. Help?

I don't think anything (we have) would give you two lines connecting the
same two commits directly, i.e. a double edge in the graph as you
indicate above.

Michael



[PATCH v4 3/3] diff-highlight: add support for --graph output

2016-08-30 Thread Brian Henderson
Signed-off-by: Brian Henderson 
---
 contrib/diff-highlight/diff-highlight| 19 +--
 contrib/diff-highlight/t/t9400-diff-highlight.sh |  2 +-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/contrib/diff-highlight/diff-highlight 
b/contrib/diff-highlight/diff-highlight
index ffefc31..9280c88 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -21,6 +21,10 @@ my $RESET = "\x1b[m";
 my $COLOR = qr/\x1b\[[0-9;]*m/;
 my $BORING = qr/$COLOR|\s/;
 
+# The patch portion of git log -p --graph should only ever have preceding | and
+# not / or \ as merge history only shows up on the commit line.
+my $GRAPH = qr/$COLOR?\|$COLOR?\s+/;
+
 my @removed;
 my @added;
 my $in_hunk;
@@ -32,12 +36,12 @@ $SIG{PIPE} = 'DEFAULT';
 while (<>) {
if (!$in_hunk) {
print;
-   $in_hunk = /^$COLOR*\@/;
+   $in_hunk = /^$GRAPH*$COLOR*\@/;
}
-   elsif (/^$COLOR*-/) {
+   elsif (/^$GRAPH*$COLOR*-/) {
push @removed, $_;
}
-   elsif (/^$COLOR*\+/) {
+   elsif (/^$GRAPH*$COLOR*\+/) {
push @added, $_;
}
else {
@@ -46,7 +50,7 @@ while (<>) {
@added = ();
 
print;
-   $in_hunk = /^$COLOR*[\@ ]/;
+   $in_hunk = /^$GRAPH*$COLOR*[\@ ]/;
}
 
# Most of the time there is enough output to keep things streaming,
@@ -163,6 +167,9 @@ sub highlight_pair {
}
 }
 
+# we split either by $COLOR or by character. This has the side effect of
+# leaving in graph cruft. It works because the graph cruft does not contain "-"
+# or "+"
 sub split_line {
local $_ = shift;
return utf8::decode($_) ?
@@ -211,8 +218,8 @@ sub is_pair_interesting {
my $suffix_a = join('', @$a[($sa+1)..$#$a]);
my $suffix_b = join('', @$b[($sb+1)..$#$b]);
 
-   return $prefix_a !~ /^$COLOR*-$BORING*$/ ||
-  $prefix_b !~ /^$COLOR*\+$BORING*$/ ||
+   return $prefix_a !~ /^$GRAPH*$COLOR*-$BORING*$/ ||
+  $prefix_b !~ /^$GRAPH*$COLOR*\+$BORING*$/ ||
   $suffix_a !~ /^$BORING*$/ ||
   $suffix_b !~ /^$BORING*$/;
 }
diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh 
b/contrib/diff-highlight/t/t9400-diff-highlight.sh
index 54e11fe..e42232d 100755
--- a/contrib/diff-highlight/t/t9400-diff-highlight.sh
+++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
@@ -209,7 +209,7 @@ test_expect_failure 'diff-highlight highlights mismatched 
hunk size' '
 
 # TODO add multi-byte test
 
-test_expect_failure 'diff-highlight works with the --graph option' '
+test_expect_success 'diff-highlight works with the --graph option' '
dh_test_setup_history &&
 
# topo-order so that the order of the commits is the same as with 
--graph
-- 
2.9.3



[PATCH v4 0/3] diff-highlight: add support for --graph option

2016-08-30 Thread Brian Henderson
On Mon, Aug 29, 2016 at 02:37:46PM -0700, Junio C Hamano wrote:
> Brian Henderson  writes:
>
> > How does this look?
> >
> > Drawing the graph helped me a lot in figuring out what I was
> > actually testing. thanks!
>
> Yeah, I also am pleased to see the picture of what is being tested
> in the test script.
>
> With your sign-off, they would have been almost perfect ;-).

doh. fixed.

I left the subject as v4, probably mostly because I have this weird aversion to
increasing version numbers :) but I justified it by thinking that the actual
patch set isn't changing; I just added the sign-off (and updated the commit
messages per Jeff.) Hope that's ok.

thanks everyone for all your help!


Brian Henderson (3):
  diff-highlight: add some tests
  diff-highlight: add failing test for handling --graph output
  diff-highlight: add support for --graph output

 contrib/diff-highlight/Makefile  |   5 +
 contrib/diff-highlight/diff-highlight|  19 +-
 contrib/diff-highlight/t/Makefile|  22 +++
 contrib/diff-highlight/t/t9400-diff-highlight.sh | 223 +++
 4 files changed, 263 insertions(+), 6 deletions(-)
 create mode 100644 contrib/diff-highlight/Makefile
 create mode 100644 contrib/diff-highlight/t/Makefile
 create mode 100755 contrib/diff-highlight/t/t9400-diff-highlight.sh

-- 
2.9.3



[PATCH v4 2/3] diff-highlight: add failing test for handling --graph output

2016-08-30 Thread Brian Henderson
Signed-off-by: Brian Henderson 
---
 contrib/diff-highlight/t/t9400-diff-highlight.sh | 60 
 1 file changed, 60 insertions(+)

diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh 
b/contrib/diff-highlight/t/t9400-diff-highlight.sh
index 7c303f7..54e11fe 100755
--- a/contrib/diff-highlight/t/t9400-diff-highlight.sh
+++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
@@ -49,6 +49,55 @@ test_strip_patch_header () {
sed -n '/^@@/,$p' $*
 }
 
+# dh_test_setup_history generates a contrived graph such that we have at least
+# 1 nesting (E) and 2 nestings (F).
+#
+#A branch
+#   /
+#  D---E---F master
+#
+#  git log --all --graph
+#  * commit
+#  |A
+#  | * commit
+#  | |F
+#  | * commit
+#  |/
+#  |E
+#  * commit
+#   D
+#
+dh_test_setup_history () {
+   echo "file1" >file1 &&
+   echo "file2" >file2 &&
+   echo "file3" >file3 &&
+
+   cat file1 >file &&
+   git add file &&
+   git commit -m "D" &&
+
+   git checkout -b branch &&
+   cat file2 >file &&
+   git commit -am "A" &&
+
+   git checkout master &&
+   cat file2 >file &&
+   git commit -am "E" &&
+
+   cat file3 >file &&
+   git commit -am "F"
+}
+
+left_trim () {
+   "$PERL_PATH" -pe 's/^\s+//'
+}
+
+trim_graph () {
+   # graphs start with * or |
+   # followed by a space or / or \
+   "$PERL_PATH" -pe 's@^((\*|\|)( |/|\\))+@@'
+}
+
 test_expect_success 'diff-highlight highlights the beginning of a line' '
cat >a <<-\EOF &&
aaa
@@ -160,4 +209,15 @@ test_expect_failure 'diff-highlight highlights mismatched 
hunk size' '
 
 # TODO add multi-byte test
 
+test_expect_failure 'diff-highlight works with the --graph option' '
+   dh_test_setup_history &&
+
+   # topo-order so that the order of the commits is the same as with 
--graph
+   # trim graph elements so we can do a diff
+   # trim leading space because our trim_graph is not perfect
+   git log --branches -p --topo-order | "$DIFF_HIGHLIGHT" | left_trim 
>graph.exp &&
+   git log --branches -p --graph | "$DIFF_HIGHLIGHT" | trim_graph | 
left_trim >graph.act &&
+   test_cmp graph.exp graph.act
+'
+
 test_done
-- 
2.9.3



[PATCH v4 1/3] diff-highlight: add some tests

2016-08-30 Thread Brian Henderson
Signed-off-by: Brian Henderson 
---
 contrib/diff-highlight/Makefile  |   5 +
 contrib/diff-highlight/t/Makefile|  22 +++
 contrib/diff-highlight/t/t9400-diff-highlight.sh | 163 +++
 3 files changed, 190 insertions(+)
 create mode 100644 contrib/diff-highlight/Makefile
 create mode 100644 contrib/diff-highlight/t/Makefile
 create mode 100755 contrib/diff-highlight/t/t9400-diff-highlight.sh

diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile
new file mode 100644
index 000..9018724
--- /dev/null
+++ b/contrib/diff-highlight/Makefile
@@ -0,0 +1,5 @@
+# nothing to build
+all:
+
+test:
+   $(MAKE) -C t
diff --git a/contrib/diff-highlight/t/Makefile 
b/contrib/diff-highlight/t/Makefile
new file mode 100644
index 000..5ff5275
--- /dev/null
+++ b/contrib/diff-highlight/t/Makefile
@@ -0,0 +1,22 @@
+-include ../../../config.mak.autogen
+-include ../../../config.mak
+
+# copied from ../../t/Makefile
+SHELL_PATH ?= $(SHELL)
+SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
+T = $(wildcard t[0-9][0-9][0-9][0-9]-*.sh)
+
+all: test
+test: $(T)
+
+.PHONY: help clean all test $(T)
+
+help:
+   @echo 'Run "$(MAKE) test" to launch test scripts'
+   @echo 'Run "$(MAKE) clean" to remove trash folders'
+
+$(T):
+   @echo "*** $@ ***"; '$(SHELL_PATH_SQ)' $@ $(GIT_TEST_OPTS)
+
+clean:
+   $(RM) -r 'trash directory'.*
diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh 
b/contrib/diff-highlight/t/t9400-diff-highlight.sh
new file mode 100755
index 000..7c303f7
--- /dev/null
+++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
@@ -0,0 +1,163 @@
+#!/bin/sh
+
+test_description='Test diff-highlight'
+
+CURR_DIR=$(pwd)
+TEST_OUTPUT_DIRECTORY=$(pwd)
+TEST_DIRECTORY="$CURR_DIR"/../../../t
+DIFF_HIGHLIGHT="$CURR_DIR"/../diff-highlight
+
+CW="$(printf "\033[7m")"   # white
+CR="$(printf "\033[27m")"  # reset
+
+. "$TEST_DIRECTORY"/test-lib.sh
+
+if ! test_have_prereq PERL
+then
+   skip_all='skipping diff-highlight tests; perl not available'
+   test_done
+fi
+
+# dh_test is a test helper function which takes 3 file names as parameters. The
+# first 2 files are used to generate diff and commit output, which is then
+# piped through diff-highlight. The 3rd file should contain the expected output
+# of diff-highlight (minus the diff/commit header, ie. everything after and
+# including the first @@ line).
+dh_test () {
+   a="$1" b="$2" &&
+
+   cat >patch.exp &&
+
+   {
+   cat "$a" >file &&
+   git add file &&
+   git commit -m "Add a file" &&
+
+   cat "$b" >file &&
+   git diff file >diff.raw &&
+   git commit -am "Update a file" &&
+   git show >commit.raw
+   } >/dev/null &&
+
+   "$DIFF_HIGHLIGHT" diff.act &&
+   "$DIFF_HIGHLIGHT" commit.act &&
+   test_cmp patch.exp diff.act &&
+   test_cmp patch.exp commit.act
+}
+
+test_strip_patch_header () {
+   sed -n '/^@@/,$p' $*
+}
+
+test_expect_success 'diff-highlight highlights the beginning of a line' '
+   cat >a <<-\EOF &&
+   aaa
+   bbb
+   ccc
+   EOF
+
+   cat >b <<-\EOF &&
+   aaa
+   0bb
+   ccc
+   EOF
+
+   dh_test a b <<-EOF
+   @@ -1,3 +1,3 @@
+aaa
+   -${CW}b${CR}bb
+   +${CW}0${CR}bb
+ccc
+   EOF
+'
+
+test_expect_success 'diff-highlight highlights the end of a line' '
+   cat >a <<-\EOF &&
+   aaa
+   bbb
+   ccc
+   EOF
+
+   cat >b <<-\EOF &&
+   aaa
+   bb0
+   ccc
+   EOF
+
+   dh_test a b <<-EOF
+   @@ -1,3 +1,3 @@
+aaa
+   -bb${CW}b${CR}
+   +bb${CW}0${CR}
+ccc
+   EOF
+'
+
+test_expect_success 'diff-highlight highlights the middle of a line' '
+   cat >a <<-\EOF &&
+   aaa
+   bbb
+   ccc
+   EOF
+
+   cat >b <<-\EOF &&
+   aaa
+   b0b
+   ccc
+   EOF
+
+   dh_test a b <<-EOF
+   @@ -1,3 +1,3 @@
+aaa
+   -b${CW}b${CR}b
+   +b${CW}0${CR}b
+ccc
+   EOF
+'
+
+test_expect_success 'diff-highlight does not highlight whole line' '
+   cat >a <<-\EOF &&
+   aaa
+   bbb
+   ccc
+   EOF
+
+   cat >b <<-\EOF &&
+   aaa
+   000
+   ccc
+   EOF
+
+   dh_test a b <<-EOF
+   @@ -1,3 +1,3 @@
+aaa
+   -bbb
+   +000
+ccc
+   EOF
+'
+
+test_expect_failure 'diff-highlight highlights mismatched hunk size' '
+   cat >a <<-\EOF &&
+   aaa
+   bbb
+   EOF
+
+   

how to show a merge graph?

2016-08-30 Thread Duy Nguyen
I want to see a "git log --oneline --graph" with all non-merge commits
removed, but history is rewritten so that the merge commits represent
the entire topics and are shown to have all the parents of the base
commits. e.g. if the full graph is

*   8118403 Merge commit 'bbb2437'
|\
| * bbb2437 3p
| * dfde6b9 2p
* | 9c0aeb2 2
|/
* 8db1c06 1

I just want to see

*   8118403 Merge commit 'bbb2437'
|\
| |
|/
* 8db1c06 1

I had a quick look at rev-list-options.txt but couldn't find anything
that does that (--simplify-merges looks different), and revision.c is
not that familiar to me to figure this out by myself. Help?
-- 
Duy


Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes

2016-08-30 Thread Lars Schneider

> On 29 Aug 2016, at 21:45, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>> I see. Thanks for the explanation.
> 
> I expect the updated log message to explain the issue correctly
> then.

Sure!


>>> The parent is
>>> very likely to have pack windows open into .pack files and they need
>>> to be closed on the child side after fork(2) starts the child
>>> process but before execve(2) runs the helper, if we want to avoid
>>> file descriptor leaks.
>> 
>> I think I understand what you are saying. However, during my tests
>> .pack file fd's were never a problem.
> 
> I do not expect during the lifetime of your long-running helper
> anybody would try to unlink an existing packfile, so it is unlikely
> that "cannot unlink an open file on Windows" issue to come up.  And
> the cross-platform problem I pointed out is a fd leak; leaks would
> not show up until you run out of the resource, just like you
> wouldn't notice small memory leak here and there UNLESS you actively
> look for them.  I would be surprised if your "tests" found anything.
> 
>> How would I find the open .pack file fd's?  Should I go through
>> /proc/PID/fd? Why is this no problem for other longer running
>> commands such as the git-credential-cache--daemon or git-daemon?
> 
> Nobody said this is "no problem for others".  While discussing the
> patch that started this thread, we noticed that any open file
> descriptor the main process has when the long-running clean/smudge
> helper is spawned _is_ a problem.  Other helpers may share the same
> problem, and if they do, that is all the more reason that fixing the
> file descriptor leak is a good thing.
> 
> The primary thing I was wondering was if we should open the window
> into packfiles with CLOEXEC, just like we recently started opening
> the tempfiles and lockfiles with the flag.  The reason why I asked
> if the site that spawns (i.e. run_command API) has an easy access to
> the list of file descriptors that we opened ONLY for ourselves is
> because that would make it possible to consider an alternative
> approach close them before execve(2) in run_commands API.  That
> would save us from having to sprinkle existing calls to open(2) with
> CLOEXEC.  But if that is not the case, opening them with CLOEXEC is
> a much better solution than going outside the program to "find" them
> in non-portable ways.

The pack files are opened before the filter process is forked. Therefore,
I think CLOEXEC makes sense for them. Should this change be part of this 
series? If yes, would it look like below? Should I adjust the function name?

Thanks,
Lars

diff --git a/sha1_file.c b/sha1_file.c
index d5e1121..759991e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1485,7 +1485,7 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
 
 int git_open_noatime(const char *name)
 {
-   static int sha1_file_open_flag = O_NOATIME;
+   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
 
for (;;) {
int fd;





Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-30 Thread Lars Schneider

> On 29 Aug 2016, at 19:52, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> On 25 Aug 2016, at 21:17, Stefan Beller  wrote:
>>> 
 On Thu, Aug 25, 2016 at 4:07 AM,   wrote:
 From: Lars Schneider 
 
 Generate more interesting large test files
>>> 
>>> How are the large test files more interesting?
>>> (interesting in the notion of covering more potential bugs?
>>> easier to debug? better to maintain, or just a pleasant read?)
>> 
>> The old large test file was 1MB of zeros and 1 byte with a one, repeated 
>> 2048 times.
>> 
>> Since the filter uses 64k packets we would test a large number of equally 
>> looking packets.
>> 
>> That's why I thought the pseudo random content is more interesting.
> 
> I guess my real question is why it is not just a single invocation
> of test-genrandom that gives you the whole test file; if you are
> using 20MB, the simplest would be to grab 20MB out of test-genrandom.
> With that hopefully you won't see large number of equally looking
> packets, no?

True, but applying rot13 (via tr ...) on 20+ MB takes quite a bit of
time. That's why I came up with the 1M SP in between.

However, I realized that testing a large amount of data is not really
necessary for the final series. A single packet is 64k. A 500k pseudo random
test file should be sufficient. This will make the test way simpler.

Thanks,
Lars

Bug Report: "git submodule deinit" fails right after a clone

2016-08-30 Thread Thomas Bétous
Hello,

I found a curious bug in git version 2.9.0.windows.1 (run on Windows 7
via git bash).

If I clone a repository containing submodules and run a "git submodule
deinit" on any of the submodules of this repository without executing
another git command, this command fails.

For instance:

(let's say the repo MyProject contains 2 submodules: Submodule1 and Submodule2)

$ git clone ssh:///MyProject
[ ... "git clone" output ... ]
$ cd MyProject
$ git submodule deinit Submodule1
fatal: Please stage your changes to .gitmodules or stash them to proceed
Submodule work tree 'Submodule1' contains local modifications; use
'-f' to discard them
$ git submodule deinit Submodule2
fatal: Please stage your changes to .gitmodules or stash them to proceed
Submodule work tree 'Submodule2' contains local modifications; use
'-f' to discard them

First, the error message is strange.
What is even stranger is that the error disappears if a "git
status" is run before the submodule deinit...


$ git clone ssh:///MyProject
[ ... "git clone" output ... ]
$ cd MyProject
$ git submodule deinit Submodule1
fatal: Please stage your changes to .gitmodules or stash them to proceed
Submodule work tree 'Submodule1' contains local modifications; use
'-f' to discard them
$ git status
[ ... "git status" output ...]
$ git submodule deinit Submodule1
Cleared directory 'Submodule1'

I have been able to reproduce this error at least 10 times and with
different repositories so I do not think it comes from the state of my
repositories.
Moreover, using the "--recursive" argument to the "git clone" command
did not change anything.

Are you able to reproduce this problem?

Thank you in advance.

Best regards,

Thomas


Re: [PATCH v6 10/13] convert: generate large test files only once

2016-08-30 Thread Lars Schneider

> On 29 Aug 2016, at 19:46, Junio C Hamano  wrote:
> 
> larsxschnei...@gmail.com writes:
> 
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
>> index 7b45136..34c8eb9 100755
>> --- a/t/t0021-conversion.sh
>> +++ b/t/t0021-conversion.sh
>> @@ -4,6 +4,15 @@ test_description='blob conversion via gitattributes'
>> 
>> . ./test-lib.sh
>> 
>> +if test_have_prereq EXPENSIVE
>> +then
>> +T0021_LARGE_FILE_SIZE=2048
>> +T0021_LARGISH_FILE_SIZE=100
>> +else
>> +T0021_LARGE_FILE_SIZE=30
>> +T0021_LARGISH_FILE_SIZE=2
>> +fi
> 
> Minor: do we need T0021_ prefix?  What are you trying to avoid
> collisions with?

Not necessary. I'll remove the prefix.


>> +git checkout -- test test.t test.i &&
>> +
>> +mkdir generated-test-data &&
>> +for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
>> +do
>> +RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
>> +ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
> 
> In earlier iteration of loop with lower $i, what guarantees that
> some bytes survive "tr -dc"?

Nothing really, good catch! The seed "end" always produces an "S" as the
first character, which would survive "tr -dc". However, that is clunky. I
will always set "1" as the first character in $RANDOM_STRING to mitigate
the problem.

> 
>> +# Generate 1MB of empty data and 100 bytes of random characters
> 
> 100 bytes?  It seems to me that you are giving 1MB and then $i-byte
> or less (which sometimes can be zero) of random string.

Outdated comment. Will fix!

> 
>> +# printf "$(test-genrandom start $i)"
>> +printf "%1048576d" 1 >>generated-test-data/large.file &&
>> +printf "$RANDOM_STRING" >>generated-test-data/large.file &&
>> +printf "%1048576d" 1 >>generated-test-data/large.file.rot13 &&
>> +printf "$ROT_RANDOM_STRING" 
>> >>generated-test-data/large.file.rot13 &&
>> +
>> +if test $i = $T0021_LARGISH_FILE_SIZE
>> +then
>> +cat generated-test-data/large.file 
>> >generated-test-data/largish.file &&
>> +cat generated-test-data/large.file.rot13 
>> >generated-test-data/largish.file.rot13
>> +fi
>> +done
> 
> This "now we are done with the loop, so copy them to the second
> pair" needs to be in the loop?  Shouldn't it come after 'done'?

No, it does not need to be in the loop. I think I could do this
after the loop instead:

head -c $((1048576*$T0021_LARGISH_FILE_SIZE)) generated-test-data/large.file 
>generated-test-data/largish.file


> I do not quite get the point of this complexity.  You are using
> exactly the same seed "end" every time, so in the first round you
> have 1M of SP, letter '1', letter 'S' (from the genrandom), then
> in the second round you have 1M of SP, letter '1', letter 'S' and
> letter 'p' (the last two from the genrandom), and go on.  Is it
> significant for the purpose of your test that the cruft inserted
> between the repetition of 1M of SP gets longer by one byte but they
> all share the same prefix (e.g. "1S", "1Sp", "1SpZ", "1SpZT",
> ... are what you insert between a large run of spaces)?

The pktline packets have a constant size. If the cruft between 1M of SP 
has a constant size as well then the generated packets for the test data
would repeat themselves. That's why I increased the length after every 1M
of SP.

However, I realized that this test complexity is not necessary. I'll
simplify it in the next round.

Thanks,
Lars

Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data

2016-08-30 Thread Jakub Narębski
Hello Johannes,

On 30.08.2016 at 09:29, Johannes Schindelin wrote:
> On Mon, 29 Aug 2016, Jakub Narębski wrote:
>> On 29.08.2016 at 10:04, Johannes Schindelin wrote:

>>> +void *sequencer_entrust(struct replay_opts *opts, void 
>>> *set_me_free_after_use)
>>> +{
>>> +   ALLOC_GROW(opts->owned, opts->owned_nr + 1, opts->owned_alloc);
>>> +   opts->owned[opts->owned_nr++] = set_me_free_after_use;
>>> +
>>> +   return set_me_free_after_use;
>>
>> I was wondering what this 'set_me_free_after_use' parameter is about;
>> wouldn't it be more readable if this parameter was called 'owned_data'
>> or 'owned_ptr'?
> 
> If I read "owned_ptr" as a function's parameter, I would assume that the
> associated memory is owned by the caller. So I would be puzzled reading
> that name.

Right.  Well, it is difficult to come up with a good name for this
parameter that would make sense both in a declaration as an information
for a caller, and in the function itself as information about what it
holds.

In my personal opinion 'set_me_free_after_use' is not the best name,
but I unfortunately do not have a better proposal.  Maybe 'entrust_ptr',
or 'entrusted_data' / 'entrusted_ptr' / 'entrusted'?

There are two hard things in computer science: cache invalidation, 
*naming things*, and off-by-one errors ;-)


P.S. It would be nice to have generic mechanism for taking custody
of data to help libification, either at this or at lower level (on
the level of xstrdup, etc.), but that can safely wait.  It even should
wait, so that we can see that this approach is a good one, before
trying to generalize it.  That should be not a blocker for this series,
IMVHO.

Best,
-- 
Jakub Narębski



Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > I am not sure if it should be left as the responsibility of the
> > caller (i.e. check the_index.initialized to bark at a caller that
> > forgets to read from an index) ...
> 
> Scratch that.  That would not work in a freshly created repository
> before doing any "git add".  An empty index is a normal state, so it
> would not just be annoying to warn "You called me without reading
> the index" but is simply wrong.

Fine. I changed it to assert that the_index.initialized was set.

Ciao,
Dscho


Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The function would otherwise pretend to work fine, but totally ignore
> > the working directory.
> 
> s/^T/Unless the caller has already read the index, t/;

Changed. (I also removed the "otherwise" which would sound funny
otherwise.)

> I am not sure if it should be left as the responsibility of the
> caller (i.e. check the_index.initialized to bark at a caller that
> forgets to read from an index) or it is OK to unconditionally read
> from $GIT_INDEX_FILE in a truly libified function.  A caller that
> wants to read from a custom (temporary) index file can choose to
> make sure it does read_index_from() from its custom file before
> calling this function, but if it forgets to do so, the penalty is
> more severe than for ordinary callers who would have read from the normal
> index file anyway.
> 
> Even though the word-lego issue would make this particular API not
> yet suitable, "who is responsible for reading the index" is an
> orthogonal issue that we can discuss even on this version, so I
> thought I'd bring it up.

It is orthogonal alright. So I won't act on it, but just add my thoughts:

We might want to consider thinking about introducing a more consistent
internal API. This is even more orthogonal to the patch under discussion
than just teaching require_clean_work_tree() about different index files,
of course.

It might very well pay off in the future were we to design a more unified
"look & feel" to the API e.g. by putting more restrictions on the order of
parameters, required "int" return values for functions that may fail, a
unified error handling that does not necessarily print to stderr.

Having said that, I do not have time to take lead on that, either. :-)

Ciao,
Dscho


git blame [was: Reducing CPU load on git server]

2016-08-30 Thread Jakub Narębski
On 29.08.2016 at 23:31, Jeff King wrote:

> Blame-tree is a GitHub-specific command (it feeds the main repository
> view page), and is a known CPU hog. There's more clever caching for that
> coming down the pipe, but it's not shipped yet.

I wonder if having support for 'git blame <tree>' in Git core would
be something interesting to Git users.  I once tried to implement it,
but it went nowhere.  Would it be hard to implement?


I guess that GitHub offering this view as part of the home page
for a repository is something of a historical artifact; namely, that
web interfaces for other version control systems used this.  But I
wonder how useful it is.  Neither cgit nor gitweb offers such a view,
and both GitLab and Bitbucket start with a summary page with no blame-tree.

-- 
Jakub Narębski



Re: [PATCH v2 12/14] sequencer: lib'ify save_head()

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > strbuf_addf(&buf, "%s\n", head);
> > if (write_in_full(fd, buf.buf, buf.len) < 0)
> > -   die_errno(_("Could not write to %s"), git_path_head_file());
> > +   return error_errno(_("Could not write to %s"),
> > +  git_path_head_file());
> 
> Same comment around a left-over lockfile applies to this.  An extra
> rollback being minimally intrusive also applies here, I think.

Sure. I added rollbacks in case of failure.

Ciao,
Dscho


Re: [PATCH v2 10/14] sequencer: lib'ify read_populate_opts()

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Instead of dying there, let the caller high up in the callchain notice
> > the error and handle it (by dying, still).
> >
> > The only caller of read_populate_opts(), sequencer_continue() can
> > already return errors, so its caller must be already prepared to
> > handle error returns, and with this step, we make it notice an error
> > return from this function.
> >
> > So this is a safe conversion to make read_populate_opts() callable
> > from new callers that want it not to die, without changing the
> > external behaviour of anything existing.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 14 --
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index e11b24f..be6020a 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -808,12 +808,14 @@ static int populate_opts_cb(const char *key, const 
> > char *value, void *data)
> > return 0;
> >  }
> >  
> > -static void read_populate_opts(struct replay_opts **opts_ptr)
> > +static int read_populate_opts(struct replay_opts **opts)
> >  {
> > if (!file_exists(git_path_opts_file()))
> > -   return;
> > -   if (git_config_from_file(populate_opts_cb, git_path_opts_file(), 
> > *opts_ptr) < 0)
> > -   die(_("Malformed options sheet: %s"), git_path_opts_file());
> > +   return 0;
> > +   if (git_config_from_file(populate_opts_cb, git_path_opts_file(), *opts) 
> > < 0)
> > +   return error(_("Malformed options sheet: %s"),
> > +   git_path_opts_file());
> > +   return 0;
> 
> As discussed, perhaps have a comment immediately before calling
> config-from-file that says that the call could die when it is fed a
> syntactically broken file, but we ignore it for now because we will
> be writing the file we have written, or something?

Sure. I added a code comment.

I still think that it is a serious mistake for library functions to die().
But I have no time to take care of git_parse_source() right now.

Ciao,
Dscho


Re: [PATCH v2 08/14] sequencer: lib'ify read_and_refresh_cache()

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Instead of dying there, let the caller high up in the callchain
> > notice the error and handle it (by dying, still).
> >
> > There are two call sites of read_and_refresh_cache(), one of which is
> > pick_commits(), whose callers were already prepared to do the right
> > thing given an "error" return from it by an earlier patch, so the
> > conversion is safe.
> >
> > The other one, sequencer_pick_revisions() was also prepared to relay
> > an error return back to its caller in all remaining cases in an
> > earlier patch.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 15 ++-
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index c006cae..e30aa82 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -640,18 +640,21 @@ static int prepare_revs(struct replay_opts *opts)
> > return 0;
> >  }
> >  
> > -static void read_and_refresh_cache(struct replay_opts *opts)
> > +static int read_and_refresh_cache(struct replay_opts *opts)
> >  {
> > static struct lock_file index_lock;
> > int index_fd = hold_locked_index(&index_lock, 0);
> > if (read_index_preload(&the_index, NULL) < 0)
> > -   die(_("git %s: failed to read the index"), action_name(opts));
> > +   return error(_("git %s: failed to read the index"),
> > +   action_name(opts));
> > refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, 
> > NULL);
> > if (the_index.cache_changed && index_fd >= 0) {
> > if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
> > -   die(_("git %s: failed to refresh the index"), 
> > action_name(opts));
> > +   return error(_("git %s: failed to refresh the index"),
> > +   action_name(opts));
> > }
> > rollback_lock_file(&index_lock);
> > +   return 0;
> >  }
> 
> With the current set of callers, a caller that notices an error from
> this function will immediately exit without doing any further
> damage.
> 
> So in that sense, this is a "safe" conversion.
> 
> But is it a sensible conversion?  When the caller wants to do
> anything else (e.g. clean-up and try something else, perhaps read
> the index again), the caller can't, as the index is still locked,
> because even though the code knows that the lock will not be
> released until the process exit, it chose to return error without
> releasing the lock.

It depends what the caller wants to do. The case about which I care most
is when some helpful advice should be printed (see e.g. 3be18b4 (t5520:
verify that `pull --rebase` shows the helpful advice when failing,
2016-07-26)). Those callers do not need to care, as the atexit() handler
will clean up the lock file.

However, I am sympathetic to your angle, even if I do not expect any such
caller to arise anytime soon.

> For a file-scope static helper, that probably is sufficient.  But if
> this can be reached from a public entry point in the API, the caller
> of that entry point will find this not-so-useful, I would think.
> 
> I suspect doing the "right thing" to future-proof it may not be too
> much more work.
> 
>   static int read_and_refresh_cache(struct replay_opts *opts)
> {
> + int retval = 0; /* assume success */
>   ...
> if (read_index_preload(...) < 0) {
>   retval = error(...);
> goto finish;
>   }
> refresh_index(...);
> if (...changed...) {
>   if (write_locked_index(...))
>   retval = error(...);
>   }
> + finish:
> rollback_lock_file(&index_lock);
> return retval;
>   }
> 
> or something like that on top?

I settled on rolling back in the if() clauses.

Ciao,
Dscho


Re: [PATCH v13 00/14] libify apply and use lib in am, part 3

2016-08-30 Thread Christian Couder
On Mon, Aug 29, 2016 at 9:04 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> High-level view of the patches in the series
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This is "part 3" of the full patch series. I am resending only the
>> last 14 patches of the full series as "part 3", because I don't want
>> to resend the first 27 patches of v10 nearly as is.
>
> Just to make sure, you are sending the first 11 of these 14 exactly
> as-is, right?  I didn't spot anything different other than 12 and 13
> that replaced the "alternate index" step from the previous round.

Yeah, the first 11 of the 14 patches have no change compared to v12.
I didn't want to create a "part 4" as that could be confusing, and
sending the first 11 patches gives them another chance to be reviewed.


[PATCH RFC] subtree: support subtree -P /path/to/file split

2016-08-30 Thread Ivan Shapovalov
Hello,

I'm missing `git subtree split` functionality for the case when the prefix
is a single file. Looking at git-subtree, only a few changes are needed
to handle this. The attached patch works for me in the common case
(no rejoins). As such, I'm looking for comments :)

Signed-off-by: Ivan Shapovalov 

---
 contrib/subtree/git-subtree.sh | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index dec085a..464dcf7 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -485,8 +485,12 @@ subtree_for_commit () {
while read mode type tree name
do
assert test "$name" = "$dir"
-   assert test "$type" = "tree" -o "$type" = "commit"
+   assert test "$type" = "tree" -o "$type" = "commit" -o "$type" = "blob"
test "$type" = "commit" && continue  # ignore submodules
+   test "$type" = "blob" && { # handle if $dir is a file
+   printf "%s %s %s\t%s\0" "$mode" "$type" "$tree" "${file##*/}" | git mktree
+   break
+   }
echo $tree
break
done
-- 
2.9.3



Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests

2016-08-30 Thread Jeff King
On Mon, Aug 29, 2016 at 03:46:05PM +0200, Johannes Schindelin wrote:

> Note that we need to be careful to inspect only the *newest* entries in
> test-results/: this directory contains files of the form
> t<number>-<name>-<pid>.counts and is only removed wholesale when running the
> *entire* test suite, not when running individual tests. We ensure that
> with a little sed magic on `ls -t`'s output that simply skips lines
> whose file name was seen earlier.

Hmm, interesting. Your approach seems reasonable, but I have to wonder
if writing the pid in the first place is sane.

I started to write up my reasoning in this email, but realized it was
rapidly becoming the content of a commit message. So here is that
commit.

-- >8 --
Subject: [PATCH] test-lib: drop PID from test-results/*.count

Each test run generates a "count" file in t/test-results
that stores the number of successful, failed, etc tests.
If you run "t1234-foo.sh", that file is named
"t/test-results/t1234-foo-$$.counts".

The addition of the PID there is serving no purpose, and
makes analysis of the count files harder.

The presence of the PID dates back to 2d84e9f (Modify
test-lib.sh to output stats to t/test-results/*,
2008-06-08), but no reasoning is given there. Looking at the
current code, we can see that other files we write to
test-results (like *.exit and *.out) do _not_ have the PID
included. So the presence of the PID does not meaningfully
allow one to store the results from multiple runs anyway.

Moreover, anybody wishing to read the *.count files to
aggregate results has to deal with the presence of multiple
files for a given test (and figure out which one is the most
recent based on their timestamps!). The only consumer of
these files is the aggregate.sh script, which arguably gets
this wrong. If a test is run multiple times, its counts will
appear multiple times in the total (I say arguably only
because the desired semantics aren't documented anywhere,
but I have trouble seeing how this behavior could be
useful).

So let's just drop the PID, which fixes aggregate.sh, and
will make new features based around the count files easier
to write.

Note that since the count-file may already exist (when
re-running a test), we also switch the "cat" from appending
to truncating. The use of append here was pointless in the
first place, as we expected to always write to a unique file.

Signed-off-by: Jeff King 
---
The presence of the append, combined with the way aggregate.sh is
written makes me wonder if the intent was to store multiple run results
for each test in a single file (and aggregate would just report the last
one). Which _still_ makes the use of the PID wrong. But again, I don't
see much use for it.

 t/test-lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index d731d66..eada492 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -687,9 +687,9 @@ test_done () {
test_results_dir="$TEST_OUTPUT_DIRECTORY/test-results"
mkdir -p "$test_results_dir"
base=${0##*/}
-   test_results_path="$test_results_dir/${base%.sh}-$$.counts"
+   test_results_path="$test_results_dir/${base%.sh}.counts"
 
-   cat >>"$test_results_path" <<-EOF
+   cat >"$test_results_path" <<-EOF
total $test_count
success $test_success
fixed $test_fixed
-- 
2.10.0.rc2.123.ga991f9e



Re: [PATCH v4 0/3] diff-highlight: add support for --graph option

2016-08-30 Thread Jeff King
On Mon, Aug 29, 2016 at 10:33:44AM -0700, Brian Henderson wrote:

> How does this look?
> 
> Drawing the graph helped me a lot in figuring out what I was actually 
> testing. thanks!
> 
> Brian Henderson (3):
>   diff-highlight: add some tests.
>   diff-highlight: add failing test for handling --graph output.
>   diff-highlight: add support for --graph output.

These all look good to me. As Junio mentioned, you need to add a
signoff (see the section "Sign your work" in Documentation/SubmittingPatches).

Also, a minor nit, but we don't usually put a "." at the end of our
commit message subjects (not worth a re-roll on its own, but if you are
sending them again anyway...).

-Peff


Re: [PATCH v2 03/14] sequencer: lib'ify write_message()

2016-08-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 29 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Instead of dying there, let the caller high up in the callchain
> > notice the error and handle it (by dying, still).
> >
> > The only caller of write_message(), do_pick_commit() already checks
> > the return value and passes it on to its callers, so its caller must
> > be already prepared to handle error returns, and with this step, we
> > make it notice an error return from this function.
> > ...
> > @@ -223,7 +226,7 @@ static int fast_forward_to(const unsigned char *to, 
> > const unsigned char *from,
> >  
> > read_cache();
> > if (checkout_fast_forward(from, to, 1))
> > -   exit(128); /* the callee should have complained already */
> > +   return -1; /* the callee should have complained already */
> 
> This hunk does not seem to have anything to do with write_message()
> conversion, other than that its only caller, do_pick_commit(), also
> calls write_message().

Darn. Sorry that this slipped through. I split it out into its own commit,
and fixed checkout_fast_forward_to() in yet another commit.

Ciao,
Dscho


Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data

2016-08-30 Thread Johannes Schindelin
Hi Hannes,

On Tue, 30 Aug 2016, Johannes Sixt wrote:

> On 29.08.2016 at 23:59, Jakub Narębski wrote:
> > On 29.08.2016 at 10:04, Johannes Schindelin wrote:
> > > -#define REPLAY_OPTS_INIT { -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, 0, 0, NULL }
> > > +#define REPLAY_OPTS_INIT { -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, 0, 0, NULL, NULL, 0, 0 }
> >
> > Nb. it is a pity that we cannot use named initializers for structs,
> > so called designated inits.  It would make this macro more readable.
> 
> It is actually pointless to add the 0's and NULL's here. This should be
> sufficient:
> 
> #define REPLAY_OPTS_INIT { -1, -1 }
> 
> because initialization with 0 (or NULL) is the default for any omitted
> members.

D'oh. You're right. The same applies to TODO_LIST_INIT, of course.

Fixed,
Johannes

Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data

2016-08-30 Thread Johannes Schindelin
Hi Kuba,

On Mon, 29 Aug 2016, Jakub Narębski wrote:

> On 29.08.2016 at 10:04, Johannes Schindelin wrote:
> 
> > The sequencer is our attempt to lib-ify cherry-pick. Yet it behaves
> > like a one-shot command when it reads its configuration: memory is
> > allocated and released only when the command exits.
> > 
> > This is kind of okay for git-cherry-pick, which *is* a one-shot
> > command. All the work to make the sequencer its work horse was done to
> > allow using the functionality as a library function, though, including
> > proper clean-up after use.
> > 
> > This patch introduces an API to pass the responsibility of releasing
> > certain memory to the sequencer.
> 
> So how is this API meant to be used?

I added an example to the commit message.

> Would sequencer as a library function be called multiple times,
> or only once?

The point of a library function is that it should not care.

> I'm trying to find out how this is solved in other places of Git
> code, and I have stumbled upon free_util in string_list...

I wanted this to be flexible enough to take care of any type of data, not
just strings.

And while the string_list has a void *util field, it would be rather silly
to add strings to a string list for the sole purpose of free()ing their
util fields in the end.

(That was the conclusion I came to after a search of my own.)

> > +void *sequencer_entrust(struct replay_opts *opts, void 
> > *set_me_free_after_use)
> > +{
> > +   ALLOC_GROW(opts->owned, opts->owned_nr + 1, opts->owned_alloc);
> > +   opts->owned[opts->owned_nr++] = set_me_free_after_use;
> > +
> > +   return set_me_free_after_use;
> 
> I was wondering what this 'set_me_free_after_use' parameter is about;
> wouldn't it be more readable if this parameter was called 'owned_data'
> or 'owned_ptr'?

If I read "owned_ptr" as a function's parameter, I would assume that the
associated memory is owned by the caller. So I would be puzzled reading
that name.

> >  static void remove_sequencer_state(const struct replay_opts *opts)
> >  {
> > struct strbuf dir = STRBUF_INIT;
> > +   int i;
> > +
> > +   for (i = 0; i < opts->owned_nr; i++)
> > +   free(opts->owned[i]);
> 
> I guess you can remove owned data in any order, regardless of whether
> you free the struct or its members first...

Indeed, this is not like a C++ destructor. It's free().

> > diff --git a/sequencer.h b/sequencer.h
> > index c955594..20b708a 100644
> > --- a/sequencer.h
> > +++ b/sequencer.h
> > @@ -43,8 +43,14 @@ struct replay_opts {
> >  
> > /* Only used by REPLAY_NONE */
> > struct rev_info *revs;
> > +
> > +   /* malloc()ed data entrusted to the sequencer */
> > +   void **owned;
> > +   int owned_nr, owned_alloc;
> 
> I'm not sure about naming conventions for those types of data, but
> wouldn't 'owned_data' be a better name?  I could be wrong here...

The convention seemed to be "void *X; int X_nr, X_alloc;", so I stuck with
it.

Thanks for your review!
Johannes