Re: [PATCH v2 0/4] Port git-add--interactive.perl:status_cmd to C

2018-11-26 Thread Slavica Djukic

Hi Daniel,

On 16-May-17 6:00 AM, Daniel Ferreira wrote:

This is the second version of a patch series to start porting
git-add--interactive from Perl to C.

Series:
v1: 
https://public-inbox.org/git/1494009820-2090-1-git-send-email-bnm...@gmail.com/

Travis CI build: https://travis-ci.org/theiostream/git/builds/232669894

I have applied most of the changes suggested by Johannes and Ævar (thank
you!). The one exception was Johannes' request to make this a WIP patch
series with a sort-of "design" of what's next to come. I did not do
this because I still have no clue...! I'll begin experimenting on
update_cmd() to see if I gain some insight on this issue that I could
bring to this series. I do think this would be a good idea, though! :)



I am Slavica Đukic, Outreachy intern for Git (Dec-March 2018/19 round).

Project I'll be working on is "Turn git add -i to built-in".

Would you mind if I use your work so far as head start?




-- Daniel.

Daniel Ferreira (4):
   diff: export diffstat interface
   add--helper: create builtin helper for interactive add
   add--helper: implement interactive status command
   add--interactive: use add-interactive--helper for status_cmd

  .gitignore|   1 +
  Makefile  |   1 +
  builtin.h |   1 +
  builtin/add--helper.c | 285 ++
  diff.c|  17 +--
  diff.h|  19 +++-
  git-add--interactive.perl |   4 +-
  git.c |   1 +
  8 files changed, 309 insertions(+), 20 deletions(-)
  create mode 100644 builtin/add--helper.c

--
2.7.4 (Apple Git-66)




Thank you,
Slavica


Re: [PATCH v2 0/4] Multiple subtree split fixes regarding complex repos

2018-10-12 Thread Junio C Hamano
Roger Strain  writes:

> After doing some testing at scale, determined that one call was
> taking too long; replaced that with an alternate call which
> returns the same data significantly faster.

Curious where the time goes.  Do you know?

> Also, if anyone has any other feedback on these I'd really love to
> hear it. It's working better for us (as in, it actually generates

The previous one is already in 'next'; please make it incremental
with explanation as to why "show -s" is worse than "log -1" (but see
below).

> # processing commit without normal parent information;
> # fetch from repo
> -   parents=$(git show -s --pretty=%P "$rev")
> +   parents=$(git log --pretty=%P -n 1 "$rev")

If you want to learn the parents of a given commit:

$ git help revisions

says

   ^@, e.g. HEAD^@
   A suffix ^ followed by an at sign is the same as listing all parents 
of 
   (meaning, include anything reachable from its parents, but not the 
commit
   itself).

so

parents=$(git rev-parse "$rev^@")

ought to be the most efficient way to do this, I suspect.


[PATCH v2 0/4] Multiple subtree split fixes regarding complex repos

2018-10-11 Thread Roger Strain
After doing some testing at scale, determined that one call was taking too 
long; replaced that with an alternate call which returns the same data 
significantly faster.

Also, if anyone has any other feedback on these I'd really love to hear it. 
It's working better for us (as in, it actually generates a compatible tree 
version to version) but still isn't perfect, and I'm not sure perfect is 
achievable, but want to make sure this doesn't things for anyone else.

Changes since v1:
diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 1c157dbd9..7dd643998 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -633,7 +633,7 @@ process_split_commit () {
else
# processing commit without normal parent information;
# fetch from repo
-   parents=$(git show -s --pretty=%P "$rev")
+   parents=$(git log --pretty=%P -n 1 "$rev")
extracount=$(($extracount + 1))
fi

Strain, Roger L (4):
  subtree: refactor split of a commit into standalone method
  subtree: make --ignore-joins pay attention to adds
  subtree: use commits before rejoins for splits
  subtree: improve decision on merges kept in split

 contrib/subtree/git-subtree.sh | 129 +
 1 file changed, 83 insertions(+), 46 deletions(-)

-- 
2.19.1



Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Junio C Hamano
Josh Steadmon  writes:

> Yes, the version on my desktop sends version=2 when archiving:
>
> ∫ which git
> /usr/bin/git
> ∫ git --version
> git version 2.19.0.605.g01d371f741-goog
> ∫ GIT_TRACE_PACKET=${HOME}/server_trace git daemon \
>   --enable=upload-archive \
>   --base-path=${HOME}/src/bare-repos &
> [1] 258496
> ∫ git archive --remote git://localhost/test-repo.git HEAD >! test.tar
> ∫ grep version ~/server_trace
> 15:31:22.377869 pkt-line.c:80   packet:  git< 
> git-upload-archive /test-repo.git\0host=localhost\0\0version=2\0

Ah, that's truly broken.

Come to think of it, do we need to be using uniform versions across
different endpoints?  The archive request could be at v3 while fetch
request could still be at v2, in which case the design to use a
single protocol.version variable is probably the root cause of the
confusion?  Perhaps like protocol..allow, we would want
protocol..version or something like that (and no
protocol.version) to make it clear that protocol v2 used for
fetching has nothing to do with protocol v1 or v2 or v3 used for
archiving?

Luckily, protocol.version is still marked as experimental so it is
not too bad that we caught the design mistake (if it is one) and can
now correct it before the damage spreads too widely.




Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Josh Steadmon
On 2018.09.27 15:20, Junio C Hamano wrote:
> Jonathan Nieder  writes:
> 
> >  1. Clients sending version=2 when they do not, in fact, speak protocol
> > v2 for a service is a (serious) bug.  (Separately from this
> > series) we should fix it.
> >
> >  2. That bug is already in the wild, alas.  Fortunately the semantics of
> > GIT_PROTOCOL as a list of key/value pairs is well defined.  So we
> > have choices of (a) bump version to version=3 (b) pass another
> > value 'version=2:yesreallyversion=2' (c) etc.
> >
> >  3. This is likely to affect push, too.
> 
> Do you mean that existing "git push", "git fetch" and "git archive"
> sends version=2 even when they are not capable of speaking protocol
> v2?  I thought that "git archive [--remote]" was left outside of the
> protocol update (that was the reason why the earlier attempt took a
> hacky route of "shallow clone followed by local archive"), so there
> is no "git archive" in the wild that can even say "version=$n"
> (which requires you to be at least version=1)?

Yes, the version on my desktop sends version=2 when archiving:

∫ which git
/usr/bin/git
∫ git --version
git version 2.19.0.605.g01d371f741-goog
∫ GIT_TRACE_PACKET=${HOME}/server_trace git daemon \
  --enable=upload-archive \
  --base-path=${HOME}/src/bare-repos &
[1] 258496
∫ git archive --remote git://localhost/test-repo.git HEAD >! test.tar
∫ grep version ~/server_trace
15:31:22.377869 pkt-line.c:80   packet:  git< 
git-upload-archive /test-repo.git\0host=localhost\0\0version=2\0


Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Junio C Hamano
Jonathan Nieder  writes:

>  1. Clients sending version=2 when they do not, in fact, speak protocol
> v2 for a service is a (serious) bug.  (Separately from this
> series) we should fix it.
>
>  2. That bug is already in the wild, alas.  Fortunately the semantics of
> GIT_PROTOCOL as a list of key/value pairs is well defined.  So we
> have choices of (a) bump version to version=3 (b) pass another
> value 'version=2:yesreallyversion=2' (c) etc.
>
>  3. This is likely to affect push, too.

Do you mean that existing "git push", "git fetch" and "git archive"
sends version=2 even when they are not capable of speaking protocol
v2?  I thought that "git archive [--remote]" was left outside of the
protocol update (that was the reason why the earlier attempt took a
hacky route of "shallow clone followed by local archive"), so there
is no "git archive" in the wild that can even say "version=$n"
(which requires you to be at least version=1)?


Re: [RFC PATCH v2 0/4] Avoid ls-refs when possible in protocol v2

2018-09-27 Thread Junio C Hamano
Jonathan Tan  writes:

> To answer Junio's questions in [1], I think it's best to include the
> full patch set that I'm developing, so here it is. The original patch is
> now patch 3 of this set.

Yeah, taking it out of context did make its purpose fuzzy.  Without
the other patches in the series that set the overall direction of
reducing ls-refs, it was unclear why some transports are marked to
specifically want to automatically call transport_get_remote_refs().
With these surrounding changes, "we call it as needed, because an
earlier step may not have called it" starts to sound quite sensible.



[RFC PATCH v2 0/4] Avoid ls-refs when possible in protocol v2

2018-09-27 Thread Jonathan Tan
To answer Junio's questions in [1], I think it's best to include the
full patch set that I'm developing, so here it is. The original patch is
now patch 3 of this set.

[1] https://public-inbox.org/git/xmqq8t3pnphe@gitster-ct.c.googlers.com/

Rearranging Junio's questions:

> ... ah, do you mean that this is not a new feature, but is a bugfix
> for some callers that are not calling get-remote-refs before calling
> fetch-refs, and the bit is to work around the fact that some
> transport not just can function without get-remote-refs first but do
> not want to call it?

Yes, it is the bugfix you describe, except that the bug coincidentally
does not cause any bad behavior. fetch-object.c indeed does not call
get-remote-refs before fetch-refs, but it calls transport_set_option(),
which so happens to do what we need (call set_helper_option()).

However, we need it now, because ...

> But this I do not quite understand.  It looks saying "when asked to
> fetch, if the transport does not allow us to do so without first
> getting the advertisement, lazily do that", and that may be a good
> thing to do, but then aren't the current set of callers already
> calling transport-get-remote-refs elsewhere before they call
> transport-fetch-refs?  IOW, I would have expected to see a matching
> removal, or at least a code that turns an unconditional call to
> get-remote-refs to a conditional one that is done only for the
> transport that lacks the capability, or something along that line.

... this "matching removal" you are talking about is in the subsequent
patch 4. And there is no transport_set_option() to save us this time, so
we really do need this bugfix.

> IOW, I am a bit confused by this comment (copied from an earlier part)
> 
> > +   /**
> > +* This transport supports the fetch() function being called
> > +* without get_refs_list() first being called.
> > +*/
> 
> Shouldn't it read more like "this transport does not want its
> get-refs-list called when fetch-refs is done"?
> 
> I dunno.

I'm not sure I understand - transports generally don't care if
get-refs-list is called after fetch-refs. Also, this already happens
when fetching with tag following from a server that does not support tag
following, using a transport that supports reuse.

Jonathan Tan (4):
  transport: allow skipping of ref listing
  transport: do not list refs if possible
  transport: list refs before fetch if necessary
  fetch: do not list refs if fetching only hashes

 builtin/fetch.c | 32 +-
 fetch-pack.c|  2 +-
 t/t5551-http-fetch-smart.sh | 15 +++
 t/t5702-protocol-v2.sh  | 18 +
 transport-helper.c  |  1 +
 transport-internal.h|  6 +
 transport.c | 54 -
 7 files changed, 115 insertions(+), 13 deletions(-)

-- 
2.19.0.605.g01d371f741-goog



Re: [PATCH v2 0/4] git-commit-graph.txt: various cleanups

2018-09-27 Thread Martin Ågren
Hi Derrick

On Thu, 27 Sep 2018 at 21:16, Derrick Stolee  wrote:
> Thanks! This version satisfies my concerns and looks good to me.
>
> Reviewed-by: Derrick Stolee 

Thanks for the spectacularly snappy review. I don't expect commit graphs
to help my use cases a lot, but I still wanted to try them out a little
and stumbled on the `*` lists. Thanks for doing this work!

Martin


Re: [PATCH v2 0/4] git-commit-graph.txt: various cleanups

2018-09-27 Thread Derrick Stolee

On 9/27/2018 3:12 PM, Martin Ågren wrote:

This v2 starts with the same two patches as v1 did, then goes on to
change "[commit] graph file" to "commit-graph file" with a dash, to
match other instances as well as Derrick's feedback.

Thanks! This version satisfies my concerns and looks good to me.

Reviewed-by: Derrick Stolee 


[PATCH v2 0/4] git-commit-graph.txt: various cleanups

2018-09-27 Thread Martin Ågren
This v2 starts with the same two patches as v1 did, then goes on to
change "[commit] graph file" to "commit-graph file" with a dash, to
match other instances as well as Derrick's feedback.

Martin Ågren (4):
  git-commit-graph.txt: fix bullet lists
  git-commit-graph.txt: typeset more in monospace
  git-commit-graph.txt: refer to "*commit*-graph file"
  Doc: refer to the "commit-graph file" with dash

 Documentation/git-commit-graph.txt   | 31 
 Documentation/technical/commit-graph.txt |  8 +++---
 2 files changed, 20 insertions(+), 19 deletions(-)

Range-diff against v1:
1:  222721870b = 1:  837ef2f231 git-commit-graph.txt: fix bullet lists
2:  acac5c3584 = 2:  9759a162ca git-commit-graph.txt: typeset more in monospace
3:  65f42c947a ! 3:  759bc886d8 git-commit-graph.txt: refer to "*commit* graph 
file"
@@ -1,17 +1,17 @@
 Author: Martin Ågren 
 
-git-commit-graph.txt: refer to "*commit* graph file"
+git-commit-graph.txt: refer to "*commit*-graph file"
 
-This document sometimes refers to the "commit graph file" as just "the
+This document sometimes refers to the "commit-graph file" as just "the
 graph file". This saves a couple of words here and there at the risk of
 confusion. In particular, the documentation for `git commit-graph read`
 appears to suggest that there are indeed different types of graph 
files.
 
 Let's just write out the full name everywhere.
 
-The full name, by the way, is not the "commit-graph file" with a dash,
-cf. the synopsis. Use the dashless form. (The next commit will fix the
-remaining few instances of the "commit-graph file" in this document.)
+The full name, by the way, is not the dash-less "commit graph file".
+Use the dashed form. (The next commit will fix the remaining few
+instances of the "commit graph file" in this document.)
 
 Signed-off-by: Martin Ågren 
 
@@ -24,7 +24,7 @@
  
 -Read a graph file given by the commit-graph file and output basic
 -details about the graph file. Used for debugging purposes.
-+Read the commit graph file and output basic details about it.
++Read the commit-graph file and output basic details about it.
 +Used for debugging purposes.
  
  'verify'::
@@ -35,7 +35,7 @@
  
 -* Write a graph file, extending the current graph file using commits
 -  in ``.
-+* Write a commit graph file, extending the current commit graph file
++* Write a commit-graph file, extending the current commit-graph file
 +  using commits in ``.
  +
  
@@ -43,16 +43,14 @@
  
  
 -* Write a graph file containing all reachable commits.
-+* Write a commit graph file containing all reachable commits.
++* Write a commit-graph file containing all reachable commits.
  +
  
  $ git show-ref -s | git commit-graph write --stdin-commits
  
  
 -* Write a graph file containing all commits in the current
--  commit-graph file along with those reachable from `HEAD`.
-+* Write a commit graph file containing all commits in the current
-+  commit graph file along with those reachable from `HEAD`.
++* Write a commit-graph file containing all commits in the current
+   commit-graph file along with those reachable from `HEAD`.
  +
  
- $ git rev-parse HEAD | git commit-graph write --stdin-commits --append
4:  fc81147ea4 < -:  -- git-commit-graph.txt: refer to the "commit 
graph file" without dash
-:  -- > 4:  99b64287ec Doc: refer to the "commit-graph file" with dash
-- 
2.19.0.216.g2d3b1c576c



Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Josh Steadmon
On 2018.09.27 11:20, Stefan Beller wrote:
> On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon  wrote:
> >
> > This is the second version of my series to add a new protocol v2 command
> > for archiving, with support for HTTP(S).
> >
> > NEEDSWORK: a server built with this series is not backwards-compatible
> > with clients that set GIT_PROTOCOL=version=2 or configure
> > protocol.version=2. The old client will unconditionally send "argument
> > ..." packet lines, which breaks the server's expectations of a
> > "command=archive" request,
> 
> So if an old client sets protocol to v2, it would only apply that
> protocol version
> to fetch, not archive, so it would start following a v0 conversation, but
> as the protocol version is set, it would be transmitted to the server.
> This sounds like a bug in the client?

Yeah, basically. We're telling the server we support v2, even if the
specific operation we're trying to do doesn't have a v2 implementation
on the client. So this is going to make it ugly to replace existing
commands.

> >  while the server's capability advertisement
> > in turn breaks the clients expectation of either an ACK or NACK.
> 
> Could a modern client send either another protocol version (3?)
> or a special capability along the protocol version ("fixed_archive")
> 
> > I've been discussing workarounds for this with Jonathan Nieder, but
> > please let me know if you have any suggestions for v3 of this series.
> 
> Care to open the discussion to the list? What are the different
> approaches, what are the pros/cons?

Jonathan suggested something along the lines of what you said above,
adding a new field in GIT_PROTOCOL. So we'd send something like
"version=2:archive_version=2" and have the server detect the latter.

I'm not sure if that's the best way to go about this since I'm not
familiar with the version detection code for other parts of the system.
I worry that it will lead us down the path of having to specify a
version for every command that we eventually convert to protocol v2. On
the other hand, I don't see any other way to work around this, at least
in the archive case. We can't peek at the client's transmissions on the
server, because v2 requires that the server speaks first...


Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Jonathan Nieder
Stefan Beller wrote:
> On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon  wrote:

>> I've been discussing workarounds for this with Jonathan Nieder, but
>> please let me know if you have any suggestions for v3 of this series.
>
> Care to open the discussion to the list? What are the different
> approaches, what are the pros/cons?

Do you mean sending video of chatting in the office?

Josh and I discussed that

 1. Clients sending version=2 when they do not, in fact, speak protocol
v2 for a service is a (serious) bug.  (Separately from this
series) we should fix it.

 2. That bug is already in the wild, alas.  Fortunately the semantics of
GIT_PROTOCOL as a list of key/value pairs is well defined.  So we
have choices of (a) bump version to version=3 (b) pass another
value 'version=2:yesreallyversion=2' (c) etc.

 3. This is likely to affect push, too.

Thanks and hope that helps,
Jonathan


Re: [PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-27 Thread Stefan Beller
On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon  wrote:
>
> This is the second version of my series to add a new protocol v2 command
> for archiving, with support for HTTP(S).
>
> NEEDSWORK: a server built with this series is not backwards-compatible
> with clients that set GIT_PROTOCOL=version=2 or configure
> protocol.version=2. The old client will unconditionally send "argument
> ..." packet lines, which breaks the server's expectations of a
> "command=archive" request,

So if an old client sets protocol to v2, it would only apply that
protocol version
to fetch, not archive, so it would start following a v0 conversation, but
as the protocol version is set, it would be transmitted to the server.
This sounds like a bug in the client?

>  while the server's capability advertisement
> in turn breaks the clients expectation of either an ACK or NACK.

Could a modern client send either another protocol version (3?)
or a special capability along the protocol version ("fixed_archive")

> I've been discussing workarounds for this with Jonathan Nieder, but
> please let me know if you have any suggestions for v3 of this series.

Care to open the discussion to the list? What are the different
approaches, what are the pros/cons?


[PATCH v2 0/4] Add proto v2 archive command with HTTP support

2018-09-26 Thread Josh Steadmon
This is the second version of my series to add a new protocol v2 command
for archiving, with support for HTTP(S).

NEEDSWORK: a server built with this series is not backwards-compatible
with clients that set GIT_PROTOCOL=version=2 or configure
protocol.version=2. The old client will unconditionally send "argument
..." packet lines, which breaks the server's expectations of a
"command=archive" request, while the server's capability advertisement
in turn breaks the clients expectation of either an ACK or NACK.

I've been discussing workarounds for this with Jonathan Nieder, but
please let me know if you have any suggestions for v3 of this series.


Josh Steadmon (4):
  archive: follow test standards around assertions
  archive: use packet_reader for communications
  archive: implement protocol v2 archive command
  archive: allow archive over HTTP(S) with proto v2

 Documentation/technical/protocol-v2.txt | 21 -
 builtin/archive.c   | 58 +++--
 builtin/upload-archive.c| 27 ++--
 http-backend.c  | 13 +-
 serve.c |  7 +++
 t/t5000-tar-tree.sh | 33 +++---
 t/t5701-git-serve.sh|  1 +
 transport-helper.c  |  7 +--
 8 files changed, 130 insertions(+), 37 deletions(-)

Range-diff against v1:
-:  -- > 1:  c2e371ad24 archive: follow test standards around assertions
1:  b514184273 ! 2:  a65f73f627 archive: use packet_reader for communications
@@ -6,7 +6,10 @@
 handling, which will make implementation of protocol v2 support in
 git-archive easier.
 
+This refactoring does not change the behavior of "git archive".
+
 Signed-off-by: Josh Steadmon 
+Reviewed-by: Stefan Beller 
 
  diff --git a/builtin/archive.c b/builtin/archive.c
@@ -42,24 +45,24 @@
 -  if (!buf)
 +  status = packet_reader_read();
 +
-+  if (status == PACKET_READ_FLUSH)
++  if (status != PACKET_READ_NORMAL || reader.pktlen <= 0)
die(_("git archive: expected ACK/NAK, got a flush packet"));
 -  if (strcmp(buf, "ACK")) {
 -  if (starts_with(buf, "NACK "))
 -  die(_("git archive: NACK %s"), buf + 5);
 -  if (starts_with(buf, "ERR "))
 -  die(_("remote error: %s"), buf + 4);
-+  if (strcmp(reader.buffer, "ACK")) {
-+  if (starts_with(reader.buffer, "NACK "))
-+  die(_("git archive: NACK %s"), reader.buffer + 5);
-+  if (starts_with(reader.buffer, "ERR "))
-+  die(_("remote error: %s"), reader.buffer + 4);
++  if (strcmp(reader.line, "ACK")) {
++  if (starts_with(reader.line, "NACK "))
++  die(_("git archive: NACK %s"), reader.line + 5);
++  if (starts_with(reader.line, "ERR "))
++  die(_("remote error: %s"), reader.line + 4);
die(_("git archive: protocol error"));
}
  
 -  if (packet_read_line(fd[0], NULL))
 +  status = packet_reader_read();
-+  if (status != PACKET_READ_FLUSH)
++  if (status == PACKET_READ_NORMAL && reader.pktlen > 0)
die(_("git archive: expected a flush"));
  
/* Now, start reading from fd[0] and spit it out to stdout */
2:  1518c15dc1 < -:  -- archive: implement protocol v2 archive command
-:  -- > 3:  0a8cc5e331 archive: implement protocol v2 archive command
3:  1b7ad8d8f6 ! 4:  97a1424f32 archive: allow archive over HTTP(S) with proto 
v2
@@ -10,16 +10,20 @@
  +++ b/builtin/archive.c
 @@
status = packet_reader_read();
-   if (status != PACKET_READ_FLUSH)
+   if (status == PACKET_READ_NORMAL && reader.pktlen > 0)
die(_("git archive: expected a flush"));
 -  }
 +  } else if (version == protocol_v2 &&
-+ starts_with(transport->url, "http"))
++ (starts_with(transport->url, "http://;) ||
++  starts_with(transport->url, "https://;)))
 +  /*
 +   * Commands over HTTP require two requests, so there's an
-+   * additional server response to parse.
++   * additional server response to parse. We do only basic sanity
++   * checking here that the versions presented match across
++   * requests.
 +   */
-+  discover_version();
++  if (version != discover_version())
++  die(_("git archive: received different protocol 
versions in subsequent requests"));
  
/* Now, start reading from fd[0] and spit it out to stdout */
rv = recv_sideband("archive", fd[0], 1);
@@ -40,7 +44,10 @@
struct strbuf buf = STRBUF_INIT;
  
 +  if (!strcmp(service_name, "git-upload-archive")) {

Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-27 Thread Johannes Schindelin
Hi,

On Mon, 6 Aug 2018, Eric Sunshine wrote:

> On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  
> wrote:
> > But your suggestion did make me think about what behaviour I would
> > like to see, exactly. I like that Git removes commits that no longer
> > serve any purpose (because I've included their changes in earlier
> > commits). So I would not want to keep commits that become empty during
> > the rebase. What I would like to see is that commits that _start out_
> > as empty, are retained. Would such behaviour make sense? Or would that
> > be considered surprising behaviour?
> 
> I, personally, have no opinion since I don't use empty commits.
> Perhaps someone more experienced and more long-sighted will chime in.

I got aware about two weeks ago that my mail provider fails to deliver all
the mails that are addressed to me. Two days ago, I managed to get this
mail (and the thread) via public-inbox (thanks, Eric!!!). Hence my late
reply.

Hilco: if you provide a patch to extend the test suite to demonstrate the
breakage (via `test_expect_failure`), I promise to try my best to provide
a fix on top.

Ciao,
Johannes


[PATCH v2 0/4] t5552: fix flakiness by introducing proper locking for GIT_TRACE

2018-08-10 Thread Johannes Schindelin via GitGitGadget
I reported a couple of times that t5552 is not passing reliably. It has now
reached next, and will no doubt infect master soon.

Turns out that it is not a Windows-specific issue, even if it occurs a lot 
more often on Windows than elsewhere. (And even if it is apparently
impossible to trigger on Linux.)

The culprit is that two processes try to write simultaneously to the same
file specified via GIT_TRACE_PACKET, and it is not well defined how that
should work, even when only looking at the POSIX specification (the
documentation of write()
[http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html] says
"Applications should use some form of concurrency control.").

This patch series addresses that by locking the trace fd. I chose to use 
fcntl() for the Unix(-y) platforms we support instead of flock() (even if
the latter has much simpler semantics) because fcntl() is in POSIX while 
flock() is not. On Windows, I use the Win32 API function LockFileEx()
[https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-lockfileex]
.

Of course, I have to admit that I am not super solid on the fcntl() 
semantics. Junio was nice enough to educate me on l_len = 0 meaning the
entire file. If anybody has more experience with locking file descriptors
referring not to files, but, say, to pipes or to an interactive terminal, I
would be very thankful for help (while the POSIX documentation states that
the errno should be EINVAL on a file descriptor that cannot be locked,
apparently macOS sets errno to EBADF when trying to lock a redirected stdout
)?

Changes since v1:

 * Touched up the cover letter and the first commit message (in particular,
   removed the bogus squash! line giving away just how much I rely on the 
   --autosquash feature these days).
 * Now we're locking the entire file via l_len = 0 (except on Windows).
 * To cover all bases, EINVAL is now also treated as "cannot lock this fd"
   (in addition to EBADF).
 * Removed some superfluous flock()-related left-overs from a previous
   attempt (that caused a lot of me fighting with Linux).

Johannes Schindelin (4):
  Introduce a function to lock/unlock file descriptors when appending
  mingw: implement lock_or_unlock_fd_for_appending()
  trace: lock the trace file to avoid racy trace_write() calls
  trace: verify that locking works

 Makefile   |   1 +
 compat/mingw.c |  19 ++
 compat/mingw.h |   3 +
 git-compat-util.h  |   2 +
 t/helper/test-tool.c   |   1 +
 t/helper/test-tool.h   |   1 +
 t/helper/test-trace.c  | 130 +
 t/t0070-fundamental.sh |   6 ++
 trace.c|  11 +++-
 wrapper.c  |  16 +
 10 files changed, 189 insertions(+), 1 deletion(-)
 create mode 100644 t/helper/test-trace.c


base-commit: 42cc7485a2ec49ecc440c921d2eb0cae4da80549
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-17%2Fdscho%2Ffetch-negotiator-skipping-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-17/dscho/fetch-negotiator-skipping-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/17

Range-diff vs v1:

 1:  e449ed75f ! 1:  a19904682 Introduce a function to lock/unlock file 
descriptors when appending
 @@ -30,7 +30,7 @@
  added in this commit.
  
  But fcntl() allows a lot more versatile file region locking that we
 -actually need, to by necessity the fcntl() emulation would be quite
 +actually need, so by necessity the fcntl() emulation would be quite
  complex: To support the `l_whence = SEEK_CUR` (which we would use, if 
it
  did not require so much book-keeping due to our writing between 
locking
  and unlocking the file), we would have to call `SetFilePointerEx()`
 @@ -47,18 +47,16 @@
  to do, as we would once again fail to have an abstraction that clearly
  says what *exact*functionality we want to use.
  
 -To set a precedent for a better approach, let's introduce a proper
 -abstraction: a function that says in its name precisely what Git
 -wants it to do (as opposed to *how* it does it on Linux):
 -lock_or_unlock_fd_for_appending().
 +This commit expressly tries to set a precedent for a better approach:
 +Let's introduce a proper abstraction: a function that says in its name
 +precisely what Git wants it to do (as opposed to *how* it does it on
 +Linux): lock_or_unlock_fd_for_appending().
  
  The next commit will provide a Windows-specific implementation of this
  function/functionality.
  
  Signed-off-by: Johannes Schindelin 
  
 -squash! Introduce a function to lock/unlock file descriptors when 
appending
 -
  diff --git a/git-compat-util.h b/git-compat-util.h
  --- a/git-compat-util.h
  +++ b/git-compat-util.h
 @@ -86,9 +84,11 @@
  + struct flock flock;
  +
  + flock.l_type 

Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-10 Thread Duy Nguyen
On Thu, Aug 9, 2018 at 10:16 AM Ben Peart  wrote:
> In fact, in the other [1] patch series, we're detecting the number of
> cache entries that are the same as the cache tree and using that to
> traverse_by_cache_tree().  At that point, couldn't we copy the
> corresponding cache tree entries over to the destination so that those
> don't have to get recreated in the later call to cache_tree_update()?

We could. But as I stated in another mail, I saw this cache-tree
optimization as a separate problem and didn't want to mix them up.
That way cache_tree_copy() could be used elsewhere if the opportunity
shows up.

Mixing them up could also complicate the problem. You you merge stuff,
you add new cache-entries to o->result with add_index_entry() which
tries to invalidate those paths in o->result's cache-tree. Right now
the cache-tree is empty so it's really no-op. But if you copy
cache-tree over while merging, that invalidation might either
invalidate your newly copied cache-tree, or get slowed down because
non-empty o->result's cache-tree means you start to need to walk it to
find if there's any path to invalidate.

PS. This code keeps messing me up. invalidate_ce_path() may also
invalidate cache-tree in the _source_ index. For this optimization to
really shine, you better keep the the original cache-tree intact (so
that you can reuse as much as possible).

I don't see the purpose of this source cache tree invalidation at all.
My guess at this point is Linus actually made a mistake in 34110cd4e3
(Make 'unpack_trees()' have a separate source and destination index -
2008-03-06) and he should have invalidated _destination_ index instead
of the source one. I'm going to dig in some more and probably will
send a patch to remove this invalidation.
-- 
Duy


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-10 Thread Duy Nguyen
On Wed, Aug 8, 2018 at 10:53 PM Ben Peart  wrote:
>
>
>
> On 8/1/2018 12:38 PM, Duy Nguyen wrote:
> > On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:
> >>
> >>
> >> On 7/31/2018 12:50 PM, Ben Peart wrote:
> >>>
> >>>
> >>> On 7/31/2018 11:31 AM, Duy Nguyen wrote:
> >>
> 
> > In the performance game of whack-a-mole, that call to repair cache-tree
> > is now looking quite expensive...
> 
>  Yeah and I think we can whack that mole too. I did some measurement.
>  Best case possible, we just need to scan through two indexes (one with
>  many good cache-tree, one with no cache-tree), compare and copy
>  cache-tree over. The scanning takes like 1% time of current repair
>  step and I suspect it's the hashing that takes most of the time. Of
>  course real world won't have such nice numbers, but I guess we could
>  maybe half cache-tree update/repair time.
> 
> >>>
> >>> I have some great profiling tools available so will take a look at this
> >>> next and see exactly where the time is being spent.
> >>
> >> Good instincts.  In cache_tree_update, the heavy hitter is definitely
> >> hash_object_file followed by has_object_file.
> >>
> >> Name Inc %Inc
> >> + git!cache_tree_update   12.4  4,935
> >> |+ git!update_one 11.8  4,706
> >> | + git!update_one11.8  4,706
> >> |  + git!hash_object_file  6.1  2,406
> >> |  + git!has_object_file   2.0813
> >> |  + OTHER <>0.5203
> >> |  + git!strbuf_addf   0.4155
> >> |  + git!strbuf_release0.4143
> >> |  + git!strbuf_add0.3121
> >> |  + OTHER <>0.2 93
> >> |  + git!strbuf_grow   0.1 25
> >
> > Ben, if you work on this, this could be a good starting point. I will
> > not work on this because I still have some other things to catch up
> > and follow through. You can have my sign off if you reuse something
> > from this patch
> >
> > Even if it's a naive implementation, the initial numbers look pretty
> > good. Without the patch we have
> >
> > 18:31:05.970621 unpack-trees.c:1437 performance: 0.01029 s: copy
> > 18:31:05.975729 unpack-trees.c:1444 performance: 0.005082004 s: update
> >
> > And with the patch
> >
> > 18:31:13.295655 unpack-trees.c:1437 performance: 0.000198017 s: copy
> > 18:31:13.296757 unpack-trees.c:1444 performance: 0.001075935 s: update
> >
> > Time saving is about 80% by the look of this (best possible case
> > because only the top tree needs to be hashed and written out).
> >
> > -- 8< --
> > diff --git a/cache-tree.c b/cache-tree.c
> > index 6b46711996..67a4a93100 100644
> > --- a/cache-tree.c
> > +++ b/cache-tree.c
> > @@ -440,6 +440,147 @@ int cache_tree_update(struct index_state *istate, int 
> > flags)
> >   return 0;
> >   }
> >
> > +static int same(const struct cache_entry *a, const struct cache_entry *b)
> > +{
> > + if (ce_stage(a) || ce_stage(b))
> > + return 0;
> > + if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
> > + return 0;
> > + return a->ce_mode == b->ce_mode &&
> > +!oidcmp(>oid, >oid);
> > +}
> > +
> > +static int cache_tree_name_pos(const struct index_state *istate,
> > +const struct strbuf *path)
> > +{
> > + int pos;
> > +
> > + if (!path->len)
> > + return 0;
> > +
> > + pos = index_name_pos(istate, path->buf, path->len);
> > + if (pos >= 0)
> > + BUG("No no no, directory path must not exist in index");
> > + return -pos - 1;
> > +}
> > +
> > +/*
> > + * Locate the same cache-tree in two separate indexes. Check the
> > + * cache-tree is still valid for the "to" index (i.e. it contains the
> > + * same set of entries in the "from" index).
> > + */
> > +static int verify_one_cache_tree(const struct index_state *to,
> > +  const struct index_state *from,
> > +  const struct cache_tree *it,
> > +  const struct strbuf *path)
> > +{
> > + int i, spos, dpos;
> > +
> > + spos = cache_tree_name_pos(from, path);
> > + if (spos + it->entry_count > from->cache_nr)
> > + return -1;
> > +
> > + dpos = cache_tree_name_pos(to, path);
> > + if (dpos + it->entry_count > to->cache_nr)
> > + return -1;
> > +
> > + /* Can we quickly check head and tail and bail out early */
> > + if (!same(from->cache[spos], to->cache[spos]) ||
> > + !same(from->cache[spos + it->entry_count - 1],
> > +   to->cache[spos + it->entry_count - 1]))
> > + return -1;
> > +
> > + for (i = 1; i < it->entry_count - 1; i++)
> > + if (!same(from->cache[spos + i],
> > +   to->cache[dpos + i]))
> > + 

Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-09 Thread Ben Peart




On 8/8/2018 4:53 PM, Ben Peart wrote:



On 8/1/2018 12:38 PM, Duy Nguyen wrote:

On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:



On 7/31/2018 12:50 PM, Ben Peart wrote:



On 7/31/2018 11:31 AM, Duy Nguyen wrote:




In the performance game of whack-a-mole, that call to repair 
cache-tree

is now looking quite expensive...


Yeah and I think we can whack that mole too. I did some measurement.
Best case possible, we just need to scan through two indexes (one with
many good cache-tree, one with no cache-tree), compare and copy
cache-tree over. The scanning takes like 1% time of current repair
step and I suspect it's the hashing that takes most of the time. Of
course real world won't have such nice numbers, but I guess we could
maybe half cache-tree update/repair time.



I have some great profiling tools available so will take a look at this
next and see exactly where the time is being spent.


Good instincts.  In cache_tree_update, the heavy hitter is definitely
hash_object_file followed by has_object_file.

Name   Inc % Inc
+ git!cache_tree_update 12.4   4,935
|+ git!update_one   11.8   4,706
| + git!update_one  11.8   4,706
|  + git!hash_object_file    6.1   2,406
|  + git!has_object_file 2.0 813
|  + OTHER <>  0.5 203
|  + git!strbuf_addf 0.4 155
|  + git!strbuf_release  0.4 143
|  + git!strbuf_add  0.3 121
|  + OTHER <>  0.2  93
|  + git!strbuf_grow 0.1  25


Ben, if you work on this, this could be a good starting point. I will
not work on this because I still have some other things to catch up
and follow through. You can have my sign off if you reuse something
from this patch

Even if it's a naive implementation, the initial numbers look pretty
good. Without the patch we have

18:31:05.970621 unpack-trees.c:1437 performance: 0.01029 s: copy
18:31:05.975729 unpack-trees.c:1444 performance: 0.005082004 s: 
update


And with the patch

18:31:13.295655 unpack-trees.c:1437 performance: 0.000198017 s: copy
18:31:13.296757 unpack-trees.c:1444 performance: 0.001075935 s: 
update


Time saving is about 80% by the look of this (best possible case
because only the top tree needs to be hashed and written out).

-- 8< --
diff --git a/cache-tree.c b/cache-tree.c
index 6b46711996..67a4a93100 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -440,6 +440,147 @@ int cache_tree_update(struct index_state 
*istate, int flags)

  return 0;
  }
+static int same(const struct cache_entry *a, const struct cache_entry 
*b)

+{
+    if (ce_stage(a) || ce_stage(b))
+    return 0;
+    if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
+    return 0;
+    return a->ce_mode == b->ce_mode &&
+   !oidcmp(>oid, >oid);
+}
+
+static int cache_tree_name_pos(const struct index_state *istate,
+   const struct strbuf *path)
+{
+    int pos;
+
+    if (!path->len)
+    return 0;
+
+    pos = index_name_pos(istate, path->buf, path->len);
+    if (pos >= 0)
+    BUG("No no no, directory path must not exist in index");
+    return -pos - 1;
+}
+
+/*
+ * Locate the same cache-tree in two separate indexes. Check the
+ * cache-tree is still valid for the "to" index (i.e. it contains the
+ * same set of entries in the "from" index).
+ */
+static int verify_one_cache_tree(const struct index_state *to,
+ const struct index_state *from,
+ const struct cache_tree *it,
+ const struct strbuf *path)
+{
+    int i, spos, dpos;
+
+    spos = cache_tree_name_pos(from, path);
+    if (spos + it->entry_count > from->cache_nr)
+    return -1;
+
+    dpos = cache_tree_name_pos(to, path);
+    if (dpos + it->entry_count > to->cache_nr)
+    return -1;
+
+    /* Can we quickly check head and tail and bail out early */
+    if (!same(from->cache[spos], to->cache[spos]) ||
+    !same(from->cache[spos + it->entry_count - 1],
+  to->cache[spos + it->entry_count - 1]))
+    return -1;
+
+    for (i = 1; i < it->entry_count - 1; i++)
+    if (!same(from->cache[spos + i],
+  to->cache[dpos + i]))
+    return -1;
+
+    return 0;
+}
+
+static int verify_and_invalidate(struct index_state *to,
+ const struct index_state *from,
+ struct cache_tree *it,
+ struct strbuf *path)
+{
+    /*
+ * Optimistically verify the current tree first. Alternatively
+ * we could verify all the subtrees first then do this
+ * last. Any invalid subtree would also invalidates its
+ * ancestors.
+ */
+    if (it->entry_count != -1 &&
+    verify_one_cache_tree(to, from, it, path))
+    it->entry_count = -1;
+
+    /*
+ * If the current tree is valid, don't bother checking

Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-08 Thread Ben Peart




On 8/1/2018 12:38 PM, Duy Nguyen wrote:

On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:



On 7/31/2018 12:50 PM, Ben Peart wrote:



On 7/31/2018 11:31 AM, Duy Nguyen wrote:





In the performance game of whack-a-mole, that call to repair cache-tree
is now looking quite expensive...


Yeah and I think we can whack that mole too. I did some measurement.
Best case possible, we just need to scan through two indexes (one with
many good cache-tree, one with no cache-tree), compare and copy
cache-tree over. The scanning takes like 1% time of current repair
step and I suspect it's the hashing that takes most of the time. Of
course real world won't have such nice numbers, but I guess we could
maybe half cache-tree update/repair time.



I have some great profiling tools available so will take a look at this
next and see exactly where the time is being spent.


Good instincts.  In cache_tree_update, the heavy hitter is definitely
hash_object_file followed by has_object_file.

NameInc %Inc
+ git!cache_tree_update  12.4  4,935
|+ git!update_one11.8  4,706
| + git!update_one   11.8  4,706
|  + git!hash_object_file 6.1  2,406
|  + git!has_object_file  2.0813
|  + OTHER <>   0.5203
|  + git!strbuf_addf  0.4155
|  + git!strbuf_release   0.4143
|  + git!strbuf_add   0.3121
|  + OTHER <>   0.2 93
|  + git!strbuf_grow  0.1 25


Ben, if you work on this, this could be a good starting point. I will
not work on this because I still have some other things to catch up
and follow through. You can have my sign off if you reuse something
from this patch

Even if it's a naive implementation, the initial numbers look pretty
good. Without the patch we have

18:31:05.970621 unpack-trees.c:1437 performance: 0.01029 s: copy
18:31:05.975729 unpack-trees.c:1444 performance: 0.005082004 s: update

And with the patch

18:31:13.295655 unpack-trees.c:1437 performance: 0.000198017 s: copy
18:31:13.296757 unpack-trees.c:1444 performance: 0.001075935 s: update

Time saving is about 80% by the look of this (best possible case
because only the top tree needs to be hashed and written out).

-- 8< --
diff --git a/cache-tree.c b/cache-tree.c
index 6b46711996..67a4a93100 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -440,6 +440,147 @@ int cache_tree_update(struct index_state *istate, int 
flags)
return 0;
  }
  
+static int same(const struct cache_entry *a, const struct cache_entry *b)

+{
+   if (ce_stage(a) || ce_stage(b))
+   return 0;
+   if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
+   return 0;
+   return a->ce_mode == b->ce_mode &&
+  !oidcmp(>oid, >oid);
+}
+
+static int cache_tree_name_pos(const struct index_state *istate,
+  const struct strbuf *path)
+{
+   int pos;
+
+   if (!path->len)
+   return 0;
+
+   pos = index_name_pos(istate, path->buf, path->len);
+   if (pos >= 0)
+   BUG("No no no, directory path must not exist in index");
+   return -pos - 1;
+}
+
+/*
+ * Locate the same cache-tree in two separate indexes. Check the
+ * cache-tree is still valid for the "to" index (i.e. it contains the
+ * same set of entries in the "from" index).
+ */
+static int verify_one_cache_tree(const struct index_state *to,
+const struct index_state *from,
+const struct cache_tree *it,
+const struct strbuf *path)
+{
+   int i, spos, dpos;
+
+   spos = cache_tree_name_pos(from, path);
+   if (spos + it->entry_count > from->cache_nr)
+   return -1;
+
+   dpos = cache_tree_name_pos(to, path);
+   if (dpos + it->entry_count > to->cache_nr)
+   return -1;
+
+   /* Can we quickly check head and tail and bail out early */
+   if (!same(from->cache[spos], to->cache[spos]) ||
+   !same(from->cache[spos + it->entry_count - 1],
+ to->cache[spos + it->entry_count - 1]))
+   return -1;
+
+   for (i = 1; i < it->entry_count - 1; i++)
+   if (!same(from->cache[spos + i],
+ to->cache[dpos + i]))
+   return -1;
+
+   return 0;
+}
+
+static int verify_and_invalidate(struct index_state *to,
+const struct index_state *from,
+struct cache_tree *it,
+struct strbuf *path)
+{
+   /*
+* Optimistically verify the current tree first. Alternatively
+* we could verify all the subtrees first then do this
+* last. Any invalid subtree would also invalidates its
+* 

Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-07 Thread Junio C Hamano
Eric Sunshine  writes:

> On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  
> wrote:
>> But your suggestion did make me think about what behaviour I would
>> like to see, exactly. I like that Git removes commits that no longer
>> serve any purpose (because I've included their changes in earlier
>> commits). So I would not want to keep commits that become empty during
>> the rebase. What I would like to see is that commits that _start out_
>> as empty, are retained. Would such behaviour make sense? Or would that
>> be considered surprising behaviour?
>
> I, personally, have no opinion since I don't use empty commits.
> Perhaps someone more experienced and more long-sighted will chime in.

0661e49a ("git-rebase.txt: document behavioral differences between
modes", 2018-06-27) added the following.  In short, "rebase -i"
should already behave that way.

+ * empty commits:
+
+am-based rebase will drop any "empty" commits, whether the
+commit started empty (had no changes relative to its parent to
+start with) or ended empty (all changes were already applied
+upstream in other commits).
+
+merge-based rebase does the same.
+
+interactive-based rebase will by default drop commits that
+started empty and halt if it hits a commit that ended up empty.
+The `--keep-empty` option exists for interactive rebases to allow
+it to keep commits that started empty.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Eric Sunshine
On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  wrote:
> But your suggestion did make me think about what behaviour I would
> like to see, exactly. I like that Git removes commits that no longer
> serve any purpose (because I've included their changes in earlier
> commits). So I would not want to keep commits that become empty during
> the rebase. What I would like to see is that commits that _start out_
> as empty, are retained. Would such behaviour make sense? Or would that
> be considered surprising behaviour?

I, personally, have no opinion since I don't use empty commits.
Perhaps someone more experienced and more long-sighted will chime in.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Hilco Wijbenga
On Tue, Jul 31, 2018 at 11:22 PM, Eric Sunshine  wrote:
> On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga  
> wrote:
>> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  
>> wrote:
>> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
>> > object corruption when "rebase -i --root" swaps in a new commit as root.
>> > Unfortunately, those bugs made it into v2.18.0 and have already
>> > corrupted at least one repository (a local project of mine). Patches 3/4
>> > and 4/4 are new.
>>
>> Does this also fix losing the initial commit if it is empty?
>>
>> git init ; git commit -m 'Initial commit' --allow-empty ; touch
>> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
>> rebase --root
>>
>> I would expect there to be 2 commits but the first one has
>> disappeared. (This usually happens with "git rebase -i --root" early
>> on in a new project.)
>
> This series should have no impact on that issue; it is fixing root
> commit object corruption, and does not touch or change how the commit
> graph is maintained or how the rebasing machinery itself works (and
> the --allow-empty stuff is part of that machinery).
>
> As Dscho is cc:'d already, perhaps he can chime in as a resident expert.
>
> By the way, what happens if you use --keep-empty while rebasing?

I was not aware of that flag. But, although I was expecting it to, if
I use it with the rebase, I don't see any different behaviour. I can't
really make out what its purpose is from "Keep the commits that do not
change anything from its parents in the result.".

But your suggestion did make me think about what behaviour I would
like to see, exactly. I like that Git removes commits that no longer
serve any purpose (because I've included their changes in earlier
commits). So I would not want to keep commits that become empty during
the rebase. What I would like to see is that commits that _start out_
as empty, are retained. Would such behaviour make sense? Or would that
be considered surprising behaviour?


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-02 Thread Eric Sunshine
On Wed, Aug 1, 2018 at 7:25 PM brian m. carlson
 wrote:
> On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote:
> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> > object corruption when "rebase -i --root" swaps in a new commit as root.
> > Unfortunately, those bugs made it into v2.18.0 and have already
> > corrupted at least one repository (a local project of mine). Patches 3/4
> > and 4/4 are new.
>
> I looked at this series and it seems sane and logical to me.  Thanks for
> acting quickly to fix the corruption.
>
> Reviewed-by: brian m. carlson 

Thanks for the review.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-01 Thread brian m. carlson
On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote:
> This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> object corruption when "rebase -i --root" swaps in a new commit as root.
> Unfortunately, those bugs made it into v2.18.0 and have already
> corrupted at least one repository (a local project of mine). Patches 3/4
> and 4/4 are new.
> 
> v1 fixed these bugs:
> 
> * trailing garbage on the commit's "author" header
> 
> * extra trailing digit on "author" header's timezone (caused by two
>   separate bugs)
> 
> v2 fixes those same bugs, plus:
> 
> * eliminates a bogus "@" prepended to the "author" header timestamp
>   which renders the header corrupt
> 
> * takes care to validate information coming from
>   "rebase-merge/author-script" before incorporating it into the "author"
>   header since that file may be hand-edited, and bogus hand-edited
>   values could corrupt the commit object.

I looked at this series and it seems sane and logical to me.  Thanks for
acting quickly to fix the corruption.

Reviewed-by: brian m. carlson 
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:
> 
> 
> On 7/31/2018 12:50 PM, Ben Peart wrote:
> > 
> > 
> > On 7/31/2018 11:31 AM, Duy Nguyen wrote:
> 
> >>
> >>> In the performance game of whack-a-mole, that call to repair cache-tree
> >>> is now looking quite expensive...
> >>
> >> Yeah and I think we can whack that mole too. I did some measurement.
> >> Best case possible, we just need to scan through two indexes (one with
> >> many good cache-tree, one with no cache-tree), compare and copy
> >> cache-tree over. The scanning takes like 1% time of current repair
> >> step and I suspect it's the hashing that takes most of the time. Of
> >> course real world won't have such nice numbers, but I guess we could
> >> maybe half cache-tree update/repair time.
> >>
> > 
> > I have some great profiling tools available so will take a look at this 
> > next and see exactly where the time is being spent.
> 
> Good instincts.  In cache_tree_update, the heavy hitter is definitely 
> hash_object_file followed by has_object_file.
> 
> Name  Inc %Inc
> + git!cache_tree_update12.4  4,935
> |+ git!update_one  11.8  4,706
> | + git!update_one 11.8  4,706
> |  + git!hash_object_file   6.1  2,406
> |  + git!has_object_file2.0813
> |  + OTHER <> 0.5203
> |  + git!strbuf_addf0.4155
> |  + git!strbuf_release 0.4143
> |  + git!strbuf_add 0.3121
> |  + OTHER <> 0.2 93
> |  + git!strbuf_grow0.1 25

Ben, if you work on this, this could be a good starting point. I will
not work on this because I still have some other things to catch up
and follow through. You can have my sign off if you reuse something
from this patch

Even if it's a naive implementation, the initial numbers look pretty
good. Without the patch we have

18:31:05.970621 unpack-trees.c:1437 performance: 0.01029 s: copy
18:31:05.975729 unpack-trees.c:1444 performance: 0.005082004 s: update

And with the patch

18:31:13.295655 unpack-trees.c:1437 performance: 0.000198017 s: copy
18:31:13.296757 unpack-trees.c:1444 performance: 0.001075935 s: update

Time saving is about 80% by the look of this (best possible case
because only the top tree needs to be hashed and written out).

-- 8< --
diff --git a/cache-tree.c b/cache-tree.c
index 6b46711996..67a4a93100 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -440,6 +440,147 @@ int cache_tree_update(struct index_state *istate, int 
flags)
return 0;
 }
 
+static int same(const struct cache_entry *a, const struct cache_entry *b)
+{
+   if (ce_stage(a) || ce_stage(b))
+   return 0;
+   if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
+   return 0;
+   return a->ce_mode == b->ce_mode &&
+  !oidcmp(>oid, >oid);
+}
+
+static int cache_tree_name_pos(const struct index_state *istate,
+  const struct strbuf *path)
+{
+   int pos;
+
+   if (!path->len)
+   return 0;
+
+   pos = index_name_pos(istate, path->buf, path->len);
+   if (pos >= 0)
+   BUG("No no no, directory path must not exist in index");
+   return -pos - 1;
+}
+
+/*
+ * Locate the same cache-tree in two separate indexes. Check the
+ * cache-tree is still valid for the "to" index (i.e. it contains the
+ * same set of entries in the "from" index).
+ */
+static int verify_one_cache_tree(const struct index_state *to,
+const struct index_state *from,
+const struct cache_tree *it,
+const struct strbuf *path)
+{
+   int i, spos, dpos;
+
+   spos = cache_tree_name_pos(from, path);
+   if (spos + it->entry_count > from->cache_nr)
+   return -1;
+
+   dpos = cache_tree_name_pos(to, path);
+   if (dpos + it->entry_count > to->cache_nr)
+   return -1;
+
+   /* Can we quickly check head and tail and bail out early */
+   if (!same(from->cache[spos], to->cache[spos]) ||
+   !same(from->cache[spos + it->entry_count - 1],
+ to->cache[spos + it->entry_count - 1]))
+   return -1;
+
+   for (i = 1; i < it->entry_count - 1; i++)
+   if (!same(from->cache[spos + i],
+ to->cache[dpos + i]))
+   return -1;
+
+   return 0;
+}
+
+static int verify_and_invalidate(struct index_state *to,
+const struct index_state *from,
+struct cache_tree *it,
+struct strbuf *path)
+{
+   /*
+* Optimistically verify the current tree first. Alternatively
+* we could verify all the subtrees first then do this
+* last. Any 

Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-01 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga  wrote:
> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  
> wrote:
> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> > object corruption when "rebase -i --root" swaps in a new commit as root.
> > Unfortunately, those bugs made it into v2.18.0 and have already
> > corrupted at least one repository (a local project of mine). Patches 3/4
> > and 4/4 are new.
>
> Does this also fix losing the initial commit if it is empty?
>
> git init ; git commit -m 'Initial commit' --allow-empty ; touch
> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
> rebase --root
>
> I would expect there to be 2 commits but the first one has
> disappeared. (This usually happens with "git rebase -i --root" early
> on in a new project.)

This series should have no impact on that issue; it is fixing root
commit object corruption, and does not touch or change how the commit
graph is maintained or how the rebasing machinery itself works (and
the --allow-empty stuff is part of that machinery).

As Dscho is cc:'d already, perhaps he can chime in as a resident expert.

By the way, what happens if you use --keep-empty while rebasing?


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Hilco Wijbenga
Hi Eric,

On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  wrote:
> This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> object corruption when "rebase -i --root" swaps in a new commit as root.
> Unfortunately, those bugs made it into v2.18.0 and have already
> corrupted at least one repository (a local project of mine). Patches 3/4
> and 4/4 are new.
>
> v1 fixed these bugs:
>
> * trailing garbage on the commit's "author" header
>
> * extra trailing digit on "author" header's timezone (caused by two
>   separate bugs)
>
> v2 fixes those same bugs, plus:
>
> * eliminates a bogus "@" prepended to the "author" header timestamp
>   which renders the header corrupt
>
> * takes care to validate information coming from
>   "rebase-merge/author-script" before incorporating it into the "author"
>   header since that file may be hand-edited, and bogus hand-edited
>   values could corrupt the commit object.
>

Does this also fix losing the initial commit if it is empty?

Given

git init ; git commit -m 'Initial commit' --allow-empty ; touch
file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
rebase --root

I would expect there to be 2 commits but the first one has
disappeared. (This usually happens with "git rebase -i --root" early
on in a new project.)

Cheers,
Hilco


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-31 Thread Ben Peart




On 7/31/2018 12:50 PM, Ben Peart wrote:



On 7/31/2018 11:31 AM, Duy Nguyen wrote:





In the performance game of whack-a-mole, that call to repair cache-tree
is now looking quite expensive...


Yeah and I think we can whack that mole too. I did some measurement.
Best case possible, we just need to scan through two indexes (one with
many good cache-tree, one with no cache-tree), compare and copy
cache-tree over. The scanning takes like 1% time of current repair
step and I suspect it's the hashing that takes most of the time. Of
course real world won't have such nice numbers, but I guess we could
maybe half cache-tree update/repair time.



I have some great profiling tools available so will take a look at this 
next and see exactly where the time is being spent.


Good instincts.  In cache_tree_update, the heavy hitter is definitely 
hash_object_file followed by has_object_file.


NameInc %Inc
+ git!cache_tree_update  12.4  4,935
|+ git!update_one11.8  4,706
| + git!update_one   11.8  4,706
|  + git!hash_object_file 6.1  2,406
|  + git!has_object_file  2.0813
|  + OTHER <>   0.5203
|  + git!strbuf_addf  0.4155
|  + git!strbuf_release   0.4143
|  + git!strbuf_add   0.3121
|  + OTHER <>   0.2 93
|  + git!strbuf_grow  0.1 25


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-31 Thread Ben Peart




On 7/31/2018 11:31 AM, Duy Nguyen wrote:

On Mon, Jul 30, 2018 at 8:10 PM Ben Peart  wrote:

I ran "git checkout" on a large repo and averaged the results of 3 runs.
   This clearly demonstrates the benefit of the optimized unpack_trees()
as even the final "diff-index" is essentially a 3rd call to unpack_trees().

baselinenew
--
0.535510167 0.556558733 s: read cache .git/index
0.3057373   0.3147105   s: initialize name hash
0.0184082   0.023558433 s: preload index
0.086910967 0.089085967 s: refresh index
7.889590767 2.191554433 s: unpack trees
0.120760833 0.131941267 s: update worktree after a merge
2.2583504   2.572663167 s: repair cache-tree
0.8916137   0.959495233 s: write index, changed mask = 28
3.405199233 0.2710663   s: unpack trees
0.000999667 0.0021554   s: update worktree after a merge
3.4063306   0.273318333 s: diff-index
16.9524923  9.462943133 s: git command:
'c:\git-sdk-64\usr\src\git\git.exe' checkout

The first call to unpack_trees() saves 72%
The 2nd and 3rd call save 92%


By the 3rd I guess you meant "diff-index" line. I think it's the same
with the second call. diff-index triggers the second unpack-trees but
there's no indent here and it's misleading to read this as diff-index
and unpack-trees execute one after the other.


Total time savings for the entire command was 44%


Wow.. I guess you have more trees since I could only save 30% on gcc.git.


Yes, with over 500K trees, this optimization really pays off for us.  I 
can't wait to see how this works out in the wild (vs my "lab" based 
performance testing).


Thank you!  I definitely owe you lunch. :)




In the performance game of whack-a-mole, that call to repair cache-tree
is now looking quite expensive...


Yeah and I think we can whack that mole too. I did some measurement.
Best case possible, we just need to scan through two indexes (one with
many good cache-tree, one with no cache-tree), compare and copy
cache-tree over. The scanning takes like 1% time of current repair
step and I suspect it's the hashing that takes most of the time. Of
course real world won't have such nice numbers, but I guess we could
maybe half cache-tree update/repair time.



I have some great profiling tools available so will take a look at this 
next and see exactly where the time is being spent.


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-31 Thread Duy Nguyen
On Mon, Jul 30, 2018 at 8:10 PM Ben Peart  wrote:
> I ran "git checkout" on a large repo and averaged the results of 3 runs.
>   This clearly demonstrates the benefit of the optimized unpack_trees()
> as even the final "diff-index" is essentially a 3rd call to unpack_trees().
>
> baselinenew
> --
> 0.535510167 0.556558733 s: read cache .git/index
> 0.3057373   0.3147105   s: initialize name hash
> 0.0184082   0.023558433 s: preload index
> 0.086910967 0.089085967 s: refresh index
> 7.889590767 2.191554433 s: unpack trees
> 0.120760833 0.131941267 s: update worktree after a merge
> 2.2583504   2.572663167 s: repair cache-tree
> 0.8916137   0.959495233 s: write index, changed mask = 28
> 3.405199233 0.2710663   s: unpack trees
> 0.000999667 0.0021554   s: update worktree after a merge
> 3.4063306   0.273318333 s: diff-index
> 16.9524923  9.462943133 s: git command:
> 'c:\git-sdk-64\usr\src\git\git.exe' checkout
>
> The first call to unpack_trees() saves 72%
> The 2nd and 3rd call save 92%

By the 3rd I guess you meant "diff-index" line. I think it's the same
with the second call. diff-index triggers the second unpack-trees but
there's no indent here and it's misleading to read this as diff-index
and unpack-trees execute one after the other.

> Total time savings for the entire command was 44%

Wow.. I guess you have more trees since I could only save 30% on gcc.git.

> In the performance game of whack-a-mole, that call to repair cache-tree
> is now looking quite expensive...

Yeah and I think we can whack that mole too. I did some measurement.
Best case possible, we just need to scan through two indexes (one with
many good cache-tree, one with no cache-tree), compare and copy
cache-tree over. The scanning takes like 1% time of current repair
step and I suspect it's the hashing that takes most of the time. Of
course real world won't have such nice numbers, but I guess we could
maybe half cache-tree update/repair time.
-- 
Duy


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 6:46 AM Eric Sunshine  wrote:
> Anyhow, thanks for reading over the series. I appreciate it even if
> our "sense of priority" doesn't always align (as evidenced by your
> review comments and my responses).

To be clear, the changes you suggest all make sense, and would be
welcome (especially the bug fixes), but I consider them lower priority
than the fixes in this series, and here's why:

The commit object corruption caused by the bugs fixed by this series
are unavoidable. Anyone using "rebase -i --root" to swap in a new
commit as root is going to end up with a corrupt repository no matter
what. There's no way to side step it. And, most people won't know how
to fix the corruption, assuming they even notice it.

If I understand correctly, the issues you describe are unlikely to
come up in practice. The only way they can arise is if someone hand
edits the script (something only power users will do) _and_ botches
the edit in the process, or if the person's name contains an
apostrophe (possible, though perhaps uncommon?). Also, (if again I
understand correctly) they are only data "corruptions", not genuine
broken-repository corruptions, thus are more likely to be fixable by a
typical user.

So it's not that I think your proposed fixes and suggestions are
unimportant, I just don't think they belong in this series, and would
be happy to see them atop it.

Thanks.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Phillip Wood
Hi Eric

On 31/07/18 11:46, Eric Sunshine wrote:
> On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood  
> wrote:
>> On 31/07/18 08:33, Eric Sunshine wrote:
>>> Patch 2/4 of this series conflicts with Akinori MUSHA's
>>> 'am/sequencer-author-script-fix' which takes a stab at fixing one of the
>>> four (or so) bugs fixed by this series (namely, adding a missing closing
>>> quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
>>> probably ought to be dropped (without prejudice) in favor of this series
>>> for the following reasons:
>>> [...]
>>> The test added by Akinori MUSHA's patch may still have value, and it may
>>> make sense to re-submit it, however, doing so need not hold up this
>>> (higher priority) series.
>>
>> Yes I think it does, also the patch that Johannes and I have on top of
>> it to fix the quoting of "'" in write_author_script() relies on fixing
>> the missing trailing quote and handling of "'" at the same time,
>> hopefully we can get that rebased on top of these asap.
> 
> I'm not sure if "Yes I think it does" means that Akinori's test has
> value or if it means that this series should be held back waiting for
> other changes built atop it.

It means we should use a test based on that with the quoting fixes on
top of this series and progress them together if possible. I think the
quoting patch I just sent is good now so hopefully there wont be any
issue with it holding up your fixes.

> Anyhow, thanks for reading over the series. I appreciate it even if
> our "sense of priority" doesn't always align (as evidenced by your
> review comments and my responses).

My "sense of priority" was informed by the hope that the quoting patch
wouldn't hold things up (let's hope I was right about that!).

Best Wishes

Phillip



Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood  wrote:
> On 31/07/18 08:33, Eric Sunshine wrote:
> > Patch 2/4 of this series conflicts with Akinori MUSHA's
> > 'am/sequencer-author-script-fix' which takes a stab at fixing one of the
> > four (or so) bugs fixed by this series (namely, adding a missing closing
> > quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
> > probably ought to be dropped (without prejudice) in favor of this series
> > for the following reasons:
> > [...]
> > The test added by Akinori MUSHA's patch may still have value, and it may
> > make sense to re-submit it, however, doing so need not hold up this
> > (higher priority) series.
>
> Yes I think it does, also the patch that Johannes and I have on top of
> it to fix the quoting of "'" in write_author_script() relies on fixing
> the missing trailing quote and handling of "'" at the same time,
> hopefully we can get that rebased on top of these asap.

I'm not sure if "Yes I think it does" means that Akinori's test has
value or if it means that this series should be held back waiting for
other changes built atop it.

Anyhow, thanks for reading over the series. I appreciate it even if
our "sense of priority" doesn't always align (as evidenced by your
review comments and my responses).


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Phillip Wood

Hi Eric

On 31/07/18 08:33, Eric Sunshine wrote:

This is a re-roll of [1] which fixes sequencer bugs resulting in commit
object corruption when "rebase -i --root" swaps in a new commit as root.
Unfortunately, those bugs made it into v2.18.0 and have already
corrupted at least one repository (a local project of mine). Patches 3/4
and 4/4 are new.

v1 fixed these bugs:

* trailing garbage on the commit's "author" header

* extra trailing digit on "author" header's timezone (caused by two
   separate bugs)

v2 fixes those same bugs, plus:

* eliminates a bogus "@" prepended to the "author" header timestamp
   which renders the header corrupt

* takes care to validate information coming from
   "rebase-merge/author-script" before incorporating it into the "author"
   header since that file may be hand-edited, and bogus hand-edited
   values could corrupt the commit object.

Patch 2/4 of this series conflicts with Akinori MUSHA's
'am/sequencer-author-script-fix' which takes a stab at fixing one of the
four (or so) bugs fixed by this series (namely, adding a missing closing
quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
probably ought to be dropped (without prejudice) in favor of this series
for the following reasons:

* It has the undesirable property of adding an unwanted extra blank line
   to "rebase-merge/author-script"; this series doesn't make that
   mistake.

* The fix in this series (patch 2/4) is more complete, also ensuring
   that the return value of sq_dequote() is checked.

* And, most importantly, this series sells the change as a fix for a
   genuine serious bug (commit object corruption), whereas that patch
   talks only about adjusting the file content to make it parseable by
   the shell. It's important for future readers of this change to
   understand that the bug (missing closing quote) had much more dire
   consequences than merely being syntactically invalid as a shell
   script.

The test added by Akinori MUSHA's patch may still have value, and it may
make sense to re-submit it, however, doing so need not hold up this
(higher priority) series.


Yes I think it does, also the patch that Johannes and I have on top of 
it to fix the quoting of "'" in write_author_script() relies on fixing 
the missing trailing quote and handling of "'" at the same time, 
hopefully we can get that rebased on top of these asap.


Best Wishes

Phillip


Phillip's proposed[2] unification of parsers and other fixes and
cleanups can easily be built atop this series and, likewise, need not
prevent these (higher priority) patches from graduating independently.

[1]: 
https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/
[2]: 
https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/

Eric Sunshine (4):
   sequencer: fix "rebase -i --root" corrupting author header
   sequencer: fix "rebase -i --root" corrupting author header timezone
   sequencer: fix "rebase -i --root" corrupting author header timestamp
   sequencer: don't die() on bogus user-edited timestamp

  sequencer.c   | 39 +--
  t/t3404-rebase-interactive.sh | 10 -
  2 files changed, 33 insertions(+), 16 deletions(-)

Range-diff against v1:
1:  ba10ae4e5d ! 1:  8c47555bcf sequencer: fix "rebase -i --root" corrupting 
author header
 @@ -11,8 +11,8 @@
  This is a result of read_author_ident() neglecting to NUL-terminate 
the
  buffer into which it composes the "author" header.
  
 -(Note that the extra "0" in the timezone is a separate bug which will be

 -fixed subsequently.)
 +(Note that the "@" preceding the timestamp and the extra "0" in the
 +timezone are separate bugs which will be fixed subsequently.)
  
  Security considerations: Construction of the "author" header by

  read_author_ident() happens in-place and in parallel with parsing the
 @@ -26,6 +26,7 @@
  inserted as part of the parsing process.
  
  Signed-off-by: Eric Sunshine 

 +Acked-by: Johannes Schindelin 
  
  diff --git a/sequencer.c b/sequencer.c

  --- a/sequencer.c
 @@ -61,7 +62,7 @@
  + set_fake_editor &&
  + FAKE_LINES="2 1" git rebase -i --root &&
  + git cat-file commit HEAD^ >out &&
 -+ grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
 ++ grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
  +'
  +
   test_done
2:  c0400cda85 ! 2:  1d4064147e sequencer: fix "rebase -i --root" corrupting 
author header timezone
 @@ -22,6 +22,9 @@
  a NUL-terminator at the end of the shifted content, which explains the
  duplicated last digit in the timezone.
  
 +(Note that the "@" preceding the timestamp is a separate bug which

 +will be fixed subsequently.)
 +
  Signed-off-by: Eric Sunshine 
  
  diff --git a/sequencer.c b/sequencer.c

 @@ -56,8 

[PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
This is a re-roll of [1] which fixes sequencer bugs resulting in commit
object corruption when "rebase -i --root" swaps in a new commit as root.
Unfortunately, those bugs made it into v2.18.0 and have already
corrupted at least one repository (a local project of mine). Patches 3/4
and 4/4 are new.

v1 fixed these bugs:

* trailing garbage on the commit's "author" header

* extra trailing digit on "author" header's timezone (caused by two
  separate bugs)

v2 fixes those same bugs, plus:

* eliminates a bogus "@" prepended to the "author" header timestamp
  which renders the header corrupt

* takes care to validate information coming from
  "rebase-merge/author-script" before incorporating it into the "author"
  header since that file may be hand-edited, and bogus hand-edited
  values could corrupt the commit object.

Patch 2/4 of this series conflicts with Akinori MUSHA's
'am/sequencer-author-script-fix' which takes a stab at fixing one of the
four (or so) bugs fixed by this series (namely, adding a missing closing
quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
probably ought to be dropped (without prejudice) in favor of this series
for the following reasons:

* It has the undesirable property of adding an unwanted extra blank line
  to "rebase-merge/author-script"; this series doesn't make that
  mistake.

* The fix in this series (patch 2/4) is more complete, also ensuring
  that the return value of sq_dequote() is checked.

* And, most importantly, this series sells the change as a fix for a
  genuine serious bug (commit object corruption), whereas that patch
  talks only about adjusting the file content to make it parseable by
  the shell. It's important for future readers of this change to
  understand that the bug (missing closing quote) had much more dire
  consequences than merely being syntactically invalid as a shell
  script.

The test added by Akinori MUSHA's patch may still have value, and it may
make sense to re-submit it, however, doing so need not hold up this
(higher priority) series.

Phillip's proposed[2] unification of parsers and other fixes and
cleanups can easily be built atop this series and, likewise, need not
prevent these (higher priority) patches from graduating independently.

[1]: 
https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/
[2]: 
https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/

Eric Sunshine (4):
  sequencer: fix "rebase -i --root" corrupting author header
  sequencer: fix "rebase -i --root" corrupting author header timezone
  sequencer: fix "rebase -i --root" corrupting author header timestamp
  sequencer: don't die() on bogus user-edited timestamp

 sequencer.c   | 39 +--
 t/t3404-rebase-interactive.sh | 10 -
 2 files changed, 33 insertions(+), 16 deletions(-)

Range-diff against v1:
1:  ba10ae4e5d ! 1:  8c47555bcf sequencer: fix "rebase -i --root" corrupting 
author header
@@ -11,8 +11,8 @@
 This is a result of read_author_ident() neglecting to NUL-terminate the
 buffer into which it composes the "author" header.
 
-(Note that the extra "0" in the timezone is a separate bug which will 
be
-fixed subsequently.)
+(Note that the "@" preceding the timestamp and the extra "0" in the
+timezone are separate bugs which will be fixed subsequently.)
 
 Security considerations: Construction of the "author" header by
 read_author_ident() happens in-place and in parallel with parsing the
@@ -26,6 +26,7 @@
 inserted as part of the parsing process.
 
 Signed-off-by: Eric Sunshine 
+Acked-by: Johannes Schindelin 
 
 diff --git a/sequencer.c b/sequencer.c
 --- a/sequencer.c
@@ -61,7 +62,7 @@
 +  set_fake_editor &&
 +  FAKE_LINES="2 1" git rebase -i --root &&
 +  git cat-file commit HEAD^ >out &&
-+  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
++  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
 +'
 +
  test_done
2:  c0400cda85 ! 2:  1d4064147e sequencer: fix "rebase -i --root" corrupting 
author header timezone
@@ -22,6 +22,9 @@
 a NUL-terminator at the end of the shifted content, which explains the
 duplicated last digit in the timezone.
 
+(Note that the "@" preceding the timestamp is a separate bug which
+will be fixed subsequently.)
+
 Signed-off-by: Eric Sunshine 
 
 diff --git a/sequencer.c b/sequencer.c
@@ -56,8 +59,8 @@
set_fake_editor &&
FAKE_LINES="2 1" git rebase -i --root &&
git cat-file commit HEAD^ >out &&
--  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
-+  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
+-  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
++  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
  '
  

Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-30 Thread Ben Peart




On 7/29/2018 6:33 AM, Nguyễn Thái Ngọc Duy wrote:

This series speeds up unpack_trees() a bit by using cache-tree.
unpack-trees could bit split in three big parts

- the actual tree unpacking and running n-way merging
- update worktree, which could be expensive depending on how much I/O
   is involved
- repair cache-tree

This series focuses on the first part alone and could give 700%
speedup (best case possible scenario, real life ones probably not that
impressive).

It also shows that the reparing cache-tree is kinda expensive. I have
an idea of reusing cache-tree from the original index, but I'll leave
that to Ben or others to try out and see if it helps at all.

v2 fixes the comments from Junio, adds more performance tracing and
reduces the cost of adding index entries.

Nguyễn Thái Ngọc Duy (4):
   unpack-trees.c: add performance tracing
   unpack-trees: optimize walking same trees with cache-tree
   unpack-trees: reduce malloc in cache-tree walk
   unpack-trees: cheaper index update when walking by cache-tree

  cache-tree.c   |   2 +
  cache.h|   1 +
  read-cache.c   |   3 +-
  unpack-trees.c | 161 -
  unpack-trees.h |   1 +
  5 files changed, 166 insertions(+), 2 deletions(-)



I have a limited understanding of this code path so I'm not the best 
person to review this but I didn't see any issues that concerned me.  I 
also was able to run our internal functional and performance tests in 
addition to the git tests and the results were positive.


Ben


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-30 Thread Ben Peart




On 7/29/2018 6:33 AM, Nguyễn Thái Ngọc Duy wrote:

This series speeds up unpack_trees() a bit by using cache-tree.
unpack-trees could bit split in three big parts

- the actual tree unpacking and running n-way merging
- update worktree, which could be expensive depending on how much I/O
   is involved
- repair cache-tree

This series focuses on the first part alone and could give 700%
speedup (best case possible scenario, real life ones probably not that
impressive).

It also shows that the reparing cache-tree is kinda expensive. I have
an idea of reusing cache-tree from the original index, but I'll leave
that to Ben or others to try out and see if it helps at all.

v2 fixes the comments from Junio, adds more performance tracing and
reduces the cost of adding index entries.

Nguyễn Thái Ngọc Duy (4):
   unpack-trees.c: add performance tracing
   unpack-trees: optimize walking same trees with cache-tree
   unpack-trees: reduce malloc in cache-tree walk
   unpack-trees: cheaper index update when walking by cache-tree

  cache-tree.c   |   2 +
  cache.h|   1 +
  read-cache.c   |   3 +-
  unpack-trees.c | 161 -
  unpack-trees.h |   1 +
  5 files changed, 166 insertions(+), 2 deletions(-)



I ran "git checkout" on a large repo and averaged the results of 3 runs. 
 This clearly demonstrates the benefit of the optimized unpack_trees() 
as even the final "diff-index" is essentially a 3rd call to unpack_trees().


baselinenew 
--
0.535510167 0.556558733 s: read cache .git/index
0.3057373   0.3147105   s: initialize name hash
0.0184082   0.023558433 s: preload index
0.086910967 0.089085967 s: refresh index
7.889590767 2.191554433 s: unpack trees
0.120760833 0.131941267 s: update worktree after a merge
2.2583504   2.572663167 s: repair cache-tree
0.8916137   0.959495233 s: write index, changed mask = 28
3.405199233 0.2710663   s: unpack trees
0.000999667 0.0021554   s: update worktree after a merge
3.4063306   0.273318333 s: diff-index
16.9524923	9.462943133	s: git command: 
'c:\git-sdk-64\usr\src\git\git.exe' checkout


The first call to unpack_trees() saves 72%
The 2nd and 3rd call save 92%
Total time savings for the entire command was 44%

In the performance game of whack-a-mole, that call to repair cache-tree 
is now looking quite expensive...


Ben


[PATCH v2 0/4] Speed up unpack_trees()

2018-07-29 Thread Nguyễn Thái Ngọc Duy
This series speeds up unpack_trees() a bit by using cache-tree.
unpack-trees could bit split in three big parts

- the actual tree unpacking and running n-way merging
- update worktree, which could be expensive depending on how much I/O
  is involved
- repair cache-tree

This series focuses on the first part alone and could give 700%
speedup (best case possible scenario, real life ones probably not that
impressive).

It also shows that the reparing cache-tree is kinda expensive. I have
an idea of reusing cache-tree from the original index, but I'll leave
that to Ben or others to try out and see if it helps at all.

v2 fixes the comments from Junio, adds more performance tracing and
reduces the cost of adding index entries.

Nguyễn Thái Ngọc Duy (4):
  unpack-trees.c: add performance tracing
  unpack-trees: optimize walking same trees with cache-tree
  unpack-trees: reduce malloc in cache-tree walk
  unpack-trees: cheaper index update when walking by cache-tree

 cache-tree.c   |   2 +
 cache.h|   1 +
 read-cache.c   |   3 +-
 unpack-trees.c | 161 -
 unpack-trees.h |   1 +
 5 files changed, 166 insertions(+), 2 deletions(-)

-- 
2.18.0.656.gda699b98b3



[PATCH v2 0/4] fail compilation with strcpy

2018-07-24 Thread Jeff King
On Thu, Jul 19, 2018 at 04:32:59PM -0400, Jeff King wrote:

> This is a patch series to address the discussion in the thread at:
> 
>   https://public-inbox.org/git/20180713204350.ga16...@sigill.intra.peff.net/
> 
> Basically, the question was: can we declare strcpy banned and have a
> linter save us the trouble of finding it in review. The answer is yes,
> the compiler is good at that. ;)

Here's a v2 that I think addresses the comments on the earlier series.
Thanks everybody for your review.

Changes here include:

 - drop the mention in CodingGuidelines. That list is already long, and
   we don't need to to waste mental effort on things that will be
   enforced automatically

 - we now #undef all macros before defining them to avoid redefinition
   warnings

 - the first patch now covers _just_ strcpy(), and each function gets
   its own patch with an explanation (and suggested alternatives). My
   thought is that these should be easy to dig up via blame, "log -S",
   or "log --grep". Though another option would be comments in banned.h.

 - added strcat and vsprintf to the banned list

 - tweaked the first patch's commit message for clarity and to address
   points raised in discussion

As before, this needs to go on top of 022d2ac1f3 (which I hope should
hit master soon anyway).

  [1/4]: automatically ban strcpy()
  [2/4]: banned.h: mark strcat() as banned
  [3/4]: banned.h: mark sprintf() as banned
  [4/4]: banned.h: mark strncpy() as banned

 banned.h  | 30 ++
 git-compat-util.h |  6 ++
 2 files changed, 36 insertions(+)
 create mode 100644 banned.h

-Peff


Re: [PATCH v2 0/4] Automatic transfer encoding for patches

2018-07-08 Thread Drew DeVault
LGTM, thanks for the v2.


[PATCH v2 0/4] Automatic transfer encoding for patches

2018-07-08 Thread brian m. carlson
This series introduces an "auto" value for git send-email
--transfer-encoding that uses 8bit when possible (i.e. when lines are
998 octets or shorter) and quoted-printable otherwise; it then makes
this the default behavior.  It also makes --validate aware of transfer
encoding so it doesn't complain when using quoted-printable or base64.

Changes from v1:
* Update commit messages to refer to RFC 5322.
* Add a missing space.
* Remove the needless capture of stderr.
* Define "suitable transfer encoding".
* Invert test to better capture failures.
* Wrap --validate code in an if block instead of returning early.
* Update documentation to reflect correct, modern RFC.

brian m. carlson (4):
  send-email: add an auto option for transfer encoding
  send-email: accept long lines with suitable transfer encoding
  send-email: automatically determine transfer-encoding
  docs: correct RFC specifying email line length

 Documentation/git-send-email.txt | 17 ++
 git-send-email.perl  | 46 +-
 t/t9001-send-email.sh| 57 
 3 files changed, 91 insertions(+), 29 deletions(-)



[GSoC] [PATCH v2 0/4] rebase: rewrite rebase in C

2018-07-02 Thread Pratik Karki
As a GSoC project, I have been working on the builtin rebase.

The motivation behind the rewrite of rebase i.e. from shell script to C
are for following reasons:

1.  Writing shell scripts and getting it to production is much faster
than doing the equivalent in C but lacks in performance and extra
workarounds are needed for non-POSIX platforms.

2.  Git for Windows is at loss as the installer size increases due to
addition of extra dependencies for the shell scripts which are usually
available in POSIX compliant platforms.

This series of patches serves to demonstrate a minimal builtin rebase
which supports running `git rebase ` and also serves to ask for
reviews.

Changes since v1:

  -  Remove the TODO-rebase.sh which shouldn't be merged into `pu`.

  -  Add newline at the end of the file `builtin/rebase.c` in
 `rebase: start implementing it as a builtin`.

  -  Fix unintentional ordering in `git-rebase--common.sh` and fix the commit
 message in `rebase: refactor common shell functions into their own file`.

  -  Fix wrong expression of `argc` in the handle upstream loop, fix wrong
 type casting code, make the condition simple and fix commit message in
 `builtin/rebase: support running "git-rebase ".

Pratik Karki (4):
  rebase: start implementing it as a builtin
  rebase: refactor common shell functions into their own file
  sequencer: refactor the code to detach HEAD to checkout.c
  builtin/rebase: support running "git rebase "

 .gitignore|   2 +
 Makefile  |   4 +-
 builtin.h |   1 +
 builtin/rebase.c  | 282 ++
 checkout.c|  64 ++
 checkout.h|   3 +
 git-rebase.sh => git-legacy-rebase.sh |  62 +-
 git-rebase--common.sh |  61 ++
 git.c |   6 +
 sequencer.c   |  58 +-
 10 files changed, 428 insertions(+), 115 deletions(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (91%)
 create mode 100644 git-rebase--common.sh

-- 
2.18.0



[PATCH v2 0/4] branch -l deprecation revisited

2018-06-22 Thread Jeff King
This series replaces the contents of jk/branch-l-0-deprecation,
jk/branch-l-1-removal, and jk/branch-l-2-reincarnation.

It implements the idea discussed in the subthread in:

  https://public-inbox.org/git/xmqqzi0hety4@gitster-ct.c.googlers.com/

Namely that "branch -l" would warn about deprecation only when we're not
in list mode, and then we'll eventually jump straight to repurposing
"-l" as "--list".

This gets us to the end result faster, without the annoying "-l doesn't
mean anything" step in between, and without useless warnings on "git
branch -l". It also sidesteps any pager issues since list mode will not
print the warning.

The first 3 patches would become jk/branch-l-0-deprecate, and then the
final one would become jk/branch-l-1-repurpose or similar. No third step
required.

  [1/4]: t3200: unset core.logallrefupdates when testing reflog creation
  [2/4]: t: switch "branch -l" to "branch --create-reflog"
  [3/4]: branch: deprecate "-l" option
  [4/4]: branch: make "-l" a synonym for "--list"

 Documentation/git-branch.txt |  2 +-
 builtin/branch.c |  4 ++--
 t/t1410-reflog.sh|  4 ++--
 t/t3200-branch.sh| 34 +-
 4 files changed, 22 insertions(+), 22 deletions(-)

-Peff


Re: [RFC PATCH v2 0/4] contrib/credential/netrc Makefile & test improvements

2018-06-13 Thread Luis Marsano
On Tue, Jun 12, 2018 at 11:10 PM Todd Zullinger  wrote:
>
> This replaces my 2/2 "use in-tree Git.pm for tests" with
> Luis's improved version.  It also adds Luis's fix to ensure
> the proper exit status on test failures and a minor
> whitespace cleanup.
>
> Is it alright to forge your signoff Luis?

Thanks, that's fine.
You should sign-off on those, too: you identified the problems, told
me the solutions, tested them, and deserve the credit.
Moreover, Documentation/SubmittingPatches encourages it.

When the patches are ready, I think we'll need the primary author
(Ted) to acknowledge before the maintainer (Junio) can approve.

> Luis Marsano (2):
>   git-credential-netrc: use in-tree Git.pm for tests
>   git-credential-netrc: fix exit status when tests fail
>
> Todd Zullinger (2):
>   git-credential-netrc: make "all" default target of Makefile
>   git-credential-netrc: minor whitespace cleanup in test script
>
>  contrib/credential/netrc/Makefile  | 3 +++
>  contrib/credential/netrc/t-git-credential-netrc.sh | 9 +
>  contrib/credential/netrc/test.pl   | 5 +++--
>  3 files changed, 11 insertions(+), 6 deletions(-)


[RFC PATCH v2 0/4] contrib/credential/netrc Makefile & test improvements

2018-06-12 Thread Todd Zullinger
This replaces my 2/2 "use in-tree Git.pm for tests" with
Luis's improved version.  It also adds Luis's fix to ensure
the proper exit status on test failures and a minor
whitespace cleanup.

Is it alright to forge your signoff Luis?

Luis Marsano (2):
  git-credential-netrc: use in-tree Git.pm for tests
  git-credential-netrc: fix exit status when tests fail

Todd Zullinger (2):
  git-credential-netrc: make "all" default target of Makefile
  git-credential-netrc: minor whitespace cleanup in test script

 contrib/credential/netrc/Makefile  | 3 +++
 contrib/credential/netrc/t-git-credential-netrc.sh | 9 +
 contrib/credential/netrc/test.pl   | 5 +++--
 3 files changed, 11 insertions(+), 6 deletions(-)


[PATCH v2 0/4] Fix i-t-a entries in git-diff and git-apply

2018-05-26 Thread Nguyễn Thái Ngọc Duy
v2 first fixes a bug in --ita-invisible-in-index that accidentally
affects diffing a worktree and a tree. This caused a regression when
v1's 1/2 turned this option on by default.

Other than that, 4/4 should address Junio's comments on v1's 2/2.

Nguyễn Thái Ngọc Duy (4):
  diff: ignore --ita-[in]visible-in-index when diffing worktree-to-tree
  diff: turn --ita-invisible-in-index on by default
  t2203: add a test about "diff HEAD" case
  apply: add --intent-to-add

 Documentation/git-apply.txt | 10 +-
 apply.c | 19 +++
 apply.h |  1 +
 builtin/diff.c  |  7 
 diff-lib.c  |  8 +++--
 t/t2203-add-intent.sh   | 64 +
 t/t4011-diff-symlink.sh | 10 +++---
 7 files changed, 99 insertions(+), 20 deletions(-)

-- 
2.17.0.705.g3525833791



[GSoC][PATCH v2 0/4] rebase: split rebase -p from rebase -i

2018-05-22 Thread Alban Gruin
This splits the `rebase --preserve-merges` functionnality from
git-rebase--interactive.sh. All the dead code left by the duplication of
git-rebase--interactive.sh is also removed. This will make it easier to rewrite
this script in C.

This patch series is based of js/sequencer-and-root-commits.

Changes since v1:

 - Duplicating the correct version of git-rebase--interactive.sh (thanks
   Stefan!)

Alban Gruin (4):
  rebase: duplicate git-rebase--interactive.sh to
git-rebase--preserve-merges.sh
  rebase: strip unused code in git-rebase--preserve-merges.sh
  rebase: use the new git-rebase--preserve-merges.sh
  rebase: remove -p code from git-rebase--interactive.sh

 .gitignore |1 +
 Makefile   |2 +
 git-rebase--interactive.sh |  708 +---
 git-rebase--preserve-merges.sh | 1012 
 git-rebase.sh  |   32 +-
 5 files changed, 1048 insertions(+), 707 deletions(-)
 create mode 100644 git-rebase--preserve-merges.sh

-- 
2.16.1



Re: [PATCH v2 0/4] Fix mem leaks of recent object store conversions.

2018-05-11 Thread Stefan Beller
On Fri, May 11, 2018 at 1:37 AM, Jeff King  wrote:
> On Thu, May 10, 2018 at 12:58:45PM -0700, Stefan Beller wrote:
>
>> This series replaces the two commits that were queued on 
>> sb/object-store-replace,
>> fixing memory leaks that were recently introduced.
>>
>> Compared to v1, I merged the two independent series from yesterday,
>> rewrote the commit message to clear up Junios confusion and addresses Peffs
>> comments for the packfiles as well.
>
> Mostly. :)
>
> My one remaining complaint is that the bitmap code may hold on to a
> dangling pointer to a packed_git after this series.

Ok, I'll look into that.

>
> I think that is part of a larger problem, though, which is that the
> bitmap code's globals need to be part of the struct raw_object_store.
> I think this can already cause problems before your series if we were to
> try to use bitmaps in both a superproject and a submodule in the same
> process, though I think we'd at least hit the "ignoring extra bitmap
> file" code path in open_pack_bitmap_1(). So right now it's an annoyance,
> but after your series it becomes a potential segfault.

Ok, maybe we'll need to convert bitmaps into the object store for that.

Thanks for the pointer,
Stefan


Re: [PATCH v2 0/4] Fix mem leaks of recent object store conversions.

2018-05-11 Thread Jeff King
On Thu, May 10, 2018 at 12:58:45PM -0700, Stefan Beller wrote:

> This series replaces the two commits that were queued on 
> sb/object-store-replace,
> fixing memory leaks that were recently introduced.
> 
> Compared to v1, I merged the two independent series from yesterday,
> rewrote the commit message to clear up Junios confusion and addresses Peffs
> comments for the packfiles as well.

Mostly. :)

My one remaining complaint is that the bitmap code may hold on to a
dangling pointer to a packed_git after this series.

I think that is part of a larger problem, though, which is that the
bitmap code's globals need to be part of the struct raw_object_store.
I think this can already cause problems before your series if we were to
try to use bitmaps in both a superproject and a submodule in the same
process, though I think we'd at least hit the "ignoring extra bitmap
file" code path in open_pack_bitmap_1(). So right now it's an annoyance,
but after your series it becomes a potential segfault.

-Peff


[PATCH v2 0/4] Fix mem leaks of recent object store conversions.

2018-05-10 Thread Stefan Beller
This series replaces the two commits that were queued on 
sb/object-store-replace,
fixing memory leaks that were recently introduced.

Compared to v1, I merged the two independent series from yesterday,
rewrote the commit message to clear up Junios confusion and addresses Peffs
comments for the packfiles as well.
Also added another free to free the oidmap for the replaces themselves.

Thanks,
Stefan

Stefan Beller (4):
  packfile: close and free packs upon releasing an object store
  packfile.h: remove all extern keywords
  object.c: free replace map in raw_object_store_clear
  replace-object.c: remove the_repository from prepare_replace_object

 object.c |  7 +++--
 packfile.c   | 13 
 packfile.h   | 79 
 replace-object.c |  2 +-
 4 files changed, 58 insertions(+), 43 deletions(-)

-- 
2.17.0.255.g8bfb7c0704



Re: [PATCH v2 0/4] Finish the conversion from die("BUG: ...") to BUG()

2018-05-07 Thread Jeff King
On Wed, May 02, 2018 at 11:38:13AM +0200, Johannes Schindelin wrote:

> The BUG() macro was introduced in this patch series:
> https://public-inbox.org/git/20170513032414.mfrwabt4hovuj...@sigill.intra.peff.net
> 
> The second patch in that series converted one caller from die("BUG: ")
> to use the BUG() macro.
> 
> It seems that there was no concrete plan to address the same issue in
> the rest of the code base.

I had a plan; it was that people would convert these as they touched the
relevant areas. :)

I'm happy to see a mass-conversion, though.

> For that reason, the commit message contains the precise Unix shell
> invocation (GNU sed semantics, not BSD sed ones, because I know that the
> Git maintainer as well as the author of the patch introducing BUG() both
> use Linux and not macOS or any other platform that would offer a BSD
> sed). It should be straight-forward to handle merge
> conflicts/non-applying patches by simply re-running that command.

I suspect this could have been done with coccinelle, but it's definitely
not worth going back to re-do it now.

> Changes since v2:
> 
> - Avoided the entire discussion about the previous 2/6 (now dropped)
>   that prepared t1406 to handle SIGABRT by side-stepping the issue: the
>   ref-store test helper will no longer call abort() in BUG() calls but
>   exit with exit code 99 instead.

I actually think this should be a runtime flag in the environment, like
GIT_BUG_EXIT_CODE (and if not set, continue to abort). That can help if
you come across a BUG() in real Git code, but for some reason your
environment makes aborting a pain. For example, maybe you're debugging a
racy BUG() and don't want to generate a bunch of coredumps.

Or here's an interesting related case I came across a few months ago.
t6210 has a test that's known to segfault due to stack exhaustion. It's
marked test_expect_failure, so all is good, right?  Normally, yes, but
when I run the test suite inside our local Jenkins setup, it detects a
segfault in _any_ child process of the test runner and aborts the whole
thing. This is great as a belt-and-suspenders if we miss an unexpected
segfault, but is obviously annoying in this case.

Triggering BUG()s in the test suite, even inside an expect_failure,
would introduce the same headache. It would be nice if I could just do:

  GIT_BUG_EXIT_CODE=134 make test

to avoid it.  Possibly expect_failure should even set that
automatically.

I also suspect that nobody really needs to set a specific exit code.
Using 134 is enough to avoid all of the unpleasantness of SIGABRT, but
still enough to trigger test_must_fail to distinguish it from a non-BUG
death. The callers that intentionally trigger bugs probably ought to be
using test_expect_code to make sure they are hitting a BUG() and not
some other death, anyway.

So we could probably simplify it to something like this:

diff --git a/usage.c b/usage.c
index cdd534c9df..50651bed40 100644
--- a/usage.c
+++ b/usage.c
@@ -221,7 +221,11 @@ static NORETURN void BUG_vfl(const char *file, int line, 
const char *fmt, va_lis
snprintf(prefix, sizeof(prefix), "BUG: ");
 
vreportf(prefix, fmt, params);
-   abort();
+
+   if (git_env_bool("GIT_BUG_ABORT"), 1)
+   abort();
+   else
+   exit(134);
 }
 
 #ifdef HAVE_VARIADIC_MACROS

-Peff


[PATCH v2 0/4] Finish the conversion from die("BUG: ...") to BUG()

2018-05-02 Thread Johannes Schindelin
The BUG() macro was introduced in this patch series:
https://public-inbox.org/git/20170513032414.mfrwabt4hovuj...@sigill.intra.peff.net

The second patch in that series converted one caller from die("BUG: ")
to use the BUG() macro.

It seems that there was no concrete plan to address the same issue in
the rest of the code base.

This patch series tries to do that.

Note: I would be very surprised if the monster commit 3/4 ("Replace all
die("BUG: ...") calls by BUG() ones") would apply cleanly on `pu` (I
develop this on top of `master`).

For that reason, the commit message contains the precise Unix shell
invocation (GNU sed semantics, not BSD sed ones, because I know that the
Git maintainer as well as the author of the patch introducing BUG() both
use Linux and not macOS or any other platform that would offer a BSD
sed). It should be straight-forward to handle merge
conflicts/non-applying patches by simply re-running that command.

Changes since v2:

- Avoided the entire discussion about the previous 2/6 (now dropped)
  that prepared t1406 to handle SIGABRT by side-stepping the issue: the
  ref-store test helper will no longer call abort() in BUG() calls but
  exit with exit code 99 instead.

- As a consequence, I could fold 3/6 into 4/6 (i.e. *not* special-case
  the refs/*.c code but do one wholesale die("BUG: ...") -> BUG()
  conversion).

- Further consequence: 1/6, which added handling for ok=sigabrt to
  test_must_fail, was dropped.

- I also used the opportunity to explain in the commit message of the last
  commit why the localization was dropped from one die(_("BUG: ...")) call.


Johannes Schindelin (4):
  test-tool: help verifying BUG() code paths
  run-command: use BUG() to report bugs, not die()
  Replace all die("BUG: ...") calls by BUG() ones
  Convert remaining die*(BUG) messages

 apply.c  |  4 ++--
 archive-tar.c|  2 +-
 attr.c   | 10 +-
 blame.c  |  2 +-
 builtin/am.c | 20 +--
 builtin/branch.c |  2 +-
 builtin/cat-file.c   |  4 ++--
 builtin/clone.c  |  2 +-
 builtin/commit.c |  2 +-
 builtin/config.c |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/init-db.c|  2 +-
 builtin/ls-files.c   |  2 +-
 builtin/notes.c  |  8 
 builtin/pack-objects.c   |  4 ++--
 builtin/pull.c   |  2 +-
 builtin/receive-pack.c   |  2 +-
 builtin/replace.c|  2 +-
 builtin/update-index.c   |  2 +-
 bulk-checkin.c   |  2 +-
 color.c  |  4 ++--
 column.c |  2 +-
 config.c | 12 +--
 date.c   |  2 +-
 diff.c   | 12 +--
 dir-iterator.c   |  2 +-
 git-compat-util.h|  2 +-
 grep.c   | 16 +++
 http.c   |  8 
 imap-send.c  |  2 +-
 lockfile.c   |  2 +-
 mailinfo.c   |  2 +-
 merge-recursive.c| 12 +--
 notes-merge.c|  4 ++--
 pack-bitmap-write.c  |  2 +-
 pack-bitmap.c|  6 +++---
 pack-objects.c   |  2 +-
 packfile.c   |  6 +++---
 pathspec.c   | 12 +--
 pkt-line.c   |  2 +-
 prio-queue.c |  2 +-
 ref-filter.c |  4 ++--
 refs.c   | 34 
 refs/files-backend.c | 20 +--
 refs/iterator.c  |  6 +++---
 refs/packed-backend.c| 16 +++
 refs/ref-cache.c |  2 +-
 remote.c |  2 +-
 revision.c   |  4 ++--
 run-command.c| 33 ++-
 setup.c  |  4 ++--
 sha1-lookup.c|  2 +-
 sha1-name.c  |  4 ++--
 shallow.c|  6 +++---
 sigchain.c   |  2 +-
 strbuf.c |  4 ++--
 submodule.c  |  8 
 t/helper/test-example-decorate.c | 16 +++
 t/helper/test-tool.c |  2 ++
 tmp-objdir.c |  2 +-
 trailer.c|  6 +++---
 transport.c  |  4 ++--
 unpack-trees.c   |  2 +-
 usage.c  |  5 +
 vcs-svn/fast_export.c|  6 --
 worktree.c   |  2 +-
 wrapper.c|  4 ++--
 wt-status.c  

[PATCH v2 0/4] rebase -i: avoid stale "# This is a combination of" in commit messages

2018-04-20 Thread Johannes Schindelin
Eric Sunshine pointed out that I had such a commit message in
https://public-inbox.org/git/CAPig+cRrS0_nYJJY=o6cbov630snqhpv5qgrqdd8mw-syzn...@mail.gmail.com/
and I went on a hunt to figure out how the heck this happened.

Turns out that if there is a fixup/squash chain where the *last* command
fails with merge conflicts, and we either --skip ahead or resolve the
conflict to a clean tree and then --continue, our code does not do a
final cleanup.

Contrary to my initial gut feeling, this bug was not introduced by my
rewrite in C of the core parts of rebase -i, but it looks to me as if
that bug was with us for a very long time (at least the --skip part).

The developer (read: user of rebase -i) in me says that we would want to
fast-track this, but the author of rebase -i in me says that we should
be cautious and cook this in `next` for a while.

Fixes since v1:

- Using test_i18ngrep instead of grep, because "This is a combination of
   commits" is marked for translation.

- Added a patch to actually fix `rebase -i` when building with
  GETTEXT_POISON, because we used to assume that numbers are encoded as
  ASCII so that we can increment it when writing the next commit message
  in the fixup/squash chain. This also seems to be a long-standing bug
  that has been with us since the the beginning of the localization of
  rebase -i's commit messages.

- The test case now starts with test_when_finished "test_might_fail git rebase
  --abort" to be allow for failing more gently.

- Fixed grammar of 2/3 (now 3/4): thanks, Eric!

- Fixed the description of the new test case (it purported to test --continue,
  but it really tests --skip).


Johannes Schindelin (4):
  rebase -i: demonstrate bugs with fixup!/squash! commit messages
  rebase -i: Handle "combination of  commits" with GETTEXT_POISON
  sequencer: leave a tell-tale when a fixup/squash failed
  rebase --skip: clean up commit message after a failed fixup/squash

 sequencer.c| 93 --
 t/t3418-rebase-continue.sh | 22 +
 2 files changed, 92 insertions(+), 23 deletions(-)


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
Published-As: 
https://github.com/dscho/git/releases/tag/clean-msg-after-fixup-continue-v2
Fetch-It-Via: git fetch https://github.com/dscho/git 
clean-msg-after-fixup-continue-v2

Interdiff vs v1:
 diff --git a/sequencer.c b/sequencer.c
 index f067b7b24c5..881503a6463 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -1350,19 +1350,18 @@ static int update_squash_messages(enum todo_command 
command,
eol = strchrnul(buf.buf, '\n');
if (buf.buf[0] != comment_line_char ||
(p += strcspn(p, "0123456789\n")) == eol)
 -  return error(_("unexpected 1st line of squash message:"
 - "\n\n\t%.*s"),
 -   (int)(eol - buf.buf), buf.buf);
 -  count = strtol(p, NULL, 10);
 -
 -  if (count < 1)
 -  return error(_("invalid 1st line of squash message:\n"
 - "\n\t%.*s"),
 -   (int)(eol - buf.buf), buf.buf);
 +  count = -1;
 +  else
 +  count = strtol(p, NULL, 10);
  
strbuf_addf(, "%c ", comment_line_char);
 -  strbuf_addf(,
 -  _("This is a combination of %d commits."), ++count);
 +  if (count < 1)
 +  strbuf_addf(, _("This is a combination of "
 + "several commits."));
 +  else
 +  strbuf_addf(,
 +  _("This is a combination of %d commits."),
 +  ++count);
strbuf_splice(, 0, eol - buf.buf, header.buf, header.len);
strbuf_release();
} else {
 @@ -1405,13 +1404,22 @@ static int update_squash_messages(enum todo_command 
command,
if (command == TODO_SQUASH) {
unlink(rebase_path_fixup_msg());
strbuf_addf(, "\n%c ", comment_line_char);
 -  strbuf_addf(, _("This is the commit message #%d:"), count);
 +  if (count < 2)
 +  strbuf_addf(, _("This is the next commit "
 +  "message:"));
 +  else
 +  strbuf_addf(, _("This is the commit message #%d:"),
 +  count);
strbuf_addstr(, "\n\n");
strbuf_addstr(, body);
} else if (command == TODO_FIXUP) {
strbuf_addf(, "\n%c ", comment_line_char);
 -  strbuf_addf(, _("The commit message #%d will be skipped:"),
 -  count);
 +  if (count < 2)
 +  strbuf_addf(, _("The next commit message will be "
 +  

Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-11 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/7/2018 2:40 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
>>
>> [...]
>>> On the Linux repository, performance tests were run for the following
>>> command:
>>>
>>>  git log --graph --oneline -1000
>>>
>>>  Before: 0.92s
>>>  After:  0.66s
>>>  Rel %: -28.3%
>>>
>>> Adding '-- kernel/' to the command requires loading the root tree
>>> for every commit that is walked. There was no measureable performance
>>> change as a result of this patch.
>>
>> In the "Git Merge contributor summit notes" [1] one can read that:
>>
>>> - VSTS adds bloom filters to know which paths have changed on the commit
>>> - tree-same check in the bloom filter is fast; speeds up file history checks
>>> - if the file history is _very_ sparse, then bloom filter is useful
>>
>> Could this method speed up also the second case mentioned here?  Can
>> anyone explain how this "path-changed bloom filter" works in VSTS?
>>
>
> The idea is simple: for every commit, store a Bloom filter containing
> the list of paths that are not TREESAME against the first parent. (A
> slight detail: have a max cap on the number of paths, and store simply
> "TOO_BIG" for commits with too many diffs.)

Isn't Bloom filter with all bits set to 1 equivalent of "too big"? It
would answer each query with "possibly in set, check it".

> When performing 'git log -- path' queries, the most important detail
> for considering how to advance the walk is whether the commit is
> TREESAME to its first parent. For a deep path in a large repo, this is
> almost always true. When a Bloom filter says "TREESAME" (i.e. "this
> path is not in my set") it is always correct, so we can set the
> treesame bit and continue without walking any trees. When a Bloom
> filter says "MAYBE NOT TREESAME" (i.e. "this path is probably in my
> set") you only need to do the same work as before: walk the trees to
> compare against your first parent.
>
> If a Bloom filter has a false-positive rate of X%, then you can
> possibly drop your number of tree comparisons by (100-X)%. This is
> very important for large repos where some paths were changed only ten
> times or so, the full graph needs to be walked and it is helpful to
> avoid parsing too many trees.

So this works only for exact file pathnames, or does checking for
subdirectory also work?  I guess it cannot work for globbing patterns
(like "git log -- '*.c'").

I guess that it speeds up "git log -- ", but also "git blame
".


Sidenote: I have stumbled upon https://github.com/efficient/ffbf project
(fast-forward Bllom filters - for exact matching of large number of
patterns), where they invent cache-efficient version of Bloom filter
algorithm.  The paper that the project is based on also might be
interesting, if only because it points to other improvements of classic
Bloom filters.

>> Could we add something like this to the commit-graph file? 
>
> I'm not sure if it is necessary for client-side operations, but it is
> one of the reasons the commit-graph file has the idea of an "optional
> chunk". It could be added to the file format (without changing version
> numbers) and be ignored by clients that don't understand it. I could
> also be gated by a config setting for computing them. My guess is that
> only server-side operations will need the added response time, and can
> bear the cost of computing them when writing the commit-graph
> file. Clients are less likely to be patient waiting for a lot of diff
> calculations.

So the problem is performing large amount of diff calculations.  I
wonder if it would be possible to add Bloom filters incrementally, when
for some reason we need to calculate diff anyway.

>
> If we add commit-graph file downloads to the protocol, then the server
> could do this computation and send the data to all clients. But that
> would be "secondary" information that maybe clients want to verify,
> which is as difficult as computing it themselves.

Can you tell us how much work (how much time) must spend the server to
calculate those per-commit "files changed" Bloom fillters?

Best,
--
Jakub Narębski


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-09 Thread Stefan Beller
On Mon, Apr 9, 2018 at 6:15 AM, Derrick Stolee  wrote:

> I don't understand how folding the patches makes the correctness clearer,
> since the rename (1/4) is checked by the compiler and the Coccinelle script
> (3/4) only works after that rename is complete.
>
> The only thing I can imagine is that it makes smaller patch emails, since
> there is only one large patch instead of two. In this case, I prefer to make
> changes that are easier to check by automation (compiler and coccinelle).
>

I prefer the offloading to automation, too.
So I would vouch for keeping it as-is.


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-09 Thread Derrick Stolee

On 4/8/2018 7:18 PM, Junio C Hamano wrote:

Jeff King  writes:


If I were doing it myself, I probably would have folded patches 1 and 3
together. They are touching all the same spots, and it would be an error
for any case converted in patch 1 to not get converted in patch 3. I'm
assuming you caught them all due to Coccinelle, though IMHO it is
somewhat overkill here. By folding them together the compiler could tell
you which spots you missed.

Yeah, that approach would probably be a more sensible way to assure
the safety/correctness of the result to readers better.


I don't understand how folding the patches makes the correctness 
clearer, since the rename (1/4) is checked by the compiler and the 
Coccinelle script (3/4) only works after that rename is complete.


The only thing I can imagine is that it makes smaller patch emails, 
since there is only one large patch instead of two. In this case, I 
prefer to make changes that are easier to check by automation (compiler 
and coccinelle).


However, I will defer to the experts in this regard. If a v3 is 
necessary, then I will fold the commits together.


Thanks,
-Stolee


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-08 Thread Junio C Hamano
Jeff King  writes:

> If I were doing it myself, I probably would have folded patches 1 and 3
> together. They are touching all the same spots, and it would be an error
> for any case converted in patch 1 to not get converted in patch 3. I'm
> assuming you caught them all due to Coccinelle, though IMHO it is
> somewhat overkill here. By folding them together the compiler could tell
> you which spots you missed.

Yeah, that approach would probably be a more sensible way to assure
the safety/correctness of the result to readers better.

>
> And going forward, I doubt it is going to be a common error for people
> to use maybe_tree directly. Between the name and the warning comment,
> you'd have to really try to shoot yourself in the foot with it. The
> primary concern was catching people using the existing "tree" name,
> whose semantics changed.

Yup.


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-07 Thread Derrick Stolee

On 4/7/2018 2:40 PM, Jakub Narebski wrote:

Derrick Stolee  writes:

[...]

On the Linux repository, performance tests were run for the following
command:

 git log --graph --oneline -1000

 Before: 0.92s
 After:  0.66s
 Rel %: -28.3%

Adding '-- kernel/' to the command requires loading the root tree
for every commit that is walked. There was no measureable performance
change as a result of this patch.

In the "Git Merge contributor summit notes" [1] one can read that:


- VSTS adds bloom filters to know which paths have changed on the commit
- tree-same check in the bloom filter is fast; speeds up file history checks
- if the file history is _very_ sparse, then bloom filter is useful

Could this method speed up also the second case mentioned here?  Can
anyone explain how this "path-changed bloom filter" works in VSTS?
   


The idea is simple: for every commit, store a Bloom filter containing 
the list of paths that are not TREESAME against the first parent. (A 
slight detail: have a max cap on the number of paths, and store simply 
"TOO_BIG" for commits with too many diffs.)


When performing 'git log -- path' queries, the most important detail for 
considering how to advance the walk is whether the commit is TREESAME to 
its first parent. For a deep path in a large repo, this is almost always 
true. When a Bloom filter says "TREESAME" (i.e. "this path is not in my 
set") it is always correct, so we can set the treesame bit and continue 
without walking any trees. When a Bloom filter says "MAYBE NOT TREESAME" 
(i.e. "this path is probably in my set") you only need to do the same 
work as before: walk the trees to compare against your first parent.


If a Bloom filter has a false-positive rate of X%, then you can possibly 
drop your number of tree comparisons by (100-X)%. This is very important 
for large repos where some paths were changed only ten times or so, the 
full graph needs to be walked and it is helpful to avoid parsing too 
many trees.



Could we add something like this to the commit-graph file? 


I'm not sure if it is necessary for client-side operations, but it is 
one of the reasons the commit-graph file has the idea of an "optional 
chunk". It could be added to the file format (without changing version 
numbers) and be ignored by clients that don't understand it. I could 
also be gated by a config setting for computing them. My guess is that 
only server-side operations will need the added response time, and can 
bear the cost of computing them when writing the commit-graph file. 
Clients are less likely to be patient waiting for a lot of diff 
calculations.


If we add commit-graph file downloads to the protocol, then the server 
could do this computation and send the data to all clients. But that 
would be "secondary" information that maybe clients want to verify, 
which is as difficult as computing it themselves.


Thanks,

-Stolee



Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

[...]
> On the Linux repository, performance tests were run for the following
> command:
>
> git log --graph --oneline -1000
>
> Before: 0.92s
> After:  0.66s
> Rel %: -28.3%
>
> Adding '-- kernel/' to the command requires loading the root tree
> for every commit that is walked. There was no measureable performance
> change as a result of this patch.

In the "Git Merge contributor summit notes" [1] one can read that:

> - VSTS adds bloom filters to know which paths have changed on the commit
> - tree-same check in the bloom filter is fast; speeds up file history checks
> - if the file history is _very_ sparse, then bloom filter is useful

Could this method speed up also the second case mentioned here?  Can
anyone explain how this "path-changed bloom filter" works in VSTS?

Could we add something like this to the commit-graph file?

[1]: 
https://public-inbox.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/

Best regards,
--
Jakub Narębski


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-06 Thread Stefan Beller
On Fri, Apr 6, 2018 at 12:21 PM, Jeff King  wrote:
> On Fri, Apr 06, 2018 at 07:09:30PM +, Derrick Stolee wrote:
>
>> Derrick Stolee (4):
>>   treewide: rename tree to maybe_tree
>>   commit: create get_commit_tree() method
>>   treewide: replace maybe_tree with accessor methods
>>   commit-graph: lazy-load trees for commits
>
> I gave this only a cursory read, but it addresses my concern from the
> previous round.

Same here.

Thanks,
Stefan


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-06 Thread Derrick Stolee

On 4/6/2018 3:21 PM, Jeff King wrote:

On Fri, Apr 06, 2018 at 07:09:30PM +, Derrick Stolee wrote:


Derrick Stolee (4):
   treewide: rename tree to maybe_tree
   commit: create get_commit_tree() method
   treewide: replace maybe_tree with accessor methods
   commit-graph: lazy-load trees for commits

I gave this only a cursory read, but it addresses my concern from the
previous round.

If I were doing it myself, I probably would have folded patches 1 and 3
together. They are touching all the same spots, and it would be an error
for any case converted in patch 1 to not get converted in patch 3. I'm
assuming you caught them all due to Coccinelle, though IMHO it is
somewhat overkill here. By folding them together the compiler could tell
you which spots you missed.

And going forward, I doubt it is going to be a common error for people
to use maybe_tree directly. Between the name and the warning comment,
you'd have to really try to shoot yourself in the foot with it. The
primary concern was catching people using the existing "tree" name,
whose semantics changed.

All that said, I'm fine with having it done this way, too.


Thanks. As a double-check that I caught all of the 'maybe_tree' 
accesses, I ran the following:


$ git grep maybe_tree | grep -v get_commit_tree
commit-graph.c: item->maybe_tree = NULL;
commit-graph.c: c->maybe_tree = lookup_tree();
commit-graph.c: return c->maybe_tree;
commit-graph.c: if (c->maybe_tree)
commit-graph.c: return c->maybe_tree;
commit.c:   if (commit->maybe_tree || !commit->object.parsed)
commit.c:   return commit->maybe_tree;
commit.c:   item->maybe_tree = lookup_tree();
commit.h:   struct tree *maybe_tree;
contrib/coccinelle/commit.cocci:- >maybe_tree->object.oid
contrib/coccinelle/commit.cocci:- c->maybe_tree->object.oid.hash
contrib/coccinelle/commit.cocci:- c->maybe_tree
contrib/coccinelle/commit.cocci:+ c->maybe_tree = s
contrib/coccinelle/commit.cocci:+ return c->maybe_tree;
merge-recursive.c:  commit->maybe_tree = tree;

Thanks,
-Stolee


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-06 Thread Jeff King
On Fri, Apr 06, 2018 at 07:09:30PM +, Derrick Stolee wrote:

> Derrick Stolee (4):
>   treewide: rename tree to maybe_tree
>   commit: create get_commit_tree() method
>   treewide: replace maybe_tree with accessor methods
>   commit-graph: lazy-load trees for commits

I gave this only a cursory read, but it addresses my concern from the
previous round.

If I were doing it myself, I probably would have folded patches 1 and 3
together. They are touching all the same spots, and it would be an error
for any case converted in patch 1 to not get converted in patch 3. I'm
assuming you caught them all due to Coccinelle, though IMHO it is
somewhat overkill here. By folding them together the compiler could tell
you which spots you missed.

And going forward, I doubt it is going to be a common error for people
to use maybe_tree directly. Between the name and the warning comment,
you'd have to really try to shoot yourself in the foot with it. The
primary concern was catching people using the existing "tree" name,
whose semantics changed.

All that said, I'm fine with having it done this way, too.

-Peff


[PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-06 Thread Derrick Stolee
There are several commit-graph walks that require loading many commits
but never walk the trees reachable from those commits. However, the
current logic in parse_commit() requires the root tree to be loaded.
This only uses lookup_tree(), but when reading commits from the commit-
graph file, the hashcpy() to load the root tree hash and the time spent
checking the object cache take more time than parsing the rest of the
commit.

In this patch series, all direct references to accessing the 'tree'
member of struct commit are replaced instead by one of the following
methods:

struct tree *get_commit_tree(struct commit *)
struct object_id *get_commit_tree_oid(struct commit *)

This replacement was assisted by a Coccinelle script, but the 'tree'
member is overloaded in other types, so we first rename the 'tree'
member to 'maybe_tree' and use the compiler to ensure we caught all
examples. Then, contrib/coccinelle/commit.cocci generates the patch
to replace all accessors of 'maybe_tree' to the methods above.

After all access is restricted to use these methods, we can then
change the postcondition of parse_commit_in_graph() to allow 'maybe_tree'
to be NULL. If the tree is accessed later, we can load the tree's
OID from the commit-graph in constant time and perform the lookup_tree().

On the Linux repository, performance tests were run for the following
command:

git log --graph --oneline -1000

Before: 0.92s
After:  0.66s
Rel %: -28.3%

Adding '-- kernel/' to the command requires loading the root tree
for every commit that is walked. There was no measureable performance
change as a result of this patch.

This patch series depends on v7 of ds/commit-graph.

Derrick Stolee (4):
  treewide: rename tree to maybe_tree
  commit: create get_commit_tree() method
  treewide: replace maybe_tree with accessor methods
  commit-graph: lazy-load trees for commits

 blame.c | 18 +-
 builtin/checkout.c  | 18 --
 builtin/diff.c  |  2 +-
 builtin/fast-export.c   |  6 +++---
 builtin/log.c   |  4 ++--
 builtin/reflog.c|  2 +-
 commit-graph.c  | 27 +++
 commit-graph.h  |  7 +++
 commit.c| 18 +-
 commit.h|  5 -
 contrib/coccinelle/commit.cocci | 30 ++
 fsck.c  |  8 +---
 http-push.c |  2 +-
 line-log.c  |  4 ++--
 list-objects.c  | 10 +-
 log-tree.c  |  6 +++---
 merge-recursive.c   |  5 +++--
 notes-merge.c   |  9 +
 packfile.c  |  2 +-
 pretty.c|  5 +++--
 ref-filter.c|  2 +-
 revision.c  |  8 
 sequencer.c | 12 ++--
 sha1_name.c |  2 +-
 tree.c  |  4 ++--
 walker.c|  2 +-
 26 files changed, 152 insertions(+), 66 deletions(-)
 create mode 100644 contrib/coccinelle/commit.cocci

-- 
2.17.0



[PATCH v2 0/4] Colorize push errors

2018-04-05 Thread Johannes Schindelin
To help users discern large chunks of white text (when the push
succeeds) from large chunks of white text (when the push fails), let's
add some color to the latter.

The original contribution came in via Pull Request from the Git for
Windows project:

https://github.com/git-for-windows/git/pull/1429

Changes since v1:

- implemented want_color_fd() as suggested by Junio

- added color.{advice,push,transport} to be able to force this thing on

- added a test

- added documentation to `git config`'s man page

- fixed a bug where the push config looked at color.advice.

- fixed a bug where `color.advise.` was not parsed because
  git_default_config() would fail to hand `color.advise.*` settings to
  git_default_advice_config()

- wrapped a couple of too-long lines

- changed the strstr() hack to detect push failures to use push_had_errors()
  instead

- renamed the transport.color.* settings to color.transport.* (to make them
  consistent with all the other color.. settings)


Johannes Schindelin (3):
  color: introduce support for colorizing stderr
  Add a test to verify that push errors are colorful
  Document the new color.* settings to colorize push errors/hints

Ryan Dammrose (1):
  push: colorize errors

 Documentation/config.txt   | 28 
 advice.c   | 49 ++--
 builtin/push.c | 44 -
 color.c| 20 +++-
 color.h|  4 ++-
 config.c   |  2 +-
 t/t5541-http-push-smart.sh | 18 ++
 transport.c| 67 +-
 8 files changed, 217 insertions(+), 15 deletions(-)


base-commit: 468165c1d8a442994a825f3684528361727cd8c0
Published-As: https://github.com/dscho/git/releases/tag/colorize-push-errors-v2
Fetch-It-Via: git fetch https://github.com/dscho/git colorize-push-errors-v2

Interdiff vs v1:
 diff --git a/Documentation/config.txt b/Documentation/config.txt
 index 4e0cff87f62..40e3828b85f 100644
 --- a/Documentation/config.txt
 +++ b/Documentation/config.txt
 @@ -1088,6 +1088,16 @@ clean.requireForce::
A boolean to make git-clean do nothing unless given -f,
-i or -n.   Defaults to true.
  
 +color.advice::
 +  A boolean to enable/disable color in hints (e.g. when a push
 +  failed, see `advice.*` for a list).  May be set to `always`,
 +  `false` (or `never`) or `auto` (or `true`), in which case colors
 +  are used only when the error output goes to a terminal. If
 +  unset, then the value of `color.ui` is used (`auto` by default).
 +
 +color.advice.advice::
 +  Use customized color for hints.
 +
  color.branch::
A boolean to enable/disable color in the output of
linkgit:git-branch[1]. May be set to `always`,
 @@ -1190,6 +1200,15 @@ color.pager::
A boolean to enable/disable colored output when the pager is in
use (default is true).
  
 +color.push::
 +  A boolean to enable/disable color in push errors. May be set to
 +  `always`, `false` (or `never`) or `auto` (or `true`), in which
 +  case colors are used only when the error output goes to a terminal.
 +  If unset, then the value of `color.ui` is used (`auto` by default).
 +
 +color.push.error::
 +  Use customized color for push errors.
 +
  color.showBranch::
A boolean to enable/disable color in the output of
linkgit:git-show-branch[1]. May be set to `always`,
 @@ -1218,6 +1237,15 @@ color.status.::
status short-format), or
`unmerged` (files which have unmerged changes).
  
 +color.transport::
 +  A boolean to enable/disable color when pushes are rejected. May be
 +  set to `always`, `false` (or `never`) or `auto` (or `true`), in which
 +  case colors are used only when the error output goes to a terminal.
 +  If unset, then the value of `color.ui` is used (`auto` by default).
 +
 +color.transport.rejected::
 +  Use customized color when a push was rejected.
 +
  color.ui::
This variable determines the default value for variables such
as `color.diff` and `color.grep` that control the use of color
 diff --git a/advice.c b/advice.c
 index 42460877ef6..2121c1811fd 100644
 --- a/advice.c
 +++ b/advice.c
 @@ -43,7 +43,7 @@ static int parse_advise_color_slot(const char *slot)
  
  static const char *advise_get_color(enum color_advice ix)
  {
 -  if (want_color(advice_use_color))
 +  if (want_color_stderr(advice_use_color))
return advice_colors[ix];
return "";
  }
 @@ -87,8 +87,10 @@ void advise(const char *advice, ...)
  
for (cp = buf.buf; *cp; cp = np) {
np = strchrnul(cp, '\n');
 -  fprintf(stderr, _("%shint: %.*s%s\n"), 
advise_get_color(ADVICE_COLOR_HINT),
 -  (int)(np - cp), cp, 
advise_get_color(ADVICE_COLOR_RESET));
 +  fprintf(stderr, _("%shint: %.*s%s\n"),
 +  

Re: [PATCH v2 0/4] Teach `--default` to `git-config(1)`

2018-03-26 Thread Jeff King
On Fri, Mar 23, 2018 at 08:55:52PM -0400, Taylor Blau wrote:

> Attached is 'v2' of my patch series to add a `--default` option to `git 
> config`.

Thanks, this is looking much better. I did come up with a few
suggestions/fixes. Some minor which would make v3 easy, and some that
would require expanding the series quite a bit. I'm not sure if those
bigger ones are worth doing or not. :)

> Thank you again for all of your feedback, and my apologies that it has taken 
> me
> so long to respond. I was out of the office last week, and have been quite 
> busy
> since catching up on Git LFS-related issues.

No problem. Open source moves at the pace of patch submissions.

-Peff


[PATCH v2 0/4] Teach `--default` to `git-config(1)`

2018-03-23 Thread Taylor Blau
Hi,

Attached is 'v2' of my patch series to add a `--default` option to `git config`.

I have addressed the following concerns since the first iteration:

  1. Better motivation in 'builtin/config: introduce `--default`'. (cc: Peff,
  Eric)

  2. Fixed a typo in the message of 'builtin/config: introduce `--default`'.
  (cc: Eric).

  3. Changed the first mention of type specifiers in `git config`'s
  documentation from a paragraph to a brief list (cc: Peff, Junio).

  4. Changed the signatures of some new functions, particularly in 'config.c:
  introduce 'git_config_color' to parse ANSI colors'.

I have thus-far avoided mentioning any specific deprecation of `--get-color` and
`--get-colorbool`, but would not be opposed to changing that in this series
before queuing.

Thank you again for all of your feedback, and my apologies that it has taken me
so long to respond. I was out of the office last week, and have been quite busy
since catching up on Git LFS-related issues.


Thanks,
Taylor

Taylor Blau (4):
  builtin/config: introduce `--default`
  Documentation: list all type specifiers in config prose
  config.c: introduce 'git_config_color' to parse ANSI colors
  builtin/config: introduce `--color` type specifier

 Documentation/git-config.txt |  38 +++---
 builtin/config.c |  30 
 config.c |  10 +++
 config.h |   1 +
 t/t1300-repo-config.sh   |  24 +++
 t/t1310-config-default.sh| 131 +++
 6 files changed, 226 insertions(+), 8 deletions(-)
 create mode 100755 t/t1310-config-default.sh

--
2.16.2.440.gc6284da4f



[PATCH v2 0/4] nd/parseopt-completion fixups

2018-03-06 Thread Nguyễn Thái Ngọc Duy
v2 fixes the comments from v1 and adds to new patches to improve
_git_notes() (sorry I couldn't resist)

Interdiff with what's on 'pu'

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 5f7495cda3..2e30950299 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1815,7 +1815,7 @@ _git_name_rev ()
 
 _git_notes ()
 {
-   local subcommands='add append copy edit list prune remove show'
+   local subcommands='add append copy edit get-ref list merge prune remove 
show'
local subcommand="$(__git_find_on_cmdline "$subcommands")"
 
case "$subcommand,$cur" in
@@ -1832,18 +1832,15 @@ _git_notes ()
;;
esac
;;
-   add,--reuse-message=*|append,--reuse-message=*|\
-   add,--reedit-message=*|append,--reedit-message=*)
+   *,--reuse-message=*|*,--reedit-message=*)
__git_complete_refs --cur="${cur#*=}"
;;
-   prune,--*)
-   __gitcomp_builtin notes_prune
-   ;;
-   prune,*)
-   ;;
*,--*)
__gitcomp_builtin notes_$subcommand
;;
+   prune,*|get-ref,*)
+   # this command does not take a ref, do not complete it
+   ;;
*)
case "$prev" in
-m|-F)
@@ -1956,6 +1953,7 @@ _git_rebase ()
--autostash --no-autostash
--verify --no-verify
--keep-empty --root --force-rebase --no-ff
+   --rerere-autoupdate
--exec
"
 

Nguyễn Thái Ngọc Duy (4):
  completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate
  completion: simplify _git_notes
  completion: complete --{reuse,reedit}-message= for all notes subcmds
  completion: more subcommands in _git_notes()

 contrib/completion/git-completion.bash | 25 -
 parse-options.h|  4 ++--
 rerere.h   |  3 +--
 3 files changed, 11 insertions(+), 21 deletions(-)

-- 
2.16.2.785.g429c04a1b9



Re: [PATCH v2 0/4] Delete ignore_env member in struct repository

2018-02-27 Thread Stefan Beller
On Tue, Feb 27, 2018 at 1:58 AM, Nguyễn Thái Ngọc Duy  wrote:
> v2 fixes the incorrect use of consecutive getenv() and adds a comment
> to clarify the role of old_gitdir
>
> Interdiff:
>
> diff --git a/environment.c b/environment.c
> index 95de419de8..47c6e31559 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -14,6 +14,7 @@
>  #include "fmt-merge-msg.h"
>  #include "commit.h"
>  #include "object-store.h"
> +#include "argv-array.h"
>
>  int trust_executable_bit = 1;
>  int trust_ctime = 1;
> @@ -148,18 +149,34 @@ static char *expand_namespace(const char *raw_namespace)
> return strbuf_detach(, NULL);
>  }
>
> +/* Wrapper of getenv() that returns a strdup value. This value is kept
> + * in argv to be freed later.
> + */

/* comment style, see also Erics reply */

I do not understand why we need to use an argv_array here?
I see that the push calls xstrdup and then puts it into the array.
So to me this looks like a clever way that we can easily free them all
at once using the predefined argv_array_clear() after calling
repo_set_gitdir.

That makes sense, though is confusing at first; I would have expected
a set_gitdir_args_clear() function that just frees all its strings, whether they
are NULL or not.

But given that we are clever with not pushing onto the argv_array in case the
getenv returns NULL, this seems to be less work in the usual case of no
env variables set.

Ok, I think I like it.

> +static const char *getenv_safe(struct argv_array *argv, const char *name)

I would have expected that we already have such a (or similar) helper
in wrapper.c, but it turns out we always take care of getenv manually,
and we do not have a wrapper yet.

So only the comment style remains as a nit.

Thanks,
Stefan


Re: [PATCH v2 0/4] Delete ignore_env member in struct repository

2018-02-27 Thread Eric Sunshine
On Tue, Feb 27, 2018 at 4:58 AM, Nguyễn Thái Ngọc Duy  wrote:
> v2 fixes the incorrect use of consecutive getenv() and adds a comment
> to clarify the role of old_gitdir
>
> diff --git a/environment.c b/environment.c
> @@ -148,18 +149,34 @@ static char *expand_namespace(const char *raw_namespace)
> +/* Wrapper of getenv() that returns a strdup value. This value is kept
> + * in argv to be freed later.
> + */

/*
 * Comment style.
 */

> +static const char *getenv_safe(struct argv_array *argv, const char *name)
> +{
> +   const char *value = getenv(name);
> +
> +   if (!value)
> +   return NULL;
> +
> +   argv_array_push(argv, value);
> +   return argv->argv[argv->argc - 1];
> +}
> +
>  void setup_git_env(const char *git_dir)
>  {
> const char *shallow_file;
> const char *replace_ref_base;
> struct set_gitdir_args args = { NULL };
> +   struct argv_array to_free = ARGV_ARRAY_INIT;

Cute. Another example[1] showing that renaming argv_array to something
else might be a good idea.

[1]: 
https://public-inbox.org/git/20180227062128.gg65...@aiede.svl.corp.google.com/

> -   args.shared_root = getenv(GIT_COMMON_DIR_ENVIRONMENT);
> -   args.object_dir = getenv(DB_ENVIRONMENT);
> -   args.graft_file = getenv(GRAFT_ENVIRONMENT);
> -   args.index_file = getenv(INDEX_ENVIRONMENT);
> -   args.alternate_db = getenv(ALTERNATE_DB_ENVIRONMENT);
> +   args.shared_root = getenv_safe(_free, GIT_COMMON_DIR_ENVIRONMENT);
> +   args.object_dir = getenv_safe(_free, DB_ENVIRONMENT);
> +   args.graft_file = getenv_safe(_free, GRAFT_ENVIRONMENT);
> +   args.index_file = getenv_safe(_free, INDEX_ENVIRONMENT);
> +   args.alternate_db = getenv_safe(_free, ALTERNATE_DB_ENVIRONMENT);
> repo_set_gitdir(the_repository, git_dir, );
> +   argv_array_clear(_free);
> diff --git a/repository.c b/repository.c
> @@ -48,6 +48,11 @@ void repo_set_gitdir(struct repository *repo,
> const char *gitfile = read_gitfile(root);
> +   /*
> +* repo->gitdir is saved because the caller could pass "root"
> +* that also points to repo->gitdir. We want to keep it alive
> +* until after xstrdup(root). Then we can free it.
> +*/
> char *old_gitdir = repo->gitdir;

Good. This comment makes the reasoning clear.

> repo->gitdir = xstrdup(gitfile ? gitfile : root);


[PATCH v2 0/4] Delete ignore_env member in struct repository

2018-02-27 Thread Nguyễn Thái Ngọc Duy
v2 fixes the incorrect use of consecutive getenv() and adds a comment
to clarify the role of old_gitdir

Interdiff:

diff --git a/environment.c b/environment.c
index 95de419de8..47c6e31559 100644
--- a/environment.c
+++ b/environment.c
@@ -14,6 +14,7 @@
 #include "fmt-merge-msg.h"
 #include "commit.h"
 #include "object-store.h"
+#include "argv-array.h"
 
 int trust_executable_bit = 1;
 int trust_ctime = 1;
@@ -148,18 +149,34 @@ static char *expand_namespace(const char *raw_namespace)
return strbuf_detach(, NULL);
 }
 
+/* Wrapper of getenv() that returns a strdup value. This value is kept
+ * in argv to be freed later.
+ */
+static const char *getenv_safe(struct argv_array *argv, const char *name)
+{
+   const char *value = getenv(name);
+
+   if (!value)
+   return NULL;
+
+   argv_array_push(argv, value);
+   return argv->argv[argv->argc - 1];
+}
+
 void setup_git_env(const char *git_dir)
 {
const char *shallow_file;
const char *replace_ref_base;
struct set_gitdir_args args = { NULL };
+   struct argv_array to_free = ARGV_ARRAY_INIT;
 
-   args.shared_root = getenv(GIT_COMMON_DIR_ENVIRONMENT);
-   args.object_dir = getenv(DB_ENVIRONMENT);
-   args.graft_file = getenv(GRAFT_ENVIRONMENT);
-   args.index_file = getenv(INDEX_ENVIRONMENT);
-   args.alternate_db = getenv(ALTERNATE_DB_ENVIRONMENT);
+   args.shared_root = getenv_safe(_free, GIT_COMMON_DIR_ENVIRONMENT);
+   args.object_dir = getenv_safe(_free, DB_ENVIRONMENT);
+   args.graft_file = getenv_safe(_free, GRAFT_ENVIRONMENT);
+   args.index_file = getenv_safe(_free, INDEX_ENVIRONMENT);
+   args.alternate_db = getenv_safe(_free, ALTERNATE_DB_ENVIRONMENT);
repo_set_gitdir(the_repository, git_dir, );
+   argv_array_clear(_free);
 
if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
check_replace_refs = 0;
diff --git a/repository.c b/repository.c
index 8f6386022f..c555dacad2 100644
--- a/repository.c
+++ b/repository.c
@@ -48,6 +48,11 @@ void repo_set_gitdir(struct repository *repo,
 const struct set_gitdir_args *o)
 {
const char *gitfile = read_gitfile(root);
+   /*
+* repo->gitdir is saved because the caller could pass "root"
+* that also points to repo->gitdir. We want to keep it alive
+* until after xstrdup(root). Then we can free it.
+*/
char *old_gitdir = repo->gitdir;
 
repo->gitdir = xstrdup(gitfile ? gitfile : root);

Nguyễn Thái Ngọc Duy (4):
  repository.c: move env-related setup code back to environment.c
  repository.c: delete dead functions
  sha1_file.c: move delayed getenv(altdb) back to setup_git_env()
  repository: delete ignore_env member

 cache.h|  2 +-
 environment.c  | 30 --
 object-store.h |  5 ++-
 object.c   |  1 +
 repository.c   | 84 --
 repository.h   | 21 +++--
 setup.c|  3 +-
 sha1_file.c|  6 +---
 8 files changed, 86 insertions(+), 66 deletions(-)

-- 
2.16.1.435.g8f24da2e1a



[PATCH v2 0/4] making test-suite tracing more useful

2017-12-08 Thread Jeff King
This series fixes some rough edges in the "-x" feature of the test
suite. With it, it should be possible to turn on tracing for CI runs.

This is mostly a repost of v1 at:

  
https://public-inbox.org/git/20171019210140.64lb52cqtgdh2...@sigill.intra.peff.net

which had some discussion, but wasn't picked up.

I fixed a few typos pointed out by reviewers, and I tried to summarize
the discussion around "magic descriptor 4" in the commit message of
patch 2.  My conclusion there was that none of the options is
particularly good.  The only thing thing that might make sense is making
"7" the magical descriptor. But given that "4" only had one troubling
call, I'm inclined to just leave it as-is and see if this ever comes up
again (especially if we start using "-x" in the Travis builds, then we'd
catch any problems).

  [1/4]: test-lib: silence "-x" cleanup under bash
  [2/4]: t5615: avoid re-using descriptor 4
  [3/4]: test-lib: make "-x" work with "--verbose-log"
  [4/4]: t/Makefile: introduce TEST_SHELL_PATH

 Makefile |  8 
 t/Makefile   |  6 --
 t/t5615-alternate-env.sh |  6 +++---
 t/test-lib.sh| 46 +-
 4 files changed, 48 insertions(+), 18 deletions(-)

-Peff


[PATCH v2 0/4] Fix issues with rename detection limits

2017-11-13 Thread Elijah Newren
This patchset fixes some issues around rename detection limits.  Changes
since the original submission[1]:
  * Patches 2 and 3 are swapped in order so as to not temporarily introduce
any bugs (even if only cosmetic)
  * Commit message fixups
  * Using uint64_t instead of double for holding multiplication of integers
  * Corrected format specifier for printing 64-bit ints.

[1] https://public-inbox.org/git/20171110173956.25105-1-new...@gmail.com/

Elijah Newren (4):
  sequencer: warn when internal merge may be suboptimal due to
renameLimit
  progress: fix progress meters when dealing with lots of work
  Remove silent clamp of renameLimit
  sequencer: show rename progress during cherry picks

 diff.c|  2 +-
 diffcore-rename.c | 15 ++-
 progress.c| 22 +++---
 progress.h|  8 
 sequencer.c   |  2 ++
 5 files changed, 24 insertions(+), 25 deletions(-)

-- 
2.15.0.44.gf995e411c7



[PATCH v2 0/4] bisect: assorted leak-fixes

2017-11-05 Thread Martin Ågren
This fixes a couple of leaks around `find_bisection(). The first patch
is new since v1 [1] to make the calling-convention of `find_bisection()`
less prone to misuse. Patch 2 is similar to v1, the only difference is
that the "while at it" has moved into patch 1. Patches 3 and 4 are
identical to v1.

Martin

[1] 
https://public-inbox.org/git/4795556016c25e4a78241362547c5468877f808d.1509557518.git.martin.ag...@gmail.com/

Martin Ågren (4):
  bisect: change calling-convention of `find_bisection()`
  bisect: fix memory leak in `find_bisection()`
  bisect: fix off-by-one error in `best_bisection_sorted()`
  bisect: fix memory leak when returning best element

 bisect.h   | 12 +---
 bisect.c   | 33 +++--
 builtin/rev-list.c |  3 +--
 3 files changed, 29 insertions(+), 19 deletions(-)

-- 
2.15.0.415.gac1375d7e



[PATCH v2 0/4] convert diff flags to be stored in a struct

2017-10-30 Thread Brandon Williams
Changes in v2:
 * removed the diff_flags_cleared singleton 
 * eliminated the 'touched' parallel flags
 * pass structs by reference instead of by value

Now that the 'touched' flags have been removed it may be valuable to go ahead
and remove the macros all together (including making the flags lower case).  If
that's what we want to do, I can go ahead and send those patches out as a
follow on to these.

Brandon Williams (4):
  add, reset: use DIFF_OPT_SET macro to set a diff flag
  diff: convert flags to be stored in bitfields
  diff: add flag to indicate textconv was set via cmdline
  diff: remove touched flags

 builtin/add.c|  2 +-
 builtin/commit.c |  7 +++--
 builtin/log.c|  3 +-
 builtin/reset.c  |  2 +-
 diff-lib.c   |  7 +++--
 diff.c   | 10 +++---
 diff.h   | 96 +---
 sequencer.c  |  5 +--
 8 files changed, 77 insertions(+), 55 deletions(-)

-- 
2.15.0.403.gc27cc4dac6-goog



Re: [PATCH v2 0/4] Hash Abstraction

2017-10-30 Thread Stefan Beller
On Sat, Oct 28, 2017 at 11:12 AM, brian m. carlson
 wrote:
> This is a series proposing a basic abstraction for hash functions.
>
> As we get closer to converting the remainder of the codebase to use
> struct object_id, we should think about the design we want our hash
> function abstraction to take.  This series is a proposal for one idea.
> Input on any aspect of this proposal is welcome.

I like the series/idea.

> This series exposes a struct git_hash_algo that contains basic
> information about a given hash algorithm that distinguishes it from
> other algorithms: name, identifiers, lengths, implementing functions,
> and empty tree and blob constants.  It also exposes an array of hash
> algorithms, and a constant for indexing them.
>
> The series also demonstrates a simple conversion using the abstraction
> over empty blob and tree values.

That looks good, though I wonder if we'll have to give a way a little
performance during the transition period (or even indefinitely, as the
hash abstraction won't go away; we'd rather want to keep it in place
long term). I guess we can measure after all is said and done,
no big deal.

>
> In order to avoid conflicting with the struct repository work and with
> the goal of avoiding global variables as much as possible, I've pushed
> the hash algorithm into struct repository and exposed it via a #define.

If you are referring to that long series[1] that Jonathan and I were working
on, then no worries.  Unfortunately that series got some delays, but I rebased
its prepared successors (of over 100 patches) on Friday and it was surprisingly
easy to do so; just tedious.

[1] 
https://public-inbox.org/git/20170830064634.ga153...@aiede.mtv.corp.google.com/

>
> I propose this series now as it will inform the way we go about
> converting other parts of the codebase, especially some of the pack
> algorithms.

Thanks for doing this now.

> Because we share some hash computation code between pack
> checksums and object hashing, we need to decide whether to expose pack
> checksums as struct object_id, even though they are technically not
> object IDs.

I think we used sha1 as checksums, because it was a hash function easily
accessible, not because its other properties were the best to do its job.
So I am undecided if we want to just "keep the same hash function for
checksum as well as object hashing" or pickup another checksum, that
actually *only* does checksumming (we don't need the cryptographic
properties, such that an easy hash [such as crc, djb, fnv or murmur]
would do; raw speed, barely protecting the pack file against bit flips).

For the sake of symmetry I tend to go with the former, i.e use the object
hash function for pack files, too.

>  Furthermore, if we end up needing to stuff an algorithm
> value into struct object_id, we'll no longer be able to directly
> reference object IDs in a pack without a copy.
>
> I've updated this series in some significant ways to reflect and better
> implement the transition plan as it's developed.  If there are ways
> in which this series (or future series) can converge better on the
> transition plan, that input would be valuable.
>
> This series is available from the usual places as branch hash-struct,
> based against master as of 2.15-rc2.

Thanks,
Stefan


Re: [PATCH v2 0/4] fsmonitor fixes

2017-10-29 Thread Johannes Schindelin
Hi Alex,

On Wed, 25 Oct 2017, Alex Vandiver wrote:

> Updated based on comments from Dscho and Ben.  Thanks for those!

Thank you for this excellent improvement.

Ciao,
Dscho


[PATCH v2 0/4] Hash Abstraction

2017-10-28 Thread brian m. carlson
This is a series proposing a basic abstraction for hash functions.

As we get closer to converting the remainder of the codebase to use
struct object_id, we should think about the design we want our hash
function abstraction to take.  This series is a proposal for one idea.
Input on any aspect of this proposal is welcome.

This series exposes a struct git_hash_algo that contains basic
information about a given hash algorithm that distinguishes it from
other algorithms: name, identifiers, lengths, implementing functions,
and empty tree and blob constants.  It also exposes an array of hash
algorithms, and a constant for indexing them.

The series also demonstrates a simple conversion using the abstraction
over empty blob and tree values.

In order to avoid conflicting with the struct repository work and with
the goal of avoiding global variables as much as possible, I've pushed
the hash algorithm into struct repository and exposed it via a #define.

I propose this series now as it will inform the way we go about
converting other parts of the codebase, especially some of the pack
algorithms.  Because we share some hash computation code between pack
checksums and object hashing, we need to decide whether to expose pack
checksums as struct object_id, even though they are technically not
object IDs.  Furthermore, if we end up needing to stuff an algorithm
value into struct object_id, we'll no longer be able to directly
reference object IDs in a pack without a copy.

I've updated this series in some significant ways to reflect and better
implement the transition plan as it's developed.  If there are ways
in which this series (or future series) can converge better on the
transition plan, that input would be valuable.

This series is available from the usual places as branch hash-struct,
based against master as of 2.15-rc2.

Changes from v1:
* Rebase onto 2.15-rc2.
* Fix the uninitialized value that Peff pointed out.  This fixes the
  error, but leaves the code in the same place, since I think it's where
  it should be.
* Improve commit message to explain the meaning of current_hash WRT the
  transition plan.
* Added an unknown hash algorithm constant and value to better implement
  the transition plan.
* Explain in the commit message why hex size and binary size are both
  provided.
* Add a format_id field to the struct, in coordination with the
  transition plan.
* Improve comments for struct fields and constants.

brian m. carlson (4):
  setup: expose enumerated repo info
  Add structure representing hash algorithm
  Integrate hash algorithm support with repo setup
  Switch empty tree and blob lookups to use hash abstraction

 builtin/am.c   |  2 +-
 builtin/checkout.c |  2 +-
 builtin/diff.c |  2 +-
 builtin/pull.c |  2 +-
 cache.h| 67 ++
 diff-lib.c |  2 +-
 merge-recursive.c  |  2 +-
 notes-merge.c  |  2 +-
 repository.c   |  7 ++
 repository.h   |  5 
 sequencer.c|  6 ++---
 setup.c| 49 ++-
 sha1_file.c| 43 +++
 submodule.c|  2 +-
 14 files changed, 157 insertions(+), 36 deletions(-)

-- 
2.15.0.rc2



[PATCH v2 0/4] fsmonitor fixes

2017-10-25 Thread Alex Vandiver
Updated based on comments from Dscho and Ben.  Thanks for those!
 - Alex



[PATCH v2 0/4] filter-branch: support for incremental update + fix for ancient tag format

2017-09-17 Thread Ian Campbell
This is the second version of my patches to add incremental support to
git-filter-branch. Since the last time I have:
 * addressed the review feedback (see changelog embedded in final
   patch)
 * switched to using the (newly introduced) `--allow-missing-tagger`
   option to `git mktag` to allow early Linux kernel tags to be
   rewritten
 * split out some of the preparatory changes to make the final patch
   easier to read.

I've force pushed to [1] where Travis seems happy and have set off the
process of re-rewriting the devicetree tree from scratch (a multi-day
affair) to validate (it's looking good).

Ian.

[0] 
https://git.kernel.org/pub/scm/linux/kernel/git/devicetree/devicetree-rebasing.git/
[1] https://github.com/ijc/git/tree/git-filter-branch


Re: [PATCH v2 0/4] Some ThreadSanitizer-results

2017-08-28 Thread Jeff Hostetler



On 8/21/2017 1:43 PM, Martin Ågren wrote:

This is the second version of my series to try to address some issues
... 
2) hashmap_add, which I could try my hands on if Jeff doesn't beat me to

it -- his proposed change should fix it and I doubt I could come up with
anything "better", considering he knows the code.


Now that I'm back from vacation, let me take another stab at this.

Jeff



[GSoC][PATCH v2 0/4] submodule: Incremental rewrite of git-submodules

2017-08-23 Thread Prathamesh Chavan
Changes in v2:

* In the function get_submodule_displaypath(), I tried avoiding the extra
  memory allocation, but it rather invoked test 5 from
  t7406-submodule-update. The details of the test failiure can be seen
  in the build report as well. (Build #162)
  This failure occured as even though the function relative_path(),
  returned the value correctly, NULL was stored in sb.buf
  Hence currently the function is still using the old way of
  allocating extra memory and releasing the strbuf first, and return
  the extra allocated memory.
* It was observed that strbuf_reset was being done just before the strbuf
  was being reused. It made more sense to reset them just after their use 
  instead. Hence, these function calls of strbuf_reset() were moved
  accordingly.
  (The above change was limited to the function init_submodule() only,
   since already there was a deletion of one such funciton call,
   and IMO, it won't be suitable to carryout the operation for the complete
   file in the same commit.)
* The function name for_each_submodule_list() was changed to
  for_each_submodule().
* Instead of passing the complete list to the function for_each_submodule(),
  only its pointer is passed to avoid the compiler from making copy of the
  list, and to make the code more efficient.
* The function names of print_name_rev() and get_name_rev() were changed
  to get_rev_name() and compute_rev_name(). Now the function get_rev_name()
  acts as the front end of the subcommand and calls the function
  compute_rev_name() for generating and receiving the value of revision name.
* The above change was also accompanied by the change in names of some
  variables used internally in the functions.

As before you can find this series at: 
https://github.com/pratham-pc/git/commits/week-14-1

And the build report is available at: 
https://travis-ci.org/pratham-pc/git/builds/
Branch: week-14-1
Build #163

Prathamesh Chavan (4):
  submodule--helper: introduce get_submodule_displaypath()
  submodule--helper: introduce for_each_submodule()
  submodule: port set_name_rev() from shell to C
  submodule: port submodule subcommand 'status' from shell to C

 builtin/submodule--helper.c | 294 
 git-submodule.sh|  61 +
 2 files changed, 273 insertions(+), 82 deletions(-)

-- 
2.13.0



[PATCH v2 0/4] Some ThreadSanitizer-results

2017-08-21 Thread Martin Ågren
This is the second version of my series to try to address some issues
noted by ThreadSanitizer. Thanks to all who took the time to provide
input on the first version.

The largest change is in the third patch, which moves the "avoid writing
to slopbuf"-logic into strbuf_setlen, and compiles it unconditionally.
Patch 2 hasn't been changed. The others have seen some minor changes.
The patch introducing GIT_THREAD_SANITIZER is gone.

The end result as far as ThreadSanitizer is concerned is the same:

1) set_try_to_free_routine, where Peff outlined some possibilities, and
where I don't feel like I have any idea which of them is better.

2) hashmap_add, which I could try my hands on if Jeff doesn't beat me to
it -- his proposed change should fix it and I doubt I could come up with
anything "better", considering he knows the code.

Martin

Martin Ågren (4):
  convert: always initialize attr_action in convert_attrs
  pack-objects: take lock before accessing `remaining`
  strbuf_setlen: don't write to strbuf_slopbuf
  ThreadSanitizer: add suppressions

 strbuf.h   |  5 -
 builtin/pack-objects.c |  6 ++
 color.c|  7 +++
 convert.c  |  5 +++--
 transport-helper.c |  7 +++
 .tsan-suppressions | 10 ++
 6 files changed, 37 insertions(+), 3 deletions(-)
 create mode 100644 .tsan-suppressions

-- 
2.14.1.151.gdfeca7a7e



[PATCH v2 0/4] Modernize read_graft_line implementation

2017-08-16 Thread Patryk Obara
Compared to v1:
- the first patch is dropped to make it easier to merge
- free_graft is now static function in commit.c

I don't know, what are exact rules about adding Reviewed-by footer, so
I didn't add any.


Patryk Obara (4):
  sha1_file: fix hardcoded size in null_sha1
  commit: replace the raw buffer with strbuf in read_graft_line
  commit: implement free_commit_graft
  commit: rewrite read_graft_line

 builtin/blame.c |  2 +-
 commit.c| 55 ---
 commit.h|  4 ++--
 sha1_file.c |  2 +-
 shallow.c   |  1 +
 5 files changed, 37 insertions(+), 27 deletions(-)

-- 
2.9.5



[RFC PATCH v2 0/4] Partial clone: promised objects (not only blobs)

2017-07-19 Thread Jonathan Tan
Thanks for all your comments on the earlier version. This is a
substantially different version. In particular:
 - Now supports all types (tag, commit, tree) of objects, not only blobs
 - fsck works
 - Incorporates Ben Peart's code that uses a long-living process
   (lifetime of the Git invocation) to obtain objects
 - Implemented as a repository extension

If anyone would like to comment on the overall direction of this
approach, that would be great. In particular, I'm not too sure on the
names of things. I know that I have some things to still work on
(documentation and coding style, more improvements to and tests for
fsck, maybe division of commits?) but I would like some early feedback
on the bigger picture first.

If you want to patch this in, this is built off my recent cleanup patch
[1].

About inability to scale if we have the list of promised blobs: In this
design, we will have a promised blob entry only if we have a concrete
tree, so in a repository in which many trees are omitted, there will not
be many promised blob entry objects. In fact, the minimal partial clone
will be one in which there will be one promised commit for each ref
(assuming no duplicates), no promised trees, and no promised blobs.
(I'm not planning to implement such a clone, but someone else could do
so.)

About having multiple promise lists: I have retained the single list,
but am still open to change. The support of all the object types might
be sufficient mitigation for the issues that caused us to investigate
multiple promise lists in the first place.

[1] https://public-inbox.org/git/20170718222848.1453-1-jonathanta...@google.com/

Jonathan Tan (4):
  object: remove "used" field from struct object
  promised-object, fsck: introduce promised objects
  sha1-array: support appending unsigned char hash
  sha1_file: support promised object hook

 Documentation/config.txt |   8 +
 Documentation/gitrepository-layout.txt   |   8 +
 Documentation/technical/read-object-protocol.txt | 102 +++
 Documentation/technical/repository-version.txt   |   6 +
 Makefile |   1 +
 builtin/cat-file.c   |   9 +
 builtin/fsck.c   |  42 ++-
 cache.h  |   4 +
 environment.c|   1 +
 fsck.c   |   6 +-
 object.c |  21 +-
 object.h |  21 +-
 promised-object.c| 324 +++
 promised-object.h|  34 +++
 setup.c  |   7 +-
 sha1-array.c |   7 +
 sha1-array.h |   1 +
 sha1_file.c  |  44 ++-
 t/t3907-promised-object.sh   |  73 +
 t/t3907/read-object  | 114 
 t/test-lib-functions.sh  |   6 +
 21 files changed, 808 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/technical/read-object-protocol.txt
 create mode 100644 promised-object.c
 create mode 100644 promised-object.h
 create mode 100755 t/t3907-promised-object.sh
 create mode 100755 t/t3907/read-object

-- 
2.14.0.rc0.284.gd933b75aa4-goog



Re: [PATCH v2 0/4] reflog-walk fixes for maint

2017-07-07 Thread Junio C Hamano
Thanks.  All made sense to me after reading them twice.



[PATCH v2 0/4] reflog-walk fixes for maint

2017-07-07 Thread Jeff King
On Wed, Jul 05, 2017 at 03:55:08AM -0400, Jeff King wrote:

> The first patch is my original small fix with an extra test. I think
> that would be appropriate for 'maint'. Its behavior still has some
> quirks, but it avoids the confusion that you experienced and has a low
> risk of breaking anything else.
> 
> The rest of it replaces the fake-parent thing with a more
> straight-forward iteration over the reflogs (i.e., a cleanup of the
> further patches I've been posting). After digging into it and especially
> after writing the new tests, I think I've convinced myself that this is
> the right way forward.

Here's an updated version of the bug-fix patch, along with the fix for
the problem that Eric noticed, and some other problems I noticed while
fixing that one. So I've split these immediate fixes for maint off into
their own series.

These are based on maint itself, rather than Kyle's original commit that
introduces the double-null reflog, since some of the bugs came later in
the v2.13 cycle (if we really wanted to, we could split it again into
two more series, but I don't think it's worth the trouble).

  [1/4]: reflog-walk: skip over double-null oid due to HEAD rename

This is the fix for the pseudo-truncation in v2.13, and is the same
as the previous round.

  [2/4]: reflog-walk: duplicate strings in complete_reflogs list

This fixes Eric's bug, and is the same as what I showed earlier.
It's a triggerable use-after-free, which is why I think it's
important to get it into maint.

  [3/4]: reflog-walk: don't free reflogs added to cache

This is another use-after-free, though it's slightly harder to
trigger.

  [4/4]: reflog-walk: include all fields when freeing complete_reflogs

This one is an optional cleanup, but worth doing, I think.


 reflog-walk.c  | 33 +
 t/t1411-reflog-show.sh | 10 ++
 t/t3200-branch.sh  | 11 +++
 3 files changed, 42 insertions(+), 12 deletions(-)

-Peff


[PATCH v2 0/4] Improvements to sha1_file

2017-06-13 Thread Jonathan Tan
Peff suggested putting in a new field in struct object_info for the
object contents; I tried it and it seems to work out quite well.

Patch 1 is unmodified from the previous version. Patches 2-3 have been
rewritten, and patch 4 is similar except that the missing-lookup change
is made to sha1_object_info_extended() instead of the now gone
get_object().

As before, I would like review on patches 1-3 to go into the tree.
(Patch 4 is a work in progress, and is here just to demonstrate the
effectiveness of the refactoring.)

Jonathan Tan (4):
  sha1_file: teach packed_object_info about typename
  sha1_file: move delta base cache code up
  sha1_file: consolidate storage-agnostic object fns
  sha1_file, fsck: add missing blob support

 Documentation/config.txt |  10 +
 builtin/fsck.c   |   7 +
 cache.h  |  13 ++
 sha1_file.c  | 506 +--
 t/t3907-missing-blob.sh  |  69 +++
 5 files changed, 418 insertions(+), 187 deletions(-)
 create mode 100755 t/t3907-missing-blob.sh

-- 
2.13.1.518.g3df882009-goog



[PATCH v2 0/4] Port git-add--interactive.perl:status_cmd to C

2017-05-15 Thread Daniel Ferreira
This is the second version of a patch series to start porting
git-add--interactive from Perl to C.

Series:
v1: 
https://public-inbox.org/git/1494009820-2090-1-git-send-email-bnm...@gmail.com/

Travis CI build: https://travis-ci.org/theiostream/git/builds/232669894

I have applied most of the changes suggested by Johannes and Ævar (thank
you!). The one exception was Johannes' request to make this a WIP patch
series with a sort-of "design" of what's next to come. I did not do
this because I still have no clue...! I'll begin experimenting on
update_cmd() to see if I gain some insight on this issue that I could
bring to this series. I do think this would be a good idea, though! :)

-- Daniel.

Daniel Ferreira (4):
  diff: export diffstat interface
  add--helper: create builtin helper for interactive add
  add--helper: implement interactive status command
  add--interactive: use add-interactive--helper for status_cmd

 .gitignore|   1 +
 Makefile  |   1 +
 builtin.h |   1 +
 builtin/add--helper.c | 285 ++
 diff.c|  17 +--
 diff.h|  19 +++-
 git-add--interactive.perl |   4 +-
 git.c |   1 +
 8 files changed, 309 insertions(+), 20 deletions(-)
 create mode 100644 builtin/add--helper.c

--
2.7.4 (Apple Git-66)



[PATCH v2 0/4] travis-ci: build docs with asciidoctor

2017-04-26 Thread Lars Schneider
Hi,

changes since v1:

* check Asciidoctor stderr output (Brian)
  
http://public-inbox.org/git/20170418104411.hdkzh3psvej63...@genre.crustytoothpaste.net/

* fix make style nit (Junio)
  http://public-inbox.org/git/xmqq37dcorr7@gitster.mtv.corp.google.com/


Thanks,
Lars

Base Ref: master
Web-Diff: https://github.com/larsxschneider/git/commit/315affa7c0
Checkout: git fetch https://github.com/larsxschneider/git 
travisci/asciidoctor-v2 && git checkout 315affa7c0
Interdiff (v1..v2):

diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
index 81f123e68d..6214e6acb4 100755
--- a/ci/test-documentation.sh
+++ b/ci/test-documentation.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Perform sanity checks on documentation and build it.
 #
@@ -9,7 +9,8 @@ make check-builtins
 make check-docs

 # Build docs with AsciiDoc
-make --jobs=2 doc
+make --jobs=2 doc > >(tee stdout.log) 2> >(tee stderr.log >&2)
+! test -s stderr.log
 test -s Documentation/git.html
 test -s Documentation/git.xml
 test -s Documentation/git.1
@@ -17,6 +18,8 @@ grep 

[PATCH v2 0/4] recursing submodules with relative pathspec (grep and ls-files)

2017-03-14 Thread Brandon Williams
v2 of the series tackles the problem slightly differently than v1 did, and in a
less invasive way in my opinion.  v1 also didn't fix everything.

v2 introduces a way to pass a 'prefix' that is respected by a git process.
This allows the git child processes (which operate on submodules) to have the
super_prefix (the path from the root of the superproject to the submodule) and
the prefix (path from the root of the superproject to the directory in which
the original command was launched).  With these two pieces of information the
child process can correctly handle pathspec matching as well as correctly
formatting their output with relative paths.

In order to pass the prefix to a child I made a new env var since the existing
GIT_PREFIX isn't respected.  I'm not sure this is the best method so I'm open
to ideas on the best way to convey this information to a child process.

Brandon Williams (4):
  grep: fix help text typo
  setup: allow for prefix to be passed to git commands
  grep: fix bug when recursing with relative pathspec
  ls-files: fix bug when recursing with relative pathspec

 builtin/grep.c | 41 +++
 builtin/ls-files.c | 41 ++-
 cache.h|  1 +
 git.c  |  2 -
 setup.c|  6 +++
 t/t3007-ls-files-recurse-submodules.sh | 39 ++
 t/t7814-grep-recurse-submodules.sh | 75 ++
 7 files changed, 167 insertions(+), 38 deletions(-)

-- 
2.12.0.367.g23dc2f6d3c-goog



Re: [PATCH v2 0/4] delete_ref: support reflog messages

2017-02-21 Thread Kyle Meyer
Junio C Hamano  writes:

> These looked reasonable.  I had to resolve conflicts with another
> topic in flight that removed set_worktree_head_symref() function
> while queuing, and I think I resolved it correctly, but please
> double check what is pushed out on 'pu'.

The merge looks right to me, thanks.  Thanks also for touching up the
tab/space issue in t3200-branch.sh.

-- 
Kyle


Re: [PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Junio C Hamano
Kyle Meyer  writes:

> Kyle Meyer (4):
>   delete_ref: accept a reflog message argument
>   update-ref: pass reflog message to delete_ref()
>   rename_ref: replace empty message in HEAD's log
>   branch: record creation of renamed branch in HEAD's log

These looked reasonable.  I had to resolve conflicts with another
topic in flight that removed set_worktree_head_symref() function
while queuing, and I think I resolved it correctly, but please
double check what is pushed out on 'pu'.

Thanks.


Re: [PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Jeff King
On Mon, Feb 20, 2017 at 08:10:31PM -0500, Kyle Meyer wrote:

> Kyle Meyer (4):
>   delete_ref: accept a reflog message argument
>   update-ref: pass reflog message to delete_ref()
>   rename_ref: replace empty message in HEAD's log
>   branch: record creation of renamed branch in HEAD's log

These look good to me. I think you did a nice job of summarizing the
discussion in the commit messages.

-Peff


[PATCH v2 0/4] delete_ref: support reflog messages

2017-02-20 Thread Kyle Meyer
Thanks to Junio and Peff for their feedback on v1.

  https://public-inbox.org/git/20170217035800.13214-1-k...@kyleam.com/T/#u

Changes from v1:

 * "msg" is now positioned as the first argument to delete_ref() to
   match update_ref().

   20170217081205.zn7j6d5cffgdv...@sigill.intra.peff.net

 * A "delete by head" test case has been added for the update-ref
   change.

   20170217082253.kxjezkxfqkfxj...@sigill.intra.peff.net

 * The added tests no longer send grep's output to /dev/null.

   20170217082253.kxjezkxfqkfxj...@sigill.intra.peff.net

 * Renaming the current branch is represented by two entries in HEAD's
   log, both which reuse the log message passed to rename_ref().
   
   20170217083112.vn7m4udsopmlv...@sigill.intra.peff.net,
   20170217195549.z6uyy7hbbhj5a...@sigill.intra.peff.net

 * Corrected a few places in the commit messages where "delete_refs"
   was written instead of "delete_ref".


Kyle Meyer (4):
  delete_ref: accept a reflog message argument
  update-ref: pass reflog message to delete_ref()
  rename_ref: replace empty message in HEAD's log
  branch: record creation of renamed branch in HEAD's log

 branch.c   |  5 +++--
 branch.h   |  3 ++-
 builtin/am.c   |  4 ++--
 builtin/branch.c   |  7 ---
 builtin/notes.c|  4 ++--
 builtin/remote.c   |  4 ++--
 builtin/replace.c  |  2 +-
 builtin/reset.c|  2 +-
 builtin/symbolic-ref.c |  2 +-
 builtin/tag.c  |  2 +-
 builtin/update-ref.c   |  2 +-
 fast-import.c  |  2 +-
 refs.c |  6 +++---
 refs.h |  7 ---
 refs/files-backend.c   | 10 +-
 t/t1400-update-ref.sh  | 18 ++
 t/t3200-branch.sh  |  6 ++
 transport.c|  2 +-
 18 files changed, 58 insertions(+), 30 deletions(-)

-- 
2.11.1



[PATCH v2 0/4] stash: create filename argument

2017-01-29 Thread Thomas Gummerer
Previous round is at:
http://public-inbox.org/git/20170121200804.19009-1-t.gumme...@gmail.com/.
Thanks Junio, Peff, Øyvind, Jakub and Johannes for your feedback on
the previous round.

Changes since the previous round:

- Re-phrased the Documentation update.
- Added missing $ in 2/3
- Added an extra patch introducing a new syntax for git stash create,
  where the message can be specified with the -m flag, instead of as a
  positional argument
- Filenames with $IFS in their name are now supported.  Added a test
  for that as well.

Interdiff below:

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 871a3b246c..8306bac397 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -20,6 +20,8 @@ SYNOPSIS
 [--] [...]
 'git stash' clear
 'git stash' create []
+'git stash' create [-m ] [-u|--include-untracked ]
+[-- ...]
 'git stash' store [-m|--message ] [-q|--quiet] 
 
 DESCRIPTION
@@ -51,8 +53,8 @@ OPTIONS
 save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] []::
 push [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] [-m|--message ] [--] [...]::
 
-   Save your local modifications to a new 'stash', and revert the
-   the changes in the working tree to match the index.
+   Save your local modifications to a new 'stash' and roll them
+   back both in the working tree and in the index.
The  part is optional and gives
the description along with the stashed state.  For quickly making
a snapshot, you can omit _both_ "save" and , but giving
diff --git a/git-stash.sh b/git-stash.sh
index 7dcce629bd..0072a38b4c 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -56,25 +56,57 @@ clear_stash () {
 }
 
 create_stash () {
+   stash_msg=
+   untracked=
+   new_style=
files=
while test $# != 0
do
case "$1" in
+   -m|--message)
+   shift
+   stash_msg="$1"
+   new_style=t
+   ;;
+   -u|--include-untracked)
+   shift
+   untracked="$1"
+   new_style=t
+   ;;
--)
shift
+   files="$@"
+   new_style=t
break
;;
-   --files)
-   ;;
*)
-   files="$1 $files"
+   if test -n "$new_style"
+   then
+   echo "invalid argument"
+   option="$1"
+   # TRANSLATORS: $option is an invalid option, 
like
+   # `--blah-blah'. The 7 spaces at the beginning 
of the
+   # second line correspond to "error: ". So you 
should line
+   # up the second line with however many 
characters the
+   # translation of "error: " takes in your 
language. E.g. in
+   # English this is:
+   #
+   #$ git stash save --blah-blah 2>&1 | head 
-n 2
+   #error: unknown option for 'stash save': 
--blah-blah
+   #   To provide a message, use git stash 
save -- '--blah-blah'
+   eval_gettextln "error: unknown option for 
'stash create': \$option"
+   usage
+   fi
+   break
;;
esac
shift
done
 
-   stash_msg="$1"
-   untracked="$2"
+   if test -z "$new_style"
+   then
+   stash_msg="$*"
+   fi
 
git update-index -q --refresh
if no_changes
@@ -284,7 +316,7 @@ push_stash () {
git reflog exists $ref_stash ||
clear_stash || die "$(gettext "Cannot initialize stash")"
 
-   create_stash --files $files -- "$stash_msg" "$untracked"
+   create_stash -m "$stash_msg" -u "$untracked" -- $files
store_stash -m "$stash_msg" -q $w_commit ||
die "$(gettext "Cannot save the current status")"
say "$(eval_gettext "Saved working directory and index state 
\$stash_msg")"
@@ -293,9 +325,9 @@ push_stash () {
then
if test -n "$files"
then
-   git reset -- $files
-   git checkout HEAD -- $(git ls-files --modified -- 
$files)
-   git clean --force --quiet -- $(git ls-files --others -- 
$files)
+   git ls-files -z -- "$@" | xargs -0 git reset --
+   git ls-files -z --modified -- "$@" | xargs -0 git 

[PATCH v2 0/4] urlmatch: allow regexp-based matches

2017-01-24 Thread Patrick Steinhardt
Hi,

This is version two of my patch series.

The use case is to be able to configure an HTTP proxy for all
subdomains of a domain where there are hundreds of subdomains.

Previously, I have been using complete regular expressions with
an escape-mechanism to match the configuration key's URLs.
According to Junio's comments, I changed this mechanism to a much
simpler one, where the user is only allowed to use globbing for
the host part of the URL. That is a user can now specify a key
`http.https://*.example.com` to match all sub-domains of
`example.com`. For now I've decided to implement it such that a
single `*` matches a single subdomain only, so for example
`https://foo.bar.example.com` would not match in this case. This
is similar to how shell-globbing works usually, so it should not
be of much surprise. It's also highlighted in the documentation.

I did not include an interdiff as too much has changed between
the two versions.

Regards
Patrick

Patrick Steinhardt (4):
  mailmap: add Patrick Steinhardt's work address
  urlmatch: enable normalization of URLs with globs
  urlmatch: split host and port fields in `struct url_info`
  urlmatch: allow globbing for the URL host part

 .mailmap |  1 +
 Documentation/config.txt |  5 +++-
 t/t1300-repo-config.sh   | 36 +
 urlmatch.c   | 68 +---
 urlmatch.h   |  9 ---
 5 files changed, 104 insertions(+), 15 deletions(-)

-- 
2.11.0



[PATCH v2 0/4] fix mergetool+rerere+subdir regression

2017-01-05 Thread Richard Hansen
If rerere is enabled, no pathnames are given, and mergetool is run
from a subdirectory, mergetool always prints "No files need merging".
Fix the bug.

This regression was introduced in
57937f70a09c12ef484c290865dac4066d207c9c (v2.11.0).

Changes since v1:
  * Patch 2/4 was reworked to improve the commit message, improve test
case independence even further, and use 'test_when_finished "git
reset --hard"' instead of a plain 'git reset --hard'.

Richard Hansen (4):
  t7610: update branch names to match test number
  t7610: make tests more independent and debuggable
  t7610: add test case for rerere+mergetool+subdir bug
  mergetool: fix running in subdir when rerere enabled

 git-mergetool.sh |   1 +
 t/t7610-mergetool.sh | 251 ++-
 2 files changed, 147 insertions(+), 105 deletions(-)

-- 
2.11.0.390.gc69c2f50cf-goog



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCH v2 0/4] road to reentrant real_path

2016-12-10 Thread Duy Nguyen
On Sat, Dec 10, 2016 at 2:42 AM, Brandon Williams  wrote:
> On 12/09, Duy Nguyen wrote:
>> On Fri, Dec 9, 2016 at 6:58 AM, Brandon Williams  wrote:
>> > diff --git a/setup.c b/setup.c
>> > index fe572b8..0d9fdd0 100644
>> > --- a/setup.c
>> > +++ b/setup.c
>> > @@ -254,10 +254,12 @@ int get_common_dir_noenv(struct strbuf *sb, const 
>> > char *gitdir)
>> > if (!is_absolute_path(data.buf))
>> > strbuf_addf(, "%s/", gitdir);
>> > strbuf_addbuf(, );
>> > -   strbuf_addstr(sb, real_path(path.buf));
>> > +   strbuf_realpath(sb, path.buf, 1);
>>
>> This is not the same because of this hunk in strbuf_realpath()
>
> Then perhaps I shouldn't make this change (and just leave it as is)
> since the way real_path_internal/strbuf_realpath is written requires
> that the strbuf being used for the resolved path only contains the
> resolved path (see the lstat(resolved->buf ) call).  Sidenote it
> looks like strbuf_getcwd() also does a reset, though more subtlety,
> since it just passes its buffer to getcwd().

Yeah that's ok too (I did not see this subtlety when I suggested the change).
-- 
Duy


  1   2   3   >