Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Johannes Sixt

Am 20.10.2016 um 23:38 schrieb Jeff King:

test_cmp () {
# optimize for common "they are the same" case
# without any subshells or subprograms


We do this already on Windows; it's the function mingw_test_cmp in our 
test suite:

https://github.com/git/git/blob/master/t/test-lib-functions.sh#L884-L945

-- Hannes



Prove "Tests out of sequence" Error

2016-10-20 Thread Lars Schneider
Hi,

on TravisCI I see these weird "Tests out of sequence" errors with prove
and they seem to not go away. I assume the reason that they not go away
is that the ".prove" file is carried over from on build to another (but I can't 
look into this file on TravisCI).

Has anyone an idea where these errors might come from?


t5547-push-quarantine.sh (Wstat: 0 Tests: 5 Failed: 0)
  Parse errors: Tests out of sequence.  Found (2) but expected (3)
Tests out of sequence.  Found (3) but expected (4)
Tests out of sequence.  Found (4) but expected (5)
Bad plan.  You planned 4 tests but ran 5.
Files=760, Tests=15109, 679 wallclock secs (21.91 usr  1.78 sys + 320.79 cusr 
529.13 csys = 873.61 CPU)
Result: FAIL
make[1]: *** [prove] Error 1
make: *** [test] Error 2


Example:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/169385219/log.txt

Thanks,
Lars

Re: t9000-addresses.sh: unexpected pases

2016-10-20 Thread Junio C Hamano
Ramsay Jones  writes:

> I have started seeing unexpected passes in this test (am I the
> only one?) on the next and pu branch, which seems to be caused
> by commit e3fdbcc8 ("parse_mailboxes: accept extra text after
> <...> address", 13-10-2016). Thus:
>
>   $ tail -15 ntest-out
>   [15:17:44]
>   All tests successful.
>
>   Test Summary Report
>   ---
>   t9000-addresses.sh   (Wstat: 0 Tests: 37 
> Failed: 0)
> TODO passed:   28, 30-31

Yeah, I noticed this in some of my integration runs but didn't pay
attention and forgot about it; thanks for bringing it up.



Re: [PATCH] doc: fix merge-base ASCII art tab spacing

2016-10-20 Thread Junio C Hamano
Jeff King  writes:

> On Fri, Oct 21, 2016 at 12:40:09AM +0100, Philip Oakley wrote:
>
>> The doc-tool stack does not always respect the 'tab = 8 spaces' rule,
>> particularly the git-scm doc pages https://git-scm.com/docs/git-merge-base
>> and the Git generated html pages.
>
> Hmm, I thought git-scm.com was fixed by:
>
>   
> https://github.com/git/git-scm.com/commit/1e13397edccdecd0e06ff851cffaa1c44e140bf3
>
> Not that I mind using spaces consistently here on principle. I just
> didn't think it was a problem anymore (my asciidoc seems to get it
> right, too).

I do not terribly mind it, either, and I'd prefer to apply Philip's
patch eventually. But I'd rather want to do so _after_ known broken
sites, if there are any, are fixed by correcting their tools.



Re: [PATCH] doc: fix merge-base ASCII art tab spacing

2016-10-20 Thread Junio C Hamano
Philip Oakley  writes:

> The doc-tool stack does not always respect the 'tab = 8 spaces' rule,
> particularly the git-scm doc pages https://git-scm.com/docs/git-merge-base
> and the Git generated html pages.

Sorry, but I do not understand this change.

https://git.github.io/htmldocs/git-merge-base.html is "Git generated
HTML page" and I do not see any breakage around the drawings this
patch touches, and the fp-path series does not touch these drawings,
either.

If a broken "doc-tool stack" breaks the formatting, I'd prefer to
see that "doc-tool stack" fixed, instead of working around by
updating the source they work on.  Otherwise, the broken "doc-tool
stack" will keep producing broken output next time a source that
respects "tab is to skip to the next multiple of 8" rule is fed to
it, no?



[BUG] fetch output is ugly in 'next'

2016-10-20 Thread Jeff King
I recently started using lt/abbrev-auto via 'next'. This is the fetch
output I saw today:

$ git fetch
remote: Counting objects: 332, done.
remote: Compressing objects: 100% (237/237), done.
remote: Total 332 (delta 182), reused 64 (delta 64), pack-reused 31
Receiving objects: 100% (332/332), 351.53 KiB | 0 bytes/s, done.
Resolving deltas: 100% (184/184), completed with 26 local objects.
>From git://github.com/gitster/git
 + fb65135d9c...16ab66ec97 jch  -> origin/jch  
(forced update)
   fd47ae6a5b..98985c6911 jk/diff-submodule-diff-inline-> 
origin/jk/diff-submodule-diff-inline
 * [new branch]  jk/no-looking-at-dotgit-outside-repo -> 
origin/jk/no-looking-at-dotgit-outside-repo
 + 3a8f853f16...1369f9b5ba jt/trailer-with-cruft-> 
origin/jt/trailer-with-cruft  (forced update)
 * [new branch]  pt/gitgui-updates-> 
origin/pt/gitgui-updates
 + 7594b34cbb...be8e40093b pu   -> origin/pu  
(forced update)
 + 7b95cf9a4e...76e368c378 tg/add-chmod+x-fix   -> 
origin/tg/add-chmod+x-fix  (forced update)
 + c4cf1f93cf...221912514c va/i18n-perl-scripts -> 
origin/va/i18n-perl-scripts  (forced update)
 * [new branch]  yk/git-tag-remove-mention-of-old-layout-in-doc -> 
origin/yk/git-tag-remove-mention-of-old-layout-in-doc

Yuck. I could maybe get over the wasted space due to the longer sha1s,
but we clearly need to fix the alignment computation. I haven't dug on
it at all; it's probably low-hanging fruit if somebody is interested.

-Peff


t9000-addresses.sh: unexpected pases

2016-10-20 Thread Ramsay Jones
Hi Matthieu,

I have started seeing unexpected passes in this test (am I the
only one?) on the next and pu branch, which seems to be caused
by commit e3fdbcc8 ("parse_mailboxes: accept extra text after
<...> address", 13-10-2016). Thus:

  $ tail -15 ntest-out
  [15:17:44]
  All tests successful.

  Test Summary Report
  ---
  t9000-addresses.sh   (Wstat: 0 Tests: 37 Failed: 
0)
TODO passed:   28, 30-31
  Files=760, Tests=13940, 484 wallclock secs ( 4.04 usr  1.30 sys + 60.52 cusr 
36.76 csys = 102.62 CPU)
  Result: PASS
  make clean-except-prove-cache
  make[2]: Entering directory '/home/ramsay/git/t'
  rm -f -r 'trash directory'.* 'test-results'
  rm -f -r valgrind/bin
  make[2]: Leaving directory '/home/ramsay/git/t'
  make[1]: Leaving directory '/home/ramsay/git/t'
  $ 

I have not even looked, but I suspect that it simply requires
a change from expect_fail to expect_success, since your commit
has 'fixed' these tests ... would you mind taking a quick look?

Thanks!

ATB,
Ramsay Jones



Re: [PATCH] doc: fix merge-base ASCII art tab spacing

2016-10-20 Thread Jeff King
On Fri, Oct 21, 2016 at 12:40:09AM +0100, Philip Oakley wrote:

> The doc-tool stack does not always respect the 'tab = 8 spaces' rule,
> particularly the git-scm doc pages https://git-scm.com/docs/git-merge-base
> and the Git generated html pages.

Hmm, I thought git-scm.com was fixed by:

  
https://github.com/git/git-scm.com/commit/1e13397edccdecd0e06ff851cffaa1c44e140bf3

Not that I mind using spaces consistently here on principle. I just
didn't think it was a problem anymore (my asciidoc seems to get it
right, too).

-Peff


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Junio C Hamano
Jonathan Tan  writes:

> That is true - I think we can take the allowed separators as an
> argument (meaning that we can have different behavior for file parsing
> and command line parsing), and since we already have that string, we
> can use strcspn. I'll try this out in the next reroll.

Sounds good.  Thanks.


The following is a tangent that I think this topic should ignore,
but we may want to revisit it sometime later.

I think the design of the "separator" mechanism is one of the things
we botched in the current system.  If I recall correctly, this was
introduced to allow people write "Bug# 538" in the trailer section
and get it recognised as a valid trailer.

When I say that this was a botched design, I do not mean to say that
we should have instead forced projects to adopt "Bug: 538" format.
The design is botched because the users' wish to allow "Bug# 538" or
"Bug #538" by setting separators to ":#" from the built-in ":" does
not mean that they would want "Signed-off-by# me " to
be accepted.

If I were guiding a topic that introduce this feature from scratch
today, I would probably suggest a pattern based approach, e.g.  a
built-in "[-A-Za-z0-9]+:" [*1*] may be the default prefix that is
used to recognize the beginning of a trailer, and a user or a
project that wants "Bug #538" would be allowed to add an additional
pattern, e.g. "Bug *#", that recognises a custom trailer line that
is used by the project.


[Footnote]

*1* Or more lenient "[A-Za-z0-9][- A-Za-z0-9]*:".


Re: A couple errors dealing with uninitialized submodules

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 4:29 PM, Aaron Schrab  wrote:
> I was working with a fresh clone of a repository where I'd forgotten that
> one of the directories was setup as a submodule, so I hadn't initialized it.
>
> I tried to add a symlink to a location outside the repository and this
> failed with an assertion (exact text in comment below). When looking into
> that I realized that the directory was meant to be a submodule and decided
> to try to change that.  So I tried to remove the submodule, and got an error
> (misleadingly) saying that couldn't be done because it uses a .git
> directory.
>
> I first noticed this with git 2.9.3 from Debian unstable, but I also see it
> building from v2.10.1-502-g6598894 fetched from master recently.
>
> The following script replicates both of these issues. These could both be
> classified as "don't do that", although at lease the assertion is quite
> ugly.
>
> --- >8 
> #!/bin/sh -e
>
> directory=$(mktemp -d)
> echo "Using directory '$directory'"
> cd $directory
> git init --quiet orig
> (
>  cd orig
>  # Using a random, small repository for the submodule.
>  git submodule --quiet add https://github.com/diepm/vim-rest-console.git sub
>  git commit -m init >/dev/null
> )
> git clone --quiet orig dup
> cd dup
>
> (
>  cd sub
>  ln -s /tmp/dont_care
>  # Next command aborts with
>  # git: pathspec.c:317: prefix_pathspec: Assertion `item->nowildcard_len <=
> item->len && item->prefix <= item->len' failed.`

For this bug see
https://public-inbox.org/git/cafoyhzdw-p0st8wkosvxbpbfciaczpgidpmfw5mrtftmoso...@mail.gmail.com/
Specifically try this patch
https://public-inbox.org/git/cagz79kzazcwz-cesb_voq0s0qt+ipcgb6tkdzle+ewsf_qr...@mail.gmail.com/

>  git add . || : expected to fail
> )
>
> rm -f .git/index.lock
> # Next command fails with error wrongly saying that the submodule uses a
> .git
> # directory.  I believe that the real problem is that the uninitialized
> # submodule has content.
> git rm sub


[PATCH] doc: fix merge-base ASCII art tab spacing

2016-10-20 Thread Philip Oakley
The doc-tool stack does not always respect the 'tab = 8 spaces' rule,
particularly the git-scm doc pages https://git-scm.com/docs/git-merge-base
and the Git generated html pages.

Use just spaces within the block of the ascii art.

Noticed when reviewing Junio's suggested update to `git merge-base`
https://public-inbox.org/git/xmqqmvi2sj8f@gitster.mtv.corp.google.com/T/#u


Signed-off-by: Philip Oakley 
---

A fixed consistent prefix of tabs is OK, but once that lead is done, stay
with spaces only.

This complements the jc/merge-base-fp-only series.

---
 Documentation/git-merge-base.txt | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-merge-base.txt b/Documentation/git-merge-base.txt
index 808426f..b968b64 100644
--- a/Documentation/git-merge-base.txt
+++ b/Documentation/git-merge-base.txt
@@ -80,8 +80,8 @@ which is reachable from both 'A' and 'B' through the parent 
relationship.
 
 For example, with this topology:
 
-o---o---o---B
-   /
+o---o---o---B
+   /
---o---1---o---o---o---A
 
 the merge base between 'A' and 'B' is '1'.
@@ -116,11 +116,11 @@ the best common ancestor of all commits.
 When the history involves criss-cross merges, there can be more than one
 'best' common ancestor for two commits.  For example, with this topology:
 
-   ---1---o---A
-  \ /
-   X
-  / \
-   ---2---o---o---B
+   ---1---o---A
+   \ /
+X
+   / \
+   ---2---o---o---B
 
 both '1' and '2' are merge-bases of A and B.  Neither one is better than
 the other (both are 'best' merge bases).  When the `--all` option is not given,
@@ -154,13 +154,13 @@ topic origin/master`, the history of remote-tracking 
branch
 `origin/master` may have been rewound and rebuilt, leading to a
 history of this shape:
 
-o---B1
-   /
+o---B1
+   /
---o---o---B2--o---o---o---B (origin/master)
-   \
-B3
- \
-  Derived (topic)
+   \
+B3
+ \
+  Derived (topic)
 
 where `origin/master` used to point at commits B3, B2, B1 and now it
 points at B, and your `topic` branch was started on top of it back
-- 
2.9.0.windows.1.323.g0305acf



A couple errors dealing with uninitialized submodules

2016-10-20 Thread Aaron Schrab
I was working with a fresh clone of a repository where I'd forgotten 
that one of the directories was setup as a submodule, so I hadn't 
initialized it.


I tried to add a symlink to a location outside the repository and this 
failed with an assertion (exact text in comment below). When looking 
into that I realized that the directory was meant to be a submodule and 
decided to try to change that.  So I tried to remove the submodule, and 
got an error (misleadingly) saying that couldn't be done because it uses 
a .git directory.


I first noticed this with git 2.9.3 from Debian unstable, but I also see 
it building from v2.10.1-502-g6598894 fetched from master recently.


The following script replicates both of these issues. These could both 
be classified as "don't do that", although at lease the assertion is 
quite ugly.


--- >8 
#!/bin/sh -e

directory=$(mktemp -d)
echo "Using directory '$directory'"
cd $directory
git init --quiet orig
(
 cd orig
 # Using a random, small repository for the submodule.
 git submodule --quiet add https://github.com/diepm/vim-rest-console.git sub
 git commit -m init >/dev/null
)
git clone --quiet orig dup
cd dup

(
 cd sub
 ln -s /tmp/dont_care
 # Next command aborts with
 # git: pathspec.c:317: prefix_pathspec: Assertion `item->nowildcard_len <= item->len && 
item->prefix <= item->len' failed.`
 git add . || : expected to fail
)

rm -f .git/index.lock
# Next command fails with error wrongly saying that the submodule uses a .git
# directory.  I believe that the real problem is that the uninitialized
# submodule has content.
git rm sub


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 05:00:13PM -0400, Jeff King wrote:

> > I am not an expert on perl nor tracing, but is it feasible to find out
> > how many internal calls there are? i.e. either some shell script (rebase,
> > submodule) calling git itself a couple of times or even from compile/git/git
> > itself, e.g. some submodule operations use forking in there.
> 
> The script below is my attempt, though I think it is not quite right, as
> "make" should be the single apex of the graph. You can run it like:
> 
>   strace -f -o /tmp/foo.out -e clone,execve make test
>   perl graph.pl /tmp/foo.out | less -S

Ah, I see why it is sometimes wrong. We may see parent/child
interactions out of order. E.g., we may see:

  1. Process B execs "git".

  2. Process B exits.

  3. Process A forks pid B.

because strace does not manage to ptrace "A" before "B" finishes
running. It's hard to tell in step 3 if this is the same "A" we saw
earlier. You cannot just go by PID, because the tests run enough
processes that we end up reusing PIDs. My script has a hack which
increments a counter when processes go away, but that happens in step 2
above (and steps 1 and 3 think they are seeing different processes, even
though they are the same).

I'm sure it could be cleverly hacked around, but the easiest thing is
probably to just make sure the tests are not run in parallel. High load
makes that kind of out-of-order sequence much more likely.

Making sure I run it with "make -j1", the script I posted earlier gives
pretty reasonable output. I won't post mine here, as it's double-digit
megabytes, but you should be able to recreate it yourself.

Of course, the call graph by itself isn't that interesting. But you
might be able to do some interesting analysis, or just find spots of
interest to you (like git-submodule).

-Peff


Re: [PATCH v2 2/3] serialize collection of refs that contain submodule changes

2016-10-20 Thread Stefan Beller
> Thanks. So I do not completely get what you are suggesting: args or kept
> it the way it is? Since in the end you are saying it is ok here ;) I
> mainly chose this name because I am substituting the argv variable which
> is already called 'argv' with this array. That might also be the reason
> why in so many locations with struct child_processe's we have the 'argv'
> name: Because they initially started with the old-style NULL terminated
> array.
>
> I am fine with it either way. Just tell me what you like :)

I think it's fine as is here; I was just confused when first seeing this code.

>
> Cheers Heiko


What's cooking in git.git (Oct 2016, #05; Thu, 20)

2016-10-20 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The last batch of topics before -rc0 are expected to hopefully
graduate to 'master' sometime this weekend.  We are at around 450+
non-merge commits since v2.10.0 now.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* jc/merge-base-fp-only (2016-10-19) 8 commits
 . merge-base: fp experiment
 - merge: allow to use only the fp-only merge bases
 - merge-base: limit the output to bases that are on first-parent chain
 - merge-base: mark bases that are on first-parent chain
 - merge-base: expose get_merge_bases_many_0() a bit more
 - merge-base: stop moving commits around in remove_redundant()
 - sha1_name: remove ONELINE_SEEN bit
 - commit: simplify fastpath of merge-base

 An experiment of merge-base that ignores common ancestors that are
 not on the first parent chain.


* bw/submodule-branch-dot-doc (2016-10-19) 1 commit
 - submodules doc: update documentation for "." used for submodule branches

 Recent git allows submodule..branch to use a special token
 "." instead of the branch name; the documentation has been updated
 to describe it.

 Will merge to 'next'.


* tg/add-chmod+x-fix (2016-10-20) 1 commit
 - t3700: fix broken test under !SANITY

 A hot-fix for a test added by a recent topic that went to both
 'master' and 'maint' already.

 Will merge to 'next'.


* jk/diff-submodule-diff-inline (2016-10-20) 1 commit
 - rev-list: use hdr_termination instead of a always using a newline

 A recently graduated topic regressed "git rev-list --header"
 output, breaking "gitweb".  This has been fixed.

 Will merge to 'next'.


* jk/no-looking-at-dotgit-outside-repo (2016-10-20) 8 commits
 - setup_git_env: avoid blind fall-back to ".git"
 - diff: handle sha1 abbreviations outside of repository
 - diff_aligned_abbrev: use "struct oid"
 - diff_unique_abbrev: rename to diff_aligned_abbrev
 - find_unique_abbrev: use 4-buffer ring
 - test-*-cache-tree: setup git dir
 - read info/{attributes,exclude} only when in repository
 - Merge branch 'jc/diff-unique-abbrev-comments' into 
jk/no-looking-at-dotgit-outside-repo
 (this branch uses jc/diff-unique-abbrev-comments.)

 Update "git diff --no-index" codepath not to try to peek into .git/
 directory that happens to be under the current directory, when we
 know we are operating outside any repository.

 Will wait until 'jc/diff-unique-abbrev-comments' graduates, rebase
 onto 'master' and then cook in 'next'.


* pt/gitgui-updates (2016-10-20) 35 commits
 - Merge tag 'gitgui-0.21.0' of git://repo.or.cz/git-gui
 - git-gui: set version 0.21
 - Merge branch 'as/bulgarian' into pu
 - git-gui: Mark 'All' in remote.tcl for translation
 - git-gui i18n: Updated Bulgarian translation (565,0f,0u)
 - Merge branch 'os/preserve-author' into pu
 - git-gui: avoid persisting modified author identity
 - git-gui: Do not reset author details on amend
 - Merge branch 'kb/unicode' into pu
 - git-gui: handle the encoding of Git's output correctly
 - git-gui: unicode file name support on windows
 - Merge branch 'dr/ru' into pu
 - git-gui: Update Russian translation
 - git-gui: maintain backwards compatibility for merge syntax
 - Merge branch 'va/i18n_2' into pu
 - git-gui i18n: mark string in lib/error.tcl for translation
 - git-gui: fix incorrect use of Tcl append command
 - git-gui i18n: mark "usage:" strings for translation
 - git-gui i18n: internationalize use of colon punctuation
 - Merge branch 'pt/non-mouse-usage' into pu
 - Amend tab ordering and text widget border and highlighting.
 - Allow keyboard control to work in the staging widgets.
 - Merge branch 'pt/git4win-mods' into pu
 - git-gui (Windows): use git-gui.exe in `Create Desktop Shortcut`
 - git-gui: fix detection of Cygwin
 - Merge branch 'patches' into pu
 - git-gui: ensure the file in the diff pane is in the list of selected files
 - git-gui: support for $FILENAMES in tool definitions
 - git-gui: fix initial git gui message encoding
 - git-gui/po/glossary/txt-to-pot.sh: use the $( ... ) construct for command 
substitution
 - Merge branch 'va/i18n' into pu
 - Merge branch 'rs/use-modern-git-merge-syntax' into pu
 - Merge branch 'js/commit-gpgsign' into pu
 - Merge branch 'sy/i18n' into pu
 - git-gui: sort entries in tclIndex

 Will merge to 'master'.
 Parked here temporarily as I didn't have chance to double-check.


* yk/git-tag-remove-mention-of-old-layout-in-doc (2016-10-20) 1 commit
 - doc: remove reference to the traditional layout in git-tag.txt

 Shorten description of auto-following in "git tag" by removing a
 mention of historical remotes layout which is not relevant to the
 main 

Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Jonathan Tan

On 10/20/2016 03:45 PM, Junio C Hamano wrote:

Jonathan Tan  writes:


If we do that, there is also the necessity of creating a string that
combines the separators and '=' (I guess '\n' is not necessary now,
since all the lines are null terminated). I'm OK either way.

(We could cache that string, although I would think that if we did
that, we might as well write the loop manually, like in this patch.)


I wonder if there is a legit reason to look for '=' in the first
place.  "Signed-off-by= Jonathan Tan " does not look
like a valid trailer line to me.

Isn't that a remnant of lazy coding in the original that tried to
share a single parser for contents and command line options or
something?


That is true - I think we can take the allowed separators as an argument 
(meaning that we can have different behavior for file parsing and 
command line parsing), and since we already have that string, we can use 
strcspn. I'll try this out in the next reroll.


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Junio C Hamano
Jonathan Tan  writes:

> If we do that, there is also the necessity of creating a string that
> combines the separators and '=' (I guess '\n' is not necessary now,
> since all the lines are null terminated). I'm OK either way.
>
> (We could cache that string, although I would think that if we did
> that, we might as well write the loop manually, like in this patch.)

I wonder if there is a legit reason to look for '=' in the first
place.  "Signed-off-by= Jonathan Tan " does not look
like a valid trailer line to me.

Isn't that a remnant of lazy coding in the original that tried to
share a single parser for contents and command line options or
something?


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Jonathan Tan

On 10/20/2016 03:40 PM, Jonathan Tan wrote:

On 10/20/2016 03:14 PM, Junio C Hamano wrote:

Stefan Beller  writes:


+static int find_separator(const char *line)
+{
+   const char *c;
+   for (c = line; ; c++) {
+   if (!*c || *c == '\n')
+   return -1;
+   if (*c == '=' || strchr(separators, *c))
+   return c - line;
+   }


I was about to suggest this function can be simplified and maybe
even inlined by the use of strspn or strcspn, but I think manual
processing of the string is fine, too, as it would not really be
shorter.


Hmm, I fear that iterating over a line one-byte-at-a-time and
running strchr(separators, *c) on it for each byte has a performance
implication over running a single call to strcspn(line, separators).


If we do that, there is also the necessity of creating a string that
combines the separators and '=' (I guess '\n' is not necessary now,
since all the lines are null terminated). I'm OK either way.

(We could cache that string, although I would think that if we did that,
we might as well write the loop manually, like in this patch.)


Actually I guess we could generate the separators_and_equal string 
whenever we obtain new separators from the config. I'll do this in the 
next reroll.


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Jonathan Tan

On 10/20/2016 03:14 PM, Junio C Hamano wrote:

Stefan Beller  writes:


+static int find_separator(const char *line)
+{
+   const char *c;
+   for (c = line; ; c++) {
+   if (!*c || *c == '\n')
+   return -1;
+   if (*c == '=' || strchr(separators, *c))
+   return c - line;
+   }


I was about to suggest this function can be simplified and maybe
even inlined by the use of strspn or strcspn, but I think manual
processing of the string is fine, too, as it would not really be shorter.


Hmm, I fear that iterating over a line one-byte-at-a-time and
running strchr(separators, *c) on it for each byte has a performance
implication over running a single call to strcspn(line, separators).


If we do that, there is also the necessity of creating a string that 
combines the separators and '=' (I guess '\n' is not necessary now, 
since all the lines are null terminated). I'm OK either way.


(We could cache that string, although I would think that if we did that, 
we might as well write the loop manually, like in this patch.)


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Junio C Hamano
Stefan Beller  writes:

>> +static int find_separator(const char *line)
>> +{
>> +   const char *c;
>> +   for (c = line; ; c++) {
>> +   if (!*c || *c == '\n')
>> +   return -1;
>> +   if (*c == '=' || strchr(separators, *c))
>> +   return c - line;
>> +   }
>
> I was about to suggest this function can be simplified and maybe
> even inlined by the use of strspn or strcspn, but I think manual
> processing of the string is fine, too, as it would not really be shorter.

Hmm, I fear that iterating over a line one-byte-at-a-time and
running strchr(separators, *c) on it for each byte has a performance
implication over running a single call to strcspn(line, separators).


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Oct 20, 2016 at 10:58:39AM -0700, Stefan Beller wrote:
>
>> > By the way, do you know who is managing the service on our end
>> > (e.g. approving new people to be "defect viewer")?
>> 
>> I do it most of the time, but I did not start managing it.
>> And I have been pretty lax/liberal about handing out rights to do stuff,
>> because I did not want to tip on anyone's toes giving to few rights
>> and thereby annoying them.
>
> I also do this, though I admit with more urgency when I recognize the
> name as somebody who has showed up on the git list.

Well, then huge thanks to both of you.


Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 2:39 PM, Jonathan Tan  wrote:
> The parse_trailer function has a few modes of operation, all depending
> on whether the separator is present in its input, and if yes, the
> separator's position. Some of these modes are failure modes, and these
> failure modes are handled differently depending on whether the trailer
> line was sourced from a file or from a command-line argument.
>
> Extract a function to find the separator, allowing the invokers of
> parse_trailer to determine how to handle the failure modes instead of
> making parse_trailer do it.
>
> Signed-off-by: Jonathan Tan 
> ---
>  trailer.c | 70 
> +++
>  1 file changed, 48 insertions(+), 22 deletions(-)
>
> diff --git a/trailer.c b/trailer.c
> index 99018f8..137a3fb 100644
> --- a/trailer.c
> +++ b/trailer.c
> @@ -543,29 +543,40 @@ static int token_matches_item(const char *tok, struct 
> arg_item *item, int tok_le
> return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 
> 0;
>  }
>
> -static int parse_trailer(struct strbuf *tok, struct strbuf *val,
> -const struct conf_info **conf, const char *trailer)
> +/*
> + * Return the location of the first separator or '=' in line, or -1 if 
> either a
> + * newline or the null terminator is reached first.
> + */
> +static int find_separator(const char *line)
> +{
> +   const char *c;
> +   for (c = line; ; c++) {
> +   if (!*c || *c == '\n')
> +   return -1;
> +   if (*c == '=' || strchr(separators, *c))
> +   return c - line;
> +   }

I was about to suggest this function can be simplified and maybe
even inlined by the use of strspn or strcspn, but I think manual
processing of the string is fine, too, as it would not really be shorter.


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 02:53:36PM -0700, Stefan Beller wrote:

> >> That said I really like the idea of having a helper that would eliminate 
> >> the cat
> >> for you, e.g. :
> >>
> >> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> >> --porcelain=v2 --branch --untracked-files=all <<-EOF
> >> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> >> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> >> EOF
> >
> > I think that helper still ends up using "cat" and "diff" under the hood,
> 
> I assumed that helper being a C program, that only needs to fork once
> for the actual git call and then it sits idle to compare the exact output
> from stdout to its stdin.

Ah, I see. I thought you meant a shell helper. A C helper does drop 2
forks down to 1, but it would be nice if we could drop it to 0.

-Peff


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 05:38:03PM -0400, Jeff King wrote:

> I think that helper still ends up using "cat" and "diff" under the hood,
> unless you write those bits in pure shell. But at that point, I suspect
> we could "cat" and "test_cmp" in pure shell, something like:
> [...]
> Those are both completely untested. But maybe they are worth playing
> around with for somebody on Windows to see if they make a dent in the
> test runtime.

If you tried to run them, you probably noticed that the "untested" was
really true. One of the functions was missing an "else", and the other
forgot to add a "\n" to its printf.

The patch below gets closer, though there are still a handful of test
failures.  I didn't investigate deeply, but I think at least one is
related to the "read/printf" version of cat not being binary-clean.

-Peff

---
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index fdaeb3a96b..de37f3d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -685,9 +685,48 @@ test_expect_code () {
 # - not all diff versions understand "-u"
 
 test_cmp() {
+   # optimize for common "they are the same" case
+   # without any subshells or subprograms
+   while true; do
+   if ! read -r line1 <&3
+   then
+   if ! read -r line2 <&4
+   then
+   # EOF on both; good
+   return 0
+   else
+   # EOF only on file1; fail
+   break
+   fi
+   fi
+   if ! read -r line2 <&4
+   then
+   # EOF only on file2; fail
+   break
+   fi
+   test "$line1" = "$line2" || break
+   done 3<"$1" 4<"$2"
+
+   # if we get here, the optimized version found some
+   # difference. We can just "return 1", but let's run
+   # the real $GIT_TEST_CMP to provide pretty output.
+   # This should generally only happen on test failures,
+   # so performance isn't a big deal.
$GIT_TEST_CMP "$@"
 }
 
+cat () {
+   # optimize common here-doc usage
+   if test $# -eq 0
+   then
+   while read -r line
+   do
+   printf '%s\n' "$line"
+   done
+   fi
+   command cat "$@"
+}
+
 # test_cmp_bin - helper to compare binary files
 
 test_cmp_bin() {


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 2:38 PM, Jeff King  wrote:
> On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:
>
>> Maybe we should stop introducing un-optimized tests.
>> [...]
>> * heavy use of the "git -C " pattern. When applying that
>>   thouroughly we'd save spanning the subshells.
>
> Yeah, I imagine with some style changes we could drop quite a few
> subshells. The problem is that the conversion work is manual and
> tedious. I'd look first for spots where we can eliminate thousands of
> calls with a single change.
>
>> That said I really like the idea of having a helper that would eliminate the 
>> cat
>> for you, e.g. :
>>
>> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
>> --porcelain=v2 --branch --untracked-files=all <<-EOF
>> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
>> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
>> EOF
>
> I think that helper still ends up using "cat" and "diff" under the hood,

I assumed that helper being a C program, that only needs to fork once
for the actual git call and then it sits idle to compare the exact output
from stdout to its stdin.

If there is a difference it will do the extra work (i.e. put stdin and stout to
a file and call diff on it)


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 10:58:39AM -0700, Stefan Beller wrote:

> > By the way, do you know who is managing the service on our end
> > (e.g. approving new people to be "defect viewer")?
> 
> I do it most of the time, but I did not start managing it.
> And I have been pretty lax/liberal about handing out rights to do stuff,
> because I did not want to tip on anyone's toes giving to few rights
> and thereby annoying them.

I also do this, though I admit with more urgency when I recognize the
name as somebody who has showed up on the git list.

I wish there was a flag to simply make the results public. There is no
point in anybody logging in just to view them, but I don't think
Coverity makes that an option.

-Peff


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 10:05:38AM -0700, Stefan Beller wrote:

> Not sure what triggered the new finding of coverity as seen below as the
> parse_commit() was not touched. Junios series regarding the merge base
> optimization touches a bit of code nearby though.

I have noticed that "old" problems sometimes reappear when nearby code
changes. I assume that they have some kind of heuristic to identify the
location of a defect, that it probably includes the line number in the
file, and that it can be fooled by too much code appearing nearby.

-Peff


[PATCH v4 4/8] trailer: make args have their own struct

2016-10-20 Thread Jonathan Tan
Improve type safety by making arguments (whether from configuration or
from the command line) have their own "struct arg_item" type, separate
from the "struct trailer_item" type used to represent the trailers in
the buffer being manipulated.

This change also prepares "struct trailer_item" to be further
differentiated from "struct arg_item" in future patches.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 135 +++---
 1 file changed, 85 insertions(+), 50 deletions(-)

diff --git a/trailer.c b/trailer.c
index ae3972a..99018f8 100644
--- a/trailer.c
+++ b/trailer.c
@@ -29,6 +29,12 @@ struct trailer_item {
struct list_head list;
char *token;
char *value;
+};
+
+struct arg_item {
+   struct list_head list;
+   char *token;
+   char *value;
struct conf_info conf;
 };
 
@@ -62,7 +68,7 @@ static size_t token_len_without_separator(const char *token, 
size_t len)
return len;
 }
 
-static int same_token(struct trailer_item *a, struct trailer_item *b)
+static int same_token(struct trailer_item *a, struct arg_item *b)
 {
size_t a_len = token_len_without_separator(a->token, strlen(a->token));
size_t b_len = token_len_without_separator(b->token, strlen(b->token));
@@ -71,12 +77,12 @@ static int same_token(struct trailer_item *a, struct 
trailer_item *b)
return !strncasecmp(a->token, b->token, min_len);
 }
 
-static int same_value(struct trailer_item *a, struct trailer_item *b)
+static int same_value(struct trailer_item *a, struct arg_item *b)
 {
return !strcasecmp(a->value, b->value);
 }
 
-static int same_trailer(struct trailer_item *a, struct trailer_item *b)
+static int same_trailer(struct trailer_item *a, struct arg_item *b)
 {
return same_token(a, b) && same_value(a, b);
 }
@@ -98,6 +104,13 @@ static inline void strbuf_replace(struct strbuf *sb, const 
char *a, const char *
 
 static void free_trailer_item(struct trailer_item *item)
 {
+   free(item->token);
+   free(item->value);
+   free(item);
+}
+
+static void free_arg_item(struct arg_item *item)
+{
free(item->conf.name);
free(item->conf.key);
free(item->conf.command);
@@ -137,17 +150,29 @@ static void print_all(FILE *outfile, struct list_head 
*head, int trim_empty)
}
 }
 
+static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok)
+{
+   struct trailer_item *new = xcalloc(sizeof(*new), 1);
+   new->token = arg_tok->token;
+   new->value = arg_tok->value;
+   arg_tok->token = arg_tok->value = NULL;
+   free_arg_item(arg_tok);
+   return new;
+}
+
 static void add_arg_to_input_list(struct trailer_item *on_tok,
- struct trailer_item *arg_tok)
+ struct arg_item *arg_tok)
 {
-   if (after_or_end(arg_tok->conf.where))
-   list_add(_tok->list, _tok->list);
+   int aoe = after_or_end(arg_tok->conf.where);
+   struct trailer_item *to_add = trailer_from_arg(arg_tok);
+   if (aoe)
+   list_add(_add->list, _tok->list);
else
-   list_add_tail(_tok->list, _tok->list);
+   list_add_tail(_add->list, _tok->list);
 }
 
 static int check_if_different(struct trailer_item *in_tok,
- struct trailer_item *arg_tok,
+ struct arg_item *arg_tok,
  int check_all,
  struct list_head *head)
 {
@@ -200,7 +225,7 @@ static char *apply_command(const char *command, const char 
*arg)
return result;
 }
 
-static void apply_item_command(struct trailer_item *in_tok, struct 
trailer_item *arg_tok)
+static void apply_item_command(struct trailer_item *in_tok, struct arg_item 
*arg_tok)
 {
if (arg_tok->conf.command) {
const char *arg;
@@ -218,13 +243,13 @@ static void apply_item_command(struct trailer_item 
*in_tok, struct trailer_item
 }
 
 static void apply_arg_if_exists(struct trailer_item *in_tok,
-   struct trailer_item *arg_tok,
+   struct arg_item *arg_tok,
struct trailer_item *on_tok,
struct list_head *head)
 {
switch (arg_tok->conf.if_exists) {
case EXISTS_DO_NOTHING:
-   free_trailer_item(arg_tok);
+   free_arg_item(arg_tok);
break;
case EXISTS_REPLACE:
apply_item_command(in_tok, arg_tok);
@@ -241,39 +266,41 @@ static void apply_arg_if_exists(struct trailer_item 
*in_tok,
if (check_if_different(in_tok, arg_tok, 1, head))
add_arg_to_input_list(on_tok, arg_tok);
else
-   free_trailer_item(arg_tok);
+   free_arg_item(arg_tok);
break;
case 

[PATCH v4 3/8] trailer: streamline trailer item create and add

2016-10-20 Thread Jonathan Tan
Currently, creation and addition (to a list) of trailer items are spread
across multiple functions. Streamline this by only having 2 functions:
one to parse the user-supplied string, and one to add the parsed
information to a list.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 130 +-
 1 file changed, 60 insertions(+), 70 deletions(-)

diff --git a/trailer.c b/trailer.c
index 4e85aae..ae3972a 100644
--- a/trailer.c
+++ b/trailer.c
@@ -500,10 +500,31 @@ static int git_trailer_config(const char *conf_key, const 
char *value, void *cb)
return 0;
 }
 
-static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char 
*trailer)
+static const char *token_from_item(struct trailer_item *item, char *tok)
+{
+   if (item->conf.key)
+   return item->conf.key;
+   if (tok)
+   return tok;
+   return item->conf.name;
+}
+
+static int token_matches_item(const char *tok, struct trailer_item *item, int 
tok_len)
+{
+   if (!strncasecmp(tok, item->conf.name, tok_len))
+   return 1;
+   return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0;
+}
+
+static int parse_trailer(struct strbuf *tok, struct strbuf *val,
+const struct conf_info **conf, const char *trailer)
 {
size_t len;
struct strbuf seps = STRBUF_INIT;
+   struct trailer_item *item;
+   int tok_len;
+   struct list_head *pos;
+
strbuf_addstr(, separators);
strbuf_addch(, '=');
len = strcspn(trailer, seps.buf);
@@ -523,74 +544,31 @@ static int parse_trailer(struct strbuf *tok, struct 
strbuf *val, const char *tra
strbuf_addstr(tok, trailer);
strbuf_trim(tok);
}
-   return 0;
-}
-
-static const char *token_from_item(struct trailer_item *item, char *tok)
-{
-   if (item->conf.key)
-   return item->conf.key;
-   if (tok)
-   return tok;
-   return item->conf.name;
-}
-
-static struct trailer_item *new_trailer_item(struct trailer_item *conf_item,
-char *tok, char *val)
-{
-   struct trailer_item *new = xcalloc(sizeof(*new), 1);
-   new->value = val ? val : xstrdup("");
-
-   if (conf_item) {
-   duplicate_conf(>conf, _item->conf);
-   new->token = xstrdup(token_from_item(conf_item, tok));
-   free(tok);
-   } else {
-   duplicate_conf(>conf, _conf_info);
-   new->token = tok;
-   }
-
-   return new;
-}
-
-static int token_matches_item(const char *tok, struct trailer_item *item, int 
tok_len)
-{
-   if (!strncasecmp(tok, item->conf.name, tok_len))
-   return 1;
-   return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0;
-}
-
-static struct trailer_item *create_trailer_item(const char *string)
-{
-   struct strbuf tok = STRBUF_INIT;
-   struct strbuf val = STRBUF_INIT;
-   struct trailer_item *item;
-   int tok_len;
-   struct list_head *pos;
-
-   if (parse_trailer(, , string))
-   return NULL;
-
-   tok_len = token_len_without_separator(tok.buf, tok.len);
 
/* Lookup if the token matches something in the config */
+   tok_len = token_len_without_separator(tok->buf, tok->len);
+   *conf = _conf_info;
list_for_each(pos, _head) {
item = list_entry(pos, struct trailer_item, list);
-   if (token_matches_item(tok.buf, item, tok_len))
-   return new_trailer_item(item,
-   strbuf_detach(, NULL),
-   strbuf_detach(, NULL));
+   if (token_matches_item(tok->buf, item, tok_len)) {
+   char *tok_buf = strbuf_detach(tok, NULL);
+   *conf = >conf;
+   strbuf_addstr(tok, token_from_item(item, tok_buf));
+   free(tok_buf);
+   break;
+   }
}
 
-   return new_trailer_item(NULL,
-   strbuf_detach(, NULL),
-   strbuf_detach(, NULL));
+   return 0;
 }
 
-static void add_trailer_item(struct list_head *head, struct trailer_item *new)
+static void add_trailer_item(struct list_head *head, char *tok, char *val,
+const struct conf_info *conf)
 {
-   if (!new)
-   return;
+   struct trailer_item *new = xcalloc(sizeof(*new), 1);
+   new->token = tok;
+   new->value = val;
+   duplicate_conf(>conf, conf);
list_add_tail(>list, head);
 }
 
@@ -599,21 +577,28 @@ static void process_command_line_args(struct list_head 
*arg_head,
 {
struct string_list_item *tr;
struct trailer_item *item;
+   struct strbuf tok = STRBUF_INIT;
+   struct 

[PATCH v4 6/8] trailer: allow non-trailers in trailer block

2016-10-20 Thread Jonathan Tan
Currently, interpret-trailers requires all lines of a trailer block to
be trailers (or comments) - if not it would not identify that block as a
trailer block, and thus create its own trailer block, inserting a blank
line.  For example:

  echo -e "\nSigned-off-by: x\nnot trailer" |
  git interpret-trailers --trailer "c: d"

would result in:

  Signed-off-by: x
  not trailer

  c: d

Relax the definition of a trailer block to require that the trailers (i)
are all trailers, or (ii) contain at least one Git-generated trailer and
consists of at least 25% trailers.

  Signed-off-by: x
  not trailer
  c: d

(i) is the existing functionality. (ii) allows arbitrary lines to be
included in trailer blocks, like those in [1], and still allow
interpret-trailers to be used.

[1]
https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3

Signed-off-by: Jonathan Tan 
---
 Documentation/git-interpret-trailers.txt |   5 +-
 t/t7513-interpret-trailers.sh| 115 +++
 trailer.c|  89 
 3 files changed, 194 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-interpret-trailers.txt 
b/Documentation/git-interpret-trailers.txt
index 93d1db6..cf4c5ea 100644
--- a/Documentation/git-interpret-trailers.txt
+++ b/Documentation/git-interpret-trailers.txt
@@ -48,8 +48,9 @@ with only spaces at the end of the commit message part, one 
blank line
 will be added before the new trailer.
 
 Existing trailers are extracted from the input message by looking for
-a group of one or more lines that contain a colon (by default), where
-the group is preceded by one or more empty (or whitespace-only) lines.
+a group of one or more lines that (i) are all trailers, or (ii) contains at
+least one Git-generated trailer and consists of at least 25% trailers.
+The group must be preceded by one or more empty (or whitespace-only) lines.
 The group must either be at the end of the message or be the last
 non-whitespace lines before a line that starts with '---'. Such three
 minus signs start the patch part of the message.
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index aee785c..003e90f 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -126,6 +126,121 @@ test_expect_success 'with multiline title in the message' 
'
test_cmp expected actual
 '
 
+test_expect_success 'with non-trailer lines mixed with Signed-off-by' '
+   cat >patch <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   Signed-off-by: a 
+   this is not a trailer
+   EOF
+   cat >expected <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   Signed-off-by: a 
+   this is not a trailer
+   token: value
+   EOF
+   git interpret-trailers --trailer "token: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'with non-trailer lines mixed with cherry picked from' '
+   cat >patch <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   (cherry picked from commit x)
+   this is not a trailer
+   EOF
+   cat >expected <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   (cherry picked from commit x)
+   this is not a trailer
+   token: value
+   EOF
+   git interpret-trailers --trailer "token: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'with non-trailer lines mixed with a configured trailer' '
+   cat >patch <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   My-trailer: x
+   this is not a trailer
+   EOF
+   cat >expected <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   My-trailer: x
+   this is not a trailer
+   token: value
+   EOF
+   test_config trailer.my.key "My-trailer: " &&
+   git interpret-trailers --trailer "token: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'with non-trailer lines mixed with a non-configured 
trailer' '
+   cat >patch <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   I-am-not-configured: x
+   this is not a trailer
+   EOF
+   cat >expected <<-\EOF &&
+
+   this is not a trailer
+   this is not a trailer
+   I-am-not-configured: x
+   this is not a trailer
+
+   token: value
+   EOF
+   test_config trailer.my.key "My-trailer: " &&
+   git 

[PATCH v4 2/8] trailer: use list.h for doubly-linked list

2016-10-20 Thread Jonathan Tan
Replace the existing handwritten implementation of a doubly-linked list
in trailer.c with the functions and macros from list.h. This
significantly simplifies the code.

Signed-off-by: Jonathan Tan 
Signed-off-by: Ramsay Jones 
---
 trailer.c | 258 ++
 1 file changed, 91 insertions(+), 167 deletions(-)

diff --git a/trailer.c b/trailer.c
index 1f191b2..4e85aae 100644
--- a/trailer.c
+++ b/trailer.c
@@ -4,6 +4,7 @@
 #include "commit.h"
 #include "tempfile.h"
 #include "trailer.h"
+#include "list.h"
 /*
  * Copyright (c) 2013, 2014 Christian Couder 
  */
@@ -25,19 +26,24 @@ struct conf_info {
 static struct conf_info default_conf_info;
 
 struct trailer_item {
-   struct trailer_item *previous;
-   struct trailer_item *next;
+   struct list_head list;
char *token;
char *value;
struct conf_info conf;
 };
 
-static struct trailer_item *first_conf_item;
+static LIST_HEAD(conf_head);
 
 static char *separators = ":";
 
 #define TRAILER_ARG_STRING "$ARG"
 
+/* Iterate over the elements of the list. */
+#define list_for_each_dir(pos, head, is_reverse) \
+   for (pos = is_reverse ? (head)->prev : (head)->next; \
+   pos != (head); \
+   pos = is_reverse ? pos->prev : pos->next)
+
 static int after_or_end(enum action_where where)
 {
return (where == WHERE_AFTER) || (where == WHERE_END);
@@ -120,101 +126,49 @@ static void print_tok_val(FILE *outfile, const char 
*tok, const char *val)
fprintf(outfile, "%s%c %s\n", tok, separators[0], val);
 }
 
-static void print_all(FILE *outfile, struct trailer_item *first, int 
trim_empty)
+static void print_all(FILE *outfile, struct list_head *head, int trim_empty)
 {
+   struct list_head *pos;
struct trailer_item *item;
-   for (item = first; item; item = item->next) {
+   list_for_each(pos, head) {
+   item = list_entry(pos, struct trailer_item, list);
if (!trim_empty || strlen(item->value) > 0)
print_tok_val(outfile, item->token, item->value);
}
 }
 
-static void update_last(struct trailer_item **last)
-{
-   if (*last)
-   while ((*last)->next != NULL)
-   *last = (*last)->next;
-}
-
-static void update_first(struct trailer_item **first)
-{
-   if (*first)
-   while ((*first)->previous != NULL)
-   *first = (*first)->previous;
-}
-
 static void add_arg_to_input_list(struct trailer_item *on_tok,
- struct trailer_item *arg_tok,
- struct trailer_item **first,
- struct trailer_item **last)
-{
-   if (after_or_end(arg_tok->conf.where)) {
-   arg_tok->next = on_tok->next;
-   on_tok->next = arg_tok;
-   arg_tok->previous = on_tok;
-   if (arg_tok->next)
-   arg_tok->next->previous = arg_tok;
-   update_last(last);
-   } else {
-   arg_tok->previous = on_tok->previous;
-   on_tok->previous = arg_tok;
-   arg_tok->next = on_tok;
-   if (arg_tok->previous)
-   arg_tok->previous->next = arg_tok;
-   update_first(first);
-   }
+ struct trailer_item *arg_tok)
+{
+   if (after_or_end(arg_tok->conf.where))
+   list_add(_tok->list, _tok->list);
+   else
+   list_add_tail(_tok->list, _tok->list);
 }
 
 static int check_if_different(struct trailer_item *in_tok,
  struct trailer_item *arg_tok,
- int check_all)
+ int check_all,
+ struct list_head *head)
 {
enum action_where where = arg_tok->conf.where;
+   struct list_head *next_head;
do {
-   if (!in_tok)
-   return 1;
if (same_trailer(in_tok, arg_tok))
return 0;
/*
 * if we want to add a trailer after another one,
 * we have to check those before this one
 */
-   in_tok = after_or_end(where) ? in_tok->previous : in_tok->next;
+   next_head = after_or_end(where) ? in_tok->list.prev
+   : in_tok->list.next;
+   if (next_head == head)
+   break;
+   in_tok = list_entry(next_head, struct trailer_item, list);
} while (check_all);
return 1;
 }
 
-static void remove_from_list(struct trailer_item *item,
-struct trailer_item **first,
-struct trailer_item **last)
-{
-   struct trailer_item *next = 

[PATCH v4 7/8] trailer: forbid leading whitespace in trailers

2016-10-20 Thread Jonathan Tan
Currently, interpret-trailers allows leading whitespace in trailer
lines. This leads to false positives, especially for quoted lines or
bullet lists.

Forbid leading whitespace in trailers.

Signed-off-by: Jonathan Tan 
---
 Documentation/git-interpret-trailers.txt |  2 +-
 t/t7513-interpret-trailers.sh| 15 +++
 trailer.c|  2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-interpret-trailers.txt 
b/Documentation/git-interpret-trailers.txt
index cf4c5ea..4966b5b 100644
--- a/Documentation/git-interpret-trailers.txt
+++ b/Documentation/git-interpret-trailers.txt
@@ -55,7 +55,7 @@ The group must either be at the end of the message or be the 
last
 non-whitespace lines before a line that starts with '---'. Such three
 minus signs start the patch part of the message.
 
-When reading trailers, there can be whitespaces before and after the
+When reading trailers, there can be whitespaces after the
 token, the separator and the value. There can also be whitespaces
 inside the token and the value.
 
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 003e90f..3d94b3a 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -241,6 +241,21 @@ test_expect_success 'with non-trailer lines only' '
test_cmp expected actual
 '
 
+test_expect_success 'line with leading whitespace is not trailer' '
+   q_to_tab >patch <<-\EOF &&
+
+   Qtoken: value
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   Qtoken: value
+
+   token: value
+   EOF
+   git interpret-trailers --trailer "token: value" patch >actual &&
+   test_cmp expected actual
+'
+
 test_expect_success 'with config setup' '
git config trailer.ack.key "Acked-by: " &&
cat >expected <<-\EOF &&
diff --git a/trailer.c b/trailer.c
index da15b79..3ef5576 100644
--- a/trailer.c
+++ b/trailer.c
@@ -770,7 +770,7 @@ static int find_trailer_start(struct strbuf **lines, int 
count)
}
 
separator_pos = find_separator(lines[start]->buf);
-   if (separator_pos >= 1) {
+   if (separator_pos >= 1 && !isspace(lines[start]->buf[0])) {
struct list_head *pos;
 
trailer_lines++;
-- 
2.8.0.rc3.226.g39d4020



[PATCH v4 8/8] trailer: support values folded to multiple lines

2016-10-20 Thread Jonathan Tan
Currently, interpret-trailers requires that a trailer be only on 1 line.
For example:

a: first line
   second line

would be interpreted as one trailer line followed by one non-trailer line.

Make interpret-trailers support RFC 822-style folding, treating those
lines as one logical trailer.

Signed-off-by: Jonathan Tan 
---
 Documentation/git-interpret-trailers.txt |   7 +-
 t/t7513-interpret-trailers.sh| 169 +++
 trailer.c|  43 ++--
 3 files changed, 210 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-interpret-trailers.txt 
b/Documentation/git-interpret-trailers.txt
index 4966b5b..e99bda6 100644
--- a/Documentation/git-interpret-trailers.txt
+++ b/Documentation/git-interpret-trailers.txt
@@ -57,11 +57,12 @@ minus signs start the patch part of the message.
 
 When reading trailers, there can be whitespaces after the
 token, the separator and the value. There can also be whitespaces
-inside the token and the value.
+inside the token and the value. The value may be split over multiple lines with
+each subsequent line starting with whitespace, like the "folding" in RFC 822.
 
 Note that 'trailers' do not follow and are not intended to follow many
-rules for RFC 822 headers. For example they do not follow the line
-folding rules, the encoding rules and probably many other rules.
+rules for RFC 822 headers. For example they do not follow
+the encoding rules and probably many other rules.
 
 OPTIONS
 ---
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 3d94b3a..4dd1d7c 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -256,6 +256,175 @@ test_expect_success 'line with leading whitespace is not 
trailer' '
test_cmp expected actual
 '
 
+test_expect_success 'multiline field treated as one trailer for 25% check' '
+   q_to_tab >patch <<-\EOF &&
+
+   Signed-off-by: a 
+   name: value on
+   Qmultiple lines
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   Signed-off-by: a 
+   name: value on
+   Qmultiple lines
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   this is not a trailer
+   name: value
+   EOF
+   git interpret-trailers --trailer "name: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'multiline field treated as atomic for placement' '
+   q_to_tab >patch <<-\EOF &&
+
+   another: trailer
+   name: value on
+   Qmultiple lines
+   another: trailer
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   another: trailer
+   name: value on
+   Qmultiple lines
+   name: value
+   another: trailer
+   EOF
+   test_config trailer.name.where after &&
+   git interpret-trailers --trailer "name: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'multiline field treated as atomic for replacement' '
+   q_to_tab >patch <<-\EOF &&
+
+   another: trailer
+   name: value on
+   Qmultiple lines
+   another: trailer
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   another: trailer
+   another: trailer
+   name: value
+   EOF
+   test_config trailer.name.ifexists replace &&
+   git interpret-trailers --trailer "name: value" patch >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'multiline field treated as atomic for difference check' '
+   q_to_tab >patch <<-\EOF &&
+
+   another: trailer
+   name: first line
+   Qsecond line
+   another: trailer
+   EOF
+   test_config trailer.name.ifexists addIfDifferent &&
+
+   q_to_tab >trailer <<-\EOF &&
+   name: first line
+   Qsecond line
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   another: trailer
+   name: first line
+   Qsecond line
+   another: trailer
+   EOF
+   git interpret-trailers --trailer "$(cat trailer)" patch >actual &&
+   test_cmp expected actual &&
+
+   q_to_tab >trailer <<-\EOF &&
+   name: first line
+   Qsecond line
+   EOF
+   q_to_tab >expected <<-\EOF &&
+
+   another: trailer
+   name: first line
+   

[PATCH v4 5/8] trailer: clarify failure modes in parse_trailer

2016-10-20 Thread Jonathan Tan
The parse_trailer function has a few modes of operation, all depending
on whether the separator is present in its input, and if yes, the
separator's position. Some of these modes are failure modes, and these
failure modes are handled differently depending on whether the trailer
line was sourced from a file or from a command-line argument.

Extract a function to find the separator, allowing the invokers of
parse_trailer to determine how to handle the failure modes instead of
making parse_trailer do it.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 70 +++
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/trailer.c b/trailer.c
index 99018f8..137a3fb 100644
--- a/trailer.c
+++ b/trailer.c
@@ -543,29 +543,40 @@ static int token_matches_item(const char *tok, struct 
arg_item *item, int tok_le
return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0;
 }
 
-static int parse_trailer(struct strbuf *tok, struct strbuf *val,
-const struct conf_info **conf, const char *trailer)
+/*
+ * Return the location of the first separator or '=' in line, or -1 if either a
+ * newline or the null terminator is reached first.
+ */
+static int find_separator(const char *line)
+{
+   const char *c;
+   for (c = line; ; c++) {
+   if (!*c || *c == '\n')
+   return -1;
+   if (*c == '=' || strchr(separators, *c))
+   return c - line;
+   }
+}
+
+/*
+ * Obtain the token, value, and conf from the given trailer.
+ *
+ * separator_pos must not be 0, since the token cannot be an empty string.
+ *
+ * If separator_pos is -1, interpret the whole trailer as a token.
+ */
+static void parse_trailer(struct strbuf *tok, struct strbuf *val,
+const struct conf_info **conf, const char *trailer,
+int separator_pos)
 {
-   size_t len;
-   struct strbuf seps = STRBUF_INIT;
struct arg_item *item;
int tok_len;
struct list_head *pos;
 
-   strbuf_addstr(, separators);
-   strbuf_addch(, '=');
-   len = strcspn(trailer, seps.buf);
-   strbuf_release();
-   if (len == 0) {
-   int l = strlen(trailer);
-   while (l > 0 && isspace(trailer[l - 1]))
-   l--;
-   return error(_("empty trailer token in trailer '%.*s'"), l, 
trailer);
-   }
-   if (len < strlen(trailer)) {
-   strbuf_add(tok, trailer, len);
+   if (separator_pos != -1) {
+   strbuf_add(tok, trailer, separator_pos);
strbuf_trim(tok);
-   strbuf_addstr(val, trailer + len + 1);
+   strbuf_addstr(val, trailer + separator_pos + 1);
strbuf_trim(val);
} else {
strbuf_addstr(tok, trailer);
@@ -587,8 +598,6 @@ static int parse_trailer(struct strbuf *tok, struct strbuf 
*val,
break;
}
}
-
-   return 0;
 }
 
 static void add_trailer_item(struct list_head *head, char *tok, char *val)
@@ -631,11 +640,22 @@ static void process_command_line_args(struct list_head 
*arg_head,
 
/* Add an arg item for each trailer on the command line */
for_each_string_list_item(tr, trailers) {
-   if (!parse_trailer(, , , tr->string))
+   int separator_pos = find_separator(tr->string);
+   if (separator_pos == 0) {
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addstr(, tr->string);
+   strbuf_trim();
+   error(_("empty trailer token in trailer '%.*s'"),
+ (int) sb.len, sb.buf);
+   strbuf_release();
+   } else {
+   parse_trailer(, , , tr->string,
+ separator_pos);
add_arg_item(arg_head,
 strbuf_detach(, NULL),
 strbuf_detach(, NULL),
 conf);
+   }
}
 }
 
@@ -775,11 +795,17 @@ static int process_input_file(FILE *outfile,
 
/* Parse trailer lines */
for (i = trailer_start; i < trailer_end; i++) {
-   if (lines[i]->buf[0] != comment_line_char &&
-   !parse_trailer(, , NULL, lines[i]->buf))
+   int separator_pos;
+   if (lines[i]->buf[0] == comment_line_char)
+   continue;
+   separator_pos = find_separator(lines[i]->buf);
+   if (separator_pos >= 1) {
+   parse_trailer(, , NULL, lines[i]->buf,
+ separator_pos);
add_trailer_item(head,
 strbuf_detach(, NULL),
 

[PATCH v4 0/8] allow non-trailers and multiple-line trailers

2016-10-20 Thread Jonathan Tan
Main changes are:
 - implemented the previously discussed trailer block recognizing rule
   (recognized trailer + 25% trailers or 100% trailers)
 - forbidding leading whitespace in trailers to avoid false positives

Once the recognized trailer + 25% trailers rule is implemented,
implementing the 100% trailer rule gives us backwards compatibility and
is only a few lines of code, so I included it.

Summary of changes from v3:
 2/6->2/8:
   - squashed Ramsay Jones's "static" patch
 new->5/8:
   - new patch
 5/6->6/8:
   - new trailer block recognizing rule
   - reverted to the existing behavior of ignoring comments, since the
 number of trailers and non-trailers in the trailer block now
 matters more
 new->7/8:
   - new patch
 6/6->8/8:
   - updated trailer block recognizing code, since the continuation
 lines must not be counted if they follow a trailer line

Jonathan Tan (8):
  trailer: improve const correctness
  trailer: use list.h for doubly-linked list
  trailer: streamline trailer item create and add
  trailer: make args have their own struct
  trailer: clarify failure modes in parse_trailer
  trailer: allow non-trailers in trailer block
  trailer: forbid leading whitespace in trailers
  trailer: support values folded to multiple lines

 Documentation/git-interpret-trailers.txt |  14 +-
 t/t7513-interpret-trailers.sh| 299 +++
 trailer.c| 619 +--
 3 files changed, 651 insertions(+), 281 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



[PATCH v4 1/8] trailer: improve const correctness

2016-10-20 Thread Jonathan Tan
Change "const char *" to "char *" in struct trailer_item and in the
return value of apply_command (since those strings are owned strings).

Change "struct conf_info *" to "const struct conf_info *" (since that
struct is not modified).

Signed-off-by: Jonathan Tan 
---
 trailer.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/trailer.c b/trailer.c
index c6ea9ac..1f191b2 100644
--- a/trailer.c
+++ b/trailer.c
@@ -27,8 +27,8 @@ static struct conf_info default_conf_info;
 struct trailer_item {
struct trailer_item *previous;
struct trailer_item *next;
-   const char *token;
-   const char *value;
+   char *token;
+   char *value;
struct conf_info conf;
 };
 
@@ -95,8 +95,8 @@ static void free_trailer_item(struct trailer_item *item)
free(item->conf.name);
free(item->conf.key);
free(item->conf.command);
-   free((char *)item->token);
-   free((char *)item->value);
+   free(item->token);
+   free(item->value);
free(item);
 }
 
@@ -215,13 +215,13 @@ static struct trailer_item *remove_first(struct 
trailer_item **first)
return item;
 }
 
-static const char *apply_command(const char *command, const char *arg)
+static char *apply_command(const char *command, const char *arg)
 {
struct strbuf cmd = STRBUF_INIT;
struct strbuf buf = STRBUF_INIT;
struct child_process cp = CHILD_PROCESS_INIT;
const char *argv[] = {NULL, NULL};
-   const char *result;
+   char *result;
 
strbuf_addstr(, command);
if (arg)
@@ -425,7 +425,7 @@ static int set_if_missing(struct conf_info *item, const 
char *value)
return 0;
 }
 
-static void duplicate_conf(struct conf_info *dst, struct conf_info *src)
+static void duplicate_conf(struct conf_info *dst, const struct conf_info *src)
 {
*dst = *src;
if (src->name)
-- 
2.8.0.rc3.226.g39d4020



Re: [PATCH 2/2] tag: send fully qualified refnames to verify_tag_and_format

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 12:57:01PM -0400, Santiago Torres wrote:

> > I think you'd really just squash the various bits of this into your
> > series at the right spots, though I don't mind it on top, either.
> > 
> > The big question is to what degree we should care about the verify-tag
> > case. I don't think it's any worse off with this change than it is with
> > your series (its "kind" becomes "OTHER", but I don't think that is
> > actually used for display at all; the name remains the same). I'd be OK
> > with leaving it like this, as a known bug, until get_sha1_with_context()
> > learns to tell us about the ref. It's an unhandled corner case in a
> > brand-new feature, not a regression in an existing one.
> 
> I see now, I think I can sprinkle some of these changes on 2/7 then. The
> rest should be doing 4/7 and 5/7. Does this sound ok?

Yep, that sounds about right.

-Peff


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 12:54:32PM -0700, Stefan Beller wrote:

> Maybe we should stop introducing un-optimized tests.
> [...]
> * heavy use of the "git -C " pattern. When applying that
>   thouroughly we'd save spanning the subshells.

Yeah, I imagine with some style changes we could drop quite a few
subshells. The problem is that the conversion work is manual and
tedious. I'd look first for spots where we can eliminate thousands of
calls with a single change.

> That said I really like the idea of having a helper that would eliminate the 
> cat
> for you, e.g. :
> 
> git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
> --porcelain=v2 --branch --untracked-files=all <<-EOF
> 1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
> 1 AM S.M. 00 16 16 $_z40 $HSUP sub1
> EOF

I think that helper still ends up using "cat" and "diff" under the hood,
unless you write those bits in pure shell. But at that point, I suspect
we could "cat" and "test_cmp" in pure shell, something like:

cat () {
# optimize common here-doc usage
if test $# -eq 0
then
while read -r line
do
printf '%s' "$line"
done
fi
command cat "$@"
}

test_cmp () {
# optimize for common "they are the same" case
# without any subshells or subprograms
while true; do
if ! read -r line1 <&3
then
if ! read -r line2 <&4
# EOF on both; good
return 0
else
# EOF only on file1; fail
break
fi
fi
if ! read -r line2 <&4
then
# EOF only on file2; fail
break
fi
test "$line1" = "$line2" || break
done 3<"$1" 4<"$2"

# if we get here, the optimized version found some
# difference. We can just "return 1", but let's run
# the real $GIT_TEST_CMP to provide pretty output.
# This should generally only happen on test failures,
# so performance isn't a big deal.
"$GIT_TEST_CMP" "$@"
}

Those are both completely untested. But maybe they are worth playing
around with for somebody on Windows to see if they make a dent in the
test runtime.

-Peff


Re: [ANNOUNCE] git-log-compact v1.0

2016-10-20 Thread Kevin Daudt
On Wed, Oct 19, 2016 at 05:13:34PM -0700, Kyle J. McKay wrote:
> 
> The project page with detailed help and many screen shots is located at:
> 
>   https://mackyle.github.io/git-log-compact/
> 
> Alternatively the repository can be cloned from:
> 
>   https://github.com/mackyle/git-log-compact.git
> 
> Or the script file itself (which is really all you need) can be
> viewed/fetched from:
> 
>   https://github.com/mackyle/git-log-compact/blob/HEAD/git-log-compact
> 
> The git-log-compact script should work with any version of Git released
> in the last several years.
> 
> --Kyle
> 
> [1] https://git.github.io/rev_news/2016/10/19/edition-20/

I've packaged up an arch AUR package[1] for it if anyone is interested.

[1]:https://aur.archlinux.org/packages/git-log-compact


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 10:38:23PM +0200, Johannes Sixt wrote:

> Am 20.10.2016 um 14:31 schrieb Jeff King:
> > Close to 1/3 of those processes are just invoking the bin-wrapper
> > script to set up the EXEC_PATH, etc. I imagine it would not be too hard
> > to just do that in the test script. In fact, it looks like:
> > 
> >   make prefix=/wherever install
> >   GIT_TEST_INSTALLED=/wherever/bin make test
> > 
> > might give you an immediate speedup by skipping bin-wrappers entirely.
> 
> Running the tests with --with-dashes should give you the same effect, no?

Yeah, looks like it. it still uses bin-wrappers for t/helper, but that
should be a minority of calls.

Which I think explains why I saw some test failures with the
GIT_TEST_INSTALLED above. It does not know about t/helper, but relies on
those programs being present in $GIT_BUILD_DIR. So I suspect it has been
totally broken since e6e7530d10 (test helpers: move test-* to t/helper/
subdirectory, 2016-04-13).

-Peff


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 09:30:27AM -0700, Stefan Beller wrote:

> On Thu, Oct 20, 2016 at 5:31 AM, Jeff King  wrote:
> 
> >
> > $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c 
> > | sort -rn | head
> >  152271 /home/peff/compile/git/git
> >   57340 /home/peff/compile/git/t/../bin-wrappers/git
> >   16865 /bin/sed
> >   12650 /bin/rm
> >   11257 /bin/cat
> >9326 /home/peff/compile/git/git-sh-i18n--envsubst
> >9079 /usr/bin/diff
> >8013 /usr/bin/wc
> >5924 /bin/mv
> >4566 /bin/grep
> >
> 
> I am not an expert on perl nor tracing, but is it feasible to find out
> how many internal calls there are? i.e. either some shell script (rebase,
> submodule) calling git itself a couple of times or even from compile/git/git
> itself, e.g. some submodule operations use forking in there.

The script below is my attempt, though I think it is not quite right, as
"make" should be the single apex of the graph. You can run it like:

  strace -f -o /tmp/foo.out -e clone,execve make test
  perl graph.pl /tmp/foo.out | less -S

One thing that it counts (that was not counted above) is the number of
forks for subshells, which is considerable. I don't know how expensive
that is versus, say, running "cat" (if your fork() doesn't
copy-on-write, and you implement sub-programs via an efficient spawn()
call, it's possible that the subshells are significantly more
expensive).

-Peff

-- >8 --
#!/usr/bin/perl

my %clone;
my %exec;
my %is_child;
my %counter;
while (<>) {
#  execve("some-prog", ...
if (/^(\d+)\s+execve\("(.*?)"/) {
push @{$exec{node($1)}}, $2;
}
#  clone(...) = 
#   or
#  <... clone resumed> ...) = 
elsif (/^(\d+)\s+.*clone.*\) = (\d+)$/) {
push @{$clone{node($1)}}, node($2);
$is_child{node($2)} = 1;
}
#  +++ exited with  +++
# We have to keep track of this because pids get recycled,
# and so are not unique node names in our graph.
elsif (/^(\d+)\s+.*exited with/) {
$counter{$1}++;
}
}

show($_, 0) for grep { !$is_child{$_} } keys(%clone);

sub show {
my ($pid, $indent) = @_;

my @progs = @{$exec{$pid}};
if (!@progs) {
@progs = ("(fork)");
}

print ' ' x $indent;
print "$pid: ", shift @progs;
print " => $_" for @progs;
print "\n";

show($_, $indent + 2) for @{$clone{$pid}};
}

sub node {
my $pid = shift;
my $c = $counter{$pid} || "0";
return "$pid-$c";
}


[PATCH v3] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Jacob Keller
From: Jacob Keller 

When adding support for prefixing output of log and other commands using
--line-prefix, commit 660e113ce118 ("graph: add support for
--line-prefix on all graph-aware output", 2016-08-31) accidentally
broke rev-list --header output.

In order to make the output appear with a line-prefix, the flow was
changed to always use the graph subsystem for display. Unfortunately
the graph flow in rev-list did not use info->hdr_termination as it was
assumed that graph output would never need to putput NULs.

Since we now always use the graph code in order to handle the case of
line-prefix, simply replace putchar('\n') with
putchar(info->hdr_termination) which will correct this issue.

Add a test for the --header case to make sure we don't break it in the
future.

Reported-by: Dennis Kaarsemaker 
Signed-off-by: Jacob Keller 
Signed-off-by: Junio C Hamano 
Signed-off-by: Jacob Keller 
---
Changes in v2
* Squash Junio's suggested (better) test
* Add Junio's signed-off-by since he wrote the new test

Changes in v3
* Fix commit description to not reference the no longer existing test

 builtin/rev-list.c   |  2 +-
 t/t6000-rev-list-misc.sh | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git c/builtin/rev-list.c c/builtin/rev-list.c
index 8479f6ed28aa..c43decda7011 100644
--- c/builtin/rev-list.c
+++ c/builtin/rev-list.c
@@ -145,7 +145,7 @@ static void show_commit(struct commit *commit, void *data)
 */
if (buf.len && buf.buf[buf.len - 1] == '\n')
graph_show_padding(revs->graph);
-   putchar('\n');
+   putchar(info->hdr_termination);
} else {
/*
 * If the message buffer is empty, just show
diff --git c/t/t6000-rev-list-misc.sh c/t/t6000-rev-list-misc.sh
index 3e752ce03280..969e4e9e5261 100755
--- c/t/t6000-rev-list-misc.sh
+++ c/t/t6000-rev-list-misc.sh
@@ -100,4 +100,18 @@ test_expect_success '--bisect and --first-parent can not 
be combined' '
test_must_fail git rev-list --bisect --first-parent HEAD
 '
 
+test_expect_success '--header shows a NUL after each commit' '
+   # We know that there is no Q in the true payload; names and
+   # addresses of the authors and the committers do not have
+   # any, and object names or header names do not, either.
+   git rev-list --header --max-count=2 HEAD |
+   nul_to_q |
+   grep "^Q" >actual &&
+   cat >expect <<-EOF &&
+   Q$(git rev-parse HEAD~1)
+   Q
+   EOF
+   test_cmp expect actual
+'
+
 test_done

base-commit: 659889482ac63411daea38b2c3d127842ea04e4d
-- 
git-series 0.8.10


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Dennis Kaarsemaker
On Thu, 2016-10-20 at 08:31 -0400, Jeff King wrote:

> I'm also not entirely convinced that the test suite being a shell script
> is the main culprit for its slowness. We run git a lot of times, and
> that's inherent in testing it. I ran the whole test suite under
> "strace -f -e execve". There are ~335K execs. Here's the breakdown of
> the top ones:

You're measuring execve's, but fork (well, fork emulation. There's no
actual fork) is also expensive on windows iirc, so subshells add a lot
to this cost. That said, strace -eclone says that a 'make test' forks
~408k times, and while this is significantly more than the amount of
execs in your example, this does include cvs and svn tests and it's
still in the same ballpark.

D.


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Johannes Sixt

Am 20.10.2016 um 14:31 schrieb Jeff King:

Close to 1/3 of those processes are just invoking the bin-wrapper
script to set up the EXEC_PATH, etc. I imagine it would not be too hard
to just do that in the test script. In fact, it looks like:

  make prefix=/wherever install
  GIT_TEST_INSTALLED=/wherever/bin make test

might give you an immediate speedup by skipping bin-wrappers entirely.


Running the tests with --with-dashes should give you the same effect, no?

-- Hannes



[PATCH v2] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Jacob Keller
From: Jacob Keller 

When adding support for prefixing output of log and other commands using
--line-prefix, commit 660e113ce118 ("graph: add support for
--line-prefix on all graph-aware output", 2016-08-31) accidentally
broke rev-list --header output.

In order to make the output appear with a line-prefix, the flow was
changed to always use the graph subsystem for display. Unfortunately
the graph flow in rev-list did not use info->hdr_termination as it was
assumed that graph output would never need to putput NULs.

Since we now always use the graph code in order to handle the case of
line-prefix, simply replace putchar('\n') with
putchar(info->hdr_termination) which will correct this issue.

Add a test for the --header case to make sure we don't break it in the
future. Implement a helper function test_ends_with_nul() to make it more
obvious what sort of check we are looking for.

Reported-by: Dennis Kaarsemaker 
Signed-off-by: Jacob Keller 
Signed-off-by: Junio C Hamano 
Signed-off-by: Jacob Keller 
---

Changes in v2
* Squash Junio's suggested (better) test
* Add Junio's signed-off-by since he wrote the new test

 builtin/rev-list.c   |  2 +-
 t/t6000-rev-list-misc.sh | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git c/builtin/rev-list.c c/builtin/rev-list.c
index 8479f6ed28aa..c43decda7011 100644
--- c/builtin/rev-list.c
+++ c/builtin/rev-list.c
@@ -145,7 +145,7 @@ static void show_commit(struct commit *commit, void *data)
 */
if (buf.len && buf.buf[buf.len - 1] == '\n')
graph_show_padding(revs->graph);
-   putchar('\n');
+   putchar(info->hdr_termination);
} else {
/*
 * If the message buffer is empty, just show
diff --git c/t/t6000-rev-list-misc.sh c/t/t6000-rev-list-misc.sh
index 3e752ce03280..969e4e9e5261 100755
--- c/t/t6000-rev-list-misc.sh
+++ c/t/t6000-rev-list-misc.sh
@@ -100,4 +100,18 @@ test_expect_success '--bisect and --first-parent can not 
be combined' '
test_must_fail git rev-list --bisect --first-parent HEAD
 '
 
+test_expect_success '--header shows a NUL after each commit' '
+   # We know that there is no Q in the true payload; names and
+   # addresses of the authors and the committers do not have
+   # any, and object names or header names do not, either.
+   git rev-list --header --max-count=2 HEAD |
+   nul_to_q |
+   grep "^Q" >actual &&
+   cat >expect <<-EOF &&
+   Q$(git rev-parse HEAD~1)
+   Q
+   EOF
+   test_cmp expect actual
+'
+
 test_done

base-commit: 659889482ac63411daea38b2c3d127842ea04e4d
-- 
git-series 0.8.10


Re: [PATCH v4 23/25] sequencer: quote filenames in error messages

2016-10-20 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> This makes the code consistent by fixing quite a couple of error messages.
>>
>> Suggested by Jakub Narębski.
>>
>> Signed-off-by: Johannes Schindelin 
>> ---
>
> These finishing touches in 21-23 look all sensible to me.

Make that 21-25.  I finished reading to the end and it was mostly a
pleasnt read, except for a few things I noticed and sent reviews
separately.

Thanks.


Re: [PATCH v4 23/25] sequencer: quote filenames in error messages

2016-10-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> This makes the code consistent by fixing quite a couple of error messages.
>
> Suggested by Jakub Narębski.
>
> Signed-off-by: Johannes Schindelin 
> ---

These finishing touches in 21-23 look all sensible to me.


Re: [PATCH v4 20/25] sequencer: refactor write_message()

2016-10-20 Thread Junio C Hamano
Junio C Hamano  writes:

> If I were doing this, I would make this into three separate steps:
>
> - move the strbuf_release(msgbuf) to the caller in
>   do_pick_commit();
>
> - add the missing rollback_lock_file(), which is a bugfix; and
>   then finally
>
> - allow the helper to take not a strbuf but  pair as
>   parameters.
>
> The end result of this patch achieves two thirds of the above, but
> especially given that write_message() only has two call sites in a
> single function, I think it is OK and preferrable even to do all
> three.

Ah, make that four steps.  The final one is:

- add append_eol parameter that nobody uses at this step in the
  series.

This is a new feature to the helper.  While it is OK to have it as a
preparatory step in this series, it is easier to understand if it
kept as a separate step.  It is even more preferrable if it is made
as a preparatory step in a series that adds a caller that passes
true to append_eol to this helper, or if that real change is small
enough, part of that patch that adds such a caller, not as a
separate step.




Re: [PATCH v4 20/25] sequencer: refactor write_message()

2016-10-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> The write_message() function safely writes an strbuf to a file.
> Sometimes it is inconvenient to require an strbuf, though: the text to
> be written may not be stored in a strbuf, or the strbuf should not be
> released after writing.
>
> Let's refactor "safely writing string to a file" into
> write_with_lock_file(), and make write_message() use it.
>
> The new function will make it easy to create a new convenience function
> write_file_gently() that does not die(); as some of the upcoming callers
> of this new function will want to append a newline character, we already
> add that flag as parameter to write_with_lock_file().
>
> While at it, roll back the locked files in case of failure, as pointed
> out by Hannes Sixt.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 25 -
>  1 file changed, 20 insertions(+), 5 deletions(-)

Once a helper function starts taking  pair, not a strbuf,
it becomes obvious that it does not make much sense to calling
strbuf_release() from the helper.  It is caller's job to do the
memory management of the strbuf it uses to pass information to the
helper, and the current api into write_message() is klunky.

If I were doing this, I would make this into three separate steps:

- move the strbuf_release(msgbuf) to the caller in
  do_pick_commit();

- add the missing rollback_lock_file(), which is a bugfix; and
  then finally

- allow the helper to take not a strbuf but  pair as
  parameters.

The end result of this patch achieves two thirds of the above, but
especially given that write_message() only has two call sites in a
single function, I think it is OK and preferrable even to do all
three.


Re: [PATCH v4 18/25] sequencer: do not try to commit when there were merge conflicts

2016-10-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> The return value of do_recursive_merge() may be positive (indicating merge
> conflicts), or 0 (indicating success). It also may be negative, indicating
> a fatal error that requires us to abort.
>
> Now, if the return value indicates that there are merge conflicts, we
> should not try to commit those changes, of course.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index cbc3742..9ffc090 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -787,7 +787,7 @@ static int do_pick_commit(enum todo_command command, 
> struct commit *commit,
>   res = allow;
>   goto leave;
>   }
> - if (!opts->no_commit)
> + if (!res && !opts->no_commit)
>   res = run_git_commit(opts->edit ? NULL : git_path_merge_msg(),
>opts, allow, opts->edit, 0, 0);

This by itself looks more like a bugfix than preparation for later
steps.  The only reason why it is not a bugfix is because there is
nothing in this function that makes res a non-zero value and reach
this if statement at this step.  We would have been caught by an 
"if (res) { ... rerere(); goto leave; }" or 
"if (allow < 0) { res = allow; goto leave; }" 
that appear before this part of the code.

So while it is not wrong per-se, I think this should be part of an
actual change that makes it possible for the control flow to reach
here with non-zero res.




Re: [PATCH v4 17/25] sequencer: support cleaning up commit messages

2016-10-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> The run_git_commit() function already knows how to amend commits, and
> with this new option, it can also clean up commit messages (i.e. strip
> out commented lines). This is needed to implement rebase -i's 'fixup'
> and 'squash' commands as sequencer commands.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index fa77c82..cbc3742 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -481,7 +481,8 @@ static char **read_author_script(void)
>   * author metadata.
>   */
>  static int run_git_commit(const char *defmsg, struct replay_opts *opts,
> -   int allow_empty, int edit, int amend)
> +   int allow_empty, int edit, int amend,
> +   int cleanup_commit_message)
>  {
>   char **env = NULL;
>   struct argv_array array;

Looks OK to me.  

This starts to look like calling for a single flag word even more,
but it is a file-local helper so we can clean it up if it becomes
necessary without affecting too many things later.

> @@ -518,9 +519,12 @@ static int run_git_commit(const char *defmsg, struct 
> replay_opts *opts,
>   argv_array_push(, "-s");
>   if (defmsg)
>   argv_array_pushl(, "-F", defmsg, NULL);
> + if (cleanup_commit_message)
> + argv_array_push(, "--cleanup=strip");
>   if (edit)
>   argv_array_push(, "-e");
> - else if (!opts->signoff && !opts->record_origin &&
> + else if (!cleanup_commit_message &&
> +  !opts->signoff && !opts->record_origin &&
>git_config_get_value("commit.cleanup", ))
>   argv_array_push(, "--cleanup=verbatim");
>  
> @@ -785,7 +789,7 @@ static int do_pick_commit(enum todo_command command, 
> struct commit *commit,
>   }
>   if (!opts->no_commit)
>   res = run_git_commit(opts->edit ? NULL : git_path_merge_msg(),
> -  opts, allow, opts->edit, 0);
> +  opts, allow, opts->edit, 0, 0);
>  
>  leave:
>   free_message(commit, );


Re: [PATCH] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Jacob Keller
On Thu, Oct 20, 2016 at 11:54 AM, Junio C Hamano  wrote:
>
> The main part of the patch looks good.  For "passing NUL to sed",
> I'd probably work it around like so:
>

Yep. I wasn't sure on the test as it was, because of the portability concern.

>  t/t6000-rev-list-misc.sh | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> index e8c6979baf..737026c34f 100755
> --- a/t/t6000-rev-list-misc.sh
> +++ b/t/t6000-rev-list-misc.sh
> @@ -4,12 +4,6 @@ test_description='miscellaneous rev-list tests'
>
>  . ./test-lib.sh
>
> -test_ends_with_nul() {
> -   printf "\0" >nul
> -   sed '$!d' "$@" >contents
> -   test_cmp_bin nul contents
> -}
> -
>  test_expect_success setup '
> echo content1 >wanted_file &&
> echo content2 >unwanted_file &&
> @@ -107,8 +101,17 @@ test_expect_success '--bisect and --first-parent can not 
> be combined' '
>  '
>
>  test_expect_success '--header shows a NUL after each commit' '
> -   git rev-list --header --max-count=1 HEAD | sed \$!d >actual &&
> -   test_ends_with_nul actual
> +   # We know there is no Q in the true payload; names and
> +   # addresses of the authors and the committers do not have
> +   # any, and object names or header names do not, either.
> +   git rev-list --header --max-count=2 HEAD |
> +   nul_to_q |
> +   grep "^Q" >actual &&
> +   cat >expect <<-EOF &&
> +   Q$(git rev-parse HEAD~1)
> +   Q
> +   EOF
> +   test_cmp expect actual
>  '
>
>  test_done

I will squash this in and re-send.

Thanks,
Jake


Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Jacob Keller
On Thu, Oct 20, 2016 at 11:41 AM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>>> I am not sure if that is OK.  I think it is less not-OK than the use
>>> case I mentioned in my earlier message, in that this is not a case
>>> that "please don't do it" breaks.  It however is an inconvenience
>>> that the user has to say "git add file" before the "git commit" (or
>>> "git commit file") to conclude the sequence.
>>>
>>> So I dunno.
>>
>> Hmmm.. Ya ok I don't think we can actually distinguish between these
>> two work flows.
>
> What we might want to have in "git commit " is a new mode
> that is totally different from -i/-o that says roughly "Start from
> the tree of HEAD, pretend as if you removed all the paths that match
> the given pathspec from the tree, and then added all the entries in
> the index that match that pathspec.  Write that tree and commit.
> Take nothing from the working tree".  I have a feeling that when
> people do
>
> $ git add -p file1 file2 file3
> $ git commit file2
>
> and ends up including _all_ changes made to file2, not just the ones
> they picked in the earlier part of the workflow, they are expecting
> such a behaviour.
>

Right now I think people who use it intentionally do expect it to work
that way. I just happen to not have wanted to add  but did so
anyways without considering, and thus I ended up including changes
that were for the next commit.

As long as there is a way to change "git commit" default from that
mode then we could make the default work and then let people configure
it to what makes sense.

I'll take a look at going this route.

Thanks,
Jake


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 3:50 AM, Johannes Schindelin
 wrote:
> Hi peff,
>
> On Wed, 19 Oct 2016, Jeff King wrote:
>
>> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
>>
>> > > Maybe we should start optimizing the tests...
>> >

Maybe we should stop introducing un-optimized tests.
For other reasons I just stumbled upon t7064
(porcelain V2 output for git status)


This is how an arbitrary test looks like:

test_expect_success 'staged changes in added submodule (AM S.M.)' '
(cd super_repo &&
## stage the changes in the submodule.
(cd sub1 &&
git add file_in_sub
) &&

HMOD=$(git hash-object -t blob -- .gitmodules) &&
HSUP=$(git rev-parse HEAD) &&
HSUB=$HSUP &&

cat >expect <<-EOF &&
# branch.oid $HSUP
# branch.head master
# branch.upstream origin/master
# branch.ab +0 -0
1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
1 AM S.M. 00 16 16 $_z40 $HSUB sub1
EOF

git status --porcelain=v2 --branch --untracked-files=all >actual &&
test_cmp expect actual
)
'

Following "modern" Git tests I would have expected:

* heavy use of the "git -C " pattern. When applying that
  thouroughly we'd save spanning the subshells.
* no `cd` on the same line as the opening paren.
  (This is style and would derail the performance discussion)

test_expect_success 'staged changes in added submodule (AM S.M.)' '
git -C super_repo/sub1 add file_in_sub &&
HMOD=$(git -C super_repo hash-object -t blob -- .gitmodules) &&
HSUP=$(git -C super_repo rev-parse HEAD) &&
# as a comment: HSUB is equal to HSUP, because ...

cat >expect <<-EOF &&
# branch.oid $HSUP
# branch.head master
# branch.upstream origin/master
# branch.ab +0 -0
1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
1 AM S.M. 00 16 16 $_z40 $HSUP sub1
EOF

git -C super_repo status --porcelain=v2 --branch
--untracked-files=all >../actual &&
test_cmp expect actual
'

That said I really like the idea of having a helper that would eliminate the cat
for you, e.g. :

git_test_helper_equal_stdin_or_diff_and_die -C super_repo status
--porcelain=v2 --branch --untracked-files=all <<-EOF
1 A. N... 00 100644 100644 $_z40 $HMOD .gitmodules
1 AM S.M. 00 16 16 $_z40 $HSUP sub1
EOF

Thanks,
Stefan


Re: [PATCHv3] submodule--helper: normalize funny urls

2016-10-20 Thread Junio C Hamano
Stefan Beller  writes:

> My thought was to fix it nevertheless, such that the url recorded as
> remote.origin.url is always the first case (no l or /. at the end).
>
> If we were to add this fix to clone, then it may be easier to debug
> submodule url schemes for users as the submodule url would then
> be a concatenation of remote.origin.url and the relative part.
>
> That seems easier to understand than ${remote.origin.url%%/.} +
> relative path, maybe? (Because then the user doesn't need to guess
> or remember historical behavior that is wrong on how this)

Are you declaring that trailing / or /. will now be illegal?  If you
are declaring that, then I agree that new codepaths no longer have
to worry about "strip / or /. before concatenating" and will
simplify things for them.  But otherwise, such a "fix" also would
have an effect of hiding bugs from codepaths.  They still need to be
prepared to see any of the three variants and cope with them
correctly, right?


Re: [PATCHv3] submodule--helper: normalize funny urls

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 12:26 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Do we actually want to fix git-clone as well?
>
> If I understand correctly, the point of this fix is to make it not
> to matter whether the original URL the end user gives or recorded as
> the remote by "git clone" in the repository is any one of:
>
> $any_leading_part/path/to/dir
> $any_leading_part/path/to/dir/
> $any_leading_part/path/to/dir/.
>
> So I do not think there is anything to "fix", as long as "git clone"

Yes "git clone" works with any of the three above.

My thought was to fix it nevertheless, such that the url recorded as
remote.origin.url is always the first case (no l or /. at the end).

If we were to add this fix to clone, then it may be easier to debug
submodule url schemes for users as the submodule url would then
be a concatenation of remote.origin.url and the relative part.

That seems easier to understand than ${remote.origin.url%%/.} +
relative path, maybe? (Because then the user doesn't need to guess
or remember historical behavior that is wrong on how this)

> that is given any one of the above three records any one of the
> above three as the result.  It _may_ be desirable if the result is
> identical what was given as input, but I do not offhand think that
> is required.
>
>> I tried and then I see breakage in 5603-clone-dirname
>> as ssh://host seems to be an invalid url; it has to end with a slash?
>
> That is a separate issue, isn't it?  We shouldn't be touching the
> leading ":///" part, I would think.

I agree, So I'll first fix the submodule parts only.

>
> For example, a URL "../another" relative to "ssh://host/path" may be
> "ssh://host/another", but shouldn't it be an error to take
> "../../outside" relative to "ssh://host/path"?

That is correct. I'll stop looking at clone code.


Re: [PATCHv3] submodule--helper: normalize funny urls

2016-10-20 Thread Junio C Hamano
Stefan Beller  writes:

> Do we actually want to fix git-clone as well?

If I understand correctly, the point of this fix is to make it not
to matter whether the original URL the end user gives or recorded as
the remote by "git clone" in the repository is any one of:

$any_leading_part/path/to/dir
$any_leading_part/path/to/dir/
$any_leading_part/path/to/dir/.

So I do not think there is anything to "fix", as long as "git clone"
that is given any one of the above three records any one of the
above three as the result.  It _may_ be desirable if the result is
identical what was given as input, but I do not offhand think that
is required.

> I tried and then I see breakage in 5603-clone-dirname
> as ssh://host seems to be an invalid url; it has to end with a slash?

That is a separate issue, isn't it?  We shouldn't be touching the
leading ":///" part, I would think.  

For example, a URL "../another" relative to "ssh://host/path" may be
"ssh://host/another", but shouldn't it be an error to take
"../../outside" relative to "ssh://host/path"?


Re: [PATCHv3] submodule--helper: normalize funny urls

2016-10-20 Thread Stefan Beller
On Tue, Oct 18, 2016 at 7:05 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>> I am not sure.  Certainly we would want to make sure that the normal
>>> case (i.e. no funny trailing junk) to work correctly, but we do want
>>> to protect the fix from future breakage as well, no?
>>
>> Exactly. So not intermediate "root" that we clone from, but adapting the
>> relative URLs. Maybe half the broken tests can switch to 'root' and the 
>> others
>> go with the current behavior of cloning . to super.
>>>
>>> Perhaps we can do a preliminary step to update tests to primarily
>>> check the cases that do not involve URL with trailing "/." by either
>>> doing that double clone, or more preferrably, clone from "$(pwd)"
>>> and adjust the incorrect submodule reference that have been relying
>>> on the buggy behaviour.  With that "root" trick, the test would pass
>>> with or without the fix under discussion, right?
>>
>> I assume so, (not tested).
>
> OK.  Thanks for sticking with it.

Do we actually want to fix git-clone as well?
I tried and then I see breakage in 5603-clone-dirname
as ssh://host seems to be an invalid url; it has to end with a slash?

If we were to fix clone as well, then we'd still have a lot of possible stale
data (ending in /.) out there, so maybe we want to not fix clone for
now and only
fix it when computing the submodule url.

So I'll first fix up the test suite to not rely on buggy behavior and
then send this patch
with no change in tests? That sounds strange to me as it hides away the cause.


Re: [PATCH] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Junio C Hamano
Jacob Keller  writes:

> diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> index 3e752ce03280..e8c6979baf59 100755
> --- a/t/t6000-rev-list-misc.sh
> +++ b/t/t6000-rev-list-misc.sh
> @@ -4,6 +4,12 @@ test_description='miscellaneous rev-list tests'
>  
>  . ./test-lib.sh
>  
> +test_ends_with_nul() {
> + printf "\0" >nul
> + sed '$!d' "$@" >contents
> + test_cmp_bin nul contents
> +}
> +
>  test_expect_success setup '
>   echo content1 >wanted_file &&
>   echo content2 >unwanted_file &&
> @@ -100,4 +106,9 @@ test_expect_success '--bisect and --first-parent can not 
> be combined' '
>   test_must_fail git rev-list --bisect --first-parent HEAD
>  '
>  
> +test_expect_success '--header shows a NUL after each commit' '
> + git rev-list --header --max-count=1 HEAD | sed \$!d >actual &&
> + test_ends_with_nul actual
> +'
> +
>  test_done

Thanks.

The main part of the patch looks good.  For "passing NUL to sed",
I'd probably work it around like so:

 t/t6000-rev-list-misc.sh | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index e8c6979baf..737026c34f 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -4,12 +4,6 @@ test_description='miscellaneous rev-list tests'
 
 . ./test-lib.sh
 
-test_ends_with_nul() {
-   printf "\0" >nul
-   sed '$!d' "$@" >contents
-   test_cmp_bin nul contents
-}
-
 test_expect_success setup '
echo content1 >wanted_file &&
echo content2 >unwanted_file &&
@@ -107,8 +101,17 @@ test_expect_success '--bisect and --first-parent can not 
be combined' '
 '
 
 test_expect_success '--header shows a NUL after each commit' '
-   git rev-list --header --max-count=1 HEAD | sed \$!d >actual &&
-   test_ends_with_nul actual
+   # We know there is no Q in the true payload; names and
+   # addresses of the authors and the committers do not have
+   # any, and object names or header names do not, either.
+   git rev-list --header --max-count=2 HEAD |
+   nul_to_q |
+   grep "^Q" >actual &&
+   cat >expect <<-EOF &&
+   Q$(git rev-parse HEAD~1)
+   Q
+   EOF
+   test_cmp expect actual
 '
 
 test_done


Re: [PATCH] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Dennis Kaarsemaker
On Thu, 2016-10-20 at 11:19 -0700, Jacob Keller wrote:
> Here's my solution, with an updated test using a helper function based
> on using sed (which I think is more portable than tail -n1 ?). The
> change actually is very simple. I ran the test suite and it appears to
> be not breaking anyone else since the normal case is
> hdr_termination="\n" except in the cases where it needs to be NUL.
> 
> Thanks for the bug report!

I like both improvements, and both 'make test' and gitweb are happy
with this version. Thanks for the quick fix.

D.


Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Junio C Hamano
Jacob Keller  writes:

> I did some searching, and we do use sed so I replaced it with sed \$!d
> which appears to work. I think we should probably implement a
> test_ends_with_nul or something.

As it is "a stream editor that shall read one or more text files", I
do not think "sed" is any better (or any worse) than "tail -n" from
the portability point of view.  They both may happen to work on GNU
systems.


Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Junio C Hamano
Jacob Keller  writes:

>> I am not sure if that is OK.  I think it is less not-OK than the use
>> case I mentioned in my earlier message, in that this is not a case
>> that "please don't do it" breaks.  It however is an inconvenience
>> that the user has to say "git add file" before the "git commit" (or
>> "git commit file") to conclude the sequence.
>>
>> So I dunno.
>
> Hmmm.. Ya ok I don't think we can actually distinguish between these
> two work flows.

What we might want to have in "git commit " is a new mode
that is totally different from -i/-o that says roughly "Start from
the tree of HEAD, pretend as if you removed all the paths that match
the given pathspec from the tree, and then added all the entries in
the index that match that pathspec.  Write that tree and commit.
Take nothing from the working tree".  I have a feeling that when
people do

$ git add -p file1 file2 file3
$ git commit file2

and ends up including _all_ changes made to file2, not just the ones
they picked in the earlier part of the workflow, they are expecting
such a behaviour.





[PATCH] rev-list: use hdr_termination instead of a always using a newline

2016-10-20 Thread Jacob Keller
From: Jacob Keller 

When adding support for prefixing output of log and other commands using
--line-prefix, commit 660e113ce118 ("graph: add support for
--line-prefix on all graph-aware output", 2016-08-31) accidentally
broke rev-list --header output.

In order to make the output appear with a line-prefix, the flow was
changed to always use the graph subsystem for display. Unfortunately
the graph flow in rev-list did not use info->hdr_termination as it was
assumed that graph output would never need to putput NULs.

Since we now always use the graph code in order to handle the case of
line-prefix, simply replace putchar('\n') with
putchar(info->hdr_termination) which will correct this issue.

Add a test for the --header case to make sure we don't break it in the
future. Implement a helper function test_ends_with_nul() to make it more
obvious what sort of check we are looking for.

Reported-by: Dennis Kaarsemaker 
Signed-off-by: Jacob Keller 
---

Here's my solution, with an updated test using a helper function based
on using sed (which I think is more portable than tail -n1 ?). The
change actually is very simple. I ran the test suite and it appears to
be not breaking anyone else since the normal case is
hdr_termination="\n" except in the cases where it needs to be NUL.

Thanks for the bug report!

 builtin/rev-list.c   |  2 +-
 t/t6000-rev-list-misc.sh | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 8479f6ed28aa..c43decda7011 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -145,7 +145,7 @@ static void show_commit(struct commit *commit, void *data)
 */
if (buf.len && buf.buf[buf.len - 1] == '\n')
graph_show_padding(revs->graph);
-   putchar('\n');
+   putchar(info->hdr_termination);
} else {
/*
 * If the message buffer is empty, just show
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index 3e752ce03280..e8c6979baf59 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -4,6 +4,12 @@ test_description='miscellaneous rev-list tests'
 
 . ./test-lib.sh
 
+test_ends_with_nul() {
+   printf "\0" >nul
+   sed '$!d' "$@" >contents
+   test_cmp_bin nul contents
+}
+
 test_expect_success setup '
echo content1 >wanted_file &&
echo content2 >unwanted_file &&
@@ -100,4 +106,9 @@ test_expect_success '--bisect and --first-parent can not be 
combined' '
test_must_fail git rev-list --bisect --first-parent HEAD
 '
 
+test_expect_success '--header shows a NUL after each commit' '
+   git rev-list --header --max-count=1 HEAD | sed \$!d >actual &&
+   test_ends_with_nul actual
+'
+
 test_done
-- 
2.10.0.560.g867c144



Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Jacob Keller
On Thu, Oct 20, 2016 at 10:39 AM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> I still think we're misunderstanding. I want git commit to complain
>> *only* under the following circumstance:
>>
>> I run "git add -p" and put a partial change into the index in .
>> There are still other parts which were not added to the index yet.
>> Thus, the index version of the file and the actual file differ.
>>
>> Then, I (accidentally) run "git commit "
>
> I agree that this case is different.
>
> Again, users are different, and I also often do
>
> $ edit file; think; decide it is a good enough first cut
> $ git add file
> $ edit file; think; decide it is getting better
> $ git add file
> $ edit file; think; decide it is now perfect
> $ git commit file
>
> Because I do not think you can differentiate the above workflow from
> the case where "git add -p" was used earlier, I think your updated
> "git commit" needs to complain at this point.
>
> I am not sure if that is OK.  I think it is less not-OK than the use
> case I mentioned in my earlier message, in that this is not a case
> that "please don't do it" breaks.  It however is an inconvenience
> that the user has to say "git add file" before the "git commit" (or
> "git commit file") to conclude the sequence.
>
> So I dunno.

Hmmm.. Ya ok I don't think we can actually distinguish between these
two work flows.

Given that I now know how to fix my mistake easily (git reset -p) I
think I will just go ahead and not bother with this as it's much less
of a pain now.

Thanks,
Jake


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 11:05 AM, Junio C Hamano  wrote:
>
> Good to know that you have been managing it; I was mostly worried
> about not having anybody managing it (i.e. imagining Coverity
> nominated/volunteered me as manager with everybody else as viewers)
> and the new viewer requests get totally ignored by the project as
> the whole.

No, I have been paying attention, but in case I suddenly stop contributing
to the git project I thought it's better to have a couple of people there.

>
>> I see that some of these emails may be inconvenient to you, I can
>> change your role to defect viewer/contributor if you prefer.
>
> It is not a huge inconvenience to me, because any piece of e-mail
> that is addressed directly to gitster@ without CC'ing to git@vger
> and is not a follow-up to any earlier message goes to a separate
> lowest-priority mailbox that I rarely look at.  But if it is easy
> to recategorize me, please do so.

done.


Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Jacob Keller
On Thu, Oct 20, 2016 at 11:04 AM, Dennis Kaarsemaker
 wrote:
> On Wed, 2016-10-19 at 15:41 -0700, Junio C Hamano wrote:
>> Dennis Kaarsemaker  writes:
>>
>> > +   touch expect &&
>> > +   printf "\0" > expect &&
>>
>>
>> What's the point of that "touch", especially if you are going to
>> overwrite it immediately after?
>
> Leftover debugging crud. I tried various ways of generating an
> actual/expect to compare.
>
>> > +   git rev-list --header --max-count=1 HEAD | tail -n1 >actual &&
>>
>>
>> As "tail" is a tool for text files, it is likely unportable to use
>> "tail -n1" to grab the "last incomplete line that happens to contain
>> a single NUL".
>>
>> > +   test_cmp_bin expect actual
>> > +'
>
> Yeah, I was fearing that. I didn't find anything in the testsuite that
> helps answering the question "does this file end with a NUL" and would
> appreciate a hint :)
>
> D.

I did some searching, and we do use sed so I replaced it with sed \$!d
which appears to work. I think we should probably implement a
test_ends_with_nul or something.

Thanks,
Jake


[PATCH] commit parsing: replace unchecked parse_commit by parse_commit_or_die

2016-10-20 Thread Stefan Beller
The reason parse_commit() would fail at these points would be because
the repository is corrupt.

This was noticed by coverity.

Signed-off-by: Stefan Beller 
---

 developed on pu as that's where coverity spotted it.
 I have no overview if these areas are being worked on. (It may clash
 with at least jc/merge-base-fp-only)
 
 Thanks,
 Stefan

 builtin/blame.c   | 2 +-
 builtin/describe.c| 4 ++--
 builtin/name-rev.c| 2 +-
 builtin/show-branch.c | 4 ++--
 commit.c  | 2 +-
 fetch-pack.c  | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 992a79c..3b8564c 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1801,7 +1801,7 @@ static void assign_blame(struct scoreboard *sb, int opt)
 * so hold onto it in the meantime.
 */
origin_incref(suspect);
-   parse_commit(commit);
+   parse_commit_or_die(commit);
if (reverse ||
(!(commit->object.flags & UNINTERESTING) &&
 !(revs->max_age != -1 && commit->date < revs->max_age)))
diff --git a/builtin/describe.c b/builtin/describe.c
index 01490a1..8299b16 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -199,7 +199,7 @@ static unsigned long finish_depth_computation(
best->depth++;
while (parents) {
struct commit *p = parents->item;
-   parse_commit(p);
+   parse_commit_or_die(p);
if (!(p->object.flags & SEEN))
commit_list_insert_by_date(p, list);
p->object.flags |= c->object.flags;
@@ -322,7 +322,7 @@ static void describe(const char *arg, int last_one)
}
while (parents) {
struct commit *p = parents->item;
-   parse_commit(p);
+   parse_commit_or_die(p);
if (!(p->object.flags & SEEN))
commit_list_insert_by_date(p, );
p->object.flags |= c->object.flags;
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index cd89d48..92c3316 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -29,7 +29,7 @@ static void name_rev(struct commit *commit,
struct commit_list *parents;
int parent_number = 1;
 
-   parse_commit(commit);
+   parse_commit_or_die(commit);
 
if (commit->date < cutoff)
return;
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f340..fd911b5 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -218,7 +218,7 @@ static void join_revs(struct commit_list **list_p,
parents = parents->next;
if ((this_flag & flags) == flags)
continue;
-   parse_commit(p);
+   parse_commit_or_die(p);
if (mark_seen(p, seen_p) && !still_interesting)
extra--;
p->object.flags |= flags;
@@ -835,7 +835,7 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
if (!commit)
die(_("cannot find commit %s (%s)"),
ref_name[num_rev], oid_to_hex());
-   parse_commit(commit);
+   parse_commit_or_die(commit);
mark_seen(commit, );
 
/* rev#0 uses bit REV_SHIFT, rev#1 uses bit REV_SHIFT+1,
diff --git a/commit.c b/commit.c
index b9c0c81..5b23eaf 100644
--- a/commit.c
+++ b/commit.c
@@ -910,7 +910,7 @@ static void mark_redundant(struct commit **array, int cnt)
ALLOC_ARRAY(filled_index, cnt - 1);
 
for (i = 0; i < cnt; i++)
-   parse_commit(array[i]);
+   parse_commit_or_die(array[i]);
for (i = 0; i < cnt; i++) {
struct commit_list *common;
 
diff --git a/fetch-pack.c b/fetch-pack.c
index cb45c34..8b4ab47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -159,7 +159,7 @@ static const unsigned char *get_rev(void)
return NULL;
 
commit = prio_queue_get(_list);
-   parse_commit(commit);
+   parse_commit_or_die(commit);
parents = commit->parents;
 
commit->object.flags |= POPPED;
-- 
2.10.1.448.g862ec83.dirty



Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Junio C Hamano
Stefan Beller  writes:

> I do it most of the time, but I did not start managing it.
> And I have been pretty lax/liberal about handing out rights to do stuff,
> because I did not want to tip on anyone's toes giving to few rights
> and thereby annoying them.

Good to know that you have been managing it; I was mostly worried
about not having anybody managing it (i.e. imagining Coverity
nominated/volunteered me as manager with everybody else as viewers)
and the new viewer requests get totally ignored by the project as
the whole.

> I see that some of these emails may be inconvenient to you, I can
> change your role to defect viewer/contributor if you prefer.

It is not a huge inconvenience to me, because any piece of e-mail
that is addressed directly to gitster@ without CC'ing to git@vger
and is not a follow-up to any earlier message goes to a separate
lowest-priority mailbox that I rarely look at.  But if it is easy
to recategorize me, please do so.

Thanks.


Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Dennis Kaarsemaker
On Wed, 2016-10-19 at 15:41 -0700, Junio C Hamano wrote:
> Dennis Kaarsemaker  writes:
> 
> > +   touch expect &&
> > +   printf "\0" > expect &&
> 
> 
> What's the point of that "touch", especially if you are going to
> overwrite it immediately after?

Leftover debugging crud. I tried various ways of generating an
actual/expect to compare.

> > +   git rev-list --header --max-count=1 HEAD | tail -n1 >actual &&
> 
> 
> As "tail" is a tool for text files, it is likely unportable to use
> "tail -n1" to grab the "last incomplete line that happens to contain
> a single NUL".
> 
> > +   test_cmp_bin expect actual
> > +'

Yeah, I was fearing that. I didn't find anything in the testsuite that
helps answering the question "does this file end with a NUL" and would
appreciate a hint :)

D.


Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Dennis Kaarsemaker
On Wed, 2016-10-19 at 15:39 -0700, Junio C Hamano wrote:
> Jacob Keller  writes:
> 
> > Hi,
> > 
> > On Wed, Oct 19, 2016 at 2:04 PM, Dennis Kaarsemaker
> >  wrote:
> > > Commit 660e113 (graph: add support for --line-prefix on all graph-aware
> > > output) changed the way commits were shown. Unfortunately this dropped
> > > the NUL between commits in --header mode. Restore the NUL and add a test
> > > for this feature.
> > > 
> > 
> > 
> > Oops! Thanks for the bug fix.
> > 
> > > Signed-off-by: Dennis Kaarsemaker 
> > > ---
> > >  builtin/rev-list.c   | 4 
> > >  t/t6000-rev-list-misc.sh | 7 +++
> > >  2 files changed, 11 insertions(+)
> > > 
> > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > > index 8479f6e..cfa6a7d 100644
> > > --- a/builtin/rev-list.c
> > > +++ b/builtin/rev-list.c
> > > @@ -157,6 +157,10 @@ static void show_commit(struct commit *commit, void 
> > > *data)
> > > if (revs->commit_format == CMIT_FMT_ONELINE)
> > > putchar('\n');
> > > }
> > > +   if (revs->commit_format == CMIT_FMT_RAW) {
> > > +   putchar(info->hdr_termination);
> > > +   }
> > > +
> > 
> > 
> > This seems right to me. My one concern is that we make sure we restore
> > it for every case (in case it needs to be there for other formats?)
> > I'm not entirely sure about whether other non-raw modes need this or
> > not?
> 
> 
> Right.  The original didn't do anything special for CMIT_FMT_RAW,
> and 660e113 did not remove anything special for CMIT_FMT_RAW, so it
> isn't immediately obvious why this patch is sufficient.  
> 
> Dennis, care to elaborate?

The original logic was (best seen with git show -w 660e113):

if(showing graphs) {
 do pretty things
}
else {
 just print the buffer and the header terminator
}

660e113 changed that to

do pretty things

Given that the 'do pretty things part' works for other uses of git rev-
list, it made sense that the \0 should only be added back in
CMIT_FMT_RAW mode. Changing the first putchar('\n') as Jacob proposes
(that mail arrived while I'm typing this) might work too, I haven't
tested it.

D.


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 10:50 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Not sure what triggered the new finding of coverity as seen below as the
>> parse_commit() was not touched. Junios series regarding the merge base
>> optimization touches a bit of code nearby though.
>>
>> Do we want to replace the unchecked places of parse_commit with
>> parse_commit_or_die ?
>
> The reason parse_commit() would fail at this point would be because
> the repository is corrupt, I do not think it would hurt to do such a
> change.
>
> I agree that it is curious why it shows up as a "new defect",
> though.
>
> By the way, do you know who is managing the service on our end
> (e.g. approving new people to be "defect viewer")?

I do it most of the time, but I did not start managing it.
And I have been pretty lax/liberal about handing out rights to do stuff,
because I did not want to tip on anyone's toes giving to few rights
and thereby annoying them.

I see that some of these emails may be inconvenient to you, I can
change your role to defect viewer/contributor if you prefer.

Thanks,
Stefan


Re: Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Junio C Hamano
Stefan Beller  writes:

> Not sure what triggered the new finding of coverity as seen below as the
> parse_commit() was not touched. Junios series regarding the merge base
> optimization touches a bit of code nearby though.
>
> Do we want to replace the unchecked places of parse_commit with
> parse_commit_or_die ?

The reason parse_commit() would fail at this point would be because
the repository is corrupt, I do not think it would hurt to do such a
change.  

I agree that it is curious why it shows up as a "new defect",
though.

By the way, do you know who is managing the service on our end
(e.g. approving new people to be "defect viewer")?  The site seems
to think I have the power to manage others' subscription, which I do
not think I have (I do not go to the site myself).  As it spewed
quite a many false positives into my mailbox in the past, I do not
pay very close attention to these reports these days, but I still
read the e-mailed reports every once in a while.

Thanks.


Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Junio C Hamano
Jacob Keller  writes:

> I still think we're misunderstanding. I want git commit to complain
> *only* under the following circumstance:
>
> I run "git add -p" and put a partial change into the index in .
> There are still other parts which were not added to the index yet.
> Thus, the index version of the file and the actual file differ.
>
> Then, I (accidentally) run "git commit "

I agree that this case is different.

Again, users are different, and I also often do

$ edit file; think; decide it is a good enough first cut
$ git add file
$ edit file; think; decide it is getting better
$ git add file
$ edit file; think; decide it is now perfect
$ git commit file

Because I do not think you can differentiate the above workflow from
the case where "git add -p" was used earlier, I think your updated
"git commit" needs to complain at this point.

I am not sure if that is OK.  I think it is less not-OK than the use
case I mentioned in my earlier message, in that this is not a case
that "please don't do it" breaks.  It however is an inconvenience
that the user has to say "git add file" before the "git commit" (or
"git commit file") to conclude the sequence.

So I dunno.


Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Jacob Keller
On Thu, Oct 20, 2016 at 9:30 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>>> I still think it's worth while to add a check for git-commit which
>>> does something like check when we say "git commit " and if the
>>> index already has those files marked as being changed, compare them
>>> with the current contents of the file as in the checkout and quick
>>> saying "please don't do that" so as to avoid the problem in the first
>>> place.
>> ...
>> I suspect both of those would complain about legitimate workflows.
>>
>> I dunno.  I do not ever use "git commit " myself.
>
> Users are different.  I do use this all the time, and it is not
> unusual at all to have changed contents on paths other than 
> already added to the index when I do so, i.e. an unrelated small
> typofix in  jumping ahead of the real changes I am working on
> in other parts of the tree.
>
> "Please don't do that" would break.  Jacob says "avoid the problem",
> but I do not see a problem in allowing it (it could be that the
> problem Jacob has is in other parts of his workflow, but I do not
> know what it is offhand).

I still think we're misunderstanding. I want git commit to complain
*only* under the following circumstance:

I run "git add -p" and put a partial change into the index in .
There are still other parts which were not added to the index yet.
Thus, the index version of the file and the actual file differ.

Then, I (accidentally) run "git commit "

I want git commit to complain here that the index  and acutal
 being requested are different and it thinks there's an issue.

I do *NOT* want it to complain if I do "git add -p" and put parts of
 into the index, and then run

git commit 

Does that make sense?

Basically if the index and "git commit " both say "add "
but they conflict in what version of  I want it to go "hey..
uhhh.. that's a bad idea"

Thanks,
Jake


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Matthieu Moy
Junio C Hamano  writes:

> Are you proposing to replace the tests written as shell scripts with
> scripts in another language or framework that run equivalent
> sequences of git commands that is as portable as, if not more,
> Bourne shell?

The language (/bin/sh) is probably not the biggest issue. The way we use
it may be.

I don't have benchmark to tell what slows down the testsuite, but for
example we often write

cat >expected 

Re: [PATCH v4 05/14] i18n: add--interactive: mark plural strings

2016-10-20 Thread Junio C Hamano
Vasco Almeida  writes:

> A Seg, 10-10-2016 às 12:54 +, Vasco Almeida escreveu:
>> @@ -70,6 +72,8 @@ Git::I18N - Perl interface to Git's Gettext localizations
>>  
>> printf __("The following error occurred: %s\n"), $error;
>>  
>> +   printf __n("commited %d file", "commited %d files", $files), $files;
>> +
>
> I forgot to add \n to this example as suggested in
> 
>
> What should I do? Should I wait for more reviews and then send a new
> re-roll fixing this?

You fix it up locally not to forget, in case you need a reroll, and
wait for more reviews.  In the meantime, I'll also fix it up locally
not to forget ;-)  That way, if it turns out that this round is good
enough to be the final version, people will see my fixup, and if what
I have needs to be replaced with your new version, your fixup will
be in there.

Thanks.


Fwd: New Defects reported by Coverity Scan for git

2016-10-20 Thread Stefan Beller
Not sure what triggered the new finding of coverity as seen below as the
parse_commit() was not touched. Junios series regarding the merge base
optimization touches a bit of code nearby though.

Do we want to replace the unchecked places of parse_commit with
parse_commit_or_die ?

Thanks,
Stefan
_
*** CID 1374088:  Error handling issues  (CHECKED_RETURN)
/commit.c: 913 in mark_redundant()
907
908 work = xcalloc(cnt, sizeof(*work));
909 redundant = xcalloc(cnt, 1);
910 ALLOC_ARRAY(filled_index, cnt - 1);
911
912 for (i = 0; i < cnt; i++)
>>> CID 1374088:  Error handling issues  (CHECKED_RETURN)
>>> Calling "parse_commit" without checking return value (as is done 
>>> elsewhere 37 out of 45 times).
913 parse_commit(array[i]);
914 for (i = 0; i < cnt; i++) {
915 struct commit_list *common;
916
917 if (redundant[i])
918 continue;


Re: [PATCH 2/2] tag: send fully qualified refnames to verify_tag_and_format

2016-10-20 Thread Santiago Torres
On Wed, Oct 19, 2016 at 04:39:44PM -0400, Jeff King wrote:
> The ref-filter code generally expects to see fully qualified
> refs, so that things like "%(refname)" and "%(refname:short)"
> work as expected. We can do so easily from git-tag, which
> always works with refnames in the refs/tags namespace. As a
> bonus, we can drop the "kind" parameter from
> pretty_print_ref() and just deduce it automatically.
> 
> Unfortunately, things are not so simple for verify-tag,
> which takes an arbitrary sha1 expression. It has no clue if
> a refname as used or not, and whether it was in the
> refs/tags namespace.
> 
> In an ideal world, get_sha1_with_context() would optionally
> tell us about any refs we resolved while it was working, and
> we could just feed that refname (and then in cases where we
> didn't use a ref at all, like a bare sha1, we could fallback
> to just showing the sha1 name the user gave us).
> 
> Signed-off-by: Jeff King 
> ---
> I think you'd really just squash the various bits of this into your
> series at the right spots, though I don't mind it on top, either.
> 
> The big question is to what degree we should care about the verify-tag
> case. I don't think it's any worse off with this change than it is with
> your series (its "kind" becomes "OTHER", but I don't think that is
> actually used for display at all; the name remains the same). I'd be OK
> with leaving it like this, as a known bug, until get_sha1_with_context()
> learns to tell us about the ref. It's an unhandled corner case in a
> brand-new feature, not a regression in an existing one.

I see now, I think I can sprinkle some of these changes on 2/7 then. The
rest should be doing 4/7 and 5/7. Does this sound ok?


signature.asc
Description: PGP signature


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread René Scharfe
Am 20.10.2016 um 13:02 schrieb Duy Nguyen:
> On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin
>  wrote:
>> Hi Junio,
>>
>> I know you are a fan of testing things thoroughly in the test suite, but I
>> have to say that it is getting out of hand, in particular due to our
>> over-use of shell script idioms (which really only run fast on Linux, not
>> a good idea for a portable software).
>>
>> My builds of `pu` now time out, after running for 3h straight in the VM
>> dedicated to perform the daily routine of building and testing the git.git
>> branches in Git for Windows' SDK. For comparison, `next` passes build &
>> tests in 2.6h. That is quite the jump.
> 
> I'm just curious, will running git.exe from WSL [1] help speed things
> up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
> WSL is quite fast.
> 
> I'm pretty sure the test suite would need some adaptation, but if the
> speedup is significant, maybe it's worth spending time on.
> 
> [1] https://news.ycombinator.com/item?id=12748395

I get this on WSL with prove -j8:

Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 
3731.85 csys = 4040.15 CPU)

And this for a run on Debian inside a Hyper-V VM on the same system:

Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr  1.06 sys + 39.70 cusr 
25.82 csys = 71.39 CPU)

All tests pass on master.

René



Re: [PATCH v4 05/14] i18n: add--interactive: mark plural strings

2016-10-20 Thread Vasco Almeida
A Seg, 10-10-2016 às 12:54 +, Vasco Almeida escreveu:
> @@ -70,6 +72,8 @@ Git::I18N - Perl interface to Git's Gettext localizations
>  
> printf __("The following error occurred: %s\n"), $error;
>  
> +   printf __n("commited %d file", "commited %d files", $files), $files;
> +

I forgot to add \n to this example as suggested in


What should I do? Should I wait for more reviews and then send a new
re-roll fixing this?


Re: tools for easily "uncommitting" parts of a patch I just commited?

2016-10-20 Thread Junio C Hamano
Jeff King  writes:

>> I still think it's worth while to add a check for git-commit which
>> does something like check when we say "git commit " and if the
>> index already has those files marked as being changed, compare them
>> with the current contents of the file as in the checkout and quick
>> saying "please don't do that" so as to avoid the problem in the first
>> place.
> ...
> I suspect both of those would complain about legitimate workflows.
>
> I dunno.  I do not ever use "git commit " myself.

Users are different.  I do use this all the time, and it is not
unusual at all to have changed contents on paths other than 
already added to the index when I do so, i.e. an unrelated small
typofix in  jumping ahead of the real changes I am working on
in other parts of the tree.

"Please don't do that" would break.  Jacob says "avoid the problem",
but I do not see a problem in allowing it (it could be that the
problem Jacob has is in other parts of his workflow, but I do not
know what it is offhand).


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Stefan Beller
On Thu, Oct 20, 2016 at 5:31 AM, Jeff King  wrote:

>
> $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | 
> sort -rn | head
>  152271 /home/peff/compile/git/git
>   57340 /home/peff/compile/git/t/../bin-wrappers/git
>   16865 /bin/sed
>   12650 /bin/rm
>   11257 /bin/cat
>9326 /home/peff/compile/git/git-sh-i18n--envsubst
>9079 /usr/bin/diff
>8013 /usr/bin/wc
>5924 /bin/mv
>4566 /bin/grep
>

I am not an expert on perl nor tracing, but is it feasible to find out
how many internal calls there are? i.e. either some shell script (rebase,
submodule) calling git itself a couple of times or even from compile/git/git
itself, e.g. some submodule operations use forking in there.


Re: [regression] `make profile-install` fails in 2.10.1

2016-10-20 Thread Junio C Hamano
Jeff King  writes:

> I usually just try to recreate the actual environment (e.g., run the
> tests as root, run them on a loopback case-insensitive fs, etc) as that
> gives a more realistic recreation.

True.  I just do not want to run the tests as root myself ;-)  I
wonder if fakeroot would give us good enough illusion for that--I
haven't used it for a long while.


Re: [PATCH v4 05/14] i18n: add--interactive: mark plural strings

2016-10-20 Thread Vasco Almeida
A Qua, 19-10-2016 às 11:40 -0700, Junio C Hamano escreveu:
> Vasco Almeida  writes:
> 
> > 
> > @@ -669,12 +669,18 @@ sub status_cmd {
> >  sub say_n_paths {
> >     my $did = shift @_;
> >     my $cnt = scalar @_;
> > -   print "$did ";
> > -   if (1 < $cnt) {
> > -   print "$cnt paths\n";
> > -   }
> > -   else {
> > -   print "one path\n";
> > +   if ($did eq 'added') {
> > +   printf(__n("added %d path\n", "added %d paths\n",
> > +      $cnt), $cnt);
> > +   } elsif ($did eq 'updated') {
> > +   printf(__n("updated %d path\n", "updated %d
> > paths\n",
> > +      $cnt), $cnt);
> > +   } elsif ($did eq 'reverted') {
> > +   printf(__n("reverted %d path\n", "reverted %d
> > paths\n",
> > +      $cnt), $cnt);
> > +   } else {
> > +   printf(__n("touched %d path\n", "touched %d
> > paths\n",
> > +      $cnt), $cnt);
> >     }
> >  }
> 
> Nice to see you covered all verbs currently in use and then
> future-proofed by adding a fallback "touched" here.
> 
> Thanks.
> 

Thanks. Here I added %d to the singular sentences "added %d path\n" to
avoid a Perl warning about a redundant argument in printf.


Re: [PATCH v4 01/14] i18n: add--interactive: mark strings for translation

2016-10-20 Thread Junio C Hamano
Vasco Almeida  writes:

>> Not a big deal, but this makes me wonder if we want to do this
>> instead ...

For future reference (for others as well), when I say "makes me
wonder" or "I wonder", I am never demanding to change what the
original author wrote.  I just am trying to see that pros-and-cons
have been considered already.

> ... Also I think msgfmt checks if English source and translation
> both end with newline or not.

That is a good enough safety belt to me.

> I will leave this patch as is.

Yup.  Thanks.


Re: [PATCH v4 01/14] i18n: add--interactive: mark strings for translation

2016-10-20 Thread Vasco Almeida
A Qua, 19-10-2016 às 11:14 -0700, Junio C Hamano escreveu:
> Vasco Almeida  writes:
> 
> > 
> >     } else {
> > -   print "No untracked files.\n";
> > +   print __("No untracked files.\n");
> >     }
> 
> Not a big deal, but this makes me wonder if we want to do this
> instead
> 
>   print __("No untracked files.") . "\n";
> 
> so that translators do not have to remember to keep the final LF.

This can be a good idea. On the other hand, I think translators are
cautious to not forget the final LF since there is a lot of them from C
source. Also I think msgfmt checks if English source and translation
both end with newline or not. So if a translator forgets to put a \n
then msgfmt would return an error. If it is not the translator to find
the error herself, someone else will, like the Translation coordinator.

I will leave this patch as is.

https://www.gnu.org/software/gettext/FAQ.html#newline


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

Sorry, but I did not know I was resisting, as I didn't see any
proposal to resist against in the first place.  I was trying to help
by mentioning two tricks that may be helping my test runtime that
may help you as well.

Are you proposing to replace the tests written as shell scripts with
scripts in another language or framework that run equivalent
sequences of git commands that is as portable as, if not more,
Bourne shell?  If that is what you are proposing, well, I won't stop
you and I may even help you in there, but I fail to guess what
alternative you have in mind.  I certainly do not have a suggestion
myself and I won't suggest migrate to tclsh or perl for that matter.

If that is not what you are trying to propose, and if parallelism
has already been employed, then there may or may not be other tricks
you are not yet using that helps to speed up your shell execution
that others are using---being confrontational is not an effective
way to ask others about them.


Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Keller, Jacob E
On Wed, 2016-10-19 at 15:39 -0700, Junio C Hamano wrote:
> Jacob Keller  writes:
> 
> > 
> > Hi,
> > 
> > On Wed, Oct 19, 2016 at 2:04 PM, Dennis Kaarsemaker
> >  wrote:
> > > 
> > > Commit 660e113 (graph: add support for --line-prefix on all
> > > graph-aware
> > > output) changed the way commits were shown. Unfortunately this
> > > dropped
> > > the NUL between commits in --header mode. Restore the NUL and add
> > > a test
> > > for this feature.
> > > 
> > 
> > Oops! Thanks for the bug fix.
> > 
> > > 
> > > Signed-off-by: Dennis Kaarsemaker 
> > > ---
> > >  builtin/rev-list.c   | 4 
> > >  t/t6000-rev-list-misc.sh | 7 +++
> > >  2 files changed, 11 insertions(+)
> > > 
> > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > > index 8479f6e..cfa6a7d 100644
> > > --- a/builtin/rev-list.c
> > > +++ b/builtin/rev-list.c
> > > @@ -157,6 +157,10 @@ static void show_commit(struct commit
> > > *commit, void *data)
> > > if (revs->commit_format ==
> > > CMIT_FMT_ONELINE)
> > > putchar('\n');
> > > }
> > > +   if (revs->commit_format == CMIT_FMT_RAW) {
> > > +   putchar(info->hdr_termination);
> > > +   }
> > > +
> > 
> > This seems right to me. My one concern is that we make sure we
> > restore
> > it for every case (in case it needs to be there for other formats?)
> > I'm not entirely sure about whether other non-raw modes need this
> > or
> > not?
> 
> Right.  The original didn't do anything special for CMIT_FMT_RAW,
> and 660e113 did not remove anything special for CMIT_FMT_RAW, so it
> isn't immediately obvious why this patch is sufficient.  
> 
> Dennis, care to elaborate?

I believe all we need to do is change one of the places where we emit
"\n" with emiting info->hdr_termination instead.

I'm looking at the original code now.

Thanks,
Jake

Re: How to rename remote branches if I only have "client access"?

2016-10-20 Thread Junio C Hamano
Manuel Reimer  writes:

> I have the following branches on my remote repo:
>
> new_version
> master
>
> I now want the "new_version" branch to be the "master" and the old
> "master" has to be renamed to the old version number (in my case
> 0.2.0).
>
> How to do this? 

I assume you have "git push" access into this remote repository.

You are correct that there is no way for you to tell "git push" (or
any Git native method) to rename "master" to "maint-0.2.0".

What you can do is to push the commits at the tip of "master" and
"new_version" to "maint-0.2.0" and "master" respectively, and then
delete "new_version".  If the 'master' and 'new_version' you locally
have are already the 'master' and 'new_version' you have over there
that you want to see sit at the tips of updated 'maint-0.2.0' and
'master' branches, then:

git push $remote master:refs/heads/maint-0.2.0 new_version:master
git push $remote :new_version

would do this.

Note that in the : syntax in "git push", the 
side refers to the names in your local repository; if your local
'master' is different from what you want to see at the tip of
'maint-0.2.0' at the remote after this, replace it with whatever
name you give to that commit locally in the above example.

Also note that the this push may not succeed if your new_version is
not a descendant of the current 'master' at the remote.  You'd need
to use +: to force it if that is the case.

The second command that has an empty  is to delete .

Lastly, the remote side can be configured to forbid deletion of
branches, and/or to forbid forced pushes.  If your remote is
configured that way (which is not default), then there is no way for
you to do any of the above (and that is by design---the server
operators use these configuration variables to forbid you from doing
something, so you shouldn't be able to override that choice).

> Currently this causes me much trouble as I can't
> delete the remote "master" repository as this is the "remote current
> repository"...

Sorry, but you lost me here. You were talking about two branches in
a single repository that is remote earlier.  I do not know what this
"remote master repository" you bring up in this paragraph is, and
why you can't remove it.  Not that I want to know the answers to
these questions myself.  I just do not understand these as a reason
behind your wanting to rename branches at a remote repository.


Re: [PATCH] doc: remove reference to the traditional layout in git-tag.txt

2016-10-20 Thread Junio C Hamano
Younes Khoudli  writes:

> This is the only place in the documentation that the traditional layout
> is mentioned, and it is confusing. Remove it.

Yeah, the information is not incorrect per-se, but certainly is out
of place and immaterial to what this part of the documentation tries
to teach.

Will queue; thanks.

>
> * Documentation/git-tag.txt: Here.
>
> Signed-off-by: Younes Khoudli 
> ---
>  Documentation/git-tag.txt | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 7ecca8e..80019c5 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -253,9 +253,8 @@ On Automatic following
>  ~~
>  
>  If you are following somebody else's tree, you are most likely
> -using remote-tracking branches (`refs/heads/origin` in traditional
> -layout, or `refs/remotes/origin/master` in the separate-remote
> -layout).  You usually want the tags from the other end.
> +using remote-tracking branches (eg. `refs/remotes/origin/master`).
> +You usually want the tags from the other end.
>  
>  On the other hand, if you are fetching because you would want a
>  one-shot merge from somebody else, you typically do not want to


How to rename remote branches if I only have "client access"?

2016-10-20 Thread Manuel Reimer

Hello,

I have the following branches on my remote repo:

new_version
master

I now want the "new_version" branch to be the "master" and the old 
"master" has to be renamed to the old version number (in my case 0.2.0).


How to do this? Currently this causes me much trouble as I can't delete 
the remote "master" repository as this is the "remote current repository"...


Thanks in advance

Manuel



[PATCH] doc: remove reference to the traditional layout in git-tag.txt

2016-10-20 Thread Younes Khoudli
This is the only place in the documentation that the traditional layout
is mentioned, and it is confusing. Remove it.

* Documentation/git-tag.txt: Here.

Signed-off-by: Younes Khoudli 
---
 Documentation/git-tag.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 7ecca8e..80019c5 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -253,9 +253,8 @@ On Automatic following
 ~~
 
 If you are following somebody else's tree, you are most likely
-using remote-tracking branches (`refs/heads/origin` in traditional
-layout, or `refs/remotes/origin/master` in the separate-remote
-layout).  You usually want the tags from the other end.
+using remote-tracking branches (eg. `refs/remotes/origin/master`).
+You usually want the tags from the other end.
 
 On the other hand, if you are fetching because you would want a
 one-shot merge from somebody else, you typically do not want to
-- 
2.10.0



Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 12:17:33PM +0200, Johannes Schindelin wrote:

> If you want to know just how harmful this reliance on shell scripting is
> to our goal of keeping Git portable: already moving from Linux to MacOSX
> costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
> for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Wait, shell scripts are slow on MacOS now?

Perhaps, but it seems more likely that one or more of the following is
true:

  - setup of the OS X VM takes longer (it does; if you click-through to
the test results, you'll see that the "make test" step goes from
647s on Linux to 1108s on MacOS. That's much worse, but not even
twice as slow, let alone 3x).

  - Travis Linux and OSX VMs do not have identical hardware. Looking at
https://docs.travis-ci.com/user/ci-environment/, it appears that
Linux containers get twice as many cores.

  - Git performance on Linux may be better than MacOS. The test suite is
very filesystem-heavy because it creates and destroys a lot of files
and repositories. If the kernel vfs performance is worse, it's
likely to show up in the test suite (especially if the issue is
latency and you aren't doing it massively in parallel).

I don't have a real way to measure that, but it seems like a
plausible factor.

So that sucks that the MacOS Travis build takes a half hour to run. But
I don't think that shell scripting is the culprit.

> So the only thing that would really count as an improvement would be to
> change the test suite in such a manner that it relies more on helpers in
> t/helper/ and less on heavy-duty shell scripting.
> 
> Of course, if you continue to resist (because the problem is obviously not
> affecting you personally, so why would you care), I won't even try to find
> the time to start on that project.

I'm not sure what you mean by "resist". The tests suite has been a set
of shell scripts for over a decade. As far as I know there is not
currently a viable alternative. If you have patches that make it faster
without negatively impact the ease of writing tests, I'd be happy to see
them.  If you have more t/helper programs that can eliminate expensive
bits of the shell scripts and speed up the test run, great. If you have
some other proposal entirely, I'd love to hear it.  But I do not see
that there is any proposal to "resist" at this point.

I'm also not entirely convinced that the test suite being a shell script
is the main culprit for its slowness. We run git a lot of times, and
that's inherent in testing it. I ran the whole test suite under
"strace -f -e execve". There are ~335K execs. Here's the breakdown of
the top ones:

$ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | 
sort -rn | head
 152271 /home/peff/compile/git/git
  57340 /home/peff/compile/git/t/../bin-wrappers/git
  16865 /bin/sed
  12650 /bin/rm
  11257 /bin/cat
   9326 /home/peff/compile/git/git-sh-i18n--envsubst
   9079 /usr/bin/diff
   8013 /usr/bin/wc
   5924 /bin/mv
   4566 /bin/grep

Almost half are running git itself. Let's assume that can't be changed.
That leaves ~180K of shell-related overhead (versus the optimal case,
that the entire test suite becomes one monolithic program ;) ).

Close to 1/3 of those processes are just invoking the bin-wrapper
script to set up the EXEC_PATH, etc. I imagine it would not be too hard
to just do that in the test script. In fact, it looks like:

  make prefix=/wherever install
  GIT_TEST_INSTALLED=/wherever/bin make test

might give you an immediate speedup by skipping bin-wrappers entirely.

The rest of it is harder. I think you'd have to move the test suite to a
language like perl that can do more of that as builtins (I'm sure you'd
enjoy the portability implications of _that_).  It would almost be
easier to build a variant of the shell that has sed, rm, cat, and a few
others compiled in.

-Peff

PS I haven't kept up with all of this POSIX-layer stuff that's been
   announced in Windows the past few months. Is it a viable path forward
   that would have better performance (obviously not in the short term,
   but where we may arrive in a few years)?


Re: [PATCH v4 05/25] sequencer: eventually release memory allocated for the option values

2016-10-20 Thread Johannes Schindelin
Hi Junio,

On Tue, 18 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> > To remedy that, we now take custody of the option values in question,
> >> > requiring those values to be malloc()ed or strdup()ed
> >> 
> >> That is the approach this patch takes, so "eventually release" in
> >> the title is no longer accurate, I would think.
> >
> > To the contrary, we now free() things in remove_state(), so we still
> > "eventually release" the memory.
> 
> OK.  We call a change to teach remove_state() to free the resource
> does more commonly as "plug leaks"; the word "eventually" gave me an
> impression that we are emphasizing the fact that we do not free(3)
> immediately but lazily do so at the end, hence my response.

I changed the commit subject, hopefully to your liking,
Dscho


Re: [PATCH v3 14/25] sequencer: introduce a helper to read files written by scripts

2016-10-20 Thread Johannes Schindelin
Hi Junio,

On Tue, 18 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > In the meantime, I'd be happy to just add a comment that this function is
> > intended for oneliners, but that it will also read multi-line files and
> > only strip off the EOL marker from the last line.
> >
> > Would that work for you?
> 
> That would be ideal, I would think.

Done,
Dscho


[REQUEST PULL] git-gui 0.20.0 to 0.21.0

2016-10-20 Thread Pat Thoyts
The following changes since commit 4498b3a50a0e839788682f672df267cbc1ba9292:

  git-gui: set version 0.20 (2015-04-18 12:15:32 +0100)

are available in the git repository at:

  git://repo.or.cz/git-gui.git tags/gitgui-0.21.0

for you to fetch changes up to ccc985126f23ff5d9ac610cb820bca48405ff5ef:

  git-gui: set version 0.21 (2016-10-20 11:19:43 +0100)


git-gui 0.21.0


Alex Riesen (2):
  git-gui: support for $FILENAMES in tool definitions
  git-gui: ensure the file in the diff pane is in the list of selected files

Alexander Shopov (2):
  git-gui i18n: Updated Bulgarian translation (565,0f,0u)
  git-gui: Mark 'All' in remote.tcl for translation

Dimitriy Ryazantcev (1):
  git-gui: Update Russian translation

Elia Pinto (1):
  git-gui/po/glossary/txt-to-pot.sh: use the $( ... ) construct for command 
substitution

Johannes Schindelin (1):
  git-gui: respect commit.gpgsign again

Karsten Blees (2):
  git-gui: unicode file name support on windows
  git-gui: handle the encoding of Git's output correctly

Olaf Hering (1):
  git-gui: sort entries in tclIndex

Orgad Shaneh (1):
  git-gui: Do not reset author details on amend

Pat Thoyts (19):
  Allow keyboard control to work in the staging widgets.
  Amend tab ordering and text widget border and highlighting.
  git-gui: fix detection of Cygwin
  git-gui (Windows): use git-gui.exe in `Create Desktop Shortcut`
  Merge branch 'sy/i18n' into pu
  Merge branch 'js/commit-gpgsign' into pu
  Merge branch 'rs/use-modern-git-merge-syntax' into pu
  Merge branch 'va/i18n' into pu
  Merge branch 'patches' into pu
  Merge branch 'pt/git4win-mods' into pu
  Merge branch 'pt/non-mouse-usage' into pu
  Merge branch 'va/i18n_2' into pu
  git-gui: maintain backwards compatibility for merge syntax
  Merge branch 'dr/ru' into pu
  git-gui: avoid persisting modified author identity
  Merge branch 'kb/unicode' into pu
  Merge branch 'os/preserve-author' into pu
  Merge branch 'as/bulgarian' into pu
  git-gui: set version 0.21

René Scharfe (1):
  git-gui: stop using deprecated merge syntax

Satoshi Yasushima (6):
  git-gui: consistently use the same word for "remote" in Japanese
  git-gui: consistently use the same word for "blame" in Japanese
  git-gui: apply po template to Japanese translation
  git-gui: add Japanese language code
  git-gui: update Japanese translation
  git-gui: update Japanese information

Vasco Almeida (6):
  git-gui i18n: mark strings for translation
  git-gui: l10n: add Portuguese translation
  git-gui i18n: internationalize use of colon punctuation
  git-gui i18n: mark "usage:" strings for translation
  git-gui: fix incorrect use of Tcl append command
  git-gui i18n: mark string in lib/error.tcl for translation

yaras (1):
  git-gui: fix initial git gui message encoding

 GIT-VERSION-GEN  |2 +-
 Makefile |2 +-
 git-gui.sh   |  154 +-
 lib/blame.tcl|2 +-
 lib/branch_checkout.tcl  |2 +-
 lib/branch_create.tcl|2 +-
 lib/branch_delete.tcl|4 +-
 lib/branch_rename.tcl|2 +-
 lib/browser.tcl  |6 +-
 lib/commit.tcl   |   39 +-
 lib/database.tcl |4 +-
 lib/diff.tcl |   14 +-
 lib/error.tcl|8 +-
 lib/index.tcl|   12 +-
 lib/merge.tcl|   18 +-
 lib/option.tcl   |8 +-
 lib/remote.tcl   |8 +-
 lib/remote_add.tcl   |2 +-
 lib/remote_branch_delete.tcl |2 +-
 lib/shortcut.tcl |   17 +-
 lib/themed.tcl   |   87 +-
 lib/tools.tcl|3 +
 lib/tools_dlg.tcl|6 +-
 lib/transport.tcl|2 +-
 po/bg.po | 3543 ++
 po/glossary/pt_pt.po |  293 
 po/glossary/txt-to-pot.sh|4 +-
 po/ja.po | 3077 ++--
 po/pt_pt.po  | 2716 
 po/ru.po |  680 +++-
 30 files changed, 6957 insertions(+), 3762 deletions(-)
 create mode 100644 po/glossary/pt_pt.po
 create mode 100644 po/pt_pt.po


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Jeff King
On Thu, Oct 20, 2016 at 12:50:32PM +0200, Johannes Schindelin wrote:

> That reflects my findings, too. I want to add that I found preciously
> little difference between running slow-to-fast and running in numeric
> order, so I gave up on optimizing on that front.

Interesting. It makes a 10-15% difference here.

I also point "--root" at a ram disk. The tests are very I/O heavy and
sometimes fsync; even on a system with an SSD, this saves another ~10%.

I know that's small potatoes compared to the Windows vs Linux times, but
it might be worth exploring.

> Further, I found that the Subversion tests (which run at the end) are so
> close in their running time that running the tests in parallel with -j5
> does not result in any noticeable improvement when reordered.

I normally don't run the Subversion tests at all. Installing cvs, cvsps,
subversion, and libsvn-perl nearly doubles the runtime of the test suite
for me (I imagine adding p4 to the mix would bump it further). While
it's certainly possible to break them with a change in core git, it
doesn't seem like a good tradeoff if I'm not touching them often.

As the GfW maintainer, you probably should be running them, at least
before a release. But cutting them might be a good way to speed up your
day-to-day runs.

I also use -j16 on a quad-core (+hyperthreads) machine, which I arrived
at experimentally. At least on Linux, it's definitely worth having more
threads than processors, to keep the processors busy.

> I guess I will have to bite into the sour apple and try to profile, say,
> t3404 somehow, including all the shell scripting stuff, to identify where
> exactly all that time is lost. My guess is that it boils down to
> gazillions of calls to programs like expr.exe or merely subshells.

I'm not so sure it isn't gazillions of calls to git. It is testing
rebase, after all, which is itself a shell script. GIT_TRACE_PERFORMANCE
gives sort of a crude measure; it reports only builtins (so it will
underestimate the total time spent in git), but it also doesn't make
clear which programs call which, so some times are double-counted (if a
builtin shells out to another builtin). But:

  $ export GIT_TRACE_PERFORMANCE=/tmp/foo.out
  $ rm /tmp/foo.out
  $ time ./t3404-rebase-interactive.sh
  real0m29.755s
  user0m1.444s
  sys 0m2.268s

  $ perl -lne '
  /performance: ([0-9.]+)/ and $total += $1;
  END { print $total }
' /tmp/foo.out
  32.851352624

Clearly that's not 100% accurate, as it claims we spent longer in git
than the script actually took to run. Given the caveats above, I'm not
even sure if it is in the right ballpark. But there are 11,000 git
builtins run as part of that script. Even at 2ms each, that's still most
of the time going to git.

And obviously the fix involves converting git-rebase, which you're
already working on. But it's not clear to me that the test
infrastructure or shell scripts are the primary cause of the slowness in
this particular case.

-Peff


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Duy Nguyen
On Wed, Oct 19, 2016 at 4:18 PM, Johannes Schindelin
 wrote:
> Hi Junio,
>
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for a portable software).
>
> My builds of `pu` now time out, after running for 3h straight in the VM
> dedicated to perform the daily routine of building and testing the git.git
> branches in Git for Windows' SDK. For comparison, `next` passes build &
> tests in 2.6h. That is quite the jump.

I'm just curious, will running git.exe from WSL [1] help speed things
up a bit (or, hopefully, a lot)? I'm assuming that shell's speed in
WSL is quite fast.

I'm pretty sure the test suite would need some adaptation, but if the
speedup is significant, maybe it's worth spending time on.

[1] https://news.ycombinator.com/item?id=12748395
-- 
Duy


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Johannes Schindelin
Hi peff,

On Wed, 19 Oct 2016, Jeff King wrote:

> On Wed, Oct 19, 2016 at 10:32:12AM -0700, Junio C Hamano wrote:
> 
> > > Maybe we should start optimizing the tests...
> > 
> > Yup, two things that come to mind are to identify long ones and see if
> > each of them can be split into two halves that can be run in parallel,
> > and to go through the tests with fine toothed comb and remove the ones
> > that test exactly the same thing as another test.  The latter would be
> > very time consuming, though.
> 
> FWIW, I have made attempts at "split long ones into two" before, and
> didn't come up with much. There _are_ some tests that are much longer
> than others[1], but they are not longer than the whole suite takes to
> run. So running in slow-to-fast order means they start first, are run in
> parallel with the other tests, and the CPUs stay relatively full through
> the whole run.

That reflects my findings, too. I want to add that I found preciously
little difference between running slow-to-fast and running in numeric
order, so I gave up on optimizing on that front.

> 43.216765165329 t3404-rebase-interactive.sh
> 30.6568658351898 t3421-rebase-topology-linear.sh
> 27.92564702034 t9001-send-email.sh
> 15.5906939506531 t9500-gitweb-standalone-no-errors.sh
> 15.4882569313049 t6030-bisect-porcelain.sh
> 14.487174987793 t7610-mergetool.sh
> 13.8276169300079 t3425-rebase-topology-merges.sh
> 12.7450480461121 t3426-rebase-submodule.sh
> 12.4915001392365 t3415-rebase-autosquash.sh
> 11.8122401237488 t5572-pull-submodule.sh

That looks very similar to what I found: t3404 on top, followed by t3421.

Further, I found that the Subversion tests (which run at the end) are so
close in their running time that running the tests in parallel with -j5
does not result in any noticeable improvement when reordered.

I guess I will have to bite into the sour apple and try to profile, say,
t3404 somehow, including all the shell scripting stuff, to identify where
exactly all that time is lost. My guess is that it boils down to
gazillions of calls to programs like expr.exe or merely subshells.

Ciao,
Dscho


Re: Drastic jump in the time required for the test suite

2016-10-20 Thread Johannes Schindelin
Hi Junio,

On Wed, 19 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > What I can also say for certain is that the version from yesterday (commit
> > 4ef44ce) was the first one in a week that built successfully and hence
> > reached the test phase, and it was the first version of `pu` ever to time
> > out after running for 3h.
> 
> I am sympathetic, but I'd be lying if I said I can feel your pain.

Obviously.

> Admittedly I do not run _all_ the tests (I certainly know that I
> exclude the ones behind EXPENSIVE prerequisite), but after every
> rebuilding of 'jch' and 'pu', I run the testsuite (and also rebuild
> docs) before pushing them out, and "make test && make doc && make
> install install-doc" run sequentially for the four integration
> branches finishes within 15 minutes, even when I run them after
> "make clean".

You have the luxury of a system that runs shell scripts fast.

Or maybe I should put it differently: you set up the test suite in such a
way that it runs fast on your system, exploiting the fact that shell
scripts run fast there.

If you want to know just how harmful this reliance on shell scripting is
to our goal of keeping Git portable: already moving from Linux to MacOSX
costs you roughly 3x as long to run the build & test (~12mins vs ~36mins
for GCC, according to https://travis-ci.org/git/git/builds/159125647).

Again, I have to repeat myself: this is not good.

> Perhaps the recent change to run the tests in parallel from slower
> to faster under prove may be helping my case.

No, you misunderstand. The tests are *already* run in parallel. And
running them from slower to faster would only make a difference if the
last tests were not requiring roughly the same time to run.

Also, I cannot use prove(1) because it proved to be unreliable in my setup
(it does hang from time to time, for no good reason whatsoever, which
would only make my situation worse, of course).

> > Maybe we should start optimizing the tests...
> 
> Yup, two things that come to mind are to identify long ones and see
> if each of them can be split into two halves that can be run in
> parallel, and to go through the tests with fine toothed comb and
> remove the ones that test exactly the same thing as another test.
> The latter would be very time consuming, though.

Again, you misunderstand.

The problem is not whether or not to run the tests in parallel.

The problem is that our tests take an insanely long time to run. That is a
big problem, it is no good, tests are only useful if they are cheap enough
to run all the time.

So the only thing that would really count as an improvement would be to
change the test suite in such a manner that it relies more on helpers in
t/helper/ and less on heavy-duty shell scripting.

Of course, if you continue to resist (because the problem is obviously not
affecting you personally, so why would you care), I won't even try to find
the time to start on that project.

Ciao,
Dscho

P.S.: After increasing the time-out to 240 minutes, the test suite still
times out. I investigated a little bit and it appears that
t6030-bisect-porcelain.sh now hangs in the `git bisect next` call of its
"2 - bisect starts with only one bad" test case. This is specific to
Windows, I cannot reproduce that problem on Linux, and I will keep you
posted on my progress.


  1   2   >