Re: [PATCH] docs: submitting-patches: improve the base commit explanation

2023-11-15 Thread Borislav Petkov
On Wed, Nov 15, 2023 at 09:49:31AM -0800, Kees Cook wrote:
> On Wed, Nov 15, 2023 at 06:03:30PM +0100, Borislav Petkov wrote:
> > From: "Borislav Petkov (AMD)" 
> > 
> > After receiving a second patchset this week without knowing which tree
> > it applies on and trying to apply it on the obvious ones and failing,
> > make sure the base tree information which needs to be supplied in the
> > 0th message of the patchset is spelled out more explicitly.
> > 
> > Also, make the formulations stronger as this really is a requirement and
> > not only a useful thing anymore.
> > 
> > Signed-off-by: Borislav Petkov (AMD) 
> 
> Yup, I wonder if making "--base=auto" a default in git might be a good
> idea too?

Not a bad idea. And if not needed, one can simply ignore it when reading
the cover letter.

CCing the git ML for comment/opinions.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-23 Thread Pratyush Yadav
On 22/10/19 09:46AM, Bert Wesarg wrote:
> On Mon, Oct 21, 2019 at 9:35 PM Johannes Sixt  wrote:
> >
> > Am 21.10.19 um 11:16 schrieb Bert Wesarg:
> > > Dear Pratyush,
> > >
> > > I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> > > the stage-list. But I think it should be disabled, like the 'Revert
> > > Hunk' and 'Revert Line' menu entry.
> > >
> > > Can you confirm this?
> >
> > Technically, it need not be disabled because the hunk being reverted
> > does not depend on the contents of any of diffs that can be shown.
> >
> > The entry should be disabled if reverting the stored hunk fails. But to
> > know that, it would have to be tried: the file could have been edited
> > since the hunk was generated so that the reversal of the hunk fails.
> 
> But the "Undo" changes the worktree not the stage, sure it indirectly
> also changes the view of the staged content, but that is only a

I don't think the "Undo Last Revert" should affect "staged content" in 
any way. In fact, if it does, it is probably a bug.

A more detailed reply is to your other email. I just wanted to clarify 
that an undo _should not_ affect the staged content.

> secondary effect. As I only can revert in the worktree list, I think
> we should be consistent and also only allow to undo the revert in the
> worktree list.
> 
> And I think it is independent of 'does the undo apply at all' question.

-- 
Regards,
Pratyush Yadav


Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-23 Thread Pratyush Yadav
On 22/10/19 10:17AM, Bert Wesarg wrote:
> On Mon, Oct 21, 2019 at 9:04 PM Pratyush Yadav  wrote:
> >
> > On 21/10/19 11:16AM, Bert Wesarg wrote:
> > > Dear Pratyush,
> > >
> > > I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> > > the stage-list. But I think it should be disabled, like the 'Revert
> > > Hunk' and 'Revert Line' menu entry.
> >
> > I'm not sure what you mean. There is no "Revert Last Hunk" menu entry (I
> > assume you are talking about the context menu in the diff view that we
> > open by right clicking).
> >
> > My guess is that you mean the "Undo Last Revert" option. And you want it
> > disabled if the diff shown is of a staged file, correct?
> >
> > I'm not sure if disabling it would be a good idea.
> >
> > Say I revert a hunk or line while the file is not staged, and stage the
> > rest of the file. This would mean that file is no longer in the
> > "Unstaged Changes" widget. Now if I choose the file from the "Staged
> > Changes" widget, I get the option to undo the last revert. If I hit
> > that, it will put whatever I reverted in the "Unstaged Changes" widget.
> > So, now part of the file that was reverted is in "Unstaged Changes", and
> > the rest in "Unstaged Changes".
> >
> 
> Sorry for this confusion.
> 
> > IIUC, this is what you think should not happen, correct? What's wrong
> > with allowing the user to undo reverts from anywhere?
> 
> The 'Undo' changes the worktree not the staged content,.
> 
> >
> > The way I see it, it doesn't really matter what file is selected or
> > whether it is staged or not, the action of the undo remains the same, so
> > it makes sense to me to allow it from anywhere.
> 
> It would make sense to undo the revert on the staged content, if there
> are no more changes to this file in the worktree. I.e., you wont be
> able to apply the 'undo' anymore to the worktree file if it is not
> listed anymore. Though even that case should be able to implement.
> Though the undo is currently not bound to a specific path. This may be

I did it this way because I thought it would be best to go for the 
simplest implementation first, and then re-evaluate if needed. And 
nobody objected in the reviews. Maybe I didn't do a good job of 
clarifying the exact behaviour in the commit message.

I'm not opposed to something like a per-path undo though. But I'm not 
sure if that will cause any confusion as to what is being undone. Or 
maybe the current version is more confusing. I can't really say since I 
am the one who implemented it, so I know exactly what it does.

> the cause of our different view. I though the undo is bound to the
> path it was recorded, thus also would only be available when showing a
> diff on this path again. Therefore I think, having the 'Undo Last
> Revert' in the context menu may not be the best place to begin with,
> or at least indicate that this 'undo' operation does not necceseritly
> operate on the file currently shown.

Where else would you put it if not the context menu? There is the option 
of putting it in the menu bar at the top under the "Edit" menu entry, 
but I think that is too hidden. The original idea of the feature was 
that you could quickly undo an accidental revert. Hiding this in the 
menu bar would hurt discoverability, and some people might not realize 
they could have undone the revert.

Is there a better place to put it that I'm missing?

-- 
Regards,
Pratyush Yadav


Re: Git Test Coverage Report (October 11)

2019-10-23 Thread Derrick Stolee
On 10/23/2019 1:00 PM, Torsten Bögershausen wrote:
> On Fri, Oct 11, 2019 at 09:33:11AM -0400, Derrick Stolee wrote:
>> Here is today's test coverage report. The usual report format is
>> available online [1], [2]. The report listed below is a new format
>> that groups lines by the commit that introduced them [3]. Thanks
>> Peff for the feedback on that idea.
>>
> 
> []
>>
>> Torsten Bögershausen ebb8d2c9 mingw: support UNC in git clone 
>> file://server/share/repo
>> connect.c
>> ebb8d2c9 921) path = host - 2; /* include the leading "//" */
>>
> 
> I actually looked into this one, and my understanding is that the code path
> makes only sense for windows and is only tested on Windows in t5500.
> (Linux/Unix/POSIX don't use UNC path names starting with "//" )
> 
> How can we avoid those "not covered by test" warnings?
> 
> One solution could be to use
> 
> #ifndef has_dos_drive_prefix
> #define has_dos_drive_prefix(a) 0
> #endif
> 
> in git-compat-util.h and hope that the compiler is smart enough
> to optimize away that line of code.
> 
> Another way could be to have #ifdefs in connect.c, so that it
> is clear "this is Windows only".
> 
> Or make a comment for the "cover report" saying "not covered".
> 
> Are there any good or better thoughts on this ?

One way to avoid this is to add ignored lines to the test-coverage
repo [1]. These only work if the exact contents match on a specific
line number, but can be a way to stop noise in the short-term.

For example, I added a few lines to ignore in commit-graph.c [2],
but I haven't added ignored lines in a while.

I'm happy to take a PR including the lines you want to ignore, or
I could take inventory of the lines in the current report before regenerating
a test for -rc1.

Thanks,
-Stolee

[1] https://github.com/derrickstolee/git-test-coverage

[2] 
https://github.com/derrickstolee/git-test-coverage/blob/master/ignored/commit-graph.c


Re: [PATCH 2/5] t4108: remove git command upstream of pipe

2019-10-23 Thread Denton Liu
On Wed, Oct 23, 2019 at 09:32:26AM -0400, Eric Sunshine wrote:
> On Wed, Oct 23, 2019 at 8:04 AM Denton Liu  wrote:
> > Before, the output of `git diff HEAD` would always be piped to
> > sanitize_conflicted_diff(). However, since the Git command was upstream
> > of the pipe, in case the Git command fails, the return code would be
> > lost. Rewrite into separate statements so that the return code is no
> > longer lost.
> >
> > Since only the command `git diff HEAD` was being piped to
> > sanitize_conflicted_diff(), move the command into the function and rename
> > it to print_sanitized_diff().
> >
> > Signed-off-by: Denton Liu 
> > ---
> > diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> > @@ -4,11 +4,12 @@ test_description='git apply --3way'
> > -sanitize_conflicted_diff () {
> > +print_sanitized_diff () {
> > +   git diff HEAD >diff.raw &&
> > sed -e '
> > /^index /d
> > s/^\(+[<>][<>][<>][<>]*\) .*/\1/
> > -   '
> > +   ' diff.raw
> >  }
> 
> Nit: By hard-coding "HEAD" in this function, you lose the flexibility
> of the original. An alternative would have been to accept the ref
> against which to diff as an argument to this function:
> 
> print_sanitized_diff () {
> git diff "$@" >diff.raw &&
>     ...
> 
> Or, better yet, keep the original design and pass the diff in as the
> shell function's input, so a caller would say:
> 
> git diff HEAD >diff.raw &&
> sanitize_conflicted_diff expect.diff &&
> 
> However, not necessarily worth a re-roll if we never expect anyone to
> pass anything other than "HEAD".

Since it doesn't really make sense to commmit conflicts, I decided to
hardcode it to be a diff against HEAD and the worktree since that's the
only sensible place where the conflict should live.

Speaking of conflicts, I dropped the "conflicted" part of the old
function name. I think that removes a lot of clarity so I'll reroll
renaming the function to print_sanitized_conflicted_diff() or something
like that.


Re: Git Test Coverage Report (October 11)

2019-10-23 Thread Torsten Bögershausen
On Fri, Oct 11, 2019 at 09:33:11AM -0400, Derrick Stolee wrote:
> Here is today's test coverage report. The usual report format is
> available online [1], [2]. The report listed below is a new format
> that groups lines by the commit that introduced them [3]. Thanks
> Peff for the feedback on that idea.
>

[]
>
> Torsten Bögershausen  ebb8d2c9 mingw: support UNC in git clone 
> file://server/share/repo
> connect.c
> ebb8d2c9 921) path = host - 2; /* include the leading "//" */
>

I actually looked into this one, and my understanding is that the code path
makes only sense for windows and is only tested on Windows in t5500.
(Linux/Unix/POSIX don't use UNC path names starting with "//" )

How can we avoid those "not covered by test" warnings?

One solution could be to use

#ifndef has_dos_drive_prefix
#define has_dos_drive_prefix(a) 0
#endif

in git-compat-util.h and hope that the compiler is smart enough
to optimize away that line of code.

Another way could be to have #ifdefs in connect.c, so that it
is clear "this is Windows only".

Or make a comment for the "cover report" saying "not covered".

Are there any good or better thoughts on this ?






Re: [PATCH v3 1/1] ci(osx): use new location of the `perforce` cask

2019-10-23 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 07:05:04PM +0900, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" 
> writes:
> 
> > From: Johannes Schindelin 
> >
> > The Azure Pipelines builds are failing for macOS due to a change in the
> > location of the perforce cask. The command outputs the following error:
> >
> > + brew install caskroom/cask/perforce
> > Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
> > ...
> > In any case, as the error message at the top of this commit message
> > shows, 'brew install caskroom/cask/perforce' has stopped working
> > recently, but 'brew cask install perforce' still does, so let's use
> > that.
> 
> It appears that OSX jobs at Travis are getting hit by this issue.
> Here is what our failed build ends with, for example:
> 
> +brew install caskroom/cask/perforce
> Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
> 
> cf. https://travis-ci.org/git/git/jobs/601697815
> 
> Today's 'pu' has this topic queued, and it seems to help even the
> builds at Travis ('pu' seems to fail the test for totally different
> reason, though):
> 
> +brew link gcc@8
> Error: No such keg: /usr/local/Cellar/gcc@8
> 
> cf. https://travis-ci.org/git/git/jobs/601697903

Yeah, that's a new one, so we don't get too bored.  We'll need the
patch below as well:

 --- >8 ---

Subject: [PATCH] ci: fix GCC install in the GCC OSX job

A few days ago Travis CI updated their existing OSX images, including
the Homebrew database in the xcode10.1 OSX image that we use.  Since
then installing dependencies in the 'osx-gcc' job fails when it tries
to link gcc@8:

  + brew link gcc@8
  Error: No such keg: /usr/local/Cellar/gcc@8

Apparently 'brew link gcc' works, so let's do that then, and fall back
to linking gcc@8 if it doesn't.

Signed-off-by: SZEDER Gábor 
---
 ci/install-dependencies.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index ce149ed39c..4e64a19112 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -48,6 +48,7 @@ osx-clang|osx-gcc)
brew install caskroom/cask/perforce
case "$jobname" in
osx-gcc)
+   brew link gcc ||
brew link gcc@8
;;
esac
-- 
2.24.0.rc0.502.g7008375535



Re: [PATCH 5/5] path.c: don't call the match function without value in trie_find()

2019-10-23 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 01:01:00PM +0900, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> >>   - b9317d55a3 added two new keys to the trie: 'logs/refs/rewritten'
> >> and 'logs/refs/worktree', next to the already existing
> >> 'logs/refs/bisect'.  This resulted in a trie node with the path
> >> 'logs/refs', which didn't exist before, and which doesn't have a
> >
> > Oops, I missed the trailing slash, that must be 'logs/refs/'!
> >
> >> value attached.  A query for 'logs/refs/' finds this node and then
> >> hits that one callsite of the match function which doesn't check
> >> for the value's existence, and thus invokes the match function
> >> with NULL as value.
> 
> Given that the trie is maintained by hand in common_list[], I wonder
> if we can mechanically catch errors like the one b9317d55a3 added,
> by perhaps having a self-test function that a t/helper/ program
> calls to perform consistency check after the "git" gets built.

I'm not sure what you mean by "consistency check".  The resulting trie
looked as expected both before and after b9317d55a3, i.e. each trie
node had the right contents, value, and children, as far as I could
tell.  The issue was in the lookup function.



Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-23 Thread Prarit Bhargava



On 10/23/19 1:02 AM, Jeff King wrote:
> On Tue, Oct 22, 2019 at 07:28:47PM -0400, Prarit Bhargava wrote:
> 
>> In many projects the number of contributors is low enough that users know
>> each other and the full email address doesn't need to be displayed.
>> Displaying only the author's username saves a lot of columns on the screen.
>> For example displaying "prarit" instead of "pra...@redhat.com" saves 11
>> columns.
>>
>> Add a "%aU"|"%au" option that outputs the author's email username.
> 
> Like others, this seems potentially useful even if I probably wouldn't
> use it myself. Another more complicated way to think of it would be to
> give a list of domains to omit (so if 90% of the committers are
> @redhat.com, we can skip that, but the one-off contributor from another
> domain gets their fully qualified name.>
> But that's a lot more complicated. I don't mind doing the easy thing
> now, and even if we later grew the more complicated thing, I wouldn't be
> sad to still have this easy one as an option.

FWIW, I went through the exact same thought process as you did and came to the
same conclusion.

> 
>> --- a/pretty.c
>> +++ b/pretty.c
>> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, 
>> char part,
>>  strbuf_add(sb, mail, maillen);
>>  return placeholder_len;
>>  }
>> +if (part == 'u' || part == 'U') {   /* username */
>> +maillen = strstr(s.mail_begin, "@") - s.mail_begin;
>> +strbuf_add(sb, mail, maillen);
>> +return placeholder_len;
>> +}
> 
> What happens if the email doesn't have an "@"? I think you'd either end
> up printing a bunch of extra cruft (because you're not limiting the
> search for "@" to the boundaries from split_ident_line) or you'd
> crash (if there's no "@" at all, and you'd get a huge maillen).
> 
> There's also no need to use the slower strstr() when looking for a
> single character. So perhaps:
> 
>   const char *at = memchr(mail, '@', maillen);
>   if (at)
>   maillen = at - mail;
>   strbuf_add(sb, mail, maillen);

TBH I had assumed that the email address was RFC2822 compliant.  Thanks for the
code.  I've incorporated it into v2.

> 
>> +test_expect_success 'log pretty %an %ae %au' '
> 
> As others noted, this could cover %aU, too (which is broken; you need to
> handle 'U' alongside 'E' and 'N' earlier in format_person_part()).

Whups.  Thanks for the pointer.  Also fixed in v2.

> 
>> +git checkout -b anaeau &&
>> +test_commit anaeau_test anaeau_test_file &&
>> +git log --pretty="%an" > actual &&
>> +git log --pretty="%ae" >> actual &&
>> +git log --pretty="%au" >> actual &&
> 
> Maybe:
> 
>   git log --pretty="%an %ae %au"
> 
> or
> 
>   git log --pretty="%an%n%ae%n%au"
> 
> which is shorter and runs more efficiently?
> 
>> +git log > full &&
>> +name=$(cat full | grep "^Author: " | awk -F "Author: " " { print \$2 } 
>> " | awk -F " <" " { print \$1 } ") &&
>> +email=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | 
>> awk -F ">" " { print \$1 } ") &&
>> +username=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | 
>> awk -F ">" " { print \$1 } " | awk -F "@" " { print \$1 } " ) &&
>> +echo "${name}" > expect &&
>> +echo "${email}" >> expect &&
>> +echo "${username}" >> expect &&
> 
> These values come from our hard-coded test setup, so it would be more
> readable to just expect those:
> 
>   {
>   echo "$GIT_AUTHOR_NAME" &&
>   echo "$GIT_AUTHOR_EMAIL" &&
>   echo "$GIT_AUTHOR_EMAIL" | sed "s/@.*//"
>   } >expect

Added to v2 (along with Brian's suggestion to test %aE, %aN, and %aL).

> 
> For the last one, also I wouldn't be upset to see test-lib.sh do
> something like:
> 
>   TEST_AUTHOR_USERNAME=author
>   TEST_AUTHOR_DOMAIN=example.com
>   GIT_AUTHOR_NAME=$TEST_AUTHOR_USERNAME@$TEST_AUTHOR_DOMAIN

I like this suggestion a lot.  I'm incorporating into v2 as well as similar
changes for the COMMITTER fields.

P.

> 
> to let tests like this pick out the individual values if they want.
> 
> -Peff
> 



Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-23 Thread Prarit Bhargava



On 10/22/19 7:48 PM, brian m. carlson wrote:
> On 2019-10-22 at 23:28:47, Prarit Bhargava wrote:
>> In many projects the number of contributors is low enough that users know
>> each other and the full email address doesn't need to be displayed.
>> Displaying only the author's username saves a lot of columns on the screen.
>> For example displaying "prarit" instead of "pra...@redhat.com" saves 11
>> columns.
>>
>> Add a "%aU"|"%au" option that outputs the author's email username.
> 
> I have no position on whether or not this is a useful change.  I don't
> think I'll end up using it, but I can imagine cases where it is useful,
> such as certain corporate environments.  It would be interesting to see
> what others think.
> 
>> diff --git a/Documentation/pretty-formats.txt 
>> b/Documentation/pretty-formats.txt
>> index b87e2e83e6d0..479a15a8ab12 100644
>> --- a/Documentation/pretty-formats.txt
>> +++ b/Documentation/pretty-formats.txt
>> @@ -163,6 +163,9 @@ The placeholders are:
>>  '%ae':: author email
>>  '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1]
>>  or linkgit:git-blame[1])
>> +'%au':: author username
>> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1]
>> +or linkgit:git-blame[1])
> 
> I think we need to actually document what "username" means here.  I
> expect you'll have people think that this magically means their
> $HOSTING_PLATFORM username, which it is not.
> 

Based on Junio's input, I'm going to change the option to "al" and "aL" where L
means "local-part" as defined by the rfc2822.txt specification, and include a
comment that says "(the portion of the email address preceding the '@' symbol)".
 Admittedly that doesn't convey the meaning of the mailbox concept of the email
address it does tell a user what is going to be output.


>> diff --git a/pretty.c b/pretty.c
>> index b32f0369531c..2a5b93022050 100644
>> --- a/pretty.c
>> +++ b/pretty.c
>> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, 
>> char part,
>>  strbuf_add(sb, mail, maillen);
>>  return placeholder_len;
>>  }
>> +if (part == 'u' || part == 'U') {   /* username */
>> +maillen = strstr(s.mail_begin, "@") - s.mail_begin;
>> +strbuf_add(sb, mail, maillen);
>> +return placeholder_len;
>> +}
> 
> This branch doesn't appear to do anything different for the mailmap and
> non-mailmap cases.  Perhaps adding an additional test that demonstrates
> the difference would be a good idea.
> 

Thanks for the suggestion.  I'll look into it for v2.

P.



Re: [PATCH v2 2/2] commit-graph: fix writing first commit-graph during fetch

2019-10-23 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 01:01:35PM +, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee 
> 
> The previous commit includes a failing test for an issue around
> fetch.writeCommitGraph and fetching in a repo with a submodule. Here, we
> fix that bug and set the test to "test_expect_success".
> 
> The prolem arises with this set of commands when the remote repo at

s/prolem/problem/

>  has a submodule. Note that --recurse-submodules is not needed to
> demonstrate the bug.
> 
>   $ git clone  test
>   $ cd test
>   $ git -c fetch.writeCommitGraph=true fetch origin
>   Computing commit graph generation numbers: 100% (12/12), done.
>   BUG: commit-graph.c:886: missing parent  for commit 
>   Aborted (core dumped)
> 
> As an initial fix, I converted the code in builtin/fetch.c that calls
> write_commit_graph_reachable() to instead launch a "git commit-graph
> write --reachable --split" process. That code worked, but is not how we
> want the feature to work long-term.
> 
> That test did demonstrate that the issue must be something to do with
> internal state of the 'git fetch' process.
> 
> The write_commit_graph() method in commit-graph.c ensures the commits we
> plan to write are "closed under reachability" using close_reachable().
> This method walks from the input commits, and uses the UNINTERESTING
> flag to mark which commits have already been visited. This allows the
> walk to take O(N) time, where N is the number of commits, instead of
> O(P) time, where P is the number of paths. (The number of paths can be
> exponential in the number of commits.)
> 
> However, the UNINTERESTING flag is used in lots of places in the
> codebase. This flag usually means some barrier to stop a commit walk,
> such as in revision-walking to compare histories. It is not often
> cleared after the walk completes because the starting points of those
> walks do not have the UNINTERESTING flag, and clear_commit_marks() would
> stop immediately.
> 
> This is happening during a 'git fetch' call with a remote. The fetch
> negotiation is comparing the remote refs with the local refs and marking
> some commits as UNINTERESTING.
> 
> You may ask: did this feature ever work at all? Yes, it did, as long as
> you had a commit-graph covering all of your local refs. My testing was
> unfortunately limited to this scenario. The UNINTERESTING commits are
> always part of the "old" commit-graph, and when we add new commits to a
> top layer of the commit-graph chain those are not needed. If we happen
> to merge layers of the chain, then the commits are added as a list, not
> using a commit walk. Further, the test added for this config option in
> t5510-fetch.sh uses local filesystem clones, which somehow avoids this
> logic.

Does this last sentence still holds, given that a submodule plays a
crucial role in triggering this bug?  I think it either doesn't, or
I still don't completely understand the situation.

> I tested running clear_commit_marks_many() to clear the UNINTERESTING
> flag inside close_reachable(), but the tips did not have the flag, so
> that did nothing.
> 
> It turns out that the calculate_changed_submodule_paths() method is at
> fault. Thanks, Peff, for pointing out this detail! More specifically,
> for each submodule, the collect_changed_submodules() runs a revision
> walk to essentially do file-history on the list of submodules. That
> revision walk marks commits UNININTERESTING if they are simiplified away

s/simiplified/simplified/

> by not changing the submodule.
> 
> Instead, I finally arrived on the conclusion that I should use a flag
> that is not used in any other part of the code. In commit-reach.c, a
> number of flags were defined for commit walk algorithms. The REACHABLE
> flag seemed like it made the most sense, and it seems it was not
> actually used in the file. The REACHABLE flag was used in early versions
> of commit-reach.c, but was removed by 4fbcca4 (commit-reach: make
> can_all_from_reach... linear, 2018-07-20).
> 
> Add the REACHABLE flag to commit-graph.c and use it instead of
> UNINTERESTING in close_reachable(). This fixes the bug in manual
> testing.

I'm inclined to agree that using a flag that is not used anywhere else
is the safest thing to do, and at -rcX time safest is good.  I'm not
sure whether it's the right thing to do in the long term, though.

Furthermore, calling this flag REACHABLE is misleading, because the
code actually means SEEN.
Consider the following sequence of commands:

  # Create a pack with two commits
  $ git commit --allow-empty -m one &&
  $ git commit --allow-empty -m two &&
  $ git repack -ad &&
  # Make one of those commits unreachable
  $ git reset --hard HEAD^ &&
  # Not even from reflogs!
  $ git reflog expire --expire-unreachable=now --all
  # Now write a commit-graph from that pack file
  $ git commit-graph write
  Computing commit graph generation numbers: 100% (2/2), done.

It added two commits to the commit-graph, although one

Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-23 Thread Prarit Bhargava



On 10/22/19 7:46 PM, Junio C Hamano wrote:
> Prarit Bhargava  writes:
> 
>> Subject: Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's 
>> username
> 
> Downcase "Add" (see "git shortlog --no-merges -100 master" and
> mimick the project convention).

I'll fix that.

> 
>> Add a "%aU"|"%au" option that outputs the author's email username.
> 
> Even though I personally do not see the use for it, I agree it would
> make sense to have an option to show the local part only where the
> e-mail address is shown.  
> 
> I do not know if u/U is a good mnemonic; it hints too strongly that
> it may come from GIT_{AUTHOR/COMMITTER}_NAME but that is not what
> you are doing---isn't there a letter that better conveys that this
> is about RFC 2822 local-part (cf. page 16 ieft.org/rfc/rfc2822.txt)?

I'll go with "l" and "L" for local-part as defined on page 16.  I will also
include a comment in braces that says (the portion of the email address
preceding the '@' symbol)".  Admittedly that doesn't convey the meaning of the
mailbox concept of the email address it does tell a user what is going to be 
output.

> 
>> diff --git a/Documentation/pretty-formats.txt 
>> b/Documentation/pretty-formats.txt
>> index b87e2e83e6d0..479a15a8ab12 100644
>> --- a/Documentation/pretty-formats.txt
>> +++ b/Documentation/pretty-formats.txt
>> @@ -163,6 +163,9 @@ The placeholders are:
>>  '%ae':: author email
>>  '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1]
>>  or linkgit:git-blame[1])
>> +'%au':: author username
>> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1]
>> +or linkgit:git-blame[1])
>>  '%ad':: author date (format respects --date= option)
>>  '%aD':: author date, RFC2822 style
>>  '%ar':: author date, relative
> 
>> diff --git a/pretty.c b/pretty.c
>> index b32f0369531c..2a5b93022050 100644
>> --- a/pretty.c
>> +++ b/pretty.c
>> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, 
>> char part,
>>  strbuf_add(sb, mail, maillen);
>>  return placeholder_len;
>>  }
>> +if (part == 'u' || part == 'U') {   /* username */
>> +maillen = strstr(s.mail_begin, "@") - s.mail_begin;
>> +strbuf_add(sb, mail, maillen);
>> +return placeholder_len;
>> +}
> 
> I think users get %eu and %eU for free with this change, which should
> be documented.
> 

Did you mean %cu and %cU?  Or am I missing something with "%e"?

P.



Re: [PATCH v2 1/2] t5510-fetch.sh: demonstrate fetch.writeCommitGraph bug

2019-10-23 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 01:01:34PM +, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee 
> 
> While dogfooding, Johannes found a bug in the fetch.writeCommitGraph
> config behavior. His example initially happened during a clone with
> --recurse-submodules, we found that this happens with the first fetch
> after cloning a repository that contains a submodule:
> 
>   $ git clone  test
>   $ cd test
>   $ git -c fetch.writeCommitGraph=true fetch origin
>   Computing commit graph generation numbers: 100% (12/12), done.
>   BUG: commit-graph.c:886: missing parent  for commit 
>   Aborted (core dumped)
> 
> In the repo I had cloned, there were really 60 commits to scan, but
> only 12 were in the list to write when calling
> compute_generation_numbers(). A commit in the list expects to see a
> parent, but that parent is not in the list.
> 
> A follow-up will fix the bug, but first we create a test that
> demonstrates the problem.
> 
> I used "test_expect_failure" for the entire test instead of
> "test_must_fail" only on the command that I expect to fail. This is
> because the BUG() returns an exit code so test_must_fail complains.

I don't think this paragraph is necessary; using 'test_expect_failure'
is the way to demonstrate a known breakage.

'test_must_fail' should only be used when the failure is the desired
behavior of a git command.  (I used the word "desired" here, because
you just used the word "expect" above in the sense that "I expect it
to fail, because I know it's buggy, and I want to demonstrate that
bug")

> Helped-by: Jeff King 
> Helped-by: Johannes Schindelin 
> Helped-by: Szeder Gábor 
> Signed-off-by: Derrick Stolee 
> ---
>  t/t5510-fetch.sh | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> index ecabbe1616..e8ae3af0b6 100755
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -583,6 +583,23 @@ test_expect_success 'fetch.writeCommitGraph' '
>   )
>  '
>  
> +test_expect_failure 'fetch.writeCommitGraph with submodules' '
> + pwd="$(pwd)" &&
> + git clone dups super &&
> + (
> + cd super &&
> + git submodule add "file://$pwd/three" &&

Nits: couldn't the URL simply be file://$TRASH_DIRECTORY/three ?

> + git commit -m "add submodule"
> + ) &&
> + git clone "super" writeError &&

There is a write error now, because we have a bug, but after the next
patch the bug will be fixed and we won't have a write error anymore.

> + (
> + cd writeError &&
> + test_path_is_missing 
> .git/objects/info/commit-graphs/commit-graph-chain &&
> + git -c fetch.writeCommitGraph=true fetch origin &&
> + test_path_is_file 
> .git/objects/info/commit-graphs/commit-graph-chain
> + )
> +'
> +
>  # configured prune tests
>  
>  set_config_tristate () {
> -- 
> gitgitgadget
> 


Re: [PATCH 08/12] t5520: use test_cmp_rev where possible

2019-10-23 Thread Eric Sunshine
On Fri, Oct 18, 2019 at 2:52 PM Denton Liu  wrote:
> On Thu, Oct 17, 2019 at 07:41:44PM -0400, Eric Sunshine wrote:
> > On Thu, Oct 17, 2019 at 7:17 PM Denton Liu  wrote:
> > > -   test "$COPY" = "$(git rev-parse --verify me/copy)" &&
> > > +   test_cmp_rev "$COPY" me/copy &&
> >
> > This transformation doesn't look correct. COPY already holds the
> > result of a git-rev-parse invocation:
> >
> > COPY="$(git rev-parse --verify me/copy)" &&
> >
> > so passing it to test_cmp_rev() -- which applies its own git-rev-parse
> > invocation -- doesn't make sense.
>
> So after grokking the test case, it seems like the the transformation is
> indeed correct. Maybe we can replace the last line with
>
> test_cmp_rev copy me/copy
>
> but I think I'll leave it unless you have any strong opinions.

For some reason, I had it in my mind that test_cmp_rev() was primarily
meant for comparing _named_ revisions, but of course there is nothing
about the function which even suggests that that is its intended
use-case. In retrospect, using it to compare an OID against a named
revision is a sensible use-case too, so I withdraw the objection.


Re: [PATCH 4/5] t4108: demonstrate bug in apply

2019-10-23 Thread Eric Sunshine
On Wed, Oct 23, 2019 at 8:04 AM Denton Liu  wrote:
> Currently, apply does not respect the merge.conflictStyle setting.
> Demonstrate this by making the 'apply with --3way' test case generic and
> extending it to show that the configuration of
> merge.conflictStyle = diff3 causes a breakage.
>
> Change print_sanitized_diff() to also sanitize `|||` conflict markers.
>
> Signed-off-by: Denton Liu 
> ---
> diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> @@ -46,28 +46,42 @@ test_expect_success 'apply without --3way' '
> +test_apply_with_3way () {
> +   status="$1" &&
> +   shift &&
> +   description="$1" &&
> +   shift &&
> +   preamble="$1" &&
> +   shift &&
> +
> +   test_expect_$status "apply with --3way ($description)" "
> +   $preamble &&
> +
> +   # Merging side should be similar to applying this patch
> +   git diff ...side >P.diff &&
> +   [...]
> +   "
> +}
>
> +test_apply_with_3way success default true
> +test_apply_with_3way failure 'merge.conflictStyle = diff3' 'test_config 
> merge.conflictStyle diff3'

This isn't the prettiest way to achieve this; the "success"/"failure"
arguments, especially, are somewhat ugly. A perhaps more palatable
alternative which gives you the same benefit of re-using the common
code but avoids the ugliness while still allowing customization (now
and in the future):

test_apply_with_3way () {
# Merging side should be similar to applying this patch
git diff ...side >P.diff &&
[...]
}

test_expect_success 'apply with --3way (default)' '
test_apply_with_3way
'

test_expect_failure 'apply with --3way (merge.conflictStyle = diff3)' '
test_config merge.conflictStyle diff3 &&
test_apply_with_3way
'


Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-23 Thread Jerome Pouiller
On Wednesday 23 October 2019 13:09:51 CEST Denton Liu wrote:
> On Wed, Oct 23, 2019 at 10:15:15AM +, Jerome Pouiller wrote:
> > On Wednesday 23 October 2019 10:55:03 CEST Denton Liu wrote:
> > > I am currently have a WIP patchset that will print the location of the
> > > failed patch file (.git/rebase-apply/patch) in the case of a failure as
> > > well as the line number. Will this be sufficient for your purposes?
> >
> > It would be a clear improvement (the perfection would be to be able to
> > use mergetool with git-am :) ).
> 
> You should be able to do that with the --3way (-3) flag.

AFAIK, it only works if index is found in local repository, which is
rarely the case :(

-- 
Jérôme Pouiller



Re: [PATCH v5 13/17] read-tree: show progress by default

2019-10-23 Thread Derrick Stolee
On 10/22/2019 11:48 PM, Junio C Hamano wrote:
> Derrick Stolee  writes:
> 
>>> I'm slightly wary of changing the output of plumbing commands
>>> like this. If a script wants progress output it can already get
>>> it by passing --verbose. With this change a script that does not
>>> want that output now has to pass --no-verbose.
>>
>> If a script is calling this, then won't stderr not be a terminal window, and
>> isatty(2) return 0?
> 
> Unless the script tries to capture the error output and react
> differently depending on the error message from the plumbing (which
> is not localized), iow most of the time, standard error stream is
> left unredirected and likely to be connected to the terminal if the
> script is driven from a terminal command line.
> 
>> Or, if the script is run with stderr passing through to
>> a terminal, then the user would see progress while running the script, which
>> seems like a side-effect but not one that will cause a broken script.
> 
> It will show unwanted output to the end users, no?  That is the
> complaint about having to pass --no-verbose, if I understand
> correctly, if the script does not want to show the progress output.

I'm happy to have attempted the change and start this discussion. It
sounds like this one patch could be ejected to no loss to the full
series.

Thanks,
-Stolee


Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-23 Thread Denton Liu
On Wed, Oct 23, 2019 at 10:15:15AM +, Jerome Pouiller wrote:
> On Wednesday 23 October 2019 10:55:03 CEST Denton Liu wrote:
> > I am currently have a WIP patchset that will print the location of the
> > failed patch file (.git/rebase-apply/patch) in the case of a failure as
> > well as the line number. Will this be sufficient for your purposes?
> 
> It would be a clear improvement (the perfection would be to be able to
> use mergetool with git-am :) ).

You should be able to do that with the --3way (-3) flag.

> 
> Thank you,
> 
> -- 
> Jérôme Pouiller
> 


Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-23 Thread Jerome Pouiller
On Wednesday 23 October 2019 10:55:03 CEST Denton Liu wrote:
> On Wed, Oct 23, 2019 at 08:49:41AM +, Jerome Pouiller wrote:
> > On Wednesday 23 October 2019 04:24:58 CEST Junio C Hamano wrote:
> > > Jerome Pouiller  writes:
> > > > I try to use "git am" to apply a patch sent using "git send-email". This
> > > > patch does not apply properly. I try to use "git am 
> > > > --show-current-patch"
> > > > to understand the problem. However, since original mail is encoded in 
> > > > quoted-
> > > > printable, data returned by --show-current-patch is not a valid patch.
> > >
> > > I agree that --show-current-patch is a misdesigned feature.  We'd be
> > > doing a better service to our users if we documented that the patch
> > > and log message are found at .git/rebase-apply/{patch,msg} instead
> > > of trying to hide the path.
> > >
> > > Unfortunately, it is likely that those who added that feature have
> > > built their tooling around it to depend on its output being the full
> > > e-mail message "am" was fed (and split by "git mailsplit").  So I do
> > > not think we will be changing the output to the patch file only.
> > >
> > > But even then, the documentation can be fixed without any backward
> > > compatibility issues.  Perhaps like this?
> > >
> > >  Documentation/git-am.txt | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
> > > index 6f6c34b0f4..f63b70325c 100644
> > > --- a/Documentation/git-am.txt
> > > +++ b/Documentation/git-am.txt
> > > @@ -172,7 +172,7 @@ default.   You can use `--no-utf8` to override this.
> > > untouched.
> > >
> > >  --show-current-patch::
> > > -   Show the patch being applied when "git am" is stopped because
> > > +   Show the entire e-mail message "git am" has stopped at, because
> > > of conflicts.
> >
> > I agree with you: I think that manpage and/or output of "git am" should
> > mention ".git/rebase-apply/patch" (that is exactly what I was looking
> > for).
> 
> I am currently have a WIP patchset that will print the location of the
> failed patch file (.git/rebase-apply/patch) in the case of a failure as
> well as the line number. Will this be sufficient for your purposes?

It would be a clear improvement (the perfection would be to be able to
use mergetool with git-am :) ).

Thank you,

-- 
Jérôme Pouiller



Re: [RFC/WIP] range-diff: show old/new blob OIDs in comments

2019-10-23 Thread Eric Wong
Eric Wong  wrote:
> Johannes Schindelin  wrote:
> > I guess your patch won't hurt.
> 
> Cool, will update tests and resend.

Turns out the t3206 "trivial reordering" case is way too noisy.
I might need to change diff.c to support this case; will update
in a few days or week.


Re: [PATCH v3 1/1] ci(osx): use new location of the `perforce` cask

2019-10-23 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> The Azure Pipelines builds are failing for macOS due to a change in the
> location of the perforce cask. The command outputs the following error:
>
> + brew install caskroom/cask/perforce
> Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
> ...
> In any case, as the error message at the top of this commit message
> shows, 'brew install caskroom/cask/perforce' has stopped working
> recently, but 'brew cask install perforce' still does, so let's use
> that.

It appears that OSX jobs at Travis are getting hit by this issue.
Here is what our failed build ends with, for example:

+brew install caskroom/cask/perforce
Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.

cf. https://travis-ci.org/git/git/jobs/601697815

Today's 'pu' has this topic queued, and it seems to help even the
builds at Travis ('pu' seems to fail the test for totally different
reason, though):

+brew link gcc@8
Error: No such keg: /usr/local/Cellar/gcc@8

cf. https://travis-ci.org/git/git/jobs/601697903

Thanks.


Re: Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0

2019-10-23 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 07:22:12AM +, Aleksey Mikhaylov wrote:
> "Could not access submodule" error when pulling recursively with Git 2.22.0.
> This issue causes if there is submodule in submodule.

> Please use my simple test repository to reproduce the problem:
> https://github.com/agmikhailov/example2361.git
> 
> It is just an empty repository with one submodule 
> (https://github.com/agmikhailov/example2361M1.git) 
> and one submodule of submodule 
> (https://github.com/agmikhailov/example2361M2.git):
> 
> git clone https://github.com/agmikhailov/example2361.git
> cd example2361/
> git submodule init

According to the docs of 'git submodule init', it "initializes the
submodules recorded in the index".  Therefore, it can only initialize
the submodule in 'example2361M1', because at this point we have no
idea that there is a nested submodule in there.

  $ git submodule init
  Submodule 'example2361M1' (https://github.com/agmikhailov/example2361M1.git) 
registered for path 'example2361M1'

> git submodule update

This command clones 'example2361M1':

  $ git submodule update --recursive
  Cloning into '/tmp/example2361/example2361M1'...
  Submodule path 'example2361M1': checked out 
'6a9be24a1c0ebd44d91ae4dcf1fd62580b936540'

Only at this point can we finally see that there is a nested
submodule, and can initialize and clone it with:

  $ cd example2361M1
  $ git submodule init
  Submodule 'example2361M2' (https://github.com/agmikhailov/example2361M2.git) 
registered for path 'example2361M2'
  $ git submodule update
  Cloning into '/tmp/example2361/example2361M1/example2361M2'...
  Submodule path 'example2361M2': checked out 
'9ed39cf1fe0a8cf34e72d2e7ebff1ea9d4a63ac1'

> git pull --recurse-submodules=true

And after that:

  $ cd ../..
  $ git pull --recurse-submodules=true
  Fetching submodule example2361M1
  Fetching submodule example2361M1/example2361M2
  Already up to date.


> ACTUAL RESULT
> 
> "git --recurse-submodules=true" command fails with message "Could not access 
> submodule":
> 
> $ git --recurse-submodules=true
> Fetching submodule example2361M1
> Could not access submodule 'example2361M2'
> 
> EXPECTED RESULT
> 
> All submodules are successfully updated by "git --recurse-submodules=true" 
> command.
> 
> ADDITIONAL INFORMATION
> 
> Git version 2.20.1 does not have this problem. 
> So we had to downgrade Git to work with submodules.

The behavior was indeed different with v2.20.1, but that version
didn't show the behavior you expected.  When running your commands
with v2.20.1 I get:

  $ ~/src/git/bin-wrappers/git pull --recurse-submodules=true
  Fetching submodule example2361M1
  Already up to date.
  $ find example2361M1/example2361M2/
  example2361M1/example2361M2/

So while that 'git pull' didn't error out, it didn't even look at the
nested submodule, which is still uninitialized and empty.

FWIW, bisecting shows that the behavior changed with commit
a62387b3fc, but, unfortunately, the commit message doesn't seem to be
very helpful to me, but perhaps others with more experience with
submodules can make something out of it.

commit a62387b3fc9f5aeeb04a2db278121d33a9caafa7
Author: Stefan Beller 
Date:   Wed Nov 28 16:27:55 2018 -0800

submodule.c: fetch in submodules git directory instead of in worktree

Keep the properties introduced in 10f5c52656 (submodule: avoid
auto-discovery in prepare_submodule_repo_env(), 2016-09-01), by fixating
the git directory of the submodule.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 

 submodule.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)




Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-23 Thread Denton Liu
Hi Jerome,

On Wed, Oct 23, 2019 at 08:49:41AM +, Jerome Pouiller wrote:
> On Wednesday 23 October 2019 04:24:58 CEST Junio C Hamano wrote:
> > Jerome Pouiller  writes:
> > > I try to use "git am" to apply a patch sent using "git send-email". This
> > > patch does not apply properly. I try to use "git am --show-current-patch"
> > > to understand the problem. However, since original mail is encoded in 
> > > quoted-
> > > printable, data returned by --show-current-patch is not a valid patch.
> > 
> > I agree that --show-current-patch is a misdesigned feature.  We'd be
> > doing a better service to our users if we documented that the patch
> > and log message are found at .git/rebase-apply/{patch,msg} instead
> > of trying to hide the path.
> > 
> > Unfortunately, it is likely that those who added that feature have
> > built their tooling around it to depend on its output being the full
> > e-mail message "am" was fed (and split by "git mailsplit").  So I do
> > not think we will be changing the output to the patch file only.
> > 
> > But even then, the documentation can be fixed without any backward
> > compatibility issues.  Perhaps like this?
> > 
> >  Documentation/git-am.txt | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
> > index 6f6c34b0f4..f63b70325c 100644
> > --- a/Documentation/git-am.txt
> > +++ b/Documentation/git-am.txt
> > @@ -172,7 +172,7 @@ default.   You can use `--no-utf8` to override this.
> > untouched.
> > 
> >  --show-current-patch::
> > -   Show the patch being applied when "git am" is stopped because
> > +   Show the entire e-mail message "git am" has stopped at, because
> > of conflicts.
> 
> I agree with you: I think that manpage and/or output of "git am" should
> mention ".git/rebase-apply/patch" (that is exactly what I was looking
> for).

I am currently have a WIP patchset that will print the location of the
failed patch file (.git/rebase-apply/patch) in the case of a failure as
well as the line number. Will this be sufficient for your purposes?

Thanks,

Denton

> 
> Maybe documentation of --show-current-patch should be clarified with a
> note like "This option is mainly for internal purpose. If you want to
> get current patch, rely on .git/rebase-apply/patch".
> 
> 
> -- 
> Jérôme Pouiller
> 


Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-23 Thread Jerome Pouiller
On Wednesday 23 October 2019 04:24:58 CEST Junio C Hamano wrote:
> Jerome Pouiller  writes:
> > I try to use "git am" to apply a patch sent using "git send-email". This
> > patch does not apply properly. I try to use "git am --show-current-patch"
> > to understand the problem. However, since original mail is encoded in 
> > quoted-
> > printable, data returned by --show-current-patch is not a valid patch.
> 
> I agree that --show-current-patch is a misdesigned feature.  We'd be
> doing a better service to our users if we documented that the patch
> and log message are found at .git/rebase-apply/{patch,msg} instead
> of trying to hide the path.
> 
> Unfortunately, it is likely that those who added that feature have
> built their tooling around it to depend on its output being the full
> e-mail message "am" was fed (and split by "git mailsplit").  So I do
> not think we will be changing the output to the patch file only.
> 
> But even then, the documentation can be fixed without any backward
> compatibility issues.  Perhaps like this?
> 
>  Documentation/git-am.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
> index 6f6c34b0f4..f63b70325c 100644
> --- a/Documentation/git-am.txt
> +++ b/Documentation/git-am.txt
> @@ -172,7 +172,7 @@ default.   You can use `--no-utf8` to override this.
> untouched.
> 
>  --show-current-patch::
> -   Show the patch being applied when "git am" is stopped because
> +   Show the entire e-mail message "git am" has stopped at, because
> of conflicts.

I agree with you: I think that manpage and/or output of "git am" should
mention ".git/rebase-apply/patch" (that is exactly what I was looking
for).

Maybe documentation of --show-current-patch should be clarified with a
note like "This option is mainly for internal purpose. If you want to
get current patch, rely on .git/rebase-apply/patch".


-- 
Jérôme Pouiller



Re: [PATCH 1/1] documentation: remove empty doc files

2019-10-23 Thread Heba Waly
On Wed, Oct 23, 2019 at 12:52 PM Junio C Hamano  wrote:
>
> Emily Shaffer  writes:
>
> > As for the content of this change, I absolutely approve. I've stumbled
> > across some of these empty docs while looking for answers before and
> > found it really demoralizing - the community is so interested in
> > teaching me how to contribute that they've sat on a TODO for 12 years?
> > :( I even held up api-grep.txt as a (bad) example in a talk I gave this
> > year. I'm happy to see these files go.
>
> I'd approve this move, too, especially if we accompanied deletion
> with addition (or verification of existence) of necessary docs
> elsewhere (perhaps in *.h headers).
Good point, although not all corresponding header files are
documented, but I'll include that in the commit message.

Thanks


Re: [PATCH 1/1] documentation: remove empty doc files

2019-10-23 Thread Heba Waly
On Wed, Oct 23, 2019 at 10:05 AM Emily Shaffer  wrote:
>
> On Tue, Oct 22, 2019 at 06:19:35PM +, Heba Waly via GitGitGadget wrote:
> > From: Heba Waly 
> >
> > Remove empty and redundant documentation files from the
> > Documentation/technical/ directory.
> >
> > As part of moving the documentation from Documentation/technical/api-* to
> > header files, the following files are deleted because they include only
> > TODO messages with no documentation to be moved:
> > Documentation/technical/api-grep.txt
> > Documentation/technical/api-object-access.txt
> > Documentation/technical/api-quote.txt
> > Documentation/technical/api-xdiff-interface.txt
>
> Same thing as I mentioned in your other review; what you've added to
> your commit message now doesn't say anything you didn't say with the
> diff. I can see that you removed empty documentation files; I can see
> that those files include only TODO.
>
> Maybe you can explain why it's a bad developer experience to stumble
> across these, and that those files sat untouched for years in the
> TODO(contributor-name) state.
you're right!
> >
> > Signed-off-by: Heba Waly 
> > ---
> >  Documentation/technical/api-grep.txt|  8 
> >  Documentation/technical/api-object-access.txt   | 15 ---
> >  Documentation/technical/api-quote.txt   | 10 --
> >  Documentation/technical/api-xdiff-interface.txt |  7 ---
> >  4 files changed, 40 deletions(-)
> >  delete mode 100644 Documentation/technical/api-grep.txt
> >  delete mode 100644 Documentation/technical/api-object-access.txt
> >  delete mode 100644 Documentation/technical/api-quote.txt
> >  delete mode 100644 Documentation/technical/api-xdiff-interface.txt
>
> As for the content of this change, I absolutely approve. I've stumbled
> across some of these empty docs while looking for answers before and
> found it really demoralizing - the community is so interested in
> teaching me how to contribute that they've sat on a TODO for 12 years?
> :( I even held up api-grep.txt as a (bad) example in a talk I gave this
> year. I'm happy to see these files go.
>
>  - Emily


Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 07:28:47PM -0400, Prarit Bhargava wrote:

> In many projects the number of contributors is low enough that users know
> each other and the full email address doesn't need to be displayed.
> Displaying only the author's username saves a lot of columns on the screen.
> For example displaying "prarit" instead of "pra...@redhat.com" saves 11
> columns.
> 
> Add a "%aU"|"%au" option that outputs the author's email username.

Like others, this seems potentially useful even if I probably wouldn't
use it myself. Another more complicated way to think of it would be to
give a list of domains to omit (so if 90% of the committers are
@redhat.com, we can skip that, but the one-off contributor from another
domain gets their fully qualified name.

But that's a lot more complicated. I don't mind doing the easy thing
now, and even if we later grew the more complicated thing, I wouldn't be
sad to still have this easy one as an option.

> --- a/pretty.c
> +++ b/pretty.c
> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char 
> part,
>   strbuf_add(sb, mail, maillen);
>   return placeholder_len;
>   }
> + if (part == 'u' || part == 'U') {   /* username */
> + maillen = strstr(s.mail_begin, "@") - s.mail_begin;
> + strbuf_add(sb, mail, maillen);
> + return placeholder_len;
> + }

What happens if the email doesn't have an "@"? I think you'd either end
up printing a bunch of extra cruft (because you're not limiting the
search for "@" to the boundaries from split_ident_line) or you'd
crash (if there's no "@" at all, and you'd get a huge maillen).

There's also no need to use the slower strstr() when looking for a
single character. So perhaps:

  const char *at = memchr(mail, '@', maillen);
  if (at)
  maillen = at - mail;
  strbuf_add(sb, mail, maillen);

> +test_expect_success 'log pretty %an %ae %au' '

As others noted, this could cover %aU, too (which is broken; you need to
handle 'U' alongside 'E' and 'N' earlier in format_person_part()).

> + git checkout -b anaeau &&
> + test_commit anaeau_test anaeau_test_file &&
> + git log --pretty="%an" > actual &&
> + git log --pretty="%ae" >> actual &&
> + git log --pretty="%au" >> actual &&

Maybe:

  git log --pretty="%an %ae %au"

or

  git log --pretty="%an%n%ae%n%au"

which is shorter and runs more efficiently?

> + git log > full &&
> + name=$(cat full | grep "^Author: " | awk -F "Author: " " { print \$2 } 
> " | awk -F " <" " { print \$1 } ") &&
> + email=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | 
> awk -F ">" " { print \$1 } ") &&
> + username=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | 
> awk -F ">" " { print \$1 } " | awk -F "@" " { print \$1 } " ) &&
> + echo "${name}" > expect &&
> + echo "${email}" >> expect &&
> + echo "${username}" >> expect &&

These values come from our hard-coded test setup, so it would be more
readable to just expect those:

  {
echo "$GIT_AUTHOR_NAME" &&
echo "$GIT_AUTHOR_EMAIL" &&
echo "$GIT_AUTHOR_EMAIL" | sed "s/@.*//"
  } >expect

For the last one, also I wouldn't be upset to see test-lib.sh do
something like:

  TEST_AUTHOR_USERNAME=author
  TEST_AUTHOR_DOMAIN=example.com
  GIT_AUTHOR_NAME=$TEST_AUTHOR_USERNAME@$TEST_AUTHOR_DOMAIN

to let tests like this pick out the individual values if they want.

-Peff


Re: [PATCH v2 1/1] config: move documentation to config.h

2019-10-22 Thread Heba Waly
On Wed, Oct 23, 2019 at 9:59 AM Emily Shaffer  wrote:
>
> On Tue, Oct 22, 2019 at 07:05:06AM +, Heba Waly via GitGitGadget wrote:
> > From: Heba Waly 
> >
> > Move the documentation from Documentation/technical/api-config.txt into
> > config.h
>
> This is still a little thin for what we usually want from commit
> messages. Try to imagine that five years from now, you find this commit
> by running `git blame` on config.h and then examining the commit which
> introduced all these comments with `git show ` - what would
> you want to know?
>
> Typically we want to know "why" the change was made, because the diff
> shows "what". We can see from the diff that you're moving comments from
> A to B, but if you explain why you did so (not "because my Outreachy
> mentor told me to" ;) but "because it is useful to see usage information
> next to code" or "this is best practice as described by blah blah") - I
> wouldn't be able to know that reasoning just from looking at your diff.
Ok, got it.

>
> > diff --git a/config.h b/config.h
> > index f0ed464004..02f78ffc2b 100644
> > --- a/config.h
> > +++ b/config.h
> > @@ -4,6 +4,23 @@
> >  #include "hashmap.h"
> >  #include "string-list.h"
> >
> > +
> > +/**
> > + * The config API gives callers a way to access Git configuration files
> > + * (and files which have the same syntax). See linkgit:git-config[1] for a
>
> Ah, here's another place where the Asciidoc link isn't going to do
> anything anymore.
yep!
> Otherwise I didn't still see anything jumping out. When the commit
> message is cleaned up I'm ready to add my Reviewed-by line.
great!

>  - Emily

Thanks :)


Re: [Git Developer Blog] [PATCH] post: a tour of git's object types

2019-10-22 Thread Junio C Hamano
Emily Shaffer  writes:

> +# Commit
> +
> +This is the one we're all familiar with - commits are those things we write 
> at
> +1am, angry at a pesky bug, and label with something like "really fix it this
> +time", right?
> +
> +A commit references exactly one tree. That's the root directory of your 
> project.
> +It also references zero or more other commits - and this is where we diverge
> +from the filesystem parallel, because the other commits it references are its
> +parent(s), each of which has its own copy of the project at that commit's 
> point
> +in time. (Commits with more than one parent are merge commits; otherwise, 
> your
> +commit only has the one parent.)

I do not see a need for (parentheses) around the last sentence, but if
you must, then s/in time. /in time/; and s/one parent.)/one parent)./
would be better.

> +Commits represent a specific state of the repository which someone thought 
> was
> +worth saving - a new feature, or a small step of progress which you don't 
> want
> +to lose as you're hacking your way through that homework assignment. Each 
> commit
> +points to a complete image of the repository - but that's not as bad as it
> +sounds, as we'll see when we try it out.

That's half of a commit (i.e. the tree that represents the specific
state).  

The other half is that a commit (at least in a serious-enough
project) is a statement by its author: I considered all the parent
commits, and declare that the tree this commit records suits my goal
better than any and all of these parents' trees.

That is what makes 3-way merge work correctly at the philosophical
level.  As long as the project participants share the same goal and
trust each other, when one creates a merge, one trusts what the
others built in the side branch (i.e. each of the commits they made
got us closer to our collective goals) and take their changes to
where one did not touch.

Of course, a good description in the log message helps the one
who makes such a merge to see if the workmade on the side branch
moves the tree in the direction that truly fits one's goal.

> +# Tag
> +
> +Tags are a little lackluster after all the exciting types up until now. They 
> are
> +essentially a label; they serve as an entry point into the graph and point to
> +either another tag or a commit.

Either say "generally point to", or "point to another object"
(i.e. a tag that points to a tree or a blob is normal---it is just
they do not so frequently appear).

> They're generally used to mark releases, and you
> +can see them pretty easily with `git log --oneline --decorate=full`.
> +
> +# A quick return to an overloaded word
> +
> +"Tree", "worktree", and "working tree" seem to refer to different concepts.

Not just seem to.  They do refer to different things.  "tree" is a
type of object.  "working tree" is a directory hiearchy where you
did "git checkout" to materialize the contents of a tree object
(recursively) and are using to work towards updating the index to
create the next commit.  "worktree" is a mechanism that allows you
to have multiple "working tree"s that is backed by the same repository.

They may share the same word "tree".  You may want to update this
document to say "tree object" when you mean it---that would help
disambiguating it from other uses of words with "tree" in them.

> ...
> +predictable way. Let's walk through creating a pretty basic repository and
> +examining it with some low-level plumbing commands!
> +
> +# An empty repo
> +
> +For starters, we'll make a new, shiny, totally empty repo.
> +
> +{% highlight shell_session %}
> +$ mkdir demo
> +$ cd demo
> +$ git init
> +{% endhighlight %}
> +
> +We've got nothing. If we try `git log`, we'll be assured that we have no
> +commits, and if we try `git branch -a` we'll see we have no branches, either.
> +So let's make a very simple first commit.
> +
> +# A single commit
> +
> +{% highlight shell_session %}
> +$ echo "abcd" >foo.txt
> +$ git add foo.txt
> +$ git commit -m "first commit"
> +{% endhighlight %}
> +
> +I know this is boring, but bear with me and run `git ls-tree HEAD`.

You may probably want to say that, even though you upfront raised
the expectation of readers that they would hear about plumbing soon,
you haven't so far used any plumbing yet.  And stress that ls-tree
is a plumbing, what the readers have been waiting for!

> +{% highlight shell_session %}
> +$ git ls-tree HEAD
> +100644 blob acbe86c7c89586e0912a0a851bacf309c595c308 foo.txt
> +$ git cat-file -p acbe86c7c89586e0912a0a851bacf309c595c308
> +abcd
> +{% endhighlight %}
> +
> +While we're here, we can also take a look at the commit object. Use `git log`
> +to determine your commit's OID, then use `git cat-file -p` to print the
> +contents:

I doubt that "cat-file -p" is helpful to a reader who is learning
the basic object layer.  For non-tree types, learning "cat-file -t"
followed by "cat-file " would be more useful to gain proper
understanding (and for trees, as you showed above, ls-tr

Re: [RFC/WIP] range-diff: show old/new blob OIDs in comments

2019-10-22 Thread Junio C Hamano
Eric Wong  writes:

> What Konstantin said about git repos being transient.
> It wasn't too much work to recreate those blobs from
> scratch since git-apply has done it since 2005.

;-)

> We could get around transient repos with automatic mirroring
> bots which never deletes or overwrites anything published.
> That includes preserving pre-force-push data in case of
> force pushes.
>
>> Instead, we will have to rely on your centralized, non-distributed
>> service...
>
> I'm curious how you came to believe that, since that's the
> opposite of what public-inbox has always been intended to be.

I think the (mis)perception comes from the fact that the website and
the newsfeed you give are both too easy to use and directly attract
end users, instead of enticing them to keep their own mirrors for
offline use.

Thanks for injecting dose of sanity.


Re: [PATCH 5/5] path.c: don't call the match function without value in trie_find()

2019-10-22 Thread Junio C Hamano
SZEDER Gábor  writes:

>>   - b9317d55a3 added two new keys to the trie: 'logs/refs/rewritten'
>> and 'logs/refs/worktree', next to the already existing
>> 'logs/refs/bisect'.  This resulted in a trie node with the path
>> 'logs/refs', which didn't exist before, and which doesn't have a
>
> Oops, I missed the trailing slash, that must be 'logs/refs/'!
>
>> value attached.  A query for 'logs/refs/' finds this node and then
>> hits that one callsite of the match function which doesn't check
>> for the value's existence, and thus invokes the match function
>> with NULL as value.

Given that the trie is maintained by hand in common_list[], I wonder
if we can mechanically catch errors like the one b9317d55a3 added,
by perhaps having a self-test function that a t/helper/ program
calls to perform consistency check after the "git" gets built.

Thanks.





Re: [PATCH v5 00/17] New sparse-checkout builtin and "cone" mode

2019-10-22 Thread Junio C Hamano
"Derrick Stolee via GitGitGadget"  writes:

> V4 UPDATE: Rebased on latest master to include ew/hashmap and
> ds/include-exclude in the base.
>
> This series makes the sparse-checkout feature more user-friendly. While
> there, I also present a way to use a limited set of patterns to gain a
> significant performance boost in very large repositories.
> ...
> Updates in V5:
>
>  * The 'set' subcommand now enables the core.sparseCheckout config setting
>(unless the checkout fails).
>
>
>  * If the in-process unpack_trees() fails with the new patterns, the
>index.lock file is rolled back before the replay of the old
>sparse-checkout patterns.
>
>
>  * Some documentation fixes, f(d)open->xf(d)open calls, and other nits.
>Thanks everyone!

Thanks.  I quickly scanned the changes between the last round and
this one, and all I saw were pure improvements ;-)  Not that I claim
to have read the previous round very carefully, though.

Will replace and queue.


Re: [PATCH v5 13/17] read-tree: show progress by default

2019-10-22 Thread Junio C Hamano
Derrick Stolee  writes:

>> I'm slightly wary of changing the output of plumbing commands
>> like this. If a script wants progress output it can already get
>> it by passing --verbose. With this change a script that does not
>> want that output now has to pass --no-verbose.
>
> If a script is calling this, then won't stderr not be a terminal window, and
> isatty(2) return 0?

Unless the script tries to capture the error output and react
differently depending on the error message from the plumbing (which
is not localized), iow most of the time, standard error stream is
left unredirected and likely to be connected to the terminal if the
script is driven from a terminal command line.

> Or, if the script is run with stderr passing through to
> a terminal, then the user would see progress while running the script, which
> seems like a side-effect but not one that will cause a broken script.

It will show unwanted output to the end users, no?  That is the
complaint about having to pass --no-verbose, if I understand
correctly, if the script does not want to show the progress output.



Re: [PATCH v3 2/2] git_path(): handle `.lock` files correctly

2019-10-22 Thread Junio C Hamano
SZEDER Gábor  writes:

>> +char *base = buf->buf + git_dir_len, *base2 = NULL;
>> +
>> +if (ends_with(base, ".lock"))
>> +base = base2 = xstrndup(base, strlen(base) - 5);
>
> Hm, this adds the magic number 5 and calls strlen(base) twice, because
> ends_with() calls strip_suffix(), which calls strlen().  Calling
> strip_suffix() directly would allow us to avoid the magic number and
> the second strlen():
>
>   size_t len;
>   if (strip_suffix(base, ".lock", &len))
>   base = base2 = xstrndup(base, len);

Makes sense, and is easy to squash in.


Re: [PATCH 0/3] commit: fix advice for empty commits during rebases

2019-10-22 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> In dcb500dc16c (cherry-pick/revert: advise using --skip, 2019-07-02), we
> introduced a helpful message that suggests to run git cherry-pick --skip 
> (instead of the previous message that talked about git reset) when a
> cherry-pick failed due to an empty patch.
>
> However, the same message is displayed during a rebase, when the patch
> to-be-committed is empty. In this case, git reset would also have worked,
> but git cherry-pick --skip does not work. This is a regression introduced in
> this cycle.
>
> Let's be more careful here.
>
> Johannes Schindelin (3):
>   cherry-pick: add test for `--skip` advice in `git commit`
>   sequencer: export the function to get the path of `.git/rebase-merge/`
>   commit: give correct advice for empty commit during a rebase

Overall they looked nicely done.  The last one may have started its
life as "let's fix advice for empty", but no longer is.

The old code used the sequencer_in_use boolean to tell between two
states ("are we doing a single pick, or a range pick?"), but now we
have another boolean rebase_in_progress, and depending on the shape
of the if statements existing codepaths happen to have, these two
are combined in different ways to deal with new states.  E.g. some
places say

if (rebase_in_progress && !sequencer_in_use)
we are doing a rebase;
else
we are doing a cherry-pick;

and some others say

if (sequencer_in_use)
we are doing a multi pick;
else if (rebase_in_progress)
we are doing a rebase;
else
we are doing a single pick;

I wonder if it makes the resulting logic simpler to reason about if
these combination of two variables are rewritten to use a single
variable that enumerates three (or is it four, counting "we are
doing neither a cherry-pick or a rebase"?) possible state.

Other than that, looked sensible.  Will queue.

Thanks.


Re: [PATCH 3/3] commit: give correct advice for empty commit during a rebase

2019-10-22 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> Note: we take pains to handle the situation when a user runs a `git
> cherry-pick` _during_ a rebase. This is quite valid (e.g. in an `exec`
> line in an interactive rebase). On the other hand, it is not possible to
> run a rebase during a cherry-pick, meaning: if both `rebase-merge/` and
> `sequencer/` exist, we still want to advise to use `git cherry-pick
> --skip`.

Good description of why the code does what it does, that future
readers will surely appreciate.  Nice.



Re: [BUG] "--show-current-patch" return a mail instead of a patch

2019-10-22 Thread Junio C Hamano
Jerome Pouiller  writes:

> Hello all,
>
> I try to use "git am" to apply a patch sent using "git send-email". This
> patch does not apply properly. I try to use "git am --show-current-patch"
> to understand the problem. However, since original mail is encoded in quoted-
> printable, data returned by --show-current-patch is not a valid patch.

I agree that --show-current-patch is a misdesigned feature.  We'd be
doing a better service to our users if we documented that the patch
and log message are found at .git/rebase-apply/{patch,msg} instead
of trying to hide the path.

Unfortunately, it is likely that those who added that feature have
built their tooling around it to depend on its output being the full
e-mail message "am" was fed (and split by "git mailsplit").  So I do
not think we will be changing the output to the patch file only.

But even then, the documentation can be fixed without any backward
compatibility issues.  Perhaps like this?

 Documentation/git-am.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index 6f6c34b0f4..f63b70325c 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -172,7 +172,7 @@ default.   You can use `--no-utf8` to override this.
untouched.
 
 --show-current-patch::
-   Show the patch being applied when "git am" is stopped because
+   Show the entire e-mail message "git am" has stopped at, because
of conflicts.
 
 DISCUSSION



Re: [PATCH] t7419: change test_must_fail to ! for grep

2019-10-22 Thread Junio C Hamano
Denton Liu  writes:

> According to t/README, test_must_fail() should only be used to test for
> failure in Git commands. Replace the invocations of
> `test_must_fail grep` with `! grep`.
>
> Signed-off-by: Denton Liu 
> ---
> *sigh* Here's another cleanup patch for 'dl/submodule-set-branch'. It's
> inspired by Eric Sunshine's comments on the t5520 patchset from earlier.
> It's definitely not urgent, though, and can wait for v2.25.0.

True, but they are trivially correct and removes the risk of
inexperienced developers copying and pasting this bad pattern to
other tests that was added during this cycle, so I think it is a
good idea to take it now.

Thanks for being extra careful.


Re: [PATCH v2 1/1] config: move documentation to config.h

2019-10-22 Thread Junio C Hamano
Emily Shaffer  writes:

>> ...
>> +/**
>> + * The config API gives callers a way to access Git configuration files
>> + * (and files which have the same syntax). See linkgit:git-config[1] for a
>
> Ah, here's another place where the Asciidoc link isn't going to do
> anything anymore.
>
> Otherwise I didn't still see anything jumping out. When the commit
> message is cleaned up I'm ready to add my Reviewed-by line.

Thanks.  Your review(s) have been quite sensible and helpful.




Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-22 Thread Junio C Hamano
Johannes Schindelin  writes:

> However, I think this is _really_ ugly and intrusive. The opposite of
> what my goals were.
>
> So I think I'll just bite the bullet and use a temporary copy if the
> argument ends in `.lock`.

Sounds like a quite sensible design decision to me.


Re: [PATCH v5 1/2] format-patch: create leading components of output directory

2019-10-22 Thread Junio C Hamano
Bert Wesarg  writes:

> Please ignore this. Will rebase on 2.24-rc0 and will only include the
> test changes.

Thanks.


Re: [PATCH v2 0/2] Fix the speed of the CI (Visual Studio) tests

2019-10-22 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> Changes since v1:
>
>  * Fixed typo "nore" -> "nor" in the commit message.
>
> Johannes Schindelin (2):
>   ci(visual-studio): use strict compile flags, and optimization
>   ci(visual-studio): actually run the tests in parallel
>
>  azure-pipelines.yml | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Thanks.

I will take the change to 'master' directly, as cooking in 'next'
won't give it any extra exposure when the topic touches only this
file and the patch comes from those who do exercise azure CI build
before sending it out to the list ;-)


Re: [PATCH] test-progress: fix test failures on big-endian systems

2019-10-22 Thread Junio C Hamano
Jeff King  writes:

> But here's where it gets tricky. In addition to catching any size
> mismatches, this will also catch signedness problems. I.e., if we make
> OPT_INTEGER() use "intp", then everybody passing in &unsigned_var now
> gets a compiler warning. Which maybe is a good thing, I dunno.

Hmph, true.  I'd agree with back-burnering it for now.  

Perhaps we'd fix the signedness issue one by one in a preparatory
series before converting the value field to a union, if we want to
pursue this idea further (in which I am mildly interested, by the
way), but it does sound like it should be given lower priority.

> So that's where I gave up. Converting between signed and unsigned
> variables needs to be done very carefully, as there are often subtle
> impacts (e.g., loop terminations). And because we have so many sign
> issues already, compiling with "-Wsign-compare", etc, isn't likely to
> help.

True.

Thanks.


Re: [RFC/WIP] range-diff: show old/new blob OIDs in comments

2019-10-22 Thread Eric Wong
Johannes Schindelin  wrote:
> Hi Eric,
> 
> 
> On Thu, 17 Oct 2019, Eric Wong wrote:
> 
> > (WIP, mostly stream-of-concious notes + reasoning)
> >
> > When using "git format-patch --range-diff", the pre and
> > post-image blob OIDs are in each email, while the exact
> > commit OIDs are rarely shared via emails (only the tip
> > commit from "git request-pull").
> >
> > These blob OIDs make it easy to search for or lookup the
> > full emails which create them, or the blob itself once
> > it's fetched via git.
> >
> > public-inbox indexes and allows querying specifically for blob
> > OIDs via dfpre:/dfpost: since June 2017.  As of Jan 2019,
> > public-inbox also supports recreating blobs out of patch emails
> > (querying internally with dfpre:/dfpost: and doing "git apply")
> >
> > Searching on these blob OIDs also makes it easier to find
> > previous versions of the patch sets using any mail search
> > engine.
> >
> > Future changes to public-inbox may allow generating custom
> > diffs out of any blobs it can find or recreate.
> >
> > Most of this is pretty public-inbox-specific and would've
> > made some future changes to public-inbox much easier
> > (if we had this from the start of range-diff).
> >
> > Unfortunately, it won't help with cases where range-diffs
> > are already published, but range-diff isn't too old.
> 
> I guess your patch won't hurt.

Cool, will update tests and resend.

> As to recreating blobs from mails: Wow. That's quite a length you're
> going, and I think it is a shame that you have to. If only every
 contribution came accompanied with a pullable branch in a public
> repository.

What Konstantin said about git repos being transient.
It wasn't too much work to recreate those blobs from
scratch since git-apply has done it since 2005.
Just add search :)

We could get around transient repos with automatic mirroring
bots which never deletes or overwrites anything published.
That includes preserving pre-force-push data in case of
force pushes.

> Instead, we will have to rely on your centralized, non-distributed
> service...

I'm curious how you came to believe that, since that's the
opposite of what public-inbox has always been intended to be.

The only thing that's centralized and not reproducible by
mirrors is the domain name (and I also have Tor .onion mirrors
with no dependency on ICAAN).  Memorable naming is a tricky
problem in decentralized systems, though...


Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 08:48:20PM -0400, Jeff King wrote:

> I admit I am puzzled, though, _why_ the presence of the submodule
> matters. That is, from your explanation, I thought the issue was simply
> that `fetch` walked (and marked) some commits, and the flags overlapped
> with what the commit-graph code expected.
> 
> I could guess that the presence of the submodule triggers some analysis
> for --recurse-submodules. But then we don't actually recurse (maybe
> because they're not activated? In which case maybe we shouldn't be doing
> that extra walk to look for submodules if there aren't any activated
> ones in our local repo).

Indeed, that seems to be it. If I do this:

  git init repo
  cd repo
  cat >.gitmodules <<\EOF
  [submodule "foo"]
  path = foo
  url = https://example.com
  EOF
  time git fetch /path/to/git.git

then we end up traversing the whole git.git history a second time, even
though we should know off the bat that there are no active submodules
that we would recurse to.

Doing this makes the problem go away:

diff --git a/submodule.c b/submodule.c
index 0f199c5137..0db2f18b93 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1193,7 +1193,7 @@ static void calculate_changed_submodule_paths(struct 
repository *r,
struct string_list_item *name;
 
/* No need to check if there are no submodules configured */
-   if (!submodule_from_path(r, NULL, NULL))
+   if (!is_submodule_active(r, NULL))
return;
 
argv_array_push(&argv, "--"); /* argv[0] program name */

but causes some tests to fail (I think that in some cases we're supposed
to auto-initialize, and we'd probably need to cover that case, too).

All of this is outside of your fix, of course, but:

  1. I'm satisfied now that I understand why the test triggers the
 problem.

  2. You may want have a real activated submodule in your test. Right
 now we'll trigger the submodule-recursion check even without that,
 but in the future we might do something like the hunk above. In
 which case your test wouldn't be checking anything interesting
 anymore.

-Peff


Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 08:35:57PM -0400, Derrick Stolee wrote:

> > In the cover letter Derrick mentioned that he used
> > https://github.com/derrickstolee/numbers for testing, and that repo
> > has a submodule as well.
> 
> I completely forgot that I put a submodule in that repo. It makes sense
> that the Git repo also has one. Thanks for the test! I'll get it into
> the test suite tomorrow.

Ah, right. Good catch Gábor. The test below fails for me without your
patch, and succeeds with it (the extra empty commit is necessary so that
we have a parent).

I admit I am puzzled, though, _why_ the presence of the submodule
matters. That is, from your explanation, I thought the issue was simply
that `fetch` walked (and marked) some commits, and the flags overlapped
with what the commit-graph code expected.

I could guess that the presence of the submodule triggers some analysis
for --recurse-submodules. But then we don't actually recurse (maybe
because they're not activated? In which case maybe we shouldn't be doing
that extra walk to look for submodules if there aren't any activated
ones in our local repo).

I'd also wonder if it would be possible to trigger in another way (say,
due to want/have negotiation, though I guess most of the walking there
is done on the server side). But it may not be worth digging too far
into it.

-Peff

---
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index ecabbe1616..1b092fec0b 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -583,6 +583,21 @@ test_expect_success 'fetch.writeCommitGraph' '
)
 '
 
+test_expect_success 'fetch.writeCommitGraph with a submodule' '
+   git init has-submodule &&
+   (
+   cd has-submodule &&
+   git commit --allow-empty -m parent &&
+   git submodule add ../three &&
+   git commit -m "add submodule"
+   ) &&
+   git clone has-submodule submod-clone &&
+   (
+   cd submod-clone &&
+   git -c fetch.writeCommitGraph fetch origin
+   )
+'
+
 # configured prune tests
 
 set_config_tristate () {


Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread Derrick Stolee



On 10/22/2019 7:35 PM, SZEDER Gábor wrote:
> On Tue, Oct 22, 2019 at 05:45:54PM -0400, Jeff King wrote:
>> Puzzling...
> 
> Submodules?
> 
>   $ cd ~/src/git/
>   $ git quotelog 86cfd61e6b
>   86cfd61e6b (sha1dc: optionally use sha1collisiondetection as a submodule, 
> 2017-07-01)
>   $ git init --bare good.git
>   Initialized empty Git repository in /home/szeder/src/git/good.git/
>   $ git push -q good.git 86cfd61e6b^:refs/heads/master
>   $ git clone good.git good-clone
>   Cloning into 'good-clone'...
>   done.
>   $ git -c fetch.writeCommitGraph -C good-clone fetch origin
>   Computing commit graph generation numbers: 100% (46958/46958), done.
>   $ git init --bare bad.git
>   Initialized empty Git repository in /home/szeder/src/git/bad.git/
>   $ git push -q bad.git 86cfd61e6b:refs/heads/master
>   $ git clone bad.git bad-clone
>   Cloning into 'bad-clone'...
>   done.
>   $ git -c fetch.writeCommitGraph -C bad-clone fetch origin
>   Computing commit graph generation numbers: 100% (1/1), done.
>   BUG: commit-graph.c:886: missing parent 
> 9936c1b52a39fa14fca04f937df3e75f7498ac66 for commit 
> 86cfd61e6bc12745751c43b4f69886b290cd85cb
>   Aborted
> 
> In the cover letter Derrick mentioned that he used
> https://github.com/derrickstolee/numbers for testing, and that repo
> has a submodule as well.

I completely forgot that I put a submodule in that repo. It makes sense
that the Git repo also has one. Thanks for the test! I'll get it into
the test suite tomorrow.

-Stolee


Re: [PATCH v2 1/1] ci(osx): use new location of the `perforce` cask

2019-10-22 Thread SZEDER Gábor
On Wed, Oct 23, 2019 at 01:23:25AM +0200, Johannes Schindelin wrote:
> On Fri, 18 Oct 2019, SZEDER Gábor wrote:
> 
> > On Thu, Oct 17, 2019 at 12:47:33PM +, Johannes Schindelin via 
> > GitGitGadget wrote:
> > > From: Johannes Schindelin 
> > >
> > > The CI builds are failing for Mac OS X due to a change in the
> >
> > s/CI/Azure Pipelines/
> >
> > Our Travis CI builds are fine.
> 
> For the moment ;-)

Touché.
Believe it or not, I did wrote "at least for now" at the end of that
sentence, but then deleted it.  Serves me right, now there is some new
breakage with gcc@8... :)

> If you don't mind, I am going to copy/edit these three paragraphs into
> the commit message,

Sure, go ahead.

> > In our CI builds we don't at all care what the checksums of the
> > Perforce binaries are, so I would really like to tell 'brew' to ignore
> > any checksum mismatch when installing 'perforce'.  Alas, it appears
> > that 'brew' has no public options to turn of or to ignore checksum
> > verification.
> 
> Sad, yet true, that we indeed have no command-line option to say "you
> know what, your checksum possibly mismatches, but we really don't care".

Actually, 'brew' does have some undocumented options, but I didn't
even bothered to check, because it's not really sensible to rely on an
undocumented option (especially when even the documented options break
every once in a while...).

> > Now, let's take a step back.
> >
> > All 'brew cask install perforce' really does is run 'curl' to download
> > a tar.gz from the Perforce servers, verify its checksum, unpack it,
> > and put the executables somewhere on $PATH.  That's not rocket
> > science, we could easily do that ourselves; we don't even have to deal
> > with a tar.gz, the 'p4' and 'p4d' binaries for mac are readily
> > available for download at:
> >
> >   http://filehost.perforce.com/perforce/r19.1/bin.macosx1010x86_64/
> >
> > And, in fact, that's what we have been doing in some of our Linux jobs
> > since the very beginning, so basically only the download URL has to be
> > adjusted.
> 
> I'd rather not.
> 
> Just because there is no better way on Linux, and just because the
> current `perforce` cask recipe happens to just download and unpack that
> file does not mean that this won't change.

Yeah, I'm fairly sure that the way Homebrew installs Perforce will
change, but if we download Perforce ourselves, then it won't matter at
all.

Downloading the Perforce binaries with 'wget' worked fairly well for
almost four years, except from that server glitch a couple of weeks
ago; I think downloading the macOS binaries from the same server would
work just as well.  OTOH, this is the fourth time that we have to
tweak how we install Perforce via Homebrew.

FWIW, it looks like this:

https://github.com/szeder/git/blob/ci-osx-wget-perforce/ci/install-dependencies.sh#L11


Re: [PATCH 1/1] documentation: remove empty doc files

2019-10-22 Thread Junio C Hamano
Emily Shaffer  writes:

> As for the content of this change, I absolutely approve. I've stumbled
> across some of these empty docs while looking for answers before and
> found it really demoralizing - the community is so interested in
> teaching me how to contribute that they've sat on a TODO for 12 years?
> :( I even held up api-grep.txt as a (bad) example in a talk I gave this
> year. I'm happy to see these files go.

I'd approve this move, too, especially if we accompanied deletion
with addition (or verification of existence) of necessary docs
elsewhere (perhaps in *.h headers).



Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-22 Thread brian m. carlson
On 2019-10-22 at 23:28:47, Prarit Bhargava wrote:
> In many projects the number of contributors is low enough that users know
> each other and the full email address doesn't need to be displayed.
> Displaying only the author's username saves a lot of columns on the screen.
> For example displaying "prarit" instead of "pra...@redhat.com" saves 11
> columns.
> 
> Add a "%aU"|"%au" option that outputs the author's email username.

I have no position on whether or not this is a useful change.  I don't
think I'll end up using it, but I can imagine cases where it is useful,
such as certain corporate environments.  It would be interesting to see
what others think.

> diff --git a/Documentation/pretty-formats.txt 
> b/Documentation/pretty-formats.txt
> index b87e2e83e6d0..479a15a8ab12 100644
> --- a/Documentation/pretty-formats.txt
> +++ b/Documentation/pretty-formats.txt
> @@ -163,6 +163,9 @@ The placeholders are:
>  '%ae':: author email
>  '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1]
>   or linkgit:git-blame[1])
> +'%au':: author username
> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1]
> + or linkgit:git-blame[1])

I think we need to actually document what "username" means here.  I
expect you'll have people think that this magically means their
$HOSTING_PLATFORM username, which it is not.

> diff --git a/pretty.c b/pretty.c
> index b32f0369531c..2a5b93022050 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char 
> part,
>   strbuf_add(sb, mail, maillen);
>   return placeholder_len;
>   }
> + if (part == 'u' || part == 'U') {   /* username */
> + maillen = strstr(s.mail_begin, "@") - s.mail_begin;
> + strbuf_add(sb, mail, maillen);
> + return placeholder_len;
> + }

This branch doesn't appear to do anything different for the mailmap and
non-mailmap cases.  Perhaps adding an additional test that demonstrates
the difference would be a good idea.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username

2019-10-22 Thread Junio C Hamano
Prarit Bhargava  writes:

> Subject: Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's 
> username

Downcase "Add" (see "git shortlog --no-merges -100 master" and
mimick the project convention).

> Add a "%aU"|"%au" option that outputs the author's email username.

Even though I personally do not see the use for it, I agree it would
make sense to have an option to show the local part only where the
e-mail address is shown.  

I do not know if u/U is a good mnemonic; it hints too strongly that
it may come from GIT_{AUTHOR/COMMITTER}_NAME but that is not what
you are doing---isn't there a letter that better conveys that this
is about RFC 2822 local-part (cf. page 16 ieft.org/rfc/rfc2822.txt)?

> diff --git a/Documentation/pretty-formats.txt 
> b/Documentation/pretty-formats.txt
> index b87e2e83e6d0..479a15a8ab12 100644
> --- a/Documentation/pretty-formats.txt
> +++ b/Documentation/pretty-formats.txt
> @@ -163,6 +163,9 @@ The placeholders are:
>  '%ae':: author email
>  '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1]
>   or linkgit:git-blame[1])
> +'%au':: author username
> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1]
> + or linkgit:git-blame[1])
>  '%ad':: author date (format respects --date= option)
>  '%aD':: author date, RFC2822 style
>  '%ar':: author date, relative

> diff --git a/pretty.c b/pretty.c
> index b32f0369531c..2a5b93022050 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char 
> part,
>   strbuf_add(sb, mail, maillen);
>   return placeholder_len;
>   }
> + if (part == 'u' || part == 'U') {   /* username */
> + maillen = strstr(s.mail_begin, "@") - s.mail_begin;
> + strbuf_add(sb, mail, maillen);
> + return placeholder_len;
> + }

I think users get %eu and %eU for free with this change, which should
be documented.


Re: [PATCH v2 1/1] ci(osx): use new location of the `perforce` cask

2019-10-22 Thread Junio C Hamano
Johannes Schindelin  writes:

>> This is already in 'next' X-<; reverting a merge is cheap but I
>> prefer to do so when we already have a replacement.
>
> I force-pushed (see https://github.com/gitgitgadget/git/pull/400), and
> once Stolee approves, he will submit v3. This will only change the
> commit message, though, as I disagree that hard-coding the URL would be
> an improvement: the nice thing about a package management system is that
> the user does not need to know the details (or need to know if the
> details change, like, ever).

If this were meant for the upcoming release, I would rather see us
copy a butt-ugly-but-known-working procedure if we have one this
close to -rc1.  If the hard-coded URL ever changes, the procedure
we would be copying from would be broken anyway.

But I agree 100% that we should take a conceptually cleaner approach
for the longer term.  Let's replace the original one with this and
cook in 'next'---it would be ideal if the ugly-but-know-working one
be updated to match in the meantime, but if it is bypassing package
management for a reason (the upstream just publishes the URL to
download from without packaging it properly, for example?), that
would not be possible, and it is OK if that is the case.

Thanks.





Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread SZEDER Gábor
On Tue, Oct 22, 2019 at 05:45:54PM -0400, Jeff King wrote:
> On Tue, Oct 22, 2019 at 04:33:16PM -0400, Jeff King wrote:
> 
> > > I have failed to produce a test using the file:// protocol that
> > > demonstrates this bug.
> > 
> > Hmm, from the description, it sounds like it should be easy. I might
> > poke at it a bit.
> 
> Hmph. I can reproduce it here, but it seems to depend on the repository.
> If I do this:
> 
> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> index ecabbe1616..8d473a456f 100755
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -583,6 +583,14 @@ test_expect_success 'fetch.writeCommitGraph' '
>   )
>  '
>  
> +test_expect_success 'fetch.writeCommitGraph with a bigger repo' '
> + git clone "$TEST_DIRECTORY/.." repo &&
> + (
> + cd repo &&
> + git -c fetch.writeCommitGraph fetch origin
> + )
> +'
> +
>  # configured prune tests
>  
>  set_config_tristate () {
> 
> it reliably triggers the bug. But if I make a synthetic repo, even it
> has a lot of commits (thousands or more), it doesn't trigger. I thought
> maybe it had to do with having commits that were not at tips (since the
> tip ones presumably _are_ fed into the graph generation process). But
> that doesn't seem to help.
> 
> Puzzling...

Submodules?

  $ cd ~/src/git/
  $ git quotelog 86cfd61e6b
  86cfd61e6b (sha1dc: optionally use sha1collisiondetection as a submodule, 
2017-07-01)
  $ git init --bare good.git
  Initialized empty Git repository in /home/szeder/src/git/good.git/
  $ git push -q good.git 86cfd61e6b^:refs/heads/master
  $ git clone good.git good-clone
  Cloning into 'good-clone'...
  done.
  $ git -c fetch.writeCommitGraph -C good-clone fetch origin
  Computing commit graph generation numbers: 100% (46958/46958), done.
  $ git init --bare bad.git
  Initialized empty Git repository in /home/szeder/src/git/bad.git/
  $ git push -q bad.git 86cfd61e6b:refs/heads/master
  $ git clone bad.git bad-clone
  Cloning into 'bad-clone'...
  done.
  $ git -c fetch.writeCommitGraph -C bad-clone fetch origin
  Computing commit graph generation numbers: 100% (1/1), done.
  BUG: commit-graph.c:886: missing parent 
9936c1b52a39fa14fca04f937df3e75f7498ac66 for commit 
86cfd61e6bc12745751c43b4f69886b290cd85cb
  Aborted

In the cover letter Derrick mentioned that he used
https://github.com/derrickstolee/numbers for testing, and that repo
has a submodule as well.



Re: [RFC PATCH 1/7] Makefile: alphabetically sort += lists

2019-10-22 Thread Johannes Schindelin
Hi,

On Mon, 21 Oct 2019, Denton Liu wrote:

> Hi Johannes,
>
> On Mon, Oct 21, 2019 at 08:44:40PM +0200, Johannes Schindelin wrote:
> > Hi Junio,
> >
> > On Fri, 18 Oct 2019, Junio C Hamano wrote:
> >
> > > Denton Liu  writes:
> > >
> > > > There are many += lists in the Makefile and, over time, they have gotten
> > > > slightly out of order, alphabetically. Alphabetically sort all += lists
> > > > to bring them back in order.
> > > > ...
> > >
> > > Hmm.  I like the general thrust, but ...
> > >
> > > >  LIB_OBJS += combine-diff.o
> > > > -LIB_OBJS += commit.o
> > > >  LIB_OBJS += commit-graph.o
> > > >  LIB_OBJS += commit-reach.o
> > > > +LIB_OBJS += commit.o
> > >
> > > ... I do not particularly see this change (there may be similar
> > > ones) desirable.  I'd find it it be much more natural to sort
> > > "commit-anything" after "commit", and that is true with or without
> > > the common extension ".o" added to these entries.
> > >
> > > In short, flipping these entries because '.' sorts later than '-' is
> > > making the result look "less sorted", at least to me.
> >
> > The problem with this argument is that it disagrees with ASCII, as `-`
> > has code 0x2d while `.` has code 0x2e, i.e. it is lexicographically
> > _larger_.
> >
> > So Denton's patch does the correct thing.
>
> I actually agree with Junio on this one. Without the prefixes, "commit"
> would go before "commit-graph" so I think it would make more sense to
> order with the prefixes removed instead of taking the naive ordering by
> just sorting each block.

That will make it harder on other contributors like me, who prefer to
mark the lines in `vim` and then call `:sort` on them, and then not care
about it any further.

Any decision that makes automating tedious tasks harder puts more burden
on human beings. I don't like that.

Ciao,
Dscho

>
> Thanks,
>
> Denton
>
> >
> > Ciao,
> > Dscho
>


Re: [PATCH v2 1/1] ci(osx): use new location of the `perforce` cask

2019-10-22 Thread Johannes Schindelin
Hi Junio,

On Mon, 21 Oct 2019, Junio C Hamano wrote:

> SZEDER Gábor  writes:
>
> > On Thu, Oct 17, 2019 at 12:47:33PM +, Johannes Schindelin via 
> > GitGitGadget wrote:
> >> From: Johannes Schindelin 
> >>
> >> The CI builds are failing for Mac OS X due to a change in the
> >
> > s/CI/Azure Pipelines/
> >
> > Our Travis CI builds are fine.
> >
> >> location of the perforce cask. The command outputs the following
> >> error:
> >>
> >> + brew install caskroom/cask/perforce
> >> Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
> >>
> >> So let's try to call `brew cask install perforce` first (which is what
> >> that error message suggests, in a most round-about way).
> >>
> >> The "caskroom" way was added in 672f51cb (travis-ci:
> >> fix Perforce install on macOS, 2017-01-22) and the justification
> >> is that the call "brew cask install perforce" can fail due to a checksum
> >> mismatch: the recipe simply downloads the official Perforce distro, and
> >> whenever that is updated, the recipe needs to be updated, too.
> >
> > This paragraph is wrong, it mixes up things too much.
> >
> > Prior to 672f51cb we used to install the 'perforce' _package_ with
> > 'brew install perforce' (note: no 'cask' in there).  The justification
> > for 672f51cb was that the command 'brew install perforce' simply
> > stopped working, after Homebrew folks decided that it's better to move
> > the 'perforce' package to a "cask".  It was _their_ justification for
> > this move that 'brew install perforce' "can fail due to a checksum
> > mismatch ...", and casks can be installed without checksum
> > verification.  And indeed, both 'brew cask install perforce' and 'brew
> > install caskroom/cask/perforce' printed something along the lines of:
> >
> >   ==> No checksum defined for Cask perforce, skipping verification
> >
> > It's unclear to me why 672f51cb used 'brew install
> > caskroom/cask/perforce' instead of 'brew cask install perforce'.  It
> > appears (by running both commands on old Travis CI macOS images) that
> > both commands worked all the same already back then.
> >
> > Anyway, as the error message at the top of the log message shows,
> > 'brew install caskroom/cask/perforce' has stopped working recently,
> > but 'brew cask install perforce' still does, so let's use that.
> >
> > Note that on Travis CI we explicitly specify which macOS image to use,
> > and nowadays we don't run 'brew update' during the build process [1],
> > so both commands work in our builds there.
> > ...
> > Now, let's take a step back.
> >
> > All 'brew cask install perforce' really does is ...
> > ... in fact, that's what we have been doing in some of our Linux jobs
> > since the very beginning, so basically only the download URL has to be
> > adjusted.
>
> This is already in 'next' X-<; reverting a merge is cheap but I
> prefer to do so when we already have a replacement.

I force-pushed (see https://github.com/gitgitgadget/git/pull/400), and
once Stolee approves, he will submit v3. This will only change the
commit message, though, as I disagree that hard-coding the URL would be
an improvement: the nice thing about a package management system is that
the user does not need to know the details (or need to know if the
details change, like, ever).

Ciao,
Dscho

>
> Thanks for taking a closer look.
>


Re: [PATCH v2 1/1] ci(osx): use new location of the `perforce` cask

2019-10-22 Thread Johannes Schindelin
Hi Gábor,

On Fri, 18 Oct 2019, SZEDER Gábor wrote:

> On Thu, Oct 17, 2019 at 12:47:33PM +, Johannes Schindelin via 
> GitGitGadget wrote:
> > From: Johannes Schindelin 
> >
> > The CI builds are failing for Mac OS X due to a change in the
>
> s/CI/Azure Pipelines/
>
> Our Travis CI builds are fine.

For the moment ;-)

> > location of the perforce cask. The command outputs the following
> > error:
> >
> > + brew install caskroom/cask/perforce
> > Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
> >
> > So let's try to call `brew cask install perforce` first (which is what
> > that error message suggests, in a most round-about way).
> >
> > The "caskroom" way was added in 672f51cb (travis-ci:
> > fix Perforce install on macOS, 2017-01-22) and the justification
> > is that the call "brew cask install perforce" can fail due to a checksum
> > mismatch: the recipe simply downloads the official Perforce distro, and
> > whenever that is updated, the recipe needs to be updated, too.
>
> This paragraph is wrong, it mixes up things too much.
>
> Prior to 672f51cb we used to install the 'perforce' _package_ with
> 'brew install perforce' (note: no 'cask' in there).  The justification
> for 672f51cb was that the command 'brew install perforce' simply
> stopped working, after Homebrew folks decided that it's better to move
> the 'perforce' package to a "cask".  It was _their_ justification for
> this move that 'brew install perforce' "can fail due to a checksum
> mismatch ...", and casks can be installed without checksum
> verification.  And indeed, both 'brew cask install perforce' and 'brew
> install caskroom/cask/perforce' printed something along the lines of:
>
>   ==> No checksum defined for Cask perforce, skipping verification
>
> It's unclear to me why 672f51cb used 'brew install
> caskroom/cask/perforce' instead of 'brew cask install perforce'.  It
> appears (by running both commands on old Travis CI macOS images) that
> both commands worked all the same already back then.
>
> Anyway, as the error message at the top of the log message shows,
> 'brew install caskroom/cask/perforce' has stopped working recently,
> but 'brew cask install perforce' still does, so let's use that.

If you don't mind, I am going to copy/edit these three paragraphs into
the commit message,

> Note that on Travis CI we explicitly specify which macOS image to use,
> and nowadays we don't run 'brew update' during the build process [1],
> so both commands work in our builds there.
>
> [1] f2f4715033 (ci: don't update Homebrew, 2019-07-03)
>
> > CI servers are typically fresh virtual machines, but not always. To
> > accommodate for that, let's try harder if `brew cask install perforce`
> > fails, by specifically pulling the latest `master` of the
> > `homebrew-cask` repository.
>
> Homebrew didn't record a checksum for Perforce versions r17.1, r17.2
> and r18.1, so installing those still works fine.  Our Travis CI images
> install r18.1.
>
> However, when Homebrew updated to Perforce r19.1, they included the
> checksum again for some reason (intentional or accidental; I didn't
> look why).  This worked fine for a while, until a couple of days ago
> Perforce updated the r19.1 binaries in place, breaking those
> checksums.
>
> If we were to still run 'brew update', then it would shortly fix the
> checksum mismatch.  But we don't run it, and we do not want to run it
> because it takes ages.  Falling back to pull from the 'homebrew-cask'
> repository could be a reasonable and quick workaround.

Okay, good.

> > This will still fail, of course, when `homebrew-cask` falls behind
> > Perforce's release schedule. But once it is updated, we can now simply
> > re-run the failed jobs and they will pick up that update.
>
> In our CI builds we don't at all care what the checksums of the
> Perforce binaries are, so I would really like to tell 'brew' to ignore
> any checksum mismatch when installing 'perforce'.  Alas, it appears
> that 'brew' has no public options to turn of or to ignore checksum
> verification.

Sad, yet true, that we indeed have no command-line option to say "you
know what, your checksum possibly mismatches, but we really don't care".

> Now, let's take a step back.
>
> All 'brew cask install perforce' really does is run 'curl' to download
> a tar.gz from the Perfor

Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 04:33:16PM -0400, Jeff King wrote:

> > I have failed to produce a test using the file:// protocol that
> > demonstrates this bug.
> 
> Hmm, from the description, it sounds like it should be easy. I might
> poke at it a bit.

Hmph. I can reproduce it here, but it seems to depend on the repository.
If I do this:

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index ecabbe1616..8d473a456f 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -583,6 +583,14 @@ test_expect_success 'fetch.writeCommitGraph' '
)
 '
 
+test_expect_success 'fetch.writeCommitGraph with a bigger repo' '
+   git clone "$TEST_DIRECTORY/.." repo &&
+   (
+   cd repo &&
+   git -c fetch.writeCommitGraph fetch origin
+   )
+'
+
 # configured prune tests
 
 set_config_tristate () {

it reliably triggers the bug. But if I make a synthetic repo, even it
has a lot of commits (thousands or more), it doesn't trigger. I thought
maybe it had to do with having commits that were not at tips (since the
tip ones presumably _are_ fed into the graph generation process). But
that doesn't seem to help.

Puzzling...

-Peff


Re: Git in Outreachy December 2019?

2019-10-22 Thread Emily Shaffer
On Fri, Sep 20, 2019 at 06:47:01PM -0700, Emily Shaffer wrote:
> On Fri, Sep 20, 2019 at 10:04:48AM -0700, Jonathan Tan wrote:
> > > Prospective mentors need to sign up on that site, and should propose a
> > > project they'd be willing to mentor.
> > 
> > [snip]
> > 
> > > I'm happy to discuss possible projects if anybody has an idea but isn't
> > > sure how to develop it into a proposal.
> > 
> > I'm new to Outreachy and programs like this, so does anyone have an
> > opinion on my draft proposal below? It does not have any immediate
> > user-facing benefit, but it does have a definite end point.
> 
> I'd appreciate similar opinion if anybody has it - and I'd also really
> feel more comfortable with a co-mentor.

I know early on in this thread about Outreachy projects some folks
expressed interest in comentoring. Is anybody still interested in doing
so?

For context, I've been in contact with 3 applicants who have either sent
their first patch already or are getting ready to (and have needed some
involved discussion or help) plus another few applicants who have
inquired and may or may not send patches in the future. I've also
received quite a few mails outside of my timezone working hours (I
usually am awake/working 18:00GMT-02:00GMT), which I feel badly about
not being able to respond to in a timely fashion. If anybody wants to
comentor I would be so excited to have the help :)

 - Emily


Re: [PATCH 1/1] documentation: remove empty doc files

2019-10-22 Thread Emily Shaffer
On Tue, Oct 22, 2019 at 06:19:35PM +, Heba Waly via GitGitGadget wrote:
> From: Heba Waly 
> 
> Remove empty and redundant documentation files from the
> Documentation/technical/ directory.
> 
> As part of moving the documentation from Documentation/technical/api-* to
> header files, the following files are deleted because they include only
> TODO messages with no documentation to be moved:
> Documentation/technical/api-grep.txt
> Documentation/technical/api-object-access.txt
> Documentation/technical/api-quote.txt
> Documentation/technical/api-xdiff-interface.txt

Same thing as I mentioned in your other review; what you've added to
your commit message now doesn't say anything you didn't say with the
diff. I can see that you removed empty documentation files; I can see
that those files include only TODO.

Maybe you can explain why it's a bad developer experience to stumble
across these, and that those files sat untouched for years in the
TODO(contributor-name) state.

> 
> Signed-off-by: Heba Waly 
> ---
>  Documentation/technical/api-grep.txt|  8 
>  Documentation/technical/api-object-access.txt   | 15 ---
>  Documentation/technical/api-quote.txt   | 10 --
>  Documentation/technical/api-xdiff-interface.txt |  7 ---
>  4 files changed, 40 deletions(-)
>  delete mode 100644 Documentation/technical/api-grep.txt
>  delete mode 100644 Documentation/technical/api-object-access.txt
>  delete mode 100644 Documentation/technical/api-quote.txt
>  delete mode 100644 Documentation/technical/api-xdiff-interface.txt

As for the content of this change, I absolutely approve. I've stumbled
across some of these empty docs while looking for answers before and
found it really demoralizing - the community is so interested in
teaching me how to contribute that they've sat on a TODO for 12 years?
:( I even held up api-grep.txt as a (bad) example in a talk I gave this
year. I'm happy to see these files go.

 - Emily


Re: [PATCH v2 1/1] config: move documentation to config.h

2019-10-22 Thread Emily Shaffer
On Tue, Oct 22, 2019 at 07:05:06AM +, Heba Waly via GitGitGadget wrote:
> From: Heba Waly 
> 
> Move the documentation from Documentation/technical/api-config.txt into
> config.h

This is still a little thin for what we usually want from commit
messages. Try to imagine that five years from now, you find this commit
by running `git blame` on config.h and then examining the commit which
introduced all these comments with `git show ` - what would
you want to know?

Typically we want to know "why" the change was made, because the diff
shows "what". We can see from the diff that you're moving comments from
A to B, but if you explain why you did so (not "because my Outreachy
mentor told me to" ;) but "because it is useful to see usage information
next to code" or "this is best practice as described by blah blah") - I
wouldn't be able to know that reasoning just from looking at your diff.


> diff --git a/config.h b/config.h
> index f0ed464004..02f78ffc2b 100644
> --- a/config.h
> +++ b/config.h
> @@ -4,6 +4,23 @@
>  #include "hashmap.h"
>  #include "string-list.h"
>  
> +
> +/**
> + * The config API gives callers a way to access Git configuration files
> + * (and files which have the same syntax). See linkgit:git-config[1] for a

Ah, here's another place where the Asciidoc link isn't going to do
anything anymore.

Otherwise I didn't still see anything jumping out. When the commit
message is cleaned up I'm ready to add my Reviewed-by line.

 - Emily


Re: [PATCH 1/1] config: add documentation to config.h

2019-10-22 Thread Emily Shaffer
On Sun, Oct 20, 2019 at 09:35:17PM +1300, Heba Waly wrote:
> On Sat, Oct 19, 2019 at 11:52 AM Emily Shaffer  
> wrote:
> >
> > On Fri, Oct 18, 2019 at 12:06:59AM +, Heba Waly via GitGitGadget wrote:
> > > From: Heba Waly 
> >
> > Hi Heba,
> >
> > Thanks for the patch!
> >
> > I'd like to highlight to the community that this is an Outreachy
> > applicant and microproject. Heba, when you send the next version, I
> > think you can add [Outreachy] manually to the PR subject line - that
> > should draw the attention of those in the community who are invested in
> > helping Outreachy applicants.
> Good idea! I wanted to add it to the email subject but as I decided to
> use gitgadget
> I had no control over the subject.

Hm, it looks like you already figured out how to add it to the title of
the PR. :)

> > >
> > > This commit is copying and summarizing the documentation from
> > > documentation/technical/api-config.txt to comments in config.h
> >
> > I think in the GitGitGadget PR you've got some great comments from Dscho
> > about how to format your commit message; please take a look at those and
> > feel free to reach out to me if you're still not sure what's missing or
> > not.
> Will do.
> > > Signed-off-by: Heba Waly 
> >
> > One thing I miss in this change is the removal of the contents of
> > Documentation/technical/api-config.txt (or maybe the removal of the file
> > itself). I'd prefer to see at least for api-config.txt to say something
> > like "Please refer to comments in 'config.h'"; or, more drastically, for
> > api-config.txt to be removed entirely.
> >
> > Having both pieces of documentation standing independently means that
> > someone who's trying to add new information about the config API won't
> > know where to add it; eventually they'll add something to config.h but
> > not api-config.txt, or vice versa, and the two documents will go out of
> > sync. So we want to move the documentation, rather than copy it.
> That makes sense, thanks for the explanation.
> I wasn't sure if it should be removed or not so I decided to leave it
> until I'm asked otherwise.
> So I assume api-config.html will be removed too?

That shouldn't be tracked - this is generated from api-config.txt as
part of the build. So don't worry about this part.

> > > +
> > > +/**
> > > + * Value Parsing Helpers
> > > + * -
> >
> > It may not make sense to have the header here in the middle of the doc.
> >
> > I wonder whether we need the headers at all anymore; or, whether it
> > makes more sense to put this header in the long comment at the top with
> > just the list of function names (so someone knows where to look), and
> > leave the per-function explanations inline with the function they
> > describe?
> I see your point Emily, but in the CodingGuidelines file it was
> advised to refer to strbuf.h
> as a model for documentation, I noticed that strbuf.h used headers
> this way so I decided
> to replicate that.

Ok! Sure.

> > I made a couple of smallish comments about general formatting, but I'm
> > also interested to know whether you were able to move the entire
> > contents of api-config.txt across to here. Was there anything that you
> > couldn't find a place for?
> Yes, everything is moved.
> > Thanks a lot for this change, and congrats on getting your first review
> > out! Welcome! :)
> >
> >  - Emily
> >
> Thanks a lot Emily for the detailed and helpful feedback!
> 
> Heba


Re: is commitGraph useful on the server side?

2019-10-22 Thread Konstantin Ryabitsev

On Tue, Oct 22, 2019 at 04:06:16PM -0400, Jeff King wrote:
I'm biased, but I think the commit-graph is generally really good to 
have

in almost all cases. I actually do not know of a good reason to _not_ have
it.


A lot depends on how much you do on the server. If you're serving a web
interface that runs things like `rev-list`, or `for-each-ref
--contains`, etc, then you should see a big improvement.


Ah, good to know, so something like cgit would see an improvement if 
there are commit graphs generated for the repos it serves.



If you're _just_ serving fetches with `upload-pack`, you might see some
small improvement during fetch negotiation. But I suspect it would be
dwarfed by the cost of actually generating packs. Likewise, the
traversal there will be dominated by accessing trees (and if that is
expensive, then you ought to be using reachability bitmaps).


We do generate bitmaps on a routine basis.

OK, I think I'm convinced that enabling commitgraph and generating them 
regularly is going to be a net win.


Thanks, everyone.

-K



Re: [PATCH 1/1] commit-graph: fix writing first commit-graph during fetch

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 05:28:55PM +, Derrick Stolee via GitGitGadget wrote:

> However, the UNINTERESTING flag is used in lots of places in the
> codebase. This flag usually means some barrier to stop a commit walk,
> such as in revision-walking to compare histories. It is not often
> cleared after the walk completes because the starting points of those
> walks do not have the UNINTERESTING flag, and clear_commit_marks() would
> stop immediately.

Oof. Nicely explained, and your fix makes sense.

The global-ness of revision flags always makes me nervous about doing
more things in-process (this isn't the first such bug we've had).

I have a dream of converting most uses of flags into using a
commit-slab. That provides cheap access to an auxiliary structure, so
each traversal, etc, could keep its own flag structure. I'm not sure if
it would have a performance impact, though. Even though it's O(1), it is
an indirect lookup, which could have some memory-access impact (though
my guess is it would be lost in the noise).

One of the sticking points is that all object types, not just commits,
use flags. But we only assign slab ids to commits. I noticed recently
that "struct object" has quite a few spare bits in it these days,
because the switch to a 32-byte oid means 64-bit machines now have an
extra 4 bytes of padding. I wonder if we could use that to store an
index field.

Anyway, that's getting far off the topic; clearly we need a fix in the
meantime, and what you have here looks good to me.

> I tested running clear_commit_marks_many() to clear the UNINTERESTING
> flag inside close_reachable(), but the tips did not have the flag, so
> that did nothing.

Another option would be clear_object_flags(), which just walks all of
the in-memory structs. Your REACHABLE solution is cheaper, though.

> Instead, I finally arrived on the conclusion that I should use a flag
> that is not used in any other part of the code. In commit-reach.c, a
> number of flags were defined for commit walk algorithms. The REACHABLE
> flag seemed like it made the most sense, and it seems it was not
> actually used in the file.

Yeah, being able to remove it from commit-reach.c surprised me for a
moment. To further add to the confusion, builtin/fsck.c has its own
REACHABLE flag (with a different bit and a totally different purpose). I
don't think there's any practical problem there, though.

> I have failed to produce a test using the file:// protocol that
> demonstrates this bug.

Hmm, from the description, it sounds like it should be easy. I might
poke at it a bit.

-Peff


Re: is commitGraph useful on the server side?

2019-10-22 Thread Jeff King
[resending, looks like we lost Konstantin from the cc]

On Tue, Oct 22, 2019 at 03:44:28PM -0400, Derrick Stolee wrote:

> I'm biased, but I think the commit-graph is generally really good to have
> in almost all cases. I actually do not know of a good reason to _not_ have
> it.

A lot depends on how much you do on the server. If you're serving a web
interface that runs things like `rev-list`, or `for-each-ref
--contains`, etc, then you should see a big improvement.

If you're _just_ serving fetches with `upload-pack`, you might see some
small improvement during fetch negotiation. But I suspect it would be
dwarfed by the cost of actually generating packs. Likewise, the
traversal there will be dominated by accessing trees (and if that is
expensive, then you ought to be using reachability bitmaps).

But I agree that there's no reason _not_ to use them.

-Peff


Re: is commitGraph useful on the server side?

2019-10-22 Thread Jeff King
On Tue, Oct 22, 2019 at 03:44:28PM -0400, Derrick Stolee wrote:

> I'm biased, but I think the commit-graph is generally really good to have
> in almost all cases. I actually do not know of a good reason to _not_ have
> it.

A lot depends on how much you do on the server. If you're serving a web
interface that runs things like `rev-list`, or `for-each-ref
--contains`, etc, then you should see a big improvement.

If you're _just_ serving fetches with `upload-pack`, you might see some
small improvement during fetch negotiation. But I suspect it would be
dwarfed by the cost of actually generating packs. Likewise, the
traversal there will be dominated by accessing trees (and if that is
expensive, then you ought to be using reachability bitmaps).

But I agree that there's no reason _not_ to use them.

-Peff


Re: [PATCH v2 9/9] pack-objects: improve partial packfile reuse

2019-10-22 Thread Jonathan Tan
First, the commit message could probably be reordered to start with the
main point (reusing of packfiles at object granularity instead of
allowing reuse of only one contiguous region). To specific points:

> The dynamic array of `struct reused_chunk` is useful
> because we need to know not just the number of zero bits,
> but the accumulated offset due to missing objects.

The number of zero bits is irrelevant, I think. (I know I introduced
this idea, but at that time I was somehow confused that the offset was a
number of objects and not a number of bytes.)

> If a client both asks for a tag by sha1 and specifies
> "include-tag", we may end up including the tag in the reuse
> bitmap (due to the first thing), and then later adding it to
> the packlist (due to the second). This results in duplicate
> objects in the pack, which git chokes on. We should notice
> that we are already including it when doing the include-tag
> portion, and avoid adding it to the packlist.

I still don't understand this part. Going off what's written in the
commit message here, it seems to me that the issue is that (1) an object
can be added to the reuse bitmap without going through to_pack, and (2)
this is done when the client asks for a tag by SHA-1. But all objects
when specified by SHA-1 go through to_pack, don't they?

On to the review of the code. The ordering of - and + lines are
confusing, so I have just removed all - lines.

> +/*
> + * Record the offsets needed in our reused packfile chunks due to
> + * "gaps" where we omitted some objects.
> + */
> +static struct reused_chunk {
> + off_t start;
> + off_t offset;
> +} *reused_chunks;
> +static int reused_chunks_nr;
> +static int reused_chunks_alloc;

A chunk is a set of objects from the original packfile that we are
reusing; all objects in a chunk have the same relative position in the
original packfile and the generated packfile. (This does not mean that
the bytes are exactly the same or, even, that the last object in the
chunk has the same size in the original packfile and the generated
packfile.)

"start" is the offset of the first object of the chunk in the original
packfile.

"offset" is "start" minus the offset of the first object of the chunk in
the generated packfile. (I think it is easier to understand if this was
"new minus old" instead of "old minus new", that is, the negation of its
current value.)

So I would prefer:

  /*
   * A reused set of objects. All objects in a chunk have the same
   * relative position in the original packfile and the generated
   * packfile.
   */
  struct reused_chunk {
/* The offset of the first object of this chunk in the original
 * packfile. */
off_t original;
/* The offset of the first object of this chunk in the generated
 * packfile minus "original". */
off_t difference;
  }

> +static void record_reused_object(off_t where, off_t offset)

Straightforward, other than the renaming of parameters.

> +/*
> + * Binary search to find the chunk that "where" is in. Note
> + * that we're not looking for an exact match, just the first
> + * chunk that contains it (which implicitly ends at the start
> + * of the next chunk.
> + */
> +static off_t find_reused_offset(off_t where)

Also straightforward.

> +static void write_reused_pack_one(size_t pos, struct hashfile *out,
> +   struct pack_window **w_curs)
> +{
> + off_t offset, next, cur;
> + enum object_type type;
> + unsigned long size;
> +
> + offset = reuse_packfile->revindex[pos].offset;
> + next = reuse_packfile->revindex[pos + 1].offset;
>  
> + record_reused_object(offset, offset - hashfile_total(out));

This is where I understood what "offset" meant in struct reused_chunk.

>  
> + cur = offset;
> + type = unpack_object_header(reuse_packfile, w_curs, &cur, &size);
> + assert(type >= 0);
>  
> + if (type == OBJ_OFS_DELTA) {
> + off_t base_offset;
> + off_t fixup;
> +
> + unsigned char header[MAX_PACK_OBJECT_HEADER];
> + unsigned len;
> +
> + base_offset = get_delta_base(reuse_packfile, w_curs, &cur, 
> type, offset);
> + assert(base_offset != 0);
> +
> + /* Convert to REF_DELTA if we must... */
> + if (!allow_ofs_delta) {
> + int base_pos = find_revindex_position(reuse_packfile, 
> base_offset);
> + const unsigned char *base_sha1 =
> + nth_packed_object_sha1(reuse_packfile,
> +
> reuse_packfile->revindex[base_pos].nr);
> +
> + len = encode_in_pack_object_header(header, 
> sizeof(header),
> +OBJ_REF_DELTA, size);
> + hashwrite(out, header, len);
> + hashwrite(out, base_sha1, 20);
> + copy_pack_data(out, reuse_packfile, w_curs, cur, next

Re: is commitGraph useful on the server side?

2019-10-22 Thread Derrick Stolee
On 10/22/2019 12:51 PM, Konstantin Ryabitsev wrote:
> Hi, all:
> 
> I've read the docs on commitGraph and it's not 100% clear to me if turning it 
> on and generating commit graphs would be useful on the server-side. I know 
> it's going to be enabled by default and automatically generated whenever "git 
> gc" runs, so I'm trying to figure out if it'll be useful for git-daemon 
> operations.
> 
> Thanks in advance for your help.

I've CC'd Taylor Blau for more information here.

I'm biased, but I think the commit-graph is generally really good to have
in almost all cases. I actually do not know of a good reason to _not_ have
it.

If you are managing reachability bitmaps, then most of the server-side
stuff will use the bitmaps instead. However, creating those bitmaps will
be slightly faster with the commit-graph.

If you don't use bitmaps, then the commit-graph will help fetch negotiation
and many other commit-walk experiences.

If you have a lot of machinery around your server maintenance, then you
can schedule commit-graph updates more frequently than bitmap computations,
and you would get benefit by parsing commits faster in the zone "above" the
bitmaps.

Thanks,
-Stolee


Re: [RFC/WIP] range-diff: show old/new blob OIDs in comments

2019-10-22 Thread Konstantin Ryabitsev

On Tue, Oct 22, 2019 at 09:18:35PM +0200, Johannes Schindelin wrote:

As to recreating blobs from mails: Wow. That's quite a length you're
going, and I think it is a shame that you have to. If only every
contribution came accompanied with a pullable branch in a public
repository.


Trouble is, those public repositories are transient and may be gone soon 
after the pull request is performed. The goal is to be able to 
reconstruct full code history prior to when it is merged into the 
official repository.



Instead, we will have to rely on your centralized, non-distributed
service...


It's actually neither, because mirroring and distributing public-inbox 
repositories is already easy, and we hope to make it easier in the near 
future.


-K


Re: [RFC/WIP] range-diff: show old/new blob OIDs in comments

2019-10-22 Thread Johannes Schindelin
Hi Eric,


On Thu, 17 Oct 2019, Eric Wong wrote:

> (WIP, mostly stream-of-concious notes + reasoning)
>
> When using "git format-patch --range-diff", the pre and
> post-image blob OIDs are in each email, while the exact
> commit OIDs are rarely shared via emails (only the tip
> commit from "git request-pull").
>
> These blob OIDs make it easy to search for or lookup the
> full emails which create them, or the blob itself once
> it's fetched via git.
>
> public-inbox indexes and allows querying specifically for blob
> OIDs via dfpre:/dfpost: since June 2017.  As of Jan 2019,
> public-inbox also supports recreating blobs out of patch emails
> (querying internally with dfpre:/dfpost: and doing "git apply")
>
> Searching on these blob OIDs also makes it easier to find
> previous versions of the patch sets using any mail search
> engine.
>
> Future changes to public-inbox may allow generating custom
> diffs out of any blobs it can find or recreate.
>
> Most of this is pretty public-inbox-specific and would've
> made some future changes to public-inbox much easier
> (if we had this from the start of range-diff).
>
> Unfortunately, it won't help with cases where range-diffs
> are already published, but range-diff isn't too old.

I guess your patch won't hurt.

As to recreating blobs from mails: Wow. That's quite a length you're
going, and I think it is a shame that you have to. If only every
contribution came accompanied with a pullable branch in a public
repository.

Instead, we will have to rely on your centralized, non-distributed
service...

Ciao,
Dscho

>
> I'm also still learning my way around git's C internals, but
> using patch.{old,new}_oid_prefix seems alright...
>
> FIXME: tests, t3206 needs updating
>
> Not-signed-off-by: Eric Wong 
> ---
>  range-diff.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/range-diff.c b/range-diff.c
> index 7fed5a3b4b..85d2f1f58f 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -118,13 +118,24 @@ static int read_patches(const char *range, struct 
> string_list *list)
>   die(_("could not parse git header '%.*s'"), 
> (int)len, line);
>   strbuf_addstr(&buf, " ## ");
>   if (patch.is_new > 0)
> - strbuf_addf(&buf, "%s (new)", patch.new_name);
> + strbuf_addf(&buf, "%s (new %s)",
> + patch.new_name,
> + patch.new_oid_prefix);
>   else if (patch.is_delete > 0)
> - strbuf_addf(&buf, "%s (deleted)", 
> patch.old_name);
> + strbuf_addf(&buf, "%s (deleted %s)",
> + patch.old_name,
> + patch.old_oid_prefix);
>   else if (patch.is_rename)
> - strbuf_addf(&buf, "%s => %s", patch.old_name, 
> patch.new_name);
> + strbuf_addf(&buf, "%s => %s (%s..%s)",
> + patch.old_name,
> + patch.new_name,
> + patch.old_oid_prefix,
> + patch.new_oid_prefix);
>   else
> - strbuf_addstr(&buf, patch.new_name);
> + strbuf_addf(&buf, "%s (%s..%s)",
> + patch.new_name,
> + patch.old_oid_prefix,
> + patch.new_oid_prefix);
>
>   free(current_filename);
>   if (patch.is_delete > 0)
>
>


Re: [PATCH 2/2] merge-recursive: fix merging a subdirectory into the root directory

2019-10-22 Thread Elijah Newren
Sorry for the long delay before getting back to this; the other stuff
I was working on took longer than expected.

On Mon, Oct 14, 2019 at 3:42 AM Johannes Schindelin
 wrote:
> On Sat, 12 Oct 2019, Elijah Newren wrote:
> > On Sat, Oct 12, 2019 at 1:37 PM Johannes Schindelin
> >  wrote:
> > >
> > > For the record: I am still a huge anti-fan of splitting `setup` test
> > > cases from the test cases that do actual things, _unless_ it is
> > > _one_, and _only one_, big, honking `setup` test case that is the
> > > very first one in the test script.
[...]
> > The one thing I do agree with you on is test cases need to be
> > optimized for when they report breakages, but that is precisely what
> > led me to splitting setup and testing.
>
> To me, it is so not helpful _not_ to see the output of a `setup` that
> succeeded, and only the output of the actual test that actually failed.
>
> It removes context.
>
> I need to understand the scenario where the breakage happens, and the
> only way I can understand is when I understand the context.
>
> So the context needs to be as close as possible.

I've updated the patch series with a change that I hope helps while
still allowing the setup "steps" to be visibly differentiated from the
testing steps.

> > Way too many tests in the testsuite intersperse several setup and test
> > cases together making it really hard to disentangle, understand what
> > is going on, or even reverse engineer what is relevant.  The absolute
> > worst tests are the ones which just keep making additional changes to
> > some existing repo to provide extra setup, causing all sorts of
> > problems for skipping and resuming and understanding (to say nothing
> > of test prerequisites that aren't always met).
>
> I agree with this sentiment, and have to point out that this is yet
> another fallout of the way our test suite is implemented. If you look
> e.g. at JUnit, there are no "setup test cases". There are specific setup
> steps that you can define, there is even a teardown step you can define,
> and those individual test cases? They can run in parallel, or
> randomized, and they run in their own sandbox, to make sure that they
> don't rely on side effects of unrelated test cases.
>
> We don't do that. We don't enforce the absence of side effects, and
> therefore we have a megaton of them.
>
> But note how this actually speaks _against_ separating the setup from
> the test? Because then you _explicitly_ want those test cases to rely on
> one another. Which flies in the _face_ of trying to disentangling them.

I agree that it is desirable to avoid side effects in the tests, but
I'd like to point out that I'm not at all sure that your conclusion
here is the only logical one to draw here in comparing to JUnit.  As
you point out, JUnit has clearly delineated setup steps for a test (as
well as teardown steps), providing a place to keep them separate.  Our
testsuite lacks that, so how do folks try to get it?  One logical way
would be just inlining the setup steps in the test outside a
test_expect_* block (which has been done in the past), but that has
even worse problems.  Another way, even if suboptimal, is placing
those steps in their own test_expect_* block.  You say just throw the
setup and test together, but that breaks the separation.

I think it's a case of the testsuite not providing the right
abstractions and enough capability, leaving us to argue over which
aspects of a more featureful test harness are most important to
emulate.  You clearly picked one, while I was focusing on another.
Anyway, all that said, I think I have a nice compromise that I'll send
out with V2.

[...]
> > > Makes sense, but the part that I am missing is
> > >
> > > test_path_is_file bar.c.t
> > >
> > > I.e. the _most_ important outcome of this test is: the rename was
> > > detected, and the added file was correctly placed into the target
> > > directory of the rename.
> >
> > That's a useful thing to add to the test, I'll add it.  (It's kind of
> > included in the 'git hash-object bar.c.t' half a dozen or so lines
> > above, but this line helps document the expectation a bit better.)
> >
> > I'd be very reticent to include only this test_path_is_file check, as
> > it makes no guarantee that it has the right contents or that we didn't
> > also keep around another copy in a/b/bar.c.t, etc.
>
> I agree that it has to strike a balance. There are multiple aspects you
> need to consider:
>
> - It needs to be easy to understand what the test case tries to ensure.
>
> - While it is important to verify that Git works correctly, you do not
>   want to make the test suite so slow as to be useless. I vividly
>   remember how Duy mentioned that he does not have the time to run the
>   test suite before contributing. That's how bad things are _already_.

On this issue I'm having a harder time finding common ground.  Perhaps
there's something clever to be done, but I'm not seeing it.  I can at
least try to explain my pe

Re: email as a bona fide git transport

2019-10-22 Thread Eric Wong
Vegard Nossum  wrote:
> I sent v2 of the patches (with metadata _after_ the diff) to the git
> list here:
> 
> https://public-inbox.org/git/20191022114518.32055-1-vegard.nos...@oracle.com/T/#u
> 
> As I wrote in there, we could already today start using
> 
>git am --message-id
> 
> when applying patches and this would provide something that a bot could
> annotate with git notes pointing to lore/LKML/LWN/whatever. I think that
> would already be a pretty nice improvement over today's situation.
> 
> Sadly, since the beginning of 2018, this was only used for a measly
> ~0.14% of all non-merge commits in the kernel:

--message-id helps provide a concrete reference, yes.  However,
being able to search for commit subjects in the mail archives is
already implemented via cgit filter.  An example is here:

https://80x24.org/mirrors/git.git/commit/?id=8da56a484800023a545d7a7c022473f5aa9e720f

The link at "userdiff: fix some corner cases in dts regex" makes a link to:

https://public-inbox.org/git/?x=t&q=%22userdiff:+fix+some+corner+cases+in+dts+regex%22
(side note: not sure if that "x=t" to expand the whole message is good...)

That link is generated by examples/cgit-commit-filter.lua in the
 public-inbox source:

https://public-inbox.org/meta/1677253/s/?b=examples/cgit-commit-filter.lua

My longer term plan is to be able to use the post-image blob OIDs
from cgit to generate a search query for public-inbox such as:

https://public-inbox.org/git/?q=dfpost:afc6b5b404+dfpost:072d58b69d+dfpost:4353b8220c+dfpost:333a625c70+dfpost:e187d356f6

Which finds all versions of the userdiff patch posted.  But AFAIK
there's no easy way to get at blob OIDs from cgit to a Lua filter...


Re: [PATCH v2 3/9] ewah/bitmap: introduce bitmap_word_alloc()

2019-10-22 Thread Jonathan Tan
> From: Jeff King 
> 
> In a following commit we will need to allocate a variable
> number of bitmap words, instead of always 32, so let's add
> bitmap_word_alloc() for this purpose.
> 
> We will also always access at least one word for each bitmap,
> so we want to make sure that at least one is always
> allocated.

The last paragraph is not true - we still can allocate 0. (We just ensure
that when we grow, we grow to at least 1.) I think we should just
delete the last paragraph - the first paragraph is sufficient.

Other than that, all patches up to but not including the last one look
good, except the ones that just add a new function because I'll have to
review the last patch to see how they are used.


Re: email as a bona fide git transport

2019-10-22 Thread Vegard Nossum



On 10/22/19 3:53 PM, Theodore Y. Ts'o wrote:

On Tue, Oct 22, 2019 at 02:11:22PM +0200, Vegard Nossum wrote:


As I wrote in there, we could already today start using

   git am --message-id

when applying patches and this would provide something that a bot could
annotate with git notes pointing to lore/LKML/LWN/whatever. I think that
would already be a pretty nice improvement over today's situation.

Sadly, since the beginning of 2018, this was only used for a measly
~0.14% of all non-merge commits in the kernel:

$ git rev-list --count --no-merges --since='2018-01-01' --grep 'Message-Id:
' linus/master
178


You might also want to count commits which have a link tag with a
Message-Id:

Link: 
https://lore.kernel.org/r/c3438dad66a34a7d4e7509a5dd64c2326340a52a.1571647180.git.mbobrow...@mbobrowski.org

That's because some kernel developers have been using a hook script like this:

#!/bin/sh
# For .git/hooks/applypatch-msg
#
# You must have the following in .git/config:
# [am]
#   messageid = true

. git-sh-setup
perl -pi -e 's|^Message-Id:\s*]+)>?$|Link: https://lore.kernel.org/r/$1|g;' 
"$1"
test -x "$GIT_DIR/hooks/commit-msg" &&
exec "$GIT_DIR/hooks/commit-msg" ${1+"$@"}
:

 as we had reached rough consensus that this was the best way to
incorprate the message id (since it could made to be a clickable link
in tools like gitk, for example).  This rough consensus has only been
in place since around the time of the Maintainer's Summit in Lisbon,
so uptake is still probably a bit slow.  I'd expect to see a lot more
of this in the next merge window, though.


Thanks, I was not aware of this!

Seems like something that should go in Documentation/maintainer/,
right?

The figure is much better, 16.7% on all non-merges since 2018-01-01.
This should help and we can maybe already do some interesting things
with git notes and lore/public-inbox.


Vegard


Re: [PATCH v3 2/2] git_path(): handle `.lock` files correctly

2019-10-22 Thread SZEDER Gábor
On Mon, Oct 21, 2019 at 09:54:42PM +, Johannes Schindelin via GitGitGadget 
wrote:
> From: Johannes Schindelin 
> 
> Ever since worktrees were introduced, the `git_path()` function _really_
> needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> specific to the worktree, and therefore so is its reflog). However, the
> wrong path is returned for `logs/HEAD.lock`.
> 
> This does not matter as long as the Git executable is doing the asking,
> as the path for that `logs/HEAD.lock` file is constructed from
> `git_path("logs/HEAD")` by appending the `.lock` suffix.
> 
> However, Git GUI just learned to use `--git-path` instead of appending
> relative paths to what `git rev-parse --git-dir` returns (and as a
> consequence not only using the correct hooks directory, but also using
> the correct paths in worktrees other than the main one). While it does
> not seem as if Git GUI in particular is asking for `logs/HEAD.lock`,
> let's be safe rather than sorry.
> 
> Side note: Git GUI _does_ ask for `index.lock`, but that is already
> resolved correctly, due to `update_common_dir()` preferring to leave
> unknown paths in the (worktree-specific) git directory.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  path.c   |  8 +++-
>  t/t1500-rev-parse.sh | 15 +++
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/path.c b/path.c
> index e3da1f3c4e..5ff64e7a8a 100644
> --- a/path.c
> +++ b/path.c
> @@ -349,10 +349,16 @@ static int check_common(const char *unmatched, void 
> *value, void *baton)
>  static void update_common_dir(struct strbuf *buf, int git_dir_len,
> const char *common_dir)
>  {
> - char *base = buf->buf + git_dir_len;
> + char *base = buf->buf + git_dir_len, *base2 = NULL;
> +
> + if (ends_with(base, ".lock"))
> + base = base2 = xstrndup(base, strlen(base) - 5);

Hm, this adds the magic number 5 and calls strlen(base) twice, because
ends_with() calls strip_suffix(), which calls strlen().  Calling
strip_suffix() directly would allow us to avoid the magic number and
the second strlen():

  size_t len;
  if (strip_suffix(base, ".lock", &len))
  base = base2 = xstrndup(base, len);

>   init_common_trie();
>   if (trie_find(&common_trie, base, check_common, NULL) > 0)
>   replace_dir(buf, git_dir_len, common_dir);
> +
> + free(base2);
>  }
>  
>  void report_linked_checkout_garbage(void)
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 01abee533d..d318a1eeef 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -116,6 +116,21 @@ test_expect_success 'git-path inside sub-dir' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'git-path in worktree' '
> + test_tick &&
> + git commit --allow-empty -m empty &&
> + git worktree add --detach wt &&
> + test_write_lines >expect \
> + "$(pwd)/.git/worktrees/wt/logs/HEAD" \
> + "$(pwd)/.git/worktrees/wt/logs/HEAD.lock" \
> + "$(pwd)/.git/worktrees/wt/index" \
> + "$(pwd)/.git/worktrees/wt/index.lock" &&
> + git -C wt rev-parse >actual \
> + --git-path logs/HEAD --git-path logs/HEAD.lock \
> + --git-path index --git-path index.lock &&
> + test_cmp expect actual
> +'

I think this test better fits into 't0060-path-utils.sh': it already
checks 'logs/HEAD' and 'index' in a different working tree (well, with
GIT_COMMON_DIR set, but that's the same), and it has a helper function
to make each of the two new .lock tests a one-liner.

>  test_expect_success 'rev-parse --is-shallow-repository in shallow repo' '
>   test_commit test_commit &&
>   echo true >expect &&
> -- 
> gitgitgadget


Re: [git-for-windows] Git for Windows v2.24.0-rc0, was Re: [ANNOUNCE] Git v2.24.0-rc0

2019-10-22 Thread Philip Oakley

Hi Dscho,

Install went Ok.

Did a quick test on the config locations and `git config -l 
-show-origin` has 'lost' the ProgramData location as planned.


The minor pedant did notice that the new location is listed slightly 
differently from the release notes.

`file:C:/Program Files/Git/mingw64/../etc/gitconfig`  --system,
while the release notes simplify the path to C:/Program 
Files/Git/etc/gitconfig


Dunno if that's worth a minor fix to the release notes to clarify for 
the broader Windows community. Maybe others can comment if they think 
it's even worth it.


Philip

On 21/10/2019 23:05, Johannes Schindelin wrote:

Team,

a couple of days later than I wanted, but at least it is now here:
https://github.com/git-for-windows/git/releases/tag/v2.24.0-rc0.windows.1

Please test...

Thank you,
Johannes






Re: [GSoC] Follow-up post

2019-10-22 Thread Christian Couder
Hi Matheus,

On Mon, Oct 21, 2019 at 7:14 PM Matheus Tavares Bernardino
 wrote:
>
> I wrote a small follow-up post to talk about the conclusion of my GSoC
> project. I believe the main remaining tasks are now finally complete
> :) If you would be interested in taking a look, the post is at
> https://matheustavares.gitlab.io/posts/gsoc-follow-ups

Thank you for the follow-up blog post!

It is a great explanation of the work you did to come up with the
current mt/threaded-grep-in-object-store!

Best,
Christian.


Re: email as a bona fide git transport

2019-10-22 Thread Theodore Y. Ts'o
On Tue, Oct 22, 2019 at 02:11:22PM +0200, Vegard Nossum wrote:
> 
> As I wrote in there, we could already today start using
> 
>   git am --message-id
> 
> when applying patches and this would provide something that a bot could
> annotate with git notes pointing to lore/LKML/LWN/whatever. I think that
> would already be a pretty nice improvement over today's situation.
> 
> Sadly, since the beginning of 2018, this was only used for a measly
> ~0.14% of all non-merge commits in the kernel:
> 
> $ git rev-list --count --no-merges --since='2018-01-01' --grep 'Message-Id:
> ' linus/master
> 178

You might also want to count commits which have a link tag with a
Message-Id:

Link: 
https://lore.kernel.org/r/c3438dad66a34a7d4e7509a5dd64c2326340a52a.1571647180.git.mbobrow...@mbobrowski.org

That's because some kernel developers have been using a hook script like this:

#!/bin/sh
# For .git/hooks/applypatch-msg
#
# You must have the following in .git/config:
# [am]
#   messageid = true

. git-sh-setup
perl -pi -e 's|^Message-Id:\s*]+)>?$|Link: 
https://lore.kernel.org/r/$1|g;' "$1"
test -x "$GIT_DIR/hooks/commit-msg" &&
exec "$GIT_DIR/hooks/commit-msg" ${1+"$@"}
:

 as we had reached rough consensus that this was the best way to
incorprate the message id (since it could made to be a clickable link
in tools like gitk, for example).  This rough consensus has only been
in place since around the time of the Maintainer's Summit in Lisbon,
so uptake is still probably a bit slow.  I'd expect to see a lot more
of this in the next merge window, though.

- Ted


Re: [Outreachy] First contribution

2019-10-22 Thread Miriam R.
El lun., 21 oct. 2019 a las 20:35, Emily Shaffer
() escribió:
>
> On Mon, Oct 21, 2019 at 12:39:16PM +0200, Miriam R. wrote:
> > Dear Git developers,
> > I’m an Outreachy applicant, I would like to make my contribution to
> > apply to this Outreachy internship period.
>
> Welcome, Miriam! Good to hear from you.
>
> >
> > I have found this issue tagged as open and goodfirstissue:
> > https://github.com/gitgitgadget/git/issues/230
> >
> > But there is a PR from 4 months ago:
> > https://github.com/gitgitgadget/git/pull/271  and I don't know how to
> > find out if a patch including that change already exists or if it
> > makes sense to do it.
>
> GitGitGadget exists to repackage PRs (which Git project doesn't use)
> into emailed patches (which Git project does use) when the author writes
> /submit on the PR comment chain. In that PR I see Johannes asking for a
> /submit, but no submit; next I would check if a patch with the same
> title came through in the mailing list by searching on the
> public-inbox.org mirror:
>
> https://public-inbox.org/git/?q=is_directory+dir_exists
>
> Looks like, no, a patch with those hotwords wasn't mailed. Finally, I
> would check the project to see if it's still an issue:
>
>   $ cd my-git-dir/
>   $ git grep is_directory
>
> I still see 30 instances of is_directory in the codebase, so looks like
> we haven't made this change. :)
>

Thank you Emily!! Then I'll do this issue #230 :)

> >
> > In case this issue is not suitable for my first contribution,  I have
> > also found this:
> > https://github.com/gitgitgadget/git/issues/379
>
> This is also a fine change if you want to make it.
>
> Good luck, and remember it's fine to ask the mentor for the project you
> ultimately want to help on for help, code review in advance, etc.

Thank you for the advice!, I'm already in touch with Christian Couder.

Best,
Miriam

>
>  - Emily


Re: email as a bona fide git transport

2019-10-22 Thread Vegard Nossum

On 10/20/19 8:28 AM, Vegard Nossum wrote:


On 10/20/19 5:17 AM, Willy Tarreau wrote:

On Fri, Oct 18, 2019 at 03:14:56PM -0400, Theodore Y. Ts'o wrote:

On Fri, Oct 18, 2019 at 06:50:51PM +0200, Vegard Nossum wrote:

The problem I ran into with putting the metadata at the end was
detecting where the diff ends. A comment in 'git apply' suggested that
detecting the difference between "--" as a diff/signature separator and
as part of the diff is nontrivial in the sense that you need to 
actually

do some parsing and keep track of hunk sizes.


Could we cheat by having "git format-patch" add a "Diff-size" in the
header which gives the number of lines in the diff so git am can just
count lines to find the Trailer section?


Be careful with this, it starts like this and ends up with non-editable
patches. I'd rather have git-am use best-effort detection of the end.


Expect filesystem developers to come up with a format that uses extents ;-)


Also when dealing with stable backports, I've done a lot of
"cat foo.diff >> bar.patch" to fixup some patches in which I just had
to move some parts around. Having to count lines and edit a counter
somewhere is going to become really painful.


I almost have some new patches ready for putting the metadata after the
patch using a very bare-bones diff parser (it's actually not that bad),
I just need to fix a few corner cases that are causing breakage in the
git test suite.


I sent v2 of the patches (with metadata _after_ the diff) to the git
list here:

https://public-inbox.org/git/20191022114518.32055-1-vegard.nos...@oracle.com/T/#u

As I wrote in there, we could already today start using

  git am --message-id

when applying patches and this would provide something that a bot could
annotate with git notes pointing to lore/LKML/LWN/whatever. I think that
would already be a pretty nice improvement over today's situation.

Sadly, since the beginning of 2018, this was only used for a measly
~0.14% of all non-merge commits in the kernel:

$ git rev-list --count --no-merges --since='2018-01-01' --grep 
'Message-Id: ' linus/master

178

$ git rev-list --count --no-merges --since='2018-01-01' linus/master
130777

So how can we spread the word about --message-id and get maintainers to
actually use it? I don't suppose it's reasonable to change the 'git am'
default setting?


Vegard


Re: [RFC PATCH 1/7] Makefile: alphabetically sort += lists

2019-10-22 Thread Junio C Hamano
Jeff King  writes:

>> ...
>> I agree with you that it did correctly sort them in ASCII order.
>
> What's the purpose of sorting them, though? I thought it was less for
> aesthetics and more to to keep lines deterministic (to avoid two
> branches adding the same line in different places, thus causing
> weirdness when the two are merged). In that case, I think we care less
> about the exact order and more that anybody can easily reproduce the
> same sort (by running "10:!sort" or whatever you weird emacs-types would
> type).

In the ideal world, "sort" would have a handy option we can tell it
to reshuffle the ASCII table in such a way that all punctuations
come before alphanumeric, making sure "/" and "." are the first two
letters in the alphabet, and everybody can use it to sort the lines
reproducibly and also readably.  But I do not know of such a widely
used implementation of "sort", so...

If we had known better, we would have used such a custom sort order
to sort the index entries, making sure that slash sorts before any
other byte ;-)


Re: Outreachy Winter 2019

2019-10-22 Thread Christian Couder
Hi Karina,

Please see my answer below.

On Mon, Oct 21, 2019 at 6:59 PM Karina Saucedo
 wrote:
>
> Hello, my name is Karina and I'm and Outreachy applicant.
> I´m interested in applying to the project 'Add did you mean hints´ and
> I was wondering how can I start contributing since there seem to be no
> issues on the github page. Thank you!

Thank you for your introduction email and for your interest in Git!

Emily posted some interesting information on the Git mailing list that
you can see in the archives there:

https://public-inbox.org/git/20191007203654.ga20...@google.com/

We require Outreachy (and GSoC) applicants to work on a microproject
before they can be selected. There are microproject suggestions in
Emily's email and the following discussions.

Unfortunately we only have a web page that we prepared for the last
Google Summer of Code:

https://git.github.io/SoC-2019-Microprojects/

So it might not be up to date but you can still find interesting
information on top of what Emily posted.

Don't hesitate to ask if you have any further questions.

Best,
Christian.


Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-22 Thread Bert Wesarg
On Mon, Oct 21, 2019 at 9:04 PM Pratyush Yadav  wrote:
>
> On 21/10/19 11:16AM, Bert Wesarg wrote:
> > Dear Pratyush,
> >
> > I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> > the stage-list. But I think it should be disabled, like the 'Revert
> > Hunk' and 'Revert Line' menu entry.
>
> I'm not sure what you mean. There is no "Revert Last Hunk" menu entry (I
> assume you are talking about the context menu in the diff view that we
> open by right clicking).
>
> My guess is that you mean the "Undo Last Revert" option. And you want it
> disabled if the diff shown is of a staged file, correct?
>
> I'm not sure if disabling it would be a good idea.
>
> Say I revert a hunk or line while the file is not staged, and stage the
> rest of the file. This would mean that file is no longer in the
> "Unstaged Changes" widget. Now if I choose the file from the "Staged
> Changes" widget, I get the option to undo the last revert. If I hit
> that, it will put whatever I reverted in the "Unstaged Changes" widget.
> So, now part of the file that was reverted is in "Unstaged Changes", and
> the rest in "Unstaged Changes".
>

Sorry for this confusion.

> IIUC, this is what you think should not happen, correct? What's wrong
> with allowing the user to undo reverts from anywhere?

The 'Undo' changes the worktree not the staged content,.

>
> The way I see it, it doesn't really matter what file is selected or
> whether it is staged or not, the action of the undo remains the same, so
> it makes sense to me to allow it from anywhere.

It would make sense to undo the revert on the staged content, if there
are no more changes to this file in the worktree. I.e., you wont be
able to apply the 'undo' anymore to the worktree file if it is not
listed anymore. Though even that case should be able to implement.
Though the undo is currently not bound to a specific path. This may be
the cause of our different view. I though the undo is bound to the
path it was recorded, thus also would only be available when showing a
diff on this path again. Therefore I think, having the 'Undo Last
Revert' in the context menu may not be the best place to begin with,
or at least indicate that this 'undo' operation does not necceseritly
operate on the file currently shown.

Bertt

>
> > Can you confirm this?
> >
> > On Wed, Aug 28, 2019 at 11:57 PM Pratyush Yadav 
> > wrote:
> > >
> > > Accidental clicks on the revert hunk/lines buttons can cause loss of
> > > work, and can be frustrating. So, allow undoing the last revert.
> > >
> > > Right now, a stack or deque are not being used for the sake of
> > > simplicity, so only one undo is possible. Any reverts before the
> > > previous one are lost.
> > >
> > > Signed-off-by: Pratyush Yadav 
>
> --
> Regards,
> Pratyush Yadav


Re: [GIT PULL] arm64: Fixes for -rc4

2019-10-22 Thread Uwe Kleine-König
Hello,

I added the git list to Cc:. For the new readers: The context of this
thread can be found at
https://lwn.net/ml/linux-kernel/20191017234348.wcbbo2njexn7ixpk@willie-the-truck/

On Mon, Oct 21, 2019 at 08:46:58AM +0200, Ingo Molnar wrote:
> Anyway, a small Git feature request: it would be super useful if "git 
> request-pull" output was a bit more dependable and at least warned about 
> this and didn't include what is, from the viewpoint of the person doing 
> the merge, a bogus diffstat. (Generating the correct diffstat is probably 
> beyond request-pull's abilities: it would require changing the working 
> tree to actually perform the merge - while request-pull is a read-only 
> operation right now. But detecting the condition and warning about it 
> should be possible?)

I think Will's case is still an easy one compared with what could
actually happen.

The related history looks as follows:

 ,-. ,-.  ,-.,-.,-.
  v5.4-rc1 --| |-...-| |-- v5.4-rc2 --| |-..-| |-..-| |-- v5.4-rc3
  \  `-' `-'   \  `-'/-'`-'
   \   ,-. ,-.  \ ,-/,-. ,-.
`--| |-...-| ||*|| |-...-|H|
   `-' `-'\   `-'`-' /-'
   \   ,-. ,-.  /
`--| |-...-| |-'
   `-' `-'

Will asked Linus to merge the Commit marked 'H', the two merge bases are
v5.4-rc2 and '*'.

(FTR:
  * = 3e7c93bd04edfb0cae7dad1215544c9350254b8f
  H = 777d062e5bee0e3c0751cdcbce116a76ee2310ec
, they can be found in
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)

The formally correct way to create the diffstat is to merge v5.4-rc2 and
'*' (in general: all merge bases) and calculate the diff between this
merge and the to-be-merged-commit. Compared to what Will did (i.e. merge
Linus' HEAD and this branch and then diff @~ with @) doing it the way I
described has the advantage(?) that commits that conflict with this
merge request in Linus' tree since the merge bases are not in the way.

In this case this can be done automatically:

$ git read-tree --index-output=tralala v5.4-rc2 
3e7c93bd04edfb0cae7dad1215544c9350254b8f
$ GIT_INDEX=tralala git write-tree
6a2acfd1870d9da3c330ea9b648a7e858b5ee39f
$ git diff --stat 6a2acfd1870d9da3c330ea9b648a7e858b5ee39f 
777d062e5bee0e3c0751cdcbce116a76ee2310ec
 Documentation/arm64/silicon-errata.rst |  2 ++
 arch/arm64/Kconfig | 17 ++
 arch/arm64/include/asm/asm-uaccess.h   |  7 +++---
 arch/arm64/include/asm/cpucaps.h   |  4 +++-
 arch/arm64/include/asm/memory.h| 10 ++--
 arch/arm64/include/asm/pgtable.h   |  3 ---
 arch/arm64/include/asm/sysreg.h|  2 +-
 arch/arm64/kernel/cpu_errata.c | 38 
+++
 arch/arm64/kernel/cpufeature.c | 15 
 arch/arm64/kernel/entry.S  |  8 ---
 arch/arm64/kernel/hibernate.c  |  9 +++-
 arch/arm64/kernel/process.c| 18 +++
 arch/arm64/kvm/hyp/switch.c| 69 
++--
 arch/arm64/mm/fault.c  |  6 -
 include/linux/sched.h  |  1 +
 15 files changed, 186 insertions(+), 23 deletions(-)

Would be great if git-request-pull learned to do that.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |


Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-22 Thread Bert Wesarg
On Mon, Oct 21, 2019 at 9:35 PM Johannes Sixt  wrote:
>
> Am 21.10.19 um 11:16 schrieb Bert Wesarg:
> > Dear Pratyush,
> >
> > I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> > the stage-list. But I think it should be disabled, like the 'Revert
> > Hunk' and 'Revert Line' menu entry.
> >
> > Can you confirm this?
>
> Technically, it need not be disabled because the hunk being reverted
> does not depend on the contents of any of diffs that can be shown.
>
> The entry should be disabled if reverting the stored hunk fails. But to
> know that, it would have to be tried: the file could have been edited
> since the hunk was generated so that the reversal of the hunk fails.

But the "Undo" changes the worktree not the stage, sure it indirectly
also changes the view of the staged content, but that is only a
secondary effect. As I only can revert in the worktree list, I think
we should be consistent and also only allow to undo the revert in the
worktree list.

And I think it is independent of 'does the undo apply at all' question.

Bert

>
> Not sure what to do, though.
>
> -- Hannes


Re: [ANNOUNCE] Git v2.24.0-rc0

2019-10-21 Thread Jeff King
On Mon, Oct 21, 2019 at 04:04:22PM -0700, Elijah Newren wrote:

> > 4211.3: git log --follow [...]8.56(8.41+0.15) -0.2%   3.67(3.53+0.13) 
> > -57.2%
> 
> Many nice speedups here, not just commit-graph (the rev-list cases)
> but also log -L (from sg/line-log-tree-diff-optim, I believe), and log
> --follow.  I'm curious if the log --follow speedup comes from sg's
> series or something else...

The "log --follow" speedup comes from turning on commit-graphs. You can
see a similar effect without "--follow", since "git log " is going
to be dominated by the commit traversal, and not accessing the trees
(especially if  is at the top-level).

> > 0001.9: rev-list --objects $commit --not --all
> > 0.08(0.05+0.03)   0.08(0.05+0.03) +0.0%   0.09(0.07+0.02) +12.5%
> 
> Looks like this one increased too, with a similar magnitude to the
> 7300.2 you pointed out.  But the base is kinda small; is this just
> noise?

Probably. I also frequently run the perf suite between major version
releases, and this kind of noise is quite common.

> I'm also curious what change it was that made these rebase tests faster.

Perf changes of this magnitude are usually pretty easy to bisect. You
can even do:

  git bisect start --term-old=slow --term-new=fast

so you don't have to confuse yourself with opposite good/bad markers.
I just ran:

  make && (cd t/perf && GIT_SKIP_TESTS=p3400.[3456] ./p3400*)

at each stopping point and eyeballed the resulting time (which for some
reason seems to be about 10x faster than Stolee's machine, but does
still show the same relative speedup). I was surprised that this also
yields 31b1de6a09 (commit-graph: turn on commit-graph by default,
2019-08-13). I wouldn't have expected commit access to dominate the
rebase time so much.

-Peff


Re: [ANNOUNCE] Git v2.24.0-rc0

2019-10-21 Thread Elijah Newren
 On Mon, Oct 21, 2019 at 1:50 PM Derrick Stolee  wrote:
> I ran a few of the performance tests against the Linux repository
> using v2.22.0, v2.23.0, and the new v2.24.0-rc0. I thought it worth
> pointing out that the drastic performance improvements are due to
> turning on the commit-graph by default. I had computed a commit-graph
> for my Linux repo, but used my global config to enable core.commitGraph.
> The global config is ignored by perf tests, so v2.22.0 and v2.23.0 were
> operating without looking at the commit-graph.
>
> (These were run on my old dev machine, which is now running Ubuntu on
> bare metal. No VM this time!)
>
> Test  v2.22.0 
>   v2.23.0 v2.24.0-rc0
> 
> 0001.1: rev-list --all6.01(5.73+0.28) 
>   5.99(5.73+0.25) -0.3%   0.97(0.80+0.16) -83.9%
> 0001.2: rev-list --all --objects  
> 40.40(39.86+0.54) 40.22(39.59+0.62) -0.4% 35.28(34.75+0.52) -12.7%
> 0001.3: rev-list --parents6.11(5.83+0.27) 
>   6.07(5.82+0.25) -0.7%   1.03(0.86+0.16) -83.1%
> 0001.5: rev-list -- dummy 0.64(0.58+0.06) 
>   0.66(0.59+0.07) +3.1%   0.34(0.29+0.05) -46.9%
> 0001.6: rev-list --parents -- dummy   0.66(0.60+0.05) 
>   0.67(0.62+0.05) +1.5%   0.36(0.32+0.03) -45.5%
[...]
> 4211.2: git rev-list --topo-order (baseline)  6.32(6.04+0.28) 
>   6.30(6.09+0.21) -0.3%   1.15(0.96+0.19) -81.8%
> 4211.3: git log --follow (baseline for -M)8.58(8.43+0.14) 
>   8.56(8.41+0.15) -0.2%   3.67(3.53+0.13) -57.2%
> 4211.4: git log -L (renames off)  
> 32.79(30.68+2.10) 32.80(30.69+2.11) +0.0% 27.17(25.24+1.93) -17.1%
> 4211.5: git log -L (renames on)   
> 212.64(210.39+2.24)   213.48(211.26+2.20) +0.4%   27.38(25.53+1.84) -87.1%

Many nice speedups here, not just commit-graph (the rev-list cases)
but also log -L (from sg/line-log-tree-diff-optim, I believe), and log
--follow.  I'm curious if the log --follow speedup comes from sg's
series or something else...

> 0001.9: rev-list --objects $commit --not --all0.08(0.05+0.03) 
>   0.08(0.05+0.03) +0.0%   0.09(0.07+0.02) +12.5%

Looks like this one increased too, with a similar magnitude to the
7300.2 you pointed out.  But the base is kinda small; is this just
noise?

> The tests below are some that I don't run very often, but seemed
> interesting. Interesting that rebase got a lot faster!
>
> Testv2.22.0   
> v2.23.0 v2.24.0-rc0
> ---
> 3400.2: rebase on top of a lot of unrelated changes 
> 18.86(17.80+1.71) 18.80(17.80+1.66) -0.3% 2.63(2.49+0.79) -86.1%
> 3400.4: rebase a lot of unrelated changes without split-index   
> 68.00(62.32+5.04) 68.50(62.34+5.30) +0.7% 45.25(41.37+4.18) -33.5%
> 3400.6: rebase a lot of unrelated changes with split-index  
> 46.39(44.89+2.19) 46.24(44.66+2.30) -0.3% 25.00(24.49+1.23) -46.1%

I'm also curious what change it was that made these rebase tests faster.

> 7300.2: clean many untracked sub dirs, check for nested git 
> 1.36(0.54+0.81)   1.35(0.51+0.82) -0.7%   1.53(0.62+0.90) +12.5%
[...]
> Any thoughts on 7300.2? Seems to not just be noise, or maybe it is?

Well, en/clean-nested-with-ignored is a very likely the cause of any
performance difference here, but given the nasty bug it was fixing
(see sg/clean-nested-repo-with-ignored topic), the performance change
is totally warranted if necessary for the fix.  And it looks like that
test is exercising one of the areas of logic that my series was
modifying (namely the clean -fd case in conjunction with the
possibility of nested .git dirs).

That's enough for me to accept the performance change.  If soemone
else wants to dig a little further to determine whether this perf
change was part of the important fix or just due to a separate change,
I'll provide a few pointers.  Assuming it's one of my commits, I think
it has to be one of the following three:

404ebceda01c ("dir: also check directories for matching pathspecs",
2019-09-17): if this one causes the perf change, I think we just suck
it up.

89a1f4aaf765 ("dir: if our pathspec might match files under a dir,
recurse into it", 2019-09-17): if this one causes the perf change, we
might be able to do something by somehow rearranging the if-block
logic.  Checking bits is going to be faster than calling the
get_dtype() f

Git for Windows v2.24.0-rc0, was Re: [ANNOUNCE] Git v2.24.0-rc0

2019-10-21 Thread Johannes Schindelin
Team,

a couple of days later than I wanted, but at least it is now here:
https://github.com/git-for-windows/git/releases/tag/v2.24.0-rc0.windows.1

Please test...

Thank you,
Johannes




Re: [PATCH 5/5] path.c: don't call the match function without value in trie_find()

2019-10-21 Thread SZEDER Gábor
On Mon, Oct 21, 2019 at 06:00:43PM +0200, SZEDER Gábor wrote:
> 'logs/refs' is not a working tree-specific path, but since commit
> b9317d55a3 (Make sure refs/rewritten/ is per-worktree, 2019-03-07)
> 'git rev-parse --git-path' has been returning a bogus path if a
> trailing '/' is present:
> 
>   $ git -C WT/ rev-parse --git-path logs/refs --git-path logs/refs/
>   /home/szeder/src/git/.git/logs/refs
>   /home/szeder/src/git/.git/worktrees/WT/logs/refs/
> 
> We use a trie data structure to efficiently decide whether a path
> belongs to the common dir or is working tree-specific.  As it happens
> b9317d55a3 triggered a bug that is as old as the trie implementation
> itself, added in 4e09cf2acf (path: optimize common dir checking,
> 2015-08-31).
> 
>   - According to the comment describing trie_find(), it should only
> call the given match function 'fn' for a "/-or-\0-terminated
> prefix of the key for which the trie contains a value".  This is
> not true: there are three places where trie_find() calls the match
> function, but one of them is missing the check for value's
> existence.
> 
>   - b9317d55a3 added two new keys to the trie: 'logs/refs/rewritten'
> and 'logs/refs/worktree', next to the already existing
> 'logs/refs/bisect'.  This resulted in a trie node with the path
> 'logs/refs', which didn't exist before, and which doesn't have a

Oops, I missed the trailing slash, that must be 'logs/refs/'!

> value attached.  A query for 'logs/refs/' finds this node and then
> hits that one callsite of the match function which doesn't check
> for the value's existence, and thus invokes the match function
> with NULL as value.


Re: [ANNOUNCE] Git v2.24.0-rc0

2019-10-21 Thread Derrick Stolee
(dropping some of the other aliases from this reply)

I ran a few of the performance tests against the Linux repository
using v2.22.0, v2.23.0, and the new v2.24.0-rc0. I thought it worth
pointing out that the drastic performance improvements are due to
turning on the commit-graph by default. I had computed a commit-graph
for my Linux repo, but used my global config to enable core.commitGraph.
The global config is ignored by perf tests, so v2.22.0 and v2.23.0 were
operating without looking at the commit-graph.

(These were run on my old dev machine, which is now running Ubuntu on
bare metal. No VM this time!)

Test  v2.22.0   
v2.23.0 v2.24.0-rc0 

0001.1: rev-list --all6.01(5.73+0.28)   
5.99(5.73+0.25) -0.3%   0.97(0.80+0.16) -83.9%  
0001.2: rev-list --all --objects  40.40(39.86+0.54) 
40.22(39.59+0.62) -0.4% 35.28(34.75+0.52) -12.7%
0001.3: rev-list --parents6.11(5.83+0.27)   
6.07(5.82+0.25) -0.7%   1.03(0.86+0.16) -83.1%  
0001.5: rev-list -- dummy 0.64(0.58+0.06)   
0.66(0.59+0.07) +3.1%   0.34(0.29+0.05) -46.9%  
0001.6: rev-list --parents -- dummy   0.66(0.60+0.05)   
0.67(0.62+0.05) +1.5%   0.36(0.32+0.03) -45.5%  
0001.8: rev-list $commit --not --all  0.03(0.02+0.01)   
0.03(0.02+0.01) +0.0%   0.03(0.02+0.01) +0.0%   
0001.9: rev-list --objects $commit --not --all0.08(0.05+0.03)   
0.08(0.05+0.03) +0.0%   0.09(0.07+0.02) +12.5%  
0002.1: read_cache/discard_cache 1000 times   3.33(3.07+0.25)   
3.32(3.05+0.27) -0.3%   3.31(3.10+0.21) -0.6%   
0005.2: read-tree status br_ballast (65695)   0.48(0.42+0.23)   
0.46(0.43+0.19) -4.2%   0.49(0.48+0.18) +2.1%   
0006.2: read-tree br_base br_ballast (65695)  0.17(0.14+0.03)   
0.17(0.15+0.01) +0.0%   0.18(0.16+0.02) +5.9%   
0006.3: switch between br_base br_ballast (65695) 7.17(5.35+2.10)   
6.84(5.14+2.04) -4.6%   6.67(5.07+1.88) -7.0%   
0006.4: switch between br_ballast br_ballast_plus_1 (65695)   0.35(0.37+0.29)   
0.34(0.32+0.33) -2.9%   0.35(0.33+0.34) +0.0%   
0006.5: switch between aliases (65695)0.33(0.34+0.31)   
0.32(0.34+0.30) -3.0%   0.34(0.34+0.32) +3.0%   
0007.2: write_locked_index 3 times (65695 files)  0.17(0.16+0.01)   
0.17(0.15+0.02) +0.0%   0.18(0.16+0.01) +5.9%   
0071.2: sort(1)   6.01(17.44+1.78)  
6.05(17.44+1.69) +0.7%  5.85(17.07+1.69) -2.7%  
0071.3: string_list_sort()15.81(14.94+0.87) 
15.80(14.83+0.97) -0.1% 15.86(15.03+0.83) +0.3% 
4205.1: log with %H   6.46(6.14+0.31)   
6.42(6.15+0.26) -0.6%   5.95(5.71+0.24) -7.9%   
4205.2: log with %h   7.06(6.74+0.32)   
7.02(6.75+0.27) -0.6%   6.47(6.26+0.21) -8.4%   
4205.3: log with %T   6.43(6.13+0.29)   
6.41(6.10+0.30) -0.3%   6.24(5.95+0.29) -3.0%   
4205.4: log with %t   7.29(7.04+0.24)   
7.28(6.96+0.32) -0.1%   7.05(6.78+0.27) -3.3%   
4205.5: log with %P   6.44(6.19+0.24)   
6.51(6.19+0.31) +1.1%   5.95(5.69+0.25) -7.6%   
4205.6: log with %p   7.03(6.70+0.33)   
7.01(6.74+0.27) -0.3%   6.52(6.26+0.26) -7.3%   
4205.7: log with %h-%h-%h 7.79(7.57+0.22)   
7.79(7.55+0.24) +0.0%   7.10(6.84+0.26) -8.9%   
4211.2: git rev-list --topo-order (baseline)  6.32(6.04+0.28)   
6.30(6.09+0.21) -0.3%   1.15(0.96+0.19) -81.8%  
4211.3: git log --follow (baseline for -M)8.58(8.43+0.14)   
8.56(8.41+0.15) -0.2%   3.67(3.53+0.13) -57.2%  
4211.4: git log -L (renames off)  32.79(30.68+2.10) 
32.80(30.69+2.11) +0.0% 27.17(25.24+1.93) -17.1%
4211.5: git log -L (renames on)   
212.64(210.39+2.24)   213.48(211.26+2.20) +0.4%   27.38(25.53+1.84) -87.1%
4211.6: git log --oneline --raw --parents 46.41(45.97+0.43) 
46.34(45.81+0.52) -0.2% 46.03(45.46+0.57) -0.8% 
4211.7: git log --oneline --raw --parents -1000   0.09(0.07+0.01)   
0.09(0.09+0.00) +0.0%   0.09(0.08+0.00) +0.0%   

The tests below are some that I don't run very often, but seemed
interesting. Interesting that rebase g

Re: [Git Developer Blog] [PATCH] post: a tour of git's object types

2019-10-21 Thread Derrick Stolee
On 10/18/2019 8:20 PM, Emily Shaffer wrote:
> An overview of what Git object types mean and how they loosely translate
> into filesystem types users are already familiar with is a good start to
> making Git's internals less scary to users. This post is an interactive
> overview of the various types, demonstrating subcommands which show what
> the objects look like and how their names are generated.
> 
> This is related to https://gitlab.com/git-scm/blog/issues/15
> 
> Signed-off-by: Emily Shaffer 
> ---
> Hi all,
> 
> In the hopes of getting some more momentum on the developer blog, here's
> a crosspost from my personal blog some months ago, targeted for the Git
> Developer Blog (as discussed in the Virtual Contributor Summit and
> on-list). During those conversations I emphasized my wish to make sure
> posts on this developer blog are vetted by the Git development community
> - to that end, the textual contents of this blog post are being sent to
> the vger.kernel.org list in their entirety. Feel free to provide
> comments here, or on the merge request in GitLab:
> https://gitlab.com/git-scm/blog/merge_requests/4
> Or, if you have another idea about how you'd like this review process to
> look, we may as well discuss it on this patch too.
> 
> I hope this post also shows what I hoped to achieve with the Git
> Developer Blog - in-depth, accurate information presented in a casual
> tone which helps users better understand Git.
> 
> At this time I've simply copied the blog post verbatim from my personal
> blog; I didn't do a lot of review on it because I was hoping to focus on
> the process of getting posts reviewed and accepted to the GitLab repo.
> It's probable that the tone is actually more conversational than we want
> for a developer blog, and the post itself didn't go through any kind of
> peer review, so I welcome comments on any and all aspects of the post.

Thanks for getting the conversation started (again)! I've got a post that
I've been tinkering with for some time now, and you gave me the motivation
to actually finish it.

> +Naming is one of the hard problems in computer science, right? It's hard for
> +Git developers too. One of the more arcane concepts in Git - object
> +reachability - becomes simpler to understand with a little bit of naming
> +indirection.
> +
> +Reachability is an important concept in Git. It's how the server determines
> +what objects you need in order to get it up to what the server itself knows.
> +It's also how merges and rebases work. With all this big stuff riding on
> +reachability, it seems intimidating to try to understand - but it turns out 
> if
> +we give it a slightly simpler name, things become a little clearer.
> +
> +## Git's four object types
> +
> +Under the covers, Git is mostly a directed graph of objects. Those objects 
> come
> +in four flavors; from root to leaf (generally), those flavors are:
> +
> +- Tag
> +- Commit
> +- Tree
> +- Blob
> +
> +We'll take a closer look in the opposite order, though.
> +
> +# Blob
> +
> +Surprise! It's a file. Well, kind of - it can also be a symlink to a file - 
> but
> +this is the most atomic type of object. We'll explore these a little more 
> later,
> +but really, it's just a file.

When I teach people about blobs [1], I take special care to point out that
a blob is only the file _content_. It does not actually store any information 
about
the filename or permissions.

It could help to describe an example: maybe `git cat-file -p HEAD:README` for a
well-known repo? I'm using "HEAD:" here because it is easier to understand
where the file reference comes from, but perhaps it would be better to use a
"git rev-parse" and "git cat-file" pair:

$ git rev-parse HEAD:README.md
88f126184c52bfe4859ec189d018872902e02a84

$ git cat-file -p 88f126184c52bfe4859ec189d018872902e02a84
[![Build 
Status](https://dev.azure.com/git/git/_apis/build/status/git.git)](https://dev.azure.com/git/git/_build/latest?definitionId=11)

Git - fast, scalable, distributed revision control system
=

...

[1] https://stolee.dev/docs/git.pptx

> +
> +# Tree
> +
> +A tree references zero or more trees or blobs. Said another way, a tree holds
Perhaps "A tree references blobs and other trees." Saying "zero or more" makes
me get spun up about what it would mean for a tree to have zero entries, which
is not possible in porcelain Git.

> +one or more trees or files. This sounds familiar - basically, a tree is a
> +directory. (Okay, or a symlink to a directory.) It points to more trees
> +(subdirectories) or blobs (files). It can also point to commits in the case
> +of submodules, but that's another conversation.

Here is a great time to mention that filenames happen in a tree. You can
use `git cat-file -p HEAD:docs/` (or something) to show more contents of a tree.

> +By the way, "tree" is one that gets a little sticky, because we also talk 
> about
> +"working tree" as well as "worktree". We'll touch

Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-21 Thread Johannes Schindelin
Hi Junio,

On Fri, 18 Oct 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
>
> > From: Johannes Schindelin 
> >
> > Ever since worktrees were introduced, the `git_path()` function _really_
> > needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> > specific to the worktree). However, the wrong path is returned for
> > `logs/HEAD.lock`.
> >
> > This does not matter as long as the Git executable is doing the asking,
> > as the path for that `index.lock` file is constructed from
> > `git_path("index")` by appending the `.lock` suffix.
>
> Is this still git_path("index") or is it now HEAD?

Darn. It is "logs/HEAD", of course.

> > Side note: Git GUI _does_ ask for `index.lock`, but that is already
> > resolved correctly.
>
> Is that s/but/and/?

It sounds better with an "and", you're right.

> > diff --git a/path.c b/path.c
> > index e3da1f3c4e..ff85692b45 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char 
> > *key, match_fn fn,
> > int result;
> > struct trie *child;
> >
> > -   if (!*key) {
> > +   if (!*key || !strcmp(key, ".lock")) {
>
> We only do strcmp for the tail part at the end of the path, so this
> should probably OK from performance point of view but semantically
> it is not very satisfying to see a special case for a single .suffix
> this deep in the callchain.  I wonder if it is nicer to have the
> higher level callers notice ".lock" or whatever other suffixes they
> care about and ask the lower layer for a key with the suffix
> stripped?

Hmm. The parameter is provided as a `const char *`, so I cannot just
edit it. My first attempt at fixing this resulted in this commit:

-- snip --
path: switch `trie_find()` to a counted string

Before this patch, the `trie_find()` function would be fed a regular,
NUL-terminated string.

However, in the next commit, we want to strip any `.lock` suffix from
the argument, and the argument is provided as a `const char *` (i.e. we
do not want to modify it).

Let's use a string + length pair of parameters instead of a single
NUL-terminated string to open the door for this kind of manipulation.

Signed-off-by: Johannes Schindelin 

diff --git a/path.c b/path.c
index e3da1f3c4e2..b18d552c890 100644
--- a/path.c
+++ b/path.c
@@ -236,7 +236,8 @@ static void *add_to_trie(struct trie *root, const char 
*key, void *value)
return old;
 }

-typedef int (*match_fn)(const char *unmatched, void *data, void *baton);
+typedef int (*match_fn)(const char *unmatched, size_t unmatched_len,
+   void *data, void *baton);

 /*
  * Search a trie for some key.  Find the longest /-or-\0-terminated
@@ -261,22 +262,22 @@ typedef int (*match_fn)(const char *unmatched, void 
*data, void *baton);
  * |-||---|--|
  *
  */
-static int trie_find(struct trie *root, const char *key, match_fn fn,
-void *baton)
+static int trie_find(struct trie *root, const char *key, size_t key_len,
+match_fn fn, void *baton)
 {
int i;
int result;
struct trie *child;

-   if (!*key) {
+   if (!key_len) {
/* we have reached the end of the key */
if (root->value && !root->len)
-   return fn(key, root->value, baton);
+   return fn(key, key_len, root->value, baton);
else
return -1;
}

-   for (i = 0; i < root->len; i++) {
+   for (i = 0; i < key_len && i < root->len; i++) {
/* Partial path normalization: skip consecutive slashes. */
if (key[i] == '/' && key[i+1] == '/') {
key++;
@@ -288,24 +289,26 @@ static int trie_find(struct trie *root, const char *key, 
match_fn fn,

/* Matched the entire compressed section */
key += i;
-   if (!*key)
+   key_len -= i;
+
+   if (!key_len)
/* End of key */
-   return fn(key, root->value, baton);
+   return fn(key, key_len, root->value, baton);

/* Partial path normalization: skip consecutive slashes */
-   while (key[0] == '/' && key[1] == '/')
-   key++;
+   while (key_len > 0 && key[0] == '/' && key[1] == '/')
+   key++, key_len--;

-   child = root->children[(unsigned char)*key];
+   child = root->children[!key_len ? '0' : (unsigned char)*key];
if (child)
-   result = trie_find(child, key + 1, fn, baton);
+   result = trie_find(child, key + 1, key_len - 1, fn, baton);
else
result = -1;

-   if (result >= 0 || (*key != '/' && *key != 0))
+   if (result >= 0 || (key_len && *key != '/'))
return result;
if (root->value)
-   return fn(key, root->value, baton);
+   return fn(k

Re: [RFC PATCH 1/7] Makefile: alphabetically sort += lists

2019-10-21 Thread Jeff King
On Tue, Oct 22, 2019 at 04:49:19AM +0900, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> ... I do not particularly see this change (there may be similar
> >> ones) desirable.  I'd find it it be much more natural to sort
> >> "commit-anything" after "commit", and that is true with or without
> >> the common extension ".o" added to these entries.
> >>
> >> In short, flipping these entries because '.' sorts later than '-' is
> >> making the result look "less sorted", at least to me.
> >
> > The problem with this argument is that it disagrees with ASCII, as `-`
> > has code 0x2d while `.` has code 0x2e, i.e. it is lexicographically
> > _larger_.
> 
> I am saying that sorting these in ASCII order did not produce result
> that is easy to the eyes.
> 
> You are saying that Denton's patch sorted these lines in ASCII order.
> 
> I agree with you that it did correctly sort them in ASCII order.
> 
> That does not make the patch right ;-)

What's the purpose of sorting them, though? I thought it was less for
aesthetics and more to to keep lines deterministic (to avoid two
branches adding the same line in different places, thus causing
weirdness when the two are merged). In that case, I think we care less
about the exact order and more that anybody can easily reproduce the
same sort (by running "10:!sort" or whatever you weird emacs-types would
type).

-Peff


Re: [RFC PATCH 1/7] Makefile: alphabetically sort += lists

2019-10-21 Thread Junio C Hamano
Johannes Schindelin  writes:

>> ... I do not particularly see this change (there may be similar
>> ones) desirable.  I'd find it it be much more natural to sort
>> "commit-anything" after "commit", and that is true with or without
>> the common extension ".o" added to these entries.
>>
>> In short, flipping these entries because '.' sorts later than '-' is
>> making the result look "less sorted", at least to me.
>
> The problem with this argument is that it disagrees with ASCII, as `-`
> has code 0x2d while `.` has code 0x2e, i.e. it is lexicographically
> _larger_.

I am saying that sorting these in ASCII order did not produce result
that is easy to the eyes.

You are saying that Denton's patch sorted these lines in ASCII order.

I agree with you that it did correctly sort them in ASCII order.

That does not make the patch right ;-)



Re: [PATCH 0/2] path.c: minor common_list fixes

2019-10-21 Thread Johannes Schindelin
Hi Gábor,

On Fri, 18 Oct 2019, SZEDER Gábor wrote:

> On Fri, Oct 18, 2019 at 01:06:18PM +0200, SZEDER Gábor wrote:
> > I didn't look yesterday at all, but now I did, and, unfortunately, see
> > two more bugs
>
> The second patch is kind of a bugfix, though luckily the bug doesn't
> actually manifest in undesired behavior.
> The first is a minor cleanup, so future readers won't have a 'Huh?'
> moment like I just did.
>
> SZEDER Gábor (2):
>   path.c: fix field name in 'struct common_dir'
>   path.c: mark 'logs/HEAD' in 'common_list' as file

Those patches look very reasonable to me.

Thanks,
Dscho

>
>  path.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> --
> 2.23.0.1084.gae250eaa40
>
>


Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-21 Thread Johannes Sixt
Am 21.10.19 um 11:16 schrieb Bert Wesarg:
> Dear Pratyush,
> 
> I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> the stage-list. But I think it should be disabled, like the 'Revert
> Hunk' and 'Revert Line' menu entry.
> 
> Can you confirm this?

Technically, it need not be disabled because the hunk being reverted
does not depend on the contents of any of diffs that can be shown.

The entry should be disabled if reverting the stored hunk fails. But to
know that, it would have to be tried: the file could have been edited
since the hunk was generated so that the reversal of the hunk fails.

Not sure what to do, though.

-- Hannes


Re: [RFC PATCH 7/7] merge: teach --autostash option

2019-10-21 Thread Johannes Schindelin
Hi Denton,

On Wed, 16 Oct 2019, Denton Liu wrote:

> In rebase, one can pass the `--autostash` option to cause the worktree
> to be automatically stashed before continuing with the rebase. This
> option is missing in merge, however.
>
> Implement the `--autostash` option and corresponding `merge.autoStash`
> option in merge which stashes before merging and then pops after.
>
> Reported-by: Alban Gruin 

Maybe "Suggested-by" would be more accurate, it is not like this feature
request was a bug report...

> Signed-off-by: Denton Liu 
> ---
>  builtin/merge.c | 13 +
>  builtin/pull.c  |  9 +
>  t/t5520-pull.sh |  8 
>  3 files changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 062e911441..d1a5eaad0d 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -40,6 +40,7 @@
>  #include "branch.h"
>  #include "commit-reach.h"
>  #include "wt-status.h"
> +#include "autostash.h"
>
>  #define DEFAULT_TWOHEAD (1<<0)
>  #define DEFAULT_OCTOPUS (1<<1)
> @@ -58,6 +59,8 @@ static const char * const builtin_merge_usage[] = {
>   NULL
>  };
>
> +static GIT_PATH_FUNC(merge_autostash, "MERGE_AUTOSTASH")
> +
>  static int show_diffstat = 1, shortlog_len = -1, squash;
>  static int option_commit = -1;
>  static int option_edit = -1;
> @@ -81,6 +84,7 @@ static int show_progress = -1;
>  static int default_to_upstream = 1;
>  static int signoff;
>  static const char *sign_commit;
> +static int autostash;
>  static int no_verify;
>
>  static struct strategy all_strategy[] = {
> @@ -285,6 +289,8 @@ static struct option builtin_merge_options[] = {
>   OPT_SET_INT(0, "progress", &show_progress, N_("force progress 
> reporting"), 1),
>   { OPTION_STRING, 'S', "gpg-sign", &sign_commit, N_("key-id"),
> N_("GPG sign commit"), PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
> + OPT_BOOL(0, "autostash", &autostash,
> +   N_("automatically stash/stash pop before and after")),
>   OPT_BOOL(0, "overwrite-ignore", &overwrite_ignore, N_("update ignored 
> files (default)")),
>   OPT_BOOL(0, "signoff", &signoff, N_("add Signed-off-by:")),
>   OPT_BOOL(0, "no-verify", &no_verify, N_("bypass pre-merge-commit and 
> commit-msg hooks")),
> @@ -440,6 +446,7 @@ static void finish(struct commit *head_commit,
>   strbuf_addf(&reflog_message, "%s: %s",
>   getenv("GIT_REFLOG_ACTION"), msg);
>   }
> + apply_autostash(merge_autostash());

Should this not be guarded by `if (autostash)`?

>   if (squash) {
>   squash_message(head_commit, remoteheads);
>   } else {
> @@ -631,6 +638,9 @@ static int git_merge_config(const char *k, const char *v, 
> void *cb)
>   } else if (!strcmp(k, "commit.gpgsign")) {
>   sign_commit = git_config_bool(k, v) ? "" : NULL;
>   return 0;
> + } else if (!strcmp(k, "merge.autostash")) {
> + autostash = git_config_bool(k, v);
> + return 0;
>   }
>
>   status = fmt_merge_msg_config(k, v, cb);
> @@ -724,6 +734,8 @@ static int try_merge_strategy(const char *strategy, 
> struct commit_list *common,
>   for (j = common; j; j = j->next)
>   commit_list_insert(j->item, &reversed);
>
> + if (autostash)
> + perform_autostash(merge_autostash());
>   hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
>   clean = merge_recursive(&o, head,
>   remoteheads->item, reversed, &result);
> @@ -1288,6 +1300,7 @@ int cmd_merge(int argc, const char **argv, const char 
> *prefix)
>
>   /* Invoke 'git reset --merge' */
>   ret = cmd_reset(nargc, nargv, prefix);
> + apply_autostash(merge_autostash());

Again, this should be guarded by `if (autostash)`, methinks.

>   goto done;
>   }
>
> diff --git a/builtin/pull.c b/builtin/pull.c
> index d25ff13a60..ee186781ae 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -183,7 +183,7 @@ static struct option pull_options[] = {
>   N_("verify that the named commit has a valid GPG signature"),
>   PARSE_OPT_NOARG),
>   OPT_BOOL(0, "autostash", &opt_autostash,
> - N_("automatically stash/stash pop before and after rebase")),
> + N_("automatically stash/stash pop before and after")),

Makes sense; this is now shared between the rebase and the merge modes.

>   OPT_PASSTHRU_ARGV('s', "strategy", &opt_strategies, N_("strategy"),
>   N_("merge strategy to use"),
>   0),
> @@ -671,6 +671,10 @@ static int run_merge(void)
>   argv_array_pushv(&args, opt_strategy_opts.argv);
>   if (opt_gpg_sign)
>   argv_array_push(&args, opt_gpg_sign);
> + if (opt_autostash == 0)
> + argv_array_push(&args, "--no-autostash");
> + else if (opt_autostash == 1)
> + argv_array_push(&args, "--autostash");

Or sh

Re: [PATCH v3 4/4] git-gui: allow undoing last revert

2019-10-21 Thread Pratyush Yadav
On 21/10/19 11:16AM, Bert Wesarg wrote:
> Dear Pratyush,
> 
> I just noticed that the 'Revert Last Hunk' menu entry is enabled in
> the stage-list. But I think it should be disabled, like the 'Revert
> Hunk' and 'Revert Line' menu entry.

I'm not sure what you mean. There is no "Revert Last Hunk" menu entry (I 
assume you are talking about the context menu in the diff view that we 
open by right clicking).

My guess is that you mean the "Undo Last Revert" option. And you want it 
disabled if the diff shown is of a staged file, correct?

I'm not sure if disabling it would be a good idea.

Say I revert a hunk or line while the file is not staged, and stage the 
rest of the file. This would mean that file is no longer in the 
"Unstaged Changes" widget. Now if I choose the file from the "Staged 
Changes" widget, I get the option to undo the last revert. If I hit 
that, it will put whatever I reverted in the "Unstaged Changes" widget. 
So, now part of the file that was reverted is in "Unstaged Changes", and 
the rest in "Unstaged Changes".

IIUC, this is what you think should not happen, correct? What's wrong 
with allowing the user to undo reverts from anywhere?

The way I see it, it doesn't really matter what file is selected or 
whether it is staged or not, the action of the undo remains the same, so 
it makes sense to me to allow it from anywhere.
 
> Can you confirm this?
> 
> On Wed, Aug 28, 2019 at 11:57 PM Pratyush Yadav  
> wrote:
> >
> > Accidental clicks on the revert hunk/lines buttons can cause loss of
> > work, and can be frustrating. So, allow undoing the last revert.
> >
> > Right now, a stack or deque are not being used for the sake of
> > simplicity, so only one undo is possible. Any reverts before the
> > previous one are lost.
> >
> > Signed-off-by: Pratyush Yadav 

-- 
Regards,
Pratyush Yadav


Re: [RFC PATCH 5/7] autostash: extract perform_autostash() from rebase

2019-10-21 Thread Johannes Schindelin
Hi Denton,

On Wed, 16 Oct 2019, Denton Liu wrote:

> Continue the process of lib-ifying the autostash code. In a future
> commit, this will be used to implement `--autostash` in other builtins.
>
> This patch is best viewed with `--color-moved` and
> `--color-moved-ws=allow-indentation-change`.
>
> Signed-off-by: Denton Liu 
> ---
>  autostash.c  | 46 ++
>  autostash.h  |  1 +
>  builtin/rebase.c | 44 +---
>  3 files changed, 48 insertions(+), 43 deletions(-)
>
> diff --git a/autostash.c b/autostash.c
> index eb58e0c8a4..722cf78b12 100644
> --- a/autostash.c
> +++ b/autostash.c
> @@ -12,6 +12,7 @@
>  #include "tree-walk.h"
>  #include "tree.h"
>  #include "unpack-trees.h"
> +#include "wt-status.h"
>
>  int read_one(const char *path, struct strbuf *buf)
>  {
> @@ -150,6 +151,51 @@ int reset_head(struct object_id *oid, const char *action,
>   return ret;
>  }
>
> +void perform_autostash(const char *path)

Maybe we can find a better name than "perform_autostash"? It is not
clear whether this is the "saving" or "applying" part of the autostash,
I think.

Maybe `save_autostash()`? And maybe instead of `path`, the parameter
could be `save_hash_to_path` or something similar?

Now that I think of it, I forgot to mention in a reply to an earlier
patch in this series that `reset_head()` might be too generic a name to
be a global function...

Ciao,
Dscho

> +{
> + struct strbuf buf = STRBUF_INIT;
> + struct lock_file lock_file = LOCK_INIT;
> + int fd;
> +
> + fd = hold_locked_index(&lock_file, 0);
> + refresh_cache(REFRESH_QUIET);
> + if (0 <= fd)
> + repo_update_index_if_able(the_repository, &lock_file);
> + rollback_lock_file(&lock_file);
> +
> + if (has_unstaged_changes(the_repository, 1) ||
> + has_uncommitted_changes(the_repository, 1)) {
> + struct child_process stash = CHILD_PROCESS_INIT;
> + struct object_id oid;
> +
> + argv_array_pushl(&stash.args,
> +  "stash", "create", "autostash", NULL);
> + stash.git_cmd = 1;
> + stash.no_stdin = 1;
> + if (capture_command(&stash, &buf, GIT_MAX_HEXSZ))
> + die(_("Cannot autostash"));
> + strbuf_trim_trailing_newline(&buf);
> + if (get_oid(buf.buf, &oid))
> + die(_("Unexpected stash response: '%s'"),
> + buf.buf);
> + strbuf_reset(&buf);
> + strbuf_add_unique_abbrev(&buf, &oid, DEFAULT_ABBREV);
> +
> + if (safe_create_leading_directories_const(path))
> + die(_("Could not create directory for '%s'"),
> + path);
> + write_file(path, "%s", oid_to_hex(&oid));
> + printf(_("Created autostash: %s\n"), buf.buf);
> + if (reset_head(NULL, "reset --hard",
> +NULL, RESET_HEAD_HARD, NULL, NULL) < 0)
> + die(_("could not reset --hard"));
> +
> + if (discard_index(the_repository->index) < 0 ||
> + repo_read_index(the_repository) < 0)
> + die(_("could not read index"));
> + }
> +}
> +
>  int apply_autostash(const char *path)
>  {
>   struct strbuf autostash = STRBUF_INIT;
> diff --git a/autostash.h b/autostash.h
> index 1406638166..e08ccb9881 100644
> --- a/autostash.h
> +++ b/autostash.h
> @@ -18,6 +18,7 @@ int reset_head(struct object_id *oid, const char *action,
>  const char *switch_to_branch, unsigned flags,
>  const char *reflog_orig_head, const char *reflog_head);
>
> +void perform_autostash(const char *path);
>  int apply_autostash(const char *path);
>
>  #endif
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index c3165896cc..c4decdfb5b 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1797,49 +1797,7 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>   die(_("could not read index"));
>
>   if (options.autostash) {
> - struct lock_file lock_file = LOCK_INIT;
> - int fd;
> -
> - fd = hold_locked_index(&lock_file, 0);
> - refresh_cache(REFRESH_QUIET);
> - if (0 <= fd)
> - repo_update_index_if_able(the_repository, &lock_file);
> - rollback_lock_file(&lock_file);
> -
> - if (has_unstaged_changes(the_repository, 1) ||
> - has_uncommitted_changes(the_repository, 1)) {
> - const char *autostash =
> - state_dir_path("autostash", &options);
> - struct child_process stash = CHILD_PROCESS_INIT;
> - struct object_id oid;
> -
> - argv_array_pushl(&stash.args,
> -  "stash", "create", "autostash", NULL);

Re: [RFC PATCH 4/7] autostash: extract reset_head() from rebase

2019-10-21 Thread Johannes Schindelin
Hi,

[sorry, Phillip, my reply-all fu deserts me today, apparently.]


On Fri, 18 Oct 2019, Phillip Wood wrote:

> Hi Denton
>
> It's great to see this being libified, I've had it in mind to do this so we
> can avoid forking 'git checkout' in sequencer.c
>
> On 16/10/2019 18:26, Denton Liu wrote:
> > Begin the process of lib-ifying the autostash code. In a future commit,
> >
> > This patch is best viewed with `--color-moved` and
> > `--color-moved-ws=allow-indentation-change`.
> > this will be used to implement `--autostash` in other builtins.
> >
> > Signed-off-by: Denton Liu 
> > ---
> >   autostash.c  | 137 +++
> >   autostash.h  |  12 +
> >   builtin/rebase.c | 137 ---
> >   3 files changed, 149 insertions(+), 137 deletions(-)
> >
> > diff --git a/autostash.c b/autostash.c
> > index 62ec7a7c80..eb58e0c8a4 100644
> > --- a/autostash.c
> > +++ b/autostash.c
> > @@ -1,9 +1,17 @@
> > +#define USE_THE_INDEX_COMPATIBILITY_MACROS
>
> It might be nicer to have a preparatory step that fixes this by adding a
> 'struct repository *r' argument to the function in builtin/rebase.c before
> moving the function. You could also do the same for next patch and then move
> both functions together.

In addition to that, I think that `reset_head()`

- should live in its own file, not be hidden in `autostash.c`,

- its default reflog action should _not_ be "rebase".

- ideally be made the working horse of `builtin/reset.c`,

- in addition to that `struct repository *r`, it should probably accept
  a `struct index_state *index` and a `const char *worktree_directory`,
  but that can easily come in the future, as needed.

Thanks,
Dscho


>
> Best Wishes
>
> Phillip
>
> > +
> >   #include "git-compat-util.h"
> >   #include "autostash.h"
> > +#include "cache-tree.h"
> >   #include "dir.h"
> >   #include "gettext.h"
> > +#include "lockfile.h"
> > +#include "refs.h"
> >   #include "run-command.h"
> >   #include "strbuf.h"
> > +#include "tree-walk.h"
> > +#include "tree.h"
> > +#include "unpack-trees.h"
> >
> >   int read_one(const char *path, struct strbuf *buf)
> >   {
> > @@ -13,6 +21,135 @@ int read_one(const char *path, struct strbuf *buf)
> > return 0;
> >   }
> >
> > +int reset_head(struct object_id *oid, const char *action,
> > +  const char *switch_to_branch, unsigned flags,
> > +  const char *reflog_orig_head, const char *reflog_head)
> > +{
> > +   unsigned detach_head = flags & RESET_HEAD_DETACH;
> > +   unsigned reset_hard = flags & RESET_HEAD_HARD;
> > +   unsigned run_hook = flags & RESET_HEAD_RUN_POST_CHECKOUT_HOOK;
> > +   unsigned refs_only = flags & RESET_HEAD_REFS_ONLY;
> > +   unsigned update_orig_head = flags & RESET_ORIG_HEAD;
> > +   struct object_id head_oid;
> > +   struct tree_desc desc[2] = { { NULL }, { NULL } };
> > +   struct lock_file lock = LOCK_INIT;
> > +   struct unpack_trees_options unpack_tree_opts;
> > +   struct tree *tree;
> > +   const char *reflog_action;
> > +   struct strbuf msg = STRBUF_INIT;
> > +   size_t prefix_len;
> > +   struct object_id *orig = NULL, oid_orig,
> > +   *old_orig = NULL, oid_old_orig;
> > +   int ret = 0, nr = 0;
> > +
> > +   if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
> > +   BUG("Not a fully qualified branch: '%s'", switch_to_branch);
> > +
> > +   if (!refs_only && hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0)
> > {
> > +   ret = -1;
> > +   goto leave_reset_head;
> > +   }
> > +
> > +   if ((!oid || !reset_hard) && get_oid("HEAD", &head_oid)) {
> > +   ret = error(_("could not determine HEAD revision"));
> > +   goto leave_reset_head;
> > +   }
> > +
> > +   if (!oid)
> > +   oid = &head_oid;
> > +
> > +   if (refs_only)
> > +   goto reset_head_refs;
> > +
> > +   memset(&unpack_tree_opts, 0, sizeof(unpack_tree_opts));
> > +   setup_unpack_trees_porcelain(&unpack_tree_opts, action);
> > +   unpack_tree_opts.head_idx = 1;
> > +   unpack_tree_opts.src_index = the_repository->index;
> > +   unpack_tree_opts.dst_index = the_repository->index;
> > +   unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
> > +   unpack_tree_opts.update = 1;
> > +   unpack_tree_opts.merge = 1;
> > +   if (!detach_head)
> > +   unpack_tree_opts.reset = 1;
> > +
> > +   if (repo_read_index_unmerged(the_repository) < 0) {
> > +   ret = error(_("could not read index"));
> > +   goto leave_reset_head;
> > +   }
> > +
> > +   if (!reset_hard && !fill_tree_descriptor(the_repository, &desc[nr++],
> > &head_oid)) {
> > +   ret = error(_("failed to find tree of %s"),
> > +   oid_to_hex(&head_oid));
> > +   goto leave_reset_head;
> > +   }
> > +
> > +   if (!fill_tree_descriptor(the_repository, &desc[nr++], oid)) {
> > +   ret = error(_("failed to find tree of %s"), oid_to_hex(oid));
> > +   goto leave_

Re: [RFC PATCH 1/7] Makefile: alphabetically sort += lists

2019-10-21 Thread Denton Liu
Hi Johannes,

On Mon, Oct 21, 2019 at 08:44:40PM +0200, Johannes Schindelin wrote:
> Hi Junio,
> 
> On Fri, 18 Oct 2019, Junio C Hamano wrote:
> 
> > Denton Liu  writes:
> >
> > > There are many += lists in the Makefile and, over time, they have gotten
> > > slightly out of order, alphabetically. Alphabetically sort all += lists
> > > to bring them back in order.
> > > ...
> >
> > Hmm.  I like the general thrust, but ...
> >
> > >  LIB_OBJS += combine-diff.o
> > > -LIB_OBJS += commit.o
> > >  LIB_OBJS += commit-graph.o
> > >  LIB_OBJS += commit-reach.o
> > > +LIB_OBJS += commit.o
> >
> > ... I do not particularly see this change (there may be similar
> > ones) desirable.  I'd find it it be much more natural to sort
> > "commit-anything" after "commit", and that is true with or without
> > the common extension ".o" added to these entries.
> >
> > In short, flipping these entries because '.' sorts later than '-' is
> > making the result look "less sorted", at least to me.
> 
> The problem with this argument is that it disagrees with ASCII, as `-`
> has code 0x2d while `.` has code 0x2e, i.e. it is lexicographically
> _larger_.
> 
> So Denton's patch does the correct thing.

I actually agree with Junio on this one. Without the prefixes, "commit"
would go before "commit-graph" so I think it would make more sense to
order with the prefixes removed instead of taking the naive ordering by
just sorting each block.

Thanks,

Denton

> 
> Ciao,
> Dscho


  1   2   3   4   5   6   7   8   9   10   >