Re: [PATCH] color.h: document and modernize header

2018-02-13 Thread Eric Sunshine
On Mon, Feb 12, 2018 at 10:55 PM, Eric Sunshine  wrote:
> On Mon, Feb 12, 2018 at 8:41 PM, Stefan Beller  wrote:
>> + * Output the formatted string in the specified color (and then reset to normal
>> + * color so subsequent output is uncolored). Omits the color encapsulation if
>> + * `color` is NULL. The `color_fprintf_ln` prints a new line after resetting
>> + * the color.  BUG: The `color_print_strbuf` prints the given pre-formatted
>> + * strbuf instead, up to its first NUL character.
>
> "`color_print_strbuf` prints the given pre-formatted strbuf (BUG: but
> only up to the first NUL character)."
>
> Probably not worth a re-roll if Junio can amend it locally.

By the way, thanks for the patience in the face of the nit-picking
this patch has undergone.


Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-13 Thread Sergey Organov
Johannes Schindelin  writes:
[...]
> Just to give you one concrete example: when I recently rebased some
> patches (no reording or dropping involved here!) and one of the picks
> failed with merge conflicts, I realized that that particular commit
> introduced incorrect formatting and fixed that right away (verifying that
> no other commits introduced incorrect formatting, of course).
>
> With your new cute idea to magically cherry-pick -m1, this change would
> have been magically dropped from the subsequent merge commits!

You put it as if the problem you describe is unsolvable short of getting
back to your favorite blind re-merge. Do you really believe it?

I thought it's obvious that I originally meant "cherry-pick -m1" to be
an explanation facility, a proof of concept, not the final answer to all
the problems of history editing. It's a nice base for actually
approaching these problems though, unlike blind re-merge currently being
used, the latter having no potential.

The fact that bare naked "cherry-pick -m1" doesn't do what is often[1]
required in such cases neither voids the general idea of reproducing
merge-the-result, nor does it make current re-merge approach less
broken.

[1] Please take into consideration that it's _not always_ the case that
one needs a change made to a side-branch to actually propagate to the
main-line over the merge (think "merge -x ours", or something similar
but not that simple), and then it's rather the cute idea to blindly
re-merge that will wreak havoc, as in a lot of other cases.

-- Sergey


Can we make git clone --recurse-submodules --shallow-submodules work for commits that are not on tags or branches

2018-02-13 Thread Ciro Santilli
I have a git repo with large submodules, notably the Linux kernel, and
shallow clone saves a lot of time.

However, QEMU is also a submodule, which I don't control, and QEMU has
submodules which point to commits that are not on any tag or
branch, so:

 git clone --recurse-submodules --shallow-submodules git://git.qemu.org/qemu.git

currently fails with:

error: Server does not allow request for unadvertised object
22ead3e0bfdb87516656453336160e0a37b066bf
Fetched in submodule path 'capstone', but it did not contain
22ead3e0bfdb87516656453336160e0a37b066bf. Direct fetching of that
commit failed.

on git 2.16.1.

Furthermore, I can reproduce this locally with direct filesystem clones:
https://github.com/cirosantilli/test-git-web-interface/blob/15335d3002a3e64fc5756b69fb832a733aa63fb9/shallow-submodule.sh#L158
and on GitHub, so I'm guessing it is not just the settings for a
specific server?

Would it be possible to make that work, or are there fundamental
reasons why it is not possible?

Here is my use case repo, at the point of the ugly workaround I'm
having to do: 
https://github.com/cirosantilli/linux-kernel-module-cheat/blob/a14c95346cfd9d2e7b2e475b0981aa71d819c20b/configure#L23

Some more context:
https://stackoverflow.com/questions/2144406/git-shallow-submodules/47374702#47374702

This would make some embedded people happy.
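
To illustrate the kind of client-side fallback I have in mind, here is a
rough sketch (just a sketch; the per-submodule retry below is my own
workaround idea, not something git does automatically):

  # Shallow-clone submodules where possible, fall back to a full submodule
  # clone when the pinned commit is not advertised by the server (this
  # assumes the commit is at least reachable from some branch history).
  git clone git://git.qemu.org/qemu.git
  cd qemu
  git submodule update --init --depth 1 capstone ||
  git submodule update --init capstone
  # If you control the serving repository, setting
  # uploadpack.allowReachableSHA1InWant=true there should also let the
  # shallow fetch of the unadvertised commit succeed.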


[PATCH] subtree: die if failed to find existing splits

2018-02-13 Thread Hironao OTSUBO
When using git-subtree with the --squash option, you may not have the
commits indicated by git-subtree-split: at hand.

find_existing_splits dies in such a situation, but the caller cmd_split
ignored it and produced unwanted output, i.e. a history that does not
contain commits from the subproject. This patch fixes it.

Signed-off-by: Hironao OTSUBO 
---

Notes:
I know there was a discussion [1] about moving git-subtree development to
GitHub, but don't know where the actual repository is, so I am sending this
patch to the mailing list.

[1]: https://public-inbox.org/git/87mv7kf369@waller.obbligato.org/

 contrib/subtree/git-subtree.sh |  2 +-
 contrib/subtree/t/t7900-subtree.sh | 18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index dec085a23..81ad6e716 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -693,7 +693,7 @@ cmd_split () {
then
unrevs=
else
-   unrevs="$(find_existing_splits "$dir" "$revs")"
+   unrevs="$(find_existing_splits "$dir" "$revs")" || die "could not find existing splits"
fi
 
# We can't restrict rev-list to only $dir here, because some of our
diff --git a/contrib/subtree/t/t7900-subtree.sh b/contrib/subtree/t/t7900-subtree.sh
index d05c613c9..fb3d73acf 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -493,6 +493,24 @@ test_expect_success 'split "sub dir"/ with --branch for an incompatible branch'
)
 '
 
+next_test
+test_expect_success 'split fails without squashed commits existing' '
+   subtree_test_create_repo "$subtree_test_count" &&
+   subtree_test_create_repo "$subtree_test_count/sub proj" &&
+   test_create_commit "$subtree_test_count" main1 &&
+   test_create_commit "$subtree_test_count/sub proj" sub1 &&
+   commit1=$( cd "$subtree_test_count/sub proj" && git rev-parse HEAD ) &&
+   test_create_commit "$subtree_test_count/sub proj" sub2 &&
+   (
+   cd "$subtree_test_count" &&
+   git fetch ./"sub proj" master &&
+   git subtree add --prefix="sub dir" --squash FETCH_HEAD &&
+   git gc --prune=all &&
+   ! git rev-parse -q --verify "$commit1^{commit}" &&
+   test_must_fail git subtree split --prefix="sub dir"
+   )
+'
+
 #
 # Validity checking
 #
-- 
2.16.1



Re: [PATCH 3/7] worktree move: new command

2018-02-13 Thread Jeff King
On Tue, Feb 13, 2018 at 07:27:58AM +0700, Duy Nguyen wrote:

> > Makes sense to try to make sure that we don't regress leak-free tests. I
> > don't know what our Travis-budget looks like, but I would volunteer to
> > run something like this periodically using my own cycles.
> > 
> > My experience with the innards of our Makefiles and test-lib.sh is
> > non-existent, but from a very cursory look it seems like something as
> > simple as loading GIT_SKIP_TESTS from a blacklist-file might do the
> > trick.
> 
> Yeah my first thought was something along that line too. But maybe
> it's nicer to mark a test file leak-free like this:
> 
> -- 8< --
> diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh
> index 459f676683..1446f075b7 100755
> --- a/t/t2028-worktree-move.sh
> +++ b/t/t2028-worktree-move.sh
> @@ -2,6 +2,8 @@
>  
>  test_description='test git worktree move, remove, lock and unlock'
>  
> +test_leak_free=yes
> +
>  . ./test-lib.sh
>  
>  test_expect_success 'setup' '
> -- 8< --

Hmm. That is not too bad, but somehow it feels funny to me to be
polluting each test script with these annotations. And to be driving it
from inside the test scripts.

It seems like:

  make SANITIZE=leak test GIT_SKIP_TESTS="$(cat known-leaky)"

would be sufficient. And updating the list would just be:

  # assume we're using prove, which will keep running after failure,
  # and will record the results for us to parse (using "--state=").
  # Otherwise use "make -k" and grep in t/test-results.
  make SANITIZE=leak test
  (cd t && prove --dry --state=failed) |
  perl -lne '/^(t[0-9]{4})-.*.sh$/ and print $1' |
  sort >known-leaky

That would update both now-passing and now-failing tests. Presumably
we'd keep it checked in, so "git diff" would show you the changes.

-Peff


Re: Fetch-hooks

2018-02-13 Thread Leo Gaspard
On 02/14/2018 02:35 AM, Jeff King wrote:
> On Sat, Feb 10, 2018 at 07:36:47PM +0100, Leo Gaspard wrote:
>> [...]
> I think there may have been some more concrete proposals after that, but
> that's what I was able to dig up quickly.

Thanks! Though it looks way above my knowledge of git internals as well
as time available, it looks like a great project, and I hope it someday
succeeds!

>> That said, actually I just noticed an issue in the “add a
>> remote.<name>.fetch option to fetch to refs/quarantine then have the
>> post-fetch hook do the work”: it means if I run `git pull`, then:
>>  1. The remote references will be pulled to refs/quarantine/...
>>  2. The post-fetch hook will copy the accepted ones to refs/remotes/...
>>  3. The `git merge FETCH_HEAD` called by pull will merge FETCH_HEAD into
>> local branches... and so merge from refs/quarantine.
> 
> Good point. You can't munge FETCH_HEAD by playing with refspecs.
> 
> I am starting to come around to the idea that "pre-fetch" might be the
> best way to do what you want. Not to rewrite refs, but perhaps to simply
> reject them. In the same way that we allow pre-receive to reject pushed
> refs (both are, after all, the final check on admitting new history into
> the repository, just in opposite directions).
> 
>> So, when thinking about it, I'm back to thinking the proper hook
>> interface should be the one of the tweak-fetch hook, but its
>> implementation should make it not go crazy on remote servers. And so
>> that the implementation should do all this refs/quarantine wizardry
>> inside git itself.
> 
> So does anybody actually want to be able to adjust the refs as they pass
> through? It really sounds like you just want to be able to reject or not
> reject the fetch. And that rejecting would be the uncommon case, so it's
> OK to just abort the whole operation and expect the user to figure it
> out.

This completely fits my use case (modulo the fact that it's more similar
to the `update` hook than to `pre-receive` I think, as verifying the
signature requires the objects to already have been downloaded), indeed,
though I'm not sure it would have fit Joey's (based on my understanding,
adding a merge was what was originally asked for).

Actually, I'm wondering if the existing semantics of `update` could not
be reused for the `pre-fetch`. Would it make sense to just call `update`
during a fetch in the same way as during a receive-pack? That said it
likely makes this a breaking change, though it's maybe unlikely that a
repository is used both for fetching and for receive-pack'ing, it could
happen.

So the proposal as I understand it would currently be adding a
`fetch-update` hook that does exactly the same thing as `update` but for
`fetch`. This solves my use case in a nice way, though it likely doesn't
solve Joey's original one (which has been taken care of by a wrapper
since then).

What do you all think about it?
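
To make that concrete, the hook I am imagining would look roughly like
this (a sketch only: the hook name `fetch-update`, its arguments and the
exact point at which it runs are assumptions mirroring the existing
`update` hook, not anything git implements today):

  #!/bin/sh
  # .git/hooks/fetch-update <refname> <old-oid> <new-oid>   (hypothetical)
  refname=$1 old=$2 new=$3
  # Objects have already been downloaded at this point, so signature
  # verification is possible; reject tags that do not verify.
  case "$refname" in
  refs/tags/*)
          git verify-tag "$new" || exit 1
          ;;
  esac
  exit 0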


What's cooking in git.git (Feb 2018, #02; Tue, 13)

2018-02-13 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/simplify-perl-makefile (2018-01-03) 3 commits
  (merged to 'next' on 2018-01-23 at 1506e0651a)
 + perl: treat PERLLIB_EXTRA as an extra path again
 + perl: avoid *.pmc and fix Error.pm further
 + Makefile: replace perl/Makefile.PL with simple make rules

 Originally merged to 'next' on 2018-01-03

 The build procedure for perl/ part has been greatly simplified by
 weaning ourselves off of MakeMaker.


* cc/sha1-file-name (2018-01-19) 2 commits
  (merged to 'next' on 2018-02-07 at a50b171a84)
 + sha1_file: improve sha1_file_name() perfs
 + sha1_file: remove static strbuf from sha1_file_name()

 Code clean-up.


* cl/t9001-cleanup (2018-01-12) 1 commit
  (merged to 'next' on 2018-02-07 at 9b194a)
 + t9001: use existing helper in send-email test

 Test clean-up.


* ds/use-get-be64 (2018-01-19) 1 commit
  (merged to 'next' on 2018-02-07 at 6d6d5ba71d)
 + packfile: use get_be64() for large offsets

 Code clean-up.


* ew/svn-branch-segfault-fix (2018-01-30) 1 commit
  (merged to 'next' on 2018-02-07 at 1bf8d8f74f)
 + git-svn: control destruction order to avoid segfault

 Workaround for segfault with more recent versions of SVN.


* gs/retire-mru (2018-01-24) 1 commit
  (merged to 'next' on 2018-02-07 at 4b2e893911)
 + mru: Replace mru.[ch] with list.h implementation
 (this branch uses ot/mru-on-list.)

 Retire mru API as it does not give enough abstraction over
 underlying list API to be worth it.


* jc/mailinfo-cleanup-fix (2018-01-24) 1 commit
  (merged to 'next' on 2018-02-07 at 65d41f993b)
 + mailinfo: avoid segfault when can't open files

 Corner case bugfix.


* jh/fsck-promisors (2017-12-08) 10 commits
  (merged to 'next' on 2018-01-23 at ca59f5c18e)
 + gc: do not repack promisor packfiles
 + rev-list: support termination at promisor objects
 + sha1_file: support lazily fetching missing objects
 + introduce fetch-object: fetch one promisor object
 + index-pack: refactor writing of .keep files
 + fsck: support promisor objects as CLI argument
 + fsck: support referenced promisor objects
 + fsck: support refs pointing to promisor objects
 + fsck: introduce partialclone extension
 + extension.partialclone: introduce partial clone extension
 (this branch is used by jh/partial-clone.)

 Originally merged to 'next' on 2018-01-17

 In preparation for implementing narrow/partial clone, the machinery
 for checking object connectivity used by gc and fsck has been
 taught that a missing object is OK when it is referenced by a
 packfile specially marked as coming from a trusted repository that
 promises to make them available on-demand and lazily.


* jh/partial-clone (2017-12-08) 13 commits
  (merged to 'next' on 2018-01-23 at de0f0111ea)
 + t5616: test bulk prefetch after partial fetch
 + fetch: inherit filter-spec from partial clone
 + t5616: end-to-end tests for partial clone
 + fetch-pack: restore save_commit_buffer after use
 + unpack-trees: batch fetching of missing blobs
 + clone: partial clone
 + partial-clone: define partial clone settings in config
 + fetch: support filters
 + fetch: refactor calculation of remote list
 + fetch-pack: test support excluding large blobs
 + fetch-pack: add --no-filter
 + fetch-pack, index-pack, transport: partial clone
 + upload-pack: add object filtering for partial clone
 (this branch uses jh/fsck-promisors.)

 Originally merged to 'next' on 2018-01-17

 The machinery to clone & fetch, which in turn involves packing and
 unpacking objects, have been told how to omit certain objects using
 the filtering mechanism introduced by the jh/object-filtering
 topic, and also mark the resulting pack as a promisor pack to
 tolerate missing objects, taking advantage of the mechanism
 introduced by the jh/fsck-promisors topic.


* jk/daemon-fixes (2018-01-25) 6 commits
  (merged to 'next' on 2018-02-07 at 0e4fe8f8cc)
 + daemon: fix length computation in newline stripping
 + t/lib-git-daemon: add network-protocol helpers
 + daemon: handle NULs in extended attribute string
 + daemon: fix off-by-one in logging extended attributes
 + t/lib-git-daemon: record daemon log
 + t5570: use ls-remote instead of clone for interp tests

 Assorted fixes to "git daemon".


* jt/http-redact-cookies (2018-01-19) 2 commits
  (merged to 'next' on 2018-02-07 at a8c3416a7d)
 + http: support omitting data from traces
 + http: support cookie redaction when tracing

 The http tracing code, often used to debug connection issues,
 learned to redact potentially sensitive information from its output
 so that it can be more safely sharable.

Re: Fetch-hooks

2018-02-13 Thread Jacob Keller
On Sat, Feb 10, 2018 at 4:21 AM, Jeff King  wrote:
> On Sat, Feb 10, 2018 at 01:37:20AM +0100, Leo Gaspard wrote:
>
>> > Yeah, tag-following may be a little tricky, because it usually wants to
>> > write to refs/tags/. One workaround would be to have your config look
>> > like this:
>> >
>> >   [remote "origin"]
>> >   fetch = +refs/heads/*:refs/quarantine/origin/heads/*
>> >   fetch = +refs/tags/*:refs/quarantine/origin/tags/*
>> >   tagOpt = --no-tags
>> >
>> > That's not exactly the same thing, because it would fetch all tags, not
>> > just those that point to the history on the branches. But in most
>> > repositories and workflows the distinction doesn't matter.
>>
>> Hmm... apart from the implementation complexity (of which I have no
>> idea), is there an argument against the idea of adding a
>> remote..fetchTagsTo refmap similar to remote..fetch but used
>> every time a tag is fetched? (well, maybe not exactly similar to
>> remote..fetch because we know the source is going to be
>> refs/tags/*; so just having the part of .fetch past the ':' would be
>> more like what's needed I guess)
>
> I don't think it would be too hard to implement, and is at least
> logically consistent with the way we handle tags.
>
> If we were designing from scratch, I do think this would all be helped
> immensely by having more separation of refs fetched from remotes. There
> was a proposal in the v1.8 era to fetch everything for a remote,
> including tags, into "refs/remotes/origin/heads/",
> "refs/remotes/origin/tags/", etc. And then we'd teach the lookup side to
> look for tags in each of the remote-tag namespaces.
>
> I still think that would be a good direction to go, but it would be a
> big project (which is why the original stalled).
>

I want to go this direction, but the difficulty has always been
limited time to actually work on such a large project.

Thanks,
Jake

>> The issue with your solution is that if the user runs 'git fetch
>> --tags', he will get the (potentially compromised) tags directly in his
>> refs/tags/.
>
> True.
>
> -Peff


Re: Fetch-hooks

2018-02-13 Thread Jeff King
On Mon, Feb 12, 2018 at 11:23:27AM -0800, Brandon Williams wrote:

> Maybe this isn't helpful but you may be able to implement this by using
> a remote-helper.  The helper could perform any sort of caching it needed
> to prevent re-downloading large amounts of data that is potentially
> thrown away, while only sending through the relevant commits which
> satisfy some criteria (signed, etc.).

Interesting idea, though it would still be calling git-fetch under the
hood, right? So I guess we're just reimplementing this "quarantine"
concept, but in theory we'd have the flexibility to store that
quarantine in another one-off sub-repository (instead of a special ref
namespace, though I suppose you could use the special ref namespace,
too).

I suspect it could work, but there would probably be a lot of rough
edges.

-Peff


Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-13 Thread Jacob Keller
On Mon, Feb 12, 2018 at 11:15 PM, Sergey Organov  wrote:
> Hi Jake,
>
> Jacob Keller  writes:
>
>> On Mon, Feb 12, 2018 at 12:39 PM, Johannes Schindelin
>>  wrote:
>>> Hi Sergey,
>>>
>>> On Mon, 12 Feb 2018, Sergey Organov wrote:
 > Have a look at https://github.com/git/git/pull/447, especially the
 > latest commit in there which is an early version of the deprecation I
 > intend to bring about.

 You wouldn't have needed a deprecation at all if you had re-used
 --preserve-merges in the first place, and I still don't see why you
 haven't.
>>>
>>> Keep repeating it, and it won't become truer.
>>>
>>> If you break formats, you break scripts. Git has *so* many users, there
>>> are very likely some who script *every* part of it.
>>>
>>> We simply cannot do that.
>>>
>>> What we can do is deprecate designs which we learned on the way were not only
>>> incomplete from the get-go, but bad overall and hard (or impossible) to
>>> fix. Like --preserve-merges.
>>>
>>> Or for that matter like the design you proposed, to use --first-parent for
>>> --recreate-merges. Or to use --first-parent for some --recreate-merges,
>>> surprising users in very bad ways when it is not used (or when it is
>>> used). I get the impression that you still think it would be a good idea,
>>> even if it should be obvious that it is not.
>>
>> If we consider the addition of new todo list elements as "user
>> breaking", then yes this change would be user-script breaking.
>
> It _is_ user script breaking, provided such script exists. Has anybody
> actually seen one? Not that it's wrong to be extra-cautious about it,
> just curious. Note that to be actually affected, such a script must
> invoke the "git rebase -p" _command_ and then tweak its todo output to
> produce the outcome.
>
>> Since we did not originally spell out that todo-list items are subject
>> to enhancement by addition of operations in the future, scripts are
>> likely not designed to allow addition of new elements.
>
> Out of curiosity, are you going to spell it out now, for the new todo
> format?
>
>> Thus, adding recreate-merges, and deprecating preserve-merges, seems
>> to me to be the correct action to take here.
>
> Yes, sure, provided there is actual breakage, or at least informed
> suspicion there is one.
>
>> One could argue that users should have expected new todo list elements
>> to be added in the future and thus design their scripts to cope with
>> such a thing. If you can convincingly argue this, then I don't
>> necessarily see it as a complete user breaking change to fix
>> preserve-merges in order to allow it to handle re-ordering properly..
>
> I'd not argue this way myself. If there are out-of-git-tree non-human
> users that accept and tweak todo _generated_ by current "git rebase -p"
> _command_, I also vote for a new option.
>

To be fair, I have not seen anything that actually reads the todo list
and tweaks it in such a manner. The closest example is the git garden
shears script, which simply replaces the todo list.

It's certainly *possible* that such a script exists, though.

Thanks,
Jake

>> I think I lean towards agreeing with Johannes, and that adding
>> recreate-merges and removing preserve-merges is the better solution.
>
> On these grounds it is, no objections.
>
> -- Sergey


Re: Fetch-hooks

2018-02-13 Thread Jeff King
On Sat, Feb 10, 2018 at 07:36:47PM +0100, Leo Gaspard wrote:

> Hmm... would this also drown the remote.<name>.fetch map? Also, I think
> this behavior could be emulated with fetch and fetchTagsTo, and it would
> look like:
> [remote "my-remote"]
> fetch = +refs/heads/*:refs/remotes/my-remote/heads/*
> fetchTagsTo = refs/remotes/my-remote/tags/*
> The remaining issue being to teach the lookup side to look for tags in
> all the remote-tag namespaces (and the fact it's a breaking change).

Right, I think fetching into the right spots is the easy part. Designing
the new lookup rules is the tricky part.

If you're really interested in the gory details, here's a very old
discussion on it:

  
https://public-inbox.org/git/AANLkTi=yfwoaqmhhvlsb1_xmyoe9hhp2yb4h4tqzw...@mail.gmail.com/

I think there may have been some more concrete proposals after that, but
that's what I was able to dig up quickly.

> That said, actually I just noticed an issue in the “add a
> remote.<name>.fetch option to fetch to refs/quarantine then have the
> post-fetch hook do the work”: it means if I run `git pull`, then:
>  1. The remote references will be pulled to refs/quarantine/...
>  2. The post-fetch hook will copy the accepted ones to refs/remotes/...
>  3. The `git merge FETCH_HEAD` called by pull will merge FETCH_HEAD into
> local branches... and so merge from refs/quarantine.

Good point. You can't munge FETCH_HEAD by playing with refspecs.

I am starting to come around to the idea that "pre-fetch" might be the
best way to do what you want. Not to rewrite refs, but perhaps to simply
reject them. In the same way that we allow pre-receive to reject pushed
refs (both are, after all, the final check on admitting new history into
the repository, just in opposite directions).

> So, when thinking about it, I'm back to thinking the proper hook
> interface should be the one of the tweak-fetch hook, but its
> implementation should make it not go crazy on remote servers. And so
> that the implementation should do all this refs/quarantine wizardry
> inside git itself.

So does anybody actually want to be able to adjust the refs as they pass
through? It really sounds like you just want to be able to reject or not
reject the fetch. And that rejecting would be the uncommon case, so it's
OK to just abort the whole operation and expect the user to figure it
out.

-Peff


Re: [PATCH 003/194] object-store: move packed_git and packed_git_mru to object store

2018-02-13 Thread Junio C Hamano
René Scharfe  writes:

> A short semantic patch with a limited time of usefulness and possible
> side-effects can easily be included in a commit message, of course..

Yeah, I think that is Jonathan's favourite approach as well, and I
do not have a problem with that.  This transition spatch is unlike the
"avoid strbuf_addf when we do not have to use it" one, in that there
is a definite cut-over where the old way becomes an invalid C code
and compilers can help us spot unconverted places (as opposed to a
valid but unwanted way that can come into the codebase after one
round of code cleaning is done, which is a useful thing to catch
with the "make coccicheck" target).



Re: [PATCH v2] dir.c: ignore paths containing .git when invalidating untracked cache

2018-02-13 Thread Duy Nguyen
On Wed, Feb 14, 2018 at 12:57 AM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> It's very tempting considering that the amount of changes is much
>> smaller. But I think we should go with my version. The hope is when a
>> _new_ call site appears, the author would think twice before passing
>> zero or one to the safe_path argument.
>
> Wouldn't it be a better API if the author of new callsite does not
> have to think twice and can instead rely on the called function
> untracked_cache_invalidate_path() to always do the right thing?

I am worried that always doing the right thing may carry a performance
penalty (this is based purely on reading verify_path() code, no actual
benchmarking). For safety, you can always set safe_path to zero. But
if you do a lot of invalidation and something starts to slow down,
then you can consider setting safe_path to 1 (if it's actually safe to
do so). I think we do mass invalidation in some cases, so I will try to
actually benchmark that and see if this safe_path argument is
justified or if we can always call verify_path().
-- 
Duy


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Duy Nguyen
On Wed, Feb 14, 2018 at 2:26 AM, Jonathan Nieder  wrote:
>> Duy suggested that we shall not use the repository blindly, but should 
>> carefully
>> examine whether to pass on an object store or the refstore or such[4], which
>> I agree with if it makes sense. This series unfortunately has an issue with 
>> that
>> as I would not want to pass down the `ignore_env` flag separately from the 
>> object
>> store, so I made all functions that only take the object store to have the 
>> raw
>> object store as the first parameter, and others using the full repository.
>
> I think I want to push back on this a little.
>
> The advantage of a function taking e.g. an object_store as an argument
> instead of a repository is that it increases its flexibility, since it
> allows callers that do not have access to a repository to call it.  The
> disadvantage is also that it increases the flexibility without any
> callers benefitting from that:
>
>  1. It ties us to assumptions from today.  If e.g. an object access in
> the future starts relying on some other information from the
> repository (e.g. its config) then we'd have to either add a
> back-pointer from the object store to its repository or add
> additional arguments for that additional data at that point.
>
> If all callers already have a repository, it is simpler to pass
> that repository as context so that we have the flexibility to make
> more use of it later.

It's essentially putting all global variables in the same place again.
Only this time it's not global namespace, but "struct repository".
It's no worse than the current state though.

>
>  2. It complicates the caller.  Instead of consistently passing the
> same repository argument as context to functions that access that
> repository, the caller would have to pull out relevant fields like
> the object store from it.

Well, I see that as a good point :)

>
>  3. It prevents us from making opportunistic use of other information
> from the repository, such as its name for use in error messages.

It does not exactly prevent us. It's just more effort to pass this
around. The repository name, for example: there's no reason we can't
have an object store name (which could be initialized the same as the
repo name).

> In lower-level functions that need to be usable by callers without a
> repository (e.g. to find packfiles in an alternate) it makes sense to
> not pass a repository, but without such a use case in mind

I do agree with your benefit argument. But I'd like to point out
though that having all input to object store visible from something
like "struct raw_object_store" makes it easier to reason about the
code. I know how the object store works, but occasionally I'm still
surprised when it calls getenv() (or reads $GIT_DIR/index, though not in
object store code) behind the scenes. Imagine how hard it is for newcomers.

I would count that as benefit, even though it's not a use case per se.
Another potential benefit is writing unit tests will be much easier
(you can configure object store through struct repository too, but
setting one piece here, one piece there to control object store
behavior is not a nice experience). It's a nice thing to have, but not
a deciding factor.

> I don't think it needs to be a general goal.

My stance is a bit more aggressive. We should try to achieve better
abstraction if possible. But if it makes Stefan's life hell, it's not
worth doing. Converting to 'struct repository' is already a step
forward. Actually if it discourages him from finishing this work, it's
already not worth doing.
-- 
Duy


Re: [PATCH 14/26] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-13 Thread Duy Nguyen
On Tue, Feb 13, 2018 at 8:22 AM, Stefan Beller  wrote:
> Signed-off-by: Stefan Beller 
> ---
>  object-store.h |  3 +--
>  sha1_file.c| 21 +++--
>  2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/object-store.h b/object-store.h
> index d96a16edd1..add1d4e27c 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -61,7 +61,6 @@ struct packed_git {
> char pack_name[FLEX_ARRAY]; /* more */
>  };
>
> -#define prepare_alt_odb(r) prepare_alt_odb_##r()
> -extern void prepare_alt_odb_the_repository(void);
> +void prepare_alt_odb(struct repository *r);
>
>  #endif /* OBJECT_STORE_H */
> diff --git a/sha1_file.c b/sha1_file.c
> index d18ce3aeba..f046d560f8 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -677,21 +677,22 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
> return r;
>  }
>
> -void prepare_alt_odb_the_repository(void)
> +void prepare_alt_odb(struct repository *r)
>  {
> -   const char *alt;
> -
> -   if (the_repository->objects.alt_odb_tail)
> +   if (r->objects.alt_odb_tail)
> return;
>
> -   alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> +   r->objects.alt_odb_tail = &r->objects.alt_odb_list;
> +
> +   if (!r->ignore_env) {
> +   const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);

If one day the majority of git moves to use 'struct repository', then
ALTERNATE_DB_ENVIRONMENT is always ignored because ignore_env is
always true. I think if you ignore_env, then you still need to get
this "alt" from  'struct raw_object_store' (or 'struct repository').

Since you get lots of getenv() in repo_setup_env(), I think this
getenv(ALTERNATE_DB_ENVIRONMENT) belongs there too. Then here, if
ignore_env is true, you read r->objects.alt or something instead of
doing getenv().

I really want to kill this getenv() in this code, which is basically
delayed initialization because we haven't done proper init on
the_repo. I realize that it cannot be done earlier, when
prepare_alt_odb() does not even have a  'struct repository *' to work
with. Would it be ok if I contributed a patch on top of your series to
basically do repo_init(&the_repo) for all builtin/external commands
(and fix all the bugs that come with it)? Then we would not need
ignore_env here anymore.

> +   if (!alt)
> +   alt = "";
>
> -   the_repository->objects.alt_odb_tail =
> -   &the_repository->objects.alt_odb_list;
> -   link_alt_odb_entries(the_repository, alt,
> -PATH_SEP, NULL, 0);
> +   link_alt_odb_entries(r, alt, PATH_SEP, NULL, 0);
> +   }
>
> -   read_info_alternates(the_repository, get_object_directory(), 0);
> +   read_info_alternates(r, r->objects.objectdir, 0);
>  }
>
>  /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
> --
> 2.16.1.73.ga2c3e9663f.dirty
>



-- 
Duy


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Junio C Hamano
Stefan Beller  writes:

> Which is why I'd strongly consider having it only in the repository
> object as that is the largest-scoped thing we'd want. e.g. submodules
> should care about environment variables differently:
>
> GIT_WORK_TREE=~/mysuperproject git checkout \
> --recurse-submodules master
>
>  So with such a command in mind, the environment variable would
> only apply to the superproject and the nested submodules should
> ignore the env, but compute their paths off the superproject, i.e.
> the superproject repository, not its object store?

In the longer term endgame state, what is in the repository object
cannot just be a simple "do we honor environment variable" bit
anymore.  

It will be more like "we may (or may not) look at environment when
we create a repository object", i.e. a bit passed to repo_init()
constructor, and from then on, the repository object knows where the
object store and its alternates are, where the top of the working
tree is, where the repository (i.e. the directory that has "refs/"
in it) is, what "worktree" of the repository we are talking about by
itself.  There is no need for a bit "do we or do we not look at
environment?" that needs to be consulted at runtime, which is quite
a round-about thing.

In your example, the repository object that represents the
superproject will pay attention to GIT_WORK_TREE environment when it
is constructed, and repository objects dynamically constructed while
the command "recurse-submodules" through them will be constructed
with their "where is the top of my working tree?" set appropriately.
They won't be storing "when figuring out where my working tree is,
do not pay attention to GIT_WORK_TREE environment variable" bit.



should "--recurse-submodule" be "--recurse-submodules"?

2018-02-13 Thread Robert P. J. Day

  also just noticed the following:

Documentation/RelNotes/2.13.0.txt:489: * A few commands that recently learned the "--recurse-submodule"
Documentation/RelNotes/2.12.0.txt:226: * "git push --dry-run --recurse-submodule=on-demand" wasn't
Documentation/RelNotes/2.11.1.txt:27: * "git push --dry-run --recurse-submodule=on-demand" wasn't
t/t5531-deep-submodule-push.sh:366: git push --recurse-submodule=check origin master
t/t5572-pull-submodule.sh:45:test_expect_success 'pull --recurse-submodule setup' '

  should some of those be corrected?
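
  a quick way to re-check for any remaining stragglers might be something
  like this (just a sketch; the exact pathspec is a guess):

    git grep -nE -e '--recurse-submodule([^s]|$)' -- Documentation t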

rday


Re: [PATCH v3 11/14] commit: integrate commit graph with commit parsing

2018-02-13 Thread Jonathan Tan
On Thu,  8 Feb 2018 15:37:35 -0500
Derrick Stolee  wrote:

> | Command  | Before | After  | Rel % |
> |--|||---|
> | log --oneline --topo-order -1000 |  5.9s  |  0.7s  | -88%  |
> | branch -vv   |  0.42s |  0.27s | -35%  |
> | rev-list --all   |  6.4s  |  1.0s  | -84%  |
> | rev-list --all --objects | 32.6s  | 27.6s  | -15%  |

Could we have a performance test (in t/perf) demonstrating this?
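
Something along these lines, perhaps (an untested sketch; the file name is
a placeholder, and a real test would also want a setup step that writes
the commit graph file using the command this series adds):

  #!/bin/sh
  # t/perf/pXXXX-commit-graph.sh (placeholder name)
  test_description='commit graph performance'
  . ./perf-lib.sh

  test_perf_default_repo

  test_perf 'log --oneline --topo-order -1000' '
          git log --oneline --topo-order -1000 >/dev/null
  '

  test_perf 'rev-list --all' '
          git rev-list --all >/dev/null
  '

  test_done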

> +static int check_commit_parents(struct commit *item, struct commit_graph *g,
> + uint32_t pos, const unsigned char *commit_data)

Document what this function does? Also, this function probably needs a
better name.

> +/*
> + * Given a commit struct, try to fill the commit struct info, including:
> + *  1. tree object
> + *  2. date
> + *  3. parents.
> + *
> + * Returns 1 if and only if the commit was found in the commit graph.
> + *
> + * See parse_commit_buffer() for the fallback after this call.
> + */
> +int parse_commit_in_graph(struct commit *item)
> +{

The documentation above duplicates what's in the header file, so we can
probably omit it.

> +extern struct object_id *get_nth_commit_oid(struct commit_graph *g,
> + uint32_t n,
> + struct object_id *oid);

This doesn't seem to be used elsewhere - do you plan for a future patch
to use it?


[PATCH v2] Correct mispellings of ".gitmodule" to ".gitmodules"

2018-02-13 Thread Robert P. J. Day
There are a small number of misspellings, ".gitmodule", scattered
throughout the code base, correct them ... no apparent functional
changes.

 Documentation/technical/api-submodule-config.txt | 2 +-
 contrib/subtree/git-subtree.txt  | 2 +-
 submodule-config.c   | 4 ++--
 t/t5526-fetch-submodules.sh  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Signed-off-by: Robert P. J. Day 

---

  can't believe i forgot to sign my own patch. le *sigh* ...

diff --git a/Documentation/technical/api-submodule-config.txt b/Documentation/technical/api-submodule-config.txt
index 3dce003fd..ee907c4a8 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -4,7 +4,7 @@ submodule config cache API
 The submodule config cache API allows to read submodule
 configurations/information from specified revisions. Internally
 information is lazily read into a cache that is used to avoid
-unnecessary parsing of the same .gitmodule files. Lookups can be done by
+unnecessary parsing of the same .gitmodules files. Lookups can be done by
 submodule path or name.

 Usage
diff --git a/contrib/subtree/git-subtree.txt b/contrib/subtree/git-subtree.txt
index 60d76cddd..352deda69 100644
--- a/contrib/subtree/git-subtree.txt
+++ b/contrib/subtree/git-subtree.txt
@@ -28,7 +28,7 @@ as a subdirectory of your application.

 Subtrees are not to be confused with submodules, which are meant for
 the same task. Unlike submodules, subtrees do not need any special
-constructions (like .gitmodule files or gitlinks) be present in
+constructions (like .gitmodules files or gitlinks) be present in
 your repository, and do not force end-users of your
 repository to do anything special or to understand how subtrees
 work. A subtree is just a subdirectory that can be
diff --git a/submodule-config.c b/submodule-config.c
index 2aa8a1747..602ba8ca8 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -9,7 +9,7 @@
 /*
  * submodule cache lookup structure
  * There is one shared set of 'struct submodule' entries which can be
- * looked up by their sha1 blob id of the .gitmodule file and either
+ * looked up by their sha1 blob id of the .gitmodules file and either
  * using path or name as key.
  * for_path stores submodule entries with path as key
  * for_name stores submodule entries with name as key
@@ -91,7 +91,7 @@ static void submodule_cache_clear(struct submodule_cache *cache)
/*
 * We iterate over the name hash here to be symmetric with the
 * allocation of struct submodule entries. Each is allocated by
-* their .gitmodule blob sha1 and submodule name.
+* their .gitmodules blob sha1 and submodule name.
 */
hashmap_iter_init(&cache->for_name, &iter);
while ((entry = hashmap_iter_next(&iter)))
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a552ad4ea..74486c73b 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,7 +485,7 @@ test_expect_success "don't fetch submodule when newly recorded commits are alrea
)
 '

-test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
.gitmodule entry" '
+test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
.gitmodules entry" '
(
cd downstream &&
git fetch --recurse-submodules

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH 0/4] Correct offsets of hunks when one is skipped

2018-02-13 Thread brian m. carlson
On Tue, Feb 13, 2018 at 10:44:04AM +, Phillip Wood wrote:
> From: Phillip Wood 
> 
> While working on a patch series to stage selected lines from a hunk
> without having to edit it I got worried that subsequent patches would
> be applied in the wrong place which lead to this series to correct the
> offsets of hunks following those that are skipped or edited.
> 
> Phillip Wood (4):
>   add -i: add function to format hunk header
>   t3701: add failing test for pathological context lines
>   add -p: Adjust offsets of subsequent hunks when one is skipped
>   add -p: calculate offset delta for edited patches
> 
>  git-add--interactive.perl  | 93 
> +++---
>  t/t3701-add-interactive.sh | 30 +++
>  2 files changed, 102 insertions(+), 21 deletions(-)

This looks reasonably sane to me.  I really like that you managed to
produce failing tests for this situation.  I know pathological cases
like this have bit GCC in the past, so it's good that you fixed this.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] Correct mispellings of ".gitmodule" to ".gitmodules"

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 3:33 PM, Robert P. J. Day  wrote:
>
> There are a small number of misspellings, ".gitmodule", scattered
> throughout the code base, correct them ... no apparent functional
> changes.

Thanks for catching these!


>  Documentation/technical/api-submodule-config.txt | 2 +-
>  contrib/subtree/git-subtree.txt  | 2 +-
>  submodule-config.c   | 4 ++--
>  t/t5526-fetch-submodules.sh  | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)

Please sign off your patch. (See Documentation/SubmittingPatches)

>
> ---
>
>   fairly brainless correction of what appear to be minor misspellings.

the code looks good.

Thanks,
Stefan

>
> diff --git a/Documentation/technical/api-submodule-config.txt 
> b/Documentation/technical/api-submodule-config.txt
> index 3dce003fd..ee907c4a8 100644
> --- a/Documentation/technical/api-submodule-config.txt
> +++ b/Documentation/technical/api-submodule-config.txt
> @@ -4,7 +4,7 @@ submodule config cache API
>  The submodule config cache API allows to read submodule
>  configurations/information from specified revisions. Internally
>  information is lazily read into a cache that is used to avoid
> -unnecessary parsing of the same .gitmodule files. Lookups can be done by
> +unnecessary parsing of the same .gitmodules files. Lookups can be done by
>  submodule path or name.
>
>  Usage
> diff --git a/contrib/subtree/git-subtree.txt b/contrib/subtree/git-subtree.txt
> index 60d76cddd..352deda69 100644
> --- a/contrib/subtree/git-subtree.txt
> +++ b/contrib/subtree/git-subtree.txt
> @@ -28,7 +28,7 @@ as a subdirectory of your application.
>
>  Subtrees are not to be confused with submodules, which are meant for
>  the same task. Unlike submodules, subtrees do not need any special
> -constructions (like .gitmodule files or gitlinks) be present in
> +constructions (like .gitmodules files or gitlinks) be present in
>  your repository, and do not force end-users of your
>  repository to do anything special or to understand how subtrees
>  work. A subtree is just a subdirectory that can be
> diff --git a/submodule-config.c b/submodule-config.c
> index 2aa8a1747..602ba8ca8 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -9,7 +9,7 @@
>  /*
>   * submodule cache lookup structure
>   * There is one shared set of 'struct submodule' entries which can be
> - * looked up by their sha1 blob id of the .gitmodule file and either
> + * looked up by their sha1 blob id of the .gitmodules file and either
>   * using path or name as key.
>   * for_path stores submodule entries with path as key
>   * for_name stores submodule entries with name as key
> @@ -91,7 +91,7 @@ static void submodule_cache_clear(struct submodule_cache 
> *cache)
> /*
>  * We iterate over the name hash here to be symmetric with the
>  * allocation of struct submodule entries. Each is allocated by
> -* their .gitmodule blob sha1 and submodule name.
> +* their .gitmodules blob sha1 and submodule name.
>  */
> hashmap_iter_init(&cache->for_name, &iter);
> while ((entry = hashmap_iter_next(&iter)))
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index a552ad4ea..74486c73b 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -485,7 +485,7 @@ test_expect_success "don't fetch submodule when newly 
> recorded commits are alrea
> )
>  '
>
> -test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
> .gitmodule entry" '
> +test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
> .gitmodules entry" '
> (
> cd downstream &&
> git fetch --recurse-submodules
>
> --
>
> 
> Robert P. J. Day Ottawa, Ontario, CANADA
> http://crashcourse.ca
>
> Twitter:   http://twitter.com/rpjday
> LinkedIn:   http://ca.linkedin.com/in/rpjday
> 


[PATCH] sq_dequote: fix extra consumption of source string

2018-02-13 Thread Jeff King
This fixes a (probably harmless) parsing problem in
sq_dequote_step(), in which we parse some bogus input
incorrectly rather than complaining that it's bogus.

Our shell-dequoting function is very strict: it can unquote
everything generated by sq_quote(), but not arbitrary
strings. In particular, it only allows characters outside of
the single-quoted string if they are immediately backslashed
and then the single-quoted string is resumed. So:

  'foo'\''bar'

is OK. But these are not:

  'foo'\'bar
  'foo'\'
  'foo'\'\''bar'

even though they are all valid shell. The parser has a funny
corner case here. When we see a backslashed character, we
keep incrementing the "src" pointer as we parse it. For a
single sq_dequote() call, that's OK; our next step is to
bail with an error, and we don't care where "src" points.

But if we're parsing multiple strings with sq_dequote_to_argv(),
then our next step is to see if the string is followed by
whitespace. Because we erroneously incremented the "src"
pointer, we don't barf on the bogus backslash that we
skipped. Instead, we may find whitespace that immediately
follows it, and continue as if all is well (skipping the
backslashed character completely!).

In practice, this shouldn't be a big deal. The input is
bogus, and our sq_quote() would never generate this bogus
input. In all but one caller, we are parsing input created
by an earlier call to sq_quote(). That final case is "git
shell", which parses shell-quoting generated by the client.
And in that case we use the singular sq_dequote(), which has
always behaved correctly.

One might also wonder if you could provoke a read past the
end of the string. But the answer is no; we still parse
character by character, and would never advance past a NUL.

This patch implements the minimal fix, along with
documenting the restriction (which confused at least me
while reading the code). We should possibly consider
being more liberal in accepting valid shell-quoted words. I
suspect the code may actually be simpler, and it would be
more friendly to anybody generating or editing input by
hand. But I wanted to fix just the immediate bug in this
patch.

We don't have a direct way to unit-test the sq_dequote()
functions, but we can do this by feeding input to
GIT_CONFIG_PARAMETERS (which is not normally a user-facing
interface, but serves here as it expects to see sq_quote()
input from "git -c"). I've included both a bogus example,
and a related "good" one to confirm that we still parse it
correctly.

Noticed-by: Michael Haggerty 
Signed-off-by: Jeff King 
---
Phew. That was a lot of explanation for a tiny bug. But it
really took me a while to convince myself that there were no
other lurking problems.

 quote.c| 12 +---
 t/t1300-repo-config.sh | 23 +++
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/quote.c b/quote.c
index de2922ddd6..44f47bd3dc 100644
--- a/quote.c
+++ b/quote.c
@@ -94,9 +94,15 @@ static char *sq_dequote_step(char *arg, char **next)
*next = NULL;
return arg;
case '\\':
-   c = *++src;
-   if (need_bs_quote(c) && *++src == '\'') {
-   *dst++ = c;
+   /*
+* Allow backslashed characters outside of
+* single-quotes only if they need escaping,
+* and only if we resume the single-quoted part
+* afterward.
+*/
+   if (need_bs_quote(src[1]) && src[2] == '\'') {
+   *dst++ = src[1];
+   src += 2;
continue;
}
/* Fallthrough */
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index cbeb9bebee..4f8e6f5fde 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1206,6 +1206,29 @@ test_expect_success 'git -c is not confused by empty environment' '
GIT_CONFIG_PARAMETERS="" git -c x.one=1 config --list
 '
 
+sq="'"
+test_expect_success 'detect bogus GIT_CONFIG_PARAMETERS' '
+   cat >expect <<-\EOF &&
+   env.one one
+   env.two two
+   EOF
+   GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq} ${sq}env.two=two${sq}" \
+   git config --get-regexp "env.*" >actual &&
+   test_cmp expect actual &&
+
+   cat >expect <<-EOF &&
+   env.one one${sq}
+   env.two two
+   EOF
+   GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq}\\$sq$sq$sq ${sq}env.two=two${sq}" \
+   git config --get-regexp "env.*" >actual &&
+   test_cmp expect actual &&
+
+   test_must_fail env \
+   GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq}\\$sq ${sq}env.two=two${sq}" \
+   git config --get-regexp "env.*"
+'
+
 test_expect_success 

[PATCH] Correct mispellings of ".gitmodule" to ".gitmodules"

2018-02-13 Thread Robert P. J. Day

There are a small number of misspellings, ".gitmodule", scattered
throughout the code base, correct them ... no apparent functional
changes.

 Documentation/technical/api-submodule-config.txt | 2 +-
 contrib/subtree/git-subtree.txt  | 2 +-
 submodule-config.c   | 4 ++--
 t/t5526-fetch-submodules.sh  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

---

  fairly brainless correction of what appear to be minor misspellings.

diff --git a/Documentation/technical/api-submodule-config.txt 
b/Documentation/technical/api-submodule-config.txt
index 3dce003fd..ee907c4a8 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -4,7 +4,7 @@ submodule config cache API
 The submodule config cache API allows to read submodule
 configurations/information from specified revisions. Internally
 information is lazily read into a cache that is used to avoid
-unnecessary parsing of the same .gitmodule files. Lookups can be done by
+unnecessary parsing of the same .gitmodules files. Lookups can be done by
 submodule path or name.

 Usage
diff --git a/contrib/subtree/git-subtree.txt b/contrib/subtree/git-subtree.txt
index 60d76cddd..352deda69 100644
--- a/contrib/subtree/git-subtree.txt
+++ b/contrib/subtree/git-subtree.txt
@@ -28,7 +28,7 @@ as a subdirectory of your application.

 Subtrees are not to be confused with submodules, which are meant for
 the same task. Unlike submodules, subtrees do not need any special
-constructions (like .gitmodule files or gitlinks) be present in
+constructions (like .gitmodules files or gitlinks) be present in
 your repository, and do not force end-users of your
 repository to do anything special or to understand how subtrees
 work. A subtree is just a subdirectory that can be
diff --git a/submodule-config.c b/submodule-config.c
index 2aa8a1747..602ba8ca8 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -9,7 +9,7 @@
 /*
  * submodule cache lookup structure
  * There is one shared set of 'struct submodule' entries which can be
- * looked up by their sha1 blob id of the .gitmodule file and either
+ * looked up by their sha1 blob id of the .gitmodules file and either
  * using path or name as key.
  * for_path stores submodule entries with path as key
  * for_name stores submodule entries with name as key
@@ -91,7 +91,7 @@ static void submodule_cache_clear(struct submodule_cache 
*cache)
/*
 * We iterate over the name hash here to be symmetric with the
 * allocation of struct submodule entries. Each is allocated by
-* their .gitmodule blob sha1 and submodule name.
+* their .gitmodules blob sha1 and submodule name.
 */
hashmap_iter_init(&cache->for_name, &iter);
while ((entry = hashmap_iter_next(&iter)))
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a552ad4ea..74486c73b 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,7 +485,7 @@ test_expect_success "don't fetch submodule when newly 
recorded commits are alrea
)
 '

-test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
.gitmodule entry" '
+test_expect_success "'fetch.recurseSubmodules=on-demand' works also without 
.gitmodules entry" '
(
cd downstream &&
git fetch --recurse-submodules

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v3 08/14] commit-graph: implement 'git-commit-graph clear'

2018-02-13 Thread Jonathan Tan
On Thu,  8 Feb 2018 15:37:32 -0500
Derrick Stolee  wrote:

> Teach Git to delete the current 'graph_head' file and the commit graph
> it references. This is a good safety valve if somehow the file is
> corrupted and needs to be recalculated. Since the commit graph is a
> summary of contents already in the ODB, it can be regenerated.

Spelling of graph-head (hyphen, not underscore).

I'm not sure of the usefulness of this feature - if the graph is indeed
corrupt, the user can just be instructed to delete graph-head (not even
the commit graph it references, since when we create a new graph-head,
--delete-expired will take care of deleting the old one).

>  extern char *get_graph_head_filename(const char *pack_dir);
> +extern struct object_id *get_graph_head_hash(const char *pack_dir,
> +  struct object_id *hash);
>  extern char* get_commit_graph_filename_hash(const char *pack_dir,
>   struct object_id *hash);

This file is starting to need documentation - in particular, the
difference between get_graph_head_hash() and
get_commit_graph_filename_hash() is not clear.


Re: make git diff output easier to read - use better diff heuristics

2018-02-13 Thread Michael Haggerty
On Tue, Feb 13, 2018 at 7:25 PM, Σπύρος Βαζαίος  wrote:
> While I didn't have the experience to express an opinion on this
> matter, I have to say that the --no-indent-heuristic that Jeff
> suggested worked great.
> There were more than a handful of cases where this issue happened in my
> diff file (all were the same: #endif followed by #ifdef).
> Oh, and the language is C indeed.

The "indent heuristic" algorithm that Git now uses by default is
nothing more than that—a heuristic—so it can be fooled. It bases its
decision on the locations of blank lines and the indentations of
non-blank lines. In the vast majority of cases it gives the same or
better results than the old algorithm, but there are some cases, like
yours, where it gives aesthetically less pleasing (though still
correct) results.
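
If the old behavior works better for a particular repository, the
heuristic can also be switched off persistently rather than per
invocation:

  git config diff.indentHeuristic false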

The algorithm usually handles C code well, but it tends to be confused
by preprocessor directives, because they are not indented like typical
code. It might be possible to tweak the weights to get it to handle
preprocessor directives better, but that causes it to do worse on
other, more common things like Python code (where blocks are preceded
but not followed by a line with lesser indentation).

Doing significantly better probably would require some amount of
language-awareness, but that's a bigger job than I was willing to take
on.

Michael


Re: [PATCH v3 07/14] commit-graph: update graph-head during write

2018-02-13 Thread Jonathan Tan
On Thu,  8 Feb 2018 15:37:31 -0500
Derrick Stolee  wrote:

> It is possible to have multiple commit graph files in a pack directory,
> but only one is important at a time. Use a 'graph_head' file to point
> to the important file. Teach git-commit-graph to write 'graph_head' upon
> writing a new commit graph file.

You should probably include the rationale for a special "graph_head"
file that you describe here [1] in the commit message.

[1] https://public-inbox.org/git/99543db0-26e4-8daa-a580-b618497e4...@gmail.com/

> +char *get_graph_head_filename(const char *pack_dir)
> +{
> + struct strbuf fname = STRBUF_INIT;
> + strbuf_addstr(, pack_dir);
> + strbuf_addstr(, "/graph-head");
> + return strbuf_detach(, 0);

NULL, not 0.

> +}


Why git-revert doesn't invoke the pre-commit and the commit-msg hooks?

2018-02-13 Thread Gustavo Chaves
Using strace I noticed that git-revert invokes only two hooks:
- prepare-commit-msg
- post-commit

But git-commit invokes these four:
- pre-commit
- prepare-commit-msg
- commit-msg
- post-commit

Since git-revert produces a commit, why doesn't it invoke the same
hooks as git-commit?

I couldn't find any discussion about this on the list or elsewhere. So
I'm asking here.

I ended up researching this when I was implementing a hook to detect
and deny commits which revert merge-commits, since they are
troublesome 
(https://www.kernel.org/pub/software/scm/git/docs/howto/revert-a-faulty-merge.html).
I tried to implement it as a commit-msg hook to search for the string
"This reverts commit SHA-1" in the commit message. But git-revert
doesn't invoke the commit-msg hook.

So, for now I implemented my check as a pre-receive hook. But I find
it useful to have all pre-receive checks implemented also as a
pre-commit or a commit-msg hook so that I can detect problems at
commit time instead of only at push time.
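
A minimal sketch of the commit-msg check I had in mind (the message
pattern and the merge test below are only illustrative) would be:

  #!/bin/sh
  # commit-msg hook sketch: reject messages that revert a merge commit
  msgfile="$1"
  sha=$(sed -n 's/^This reverts commit \([0-9a-f]\{7,\}\).*/\1/p' "$msgfile" | head -n 1)
  if test -n "$sha" && git rev-parse --verify --quiet "$sha^2" >/dev/null
  then
          echo "commit-msg: reverting merge commit $sha is not allowed" >&2
          exit 1
  fi
  exit 0

But of course it never runs for git-revert, which is exactly the problem.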

-- 
Gustavo Chaves


Re: [PATCH] t6300-for-each-ref: fix "more than one quoting style" tests

2018-02-13 Thread Jeff King
On Tue, Feb 13, 2018 at 01:36:01AM +0100, SZEDER Gábor wrote:

> 'git for-each-ref' should error out when invoked with more than one
> quoting style options.  The tests checking this have two issues:
> 
>   - They run 'git for-each-ref' upstream of a pipe, hiding its exit
> code, thus not actually checking that 'git for-each-ref' exits
> with an error code.
> 
>   - They check the error message in a rather roundabout way.
> 
> Ensure that 'git for-each-ref' exits with an error code using the
> 'test_must_fail' helper function, and check its error message by
> grepping its saved standard error.

Yeah, this looks much nicer than the original.

>  for i in "--perl --shell" "-s --python" "--python --tcl" "--tcl --perl"; do
>   test_expect_success "more than one quoting style: $i" "
> - git for-each-ref $i 2>&1 | (read line &&
> - case \$line in
> - \"error: more than one quoting style\"*) : happy;;
> - *) false
> - esac)
> + test_must_fail git for-each-ref $i 2>err &&
> + grep '^error: more than one quoting style' err

I suspect in the long run this ought to be test_i18ngrep, but since it's
not localized yet, it makes sense to stop here with this patch.
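
That is, eventually something like:

  test_i18ngrep '^error: more than one quoting style' err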

-Peff


Re: [PATCH v3 05/14] commit-graph: implement 'git-commit-graph write'

2018-02-13 Thread Jonathan Tan
On Thu,  8 Feb 2018 15:37:29 -0500
Derrick Stolee  wrote:

> +test_expect_success 'setup full repo' '
> + rm -rf .git &&
> + mkdir full &&
> + cd full &&
> + git init &&
> + packdir=".git/objects/pack"'

Thanks for simplifying the repo generated in the test. One more style
nit: the final apostrophe goes onto its own line.


Re: git-rebase --undo-skip proposal

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 12:22 PM, Psidium Guajava  wrote:
> Hello git community,
>
> I'd like to add a new feature in git-rebase: --undo-skip.
> But first, I'd like to consult with the experts if it would be
> beneficial for the project and if my line of thought is correct.
>
> Imagine that you are working on a feature for a long time, but there
> are multiple bug fixes happening at `master` branch at the same time.
> After lots of commits on both ends, you decide to rebase your changes
> on top of the current `master` branch.
> After lots of conflict resolution steps, you mistakenly call
> `git-rebase --skip` instead of `git-rebase --continue`, thus losing a
> commit of your work, and possibly inserting bugs in your project.
> The only solution for this right now would be to abort the current
> rebase and start over.
>
> It seems that a feature like this has been requested once on the mailing list
> [1].
>
> I propose the existence of --undo-skip on git-rebase's `$action` domain.
>
> How I fixed it when that happened with me was (just after running the
> wrong skip):
>
> 1. I figured I was making a rebase that used `git-am` as a backend.
> 2. In the rebase-apply directory I searched for the patch file with
> the change I just skipped.
> 3. Found the `rebase-apply/next` file.
> 4. Wrote the number of the patch I skipped, minus 1, into rebase-apply/next.
> 5. Ran `git rebase --skip` again on the repository.
>
> This made the lost patch appear again and I could `--continue` it this time.

I think this could also be done with "git rebase --edit-todo", which brings
up the right file in your editor.

> I propose the addition of an action `--undo-skip`, that could be
> called only after a wrongfully called `--skip`.
> `git rebase --undo-skip`.
> I would implement it to do programmatically what I did by hand when
> that happened to me.
>
> Here are my questions for you:
> 1. Would this be beneficial for the users?

I guess it is.

> 2. For `rebase--am`, I would need to change `git-rebase--am.sh` file, correct?
> 3. Can I assume `builtin/am.c` will always store its information on
> `$state_dir/next` and `$state_dir/$patchnumbers`?
> 4. How hard would it be to add that logic for `rebase--interactive`
> and `rebase--merge` backends?

cc'd Johannes who is currently working on revamping rebase.

>
> Also, a little unrelated with this issue:
> 5. What happened to the rewrite of rebase in C [2]? I couldn't find
> any information after 2016.
>
> [1] https://public-inbox.org/git/201311011522.44631.tho...@koch.ro/
> [2] 
> https://public-inbox.org/git/1457779597-6918-1-git-send-email-pyoka...@gmail.com/

cc'd Paul Tan, maybe he recalls the situation.


Re: [PATCH 2/4] t3701: add failing test for pathological context lines

2018-02-13 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> When a hunk is skipped by add -i the offsets of subsequent hunks are
> not adjusted to account for any missing insertions due to the skipped
> hunk. Most of the time this does not matter as apply uses the context
> lines to apply the subsequent hunks in the correct place, however in
> pathological cases the context lines will match at the now incorrect
> offset and the hunk will be applied in the wrong place. The offsets of

Good.  The --recount "feature" on the receiving end does not have
enough information to do as good a job as the code on the side that
produces the patch to be applied, and this goes in the right
direction.



git-rebase --undo-skip proposal

2018-02-13 Thread Psidium Guajava
Hello git community,

I'd like to add a new feature in git-rebase: --undo-skip.
But first, I'd like to consult with the experts if it would be
beneficial for the project and if my line of thought is correct.

Imagine that you are working on a feature for a long time, but there
are multiple bug fixes happening at `master` branch at the same time.
After lots of commits on both ends, you decide to rebase your changes
on top of the current `master` branch.
After lots of conflict resolution steps, you mistakenly call
`git-rebase --skip` instead of `git-rebase --continue`, thus losing a
commit of your work, and possibly inserting bugs in your project.
The only solution for this right now would be to abort the current
rebase and start over.

It seems that a feature like this has been requested once on the mailing list [1].

I propose the existence of --undo-skip on git-rebase's `$action` domain.

How I fixed it when that happened with me was (just after running the
wrong skip):

1. I figured I was making a rebase that used `git-am` as a backend.
2. In the rebase-apply directory I searched for the patch file with
the change I just skipped.
3. Found the `rebase-apply/next` file.
4. Wrote the number of the patch I skipped, minus 1, into rebase-apply/next.
5. Ran `git rebase --skip` again on the repository.

This made the lost patch appear again and I could `--continue` it this time.

I propose the addition of an action `--undo-skip`, that could be
called only after a wrongfully called `--skip`.
`git rebase --undo-skip`.
I would implement it to do programmatically what I did by hand when
that happened to me.

Here are my questions for you:
1. Would this be beneficial for the users?
2. For `rebase--am`, I would need to change `git-rebase--am.sh` file, correct?
3. Can I assume `builtin/am.c` will always store its information on
`$state_dir/next` and `$state_dir/$patchnumbers`?
4. How hard would it be to add that logic for `rebase--interactive`
and `rebase--merge` backends?

Also, a little unrelated with this issue:
5. What happened to the rewrite of rebase in C [2]? I couldn't find
any information after 2016.

[1] https://public-inbox.org/git/201311011522.44631.tho...@koch.ro/
[2] 
https://public-inbox.org/git/1457779597-6918-1-git-send-email-pyoka...@gmail.com/


Best Regards,
Gabriel Borges


Re: [PATCH v2 0/2] Refactor hash search with fanout table

2018-02-13 Thread Jonathan Tan
On Tue, 13 Feb 2018 11:57:16 -0800
Junio C Hamano  wrote:

> Jonathan Tan  writes:
> 
> > Updates from v1:
> >  - use uint32_t so that we can operate on packfiles of up to 4G objects
> >(this also means that I had to change the signature of the function)
> >  - don't hide types
> >
> > Derrick: you'll need to slightly change your patch to use the new API.
> > As for find_abbrev_len_for_pack(), that's a good idea - I didn't do it
> > in this set but it definitely should be done.
> >
> > Jonathan Tan (2):
> >   packfile: remove GIT_DEBUG_LOOKUP log statements
> >   packfile: refactor hash search with fanout table
> 
> Hmm, is this meant to replace the topic that was merged to 'next'
> last week?

Yes - René pointed out [1] that V1 of my patch series (which you merged
to 'next') does not handle packfiles of more than 2G pack entries, so I
sent out a new version. Yes, this replaces jt/binsearch-with-fanout.
Sorry for not being clearer.

[1] https://public-inbox.org/git/cfbde137-dbac-8796-f49f-2a543303d...@web.de/


Re: [PATCH v2 0/2] Refactor hash search with fanout table

2018-02-13 Thread Junio C Hamano
Jonathan Tan  writes:

> Updates from v1:
>  - use uint32_t so that we can operate on packfiles of up to 4G objects
>(this also means that I had to change the signature of the function)
>  - don't hide types
>
> Derrick: you'll need to slightly change your patch to use the new API.
> As for find_abbrev_len_for_pack(), that's a good idea - I didn't do it
> in this set but it definitely should be done.
>
> Jonathan Tan (2):
>   packfile: remove GIT_DEBUG_LOOKUP log statements
>   packfile: refactor hash search with fanout table

Hmm, is this meant to replace the topic that was merged to 'next'
last week?

>
>  packfile.c| 29 -
>  sha1-lookup.c | 28 
>  sha1-lookup.h | 22 ++
>  3 files changed, 54 insertions(+), 25 deletions(-)


Re: [PATCH 02/26] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 10:51 AM, Brandon Williams  wrote:

>> +#include "cache.h"
>> +
>
> Maybe we would want to move the definition of the alternate store to
> this header file?  That way we don't need to include cache.h here.
>

Sounds good. I'll take a look at that.

Stefan


Re: [PATCH 003/194] object-store: move packed_git and packed_git_mru to object store

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 10:52 AM, René Scharfe  wrote:
> Am 12.02.2018 um 22:48 schrieb Junio C Hamano:
>> René Scharfe  writes:
>>
>>> Am 12.02.2018 um 22:04 schrieb Junio C Hamano:
>>>> Stefan Beller  writes:
>>>>
>>>>> I thought it may be helpful
>>>>> for merging this series with the rest of the evolved code base which
>>>>> may make use of one of the converted functions. So instead of fixing
>>>>> that new instance manually, coccinelle could do that instead.
>>>>
>>>> Having the .cocci used for the conversion *somewhere* would indeed
>>>> be helpful, as it allows me to (1) try reproducing this patch by
>>>> somebody else using the file and following the steps in order to
>>>> audit this patch and (2) catch new places that need to be migrated
>>>> in in-flight topics.
>>>>
>>>> But placing it in contrib/coccinelle/ has other side effects.
>>>
>>> Running "make coccicheck" takes longer.  What other downsides are
>>> there?
>>
>> Once the global variable packed_git has been migrated out of
>> existence, no new code that relies on it would be referring to that
>> global variable.  If coccicheck finds something, the suggested rewrite
>> would be turning an unrelated packed_git (which may not even be the
>> right type) to a reference to a field in a global variable, that
>> would certainly be wrong.
>
> Ugh, yes.  The semantic patch in question doesn't contain any type
> information.  I don't know how to match a variable by name *and* type.
> Here's the closest I can come up with to a safe and complete
> transformation, but it only handles assignments:
>
> @@
> struct packed_git *A;
> identifier B = packed_git;
> @@
> - A = B
> + A = the_repository->objects.packed_git
>

I'll need to play more with coccinelle. :) Thanks for this patch.

> Seeing the many for loops, I'm tempted to suggest adding a
> for_each_packed_git macro to hide the global variable and thus reduce
> the number of places to change at cutover.  Coccinelle doesn't seem
> to like them, though.
>
> A short semantic patch with a limited time of usefulness and possible
> side-effects can easily be included in a commit message, of course..

In the shorter reroll I moved the semantic patch into a subdir of
contrib/coccinelle, but in the next reroll I'll just put it into the
commit message.

Thanks,
Stefan


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 11:35 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> For now the ignore_env bit lives in the repository; judging from its
>> comments, that helps when working with submodules.
>> Unfortunately 359efeffc1 (repository: introduce the repository
>> object, 2017-06-22) did not reason about the existence of the ignore_env
>> flag in its commit message.
>>
>>> So from that point of view, it may not matter where the "bit" lives,
>>> among repository, object-store, or ref-store.
>>
>> It matters on the scale of confusing the developer?
>
> What I meant is that, from that point of view, having the bit in the
> data (not on the invocation) is confusing no matter which data
> structure holds it--they are equally bad.

Right.

Which is why I'd strongly consider having it only in the repository
object as that is the largest-scoped thing we'd want. e.g. submodules
should care about environment variables differently:

GIT_WORK_TREE=~/mysuperproject git checkout \
--recurse-submodules master

 So with such a command in mind, the environment variable would
only apply to the superproject and the nested submodules should
ignore the env, but compute their paths off the superproject, i.e.
the superproject repository, not its object store?


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Junio C Hamano
Stefan Beller  writes:

> For now the ignore_env bit lives in the repository; judging from its
> comments, that helps when working with submodules.
> Unfortunately 359efeffc1 (repository: introduce the repository
> object, 2017-06-22) did not reason about the existence of the ignore_env
> flag in its commit message.
>
>> So from that point of view, it may not matter where the "bit" lives,
>> among repository, object-store, or ref-store.
>
> It matters on the scale of confusing the developer?

What I meant is that, from that point of view, having the bit in the
data (not on the invocation) is confusing no matter which data
structure holds it--they are equally bad.


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Brandon Williams
On 02/12, Stefan Beller wrote:
> This is a real take on the first part of the recent RFC[1].
> 
> Jonathan Tan suggested[2] that "sha1_loose_object_info to handle arbitrary 
> repositories"
> might be a good breaking point for a first part of that RFC, at patch 38.
> This series is smaller and contains only 26 patches as the patches in the big
> RFC were slightly out of order.
> 
> I developed this series partly by writing patches, but mostly by cherrypicking
> from that RFC on top of current master. I noticed no external conflicts apart
> from one addition to the repositories _INIT macro, which was easy to resolve.
> 
> Comments in the early range of that RFC were on 003 where Junio pointed out
> that the coccinelle patch ought to be not in contrib/coccinelle, so I put it
> in a sub directory there, as 'make coccicheck' doesn't traverse subdirs.
> 
> brian had a question on patch 25 in the RFC, but that seemed to resolve itself
> without any suggestion to include in this series[3].
> 
> Duy suggested that we shall not use the repository blindly, but should 
> carefully
> examine whether to pass on an object store or the refstore or such[4], which 
> I agree with if it makes sense. This series unfortunately has an issue with 
> that
> as I would not want to pass down the `ignore_env` flag separately from the 
> object
> store, so I made all functions that only take the object store to have the raw
> object store as the first parameter, and others using the full repository.
> 
> Eric Sunshine brought up memory leaks with the RFC, and I would think to
> have plugged all holes.

I've looked through the patches and I think they look good.  At the end
of the series all the #define tricks have been eliminated so we don't
have to worry about them possibly being left and forgotten :)


Thanks for getting the ball rolling on this.

> 
> [1] https://public-inbox.org/git/20180205235508.216277-1-sbel...@google.com/
> [2] 
> https://public-inbox.org/git/20180207143300.ce1c39ca07f6a0d64fe0e...@google.com/
> [3] 
> https://public-inbox.org/git/20180206011940.gd7...@genre.crustytoothpaste.net/
> [4] 
> https://public-inbox.org/git/cacsjy8cggekpx4czkyytsprj87uqvkzsol7fyt__p2dh_1l...@mail.gmail.com/
> 
> Thanks,
> Stefan
> 
> Jonathan Nieder (8):
>   pack: move prepare_packed_git_run_once to object store
>   pack: move approximate object count to object store
>   sha1_file: add repository argument to sha1_file_name
>   sha1_file: add repository argument to map_sha1_file
>   sha1_file: allow stat_sha1_file to handle arbitrary repositories
>   sha1_file: allow open_sha1_file to handle arbitrary repositories
>   sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
>   sha1_file: allow sha1_loose_object_info to handle arbitrary
> repositories
> 
> Stefan Beller (18):
>   repository: introduce raw object store field
>   object-store: move alt_odb_list and alt_odb_tail to object store
>   object-store: free alt_odb_list
>   object-store: move packed_git and packed_git_mru to object store
>   object-store: close all packs upon clearing the object store
>   sha1_file: add raw_object_store argument to alt_odb_usable
>   sha1_file: add repository argument to link_alt_odb_entry
>   sha1_file: add repository argument to read_info_alternates
>   sha1_file: add repository argument to link_alt_odb_entries
>   sha1_file: add repository argument to prepare_alt_odb
>   sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
>   sha1_file: allow prepare_alt_odb to handle arbitrary repositories
>   sha1_file: add repository argument to stat_sha1_file
>   sha1_file: add repository argument to open_sha1_file
>   sha1_file: add repository argument to map_sha1_file_1
>   sha1_file: add repository argument to sha1_loose_object_info
>   sha1_file: allow sha1_file_name to handle arbitrary repositories
>   sha1_file: allow map_sha1_file to handle arbitrary repositories
> 
>  builtin/am.c|   2 +-
>  builtin/clone.c |   2 +-
>  builtin/count-objects.c |   6 +-
>  builtin/fetch.c |   2 +-
>  builtin/fsck.c  |  13 ++-
>  builtin/gc.c|   4 +-
>  builtin/grep.c  |   2 +-
>  builtin/index-pack.c|   1 +
>  builtin/merge.c |   2 +-
>  builtin/pack-objects.c  |  21 ++--
>  builtin/pack-redundant.c|   6 +-
>  builtin/receive-pack.c  |   3 +-
>  cache.h |  46 ++--
>  contrib/coccinelle/refactoring/packed_git.cocci |   7 ++
>  environment.c   |   5 +-
>  fast-import.c   |   6 +-
>  http-backend.c  |   6 +-
>  http-push.c 

Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> This is a real take on the first part of the recent RFC[1].
>
> Jonathan Tan suggested[2] that "sha1_loose_object_info to handle arbitrary 
> repositories"
> might be a good breaking point for a first part of that RFC, at patch 38.
> This series is smaller and contains only 26 patches as the patches in the big
> RFC were slightly out of order.

Thanks.  This looks like a nice reviewable series, so I'm happy to see
it broken out.

[...]
> Comments in the early range of that RFC were on 003 where Junio pointed out
> that the coccinelle patch ought to be not in contrib/coccinelle, so I put it
> in a sub directory there, as 'make coccicheck' doesn't traverse subdirs.

Can you say a little more about this?  Was the problem that the
semantic patch wasn't idempotent, that it was too slow to run, or
something else?

If we're including the semantic patch for reference but never running
it, I think I'd prefer it to go in the commit message.  But if it's
useful to run then we should make it idempotent so it can go in
contrib/coccinelle.

[...]
> Duy suggested that we shall not use the repository blindly, but should 
> carefully
> examine whether to pass on an object store or the refstore or such[4], which
> I agree with if it makes sense. This series unfortunately has an issue with 
> that
> as I would not want to pass down the `ignore_env` flag separately from the 
> object
> store, so I made all functions that only take the object store to have the raw
> object store as the first parameter, and others using the full repository.

I think I want to push back on this a little.

The advantage of a function taking e.g. an object_store as an argument
instead of a repository is that it increases its flexibility, since it
allows callers that do not have access to a repository to call it.  The
disadvantage is also that it increases the flexibility without any
callers benefitting from that:

 1. It ties us to assumptions from today.  If e.g. an object access in
the future starts relying on some other information from the
repository (e.g. its config) then we'd have to either add a
back-pointer from the object store to its repository or add
additional arguments for that additional data at that point.

If all callers already have a repository, it is simpler to pass
that repository as context so that we have the flexibility to make
more use of it later.

 2. It complicates the caller.  Instead of consistently passing the
same repository argument as context to functions that access that
repository, the caller would have to pull out relevant fields like
the object store from it.

 3. It prevents us from making opportunistic use of other information
from the repository, such as its name for use in error messages.

In lower-level functions that need to be usable by callers without a
repository (e.g. to find packfiles in an alternate) it makes sense to
not pass a repository, but without such a use case in mind I don't
think it needs to be a general goal.

To put it another way, most callers do not *care* whether they are
working with a repository's object store, ref database, or some other
aspect of the repository.  They just know they want to e.g. read an
object from this repository.

It's similar to how FILE * works: some operations rely on the buffer
the FILE * manages and some other operations only rely on the
underlying file descriptor, but using the FILE * consistently provides
a clean abstraction that generally makes life easier.

> Eric Sunshine brought up memory leaks with the RFC, and I would think to
> have plugged all holes.

Yay, thank you!

I'll try to find time to look at the patches in detail soon, but no
promises (i.e. if someone else reviews them first, then even better
;-)).

Sincerely,
Jonathan


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 10:57 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Oh, that is an interesting perspective. Here is how I arrived at the opposite
>> conclusion initially: Searching for 'ignore_env' shows that we care about it
>> as well for the index and graft paths, which are not the object store, hence
>> it would be better kept in the repository. (The alternative would be to
>> duplicate it into the raw object store, but I do not like data duplication)
>>
>> But maybe it is better to duplicate this one bit instead of passing through
>> a larger scoped object.
>
> If a larger scoped repository refers to a smaller scoped
> object-store, is there still a need to duplicate that bit, instead
> of referring to the bit the smaller scoped one has when asking about
> the bit in the larger scoped one?

No (in theory). But in practice it may be worthwhile:

"What's the value of this ref?"

"Oh let me check the ignore_env bit that happens
to live in the object store first."

would be super confusing to me.

> I am not sure if these "do not look at environment variables" is an
> attribute of these entities---it sounds more like an attribute for
> each invocation of an operation, i.e. "I want to learn where the
> index is but would ignore GIT_INDEX environment for this particular
> query." and "What's the value of this ref?  Please honor the
> common-dir environment during this query".

That sounds like we want to have a configset struct eventually.

For now the ignore_env bit lives in the repository; judging from its
comments, that helps when working with submodules.
Unfortunately 359efeffc1 (repository: introduce the repository
object, 2017-06-22) did not reason about the existence of the ignore_env
flag in its commit message.

> So from that point of view, it may not matter where the "bit" lives,
> among repository, object-store, or ref-store.

It matters on the scale of confusing the developer?

Stefan


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Junio C Hamano
Stefan Beller  writes:

> Oh, that is an interesting perspective. Here is how I arrived at the opposite
> conclusion initially: Searching for 'ignore_env' shows that we care about it
> as well for the index and graft paths, which are not the object store, hence
> it would be better kept in the repository. (The alternative would be to
> duplicate it into the raw object store, but I do not like data duplication)
>
> But maybe it is better to duplicate this one bit instead of passing through
> a larger scoped object.

If a larger scoped repository refers to a smaller scoped
object-store, is there still a need to duplicate that bit, instead
of referring to the bit the smaller scoped one has when asking about
the bit in the larger scoped one?

I am not sure if these "do not look at environment variables" is an
attribute of these entities---it sounds more like an attribute for
each invocation of an operation, i.e. "I want to learn where the
index is but would ignore GIT_INDEX environment for this particular
query." and "What's the value of this ref?  Please honor the
common-dir environment during this query".

So from that point of view, it may not matter where the "bit" lives,
among repository, object-store, or ref-store.



Re: [PATCH 003/194] object-store: move packed_git and packed_git_mru to object store

2018-02-13 Thread René Scharfe
Am 12.02.2018 um 22:48 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Am 12.02.2018 um 22:04 schrieb Junio C Hamano:
>>> Stefan Beller  writes:
>>>
>>>> I thought it may be helpful
>>>> for merging this series with the rest of the evolved code base which
>>>> may make use of one of the converted functions. So instead of fixing
>>>> that new instance manually, coccinelle could do that instead.
>>>
>>> Having the .cocci used for the conversion *somewhere* would indeed
>>> be helpful, as it allows me to (1) try reproducing this patch by
>>> somebody else using the file and following the steps in order to
>>> audit this patch and (2) catch new places that need to be migrated
>>> in in-flight topics.
>>>
>>> But placing it in contrib/coccinelle/ has other side effects.
>>
>> Running "make coccicheck" takes longer.  What other downsides are
>> there?
> 
> Once the global variable packed_git has been migrated out of
> existence, no new code that relies on it would be referring to that
> global variable.  If coccicheck finds something, the suggested rewrite
> would be turning an unrelated packed_git (which may not even be the
> right type) to a reference to a field in a global variable, that
> would certainly be wrong.

Ugh, yes.  The semantic patch in question doesn't contain any type
information.  I don't know how to match a variable by name *and* type.
Here's the closest I can come up with to a safe and complete
transformation, but it only handles assignments:

@@
struct packed_git *A;
identifier B = packed_git;
@@
- A = B
+ A = the_repository->objects.packed_git

Seeing the many for loops, I'm tempted to suggest adding a
for_each_packed_git macro to hide the global variable and thus reduce
the number of places to change at cutover.  Coccinelle doesn't seem
to like them, though.

A short semantic patch with a limited time of usefulness and possible
side-effects can easily be included in a commit message, of course..

René


Re: [PATCH v2 0/2] Refactor hash search with fanout table

2018-02-13 Thread Derrick Stolee

On 2/13/2018 1:39 PM, Jonathan Tan wrote:

> Updates from v1:
>  - use uint32_t so that we can operate on packfiles of up to 4G objects
>    (this also means that I had to change the signature of the function)


I take it "4G objects" means "4 billion objects". The distinction is 
only important that the limit is not a 4GB IDX file, but in fact a large 
enough object count that the IDX file can be 100+ GB.
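
(Rough arithmetic, assuming the v2 idx format: 4G entries * (20-byte SHA-1
+ 4-byte CRC + 4-byte offset) is already about 120 GB, before counting
the fanout and the 64-bit offset table.)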



>  - don't hide types
>
> Derrick: you'll need to slightly change your patch to use the new API.
> As for find_abbrev_len_for_pack(), that's a good idea - I didn't do it
> in this set but it definitely should be done.


Thanks. I'll try to remember to do that change when this lands in 
master. I'll bring the new prototype into my commit-graph patch.


Reviewed-By: Derrick Stolee 


Re: [PATCH 02/26] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-13 Thread Brandon Williams
On 02/12, Stefan Beller wrote:
> In a process with multiple repositories open, alternates should be
> associated to a single repository and not shared globally. Move
> alt_odb_list and alt_odb_tail into the_repository and adjust callers
> to reflect this.
> 
> Now that the alternative object data base is per repository, we're
> leaking its memory upon freeing a repository. The next patch plugs
> this hole.
> 
> No functional change intended.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  builtin/fsck.c |  4 +++-
>  cache.h|  4 ++--
>  object-store.h |  7 ++-
>  packfile.c |  3 ++-
>  sha1_file.c| 25 -
>  sha1_name.c|  3 ++-
>  6 files changed, 27 insertions(+), 19 deletions(-)
> 
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 04846d46f9..1048255da1 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "cache.h"
> +#include "repository.h"
>  #include "config.h"
>  #include "commit.h"
>  #include "tree.h"
> @@ -694,7 +695,8 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   fsck_object_dir(get_object_directory());
>  
>   prepare_alt_odb();
> - for (alt = alt_odb_list; alt; alt = alt->next)
> + for (alt = the_repository->objects.alt_odb_list;
> + alt; alt = alt->next)
>   fsck_object_dir(alt->path);
>  
>   if (check_full) {
> diff --git a/cache.h b/cache.h
> index d8b975a571..918c2f15b4 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1573,7 +1573,7 @@ extern int has_dirs_only_path(const char *name, int 
> len, int prefix_len);
>  extern void schedule_dir_for_removal(const char *name, int len);
>  extern void remove_scheduled_dirs(void);
>  
> -extern struct alternate_object_database {
> +struct alternate_object_database {
>   struct alternate_object_database *next;
>  
>   /* see alt_scratch_buf() */
> @@ -1591,7 +1591,7 @@ extern struct alternate_object_database {
>   struct oid_array loose_objects_cache;
>  
>   char path[FLEX_ARRAY];
> -} *alt_odb_list;
> +};
>  extern void prepare_alt_odb(void);
>  extern char *compute_alternate_path(const char *path, struct strbuf *err);
>  typedef int alt_odb_fn(struct alternate_object_database *, void *);
> diff --git a/object-store.h b/object-store.h
> index cf35760ceb..44b8f84753 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -1,14 +1,19 @@
>  #ifndef OBJECT_STORE_H
>  #define OBJECT_STORE_H
>  
> +#include "cache.h"
> +

Maybe we would want to move the definition of the alternate store to
this header file?  That way we don't need to include cache.h here.

>  struct raw_object_store {
>   /*
>* Path to the repository's object store.
>* Cannot be NULL after initialization.
>*/
>   char *objectdir;
> +
> + struct alternate_object_database *alt_odb_list;
> + struct alternate_object_database **alt_odb_tail;
>  };
> -#define RAW_OBJECT_STORE_INIT { NULL }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, NULL }
>  
>  void raw_object_store_clear(struct raw_object_store *o);
>  
> diff --git a/packfile.c b/packfile.c
> index 4a5fe7ab18..d61076faaf 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1,6 +1,7 @@
>  #include "cache.h"
>  #include "mru.h"
>  #include "pack.h"
> +#include "repository.h"
>  #include "dir.h"
>  #include "mergesort.h"
>  #include "packfile.h"
> @@ -880,7 +881,7 @@ void prepare_packed_git(void)
>   return;
>   prepare_packed_git_one(get_object_directory(), 1);
>   prepare_alt_odb();
> - for (alt = alt_odb_list; alt; alt = alt->next)
> + for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
>   prepare_packed_git_one(alt->path, 0);
>   rearrange_packed_git();
>   prepare_packed_git_mru();
> diff --git a/sha1_file.c b/sha1_file.c
> index 3da70ac650..2826d5d6ed 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -22,6 +22,7 @@
>  #include "pack-revindex.h"
>  #include "sha1-lookup.h"
>  #include "bulk-checkin.h"
> +#include "repository.h"
>  #include "streaming.h"
>  #include "dir.h"
>  #include "mru.h"
> @@ -346,9 +347,6 @@ static const char *alt_sha1_path(struct 
> alternate_object_database *alt,
>   return buf->buf;
>  }
>  
> -struct alternate_object_database *alt_odb_list;
> -static struct alternate_object_database **alt_odb_tail;
> -
>  /*
>   * Return non-zero iff the path is usable as an alternate object database.
>   */
> @@ -368,7 +366,7 @@ static int alt_odb_usable(struct strbuf *path, const char 
> *normalized_objdir)
>* Prevent the common mistake of listing the same
>* thing twice, or object directory itself.
>*/
> - for (alt = alt_odb_list; alt; alt = alt->next) {
> + for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
>   if 

Re: [PATCH v2] docs/interpret-trailers: fix agreement error

2018-02-13 Thread Jonathan Tan
On Tue, 13 Feb 2018 02:23:52 +
"brian m. carlson"  wrote:

> In the description of git interpret-trailers, we describe "a group…of
> lines" that have certain characteristics.  Ensure both options
> describing this group use a singular verb for parallelism.
> 
> Signed-off-by: brian m. carlson 
> ---
>  Documentation/git-interpret-trailers.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/git-interpret-trailers.txt 
> b/Documentation/git-interpret-trailers.txt
> index 9dd19a1dd9..ff446f15f7 100644
> --- a/Documentation/git-interpret-trailers.txt
> +++ b/Documentation/git-interpret-trailers.txt
> @@ -51,7 +51,7 @@ with only spaces at the end of the commit message part, one 
> blank line
>  will be added before the new trailer.
>  
>  Existing trailers are extracted from the input message by looking for
> -a group of one or more lines that (i) are all trailers, or (ii) contains at
> +a group of one or more lines that (i) is all trailers, or (ii) contains at
>  least one Git-generated or user-configured trailer and consists of at
>  least 25% trailers.
>  The group must be preceded by one or more empty (or whitespace-only) lines.

This looks good to me, thanks.


[PATCH v2 0/2] Refactor hash search with fanout table

2018-02-13 Thread Jonathan Tan
Updates from v1:
 - use uint32_t so that we can operate on packfiles of up to 4G objects
   (this also means that I had to change the signature of the function)
 - don't hide types

Derrick: you'll need to slightly change your patch to use the new API.
As for find_abbrev_len_for_pack(), that's a good idea - I didn't do it
in this set but it definitely should be done.

Jonathan Tan (2):
  packfile: remove GIT_DEBUG_LOOKUP log statements
  packfile: refactor hash search with fanout table

 packfile.c| 29 -
 sha1-lookup.c | 28 
 sha1-lookup.h | 22 ++
 3 files changed, 54 insertions(+), 25 deletions(-)

-- 
2.16.0.rc1.238.g530d649a79-goog



[PATCH v2 1/2] packfile: remove GIT_DEBUG_LOOKUP log statements

2018-02-13 Thread Jonathan Tan
In commit 628522ec1439 ("sha1-lookup: more memory efficient search in
sorted list of SHA-1", 2008-04-09), a different algorithm for searching
a sorted list was introduced, together with a set of log statements
guarded by GIT_DEBUG_LOOKUP that are invoked both when using that
algorithm and when using the existing binary search. Those log
statements were meant for experiments and debugging, but with the removal
of the aforementioned different algorithm in commit f1068efefe6d
("sha1_file: drop experimental GIT_USE_LOOKUP search", 2017-08-09),
those log statements are probably no longer necessary.

Remove those statements.

Signed-off-by: Jonathan Tan 
---
 packfile.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/packfile.c b/packfile.c
index 4a5fe7ab1..58bdced3b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1713,10 +1713,6 @@ off_t find_pack_entry_one(const unsigned char *sha1,
const uint32_t *level1_ofs = p->index_data;
const unsigned char *index = p->index_data;
unsigned hi, lo, stride;
-   static int debug_lookup = -1;
-
-   if (debug_lookup < 0)
-   debug_lookup = !!getenv("GIT_DEBUG_LOOKUP");
 
if (!index) {
if (open_pack_index(p))
@@ -1738,17 +1734,10 @@ off_t find_pack_entry_one(const unsigned char *sha1,
index += 4;
}
 
-   if (debug_lookup)
-   printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n",
-  sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
-
while (lo < hi) {
unsigned mi = lo + (hi - lo) / 2;
int cmp = hashcmp(index + mi * stride, sha1);
 
-   if (debug_lookup)
-   printf("lo %u hi %u rg %u mi %u\n",
-  lo, hi, hi - lo, mi);
if (!cmp)
return nth_packed_object_offset(p, mi);
if (cmp > 0)
-- 
2.16.0.rc1.238.g530d649a79-goog



[PATCH v2 2/2] packfile: refactor hash search with fanout table

2018-02-13 Thread Jonathan Tan
Subsequent patches will introduce file formats that make use of a fanout
array and a sorted table containing hashes, just like packfiles.
Refactor the hash search in packfile.c into its own function, so that
those patches can make use of it as well.

Signed-off-by: Jonathan Tan 
---
 packfile.c| 18 --
 sha1-lookup.c | 28 
 sha1-lookup.h | 22 ++
 3 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/packfile.c b/packfile.c
index 58bdced3b..59648a182 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1712,7 +1712,8 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 {
const uint32_t *level1_ofs = p->index_data;
const unsigned char *index = p->index_data;
-   unsigned hi, lo, stride;
+   unsigned stride;
+   uint32_t result;
 
if (!index) {
if (open_pack_index(p))
@@ -1725,8 +1726,6 @@ off_t find_pack_entry_one(const unsigned char *sha1,
index += 8;
}
index += 4 * 256;
-   hi = ntohl(level1_ofs[*sha1]);
-   lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
if (p->index_version > 1) {
stride = 20;
} else {
@@ -1734,17 +1733,8 @@ off_t find_pack_entry_one(const unsigned char *sha1,
index += 4;
}
 
-   while (lo < hi) {
-   unsigned mi = lo + (hi - lo) / 2;
-   int cmp = hashcmp(index + mi * stride, sha1);
-
-   if (!cmp)
-   return nth_packed_object_offset(p, mi);
-   if (cmp > 0)
-   hi = mi;
-   else
-   lo = mi+1;
-   }
+   if (bsearch_hash(sha1, level1_ofs, index, stride, ))
+   return nth_packed_object_offset(p, result);
return 0;
 }
 
diff --git a/sha1-lookup.c b/sha1-lookup.c
index 4cf3ebd92..8d0b1db3e 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -99,3 +99,31 @@ int sha1_pos(const unsigned char *sha1, void *table, size_t 
nr,
} while (lo < hi);
return -lo-1;
 }
+
+int bsearch_hash(const unsigned char *sha1, const uint32_t *fanout_nbo,
+const unsigned char *table, size_t stride, uint32_t *result)
+{
+   uint32_t hi, lo;
+
+   hi = ntohl(fanout_nbo[*sha1]);
+   lo = ((*sha1 == 0x0) ? 0 : ntohl(fanout_nbo[*sha1 - 1]));
+
+   while (lo < hi) {
+   unsigned mi = lo + (hi - lo) / 2;
+   int cmp = hashcmp(table + mi * stride, sha1);
+
+   if (!cmp) {
+   if (result)
+   *result = mi;
+   return 1;
+   }
+   if (cmp > 0)
+   hi = mi;
+   else
+   lo = mi + 1;
+   }
+
+   if (result)
+   *result = lo;
+   return 0;
+}
diff --git a/sha1-lookup.h b/sha1-lookup.h
index cf5314f40..7678b23b3 100644
--- a/sha1-lookup.h
+++ b/sha1-lookup.h
@@ -7,4 +7,26 @@ extern int sha1_pos(const unsigned char *sha1,
void *table,
size_t nr,
sha1_access_fn fn);
+
+/*
+ * Searches for sha1 in table, using the given fanout table to determine the
+ * interval to search, then using binary search. Returns 1 if found, 0 if not.
+ *
+ * Takes the following parameters:
+ *
+ *  - sha1: the hash to search for
+ *  - fanout_nbo: a 256-element array of NETWORK-order 32-bit integers; the
+ *integer at position i represents the number of elements in table whose
+ *first byte is less than or equal to i
+ *  - table: a sorted list of hashes with optional extra information in between
+ *  - stride: distance between two consecutive elements in table (should be
+ *GIT_MAX_RAWSZ or greater)
+ *  - result: if not NULL, this function stores the element index of the
+ *position found (if the search is successful) or the index of the least
+ *element that is greater than sha1 (if the search is not successful)
+ *
+ * This function does not verify the validity of the fanout table.
+ */
+int bsearch_hash(const unsigned char *sha1, const uint32_t *fanout_nbo,
+const unsigned char *table, size_t stride, uint32_t *result);
 #endif
-- 
2.16.0.rc1.238.g530d649a79-goog



Re: make git diff output easier to read - use better diff heuristics

2018-02-13 Thread Σπύρος Βαζαίος
Hi guys,

While I didn't have the experience to express an opinion on this
matter, I have to say that the --no-indent-heuristic that Jeff
suggested worked great.
There were more than a handful of cases where this issue happened in my
diff file (all were the same: #endif followed by #ifdef).
Oh, and the language is C indeed.

Thanks for the great support!

On Tue, Feb 13, 2018 at 8:11 PM, Stefan Beller  wrote:
> On Tue, Feb 13, 2018 at 8:08 AM, Jeff King  wrote:
>> On Tue, Feb 13, 2018 at 05:06:15PM +0200, Σπύρος Βαζαίος wrote:
>>
>>> Hi, I've come across an issue when using the git diff command. In
>>> particular the diff is different to what the svn diff produces. While
>>> both being correct the output of the svn diff is easier to understand
>>> than the git diff one. See the following issue on github where I
>>> initially reported the issue:
>>>
>>> https://github.com/git-for-windows/git/issues/1494
>>>
>>> I have Included a picture to better illustrate the problem. What do
>>> you think? Is it possible to make git diff output similar to svn diff
>>> regarding this issue?
>>
>> Try "git diff --no-indent-heuristic", which makes your example look
>> better. Here's a quick reproduction:
>>
>> -- >8 --
>> cat >foo.c <<\EOF
>> static struct foo bar[] = {
>> #ifdef SOMETHING
>> { "stats.info", MNU_GBX_FSTAINF, etc },
>> { "expired.info", MNU_GBX_FSTAINF, etc },
>> { "info.log", MNU_GBX_INFOLOG, etc },
>> #endif
>> { NULL, 0, 0 },
>> };
>> EOF
>>
>> sed '6a\
>> #ifdef WITH_EMU\
>> { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
>> #endif
>> ' bar.c
>> -- 8< --
>>
>> Now this looks ugly:
>>
>>   git diff --no-index foo.c bar.c
>>
>> but this does not:
>>
>>   git diff --no-index --no-indent-heuristic foo.c bar.c
>>
>> That heuristic is described in 433860f3d0 (diff: improve positioning of
>> add/delete blocks in diffs, 2016-09-05). I'm not sure exactly why it
>> does the wrong thing here, or if it would be possible to tweak the
>> weighting factors to make it work.
>
> https://github.com/git/git/commit/433860f3d0
>
> I would think that the long lines at the shift boundaries[1] have
> a bad impact on the algorithm, as they are of different lengths
> and influence the backstepping value. (vague theory)
>
> I wonder if we want to add language specifics to the heuristic,
> such that a '}' or 'end' in a given line is a strong signal that the
> cut should be right after that line? I.e. that would decrease the
> badness score.
>
> While this is not a general solution (but quite language specific),
> I would think this is still a good idea to look at.
> (When coming up with the heuristic, most people thought the bad
> examples would come from exotic languages, but not C, which we
> are all so familiar with that we thought we had a good grasp of
> how to make a heuristic that works with C. Note that the example
> looks like C, though).
>
> [1]
> { "info.log", MNU_GBX_INFOLOG, etc },
> vs
> { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
>
>
> (slightly unrelated tangent on better diffs in general:)
> A couple days ago I had another discussion on how
> diffs are best presented to users. That discussion was
> focused on a slightly higher layer, the diff algorithm itself,
> instead of a post-algorithm boost-fixup.
>
> It is usually easier both to write and to review code that
> adds completely new features or projects than to squash
> bugs or integrate subtle new features into an existing code base.
>
> This observation led to the conclusion that reviewing large
> blocks of adjacent new lines is easier than reviewing blocks
> where deletions as well as additions are found. So we theorized
> that deletions should have more impact than additions when
> computing the diff itself. The case presented here has no deletions
> so this is not applicable, but most examples that came up once
> 2477ab2ea8 (diff: support anchoring line(s), 2017-11-27)
> was discussed would be likely to have better diffs.
>
> Stefan


Re: "git submodule" vs "git subtree" vs "repo" vs "git subdir" ... ?

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 5:06 AM, Robert P. J. Day  wrote:
>
>   looking for general opinions ... i am (frighteningly :-) teaching a
> git course later this week, and one of the topics on the list is git
> submodules, which was specifically requested by the client as their
> idea of how to start incorporating child repos in new projects.
>
>   however, given the number of articles written about the drawbacks
> with submodules, i wanted to throw in a section about "git subtree" as
> well, even though (as discussed earlier) there is some minor dispute
> as to whether "git subtree" is part of "core" git, but i'm not going
> to let that stop me.
>
>   going even beyond that, there is also google's "repo" command, which
> i seem to see more and more often, like here for automotive grade
> linux:
>
>   https://wiki.automotivelinux.org/agl-distro/source-code
>
> and it would be a shame to at least not mention that as yet another
> possibility.

Please note that Google would prefer to get rid of the repo tool.
(It was made as a stopgap solution until submodules are good enough,
i.e. have a UX comparable to repo's. But as you know, stopgap
solutions reliably hold up for quite a long time. :) repo has issues of its own,
fundamental issues such as its data model, as well as minor things like
a complete lack of tests.)

>   and then there are unofficial, hand-rolled solutions, like
> "git-subdir":
>
>   https://github.com/andreyvit/git-subdir
>
> given that the client does not appear to be wedded to any particular
> solution yet, i'm open to recommendations or pointers to online
> comparisons that i'll collect and post on a single wiki page, and they
> can peruse the comparisons at their leisure.

There are a couple of these. I came across these recently
http://gitslave.sourceforge.net/
https://github.com/ingydotnet/git-subrepo

>   so ... thoughts? no need to be verbose, just links to decent online
> discussions/comparisons would be massively useful.
>
> rday
>
> p.s. oh, pointers to well-designed usage of any of the above would be
> handy as well. as i mentioned, for "repo", there's AGL. for
> submodules, i might use boost:

If your copy of Git is recent, look at the man pages for submodules
such as d48034551a (submodules: overhaul documentation, 2017-06-22)
or 4f73a7f124 (Doc/gitsubmodules: make some changes to improve
readability and syntax, 2018-01-14)

>   https://github.com/boostorg/boost/wiki/Getting-Started
>
> and so on. thank you kindly.

Stefan


Re: make git diff output easier to read - use better diff heuristics

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 8:08 AM, Jeff King  wrote:
> On Tue, Feb 13, 2018 at 05:06:15PM +0200, Σπύρος Βαζαίος wrote:
>
>> Hi, I've come across an issue when using the git diff command. In
>> particular the diff is different to what the svn diff produces. While
>> both being correct the output of the svn diff is easier to understand
>> than the git diff one. See the following issue on github where I
>> initially reported the issue:
>>
>> https://github.com/git-for-windows/git/issues/1494
>>
>> I have Included a picture to better illustrate the problem. What do
>> you think? Is it possible to make git diff output similar to svn diff
>> regarding this issue?
>
> Try "git diff --no-indent-heuristic", which makes your example look
> better. Here's a quick reproduction:
>
> -- >8 --
> cat >foo.c <<\EOF
> static struct foo bar[] = {
> #ifdef SOMETHING
> { "stats.info", MNU_GBX_FSTAINF, etc },
> { "expired.info", MNU_GBX_FSTAINF, etc },
> { "info.log", MNU_GBX_INFOLOG, etc },
> #endif
> { NULL, 0, 0 },
> };
> EOF
>
> sed '6a\
> #ifdef WITH_EMU\
> { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
> #endif
> ' bar.c
> -- 8< --
>
> Now this looks ugly:
>
>   git diff --no-index foo.c bar.c
>
> but this does not:
>
>   git diff --no-index --no-indent-heuristic foo.c bar.c
>
> That heuristic is described in 433860f3d0 (diff: improve positioning of
> add/delete blocks in diffs, 2016-09-05). I'm not sure exactly why it
> does the wrong thing here, or if it would be possible to tweak the
> weighting factors to make it work.

https://github.com/git/git/commit/433860f3d0

I would think that the long lines at the shift boundaries[1] have
a bad impact on the algorithm, as they are of different lengths
and influence the backstepping value. (vague theory)

I wonder if we want to add language specifics to the heuristic,
such that a '}' or 'end' in a given line is a strong signal that the
cut should be right after that line? I.e. that would decrease the
badness score.

While this is not a general solution (but quite language specific),
I would think this is still a good idea to look at.
(When coming up with the heuristic, most people thought the bad
examples would come from exotic languages, but not C, which we
are all so familiar with that we thought we had a good grasp of
how to make a heuristic that works with C. Note that the example
looks like C, though).

[1]
{ "info.log", MNU_GBX_INFOLOG, etc },
vs
{ "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\


(slightly unrelated tangent on better diffs in general:)
A couple days ago I had another discussion on how
diffs are best presented to users. That discussion was
focused on a slightly higher layer, the diff algorithm itself,
instead of a post-algorithm boost-fixup.

It is usually easier both to write and to review code that
adds completely new features or projects than to squash
bugs or integrate subtle new features into an existing code base.

This observation led to the conclusion that reviewing large
blocks of adjacent new lines is easier than reviewing blocks
where deletions as well as additions are found. So we theorized
that deletions should have more impact than additions when
computing the diff itself. The case presented here has no deletions
so this is not applicable, but most examples that came up once
2477ab2ea8 (diff: support anchoring line(s), 2017-11-27)
was discussed would be likely to have better diffs.

Stefan


Re: [PATCH 2/2] t5556: replace test_i18ngrep with a simple grep

2018-02-13 Thread Junio C Hamano
Jeff King  writes:

> If I understand Gábor's patch correctly, it is using test_i18ngrep for
> the specific lines we care about so that we don't have to worry about
> other cruft lines that may or may not appear (including the hangup one).
>
> The downside is that we would not notice if a _new_ error message
> (beyond the ones we expect and the one we were explicitly ignoring)
> appeared. IMHO that's probably fine.

Ah, OK, I didn't notice how the multi-line one was handled.  Unable
to notice new error messages and undisturbed by possible "hung up"
messages are two sides of the same coin---I myself am unsure if it
is a good trade-off, but I'm inclined to defer to the judgment of two
people ;-)


Re: [PATCH v2] dir.c: ignore paths containing .git when invalidating untracked cache

2018-02-13 Thread Junio C Hamano
Duy Nguyen  writes:

> It's very tempting considering that the amount of changes is much
> smaller. But I think we should go with my version. The hope is that when a
> _new_ call site appears, the author would think twice before passing
> zero or one to the safe_path argument.

Wouldn't it be a better API if the author of a new callsite does not
have to think twice and can instead rely on the called function
untracked_cache_invalidate_path() to always do the right thing?




Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Stefan Beller
On Tue, Feb 13, 2018 at 4:13 AM, Duy Nguyen  wrote:
> On Tue, Feb 13, 2018 at 6:49 PM, Duy Nguyen  wrote:
>> On Mon, Feb 12, 2018 at 05:22:15PM -0800, Stefan Beller wrote:
>>> This is a real take on the first part of the recent RFC[1].
>>>
>>> ...
>>>
>>> Duy suggested that we shall not use the repository blindly, but
>>> should carefully examine whether to pass on an object store or the
>>> refstore or such[4], which I agree with if it makes sense. This
>>> series unfortunately has an issue with that as I would not want to
>>> pass down the `ignore_env` flag separately from the object store, so
>>> I made all functions that only take the object store to have the raw
>>> object store as the first parameter, and others using the full
>>> repository.
>>
>> Second proposal :) How about you store ignore_env in raw_object_store?
>> This would not be the first time an object has some configuration
>> passed in at construction time. And it has a "constructor" now,
>> raw_object_store_init() (I probably should merge _setup in it too)
>
> A bit more on this configuration parameters. Down the road I think we
> need something like this anyway to delete global config vars like
> packed_git_window_size, delta_base_cache_limit...  Either all these
> end up in raw_object_store, or raw_object_store holds a link to
> "struct config_set".

That makes sense long term.

> The ignore_env specifically though looks to me like a stop gap
> solution until everything goes through repo_init() first. At that
> point we don't have to delay getenv() anymore. We can getenv() all at
> repo_init() then pass them in raw_object_store and ignore_env should
> be gone. So sticking it inside raw_object_store _temporarily_ does not
> sound so bad.

Oh, that is an interesting perspective. Here is how I arrived at the opposite
conclusion initially: Searching for 'ignore_env' shows that we care about it
as well for the index and graft paths, which are not the object store, hence
it would be better kept in the repository. (The alternative would be to
duplicate it into the raw object store, but I do not like data duplication)

But maybe it is better to duplicate this one bit instead of passing through
a larger scoped object.

I'll rework the patches.

Thanks!
Stefan


Re: linux-next: unnecessary merge in the v4l-dvb tree

2018-02-13 Thread Linus Torvalds
On Tue, Feb 13, 2018 at 9:18 AM, Junio C Hamano  wrote:
>
> That makes me wonder if another heuristic I floated earlier is more
> appropriate.  When merging a tag object T, if refs/tags/T exists and
> it is that tag object, then an updated "merge" would default to "--ff";
> otherwise, it would keep the current default of creating a merge even
> when we could fast-forward, in order to record that tag T in the
> resulting history.

Oooh. Yes, that sounds like the right thing to do.

So the "no fast-forward" logic would trigger only if the name we used
for merging is one of the temporary ones (ie .git/{FETCH,MERGE}_HEAD),
not if the mentioned tag is already a normal tag reference.

Then it's very explicitly about "don't lose the signing information".

I'd still have to teach people to use "--ff-only" if they don't do the
"fetch and merge" model but literally just do  "git pull upstream
vX.Y", but at least the case Mauro describes would automatically just
DTRT.

Me likey.

   Linus


Re: [PATCH 2/2] t5556: replace test_i18ngrep with a simple grep

2018-02-13 Thread Jeff King
On Tue, Feb 13, 2018 at 09:08:44AM -0800, Junio C Hamano wrote:

> SZEDER Gábor  writes:
> 
> > A third possible fix, which is also in the "we don't care about the
> > order of multiple warning messages" camp and has a nice looking
> > diffstat, would be something like this:
> 
> Hmph, we are running a "git fetch" locally and observing the error
> output from both "fetch" and its counterpart "upload-pack", aren't
> we?  The "fetch" instances that are run with test_must_fail are
> expected to stop talking to "upload-pack" by detecting an error and
sever the connection abruptly---depending on the relative timing
> between the processes, the other side may try to read and diagnose
> "the remote end hung up unexpectedly", no?  

If I understand Gábor's patch correctly, it is using test_i18ngrep for
the specific lines we care about so that we don't have to worry about
other cruft lines that may or may not appear (including the hangup one).

The downside is that we would not notice if a _new_ error message
(beyond the ones we expect and the one we were explicitly ignoring)
appeared. IMHO that's probably fine.

-Peff


Re: linux-next: unnecessary merge in the v4l-dvb tree

2018-02-13 Thread Junio C Hamano
Mauro Carvalho Chehab  writes:

> Yes, that's my pain. I don't want ff only when pulling from others,
> only when pulling from upstream tree.
>
>> 
>> We may want per-remote equivalent for it, i.e. e.g.
>> 
>>  [pull]
>>  ff=false ;# good default for collecting contributions
>> 
>>  [remote "torvalds"] 
>>  pullFF = only ;# good default for catching up
>> 
>> or something like that, perhaps?
>
>
> Yeah, something like that works. Please notice, however, that what I
> usually do is:
>
>   $ git remote update torvalds
>   $ git merge <tag>
> (or git pull . <tag>)
>
> So, for the above to work, it should store somehow the remote from
> where a tag came from.

That makes me wonder if another heuristic I floated earlier is more
appropriate.  When merging a tag object T, if refs/tags/T exists and
it is that tag object, then an updated "merge" would default to "--ff";
otherwise, it would keep the current default of creating a merge even
when we could fast-forward, in order to record that tag T in the
resulting history.

Of course, end users can use command line options to override such
heuristics anyway, but if the behaviour based on the new heuristic
is easy to explain and understand, and covers the majority of the use
cases without command line override, then we might not even need a
new configuration mechanism like remote.torvalds.pullFF mentioned
above.
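
For illustration, the command-line overrides mentioned above would look
something like this (remote and ref names are made up):

  # catching up with an upstream tag: insist on a fast-forward
  git pull --ff-only torvalds v4.16-rc1

  # taking a contributor's work: always record a merge commit
  git merge --no-ff --log contributor-topic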


Re: [PATCH 2/2] t5556: replace test_i18ngrep with a simple grep

2018-02-13 Thread Junio C Hamano
SZEDER Gábor  writes:

> A third possible fix, which is also in the "we don't care about the
> order of multiple warning messages" camp and has a nice looking
> diffstat, would be something like this:

Hmph, we are running a "git fetch" locally and observing the error
output from both "fetch" and its counterpart "upload-pack", aren't
we?  The "fetch" instances that are run with test_must_fail are
expected to stop talking to "upload-pack" by detecting an error and
sever the connection abruptly---depending on the relative timing
between the processes, the other side may try to read and diagnose
"the remote end hung up unexpectedly", no?  

I think "grep -v" filtering is an attempt to protect the test from
getting confused by that output, but is it safe not to worry about
it these days?

> diff --git a/t/t5536-fetch-conflicts.sh b/t/t5536-fetch-conflicts.sh
> index 2e42cf3316..91f28c2f78 100755
> --- a/t/t5536-fetch-conflicts.sh
> +++ b/t/t5536-fetch-conflicts.sh
> @@ -18,14 +18,6 @@ setup_repository () {
>   )
>  }
>  
> -verify_stderr () {
> - cat >expected &&
> - # We're not interested in the error
> - # "fatal: The remote end hung up unexpectedly":
> - test_i18ngrep -E '^(fatal|warning):' actual 
> | sort &&
> - test_i18ncmp expected actual
> -}
> -
>  test_expect_success 'setup' '
>   git commit --allow-empty -m "Initial" &&
>   git branch branch1 &&
> @@ -48,9 +40,7 @@ test_expect_success 'fetch conflict: config vs. config' '
>   "+refs/heads/branch2:refs/remotes/origin/branch1" && (
>   cd ccc &&
>   test_must_fail git fetch origin 2>error &&
> - verify_stderr <<-\EOF
> - fatal: Cannot fetch both refs/heads/branch1 and 
> refs/heads/branch2 to refs/remotes/origin/branch1
> - EOF
> + test_i18ngrep "fatal: Cannot fetch both refs/heads/branch1 and 
> refs/heads/branch2 to refs/remotes/origin/branch1" error
>   )
>  '


Re: [PATCH v3 22/35] upload-pack: support shallow requests

2018-02-13 Thread Brandon Williams
On 02/07, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams  wrote:
> > Add the 'shallow' feature to the protocol version 2 command 'fetch'
> > which indicates that the server supports shallow clients and deepen
> > requests.
> >
> > Signed-off-by: Brandon Williams 
> > ---
> >  Documentation/technical/protocol-v2.txt |  67 +++-
> >  serve.c |   2 +-
> >  t/t5701-git-serve.sh|   2 +-
> >  upload-pack.c   | 138 
> > +++-
> >  upload-pack.h   |   3 +
> >  5 files changed, 173 insertions(+), 39 deletions(-)
> >
> > diff --git a/Documentation/technical/protocol-v2.txt 
> > b/Documentation/technical/protocol-v2.txt
> > index 4d5096dae..fedeb6b77 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -201,12 +201,42 @@ packet-lines:
> > to its base by position in pack rather than by an oid.  That is,
> > they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
> >
> > +shallow 
> > +   A client must notify the server of all objects for which it only
> 
> s/all objects/all commits/ for preciseness
> 
> > +   has shallow copies of (meaning that it doesn't have the parents
> > +   of a commit) by supplying a 'shallow ' line for each such
> > +   object so that the server is aware of the limitations of the
> > +   client's history.
> > +
> > +deepen 
> > +   Request that the fetch/clone should be shallow having a commit 
> > depth of
> > +relative to the remote side.
> 
> What does depth mean? number of commits, or number of edges?
> Are there any special numbers (-1, 0, 1, max int) ?
> 
> From reading ahead: "Cannot be used with deepen-since, but
> can be combined with deepen-relative" ?

It just uses the current logic, which has no documentation on any of
that so...I'm not really sure?

> 
> 
> > +
> > +deepen-relative
> > +   Requests that the semantics of the "deepen" command be changed
> > +   to indicate that the depth requested is relative to the clients
> > +   current shallow boundary, instead of relative to the remote
> > +   refs.
> > +
> > +deepen-since 
> > +   Requests that the shallow clone/fetch should be cut at a
> > +   specific time, instead of depth.  Internally it's equivalent of
> > +   doing "rev-list --max-age=". Cannot be used with
> > +   "deepen".
> > +
> > +deepen-not 
> > +   Requests that the shallow clone/fetch should be cut at a
> > +   specific revision specified by '', instead of a depth.
> > +   Internally it's equivalent of doing "rev-list --not ".
> > +   Cannot be used with "deepen", but can be used with
> > +   "deepen-since".
> 
> What happens if those are given in combination?

Should act as an AND, it uses the old logic and there isn't very much
documentation on that...
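
For orientation, these protocol commands mirror what the existing fetch
options already request today; a rough mapping, with made-up arguments:

  git fetch --depth=5 origin                    # deepen 5
  git fetch --deepen=2 origin                   # deepen 2 + deepen-relative
  git fetch --shallow-since=2017-01-01 origin   # deepen-since
  git fetch --shallow-exclude=v2.14.0 origin    # deepen-not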

-- 
Brandon Williams


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Brandon Williams
On 02/13, Duy Nguyen wrote:
> On Tue, Feb 13, 2018 at 6:49 PM, Duy Nguyen  wrote:
> > On Mon, Feb 12, 2018 at 05:22:15PM -0800, Stefan Beller wrote:
> >> This is a real take on the first part of the recent RFC[1].
> >>
> >> ...
> >>
> >> Duy suggested that we shall not use the repository blindly, but
> >> should carefully examine whether to pass on an object store or the
> >> refstore or such[4], which I agree with if it makes sense. This
> >> series unfortunately has an issue with that as I would not want to
> >> pass down the `ignore_env` flag separately from the object store, so
> >> I made all functions that only take the object store to have the raw
> >> object store as the first parameter, and others using the full
> >> repository.
> >
> > Second proposal :) How about you store ignore_env in raw_object_store?
> > This would not be the first time an object has some configuration
> > passed in at construction time. And it has a "constructor" now,
> > raw_object_store_init() (I probably should merge _setup in it too)
> 
> A bit more on this configuration parameters. Down the road I think we
> need something like this anyway to delete global config vars like
> packed_git_window_size, delta_base_cache_limit...  Either all these
> end up in raw_object_store, or raw_object_store holds a link to
> "struct config_set".
> 
> The ignore_env specifically though looks to me like a stop gap
> solution until everything goes through repo_init() first. At that
> point we don't have to delay getenv() anymore. We can getenv() all at
> repo_init() then pass them in raw_object_store and ignore_env should
> be gone. So sticking it inside raw_object_store _temporarily_ does not
> sound so bad.

I like this approach. At the moment we are replicating a single
bit of data, but that allows us to limit the scope of where a
repository struct is passed, giving us a better abstraction layer.

> -- 
> Duy

-- 
Brandon Williams


Re: make git diff output easier to read - use better diff heuristics

2018-02-13 Thread Jeff King
On Tue, Feb 13, 2018 at 05:06:15PM +0200, Σπύρος Βαζαίος wrote:

> Hi, I've come across an issue when using the git diff command. In
> particular the diff is different to what the svn diff produces. While
> both being correct the output of the svn diff is easier to understand
> than the git diff one. See the following issue on github where I
> initially reported the issue:
> 
> https://github.com/git-for-windows/git/issues/1494
> 
> I have included a picture to better illustrate the problem. What do
> you think? Is it possible to make git diff output similar to svn diff
> regarding this issue?

Try "git diff --no-indent-heuristic", which makes your example look
better. Here's a quick reproduction:

-- >8 --
cat >foo.c <<\EOF
static struct foo bar[] = {
#ifdef SOMETHING
{ "stats.info", MNU_GBX_FSTAINF, etc },
{ "expired.info", MNU_GBX_FSTAINF, etc },
{ "info.log", MNU_GBX_INFOLOG, etc },
#endif
{ NULL, 0, 0 },
};
EOF

sed '6a\
#ifdef WITH_EMU\
{ "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
#endif
' <foo.c >bar.c
-- 8< --

Now this looks ugly:

  git diff --no-index foo.c bar.c

but this does not:

  git diff --no-index --no-indent-heuristic foo.c bar.c

That heuristic is described in 433860f3d0 (diff: improve positioning of
add/delete blocks in diffs, 2016-09-05). I'm not sure exactly why it
does the wrong thing here, or if it would be possible to tweak the
weighting factors to make it work.
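
Side note: if the heuristic keeps getting in the way for certain files,
the same thing can be set persistently via configuration:

  git config diff.indentHeuristic false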

-Peff


Re: Fetch-hooks

2018-02-13 Thread Leo Gaspard
On 02/12/2018 08:23 PM, Brandon Williams wrote:
> Maybe this isn't helpful but you may be able to implement this by using
> a remote-helper.  The helper could perform any sort of caching it needed
> to prevent re-downloading large amounts of data that is potentially
> thrown away, while only sending through the relevant commits which
> satisfy some criteria (signed, etc.).

This looks like a possibility, thanks! I'll likely try to implement it
this way if there can be no consensus on the features wanted for a
fetch-hook (as the current state of discussion looks like to me).


[no subject]

2018-02-13 Thread mavis lilian wanczyk




This is the second time i am sending you this mail.

I, Mavis Wanczyk donates $ 5 Million Dollars from part of my Powerball  
Jackpot Lottery of $ 758 Million Dollars, respond with your details  
for claims.


I await your earliest response and God Bless you

Good luck.
Mavis Wanczyk







make git diff output easier to read - use better diff heuristics

2018-02-13 Thread Σπύρος Βαζαίος
Hi, I've come across an issue when using the git diff command. In
particular the diff is different to what the svn diff produces. While
both being correct the output of the svn diff is easier to understand
than the git diff one. See the following issue on github where I
initially reported the issue:

https://github.com/git-for-windows/git/issues/1494

I have included a picture to better illustrate the problem. What do
you think? Is it possible to make git diff output similar to svn diff
regarding this issue?


[no subject]

2018-02-13 Thread imani


-- 
Hello,

I have a proposal for you.kindly get back to me

Best Regards.
Sherry Brydson

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: "git submodule" vs "git subtree" vs "repo" vs "git subdir" ... ?

2018-02-13 Thread Nicolas Morey-Chaisemartin


On 13/02/2018 at 14:06, Robert P. J. Day wrote:
>   looking for general opinions ... i am (frighteningly :-) teaching a
> git course later this week, and one of the topics on the list is git
> submodules, which was specifically requested by the client as their
> idea of how to start incorporating child repos in new projects.
>
>   however, given the number of articles written about the drawbacks
> with submodules, i wanted to throw in a section about "git subtree" as
> well, even though (as discussed earlier) there is some minor dispute
> as to whether "git subtree" is part of "core" git, but i'm not going
> to let that stop me.

I've dealt with a large number of submodules and subtrees in the last years so 
here are my 2 cents.
The first question to ask is the most important one:
Do the repos really need to be split at all?

In a previous life, we started with one repo per "small project", each
containing a couple of binaries or libs.
Everything was put together through submodules.
After way too long to admit, we realized that it was just not worth it. We had
to deal with so many corner cases in CI and other things for no real gain.
The reality is that git works perfectly well with bigger projects and is smart
enough that people working on two different "subprojects" don't create conflicts.
So we started merging a lot of it back into the main repo and that was it.

I see some advantages submodules have over subtrees:
- It's slightly easier to work in a subproject when using submodules than with
subtrees (at least for git newcomers).
Once you have made sure your submodule is on a branch, you just go into it and
work as in a standard git repository.
When using subtree, you have to know you're in a subtree and do your subtree
push/pull to resync with someone working directly in the subproject.
At the time, conflicts in subtrees were sometimes a bit weird and created
history full of merges.

- submodules are great when working with huge subrepos.
We have a repository that had among its submodules: linux, gcc, gdb, glibc.
This means a real lot of code/files and commits.
Subtree would work, but you'd end up with a humongous repo and, as expected,
everything is slowed down.

- submodules work well when you are moving between versions and branches of a 
subproject.
Let me clarify with an example:
We had a repo that had ffmpeg as a submodule with a large patch series from us.
We would regularly create a new branch in the subrepo with all our patches on 
top of a new ffmpeg upstream release.
And simply point the top project to this SHA1. And it worked great.
I wouldn't risk doing that kind of thing with subtree. I may be wrong here (I
haven't tried it, to be honest) but the push/pull approach doesn't seem to fit
the idea of moving from a SHA1 to a seemingly unrelated one.

Now that I'm working with reasonable repo sizes again, and linear history, I 
mostly stick with subtrees.
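
For readers new to both, the day-to-day commands being contrasted above
look roughly like this (prefix, URL and branch names are made up):

  # submodule: the child directory is a normal clone; work inside it,
  # then record the new child commit in the superproject
  cd libfoo && git commit -am "fix" && git push && cd ..
  git add libfoo && git commit -m "bump libfoo"

  # subtree: the child lives inside the single repo; resync explicitly
  git subtree pull --prefix=libfoo https://example.com/libfoo.git master
  git subtree push --prefix=libfoo https://example.com/libfoo.git fix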

Nicolas


[PATCH 1/1] Mark messages for translations

2018-02-13 Thread Alexander Shopov
Small changes in messages to fit the style and typography of the rest.
Reuse already translated messages if possible.
Do not translate messages aimed at developers of git.
Fix unit tests depending on the original string.
Use `test_i18ngrep` for tests with translatable strings.
Change and verify rest of tests via `make GETTEXT_POISON=1 test`.

Signed-off-by: Alexander Shopov 
---
 git.c  | 38 +-
 setup.c| 62 +-
 t/t0002-gitfile.sh |  4 +--
 t/t0008-ignores.sh |  2 +-
 t/t1506-rev-parse-diagnosis.sh |  2 +-
 5 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/git.c b/git.c
index c870b9719..5ddcb75d4 100644
--- a/git.c
+++ b/git.c
@@ -5,11 +5,11 @@
 #include "run-command.h"
 
 const char git_usage_string[] =
-   "git [--version] [--help] [-C ] [-c name=value]\n"
-   "   [--exec-path[=]] [--html-path] [--man-path] 
[--info-path]\n"
-   "   [-p | --paginate | --no-pager] [--no-replace-objects] 
[--bare]\n"
-   "   [--git-dir=] [--work-tree=] 
[--namespace=]\n"
-   "[]";
+   N_("git [--version] [--help] [-C ] [-c =]\n"
+  "   [--exec-path[=]] [--html-path] [--man-path] 
[--info-path]\n"
+  "   [-p | --paginate | --no-pager] [--no-replace-objects] 
[--bare]\n"
+  "   [--git-dir=] [--work-tree=] 
[--namespace=]\n"
+  "[]");
 
 const char git_more_info_string[] =
N_("'git help -a' and 'git help -g' list available subcommands and 
some\n"
@@ -92,7 +92,7 @@ static int handle_options(const char ***argv, int *argc, int 
*envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "--git-dir")) {
if (*argc < 2) {
-   fprintf(stderr, "No directory given for 
--git-dir.\n" );
+   fprintf(stderr, _("no directory given for 
--git-dir\n" ));
usage(git_usage_string);
}
setenv(GIT_DIR_ENVIRONMENT, (*argv)[1], 1);
@@ -106,7 +106,7 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "--namespace")) {
if (*argc < 2) {
-   fprintf(stderr, "No namespace given for 
--namespace.\n" );
+   fprintf(stderr, _("no namespace given for 
--namespace\n" ));
usage(git_usage_string);
}
setenv(GIT_NAMESPACE_ENVIRONMENT, (*argv)[1], 1);
@@ -120,7 +120,7 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "--work-tree")) {
if (*argc < 2) {
-   fprintf(stderr, "No directory given for 
--work-tree.\n" );
+   fprintf(stderr, _("no directory given for 
--work-tree\n" ));
usage(git_usage_string);
}
setenv(GIT_WORK_TREE_ENVIRONMENT, (*argv)[1], 1);
@@ -134,7 +134,7 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "--super-prefix")) {
if (*argc < 2) {
-   fprintf(stderr, "No prefix given for 
--super-prefix.\n" );
+   fprintf(stderr, _("no prefix given for 
--super-prefix\n" ));
usage(git_usage_string);
}
setenv(GIT_SUPER_PREFIX_ENVIRONMENT, (*argv)[1], 1);
@@ -156,7 +156,7 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "-c")) {
if (*argc < 2) {
-   fprintf(stderr, "-c expects a configuration 
string\n" );
+   fprintf(stderr, _("-c expects a configuration 
string\n" ));
usage(git_usage_string);
}
git_config_push_parameter((*argv)[1]);
@@ -194,12 +194,12 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, "-C")) {
if (*argc < 2) {
-   fprintf(stderr, "No directory given for -C.\n" 
);
+   fprintf(stderr, _("no directory given for -C\n" 
));

[PATCH 0/1] Mark messages for translation

2018-02-13 Thread Alexander Shopov
Following suggestions on last email exchange:

Junio C Hamano
  1. End sentences in long log message with '.'
  2. Leave final merge fixes to him based on suggestion from Jeff King  

Previous changes:

Eric Sunshine:
  1. Use 'test_i18ngrep' rather than 'grep'
Jeff King:
  2. Fix t1506
Johannes Sixt:
  3. Lower-case letters at the beginning of error messages
  4. Past tense to present tense in some cases
Eric Sunshine:
  5. Fix `-cname=value` to say `-c <name>=<value>`
  6. Do not translate "BUG message"
Duy Nguyen:
  7. Fix parentheses on `_` macro

Kind regards:
al_shopov



Alexander Shopov (1):
  Mark messages for translations

 git.c  | 38 +-
 setup.c| 62 +-
 t/t0002-gitfile.sh |  4 +--
 t/t0008-ignores.sh |  2 +-
 t/t1506-rev-parse-diagnosis.sh |  2 +-
 5 files changed, 54 insertions(+), 54 deletions(-)

-- 
2.16.1



Re: totally confused as to what "git bisect skip" is supposed to do

2018-02-13 Thread Robert P. J. Day
On Tue, 13 Feb 2018, Christian Couder wrote:

> On Tue, Feb 13, 2018 at 1:28 PM, Robert P. J. Day  
> wrote:
> >
> > p.s. i suspect i should RTFS to see exactly how git bisect does its
> > work.
>
> You might want to read
> https://git-scm.com/docs/git-bisect-lk2009.html before reading the
> source code.

  ah, excellent, thanks very much.

rday


"git submodule" vs "git subtree" vs "repo" vs "git subdir" ... ?

2018-02-13 Thread Robert P. J. Day

  looking for general opinions ... i am (frighteningly :-) teaching a
git course later this week, and one of the topics on the list is git
submodules, which was specifically requested by the client as their
idea of how to start incorporating child repos in new projects.

  however, given the number of articles written about the drawbacks
with submodules, i wanted to throw in a section about "git subtree" as
well, even though (as discussed earlier) there is some minor dispute
as to whether "git subtree" is part of "core" git, but i'm not going
to let that stop me.

  going even beyond that, there is also google's "repo" command, which
i seem to see more and more often, like here for automotive grade
linux:

  https://wiki.automotivelinux.org/agl-distro/source-code

and it would be a shame to at least not mention that as yet another
possibility.

  and then there are unofficial, hand-rolled solutions, like
"git-subdir":

  https://github.com/andreyvit/git-subdir

given that the client does not appear to be wedded to any particular
solution yet, i'm open to recommendations or pointers to online
comparisons that i'll collect and post on a single wiki page, and they
can peruse the comparisons at their leisure.

  so ... thoughts? no need to be verbose, just links to decent online
discussions/comparisons would be massively useful.

rday

p.s. oh, pointers to well-designed usage of any of the above would be
handy as well. as i mentioned, for "repo", there's AGL. for
submodules, i might use boost:

  https://github.com/boostorg/boost/wiki/Getting-Started

and so on. thank you kindly.


Re: totally confused as to what "git bisect skip" is supposed to do

2018-02-13 Thread Christian Couder
On Tue, Feb 13, 2018 at 1:28 PM, Robert P. J. Day  wrote:
>
> p.s. i suspect i should RTFS to see exactly how git bisect does its
> work.

You might want to read https://git-scm.com/docs/git-bisect-lk2009.html
before reading the source code.


Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-13 Thread Robert P. J. Day
On Fri, 9 Feb 2018, Philip Oakley wrote:

> From: "Robert P. J. Day" 
> > On Fri, 9 Feb 2018, Philip Oakley, CEng MIET wrote:
> (apologies for using the fancy letters after the name ID...)
> >
> >> From: "Robert P. J. Day" 
> >> >
> >> > writing a short tutorial on "git bisect" and, all the details
> >> > of special exit code 125 aside, if one wanted to locate the
> >> > first unbuildable commit, would it be sufficient to just run?
> >> >
> >> >  $ git bisect run make
> >> >
> >> > as i read it, make returns either 0, 1 or 2 so there doesn't
> >> > appear to be any possibility of weirdness with clashing with a
> >> > 125 exit code. am i overlooking some subtle detail here i
> >> > should be aware of? thanks.
> >> >
> >> > rday
> >>
> >> In the spirit of pedanticism, one should also clarify the word
> >> "first", in that it's not a linear search for _an_ unbuildable
> >> commit, but that one is looking for the transition between an
> >> unbroken sequence of unbuildable commits, which transitions to
> >> buildable commits, and its the transition that is sought. (there
> >> could be many random unbuildable commits within a sequence in
> >> some folks' processes!)
> >
> >  quite so, i should have been more precise.
> >
> > rday
>
> The other two things that may be happening (in the wider bisect
> discussion) that I've heard of are:

> 1. there may be feature branches that bypass the known good starting
>commit, which can cause understanding issues as those side
>branches that predate the start point are also considered
>potential bug commits.

  ... snip ...

  sorry, i should have replied to this post since my last post
directly addresses this scenario -- i believe this is precisely what
you were suggesting:

   ... 5 commits ... (feature branch)
 /  ^
/\
   v  \
  A  <--  B <-- <GOOD> <-- D <-- E <-- F <-- <BAD>


i'm still curious ... if one doesn't get into skipping, what is the
"git rev-list" command to identify all of the possible revisions to be
tested here? isn't it just:

  $ git rev-list <BAD> ^<GOOD>

or was there something more subtle i was missing? sorry if someone
already explained this, i'll go back later and read the replies in
more excruciating detail.

rday

p.s. i'll take a closer look at the "--bisect*" options to "git
rev-list" shortly.


Re: totally confused as to what "git bisect skip" is supposed to do

2018-02-13 Thread Robert P. J. Day
On Mon, 12 Feb 2018, Christian Couder wrote:

> On Mon, Feb 12, 2018 at 11:44 AM, Robert P. J. Day
>  wrote:
> > On Fri, 9 Feb 2018, Junio C Hamano wrote:
> >
> >> "Robert P. J. Day"  writes:
> >>
> >> >   i'm confused ... why, after skipping a good chunk in the interval
> >> > [v4.13,v4.14], do i still have exactly 7300 revisions to bisect? what
> >> > am i so hopelessly misunderstanding here?
> >>
> >> Are you really "skipping" a chunk in the interval?
> >>
> >> I thought that "git bisect skip" is a way for you to respond, when
> >> "git bisect" gave you a commit to test, saying "sorry, I cannot test
> >> that exact version, please offer me something else to test".  And
> >> each time you say that, you are not narrowing the search space in
> >> any way, so it is understandable that the numver of candidate bad
> >> commits will not decrease.
> >
> >   this might be an issue of terminology, then, as "man git-bisect"
> > clearly suggests you can skip a range:
> >
> > You can also skip a range of commits, instead of just one
> > commit, using range notation. For example:
> >
> >$ git bisect skip v2.5..v2.6
> >
> > This tells the bisect process that no commit after v2.5, up to
> > and including v2.6, should be tested.
>
> Yeah, I think this is kind of a terminology related.
>
> First when git bisect says "Bisecting: XXX revisions left to test
> after this" it doesn't mean that all those revisions left will
> actually be tested, as git bisect's purpose is to avoid testing as
> many revisions as possible.
>
> So the XXX revisions are actually the revisions that possibly
> contain the first bad commit.
>
> And, as Junio wrote, when you tell git bisect that you cannot test
> some revisions, it doesn't mean that those revisions cannot contain
> the first bad commit.
>
> > my issue (if this is indeed an issue) is that if i select to skip
> > a sizable range of commits to test, should that not result in git
> > bisect telling me it now has far fewer revisions to test? if i, in
> > fact, manage to "disqualify" a number of commits from testing, is
> > there no visual confirmation that i now have fewer commits to
> > test?
>
> I hope that the above clarification I gave is enough, but maybe the
> following will help you.
>
> If you cannot test let's say 20 commits because there is a build
> problem in those commits, and in the end Git tells you that the
> first bad commit could be any of 3 commits, 2 of them that were
> previously marked with skip, then you could still, if you wanted,
> fix those commits, so that they can be built and test them.
>
> So yeah if we only talk about the current bisection, the skipped
> commits will not be tested, but if we talk about completely
> finishing the bisection and finding the first bad commit, then those
> commits could still be tested.

  ok, i'll give this more thought later in the week when i have the
time, but is there a simple expression (using "gitrevisions") that
defines the set of revisions to be tested by bisection if i define the
search space between <GOOD> and <BAD>?

  consider the following history:

   ... 5 commits ... (feature branch)
 /  ^
/\
   v  \
  A  <--  B <-- <GOOD> <-- D <-- E <-- F <-- <BAD>

so imagine branching at B, creating a massively lengthy feature
branch, and merging it back to master at F. now imagine i know "GOOD"
is a good revision, and "BAD" is broken. according to the above, the
offending commit could be any of D, E, or any of the 50,000 commits on
the feature branch, correct? so if i had the above commit history,
would:

  $ git bisect start <BAD> <GOOD>

tell me i have 50,002 revisions to test? am i making sense here?

rday

p.s. i suspect i should RTFS to see exactly how git bisect does its
work.


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Duy Nguyen
On Tue, Feb 13, 2018 at 6:49 PM, Duy Nguyen  wrote:
> On Mon, Feb 12, 2018 at 05:22:15PM -0800, Stefan Beller wrote:
>> This is a real take on the first part of the recent RFC[1].
>>
>> ...
>>
>> Duy suggested that we shall not use the repository blindly, but
>> should carefully examine whether to pass on an object store or the
>> refstore or such[4], which I agree with if it makes sense. This
>> series unfortunately has an issue with that as I would not want to
>> pass down the `ignore_env` flag separately from the object store, so
>> I made all functions that only take the object store to have the raw
>> object store as the first parameter, and others using the full
>> repository.
>
> Second proposal :) How about you store ignore_env in raw_object_store?
> This would not be the first time an object has some configuration
> passed in at construction time. And it has a "constructor" now,
> raw_object_store_init() (I probably should merge _setup in it too)

A bit more on this configuration parameters. Down the road I think we
need something like this anyway to delete global config vars like
packed_git_window_size, delta_base_cache_limit...  Either all these
end up in raw_object_store, or raw_object_store holds a link to
"struct config_set".

The ignore_env specifically though looks to me like a stop gap
solution until everything goes through repo_init() first. At that
point we don't have to delay getenv() anymore. We can getenv() all at
repo_init() then pass them in raw_object_store and ignore_env should
be gone. So sticking it inside raw_object_store _temporarily_ does not
sound so bad.
-- 
Duy


Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-13 Thread Duy Nguyen
On Mon, Feb 12, 2018 at 05:22:15PM -0800, Stefan Beller wrote:
> This is a real take on the first part of the recent RFC[1].
>
> ...
> 
> Duy suggested that we shall not use the repository blindly, but
> should carefully examine whether to pass on an object store or the
> refstore or such[4], which I agree with if it makes sense. This
> series unfortunately has an issue with that as I would not want to
> pass down the `ignore_env` flag separately from the object store, so
> I made all functions that only take the object store to have the raw
> object store as the first parameter, and others using the full
> repository.

Second proposal :) How about you store ignore_env in raw_object_store?
This would not be the first time an object has some configuration
passed in at construction time. And it has a "constructor" now,
raw_object_store_init() (I probably should merge _setup in it too)

The core changes look like this. I have a full commit on top of your
series [1] that keeps sha1_file.c functions take 'struct
raw_object_store' instead.

[1] https://github.com/pclouds/git/tree/object-store-part1

-- 8< --
diff --git a/object-store.h b/object-store.h
index f164c4e5c9..a8bf1738f2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -34,9 +34,13 @@ struct raw_object_store {
 * packs.
 */
unsigned packed_git_initialized : 1;
+
+   unsigned ignore_env : 1;
 };
 #define RAW_OBJECT_STORE_INIT { NULL, NULL, MRU_INIT, NULL, NULL, 0, 0, 0 }
 
+void raw_object_store_init(struct raw_object_store *ros, struct repository *r);
+void raw_object_store_setup(struct raw_object_store *ros, char *);
 void raw_object_store_clear(struct raw_object_store *o);
 
 struct packed_git {
diff --git a/object.c b/object.c
index 79c2c447bc..a4534bf4c4 100644
--- a/object.c
+++ b/object.c
@@ -461,6 +461,20 @@ static void free_alt_odbs(struct raw_object_store *o)
}
 }
 
+void raw_object_store_init(struct raw_object_store *o,
+  struct repository *r)
+{
+   /* FIXME: memset? */
+   o->ignore_env = r->ignore_env;
+}
+
+void raw_object_store_setup(struct raw_object_store *o,
+   char *objectdir)
+{
+   free(o->objectdir);
+   o->objectdir = objectdir;
+}
+
 void raw_object_store_clear(struct raw_object_store *o)
 {
free(o->objectdir);
diff --git a/repository.c b/repository.c
index bd2ad578de..ac5863d7e1 100644
--- a/repository.c
+++ b/repository.c
@@ -49,9 +49,9 @@ static void repo_setup_env(struct repository *repo)
!repo->ignore_env);
free(repo->commondir);
   repo->commondir = strbuf_detach(&sb, NULL);
-   free(repo->objects.objectdir);
-   repo->objects.objectdir = git_path_from_env(DB_ENVIRONMENT, 
repo->commondir,
-   "objects", !repo->ignore_env);
+   raw_object_store_setup(&repo->objects,
+  git_path_from_env(DB_ENVIRONMENT, 
repo->commondir,
+"objects", !repo->ignore_env));
free(repo->graft_file);
repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
 "info/grafts", !repo->ignore_env);
@@ -142,6 +142,8 @@ int repo_init(struct repository *repo, const char *gitdir, 
const char *worktree)
 
repo->ignore_env = 1;
 
+   raw_object_store_init(&repo->objects, repo);
+
if (repo_init_gitdir(repo, gitdir))
goto error;
 
diff --git a/sha1_file.c b/sha1_file.c
index c3f35914ce..3be58651a1 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -677,14 +677,14 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
return r;
 }
 
-void prepare_alt_odb(struct repository *r)
+void prepare_alt_odb(struct raw_object_store *ros)
 {
if (r->objects.alt_odb_tail)
return;
 
   r->objects.alt_odb_tail = &r->objects.alt_odb_list;
 
-   if (!r->ignore_env) {
+   if (!ros->ignore_env) {
const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);
if (!alt)
alt = "";
-- 8< --


--
Duy


Re: [PATCH 2/3] config: respect `pager.config` in list/get-mode only

2018-02-13 Thread Martin Ågren
On 13 February 2018 at 11:25, Duy Nguyen  wrote:
> On Sun, Feb 11, 2018 at 11:40 PM, Martin Ågren  wrote:
>> Similar to de121ffe5 (tag: respect `pager.tag` in list-mode only,
>> 2017-08-02), use the DELAY_PAGER_CONFIG-mechanism to only respect
>> `pager.config` when we are listing or "get"ing config.
>>
>> Some getters give at most one line of output, but it is much easier to
>> document and understand that we page all of --get[-*] and --list, than
>> to divide the (current and future) getters into "pages" and "doesn't".
>
> I realize modern pagers like 'less' can automatically exit if the
> output is less than a screen. But are we sure it's true for all
> pagers? It would be annoying to have a pager waiting for me to exit
> when I only want to check one config item out (which prints one line).
> Trading one-time convenience at reading the manual with constantly
> pressing 'q' does not seem justified.

Well, there was one recent instance of a misconfigured LESS causing the
pager not to quit automatically [1]. Your "Trading"-sentence does argue
nicely for rethinking my approach here.

A tweaked behavior could be documented as something like:

`pager.config` is only respected when listing configuration, i.e.,
when using `--list` or any of the `--get-*` which may return
multiple results.

Maybe it doesn't look too complicated after all. I'd rather not give any
ideas about how we only page if there *are* more than one line of
result, i.e., that we'd examine the result before turning on the pager.
I think I've avoided that misconception here.
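
Concretely, under the behaviour proposed in this series that would mean
something like the following (pager choice purely illustrative):

  git config pager.config 'less -S'
  git config --list                    # paged
  git config --get-regexp '^alias\.'   # paged
  git config user.name 'A U Thor'      # a setter, never paged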

Thanks
Martin

[1] 
https://public-inbox.org/git/2412a603-4382-4af5-97d0-d16d5faaf...@eluvio.com/


[PATCH 4/4] add -p: calculate offset delta for edited patches

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

Recount the number of preimage and postimage lines in a hunk after it
has been edited so any change in the number of insertions or deletions
can be used to adjust the offsets of subsequent hunks. If an edited
hunk is subsequently split then the offset correction will be lost. It
would be possible to fix this if it is a problem, however the code
here is still an improvement on the status quo for the common case
where an edited hunk is applied without being split.

This is also a necessary step to removing '--recount' and
'--allow-overlap' from the invocation of 'git apply'. Before
'--recount' can be removed the splitting and coalescing counting needs
to be fixed to handle a missing newline at the end of a file. In order
to remove '--allow-overlap' there needs to be i) some way of verifying
the offset data in the edited hunk (probably by correlating the
preimage (or postimage if the patch is going to be applied in reverse)
lines of the edited and unedited versions to see if they are offset or
if any leading/trailing context lines have been removed) and ii) a way of
dealing with edited hunks that change context lines that are shared
with neighbouring hunks.
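
(A made-up illustration of the recount: if an edited hunk's header was
originally "@@ -10,7 +10,8 @@" and the edit drops one of its '+' lines,
the recounted header becomes "@@ -10,7 +10,7 @@", and the resulting
delta of -1 is then applied to the new-file offsets of all later hunks.)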

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 57 ++
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
7a0a5896bb8fa063ace29b85840849c867b874f5..f139e545a4fb9cbaa757208a44cb2b5c3ad3678a
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -938,6 +938,9 @@ sub coalesce_overlapping_hunks {
parse_hunk_header($text->[0]);
unless ($_->{USE}) {
$ofs_delta += $o_cnt - $n_cnt;
+   # If this hunk has been edited then subtract
+   # the delta that is due to the edit.
+   $_->{OFS_DELTA} and $ofs_delta -= $_->{OFS_DELTA};
next;
}
if ($ofs_delta) {
@@ -945,6 +948,9 @@ sub coalesce_overlapping_hunks {
$_->{TEXT}->[0] = format_hunk_header($o_ofs, $o_cnt,
 $n_ofs, $n_cnt);
}
+   # If this hunk was edited then adjust the offset delta
+   # to reflect the edit.
+   $_->{OFS_DELTA} and $ofs_delta += $_->{OFS_DELTA};
if (defined $last_o_ctx &&
$o_ofs <= $last_o_ctx &&
!$_->{DIRTY} &&
@@ -1016,6 +1022,30 @@ marked for discarding."),
 marked for applying."),
 );
 
+sub recount_edited_hunk {
+   local $_;
+   my ($oldtext, $newtext) = @_;
+   my ($o_cnt, $n_cnt) = (0, 0);
+   for (@{$newtext}[1..$#{$newtext}]) {
+   my $mode = substr($_, 0, 1);
+   if ($mode eq '-') {
+   $o_cnt++;
+   } elsif ($mode eq '+') {
+   $n_cnt++;
+   } elsif ($mode eq ' ') {
+   $o_cnt++;
+   $n_cnt++;
+   }
+   }
+   my ($o_ofs, undef, $n_ofs, undef) =
+   parse_hunk_header($newtext->[0]);
+   $newtext->[0] = format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
+   my (undef, $orig_o_cnt, undef, $orig_n_cnt) =
+   parse_hunk_header($oldtext->[0]);
+   # Return the change in the number of lines inserted by this hunk
+   return $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt;
+}
+
 sub edit_hunk_manually {
my ($oldtext) = @_;
 
@@ -1114,25 +1144,34 @@ sub prompt_yesno {
 }
 
 sub edit_hunk_loop {
-   my ($head, $hunk, $ix) = @_;
-   my $text = $hunk->[$ix]->{TEXT};
+   my ($head, $hunks, $ix) = @_;
+   my $hunk = $hunks->[$ix];
+   my $text = $hunk->{TEXT};
 
while (1) {
-   $text = edit_hunk_manually($text);
-   if (!defined $text) {
+   my $newtext = edit_hunk_manually($text);
+   if (!defined $newtext) {
return undef;
}
my $newhunk = {
-   TEXT => $text,
-   TYPE => $hunk->[$ix]->{TYPE},
+   TEXT => $newtext,
+   TYPE => $hunk->{TYPE},
USE => 1,
DIRTY => 1,
};
if (diff_applies($head,
-@{$hunk}[0..$ix-1],
+@{$hunks}[0..$ix-1],
 $newhunk,
-@{$hunk}[$ix+1..$#{$hunk}])) {
-   $newhunk->{DISPLAY} = [color_diff(@{$text})];
+

[PATCH 2/4] t3701: add failing test for pathological context lines

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

When a hunk is skipped by add -i the offsets of subsequent hunks are
not adjusted to account for any missing insertions due to the skipped
hunk. Most of the time this does not matter as apply uses the context
lines to apply the subsequent hunks in the correct place, however in
pathological cases the context lines will match at the now incorrect
offset and the hunk will be applied in the wrong place. The offsets of
hunks following an edited hunk that has had the number of insertions
or deletions changed also need to be updated in the same way. Add
failing tests to demonstrate this.

Signed-off-by: Phillip Wood 
---
 t/t3701-add-interactive.sh | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
058698df6a4a9811b9db84fb5900472c47c61798..6838698a1382b24724cfbd3be04a7054489b94af
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -541,4 +541,34 @@ test_expect_success 'status ignores dirty submodules 
(except HEAD)' '
! grep dirty-otherwise output
 '
 
+test_expect_success 'set up pathological context' '
+   git reset --hard &&
+   test_write_lines a a a a a a a a a a a >a &&
+   git add a &&
+   git commit -m a &&
+   test_write_lines c b a a a a a a a b a a a a >a &&
+   test_write_lines a a a a a a a b a a a a >expected-1 &&
+   test_write_lines   b a a a a a a a b a a a a >expected-2 &&
+   write_script editor <<-\EOF
+   sed /^+c/d "$1" >"$1.tmp"
+   mv "$1.tmp" "$1"
+   EOF
+'
+
+test_expect_failure 'add -p works with pathological context lines' '
+   git reset &&
+   printf "%s\n" n y |
+   git add -p &&
+   git cat-file blob :a > actual &&
+   test_cmp expected-1 actual
+'
+
+test_expect_failure 'add -p patch editing works with pathological context 
lines' '
+   git reset &&
+   printf "%s\n" e y |
+   GIT_EDITOR=./editor git add -p &&
+   git cat-file blob :a > actual &&
+   test_cmp expected-2 actual
+'
+
 test_done
-- 
2.16.1



[PATCH 1/4] add -i: add function to format hunk header

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

This code is duplicated in a couple of places so make it into a
function as we're going to add some more callers shortly.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
964c3a75420db4751cf11125b68b6904112632f1..8ababa6453ac4f57a925aacbb8b9bb1c973e4a54
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -751,6 +751,15 @@ sub parse_hunk_header {
return ($o_ofs, $o_cnt, $n_ofs, $n_cnt);
 }
 
+sub format_hunk_header {
+   my ($o_ofs, $o_cnt, $n_ofs, $n_cnt) = @_;
+   return ("@@ -$o_ofs" .
+   (($o_cnt != 1) ? ",$o_cnt" : '') .
+   " +$n_ofs" .
+   (($n_cnt != 1) ? ",$n_cnt" : '') .
+   " @@\n");
+}
+
 sub split_hunk {
my ($text, $display) = @_;
my @split = ();
@@ -838,11 +847,7 @@ sub split_hunk {
my $o_cnt = $hunk->{OCNT};
my $n_cnt = $hunk->{NCNT};
 
-   my $head = ("@@ -$o_ofs" .
-   (($o_cnt != 1) ? ",$o_cnt" : '') .
-   " +$n_ofs" .
-   (($n_cnt != 1) ? ",$n_cnt" : '') .
-   " @@\n");
+   my $head = format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
my $display_head = $head;
unshift @{$hunk->{TEXT}}, $head;
if ($diff_use_color) {
@@ -912,11 +917,7 @@ sub merge_hunk {
}
push @line, $line;
}
-   my $head = ("@@ -$o0_ofs" .
-   (($o_cnt != 1) ? ",$o_cnt" : '') .
-   " +$n0_ofs" .
-   (($n_cnt != 1) ? ",$n_cnt" : '') .
-   " @@\n");
+   my $head = format_hunk_header($o0_ofs, $o_cnt, $n0_ofs, $n_cnt);
@{$prev->{TEXT}} = ($head, @line);
 }
 
-- 
2.16.1



[PATCH 0/4] Correct offsets of hunks when one is skipped

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

While working on a patch series to stage selected lines from a hunk
without having to edit it I got worried that subsequent patches would
be applied in the wrong place which lead to this series to correct the
offsets of hunks following those that are skipped or edited.

Phillip Wood (4):
  add -i: add function to format hunk header
  t3701: add failing test for pathological context lines
  add -p: Adjust offsets of subsequent hunks when one is skipped
  add -p: calculate offset delta for edited patches

 git-add--interactive.perl  | 93 +++---
 t/t3701-add-interactive.sh | 30 +++
 2 files changed, 102 insertions(+), 21 deletions(-)

-- 
2.16.1



[PATCH 3/4] add -p: Adjust offsets of subsequent hunks when one is skipped

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

Since commit 8cbd431082 ("git-add--interactive: replace hunk
recounting with apply --recount", 2008-7-2) if a hunk is skipped then
we rely on the context lines to apply subsequent hunks in the right
place. While this works most of the time it is possible for hunks to
end up being applied in the wrong place. To fix this adjust the offset
of subsequent hunks to correct for any change in the number of
insertions or deletions due to the skipped hunk. The change in offset
due to edited hunks that have the number of insertions or deletions
changed is ignored here, it will be fixed in the next commit.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 15 +--
 t/t3701-add-interactive.sh |  2 +-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
8ababa6453ac4f57a925aacbb8b9bb1c973e4a54..7a0a5896bb8fa063ace29b85840849c867b874f5
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -926,14 +926,25 @@ sub coalesce_overlapping_hunks {
my @out = ();
 
my ($last_o_ctx, $last_was_dirty);
+   my $ofs_delta = 0;
 
-   for (grep { $_->{USE} } @in) {
+   for (@in) {
if ($_->{TYPE} ne 'hunk') {
push @out, $_;
next;
}
my $text = $_->{TEXT};
-   my ($o_ofs) = parse_hunk_header($text->[0]);
+   my ($o_ofs, $o_cnt, $n_ofs, $n_cnt) =
+   parse_hunk_header($text->[0]);
+   unless ($_->{USE}) {
+   $ofs_delta += $o_cnt - $n_cnt;
+   next;
+   }
+   if ($ofs_delta) {
+   $n_ofs += $ofs_delta;
+   $_->{TEXT}->[0] = format_hunk_header($o_ofs, $o_cnt,
+$n_ofs, $n_cnt);
+   }
if (defined $last_o_ctx &&
$o_ofs <= $last_o_ctx &&
!$_->{DIRTY} &&
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
6838698a1382b24724cfbd3be04a7054489b94af..5401d74d5804b5e43286d08a905fb96c68b02e42
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -555,7 +555,7 @@ test_expect_success 'set up pathological context' '
EOF
 '
 
-test_expect_failure 'add -p works with pathological context lines' '
+test_expect_success 'add -p works with pathological context lines' '
git reset &&
printf "%s\n" n y |
git add -p &&
-- 
2.16.1



[PATCH 0/3] add -p: improve help messages

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

Improve the help displayed if the user presses an inactive key by only
listing active keys and printing specific messages for some keys. Also
disable search if there's only a single hunk.

Phillip Wood (3):
  add -p: only display help for active keys
  add -p: Only bind search key if there's more than one hunk
  add -p: improve error messages

 git-add--interactive.perl | 74 ++-
 1 file changed, 48 insertions(+), 26 deletions(-)

-- 
2.16.1



[PATCH 2/3] add -p: Only bind search key if there's more than one hunk

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

If there is only a single hunk then disable searching as there is
nothing to search for. Also print a specific error message if the user
tries to search with '/' when there's only a single hunk rather than
just listing the key bindings.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl | 50 +--
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
76b941f8f0071b7bfa9d515af25b69997ef4581a..79ab36aacf84ed329a9d50886b98bb4ef8b8e312
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1185,7 +1185,7 @@ d - do not apply this hunk or any of the later hunks in 
the file"),
 
 sub help_patch_cmd {
local $_;
-   my $other = $_[0] . ",/,?";
+   my $other = $_[0] . ",?";
print colored $help_color, __($help_patch_modes{$patch_mode}), "\n",
map { "$_\n" } grep {
my $c = quotemeta(substr($_, 0, 1));
@@ -1308,39 +1308,39 @@ sub display_hunks {
 
 my %patch_update_prompt_modes = (
stage => {
-   mode => N__("Stage mode change [y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Stage deletion [y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Stage this hunk [y,n,q,a,d,/%s,?]? "),
+   mode => N__("Stage mode change [y,n,q,a,d%s,?]? "),
+   deletion => N__("Stage deletion [y,n,q,a,d%s,?]? "),
+   hunk => N__("Stage this hunk [y,n,q,a,d%s,?]? "),
},
stash => {
-   mode => N__("Stash mode change [y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Stash deletion [y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Stash this hunk [y,n,q,a,d,/%s,?]? "),
+   mode => N__("Stash mode change [y,n,q,a,d%s,?]? "),
+   deletion => N__("Stash deletion [y,n,q,a,d%s,?]? "),
+   hunk => N__("Stash this hunk [y,n,q,a,d%s,?]? "),
},
reset_head => {
-   mode => N__("Unstage mode change [y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Unstage deletion [y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Unstage this hunk [y,n,q,a,d,/%s,?]? "),
+   mode => N__("Unstage mode change [y,n,q,a,d%s,?]? "),
+   deletion => N__("Unstage deletion [y,n,q,a,d%s,?]? "),
+   hunk => N__("Unstage this hunk [y,n,q,a,d%s,?]? "),
},
reset_nothead => {
-   mode => N__("Apply mode change to index [y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Apply deletion to index [y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Apply this hunk to index [y,n,q,a,d,/%s,?]? "),
+   mode => N__("Apply mode change to index [y,n,q,a,d%s,?]? "),
+   deletion => N__("Apply deletion to index [y,n,q,a,d%s,?]? "),
+   hunk => N__("Apply this hunk to index [y,n,q,a,d%s,?]? "),
},
checkout_index => {
-   mode => N__("Discard mode change from worktree 
[y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Discard deletion from worktree 
[y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Discard this hunk from worktree [y,n,q,a,d,/%s,?]? 
"),
+   mode => N__("Discard mode change from worktree [y,n,q,a,d%s,?]? 
"),
+   deletion => N__("Discard deletion from worktree 
[y,n,q,a,d%s,?]? "),
+   hunk => N__("Discard this hunk from worktree [y,n,q,a,d%s,?]? 
"),
},
checkout_head => {
-   mode => N__("Discard mode change from index and worktree 
[y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Discard deletion from index and worktree 
[y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Discard this hunk from index and worktree 
[y,n,q,a,d,/%s,?]? "),
+   mode => N__("Discard mode change from index and worktree 
[y,n,q,a,d%s,?]? "),
+   deletion => N__("Discard deletion from index and worktree 
[y,n,q,a,d%s,?]? "),
+   hunk => N__("Discard this hunk from index and worktree 
[y,n,q,a,d%s,?]? "),
},
checkout_nothead => {
-   mode => N__("Apply mode change to index and worktree 
[y,n,q,a,d,/%s,?]? "),
-   deletion => N__("Apply deletion to index and worktree 
[y,n,q,a,d,/%s,?]? "),
-   hunk => N__("Apply this hunk to index and worktree 
[y,n,q,a,d,/%s,?]? "),
+   mode => N__("Apply mode change to index and worktree 
[y,n,q,a,d%s,?]? "),
+   deletion => N__("Apply deletion to index and worktree 
[y,n,q,a,d%s,?]? "),
+   hunk => N__("Apply this hunk to index and worktree 
[y,n,q,a,d%s,?]? "),
},
 );
 
@@ -1396,7 +1396,7 @@ sub patch_update_file {
$other .= ',J';
}
if ($num > 1) {
-   $other .= ',g';
+   $other .= ',g,/';
   

[PATCH 1/3] add -p: only display help for active keys

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

If the user presses a key that add -p wasn't expecting then it prints
a list of key bindings. Although the prompt only lists the active
bindings the help was printed for all bindings.  Fix this by using the
list of keys in the prompt to filter the help. Note that the list of
keys was already passed to help_patch_cmd() by the caller so there is
no change needed to the call site.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
964c3a75420db4751cf11125b68b6904112632f1..76b941f8f0071b7bfa9d515af25b69997ef4581a
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1184,7 +1184,13 @@ d - do not apply this hunk or any of the later hunks in 
the file"),
 );
 
 sub help_patch_cmd {
-   print colored $help_color, __($help_patch_modes{$patch_mode}), "\n", __ 
<

[PATCH 3/3] add -p: improve error messages

2018-02-13 Thread Phillip Wood
From: Phillip Wood 

If the user presses a key that isn't currently active then explain why
it isn't active rather than just listing all the keys. It already did
this for some keys, this patch does the same for those that
weren't already handled.

Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 
79ab36aacf84ed329a9d50886b98bb4ef8b8e312..d9d8ff3090e914421ab67071117789f6b3554475
 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1437,8 +1437,12 @@ sub patch_update_file {
}
next;
}
-   elsif ($other =~ /g/ && $line =~ /^g(.*)/) {
+   elsif ($line =~ /^g(.*)/) {
my $response = $1;
+   unless ($other =~ /g/) {
+   error_msg __("No other hunks to 
goto\n");
+   next;
+   }
my $no = $ix > 10 ? $ix - 10 : 0;
while ($response eq '') {
$no = display_hunks(\@hunk, $no);
@@ -1556,7 +1560,11 @@ sub patch_update_file {
next;
}
}
-   elsif ($other =~ /s/ && $line =~ /^s/) {
+   elsif ($line =~ /^s/) {
+   unless ($other =~ /s/) {
+   error_msg __("Sorry, cannot split this 
hunk\n");
+   next;
+   }
my @split = split_hunk($hunk[$ix]{TEXT}, 
$hunk[$ix]{DISPLAY});
if (1 < @split) {
print colored $header_color, sprintf(
@@ -1568,7 +1576,11 @@ sub patch_update_file {
$num = scalar @hunk;
next;
}
-   elsif ($other =~ /e/ && $line =~ /^e/) {
+   elsif ($line =~ /^e/) {
+   unless ($other =~ /e/) {
+   error_msg __("Sorry, cannot edit this 
hunk\n");
+   next;
+   }
my $newhunk = edit_hunk_loop($head, \@hunk, $ix);
if (defined $newhunk) {
splice @hunk, $ix, 1, $newhunk;
-- 
2.16.1
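
Illustration only (not part of the patch series): a throwaway repository
shows the change from this patch. A single appended line produces one hunk
that cannot be split, so 's' is not among the offered keys:

git init demo && cd demo
echo one >file && git add file && git commit -q -m init
echo two >>file
printf 's\nq\n' | git add -p
# before this series: the stray 's' printed the full key-binding help
# with patches 1-3 applied: "Sorry, cannot split this hunk" is printed instead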



Re: [PATCH 2/3] config: respect `pager.config` in list/get-mode only

2018-02-13 Thread Duy Nguyen
On Sun, Feb 11, 2018 at 11:40 PM, Martin Ågren  wrote:
> Similar to de121ffe5 (tag: respect `pager.tag` in list-mode only,
> 2017-08-02), use the DELAY_PAGER_CONFIG-mechanism to only respect
> `pager.config` when we are listing or "get"ing config.
>
> Some getters give at most one line of output, but it is much easier to
> document and understand that we page all of --get[-*] and --list, than
> to divide the (current and future) getters into "pages" and "doesn't".

I realize modern pagers like 'less' can automatically exit if the
output is less than a screen. But are we sure it's true for all
pagers? It would be annoying to have a pager waiting for me to exit
when I only want to check one config item out (which prints one line).
Trading a one-time convenience when reading the manual for constantly
pressing 'q' does not seem justified.
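
Illustration only, assuming GNU less: git exports LESS=FRX when LESS is not
already set, and it is the -F flag that makes the pager quit on its own when
the output fits on one screen:

printf 'one line\n' | less -FRX    # returns to the prompt right away
printf 'one line\n' | LESS= less   # waits for 'q' even for a single line

A pager configured without -F (or a pager other than less) would indeed sit
there waiting for 'q' after a one-line "git config --get".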
-- 
Duy


Re: [PATCH 2/2] t5556: replace test_i18ngrep with a simple grep

2018-02-13 Thread SZEDER Gábor
> > I must admit that I didn't think about the effect of the useless
> > "| sort" on the exit status!  What I saw was: a process that
> > received no input, sorted nothing and produced no output - pretty
> > much the definition of useless! ;-)
> 
> I am not sure what you mean by "receive no input, sort nothing and
> produce no output".
> 
> Ahh, OK, this is a funny one.  I think the original meant to do
> 
>   grep ... | grep -v ... | sort >actual
> 
> but it did
> 
>   grep ... | grep -v ... >actual | sort
> 
> instead by mistake.
> 
> And we have two possible "fixes" for that mistake.  Either removing
> "|sort" (and replace its only effect, which is to hide brittleness
> of relying on exit status of the second grep, with something else)
> to declare that we do care about the order multiple warning messages
> are given by the last test in the script (by the way, the script is
> t5536, not t5556; the patch needs to be retitled), or keeping the "|
> sort" and move the redirection into ">actual" to the correct place,
> which is to follow through the intention of having that "sort" on
> the pipeline in the first place.  I somewhat favor the former in
> this particular case myself, but the preference is not a very strong
> one.

A third possible fix, which is also in the "we don't care about the
order of multiple warning messages" camp and has a nice looking
diffstat, would be something like this:


diff --git a/t/t5536-fetch-conflicts.sh b/t/t5536-fetch-conflicts.sh
index 2e42cf3316..91f28c2f78 100755
--- a/t/t5536-fetch-conflicts.sh
+++ b/t/t5536-fetch-conflicts.sh
@@ -18,14 +18,6 @@ setup_repository () {
)
 }
 
-verify_stderr () {
-   cat >expected &&
-   # We're not interested in the error
-   # "fatal: The remote end hung up unexpectedly":
-   test_i18ngrep -E '^(fatal|warning):' <error >actual | sort &&
-   test_i18ncmp expected actual
-}
-
 test_expect_success 'setup' '
git commit --allow-empty -m "Initial" &&
git branch branch1 &&
@@ -48,9 +40,7 @@ test_expect_success 'fetch conflict: config vs. config' '
"+refs/heads/branch2:refs/remotes/origin/branch1" && (
cd ccc &&
test_must_fail git fetch origin 2>error &&
-   verify_stderr <<-\EOF
-   fatal: Cannot fetch both refs/heads/branch1 and refs/heads/branch2 to refs/remotes/origin/branch1
-   EOF
+   test_i18ngrep "fatal: Cannot fetch both refs/heads/branch1 and refs/heads/branch2 to refs/remotes/origin/branch1" error
)
 '
 
@@ -77,9 +67,7 @@ test_expect_success 'fetch conflict: arg vs. arg' '
test_must_fail git fetch origin \
refs/heads/*:refs/remotes/origin/* \
refs/heads/branch2:refs/remotes/origin/branch1 2>error &&
-   verify_stderr <<-\EOF
-   fatal: Cannot fetch both refs/heads/branch1 and refs/heads/branch2 to refs/remotes/origin/branch1
-   EOF
+   test_i18ngrep "fatal: Cannot fetch both refs/heads/branch1 and refs/heads/branch2 to refs/remotes/origin/branch1" error
)
 '
 
@@ -90,10 +78,8 @@ test_expect_success 'fetch conflict: criss-cross args' '
git fetch origin \
refs/heads/branch1:refs/remotes/origin/branch2 \
refs/heads/branch2:refs/remotes/origin/branch1 2>error &&
-   verify_stderr <<-\EOF
-   warning: refs/remotes/origin/branch1 usually tracks refs/heads/branch1, not refs/heads/branch2
-   warning: refs/remotes/origin/branch2 usually tracks refs/heads/branch2, not refs/heads/branch1
-   EOF
+   test_i18ngrep "warning: refs/remotes/origin/branch1 usually tracks refs/heads/branch1, not refs/heads/branch2" error &&
+   test_i18ngrep "warning: refs/remotes/origin/branch2 usually tracks refs/heads/branch2, not refs/heads/branch1" error
)
 '
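
For what it's worth, a rough illustration of the misplaced redirection being
removed above (made-up command and file names, not from the patch):

grep foo file >actual | sort    # sort reads nothing and writes nothing;
                                # "actual" gets the unsorted grep output
grep foo file | sort >actual    # what was presumably intended

In both forms the &&-chain only sees the exit status of the last command in
the pipeline, i.e. sort's, which is why dropping the "| sort" altogether also
changes whose failure the surrounding test would notice.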
 


Re: [PATCH v2] dir.c: ignore paths containing .git when invalidating untracked cache

2018-02-13 Thread Duy Nguyen
On Wed, Feb 7, 2018 at 11:59 PM, Ben Peart  wrote:
> diff --git a/dir.c b/dir.c
> index 7c4b45e30e..d431da46f5 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1773,7 +1773,7 @@ static enum path_treatment treat_path(struct dir_struct *dir,
> if (!de)
> return treat_path_fast(dir, untracked, cdir, istate, path,
>baselen, pathspec);
> -   if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
> +   if (is_dot_or_dotdot(de->d_name) || !fspathcmp(de->d_name, ".git"))
> return path_none;
> strbuf_setlen(path, baselen);
> strbuf_addstr(path, de->d_name);
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 0af7c4edba..019576f306 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -118,8 +118,12 @@ static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *que
>
>  static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
>  {
> -   int pos = index_name_pos(istate, name, strlen(name));
> +   int pos;
>
> +   if (!verify_path(name))
> +   return;
> +
> +   pos = index_name_pos(istate, name, strlen(name));
> if (pos >= 0) {
> struct cache_entry *ce = istate->cache[pos];
> ce->ce_flags &= ~CE_FSMONITOR_VALID;
>

It's very tempting, considering that the amount of change is much
smaller. But I think we should go with my version. The hope is that
when a _new_ call site appears, the author will think twice before
passing zero or one to the safe_path argument.
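
For context, the fsmonitor-test hook that the quoted test below appends
".git" paths to follows the fsmonitor hook protocol: it is invoked with a
version and a timestamp and replies with NUL-terminated paths that may have
changed since that time. A minimal sketch of such a hook (the reported paths
here are made up) looks roughly like this:

cat >.git/hooks/fsmonitor-test <<'HOOK'
#!/bin/sh
# $1 = hook interface version, $2 = time of last update in nanoseconds
# reply: NUL-terminated paths that may have changed since that time
printf "tracked\0"
printf "dir1/modified\0"
HOOK
chmod +x .git/hooks/fsmonitor-test
git config core.fsmonitor .git/hooks/fsmonitor-test

The test then lies through such a hook about changes under .git and checks
that the untracked cache extension and the invalidation counters are left
alone.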

> diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
> index eb2d13bbcf..756beb0d8e 100755
> --- a/t/t7519-status-fsmonitor.sh
> +++ b/t/t7519-status-fsmonitor.sh
> @@ -314,4 +314,43 @@ test_expect_success 'splitting the index results in the same state' '
> test_cmp expect actual
>  '
>
> +test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' '
> +   test_create_repo dot-git &&
> +   (
> +   cd dot-git &&
> +   mkdir -p .git/hooks &&
> +   : >tracked &&
> +   : >modified &&
> +   mkdir dir1 &&
> +   : >dir1/tracked &&
> +   : >dir1/modified &&
> +   mkdir dir2 &&
> +   : >dir2/tracked &&
> +   : >dir2/modified &&
> +   write_integration_script &&
> +   git config core.fsmonitor .git/hooks/fsmonitor-test &&
> +   git update-index --untracked-cache &&
> +   git update-index --fsmonitor &&
> +   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace-before" \
> +   git status &&
> +   test-dump-untracked-cache >../before
> +   ) &&
> +   cat >>dot-git/.git/hooks/fsmonitor-test <<-\EOF &&
> +   printf ".git\0"
> +   printf ".git/index\0"
> +   printf "dir1/.git\0"
> +   printf "dir1/.git/index\0"
> +   EOF
> +   (
> +   cd dot-git &&
> +   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace-after" \
> +   git status &&
> +   test-dump-untracked-cache >../after
> +   ) &&
> +   grep "directory invalidation" trace-before >>before &&
> +   grep "directory invalidation" trace-after >>after &&
> +   # UNTR extension unchanged, dir invalidation count unchanged
> +   test_cmp before after
> +'
> +
>  test_done
>
> base-commit: 5be1f00a9a701532232f57958efab4be8c959a29
> --
> 2.15.0.windows.1
>



-- 
Duy


Re: [PATCH v1] dir.c: don't flag the index as dirty for changes to the untracked cache

2018-02-13 Thread Duy Nguyen
On Tue, Feb 13, 2018 at 12:57 AM, Ben Peart  wrote:
>> Another case that could touch a lot of directories over time is
>> switch trees (e.g. "git checkout master" then "checkout next" or worse
>> "checkout v1.0").
>
>
> Your example above makes me wonder if you understand what my patch is
> doing.  If the index is flagged as dirty for _any_ reason, the entire index
> is written to disk with the latest information - including the updated
> untracked cache and all other extensions.  So in your checkout examples
> above, the index will still get written to disk with the updated untracked
> cache extension.  There would be zero change in behavior or performance.
> The _only_ time the index is not written to disk after my patch is if there
> were no other changes to the index.  In my experience, that is only status
> calls.

The untracked cache is updated and does get written down, but it's not
"repaired" unless read_directory() has been called before the index is
written. Paths that hit untracked_cache_invalidate_path() will stay on
the slow path until read_directory() is called and the result written
down. I don't think "git checkout" calls read_directory(). There are
some commands, like "git add", that update the index and call
read_directory() at the same time. So yes, I was wrong: the untracked
cache can be repaired sometimes, rather than never.

We do try to improve the performance of "git checkout" and a couple of
other "slow" commands, though (e.g. repairing the cache tree), so
perhaps we can do the same for the untracked cache. The cost of
updating the untracked cache is potentially much higher than that of
the cache tree, though.
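
As a rough way to see this from the outside (illustration only; assumes the
untracked cache is enabled in the repository), the same knobs the t7519 test
in the other thread uses can be driven by hand:

git update-index --untracked-cache
GIT_TRACE_UNTRACKED_STATS="$PWD/trace" git status
grep "directory invalidation" trace
test-dump-untracked-cache    # test helper from t/helper, if built

Comparing the dump (and the invalidation counters) before and after a given
command shows whether that command actually wrote an updated untracked cache
extension back to the index.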
-- 
Duy

