Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Junio C Hamano
Christian Couder  writes:

> On Fri, Apr 14, 2017 at 10:32 PM,   wrote:
>> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
>> index 33a51c9..677e15a 100755
>> --- a/t/t1450-fsck.sh
>> +++ b/t/t1450-fsck.sh
>> @@ -689,4 +689,17 @@ test_expect_success 'bogus head does not fallback to 
>> all heads' '
>> ! grep $blob out
>>  '
>>
>> +test_expect_success 'detect corrupt index file in fsck' '
>> +   cp .git/index .git/index.backup &&
>> +   test_when_finished "mv .git/index.backup .git/index" &&
>> +   echo  > &&
>> +   git add  &&
>> +   sed -e "s///" .git/index >.git/index.yyy &&
>> +   mv .git/index.yyy .git/index &&
>> +   # Confirm that fsck detects invalid checksum
>> +   test_must_fail git fsck --cache &&
>> +   # Confirm that status no longer complains about invalid checksum
>> +   git status
>> +'
>
> This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
> set on my Linux machine.
>
> Also it looks like you sent a v8 of this patch series with a different
> test, but what is in master looks like the above test instead of the
> test in your v8.

Thanks for raising this.  I almost forgot that the final finishing
touch to update the test was needed in 'master', and this exchange
caught it before it is too late.



Re: [PATCH 0/1] Preserve the untracked cache across checkout, reset --hard, etc

2017-05-08 Thread Junio C Hamano
David Turner  writes:

> Can you actually keep the email address as my Twopensource one?  I want to 
> make sure that Twitter, my employer at the time, gets credit for this work 
> (just as I want to make sure that my current employer, Two Sigma, gets credit 
> for my current work).
>
> Please feel free to add Signed-off-by: David Turner  in 
> case that makes tracking easier.
>
> Thanks.
>
> WRT the actual patch, I want to note that past me did not do a
> great job here.  The tests do not correctly check that the
> post-checkout untracked cache is still valid after a checkout.
> For example, let's say that previously, the directory foo was
> entirely untracked (but it contained a file bar), but after the
> checkout, there is a file foo/baz.  Does the untracked cache need
> to get updated?
>
> Unfortunately, the untracked cache is very unlikely to make it to
> the top of my priority list any time soon, so I won't be able to
> correct this test (and, if necessary, correct the code).  But I
> would strongly suggest that the test be improved before this code
> is merged.
>
> Thanks for CCing me.

I will try to find time to tweak what was sent to the list here to
reflect your affiliations better, but marked with DONTMERGE waiting
for the necessary updates you mentioned above, so that this change
is not forgotten.  It may turn out to be that copying from src to
dst like the patch does is all that is needed, or the cache may need
further invalidation when the copying happens, and I haven't got a
good feeling that anybody who are familiar with the codepath vetted
the correctness from seeing the discussion from sidelines (yet).

Thanks.


Re: [L10N] Kickoff for Git 2.13.0 l10n round 2

2017-05-08 Thread Junio C Hamano
Jiang Xin  writes:

> Git v2.13.0-rc2 introduced 4 new messages, let's start round 2 for Git
> 2.13.0 l10n.
>
> You can get it from the usual place:
>
> https://github.com/git-l10n/git-po/
>
> As how to update your XX.po and help to translate Git, please see
> "Updating a XX.po file" and other sections in “po/README" file.

What's the doneness of l10n for the upcoming release?  It would be
preferrable if I can get a "done -- pull from me now" within 16
hours or so, as I'd like to do the release at around May 10th 00:00
UTC.

Thanks everybody, as always, for your translations and general
support of the project.


Re: Not translatable strings in Git

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> I don't think the diffstat summary is marked for translation at all,
> plumbing or not.  There's some history there:
>
>   
> http://public-inbox.org/git/1345922816-20616-1-git-send-email-pclo...@gmail.com/t/#u

Wow, that's a long thread.

> There was an attempt to re-enable it for local-only commands:
>
>   http://public-inbox.org/git/alpine.deb.2.00.1210231323480@ds9.cixit.se/
>
> but nobody seems to have responded (I don't have much of an opinion
> myself).

I still think that the latter is wrong, especially the part that
touches log-tree.c, pp_header(), and other lower level things that
will be read and interpreted by scripts.  The change to diff.c about
the diffstat reverts a bugfix made in v1.8.0 timeperiod, it seems.





Re: [PATCH v2 1/1] t0027: tests are not expensive; remove t0025

2017-05-08 Thread Torsten Bögershausen



On 09/05/17 03:29, Junio C Hamano wrote:

Johannes Schindelin  writes:


Recent "stress" tests show that t0025 if flaky, reported by Lars Schneider,
larsxschnei...@gmail.com

All tests from t0025 are covered in t0027 already, so that t0025 can be
retiered:


s/retiered/retired/


The execution time for t0027 is 14 seconds under Linux,
and 63 seconds under Mac Os X.
And in case you ask, things are not going significantly faster using a SSD
instead of a spinning disk.

Signed-off-by: Torsten Bögershausen 


Thank you for this patch.

Apart from the tyop, would it be possible to fix the formatting to look
less strange? (Unless you use this to transport a super-secret message
steganographically to an alien planet or some such, of course.)


Ping?


I didn't had the time to send a patch yet -
so either you feel free to amend the commit message locally and queue
that - or I send a new version in  2 weeks.


Re: Script to rebase branches

2017-05-08 Thread Jeff King
On Sat, May 06, 2017 at 12:23:32PM +0200, Lars Schneider wrote:

> I am about to write a bash/sh script that helps me to rebase a bunch of 
> branches (e.g. select branches based on prefix, conflict resolution/
> rerere support, ...).
> 
> I wonder if anyone has such a script already and is willing to share it.

This is what I use:

  https://github.com/peff/git/blob/meta/rebase

There's no documentation in the script, but the commit message in its
history should give a good sense of what each part does.

-Peff


Re: Not translatable strings in Git

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 09:49:34AM -0700, Stefan Beller wrote:

> On Sat, May 6, 2017 at 1:40 AM, Jordi Mas  wrote:
> > Hello,
> >
> > When translating git (https://github.com/git/git/tree/master/po)
> >
> > The following strings cannot be translated:
> >
> >  remote: Counting objects: 331, done.
> >  remote: Compressing objects: 100% (213/213), done.
> >  remote: Total 244 (delta 184), reused 34 (delta 29)
> 
> This is what the remote server tells a client, but the client did not
> tell the server which locale it has.
> 
> Now consider we have hosting service that hosts projects accessible from
> a wide range of users, all of them having different locales. Which language
> should the remote speak? (Not knowing which language the client speaks).

In theory the client could pass the locale variables. We probably
already do by default over ssh. Likewise we could ask curl to send an
HTTP header based on the current locale, and the server could respect
that. That probably wouldn't even be that much code. I think it's just
something that nobody has bothered to implement so far.

> >  27 files changed, 3399 insertions(+), 3320 deletions(-)
> >  create mode 100644 build-aux/flatpak/org.gnome.Nautilus-stable.json
> >  delete mode 100755 build-aux/meson/check_libgd.sh
> 
> These sound like they are done locally and could be translated easily.
> However I wonder if we have issues with leaky plumbing in the user porcelain.

I don't think the diffstat summary is marked for translation at all,
plumbing or not.  There's some history there:

  
http://public-inbox.org/git/1345922816-20616-1-git-send-email-pclo...@gmail.com/t/#u

There was an attempt to re-enable it for local-only commands:

  http://public-inbox.org/git/alpine.deb.2.00.1210231323480@ds9.cixit.se/

but nobody seems to have responded (I don't have much of an opinion
myself).

-Peff


Re: [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> I do feel a bit sad about breaking this case (or at the very least
> forcing you to set an option to retain cross-version compatibility). But
> my gut says that we don't want to lock ourselves into never changing the
> diff algorithm (and I'm sure we've done it inadvertently a few times
> over the years; even the recent switch to turning on renames would have
> had that impact).

IIRC, we never promised that different versions of Git produce the
same patch ID for the same change.  I do share that sadness, as not
making that promise would affect not just such an external database
keyed with patch-ids but also affects the rerere database, and those
who heavily depend on the rerere database would need to refresh them
every time diff/merge algorithm is updated, using something like
contrib/rerere-train.sh script.

But at the same time, I am very much against promising that the
textual diff output will never improve for human readability.



Re: [PATCH] am: check return value of resolve_refdup before using hash

2017-05-08 Thread Jeff King
On Sat, May 06, 2017 at 07:13:56PM +0200, René Scharfe wrote:

> If resolve_refdup() fails it returns NULL and possibly leaves its hash
> output parameter untouched.  Make sure to use it only if the function
> succeeded, in order to avoid accessing uninitialized memory.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  builtin/am.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/builtin/am.c b/builtin/am.c
> index a95dd8b4e6..2c52c820aa 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -2156,7 +2156,7 @@ static void am_abort(struct am_state *state)
>   am_rerere_clear();
>  
>   curr_branch = resolve_refdup("HEAD", 0, curr_head.hash, NULL);
> - has_curr_head = !is_null_oid(_head);
> + has_curr_head = curr_branch && !is_null_oid(_head);
>   if (!has_curr_head)
>   hashcpy(curr_head.hash, EMPTY_TREE_SHA1_BIN);

This one looks good to me (and no interesting ripple effects from
handling the error correctly :) ).

-Peff


Re: [PATCH] checkout: check return value of resolve_refdup before using hash

2017-05-08 Thread Jeff King
On Sat, May 06, 2017 at 07:13:52PM +0200, René Scharfe wrote:

> If resolve_refdup() fails it returns NULL and possibly leaves its hash
> output parameter untouched.  Make sure to use it only if the function
> succeeded, in order to avoid accessing uninitialized memory.
> 
> Found with t/t2011-checkout-invalid-head.sh --valgrind.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  builtin/checkout.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index bfa5419f33..6c3d2e4f4c 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -833,7 +833,8 @@ static int switch_branches(const struct checkout_opts 
> *opts,
>   int flag, writeout_error = 0;
>   memset(, 0, sizeof(old));
>   old.path = path_to_free = resolve_refdup("HEAD", 0, rev.hash, );
> - old.commit = lookup_commit_reference_gently(rev.hash, 1);
> + if (old.path)
> + old.commit = lookup_commit_reference_gently(rev.hash, 1);

I wondered for a second what the value of old.commit would be in the
error case. But it should be NULL due to the memset above.

But what happens after that? Is everybody OK with NULL? I briefly poked
through the callees (merge_working_tree, update_refs_for_switch, and
post_checkout_hook) and they all seem to explicitly handle the NULL.
So I think this is good (well, clearly your change was an improvement
either way, but I feel more confident now that there is nothing else to
be fixed on top of it).

-Peff


Re: [PATCH v2 2/2] receive-pack: verify push options in cert

2017-05-08 Thread Junio C Hamano
Jonathan Tan  writes:

> + # Tweak the push output to make the push option outside the cert
> + # different, then replay it on a fresh dst, checking that ff is not
> + # deleted.
> + sed -i "s/\\([^ ]\\)bar/\\1baz/" push &&

This is not portable.  Didn't you get an error from the lint
checker?

The attached may not be enough if "push" is a binary file, though,
in which case perl may be your friend ;-)

 t/t5534-push-signed.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh
index 0ef6f66b5d..effe769688 100755
--- a/t/t5534-push-signed.sh
+++ b/t/t5534-push-signed.sh
@@ -152,11 +152,11 @@ test_expect_success GPG 'inconsistent push options in 
signed push not allowed' '
# Tweak the push output to make the push option outside the cert
# different, then replay it on a fresh dst, checking that ff is not
# deleted.
-   sed -i "s/\\([^ ]\\)bar/\\1baz/" push &&
+   sed -e "s/\\([^ ]\\)bar/\\1baz/" push >push.tweak &&
prepare_dst &&
git -C dst config receive.certnonceseed sekrit &&
git -C dst config receive.advertisepushoptions 1 &&
-   git receive-pack dst out &&
+   git receive-pack dst out &&
git -C dst rev-parse ff &&
grep "inconsistent push options" out
 '




Re: [PATCH v4 0/4] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 12:03:35PM -0400, Marc Branchaud wrote:

> The only change from v3 is in 3/4, to expand t4061 to test various
> combinations of --(no-)indent-heuristic and diff.indentHeuristic.
> 
> I kindof went all-in and tried to cover every possible combination for
> all four affected commands.

TBH, I don't know that we need to be that thorough. Unless we have a
reason to believe that the code will behave differently in context A
versus B, it's probably not buying us much. It would be nice, of course,
if we could get full coverage of all possible paths through the program,
but I think that involves a combinator explosion.

But I'm OK with what you have here (the whole series, though I dropped
one or two comments while reading it).

-Peff


Re: [PATCH v4 4/4] add--interactive: drop diff.indentHeuristic handling

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 12:03:39PM -0400, Marc Branchaud wrote:

> @@ -730,9 +729,6 @@ sub parse_diff {
>   if (defined $diff_algorithm) {
>   splice @diff_cmd, 1, 0, "--diff-algorithm=${diff_algorithm}";
>   }
> - if ($diff_indent_heuristic) {
> - splice @diff_cmd, 1, 0, "--indent-heuristic";
> - }

I don't remember if I mentioned this before, but this series (and the
reasoning why it is OK to tweak the default) did make me wonder if it
be reasonable to respect diff.algorithm even in plumbing.

I don't actually use it myself, and certainly it would not need to be
part of this series. But perhaps if somebody is really into alternate
diff algorithms they'd be interested in following it up (my own
experience with alternate algorithms has usually been "wow, this diff is
ugly; I wonder if --patience helps" followed by "nope, still ugly").

-Peff


Re: [PATCH v4 0/4] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 10:13:41AM -0700, Stefan Beller wrote:

> > -test_expect_success 'diff: nice spaces with diff.indentHeuristic' '
> > +test_expect_success 'diff: nice spaces with diff.indentHeuristic=true' '
> 
> Adding the '=true' seems weird to me (I'd think the true is implied,
> similar to C,
> where we do "if (!null_pointer)" instead of an explicit "!=null".
> However we do have these '=true' sprinkled all over the tests,
> so I guess it is a valid dialect in our test suite.

I think it's better to be explicit here. I think the original was less
about "implied true", as it was making the assumption that the feature
was off in the first place. Now that we've flipped that, just saying
"diff.indentHeuristic" is unclear whether you mean the "implied true"
of:

  git -c diff.indentHeuristic diff ...

or if you mean setting diff.indentHeuristic at all but aren't saying
which (which I think is how the original meant it).

I dunno. I guess it is subjective, but IMHO the "=true" is much clearer.

-Peff


Re: [PATCH v4 2/4] diff: have the diff-* builtins configure diff before initializing revisions

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 12:03:37PM -0400, Marc Branchaud wrote:

> This matches how the diff Porcelain works.  It makes the plumbing commands
> respect diff's configuration options, such as indentHeuristic, because
> init_revisions() calls diff_setup() which fills in the diff_options struct.

I don't know if you want to note here that this is only _some_ options.
I.e., ones that we handle by copying via diff_setup(). Maybe it's
obvious from the description already (it's hard for me to tell because I
already know it either way :) ).

-Peff


Re: Enabling the diff "indent" heuristic by default

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 10:54:10AM -0400, Marc Branchaud wrote:

> On 2017-05-08 03:48 AM, Junio C Hamano wrote:
> > 
> > * mb/diff-default-to-indent-heuristics (2017-05-02) 4 commits
> >   (merged to 'next' on 2017-05-08 at 158f401a92)
> 
> I think there's a general open question about this, which is whether or not
> we should just drop the diff.indentHeuristic configuration setting
> altogether.
> 
> Peff made the point [0] that if we keep the setting then t4061 should be
> rewritten.
> 
> My instinct is to keep the setting, at least until the changed default has a
> bit of time to settle in.  So I'll re-send the topic with the renovated
> t4061.

My instinct matches that, too. It gives people like Ævar, with his
patch-id database, a way to keep compatibility if he chooses. If we were
designing from the ground up, I'd say the option is probably just
clutter, but the backwards compatibility issue means we should probably
keep it around more or less forever.

And since Junio wasn't on the other thread, I'll repeat what I wrote
there:

  I do feel a bit sad about breaking this case (or at the very least
  forcing you to set an option to retain cross-version compatibility).
  But my gut says that we don't want to lock ourselves into never
  changing the diff algorithm (and I'm sure we've done it inadvertently
  a few times over the years; even the recent switch to turning on
  renames would have had that impact).

> Both Peff [1] and Ævar [2] mentioned situations where enabling the heuristic
> has a small impact on them.  If/when this graduates, it's perhaps worth
> adding a backward-compatibility note that the default patch IDs are
> changing.  Maybe something like:
> 
> The diff "indent" heuristic is now enabled by default.  This changes the
> patch IDs calculated by git-patch-id and used by git-cherry, which could
> affect patch-based workflows that rely on previously-computed patch IDs.
> The heuristic can be disabled by setting diff.indentHeuristic to false.

I think a note like this is a good idea.

-Peff


Re: [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Jeff King
On Mon, May 01, 2017 at 12:34:38PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I don't know if we would want to be extra paranoid about patch-ids.
> > There is no helping:
> >
> >   git rev-list HEAD | git diff-tree --stdin -p | git patch-id --stable
> >
> > because diff-tree doesn't know that it's trying for "--stable" output.
> > But the diffs we compute internally for patch-id could disable the
> > heuristics. I'm not sure if those matter, though. AFAIK those are used
> > only for internal comparisons within a single program. I.e., we never
> > compare them against input from the user, nor do we output them to the
> > user. So they'll change, but I don't think anybody would care.
> 
> I have a few-million row table with commit_id as one column & patch_id
> as another. I.e. a commit -> patch_id mapping.

Thanks for this data point. It's always interesting to hear about
unforeseen uses of the tools.

Out of curiosity, how do you generate the patch-ids? Is it with
something like diff-tree piped to patch-id?

I do feel a bit sad about breaking this case (or at the very least
forcing you to set an option to retain cross-version compatibility). But
my gut says that we don't want to lock ourselves into never changing the
diff algorithm (and I'm sure we've done it inadvertently a few times
over the years; even the recent switch to turning on renames would have
had that impact).

-Peff


Re: [PATCH v2 2/2] receive-pack: verify push options in cert

2017-05-08 Thread Junio C Hamano
Jonathan Tan  writes:

> Teach receive-pack, in the case that push options are provided for a
> signed push, to verify that the push options both within the cert and
> outside the cert are consistent.

Thanks.  The idea was that the certificate should record how the
push was made fully, hence we need two copies.  The one outside the
certificate is meant to be actually used, but obviously we need to
make sure that matches what is recorded in the certificate.

In retrospect, we could have required the receiver who groks signed
pushes to only look inside the certificate for options etc. so that
the sender can omit the "extra" copies outside the certificate, but
that is not how the current protocol is structured, hence ...

> This sets in stone the requirement that send-pack redundantly send its
> push options in 2 places,...

... this requirement.

Thanks.


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> On Tue, May 09, 2017 at 11:48:17AM +0900, Junio C Hamano wrote:
>
>> > I guess what I was asking was: do you still think it was unclear, or do
>> > you think you were just being dense?
>> >
>> > I don't feel like I gave any information in the follow-on explanation
>> > that wasn't in the commit message, so I wasn't clear if I worded it
>> > better or if it just sunk in better.
>> 
>> At least, "the current code is buggy when --local and friend are
>> given and includes needless objects in the result" was something I
>> learned only during the discussion, and would never have guessed by
>> reading the log message.  The second paragraph does talk about "This
>> bug has been present since...", but the first paragraph does not say
>> anything about the current output being broken.
>
> While waiting for your response I took a look to see if I could improve
> it and came to the same conclusion. The result is below.

Looks good to me.  I really like how the third-paragraph reasons
about pros and cons and decides to just disable the codepath.

I see this as an example of omitting something you know so well as
"too obvious", and it turns out that it isn't so obvious to others;
I commit the same sin all the time myself.  Catching instances of
these is part of the review process.

Thanks.

> -- >8 --
> Subject: pack-objects: disable pack reuse for object-selection options
>
> If certain options like --honor-pack-keep, --local, or
> --incremental are used with pack-objects, then we need to
> feed each potential object to want_object_in_pack() to see
> if it should be filtered out. But when the bitmap
> reuse_packfile optimization is in effect, we do not call
> that function at all, and in fact skip adding the objects to
> the to_pack list entirely.  This means we have a bug: for
> certain requests we will silently ignore those options and
> include objects in that pack that should not be there.
>
> The problem has been present since the inception of the
> pack-reuse code in 6b8fda2db (pack-objects: use bitmaps when
> packing objects, 2013-12-21), but it was unlikely to come up
> in practice.  These options are generally used for on-disk
> packing, not transfer packs (which go to stdout), but we've
> never allowed pack reuse for non-stdout packs (until
> 645c432d6, we did not even use bitmaps, which the reuse
> optimization relies on; after that, we explicitly turned it
> off when not packing to stdout).
>
> We can fix this by just disabling the reuse_packfile
> optimization when the options are in use. In theory we could
> teach the pack-reuse code to satisfy these checks, but it's
> not worth the complexity. The purpose of the optimization is
> to keep the amount of per-object work we do to a minimum.
> But these options inherently require us to search for other
> copies of each object, drowning out any benefit of the
> pack-reuse optimization. But note that the optimizations
> from 56dfeb626 (pack-objects: compute local/ignore_pack_keep
> early, 2016-07-29) happen before pack-reuse, meaning that
> specifying "--honor-pack-keep" in a repository with no .keep
> files can still follow the fast path.
>
> There are tests in t5310 that check these options with
> bitmaps and --stdout, but they didn't catch the bug, and
> it's hard to adapt them to do so.
>
> One problem is that they don't use --delta-base-offset;
> without that option, we always disable the reuse
> optimization entirely. It would be fine to add it in (it
> actually makes the test more realistic), but that still
> isn't quite enough.
>
> The other problem is that the reuse code is very picky; it
> only kicks in when it can reuse most of a pack, starting
> from the first byte. So we'd have to start from a fully
> repacked and bitmapped state to trigger it. But the tests
> for these options use a much more subtle state; they want to
> be sure that the want_object_in_pack() code is allowing some
> objects but not others. Doing a full repack runs counter to
> that.
>
> So this patch adds new tests at the end of the script which
> create the fully-packed state and make sure that each option
> is not fooled by reusable pack.
>
> Signed-off-by: Jeff King 


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 10:54:12PM -0400, Jeff King wrote:

> Contents are the same. I decided to leave the "; do" as it
> matches the rest of the script. If we're going to fix it, we
> should do them all.

Like this, perhaps.

I didn't go on a crusade against "; do" in the other scripts, but
perhaps that is low-hanging fruit that somebody else might want to pick.

-- >8 --
Subject: [PATCH] t5310: fix "; do" style

Our usual shell style is to put the "do" of a loop on its
own line, like:

  while $cond
  do
  something
  done

instead of:

  while $cond; do
  something
  done

We have a bit of both in our code base, but the former is
what's in CodingGuidelines (and outnumbers the latter in t/
by about 6:1).

Signed-off-by: Jeff King 
---
 t/t5310-pack-bitmaps.sh | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index c3ddfa217..20e2473a0 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -20,11 +20,13 @@ has_any () {
 }
 
 test_expect_success 'setup repo with moderate-sized history' '
-   for i in $(test_seq 1 10); do
+   for i in $(test_seq 1 10)
+   do
test_commit $i
done &&
git checkout -b other HEAD~5 &&
-   for i in $(test_seq 1 10); do
+   for i in $(test_seq 1 10)
+   do
test_commit side-$i
done &&
git checkout master &&
@@ -104,7 +106,8 @@ test_expect_success 'clone from bitmapped repository' '
 '
 
 test_expect_success 'setup further non-bitmapped commits' '
-   for i in $(test_seq 1 10); do
+   for i in $(test_seq 1 10)
+   do
test_commit further-$i
done
 '
@@ -300,7 +303,8 @@ test_expect_success 'set up reusable pack' '
 
 test_expect_success 'pack reuse respects --honor-pack-keep' '
test_when_finished "rm -f .git/objects/pack/*.keep" &&
-   for i in .git/objects/pack/*.pack; do
+   for i in .git/objects/pack/*.pack
+   do
>${i%.pack}.keep
done &&
reusable_pack --honor-pack-keep >empty.pack &&
-- 
2.13.0.rc2.436.g524cea07c



Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Junio C Hamano
"Randall S. Becker"  writes:

> I have to admit that I just assumed it would have to work that way
> this would not be particularly useful. However, in thinking about
> it, we might want to limit the depth of how far -b  takes
> effect. If the super module brings in submodules entirely within
> control of the development group, having -b  apply down to
> leaf submodules makes sense (in some policies). However, if some
> submodules span out to, say, gnulib, that might not make
> particular sense.

I do not see a strong reason to avoid your own branches in "other
people's project" like this.

The submodule's upstream may be a project you have no control over,
but the repository you have locally is under your total control and
you can use any branch names to suit the need of your project as the
whole (i.e. the superproject and submodules bound to it).

The fact that local branch names are under your control and for your
own use is true even when you are not using submodules, by the way.



Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Jeff King
On Tue, May 09, 2017 at 11:48:17AM +0900, Junio C Hamano wrote:

> > I guess what I was asking was: do you still think it was unclear, or do
> > you think you were just being dense?
> >
> > I don't feel like I gave any information in the follow-on explanation
> > that wasn't in the commit message, so I wasn't clear if I worded it
> > better or if it just sunk in better.
> 
> At least, "the current code is buggy when --local and friend are
> given and includes needless objects in the result" was something I
> learned only during the discussion, and would never have guessed by
> reading the log message.  The second paragraph does talk about "This
> bug has been present since...", but the first paragraph does not say
> anything about the current output being broken.

While waiting for your response I took a look to see if I could improve
it and came to the same conclusion. The result is below.

> So, I am not sure if this was a case where I was dense.  I think the
> first paragraph needs a bit more clarity.

Yeah, you were not being dense. I just wrote it badly. Sorry
for the confusion.

-- >8 --
Subject: pack-objects: disable pack reuse for object-selection options

If certain options like --honor-pack-keep, --local, or
--incremental are used with pack-objects, then we need to
feed each potential object to want_object_in_pack() to see
if it should be filtered out. But when the bitmap
reuse_packfile optimization is in effect, we do not call
that function at all, and in fact skip adding the objects to
the to_pack list entirely.  This means we have a bug: for
certain requests we will silently ignore those options and
include objects in that pack that should not be there.

The problem has been present since the inception of the
pack-reuse code in 6b8fda2db (pack-objects: use bitmaps when
packing objects, 2013-12-21), but it was unlikely to come up
in practice.  These options are generally used for on-disk
packing, not transfer packs (which go to stdout), but we've
never allowed pack reuse for non-stdout packs (until
645c432d6, we did not even use bitmaps, which the reuse
optimization relies on; after that, we explicitly turned it
off when not packing to stdout).

We can fix this by just disabling the reuse_packfile
optimization when the options are in use. In theory we could
teach the pack-reuse code to satisfy these checks, but it's
not worth the complexity. The purpose of the optimization is
to keep the amount of per-object work we do to a minimum.
But these options inherently require us to search for other
copies of each object, drowning out any benefit of the
pack-reuse optimization. But note that the optimizations
from 56dfeb626 (pack-objects: compute local/ignore_pack_keep
early, 2016-07-29) happen before pack-reuse, meaning that
specifying "--honor-pack-keep" in a repository with no .keep
files can still follow the fast path.

There are tests in t5310 that check these options with
bitmaps and --stdout, but they didn't catch the bug, and
it's hard to adapt them to do so.

One problem is that they don't use --delta-base-offset;
without that option, we always disable the reuse
optimization entirely. It would be fine to add it in (it
actually makes the test more realistic), but that still
isn't quite enough.

The other problem is that the reuse code is very picky; it
only kicks in when it can reuse most of a pack, starting
from the first byte. So we'd have to start from a fully
repacked and bitmapped state to trigger it. But the tests
for these options use a much more subtle state; they want to
be sure that the want_object_in_pack() code is allowing some
objects but not others. Doing a full repack runs counter to
that.

So this patch adds new tests at the end of the script which
create the fully-packed state and make sure that each option
is not fooled by reusable pack.

Signed-off-by: Jeff King 
---
Contents are the same. I decided to leave the "; do" as it
matches the rest of the script. If we're going to fix it, we
should do them all.

 builtin/pack-objects.c  |  6 +-
 t/t5310-pack-bitmaps.sh | 38 ++
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0fe35d1b5..50e01aa80 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2717,7 +2717,11 @@ static void loosen_unused_packed_objects(struct rev_info 
*revs)
  */
 static int pack_options_allow_reuse(void)
 {
-   return pack_to_stdout && allow_ofs_delta;
+   return pack_to_stdout &&
+  allow_ofs_delta &&
+  !ignore_packed_keep &&
+  (!local || !have_non_local_packs) &&
+  !incremental;
 }
 
 static int get_object_list_from_bitmap(struct rev_info *revs)
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 424bec7d7..c3ddfa217 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -289,4 +289,42 @@ test_expect_success 'splitting packs does 

Re: git-clone --config order & fetching extra refs during initial clone

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> Actually, it's not too bad, because we can pick up things like
> option_origin from the globals. Here it is for reference. The nice thing
> about it (IMHO) is that it makes the lifetimes of the helper variables
> much more shorter and more clear.
>
> But I'm OK with any of Gábor's original, my earlier squash, or this one.
>
> (Also, as a side note, the free(refspec) in the context at the bottom
> seems like it probably ought to be free_refspec() in the original and
> free_refspecs() after this change, but I didn't investigate).

I'll tentatively queue this on top of the original and wait for
Gábor to say something, then ;-) as I do agree that this one looks
reasonable.

Thanks.


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> On Tue, May 09, 2017 at 11:14:18AM +0900, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> >> Ah, OK, and now I understand why you called this a "bug" (which is
>> >> older and do not need to be addressed as part of 2.13) in the
>> >> original message.  The new tests check requests that ought to
>> >> produce an empty packfile as the result actually do, but with the
>> >> current code, the reuse code does not work with --local and friends
>> >> and ends up including what was requested to be excluded.
>> >
>> > Right. Did you want me to try re-wording the commit message, or does it
>> > make sense now?
>> 
>> It does make sense to me now, but I do not speak for all future
>> readers of "git log", so...
>
> I guess what I was asking was: do you still think it was unclear, or do
> you think you were just being dense?
>
> I don't feel like I gave any information in the follow-on explanation
> that wasn't in the commit message, so I wasn't clear if I worded it
> better or if it just sunk in better.

At least, "the current code is buggy when --local and friend are
given and includes needless objects in the result" was something I
learned only during the discussion, and would never have guessed by
reading the log message.  The second paragraph does talk about "This
bug has been present since...", but the first paragraph does not say
anything about the current output being broken.

So, I am not sure if this was a case where I was dense.  I think the
first paragraph needs a bit more clarity.

If certain options like --honor-pack-keep, --local, or
--incremental are used with pack-objects, then we need to
feed each potential object to want_object_in_pack() to see
if it should be filtered out.  This is totally contrary to
the purpose of the pack-reuse optimization, which tries hard
to avoid doing any per-object work.  Therefore we need to
disable this optimization when these options are in use.

Perhaps "... should be filtered out." can be followed by "However,
the current code fails to do so, and we end up including these
unwanted objects in the output.", and then "This is totally..."  can
instead begin with "Besides, having to do per-object filtering is
totally...".  I wouldn't have been confused if it were like so.



Re: [PATCH] diff.c: Fix whitespace issues due to a mismerge(?)

2017-05-08 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks for spotting.  
>
> I do not think you have to worry about any bug in Git-the-program
> with this merge.  If you try to reproduce the merge yourself (which
> by the way is easy to do, with "M=4af9a7d344 && git checkout $M^ &&
> git merge $M^2"), you'll see that quite a lot of changes made to
> "builtin/apply.c" had to be hand-ported to the corresponding lines
> that are now in "apply.c" at the top-level, because in the meantime
> 13b5af22 ("apply: move libified code from builtin/apply.c to
> apply.{c,h}", 2016-04-22) moved things around while the merged side
> branch has been cooking.  
>
> It is very likely that manual killing and yanking in Emacs
> introduced the screw-up.

I did your patch manually by hand and saw that a few hunks are still
left whitespace-broken by your patch, so I committed this one
instead.  Thanks.

-- >8 --
From: Junio C Hamano 
Date: Mon, 8 May 2017 19:30:24 -0700
Subject: [PATCH] apply.c: fix whitespace-only mismerge

4af9a7d3 ("Merge branch 'bc/object-id'", 2016-09-19) involved
merging a lot of changes made to builtin/apply.c on the side branch
manually to apply.c as an intervening commit 13b5af22 ("apply: move
libified code from builtin/apply.c to apply.{c,h}", 2016-04-22)
moved a lot of the lines changed on the side branch to a different
file apply.c at the top-level, requiring manual patching of it.
Apparently, the maintainer screwed up and made the code indent in a
funny way while doing so.

Reported-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 apply.c | 338 
 1 file changed, 169 insertions(+), 169 deletions(-)

diff --git a/apply.c b/apply.c
index 0e2caeab9c..acebd1cb49 100644
--- a/apply.c
+++ b/apply.c
@@ -4091,181 +4091,181 @@ static int build_fake_ancestor(struct apply_state 
*state, struct patch *list)
res = write_locked_index(, , COMMIT_LOCK);
discard_index();
 
-if (res)
-return error(_("could not write temporary index to %s"),
- state->fake_ancestor);
+   if (res)
+   return error(_("could not write temporary index to %s"),
+state->fake_ancestor);
 
-return 0;
- }
+   return 0;
+}
 
- static void stat_patch_list(struct apply_state *state, struct patch *patch)
- {
-int files, adds, dels;
+static void stat_patch_list(struct apply_state *state, struct patch *patch)
+{
+   int files, adds, dels;
 
-for (files = adds = dels = 0 ; patch ; patch = patch->next) {
-files++;
-adds += patch->lines_added;
-dels += patch->lines_deleted;
-show_stats(state, patch);
-}
+   for (files = adds = dels = 0 ; patch ; patch = patch->next) {
+   files++;
+   adds += patch->lines_added;
+   dels += patch->lines_deleted;
+   show_stats(state, patch);
+   }
 
-print_stat_summary(stdout, files, adds, dels);
- }
+   print_stat_summary(stdout, files, adds, dels);
+}
 
- static void numstat_patch_list(struct apply_state *state,
-   struct patch *patch)
- {
-for ( ; patch; patch = patch->next) {
-const char *name;
-name = patch->new_name ? patch->new_name : patch->old_name;
-if (patch->is_binary)
-printf("-\t-\t");
-else
-printf("%d\t%d\t", patch->lines_added, 
patch->lines_deleted);
-write_name_quoted(name, stdout, state->line_termination);
-}
- }
-
- static void show_file_mode_name(const char *newdelete, unsigned int mode, 
const char *name)
- {
-if (mode)
-printf(" %s mode %06o %s\n", newdelete, mode, name);
-else
-printf(" %s %s\n", newdelete, name);
- }
-
- static void show_mode_change(struct patch *p, int show_name)
- {
-if (p->old_mode && p->new_mode && p->old_mode != p->new_mode) {
-if (show_name)
-printf(" mode change %06o => %06o %s\n",
-   p->old_mode, p->new_mode, p->new_name);
-else
-printf(" mode change %06o => %06o\n",
-   p->old_mode, p->new_mode);
-}
- }
-
- static void show_rename_copy(struct patch *p)
- {
-const char *renamecopy = p->is_rename ? "rename" : "copy";
-const char *old, *new;
-
-/* Find common prefix */
-old = p->old_name;
-new = p->new_name;
-while (1) {
-const char *slash_old, *slash_new;
-slash_old = strchr(old, '/');
-slash_new = strchr(new, '/');
-if (!slash_old ||
-!slash_new ||
-

Re: [RFC PATCH 02/10] Move textconv_object to be with other textconv methods

2017-05-08 Thread Jeff King
On Tue, May 09, 2017 at 10:49:19AM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > The specific helpers have less visibility, which is good. The public
> > functions a() and b() were already public, so no change. But now the
> > common helper is public, too, even though nobody except a() and b() care
> > about it.
> >
> > So it's a tradeoff. And the important question is: is the bleed-over
> > between a() and b() worse than the common bits being made public?
> 
> At this point you are saying the same thing as I said ;-) 

Maybe, but I think my hunch is that the answer to the question is
different than yours. ;)

> I haven't touched diff.c for a while, but my impression was that it
> was already a smallest logical unit, especially since some bits like
> diff-lib.c (the interface for the front-end programs to drive the
> machinery via the three standard pairs of sources) and ws.c (diff
> subsystem to deal with whitespace errors) are already split out (and
> needless to say, the diffcore transformations are all in their own
> separate files).  Over time people may have added what does not
> belong there merely for convenience, which may want to get ejected,
> but I do not think of many instances of them offhand.

Just skimming the file, I suspect diff_filespec could be broken out into
a semi-public interface (many of its functions already are public
anyway), probably dirstat could be broken out, and possibly other
specific formats could be broken into their own files. But I didn't
spend much time on it.

> It appears that the textconv related helpers (which was where this
> discussion thread started from) could be first restructured so that
> they do not depend on diff_filespec and turned into a more generic
> "I have this path and a blob object name, please run textconv to it
> and..." interface and then moved out of the file into its own
> textconv.[ch].  It does not need access to quite a few fields in
> diff_filespec structure (e.g.  it shouldn't care what filemode the
> thing has).  The diff machinery wants the result in a contiguous
> piece of memory and that could be a good starting point, but I said
> "and..." above because I am not sure if it is the best generic
> thing the restructured interface can do to the result.

I think some of the users of textconv do have to go through some
contortions to turn their buffers into diff_filespecs (e.g., blame and
grep), and it would be nice to fix that. But the diff-driver lookup and
caching is tied to the diff_filespec, too (which is perhaps a sign of a
misfeature in the first place, as the "diff" attribute is being used for
things like "grep" and "cat-file"). So we'd have to resolve that.

So I think there is room for improvement there, but I'm also not sure it
is worth the trouble. AFAIK neither textconv nor the general length of
diff.c is holding up anybody's work or even making it significantly more
difficult. If people have spare cycles, they are free to work on
anything, of course. But I can think of code that is probably in more
dire need of refactoring. :)

-Peff


Re: git-clone --config order & fetching extra refs during initial clone

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 10:10:28PM -0400, Jeff King wrote:

> On Tue, May 09, 2017 at 10:33:39AM +0900, Junio C Hamano wrote:
> 
> > >> My patch deals with 'remote..refspec', i.e. 'remote->fetch'.
> > >> Apparently some extra care is necessary for 'remote..tagOpt' and
> > >> 'remote->fetch_tags', too.  Perhaps there are more, I haven't checked
> > >> again, and maybe we'll add similar config variables in the future.  So
> > >> I don't think that dealing with such config variables one by one in
> > >> 'git clone', too, is the right long-term solution...  but perhaps it's
> > >> sufficient for the time being?
> > >
> > > I think your patch is a strict improvement and we don't need to hold up
> > > waiting for a perfect fix (and because of the --single-branch thing you
> > > mentioned, this may be the best we can do anyway).
> > 
> > OK, so where does this patch stand now?  It already is too late for
> > the upcoming release, but should we merge it to 'next' once the
> > release is made, cook it in 'next' and shoot for the next release
> > as-is, or do we want to allow minor tweaks before it hits 'next'?
> 
> I'd be OK with it as-is, but here are my nitpicks as a diff (keep the
> assignment of refspec_count in one place, and free fetch_patterns as
> soon as it is no longer used).
> 
> I actually think it might be nice to pull the whole thing out into its
> own function, but the parameters it takes would be oddly specific.

Actually, it's not too bad, because we can pick up things like
option_origin from the globals. Here it is for reference. The nice thing
about it (IMHO) is that it makes the lifetimes of the helper variables
much more shorter and more clear.

But I'm OK with any of Gábor's original, my earlier squash, or this one.

(Also, as a side note, the free(refspec) in the context at the bottom
seems like it probably ought to be free_refspec() in the original and
free_refspecs() after this change, but I didn't investigate).

diff --git a/builtin/clone.c b/builtin/clone.c
index 0630202c8..645cfa4fd 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -841,6 +841,37 @@ static void dissociate_from_references(void)
free(alternates);
 }
 
+static struct refspec *get_clone_refspecs(const char *base_refspec,
+ unsigned int *count)
+{
+   char *key;
+   const struct string_list *config_fetch_patterns;
+   struct refspec *ret;
+
+   key = xstrfmt("remote.%s.fetch", option_origin);
+   config_fetch_patterns = git_config_get_value_multi(key);
+
+   if (!config_fetch_patterns) {
+   *count = 1;
+   ret = parse_fetch_refspec(1, _refspec);
+   } else {
+   const char **fetch_patterns;
+   struct string_list_item *fp;
+   unsigned int i = 1;
+
+   *count = 1 + config_fetch_patterns->nr;
+   ALLOC_ARRAY(fetch_patterns, *count);
+   fetch_patterns[0] = base_refspec;
+   for_each_string_list_item(fp, config_fetch_patterns)
+   fetch_patterns[i++] = fp->string;
+   ret = parse_fetch_refspec(*count, fetch_patterns);
+   free(fetch_patterns);
+   }
+
+   free(key);
+   return ret;
+}
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
int is_bundle = 0, is_local;
@@ -861,9 +892,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
int err = 0, complete_refs_before_fetch = 1;
 
struct refspec *refspec;
-   unsigned int refspec_count = 1;
-   const char **fetch_patterns;
-   const struct string_list *config_fetch_patterns;
+   unsigned int refspec_count;
 
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
@@ -982,7 +1011,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_addf(_top, "refs/remotes/%s/", option_origin);
}
 
-   strbuf_addf(, "+%s*:%s*", src_ref_prefix, branch_top.buf);
strbuf_addf(, "remote.%s.url", option_origin);
git_config_set(key.buf, repo);
strbuf_reset();
@@ -990,21 +1018,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_reference.nr)
setup_reference();
 
-   strbuf_addf(, "remote.%s.fetch", option_origin);
-   config_fetch_patterns = git_config_get_value_multi(key.buf);
-   if (config_fetch_patterns)
-   refspec_count = 1 + config_fetch_patterns->nr;
-   fetch_patterns = xcalloc(refspec_count, sizeof(*fetch_patterns));
-   fetch_patterns[0] = value.buf;
-   if (config_fetch_patterns) {
-   struct string_list_item *fp;
-   unsigned int i = 1;
-   for_each_string_list_item(fp, config_fetch_patterns)
-   fetch_patterns[i++] = fp->string;
-   }
-   refspec = parse_fetch_refspec(refspec_count, fetch_patterns);
-
- 

Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Jeff King
On Tue, May 09, 2017 at 11:14:18AM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> Ah, OK, and now I understand why you called this a "bug" (which is
> >> older and do not need to be addressed as part of 2.13) in the
> >> original message.  The new tests check requests that ought to
> >> produce an empty packfile as the result actually do, but with the
> >> current code, the reuse code does not work with --local and friends
> >> and ends up including what was requested to be excluded.
> >
> > Right. Did you want me to try re-wording the commit message, or does it
> > make sense now?
> 
> It does make sense to me now, but I do not speak for all future
> readers of "git log", so...

I guess what I was asking was: do you still think it was unclear, or do
you think you were just being dense?

I don't feel like I gave any information in the follow-on explanation
that wasn't in the commit message, so I wasn't clear if I worded it
better or if it just sunk in better.

-Peff


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

>> Ah, OK, and now I understand why you called this a "bug" (which is
>> older and do not need to be addressed as part of 2.13) in the
>> original message.  The new tests check requests that ought to
>> produce an empty packfile as the result actually do, but with the
>> current code, the reuse code does not work with --local and friends
>> and ends up including what was requested to be excluded.
>
> Right. Did you want me to try re-wording the commit message, or does it
> make sense now?

It does make sense to me now, but I do not speak for all future
readers of "git log", so...


Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Junio C Hamano
Jeff Hostetler  writes:

> On 5/8/2017 4:03 PM, Christian Couder wrote:
>> On Mon, May 8, 2017 at 6:50 PM, Jeff Hostetler  
>> wrote:
>>>
>>>
>>> On 5/8/2017 5:45 AM, Christian Couder wrote:

 This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
 set on my Linux machine.

 Also it looks like you sent a v8 of this patch series with a different
 test, but what is in master looks like the above test instead of the
 test in your v8.
>>>
>>> There was concern about using sed on a binary file not being portable
>>> and a request to change the test to just corrupt the checksum rather
>>> than an index-entry, so I changed it in v8.
>>>
>>> Does the v8 version of the test also fail on your machine ?
>>
>> The v8 version of the test succeeds on my machine.
>
> OK, thanks.  It worked on mine too.  Since the v8 version is a
> better test, I'm content to stick with it and not try to address
> the problem with the previous version.

Thanks.  Let's merge that in.


Re: git-clone --config order & fetching extra refs during initial clone

2017-05-08 Thread Jeff King
On Tue, May 09, 2017 at 10:33:39AM +0900, Junio C Hamano wrote:

> >> My patch deals with 'remote..refspec', i.e. 'remote->fetch'.
> >> Apparently some extra care is necessary for 'remote..tagOpt' and
> >> 'remote->fetch_tags', too.  Perhaps there are more, I haven't checked
> >> again, and maybe we'll add similar config variables in the future.  So
> >> I don't think that dealing with such config variables one by one in
> >> 'git clone', too, is the right long-term solution...  but perhaps it's
> >> sufficient for the time being?
> >
> > I think your patch is a strict improvement and we don't need to hold up
> > waiting for a perfect fix (and because of the --single-branch thing you
> > mentioned, this may be the best we can do anyway).
> 
> OK, so where does this patch stand now?  It already is too late for
> the upcoming release, but should we merge it to 'next' once the
> release is made, cook it in 'next' and shoot for the next release
> as-is, or do we want to allow minor tweaks before it hits 'next'?

I'd be OK with it as-is, but here are my nitpicks as a diff (keep the
assignment of refspec_count in one place, and free fetch_patterns as
soon as it is no longer used).

I actually think it might be nice to pull the whole thing out into its
own function, but the parameters it takes would be oddly specific.

diff --git a/builtin/clone.c b/builtin/clone.c
index 0630202c8..577529a11 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -861,7 +861,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
int err = 0, complete_refs_before_fetch = 1;
 
struct refspec *refspec;
-   unsigned int refspec_count = 1;
+   unsigned int refspec_count;
const char **fetch_patterns;
const struct string_list *config_fetch_patterns;
 
@@ -994,6 +994,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
config_fetch_patterns = git_config_get_value_multi(key.buf);
if (config_fetch_patterns)
refspec_count = 1 + config_fetch_patterns->nr;
+   else
+   refspec_count = 1;
fetch_patterns = xcalloc(refspec_count, sizeof(*fetch_patterns));
fetch_patterns[0] = value.buf;
if (config_fetch_patterns) {
@@ -1003,6 +1005,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
fetch_patterns[i++] = fp->string;
}
refspec = parse_fetch_refspec(refspec_count, fetch_patterns);
+   free(fetch_patterns);
 
strbuf_reset();
strbuf_reset();
@@ -1129,7 +1132,6 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
strbuf_release();
junk_mode = JUNK_LEAVE_ALL;
 
-   free(fetch_patterns);
free(refspec);
return err;
 }

-Peff


Re: [PATCH] diff.c: Fix whitespace issues due to a mismerge(?)

2017-05-08 Thread Junio C Hamano
Stefan Beller  writes:

> It looks like all these lines were introduced by one of the conflict chunks
> in 4af9a7d344 (Merge branch 'bc/object-id', 2016-09-19). Viewing that
> commit in gitk, the indentation seems fine, i.e. there is just one
> whitespace in front of the lines, as you would expect from a formatted
> patch.
>
> Signed-off-by: Stefan Beller 
> ---
>
>  Junio, 
>  I do not think it is worth to apply this patch on its own,
>  but maybe it is worth to investigate your setup? (Assuming it is
>  git that did the merge, we may have a bug in whitespacing and
>  merge conflicts.)

Thanks for spotting.  

I do not think you have to worry about any bug in Git-the-program
with this merge.  If you try to reproduce the merge yourself (which
by the way is easy to do, with "M=4af9a7d344 && git checkout $M^ &&
git merge $M^2"), you'll see that quite a lot of changes made to
"builtin/apply.c" had to be hand-ported to the corresponding lines
that are now in "apply.c" at the top-level, because in the meantime
13b5af22 ("apply: move libified code from builtin/apply.c to
apply.{c,h}", 2016-04-22) moved things around while the merged side
branch has been cooking.  

It is very likely that manual killing and yanking in Emacs
introduced the screw-up.


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Jeff King
On Tue, May 09, 2017 at 09:44:50AM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:
> >
> >> Surely, even if we need to exclude some objects from an existing
> >> packfile due to these selection options, we should be able to reuse
> >> the non-excluded part, no?  The end result may involve having to
> >> pick and reuse more and smaller slices from existing packfiles,
> >> which may be much less efficient, but it is no immediately obvious
> >> to me if it leads to "need to disable".  I would understand it if it
> >> were "it becomes much less efficient and we are better off not using
> >> the bitmap code at all", though.
> >
> > Yes, it's this last bit. The main win of the packfile reuse code is that
> > it builds on the bitmaps to avoid doing as much per-object work as
> > possible. So the objects don't even get added to the list of "struct
> > object_entry", and we never consider them for the "should they be in the
> > result" checks beyond the have/want computation done by the bitmaps.
> >
> > We could add those checks in, but what's the point? The idea of the
> > reuse code is to be a fast path for serving vanilla clones. Searching
> > through all of the packfiles for a .keep entry is the antithesis of
> > that.
> 
> Ah, OK, and now I understand why you called this a "bug" (which is
> older and do not need to be addressed as part of 2.13) in the
> original message.  The new tests check requests that ought to
> produce an empty packfile as the result actually do, but with the
> current code, the reuse code does not work with --local and friends
> and ends up including what was requested to be excluded.

Right. Did you want me to try re-wording the commit message, or does it
make sense now?

-Peff


Re: Questions about validation/verification done for Git Software

2017-05-08 Thread Junio C Hamano
"Robertson, Todd"  writes:

> Is the software FDA certified?

No, Git is toxic and are not approved for human consumption ;-).



Re: [RFC PATCH 02/10] Move textconv_object to be with other textconv methods

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> The specific helpers have less visibility, which is good. The public
> functions a() and b() were already public, so no change. But now the
> common helper is public, too, even though nobody except a() and b() care
> about it.
>
> So it's a tradeoff. And the important question is: is the bleed-over
> between a() and b() worse than the common bits being made public?

At this point you are saying the same thing as I said ;-) 

I haven't touched diff.c for a while, but my impression was that it
was already a smallest logical unit, especially since some bits like
diff-lib.c (the interface for the front-end programs to drive the
machinery via the three standard pairs of sources) and ws.c (diff
subsystem to deal with whitespace errors) are already split out (and
needless to say, the diffcore transformations are all in their own
separate files).  Over time people may have added what does not
belong there merely for convenience, which may want to get ejected,
but I do not think of many instances of them offhand.

... goes and looks ...

It appears that the textconv related helpers (which was where this
discussion thread started from) could be first restructured so that
they do not depend on diff_filespec and turned into a more generic
"I have this path and a blob object name, please run textconv to it
and..." interface and then moved out of the file into its own
textconv.[ch].  It does not need access to quite a few fields in
diff_filespec structure (e.g.  it shouldn't care what filemode the
thing has).  The diff machinery wants the result in a contiguous
piece of memory and that could be a good starting point, but I said
"and..." above because I am not sure if it is the best generic
thing the restructured interface can do to the result.







Re: git-clone --config order & fetching extra refs during initial clone

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> Good point. We can't really consider clone to be a blind "init + config
> + fetch + checkout" because those middle two steps sometimes overlap
> each other.  It really does need to be its own beast.
> ...
> The right solution there is probably pushing that logic down into the
> transport layer. Or at the very least abstracting it into a function so
> that both clone and fetch can call it without replicating the logic.
>
>> My patch deals with 'remote..refspec', i.e. 'remote->fetch'.
>> Apparently some extra care is necessary for 'remote..tagOpt' and
>> 'remote->fetch_tags', too.  Perhaps there are more, I haven't checked
>> again, and maybe we'll add similar config variables in the future.  So
>> I don't think that dealing with such config variables one by one in
>> 'git clone', too, is the right long-term solution...  but perhaps it's
>> sufficient for the time being?
>
> I think your patch is a strict improvement and we don't need to hold up
> waiting for a perfect fix (and because of the --single-branch thing you
> mentioned, this may be the best we can do anyway).

OK, so where does this patch stand now?  It already is too late for
the upcoming release, but should we merge it to 'next' once the
release is made, cook it in 'next' and shoot for the next release
as-is, or do we want to allow minor tweaks before it hits 'next'?

Thanks.


Re: [PATCH v2 1/1] t0027: tests are not expensive; remove t0025

2017-05-08 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Recent "stress" tests show that t0025 if flaky, reported by Lars Schneider,
>> larsxschnei...@gmail.com
>> 
>> All tests from t0025 are covered in t0027 already, so that t0025 can be
>> retiered:
>
> s/retiered/retired/
>
>> The execution time for t0027 is 14 seconds under Linux,
>> and 63 seconds under Mac Os X.
>> And in case you ask, things are not going significantly faster using a SSD
>> instead of a spinning disk.
>> 
>> Signed-off-by: Torsten Bögershausen 
>
> Thank you for this patch.
>
> Apart from the tyop, would it be possible to fix the formatting to look
> less strange? (Unless you use this to transport a super-secret message
> steganographically to an alien planet or some such, of course.)

Ping?

Thanks.


Hello dear,...

2017-05-08 Thread Makena Jelani
Hello dear,
I am Miss Makena Jelani. hope you remember me? is been a long time,
where have you been. well, contact me so we can talk
Makena.


Re: [PATCH] pack-objects: disable pack reuse for object-selection options

2017-05-08 Thread Junio C Hamano
Jeff King  writes:

> On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:
>
>> Surely, even if we need to exclude some objects from an existing
>> packfile due to these selection options, we should be able to reuse
>> the non-excluded part, no?  The end result may involve having to
>> pick and reuse more and smaller slices from existing packfiles,
>> which may be much less efficient, but it is no immediately obvious
>> to me if it leads to "need to disable".  I would understand it if it
>> were "it becomes much less efficient and we are better off not using
>> the bitmap code at all", though.
>
> Yes, it's this last bit. The main win of the packfile reuse code is that
> it builds on the bitmaps to avoid doing as much per-object work as
> possible. So the objects don't even get added to the list of "struct
> object_entry", and we never consider them for the "should they be in the
> result" checks beyond the have/want computation done by the bitmaps.
>
> We could add those checks in, but what's the point? The idea of the
> reuse code is to be a fast path for serving vanilla clones. Searching
> through all of the packfiles for a .keep entry is the antithesis of
> that.

Ah, OK, and now I understand why you called this a "bug" (which is
older and do not need to be addressed as part of 2.13) in the
original message.  The new tests check requests that ought to
produce an empty packfile as the result actually do, but with the
current code, the reuse code does not work with --local and friends
and ends up including what was requested to be excluded.

Thanks.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread brian m. carlson
On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
>  wrote:
> > PCRE and PCRE2 also tend to have a lot of security updates, so I would
> > prefer if we didn't import them into the tree.  It is far better for
> > users to use their distro's packages for PCRE, as it means they get
> > automatic security updates even if they're using an old Git.
> >
> > We shouldn't consider shipping anything with a remotely frequent history
> > of security updates in our tree, since people very frequently run old or
> > ancient versions of Git.
> 
> I'm aware of its security record[1], but I wonder what threat model
> you have in mind here. I'm not aware of any parts of git (except maybe
> gitweb?) where we take regexes from untrusted sources.
> 
> I.e. yes there have been DoS's & even some overflow bugs leading code
> execution in PCRE, but in the context of powering git-grep & git-log
> with PCRE this falls into the "stop hitting yourself" category.

Just because you don't drive Git with untrusted regexes doesn't mean
other people don't.  It's not a good idea to require a stronger security
model than we absolutely have to, since people can and will violate it.
Think how devastating Shellshock was even though technically nobody
should provide insecure environment variables to the shell.

And, yes, gitweb does in fact call git grep.  That means that git grep
must in fact be secure against untrusted regexes, or you have a remote
code execution vulnerability.

Furthermore, at work we distribute Git with all releases of our product.
We normally only do non-security updates to the last couple of releases,
but we provide security updates to all supported versions.  I'm not
comfortable shipping the entirety of PCRE or PCRE2 to customers without
providing security updates, so you're going to make my job (and my
coworkers') a lot harder by shipping it.  Please don't.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files

2017-05-08 Thread Junio C Hamano
Samuel Lijin  writes:

> What happens right now is that because (1) directories containing only
> untracked and ignored files are considered "untracked" and (2)
> read_directory_recursive() skips over untracked directories, even with
> DIR_SHOW_IGNORED_TOO set. As a result, `status --ignored` never lists
> ignored files if they're in an "untracked" directory (and this is the
> currently defined behavior per t7061).
>
> By teaching read_directory_recursive() to recurse into untracked
> directories in search of ignored files when DIR_SHOW_IGNORED_TOO is
> set, though, `status --ignored` now learns to report the existence of
> these ignored files, whereas previously it did not.

OK, if you are revisiting the design decision made by eb8c5b87
("git-status: Test --ignored behavior", 2012-12-30), we should
clearly document it in the log message of the commit that does so in
a way similar to how eb8c5b87 described the behaviour it desired.

Thanks.


Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread Ævar Arnfjörð Bjarmason
On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
 wrote:
> On Mon, May 08, 2017 at 04:10:41PM +0900, Junio C Hamano wrote:
>> Ævar Arnfjörð Bjarmason  writes:
>>
>> > This won't be in my next PCRE submission, but I have a path locally to
>> > simply import PCRE into git.git as compat/pcre2, so it can be compiled
>> > with NO_PCRE=Y similar to how NO_REGEX=Y works.
>> >
>> > This will hopefully address your concerns partially, i.e. when you do
>> > want to try it out it'll be easier.
>>
>> Eek, please don't.
>>
>> Until pcre2 becomes _so_ stable that all reasonable distros give
>> choice to the end-users to install it easily in a packaged form,
>> such a "not a fork, but a copy" will be a moving target that I'd
>> rather not to have in compat/.  IOW, our compat/$pkg should be a
>> last resort to help those on distros that are so hard to convince to
>> carry the version/variant of $pkg we would like to use.

The reason I'm looking into this is because the WIP part of my PCRE
branch has changes which start to use PCRE as a general matching
engine in git, even. I.e.:

* git grep -F will be powered by it rather than kwset (which'll be git rm'd)
* Long standing limitations with \0s in regexes go away.
* grep -G and -E will use PCRE via a WIP POSIX -> PCRE  pattern
translator (https://bugs.exim.org/show_bug.cgi?id=2106)
* Perhaps we can remove compat/regex/ entirely & use PCRE via its
POSIX emulation mode + pattern translator (we use regcomp/regexec a
lot for non-grep/log), I'm not sure yet.

I have messy but working code for this in a WIP branch, here's the
performance improvement against linux.git:

Test   v2.13.0-rc2   HEAD
---
7820.1: basic grep how.to  0.31(1.20+0.46)
0.19(0.33+0.55) -38.7%
7820.2: extended grep how.to   0.31(1.19+0.46)
0.19(0.33+0.55) -38.7%
7820.3: perl grep how.to   0.30(1.12+0.46)
0.19(0.28+0.62) -36.7%
7820.5: basic grep ^how to 0.31(1.24+0.39)
0.19(0.32+0.56) -38.7%
7820.6: extended grep ^how to  0.30(1.18+0.44)
0.19(0.22+0.66) -36.7%
7820.7: perl grep ^how to  0.55(2.68+0.41)
0.19(0.32+0.56) -65.5%
7820.9: basic grep [how] to0.47(2.17+0.44)
0.22(0.39+0.54) -53.2%
7820.10: extended grep [how] to0.47(2.21+0.40)
0.22(0.39+0.55) -53.2%
7820.11: perl grep [how] to0.53(2.64+0.39)
0.21(0.37+0.58) -60.4%
7820.13: basic grep \(e.t[^ ]*\|v.ry\) rare0.63(3.16+0.48)
0.21(0.48+0.53) -66.7%
7820.14: extended grep (e.t[^ ]*|v.ry) rare0.64(3.28+0.38)
0.21(0.45+0.57) -67.2%
7820.15: perl grep (e.t[^ ]*|v.ry) rare1.00(5.77+0.37)
0.21(0.50+0.53) -79.0%
7820.17: basic grep m\(ú\|u\)ult.b\(æ\|y\)te   0.31(1.23+0.51)
0.19(0.30+0.58) -38.7%
7820.18: extended grep m(ú|u)ult.b(æ|y)te  0.32(1.26+0.47)
0.19(0.27+0.61) -40.6%
7820.19: perl grep m(ú|u)ult.b(æ|y)te  0.36(1.61+0.37)
0.19(0.30+0.57) -47.2%
7821.1: fixed grep int 0.52(1.71+0.64)
0.43(1.11+0.72) -17.3%
7821.2: basic grep int 0.54(1.62+0.70)
0.42(1.14+0.62) -22.2%
7821.3: extended grep int  0.53(1.67+0.64)
0.51(1.17+0.62) -3.8%
7821.4: perl grep int  0.53(1.71+0.59)
0.72(1.13+0.63) +35.8%
7821.6: fixed grep -i int  0.58(1.86+0.67)
0.47(1.32+0.62) -19.0%
7821.7: basic grep -i int  0.62(1.94+0.61)
0.57(1.25+0.72) -8.1%
7821.8: extended grep -i int   0.82(1.86+0.68)
0.50(1.41+0.56) -39.0%
7821.9: perl grep -i int   0.70(1.88+0.68)
0.56(1.25+0.70) -20.0%
7821.11: fixed grep æ  0.33(1.30+0.43)
0.19(0.22+0.64) -42.4%
7821.12: basic grep æ  0.33(1.35+0.38)
0.18(0.26+0.59) -45.5%
7821.13: extended grep æ   0.33(1.20+0.52)
0.18(0.32+0.53) -45.5%
7821.14: perl grep æ   0.33(1.31+0.40)
0.18(0.28+0.56) -45.5%
7821.16: fixed grep -i æ   0.25(0.87+0.50)
0.18(0.24+0.60) -28.0%
7821.17: basic grep -i æ   0.26(0.88+0.48)
0.18(0.24+0.60) -30.8%
7821.18: extended grep -i æ0.26(0.92+0.44)
0.18(0.24+0.61) -30.8%
7821.19: perl grep -i æ0.25(0.79+0.45)
0.19(0.32+0.56) -24.0%

In case that comes out misformatted it's also available at
https://github.com/avar/git/commit/ee5b2040e2c697e22a73c7b5f07f1b1e591f07e3

I.e. grepping is sped up by 50% and more in many cases, even for -G
and -E patterns which now get translated internally into PCRE
patterns.

> PCRE and PCRE2 also tend to have a lot of security updates, so I would
> prefer if we didn't import them into the tree.  It is far better for
> users to use their distro's packages for PCRE, as it means they get
> automatic 

Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)

2017-05-08 Thread brian m. carlson
On Mon, May 08, 2017 at 04:10:41PM +0900, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  writes:
> 
> > This won't be in my next PCRE submission, but I have a path locally to
> > simply import PCRE into git.git as compat/pcre2, so it can be compiled
> > with NO_PCRE=Y similar to how NO_REGEX=Y works.
> >
> > This will hopefully address your concerns partially, i.e. when you do
> > want to try it out it'll be easier.
> 
> Eek, please don't.
> 
> Until pcre2 becomes _so_ stable that all reasonable distros give
> choice to the end-users to install it easily in a packaged form,
> such a "not a fork, but a copy" will be a moving target that I'd
> rather not to have in compat/.  IOW, our compat/$pkg should be a
> last resort to help those on distros that are so hard to convince to
> carry the version/variant of $pkg we would like to use.

PCRE and PCRE2 also tend to have a lot of security updates, so I would
prefer if we didn't import them into the tree.  It is far better for
users to use their distro's packages for PCRE, as it means they get
automatic security updates even if they're using an old Git.

We shouldn't consider shipping anything with a remotely frequent history
of security updates in our tree, since people very frequently run old or
ancient versions of Git.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH 02/10] Move textconv_object to be with other textconv methods

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 2:55 PM, Jeff King  wrote:
> On Mon, May 08, 2017 at 10:02:58AM +0900, Junio C Hamano wrote:
>
>> Stefan Beller  writes:
>>
>> > I guess it is ok for now and in this series, but we may want
>> > to split up diff.[ch] in the future into multiple finer grained files.
>>
>> For what end?  Such a split would require more symbols internal to
>> diff.[ch] to become external, which is a big downside, so we need to
>> have a large reward to compensate it if we were to go there.
>
> I think there are two sides to that coin. Let's say you have a file with
> five functions (and you can replace function with structs, global
> variables, etc; any unit of complexity that might or might not be
> externally visible):
>
>   /* called by both a() and b() */
>   static void a_and_b_helper();
>
>   /* called by a() */
>   static void a_helper();
>   void a();
>
>   /* called by b() */
>   static void b_helper();
>   void b();
>
> When they are in the same file, b() and b_helper() can see a_helper(),
> even though they don't need it. And that means increased complexity in
> dealing with a_helper(), because its visibility is larger than
> necessary. We have to worry about what a change to it might do to b()
> (or more realistically, c(), d(), etc).
>
> If we split this apart, we end up with three files:
>
>common.c:
>  void a_and_b_helper();
>
>a.c:
>  static void a_helper();
>  void a();
>
>b.c:
>  static void b_helper();
>  void b();
>
> The specific helpers have less visibility, which is good. The public
> functions a() and b() were already public, so no change. But now the
> common helper is public, too, even though nobody except a() and b() care
> about it.
>
> So it's a tradeoff. And the important question is: is the bleed-over
> between a() and b() worse than the common bits being made public?  That
> can't be answered without looking at how many distinct "a" and "b"-like
> chunks there are in the file, and what the common bits look like. I'm
> not sure of the answer for diff.c. Without digging, both ends of the
> spectrum seem equally plausible to me: that it is mostly a set of N
> distinct units which could be split apart, or that it really is a few
> public functions calling the same common core over and over.
>
> And a follow-on question is what we can do to mitigate the cost of
> making the common code public. We could advertise a_and_b_helper() only
> in diff-internal.h, and then makes it only semi-public (anybody can
> include that, of course, but including diff-internal.h seems like it
> ought to be a red flag). That only helps the programmer, though; we'd
> still be losing out on compiler optimizations and static analysis that
> only looks at one translation unit at a time.
>
> Phew. That ended up a little long-winded. All of it is to say that I
> don't think it makes sense to reject a split out-of-hand, but nor is it
> always a good thing. It depends on what's in the file.

I agree on this sentiment. It really depends on the content under
discussion whether it makes sense to split.

Having had some exposure recently for diff.[ch] (and I just picked up
that series again, but did not send it out yet), I have the impression that
we do have a lot of code in diff.c, which is quite unrelated to each
other, i.e. a lot of [a,b]_helper()s, and few a_and_b_helper() for the
already public functions.

May first mail was based on perceived unneeded complexity, which
slows down in achieving a goal.

Thanks,
Stefan


Re: [RFC PATCH 02/10] Move textconv_object to be with other textconv methods

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 10:02:58AM +0900, Junio C Hamano wrote:

> Stefan Beller  writes:
> 
> > I guess it is ok for now and in this series, but we may want
> > to split up diff.[ch] in the future into multiple finer grained files.
> 
> For what end?  Such a split would require more symbols internal to
> diff.[ch] to become external, which is a big downside, so we need to
> have a large reward to compensate it if we were to go there.

I think there are two sides to that coin. Let's say you have a file with
five functions (and you can replace function with structs, global
variables, etc; any unit of complexity that might or might not be
externally visible):

  /* called by both a() and b() */
  static void a_and_b_helper();

  /* called by a() */
  static void a_helper();
  void a();

  /* called by b() */
  static void b_helper();
  void b();

When they are in the same file, b() and b_helper() can see a_helper(),
even though they don't need it. And that means increased complexity in
dealing with a_helper(), because its visibility is larger than
necessary. We have to worry about what a change to it might do to b()
(or more realistically, c(), d(), etc).

If we split this apart, we end up with three files:

   common.c:
 void a_and_b_helper();

   a.c:
 static void a_helper();
 void a();

   b.c:
 static void b_helper();
 void b();

The specific helpers have less visibility, which is good. The public
functions a() and b() were already public, so no change. But now the
common helper is public, too, even though nobody except a() and b() care
about it.

So it's a tradeoff. And the important question is: is the bleed-over
between a() and b() worse than the common bits being made public?  That
can't be answered without looking at how many distinct "a" and "b"-like
chunks there are in the file, and what the common bits look like. I'm
not sure of the answer for diff.c. Without digging, both ends of the
spectrum seem equally plausible to me: that it is mostly a set of N
distinct units which could be split apart, or that it really is a few
public functions calling the same common core over and over.

And a follow-on question is what we can do to mitigate the cost of
making the common code public. We could advertise a_and_b_helper() only
in diff-internal.h, and then makes it only semi-public (anybody can
include that, of course, but including diff-internal.h seems like it
ought to be a red flag). That only helps the programmer, though; we'd
still be losing out on compiler optimizations and static analysis that
only looks at one translation unit at a time.

Phew. That ended up a little long-winded. All of it is to say that I
don't think it makes sense to reject a split out-of-hand, but nor is it
always a good thing. It depends on what's in the file.

-Peff


Re: [PATCH v3 00/53] object_id part 8

2017-05-08 Thread Jonathan Tan

On 05/06/2017 03:09 PM, brian m. carlson wrote:

This is the eighth series of patches to convert unsigned char [20] to
struct object_id.  This series converts lookup_commit, lookup_blob,
lookup_tree, lookup_tag, and finally parse_object to struct object_id.


I patched v2 (which I have reviewed) and v3 against master, and diffed 
the two, and it looks good to me. (There was a conflict in 
builtin/tag.c, but it is a trivial one.)


Re: [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files

2017-05-08 Thread Samuel Lijin
On Sun, May 7, 2017 at 11:26 PM, Junio C Hamano  wrote:
> Samuel Lijin  writes:
>
>> Addresses the issues raised by Stefan and Junio (thanks for your
>> feedback) about not using C99-style comments and keeping tests
>> working on every commit to prevent breaking git bisect. (About the
>> latter one: is it necessary to prevent compiler warnings, in
>> addition to compiler errors? Because if so I should probably
>> squash some of the commits together.)
>
> Some of us build with -Werror, so yes.  If by "squashing" you mean
> "instead of piling a fix on top of a broken patch, I need to do
> things right from the beginning", then yes, please do so, not just
> for compiler warnings but for all forms of changes.

Got it - will keep this in mind when I reroll the patch series.

>> Note that this introduces a breaking change in the behavior of git
>> status: when invoked with --ignored, git status will now return
>> ignored files in an untracked directory, whereas previously it
>> would not.
>
> What do you mean by a "breaking change"?  Is it just "a new bug"?
> Or "the current behaviour is logically broken, but people and
> scripts might have relied on that odd behaviour and fixing it this
> late in the game would break their expectations"?

The latter, as I believe you noticed in your reply about patch 9/9.

What happens right now is that because (1) directories containing only
untracked and ignored files are considered "untracked" and (2)
read_directory_recursive() skips over untracked directories, even with
DIR_SHOW_IGNORED_TOO set. As a result, `status --ignored` never lists
ignored files if they're in an "untracked" directory (and this is the
currently defined behavior per t7061).

By teaching read_directory_recursive() to recurse into untracked
directories in search of ignored files when DIR_SHOW_IGNORED_TOO is
set, though, `status --ignored` now learns to report the existence of
these ignored files, whereas previously it did not.

>> It's possible that there are standard practices that I might have
>> missed, so if there is anything along those lines, I'd appreciate
>> you letting me know. (As an aside, about the git bisect thing: is
>> there a script somewhere that people use to test patch series
>> before sending them out?)
>
> I hear that people use variations of
>
> git rebase -x "make test"
>
> on their topic.

Aha - thanks.


[PATCH v2 1/2] docs: correct receive.advertisePushOptions default

2017-05-08 Thread Jonathan Tan
In commit c714e45 ("receive-pack: implement advertising and receiving
push options", 2016-07-14), receive-pack was taught to (among other
things) advertise that it understood push options, depending on
configuration. It was documented that it advertised such ability by
default; however, it actually does not. (In that commit, notice that
advertise_push_options defaults to 0, unlike advertise_atomic_push which
defaults to 1.)

Update the documentation to state that it does not advertise the ability
by default.

Signed-off-by: Jonathan Tan 
---
 Documentation/config.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d5..3a847a62e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2620,9 +2620,8 @@ receive.advertiseAtomic::
capability, set this variable to false.
 
 receive.advertisePushOptions::
-   By default, git-receive-pack will advertise the push options
-   capability to its clients. If you don't want to advertise this
-   capability, set this variable to false.
+   When set to true, git-receive-pack will advertise the push options
+   capability to its clients. False by default.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
-- 
2.13.0.rc1.294.g07d810a77f-goog



[PATCH v2 2/2] receive-pack: verify push options in cert

2017-05-08 Thread Jonathan Tan
In commit f6a4e61 ("push: accept push options", 2016-07-14), send-pack
was taught to include push options both within the signed cert (if the
push is a signed push) and outside the signed cert; however,
receive-pack ignores push options within the cert, only handling push
options outside the cert.

Teach receive-pack, in the case that push options are provided for a
signed push, to verify that the push options both within the cert and
outside the cert are consistent.

This sets in stone the requirement that send-pack redundantly send its
push options in 2 places, but I think that this is better than the
alternatives. Sending push options only within the cert is
backwards-incompatible with existing Git servers (which read push
options only from outside the cert), and sending push options only
outside the cert means that the push options are not signed for.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/pack-protocol.txt | 32 +++
 builtin/receive-pack.c| 51 +--
 t/t5534-push-signed.sh| 37 ++
 3 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 5b0ba3ef2..a34917153 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -473,13 +473,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt. Then the push options are transmitted
-one per packet followed by another flush-pkt. After that the packfile that
-should contain all the objects that the server will need to complete the new
-references will be sent.
+This list is followed by a flush-pkt.
 
 
-  update-request=  *shallow ( command-list | push-cert ) [packfile]
+  update-requests   =  *shallow ( command-list | push-cert )
 
   shallow   =  PKT-LINE("shallow" SP obj-id)
 
@@ -500,12 +497,35 @@ references will be sent.
  PKT-LINE("pusher" SP ident LF)
  PKT-LINE("pushee" SP url LF)
  PKT-LINE("nonce" SP nonce LF)
+ *PKT-LINE("push-option" SP push-option LF)
  PKT-LINE(LF)
  *PKT-LINE(command LF)
  *PKT-LINE(gpg-signature-lines LF)
  PKT-LINE("push-cert-end" LF)
 
-  packfile  = "PACK" 28*(OCTET)
+  push-option   =  1*( VCHAR | SP )
+
+
+If the server has advertised the 'push-options' capability and the client has
+specified 'push-options' as part of the capability list above, the client then
+sends its push options followed by a flush-pkt.
+
+
+  push-options  =  *PKT-LINE(push-option) flush-pkt
+
+
+For backwards compatibility with older Git servers, if the client sends a push
+cert and push options, it MUST send its push options both embedded within the
+push cert and after the push cert. (Note that the push options within the cert
+are prefixed, but the push options after the cert are not.) Both these lists
+MUST be the same, modulo the prefix.
+
+After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
+
+
+  packfile  =  "PACK" 28*(OCTET)
 
 
 If the receiving end does not support delete-refs, the sending end MUST
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f96834f42..fff5c7a54 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -473,7 +473,8 @@ static char *prepare_push_cert_nonce(const char *path, 
unsigned long stamp)
  * after dropping "_commit" from its name and possibly moving it out
  * of commit.c
  */
-static char *find_header(const char *msg, size_t len, const char *key)
+static char *find_header(const char *msg, size_t len, const char *key,
+const char **next_line)
 {
int key_len = strlen(key);
const char *line = msg;
@@ -486,6 +487,8 @@ static char *find_header(const char *msg, size_t len, const 
char *key)
if (line + key_len < eol &&
!memcmp(line, key, key_len) && line[key_len] == ' ') {
int offset = key_len + 1;
+   if (next_line)
+   *next_line = *eol ? eol + 1 : eol;
return xmemdupz(line + offset, (eol - line) - offset);
}
line = *eol ? eol + 1 : NULL;
@@ -495,7 +498,7 @@ static char *find_header(const char *msg, size_t len, const 
char *key)
 
 static const char *check_nonce(const char *buf, size_t len)
 {
-   char *nonce = find_header(buf, len, "nonce");
+   char *nonce = find_header(buf, len, "nonce", NULL);
unsigned long stamp, ostamp;

[PATCH v2 0/2] Clarify interaction between signed pushes and push options

2017-05-08 Thread Jonathan Tan
Changes from v1:
 - merged last 2 patches
 - patch 1:
   - updated advertisePushOptions doc to say "False by default"
 - patch 2 (formerly 2-3):
   - reject mismatching push options (similar to how a pre-receive hook
 can reject a push) instead of merely reporting it
   - added test to check failure (in addition to success)
   - modified pack-protocol.txt to describe new behavior

Jonathan Tan (2):
  docs: correct receive.advertisePushOptions default
  receive-pack: verify push options in cert

 Documentation/config.txt  |  5 ++-
 Documentation/technical/pack-protocol.txt | 32 +++
 builtin/receive-pack.c| 51 +--
 t/t5534-push-signed.sh| 37 ++
 4 files changed, 114 insertions(+), 11 deletions(-)

-- 
2.13.0.rc1.294.g07d810a77f-goog



Re: [PATCH 3/3] protocol docs: explain receive-pack push options

2017-05-08 Thread Jonathan Tan

On 05/05/2017 05:26 PM, Jonathan Nieder wrote:

-This list is followed by a flush-pkt. Then the push options are transmitted
-one per packet followed by another flush-pkt. After that the packfile that
-should contain all the objects that the server will need to complete the new
-references will be sent.
+This list is followed by a flush-pkt.


I think this removed more than intended.

Before c714e45f (receive-pack: implement advertising and receiving
push options, 2016-07-14), this said

This list is followed by a flush-pkt and then the packfile that should
contain all the objects that the server will need to complete the new
references.

which I think would work fine.


That wouldn't work fine because there is no mention of push options - 
hence the modification in c714e45f to talk about push options, but that 
is also not accurate because push options (and, more importantly, the 
associated flush-pkt) are not sent if the client doesn't have any.



[...]

-  update-request=  *shallow ( command-list | push-cert ) [packfile]
+  update-request=  *shallow ( command-list | push-cert )


This seems confusing to me both before and after.  How does
update-request get used?  Is it supposed to include the packfile or not?

Where do push-options fit into an unsigned request?


I've updated "update-request" to "update-requests" to show that this is 
the "list of reference update requests" mentioned in the previous paragraph.


This is only the ref part - I've chosen to describe push options and 
packfile in separate sections, because there are very specific rules 
that determine whether the push option section must be included or omitted.




[...]

@@ -500,12 +497,35 @@ references will be sent.
  PKT-LINE("pusher" SP ident LF)
  PKT-LINE("pushee" SP url LF)
  PKT-LINE("nonce" SP nonce LF)
+ *PKT-LINE("push-option" SP push-option LF)
  PKT-LINE(LF)
  *PKT-LINE(command LF)
  *PKT-LINE(gpg-signature-lines LF)
  PKT-LINE("push-cert-end" LF)

-  packfile  = "PACK" 28*(OCTET)
+  push-option   =  1*CHAR
+
+
+If the server has advertised the 'push-options' capability and the client has
+specified 'push-options' as part of the capability list above, the client then
+sends its push options followed by a flush-pkt.
+
+
+  push-options  =  *PKT-LINE(push-option) flush-pkt
+
+
+For backwards compatibility with older Git servers, if the client sends a push
+cert and push options, it SHOULD send its push options both embedded within the


This needs to be a MUST, not SHOULD.


+push cert and after the push cert. (Note that the push options within the cert
+are prefixed, but the push options after the cert are not.) Both these lists
+SHOULD be consistent.


Likewise this one.

What does it mean to be consistent?


Changed to "MUST be the same, modulo the prefix".


Re: [PATCH v3 0/6] rebase -i: add config to abbreviate command-names

2017-05-08 Thread Liam Beguin
Hi,

On Mon, 2017-05-08 at 09:27 +0900, Junio C Hamano wrote:
> Liam Beguin  writes:
> 
> > Sorry for the delay, I don't mind switching to C but it would probably
> > be easier to see if the scripted version gets approved first.
> > If it does, I could then get started on the C implementation.
> > If you prefer I could also forget about the scripted version, make a C
> > implementation work and see if that gets approved.
> 
> I am not sure what "approved" would mean in the context of this
> project, though ;-) Your patch to the scripted version would
> certainly not be in the upcoming release.  If you define the
> "approval" as "it is queued to my tree somewhere", the patch would
> start its life like everybody else by getting merged to the 'pu'
> branch, where there already is a topic that removes the code you
> patch your enhancement into.
> 

By "approved", I guess I meant the list reaches an agreement. 

> The list _can_ agree that it is a good idea to have an option to
> populate the todo list with shortened insn words from the beginning
> (instead of merely accepting a short-hand while executing), which is
> what your patch wants to do, without actually having the updated
> scripted "rebase -i" merged in any of the integration branches in my
> tree.  If you meant by "approval" to have such a list concensus, I
> think you may already have one.  I personally do not think it is a
> great idea but I do not think it is a horrible one, either.  As long
> as it is an opt-in feature that many people find useful (which may
> be the case already, judging from the list traffic), I do not mind
> ;-)
> 

Ok, based on this, I'll send a new series based on the 'pu' branch.

Thanks again, 
Liam


Re: [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files

2017-05-08 Thread Samuel Lijin
On Sun, May 7, 2017 at 2:12 PM, Torsten Bögershausen  wrote:
> On 2017-05-05 12:46, Samuel Lijin wrote:
>> If git sees a directory which contains only untracked and ignored
>> files, clean -d should not remove that directory. It was recently
>> discovered that this is *not* true of git clean -d, and it's possible
>> that this has never worked correctly; this test and its accompanying
>> patch series aims to fix that.
>>
>> Signed-off-by: Samuel Lijin 
>> ---
>>  t/t7300-clean.sh | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
>> index b89fd2a6a..252c75b40 100755
>> --- a/t/t7300-clean.sh
>> +++ b/t/t7300-clean.sh
>> @@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs 
>> (pathspec is prefix of dir)
>>   test_path_is_dir foobar
>>  '
>>
>> +test_expect_failure 'git clean -d skips untracked dirs containing ignored 
>> files' '
>> + echo /foo/bar >.gitignore &&
>> + rm -rf foo &&
>> + mkdir -p foo &&
> Minor remark:
> Do we need the -p here, or can it be dropped?

Yeah, the -p isn't needed here - will change when I reroll.

>> + touch foo/bar &&
>> + git clean -df &&
>> + test_path_is_file foo/bar &&
>> + test_path_is_dir foo
>> +'
>> +
>>  test_done
>>
>


[PATCH v2] git_fopen: fix a sparse 'not declared' warning

2017-05-08 Thread Ramsay Jones

If git is built with the FREAD_READS_DIRECTORIES build variable set, this
would cause sparse to issue a 'not declared, should it be static?' warning
on Linux. This is a result of the method employed by 'compat/fopen.c' to
suppress the (possible) redefinition of the (system) fopen macro, which
also removes the extern declaration of the git_fopen function.

In order to suppress the warning, introduce a new macro to suppress the
definition (or possibly the re-definition) of the fopen symbol as a macro
override. This new macro (SUPPRESS_FOPEN_REDEFINITION) is only defined in
'compat/fopen.c', just prior to the inclusion of the 'git-compat-util.h'
header file.

Signed-off-by: Ramsay Jones 
---

Hi Junio,

I noticed the "What's cooking" entry for Duy's 'nd/fopen-errors'
branch 'needs to resurrect' this patch, so I re-worded the commit
message (because commit 8f4f6e53d2 is no more ...) so that it could
stand alone. (or on the new re-rolled branch).

Thanks.

ATB,
Ramsay Jones

 compat/fopen.c|  4 ++--
 git-compat-util.h | 10 ++
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/compat/fopen.c b/compat/fopen.c
index b5ca142fe..107b3e818 100644
--- a/compat/fopen.c
+++ b/compat/fopen.c
@@ -1,14 +1,14 @@
 /*
  *  The order of the following two lines is important.
  *
- *  FREAD_READS_DIRECTORIES is undefined before including git-compat-util.h
+ *  SUPPRESS_FOPEN_REDEFINITION is defined before including git-compat-util.h
  *  to avoid the redefinition of fopen within git-compat-util.h. This is
  *  necessary since fopen is a macro on some platforms which may be set
  *  based on compiler options. For example, on AIX fopen is set to fopen64
  *  when _LARGE_FILES is defined. The previous technique of merely undefining
  *  fopen after including git-compat-util.h is inadequate in this case.
  */
-#undef FREAD_READS_DIRECTORIES
+#define SUPPRESS_FOPEN_REDEFINITION
 #include "../git-compat-util.h"
 
 FILE *git_fopen(const char *path, const char *mode)
diff --git a/git-compat-util.h b/git-compat-util.h
index df7b41524..33e06cac1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -694,10 +694,12 @@ char *gitstrdup(const char *s);
 #endif
 
 #ifdef FREAD_READS_DIRECTORIES
-#ifdef fopen
-#undef fopen
-#endif
-#define fopen(a,b) git_fopen(a,b)
+# if !defined(SUPPRESS_FOPEN_REDEFINITION)
+#  ifdef fopen
+#   undef fopen
+#  endif
+#  define fopen(a,b) git_fopen(a,b)
+# endif
 extern FILE *git_fopen(const char*, const char*);
 #endif
 
-- 
2.12.0


[PATCH v2] archive-tar: fix a sparse 'constant too large' warning

2017-05-08 Thread Ramsay Jones

Commit dddbad728c ("timestamp_t: a new data type for timestamps",
26-04-2017) introduced a new typedef 'timestamp_t', as a synonym for an
unsigned long, which was used at the time to represent timestamps in
git. A later commit 28f4aee3fb ("use uintmax_t for timestamps",
26-04-2017) changed the typedef to use an 'uintmax_t' for the timestamp
representation type.

When building on a 32-bit Linux system, sparse complains that a constant
(USTAR_MAX_MTIME) used to detect a 'far-future mtime' timestamp, is too
large; 'warning: constant 0777UL is so big it is unsigned long
long' on lines 335 and 338 of archive-tar.c. Note that both gcc and
clang only issue a warning if this constant is used in a context that
requires an 'unsigned long' (rather than an uintmax_t). (Since TIME_MAX
is no longer equal to 0x, even on a 32-bit system, the macro
USTAR_MAX_MTIME is set to 0777UL, which cannot be represented as
an 'unsigned long' constant).

In order to suppress the warning, change the definition of the macro
constant USTAR_MAX_MTIME to use an 'ULL' type suffix.

In a similar vein, on systems which use a 64-bit representation of the
'unsigned long' type, the USTAR_MAX_SIZE constant macro is defined with
the value 0777ULL. Although this does not cause any warning
messages to be issued, it would be more appropriate for this constant
to use an 'UL' type suffix rather than 'ULL'.

Signed-off-by: Ramsay Jones 
---

Hi Junio,

Sorry for being a bit slow with this, but this is the v2 version
of the patch that I promised, which includes the change to the
USTAR_MAX_SIZE macro that Johannes requested.

Thanks.

ATB,
Ramsay Jones

 archive-tar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 319a5b1c7..073e60ebd 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -28,12 +28,12 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
 #if ULONG_MAX == 0x
 #define USTAR_MAX_SIZE ULONG_MAX
 #else
-#define USTAR_MAX_SIZE 0777ULL
+#define USTAR_MAX_SIZE 0777UL
 #endif
 #if TIME_MAX == 0x
 #define USTAR_MAX_MTIME TIME_MAX
 #else
-#define USTAR_MAX_MTIME 0777UL
+#define USTAR_MAX_MTIME 0777ULL
 #endif
 
 /* writes out the whole block, but only if it is full */
-- 
2.12.0


Re: git and the Clang Static Analyzer

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 09:26:11PM +0200, Дилян Палаузов wrote:

> Click on  https://mail.aegee.org/dpa/scan-build-git-4fa66c85f11/ and
> then on "fast-import.c: line 2057 -> View Report" and you will see
> pointless assignment.
> 
> I cannot organize the report much better, as filtering out the false
> positives requires usually too deep understanding of the code
> organization of git, which I do not have.

Right, but I think Johannes's point is that we already know about
scan-build. It's been discussed on the list multiple times. The next
step is somebody actually looking at each reported issue, fixing any
problematic ones, and suppressing (or at least cataloging) the false
positives.

Here's a link to the last discussion on this topic, from a few months
ago:

  
http://public-inbox.org/git/cajzjrdxp3n5folx4reekbjt7jbmpuqk4qdutm6kpvmvumwc...@mail.gmail.com/T/#u

It even involves a Travis build. :)

-Peff


Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Jeff Hostetler



On 5/8/2017 4:03 PM, Christian Couder wrote:

On Mon, May 8, 2017 at 6:50 PM, Jeff Hostetler  wrote:



On 5/8/2017 5:45 AM, Christian Couder wrote:


This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
set on my Linux machine.

Also it looks like you sent a v8 of this patch series with a different
test, but what is in master looks like the above test instead of the
test in your v8.


There was concern about using sed on a binary file not being portable
and a request to change the test to just corrupt the checksum rather
than an index-entry, so I changed it in v8.

Does the v8 version of the test also fail on your machine ?


The v8 version of the test succeeds on my machine.


OK, thanks.  It worked on mine too.  Since the v8 version is a
better test, I'm content to stick with it and not try to address
the problem with the previous version.

Jeff


Re: Commits messages numbered only until 10 when squashed

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 04:13:51PM -0300, Luciano Moreira wrote:

> Git version: 2.11.0
> 
> THE CASE:
> I have 15 commit to be squashed (the hashes are real, but the commit
> messages were changed for privacy). When it's rebase interactively
> squashing all the 15 commits, the numbering goes from 1 to 10 and
> after it starts again from 1 to 5.

This is fixed already by 356b8ecff (rebase--interactive: count squash
commits above 10 correctly, 2017-01-07). It's in v2.11.1.

-Peff


Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Christian Couder
On Mon, May 8, 2017 at 6:50 PM, Jeff Hostetler  wrote:
>
>
> On 5/8/2017 5:45 AM, Christian Couder wrote:
>>
>> This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
>> set on my Linux machine.
>>
>> Also it looks like you sent a v8 of this patch series with a different
>> test, but what is in master looks like the above test instead of the
>> test in your v8.
>
> There was concern about using sed on a binary file not being portable
> and a request to change the test to just corrupt the checksum rather
> than an index-entry, so I changed it in v8.
>
> Does the v8 version of the test also fail on your machine ?

The v8 version of the test succeeds on my machine.


Re: git and the Clang Static Analyzer

2017-05-08 Thread Дилян Палаузов

Hello Johannes,


I compiled git/master


... which advances from time to time, so you definitely want to include a
more informative data point here, e.g. 4fa66c85f11 (Git 2.13-rc2,
2017-05-04) ...



The 4fa66c85f11 you mentioned is part of the URL I sent.


Please note, that the information is only about what gets actually compiled,
code disabled by #if .. #endif is not considered (e.g. when determining
whether a variable assignment is useless).


So you already know that the report is specific to your setup. It may make
a lot of sense to actually state what your setup is, i.e. Operating
System, installed libraries (and their respective versions), CPU, etc.


I don't think this is of much relevance.  The hints provided encourage one to 
look at the code and to evaluate mentally the lines.  By tweaking the 
preprocessor directives, you could get less warnings (a previously unused 
variable now appears within an asser()), or more warnings (as more code gets 
compiled).  Getting more warnings makes sense, after the current ones are 
processed.  Getting less warnings means (again) compiling more code.  I use 
already pcre and openssl, what else can I enable?
 

There are probably false-positives.


Probably. So why don't you give it a try and look through the report? Then
summarize your findings here. That would definitely find a warm welcome, I
would expect.


However in case of e.g. builtin/notes.c:1018, builtin/reset.c:294 or
fast-import.c:2057 I consider the hints as justified.


Okay. And those hint are...?


Click on  https://mail.aegee.org/dpa/scan-build-git-4fa66c85f11/ and then on 
"fast-import.c: line 2057 -> View Report" and you will see pointless assignment.

I cannot organize the report much better, as filtering out the false positives 
requires usually too deep understanding of the code organization of git, which 
I do not have.

This is the analysis done on the pu-branch:
  https://mail.aegee.org/dpa/scan-build-git-7dd243c75

Both reports do not list files in the same order, as I did parallel builds, but 
I do not see on the spot any difference.

Learning Travis is not on my priority list, I sent the commands I called to get 
the report.  I also compiled clang by myself.  For those who mistrust sites, 
there are no-javascipt, no-css browsers like lynx.

Greetings
  Dilyan


Commits messages numbered only until 10 when squashed

2017-05-08 Thread Luciano Moreira
Git version: 2.11.0

THE CASE:
I have 15 commit to be squashed (the hashes are real, but the commit
messages were changed for privacy). When it's rebase interactively
squashing all the 15 commits, the numbering goes from 1 to 10 and
after it starts again from 1 to 5.

- The first line says "This is a combination of 5 commits" but the
real amount of commit is 15.
- #5 coincidentally is the last commit in the concatenated messages to
build the new commit message
- The order of the commit messages joined were: 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 1, 2, 3, 4, 5

WHAT WAS DONE ?
git rebase -i 4e4f37c3~1

pick 4e4f37c3
s 72742a70
s db800d70
s 2bcd475a
s 5e5eb85f
s 4422b1e5
s af19b0e2
s 3d64165b
s faeb2ce9
s c7dc6b38
s edeff8d9
s bd451d5b
s 851bf828
s 9cae1f95
s b79466fb


WHEN THE COMMIT MESSAGE IS EDITED: The commit hashes are real, but the
messages were changed for privacy.

# This is a combination of 5 commits.
# This is the 1st commit message:
Terminal redim Ok.

Fix #64

# This is the commit message #2:

Script improved.


# This is the commit message #3:

Added some environment variables.

# This is the commit message #4:

Improved the redim command.

# This is the commit message #5:

+Fix

# This is the commit message #6:

Fixed header and footer.

# This is the commit message #7:

+fix

# This is the commit message #8:

+fix

# This is the commit message #9:

+fix

# This is the commit message #10:

+fix

# This is the commit message #1:

+fix

# This is the commit message #2:

+fix

# This is the commit message #3:

Changed some environment variables.

# This is the commit message #4:

+fix

# This is the commit message #5:

+fix

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
...

--
Luciano Moreira


[PATCH] diff.c: Fix whitespace issues due to a mismerge(?)

2017-05-08 Thread Stefan Beller
Re-indent lines, as they were off by one.
When a line was not indented (as you would expect from function defintions)
we had a stray whitespace preceding the line.
Indented lines have a stray white space after the indentation by tabs,
before the actual text, i.e.

if (...)
...

It looks like all these lines were introduced by one of the conflict chunks
in 4af9a7d344 (Merge branch 'bc/object-id', 2016-09-19). Viewing that
commit in gitk, the indentation seems fine, i.e. there is just one
whitespace in front of the lines, as you would expect from a formatted
patch.

Signed-off-by: Stefan Beller 
---

 Junio, 
 I do not think it is worth to apply this patch on its own,
 but maybe it is worth to investigate your setup? (Assuming it is
 git that did the merge, we may have a bug in whitespacing and
 merge conflicts.)
 
 Thanks,
 Stefan

 apply.c | 310 
 1 file changed, 155 insertions(+), 155 deletions(-)

diff --git a/apply.c b/apply.c
index e6dbab26ad..be340991d7 100644
--- a/apply.c
+++ b/apply.c
@@ -4098,172 +4098,172 @@ static int build_fake_ancestor(struct apply_state 
*state, struct patch *list)
 return 0;
  }
 
- static void stat_patch_list(struct apply_state *state, struct patch *patch)
- {
-int files, adds, dels;
-
-for (files = adds = dels = 0 ; patch ; patch = patch->next) {
-files++;
-adds += patch->lines_added;
-dels += patch->lines_deleted;
-show_stats(state, patch);
-}
-
-print_stat_summary(stdout, files, adds, dels);
- }
+static void stat_patch_list(struct apply_state *state, struct patch *patch)
+{
+   int files, adds, dels;
 
- static void numstat_patch_list(struct apply_state *state,
+   for (files = adds = dels = 0 ; patch ; patch = patch->next) {
+   files++;
+   adds += patch->lines_added;
+   dels += patch->lines_deleted;
+   show_stats(state, patch);
+   }
+
+   print_stat_summary(stdout, files, adds, dels);
+}
+
+static void numstat_patch_list(struct apply_state *state,
struct patch *patch)
- {
-for ( ; patch; patch = patch->next) {
-const char *name;
-name = patch->new_name ? patch->new_name : patch->old_name;
-if (patch->is_binary)
-printf("-\t-\t");
-else
-printf("%d\t%d\t", patch->lines_added, 
patch->lines_deleted);
-write_name_quoted(name, stdout, state->line_termination);
-}
- }
+{
+   for ( ; patch; patch = patch->next) {
+   const char *name;
+   name = patch->new_name ? patch->new_name : patch->old_name;
+   if (patch->is_binary)
+   printf("-\t-\t");
+   else
+   printf("%d\t%d\t", patch->lines_added, 
patch->lines_deleted);
+   write_name_quoted(name, stdout, state->line_termination);
+   }
+}
 
- static void show_file_mode_name(const char *newdelete, unsigned int mode, 
const char *name)
- {
-if (mode)
-printf(" %s mode %06o %s\n", newdelete, mode, name);
+static void show_file_mode_name(const char *newdelete, unsigned int mode, 
const char *name)
+{
+   if (mode)
+   printf(" %s mode %06o %s\n", newdelete, mode, name);
 else
-printf(" %s %s\n", newdelete, name);
- }
+   printf(" %s %s\n", newdelete, name);
+}
 
- static void show_mode_change(struct patch *p, int show_name)
- {
-if (p->old_mode && p->new_mode && p->old_mode != p->new_mode) {
-if (show_name)
-printf(" mode change %06o => %06o %s\n",
+static void show_mode_change(struct patch *p, int show_name)
+{
+   if (p->old_mode && p->new_mode && p->old_mode != p->new_mode) {
+   if (show_name)
+   printf(" mode change %06o => %06o %s\n",
p->old_mode, p->new_mode, p->new_name);
-else
-printf(" mode change %06o => %06o\n",
-   p->old_mode, p->new_mode);
-}
- }
+   else
+   printf(" mode change %06o => %06o\n",
+  p->old_mode, p->new_mode);
+   }
+}
 
- static void show_rename_copy(struct patch *p)
- {
-const char *renamecopy = p->is_rename ? "rename" : "copy";
-const char *old, *new;
-
-/* Find common prefix */
-old = p->old_name;
-new = p->new_name;
-while (1) {
-const char *slash_old, *slash_new;
-slash_old = strchr(old, '/');
-slash_new = strchr(new, '/');
-if (!slash_old ||
-!slash_new ||
-slash_old - old != slash_new 

Re: [RFC 00/14] convert dir.c to take an index parameter

2017-05-08 Thread Jeff Hostetler



On 5/8/2017 1:12 PM, Brandon Williams wrote:

On 05/06, Junio C Hamano wrote:

Brandon Williams  writes:


One of the things brought up on the list in the past few days has been
migrating away from using the index compatibility macros.  One of the issues
brought up in that thread was how simply doing that conversion doesn't
eliminate the reliance on global state (specifically the_index).  If one day we
want to have a 'repository object' passed around then we first need to convert
different subsystems to be prepared to handle that.  This series provides a
first step, converting the code in dir.c to take a 'struct index_state' and
using that instead of implicitly using 'the_index'.


Very nicely done (I only skimmed "dir.c" in the end result and didn't
go through the changes with fine toothed comb, though).

I would have done this without the first step and then instead had a
final patch that only inserts a single

#define NO_THE_INDEX_COMPATIBILITY_MACROS

at the beginning of dir.c once everybody in dir.c loses the
reference to all "cache" macros at the end, if I were doing this
series, but it is a personal taste.

The resulting dir.c does not even refer to the_index, which is very
nice.


Thanks! I'm glad there's a few people who see this as a positive change.


Agreed.  This looks like a nice start.
Jeff



Re: Questions about validation/verification done for Git Software

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 10:14 AM, Robertson, Todd
 wrote:
> Hello,
>
> I'm a Sr. Software Quality Assurance Engineer at Epocal Inc. in Ottawa, 
> Ontario,
> Canada.  We are a medical device manufacturer that is required to comply with
> the Food & Drug Administration's (FDA) regulations.
> I'm in charge of what is known as Non-Product Software, essentially any piece 
> of
> SW that is used in the day-to-day work of creating our products.
> Part of complying with the FDA regulations is to ensure that all Non-Product 
> Software
> has been validated.
>
> Some of our engineers are planning on using your software "Git" for their 
> software
> development processes (latest version) to replace our current software "SVN" 
> and I was
> hoping you could answer some questions for me?

How is svn validated for you?

> Are you able to provide me with a copy of your Test Plans and/or Test Cases 
> for the "Git" product?

Tests are found inside the t/ directory in your copy of Git[1].
[1] obtained e.g. via "git clone https://github.com/git/git;

Note that these are just tests; nothing formal attached.

> Is the software FDA certified?

I am not aware of that.

Checkout https://git-scm.com/ and for legal questions contact
https://sfconservancy.org/ (via https://git-scm.com/sfc)


Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 10:08 AM, Brandon Williams  wrote:
>>
>> >[submodule "gnulib"]
>> >path=./gnulib
>> >external = true # implies no branch for checkout -b --recurse-submodules
>>
>> >I think there are a couple more situations where such "external" submodules 
>> >are treated differently, so maybe we'd want to think carefully about the 
>> >>actual name as different workflows would want to have different features 
>> >for an internal/external submodule.
>>
>> I didn't want to open up that one, but yes. That makes sense. However, I 
>> don't like overloading what "external" means or might mean in the future. 
>> Would you consider a distinct Boolean for that, like inherit-branch=true?
>
> Something like that kind of already exists.  The 'branch' field.
> Internal repos would most likely use the '.' value to indicate that the
> submodules should track the superproject's branch.  While a value of say
> 'foo' would indicate that the submodule should always be on branch
> 'foo'; this could be used for external repositories.

so for external repos you'd keep the branch unset, such that
you strictly checkout the sha1 object into a detached HEAD.

Makes sense.

Thanks,
Stefan


Questions about validation/verification done for Git Software

2017-05-08 Thread Robertson, Todd
Hello,

I'm a Sr. Software Quality Assurance Engineer at Epocal Inc. in Ottawa, Ontario,
Canada.  We are a medical device manufacturer that is required to comply with
the Food & Drug Administration's (FDA) regulations.
I'm in charge of what is known as Non-Product Software, essentially any piece of
SW that is used in the day-to-day work of creating our products.
Part of complying with the FDA regulations is to ensure that all Non-Product 
Software
has been validated.

Some of our engineers are planning on using your software "Git" for their 
software
development processes (latest version) to replace our current software "SVN" 
and I was 
hoping you could answer some questions for me?

Are you able to provide me with a copy of your Test Plans and/or Test Cases for 
the "Git" product?

Is the software FDA certified?

Please contact me with your reply and/or any questions you have.

Thank you,
Todd Robertson
Sr. Software QA Engineer
Epocal Inc.
Ottawa, Ontario, Canada
todd.robert...@alere.com
613-688-3982 ext.2229


Re: [PATCH v4 0/4] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 9:03 AM, Marc Branchaud  wrote:
> The only change from v3 is in 3/4, to expand t4061 to test various
> combinations of --(no-)indent-heuristic and diff.indentHeuristic.
>
> I kindof went all-in and tried to cover every possible combination for
> all four affected commands.
>
> An inter-diff is below.
>
> M.
>
> Jeff King (1):
>   add--interactive: drop diff.indentHeuristic handling
>
> Marc Branchaud (2):
>   diff: make the indent heuristic part of diff's basic configuration
>   diff: have the diff-* builtins configure diff before initializing
> revisions
>
> Stefan Beller (1):
>   diff: enable indent heuristic by default
>
>  builtin/diff-files.c |   2 +-
>  builtin/diff-index.c |   2 +-
>  builtin/diff-tree.c  |   2 +-
>  diff.c   |   8 +-
>  git-add--interactive.perl|   4 -
>  t/t4051-diff-function-context.sh |   3 +-
>  t/t4061-diff-indent.sh   | 184 
> +++
>  7 files changed, 177 insertions(+), 28 deletions(-)
>
>
> diff --git a/t/t4061-diff-indent.sh b/t/t4061-diff-indent.sh
> index 56d7d7760..2affd7a10 100755
> --- a/t/t4061-diff-indent.sh
> +++ b/t/t4061-diff-indent.sh
> @@ -152,26 +152,28 @@ test_expect_success 'prepare' '
> EOF
>  '
>
> +# --- diff tests --
> +
>  test_expect_success 'diff: ugly spaces' '
> git diff --no-indent-heuristic old new -- spaces.txt >out &&
> compare_diff spaces-expect out
>  '
>
> +test_expect_success 'diff: --no-indent-heuristic overrides config' '
> +   git -c diff.indentHeuristic=true diff --no-indent-heuristic old new 
> -- spaces.txt >out2 &&
> +   compare_diff spaces-expect out2
> +'
> +
>  test_expect_success 'diff: nice spaces with --indent-heuristic' '
> -   git diff --indent-heuristic old new -- spaces.txt >out-compacted &&
> +   git -c diff.indentHeuristic=false diff --indent-heuristic old new -- 
> spaces.txt >out-compacted &&
> compare_diff spaces-compacted-expect out-compacted
>  '
>
> -test_expect_success 'diff: nice spaces with diff.indentHeuristic' '
> +test_expect_success 'diff: nice spaces with diff.indentHeuristic=true' '

Adding the '=true' seems weird to me (I'd think the true is implied,
similar to C,
where we do "if (!null_pointer)" instead of an explicit "!=null".
However we do have these '=true' sprinkled all over the tests,
so I guess it is a valid dialect in our test suite.

The tests (just looked at the interdiff) look good to me.

Thanks,
Stefan


Re: [RFC 00/14] convert dir.c to take an index parameter

2017-05-08 Thread Brandon Williams
On 05/06, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > One of the things brought up on the list in the past few days has been
> > migrating away from using the index compatibility macros.  One of the issues
> > brought up in that thread was how simply doing that conversion doesn't
> > eliminate the reliance on global state (specifically the_index).  If one 
> > day we
> > want to have a 'repository object' passed around then we first need to 
> > convert
> > different subsystems to be prepared to handle that.  This series provides a
> > first step, converting the code in dir.c to take a 'struct index_state' and
> > using that instead of implicitly using 'the_index'.
> 
> Very nicely done (I only skimmed "dir.c" in the end result and didn't
> go through the changes with fine toothed comb, though).
> 
> I would have done this without the first step and then instead had a
> final patch that only inserts a single
> 
> #define NO_THE_INDEX_COMPATIBILITY_MACROS
> 
> at the beginning of dir.c once everybody in dir.c loses the
> reference to all "cache" macros at the end, if I were doing this
> series, but it is a personal taste.  
> 
> The resulting dir.c does not even refer to the_index, which is very
> nice.

Thanks! I'm glad there's a few people who see this as a positive change.

> 
> Thanks.
> 
> > Brandon Williams (14):
> >   dir: stop using the index compatibility macros
> >   dir: convert read_skip_worktree_file_from_index to take an index
> >   dir: convert directory_exists_in_index to take index
> >   dir: convert get_dtype to take index
> >   dir: convert dir_add* to take an index
> >   dir: convert last_exclude_matching_from_list to take an index
> >   dir: convert is_excluded_from_list to take an index
> >   dir: convert add_excludes to take an index
> >   dir: convert prep_exclude to take an index
> >   dir: convert is_excluded to take an index
> >   dir: convert open_cached_dir to take an index
> >   dir: convert read_directory_recursive to take an index
> >   dir: convert read_directory to take an index
> >   dir: convert fill_directory to take an index
> >
> >  builtin/add.c  |   7 +-
> >  builtin/check-ignore.c |   3 +-
> >  builtin/clean.c|   4 +-
> >  builtin/grep.c |   2 +-
> >  builtin/ls-files.c |   4 +-
> >  dir.c  | 200 
> > -
> >  dir.h  |  27 +--
> >  unpack-trees.c |  10 +--
> >  wt-status.c|   2 +-
> >  9 files changed, 151 insertions(+), 108 deletions(-)

-- 
Brandon Williams


Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Brandon Williams
On 05/08, Randall S. Becker wrote:
> On May 8, 2017 12:55 PM, Stefan Beller wrote:
> >On Mon, May 8, 2017 at 9:46 AM, Randall S. Becker  
> >wrote:
> >> On May 8, 2017 12:25 PM, Stefan Beller wrote:
> >>>On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker  
> >>>wrote:
>  On May 6, 2017 4:38 AM Ciro Santilli wrote:
> > This is a must if you are working with submodules, otherwise every 
> > git checkout requires a git submodule update, and you forget it, 
> > and things break, and you understand, and you go to stack overflow 
> > questions 
> > http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout
> > -a utomatically-do-git-submodule-update-recursive
> > http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-g
> > it -pull-automatically-update-submodules
> > and you give up and create aliases :-)
> >>
> >>> The upcoming release (2.13) will have "git checkout 
> >>> --recurse-submodules", which will checkout the submodules at the commit 
> >>> as recorded in the superproject.
> >>> I plan to add an option "submodule.recurse" (name is subject to 
> >>> bikeshedding), which would make the --recurse-submodules flag given 
> >>> by default for all commands that support the flag. (Currently cooking we 
> >>> have reset --recurse-submodules, already existing there is push/pull).
> >>
> >> Brilliant! 
> >>
>  I rather like the concept of supporting --recurse-submodules. The 
>  complexity is that the branches in all submodules all have to have 
>  compatible >>>semantics when doing the checkout, which is by no means 
>  guaranteed. In the scenario where you are including a submodule from a 
>  third-party (very >>>common - see gnulib), the branches likely won't be 
>  there, so you have a high probability of having the command fail or 
>  produce the same results as >>>currently exists if you allow the 
>  checkout even with problems (another option?). If you have control of 
>  everything, then this makes sense.
> >>
> >>>I am trying to give the use case of having control over everything (or 
> >>>rather mixed) more thought as well, e.g. "checkout --recurse-submodules -b 
> >" may want to create the branches in a subset of submodules as 
> >>>well.
> >>
> >> I have to admit that I just assumed it would have to work that way 
> >> this would not be particularly useful. However, in thinking about it, 
> >> we might want to limit the depth of how far -b  takes effect. If 
> >> the super module brings in submodules entirely within control of the 
> >> development group, having -b  apply down to leaf submodules 
> >> makes sense (in some policies). However, if some submodules span out 
> >> to, say, gnulib, that might not make particular sense. Some downward 
> >> limit might be appropriate. Perhaps, in the submodule ref, you might 
> >> want to qualify it as : (but the impact of that is 
> >> probably and admittedly pretty horrid). I hesitate to suggest a 
> >> numeric limit, as that assumes that submodules are organized in a 
> >> balanced tree - which is axiomatically unreasonable. Maybe something 
> >> in .git/config, like
> >>
> >> [branch "topic*"]
> >> submodules=a,b,c
> >>
> >> But I suspect that would make things even more confusing.
> 
> >I thought about having yet-another-flag in the .gitmodules file, which 
> >states if the submodule is extern or internal.
> 
> >[submodule "gnulib"]
> >path=./gnulib
> >external = true # implies no branch for checkout -b --recurse-submodules
> 
> >I think there are a couple more situations where such "external" submodules 
> >are treated differently, so maybe we'd want to think carefully about the 
> >>actual name as different workflows would want to have different features 
> >for an internal/external submodule.
> 
> I didn't want to open up that one, but yes. That makes sense. However, I 
> don't like overloading what "external" means or might mean in the future. 
> Would you consider a distinct Boolean for that, like inherit-branch=true?

Something like that kind of already exists.  The 'branch' field.
Internal repos would most likely use the '.' value to indicate that the
submodules should track the superproject's branch.  While a value of say
'foo' would indicate that the submodule should always be on branch
'foo'; this could be used for external repositories.

> 
> Cheers,
> Randall
> 

-- 
Brandon Williams


RE: Add an option to automatically submodule update on checkout

2017-05-08 Thread Randall S. Becker
On May 8, 2017 12:55 PM, Stefan Beller wrote:
>On Mon, May 8, 2017 at 9:46 AM, Randall S. Becker  
>wrote:
>> On May 8, 2017 12:25 PM, Stefan Beller wrote:
>>>On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker  
>>>wrote:
 On May 6, 2017 4:38 AM Ciro Santilli wrote:
> This is a must if you are working with submodules, otherwise every 
> git checkout requires a git submodule update, and you forget it, 
> and things break, and you understand, and you go to stack overflow 
> questions 
> http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout
> -a utomatically-do-git-submodule-update-recursive
> http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-g
> it -pull-automatically-update-submodules
> and you give up and create aliases :-)
>>
>>> The upcoming release (2.13) will have "git checkout 
>>> --recurse-submodules", which will checkout the submodules at the commit as 
>>> recorded in the superproject.
>>> I plan to add an option "submodule.recurse" (name is subject to 
>>> bikeshedding), which would make the --recurse-submodules flag given 
>>> by default for all commands that support the flag. (Currently cooking we 
>>> have reset --recurse-submodules, already existing there is push/pull).
>>
>> Brilliant! 
>>
 I rather like the concept of supporting --recurse-submodules. The 
 complexity is that the branches in all submodules all have to have 
 compatible >>>semantics when doing the checkout, which is by no means 
 guaranteed. In the scenario where you are including a submodule from a 
 third-party (very >>>common - see gnulib), the branches likely won't be 
 there, so you have a high probability of having the command fail or 
 produce the same results as >>>currently exists if you allow the checkout 
 even with problems (another option?). If you have control of everything, 
 then this makes sense.
>>
>>>I am trying to give the use case of having control over everything (or 
>>>rather mixed) more thought as well, e.g. "checkout --recurse-submodules -b 
>" may want to create the branches in a subset of submodules as well.
>>
>> I have to admit that I just assumed it would have to work that way 
>> this would not be particularly useful. However, in thinking about it, 
>> we might want to limit the depth of how far -b  takes effect. If 
>> the super module brings in submodules entirely within control of the 
>> development group, having -b  apply down to leaf submodules 
>> makes sense (in some policies). However, if some submodules span out 
>> to, say, gnulib, that might not make particular sense. Some downward 
>> limit might be appropriate. Perhaps, in the submodule ref, you might 
>> want to qualify it as : (but the impact of that is 
>> probably and admittedly pretty horrid). I hesitate to suggest a 
>> numeric limit, as that assumes that submodules are organized in a 
>> balanced tree - which is axiomatically unreasonable. Maybe something 
>> in .git/config, like
>>
>> [branch "topic*"]
>> submodules=a,b,c
>>
>> But I suspect that would make things even more confusing.

>I thought about having yet-another-flag in the .gitmodules file, which states 
>if the submodule is extern or internal.

>[submodule "gnulib"]
>path=./gnulib
>external = true # implies no branch for checkout -b --recurse-submodules

>I think there are a couple more situations where such "external" submodules 
>are treated differently, so maybe we'd want to think carefully about the 
>>actual name as different workflows would want to have different features for 
>an internal/external submodule.

I didn't want to open up that one, but yes. That makes sense. However, I don't 
like overloading what "external" means or might mean in the future. Would you 
consider a distinct Boolean for that, like inherit-branch=true?

Cheers,
Randall



Re: vger not relaying some of Junio's messages today?

2017-05-08 Thread Brandon Williams
On 05/07, Eric Wong wrote:
> Samuel Lijin  wrote:
> > > Samuel Lijin  wrote:
> > >> Yep, I see these on public-inbox.org/git/ but not in my gmail inbox:
> > >
> > > Hi Samuel, check your Spam box (and move it to a normal inbox so
> > > they can train it).  Gmail filters are known to trigger happy
> > > and incorrectly flag messages.  It's been a problem on LKML,
> > > too.
> > 
> > Sorry, should've been clearer - I did check my spambox in my original
> > message. Some old patches from Brandon were in there, but the ones I
> > mentioned in my original message just seem to have been dropped.
> 
> Yikes.  I wonder if Gmail automatically nukes messages if
> they end up in enough Spam boxes...  Really hoping Googlers
> can do something about this (or better, more people start
> running their own SMTP servers, again).

That's quite odd, none of my coworkers are seeing behavior like this (my
mail going to spam).

-- 
Brandon Williams


Re: vger not relaying some of Junio's messages today?

2017-05-08 Thread Brandon Williams
On 05/08, Johannes Schindelin wrote:
> Hi Ævar,
> 
> On Sat, 6 May 2017, Ævar Arnfjörð Bjarmason wrote:
> 
> > I have one [script] to git am a patch from a msgid, thought I should
> > write something to handle a series in some DWIM fashion (e.g. apply the
> > latest continuous sequence of patches matching --author) but figured
> > that someone's probably wrote this already & I don't need to hack it up
> > myself...
> 
> You probably missed my previous mails mentioning
> 
> https://github.com/git-for-windows/build-extra/blob/master/apply-from-public-inbox.sh
> 
> You can use this script to apply single patches (identified by their
> Message-ID), and patch series (identified by their cover letter's
> Message-ID).
> 
> As I mentioned at the Contributors' Summit at GitMerge 2017: I would
> *love* to collaborate on tools that make any part of the
> contribution/review process less cumbersome than it is right now.

Yeah its not the most streamlined process.  I'm sure everyone writes
their own scripts (like I did) tailored to their workflow.  For example
I just tag a bunch of mails in mutt and then have a scripts which 'git
am's them on a branch/base of my choosing.  But its specific to
my workflow so idk how useful it would be to others :(


-- 
Brandon Williams


Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 9:46 AM, Randall S. Becker
 wrote:
> On May 8, 2017 12:25 PM, Stefan Beller wrote:
>>On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker  
>>wrote:
>>> On May 6, 2017 4:38 AM Ciro Santilli wrote:
 This is a must if you are working with submodules, otherwise every
 git checkout requires a git submodule update, and you forget it, and
 things break, and you understand, and you go to stack overflow
 questions
 http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout-a
 utomatically-do-git-submodule-update-recursive
 http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-git
 -pull-automatically-update-submodules
 and you give up and create aliases :-)
>
>> The upcoming release (2.13) will have "git checkout --recurse-submodules", 
>> which will checkout the submodules
>> at the commit as recorded in the superproject.
>> I plan to add an option "submodule.recurse" (name is subject to 
>> bikeshedding), which would make the --recurse-submodules
>> flag given by default for all commands that support the flag. (Currently 
>> cooking we have reset --recurse-submodules, already
>> existing there is push/pull).
>
> Brilliant! 
>
>>> I rather like the concept of supporting --recurse-submodules. The 
>>> complexity is that the branches in all submodules all have to have 
>>> compatible >>semantics when doing the checkout, which is by no means 
>>> guaranteed. In the scenario where you are including a submodule from a 
>>> third-party (very >>common - see gnulib), the branches likely won't be 
>>> there, so you have a high probability of having the command fail or produce 
>>> the same results as >>currently exists if you allow the checkout even with 
>>> problems (another option?). If you have control of everything, then this 
>>> makes sense.
>
>>I am trying to give the use case of having control over everything (or rather 
>>mixed) more thought as well, e.g. "checkout --recurse-submodules -b >" 
>>may want to create the branches in a subset of submodules as well.
>
> I have to admit that I just assumed it would have to work that way this would 
> not be particularly useful. However, in thinking about it, we might want to 
> limit the depth of how far -b  takes effect. If the super module brings 
> in submodules entirely within control of the development group, having -b 
>  apply down to leaf submodules makes sense (in some policies). However, 
> if some submodules span out to, say, gnulib, that might not make particular 
> sense. Some downward limit might be appropriate. Perhaps, in the submodule 
> ref, you might want to qualify it as : (but the impact of that 
> is probably and admittedly pretty horrid). I hesitate to suggest a numeric 
> limit, as that assumes that submodules are organized in a balanced tree - 
> which is axiomatically unreasonable. Maybe something in .git/config, like
>
> [branch "topic*"]
> submodules=a,b,c
>
> But I suspect that would make things even more confusing.

I thought about having yet-another-flag in the .gitmodules file, which
states if the submodule is extern or internal.

[submodule "gnulib"]
path=./gnulib
external = true # implies no branch for checkout -b --recurse-submodules

I think there are a couple more situations where such "external" submodules
are treated differently, so maybe we'd want to think carefully about
the actual name
as different workflows would want to have different features for an
internal/external
submodule.

Thanks,
Stefan


Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Jeff Hostetler



On 5/8/2017 5:45 AM, Christian Couder wrote:

On Fri, Apr 14, 2017 at 10:32 PM,   wrote:

diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 33a51c9..677e15a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -689,4 +689,17 @@ test_expect_success 'bogus head does not fallback to all 
heads' '
! grep $blob out
 '

+test_expect_success 'detect corrupt index file in fsck' '
+   cp .git/index .git/index.backup &&
+   test_when_finished "mv .git/index.backup .git/index" &&
+   echo  > &&
+   git add  &&
+   sed -e "s///" .git/index >.git/index.yyy &&
+   mv .git/index.yyy .git/index &&
+   # Confirm that fsck detects invalid checksum
+   test_must_fail git fsck --cache &&
+   # Confirm that status no longer complains about invalid checksum
+   git status
+'


This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
set on my Linux machine.

Also it looks like you sent a v8 of this patch series with a different
test, but what is in master looks like the above test instead of the
test in your v8.



There was concern about using sed on a binary file not being portable
and a request to change the test to just corrupt the checksum rather
than an index-entry, so I changed it in v8.

Does the v8 version of the test also fail on your machine ?

Thanks
Jeff


Re: Not translatable strings in Git

2017-05-08 Thread Stefan Beller
On Sat, May 6, 2017 at 1:40 AM, Jordi Mas  wrote:
> Hello,
>
> When translating git (https://github.com/git/git/tree/master/po)
>
> The following strings cannot be translated:
>
>  remote: Counting objects: 331, done.
>  remote: Compressing objects: 100% (213/213), done.
>  remote: Total 244 (delta 184), reused 34 (delta 29)

This is what the remote server tells a client, but the client did not
tell the server which locale it has.

Now consider we have hosting service that hosts projects accessible from
a wide range of users, all of them having different locales. Which language
should the remote speak? (Not knowing which language the client speaks).


>  27 files changed, 3399 insertions(+), 3320 deletions(-)
>  create mode 100644 build-aux/flatpak/org.gnome.Nautilus-stable.json
>  delete mode 100755 build-aux/meson/check_libgd.sh

These sound like they are done locally and could be translated easily.
However I wonder if we have issues with leaky plumbing in the user porcelain.

Thanks,
Stefan


Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Brandon Williams
On 05/08, Stefan Beller wrote:
> On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker
>  wrote:
> > On May 6, 2017 4:38 AM Ciro Santilli wrote:
> >> This is a must if you are working with submodules, otherwise every
> >> git checkout requires a git submodule update, and you forget it,
> >> and things break, and you understand, and you go to stack overflow
> >> questions
> >> http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout-automatically-do-git-submodule-update-recursive
> >> http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-git-pull-automatically-update-submodules
> >> and you give up and create aliases :-)
> 
> The upcoming release (2.13) will have "git checkout
> --recurse-submodules", which will checkout the submodules at the
> commit as recorded in the superproject.
> 
> I plan to add an option "submodule.recurse" (name is subject to
> bikeshedding), which would make the --recurse-submodules flag given by
> default for all commands that support the flag. (Currently cooking we
> have reset --recurse-submodules, already existing there is push/pull).

Well pull not so much...it'll do a recursive fetch but not a recursive
merge/rebase.  That is something on the docket to get done in the next
couple months though.

> 
> > I rather like the concept of supporting --recurse-submodules. The
> > complexity is that the branches in all submodules all have to have
> > compatible semantics when doing the checkout, which is by no means
> > guaranteed. In the scenario where you are including a submodule from
> > a third-party (very common - see gnulib), the branches likely won't
> > be there, so you have a high probability of having the command fail
> > or produce the same results as currently exists if you allow the
> > checkout even with problems (another option?). If you have control
> > of everything, then this makes sense.
> 
> I am trying to give the use case of having control over everything (or
> rather mixed) more thought as well, e.g. "checkout
> --recurse-submodules -b " may want to create the branches in a
> subset of submodules as well.
> 
> Thanks, Stefan
> 
> >
> > Cheers, Randall
> >

-- 
Brandon Williams


RE: Add an option to automatically submodule update on checkout

2017-05-08 Thread Randall S. Becker
On May 8, 2017 12:25 PM, Stefan Beller wrote:
>On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker  
>wrote:
>> On May 6, 2017 4:38 AM Ciro Santilli wrote:
>>> This is a must if you are working with submodules, otherwise every 
>>> git checkout requires a git submodule update, and you forget it, and 
>>> things break, and you understand, and you go to stack overflow 
>>> questions 
>>> http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout-a
>>> utomatically-do-git-submodule-update-recursive
>>> http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-git
>>> -pull-automatically-update-submodules
>>> and you give up and create aliases :-)

> The upcoming release (2.13) will have "git checkout --recurse-submodules", 
> which will checkout the submodules
> at the commit as recorded in the superproject.
> I plan to add an option "submodule.recurse" (name is subject to 
> bikeshedding), which would make the --recurse-submodules
> flag given by default for all commands that support the flag. (Currently 
> cooking we have reset --recurse-submodules, already
> existing there is push/pull).

Brilliant! 

>> I rather like the concept of supporting --recurse-submodules. The complexity 
>> is that the branches in all submodules all have to have compatible 
>> >>semantics when doing the checkout, which is by no means guaranteed. In the 
>> scenario where you are including a submodule from a third-party (very 
>> >>common - see gnulib), the branches likely won't be there, so you have a 
>> high probability of having the command fail or produce the same results as 
>> >>currently exists if you allow the checkout even with problems (another 
>> option?). If you have control of everything, then this makes sense.

>I am trying to give the use case of having control over everything (or rather 
>mixed) more thought as well, e.g. "checkout --recurse-submodules -b >" 
>may want to create the branches in a subset of submodules as well.

I have to admit that I just assumed it would have to work that way this would 
not be particularly useful. However, in thinking about it, we might want to 
limit the depth of how far -b  takes effect. If the super module brings 
in submodules entirely within control of the development group, having -b 
 apply down to leaf submodules makes sense (in some policies). However, 
if some submodules span out to, say, gnulib, that might not make particular 
sense. Some downward limit might be appropriate. Perhaps, in the submodule ref, 
you might want to qualify it as : (but the impact of that is 
probably and admittedly pretty horrid). I hesitate to suggest a numeric limit, 
as that assumes that submodules are organized in a balanced tree - which is 
axiomatically unreasonable. Maybe something in .git/config, like

[branch "topic*"]
submodules=a,b,c

But I suspect that would make things even more confusing.

Cheers,
Randall



Re: git and the Clang Static Analyzer

2017-05-08 Thread Lars Schneider

> On 08 May 2017, at 12:12, Johannes Schindelin  
> wrote:
> 
> Hi Dilyan,
> 
> 
> On Sun, 7 May 2017, Дилян Палаузов wrote:
> 
>> ...
> 
> Also: to make it *really* useful for other developers, it would be a good
> idea for you to try your hand at patching the .travis.yml file in such a
> way that the static analysis is performed as part of the Continuous
> Testing, and to contribute said patch. That way, other contributors could
> not only see how it is done (and copy-paste the commands, including `apt
> install` calls insofar necessary), but also to see the reports on a
> trusted website (Travis').

I did that here:
https://github.com/larsxschneider/git/commit/faf4ecfdca1a732459c1f93c334928ee2826d490

Unfortunately, there are still too many issues to activate this job:
https://travis-ci.org/larsxschneider/git/jobs/205507241

See this thread for more info:
http://public-inbox.org/git/bab1ee0e-b258-4108-ae24-110172086...@gmail.com/

- Lars

Re: Add an option to automatically submodule update on checkout

2017-05-08 Thread Stefan Beller
On Mon, May 8, 2017 at 7:42 AM, Randall S. Becker
 wrote:
> On May 6, 2017 4:38 AM Ciro Santilli wrote:
>> This is a must if you are working with submodules, otherwise every git 
>> checkout requires a git submodule update,
>> and you forget it, and things break, and you understand, and you go to stack 
>> overflow questions
>> http://stackoverflow.com/questions/22328053/why-doesnt-git-checkout-automatically-do-git-submodule-update-recursive
>> http://stackoverflow.com/questions/4611512/is-there-a-way-to-make-git-pull-automatically-update-submodules
>> and you give up and create aliases :-)

The upcoming release (2.13) will have "git checkout --recurse-submodules",
which will checkout the submodules at the commit as recorded in the
superproject.

I plan to add an option "submodule.recurse" (name is subject to bikeshedding),
which would make the --recurse-submodules flag given by default for all commands
that support the flag. (Currently cooking we have reset
--recurse-submodules, already
existing there is push/pull).

> I rather like the concept of supporting --recurse-submodules. The complexity 
> is that the branches in all submodules all have to have compatible semantics 
> when doing the checkout, which is by no means guaranteed. In the scenario 
> where you are including a submodule from a third-party (very common - see 
> gnulib), the branches likely won't be there, so you have a high probability 
> of having the command fail or produce the same results as currently exists if 
> you allow the checkout even with problems (another option?). If you have 
> control of everything, then this makes sense.

I am trying to give the use case of having control over everything
(or rather mixed) more thought as well,
e.g. "checkout --recurse-submodules -b " may want to
create the branches in a subset of submodules as well.

Thanks,
Stefan

>
> Cheers,
> Randall
>


[PATCH v4 1/4] diff: make the indent heuristic part of diff's basic configuration

2017-05-08 Thread Marc Branchaud
This heuristic was originally introduced as an experimental feature,
and therefore part of the UI configuration.

But the user often sees diffs generated by plumbing commands like
diff-tree.  Moving the indent heuristic into diff's basic configuration
prepares the way for diff plumbing commands to respect the setting.

The heuristic itself merely makes the diffs more aesthetically
pleasing, without changing their correctness.  Scripts that rely on
the diff plumbing commands should not care whether or not the heuristic
is employed.

Signed-off-by: Marc Branchaud 
---
 diff.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 74283d900..b6e3ffe92 100644
--- a/diff.c
+++ b/diff.c
@@ -290,9 +290,6 @@ int git_diff_ui_config(const char *var, const char *value, 
void *cb)
return 0;
}
 
-   if (git_diff_heuristic_config(var, value, cb) < 0)
-   return -1;
-
if (!strcmp(var, "diff.wserrorhighlight")) {
int val = parse_ws_error_highlight(value);
if (val < 0)
@@ -351,6 +348,9 @@ int git_diff_basic_config(const char *var, const char 
*value, void *cb)
if (starts_with(var, "submodule."))
return parse_submodule_config_option(var, value);
 
+   if (git_diff_heuristic_config(var, value, cb) < 0)
+   return -1;
+
return git_default_config(var, value, cb);
 }
 
-- 
2.13.0.rc1.15.gf67d331ad



[PATCH v4 2/4] diff: have the diff-* builtins configure diff before initializing revisions

2017-05-08 Thread Marc Branchaud
This matches how the diff Porcelain works.  It makes the plumbing commands
respect diff's configuration options, such as indentHeuristic, because
init_revisions() calls diff_setup() which fills in the diff_options struct.

Signed-off-by: Marc Branchaud 
---
 builtin/diff-files.c   |  2 +-
 builtin/diff-index.c   |  2 +-
 builtin/diff-tree.c|  2 +-
 t/t4061-diff-indent.sh | 66 ++
 4 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/builtin/diff-files.c b/builtin/diff-files.c
index 15c61fd8d..a572da9d5 100644
--- a/builtin/diff-files.c
+++ b/builtin/diff-files.c
@@ -20,9 +20,9 @@ int cmd_diff_files(int argc, const char **argv, const char 
*prefix)
int result;
unsigned options = 0;
 
+   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
init_revisions(, prefix);
gitmodules_config();
-   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
rev.abbrev = 0;
precompose_argv(argc, argv);
 
diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index 1af373d00..f084826a2 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -17,9 +17,9 @@ int cmd_diff_index(int argc, const char **argv, const char 
*prefix)
int i;
int result;
 
+   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
init_revisions(, prefix);
gitmodules_config();
-   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
rev.abbrev = 0;
precompose_argv(argc, argv);
 
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index 326f88b65..36a3a1976 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -105,9 +105,9 @@ int cmd_diff_tree(int argc, const char **argv, const char 
*prefix)
struct setup_revision_opt s_r_opt;
int read_stdin = 0;
 
+   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
init_revisions(opt, prefix);
gitmodules_config();
-   git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
opt->abbrev = 0;
opt->diff = 1;
opt->disable_stdin = 1;
diff --git a/t/t4061-diff-indent.sh b/t/t4061-diff-indent.sh
index 556450609..13d3dc96a 100755
--- a/t/t4061-diff-indent.sh
+++ b/t/t4061-diff-indent.sh
@@ -213,4 +213,70 @@ test_expect_success 'blame: --no-indent-heuristic 
overrides config' '
compare_blame spaces-expect out-blame2
 '
 
+test_expect_success 'diff-tree: nice spaces with --indent-heuristic' '
+   git diff-tree --indent-heuristic -p old new -- spaces.txt 
>out-diff-tree-compacted &&
+   compare_diff spaces-compacted-expect out-diff-tree-compacted
+'
+
+test_expect_success 'diff-tree: nice spaces with diff.indentHeuristic' '
+   git -c diff.indentHeuristic=true diff-tree -p old new -- spaces.txt 
>out-diff-tree-compacted2 &&
+   compare_diff spaces-compacted-expect out-diff-tree-compacted2
+'
+
+test_expect_success 'diff-tree: --no-indent-heuristic overrides config' '
+   git -c diff.indentHeuristic=true diff-tree --no-indent-heuristic -p old 
new -- spaces.txt >out-diff-tree &&
+   compare_diff spaces-expect out-diff-tree
+'
+
+test_expect_success 'diff-index: nice spaces with --indent-heuristic' '
+   git checkout -B diff-index &&
+   git reset --soft HEAD~ &&
+   git diff-index --indent-heuristic -p old -- spaces.txt 
>out-diff-index-compacted &&
+   compare_diff spaces-compacted-expect out-diff-index-compacted &&
+   git checkout -f master
+'
+
+test_expect_success 'diff-index: nice spaces with diff.indentHeuristic' '
+   git checkout -B diff-index &&
+   git reset --soft HEAD~ &&
+   git -c diff.indentHeuristic=true diff-index -p old -- spaces.txt 
>out-diff-index-compacted2 &&
+   compare_diff spaces-compacted-expect out-diff-index-compacted2 &&
+   git checkout -f master
+'
+
+test_expect_success 'diff-index: --no-indent-heuristic overrides config' '
+   git checkout -B diff-index &&
+   git reset --soft HEAD~ &&
+   git -c diff.indentHeuristic=true diff-index --no-indent-heuristic -p 
old -- spaces.txt >out-diff-index &&
+   compare_diff spaces-expect out-diff-index &&
+   git checkout -f master
+'
+
+test_expect_success 'diff-files: nice spaces with diff.indentHeuristic' '
+   git checkout -B diff-files &&
+   git reset HEAD~ &&
+   git -c diff.indentHeuristic=true diff-files -p spaces.txt 
>out-diff-files-raw &&
+   grep -v index out-diff-files-raw >out-diff-files-compacted &&
+   compare_diff spaces-compacted-expect out-diff-files-compacted &&
+   git checkout -f master
+'
+
+test_expect_success 'diff-files: nice spaces with diff.indentHeuristic' '
+   git checkout -B diff-files &&
+   git reset HEAD~ &&
+   git -c diff.indentHeuristic=true diff-files -p spaces.txt 
>out-diff-files-raw2 &&
+   grep -v index out-diff-files-raw2 

[PATCH v4 3/4] diff: enable indent heuristic by default

2017-05-08 Thread Marc Branchaud
From: Stefan Beller 

The feature was included in v2.11 (released 2016-11-29) and we got no
negative feedback. Quite the opposite, all feedback we got was positive.

Turn it on by default. Users who dislike the feature can turn it off
by setting diff.indentHeuristic (which also configures plumbing commands,
see prior patches).

The change to t/t4051-diff-function-context.sh is needed because the
heuristic shifts the changed hunk in the patch.  To get the same result
regardless of the heuristic configuration, we modify the test file
differently:  We insert a completely new line after line 2, instead of
simply duplicating it.

Helped-by: Jeff King 
Signed-off-by: Stefan Beller 
Signed-off-by: Marc Branchaud 
---

 diff.c   |   2 +-
 t/t4051-diff-function-context.sh |   3 +-
 t/t4061-diff-indent.sh   | 140 +++
 3 files changed, 116 insertions(+), 29 deletions(-)

diff --git a/diff.c b/diff.c
index b6e3ffe92..a24452051 100644
--- a/diff.c
+++ b/diff.c
@@ -27,7 +27,7 @@
 #endif
 
 static int diff_detect_rename_default;
-static int diff_indent_heuristic; /* experimental */
+static int diff_indent_heuristic = 1;
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
diff --git a/t/t4051-diff-function-context.sh b/t/t4051-diff-function-context.sh
index 6154acb45..3e6b485ec 100755
--- a/t/t4051-diff-function-context.sh
+++ b/t/t4051-diff-function-context.sh
@@ -72,7 +72,8 @@ test_expect_success 'setup' '
 
# overlap function context of 1st change and -u context of 2nd change
grep -v "delete me from hello" <"$dir/hello.c" >file.c &&
-   sed 2p <"$dir/dummy.c" >>file.c &&
+   sed "2a\\
+extra line" <"$dir/dummy.c" >>file.c &&
commit_and_tag changed_hello_dummy file.c &&
 
git checkout initial &&
diff --git a/t/t4061-diff-indent.sh b/t/t4061-diff-indent.sh
index 13d3dc96a..2affd7a10 100755
--- a/t/t4061-diff-indent.sh
+++ b/t/t4061-diff-indent.sh
@@ -152,26 +152,28 @@ test_expect_success 'prepare' '
EOF
 '
 
+# --- diff tests --
+
 test_expect_success 'diff: ugly spaces' '
-   git diff old new -- spaces.txt >out &&
+   git diff --no-indent-heuristic old new -- spaces.txt >out &&
compare_diff spaces-expect out
 '
 
+test_expect_success 'diff: --no-indent-heuristic overrides config' '
+   git -c diff.indentHeuristic=true diff --no-indent-heuristic old new -- 
spaces.txt >out2 &&
+   compare_diff spaces-expect out2
+'
+
 test_expect_success 'diff: nice spaces with --indent-heuristic' '
-   git diff --indent-heuristic old new -- spaces.txt >out-compacted &&
+   git -c diff.indentHeuristic=false diff --indent-heuristic old new -- 
spaces.txt >out-compacted &&
compare_diff spaces-compacted-expect out-compacted
 '
 
-test_expect_success 'diff: nice spaces with diff.indentHeuristic' '
+test_expect_success 'diff: nice spaces with diff.indentHeuristic=true' '
git -c diff.indentHeuristic=true diff old new -- spaces.txt 
>out-compacted2 &&
compare_diff spaces-compacted-expect out-compacted2
 '
 
-test_expect_success 'diff: --no-indent-heuristic overrides config' '
-   git -c diff.indentHeuristic=true diff --no-indent-heuristic old new -- 
spaces.txt >out2 &&
-   compare_diff spaces-expect out2
-'
-
 test_expect_success 'diff: --indent-heuristic with --patience' '
git diff --indent-heuristic --patience old new -- spaces.txt 
>out-compacted3 &&
compare_diff spaces-compacted-expect out-compacted3
@@ -183,7 +185,7 @@ test_expect_success 'diff: --indent-heuristic with 
--histogram' '
 '
 
 test_expect_success 'diff: ugly functions' '
-   git diff old new -- functions.c >out &&
+   git diff --no-indent-heuristic old new -- functions.c >out &&
compare_diff functions-expect out
 '
 
@@ -192,42 +194,73 @@ test_expect_success 'diff: nice functions with 
--indent-heuristic' '
compare_diff functions-compacted-expect out-compacted
 '
 
-test_expect_success 'blame: ugly spaces' '
-   git blame old..new -- spaces.txt >out-blame &&
-   compare_blame spaces-expect out-blame
-'
+# --- blame tests -
 
 test_expect_success 'blame: nice spaces with --indent-heuristic' '
git blame --indent-heuristic old..new -- spaces.txt 
>out-blame-compacted &&
compare_blame spaces-compacted-expect out-blame-compacted
 '
 
-test_expect_success 'blame: nice spaces with diff.indentHeuristic' '
+test_expect_success 'blame: nice spaces with diff.indentHeuristic=true' '
git -c diff.indentHeuristic=true blame old..new -- spaces.txt 
>out-blame-compacted2 &&
compare_blame spaces-compacted-expect out-blame-compacted2
 '
 
+test_expect_success 'blame: 

[PATCH v4 4/4] add--interactive: drop diff.indentHeuristic handling

2017-05-08 Thread Marc Branchaud
From: Jeff King 

Now that diff.indentHeuristic is handled automatically by the plumbing
commands, there's no need to propagate it manually.

Signed-off-by: Jeff King 
Signed-off-by: Marc Branchaud 
---
 git-add--interactive.perl | 4 
 1 file changed, 4 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 709a5f6ce..79d675b5b 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -46,7 +46,6 @@ my ($diff_new_color) =
 my $normal_color = $repo->get_color("", "reset");
 
 my $diff_algorithm = $repo->config('diff.algorithm');
-my $diff_indent_heuristic = $repo->config_bool('diff.indentheuristic');
 my $diff_filter = $repo->config('interactive.difffilter');
 
 my $use_readkey = 0;
@@ -730,9 +729,6 @@ sub parse_diff {
if (defined $diff_algorithm) {
splice @diff_cmd, 1, 0, "--diff-algorithm=${diff_algorithm}";
}
-   if ($diff_indent_heuristic) {
-   splice @diff_cmd, 1, 0, "--indent-heuristic";
-   }
if (defined $patch_mode_revision) {
push @diff_cmd, get_diff_reference($patch_mode_revision);
}
-- 
2.13.0.rc1.15.gf67d331ad



[PATCH v4 0/4] Make diff plumbing commands respect the indentHeuristic.

2017-05-08 Thread Marc Branchaud
The only change from v3 is in 3/4, to expand t4061 to test various
combinations of --(no-)indent-heuristic and diff.indentHeuristic.

I kindof went all-in and tried to cover every possible combination for
all four affected commands.

An inter-diff is below.

M.

Jeff King (1):
  add--interactive: drop diff.indentHeuristic handling

Marc Branchaud (2):
  diff: make the indent heuristic part of diff's basic configuration
  diff: have the diff-* builtins configure diff before initializing
revisions

Stefan Beller (1):
  diff: enable indent heuristic by default

 builtin/diff-files.c |   2 +-
 builtin/diff-index.c |   2 +-
 builtin/diff-tree.c  |   2 +-
 diff.c   |   8 +-
 git-add--interactive.perl|   4 -
 t/t4051-diff-function-context.sh |   3 +-
 t/t4061-diff-indent.sh   | 184 +++
 7 files changed, 177 insertions(+), 28 deletions(-)


diff --git a/t/t4061-diff-indent.sh b/t/t4061-diff-indent.sh
index 56d7d7760..2affd7a10 100755
--- a/t/t4061-diff-indent.sh
+++ b/t/t4061-diff-indent.sh
@@ -152,26 +152,28 @@ test_expect_success 'prepare' '
EOF
 '
 
+# --- diff tests --
+
 test_expect_success 'diff: ugly spaces' '
git diff --no-indent-heuristic old new -- spaces.txt >out &&
compare_diff spaces-expect out
 '
 
+test_expect_success 'diff: --no-indent-heuristic overrides config' '
+   git -c diff.indentHeuristic=true diff --no-indent-heuristic old new -- 
spaces.txt >out2 &&
+   compare_diff spaces-expect out2
+'
+
 test_expect_success 'diff: nice spaces with --indent-heuristic' '
-   git diff --indent-heuristic old new -- spaces.txt >out-compacted &&
+   git -c diff.indentHeuristic=false diff --indent-heuristic old new -- 
spaces.txt >out-compacted &&
compare_diff spaces-compacted-expect out-compacted
 '
 
-test_expect_success 'diff: nice spaces with diff.indentHeuristic' '
+test_expect_success 'diff: nice spaces with diff.indentHeuristic=true' '
git -c diff.indentHeuristic=true diff old new -- spaces.txt 
>out-compacted2 &&
compare_diff spaces-compacted-expect out-compacted2
 '
 
-test_expect_success 'diff: --no-indent-heuristic overrides config' '
-   git -c diff.indentHeuristic=true diff --no-indent-heuristic old new -- 
spaces.txt >out2 &&
-   compare_diff spaces-expect out2
-'
-
 test_expect_success 'diff: --indent-heuristic with --patience' '
git diff --indent-heuristic --patience old new -- spaces.txt 
>out-compacted3 &&
compare_diff spaces-compacted-expect out-compacted3
@@ -192,42 +194,73 @@ test_expect_success 'diff: nice functions with 
--indent-heuristic' '
compare_diff functions-compacted-expect out-compacted
 '
 
-test_expect_success 'blame: ugly spaces' '
-   git blame --no-indent-heuristic old..new -- spaces.txt >out-blame &&
-   compare_blame spaces-expect out-blame
-'
+# --- blame tests -
 
 test_expect_success 'blame: nice spaces with --indent-heuristic' '
git blame --indent-heuristic old..new -- spaces.txt 
>out-blame-compacted &&
compare_blame spaces-compacted-expect out-blame-compacted
 '
 
-test_expect_success 'blame: nice spaces with diff.indentHeuristic' '
+test_expect_success 'blame: nice spaces with diff.indentHeuristic=true' '
git -c diff.indentHeuristic=true blame old..new -- spaces.txt 
>out-blame-compacted2 &&
compare_blame spaces-compacted-expect out-blame-compacted2
 '
 
+test_expect_success 'blame: ugly spaces with --no-indent-heuristic' '
+   git blame --no-indent-heuristic old..new -- spaces.txt >out-blame &&
+   compare_blame spaces-expect out-blame
+'
+
+test_expect_success 'blame: ugly spaces with diff.indentHeuristic=false' '
+   git -c diff.indentHeuristic=false blame old..new -- spaces.txt 
>out-blame2 &&
+   compare_blame spaces-expect out-blame2
+'
+
 test_expect_success 'blame: --no-indent-heuristic overrides config' '
-   git -c diff.indentHeuristic=true blame --no-indent-heuristic old..new 
-- spaces.txt >out-blame2 &&
+   git -c diff.indentHeuristic=true blame --no-indent-heuristic old..new 
-- spaces.txt >out-blame3 &&
git blame old..new -- spaces.txt >out-blame &&
-   compare_blame spaces-expect out-blame2
+   compare_blame spaces-expect out-blame3
 '
 
+test_expect_success 'blame: --indent-heuristic overrides config' '
+   git -c diff.indentHeuristic=false blame --indent-heuristic old..new -- 
spaces.txt >out-blame-compacted3 &&
+   compare_blame spaces-compacted-expect out-blame-compacted2
+'
+
+# --- diff-tree tests -
+
 test_expect_success 'diff-tree: nice spaces with --indent-heuristic' '
git diff-tree --indent-heuristic -p old new -- spaces.txt 
>out-diff-tree-compacted &&
  

RE: [PATCH 0/1] Preserve the untracked cache across checkout, reset --hard, etc

2017-05-08 Thread David Turner
Can you actually keep the email address as my Twopensource one?  I want to make 
sure that Twitter, my employer at the time, gets credit for this work (just as 
I want to make sure that my current employer, Two Sigma, gets credit for my 
current work).

Please feel free to add Signed-off-by: David Turner  in 
case that makes tracking easier.

Thanks.

WRT the actual patch, I want to note that past me did not do a great job here.  
The tests do not correctly check that the post-checkout untracked cache is 
still valid after a checkout.  For example, let's say that previously, the 
directory foo was entirely untracked (but it contained a file bar), but after 
the checkout, there is a file foo/baz.  Does the untracked cache need to get 
updated?  

Unfortunately, the untracked cache is very unlikely to make it to the top of my 
priority list any time soon, so I won't be able to correct this test (and, if 
necessary, correct the code).But I would strongly suggest that the test be 
improved before this code is merged.

Thanks for CCing me.

> -Original Message-
> From: Christian Couder [mailto:christian.cou...@gmail.com]
> Sent: Monday, May 8, 2017 6:12 AM
> To: Johannes Schindelin 
> Cc: git ; Junio C Hamano ; Nguyễn
> Thái Ngọc Duy ; Ben Peart ;
> David Turner 
> Subject: Re: [PATCH 0/1] Preserve the untracked cache across checkout, reset 
> --
> hard, etc
> 
> (Adding Dave in Cc as it looks like he is involved.)
> 
> On Mon, May 8, 2017 at 11:41 AM, Johannes Schindelin
>  wrote:
> > I recently sent out a request for assistance, after noticing that the
> > untracked cache is simply thrown away after operations such as `git
> > checkout` or `git reset --hard`:
> >
> > http://public-inbox.org/git/alpine.DEB.2.20.1705031202470.3480@virtual
> > box/
> >
> > Duy responded with some high-level reasoning that it should be
> > possible to simply reuse the untracked cache data structure in the new
> > index, as he had a gut feeling that "we do invalidation right".
> >
> > I did not have time to back that up by a thorough analysis of the
> > code, but it turns out that it is unnecessary: Ben Peart pointed me to
> > a patch of Dave Turner's that was submitted as part of the watchman
> > series, addressing the very issue about which I was concerned.
> >
> > And I trust Dave to have validated the idea that the untracked cache
> > invalidation "is done right" even when we simply move the pointer to a
> > different index_state struct than originally.
> >
> > Seeing as the untracked cache being dropped unceremoniously when it
> > should not be dropped, in a surprising number of operations, I think
> > it is a sensible change, and important, too, and independent enough
> > from the watchman patches to merit being separated out and applied
> > pretty soon.
> >
> > So what I did was simply to drop the two lines from this patch that
> > referred to index_state fields added by Dave's watchman patch series.
> >
> > Please do not mistake this for a sign that I am disinterested in
> > watchman support, far from it... stay tuned ;-)
> >
> > Oh, and I adjusted Dave's email address. Dave, is that okay?
> >
> > As we are in a feature freeze phase, I was debating whether to send
> > out this patch now or later.
> >
> > Having thought about it for quite a bit, I am now convinced that this
> > patch fixes a bug in the untracked cache feature that is so critical
> > as to render it useless: if you
> >
> > - have to switch between branches frequently, or
> > - rebase frequently (which calls `git reset --hard`), or
> > - stash frequently (which calls `git reset --hard`),
> >
> > it is as if you had not enabled the untracked cache at all. Even
> > worse, Git will do a ton of work to recreate the untracked cache and
> > to store it as an index extension, *just* to throw the untracked away in the
> end.
> >
> >
> > David Turner (1):
> >   unpack-trees: preserve index extensions
> >
> >  cache.h   |  1 +
> >  read-cache.c  |  6 ++
> >  t/t7063-status-untracked-cache.sh | 22 ++
> >  unpack-trees.c|  1 +
> >  4 files changed, 30 insertions(+)
> >
> >
> > base-commit: 4fa66c85f11bc5a541462ca5ae3246aa0ce02e74
> > Published-As:
> > https://github.com/dscho/git/releases/tag/preserve-untracked-cache-v1
> > Fetch-It-Via: git fetch https://github.com/dscho/git
> > preserve-untracked-cache-v1
> >
> > --
> > 2.12.2.windows.2.800.gede8f145e06
> >


Enabling the diff "indent" heuristic by default

2017-05-08 Thread Marc Branchaud

On 2017-05-08 03:48 AM, Junio C Hamano wrote:


* mb/diff-default-to-indent-heuristics (2017-05-02) 4 commits
  (merged to 'next' on 2017-05-08 at 158f401a92)


I think there's a general open question about this, which is whether or 
not we should just drop the diff.indentHeuristic configuration setting 
altogether.


Peff made the point [0] that if we keep the setting then t4061 should be 
rewritten.


My instinct is to keep the setting, at least until the changed default 
has a bit of time to settle in.  So I'll re-send the topic with the 
renovated t4061.


The topic would of course change more drastically if we decide to drop 
the setting right away.



 + add--interactive: drop diff.indentHeuristic handling
 + diff: enable indent heuristic by default
 + diff: have the diff-* builtins configure diff before initializing revisions
 + diff: make the indent heuristic part of diff's basic configuration

 Make the "indent" heuristics the default in "diff" and diff.indentHeuristics


s/heuristics/heuristic/  (both places)


 configuration variable an escape hatch for those who do no want it.


s/do no/do not/


 Will cook in 'next'.


Both Peff [1] and Ævar [2] mentioned situations where enabling the 
heuristic has a small impact on them.  If/when this graduates, it's 
perhaps worth adding a backward-compatibility note that the default 
patch IDs are changing.  Maybe something like:


The diff "indent" heuristic is now enabled by default.  This changes the 
patch IDs calculated by git-patch-id and used by git-cherry, which could 
affect patch-based workflows that rely on previously-computed patch IDs. 
 The heuristic can be disabled by setting diff.indentHeuristic to false.


[0] 
https://public-inbox.org/git/20170501222051.svylxazjwnot3...@sigill.intra.peff.net/


[1] 
https://public-inbox.org/git/20170428220450.olqitnuwhrxzg...@sigill.intra.peff.net/


[2] 
https://public-inbox.org/git/cacbzzx5f81hkcjrjtdyxznmvuef9z_ecs+0svk2xpbwxudg...@mail.gmail.com/


M.



Re: [PATCH 0/3] Port git-add--interactive.perl:status_cmd to C

2017-05-08 Thread Johannes Schindelin
Hi Daniel,

On Fri, 5 May 2017, Daniel Ferreira (theiostream) wrote:

> On Fri, May 5, 2017 at 7:38 PM, Johannes Schindelin
>  wrote:
>
> > But maybe you want to keep the naming a little more consistent with
> > the Perl script, e.g. instead of calling the function
> > `print_modified()` call it already `list()` (and rename it later to
> > `list_and_choose()` once you have taught it to ask for a choice)?
> 
> Actually, I named it print_modified() because I anticipated there
> would be no list_and_choose() equivalent in C. I don't think the
> builtin should have the interactive menu + modified list + untracked
> list being handled by one function. Rather, I think a saner way to go
> on with it would be to create functions like print_untracked();
> choose_from_input(); print_menu() etc.

Okay. But maybe then the `selected` field should not (at least not yet) be
a part of this patch.

> This is still pure speculation from what I've written until now and from
> the roadmap I have in my head (I do not know how writing code from now
> on will actually be), but I have a hunch that list_and_choose() is
> already convoluted enough in Perl; in C it might become a monster.

True.

To be honest, I would love for this patch to become part of a
"work-in-progress" patch series that lays out the plan a little bit more
concretely (I typically implement functions with a single `die("TODO")` in
the function body in such patch series). This patch series would of course
not be expected to pass the test suite yet, but it would allow other
developers (e.g. myself) to jump in and complete individual patches...

There would be real advantages in such a patch series:

- it could be worked on in parallel (coordination required, of course);
  may be a ton of fun, and

- the overall design could be iterated according with the needs of later
  patches in the series.

> > Thank you for this pleasant read!
> 
> Thank *you* for the quick and thorough review :)

Heh... I would not call it "quick"... it took a long time. But not as much
time as you spent crafting it!

Thank you,
Johannes


Re: [PATCH 2/3] add--interactive: add builtin helper for interactive add

2017-05-08 Thread Johannes Schindelin
Hi Daniel,

On Fri, 5 May 2017, Daniel Ferreira (theiostream) wrote:

> On Fri, May 5, 2017 at 7:30 PM, Johannes Schindelin
>  wrote:
> >> +static int git_add_interactive_config(const char *var,
> >
> > Not git_add_interactive__helper_config()? ;-)
> 
> I don't get if you mean this ironically (because of the verbosity) or
> if you do think this would be a good name ;P

Hehe. I meant it tongue-in-cheek.

So let me try again in my endeavor to provide *constructive* criticism,
i.e. not only pointing out what I think is suboptimal, but *also* how to
improve it in my opinion. How about add_i_config() or git_add_config()?

> >> + for (i = 0; i < q->nr; i++) {
> >> + struct diff_filepair *p;
> >> + p = q->queue[i];
> >> + diff_flush_stat(p, options, );
> >> + }
> >> +
> >> + for (i = 0; i < stat.nr; i++) {
> >> + int file_index = s->file_count;
> >> + for (j = 0; j < s->file_count; j++) {
> >> + if (!strcmp(s->files[j].path, stat.files[i]->name)) {
> >> + file_index = j;
> >> + break;
> >> + }
> >> + }
> >
> > So basically, this is looking up in a list whether we saw the file in
> > question already, and the reason we have to do that is that we run the
> > entire shebang twice, once with the worktree and once with the index.
> >
> > I wonder whether it would not make sense to switch away s->files from a
> > list to a hashmap.
> > [...]
> > BTW in the first pass, we pretty much know that we only get unique names,
> > so the entire lookup is unnecessary and will just increase the time
> > complexity from O(n) to O(n^2). So let's avoid that.
> >
> > By moving to a hashmap, you can even get the second phase down to an
> > expected O(n).
> 
> How would you go about implementing that hashmap (i.e. what should be
> the hash)? Does Git have any interface for it, or is there any example
> I can look after in the codebase?

The example Ævar pointed to seems to be pretty good (Michael Haggerty's
commits are in general excellent examples to follow):

https://github.com/git-for-windows/git/commit/7d4558c462f0

In this case, we can even fold the added/deleted part into the struct:

#include "hashmap.h"

...

struct file_stats {
struct hashmap_entry entry;
struct {
uintmax_t added, deleted;
} index, worktree;
char name[FLEX_ARRAY];
}

...
for (i = 0; i < stat.nr; i++) {
struct file_stats *stats;
const char *name = stat.files[i]->name;
unsigned int hash = strhash(name);

stats = s->phase == INDEX ? NULL :
hashmap_get_from_hash(, hash, name);
if (!stats) {
FLEX_ALLOC_STR(stats, name, name);
hashmap_entry_init(stats, hash);
stats->index.added = stats->index.deleted = 0;
stats->worktree.added =
stats->worktree.deleted = 0;
hashmap_add(, stats);
}

if (s->phase == INDEX) {
stats->index.added = stat.files[i]->added;
stats->index.deleted = stat.files[i]->deleted;
} else {
stats->worktree.added = stat.files[i]->added;
stats->worktree.deleted = 
stat.files[i]->deleted;
}
}

But maybe it should simultaneously put those added entries into a growing
array, as they arrive sorted and we will want to output the entries
sorted, too.

Oh, this reminds me: you are reading the list in two phases, and each time
the entries arrive sorted, but the second time we still may append new
entries.

Maybe we need to sort the entries afterwards?

> > Apart from using PATH_MAX bytes for most likely only short names: [...]
> 
> If not PATH_MAX, what should I go for? Make it a strbuf? I tend to
> believe keeping that on the stack would be simpler and more optimal.

On the stack, no question. But I was talking about struct
add_interactive_file_status, which is allocated via malloc(), not
allocated on the stack/heap.

> > Now that I read this and remember that only WORKTREE and INDEX are
> > handled in the callback function: is there actually a use for the NONE
> > enum value?  I.e. is current_mode read out in any other context than
> > the callback function? If there is no other read, then the NONE enum
> > value is just confusing.
> 
> I just preferred to have a declared non-handled value than leave
> something undefined 

Re: [PATCH 2/3] add--interactive: add builtin helper for interactive add

2017-05-08 Thread Johannes Schindelin
Hi,

On Sat, 6 May 2017, Ævar Arnfjörð Bjarmason wrote:

> On Sat, May 6, 2017 at 12:30 AM, Johannes Schindelin
>  wrote:
> > Hi Daniel,
> >
> >
> > On Fri, 5 May 2017, Daniel Ferreira wrote:
> >> +static void print_modified(void)
> >> +{
> >> + int i;
> >> + struct add_interactive_status s;
> >> + const char *modified_fmt = _("%12s %12s %s");
> >
> > We cannot really translate that...
> 
> He copied this from the *.perl code:
> 
> # TRANSLATORS: you can adjust this to align "git add -i" status menu
> my $status_fmt = __('%12s %12s %s');
> 
> And one translation at least makes use of that (and probably others should).
> 
> But porting the /* TRANSLATORS: ... */ comment over is missing, and
> should be added.

Ah, that explains it. It still is not really translateable, but with the
comment, even *I* understand why it is marked for translation.

Thanks,
Dscho

Re: [PATCH v3] send-email: --batch-size to work around some SMTP server limit

2017-05-08 Thread 赵小强


> 在 2017年5月8日,12:11,Junio C Hamano  写道:
> 
> Two suggestions.
> 
> (1) I do not think $smtp is always valid when we come here; it is
> unsafe to unconditionally say $smtp->quit like this patch does.
> 
>$smtp->quit if defined $smtp;
> 
> may help codepaths like $dry_run and also the case where
> $smtp_server is the absolute path to a local program.
> 

Hmm,missed this code path.

> (2) You are setting $auth to zero to force re-authentication to
> happen.  It would be more consistent to the state of $auth that
> is not-yet-used to "undef $auth;" here instead.  After all, the
> variable starts its life in an undefined state.
> 
> 
> So all in all
> 
>$smtp->quit if defined $smtp;
>undef $smtp;
>undef $auth;
> 
> perhaps?
> 
> This change of course forces re-authentication every N messages,
> which may not hurt those who use some form of credential helper, but
> that may be something we want to mention in the log message.

Yes, it' s better to undef $auth here. I will update the commit message next 
version.

Thank you very much for your helpful suggestions !

> 
>> +sleep($relogin_delay);
>> +}
>> }
> 
> Thanks.




Re: [GSoC] Project Selected: Incremental rewrite of git-submodules

2017-05-08 Thread Johannes Schindelin
Hi,

On Mon, 8 May 2017, Prathamesh Chavan wrote:

> I am Prathamesh Chavan, an undergraduate student from the department
> of Computer Science and Engineering, at Indian Institue of Technology,
> Kharagpur. I applied for Google Summer of Code 2017 and my project
> "Incremental rewrite of git-submodules" has been selected.
> This project will be done under the guidance of the mentors:
> Stefan Beller and Christian Couder.
> 
> [...]

Welcome! Glad to see you here, and looking forward to your contributions.

Ciao,
Johannes


Re: git and the Clang Static Analyzer

2017-05-08 Thread Johannes Schindelin
Hi Dilyan,


On Sun, 7 May 2017, Дилян Палаузов wrote:

> I compiled git/master

... which advances from time to time, so you definitely want to include a
more informative data point here, e.g. 4fa66c85f11 (Git 2.13-rc2,
2017-05-04) ...

> using the clang 4.0 static analyzer with
> 
> scan-build ./configure --with-libpcre --with-openssl
> scan-build make
> 
> and here are the results:
>   https://mail.aegee.org/dpa/scan-build-git-4fa66c85f11/
> 
> Please note, that the information is only about what gets actually compiled,
> code disabled by #if .. #endif is not considered (e.g. when determining
> whether a variable assignment is useless).

So you already know that the report is specific to your setup. It may make
a lot of sense to actually state what your setup is, i.e. Operating
System, installed libraries (and their respective versions), CPU, etc.

> There are probably false-positives.

Probably. So why don't you give it a try and look through the report? Then
summarize your findings here. That would definitely find a warm welcome, I
would expect.

> However in case of e.g. builtin/notes.c:1018, builtin/reset.c:294 or
> fast-import.c:2057 I consider the hints as justified.

Okay. And those hint are...?

Remember: the easier you make it for your readers to follow your thoughts,
the more readers will try to do that.

> This is for your information,  I wouldn't have a problem if you ignore
> the analysis.

Thank you for running the scan.

However, I would like to encourage you to do better than just call Clang
and slap the report verbatim, without much in the way of a manual
analysis, onto some website.

In fact, I fear that doing it this way is rather counter-productive.

Instead, you could try to organize the report a lot better (see the many
mails about Coverity in the past):

- we already *know* that static analyzers have tons of problems with the
  way we specify "FLEX_ARRAYS": we define structs to have a 0-item or
  1-item array as their last field, then use a variable-length malloc() to
  tailor the allocated data to the needs of a variable-length array (most
  often: string). To have these issues in one category would be very
  helpful, as we could ignore those parts of the report easier.

- we also already know quite well that Git's source code often abuses
  exit() for a garbage collector on short-running processes: In many
  cmd_*() functions, there are "memory leaks" that are intentional (read:
  lazy). Again, it would be helpful to be able to fade out those reports.

- Likewise, our test helpers abide by a lot less strict rules, as we only
  use them in the test suite (and therefore it is not *really* necessary
  to, say, validate the input passed as command-line parameters). Here
  again, it would be very helpful to put those into a separate category
  that can be selectively hidden.

- there have been a number of patches floating about to fix one or more of
  the issues reported by Coverity. These patches may not yet have made it
  into `master`, but at least some of them are in `pu`. A careful analysis
  which issues reported by Clang would be addressed by `pu` would also be
  quite welcome.

Also: to make it *really* useful for other developers, it would be a good
idea for you to try your hand at patching the .travis.yml file in such a
way that the static analysis is performed as part of the Continuous
Testing, and to contribute said patch. That way, other contributors could
not only see how it is done (and copy-paste the commands, including `apt
install` calls insofar necessary), but also to see the reports on a
trusted website (Travis').

I really would like to believe that you meant to be helpful in sending
this mail. But the style does come over as "throwing a couple of scraps to
the dogs". I do not believe that is what you wanted, and I think you can
turn that impression around by at least some of the things I mentioned
above.

Ciao,
Johannes

Re: [PATCH 0/5] Abide by our own rules regarding line endings

2017-05-08 Thread Johannes Schindelin
Hi Junio,

On Sun, 7 May 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Let me repeat myself:
> 
> Don't.

I had to. You did not understand me.

> Instead, read through what you are responding to the end before start
> typing a byte.  In case you didn't do that, in the end I agree with the
> direction of the series ;-).

And I am still convinced that you do not understand me. At least on the
point that I tried to repeat in a different way, in an attempt to help you
to understand me.

What makes me so convinced that you do not understand me? The part of your
mail that you assumed I did not read. (Disclaimer: I did read it.)

Ciao,
Dscho


Re: vger not relaying some of Junio's messages today?

2017-05-08 Thread Johannes Schindelin
Hi Ævar,

On Sat, 6 May 2017, Ævar Arnfjörð Bjarmason wrote:

> I have one [script] to git am a patch from a msgid, thought I should
> write something to handle a series in some DWIM fashion (e.g. apply the
> latest continuous sequence of patches matching --author) but figured
> that someone's probably wrote this already & I don't need to hack it up
> myself...

You probably missed my previous mails mentioning

https://github.com/git-for-windows/build-extra/blob/master/apply-from-public-inbox.sh

You can use this script to apply single patches (identified by their
Message-ID), and patch series (identified by their cover letter's
Message-ID).

As I mentioned at the Contributors' Summit at GitMerge 2017: I would
*love* to collaborate on tools that make any part of the
contribution/review process less cumbersome than it is right now.

Ciao,
Dscho

Re: [PATCH 0/1] Preserve the untracked cache across checkout, reset --hard, etc

2017-05-08 Thread Christian Couder
(Adding Dave in Cc as it looks like he is involved.)

On Mon, May 8, 2017 at 11:41 AM, Johannes Schindelin
 wrote:
> I recently sent out a request for assistance, after noticing that the
> untracked cache is simply thrown away after operations such as
> `git checkout` or `git reset --hard`:
>
> http://public-inbox.org/git/alpine.DEB.2.20.1705031202470.3480@virtualbox/
>
> Duy responded with some high-level reasoning that it should be possible
> to simply reuse the untracked cache data structure in the new index, as
> he had a gut feeling that "we do invalidation right".
>
> I did not have time to back that up by a thorough analysis of the code,
> but it turns out that it is unnecessary: Ben Peart pointed me to a patch
> of Dave Turner's that was submitted as part of the watchman series,
> addressing the very issue about which I was concerned.
>
> And I trust Dave to have validated the idea that the untracked cache
> invalidation "is done right" even when we simply move the pointer to a
> different index_state struct than originally.
>
> Seeing as the untracked cache being dropped unceremoniously when it
> should not be dropped, in a surprising number of operations, I think it
> is a sensible change, and important, too, and independent enough from
> the watchman patches to merit being separated out and applied pretty
> soon.
>
> So what I did was simply to drop the two lines from this patch that
> referred to index_state fields added by Dave's watchman patch series.
>
> Please do not mistake this for a sign that I am disinterested in
> watchman support, far from it... stay tuned ;-)
>
> Oh, and I adjusted Dave's email address. Dave, is that okay?
>
> As we are in a feature freeze phase, I was debating whether to send out
> this patch now or later.
>
> Having thought about it for quite a bit, I am now convinced that this
> patch fixes a bug in the untracked cache feature that is so critical as
> to render it useless: if you
>
> - have to switch between branches frequently, or
> - rebase frequently (which calls `git reset --hard`), or
> - stash frequently (which calls `git reset --hard`),
>
> it is as if you had not enabled the untracked cache at all. Even worse,
> Git will do a ton of work to recreate the untracked cache and to store
> it as an index extension, *just* to throw the untracked away in the end.
>
>
> David Turner (1):
>   unpack-trees: preserve index extensions
>
>  cache.h   |  1 +
>  read-cache.c  |  6 ++
>  t/t7063-status-untracked-cache.sh | 22 ++
>  unpack-trees.c|  1 +
>  4 files changed, 30 insertions(+)
>
>
> base-commit: 4fa66c85f11bc5a541462ca5ae3246aa0ce02e74
> Published-As: 
> https://github.com/dscho/git/releases/tag/preserve-untracked-cache-v1
> Fetch-It-Via: git fetch https://github.com/dscho/git 
> preserve-untracked-cache-v1
>
> --
> 2.12.2.windows.2.800.gede8f145e06
>


Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Christian Couder
On Fri, Apr 14, 2017 at 10:32 PM,   wrote:
> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 33a51c9..677e15a 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -689,4 +689,17 @@ test_expect_success 'bogus head does not fallback to all 
> heads' '
> ! grep $blob out
>  '
>
> +test_expect_success 'detect corrupt index file in fsck' '
> +   cp .git/index .git/index.backup &&
> +   test_when_finished "mv .git/index.backup .git/index" &&
> +   echo  > &&
> +   git add  &&
> +   sed -e "s///" .git/index >.git/index.yyy &&
> +   mv .git/index.yyy .git/index &&
> +   # Confirm that fsck detects invalid checksum
> +   test_must_fail git fsck --cache &&
> +   # Confirm that status no longer complains about invalid checksum
> +   git status
> +'

This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is
set on my Linux machine.

Also it looks like you sent a v8 of this patch series with a different
test, but what is in master looks like the above test instead of the
test in your v8.


[PATCH 1/1] unpack-trees: preserve index extensions

2017-05-08 Thread Johannes Schindelin
From: David Turner 

Make git checkout (and other unpack_tree operations) preserve the
untracked cache. This is valuable for two reasons:

1. Often, an unpack_tree operation will not touch large parts of the
working tree, and thus most of the untracked cache will continue to be
valid.

2. Even if the untracked cache were entirely invalidated by such an
operation, the user has signaled their intention to have such a cache,
and we don't want to throw it away.

[jes: backed out the watchman-specific parts]

Signed-off-by: David Turner 
Signed-off-by: Ben Peart 
Signed-off-by: Johannes Schindelin 
---
 cache.h   |  1 +
 read-cache.c  |  6 ++
 t/t7063-status-untracked-cache.sh | 22 ++
 unpack-trees.c|  1 +
 4 files changed, 30 insertions(+)

diff --git a/cache.h b/cache.h
index e1f0e182ad0..d41336dfd5d 100644
--- a/cache.h
+++ b/cache.h
@@ -597,6 +597,7 @@ extern int read_index_unmerged(struct index_state *);
 #define CLOSE_LOCK (1 << 1)
 extern int write_locked_index(struct index_state *, struct lock_file *lock, 
unsigned flags);
 extern int discard_index(struct index_state *);
+extern void move_index_extensions(struct index_state *dst, struct index_state 
*src);
 extern int unmerged_index(const struct index_state *);
 extern int verify_path(const char *path);
 extern int strcmp_offset(const char *s1, const char *s2, size_t *first_change);
diff --git a/read-cache.c b/read-cache.c
index 0d0081a11b8..79827a4d710 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2628,3 +2628,9 @@ void stat_validity_update(struct stat_validity *sv, int 
fd)
fill_stat_data(sv->sd, );
}
 }
+
+void move_index_extensions(struct index_state *dst, struct index_state *src)
+{
+   dst->untracked = src->untracked;
+   src->untracked = NULL;
+}
diff --git a/t/t7063-status-untracked-cache.sh 
b/t/t7063-status-untracked-cache.sh
index 0667bd9dd3e..e5fb892f957 100755
--- a/t/t7063-status-untracked-cache.sh
+++ b/t/t7063-status-untracked-cache.sh
@@ -661,4 +661,26 @@ test_expect_success 'test ident field is working' '
test_i18ncmp ../expect ../err
 '
 
+test_expect_success 'untracked cache survives a checkout' '
+   git commit --allow-empty -m empty &&
+   test-dump-untracked-cache >../before &&
+   test_when_finished  "git checkout master" &&
+   git checkout -b other_branch &&
+   test-dump-untracked-cache >../after &&
+   test_cmp ../before ../after &&
+   test_commit test &&
+   test-dump-untracked-cache >../before &&
+   git checkout master &&
+   test-dump-untracked-cache >../after &&
+   test_cmp ../before ../after
+'
+
+test_expect_success 'untracked cache survives a commit' '
+   test-dump-untracked-cache >../before &&
+   git add done/two &&
+   git commit -m commit &&
+   test-dump-untracked-cache >../after &&
+   test_cmp ../before ../after
+'
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index aa15111fefc..17117bd0fda 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1391,6 +1391,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
  WRITE_TREE_SILENT |
  WRITE_TREE_REPAIR);
}
+   move_index_extensions(>result, o->dst_index);
discard_index(o->dst_index);
*o->dst_index = o->result;
} else {
-- 
2.12.2.windows.2.800.gede8f145e06


[PATCH 0/1] Preserve the untracked cache across checkout, reset --hard, etc

2017-05-08 Thread Johannes Schindelin
I recently sent out a request for assistance, after noticing that the
untracked cache is simply thrown away after operations such as
`git checkout` or `git reset --hard`:

http://public-inbox.org/git/alpine.DEB.2.20.1705031202470.3480@virtualbox/

Duy responded with some high-level reasoning that it should be possible
to simply reuse the untracked cache data structure in the new index, as
he had a gut feeling that "we do invalidation right".

I did not have time to back that up by a thorough analysis of the code,
but it turns out that it is unnecessary: Ben Peart pointed me to a patch
of Dave Turner's that was submitted as part of the watchman series,
addressing the very issue about which I was concerned.

And I trust Dave to have validated the idea that the untracked cache
invalidation "is done right" even when we simply move the pointer to a
different index_state struct than originally.

Seeing as the untracked cache being dropped unceremoniously when it
should not be dropped, in a surprising number of operations, I think it
is a sensible change, and important, too, and independent enough from
the watchman patches to merit being separated out and applied pretty
soon.

So what I did was simply to drop the two lines from this patch that
referred to index_state fields added by Dave's watchman patch series.

Please do not mistake this for a sign that I am disinterested in
watchman support, far from it... stay tuned ;-)

Oh, and I adjusted Dave's email address. Dave, is that okay?

As we are in a feature freeze phase, I was debating whether to send out
this patch now or later.

Having thought about it for quite a bit, I am now convinced that this
patch fixes a bug in the untracked cache feature that is so critical as
to render it useless: if you

- have to switch between branches frequently, or
- rebase frequently (which calls `git reset --hard`), or
- stash frequently (which calls `git reset --hard`),

it is as if you had not enabled the untracked cache at all. Even worse,
Git will do a ton of work to recreate the untracked cache and to store
it as an index extension, *just* to throw the untracked away in the end.


David Turner (1):
  unpack-trees: preserve index extensions

 cache.h   |  1 +
 read-cache.c  |  6 ++
 t/t7063-status-untracked-cache.sh | 22 ++
 unpack-trees.c|  1 +
 4 files changed, 30 insertions(+)


base-commit: 4fa66c85f11bc5a541462ca5ae3246aa0ce02e74
Published-As: 
https://github.com/dscho/git/releases/tag/preserve-untracked-cache-v1
Fetch-It-Via: git fetch https://github.com/dscho/git preserve-untracked-cache-v1

-- 
2.12.2.windows.2.800.gede8f145e06



What's cooking in git.git (May 2017, #02; Mon, 8)

2017-05-08 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ja/i18n-cleanup (2017-05-01) 2 commits
  (merged to 'next' on 2017-04-30 at 8002e53820)
 + i18n: read-cache: typofix
 + i18n: remove i18n from tag reflog message


* rg/a-the-typo (2017-05-01) 1 commit
  (merged to 'next' on 2017-04-30 at a648cf2961)
 + fix minor typos

 Typofix.


* rg/doc-pull-typofix (2017-05-01) 1 commit
  (merged to 'next' on 2017-04-30 at c2edb4d813)
 + doc: git-pull.txt use US spelling, fix minor typo


* rg/doc-submittingpatches-wordfix (2017-05-01) 1 commit
  (merged to 'next' on 2017-04-30 at 4065036588)
 + doc: update SubmittingPatches


* sr/hooks-cwd-doc (2017-05-01) 1 commit
  (merged to 'next' on 2017-04-30 at 063dd5cb02)
 + githooks.txt: clarify push hooks are always executed in $GIT_DIR

--
[New Topics]

* jn/clone-add-empty-config-from-command-line (2017-05-02) 1 commit
 - clone: handle empty config values in -c

 "git clone --config var=val" is a way to populate the
 per-repository configuration file of the new repository, but it did
 not work well when val is an empty string.  This has been fixed.

 Will merge to 'next'.


* jn/credential-doc-on-clear (2017-05-02) 1 commit
 - credential doc: make multiple-helper behavior more prominent

 Doc update.

 Will merge to 'next'.


* sb/checkout-recurse-submodules (2017-05-04) 3 commits
 - submodule: properly recurse for read-tree and checkout
 - submodule: avoid auto-discovery in new working tree manipulator code
 - submodule_move_head: reuse child_process structure for futher commands

 "git checkout --recurse-submodules" did not quite work with a
 submodule that itself has submodules. 

 Will merge to 'next'.


* ab/aix-needs-compat-regex (2017-05-04) 1 commit
 - config.mak.uname: set NO_REGEX=NeedsStartEnd on AIX

 Build fix.

 Will merge to 'next'.


* ja/do-not-ask-needless-questions (2017-05-04) 3 commits
 - git-filter-branch: make the error msg when missing branch more open
 - read-tree -m: make error message for merging 0 trees less smart aleck
 - usability: don't ask questions if no reply is required

 Git sometimes gives an advice in a rhetorical question that does
 not require an answer, which can confuse new users and non native
 speakers.  Attempt to rephrase them.

 I lost track of the discussion on this topic.  Did we decide that
 this is not a good idea?


* ab/doc-replace-gmane-links (2017-05-08) 2 commits
 - SQAUSH???
 - doc: replace a couple of broken gmane links

 The Web interface to gmane news archive is long gone, even though
 the articles are still accessible via NTTP.  Replace the links with
 ones to public-inbox.org.  Because their message identification is
 based on the actual message-id, it is likely that it will be easier
 to migrate away from it if/when necessary.


* ab/fix-poison-tests (2017-05-08) 3 commits
 - travis-ci: add job to run tests with GETTEXT_POISON
 - travis-ci: setup "prove cache" in "script" step
 - tests: fix tests broken under GETTEXT_POISON=YesPlease

 Update tests to pass under GETTEXT_POISON (a mechanism to ensure
 that output strings that should not be translated are not
 translated by mistake), and tell TravisCI to run them.

 Will merge to 'next' after a few rounds on 'pu'.


* bw/dir-c-stops-relying-on-the-index (2017-05-06) 14 commits
 - dir: convert fill_directory to take an index
 - dir: convert read_directory to take an index
 - dir: convert read_directory_recursive to take an index
 - dir: convert open_cached_dir to take an index
 - dir: convert is_excluded to take an index
 - dir: convert prep_exclude to take an index
 - dir: convert add_excludes to take an index
 - dir: convert is_excluded_from_list to take an index
 - dir: convert last_exclude_matching_from_list to take an index
 - dir: convert dir_add* to take an index
 - dir: convert get_dtype to take index
 - dir: convert directory_exists_in_index to take index
 - dir: convert read_skip_worktree_file_from_index to take an index
 - dir: stop using the index compatibility macros

 API update.

 Will merge to 'next'.


* jk/diff-submodule-diff-inline (2017-05-08) 1 commit
 - diff: recurse into nested submodules for inline diff

 "git diff --submodule=diff" didn't recurse into nested submodules.


* jk/disable-pack-reuse-when-broken (2017-05-08) 1 commit
 - pack-objects: disable pack reuse for object-selection options

 "pack-objects" can stream a slice of an existing packfile out when
 the pack bitmap can tell that the reachable objects are all needed
 in the 

Re: git clone -b

2017-05-08 Thread Jeff King
On Mon, May 08, 2017 at 08:30:49AM +0200, Дилян Палаузов wrote:

> why do these work:
> 
> git clone --bare -b 3.5 https://github.com/python/cpython A
> git clone -b 3.6 A B

>From the description of --bare in "git help clone":

  [...]the branch heads at the remote are copied directly to
  corresponding local branch heads, without mapping them to
  refs/remotes/origin/. When this option is used, neither
  remote-tracking branches nor the related configuration variables are
  created.

So because the upstream has a refs/heads/3.6 branch, so too does the
bare clone "A". And thus when you clone it asking for that branch, Git
can find it.

But in your non-bare example:

> git clone -b 3.5 https://github.com/python/cpython C
> 
> but these not:
> 
> git clone -b 3.6 C D
> git clone --no-local -b 3.6 C D

In the non-bare clone C, there is no local 3.6 branch. You only have the
remote-tracking branch refs/remotes/origin/3.6. So when you try to clone
from it, Git can't find the branch:

  $ git clone -b 3.6 C D
  Cloning into 'D'...
  fatal: Remote branch 3.6 not found in upstream origin
  fatal: The remote end hung up unexpectedly

It works if you create a local branch based on upstream's branch:

  $ git -C C checkout 3.6
  Branch 3.6 set up to track remote branch 3.6 from origin.
  Switched to a new branch '3.6'

  $ git clone -b 3.6 C D
  Cloning into 'D'...
  done.

-Peff


  1   2   >