Re: [PATCH 0/2] routines to generate JSON data

2018-03-19 Thread Jeff King
On Sat, Mar 17, 2018 at 12:00:26AM +0100, Ævar Arnfjörð Bjarmason wrote:

> 
> On Fri, Mar 16 2018, Jeff King jotted:
> 
> > I really like the idea of being able to send our machine-readable output
> > in some "standard" syntax for which people may already have parsers. But
> > one big hangup with JSON is that it assumes all strings are UTF-8.
> 
> FWIW It's not UTF-8 but "Unicode characters", i.e. any Unicode encoding
> is valid, not that it changes anything you're pointing out, but people
> on Win32 could use UTF-16 as-is if their filenames were in that format.

But AIUI, non-UTF8 has to come as "\u" escapes, right? That at least
gives us an "out" for exotic characters, but I don't think we can just
blindly dump pathnames into quoted strings, can we?

> > Some possible solutions I can think of:
> >
> >   1. Ignore the UTF-8 requirement, making a JSON-like output (which I
> >  think is what your patches do). I'm not sure what problems this
> >  might cause on the parsing side.
> 
> Maybe some JSON parsers are more permissive, but they'll commonly just
> die on non-Unicode (usually UTF-8) input, e.g.:
> 
> $ (echo -n '{"str ": "'; head -c 3 /dev/urandom ; echo -n '"}') | perl 
> -0666 -MJSON::XS -wE 'say decode_json(<>)->{str}'
> malformed UTF-8 character in JSON string, at character offset 10 (before 
> "\x{fffd}e\x{fffd}"}") at -e line 1, <> chunk 1.

OK, that's about what I expected.

> >   2. Specially encode non-UTF-8 bits. I'm not familiar enough with JSON
> >  to know the options here, but my understanding is that numeric
> >  escapes are just for inserting unicode code points. _Can_ you
> >  actually transport arbitrary binary data across JSON without
> >  base64-encoding it (yech)?
> 
> There's no way to transfer binary data in JSON without it being shoved
> into a UTF-8 encoding, so you'd need to know on the other side that
> such-and-such a field has binary in it, i.e. you'll need to invent your
> own schema.

Yuck. That's what I was afraid of. Is there any kind of standard scheme
here? It seems like we lose all of the benefits of JSON if the receiver
has to know whether and when to de-base64 (or whatever) our data.

> I think for git's use-case we're probably best off with JSON. It's going
> to work almost all of the time, and when it doesn't it's going to be on
> someone's weird non-UTF-8 repo, and those people are probably used to
> dealing with crap because of that anyway and can just manually decode
> their thing after it gets double-encoded.

That sounds a bit hand-wavy. While I agree that anybody using non-utf8
at this point is slightly insane, Git _does_ actually work with
arbitrary encodings in things like pathnames. It just seems kind of lame
to settle on a new universal encoding format for output that's actually
less capable than the current output.

> That sucks, but given that we'll be using this either for just ASCII
> (telemetry) or UTF-8 most of the time, and that realistically other
> formats either suck more or aren't nearly as ubiquitous...

I'd hoped to be able to output something like "git status" in JSON,
which is inherently going to deal with user paths.

-Peff


Re: [PATCH 0/2] routines to generate JSON data

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 06:19:26AM -0400, Jeff Hostetler wrote:

> > To make the above work, I think you'd have to store a little more state.
> > E.g., the "array_append" functions check "out->len" to see if they need
> > to add a separating comma. That wouldn't work if we might be part of a
> > nested array. So I think you'd need a context struct like:
> > 
> >struct json_writer {
> >  int first_item;
> >  struct strbuf out;
> >};
> >#define JSON_WRITER_INIT { 1, STRBUF_INIT }
> > 
> > to store the state and the output. As a bonus, you could also use it to
> > store some other sanity checks (e.g., keep a "depth" counter and BUG()
> > when somebody tries to access the finished strbuf with a hanging-open
> > object or array).
> 
> Yeah, I thought about that, but I think it gets more complex than that.
> I'd need a stack of "first_item" values.  Or maybe the _begin() needs to
> increment a depth and set first_item and the _end() needs to always
> unset first_item.  I'll look at this gain.

I think you may be able to get by with just unsetting first_item for any
"end". Because as you "pop" to whatever data structure is holding
whatever has ended, you know it's no longer the first item (whatever
just ended was there before it).

I admit I haven't thought too hard on it, though, so maybe I'm missing
something.

> The thing I liked about the bottom-up construction is that it is easier
> to collect multiple sets in parallel and combine them during the final
> roll-up.  With the in-line nesting, you're tempted to try to construct
> the resulting JSON in a single series and that may not fit what the code
> is trying to do.  For example, if I wanted to collect an array of error
> messages as they are generated and an array of argv arrays and any alias
> expansions, then put together a final JSON string containing them and
> the final exit code, I'd need to build it in parts.  I can build these
> parts in pieces of JSON and combine them at the end -- or build up other
> similar data structures (string arrays, lists, or whatever) and then
> have a JSON conversion step.  But we can make it work both ways, I just
> wanted to keep it simpler.

Yeah, I agree that kind of bottom-up construction would be nice for some
cases. I'm mostly worried about inefficiency copying the strings over
and over as we build up the final output.  Maybe that's premature
worrying, though.

If the first_item thing isn't too painful, then it might be nice to have
both approaches available.

> > In general I'd really prefer to keep the shell script as the driver for
> > the tests, and have t/helper programs just be conduits. E.g., something
> > like:
> > 
> >cat >expect <<-\EOF &&
> >{"key": "value", "foo": 42}
> >EOF
> >test-json-writer >actual \
> >  object_begin \
> >  object_append_string key value \
> >  object_append_int foo 42 \
> >  object_end &&
> >test_cmp expect actual
> > 
> > It's a bit tedious (though fairly mechanical) to expose the API in this
> > way, but it makes it much easier to debug, modify, or add tests later
> > on (for example, I had to modify the C program to find out that my
> > append example above wouldn't work).
> 
> Yeah, I wasn't sure if such a simple api required exposing all that
> machinery to the shell or not.  And the api is fairly self-contained
> and not depending on a lot of disk/repo setup or anything, so my tests
> would be essentially static WRT everything else.
> 
> With my t0019 script you should have been able use -x -v to see what
> was failing.

I was able to run the test-helper directly. The tricky thing is that I
had to write new C code to test my theory about how the API worked.
Admittedly that's not something most people would do regularly, but I
often seem to end up doing that kind of probing and debugging. Many
times I've found the more generic t/helper programs useful.

I also wonder if various parts of the system embrace JSON, if we'd want
to have a tool for generating it as part of other tests (e.g., to create
"expect" files).

-Peff


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-19 Thread Jeff King
On Sun, Mar 18, 2018 at 04:55:25PM +0100, Duy Nguyen wrote:

> On Sun, Mar 18, 2018 at 9:18 AM, Nguyễn Thái Ngọc Duy  
> wrote:
> > +ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter 
> > clang4,$(COMPILER_FEATURES))),)
> > +CFLAGS += -Wextra
> 
> Another thing we can add here is -Og instead of standard -O2 (or -O0
> in my build), which is supported since gcc 4.8. clang seems to support
> it too (mapped to -O1 at least for clang5) but I don't know what
> version added that flag.

I don't know, that feels kind of weird to me. I thought DEVELOPER was
supposed to mean "ratchet up the linting, I want to know if I've broken
something".

But tweaking -O is about "I am in an edit-compile-debug cycle".
Sometimes you are and sometimes you aren't, but you'd presumably want to
have extra warnings regardless (especially because some warnings only
trigger under particular optimization settings).

Personally, I default to -O0, but then crank up to -O2 for performance
testing or for my daily-driver builds. But I _always_ have as many
warnings enabled as possible[1].

-Peff

[1] I do have some pretty horrific magic to turn on -Werror when I'm
visiting a "historical" commit, such as during a bisect.


Re: Why does pack-objects use so much memory on incremental packing?

2018-03-19 Thread Jeff King
On Sat, Mar 17, 2018 at 11:05:59PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Splitting this off into its own thread. Aside from the improvements in
> your repack memory reduction (20180317141033.21545-1-pclo...@gmail.com)
> and gc config (20180316192745.19557-1-pclo...@gmail.com) series's I'm
> wondering why repack takes so much memory to incrementally repack new
> stuff when you leave out the base pack.

I think it's a combination of a few issues:

 1. We do a complete history traversal, and then cull out objects which
our filters reject (e.g., things in a .keep pack). So you pay for
all of the "struct object", along with the obj_hash table to look
them up.

In my measurements of just "git rev-list --objects --all", that's
about 25MB for git.git. Plus a few misc things (pending object
structs for the traversal, etc).

 2. The delta-base cache used for the traversal is a fixed size. So
that's going to be 96MB regardless of your repo size.

I measured a total heap usage of 130MB for "rev-list --objects --all".
That's not 230, but I'm not sure what you're measuring. If it's RSS,
keep in mind that includes the mmap'd packfiles, too.

Doing a separate "rev-list | pack-objects" should be minorly cheaper
(although it will still have a similar peak cost, since that memory will
just be moved to the rev-list process).

If you _just_ want to pack the loose objects, you could probably do
something like:

  find .git/objects/?? -type f |
  tr -d / |
  git pack-objects .git/objects/pack/pack
  git prune-packed

But you'd get pretty crappy deltas out of that, since the heuristics
rely on knowing the filenames of trees and blobs (which you can only get
by walking the graph).

So you'd do better with something like:

  git rev-list --objects $new_tips --not $old_tips |
  git pack-objects .git/objects/pack/pack

but it's hard to know what "$old_tips" should be, unless you recorded it
last time you did a full repack.

> But no, it takes around 230MB. But thinking about it a bit further:
> 
>  * This builds on top of existing history, so that needs to be
>read/consulted

Right, I think this is the main thing.

>  * We might be reusing (if not directly, skipping re-comuting) deltas
>from the existing pack.

I don't think that should matter. We'll reuse deltas if the base is
going into our pack, but otherwise recompute. The delta computation
itself takes some memory, but it should be fairly constant even for a
large repo (it's really average_blob_size * window_size).

So I think most of your memory is just going to the traversal stuff.
Running:

  valgrind --tool=massif git pack-objects --all foo  But I get the same result if after cloning I make an orphan branch, and
> pass all the "do this as cheaply as possible" branches I can find down
> to git-repack:
> 
> (
> rm -rf /tmp/git &&
> git clone g...@github.com:git/git.git /tmp/git &&
> cd /tmp/git &&
> touch $(ls .git/objects/pack/*pack | sed 's/\.pack$/.keep/') &&
> git checkout --orphan new &&
> git reset --hard &&
> for i in {1..10}
> do
> touch $i &&
> git add $i &&
> git commit -m$i
> done &&
> git tag -d $(git tag -l) &&
> /usr/bin/time -f %M git repack -A -d -f -F --window=1 --depth=1
> )
> 
> But the memory use barely changes, my first example used 227924 kb, but
> this one uses 226788.

I think you still had to do the whole history traversal there, because
you have existing refs (the "master" branch, along with refs/remotes) as
well as reflogs.

Try:

  git branch -d master
  git remote rm origin
  rm -rf .git/logs

After that, the repack uses about 5MB.

> Jeff: Is this something ref islands[1] could be (ab)used to do, or have
> I misunderstood that concept?
> 
> 1. https://public-inbox.org/git/20130626051117.gb26...@sigill.intra.peff.net/
>https://public-inbox.org/git/20160304153359.ga16...@sigill.intra.peff.net/
>
> https://public-inbox.org/git/20160809174528.2ydgkhd7aycla...@sigill.intra.peff.net/

I think you misunderstood the concept. :)

They are about disallowing deltas between unrelated islands. They
actually require _more_ memory, because you have to storage an island
bitmap for each object (though with some copy-on-write magic, it's not
too bad). But they can never save you memory, since reused deltas are
always cheaper than re-finding new ones.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
> So I think the "recommended subset" is basically "everything but these
> few constructs". We just need to document them. ;)

Mentioned so-far/running list?

- Matching directories recursively, or at all I guess (e.g. "/")
  - (???) Instead: "/*"
- Negative matches
  - (???) Instead: Is there any longer-form attributes-OK equivalent?
Just positive-match with "!s"?


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Jeff King
On Tue, Mar 20, 2018 at 12:25:27AM -0400, Dakota Hawkins wrote:

> > Right. The technical reason is mostly "that is not how it was designed,
> > and it would possibly break some corner cases if we switched it now".
> 
> I'm just spitballing here, but do you guys think there's any subset of
> the combined .gitignore and .gitattributes matching functionality that
> could at least serve as a good "best-practices, going forward"
> (because of consistency) for both? I will say every time I do this for
> a new repo and have to do something even slightly complicated or
> different from what I've done before with .gitattributes/.gitignore
> that it takes me a long-ish time to figure it out. It's like I'm
> vaguely aware of pitfalls I've encountered in the past in certain
> areas but don't remember exactly what they are, so I consult the docs,
> which are (in sum) confusing and lead to more time spent
> trying/failing/trying/works/fails-later/etc.
> 
> One "this subset of rules will work for both this way" would be
> awesome even if the matching capabilities are technically divergent,
> but on the other hand that might paint both into a corner in terms of
> functionality.

As far as I know, they should be the same with the exception of this
recursion, and the negative-pattern thing. But I'm cc-ing Duy, who is
the resident expert on ignore and attributes matching (whether he wants
to be or not ;) ). I wouldn't be surprised if there's something I don't
know about.

So I think the "recommended subset" is basically "everything but these
few constructs". We just need to document them. ;)

I probably should cc'd Duy on the documentation patch, too:

  https://public-inbox.org/git/20180320041454.ga15...@sigill.intra.peff.net/

-Peff


Re: [PATCH 2/2] read-cache: fix an -Wmaybe-uninitialized warning

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 05:56:11PM +, Ramsay Jones wrote:

> For the purposes of this discussion, the ce_write_entry() function has
> three code blocks of interest, that look like so:
> 
> /* block #1 */
> if (ce->ce_flags & CE_STRIP_NAME) {
> saved_namelen = ce_namelen(ce);
> ce->ce_namelen = 0;
> }
> 
> /* block #2 */
> /*
>* several code blocks that contain, among others, calls
>  * to copy_cache_entry_to_ondisk(ondisk, ce);
>  */
> 
> /* block #3 */
> if (ce->ce_flags & CE_STRIP_NAME) {
> ce->ce_namelen = saved_namelen;
> ce->ce_flags &= ~CE_STRIP_NAME;
> }
> 
> The warning implies that gcc thinks it is possible that the first
> block is not entered, the calls to copy_cache_entry_to_ondisk()
> could toggle the CE_STRIP_NAME flag on, thereby entering block #3
> with saved_namelen unset. However, the copy_cache_entry_to_ondisk()
> function does not write to ce->ce_flags (it only reads). gcc could
> easily determine this, since that function is local to this file,
> but it obviously doesn't.

Weird. It seems like it would be pretty easy for it to know that we
don't write the flags field at all. But I also don't see any other thing
that would fool the compiler.

> In order to suppress this warning, we make it clear to the reader
> (human and compiler), that block #3 will only be entered when the
> first block has been entered, by introducing a new 'stripped_name'
> boolean variable. We also take the opportunity to change the type
> of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen.

These probably both ought to be size_t, but it makes sense to match
ce_namelen for now.

> diff --git a/read-cache.c b/read-cache.c
> index 2eb81a66b..49607ddcd 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2104,13 +2104,15 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, 
> struct cache_entry *ce,
> struct strbuf *previous_name, struct 
> ondisk_cache_entry *ondisk)
>  {
>   int size;
> - int saved_namelen = saved_namelen; /* compiler workaround */
>   int result;
> + unsigned int saved_namelen;
> + int stripped_name = 0;

Maybe too clever, but I think you could just do:

  unsigned int saved_namelen = 0;
  ...
saved_namelen = ce_namelen(ce);
  ...
  if (saved_namelen)
ce->ce_namelen = saved_namelen;
  ce->ce_flags &= ~CE_STRIP_NAME;

the zero-length name case (if that's even legal) would work out the
same.

That probably falls under the category of bikeshedding, though.

-Peff


Re: [PATCH 0/2] -Wuninitialized

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 05:53:09PM +, Ramsay Jones wrote:

> This series removes all 'self-initialised' variables (ie.  var = var;).
> This construct has been used to silence gcc '-W[maybe-]uninitialized' warnings
> in the past [1]. Unfortunately, this construct causes warnings to be issued by
> MSVC [2], along with clang static analysis complaining about an 'Assigned 
> value
> is garbage or undefined'. The number of these constructs has dropped over the
> years (eg. see [3] and [4]), so there are currently only 6 remaining in the
> current codebase. As demonstrated below, 5 of these no longer cause gcc to
> issue warnings.

Great. I'm happy to see these going away, and thanks for all the careful
digging.

> If we now add a patch to remove all self-initialization, which would be the
> first patch plus the obvious change to 'saved_namelen' in read-cache.c, then
> note the warnings issued by various compilers at various optimization levels
> on several different platforms [5]:
> 
> O0  O1  O2  O3  Os   Og
>  1) gcc 4.8.3   |   -  1,20 11,18-19  1-4,21-23  1,5-17
>  2) gcc 4.8.4   |   -  1,20 1   1 1-4,21-23  
> 1,5-8,10-13,15-16 
>  3) clang 3.4   |   -   -   -   --   n/a 
>  4) gcc 5.4.0   |   -   1   1   1 1,3-4,21   1,5-8,10-13,16-16
>  5) clang 3.8.0 |   -   -   -   --   n/a 
>  6) gcc 5.4.0   |   -   1   1   1   1-4 1,5-17 
>  7) clang 3.8.0 |   -   -   -   --   n/a 
>  8) gcc 6.4.0   |   -   1   11,18-191,4 1,5-17
>  9) clang 5.0.1 |   -   -   -   ---
> 10) gcc 7.2.1   |   -   1   1   1   1,4 1,5-17

So I guess this could create headaches for people using DEVELOPER=1 on
as ancient a compiler as 4.8.4, but most other people should be OK. I
think I can live with that as a cutoff, and the Travis builds should
work there.

(And if we do the detect-compiler stuff from the other nearby thread,
somebody who cares can even loosen the warnings for those old gcc
versions).

I'm neglecting anybody doing -O3 or -Os here, but IMHO those are
sufficiently rare that the builder can tweak their own settings.

I wonder if people use -Og, though? I don't (I usually do -O0 for my
edit-compile-debug cycle).

-Peff


Re: [PATCH] doc/gitattributes: mention non-recursive behavior

2018-03-19 Thread Dakota Hawkins
That's perfect, thank you so much!

On Tue, Mar 20, 2018 at 12:14 AM, Jeff King  wrote:
> On Tue, Mar 20, 2018 at 12:04:11AM -0400, Jeff King wrote:
>
>> > I guess my takeaway is that it would be _good_ if the gitattributes
>> > documentation contained the caveat about not matching directories
>> > recursively, but _great_ if gitattributes and gitignore (and whatever
>> > else there is) were consistent.
>>
>> I agree it would be nice if they were consistent (and pathspecs, too).
>> But unfortunately at this point there's a maze of backwards
>> compatibility to deal with.
>
> So let's not forget to do the easy half there. Here's a patch.
>
> -- >8 --
> Subject: [PATCH] doc/gitattributes: mention non-recursive behavior
>
> The gitattributes documentation claims that the pattern
> rules are largely the same as for gitignore. However, the
> rules for recursion are different.
>
> In an ideal world, we would make them the same (if for
> nothing else than consistency and simplicity), but that
> would create backwards compatibility issues. For some
> discussion, see this thread:
>
>   https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/
>
> But let's at least document the differences instead of
> actively misleading the user by claiming that they're the
> same.
>
> Signed-off-by: Jeff King 
> ---
>  Documentation/gitattributes.txt | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index d52b254a22..1094fe2b5b 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -56,9 +56,16 @@ Unspecified::
>
>  When more than one pattern matches the path, a later line
>  overrides an earlier line.  This overriding is done per
> -attribute.  The rules how the pattern matches paths are the
> -same as in `.gitignore` files; see linkgit:gitignore[5].
> -Unlike `.gitignore`, negative patterns are forbidden.
> +attribute.
> +
> +The rules by which the pattern matches paths are the same as in
> +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
> +
> +  - negative patterns are forbidden
> +
> +  - patterns that match a directory do not recursively match paths
> +inside that directory (so using the trailing-slash `path/` syntax is
> +pointless in an attributes file; use `path/**` instead)
>
>  When deciding what attributes are assigned to a path, Git
>  consults `$GIT_DIR/info/attributes` file (which has the highest
> --
> 2.17.0.rc0.402.ged0b3fd1ee
>


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
> Right. The technical reason is mostly "that is not how it was designed,
> and it would possibly break some corner cases if we switched it now".

I'm just spitballing here, but do you guys think there's any subset of
the combined .gitignore and .gitattributes matching functionality that
could at least serve as a good "best-practices, going forward"
(because of consistency) for both? I will say every time I do this for
a new repo and have to do something even slightly complicated or
different from what I've done before with .gitattributes/.gitignore
that it takes me a long-ish time to figure it out. It's like I'm
vaguely aware of pitfalls I've encountered in the past in certain
areas but don't remember exactly what they are, so I consult the docs,
which are (in sum) confusing and lead to more time spent
trying/failing/trying/works/fails-later/etc.

One "this subset of rules will work for both this way" would be
awesome even if the matching capabilities are technically divergent,
but on the other hand that might paint both into a corner in terms of
functionality.

>> > I think just "/.readme-docs/**" should be sufficient for your case. You
>> > could also probably write "*" inside ".readme-docs/.gitattributes",
>> > which may be simpler (you don't need "**" there because patterns without
>> > a slash are just matched directly against the basename).
>>
>> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
>> technically recursive when "*" matches a directory?
>
> Sort of. The pattern is applied recursively to each basename, but the
> pattern itself does not match recursively. That's probably splitting
> hairs, though. :)

I understand, but if I think about it too much I feel the overwhelming
urge to stop thinking about it :)

>> It's always seemed to me that both were necessary to explicitly match
>> things in a directory and its subdirectories (example, IIRC: "git
>> ls-files -- .gitattributes" vs "git ls-files -- .gitattributes
>> **/.gitattributes"). Maybe that example is peculiar in that its a
>> dotfile and can't have a wildcard before the dot?
>
> Those are pathspecs, which are not quite the same as gitattributes. They
> don't do the magic basename matching.
>
> For pathspecs a "*" matches across slashes, which is what allows "git
> log -- '*.h'" to work (but not a suffix wildcard like 'foo*').

Same comment -- makes sense but hard to want to think too much about.

>> I guess my takeaway is that it would be _good_ if the gitattributes
>> documentation contained the caveat about not matching directories
>> recursively, but _great_ if gitattributes and gitignore (and whatever
>> else there is) were consistent.
>
> I agree it would be nice if they were consistent (and pathspecs, too).
> But unfortunately at this point there's a maze of backwards
> compatibility to deal with.
>
> -Peff

Again, as above, what if there were a subset of functionality git
could recommend to avoid inconsistencies?


Re: [PATCH] filter-branch: use printf instead of echo -e

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 03:39:46PM +, CB Bailey wrote:

> > diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> > index 1b7e4b2cd..21d84eff3 100755
> > --- a/git-filter-branch.sh
> > +++ b/git-filter-branch.sh
> > @@ -627,7 +627,7 @@ then
> > print H "$_:$f\n" or die;
> > }
> > close(H) or die;' || die "Unable to save state")
> > -   state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git 
> > mktree)
> > +   state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git 
> > mktree)
> > if test -n "$state_commit"
> > then
> > state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
> > -p "$state_commit")
> 
> I think the change from 'echo -e' to printf is good because of the
> better portability reason that you cite.
> 
> Looking at the change, I am now curious as to why '/bin/echo' is used.
> Testing on a Mac, bash's built in 'echo' recognizes '-e' whereas
> '/bin/echo' does not. This is just an observation, I still prefer the
> move to 'printf' that you suggest.

Right. Moving them to just "echo -e" would work on systems where /bin/sh
is bash, but not elsewhere (e.g., Debian systems with "dash" whose
built-in echo doesn't understand "-e").

So my guess as to why /bin/echo was used is that on Linux systems it's
_more_ predictable and portable, because you know you're always going to
get the GNU coreutils version, which knows "-e". Even if you're using a
non-bash shell.

But on non-Linux systems, who knows what system "echo" you'll get. :)

Author cc'd in case there's something more interesting going on.

-Peff


[PATCH] doc/gitattributes: mention non-recursive behavior

2018-03-19 Thread Jeff King
On Tue, Mar 20, 2018 at 12:04:11AM -0400, Jeff King wrote:

> > I guess my takeaway is that it would be _good_ if the gitattributes
> > documentation contained the caveat about not matching directories
> > recursively, but _great_ if gitattributes and gitignore (and whatever
> > else there is) were consistent.
> 
> I agree it would be nice if they were consistent (and pathspecs, too).
> But unfortunately at this point there's a maze of backwards
> compatibility to deal with.

So let's not forget to do the easy half there. Here's a patch.

-- >8 --
Subject: [PATCH] doc/gitattributes: mention non-recursive behavior

The gitattributes documentation claims that the pattern
rules are largely the same as for gitignore. However, the
rules for recursion are different.

In an ideal world, we would make them the same (if for
nothing else than consistency and simplicity), but that
would create backwards compatibility issues. For some
discussion, see this thread:

  https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/

But let's at least document the differences instead of
actively misleading the user by claiming that they're the
same.

Signed-off-by: Jeff King 
---
 Documentation/gitattributes.txt | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index d52b254a22..1094fe2b5b 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -56,9 +56,16 @@ Unspecified::
 
 When more than one pattern matches the path, a later line
 overrides an earlier line.  This overriding is done per
-attribute.  The rules how the pattern matches paths are the
-same as in `.gitignore` files; see linkgit:gitignore[5].
-Unlike `.gitignore`, negative patterns are forbidden.
+attribute.
+
+The rules by which the pattern matches paths are the same as in
+`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
+
+  - negative patterns are forbidden
+
+  - patterns that match a directory do not recursively match paths
+inside that directory (so using the trailing-slash `path/` syntax is
+pointless in an attributes file; use `path/**` instead)
 
 When deciding what attributes are assigned to a path, Git
 consults `$GIT_DIR/info/attributes` file (which has the highest
-- 
2.17.0.rc0.402.ged0b3fd1ee



Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 11:17:04PM -0400, Dakota Hawkins wrote:

> Sorry to tack on to my previous email, but I just thought of this:
> 
> If something like "-diff=lfs" won't do what I (and git-lfs) thought it
> would, do you think it would be prudent/reasonable to suggest git-lfs
> add a "no-lfs" filter for exactly this case? That way I could have
> explicit exclusions without any "diff=foo" shenanigans.

It might be if my earlier email weren't totally wrong. ;) I think the
"!filter" syntax that Junio mentioned would do what you want.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 11:10:26PM -0400, Dakota Hawkins wrote:

> > As you noted below, that second line does not match your path, because
> > attributes on a directory aren't applied recursively. And it has nothing
> > to do with overriding. If you remove the png line entirely, you can see
> > that we still do not match it. You need to use "*" to match the paths.
> 
> Ah, yes, I see that. Inconsistent with .gitignore (more below), right?

Yes, it is.

> > I could not find anything useful in gitattributes(5). There's some old
> > discussion here:
> >
> >   https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/
> 
> If I follow that correctly: There's some initial speculation that it
> would be OK to apply the attributes recursively, which is then shot
> down because it wasn't designed to be recursive (though I don't see a
> different, technical reason for that), followed by finding a (this
> same?) solution/workaround for the original problem. Is that about
> right?

Right. The technical reason is mostly "that is not how it was designed,
and it would possibly break some corner cases if we switched it now".

> > I think just "/.readme-docs/**" should be sufficient for your case. You
> > could also probably write "*" inside ".readme-docs/.gitattributes",
> > which may be simpler (you don't need "**" there because patterns without
> > a slash are just matched directly against the basename).
> 
> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
> technically recursive when "*" matches a directory?

Sort of. The pattern is applied recursively to each basename, but the
pattern itself does not match recursively. That's probably splitting
hairs, though. :)

> It's always seemed to me that both were necessary to explicitly match
> things in a directory and its subdirectories (example, IIRC: "git
> ls-files -- .gitattributes" vs "git ls-files -- .gitattributes
> **/.gitattributes"). Maybe that example is peculiar in that its a
> dotfile and can't have a wildcard before the dot?

Those are pathspecs, which are not quite the same as gitattributes. They
don't do the magic basename matching.

For pathspecs a "*" matches across slashes, which is what allows "git
log -- '*.h'" to work (but not a suffix wildcard like 'foo*').

> I guess my takeaway is that it would be _good_ if the gitattributes
> documentation contained the caveat about not matching directories
> recursively, but _great_ if gitattributes and gitignore (and whatever
> else there is) were consistent.

I agree it would be nice if they were consistent (and pathspecs, too).
But unfortunately at this point there's a maze of backwards
compatibility to deal with.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 08:33:15PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > The best you can probably do is:
> >
> >   /.readme-docs/* diff=foo
> >
> > Since you have no diff.foo.* config, that will behave in the default way
> > (including respecting the usual "is it binary" checks). So a bit hacky,
> > but I think it would work as "ignore prior diff".
> 
> You can say
> 
>   /.readme-docs/* !diff
> 
> I think.  Thre is a difference between setting it to "false"
> (i.e. Unset) and reverting it to unspecified state.
> 
> Isn't that what you want here?

Ah, yes, I totally forgot that existed. That's much better than the
hackery I proposed.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
On Mon, Mar 19, 2018 at 11:33 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> The best you can probably do is:
>>
>>   /.readme-docs/* diff=foo
>>
>> Since you have no diff.foo.* config, that will behave in the default way
>> (including respecting the usual "is it binary" checks). So a bit hacky,
>> but I think it would work as "ignore prior diff".
>
> You can say
>
> /.readme-docs/* !diff
>
> I think.  Thre is a difference between setting it to "false"
> (i.e. Unset) and reverting it to unspecified state.
>
> Isn't that what you want here?

In this case, I think so? In this context I don't necessarily know the
files in /.readme-docs/ are binary (though that's its intent).
Ideally, I just want it to do whatever it did before the match gave it
"diff=lfs". I realize that that's not possible/feasible, so I asked in
a separate reply whether it might be a good idea to ask git-lfs for a
"no-lfs" filter for exactly this situation.

The actual "lfs" section of my .gitattributes file is about 65 lines
for 65 different extensions, so I'd prefer to handle the exclusion of
a directory without having to repeat that whole block with different
options, if that makes sense :)


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Junio C Hamano
Jeff King  writes:

> The best you can probably do is:
>
>   /.readme-docs/* diff=foo
>
> Since you have no diff.foo.* config, that will behave in the default way
> (including respecting the usual "is it binary" checks). So a bit hacky,
> but I think it would work as "ignore prior diff".

You can say

/.readme-docs/* !diff

I think.  Thre is a difference between setting it to "false"
(i.e. Unset) and reverting it to unspecified state.

Isn't that what you want here?


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
Sorry to tack on to my previous email, but I just thought of this:

If something like "-diff=lfs" won't do what I (and git-lfs) thought it
would, do you think it would be prudent/reasonable to suggest git-lfs
add a "no-lfs" filter for exactly this case? That way I could have
explicit exclusions without any "diff=foo" shenanigans.

Thanks again,

- Dakota

On Mon, Mar 19, 2018 at 11:10 PM, Dakota Hawkins
 wrote:
> Thanks for the quick reply!
>
> On Mon, Mar 19, 2018 at 10:34 PM, Jeff King  wrote:
>> On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:
>>
>>> Summary: Trying to apply attributes to file extensions everywhere
>>> except in one directory.
>>>
>>> .gitattributes:
>>>
>>> *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
>>> /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
>>>
>>> Make some data:
>>>
>>> echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
>>> git add -A
>>
>> As you noted below, that second line does not match your path, because
>> attributes on a directory aren't applied recursively. And it has nothing
>> to do with overriding. If you remove the png line entirely, you can see
>> that we still do not match it. You need to use "*" to match the paths.
>
> Ah, yes, I see that. Inconsistent with .gitignore (more below), right?
>
>> You may also find that "-diff=lfs" does not do quite what you want.
>> There is no way to say "cancel any previous attribute", which I think is
>> what you're trying for here. You can only override it with a new value.
>> So:
>>
>>   /.readme-docs/* -diff
>>
>> says "do not diff this". And:
>>
>>   /.readme-docs/* diff
>>
>> says "diff this as text, even if it looks binary".
>>
>> The best you can probably do is:
>>
>>   /.readme-docs/* diff=foo
>>
>> Since you have no diff.foo.* config, that will behave in the default way
>> (including respecting the usual "is it binary" checks). So a bit hacky,
>> but I think it would work as "ignore prior diff".
>>
>> And I think filter and merge drivers should work the same.
>
> That's interesting... in this case I was taking my advice on how this
> should work from the git-lfs folks. I have promised to share what I
> find here with them, so that will help at least :)
>
> I think that makes sense to me -- there would be no good way to tell
> it what the default should have been without explicitly telling it
> what to use instead.
>
>>> Is this me misunderstanding something in the documentation? I would
>>> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
>>> override the earlier "*.[Pp][Nn][Gg]" attributes.
>>>
>>> I have found the following overrides to work in lieu of the directory match:
>>>
>>> /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
>>> /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
>>>
>>> ...but I don't see a justification in the documentation for this
>>> working and the original directory filter not working.
>>
>> I could not find anything useful in gitattributes(5). There's some old
>> discussion here:
>>
>>   https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/
>
> If I follow that correctly: There's some initial speculation that it
> would be OK to apply the attributes recursively, which is then shot
> down because it wasn't designed to be recursive (though I don't see a
> different, technical reason for that), followed by finding a (this
> same?) solution/workaround for the original problem. Is that about
> right?
>
>> which makes it clear that attributes aren't recursive, but it's probably
>> worth calling out in the documentation. In fact, I think the current
>> documentation is a bit misleading in that it says "patterns are matched
>> as in .gitignore", which is clearly not the case here.
>
> I was indeed going off of the suggestion to consult the .gitignore
> pattern matching documentation.
>
>> I think just "/.readme-docs/**" should be sufficient for your case. You
>> could also probably write "*" inside ".readme-docs/.gitattributes",
>> which may be simpler (you don't need "**" there because patterns without
>> a slash are just matched directly against the basename).
>
> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
> technically recursive when "*" matches a directory? It's always seemed
> to me that both were necessary to explicitly match things in a
> directory and its subdirectories (example, IIRC: "git ls-files --
> .gitattributes" vs "git ls-files -- .gitattributes
> **/.gitattributes"). Maybe that example is peculiar in that its a
> dotfile and can't have a wildcard before the dot?
>
> I guess my takeaway is that it would be _good_ if the gitattributes
> documentation contained the caveat about not matching directories
> recursively, but _great_ if gitattributes and gitignore (and whatever
> else there is) were consistent.
>
> At any rate, thanks for the great, quick help!
>
> -Dakota


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
Thanks for the quick reply!

On Mon, Mar 19, 2018 at 10:34 PM, Jeff King  wrote:
> On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:
>
>> Summary: Trying to apply attributes to file extensions everywhere
>> except in one directory.
>>
>> .gitattributes:
>>
>> *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
>> /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
>>
>> Make some data:
>>
>> echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
>> git add -A
>
> As you noted below, that second line does not match your path, because
> attributes on a directory aren't applied recursively. And it has nothing
> to do with overriding. If you remove the png line entirely, you can see
> that we still do not match it. You need to use "*" to match the paths.

Ah, yes, I see that. Inconsistent with .gitignore (more below), right?

> You may also find that "-diff=lfs" does not do quite what you want.
> There is no way to say "cancel any previous attribute", which I think is
> what you're trying for here. You can only override it with a new value.
> So:
>
>   /.readme-docs/* -diff
>
> says "do not diff this". And:
>
>   /.readme-docs/* diff
>
> says "diff this as text, even if it looks binary".
>
> The best you can probably do is:
>
>   /.readme-docs/* diff=foo
>
> Since you have no diff.foo.* config, that will behave in the default way
> (including respecting the usual "is it binary" checks). So a bit hacky,
> but I think it would work as "ignore prior diff".
>
> And I think filter and merge drivers should work the same.

That's interesting... in this case I was taking my advice on how this
should work from the git-lfs folks. I have promised to share what I
find here with them, so that will help at least :)

I think that makes sense to me -- there would be no good way to tell
it what the default should have been without explicitly telling it
what to use instead.

>> Is this me misunderstanding something in the documentation? I would
>> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
>> override the earlier "*.[Pp][Nn][Gg]" attributes.
>>
>> I have found the following overrides to work in lieu of the directory match:
>>
>> /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
>> /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
>>
>> ...but I don't see a justification in the documentation for this
>> working and the original directory filter not working.
>
> I could not find anything useful in gitattributes(5). There's some old
> discussion here:
>
>   https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/

If I follow that correctly: There's some initial speculation that it
would be OK to apply the attributes recursively, which is then shot
down because it wasn't designed to be recursive (though I don't see a
different, technical reason for that), followed by finding a (this
same?) solution/workaround for the original problem. Is that about
right?

> which makes it clear that attributes aren't recursive, but it's probably
> worth calling out in the documentation. In fact, I think the current
> documentation is a bit misleading in that it says "patterns are matched
> as in .gitignore", which is clearly not the case here.

I was indeed going off of the suggestion to consult the .gitignore
pattern matching documentation.

> I think just "/.readme-docs/**" should be sufficient for your case. You
> could also probably write "*" inside ".readme-docs/.gitattributes",
> which may be simpler (you don't need "**" there because patterns without
> a slash are just matched directly against the basename).

Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
technically recursive when "*" matches a directory? It's always seemed
to me that both were necessary to explicitly match things in a
directory and its subdirectories (example, IIRC: "git ls-files --
.gitattributes" vs "git ls-files -- .gitattributes
**/.gitattributes"). Maybe that example is peculiar in that its a
dotfile and can't have a wildcard before the dot?

I guess my takeaway is that it would be _good_ if the gitattributes
documentation contained the caveat about not matching directories
recursively, but _great_ if gitattributes and gitignore (and whatever
else there is) were consistent.

At any rate, thanks for the great, quick help!

-Dakota


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:

> Summary: Trying to apply attributes to file extensions everywhere
> except in one directory.
> 
> .gitattributes:
> 
> *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
> /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
> 
> Make some data:
> 
> echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
> git add -A

As you noted below, that second line does not match your path, because
attributes on a directory aren't applied recursively. And it has nothing
to do with overriding. If you remove the png line entirely, you can see
that we still do not match it. You need to use "*" to match the paths.

You may also find that "-diff=lfs" does not do quite what you want.
There is no way to say "cancel any previous attribute", which I think is
what you're trying for here. You can only override it with a new value.
So:

  /.readme-docs/* -diff

says "do not diff this". And:

  /.readme-docs/* diff

says "diff this as text, even if it looks binary".

The best you can probably do is:

  /.readme-docs/* diff=foo

Since you have no diff.foo.* config, that will behave in the default way
(including respecting the usual "is it binary" checks). So a bit hacky,
but I think it would work as "ignore prior diff".

And I think filter and merge drivers should work the same.

> Is this me misunderstanding something in the documentation? I would
> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
> override the earlier "*.[Pp][Nn][Gg]" attributes.
>
> I have found the following overrides to work in lieu of the directory match:
> 
> /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
> /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
> 
> ...but I don't see a justification in the documentation for this
> working and the original directory filter not working.

I could not find anything useful in gitattributes(5). There's some old
discussion here:

  https://public-inbox.org/git/slrnkldd3g.1l4@majutsushi.net/

which makes it clear that attributes aren't recursive, but it's probably
worth calling out in the documentation. In fact, I think the current
documentation is a bit misleading in that it says "patterns are matched
as in .gitignore", which is clearly not the case here.

I think just "/.readme-docs/**" should be sufficient for your case. You
could also probably write "*" inside ".readme-docs/.gitattributes",
which may be simpler (you don't need "**" there because patterns without
a slash are just matched directly against the basename).

-Peff


.gitattributes override behavior (possible bug, or documentation bug)

2018-03-19 Thread Dakota Hawkins
According to the gitattributes docs:

"When more than one pattern matches the path, a later line overrides
an earlier line. This overriding is done per attribute."

I had a need to do this recently, and while the attributes are git-lfs
specific, I think the issue may be generic.

Summary: Trying to apply attributes to file extensions everywhere
except in one directory.

.gitattributes:

*.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
/.readme-docs/ -filter=lfs -diff=lfs -merge=lfs

Make some data:

echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
git add -A

Result:

"git check-attr -a --cached -- ./readme-docs/test.png" still shows
"filter=lfs diff=lfs merge=lfs" attributes.

Full details: 
https://github.com/git-lfs/git-lfs/issues/2120#issuecomment-374432354

Is this me misunderstanding something in the documentation? I would
expect "./.readme-docs/" to match "./.readme-docs/test.png" and
override the earlier "*.[Pp][Nn][Gg]" attributes.

I have found the following overrides to work in lieu of the directory match:

/.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
/.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs

...but I don't see a justification in the documentation for this
working and the original directory filter not working.

Thanks for any clarification or help you can provide,

- Dakota


Re: PATCH 1/2] -Wuninitialized: remove some 'init-self' workarounds

2018-03-19 Thread Ramsay Jones


On 19/03/18 17:54, Ramsay Jones wrote:
> 
> The 'self-initialised' variables construct (ie  var = var;) has
> been used to silence gcc '-W[maybe-]uninitialized' warnings. This has,
> unfortunately, caused MSVC to issue 'uninitialized variable' warnings.
> Also, using clang static analysis causes complaints about an 'Assigned
> value is garbage or undefined'.
> 
[snip]

Hi Junio,

You may not have noticed that I messed up the Subject line of
this patch - I managed to drop the '[' character in the patch
prefix string. :(

So, the commit 7bc195d877 on the 'pu' branch is titled:

  "PATCH 1/2] -Wininitialized: remove some 'init-self' workarounds"

Ahem! Sorry about that.

ATB,
Ramsay Jones




Re: How to debug a "git merge"?

2018-03-19 Thread Derrick Stolee

On 3/14/2018 4:53 PM, Lars Schneider wrote:

On 14 Mar 2018, at 18:02, Derrick Stolee  wrote:

On 3/14/2018 12:56 PM, Lars Schneider wrote:

Hi,

I am investigating a Git merge (a86dd40fe) in which an older version of
a file won over the newer version. I try to understand why this is the
case. I can reproduce the merge with the following commands:
$ git checkout -b test a02fa3303
$ GIT_MERGE_VERBOSITY=5 git merge --verbose c1b82995c

The merge actually generates a merge conflict but not for my
problematic file. The common ancestor of the two parents (merge base)
is b91161554.

The merge graph is not pretty (the committers don't have a clean
branching scheme) but I cannot spot a problem between the merge commit
and the common ancestor:
$ git log --graph --oneline a86dd40fe

Have you tried `git log --graph --oneline --simplify-merges -- path` to see 
what changes and merges involved the file? I find that view to be very helpful 
(while the default history simplification can hide things). In particular, if 
there was a change that was reverted in one side and not another, we could find 
out.

Thanks for this tip! Unfortunately, this only confirms my current view:

### First parent
$ git log --graph --oneline --simplify-merges a02fa3303 -- path/to/problem
* 4e47a10c7 <-- old version
* 01f01f61c

### Second parent
$ git log --graph --oneline --simplify-merges c1b82995c -- path/to/problem
* 590e52ed1 <-- new version
* 8e598828d
* ad4e9034b
* 4e47a10c7
* 01f01f61c

### Merge
$ git log --graph --oneline --simplify-merges a86dd40fe -- path/to/problem
*   a86dd40fe <-- old version ?!?! That's the problem!
|\
| * 590e52ed1 <-- new version
| * 8e598828d
| * ad4e9034b
|/
* 4e47a10c7 <-- old version
* 01f01f61c



You could also use the "A...B" to check your two commits for merging, and maybe add 
"--boundary".

$ git diff --boundary a02fa3303...c1b82995c -- path/to/problem

This looks like the correct diff. The "new version" is mark as +/add/green in 
the diff.

Does this make any sense to you?

Thank you,
Lars


I'm sorry for the delay on this, but in my experience in helping 
customers saying "why doesn't my commit/change appear in the history of 
a file?" is because someone resolved merge conflicts by just taking one 
side instead of doing a real merge (such as using -Xours). The only 
solution is to re-apply the original change again and talk to the user 
who incorrectly merged, to prevent future occurrences. These things 
rarely happen to teams who use pull requests that require review, since 
the diff would be problematic. There are still a lot of teams that push 
directly to master, though.


Thanks,
-Stolee


Re: [PATCH v6 00/14] Serialized Git Commit Graph

2018-03-19 Thread Derrick Stolee

On 3/19/2018 8:55 AM, Derrick Stolee wrote:


Thanks for this! Fixing this performance problem is very important to 
me, as we will use the "--stdin-packs" mechanism in the GVFS scenario 
(we will walk all commits in the prefetch packs full of commits and 
trees instead of relying on refs). This speedup is very valuable!


Thanks,
-Stolee


Also, for those interested in this series, I plan to do a rebase onto 
2.17.0, when available, as my re-roll. I pushed my responses to the 
current feedback at the GitHub PR for the series [1].


If you are planning to provide more feedback to the series, then please 
let me know and I'll delay my re-roll so you have a chance to review.


Thanks,
-Stolee

[1] https://github.com/derrickstolee/git/pull/2


Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-19 Thread Christian Couder
On Tue, Mar 20, 2018 at 12:11 AM, Christian Couder
 wrote:
> On Sun, Mar 18, 2018 at 11:15 PM, Paul Sebastian Ungureanu
>> First of all, I find it difficult to pick which scripts would benefit
>> the most by being rewritten. I am thinking of git bisect, git stash
>> and git rebase since these are maybe some of the most popular commands
>> of Git. However, on the other side, scripts like
>> git-rebase--interactive.sh and git-bisect.sh are also subject of other
>> GSoC projects. Should I steer away from these projects or can I
>> consider them?
>
> If you are interested in converting these scripts, you should probably
> ask publicly to the former GSoC students who have been working on
> these projects if they still plan to continue working on them of if
> they are ok to let you finish/continue their work. You will get extra
> bonus points if they agree to help you or maybe co-mentor you.

I realize that you were perhaps talking about other potential GSoC
students who are also writing proposals about converting one of these
scripts. If you care about the other proposals though, you should
probably just write two proposals as we will have at most 2 GSoC
students this year. Just try to have different possible mentors
interested in your 2 proposals.


[PATCH v5 2/3] stash push: avoid printing errors

2018-03-19 Thread Thomas Gummerer
'git stash push -u -- ' prints the following errors if
 only matches untracked files:

fatal: pathspec 'untracked' did not match any files
error: unrecognized input

This is because we first clean up the untracked files using 'git clean
', and then use a command chain involving 'git add -u
' and 'git apply' to clear the changes to files that are in
the index and were stashed.

As the  only includes untracked files that were already
removed by 'git clean', the 'git add' call will barf, and so will 'git
apply', as there are no changes that need to be applied.

Fix this by avoiding the 'git clean' if a pathspec is given, and use the
pipeline that's used for pathspec mode to get rid of the untracked files
as well.

Reported-by: Marc Strapetz 
Signed-off-by: Thomas Gummerer 
---
 git-stash.sh   |  6 +++--
 t/t3905-stash-include-untracked.sh | 46 ++
 2 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index 4c92ec931f..5e06f96da5 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -308,14 +308,16 @@ push_stash () {
if test -z "$patch_mode"
then
test "$untracked" = "all" && CLEAN_X_OPTION=-x || 
CLEAN_X_OPTION=
-   if test -n "$untracked"
+   if test -n "$untracked" && test $# = 0
then
git clean --force --quiet -d $CLEAN_X_OPTION -- "$@"
fi
 
if test $# != 0
then
-   git add -u -- "$@"
+   test -z "$untracked" && UPDATE_OPTION="-u" || 
UPDATE_OPTION=
+   test "$untracked" = "all" && FORCE_OPTION="--force" || 
FORCE_OPTION=
+   git add $UPDATE_OPTION $FORCE_OPTION -- "$@"
git diff-index -p --cached --binary HEAD -- "$@" |
git apply --index -R
else
diff --git a/t/t3905-stash-include-untracked.sh 
b/t/t3905-stash-include-untracked.sh
index bfde4057ad..2f9045553e 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -228,4 +228,50 @@ test_expect_success 'stash previously ignored file' '
test_path_is_file ignored.d/foo
 '
 
+test_expect_success 'stash -u --  doesnt print error' '
+   >untracked &&
+   git stash push -u -- untracked 2>actual &&
+   test_path_is_missing untracked &&
+   test_line_count = 0 actual
+'
+
+test_expect_success 'stash -u --  leaves rest of working tree in 
place' '
+   >tracked &&
+   git add tracked &&
+   >untracked &&
+   git stash push -u -- untracked &&
+   test_path_is_missing untracked &&
+   test_path_is_file tracked
+'
+
+test_expect_success 'stash -u --   clears changes in both' 
'
+   >tracked &&
+   git add tracked &&
+   >untracked &&
+   git stash push -u -- tracked untracked &&
+   test_path_is_missing tracked &&
+   test_path_is_missing untracked
+'
+
+test_expect_success 'stash --all --  stashes ignored file' '
+   >ignored.d/bar &&
+   git stash push --all -- ignored.d/bar &&
+   test_path_is_missing ignored.d/bar
+'
+
+test_expect_success 'stash --all --   clears changes in 
both' '
+   >tracked &&
+   git add tracked &&
+   >ignored.d/bar &&
+   git stash push --all -- tracked ignored.d/bar &&
+   test_path_is_missing tracked &&
+   test_path_is_missing ignored.d/bar
+'
+
+test_expect_success 'stash -u --  leaves ignored file alone' '
+   >ignored.d/bar &&
+   git stash push -u -- ignored.d/bar &&
+   test_path_is_file ignored.d/bar
+'
+
 test_done
-- 
2.15.1.33.g3165b24a68



[PATCH v5 1/3] stash: fix nonsense pipeline

2018-03-19 Thread Thomas Gummerer
From: Junio C Hamano 

An earlier change bba067d2 ("stash: don't delete untracked files
that match pathspec", 2018-01-06) was made by taking a suggestion in
a list discussion [1] but did not copy the suggested snippet
correctly.  And the bug was unnoticed during the review and slipped
through.

This fixes it.

[1] https://public-inbox.org/git/xmqqpo7byjwb@gitster.mtv.corp.google.com/

Signed-off-by: Junio C Hamano 
---
 git-stash.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index b48b164748..4c92ec931f 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -315,9 +315,9 @@ push_stash () {
 
if test $# != 0
then
-   git add -u -- "$@" |
-   git checkout-index -z --force --stdin
-   git diff-index -p --cached --binary HEAD -- "$@" | git 
apply --index -R
+   git add -u -- "$@"
+   git diff-index -p --cached --binary HEAD -- "$@" |
+   git apply --index -R
else
git reset --hard -q
fi
-- 
2.15.1.33.g3165b24a68



[PATCH v5 3/3] stash push -u: don't create empty stash

2018-03-19 Thread Thomas Gummerer
When introducing the stash push feature, and thus allowing users to pass
in a pathspec to limit the files that would get stashed in
df6bba0937 ("stash: teach 'push' (and 'create_stash') to honor
pathspec", 2017-02-28), this developer missed one place where the
pathspec should be passed in.

Namely in the call to the 'untracked_files()' function in the
'no_changes()' function.  This resulted in 'git stash push -u --
' creating an empty stash when there are untracked files
in the repository other that don't match the pathspec.

As 'git stash' never creates empty stashes, this behaviour is wrong and
confusing for users.  Instead it should just show a message "No local
changes to save", and not create a stash.

Luckily the 'untracked_files()' function already correctly respects
pathspecs that are passed to it, so the fix is simply to pass the
pathspec along to the function.

Reported-by: Marc Strapetz 
Signed-off-by: Thomas Gummerer 
---
 git-stash.sh   | 2 +-
 t/t3905-stash-include-untracked.sh | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 5e06f96da5..4e55f278bd 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -39,7 +39,7 @@ fi
 no_changes () {
git diff-index --quiet --cached HEAD --ignore-submodules -- "$@" &&
git diff-files --quiet --ignore-submodules -- "$@" &&
-   (test -z "$untracked" || test -z "$(untracked_files)")
+   (test -z "$untracked" || test -z "$(untracked_files "$@")")
 }
 
 untracked_files () {
diff --git a/t/t3905-stash-include-untracked.sh 
b/t/t3905-stash-include-untracked.sh
index 2f9045553e..3ea5b9bb3f 100755
--- a/t/t3905-stash-include-untracked.sh
+++ b/t/t3905-stash-include-untracked.sh
@@ -274,4 +274,10 @@ test_expect_success 'stash -u --  leaves ignored 
file alone' '
test_path_is_file ignored.d/bar
 '
 
+test_expect_success 'stash -u --  shows no changes when there 
are none' '
+   git stash push -u -- non-existant >actual &&
+   echo "No local changes to save" >expect &&
+   test_i18ncmp expect actual
+'
+
 test_done
-- 
2.15.1.33.g3165b24a68



[PATCH v5 0/3] stash push -u -- fixes

2018-03-19 Thread Thomas Gummerer
Thanks again Marc for all the testing and Junio for fixing up my brown
paper bag copy-pasto.

This iteration addresses the breakage Marc noticed with the latest
version of the patches, adds some more tests, and moves all the new
tests to t3905 instead of t3903, which I just noticed exists and is a
much better match for the new tests.

Patch 1 and 3 are the same as in the previous round, Patch 2 is mostly
rewritten.  Instead of trying to avoid part of the pipeline we're
using to get rid of changes, we now are getting rid of the 'git clean'
call in the pathspec case, and use the existing pipeline to get rid of
changes in untracked files as well.  I'm not adding an interdiff,
because Patch 2 is mostly rewritten and the other two are unchanged,
so it is probably easiest to just review patch 2.

Junio C Hamano (1):
  stash: fix nonsense pipeline

Thomas Gummerer (2):
  stash push: avoid printing errors
  stash push -u: don't create empty stash

 git-stash.sh   | 12 +
 t/t3905-stash-include-untracked.sh | 52 ++
 2 files changed, 59 insertions(+), 5 deletions(-)

-- 
2.15.1.33.g3165b24a68



Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-19 Thread Christian Couder
Hi,

On Sun, Mar 18, 2018 at 11:15 PM, Paul Sebastian Ungureanu
 wrote:
> Hello,
>
> I am interested in the "Convert scripts to builtins" project. I have
> recently started to analyze it better and see exactly what it entails
> and a few questions came to my mind.

Great! As other potential GSoC students are also interested I'm adding
them into Cc.

> First of all, I find it difficult to pick which scripts would benefit
> the most by being rewritten. I am thinking of git bisect, git stash
> and git rebase since these are maybe some of the most popular commands
> of Git. However, on the other side, scripts like
> git-rebase--interactive.sh and git-bisect.sh are also subject of other
> GSoC projects. Should I steer away from these projects or can I
> consider them?

If you are interested in converting these scripts, you should probably
ask publicly to the former GSoC students who have been working on
these projects if they still plan to continue working on them of if
they are ok to let you finish/continue their work. You will get extra
bonus points if they agree to help you or maybe co-mentor you.

> Secondly, what is too little or too much? On one hand, I do want to do
> my best and help the Git community as much as I can. On the other
> hand, I do not want to have too much on my plate and not be able to
> finish my project. Considering that mentors have already decided that
> git rebase --interactive and git bisect are enough for two projects,
> how could I quantify the time required for each command? Looking back
> at the previous editions of GSoC I noticed that most projects were
> focused on only one command.

Yeah, I don't think it is a good idea to focus on more than one
command per project. It could be possible if there were really small
scripts to convert, but I think those have been already converted. It
could perhaps be possible if 2 scripts were very similar, but I don't
think there are similar enough scripts to convert.

You can however submit more than one proposal, so you could for
example submit one proposal to convert one script and another one to
convert another script.

> From my research, these are the scripts that could be subject of this
> project. Which ones do you think could be the best choice for a
> project of this kind?

I think it is definitely a good idea to work on a script that has
started to be converted. Make sure that no one is still actively
working on converting it though.

I think the scripts related to other versions control systems are not
a good choice as they are not really part of the core of Git.

It is also a good idea to choose scripts that potential mentors are
familiar with.

>  * git/git-add--interactive.perl
>  * git/git-archimport.perl
>  * git/git-bisect.sh -- there is a project about this
>  * git/git-cvsexportcommit.perl
>  * git/git-cvsimport.perl
>  * git/git-cvsserver.perl
>  * git/git-difftool--helper.sh
>  * git/git-filter-branch.sh
>  * git/git-instaweb.sh
>  * git/git-merge-octopus.sh
>  * git/git-merge-one-file.sh
>  * git/git-merge-resolve.sh
>  * git/git-mergetool--lib.sh
>  * git/git-mergetool.sh
>  * git/git-quiltimport.sh
>  * git/git-rebase--am.sh
>  * git/git-rebase--interactive.sh -- there is a project about this
>  * git/git-rebase--merge.sh
>  * git/git-rebase.sh
>  * git/git-remote-testgit.sh
>  * git/git-request-pull.sh
>  * git/git-send-email.perl
>  * git/git-stash.sh
>  * git/git-submodule.sh -- there was a project about this
>  * git/git-svn.perl
>  * git/git-web--browse.sh

It would be interesting to know the number of lines of code for each
of these script, as it could give an idea about how big the task of
fully converting the script could be.

> I look forward to hearing from you. I will also submit a draft of my
> proposal soon enough.

Great!

Christian.


Re: [PATCH v2 2/2] git-svn: allow empty email-address in authors-prog and authors-file

2018-03-19 Thread Andreas Heiduk
Am 19.03.2018 um 00:04 schrieb Eric Wong:
> Andreas Heiduk  wrote:
>> The email address in --authors-file and --authors-prog can be empty but
>> git-svn translated it into a syntethic email address in the form
>> $USERNAME@$REPO_UUID. Now git-svn behaves like git-commit: If the email
>> is explicitly set to the empty string, the commit does not contain
>> an email address.
> 
> What is missing is WHY "<>" is preferable to "<$USERNAME@$REPO_UUID>".
>
> $USERNAME is good anyways since projects/organizations tie their
> SVN usernames to email usernames via LDAP, making it easy to
> infer their email address from $USERNAME.  The latter can also
> be used to disambiguate authors if they happen to have the same
> real name.

That's still available and it's even still the default.

But: If the user of git-svn takes the burden of writing an authors
script or maintaining an authors file then he should have full control
over the result as long as git can handle the output reasonably.
Currently that's the case for git but not for git-svn.

Git can handle empty emails quite nicely:

> git -c user.email= commit --allow-empty -m "foo"
> git show --format=raw HEAD | egrep "author|committer"
author jondoe <> 1521495217 +0100
committer jondoe <> 1521495217 +0100

Doing the same with current git-svn requires a filter-branch followed
by `rm -r .git/svn/`  followed by `git svn fetch` to recreate the
rev_map files. That would be feasible for a one-time conversion but
not in a situation where SVN is live and the master repository.

>
> "<>" is completely meaningless.
>

Not quite. The "<>" is not the only information - there is still the
mandatory "name" part. So the commit id

jondoe <>

just means: "There is intentionally no email address." For an
internal, ephemeral repository that can be OK. It has the advantage,
that no automatic system (Jira, Jenkins, ...) will try to send emails to 

jondoe 

Additionally the log output isn't cluttered with irrelevant stuff. :-)

And last but not least we don't have to hunt down names long gone by and
already deleted in LDAP. In that case the UUID doesn't help either.


Further steps: Eric Sunshine mentioned [1] that you might have concerns about
the change of behavior per se. For me the patch is not so much a new feature but
a bugfix bringing git-svn in sync with git itself. Adding an option parameter 
to enable the new behavior seems strange to me. But there might be other ways
to achieve the same effect:

- changing the output format of the file and prog: empty emails could be 
  marked by a syntax which is invalid so far.

- OR (if some change of behaviour is acceptable) the script could evaluate
  a new environment variable like GIT_SVN_UUID to compose the 
  `<$user@$uuid>` part itself.

- OR just mention it in the relaese notes ;-)

- OR [please insert ideas here]


[1] 
https://public-inbox.org/git/capig+cq1si-avazf_1kf4yx9+egd9tgudvp7npj3uyxy1pl...@mail.gmail.com/


Re: [PATCH] filter-branch: use printf instead of echo -e

2018-03-19 Thread CB Bailey
On Mon, Mar 19, 2018 at 03:49:05PM +0100, Michele Locati wrote:
> In order to echo a tab character, it's better to use printf instead of
> "echo -e", because it's more portable (for instance, "echo -e" doesn't work
> as expected on a Mac).
> 
> This solves the "fatal: Not a valid object name" error in git-filter-branch
> when using the --state-branch option.
> 
> Signed-off-by: Michele Locati 
> ---
>  git-filter-branch.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 1b7e4b2cd..21d84eff3 100755
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -627,7 +627,7 @@ then
>   print H "$_:$f\n" or die;
>   }
>   close(H) or die;' || die "Unable to save state")
> - state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git 
> mktree)
> + state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git 
> mktree)
>   if test -n "$state_commit"
>   then
>   state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
> -p "$state_commit")

I think the change from 'echo -e' to printf is good because of the
better portability reason that you cite.

Looking at the change, I am now curious as to why '/bin/echo' is used.
Testing on a Mac, bash's built in 'echo' recognizes '-e' whereas
'/bin/echo' does not. This is just an observation, I still prefer the
move to 'printf' that you suggest.

There are two further uses of '/bin/echo' in git-filter-branch.sh which
are portable (no "-e", just printing a word that cannot be confused for
an option). One is visible in your diff context and the other is just
below it. For consistency with other echos in git-filter-branch.sh, I
think that these should probably use simple 'echo' rather than
'/bin/echo' to use a builtin where available.

CB


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 5:32 PM Martin Ågren  wrote:

> This commit message mentions RUNTIME_PREFIX_PERL twice, but there is no
> use of RUNTIME_PREFIX_PERL in the actual diffs (patches 1-3/3). Should
> it be s/_PERL//? Your cover letter hints as much under "Changes in v6
> from v5". And "Add a new Makefile flag ..." would need some more
> rewriting since this patch rather expands the scope of the existing
> flag?

Thanks for pointing this out - the two were separate flags in my original
patch set because I
wanted to minimize the scope of impact; however, I have since received
advice and buy-in
on converging them and RUNTIME_PREFIX_PERL functionality was merged
underneath
of RUNTIME_PREFIX in this latest patch set.

I'll update the commit message!


Re: [GSoC][PATCH v5] Make options that expect object ids less chatty if id is invalid

2018-03-19 Thread Junio C Hamano
Paul-Sebastian Ungureanu  writes:

> Subject: Re: [GSoC][PATCH v5] Make options that expect object ids less chatty 
> if id is invalid

For the past few iterations, this is no longer about "options that
expect object IDs that get an invalid one" anymore, right?  The fix
has become a lot more generic and extended in scope.

> Usually, the usage should be shown only if the user does not know what
> options are available. If the user specifies an invalid value, the user
> is already aware of the available options. In this case, there is no
> point in displaying the usage anymore.

This presents the more general fix implemented in these later rounds
rather well.

parse-options: do not show usage upon invalid option value

may be a title that match the more recent state of this topioc
better.

> diff --git a/t/t0041-usage.sh b/t/t0041-usage.sh
> new file mode 100755
> index 0..2fc08ae70
> --- /dev/null
> +++ b/t/t0041-usage.sh
> @@ -0,0 +1,89 @@
> +#!/bin/sh
> +
> +test_description='Test commands behavior when given invalid argument value'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup ' '
> + git init . &&

Do you need this "git init ."?  I somehow doubt it.

> + test_commit "v1.0" &&
> + git tag "v1.1"
> +'
> +
> +test_expect_success 'tag --contains ' '
> + git tag --contains "v1.0" >actual &&
> + grep "v1.0" actual &&
> + grep "v1.1" actual
> +'
> +
> +test_expect_success 'tag --contains ' '
> + test_must_fail git tag --contains "notag" 2>actual &&
> + test_i18ngrep "error" actual
> +'
> +
> +test_expect_success 'tag --no-contains ' '
> + git tag --no-contains "v1.1" >actual &&
> + test_line_count = 0 actual
> +'

So...  at this point there are v1.0 and v1.1 that point at the same
commit and there is no other comit nor tag.  There is no tag that
does not contain v1.1 so the output is empty.

Which is correct, but is a rather boring thing to try ;-).

> +test_expect_success 'tag --no-contains ' '
> + test_must_fail git tag --no-contains "notag" 2>actual &&
> + test_i18ngrep "error" actual
> +'

Good.  Don't we need to check that this does not have "usage" in it?

> +test_expect_success 'tag usage error' '
> + test_must_fail git tag --noopt 2>actual &&
> + test_i18ngrep "usage" actual
> +'
> +
> +test_expect_success 'branch --contains ' '
> + git branch --contains "master" >actual &&
> + test_i18ngrep "master" actual
> +'
> +
> +test_expect_success 'branch --contains ' '
> + test_must_fail git branch --no-contains "nocommit" 2>actual &&
> + test_i18ngrep "error" actual
> +'

Likewise.

> +test_expect_success 'branch --no-contains ' '
> + git branch --no-contains "master" >actual &&
> + test_line_count = 0 actual
> +'
> +
> +test_expect_success 'branch --no-contains ' '
> + test_must_fail git branch --no-contains "nocommit" 2>actual &&
> + test_i18ngrep "error" actual
> +'

Likewise.

> +test_expect_success 'branch usage error' '
> + test_must_fail git branch --noopt 2>actual &&
> + test_i18ngrep "usage" actual
> +'
> +
> +test_expect_success 'for-each-ref --contains ' '
> + git for-each-ref --contains "master" >actual &&
> + test_line_count = 3 actual
> +'
> +
> +test_expect_success 'for-each-ref --contains ' '
> + test_must_fail git for-each-ref --no-contains "noobject" 2>actual &&
> + test_i18ngrep "error" actual
> +'

Likewise.

> +test_expect_success 'for-each-ref --no-contains ' '
> + git for-each-ref --no-contains "master" >actual &&
> + test_line_count = 0 actual
> +'
> +
> +test_expect_success 'for-each-ref --no-contains ' '
> + test_must_fail git for-each-ref --no-contains "noobject" 2>actual &&
> + test_i18ngrep "error" actual
> +'

Likewise.

> +test_expect_success 'for-each-ref usage error' '
> + test_must_fail git for-each-ref --noopt 2>actual &&
> + test_i18ngrep "usage" actual
> +'
> +
> +test_done
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index ef2887bd8..cac8b2bd8 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -919,10 +919,8 @@ test_expect_success 'rebase --exec works without -i ' '
>  test_expect_success 'rebase -i --exec without ' '
>   git reset --hard execute &&
>   set_fake_editor &&
> - test_must_fail git rebase -i --exec 2>tmp &&
> - sed -e "1d" tmp >actual &&
> - test_must_fail git rebase -h >expected &&
> - test_cmp expected actual &&
> + test_must_fail git rebase -i --exec 2>actual &&
> + test_i18ngrep "requires a value" actual &&
>   git checkout master
>  '


Re: [PATCH v3 0/2] stash push -u -- fixes

2018-03-19 Thread Thomas Gummerer
On 03/19, Marc Strapetz wrote:
> On 16.03.2018 21:43, Thomas Gummerer wrote:
> >Thanks Marc for catching the regression I almost introduced and Junio
> >for the review of the second patch.  Here's a re-roll that should fix
> >the issues of v2.
> 
> Thanks, existing issues are fixed, but cleanup of the stashed files seems to
> not work properly when stashing a mixture of tracked and untracked files:

Thanks for the continued testing, and sorry I just can't seem to get
this right :/  The problem here was that I misunderstood what 'git
ls-files --error-unmatch' does without testing it myself.  I'll send
v5 in a bit, which will hopefully finally fix all the cases.

> $ git init
> $ touch file1
> $ touch file2
> $ git add file1 file2
> $ git commit -m "initial import"
> $ echo "a" > file1
> $ echo "b" > file2
> $ touch file3
> $ git status --porcelain
>  M file1
>  M file2
> ?? file3
> $ git stash push -u -- file1 file3
> Saved working directory and index state WIP on master: 48fb140 initial
> import
> $ git status --porcelain
>  M file1
>  M file2
> 
> file1 and file3 are properly stashed, but file1 still remains modified in
> the working tree. When instead stashing file1 and file2, results are fine:
> 
> $ git stash push -u -- file1 file2
> Saved working directory and index state WIP on master: 48fb140 initial
> import
> $ git status
> On branch master
> nothing to commit, working tree clean
> 
> -Marc
> 
> 


Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-19 Thread Igor Djordjevic
Hi Sergey,

On 19/03/2018 06:44, Sergey Organov wrote:
> 
> > > > > > Second side note: if we can fast-forward, currently we prefer
> > > > > > that, and I think we should keep that behavior with -R, too.
> > > > >
> > > > > I agree.
> > > >
> > > > I'm admittedly somewhat lost in the discussion, but are you
> > > > talking fast-forward on _rebasing_ existing merge? Where would it
> > > > go in any of the suggested algorithms of rebasing and why?
> > > >
> > > > I readily see how it can break merges. E.g., any "git merge
> > > > --ff-only --no-ff" merge will magically disappear. So, even if
> > > > somehow supported, fast-forward should not be performed by default
> > > > during _rebasing_ of a merge.
> > >
> > > Hmm, now that you brought this up, I can only agree, of course.
> > >
> > > What I had in my mind was more similar to "no-rebase-cousins", like 
> > > if we can get away without actually rebasing the merge but still 
> > > using the original one, do it. But I guess that`s not what Johannes 
> > > originally asked about.
> > >
> > > This is another definitive difference between rebasing (`pick`?) and 
> > > recreating (`merge`) a merge commit - in the case where we`re rebasing, 
> > > of course it doesn`t make sense to drop commit this time (due to 
> > > fast-forward). This does make sense in recreating the merge (only).
> >
> > Eh, I might take this back. I think my original interpretation (and 
> > agreement) to fast-forwarding is correct.
> >
> > But the confusion here comes from `--no-ff` as used for merging, as 
> > opposed to `--no-ff` as used for rebasing. I _think_ Johannes meant 
> > the latter one.
> >
> > In rebasing, `--no-ff` means that even if a commit inside todo list 
> > isn`t to be changed, do not reuse it but create a new one. Here`s 
> > excerpt from the docs[1]:
> >
> >   --no-ff
> > With --interactive, cherry-pick all rebased commits instead of 
> > fast-forwarding over the unchanged ones. This ensures that the 
> > entire history of the rebased branch is composed of new commits.
> >
> > Without --interactive, this is a synonym for --force-rebase.
> >
> >
> > So fast-forwarding in case of rebasing (merge commits as well) is 
> > something you would want by default, as it wouldn`t drop/lose 
> > anything, but merely reuse existing commit (if unchanged), instead of 
> > cherry-picking (rebasing) it into a new (merge) commit anyway.
> 
> This sounds like breakage. E.g., it seems to be breaking every "-x ours"
> merge out there.

Either you are not understanding how rebase fast-forward works, or 
I`m missing what you are pointing to... Mind explaining how can 
something that`s left unchanged suddenly become a breakage?

> Fast-forwarding existing merge, one way or another, still seems to be
> wrong idea to me, as merge commit is not only about content change, but
> also about joint point at particular place in the DAG.

Not sure what this has to do with rebase fast-forwarding, either - 
nothing changes for fast-forwarded (merge or non-merge) commit in 
question, both content, joint point and everything else stays exactly 
the same. If anything changed, then it can`t/won`t be fast-forwarded, 
being unchanged is a prerequisite.

Let me elaborate a bit. Here`s a starting diagram:

(1) ---X1---X2---X3 (master)
 |\
 | A1---A2---A3
 |\
 | M---C1---C2 (topic)
 |/
 \-B1---B2---B3


With "topic" being active branch, we start interactive rebase with 
`git rebase -i master`. Generated todo list will hold commits A1 to 
A3, B1 to B3, M and C1 to C2.

Now, if we decide to `edit` commit C1, leaving everything else the 
same, fast-forward logic will make the new situation look like this:

(2) ---X1---X2---X3 (master)
 |\
 | A1---A2---A3
 |\
 | M---C1'--C2' (topic)
 |/
 \-B1---B2---B3


Notice how only C1 and C2 changed to C1' and C2'? That`s rebase 
fast-forwarding, noticing earlier commits left unchanged, thus 
reusing original ones.

No matter what, no breakage can happen to M in this case, as it`s 
left (reused) exactly as it was - it`s fast-forward rebased.

If we `edit`-ed commit A2, we would have ended in a situation like this:

(3) ---X1---X2---X3 (master)
 |\
 | A1---A2'--A3'
 |\
 | M'--C1'--C2' (topic)
 |/
 \-B1---B2---B3


This time we have new commits A2', A3', M', C1' and C2' - so 
everything influenced by the change that happened will be changed 
(merge commit as well), where all the rest can still be reused 
(fast-forwarded).

If we had started rebasing with `git rebase -i --no-ff master`, no 
matter which commits we `edit` (or none, even), we would end up with 
this instead:

Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Martin Ågren
On 19 March 2018 at 03:50, Dan Jacques  wrote:
> Add a new Makefile flag, RUNTIME_PREFIX_PERL, which, when enabled,
> configures Perl scripts to locate the Git installation's Perl support
> libraries by resolving against the script's path, rather than
> hard-coding that path at build-time.
>
> RUNTIME_PREFIX_PERL requires that system paths are expressed relative to

This commit message mentions RUNTIME_PREFIX_PERL twice, but there is no
use of RUNTIME_PREFIX_PERL in the actual diffs (patches 1-3/3). Should
it be s/_PERL//? Your cover letter hints as much under "Changes in v6
from v5". And "Add a new Makefile flag ..." would need some more
rewriting since this patch rather expands the scope of the existing
flag?

> a common installation directory, and uses that relationship to locate
> support files based on the known starting point of the script being
> executed, much like RUNTIME_PREFIX does for the Git binary.

With s/_PERL//, this part above reads a bit odd. Would this be
s/RUNTIME_PREFIX/it/?

Martin


[GSoC][PATCH v5] Make options that expect object ids less chatty if id is invalid

2018-03-19 Thread Paul-Sebastian Ungureanu
Usually, the usage should be shown only if the user does not know what
options are available. If the user specifies an invalid value, the user
is already aware of the available options. In this case, there is no
point in displaying the usage anymore.

This patch applies to "git tag --contains", "git branch --contains",
"git branch --points-at", "git for-each-ref --contains" and many more.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 builtin/blame.c   |  1 +
 builtin/shortlog.c|  1 +
 builtin/update-index.c|  1 +
 parse-options.c   | 20 
 parse-options.h   |  1 +
 t/t0040-parse-options.sh  |  2 +-
 t/t0041-usage.sh  | 89 +++
 t/t3404-rebase-interactive.sh |  6 +--
 8 files changed, 107 insertions(+), 14 deletions(-)
 create mode 100755 t/t0041-usage.sh

diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b9..e8c6a4d6a 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -729,6 +729,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, blame_opt_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
if (ctx.argv[0])
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index e29875b84..be4df6a03 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -283,6 +283,7 @@ int cmd_shortlog(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, shortlog_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
goto parse_done;
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d28..34adf55a7 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
break;
switch (parseopt_state) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_NON_OPTION:
case PARSE_OPT_DONE:
diff --git a/parse-options.c b/parse-options.c
index d02eb8b01..47c09a82b 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -317,14 +317,16 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, 
const char *arg,
return get_value(p, options, all_opts, flags ^ opt_flags);
}
 
-   if (ambiguous_option)
-   return error("Ambiguous option: %s "
+   if (ambiguous_option) {
+   error("Ambiguous option: %s "
"(could be --%s%s or --%s%s)",
arg,
(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
ambiguous_option->long_name,
(abbrev_flags & OPT_UNSET) ?  "no-" : "",
abbrev_option->long_name);
+   return -3;
+   }
if (abbrev_option)
return get_value(p, abbrev_option, all_opts, abbrev_flags);
return -2;
@@ -434,7 +436,6 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
   const char * const usagestr[])
 {
int internal_help = !(ctx->flags & PARSE_OPT_NO_INTERNAL_HELP);
-   int err = 0;
 
/* we must reset ->opt, unknown short option leave it dangling */
ctx->opt = NULL;
@@ -459,7 +460,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
ctx->opt = arg + 1;
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (ctx->opt)
check_typos(arg + 1, options);
@@ -472,7 +473,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
while (ctx->opt) {
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (internal_help && *ctx->opt == 'h')
goto show_usage;
@@ -504,9 +505,11 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
goto show_usage;
switch (parse_long_opt(ctx, arg + 2, options)) {
case -1:
-   goto show_usage_error;
+   return 

Re: [PATCH 3/3] Makefile: optionally symlink libexec/git-core binaries to bin/git

2018-03-19 Thread Linus Torvalds
On Mon, Mar 19, 2018, 04:34 Johannes Schindelin
 wrote:
>
> This is a real problem.

No it isn't.

We already handle those special cases specially, and install them in
the bin directory (as opposed to libexec). And it all works fine.

Look into the bin directory some day. You'll find things like

  git-cvsserver
  gitk
  git-receive-pack
  git-shell
  git-upload-archive
  git-upload-pack

there, and the fact that a couple of them happen to be built-ins is an
IMPLEMENTATION DETAIL, not a "Oh we should have used just 'git' for
them".

The design of having separate programs is the *good* conceptual
design. And we damn well should keep it for these things that are used
for special purposes.

The fact that two of them have become built-ins as part of the git
binary is incidental. It shouldn't be visible in the names, because it
really is just an internal implementation thing, not anything
fundamental.

> And it is our own darned fault because we let an
> implementation detail bleed into a protocol. We could have designed that a
> lot better.

And by "we" you clearly mean "not you", and by "we could have designed
that a lot better" you must mean "and it was very well designed by
competent people who didn't use bad operating systems".

> Of course we should fix this, though. There is literally no good reason

Go away.

We shouldn't fix it, it's all fine as-is, and there were tons of
f*cking good reasons for why git did what it did. The main one being
"it's a collection of scripts", which was what git _was_, for
chrissake. And using spaces and running some idiotic and
hard-to-verify script de-multiplexer is the WRONG THING for things
like "git-shell" and "git-receive-pack" and friends.

Right now you can actually verify exactly what "git-shell" does. Or
you could replace - or remove - it entirely if you don't like it. And
never have to worry about running "git" with some "shell" subcommand.

And you know that it's not an alias, for example.  Because "git-xyz"
simply does not look up aliases.

So really. Go away, Johannes. Your concerns are complete and utter BS.

The real problem is that Windows is badly designed, but since it's
easy to work around (by using hard-linking on Windows), nobody sane
cares.

The solution is simple, and was already suggested: use symlinks (like
we used to!) on non-windows systems. End of story.

And for the libexec thing, we might want to deprecate those names, if
somebody wants to, but it's not like it actually hurts, and it gives
backwards compatibility.

Btw, real Windows people know all about backwards compatibility. Ask
around competent people inside MS whether it's an important thing.

So stop this idiotic "bad design" crap. Somebody working on Windows
simply can't afford your attitude.

Somebody who didn't design it in the first place can't afford your attitude.

 Linus


Re: What's cooking in git.git (Mar 2018, #03; Wed, 14)

2018-03-19 Thread Derrick Stolee

On 3/15/2018 4:36 AM, Ævar Arnfjörð Bjarmason wrote:

On Thu, Mar 15 2018, Junio C. Hamano jotted:


* nd/repack-keep-pack (2018-03-07) 6 commits
  - SQUASH???
  - pack-objects: display progress in get_object_details()
  - pack-objects: show some progress when counting kept objects
  - gc --auto: exclude base pack if not enough mem to "repack -ad"
  - repack: add --keep-pack option
  - t7700: have closing quote of a test at the beginning of line

  "git gc" in a large repository takes a lot of time as it considers
  to repack all objects into one pack by default.  The command has
  been taught to pretend as if the largest existing packfile is
  marked with ".keep" so that it is left untouched while objects in
  other packs and loose ones are repacked.

  Expecting a reroll.
  cf. 

Re: Bug With git rebase -p

2018-03-19 Thread Junio C Hamano
Joseph Strauss  writes:

> I found the following erroneous behavior with "git rebase -p".
>
> My current version is git version 2.16.2.windows.1
>
> I made an example at GitHub, https://github.com/jkstrauss/git-bug/
>
> There seem to be two problems when rebasing merge commits with git rebase -p :
>   1. All lines of merge commits' messages get collapse into a single line.
>   2. When an asterisk is present in the middle of the line it gets replaced 
> with the file names of the current directory.

I suspect that this has already been independently discovered
(twice) and corrected with

https://public-inbox.org/git/20180208204241.19324-1-gregory.herr...@oracle.com/

and is included in v2.17-rc0 (and later ;-).




Re: [PATCH v4 1/4] worktree: improve message when creating a new worktree

2018-03-19 Thread Junio C Hamano
Duy Nguyen  writes:

>> @@ -320,10 +318,19 @@ static int add_worktree(const char *path, const char 
>> *refname,
>> if (ret)
>> goto done;
>>
>> +   fprintf(stderr, _("worktree HEAD is now at %s"),
>
> We use the term "working tree" for UI and documents. "worktree" is
> only used in code comments and stuff.

Hmph, that is a bit different from what I recall.  "working tree" is
the phrase we have used and we still use to refer to those things
that are checked out (as opposed to the in-repo data).  We say
"worktree" when we want to stress the fact that we are talking about
a single instance among possibly multiple instances of the "working
tree" that are associated to a single repository.

Technically speaking, the traditional "working tree" is everything
in the directory immediately above ".git/" plus ".git/index"; a
single "worktree" consists of a bit more, as we have to count per
worktree states like .git/rebase-apply/ and .git/refs/bisect/ as
part of it.

And from that point of view, HEAD is one of those per worktree
states, so...


Bug With git rebase -p

2018-03-19 Thread Joseph Strauss
I found the following erroneous behavior with "git rebase -p".

My current version is git version 2.16.2.windows.1

I made an example at GitHub, https://github.com/jkstrauss/git-bug/

There seem to be two problems when rebasing merge commits with git rebase -p :
  1. All lines of merge commits' messages get collapse into a single line.
  2. When an asterisk is present in the middle of the line it gets replaced 
with the file names of the current directory.

  


Joseph Kalman Strauss
Lifecycle Management Engineer
B Photo
212-239-7500 x2212
josep...@bhphoto.com


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Junio C Hamano
Daniel Jacques  writes:

>> > The merge conflict becomes a tad easier to deal with, also makes sense
>> > to have gitexecdir after infodir since that's the order we're listing
>> > these in just a few lines earlier, and this is otherwise (mostly)
>> > consistent.
>
> Actually as a quick follow-up question: for these patch sets, is it best
> for me to have them based off of "master", "next", or a different branch?

When you are cooperating with somebody else, e.g. in this case you
are planning your changes to work well with the ab/install-symlinks
topic, there are three choices, I think.

 (1) Build your topic on 'master'.  From time to time (and
 especially before sending it out to the list), do a trial merge
 of your topic to 'master', 'next' and 'pu' to see how badly it
 interacts with the other topic.  

 If the conflicts are not too bad, and if it makes sense for
 your topic to graduate without the other topic being in
 'master', then this is the preferrable approach.

 (2) Build your topic on top of the other's topic.  When the other
 branch gets updated (either by rerolling if it is not yet on
 'next', or by adding a follow up commit), you may need to
 rebase before sending an update.

 As long as you can live without new stuff added to 'master'
 since the other's topic forked from 'master', this is probably
 the second best option.  It definitely is worse than (1) as
 you'd need to rebase on top of other's work, which will become
 impossible once your topic hits 'next'.

 (3) Make a merge of the other's topic into 'master', and then build
 your topic on top of the result.  Keep the updates from the
 other's topic to the minimum once you start working on your
 topic to simplify the task to update your topic.  From time to
 time, do a trial merge to 'master', 'next' and 'pu' to ensure
 you are compatible with the updates made to the other's topic
 since you forked from them.

 As long as the other's topic is already fairly stable, and if
 you need to depend on new stuff added to 'master' since the
 other's topic forked from 'master', this is a workable
 approach.

I suspect that (1) is fine in this case.  As to the reordering of
gitexecdir_relative thing Ævar mentioned, I agree that such a change
is good because the order of the lines in the result makes more
sense.

Thanks.


Re: [PATCH 44/44] packfile: keep prepare_packed_git() private

2018-03-19 Thread Jonathan Tan
On Sat,  3 Mar 2018 18:36:37 +0700
Nguyễn Thái Ngọc Duy   wrote:

> The reason callers have to call this is to make sure either packed_git
> or packed_git_mru pointers are initialized since we don't do that by
> default. Sometimes it's hard to see this connection between where the
> function is called and where packed_git pointer is used (sometimes in
> separate functions).
> 
> Keep this dependency internal because now all access to packed_git and
> packed_git_mru must go through get_xxx() wrappers.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

The patches up to and including this one look good.

I also see that the question I asked in patch 10 about lazily
initializing some fields is answered here.

If we're planning to avoid making the user call prepare_packed_git()
(which I agree with), I think we also need to ensure that we always use
the get_xxx() wrapper whenever we access objects.packed_git. Currently,
there are still some functions in packfile.c that do not do so (notably
for_each_packed_object()). Could these be changed too? (This should not
be too difficult for you to do on your own, but I can send a fixup
patch if you want.)


Re: Potential git bug

2018-03-19 Thread Jeff King
On Mon, Mar 19, 2018 at 04:14:31PM -0400, Nick Hunt wrote:

> oh, wait, switching branches didn't vaporize my changes, it auto-committed 
> them.
> which is still weird and possibly a bug?

Are you sure that the changes were not already present on the
switched-to branch?

-Peff


Re: Potential git bug

2018-03-19 Thread Nick Hunt
oh, wait, switching branches didn't vaporize my changes, it auto-committed them.
which is still weird and possibly a bug?
Nick Hunt
nhun...@gmail.com
404-988-1845


On Mon, Mar 19, 2018 at 3:13 PM, Nick Hunt  wrote:
> i committed my changes, then ran
> git reset --soft HEAD^
> at this point everything is fine
> then i switched branches, and all of my changes vaporized into thin
> air. uhhh, is this supposed to happen?
>
> anyway, thank god intellij saves my work for me as i go, so i didn't
> have to re-write all my code.
>
> my bash/zsh commands are attached.
>
> thanks for the help! :)


Re: [PATCH 39/44] packfile: allow prepare_packed_git_one to handle arbitrary repositories

2018-03-19 Thread Jonathan Tan
On Sat,  3 Mar 2018 18:36:32 +0700
Nguyễn Thái Ngọc Duy   wrote:

> From: Stefan Beller 
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Thanks - I've checked that none of the functions invoked in
prepare_packed_git_one() access the_repository. (add_packed_git() does
not, despite its name.)

The patches up to this one are fine.

[snip]

> - for (p = the_repository->objects.packed_git; p;
> + for (p = r->objects.packed_git; p;

Optional: this could be get_packed_git(r).


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Junio C Hamano
Duy Nguyen  writes:

> There is a difference. For sizes smaller than 2^32, whatever you
> pass to oe_set_size() will be returned by oe_size(),
> consistently. It does not matter if this size is "good"  If
> it's different, oe_size() will return something else other than
> oe_set_size() is given.

OK, fair enough.

> ... I was trying to exercise this
> code the other day by reducing size_ field down to 4 bits, and a
> couple tests broke but I still don't understand how.

Off by one?  Two or more copies of the same objects available whose
oe_size() are different?



Re: feature-request: git "cp" like there is git mv.

2018-03-19 Thread Junio C Hamano
Stefan Moch  writes:

> Are such redundant checks in general a pattern worth searching
> for and cleaning up globally? Or is this rather in the category
> of cleaning up only when noticed?

A clean-up patch that is otherwise a no-op is still welcome as it
will improve the health of the codebase, but they become hindrances
if there are too many of them to consume the review bandwidth that
would otherwise be better spent on other non no-op topics, and/or if
they add too many merge conflicts with other non no-op topics in
flight.

The amount of such negative impact a no-op clean-up patch can have
on the project does not depend on how the issue was discovered, so
we do not even have to know if the issue was discovered by actively
hunting or by noticing while working on a near-by area.

It is possible that by actively looking for, you may end up
producing more of the no-op clean-up patches and can more easily
interfere with other topics, which we may need to discourge or at
least ask you to slow down.  On the other hand, issues discovered
while working on a near-by area would typically not increase
conflicts with other topics in flight over the conflicts that would
be caused by that real work you were doing in a near-by area
already, so in that sense, "only when noticed" is a practical way to
avoid the clean-up fatigue.


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 3:21 PM Ævar Arnfjörð Bjarmason 
wrote:

> I think it would be more idiomatic and more paranoid (we'll catch bugs)
> to do:

>   my $exec_path;
>   if (exists $ENV{GIT_EXEC_PATH}) {
>   $exec_path = $ENV{GIT_EXEC_PATH};
>   } else {
>   [...]
>   }

> I.e. we're interested if we got passed GIT_EXEC_PATH, so let's see if it
> exists in the env hash, and then use it as-is. If we have some bug where
> it's an empty string we'd like to know, presumably...

Good idea, done.

> > +
> > + # Trim off the relative gitexecdir path to get the system path.
> > + (my $prefix = $exec_path) =~ s=${gitexecdir_relative}$==;

> The path could contain regex metacharacters, so let's quote those via:

>   (my $prefix = $exec_path) =~ s/\Q$gitexecdir_relative\E$//;

> This also nicely gets us rid of the more verbose ${} form, which makes
> esnse when we're doing ${foo}$ instead of the arguably less readbale
> $foo$, but when it's \Q$foo\E$ it's clear what's going on.

Ah cool - makes sense. I'm not strong with Perl, so I wasn't aware that
this was an option, but I agree it's cleaner. Done.


Re: [PATCH v3 3/5] ref-filter: change parsing function error handling

2018-03-19 Thread Junio C Hamano
Olga Telezhnaya  writes:

> Continue removing any printing from ref-filter formatting logic,
> so that it could be more general.
>
> Change the signature of parse_ref_filter_atom() by adding
> strbuf parameter for error message.
> Return value means the same except negative values: they indicate
> errors (previous version could return only non-negative value).

"The same" meaning?  The same as before?  It shouldn't be too much
work to describe what it means instead (and mention that you are not
changing the meaning as a sidenote) to help the readers, something
like "The function returns the position in the used_atom[] array (as
before) for the given atom, or a negative value to signal an error".



Re: [PATCH 10/44] object-store: move packed_git and packed_git_mru to object store

2018-03-19 Thread Jonathan Tan
On Sat,  3 Mar 2018 18:36:03 +0700
Nguyễn Thái Ngọc Duy   wrote:

> From: Stefan Beller 
> 
> In a process with multiple repositories open, packfile accessors
> should be associated to a single repository and not shared globally.
> Move packed_git and packed_git_mru into the_repository and adjust
> callers to reflect this.
> 
> [nd: while at there, wrap access to these two fields in get_packed_git()
> and get_packed_git_mru(). This allows us to lazily initialize these
> fields without caller doing that explicitly]

The patches up to this one look good. (I didn't reply for each
individual one to avoid unnecessarily sending messages to the list.)

About this patch: will lazily initializing these fields be done in a
later patch?

> Patch generated by
> 
>  1. Moving the struct packed_git declaration to object-store.h
> and packed_git, packed_git_mru globals to struct object_store.
> 
>  2. Apply the following semantic patch to adjust callers:
> @@ @@
> - packed_git
> + the_repository->objects.packed_git
> 
> @@ @@
> - packed_git_mru
> + the_repository->objects.packed_git_mru

This doesn't seem up-to-date - they are being replaced with a function
call, not a field access. I would be OK with just removing this "Patch
generated by" section.

[snip remainder of "Patch generated by" section]

> @@ -246,7 +244,7 @@ static int unuse_one_window(struct packed_git *current)
>  
>   if (current)
>   scan_windows(current, _p, _w, _l);
> - for (p = packed_git; p; p = p->next)
> + for (p = the_repository->objects.packed_git; p; p = p->next)
>   scan_windows(p, _p, _w, _l);
>   if (lru_p) {
>   munmap(lru_w->base, lru_w->len);

Here (and elsewhere), "the_repository->objects.packed_git" instead of
"get_packed_git" is still used.


Re: [PATCH v6 3/3] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 3:27 PM Ævar Arnfjörð Bjarmason 
wrote:

> I wonder if it wouldn't be a lot more understandable if these were noted
> together, i.e. let's first document RUNTIME_PREFIX, then for all the
> other ones say below that:

Sounds good to me, done.

> Whitespace changed mixed in with the actual change.

Oops, automatic "clang-format" slipped in there. I've reverted this part.

Thanks for reviewing!


Re: [PATCH v6 0/3] RUNTIME_PREFIX relocatable Git

2018-03-19 Thread Ævar Arnfjörð Bjarmason

On Mon, Mar 19 2018, Dan Jacques jotted:

> I'm dusting this back off now that avarab@'s Perl Makefile simplification
> patch set has landed. It's been a few months, so I'm a bit rusty, but I think
> that I've incorporated all of the feedback. Please take a look and let me know
> what you think!

Thanks a lot, sans the tiny nits I noted in individual patch review (and
stuff noted by others) these all look good to me.

Also it would be great if you could test it for your use-case with the
next branch and define my new INSTALL_SYMLINKS to check that it doesn't
ruin anything for you, it shouldn't since I made it use relative
symlinks, but better to make sure (maybe I missed some edge case, and
we're largely modifying code in similar places).


Re: [PATCH v6 3/3] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2018-03-19 Thread Ævar Arnfjörð Bjarmason

On Mon, Mar 19 2018, Dan Jacques jotted:

>  #
>  # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl 
> function.
>  #
> +# Define HAVE_BSD_KERN_PROC_SYSCTL if your platform supports the KERN_PROC 
> BSD
> +# sysctl function.
> +#
> +# Define PROCFS_EXECUTABLE_PATH if your platform mounts a "procfs" filesystem
> +# capable of resolving the path of the current executable. If defined, this
> +# must be the canonical path for the "procfs" current executable path.
> +#
> +# Define HAVE_NS_GET_EXECUTABLE_PATH if your platform supports calling
> +# _NSGetExecutablePath to retrieve the path of the running executable.
> +#
>  # Define HAVE_GETDELIM if your system has the getdelim() function.
>  #
>  # Define PAGER_ENV to a SP separated VAR=VAL pairs to define

This is fine in isolation, but the sum total of the series ends up
being:

diff --git a/Makefile b/Makefile
index 96f6138f63..c23d4d10f0 100644
--- a/Makefile
+++ b/Makefile
@@ -425,6 +425,16 @@ all::
 #
 # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl 
function.
 #
+# Define HAVE_BSD_KERN_PROC_SYSCTL if your platform supports the KERN_PROC 
BSD
+# sysctl function.
+#
+# Define PROCFS_EXECUTABLE_PATH if your platform mounts a "procfs" 
filesystem
+# capable of resolving the path of the current executable. If defined, this
+# must be the canonical path for the "procfs" current executable path.
+#
+# Define HAVE_NS_GET_EXECUTABLE_PATH if your platform supports calling
+# _NSGetExecutablePath to retrieve the path of the running executable.
+#
 # Define HAVE_GETDELIM if your system has the getdelim() function.
 #
 # Define PAGER_ENV to a SP separated VAR=VAL pairs to define
@@ -441,6 +451,13 @@ all::
 #
 # When cross-compiling, define HOST_CPU as the canonical name of the CPU on
 # which the built Git will run (for instance "x86_64").
+#
+# Define RUNTIME_PREFIX to configure Git to resolve its ancillary tooling 
and
+# support files relative to the location of the runtime binary, rather than
+# hard-coding them into the binary. Git installations built with 
RUNTIME_PREFIX
+# can be moved to arbitrary filesystem locations. RUNTIME_PREFIX also 
causes
+# Perl scripts to use a modified entry point header allowing them to 
resolve
+# support files at runtime.

I wonder if it wouldn't be a lot more understandable if these were noted
together, i.e. let's first document RUNTIME_PREFIX, then for all the
other ones say below that:

   # When using RUNTIME_PREFIX, define HAVE_BSD[...]

Or something like that. We can always drop the "When using
RUNTIME_PREFIX, " bit later if it ends up benig used for other stuff,
but for now it's helpful to note that you don't need to care about these
if you're not using RUNTIME_PREFIX.

> - "but prefix computation failed.  "
> - "Using static fallback '%s'.\n", prefix);
> +  "but prefix computation failed.  "
> +  "Using static fallback '%s'.\n",
> +  prefix);

Whitespace changed mixed in with the actual change.


Re: [PATCH v3 2/5] ref-filter: add return value && strbuf to handlers

2018-03-19 Thread Junio C Hamano
Olga Telezhnaya  writes:

> Continue removing any printing from ref-filter formatting logic,
> so that it could be more general.

"more general" sounds overly broad.  Is this to avoid calling die()
so that the caller can decide what error messages it wants to give,
abort operation or not, etc.?  From a quick scan of the patch, I
have a feeling that "any printing" is not the problem you are
solving (calling die() is).

> Change the signature of handlers by adding return value
> and strbuf parameter for errors.

Could you explain what the "return value" means?  We can see from
the patch that the function signature is being changed by the patch.
What needs human explanation is why and what the values mean,
e.g. "to allow the caller to notice an error, the function returns 0
upon success and non-0 upon failure, and an error message is
appended to the strbuf *err" (don't pay too much attention to "0" or
"non-0" in this example, as I do not know what you chose to assign
the return values to signal what to the callers; this is merely to
illustrate the degree of details I would expect).

Thanks.


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Daniel Jacques
> > The merge conflict becomes a tad easier to deal with, also makes sense
> > to have gitexecdir after infodir since that's the order we're listing
> > these in just a few lines earlier, and this is otherwise (mostly)
> > consistent.

Actually as a quick follow-up question: for these patch sets, is it best
for me to have them based off of "master", "next", or a different branch?


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 3:12 PM Ævar Arnfjörð Bjarmason 
wrote:

> The merge conflict becomes a tad easier to deal with, also makes sense
> to have gitexecdir after infodir since that's the order we're listing
> these in just a few lines earlier, and this is otherwise (mostly)
> consistent.

Got it, I'll update my patch set to include this in v7, which I'll post
after a little more time for comment on v6. Thanks!


Potential git bug

2018-03-19 Thread Nick Hunt
i committed my changes, then ran
git reset --soft HEAD^
at this point everything is fine
then i switched branches, and all of my changes vaporized into thin
air. uhhh, is this supposed to happen?

anyway, thank god intellij saves my work for me as i go, so i didn't
have to re-write all my code.

my bash/zsh commands are attached.

thanks for the help! :)
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●  
gst
On branch master
Your branch is behind 'origin/master' by 50 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Changes not staged for commit:
  (use "git add ..." to update what will be committed)
  (use "git checkout -- ..." to discard changes in working directory)

modified:   ../cnn/services.js
modified:   ../resources/vendor/quark-filters/.bower.json
modified:   ../resources/vendor/quark-filters/README.md
modified:   ../resources/vendor/quark-filters/dist/quark-filters.min.js
modified:   
../resources/vendor/quark-filters/dist/quark-filters.min.js.map
modified:   ../resources/vendor/quark-filters/src/quark-filters.js
modified:   js/videomonitor-page.js

no changes added to commit (use "git add" and/or "git commit -a")
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●  
git add -u
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ✚  
gst
On branch master
Your branch is behind 'origin/master' by 50 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Changes to be committed:
  (use "git reset HEAD ..." to unstage)

modified:   ../cnn/services.js
modified:   ../resources/vendor/quark-filters/.bower.json
modified:   ../resources/vendor/quark-filters/README.md
modified:   ../resources/vendor/quark-filters/dist/quark-filters.min.js
modified:   
../resources/vendor/quark-filters/dist/quark-filters.min.js.map
modified:   ../resources/vendor/quark-filters/src/quark-filters.js
modified:   js/videomonitor-page.js

 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ✚  
git reset -- ../cnn/services.js
Unstaged changes after reset:
M   cnn/services.js
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●✚ 
 gst
On branch master
Your branch is behind 'origin/master' by 50 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Changes to be committed:
  (use "git reset HEAD ..." to unstage)

modified:   ../resources/vendor/quark-filters/.bower.json
modified:   ../resources/vendor/quark-filters/README.md
modified:   ../resources/vendor/quark-filters/dist/quark-filters.min.js
modified:   
../resources/vendor/quark-filters/dist/quark-filters.min.js.map
modified:   ../resources/vendor/quark-filters/src/quark-filters.js
modified:   js/videomonitor-page.js

Changes not staged for commit:
  (use "git add ..." to update what will be committed)
  (use "git checkout -- ..." to discard changes in working directory)

modified:   ../cnn/services.js

 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●✚ 
 git commit -m "MCI-515: initial commit: speeding up load time of the monitor 
page by defaulting the date filter to 12 hours instead of 'all time', b/c all 
time doesn't actually represent all time and b/c no users actually look at all 
time anyway"
[master 23942d272] MCI-515: initial commit: speeding up load time of the 
monitor page by defaulting the date filter to 12 hours instead of 'all time', 
b/c all time doesn't actually represent all time and b/c no users actually look 
at all time anyway
 6 files changed, 349 insertions(+), 276 deletions(-)
 rewrite resources/vendor/quark-filters/dist/quark-filters.min.js.map (95%)
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●  
gst
On branch master
Your branch and 'origin/master' have diverged,
and have 1 and 50 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)

Changes not staged for commit:
  (use "git add ..." to update what will be committed)
  (use "git checkout -- ..." to discard changes in working directory)

modified:   ../cnn/services.js

no changes added to commit (use "git add" and/or "git commit -a")
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ●  
git push
No user exists for uid 501
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
 ✘ nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   master ● 
 gco -b MCI-515
M   cnn/services.js
Switched to a new branch 'MCI-515'
 nhunt@nicks-mbp  ~/Turner/newstron_client/newstron/videoingest   MCI-515 ● 
 gst
On branch MCI-515
Changes not staged for commit:
  (use "git add ..." to update what will 

Re: [PATCH v2 00/36] Promisor remotes and external ODB support

2018-03-19 Thread Junio C Hamano
Christian Couder  writes:

> A lot of things are different because the jh/fsck-promisors and
> jh/partial-clone have been merged into master since the v1. So the
> integration is much more complete now (though not fully complete), and
> this integration happens quite early in the patch series.
>
> This integration makes it possible to have many promisor and partial
> clone remotes (instead of just one) with possibly different partial
> clone filters, though that is not tested yet.
>
> I am not sure that the "external odb" name is still the best, as the
> promisor remotes are not quite external. So I am open to suggestions
> about a new name.

So,... so far we have a way to make an incomplete clone of a project
from a remote that promises that objects (deliberately) missing from
the resulting repository are available later by an on-demand request.

We do not yet have code to actually make an on-demand request, and
the other side of the request to fulfill the promise.  

And that is what these patches want to do?

That sounds like a "lazy backfill" mechanism; I know others are
better in naming things than me, though ;-)

On the other hand, if the code updated with these patches do not
cooperate with the promise mechansim (e.g. request is made to any
missing objects, instead of "this object was promised by that
remote, so let's go there and ask" and "this object is simply
missing, without promise by anybody, so let's not bother the
promisor remote"), then it is not even "back"-filling, but is a
mechanism to access remote object database over a protocol, so
a minimum s/ext/remote/ would clarify what it is.

I guess "lazy backfill" would be more preferrable than a pile of
independent, competing and uncooperative features ;-)

>   - Patches 13/36 and 14/36:
>
> These patches move over the promisor objects and partial clone code to
> use the external odb mechanism. The result of 13/36 is that instead of
> the "extensions.partialclone" config variable, a
> "odb..promisorRemote" config variable is now used. The result of
> 14/36 is that instead of the "core.partialclonefilter" config
> variable, a "odb..partialclonefilter" config variable is now
> used.

The use of "extensions" was to protect the repository from versions
of Git that are unaware of the "promise" mechanism to even attempt
touching it.  Will we keep the same degree of safety with these
changes, I wonder?



Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Ævar Arnfjörð Bjarmason

On Mon, Mar 19 2018, Dan Jacques jotted:

> +gitexecdir_relative = $(patsubst $(prefix)/%,%,$(gitexecdir))
>  mandir_relative = $(patsubst $(prefix)/%,%,$(mandir))
>  infodir_relative = $(patsubst $(prefix)/%,%,$(infodir))
> +localedir_relative = $(patsubst $(prefix)/%,%,$(localedir))
>  htmldir_relative = $(patsubst $(prefix)/%,%,$(htmldir))
> +perllibdir_relative = $(patsubst $(prefix)/%,%,$(perllibdir))

I stole a small part of this for my a4d79b99a0 ("Makefile: add a
gitexecdir_relative variable", 2018-03-13) patch now sitting in next, if
you do this:

diff --git a/Makefile b/Makefile
index 101a98a783..033a55505e 100644
--- a/Makefile
+++ b/Makefile
@@ -501,9 +501,9 @@ lib = lib
 # DESTDIR =
 pathsep = :

-gitexecdir_relative = $(patsubst $(prefix)/%,%,$(gitexecdir))
 mandir_relative = $(patsubst $(prefix)/%,%,$(mandir))
 infodir_relative = $(patsubst $(prefix)/%,%,$(infodir))
+gitexecdir_relative = $(patsubst $(prefix)/%,%,$(gitexecdir))
 localedir_relative = $(patsubst $(prefix)/%,%,$(localedir))
 htmldir_relative = $(patsubst $(prefix)/%,%,$(htmldir))
 perllibdir_relative = $(patsubst $(prefix)/%,%,$(perllibdir))

The merge conflict becomes a tad easier to deal with, also makes sense
to have gitexecdir after infodir since that's the order we're listing
these in just a few lines earlier, and this is otherwise (mostly)
consistent.


Re: [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write'

2018-03-19 Thread Derrick Stolee

On 3/19/2018 10:36 AM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Mar 19 2018, Derrick Stolee jotted:


On 3/18/2018 9:25 AM, Ævar Arnfjörð Bjarmason wrote:

On Wed, Mar 14 2018, Derrick Stolee jotted:


+'git commit-graph write'  [--object-dir ]
+
+
+DESCRIPTION
+---
+
+Manage the serialized commit graph file.
+
+
+OPTIONS
+---
+--object-dir::
+   Use given directory for the location of packfiles and commit graph
+   file. The commit graph file is expected to be at /info/commit-graph
+   and the packfiles are expected to be in /pack.

Maybe this was covered in a previous round, this series is a little hard
to follow since each version isn't In-Reply-To the version before it,
but why is this option needed, i.e. why would you do:

  git commit-graph write --object-dir=/some/path/.git/objects

As opposed to just pigging-backing on what we already have with both of:

  git --git-dir=/some/path/.git commit-graph write
  git -C /some/path commit-graph write

Is there some use-case where you have *just* the objects dir and not the
rest of the .git folder?

Yes, such as an alternate. If I remember correctly, alternates only
need the objects directory.

In the GVFS case, we place prefetch packfiles in an alternate so there
is only one copy of the "remote objects" per drive. The commit graph
will be stored in that alternate.

Makes sense, but we should really document this as being such an unusual
option, i.e. instead say something like.

 Use given directory for the location of packfiles and commit graph
 file. Usually you'd use the `--git-dir` or `-C` arguments to `git`
 itself. This option is here to support obscure use-cases where we
 have a stand-alone object directory. The commit graph file is
 expected to be at /info/commit-graph and the packfiles are
 expected to be in /pack.


A slight change to your recommendation:


OPTIONS
---
--object-dir::
    Use given directory for the location of packfiles and commit graph
    file. This parameter exists to specify the location of an alternate
    that only has the objects directory, not a full .git directory. The
    commit graph file is expected to be at /info/commit-graph and
    the packfiles are expected to be in /pack.



Re: [PATCH v3 2/7] gc: add --keep-base-pack

2018-03-19 Thread Ævar Arnfjörð Bjarmason

On Mon, Mar 19 2018, Duy Nguyen jotted:

> On Fri, Mar 16, 2018 at 10:05 PM, Ævar Arnfjörð Bjarmason
>  wrote:
>>
>> On Fri, Mar 16 2018, Nguyễn Thái Ngọc Duy jotted:
>>
>>> +--keep-base-pack::
>>> + All packs except the base pack and those marked with a `.keep`
>>> + files are consolidated into a single pack. The largest pack is
>>> + considered the base pack.
>>> +
>>
>> I wonder if all of this would be less confusing as:
>>
>>> +--keep-biggest-pack::
>>> + All packs except the largest pack and those marked with a `.keep`
>>> + files are consolidated into a single pack.
>>
>> I.e. just skimming these docs I'd expect "base" to somehow be the thing
>> that we initially cloned, of course in almost all cases that *is* the
>> largest pack, but not necessarily. So rather than communicate that
>> expectation let's just say largest/biggest?
>
> Keeping the term base pack allows us to change its definition later
> (something else other than "largest"). But to be honest I can't see
> what else can a base pack(s) be. So unless people object I'm changing
> this to --keep-biggest-pack (which could take a value  to keep 
> largest packs, but I will refrain from doing things we don't need
> right now).

Maybe I've just been reading this differently, but to me the "base" pack
means the pack that holds the basis of our history, i.e. the thing we
first cloned. As in the base of the history.

Let's say we have a 100MB pack that we cloned, and someone adds a 200MB
(uncompressible) binary file to the repo, then we'll have a "base" pack
that's smaller than the "largest" pack.

So when I was initially reading this series I kept looking for some
discovery of *that* pack, but of course it turned out that it's just
looking for the largest pack.

I just think it's best to avoid that confusion since we really mean
largest, and maybe in the future it would be legitimate to treat the
"base" pack differently, i.e. as you pull down more updates you're
likely to need to be consulting it less and less as time goes on, so
maybe we should have some mode to explicitly exclude just *that* pack
eventually. I.e. as an optimization to keep the more relevant stuff in
cache.


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Junio C Hamano
Duy Nguyen  writes:

> This is why I do "size_valid = size_ == size". In my private build, I
> reduced size_ to less than 32 bits and change the "fits_in_32bits"
> function to do something like
>
> int fits_in_32bits(unsigned long size)
> {
> struct object_entry e;
> e.size_ = size;
> return e.size_ == size.
> }
>
> which makes sure it always works. This spreads the use of "valid = xx
> == yy"  in more places though. I think if we just limit the use of
> this expression in a couple access wrappers than it's not so bad.

Yes, but then we should name the helper so that it is clear that it
is not about 32-bit but is about the width of e.size_ field.
>
>>> + if (!e->size_valid) {
>>> + unsigned long real_size;
>>> +
>>> + if (sha1_object_info(e->idx.oid.hash, _size) < 0 ||
>>> + size != real_size)
>>> + die("BUG: 'size' is supposed to be the object size!");
>>> + }
>>
>> If an object that is smaller than 4GB is fed to this function with
>> an incorrect size, we happily record it in e->size_ and declare it
>> is valid.  Wouldn't that be equally grave error as we are catching
>> in this block?
>
> That adds an extra sha1_object_info() to all objects and it's
> expensive (I think it's one of the reasons we cache values in
> object_entry in the first place). I think there are also a few
> occasions we reuse even bad in-pack objects (there are even tests for
> that) so it's not always safe to die() here.

So what?  My point is that I do not see the point in checking if the
size is correct on only one side (i.e. size is too big to fit in
e->size_) and not the other.  If it is worth checking (perhaps under
"#ifndef NDEBUG" or some other debug option?) then I'd think we
should spend cycles for all objects and check.



Re: [PATCH 02/44] repository.c: move env-related setup code back to environment.c

2018-03-19 Thread Duy Nguyen
On Mon, Mar 19, 2018 at 7:07 PM, Jonathan Tan  wrote:
>> -extern void repo_set_gitdir(struct repository *repo, const char *path);
>> +struct set_gitdir_args {
>> + const char *commondir;
>> + const char *object_dir;
>> + const char *graft_file;
>> + const char *index_file;
>> +};
>> +
>> +extern void repo_set_gitdir(struct repository *repo,
>> + const char *root,
>> + const struct set_gitdir_args *optional);
>
> Optional: Reading this header file alone makes me think that the 3rd
> argument can be NULL, but it actually can't. I would name it
> "extra_args" and add a comment inside "struct set_gitdir_args"
> explaining (e.g.):
>
>   /*
>* Any of these fields may be NULL. If so, the name of that directory
>* is instead derived from the root path of the repository.
>*/

Yeah. I think Eric made the same comment. I'm still (very slowly) in
the process of unifying the repo setup for the main repo and
submodules, which hopefully may kill this function or replace it with
something better. But it's too early to tell. Since this part of the
series has landed in 'next', I'll post a fixup patch soon with your
suggestion.
-- 
Duy


[GSoC][PATCH v5] Make options that expect object ids less chatty if id is invalid

2018-03-19 Thread Paul-Sebastian Ungureanu
Usually, the usage should be shown only if the user does not know what
options are available. If the user specifies an invalid value, the user
is already aware of the available options. In this case, there is no
point in displaying the usage anymore.

This patch applies to "git tag --contains", "git branch --contains",
"git branch --points-at", "git for-each-ref --contains" and many more.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 builtin/blame.c   |  1 +
 builtin/shortlog.c|  1 +
 builtin/update-index.c|  1 +
 parse-options.c   | 20 
 parse-options.h   |  1 +
 t/t0040-parse-options.sh  |  2 +-
 t/t0041-usage.sh  | 89 +++
 t/t3404-rebase-interactive.sh |  6 +--
 8 files changed, 107 insertions(+), 14 deletions(-)
 create mode 100755 t/t0041-usage.sh

diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b9..e8c6a4d6a 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -729,6 +729,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, blame_opt_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
if (ctx.argv[0])
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index e29875b84..be4df6a03 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -283,6 +283,7 @@ int cmd_shortlog(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, shortlog_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
goto parse_done;
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d28..34adf55a7 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
break;
switch (parseopt_state) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_NON_OPTION:
case PARSE_OPT_DONE:
diff --git a/parse-options.c b/parse-options.c
index d02eb8b01..47c09a82b 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -317,14 +317,16 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, 
const char *arg,
return get_value(p, options, all_opts, flags ^ opt_flags);
}
 
-   if (ambiguous_option)
-   return error("Ambiguous option: %s "
+   if (ambiguous_option) {
+   error("Ambiguous option: %s "
"(could be --%s%s or --%s%s)",
arg,
(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
ambiguous_option->long_name,
(abbrev_flags & OPT_UNSET) ?  "no-" : "",
abbrev_option->long_name);
+   return -3;
+   }
if (abbrev_option)
return get_value(p, abbrev_option, all_opts, abbrev_flags);
return -2;
@@ -434,7 +436,6 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
   const char * const usagestr[])
 {
int internal_help = !(ctx->flags & PARSE_OPT_NO_INTERNAL_HELP);
-   int err = 0;
 
/* we must reset ->opt, unknown short option leave it dangling */
ctx->opt = NULL;
@@ -459,7 +460,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
ctx->opt = arg + 1;
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (ctx->opt)
check_typos(arg + 1, options);
@@ -472,7 +473,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
while (ctx->opt) {
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (internal_help && *ctx->opt == 'h')
goto show_usage;
@@ -504,9 +505,11 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
goto show_usage;
switch (parse_long_opt(ctx, arg + 2, options)) {
case -1:
-   goto show_usage_error;
+   return 

Re: [GSoC][PATCH v4] Make options that expect object ids less chatty if id is invalid

2018-03-19 Thread ungureanupaulsebastian
Hello,

Thank you for your advice! Soon enough, I wil submit a new patch which
fixes the issues you mentioned.

Best regards,
Paul Ungureanu


Re: [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write'

2018-03-19 Thread Ævar Arnfjörð Bjarmason

On Mon, Mar 19 2018, Derrick Stolee jotted:

> On 3/19/2018 10:36 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Mar 19 2018, Derrick Stolee jotted:
>>
>>> On 3/18/2018 9:25 AM, Ævar Arnfjörð Bjarmason wrote:
 On Wed, Mar 14 2018, Derrick Stolee jotted:

> +'git commit-graph write'  [--object-dir ]
> +
> +
> +DESCRIPTION
> +---
> +
> +Manage the serialized commit graph file.
> +
> +
> +OPTIONS
> +---
> +--object-dir::
> + Use given directory for the location of packfiles and commit graph
> + file. The commit graph file is expected to be at /info/commit-graph
> + and the packfiles are expected to be in /pack.
 Maybe this was covered in a previous round, this series is a little hard
 to follow since each version isn't In-Reply-To the version before it,
 but why is this option needed, i.e. why would you do:

   git commit-graph write --object-dir=/some/path/.git/objects

 As opposed to just pigging-backing on what we already have with both of:

   git --git-dir=/some/path/.git commit-graph write
   git -C /some/path commit-graph write

 Is there some use-case where you have *just* the objects dir and not the
 rest of the .git folder?
>>> Yes, such as an alternate. If I remember correctly, alternates only
>>> need the objects directory.
>>>
>>> In the GVFS case, we place prefetch packfiles in an alternate so there
>>> is only one copy of the "remote objects" per drive. The commit graph
>>> will be stored in that alternate.
>> Makes sense, but we should really document this as being such an unusual
>> option, i.e. instead say something like.
>>
>>  Use given directory for the location of packfiles and commit graph
>>  file. Usually you'd use the `--git-dir` or `-C` arguments to `git`
>>  itself. This option is here to support obscure use-cases where we
>>  have a stand-alone object directory. The commit graph file is
>>  expected to be at /info/commit-graph and the packfiles are
>>  expected to be in /pack.
>
> A slight change to your recommendation:
>
>
> OPTIONS
> ---
> --object-dir::
>  Use given directory for the location of packfiles and commit graph
>  file. This parameter exists to specify the location of an alternate
>  that only has the objects directory, not a full .git directory. The
>  commit graph file is expected to be at /info/commit-graph and
>  the packfiles are expected to be in /pack.

Sounds good. Although I think we should add

For full .git directories use the `--git-dir` or `-C` arguments to
git itself.

I.e. for documenting an unusual option it makes sense to have docs in
the form "this is bit odd, usually you'd use XYZ", rather than just
"this is a bit odd"..


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Duy Nguyen
On Mon, Mar 19, 2018 at 7:29 PM, Junio C Hamano  wrote:
 + if (!e->size_valid) {
 + unsigned long real_size;
 +
 + if (sha1_object_info(e->idx.oid.hash, _size) < 0 ||
 + size != real_size)
 + die("BUG: 'size' is supposed to be the object 
 size!");
 + }
>>>
>>> If an object that is smaller than 4GB is fed to this function with
>>> an incorrect size, we happily record it in e->size_ and declare it
>>> is valid.  Wouldn't that be equally grave error as we are catching
>>> in this block?
>>
>> That adds an extra sha1_object_info() to all objects and it's
>> expensive (I think it's one of the reasons we cache values in
>> object_entry in the first place). I think there are also a few
>> occasions we reuse even bad in-pack objects (there are even tests for
>> that) so it's not always safe to die() here.
>
> So what?  My point is that I do not see the point in checking if the
> size is correct on only one side (i.e. size is too big to fit in
> e->size_) and not the other.  If it is worth checking (perhaps under
> "#ifndef NDEBUG" or some other debug option?) then I'd think we
> should spend cycles for all objects and check.

There is a difference. For sizes smaller than 2^32, whatever you pass
to oe_set_size() will be returned by oe_size(), consistently. It does
not matter if this size is "good" or not. With sizes > 2^32, we make
the assumption that this size must be the same as one found in the
object database. If it's different, oe_size() will return something
else other than oe_set_size() is given. This check here is to make
sure we do not accidentally let the caller fall into this trap.

Yes, it may be a good thing to check anyway even for sizes < 2^32. I'm
a bit uncomfortable doing that though. I was trying to exercise this
code the other day by reducing size_ field down to 4 bits, and a
couple tests broke but I still don't understand how. It's probably
just me pushing the limits too hard, not a bug in these changes. But
it does tell me that I don't understand pack-objects enough to assert
that "all calls to oe_set_size() give good size".
-- 
Duy


Re: [PATCH 02/44] repository.c: move env-related setup code back to environment.c

2018-03-19 Thread Jonathan Tan
On Sat,  3 Mar 2018 18:35:55 +0700
Nguyễn Thái Ngọc Duy   wrote:

> It does not make sense that generic repository code contains handling
> of environment variables, which are specific for the main repository
> only. Refactor repo_set_gitdir() function to take $GIT_DIR and
> optionally _all_ other customizable paths. These optional paths can be
> NULL and will be calculated according to the default directory layout.
> 
> Note that some dead functions are left behind to reduce diff
> noise. They will be deleted in the next patch.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Thanks - I've reviewed up to this patch, and patches 1 and 2 look good.

> -extern void repo_set_gitdir(struct repository *repo, const char *path);
> +struct set_gitdir_args {
> + const char *commondir;
> + const char *object_dir;
> + const char *graft_file;
> + const char *index_file;
> +};
> +
> +extern void repo_set_gitdir(struct repository *repo,
> + const char *root,
> + const struct set_gitdir_args *optional);

Optional: Reading this header file alone makes me think that the 3rd
argument can be NULL, but it actually can't. I would name it
"extra_args" and add a comment inside "struct set_gitdir_args"
explaining (e.g.):

  /*
   * Any of these fields may be NULL. If so, the name of that directory
   * is instead derived from the root path of the repository.
   */


Re: [PATCH v2] filter-branch: use printf instead of echo -e

2018-03-19 Thread Junio C Hamano
Michele Locati  writes:

> In order to echo a tab character, it's better to use printf instead of
> "echo -e", because it's more portable (for instance, "echo -e" doesn't work
> as expected on a Mac).
>
> This solves the "fatal: Not a valid object name" error in git-filter-branch
> when using the --state-branch option.
>
> Furthermore, let's switch from "/bin/echo" to just "echo", so that the
> built-in echo command is used where available.
>
> Signed-off-by: Michele Locati 
> ---

Thanks; will queue.

>  git-filter-branch.sh | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 1b7e4b2cd..98c76ec58 100755
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -627,12 +627,12 @@ then
>   print H "$_:$f\n" or die;
>   }
>   close(H) or die;' || die "Unable to save state")
> - state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git 
> mktree)
> + state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git 
> mktree)
>   if test -n "$state_commit"
>   then
> - state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
> -p "$state_commit")
> + state_commit=$(echo "Sync" | git commit-tree "$state_tree" -p 
> "$state_commit")
>   else
> - state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
> )
> + state_commit=$(echo "Sync" | git commit-tree "$state_tree" )
>   fi
>   git update-ref "$state_branch" "$state_commit"
>  fi


[PATCH 2/2] read-cache: fix an -Wmaybe-uninitialized warning

2018-03-19 Thread Ramsay Jones

The function ce_write_entry() uses a 'self-initialised' variable
construct, for the symbol 'saved_namelen', to suppress a gcc
'-Wmaybe-uninitialized' warning, given that the warning is a false
positive.

For the purposes of this discussion, the ce_write_entry() function has
three code blocks of interest, that look like so:

/* block #1 */
if (ce->ce_flags & CE_STRIP_NAME) {
saved_namelen = ce_namelen(ce);
ce->ce_namelen = 0;
}

/* block #2 */
/*
 * several code blocks that contain, among others, calls
 * to copy_cache_entry_to_ondisk(ondisk, ce);
 */

/* block #3 */
if (ce->ce_flags & CE_STRIP_NAME) {
ce->ce_namelen = saved_namelen;
ce->ce_flags &= ~CE_STRIP_NAME;
}

The warning implies that gcc thinks it is possible that the first
block is not entered, the calls to copy_cache_entry_to_ondisk()
could toggle the CE_STRIP_NAME flag on, thereby entering block #3
with saved_namelen unset. However, the copy_cache_entry_to_ondisk()
function does not write to ce->ce_flags (it only reads). gcc could
easily determine this, since that function is local to this file,
but it obviously doesn't.

In order to suppress this warning, we make it clear to the reader
(human and compiler), that block #3 will only be entered when the
first block has been entered, by introducing a new 'stripped_name'
boolean variable. We also take the opportunity to change the type
of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen.

Signed-off-by: Ramsay Jones 
---
 read-cache.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b..49607ddcd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2104,13 +2104,15 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, 
struct cache_entry *ce,
  struct strbuf *previous_name, struct 
ondisk_cache_entry *ondisk)
 {
int size;
-   int saved_namelen = saved_namelen; /* compiler workaround */
int result;
+   unsigned int saved_namelen;
+   int stripped_name = 0;
static unsigned char padding[8] = { 0x00 };
 
if (ce->ce_flags & CE_STRIP_NAME) {
saved_namelen = ce_namelen(ce);
ce->ce_namelen = 0;
+   stripped_name = 1;
}
 
if (ce->ce_flags & CE_EXTENDED)
@@ -2150,7 +2152,7 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct 
cache_entry *ce,
strbuf_splice(previous_name, common, to_remove,
  ce->name + common, ce_namelen(ce) - common);
}
-   if (ce->ce_flags & CE_STRIP_NAME) {
+   if (stripped_name) {
ce->ce_namelen = saved_namelen;
ce->ce_flags &= ~CE_STRIP_NAME;
}
-- 
2.16.0


PATCH 1/2] -Wuninitialized: remove some 'init-self' workarounds

2018-03-19 Thread Ramsay Jones

The 'self-initialised' variables construct (ie  var = var;) has
been used to silence gcc '-W[maybe-]uninitialized' warnings. This has,
unfortunately, caused MSVC to issue 'uninitialized variable' warnings.
Also, using clang static analysis causes complaints about an 'Assigned
value is garbage or undefined'.

There are six such constructs in the current codebase. Only one of the
six causes gcc to issue a '-Wmaybe-uninitialized' warning (which will
be addressed elsewhere). The remaining five 'init-self' gcc workarounds
are noted below, along with the commit which introduced them:

  1. builtin/rev-list.c: 'reaches' and 'all', see commit 457f08a030
 ("git-rev-list: add --bisect-vars option.", 2007-03-21).

  2. merge-recursive.c:2064 'mrtree', see commit f120ae2a8e ("merge-
 recursive.c: mrtree in merge() is not used before set", 2007-10-29).

  3. fast-import.c:3023 'oe', see commit 85c62395b1 ("fast-import: let
 importers retrieve blobs", 2010-11-28).

  4. fast-import.c:3006 'oe', see commit 28c7b1f7b7 ("fast-import: add a
 get-mark command", 2015-07-01).

Remove the 'self-initialised' variable constructs noted above.

Signed-off-by: Ramsay Jones 
---
 builtin/rev-list.c | 2 +-
 fast-import.c  | 4 ++--
 merge-recursive.c  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index d5345b6a2..fbfc62de4 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -479,7 +479,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
mark_edges_uninteresting(, show_edge);
 
if (bisect_list) {
-   int reaches = reaches, all = all;
+   int reaches, all;
 
find_bisection(, , , bisect_find_all);
 
diff --git a/fast-import.c b/fast-import.c
index b70ac025e..1f01a2205 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3003,7 +3003,7 @@ static void cat_blob(struct object_entry *oe, struct 
object_id *oid)
 
 static void parse_get_mark(const char *p)
 {
-   struct object_entry *oe = oe;
+   struct object_entry *oe;
char output[GIT_MAX_HEXSZ + 2];
 
/* get-mark SP  LF */
@@ -3020,7 +3020,7 @@ static void parse_get_mark(const char *p)
 
 static void parse_cat_blob(const char *p)
 {
-   struct object_entry *oe = oe;
+   struct object_entry *oe;
struct object_id oid;
 
/* cat-blob SP  LF */
diff --git a/merge-recursive.c b/merge-recursive.c
index 0fc580d8c..fa9067eec 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2061,7 +2061,7 @@ int merge_recursive(struct merge_options *o,
 {
struct commit_list *iter;
struct commit *merged_common_ancestors;
-   struct tree *mrtree = mrtree;
+   struct tree *mrtree;
int clean;
 
if (show(o, 4)) {
-- 
2.16.0


Re: [PATCH] mingw: abort on invalid strftime formats

2018-03-19 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Windows, strftime() does not silently ignore invalid formats, but
> warns about them and then returns 0 and sets errno to EINVAL.
>
> Unfortunately, Git does not expect such a behavior, as it disagrees
> with strftime()'s semantics on Linux. As a consequence, Git
> misinterprets the return value 0 as "I need more space" and grows the
> buffer. As the larger buffer does not fix the format, the buffer grows
> and grows and grows until we are out of memory and abort.

Yuck, it is bad that the callers assume 0 is always "need more
space", as "If a conversion specification does not correspond to any
of the above, the behavior is undefined." is what we get from POSIX.

> So let's just bite the bullet and override strftime() for MINGW and
> abort on an invalid format string. While this does not provide the
> best user experience, it is the best we can do.

As long as we allow arbitrary end-user input passed to strftime(),
this is probably a step in the right direction.  

As to the future direction, I however wonder if we can return an
error from here and make the caller cooperate a bit more.  Of
course, implementation of strftime() that silently ignore invalid
formats would not be able to return correct errors like an updated
mingw_strftime() that does not die but communicates error to its
caller would, though.

My "git grep" is hitting only one caller of strftime(), which is
strbuf_addftime(), which already does some magic to the format
string, so such a future enhancement may not be _so_ bad.

Will apply, thanks.  I do not think there is no reason to cook this
in 'next', and would assume this can instead go directly to 'master'
to be part of v2.17-rc1 and onward, right?


[PATCH 0/2] -Wuninitialized

2018-03-19 Thread Ramsay Jones

This series removes all 'self-initialised' variables (ie.  var = var;).
This construct has been used to silence gcc '-W[maybe-]uninitialized' warnings
in the past [1]. Unfortunately, this construct causes warnings to be issued by
MSVC [2], along with clang static analysis complaining about an 'Assigned value
is garbage or undefined'. The number of these constructs has dropped over the
years (eg. see [3] and [4]), so there are currently only 6 remaining in the
current codebase. As demonstrated below, 5 of these no longer cause gcc to
issue warnings.

These patches were developed on v2.16.0, but a test merge to current 'master',
'next' and 'pu' branches complete without conflict.

If you add '-Winit-self' to CFLAGS and compile v2.16.0, then:

  $ vim config.mak # add -Winit-self to CFLAGS
  $ make >g.out-warn-init 2>&1
  $ grep warning g.out-warn-init
  merge-recursive.c:2064:15: warning: ‘mrtree’ is used uninitialized in this 
function [-Wuninitialized]
  read-cache.c:2107:6: warning: ‘saved_namelen’ is used uninitialized in this 
function [-Wuninitialized]
  fast-import.c:3006:23: warning: ‘oe’ is used uninitialized in this function 
[-Wuninitialized]
  fast-import.c:3023:23: warning: ‘oe’ is used uninitialized in this function 
[-Wuninitialized]
  $ 

This misses the self-initialization of the 'reaches' and 'all' symbols in
builtin/rev-list.c, which is somewhat surprising! ;-)

The commits in which these self-initializations were introduced are noted
below:

  a. builtin/rev-list.c: 'reaches' and 'all', see commit 457f08a030
 ("git-rev-list: add --bisect-vars option.", 2007-03-21).

  b. merge-recursive.c:2064 'mrtree', see commit f120ae2a8e ("merge-
 recursive.c: mrtree in merge() is not used before set", 2007-10-29).

  c. fast-import.c:3023 'oe', see commit 85c62395b1 ("fast-import: let
 importers retrieve blobs", 2010-11-28).

  d. read-cache.c:2107 'saved_namelen', see commit b3c96fb158 ("split-index:
 strip pathname of on-disk replaced entries", 2014-06-13).

  e. fast-import.c:3006 'oe', see commit 28c7b1f7b7 ("fast-import: add a
 get-mark command", 2015-07-01).

Clang static analysis marks the self-initialization as a 'Logic error' with
the name 'Assigned value is garbage or undefined'. clang 3.8.0 notes b,c,d
and e, but not a. clang 5.0.1 notes a(reaches),b,c,d and e, but not a(all).

If we now add a patch to remove all self-initialization, which would be the
first patch plus the obvious change to 'saved_namelen' in read-cache.c, then
note the warnings issued by various compilers at various optimization levels
on several different platforms [5]:

O0  O1  O2  O3  Os   Og
 1) gcc 4.8.3   |   -  1,20 11,18-19  1-4,21-23  1,5-17
 2) gcc 4.8.4   |   -  1,20 1   1 1-4,21-23  1,5-8,10-13,15-16 
 3) clang 3.4   |   -   -   -   --   n/a 
 4) gcc 5.4.0   |   -   1   1   1 1,3-4,21   1,5-8,10-13,16-16
 5) clang 3.8.0 |   -   -   -   --   n/a 
 6) gcc 5.4.0   |   -   1   1   1   1-4 1,5-17 
 7) clang 3.8.0 |   -   -   -   --   n/a 
 8) gcc 6.4.0   |   -   1   11,18-191,4 1,5-17
 9) clang 5.0.1 |   -   -   -   ---
10) gcc 7.2.1   |   -   1   1   1   1,4 1,5-17

[Not -Wmaybe-uninitialized table]
 1) gcc 4.8.3   |   27  27  27  27   27   27
 3) clang 3.4   |   26  26  26  26   26  n/a 
 5) clang 3.8.0 | 24-26   24-26   24-26   24-2624-26 n/a 

[clang 3.4 and 3.8.0 do not support -Og, but clang 5.0.1 does.]
[warnings 18-19 issued on cygwin since it builds with NO_REGEX.]

Compiler/Platforms:
  1: 32-bit Windows XP - Cygwin 1.7.30 2014-05-23 i686, Core2 Duo T2050
2,3: 32-bit Linux Mint 17.3, i686, Intel Core2 Duo T2050
4,5: 32-bit Linux Mint 18.3, i686, Virtual Box VM
6,7: 64-bit Linux Mint 18.3, x86_64, Intel Core i5-4200M
8,9: 64-bit Windows 10 - Cygwin 2.10.0 2018-02-02 x86_64, Intel Core i5-4200M
 10: 64-bit Fedora Linux 27, x86_64, Virtual Box VM

Note that warnings 1-23 [6] are all -Wmaybe-uninitialized, and 24-27 do not
concern us here [7]. Outside of -Os and -Og, the only warnings are 1, 18
and 19. Warnings 18 and 19 are only issued against the compat regex routines
while building with NO_REGEX set. Warning 1 is suppressed by applying the
second patch in this series.

So, as noted above, with the exception of the 'saved_namelen' symbol, none
of the 'self-initialised' variables cause gcc to complain. Thus the first
patch does not make any difference to the table(s) above. The second patch
removes the warning #1 from the above table(s).

Notes:
[1] My memory is not what it was, but my recollection is that, circa 2007,
these warnings were -Wuninitialized. I suspect that, due to too many false
positives, many have been moved to -Wmaybe-uninitialized. ;-)

[2] At least it used to cause (actually 

Re: Bad refspec messes up bundle.

2018-03-19 Thread Junio C Hamano
Luciano Joublanc  writes:

> Yesterday I created a git bundle as best as I can remember like this
>
> git bundle save chunk chunk.bundle --all master
>
> Note the 'master' I added accidentally at the end - this was a user
> error but still the bundle was created.
>
> When I tried to clone this, I get
>
> ~\local\src> git clone 'G:\My Drive\chunk.bundle' fs2-columns
> Cloning into 'fs2-columns'...
> Receiving objects: 100% (31/31), done.
> Resolving deltas: 100% (5/5), done.
> fatal: multiple updates for ref 'refs/remotes/origin/master' not allowed.
> ~\local\src> git bundle verify chunk.bundle
> The bundle contains these 3 refs:
> 3c804437a5f8537db1bfb5d09b7bff4f9950605e refs/heads/master
> 3c804437a5f8537db1bfb5d09b7bff4f9950605e HEAD
> 3c804437a5f8537db1bfb5d09b7bff4f9950605e refs/heads/master
> The bundle records a complete history.
> chunk.bundle is okay
>
> After trying a couple of things, I finally managed to clone it using
>
> ~\local\src> git clone -b master --single-branch .\chunk.bundle fs2-columns
>
> i.e. the '--single-branch' option saved me.
>
> Is this a bug? Should bundle allow providing multiple refspecs when
> `--all` is provided? I admit this was clearly a case of 'caveat
> emptor', but shouldn't this be disallowed (i.e. is there any situation
> when this is useful?)

Thanks for a report.

Just like a remote repository that reports the same ref more than
once in its initial advertisement (i.e. "git ls-remote $remote"
gives duplicate entries), a bundle file that records the same ref
more than once *is* a bug, I would think.

A "git bundle create" command that creates such a bundle file
shouldn't.  It is not very useful to diagnose it as an error; it
probably makes more sense to dedup the refs instead when writing the
bundle file.  Of course, we should abort with an error *if* the code
ever tries to store the same ref twice with different object name
(i.e. attempt to dedup, in vain).

Also, "git clone" from such a bundle file (or for that matter, a
remote repository that advertises the same ref twice) probably
should do a similar deduping, with a warning message.



[GSoC][PATCH] test: avoid pipes in git related commands for test suite

2018-03-19 Thread Pratik Karki
Thank you Eric Sunshine,

I have done as you had instructed me. I look forward to more
understanding of the codebase and would love to fix
"git rev-parse" problems in my follow-on patches.
Thank you for the professional review comment.

Sorry for late follow-on patch, I got tied up with my university stuffs.

Please do review this patch as before. I will correct it if needed.

Cheers,
Pratik Karki

Avoid using pipes downstream of Git commands since the exit codes
of commands upstream of pipes get swallowed, thus potentially
hiding failure of those commands. Instead, capture Git command
output to a file apply the downstream command(s) to that file.

Signed-off-by: Pratik Karki 
---
 t/t5300-pack-object.sh | 10 +++---
 t/t5510-fetch.sh   |  8 ++---
 t/t7001-mv.sh  | 22 ++---
 t/t7003-filter-branch.sh   |  9 --
 t/t9104-git-svn-follow-parent.sh   | 16 +
 t/t9108-git-svn-glob.sh| 14 
 t/t9109-git-svn-multi-glob.sh  | 28 +---
 t/t9110-git-svn-use-svm-props.sh   | 42 
 t/t9111-git-svn-use-svnsync-props.sh   | 36 ++---
 t/t9114-git-svn-dcommit-merge.sh   | 10 +++---
 t/t9130-git-svn-authors-file.sh| 28 +---
 t/t9138-git-svn-authors-prog.sh| 31 +-
 t/t9153-git-svn-rewrite-uuid.sh|  8 ++---
 t/t9168-git-svn-partially-globbed-names.sh | 34 +++
 t/t9350-fast-export.sh | 52 --
 15 files changed, 187 insertions(+), 161 deletions(-)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 9c68b9925..91207ae0c 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -311,9 +311,9 @@ test_expect_success 'unpacking with --strict' '
rm -f .git/index &&
tail -n 10 LIST | git update-index --index-info &&
ST=$(git write-tree) &&
-   PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
-   git pack-objects test-5 ) &&
-   PACK6=$( (
+   git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
+   PACK5=$(git pack-objects test-5 < actual) &&
+   PACK6=$((
echo "$LIST"
echo "$LI"
echo "$ST"
@@ -358,8 +358,8 @@ test_expect_success 'index-pack with --strict' '
rm -f .git/index &&
tail -n 10 LIST | git update-index --index-info &&
ST=$(git write-tree) &&
-   PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
-   git pack-objects test-5 ) &&
+   git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
+   PACK5=$(git pack-objects test-5 < actual) &&
PACK6=$( (
echo "$LIST"
echo "$LI"
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 668c54be4..c7b284138 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -693,8 +693,8 @@ test_expect_success C_LOCALE_OUTPUT 'fetch aligned output' '
test_commit long-tag &&
(
cd full-output &&
-   git -c fetch.output=full fetch origin 2>&1 | \
-   grep -e "->" | cut -c 22- >../actual
+   git -c fetch.output=full fetch origin >actual2 2>&1 &&
+   grep -e "->" actual2 | cut -c 22- >../actual
) &&
cat >expect <<-\EOF &&
master   -> origin/master
@@ -708,8 +708,8 @@ test_expect_success C_LOCALE_OUTPUT 'fetch compact output' '
test_commit extraaa &&
(
cd compact &&
-   git -c fetch.output=compact fetch origin 2>&1 | \
-   grep -e "->" | cut -c 22- >../actual
+   git -c fetch.output=compact fetch origin >actual2 2>&1 &&
+   grep -e "->" actual2 | cut -c 22- >../actual
) &&
cat >expect <<-\EOF &&
master -> origin/*
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 6e5031f56..00aa9e45b 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -21,8 +21,8 @@ test_expect_success \
 
 test_expect_success \
 'checking the commit' \
-'git diff-tree -r -M --name-status  HEAD^ HEAD | \
-grep "^R100..*path0/COPYING..*path1/COPYING"'
+'git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
+grep "^R100..*path0/COPYING..*path1/COPYING" actual'
 
 test_expect_success \
 'moving the file back into subdirectory' \
@@ -35,8 +35,8 @@ test_expect_success \
 
 test_expect_success \
 'checking the commit' \
-'git diff-tree -r -M --name-status  HEAD^ HEAD | \
-grep "^R100..*path1/COPYING..*path0/COPYING"'
+'git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
+grep "^R100..*path1/COPYING..*path0/COPYING" actual'
 
 test_expect_success \
 'checking -k on non-existing file' \
@@ -116,10 +116,9 @@ 

Re: [PATCH 0/2] routines to generate JSON data

2018-03-19 Thread Jeff Hostetler



On 3/17/2018 3:38 AM, Jacob Keller wrote:

On Fri, Mar 16, 2018 at 2:18 PM, Jeff King  wrote:

   3. Some other similar format. YAML comes to mind. Last time I looked
  (quite a while ago), it seemed insanely complex, but I think you
  could implement only a reasonable subset. OTOH, I think the tools
  ecosystem for parsing JSON (e.g., jq) is much better.



I would personally avoid YAML. It's "easier" for humans to read/parse,
but honestly JSON is already simple enough and anyone who writes C or
javascript can likely parse and hand-write JSON anyways. YAML lacks
built-in parsers for most languages, where as many scripting languages
already have JSON parsing built in, or have more easily attainable
libraries available. In contrast, the YAML libraries are much more
complex and less likely to be available.

That's just my own experience at $dayjob though.


Agreed.  I just looked at the spec for it and I think it would be
harder for us to be assured we are generating valid output with
leading whitespace being significant (without a lot more inspection
of the strings being passed down to us).

Jeff



Re: [PATCH v6 3/3] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 1:24 PM Junio C Hamano  wrote:

> Look for these misspelled words:

Oh boy ... thanks, and done.

> OK.  An essentially no-op change but with the name better suited in
> the extended context---we used to only care about argv0 but that was
> an implementation detail of "where did our binary come from".  Nice.

Yes, exactly. Plus I think some other patches that I've seen circulating
around here recently use it in this new capacity, so the name update is
appropriate.


Re: [GSoC] Scripts to be conversted into builtins

2018-03-19 Thread Johannes Schindelin
Hi,

On Sat, 17 Mar 2018, Yash Yadav wrote:

> In the project ideas listed there is one idea talking of conversion of
> scripts to builtins. This interests me but no pointer forward is given
> and I'd like to dive more into that idea and go through the script(s).
> 
> So, where should I look further from here?

One concrete example how a script was converted can be seen here:

https://github.com/git/git/compare/b7786bb4b09%5E...b7786bb4b09%5E2

(This is the patch series that replaced the scripted version of the
`difftool` by the one written in portable C.)

But maybe you want to look at the micro-projects first, to see how you
like to work on Git's source code, and with the Git mailing list?

Ciao,
Johannes


Re: [PATCH v3 2/7] gc: add --keep-base-pack

2018-03-19 Thread Duy Nguyen
On Fri, Mar 16, 2018 at 10:05 PM, Ævar Arnfjörð Bjarmason
 wrote:
>
> On Fri, Mar 16 2018, Nguyễn Thái Ngọc Duy jotted:
>
>> +--keep-base-pack::
>> + All packs except the base pack and those marked with a `.keep`
>> + files are consolidated into a single pack. The largest pack is
>> + considered the base pack.
>> +
>
> I wonder if all of this would be less confusing as:
>
>> +--keep-biggest-pack::
>> + All packs except the largest pack and those marked with a `.keep`
>> + files are consolidated into a single pack.
>
> I.e. just skimming these docs I'd expect "base" to somehow be the thing
> that we initially cloned, of course in almost all cases that *is* the
> largest pack, but not necessarily. So rather than communicate that
> expectation let's just say largest/biggest?

Keeping the term base pack allows us to change its definition later
(something else other than "largest"). But to be honest I can't see
what else can a base pack(s) be. So unless people object I'm changing
this to --keep-biggest-pack (which could take a value  to keep 
largest packs, but I will refrain from doing things we don't need
right now).

>
> Maybe I'm the only one who finds this confusing...
-- 
Duy


Re: [PATCH v6 3/3] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2018-03-19 Thread Junio C Hamano
Dan Jacques  writes:

> Enable Git to resolve its own binary location using a variety of
> OS-specific and generic methods, including:
>
> - procfs via "/proc/self/exe" (Linux)
> - _NSGetExecutablePath (Darwin)
> - KERN_PROC_PATHNAME sysctl on BSDs.
> - argv0, if absolute (all, including Windows).
>
> This is used to enable RUNTIME_PREFIX support for non-Windows systems,
> notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
> do a best-effort resolution of its executable path and automatically use
> this as its "exec_path" for relative helper and data lookups, unless
> explicitly overridden.
>
> Small incidental formatting cleanup of "exec_cmd.c".
>
> Signed-off-by: Dan Jacques 
> Thanks-to: Robbie Iannucci 
> Thanks-to: Junio C Hamano 
> ---

Look for these misspelled words:

sysetems
applicaton
authoratative

> diff --git a/Makefile b/Makefile
> index 101a98a78..df17a62a4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -418,6 +418,16 @@ all::
>  #
>  # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl 
> function.
>  #
> +# Define HAVE_BSD_KERN_PROC_SYSCTL if your platform supports the KERN_PROC 
> BSD
> +# sysctl function.
> +#
> +# Define PROCFS_EXECUTABLE_PATH if your platform mounts a "procfs" filesystem
> +# capable of resolving the path of the current executable. If defined, this
> +# must be the canonical path for the "procfs" current executable path.
> +#
> +# Define HAVE_NS_GET_EXECUTABLE_PATH if your platform supports calling
> +# _NSGetExecutablePath to retrieve the path of the running executable.
> +#

Sounds sensible.

> +/**
> + * Path to the current Git executable. Resolved on startup by
> + * 'git_resolve_executable_dir'.
> + */
> +static const char *executable_dirname;
>  
>  static const char *system_prefix(void)
>  {
>   static const char *prefix;
>  
> - assert(argv0_path);
> - assert(is_absolute_path(argv0_path));
> + assert(executable_dirname);
> + assert(is_absolute_path(executable_dirname));
>  
>   if (!prefix &&
> - !(prefix = strip_path_suffix(argv0_path, GIT_EXEC_PATH)) &&
> - !(prefix = strip_path_suffix(argv0_path, BINDIR)) &&
> - !(prefix = strip_path_suffix(argv0_path, "git"))) {
> + !(prefix = strip_path_suffix(executable_dirname, GIT_EXEC_PATH)) &&
> + !(prefix = strip_path_suffix(executable_dirname, BINDIR)) &&
> + !(prefix = strip_path_suffix(executable_dirname, "git"))) {
>   prefix = PREFIX;
>   trace_printf("RUNTIME_PREFIX requested, "
> - "but prefix computation failed.  "
> - "Using static fallback '%s'.\n", prefix);
> +  "but prefix computation failed.  "
> +  "Using static fallback '%s'.\n",
> +  prefix);
>   }
>   return prefix;
>  }

OK.  An essentially no-op change but with the name better suited in
the extended context---we used to only care about argv0 but that was
an implementation detail of "where did our binary come from".  Nice.


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Daniel Jacques
On Mon, Mar 19, 2018 at 1:14 PM Junio C Hamano  wrote:

> > +# RUNTIME_PREFIX's resolution logic requires resource paths to be
expressed
> > +# relative to each other and share an installation path.
> > +#
> > +# This is a dependnecy in:

> dependency?

Oops, this is the second typo that has been pointed out. I'll release one
last series after a small review period with these fixed.

> > +# - Git's binary RUNTIME_PREFIX logic in (see "exec_cmd.c").
> > +# - The runtime prefix Perl header (see
> > +#   "perl/header_templates/runtime_prefix.template.pl").
> > +ifdef RUNTIME_PREFIX
> > +
> > +ifneq ($(filter /%,$(firstword $(gitexecdir_relative))),)
> > +$(error RUNTIME_PREFIX requires a relative gitexecdir, not:
$(gitexecdir))
> > +endif

> I see Dscho is CC'ed so I won't worry about "is there a more
> portable test than 'the path begins with a slash' to see if a path
> is relative, or is this good enough even for Windows in the context
> of this patch?".  It won't be a show-stopper issue as long as we do
> not error out with false positive, though ;-).

OK sounds good! There are other places in the Makefile that use this method
for this purpose, so hopefully the worst-case is that this is no more
broken than they are.


Re: [PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-19 Thread Junio C Hamano
Dan Jacques  writes:

> +# RUNTIME_PREFIX's resolution logic requires resource paths to be expressed
> +# relative to each other and share an installation path.
> +#
> +# This is a dependnecy in:

dependency?

> +# - Git's binary RUNTIME_PREFIX logic in (see "exec_cmd.c").
> +# - The runtime prefix Perl header (see
> +#   "perl/header_templates/runtime_prefix.template.pl").
> +ifdef RUNTIME_PREFIX
> +
> +ifneq ($(filter /%,$(firstword $(gitexecdir_relative))),)
> +$(error RUNTIME_PREFIX requires a relative gitexecdir, not: $(gitexecdir))
> +endif

I see Dscho is CC'ed so I won't worry about "is there a more
portable test than 'the path begins with a slash' to see if a path
is relative, or is this good enough even for Windows in the context
of this patch?".  It won't be a show-stopper issue as long as we do
not error out with false positive, though ;-).


Re: [PATCH 1/2] completion: improve ls-files filter performance

2018-03-19 Thread Johannes Schindelin
Hi drizzd,

first of all: thank you so much for working on this. I am sure it will
be noticeable to many Windows users, and also make my life easier.

On Sat, 17 Mar 2018, Clemens Buchacher wrote:

> From the output of ls-files, we remove all but the leftmost path
> component and then we eliminate duplicates. We do this in a while loop,
> which is a performance bottleneck when the number of iterations is large
> (e.g. for 6 files in linux.git).
> 
> $ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git
> 
> real0m11.876s
> user0m4.685s
> sys 0m6.808s
> 
> Using an equivalent sed script improves performance significantly:
> 
> $ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git
> 
> real0m1.372s
> user0m0.263s
> sys 0m0.167s
> 
> The measurements were done with mingw64 bash, which is used by Git for
> Windows.

Technically, it is not the *mingw64* bash, but it is an MSYS2 Bash. This
does make a little bit of a difference because of the penalty incurred by
the POSIX emulation layer provided by the MSYS2 runtime.

(And it also addresses Gabór's question whether you ran the test suite, I
guess... it takes multiple hours to run it even once on a regular
computer.)

Ciao,
Dscho

Re: [PATCH v4 1/4] worktree: improve message when creating a new worktree

2018-03-19 Thread Duy Nguyen
On Sat, Mar 17, 2018 at 11:22 PM, Thomas Gummerer  wrote:
> Currently 'git worktree add' produces output like the following, when
> '--no-checkout' is not given:
>
> Preparing foo (identifier foo)
> HEAD is now at 26da330922 
>
> where the first line is written to stderr, and the second line coming
> from 'git reset --hard' is written to stdout, even though both lines are
> supposed to tell the user what has happened.  In addition to someone not
> familiar with 'git worktree', this might seem as if the current HEAD was
> modified, not the HEAD in the new working tree.
>
> If the '--no-checkout' flag is given, the output of 'git worktree add'
> is just:
>
> Preparing foo (identifier foo)
>
> even though the HEAD is set to a commit, which is just not checked out.
>
> The identifier is also not particularly relevant for the user at the
> moment, as it's only used internally to distinguish between different
> worktrees that have the same $(basename ).
>
> Fix these inconsistencies, and no longer show the identifier by making
> the 'git reset --hard' call quiet, and printing the message directly
> from the builtin command instead.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  builtin/worktree.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 7cef5b120b..e5d04f0b4b 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -303,8 +303,6 @@ static int add_worktree(const char *path, const char 
> *refname,
> strbuf_addf(, "%s/commondir", sb_repo.buf);
> write_file(sb.buf, "../..");
>
> -   fprintf_ln(stderr, _("Preparing %s (identifier %s)"), path, name);
> -
> argv_array_pushf(_env, "%s=%s", GIT_DIR_ENVIRONMENT, 
> sb_git.buf);
> argv_array_pushf(_env, "%s=%s", GIT_WORK_TREE_ENVIRONMENT, 
> path);
> cp.git_cmd = 1;
> @@ -320,10 +318,19 @@ static int add_worktree(const char *path, const char 
> *refname,
> if (ret)
> goto done;
>
> +   fprintf(stderr, _("worktree HEAD is now at %s"),

We use the term "working tree" for UI and documents. "worktree" is
only used in code comments and stuff.

> +   find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV));
> +
> +   strbuf_reset();
> +   pp_commit_easy(CMIT_FMT_ONELINE, commit, );
> +   if (sb.len > 0)
> +   fprintf(stderr, " %s", sb.buf);
> +   fputc('\n', stderr);
> +
> if (opts->checkout) {
> cp.argv = NULL;
> argv_array_clear();
> -   argv_array_pushl(, "reset", "--hard", NULL);
> +   argv_array_pushl(, "reset", "--hard", "--quiet", 
> NULL);
> cp.env = child_env.argv;
> ret = run_command();
> if (ret)
> --
> 2.17.0.rc0.231.g781580f06
>
-- 
Duy


Re: [PATCH v6 0/3] RUNTIME_PREFIX relocatable Git

2018-03-19 Thread Junio C Hamano
Dan Jacques  writes:

> This patch set expands support for the RUNTIME_PREFIX configuration flag,
> currently only used on Windows builds, to include Linux, Darwin, and
> FreeBSD. When Git is built with RUNTIME_PREFIX enabled, it resolves its
> ancillary paths relative to the runtime location of its executable
> rather than hard-coding them at compile-time, allowing a Git
> installation to be deployed to a path other than the one in which it
> was built/installed.
>
> Note that RUNTIME_PREFIX is not currently used outside of Windows.
> This patch set should not have an impact on default Git builds.
>
> I'm dusting this back off now that avarab@'s Perl Makefile simplification
> patch set has landed. It's been a few months, so I'm a bit rusty, but I think
> that I've incorporated all of the feedback. Please take a look and let me know
> what you think!

Yay.  Thanks for rebooting the effort.


Re: [PATCH] completion: complete tags with git tag --delete/--verify

2018-03-19 Thread Junio C Hamano
Todd Zullinger  writes:

> Completion of tag names has worked for the short -d/-v options since
> 88e21dc746 ("Teach bash about completing arguments for git-tag",
> 2007-08-31).  The long options were not added to "git tag" until many
> years later, in c97eff5a95 ("git-tag: introduce long forms for the
> options", 2011-08-28).
>
> Extend tag name completion to --delete/--verify.
>
> Signed-off-by: Todd Zullinger 
> ---

Thanks, makes sense.

>  contrib/completion/git-completion.bash | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 6da95b8095..c7957f0a90 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2967,7 +2967,7 @@ _git_tag ()
>   while [ $c -lt $cword ]; do
>   i="${words[c]}"
>   case "$i" in
> - -d|-v)
> + -d|--delete|-v|--verify)
>   __gitcomp_direct "$(__git_tags "" "$cur" " ")"
>   return
>   ;;


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Duy Nguyen
On Mon, Mar 19, 2018 at 5:43 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> +static inline void oe_set_size(struct object_entry *e,
>> +unsigned long size)
>> +{
>> + e->size_ = size;
>> + e->size_valid = e->size_ == size;
>
> A quite similar comment as my earlier one applies here.  I wonder if
> this is easier to read?
>
> e->size_valid = fits_in_32bits(size);
> if (e->size_valid)
> e->size_ = size;
>
> Stepping back a bit in a different tangent,
>
>  - fits_in_32bits() is a good public name if the helper is about
>seeing if the given quantity fits in 32bit uint,
>
>  - but that carves it in stone that our e->size_ *will* be 32bit
>forever, which is not good.
>
> So, it may be a good idea to call it size_cacheable_in_oe(size) or
> something to ask "I have this 'size'; is it small enough to fit in
> the field in the oe, i.e. allow us to cache it, as opposed to having
> to go back to the object every time?"  Of course, this would declare
> that the helper can only be used for that particular field, but that
> is sort of the point of such a change, to allow us to later define
> the e->size_ field to different sizes without affecting other stuff.

This is why I do "size_valid = size_ == size". In my private build, I
reduced size_ to less than 32 bits and change the "fits_in_32bits"
function to do something like

int fits_in_32bits(unsigned long size)
{
struct object_entry e;
e.size_ = size;
return e.size_ == size.
}

which makes sure it always works. This spreads the use of "valid = xx
== yy"  in more places though. I think if we just limit the use of
this expression in a couple access wrappers than it's not so bad.

>> + if (!e->size_valid) {
>> + unsigned long real_size;
>> +
>> + if (sha1_object_info(e->idx.oid.hash, _size) < 0 ||
>> + size != real_size)
>> + die("BUG: 'size' is supposed to be the object size!");
>> + }
>
> If an object that is smaller than 4GB is fed to this function with
> an incorrect size, we happily record it in e->size_ and declare it
> is valid.  Wouldn't that be equally grave error as we are catching
> in this block?

That adds an extra sha1_object_info() to all objects and it's
expensive (I think it's one of the reasons we cache values in
object_entry in the first place). I think there are also a few
occasions we reuse even bad in-pack objects (there are even tests for
that) so it's not always safe to die() here.
-- 
Duy


Re: [PATCH 1/1] Fix typo in merge-strategies documentation

2018-03-19 Thread Junio C Hamano
David Pursehouse  writes:

> From: David Pursehouse 
>
> Signed-off-by: David Pursehouse 
> ---

I somehow had to stare at the patch for a few minutes, view it in
two Emacs buffers and run M-x compare-windows before I finally spot
the single-byte typofix.

Will queue with a retitle.

Documentation/merge-strategies: typofix

It's strategy, not stragegy.

Signed-off-by: David Pursehouse 
Signed-off-by: Junio C Hamano 

Thanks.


>  Documentation/merge-strategies.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/merge-strategies.txt 
> b/Documentation/merge-strategies.txt
> index fd5d748d1..4a58aad4b 100644
> --- a/Documentation/merge-strategies.txt
> +++ b/Documentation/merge-strategies.txt
> @@ -40,7 +40,7 @@ the other tree did, declaring 'our' history contains all 
> that happened in it.
>  
>  theirs;;
>   This is the opposite of 'ours'; note that, unlike 'ours', there is
> - no 'theirs' merge stragegy to confuse this merge option with.
> + no 'theirs' merge strategy to confuse this merge option with.
>  
>  patience;;
>   With this option, 'merge-recursive' spends a little extra time


[PATCH] mingw: abort on invalid strftime formats

2018-03-19 Thread Johannes Schindelin
On Windows, strftime() does not silently ignore invalid formats, but
warns about them and then returns 0 and sets errno to EINVAL.

Unfortunately, Git does not expect such a behavior, as it disagrees
with strftime()'s semantics on Linux. As a consequence, Git
misinterprets the return value 0 as "I need more space" and grows the
buffer. As the larger buffer does not fix the format, the buffer grows
and grows and grows until we are out of memory and abort.

Ideally, we would switch off the parameter validation just for
strftime(), but we cannot even override the invalid parameter handler
via _set_thread_local_invalid_parameter_handler() using MINGW because
that function is not declared. Even _set_invalid_parameter_handler(),
which *is* declared, does not help, as it simply does... nothing.

So let's just bite the bullet and override strftime() for MINGW and
abort on an invalid format string. While this does not provide the
best user experience, it is the best we can do.

See https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx for more
details.

This fixes https://github.com/git-for-windows/git/issues/863

Signed-off-by: Johannes Schindelin 
---

This is a really old patch (from 2016) that I had not managed to
contribute to git.git yet...

 compat/mingw.c | 11 +++
 compat/mingw.h |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 2d44d21aca8..a67872babf3 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -761,6 +761,17 @@ int mingw_utime (const char *file_name, const struct 
utimbuf *times)
return rc;
 }
 
+#undef strftime
+size_t mingw_strftime(char *s, size_t max,
+ const char *format, const struct tm *tm)
+{
+   size_t ret = strftime(s, max, format, tm);
+
+   if (!ret && errno == EINVAL)
+   die("invalid strftime format: '%s'", format);
+   return ret;
+}
+
 unsigned int sleep (unsigned int seconds)
 {
Sleep(seconds*1000);
diff --git a/compat/mingw.h b/compat/mingw.h
index e03aecfe2e6..571019d0bdd 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -361,6 +361,9 @@ int mingw_fstat(int fd, struct stat *buf);
 
 int mingw_utime(const char *file_name, const struct utimbuf *times);
 #define utime mingw_utime
+size_t mingw_strftime(char *s, size_t max,
+  const char *format, const struct tm *tm);
+#define strftime mingw_strftime
 
 pid_t mingw_spawnvpe(const char *cmd, const char **argv, char **env,
 const char *dir,

base-commit: 0afbf6caa5b16dcfa3074982e5b48e27d452dbbb
-- 
2.16.1.windows.4

Published-As: https://github.com/dscho/git/releases/tag/mingw-strftime-v1
Fetch-It-Via: git fetch https://github.com/dscho/git mingw-strftime-v1


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +static inline void oe_set_size(struct object_entry *e,
> +unsigned long size)
> +{
> + e->size_ = size;
> + e->size_valid = e->size_ == size;

A quite similar comment as my earlier one applies here.  I wonder if
this is easier to read?

e->size_valid = fits_in_32bits(size);
if (e->size_valid)
e->size_ = size;

Stepping back a bit in a different tangent, 

 - fits_in_32bits() is a good public name if the helper is about
   seeing if the given quantity fits in 32bit uint,

 - but that carves it in stone that our e->size_ *will* be 32bit
   forever, which is not good.

So, it may be a good idea to call it size_cacheable_in_oe(size) or
something to ask "I have this 'size'; is it small enough to fit in
the field in the oe, i.e. allow us to cache it, as opposed to having
to go back to the object every time?"  Of course, this would declare
that the helper can only be used for that particular field, but that
is sort of the point of such a change, to allow us to later define
the e->size_ field to different sizes without affecting other stuff.

> + if (!e->size_valid) {
> + unsigned long real_size;
> +
> + if (sha1_object_info(e->idx.oid.hash, _size) < 0 ||
> + size != real_size)
> + die("BUG: 'size' is supposed to be the object size!");
> + }

If an object that is smaller than 4GB is fed to this function with
an incorrect size, we happily record it in e->size_ and declare it
is valid.  Wouldn't that be equally grave error as we are catching
in this block?

> +}
> +
>  #endif


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-19 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 7:56 PM, Ramsay Jones
 wrote:
>
>
> On 18/03/18 15:55, Duy Nguyen wrote:
>> On Sun, Mar 18, 2018 at 9:18 AM, Nguyễn Thái Ngọc Duy  
>> wrote:
>>> +ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter 
>>> clang4,$(COMPILER_FEATURES))),)
>>> +CFLAGS += -Wextra
>>
>> Another thing we can add here is -Og instead of standard -O2 (or -O0
>> in my build), which is supported since gcc 4.8. clang seems to support
>> it too (mapped to -O1 at least for clang5) but I don't know what
>> version added that flag.
>
> I've been doing a lot of compiling recently, using 10 'different'
> versions of clang and gcc ('different' if you count 64-bit and 32-bit
> compilers with the same version number as different!)
>
> However, I can tell you that clang version 3.4 and 3.8.0 don't
> support -Og, but clang version 5.0.1 does.

Yeah. I checked clang git mirror and this -Og is in 4.x release branch
(couldn't nail down exact release) so 5.x should be a safe bet.
-- 
Duy


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Duy Nguyen
On Mon, Mar 19, 2018 at 5:19 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> +static inline int oe_fits_in_32bits(unsigned long limit)
>> +{
>> + uint32_t truncated_limit = (uint32_t)limit;
>> +
>> + return limit == truncated_limit;
>> +}
>
> I do not think it is worth a reroll (there only are a few
> callsites), but the above has nothing to do with "oe" fitting
> anything (it is about "limit").  Do you mind if I did this instead?
>
> static inline int fits_in_32bits(unsigned long size)
>
> ... or other suggestions, perhaps?
>

I just tried to not pollute the general namespace too much. That works
for me too.
-- 
Duy


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-19 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +static inline int oe_fits_in_32bits(unsigned long limit)
> +{
> + uint32_t truncated_limit = (uint32_t)limit;
> +
> + return limit == truncated_limit;
> +}

I do not think it is worth a reroll (there only are a few
callsites), but the above has nothing to do with "oe" fitting
anything (it is about "limit").  Do you mind if I did this instead?

static inline int fits_in_32bits(unsigned long size)

... or other suggestions, perhaps?



[PATCH v2] filter-branch: use printf instead of echo -e

2018-03-19 Thread Michele Locati
In order to echo a tab character, it's better to use printf instead of
"echo -e", because it's more portable (for instance, "echo -e" doesn't work
as expected on a Mac).

This solves the "fatal: Not a valid object name" error in git-filter-branch
when using the --state-branch option.

Furthermore, let's switch from "/bin/echo" to just "echo", so that the
built-in echo command is used where available.

Signed-off-by: Michele Locati 
---
 git-filter-branch.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 1b7e4b2cd..98c76ec58 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -627,12 +627,12 @@ then
print H "$_:$f\n" or die;
}
close(H) or die;' || die "Unable to save state")
-   state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git 
mktree)
+   state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git 
mktree)
if test -n "$state_commit"
then
-   state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
-p "$state_commit")
+   state_commit=$(echo "Sync" | git commit-tree "$state_tree" -p 
"$state_commit")
else
-   state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
)
+   state_commit=$(echo "Sync" | git commit-tree "$state_tree" )
fi
git update-ref "$state_branch" "$state_commit"
 fi
-- 
2.16.2.windows.1



Re: [PATCH 04/36] t/helper: merge test-lazy-init-name-hash into test-tool

2018-03-19 Thread Duy Nguyen
On Mon, Mar 19, 2018 at 10:40 AM, Jeff Hostetler  wrote:
>
>
> On 3/18/2018 4:47 AM, Eric Sunshine wrote:
>>
>> On Sun, Mar 18, 2018 at 4:25 AM, Duy Nguyen  wrote:
>>>
>>> On Sun, Mar 18, 2018 at 3:11 AM, Eric Sunshine 
>>> wrote:

 On Sat, Mar 17, 2018 at 3:53 AM, Nguyễn Thái Ngọc Duy
  wrote:
>
> -extern int test_lazy_init_name_hash(struct index_state *istate, int
> try_threaded);
> +extern int lazy_init_name_hash_for_testing(struct index_state *istate,
> int try_threaded);


 I get why you renamed this since the "main" function in the test
 program wants to be called 'test_lazy_init_name_hash'...

> diff --git a/t/helper/test-lazy-init-name-hash.c
> b/t/helper/test-lazy-init-name-hash.c
> @@ -9,6 +10,9 @@ static int perf;
> +static int (*init_name_hash)(struct index_state *istate, int
> try_threaded) =
> +   lazy_init_name_hash_for_testing;
> +
> @@ -33,9 +37,9 @@ static void dump_run(void)
>  if (single) {
> -   test_lazy_init_name_hash(_index, 0);
> +   init_name_hash(_index, 0);


 ... but I'm having trouble understanding why this indirection through
 'init_name_hash' is used rather than just calling
 lazy_init_name_hash_for_testing() directly. Am I missing something
 obvious or is 'init_name_hash' just an unneeded artifact of an earlier
 iteration before the rename in cache.{c,h}?
>>>
>>>
>>> Nope. It just feels too long to me and because we're already in the
>>> test I don't feel the need to repeat _for_testing everywhere. If it's
>>> confusing, I'll remove init_name_hash.
>>
>>
>> Without an explanatory in-code comment, I'd guess that someone coming
>> across this in the future will also stumble over it just like I did in
>> the review.
>
>
> I agree with Eric, this indirection seems unnecessary and confusing.
> Generally, when I see function pointers initialized like that, I
> think that there are plans for additional providers/drivers for that
> functionality, but I don't see that here.  And it seems odd.
>
>>
>> What if, instead, you rename test_lazy_init_name_hash() to
>> lazy_init_name_hash_test(), shifting 'test' from the front to the back
>> of the name? That way, the name remains the same length at the call
>> sites in the test helper, and you don't have to introduce the
>> confusing, otherwise unneeded 'init_name_hash'.
>>
>
> I see 2 problems.
> 1. The test function in name-hash.c that needs access to private data.
> I'm not a fan of the "_for_testing" suffix, but I'm open.  I might
> either leave it as is, or make it a "TEST_" or "test__" prefix (as
> a guide for people used to languages with namespaces.
>
> 2. The new name for cmd_main().  There's no real need why it needs to
>be "test_*", right?   Your cmds[] maps the command line string to a
>function, but it could be anything.  That is, "cmd_main()" could be
>renamed "cmd__lazy_init_name_hash()".  Then you can conceptually
>reserve the "cmd__" prefix throughout t/helper for each test handler.

cmd__ prefix solves my problem nicely. I'll wait for a while before
resending another 30+ patches.
-- 
Duy


Re: [PATCH v3 0/2] stash push -u -- fixes

2018-03-19 Thread Marc Strapetz

On 16.03.2018 21:43, Thomas Gummerer wrote:

Thanks Marc for catching the regression I almost introduced and Junio
for the review of the second patch.  Here's a re-roll that should fix
the issues of v2.


Thanks, existing issues are fixed, but cleanup of the stashed files 
seems to not work properly when stashing a mixture of tracked and 
untracked files:


$ git init
$ touch file1
$ touch file2
$ git add file1 file2
$ git commit -m "initial import"
$ echo "a" > file1
$ echo "b" > file2
$ touch file3
$ git status --porcelain
 M file1
 M file2
?? file3
$ git stash push -u -- file1 file3
Saved working directory and index state WIP on master: 48fb140 initial 
import

$ git status --porcelain
 M file1
 M file2

file1 and file3 are properly stashed, but file1 still remains modified 
in the working tree. When instead stashing file1 and file2, results are 
fine:


$ git stash push -u -- file1 file2
Saved working directory and index state WIP on master: 48fb140 initial 
import

$ git status
On branch master
nothing to commit, working tree clean

-Marc




[PATCH] filter-branch: use printf instead of echo -e

2018-03-19 Thread Michele Locati
In order to echo a tab character, it's better to use printf instead of
"echo -e", because it's more portable (for instance, "echo -e" doesn't work
as expected on a Mac).

This solves the "fatal: Not a valid object name" error in git-filter-branch
when using the --state-branch option.

Signed-off-by: Michele Locati 
---
 git-filter-branch.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 1b7e4b2cd..21d84eff3 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -627,7 +627,7 @@ then
print H "$_:$f\n" or die;
}
close(H) or die;' || die "Unable to save state")
-   state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git 
mktree)
+   state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git 
mktree)
if test -n "$state_commit"
then
state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" 
-p "$state_commit")
-- 
2.16.2.windows.1



  1   2   >