[ANNOUNCE] Git v2.15.0-rc0

2017-10-04 Thread Junio C Hamano
An early preview release Git v2.15.0-rc0 is now available for
testing at the usual places.  It is comprised of 672 non-merge
commits since v2.14.0, contributed by 66 people, 20 of which are
new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.15.0-rc0' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.14.0 are as follows.
Welcome to the Git development community!

  Ann T Ropea, Daniel Watkins, Dimitrios Christidis, Eric Rannaud,
  Evan Zacks, Hielke Christian Braun, Ian Campbell, Ilya Kantor,
  Jameson Miller, Job Snijders, Joel Teichroeb, joernchen,
  Łukasz Gryglicki, Manav Rathi, Martin Ågren, Michael Forney,
  Patryk Obara, Rene Scharfe, Ross Kabus, and Urs Thuermann.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Andreas Heiduk,
  Anthony Sottile, Ben Boeckel, Brandon Casey, Brandon Williams,
  brian m. carlson, Christian Couder, Eric Blake, Han-Wen Nienhuys,
  Heiko Voigt, Jeff Hostetler, Jeff King, Johannes Schindelin,
  Jonathan Nieder, Jonathan Tan, Junio C Hamano, Kaartic Sivaraam,
  Kevin Daudt, Kevin Willford, Lars Schneider, Martin Koegler,
  Matthieu Moy, Max Kirillov, Michael Haggerty, Michael J Gruber,
  Nguyễn Thái Ngọc Duy, Nicolas Morey-Chaisemartin, Øystein
  Walle, Paolo Bonzini, Pat Thoyts, Philip Oakley, Phillip
  Wood, Raman Gupta, Ramsay Jones, René Scharfe, Sahil Dua,
  Santiago Torres, Stefan Beller, Stephan Beyer, Takashi Iwai,
  Thomas Gummerer, Tom G. Christensen, Torsten Bögershausen,
  and William Duclot.



Git 2.15 Release Notes (draft)
==

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is now scheduled to happen in the upcoming
   release.

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to die("BUG").
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.


Updates since v2.14
---

UI, Workflows & Features

 * An example that is now obsolete has been removed from a sample hook,
   and an old example in it that added a sign-off manually has been
   improved to use the interpret-trailers command.

 * The advice message given when "git rebase" stops for conflicting
   changes has been improved.

 * The "rerere-train" script (in contrib/) learned the "--overwrite"
   option to allow overwriting existing recorded resolutions.

 * "git contacts" (in contrib/) now lists the address on the
   "Reported-by:" trailer to its output, in addition to those on
   S-o-b: and other trailers, to make it easier to notify (and thank)
   the original bug reporter.

 * "git rebase", especially when it is run by mistake and ends up
   trying to replay many changes, spent long time in silence.  The
   command has been taught to show progress report when it spends
   long time preparing these many changes to replay (which would give
   the user a chance to abort with ^C).

 * "git merge" learned a "--signoff" option to add the Signed-off-by:
   trailer with the committer's name.

 * "git diff" learned to optionally paint new lines that are the same
   as deleted lines elsewhere differently from genuinely new lines.

 * "git interpret-trailers" learned to take the trailer specifications
   from the command line that overrides the configured values.

 * "git interpret-trailers" has been taught a "--parse" and a few
   other options to make it easier for scripts to grab existing
   trailer lines from a commit log message.

 * "gitweb" shows a link to visit the 'raw' contents of blbos in the
   history overview page.

 * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
   is defined to take an integer counting the number of days.  It now
   is allowed.

 * The code to acquire a lock on a reference (e.g. while accepting a
   push from a client) used to immediately fail when the reference is
   already locked---now it waits for a very short while and retries,
   which can make it succeed if the 

Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Junio C Hamano
Taylor Blau  writes:

> It may make sense to send my other series to 'master' as well
> ("ref-filter.c: pass empty-string as NULL to atom parsers").

Thanks for reminding.  I was involved in the review and remember
that everybody was happy with the direction.  The topic was left on
the list without getting picked up by a mere accident.

Now it is queued.  Let's make sure nobody is depending on the
%(atom) vs %(atom:) distinction by merging it to 'next' soonish.

Those who don't raise stink while the change is in 'next' but only
do so after it hits a release won't be heard ;-)


Re: "man git-stash" explanation of "--include-untracked" and "--all" seems ambiguous

2017-10-04 Thread Junio C Hamano
"Robert P. J. Day"  writes:

>> If this were --include=untracked vs --include=all, then I'd say your
>> suggestion will violate the usual expectation of "on the command
>> line, last one wins", but "--include-untracked" and "--all" are
>> spelled very differently, and may not look all that related to a
>> casual reader, so the expectation for "the last one wins" might be
>> weaker than usual.
>>
>> But once we start complaining to a command line that has both,
>> saying they are mutually exclusive, people will realize that they
>> are very much closely related options, even though spelled quite
>> differently.  And at that point, they will find it very unreasonable
>> that we do not follow the usual "the last one wins" rule but error
>> out X-<.
>>
>> If I really cared deeply about these two options [*1*], I would
>> think that the ideal longer term direction would be to introduce
>> --include={untracked,all-crufts} to replace and deprecate the
>> current two options.  And then we make sure --include=* forms follow
>> the usual "last one wins" rule.
>>
>>
>> [Footnote]
>>
>> *1* I personally don't, but that does not mean I will block efforts
>> by others who do to make this part of the system better.
>
>   since i'm the one who tripped over this pedantic nitpickery, i'm
> willing to take a shot at patching it, as long as there's consensus
> from those *way* higher up the food chain as to what that patch should
> look like.

That is rather hard to arrange.  I can give you, with some effort,
how a series of patches I may produce would look like if I were
interested in this topic.  But I cannot guarantee you that it would
become the consensus solution among other contributors on the list.

And more importantly, designing a good UI/UX (both the final user
interface, and the minimization of inconvenience to users during the
transition period) is more than 80% of the work required for a topic
like this, and by the time I outline something which may or may not
be close to a consensus solution, more than half of the effort
needed has already spent by _me_, on the topic that _I_ am not all
that interested.  That does not sound like a great economy to me.

I can still help polish a concrete proposal with the usual review on
design and implementation, of course.


3% Interest Rate

2017-10-04 Thread Vancity Loan Firm
We can help you with a genuine loan to meet your needs.
Do you need a personal or business loan without stress and quick approval?
Do you need an urgent loan today? No Credit Checks

* LOAN APPROVAL IN 60MINS !!
* GUARANTEED SAME DAY TRANSFER !!
* 100% APPROVAL RATE !!
*LOW INTEREST RATE !!

Contact US for more information about loan offer and we will solve your
financial problem.


lstat-ing delayed-filter output, was Re: playing with MSan

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 08:30:05PM +0100, Thomas Gummerer wrote:

> > So I dunno. This approach is a _lot_ more convenient than trying to
> > rebuild all the dependencies from scratch, and it runs way faster than
> > valgrind.  It did find the cases that led to the patches in this
> > series, and at least one more: if the lstat() at the end of
> > entry.c:write_entry() fails, we write nonsense into the cache_entry.
> 
> Yeah valgrind found that one too, as I tried (and apparently failed :))
> to explain in the cover letter.  I just haven't found the time yet to
> actually try and go fix that one.

No, I just have poor memory. :)

The obvious fix is that we should check the return value of `lstat`, but
the bigger question is why and when that would fail.

The case triggered by t0021 is using the new "delayed" filter mechanism.
So at the time that write_entry() finishes, we don't actually have the
file in the filesystem. I think we need to recognize that we got delayed
and didn't actually check anything out, and skip that whole "if
(state->refresh_cache)" block. It's not clear to me, though, how we tell
the difference between the delayed and normal cases in that function.

But I think this lstat could also fail if we are checking out and
somebody else racily deletes our file. This is presumably sufficiently
rare that I actually wonder if we should just bail with an error, so
that the user knows something funny is going on.

+cc Lars for thoughts no the delayed-filter case.

-Peff


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Junio C Hamano  writes:

> Both this and its "git read-tree --empty" cousin share a grave
> issue.  The "git add ." step would mean that before doing these
> commands, your working tree must be truly clean, i.e. the paths
> in the filesystem known to the index must match what is in the
> index (modulo the line-ending gotcha you are trying to correct), 
> *AND* there must be *NO* untracked paths you do not want to add
> in the working tree.
>
> That is a reason why we should solve it differently.  Perhaps adding
> a new option "git add --rehash" to tell Git "Hey, you may think some
> paths in the index and in the working tree are identical and no need
> to re-register, but you are WRONG.  For each path in the index,
> remove it and then register the object by hashing the contents from
> the filesystem afresh!" would be the best way to go.

Here is just to illustrate the direction I was heading to in the
above.  This is not even compile tested and I won't guarantee what
corner cases there are, though.

In a true production code, we shouldn't be using string-list with
two loops, but I just didn't want to spend more braincycles worrying
about removing from the list and then adding to it, both inside a
single loop that iterates over it in a mere illustration patch.

The second loop uses a simple "remove then add", but I think it
should rather be a "mark ce that it will _never_ match anything on
the working tree" followed by "add_file_to_cache()".  Currently we
do not have the "mark ce that it never matches" operation that lets
us bypass the comparison with the current cache entry (with safecrlf
thing that interferes), but we can afford to use a (in-core only)
bit in the ce flags word to represent this and plumb it through.
That way, we will still preserve the executable bit from the
original entry, hopefully ;-)


 builtin/add.c | 42 +-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5d5773d5cd..264f84dbe7 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
 };
 static int patch_interactive, add_interactive, edit_interactive;
 static int take_worktree_changes;
+static int rehash;
 
 struct update_callback_data {
int flags;
@@ -121,6 +122,41 @@ int add_files_to_cache(const char *prefix,
return !!data.add_errors;
 }
 
+static int rehash_tracked_files(const char *prefix, const struct pathspec 
*pathspec,
+   int flags)
+{
+   struct string_list paths = STRING_LIST_INIT_DUP;
+   struct string_list_item *path;
+   int i, retval = 0;
+
+   for (i = 0; i < active_nr; i++) {
+   struct cache_entry *ce = active_cache[i];
+
+   if (ce_stage(ce))
+   continue; /* do not touch unmerged paths */
+   if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
+   continue; /* do not touch non blobs */
+   if (pathspec && !ce_path_match(ce, pathspec, NULL))
+   continue;
+   string_list_append(, ce->name);
+   }
+
+   for_each_string_list_item(path, ) {
+   /*
+* Having a blob contaminated with CR will trigger the
+* safe-crlf kludge, avoidance of which is the primary
+* thing this helper function exists.  Remove it and
+* then re-add it.  Note that this may lose executable
+* bit on a filesystem that lacks it.
+*/
+   remove_file_from_cache(path->string);
+   add_file_to_cache(path->string, flags);
+   }
+
+   string_list_clear(, 0);
+   return retval;
+}
+
 static char *prune_directory(struct dir_struct *dir, struct pathspec 
*pathspec, int prefix)
 {
char *seen;
@@ -274,6 +310,7 @@ static struct option builtin_add_options[] = {
OPT_BOOL('e', "edit", _interactive, N_("edit current diff and 
apply")),
OPT__FORCE(_too, N_("allow adding otherwise ignored files")),
OPT_BOOL('u', "update", _worktree_changes, N_("update tracked 
files")),
+   OPT_BOOL(0, "rehash", , N_("really update tracked files")),
OPT_BOOL('N', "intent-to-add", _to_add, N_("record only the fact 
that the path will be added later")),
OPT_BOOL('A', "all", _explicit, N_("add changes from all 
tracked and untracked files")),
{ OPTION_CALLBACK, 0, "ignore-removal", _explicit,
@@ -498,7 +535,10 @@ int cmd_add(int argc, const char **argv, const char 
*prefix)
 
plug_bulk_checkin();
 
-   exit_status |= add_files_to_cache(prefix, , flags);
+   if (rehash)
+   exit_status |= rehash_tracked_files(prefix, , flags);
+   else
+   exit_status |= add_files_to_cache(prefix, , flags);
 
if (add_new_files)
exit_status |= add_files(, flags);


Deleting a branch after merging it results in "there may be uncommitted changes"

2017-10-04 Thread Joshua Lamusga
Hello, I'm trying to understand Git and the mess I've made. Some time
ago, I did crazy things like adding to master even though I was
working in develop, leaving it a commit ahead and X commits behind. I
did crazier things, like trying to amend a previous post's message.

Anyway, I follow a very simple merging model for this one-person
project. Recently, I made a new local branch off of develop called
feature-printing. After checking out feature-printing, making my
changes, and committing changes, I merged it with develop. I then
immediately tried to delete feature-printing, which resulted in a
prompt asking if I was sure since it might contain uncommitted
changes. Though I've seen this problem many times on the internet, I
haven't seen it in the context of literally just merging. There are 0
steps between merging and deleting the old branch.

All of this is done in Visual Studio's GUI for Git. Any ideas?


Re: [PATCH v2] ref-filter.c: pass empty-string as NULL to atom parsers

2017-10-04 Thread Jonathan Nieder
Junio C Hamano wrote:

> From: Taylor Blau 
> Date: Mon, 2 Oct 2017 09:10:34 -0700
> Subject: [PATCH] ref-filter.c: pass empty-string as NULL to atom parsers
>
> Peff points out that different atom parsers handle the empty
> "sub-argument" list differently. An example of this is the format
> "%(refname:)".
>
> Since callers often use `string_list_split` (which splits the empty
> string with any delimiter as a 1-ary string_list containing the empty
> string), this makes handling empty sub-argument strings non-ergonomic.
>
> Let's fix this by declaring that atom parser implementations must
> not care about distinguishing between the empty string "%(refname:)"
> and no sub-arguments "%(refname)".  Current code aborts, either with
> "unrecognised arg" (e.g. "refname:") or "does not take args"
> (e.g. "body:") as an error message.
>
> Signed-off-by: Taylor Blau 
> Reviewed-by: Jeff King 
> Reviewed-by: Jonathan Nieder 
> Signed-off-by: Junio C Hamano 
> ---
>  ref-filter.c| 10 +-
>  t/t6300-for-each-ref.sh |  1 +
>  2 files changed, 10 insertions(+), 1 deletion(-)

Thanks for taking care of it.  This is indeed still
Reviewed-by: Jonathan Nieder 


Re: [PATCH 0/3] for-each-ref: add :remote-ref and :remote-name specifiers

2017-10-04 Thread Junio C Hamano
Johannes Schindelin  writes:

> Note: the `%(push:remote-name)` placeholder is only interpolated by the value
> of `branch..pushRemote`; unlike `git push`, it does not fall back to
> `branch..remote`. Likewise, `%(push:remote-ref)` interpolates to the
> empty string unless `remote..pushRefs` is configured.

I think the reason why I had to spend more time than necessary on
the above point during my review on this (otherwise mostly well
done) topic was because of this note.  It says what it does, but
does not say why this behaviour is better than the other obvious
alternative.

In a reroll to address the remaining points, please update the log
message (not the cover, which won't be in the committed history) to
explain why we think this is a better choice than an obvious
alternative.  Let's help future readers of "git log".

Thanks.



Re: [PATCH v2] ref-filter.c: pass empty-string as NULL to atom parsers

2017-10-04 Thread Junio C Hamano
Taylor Blau  writes:

> On Tue, Oct 03, 2017 at 08:55:01AM +0900, Junio C Hamano wrote:
>> Jonathan Nieder  writes:
>>
>> > The above does a nice job of explaining
>> >
>> >  - what this change is going to do
>> >  - how it's good for the internal code structure / maintainability
>> >
>> > What it doesn't tell me about is why the user-facing effect won't
>> > cause problems.  Is there no atom where %(atom:) was previously
>> > accepted and did something meaningful that this may break?

This loose end needs to be tied.  

I replaced "let's assume that doing this change is OK" from the
original log message with something a bit stronger, as with this
change, we are saying that it is forbidden to treat %(atom) and
%(atom:) differently.  I also recorded the result of due-diligence
survey of the current code to suggest that the change would be OK
for current users.

-- >8 --
From: Taylor Blau 
Date: Mon, 2 Oct 2017 09:10:34 -0700
Subject: [PATCH] ref-filter.c: pass empty-string as NULL to atom parsers

Peff points out that different atom parsers handle the empty
"sub-argument" list differently. An example of this is the format
"%(refname:)".

Since callers often use `string_list_split` (which splits the empty
string with any delimiter as a 1-ary string_list containing the empty
string), this makes handling empty sub-argument strings non-ergonomic.

Let's fix this by declaring that atom parser implementations must
not care about distinguishing between the empty string "%(refname:)"
and no sub-arguments "%(refname)".  Current code aborts, either with
"unrecognised arg" (e.g. "refname:") or "does not take args"
(e.g. "body:") as an error message.

Signed-off-by: Taylor Blau 
Reviewed-by: Jeff King 
Reviewed-by: Jonathan Nieder 
Signed-off-by: Junio C Hamano 
---
 ref-filter.c| 10 +-
 t/t6300-for-each-ref.sh |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index bc591f4f3d..f3e53d4448 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -415,8 +415,16 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
REALLOC_ARRAY(used_atom, used_atom_cnt);
used_atom[at].name = xmemdupz(atom, ep - atom);
used_atom[at].type = valid_atom[i].cmp_type;
-   if (arg)
+   if (arg) {
arg = used_atom[at].name + (arg - atom) + 1;
+   if (!*arg) {
+   /*
+* Treat empty sub-arguments list as NULL (i.e.,
+* "%(atom:)" is equivalent to "%(atom)").
+*/
+   arg = NULL;
+   }
+   }
memset(_atom[at].u, 0, sizeof(used_atom[at].u));
if (valid_atom[i].parser)
valid_atom[i].parser(format, _atom[at], arg);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 2274a4b733..edc1bd8eab 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -51,6 +51,7 @@ test_atom() {
 }
 
 test_atom head refname refs/heads/master
+test_atom head refname: refs/heads/master
 test_atom head refname:short master
 test_atom head refname:lstrip=1 heads/master
 test_atom head refname:lstrip=2 master
-- 
2.14.2-921-g20a440a8ba



Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:

>>  git checkout --renormalize .
>>  git status; # Show files that will be normalized
>>  git commit; # Commit the result
>>
>> What do you think?  Would you be interested in writing a patch for it?
>> ("No" is as always an acceptable answer.)
>
> I actually think what is being requested is the opposite, i.e. "the
> object registered in the index have wrong line endings, and the
> safe-crlf is getting in the way to prevent me from correcting by
> hashing the working tree contents again to register contents with
> corrected line endings, even with 'git add .'".
>
> So I would understand if your suggestion were for
>
>   git checkin --renormalize .
>
> but not "git checkout".  And it probably is more familiar to lay
> people if we spelled that as "git add --renormalize ." ;-)

Good catch.  You understood correctly --- "git add --renormalize" is
the feature that I think is being hinted at here.

Thanks,
Jonathan


Re: cmd.exe Terminal is closing when cloning a repository on windows 10 (64.bit)

2017-10-04 Thread Johannes Schindelin
Hi,

On Thu, 5 Oct 2017, Johannes Schindelin wrote:

> On Mon, 2 Oct 2017, André Netzeband wrote:
> 
> > I installed git for windows 2.14.2 (64bit) and was trying to clone a
> > repository from a command terminal (cmd.exe):
> > 
> > git clone
> > https://netzeb...@bitbucket.org/Netzeband/deep-speeddreams.git
> > 
> > First everything went well, but after the repository was downloaded
> > the LFS download started. At this point the terminal window just
> > closed and I was not able to see anything related on the terminal.
> > There was no error message.  However several git processes (and git
> > lfs) were running in the background and downloaded everything for the
> > repository (all lfs files).
> 
> This is most likely the same issue as reported at
> https://github.com/git-lfs/git-lfs/issues/2631 and at
> https://github.com/git-for-windows/git/issues/1312.

I think this is solved by
https://github.com/git-for-windows/MSYS2-packages/commit/e9d0a2be272.

If everything goes according to plan, a new snapshot will land at
https://wingit.blob.core.windows.net/files/index.html within the next half
hour.

When it does, can you please test?

Thank you,
Johannes

Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Torsten Bögershausen  writes:

> One solution, which you can tell your team, is this one:
> $ git rm -r --cached . && git add .

Both this and its "git read-tree --empty" cousin share a grave
issue.  The "git add ." step would mean that before doing these
commands, your working tree must be truly clean, i.e. the paths
in the filesystem known to the index must match what is in the
index (modulo the line-ending gotcha you are trying to correct), 
*AND* there must be *NO* untracked paths you do not want to add
in the working tree.

That is a reason why we should solve it differently.  Perhaps adding
a new option "git add --rehash" to tell Git "Hey, you may think some
paths in the index and in the working tree are identical and no need
to re-register, but you are WRONG.  For each path in the index,
remove it and then register the object by hashing the contents from
the filesystem afresh!" would be the best way to go.  That will not
pick up untracked paths left in the filesystem, and does not limit
our solution to the "eol normalization is screwey" issue by not
calling the option "renormalize" or any other words that imply "why"
we are hashing again anew.



Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Jonathan Nieder  writes:

> I suspect what we are dancing around is the need for some command like
>
>   git checkout --renormalize .
>
> which would shorten the sequence to
>
>   git checkout --renormalize .
>   git status; # Show files that will be normalized
>   git commit; # Commit the result
>
> What do you think?  Would you be interested in writing a patch for it?
> ("No" is as always an acceptable answer.)

I actually think what is being requested is the opposite, i.e. "the
object registered in the index have wrong line endings, and the
safe-crlf is getting in the way to prevent me from correcting by
hashing the working tree contents again to register contents with
corrected line endings, even with 'git add .'".

So I would understand if your suggestion were for

git checkin --renormalize .

but not "git checkout".  And it probably is more familiar to lay
people if we spelled that as "git add --renormalize ." ;-)




Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Junio C Hamano
Derrick Stolee  writes:

> On 10/4/2017 2:07 AM, Junio C Hamano wrote:
>> Derrick Stolee  writes:
>>
>>> -   exists = has_sha1_file(sha1);
>>> -   while (len < GIT_SHA1_HEXSZ) {
>>> -   struct object_id oid_ret;
>>> -   status = get_short_oid(hex, len, _ret, GET_OID_QUIETLY);
>>> -   if (exists
>>> -   ? !status
>>> -   : status == SHORT_NAME_NOT_FOUND) {
>>> -   hex[len] = 0;
>>> -   return len;
>>> -   }
>>> -   len++;
>>> -   }
>>> -   return len;
>> The "always_call_fn" thing is a big sledgehammer that overrides
>> everything else in update_candidates().  It bypasses the careful
>> machinery set up to avoid having to open ambiguous object to learn
>> their types as much as possible.  One narrow exception when it is OK
>> to use is if we never limit our candidates with type.
>
> I do not modify get_short_oid, which uses these advanced options,
> depending on the flags given. find_unique_abbrev_r() does not use
> these advanced options.

It is true that the function no longer even calls get_short_oid().

When it called get_short_oid() before this patch, it not pass "I
want to favor committish" in the old code, as we can see in the
above removed code.  In get_short_oid(), ds.fn would have been set
to default_disambigutate_hint, and that would have been used in the
update_candidates() function.

Now, unless the user has core.disambiguate configuration set, this
"default" thing becomes NULL, so overriding ds.fn like this patch
does and bypass major parts of update_candidates() is probably fine.
After all, update_candidates() wouldn't have inspected the object
types and made the candidate management anyway with ds.fn set to
NULL.

But having the configuration set would mean that the user wants to
set default_disambigutate_hint to some non-default value, e.g.
telling us to find disambiguation only among committishes; the patch
however overrides ds.fn and essentially makes the code ignore it
altogether, no?  That change in behaviour was what I was wondering
about.



Re: [PATCH v2] branch: change the error messages to be more meaningful

2017-10-04 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> Moreover, as a consequence of my assumption that the tests don't check
> for the error messages themselves; I haven't even thought of checking
> whether the tests or the travis-ci build succeeded as a consequence of
> my patches that touch "only" the error messages!

That's a bad thing, right?


Re: cmd.exe Terminal is closing when cloning a repository on windows 10 (64.bit)

2017-10-04 Thread Johannes Schindelin
Hi André,

On Mon, 2 Oct 2017, André Netzeband wrote:

> I installed git for windows 2.14.2 (64bit) and was trying to clone a
> repository from a command terminal (cmd.exe):
> 
> git clone https://netzeb...@bitbucket.org/Netzeband/deep-speeddreams.git
> 
> First everything went well, but after the repository was downloaded the LFS
> download started. At this point the terminal window just closed and I was not
> able to see anything related on the terminal. There was no error message.
> However several git processes (and git lfs) were running in the background and
> downloaded everything for the repository (all lfs files).
> 
> The same happens if I use the git-bash.
> 
> After switching back to 2.13.0 with the same settings during installation,
> this error did not occur again.
> 
> I'm using Windows 10 (b4 bit) which all current updates installed.

This is most likely the same issue as reported at
https://github.com/git-lfs/git-lfs/issues/2631 and at
https://github.com/git-for-windows/git/issues/1312.

Ciao,
Johannes

RE

2017-10-04 Thread Stefan E Persson
Your email has been randonly selected for my foundation Donation reply  
to email: eriling.persson...@swedenmail.com for more details




Re: hacktoberfest

2017-10-04 Thread pedro rijo
seems my last answer was blocked due to HTML :(

here's the answer: Seems a nice start yes. I've been on vacations, but
next week I will go trough the current issues and add the
hacktoberfest label to some issues if you agree.

I went through the open issues a few moments ago. Some were closed
(solved or not applicable anymore), and some got an hacktoberfest
label, hoping new contributors pop up. Feel free to have a look and
add new issues to the label, as well as remove some of those I thought
it could be a good fit. There are easy issue, and some not so easy
(not very hard, but a bit more work) probably.

Issues labeled:
https://github.com/git/git-scm.com/issues?q=is%3Aopen+is%3Aissue+label%3Ahacktoberfest

Thanks,
Pedro

2017-09-28 22:19 GMT+01:00 Jeff King :
> On Wed, Sep 27, 2017 at 11:05:49PM +0100, pedro rijo wrote:
>
>> While the git repository itself is not hosted under GitHub, the Pro
>> Git book, git for Windows, and git-scm website (at least) projects
>> are, and could use this movement to get some more contributions, and
>> eventually more maintainers (at least git-scm website had some
>> maintainers problem some time ago).
>>
>> I've been helping on the git-scm repository (mostly filtering issues
>> and PRs), and I know there are still some issues which need to be
>> addressed. If the remaining maintainers agree, we could filter and
>> provide more instructions to some easy (or not so easy) issues, adding
>> the 'hacktoberfest' label and try to use this movement to solve some
>> problems
>
> I'd love it if more people wanted to contribute to the git-scm
> repository. I think one can probably find some low-hanging fruit by
> looking at the open issues list (though I'd be happy, too, if people
> with bug or feature suggestions opened new issues).
>
> Here are a couple small-to-moderate bugs that have been languishing:
>
>   https://github.com/git/git-scm.com/issues/701
>
>   https://github.com/git/git-scm.com/issues/987
>
>   https://github.com/git/git-scm.com/issues/994
>
> -Peff



-- 
Obrigado,

Pedro Rijo


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Torsten Bögershausen
On Wed, Oct 04, 2017 at 11:26:55AM -0500, Robert Dailey wrote:
> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> > Torsten Bögershausen  writes:
> >
> >>> $ git rm -r --cached . && git add .
> >>
> >> (Both should work)
> >>
> >> To be honest, from the documentation, I can't figure out the difference 
> >> between
> >> $ git read-tree --empty
> >> and
> >> $ git rm -r --cached .
> >>
> >> Does anybody remember the discussion, why we ended up with read-tree ?
> >
> > We used to use neither, and considered it fine to "rm .git/index" if
> > you wanted to empty the on-disk index file in the old world.  In the
> > modern world, folks want you to avoid touching filesystem directly
> > and instead want you to use Git tools, and the above are two obvious
> > ways to do so.
> >
> > "git read-tree" (without any parameter, i.e. "read these 0 trees and
> > populate the index with it") and its modern and preferred synonym
> > "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> > sole effect of this command is to empty the index.") are more direct
> > ways to express "I want the index emptied" between the two.
> >
> > The other one, "git rm -r --cached .", in the end gives you the same
> > state because it tells Git to "iterate over all the entries in the
> > index, find the ones that match pathspec '.', and remove them from
> > the index.".  It is not wrong per-se, but conceptually it is a bit
> > roundabout way to say that "I want the index emptied", I would
> > think.
> >
> > I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> > due to the overhead of having to do the pathspec filtering that ends
> > up to be a no-op, but there shouldn't be a difference in the end
> > result.
> 
> You guys are obviously worlds ahead of me on the internals of things,
> but from my perspective I like to avoid the "plumbing" commands as
> much as I can. Even if I used them, if I have to tell the rest of my
> team "this is the way to do it", they're going to give me dirty looks
> if I ask them to run things like this that make no sense to them.
> That's the argument I have to deal with when it comes to Git's
> usability within the team I manage. So based on this, I'd favor the
> `git rm -r --cached` approach because this is the more common result
> you see in Google, and also makes a little more sense from a high
> level of thought perspective. However, this is just my personal
> opinion. `read-tree --empty` is far less self explanatory IMHO.
> 
> Also let's not forget the second part of the command chain that
> results in the different behavior. In one case, I use `git add` which
> results in proper line ending normalization. In the other case, I do
> `git reset --hard` which does *NOT* result in the line endings
> normalized (`git status` shows no results). In both cases, I'm still
> doing `git rm -r --cached`, so I am doubtful that is the root cause
> for the line ending normalization piece. I'm still trying to
> understand why both give different results (root cause) and also get
> an understanding of what the correct (modern) solution is for line
> ending normalization (not necessarily which is the right way to
> clear/delete the index, which is really AFAIK just a means to this end
> and an implementation detail of sorts for this specific task).

Hopefully I am able to give a useful answer.

"git reset --hard" works like a hammer
and may destroy work that has been done,
in our case the cleaning of the index,
which is needed for normalization since Git 2.10 (or so)

Back to the question:
One solution, which you can tell your team, is this one:
$ git rm -r --cached . && git add .

And as Junio pointed out, this may be slower than needed.
And we don't want "slow" solutions in the official documentation ;-)

Whatever you find on search engines may get stale after a while,
so that we appreciate direct questions here.

(And I will open an issue on Github the next days)

The background is that the CRLF handling in Git changed over the years,
and one effect is that "git reset" is not "allowed" any more.

For the interested reader:
https://github.com/git-for-windows/git/issues/954



Re: [PATCH 1/3] path.c: fix uninitialized memory access

2017-10-04 Thread Jonathan Nieder
Junio C Hamano wrote:

> From: Jeff King 
> Date: Tue, 3 Oct 2017 19:30:40 -0400
> Subject: [PATCH] path.c: fix uninitialized memory access
> 
> In cleanup_path we're passing in a char array, run a memcmp on it, and
> run through it without ever checking if something is in the array in the
> first place.  This can lead us to access uninitialized memory, for
> example in t5541-http-push-smart.sh test 7, when run under valgrind:
>
> ==4423== Conditional jump or move depends on uninitialised value(s)
> ==4423==at 0x242FA9: cleanup_path (path.c:35)
[...]
> ==4423==  Uninitialised value was created by a heap allocation
[...]
> ==4423==by 0x29A30B: strbuf_grow (strbuf.c:66)
> ==4423==by 0x29A30B: strbuf_vaddf (strbuf.c:277)
> ==4423==by 0x242F9F: mkpath (path.c:454)
[...]
> Avoid this by using skip_prefix(), which knows not to go beyond the
> end of the string.
>
> Reported-by: Thomas Gummerer 
> Signed-off-by: Jeff King 
> Reviewed-by: Jonathan Nieder 

This is indeed
Reviewed-by: Jonathan Nieder 

Thanks.


Re: [PATCH 3/3] sub-process: allocate argv on the heap

2017-10-04 Thread Thomas Gummerer
On 10/04, Junio C Hamano wrote:
> Johannes Sixt  writes:
> 
> > Am 03.10.2017 um 21:57 schrieb Thomas Gummerer:
> >> diff --git a/sub-process.c b/sub-process.c
> >> index 6dde5062be..4680af8193 100644
> >> --- a/sub-process.c
> >> +++ b/sub-process.c
> >> @@ -77,7 +77,9 @@ int subprocess_start(struct hashmap *hashmap, struct 
> >> subprocess_entry *entry, co
> >>   {
> >>int err;
> >>struct child_process *process;
> >> -  const char *argv[] = { cmd, NULL };
> >> +  const char **argv = xmalloc(2 * sizeof(char *));
> >> +  argv[0] = cmd;
> >> +  argv[1] = NULL;
> >>entry->cmd = cmd;
> >>process = >process;
> >>
> >
> > Perhaps this should become
> >
> > argv_array_push(>args, cmd);
> >
> > so that there is no new memory leak?
> 
> Sounds like a good idea (if I am not grossly mistaken as to what is
> being suggested).
> 
> Here is what I am planning to queue.
> 
> -- >8 --
> From: Johannes Sixt 
> Date: Tue, 3 Oct 2017 22:24:57 +0200
> Subject: [PATCH] sub-process: use child_process.args instead of 
> child_process.argv
> 
> Currently the argv is only allocated on the stack, and then assigned to
> process->argv.  When the start_subprocess function goes out of scope,
> the local argv variable is eliminated from the stack, but the pointer is
> still kept around in process->argv.
> 
> Much later when we try to access the same process->argv in
> finish_command, this leads us to access a memory location that no longer
> contains what we want.  As argv0 is only used for printing errors, this
> is not easily noticed in normal git operations.  However when running
> t0021-conversion.sh through valgrind, valgrind rightfully complains:
> 
> ==21024== Invalid read of size 8
> ==21024==at 0x2ACF64: finish_command (run-command.c:869)
> ==21024==by 0x2D6B18: subprocess_exit_handler (sub-process.c:72)
> ==21024==by 0x2AB41E: cleanup_children (run-command.c:45)
> ==21024==by 0x2AB526: cleanup_children_on_exit (run-command.c:81)
> ==21024==by 0x54AD487: __run_exit_handlers (in /usr/lib/libc-2.26.so)
> ==21024==by 0x54AD4D9: exit (in /usr/lib/libc-2.26.so)
> ==21024==by 0x11A9EF: handle_builtin (git.c:550)
> ==21024==by 0x11ABCC: run_argv (git.c:602)
> ==21024==by 0x11AD8E: cmd_main (git.c:679)
> ==21024==by 0x1BF125: main (common-main.c:43)
> ==21024==  Address 0x1ffeffec00 is on thread 1's stack
> ==21024==  1504 bytes below stack pointer
> ==21024==
> 
> These days, the child_process structure has its own args array, and
> the standard way to set up its argv[] is to use that one, instead of
> assigning to process->argv to point at an array that is outside.
> Use that facility automatically fixes this issue.
> 
> Reported-by: Thomas Gummerer 
> Signed-off-by: Johannes Sixt 
> Signed-off-by: Junio C Hamano 
> ---
>  sub-process.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/sub-process.c b/sub-process.c
> index fcc4832c14..648b3a3943 100644
> --- a/sub-process.c
> +++ b/sub-process.c
> @@ -74,13 +74,12 @@ int subprocess_start(struct hashmap *hashmap, struct 
> subprocess_entry *entry, co
>  {
>   int err;
>   struct child_process *process;
> - const char *argv[] = { cmd, NULL };
>  
>   entry->cmd = cmd;
>   process = >process;
>  
>   child_process_init(process);
> - process->argv = argv;
> + argv_array_push(>args, cmd);

Thanks!  *Much* nicer than what I had :)

>   process->use_shell = 1;
>   process->in = -1;
>   process->out = -1;
> -- 
> 2.14.2-889-gd2948f6aa6
> 


Re: playing with MSan, was Re: [PATCH 0/3] fixes for running the test suite with --valgrind

2017-10-04 Thread Thomas Gummerer
On 10/04, Jeff King wrote:
> On Tue, Oct 03, 2017 at 07:41:54PM -0400, Jeff King wrote:
> 
> > I think using SANITIZE=memory would catch these, but it needs some
> > suppressions tuning. The weird "zlib reads uninitialized memory" error
> > is a problem (valgrind sees this, too, but we have suppressions).
> 
> I dug into this a little more. You can blacklist certain functions from
> getting MSan treatment, but that's not quite what we want. We want to
> mark bytes from certain _sources_ as being initialized, even if MSan
> doesn't agree.
> 
> And indeed, you can do that. As far as I can tell, MSan works by keeping
> a shadow map of memory and setting flags when it believes it has been
> initialized, and then checking that map when we make decisions based on
> the memory. But it can only do that if it instruments all writes. So the
> MSan documentation recommends that you build _everything_, including
> libraries, with it. Which obviously we don't do if we're using a system
> zlib. Or a system libc for that matter (though they intercept many
> common libc functions to handle this).
> 
> So one strategy is to "cheat" a bit at the library interfaces, and claim
> whatever they send us is properly initialized. The patch below tries
> that with zlib, and it does seem to work. It would fail to notice a real
> problem with any input we send _to_ the library (since the library isn't
> instrumented, and we claim that whatever comes out of it is legitimate).
> I could probably live with that.
> 
> But there are quite a few test failures that would still need
> investigating and annotating:
> 
>   - Certainly it's confused by looking at regmatch_t results from
> regexec(). We can fix that by building with NO_REGEX. But pcre has
> a similar problem.
> 
>   - Ditto curl and openssl, whose exit points would need annotations.
> 
>   - For some reason test-sigchain segfaults when it raise()s in the
> signal handler and recurses. Not sure if this is an MSan bug or
> what.
> 
> So I dunno. This approach is a _lot_ more convenient than trying to
> rebuild all the dependencies from scratch, and it runs way faster than
> valgrind.  It did find the cases that led to the patches in this
> series, and at least one more: if the lstat() at the end of
> entry.c:write_entry() fails, we write nonsense into the cache_entry.

Yeah valgrind found that one too, as I tried (and apparently failed :))
to explain in the cover letter.  I just haven't found the time yet to
actually try and go fix that one.

> I think we could probably get it to zero false positives without _too_
> much effort. I'll stop here for tonight, but I may pick it up again
> later (of course anybody else is welcome to fool around with it, too).
> 
> Below is the patch that let me run:
> 
>   make SANITIZE=memory CC=clang-6.0 NO_REGEX=1
> 
> and get a tractable number of errors.
> 
> -- >8 --
> diff --git a/Makefile b/Makefile
> index b143e4eea3..1da5c01211 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1047,6 +1047,9 @@ endif
>  ifneq ($(filter leak,$(SANITIZERS)),)
>  BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
>  endif
> +ifneq ($(filter memory,$(SANITIZERS)),)
> +BASIC_CFLAGS += -DENABLE_MSAN_UNPOISON
> +endif
>  endif
>  
>  ifndef sysconfdir
> diff --git a/git-compat-util.h b/git-compat-util.h
> index cedad4d581..836a4c0b54 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -1191,4 +1191,11 @@ extern void unleak_memory(const void *ptr, size_t len);
>  #define UNLEAK(var) do {} while (0)
>  #endif
>  
> +#ifdef ENABLE_MSAN_UNPOISON
> +#include 
> +#define msan_unpoison(ptr, len) __msan_unpoison(ptr, len)
> +#else
> +#define msan_unpoison(ptr, len) do {} while (0)
> +#endif
> +
>  #endif
> diff --git a/zlib.c b/zlib.c
> index 4223f1a8c5..5fa8f12507 100644
> --- a/zlib.c
> +++ b/zlib.c
> @@ -56,6 +56,8 @@ static void zlib_post_call(git_zstream *s)
>   if (s->z.total_in != s->total_in + bytes_consumed)
>   die("BUG: total_in mismatch");
>  
> + msan_unpoison(s->next_out, bytes_produced);
> +
>   s->total_out = s->z.total_out;
>   s->total_in = s->z.total_in;
>   s->next_in = s->z.next_in;


Re: [PATCH 1/3] path.c: fix uninitialized memory access

2017-10-04 Thread Thomas Gummerer
On 10/04, Junio C Hamano wrote:
> Jonathan Nieder  writes:
> 
> > Jeff King wrote:
> >> On Tue, Oct 03, 2017 at 03:45:01PM -0700, Jonathan Nieder wrote:
> >
> >>> In other words, an alternative fix would be
> >>> 
> >>>   if (*path == '.' && path[1] == '/') {
> >>>   ...
> >>>   }
> >>> 
> >>> which would not require passing in 'len' or switching to index-based
> >>> arithmetic.  I think I prefer it.  What do you think?
> >>
> >> Yes, I think that approach is much nicer. I think you could even use
> >> skip_prefix. Unfortunately you have to play a few games with const-ness,
> >> but I think the resulting signature for cleanup_path() is an
> >> improvement:
> 
> To tie the loose end, here is what I'll queue.

Thanks.  This is much nicer indeed!

> -- >8 --
> From: Jeff King 
> Date: Tue, 3 Oct 2017 19:30:40 -0400
> Subject: [PATCH] path.c: fix uninitialized memory access
> 
> In cleanup_path we're passing in a char array, run a memcmp on it, and
> run through it without ever checking if something is in the array in the
> first place.  This can lead us to access uninitialized memory, for
> example in t5541-http-push-smart.sh test 7, when run under valgrind:
> 
> ==4423== Conditional jump or move depends on uninitialised value(s)
> ==4423==at 0x242FA9: cleanup_path (path.c:35)
> ==4423==by 0x242FA9: mkpath (path.c:456)
> ==4423==by 0x256CC7: refname_match (refs.c:364)
> ==4423==by 0x26C181: count_refspec_match (remote.c:1015)
> ==4423==by 0x26C181: match_explicit_lhs (remote.c:1126)
> ==4423==by 0x26C181: check_push_refs (remote.c:1409)
> ==4423==by 0x2ABB4D: transport_push (transport.c:870)
> ==4423==by 0x186703: push_with_options (push.c:332)
> ==4423==by 0x18746D: do_push (push.c:409)
> ==4423==by 0x18746D: cmd_push (push.c:566)
> ==4423==by 0x1183E0: run_builtin (git.c:352)
> ==4423==by 0x11973E: handle_builtin (git.c:539)
> ==4423==by 0x11973E: run_argv (git.c:593)
> ==4423==by 0x11973E: main (git.c:698)
> ==4423==  Uninitialised value was created by a heap allocation
> ==4423==at 0x4C2CD8F: malloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==4423==by 0x4C2F195: realloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==4423==by 0x2C196B: xrealloc (wrapper.c:137)
> ==4423==by 0x29A30B: strbuf_grow (strbuf.c:66)
> ==4423==by 0x29A30B: strbuf_vaddf (strbuf.c:277)
> ==4423==by 0x242F9F: mkpath (path.c:454)
> ==4423==by 0x256CC7: refname_match (refs.c:364)
> ==4423==by 0x26C181: count_refspec_match (remote.c:1015)
> ==4423==by 0x26C181: match_explicit_lhs (remote.c:1126)
> ==4423==by 0x26C181: check_push_refs (remote.c:1409)
> ==4423==by 0x2ABB4D: transport_push (transport.c:870)
> ==4423==by 0x186703: push_with_options (push.c:332)
> ==4423==by 0x18746D: do_push (push.c:409)
> ==4423==by 0x18746D: cmd_push (push.c:566)
> ==4423==by 0x1183E0: run_builtin (git.c:352)
> ==4423==by 0x11973E: handle_builtin (git.c:539)
> ==4423==by 0x11973E: run_argv (git.c:593)
> ==4423==by 0x11973E: main (git.c:698)
> ==4423==
> 
> Avoid this by using skip_prefix(), which knows not to go beyond the
> end of the string.
> 
> Reported-by: Thomas Gummerer 
> Signed-off-by: Jeff King 
> Reviewed-by: Jonathan Nieder 
> Signed-off-by: Junio C Hamano 
> ---
>  path.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/path.c b/path.c
> index e50d2befcf..2fecf854fe 100644
> --- a/path.c
> +++ b/path.c
> @@ -33,11 +33,10 @@ static struct strbuf *get_pathname(void)
>   return sb;
>  }
>  
> -static char *cleanup_path(char *path)
> +static const char *cleanup_path(const char *path)
>  {
>   /* Clean it up */
> - if (!memcmp(path, "./", 2)) {
> - path += 2;
> + if (skip_prefix(path, "./", )) {
>   while (*path == '/')
>   path++;
>   }
> @@ -46,7 +45,7 @@ static char *cleanup_path(char *path)
>  
>  static void strbuf_cleanup_path(struct strbuf *sb)
>  {
> - char *path = cleanup_path(sb->buf);
> + const char *path = cleanup_path(sb->buf);
>   if (path > sb->buf)
>   strbuf_remove(sb, 0, path - sb->buf);
>  }
> @@ -63,7 +62,7 @@ char *mksnpath(char *buf, size_t n, const char *fmt, ...)
>   strlcpy(buf, bad_path, n);
>   return buf;
>   }
> - return cleanup_path(buf);
> + return (char *)cleanup_path(buf);
>  }
>  
>  static int dir_prefix(const char *buf, const char *dir)
> -- 
> 2.14.2-889-gd2948f6aa6
> 


Weird behavior/bug for git clean in untracked directory

2017-10-04 Thread David Spångberg
Hello

I'm a little bit confused about the behavior of git clean when trying to
clean multiple files/directories at once. For instance if I create two
directories with an empty file in an new git repository as such:

  mkdir tmprepo
  cd tmprepo
  git init
  mkdir a b
  touch a/file b/file

and try to clean both these files at the same time with git clean as
such:

  git clean -f a/file b/file

then git will not clean any of the files. However, if I separate the
cleaning into two separate calls like this:

  git clean -f a/file
  git clean -f b/file

then both directories will be cleaned. The git clean manual page suggests
git clean should support multiple paths/directories:

  SYNOPSIS
 git clean [-d] [-f] [-i] [-n] [-q] [-e ] [-x | -X] [--] 
...

  DESCRIPTION
 ...
 If any optional ... _arguments_ are given, only those _paths_ 
are affected.

The expected behavior for me would be that there should be no difference
between the first `git clean...' with two files and the two
`git clean...' with one file each.

I've tested on the following git versions so far:

  * 2.1.4
  * 2.14.2
  * 2.14.2.959.g6663358d3 (currently latest on next branch)


Best regards,
David


Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Taylor Blau
On Wed, Oct 04, 2017 at 05:46:21PM +0900, Junio C Hamano wrote:
> Jeff King  writes:
>
> >>  - pretty.c: delimit "%(trailers)" arguments with ","
> >>
> >>  "git for-each-ref --format=..." learned a new format element,
> >>  %(trailers), to show only the commit log trailer part of the log
> >>  message.
> >>
> >>  Will merge to 'next'.
> >
> > I think we want the first patch of this series to graduate before v2.15,
> > even if the rest doesn't make it. It tweaks a new syntax introduced
> > earlier in this cycle by jk/trailers-parse. If we ship without the
> > tweak, then we'll have to support the colon-delimiter to remain
> > backwards-compatible.
>
> Yeah, thanks for reminding me.  I actually was hoping that this will
> prove to be stable enough by the time -rc1 gets tagged, but yes, the
> bottom one looks innocuous/safe enough and should be fast-tracked to
> 'master' soonish.

It may make sense to send my other series to 'master' as well
("ref-filter.c: pass empty-string as NULL to atom parsers").

The series you're discussing here adds support for "empty" sub-agruments
(via: --format=%(contents:trailers:)), but Peff points out that this is
not a consistent user experience:

> Doh, that string_list behavior is what I was missing in my earlier
> comments. I agree this is probably the best way of doing it. I'm tempted
> to say that parse_ref_filter_atom() should do a similar thing. Right now
> we've got:
>
>   $ git for-each-ref --format='%(refname)' | wc
>  22062206   79929
>   $ git for-each-ref --format='%(refname:short)' | wc
>  22062206   53622
>   $ git for-each-ref --format='%(refname:)' | wc
>   fatal: unrecognized %(refname:) argument:
>   0   0   0

"ref-filter.c: pass empty-string as NULL to atom parsers" makes this
behavior of allowing empty sub-argument atom formats in
git-for-each-ref(1) consistently OK.

To avoid introducing a case where %(atom:) is sometimes allowed and
sometimes not, I would recommend that both of these patches be applied
to master at the same time.

--
- Taylor


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Robert Dailey
On Wed, Oct 4, 2017 at 11:59 AM, Jonathan Nieder  wrote:
> Hi Robert,
>
> Robert Dailey wrote:
>
>> You guys are obviously worlds ahead of me on the internals of things,
>> but from my perspective I like to avoid the "plumbing" commands as
>> much as I can.
>
> I suspect what we are dancing around is the need for some command like
>
> git checkout --renormalize .
>
> which would shorten the sequence to
>
> git checkout --renormalize .
> git status; # Show files that will be normalized
> git commit; # Commit the result
>
> What do you think?  Would you be interested in writing a patch for it?
> ("No" is as always an acceptable answer.)

I wish I could, but ultimately I'd probably not be able to do it. I
rarely have time to do recreational coding outside of work these days.

That aside, for now I want to know the proper & recommended method to
renormalize line endings using existing commands. Additionally I also
am interested in knowing why only 1 of the 3 solutions I tried (In my
OP) worked but the others didn't. My short term goal is just to get
educated a bit. There's so much conflicting and variable information
on this topic on Google. It makes it difficult to find the one true
path, especially since Git evolves and improves over time and
information usually becomes stale.


Re: disable interactive prompting

2017-10-04 Thread Ernesto Alfonso
Thanks for pointing out the option. 

I tried "git help push" and searched for "prompt", "interactive",
"credentials"...

I don't think I would have figured to try
 "git help git" from "git help" even after reading it carefully.

I'd suggest adding a hint in "git help".

I also don't seem have the GIT_TERMINAL_PROMPT in "git version 1.8.3.1" but I 
can fix this.

Thanks,

Ernesto
Jeff King  writes:

> On Wed, Oct 04, 2017 at 09:10:48AM -0700, Ernesto Alfonso wrote:
>
>> Waiting for git-push synchronously slows me down, so I have a bash
>> alias/function to do this in the background. But when my origin is https, I
>> get an undesired interactive prompt. I've tried to disable by
>> redirecting stdin:
>> 
>> git push ${REMOTE} ${BRANCH} &>/dev/null > 
>> but I still get an interactive prompt.
>> 
>> Is there a way to either
>> 
>> 1. disable interactive prompting
>> 2. programmatically determine whether a git command (or at least a git
>> push) would interactively prompt
>
> I assume the prompt is for credentials, since that's generally the only
> thing git-push will prompt for.
>
> Try:
>
>   $ git help git | sed -ne '/PROMPT/,/^$/p'
>  GIT_TERMINAL_PROMPT
>If this environment variable is set to 0, git will not prompt on
>the terminal (e.g., when asking for HTTP authentication).
>
> Of course that just stops the prompting. If Git needs a credential and
> you don't provide it, then the push will fail.
>
> For advice on that that, try "git help credentials".
>
> -Peff


Re: disable interactive prompting

2017-10-04 Thread Ernesto Alfonso
Hi,

The prompt asks for my https username/password:


> $ grv
> originhttps://github.com/user/dotemacs (fetch)
> originhttps://github.com/user/dotemacs (push)
> $ gpom
> [1] 22549
> $ Username for 'https://github.com': 
> $ fg
> { git push origin master &>/dev/null < /dev/null; LAST=$?; test 0 -eq $LAST 
> || echo "gpom failed with $LAST !"; }
> Password for 'https://asd,c...@github.com': 
> gpom failed with 128 !
> $ 


Thanks,

Ernesto

Jonathan Nieder  writes:

> Hi Ernesto,
>
> Ernesto Alfonso wrote:
>
>> Waiting for git-push synchronously slows me down, so I have a bash
>> alias/function to do this in the background. But when my origin is https, I
>> get an undesired interactive prompt. I've tried to disable by
>> redirecting stdin:
>>
>> git push ${REMOTE} ${BRANCH} &>/dev/null >
>> but I still get an interactive prompt.
>>
>> Is there a way to either
>>
>> 1. disable interactive prompting
>> 2. programmatically determine whether a git command (or at least a git
>> push) would interactively prompt
>
> You left out an important detail: what does the interactive prompt in
> question say?
>
> The general question is also interesting, but seeing the particular
> prompt would make it easy to look into the specific case at the same
> time.
>
> Thanks,
> Jonathan


Re: "man git-stash" explanation of "--include-untracked" and "--all" seems ambiguous

2017-10-04 Thread Robert P. J. Day
On Mon, 2 Oct 2017, Junio C Hamano wrote:

> Thomas Gummerer  writes:
>
> > This is fine when --include-untracked is specified first, as --all
> > implies --include-untracked, but I guess the behaviour could be a
> > bit surprising if --all is specified first and --include-untracked
> > later on the command line.
> >
> > Changing this could possibly break someone that just adds
> > parameters to their 'git stash' invocation, but I'm tempted to say
> > allowing both at once is a bug, and change it to make git die when
> > both are specified.
>
> I am on the fence.
>
> If this were --include=untracked vs --include=all, then I'd say your
> suggestion will violate the usual expectation of "on the command
> line, last one wins", but "--include-untracked" and "--all" are
> spelled very differently, and may not look all that related to a
> casual reader, so the expectation for "the last one wins" might be
> weaker than usual.
>
> But once we start complaining to a command line that has both,
> saying they are mutually exclusive, people will realize that they
> are very much closely related options, even though spelled quite
> differently.  And at that point, they will find it very unreasonable
> that we do not follow the usual "the last one wins" rule but error
> out X-<.
>
> If I really cared deeply about these two options [*1*], I would
> think that the ideal longer term direction would be to introduce
> --include={untracked,all-crufts} to replace and deprecate the
> current two options.  And then we make sure --include=* forms follow
> the usual "last one wins" rule.
>
>
> [Footnote]
>
> *1* I personally don't, but that does not mean I will block efforts
> by others who do to make this part of the system better.

  since i'm the one who tripped over this pedantic nitpickery, i'm
willing to take a shot at patching it, as long as there's consensus
from those *way* higher up the food chain as to what that patch should
look like.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday




Re: disable interactive prompting

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 09:10:48AM -0700, Ernesto Alfonso wrote:

> Waiting for git-push synchronously slows me down, so I have a bash
> alias/function to do this in the background. But when my origin is https, I
> get an undesired interactive prompt. I've tried to disable by
> redirecting stdin:
> 
> git push ${REMOTE} ${BRANCH} &>/dev/null  
> but I still get an interactive prompt.
> 
> Is there a way to either
> 
> 1. disable interactive prompting
> 2. programmatically determine whether a git command (or at least a git
> push) would interactively prompt

I assume the prompt is for credentials, since that's generally the only
thing git-push will prompt for.

Try:

  $ git help git | sed -ne '/PROMPT/,/^$/p'
 GIT_TERMINAL_PROMPT
   If this environment variable is set to 0, git will not prompt on
   the terminal (e.g., when asking for HTTP authentication).

Of course that just stops the prompting. If Git needs a credential and
you don't provide it, then the push will fail.

For advice on that that, try "git help credentials".

-Peff


Re: disable interactive prompting

2017-10-04 Thread Jonathan Nieder
Hi Ernesto,

Ernesto Alfonso wrote:

> Waiting for git-push synchronously slows me down, so I have a bash
> alias/function to do this in the background. But when my origin is https, I
> get an undesired interactive prompt. I've tried to disable by
> redirecting stdin:
>
> git push ${REMOTE} ${BRANCH} &>/dev/null 
> but I still get an interactive prompt.
>
> Is there a way to either
>
> 1. disable interactive prompting
> 2. programmatically determine whether a git command (or at least a git
> push) would interactively prompt

You left out an important detail: what does the interactive prompt in
question say?

The general question is also interesting, but seeing the particular
prompt would make it easy to look into the specific case at the same
time.

Thanks,
Jonathan


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Jonathan Nieder
Hi Robert,

Robert Dailey wrote:

> You guys are obviously worlds ahead of me on the internals of things,
> but from my perspective I like to avoid the "plumbing" commands as
> much as I can.

I suspect what we are dancing around is the need for some command like

git checkout --renormalize .

which would shorten the sequence to

git checkout --renormalize .
git status; # Show files that will be normalized
git commit; # Commit the result

What do you think?  Would you be interested in writing a patch for it?
("No" is as always an acceptable answer.)

Thanks,
Jonathan


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Robert Dailey
On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> Torsten Bögershausen  writes:
>
>>> $ git rm -r --cached . && git add .
>>
>> (Both should work)
>>
>> To be honest, from the documentation, I can't figure out the difference 
>> between
>> $ git read-tree --empty
>> and
>> $ git rm -r --cached .
>>
>> Does anybody remember the discussion, why we ended up with read-tree ?
>
> We used to use neither, and considered it fine to "rm .git/index" if
> you wanted to empty the on-disk index file in the old world.  In the
> modern world, folks want you to avoid touching filesystem directly
> and instead want you to use Git tools, and the above are two obvious
> ways to do so.
>
> "git read-tree" (without any parameter, i.e. "read these 0 trees and
> populate the index with it") and its modern and preferred synonym
> "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> sole effect of this command is to empty the index.") are more direct
> ways to express "I want the index emptied" between the two.
>
> The other one, "git rm -r --cached .", in the end gives you the same
> state because it tells Git to "iterate over all the entries in the
> index, find the ones that match pathspec '.', and remove them from
> the index.".  It is not wrong per-se, but conceptually it is a bit
> roundabout way to say that "I want the index emptied", I would
> think.
>
> I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> due to the overhead of having to do the pathspec filtering that ends
> up to be a no-op, but there shouldn't be a difference in the end
> result.

You guys are obviously worlds ahead of me on the internals of things,
but from my perspective I like to avoid the "plumbing" commands as
much as I can. Even if I used them, if I have to tell the rest of my
team "this is the way to do it", they're going to give me dirty looks
if I ask them to run things like this that make no sense to them.
That's the argument I have to deal with when it comes to Git's
usability within the team I manage. So based on this, I'd favor the
`git rm -r --cached` approach because this is the more common result
you see in Google, and also makes a little more sense from a high
level of thought perspective. However, this is just my personal
opinion. `read-tree --empty` is far less self explanatory IMHO.

Also let's not forget the second part of the command chain that
results in the different behavior. In one case, I use `git add` which
results in proper line ending normalization. In the other case, I do
`git reset --hard` which does *NOT* result in the line endings
normalized (`git status` shows no results). In both cases, I'm still
doing `git rm -r --cached`, so I am doubtful that is the root cause
for the line ending normalization piece. I'm still trying to
understand why both give different results (root cause) and also get
an understanding of what the correct (modern) solution is for line
ending normalization (not necessarily which is the right way to
clear/delete the index, which is really AFAIK just a means to this end
and an implementation detail of sorts for this specific task).


disable interactive prompting

2017-10-04 Thread Ernesto Alfonso
Hi,

Waiting for git-push synchronously slows me down, so I have a bash
alias/function to do this in the background. But when my origin is https, I
get an undesired interactive prompt. I've tried to disable by
redirecting stdin:

git push ${REMOTE} ${BRANCH} &>/dev/null 

Re: distinguishing between staged and unstaged content in a stash?

2017-10-04 Thread Robert P. J. Day
On Wed, 4 Oct 2017, Kevin Daudt wrote:

> On Wed, Oct 04, 2017 at 07:10:46AM -0400, rpj...@crashcourse.ca wrote:
> >
> >   couple (admittedly trivial) questions about stashing. first, can
> > i clarify that when one stashes content, a stash *always*
> > distinguishes between what was staged, and what was unstaged? that
> > is, when one is stashing, the "--keep-index" option relates to
> > whether or not staged changes are left in the index (and,
> > consequently, in the working directory as well), but that option
> > has no effect on the final content of the stash, yes? even if
> > "--keep-index" is used, staged content still ends up in the stash.
> >
> >   also, is there a simple way to distinguish between the staged
> > and unstaged contents of a stash (or, more basically, is this even
> > a useful question to ask)? out of curiosity, i tried to figure out
> > how to do this, and came up with the following.
> >
> > to see staged portion of stash@{0}:
> >
> >   $ git show stash@{0}^2
> >
> > to see unstaged portion:
> >
> >   $ git diff stash@{0}^2 stash@{0}
> >
> > it's not like i have a pressing need to do that, i was just
> > curious if there's a simpler way to do this, or if this is just
> > not something people should need to do on a regular basis.
>
> There was a recent thread about this[0]. The conclusion is that it's
> seen as a good change, someone just needs to supply the patch to do
> this. A possible solution was also provided (from before that
> thread) here [1]
>
> [0]:https://public-inbox.org/git/1505626069.9625.6.ca...@gmail.com/
> [1]:https://public-inbox.org/git/20170317141417.g2oenl67k74nl...@sigill.intra.peff.net/

  ah, i see immediately what sort of worm can this represents, given
the various possible states of files in the stash:

  * staged and tracked
  * unstaged and tracked
  * untracked but not ignored
  * untracked and ignored

  i will continue reading.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: Enhancement request: git-push: Allow (configurable) default push-option

2017-10-04 Thread Marius Paliga
Hi Stefan,

I will look at it.

Thanks,
Marius


2017-10-03 18:53 GMT+02:00 Stefan Beller :
> On Tue, Oct 3, 2017 at 3:15 AM, Marius Paliga  wrote:
>> There is a need to pass predefined push-option during "git push"
>> without need to specify it explicitly.
>>
>> In another words we need to have a new "git config" variable to
>> specify string that will be automatically passed as "--push-option"
>> when pushing to remote.
>>
>> Something like the following:
>>
>> git config push.optionDefault AllowMultipleCommits
>>
>> and then command
>>   git push
>> would silently run
>>   git push --push-option "AllowMultipleCommits"
>
> We would need to
> * design this feature (seems like you already have a good idea what you need)
> * implement it (see builtin/push.c):
>  - move "struct string_list push_options = STRING_LIST_INIT_DUP;"
>   to be a file-static variable, such that we have access to it outside
> of cmd_push.
>  - In git_push_config in builtin/push.c that parses the config, we'd
> need to check
>   for "push.optionDefault" and add these to the push_options (I assume 
> multiple
>   are allowed)
> * document it (Documentation/git-push.txt)
> * add a test for it ? (t/t5545-push-options.sh)
>
> Care to write a patch? Otherwise I'd mark it up as part of
> #leftoverbits for now,
> as it seems like a good starter project.
>
> Thanks,
> Stefan


Re: distinguishing between staged and unstaged content in a stash?

2017-10-04 Thread Kevin Daudt
On Wed, Oct 04, 2017 at 07:10:46AM -0400, rpj...@crashcourse.ca wrote:
> 
>   couple (admittedly trivial) questions about stashing. first, can i
> clarify that when one stashes content, a stash *always* distinguishes
> between what was staged, and what was unstaged? that is, when one is
> stashing, the "--keep-index" option relates to whether or not staged
> changes are left in the index (and, consequently, in the working
> directory as well), but that option has no effect on the final content
> of the stash, yes? even if "--keep-index" is used, staged content
> still ends up in the stash.
> 
>   also, is there a simple way to distinguish between the staged and
> unstaged contents of a stash (or, more basically, is this even a
> useful question to ask)? out of curiosity, i tried to figure out
> how to do this, and came up with the following.
> 
> to see staged portion of stash@{0}:
> 
>   $ git show stash@{0}^2
> 
> to see unstaged portion:
> 
>   $ git diff stash@{0}^2 stash@{0}
> 
> it's not like i have a pressing need to do that, i was just curious
> if there's a simpler way to do this, or if this is just not something
> people should need to do on a regular basis.
> 
> rday
> 
> 
> 

There was a recent thread about this[0]. The conclusion is that it's
seen as a good change, someone just needs to supply the patch to do
this. A possible solution was also provided (from before that thread)
here [1]

[0]:https://public-inbox.org/git/1505626069.9625.6.ca...@gmail.com/
[1]:https://public-inbox.org/git/20170317141417.g2oenl67k74nl...@sigill.intra.peff.net/


Re: [PATCH v2] setup: update error message to be more meaningful

2017-10-04 Thread Kaartic Sivaraam
On Tue, 2017-10-03 at 09:32 +0900, Junio C Hamano wrote:
> 
> As an aside, I wonder if we want to _() the message.  It's outside
> the scope of this fix, obviously.
> 

I'm surprised it isn't already! I think it should.


As an aside, I wanted to know whether we should add
'test_expect_failure's for commands of the following sort,

git log -p --full-diff master --stat
 
git commit -v Makefile --amend

just to keep us reminded that these are accepted not by design but by
accident ?

---
Kaartic


Re: [PATCH] test-list-objects: mark file-local symbols as static

2017-10-04 Thread Derrick Stolee

On 10/3/2017 5:51 PM, Ramsay Jones wrote:

Signed-off-by: Ramsay Jones 
---

Hi Derrick,

If you need to re-roll your 'ds/find-unique-abbrev-optim' branch,
could you please squash this into the relevant patch (commit 3792c78ba0,
"test-list-objects: list a subset of object ids", 01-10-2017).

Thanks!

ATB,
Ramsay Jones

  t/helper/test-list-objects.c | 32 
  1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/t/helper/test-list-objects.c b/t/helper/test-list-objects.c
index 22bc9b4e6..5c5d3c03f 100644
--- a/t/helper/test-list-objects.c
+++ b/t/helper/test-list-objects.c
@@ -6,43 +6,43 @@ struct count {
int select_mod;
  };
  
-int count_loose(const struct object_id *oid,

-   const char *path,
-   void *data)
+static int count_loose(const struct object_id *oid,
+  const char *path,
+  void *data)
  {
((struct count*)data)->total++;
return 0;
  }
  
-int count_packed(const struct object_id *oid,

-struct packed_git *pack,
-uint32_t pos,
-void* data)
+static int count_packed(const struct object_id *oid,
+   struct packed_git *pack,
+   uint32_t pos,
+   void* data)
  {
((struct count*)data)->total++;
return 0;
  }
  
-void output(const struct object_id *oid,

-   struct count *c)
+static void output(const struct object_id *oid,
+  struct count *c)
  {
if (!(c->total % c->select_mod))
printf("%s\n", oid_to_hex(oid));
c->total--;
  }
  
-int output_loose(const struct object_id *oid,

-const char *path,
-void *data)
+static int output_loose(const struct object_id *oid,
+   const char *path,
+   void *data)
  {
output(oid, (struct count*)data);
return 0;
  }
  
-int output_packed(const struct object_id *oid,

- struct packed_git *pack,
- uint32_t pos,
- void* data)
+static int output_packed(const struct object_id *oid,
+struct packed_git *pack,
+uint32_t pos,
+void* data)
  {
output(oid, (struct count*)data);
return 0;
Thanks, Ramsay. I applied these changes locally. I'll remember "static" 
in the future.


Thanks,
-Stolee


Re: [PATCH v1] convert: display progress for filtered objects that have been delayed

2017-10-04 Thread Lars Schneider

> On 04 Oct 2017, at 13:55, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> The delayed progress API is being simplified so I'll probably do a
>>> bit of evil merge while merging this to 'pu'.
>> 
>> I just realized that this patch got lost :-(
> 
> Really?  Isn't it 52f1d62e ("convert: display progress for filtered
> objects that have been delayed", 2017-08-20)?  It already is in
> 'master' since about 3 weeks ago.
> 
> As this is merely a new "feature", I do not think it is reasonable
> to expect it to be merged down to 'maint'.  In fact, I think your
> original patch <20170820154720.32259-1-larsxschnei...@gmail.com>
> declared that its Base Ref was "master", and I queued the topic on
> top of the then-current 'master', so it is not mergeable to 'maint'.
> 
> It may be cherry-pickable, but I do not think it deserves to be.  It
> is not a fix for a grave bug or anything like that, right?

The original topic 2841e8f ("convert: add "status=delayed" to filter 
process protocol", 2017-06-30) is a new feature. That's why I was
surprised to see it in 2.14.2! The problem is that this topic
might appear "broken" to the user without the progress output
introduce in 52f1d62e ("convert: display progress for filtered
objects that have been delayed", 2017-08-20).

In summary: 2841e8f was merged to 2.14.2 and 52f1d62e was not.
Therefore, it would be great to have 52f1d62e in 2.14.3.
Does this sound reasonable to you?

Thanks,
Lars


Re: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches)

2017-10-04 Thread Jeff Hostetler



On 10/3/2017 7:42 PM, Jonathan Tan wrote:

On Tue, Oct 3, 2017 at 7:39 AM, Jeff Hostetler  wrote:


As I see it there are the following major parts to partial clone:
1. How to let git-clone (and later git-fetch) specify the desired
subset of objects that it wants?  (A ref-relative request.)
2. How to let the server and git-pack-objects build that incomplete
packfile?
3. How to remember in the local config that a partial clone (or
fetch) was used and that missing object should be expected?
4. How to dynamically fetch individual missing objects individually?
 (Not a ref-relative request.)
5. How to augment the local ODB with partial clone information and
let git-fsck (and friends) perform limited consistency checking?
6. Methods to bulk fetching missing objects (whether in a pre-verb
hook or in unpack-tree)
7. Miscellaneous issues (e.g. fixing places that accidentally cause
a missing object to be fetched that don't really need it).


Thanks for the enumeration.


As was suggested above, I think we should merge our efforts:
using my filtering for 1 and 2 and Jonathan's code for 3, 4, and 5.
I would need to eliminate the "relax" options in favor of his
is_promised() functionality for index-pack and similar.  And omit
his blob-max-bytes changes from pack-objects, the protocol and
related commands.

That should be a good first step.


This sounds good to me. Jeff Hostetler's filtering (all blobs, blobs
by size, blobs by sparse checkout specification) is more comprehensive
than mine, so removing blob-max-bytes from my code is not a problem.


We both have thoughts on bulk fetching (mine in pre-verb hooks and
his in unpack-tree).  We don't need this immediately, but can wait
until the above is working to revisit.


Agreed.



Thanks.

I'll make a first pass at merging our efforts then and
post something shortly.

Jeff



Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Derrick Stolee

On 10/4/2017 2:07 AM, Junio C Hamano wrote:

Derrick Stolee  writes:


-   exists = has_sha1_file(sha1);
-   while (len < GIT_SHA1_HEXSZ) {
-   struct object_id oid_ret;
-   status = get_short_oid(hex, len, _ret, GET_OID_QUIETLY);
-   if (exists
-   ? !status
-   : status == SHORT_NAME_NOT_FOUND) {
-   hex[len] = 0;
-   return len;
-   }
-   len++;
-   }
-   return len;

The "always_call_fn" thing is a big sledgehammer that overrides
everything else in update_candidates().  It bypasses the careful
machinery set up to avoid having to open ambiguous object to learn
their types as much as possible.  One narrow exception when it is OK
to use is if we never limit our candidates with type.


I do not modify get_short_oid, which uses these advanced options, 
depending on the flags given. find_unique_abbrev_r() does not use these 
advanced options.



And it might appear that the conversion is safe (if only because we
do not see any type limitation in the get_short_oid() call above),
but I think there is one case where this patch changes the
behaviour: what happens if core.disambiguate was set to anything
other than "none"?  The new code does not know anything about type
based filtering, so it can end up reporting longer abbreviation than
it was asked to produce.  It may not be a problem in practice, though.

I am not sure if setting core.disambiguate is generally a good idea
in the first place, and if it is OK to break find_unique_abbrev()
with respect to the configuration variable like this patch does.


I do not think that type-aware disambiguation goes through this code 
path, since it requires giving different parameters to get_short_oid(). 
Test t1512-rev-parse-disambituagion.sh has a test 'core.disambiguate 
config can prefer types' that verifies this behavior.



I'd feel safe if we get extra input from Peff, who introduced the
feature in 5b33cb1f ("get_short_sha1: make default disambiguation
configurable", 2016-09-27).


I look forward to more feedback. Thanks for taking the time to look at 
my patch series.


Thanks,
-Stolee


Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Derrick Stolee

On 10/4/2017 2:10 AM, Junio C Hamano wrote:

Derrick Stolee  writes:
...

I understand that this patch on its own does not have good numbers. I
split the
patches 3 and 4 specifically to highlight two distinct changes:

Patch 3: Unroll the len loop that may inspect all files multiple times.
Patch 4: Parse less while disambiguating.

Patch 4 more than makes up for the performance hits in this patch.

Now you confused me even more.  When we read the similar table that
appears in [Patch 4/5], what does the "Base Time" column mean?
Vanilla Git with [Patch 3/5] applied?  Vanillay Git with [Patch 4/5]
alone applied?  Something else?
In PATCH 3, 4, and 5, I used the commit-by-commit diff for the perf 
numbers, so the "Base Time" for PATCH 4 is the time calculated when 
PATCH 3 is applied. The table in the [PATCH 0/5] message includes the 
relative change for all commits.


I recalculated the relative change for each patch related to the 
baseline (PATCH 2). Looking again, it appears I misspoke and PATCH 4 
does include a +8% change for a fully-repacked Linux repo relative to 
PATCH 2. Since PATCH 5 includes an optimization targeted directly at 
large packfiles, the final performance gain is significant in the 
fully-packed cases.


It is also worth looking at the absolute times for these cases, since 
the fully-packed case is significantly faster than the multiple-packfile 
case, so the relative change impacts users less.


One final note: the improvement was clearer in test p0008.1 when the 
test included "sort -R" to shuffle the known OIDs. Providing OIDs in 
lexicographic order has had a significant effect on the performance, 
which does not reflect real-world usage. I removed the "sort -R" because 
it is a GNU-ism, but if there is a good cross-platform alternative I 
would be happy to replace it.


p0008.1: find_unique_abbrev() for existing objects
--

For 10 repeated tests, each checking 100,000 known objects, we find the
following results when running in a Linux VM:

| Repo  | Baseline | Patch 3 | Rel % | Patch 4 | Rel % | Patch 5 | Rel % |
|---|--|-|---|-|---|-|---|
| Git   | 0.09 | 0.06    | -33%  | 0.05    | -44%  | 0.05    | -44%  |
| Git   | 0.11 | 0.08    | -27%  | 0.08    | -27%  | 0.08    | -27%  |
| Git   | 0.09 | 0.07    | -22%  | 0.06    | -33%  | 0.06    | -33%  |
| Linux | 0.13 | 0.32    | 146%  | 0.14    | + 8%  | 0.05    | -62%  |
| Linux | 1.13 | 1.12    | - 1%  | 0.94    | -17%  | 0.88    | -22%  |
| Linux | 1.08 | 1.05    | - 3%  | 0.86    | -20%  | 0.80    | -26%  |
| VSTS  | 0.12 | 0.23    | +92%  | 0.11    | - 8%  | 0.05    | -58%  |
| VSTS  | 1.02 | 1.08    | + 6%  | 0.95    | - 7%  | 0.95    | - 7%  |
| VSTS  | 2.25 | 2.08    | - 8%  | 1.82    | -19%  | 1.93    | -14%  |

(Each repo has three versions, in order: 1 packfile, multiple packfiles, 
and multiple packfiles and loose objects.)


Thanks,
-Stolee



Re: [PATCH v8 00/12] Fast git status via a file system watcher

2017-10-04 Thread Ben Peart



On 10/4/2017 2:38 AM, Alex Vandiver wrote:

On Wed, 4 Oct 2017, Junio C Hamano wrote:

Rats indeed.  Let's go incremental as promised, perhaps like this
(but please supply a better description if you have one).


I think you'll also want the following squashed into 5c8cdcfd8 and
def437671:

-- >8 --
 From 445d45027bb5b7823338cf111910d2884af6318b Mon Sep 17 00:00:00 2001
From: Alex Vandiver 
Date: Tue, 3 Oct 2017 23:27:46 -0700
Subject: [PATCH] fsmonitor: Read entirety of watchman output

In perl, setting $/ sets the string that is used as the "record
separator," which sets the boundary that the `<>` construct reads to.
Setting `local $/ = 0666;` evaluates the octal, getting 438, and
stringifies it.  Thus, the later read from `` stops as soon
as it encounters the string "438" in the watchman output, yielding
invalid JSON; repositories containing filenames with SHA1 hashes are
able to trip this easily.

Set `$/` to undefined, thus slurping all output from watchman.  Also
close STDIN which is provided to watchman, to better guarantee that we
cannot deadlock with watchman while both attempting to read.



Thank you!  I'm a perl neophyte so have to rely on others when it comes 
to these types of perl issues.  I tried out your fixes and they appear 
to work well.


While testing them, I discovered that your fix of `local $/ = 0666;` 
exposed an existing issue in the test version of the integration script. 
 The fix for that is within my perl capabilities and is fixed with the 
following patch:


-- >8 --
From 3e3b983a4208a62d166c233a7de3bf045322f6c7 Mon Sep 17 00:00:00 2001
From: Ben Peart 
Date: Wed, 4 Oct 2017 08:33:39 -0400
Subject: [PATCH] fsmonitor: preserve utf8 filenames in 
fsmonitor-watchman log


Update the test fsmonitor-watchman integration script to properly
preserve utf8 filenames when outputting the .git/watchman-output.out log
file.

Signed-off-by: Ben Peart 
---
 t/t7519/fsmonitor-watchman | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 51330f8b3d..a3e30bf54f 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -129,6 +129,7 @@ sub launch_watchman {
"Falling back to scanning...\n" if $o->{error};

open ($fh, ">", ".git/watchman-output.out");
+   binmode $fh, ":utf8";
print $fh @{$o->{files}};
close $fh;

--
2.14.1.windows.1.1034.g0776750557



Re: [PATCH v2] branch: change the error messages to be more meaningful

2017-10-04 Thread Kaartic Sivaraam
On Wed, 2017-10-04 at 13:11 +0900, Junio C Hamano wrote:
> 
> It is a bit dissapointing that we do not need to touch tests, as it
> indicates that the logic to diagnose extra arguments as an error has
> no coverage.

Even if there were tests I don't think they would have needed any
updation as most of the tests (at least those that I came across) that
check for failure seem not to be checking for what error message gets
printed. They seem to test only if the command fails (using
test_must_fail in most cases, I guess).

Moreover, as a consequence of my assumption that the tests don't check
for the error messages themselves; I haven't even thought of checking
whether the tests or the travis-ci build succeeded as a consequence of
my patches that touch "only" the error messages!

---
Kaartic


Ahoj drahý!!

2017-10-04 Thread katiehiggin...@aol.com
Hi Dear!!

Nice meeting you as i got your email through my personal search, my
name is Katie Higgins, i am (Blue Angels United officer), working for
U.S Navy base camp here in Syria.. i urgently seek your assistance,
please kindly write me at your convenience,look forward reading from
you soon.

Katie Higgins



Ahoj drahý!!

Pekné stretnutie s Vami, ako som dostal svoj e-mail prostredníctvom
môjho osobného vyhľadávania, moje meno je Katie Higgins, ja som (Blue
Angels United dôstojník), pracujem pre základňu amerického námorníctva
v Sýrii .. naliehavo hľadám tvoju pomoc, prosíme, napíš ma na vaše
pohodlie, čoskoro čakať od vás.

Katie Higginsová


Re: [PATCH v8 00/12] Fast git status via a file system watcher

2017-10-04 Thread Ben Peart



On 10/3/2017 10:09 PM, Junio C Hamano wrote:

Ben Peart  writes:


Well, rats. I found one more issue that applies to two of the
commits. Can you squash this in as well or do you want it in some
other form?


Rats indeed.  Let's go incremental as promised, perhaps like this
(but please supply a better description if you have one).


Thank you.  Looks good.



-- >8 --
From: Ben Peart 
Subject: fsmonitor: MINGW support for watchman integration

Instead of just taking $ENV{'PWD'}, use the same logic that converts
PWD to $git_work_tree on MSYS_NT in the watchman integration hook
script also on MINGW.

Signed-off-by: Ben Peart 
Signed-off-by: Junio C Hamano 
---
  t/t7519/fsmonitor-watchman | 2 +-
  templates/hooks--fsmonitor-watchman.sample | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 7ceb32dc18..cca3d71e90 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -36,7 +36,7 @@ my $system = `uname -s`;
  $system =~ s/[\r\n]+//g;
  my $git_work_tree;
  
-if ($system =~ m/^MSYS_NT/) {

+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
$git_work_tree = `cygpath -aw "\$PWD"`;
$git_work_tree =~ s/[\r\n]+//g;
$git_work_tree =~ s,\\,/,g;
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index 870a59d237..c68038ef00 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -35,7 +35,7 @@ my $system = `uname -s`;
  $system =~ s/[\r\n]+//g;
  my $git_work_tree;
  
-if ($system =~ m/^MSYS_NT/) {

+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
$git_work_tree = `cygpath -aw "\$PWD"`;
$git_work_tree =~ s/[\r\n]+//g;
$git_work_tree =~ s,\\,/,g;



Re: [PATCH v1] convert: display progress for filtered objects that have been delayed

2017-10-04 Thread Junio C Hamano
Lars Schneider  writes:

>> The delayed progress API is being simplified so I'll probably do a
>> bit of evil merge while merging this to 'pu'.
>
> I just realized that this patch got lost :-(

Really?  Isn't it 52f1d62e ("convert: display progress for filtered
objects that have been delayed", 2017-08-20)?  It already is in
'master' since about 3 weeks ago.

As this is merely a new "feature", I do not think it is reasonable
to expect it to be merged down to 'maint'.  In fact, I think your
original patch <20170820154720.32259-1-larsxschnei...@gmail.com>
declared that its Base Ref was "master", and I queued the topic on
top of the then-current 'master', so it is not mergeable to 'maint'.

It may be cherry-pickable, but I do not think it deserves to be.  It
is not a fix for a grave bug or anything like that, right?



Re: [PATCH v1] convert: display progress for filtered objects that have been delayed

2017-10-04 Thread Lars Schneider

> On 04 Oct 2017, at 12:52, Lars Schneider  wrote:
> 
> 
>> On 24 Aug 2017, at 21:40, Junio C Hamano  wrote:
>> 
>> Lars Schneider  writes:
>> 
>>> In 2841e8f ("convert: add "status=delayed" to filter process protocol",
>>> 2017-06-30) we taught the filter process protocol to delayed responses.
>>> These responses are processed after the "Checking out files" phase.
>>> If the processing takes noticeable time, then the user might think Git
>>> is stuck.
>>> 
>>> Display the progress of the delayed responses to let the user know that
>>> Git is still processing objects. This works very well for objects that
>>> can be filtered quickly. If filtering of an individual object takes
>>> noticeable time, then the user might still think that Git is stuck.
>>> However, in that case the user would at least know what Git is doing.
>>> 
>>> It would be technical more correct to display "Checking out files whose
>>> content filtering has been delayed". For brevity we only print
>>> "Filtering content".
>>> 
>>> The finish_delayed_checkout() call was moved below the stop_progress()
>>> call in unpack-trees.c to ensure that the "Checking out files" progress
>>> is properly stopped before the "Filtering content" progress starts in
>>> finish_delayed_checkout().
>>> 
>>> Signed-off-by: Lars Schneider 
>>> Suggested-by: Taylor Blau 
>>> ---
>> 
>> Makes sense.  The only thing that made me wonder was if we want the
>> change in unpack-trees.c in this patch.  After all, the procedure to
>> finish up the delayed checkout _is_ a part of the work need to be
>> done to populate the working tree files, so stopping the progress
>> before feels somewhat wrong at the phylosophical level.
>> 
>> I think our output cannot express nested progress bars, and I think
>> that is the reason why this patch tweaks unpack-trees.c; so I am
>> fine with the end result (and that is why I said "made me wonder
>> was", not "makes me wonder", the latter would imply "this we might
>> want fix before applying", but I do not think we want to change
>> anything this patch does to unpack-trees.c in this case).
>> 
>> The delayed progress API is being simplified so I'll probably do a
>> bit of evil merge while merging this to 'pu'.
>> 
>> Thanks.
> 
> Hi Junio,
> 
> I just realized that this patch got lost :-(
> That means 2.14.2 supports the delayed filters but does not show
> progress to the user. To the user Git will appear hanging.
> 
> Can you merge this this topic for 2.14.3 / 2.15 ?
> 
> Thank you,
> Lars

I should have expressed myself more clearly:
The patch made it into master with 52f1d62eb44faf569edca360ec9af9ddd4045fe0 .
Therefore, I think it will be released with 2.15, right?

The patch just did not make it into 2.14.2 and I wonder if you could
merge the patch for 2.14.3?

Thank you,
Lars



Re: [PATCH 2/3] for-each-ref: let upstream/push optionally report merge name.

2017-10-04 Thread Junio C Hamano
Junio C Hamano  writes:

>> +if (explicit)
>> +*s = xstrdup(merge);
>> +else
>> +*s = "";
>
> Here is the same "who are we trying to help---users who want to know
> where their push goes, or users who are debugging where the push
> destination is defined?" question.  I do not have a strong opinion
> either way, but I think giving the end result with fallback (i.e.
> not nullifying when the result is not explicit) may be easier to use
> by "push" users.

Now after thinking about it a bit more, I actually have a moderate
preference to doing it the way your patches do.  With programatic
%(if)%(else)%(end) support we acquired recently, the fallback can be
coded in the --format=... language if the user wanted to using the
"internal fallbacks, explicit==0, are ignored" behaviour that are
implemented by these two patches.  The reverse is not true.

I think the remaining points from my reviews are:

 - It would be better to compute precomputable stuff when used atoms
   are parsed, instead of running starts_with() in these functions;

 - We want to make sure there is no existing multi-word modifiers
   (in which case we can safely declare "multi-word" is the way we
   spell them from now on).  Or if there are existing ones, they
   already spell themselves as "multi-word".

   I have nothing against "remote-name"; I just want to make sure we
   are not making things inconsistent.  If there are only few (but
   non zero number of) multi-word modifiers that are not spelled
   "multi-word", as long as they are only few and their spelling are
   inferiour (e.g. concatenatedwords is much worse than
   concatenated-words), we could still declare "multi-word" is the
   right way to spell them going forward, declare that we will give
   them synonyms and deprecate the bad spelling out over time, and
   leave that plan as #leftover bits thing (i.e. not doing the
   deprecation of these other modifiers as part of this series.

   The only thing I want to happen in the scope of _this_ series, as
   due diligence, is to make sure we are happy with "multi-word",
   and also to know if we need a follow-up work (just yes/no,
   possibly with plans, but no actual work yet).

Thanks.


distinguishing between staged and unstaged content in a stash?

2017-10-04 Thread rpjday


  couple (admittedly trivial) questions about stashing. first, can i
clarify that when one stashes content, a stash *always* distinguishes
between what was staged, and what was unstaged? that is, when one is
stashing, the "--keep-index" option relates to whether or not staged
changes are left in the index (and, consequently, in the working
directory as well), but that option has no effect on the final content
of the stash, yes? even if "--keep-index" is used, staged content
still ends up in the stash.

  also, is there a simple way to distinguish between the staged and
unstaged contents of a stash (or, more basically, is this even a
useful question to ask)? out of curiosity, i tried to figure out
how to do this, and came up with the following.

to see staged portion of stash@{0}:

  $ git show stash@{0}^2

to see unstaged portion:

  $ git diff stash@{0}^2 stash@{0}

it's not like i have a pressing need to do that, i was just curious
if there's a simpler way to do this, or if this is just not something
people should need to do on a regular basis.

rday





Re: [PATCH v1] convert: display progress for filtered objects that have been delayed

2017-10-04 Thread Lars Schneider

> On 24 Aug 2017, at 21:40, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>> In 2841e8f ("convert: add "status=delayed" to filter process protocol",
>> 2017-06-30) we taught the filter process protocol to delayed responses.
>> These responses are processed after the "Checking out files" phase.
>> If the processing takes noticeable time, then the user might think Git
>> is stuck.
>> 
>> Display the progress of the delayed responses to let the user know that
>> Git is still processing objects. This works very well for objects that
>> can be filtered quickly. If filtering of an individual object takes
>> noticeable time, then the user might still think that Git is stuck.
>> However, in that case the user would at least know what Git is doing.
>> 
>> It would be technical more correct to display "Checking out files whose
>> content filtering has been delayed". For brevity we only print
>> "Filtering content".
>> 
>> The finish_delayed_checkout() call was moved below the stop_progress()
>> call in unpack-trees.c to ensure that the "Checking out files" progress
>> is properly stopped before the "Filtering content" progress starts in
>> finish_delayed_checkout().
>> 
>> Signed-off-by: Lars Schneider 
>> Suggested-by: Taylor Blau 
>> ---
> 
> Makes sense.  The only thing that made me wonder was if we want the
> change in unpack-trees.c in this patch.  After all, the procedure to
> finish up the delayed checkout _is_ a part of the work need to be
> done to populate the working tree files, so stopping the progress
> before feels somewhat wrong at the phylosophical level.
> 
> I think our output cannot express nested progress bars, and I think
> that is the reason why this patch tweaks unpack-trees.c; so I am
> fine with the end result (and that is why I said "made me wonder
> was", not "makes me wonder", the latter would imply "this we might
> want fix before applying", but I do not think we want to change
> anything this patch does to unpack-trees.c in this case).
> 
> The delayed progress API is being simplified so I'll probably do a
> bit of evil merge while merging this to 'pu'.
> 
> Thanks.

Hi Junio,

I just realized that this patch got lost :-(
That means 2.14.2 supports the delayed filters but does not show
progress to the user. To the user Git will appear hanging.

Can you merge this this topic for 2.14.3 / 2.15 ?

Thank you,
Lars


playing with MSan, was Re: [PATCH 0/3] fixes for running the test suite with --valgrind

2017-10-04 Thread Jeff King
On Tue, Oct 03, 2017 at 07:41:54PM -0400, Jeff King wrote:

> I think using SANITIZE=memory would catch these, but it needs some
> suppressions tuning. The weird "zlib reads uninitialized memory" error
> is a problem (valgrind sees this, too, but we have suppressions).

I dug into this a little more. You can blacklist certain functions from
getting MSan treatment, but that's not quite what we want. We want to
mark bytes from certain _sources_ as being initialized, even if MSan
doesn't agree.

And indeed, you can do that. As far as I can tell, MSan works by keeping
a shadow map of memory and setting flags when it believes it has been
initialized, and then checking that map when we make decisions based on
the memory. But it can only do that if it instruments all writes. So the
MSan documentation recommends that you build _everything_, including
libraries, with it. Which obviously we don't do if we're using a system
zlib. Or a system libc for that matter (though they intercept many
common libc functions to handle this).

So one strategy is to "cheat" a bit at the library interfaces, and claim
whatever they send us is properly initialized. The patch below tries
that with zlib, and it does seem to work. It would fail to notice a real
problem with any input we send _to_ the library (since the library isn't
instrumented, and we claim that whatever comes out of it is legitimate).
I could probably live with that.

But there are quite a few test failures that would still need
investigating and annotating:

  - Certainly it's confused by looking at regmatch_t results from
regexec(). We can fix that by building with NO_REGEX. But pcre has
a similar problem.

  - Ditto curl and openssl, whose exit points would need annotations.

  - For some reason test-sigchain segfaults when it raise()s in the
signal handler and recurses. Not sure if this is an MSan bug or
what.

So I dunno. This approach is a _lot_ more convenient than trying to
rebuild all the dependencies from scratch, and it runs way faster than
valgrind. It did find the cases that led to the patches in this
series, and at least one more: if the lstat() at the end of
entry.c:write_entry() fails, we write nonsense into the cache_entry.

I think we could probably get it to zero false positives without _too_
much effort. I'll stop here for tonight, but I may pick it up again
later (of course anybody else is welcome to fool around with it, too).

Below is the patch that let me run:

  make SANITIZE=memory CC=clang-6.0 NO_REGEX=1

and get a tractable number of errors.

-- >8 --
diff --git a/Makefile b/Makefile
index b143e4eea3..1da5c01211 100644
--- a/Makefile
+++ b/Makefile
@@ -1047,6 +1047,9 @@ endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
 endif
+ifneq ($(filter memory,$(SANITIZERS)),)
+BASIC_CFLAGS += -DENABLE_MSAN_UNPOISON
+endif
 endif
 
 ifndef sysconfdir
diff --git a/git-compat-util.h b/git-compat-util.h
index cedad4d581..836a4c0b54 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1191,4 +1191,11 @@ extern void unleak_memory(const void *ptr, size_t len);
 #define UNLEAK(var) do {} while (0)
 #endif
 
+#ifdef ENABLE_MSAN_UNPOISON
+#include 
+#define msan_unpoison(ptr, len) __msan_unpoison(ptr, len)
+#else
+#define msan_unpoison(ptr, len) do {} while (0)
+#endif
+
 #endif
diff --git a/zlib.c b/zlib.c
index 4223f1a8c5..5fa8f12507 100644
--- a/zlib.c
+++ b/zlib.c
@@ -56,6 +56,8 @@ static void zlib_post_call(git_zstream *s)
if (s->z.total_in != s->total_in + bytes_consumed)
die("BUG: total_in mismatch");
 
+   msan_unpoison(s->next_out, bytes_produced);
+
s->total_out = s->z.total_out;
s->total_in = s->z.total_in;
s->next_in = s->z.next_in;


Re: Is git am supposed to decode MIME?

2017-10-04 Thread Junio C Hamano
Florian Weimer  writes:

> The git am documentation talks about “mailboxes”.  I suppose these
> contain messages in Internet Mail syntax.  Is git am supposed to
> decode MIME?
>
> I'm asking because I have a message whose body is encoded as
> quoted-printable, but git am does not parse the patch contained in it.
>
> If git am is supposed to deal with this, I'll dig deeper and try to
> figure out where things go wrong.

The code to check should be in .  As its comment says,
the code was not designed to be a full MIME parser--we just have a
code that (empirically) works in practice on messages produced when
a patch is attached to a message via popular MUAs, not written from
the MIME RFC spec.

Thanks for your interest in making the world a better place ;-)
Very much appreciated.




Re: Is git am supposed to decode MIME?

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 10:44:31AM +0200, Florian Weimer wrote:

> The git am documentation talks about “mailboxes”.  I suppose these contain
> messages in Internet Mail syntax.  Is git am supposed to decode MIME?
> 
> I'm asking because I have a message whose body is encoded as
> quoted-printable, but git am does not parse the patch contained in it.
> 
> If git am is supposed to deal with this, I'll dig deeper and try to figure
> out where things go wrong.

Yes, it should. I just double-checked with the toy patch patch below,
and it correctly extracted the quoted-printable from the commit message
and patch, as well as in the headers.

-- >8 --
>From p...@peff.net Wed Oct  4 05:21:57 2017
Date: Wed, 4 Oct 2017 05:21:55 -0400
From: =?utf-8?Q?=C3=81ccented N=C3=A1me?= 
To: Jeff King 
Subject: [PATCH] add 8bit content
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

This commit message also has s=C3=B3me 8-bit characters which
will need qp-encoding.

Signed-off-by: =C3=81ccented N=C3=A1me 
---
 file | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/file b/file
index d95f3ad..d39c7fc 100644
--- a/file
+++ b/file
@@ -1 +1 @@
-content
+8-bit c=C3=B3ntent
--=20
2.14.2.1117.g65a3442612



Re: [PATCH 2/3] for-each-ref: let upstream/push optionally report merge name.

2017-10-04 Thread Junio C Hamano
Johannes Schindelin  writes:

> From: J Wyman 
>
> There are times when scripts want to know not only the name of the
> push branch on the remote, but also the name of the branch as known
> by the remote repository.
>
> A useful example of this is with the push command where
> `branch..merge` is useful as the  value. Example:
>
>   $ git push  :
>
> This patch offers the new suffix :merge for the push atom, allowing to
> show exactly that. Example:
>
>   $ cat .git/config
>   ...
>   [remote "origin"]
>   url = https://where.do.we.come/from
>   fetch = refs/heads/*:refs/remote/origin/*
>   [branch "master"]
>   remote = origin
>   merge = refs/heads/master
>   [branch "develop/with/topics"]
>   remote = origin
>   merge = refs/heads/develop/with/topics
>   ...
>
>   $ git for-each-ref \
>   --format='%(push) %(push:remote-ref)' \
>   refs/heads
>   refs/remotes/origin/master refs/heads/master
>   refs/remotes/origin/develop/with/topics refs/heads/develop/with/topics

Ah, now it is clear that we _need_ to figure out how to spell
"multi-word" modifier to the format specifiers, as "push:remote"
would not let us differenciate the products of 1/3 and 2/3 X-<.

> ---

We need two Signed-off-by: lines at the end, one by Wyman and then
another by you in this order.

> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt

Changes to this file in this patch looks sane.

> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1260,6 +1262,14 @@ static void fill_remote_ref_details(struct used_atom 
> *atom, const char *refname,
>   *s = xstrdup(remote);
>   else
>   *s = "";
> + } else if (atom->u.remote_ref.option == RR_REMOTE_REF) {
> + int explicit, for_push = starts_with(atom->name, "push");
> + const char *merge = remote_ref_for_branch(branch, for_push,
> +   );

Having multiple variables defined like this _and_ initialized with
anything other than a trivial constant, is somewhat hard to follow.
Can we split the above more like:

int explicit;
int for_push = starts_with(...);
const char *merge = remote_ref_for_branch(...);

Actually, if you did it this way, you do not even need "for_push":

int explicit;
const char *merge;

merge = remote_ref_for_branch(branch,
  starts_with(atom->name, "push"),
  );

which may be a lot easier to follow.

By the way, the spirit of parsing used atoms first before the "learn
what we care about this single ref" function like the above are
called repeatedly for each ref is because we want to hoist the
overhead of doing the constant computation like "does the atom's
name begins with 'push'?" out of the latter.  

Can we add a field in atom->u.remote_ref to precompute this bit so
that we do not even have to say "Are we doing this for push?  Let's
see if the atom's name begins with 'push'" each time this function
is called for a ref?  That makes this function a lot more in line
with the spirit of the design of the subsystem, I would think.

This comment probably applies equally to both 1/3 and this one.

> + if (explicit)
> + *s = xstrdup(merge);
> + else
> + *s = "";

Here is the same "who are we trying to help---users who want to know
where their push goes, or users who are debugging where the push
destination is defined?" question.  I do not have a strong opinion
either way, but I think giving the end result with fallback (i.e.
not nullifying when the result is not explicit) may be easier to use
by "push" users.

> diff --git a/remote.c b/remote.c
> index b220f0dfc61..2bdcfc280cd 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -675,6 +675,36 @@ const char *pushremote_for_branch(struct branch *branch, 
> int *explicit)
>   return remote_for_branch(branch, explicit);
>  }
>  
> +const char *remote_ref_for_branch(struct branch *branch, int for_push,
> +   int *explicit)
> +{
> + if (branch) {
> + if (!for_push) {
> + if (branch->merge_nr) {
> + if (explicit)
> + *explicit = 1;
> + return branch->merge_name[0];
> + }
> + } else {
> + const char *dst, *remote_name =
> + pushremote_for_branch(branch, NULL);
> + struct remote *remote = remote_get(remote_name);

Again,

const char *dst, *remote_name;
struct 

Re: [PATCH 1/3] for-each-ref: let upstream/push optionally report the remote name

2017-10-04 Thread Junio C Hamano
Junio C Hamano  writes:

>> +} else if (atom->u.remote_ref.option == RR_REMOTE_NAME) {
>> +int explicit;
>> +const char *remote = starts_with(atom->name, "push") ?
>> +pushremote_for_branch(branch, ) :
>> +remote_for_branch(branch, );
>
> I think "int explicit = 0;" is needed, as pushremote_for_branch()
> does seem to expect that the "explicit" return parameter is
> initialized by the caller.

Not really. I misread these helper functions.  pushremote_for_branch()
ends up calling remote_for_branch() which does clear, so we are OK
without caller initializing it.


Re: [PATCH v2] oidmap: map with OID as key

2017-10-04 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Oct 03, 2017 at 05:29:01PM -0700, Jonathan Tan wrote:
>
>> At this point I decided that I prefer the thin wrapper, but the "light
>> touch" (struct oidmap_entry, oidcmpfn(), oidmap_get() only) still
>> better than the status quo.
>
> OK. I can certainly live with that. And worst case, I suppose, is that a
> caller wants some underlying hashmap function and we just have to extend
> the oidmap API to include it. It's not like we're adding new hashmap
> functions willy-nilly.

OK, I think I can live with that, too.  I'll tentatively mark the
topic to be merged to 'next' but give it for a few days so that
others can stop me.

Thanks.


Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Junio C Hamano
Jeff King  writes:

>>  - pretty.c: delimit "%(trailers)" arguments with ","
>> 
>>  "git for-each-ref --format=..." learned a new format element,
>>  %(trailers), to show only the commit log trailer part of the log
>>  message.
>> 
>>  Will merge to 'next'.
>
> I think we want the first patch of this series to graduate before v2.15,
> even if the rest doesn't make it. It tweaks a new syntax introduced
> earlier in this cycle by jk/trailers-parse. If we ship without the
> tweak, then we'll have to support the colon-delimiter to remain
> backwards-compatible.

Yeah, thanks for reminding me.  I actually was hoping that this will
prove to be stable enough by the time -rc1 gets tagged, but yes, the
bottom one looks innocuous/safe enough and should be fast-tracked to
'master' soonish.

Thanks.


Is git am supposed to decode MIME?

2017-10-04 Thread Florian Weimer
The git am documentation talks about “mailboxes”.  I suppose these 
contain messages in Internet Mail syntax.  Is git am supposed to decode 
MIME?


I'm asking because I have a message whose body is encoded as 
quoted-printable, but git am does not parse the patch contained in it.


If git am is supposed to deal with this, I'll dig deeper and try to 
figure out where things go wrong.


Thanks,
Florian


Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 04:19:52PM +0900, Junio C Hamano wrote:

> I wanted to do v2.15-rc0 today but it seems that this has to slip,
> with too many topics still in flight in 'next' among which many are
> fixes, one is even a fix for a recent regression.

You probably already figured this out, but I want to be explicit just in
case:

> * tb/show-trailers-in-ref-filter (2017-10-02) 7 commits
>  - ref-filter.c: parse trailers arguments with %(contents) atom
>  - ref-filter.c: use trailer_opts to format trailers
>  - t6300: refactor %(trailers) tests
>  - doc: use "``"-style quoting for literal strings
>  - doc: 'trailers' is the preferred way to format trailers
>  - t4205: unfold across multiple lines
>  - pretty.c: delimit "%(trailers)" arguments with ","
> 
>  "git for-each-ref --format=..." learned a new format element,
>  %(trailers), to show only the commit log trailer part of the log
>  message.
> 
>  Will merge to 'next'.

I think we want the first patch of this series to graduate before v2.15,
even if the rest doesn't make it. It tweaks a new syntax introduced
earlier in this cycle by jk/trailers-parse. If we ship without the
tweak, then we'll have to support the colon-delimiter to remain
backwards-compatible.

-Peff


Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 04:02:12AM -0400, Jeff King wrote:

> On Wed, Oct 04, 2017 at 04:19:52PM +0900, Junio C Hamano wrote:
> 
> > * jt/partial-clone-lazy-fetch (2017-10-02) 18 commits
> [...]
> 
> The merge of this topic into jch (at 766f92478b0) causes the test suite
> to fail when compiled with ASan/UBSan. The simplest reproduction I could
> come up with is:
> 
>$ make SANITIZE=address,undefined BLK_SHA1=1 &&
>  GIT_DIR=nope ./git shortlog /dev/null
>repository.c:69:31: runtime error: index 1869098813 out of bounds for type 
> 'git_hash_algo [1]'
> 
> Note that the series is fine by itself, it's only the merge which fails.
> Which implies to me it's some funny interaction with bc/hash-algo (which
> introduces the hash_algo concept). But I didn't dig further.
> 
> +cc brian and Jonathan.

Actually, I take it back. Bisecting between "master" and "pu" turned up
that commit (and I verified that it fails at that commit but not at its
first parent).

But if I go directly to the tip of bc/hash-algo, I can see the failure
there. And it bisects to 67a9dfcc00 (hash-algo: integrate hash algorithm
support with repo setup, 2017-08-21).

So I think it's a read of uninitialized data, and it may come and go as
various topics tickle the compiler differently. And that would make
jt/partial-clone-lazy-fetch a red herring.

This seems to make it go away (on top of 67a9dfcc00):

diff --git a/setup.c b/setup.c
index 289e24811c..310afe9736 100644
--- a/setup.c
+++ b/setup.c
@@ -1126,7 +1126,9 @@ const char *setup_git_directory_gently(int *nongit_ok)
repo_set_gitdir(the_repository, gitdir);
setup_git_env();
}
-   repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+   /* This is only valid if we actually found a repository */
+   if (startup_info->have_repository)
+   repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
}
 
strbuf_release();

The problem is that we may not have a real repository at all, but we
still hit this code path, which tries to setup repository properties, if
there's a $GIT_DIR set.

I actually suspect this repo_set_hash_algo() line should go into
check_repository_format_gently(), where we set other globals (and which
should eventually take a pointer to struct repository rather than
touching the_repository).

But I'm sufficiently convinced now that this is just a bug in the
RFC-marked bc/hash-algo topic, and in no danger of graduating to master
soon.

-Peff


Re: Git for Windows: mingw.c's strange usage of creation time as ctime?

2017-10-04 Thread Marc Strapetz

Hello Jonathan,

On 04.10.2017 04:23, Jonathan Nieder wrote:

+git-for-windows
Hi Marc,

Marc Strapetz wrote:


compat/mingw.c assigns the Windows file creation time [1] to Git's
ctime fields at line 591 and at line 705:

buf->st_ctime = filetime_to_time_t(&(fdata.ftCreationTime));

ftCreationTime > ftLastWriteTime is actually possible after copying
a file, so it makes sense to include this timestamp somehow in the
Index, but I think it would be better to use the maximum of
ftLastWriteTime and ftCreationTime here which is less confusing and
closer to Unix's ctime:

buf->st_ctime = max(buf->st_mtime,
 filetime_to_time_t(&(fdata.ftCreationTime));


Can you say more about the practical effect?  Is this causing a bug in
practice, is it a bug waiting to happen, or is it making the code
difficult to understand without any ill effect expected at run time?


It's mainly about understanding: as a Windows guy it is quite tempting 
to read "ctime" as "creationTime" and getting this confirmed in code 
above. (I guess) based on this understanding I wrote wrong Java code 8 
years ago for all platforms. My bad of course that I was not digging 
into Unix sources here.


In practice, .git/index timestamps will become more similar to Linux, 
getting rid of "strange" old ctime values.


Drawback will be that after such a code update every Index entry will 
appear as outdated and the initial "git status" will have to compare all 
file contents. If this is not acceptable, a comment for the two 
offending lines would make sense.



By the way, do you have core.trustctime set to true or false?


It seems to be set to true by default on Windows and usually I'm not 
changing it.


-Marc





Re: What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Jeff King
On Wed, Oct 04, 2017 at 04:19:52PM +0900, Junio C Hamano wrote:

> * jt/partial-clone-lazy-fetch (2017-10-02) 18 commits
>  - fetch-pack: restore save_commit_buffer after use
>  - unpack-trees: batch fetching of missing blobs
>  - clone: configure blobmaxbytes in created repos
>  - clone: support excluding large blobs
>  - fetch: support excluding large blobs
>  - fetch: refactor calculation of remote list
>  - fetch-pack: support excluding large blobs
>  - pack-objects: support --blob-max-bytes
>  - pack-objects: rename want_.* to ignore_.*
>  - gc: do not repack promisor packfiles
>  - rev-list: support termination at promisor objects
>  - sha1_file: support lazily fetching missing objects
>  - introduce fetch-object: fetch one promisor object
>  - index-pack: refactor writing of .keep files
>  - fsck: support promisor objects as CLI argument
>  - fsck: support referenced promisor objects
>  - fsck: support refs pointing to promisor objects
>  - fsck: introduce partialclone extension
> 
>  A journey for "git clone" and "git fetch" to become "lazier" by
>  depending more on its remote repository---this is the beginning of
>  it.
> 
>  Needs review.

The merge of this topic into jch (at 766f92478b0) causes the test suite
to fail when compiled with ASan/UBSan. The simplest reproduction I could
come up with is:

   $ make SANITIZE=address,undefined BLK_SHA1=1 &&
 GIT_DIR=nope ./git shortlog /dev/null
   repository.c:69:31: runtime error: index 1869098813 out of bounds for type 
'git_hash_algo [1]'

Note that the series is fine by itself, it's only the merge which fails.
Which implies to me it's some funny interaction with bc/hash-algo (which
introduces the hash_algo concept). But I didn't dig further.

+cc brian and Jonathan.

-Peff


Re: [PATCH v2] oidmap: map with OID as key

2017-10-04 Thread Jeff King
On Tue, Oct 03, 2017 at 05:29:01PM -0700, Jonathan Tan wrote:

> At this point I decided that I prefer the thin wrapper, but the "light
> touch" (struct oidmap_entry, oidcmpfn(), oidmap_get() only) still
> better than the status quo.

OK. I can certainly live with that. And worst case, I suppose, is that a
caller wants some underlying hashmap function and we just have to extend
the oidmap API to include it. It's not like we're adding new hashmap
functions willy-nilly.

-Peff


What's cooking in git.git (Oct 2017, #01; Wed, 4)

2017-10-04 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

I wanted to do v2.15-rc0 today but it seems that this has to slip,
with too many topics still in flight in 'next' among which many are
fixes, one is even a fix for a recent regression.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ad/doc-markup-fix (2017-09-29) 1 commit
  (merged to 'next' on 2017-10-02 at fdf600e8f3)
 + doc: correct command formatting

 Docfix.


* bc/rev-parse-parseopt-fix (2017-09-25) 7 commits
  (merged to 'next' on 2017-09-26 at f3e013eaa4)
 + parse-options: only insert newline in help text if needed
 + parse-options: write blank line to correct output stream
 + t0040,t1502: Demonstrate parse_options bugs
  (merged to 'next' on 2017-09-24 at e479bce0ff)
 + git-rebase: don't ignore unexpected command line arguments
 + rev-parse parseopt: interpret any whitespace as start of help text
 + rev-parse parseopt: do not search help text for flag chars
 + t1502: demonstrate rev-parse --parseopt option mis-parsing

 Recent versions of "git rev-parse --parseopt" did not parse the
 option specification that does not have the optional flags (*=?!)
 correctly, which has been corrected.


* bw/git-clang-format (2017-10-01) 1 commit
  (merged to 'next' on 2017-10-02 at ee84c89b07)
 + clang-format: adjust line break penalties
 (this branch is used by sb/git-clang-format.)

 Adjust clang-format penalty parameters.


* hn/path-ownership-comment (2017-09-27) 2 commits
  (merged to 'next' on 2017-09-28 at 7cf8481c29)
 + read_gitfile_gently: clarify return value ownership.
 + real_path: clarify return value ownership

 Add comment to a few functions that use a short-lived buffer the
 caller can peek and copy out of.


* hn/string-list-doc (2017-09-27) 1 commit
  (merged to 'next' on 2017-09-28 at ffc8f65f4a)
 + string-list.h: move documentation from Documentation/api/ into header

 Doc reorg.


* hn/submodule-comment (2017-09-26) 1 commit
  (merged to 'next' on 2017-09-28 at a071ca3fde)
 + submodule.c: describe submodule_to_gitdir() in a new comment


* jk/no-optional-locks (2017-09-27) 1 commit
  (merged to 'next' on 2017-09-28 at eaffe9c638)
 + git: add --no-optional-locks option

 Some commands (most notably "git status") makes an opportunistic
 update when performing a read-only operation to help optimize later
 operations in the same repository.  The new "--no-optional-locks"
 option can be passed to Git to disable them.


* jk/read-in-full (2017-09-27) 7 commits
  (merged to 'next' on 2017-09-28 at 9d109b9233)
 + worktree: check the result of read_in_full()
 + worktree: use xsize_t to access file size
 + distinguish error versus short read from read_in_full()
 + avoid looking at errno for short read_in_full() returns
 + prefer "!=" when checking read_in_full() result
 + notes-merge: drop dead zero-write code
 + files-backend: prefer "0" for write_in_full() error check

 Code clean-up to prevent future mistakes by copying and pasting
 code that checks the result of read_in_full() function.


* jk/validate-headref-fix (2017-09-27) 3 commits
  (merged to 'next' on 2017-09-28 at dcea9d16f9)
 + validate_headref: use get_oid_hex for detached HEADs
 + validate_headref: use skip_prefix for symref parsing
 + validate_headref: NUL-terminate HEAD buffer

 Code clean-up.


* js/rebase-i-final (2017-07-27) 10 commits
  (merged to 'next' on 2017-09-26 at ea3f8f5e11)
 + rebase -i: rearrange fixup/squash lines using the rebase--helper
 + t3415: test fixup with wrapped oneline
 + rebase -i: skip unnecessary picks using the rebase--helper
 + rebase -i: check for missing commits in the rebase--helper
 + t3404: relax rebase.missingCommitsCheck tests
 + rebase -i: also expand/collapse the SHA-1s via the rebase--helper
 + rebase -i: do not invent onelines when expanding/collapsing SHA-1s
 + rebase -i: remove useless indentation
 + rebase -i: generate the script via rebase--helper
 + t3415: verify that an empty instructionFormat is handled as before

 The final batch to "git rebase -i" updates to move more code from
 the shell script to C.


* mh/mmap-packed-refs (2017-09-25) 21 commits
  (merged to 'next' on 2017-09-29 at 3639417666)
 + packed-backend.c: rename a bunch of things and update comments
 + mmapped_ref_iterator: inline into `packed_ref_iterator`
 + ref_cache: remove support for storing peeled values
 + packed_ref_store: get rid of the `ref_cache` entirely
 + ref_store: implement `refs_peel_ref()` generically
 + packed_read_raw_ref(): read the reference from the mmapped buffer
 + packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
 + 

Re: [PATCH 1/3] for-each-ref: let upstream/push optionally report the remote name

2017-10-04 Thread Junio C Hamano
Johannes Schindelin  writes:

> This patch offers the new suffix :remote for the upstream and for the
> push atoms, allowing to show exactly that.

Has the design changed since this description and examples were
written?  The documentation talks about ":remote-name".  I suspect
calling this ":remote" might be less hassle, as we do not have to
vet the current vocabulary to see if "remote-name" is the right way
to name a multi-word modifer (as opposed to "remotename", or
"remoteName", or "remote_name", or...).

>   ...
>
>   $ git for-each-ref \
>   --format='%(upstream) %(upstream:remote) %(push:remote)' \
>   refs/heads/master

> diff --git a/Documentation/git-for-each-ref.txt 
> b/Documentation/git-for-each-ref.txt
> index 66b4e0a4050..776368ee531 100644
> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt
> @@ -140,17 +140,19 @@ upstream::

Aside from "':remote-name'?  Do we really want that?" issue I
already mentioned, the changes to this file look sensible.

> diff --git a/ref-filter.c b/ref-filter.c
> index bc591f4f3de..58d53c09390 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -76,7 +76,9 @@ static struct used_atom {
>   char color[COLOR_MAXLEN];
>   struct align align;
>   struct {
> - enum { RR_REF, RR_TRACK, RR_TRACKSHORT } option;
> + enum {
> + RR_REF, RR_TRACK, RR_TRACKSHORT, RR_REMOTE_NAME
> + } option;
>   struct refname_atom refname;
>   unsigned int nobracket : 1;
>   } remote_ref;
> @@ -156,6 +158,8 @@ static void remote_ref_atom_parser(const struct 
> ref_format *format, struct used_
>   atom->u.remote_ref.option = RR_TRACKSHORT;
>   else if (!strcmp(s, "nobracket"))
>   atom->u.remote_ref.nobracket = 1;
> + else if (!strcmp(s, "remote-name"))
> + atom->u.remote_ref.option = RR_REMOTE_NAME;
>   else {
>   atom->u.remote_ref.option = RR_REF;
>   
> refname_atom_parser_internal(>u.remote_ref.refname,
> @@ -1247,6 +1251,15 @@ static void fill_remote_ref_details(struct used_atom 
> *atom, const char *refname,
>   *s = ">";
>   else
>   *s = "<>";
> + } else if (atom->u.remote_ref.option == RR_REMOTE_NAME) {
> + int explicit;
> + const char *remote = starts_with(atom->name, "push") ?
> + pushremote_for_branch(branch, ) :
> + remote_for_branch(branch, );

I think "int explicit = 0;" is needed, as pushremote_for_branch()
does seem to expect that the "explicit" return parameter is
initialized by the caller.

> + if (explicit)
> + *s = xstrdup(remote);
> + else
> + *s = "";

This one is debatable.  For the user of "push" who wants to learn
where the branch is pushed to, the choice this patch makes is not
useful (i.e. such a user does not care where the definition comes
from; s/he only wants to know where the push goes).  For the user of
"git config" who wants to know how pushremote is configured, and
does not want to accept the fallback to remote, the behaviour of
this patch is useful.  I have a mild suspicion that majority of
users who would want to use this feature would fall into the former
category, i.e. users of "push", rather than the latter, i.e.
debuggers of their config files, and if that is true, we can drop
the whole "explicit" business to simplify the code and get a more
useful behaviour at the same time ;-)

> @@ -1364,9 +1377,13 @@ static void populate_value(struct ref_array_item *ref)
>   continue;
>   branch = branch_get(branch_name);
>  
> - refname = branch_get_push(branch, NULL);
> - if (!refname)
> - continue;
> + if (starts_with(name, "push:remote-"))

I am not sure what is going on here.  

Are we expecting "push:remote-name" here, or anticipating
"push:remote-bar" can also be given or what?

> + refname = NULL;
> + else {
> + refname = branch_get_push(branch, NULL);
> + if (!refname)
> + continue;
> + }
>   fill_remote_ref_details(atom, refname, branch, >s);
>   continue;
>   } else if (starts_with(name, "color:")) {


Re: [PATCH v8 00/12] Fast git status via a file system watcher

2017-10-04 Thread Alex Vandiver
On Wed, 4 Oct 2017, Junio C Hamano wrote:
> Rats indeed.  Let's go incremental as promised, perhaps like this
> (but please supply a better description if you have one).

I think you'll also want the following squashed into 5c8cdcfd8 and
def437671:

-- >8 --
>From 445d45027bb5b7823338cf111910d2884af6318b Mon Sep 17 00:00:00 2001
From: Alex Vandiver 
Date: Tue, 3 Oct 2017 23:27:46 -0700
Subject: [PATCH] fsmonitor: Read entirety of watchman output

In perl, setting $/ sets the string that is used as the "record
separator," which sets the boundary that the `<>` construct reads to.
Setting `local $/ = 0666;` evaluates the octal, getting 438, and
stringifies it.  Thus, the later read from `` stops as soon
as it encounters the string "438" in the watchman output, yielding
invalid JSON; repositories containing filenames with SHA1 hashes are
able to trip this easily.

Set `$/` to undefined, thus slurping all output from watchman.  Also
close STDIN which is provided to watchman, to better guarantee that we
cannot deadlock with watchman while both attempting to read.

Signed-off-by: Alex Vandiver 
---
 t/t7519/fsmonitor-watchman | 6 ++
 templates/hooks--fsmonitor-watchman.sample | 6 ++
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 7ceb32dc1..7d6aef635 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -50,9 +50,6 @@ launch_watchman();
 
 sub launch_watchman {
 
-   # Set input record separator
-   local $/ = 0666;
-
my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
@@ -83,7 +80,8 @@ sub launch_watchman {
close $fh;
 
print CHLD_IN $query;
-   my $response = ;
+   close CHLD_IN;
+   my $response = do {local $/; };
 
open ($fh, ">", ".git/watchman-response.json");
print $fh $response;
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index 870a59d23..1b8ed173e 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -49,9 +49,6 @@ launch_watchman();
 
 sub launch_watchman {
 
-   # Set input record separator
-   local $/ = 0666;
-
my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
@@ -78,7 +75,8 @@ sub launch_watchman {
END
 
print CHLD_IN $query;
-   my $response = ;
+   close CHLD_IN;
+   my $response = do {local $/; };
 
die "Watchman: command returned no output.\n" .
"Falling back to scanning...\n" if $response eq "";
-- 
2.14.2.959.g6663358d3


Re: [PATCH 2/3] http-push: fix construction of hex value from path

2017-10-04 Thread Junio C Hamano
Jeff King  writes:

> More seriously, is there any interest in marking it as deprecated in the
> release notes and issuing a warning when it's used for a few cycles?

No objection from me.

I do not object with such a well designed deprecation plan for any
other features, if there are other "favourite" ones people would
want to deprecate and eventually remove, by the way.


Re: [PATCH v3 00/10] protocol transition

2017-10-04 Thread Junio C Hamano
Thanks.  All of my review comments from the previous round seem to
have been addressed, so this is Reviewed-by: me ;-)



Re: [PATCH v3 4/5] sha1_name: Parse less while finding common prefix

2017-10-04 Thread Junio C Hamano
Derrick Stolee  writes:

> diff --git a/sha1_name.c b/sha1_name.c
> index f2a1ebe49..5081aeb71 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -480,13 +480,23 @@ struct min_abbrev_data {
>   char *hex;
>  };
>  
> +static inline char get_hex_char_from_oid(const struct object_id *oid,
> +  int pos)
> +{
> + static const char hex[] = "0123456789abcdef";
> +
> + if ((pos & 1) == 0)
> + return hex[oid->hash[pos >> 1] >> 4];
> + else
> + return hex[oid->hash[pos >> 1] & 0xf];
> +}
> +


>  static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
>  {
>   struct min_abbrev_data *mad = cb_data;
>  
> - char *hex = oid_to_hex(oid);
>   unsigned int i = mad->init_len;
> - while (mad->hex[i] && mad->hex[i] == hex[i])
> + while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i))
>   i++;

Assuming that [PATCH 3/5] makes sense, it is an obvious optimization
to avoid writing the whole 20-byte out before comparing, and instead
to grab hex digits as they become needed.

I assume that the "Base Time" in the log message was with whatever
version of Git before [PATCH 3/5] and this one were applied
(i.e. not comparing to "vanilla Git plus 3/5")?

Thanks.


Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Junio C Hamano
Derrick Stolee  writes:

> On 10/3/2017 6:49 AM, Junio C Hamano wrote:
>> Derrick Stolee  writes:
>>
>>> p0008.1: find_unique_abbrev() for existing objects
>>> --
>>>
>>> For 10 repeated tests, each checking 100,000 known objects, we find the
>>> following results when running in a Linux VM:
>>>
>>> |   | Pack  | Packed  | Loose  | Base   | New| |
>>> | Repo  | Files | Objects | Objects| Time   | Time   | Rel%|
>>> |---|---|-||||-|
>>> | Git   | 1 |  230078 |  0 | 0.09 s | 0.06 s | - 33.3% |
>>> | Git   | 5 |  230162 |  0 | 0.11 s | 0.08 s | - 27.3% |
>>> | Git   | 4 |  154310 |  75852 | 0.09 s | 0.07 s | - 22.2% |
>>> | Linux | 1 | 5606645 |  0 | 0.12 s | 0.32 s | +146.2% |
>>> | Linux |24 | 5606645 |  0 | 1.12 s | 1.12 s | -  0.9% |
>>> | Linux |23 | 5283204 | 323441 | 1.08 s | 1.05 s | -  2.8% |
>>> | VSTS  | 1 | 4355923 |  0 | 0.12 s | 0.23 s | + 91.7% |
>>> | VSTS  |32 | 4355923 |  0 | 1.02 s | 1.08 s | +  5.9% |
>>> | VSTS  |31 | 4276829 |  79094 | 2.25 s | 2.08 s | -  7.6% |
>>
>> The above does not look so good, especially in cases where a
>> repository is well maintained by packing into smaller number of
>> packs, we get much worse result?
>
> I understand that this patch on its own does not have good numbers. I
> split the
> patches 3 and 4 specifically to highlight two distinct changes:
>
> Patch 3: Unroll the len loop that may inspect all files multiple times.
> Patch 4: Parse less while disambiguating.
>
> Patch 4 more than makes up for the performance hits in this patch.

Now you confused me even more.  When we read the similar table that
appears in [Patch 4/5], what does the "Base Time" column mean?
Vanilla Git with [Patch 3/5] applied?  Vanillay Git with [Patch 4/5]
alone applied?  Something else?


Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

2017-10-04 Thread Junio C Hamano
Derrick Stolee  writes:

> -int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
> +struct min_abbrev_data {
> + unsigned int init_len;
> + unsigned int cur_len;
> + char *hex;
> +};
> +
> +static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
>  {
> - int status, exists;
> + struct min_abbrev_data *mad = cb_data;
> +
> + char *hex = oid_to_hex(oid);
> + unsigned int i = mad->init_len;
> + while (mad->hex[i] && mad->hex[i] == hex[i])
> + i++;
> +
> + if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
> + mad->cur_len = i + 1;
> +
> + return 0;
> +}

The idea is to keep the target to be abbreviated in mad->hex[], and
as the two functions find_short_{object_filename,packed_object}
discover objects whose first 'len' letters of hexname are the same
as the target, they'd call into the "always_call_fn" callback, which
is this function, and it keeps track of the maximum of the common
prefix it has seen.  Which makes sense.  Well, sort of.

> - exists = has_sha1_file(sha1);
> - while (len < GIT_SHA1_HEXSZ) {
> - struct object_id oid_ret;
> - status = get_short_oid(hex, len, _ret, GET_OID_QUIETLY);
> - if (exists
> - ? !status
> - : status == SHORT_NAME_NOT_FOUND) {
> - hex[len] = 0;
> - return len;
> - }
> - len++;
> - }
> - return len;

The "always_call_fn" thing is a big sledgehammer that overrides
everything else in update_candidates().  It bypasses the careful
machinery set up to avoid having to open ambiguous object to learn
their types as much as possible.  One narrow exception when it is OK
to use is if we never limit our candidates with type.

And it might appear that the conversion is safe (if only because we
do not see any type limitation in the get_short_oid() call above),
but I think there is one case where this patch changes the
behaviour: what happens if core.disambiguate was set to anything
other than "none"?  The new code does not know anything about type
based filtering, so it can end up reporting longer abbreviation than
it was asked to produce.  It may not be a problem in practice, though.

I am not sure if setting core.disambiguate is generally a good idea
in the first place, and if it is OK to break find_unique_abbrev()
with respect to the configuration variable like this patch does.

I'd feel safe if we get extra input from Peff, who introduced the
feature in 5b33cb1f ("get_short_sha1: make default disambiguation
configurable", 2016-09-27).

> +
> + if (init_object_disambiguation(hex, len, ) < 0)
> + return -1;
> +
> + mad.init_len = len;
> + mad.cur_len = len;
> + mad.hex = hex;
> +
> + ds.fn = extend_abbrev_len;
> + ds.always_call_fn = 1;
> + ds.cb_data = (void*)
> +
> + find_short_object_filename();
> + find_short_packed_object();
> + (void)finish_object_disambiguation(, _ret);
> +
> + hex[mad.cur_len] = 0;
> + return mad.cur_len;
>  }
>  
>  const char *find_unique_abbrev(const unsigned char *sha1, int len)