difflame improvements

2017-02-14 Thread Edmundo Carmona Antoranz
Hi!

I've been working on detecting revisions where a "real" deletion was
made and I think I advanced a lot in that front. I still have to work
on many scenarios (renamed files, for example... also performance) but
at least I'm using a few runs against git-scm history and the results
are "promising":

23:05 $ git blame -s --reverse -L 25,40 HEAD~20..HEAD -- versioncmp.c
066fb0494e 25) static int initialized;
066fb0494e 26)
066fb0494e 27) /*
8ec68d1ae2 28)  * p1 and p2 point to the first different character in
two strings. If
8ec68d1ae2 29)  * either p1 or p2 starts with a prerelease suffix, it
will be forced
8ec68d1ae2 30)  * to be on top.
8ec68d1ae2 31)  *
8ec68d1ae2 32)  * If both p1 and p2 start with (different) suffix, the order is
8ec68d1ae2 33)  * determined by config file.
066fb0494e 34)  *
8ec68d1ae2 35)  * Note that we don't have to deal with the situation
when both p1 and
8ec68d1ae2 36)  * p2 start with the same suffix because the common
part is already
8ec68d1ae2 37)  * consumed by the caller.
066fb0494e 38)  *
066fb0494e 39)  * Return non-zero if *diff contains the return value
for versioncmp()
066fb0494e 40)  */

Lines 28-33:

23:05 $ git show --summary 8ec68d1ae2
commit 8ec68d1ae2863823b74d67c5e92297e38bbf97bc
Merge: e801be066 c48886779
Author: Junio C Hamano <>
Date:   Mon Jan 23 15:59:21 2017 -0800

Merge branch 'vn/diff-ihc-config'

"git diff" learned diff.interHunkContext configuration variable
that gives the default value for its --inter-hunk-context option.

* vn/diff-ihc-config:
  diff: add interhunk context config option



And this is not telling me the _real_ revision where the lines were
_deleted_ so it's not very helpful, as Peff has already mentioned.

Running difflame:

23:06 $ time ~/proyectos/git/difflame/difflame.py -bp=-s -w HEAD~20
HEAD -- versioncmp.c
diff --git a/versioncmp.c b/versioncmp.c
index 80bfd109f..9f81dc106 100644
--- a/versioncmp.c
+++ b/versioncmp.c
@@ -24,42 +24,83 @@
.
.
.
+b17846432d  33) static void find_better_matching_suffix(const char
*tagname, const char *suffix,
+b17846432d  34)int
suffix_len, int start, int conf_pos,
+b17846432d  35)struct
suffix_match *match)
+b17846432d  36) {
b17846432d  37)/*
   c026557a3 versioncmp: generalize version sort suffix reordering
-c026557a3 (SZEDER 28)  * p1 and p2 point to the first different
character in two strings. If
-c026557a3 (SZEDER 29)  * either p1 or p2 starts with a prerelease
suffix, it will be forced
-c026557a3 (SZEDER 30)  * to be on top.
-c026557a3 (SZEDER 31)  *
-c026557a3 (SZEDER 32)  * If both p1 and p2 start with (different)
suffix, the order is
-c026557a3 (SZEDER 33)  * determined by config file.
   b17846432 versioncmp: factor out helper for suffix matching
+b17846432d  38) * A better match either starts earlier or
starts at the same offset
+b17846432d  39) * but is longer.
+b17846432d  40) */
+b17846432d  41)int end = match->len < suffix_len ?
match->start : match->start-1;
.
.
.

Same range of (deleted) lines:

23:10 $ git --show --name-status c026557a3
commit c026557a37361b7019acca28f240a19f546739e9
Author: SZEDER Gábor <>
Date:   Thu Dec 8 15:24:01 2016 +0100

   versioncmp: generalize version sort suffix reordering

   The 'versionsort.prereleaseSuffix' configuration variable, as its name
   suggests, is supposed to only deal with tagnames with prerelease
.
.
.


   Signed-off-by: SZEDER Gábor <>
   Signed-off-by: Junio C Hamano <>

M   Documentation/config.txt
M   Documentation/git-tag.txt
M   t/t7004-tag.sh
M   versioncmp.c


This is the revision where the deletion happened.

That's it for the time being.


Re: [PATCH v2 2/2] completion: checkout: complete paths when ref given

2017-02-14 Thread SZEDER Gábor
On Tue, Feb 14, 2017 at 10:24 PM,   wrote:
> From: Cornelius Weig 
>
> Git-checkout completes words starting with '--' as options and other
> words as refs. Even after specifying a ref, further words not starting
> with '--' are completed as refs, which is invalid for git-checkout.

Refs completion is never attempted for words after the disambiguating
double-dash.

Even when refs completion is attempted, if it is unsuccessful, i.e.
there is no ref that matches the current word to be completed, then
Bash falls back to standard filename completion.  No refs match
'./'.

Furthermore, Bash performs filename completion on Alt-/ independently
from any completion function.

Granted, none of these will limit to only modified files...  But that
might be a good thing, see below.

> This commit ensures that after specifying a ref, further non-option
> words are completed as paths. Four cases are considered:
>
>  - If the word contains a ':', do not treat it as reference and use
>regular revlist completion.
>  - If no ref is found on the command line, complete non-options as refs
>as before.
>  - If the ref is HEAD or @, complete only with modified files because
>checking out unmodified files is a noop.

Here you use "modified" in the 'ls-files --modified' sense, but that
doesn't include modifications already staged in the index, see below.

>This case also applies if no ref is given, but '--' is present.
>  - If a ref other than HEAD or @ is found, offer only valid paths from
>that revision.
>
> Note that one corner-case is not covered by the current implementation:
> if a refname contains a ':' and is followed by '--' the completion would
> not recognize the valid refname.

I'm not sure what you mean here.  Refnames can't contain ':'.

> Signed-off-by: Cornelius Weig 
> ---
>  contrib/completion/git-completion.bash | 39 
> +++---
>  1 file changed, 31 insertions(+), 8 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 4ab119d..df46f62 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1068,7 +1068,7 @@ _git_bundle ()
>
>  _git_checkout ()
>  {
> -   __git_has_doubledash && return
> +   local i c=2 ref="" seen_double_dash=""
>
> case "$cur" in
> --conflict=*)
> @@ -1081,13 +1081,36 @@ _git_checkout ()
> "
> ;;
> *)
> -   # check if --track, --no-track, or --no-guess was specified
> -   # if so, disable DWIM mode
> -   local flags="--track --no-track --no-guess" track=1
> -   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
> -   track=''
> -   fi
> -   __gitcomp_nl "$(__git_refs '' $track)"
> +   while [ $c -lt $cword ]; do
> +   i="${words[c]}"
> +   case "$i" in
> +   --) seen_double_dash=1 ;;
> +   -*|?*:*) ;;
> +   *) ref="$i"; break ;;

I haven't tried it, but this would trigger on e.g. 'git checkout -b
new-feature ', wouldn't it?

> +   esac
> +   ((c++))
> +   done
> +
> +   case "$ref,$seen_double_dash,$cur" in
> +   ,,*:*)
> +   __git_complete_revlist_file
> +   ;;
> +   ,,*)
> +   # check for --track, --no-track, or --no-guess
> +   # if so, disable DWIM mode
> +   local flags="--track --no-track --no-guess" track=1
> +   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
> +   track=''
> +   fi
> +   __gitcomp_nl "$(__git_refs '' $track)"
> +   ;;
> +   ,1,*|@,*|HEAD,*)
> +   __git_complete_index_file "--modified"

What about

  $ echo "unintentional change" >>tracked-file && git add -u
  $ git rm important-file
  $ git checkout HEAD 

?  It seems it will offer neither 'tracked-file' nor 'important-file',
but I think it should offer both.


We have __git_complete_index_file() for a while now, but only use it
for commands that accept only --options and filenames, e.g. 'add',
'clean', 'rm'.  This would be the first case when we would use it for
a command that accept both refs and filenames.  Perhaps similar corner
cases and the easy ways to trigger filename completion are why no one
thought it's worth it.

> +   ;;
> +   *)
> +   __git_complete_tree_file "$ref" "$cur"

Well, here you could go all-in, and say that this should complete only
files that are different from the version in $ref, because checking
out files 

Re: [PATCH] completion: restore removed line continuating backslash

2017-02-14 Thread Junio C Hamano
SZEDER Gábor  writes:

>> If you feel uncomfortable and want these to cook longer, please tell
>> me so.
>
> Well, it was mainly my surprise that a 20+ patch series arriving so
> late that it gets queued on top of -rc0 would still make it into the
> release.

It all depends on what area the changes are about ;-)

Not meaning to make light of your contribution, but if the upcoming
version of Git shipped with a slightly broken completion, and the
breakage is severe enough, the only thing we need to do is to do a
maintenance release that reverts a single script.  Distro packagers
may do that for us.  While waiting, the user can choose not to load
the completion and can keep using Git just fine.  A broken
"mergetool" would be similar in that a workaround to inspect "git
diff" is available.  If we break "pull" or "commit" on the other
hand, the necessary workaround would be a lot more involved.

Some changes in low-impact areas can wait without reducing the
end-user value of the new release, but if the risk of regression is
small, I favour merging them, rather than postponing them for one
cycle, if only to reduce the number of patches that I need to hold
onto.  Patches to clarify existing documentation fall into this
category.

My perception of "risk of regression" obviously is affected by the
size of the series, and 20+ is certainly on the larger side.  But
other things also come into the picture.  Patches from an author who
knows the area the patches touch very well and with track record of
not causing embarrassing regressions, would make me feel safer than
patches from others, for example.



Re: [RFCv3 PATCH 00/14] Checkout aware of Submodules!

2017-02-14 Thread brian m. carlson
On Tue, Feb 14, 2017 at 04:34:09PM -0800, Stefan Beller wrote:
> Integrate updating the submodules into git checkout, with the same
> safety promises that git-checkout has, i.e. not throw away data unless
> asked to. This is done by first checking if the submodule is at the same
> sha1 as it is recorded in the superproject. If there are changes we stop
> proceeding the checkout just like it is when checking out a file that
> has local changes.
> 
> The integration happens in the code that is also used in other commands
> such that it will be easier in the future to make other commands aware
> of submodule.
> 
> This also solves d/f conflicts in case you replace a file/directory
> with a submodule or vice versa.
> 
> The patches are still a bit rough, but the overall series seems
> promising enough to me that I want to put it out here.
> 
> Any review, specifically on the design level welcome!

Overall, I'm very pleased with this.  I don't really have any
design-level comments at this point, only small nits.  I'm considering
building and testing it out at work, where this would be extremely
useful.  I'm therefore hoping to see how it works under real-world
conditions shortly.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 14/14] builtin/checkout: add --recurse-submodules switch

2017-02-14 Thread brian m. carlson
On Tue, Feb 14, 2017 at 04:34:23PM -0800, Stefan Beller wrote:
> +--[no-]recurse-submodules::
> + Using --recurse-submodules will update the content of all initialized
> + submodules according to the commit recorded in the superproject. If
> + local modifications in a submodule would be overwritten the checkout
> + will fail until `-f` is used. If nothing (or --no-recurse-submodules)

I would say "unless" instead of "until".  "Until" implies an ongoing
or repetitive action being interrupted, which isn't the case here.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 09/14] update submodules: add submodule_go_from_to

2017-02-14 Thread brian m. carlson
On Tue, Feb 14, 2017 at 04:34:18PM -0800, Stefan Beller wrote:
> + prepare_submodule_repo_env_no_git_dir(_array);
> +
> + cp.git_cmd = 1;
> + cp.no_stdin = 1;
> + cp.dir = path;
> +
> + argv_array_pushf(, "--super-prefix=%s/", path);
> + argv_array_pushl(, "read-tree", NULL);
> +
> + if (!dry_run)
> + argv_array_push(, "-u");
> + else
> + argv_array_push(, "-n");

I might write this as

if (dry_run)
argv_array_push(, "-n");
else
argv_array_push(, "-u");

In other words, avoiding the negation when you have an else branch.  I
can also see an argument for keeping the condition identical to the
other branches, though.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: Feature request: flagging “volatile” branches for integration/development

2017-02-14 Thread Junio C Hamano
"Herbert, Marc"  writes:

>> The hard part may be policy (e.g. what if the user does not want a branch
>> to be treated volatile by various commands even if it receives such
>> flag from a git server).
>
> There would be instant, human-readable value in such a new "volatile"
> flag. Machine use and policies can be discussed later. These will be
> easier to prototype, experiment and refine once the flag exists.

We tend to avoid adding random cruft to the system whose semantics
is not well defined, so that we can avoid having to support an ill
defined feature forever.

> ... Now I bet this on the other hand must have been
> discussed (and rejected?) before, any pointer?

I suspect that people may have expressed vague wish from time to
time, but I do not think we saw a proposal that outlines the design
at a concrete enough level to let us rejecting in the past ;-)

Let me list some things that needs to be designed that comes to my
mind offhand:

 * How would a user mark a ref as "volatile"?  I am assuming that
   anybody do so in her own local repository, but how does one mark
   a ref as "volatile" at a hosting site, and who is allowed to do
   so (one possibile design is "new option to 'git push' will do so,
   and anybody who can push to the ref is allowed to", and I am fine
   with that design, but you have to spell that out in a proposal)?

 * How would a user learn if a ref is "volatile"?  Does "ls-remote"
   show that information?

 * Does volatile-ness of a ref at the remote site propagate to your
   remote-tracking ref that corresponds to it?  What does it mean
   that refs/remotes/origin/pu is marked as volatile in your local
   repository?  You cannot "checkout -b" based on it?  Does "branch"
   based on it need to be forbidden as well?

 * Can a local ref be "volatile"?  What does it mean (the same
   subquestions as above)?

 * If your local branch myA is set to build on a remote-tracking
   branch A and push back to branch A at the remote, i.e.

$ git checkout -t -b myA refs/remotes/origin/A
$ ... work work work ...
$ git push

   is set to result in their branch A updated with what you built in
   myA, and if the branch A at the remote is marked as "volatile",
   does it make your "myA" also "volatile"?  How is the volatile-ness 
   inherited?  From their A to your remotes/origin/A and then to
   your myA?  Any other useful rule that defines the propagation?

 * Do we only care about "volatile"?  If we are extending the system
   to allow setting and propagating this new bit per ref (I am
   blindly assuming that you do not have a strong reason to limit
   this to branches), we may as well just design this extension to
   allow the projects to assign arbitrary set of bits to refs.  Some
   projects may want to have different degree of volatile-ness and
   have "volatile" refs, "unstable" refs and "iffy" refs, for
   example.

   Side note: even if we were to go with "any random label can be
   assigned and the meaning for the labels can be defined by the
   project convention" approach, it does not necessarily mean we are
   adding a random cruft whose semantics is ill-defined.  "Git does
   not do anything special to a ref based on what labels it has--it
   just remembers what labels the user told it to attach to the ref,
   and shows what labels the ref has when asked" can be very well
   defined semantics.



Re: [PATCH 01/14] lib-submodule-update.sh: reorder create_lib_submodule_repo

2017-02-14 Thread brian m. carlson
On Tue, Feb 14, 2017 at 04:34:10PM -0800, Stefan Beller wrote:
>  create_lib_submodule_repo () {
>   git init submodule_update_repo &&
>   (
> @@ -49,10 +54,11 @@ create_lib_submodule_repo () {
>   git config submodule.sub1.ignore all &&
>   git add .gitmodules &&
>   git commit -m "Add sub1" &&
> - git checkout -b remove_sub1 &&
> +
> + git checkout -b remove_sub1 add_sub1&&

You're missing a space before the "&&".
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2] clean: use warning_errno() when appropriate

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 05:28:36PM -0800, Junio C Hamano wrote:

> I care more about looking at errno ONLY after we saw our call to a
> system library function indicated an error, and I wanted to avoid
> doing:
> 
>   res = dry_run ? 0 : rmdir(path);
> saved_errno = errno;
>   if (res) {
>   ... do something else ...
>   errno = saved_errno;
> call_something_that_uses_errno();
> 
> When our call to rmdir() did not fail, or when we didn't even call
> rmdir() at all, what is in errno has nothing to do with what we are
> doing, and making a copy of it makes the code doubly confusing.
> 
> Rather, I'd prefer to see:
> 
>   res = dry_run ? 0 : rmdir(path);
>   if (res) {
> int saved_errno = errno;
>   ... do something else ...
>   errno = saved_errno;
> call_something_that_uses_errno();
> 
> which makes it clear what is going on when reading the resulting
> code.
> 
> For now, I'll queue a separate SQUASH??? and wait for others to
> comment.

I don't have a strong feeling either way, but I think your second
example there is probably preferable. The reason to save errno is
because of the "something else" that may affect it, and it puts the
saving close to that.

Duy's version above keeps the saved_errno close to the syscall that
caused it, which is nicer for making sure we're saving the right thing,
and didn't get fooled by:

  res = rmdir(path);
  ... some other stuff ...
  if (res) {
  int saved_errno = errno;
  ... something else ...
  errno = saved_errno;

That's wrong if "some other stuff" touches errno. But I think
"saved_errno" is not the right pattern there. It is "stuff away the
result _and_ errno for this thing so we can use it later".

IOW, I'd expect it to be more like:

  rmdir_result = rmdir(path);
  rmdir_errno = errno;
  ... some other stuff ...
  if (rmdir_result)
  show_error(rmdir_errno);

And that leads to the "gee, why don't we just encode error values as
negative integers" pattern. Which I agree is nicer, but I'm not sure I
want to get into wrapping every syscall to give it a better interface.

-Peff


Re: [PATCH] completion: restore removed line continuating backslash

2017-02-14 Thread SZEDER Gábor
On Mon, Feb 13, 2017 at 9:45 PM, Junio C Hamano  wrote:
> SZEDER Gábor  writes:
>
>> Recent commit 1cd23e9e0 (completion: don't use __gitdir() for git
>> commands, 2017-02-03) rewrapped a couple of long lines, and while
>> doing so it inadvertently removed a '\' from the end of a line, thus
>> breaking completion for 'git config remote.name.push '.
>>
>> Signed-off-by: SZEDER Gábor 
>> ---
>>
>> I wanted this to be a fixup!, but then noticed that this series is
>> already in next, hence the proper commit message.
>
> We get "--format=... : command not found"?

Yeah, that's the one.

> Thanks for a set of sharp eyes.

Heh.  Sharp?!  It took over five minutes to notice after I first got
that error...

Furthermore, that '\' was already missing in v1, almost a year ago :)

>> I see the last What's cooking marks this series as "will merge to
>> master".  That doesn't mean that it will be in v2.12, does it?
>
> I actually was hoping that these can go in.  Others that can (or
> should) wait are marked as "Will cook in 'next'".
>
> If you feel uncomfortable and want these to cook longer, please tell
> me so.

Well, it was mainly my surprise that a 20+ patch series arriving so
late that it gets queued on top of -rc0 would still make it into the
release.  However, I have been using this series with only minor
modifications for the better part of a year now, so it's unlikely that
there will be any big issues with it.  Maybe something small, like
this missing '\', but we will deal with it when it arises.

Gábor


Re: [PATCH v2] clean: use warning_errno() when appropriate

2017-02-14 Thread Junio C Hamano
Duy Nguyen  writes:

> On Wed, Feb 15, 2017 at 1:13 AM, Junio C Hamano  wrote:
> ...
>> The reason to store the errno in saved_errno here is not because we
>> want to help code after "if (res) {...}", but the patch sent as-is
>> gives that impression and is confusing to the readers.
>>
>> Perhaps all hunks of this patch share the same issue?  I could
>> locally amend, of course, but I'd like to double check before doing
>> so myself---perhaps you did it this way for a good reason that I am
>> missing?
>
> One thing I like about putting saved_errno right next to the related
> syscall is, the syscall is visible from the diff (previously some are
> out of context). This is really minor though.

I agree that this is minor.

I care more about looking at errno ONLY after we saw our call to a
system library function indicated an error, and I wanted to avoid
doing:

res = dry_run ? 0 : rmdir(path);
saved_errno = errno;
if (res) {
... do something else ...
errno = saved_errno;
call_something_that_uses_errno();

When our call to rmdir() did not fail, or when we didn't even call
rmdir() at all, what is in errno has nothing to do with what we are
doing, and making a copy of it makes the code doubly confusing.

Rather, I'd prefer to see:

res = dry_run ? 0 : rmdir(path);
if (res) {
int saved_errno = errno;
... do something else ...
errno = saved_errno;
call_something_that_uses_errno();

which makes it clear what is going on when reading the resulting
code.

For now, I'll queue a separate SQUASH??? and wait for others to
comment.

Thanks.


Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API

2017-02-14 Thread Junio C Hamano
Duy Nguyen  writes:

> Long term, I think that's what Mike wants. Though the filter_ref_store
> might return an (opaque) iterator instead of a ref store and
> for_each_refstore() becomes a thin wrapper around the iterator
> interface.

Ah, yes, an iterator would be a lot more suitable abstraction there.

Thanks.


Re: [PATCH v2] clean: use warning_errno() when appropriate

2017-02-14 Thread Duy Nguyen
On Wed, Feb 15, 2017 at 1:13 AM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> All these warning() calls are preceded by a system call. Report the
>> actual error to help the user understand why we fail to remove
>> something.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  v2 dances with errno
>
> Thanks.
>
>>
>>  builtin/clean.c | 19 ++-
>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/builtin/clean.c b/builtin/clean.c
>> index d6bc3aaae..3569736f6 100644
>> --- a/builtin/clean.c
>> +++ b/builtin/clean.c
>> @@ -154,6 +154,7 @@ static int remove_dirs(struct strbuf *path, const char 
>> *prefix, int force_flag,
>>   struct strbuf quoted = STRBUF_INIT;
>>   struct dirent *e;
>>   int res = 0, ret = 0, gone = 1, original_len = path->len, len;
>> + int saved_errno;
>>   struct string_list dels = STRING_LIST_INIT_DUP;
>>
>>   *dir_gone = 1;
>> @@ -173,9 +174,11 @@ static int remove_dirs(struct strbuf *path, const char 
>> *prefix, int force_flag,
>>   if (!dir) {
>>   /* an empty dir could be removed even if it is unreadble */
>>   res = dry_run ? 0 : rmdir(path->buf);
>> + saved_errno = errno;
>>   if (res) {
>>   quote_path_relative(path->buf, prefix, );
>
> I think this part should be more like
>
> res = ... : rmdir(...);
> if (res) {
> int saved_errno = errno;
> ... do other things that can touch errno ...
> errno = saved_errno;
> ... now we know what the original error was ...
>
> The reason to store the errno in saved_errno here is not because we
> want to help code after "if (res) {...}", but the patch sent as-is
> gives that impression and is confusing to the readers.
>
> Perhaps all hunks of this patch share the same issue?  I could
> locally amend, of course, but I'd like to double check before doing
> so myself---perhaps you did it this way for a good reason that I am
> missing?

One thing I like about putting saved_errno right next to the related
syscall is, the syscall is visible from the diff (previously some are
out of context). This is really minor though. I briefly thought of
introducing rmdir_errno() and friends that return -errno on error, so
we could do

res = ... : rmdir_errno(..);
if (res) {
errno = -res;
warning_errno(...);
}

But that's more work and the errno = -res is not particularly pleasing
to read. I'm fine with moving saved_errno in the error handling
blocks.
-- 
Duy


Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API

2017-02-14 Thread Duy Nguyen
On Wed, Feb 15, 2017 at 1:24 AM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> Direct call sites are just middle men though, e.g.
>> for_each_ref_submodule(). I'll attempt to clean this up at some point
>> in future (probably at the same time I attempt to kill *_submodule ref
>> api). But I think for now I'll just put a TODO or FIXME comment here.
>
> So we'd have get_*_ref_store() for various cases and then will have
> a unifying for_each_ref_in_refstore() that takes a ref-store, which
> supersedes all the ad-hoc iterators like for_each_ref_submodule(),
> for_each_namespaced_ref(), etc?
>
> That's a very sensible longer-term goal, methinks.

Ah I forgot about ref namespace. I'll need to try something out in that area.

> I am wondering what we should do to for_each_{tag,branch,...}_ref().
> Would that become
>
> ref_store = get_ref_main_store();
> tag_ref_store = filter_ref_store(ref_store, "refs/tags/");
> for_each_ref_in_refstore(tag_ref_store, ...);
>
> or do you plan to have some other pattern?

Long term, I think that's what Mike wants. Though the filter_ref_store
might return an (opaque) iterator instead of a ref store and
for_each_refstore() becomes a thin wrapper around the iterator
interface.
-- 
Duy


[PATCH 12/14] read-cache: remove_marked_cache_entries to wipe selected submodules.

2017-02-14 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 read-cache.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 9054369dd0..c0776b93b0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -18,6 +18,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "submodule.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -532,6 +533,8 @@ void remove_marked_cache_entries(struct index_state *istate)
 
for (i = j = 0; i < istate->cache_nr; i++) {
if (ce_array[i]->ce_flags & CE_REMOVE) {
+   if (is_interesting_submodule(ce_array[i]))
+   submodule_go_from_to(ce_array[i]->name, "HEAD", 
NULL, 0, 1);
remove_name_hash(istate, ce_array[i]);
save_or_free_index_entry(istate, ce_array[i]);
}
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 14/14] builtin/checkout: add --recurse-submodules switch

2017-02-14 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 Documentation/git-checkout.txt |  7 +++
 builtin/checkout.c | 28 
 t/lib-submodule-update.sh  | 33 -
 t/t2013-checkout-submodule.sh  |  5 +
 4 files changed, 64 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 8e2c0662dd..a0ea2c5651 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -256,6 +256,13 @@ section of linkgit:git-add[1] to learn how to operate the 
`--patch` mode.
out anyway. In other words, the ref can be held by more than one
worktree.
 
+--[no-]recurse-submodules::
+   Using --recurse-submodules will update the content of all initialized
+   submodules according to the commit recorded in the superproject. If
+   local modifications in a submodule would be overwritten the checkout
+   will fail until `-f` is used. If nothing (or --no-recurse-submodules)
+   is used, the work trees of submodules will not be updated.
+
 ::
Branch to checkout; if it refers to a branch (i.e., a name that,
when prepended with "refs/heads/", is a valid ref), then that
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f174f50303..207ce09771 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -21,12 +21,31 @@
 #include "submodule-config.h"
 #include "submodule.h"
 
+static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+
 static const char * const checkout_usage[] = {
N_("git checkout [] "),
N_("git checkout [] [] -- ..."),
NULL,
 };
 
+int option_parse_recurse_submodules(const struct option *opt,
+   const char *arg, int unset)
+{
+   if (unset) {
+   recurse_submodules = RECURSE_SUBMODULES_OFF;
+   return 0;
+   }
+   if (arg)
+   recurse_submodules =
+   parse_update_recurse_submodules_arg(opt->long_name,
+   arg);
+   else
+   recurse_submodules = RECURSE_SUBMODULES_ON;
+
+   return 0;
+}
+
 struct checkout_opts {
int patch_mode;
int quiet;
@@ -1163,6 +1182,9 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
N_("second guess 'git checkout 
'")),
OPT_BOOL(0, "ignore-other-worktrees", 
_other_worktrees,
 N_("do not check if another worktree is holding the 
given ref")),
+   { OPTION_CALLBACK, 0, "recurse-submodules", _submodules,
+   "checkout", "control recursive updating of 
submodules",
+   PARSE_OPT_OPTARG, option_parse_recurse_submodules },
OPT_BOOL(0, "progress", _progress, N_("force progress 
reporting")),
OPT_END(),
};
@@ -1193,6 +1215,12 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
git_xmerge_config("merge.conflictstyle", conflict_style, NULL);
}
 
+   if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
+   git_config(submodule_config, NULL);
+   if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
+   
set_config_update_recurse_submodules(recurse_submodules);
+   }
+
if ((!!opts.new_branch + !!opts.new_branch_force + 
!!opts.new_orphan_branch) > 1)
die(_("-b, -B and --orphan are mutually exclusive"));
 
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 7c8c557572..9c75a417d4 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -734,6 +734,11 @@ test_submodule_forced_switch () {
 
 test_submodule_switch_recursing () {
command="$1"
+   RESULT=success
+   if test "$KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS" = 1
+   then
+   RESULT=failure
+   fi
# Appearing submodule #
# Switching to a commit letting a submodule appear checks it out ...
test_expect_success "$command: added submodule is checked out" '
@@ -843,7 +848,7 @@ test_submodule_switch_recursing () {
'
# Replacing a submodule with files in a directory must succeeds
# when the submodule is clean
-   test_expect_success "$command: replace submodule with a directory" '
+   test_expect_$RESULT "$command: replace submodule with a directory" '
prolog &&
reset_work_tree_to_interested add_sub1 &&
(
@@ -855,7 +860,7 @@ test_submodule_switch_recursing () {
)
'
# ... absorbing a .git directory.
-   test_expect_success "$command: replace submodule containing a .git 
directory with a directory must absorb the git dir" '
+   test_expect_$RESULT "$command: replace 

[PATCH 13/14] entry.c: update submodules when interesting

2017-02-14 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 entry.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/entry.c b/entry.c
index c6eea240b6..bc6295d41a 100644
--- a/entry.c
+++ b/entry.c
@@ -2,6 +2,7 @@
 #include "blob.h"
 #include "dir.h"
 #include "streaming.h"
+#include "submodule.h"
 
 static void create_directories(const char *path, int path_len,
   const struct checkout *state)
@@ -203,6 +204,13 @@ static int write_entry(struct cache_entry *ce,
return error("cannot create temporary submodule %s", 
path);
if (mkdir(path, 0777) < 0)
return error("cannot create submodule directory %s", 
path);
+   if (is_interesting_submodule(ce))
+   /*
+* force=1 is ok for any case as we did a dry
+* run before with appropriate force setting
+*/
+   return submodule_go_from_to(ce->name,
+   NULL, oid_to_hex(>oid), 0, 1);
break;
default:
return error("unknown file mode for %s in index", path);
@@ -260,6 +268,26 @@ int checkout_entry(struct cache_entry *ce,
 
if (!check_path(path.buf, path.len, , state->base_dir_len)) {
unsigned changed = ce_match_stat(ce, , 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+   /*
+* Needs to be checked before !changed returns early,
+* as the possibly empty directory was not changed
+*/
+   if (is_interesting_submodule(ce)) {
+   int err;
+   if (!is_submodule_populated_gently(ce->name, )) {
+   struct stat sb;
+   if (lstat(ce->name, ))
+   die(_("could not stat file '%s'"), 
ce->name);
+   if (!(st.st_mode & S_IFDIR))
+   unlink_or_warn(ce->name);
+
+   return submodule_go_from_to(ce->name,
+   NULL, oid_to_hex(>oid), 0, 1);
+   } else
+   return submodule_go_from_to(ce->name,
+   "HEAD", oid_to_hex(>oid), 0, 1);
+   }
+
if (!changed)
return 0;
if (!state->force) {
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 11/14] unpack-trees: check if we can perform the operation for submodules

2017-02-14 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 unpack-trees.c | 98 --
 unpack-trees.h |  1 +
 2 files changed, 90 insertions(+), 9 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 616a0ae4b2..6805cb9549 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -10,6 +10,7 @@
 #include "attr.h"
 #include "split-index.h"
 #include "dir.h"
+#include "submodule.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -45,6 +46,9 @@ static const char 
*unpack_plumbing_errors[NB_UNPACK_TREES_ERROR_TYPES] = {
 
/* ERROR_WOULD_LOSE_ORPHANED_REMOVED */
"Working tree file '%s' would be removed by sparse checkout update.",
+
+   /* ERROR_WOULD_LOSE_UNTRACKED_SUBMODULE */
+   "Submodule '%s' cannot be deleted as it contains untracked files.",
 };
 
 #define ERRORMSG(o,type) \
@@ -161,6 +165,8 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
_("The following working tree files would be overwritten by 
sparse checkout update:\n%s");
msgs[ERROR_WOULD_LOSE_ORPHANED_REMOVED] =
_("The following working tree files would be removed by sparse 
checkout update:\n%s");
+   msgs[ERROR_WOULD_LOSE_UNTRACKED_SUBMODULE] =
+   _("Submodule '%s' cannot be deleted as it contains untracked 
files.");
 
opts->show_all_errors = 1;
/* rejected paths may not have a static buffer */
@@ -240,12 +246,44 @@ static void display_error_msgs(struct 
unpack_trees_options *o)
fprintf(stderr, _("Aborting\n"));
 }
 
+static int submodule_check_from_to(const struct cache_entry *ce, const char 
*old_id, const char *new_id, struct unpack_trees_options *o)
+{
+   if (submodule_go_from_to(ce->name, old_id,
+new_id, 1, o->reset))
+   return o->gently ? -1 :
+   add_rejected_path(o, 
ERROR_WOULD_LOSE_UNTRACKED_SUBMODULE, ce->name);
+   return 0;
+}
+
+static void reload_gitmodules_file(struct index_state *index,
+  struct checkout *state)
+{
+   int i;
+   for (i = 0; i < index->cache_nr; i++) {
+   struct cache_entry *ce = index->cache[i];
+   if (ce->ce_flags & CE_UPDATE) {
+
+   int r = strcmp(ce->name, ".gitmodules");
+   if (r < 0)
+   continue;
+   else if (r == 0) {
+   checkout_entry(ce, state, NULL);
+   } else
+   break;
+   }
+   }
+   gitmodules_config();
+   git_config(submodule_config, NULL);
+}
+
 /*
  * Unlink the last component and schedule the leading directories for
  * removal, such that empty directories get removed.
  */
 static void unlink_entry(const struct cache_entry *ce)
 {
+   if (is_interesting_submodule(ce))
+   submodule_go_from_to(ce->name, "HEAD", NULL, 0, 1);
if (!check_leading_path(ce->name, ce_namelen(ce)))
return;
if (remove_or_warn(ce->ce_mode, ce->name))
@@ -301,6 +339,9 @@ static int check_updates(struct unpack_trees_options *o)
remove_marked_cache_entries(index);
remove_scheduled_dirs();
 
+   if (submodules_interesting_for_update() && o->update && !o->dry_run)
+   reload_gitmodules_file(index, );
+
for (i = 0; i < index->cache_nr; i++) {
struct cache_entry *ce = index->cache[i];
 
@@ -1358,17 +1399,27 @@ static int verify_uptodate_1(const struct cache_entry 
*ce,
if (!lstat(ce->name, )) {
int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
unsigned changed = ie_match_stat(o->src_index, ce, , flags);
+
+   if (is_interesting_submodule(ce)) {
+   int r;
+   r = submodule_check_from_to(ce,
+   "HEAD", oid_to_hex(>oid), o);
+   if (r)
+   return o->gently ? -1 :
+   add_rejected_path(o, error_type, 
ce->name);
+   return 0;
+   }
+
if (!changed)
return 0;
/*
-* NEEDSWORK: the current default policy is to allow
-* submodule to be out of sync wrt the superproject
-* index.  This needs to be tightened later for
-* submodules that are marked to be automatically
-* checked out.
+* Historic default policy was to allow submodule to be out
+* of sync wrt the superproject index. If the submodule was
+* not considered interesting above, we don't care here.
 */
if (S_ISGITLINK(ce->ce_mode))
return 0;

[PATCH 03/14] make is_submodule_populated gently

2017-02-14 Thread Stefan Beller
We need the gentle version in a later patch. As we have just one caller,
migrate the caller.

Signed-off-by: Stefan Beller 
---
 builtin/grep.c | 2 +-
 submodule.c| 4 ++--
 submodule.h| 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 2c727ef499..b17835aed6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -616,7 +616,7 @@ static int grep_submodule(struct grep_opt *opt, const 
unsigned char *sha1,
 {
if (!is_submodule_initialized(path))
return 0;
-   if (!is_submodule_populated(path)) {
+   if (!is_submodule_populated_gently(path, NULL)) {
/*
 * If searching history, check for the presense of the
 * submodule's gitdir before skipping the submodule.
diff --git a/submodule.c b/submodule.c
index 3b98766a6b..9bbdd3ce7c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -237,12 +237,12 @@ int is_submodule_initialized(const char *path)
 /*
  * Determine if a submodule has been populated at a given 'path'
  */
-int is_submodule_populated(const char *path)
+int is_submodule_populated_gently(const char *path, int *return_error_code)
 {
int ret = 0;
char *gitdir = xstrfmt("%s/.git", path);
 
-   if (resolve_gitdir(gitdir))
+   if (resolve_gitdir_gently(gitdir, return_error_code))
ret = 1;
 
free(gitdir);
diff --git a/submodule.h b/submodule.h
index 05ab674f06..689033e538 100644
--- a/submodule.h
+++ b/submodule.h
@@ -41,7 +41,7 @@ extern int submodule_config(const char *var, const char 
*value, void *cb);
 extern void gitmodules_config(void);
 extern void gitmodules_config_sha1(const unsigned char *commit_sha1);
 extern int is_submodule_initialized(const char *path);
-extern int is_submodule_populated(const char *path);
+extern int is_submodule_populated_gently(const char *path, int 
*return_error_code);
 extern int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst);
 extern const char *submodule_strategy_to_string(const struct 
submodule_update_strategy *s);
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 08/14] update submodules: move up prepare_submodule_repo_env

2017-02-14 Thread Stefan Beller
In a later patch we need to prepare the submodule environment with
another git directory, so split up the function.

Also move it up in the file such that we do not need to declare the
function later before using it.

Signed-off-by: Stefan Beller 
---
 submodule.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/submodule.c b/submodule.c
index 4c33374ae8..d3fc6c2a75 100644
--- a/submodule.c
+++ b/submodule.c
@@ -359,6 +359,23 @@ static void print_submodule_summary(struct rev_info *rev, 
FILE *f,
strbuf_release();
 }
 
+static void prepare_submodule_repo_env_no_git_dir(struct argv_array *out)
+{
+   const char * const *var;
+
+   for (var = local_repo_env; *var; var++) {
+   if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
+   argv_array_push(out, *var);
+   }
+}
+
+void prepare_submodule_repo_env(struct argv_array *out)
+{
+   prepare_submodule_repo_env_no_git_dir(out);
+   argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
+DEFAULT_GIT_DIR_ENVIRONMENT);
+}
+
 /* Helper function to display the submodule header line prior to the full
  * summary output. If it can locate the submodule objects directory it will
  * attempt to lookup both the left and right commits and put them into the
@@ -1403,18 +1420,6 @@ int parallel_submodules(void)
return parallel_jobs;
 }
 
-void prepare_submodule_repo_env(struct argv_array *out)
-{
-   const char * const *var;
-
-   for (var = local_repo_env; *var; var++) {
-   if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
-   argv_array_push(out, *var);
-   }
-   argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
-DEFAULT_GIT_DIR_ENVIRONMENT);
-}
-
 /*
  * Embeds a single submodules git directory into the superprojects git dir,
  * non recursively.
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 07/14] update submodules: introduce is_interesting_submodule

2017-02-14 Thread Stefan Beller
In later patches we introduce the --recurse-submodule flag for commands
that modify the working directory, e.g. git-checkout.

It is potentially expensive to check if a submodule needs an update,
because a common theme to interact with submodules is to spawn a child
process for each interaction.

So let's introduce a function that pre checks if a submodule needs
to be checked for an update.

Signed-off-by: Stefan Beller 
---
 submodule.c | 26 ++
 submodule.h |  8 
 2 files changed, 34 insertions(+)

diff --git a/submodule.c b/submodule.c
index c0060c29f2..4c33374ae8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -551,6 +551,32 @@ void set_config_update_recurse_submodules(int value)
config_update_recurse_submodules = value;
 }
 
+int submodules_interesting_for_update(void)
+{
+   /*
+* Update can't be "none", "merge" or "rebase",
+* treat any value as OFF, except an explicit ON.
+*/
+   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
+}
+
+int is_interesting_submodule(const struct cache_entry *ce)
+{
+   const struct submodule *sub;
+
+   if (!S_ISGITLINK(ce->ce_mode))
+   return 0;
+
+   if (!submodules_interesting_for_update())
+   return 0;
+
+   sub = submodule_from_path(null_sha1, ce->name);
+   if (!sub)
+   return 0;
+
+   return sub->update_strategy.type != SM_UPDATE_NONE;
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
  int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index c4e1ac828e..84b67a7c4a 100644
--- a/submodule.h
+++ b/submodule.h
@@ -59,6 +59,14 @@ extern void show_submodule_inline_diff(FILE *f, const char 
*path,
const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
+
+/*
+ * Traditionally Git ignored changes made for submodules.
+ * This function checks if we are interested in the given submodule
+ * for any kind of operation.
+ */
+extern int submodules_interesting_for_update(void);
+extern int is_interesting_submodule(const struct cache_entry *ce);
 extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 extern int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 10/14] unpack-trees: pass old oid to verify_clean_submodule

2017-02-14 Thread Stefan Beller
The check (which uses the old oid) is yet to be implemented, but this part
is just a refactor, so it can go separately first.

Signed-off-by: Stefan Beller 
---
 unpack-trees.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19fe8..616a0ae4b2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1407,7 +1407,8 @@ static void invalidate_ce_path(const struct cache_entry 
*ce,
  * Currently, git does not checkout subprojects during a superproject
  * checkout, so it is not going to overwrite anything.
  */
-static int verify_clean_submodule(const struct cache_entry *ce,
+static int verify_clean_submodule(const char *old_sha1,
+ const struct cache_entry *ce,
  enum unpack_trees_error_types error_type,
  struct unpack_trees_options *o)
 {
@@ -1427,16 +1428,18 @@ static int verify_clean_subdirectory(const struct 
cache_entry *ce,
struct dir_struct d;
char *pathbuf;
int cnt = 0;
-   unsigned char sha1[20];
 
-   if (S_ISGITLINK(ce->ce_mode) &&
-   resolve_gitlink_ref(ce->name, "HEAD", sha1) == 0) {
-   /* If we are not going to update the submodule, then
+   if (S_ISGITLINK(ce->ce_mode)) {
+   unsigned char sha1[20];
+   int sub_head = resolve_gitlink_ref(ce->name, "HEAD", sha1);
+   /*
+* If we are not going to update the submodule, then
 * we don't care.
 */
-   if (!hashcmp(sha1, ce->oid.hash))
+   if (!sub_head && !hashcmp(sha1, ce->oid.hash))
return 0;
-   return verify_clean_submodule(ce, error_type, o);
+   return verify_clean_submodule(sub_head ? NULL : 
sha1_to_hex(sha1),
+ ce, error_type, o);
}
 
/*
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 04/14] connect_work_tree_and_git_dir: safely create leading directories

2017-02-14 Thread Stefan Beller
In a later patch we'll use connect_work_tree_and_git_dir when the
directory for the gitlink file doesn't exist yet. Safely create
the directory first.

Signed-off-by: Stefan Beller 
---
 dir.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/dir.c b/dir.c
index 4541f9e146..69ca3d1411 100644
--- a/dir.c
+++ b/dir.c
@@ -2735,6 +2735,8 @@ void connect_work_tree_and_git_dir(const char 
*work_tree_, const char *git_dir_)
 
/* Update gitfile */
strbuf_addf(_name, "%s/.git", work_tree);
+   if (safe_create_leading_directories_const(file_name.buf))
+   fprintf(stderr, "could not create directories for %s\n", 
file_name.buf);
write_file(file_name.buf, "gitdir: %s",
   relative_path(git_dir, work_tree, _path));
 
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 09/14] update submodules: add submodule_go_from_to

2017-02-14 Thread Stefan Beller
In later patches we introduce the options and flag for commands
that modify the working directory, e.g. git-checkout.

This piece of code will be used universally for
all these working tree modifications as it
* supports dry run to answer the question:
  "Is it safe to change the submodule to this new state?"
  e.g. is it overwriting untracked files or are there local
  changes that would be overwritten?
* supports a force flag that can be used for resetting
  the tree.

Signed-off-by: Stefan Beller 
---
 submodule.c | 151 
 submodule.h |   5 ++
 2 files changed, 156 insertions(+)

diff --git a/submodule.c b/submodule.c
index d3fc6c2a75..194cba9535 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1252,6 +1252,157 @@ int bad_to_remove_submodule(const char *path, unsigned 
flags)
return ret;
 }
 
+static int submodule_has_dirty_index(const struct submodule *sub)
+{
+   ssize_t len;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   struct strbuf buf = STRBUF_INIT;
+   int ret = 0;
+
+   prepare_submodule_repo_env_no_git_dir(_array);
+
+   cp.git_cmd = 1;
+   argv_array_pushl(, "diff-index", "--cached", "HEAD", NULL);
+   cp.no_stdin = 1;
+   cp.out = -1;
+   cp.dir = sub->path;
+   if (start_command())
+   die("could not recurse into submodule %s", sub->path);
+
+   len = strbuf_read(, cp.out, 1024);
+   if (len > 2)
+   ret = 1;
+
+   close(cp.out);
+   if (finish_command())
+   die("could not recurse into submodule %s", sub->path);
+
+   strbuf_release();
+   return ret;
+}
+
+void submodule_clean_index(const char *path)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   prepare_submodule_repo_env_no_git_dir(_array);
+
+   cp.git_cmd = 1;
+   cp.no_stdin = 1;
+   cp.dir = path;
+
+   argv_array_pushf(, "--super-prefix=%s/", path);
+   argv_array_pushl(, "read-tree", "-u", "--reset", NULL);
+
+   argv_array_push(, EMPTY_TREE_SHA1_HEX);
+
+   if (run_command())
+   die("could not clean submodule index");
+}
+
+/**
+ * Moves a submodule at a given path from a given head to another new head.
+ * For edge cases (a submodule coming into existence or removing a submodule)
+ * pass NULL for old or new respectively.
+ *
+ * TODO: move dryrun and forced to flags.
+ */
+int submodule_go_from_to(const char *path,
+const char *old,
+const char *new,
+int dry_run,
+int force)
+{
+   int ret = 0;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   const struct submodule *sub;
+
+   sub = submodule_from_path(null_sha1, path);
+
+   if (!sub)
+   die("BUG: could not get submodule information for '%s'", path);
+
+   if (!dry_run) {
+   if (old) {
+   if (!submodule_uses_gitfile(path))
+   absorb_git_dir_into_superproject("", path,
+   ABSORB_GITDIR_RECURSE_SUBMODULES);
+   } else {
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addf(, "%s/modules/%s",
+   get_git_common_dir(), sub->name);
+   connect_work_tree_and_git_dir(path, sb.buf);
+   strbuf_release();
+
+   /* make sure the index is clean as well */
+   submodule_clean_index(path);
+   }
+   }
+
+   if (old && !force) {
+   /* Check if the submodule has a dirty index. */
+   if (submodule_has_dirty_index(sub)) {
+   /* print a thing here? */
+   return -1;
+   }
+   }
+
+   prepare_submodule_repo_env_no_git_dir(_array);
+
+   cp.git_cmd = 1;
+   cp.no_stdin = 1;
+   cp.dir = path;
+
+   argv_array_pushf(, "--super-prefix=%s/", path);
+   argv_array_pushl(, "read-tree", NULL);
+
+   if (!dry_run)
+   argv_array_push(, "-u");
+   else
+   argv_array_push(, "-n");
+
+   if (force)
+   argv_array_push(, "--reset");
+   else
+   argv_array_push(, "-m");
+
+   argv_array_push(, old ? old : EMPTY_TREE_SHA1_HEX);
+   argv_array_push(, new ? new : EMPTY_TREE_SHA1_HEX);
+
+   if (run_command()) {
+   ret = -1;
+   goto out;
+   }
+
+   if (!dry_run) {
+   if (new) {
+   struct child_process cp1 = CHILD_PROCESS_INIT;
+   /* also set the HEAD accordingly */
+   cp1.git_cmd = 1;
+   cp1.no_stdin = 1;
+   cp1.dir = path;
+
+   argv_array_pushl(, "update-ref", "HEAD",
+   

[PATCH 06/14] update submodules: add a config option to determine if submodules are updated

2017-02-14 Thread Stefan Beller
In later patches we introduce the options and flag for commands
that modify the working directory, e.g. git-checkout.

Have a central place to store such settings whether we want to update
a submodule.

Signed-off-by: Stefan Beller 
---
 submodule.c | 6 ++
 submodule.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/submodule.c b/submodule.c
index 9bbdd3ce7c..c0060c29f2 100644
--- a/submodule.c
+++ b/submodule.c
@@ -17,6 +17,7 @@
 #include "worktree.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
+static int config_update_recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int parallel_jobs = 1;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_NODUP;
 static int initialized_fetch_ref_tips;
@@ -545,6 +546,11 @@ void set_config_fetch_recurse_submodules(int value)
config_fetch_recurse_submodules = value;
 }
 
+void set_config_update_recurse_submodules(int value)
+{
+   config_update_recurse_submodules = value;
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
  int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index 689033e538..c4e1ac828e 100644
--- a/submodule.h
+++ b/submodule.h
@@ -58,6 +58,7 @@ extern void show_submodule_inline_diff(FILE *f, const char 
*path,
const char *del, const char *add, const char *reset,
const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
+extern void set_config_update_recurse_submodules(int value);
 extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 extern int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 01/14] lib-submodule-update.sh: reorder create_lib_submodule_repo

2017-02-14 Thread Stefan Beller
Redraw the ASCII art describing the setup using more space, such that
it is easier to understand.  The leaf commits are now ordered the same
way the actual code is ordered.

Add empty lines to the setup code separating each of the leaf commits,
each starting with a "checkout -b".

Signed-off-by: Stefan Beller 
---
 t/lib-submodule-update.sh | 49 ---
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 915eb4a7c6..61c54f2098 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -15,22 +15,27 @@
 # - Tracked file replaced by submodule (replace_sub1_with_file =>
 #   replace_file_with_sub1)
 #
-#   --O-O
-#  /  ^ replace_directory_with_sub1
-# /   replace_sub1_with_directory
-#/O
-#   / ^
-#  /  modify_sub1
-#  O--O---O
-#  ^  ^\  ^
-#  |  | \ remove_sub1
-#  |  |  -O-O
-#  |  |   \   ^ replace_file_with_sub1
-#  |  |\  replace_sub1_with_file
-#  |   add_sub1 --O-O
-# no_submodule^ valid_sub1
-# invalid_sub1
+# O
+#/^
+#   / remove_sub1
+#  /
+#   add_sub1  /---O
+# |  /^
+# | / modify_sub1
+# v/
+#  O--O---O-O
+#  ^   \  ^ replace_directory_with_sub1
+#  |\ replace_sub1_with_directory
+# no_submodule   \
+# O-O
+#  \  ^ replace_file_with_sub1
+#   \ replace_sub1_with_file
+#\
+# O-O
+# ^ valid_sub1
+# invalid_sub1
 #
+
 create_lib_submodule_repo () {
git init submodule_update_repo &&
(
@@ -49,10 +54,11 @@ create_lib_submodule_repo () {
git config submodule.sub1.ignore all &&
git add .gitmodules &&
git commit -m "Add sub1" &&
-   git checkout -b remove_sub1 &&
+
+   git checkout -b remove_sub1 add_sub1&&
git revert HEAD &&
 
-   git checkout -b "modify_sub1" "add_sub1" &&
+   git checkout -b modify_sub1 add_sub1 &&
git submodule update &&
(
cd sub1 &&
@@ -67,7 +73,7 @@ create_lib_submodule_repo () {
git add sub1 &&
git commit -m "Modify sub1" &&
 
-   git checkout -b "replace_sub1_with_directory" "add_sub1" &&
+   git checkout -b replace_sub1_with_directory add_sub1 &&
git submodule update &&
git -C sub1 checkout modifications &&
git rm --cached sub1 &&
@@ -75,22 +81,25 @@ create_lib_submodule_repo () {
git config -f .gitmodules --remove-section "submodule.sub1" &&
git add .gitmodules sub1/* &&
git commit -m "Replace sub1 with directory" &&
+
git checkout -b replace_directory_with_sub1 &&
git revert HEAD &&
 
-   git checkout -b "replace_sub1_with_file" "add_sub1" &&
+   git checkout -b replace_sub1_with_file add_sub1 &&
git rm sub1 &&
echo "content" >sub1 &&
git add sub1 &&
git commit -m "Replace sub1 with file" &&
+
git checkout -b replace_file_with_sub1 &&
git revert HEAD &&
 
-   git checkout -b "invalid_sub1" "add_sub1" &&
+   git checkout -b invalid_sub1 add_sub1 &&
git update-index --cacheinfo 16 
0123456789012345678901234567890123456789 sub1 &&
git commit -m "Invalid sub1 commit" &&
git checkout -b valid_sub1 &&
git revert HEAD &&
+
git checkout master
)
 }
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[RFCv3 PATCH 00/14] Checkout aware of Submodules!

2017-02-14 Thread Stefan Beller
previous work:
https://public-inbox.org/git/20161203003022.29797-1-sbel...@google.com/

v3:
 * moved tests from t2013 to the generic submodule library.
 * factored out the refactoring patches to be up front
 * As I redid the complete implementation, I have the impression this time
   it is cleaner than previous versions.
 
 I think we still have to fix the corner cases of directory/file/submodule 
 conflicts before merging, but this serves as a status update on my current
 way of thinking how to implement the worktree commands being aware of
 submodules.
 
Thanks,
Stefan

v2:
* based on top of the series sent out an hour ago
  "[PATCHv4 0/5] submodule embedgitdirs"
* Try to embed a submodule if we need to remove it.
* Strictly do not change behavior if not giving the new flag.
* I think I missed some review comments from v1, but I'd like to get
  the current state out over the weekend, as a lot has changed so far.
  On Monday I'll go through the previous discussion with a comb to see
  if I missed something.
  
v1:
When working with submodules, nearly anytime after checking out
a different state of the projects, that has submodules changed
you'd run "git submodule update" with a current version of Git.

There are two problems with this approach:

* The "submodule update" command is dangerous as it
  doesn't check for work that may be lost in the submodule
  (e.g. a dangling commit).
* you may forget to run the command as checkout is supposed
  to do all the work for you.

Integrate updating the submodules into git checkout, with the same
safety promises that git-checkout has, i.e. not throw away data unless
asked to. This is done by first checking if the submodule is at the same
sha1 as it is recorded in the superproject. If there are changes we stop
proceeding the checkout just like it is when checking out a file that
has local changes.

The integration happens in the code that is also used in other commands
such that it will be easier in the future to make other commands aware
of submodule.

This also solves d/f conflicts in case you replace a file/directory
with a submodule or vice versa.

The patches are still a bit rough, but the overall series seems
promising enough to me that I want to put it out here.

Any review, specifically on the design level welcome!

Thanks,
Stefan


Stefan Beller (14):
  lib-submodule-update.sh: reorder create_lib_submodule_repo
  lib-submodule-update.sh: define tests for recursing into submodules
  make is_submodule_populated gently
  connect_work_tree_and_git_dir: safely create leading directories
  update submodules: add submodule config parsing
  update submodules: add a config option to determine if submodules are
updated
  update submodules: introduce is_interesting_submodule
  update submodules: move up prepare_submodule_repo_env
  update submodules: add submodule_go_from_to
  unpack-trees: pass old oid to verify_clean_submodule
  unpack-trees: check if we can perform the operation for submodules
  read-cache: remove_marked_cache_entries to wipe selected submodules.
  entry.c: update submodules when interesting
  builtin/checkout: add --recurse-submodules switch

 Documentation/git-checkout.txt |   7 +
 builtin/checkout.c |  28 +++
 builtin/grep.c |   2 +-
 dir.c  |   2 +
 entry.c|  28 +++
 read-cache.c   |   3 +
 submodule-config.c |  22 ++
 submodule-config.h |  17 +-
 submodule.c| 216 +++--
 submodule.h|  16 +-
 t/lib-submodule-update.sh  | 534 +++--
 t/t2013-checkout-submodule.sh  |   5 +
 unpack-trees.c | 115 +++--
 unpack-trees.h |   1 +
 14 files changed, 936 insertions(+), 60 deletions(-)

-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 05/14] update submodules: add submodule config parsing

2017-02-14 Thread Stefan Beller
Similar to b33a15b08 (push: add recurseSubmodules config option,
2015-11-17) and 027771fcb1 (submodule: allow erroneous values for the
fetchRecurseSubmodules option, 2015-08-17), we add submodule-config code
that is later used to parse whether we are interested in updating
submodules.

We need the `die_on_error` parameter to be able to call this parsing
function for the config file as well, which if incorrect lets Git die.

As we're just touching the header file, also mark all functions extern.

Signed-off-by: Stefan Beller 
---
 submodule-config.c | 22 ++
 submodule-config.h | 17 +
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 93453909cf..93f01c4378 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -234,6 +234,28 @@ int parse_fetch_recurse_submodules_arg(const char *opt, 
const char *arg)
return parse_fetch_recurse(opt, arg, 1);
 }
 
+static int parse_update_recurse(const char *opt, const char *arg,
+   int die_on_error)
+{
+   switch (git_config_maybe_bool(opt, arg)) {
+   case 1:
+   return RECURSE_SUBMODULES_ON;
+   case 0:
+   return RECURSE_SUBMODULES_OFF;
+   default:
+   if (!strcmp(arg, "checkout"))
+   return RECURSE_SUBMODULES_ON;
+   if (die_on_error)
+   die("bad %s argument: %s", opt, arg);
+   return RECURSE_SUBMODULES_ERROR;
+   }
+}
+
+int parse_update_recurse_submodules_arg(const char *opt, const char *arg)
+{
+   return parse_update_recurse(opt, arg, 1);
+}
+
 static int parse_push_recurse(const char *opt, const char *arg,
   int die_on_error)
 {
diff --git a/submodule-config.h b/submodule-config.h
index 70f19363fd..d434ecdb45 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -22,16 +22,17 @@ struct submodule {
int recommend_shallow;
 };
 
-int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg);
-int parse_push_recurse_submodules_arg(const char *opt, const char *arg);
-int parse_submodule_config_option(const char *var, const char *value);
-const struct submodule *submodule_from_name(const unsigned char 
*commit_or_tree,
-   const char *name);
-const struct submodule *submodule_from_path(const unsigned char 
*commit_or_tree,
-   const char *path);
+extern int parse_fetch_recurse_submodules_arg(const char *opt, const char 
*arg);
+extern int parse_update_recurse_submodules_arg(const char *opt, const char 
*arg);
+extern int parse_push_recurse_submodules_arg(const char *opt, const char *arg);
+extern int parse_submodule_config_option(const char *var, const char *value);
+extern const struct submodule *submodule_from_name(
+   const unsigned char *commit_or_tree, const char *name);
+extern const struct submodule *submodule_from_path(
+   const unsigned char *commit_or_tree, const char *path);
 extern int gitmodule_sha1_from_commit(const unsigned char *commit_sha1,
  unsigned char *gitmodules_sha1,
  struct strbuf *rev);
-void submodule_free(void);
+extern void submodule_free(void);
 
 #endif /* SUBMODULE_CONFIG_H */
-- 
2.12.0.rc0.16.gd1691994b4.dirty



[PATCH 02/14] lib-submodule-update.sh: define tests for recursing into submodules

2017-02-14 Thread Stefan Beller
Currently lib-submodule-update.sh provides 2 functions
test_submodule_switch and test_submodule_forced_switch that are used by a
variety of tests to ensure that submodules behave as expected. The current
expected behavior is that submodules are not touched at all (see
42639d2317a for the exact setup).

In the future we want to teach all these commands to properly recurse
into submodules. To do that, we'll add two testing functions to
submodule-update-lib.sh test_submodule_switch_recursing and
test_submodule_forced_switch_recursing.

These two functions behave in analogy to the already existing functions
just with a different expectation on submodule behavior. The submodule
in the working tree is expected to be updated to the recorded submodule
version. The behavior is analogous to e.g. the behavior of files in a
nested directory in the working tree, where a change to the working tree
handles any arising directory/file conflicts just fine.

Signed-off-by: Stefan Beller 
---
 t/lib-submodule-update.sh | 474 +-
 1 file changed, 472 insertions(+), 2 deletions(-)

diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 61c54f2098..7c8c557572 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -4,6 +4,7 @@
 # - New submodule (no_submodule => add_sub1)
 # - Removed submodule (add_sub1 => remove_sub1)
 # - Updated submodule (add_sub1 => modify_sub1)
+# - Updated submodule recursively (modify_sub1 => modify_sub1_recursively)
 # - Submodule updated to invalid commit (add_sub1 => invalid_sub1)
 # - Submodule updated from invalid commit (invalid_sub1 => valid_sub1)
 # - Submodule replaced by tracked files in directory (add_sub1 =>
@@ -19,8 +20,8 @@
 #/^
 #   / remove_sub1
 #  /
-#   add_sub1  /---O
-# |  /^
+#   add_sub1  /---O-O
+# |  /^ modify_sub1_recursive
 # | / modify_sub1
 # v/
 #  O--O---O-O
@@ -73,6 +74,14 @@ create_lib_submodule_repo () {
git add sub1 &&
git commit -m "Modify sub1" &&
 
+   git checkout -b modify_sub1_recursively modify_sub1 &&
+   git -C sub1 checkout -b "add_nested_sub" &&
+   git -C sub1 submodule add --branch no_submodule ./. sub2 &&
+   git -C sub1 commit -a -m "add a nested submodule" &&
+   git add sub1 &&
+   git commit -a -m "update submodule, that updates a nested 
submodule" &&
+   git -C sub1 submodule deinit -f --all &&
+
git checkout -b replace_sub1_with_directory add_sub1 &&
git submodule update &&
git -C sub1 checkout modifications &&
@@ -139,6 +148,15 @@ test_git_directory_is_unchanged () {
)
 }
 
+test_git_directory_exists() {
+   test -e ".git/modules/$1" &&
+   if test -f sub1/.git
+   then
+   # does core.worktree point at the right place?
+   test "$(git -C .git/modules/$1 config core.worktree)" = 
"../../../$1"
+   fi
+}
+
 # Helper function to be executed at the start of every test below, it sets up
 # the submodule repo if it doesn't exist and configures the most problematic
 # settings for diff.ignoreSubmodules.
@@ -169,6 +187,18 @@ reset_work_tree_to () {
)
 }
 
+reset_work_tree_to_interested () {
+   reset_work_tree_to $1 &&
+   # indicate we are interested in the submodule:
+   git -C submodule_update config submodule.sub1.url "bogus" &&
+   # also have it available:
+   if ! test -d submodule_update/.git/modules/sub1
+   then
+   mkdir submodule_update/.git/modules &&
+   cp -r submodule_update_repo/.git/modules/sub1 
submodule_update/.git/modules/sub1
+   fi
+}
+
 # Test that the superproject contains the content according to commit "$1"
 # (the work tree must match the index for everything but submodules but the
 # index must exactly match the given commit including any submodule SHA-1s).
@@ -684,3 +714,443 @@ test_submodule_forced_switch () {
)
'
 }
+
+# Test that submodule contents are correctly updated when switching
+# between commits that change a submodule.
+# Test that the following transitions are correctly handled:
+# (These tests are also above in the case where we expect no change
+#  in the submodule)
+# - Updated submodule
+# - New submodule
+# - Removed submodule
+# - Directory containing tracked files replaced by submodule
+# - Submodule replaced by tracked files in directory
+# - Submodule replaced by tracked file with the same name
+# - tracked file replaced by submodule
+#
+# New test cases
+# - Removing a submodule with a git directory absorbs the submodules
+#   git directory first into the superproject.
+
+test_submodule_switch_recursing () {
+   command="$1"

Re: [RFC PATCH] show decorations at the end of the line

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 02:11:06PM -0800, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Thanks.  We'd need to update the tests that expects the old style
> > output, though.
> 
> The updates to the expectation look like this (already squashed).
> The --source decorations in 4202 are also shown at the end, which
> probably is in line with the way --show-decorations adds them at the
> end of the line, but was somewhat surprising from reading only the
> log message.

Hrm, that does surprise me. I'm not sure if that's desirable or not. I
do think some of the "nobody could possibly be parsing these" arguments
about decorations do not apply to --source (and also, they're harder for
humans to pick out from the end of the line as they lack punctuation and
color).

-Peff


Re: Feature request: flagging “volatile” branches for integration/development

2017-02-14 Thread Herbert, Marc
[apologies for the accidental "smart" quotes and the resulting UTF8
encoding of the subject]

On 04/02/2017 06:01, Duy Nguyen wrote:
> 
> But that would be local information only. We don't have ways to
> transfer branch metadata (and we definitely don't want to just share
> .git/config file with everybody). I suppose extending git protocol for
> this is not hard (e.g. appending metadata in the "have" lines).

Thanks Duy. So did you mean:

[ X ] send (big!) patches ?

> The hard part may be policy (e.g. what if the user does not want a branch
> to be treated volatile by various commands even if it receives such
> flag from a git server).

There would be instant, human-readable value in such a new "volatile"
flag. Machine use and policies can be discussed later. These will be
easier to prototype, experiment and refine once the flag exists.

  

Interestingly, it was pointed to me (thanks Calvin) that GitLab has
implemented this volatile flag exactly. It's called... "work in progress":
https://docs.gitlab.com/ee/user/project/merge_requests/work_in_progress_merge_requests.html

I'm not familiar with GitHub, however browsing its documentation the
(in)existence of a pull request seems equivalent to a (non-)volatile
flag. Just like a pull request by email without the need to find and search
a mailing-list.

I'm familiar with Gerrit and there's no strict equivalent to a volatile
flag, however it's:
- totally obvious when the commit has been accepted and merged - hence
  its SHA1 final.
- usually fairly clear whether the code is still WIP or near the
  "pull request" stage based on: how the code review is going, approvals
  and other metadata.

Long story short: to integrate code reviews and source control these
systems overload git with a ton of metadata so it's no surprise to
always find in them something that more or less looks like a "volatile"
flag. I guess this leads to the more general question of core git possibly
implementing some generic metadata/property system (key,value pairs?
everything is a ref?) to better support code review and other
git-based software... Now I bet this on the other hand must have been
discussed (and rejected?) before, any pointer?


Marc







Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Junio C Hamano
"Philip Oakley"  writes:

> There are also a few ideas at the SO answers:
> http://stackoverflow.com/a/5652323/717355

I vaguely recall that I saw somebody said the same "mark tips of
topics as good" on the list and answered with why it does not quite
work, though.


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Junio C Hamano
Stefan Beller  writes:

> I claim that the exposure into .gitmodules combined with
> the extreme similarity to its path is confusing. Maybe this
> can be fixed by a different default name.

I think that this may be worth thinking about it further.

The names are something the end users are not supposed to change,
and one way to ensure that is to make .gitmodules file a binary
black box that can only be updated with a specialized tool---as long
as the tool does not allow updating the "name" field, you wouldn't
risk them mucking with it.  Limiting the update to a specialized
tool also would give us a single place to ensure that it is globally
unique across the history of the project (well, at least the part of
the history that is visible to your repository).

Of course, being "one way" to do so does not mean it is the only
way, or it is the best way.  Keeping the information in a text file
lets you merge them more easily when you add a submodule B while I
added a submodule C, for example, and having a human readble name
lets us learn from the output of "git log -p .gitmodules" that the
repository of the "linux-kernel" submodule we use in our appliance
used to live at linux-2.6.git but has moved to linux.git over time
(for the latter use case to work well, we cannot change the name to
something unreadable by humans like uuid---discouraging people from
modifying and making them unreadble are two different things).


Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Philip Oakley

From: "Christian Couder" 
On Tue, Feb 14, 2017 at 10:08 PM, Junio C Hamano  
wrote:

Johannes Schindelin  writes:


On Mon, 13 Feb 2017, Junio C Hamano wrote:


Johannes Schindelin  writes:

> That is why I taught the Git for Windows CI job that tests the four
> upstream Git integration branches to *also* bisect test breakages and
> then upload comments to the identified commit on GitHub

Good.  I do not think it is useful to try 'pu' as an aggregate and
expect it to always build and work [*1*], but your "bisect and
pinpoint" approach makes it useful to identify individual topic that
brings in a breakage.


Sadly the many different merge bases[*1*] between `next` and `pu` (which
are the obvious good/bad points for bisecting automatically) bring my
build agents to its knees. I may have to disable the bisecting feature 
as

a consequence.


Yeah, this is a bug in the bisect algorithm. Fixing it is in the GSoC
2017 Ideas.


There are also a few ideas at the SO answers: 
http://stackoverflow.com/a/5652323/717355





Probably a less resource intensive approach is to find the tips of
the topics not in 'next' but in 'pu' and test them.  That would give
you which topic(s) are problematic, which is a better starting point
than "Oh, 'pu' does not build".  After identifying which branch is
problematic, bisection of individual topic would be of more manageable
size.


It is still probably more resource intensive than it couls be.

[...]


This is one of these times I wish "git bisect --first-parent" were
available.


Implementing "git bisect --first-parent" is also part of the GSoC 2017 
Ideas.


By the way it should not be very difficult as a patch to do this and
more was proposed a long time ago:

https://public-inbox.org/git/4d3cddf9.6080...@intel.com/


The above "log" gives me 27 merges right now, which
should be bisectable within 5 rounds to identify a single broken
topic (if there is only one breakage, that is).



--
Philip 



Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Junio C Hamano
Christian Couder  writes:

> By the way it should not be very difficult as a patch to do this and
> more was proposed a long time ago:
>
> https://public-inbox.org/git/4d3cddf9.6080...@intel.com/

Thanks for a link.  The one I found most interesting in the thread
is by Avery [*1*], where he explains why "first-parent" bisection
makes sense in "many people develop topics of their own, and they
are aggregated into an integration branch" environment:

Basically, we push/fetch *all* the branches from *everybody* into a
single repo, and build all of them as frequently as we can.  If you
think about it, if you have all the branches that someone might have
pulled/merged from, then you don't have to think of the git history
as a whole complicated DAG; you can just think of it as a whole
bunch of separate chunks of linear history.  Moreover, as long as
people are careful to only pull from a branch when that branch is
passing all tests - which you can easily see by looking at the
gitbuilder console - then playing inside each of these chunks of
linear history can help you figure out where particular bugs were
introduced during "messy" branches.

It also allows you a nice separation of concerns.  The owner of the
mainline branch (the "integration manager" person) only really cares
about which branch they merged that caused a problem, because that
person doesn't want to fix bugs, he/she simply wants to know who
owns the failing branch, so that person can fix *their* bug and
their branch will merge without breaking things.

[Reference]

*1* 
https://public-inbox.org/git/aanlktinwbm9gczhgeqcboepov0_xv7ujyqvc7j13q...@mail.gmail.com/


Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Christian Couder
On Tue, Feb 14, 2017 at 10:08 PM, Junio C Hamano  wrote:
> Johannes Schindelin  writes:
>
>> On Mon, 13 Feb 2017, Junio C Hamano wrote:
>>
>>> Johannes Schindelin  writes:
>>>
>>> > That is why I taught the Git for Windows CI job that tests the four
>>> > upstream Git integration branches to *also* bisect test breakages and
>>> > then upload comments to the identified commit on GitHub
>>>
>>> Good.  I do not think it is useful to try 'pu' as an aggregate and
>>> expect it to always build and work [*1*], but your "bisect and
>>> pinpoint" approach makes it useful to identify individual topic that
>>> brings in a breakage.
>>
>> Sadly the many different merge bases[*1*] between `next` and `pu` (which
>> are the obvious good/bad points for bisecting automatically) bring my
>> build agents to its knees. I may have to disable the bisecting feature as
>> a consequence.

Yeah, this is a bug in the bisect algorithm. Fixing it is in the GSoC
2017 Ideas.

> Probably a less resource intensive approach is to find the tips of
> the topics not in 'next' but in 'pu' and test them.  That would give
> you which topic(s) are problematic, which is a better starting point
> than "Oh, 'pu' does not build".  After identifying which branch is
> problematic, bisection of individual topic would be of more manageable
> size.

It is still probably more resource intensive than it couls be.

[...]

> This is one of these times I wish "git bisect --first-parent" were
> available.

Implementing "git bisect --first-parent" is also part of the GSoC 2017 Ideas.

By the way it should not be very difficult as a patch to do this and
more was proposed a long time ago:

https://public-inbox.org/git/4d3cddf9.6080...@intel.com/

> The above "log" gives me 27 merges right now, which
> should be bisectable within 5 rounds to identify a single broken
> topic (if there is only one breakage, that is).


What's cooking in git.git (Feb 2017, #04; Tue, 14)

2017-02-14 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Topic descriptions of some new topics I just wrote are all iffy.
Suggestion to clarify them are very much welcomed.

Oh, also I'd like to get pull requests for gitk and git-gui updates
soonish, if we are to have one during this cycle.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* jh/preload-index-skip-skip (2017-02-10) 1 commit
 - preload-index: avoid lstat for skip-worktree items

 The preload-index code has been taught not to bother with the index
 entries that are paths that are not checked out by "sparse checkout".

 Will merge to and cook in 'next'.


* rs/cocci-check-free-only-null (2017-02-11) 1 commit
 - cocci: detect useless free(3) calls

 A new coccinelle rule that catches a check of !pointer before the
 pointer is free(3)d, which most likely is a bug.

 Will merge to 'next'.


* jk/doc-remote-helpers-markup-fix (2017-02-13) 1 commit
  (merged to 'next' on 2017-02-14 at 95af1267c7)
 + docs/gitremote-helpers: fix unbalanced quotes

 Doc markup fix.

 Will merge to 'master'.


* jk/doc-submodule-markup-fix (2017-02-13) 1 commit
  (merged to 'next' on 2017-02-14 at b2f807f7d8)
 + docs/git-submodule: fix unbalanced quote

 Doc markup fix.

 Will merge to 'master'.


* rs/ls-files-partial-optim (2017-02-13) 2 commits
 - ls-files: move only kept cache entries in prune_cache()
 - ls-files: pass prefix length explicitly to prune_cache()

 "ls-files" run with pathspec has been micro-optimized to avoid one
 extra call to memmove().

 Will merge to 'next'.


* rs/strbuf-cleanup-in-rmdir-recursively (2017-02-13) 1 commit
 - rm: reuse strbuf for all remove_dir_recursively() calls, again

 Code clean-up.

 Will merge to 'next'.


* tg/stash-doc-cleanup (2017-02-13) 1 commit
  (merged to 'next' on 2017-02-14 at 5b9ffbc741)
 + Documentation/stash: remove mention of git reset --hard
 (this branch is used by tg/stash-push.)

 The documentation explained what "git stash" does to the working
 tree (after stashing away the local changes) in terms of "reset
 --hard", which was exposing an unnecessary implementation detail.

 Will merge to 'master'.


* tg/stash-push (2017-02-13) 6 commits
 - stash: allow pathspecs in the no verb form
 - stash: use stash_push for no verb form
 - stash: teach 'push' (and 'create') to honor pathspec
 - stash: introduce new format create
 - stash: add test for the create command line arguments
 - stash: introduce push verb
 (this branch uses tg/stash-doc-cleanup.)

 Allow "git stash" to take pathspec so that the local changes can be
 stashed away only partially.

 Waiting for the review discussion to settle, followed by a reroll.
 cf. <20170212215420.16701-1-t.gumme...@gmail.com>


* bc/object-id (2017-02-14) 19 commits
 - wt-status: convert to struct object_id
 - builtin/merge-base: convert to struct object_id
 - object_id: use struct object_id in object iteration callbacks
 - sha1_file: introduce an nth_packed_object_oid function
 - refs: simplify parsing of reflog entries
 - hex: introduce parse_oid_hex
 - refs: convert each_reflog_ent_fn to struct object_id
 - reflog-walk: convert struct reflog_info to struct object_id
 - builtin/replace: convert to struct object_id
 - object_id: convert remaining callers of resolve_refdup()
 - builtin/merge: convert to struct object_id
 - builtin/clone: convert to struct object_id
 - builtin/branch: convert to struct object_id
 - builtin/grep: convert to struct object_id
 - builtin/fmt-merge-message: convert to struct object_id
 - builtin/fast-export: convert to struct object_id
 - builtin/describe: convert to struct object_id
 - builtin/diff-tree: convert to struct object_id
 - builtin/commit: convert to struct object_id

 "uchar [40]" to "struct object_id" conversion continues.


* jk/grep-no-index-fix (2017-02-14) 7 commits
 - grep: treat revs the same for --untracked as for --no-index
 - grep: do not diagnose misspelt revs with --no-index
 - grep: avoid resolving revision names in --no-index case
 - grep: fix "--" rev/pathspec disambiguation
 - grep: re-order rev-parsing loop
 - grep: do not unnecessarily query repo for "--"
 - grep: move thread initialization a little lower

 The code to parse the command line "git grep ... 
 [[--] ...]" has been cleaned up, and a handful of bugs
 have been fixed (e.g. we used to check "--" if it is a rev).

 Will merge to and cook in 'next'.


* jk/show-branch-lift-name-len-limit (2017-02-14) 3 commits
 - show-branch: use skip_prefix to drop magic numbers
 - show-branch: store resolved head in heap buffer
 - show-branch: drop head_len variable

 "git show-branch" 

[PATCH/RFC] git-gui: Fix author name encoding in Amend Last Commit.

2017-02-14 Thread bernhardu
From: Bernhard Übelacker 

In "New Commit" author name is set correctly.
But when doing "Amend Last Commit" the encoding gets wrong.

This patch adds in load_last_commit a similar assignment converting
to tcl encoding for commit_author as already exists for msg.
---
 lib/commit.tcl | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/commit.tcl b/lib/commit.tcl
index 83620b7..f5357f2 100644
--- a/lib/commit.tcl
+++ b/lib/commit.tcl
@@ -46,6 +46,9 @@ You are currently in the middle of a merge that has not been 
fully completed.  Y
set msg [encoding convertfrom $enc $msg]
}
set msg [string trim $msg]
+   if {$enc ne {}} {
+   set commit_author [encoding convertfrom $enc 
$commit_author]
+   }
} err]} {
error_popup [strcat [mc "Error loading commit data for amend:"] 
"\n\n$err"]
return
-- 
2.11.0



Re: [PATCH v2 2/2] completion: checkout: complete paths when ref given

2017-02-14 Thread Junio C Hamano
Cornelius Weig  writes:

> Hmm.. I'm a bit reluctant to let go of this just yet, because it was my
> original motivation for the whole patch. I admit that it may be
> confusing to not get completion in your example. However, I think that
> once acquainted with the new behavior, a user who wants some files
> cleaned would start by having a look at the list of files from "git
> checkout HEAD ". That's probably faster than spelling the
> first few characters and hit  for a file that's already clean.

I understand that "git checkout HEAD frotz" that gives 47 other
paths that all begin with "foo", when "frotz27" is the only one
among them that you know you changed, is not very useful to narrow
things down.  

But it is equally irritating when you know "frotz27" is the only
path that begins with "frotz" (after all, that is why you decided to
stop typing after saying "frotz" and letting the comletion kick in)
but you are not certain if you touched it.  The completion being
silent may be an indication that it is not modified, OR it may be an
indication that you mistyped the leading part "frotz", and leaves
you wondering.


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Junio C Hamano
Stefan Beller  writes:

>> And then after doing the "git mv" you have pushed the result, which
>> I pulled.  Now, how will your "internal mapping" propagate to me?
>
> The "name" inside your superprojects git dir may be different from mine,
> after all the name only serves the purpose to not have duplicate
> git repositories when renaming a submodule.

That is true, but you still need to convey "what I used to have at
'path' is now at 'htap'".  It is clear how to do so if we use "name"
in .gitmodules (you say "what we collectively call module A is now
at 'htap'").  I do not know how you do so without having a name.


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Stefan Beller
On Tue, Feb 14, 2017 at 2:17 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Tue, Feb 14, 2017 at 1:56 PM, Junio C Hamano  wrote:
>>> Stefan Beller  writes:
>>>
 If we were to redesign the .gitmodules file, we might have it as

 [submodule "path"]
 url = git://example.org
 branch = .
 ...

 and the "path -> name/UID" mapping would be inside $GIT_DIR.
>>>
>>> I am not sure how you are going to keep track of that mapping,
>>> though.  If .gitmodules file does not have a way to tell that what
>>> used to be at "path" in its v1.0 is now at "htap" (instead the above
>>> seems to assume there will just be an entry for [submodule "htap"]
>>> in the newer version, without anything that links the old one with
>>> the new one), how would the mapping inside $GIT_DIR know?
>>
>> It depends. Maybe git-mv could have rewritten the internal mapping
>> as well.
>
> And then after doing the "git mv" you have pushed the result, which
> I pulled.  Now, how will your "internal mapping" propagate to me?

The "name" inside your superprojects git dir may be different from mine,
after all the name only serves the purpose to not have duplicate
git repositories when renaming a submodule.

>
> I also do not think "this is similar to file renames" holds water.
> Moving the path a submodule bound to from one path to another is
> done as a whole, and it is not like the blob contents where we need
> to handle patch application that expresses a move as creation and
> deletion of similar contents at two different paths.  We can afford
> to be precise (after all, we are recording other information about
> submodules by having an extra .gitmodules file).
>
> In short, "name" is not a design mistake at all.  That needs to be
> excised from the "background story".

I am not saying it was a design mistake per se.

I claim that the exposure into .gitmodules combined with
the extreme similarity to its path is confusing. Maybe this
can be fixed by a different default name.


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Junio C Hamano
Stefan Beller  writes:

> On Tue, Feb 14, 2017 at 1:56 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> If we were to redesign the .gitmodules file, we might have it as
>>>
>>> [submodule "path"]
>>> url = git://example.org
>>> branch = .
>>> ...
>>>
>>> and the "path -> name/UID" mapping would be inside $GIT_DIR.
>>
>> I am not sure how you are going to keep track of that mapping,
>> though.  If .gitmodules file does not have a way to tell that what
>> used to be at "path" in its v1.0 is now at "htap" (instead the above
>> seems to assume there will just be an entry for [submodule "htap"]
>> in the newer version, without anything that links the old one with
>> the new one), how would the mapping inside $GIT_DIR know?
>
> It depends. Maybe git-mv could have rewritten the internal mapping
> as well.

And then after doing the "git mv" you have pushed the result, which
I pulled.  Now, how will your "internal mapping" propagate to me?

I also do not think "this is similar to file renames" holds water.
Moving the path a submodule bound to from one path to another is
done as a whole, and it is not like the blob contents where we need
to handle patch application that expresses a move as creation and
deletion of similar contents at two different paths.  We can afford
to be precise (after all, we are recording other information about
submodules by having an extra .gitmodules file).

In short, "name" is not a design mistake at all.  That needs to be
excised from the "background story".


Re: [PATCH v2 2/2] completion: checkout: complete paths when ref given

2017-02-14 Thread Cornelius Weig


On 02/14/2017 10:31 PM, Junio C Hamano wrote:
> cornelius.w...@tngtech.com writes:
> 
>> From: Cornelius Weig 
>>
>> Git-checkout completes words starting with '--' as options and other
>> words as refs. Even after specifying a ref, further words not starting
>> with '--' are completed as refs, which is invalid for git-checkout.
>>
>> This commit ensures that after specifying a ref, further non-option
>> words are completed as paths. Four cases are considered:
>>
>>  - If the word contains a ':', do not treat it as reference and use
>>regular revlist completion.
>>  - If no ref is found on the command line, complete non-options as refs
>>as before.
>>  - If the ref is HEAD or @, complete only with modified files because
>>checking out unmodified files is a noop.
>>This case also applies if no ref is given, but '--' is present.
> 
> Please at least do not do this one; a completion that is or pretends
> to be more clever than the end users will confuse them at best and
> annoy them.  Maybe the user does not recall if she touched the path
> or not, and just trying to be extra sure that it matches HEAD or
> index by doing "git checkout [HEAD] path".  Leave the "make it
> a noop" to Git, but just allow her do so.

Hmm.. I'm a bit reluctant to let go of this just yet, because it was my
original motivation for the whole patch. I admit that it may be
confusing to not get completion in your example. However, I think that
once acquainted with the new behavior, a user who wants some files
cleaned would start by having a look at the list of files from "git
checkout HEAD ". That's probably faster than spelling the
first few characters and hit  for a file that's already clean.
Let's hear if anybody else has an opinion about this.

> I personally feel that "git checkout ... foo" should
> just fall back to the normal "path on the filesystem" without any
> cleverness, instead of opening a tree object or peek into the index.

I was thinking about that as well, but it's not what happens for "git
checkout topic:path". And given that "git checkout topic path"
refers to the same object, consistency kind of demands that the tree
objects are opened in the latter case as well. However, because the
differences to filesystem path completion are somewhat corner cases, I'm
fine with that as long as I'm not offered ref names any longer...



Re: [RFC PATCH] show decorations at the end of the line

2017-02-14 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks.  We'd need to update the tests that expects the old style
> output, though.

The updates to the expectation look like this (already squashed).
The --source decorations in 4202 are also shown at the end, which
probably is in line with the way --show-decorations adds them at the
end of the line, but was somewhat surprising from reading only the
log message.

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 48b55bfd27..dea2d449ab 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1353,9 +1353,9 @@ test_expect_success 'set up --source tests' '
 
 test_expect_success 'log --source paints branch names' '
cat >expect <<-\EOF &&
-   09e12a9 source-b three
-   8e393e1 source-a two
-   1ac6c77 source-b one
+   09e12a9 three   source-b
+   8e393e1 two source-a
+   1ac6c77 one source-b
EOF
git log --oneline --source source-a source-b >actual &&
test_cmp expect actual
@@ -1364,9 +1364,9 @@ test_expect_success 'log --source paints branch names' '
 test_expect_success 'log --source paints tag names' '
git tag -m tagged source-tag &&
cat >expect <<-\EOF &&
-   09e12a9 source-tag three
-   8e393e1 source-a two
-   1ac6c77 source-tag one
+   09e12a9 three   source-tag
+   8e393e1 two source-a
+   1ac6c77 one source-tag
EOF
git log --oneline --source source-tag source-a >actual &&
test_cmp expect actual
diff --git a/t/t4207-log-decoration-colors.sh b/t/t4207-log-decoration-colors.sh
index b972296f06..08236a83e7 100755
--- a/t/t4207-log-decoration-colors.sh
+++ b/t/t4207-log-decoration-colors.sh
@@ -44,15 +44,15 @@ test_expect_success setup '
 '
 
 cat >expected <\
+${c_commit}COMMIT_ID${c_reset} B${c_commit} (${c_reset}${c_HEAD}HEAD ->\
  ${c_reset}${c_branch}master${c_reset}${c_commit},\
  ${c_reset}${c_tag}tag: v1.0${c_reset}${c_commit},\
- ${c_reset}${c_tag}tag: B${c_reset}${c_commit})${c_reset} B
-${c_commit}COMMIT_ID${c_reset}${c_commit} (${c_reset}${c_tag}tag: 
A1${c_reset}${c_commit},\
- ${c_reset}${c_remoteBranch}other/master${c_reset}${c_commit})${c_reset} A1
-${c_commit}COMMIT_ID${c_reset}${c_commit} 
(${c_reset}${c_stash}refs/stash${c_reset}${c_commit})${c_reset}\
- On master: Changes to A.t
-${c_commit}COMMIT_ID${c_reset}${c_commit} (${c_reset}${c_tag}tag: 
A${c_reset}${c_commit})${c_reset} A
+ ${c_reset}${c_tag}tag: B${c_reset}${c_commit})${c_reset}
+${c_commit}COMMIT_ID${c_reset} A1${c_commit} (${c_reset}${c_tag}tag: 
A1${c_reset}${c_commit},\
+ ${c_reset}${c_remoteBranch}other/master${c_reset}${c_commit})${c_reset}
+${c_commit}COMMIT_ID${c_reset} On master: Changes to A.t\
+${c_commit} (${c_reset}${c_stash}refs/stash${c_reset}${c_commit})${c_reset}
+${c_commit}COMMIT_ID${c_reset} A${c_commit} (${c_reset}${c_tag}tag: 
A${c_reset}${c_commit})${c_reset}
 EOF
 
 # We want log to show all, but the second parent to refs/stash is irrelevant


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Stefan Beller
On Tue, Feb 14, 2017 at 1:56 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> If we were to redesign the .gitmodules file, we might have it as
>>
>> [submodule "path"]
>> url = git://example.org
>> branch = .
>> ...
>>
>> and the "path -> name/UID" mapping would be inside $GIT_DIR.
>
> I am not sure how you are going to keep track of that mapping,
> though.  If .gitmodules file does not have a way to tell that what
> used to be at "path" in its v1.0 is now at "htap" (instead the above
> seems to assume there will just be an entry for [submodule "htap"]
> in the newer version, without anything that links the old one with
> the new one), how would the mapping inside $GIT_DIR know?

It depends. Maybe git-mv could have rewritten the internal mapping
as well.

Maybe it would work similar to a rename detection
utilizing a bloomfilter that includes all recorded sha1s at a given path
and then we can take the sha1 from the a given path and check for each
absorbed submodule git dir if that commit belongs to this repo.

I did not quite think it through, but I was pointing out this is brittle.
I guess a quick way would be to follow the .git file inside the submodule
if that exists and if not build up an internal cache that can map
"path -> potential git dirs".

Of course we can argue that the same problem applies to e.g. remotes:
If I have
remote.origin.url = git://kernel.org and
remote.mirror.url = kernel.googlesource.com
then swapping the urls will of course yield different behavior
for 'origin' and 'mirror'. But in this case it is obvious because
"origin" is not the same string as "kernel.org".

So long term, maybe we should come up with a better default name
for submodules, e.g. just a hash of say the URL being used when
adding the submodule.

>  Don't
> forget that name was introduced as the identity because we cannot
> assume that URL for a single project will never change.

Yes, URL and path can both change over time, which is why it is
a good idea to have them versioned as well as having a way to
overwrite the URL in the config later on.

> I fully agree that our documentation and user education should
> stress that names must be unique and immultable throughout the
> history of a superproject, though.

This would be a good paragraph in this "background story" that this
patch tries to write. I'll add that.

Thanks,
Stefan


Re: [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 12:31:46PM +0100, Johannes Schindelin wrote:

> On Windows, calls to memihash() and maintaining the istate.name_hash and
> istate.dir_hash HashMaps take significant time on very large
> repositories. This series of changes reduces the overall time taken for
> various operations by reducing the number calls to memihash(), moving
> some of them into multi-threaded code, and etc.
> 
> Note: one commenter in https://github.com/git-for-windows/git/pull/964
> pointed out that memihash() only handles ASCII correctly. That is true.
> And fixing this is outside the purview of this patch series.

Out of curiosity, do you have numbers? Bonus points if the speedup can
be shown via a t/perf script.

We have a read-cache perf-test already, but I suspect you'd want
something more like "git status" or "ls-files -o" that calls into
read_directory().

-Peff


Re: [PATCH v2 00/19] object_id part 6

2017-02-14 Thread Junio C Hamano
"brian m. carlson"  writes:

> This is another series in the continuing conversion to struct object_id.
>
> This series converts more of the builtin directory and some of the refs
> code to use struct object_id. Additionally, it implements an
> nth_packed_object_oid function which provides a struct object_id version
> of the nth_packed_object function, and a parse_oid_hex function that
> makes parsing easier.
>
> The patch to use parse_oid_hex in the refs code has been split out into
> its own patch, just because I'm wary of that code and potentially
> breaking things, and I want it to be easy to revert in case things go
> wrong.  I have no reason to believe it is anything other than fully
> functional, however.

Thanks.  Will queue.

There are a few hunks in builtin/merge.c that ends up getting discarded
when merged to 'pu' as is-old-style-invocation will just be removed,
but the conflict resolution was trivial.



Re: [PATCH 8/7] grep: treat revs the same for --untracked as for --no-index

2017-02-14 Thread Junio C Hamano
Jeff King  writes:

> ...
> The rationale for doing this with --no-index is that it is
> meant to be used outside a repository, and so parsing revs
> at all does not make sense.
>
> This patch gives --untracked the same treatment. While it
> _is_ meant to be used in a repository, it is explicitly
> about grepping the non-repository contents. Telling the user
> "we found a rev, but you are not allowed to use revs" is
> not really helpful compared to "we treated your argument as
> a path, and could not find it".

Yup, both sounds very sensible.  Thanks.


Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Junio C Hamano
Stefan Beller  writes:

> If we were to redesign the .gitmodules file, we might have it as
>
> [submodule "path"]
> url = git://example.org
> branch = .
> ...
>
> and the "path -> name/UID" mapping would be inside $GIT_DIR.

I am not sure how you are going to keep track of that mapping,
though.  If .gitmodules file does not have a way to tell that what
used to be at "path" in its v1.0 is now at "htap" (instead the above
seems to assume there will just be an entry for [submodule "htap"]
in the newer version, without anything that links the old one with
the new one), how would the mapping inside $GIT_DIR know?  Don't
forget that name was introduced as the identity because we cannot
assume that URL for a single project will never change.

I fully agree that our documentation and user education should
stress that names must be unique and immultable throughout the
history of a superproject, though.




[PATCH 8/7] grep: treat revs the same for --untracked as for --no-index

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 10:19:56AM -0800, Jonathan Tan wrote:

> > I wasn't sure if we wanted to treat "untracked" in the same way.
> > Certainly we can catch the error here for the seen_dashdash case, but is
> > it also the case that:
> > 
> >   echo content >master
> >   git grep --untracked pattern master
> > 
> > should treat "master" as a path?
> 
> It is already the case that it cannot be treated as a rev:
> 
>   $ git grep --untracked pattern master
>   fatal: --no-index or --untracked cannot be used with revs.
> 
> So I think it would be better if it was treated as a path, for consistency
> with --no-index. I'm OK either way, though.

Right, it's always been disallowed. But the early detection changes a
few user-visible behaviors like the exact error message, and how
disambiguation works (see below). I think the arguments for making that
change for --no-index are stronger than for --untracked. But it probably
makes sense for --untracked, too.

Here's a patch on top of my series that lumps the two together again. It
_could_ be squashed into the earlier patch, but I think I prefer keeping
it separate.

-- >8 --
Subject: [PATCH] grep: treat revs the same for --untracked as for --no-index

git-grep has always disallowed grepping in a tree (as
opposed to the working directory) with both --untracked
and --no-index. But we traditionally did so by first
collecting the revs, and then complaining when any were
provided.

The --no-index option recently learned to detect revs
much earlier. This has two user-visible effects:

  - we don't bother to resolve revision names at all. So
when there's a rev/path ambiguity, we always choose to
treat it as a path.

  - likewise, when you do specify a revision without "--",
the error you get is "no such path" and not "--untracked
cannot be used with revs".

The rationale for doing this with --no-index is that it is
meant to be used outside a repository, and so parsing revs
at all does not make sense.

This patch gives --untracked the same treatment. While it
_is_ meant to be used in a repository, it is explicitly
about grepping the non-repository contents. Telling the user
"we found a rev, but you are not allowed to use revs" is
not really helpful compared to "we treated your argument as
a path, and could not find it".

Signed-off-by: Jeff King 
---
 builtin/grep.c  | 10 +-
 t/t7810-grep.sh |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 1454bef49..9304c33e7 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -967,6 +967,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
int dummy;
int use_index = 1;
int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
+   int allow_revs;
 
struct option options[] = {
OPT_BOOL(0, "cached", ,
@@ -1165,6 +1166,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
 * to it must resolve as a rev. If not, then we stop at the first
 * non-rev and assume everything else is a path.
 */
+   allow_revs = use_index && !untracked;
for (i = 0; i < argc; i++) {
const char *arg = argv[i];
unsigned char sha1[20];
@@ -1176,9 +1178,9 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
break;
}
 
-   if (!use_index) {
+   if (!allow_revs) {
if (seen_dashdash)
-   die(_("--no-index cannot be used with revs"));
+   die(_("--no-index or --untracked cannot be used 
with revs"));
break;
}
 
@@ -1201,7 +1203,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
if (!seen_dashdash) {
int j;
for (j = i; j < argc; j++)
-   verify_filename(prefix, argv[j], j == i && use_index);
+   verify_filename(prefix, argv[j], j == i && allow_revs);
}
 
parse_pathspec(, 0,
@@ -1273,8 +1275,6 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
 
if (!use_index || untracked) {
int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
-   if (list.nr)
-   die(_("--no-index or --untracked cannot be used with 
revs."));
hit = grep_directory(, , use_exclude, use_index);
} else if (0 <= opt_exclude) {
die(_("--[no-]exclude-standard cannot be used for tracked 
contents."));
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 0ff9f6cae..cee42097b 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1032,7 +1032,7 @@ test_expect_success 'grep --no-index pattern -- path' '
 
 test_expect_success 'grep --no-index complains of revs' '
test_must_fail git grep --no-index o master -- 2>err &&
-   test_i18ngrep 

Re: [RFC-PATCHv2] submodules: add a background story

2017-02-14 Thread Stefan Beller
Sorry for dropping the ball here, I was stressed out a bit.

On Thu, Feb 9, 2017 at 3:32 PM, Junio C Hamano  wrote:
>>   Do we need
>>
>>   * an opinionated way to check for a specific state of a submodule
>>   * (submodule helper to be plumbing?)
>>   * expose the design mistake of having the (name->path) mapping inside the
>> working tree, i.e. never remove a name from the submodule config even 
>> when
>> the submodule doesn't exist any more.
>
> I am not sure about the last item.
>
> Are you talking about a case where submodule comes and goes (think:
> "git checkout v1.0" that would make submodules added since that
> version disappar)?  .gitmodules that is checked out would not have
> any entry, but .git/config needs to record the end-user preference
> for the module, so that the user can do "git checkout -" to come
> back, no?

That is perfectly legit and I agree that is good design.

>  IOW .git/config that mentions all the submodule the user
> ever showed interests in is not a design mistake, so you must be
> talking about something else, but I do not know what it is.

I mean that we
(1) have a gitmodules file tracked in git that includes the name.
The "tracking some information inside the version control to
help the very version control system" is also not bad. The bad part
is that the name *must not be changed* and
 * we do not tell people about it in the docs
 * we happily make commits that change the name of a submodule
(2) name the submodule by path be default

See
https://public-inbox.org/git/7e54658a-dcb2-64a7-3c67-0c4fa221b...@gmail.com/

> Oh, I see. You did not just rename the path, but also the name
> in the .gitmodules?

I wasn't even aware that the submodule name was something different from
the path because the name is by default set to be the path to it.

You could blame this specific instance on the user, but I rather blame it on Git
as such questions come up once in a while on the mailing list.

If we were to redesign the .gitmodules file, we might have it as

[submodule "path"]
url = git://example.org
branch = .
...

and the "path -> name/UID" mapping would be inside $GIT_DIR.

>
> Are they both in section (1)?  I think the former (concepts) belongs
> to section 7 and the latter (file formats) belongs to section 5.

oops. Will fix.

>
>> diff --git a/Documentation/gitsubmodules.txt 
>> b/Documentation/gitsubmodules.txt
>> new file mode 100644
>> index 00..3369d55ae9
>> --- /dev/null
>> +++ b/Documentation/gitsubmodules.txt
>> @@ -0,0 +1,194 @@
>> +gitsubmodules(7)
>> +
>> +
>> +NAME
>> +
>> +gitsubmodules - information about submodules
>> +
>> +SYNOPSIS
>> +
>> +$GIT_DIR/config, .gitmodules
>> +
>> +--
>> +git submodule
>> +--
>> +
>> +DESCRIPTION
>> +---
>> +
>> +A submodule allows you to keep another Git repository in a subdirectory
>> +...
>> +When cloning or pulling a repository containing submodules however,
>> +the submodules will not be checked out by default; You need to instruct
>> +'clone' to recurse into submodules. The 'init' and 'update' subcommands
>
> I think this is not "You need to", but rather "You can, if you want
> to have each and every submodules."

ok. In this  man page for submodules I assumed an implicit
"[if you want these submodules to be there, then] you have to/need to ...

But I'll tone it down as it doesn't carry internal assumptions.

>> +
>> +** When you want to use a (third party) library tied to a specific version.
>> +   Using submodules for a library allows you to have a clean history for
>> +   your own project and only updating the library in the submodule when 
>> needed.
>
> I somehow do not see this as decoupling; it is keeping what is
> originally separate separate, isn't it?

ok I'll reword that to say keeping separate things separate.

>
>> +** In its current form Git scales up poorly for very large repositories that
>> +   change a lot, as the history grows very large. For that you may want to 
>> look
>> +   at shallow clone, sparse checkout or git-lfs.
>> +   However you can also use submodules to e.g. hold large binary assets
>> +   and these repositories are then shallowly cloned such that you do not
>> +   have a large history locally.
>
> In other words, a better way to list these may be
>
>  1. using another project that stands on its own.
>
>  2. artificially split a (logically single) project into multiple
> repositories and tying them back together.
>
> The access control and performance reasons are subclasses of 2.
> IOW, if Git had per-path ACL and infinite scaling, you wouldn't be
> splitting your project into submodules for 2.  You would still want
> to use somebody else's project by binding it as a subproject, instead
> of merging its history into yours.

Looking at the big picture with a logical view is better indeed.

>
>> +When working with submodules, you can think of 

Re: [PATCH v4 4/7] stash: introduce new format create

2017-02-14 Thread Thomas Gummerer
On 02/14, Matthieu Moy wrote:
> Thomas Gummerer  writes:
> 
> >  create_stash () {
> > -   stash_msg="$1"
> > -   untracked="$2"
> > +   stash_msg=
> > +   untracked=
> > +   new_style=
> > +   while test $# != 0
> > +   do
> > +   case "$1" in
> > +   -m|--message)
> > +   shift
> > +   test -z ${1+x} && usage
> > +   stash_msg="$1"
> > +   new_style=t
> > +   ;;
> > +   -u|--include-untracked)
> > +   shift
> > +   test -z ${1+x} && usage
> > +   untracked="$1"
> > +   new_style=t
> > +   ;;
> > +   *)
> > +   if test -n "$new_style"
> > +   then
> > +   echo "invalid argument"
> > +   option="$1"
> > +   # TRANSLATORS: $option is an invalid option, 
> > like
> > +   # `--blah-blah'. The 7 spaces at the beginning 
> > of the
> > +   # second line correspond to "error: ". So you 
> > should line
> > +   # up the second line with however many 
> > characters the
> > +   # translation of "error: " takes in your 
> > language. E.g. in
> > +   # English this is:
> > +   #
> > +   #$ git stash save --blah-blah 2>&1 | head 
> > -n 2
> > +   #error: unknown option for 'stash save': 
> > --blah-blah
> > +   #   To provide a message, use git stash 
> > save -- '--blah-blah'
> > +   eval_gettextln "error: unknown option for 
> > 'stash create': \$option"
> 
> The TRANSLATORS: hint seems a typoed cut-and-paste from somewhere else.
> There are no 7 spaces in this message.
> 
> Actually, if I read the code correctly, $option is not even necessarily
> an option as you're matching *. Perhaps you meant something like
> 
>   -*)
>   option="$1"
>   # TRANSLATORS: $option is an invalid option, like
>   # `--blah-blah'. The 7 spaces at the beginning of the
>   # second line correspond to "error: ". So you should line
>   # up the second line with however many characters the
>   # translation of "error: " takes in your language. E.g. in
>   # English this is:
>   #
>   #$ git stash save --blah-blah 2>&1 | head -n 2
>   #error: unknown option for 'stash save': --blah-blah
>   #   To provide a message, use git stash save -- 
> '--blah-blah'
>   eval_gettextln "error: unknown option for 'stash save': \$option
>To provide a message, use git stash save -- '\$option'"
> usage
> ;;
> *)
>   if test -n "$new_style"
>   then
>   arg="$1"
>   eval_gettextln "error: invalid argument for 'stash 
> create': \$arg"
>   usage
>   fi
> break
>   ;;
> 
> (untested)
> 
> Also, you may want to guard against
> 
>   git stash create "some message" -m "some other message"
> 
> since you are already rejecting
> 
>   git stash create -m "some message" "some other message"
> 
> ? Or perhaps apply "last one wins" for both "-m message" and
> "message"-without-dash-m.

Thanks, you're right I was missing some cases here.  As I just
indicated in [1] however I think we can just make this an internal
interface, instead of user interface facing, so I think we'll need
less error checking.

> -- 
> Matthieu Moy
> http://www-verimag.imag.fr/~moy/

[1]: http://public-inbox.org/git/20170214213038.GE652@hank/


Re: [PATCH v3 0/5] stash: support pathspec argument

2017-02-14 Thread Thomas Gummerer
On 02/14, Matthieu Moy wrote:
> Thomas Gummerer  writes:
> 
> > I'm almost convinced of special casing "-p".  (Maybe I'm easy to
> > convince as well, because it would be convenient ;) ) However it's a
> > bit weird that now "git stash -p file" would work, but "git stash -m
> > message" wouldn't.  Maybe we should do it the other way around, and
> > only special case "-q", and see if there is an non option argument
> > after that?  From a glance at the options that's the only one where
> > "git stash - " could make sense to the user.
> 
> Special-casing the allowed cases makes it easier to change the behavior
> in the future if needed: it's easy to add special cases and it doesn't
> break backward compatibility.
> 
> So, if we don't have a good reason to do otherwise, I'd rather stay on
> the future-proof side.

Ok, after reading the rest of the messages in the thread I'm convinced
we should just special case "-p" for now.

It's by far my most used stash command as well, after using git stash
without any arguments.

> -- 
> Matthieu Moy
> http://www-verimag.imag.fr/~moy/



Re: [PATCH v2 2/2] completion: checkout: complete paths when ref given

2017-02-14 Thread Junio C Hamano
cornelius.w...@tngtech.com writes:

> From: Cornelius Weig 
>
> Git-checkout completes words starting with '--' as options and other
> words as refs. Even after specifying a ref, further words not starting
> with '--' are completed as refs, which is invalid for git-checkout.
>
> This commit ensures that after specifying a ref, further non-option
> words are completed as paths. Four cases are considered:
>
>  - If the word contains a ':', do not treat it as reference and use
>regular revlist completion.
>  - If no ref is found on the command line, complete non-options as refs
>as before.
>  - If the ref is HEAD or @, complete only with modified files because
>checking out unmodified files is a noop.
>This case also applies if no ref is given, but '--' is present.

Please at least do not do this one; a completion that is or pretends
to be more clever than the end users will confuse them at best and
annoy them.  Maybe the user does not recall if she touched the path
or not, and just trying to be extra sure that it matches HEAD or
index by doing "git checkout [HEAD] path".  Leave the "make it
a noop" to Git, but just allow her do so.

I personally feel that "git checkout ... foo" should
just fall back to the normal "path on the filesystem" without any
cleverness, instead of opening a tree object or peek into the index.



Re: [PATCH v3 4/5] stash: introduce new format create

2017-02-14 Thread Thomas Gummerer
On 02/13, Jeff King wrote:
> On Mon, Feb 13, 2017 at 04:57:34PM -0500, Jeff King wrote:
> 
> > Yeah, I think your patch is actually fixing that case. But your search
> > is only part of the story. You found somebody using "-m" explicitly, but
> > what about somebody blindly calling:
> > 
> >   git stash create $*
> > 
> > That's now surprising to somebody who puts "-m" in their message.
> > 
> > > I *think* this regression is acceptable, but I'm happy to introduce
> > > another verb if people think otherwise.
> > 
> > Despite what I wrote above, I'm still inclined to say that this isn't an
> > important regression. I'd be surprised if "stash create" is used
> > independently much at all.
> 
> Just thinking on this more...do we really care about "fixing" the
> interface of "stash create"? This is really just about refactoring what
> underlies the new "push", right?
>
> So we could just do:
> 
> diff --git a/git-stash.sh b/git-stash.sh
> index 6d629fc43..ee37db135 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -711,7 +711,7 @@ clear)
>   ;;
>  create)
>   shift
> - create_stash "$@" && echo "$w_commit"
> + create_stash -m "$*" && echo "$w_commit"
>   ;;
>  store)
>   shift
> 
> on top of your patch and keep the external interface the same.
> 
> It might be nice to clean up the interface for "create" to match other
> ones, but from this discussion I think it is mostly a historical wart
> for scripting, and we are OK to just leave its slightly-broken interface
> in place forever.

Yeah tbh I don't personally care too much about modernizing the
interface to git stash create.  What you have above makes a lot of
sense to me.

I also just noticed that git stash create -m message is not the worst
regression I introduced when trying to modernize this.  In that case
only the -m would go missing, but that's probably not the end of the
world.  The worse thing to do would be something like
git stash create -u untracked, where the intended message was "-u
untracked", but instead there is no message, but all untracked files
are now included in the stash as well.

In that light what you have above makes even more sense to me.
Thanks!

> I could go either way.
> 
> -Peff

-- 
Thomas


Re: [PATCH] show-branch: fix crash with long ref name

2017-02-14 Thread Christian Couder
On Tue, Feb 14, 2017 at 8:55 PM, Jeff King  wrote:
> On Tue, Feb 14, 2017 at 11:35:48AM -0800, Junio C Hamano wrote:
>
>> Jeff King  writes:
>>
>> > This fixes the problem, but I think we can simplify it quite a bit by
>> > using resolve_refdup(). Here's the patch series I ended up with:
>> >
>> >   [1/3]: show-branch: drop head_len variable
>> >   [2/3]: show-branch: store resolved head in heap buffer
>> >   [3/3]: show-branch: use skip_prefix to drop magic numbers

Yeah, I noticed there were a number of things that could be improved
in the area, but I didn't want to spend too much time on this, so
thanks for this series.

>> Yes, the whole thing is my fault ;-) and I agree with what these
>> patches do.
>>
>> The second one lacks free(head) but I think that is OK; it is
>> something we allocate in cmd_*() and use pretty much thruout the
>> rest of the program.
>
> Yes, I actually tested the whole thing under ASAN (which was necessary
> to notice the problem), which complained about the leak. I don't mind
> adding a free(head), but there are a bunch of similar "leaks" in that
> function, so I didn't bother.

Yeah, I didn't bother either.

> I notice Christian's patch added a few tests. I don't know if we'd want
> to squash them in (I didn't mean to override his patch at all; I was
> about to send mine out when I noticed his, and I wondered if we wanted
> to combine the two efforts).

I think it would be nice to have at least one test. Feel free to
squash mine if you want.


[PATCH v2 2/2] completion: checkout: complete paths when ref given

2017-02-14 Thread cornelius . weig
From: Cornelius Weig 

Git-checkout completes words starting with '--' as options and other
words as refs. Even after specifying a ref, further words not starting
with '--' are completed as refs, which is invalid for git-checkout.

This commit ensures that after specifying a ref, further non-option
words are completed as paths. Four cases are considered:

 - If the word contains a ':', do not treat it as reference and use
   regular revlist completion.
 - If no ref is found on the command line, complete non-options as refs
   as before.
 - If the ref is HEAD or @, complete only with modified files because
   checking out unmodified files is a noop.
   This case also applies if no ref is given, but '--' is present.
 - If a ref other than HEAD or @ is found, offer only valid paths from
   that revision.

Note that one corner-case is not covered by the current implementation:
if a refname contains a ':' and is followed by '--' the completion would
not recognize the valid refname.

Signed-off-by: Cornelius Weig 
---
 contrib/completion/git-completion.bash | 39 +++---
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 4ab119d..df46f62 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1068,7 +1068,7 @@ _git_bundle ()
 
 _git_checkout ()
 {
-   __git_has_doubledash && return
+   local i c=2 ref="" seen_double_dash=""
 
case "$cur" in
--conflict=*)
@@ -1081,13 +1081,36 @@ _git_checkout ()
"
;;
*)
-   # check if --track, --no-track, or --no-guess was specified
-   # if so, disable DWIM mode
-   local flags="--track --no-track --no-guess" track=1
-   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
-   track=''
-   fi
-   __gitcomp_nl "$(__git_refs '' $track)"
+   while [ $c -lt $cword ]; do
+   i="${words[c]}"
+   case "$i" in
+   --) seen_double_dash=1 ;;
+   -*|?*:*) ;;
+   *) ref="$i"; break ;;
+   esac
+   ((c++))
+   done
+
+   case "$ref,$seen_double_dash,$cur" in
+   ,,*:*)
+   __git_complete_revlist_file
+   ;;
+   ,,*)
+   # check for --track, --no-track, or --no-guess
+   # if so, disable DWIM mode
+   local flags="--track --no-track --no-guess" track=1
+   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
+   track=''
+   fi
+   __gitcomp_nl "$(__git_refs '' $track)"
+   ;;
+   ,1,*|@,*|HEAD,*)
+   __git_complete_index_file "--modified"
+   ;;
+   *)
+   __git_complete_tree_file "$ref" "$cur"
+   ;;
+   esac
;;
esac
 }
-- 
2.10.2



[PATCH v2 1/2] completion: extract utility to complete paths from tree-ish

2017-02-14 Thread cornelius . weig
From: Cornelius Weig 

The function __git_complete_revlist_file understands how to complete a
path such as 'topic:ref'. In that case, the revision (topic) and
the path component (ref) are both part of the same word. However,
some cases require that the revision is specified elsewhere than the
current word for completion, such as 'git checkout topic ref'.

In order to allow callers to specify the revision, extract a utility
function to complete paths from a tree-ish object. The utility will be
used later to implement path completion for git-checkout.

Signed-off-by: Cornelius Weig 
---
 contrib/completion/git-completion.bash | 73 +++---
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 6c6e1c7..4ab119d 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -442,6 +442,46 @@ __git_compute_merge_strategies ()
__git_merge_strategies=$(__git_list_merge_strategies)
 }
 
+# __git_complete_tree_file requires 2 argument:
+# 1: the the tree-like to look at for completion
+# 2: the path component to complete
+__git_complete_tree_file ()
+{
+   local pfx ls ref="$1" cur_="$2"
+   case "$cur_" in
+   ?*/*)
+   pfx="${cur_%/*}"
+   cur_="${cur_##*/}"
+   ls="$ref:$pfx"
+   pfx="$pfx/"
+   ;;
+   *)
+   ls="$ref"
+   ;;
+   esac
+
+   case "$COMP_WORDBREAKS" in
+   *:*) : great ;;
+   *)   pfx="$ref:$pfx" ;;
+   esac
+
+   __gitcomp_nl "$(git --git-dir="$(__gitdir)" ls-tree "$ls" 2>/dev/null \
+   | sed '/^100... blob /{
+  s,^.*,,
+  s,$, ,
+  }
+  /^12 blob /{
+  s,^.*,,
+  s,$, ,
+  }
+  /^04 tree /{
+  s,^.*,,
+  s,$,/,
+  }
+  s/^.*//')" \
+   "$pfx" "$cur_" ""
+}
+
 __git_complete_revlist_file ()
 {
local pfx ls ref cur_="$cur"
@@ -452,38 +492,7 @@ __git_complete_revlist_file ()
?*:*)
ref="${cur_%%:*}"
cur_="${cur_#*:}"
-   case "$cur_" in
-   ?*/*)
-   pfx="${cur_%/*}"
-   cur_="${cur_##*/}"
-   ls="$ref:$pfx"
-   pfx="$pfx/"
-   ;;
-   *)
-   ls="$ref"
-   ;;
-   esac
-
-   case "$COMP_WORDBREAKS" in
-   *:*) : great ;;
-   *)   pfx="$ref:$pfx" ;;
-   esac
-
-   __gitcomp_nl "$(git --git-dir="$(__gitdir)" ls-tree "$ls" 
2>/dev/null \
-   | sed '/^100... blob /{
-  s,^.*,,
-  s,$, ,
-  }
-  /^12 blob /{
-  s,^.*,,
-  s,$, ,
-  }
-  /^04 tree /{
-  s,^.*,,
-  s,$,/,
-  }
-  s/^.*//')" \
-   "$pfx" "$cur_" ""
+   __git_complete_tree_file "$ref" "$cur_"
;;
*...*)
pfx="${cur_%...*}..."
-- 
2.10.2



Re: [PATCH] completion: complete modified files for checkout with '--'

2017-02-14 Thread Cornelius Weig
On 02/14/2017 01:50 AM, SZEDER Gábor wrote:
> On Tue, Feb 14, 2017 at 12:33 AM,   wrote:
>> From: Cornelius Weig 
>>
>> The command line completion for git-checkout bails out when seeing '--'
>> as an isolated argument. For git-checkout this signifies the start of a
>> list of files which are to be checked out. Checkout of files makes only
>> sense for modified files,
> 
> No, there is e.g. 'git checkout that-branch this-path', too.

Very true. Thanks for prodding me to this palpable oversight.

My error was to aim for a small improvement. I think the correct
approach is to improve the overall completion of git-checkout. IMHO it
is a completion bug that after giving a ref, completion will still offer
refs, e.g.
$ git checkout HEAD  --> list of refs

As far as I can see, giving two refs to checkout is always an error. The
correct behavior in the example above would be to offer paths instead.

I'll follow up with an improved version which considers these cases.



Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Mon, 13 Feb 2017, Junio C Hamano wrote:
>
>> Johannes Schindelin  writes:
>> 
>> > That is why I taught the Git for Windows CI job that tests the four
>> > upstream Git integration branches to *also* bisect test breakages and
>> > then upload comments to the identified commit on GitHub
>> 
>> Good.  I do not think it is useful to try 'pu' as an aggregate and
>> expect it to always build and work [*1*], but your "bisect and
>> pinpoint" approach makes it useful to identify individual topic that
>> brings in a breakage.
>
> Sadly the many different merge bases[*1*] between `next` and `pu` (which
> are the obvious good/bad points for bisecting automatically) bring my
> build agents to its knees. I may have to disable the bisecting feature as
> a consequence.

Probably a less resource intensive approach is to find the tips of
the topics not in 'next' but in 'pu' and test them.  That would give
you which topic(s) are problematic, which is a better starting point
than "Oh, 'pu' does not build".  After identifying which branch is
problematic, bisection of individual topic would be of more manageable
size.

  $ git log --first-parent --oneline 'pu^{/^### match next}..pu'

will you the merges of topics left outside 'next'.  I often reorder
to make the ones that look more OK than others closer to the bottom,
and if the breakages caused by them are caught earlier than they hit
'next', that would be ideal.

This is one of these times I wish "git bisect --first-parent" were
available.  The above "log" gives me 27 merges right now, which
should be bisectable within 5 rounds to identify a single broken
topic (if there is only one breakage, that is).





Re: [git-for-windows] Re: Continuous Testing of Git on Windows

2017-02-14 Thread Johannes Schindelin
Hi,

On Mon, 13 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > That is why I taught the Git for Windows CI job that tests the four
> > upstream Git integration branches to *also* bisect test breakages and
> > then upload comments to the identified commit on GitHub
> 
> Good.  I do not think it is useful to try 'pu' as an aggregate and
> expect it to always build and work [*1*], but your "bisect and
> pinpoint" approach makes it useful to identify individual topic that
> brings in a breakage.

Sadly the many different merge bases[*1*] between `next` and `pu` (which
are the obvious good/bad points for bisecting automatically) bring my
build agents to its knees. I may have to disable the bisecting feature as
a consequence.

Ciao,
Johannes

Footnote *1*: There are currently 21, some of which stemming back from a
year ago. For bisecting, they all have to be tested individually, putting
a major dent into bisect's otherwise speedy process.


Re: missing handling of "No newline at end of file" in git am

2017-02-14 Thread Olaf Hering
Am Tue, 14 Feb 2017 12:40:36 -0800
schrieb Junio C Hamano :

> Olaf Hering  writes:
> 
> > How is git send-email and git am supposed to handle a text file
> > which lacks a newline at the very end? This is about git 2.11.0.  
> 
> I think this has always worked, though.

For me it complains in line 721, which is the problematic one.
I try to apply from mutt via (cd /some/dir && git am), but that
probably does not make a difference.

How would I debug it?

Olaf


Re: missing handling of "No newline at end of file" in git am

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 09:11:04PM +0100, Olaf Hering wrote:

> How is git send-email and git am supposed to handle a text file which
> lacks a newline at the very end? This is about git 2.11.0.

That workflow should handle this case, and the resulting applied patch
should not have a newline.

> Right now the patch in an email generated with 'git send-email' ends
> with '\ No newline at end of file', which 'git am' can not handle.  To
> me it looks like whatever variant of "diff" is used does the right thing
> and indicates the lack of newline. Just the used variant of "patch" does
> not deal with it.

I can't reproduce here:

  # new repo with nothing in it (the base commit is to have something to
  # reset back to)
  git init
  git commit --allow-empty -m base

  # our file with no trailing newline
  printf foo >file
  git add file
  git commit -m no-newline

  # now make a patch email; it should have the "\ No newline" bit at the
  # end.
  git format-patch -1
  cat 0001-no-newline.patch

  # and now reset back and try to apply it
  git reset --hard HEAD^
  git am 0001-no-newline.patch

  # double check that it has no newline
  xxd 

Re: missing handling of "No newline at end of file" in git am

2017-02-14 Thread Junio C Hamano
Olaf Hering  writes:

> How is git send-email and git am supposed to handle a text file which
> lacks a newline at the very end? This is about git 2.11.0.

I think this has always worked, though.

$ cd /var/tmp/x
$ git init am-incomplete-line
$ cd am-incomplete-line/
$ echo one line >file
$ git add file
$ git commit -a -m initial
[master (root-commit) 27b4668] initial
 1 file changed, 1 insertion(+)
 create mode 100644 file
$ echo -n an incomplete line >>file
$ git diff file
diff --git a/file b/file
index e3c0674..f2ec9f0 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 one line
+an incomplete line
\ No newline at end of file
$ git commit -a -m 'incomplete second'
[master 57075ab] incomplete second
 1 file changed, 1 insertion(+)
$ git format-patch -1
0001-incomplete-second.txt
$ cat 0001-incomplete-second.txt
From 57075ab402e2d3714ebc9e2e9d4efd8dbfd74d5a Mon Sep 17 00:00:00 2001
From: Junio C Hamano 
Date: Tue, 14 Feb 2017 12:35:50 -0800
Subject: [PATCH] incomplete second

---
 file | 1 +
 1 file changed, 1 insertion(+)

diff --git a/file b/file
index e3c0674..f2ec9f0 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 one line
+an incomplete line
\ No newline at end of file
-- 
2.12.0-rc1-235-g2fb706ef99
$ git checkout HEAD^
$ git am ./0001-incomplete-second.txt
Applying: incomplete second
$ git diff master
$ exit



[PATCH 2/2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR

2017-02-14 Thread Jeff King
From: Jonathan Nieder 

To push from or fetch to the current repository, remote helpers need
to know what repository that is.  Accordingly, Git sets the GIT_DIR
environment variable to the path to the current repository when
invoking remote helpers.

There is a special case it does not handle: "git ls-remote" and "git
archive --remote" can be run to inspect a remote repository without
being run from any local repository.  GIT_DIR is not useful in this
scenario:

- if we are not in a repository, we don't need to set GIT_DIR to
  override an existing GIT_DIR value from the environment.  If GIT_DIR
  is present then we would be in a repository if it were valid and
  would have called die() if it weren't.

- not setting GIT_DIR may cause a helper to do the usual discovery
  walk to find the repository.  But we know we're not in one, or we
  would have found it ourselves.  So in the worst case it may expend
  a little extra effort to try to find a repository and fail (for
  example, remote-curl would do this to try to find repository-level
  configuration).

So leave GIT_DIR unset in this case.  This makes GIT_DIR easier to
understand for remote helper authors and makes transport code less of
a special case for repository discovery.

Noticed using b1ef400e (setup_git_env: avoid blind fall-back to
".git", 2016-10-20) from 'next':

 $ cd /tmp
 $ git ls-remote https://kernel.googlesource.com/pub/scm/git/git
 fatal: BUG: setup_git_env called without repository

Helped-by: Jeff King 
Signed-off-by: Jonathan Nieder 
Signed-off-by: Jeff King 
---
I dropped this down to a single test instance, and used the nongit
helper to shorten it.

Possible patches on top:

  - if we want to test this across more protocols, we can. I'm not sure
I see all that much value in it, given that we know the source of
the bug.

We probably _should_ have some kind of standard test-battery that
hits all protocols, or at the very least hits both dumb/smart http.
But waiting for that may be making the perfect the enemy of the
good. So I'm OK with doing it piece-wise for now if people really
feel we need to cover more protocols.

  - Jonathan's original had some nice remote-ext tests, but they were
sufficiently complex that they should be spun into their own patch
with more explanation.

 t/t5550-http-fetch-dumb.sh | 9 +
 transport-helper.c | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index aeb3a63f7..b69ece1d6 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -34,6 +34,15 @@ test_expect_success 'clone http repository' '
test_cmp file clone/file
 '
 
+test_expect_success 'list refs from outside any repository' '
+   cat >expect <<-EOF &&
+   $(git rev-parse master) HEAD
+   $(git rev-parse master) refs/heads/master
+   EOF
+   nongit git ls-remote "$HTTPD_URL/dumb/repo.git" >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'create password-protected repository' '
mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" &&
cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
diff --git a/transport-helper.c b/transport-helper.c
index 91aed35eb..e4fd98238 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -124,8 +124,9 @@ static struct child_process *get_helper(struct transport 
*transport)
helper->git_cmd = 0;
helper->silent_exec_failure = 1;
 
-   argv_array_pushf(>env_array, "%s=%s", GIT_DIR_ENVIRONMENT,
-get_git_dir());
+   if (have_git_dir())
+   argv_array_pushf(>env_array, "%s=%s",
+GIT_DIR_ENVIRONMENT, get_git_dir());
 
code = start_command(helper);
if (code < 0 && errno == ENOENT)
-- 
2.12.0.rc1.479.g59880b11e


Re: [RFC PATCH] show decorations at the end of the line

2017-02-14 Thread Stephan Beyer
Hi,

On 02/13/2017 09:30 AM, Junio C Hamano wrote:
> Linus Torvalds  writes:
> 
>> On Sat, Feb 11, 2017 at 10:02 AM, Linus Torvalds
>>  wrote:
>>>
>>> I've signed off on this, because I think it's an "obvious" improvement,
>>> but I'm putting the "RFC" in the subject line because this is clearly a
>>> subjective thing.
>>
>> Side note: the one downside of showing the decorations at the end of
>> the line is that now they are obviously at the end of the line - and
>> thus likely to be more hidden by things like line truncation.
> 
> Side note: I refrained from commenting on this patch because
> everybody knows that the what I would say anyway ;-) and I didn't
> want to speak first to discourage others from raising their opinion.

A further side note: the current behavior of

git log --oneline --decorate

is equivalent to

git log --pretty='format:%C(auto)%h%d %s'

and Linus' preferred version is equivalent to

git log --pretty='format:%C(auto)%h %s%d'

Most Git users I know have their own favorite version of git log
--pretty=format:... sometimes with --graph as an alias ("git lg" or "git
logk" (because its output reminds of gitk) or something).

I don't know what the main benefit of this patch would be, but if it
gets accepted, it should probably be mentioned somewhere that the old
behavior is easily accessible using the line mentioned above.

Cheers
Stephan


[PATCH 1/2] remote: avoid reading $GIT_DIR config in non-repo

2017-02-14 Thread Jeff King
The "git ls-remote" command can be run outside of a
repository, but needs to look up configured remotes. The
config code is smart enough to handle this case itself, but
we also check the historical "branches" and "remotes" paths
in $GIT_DIR. The git_path() function causes us to blindly
look at ".git/remotes", even if we know we aren't in a git
repository.

For now, this is just an unlikely bug (you probably don't
have such a file if you're not in a repository), but it will
become more obvious once we merge b1ef400ee (setup_git_env:
avoid blind fall-back to ".git", 2016-10-20):

  [now]
  $ git ls-remote
  fatal: No remote configured to list refs from.

  [with b1ef400ee]
  $ git ls-remote
  fatal: BUG: setup_git_env called without repository

We can fix this by skipping these sources entirely when
we're outside of a repository.

The test is a little more complex than the demonstration
above. Rather than detect the correct behavior by parsing
the error message, we can actually set up a case where the
remote name we give is a valid repository, but b1ef400ee
would cause us to die in the configuration step.

This test doesn't fail now, but it future-proofs us for the
b1ef400ee change.

Signed-off-by: Jeff King 
---
 remote.c | 2 +-
 t/t5512-ls-remote.sh | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index bf1bf2309..9f83fe2c4 100644
--- a/remote.c
+++ b/remote.c
@@ -693,7 +693,7 @@ static struct remote *remote_get_1(const char *name,
name = get_default(current_branch, _given);
 
ret = make_remote(name, 0);
-   if (valid_remote_nick(name)) {
+   if (valid_remote_nick(name) && have_git_dir()) {
if (!valid_remote(ret))
read_remotes_file(ret);
if (!valid_remote(ret))
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 55fc83fc0..94fc9be9c 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -248,4 +248,13 @@ test_expect_success PIPE,JGIT,GIT_DAEMON 'indicate no refs 
in standards-complian
test_expect_code 2 git ls-remote --exit-code 
git://localhost:$JGIT_DAEMON_PORT/empty.git
 '
 
+test_expect_success 'ls-remote works outside repository' '
+   # It is important for this repo to be inside the nongit
+   # area, as we want a repo name that does not include
+   # slashes (because those inhibit some of our configuration
+   # lookups).
+   nongit git init --bare dst.git &&
+   nongit git ls-remote dst.git
+'
+
 test_done
-- 
2.12.0.rc1.479.g59880b11e



Re: [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 11:08:05AM -0800, Junio C Hamano wrote:

> Thanks for prodding.  I'm tempted to just rip out everything other
> than the 5-liner fix to transport.c and apply it, expecting that a
> follow-up patch with updated tests to come sometime not in too
> distant future.

I think we can at least include one basic test. Also, as it turns out
the problem I was seeing _wasn't_ the same one Jonathan fixed. There's
another bug. :)

So here's a patch series that fixes my bug, and then a cut-down version
of Jonathan's patch. I'd be happy to see more tests come on top. I don't
think there's a huge rush on getting any of this into master. There are
bugs in the existing code, but they're very hard to trigger in practice
(e.g., a non-repo which happens to have a bunch of repo-like files). It
only becomes an issue in 'next' when we die("BUG") to flush these cases
out.

  [1/2]: remote: avoid reading $GIT_DIR config in non-repo
  [2/2]: remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR

 remote.c   | 2 +-
 t/t5512-ls-remote.sh   | 9 +
 t/t5550-http-fetch-dumb.sh | 9 +
 transport-helper.c | 5 +++--
 4 files changed, 22 insertions(+), 3 deletions(-)

-Peff


missing handling of "No newline at end of file" in git am

2017-02-14 Thread Olaf Hering
How is git send-email and git am supposed to handle a text file which
lacks a newline at the very end? This is about git 2.11.0.

Right now the patch in an email generated with 'git send-email' ends
with '\ No newline at end of file', which 'git am' can not handle.  To
me it looks like whatever variant of "diff" is used does the right thing
and indicates the lack of newline. Just the used variant of "patch" does
not deal with it.


Olaf


signature.asc
Description: PGP signature


Re: [PATCH] show-branch: fix crash with long ref name

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 11:35:48AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > This fixes the problem, but I think we can simplify it quite a bit by
> > using resolve_refdup(). Here's the patch series I ended up with:
> >
> >   [1/3]: show-branch: drop head_len variable
> >   [2/3]: show-branch: store resolved head in heap buffer
> >   [3/3]: show-branch: use skip_prefix to drop magic numbers
> >
> >  builtin/show-branch.c | 39 ---
> >  1 file changed, 12 insertions(+), 27 deletions(-)
> 
> Yes, the whole thing is my fault ;-) and I agree with what these
> patches do.
> 
> The second one lacks free(head) but I think that is OK; it is
> something we allocate in cmd_*() and use pretty much thruout the
> rest of the program.

Yes, I actually tested the whole thing under ASAN (which was necessary
to notice the problem), which complained about the leak. I don't mind
adding a free(head), but there are a bunch of similar "leaks" in that
function, so I didn't bother.

I notice Christian's patch added a few tests. I don't know if we'd want
to squash them in (I didn't mean to override his patch at all; I was
about to send mine out when I noticed his, and I wondered if we wanted
to combine the two efforts).

-Peff


Re: [PATCH 3/3] show-branch: use skip_prefix to drop magic numbers

2017-02-14 Thread Jeff King
On Wed, Feb 15, 2017 at 12:23:52AM +0530, Pranit Bauva wrote:

> Did you purposely miss the one in line number 278 of
> builtin/show-branch.c because I think you only touched up the parts
> which were related to "refs/" but didn't explicitly mention it in the
> commit message?
> 
> if (starts_with(pretty_str, "[PATCH] "))
> pretty_str += 8;

Not purposely. I was just fixing up the bits that I had noticed in the
earlier patches.

> diff --git a/builtin/show-branch.c b/builtin/show-branch.c
> index c03d3ec7c..19756595d 100644
> --- a/builtin/show-branch.c
> +++ b/builtin/show-branch.c
> @@ -275,8 +275,7 @@ static void show_one_commit(struct commit *commit, int 
> no_name)
>   pp_commit_easy(CMIT_FMT_ONELINE, commit, );
>   pretty_str = pretty.buf;
>   }
> - if (starts_with(pretty_str, "[PATCH] "))
> - pretty_str += 8;
> + skip_prefix(pretty_str, "[PATCH] ", _str);

Yeah, I agree this the same type of fix and is worth including. Thanks!

-Peff


Re: [PATCH 5/7] grep: fix "--" rev/pathspec disambiguation

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 10:56:21AM -0800, Brandon Williams wrote:

> On 02/14, Jeff King wrote:
> > -   /* Check revs and then paths */
> > +   /*
> > +* We have to find "--" in a separate pass, because its presence
> > +* influences how we will parse arguments that come before it.
> > +*/
> > +   for (i = 0; i < argc; i++) {
> > +   if (!strcmp(argv[i], "--")) {
> > +   seen_dashdash = 1;
> > +   break;
> > +   }
> > +   }
> 
> So this simply checks if "--" is an argument that was provided.  This
> then allows grep to know ahead of time how to handle revs/paths
> preceding a "--" or in the absences of the "--".  Seems sensible to me.

By the way, we have to check again later for "--" when parsing the revs
themselves. In theory you could set seen_dashdash to the offset of the
dashdash in the array, and do the iteration more like:

  for (i = 0; i < dashdash_pos; i++)
  handle_rev(argv[i]);
  for (i = dashdash_pos + 1; i < argc; i++)
  handle_path(argv[i]);

But our loops also handle the case where there is no "--" at all, and I
think that approach ends up convoluting the logic. I didn't go very far
in that direction before giving it up, though.

-Peff


Re: [PATCH] show-branch: fix crash with long ref name

2017-02-14 Thread Junio C Hamano
Jeff King  writes:

> This fixes the problem, but I think we can simplify it quite a bit by
> using resolve_refdup(). Here's the patch series I ended up with:
>
>   [1/3]: show-branch: drop head_len variable
>   [2/3]: show-branch: store resolved head in heap buffer
>   [3/3]: show-branch: use skip_prefix to drop magic numbers
>
>  builtin/show-branch.c | 39 ---
>  1 file changed, 12 insertions(+), 27 deletions(-)

Yes, the whole thing is my fault ;-) and I agree with what these
patches do.

The second one lacks free(head) but I think that is OK; it is
something we allocate in cmd_*() and use pretty much thruout the
rest of the program.

Thanks.


Re: [PATCH 5/7] grep: fix "--" rev/pathspec disambiguation

2017-02-14 Thread Junio C Hamano
Jeff King  writes:

> If we see "git grep pattern rev -- file" then we apply the
> usual rev/pathspec disambiguation rules: any "rev" before
> the "--" must be a revision, and we do not need to apply the
> verify_non_filename() check.
>
> But there are two bugs here:
>
>   1. We keep a seen_dashdash flag to handle this case, but
>  we set it in the same left-to-right pass over the
>  arguments in which we parse "rev".
>
>  So when we see "rev", we do not yet know that there is
>  a "--", and we mistakenly complain if there is a
>  matching file.
>
>  We can fix this by making a preliminary pass over the
>  arguments to find the "--", and only then checking the rev
>  arguments.
>
>   2. If we can't resolve "rev" but there isn't a dashdash,
>  that's OK. We treat it like a path, and complain later
>  if it doesn't exist.
>
>  But if there _is_ a dashdash, then we know it must be a
>  rev, and should treat it as such, complaining if it
>  does not resolve. The current code instead ignores it
>  and tries to treat it like a path.
>
> This patch fixes both bugs, and tries to comment the parsing
> flow a bit better.

Good.  Thanks.


Re: [PATCH v2] remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR

2017-02-14 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Dec 29, 2016 at 07:48:45PM -0500, Jeff King wrote:
>
>> On Thu, Dec 29, 2016 at 04:11:14PM -0800, Jonathan Nieder wrote:
>> 
>> > Thanks.  Here's the patch again, now with commit messages and a test.
>> > Thanks for the analysis and sorry for the slow turnaround.
>> 
>> Thanks for following up. While working on a similar one recently, I had
>> the nagging feeling that there was a case we had found that was still to
>> be dealt with, and I think this was it. :)
>> 
>> The patch to the C code looks good. I have a few comments on the tests:
>
> I happened to run into this problem myself today, so I thought I'd prod
> you. I think your patch looks good. Hopefully I didn't scare you off
> with my comments. :)

Thanks for prodding.  I'm tempted to just rip out everything other
than the 5-liner fix to transport.c and apply it, expecting that a
follow-up patch with updated tests to come sometime not in too
distant future.



Re: [PATCH] mingw: make stderr unbuffered again

2017-02-14 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 14.02.2017 um 15:47 schrieb Johannes Schindelin:
>
>>>From my point of view, it is not crucial. The next Git for Windows version
>> will have it, of course, and Hannes is always running with his set of
>> patches, he can easily cherry-pick this one, too.
>
> The patch fixes the problem for me here.
>
> I am a bit hesitant to call it "not crucial": The symptom I observed
> is certainly not a show stopper; but who knows where else it is
> important that stderr goes to the terminal earlier than other
> output. As it is a new regression, it should be included in the next
> release.

I tend to agree with you on the "not crucial" bit.

I applied the patch with "Tested-by: Johannes Sixt "
on top of 1d3f065e0e ("mingw: follow-up to "replace isatty() hack"",
2017-01-18), which was the tip of js/mingw-isatty topic that was
merged to both master and maint during this cycle.  Unless I hear
the "fix" turns out to be undesirable in the coming few days, let's
plan to merge it to 'master' by the final.

Thanks.



Re: [PATCH 5/7] grep: fix "--" rev/pathspec disambiguation

2017-02-14 Thread Brandon Williams
On 02/14, Jeff King wrote:
> - /* Check revs and then paths */
> + /*
> +  * We have to find "--" in a separate pass, because its presence
> +  * influences how we will parse arguments that come before it.
> +  */
> + for (i = 0; i < argc; i++) {
> + if (!strcmp(argv[i], "--")) {
> + seen_dashdash = 1;
> + break;
> + }
> + }

So this simply checks if "--" is an argument that was provided.  This
then allows grep to know ahead of time how to handle revs/paths
preceding a "--" or in the absences of the "--".  Seems sensible to me.

> +
> + /*
> +  * Resolve any rev arguments. If we have a dashdash, then everything up
> +  * to it must resolve as a rev. If not, then we stop at the first
> +  * non-rev and assume everything else is a path.
> +  */
>   for (i = 0; i < argc; i++) {
>   const char *arg = argv[i];
>   unsigned char sha1[20];
> @@ -1158,13 +1173,14 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
>  
>   if (!strcmp(arg, "--")) {
>   i++;
> - seen_dashdash = 1;
>   break;
>   }
>  
> - /* Stop at the first non-rev */
> - if (get_sha1_with_context(arg, 0, sha1, ))
> + if (get_sha1_with_context(arg, 0, sha1, )) {
> + if (seen_dashdash)
> + die(_("unable to resolve revision: %s"), arg);
>   break;
> + }
>  
>   object = parse_object_or_die(sha1, arg);
>   if (!seen_dashdash)
> @@ -1172,7 +1188,10 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
>   add_object_array_with_path(object, arg, , oc.mode, 
> oc.path);
>   }
>  
> - /* The rest are paths */
> + /*
> +  * Anything left over is presumed to be a path. But in the non-dashdash
> +  * "do what I mean" case, we verify and complain when that isn't true.
> +  */

-- 
Brandon Williams


Re: [PATCH] mingw: make stderr unbuffered again

2017-02-14 Thread Junio C Hamano
Johannes Schindelin  writes:

>> OK.  Should this go directly to 'master', as the isatty thing is
>> already in?
>
> From my point of view, it is not crucial. The next Git for Windows version
> will have it, of course, and Hannes is always running with his set of
> patches, he can easily cherry-pick this one, too.

What hat were you wearing when you said the above "From my point of
view"?  Were you the "Git for Windows" maintainer, or were you a
contributor and a member of the Git development community that works
to improve the upstream?  If it was not clear, I was asking the
question to you wearing the latter hat.

To put it differently, what is our position, as the upstream
developers, toward those who are on Windows and wants to build from
the source?  It's not just Hannes.

Is our position "unless you are extremely competent and are willing
to cherry-pick missing things from Git for Windows tree as needed,
we recommend you to build Git for Windows source instead"?

It is inevitable that the upstream lags behind in Windows specific
changes.  Even though you have been trickling Windows specific
changes (both things in compat/ and also changes to the generic
parts, updating "c == '/'" to "is_dir_sep(c)") in, and I have been
accepting them for the past few years, in order to reduce the size
of the patch pile Git for Windows needs on top of the upstream,
until the patch pile becomes empty, it will always be the case.

So I won't object if that were our position.  I just need to know
what it is to adjust my expectations, and as far as I am concerned,
you and Hannes are the two people whose thinking I'd trust regarding
what to do with/to Windows users; even though I keep saying "our"
position, I am asking yours and Hannes's.

Thanks.



Re: [PATCH 4/7] grep: re-order rev-parsing loop

2017-02-14 Thread Brandon Williams
On 02/14, Jeff King wrote:
> - /* Is it a rev? */
> - if (!get_sha1_with_context(arg, 0, sha1, )) {
> - struct object *object = parse_object_or_die(sha1, arg);
> - if (!seen_dashdash)
> - verify_non_filename(prefix, arg);
> - add_object_array_with_path(object, arg, , oc.mode, 
> oc.path);
> - continue;
> - }
> - break;
> +
> + /* Stop at the first non-rev */
> + if (get_sha1_with_context(arg, 0, sha1, ))
> + break;
> +
> + object = parse_object_or_die(sha1, arg);
> + if (!seen_dashdash)
> + verify_non_filename(prefix, arg);
> + add_object_array_with_path(object, arg, , oc.mode, 
> oc.path);

This is much more readable!

-- 
Brandon Williams


Re: [PATCH 3/3] show-branch: use skip_prefix to drop magic numbers

2017-02-14 Thread Pranit Bauva
Hey Peff,

On Tue, Feb 14, 2017 at 10:58 PM, Jeff King  wrote:
> We make several starts_with() calls, only to advance
> pointers. This is exactly what skip_prefix() is for, which
> lets us avoid manually-counted magic numbers.
>
> Signed-off-by: Jeff King 
> ---
>  builtin/show-branch.c | 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/builtin/show-branch.c b/builtin/show-branch.c
> index 404c4d09a..c03d3ec7c 100644
> --- a/builtin/show-branch.c
> +++ b/builtin/show-branch.c
> @@ -470,17 +470,14 @@ static void snarf_refs(int head, int remotes)
> }
>  }
>
> -static int rev_is_head(char *head, char *name,
> +static int rev_is_head(const char *head, const char *name,
>unsigned char *head_sha1, unsigned char *sha1)
>  {
> if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
> return 0;
> -   if (starts_with(head, "refs/heads/"))
> -   head += 11;
> -   if (starts_with(name, "refs/heads/"))
> -   name += 11;
> -   else if (starts_with(name, "heads/"))
> -   name += 6;
> +   skip_prefix(head, "refs/heads/", );
> +   if (!skip_prefix(name, "refs/heads/", ))
> +   skip_prefix(name, "heads/", );
> return !strcmp(head, name);
>  }
>
> @@ -799,8 +796,9 @@ int cmd_show_branch(int ac, const char **av, const char 
> *prefix)
> has_head++;
> }
> if (!has_head) {
> -   int offset = starts_with(head, "refs/heads/") ? 11 : 
> 0;
> -   append_one_rev(head + offset);
> +   const char *name = head;
> +   skip_prefix(name, "refs/heads/", );
> +   append_one_rev(name);
> }
> }
>


Did you purposely miss the one in line number 278 of
builtin/show-branch.c because I think you only touched up the parts
which were related to "refs/" but didn't explicitly mention it in the
commit message?

if (starts_with(pretty_str, "[PATCH] "))
pretty_str += 8;

If not, you can squash this patch attached. Sorry, couldn't send it in
mail because of proxy issues.

Regards,
Pranit Bauva
From 2e80d4458df659765011450621ee34459dc749f9 Mon Sep 17 00:00:00 2001
From: Pranit Bauva 
Date: Tue, 14 Feb 2017 23:53:36 +0530
Subject: [PATCH] !squash: show-branch: use skip_prefix to drop magic numbers

Signed-off-by: Pranit Bauva 
---
 builtin/show-branch.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index c03d3ec7c..19756595d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -275,8 +275,7 @@ static void show_one_commit(struct commit *commit, int no_name)
 		pp_commit_easy(CMIT_FMT_ONELINE, commit, );
 		pretty_str = pretty.buf;
 	}
-	if (starts_with(pretty_str, "[PATCH] "))
-		pretty_str += 8;
+	skip_prefix(pretty_str, "[PATCH] ", _str);
 
 	if (!no_name) {
 		if (name && name->head_name) {
-- 
2.11.0



Re: [PATCH 1/7] grep: move thread initialization a little lower

2017-02-14 Thread Brandon Williams
On 02/14, Jeff King wrote:
> Originally, we set up the threads for grep before parsing
> the non-option arguments. In 53b8d931b (grep: disable
> threading in non-worktree case, 2011-12-12), the thread code
> got bumped lower in the function because it now needed to
> know whether we got any revision arguments.
> 
> That put a big block of code in between the parsing of revs
> and the parsing of pathspecs, both of which share some loop
> variables. That makes it harder to read the code than the
> original, where the shared loops were right next to each
> other.
> 
> Let's bump the thread initialization until after all of the
> parsing is done.
> 
> Signed-off-by: Jeff King 
> ---
> I double-checked to make sure no other code was relying on
> the thread setup having happened. I think we could actually
> bump it quite a bit lower (to right before we actually start
> grepping), but I doubt it matters much in practice.

Looks good.  And yes I don't believe anything needs the thread
initialization to happen earlier.

-- 
Brandon Williams


Re: [PATCH] mingw: make stderr unbuffered again

2017-02-14 Thread Johannes Sixt

Am 14.02.2017 um 15:47 schrieb Johannes Schindelin:

On Mon, 13 Feb 2017, Junio C Hamano wrote:

Johannes Schindelin  writes:

When removing the hack for isatty(), we actually removed more than just
an isatty() hack: we removed the hack where internal data structures of
the MSVC runtime are modified in order to redirect stdout/stderr.

Instead of using that hack (that does not work with newer versions of
the runtime, anyway), we replaced it by reopening the respective file
descriptors.

What we forgot was to mark stderr as unbuffered again.


I do not see how the earlier patch turned stderr from unbuffered to 
buffered, as it did not add or remove any setvbuf() call. Can you explain?



OK.  Should this go directly to 'master', as the isatty thing is
already in?



From my point of view, it is not crucial. The next Git for Windows version

will have it, of course, and Hannes is always running with his set of
patches, he can easily cherry-pick this one, too.


The patch fixes the problem for me here.

I am a bit hesitant to call it "not crucial": The symptom I observed is 
certainly not a show stopper; but who knows where else it is important 
that stderr goes to the terminal earlier than other output. As it is a 
new regression, it should be included in the next release.


Thanks,
-- Hannes



Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API

2017-02-14 Thread Stefan Beller
On Tue, Feb 14, 2017 at 2:04 AM, Duy Nguyen  wrote:
> On Tue, Feb 14, 2017 at 6:55 AM, Stefan Beller  wrote:
>>>
>>> +/*
>>> + * Return the ref_store instance for the specified submodule. For the
>>> + * main repository, use submodule==NULL; such a call cannot fail.
>>
>> So now we have both a get_main as well as a get_submodule function,
>> but the submodule function can return the main as well?
>>
>> I'd rather see this as a BUG; or asking another way:
>> What is the difference between get_submodule_ref_store(NULL)
>> and get_main_ref_store() ?
>
> Technical debts :) In order to do that, we need to make sure
> _submodule() never takes NULL as main repo. But current code does. On
> example is revision.c where submodule==NULL is the main repo. In the
> end, I agree that get_submodule_ref_store(NULL) should be a bug.
>
>> As you went through all call sites (by renaming the function), we'd
>> be able to tell that there is no caller with NULL, or is it?
>
> Direct call sites are just middle men though, e.g.
> for_each_ref_submodule(). I'll attempt to clean this up at some point
> in future (probably at the same time I attempt to kill *_submodule ref
> api). But I think for now I'll just put a TODO or FIXME comment here.

ok, fine with me.

My line of thinking when originally responding was to put the FIXME
comment at the caller site, and there we'd differentiate between the
two cases and call the appropriate function.

> --
> Duy


Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API

2017-02-14 Thread Junio C Hamano
Duy Nguyen  writes:

> Direct call sites are just middle men though, e.g.
> for_each_ref_submodule(). I'll attempt to clean this up at some point
> in future (probably at the same time I attempt to kill *_submodule ref
> api). But I think for now I'll just put a TODO or FIXME comment here.

So we'd have get_*_ref_store() for various cases and then will have
a unifying for_each_ref_in_refstore() that takes a ref-store, which
supersedes all the ad-hoc iterators like for_each_ref_submodule(),
for_each_namespaced_ref(), etc?

That's a very sensible longer-term goal, methinks.

I am wondering what we should do to for_each_{tag,branch,...}_ref().  
Would that become

ref_store = get_ref_main_store();
tag_ref_store = filter_ref_store(ref_store, "refs/tags/");
for_each_ref_in_refstore(tag_ref_store, ...);

or do you plan to have some other pattern?


Hello.

2017-02-14 Thread university
Hello,


Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case

2017-02-14 Thread Jonathan Tan

On 02/14/2017 10:04 AM, Jeff King wrote:

On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote:


On 02/13/2017 10:07 PM, Jeff King wrote:

diff --git a/builtin/grep.c b/builtin/grep.c
index e83b33bda..c4c632594 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
break;
}

+   if (!use_index) {
+   if (seen_dashdash)
+   die(_("--no-index cannot be used with revs"));


There is a subsequent check that prints "--no-index or --untracked cannot be
used with revs." - maybe we should just expand this part to incorporate that
case. (That is, write `if (!use_index || untracked)` instead of `if
(!use_index)`.) This also allows us to preserve the error message, which
might be useful for someone using a translated version of Git.


I wasn't sure if we wanted to treat "untracked" in the same way.
Certainly we can catch the error here for the seen_dashdash case, but is
it also the case that:

  echo content >master
  git grep --untracked pattern master

should treat "master" as a path?

-Peff


It is already the case that it cannot be treated as a rev:

  $ git grep --untracked pattern master
  fatal: --no-index or --untracked cannot be used with revs.

So I think it would be better if it was treated as a path, for 
consistency with --no-index. I'm OK either way, though.


git-p4.py caching

2017-02-14 Thread Martin-Louis Bright
hi!

I am using git-p4.py to migrate a lot of medium and large Perforce
depots into git. I almost exclusively go one way: from Perforce to
git. I also frequently re-clone/re-migrate as the Perforce migration
client spec is refined.

For this, I have added rudimentary caching in git-p4.py so that
Perforce requests are not repeated over the network.

I have it working and tested on a ~150MB repo and the migration time was halved.

Is this something that would be of interest to the larger community?

--Martin


Re: [PATCH v2] clean: use warning_errno() when appropriate

2017-02-14 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> All these warning() calls are preceded by a system call. Report the
> actual error to help the user understand why we fail to remove
> something.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  v2 dances with errno

Thanks.

>
>  builtin/clean.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index d6bc3aaae..3569736f6 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -154,6 +154,7 @@ static int remove_dirs(struct strbuf *path, const char 
> *prefix, int force_flag,
>   struct strbuf quoted = STRBUF_INIT;
>   struct dirent *e;
>   int res = 0, ret = 0, gone = 1, original_len = path->len, len;
> + int saved_errno;
>   struct string_list dels = STRING_LIST_INIT_DUP;
>  
>   *dir_gone = 1;
> @@ -173,9 +174,11 @@ static int remove_dirs(struct strbuf *path, const char 
> *prefix, int force_flag,
>   if (!dir) {
>   /* an empty dir could be removed even if it is unreadble */
>   res = dry_run ? 0 : rmdir(path->buf);
> + saved_errno = errno;
>   if (res) {
>   quote_path_relative(path->buf, prefix, );

I think this part should be more like

res = ... : rmdir(...);
if (res) {
int saved_errno = errno;
... do other things that can touch errno ...
errno = saved_errno;
... now we know what the original error was ...

The reason to store the errno in saved_errno here is not because we
want to help code after "if (res) {...}", but the patch sent as-is
gives that impression and is confusing to the readers.  

Perhaps all hunks of this patch share the same issue?  I could
locally amend, of course, but I'd like to double check before doing
so myself---perhaps you did it this way for a good reason that I am
missing?


Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote:

> On 02/13/2017 10:07 PM, Jeff King wrote:
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index e83b33bda..c4c632594 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char 
> > *prefix)
> > break;
> > }
> > 
> > +   if (!use_index) {
> > +   if (seen_dashdash)
> > +   die(_("--no-index cannot be used with revs"));
> 
> There is a subsequent check that prints "--no-index or --untracked cannot be
> used with revs." - maybe we should just expand this part to incorporate that
> case. (That is, write `if (!use_index || untracked)` instead of `if
> (!use_index)`.) This also allows us to preserve the error message, which
> might be useful for someone using a translated version of Git.

I wasn't sure if we wanted to treat "untracked" in the same way.
Certainly we can catch the error here for the seen_dashdash case, but is
it also the case that:

  echo content >master
  git grep --untracked pattern master

should treat "master" as a path?

-Peff


Re: [PATCH 06/11] refs-internal.h: correct is_per_worktree_ref()

2017-02-14 Thread Stefan Beller
On Tue, Feb 14, 2017 at 1:40 AM, Duy Nguyen  wrote:
> On Tue, Feb 14, 2017 at 5:37 AM, Stefan Beller  wrote:
>> On Mon, Feb 13, 2017 at 7:20 AM, Nguyễn Thái Ngọc Duy  
>> wrote:
>>> All refs outside refs/ directory is per-worktree, not just HEAD.
>>>
>>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>>> ---
>>>  refs/refs-internal.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>>> index f4aed49f5..69d02b6ba 100644
>>> --- a/refs/refs-internal.h
>>> +++ b/refs/refs-internal.h
>>> @@ -653,7 +653,7 @@ const char *resolve_ref_recursively(struct ref_store 
>>> *refs,
>>>
>>>  static inline int is_per_worktree_ref(const char *refname)
>>>  {
>>> -   return !strcmp(refname, "HEAD") ||
>>> +   return !starts_with(refname, "refs/") ||
>>> starts_with(refname, "refs/bisect/");
>>
>> you're loosing HEAD here? (assuming we get HEAD in
>> short form here, as well as long form refs/HEAD)
>
> I don't understand. if refname is HEAD then both !strcmp(...) and
> !starts_with(refname, "refs/") return 1. If it's refs/HEAD, both
> return 0. In other words, there's no functional changes?

eh, my bad. You're right.


[PATCH 3/3] show-branch: use skip_prefix to drop magic numbers

2017-02-14 Thread Jeff King
We make several starts_with() calls, only to advance
pointers. This is exactly what skip_prefix() is for, which
lets us avoid manually-counted magic numbers.

Signed-off-by: Jeff King 
---
 builtin/show-branch.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 404c4d09a..c03d3ec7c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,17 +470,14 @@ static void snarf_refs(int head, int remotes)
}
 }
 
-static int rev_is_head(char *head, char *name,
+static int rev_is_head(const char *head, const char *name,
   unsigned char *head_sha1, unsigned char *sha1)
 {
if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
return 0;
-   if (starts_with(head, "refs/heads/"))
-   head += 11;
-   if (starts_with(name, "refs/heads/"))
-   name += 11;
-   else if (starts_with(name, "heads/"))
-   name += 6;
+   skip_prefix(head, "refs/heads/", );
+   if (!skip_prefix(name, "refs/heads/", ))
+   skip_prefix(name, "heads/", );
return !strcmp(head, name);
 }
 
@@ -799,8 +796,9 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
has_head++;
}
if (!has_head) {
-   int offset = starts_with(head, "refs/heads/") ? 11 : 0;
-   append_one_rev(head + offset);
+   const char *name = head;
+   skip_prefix(name, "refs/heads/", );
+   append_one_rev(name);
}
}
 
-- 
2.12.0.rc1.479.g59880b11e


[PATCH 2/3] show-branch: store resolved head in heap buffer

2017-02-14 Thread Jeff King
We resolve HEAD and copy the result to a fixed-size buffer
with memcpy, never checking that it actually fits. This bug
dates back to 8098a178b (Add git-symbolic-ref, 2005-09-30).
Before that we used readlink(), which took a maximum buffer
size.

We can fix this by using resolve_refdup(), which duplicates
the buffer on the heap. That also lets us just check
for a NULL pointer to see if we have resolved HEAD, and
drop the extra head_p variable.

Signed-off-by: Jeff King 
---
 builtin/show-branch.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e4c488b8c..404c4d09a 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -473,8 +473,7 @@ static void snarf_refs(int head, int remotes)
 static int rev_is_head(char *head, char *name,
   unsigned char *head_sha1, unsigned char *sha1)
 {
-   if ((!head[0]) ||
-   (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
+   if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
return 0;
if (starts_with(head, "refs/heads/"))
head += 11;
@@ -620,8 +619,7 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
int all_heads = 0, all_remotes = 0;
int all_mask, all_revs;
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
-   char head[128];
-   const char *head_p;
+   char *head;
struct object_id head_oid;
int merge_base = 0;
int independent = 0;
@@ -786,17 +784,10 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
snarf_refs(all_heads, all_remotes);
}
 
-   head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
-   head_oid.hash, NULL);
-   if (head_p) {
-   size_t head_len = strlen(head_p);
-   memcpy(head, head_p, head_len + 1);
-   }
-   else {
-   head[0] = 0;
-   }
+   head = resolve_refdup("HEAD", RESOLVE_REF_READING,
+ head_oid.hash, NULL);
 
-   if (with_current_branch && head_p) {
+   if (with_current_branch && head) {
int has_head = 0;
for (i = 0; !has_head && i < ref_name_cnt; i++) {
/* We are only interested in adding the branch
-- 
2.12.0.rc1.479.g59880b11e



[PATCH 1/3] show-branch: drop head_len variable

2017-02-14 Thread Jeff King
We copy the result of resolving HEAD into a buffer and keep
track of its length.  But we never actually use the length
for anything besides the copy. Let's stop passing it around.

Signed-off-by: Jeff King 
---
 builtin/show-branch.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403a..e4c488b8c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,7 +470,7 @@ static void snarf_refs(int head, int remotes)
}
 }
 
-static int rev_is_head(char *head, int headlen, char *name,
+static int rev_is_head(char *head, char *name,
   unsigned char *head_sha1, unsigned char *sha1)
 {
if ((!head[0]) ||
@@ -622,7 +622,6 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
char head[128];
const char *head_p;
-   int head_len;
struct object_id head_oid;
int merge_base = 0;
int independent = 0;
@@ -790,11 +789,10 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
head_oid.hash, NULL);
if (head_p) {
-   head_len = strlen(head_p);
+   size_t head_len = strlen(head_p);
memcpy(head, head_p, head_len + 1);
}
else {
-   head_len = 0;
head[0] = 0;
}
 
@@ -805,7 +803,6 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
 * HEAD points at.
 */
if (rev_is_head(head,
-   head_len,
ref_name[i],
head_oid.hash, NULL))
has_head++;
@@ -866,7 +863,6 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
for (i = 0; i < num_rev; i++) {
int j;
int is_head = rev_is_head(head,
- head_len,
  ref_name[i],
  head_oid.hash,
  rev[i]->object.oid.hash);
-- 
2.12.0.rc1.479.g59880b11e



Re: [PATCH] show-branch: fix crash with long ref name

2017-02-14 Thread Jeff King
On Tue, Feb 14, 2017 at 04:48:16PM +0100, Christian Couder wrote:

> @@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char 
> *prefix)
>   head_oid.hash, NULL);
>   if (head_p) {
>   head_len = strlen(head_p);
> - memcpy(head, head_p, head_len + 1);
> + head_cpy = xstrdup(head_p);
>   }
>   else {
>   head_len = 0;
> - head[0] = 0;
> + head_cpy = xstrdup("");
>   }

This fixes the problem, but I think we can simplify it quite a bit by
using resolve_refdup(). Here's the patch series I ended up with:

  [1/3]: show-branch: drop head_len variable
  [2/3]: show-branch: store resolved head in heap buffer
  [3/3]: show-branch: use skip_prefix to drop magic numbers

 builtin/show-branch.c | 39 ---
 1 file changed, 12 insertions(+), 27 deletions(-)

-Peff


Re: [PATCH 0/7] grep rev/path parsing fixes

2017-02-14 Thread Jonathan Tan

On 02/13/2017 10:00 PM, Jeff King wrote:

I've fixed that, along with a few other bugs and cleanups. The complete
series is below. Patch 2 is your (untouched) patch. My suggestions for
your test are in patch 3, which can either remain on its own or be
squashed in.

  [1/7]: grep: move thread initialization a little lower
  [2/7]: grep: do not unnecessarily query repo for "--"
  [3/7]: t7810: make "--no-index --" test more robust
  [4/7]: grep: re-order rev-parsing loop
  [5/7]: grep: fix "--" rev/pathspec disambiguation
  [6/7]: grep: avoid resolving revision names in --no-index case
  [7/7]: grep: do not diagnose misspelt revs with --no-index


Thanks - these look good to me. I replied to 6/7 with a comment, but I 
also think that these are good as-is. Also, 3/7 can probably be squashed in.


Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case

2017-02-14 Thread Jonathan Tan

On 02/13/2017 10:07 PM, Jeff King wrote:

diff --git a/builtin/grep.c b/builtin/grep.c
index e83b33bda..c4c632594 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
break;
}

+   if (!use_index) {
+   if (seen_dashdash)
+   die(_("--no-index cannot be used with revs"));


There is a subsequent check that prints "--no-index or --untracked 
cannot be used with revs." - maybe we should just expand this part to 
incorporate that case. (That is, write `if (!use_index || untracked)` 
instead of `if (!use_index)`.) This also allows us to preserve the error 
message, which might be useful for someone using a translated version of 
Git.



+   break;
+   }
+
if (get_sha1_with_context(arg, 0, sha1, )) {
if (seen_dashdash)
die(_("unable to resolve revision: %s"), arg);


[PATCH] show-branch: fix crash with long ref name

2017-02-14 Thread Christian Couder
This is a minimum fix for a crash with a long ref name.

Note that if the refname is too long (for example if you
use 100 instead of 50 in the test script) you could get
an error like:

fatal: cannot lock ref 'refs/heads/1_2_3_4_ ... _98_99_100_':
Unable to create '... /.git/refs/heads/1_2_3_4_ ... _98_99_100_.lock':
File name too long

when creating the ref instead of a crash when using
show-branch and that would be ok.

So a simpler fix could have been just something like:

-   char head[128];
+   char head[1024];

But if the filesystem ever allows filenames longer than 1024
characters then the crash could appear again.

Reported by: Maxim Kuvyrkov 
Signed-off-by: Christian Couder 
---
 builtin/show-branch.c  | 14 +++---
 t/t3204-show-branch-refname.sh | 19 +++
 2 files changed, 26 insertions(+), 7 deletions(-)
 create mode 100755 t/t3204-show-branch-refname.sh

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403ab..3c0fe55eec 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -620,7 +620,7 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
int all_heads = 0, all_remotes = 0;
int all_mask, all_revs;
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
-   char head[128];
+   char *head_cpy;
const char *head_p;
int head_len;
struct object_id head_oid;
@@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
head_oid.hash, NULL);
if (head_p) {
head_len = strlen(head_p);
-   memcpy(head, head_p, head_len + 1);
+   head_cpy = xstrdup(head_p);
}
else {
head_len = 0;
-   head[0] = 0;
+   head_cpy = xstrdup("");
}
 
if (with_current_branch && head_p) {
@@ -804,15 +804,15 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
/* We are only interested in adding the branch
 * HEAD points at.
 */
-   if (rev_is_head(head,
+   if (rev_is_head(head_cpy,
head_len,
ref_name[i],
head_oid.hash, NULL))
has_head++;
}
if (!has_head) {
-   int offset = starts_with(head, "refs/heads/") ? 11 : 0;
-   append_one_rev(head + offset);
+   int offset = starts_with(head_cpy, "refs/heads/") ? 11 
: 0;
+   append_one_rev(head_cpy + offset);
}
}
 
@@ -865,7 +865,7 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
if (1 < num_rev || extra < 0) {
for (i = 0; i < num_rev; i++) {
int j;
-   int is_head = rev_is_head(head,
+   int is_head = rev_is_head(head_cpy,
  head_len,
  ref_name[i],
  head_oid.hash,
diff --git a/t/t3204-show-branch-refname.sh b/t/t3204-show-branch-refname.sh
new file mode 100755
index 00..83b64e3032
--- /dev/null
+++ b/t/t3204-show-branch-refname.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='test show-branch with long refname'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+
+   test_commit first &&
+   long_refname=$(printf "%s_" $(seq 1 50)) &&
+   git checkout -b "$long_refname"
+'
+
+test_expect_success 'show-branch with long refname' '
+
+   git show-branch first
+'
+
+test_done
-- 
2.12.0.rc0



Re: [BUG] Memory corruption crash with "git bisect start"

2017-02-14 Thread Christian Couder
On Tue, Feb 14, 2017 at 12:56 PM, Maxim Kuvyrkov
 wrote:
> I'm seeing the following memory corruption crash on a script-constructed repo 
> when starting git bisect.  I'm seeing this crash both with system git of 
> Ubuntu Xenial and with freshly-compiled git master from the official repo.
>
> The log of the crash is attached.

Yeah, thanks for the report.

I took a look and it looks like git bisect crashes when it runs git
show-branch and it can be reproduced with just `git show-branch
bottom` instead of `git bisect start bottom top`.

> It is possible that the bug is in Xenial glibc, in which case -- please let 
> me know and I will take it up with the glibc developers.

I have Ubuntu GLIBC 2.23-0ubuntu5 and I get a crash too.


Re: [PATCH] mingw: make stderr unbuffered again

2017-02-14 Thread Johannes Schindelin
Hi Junio,

On Mon, 13 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > When removing the hack for isatty(), we actually removed more than just
> > an isatty() hack: we removed the hack where internal data structures of
> > the MSVC runtime are modified in order to redirect stdout/stderr.
> >
> > Instead of using that hack (that does not work with newer versions of
> > the runtime, anyway), we replaced it by reopening the respective file
> > descriptors.
> >
> > What we forgot was to mark stderr as unbuffered again.
> >
> > Reported by Hannes Sixt. Fixed with Jeff Hostetler's assistance.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> > Published-As: 
> > https://github.com/dscho/git/releases/tag/mingw-unbuffered-stderr-v1
> > Fetch-It-Via: git fetch https://github.com/dscho/git 
> > mingw-unbuffered-stderr-v1
> 
> OK.  Should this go directly to 'master', as the isatty thing is
> already in?

>From my point of view, it is not crucial. The next Git for Windows version
will have it, of course, and Hannes is always running with his set of
patches, he can easily cherry-pick this one, too.

Ciao,
Johannes

P.S.: Could you please cut the remainder of the mail that you are not
responding to? Thanks.


Re: [PATCH v2 2/2] grep: use '/' delimiter for paths

2017-02-14 Thread Stefan Hajnoczi
On Thu, Feb 09, 2017 at 12:20:34AM -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:
> > Jeff King  writes:
> (I _do_ think Stefan's proposed direction is worth it simply because the
> result is easier to read, but I agree the whole thing can be avoided by
> using pathspecs, as you've noted).

I won't be pushing this series further due to limited time.

Stefan


signature.asc
Description: PGP signature


  1   2   >