Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen
On Wed, 2018-02-21 at 08:53 +0100, Torsten Bögershausen wrote:
> I don't hava a pointer, but what should happen ?
> 2 warnings for 2 "git add" should be OK, I think.
> 
> 1 warning is part of the optimization, that Git does to handle
> hundrets and thousands of files efficciently.
> 
> Is the 1/2 warning  real live problem  ?

As I've suggested, my opinion is that the nondeterministic second
warning can result in significant user confusion and should be avoided.
 (If it were always shown, I'd be less concerned.)  We'll see what the
decision-makers think.

Matt


Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen
On Tue, 2018-02-20 at 08:42 -0500, Matt McCutchen wrote:
> In either case, if "git update-index --refresh" (or "git status") is
> run before "git add", then "git add" does not print the warning.  On
> the other hand, if line endings in the working tree file are changed,
> then git shows the file as having an unstaged change, even though the
> content that would be added to the index after CRLF conversion is
> identical.  So it seems that git remembers the pre-conversion file
> content and uses it for "git update-index --refresh" and would just
> need to use it for "git add" as well.

On further testing, this analysis is wrong.  What I was seeing is that
if the size of the working tree file has changed, git reports an
unstaged change.  (I suppose that reporting an unstaged change in this
case without checking whether the post-conversion content has changed
may be an important optimization.)  If the line endings are changed
without changing the size or post-conversion content, then no unstaged
change is reported.  It does not appear that git saves the pre-
conversion content.

Thus, if it were possible to create a file that doesn't need a safecrlf
warning, add it to the index, and then modify it so that it does need a
safecrlf warning without changing the size or post-conversion content,
we would have a bug where no warning is shown in the case where "git
status" is run before the second "git add".  I believe this bug can't
occur in the particular case of CRLF conversion without other filters
because the file that doesn't need a safecrlf warning has a unique
minimum (LF) or maximum (CRLF) size, though I presume it could occur
with custom filters.  My proposal would then be that "git add" should
not show a safecrlf warning if the size and post-conversion content
haven't changed; it would merely bring "git add" to parity with the
potential bug in the "git status" case.

Matt


Duplicate safecrlf warning for racily clean index entry

2018-02-20 Thread Matt McCutchen
I noticed that if a file subject to a safecrlf warning is added to the
index in the same second that it is created, resulting in a "racily
clean" index entry, then a subsequent "git add" command prints another
safecrlf warning.  I reproduced this on the current "next"
(499d7c4f91).  The procedure:

$ git init
$ git config core.autocrlf true
$ echo foo >file1 && git add file1 && git add file1
warning: LF will be replaced by CRLF in file1.
The file will have its original line endings in your working directory.
warning: LF will be replaced by CRLF in file1.
The file will have its original line endings in your working directory.
$ echo bar >file2 && sleep 1 && git add file2 && git add file2
warning: LF will be replaced by CRLF in file2.
The file will have its original line endings in your working directory.

This came up when I ran the test suite for Braid on Windows
(https://github.com/cristibalan/braid/issues/77).

The phenomenon actually seems to be more general: touching the file
causes the next "git add" to print a safecrlf warning, suggesting that
the warning occurs whenever the index entry is dirty.  One could argue
that a new warning is reasonable after touching the file, but it seems
clear that "racy cleanliness" is an implementation detail that
shouldn't have user-visible nondeterministic effects.

In either case, if "git update-index --refresh" (or "git status") is
run before "git add", then "git add" does not print the warning.  On
the other hand, if line endings in the working tree file are changed,
then git shows the file as having an unstaged change, even though the
content that would be added to the index after CRLF conversion is
identical.  So it seems that git remembers the pre-conversion file
content and uses it for "git update-index --refresh" and would just
need to use it for "git add" as well.

Thoughts about the proposed change?  Does someone want to work on it or
give me a pointer to where to get started?

Thanks,
Matt


Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-07 Thread Matt McCutchen
When I'm rewriting history, "git push --force-with-lease" is a nice
safeguard compared to "git push --force", but it still assumes the
remote-tracking ref gives the old state the user wants to overwrite. 
Tools that do an implicit fetch, assuming it to be a safe operation,
may break this assumption.  In the worst case, Visual Studio Code does
an automatic fetch every 3 minutes by default [1], making
--force-with-lease pretty much reduce to --force.

For a safer workflow, "git push" would check against a separate "old"
ref that isn't updated by "git fetch", but is updated by "git push" the
same way the remote-tracking ref is and maybe also by commands that
update the local branch to take into account remote changes (I'm not
sure what reasonable scenarios there are, if any).  Has there been any
work on this?  I can write a wrapper script for the simple case of "git
push" of a single branch to the same branch on a remote, which is
pretty much all I use right now, but a native implementation would
support all of the command-line usage forms, which would be nice.

Thanks for reading!

[1]
https://github.com/Microsoft/vscode/blob/535a3de60023c81d75d0eac22044284f07dbcddf/extensions/git/src/autofetch.ts

Matt


[PATCH v2 3/3] fetch-pack: add specific error for fetching an unadvertised object

2017-02-22 Thread Matt McCutchen
Enhance filter_refs (which decides whether a request for an unadvertised
object should be sent to the server) to record a new match status on the
"struct ref" when a request is not allowed, and have
report_unmatched_refs check for this status and print a special error
message, "Server does not allow request for unadvertised object".

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 fetch-pack.c  | 42 +++---
 remote.h  |  9 +++--
 t/t5516-fetch-push.sh |  2 +-
 3 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 7c8d44c..f12bfcd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -578,7 +578,7 @@ static void filter_refs(struct fetch_pack_args *args,
break; /* definitely do not have it */
else if (cmp == 0) {
keep = 1; /* definitely have it */
-   sought[i]->matched = 1;
+   sought[i]->match_status = REF_MATCHED;
}
i++;
}
@@ -598,22 +598,24 @@ static void filter_refs(struct fetch_pack_args *args,
}
 
/* Append unmatched requests to the list */
-   if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
-   for (i = 0; i < nr_sought; i++) {
-   unsigned char sha1[20];
+   for (i = 0; i < nr_sought; i++) {
+   unsigned char sha1[20];
 
-   ref = sought[i];
-   if (ref->matched)
-   continue;
-   if (get_sha1_hex(ref->name, sha1) ||
-   ref->name[40] != '\0' ||
-   hashcmp(sha1, ref->old_oid.hash))
-   continue;
+   ref = sought[i];
+   if (ref->match_status != REF_NOT_MATCHED)
+   continue;
+   if (get_sha1_hex(ref->name, sha1) ||
+   ref->name[40] != '\0' ||
+   hashcmp(sha1, ref->old_oid.hash))
+   continue;
 
-   ref->matched = 1;
+   if ((allow_unadvertised_object_request &
+   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+   ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
+   } else {
+   ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
*refs = newlist;
@@ -1100,9 +1102,19 @@ int report_unmatched_refs(struct ref **sought, int 
nr_sought)
int i, ret = 0;
 
for (i = 0; i < nr_sought; i++) {
-   if (!sought[i] || sought[i]->matched)
+   if (!sought[i])
continue;
-   error(_("no such remote ref %s"), sought[i]->name);
+   switch (sought[i]->match_status) {
+   case REF_MATCHED:
+   continue;
+   case REF_NOT_MATCHED:
+   error(_("no such remote ref %s"), sought[i]->name);
+   break;
+   case REF_UNADVERTISED_NOT_ALLOWED:
+   error(_("Server does not allow request for unadvertised 
object %s"),
+ sought[i]->name);
+   break;
+   }
ret = 1;
}
return ret;
diff --git a/remote.h b/remote.h
index 9248811..0b9d8c4 100644
--- a/remote.h
+++ b/remote.h
@@ -89,8 +89,13 @@ struct ref {
force:1,
forced_update:1,
expect_old_sha1:1,
-   deletion:1,
-   matched:1;
+   deletion:1;
+
+   enum {
+   REF_NOT_MATCHED = 0, /* initial value */
+   REF_MATCHED,
+   REF_UNADVERTISED_NOT_ALLOWED
+   } match_status;
 
/*
 * Order is important here, as we write to FETCH_HEAD
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 0d13a45..78f3b8e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1099,7 +1099,7 @@ test_expect_success 'fetch exact SHA1' '
 
# fetching the hidden object should fail by default
test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy 2>err &&
-   test_i18ngrep "no such remote ref" err &&
+   test_i18ngrep "Server does not allow request for unadvertised 
object" err &&
test_must_fail git rev-parse --verify refs/heads/copy &&
 
# the server side can allow it to succeed
-- 
2.9.3




[PATCH v2 2/3] fetch_refs_via_pack: call report_unmatched_refs

2017-02-22 Thread Matt McCutchen
"git fetch" currently doesn't bother to check that it got all refs it
sought, because the common case of requesting a nonexistent ref triggers
a die() in get_fetch_map.  However, there's at least one case that
slipped through: "git fetch REMOTE SHA1" if the server doesn't allow
requests for unadvertised objects.  Make fetch_refs_via_pack (which is
on the "git fetch" code path) call report_unmatched_refs so that we at
least get an error message in that case.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 t/t5516-fetch-push.sh |  3 ++-
 transport.c   | 14 +-
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 26b2caf..0d13a45 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1098,7 +1098,8 @@ test_expect_success 'fetch exact SHA1' '
test_must_fail git cat-file -t $the_commit &&
 
# fetching the hidden object should fail by default
-   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy &&
+   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy 2>err &&
+   test_i18ngrep "no such remote ref" err &&
test_must_fail git rev-parse --verify refs/heads/copy &&
 
# the server side can allow it to succeed
diff --git a/transport.c b/transport.c
index 04e5d66..c377907 100644
--- a/transport.c
+++ b/transport.c
@@ -204,6 +204,7 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
 static int fetch_refs_via_pack(struct transport *transport,
   int nr_heads, struct ref **to_fetch)
 {
+   int ret = 0;
struct git_transport_data *data = transport->data;
struct ref *refs;
char *dest = xstrdup(transport->url);
@@ -241,19 +242,22 @@ static int fetch_refs_via_pack(struct transport 
*transport,
  >pack_lockfile);
close(data->fd[0]);
close(data->fd[1]);
-   if (finish_connect(data->conn)) {
-   free_refs(refs);
-   refs = NULL;
-   }
+   if (finish_connect(data->conn))
+   ret = -1;
data->conn = NULL;
data->got_remote_heads = 0;
data->options.self_contained_and_connected =
args.self_contained_and_connected;
 
+   if (refs == NULL)
+   ret = -1;
+   if (report_unmatched_refs(to_fetch, nr_heads))
+   ret = -1;
+
free_refs(refs_tmp);
free_refs(refs);
free(dest);
-   return (refs ? 0 : -1);
+   return ret;
 }
 
 static int push_had_errors(struct ref *ref)
-- 
2.9.3




[PATCH v2 1/3] fetch-pack: move code to report unmatched refs to a function

2017-02-22 Thread Matt McCutchen
We're preparing to reuse this code in transport.c for "git fetch".

While I'm here, internationalize the existing error message.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 builtin/fetch-pack.c  |  7 +--
 fetch-pack.c  | 13 +
 fetch-pack.h  |  6 ++
 t/t5500-fetch-pack.sh |  6 +++---
 4 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index cfe9e44..2a1c1c2 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -219,12 +219,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
 * remote no-such-ref' would silently succeed without issuing
 * an error.
 */
-   for (i = 0; i < nr_sought; i++) {
-   if (!sought[i] || sought[i]->matched)
-   continue;
-   error("no such remote ref %s", sought[i]->name);
-   ret = 1;
-   }
+   ret |= report_unmatched_refs(sought, nr_sought);
 
while (ref) {
printf("%s %s\n",
diff --git a/fetch-pack.c b/fetch-pack.c
index 601f077..7c8d44c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1094,3 +1094,16 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
clear_shallow_info();
return ref_cpy;
 }
+
+int report_unmatched_refs(struct ref **sought, int nr_sought)
+{
+   int i, ret = 0;
+
+   for (i = 0; i < nr_sought; i++) {
+   if (!sought[i] || sought[i]->matched)
+   continue;
+   error(_("no such remote ref %s"), sought[i]->name);
+   ret = 1;
+   }
+   return ret;
+}
diff --git a/fetch-pack.h b/fetch-pack.h
index c912e3d..a2d46e6 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -45,4 +45,10 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
   struct sha1_array *shallow,
   char **pack_lockfile);
 
+/*
+ * Print an appropriate error message for each sought ref that wasn't
+ * matched.  Return 0 if all sought refs were matched, otherwise 1.
+ */
+int report_unmatched_refs(struct ref **sought, int nr_sought);
+
 #endif
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 505e1b4..b5865b3 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -484,7 +484,7 @@ test_expect_success 'test lonely missing ref' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy
) >/dev/null 2>error-m &&
-   test_cmp expect-error error-m
+   test_i18ncmp expect-error error-m
 '
 
 test_expect_success 'test missing ref after existing' '
@@ -492,7 +492,7 @@ test_expect_success 'test missing ref after existing' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/A 
refs/heads/xyzzy
) >/dev/null 2>error-em &&
-   test_cmp expect-error error-em
+   test_i18ncmp expect-error error-em
 '
 
 test_expect_success 'test missing ref before existing' '
@@ -500,7 +500,7 @@ test_expect_success 'test missing ref before existing' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy 
refs/heads/A
) >/dev/null 2>error-me &&
-   test_cmp expect-error error-me
+   test_i18ncmp expect-error error-me
 '
 
 test_expect_success 'test --all, --depth, and explicit head' '
-- 
2.9.3




Re: [PATCH 1/3] fetch-pack: move code to report unmatched refs to a function

2017-02-22 Thread Matt McCutchen
On Wed, 2017-02-22 at 09:11 -0800, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > We're preparing to reuse this code in transport.c for "git fetch".
> > 
> > While I'm here, internationalize the existing error message.
> > ---
> 
> Sounds good.  Please just say it is OK for me to forge your sign-off
> ;-)

Oops.  Given the other issue below, I'll just regenerate the patch
series.

> > diff --git a/fetch-pack.h b/fetch-pack.h
> > index c912e3d..fd4d80e 100644
> > --- a/fetch-pack.h
> > +++ b/fetch-pack.h
> > @@ -45,4 +45,13 @@ struct ref *fetch_pack(struct fetch_pack_args
> > *args,
> >        struct sha1_array *shallow,
> >        char **pack_lockfile);
> >  
> > +/*
> > + * Print an appropriate error message for each sought ref that
> > wasn't
> > + * matched.  Return 0 if all sought refs were matched, otherwise
> > 1.
> > + *
> > + * The type of "sought" should be "const struct ref *const *" but
> > for
> > + * http://stackoverflow.com/questions/5055655/double-pointer-const
> > -correctness-warnings-in-c .
> > + */
> 
> This is an unfinished sentence, but I wonder if we even need to have
> it here?  I'd be surprised if this function was unique in the
> codebase that takes an array pointer whose type is looser than
> necessary because of well-known language rules.

You're probably right.  I'm in the habit of documenting things that
were unknown to me, but I'll take your word for what's well-known to
the average git developer.  I'll remove the remark.

Matt


[PATCH 1/3] fetch-pack: move code to report unmatched refs to a function

2017-02-22 Thread Matt McCutchen
We're preparing to reuse this code in transport.c for "git fetch".

While I'm here, internationalize the existing error message.
---
 builtin/fetch-pack.c  |  7 +--
 fetch-pack.c  | 13 +
 fetch-pack.h  |  9 +
 t/t5500-fetch-pack.sh |  6 +++---
 4 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index cfe9e44..2a1c1c2 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -219,12 +219,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
 * remote no-such-ref' would silently succeed without issuing
 * an error.
 */
-   for (i = 0; i < nr_sought; i++) {
-   if (!sought[i] || sought[i]->matched)
-   continue;
-   error("no such remote ref %s", sought[i]->name);
-   ret = 1;
-   }
+   ret |= report_unmatched_refs(sought, nr_sought);
 
while (ref) {
printf("%s %s\n",
diff --git a/fetch-pack.c b/fetch-pack.c
index 601f077..7c8d44c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1094,3 +1094,16 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
clear_shallow_info();
return ref_cpy;
 }
+
+int report_unmatched_refs(struct ref **sought, int nr_sought)
+{
+   int i, ret = 0;
+
+   for (i = 0; i < nr_sought; i++) {
+   if (!sought[i] || sought[i]->matched)
+   continue;
+   error(_("no such remote ref %s"), sought[i]->name);
+   ret = 1;
+   }
+   return ret;
+}
diff --git a/fetch-pack.h b/fetch-pack.h
index c912e3d..fd4d80e 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -45,4 +45,13 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
   struct sha1_array *shallow,
   char **pack_lockfile);
 
+/*
+ * Print an appropriate error message for each sought ref that wasn't
+ * matched.  Return 0 if all sought refs were matched, otherwise 1.
+ *
+ * The type of "sought" should be "const struct ref *const *" but for
+ * 
http://stackoverflow.com/questions/5055655/double-pointer-const-correctness-warnings-in-c
 .
+ */
+int report_unmatched_refs(struct ref **sought, int nr_sought);
+
 #endif
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 505e1b4..b5865b3 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -484,7 +484,7 @@ test_expect_success 'test lonely missing ref' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy
) >/dev/null 2>error-m &&
-   test_cmp expect-error error-m
+   test_i18ncmp expect-error error-m
 '
 
 test_expect_success 'test missing ref after existing' '
@@ -492,7 +492,7 @@ test_expect_success 'test missing ref after existing' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/A 
refs/heads/xyzzy
) >/dev/null 2>error-em &&
-   test_cmp expect-error error-em
+   test_i18ncmp expect-error error-em
 '
 
 test_expect_success 'test missing ref before existing' '
@@ -500,7 +500,7 @@ test_expect_success 'test missing ref before existing' '
cd client &&
test_must_fail git fetch-pack --no-progress .. refs/heads/xyzzy 
refs/heads/A
) >/dev/null 2>error-me &&
-   test_cmp expect-error error-me
+   test_i18ncmp expect-error error-me
 '
 
 test_expect_success 'test --all, --depth, and explicit head' '
-- 
2.9.3




[PATCH 3/3] fetch-pack: add specific error for fetching an unadvertised object

2017-02-22 Thread Matt McCutchen
Enhance filter_refs (which decides whether a request for an unadvertised
object should be sent to the server) to record a new match status on the
"struct ref" when a request is not allowed, and have
report_unmatched_refs check for this status and print a special error
message, "Server does not allow request for unadvertised object".
---
 fetch-pack.c  | 42 +++---
 remote.h  |  9 +++--
 t/t5516-fetch-push.sh |  2 +-
 3 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 7c8d44c..f12bfcd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -578,7 +578,7 @@ static void filter_refs(struct fetch_pack_args *args,
break; /* definitely do not have it */
else if (cmp == 0) {
keep = 1; /* definitely have it */
-   sought[i]->matched = 1;
+   sought[i]->match_status = REF_MATCHED;
}
i++;
}
@@ -598,22 +598,24 @@ static void filter_refs(struct fetch_pack_args *args,
}
 
/* Append unmatched requests to the list */
-   if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
-   for (i = 0; i < nr_sought; i++) {
-   unsigned char sha1[20];
+   for (i = 0; i < nr_sought; i++) {
+   unsigned char sha1[20];
 
-   ref = sought[i];
-   if (ref->matched)
-   continue;
-   if (get_sha1_hex(ref->name, sha1) ||
-   ref->name[40] != '\0' ||
-   hashcmp(sha1, ref->old_oid.hash))
-   continue;
+   ref = sought[i];
+   if (ref->match_status != REF_NOT_MATCHED)
+   continue;
+   if (get_sha1_hex(ref->name, sha1) ||
+   ref->name[40] != '\0' ||
+   hashcmp(sha1, ref->old_oid.hash))
+   continue;
 
-   ref->matched = 1;
+   if ((allow_unadvertised_object_request &
+   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+   ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
+   } else {
+   ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
*refs = newlist;
@@ -1100,9 +1102,19 @@ int report_unmatched_refs(struct ref **sought, int 
nr_sought)
int i, ret = 0;
 
for (i = 0; i < nr_sought; i++) {
-   if (!sought[i] || sought[i]->matched)
+   if (!sought[i])
continue;
-   error(_("no such remote ref %s"), sought[i]->name);
+   switch (sought[i]->match_status) {
+   case REF_MATCHED:
+   continue;
+   case REF_NOT_MATCHED:
+   error(_("no such remote ref %s"), sought[i]->name);
+   break;
+   case REF_UNADVERTISED_NOT_ALLOWED:
+   error(_("Server does not allow request for unadvertised 
object %s"),
+ sought[i]->name);
+   break;
+   }
ret = 1;
}
return ret;
diff --git a/remote.h b/remote.h
index 9248811..0b9d8c4 100644
--- a/remote.h
+++ b/remote.h
@@ -89,8 +89,13 @@ struct ref {
force:1,
forced_update:1,
expect_old_sha1:1,
-   deletion:1,
-   matched:1;
+   deletion:1;
+
+   enum {
+   REF_NOT_MATCHED = 0, /* initial value */
+   REF_MATCHED,
+   REF_UNADVERTISED_NOT_ALLOWED
+   } match_status;
 
/*
 * Order is important here, as we write to FETCH_HEAD
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 0d13a45..78f3b8e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1099,7 +1099,7 @@ test_expect_success 'fetch exact SHA1' '
 
# fetching the hidden object should fail by default
test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy 2>err &&
-   test_i18ngrep "no such remote ref" err &&
+   test_i18ngrep "Server does not allow request for unadvertised 
object" err &&
test_must_fail git rev-parse --verify refs/heads/copy &&
 
# the server side can allow it to succeed
-- 
2.9.3




[PATCH 2/3] fetch_refs_via_pack: call report_unmatched_refs

2017-02-22 Thread Matt McCutchen
"git fetch" currently doesn't bother to check that it got all refs it
sought, because the common case of requesting a nonexistent ref triggers
a die() in get_fetch_map.  However, there's at least one case that
slipped through: "git fetch REMOTE SHA1" if the server doesn't allow
requests for unadvertised objects.  Make fetch_refs_via_pack (which is
on the "git fetch" code path) call report_unmatched_refs so that we at
least get an error message in that case.
---
 t/t5516-fetch-push.sh |  3 ++-
 transport.c   | 14 +-
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 26b2caf..0d13a45 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1098,7 +1098,8 @@ test_expect_success 'fetch exact SHA1' '
test_must_fail git cat-file -t $the_commit &&
 
# fetching the hidden object should fail by default
-   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy &&
+   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy 2>err &&
+   test_i18ngrep "no such remote ref" err &&
test_must_fail git rev-parse --verify refs/heads/copy &&
 
# the server side can allow it to succeed
diff --git a/transport.c b/transport.c
index 04e5d66..c377907 100644
--- a/transport.c
+++ b/transport.c
@@ -204,6 +204,7 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
 static int fetch_refs_via_pack(struct transport *transport,
   int nr_heads, struct ref **to_fetch)
 {
+   int ret = 0;
struct git_transport_data *data = transport->data;
struct ref *refs;
char *dest = xstrdup(transport->url);
@@ -241,19 +242,22 @@ static int fetch_refs_via_pack(struct transport 
*transport,
  >pack_lockfile);
close(data->fd[0]);
close(data->fd[1]);
-   if (finish_connect(data->conn)) {
-   free_refs(refs);
-   refs = NULL;
-   }
+   if (finish_connect(data->conn))
+   ret = -1;
data->conn = NULL;
data->got_remote_heads = 0;
data->options.self_contained_and_connected =
args.self_contained_and_connected;
 
+   if (refs == NULL)
+   ret = -1;
+   if (report_unmatched_refs(to_fetch, nr_heads))
+   ret = -1;
+
free_refs(refs_tmp);
free_refs(refs);
free(dest);
-   return (refs ? 0 : -1);
+   return ret;
 }
 
 static int push_had_errors(struct ref *ref)
-- 
2.9.3




Re: [PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-22 Thread Matt McCutchen
On Mon, 2017-02-20 at 22:36 -0800, Junio C Hamano wrote:
> Hmph, I would have expected this to be done as a three-patch series,
> 
>  * move the loop at the end of cmd_fetch_pack() to a separate helper
>    function report_unmatched_refs() and call it;
> 
>  * add a call to report_unmatched_refs() to the transport layer;
> 
>  * enhance report_unmatched_refs() by introducing match_status
>    field and adding new code to filter_refs() to diagnose other
>    kinds of errors.

Sure.

> The result looks reasonable from a cursory read, though.
> 
> Thanks for following it up to the completion.

This remark led me to believe you were satisfied with the single patch,
but the last "What's cooking in git.git" mail says "Expecting a split
series?".

Anyway, I made a split series and will send it in a moment.  I don't
know if all the commit messages include exactly the information you
want; hopefully you're happy to edit them as desired.  Compared to the
previous patch, there is one fix in the net result: fixing t5500-fetch-
pack.sh to deal with the internationalized "no such remote ref"
message.

Matt


[PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-18 Thread Matt McCutchen
Currently "git fetch REMOTE SHA1" silently exits 1 if the server doesn't
allow requests for unadvertised objects by sha1.  The more common case
of requesting a nonexistent ref normally triggers a die() in
get_fetch_map, so "git fetch" wasn't bothering to check after the fetch
that it got all the refs it sought, like "git fetch-pack" does near the
end of cmd_fetch_pack.

Move the code from cmd_fetch_pack to a new function,
report_unmatched_refs, that is called by fetch_refs_via_pack as part of
"git fetch".  Also, change filter_refs (which checks whether a request
for an unadvertised object should be sent to the server) to set a new
match status on the "struct ref" when the request is not allowed, and
have report_unmatched_refs check for this status and print a special
error message, "Server does not allow request for unadvertised object".

Finally, add a simple test case for "git fetch REMOTE SHA1".

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 builtin/fetch-pack.c  |  7 +--
 fetch-pack.c  | 51 ++-
 fetch-pack.h  |  9 +
 remote.h  |  9 +++--
 t/t5516-fetch-push.sh |  3 ++-
 transport.c   | 14 +-
 6 files changed, 66 insertions(+), 27 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index cfe9e44..2a1c1c2 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -219,12 +219,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
 * remote no-such-ref' would silently succeed without issuing
 * an error.
 */
-   for (i = 0; i < nr_sought; i++) {
-   if (!sought[i] || sought[i]->matched)
-   continue;
-   error("no such remote ref %s", sought[i]->name);
-   ret = 1;
-   }
+   ret |= report_unmatched_refs(sought, nr_sought);
 
while (ref) {
printf("%s %s\n",
diff --git a/fetch-pack.c b/fetch-pack.c
index 601f077..f12bfcd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -578,7 +578,7 @@ static void filter_refs(struct fetch_pack_args *args,
break; /* definitely do not have it */
else if (cmp == 0) {
keep = 1; /* definitely have it */
-   sought[i]->matched = 1;
+   sought[i]->match_status = REF_MATCHED;
}
i++;
}
@@ -598,22 +598,24 @@ static void filter_refs(struct fetch_pack_args *args,
}
 
/* Append unmatched requests to the list */
-   if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
-   for (i = 0; i < nr_sought; i++) {
-   unsigned char sha1[20];
+   for (i = 0; i < nr_sought; i++) {
+   unsigned char sha1[20];
 
-   ref = sought[i];
-   if (ref->matched)
-   continue;
-   if (get_sha1_hex(ref->name, sha1) ||
-   ref->name[40] != '\0' ||
-   hashcmp(sha1, ref->old_oid.hash))
-   continue;
+   ref = sought[i];
+   if (ref->match_status != REF_NOT_MATCHED)
+   continue;
+   if (get_sha1_hex(ref->name, sha1) ||
+   ref->name[40] != '\0' ||
+   hashcmp(sha1, ref->old_oid.hash))
+   continue;
 
-   ref->matched = 1;
+   if ((allow_unadvertised_object_request &
+   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+   ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
+   } else {
+   ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
*refs = newlist;
@@ -1094,3 +1096,26 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
clear_shallow_info();
return ref_cpy;
 }
+
+int report_unmatched_refs(struct ref **sought, int nr_sought)
+{
+   int i, ret = 0;
+
+   for (i = 0; i < nr_sought; i++) {
+   if (!sought[i])
+   continue;
+   switch (sought[i]->match_status) {
+   case REF_MATCHED:
+   continue;
+   case REF_NOT_MATCHED:
+   error(_("no such remote ref %s"), sought[i]->name);
+   break;
+   case REF_UNADVERTISED_

Re: [PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-18 Thread Matt McCutchen
On Sun, 2017-02-12 at 15:49 -0800, Junio C Hamano wrote:
> The fact that we have the above two choices tells me that a two-step
> approach may be an appropriate approach. [...]

> Even if you did only the first step, as long as the second step can
> be done without reverting what the first step did [*4*] by somebody
> who cares the "specific error" deeply enough, I am OK with that.  Of
> course if you did both steps, that is fine by me as well ;-)

I appreciate the flexibility, but now that I've spent the time to
understand all the code involved, it would be a pity not to go for the
complete solution.  New patch coming.

Matt


Re: [PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-12 Thread Matt McCutchen
On Fri, 2017-02-10 at 10:36 -0800, Junio C Hamano wrote:
> > 
> There is this piece of code near the end of builtin/fetch-pack.c:
> 
> [...]
> 
> that happens before the command shows the list of fetched refs, and
> this code is prepared to inspect what happend to the requests it (in
> response to the user request) made to the underlying fetch
> machinery, and issue the error message.
> If you change your command line to "git fetch-pack REMOTE SHA1", you
> do see an error from the above.

Yes, "error: no such remote ref ", which at least makes clear that
the operation didn't work, though it would be nice to give a more
specific error message.

> This all happens in transport.c::fetch_refs_via_pack().
> I think that function is a much better place to error or die than
> filter_refs().

I confirmed that checking the sought refs there works.  However, in
filter_refs, it's easy to give a more specific error message that the
server doesn't allow requests for unadvertised objects, and that code
works for "git fetch-pack" too.  To do the same in fetch_refs_via_pack,
we'd have to duplicate a few lines of code from filter_refs and expose
the allow_unadvertised_object_request variable, or just set a flag on
the "struct ref" in filter_refs and check it in fetch_refs_via_pack.

What do you think?  Do you not care about having a more specific error,
in which case I can copy the code from builtin/fetch-pack.c to
fetch_refs_via_pack?  Or shall I add code to filter_refs to set a flag
and add code to builtin/fetch-pack.c and fetch_refs_via_pack to check
the flag?  Or what?

Matt


[PATCH] fetch: print an error when declining to request an unadvertised object

2017-02-10 Thread Matt McCutchen
Currently "git fetch REMOTE SHA1" silently exits 1 if the server doesn't
allow requests for unadvertised objects by sha1.  Change it to print a
meaningful error message.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---

The fetch code looks unbelievably complicated to me and I don't understand all
the cases that can arise.  Hopefully this patch is an acceptable solution to the
problem.

 fetch-pack.c  | 31 ---
 t/t5516-fetch-push.sh |  3 ++-
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 601f077..117874c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -598,23 +598,24 @@ static void filter_refs(struct fetch_pack_args *args,
}
 
/* Append unmatched requests to the list */
-   if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
-   for (i = 0; i < nr_sought; i++) {
-   unsigned char sha1[20];
+   for (i = 0; i < nr_sought; i++) {
+   unsigned char sha1[20];
 
-   ref = sought[i];
-   if (ref->matched)
-   continue;
-   if (get_sha1_hex(ref->name, sha1) ||
-   ref->name[40] != '\0' ||
-   hashcmp(sha1, ref->old_oid.hash))
-   continue;
+   ref = sought[i];
+   if (ref->matched)
+   continue;
+   if (get_sha1_hex(ref->name, sha1) ||
+   ref->name[40] != '\0' ||
+   hashcmp(sha1, ref->old_oid.hash))
+   continue;
 
-   ref->matched = 1;
-   *newtail = copy_ref(ref);
-   newtail = &(*newtail)->next;
-   }
+   if (!(allow_unadvertised_object_request &
+   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1)))
+   die(_("Server does not allow request for unadvertised 
object %s"), ref->name);
+
+   ref->matched = 1;
+   *newtail = copy_ref(ref);
+   newtail = &(*newtail)->next;
}
*refs = newlist;
 }
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 0fc5a7c..177897e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1098,7 +1098,8 @@ test_expect_success 'fetch exact SHA1' '
test_must_fail git cat-file -t $the_commit &&
 
# fetching the hidden object should fail by default
-   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy &&
+   test_must_fail git fetch -v ../testrepo 
$the_commit:refs/heads/copy 2>err &&
+   test_i18ngrep "Server does not allow request for unadvertised 
object" err &&
test_must_fail git rev-parse --verify refs/heads/copy &&
 
# the server side can allow it to succeed
-- 
2.9.3




Re: [PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths

2017-02-01 Thread Matt McCutchen
On Wed, 2017-02-01 at 12:56 -0800, Junio C Hamano wrote:
> Junio C Hamano <gits...@pobox.com> writes:
> 
> > Matt McCutchen <m...@mattmccutchen.net> writes:
> > ...
> > > Please check that my refactoring is indeed correct!  All the
> > > existing tests pass
> > > for me, but the existing test coverage of these conflict messages
> > > looks poor.
> > 
> > This unfortunately is doing a bit too many things at once from that
> > point of view.  I need to reserve a solid quiet 20-minutes without
> > distraction to check it, which I am hoping to do tonight.
> 
> Let me make sure if I understood your changes correctly:
> 
>  * conflict_rename_delete() knew which one is renamed and which one
>    is deleted (even though the deleted one was called "other"), but
>    because in the original code handle_change_delete() wants to
>    always see tree A first and then tree B in its parameter list,
>    the original code swapped a/b before calling it.  In the original
>    code, handle_change_delete() needed to figure out which one is
>    the deleted one by looking at a_oid or b_oid.
> 
>  * In the updated code, the knowledge of which branch survives and
>    which branch is deleted is passed from the caller to
>    handle_change_delete(), which no longer needs to figure out by
>    looking at a_oid/b_oid.  The updated API only passes the oid for
>    surviving branch (as deleted one would have been 0{40} anyway).
> 
>  * In the updated code, handle_change_delete() is told the names of
>    both branches (the one that survives and the other that was
>    deleted).  It no longer has to switch between o->branch[12]
>    depending on the NULLness of a_oid/b_oid; it knows both names and
>    which one is which.
> 
>  * handle_modify_delete() also calls handle_change_delete().  Unlike
>    conflict_rename_delete(), it is not told by its caller which
>    branch keeps the path and which branch deletes the path, and
>    instead relies on handle_change_delete() to figure it out.
>    Because of the above change to the API, now it needs to sort it
>    out before calling handle_change_delete().
> 
> It all makes sense to me.  
> 
> The single call to update_file() that appears near the end of
> handle_change_delete() in the updated code corresponds to calls to
> the same function in 3 among 4 codepaths in the function in the
> original code.  It is a bit tricky to reason about, though.
> 
> In the original code, update_file() was omitted when we didn't have
> to come up with a unique alternate filename and the one that is left
> is a_oid (i.e. our side).  The way to tell if we did not come up
> with a unique alternate filename used to be to see the "renamed"
> variable but now it is the NULL-ness of "alt_path".

"alt_path" is the same variable that used to be "renamed".  I just
renamed it to be less confusing.

> And the way to
> tell if the side that is left is ours, we check to see o->branch1
> is the change_branch, not delete_branch.
> 
> So the condition to guard the call to update_file() also looks
> correct to me.

All of the above matches my understanding.  Would it have saved you
time if I had included some of this explanation in the patch "cover
letter"?

Matt



[PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths

2017-01-28 Thread Matt McCutchen
The current message printed by "git merge-recursive" for a rename/delete
conflict is like this:

CONFLICT (rename/delete): new-path deleted in HEAD and renamed in
other-branch. Version other-branch of new-path left in tree.

To be more helpful, the message should show both paths of the rename and
state that the deletion occurred at the old path, not the new path.  So
change the message to the following format:

CONFLICT (rename/delete): old-path deleted in HEAD and renamed to
new-path in other-branch. Version other-branch of new-path left in tree.

Since this doubles the number of cases in handle_change_delete (modify vs.
rename), refactor the code to halve the number of cases again by merging the
cases where o->branch1 has the change and o->branch2 has the delete with the
cases that are the other way around.

Also add a simple test of the new conflict message.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---

This came up at:

https://github.com/cristibalan/braid/issues/41#issuecomment-275826716

Please check that my refactoring is indeed correct!  All the existing tests pass
for me, but the existing test coverage of these conflict messages looks poor.

 merge-recursive.c  | 117 ++---
 t/t6045-merge-rename-delete.sh |  23 
 2 files changed, 86 insertions(+), 54 deletions(-)
 create mode 100755 t/t6045-merge-rename-delete.sh

diff --git a/merge-recursive.c b/merge-recursive.c
index d327209..e8fce10 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1061,16 +1061,20 @@ static int merge_file_one(struct merge_options *o,
 }
 
 static int handle_change_delete(struct merge_options *o,
-const char *path,
+const char *path, const char *old_path,
 const struct object_id *o_oid, int o_mode,
-const struct object_id *a_oid, int a_mode,
-const struct object_id *b_oid, int b_mode,
+const struct object_id *changed_oid,
+int changed_mode,
+const char *change_branch,
+const char *delete_branch,
 const char *change, const char *change_past)
 {
-   char *renamed = NULL;
+   char *alt_path = NULL;
+   const char *update_path = path;
int ret = 0;
+
if (dir_in_way(path, !o->call_depth, 0)) {
-   renamed = unique_path(o, path, a_oid ? o->branch1 : o->branch2);
+   update_path = alt_path = unique_path(o, path, change_branch);
}
 
if (o->call_depth) {
@@ -1081,43 +1085,43 @@ static int handle_change_delete(struct merge_options *o,
 */
ret = remove_file_from_cache(path);
if (!ret)
-   ret = update_file(o, 0, o_oid, o_mode,
- renamed ? renamed : path);
-   } else if (!a_oid) {
-   if (!renamed) {
-   output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
-  "and %s in %s. Version %s of %s left in tree."),
-  change, path, o->branch1, change_past,
-  o->branch2, o->branch2, path);
-   ret = update_file(o, 0, b_oid, b_mode, path);
-   } else {
-   output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
-  "and %s in %s. Version %s of %s left in tree at 
%s."),
-  change, path, o->branch1, change_past,
-  o->branch2, o->branch2, path, renamed);
-   ret = update_file(o, 0, b_oid, b_mode, renamed);
-   }
+   ret = update_file(o, 0, o_oid, o_mode, update_path);
} else {
-   if (!renamed) {
-   output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
-  "and %s in %s. Version %s of %s left in tree."),
-  change, path, o->branch2, change_past,
-  o->branch1, o->branch1, path);
+   if (!alt_path) {
+   if (!old_path) {
+   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
+  "and %s in %s. Version %s of %s left in 
tree."),
+  change, path, delete_branch, change_past,
+  change_branch, change_branch, path);
+   } else {
+   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "

[PATCH] t0001: don't let a default ACL interfere with the umask test

2017-01-28 Thread Matt McCutchen
The "init creates a new deep directory (umask vs. shared)" test expects
the permissions of newly created files to be based on the umask, which
fails if a default ACL is inherited from the working tree for git.  So
attempt to remove a default ACL if there is one.  Same idea as
8ed0a740dd42bd0724aebed6e3b07c4ea2a2d5e8.  (I guess I'm the only one who
ever runs the test suite with a default ACL set.)

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 t/t0001-init.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index b8fc588..e424de5 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -258,6 +258,9 @@ test_expect_success POSIXPERM 'init creates a new deep 
directory (umask vs. shar
(
# Leading directories should honor umask while
# the repository itself should follow "shared"
+   mkdir newdir &&
+   # Remove a default ACL if possible.
+   (setfacl -k newdir 2>/dev/null || true) &&
umask 002 &&
git init --bare --shared=0660 newdir/a/b/c &&
test_path_is_dir newdir/a/b/c/refs &&
-- 
2.9.3




"git diff --ignore-space-change --stat" lists files with only whitespace differences as "changed"

2017-01-17 Thread Matt McCutchen
A bug report: I noticed that "git diff --ignore-space-change --stat"
lists files with only whitespace differences as having changed with 0
differing lines.  This is inconsistent with the behavior without --
stat, which doesn't list such files at all.  (Same behavior with all
the --ignore*space* flags.)  I can reproduce this with the current
"next", af746e4.  Quick test case:

echo ' ' >test1 && echo '  ' >test2 &&
git diff --stat --no-index --ignore-space-change test1 test2

This caused me some inconvenience in the following scenario: I was
reading a commit diff that had a bulk license change in all files
combined with code changes.  I attempted to revert the bulk license
change locally using "sed" to more easily read the code diff, but my
reversion left some whitespace diffs where the original files had
inconsistent whitespace.  So the diffstat after my reversion was
cluttered with these "0" entries.

Regards,
Matt



Re: Protecting old temporary objects being reused from concurrent "git gc"?

2016-11-15 Thread Matt McCutchen
On Tue, 2016-11-15 at 12:40 -0500, Jeff King wrote:
> On Tue, Nov 15, 2016 at 12:33:04PM -0500, Matt McCutchen wrote:
> 
> > 
> > On Tue, 2016-11-15 at 12:06 -0500, Jeff King wrote:
> > > 
> > >  - when an object write is optimized out because we already have the
> > >    object, git will update the mtime on the file (loose object or
> > >    packfile) to freshen it
> > 
> > FWIW, I am not seeing this happen when I do "git read-tree --prefix"
> > followed by "git write-tree" using the current master (3ab2281).  See
> > the attached test script.
> 
> The optimization I'm thinking about is the one from write_sha1_file(),
> which learned to freshen in 33d4221c7 (write_sha1_file: freshen existing
> objects, 2014-10-15).
> 
> I suspect the issue is that read-tree populates the cache-tree index
> extension, and then write-tree omits the object write before it even
> gets to write_sha1_file(). The solution is that it should probably be
> calling one of the freshen() functions (possibly just replacing
> has_sha1_file() with check_and_freshen(), but I haven't looked).
> 
> I'd definitely welcome patches in this area.

Cool, it's nice to have an idea of what's going on.  I don't think I'm
going to try to fix it myself though.

By the way, thanks for the fast response to my original question!

Matt


[PATCH] git-gc.txt: expand discussion of races with other processes

2016-11-15 Thread Matt McCutchen
In general, "git gc" may delete objects that another concurrent process
is using but hasn't created a reference to.  Git has some mitigations,
but they fall short of a complete solution.  Document this in the
git-gc(1) man page and add a reference from the documentation of the
gc.pruneExpire config variable.

Based on a write-up by Jeff King:

http://marc.info/?l=git=147922960131779=2

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 Documentation/config.txt |  4 +++-
 Documentation/git-gc.txt | 34 ++
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 21fdddf..3f1d931 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1409,7 +1409,9 @@ gc.pruneExpire::
Override the grace period with this config variable.  The value
"now" may be used to disable this grace period and always prune
unreachable objects immediately, or "never" may be used to
-   suppress pruning.
+   suppress pruning.  This feature helps prevent corruption when
+   'git gc' runs concurrently with another process writing to the
+   repository; see the "NOTES" section of linkgit:git-gc[1].
 
 gc.worktreePruneExpire::
When 'git gc' is run, it calls
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index bed60f4..852b72c 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -63,11 +63,10 @@ automatic consolidation of packs.
 --prune=::
Prune loose objects older than date (default is 2 weeks ago,
overridable by the config variable `gc.pruneExpire`).
-   --prune=all prunes loose objects regardless of their age (do
-   not use --prune=all unless you know exactly what you are doing.
-   Unless the repository is quiescent, you will lose newly created
-   objects that haven't been anchored with the refs and end up
-   corrupting your repository).  --prune is on by default.
+   --prune=all prunes loose objects regardless of their age and
+   increases the risk of corruption if another process is writing to
+   the repository concurrently; see "NOTES" below. --prune is on by
+   default.
 
 --no-prune::
Do not prune any loose objects.
@@ -138,17 +137,36 @@ default is "2 weeks ago".
 Notes
 -
 
-'git gc' tries very hard to be safe about the garbage it collects. In
+'git gc' tries very hard not to delete objects that are referenced
+anywhere in your repository. In
 particular, it will keep not only objects referenced by your current set
 of branches and tags, but also objects referenced by the index,
 remote-tracking branches, refs saved by 'git filter-branch' in
 refs/original/, or reflogs (which may reference commits in branches
 that were later amended or rewound).
-
-If you are expecting some objects to be collected and they aren't, check
+If you are expecting some objects to be deleted and they aren't, check
 all of those locations and decide whether it makes sense in your case to
 remove those references.
 
+On the other hand, when 'git gc' runs concurrently with another process,
+there is a risk of it deleting an object that the other process is using
+but hasn't created a reference to. This may just cause the other process
+to fail or may corrupt the repository if the other process later adds a
+reference to the deleted object. Git has two features that significantly
+mitigate this problem:
+
+. Any object with modification time newer than the `--prune` date is kept,
+  along with everything reachable from it.
+
+. Most operations that add an object to the database update the
+  modification time of the object if it is already present so that #1
+  applies.
+
+However, these features fall short of a complete solution, so users who
+run commands concurrently have to live with some risk of corruption (which
+seems to be low in practice) unless they turn off automatic garbage
+collection with 'git config gc.auto 0'.
+
 HOOKS
 -
 
-- 
2.7.4




Re: Protecting old temporary objects being reused from concurrent "git gc"?

2016-11-15 Thread Matt McCutchen
On Tue, 2016-11-15 at 12:06 -0500, Jeff King wrote:
>  - when an object write is optimized out because we already have the
>    object, git will update the mtime on the file (loose object or
>    packfile) to freshen it

FWIW, I am not seeing this happen when I do "git read-tree --prefix"
followed by "git write-tree" using the current master (3ab2281).  See
the attached test script.

> If you have long-running data (like, a temporary index file that might
> literally sit around for days or weeks) I think that is a potential
> problem. And the solution is probably to use refs in some way to point
> to your objects.

Agreed.  This is not my current scenario.

> If you're worried about a short-term operation where
> somebody happens to run git-gc concurrently, I agree it's a possible
> problem, but I suspect something you can ignore in practice.
> 
> For the most part, a lot of the client-side git tools assume that one
> operation is happening at a time in the repository. And I think that
> largely holds for a developer working on a single clone, and things just
> work in practice.
> 
> Auto-gc makes that a little sketchier, but historically does not seem to
> have really caused problems in practice.

OK.  I'll write a patch to add a summary of this information to the
git-gc man page.

Matt

test-git-read-tree-write-tree-touch-object.sh
Description: application/shellscript


Protecting old temporary objects being reused from concurrent "git gc"?

2016-11-15 Thread Matt McCutchen
The Braid subproject management tool stores the subproject content in
the main tree and is able to switch to a different upstream revision of
a subproject by doing the equivalent of "git read-tree -m" on the
superproject tree and the two upstream trees.  The tricky part is
preparing temporary trees with the upstream content moved to the path
configured for the superproject.  The usual method is "git read-tree
--prefix", but using what index file?  Braid currently uses the user's
actual worktree, which can leave a mess if it gets interrupted:

https://github.com/cristibalan/braid/blob/7d81da6e86e24de62a74f3ab8d880666cb343b04/lib/braid/commands/update.rb#L98

I want to change this to something that won't leave an inconsistent
state if interrupted.  I've written code for this kind of thing before
that sets GIT_INDEX_FILE and uses a temporary index file and "git
write-tree".  But I realized that if "git gc" runs concurrently, the
generated tree could be deleted before it is used and the tool would
fail.  If I had a need to run "git commit-tree", it seems like I might
even end up with a commit object with a broken reference to a tree.
 "git gc" normally doesn't delete objects that were created in the last
2 weeks, but if an identical tree was added to the object database more
than 2 weeks ago by another operation and is unreferenced, it could be
reused without updating its mtime and it could still get deleted.

Is there a recommended way to avoid this kind of problem in add-on
tools?  (I searched the Git documentation and the web for information
about races with "git gc" and didn't find anything useful.)  If not, it
seems to be a significant design flaw in "git gc", even if the problem
is extremely rare in practice.  I wonder if some of the built-in
commands may have the same problem, though I haven't tried to test
them.  If this is confirmed to be a known problem affecting built-in
commands, then at least I won't feel bad about introducing the
same problem into add-on tools. :/

Thanks,
Matt


Re: "git subtree --squash" interacts poorly with revert, merge, and rebase

2016-11-14 Thread Matt McCutchen
On Thu, 2016-11-10 at 16:53 -0500, Matt McCutchen wrote:
> On Wed, 2016-10-26 at 19:07 -0400, Matt McCutchen wrote:
> > 
> > Maybe we would never hit any of these problems in practice, but they
> > give me a bad enough feeling that I'm planning to write my own tool
> > that tracks the upstream commit ID in a file (like a submodule) and
> > doesn't generate any extra commits.  Without generating extra commits,
> > the only place to store the upstream content in the superproject would
> > be in another subtree, which would take up disk space in every working
> > tree unless developers manually set skip-worktree.  I think I prefer to
> > not store the upstream content and just have the tool fetch it from a
> > local subproject repository each time it's needed.
> > 
> > I'll of course post the tool on the web and would be happy to see it
> > integrated into "git subtree" if that makes sense, but I don't know how
> > much time I'd be willing to put into making that happen.
> 
> I have named my tool "git subtree-lite" and posted it here:
> 
> https://mattmccutchen.net/utils/git-subtree-lite.git/

As I was doing additional research in preparation for adding git-
subtree-lite to the tools page
(https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools),
by chance I found an existing tool, Braid
(http://cristibalan.github.io/braid/), whose design meets my
requirements.  I have a few minor concerns, but assuming I'm able to
fix them without too much work and upstream accepts my patches, I plan
to switch to Braid.

I've made a properly marked section on the tools page for subproject
management tools:

https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools#Subprojects_or_sets_of_repositories

in the hope that the next person with the same requirements as me finds
Braid.  (I unfortunately didn't check that page before starting, but I
will the next time I need something.)

Matt


Re: [PATCH] fetch/push: document that private data can be leaked

2016-11-14 Thread Matt McCutchen
On Mon, 2016-11-14 at 11:00 -0800, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > 
> > > 
> > > Yup, and then "do not push to untrustworthy place without
> > > checking
> > > what you are pushing", too?
> > 
> > If there is no private data in the repository, then there is no
> > need
> > for the user to check what they are pushing. As I've indicated
> > before,
> > IMO manually checking each push would not be a workable security
> > measure in the long term anyway.
> 
> Then what is?

Don't put private data in the same repository, then the whole issue
becomes moot.  Am I missing something?

Matt


[PATCH] doc: mention transfer data leaks in more places

2016-11-14 Thread Matt McCutchen
The "SECURITY" section of the gitnamespaces(7) man page described two
ways for a client to steal data from a server that wasn't intended to be
shared. Similar attacks can be performed by a server on a client, so
adapt the section to cover both directions and add it to the
git-fetch(1), git-pull(1), and git-push(1) man pages. Also add
references to this section from the documentation of server
configuration options that attempt to control data leakage but may not
be fully effective.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 Documentation/config.txt  | 17 ++---
 Documentation/git-fetch.txt   |  2 ++
 Documentation/git-pull.txt|  2 ++
 Documentation/git-push.txt|  2 ++
 Documentation/gitnamespaces.txt   | 20 +---
 Documentation/transfer-data-leaks.txt | 30 ++
 6 files changed, 51 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/transfer-data-leaks.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 21fdddf..fc2cf83 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2898,6 +2898,11 @@ is omitted from the advertisements but 
`refs/heads/master` and
 `refs/namespaces/bar/refs/heads/master` are still advertised as so-called
 "have" lines. In order to match refs before stripping, add a `^` in front of
 the ref name. If you combine `!` and `^`, `!` must be specified first.
++
+Even if you hide refs, a client may still be able to steal the target
+objects via the techniques described in the "SECURITY" section of the
+linkgit:gitnamespaces[7] man page; it's best to keep private data in a
+separate repository.
 
 transfer.unpackLimit::
When `fetch.unpackLimit` or `receive.unpackLimit` are
@@ -2907,7 +2912,7 @@ transfer.unpackLimit::
 uploadarchive.allowUnreachable::
If true, allow clients to use `git archive --remote` to request
any tree, whether reachable from the ref tips or not. See the
-   discussion in the `SECURITY` section of
+   discussion in the "SECURITY" section of
linkgit:git-upload-archive[1] for more details. Defaults to
`false`.
 
@@ -2921,13 +2926,19 @@ uploadpack.allowTipSHA1InWant::
When `uploadpack.hideRefs` is in effect, allow `upload-pack`
to accept a fetch request that asks for an object at the tip
of a hidden ref (by default, such a request is rejected).
-   see also `uploadpack.hideRefs`.
+   See also `uploadpack.hideRefs`.  Even if this is false, a client
+   may be able to steal objects via the techniques described in the
+   "SECURITY" section of the linkgit:gitnamespaces[7] man page; it's
+   best to keep private data in a separate repository.
 
 uploadpack.allowReachableSHA1InWant::
Allow `upload-pack` to accept a fetch request that asks for an
object that is reachable from any ref tip. However, note that
calculating object reachability is computationally expensive.
-   Defaults to `false`.
+   Defaults to `false`.  Even if this is false, a client may be able
+   to steal objects via the techniques described in the "SECURITY"
+   section of the linkgit:gitnamespaces[7] man page; it's best to
+   keep private data in a separate repository.
 
 uploadpack.keepAlive::
When `upload-pack` has started `pack-objects`, there may be a
diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index 9e42169..b153aef 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -192,6 +192,8 @@ The first command fetches the `maint` branch from the 
repository at
 objects will eventually be removed by git's built-in housekeeping (see
 linkgit:git-gc[1]).
 
+include::transfer-data-leaks.txt[]
+
 BUGS
 
 Using --recurse-submodules can only fetch new commits in already checked
diff --git a/Documentation/git-pull.txt b/Documentation/git-pull.txt
index d033b25..4470e4b 100644
--- a/Documentation/git-pull.txt
+++ b/Documentation/git-pull.txt
@@ -237,6 +237,8 @@ If you tried a pull which resulted in complex conflicts and
 would want to start over, you can recover with 'git reset'.
 
 
+include::transfer-data-leaks.txt[]
+
 BUGS
 
 Using --recurse-submodules can only fetch new commits in already checked
diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 47b77e6..8eefabd 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -559,6 +559,8 @@ Commits A and B would no longer belong to a branch with a 
symbolic name,
 and so would be unreachable.  As such, these commits would be removed by
 a `git gc` command on the origin repository.
 
+include::transfer-data-leaks.txt[]
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/gitnamespaces.txt b/Documentation/gitnamespaces.txt
index 7685e36..b614969 100644
--- a/Documentation

Re: [PATCH] fetch/push: document that private data can be leaked

2016-11-14 Thread Matt McCutchen
On Sun, 2016-11-13 at 18:57 -0800, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > 
> >  Documentation/fetch-push-security.txt | 9 +
> 
> A new (consolidated) piece like this that can be included in
> multiple places is a good idea.  I wonder if the original
> description in "namespaces" thing can be moved here and then
> "namespaces" page can be made to also borrow from this?

I gave this a try.  New patch coming.

> > --- /dev/null
> > +++ b/Documentation/fetch-push-security.txt
> > @@ -0,0 +1,9 @@
> > +SECURITY
> > +
> > +The fetch and push protocols are not designed to prevent a
> > malicious
> > +server from stealing data from your repository that you did not
> > intend to
> > +share. The possible attacks are similar to the ones described in
> > the
> > +"SECURITY" section of linkgit:gitnamespaces[7]. If you have
> > private data
> > +that you need to protect from the server, keep it in a separate
> > +repository.
> 
> Yup, and then "do not push to untrustworthy place without checking
> what you are pushing", too?

If there is no private data in the repository, then there is no need
for the user to check what they are pushing.  As I've indicated before,
IMO manually checking each push would not be a workable security
measure in the long term anyway.

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-11-12 Thread Matt McCutchen
On Sun, 2016-10-30 at 01:16 -0700, Junio C Hamano wrote:
> Jon Loeliger  writes:
> 
> > 
> > Is there an existing protocol provision, or an extension to
> > the protocol that would allow a distrustful client to say to
> > the server, "Really, you have Y2?  Prove it."
> 
> There is not, but I do not think it would be an effective solution.
> 
> The issue is not the lack of protocol support, but how to determine
> that the other side needs such a proof for Y2 but not for other
> commits.  How does your side know what makes Y2 special and why does
> yout side think they should not have Y2?
> 
> Once you know how to determine Y2 is special, that knowledge can be
> used to abort the "push" before even starting.  When you are pushing
> back the 'master' and that 'master' reaches Y2, which must be kept
> secret, you shouldn't be pushing that 'master' to them, whether they
> claim to have Y2 or not.

FWIW, I can imagine a protocol that would prove possession for all
objects, which would completely fix the problem.  Each object would
have a "private" hash computed recursively over the object graph, just
like the ordinary object hash, but with a different seed.  The object
database would be extended to cache the private hash of every object.
 Then, during a fetch or push, when the two sides identify a matching
object, the side that would otherwise have had to send the object sends
the private hash.  Support for storing multiple hashes per object might
also be useful in some way for the migration to a stronger hash
function than SHA-1.

The next best solution, which doesn't require a protocol change but
requires a little user intervention, is to have a configuration option
per remote for a set of refs whose reachable objects are known to be
safe to send to the server.  This set presumably includes the remote's
own remote-tracking refs.  During fetch and push, the client looks for
matches only among these "safe" objects, effectively emulating a
repository containing only the safe objects.  A fetch may update the
remote-tracking refs to point to unsafe objects that were already in
the local repository, effectively making them safe, but only after the
server sends the content of these objects (and the client validates the
hashes!).

Unfortunately, I'm not signing up to implement either solution. :(

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-11-12 Thread Matt McCutchen
On Sun, 2016-10-30 at 01:03 -0700, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > 
> > On Fri, 2016-10-28 at 22:31 -0700, Junio C Hamano wrote:
> > > 
> > > Not sending to the list, where mails from Gmail/phone is known to
> > > get
> > > rejected.
> > 
> > [I guess I can go ahead and quote this to the list.]
> > 
> > > 
> > > No. I'm saying that the scenario you gave is bad and people
> > > should be
> > > taught not to connect to untrustworthy sites.
> > 
> > To clarify, are you saying:
> > 
> > (1) don't connect to an untrusted server ever (e.g., we don't
> > promise
> > that the server can't execute arbitrary code on the client), or
> > 
> > (2) don't connect to an untrusted server if the client repository
> > has
> > data that needs to be kept secret from the server?
> 
> You sneaked "arbitrary code execution" into the discussion but I do
> not know where it came from.  In any case, "don't pull from or push
> to untrustworthy place" would be a common sense advice that would
> make sense in any scenario ;-)

A blanket statement like that without explanation is not very helpful
to users who do find themselves needing to pull from or push to a
server they don't absolutely trust.  The only "definitely safe" option
it leaves them is to run the entire thing in a sandbox.  A statement of
the nature of the risk is much more helpful: users can determine that
they don't care about the risk, or if it does, what the easiest
workaround is.

The new risk we discovered in this thread is of leakage of private data
from the local repository.  To avoid that risk, it's sufficient for
users to move private data to a separate repository, so that's the
advice I propose to give.  Are you aware of issues with fetch/push with
potential impact beyond leakage of private data, which would make my
proposed text insufficient?  I was giving "arbitrary code execution" as
an example of what the impact of such an issue could be.

> Just for future reference, when you have ideas/issues that might
> have possible security ramifications, I'd prefer to see it first
> discussed on a private list we created for that exact purpose, until
> we can assess the impact (if any).  Right now MaintNotes says this:
> 
> If you think you found a security-sensitive issue and want to
> disclose
> it to us without announcing it to wider public, please contact us
> at
> our security mailing list <git-secur...@googlegroups.com>.  This
> is
> a closed list that is limited to people who need to know early
> about
> vulnerabilities, including:
> 
>   - people triaging and fixing reported vulnerabilities
>   - people operating major git hosting sites with many users
>   - people packaging and distributing git to large numbers of
> people
> 
> where these issues are discussed without risk of the information
> leaking out before we're ready to make public announcements.
> 
> We may want to tweak the description from "disclose it to us" to
> "have a discussion on it with us" (the former makes it sound as if
> the topic has to be a definite problem, the latter can include an
> idle speculation that may not be realistic attack vector).

OK.  I'll admit that I didn't even look for a policy on reporting of
security issues because I believed the issue had low enough impact that
a report to a dedicated security contact point would be unwelcome.
 Maybe that was reckless.  The new text sounds good, if you put it in a
place where people like me would see it. :/

Matt


[PATCH] fetch/push: document that private data can be leaked

2016-11-12 Thread Matt McCutchen
A malicious server may be able to use the fetch and push protocols to
steal data from a user's repository that the user did not intend to
share, via attacks similar to those described in the gitnamespaces(7)
man page. Mention this in the git-fetch(1), git-pull(1), and git-push(1)
man pages and recommend using separate repositories for private data and
interaction with untrusted servers.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---

And here's a proposed patch.  Based on the maint branch, ac84098.

 Documentation/fetch-push-security.txt | 9 +
 Documentation/git-fetch.txt   | 2 ++
 Documentation/git-pull.txt| 2 ++
 Documentation/git-push.txt| 2 ++
 4 files changed, 15 insertions(+)
 create mode 100644 Documentation/fetch-push-security.txt

diff --git a/Documentation/fetch-push-security.txt 
b/Documentation/fetch-push-security.txt
new file mode 100644
index 000..00944ed
--- /dev/null
+++ b/Documentation/fetch-push-security.txt
@@ -0,0 +1,9 @@
+SECURITY
+
+The fetch and push protocols are not designed to prevent a malicious
+server from stealing data from your repository that you did not intend to
+share. The possible attacks are similar to the ones described in the
+"SECURITY" section of linkgit:gitnamespaces[7]. If you have private data
+that you need to protect from the server, keep it in a separate
+repository.
+
diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index 9e42169..a461b4b 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -192,6 +192,8 @@ The first command fetches the `maint` branch from the 
repository at
 objects will eventually be removed by git's built-in housekeeping (see
 linkgit:git-gc[1]).
 
+include::fetch-push-security.txt[]
+
 BUGS
 
 Using --recurse-submodules can only fetch new commits in already checked
diff --git a/Documentation/git-pull.txt b/Documentation/git-pull.txt
index d033b25..0af2de9 100644
--- a/Documentation/git-pull.txt
+++ b/Documentation/git-pull.txt
@@ -237,6 +237,8 @@ If you tried a pull which resulted in complex conflicts and
 would want to start over, you can recover with 'git reset'.
 
 
+include::fetch-push-security.txt[]
+
 BUGS
 
 Using --recurse-submodules can only fetch new commits in already checked
diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 47b77e6..5ebef9e 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -559,6 +559,8 @@ Commits A and B would no longer belong to a branch with a 
symbolic name,
 and so would be unreachable.  As such, these commits would be removed by
 a `git gc` command on the origin repository.
 
+include::fetch-push-security.txt[]
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
2.7.4




Re: "git subtree --squash" interacts poorly with revert, merge, and rebase

2016-11-10 Thread Matt McCutchen
On Wed, 2016-10-26 at 19:07 -0400, Matt McCutchen wrote:
> Maybe we would never hit any of these problems in practice, but they
> give me a bad enough feeling that I'm planning to write my own tool
> that tracks the upstream commit ID in a file (like a submodule) and
> doesn't generate any extra commits.  Without generating extra commits,
> the only place to store the upstream content in the superproject would
> be in another subtree, which would take up disk space in every working
> tree unless developers manually set skip-worktree.  I think I prefer to
> not store the upstream content and just have the tool fetch it from a
> local subproject repository each time it's needed.
> 
> I'll of course post the tool on the web and would be happy to see it
> integrated into "git subtree" if that makes sense, but I don't know how
> much time I'd be willing to put into making that happen.

I have named my tool "git subtree-lite" and posted it here:

https://mattmccutchen.net/utils/git-subtree-lite.git/

For now, please email any bug reports, enhancement requests, or
proposed patches to me.  I have philosophical concerns about hosting my
own projects on services I don't control and practical concerns about a
 few "forge" apps that I looked into installing on my own web site, but
if people are seriously interested in collaborating on this, I'll work
something out.

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-29 Thread Matt McCutchen
On Sat, 2016-10-29 at 09:39 -0400, Jeff King wrote:
> I'm not sure I understand how connecting to a remote server to fetch is
> a big problem. The server may learn about the existence of particular
> sha1s in your repository, but cannot get their content.
> 
> It's the subsequent push that is a problem.
> 
> In the scenarios you've described, I'm mostly inclined to say that the
> problem is not git or the protocol itself, but rather lax refspecs.
> You mentioned earlier:
> 
>   the server can just generate another ref 'xx' pointing to Y2, assuming
>   it can entice the victim to set up a corresponding local branch
>   refs/heads/for-server1/xx and push it back.  Or if the victim is for
>   some reason just mirroring back and forth:
> 
> This sounds a lot like "I told git to push a bunch of things without
> checking if they were really secret, and it turned out to push some
> secret things". IOW I think the problem is not that the server may lie
> about what it has, but that the user was not careful about what they
> pushed. I dunno. I do not mind making a note in the documentation
> explaining the implications of a server lying, but the scenarios seem
> pretty contrived to me.

Let's focus on the first scenario.  There the user is just pulling and
pushing a master branch.  Are you saying that each time the user pulls,
they need to look over all the commits they pulled before pushing them
back?  I think that's unrealistic, for example, on a busy project with
centralized code review or if the user is publishing a project-specific 
modified version of an upstream library.  The natural user expectation
is that anything pulled from a public repository is public.

But let's see what Junio says in the other subthread.

> A much more interesting one, IMHO, is a server whose receive-pack lies
> about which objects it has (possibly ones it found out about earlier via
> fetch), which provokes the client to generate deltas against objects the
> server doesn't have (and thereby leaking information about the base
> objects).
> 
> That is a problem no matter how careful your refspecs are. I suspect it
> would be a hard attack to pull off in practice, just because it's going
> to depend heavily on the content of the specific objects, what kinds of
> deltas you can convince the other side to generate, etc. That might
> merit a mention in the git-push documentation.

Sure, if I end up doing a patch, I'll include this.

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-29 Thread Matt McCutchen
On Fri, 2016-10-28 at 22:31 -0700, Junio C Hamano wrote:
> Not sending to the list, where mails from Gmail/phone is known to get
> rejected.

[I guess I can go ahead and quote this to the list.]

> No. I'm saying that the scenario you gave is bad and people should be
> taught not to connect to untrustworthy sites.

To clarify, are you saying:

(1) don't connect to an untrusted server ever (e.g., we don't promise
that the server can't execute arbitrary code on the client), or

(2) don't connect to an untrusted server if the client repository has
data that needs to be kept secret from the server?

The fetch/push attack relates only to #2.  If #1, what are the other
risks you are thinking of?

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
On Fri, 2016-10-28 at 18:11 -0700, Junio C Hamano wrote:
> Ah, I see.  My immediate reaction is that you can do worse things in
> the reverse direction compared to this, but your scenario does sound
> bad already.

Are you saying that clients connecting to untrusted servers already
face worse risks that people should know about, so there is no point in
documenting this one?  I guess I don't know about the other risks aside
from accepting a corrupt object, which should be preventable by
enabling fetch.fsckObjects.  It seems we need either a statement that
connecting to untrusted servers is officially unsupported or a
description of the specific risks.

Matt


Re: Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
On Fri, 2016-10-28 at 15:00 -0700, Junio C Hamano wrote:
> Let me see if I understood your scenario correctly.
> 
> Suppose we start from this history where 'O' are common, your victim
> has a 'Y' branch with two commits that are private to it, as well as
> a 'X' branch on which it has X1 that it previously obtained from the
> server.  On the other hand, the server does not know about Y1 or Y2,
> and it added one commit X2 to the branch 'x' the victim is
> following:
> 
>    victimserver
> 
>  Y1---Y2   
> /  
> ---O---O---X1   ---O---O---X1---X2
> 
> Then when victim wants to fetch 'x' from the server, it would say
> 
> have X1, have Y2, have Y1, have O
> 
> and gets told to shut up by the server who heard enough.  The
> histories on these two parties will then become like this:
> 
> 
>    victimserver
> 
>  Y1---Y2   
> /  
> ---O---O---X1---X2  ---O---O---X1---X2

Then the server generates a commit X3 that lists Y2 as a parent, even
though it doesn't have Y2, and advances 'x' to X3.  The victim fetches
'x':

           victim                  server

 Y1---Y2                      (Y2)
/           \                         \ 
---O---O---X1---X2---X3   ---O---O---X1---X2---X3

Then the server rolls back 'x' to X2:

           victim                  server

             Y1---Y2
/           \
---O---O---X1---X2---X3   ---O---O---X1---X2

And the victim pushes:

           victim                  server

             Y1---Y2               Y1---Y2
/           \             /           \
---O---O---X1---X2---X3   ---O---O---X1---X2---X3

Now the server has the content of Y2.

If the victim is fetching and pulling a whole "directory" of refs, e.g:

fetch: refs/heads/*:refs/remotes/server1/*
push: refs/heads/for-server1/*:refs/heads/*

then instead of generating a merge commit, the server can just generate
another ref 'xx' pointing to Y2, assuming it can entice the victim to
set up a corresponding local branch refs/heads/for-server1/xx and push
it back.  Or if the victim is for some reason just mirroring back and
forth:

fetch: refs/heads/*:refs/heads/for-server1/*
push: refs/heads/for-
server1/*:refs/heads/*

then it doesn't have to set up a local branch as separate step.

Matt


Fetch/push lets a malicious server steal the targets of "have" lines

2016-10-28 Thread Matt McCutchen
I was studying the fetch protocol and I realized that in a scenario in
which a client regularly fetches a set of refs from a server and pushes
them back without careful scrutiny, the server can steal the targets of
unrelated refs from the client repository by fabricating its own refs
to the "have" objects specified by the client during the fetch.  This
is the reverse of attack #1 described in the "SECURITY" section of the
gitnamespaces(7) man page, with the addition that the server doesn't
have to know the object IDs in advance.  Is this supposed to be well-
known?  I've been using git since 2006 and it was a surprise to me.

Hopefully it isn't very common for a user to fetch and push with a
server they don't trust to have all the data in their repository.  I
don't think I have any such cases myself; I have unfinished work that
isn't meant for scrutiny by others, but nothing really damaging if it
were released to the server.  This attack presents no new risks if a
user already runs code fetched from the server in such a way that it
can read the repository.  But there might be some users who just review
embargoed security fixes from multiple sources (or something like that)
without running code themselves, and their security expectations might
be violated.

If my analysis is correct, I'd argue for documenting the issue in a
"SECURITY" section in the git-fetch man page.  Shall I submit a patch?

Thanks for your attention.

Matt


Re: "git subtree --squash" interacts poorly with revert, merge, and rebase

2016-10-26 Thread Matt McCutchen
On Wed, 2016-10-26 at 19:03 -0700, Stefan Beller wrote:
> On Wed, Oct 26, 2016 at 6:52 PM, Matt McCutchen <m...@mattmccutchen.net> 
> wrote:
> > 4. I pushed several dangling submodule pointers before I learned I
> > could set push.recurseSubmodules = check.  This isn't the default; each
> > developer has to do it manually.  (In theory, I could put such things
> > in a setup script for them to run if they trust me.)
> 
> There is a current series in flight/for review that makes "check" default.
> (It is blocked as check has some performance issues when having lots
> of commits to be pushed, so it may take a while and not show up in the
> next release)

Great!  One other thing: IIRC, "check" does not distinguish between
different remotes.  For example, suppose I fork a project that already
has a submodule and I have a pair of repositories that pull from the
"upstream" repositories and push to "origin" repositories for my
project.  Suppose I upgrade to a new upstream version and find that I'm
(temporarily) able to use the upstream submodule without modifications.
 The "check" feature won't stop me from pushing a pointer into the
"origin" superproject that points to a commit that exists in the
"upstream" subproject but not the "origin" subproject.

> > 5. Stashing changes to both the superproject and the subproject takes
> > more steps.
> 
> True, so you'd want to have a `git stash --recurse-submodules={yes,no}`
> where the command line option is configurable, so you don't have to type
> it all the time?

Sounds good.  I'm sure you realize this is not just a matter of running
"git stash" in each submodule because there are many ways the stash
stacks could get out of lockstep.  The submodule content needs to be
incorporated into the superproject stash.

> Thanks for pointing out the issues though. they align to what
> we plan on doing for submodules, so ... the plan actually makes
> sense :)

Again, I'm thrilled you're working on this, even if I don't use it on
my current project.

Matt


Re: "git subtree --squash" interacts poorly with revert, merge, and rebase

2016-10-26 Thread Matt McCutchen
Hi Stefan,

I appreciate the effort to remove obstacles to the use of submodules!
 It looks like a custom tool is probably still our best option at this
time, though we can always switch back to submodules later.

On Wed, 2016-10-26 at 16:23 -0700, Stefan Beller wrote:
> On Wed, Oct 26, 2016 at 4:07 PM, Matt McCutchen <m...@mattmccutchen.net> 
> wrote:
> > - We have to make separate commits and manage corresponding topic
> > branches for the superproject and subprojects.
> 
> Well yeah, that is how submodule work on a conceptual level.
> While having multiple commits may seem like overhead, note
> the subtle difference for these commits. One if deep down in the
> stack patching one of the submodules, the other is a high level
> commit advancing the submodule pointer.
> 
> Note that the target audience of these two commit messages
> might be vastly different, hence can be worded differently.
> (The submodule describing how you fixed e.g. a memleak or race condition
> and the superproject describes on why you needed to include that submodule,
> e.g. because you switched your toplevel application to use threads.)

I understand one can adhere to that philosophy, but I don't see the
practical benefit of doing so in our case compared to using a "git
subtree"-like approach and making a single commit.  It would just be us
looking at both commits.  If we do upstream any of the library changes,
we'll probably have to rework them anyway.

> > - A diff of the superproject doesn't include the content of
> > subprojects.

> Although this is just Git, you probably also have a code review system that
> would need that change as well.

Indeed.  We currently use Bitbucket.  I'd be open to switching, though
maybe not just for this.

> Is there anything else besides these 3 points that encourages you to
> switch away from submodules?

Those 3 are the ongoing pain points I can think of.  There are a few
other drawbacks compared to "git subtree" that come up less often:

1b. On another project, I was working with a teammate who was new to
version control and not very careful, who forgot to run "git submodule
update" and ended up committing back the old submodule pointer.
 Thankfully, this hasn't happened yet on my current project.

4. I pushed several dangling submodule pointers before I learned I
could set push.recurseSubmodules = check.  This isn't the default; each
developer has to do it manually.  (In theory, I could put such things
in a setup script for them to run if they trust me.)

5. Stashing changes to both the superproject and the subproject takes
more steps.

6. I use multiple worktrees (with "git worktree") because of #5 and
also so that I can run two versions of the application at the same time
and compare the behavior.  Using "git worktree" with submodules is
officially unsupported, though I was able to get things working by
manually editing some files.

7. We have to set up a repository for each subproject on our hosting
service.  Anyone who forks our application and modifies a subproject
has to set up a subproject repository and carry a change to .gitmodules
to point to their repository.  If we use relative URLs in .gitmodules,
it's even worse: anyone who forks our application has to set up
repositories for all the subprojects, even if they don't modify them.

Matt


"git subtree --squash" interacts poorly with revert, merge, and rebase

2016-10-26 Thread Matt McCutchen
I'm the lead developer of a research software application (https://bitb
ucket.org/objsheets/objsheets) that uses modified versions of two
third-party libraries, which we need to version and distribute along
with our application.  For better or for worse, we haven't made it a
priority to upstream our changes, so for now we just want to optimize
for ease of (1) making and reviewing changes and (2) upgrading to newer
upstream versions.

We've been using git submodules, but that's a pain for several reasons:
- We have to run "git submodule update" manually.
- We have to make separate commits and manage corresponding topic
branches for the superproject and subprojects.
- A diff of the superproject doesn't include the content of
subprojects.

Recently I looked into switching to the "git subtree" contrib tool in
the --squash mode, but I identified a few drawbacks compared to
submodules:

1. The upstream commit on which the subtree is based is assumed to be
given by the latest squash commit in "git log".  This means that (i) a
change to a different upstream commit can't be reverted with "git
revert" and (ii) a "git merge" of two superproject branches based on
different upstream commits may successfully merge the content of the
upstream commits but leave the tool thinking the subtree is based on an
arbitrary one of the two commits.

2. Rebasing messes up the merge commits generated by "git subtree --
squash".  --preserve-merges worked in a simple test but supposedly
doesn't work if there are conflicts or I want to reorder commits with
--interactive.

Maybe we would never hit any of these problems in practice, but they
give me a bad enough feeling that I'm planning to write my own tool
that tracks the upstream commit ID in a file (like a submodule) and
doesn't generate any extra commits.  Without generating extra commits,
the only place to store the upstream content in the superproject would
be in another subtree, which would take up disk space in every working
tree unless developers manually set skip-worktree.  I think I prefer to
not store the upstream content and just have the tool fetch it from a
local subproject repository each time it's needed.

I'll of course post the tool on the web and would be happy to see it
integrated into "git subtree" if that makes sense, but I don't know how
much time I'd be willing to put into making that happen.

Any advice?

Thanks,
Matt


Making the "Note from the maintainer" information discoverable

2016-02-09 Thread Matt McCutchen
On Mon, 2016-02-08 at 14:16 -0800, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > FWIW, as the person who wrote that section, I think that is a good
> > addition.  We do have a link to Simon Tatham's bug-reporting guide,
> > but
> > this is a good place to put project-specific advice.
> > 
> > In addition to "try it on next" you may want to also mention "try
> > it on
> > the latest version of git". That is another frequently given
> > pointer to
> > bug reporters.  Trying "next" is obviously a superset, but I
> > suspect
> > trying a released version may be an easier first step for some
> > people.
> 
> Yes, definitely.
> 
> I agree that testing with the latest released version would
> typically be much easier to end users than building from the source.
> It would reduce the need for "Ah, that's ancient issue, we know it
> was fixed a few releases ago." responses by us; I do not recall many
> of such responses in the recent history on the list, though.
> 
> For the ones who are more into the spirit of helping each other who
> can build from the source to help us even more, checking 'master'
> and finding regressions before it gets too late is a very good
> thing.  Checking 'next' and confirming an upcoming fix is equally
> valuable.

While researching an unrelated issue, I stumbled upon
http://marc.info/?l=git=142714670111063=2, which seems to have even
more valuable information about community processes.  Is there any
interest in making this information discoverable from
https://git-scm.com/community and/or the man pages?  I'm happy to file
an issue or to write a patch that adds a link, but I don't see myself
spending more time on it than that.

Matt
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: update_linked_gitdir writes relative path to .git/worktrees//gitdir

2016-02-09 Thread Matt McCutchen
On Mon, 2016-02-08 at 14:16 -0800, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > FWIW, as the person who wrote that section, I think that is a good
> > addition.  We do have a link to Simon Tatham's bug-reporting guide, but
> > this is a good place to put project-specific advice.
> > 
> > In addition to "try it on next" you may want to also mention "try it on
> > the latest version of git". That is another frequently given pointer to
> > bug reporters.  Trying "next" is obviously a superset, but I suspect
> > trying a released version may be an easier first step for some people.
> 
> Yes, definitely.
> 
> I agree that testing with the latest released version would
> typically be much easier to end users than building from the source.
> It would reduce the need for "Ah, that's ancient issue, we know it
> was fixed a few releases ago." responses by us; I do not recall many
> of such responses in the recent history on the list, though.
> 
> For the ones who are more into the spirit of helping each other who
> can build from the source to help us even more, checking 'master'
> and finding regressions before it gets too late is a very good
> thing.  Checking 'next' and confirming an upcoming fix is equally
> valuable.

OK, so this testing is an encouragement, not an expectation per se,
even if bug reports may be less likely to get attention without it (I'm
not familiar with the degree to which this may have been the case
recently).  See my revised proposed text here:

https://github.com/git/git-scm.com/pull/676/files

I'll send an analogous patch for the git(1) man page in a moment.

I left a mention of providing feedback on pending fixes but thought it
would be too much to go into the details of how to identify whether
there is a pending fix.  Is this sensible?

Matt


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git(1) man page has link to out-of-date googlecode site

2016-02-09 Thread Matt McCutchen
I noticed that near the top of the git(1) man page, there is a link to 
http://git-htmldocs.googlecode.com/git/git.html, which apparently has
not been updated since August 2015
(https://code.google.com/archive/p/git-htmldocs/source/default/commits).
Should this link just be removed, or replaced with
http://git-scm.com/docs (which lists a subset of the pages) or perhaps
http://git-scm.com/docs/git?  I will be happy to write the patch.

Matt


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git.txt: encourage bug reporters to test recent versions

2016-02-09 Thread Matt McCutchen
Specifically, the latest released version or the "next" branch, as
reporters are willing.  This is based on:

  http://marc.info/?l=git=145496979420513=2

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
 Documentation/git.txt | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index d987ad2..1a148bc 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1226,7 +1226,11 @@ Reporting Bugs
 
 Report bugs to the Git mailing list <git@vger.kernel.org> where the
 development and maintenance is primarily done.  You do not have to be
-subscribed to the list to send a message there.
+subscribed to the list to send a message there.  You can help us out by
+attempting to reproduce the bug in the latest released version of git,
+or if you're willing to build git from source, the `next` branch.
+Sometimes an attempted fix may be pending in this branch, in which case
+your feedback as to whether the fix worked for you will be appreciated.
 
 SEE ALSO
 
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/git-clean.txt: don't mention deletion of .git/modules/*

2016-02-09 Thread Matt McCutchen
On Tue, 2016-02-09 at 09:34 -0800, Junio C Hamano wrote:
> Mikko Rapeli <mikko.rap...@iki.fi> writes:
> 
> > Sorry, can't reproduce the problem where submodules stayed in the
> > tree until
> > git clean was called with two -f's.
> > 
> > You are right in removing the second part.
> 
> Thanks, then let's do this.

Looks good except for a typo (below).  I'll try to write better commit
messages in the future.

> -- >8 --
> From: Matt McCutchen <m...@mattmccutchen.net>
> Date: Sat, 6 Feb 2016 15:25:41 -0500
> Subject: [PATCH] Documentation/git-clean.txt: don't mention deletion
> of .git/modules/*
> 
> The latter half fo

of

> this sentence, the removal of the submodules, was
> never done with (or without) double -f back when it was written, and
> we still do not do so.
> 
> Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
> Acked-by: Mikko Rapeli <mikko.rap...@iki.fi>
> Signed-off-by: Junio C Hamano <gits...@pobox.com>
> ---
>  Documentation/git-clean.txt | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/Documentation/git-clean.txt b/Documentation/git-
> clean.txt
> index 641681f..51a7e26 100644
> --- a/Documentation/git-clean.txt
> +++ b/Documentation/git-clean.txt
> @@ -37,9 +37,7 @@ OPTIONS
>   to false, 'git clean' will refuse to delete files or
> directories
>   unless given -f, -n or -i. Git will refuse to delete
> directories
>   with .git sub directory or file unless a second -f
> - is given. This affects also git submodules where the storage
> area
> - of the removed submodule under .git/modules/ is not removed
> until
> - -f is given twice.
> + is given.
>  
>  -i::
>  --interactive::

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation/git-clean.txt: don't mention deletion of .git/modules/*

2016-02-08 Thread Matt McCutchen
On Mon, 2016-02-08 at 14:22 -0800, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > I found no evidence of such behavior in the source code.
> > 
> > Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
> > ---
> 
> That was added last year at bcd57cb9 (Documentation/git-clean.txt:
> document that -f may need to be given twice, 2015-02-26).  It would
> be better to know what got changed since then--that is, was the
> additional text unnecessary even back then, or we made changes to
> the system since then and forgot to remove the added text.
> 
> Mikko, is this need to give -f twice still the case?

I know you probably want confirmation from Mikko, but I'll offer my
understanding.  There were two statements added in bcd57cb9:

1. -f may need to be given twice to delete nested worktrees and
embedded repositories.  This is still true.

2. Deletion of submodule repositories under .git/modules is conditional
on -f being given twice.  AFAICT, this was wrong even back then: "git
clean" has never deleted such repositories under any conditions.

My patch is only removing #2.

Matt
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


update_linked_gitdir writes relative path to .git/worktrees//gitdir

2016-02-07 Thread Matt McCutchen
I noticed that when update_linked_gitdir chooses to update
.git/worktrees//gitdir, the path it writes is relative, at least
under some circumstances.  This contradicts the gitrepository-layout
man page, which says:

worktrees//gitdir::
A text file containing the absolute path back to the .git file
that points to here.

IIUC, this behavior defeats one of the three safeguards that is
supposed to prevent "git worktree prune" from pruning information for
worktrees that still exist.

A simple script to reproduce:

#!/bin/bash
set -e -x
rm -rf repo worktree2
git init repo
cd repo
touch foo
git add foo
git commit -m 'dummy commit'
git worktree add ../worktree2 -b branch2
cat .git/worktrees/worktree2/gitdir
touch -d '2 days ago' .git/worktrees/worktree2/gitdir
(cd ../worktree2 && git status)
cat .git/worktrees/worktree2/gitdir

Trying this on master as of earlier today (ff4ea60), I get:

[...]
/PATH/REDACTED/worktree2/.git
[...]
.git

Matt


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation/git-clean.txt: don't mention deletion of .git/modules/*

2016-02-07 Thread Matt McCutchen
I found no evidence of such behavior in the source code.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---
This is based on the maint branch, a08595f.

Try #2 to get correct email formatting.

 Documentation/git-clean.txt | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 641681f..51a7e26 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -37,9 +37,7 @@ OPTIONS
to false, 'git clean' will refuse to delete files or directories
unless given -f, -n or -i. Git will refuse to delete directories
with .git sub directory or file unless a second -f
-   is given. This affects also git submodules where the storage area
-   of the removed submodule under .git/modules/ is not removed until
-   -f is given twice.
+   is given.
 
 -i::
 --interactive::
-- 
2.5.0


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: update_linked_gitdir writes relative path to .git/worktrees//gitdir

2016-02-07 Thread Matt McCutchen
On Sun, 2016-02-07 at 15:56 -0800, Junio C Hamano wrote:
> Matt McCutchen <m...@mattmccutchen.net> writes:
> 
> > I noticed that when update_linked_gitdir chooses to update
> > .git/worktrees//gitdir, the path it writes is relative, at
> > least
> > under some circumstances.  This contradicts the gitrepository-
> > layout
> > man page, which says:
> 
> Duy, is it safe to say that the fix has already been cooking in
> 'next' as nd/do-not-move-worktree-manually topic,

Yes, looks like that topic removes the buggy functionality.

> it is very much
> appreciated when reporting bugs people check if a presumed fix is
> already cooking in 'next', try it to verify if it really fixes their
> problem, and send in a "OK fix is good" / "No that does not fix my
> case"?

Sorry to waste your time.  This wasn't documented where I looked,
namely the "Bug Reporting" section on http://git-scm.com/community .
 Here's a straw-man proposed update to that page:

https://github.com/mattmccutchen/git-scm.com/compare/master...bug-reporting-next

If you like it, I will submit it as a pull request.  I can propose a
similar update to the "REPORTING BUGS" section of the git(1) man page
if you like.

Matt


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation/git-clean.txt: don't mention deletion of .git/modules/*

2016-02-06 Thread Matt McCutchen
I found no evidence of such behavior in the source code.

Signed-off-by: Matt McCutchen <m...@mattmccutchen.net>
---

This is based on the maint branch, a08595f.

 Documentation/git-clean.txt | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 641681f..51a7e26 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -37,9 +37,7 @@ OPTIONS
    to false, 'git clean' will refuse to delete files or directories
    unless given -f, -n or -i. Git will refuse to delete directories
    with .git sub directory or file unless a second -f
-   is given. This affects also git submodules where the storage area
-   of the removed submodule under .git/modules/ is not removed until
-   -f is given twice.
+   is given.
 
 -i::
 --interactive::
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html