[PATCH v4 9/9] Documentation/config: add odb..promisorRemote

2018-08-01 Thread Christian Couder
From: Christian Couder 

Signed-off-by: Junio C Hamano 
---
 Documentation/config.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 43b2de7b5f..2d048d47f2 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2513,6 +2513,11 @@ This setting can be overridden with the 
`GIT_NOTES_REWRITE_REF`
 environment variable, which must be a colon separated list of refs or
 globs.
 
+odb..promisorRemote::
+   The name of a promisor remote. For now promisor remotes are
+   the only kind of remote object database (odb) that is
+   supported.
+
 pack.window::
The size of the window used by linkgit:git-pack-objects[1] when no
window size is given on the command line. Defaults to 10.
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 1/9] fetch-object: make functions return an error code

2018-08-01 Thread Christian Couder
From: Christian Couder 

The callers of the fetch_object() and fetch_objects() might
be interested in knowing if these functions succeeded or not.

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 fetch-object.c | 15 +--
 fetch-object.h |  6 +++---
 sha1-file.c|  4 ++--
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 48fe63dd6c..3c52009266 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -5,11 +5,12 @@
 #include "transport.h"
 #include "fetch-object.h"
 
-static void fetch_refs(const char *remote_name, struct ref *ref)
+static int fetch_refs(const char *remote_name, struct ref *ref)
 {
struct remote *remote;
struct transport *transport;
int original_fetch_if_missing = fetch_if_missing;
+   int res;
 
fetch_if_missing = 0;
remote = remote_get(remote_name);
@@ -19,18 +20,20 @@ static void fetch_refs(const char *remote_name, struct ref 
*ref)
 
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-   transport_fetch_refs(transport, ref, NULL);
+   res = transport_fetch_refs(transport, ref, NULL);
fetch_if_missing = original_fetch_if_missing;
+
+   return res;
 }
 
-void fetch_object(const char *remote_name, const unsigned char *sha1)
+int fetch_object(const char *remote_name, const unsigned char *sha1)
 {
struct ref *ref = alloc_ref(sha1_to_hex(sha1));
hashcpy(ref->old_oid.hash, sha1);
-   fetch_refs(remote_name, ref);
+   return fetch_refs(remote_name, ref);
 }
 
-void fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
+int fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
 {
struct ref *ref = NULL;
int i;
@@ -41,5 +44,5 @@ void fetch_objects(const char *remote_name, const struct 
oid_array *to_fetch)
new_ref->next = ref;
ref = new_ref;
}
-   fetch_refs(remote_name, ref);
+   return fetch_refs(remote_name, ref);
 }
diff --git a/fetch-object.h b/fetch-object.h
index 4b269d07ed..12e1f9ee70 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -3,9 +3,9 @@
 
 #include "sha1-array.h"
 
-extern void fetch_object(const char *remote_name, const unsigned char *sha1);
+extern int fetch_object(const char *remote_name, const unsigned char *sha1);
 
-extern void fetch_objects(const char *remote_name,
- const struct oid_array *to_fetch);
+extern int fetch_objects(const char *remote_name,
+const struct oid_array *to_fetch);
 
 #endif
diff --git a/sha1-file.c b/sha1-file.c
index de4839e634..c099f5584d 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1312,8 +1312,8 @@ int oid_object_info_extended(struct repository *r, const 
struct object_id *oid,
if (fetch_if_missing && repository_format_partial_clone &&
!already_retried && r == the_repository) {
/*
-* TODO Investigate having fetch_object() return
-* TODO error/success and stopping the music here.
+* TODO Investigate checking fetch_object() return
+* TODO value and stopping on error here.
 * TODO Pass a repository struct through fetch_object,
 * such that arbitrary repositories work.
 */
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 2/9] Add initial remote odb support

2018-08-01 Thread Christian Couder
From: Christian Couder 

The remote-odb.{c,h} files will contain the functions
that are called by the rest of Git mostly from
"sha1-file.c" to access the objects managed by the
remote odbs.

The odb-helper.{c,h} files will contain the functions to
actually implement communication with either the internal
functions or the external scripts or processes that will
manage and provide remote git objects.

For now only infrastructure to create helpers from the
config and to check if there are remote odbs and odb
helpers is implemented.

We expect that there will not be a lot of helpers, so it
is ok to use a simple linked list to manage them.

Helped-by: Jeff King 
Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 Makefile |  2 ++
 odb-helper.c | 16 +
 odb-helper.h | 19 +++
 remote-odb.c | 91 
 remote-odb.h |  7 
 5 files changed, 135 insertions(+)
 create mode 100644 odb-helper.c
 create mode 100644 odb-helper.h
 create mode 100644 remote-odb.c
 create mode 100644 remote-odb.h

diff --git a/Makefile b/Makefile
index 08e5c54549..2a9bd02208 100644
--- a/Makefile
+++ b/Makefile
@@ -896,6 +896,8 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += odb-helper.o
+LIB_OBJS += remote-odb.o
 LIB_OBJS += oidmap.o
 LIB_OBJS += oidset.o
 LIB_OBJS += packfile.o
diff --git a/odb-helper.c b/odb-helper.c
new file mode 100644
index 00..b4d403ffa9
--- /dev/null
+++ b/odb-helper.c
@@ -0,0 +1,16 @@
+#include "cache.h"
+#include "object.h"
+#include "argv-array.h"
+#include "odb-helper.h"
+#include "run-command.h"
+#include "sha1-lookup.h"
+
+struct odb_helper *odb_helper_new(const char *name, int namelen)
+{
+   struct odb_helper *o;
+
+   o = xcalloc(1, sizeof(*o));
+   o->name = xmemdupz(name, namelen);
+
+   return o;
+}
diff --git a/odb-helper.h b/odb-helper.h
new file mode 100644
index 00..4b792a3896
--- /dev/null
+++ b/odb-helper.h
@@ -0,0 +1,19 @@
+#ifndef ODB_HELPER_H
+#define ODB_HELPER_H
+
+/*
+ * An odb helper is a way to access a remote odb.
+ *
+ * Information in its fields comes from the config file [odb "NAME"]
+ * entries.
+ */
+struct odb_helper {
+   const char *name; /* odb..* */
+   const char *remote;   /* odb..promisorRemote */
+
+   struct odb_helper *next;
+};
+
+extern struct odb_helper *odb_helper_new(const char *name, int namelen);
+
+#endif /* ODB_HELPER_H */
diff --git a/remote-odb.c b/remote-odb.c
new file mode 100644
index 00..03be54ba2e
--- /dev/null
+++ b/remote-odb.c
@@ -0,0 +1,91 @@
+#include "cache.h"
+#include "remote-odb.h"
+#include "odb-helper.h"
+#include "config.h"
+
+static struct odb_helper *helpers;
+static struct odb_helper **helpers_tail = &helpers;
+
+static struct odb_helper *find_or_create_helper(const char *name, int len)
+{
+   struct odb_helper *o;
+
+   for (o = helpers; o; o = o->next)
+   if (!strncmp(o->name, name, len) && !o->name[len])
+   return o;
+
+   o = odb_helper_new(name, len);
+   *helpers_tail = o;
+   helpers_tail = &o->next;
+
+   return o;
+}
+
+static struct odb_helper *do_find_odb_helper(const char *remote)
+{
+   struct odb_helper *o;
+
+   for (o = helpers; o; o = o->next)
+   if (o->remote && !strcmp(o->remote, remote))
+   return o;
+
+   return NULL;
+}
+
+static int remote_odb_config(const char *var, const char *value, void *data)
+{
+   struct odb_helper *o;
+   const char *name;
+   int namelen;
+   const char *subkey;
+
+   if (parse_config_key(var, "odb", &name, &namelen, &subkey) < 0)
+   return 0;
+
+   o = find_or_create_helper(name, namelen);
+
+   if (!strcmp(subkey, "promisorremote")) {
+   const char *remote;
+   int res = git_config_string(&remote, var, value);
+
+   if (res)
+   return res;
+
+   if (do_find_odb_helper(remote))
+   return error(_("when parsing config key '%s' "
+  "helper for remote '%s' already exists"),
+var, remote);
+
+   o->remote = remote;
+
+   return 0;
+   }
+
+   return 0;
+}
+
+static void remote_odb_init(void)
+{
+   static int initialized;
+
+   if (initialized)
+   return;
+   initialized = 1;
+
+   git_config(remote_odb_config, NULL);
+}
+
+struct odb_helper *find_odb_helper(const char *remote)
+{
+   remote_odb_init();
+
+   if (!remote)
+   return helpers;
+
+   return do_find_odb_helper(remote);
+}
+
+int has_remote_odb(void)
+{
+   return !!find_odb_helper(NULL);
+}
diff --git a/remote-odb.h b/remote-odb.h
new file mode 100644
index 00..4c7b13695f
--- /dev/null
+++ b/remote-odb.h
@@

[PATCH v4 3/9] remote-odb: implement remote_odb_get_direct()

2018-08-01 Thread Christian Couder
From: Christian Couder 

This is implemented only in the promisor remote mode
for now by calling fetch_object().

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 odb-helper.c | 14 ++
 odb-helper.h |  3 ++-
 remote-odb.c | 17 +
 remote-odb.h |  1 +
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/odb-helper.c b/odb-helper.c
index b4d403ffa9..99b5a345e8 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -4,6 +4,7 @@
 #include "odb-helper.h"
 #include "run-command.h"
 #include "sha1-lookup.h"
+#include "fetch-object.h"
 
 struct odb_helper *odb_helper_new(const char *name, int namelen)
 {
@@ -14,3 +15,16 @@ struct odb_helper *odb_helper_new(const char *name, int 
namelen)
 
return o;
 }
+
+int odb_helper_get_direct(struct odb_helper *o,
+ const unsigned char *sha1)
+{
+   int res;
+   uint64_t start = getnanotime();
+
+   res = fetch_object(o->remote, sha1);
+
+   trace_performance_since(start, "odb_helper_get_direct");
+
+   return res;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 4b792a3896..4c52e81ce8 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -15,5 +15,6 @@ struct odb_helper {
 };
 
 extern struct odb_helper *odb_helper_new(const char *name, int namelen);
-
+extern int odb_helper_get_direct(struct odb_helper *o,
+const unsigned char *sha1);
 #endif /* ODB_HELPER_H */
diff --git a/remote-odb.c b/remote-odb.c
index 03be54ba2e..7f815c7ebb 100644
--- a/remote-odb.c
+++ b/remote-odb.c
@@ -89,3 +89,20 @@ int has_remote_odb(void)
 {
return !!find_odb_helper(NULL);
 }
+
+int remote_odb_get_direct(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   trace_printf("trace: remote_odb_get_direct: %s", sha1_to_hex(sha1));
+
+   remote_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   if (odb_helper_get_direct(o, sha1) < 0)
+   continue;
+   return 0;
+   }
+
+   return -1;
+}
diff --git a/remote-odb.h b/remote-odb.h
index 4c7b13695f..c5384c922d 100644
--- a/remote-odb.h
+++ b/remote-odb.h
@@ -3,5 +3,6 @@
 
 extern struct odb_helper *find_odb_helper(const char *remote);
 extern int has_remote_odb(void);
+extern int remote_odb_get_direct(const unsigned char *sha1);
 
 #endif /* REMOTE_ODB_H */
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 4/9] remote-odb: implement remote_odb_get_many_direct()

2018-08-01 Thread Christian Couder
From: Christian Couder 

This function will be used to get many objects from a promisor
remote.

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 odb-helper.c | 15 +++
 odb-helper.h |  3 +++
 remote-odb.c | 17 +
 remote-odb.h |  1 +
 4 files changed, 36 insertions(+)

diff --git a/odb-helper.c b/odb-helper.c
index 99b5a345e8..246ebf8f7a 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -28,3 +28,18 @@ int odb_helper_get_direct(struct odb_helper *o,
 
return res;
 }
+
+int odb_helper_get_many_direct(struct odb_helper *o,
+  const struct oid_array *to_get)
+{
+   int res;
+   uint64_t start;
+
+   start = getnanotime();
+
+   res = fetch_objects(o->remote, to_get);
+
+   trace_performance_since(start, "odb_helper_get_many_direct");
+
+   return res;
+}
diff --git a/odb-helper.h b/odb-helper.h
index 4c52e81ce8..154ce4c7e4 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -17,4 +17,7 @@ struct odb_helper {
 extern struct odb_helper *odb_helper_new(const char *name, int namelen);
 extern int odb_helper_get_direct(struct odb_helper *o,
 const unsigned char *sha1);
+extern int odb_helper_get_many_direct(struct odb_helper *o,
+ const struct oid_array *to_get);
+
 #endif /* ODB_HELPER_H */
diff --git a/remote-odb.c b/remote-odb.c
index 7f815c7ebb..09dfc2e16f 100644
--- a/remote-odb.c
+++ b/remote-odb.c
@@ -106,3 +106,20 @@ int remote_odb_get_direct(const unsigned char *sha1)
 
return -1;
 }
+
+int remote_odb_get_many_direct(const struct oid_array *to_get)
+{
+   struct odb_helper *o;
+
+   trace_printf("trace: remote_odb_get_many_direct: nr: %d", to_get->nr);
+
+   remote_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   if (odb_helper_get_many_direct(o, to_get) < 0)
+   continue;
+   return 0;
+   }
+
+   return -1;
+}
diff --git a/remote-odb.h b/remote-odb.h
index c5384c922d..e10a8bf855 100644
--- a/remote-odb.h
+++ b/remote-odb.h
@@ -4,5 +4,6 @@
 extern struct odb_helper *find_odb_helper(const char *remote);
 extern int has_remote_odb(void);
 extern int remote_odb_get_direct(const unsigned char *sha1);
+extern int remote_odb_get_many_direct(const struct oid_array *to_get);
 
 #endif /* REMOTE_ODB_H */
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 0/9] Introducing remote ODBs

2018-08-01 Thread Christian Couder
This path series is a follow up from the patch series called "odb
remote" that I sent earlier this year, which was itself a follow up
from previous series. See the links section for more information.

As with the previous "odb remote" series, this series is only about
integrating with the promisor/narrow clone work and showing that it
makes it possible to use more than one promisor remote. Everything
that is not necessary for that integration has been removed for now
(though you can still find it in one of my branches on GitHub if you
want).

There is one test in patch 8/9 that shows that more than one promisor
remote can now be used, but I still feel that it could be interesting
to add other such tests, so I am open to ideas in this area.

Changes compared to V3 of this patch series
~~~

  - the remote_odb_reinit() function is not "inline" anymore in patch
5/9 as suggested by Beat Bolli in:

https://public-inbox.org/git/20180724215223.27516-3-dev+...@drbeat.li/

High level overview of this patch series


All the patches except 5/9 are unchanged compared to V3.

  - Patch 1/9:

This makes functions in fetch-object.c return an error code, which is
necessary to later tell that they failed and try another remote odb
when there is more than one. This could also just be seen as a fix to
these functions.

  - Patch 2/9:

This introduces the minimum infrastructure for remote odbs.

  - Patches 3/9 and 4/9:

These patches implement remote_odb_get_direct() and
remote_odb_get_many_direct() using the functions from fetch-object.c.
These new functions will be used in following patches to replace the
functions from fetch-object.c.

  - Patch 5/9:

This implement remote_odb_reinit() which will be needed to reparse the
remote odb configuration.

  - Patches 6/9 and 7/9:

These patches integrate the remote odb mechanism into the
promisor/narrow clone code. The "extensions.partialClone" config
option is replaced by "odb..promisorRemote" and
"core.partialCloneFilter" is replaced by
"odb..partialCloneFilter".

  - Patch 8/9:

This adds a test case that shows that now more than one promisor
remote can be used.

  - Patch 9/9:

This starts documenting the remote odb mechanism.

Discussion
~~

I am not sure that it is ok to completely replace the
"extensions.partialclone" config option. Even if it is fully replaced,
no "extensions.remoteodb" is implemented in these patches, as maybe
the "extensions.partialclone" name could be kept even if the
underlying mechanism is the remote odb mechanism.

Anyway I think that the remote odb mechanism is much more extensible,
so I think using "extensions.partialclone" to specify a promisor
remote should be at least deprecated.

Links
~

This patch series on GitHub:

V4: https://github.com/chriscool/git/commits/remote-odb
V3: https://github.com/chriscool/git/commits/remote-odb3
V2: https://github.com/chriscool/git/commits/remote-odb2
V1: https://github.com/chriscool/git/commits/remote-odb1

Discussions related to previous versions:

V3: https://public-inbox.org/git/20180713174959.16748-1-chrisc...@tuxfamily.org/
V2: https://public-inbox.org/git/20180630083542.20347-1-chrisc...@tuxfamily.org/
V1: https://public-inbox.org/git/20180623121846.19750-1-chrisc...@tuxfamily.org/

Previous "odb remote" series:

https://public-inbox.org/git/20180513103232.17514-1-chrisc...@tuxfamily.org/
https://github.com/chriscool/git/commits/odb-remote

Version 1 and 2 of the "Promisor remotes and external ODB support" series:

https://public-inbox.org/git/20180103163403.11303-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20180319133147.15413-1-chrisc...@tuxfamily.org/

Version 1 and 2 of the "Promisor remotes and external ODB support" series on 
GitHub:

https://github.com/chriscool/git/commits/gl-small-promisor-external-odb12
https://github.com/chriscool/git/commits/gl-small-promisor-external-odb71

Christian Couder (9):
  fetch-object: make functions return an error code
  Add initial remote odb support
  remote-odb: implement remote_odb_get_direct()
  remote-odb: implement remote_odb_get_many_direct()
  remote-odb: add remote_odb_reinit()
  Use remote_odb_get_direct() and has_remote_odb()
  Use odb.origin.partialclonefilter instead of core.partialclonefilter
  t0410: test fetching from many promisor remotes
  Documentation/config: add odb..promisorRemote

 Documentation/config.txt  |   5 ++
 Makefile  |   2 +
 builtin/cat-file.c|   5 +-
 builtin/fetch.c   |  13 ++--
 builtin/gc.c  |   3 +-
 builtin/repack.c  |   3 +-
 cache.h   |   2 -
 connected.c   |   3 +-
 environment.c |   1 -
 fetch-object.c|  15 ++--
 fetch-object.h|   6 +-
 list-objects-filter-options.c |  51 +++--
 list-objects-filter-options.h |   3 +-
 odb-helper.c   

[PATCH v4 8/9] t0410: test fetching from many promisor remotes

2018-08-01 Thread Christian Couder
From: Christian Couder 

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 t/t0410-partial-clone.sh | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 71a9b9a3e8..9d513ebf57 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -170,8 +170,30 @@ test_expect_success 'fetching of missing objects' '
git verify-pack --verbose "$IDX" | grep "$HASH"
 '
 
+test_expect_success 'fetching of missing objects from another odb remote' '
+   git clone "file://$(pwd)/server" server2 &&
+   test_commit -C server2 bar &&
+   git -C server2 repack -a -d --write-bitmap-index &&
+   HASH2=$(git -C server2 rev-parse bar) &&
+
+   git -C repo remote add server2 "file://$(pwd)/server2" &&
+   git -C repo config odb.magic2.promisorRemote server2 &&
+   git -C repo cat-file -p "$HASH2" &&
+
+   git -C repo fetch server2 &&
+   rm -rf repo/.git/objects/* &&
+   git -C repo cat-file -p "$HASH2" &&
+
+   # Ensure that the .promisor file is written, and check that its
+   # associated packfile contains the object
+   ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
+   test_line_count = 1 promisorlist &&
+   IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
+   git verify-pack --verbose "$IDX" | grep "$HASH2"
+'
+
 test_expect_success 'rev-list stops traversal at missing and promised commit' '
-   rm -rf repo &&
+   rm -rf repo server server2 &&
test_create_repo repo &&
test_commit -C repo foo &&
test_commit -C repo bar &&
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 5/9] remote-odb: add remote_odb_reinit()

2018-08-01 Thread Christian Couder
From: Christian Couder 

We will need to reinitialize the remote odb configuration
as we will make some changes to it in a later commit when
we will detect that a remote is also a remote odb.

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 remote-odb.c | 14 --
 remote-odb.h |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/remote-odb.c b/remote-odb.c
index 09dfc2e16f..f063ba0fda 100644
--- a/remote-odb.c
+++ b/remote-odb.c
@@ -64,17 +64,27 @@ static int remote_odb_config(const char *var, const char 
*value, void *data)
return 0;
 }
 
-static void remote_odb_init(void)
+static void remote_odb_do_init(int force)
 {
static int initialized;
 
-   if (initialized)
+   if (!force && initialized)
return;
initialized = 1;
 
git_config(remote_odb_config, NULL);
 }
 
+static inline void remote_odb_init(void)
+{
+   remote_odb_do_init(0);
+}
+
+void remote_odb_reinit(void)
+{
+   remote_odb_do_init(1);
+}
+
 struct odb_helper *find_odb_helper(const char *remote)
 {
remote_odb_init();
diff --git a/remote-odb.h b/remote-odb.h
index e10a8bf855..79803782af 100644
--- a/remote-odb.h
+++ b/remote-odb.h
@@ -1,6 +1,7 @@
 #ifndef REMOTE_ODB_H
 #define REMOTE_ODB_H
 
+extern void remote_odb_reinit(void);
 extern struct odb_helper *find_odb_helper(const char *remote);
 extern int has_remote_odb(void);
 extern int remote_odb_get_direct(const unsigned char *sha1);
-- 
2.18.0.330.g17eb9fed90



[PATCH v4 7/9] Use odb.origin.partialclonefilter instead of core.partialclonefilter

2018-08-01 Thread Christian Couder
From: Christian Couder 

Let's make the partial clone filter specific to one odb
instead of general to all the odbs.

This makes it possible to have different partial clone
filters for different odbs.

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 builtin/fetch.c   |  2 +-
 list-objects-filter-options.c | 28 
 list-objects-filter-options.h |  3 ++-
 odb-helper.h  |  1 +
 remote-odb.c  |  2 ++
 t/t5616-partial-clone.sh  |  2 +-
 6 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 9c7df8356c..f3dee73179 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1369,7 +1369,7 @@ static inline void fetch_one_setup_partial(struct remote 
*remote)
 * the config.
 */
if (!filter_options.choice)
-   partial_clone_get_default_filter_spec(&filter_options);
+   partial_clone_get_default_filter_spec(&filter_options, 
remote->name);
return;
 }
 
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 60452c8f36..755a793664 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -7,6 +7,7 @@
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
 #include "remote-odb.h"
+#include "odb-helper.h"
 
 /*
  * Parse value of the argument to the "filter" keyword.
@@ -29,6 +30,9 @@ static int gently_parse_list_objects_filter(
 {
const char *v0;
 
+   if (!arg)
+   return 0;
+
if (filter_options->choice) {
if (errbuf) {
strbuf_init(errbuf, 0);
@@ -116,6 +120,7 @@ void partial_clone_register(
const struct list_objects_filter_options *filter_options)
 {
char *cfg_name;
+   char *filter_name;
 
/* Check if it is already registered */
if (find_odb_helper(remote))
@@ -131,27 +136,26 @@ void partial_clone_register(
/*
 * Record the initial filter-spec in the config as
 * the default for subsequent fetches from this remote.
-*
-* TODO: move core.partialclonefilter into odb.
 */
-   core_partial_clone_filter_default =
-   xstrdup(filter_options->filter_spec);
-   git_config_set("core.partialclonefilter",
-  core_partial_clone_filter_default);
+   filter_name = xstrfmt("odb.%s.partialclonefilter", remote);
+   git_config_set(filter_name, filter_options->filter_spec);
+   free(filter_name);
 
/* Make sure the config info are reset */
remote_odb_reinit();
 }
 
 void partial_clone_get_default_filter_spec(
-   struct list_objects_filter_options *filter_options)
+   struct list_objects_filter_options *filter_options,
+   const char *remote)
 {
+   struct odb_helper *helper = find_odb_helper(remote);
+
/*
 * Parse default value, but silently ignore it if it is invalid.
 */
-   if (!core_partial_clone_filter_default)
-   return;
-   gently_parse_list_objects_filter(filter_options,
-core_partial_clone_filter_default,
-NULL);
+   if (helper)
+   gently_parse_list_objects_filter(filter_options,
+helper->partial_clone_filter,
+NULL);
 }
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index a61f82..12ceef3230 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -74,6 +74,7 @@ void partial_clone_register(
const char *remote,
const struct list_objects_filter_options *filter_options);
 void partial_clone_get_default_filter_spec(
-   struct list_objects_filter_options *filter_options);
+   struct list_objects_filter_options *filter_options,
+   const char *remote);
 
 #endif /* LIST_OBJECTS_FILTER_OPTIONS_H */
diff --git a/odb-helper.h b/odb-helper.h
index 154ce4c7e4..4af75ef55c 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -10,6 +10,7 @@
 struct odb_helper {
const char *name; /* odb..* */
const char *remote;   /* odb..promisorRemote */
+   const char *partial_clone_filter; /* odb..partialCloneFilter */
 
struct odb_helper *next;
 };
diff --git a/remote-odb.c b/remote-odb.c
index f063ba0fda..49cf8e30aa 100644
--- a/remote-odb.c
+++ b/remote-odb.c
@@ -60,6 +60,8 @@ static int remote_odb_config(const char *var, const char 
*value, void *data)
 
return 0;
}
+   if (!strcmp(subkey, "partialclonefilter"))
+   return git_config_string(&o->partial_clone_filter, var, value);
 
return 0;
 }
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index f7ed3c40e2..e2aeee1d7d 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616

[PATCH v4 6/9] Use remote_odb_get_direct() and has_remote_odb()

2018-08-01 Thread Christian Couder
From: Christian Couder 

Instead of using the repository_format_partial_clone global
and fetch_object() directly, let's use has_remote_odb() and
remote_odb_get_direct().

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 builtin/cat-file.c|  5 +++--
 builtin/fetch.c   | 11 ++-
 builtin/gc.c  |  3 ++-
 builtin/repack.c  |  3 ++-
 cache.h   |  2 --
 connected.c   |  3 ++-
 environment.c |  1 -
 list-objects-filter-options.c | 27 +++
 packfile.c|  3 ++-
 setup.c   |  7 +--
 sha1-file.c   | 14 --
 t/t0410-partial-clone.sh  | 34 +-
 t/t5500-fetch-pack.sh |  4 ++--
 t/t5601-clone.sh  |  2 +-
 t/t5616-partial-clone.sh  |  2 +-
 t/t5702-protocol-v2.sh|  2 +-
 unpack-trees.c|  6 +++---
 17 files changed, 66 insertions(+), 63 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 4a44b2404f..4d19e9277e 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "remote-odb.h"
 
 struct batch_options {
int enabled;
@@ -478,8 +479,8 @@ static int batch_objects(struct batch_options *opt)
 
for_each_loose_object(batch_loose_object, &sa, 0);
for_each_packed_object(batch_packed_object, &sa, 0);
-   if (repository_format_partial_clone)
-   warning("This repository has extensions.partialClone 
set. Some objects may not be loaded.");
+   if (has_remote_odb())
+   warning("This repository uses an odb. Some objects may 
not be loaded.");
 
cb.opt = opt;
cb.expand = &data;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ac06f6a576..9c7df8356c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -22,6 +22,7 @@
 #include "utf8.h"
 #include "packfile.h"
 #include "list-objects-filter-options.h"
+#include "remote-odb.h"
 
 static const char * const builtin_fetch_usage[] = {
N_("git fetch [] [ [...]]"),
@@ -1339,7 +1340,7 @@ static inline void fetch_one_setup_partial(struct remote 
*remote)
 * If no prior partial clone/fetch and the current fetch DID NOT
 * request a partial-fetch, do a normal fetch.
 */
-   if (!repository_format_partial_clone && !filter_options.choice)
+   if (!has_remote_odb() && !filter_options.choice)
return;
 
/*
@@ -1347,7 +1348,7 @@ static inline void fetch_one_setup_partial(struct remote 
*remote)
 * on this repo and remember the given filter-spec as the default
 * for subsequent fetches to this remote.
 */
-   if (!repository_format_partial_clone && filter_options.choice) {
+   if (!has_remote_odb() && filter_options.choice) {
partial_clone_register(remote->name, &filter_options);
return;
}
@@ -1356,7 +1357,7 @@ static inline void fetch_one_setup_partial(struct remote 
*remote)
 * We are currently limited to only ONE promisor remote and only
 * allow partial-fetches from the promisor remote.
 */
-   if (strcmp(remote->name, repository_format_partial_clone)) {
+   if (!find_odb_helper(remote->name)) {
if (filter_options.choice)
die(_("--filter can only be used with the remote 
configured in core.partialClone"));
return;
@@ -1487,7 +1488,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
if (depth || deepen_since || deepen_not.nr)
deepen = 1;
 
-   if (filter_options.choice && !repository_format_partial_clone)
+   if (filter_options.choice && !has_remote_odb())
die("--filter can only be used when extensions.partialClone is 
set");
 
if (all) {
@@ -1521,7 +1522,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
}
 
if (remote) {
-   if (filter_options.choice || repository_format_partial_clone)
+   if (filter_options.choice || has_remote_odb())
fetch_one_setup_partial(remote);
result = fetch_one(remote, argc, argv, prune_tags_ok);
} else {
diff --git a/builtin/gc.c b/builtin/gc.c
index ccfb1ceaeb..02c783b514 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -26,6 +26,7 @@
 #include "pack-objects.h"
 #include "blob.h"
 #include "tree.h"
+#include "remote-odb.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -619,7 +620,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
argv_array_push(&prune, prune_expire);
if (quiet)
argv_array_push(&prune, "--no-progress");
- 

Re: Fetch on submodule update

2018-08-01 Thread Jonathan Nieder
Hi again,

Robert Dailey wrote:

> Problem: I want to avoid recursively fetching submodules when I run a
> `fetch` command, and instead defer that operation to the next
> `submodule update`. Essentially I want `fetch.recurseSubmodules` to be
> `false`, and `get submodule update` to do exactly what it does with
> the `--remote` option, but still use the SHA1 of the submodule instead
> of updating to the tip of the specified branch in the git modules
> config.

I think I misread this the first time.  I got distracted by your
mention of the --remote option, but you mentioned you want to use the
SHA-1 of the submodule listed, so that was silly of me.

I think you'll find that "git fetch --no-recurse-submodules" and "git
submodule update" do exactly what you want.  "git submodule update"
does perform a fetch (unless you pass --no-fetch).

Let me know how it goes. :)

I'd still be interested in hearing more about the nature of the
submodules involved --- maybe `submodule.fetchJobs` would help, or
maybe this is a workflow where a tool that transparently fetches
submodules on demand like
https://gerrit.googlesource.com/gitfs/+/master/docs/design.md would be
useful (I'm not recommending using slothfs for this today, since it's
read-only, but it illustrates the idea).

Thanks,
Jonathan


Re: [PATCH] remote: clear string_list after use in mv()

2018-08-01 Thread Jonathan Nieder
Hi,

René Scharfe wrote:

> Subject: remote: clear string_list after use in mv()

This subject line doesn't fully reflect the goodness of this change.
How about something like:

remote mv: avoid leaking ref names in string_list

?

> Switch to the _DUP variant of string_list for remote_branches to allow
> string_list_clear() to release the allocated memory at the end, and
> actually call that function.  Free the util pointer as well; it is
> allocated in read_remote_branches().
>
> NB: This string_list is empty until read_remote_branches() is called
> via for_each_ref(), so there is no need to clean it up when returning
> before that point.
>
> Signed-off-by: Rene Scharfe 
> ---
> Patch generated with ---function-context for easier review -- that
> makes it look much bigger than it actually is, though.

Thanks, that indeed helps.

[...]
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -558,23 +558,23 @@ struct rename_info {

optional: Would it be worth a comment in the struct definition to say
this string_list owns its items (or in other words that it's a _DUP
string_list)?

I think we're fine without, since it's local to the file, but may make
sense to do if rerolling.

[...]
> @@ -607,133 +607,134 @@ static int migrate_file(struct remote *remote)
>  static int mv(int argc, const char **argv)
>  {
>   struct option options[] = {
>   OPT_END()
>   };
>   struct remote *oldremote, *newremote;
>   struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT,
>   old_remote_context = STRBUF_INIT;
> - struct string_list remote_branches = STRING_LIST_INIT_NODUP;
> + struct string_list remote_branches = STRING_LIST_INIT_DUP;

Verified that these are the only two functions that touch the
remote_branches field.  This patch didn't miss any callers.

[...]
>   if (!refspec_updated)
>   return 0;
>  
>   /*
>* First remove symrefs, then rename the rest, finally create
>* the new symrefs.
>*/
>   for_each_ref(read_remote_branches, &rename);

As you noted, this is the first caller that writes to the string_list,
so we don't have to worry about the 'return 0' above.  That said, I
wonder if it might be simpler and more futureproof to use
destructor-style cleanup handling anyway:

if (!refspec_updated)
goto done;
 [...]
  done:
string_list_clear(&remote_branches, 1);
return 0;

[...]
> + string_list_clear(&remote_branches, 1);

not about this patch: I wonder if we should make the free_util
parameter a flag word so the behavior is more obvious in the caller:

string_list_clear(&remote_branches, STRING_LIST_FREE_UTIL);

Or maybe even having a separate function:

string_list_clear_free_util(&remote_branches);

That's a topic for another day, though.

With whatever subset of the changes suggested above makes sense,

Reviewed-by: Jonathan Nieder 

Thanks for a pleasant read.


Re: [PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Jonathan Nieder
Junio C Hamano wrote:

> One thing I forgot to mention.
>
> When asking to fetch T, in order to be able to favor refs/tags/T
> over refs/heads/T at the fetching end, you would have to be able to
> *see* both, so all 6 variants "T", "refs/tags/T", "refs/heads/T",
> "refs/remotes/T", "refs/remotes/T/HEAD" must be asked to be shown
> when the ls-remote limiting is in effect.  Since the ls-remote
> filtering is relatively new development, we may further find subtle
> remaining bugs, if there still are some.

Fortunately, the fetch code does already do that. ;-)

Jonathan


Re: [PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Junio C Hamano
Jonathan Nieder  writes:

>> +const int num_rules = ARRAY_SIZE(ref_rev_parse_rules) - 1;
>
> This is assuming ref_rev_parse_rules consists exactly of its items
> followed by a NULL terminator, which is potentially a bit subtle.  I
> wonder if we should put
>
>   static const char *ref_rev_parse_rules[] = {
>   "%.*s",
>   "refs/%.*s",
>   "refs/tags/%.*s",
>   "refs/heads/%.*s",
>   "refs/remotes/%.*s",
>   "refs/remotes/%.*s/HEAD",
>   NULL
>   };
>   #define NUM_REV_PARSE_RULES (ARRAY_SIZE(ref_rev_parse_rules) - 1)
>
> and then use something like
>
>   const int num_rules = NUM_REV_PARSE_RULES;

Perhaps.  If we were to go that length, I'd rather first see if we
can lose the sentinel NULL, though.

> Alternatively, what would you think of using the simpler return
> convention
>
>   return p - ref_rev_parse_rules + 1;
>
> ?  Or even
>
>   return p - ref_rev_parse_rules;
>
> and -1 for "no match"?

Heh, that is what I did in the "how about this" patch, which made
the caller a bit more cumbersome by two comparisons, which in turn
was why I rejected the approach.

> Sensible and simple.  If we wanted to make items earlier in the list
> return a lower value from refname_match, then we'd need a !best_score
> test here, which might be what motivates that return value convention.

Exactly.  See the discussion between JTan and me on his original
patch.

> [...]
>> --- a/t/t5510-fetch.sh
>> +++ b/t/t5510-fetch.sh
>> @@ -535,6 +535,41 @@ test_expect_success "should be able to fetch with 
>> duplicate refspecs" '
>>  )
>>  '
>>  
>> +test_expect_success 'LHS of refspec follows ref disambiguation rules' '
>
> Clearly illustrates the bug this fixes, in a way that makes it obvious
> that a user would prefer the new behavior.  Good.
>
> With or without the tweak of introducing NUM_REV_PARSE_RULES mentioned
> above,
>
> Reviewed-by: Jonathan Nieder 

One thing I forgot to mention.

When asking to fetch T, in order to be able to favor refs/tags/T
over refs/heads/T at the fetching end, you would have to be able to
*see* both, so all 6 variants "T", "refs/tags/T", "refs/heads/T",
"refs/remotes/T", "refs/remotes/T/HEAD" must be asked to be shown
when the ls-remote limiting is in effect.  Since the ls-remote
filtering is relatively new development, we may further find subtle
remaining bugs, if there still are some.



Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-01 Thread brian m. carlson
On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote:
> This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> object corruption when "rebase -i --root" swaps in a new commit as root.
> Unfortunately, those bugs made it into v2.18.0 and have already
> corrupted at least one repository (a local project of mine). Patches 3/4
> and 4/4 are new.
> 
> v1 fixed these bugs:
> 
> * trailing garbage on the commit's "author" header
> 
> * extra trailing digit on "author" header's timezone (caused by two
>   separate bugs)
> 
> v2 fixes those same bugs, plus:
> 
> * eliminates a bogus "@" prepended to the "author" header timestamp
>   which renders the header corrupt
> 
> * takes care to validate information coming from
>   "rebase-merge/author-script" before incorporating it into the "author"
>   header since that file may be hand-edited, and bogus hand-edited
>   values could corrupt the commit object.

I looked at this series and it seems sane and logical to me.  Thanks for
acting quickly to fix the corruption.

Reviewed-by: brian m. carlson 
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Jonathan Nieder
Hi,

Junio C Hamano wrote:

> When matching a non-wildcard LHS of a refspec against a list of
> refs, find_ref_by_name_abbrev() returns the first ref that matches
> using any DWIM rules used by refname_match() in refs.c, even if a
> better match occurs later in the list of refs.

Nicely explained.

[...]
> --- a/refs.c
> +++ b/refs.c
> @@ -487,16 +487,22 @@ static const char *ref_rev_parse_rules[] = {
>   NULL
>  };
>  
> +/*
> + * Is it possible that the caller meant full_name with abbrev_name?
> + * If so return a non-zero value to signal "yes"; the magnitude of
> + * the returned value gives the precedence used for disambiguation.
> + *
> + * If abbrev_name cannot mean full_name, return 0.
> + */
>  int refname_match(const char *abbrev_name, const char *full_name)
>  {
>   const char **p;
>   const int abbrev_name_len = strlen(abbrev_name);
> + const int num_rules = ARRAY_SIZE(ref_rev_parse_rules) - 1;

This is assuming ref_rev_parse_rules consists exactly of its items
followed by a NULL terminator, which is potentially a bit subtle.  I
wonder if we should put

static const char *ref_rev_parse_rules[] = {
"%.*s",
"refs/%.*s",
"refs/tags/%.*s",
"refs/heads/%.*s",
"refs/remotes/%.*s",
"refs/remotes/%.*s/HEAD",
NULL
};
#define NUM_REV_PARSE_RULES (ARRAY_SIZE(ref_rev_parse_rules) - 1)

and then use something like

const int num_rules = NUM_REV_PARSE_RULES;

so that this dependency is more obvious if the ref_rev_parse_rules
convention changes later.

Alternatively, what would you think of using the simpler return
convention

return p - ref_rev_parse_rules + 1;

?  Or even

return p - ref_rev_parse_rules;

and -1 for "no match"?

[...]
> --- a/remote.c
> +++ b/remote.c
> @@ -1880,11 +1880,18 @@ static struct ref *get_expanded_map(const struct ref 
> *remote_refs,
>  static const struct ref *find_ref_by_name_abbrev(const struct ref *refs, 
> const char *name)
>  {
>   const struct ref *ref;
> + const struct ref *best_match = NULL;
> + int best_score = 0;
> +
>   for (ref = refs; ref; ref = ref->next) {
> - if (refname_match(name, ref->name))
> - return ref;
> + int score = refname_match(name, ref->name);
> +
> + if (best_score < score) {
> + best_match = ref;
> + best_score = score;
> + }
>   }
> - return NULL;
> + return best_match;

Sensible and simple.  If we wanted to make items earlier in the list
return a lower value from refname_match, then we'd need a !best_score
test here, which might be what motivates that return value convention.

[...]
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -535,6 +535,41 @@ test_expect_success "should be able to fetch with 
> duplicate refspecs" '
>   )
>  '
>  
> +test_expect_success 'LHS of refspec follows ref disambiguation rules' '

Clearly illustrates the bug this fixes, in a way that makes it obvious
that a user would prefer the new behavior.  Good.

With or without the tweak of introducing NUM_REV_PARSE_RULES mentioned
above,

Reviewed-by: Jonathan Nieder 

Thanks.


Re: [PATCH] push: comment on a funny unbalanced option help

2018-08-01 Thread Jonathan Nieder
Hi,

Junio C Hamano wrote:

> Add a comment to save future readers from wasting time just like I
> did ;-)

Thanks.  I think we should go further, and start the comment with
TRANSLATORS so it shows up for the audience most affected by this as
well.  See the note on TRANSLATORS in po/README for details.

Sincerely,
Jonathan


Re: [PATCH 2/3] config: fix case sensitive subsection names on writing

2018-08-01 Thread Junio C Hamano
Stefan Beller  writes:

> A use reported a submodule issue regarding strange case indentation
> issues, but it could be boiled down to the following test case:

Perhaps

s/use/user/
s/case indentation issues/section mix-up/

> ... However we do not have a test for writing out config correctly with
> case sensitive subsection names, which is why this went unnoticed in
> 6ae996f2acf (git_config_set: make use of the config parser's event
> stream, 2018-04-09)

s/unnoticed in \(.*04-09)\)/unnoticed when \1 broke it./

This is why I asked if the patch is a "FIX" for an issue introduced
by the cited commit.

> Unfortunately we have to make a distinction between old style configuration
> that looks like
>
>   [foo.Bar]
> key = test
>
> and the new quoted style as seen above. The old style is documented as
> case-agnostic, hence we need to keep 'strncasecmp'; although the
> resulting setting for the old style config differs from the configuration.
> That will be fixed in a follow up patch.
>
> Reported-by: JP Sugarbroad 
> Signed-off-by: Stefan Beller 
> ---
>  config.c  | 12 +++-
>  t/t1300-config.sh |  1 +
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/config.c b/config.c
> index 7968ef7566a..1d3bf9b5fc0 100644
> --- a/config.c
> +++ b/config.c
> @@ -37,6 +37,7 @@ struct config_source {
>   int eof;
>   struct strbuf value;
>   struct strbuf var;
> + unsigned section_name_old_dot_style : 1;
>  
>   int (*do_fgetc)(struct config_source *c);
>   int (*do_ungetc)(int c, struct config_source *conf);
> @@ -605,6 +606,7 @@ static int get_value(config_fn_t fn, void *data, struct 
> strbuf *name)
>  
>  static int get_extended_base_var(struct strbuf *name, int c)
>  {
> + cf->section_name_old_dot_style = 0;
>   do {
>   if (c == '\n')
>   goto error_incomplete_line;
> @@ -641,6 +643,7 @@ static int get_extended_base_var(struct strbuf *name, int 
> c)
>  
>  static int get_base_var(struct strbuf *name)
>  {
> + cf->section_name_old_dot_style = 1;
>   for (;;) {
>   int c = get_next_char();
>   if (cf->eof)

OK, let me rephrase.  The basic parse structure is that

 * upon seeing '[', we call get_base_var(), which stuffs the
   "section" (including subsection, if exists) in the strbuf.

 * get_base_var() upon seeing a space after "[section ", calls
   get_extended_base_var().  This space can never exist in an
   old-style three-level names, where it is spelled as
   "[section.subsection]".  This space cannot exist in two-level
   names, either.  The closing ']' is eaten by this function before
   it returns.

 * get_extended_base_var() grabs the "double quoted" subsection name
   and eats the closing ']' before it returns.

So you set the new bit (section_name_old_dot_style) at the beginning
of get_base_var(), i.e. declare that you assume we are reading old
style, but upon entering get_extended_base_var(), unset it, because
now you know we are parsing a modern style three-level name(s).

Feels quite sensible way to keep track of old/new styles.

When parsing two-level names, old-style bit is set, which we may
need to be careful, thoguh.

> @@ -2355,14 +2358,21 @@ static int store_aux_event(enum config_event_t type,
>   store->parsed[store->parsed_nr].type = type;
>  
>   if (type == CONFIG_EVENT_SECTION) {
> + int (*cmpfn)(const char *, const char *, size_t);
> +
>   if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.')
>   return error("invalid section name '%s'", cf->var.buf);
>  
> + if (cf->section_name_old_dot_style)
> + cmpfn = strncasecmp;
> + else
> + cmpfn = strncmp;
> +
>   /* Is this the section we were looking for? */
>   store->is_keys_section =
>   store->parsed[store->parsed_nr].is_keys_section =
>   cf->var.len - 1 == store->baselen &&
> - !strncasecmp(cf->var.buf, store->key, store->baselen);
> + !cmpfn(cf->var.buf, store->key, store->baselen);

OK.  Section names should still be case insensitive (only the case
sensitivity of subsection names is special), but presumably that's
already normalized by the caller so we do not have to worry when we
use strncmp()?  Can we add a test to demonstrate that it works
correctly?

Thanks.



Re: Fetch on submodule update

2018-08-01 Thread Jonathan Nieder
Hi,

Robert Dailey wrote:

> Problem: I want to avoid recursively fetching submodules when I run a
> `fetch` command, and instead defer that operation to the next
> `submodule update`. Essentially I want `fetch.recurseSubmodules` to be
> `false`, and `get submodule update` to do exactly what it does with
> the `--remote` option, but still use the SHA1 of the submodule instead
> of updating to the tip of the specified branch in the git modules
> config.
>
> I hope that makes sense. The reason for this ask is to
> improve/streamline workflow in parent repositories. There are cases
> where I want to quickly fetch only the parent repository, even if a
> submodule changes, to perform some changes that do not require the
> submodule itself (yet). Then at a later time, do `submodule update`
> and have it automatically fetch when the SHA1 it's updating to does
> not exist (because the former fetch operation for the submodule was
> skipped). For my case, it's very slow to wait on submodules to
> recursively fetch when I only wanted to fetch the parent repo for the
> specific task I plan to do.
>
> Is this possible right now through some variation of configuration?

Can you say more about the overall workflow?  This seems quite different
from what we've been designing --recurse-submodules around:

- avoiding the end user ever having to use the "git submodule" command,
  except to add, remove, or reconfigure submodules

- treating the whole codebase as something like one project, so that
  "git checkout --recurse-submodules " always checks out the
  same state

More details about the application would help with better
understanding whether it can fit into this framework, or whether it's
a case where you'd want to set "submodule.recurse" to false to have
more manual control.

Thanks and hope that helps,
Jonathan


Re: [PATCH] push: comment on a funny unbalanced option help

2018-08-01 Thread Ævar Arnfjörð Bjarmason


On Wed, Aug 01 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>>> + /* N_() will get "<>" around, resulting in 
>>> ":" */
>>
>> ...but this comment isn't accurate at all, N_() doesn't wrap the string
>> with <>'s, as can be seen by applying this patch:
>
> I know.  It is a short-hand for "what's inside N_() we see here".
> Try to come up with an equivalent that fits on a 80-char line.

I was going to say:

/* parse_options() will munge this to ":" */

or:

/* note: parse_options() will add surrounding <>'s for us */

or:

/* missing <>'s are not a bug, parse_options() adds them */

But looking at this again it looks like this whole thing should just be
replaced by:

diff --git a/builtin/push.c b/builtin/push.c
index 9cd8e8cd56..b8fa15c101 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -558,9 +558,10 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
OPT_BIT( 0,  "porcelain", &flags, N_("machine-readable 
output"), TRANSPORT_PUSH_PORCELAIN),
OPT_BIT('f', "force", &flags, N_("force updates"), 
TRANSPORT_PUSH_FORCE),
{ OPTION_CALLBACK,
- 0, CAS_OPT_NAME, &cas, N_("refname>::"),
  N_("require old value of ref to be at this value"),
- PARSE_OPT_OPTARG, parseopt_push_cas_option },
+ PARSE_OPT_OPTARG | PARSE_OPT_LITERAL_ARGHELP,
+ parseopt_push_cas_option },
{ OPTION_CALLBACK, 0, "recurse-submodules", 
&recurse_submodules, "check|on-demand|no",
N_("control recursive pushing of submodules"),
PARSE_OPT_OPTARG, option_parse_recurse_submodules },

I.e. the reason this is confusing is because the code originally added
in 28f5d17611 ("remote.c: add command line option parser for
"--force-with-lease"", 2013-07-08) didn't use PARSE_OPT_LITERAL_ARGHELP,
which I also see is what read-tree etc. use already to not end up with
these double <>'s, see also 29f25d493c ("parse-options: add
PARSE_OPT_LITERAL_ARGHELP for complicated argh's", 2009-05-21).


Re: [PATCH 2/3] config: fix case sensitive subsection names on writing

2018-08-01 Thread Junio C Hamano
Ramsay Jones  writes:

> On 01/08/18 20:34, Stefan Beller wrote:
>> A use reported a submodule issue regarding strange case indentation
>
> s/use/user/ ?

True.  Also s/indentation/something else/ ?

Insufficient proofreading, perhaps?



Re: [PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Junio C Hamano
Brandon Williams  writes:

> ..., I expect we may need to do a bit more work on the whole
> fetching stack to get what we'd want in that case (because we would want
> to avoid this issue again).

Amen.  Thanks all.


Re: ds/multi-pack-index (was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25))

2018-08-01 Thread Junio C Hamano
Derrick Stolee  writes:

> On 7/25/2018 6:13 PM, Junio C Hamano wrote:
>> * ds/multi-pack-index (2018-07-20) 23 commits
>> ...
>>   Ready to move to 'next', with some known issues to be followed up?
>>   cf. 
> I'm not sure if there is anything actionable for me to do in response
> to this message.

As I said elsewhere, "cf. " list does not attempt to be a
complete enumeration of things to be fixed.  It is a (sub)set of
pointers into list archive to help me in integration cycles to
decide if I can comfortably merge each topic to 'next' (or update
"What's cooking" to mark the topic as "Will merge").  FWIW, that
particular message is not even an objection ;-) It was telling the
future-me "hey, I looked at this series and found nothing wrong in
it, so no need to read them again to refresh your memory".

The other one is a good reminder, again, given primarily to
future-me, that the topic is known to be imperfect and the
discussion seemed to favor moving "with some known issues to be
followed up", so I should not waste time re-reading and poke the
same holes.



Re: ds/reachable (was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25))

2018-08-01 Thread Junio C Hamano
Derrick Stolee  writes:

>>   Stuck in review?
>>   cf. <20180723203500.231932-1-jonathanta...@google.com>
>
> This comments on the initial values of 'struct ref_filter' (that are
> not used). All we need is the diff below squashed into "test-reach:
> test commit_contains".
>
>>   cf. <20180723204112.233274-1-jonathanta...@google.com>
> This comment asks why "parse_commit()" instead of
> "parse_commit_or_die()" but the _or_die() would create a change in
> behavior that is not the purpose of the series.
>>   cf. 
>
> I just responded to Stefan's comment about sorting. I don't believe
> any change is needed. Some tests output multiple results and the order
> is not defined by the method contract, so 'test-tool reach '
> will always sort the output (by OID).

Just to let everybody know, there is no point responding to all of
"cf. " comments in "What's cooking" report.  Because it is
*NOT* meant as an exhaustive list of things that need to be fixed,
refuting each and every one of them would not make the topic
"objection free" anyway.  The list of cf.'s are there to have just
enough of them to remind me to refrain from merging a topic to
'next' too hurriedly.  The discussion thread in the list archive is
the authoritative record of the discussion; treat "What's cooking"
as my personal note, nothing more.

> (Sorry for the delay. I'm on vacation.)

That's OK, and enjoy your time off.  We are not in a hurry.


> Thanks,
> -Stolee
>
> ---
>
> diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
> index eb21103998..ca30059117 100644
> --- a/t/helper/test-reach.c
> +++ b/t/helper/test-reach.c
> @@ -117,6 +117,7 @@ int cmd__reach(int ac, const char **av)
>     struct ref_filter filter;
>     struct contains_cache cache;
>     init_contains_cache(&cache);
> +    memset(&filter, 0, sizeof(filter));
>
>     if (ac > 2 && !strcmp(av[2], "--tag"))
>         filter.with_commit_tag_algo = 1;


Re: [PATCH] push: comment on a funny unbalanced option help

2018-08-01 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> +  /* N_() will get "<>" around, resulting in 
>> ":" */
>
> ...but this comment isn't accurate at all, N_() doesn't wrap the string
> with <>'s, as can be seen by applying this patch:

I know.  It is a short-hand for "what's inside N_() we see here".
Try to come up with an equivalent that fits on a 80-char line.


Re: [PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Brandon Williams
On 08/01, Jonathan Tan wrote:
> When a user fetches:
>  - at least one up-to-date ref and at least one non-up-to-date ref,
>  - using HTTP with protocol v0 (or something else that uses the fetch
>command of a remote helper)
> some refs might not be updated after the fetch.
> 
> This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
> info in output parameter", 2018-06-28) which allowed transports to
> report the refs that they have fetched in a new out-parameter
> "fetched_refs". If they do so, transport_fetch_refs() makes this
> information available to its caller.
> 
> Users of "fetched_refs" rely on the following 3 properties:
>  (1) it is the complete list of refs that was passed to
>  transport_fetch_refs(),
>  (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
>  relevant), and
>  (3) it has updated OIDs if ref-in-want was used (introduced after
>  989b8c4452).
> 
> In an effort to satisfy (1), whenever transport_fetch_refs()
> filters the refs sent to the transport, it re-adds the filtered refs to
> whatever the transport supplies before returning it to the user.
> However, the implementation in 989b8c4452 unconditionally re-adds the
> filtered refs without checking if the transport refrained from reporting
> anything in "fetched_refs" (which it is allowed to do), resulting in an
> incomplete list, no longer satisfying (1).
> 
> An earlier effort to resolve this [1] solved the issue by readding the
> filtered refs only if the transport did not refrain from reporting in
> "fetched_refs", but after further discussion, it seems that the better
> solution is to revert the API change that introduced "fetched_refs".
> This API change was first suggested as part of a ref-in-want
> implementation that allowed for ref patterns and, thus, there could be
> drastic differences between the input refs and the refs actually fetched
> [2]; we eventually decided to only allow exact ref names, but this API
> change remained even though its necessity was decreased.
> 
> Therefore, revert this API change by reverting commit 989b8c4452, and
> make receive_wanted_refs() update the OIDs in the sought array (like how
> update_shallow() updates shallow information in the sought array)
> instead. A test is also included to show that the user-visible bug
> discussed at the beginning of this commit message no longer exists.
> 
> [1] https://public-inbox.org/git/20180801171806.ga122...@google.com/
> [2] 
> https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/
> 
> Signed-off-by: Jonathan Tan 
> ---
> I now think that it's better to revert the API change introducing
> "fetched_refs" (or as Peff describes it, "this whole 'return the fetched
> refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to
> have covered all of Peff's and Junio's questions in the commit message.
> 
> As for Brandon's question:
> 
> > I haven't thought too much about what we would need to do in the event
> > we add patterns to ref-in-want, but couldn't we possible mutate the
> > input list again in this case and just simply add the resulting refs to
> > the input list?
> 
> If we support ref patterns, we would need to support deletion of refs,
> not just addition (because a ref might have existed in the initial ref
> advertisement, but not when the packfile is delivered). But it should
> be possible to add a flag stating "don't use this" to the ref, and
> document that transport_fetch_refs() can append additional refs to the
> tail of the input list. Upon hindsight, maybe this should have been the
> original API change instead of the "fetched_refs" mechanism.

Thanks for getting this out, it looks good to me.  If we end up adding
patterns to ref-in-want then we can explore what changes would need to
be made then, I expect we may need to do a bit more work on the whole
fetching stack to get what we'd want in that case (because we would want
to avoid this issue again).

-- 
Brandon Williams


Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-01 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>>> Presumably we are already in an error codepath, so if it is
>>> absolutely necessary, then we can issue a lstat() to grab the inum
>>> for the path we are about to create, iterate over the previously
>>> checked out paths issuing lstat() and see which one yields the same
>>> inum, to find the one who is the culprit.
>>
>> Yes, this is the cleverness I was missing in my earlier response.
>>
>> So it seems do-able, and I like that this incurs no cost in the
>> non-error case.
>
> Not so fast, unfortunately.  
>
> I suspect that some filesystems do not give us inum that we can use
> for that "identity" purpose, and they tend to be the ones with the
> case smashing characteristics where we need this code in the error
> path the most X-<.

But even if inum is unreliable, we should be able to use other
clues, perhaps the same set of fields we use for cached stat
matching optimization we use for "diff" plumbing commands, to
implement the error report.  The more important part of the idea is
that we already need to notice that we need to remove a path that is
in the working tree while doing the checkout, so the alternative
approach won't incur any extra cost for normal cases where the
project being checked out does not have two files whose pathnames
are only different in case (or checking out such an offending
project to a case sensitive filesytem, of course).

So I guess it still _is_ workable.  Any takers?


Re: [PATCH 0/2] negotiator: improve recent behavior + docs

2018-08-01 Thread Ævar Arnfjörð Bjarmason


On Wed, Aug 01 2018, Jonathan Tan wrote:

>> I think 01/02 in this patch series implements something that's better
>> & more future-proof.
>
> Thanks. Both patches are:
> Reviewed-by: Jonathan Tan 
>
> A small note:
>
>> -packfile; any other value instructs Git to use the default algorithm
>> +packfile; The default is "default" which instructs Git to use the 
>> default algorithm
>
> I think we generally don't capitalize words after semicolons.

Yeah I think that's right. Will fix (or if there's no other comments
perhaps Junio will munge it...) :)

> Thanks for noticing that the check of fetch.negotiationAlgorithm only
> happens when a negotiation actually occurs - before your patches, it
> didn't really matter because we tolerated anything, but now we do. I
> think this is fine - as far as I know, Git commands generally only read
> the configs relevant to them, and if fetch.negotiationAlgorithm is not
> relevant in a certain situation, we don't need to read it.

Yeah I think that's OK.

>> That's awesome. This is exactly what I wanted, this patch series also
>> fixes another small issue in 02/02; which is that the docs for the two
>> really should cross-link to make these discoverable from one another.
>
> That's a good idea; thanks for doing it.
>
>> I.e. the way I'm doing this is I add all the remotes first, then I
>> fetch them all in parallel, but because the first time around I don't
>> have anything for that remote (and they don't share any commits) I
>> need to fake it up and pretend to be fetching from a repo that has
>> just one commit.
>>
>> It would be better if I could somehow say that I don't mind that the
>> ref doesn't exist, but currently you either error out with this, or
>> ignore the glob, depending on the mode.
>>
>> So I want this, but can't think of a less shitty UI than:
>>
>> git fetch --negotiation-tip=$REF 
>> --negotiation-tip-error-handling=missing-ref-means-no-want
>>
>> Or something equally atrocious, do you have any better ideas?
>
> If you wanted to do this, it seems better to me to just declare a "null"
> negotiation algorithm that does not perform any negotiation at all.

I think such an algorithm is a good idea in general, especially for
testing, and yeah, maybe that's the best way out of this, i.e. to do:

if git rev-parse {}/HEAD 2>/dev/null
then
git fetch --negotiation-tip={}/HEAD {}
else
git -c fetch.negotiationAlgorithm=null fetch {}
fi

Would such an algorithm be added by overriding default.c's add_tip
function to never add anything by calling default_negotiator_init()
followed by null_negotiator_init(), which would only override add_tip?
(yay C OO)

If so from fetch-pack.c it looks like there may be the limitation on the
interface that the negotiator can't exit early (in
fetch-pack.c:mark_tips). But I've just skimmed this, so maybe I've
missed something.

Also, looks like because of the current interface =null and
--negotiation-tip=* would (somewhat confusingly) do a "real" negotiation
if done that way, since it'll bypass the API and insert tips for it to
negotiate, but it looks like overriding next() will get around that.


Re: [PATCH 2/3] config: fix case sensitive subsection names on writing

2018-08-01 Thread Ramsay Jones



On 01/08/18 20:34, Stefan Beller wrote:
> A use reported a submodule issue regarding strange case indentation

s/use/user/ ?

ATB,
Ramsay Jones



ds/multi-pack-index (was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25))

2018-08-01 Thread Derrick Stolee

On 7/25/2018 6:13 PM, Junio C Hamano wrote:

* ds/multi-pack-index (2018-07-20) 23 commits
  - midx: clear midx on repack
  - packfile: skip loading index if in multi-pack-index
  - midx: prevent duplicate packfile loads
  - midx: use midx in approximate_object_count
  - midx: use existing midx when writing new one
  - midx: use midx in abbreviation calculations
  - midx: read objects from multi-pack-index
  - config: create core.multiPackIndex setting
  - midx: write object offsets
  - midx: write object id fanout chunk
  - midx: write object ids in a chunk
  - midx: sort and deduplicate objects from packfiles
  - midx: read pack names into array
  - multi-pack-index: write pack names in chunk
  - multi-pack-index: read packfile list
  - packfile: generalize pack directory list
  - t5319: expand test data
  - multi-pack-index: load into memory
  - midx: write header information to lockfile
  - multi-pack-index: add 'write' verb
  - multi-pack-index: add builtin
  - multi-pack-index: add format details
  - multi-pack-index: add design document

  When there are too many packfiles in a repository (which is not
  recommended), looking up an object in these would require
  consulting many pack .idx files; a new mechanism to have a single
  file that consolidates all of these .idx files is introduced.

  Ready to move to 'next', with some known issues to be followed up?
  cf. 
I'm not sure if there is anything actionable for me to do in response to 
this message.

  cf. 
This message is in regard to UX around the usage output when the 
command-line arguments are incorrect. The recommendation is to 
explicitly state what the user did that is incorrect. For such a simple 
usage line, I don't think this is necessary. The message also included this:



I wouldn't want to see a re-roll just for this, especially for such a
long series. Perhaps such a change can be done as a follow-up patch by
someone at some point.


If this is something we _really_ want to do, then I will tackle it in my 
follow-up series that adds a 'verify' verb (thus complicating the usage and 
giving me an opportunity to improve this area).

Thanks,
-Stolee



Re: [PATCH] push: comment on a funny unbalanced option help

2018-08-01 Thread Ævar Arnfjörð Bjarmason


On Wed, Aug 01 2018, Junio C Hamano wrote:

> The option help text for the force-with-lease option to "git push"
> reads like this:
>
> $ git push -h 2>&1 | grep -e force-with-lease
>--force-with-lease[=:]
>
> which come from this
>
> 0, CAS_OPT_NAME, &cas, N_("refname>:
> in the source code, with an aparent lack of "<" and ">" at both
> ends.
>
> It turns out that parse-options machinery takes the whole string and
> encloses it inside a pair of "<>", expecting that it is a single
> placeholder.  The help string was written in a funnily unbalanced
> way knowing that the end result would balance out.
>
> Add a comment to save future readers from wasting time just like I
> did ;-)

There's something worth fixing here for sure...

> +   /* N_() will get "<>" around, resulting in 
> ":" */

...but this comment isn't accurate at all, N_() doesn't wrap the string
with <>'s, as can be seen by applying this patch:

- 0, CAS_OPT_NAME, &cas, N_("refname>::&1 | grep -e force-with-lease
--force-with-lease[=:]

Rather, it's the usage_argh() function in parse-options.c that's doing
this. Looks like the logic was added in 29f25d493c ("parse-options: add
PARSE_OPT_LITERAL_ARGHELP for complicated argh's", 2009-05-21).

Also because of this I looked at:

$ git grep -P 'N_\("<'

Which shows e.g.:

builtin/difftool.c:706: OPT_STRING('t', "tool", &difftool_cmd, 
N_(""),
builtin/difftool.c:714: OPT_STRING('x', "extcmd", &extcmd, 
N_(""),

Producing this bug:

$ git difftool -h 2>&1|grep '<<'
-t, --tool <>   use the specified diff tool
-x, --extcmd <>

But these all do the right thing for some reason, just looked briefly
and didn't see how they're different & manage to avoid this:

builtin/read-tree.c:134:{ OPTION_STRING, 0, "prefix", 
&opts.prefix, N_("/"),
builtin/show-branch.c:673:  { OPTION_CALLBACK, 'g', "reflog", 
&reflog_base, N_("[,]"),
builtin/update-index.c:969: 
N_(",,"),
builtin/write-tree.c:27:{ OPTION_STRING, 0, "prefix", 
&prefix, N_("/"),


ds/reachable (was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25))

2018-08-01 Thread Derrick Stolee

On 7/25/2018 6:13 PM, Junio C Hamano wrote:

* ds/reachable (2018-07-20) 18 commits
  - commit-reach: use can_all_from_reach
  - commit-reach: make can_all_from_reach... linear
  - commit-reach: replace ref_newer logic
  - test-reach: test commit_contains
  - test-reach: test can_all_from_reach_with_flags
  - test-reach: test reduce_heads
  - test-reach: test get_merge_bases_many
  - test-reach: test is_descendant_of
  - test-reach: test in_merge_bases
  - test-reach: create new test tool for ref_newer
  - commit-reach: move can_all_from_reach_with_flags
  - upload-pack: generalize commit date cutoff
  - upload-pack: refactor ok_to_give_up()
  - upload-pack: make reachable() more generic
  - commit-reach: move commit_contains from ref-filter
  - commit-reach: move ref_newer from remote.c
  - commit.h: remove method declarations
  - commit-reach: move walk methods from commit.c
  (this branch uses ds/commit-graph-fsck, jt/commit-graph-per-object-store and 
sb/object-store-lookup; is tangled with ds/commit-graph-with-grafts.)

  The code for computing history reachability has been shuffled,
  obtained a bunch of new tests to cover them, and then being
  improved.

  Stuck in review?
  cf. <20180723203500.231932-1-jonathanta...@google.com>


This comments on the initial values of 'struct ref_filter' (that are not 
used). All we need is the diff below squashed into "test-reach: test 
commit_contains".



  cf. <20180723204112.233274-1-jonathanta...@google.com>
This comment asks why "parse_commit()" instead of 
"parse_commit_or_die()" but the _or_die() would create a change in 
behavior that is not the purpose of the series.

  cf. 


I just responded to Stefan's comment about sorting. I don't believe any 
change is needed. Some tests output multiple results and the order is 
not defined by the method contract, so 'test-tool reach ' will 
always sort the output (by OID).


(Sorry for the delay. I'm on vacation.)

Thanks,
-Stolee

---

diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index eb21103998..ca30059117 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -117,6 +117,7 @@ int cmd__reach(int ac, const char **av)
    struct ref_filter filter;
    struct contains_cache cache;
    init_contains_cache(&cache);
+    memset(&filter, 0, sizeof(filter));

    if (ac > 2 && !strcmp(av[2], "--tag"))
        filter.with_commit_tag_algo = 1;






Re: [PATCH v2 17/18] commit-reach: make can_all_from_reach... linear

2018-08-01 Thread Derrick Stolee




On 7/23/2018 4:41 PM, Jonathan Tan wrote:

+   if (parse_commit(list[i]) ||
+   list[i]->generation < min_generation)

Here...


+   if (parse_commit(parent->item) ||
+   parent->item->date < 
min_commit_date ||
+   parent->item->generation < 
min_generation)

...and here, would parse_commit_or_die() be better? I think that a
function that returns a definitive answer (either the commits are
reachable or not) should die when the commits cannot be parsed.
I'm hesitant to add _or_die() here, when the previous implementation 
only used parse_object() or parse_commit(), so would not die when 
parsing fails. The same holds true for the other methods that call 
can_all_from_reach().


Thanks,
-Stolee


Re: [PATCH v2 00/18] Consolidate reachability logic

2018-08-01 Thread Derrick Stolee

On 7/20/2018 6:16 PM, Stefan Beller wrote:

* Use single rev-parse commands in test output, and pipe the OIDs through 'sort'

Why do we need to sort them? The order of the answers given by rev-parse
is the same as the input given and we did not need to sort it before, i.e.
the unit under test would not give sorted output but some deterministic(?)
order, which we can replicate as input to rev-parse.
Am I missing the obvious?
The output of the test program is not always deterministic (or at least, 
the order is determined by the implementation, but not as part of the 
method contract). For example: get_all_merge_bases can return the list 
of merge bases in any order.


By sorting, we can ensure the output values (and their multiplicity) 
match expected.


Thanks,
-Stolee


Re: [PATCH 0/2] negotiator: improve recent behavior + docs

2018-08-01 Thread Jonathan Tan
> I think 01/02 in this patch series implements something that's better
> & more future-proof.

Thanks. Both patches are:
Reviewed-by: Jonathan Tan 

A small note:

> - packfile; any other value instructs Git to use the default algorithm
> + packfile; The default is "default" which instructs Git to use the 
> default algorithm

I think we generally don't capitalize words after semicolons.

Thanks for noticing that the check of fetch.negotiationAlgorithm only
happens when a negotiation actually occurs - before your patches, it
didn't really matter because we tolerated anything, but now we do. I
think this is fine - as far as I know, Git commands generally only read
the configs relevant to them, and if fetch.negotiationAlgorithm is not
relevant in a certain situation, we don't need to read it.

> That's awesome. This is exactly what I wanted, this patch series also
> fixes another small issue in 02/02; which is that the docs for the two
> really should cross-link to make these discoverable from one another.

That's a good idea; thanks for doing it.

> I.e. the way I'm doing this is I add all the remotes first, then I
> fetch them all in parallel, but because the first time around I don't
> have anything for that remote (and they don't share any commits) I
> need to fake it up and pretend to be fetching from a repo that has
> just one commit.
> 
> It would be better if I could somehow say that I don't mind that the
> ref doesn't exist, but currently you either error out with this, or
> ignore the glob, depending on the mode.
> 
> So I want this, but can't think of a less shitty UI than:
> 
> git fetch --negotiation-tip=$REF 
> --negotiation-tip-error-handling=missing-ref-means-no-want
> 
> Or something equally atrocious, do you have any better ideas?

If you wanted to do this, it seems better to me to just declare a "null"
negotiation algorithm that does not perform any negotiation at all.


[PATCH] fetch-pack: unify ref in and out param

2018-08-01 Thread Jonathan Tan
When a user fetches:
 - at least one up-to-date ref and at least one non-up-to-date ref,
 - using HTTP with protocol v0 (or something else that uses the fetch
   command of a remote helper)
some refs might not be updated after the fetch.

This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
info in output parameter", 2018-06-28) which allowed transports to
report the refs that they have fetched in a new out-parameter
"fetched_refs". If they do so, transport_fetch_refs() makes this
information available to its caller.

Users of "fetched_refs" rely on the following 3 properties:
 (1) it is the complete list of refs that was passed to
 transport_fetch_refs(),
 (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
 relevant), and
 (3) it has updated OIDs if ref-in-want was used (introduced after
 989b8c4452).

In an effort to satisfy (1), whenever transport_fetch_refs()
filters the refs sent to the transport, it re-adds the filtered refs to
whatever the transport supplies before returning it to the user.
However, the implementation in 989b8c4452 unconditionally re-adds the
filtered refs without checking if the transport refrained from reporting
anything in "fetched_refs" (which it is allowed to do), resulting in an
incomplete list, no longer satisfying (1).

An earlier effort to resolve this [1] solved the issue by readding the
filtered refs only if the transport did not refrain from reporting in
"fetched_refs", but after further discussion, it seems that the better
solution is to revert the API change that introduced "fetched_refs".
This API change was first suggested as part of a ref-in-want
implementation that allowed for ref patterns and, thus, there could be
drastic differences between the input refs and the refs actually fetched
[2]; we eventually decided to only allow exact ref names, but this API
change remained even though its necessity was decreased.

Therefore, revert this API change by reverting commit 989b8c4452, and
make receive_wanted_refs() update the OIDs in the sought array (like how
update_shallow() updates shallow information in the sought array)
instead. A test is also included to show that the user-visible bug
discussed at the beginning of this commit message no longer exists.

[1] https://public-inbox.org/git/20180801171806.ga122...@google.com/
[2] 
https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/

Signed-off-by: Jonathan Tan 
---
I now think that it's better to revert the API change introducing
"fetched_refs" (or as Peff describes it, "this whole 'return the fetched
refs' scheme from 989b8c4452"), so here is a patch doing so. I hope to
have covered all of Peff's and Junio's questions in the commit message.

As for Brandon's question:

> I haven't thought too much about what we would need to do in the event
> we add patterns to ref-in-want, but couldn't we possible mutate the
> input list again in this case and just simply add the resulting refs to
> the input list?

If we support ref patterns, we would need to support deletion of refs,
not just addition (because a ref might have existed in the initial ref
advertisement, but not when the packfile is delivered). But it should
be possible to add a flag stating "don't use this" to the ref, and
document that transport_fetch_refs() can append additional refs to the
tail of the input list. Upon hindsight, maybe this should have been the
original API change instead of the "fetched_refs" mechanism.
---
 builtin/clone.c |  4 ++--
 builtin/fetch.c | 28 
 fetch-object.c  |  2 +-
 fetch-pack.c| 30 +++---
 t/t5551-http-fetch-smart.sh | 18 ++
 transport-helper.c  |  6 ++
 transport-internal.h|  9 +
 transport.c | 34 ++
 transport.h |  3 +--
 9 files changed, 50 insertions(+), 84 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 5c439f139..76f7db47e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1156,7 +1156,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
if (!is_local && !complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs, NULL);
+   transport_fetch_refs(transport, mapped_refs);
 
remote_head = find_ref_by_name(refs, "HEAD");
remote_head_points_at =
@@ -1198,7 +1198,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs, NULL);
+   transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
   

Re: [PATCH 0/3] sb/config-write-fix done without robbing Peter

2018-08-01 Thread Eric Sunshine
On Wed, Aug 1, 2018 at 3:34 PM Stefan Beller  wrote:
> The first patch stands as is unchanged, and the second and third patch
> are different enough that range-diff doesn't want to show a diff.

For future reference, range-diff's --creation-factor tweak may help
here. Depending upon just how heavily changed they are, you may be
able to use it to coerce range-diff into doing what you want.

(Also, interdiff may be a valid alternative.)


Re: [PATCHv2 0/8] Add color test for range-diff, simplify diff.c

2018-08-01 Thread Stefan Beller
On Wed, Aug 1, 2018 at 12:13 PM Junio C Hamano  wrote:
>
> Stefan Beller  writes:
>
> > Stefan Beller (8):
> >   test_decode_color: understand FAINT and ITALIC
> >   t3206: add color test for range-diff --dual-color
> >   diff.c: simplify caller of emit_line_0
> >   diff.c: reorder arguments for emit_line_ws_markup
> >   diff.c: add set_sign to emit_line_0
> >   diff: use emit_line_0 once per line
> >   diff.c: compute reverse locally in emit_line_0
> >   diff.c: rewrite emit_line_0 more understandably
> >
> >  diff.c  | 94 +++--
> >  t/t3206-range-diff.sh   | 39 +
> >  t/test-lib-functions.sh |  2 +
> >  3 files changed, 93 insertions(+), 42 deletions(-)
>
> As I cannot merge this as is to 'pu' and keep going, I'll queue the
> following to the tip.  I think it can be squashed to "t3206: add
> color test" but for now they will remain separate patches.
>
> Subject: [PATCH] fixup! t3206: add color test for range-diff --dual-color

Thanks for dealing with my stubbornness here.

I assumed that the contribution would be a one time hassle
during git-am, not an ongoing problem during the whole time
the patch flows through pu/next/master, which is why I punted
on doing this change and resending in a timely manner.

Further I assumed the sed trick as below is harder to read,
but it turns out it is not. It is still very understandable.

https://public-inbox.org/git/nycvar.qro.7.76.6.1808011800570...@tvgsbejvaqbjf.bet/
sounds like DScho wants to incorporate some white space related
stuff that might collide with the later parts of this series, so
I am not sure how easy it will be to carry this through pu,
so feel free to drop it as well.

Thanks!
Stefan


[PATCH 2/3] config: fix case sensitive subsection names on writing

2018-08-01 Thread Stefan Beller
A use reported a submodule issue regarding strange case indentation
issues, but it could be boiled down to the following test case:

  $ git init test  && cd test
  $ git config foo."Bar".key test
  $ git config foo."bar".key test
  $ tail -n 3 .git/config
  [foo "Bar"]
key = test
key = test

Sub sections are case sensitive and we have a test for correctly reading
them. However we do not have a test for writing out config correctly with
case sensitive subsection names, which is why this went unnoticed in
6ae996f2acf (git_config_set: make use of the config parser's event
stream, 2018-04-09)

Unfortunately we have to make a distinction between old style configuration
that looks like

  [foo.Bar]
key = test

and the new quoted style as seen above. The old style is documented as
case-agnostic, hence we need to keep 'strncasecmp'; although the
resulting setting for the old style config differs from the configuration.
That will be fixed in a follow up patch.

Reported-by: JP Sugarbroad 
Signed-off-by: Stefan Beller 
---
 config.c  | 12 +++-
 t/t1300-config.sh |  1 +
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 7968ef7566a..1d3bf9b5fc0 100644
--- a/config.c
+++ b/config.c
@@ -37,6 +37,7 @@ struct config_source {
int eof;
struct strbuf value;
struct strbuf var;
+   unsigned section_name_old_dot_style : 1;
 
int (*do_fgetc)(struct config_source *c);
int (*do_ungetc)(int c, struct config_source *conf);
@@ -605,6 +606,7 @@ static int get_value(config_fn_t fn, void *data, struct 
strbuf *name)
 
 static int get_extended_base_var(struct strbuf *name, int c)
 {
+   cf->section_name_old_dot_style = 0;
do {
if (c == '\n')
goto error_incomplete_line;
@@ -641,6 +643,7 @@ static int get_extended_base_var(struct strbuf *name, int c)
 
 static int get_base_var(struct strbuf *name)
 {
+   cf->section_name_old_dot_style = 1;
for (;;) {
int c = get_next_char();
if (cf->eof)
@@ -2355,14 +2358,21 @@ static int store_aux_event(enum config_event_t type,
store->parsed[store->parsed_nr].type = type;
 
if (type == CONFIG_EVENT_SECTION) {
+   int (*cmpfn)(const char *, const char *, size_t);
+
if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.')
return error("invalid section name '%s'", cf->var.buf);
 
+   if (cf->section_name_old_dot_style)
+   cmpfn = strncasecmp;
+   else
+   cmpfn = strncmp;
+
/* Is this the section we were looking for? */
store->is_keys_section =
store->parsed[store->parsed_nr].is_keys_section =
cf->var.len - 1 == store->baselen &&
-   !strncasecmp(cf->var.buf, store->key, store->baselen);
+   !cmpfn(cf->var.buf, store->key, store->baselen);
if (store->is_keys_section) {
store->section_seen = 1;
ALLOC_GROW(store->seen, store->seen_nr + 1,
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index ced13012409..a93f966f128 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1290,6 +1290,7 @@ test_expect_success 'setting different case sensitive 
subsections ' '
Qc = v2
[d "e"]
f = v1
+   [d "E"]
Qf = v2
EOF
# exact match
-- 
2.18.0.132.g195c49a2227



[PATCH 0/3] sb/config-write-fix done without robbing Peter

2018-08-01 Thread Stefan Beller
> Am I correct to understand that this patch is a "FIX" for breakage
> introduced by that commit?  The phrasing is not helping me to pick
> a good base to queue these patches on.

Please pick 4f4d0b42bae (Merge branch 'js/empty-config-section-fix', 2018-05-08)
as the base of this new series (am needs -3 to apply), although I developed this
series on origin/master.

> Even though I hate to rob Peter to pay Paul (or vice versa)

Yeah me, too. Here is a proper fix (i.e. only pay Paul, without the robbery),
and a documentation of the second bug that we discovered.

The first patch stands as is unchanged, and the second and third patch
are different enough that range-diff doesn't want to show a diff.

Thanks,
Stefan

Stefan Beller (3):
  t1300: document current behavior of setting options
  config: fix case sensitive subsection names on writing
  git-config: document accidental multi-line setting in deprecated
syntax

 Documentation/git-config.txt | 21 +
 config.c | 12 -
 t/t1300-config.sh| 87 
 3 files changed, 119 insertions(+), 1 deletion(-)


[PATCH 3/3] git-config: document accidental multi-line setting in deprecated syntax

2018-08-01 Thread Stefan Beller
The bug was noticed when writing the previous patch; a fix for this bug
is not easy though: If we choose to ignore the case of the subsection
(and revert most of the code of the previous patch, just keeping
s/strncasecmp/strcmp/), then we'd introduce new sections using the
new syntax, such that

 
   [section.subsection]
 key = value1
 

  git config section.Subsection.key value2

would result in

 
   [section.subsection]
 key = value1
   [section.Subsection]
 key = value2
 

which is even more confusing. A proper fix would replace the first
occurrence of 'key'. As the syntax is deprecated, let's prefer to not
spend time on fixing the behavior and just document it instead.

Signed-off-by: Stefan Beller 
---
 Documentation/git-config.txt | 21 +
 1 file changed, 21 insertions(+)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 18ddc78f42d..8e240435bee 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -453,6 +453,27 @@ http.sslverify false
 
 include::config.txt[]
 
+BUGS
+
+When using the deprecated `[section.subsection]` syntax, changing a value
+will result in adding a multi-line key instead of a change, if the subsection
+is given with at least one uppercase character. For example when the config
+looks like
+
+
+  [section.subsection]
+key = value1
+
+
+and running `git config section.Subsection.key value2` will result in
+
+
+  [section.subsection]
+key = value1
+key = value2
+
+
+
 GIT
 ---
 Part of the linkgit:git[1] suite
-- 
2.18.0.132.g195c49a2227



[PATCH 1/3] t1300: document current behavior of setting options

2018-08-01 Thread Stefan Beller
This documents current behavior of the config machinery, when changing
the value of some settings. This patch just serves to provide a baseline
for the follow up that will fix some issues with the current behavior.

Signed-off-by: Stefan Beller 
---
 t/t1300-config.sh | 86 +++
 1 file changed, 86 insertions(+)

diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 03c223708eb..ced13012409 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1218,6 +1218,92 @@ test_expect_success 'last one wins: three level vars' '
test_cmp expect actual
 '
 
+test_expect_success 'old-fashioned settings are case insensitive' '
+   test_when_finished "rm -f testConfig testConfig_expect 
testConfig_actual" &&
+
+   cat >testConfig_actual <<-EOF &&
+   [V.A]
+   r = value1
+   EOF
+   q_to_tab >testConfig_expect <<-EOF &&
+   [V.A]
+   Qr = value2
+   EOF
+   git config -f testConfig_actual "v.a.r" value2 &&
+   test_cmp testConfig_expect testConfig_actual &&
+
+   cat >testConfig_actual <<-EOF &&
+   [V.A]
+   r = value1
+   EOF
+   q_to_tab >testConfig_expect <<-EOF &&
+   [V.A]
+   QR = value2
+   EOF
+   git config -f testConfig_actual "V.a.R" value2 &&
+   test_cmp testConfig_expect testConfig_actual &&
+
+   cat >testConfig_actual <<-EOF &&
+   [V.A]
+   r = value1
+   EOF
+   q_to_tab >testConfig_expect <<-EOF &&
+   [V.A]
+   r = value1
+   Qr = value2
+   EOF
+   git config -f testConfig_actual "V.A.r" value2 &&
+   test_cmp testConfig_expect testConfig_actual &&
+
+   cat >testConfig_actual <<-EOF &&
+   [V.A]
+   r = value1
+   EOF
+   q_to_tab >testConfig_expect <<-EOF &&
+   [V.A]
+   r = value1
+   Qr = value2
+   EOF
+   git config -f testConfig_actual "v.A.r" value2 &&
+   test_cmp testConfig_expect testConfig_actual
+'
+
+test_expect_success 'setting different case sensitive subsections ' '
+   test_when_finished "rm -f testConfig testConfig_expect 
testConfig_actual" &&
+
+   cat >testConfig_actual <<-EOF &&
+   [V "A"]
+   R = v1
+   [K "E"]
+   Y = v1
+   [a "b"]
+   c = v1
+   [d "e"]
+   f = v1
+   EOF
+   q_to_tab >testConfig_expect <<-EOF &&
+   [V "A"]
+   Qr = v2
+   [K "E"]
+   Qy = v2
+   [a "b"]
+   Qc = v2
+   [d "e"]
+   f = v1
+   Qf = v2
+   EOF
+   # exact match
+   git config -f testConfig_actual a.b.c v2 &&
+   # match section and subsection, key is cased differently.
+   git config -f testConfig_actual K.E.y v2 &&
+   # section and key are matched case insensitive, but subsection needs
+   # to match; When writing out new values only the key is adjusted
+   git config -f testConfig_actual v.A.r v2 &&
+   # subsection is not matched:
+   git config -f testConfig_actual d.E.f v2 &&
+   test_cmp testConfig_expect testConfig_actual
+'
+
 for VAR in a .a a. a.0b a."b c". a."b c".0d
 do
test_expect_success "git -c $VAR=VAL rejects invalid '$VAR'" '
-- 
2.18.0.132.g195c49a2227



Re: [PATCH v2 2/2] sequencer: fix quoting in write_author_script

2018-08-01 Thread Eric Sunshine
On Wed, Aug 1, 2018 at 11:50 AM Phillip Wood  wrote:
> On 31/07/18 22:39, Eric Sunshine wrote:
> > On Tue, Jul 31, 2018 at 7:15 AM Phillip Wood  
> > wrote:
> >> +   /*
> >> +* write_author_script() used to fail to terminate the 
> >> GIT_AUTHOR_DATE
> >> +* line with a "'" and also escaped "'" incorrectly as "'''" 
> >> rather
> >> +* than "'\\''". We check for the terminating "'" on the last line 
> >> to
> >> +* see how "'" has been escaped in case git was upgraded while 
> >> rebase
> >> +* was stopped.
> >> +*/
> >> +   sq_bug = script.len && script.buf[script.len - 2] != '\'';
> >
> > This is a very "delicate" check, assuming that a hand-edited file
> > won't end with, say, an extra newline. I wonder if this level of
> > backward-compatibility is overkill for such an unlikely case.
>
> I think I'll get rid of the check and instead use a version number
> written to .git/rebase-merge/interactive to indicate if we need to fix
> the quoting (if there's no number then it needs fixing). We can
> increment the version number in the future if we ever need to implement
> other fallbacks to handle the case where git got upgraded while rebase
> was stopped. I'll send a patch tomorrow

Hmm, that approach is pretty heavyweight and would add a fair bit of
new code and complexity which itself could harbor bugs. When I
commented that the check was "delicate", I was (especially) referring
to the rigid "script[len-2]", not necessarily to the basic idea of the
check. So, you could keep the check (in spirit) but make it more
robust. Like this, for instance:

/* big comment explaining old buggy stuff */
static int broken_quoting(const char *s, size_t n)
{
const char *t = s + n;
while (t > s && *--t == '\n')
/* empty */;
if (t > s && *--t != '\'')
return 1;
return 0;
}

static int read_env_script(...)
{
...
sq_bug = broken_quoting(script.buf, script.len);
...
}

I would feel much more comfortable with a simple solution like this
than with one introducing new complexity associated with adding a
version number.


Re: [PATCHv2 0/8] Add color test for range-diff, simplify diff.c

2018-08-01 Thread Junio C Hamano
Stefan Beller  writes:

> Stefan Beller (8):
>   test_decode_color: understand FAINT and ITALIC
>   t3206: add color test for range-diff --dual-color
>   diff.c: simplify caller of emit_line_0
>   diff.c: reorder arguments for emit_line_ws_markup
>   diff.c: add set_sign to emit_line_0
>   diff: use emit_line_0 once per line
>   diff.c: compute reverse locally in emit_line_0
>   diff.c: rewrite emit_line_0 more understandably
>
>  diff.c  | 94 +++--
>  t/t3206-range-diff.sh   | 39 +
>  t/test-lib-functions.sh |  2 +
>  3 files changed, 93 insertions(+), 42 deletions(-)

As I cannot merge this as is to 'pu' and keep going, I'll queue the
following to the tip.  I think it can be squashed to "t3206: add
color test" but for now they will remain separate patches.

Subject: [PATCH] fixup! t3206: add color test for range-diff --dual-color

---
 t/t3206-range-diff.sh | 64 +--
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 019724e61a..e3b0764b43 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -143,38 +143,38 @@ test_expect_success 'changed message' '
 '
 
 test_expect_success 'dual-coloring' '
-   cat >expect <<-\EOF &&
-   1:  a4b = 1:  f686024 s/5/A/
-   2:  f51d370 ! 2:  
4ab067d s/4/A/
-   @@ -2,6 +2,8 @@
-
-s/4/A/
-
-   +Also a silly comment here!
-   +
-diff --git a/file b/file
---- a/file
-+++ b/file
-   3:  0559556 ! 3:  
b9cb956 s/11/B/
-   @@ -10,7 +10,7 @@
- 9
- 10
--11
-   -+BB
-   ++B
- 12
- 13
- 14
-   4:  d966c5c ! 4:  
8add5f1 s/12/B/
-   @@ -8,7 +8,7 @@
-@@
- 9
- 10
-   - BB
-   + B
--12
-+B
- 13
+   sed -e "s|^:||" >expect <<-\EOF &&
+   :1:  a4b = 1:  f686024 s/5/A/
+   :2:  f51d370 ! 2:  
4ab067d s/4/A/
+   :@@ -2,6 +2,8 @@
+   : 
+   : s/4/A/
+   : 
+   :+Also a silly comment here!
+   :+
+   : diff --git a/file b/file
+   : --- a/file
+   : +++ b/file
+   :3:  0559556 ! 3:  
b9cb956 s/11/B/
+   :@@ -10,7 +10,7 @@
+   :  9
+   :  10
+   : -11
+   :-+BB
+   :++B
+   :  12
+   :  13
+   :  14
+   :4:  d966c5c ! 4:  
8add5f1 s/12/B/
+   :@@ -8,7 +8,7 @@
+   : @@
+   :  9
+   :  10
+   :- BB
+   :+ B
+   : -12
+   : +B
+   :  13
EOF
git range-diff changed...changed-message --color --dual-color 
>actual.raw &&
test_decode_color >actual 

Re: range-diff, was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25)

2018-08-01 Thread Junio C Hamano
Johannes Schindelin  writes:

>> If any other issue arises, I do not mind taking an update, either,
>> but I think at this point the topic is reaching the point of
>> diminishing returns and should switch to incremental.
>
> Thomas had a couple of good suggestions, still, and I am considering to
> try to find time to simply disable the whitespace warnings altogether in
> range-diff.

OK, then I'll wait and refrain from merging this and other topics
that build on top for now.



[PATCH] push: comment on a funny unbalanced option help

2018-08-01 Thread Junio C Hamano
The option help text for the force-with-lease option to "git push"
reads like this:

$ git push -h 2>&1 | grep -e force-with-lease
   --force-with-lease[=:]

which come from this

  0, CAS_OPT_NAME, &cas, N_("refname>:" at both
ends.

It turns out that parse-options machinery takes the whole string and
encloses it inside a pair of "<>", expecting that it is a single
placeholder.  The help string was written in a funnily unbalanced
way knowing that the end result would balance out.

Add a comment to save future readers from wasting time just like I
did ;-)

Signed-off-by: Junio C Hamano 
---
 builtin/push.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/push.c b/builtin/push.c
index 9cd8e8cd56..9608b0cc4f 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -558,6 +558,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
OPT_BIT( 0,  "porcelain", &flags, N_("machine-readable 
output"), TRANSPORT_PUSH_PORCELAIN),
OPT_BIT('f', "force", &flags, N_("force updates"), 
TRANSPORT_PUSH_FORCE),
{ OPTION_CALLBACK,
+ /* N_() will get "<>" around, resulting in 
":" */
  0, CAS_OPT_NAME, &cas, N_("refname>:

Re: [PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Junio C Hamano
Jonathan Tan  writes:

>> +test_expect_success 'LHS of refspec follows ref disambiguation rules' '
>> +mkdir lhs-ambiguous &&
>> +(
>> +cd lhs-ambiguous &&
>> +git init server &&
>> +test_commit -C server unwanted &&
>> +test_commit -C server wanted &&
>> +
>> +git init client &&
>> +
>> +# Check a name coming after "refs" alphabetically ...
>> +git -C server update-ref refs/heads/s wanted &&
>> +git -C server update-ref refs/heads/refs/heads/s unwanted &&
>> +git -C client fetch ../server 
>> +refs/heads/s:refs/heads/checkthis &&
>> +git -C server rev-parse wanted >expect &&
>> +git -C client rev-parse checkthis >actual &&
>> +test_cmp expect actual &&
>> +
>> +# ... and one before.
>> +git -C server update-ref refs/heads/q wanted &&
>> +git -C server update-ref refs/heads/refs/heads/q unwanted &&
>> +git -C client fetch ../server 
>> +refs/heads/q:refs/heads/checkthis &&
>> +git -C server rev-parse wanted >expect &&
>> +git -C client rev-parse checkthis >actual &&
>> +test_cmp expect actual &&
>> +
>> +# Tags are preferred over branches like refs/{heads,tags}/*
>> +git -C server update-ref refs/tags/t wanted &&
>> +git -C server update-ref refs/heads/t unwanted &&
>> +git -C client fetch ../server +t:refs/heads/checkthis &&
>> +git -C server rev-parse wanted >expect &&
>> +git -C client rev-parse checkthis >actual
>> +)
>> +'
>
> Thanks, this looks good to me. Also thanks for adding the "+" in the
> fetch commands in the test.

Yup, otherwise the fetch may fail because "checkthis" may have to be
rewound when we fetch different things.


Re: [PATCH] transport: report refs only if transport does

2018-08-01 Thread Brandon Williams
On 07/31, Jonathan Tan wrote:
> > On Mon, Jul 30, 2018 at 03:56:01PM -0700, Jonathan Tan wrote:
> > 
> > > Commit 989b8c4452 ("fetch-pack: put shallow info in output parameter",
> > > 2018-06-28) allows transports to report the refs that they have fetched
> > > in a new out-parameter "fetched_refs". If they do so,
> > > transport_fetch_refs() makes this information available to its caller.
> > > 
> > > Because transport_fetch_refs() filters the refs sent to the transport,
> > > it cannot just report the transport's result directly, but first needs
> > > to readd the excluded refs, pretending that they are fetched. However,
> > > this results in a wrong result if the transport did not report the refs
> > > that they have fetched in "fetched_refs" - the excluded refs would be
> > > added and reported, presenting an incomplete picture to the caller.
> > 
> > This part leaves me confused. If we are not fetching them, then why do
> > we need to pretend that they are fetched?
> 
> The short answer is that we need:
>  (1) the complete list of refs that was passed to
>  transport_fetch_refs(),
>  (2) with shallow information (REF_STATUS_REJECT_SHALLOW set if
>  relevant), and
>  (3) with updated OIDs if ref-in-want was used.
> 
> The fetched_refs out param already fulfils (2) and (3), and this patch
> makes it fulfil (1). As for calling them fetched_refs, perhaps that is a
> misnomer, but they do appear in FETCH_HEAD even though they are not
> truly fetched.
> 
> Which raises the question...if completeness is so important, why not
> reuse the input list of refs and document that transport_fetch_refs()
> can mutate the input list? You ask the same question below, so I'll put
> the answer after quoting your paragraph.
> 
> > I think I am showing my lack of understanding about the reason for this
> > whole "return the fetched refs" scheme from 989b8c4452, and probably
> > reading the rest of that series would make it more clear. But from the
> > perspective of somebody digging into history and finding just this
> > commit, it probably needs to lay out a little more of the reasoning.
> 
> I think it's because 989b8c4452 is based on my earlier work [1] which
> also had a fetched_refs out param. Its main reason is to enable the
> invoker of transport_fetch_refs() to specify ref patterns (as you can
> see in a later commit in the same patch set [2]) - and if we specify
> patterns, the invoker of transport_fetch_refs() needs the resulting refs
> (which are provided through fetched_refs).
> 
> In the version that made it to master, however, there was some debate
> about whether ref patterns need to be allowed. In the end, ref patterns
> were not allowed [3], but the fetched_refs out param was still left in.
> 
> I think that reverting the API might work, but am on the fence about it.
> It would reduce the number of questions about the code (and would
> probably automatically fix the issue that I was fixing in the first

If you believe the API is difficult to work with (which given this bug
it is) then perhaps we go with your suggestion and revert the API back
to only providing a list of input refs and having the fetch operation
mutate that input list.

> place), but if we were to revert the API and then decide that we do want
> ref patterns in "want-ref" (or expand transport_fetch_refs in some
> similar way), we would need to revert our revert, causing code churn.

I haven't thought too much about what we would need to do in the event
we add patterns to ref-in-want, but couldn't we possible mutate the
input list again in this case and just simply add the resulting refs to
the input list?

> 
> [1] 
> https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathanta...@google.com/
> [2] 
> https://public-inbox.org/git/eef2b77d88df0db08e4a1505b06e0af2d40143d5.1485381677.git.jonathanta...@google.com/
> [3] https://public-inbox.org/git/20180620213235.10952-1-bmw...@google.com/

-- 
Brandon Williams


Re: [PATCH 2/2] Highlight keywords in remote sideband output.

2018-08-01 Thread Junio C Hamano
Han-Wen Nienhuys  writes:

> On Wed, Aug 1, 2018 at 5:41 PM Junio C Hamano  wrote:
>
>> Hmm, do we actually say things like "Error: blah"?  I am not sure if
>> I like this strncasecmp all that much.
>
> this is for the remote end, so what we (git-core) says isn't all that
> relevant.

It is very relevant, I would think.  Because the coloring is
controlled at the client end with this implementation, third-party
remote implementations have strong incentive to follow what our
remote end says and not to deviate.  Preventing them from being
different just to be different does help the users, no?

>> > So, despite the explanation in the commit message, this function isn't
>> > _generally_ highlighting keywords in the sideband. Instead, it is
>> > highlighting a keyword only if it finds it at the start of string
>> > (ignoring whitespace). Perhaps the commit message could be more clear
>> > about that.
>>
>> Sounds good.
>>
>> I didn't comment on other parts of your review posed as questions
>> (that require some digging and thinking), but I think they are all
>> worthwhile thing to think about.
>
> Sorry for being dense, but do you want me to send an updated patch or
> not based on your and Eric's comments or not?

It would help to see the comments responded with either "such a
change is not needed for such and such reasons", "it may make sense
but let's leave it to a follow-up patch later," etc., or with a
"here is an updated patch, taking all the comments to the previous
version into account---note that I rejected that particular comment
because of such and such reasons".


Re: [PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Jonathan Tan
> +test_expect_success 'LHS of refspec follows ref disambiguation rules' '
> + mkdir lhs-ambiguous &&
> + (
> + cd lhs-ambiguous &&
> + git init server &&
> + test_commit -C server unwanted &&
> + test_commit -C server wanted &&
> +
> + git init client &&
> +
> + # Check a name coming after "refs" alphabetically ...
> + git -C server update-ref refs/heads/s wanted &&
> + git -C server update-ref refs/heads/refs/heads/s unwanted &&
> + git -C client fetch ../server 
> +refs/heads/s:refs/heads/checkthis &&
> + git -C server rev-parse wanted >expect &&
> + git -C client rev-parse checkthis >actual &&
> + test_cmp expect actual &&
> +
> + # ... and one before.
> + git -C server update-ref refs/heads/q wanted &&
> + git -C server update-ref refs/heads/refs/heads/q unwanted &&
> + git -C client fetch ../server 
> +refs/heads/q:refs/heads/checkthis &&
> + git -C server rev-parse wanted >expect &&
> + git -C client rev-parse checkthis >actual &&
> + test_cmp expect actual &&
> +
> + # Tags are preferred over branches like refs/{heads,tags}/*
> + git -C server update-ref refs/tags/t wanted &&
> + git -C server update-ref refs/heads/t unwanted &&
> + git -C client fetch ../server +t:refs/heads/checkthis &&
> + git -C server rev-parse wanted >expect &&
> + git -C client rev-parse checkthis >actual
> + )
> +'

Thanks, this looks good to me. Also thanks for adding the "+" in the
fetch commands in the test.


Re: [PATCH 1/1] Add the `p4-pre-submit` hook

2018-08-01 Thread Junio C Hamano
Chen Bin  writes:

> The `p4-pre-submit` hook is executed before git-p4 submits code.
> If the hook exits with non-zero value, submit process won't start.
>
> Signed-off-by: Chen Bin 
> ---

I see that the only difference between this and what has been queued
on 'pu', i.e. 

  https://github.com/gitster/git/commit/fb78b040c5dc5b80a093d028d13f8cd32ade17cd

is that this is missing the update in

  https://public-inbox.org/git/xmqq1sbkfneo@gitster-ct.c.googlers.com/

and also Luke's "Reviewed-by:".

I recall that you had trouble getting "git p4" in the test (not
"git-p4") working for some reason.  Has that been resolved (iow
have you figured out why it was failing?)

Thanks.


[PATCH v2] remote: make refspec follow the same disambiguation rule as local refs

2018-08-01 Thread Junio C Hamano
When matching a non-wildcard LHS of a refspec against a list of
refs, find_ref_by_name_abbrev() returns the first ref that matches
using any DWIM rules used by refname_match() in refs.c, even if a
better match occurs later in the list of refs.

This causes unexpected behavior when (for example) fetching using
the refspec "refs/heads/s:" from a remote with both
"refs/heads/refs/heads/s" and "refs/heads/s"; even if the former was
inadvertently created, one would still expect the latter to be
fetched.  Similarly, when both a tag T and a branch T exist,
fetching T should favor the tag, just like how local refname
disambiguation rule works.  But because the code walks over
ls-remote output from the remote, which happens to be sorted in
alphabetical order and has refs/heads/T before refs/tags/T, a
request to fetch T is (mis)interpreted as fetching refs/heads/T.

Update refname_match(), all of whose current callers care only if it
returns non-zero (i.e. matches) to see if an abbreviated name can
mean the full name being tested, so that it returns a positive
integer whose magnitude can be used to tell the precedence, and fix
the find_ref_by_name_abbrev() function not to stop at the first
match but find the match with the highest precedence.

This is based on an earlier work, which special cased only the exact
matches, by Jonathan Tan.

Signed-off-by: Junio C Hamano 
---

 * This time, with a log message, updated "precedence" number, a bit
   of in-code comment and a new test to show that the fix extends to
   non-exact disambiguation as well.

 refs.c   | 16 +++-
 remote.c | 13 ++---
 t/t5510-fetch.sh | 35 +++
 3 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/refs.c b/refs.c
index 20ba82b434..d1be61b1b5 100644
--- a/refs.c
+++ b/refs.c
@@ -487,16 +487,22 @@ static const char *ref_rev_parse_rules[] = {
NULL
 };
 
+/*
+ * Is it possible that the caller meant full_name with abbrev_name?
+ * If so return a non-zero value to signal "yes"; the magnitude of
+ * the returned value gives the precedence used for disambiguation.
+ *
+ * If abbrev_name cannot mean full_name, return 0.
+ */
 int refname_match(const char *abbrev_name, const char *full_name)
 {
const char **p;
const int abbrev_name_len = strlen(abbrev_name);
+   const int num_rules = ARRAY_SIZE(ref_rev_parse_rules) - 1;
 
-   for (p = ref_rev_parse_rules; *p; p++) {
-   if (!strcmp(full_name, mkpath(*p, abbrev_name_len, 
abbrev_name))) {
-   return 1;
-   }
-   }
+   for (p = ref_rev_parse_rules; *p; p++)
+   if (!strcmp(full_name, mkpath(*p, abbrev_name_len, 
abbrev_name)))
+   return &ref_rev_parse_rules[num_rules] - p;
 
return 0;
 }
diff --git a/remote.c b/remote.c
index c10d87c246..4a3e7ba136 100644
--- a/remote.c
+++ b/remote.c
@@ -1880,11 +1880,18 @@ static struct ref *get_expanded_map(const struct ref 
*remote_refs,
 static const struct ref *find_ref_by_name_abbrev(const struct ref *refs, const 
char *name)
 {
const struct ref *ref;
+   const struct ref *best_match = NULL;
+   int best_score = 0;
+
for (ref = refs; ref; ref = ref->next) {
-   if (refname_match(name, ref->name))
-   return ref;
+   int score = refname_match(name, ref->name);
+
+   if (best_score < score) {
+   best_match = ref;
+   best_score = score;
+   }
}
-   return NULL;
+   return best_match;
 }
 
 struct ref *get_remote_ref(const struct ref *remote_refs, const char *name)
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index da9ac00557..858381a788 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -535,6 +535,41 @@ test_expect_success "should be able to fetch with 
duplicate refspecs" '
)
 '
 
+test_expect_success 'LHS of refspec follows ref disambiguation rules' '
+   mkdir lhs-ambiguous &&
+   (
+   cd lhs-ambiguous &&
+   git init server &&
+   test_commit -C server unwanted &&
+   test_commit -C server wanted &&
+
+   git init client &&
+
+   # Check a name coming after "refs" alphabetically ...
+   git -C server update-ref refs/heads/s wanted &&
+   git -C server update-ref refs/heads/refs/heads/s unwanted &&
+   git -C client fetch ../server 
+refs/heads/s:refs/heads/checkthis &&
+   git -C server rev-parse wanted >expect &&
+   git -C client rev-parse checkthis >actual &&
+   test_cmp expect actual &&
+
+   # ... and one before.
+   git -C server update-ref refs/heads/q wanted &&
+   git -C server update-ref refs/heads/refs/heads/q unwanted &&
+   git -C client fetch ../server 
+refs/heads/q:refs/heads/checkthis &&
+

Re: [PATCH 2/2] Highlight keywords in remote sideband output.

2018-08-01 Thread Han-Wen Nienhuys
On Wed, Aug 1, 2018 at 5:41 PM Junio C Hamano  wrote:

> Hmm, do we actually say things like "Error: blah"?  I am not sure if
> I like this strncasecmp all that much.

this is for the remote end, so what we (git-core) says isn't all that
relevant. The reason I put this here is that Gerrit has some messages
that say "ERROR: .. "

> >> +   strbuf_addstr(dest, p->color);
> >> +   strbuf_add(dest, src, len);
> >> +   strbuf_addstr(dest, GIT_COLOR_RESET);
> >> +   n -= len;
> >> +   src += len;
> >> +   break;
> >> +   }
> >
> > So, despite the explanation in the commit message, this function isn't
> > _generally_ highlighting keywords in the sideband. Instead, it is
> > highlighting a keyword only if it finds it at the start of string
> > (ignoring whitespace). Perhaps the commit message could be more clear
> > about that.
>
> Sounds good.
>
> I didn't comment on other parts of your review posed as questions
> (that require some digging and thinking), but I think they are all
> worthwhile thing to think about.

Sorry for being dense, but do you want me to send an updated patch or
not based on your and Eric's comments or not?

thanks,
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


Fetch on submodule update

2018-08-01 Thread Robert Dailey
Problem: I want to avoid recursively fetching submodules when I run a
`fetch` command, and instead defer that operation to the next
`submodule update`. Essentially I want `fetch.recurseSubmodules` to be
`false`, and `get submodule update` to do exactly what it does with
the `--remote` option, but still use the SHA1 of the submodule instead
of updating to the tip of the specified branch in the git modules
config.

I hope that makes sense. The reason for this ask is to
improve/streamline workflow in parent repositories. There are cases
where I want to quickly fetch only the parent repository, even if a
submodule changes, to perform some changes that do not require the
submodule itself (yet). Then at a later time, do `submodule update`
and have it automatically fetch when the SHA1 it's updating to does
not exist (because the former fetch operation for the submodule was
skipped). For my case, it's very slow to wait on submodules to
recursively fetch when I only wanted to fetch the parent repo for the
specific task I plan to do.

Is this possible right now through some variation of configuration?


Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:
> 
> 
> On 7/31/2018 12:50 PM, Ben Peart wrote:
> > 
> > 
> > On 7/31/2018 11:31 AM, Duy Nguyen wrote:
> 
> >>
> >>> In the performance game of whack-a-mole, that call to repair cache-tree
> >>> is now looking quite expensive...
> >>
> >> Yeah and I think we can whack that mole too. I did some measurement.
> >> Best case possible, we just need to scan through two indexes (one with
> >> many good cache-tree, one with no cache-tree), compare and copy
> >> cache-tree over. The scanning takes like 1% time of current repair
> >> step and I suspect it's the hashing that takes most of the time. Of
> >> course real world won't have such nice numbers, but I guess we could
> >> maybe half cache-tree update/repair time.
> >>
> > 
> > I have some great profiling tools available so will take a look at this 
> > next and see exactly where the time is being spent.
> 
> Good instincts.  In cache_tree_update, the heavy hitter is definitely 
> hash_object_file followed by has_object_file.
> 
> Name  Inc %Inc
> + git!cache_tree_update12.4  4,935
> |+ git!update_one  11.8  4,706
> | + git!update_one 11.8  4,706
> |  + git!hash_object_file   6.1  2,406
> |  + git!has_object_file2.0813
> |  + OTHER <> 0.5203
> |  + git!strbuf_addf0.4155
> |  + git!strbuf_release 0.4143
> |  + git!strbuf_add 0.3121
> |  + OTHER <> 0.2 93
> |  + git!strbuf_grow0.1 25

Ben, if you work on this, this could be a good starting point. I will
not work on this because I still have some other things to catch up
and follow through. You can have my sign off if you reuse something
from this patch

Even if it's a naive implementation, the initial numbers look pretty
good. Without the patch we have

18:31:05.970621 unpack-trees.c:1437 performance: 0.01029 s: copy
18:31:05.975729 unpack-trees.c:1444 performance: 0.005082004 s: update

And with the patch

18:31:13.295655 unpack-trees.c:1437 performance: 0.000198017 s: copy
18:31:13.296757 unpack-trees.c:1444 performance: 0.001075935 s: update

Time saving is about 80% by the look of this (best possible case
because only the top tree needs to be hashed and written out).

-- 8< --
diff --git a/cache-tree.c b/cache-tree.c
index 6b46711996..67a4a93100 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -440,6 +440,147 @@ int cache_tree_update(struct index_state *istate, int 
flags)
return 0;
 }
 
+static int same(const struct cache_entry *a, const struct cache_entry *b)
+{
+   if (ce_stage(a) || ce_stage(b))
+   return 0;
+   if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
+   return 0;
+   return a->ce_mode == b->ce_mode &&
+  !oidcmp(&a->oid, &b->oid);
+}
+
+static int cache_tree_name_pos(const struct index_state *istate,
+  const struct strbuf *path)
+{
+   int pos;
+
+   if (!path->len)
+   return 0;
+
+   pos = index_name_pos(istate, path->buf, path->len);
+   if (pos >= 0)
+   BUG("No no no, directory path must not exist in index");
+   return -pos - 1;
+}
+
+/*
+ * Locate the same cache-tree in two separate indexes. Check the
+ * cache-tree is still valid for the "to" index (i.e. it contains the
+ * same set of entries in the "from" index).
+ */
+static int verify_one_cache_tree(const struct index_state *to,
+const struct index_state *from,
+const struct cache_tree *it,
+const struct strbuf *path)
+{
+   int i, spos, dpos;
+
+   spos = cache_tree_name_pos(from, path);
+   if (spos + it->entry_count > from->cache_nr)
+   return -1;
+
+   dpos = cache_tree_name_pos(to, path);
+   if (dpos + it->entry_count > to->cache_nr)
+   return -1;
+
+   /* Can we quickly check head and tail and bail out early */
+   if (!same(from->cache[spos], to->cache[spos]) ||
+   !same(from->cache[spos + it->entry_count - 1],
+ to->cache[spos + it->entry_count - 1]))
+   return -1;
+
+   for (i = 1; i < it->entry_count - 1; i++)
+   if (!same(from->cache[spos + i],
+ to->cache[dpos + i]))
+   return -1;
+
+   return 0;
+}
+
+static int verify_and_invalidate(struct index_state *to,
+const struct index_state *from,
+struct cache_tree *it,
+struct strbuf *path)
+{
+   /*
+* Optimistically verify the current tree first. Alternatively
+* we could verify all the subtrees first then do this
+* last. 

Re: range-diff, was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25)

2018-08-01 Thread Johannes Schindelin
Hi Junio,

On Mon, 30 Jul 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > FWIW I picked up your Asciidoc-underline fix, and I also fixed a typo in a
> > commit message (you may want to pick that up, too, unless you want me to
> > send a full new iteration, I don't care either way):
> 
> Meaning that if you send a full new iteration it would match what we
> have on 'pu' plus the one-liner below?  I think we can do without
> such a resend, because everybody has seen all there is to see if
> that is the case.
> 
> > -- snipsnap --
> > 11:  bf0a5879361 ! 11:  0c1f5db5d01 range-diff: add tests
> > @@ -3,7 +3,7 @@
> >  range-diff: add tests
> >  
> >  These are essentially lifted from https://github.com/trast/tbdiff, 
> > with
> > -light touch-ups to account for the command now being names `git
> > +light touch-ups to account for the command now being named `git
> >  range-diff`.
> >  
> >  Apart from renaming `tbdiff` to `range-diff`, only one test case 
> > needed
> 
> I'll need to remember to rebuild es/format-patch-rangediff after
> amending bf0a587936 with this, but I think I should be able to push
> out the result in today's round.
> 
> If any other issue arises, I do not mind taking an update, either,
> but I think at this point the topic is reaching the point of
> diminishing returns and should switch to incremental.

Thomas had a couple of good suggestions, still, and I am considering to
try to find time to simply disable the whitespace warnings altogether in
range-diff.

Ciao,
Dscho


[PATCH 1/1] Add the `p4-pre-submit` hook

2018-08-01 Thread Chen Bin
The `p4-pre-submit` hook is executed before git-p4 submits code.
If the hook exits with non-zero value, submit process won't start.

Signed-off-by: Chen Bin 
---
 Documentation/git-p4.txt   |  8 
 Documentation/githooks.txt |  7 +++
 git-p4.py  | 16 +++-
 t/t9800-git-p4-basic.sh| 29 +
 4 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index f0de3b891..a7aac1b92 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -374,6 +374,14 @@ These options can be used to modify 'git p4 submit' 
behavior.
 been submitted. Implies --disable-rebase. Can also be set with
 git-p4.disableP4Sync. Sync with origin/master still goes ahead if possible.
 
+Hook for submit
+~~~
+The `p4-pre-submit` hook is executed if it exists and is executable.
+The hook takes no parameter and nothing from standard input. Exiting with
+non-zero status from this script prevents `git-p4 submit` from launching.
+
+One usage scenario is to run unit tests in the hook.
+
 Rebase options
 ~~
 These options can be used to modify 'git p4 rebase' behavior.
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index e3c283a17..22fcabbe2 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -485,6 +485,13 @@ The exit status determines whether git will use the data 
from the
 hook to limit its search.  On error, it will fall back to verifying
 all files and folders.
 
+p4-pre-submit
+~
+
+This hook is invoked by `git-p4 submit`. It takes no parameter and nothing
+from standard input. Exiting with non-zero status from this script prevent
+`git-p4 submit` from launching. Run `git-p4 submit --help` for details.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/git-p4.py b/git-p4.py
index b449db1cc..879abfd2b 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1494,7 +1494,13 @@ def __init__(self):
 optparse.make_option("--disable-p4sync", 
dest="disable_p4sync", action="store_true",
  help="Skip Perforce sync of p4/master 
after submit or shelve"),
 ]
-self.description = "Submit changes from git to the perforce depot."
+self.description = """Submit changes from git to the perforce depot.\n
+The `p4-pre-submit` hook is executed if it exists and is executable.
+The hook takes no parameter and nothing from standard input. Exiting with
+non-zero status from this script prevents `git-p4 submit` from launching.
+
+One usage scenario is to run unit tests in the hook."""
+
 self.usage += " [name of git branch to submit into perforce depot]"
 self.origin = ""
 self.detectRenames = False
@@ -2303,6 +2309,14 @@ def run(self, args):
 sys.exit("number of commits (%d) must match number of shelved 
changelist (%d)" %
  (len(commits), num_shelves))
 
+hooks_path = gitConfig("core.hooksPath")
+if len(hooks_path) <= 0:
+hooks_path = os.path.join(os.environ.get("GIT_DIR", ".git"), 
"hooks")
+
+hook_file = os.path.join(hooks_path, "p4-pre-submit")
+if os.path.isfile(hook_file) and os.access(hook_file, os.X_OK) and 
subprocess.call([hook_file]) != 0:
+sys.exit(1)
+
 #
 # Apply the commits, one at a time.  On failure, ask if should
 # continue to try the rest of the patches, or quit.
diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh
index 4849edc4e..729cd2577 100755
--- a/t/t9800-git-p4-basic.sh
+++ b/t/t9800-git-p4-basic.sh
@@ -261,6 +261,35 @@ test_expect_success 'unresolvable host in P4PORT should 
display error' '
)
 '
 
+# Test following scenarios:
+#   - Without ".git/hooks/p4-pre-submit" , submit should continue
+#   - With the hook returning 0, submit should continue
+#   - With the hook returning 1, submit should abort
+test_expect_success 'run hook p4-pre-submit before submit' '
+   test_when_finished cleanup_git &&
+   git p4 clone --dest="$git" //depot &&
+   (
+   cd "$git" &&
+   echo "hello world" >hello.txt &&
+   git add hello.txt &&
+   git commit -m "add hello.txt" &&
+   git config git-p4.skipSubmitEdit true &&
+   git p4 submit --dry-run >out &&
+   grep "Would apply" out &&
+   mkdir -p .git/hooks &&
+   write_script .git/hooks/p4-pre-submit <<-\EOF &&
+   exit 0
+   EOF
+   git p4 submit --dry-run >out &&
+   grep "Would apply" out &&
+   write_script .git/hooks/p4-pre-submit <<-\EOF &&
+   exit 1
+   EOF
+   test_must_fail git p4 submit --dry-run >errs 2>&1 &&
+   ! grep "Would apply" errs
+   )
+'
+
 test_expect_success 'submit from detached head' '
   

Re: [PATCH v2 2/2] sequencer: fix quoting in write_author_script

2018-08-01 Thread Phillip Wood

On 31/07/18 22:39, Eric Sunshine wrote:

On Tue, Jul 31, 2018 at 7:15 AM Phillip Wood  wrote:

Single quotes should be escaped as \' not \\'. Note that this only
affects authors that contain a single quote and then only external
scripts that read the author script and users whose git is upgraded from
the shell version of rebase -i while rebase was stopped. This is because
the parsing in read_env_script() expected the broken version and for
some reason sq_dequote() called by read_author_ident() seems to handle
the broken quoting correctly.


Is the:

 ...for some reason sq_dequote() called by read_author_ident()
 seems to handle the broken quoting correctly.

bit outdated? We know now from patch 2/4 of my series[1] that
read_author_ident() wasn't handling it correctly at all. It was merely
ignoring the return value from sq_dequote() and using whatever broken
value came back from it.

[1]: 
https://public-inbox.org/git/20180731073331.40007-3-sunsh...@sunshineco.com/


Helped-by: Johannes Schindelin 
Signed-off-by: Phillip Wood 
---
diff --git a/sequencer.c b/sequencer.c
@@ -664,14 +664,25 @@ static int write_author_script(const char *message)
  static int read_env_script(struct argv_array *env)
  {
 if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
 return -1;


This is not a problem introduced by this patch, but since
strbuf_read_file() doesn't guarantee that memory hasn't been allocated
when it returns an error, this is leaking.


+   /*
+* write_author_script() used to fail to terminate the GIT_AUTHOR_DATE
+* line with a "'" and also escaped "'" incorrectly as "'''" rather
+* than "'\\''". We check for the terminating "'" on the last line to
+* see how "'" has been escaped in case git was upgraded while rebase
+* was stopped.
+*/
+   sq_bug = script.len && script.buf[script.len - 2] != '\'';


I think you need to be checking 'script.len > 1', not just
'script.len', otherwise you might access memory outside the allocated
buffer.

This is a very "delicate" check, assuming that a hand-edited file
won't end with, say, an extra newline. I wonder if this level of
backward-compatibility is overkill for such an unlikely case.


I think I'll get rid of the check and instead use a version number 
written to .git/rebase-merge/interactive to indicate if we need to fix 
the quoting (if there's no number then it needs fixing). We can 
increment the version number in the future if we ever need to implement 
other fallbacks to handle the case where git got upgraded while rebase 
was stopped. I'll send a patch tomorrow


Best Wishes

Phillip




 for (p = script.buf; *p; p++)
-   if (skip_prefix(p, "'''", (const char **)&p2))
+   if (sq_bug && skip_prefix(p, "'''", &p2))
+   strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
+   else if (skip_prefix(p, "'\\''", &p2))
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
@@ -75,6 +75,22 @@ test_expect_success 'rebase --keep-empty' '
+test_expect_success 'rebase -i writes correct author-script' '
+   test_when_finished "test_might_fail git rebase --abort" &&
+   git checkout -b author-with-sq master &&
+   GIT_AUTHOR_NAME="Auth O$SQ R" git commit --allow-empty -m with-sq &&
+   set_fake_editor &&
+   FAKE_LINES="edit 1" git rebase -ki HEAD^ &&


Hmph, -k doesn't seem to be documented in git-rebase.txt. Is it needed here?





Re: [PATCH 2/2] Highlight keywords in remote sideband output.

2018-08-01 Thread Junio C Hamano
Eric Sunshine  writes:

> On Tue, Jul 31, 2018 at 1:37 PM Han-Wen Nienhuys  wrote:
>> Highlight keywords in remote sideband output.
>
> Prefix with the module you're touching, don't capitalize, and drop the
> period. Perhaps:
>
> sideband: highlight keywords in remote sideband output

Yup (I locally did something similar when queued it).

>> The highlighting is done on the client-side. Supported keywords are
>> "error", "warning", "hint" and "success".
>>
>> The colorization is controlled with the config setting "color.remote".
>
> What's the motivation for this change? The commit message should say
> something about that and give an explanation of why this is done
> client-side rather than server-side.

Good suggestion.

>
>> Co-authored-by: Duy Nguyen 
>
> Helped-by: is more typical.
>
>> Signed-off-by: Han-Wen Nienhuys 
>> ---
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> @@ -1229,6 +1229,15 @@ color.push::
>> +color.remote::
>> +   A boolean to enable/disable colored remote output. If unset,
>> +   then the value of `color.ui` is used (`auto` by default).
>
> If this is "boolean", what does "auto" mean? Perhaps update the
> description to better match other color-related options.

Existing `color.branch` is already loose in the same way, but others
like color.{diff,grep,interactive} read better.  No, past mistakes
by others is not a good excuse to make things worse, but I'd say it
also is OK to clean this up, together with `color.branch`, after this
change on top.

>> +   if (!strncasecmp(p->keyword, src, len) && 
>> !isalnum(src[len])) {
>
> So, the strncasecmp() is checking if one of the recognized keywords is
> at the 'src' position, and the !isalnum() ensures that you won't pick
> up something of which the keyword is merely a prefix. For instance,
> you won't mistakenly highlight "successful". It also works correctly
> when 'len' happens to reference the end-of-string NUL. Okay.

Hmm, do we actually say things like "Error: blah"?  I am not sure if
I like this strncasecmp all that much.

>> +   strbuf_addstr(dest, p->color);
>> +   strbuf_add(dest, src, len);
>> +   strbuf_addstr(dest, GIT_COLOR_RESET);
>> +   n -= len;
>> +   src += len;
>> +   break;
>> +   }
>
> So, despite the explanation in the commit message, this function isn't
> _generally_ highlighting keywords in the sideband. Instead, it is
> highlighting a keyword only if it finds it at the start of string
> (ignoring whitespace). Perhaps the commit message could be more clear
> about that.

Sounds good.

I didn't comment on other parts of your review posed as questions
(that require some digging and thinking), but I think they are all
worthwhile thing to think about.

Thanks.


Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 8:23 PM Torsten Bögershausen  wrote:
> I wonder if we can tell the users more about the "problems"
> and how to avoid them, or to live with them.
>
> This is more loud thinking:
>
> "The following paths only differ in case\n"
> "One a case-insensitive file system only one at a time can be present\n"
> "You may rename one like this:\n"
> "git checkout  && git mv  .1\n"

Jeff gave a couple more options [1] to fix or workaround this. I think
the problem is there is no single recommended way to deal with it. If
there is, we can describe in this warning. But listing multiple
options in this warning may be too much (the wall of text could easily
take half a screen).

Or if people agree on _one_ suggestion, I will gladly put it in.

[1] 
https://public-inbox.org/git/cacsjy8a_uzm7numyernhjmya0eyrqytv7dp2iklznxnboqu...@mail.gmail.com/T/#m60fedd7dc928a4d52eb5919811f84556f391a7b3

> > + fprintf(stderr, "\t%s\n", 
> > dup.items[i].string);
>
> Another question:
> Do we need any quote_path() here ?
> (This may be overkill, since typically the repos with conflicting names
> only use ASCII.)

Would be good to show trailing spaces in path names, so yes.
-- 
Duy


Re: [PATCH v2 2/2] sequencer: fix quoting in write_author_script

2018-08-01 Thread Junio C Hamano
Phillip Wood  writes:

>> Is the:
>>
>>  ...for some reason sq_dequote() called by read_author_ident()
>>  seems to handle the broken quoting correctly.
>>
>> bit outdated? We know now from patch 2/4 of my series[1] that
>> read_author_ident() wasn't handling it correctly at all. It was merely
>> ignoring the return value from sq_dequote() and using whatever broken
>> value came back from it.
>
> Yes you're right, when I tested it...
>
> Thanks for your comments, I'll do a reroll

Thanks, both.  Sounds like we are quickly converging to the
resolution ;-)


Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 8:44 PM Elijah Newren  wrote:
> Is it worth attempting to also warn about paths that only differ in
> UTF-normalization on relevant MacOS systems?

Down this thread, Jeff Hostetler drew a scarier picture of "case"
handling on MacOS and Windows. I think we should start with support
things like utf normalization... in core.ignore first before doing the
warning stuff. At that point we know well how to detect and warn, or
any other pitfalls.
-- 
Duy


[PATCH 2/2] fetch doc: cross-link two new negotiation options

2018-08-01 Thread Ævar Arnfjörð Bjarmason
Users interested in the fetch.negotiationAlgorithm variable added in
42cc7485a2 ("negotiator/skipping: skip commits during fetch",
2018-07-16) are probably interested in the related --negotiation-tip
option added in 3390e42adb ("fetch-pack: support negotiation tip
whitelist", 2018-07-02).

Change the documentation for those two to reference one another to
point readers in the right direction.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt| 2 ++
 Documentation/fetch-options.txt | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 84f73d7458..dc55ff17e0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1540,6 +1540,8 @@ fetch.negotiationAlgorithm::
that never skips commits (unless the server has acknowledged it or one
of its descendants).
Unknown values will cause 'git fetch' to error out.
++
+See also the `--negotiation-tip` option for linkgit:git-fetch[1].
 
 format.attach::
Enable multipart/mixed attachments as the default for
diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 2d09f87b4b..8bc36af4b1 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -57,6 +57,9 @@ commits reachable from any of the given commits.
 The argument to this option may be a glob on ref names, a ref, or the (possibly
 abbreviated) SHA-1 of a commit. Specifying a glob is equivalent to specifying
 this option multiple times, one for each matching ref name.
++
+See also the `fetch.negotiationAlgorithm` configuration variable
+documented in linkgit:git-config[1].
 
 ifndef::git-pull[]
 --dry-run::
-- 
2.18.0.345.g5c9ce644c3



[PATCH 0/2] negotiator: improve recent behavior + docs

2018-08-01 Thread Ævar Arnfjörð Bjarmason
On Tue, Jul 31 2018, Jonathan Tan wrote:

>> > +fetch.negotiationAlgorithm::
>> > +  Control how information about the commits in the local repository is
>> > +  sent when negotiating the contents of the packfile to be sent by the
>> > +  server. Set to "skipping" to use an algorithm that skips commits in an
>> > +  effort to converge faster, but may result in a larger-than-necessary
>> > +  packfile; any other value instructs Git to use the default algorithm
>> > +  that never skips commits (unless the server has acknowledged it or one
>> > +  of its descendants).
>> > +
>> 
>> ...let's instead document that there's just the values "skipping" and
>> "default", and say "default" is provided by default, and perhaps change
>> the code to warn about anything that isn't those two.
>> 
>> Then we're not painting ourselves into a corner by needing to break a
>> promise in the docs ("any other value instructs Git to use the default")
>> if we add a new one of these, and aren't silently falling back on the
>> default if we add new-fancy-algo the user's version doesn't know about.
>
> My intention was to allow future versions of Git to introduce more
> algorithms, but have older versions of Git still work even if a
> repository is configured to use a newer algorithm. But your suggestion
> is reasonable too.

I think 01/02 in this patch series implements something that's better
& more future-proof.

>> Now, running that "git fetch --all" takes ages, and I know why. It's
>> because the in the negotiation for "git fetch some/small-repo" I'm
>> emitting hundreds of thousands of "have" lines for SHA1s found in other
>> unrelated repos, only to get a NAK for all of them.
>> 
>> One way to fix that with this facility would be to have some way to pass
>> in arguments, similar to what we have for merge drivers, so I can say
>> "just emit 'have' lines for stuff found in this branch". The most
>> pathological cases are when I'm fetching a remote that has one commit,
>> and I'm desperately trying to find something in common by asking if the
>> remote has hundreds of K of commits it has no chance of having.
>> 
>> Or there may be some smarter way to do this, what do you think?
>
> Well, there is already a commit in "next" that does this :-)
>
> 3390e42adb ("fetch-pack: support negotiation tip whitelist", 2018-07-03)

That's awesome. This is exactly what I wanted, this patch series also
fixes another small issue in 02/02; which is that the docs for the two
really should cross-link to make these discoverable from one another.

Another thing I noticed, which I have not fixed, is that there's no
way to say "I don't want to you to say that anything is in
common". Currently I'm hacking around in my script by doing:

parallel --eta --verbose --progress '
if git rev-parse {}/HEAD 2>/dev/null
then
git fetch --negotiation-tip={}/HEAD {}
else
git fetch --negotiation-tip=ref-i-have-with-one-commit {}
fi
' ::: $(git remote)

I.e. the way I'm doing this is I add all the remotes first, then I
fetch them all in parallel, but because the first time around I don't
have anything for that remote (and they don't share any commits) I
need to fake it up and pretend to be fetching from a repo that has
just one commit.

It would be better if I could somehow say that I don't mind that the
ref doesn't exist, but currently you either error out with this, or
ignore the glob, depending on the mode.

So I want this, but can't think of a less shitty UI than:

git fetch --negotiation-tip=$REF 
--negotiation-tip-error-handling=missing-ref-means-no-want

Or something equally atrocious, do you have any better ideas?

Ævar Arnfjörð Bjarmason (2):
  negotiator: unknown fetch.negotiationAlgorithm should error out
  fetch doc: cross-link two new negotiation options

 Documentation/config.txt |  5 -
 Documentation/fetch-options.txt  |  3 +++
 fetch-negotiator.c   | 12 +---
 t/t5552-skipping-fetch-negotiator.sh | 23 +++
 4 files changed, 39 insertions(+), 4 deletions(-)

-- 
2.18.0.345.g5c9ce644c3



[PATCH 1/2] negotiator: unknown fetch.negotiationAlgorithm should error out

2018-08-01 Thread Ævar Arnfjörð Bjarmason
Change the handling of fetch.negotiationAlgorithm= to error out
on unknown strings, i.e. everything except "default" or "skipping".

This changes the behavior added in 42cc7485a2 ("negotiator/skipping:
skip commits during fetch", 2018-07-16) which would ignore all unknown
values and silently fall back to the "default" value.

For a feature like this it's much better to produce an error than
proceed. We don't want users to debug some amazingly slow fetch that
should benefit from "skipping", only to find that they'd forgotten to
deploy the new git version on that particular machine.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt |  3 ++-
 fetch-negotiator.c   | 12 +---
 t/t5552-skipping-fetch-negotiator.sh | 23 +++
 3 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 63365dcf3d..84f73d7458 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1536,9 +1536,10 @@ fetch.negotiationAlgorithm::
sent when negotiating the contents of the packfile to be sent by the
server. Set to "skipping" to use an algorithm that skips commits in an
effort to converge faster, but may result in a larger-than-necessary
-   packfile; any other value instructs Git to use the default algorithm
+   packfile; The default is "default" which instructs Git to use the 
default algorithm
that never skips commits (unless the server has acknowledged it or one
of its descendants).
+   Unknown values will cause 'git fetch' to error out.
 
 format.attach::
Enable multipart/mixed attachments as the default for
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
index 5d283049f4..d6d685cba0 100644
--- a/fetch-negotiator.c
+++ b/fetch-negotiator.c
@@ -6,9 +6,15 @@
 void fetch_negotiator_init(struct fetch_negotiator *negotiator,
   const char *algorithm)
 {
-   if (algorithm && !strcmp(algorithm, "skipping")) {
-   skipping_negotiator_init(negotiator);
-   return;
+   if (algorithm) {
+   if (!strcmp(algorithm, "skipping")) {
+   skipping_negotiator_init(negotiator);
+   return;
+   } else if (!strcmp(algorithm, "default")) {
+   /* Fall through to default initialization */
+   } else {
+   die("unknown fetch negotiation algorithm '%s'", 
algorithm);
+   }
}
default_negotiator_init(negotiator);
 }
diff --git a/t/t5552-skipping-fetch-negotiator.sh 
b/t/t5552-skipping-fetch-negotiator.sh
index 0a8e0e42ed..3b60bd44e0 100755
--- a/t/t5552-skipping-fetch-negotiator.sh
+++ b/t/t5552-skipping-fetch-negotiator.sh
@@ -47,6 +47,29 @@ test_expect_success 'commits with no parents are sent 
regardless of skip distanc
have_not_sent c6 c4 c3
 '
 
+test_expect_success 'unknown fetch.negotiationAlgorithm values error out' '
+   rm -rf server client trace &&
+   git init server &&
+   test_commit -C server to_fetch &&
+
+   git init client &&
+   test_commit -C client on_client &&
+   git -C client checkout on_client &&
+
+   test_config -C client fetch.negotiationAlgorithm invalid &&
+   test_must_fail git -C client fetch "$(pwd)/server" 2>err &&
+   test_i18ngrep "unknown fetch negotiation algorithm" err &&
+
+   # Explicit "default" value
+   test_config -C client fetch.negotiationAlgorithm default &&
+   git -C client -c fetch.negotiationAlgorithm=default fetch 
"$(pwd)/server" &&
+
+   # Implementation detail: If there is nothing to fetch, we will not 
error out
+   test_config -C client fetch.negotiationAlgorithm invalid &&
+   git -C client fetch "$(pwd)/server" 2>err &&
+   test_i18ngrep ! "unknown fetch negotiation algorithm" err
+'
+
 test_expect_success 'when two skips collide, favor the larger one' '
rm -rf server client trace &&
git init server &&
-- 
2.18.0.345.g5c9ce644c3



Re: [PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 9:13 PM Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > Another thing we probably should do is catch in "git checkout" too,
> > not just "git clone" since your linux/unix colleage colleague may
> > accidentally add some files that your mac/windows machine is not very
> > happy with.
>
> Then you would catch it not in checkout but in add, no?

No because the guy who added these may have done it on a
case-sensitive filesystem. Only later when his friend fetches new
changes on a case-insensitive filesytem, the problem becomes real. If
in this scenario, core.ignore is enforced on all machines, then yes we
could catch it at "git add" (and we should already do that or we have
a bug). At least in open source project setting, I think enforcing
core.ignore will not work.
-- 
Duy


Re: [PATCH 1/1] Add the `p4-pre-submit` hook

2018-08-01 Thread Junio C Hamano
chen bin  writes:

> I updated the patch. But for some reason the test keep failing at this line,
> `test_must_fail git p4 submit --dry-run >errs 2>&1 &&`.
>
> If I change this line to `test_must_fail git-p4 submit --dry-run >errs
> 2>&1 &&` the test will pass.

Hmph.  I somehow suspect that the test also will pass if you changed
it like this:

test_must_fail false >errs 2>&1 &&

IOW, my suspicion is that the shell fails to find "git-p4" [*1*] and
that is why your `test_must_fail git-p4 whatever` lets your test
pass, which is different from the way you _want_ your test_must_fail
succeed, i.e. "git p4 submit" is run with the "--dry-run" option and
exit with non-zero status.

Of course, if the shell cannot find "git-p4", `errs` would not have
the string "Would apply" in it, so the next test also would pass.

 On 31 July 2018 at 10:46, SZEDER Gábor  wrote:
>> + test_must_fail git-p4 submit --dry-run >errs 2>&1 &&>
>> + ! grep "Would apply" err



[Footnote]

 *1* As I do not use (nor install) p4, I cannot test "'git p4' works
 but 'git-p4' should not work" myself, but by running t0001 with
 a trial patch like the following, you can see that we do not
 find the dashed form `git-init` on $PATH and the test indeed
 fails.

 t/t0001-init.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 0681300707..8c598a0d84 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -31,7 +31,7 @@ check_config () {
 }
 
 test_expect_success 'plain' '
-   git init plain &&
+   git-init plain &&
check_config plain/.git false unset
 '
 


Re: [PATCH v2] checkout: optimize "git checkout -b "

2018-08-01 Thread Duy Nguyen
On Tue, Jul 31, 2018 at 7:03 PM Ben Peart  wrote:
>
> From: Ben Peart 
>
> Skip merging the commit, updating the index and working directory if and
> only if we are creating a new branch via "git checkout -b ."
> Any other checkout options will still go through the former code path.

I'd like to see this giant list of checks broken down and pushed down
to smaller areas so that chances of new things being added but checks
not updated become much smaller. And ideally there should just be no
behavior change (I think with your change, "checkout -b" will not
report local changes, but it's not mentioned in the config document;
more things like that can easily slip).

So. I assume this reason for this patch is because on super large worktrees

 - 2-way merge is too slow
 - applying spare checkout patterns on a huge worktree is also slow
 - writing index is, again, slow
 - show_local_changes() slow

For 2-way merge, I believe we can detect inside unpack_trees() that
it's a 2-way merge (fn == twoway_merge), from HEAD to HEAD (simple
enough check), then from the 2-way merge table we know for sure
nothing is going to change and we can just skip traverse_trees() call
in unpack_trees().

On the sparse checkout application. This only needs to be done when
there are new files added, or the spare-checkout file has been updated
since the last time it's been used. We can keep track of these things
(sparse-checkout file change could be kept track with just stat info
maybe as an index extension) then we can skip applying sparse checkout
not for this particular case for but general checkouts as well. Spare
checkout file rarely changes. Big win overall.

And if all go according to plan, there will be no changes made in the
index (by either 2-way merge or sparse checkout stuff) we should be
able to just skip writing down the index, if we haven't done that
already.

show_local_changes() should be sped up significantly with the new
cache-tree optimization I'm working on in another thread.

If I have not made any mistake in my analysis so far, we achieve a big
speedup without adding a new config knob and can still fall back to
slower-but-same-behavior when things are not in the right condition. I
know it will not be the same speedup as this patch because when facing
thousands of items, even counting them takes time. But I think it's a
reasonable speedup without making the code base more fragile.
-- 
Duy


Re: [PATCH 1/1] Add the `p4-pre-submit` hook

2018-08-01 Thread chen bin
I updated the patch. But for some reason the test keep failing at this line,
`test_must_fail git p4 submit --dry-run >errs 2>&1 &&`.

If I change this line to `test_must_fail git-p4 submit --dry-run >errs
2>&1 &&` the test will pass.

Could you help me to resolve this issue? I'm really confused. I've
already added latest `p4d` and `p4` to $PATH.

The commit is at
https://github.com/redguardtoo/git/commit/b88c38b9ce6cfb1c1fefef15527adfa92d78daf2


On Wed, Aug 1, 2018 at 5:54 AM, Luke Diamand  wrote:
> On 31 July 2018 at 16:40, Junio C Hamano  wrote:
>> Luke Diamand  writes:
>>
>>> I think there is an error in the test harness.
>>>
>>> On 31 July 2018 at 10:46, SZEDER Gábor  wrote:
> + test_must_fail git-p4 submit --dry-run >errs 2>&1 &&>
> + ! grep "Would apply" err
>>>
>>> It writes to the file "errs" but then looks for the message in "err".
>>>
>>> Luke
>>
>> Sigh. Thanks for spotting, both of you.
>>
>> Here is what I'd locally squash in.  If there is anything else, I'd
>> rather see a refreshed final one sent by the author.
>>
>> Thanks.
>>
>>  t/t9800-git-p4-basic.sh | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh
>> index 2b7baa95d2..729cd25770 100755
>> --- a/t/t9800-git-p4-basic.sh
>> +++ b/t/t9800-git-p4-basic.sh
>> @@ -274,19 +274,19 @@ test_expect_success 'run hook p4-pre-submit before 
>> submit' '
>> git add hello.txt &&
>> git commit -m "add hello.txt" &&
>> git config git-p4.skipSubmitEdit true &&
>> -   git-p4 submit --dry-run >out &&
>> +   git p4 submit --dry-run >out &&
>> grep "Would apply" out &&
>> mkdir -p .git/hooks &&
>> write_script .git/hooks/p4-pre-submit <<-\EOF &&
>> exit 0
>> EOF
>> -   git-p4 submit --dry-run >out &&
>> +   git p4 submit --dry-run >out &&
>> grep "Would apply" out &&
>> write_script .git/hooks/p4-pre-submit <<-\EOF &&
>> exit 1
>> EOF
>> -   test_must_fail git-p4 submit --dry-run >errs 2>&1 &&
>> -   ! grep "Would apply" err
>> +   test_must_fail git p4 submit --dry-run >errs 2>&1 &&
>> +   ! grep "Would apply" errs
>> )
>>  '
>
> That set of deltas works for me.
>
> Sorry, I should have run the tests myself earlier and I would have
> picked up on this.



-- 
help me, help you.


Re: [PATCH v2 2/2] sequencer: fix quoting in write_author_script

2018-08-01 Thread Phillip Wood

Hi Eric

On 31/07/18 22:39, Eric Sunshine wrote:

On Tue, Jul 31, 2018 at 7:15 AM Phillip Wood  wrote:

Single quotes should be escaped as \' not \\'. Note that this only
affects authors that contain a single quote and then only external
scripts that read the author script and users whose git is upgraded from
the shell version of rebase -i while rebase was stopped. This is because
the parsing in read_env_script() expected the broken version and for
some reason sq_dequote() called by read_author_ident() seems to handle
the broken quoting correctly.


Is the:

 ...for some reason sq_dequote() called by read_author_ident()
 seems to handle the broken quoting correctly.

bit outdated? We know now from patch 2/4 of my series[1] that
read_author_ident() wasn't handling it correctly at all. It was merely
ignoring the return value from sq_dequote() and using whatever broken
value came back from it.


Yes you're right, when I tested it before I must of had GIT_AUTHOR_NAME 
set to the name with the "'" in it when I ran the rebase because it 
appeared to work, but actually sj_dequote() was returning NULL and so 
commit_tree() just picked up the default author. I've just changed the 
test you added to


test_expect_success 'valid author header after --root swap' '
rebase_setup_and_clean author-header no-conflict-branch &&
set_fake_editor &&
	git commit --amend --author="Au ${SQ}thor " 
--no-edit &&

git cat-file commit HEAD | grep ^author >expected &&
FAKE_LINES="5 1" git rebase -i --root &&
git cat-file commit HEAD^ | grep ^author >actual &&
test_cmp expected actual
'

and it fails without the fixes to write_author_script().



[1]: 
https://public-inbox.org/git/20180731073331.40007-3-sunsh...@sunshineco.com/


Helped-by: Johannes Schindelin 
Signed-off-by: Phillip Wood 
---
diff --git a/sequencer.c b/sequencer.c
@@ -664,14 +664,25 @@ static int write_author_script(const char *message)
  static int read_env_script(struct argv_array *env)
  {
 if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
 return -1;


This is not a problem introduced by this patch, but since
strbuf_read_file() doesn't guarantee that memory hasn't been allocated
when it returns an error, this is leaking.


I can fix that


+   /*
+* write_author_script() used to fail to terminate the GIT_AUTHOR_DATE
+* line with a "'" and also escaped "'" incorrectly as "'''" rather
+* than "'\\''". We check for the terminating "'" on the last line to
+* see how "'" has been escaped in case git was upgraded while rebase
+* was stopped.
+*/
+   sq_bug = script.len && script.buf[script.len - 2] != '\'';


I think you need to be checking 'script.len > 1', not just
'script.len', otherwise you might access memory outside the allocated
buffer.


Good catch, Johannes's original was checking script.buf[script.len - 1] 
which I corrected but forget to adjust the previous check.



This is a very "delicate" check, assuming that a hand-edited file
won't end with, say, an extra newline. I wonder if this level of
backward-compatibility is overkill for such an unlikely case.


Yes, it is a bit fragile. Originally the patch just unquoted the correct 
and incorrect quoting but Johannes was worried that might lead to errors 
and suggested this check. The check is aimed at people whose git gets 
upgraded while rebase is stopped for a conflict resolution or edit and 
so have the bad quoting in the author-script from the old version of git 
which started the rebase. Authors with "'" in the name are uncommon but 
not unheard of, I think when I checked there were about half a dozen in 
git's history. I'm not sure what to do for the best.



 for (p = script.buf; *p; p++)
-   if (skip_prefix(p, "'''", (const char **)&p2))
+   if (sq_bug && skip_prefix(p, "'''", &p2))
+   strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
+   else if (skip_prefix(p, "'\\''", &p2))
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
@@ -75,6 +75,22 @@ test_expect_success 'rebase --keep-empty' '
+test_expect_success 'rebase -i writes correct author-script' '
+   test_when_finished "test_might_fail git rebase --abort" &&
+   git checkout -b author-with-sq master &&
+   GIT_AUTHOR_NAME="Auth O$SQ R" git commit --allow-empty -m with-sq &&
+   set_fake_editor &&
+   FAKE_LINES="edit 1" git rebase -ki HEAD^ &&


Hmph, -k doesn't seem to be documented in git-rebase.txt. Is it needed here?


-k is short for --keep-empty which is needed as the test creates an 
empty commit to check the author (I think that is to avoid changing the 
tree  - Johannes wrote that bit).


Thanks for your comments, I'll do a reroll

Phillip


[PATCH] remote: clear string_list after use in mv()

2018-08-01 Thread René Scharfe
Switch to the _DUP variant of string_list for remote_branches to allow
string_list_clear() to release the allocated memory at the end, and
actually call that function.  Free the util pointer as well; it is
allocated in read_remote_branches().

NB: This string_list is empty until read_remote_branches() is called
via for_each_ref(), so there is no need to clean it up when returning
before that point.

Signed-off-by: Rene Scharfe 
---
Patch generated with ---function-context for easier review -- that
makes it look much bigger than it actually is, though.

 builtin/remote.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index c74ee88690..07bd51f8eb 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -558,23 +558,23 @@ struct rename_info {
 static int read_remote_branches(const char *refname,
const struct object_id *oid, int flags, void *cb_data)
 {
struct rename_info *rename = cb_data;
struct strbuf buf = STRBUF_INIT;
struct string_list_item *item;
int flag;
const char *symref;
 
strbuf_addf(&buf, "refs/remotes/%s/", rename->old_name);
if (starts_with(refname, buf.buf)) {
-   item = string_list_append(rename->remote_branches, 
xstrdup(refname));
+   item = string_list_append(rename->remote_branches, refname);
symref = resolve_ref_unsafe(refname, RESOLVE_REF_READING,
NULL, &flag);
if (symref && (flag & REF_ISSYMREF))
item->util = xstrdup(symref);
else
item->util = NULL;
}
strbuf_release(&buf);
 
return 0;
 }
@@ -607,133 +607,134 @@ static int migrate_file(struct remote *remote)
 static int mv(int argc, const char **argv)
 {
struct option options[] = {
OPT_END()
};
struct remote *oldremote, *newremote;
struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT,
old_remote_context = STRBUF_INIT;
-   struct string_list remote_branches = STRING_LIST_INIT_NODUP;
+   struct string_list remote_branches = STRING_LIST_INIT_DUP;
struct rename_info rename;
int i, refspec_updated = 0;
 
if (argc != 3)
usage_with_options(builtin_remote_rename_usage, options);
 
rename.old_name = argv[1];
rename.new_name = argv[2];
rename.remote_branches = &remote_branches;
 
oldremote = remote_get(rename.old_name);
if (!remote_is_configured(oldremote, 1))
die(_("No such remote: %s"), rename.old_name);
 
if (!strcmp(rename.old_name, rename.new_name) && oldremote->origin != 
REMOTE_CONFIG)
return migrate_file(oldremote);
 
newremote = remote_get(rename.new_name);
if (remote_is_configured(newremote, 1))
die(_("remote %s already exists."), rename.new_name);
 
strbuf_addf(&buf, "refs/heads/test:refs/remotes/%s/test", 
rename.new_name);
if (!valid_fetch_refspec(buf.buf))
die(_("'%s' is not a valid remote name"), rename.new_name);
 
strbuf_reset(&buf);
strbuf_addf(&buf, "remote.%s", rename.old_name);
strbuf_addf(&buf2, "remote.%s", rename.new_name);
if (git_config_rename_section(buf.buf, buf2.buf) < 1)
return error(_("Could not rename config section '%s' to '%s'"),
buf.buf, buf2.buf);
 
strbuf_reset(&buf);
strbuf_addf(&buf, "remote.%s.fetch", rename.new_name);
git_config_set_multivar(buf.buf, NULL, NULL, 1);
strbuf_addf(&old_remote_context, ":refs/remotes/%s/", rename.old_name);
for (i = 0; i < oldremote->fetch.raw_nr; i++) {
char *ptr;
 
strbuf_reset(&buf2);
strbuf_addstr(&buf2, oldremote->fetch.raw[i]);
ptr = strstr(buf2.buf, old_remote_context.buf);
if (ptr) {
refspec_updated = 1;
strbuf_splice(&buf2,
  ptr-buf2.buf + strlen(":refs/remotes/"),
  strlen(rename.old_name), rename.new_name,
  strlen(rename.new_name));
} else
warning(_("Not updating non-default fetch refspec\n"
  "\t%s\n"
  "\tPlease update the configuration manually 
if necessary."),
buf2.buf);
 
git_config_set_multivar(buf.buf, buf2.buf, "^$", 0);
}
 
read_branches();
for (i = 0; i < branch_list.nr; i++) {
struct string_list_item *item = branch_list.items + i;
struct branch_info *info = item->util;
if (info->remote_name && !strcmp(info-

Re: [PATCH v2 1/2] sequencer: handle errors in read_author_ident()

2018-08-01 Thread Phillip Wood

Hi Eric

Thanks for taking a look at this

On 31/07/18 21:47, Eric Sunshine wrote:

On Tue, Jul 31, 2018 at 7:15 AM Phillip Wood  wrote:

The calling code treated NULL as a valid return value, so fix this by
returning and integer and passing in a parameter to receive the author.


It might be difficult for future readers (those who didn't follow the
discussion) to understand how/why NULL is not sufficient to signal an
error. Perhaps incorporating the explanation from your email[1] which
discussed that the author name, email, and/or date might change
unexpectedly would be sufficient. This excerpt from [1] might be a
good starting point:

 ... the caller does not treat NULL as an error, so this will
 change the date and potentially the author of the commit
 ... [which] does corrupt the author data compared to its
 expected value.

[1]: 
https://public-inbox.org/git/c80cf729-1bbe-10f5-6837-b074d371b...@talktalk.net/


Signed-off-by: Phillip Wood 
---
diff --git a/sequencer.c b/sequencer.c
@@ -701,57 +701,58 @@ static char *get_author(const char *message)
-static const char *read_author_ident(struct strbuf *buf)
+static int read_author_ident(char **author)


So, the caller is now responsible for freeing the string placed in
'author'. Okay.


  {
-   if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
-   return NULL;
+   if (strbuf_read_file(&buf, rebase_path_author_script(), 256) <= 0)
+   return -1;


I think you need to strbuf_release(&buf) in this error path since
strbuf_read_file() doesn't guarantee that the strbuf hasn't been
partially populated when it returns an error. (That is, this is
leaking.)


Good point, I'll fix it


 /* dequote values and construct ident line in-place */


Ugh, this comment should have been adjusted in my series. A minor
matter, though, which can be tweaked later.


 /* validate date since fmt_ident() will die() on bad value */
 if (parse_date(val[2], &out)){
-   warning(_("invalid date format '%s' in '%s'"),
+   error(_("invalid date format '%s' in '%s'"),
 val[2], rebase_path_author_script());
 strbuf_release(&out);
-   return NULL;
+   strbuf_release(&buf);
+   return -1;


You were careful to print the error, which references a value from
'buf', before destroying 'buf'. Good.

(A simplifying alternative would have been to not print the actual
value and instead say generally that "the date" was bad. Not a big
deal.)


 }
-   strbuf_swap(buf, &out);
-   strbuf_release(&out);
-   return buf->buf;
+   *author = strbuf_detach(&out, NULL);


And, 'author' is only assigned when 0 is returned, so the caller only
has to free(author) upon success. Fine.


+   strbuf_release(&buf);
+   return 0;
  }

  static const char staged_changes_advice[] =
@@ -794,12 +795,14 @@ static int run_git_commit(const char *defmsg, struct 
replay_opts *opts,
-   struct strbuf msg = STRBUF_INIT, script = STRBUF_INIT;
-   const char *author = is_rebase_i(opts) ?
-   read_author_ident(&script) : NULL;
+   struct strbuf msg = STRBUF_INIT;
+   char *author = NULL;
 struct object_id root_commit, *cache_tree_oid;
 int res = 0;

+   if (is_rebase_i(opts) && read_author_ident(&author))
+   return -1;


Logic looks correct, and it's nice to see that you went with 'return
-1' rather than die(), especially since the caller of run_git_commit()
is already able to handle -1.


Yes, it reschedules the pick so the user has a chance to fix the 
author-script and then run 'git rebase --continue'


Best Wishes

Phillip



Re: [PATCH 1/2] Document git config getter return value.

2018-08-01 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 1:37 PM Han-Wen Nienhuys  wrote:
> diff --git a/config.h b/config.h
> @@ -178,10 +178,16 @@ struct config_set {
> +/*
> + * The int return values in the functions is 1 if not found, 0 if found, 
> leaving
> + * the found value in teh 'dest' pointer.
> + */

"teh"?

Instead of talking about "the int return values", simpler would be to
say "returns 1 if ...; else 0". Not worth a re-roll, though.

> +extern int git_configset_add_file(struct config_set *cs, const char 
> *filename);
> +extern int git_configset_get_value(struct config_set *cs, const char *key, 
> const char **dest);


Re: Is detecting endianness at compile-time unworkable?

2018-08-01 Thread Ævar Arnfjörð Bjarmason


On Tue, Jul 31 2018, Michael Felt wrote:

> For AIX: again - the determination is simple. If _AIX is set to 1 then
> use BigEndian, or, use:
> michael@x071:[/home/michael]uname
> AIX
> i.e., something like:
> $(uname) == "AIX" && BigEndian=1

In lieu of some "let's test this with a compile-test" solution, that
seems like a viable way forward for now. Can you please test the merge
request I have at
https://github.com/cr-marcstevens/sha1collisiondetection/pull/45 by
manually applying that change to the relevant file in sha1dc/ and
building/testing git on AIX?

It should work, but as noted in the MR please test it so we can make
sure, and then (if you have a GitHub account) comment on the MR saying
it works for you.


Re: Is detecting endianness at compile-time unworkable?

2018-08-01 Thread Ævar Arnfjörð Bjarmason


On Mon, Jul 30 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> And, as an aside, the reason we can't easily make it better ourselves is
>> because the build process for git.git doesn't have a facility to run
>> code to detect this type of stuff (the configure script is always
>> optional). So we can't just run this test ourselves.
>
> It won't help those who cross-compile anyway.

I was being unclear, what I mean by having "a hard dependency on some
way of doing checks via compiled code in our build system" is that we
would do some equivalent of this:

diff --git a/Makefile b/Makefile
index 08e5c54549..b021b6e1b6 100644
--- a/Makefile
+++ b/Makefile
@@ -1107,7 +1107,7 @@ DC_SHA1_SUBMODULE = auto
 endif

 include config.mak.uname
--include config.mak.autogen
+include config.mak.autogen
 -include config.mak

 ifdef DEVELOPER

And document that in order to build git you needed to do './configure &&
make' instead of just 'make', and we'd error out by default if
config.mak.autogen wasn't there.

Now obviously that would need some sort of escape hatch. I.e. you could
invoke 'make' like this:

make CONFIGURE_MAK_AUTOGEN_FILE=some-file

That's how you would do cross-compilation, you'd arrange to run
'./configure' on some system, save the output, and ferry over this
'some-file' to where you're building git, or you would manually prepare
a file that had all the settings we'd expect to have been set already
set.

Now, whether we do this with autoconf or not is just an implementation
detail. Looking at this some more I think since we already use the
$(shell) construct we could just have some 'configure-detect' Makefile
target which would compile various test programs, and we'd use their
output to set various settings, a sort of home-grown autoconf (because
people are bound to have objections to a hard dependency on it...).

> I thought we declared "we make a reasonable effort to guess the target
> endianness from the system header by inspecting usual macros, but will
> not aim to cover every system on the planet---instead there is a knob
> to tweak it for those on exotic platforms" last time we discussed
> this?

Yes, but I think it's worth re-visiting that decision, which was made
with the constraints that we don't have a build system that can do
checks via compiled code, so we need this hack in the first place
instead of things Just Working.

And as I pointed out in the linked E-Mail this also impacts us in other
ways, and will cause other issues in the future, so it's worth thinking
about if this is the right path to take.