Re: Duplicate safecrlf warning for racily clean index entry

2018-02-20 Thread Torsten Bögershausen
On Tue, Feb 20, 2018 at 08:42:26AM -0500, Matt McCutchen wrote:
> I noticed that if a file subject to a safecrlf warning is added to the
> index in the same second that it is created, resulting in a "racily
> clean" index entry, then a subsequent "git add" command prints another
> safecrlf warning.  I reproduced this on the current "next"
> (499d7c4f91).  The procedure:
> 
> $ git init
> $ git config core.autocrlf true
> $ echo foo >file1 && git add file1 && git add file1
> warning: LF will be replaced by CRLF in file1.
> The file will have its original line endings in your working directory.
> warning: LF will be replaced by CRLF in file1.
> The file will have its original line endings in your working directory.
> $ echo bar >file2 && sleep 1 && git add file2 && git add file2
> warning: LF will be replaced by CRLF in file2.
> The file will have its original line endings in your working directory.
> 
> This came up when I ran the test suite for Braid on Windows
> (https://github.com/cristibalan/braid/issues/77).

I think a .gitattributes file could/should be used. I'll answer
there seperatly.

> 
> The phenomenon actually seems to be more general: touching the file
> causes the next "git add" to print a safecrlf warning, suggesting that
> the warning occurs whenever the index entry is dirty.  One could argue
> that a new warning is reasonable after touching the file, but it seems
> clear that "racy cleanliness" is an implementation detail that
> shouldn't have user-visible nondeterministic effects.
> 
> In either case, if "git update-index --refresh" (or "git status") is
> run before "git add", then "git add" does not print the warning.  On
> the other hand, if line endings in the working tree file are changed,
> then git shows the file as having an unstaged change, even though the
> content that would be added to the index after CRLF conversion is
> identical.  So it seems that git remembers the pre-conversion file
> content and uses it for "git update-index --refresh" and would just
> need to use it for "git add" as well.
> 
> Thoughts about the proposed change?  Does someone want to work on it or
> give me a pointer to where to get started?

Good analyzes, thanks for that.

I don't hava a pointer, but what should happen ?
2 warnings for 2 "git add" should be OK, I think.

1 warning is part of the optimization, that Git does to handle
hundrets and thousands of files efficciently.

Is the 1/2 warning  real live problem  ?

> 
> Thanks,
> Matt


[PATCH v2 2/2] ref-filter: get rid of goto

2018-02-20 Thread Olga Telezhnaya
Get rid of goto command in ref-filter for better readability.

Signed-off-by: Olga Telezhnaia 
Mentored-by: Christian Couder 
Mentored by: Jeff King 
---
 ref-filter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 83ffd84affe52..35359818a1ebb 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item *ref)
for (i = 0; i < used_atom_cnt; i++) {
struct atom_value *v = >value[i];
if (v->s == NULL)
-   goto need_obj;
+   break;
}
-   return;
+   if (i >= used_atom_cnt)
+   return;
 
- need_obj:
get_object(ref, >objectname, 0, );
 
/*

--
https://github.com/git/git/pull/460


[PATCH v2 1/2] ref-filter: get rid of duplicate code

2018-02-20 Thread Olga Telezhnaya
Make one function from 2 duplicate pieces and invoke it twice.

Signed-off-by: Olga Telezhnaia 
Mentored-by: Christian Couder 
Mentored by: Jeff King 
---
 ref-filter.c | 45 +
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f9e25aea7a97e..83ffd84affe52 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1354,15 +1354,31 @@ static const char *get_refname(struct used_atom *atom, 
struct ref_array_item *re
return show_ref(>u.refname, ref->refname);
 }
 
+static void get_object(struct ref_array_item *ref, const struct object_id *oid,
+  int deref, struct object **obj)
+{
+   int eaten;
+   unsigned long size;
+   void *buf = get_obj(oid, obj, , );
+   if (!buf)
+   die(_("missing object %s for %s"),
+   oid_to_hex(oid), ref->refname);
+   if (!*obj)
+   die(_("parse_object_buffer failed on %s for %s"),
+   oid_to_hex(oid), ref->refname);
+
+   grab_values(ref->value, deref, *obj, buf, size);
+   if (!eaten)
+   free(buf);
+}
+
 /*
  * Parse the object referred by ref, and grab needed value.
  */
 static void populate_value(struct ref_array_item *ref)
 {
-   void *buf;
struct object *obj;
-   int eaten, i;
-   unsigned long size;
+   int i;
const struct object_id *tagged;
 
ref->value = xcalloc(used_atom_cnt, sizeof(struct atom_value));
@@ -1483,17 +1499,7 @@ static void populate_value(struct ref_array_item *ref)
return;
 
  need_obj:
-   buf = get_obj(>objectname, , , );
-   if (!buf)
-   die(_("missing object %s for %s"),
-   oid_to_hex(>objectname), ref->refname);
-   if (!obj)
-   die(_("parse_object_buffer failed on %s for %s"),
-   oid_to_hex(>objectname), ref->refname);
-
-   grab_values(ref->value, 0, obj, buf, size);
-   if (!eaten)
-   free(buf);
+   get_object(ref, >objectname, 0, );
 
/*
 * If there is no atom that wants to know about tagged
@@ -1514,16 +1520,7 @@ static void populate_value(struct ref_array_item *ref)
 * is not consistent with what deref_tag() does
 * which peels the onion to the core.
 */
-   buf = get_obj(tagged, , , );
-   if (!buf)
-   die(_("missing object %s for %s"),
-   oid_to_hex(tagged), ref->refname);
-   if (!obj)
-   die(_("parse_object_buffer failed on %s for %s"),
-   oid_to_hex(tagged), ref->refname);
-   grab_values(ref->value, 1, obj, buf, size);
-   if (!eaten)
-   free(buf);
+   get_object(ref, tagged, 1, );
 }
 
 /*

--
https://github.com/git/git/pull/460


[PATCH 2/2] ref-filter: get rid of goto

2018-02-20 Thread Olga Telezhnaya
Get rid of goto command in ref-filter for better readability.

Signed-off-by: Olga Telezhnaia 
Mentored-by: Christian Couder 
Mentored by: Jeff King 
---
 ref-filter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 83ffd84affe52..28df6e21fb996 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item *ref)
for (i = 0; i < used_atom_cnt; i++) {
struct atom_value *v = >value[i];
if (v->s == NULL)
-   goto need_obj;
+   break;
}
-   return;
+   if (used_atom_cnt <= i)
+   return;
 
- need_obj:
get_object(ref, >objectname, 0, );
 
/*

--
https://github.com/git/git/pull/460


[PATCH 1/2] ref-filter: get rid of duplicate code

2018-02-20 Thread Olga Telezhnaya
Make one function from 2 duplicate pieces and invoke it twice.

Signed-off-by: Olga Telezhnaia 
Mentored-by: Christian Couder 
Mentored by: Jeff King 
---
 ref-filter.c | 45 +
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f9e25aea7a97e..83ffd84affe52 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1354,15 +1354,31 @@ static const char *get_refname(struct used_atom *atom, 
struct ref_array_item *re
return show_ref(>u.refname, ref->refname);
 }
 
+static void get_object(struct ref_array_item *ref, const struct object_id *oid,
+  int deref, struct object **obj)
+{
+   int eaten;
+   unsigned long size;
+   void *buf = get_obj(oid, obj, , );
+   if (!buf)
+   die(_("missing object %s for %s"),
+   oid_to_hex(oid), ref->refname);
+   if (!*obj)
+   die(_("parse_object_buffer failed on %s for %s"),
+   oid_to_hex(oid), ref->refname);
+
+   grab_values(ref->value, deref, *obj, buf, size);
+   if (!eaten)
+   free(buf);
+}
+
 /*
  * Parse the object referred by ref, and grab needed value.
  */
 static void populate_value(struct ref_array_item *ref)
 {
-   void *buf;
struct object *obj;
-   int eaten, i;
-   unsigned long size;
+   int i;
const struct object_id *tagged;
 
ref->value = xcalloc(used_atom_cnt, sizeof(struct atom_value));
@@ -1483,17 +1499,7 @@ static void populate_value(struct ref_array_item *ref)
return;
 
  need_obj:
-   buf = get_obj(>objectname, , , );
-   if (!buf)
-   die(_("missing object %s for %s"),
-   oid_to_hex(>objectname), ref->refname);
-   if (!obj)
-   die(_("parse_object_buffer failed on %s for %s"),
-   oid_to_hex(>objectname), ref->refname);
-
-   grab_values(ref->value, 0, obj, buf, size);
-   if (!eaten)
-   free(buf);
+   get_object(ref, >objectname, 0, );
 
/*
 * If there is no atom that wants to know about tagged
@@ -1514,16 +1520,7 @@ static void populate_value(struct ref_array_item *ref)
 * is not consistent with what deref_tag() does
 * which peels the onion to the core.
 */
-   buf = get_obj(tagged, , , );
-   if (!buf)
-   die(_("missing object %s for %s"),
-   oid_to_hex(tagged), ref->refname);
-   if (!obj)
-   die(_("parse_object_buffer failed on %s for %s"),
-   oid_to_hex(tagged), ref->refname);
-   grab_values(ref->value, 1, obj, buf, size);
-   if (!eaten)
-   free(buf);
+   get_object(ref, tagged, 1, );
 }
 
 /*

--
https://github.com/git/git/pull/460


[PATCH 12/27] sha1_file: add repository argument to link_alt_odb_entries

2018-02-20 Thread Stefan Beller
See previous patch for explanation.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 6edcc3d3a7..a0e9116318 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -467,8 +467,12 @@ static const char *parse_alt_odb_entry(const char *string,
return end;
 }
 
-static void link_alt_odb_entries(const char *alt, int sep,
-const char *relative_base, int depth)
+#define link_alt_odb_entries(r, a, s, rb, d) \
+   link_alt_odb_entries_##r(a, s, rb, d)
+static void link_alt_odb_entries_the_repository(const char *alt,
+   int sep,
+   const char *relative_base,
+   int depth)
 {
struct strbuf objdirbuf = STRBUF_INIT;
struct strbuf entry = STRBUF_INIT;
@@ -511,7 +515,7 @@ static void read_info_alternates_the_repository(const char 
*relative_base,
return;
}
 
-   link_alt_odb_entries(buf.buf, '\n', relative_base, depth);
+   link_alt_odb_entries(the_repository, buf.buf, '\n', relative_base, 
depth);
strbuf_release();
free(path);
 }
@@ -565,7 +569,8 @@ void add_to_alternates_file(const char *reference)
if (commit_lock_file())
die_errno("unable to move new alternates file into 
place");
if (the_repository->objects.alt_odb_tail)
-   link_alt_odb_entries(reference, '\n', NULL, 0);
+   link_alt_odb_entries(the_repository, reference,
+'\n', NULL, 0);
}
free(alts);
 }
@@ -578,7 +583,8 @@ void add_to_alternates_memory(const char *reference)
 */
prepare_alt_odb();
 
-   link_alt_odb_entries(reference, '\n', NULL, 0);
+   link_alt_odb_entries(the_repository, reference,
+'\n', NULL, 0);
 }
 
 /*
@@ -681,7 +687,8 @@ void prepare_alt_odb(void)
 
the_repository->objects.alt_odb_tail =
_repository->objects.alt_odb_list;
-   link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
+   link_alt_odb_entries(the_repository, alt,
+PATH_SEP, NULL, 0);
 
read_info_alternates(the_repository, get_object_directory(), 0);
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 27/27] sha1_file: allow sha1_loose_object_info to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
From: Jonathan Nieder 

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index a20f811a07..36282acb1a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1156,10 +1156,9 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
return parse_sha1_header_extended(hdr, , 0);
 }
 
-#define sha1_loose_object_info(r, s, o, f) sha1_loose_object_info_##r(s, o, f)
-static int sha1_loose_object_info_the_repository(const unsigned char *sha1,
-struct object_info *oi,
-int flags)
+static int sha1_loose_object_info(struct repository *r,
+ const unsigned char *sha1,
+ struct object_info *oi, int flags)
 {
int status = 0;
unsigned long mapsize;
@@ -1183,14 +1182,14 @@ static int sha1_loose_object_info_the_repository(const 
unsigned char *sha1,
if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
-   if (stat_sha1_file(the_repository, sha1, , ) < 0)
+   if (stat_sha1_file(r, sha1, , ) < 0)
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = st.st_size;
return 0;
}
 
-   map = map_sha1_file(the_repository, sha1, );
+   map = map_sha1_file(r, sha1, );
if (!map)
return -1;
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 06/27] object-store: close all packs upon clearing the object store

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 builtin/am.c   | 2 +-
 builtin/clone.c| 2 +-
 builtin/fetch.c| 2 +-
 builtin/merge.c| 2 +-
 builtin/receive-pack.c | 2 +-
 object.c   | 6 ++
 packfile.c | 4 ++--
 packfile.h | 2 +-
 8 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 5bdd2d7578..4762a702e3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1859,7 +1859,7 @@ static void am_run(struct am_state *state, int resume)
 */
if (!state->rebasing) {
am_destroy(state);
-   close_all_packs();
+   close_all_packs(_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
}
 }
diff --git a/builtin/clone.c b/builtin/clone.c
index 101c27a593..13cfaa6488 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1217,7 +1217,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_disconnect(transport);
 
if (option_dissociate) {
-   close_all_packs();
+   close_all_packs(_repository->objects);
dissociate_from_references();
}
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8ee998ea2e..4d72efca78 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1478,7 +1478,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 
string_list_clear(, 0);
 
-   close_all_packs();
+   close_all_packs(_repository->objects);
 
argv_array_pushl(_gc_auto, "gc", "--auto", NULL);
if (verbosity < 0)
diff --git a/builtin/merge.c b/builtin/merge.c
index 30264cfd7c..907ae44ab5 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -411,7 +411,7 @@ static void finish(struct commit *head_commit,
 * We ignore errors in 'gc --auto', since the
 * user should see them.
 */
-   close_all_packs();
+   close_all_packs(_repository->objects);
run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
}
}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b2eac79a6e..954fc72c7c 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2027,7 +2027,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
proc.git_cmd = 1;
proc.argv = argv_gc_auto;
 
-   close_all_packs();
+   close_all_packs(_repository->objects);
if (!start_command()) {
if (use_sideband)
copy_to_sideband(proc.err, -1, NULL);
diff --git a/object.c b/object.c
index c76b62572a..34daaf37b3 100644
--- a/object.c
+++ b/object.c
@@ -4,6 +4,7 @@
 #include "tree.h"
 #include "commit.h"
 #include "tag.h"
+#include "packfile.h"
 
 static struct object **obj_hash;
 static int nr_objs, obj_hash_size;
@@ -469,8 +470,5 @@ void raw_object_store_clear(struct raw_object_store *o)
 
while (!list_empty(>packed_git_mru))
list_del(>packed_git_mru);
-   /*
-* TODO: call close_all_packs once migrated to
-* take an object store argument
-*/
+   close_all_packs(o);
 }
diff --git a/packfile.c b/packfile.c
index d41e4c83d0..511a2b0cdf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -311,11 +311,11 @@ static void close_pack(struct packed_git *p)
close_pack_index(p);
 }
 
-void close_all_packs(void)
+void close_all_packs(struct raw_object_store *o)
 {
struct packed_git *p;
 
-   for (p = the_repository->objects.packed_git; p; p = p->next)
+   for (p = o->packed_git; p; p = p->next)
if (p->do_not_close)
die("BUG: want to close pack marked 'do-not-close'");
else
diff --git a/packfile.h b/packfile.h
index a7fca598d6..6a2c57045c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -63,7 +63,7 @@ extern void close_pack_index(struct packed_git *);
 
 extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
off_t, unsigned long *);
 extern void close_pack_windows(struct packed_git *);
-extern void close_all_packs(void);
+extern void close_all_packs(struct raw_object_store *o);
 extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
int local);
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 17/27] sha1_file: add repository argument to stat_sha1_file

2018-02-20 Thread Stefan Beller
Add a repository argument to allow the stat_sha1_file caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 811ed9e53f..ed297773e5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -874,8 +874,9 @@ int git_open_cloexec(const char *name, int flags)
  * Note that it may point to static storage and is only valid until another
  * call to sha1_file_name(), etc.
  */
-static int stat_sha1_file(const unsigned char *sha1, struct stat *st,
- const char **path)
+#define stat_sha1_file(r, s, st, p) stat_sha1_file_##r(s, st, p)
+static int stat_sha1_file_the_repository(const unsigned char *sha1,
+struct stat *st, const char **path)
 {
struct alternate_object_database *alt;
static struct strbuf buf = STRBUF_INIT;
@@ -1181,7 +1182,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
-   if (stat_sha1_file(sha1, , ) < 0)
+   if (stat_sha1_file(the_repository, sha1, , ) < 0)
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = st.st_size;
@@ -1395,7 +1396,7 @@ void *read_sha1_file_extended(const unsigned char *sha1,
die("replacement %s not found for %s",
sha1_to_hex(repl), sha1_to_hex(sha1));
 
-   if (!stat_sha1_file(repl, , ))
+   if (!stat_sha1_file(the_repository, repl, , ))
die("loose object %s (stored in %s) is corrupt",
sha1_to_hex(repl), path);
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 03/27] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-20 Thread Stefan Beller
In a process with multiple repositories open, alternates should be
associated to a single repository and not shared globally. Move
alt_odb_list and alt_odb_tail into the_repository and adjust callers
to reflect this.

Now that the alternative object data base is per repository, we're
leaking its memory upon freeing a repository. The next patch plugs
this hole.

No functional change intended.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/fsck.c |  4 +++-
 object-store.h |  9 ++---
 packfile.c |  3 ++-
 sha1_file.c| 25 -
 sha1_name.c|  3 ++-
 5 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 7a8a679d4f..908e4f092a 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "repository.h"
 #include "config.h"
 #include "commit.h"
 #include "tree.h"
@@ -716,7 +717,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
fsck_object_dir(get_object_directory());
 
prepare_alt_odb();
-   for (alt = alt_odb_list; alt; alt = alt->next)
+   for (alt = the_repository->objects.alt_odb_list;
+   alt; alt = alt->next)
fsck_object_dir(alt->path);
 
if (check_full) {
diff --git a/object-store.h b/object-store.h
index 5678aa1136..e78eea1dde 100644
--- a/object-store.h
+++ b/object-store.h
@@ -1,7 +1,7 @@
 #ifndef OBJECT_STORE_H
 #define OBJECT_STORE_H
 
-extern struct alternate_object_database {
+struct alternate_object_database {
struct alternate_object_database *next;
 
/* see alt_scratch_buf() */
@@ -19,7 +19,7 @@ extern struct alternate_object_database {
struct oid_array loose_objects_cache;
 
char path[FLEX_ARRAY];
-} *alt_odb_list;
+};
 void prepare_alt_odb(void);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
@@ -58,8 +58,11 @@ struct raw_object_store {
 * Cannot be NULL after initialization.
 */
char *objectdir;
+
+   struct alternate_object_database *alt_odb_list;
+   struct alternate_object_database **alt_odb_tail;
 };
-#define RAW_OBJECT_STORE_INIT { NULL }
+#define RAW_OBJECT_STORE_INIT { NULL, NULL, NULL }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index 7dbe8739d1..216ea836ee 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "list.h"
 #include "pack.h"
+#include "repository.h"
 #include "dir.h"
 #include "mergesort.h"
 #include "packfile.h"
@@ -891,7 +892,7 @@ void prepare_packed_git(void)
return;
prepare_packed_git_one(get_object_directory(), 1);
prepare_alt_odb();
-   for (alt = alt_odb_list; alt; alt = alt->next)
+   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
prepare_packed_git_mru();
diff --git a/sha1_file.c b/sha1_file.c
index 831d9e7343..1348dce68f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -22,6 +22,7 @@
 #include "pack-revindex.h"
 #include "sha1-lookup.h"
 #include "bulk-checkin.h"
+#include "repository.h"
 #include "streaming.h"
 #include "dir.h"
 #include "list.h"
@@ -342,9 +343,6 @@ static const char *alt_sha1_path(struct 
alternate_object_database *alt,
return buf->buf;
 }
 
-struct alternate_object_database *alt_odb_list;
-static struct alternate_object_database **alt_odb_tail;
-
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
@@ -364,7 +362,7 @@ static int alt_odb_usable(struct strbuf *path, const char 
*normalized_objdir)
 * Prevent the common mistake of listing the same
 * thing twice, or object directory itself.
 */
-   for (alt = alt_odb_list; alt; alt = alt->next) {
+   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
if (!fspathcmp(path->buf, alt->path))
return 0;
}
@@ -424,8 +422,8 @@ static int link_alt_odb_entry(const char *entry, const char 
*relative_base,
ent = alloc_alt_odb(pathbuf.buf);
 
/* add the alternate entry */
-   *alt_odb_tail = ent;
-   alt_odb_tail = &(ent->next);
+   *the_repository->objects.alt_odb_tail = ent;
+   the_repository->objects.alt_odb_tail = &(ent->next);
ent->next = NULL;
 
/* recursively add alternates */
@@ -559,7 +557,7 @@ void add_to_alternates_file(const char *reference)
fprintf_or_die(out, "%s\n", reference);
if (commit_lock_file())
die_errno("unable to move new alternates file into 
place");
-   if (alt_odb_tail)
+   if 

[PATCH 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-20 Thread Stefan Beller
Add a repository argument to allow sha1_file_name callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

While at it, move the declaration to object-store.h, where it should
be easier to find.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 cache.h|  6 --
 http-walker.c  |  3 ++-
 http.c |  5 ++---
 object-store.h |  7 +++
 sha1_file.c| 10 +-
 5 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/cache.h b/cache.h
index 3fd9dc52cd..e641c23b0a 100644
--- a/cache.h
+++ b/cache.h
@@ -961,12 +961,6 @@ extern void check_repository_format(void);
 #define DATA_CHANGED0x0020
 #define TYPE_CHANGED0x0040
 
-/*
- * Put in `buf` the name of the file in the local object database that
- * would be used to store a loose object with the specified sha1.
- */
-extern void sha1_file_name(struct strbuf *buf, const unsigned char *sha1);
-
 /*
  * Return an abbreviated sha1 unique within this repository's object database.
  * The result will be at least `len` characters long, and will be NUL
diff --git a/http-walker.c b/http-walker.c
index 8bb5d991bb..76bfdbca76 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "repository.h"
 #include "commit.h"
 #include "walker.h"
 #include "http.h"
@@ -546,7 +547,7 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
-   sha1_file_name(, req->sha1);
+   sha1_file_name(the_repository, , req->sha1);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release();
}
diff --git a/http.c b/http.c
index a4a9e583c7..afe2707e86 100644
--- a/http.c
+++ b/http.c
@@ -2247,7 +2247,7 @@ struct http_object_request *new_http_object_request(const 
char *base_url,
hashcpy(freq->sha1, sha1);
freq->localfile = -1;
 
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
snprintf(freq->tmpfile, sizeof(freq->tmpfile),
 "%s.temp", filename.buf);
 
@@ -2396,8 +2396,7 @@ int finish_http_object_request(struct http_object_request 
*freq)
unlink_or_warn(freq->tmpfile);
return -1;
}
-
-   sha1_file_name(, freq->sha1);
+   sha1_file_name(the_repository, , freq->sha1);
freq->rename = finalize_object_file(freq->tmpfile, filename.buf);
strbuf_release();
 
diff --git a/object-store.h b/object-store.h
index 873b341774..4f5060e768 100644
--- a/object-store.h
+++ b/object-store.h
@@ -123,4 +123,11 @@ struct raw_object_store {
 
 void raw_object_store_clear(struct raw_object_store *o);
 
+/*
+ * Put in `buf` the name of the file in the local object database that
+ * would be used to store a loose object with the specified sha1.
+ */
+#define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
+void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
+
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index 9cf3fffbeb..811ed9e53f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -323,7 +323,7 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-void sha1_file_name(struct strbuf *buf, const unsigned char *sha1)
+void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1)
 {
strbuf_addstr(buf, get_object_directory());
strbuf_addch(buf, '/');
@@ -720,7 +720,7 @@ static int check_and_freshen_local(const unsigned char 
*sha1, int freshen)
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
 
return check_and_freshen_file(buf.buf, freshen);
 }
@@ -881,7 +881,7 @@ static int stat_sha1_file(const unsigned char *sha1, struct 
stat *st,
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
*path = buf.buf;
 
if (!lstat(*path, st))
@@ -910,7 +910,7 @@ static int open_sha1_file(const unsigned char *sha1, const 
char **path)
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(, sha1);
+   sha1_file_name(the_repository, , sha1);
*path = buf.buf;
 
fd = git_open(*path);
@@ -1595,7 +1595,7 @@ static int write_loose_object(const unsigned char *sha1, 
char *hdr, int hdrlen,
static struct strbuf filename = STRBUF_INIT;
 
strbuf_reset();
-   

[PATCH 26/27] sha1_file: allow map_sha1_file to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 3 +--
 sha1_file.c| 5 +++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index 146358fe33..5d93677783 100644
--- a/object-store.h
+++ b/object-store.h
@@ -129,7 +129,6 @@ void raw_object_store_clear(struct raw_object_store *o);
  */
 void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1);
 
-#define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
-void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
+void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned 
long *size);
 
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index 522804d9f0..a20f811a07 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -964,9 +964,10 @@ static void *map_sha1_file_1(struct repository *r, const 
char *path,
return map;
 }
 
-void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size)
+void *map_sha1_file(struct repository *r,
+   const unsigned char *sha1, unsigned long *size)
 {
-   return map_sha1_file_1(the_repository, NULL, sha1, size);
+   return map_sha1_file_1(r, NULL, sha1, size);
 }
 
 static int unpack_sha1_short_header(git_zstream *stream,
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 07/27] pack: move prepare_packed_git_run_once to object store

2018-02-20 Thread Stefan Beller
Each repository's object store can be initialized independently, so
they must not share a run_once variable.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 8 +++-
 packfile.c | 7 +++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 1de9e07102..6cecba3951 100644
--- a/object-store.h
+++ b/object-store.h
@@ -92,6 +92,12 @@ struct raw_object_store {
 
struct alternate_object_database *alt_odb_list;
struct alternate_object_database **alt_odb_tail;
+
+   /*
+* Whether packed_git has already been populated with this repository's
+* packs.
+*/
+   unsigned packed_git_initialized : 1;
 };
 
 /*
@@ -101,7 +107,7 @@ struct raw_object_store {
  * that macro on member variables. Use NULL instead as that is defined
  * and accepted, deferring the real init to prepare_packed_git_mru(). */
 #define __MRU_INIT { NULL, NULL }
-#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }
+#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index 511a2b0cdf..a8a2e7fe43 100644
--- a/packfile.c
+++ b/packfile.c
@@ -884,12 +884,11 @@ static void prepare_packed_git_mru(void)
list_add_tail(>mru, _repository->objects.packed_git_mru);
 }
 
-static int prepare_packed_git_run_once = 0;
 void prepare_packed_git(void)
 {
struct alternate_object_database *alt;
 
-   if (prepare_packed_git_run_once)
+   if (the_repository->objects.packed_git_initialized)
return;
prepare_packed_git_one(get_object_directory(), 1);
prepare_alt_odb();
@@ -897,13 +896,13 @@ void prepare_packed_git(void)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
prepare_packed_git_mru();
-   prepare_packed_git_run_once = 1;
+   the_repository->objects.packed_git_initialized = 1;
 }
 
 void reprepare_packed_git(void)
 {
approximate_object_count_valid = 0;
-   prepare_packed_git_run_once = 0;
+   the_repository->objects.packed_git_initialized = 0;
prepare_packed_git();
 }
 
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 21/27] sha1_file: add repository argument to sha1_loose_object_info

2018-02-20 Thread Stefan Beller
Add a repository argument to allow the sha1_loose_object_info caller
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 0d3ff99cf2..d1f77a4c80 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1159,9 +1159,10 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
return parse_sha1_header_extended(hdr, , 0);
 }
 
-static int sha1_loose_object_info(const unsigned char *sha1,
- struct object_info *oi,
- int flags)
+#define sha1_loose_object_info(r, s, o, f) sha1_loose_object_info_##r(s, o, f)
+static int sha1_loose_object_info_the_repository(const unsigned char *sha1,
+struct object_info *oi,
+int flags)
 {
int status = 0;
unsigned long mapsize;
@@ -1280,7 +1281,7 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
break;
 
/* Most likely it's a loose object. */
-   if (!sha1_loose_object_info(real, oi, flags))
+   if (!sha1_loose_object_info(the_repository, real, oi, flags))
return 0;
 
/* Not a loose object; someone else may have just packed it. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 23/27] sha1_file: allow stat_sha1_file to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 8fad523d85..6879af8993 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -874,23 +874,22 @@ int git_open_cloexec(const char *name, int flags)
  * Note that it may point to static storage and is only valid until another
  * call to sha1_file_name(), etc.
  */
-#define stat_sha1_file(r, s, st, p) stat_sha1_file_##r(s, st, p)
-static int stat_sha1_file_the_repository(const unsigned char *sha1,
-struct stat *st, const char **path)
+static int stat_sha1_file(struct repository *r, const unsigned char *sha1,
+ struct stat *st, const char **path)
 {
struct alternate_object_database *alt;
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(the_repository, , sha1);
+   sha1_file_name(r, , sha1);
*path = buf.buf;
 
if (!lstat(*path, st))
return 0;
 
-   prepare_alt_odb(the_repository);
+   prepare_alt_odb(r);
errno = ENOENT;
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   for (alt = r->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
if (!lstat(*path, st))
return 0;
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 20/27] sha1_file: add repository argument to map_sha1_file

2018-02-20 Thread Stefan Beller
Add a repository argument to allow map_sha1_file callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

While at it, move the declaration to object-store.h, where it should
be easier to find.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 cache.h| 1 -
 object-store.h | 3 +++
 sha1_file.c| 4 ++--
 streaming.c| 5 -
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index e641c23b0a..d338b65d75 100644
--- a/cache.h
+++ b/cache.h
@@ -1242,7 +1242,6 @@ extern int pretend_sha1_file(void *, unsigned long, enum 
object_type, unsigned c
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
 extern int git_open_cloexec(const char *name, int flags);
 #define git_open(name) git_open_cloexec(name, O_RDONLY)
-extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 
diff --git a/object-store.h b/object-store.h
index 4f5060e768..a37d74a0c5 100644
--- a/object-store.h
+++ b/object-store.h
@@ -130,4 +130,7 @@ void raw_object_store_clear(struct raw_object_store *o);
 #define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
 void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
 
+#define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
+void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
+
 #endif /* OBJECT_STORE_H */
diff --git a/sha1_file.c b/sha1_file.c
index 7fa9ed2f25..0d3ff99cf2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -968,7 +968,7 @@ static void *map_sha1_file_1_the_repository(const char 
*path,
return map;
 }
 
-void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
+void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size)
 {
return map_sha1_file_1(the_repository, NULL, sha1, size);
 }
@@ -1192,7 +1192,7 @@ static int sha1_loose_object_info(const unsigned char 
*sha1,
return 0;
}
 
-   map = map_sha1_file(sha1, );
+   map = map_sha1_file(the_repository, sha1, );
if (!map)
return -1;
 
diff --git a/streaming.c b/streaming.c
index 5892b50bd8..22d27df55e 100644
--- a/streaming.c
+++ b/streaming.c
@@ -3,6 +3,8 @@
  */
 #include "cache.h"
 #include "streaming.h"
+#include "repository.h"
+#include "object-store.h"
 #include "packfile.h"
 
 enum input_source {
@@ -335,7 +337,8 @@ static struct stream_vtbl loose_vtbl = {
 
 static open_method_decl(loose)
 {
-   st->u.loose.mapped = map_sha1_file(sha1, >u.loose.mapsize);
+   st->u.loose.mapped = map_sha1_file(the_repository,
+  sha1, >u.loose.mapsize);
if (!st->u.loose.mapped)
return -1;
if ((unpack_sha1_header(>z,
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 25/27] sha1_file: allow map_sha1_file_1 to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
From: Jonathan Nieder 

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 598acc5410..522804d9f0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -936,10 +936,8 @@ static int open_sha1_file(struct repository *r,
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
  */
-#define map_sha1_file_1(r, p, s, si) map_sha1_file_1_##r(p, s, si)
-static void *map_sha1_file_1_the_repository(const char *path,
-   const unsigned char *sha1,
-   unsigned long *size)
+static void *map_sha1_file_1(struct repository *r, const char *path,
+const unsigned char *sha1, unsigned long *size)
 {
void *map;
int fd;
@@ -947,7 +945,7 @@ static void *map_sha1_file_1_the_repository(const char 
*path,
if (path)
fd = git_open(path);
else
-   fd = open_sha1_file(the_repository, sha1, );
+   fd = open_sha1_file(r, sha1, );
map = NULL;
if (fd >= 0) {
struct stat st;
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 04/27] object-store: free alt_odb_list

2018-02-20 Thread Stefan Beller
Free the memory and reset alt_odb_{list, tail} to NULL.

Signed-off-by: Stefan Beller 
---
 object.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/object.c b/object.c
index 11d904c033..17b1da6d6b 100644
--- a/object.c
+++ b/object.c
@@ -446,7 +446,24 @@ void clear_commit_marks_all(unsigned int flags)
}
 }
 
+static void free_alt_odb(struct alternate_object_database *alt)
+{
+   strbuf_release(>scratch);
+   oid_array_clear(>loose_objects_cache);
+}
+
+static void free_alt_odbs(struct raw_object_store *o)
+{
+   while (o->alt_odb_list) {
+   free_alt_odb(o->alt_odb_list);
+   o->alt_odb_list = o->alt_odb_list->next;
+   }
+}
+
 void raw_object_store_clear(struct raw_object_store *o)
 {
free(o->objectdir);
+
+   free_alt_odbs(o);
+   o->alt_odb_tail = NULL;
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 18/27] sha1_file: add repository argument to open_sha1_file

2018-02-20 Thread Stefan Beller
Add a repository argument to allow the open_sha1_file caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index ed297773e5..68aff90792 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -903,7 +903,9 @@ static int stat_sha1_file_the_repository(const unsigned 
char *sha1,
  * Like stat_sha1_file(), but actually open the object and return the
  * descriptor. See the caveats on the "path" parameter above.
  */
-static int open_sha1_file(const unsigned char *sha1, const char **path)
+#define open_sha1_file(r, s, p) open_sha1_file_##r(s, p)
+static int open_sha1_file_the_repository(const unsigned char *sha1,
+const char **path)
 {
int fd;
struct alternate_object_database *alt;
@@ -946,7 +948,7 @@ static void *map_sha1_file_1(const char *path,
if (path)
fd = git_open(path);
else
-   fd = open_sha1_file(sha1, );
+   fd = open_sha1_file(the_repository, sha1, );
map = NULL;
if (fd >= 0) {
struct stat st;
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 08/27] pack: move approximate object count to object store

2018-02-20 Thread Stefan Beller
The approximate_object_count() function maintains a rough count of
objects in a repository to estimate how long object name abbreviates
should be.  Object names are scoped to a repository and the
appropriate length may differ by repository, so the object count
should not be global.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 10 +-
 packfile.c | 11 +--
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/object-store.h b/object-store.h
index 6cecba3951..bd1e4fcd8b 100644
--- a/object-store.h
+++ b/object-store.h
@@ -93,6 +93,14 @@ struct raw_object_store {
struct alternate_object_database *alt_odb_list;
struct alternate_object_database **alt_odb_tail;
 
+   /*
+* A fast, rough count of the number of objects in the repository.
+* These two fields are not meant for direct access. Use
+* approximate_object_count() instead.
+*/
+   unsigned long approximate_object_count;
+   unsigned approximate_object_count_valid : 1;
+
/*
 * Whether packed_git has already been populated with this repository's
 * packs.
@@ -107,7 +115,7 @@ struct raw_object_store {
  * that macro on member variables. Use NULL instead as that is defined
  * and accepted, deferring the real init to prepare_packed_git_mru(). */
 #define __MRU_INIT { NULL, NULL }
-#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }
+#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0, 0, 0 }
 
 void raw_object_store_clear(struct raw_object_store *o);
 
diff --git a/packfile.c b/packfile.c
index a8a2e7fe43..693bafbc98 100644
--- a/packfile.c
+++ b/packfile.c
@@ -803,8 +803,6 @@ static void prepare_packed_git_one(char *objdir, int local)
strbuf_release();
 }
 
-static int approximate_object_count_valid;
-
 /*
  * Give a fast, rough count of the number of objects in the repository. This
  * ignores loose objects completely. If you have a lot of them, then either
@@ -814,8 +812,8 @@ static int approximate_object_count_valid;
  */
 unsigned long approximate_object_count(void)
 {
-   static unsigned long count;
-   if (!approximate_object_count_valid) {
+   if (!the_repository->objects.approximate_object_count_valid) {
+   unsigned long count;
struct packed_git *p;
 
prepare_packed_git();
@@ -825,8 +823,9 @@ unsigned long approximate_object_count(void)
continue;
count += p->num_objects;
}
+   the_repository->objects.approximate_object_count = count;
}
-   return count;
+   return the_repository->objects.approximate_object_count;
 }
 
 static void *get_next_packed_git(const void *p)
@@ -901,7 +900,7 @@ void prepare_packed_git(void)
 
 void reprepare_packed_git(void)
 {
-   approximate_object_count_valid = 0;
+   the_repository->objects.approximate_object_count_valid = 0;
the_repository->objects.packed_git_initialized = 0;
prepare_packed_git();
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 19/27] sha1_file: add repository argument to map_sha1_file_1

2018-02-20 Thread Stefan Beller
Add a repository argument to allow the map_sha1_file_1 caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 68aff90792..7fa9ed2f25 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -938,9 +938,10 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
  * Map the loose object at "path" if it is not NULL, or the path found by
  * searching for a loose object named "sha1".
  */
-static void *map_sha1_file_1(const char *path,
-const unsigned char *sha1,
-unsigned long *size)
+#define map_sha1_file_1(r, p, s, si) map_sha1_file_1_##r(p, s, si)
+static void *map_sha1_file_1_the_repository(const char *path,
+   const unsigned char *sha1,
+   unsigned long *size)
 {
void *map;
int fd;
@@ -969,7 +970,7 @@ static void *map_sha1_file_1(const char *path,
 
 void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 {
-   return map_sha1_file_1(NULL, sha1, size);
+   return map_sha1_file_1(the_repository, NULL, sha1, size);
 }
 
 static int unpack_sha1_short_header(git_zstream *stream,
@@ -2199,7 +2200,7 @@ int read_loose_object(const char *path,
 
*contents = NULL;
 
-   map = map_sha1_file_1(path, NULL, );
+   map = map_sha1_file_1(the_repository, path, NULL, );
if (!map) {
error_errno("unable to mmap %s", path);
goto out;
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 11/27] sha1_file: add repository argument to read_info_alternates

2018-02-20 Thread Stefan Beller
See previous patch for explanation.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index aaa6276211..6edcc3d3a7 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -389,7 +389,9 @@ static int alt_odb_usable(struct raw_object_store *o,
  * SHA1, an extra slash for the first level indirection, and the
  * terminating NUL.
  */
-static void read_info_alternates(const char * relative_base, int depth);
+#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
+static void read_info_alternates_the_repository(const char *relative_base,
+   int depth);
 #define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
 static int link_alt_odb_entry_the_repository(const char *entry,
const char *relative_base, int depth, const char *normalized_objdir)
@@ -430,7 +432,7 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
ent->next = NULL;
 
/* recursively add alternates */
-   read_info_alternates(pathbuf.buf, depth + 1);
+   read_info_alternates(the_repository, pathbuf.buf, depth + 1);
 
strbuf_release();
return 0;
@@ -496,7 +498,8 @@ static void link_alt_odb_entries(const char *alt, int sep,
strbuf_release();
 }
 
-static void read_info_alternates(const char * relative_base, int depth)
+static void read_info_alternates_the_repository(const char *relative_base,
+   int depth)
 {
char *path;
struct strbuf buf = STRBUF_INIT;
@@ -680,7 +683,7 @@ void prepare_alt_odb(void)
_repository->objects.alt_odb_list;
link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
 
-   read_info_alternates(get_object_directory(), 0);
+   read_info_alternates(the_repository, get_object_directory(), 0);
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-20 Thread Stefan Beller
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.

Patch generated by

 1. Moving the struct packed_git declaration to object-store.h
and packed_git, packed_git_mru globals to struct object_store.

 2. Applying the semantic patch
contrib/coccinelle/refactoring/packed_git.cocci to adjust callers.
This semantic patch is placed in a sub directory of the coccinelle
contrib dir, as this semantic patch is not expected to be of general
usefulness; it is only useful during developing this series and
merging it with other topics in flight. At a later date, just
delete that semantic patch.

 3. Applying line wrapping fixes from "make style" to break the
resulting long lines.

 4. Adding missing #includes of repository.h and object-store.h
where needed.

 5. As the packfiles are now owned by an objectstore/repository, which
is ephemeral unlike globals, we introduce memory leaks. So address
them in raw_object_store_clear().

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/count-objects.c  |  6 --
 builtin/fsck.c   |  7 +--
 builtin/gc.c |  4 +++-
 builtin/index-pack.c |  1 +
 builtin/pack-objects.c   | 20 +++-
 builtin/pack-redundant.c |  6 --
 builtin/receive-pack.c   |  1 +
 cache.h  | 29 
 fast-import.c|  6 --
 http-backend.c   |  6 --
 http-push.c  |  1 +
 http-walker.c|  1 +
 http.c   |  1 +
 object-store.h   | 41 +++-
 object.c |  7 +++
 pack-bitmap.c|  4 +++-
 pack-check.c |  1 +
 pack-revindex.c  |  1 +
 packfile.c   | 39 +++---
 reachable.c  |  1 +
 repository.c |  2 ++
 server-info.c|  6 --
 sha1_name.c  |  5 +++--
 23 files changed, 122 insertions(+), 74 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 33343818c8..9334648dcc 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -7,6 +7,8 @@
 #include "cache.h"
 #include "config.h"
 #include "dir.h"
+#include "repository.h"
+#include "object-store.h"
 #include "builtin.h"
 #include "parse-options.h"
 #include "quote.h"
@@ -120,9 +122,9 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
struct strbuf loose_buf = STRBUF_INIT;
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
-   if (!packed_git)
+   if (!the_repository->objects.packed_git)
prepare_packed_git();
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p; p = p->next) {
if (!p->pack_local)
continue;
if (open_pack_index(p))
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 908e4f092a..2e99e02128 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -2,6 +2,7 @@
 #include "cache.h"
 #include "repository.h"
 #include "config.h"
+#include "object-store.h"
 #include "commit.h"
 #include "tree.h"
 #include "blob.h"
@@ -729,7 +730,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
prepare_packed_git();
 
if (show_progress) {
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p;
+p = p->next) {
if (open_pack_index(p))
continue;
total += p->num_objects;
@@ -737,7 +739,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
 
progress = start_progress(_("Checking 
objects"), total);
}
-   for (p = packed_git; p; p = p->next) {
+   for (p = the_repository->objects.packed_git; p;
+p = p->next) {
/* verify gives error messages itself */
if (verify_pack(p, fsck_obj_buffer,
progress, count))
diff --git a/builtin/gc.c b/builtin/gc.c
index 77fa720bd0..96ff790867 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -11,6 +11,7 @@
  */
 
 #include "builtin.h"
+#include "repository.h"
 #include "config.h"
 #include "tempfile.h"
 #include "lockfile.h"
@@ -18,6 +19,7 @@
 #include "run-command.h"
 

[PATCH 13/27] sha1_file: add repository argument to prepare_alt_odb

2018-02-20 Thread Stefan Beller
See previous patch for explanation.

While at it, move the declaration to object-store.h,
where it should be easier to find.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/fsck.c |  2 +-
 object-store.h |  3 ++-
 packfile.c |  2 +-
 sha1_file.c| 13 +++--
 sha1_name.c|  3 ++-
 5 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2e99e02128..76c94e9b5a 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -717,7 +717,7 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
} else {
fsck_object_dir(get_object_directory());
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list;
alt; alt = alt->next)
fsck_object_dir(alt->path);
diff --git a/object-store.h b/object-store.h
index bd1e4fcd8b..514ad94287 100644
--- a/object-store.h
+++ b/object-store.h
@@ -20,7 +20,8 @@ struct alternate_object_database {
 
char path[FLEX_ARRAY];
 };
-void prepare_alt_odb(void);
+#define prepare_alt_odb(r) prepare_alt_odb_##r()
+void prepare_alt_odb_the_repository(void);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 int foreach_alt_odb(alt_odb_fn, void*);
diff --git a/packfile.c b/packfile.c
index 693bafbc98..ccfe084642 100644
--- a/packfile.c
+++ b/packfile.c
@@ -890,7 +890,7 @@ void prepare_packed_git(void)
if (the_repository->objects.packed_git_initialized)
return;
prepare_packed_git_one(get_object_directory(), 1);
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
prepare_packed_git_one(alt->path, 0);
rearrange_packed_git();
diff --git a/sha1_file.c b/sha1_file.c
index a0e9116318..c26921df4a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -23,6 +23,7 @@
 #include "sha1-lookup.h"
 #include "bulk-checkin.h"
 #include "repository.h"
+#include "object-store.h"
 #include "streaming.h"
 #include "dir.h"
 #include "list.h"
@@ -581,7 +582,7 @@ void add_to_alternates_memory(const char *reference)
 * Make sure alternates are initialized, or else our entry may be
 * overwritten when they are.
 */
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
 
link_alt_odb_entries(the_repository, reference,
 '\n', NULL, 0);
@@ -667,7 +668,7 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
struct alternate_object_database *ent;
int r = 0;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (ent = the_repository->objects.alt_odb_list; ent; ent = ent->next) {
r = fn(ent, cb);
if (r)
@@ -676,7 +677,7 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
return r;
 }
 
-void prepare_alt_odb(void)
+void prepare_alt_odb_the_repository(void)
 {
const char *alt;
 
@@ -730,7 +731,7 @@ static int check_and_freshen_local(const unsigned char 
*sha1, int freshen)
 static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
 {
struct alternate_object_database *alt;
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
const char *path = alt_sha1_path(alt, sha1);
if (check_and_freshen_file(path, freshen))
@@ -889,7 +890,7 @@ static int stat_sha1_file(const unsigned char *sha1, struct 
stat *st,
if (!lstat(*path, st))
return 0;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
errno = ENOENT;
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
@@ -920,7 +921,7 @@ static int open_sha1_file(const unsigned char *sha1, const 
char **path)
return fd;
most_interesting_errno = errno;
 
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);
for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
fd = git_open(*path);
diff --git a/sha1_name.c b/sha1_name.c
index 016c883b5c..3e490ee8f6 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -11,6 +11,7 @@
 #include "sha1-array.h"
 #include "packfile.h"
 #include "repository.h"
+#include "object-store.h"
 
 static int get_oid_oneline(const char *, struct object_id *, struct 
commit_list *);
 
@@ -353,7 +354,7 @@ static int init_object_disambiguation(const char *name, int 
len,
 
ds->len = len;
ds->hex_pfx[len] = '\0';
-   prepare_alt_odb();
+   prepare_alt_odb(the_repository);

[PATCH 14/27] sha1_file: allow link_alt_odb_entries to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Actually this also allows read_info_alternates and link_alt_odb_entry to
handle arbitrary repositories, but link_alt_odb_entries is the most
interesting function in this set of functions, hence the commit subject.

These functions span a strongly connected component in the function
graph, i.e. the recursive call chain might look like

  -> link_alt_odb_entries
-> link_alt_odb_entry
  -> read_info_alternates
-> link_alt_odb_entries

That is why we need to convert all these functions at the same time.

Signed-off-by: Jonathan Nieder 
Signed-off-by: Stefan Beller 
---
 object-store.h |  4 
 sha1_file.c| 36 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/object-store.h b/object-store.h
index 514ad94287..f283fbdba9 100644
--- a/object-store.h
+++ b/object-store.h
@@ -18,6 +18,10 @@ struct alternate_object_database {
char loose_objects_subdir_seen[256];
struct oid_array loose_objects_cache;
 
+   /*
+* Path to the alternative object store. If this is a relative path,
+* it is relative to the current working directory.
+*/
char path[FLEX_ARRAY];
 };
 #define prepare_alt_odb(r) prepare_alt_odb_##r()
diff --git a/sha1_file.c b/sha1_file.c
index c26921df4a..6e5105a252 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -390,11 +390,10 @@ static int alt_odb_usable(struct raw_object_store *o,
  * SHA1, an extra slash for the first level indirection, and the
  * terminating NUL.
  */
-#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
-static void read_info_alternates_the_repository(const char *relative_base,
-   int depth);
-#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
-static int link_alt_odb_entry_the_repository(const char *entry,
+static void read_info_alternates(struct repository *r,
+const char *relative_base,
+int depth);
+static int link_alt_odb_entry(struct repository *r, const char *entry,
const char *relative_base, int depth, const char *normalized_objdir)
 {
struct alternate_object_database *ent;
@@ -420,7 +419,7 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
strbuf_setlen(, pathbuf.len - 1);
 
-   if (!alt_odb_usable(_repository->objects, , 
normalized_objdir)) {
+   if (!alt_odb_usable(>objects, , normalized_objdir)) {
strbuf_release();
return -1;
}
@@ -428,12 +427,12 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
ent = alloc_alt_odb(pathbuf.buf);
 
/* add the alternate entry */
-   *the_repository->objects.alt_odb_tail = ent;
-   the_repository->objects.alt_odb_tail = &(ent->next);
+   *r->objects.alt_odb_tail = ent;
+   r->objects.alt_odb_tail = &(ent->next);
ent->next = NULL;
 
/* recursively add alternates */
-   read_info_alternates(the_repository, pathbuf.buf, depth + 1);
+   read_info_alternates(r, pathbuf.buf, depth + 1);
 
strbuf_release();
return 0;
@@ -468,12 +467,8 @@ static const char *parse_alt_odb_entry(const char *string,
return end;
 }
 
-#define link_alt_odb_entries(r, a, s, rb, d) \
-   link_alt_odb_entries_##r(a, s, rb, d)
-static void link_alt_odb_entries_the_repository(const char *alt,
-   int sep,
-   const char *relative_base,
-   int depth)
+static void link_alt_odb_entries(struct repository *r, const char *alt,
+int sep, const char *relative_base, int depth)
 {
struct strbuf objdirbuf = STRBUF_INIT;
struct strbuf entry = STRBUF_INIT;
@@ -487,7 +482,7 @@ static void link_alt_odb_entries_the_repository(const char 
*alt,
return;
}
 
-   strbuf_add_absolute_path(, get_object_directory());
+   strbuf_add_absolute_path(, r->objects.objectdir);
if (strbuf_normalize_path() < 0)
die("unable to normalize object directory: %s",
objdirbuf.buf);
@@ -496,15 +491,16 @@ static void link_alt_odb_entries_the_repository(const 
char *alt,
alt = parse_alt_odb_entry(alt, sep, );
if (!entry.len)
continue;
-   link_alt_odb_entry(the_repository, entry.buf,
+   link_alt_odb_entry(r, entry.buf,
   relative_base, depth, objdirbuf.buf);
}
strbuf_release();
strbuf_release();
 }
 
-static void read_info_alternates_the_repository(const char *relative_base,
-   int depth)
+static void 

[PATCH 09/27] sha1_file: add raw_object_store argument to alt_odb_usable

2018-02-20 Thread Stefan Beller
Add a raw_object_store to alt_odb_usable to be more specific about which
repository to act on. The choice of the repository is delegated to its
only caller link_alt_odb_entry.

Signed-off-by: Stefan Beller 
---
 sha1_file.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 1348dce68f..a8e23bd2f8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -346,7 +346,9 @@ static const char *alt_sha1_path(struct 
alternate_object_database *alt,
 /*
  * Return non-zero iff the path is usable as an alternate object database.
  */
-static int alt_odb_usable(struct strbuf *path, const char *normalized_objdir)
+static int alt_odb_usable(struct raw_object_store *o,
+ struct strbuf *path,
+ const char *normalized_objdir)
 {
struct alternate_object_database *alt;
 
@@ -362,7 +364,7 @@ static int alt_odb_usable(struct strbuf *path, const char 
*normalized_objdir)
 * Prevent the common mistake of listing the same
 * thing twice, or object directory itself.
 */
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   for (alt = o->alt_odb_list; alt; alt = alt->next) {
if (!fspathcmp(path->buf, alt->path))
return 0;
}
@@ -414,7 +416,7 @@ static int link_alt_odb_entry(const char *entry, const char 
*relative_base,
while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
strbuf_setlen(, pathbuf.len - 1);
 
-   if (!alt_odb_usable(, normalized_objdir)) {
+   if (!alt_odb_usable(_repository->objects, , 
normalized_objdir)) {
strbuf_release();
return -1;
}
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 01/27] repository: introduce raw object store field

2018-02-20 Thread Stefan Beller
The raw object store field will contain any objects needed for
access to objects in a given repository.

This patch introduces the raw object store and populates it with the
`objectdir`, which used to be part of the repository struct.

As the struct gains members, we'll also populate the function to clear
the memory for these members.

In a later we'll introduce a struct object_parser, that will complement
the object parsing in a repository struct: The raw object parser is the
layer that will provide access to raw object content, while the higher
level object parser code will parse raw objects and keeps track of
parenthood and other object relationships using 'struct object'.
For now only add the lower level to the repository struct.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 builtin/grep.c |  2 +-
 environment.c  |  5 +++--
 object-store.h | 15 +++
 object.c   |  5 +
 path.c |  2 +-
 repository.c   | 17 +
 repository.h   |  7 ---
 7 files changed, 42 insertions(+), 11 deletions(-)
 create mode 100644 object-store.h

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ca4ac80d8..0f0c195705 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -432,7 +432,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
repository *superproject,
 * object.
 */
grep_read_lock();
-   add_to_alternates_memory(submodule.objectdir);
+   add_to_alternates_memory(submodule.objects.objectdir);
grep_read_unlock();
 
if (oid) {
diff --git a/environment.c b/environment.c
index de8431e01e..ec10b062e6 100644
--- a/environment.c
+++ b/environment.c
@@ -13,6 +13,7 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 #include "commit.h"
+#include "object-store.h"
 
 int trust_executable_bit = 1;
 int trust_ctime = 1;
@@ -244,9 +245,9 @@ const char *get_git_work_tree(void)
 
 char *get_object_directory(void)
 {
-   if (!the_repository->objectdir)
+   if (!the_repository->objects.objectdir)
BUG("git environment hasn't been setup");
-   return the_repository->objectdir;
+   return the_repository->objects.objectdir;
 }
 
 int odb_mkstemp(struct strbuf *template, const char *pattern)
diff --git a/object-store.h b/object-store.h
new file mode 100644
index 00..cf35760ceb
--- /dev/null
+++ b/object-store.h
@@ -0,0 +1,15 @@
+#ifndef OBJECT_STORE_H
+#define OBJECT_STORE_H
+
+struct raw_object_store {
+   /*
+* Path to the repository's object store.
+* Cannot be NULL after initialization.
+*/
+   char *objectdir;
+};
+#define RAW_OBJECT_STORE_INIT { NULL }
+
+void raw_object_store_clear(struct raw_object_store *o);
+
+#endif /* OBJECT_STORE_H */
diff --git a/object.c b/object.c
index 9e6f9ff20b..11d904c033 100644
--- a/object.c
+++ b/object.c
@@ -445,3 +445,8 @@ void clear_commit_marks_all(unsigned int flags)
obj->flags &= ~flags;
}
 }
+
+void raw_object_store_clear(struct raw_object_store *o)
+{
+   free(o->objectdir);
+}
diff --git a/path.c b/path.c
index da8b655730..81a42d9115 100644
--- a/path.c
+++ b/path.c
@@ -382,7 +382,7 @@ static void adjust_git_path(const struct repository *repo,
strbuf_splice(buf, 0, buf->len,
  repo->index_file, strlen(repo->index_file));
else if (dir_prefix(base, "objects"))
-   replace_dir(buf, git_dir_len + 7, repo->objectdir);
+   replace_dir(buf, git_dir_len + 7, repo->objects.objectdir);
else if (git_hooks_path && dir_prefix(base, "hooks"))
replace_dir(buf, git_dir_len + 5, git_hooks_path);
else if (repo->different_commondir)
diff --git a/repository.c b/repository.c
index 4ffbe9bc94..2255ff657e 100644
--- a/repository.c
+++ b/repository.c
@@ -1,11 +1,18 @@
 #include "cache.h"
 #include "repository.h"
+#include "object-store.h"
 #include "config.h"
 #include "submodule-config.h"
 
 /* The main repository */
 static struct repository the_repo = {
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 
_algos[GIT_HASH_SHA1], 0, 0
+   NULL, NULL,
+   RAW_OBJECT_STORE_INIT,
+   NULL, NULL, NULL,
+   NULL, NULL, NULL,
+   _index,
+   _algos[GIT_HASH_SHA1],
+   0, 0
 };
 struct repository *the_repository = _repo;
 
@@ -42,8 +49,8 @@ static void repo_setup_env(struct repository *repo)
!repo->ignore_env);
free(repo->commondir);
repo->commondir = strbuf_detach(, NULL);
-   free(repo->objectdir);
-   repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
+   free(repo->objects.objectdir);
+   repo->objects.objectdir = git_path_from_env(DB_ENVIRONMENT, 
repo->commondir,
"objects", !repo->ignore_env);
free(repo->graft_file);
repo->graft_file = 

[PATCH 15/27] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 object-store.h |  3 +--
 sha1_file.c| 21 +++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/object-store.h b/object-store.h
index f283fbdba9..873b341774 100644
--- a/object-store.h
+++ b/object-store.h
@@ -24,8 +24,7 @@ struct alternate_object_database {
 */
char path[FLEX_ARRAY];
 };
-#define prepare_alt_odb(r) prepare_alt_odb_##r()
-void prepare_alt_odb_the_repository(void);
+void prepare_alt_odb(struct repository *r);
 char *compute_alternate_path(const char *path, struct strbuf *err);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
 int foreach_alt_odb(alt_odb_fn, void*);
diff --git a/sha1_file.c b/sha1_file.c
index 6e5105a252..9cf3fffbeb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -673,21 +673,22 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
return r;
 }
 
-void prepare_alt_odb_the_repository(void)
+void prepare_alt_odb(struct repository *r)
 {
-   const char *alt;
-
-   if (the_repository->objects.alt_odb_tail)
+   if (r->objects.alt_odb_tail)
return;
 
-   alt = getenv(ALTERNATE_DB_ENVIRONMENT);
+   r->objects.alt_odb_tail = >objects.alt_odb_list;
+
+   if (!r->ignore_env) {
+   const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);
+   if (!alt)
+   alt = "";
 
-   the_repository->objects.alt_odb_tail =
-   _repository->objects.alt_odb_list;
-   link_alt_odb_entries(the_repository, alt,
-PATH_SEP, NULL, 0);
+   link_alt_odb_entries(r, alt, PATH_SEP, NULL, 0);
+   }
 
-   read_info_alternates(the_repository, get_object_directory(), 0);
+   read_info_alternates(r, r->objects.objectdir, 0);
 }
 
 /* Returns 1 if we have successfully freshened the file, 0 otherwise. */
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 10/27] sha1_file: add repository argument to link_alt_odb_entry

2018-02-20 Thread Stefan Beller
Add a repository argument to allow the link_alt_odb_entry caller to be
more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

Since the implementation does not yet work with other repositories,
use a wrapper macro to enforce that the caller passes in
the_repository as the first argument. It would be more appealing to
use BUILD_ASSERT_OR_ZERO to enforce this, but that doesn't work
because it requires a compile-time constant and common compilers like
gcc 4.8.4 do not consider "r == the_repository" a compile-time
constant.

This and the following three patches add repository arguments to
link_alt_odb_entry, read_info_alternates, link_alt_odb_entries
and prepare_alt_odb. Three out of the four functions are found
in a recursive call chain, calling each other, and one of them
accesses the repositories `objectdir` (which was migrated; it
was an obvious choice) and `ignore_env` (which we need to keep in
the repository struct for clarify); hence we will pass through the
repository unlike just the object store object + the ignore_env flag.

Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index a8e23bd2f8..aaa6276211 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -390,8 +390,9 @@ static int alt_odb_usable(struct raw_object_store *o,
  * terminating NUL.
  */
 static void read_info_alternates(const char * relative_base, int depth);
-static int link_alt_odb_entry(const char *entry, const char *relative_base,
-   int depth, const char *normalized_objdir)
+#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, n)
+static int link_alt_odb_entry_the_repository(const char *entry,
+   const char *relative_base, int depth, const char *normalized_objdir)
 {
struct alternate_object_database *ent;
struct strbuf pathbuf = STRBUF_INIT;
@@ -488,7 +489,8 @@ static void link_alt_odb_entries(const char *alt, int sep,
alt = parse_alt_odb_entry(alt, sep, );
if (!entry.len)
continue;
-   link_alt_odb_entry(entry.buf, relative_base, depth, 
objdirbuf.buf);
+   link_alt_odb_entry(the_repository, entry.buf,
+  relative_base, depth, objdirbuf.buf);
}
strbuf_release();
strbuf_release();
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 22/27] sha1_file: allow sha1_file_name to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 object-store.h | 3 +--
 sha1_file.c| 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index a37d74a0c5..146358fe33 100644
--- a/object-store.h
+++ b/object-store.h
@@ -127,8 +127,7 @@ void raw_object_store_clear(struct raw_object_store *o);
  * Put in `buf` the name of the file in the local object database that
  * would be used to store a loose object with the specified sha1.
  */
-#define sha1_file_name(r, b, s) sha1_file_name_##r(b, s)
-void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1);
+void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1);
 
 #define map_sha1_file(r, s, sz) map_sha1_file_##r(s, sz)
 void *map_sha1_file_the_repository(const unsigned char *sha1, unsigned long 
*size);
diff --git a/sha1_file.c b/sha1_file.c
index d1f77a4c80..8fad523d85 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -323,9 +323,9 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
*sha1)
+void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1)
 {
-   strbuf_addstr(buf, get_object_directory());
+   strbuf_addstr(buf, r->objects.objectdir);
strbuf_addch(buf, '/');
fill_sha1_path(buf, sha1);
 }
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 24/27] sha1_file: allow open_sha1_file to handle arbitrary repositories

2018-02-20 Thread Stefan Beller
Signed-off-by: Stefan Beller 
Signed-off-by: Jonathan Nieder 
---
 sha1_file.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 6879af8993..598acc5410 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -902,9 +902,8 @@ static int stat_sha1_file(struct repository *r, const 
unsigned char *sha1,
  * Like stat_sha1_file(), but actually open the object and return the
  * descriptor. See the caveats on the "path" parameter above.
  */
-#define open_sha1_file(r, s, p) open_sha1_file_##r(s, p)
-static int open_sha1_file_the_repository(const unsigned char *sha1,
-const char **path)
+static int open_sha1_file(struct repository *r,
+ const unsigned char *sha1, const char **path)
 {
int fd;
struct alternate_object_database *alt;
@@ -912,7 +911,7 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
static struct strbuf buf = STRBUF_INIT;
 
strbuf_reset();
-   sha1_file_name(the_repository, , sha1);
+   sha1_file_name(r, , sha1);
*path = buf.buf;
 
fd = git_open(*path);
@@ -920,8 +919,8 @@ static int open_sha1_file_the_repository(const unsigned 
char *sha1,
return fd;
most_interesting_errno = errno;
 
-   prepare_alt_odb(the_repository);
-   for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next) {
+   prepare_alt_odb(r);
+   for (alt = r->objects.alt_odb_list; alt; alt = alt->next) {
*path = alt_sha1_path(alt, sha1);
fd = git_open(*path);
if (fd >= 0)
-- 
2.16.1.291.g4437f3f132-goog



[PATCH 02/27] object-store: migrate alternates struct and functions from cache.h

2018-02-20 Thread Stefan Beller
Migrate the struct alternate_object_database and all its related
functions to the object store as these functions are easier found in
that header. The migration is just a verbatim copy, no need to
include the object store header at any C file, because cache.h includes
repository.h which in turn includes the object-store.h

Signed-off-by: Stefan Beller 
---
 cache.h| 51 --
 object-store.h | 51 ++
 2 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/cache.h b/cache.h
index 9cac7bb518..964873faf2 100644
--- a/cache.h
+++ b/cache.h
@@ -1576,57 +1576,6 @@ extern int has_dirs_only_path(const char *name, int len, 
int prefix_len);
 extern void schedule_dir_for_removal(const char *name, int len);
 extern void remove_scheduled_dirs(void);
 
-extern struct alternate_object_database {
-   struct alternate_object_database *next;
-
-   /* see alt_scratch_buf() */
-   struct strbuf scratch;
-   size_t base_len;
-
-   /*
-* Used to store the results of readdir(3) calls when searching
-* for unique abbreviated hashes.  This cache is never
-* invalidated, thus it's racy and not necessarily accurate.
-* That's fine for its purpose; don't use it for tasks requiring
-* greater accuracy!
-*/
-   char loose_objects_subdir_seen[256];
-   struct oid_array loose_objects_cache;
-
-   char path[FLEX_ARRAY];
-} *alt_odb_list;
-extern void prepare_alt_odb(void);
-extern char *compute_alternate_path(const char *path, struct strbuf *err);
-typedef int alt_odb_fn(struct alternate_object_database *, void *);
-extern int foreach_alt_odb(alt_odb_fn, void*);
-
-/*
- * Allocate a "struct alternate_object_database" but do _not_ actually
- * add it to the list of alternates.
- */
-struct alternate_object_database *alloc_alt_odb(const char *dir);
-
-/*
- * Add the directory to the on-disk alternates file; the new entry will also
- * take effect in the current process.
- */
-extern void add_to_alternates_file(const char *dir);
-
-/*
- * Add the directory to the in-memory list of alternates (along with any
- * recursive alternates it points to), but do not modify the on-disk alternates
- * file.
- */
-extern void add_to_alternates_memory(const char *dir);
-
-/*
- * Returns a scratch strbuf pre-filled with the alternate object directory,
- * including a trailing slash, which can be used to access paths in the
- * alternate. Always use this over direct access to alt->scratch, as it
- * cleans up any previous use of the scratch buffer.
- */
-extern struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
-
 struct pack_window {
struct pack_window *next;
unsigned char *base;
diff --git a/object-store.h b/object-store.h
index cf35760ceb..5678aa1136 100644
--- a/object-store.h
+++ b/object-store.h
@@ -1,6 +1,57 @@
 #ifndef OBJECT_STORE_H
 #define OBJECT_STORE_H
 
+extern struct alternate_object_database {
+   struct alternate_object_database *next;
+
+   /* see alt_scratch_buf() */
+   struct strbuf scratch;
+   size_t base_len;
+
+   /*
+* Used to store the results of readdir(3) calls when searching
+* for unique abbreviated hashes.  This cache is never
+* invalidated, thus it's racy and not necessarily accurate.
+* That's fine for its purpose; don't use it for tasks requiring
+* greater accuracy!
+*/
+   char loose_objects_subdir_seen[256];
+   struct oid_array loose_objects_cache;
+
+   char path[FLEX_ARRAY];
+} *alt_odb_list;
+void prepare_alt_odb(void);
+char *compute_alternate_path(const char *path, struct strbuf *err);
+typedef int alt_odb_fn(struct alternate_object_database *, void *);
+int foreach_alt_odb(alt_odb_fn, void*);
+
+/*
+ * Allocate a "struct alternate_object_database" but do _not_ actually
+ * add it to the list of alternates.
+ */
+struct alternate_object_database *alloc_alt_odb(const char *dir);
+
+/*
+ * Add the directory to the on-disk alternates file; the new entry will also
+ * take effect in the current process.
+ */
+void add_to_alternates_file(const char *dir);
+
+/*
+ * Add the directory to the in-memory list of alternates (along with any
+ * recursive alternates it points to), but do not modify the on-disk alternates
+ * file.
+ */
+void add_to_alternates_memory(const char *dir);
+
+/*
+ * Returns a scratch strbuf pre-filled with the alternate object directory,
+ * including a trailing slash, which can be used to access paths in the
+ * alternate. Always use this over direct access to alt->scratch, as it
+ * cleans up any previous use of the scratch buffer.
+ */
+struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
+
 struct raw_object_store {
/*
 * Path to the repository's object store.
-- 
2.16.1.291.g4437f3f132-goog



[PATCHv3 00/27] Moving global state into the repository object (part 1)

2018-02-20 Thread Stefan Beller
v3:
* reverted back to use the repository for most of the functions
  (including unduplicating 'ignore_env')
* rebased on master again (I lost that state when doing v2, as
  I did both rebase as well as conversion to object store in one go)
* one additional patch, that moves the alternates related things to
  object-store.h, such that inclusion of cache.h (in object-store.h)
  is not required any more.

v2:
* duplicated the 'ignore_env' bit into the object store as well
* the #define trick no longer works as we do not have a "the_objectstore" 
global,
  which means there is just one patch per function that is converted.
  As this follows the same structure of the previous series, I am still 
confident
  there is no hidden dependencies to globals outside the object store in these
  converted functions.
* rebased on top of current master, resolving the merge conflicts.
  I think I used the list.h APIs right, but please double check.

Thanks,
Stefan

v1:
This is a real take on the first part of the recent RFC[1].

Jonathan Tan suggested[2] that "sha1_loose_object_info to handle arbitrary 
repositories"
might be a good breaking point for a first part at that RFC at patch 38.
This series is smaller and contains only 26 patches as the patches in the big
RFC were slightly out of order.

I developed this series partly by writing patches, but mostly by cherrypicking
from that RFC on top of current master. I noticed no external conflicts apart
from one addition to the repositories _INIT macro, which was easy to resolve.

Comments in the early range of that RFC were on 003 where Junio pointed out
that the coccinelle patch ought to be not in contrib/coccinelle, so I put it
in a sub directory there, as 'make coccicheck' doesn't traverse subdirs.

brian had a questoin on patch 25 in the RFC, but that seemed to resolve itself
without any suggestion to include into this series[3].

Duy suggested that we shall not use the repository blindly, but should carefully
examine whether to pass on an object store or the refstore or such[4], which
I agree with if it makes sense. This series unfortunately has an issue with that
as I would not want to pass down the `ignore_env` flag separately from the 
object
store, so I made all functions that only take the object store to have the raw
object store as the first parameter, and others using the full repository.

Eric Sunshine brought up memory leaks with the RFC, and I would think to
have plugged all holes.

[1] https://public-inbox.org/git/20180205235508.216277-1-sbel...@google.com/
[2] 
https://public-inbox.org/git/20180207143300.ce1c39ca07f6a0d64fe0e...@google.com/
[3] 
https://public-inbox.org/git/20180206011940.gd7...@genre.crustytoothpaste.net/
[4] 
https://public-inbox.org/git/cacsjy8cggekpx4czkyytsprj87uqvkzsol7fyt__p2dh_1l...@mail.gmail.com/

Thanks,
Stefan

Jonathan Nieder (2):
  sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
  sha1_file: allow sha1_loose_object_info to handle arbitrary
repositories

Stefan Beller (25):
  repository: introduce raw object store field
  object-store: migrate alternates struct and functions from cache.h
  object-store: move alt_odb_list and alt_odb_tail to object store
  object-store: free alt_odb_list
  object-store: move packed_git and packed_git_mru to object store
  object-store: close all packs upon clearing the object store
  pack: move prepare_packed_git_run_once to object store
  pack: move approximate object count to object store
  sha1_file: add raw_object_store argument to alt_odb_usable
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to prepare_alt_odb
  sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
  sha1_file: allow prepare_alt_odb to handle arbitrary repositories
  sha1_file: add repository argument to sha1_file_name
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to map_sha1_file
  sha1_file: add repository argument to sha1_loose_object_info
  sha1_file: allow sha1_file_name to handle arbitrary repositories
  sha1_file: allow stat_sha1_file to handle arbitrary repositories
  sha1_file: allow open_sha1_file to handle arbitrary repositories
  sha1_file: allow map_sha1_file to handle arbitrary repositories

 builtin/am.c |   2 +-
 builtin/clone.c  |   2 +-
 builtin/count-objects.c  |   6 +-
 builtin/fetch.c  |   2 +-
 builtin/fsck.c   |  13 ++--
 builtin/gc.c |   4 +-
 builtin/grep.c   |   2 +-
 builtin/index-pack.c |   1 +
 builtin/merge.c  |   2 +-
 builtin/pack-objects.c   |  20 +++---
 builtin/pack-redundant.c |   6 +-
 builtin/receive-pack.c   |   3 +-
 cache.h  |  87 

Re: Git should preserve modification times at least on request

2018-02-20 Thread Junio C Hamano
Jeff King  writes:

> Modification times are a subset of the total metadata you might care
> about, so they are solving a much more general problem. Which may also
> partially answer your question about why this isn't built into git. The
> general problem gets much bigger when you start wanting to carry things
> like modes (which git doesn't actually track; we really only care about
> the executable bit) or extended attributes (acls, etc).

"modes" are interesting, especially when you think about group
permissions, as it would make you design how you store group and
owner, which in turn forces you to think how "peter" on one system
relates to "peter" on another system (answer: there generally isn't
any relationship) ;-)



Re: [GSoC][PATCH] tag: Make "git tag --contains " less chatty if is invalid

2018-02-20 Thread Junio C Hamano
Stefan Beller  writes:

>> diff --git a/t/t7013-tag-contains.sh b/t/t7013-tag-contains.sh
>> new file mode 100755
>> index 0..65119dada
>> --- /dev/null
>> +++ b/t/t7013-tag-contains.sh
>
> Thanks for adding the tests into a new file instead of putting it somewhere
> where it is already convenient. (We have too many of those "just add it there
> as it is easiest to fit in")

Careful, as that cuts both ways.  We want to strongly encourage
people to see if there is already a place that is a good enough fit
for new tests before adding small test scripts randomly to consume
the test serial numbers and test process start-up cost.  Only when
there is nowhere appropriate, we do want to add.  And if this covers
both tag and branch, then a new script may be appropriate but it
shouldn't limit its future enhancement (to test 'git branch') by
having 'tag' to pretend that this file must be limited to 'git tag'.

> So I'd expect the return code to be 0 (if we don't care) or 1
> (if we do care), in the case of 1, we shall write:
>
>   test_must_fail git tag --contains ... &&
>   grep 
>
> (A long way of hinting at the test_must_fail test function,
> that lives in t/test-lib-functions.sh)

If you are looking at error stream, it is very likely that you would
want to study test_i18ngrep and use it, as errors are fair game for
i18n (and possibly coloring, which is a near-by topic).


RE: [BUG] Worktree prune complains about missing path

2018-02-20 Thread Randall S. Becker
On February 20, 2018 5:22 PM Eric Sunshine wrote:
> On Tue, Feb 20, 2018 at 3:36 PM, Randall S. Becker
>  wrote:
> > I’m a bit confused about this, as I thought I understood worktrees :(.
> >
> > /home/randall/nsgit/test/test dir.mytest: rm -rf dest.wt
> > /home/randall/nsgit/test/test dir.mytest/dest: git worktree prune -v
> > Removing worktrees/dest.wt: gitdir file points to non-existent
> > location
> >
> > It seems like one or two situations are a problem:
> > 1) I’m using a full path for the worktree.
> > 2) There’s a blank in my path – deliberate… tests, yanno.
> >
> > This is git 2.16.2. Could anyone shed some light on this?
> 
> This appears to be working as intended. "git worktree prune" is telling you
> that it is pruning the administrative data for the "dest.wt" worktree (reason:
> "worktree location no longer exists"), which you intentionally deleted before
> pruning. It's not clear what it is that you find confusing. There is not a lot
> more I can say without understanding what behavior you were expecting
> and how your expectation differs from the actual experience.

I should have been more clear here as to the issue. My bad. The git worktree 
prune operation does not remove all vestiges of the removed worktree. The 
following files are retained:

./logs/refs/heads/dest.wt
./refs/heads/dest.wt

So, now that I understand in hindsight, these are actually references to the 
worktree branch 'dest.wt' that obviously remains correctly and properly in git.

Adding:  git branch -D dest.wtto my test script cleared my (embarrassing) 
problem of my own doing.

> 
> (Also, please consider how easy or difficult it is for a reader to interpret 
> your
> pasted "sample session". The one provided is more confusing than necessary
> due to the command prompt bearing the same path information as the
> output of the "git worktree list" command, as well as unnecessary duplicated
> commands, such as "ls", and missing "cd" commands which do not help to
> illuminate what it is you are trying to get across. The pasted transcript also
> contains invalid code-points which render as oddball characters -- or not at 
> all
> -- which didn't help. Best would be to prepare a minimal example of shell
> commands to reproduce the behavior you're trying to illustrate.)

Good point. Again, my bad - very long day debugging. I wanted to show where I 
was in the directory so I sacrificed brevity for completeness and noise. My 
apologies.

So, no bug, just a buggy user.

Cheers,
Randall



Re: [PATCH v4 04/13] commit-graph: implement write_commit_graph()

2018-02-20 Thread Junio C Hamano
Derrick Stolee  writes:

> +#define GRAPH_OID_VERSION_SHA1 1
> +#define GRAPH_OID_LEN_SHA1 20

This hardcoded 20 on the right hand side of this #define is probably
problematic.   Unless you are planning to possibly store truncated
hash value for some future hash algorithm, GRAPH_OID_LEN_$HASH should
always be the same as GIT_$HASH_RAWSZ, I would think.  IOW

#define GRAPH_OID_LEN_SHA1 GIT_SHA1_RAWSZ

perhaps?

> +static void write_graph_chunk_fanout(struct sha1file *f,
> +  struct commit **commits,
> +  int nr_commits)
> +{
> + uint32_t i, count = 0;
> + struct commit **list = commits;
> + struct commit **last = commits + nr_commits;
> +
> + /*
> +  * Write the first-level table (the list is sorted,
> +  * but we use a 256-entry lookup to be able to avoid
> +  * having to do eight extra binary search iterations).
> +  */
> + for (i = 0; i < 256; i++) {
> + while (list < last) {
> + if ((*list)->object.oid.hash[0] != i)
> + break;
> + count++;
> + list++;
> + }

If count and list are always incremented in unison, perhaps you do
not need an extra variable "last".  If typeof(nr_commits) is wider
than typeof(count), this loop and the next write-be32 is screwed
anyway ;-)

This comment probably applies equally to some other uses of the same
"compute last pointer to compare with running pointer for
termination" pattern in this patch.

> + sha1write_be32(f, count);
> + }
> +}

> +static int commit_pos(struct commit **commits, int nr_commits,
> +   const struct object_id *oid, uint32_t *pos)
> +{

It is a bit unusual to see something_pos() that returns an integer
that does *NOT* return the position as its return value.  Dropping
the *pos parameter, and returning "mid" when commits[mid] is what we
wanted to see, and otherwise returning "-1 - first" to signal the
position at which we _would_ have found the object, if it were in
the table, would make it more consistent with the usual convention.

Don't we even have such a generalized binary search helper already
somewhere in the system?

> +static void write_graph_chunk_data(struct sha1file *f, int hash_len,
> +struct commit **commits, int nr_commits)
> +{
> + struct commit **list = commits;
> + struct commit **last = commits + nr_commits;
> + uint32_t num_large_edges = 0;
> +
> + while (list < last) {
> + struct commit_list *parent;
> + uint32_t int_id;
> + uint32_t packedDate[2];
> +
> +...
> + if (!parent)
> + int_id = GRAPH_PARENT_NONE;
> + else if (parent->next)
> + int_id = GRAPH_LARGE_EDGES_NEEDED | num_large_edges;
> + else if (!commit_pos(commits, nr_commits,
> + &(parent->item->object.oid), _id))
> + int_id = GRAPH_PARENT_MISSING;
> +
> + sha1write_be32(f, int_id);
> +
> + if (parent && parent->next) {

This is equivalent to checking "int_id & GRAPH_LARGE_EDGES_NEEDED",
right?  Not suggesting to use the other form of checks, but trying
to see what's the best way to express it in the most readable way.

> + do {
> + num_large_edges++;
> + parent = parent->next;
> + } while (parent);

It feels somewhat wasteful to traverse the commit's parents list
only to count, without populating the octopus table (which I
understand is assumed to be minority case under this design).

> + }
> +
> + if (sizeof((*list)->date) > 4)
> + packedDate[0] = htonl(((*list)->date >> 32) & 0x3);
> + else
> + packedDate[0] = 0;

OK, the undefined pattern in the previous round is now gone ;-)  Good.

> + packedDate[1] = htonl((*list)->date);
> + sha1write(f, packedDate, 8);
> +
> + list++;
> + }
> +}
> +
> +static void write_graph_chunk_large_edges(struct sha1file *f,
> +   struct commit **commits,
> +   int nr_commits)
> +{
> + struct commit **list = commits;
> + struct commit **last = commits + nr_commits;
> + struct commit_list *parent;
> +
> + while (list < last) {
> + int num_parents = 0;
> + for (parent = (*list)->parents; num_parents < 3 && parent;
> +  parent = parent->next)
> + num_parents++;
> +
> + if (num_parents <= 2) {
> + list++;
> + continue;
> + }
> +
> + /* Since num_parents > 2, this initializer is safe. */
> + for (parent = 

Re: Question about get_cached_commit_buffer()

2018-02-20 Thread Jeff King
On Tue, Feb 20, 2018 at 05:12:50PM -0500, Derrick Stolee wrote:

> In rev-list, the "--header" option outputs a value and expects the buffer to
> be cached. It outputs the header info only if get_cached_commit_buffer()
> returns a non-null buffer, giving incorrect output. If it called
> get_commit_buffer() instead, it would immediately call
> get_cached_commit_buffer() and on failure actually load the buffer.
> 
> This has not been a problem before, since the buffer was always loaded at
> some point for each commit (and saved because of the save_commit_buffer
> global).
> 
> I propose to make get_cached_commit_buffer() static to commit.c and convert
> all callers to get_commit_buffer(). Is there any reason to _not_ do this? It
> seems that there is no functional or performance change.

That helper was added in 152ff1cceb (provide helpers to access the
commit buffer, 2014-06-10). I think interesting part is the final
paragraph:

  Note that we also need to add a "get_cached" variant which
  returns NULL when we do not have a cached buffer. At first
  glance this seems to defeat the purpose of "get", which is
  to always provide a return value. However, some log code
  paths actually use the NULL-ness of commit->buffer as a
  boolean flag to decide whether to try printing the
  commit. At least for now, we want to continue supporting
  that use.

So I think a conversion to get_commit_buffer() would break the callers
that use the boolean nature for something useful. Unfortunately the
author did not enumerate exactly what those uses are, so we'll have to
dig. :)

My guess is that it has to do with showing some commits twice, since we
would normally have the buffer available the first time we hit the
commit, and then free it in finish_commit().

If we blame that rev-list line (and then walk back over a bunch of
uninteresting commits via parent re-blaming), it comes from 3131b71301
(Add "--show-all" revision walker flag for debugging, 2008-02-09).

So there it is. It does show commits multiple times, but suppresses the
verbose header after the first showing. If we do something like this:

  git rev-list --show-all --pretty --boundary c93150cfb0^-

you'll see some boundary commits that _don't_ have their pretty headers
shown. And with your proposed patch, we'd show them again. To keep the
same behavior we need to store that "we've already seen this" boolean
somewhere else (e.g., in an object flag; possibly SEEN, but that might
run afoul of other logic).

It looks like the call in log-tree comes from the same commit, and
serves the same purpose.

Aside from storing the boolean "did we show it" in another way, the
other option is to simply change the behavior and accept that we might
pretty-print the commit twice. This is a backwards-incompatible change,
but I'm not sure if anybody would care. According to that commit,
--show-all was added explicitly for debugging, and it has never been
documented.  I couldn't find any reference to people actually using it
on the list (a grep of the whole archive turns up 32 messages, most of
which are just it appearing in context; the only person mentioning its
actual use was Linus in 2008.

-Peff


Re: Git should preserve modification times at least on request

2018-02-20 Thread Peter Backes
On Tue, Feb 20, 2018 at 11:32:23PM +0100, Johannes Schindelin wrote:
> Hi Peter,
> 
> On Tue, 20 Feb 2018, Peter Backes wrote:
> 
> > On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:
> > 
> > > I would probably invent a file format (``)
> > 
> > I'm stuck there because of  being munged.
> 
> From which command do you want to get it? If you are looking at `git
> diff`, you may want to use the `-z --name-only` options to avoid munging
> the paths.

I plan to use "git diff-tree --name-only $w_tree HEAD" and subtract
all lines from "git diff-index --name-only HEAD" to get the files for 
which the timestamp should be stored..

If I use "-z" I get the non-munged path, but I cannot safely store such 
paths in the proposed file format; they might contain newlines (sigh). 
So at one point I have to munge. Then the same question arises when I 
have to get the actual path from the munged path when restoring the 
timestamps.

If there's no ready-made functionality to munge and unmunge paths, I 
have to write some awk for this. At first I thought this might add one 
more dependency to git, but it seems that awk is already used in 
git-mergetool.sh, so I suppose it's okay to use in git-stash.sh etc, 
too.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de


Re: Is there any way to "interrupt" a rebase?

2018-02-20 Thread Jacob Keller
On Tue, Feb 20, 2018 at 12:56 PM, Jeff King  wrote:
> On Tue, Feb 20, 2018 at 12:44:51PM +0100, Johannes Schindelin wrote:
>
>> > It might be even possible to design a new subcommand for the interactive
>> > rebase to facilitate a variation of this strategy (possibly even making
>> > use of the fact that the interactive rebase accumulates mappings between
>> > the original commits and the rewritten ones in
>> > $GIT_DIR/rebase-merge/rewritten-list, intended for use in the post-rewrite
>> > hook).
>>
>> This feature might look somewhat like this:
>>
>>   git rebase --replay-latest-commits 3
>>
>> and it would not even have to look at the `rewritten-list`. All it would
>> do is to put back the latest `pick` from the `done` file (in case of merge
>> conflicts) into the `git-rebase-todo` file, then insert `pick lines for
>> HEAD~3.. at the beginning of that todo file, and then `git reset --hard
>> HEAD~3`.
>
> Keep in mind that the "pick" lines could be "edit", "squash", etc.
>
> I think the general form of your original email's proposal is something
> like: What if we had a "git rebase --rewind" that could "undo" the prior
> command? So if I had a todo file like:
>
>   pick 1
>   edit 2
>   x make test
>   edit 3
>   x make test
>   pick 4
>
> and I failed at the second "make test", then I'd have:
>
>   pick 1
>   edit 2
>   x make test
>   edit 3
>   x make test
>
> in the "done" file, with the final pick remaining in "todo". Could I
> then ask to "rewind" my state by moving "x make test" back to the
> "todo". And two rewinds would get me back to applying patch 3, which I
> could then fix up and re-run my test. Or four rewinds would get me back
> to patch 2, which maybe is where I made the initial mistake.
>
> That's a bit more primitive than what you're proposing in this
> follow-on, because you'd be doing the replay yourself (unless we remap
> the commits). But it's very easy to reason about and implement.
>
> Anyway, just musing at this point. I haven't thought it through, but I
> like the direction of everything you're saying. ;)
>
> -Peff

Using a --rewind that simply tracks the point of each history and can
reset back to each seems a bit more inline with what the original
suggestion is. Sort of like "undo" in an editor might. You could even
add a "rewind=x" so it could go back more than one step at a time, tho
just re-running rewind until you get where you want would be doable as
well.

I like the overall direction of both these suggestions.

Thanks,
Jake


Re: [GSoC][PATCH] tag: Make "git tag --contains " less chatty if is invalid

2018-02-20 Thread Stefan Beller
Welcome to the Git mailing list!

On Mon, Feb 19, 2018 at 1:21 PM, Paul-Sebastian Ungureanu
 wrote:
> git tag --contains  prints the whole help text if  is
> invalid. It should only show the error message instead.

Makes sense.

>
> This bug was a side effect of looking up the commit in option
> parser callback. When a error occurs in the option parser, the
> full usage is shown. To fix this bug, the part related to
> looking up the commit was moved outside of the option parser
> to the ref-filter module.
>
> Basically, the option parser only parses strings that represent
> commits and the ref-filter performs the commit look-up. If an
> error occurs during the option parsing, then it must be an invalid
> argument and the user should be informed of usage, but if a error
> occurs during ref-filtering, then it is a problem with the
> argument.
>
> Signed-off-by: Paul-Sebastian Ungureanu 
> ---

> diff --git a/parse-options.h b/parse-options.h
> index af711227a..3aa8ddd46 100644
> --- a/parse-options.h
> +++ b/parse-options.h

parse-options.h is a very generic place in Gits source code,
so this would also fix 'git branch --contains=' at the same time?
Would it make sense to say so in the commit message or have a
test for that?

> @@ -258,9 +258,20 @@ extern int parse_opt_passthru_argv(const struct option 
> *, const char *, int);
>   PARSE_OPT_LASTARG_DEFAULT | flag, \
>   parse_opt_commits, (intptr_t) "HEAD" \
> }
> +#define _OPT_CONTAINS_OR_WITH_STRS(name, variable, help, flag) \
> +   { OPTION_CALLBACK, 0, name, (variable), N_("commit"), (help), \
> + PARSE_OPT_LASTARG_DEFAULT | flag, \
> + parse_opt_string_list, (intptr_t) "HEAD" \
> +   }

This is the same as _OPT_CONTAINS_OR_WITH
except parse_opt_commits is substituted by parse_opt_string_list.

Do we need both? (As far as I understand this addresses a whole class
of errors that could be eliminated, we do not want to fix git-tag only,
we'd also want to fix git-branch as well as git-for-each-ref ?)
So instead convert all callers to the new behavior, or would that break
existing conventions? (Then there is no need to have 2 competing
defines that are nearly identical) Instead of double-defining, would it
be possible to turn it into a flag? OPT_DONT_HELP_ON_BAD_OBJECTID
or something?

> +
>  #define OPT_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("contains", v, h, 
> PARSE_OPT_NONEG)
>  #define OPT_NO_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("no-contains", v, h, 
> PARSE_OPT_NONEG)
>  #define OPT_WITH(v, h) _OPT_CONTAINS_OR_WITH("with", v, h, PARSE_OPT_HIDDEN 
> | PARSE_OPT_NONEG)
>  #define OPT_WITHOUT(v, h) _OPT_CONTAINS_OR_WITH("without", v, h, 
> PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
>
> +#define OPT_CONTAINS_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("contains", v, h, 
> PARSE_OPT_NONEG)
> +#define OPT_NO_CONTAINS_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("no-contains", 
> v, h, PARSE_OPT_NONEG)
> +#define OPT_WITH_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("with", v, h, 
> PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
> +#define OPT_WITHOUT_STRS(v, h) _OPT_CONTAINS_OR_WITH_STRS("without", v, h, 
> PARSE_OPT_HIDDEN | PARSE_OPT_NONEG)
> +


> diff --git a/t/t7013-tag-contains.sh b/t/t7013-tag-contains.sh
> new file mode 100755
> index 0..65119dada
> --- /dev/null
> +++ b/t/t7013-tag-contains.sh

Thanks for adding the tests into a new file instead of putting it somewhere
where it is already convenient. (We have too many of those "just add it there
as it is easiest to fit in")

> +
> +test_expect_success 'tag --contains ' '
> +   ! (git tag --contains "v1.0" 2>&1 | grep -o "usage")

In the test suite we assume everything but Git flawless, but
Git *may* be faulty. What if Git crashes (segfault) ?
Then this test is still producing a valid "ok" as grep
doesn't find "usage". This pattern of piping output of
Git into other commands is around the test suite unfortunately,
but we'd want to not add this pattern in new code. So maybe:

git tag --contains v1.0 2 >error &&
grep "usage" err

Another thing on this: we'd want to check the return code of
git tag in this case.

In case of an error in parse-opt we error out with 129
just as git would without this patch:

 $ git tag --contains=a6d7eb2c7a6a402a938824bcf1c5f331dd1a06bc
error: no such commit a6d7eb2c7a6a402a938824bcf1c5f331dd1a06bc
usage: git tag [-a | -s | -u ] [-f] [-m  | -F ]
 []
   or: git tag -d ...
   or: git tag -l [-n[]] [--contains ] [--no-contains
] [--points-at ]
[]
 $ echo $?
129

But after applying this patch, we also do not want to have 129
as the return code as that indicates that parsing the options was
unsuccessful. Instead we want to see that the --contains operation was
unsucessful, maybe?

So I'd expect the return code to be 0 (if we don't care) or 1
(if we do care), in the case of 1, we shall write:

  test_must_fail git tag --contains ... &&
  grep 

(A long way of hinting at the 

Re: Git should preserve modification times at least on request

2018-02-20 Thread Johannes Schindelin
Hi Peter,

On Tue, 20 Feb 2018, Peter Backes wrote:

> On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:
> 
> > I would probably invent a file format (``)
> 
> I'm stuck there because of  being munged.

>From which command do you want to get it? If you are looking at `git
diff`, you may want to use the `-z --name-only` options to avoid munging
the paths.

Or are you looking at a different source for the paths?

Ciao,
Johannes


Re: File locking issues with fetching submodules in parallel

2018-02-20 Thread Johannes Schindelin
Hi Stefan,

On Tue, 20 Feb 2018, Stefan Beller wrote:

> On Tue, Feb 20, 2018 at 8:20 AM, Johannes Schindelin
>  wrote:
> > Hi Stefan (and other submodule wizards),
> >
> > Igor Melnichenko reported a submodule problem in a Git for Windows ticket:
> >
> > https://github.com/git-for-windows/git/issues/1469
> >
> 
> Thanks for bringing this up. Is it better to discuss it here on at that
> issue?

It's up to you. I have no preference.

> > Part of it is due to Windows' behavior where you cannot read the same
> > file in one process while you write it in another process.
> >
> > The other part is how our submodule code works in parallel. In
> > particular, we seem to write the `.git` file maybe even every time a
> > submodule is fetched, unconditionally, not even testing whether the
> > .git file is already there and contains the correct contents?
> 
> only when they need to be newly cloned (as that is when the
> submodule--helper clone is invoked). When fetching, we'd not touch the
> '.git' file IIRC.

Okay. I saw multiple calls to the same connect_work_tree_and_git_dir()
function in submodule.c, and was not so sure that it was only clone.

But the bug report talks about `git submodule add` with a new URL and a
new submodule location, so I guess it is going down the clone path.

> > For some reason, the bug reporter saw a "Permission denied" on the
> > `.git` file when we try to write it (and I am pretty certain that I
> > tracked it down correctly to the `connect_work_tree_and_git_dir()`
> > function). The intermittent "Permission denied" error seems to suggest
> > that *another* process is accessing this file while we are writing it.
> 
> The reporter also reports this coming out of "git submodule add",
> which is not parallelized, but rather in the messy state of migrating
> as much code into C from shell, trying to not introduce to many bugs. ;)
> 
> So for add there is no parallelism I am aware of, which makes me wonder
> how the race is coming in there.

Hrm. You're right, there is no indication in that bug report that `git
submodule` spawns multiple processes in parallel.

> I recall making the decision to unconditionally write out the '.git' file
> for other operations like "git reset --hard --recurse-submodules" or
> "git checkout --force --recurse-submodules", as there you are really asking
> for a clean slate. I missed the Windows FS semantics for that case,
> so I guess re-reading the file to make sure it is already correct would
> avoid issues on Windows there. (though I do not yet recall how the
> parallel stuff would bite us there, as the file is written as the very
> first thing)

Maybe it is not parallel, but maybe it is simply an unclosed handle to the
.git file? I do not see anything in read_gitfile_gently() (which is the
only location reading the .git file I know of)... Did I miss anything?

> > It also seems that this problem becomes worse if the firewall is turned
> > on, in which case a couple of network operations become a little slower
> > (which I could imagine to be the factor making the problems more likely).
> >
> > A plausible workaround would be to write the `.git` file only when needed
> > (which also would fix a bug on macOS/Linux, although the window is
> > substantially smaller: the reader could potentially read a
> > partially-written file).
> >
> > But maybe we are simply holding onto an open handle to the `.git` file
> > too long?
> 
> That could very well be the case.

But as I said above, I fail to find any smoking gun in the source code.
The only code path I see that reads the .git file releases the resources
correctly AFAICT.

> > I tried to put together a bit more information here:
> >
> > https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746
> 
> Thanks!
> 
> > Do you think there is an easy solution for this? You're much deeper in the
> > submodule code than I ever want to be...
> 
> I'll take a look for open handles, though we're already using our
> wrapper functions
> like 'write_file', which is supposed to close the handle gracefully (and has
> undergone testing on Windows a lot as it is the standard write function)

Right, even some of my code uses this successfully...

Curious problem.

Ciao,
Dscho


Re: [BUG] Worktree prune complains about missing path

2018-02-20 Thread Eric Sunshine
On Tue, Feb 20, 2018 at 3:36 PM, Randall S. Becker
 wrote:
> I’m a bit confused about this, as I thought I understood worktrees :(.
>
> /home/randall/nsgit/test/test dir.mytest: rm -rf dest.wt
> /home/randall/nsgit/test/test dir.mytest/dest: git worktree prune -v
> Removing worktrees/dest.wt: gitdir file points to non-existent location
>
> It seems like one or two situations are a problem:
> 1) I’m using a full path for the worktree.
> 2) There’s a blank in my path – deliberate… tests, yanno.
>
> This is git 2.16.2. Could anyone shed some light on this?

This appears to be working as intended. "git worktree prune" is
telling you that it is pruning the administrative data for the
"dest.wt" worktree (reason: "worktree location no longer exists"),
which you intentionally deleted before pruning. It's not clear what it
is that you find confusing. There is not a lot more I can say without
understanding what behavior you were expecting and how your
expectation differs from the actual experience.

(Also, please consider how easy or difficult it is for a reader to
interpret your pasted "sample session". The one provided is more
confusing than necessary due to the command prompt bearing the same
path information as the output of the "git worktree list" command, as
well as unnecessary duplicated commands, such as "ls", and missing
"cd" commands which do not help to illuminate what it is you are
trying to get across. The pasted transcript also contains invalid
code-points which render as oddball characters -- or not at all --
which didn't help. Best would be to prepare a minimal example of shell
commands to reproduce the behavior you're trying to illustrate.)


Re: I'm trying to break "git pull --rebase"

2018-02-20 Thread Martin Langhoff
On Tue, Feb 20, 2018 at 5:00 PM, Julius Musseau  wrote:
> I was hoping to concoct a situation where "git pull --rebase" makes a
> mess of things.

It breaks quite easily with some workflows. They are all in the "don't
do that" territory.

Open a long-lived feature-dev branch, work on it. Other folks are
working on master. Merge master into feature-dev. Make sure some
merges might need conflict resolution.

Reorg some code on master, move files around. Code some more on
feature-dev branch. Merge master into feature-dev; the merge machinery
will probably cope with the code move, file renames. If it doesn't,
resolve it by hand.

Let all that simmer for a little bit.

Then try to rebase.

"Doctor, it hurts when I rebase after merging with conflict resolution... "

cheers,



m


Question about get_cached_commit_buffer()

2018-02-20 Thread Derrick Stolee
While working on my commit-graph patch [1] and using a local build in my 
usual workflows, I found a bug in my branch.


Essentially, when calling `git rev-list --header`, the header 
information is actually missing for many commits except the first. This 
was not caught in my testing since t6000-rev-list-misc.sh creates a test 
repo with only one commit.


The root cause is that the serialized commit graph gets its speedup by 
not loading buffers for every commit. For many use-cases (merge bases, 
--topo-order, etc.) we do not need the buffer for most of the commits. 
In the rev-list example, the buffer is loaded due to a side-effect of 
being referenced by HEAD or a branch. A similar effect happened in `git 
log`, hence the following change in my patch:


diff --git a/log-tree.c b/log-tree.c
index 580b3a9..14735d4 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -647,8 +647,7 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
 
-	if (!get_cached_commit_buffer(commit, NULL))

-   return;
+   get_commit_buffer(commit, NULL);
 
 	if (opt->show_notes) {

int raw;


In rev-list, the "--header" option outputs a value and expects the 
buffer to be cached. It outputs the header info only if 
get_cached_commit_buffer() returns a non-null buffer, giving incorrect 
output. If it called get_commit_buffer() instead, it would immediately 
call get_cached_commit_buffer() and on failure actually load the buffer.


This has not been a problem before, since the buffer was always loaded 
at some point for each commit (and saved because of the 
save_commit_buffer global).


I propose to make get_cached_commit_buffer() static to commit.c and 
convert all callers to get_commit_buffer(). Is there any reason to _not_ 
do this? It seems that there is no functional or performance change.


After the serialized commit graph exists and is used, the only change is 
that we delay loading the buffer until we need to read the commit 
metadata that is not stored in the graph (message, author/committer).


Thanks,
-Stolee

[1] 
https://public-inbox.org/git/4d1ee202-7d79-d73c-6e05-d0fc85db9...@gmail.com/T/#m381bfd3f2eafbd254e35a5147cd198bc35055e92

    [Patch v4 00/14] Serialized Git Commit Graph

[2] 
https://public-inbox.org/git/20140610214039.gj19...@sigill.intra.peff.net/

    [Patch 10/17] provide helpers to access the commit buffer


Re: I'm trying to break "git pull --rebase"

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 2:00 PM, Julius Musseau  wrote:
> Hi, Git Developers,
>
> I'm currently writing a blog post about "git pull --rebase".   The
> point of the blog post is to examine scenarios where two people are
> working together on a short-lived feature branch, where history
> rewrites are allowed, and where both are using "git pull --rebase" to
> stay in sync with each other.
>
> I was hoping to concoct a situation where "git pull --rebase" makes a
> mess of things.
>
> So far I have been unable to do this.  I tried version v1.7.2 of Git
> as well as version v2.14.1, and as far as I can tell, "git pull
> --rebase" is bulletproof.
>
> Does anyone here happen to know a situation where "git pull --rebase"
> makes a mess?

If you are inclined to experiment with submodules,
I would have an easy answer for you. :)

But instead of giving an answer myself (as I love reading about
things the usual mailing list folks miss),
maybe this is a good starting point to poke at things
https://github.com/git/git/commit/a6d7eb2c7a6a402a938824bcf1c5f331dd1a06bb

For the non-submodule use case, I would think pull is pretty solid,
as you lay out in your blog draft.

(If you finish by Wednesday 21st), you may be interested in
submitting to Git rev-news (or a later edition if you take time writing)
https://public-inbox.org/git/CAP8UFD1HPruE3N_0k8_TFreBML9V8K=ss8lqd-xkeeuhesm...@mail.gmail.com/

Thanks,
Stefan


Re: Git should preserve modification times at least on request

2018-02-20 Thread Peter Backes
Hi Jeff,

On Tue, Feb 20, 2018 at 04:16:34PM -0500, Jeff King wrote:
> I think there are some references buried somewhere in that wiki, but did
> you look at any of the third-party tools that store file metadata
> alongside the files in the repository? E.g.:
> 
>   https://etckeeper.branchable.com/
> 
> or
> 
>   https://github.com/przemoc/metastore
> 
> I didn't see either of those mentioned in this thread (though I also do
> not have personal experience with them, either).
> 
> Modification times are a subset of the total metadata you might care
> about, so they are solving a much more general problem. Which may also
> partially answer your question about why this isn't built into git. The
> general problem gets much bigger when you start wanting to carry things
> like modes (which git doesn't actually track; we really only care about
> the executable bit) or extended attributes (acls, etc).

I know about those, but that's not what I am looking for. Those tools 
serve entirely different purposes, ie., tracking file system changes. 
I, however, am specifically interested in version control.

In version control, the user checks out his own copy of the tree for 
working. For this purpose, it is thus pointless to track ownership, 
permissions (except for the x bit), xattrs, or any other metadata. In 
fact, it can be considered the wrong thing to do.

The modification time, however, is special. It clearly has its place in 
version control. It tells us when the last modification was actually 
done to the file. I am often working on some feature, and one part is 
finished and is lying around, but I am still working on other parts in 
other files. Then, maybe after some weeks, the other parts are 
finished. Now, when committing, the information about modification time 
is lost. Maybe some weeks later I want to figure out when I last 
modified those files that were committed. But that information is now 
gone, at least in the git repository. Sure, I could do lots of WIP 
commits, but this would clutter up the history unneccessarly and I 
would have lots of versions that might not even compile, let alone run.

As far as I remember, bitkeeper had this distinction between checkins 
and commits. You could check in a file at any time, and any number of 
times, and then group all those checkins together with a commit. Git 
seems to have avoided this principle, or have kept it only 
rudimentarily via git add (but git add cannot add more than one version 
of the same file). Perhaps for simplificiation of use, perhaps for 
simplification of implementation, I don't know.

I assume, if it were not for the build tool issues, git would have 
tracked mtime from the very start.

Best wishes
Peter
-- 
Peter Backes, r...@helen.plasma.xg8.de


I'm trying to break "git pull --rebase"

2018-02-20 Thread Julius Musseau
Hi, Git Developers,

I'm currently writing a blog post about "git pull --rebase".   The
point of the blog post is to examine scenarios where two people are
working together on a short-lived feature branch, where history
rewrites are allowed, and where both are using "git pull --rebase" to
stay in sync with each other.

I was hoping to concoct a situation where "git pull --rebase" makes a
mess of things.

So far I have been unable to do this.  I tried version v1.7.2 of Git
as well as version v2.14.1, and as far as I can tell, "git pull
--rebase" is bulletproof.

Does anyone here happen to know a situation where "git pull --rebase"
makes a mess?

Here's a draft of the blog post:

Title:  "(Too much) fun with git pull --rebase"

https://mergebase.com/doing-git-wrong/2018/02/17/fun-with-git-pull-rebase/


Here are the "git pull --rebase" scenarios I've tested so far:

1.  origin/feature rebased against origin/master

2.  origin/feature squash-merged against origin/master

3.  origin/feature squashed in-place`

4.  origin/feature dropped a commit

5.  origin/feature insanity (adjusted merge-base, reversed commits,
squashed some commits)

6.  undo of 5


So far "git pull --rebase" does the exact right thing in every case!

If anyone knows a scenario where "git pull --rebase" fails to do the
right thing, I would be very grateful to hear of it.



Thanks!

yours sincerely,

Julius Musseau


Re: [PATCH v4 03/13] commit-graph: create git-commit-graph builtin

2018-02-20 Thread Junio C Hamano
Derrick Stolee  writes:

> +int cmd_commit_graph(int argc, const char **argv, const char *prefix)
> +{
> + static struct option builtin_commit_graph_options[] = {
> + { OPTION_STRING, 'p', "object-dir", _dir,
> + N_("dir"),
> + N_("The object directory to store the graph") },

I have a suspicion that this was modeled after some other built-in
that has a similar issue (perhaps written long time ago), but isn't
OPT_STRING() sufficient to define this element these days?

Or am I missing something?

Why squat on short-and-sweet "-p"?  For that matter, since this is
not expected to be end-user facing command anyway, I suspect that we
do not want to allocate a single letter option from day one, which
paints ourselves into a corner from where we cannot escape.


Re: [PATCH v4 02/13] graph: add commit graph design document

2018-02-20 Thread Junio C Hamano
Derrick Stolee  writes:

> +2. Walking the entire graph to avoid topological order mistakes.

You have at least one more mention of "topological order mistakes"
below, but we commonly refer to this issue and blame it for "clock
skew".  Using the word highlights that there is no "mistake" in topo
order algorithm and mistakes are in the commit timestamps.

> +In practice, we expect some commits to be created recently and not stored
> +in the commit graph. We can treat these commits as having "infinite"
> +generation number and walk until reaching commits with known generation
> +number.

Hmm, "pretend infinity" is an interesting approach---I need to think
about it a bit more if it is sufficient.

> +- .graph files are managed only by the 'commit-graph' builtin. These are not
> +  updated automatically during clone, fetch, repack, or creating new commits.

OK.  s/builtin/subcommand/; it does not make much difference if it
is a built-in or standalone command.

> +- There is no 'verify' subcommand for the 'commit-graph' builtin to verify
> +  the contents of the graph file agree with the contents in the ODB.

I am not entirely sure about the merit of going into this level of
detail.  Being able to use only a single file looks like a more
fundamental design limitation, which deserves to be decribed in this
section, and we could ship the subsystem with that limitation.

But the lack of verify that can be called from fsck is merely the
matter of not the subsystem being mature enough (to be written,
reviewed and tested) and not a fundamental one, and we will not be
shipping the subsystem until that limitation is lifted.

So I'd guess that we prefer this bullet item to be in the commit log
message, not here, that describes the current status of the
development (as opposed to the state of the subsystem).

> +- Generation numbers are not computed in the current version. The file
> +  format supports storing them, along with a mechanism to upgrade from
> +  a file without generation numbers to one that uses them.

Exactly the same comment as above applies to this item.

> +- The commit graph is currently incompatible with commit grafts. This can be
> +  remedied by duplicating or refactoring the current graft logic.

Hmm.  Can it be lifted without first allowing us to use more than
one commit graph file (i.e. one for "traverse while honoring the
grafts", the other for "traverse while ignoring the grafts")?

> +- After computing and storing generation numbers, we must make graph
> +  walks aware of generation numbers to gain the performance benefits they
> +  enable. This will mostly be accomplished by swapping a commit-date-ordered
> +  priority queue with one ordered by generation number. The following
> +  operations are important candidates:
> +
> +- paint_down_to_common()
> +- 'log --topo-order'

Yes.

> +- The graph currently only adds commits to a previously existing graph.
> +  When writing a new graph, we could check that the ODB still contains
> +  the commits and choose to remove the commits that are deleted from the
> +  ODB. For performance reasons, this check should remain optional.

The last sentence is somehow unconvincing.  It probably is not
appropriate for the "Future Work" section to be making a hurried
design decision before having any working verification code to run
benchmark on.

> +- Currently, parse_commit_gently() requires filling in the root tree
> +  object for a commit. This passes through lookup_tree() and consequently
> +  lookup_object(). Also, it calls lookup_commit() when loading the parents.
> +  These method calls check the ODB for object existence, even if the
> +  consumer does not need the content. For example, we do not need the
> +  tree contents when computing merge bases. Now that commit parsing is
> +  removed from the computation time, these lookup operations are the
> +  slowest operations keeping graph walks from being fast. Consider
> +  loading these objects without verifying their existence in the ODB and
> +  only loading them fully when consumers need them. Consider a method
> +  such as "ensure_tree_loaded(commit)" that fully loads a tree before
> +  using commit->tree.

Very good idea.

> +- The current design uses the 'commit-graph' builtin to generate the graph.
> +  When this feature stabilizes enough to recommend to most users, we should
> +  add automatic graph writes to common operations that create many commits.
> +  For example, one coulde compute a graph on 'clone', 'fetch', or 'repack'
> +  commands.

s/coulde/could/.

Also do not forget "fsck" that calls "verify".  That is more urgent
than intergration with any other subcommand.

> +- A server could provide a commit graph file as part of the network protocol
> +  to avoid extra calculations by clients.

We need to assess the riskiness and threat models regarding this, if
we really want to follow this "could" through.  I would imagine that
the cost for verification is comparable 

Re: Fetch-hooks

2018-02-20 Thread Leo Gaspard
On 02/20/2018 08:42 AM, Jeff King wrote:>> [...]
>>
>> Is there a way for “pre-receive” to individually filter [refs]? I was
>> under the impression that the only way to do that was to use the
>> “update” hook, which was the reason I wanted to model it after “update”
>> rather than “pre-receive” (my use case being a check independent for
>> each pushed ref)
> 
> No, pre-receive is always atomic. However, that may actually be what you
> want, if the point is to keep untrusted data out of the repository. By
> rejecting the whole thing, we could in theory keep the objects from
> entering the repository at all. This is how the push side works via the
> "quarantine" system (which is explained in git-receive-pack(1)).
> Fetching doesn't currently quarantine objects, but it probably wouldn't
> be very hard to make it do so.

Oh, I didn't think about quarantining behavior, indeed!

So I guess, following your answer as well as Jake's, I'll try to
implement a pre-receive-like hook, and will come back to this list when
I'll have a tentative implementation. Thanks for the advice! :)


Re: Git should preserve modification times at least on request

2018-02-20 Thread Jeff King
On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote:

> please ensure to CC me if you reply as I am not subscribed to the list.
> 
> https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preserving_modification_time_on_files.3F
>  
> argues that git isn't preserving modification times because it needs to 
> ensure that build tools work properly.

I think there are some references buried somewhere in that wiki, but did
you look at any of the third-party tools that store file metadata
alongside the files in the repository? E.g.:

  https://etckeeper.branchable.com/

or

  https://github.com/przemoc/metastore

I didn't see either of those mentioned in this thread (though I also do
not have personal experience with them, either).

Modification times are a subset of the total metadata you might care
about, so they are solving a much more general problem. Which may also
partially answer your question about why this isn't built into git. The
general problem gets much bigger when you start wanting to carry things
like modes (which git doesn't actually track; we really only care about
the executable bit) or extended attributes (acls, etc).

-Peff


Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-20 Thread Jeff King
On Tue, Feb 20, 2018 at 09:44:51PM +0100, Martin Ågren wrote:

> For finding this sort of low-hanging fruit (windfall?) any local
> scripting and greping will do. I sort of hesitate wasting community CPU
> cycles on something which might be appreciated, in theory, but that
> doesn't have enough attention to move beyond "every now and then,
> something is fixed, but just as often, tests gain leaks and the
> blacklist gains entries".
> 
> To sum up: I probably won't be looking into Travis-ing such a blacklist
> in the near future.

Yeah, I think that's the right path for now. Hopefully we can get close
to zero leaks at some point, and then think about adding this kind of
infrastructure. Thanks for talking it through.

-Peff


Re: Git should preserve modification times at least on request

2018-02-20 Thread Peter Backes
Hi Johannes,

On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:
> If I were you [...]

It seems all pretty straight forward, except for

> I would probably invent a file format (``)

I'm stuck there because of  being munged.

To obtain or set the mtime of the file, I need the unmunged path.

How to get it?



What follows is irrelevant for progress.

> I don't think that code was ever there. Maybe you heard about some file
> mode being preserved overzealously (we stored the octal file mode
> verbatim, but then decided to store only 644 or 755).

I'm not sure. I'm not able to find that source anymore, though.

> As you can see from the code decoding a tree entry:
> 
> https://github.com/git-for-windows/git/blob/e1848984d/tree-walk.c#L25-L52
> 
> there is no mtime at all in the on-disk format of tree objects. There is
> the hash, the mode, and the file name.

I didn't comletely get the code in tree-walk.c since the parsing 
architecture seems to pass around pointers via global variables. 
It seems that in addition to hash, mode and file name, the on-disk 
format has at least the object type, see git cat-file -p master^{tree} 
Perhaps I got it wrong.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de


Re: Is there any way to "interrupt" a rebase?

2018-02-20 Thread Jeff King
On Tue, Feb 20, 2018 at 12:44:51PM +0100, Johannes Schindelin wrote:

> > It might be even possible to design a new subcommand for the interactive
> > rebase to facilitate a variation of this strategy (possibly even making
> > use of the fact that the interactive rebase accumulates mappings between
> > the original commits and the rewritten ones in
> > $GIT_DIR/rebase-merge/rewritten-list, intended for use in the post-rewrite
> > hook).
> 
> This feature might look somewhat like this:
> 
>   git rebase --replay-latest-commits 3
> 
> and it would not even have to look at the `rewritten-list`. All it would
> do is to put back the latest `pick` from the `done` file (in case of merge
> conflicts) into the `git-rebase-todo` file, then insert `pick lines for
> HEAD~3.. at the beginning of that todo file, and then `git reset --hard
> HEAD~3`.

Keep in mind that the "pick" lines could be "edit", "squash", etc.

I think the general form of your original email's proposal is something
like: What if we had a "git rebase --rewind" that could "undo" the prior
command? So if I had a todo file like:

  pick 1
  edit 2
  x make test
  edit 3
  x make test
  pick 4

and I failed at the second "make test", then I'd have:

  pick 1
  edit 2
  x make test
  edit 3
  x make test

in the "done" file, with the final pick remaining in "todo". Could I
then ask to "rewind" my state by moving "x make test" back to the
"todo". And two rewinds would get me back to applying patch 3, which I
could then fix up and re-run my test. Or four rewinds would get me back
to patch 2, which maybe is where I made the initial mistake.

That's a bit more primitive than what you're proposing in this
follow-on, because you'd be doing the replay yourself (unless we remap
the commits). But it's very easy to reason about and implement.

Anyway, just musing at this point. I haven't thought it through, but I
like the direction of everything you're saying. ;)

-Peff


Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-20 Thread Junio C Hamano
Derrick Stolee  writes:

>  Documentation/technical/commit-graph-format.txt | 90 
> +
>  1 file changed, 90 insertions(+)
>  create mode 100644 Documentation/technical/commit-graph-format.txt

Hopefully just a few remaining nits.  Overall I find this written
really clearly.

> +== graph-*.graph files have the following format:
> +
> +In order to allow extensions that add extra data to the graph, we organize
> +the body into "chunks" and provide a binary lookup table at the beginning
> +of the body. The header includes certain values, such as number of chunks,
> +hash lengths and types.

We no longer have lengths stored.

> + ...
> +  The remaining data in the body is described one chunk at a time, and
> +  these chunks may be given in any order. Chunks are required unless
> +  otherwise specified.

It is good that this explicitly says chunks can come in any order,
and which ones are required.  It should also say which chunk can
appear multiple times.  I think all four chunk types we currently
define can have at most one instance in a file.

> +CHUNK DATA:
> +
> +  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
> +  The ith entry, F[i], stores the number of OIDs with first
> +  byte at most i. Thus F[255] stores the total
> +  number of commits (N).
> +
> +  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
> +  The OIDs for all commits in the graph, sorted in ascending order.

Somewhere in this document, we probably would want to say that this
format allows at most (1<<31)-1 commits recorded in the file (as
CGET and EDGE uses 31-bit uint to index into this table, using MSB
for other purposes, and the all-1-bit pattern is also special), and
when we refer to "int-ids" of a commit, it is this 31-bit number
that is an index into this table.

> +  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
> +* The first H bytes are for the OID of the root tree.
> +* The next 8 bytes are for the int-ids of the first two parents
> +  of the ith commit. Stores value 0x if no parent in that
> +  position. If there are more than two parents, the second value
> +  has its most-significant bit on and the other bits store an array
> +  position into the Large Edge List chunk.
> +* The next 8 bytes store the generation number of the commit and
> +  the commit time in seconds since EPOCH. The generation number
> +  uses the higher 30 bits of the first 4 bytes, while the commit
> +  time uses the 32 bits of the second 4 bytes, along with the lowest
> +  2 bits of the lowest byte, storing the 33rd and 34th bit of the
> +  commit time.
> +
> +  Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
> +  This list of 4-byte values store the second through nth parents for
> +  all octopus merges. The second parent value in the commit data stores
> +  an array position within this list along with the most-significant bit
> +  on. Starting at that array position, iterate through this list of 
> int-ids
> +  for the parents until reaching a value with the most-significant bit 
> on.
> +  The other bits correspond to the int-id of the last parent.
> +
> +TRAILER:
> +
> + H-byte HASH-checksum of all of the above.
> +


Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-20 Thread Martin Ågren
On 19 February 2018 at 22:29, Jeff King  wrote:
> On Wed, Feb 14, 2018 at 10:56:37PM +0100, Martin Ågren wrote:
>
>> Here's what a list of known leaks might look like. It feels a bit
>> awkward to post a known-incomplete list (I don't run all tests). Duy
>> offered to pick up the ball if I gave up, maybe you could complete and
>> post this as your own? :-? Even if I (or others) can't reproduce the
>> complete list locally, regressions will be trivial to find, and newly
>> leak-free tests fairly easy to notice.
>
> I didn't think about that when I posted my scripts. In general, it's OK
> to me if you miss a script when you generate the "leaky" list. But if
> you skip it, you cannot say whether it is leaky or not, and should
> probably neither add nor remove it from the known-leaky list. So I think
> the second shell snippet needs to become a little more clever about
> skipped test scripts.
>
> Even that isn't 100% fool-proof, as some individual tests may be skipped
> or not skipped on various platforms. But it may be enough in practice
> (and eventually we'd have no known-leaky tests, of course ;) ).

All good points.

> Or alternatively, we could just not bother with checking this into the
> repository, and it becomes a local thing for people interested in
> leak-testing. What's the value in having a shared known-leaky list,
> especially if we don't expect most people to run it.

This sums up my feeling about this.

> I guess it lets us add a Travis job to do the leak-checking, which might
> get more coverage. So maybe if we do have an in-repo known-leaky, it
> should match some canonical Travis environment. That won't give us
> complete coverage, but at this point we're just trying to notice
> low-hanging fruit.

Right. A Travis job to get non-complete but consistent data would be
"nice" (TM), but at this point it would perhaps just result in
blacklist-churning. At some point, adding/removing entries in a
blacklist will have some signal-value and psychology to it, but right
now it might just be noise. (My patch adds an incomplete blacklist of
539 scripts...)

For finding this sort of low-hanging fruit (windfall?) any local
scripting and greping will do. I sort of hesitate wasting community CPU
cycles on something which might be appreciated, in theory, but that
doesn't have enough attention to move beyond "every now and then,
something is fixed, but just as often, tests gain leaks and the
blacklist gains entries".

To sum up: I probably won't be looking into Travis-ing such a blacklist
in the near future.

Martin


Re: File locking issues with fetching submodules in parallel

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 8:20 AM, Johannes Schindelin
 wrote:
> Hi Stefan (and other submodule wizards),
>
> Igor Melnichenko reported a submodule problem in a Git for Windows ticket:
>
> https://github.com/git-for-windows/git/issues/1469
>

Thanks for bringing this up. Is it better to discuss it here on at that issue?

> Part of it is due to Windows' behavior where you cannot read the same file
> in one process while you write it in another process.
>
> The other part is how our submodule code works in parallel. In particular,
> we seem to write the `.git` file maybe even every time a submodule is
> fetched, unconditionally, not even testing whether the .git file is
> already there and contains the correct contents?

only when they need to be newly cloned (as that is when the
submodule--helper clone is invoked). When fetching, we'd not touch
the '.git' file IIRC.

> For some reason, the bug reporter saw a "Permission denied" on the `.git`
> file when we try to write it (and I am pretty certain that I tracked it
> down correctly to the `connect_work_tree_and_git_dir()` function). The
> intermittent "Permission denied" error seems to suggest that *another*
> process is accessing this file while we are writing it.

The reporter also reports this coming out of "git submodule add",
which is not parallelized, but rather in the messy state of migrating
as much code into C from shell, trying to not introduce to many bugs. ;)

So for add there is no parallelism I am aware of, which makes me wonder
how the race is coming in there.

I recall making the decision to unconditionally write out the '.git' file
for other operations like "git reset --hard --recurse-submodules" or
"git checkout --force --recurse-submodules", as there you are really asking
for a clean slate. I missed the Windows FS semantics for that case,
so I guess re-reading the file to make sure it is already correct would
avoid issues on Windows there. (though I do not yet recall how the
parallel stuff would bite us there, as the file is written as the very
first thing)

> It also seems that this problem becomes worse if the firewall is turned
> on, in which case a couple of network operations become a little slower
> (which I could imagine to be the factor making the problems more likely).
>
> A plausible workaround would be to write the `.git` file only when needed
> (which also would fix a bug on macOS/Linux, although the window is
> substantially smaller: the reader could potentially read a
> partially-written file).
>
> But maybe we are simply holding onto an open handle to the `.git` file too
> long?

That could very well be the case.

> I tried to put together a bit more information here:
>
> https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746

Thanks!

> Do you think there is an easy solution for this? You're much deeper in the
> submodule code than I ever want to be...

I'll take a look for open handles, though we're already using our
wrapper functions
like 'write_file', which is supposed to close the handle gracefully (and has
undergone testing on Windows a lot as it is the standard write function)

Thanks,
Stefan


[BUG] Worktree prune complains about missing path

2018-02-20 Thread Randall S. Becker
I’m a bit confused about this, as I thought I understood worktrees :(.

/home/randall/nsgit/test/test dir.mytest/dest git worktree list
/home/randall/nsgit/test/test dir.mytest/dest: git worktree list
/home/randall/nsgit/test/test dir.mytest/dest 4e901ca [master]
/home/randall/nsgit/test/test dir.mytest/dest.wt  4e901ca [dest.wt]

/home/randall/nsgit/test/test dir.mytest/dest: find . -name dest.wt
./.git/logs/refs/heads/dest.wt
./.git/refs/heads/dest.wt
./.git/worktrees/dest.wt
/home/randall/nsgit/test/test dir.mytest/dest/.git: cd worktrees/dest.wt
/home/randall/nsgit/test/test dir.mytest/dest/.git/worktrees/dest.wt: ls
HEAD  ORIG_HEAD  commondir  gitdir  index  logs

/home/randall/nsgit/test/test dir.mytest/dest/.git/worktrees/dest.wt: cat
gitdir
/home/randall/nsgit/test/test dir.mytest/dest.wt/.git

/home/randall/nsgit/test/test dir.mytest/dest/.git/worktrees/dest.wt: ls
HEAD  ORIG_HEAD  commondir  gitdir  index  logs

/home/randall/nsgit/test/test dir.mytest/dest/.git/worktrees/dest.wt: cd
../../../..
/home/randall/nsgit/test/test dir.mytest: rm -rf dest.wt
/home/randall/nsgit/test/test dir.mytest: cd dest
/home/randall/nsgit/test/test dir.mytest/dest: git worktree prune -v
Removing worktrees/dest.wt: gitdir file points to non-existent location

It seems like one or two situations are a problem:
1) I’m using a full path for the worktree. 
2) There’s a blank in my path – deliberate… tests, yanno.

This is git 2.16.2. Could anyone shed some light on this?

Cheers,
Randall




[PATCH] submodule: indicate that 'submodule.recurse' doesn't apply to clone

2018-02-20 Thread Brandon Williams
Update the documentation for the 'submodule.recurse' config to identify
that the clone command does not respect it.

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10..59ff29d7a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3210,7 +3210,8 @@ submodule.active::
 
 submodule.recurse::
Specifies if commands recurse into submodules by default. This
-   applies to all commands that have a `--recurse-submodules` option.
+   applies to all commands that have a `--recurse-submodules` option,
+   except `clone`.
Defaults to false.
 
 submodule.fetchJobs::
-- 
2.16.1.291.g4437f3f132-goog



Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 11:55 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The step to take an object store would just add expressiveness
>> to the code, which may help in understanding what part of the code is
>> related to what other part of the code, so it may be a readability gain
>> on its own?
>
> It certainly would allow you to have a standalone object store that
> is not associated with *any* repository, but if we are not using
> such a layout, I doubt it would be a readability gain to add excess
> and unexercised expressiveness to the code.

So you favor v1?
Duy seems to be ok with v1 too if there is consensus that it is best
(or rather "if it makes Stefan's life hell, it's not worth doing.")[1].

I will try to resend that v1 shortly[2], as the only actual concern about the
code was where one struct was defined[3]. All other discussion was
meta-level, which direction to go.

Thanks for the clarification!
Stefan

[1] 
https://public-inbox.org/git/cacsjy8cpkese8atc_ewdnvknqyp9t6ebwkwcdzlhyafkh2b...@mail.gmail.com/
[2] https://public-inbox.org/git/20180213012241.187007-1-sbel...@google.com/
[3] https://public-inbox.org/git/20180213185120.ga108...@google.com/


Re: [PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread Junio C Hamano
Todd Zullinger  writes:

> If replacing the non-ASCI apostrophes is the goal, aren't
> there a number of others in the same file worth cleaning up
> at the same time?
>
> $ grep '’' Documentation/gitsubmodules.txt
> the submodule’s working directory pointing to (i).
> superproject expects the submodule’s working directory to be at.
> When deinitialized or deleted (see below), the submodule’s Git
> but no submodule working directory. The submodule’s Git directory
> the superproject’s `$GIT_DIR/config` file, so the superproject’s history
> The deletion removes the superproject’s tracking data, which are
> The submodule’s working directory is removed from the file
>
> This does seem to be the only file which includes the
> non-ASCII apostrophe under Documentation.

Thanks for checking.  I agree that it is a good idea to clean the
whole file up at the same time.  The title would need to be updated,
as it will no longer be "fix misconversion".  The justification would
also become different---"the 'overhaul' did not just move the text
but made the apostrophe style changes and reverting that change is a
good thing" was a good justification for "fix misconversion", but
now we need to explain to the future readers of "git log" why we
prefer to turn all of them in this file to ASCII single quotes (just
saying "make it consistent" is sufficient).

Thanks.



Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Junio C Hamano
Stefan Beller  writes:

> The step to take an object store would just add expressiveness
> to the code, which may help in understanding what part of the code is
> related to what other part of the code, so it may be a readability gain
> on its own?

It certainly would allow you to have a standalone object store that
is not associated with *any* repository, but if we are not using
such a layout, I doubt it would be a readability gain to add excess
and unexercised expressiveness to the code.



Re: [PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 11:36 AM, Kaartic Sivaraam
 wrote:
> Hi,
>
> On Wednesday 21 February 2018 12:20 AM, Stefan Beller wrote:
>> Kaartic was the last to touch that file,
>> (as found via git log origin/master -- Documentation/gitsubmodules.txt),
>> sorry I did not find this in the review.
>>
>
> "Non-ASCII characters" made me dig into to this a little deeper as I
> generally don't use them particularly for text files.
>
> "git blame" points at d48034551 (submodules: overhaul documentation,
> 2017-06-22) as the offending comment.

Hah, good catch! Sorry for my assumptions there.


Re: About connection resuming

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 4:49 AM, chenzero  wrote:
> Hello,
> First, I am a user of git for about 2 years, I really appreciated you all to
> create this great useful software!

Hi, welcome to the mailing list!

> My encountered problem is:
> sometimes, the repo is big and because my networking is not stable(or my
> network proxy has
> some limitations), so, the clone will always fail.
> I tried following ways to solve this, however, not much success.
> 1. clone depth 1
> git clone --depth=1 https:///repo.git
> however, some repo even depth=1 might fail.(It seemed that my network proxy
> limit tcp connection.
> if transfer over 50M, it will break)
> 2. download bundle file.
> but no all git repo provide bundle files to download.
>
> After some investment on the code, I think, perhaps,
> if enhance the git-http-backend to support http header: Range, or
> Content-Range,
> maybe it will enable connection resuming.

This thought has come up before, for example
https://public-inbox.org/git/1473984742-12516-1-git-send-email-kevin.m.w...@gmail.com/
(Or you can search for "resumable clone" in that mailing list archive,
there are more interesting discussions)

But the current problem is that the bytes of a clone are not stable,
as the packs are ordered racily IIUC: So if you clone the same repository
(with the same branch values), the observed communication over the wire
may differ in the part when the pack file is transferred as the packfile is
packed up in a multi threaded fashion, that has no clear order defined.

> the problem of this way is: it needs to upgrade the current deployed
> "git-http-backend",
> and maybe much code need to change including git-remote-http etc.
>
> This is the very basic thought, and whether I should try other way ?
> Thanks a lot!

I think a more promising approach is to have support for references
inside the packfile for transfer, this could become one of the features
of the new protocol that is being worked on
https://public-inbox.org/git/20180207011312.189834-1-bmw...@google.com/
I do not find a reference for the idea, but Jonathan (cc'd) laid it out
well once upon a time. Jonathan do you have a mailing list reference
for references in packfiles?

Thanks,
Stefan


Re: [PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread Todd Zullinger
Hi,

marmot1123 wrote:
> In the 2nd and 4th paragraph of DESCRIPTION, there ware misconversions 
> `submodule’s`.
> It seems non-ASCII apostrophes, so I rewrite ASCII apostrophes.

If replacing the non-ASCI apostrophes is the goal, aren't
there a number of others in the same file worth cleaning up
at the same time?

$ grep '’' Documentation/gitsubmodules.txt
the submodule’s working directory pointing to (i).
superproject expects the submodule’s working directory to be at.
When deinitialized or deleted (see below), the submodule’s Git
but no submodule working directory. The submodule’s Git directory
the superproject’s `$GIT_DIR/config` file, so the superproject’s history
The deletion removes the superproject’s tracking data, which are
The submodule’s working directory is removed from the file

This does seem to be the only file which includes the
non-ASCII apostrophe under Documentation.

Some tests include it (intentionally) as does
contrib/credential/netrc/git-credential-netrc.

-- 
Todd
~~
The direct use of physical force is so poor a solution to the problem
of limited resources that it is commonly employed only by small
children and great nations.
-- David Friedman



Re: [PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread Kaartic Sivaraam
Hi,

On Wednesday 21 February 2018 12:20 AM, Stefan Beller wrote:
> Kaartic was the last to touch that file,
> (as found via git log origin/master -- Documentation/gitsubmodules.txt),
> sorry I did not find this in the review.
> 

"Non-ASCII characters" made me dig into to this a little deeper as I
generally don't use them particularly for text files.

"git blame" points at d48034551 (submodules: overhaul documentation,
2017-06-22) as the offending comment.


> Thanks for the patch!

Yeah, nice catch!


--
Kaartic

QUOTE

“It is impossible to live without failing at something, unless you live
so cautiously that you might as well not have lived at all – in which
case, you fail by default.”

  - J. K. Rowling



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] make hg-to-git compatible with python2.x and 3.x

2018-02-20 Thread Junio C Hamano
Junio C Hamano  writes:

> - map(lambda ..., collection) is not liked; use list comprehension.
> ...
> I am not sure about the change from map(lambda ...) to list
> comprehension, though.  Not that I have a preference for or against
> (I am not a Python person), but I do not know if this is a change
> necessary to run with Python 3 or if it is merely more preferred to
> use list comprehension.
> ...
>> -prnts = map(lambda x: x[:x.find(':')], prnts)
>> +prnts = [x[:x.find(':')] for x in prnts]

Well, I should have dug this myself a bit more [*1*] before hitting
[SEND].  This change is required because map() no longer returns a
list.  So the bullet item in the proposed commit log message in my
earlier message should be updated to say that instead--it was unclear
to me so I phrased as if it was a mere preference, but it actually
is a required change.


[Footnote]

*1* Well, it would have been ideal if the original patch had enough
information from the beginning to make this kind of "digging"
unnecessary ;-)






Re: [PATCH] make hg-to-git compatible with python2.x and 3.x

2018-02-20 Thread Junio C Hamano
Hervé Beraud  writes:

> Signed-off-by: Hervé Beraud 
> ---
>  contrib/hg-to-git/hg-to-git.py | 52 
> +-
>  1 file changed, 26 insertions(+), 26 deletions(-)

I think you shrunk the scope of the change, but I feel that it is
still a bit under-explained.  Let me try to write a proposed commit
log message and ask you to see if I understood the idea behind the
changes correctly:

Rewrite features that are no longer supported (or recommended)
in Python 3 in the script so that it can be used with both
Python 2 and 3, namely:

- print is not a statement; use print() function instead.
- dict.has_key(key) is no more; use "key in dict" instead.
- map(lambda ..., collection) is not liked; use list comprehension.

Hopefully this would also serve as an illustration of the kind of
things we want in our log message.

I am not sure about the change from map(lambda ...) to list
comprehension, though.  Not that I have a preference for or against
(I am not a Python person), but I do not know if this is a change
necessary to run with Python 3 or if it is merely more preferred to
use list comprehension.

Thanks.

> diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
> index de3f81667ed97..8fa7698df5c20 100755
> --- a/contrib/hg-to-git/hg-to-git.py
> +++ b/contrib/hg-to-git/hg-to-git.py
> @@ -42,7 +42,7 @@
>  
>  def usage():
>  
> -print """\
> +print("""\
>  %s: [OPTIONS] 
>  
>  options:
> @@ -54,7 +54,7 @@ def usage():
>  
>  required:
>  hgprj:  name of the HG project to import (directory)
> -""" % sys.argv[0]
> +""" % sys.argv[0])
>  
>  
> #--
>  
> @@ -104,29 +104,29 @@ def getgitenv(user, date):
>  if state:
>  if os.path.exists(state):
>  if verbose:
> -print 'State does exist, reading'
> +print('State does exist, reading')
>  f = open(state, 'r')
>  hgvers = pickle.load(f)
>  else:
> -print 'State does not exist, first run'
> +print('State does not exist, first run')
>  
>  sock = os.popen('hg tip --template "{rev}"')
>  tip = sock.read()
>  if sock.close():
>  sys.exit(1)
>  if verbose:
> -print 'tip is', tip
> +print('tip is', tip)
>  
>  # Calculate the branches
>  if verbose:
> -print 'analysing the branches...'
> +print('analysing the branches...')
>  hgchildren["0"] = ()
>  hgparents["0"] = (None, None)
>  hgbranch["0"] = "master"
>  for cset in range(1, int(tip) + 1):
>  hgchildren[str(cset)] = ()
>  prnts = os.popen('hg log -r %d --template "{parents}"' % 
> cset).read().strip().split(' ')
> -prnts = map(lambda x: x[:x.find(':')], prnts)
> +prnts = [x[:x.find(':')] for x in prnts]
>  if prnts[0] != '':
>  parent = prnts[0].strip()
>  else:
> @@ -154,15 +154,15 @@ def getgitenv(user, date):
>  else:
>  hgbranch[str(cset)] = "branch-" + str(cset)
>  
> -if not hgvers.has_key("0"):
> -print 'creating repository'
> +if "0" not in hgvers:
> +print('creating repository')
>  os.system('git init')
>  
>  # loop through every hg changeset
>  for cset in range(int(tip) + 1):
>  
>  # incremental, already seen
> -if hgvers.has_key(str(cset)):
> +if str(cset) in hgvers:
>  continue
>  hgnewcsets += 1
>  
> @@ -180,27 +180,27 @@ def getgitenv(user, date):
>  os.write(fdcomment, csetcomment)
>  os.close(fdcomment)
>  
> -print '-'
> -print 'cset:', cset
> -print 'branch:', hgbranch[str(cset)]
> -print 'user:', user
> -print 'date:', date
> -print 'comment:', csetcomment
> +print('-')
> +print('cset:', cset)
> +print('branch:', hgbranch[str(cset)])
> +print('user:', user)
> +print('date:', date)
> +print('comment:', csetcomment)
>  if parent:
> - print 'parent:', parent
> + print('parent:', parent)
>  if mparent:
> -print 'mparent:', mparent
> +print('mparent:', mparent)
>  if tag:
> -print 'tag:', tag
> -print '-'
> +print('tag:', tag)
> +print('-')
>  
>  # checkout the parent if necessary
>  if cset != 0:
>  if hgbranch[str(cset)] == "branch-" + str(cset):
> -print 'creating new branch', hgbranch[str(cset)]
> +print('creating new branch', hgbranch[str(cset)])
>  os.system('git checkout -b %s %s' % (hgbranch[str(cset)], 
> hgvers[parent]))
>  else:
> -print 'checking out branch', hgbranch[str(cset)]
> +print('checking out branch', hgbranch[str(cset)])
>  os.system('git checkout %s' % hgbranch[str(cset)])
>  
>  # merge
> @@ -209,7 +209,7 @@ 

Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 11:03 AM, Junio C Hamano  wrote:
> Jonathan Nieder  writes:
>
>> For what it's worth, I think I prefer v1.  I put some comments on why
>> on patch 0 of v1 and would be interested in your thoughts on them
>> (e.g. as a reply to that).  I also think that even if we want to
>> switch to a style that passes around object_store separately from
>> repository, it is easier to do the migration in two steps: first get
>> rid of hidden dependencies on the_repository, then do the (simpler)
>> automatic migration from
>>
>>  f(the_repository)
>>
>> to
>>
>>  f(the_repository->object_store)
>>
>> *afterwards*.
>>
>> Thoughts?
>
> Are we envisioning the future in which one repository has more than
> one object-store (I am counting an object store and its alternates
> that are pointed by its $GIT_OBJECT_DIRECTORY/info/alternates as a
> single logical "object store")?  If not, doing f(the_repository)
> migration, stopping there without f(the_repository->object_store)
> may perfectly be adequate, I would think.
>

I do not envision that, maybe Christian Couder does?
The partial clone world would be happy with just one object store
per repo, I would think.

The step to take an object store would just add expressiveness
to the code, which may help in understanding what part of the code is
related to what other part of the code, so it may be a readability gain
on its own?


Re: [PATCH 2/2] remote-curl: unquote incoming push-options

2018-02-20 Thread Brandon Williams
On 02/19, Jeff King wrote:
> The transport-helper protocol c-style quotes the value of
> any options passed to the helper via the "option  "
> directive. However, remote-curl doesn't actually unquote the
> push-option values, meaning that we will send the quoted
> version to the other side (whereas git-over-ssh would send
> the raw value).
> 
> The pack-protocol.txt documentation defines the push-options
> as a series of VCHARs, which excludes most characters that
> would need quoting. But:
> 
>   1. You can still see the bug with a valid push-option that
>  starts with a double-quote (since that triggers
>  quoting).
> 
>   2. We do currently handle any non-NUL characters correctly
>  in git-over-ssh. So even though the spec does not say
>  that we need to handle most quoted characters, it's
>  nice if our behavior is consistent between protocols.
> 
> There are two new tests: the "direct" one shows that this
> already works in the non-http case, and the http one covers
> this bugfix.

This seems like a fairly obvious fix.  If the value is quoted, unquote
it and send the unquoted value as a push-option, otherwise just send the
already unquoted value as a push-option.

Thanks for finding and fixing this :)

> 
> Reported-by: Jon Simons 
> Signed-off-by: Jeff King 
> ---
>  remote-curl.c   | 11 ++-
>  t/t5545-push-options.sh | 18 ++
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/remote-curl.c b/remote-curl.c
> index 6ec5352435..f5b3d22e26 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -13,6 +13,7 @@
>  #include "credential.h"
>  #include "sha1-array.h"
>  #include "send-pack.h"
> +#include "quote.h"
>  
>  static struct remote *remote;
>  /* always ends with a trailing slash */
> @@ -145,7 +146,15 @@ static int set_option(const char *name, const char 
> *value)
>   return -1;
>   return 0;
>   } else if (!strcmp(name, "push-option")) {
> - string_list_append(_options, value);
> + if (*value != '"')
> + string_list_append(_options, value);
> + else {
> + struct strbuf unquoted = STRBUF_INIT;
> + if (unquote_c_style(, value, NULL) < 0)
> + die("invalid quoting in push-option value");
> + string_list_append_nodup(_options,
> +  strbuf_detach(, 
> NULL));
> + }
>   return 0;
>  
>  #if LIBCURL_VERSION_NUM >= 0x070a08
> diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
> index c64dee2127..b47a95871c 100755
> --- a/t/t5545-push-options.sh
> +++ b/t/t5545-push-options.sh
> @@ -217,6 +217,15 @@ test_expect_success 'invalid push option in config' '
>   test_refs master HEAD@{1}
>  '
>  
> +test_expect_success 'push options keep quoted characters intact (direct)' '
> + mk_repo_pair &&
> + git -C upstream config receive.advertisePushOptions true &&
> + test_commit -C workbench one &&
> + git -C workbench push --push-option="\"embedded quotes\"" up master &&
> + echo "\"embedded quotes\"" >expect &&
> + test_cmp expect upstream/.git/hooks/pre-receive.push_options
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd
>  
> @@ -260,6 +269,15 @@ test_expect_success 'push options work properly across 
> http' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'push options keep quoted characters intact (http)' '
> + mk_http_pair true &&
> +
> + test_commit -C test_http_clone one &&
> + git -C test_http_clone push --push-option="\"embedded quotes\"" origin 
> master &&
> + echo "\"embedded quotes\"" >expect &&
> + test_cmp expect 
> "$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git/hooks/pre-receive.push_options
> +'
> +
>  stop_httpd
>  
>  test_done
> -- 
> 2.16.2.552.gea2a3cf654

-- 
Brandon Williams


Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Stefan Beller
On Tue, Feb 20, 2018 at 11:00 AM, Brandon Williams  wrote:
> On 02/20, Stefan Beller wrote:
>> On Fri, Feb 16, 2018 at 2:34 PM, Jonathan Nieder  wrote:
>> > Hi,
>> >
>> > Stefan Beller wrote:
>> >
>> >> v2:
>> >> * duplicated the 'ignore_env' bit into the object store as well
>> >> * the #define trick no longer works as we do not have a "the_objectstore" 
>> >> global,
>> >>   which means there is just one patch per function that is converted.
>> >>   As this follows the same structure of the previous series, I am still 
>> >> confident
>> >>   there is no hidden dependencies to globals outside the object store in 
>> >> these
>> >>   converted functions.
>> >> * rebased on top of current master, resolving the merge conflicts.
>> >>   I think I used the list.h APIs right, but please double check.
>> >
>> > For what it's worth, I think I prefer v1.  I put some comments on why
>> > on patch 0 of v1 and would be interested in your thoughts on them
>> > (e.g. as a reply to that).  I also think that even if we want to
>> > switch to a style that passes around object_store separately from
>> > repository, it is easier to do the migration in two steps: first get
>> > rid of hidden dependencies on the_repository, then do the (simpler)
>> > automatic migration from
>> >
>> >  f(the_repository)
>> >
>> > to
>> >
>> >  f(the_repository->object_store)
>> >
>> > *afterwards*.
>> >
>> > Thoughts?
>>
>> I would prefer to not spend more time on these conversions.
>> If Duy and you would come to a conclusion to either pick this
>> or the previous version I would be happy.
>>
>> I do not see the benefit in splitting up the series even further and
>> do this multistep f(repo) -> f(object store), as the cost in potential
>> merge conflicts is too high. Note that brian just sent another object
>> id conversion series, also touching sha1_file.c, which I am sure will
>> produce merge conflicts for Junio.
>>
>> For the next part 2 and onwards I'd be happy to take either this
>> strategy or Duys strategy as requested.
>
> I think Jonathan is trying to point out that converting to f(repo) maybe
> easier than converting to f(repo->object_store) upfront

I agree.

> which would make
> it easier to write the patches (which most of them are already written)

true, but for this series we also have the conversion to f(object_store)
written already.

> and to review because you can use the #define trick to make some sort of
> guarantees.

That is true, so it would be a tradeoff between reviewers and maintainers time?

> After we have successfully completed the migration to f(repo), then we
> can revisit the subsystems which want to have a clearer abstraction
> layer and make the jump to f(repo->object_store).

I would think we can take this series as-is and then move on with making
f(repo) abstractions, eventually moving to f(specialized-subsystem) as those
patches are not written yet (neither direct conversions, nor the repo trick;
the patches I already have need adaption which takes enough time on its own.)


>
> --
> Brandon Williams


Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Junio C Hamano
Jonathan Nieder  writes:

> For what it's worth, I think I prefer v1.  I put some comments on why
> on patch 0 of v1 and would be interested in your thoughts on them
> (e.g. as a reply to that).  I also think that even if we want to
> switch to a style that passes around object_store separately from
> repository, it is easier to do the migration in two steps: first get
> rid of hidden dependencies on the_repository, then do the (simpler)
> automatic migration from
>
>  f(the_repository)
>
> to
>
>  f(the_repository->object_store)
>
> *afterwards*.
>
> Thoughts?

Are we envisioning the future in which one repository has more than
one object-store (I am counting an object store and its alternates
that are pointed by its $GIT_OBJECT_DIRECTORY/info/alternates as a
single logical "object store")?  If not, doing f(the_repository)
migration, stopping there without f(the_repository->object_store)
may perfectly be adequate, I would think.




Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Brandon Williams
On 02/20, Stefan Beller wrote:
> On Fri, Feb 16, 2018 at 2:34 PM, Jonathan Nieder  wrote:
> > Hi,
> >
> > Stefan Beller wrote:
> >
> >> v2:
> >> * duplicated the 'ignore_env' bit into the object store as well
> >> * the #define trick no longer works as we do not have a "the_objectstore" 
> >> global,
> >>   which means there is just one patch per function that is converted.
> >>   As this follows the same structure of the previous series, I am still 
> >> confident
> >>   there is no hidden dependencies to globals outside the object store in 
> >> these
> >>   converted functions.
> >> * rebased on top of current master, resolving the merge conflicts.
> >>   I think I used the list.h APIs right, but please double check.
> >
> > For what it's worth, I think I prefer v1.  I put some comments on why
> > on patch 0 of v1 and would be interested in your thoughts on them
> > (e.g. as a reply to that).  I also think that even if we want to
> > switch to a style that passes around object_store separately from
> > repository, it is easier to do the migration in two steps: first get
> > rid of hidden dependencies on the_repository, then do the (simpler)
> > automatic migration from
> >
> >  f(the_repository)
> >
> > to
> >
> >  f(the_repository->object_store)
> >
> > *afterwards*.
> >
> > Thoughts?
> 
> I would prefer to not spend more time on these conversions.
> If Duy and you would come to a conclusion to either pick this
> or the previous version I would be happy.
> 
> I do not see the benefit in splitting up the series even further and
> do this multistep f(repo) -> f(object store), as the cost in potential
> merge conflicts is too high. Note that brian just sent another object
> id conversion series, also touching sha1_file.c, which I am sure will
> produce merge conflicts for Junio.
> 
> For the next part 2 and onwards I'd be happy to take either this
> strategy or Duys strategy as requested.

I think Jonathan is trying to point out that converting to f(repo) maybe
easier than converting to f(repo->object_store) upfront which would make
it easier to write the patches (which most of them are already written)
and to review because you can use the #define trick to make some sort of
guarantees.

After we have successfully completed the migration to f(repo), then we
can revisit the subsystems which want to have a clearer abstraction
layer and make the jump to f(repo->object_store).

-- 
Brandon Williams


Re: [PATCHv2 00/16] Moving global state into the repository object (part 1)

2018-02-20 Thread Stefan Beller
On Fri, Feb 16, 2018 at 2:34 PM, Jonathan Nieder  wrote:
> Hi,
>
> Stefan Beller wrote:
>
>> v2:
>> * duplicated the 'ignore_env' bit into the object store as well
>> * the #define trick no longer works as we do not have a "the_objectstore" 
>> global,
>>   which means there is just one patch per function that is converted.
>>   As this follows the same structure of the previous series, I am still 
>> confident
>>   there is no hidden dependencies to globals outside the object store in 
>> these
>>   converted functions.
>> * rebased on top of current master, resolving the merge conflicts.
>>   I think I used the list.h APIs right, but please double check.
>
> For what it's worth, I think I prefer v1.  I put some comments on why
> on patch 0 of v1 and would be interested in your thoughts on them
> (e.g. as a reply to that).  I also think that even if we want to
> switch to a style that passes around object_store separately from
> repository, it is easier to do the migration in two steps: first get
> rid of hidden dependencies on the_repository, then do the (simpler)
> automatic migration from
>
>  f(the_repository)
>
> to
>
>  f(the_repository->object_store)
>
> *afterwards*.
>
> Thoughts?

I would prefer to not spend more time on these conversions.
If Duy and you would come to a conclusion to either pick this
or the previous version I would be happy.

I do not see the benefit in splitting up the series even further and
do this multistep f(repo) -> f(object store), as the cost in potential
merge conflicts is too high. Note that brian just sent another object
id conversion series, also touching sha1_file.c, which I am sure will
produce merge conflicts for Junio.

For the next part 2 and onwards I'd be happy to take either this
strategy or Duys strategy as requested.

Stefan


Re: [PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread Stefan Beller
Kaartic was the last to touch that file,
(as found via git log origin/master -- Documentation/gitsubmodules.txt),
sorry I did not find this in the review.

Thanks for the patch!
Stefan


On Tue, Feb 20, 2018 at 3:48 AM, marmot1123  wrote:
> In the 2nd and 4th paragraph of DESCRIPTION, there ware misconversions 
> `submodule’s`.
> It seems non-ASCII apostrophes, so I rewrite ASCII apostrophes.
>
> Signed-off-by: Motoki Seki 
> ---
>  Documentation/gitsubmodules.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
> index 46cf120f666df..0d59ab4cdfb1c 100644
> --- a/Documentation/gitsubmodules.txt
> +++ b/Documentation/gitsubmodules.txt
> @@ -24,7 +24,7 @@ On the filesystem, a submodule usually (but not always - 
> see FORMS below)
>  consists of (i) a Git directory located under the `$GIT_DIR/modules/`
>  directory of its superproject, (ii) a working directory inside the
>  superproject's working directory, and a `.git` file at the root of
> -the submodule’s working directory pointing to (i).
> +the submodule's working directory pointing to (i).
>
>  Assuming the submodule has a Git directory at `$GIT_DIR/modules/foo/`
>  and a working directory at `path/to/bar/`, the superproject tracks the
> @@ -33,7 +33,7 @@ in its `.gitmodules` file (see linkgit:gitmodules[5]) of 
> the form
>  `submodule.foo.path = path/to/bar`.
>
>  The `gitlink` entry contains the object name of the commit that the
> -superproject expects the submodule’s working directory to be at.
> +superproject expects the submodule's working directory to be at.
>
>  The section `submodule.foo.*` in the `.gitmodules` file gives additional
>  hints to Gits porcelain layer such as where to obtain the submodule via
>
> --
> https://github.com/git/git/pull/459


Re: [PATCH v2 9/9] add -p: don't rely on apply's '--recount' option

2018-02-20 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> Now that add -p counts patches properly it should be possible to turn
> off the '--recount' option when invoking 'git apply'
>
> Signed-off-by: Phillip Wood 
> ---
>
> Notes:
> I can't think of a reason why this shouldn't be OK but I can't help
> feeling slightly nervous about it. I've made it a separate patch so it
> can be easily dropped or reverted if I've missed something.

It is a lot more preferrable approach to produce a patch with
information as precise as the producer (i.e. add -p) can and feed it
to the consumer (i.e. apply) than to rely on the --recount and the
--allow-overlap options that are mainly ways to force the consumer
make imprecise guesses and accepting potential garbage with looser
consistency checks.  So I very much like the direction (and the next
step may be to make sure we coalesce correctly so that we do not
have to use "--allow-overlap").

Thanks for working on this.  Let's see if the updated code really
"counts patches properly" as the log message claims to ;-)

>  git-add--interactive.perl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index 
> 3226c2c4f02d5f8679d77b8eede984fc727b422d..a64c0db57d62ab02ef718b8c8f821105132d9920
>  100755
> --- a/git-add--interactive.perl
> +++ b/git-add--interactive.perl
> @@ -677,7 +677,7 @@ sub add_untracked_cmd {
>  sub run_git_apply {
>   my $cmd = shift;
>   my $fh;
> - open $fh, '| git ' . $cmd . " --recount --allow-overlap";
> + open $fh, '| git ' . $cmd . " --allow-overlap";
>   print $fh @_;
>   return close $fh;
>  }


Re: [PATCH v2 1/9] add -i: add function to format hunk header

2018-02-20 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> This code is duplicated in a couple of places so make it into a
> function as we're going to add some more callers shortly.
>
> Signed-off-by: Phillip Wood 
> ---
>  git-add--interactive.perl | 21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index 
> 964c3a75420db4751cf11125b68b6904112632f1..8ababa6453ac4f57a925aacbb8b9bb1c973e4a54
>  100755

This is quite a tangent, but please refrain from using full object
name on the index line.  Because it does not add much value in the
context of patch exchange for project of this size to use more than
7-12 hexdigits each, the only effect of doing so is to push the mode
bit far to the right.


Von: Joy Kone

2018-02-20 Thread joy kone
Von: Joy Kone

Ich habe Ihnen diese E-Mail geschickt, weil Sie mit Ihnen diskutieren
müssen. Ich möchte nicht, dass Sie dieses Angebot in irgendeiner
Hinsicht missverstehen ... wenn es Ihnen recht ist, bitte ich um Ihre
volle Mitarbeit. Ich habe mit Ihnen Kontakt aufgenommen, um eine
Investition in Ihrem Land / Unternehmen in meinem Namen als
potentieller Partner zu führen.

Mein Name ist Joy Konei. ein Bürger aber wohnt hier. Es könnte Sie
interessieren zu wissen, dass ich US $ 10.500.000,00 bei einem
Finanzinstitut hinterlegt habe, das in Ihrem Land / Unternehmen
investiert werden soll. Es ist sachdienlich, mich wissen zu lassen, ob
Sie mit diesem Fonds / Ihrer Anlage in Ihrem Land umgehen können, um
Ihnen alle notwendigen Informationen über das Finanzinstitut für
weitere Informationen zu liefern. Inzwischen bin ich sehr ehrlich in
meinen Umgang mit Menschen und ich fordere auch dasselbe von dir als
Partner zu sein. Kann ich Ihnen diesen Fonds anvertrauen?

Ich möchte Sie darauf hinweisen, dass dies eine gemeinsame
Unternehmung ist, da dies eine Belohnung für Ihre Unterstützung ist.
Ich werde Ihnen Ihren Nutzen für Ihre Unterstützung mitteilen, während
wir fortfahren. Für eine umfassendere Details und Quelle des Fonds,
kontaktieren Sie mich bitte so schnell wie möglich. Wenn Sie diesen
Brief als anstößig empfinden, ignorieren Sie ihn bitte und akzeptieren
Sie meine Entschuldigung.

Grüße,
Freude Kone


Re: Why git-revert doesn't invoke the pre-commit and the commit-msg hooks?

2018-02-20 Thread Junio C Hamano
Phillip Wood  writes:

> ... I'm worried though that
> someone out there is scripting with a non-interactive editor which may
> break if we start verifying the message ...

This is a very valid concern.  Making sure that 'revert' pays
attention to the --verify option when given from the command line,
without changing the default or anything else, would satisfy
everybody's wish, I would think.  It won't bother existing users,
and in addition will now give an option to people to selectively run
pre-whatever hook when they choose to.

I am not sure if we should be running pre-commit/commit-msg for
revert.  The message and the effect of revert is quite distinct from
the ordinary commit, and it is very likely that people would want to
treat them differently.  It would make more sense (if we were to add
an option to run any hook we currently do not run to the command) to
run pre-revert/revert-msg hooks instead, and then people who happen
to want to do the same thing in these hooks what they do for
ordinary commits can just call their pre-commit/commit-msg hooks
from there, perhaps.


Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-20 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> Purge the index lines from diffs so we're not hard coding sha1 hash
> values in the expected output.

The motivation of this patch is clear, but all-zero object name for
missing side of deletion or creation patch should not change when we
transition to any hash function.  Neither the permission bits shown
in the output (and whether the index line has the bits are shown on
it in the first place, i.e. the index line of a creation patch does
not, whilethe one in a modification patch does).

So I am a bit ambivalent about this change.

Perhaps have a filter that redacts, instead of removes, selected
pieces of information that are likely to change while hash
transition, and use that consistently to filter both the expected
output and the actual output before comparing?



Re: [PATCH v2 3/9] t3701: use test_write_lines and write_script

2018-02-20 Thread Junio C Hamano
Eric Sunshine  writes:

>>  test_expect_success 'setup fake editor' '
>> -   echo "#!$SHELL_PATH" >fake_editor.sh &&
>> -   cat >>fake_editor.sh <<-\EOF &&
>> +   FAKE_EDITOR="$(pwd)/fake-editor.sh" &&
>> +   write_script "$FAKE_EDITOR" <<-\EOF &&
>> mv -f "$1" oldpatch &&
>> mv -f patch "$1"
>> EOF
>> -   chmod a+x fake_editor.sh &&
>> -   test_set_editor "$(pwd)/fake_editor.sh"
>> +   test_set_editor "$FAKE_EDITOR"
>>  '
>
> The very first thing that test_set_editor() does is set FAKE_EDITOR to
> the value of $1, so it is confusing to see it getting setting it here
> first; the reader has to spend extra brain cycles wondering if
> something non-obvious is going on that requires this manual
> assignment. Perhaps drop the assignment altogether and just write
> literal "fake_editor.sh" in the couple places it's needed (as was done
> in the original code)?

Yeah, I think $(pwd)/ prefix is needed at the final step (i.e. as
the first argument to test_set_editor) for this to be a faithful
rewrite but it is distracting having to write it anywhere else.

Other than that, this looks like a quite straight-forward cleanup.

Thanks, both.  Here is what I'd be queuing tentatively.

 t/t3701-add-interactive.sh | 33 +
 1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 39c7423069..4a369fcb51 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -87,13 +87,8 @@ test_expect_success 'setup expected' '
EOF
 '
 
-test_expect_success 'setup fake editor' '
-   >fake_editor.sh &&
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
-'
-
 test_expect_success 'dummy edit works' '
+   test_set_editor : &&
(echo e; echo a) | git add -p &&
git diff > diff &&
test_cmp expected diff
@@ -110,12 +105,10 @@ test_expect_success 'setup patch' '
 '
 
 test_expect_success 'setup fake editor' '
-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   write_script "fake_editor.sh" <<-\EOF &&
mv -f "$1" oldpatch &&
mv -f patch "$1"
EOF
-   chmod a+x fake_editor.sh &&
test_set_editor "$(pwd)/fake_editor.sh"
 '
 
@@ -302,18 +295,12 @@ test_expect_success 'deleting an empty file' '
 
 test_expect_success 'split hunk setup' '
git reset --hard &&
-   for i in 10 20 30 40 50 60
-   do
-   echo $i
-   done >test &&
+   test_write_lines 10 20 30 40 50 60 >test &&
git add test &&
test_tick &&
git commit -m test &&
 
-   for i in 10 15 20 21 22 23 24 30 40 50 60
-   do
-   echo $i
-   done >test
+   test_write_lines 10 15 20 21 22 23 24 30 40 50 60 >test
 '
 
 test_expect_success 'split hunk "add -p (edit)"' '
@@ -334,17 +321,7 @@ test_expect_success 'split hunk "add -p (edit)"' '
 '
 
 test_expect_failure 'split hunk "add -p (no, yes, edit)"' '
-   cat >test <<-\EOF &&
-   5
-   10
-   20
-   21
-   30
-   31
-   40
-   50
-   60
-   EOF
+   test_write_lines 5 10 20 21 30 31 40 50 60 >test &&
git reset &&
# test sequence is s(plit), n(o), y(es), e(dit)
# q n q q is there to make sure we exit at the end.


[PATCH] make hg-to-git compatible with python2.x and 3.x

2018-02-20 Thread Hervé Beraud
Signed-off-by: Hervé Beraud 
---
 contrib/hg-to-git/hg-to-git.py | 52 +-
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
index de3f81667ed97..8fa7698df5c20 100755
--- a/contrib/hg-to-git/hg-to-git.py
+++ b/contrib/hg-to-git/hg-to-git.py
@@ -42,7 +42,7 @@
 
 def usage():
 
-print """\
+print("""\
 %s: [OPTIONS] 
 
 options:
@@ -54,7 +54,7 @@ def usage():
 
 required:
 hgprj:  name of the HG project to import (directory)
-""" % sys.argv[0]
+""" % sys.argv[0])
 
 #--
 
@@ -104,29 +104,29 @@ def getgitenv(user, date):
 if state:
 if os.path.exists(state):
 if verbose:
-print 'State does exist, reading'
+print('State does exist, reading')
 f = open(state, 'r')
 hgvers = pickle.load(f)
 else:
-print 'State does not exist, first run'
+print('State does not exist, first run')
 
 sock = os.popen('hg tip --template "{rev}"')
 tip = sock.read()
 if sock.close():
 sys.exit(1)
 if verbose:
-print 'tip is', tip
+print('tip is', tip)
 
 # Calculate the branches
 if verbose:
-print 'analysing the branches...'
+print('analysing the branches...')
 hgchildren["0"] = ()
 hgparents["0"] = (None, None)
 hgbranch["0"] = "master"
 for cset in range(1, int(tip) + 1):
 hgchildren[str(cset)] = ()
 prnts = os.popen('hg log -r %d --template "{parents}"' % 
cset).read().strip().split(' ')
-prnts = map(lambda x: x[:x.find(':')], prnts)
+prnts = [x[:x.find(':')] for x in prnts]
 if prnts[0] != '':
 parent = prnts[0].strip()
 else:
@@ -154,15 +154,15 @@ def getgitenv(user, date):
 else:
 hgbranch[str(cset)] = "branch-" + str(cset)
 
-if not hgvers.has_key("0"):
-print 'creating repository'
+if "0" not in hgvers:
+print('creating repository')
 os.system('git init')
 
 # loop through every hg changeset
 for cset in range(int(tip) + 1):
 
 # incremental, already seen
-if hgvers.has_key(str(cset)):
+if str(cset) in hgvers:
 continue
 hgnewcsets += 1
 
@@ -180,27 +180,27 @@ def getgitenv(user, date):
 os.write(fdcomment, csetcomment)
 os.close(fdcomment)
 
-print '-'
-print 'cset:', cset
-print 'branch:', hgbranch[str(cset)]
-print 'user:', user
-print 'date:', date
-print 'comment:', csetcomment
+print('-')
+print('cset:', cset)
+print('branch:', hgbranch[str(cset)])
+print('user:', user)
+print('date:', date)
+print('comment:', csetcomment)
 if parent:
-   print 'parent:', parent
+   print('parent:', parent)
 if mparent:
-print 'mparent:', mparent
+print('mparent:', mparent)
 if tag:
-print 'tag:', tag
-print '-'
+print('tag:', tag)
+print('-')
 
 # checkout the parent if necessary
 if cset != 0:
 if hgbranch[str(cset)] == "branch-" + str(cset):
-print 'creating new branch', hgbranch[str(cset)]
+print('creating new branch', hgbranch[str(cset)])
 os.system('git checkout -b %s %s' % (hgbranch[str(cset)], 
hgvers[parent]))
 else:
-print 'checking out branch', hgbranch[str(cset)]
+print('checking out branch', hgbranch[str(cset)])
 os.system('git checkout %s' % hgbranch[str(cset)])
 
 # merge
@@ -209,7 +209,7 @@ def getgitenv(user, date):
 otherbranch = hgbranch[mparent]
 else:
 otherbranch = hgbranch[parent]
-print 'merging', otherbranch, 'into', hgbranch[str(cset)]
+print('merging', otherbranch, 'into', hgbranch[str(cset)])
 os.system(getgitenv(user, date) + 'git merge --no-commit -s ours "" %s 
%s' % (hgbranch[str(cset)], otherbranch))
 
 # remove everything except .git and .hg directories
@@ -233,12 +233,12 @@ def getgitenv(user, date):
 
 # delete branch if not used anymore...
 if mparent and len(hgchildren[str(cset)]):
-print "Deleting unused branch:", otherbranch
+print("Deleting unused branch:", otherbranch)
 os.system('git branch -d %s' % otherbranch)
 
 # retrieve and record the version
 vvv = os.popen('git show --quiet --pretty=format:%H').read()
-print 'record', cset, '->', vvv
+print('record', cset, '->', vvv)
 hgvers[str(cset)] = vvv
 
 if hgnewcsets >= opt_nrepack and opt_nrepack != -1:
@@ -247,7 +247,7 @@ def getgitenv(user, date):
 # write the state for incrementals
 if state:
 if verbose:
-print 'Writing state'
+print('Writing state')
 f = open(state, 'w')
 pickle.dump(hgvers, f)
 

--

Re: Git should preserve modification times at least on request

2018-02-20 Thread Hilco Wijbenga
On Mon, Feb 19, 2018 at 3:22 PM, Hilco Wijbenga
 wrote:
> Aside from exactly which modification times should be used (which I
> would love to have a bit more control over as well), something else
> I'd like to see is that, when switching between branches, files that
> are the same on both branches should not have their modification time
> changed.

As Junio pointed out to me, Git actually already does what I want when
switching branches. To verify, I switched between 5 branches after
setting a specific timestamp on a particular file, and it did not
change throughout the process. Now I'm left wondering when this
changed or whether my memory is faulty. I could have sworn this did
not work previously. :-)


Re: [RFC PATCH 0/1] Implement CMake build

2018-02-20 Thread Robert Dailey
On Thu, Jan 25, 2018 at 6:21 PM, Isaac Hier  wrote:
> Hi Jeff,
>
> I have been looking at the build generator, which looks promising, but
> I have one concern. Assuming I can generate a CMakeLists.txt that
> appropriately updates the library sources, etc. how do you suggest I
> handle new portability macros? For example, assume someone adds a
> macro HAVE_X to indicate the availability of some platform-specific
> function x. In the current Makefile, a comment would be added to the
> top indicating when HAVE_X or NO_X should be set, and that option
> would toggle the HAVE_X C macro. But CMake can test for the
> availability of x, which is one of the main motives for adding a CMake
> build. The current build generator uses the output of make, so all it
> would know is whether or not HAVE_X is defined on the platform that
> ran the Makefile, but not the entire list of platform that git
> supports.
>
> Bottom line: should I add the portability tests as they are now,
> without accounting for future portability macros? One good alternative
> might be to suggest the authors of new portability macros include a
> small sample C program to test it. That would allow me to easily patch
> the CMake tests whenever that came up. In a best case scenario, a
> practice could be established to write the test in a specific
> directory with a certain name so that I could automatically update the
> CMake tests from the build generator.

Isaac,

I'm very happy that you have started support for CMake. I have a lot
of experience with it. I'd love to help contribute. Do you have a fork
on github where this code is? I'd have to figure out how to apply a
patch from email, I haven't done it before. I think the goal should be
to replace the existing build system (this can be a transition that
happens slowly). I've been in situations where multiple build systems
are supported in parallel, worst case because of split personal
preferences on a project. That is more counterproductive than asking
the team to just compromise and take the initial hit on learning
curve. Ultimately that's up to the Git community, but that would be my
recommendation. But I think making CMake as complete as possible will
help build that confidence and trust. I can completely understand the
complexities and concerns they have.


Re: cherry-pick '-m' curiosity

2018-02-20 Thread Sergey Organov
"G. Sylvie Davies"  writes:

> On Mon, Feb 5, 2018 at 3:46 AM, Sergey Organov  wrote:
>> Hello,
>>
>> $ git help cherry-pick
>>
>> -m parent-number, --mainline parent-number
>>Usually you cannot cherry-pick a merge because you do not
>>know which side of the merge should be considered the
>>mainline.
>>
>> Isn't it always the case that "mainline" is the first parent, as that's
>> how "git merge" happens to work?
>>
>
> First-parent will be whatever commit you were sitting on when you
> typed "git merge".

Right, but I believe it's also "mainline", see below.

> If you're sitting on your branch and you type "git fetch; git merge
> origin/master", then "mainline" will be 2nd parent.

No. If you ever want to cherry-pick this commit, it'd still be -m1 side
of it that likely makes sense, and it's exactly the side that makes
sense to be picked that is called "mainline" in the manual page we are
discussing, and there is no any other definition of "mainline" as far as
I can tell.

There is nothing about 'origin/master' that makes it "mainline" from the
POV of future cherry-picks, if any, of this merge commit. I was also
unable to find any git documentation that calls 'origin/master'
"mainline". It's called "remote-tracking branch", or maybe sometimes
"upstream".

OTOH, when one merges something, he often merges "side branch" onto
"mainline", so in the context of this particular merge, your local
"master" happens to be "mainline" and "origin/master" happens to be
"side branch".

> "git revert -m" also has the same problem.

Yes, as it's essentially just "git cherry-pick --reverse -m", provided
cherry-pick has had the "--reverse" from regular "patch" utility [*].

It's also interesting to notice that manual page for "git revert" refers
to the revert-a-faulty-merge How-To, that in turn again uses only "git
revert -m 1".

Overall, it's still a mystery to me why "-m 1" is not the default
behavior for both "git revert" and "git cherry-pick".

The only suspicion I have is that actual intention is to deny picking
merge commits by default. Then, the usual git way would be to use
--force or --enable-merges to overcome the denial, but if we still do
need "-m 2" etc. even rarely, then rather re-using "-m 1" as "I mean it"
indication is only logical.

If my suspicion is true, how about something like this:

-m parent-number, --mainline parent-number
This option specifies the parent number (starting from 1) of a
commit and instructs cherry-pick to replay the change relative
to the specified parent. Cherry-pick will refuse to handle merge
commits unless this option is given.

Damn, it now has no "mainline" in the description at all, so it's
unclear why it has been called --mainline in the first place, not that
it was somehow clear to me before.

And while we are at it, I just stumbled over "git cherry-pick -m 1"
refusing to handle non-merge commits:

$ git cherry-pick -m1 dd5c320
error: Mainline was specified but commit 
dd5c320a300520a044cfa73d17f6cbffbbef60ef is not a merge.
fatal: cherry-pick failed
$ 

I wonder whether this is intentional? What's the rationale then? It
seems it could be useful to be able to cherry-pick multiple commits,
some of which are merges, no?

Footnote:

[*] rebase, revert, and cherry-pick all look rather similar in git and
could be calling for some unification.

-- Sergey


File locking issues with fetching submodules in parallel

2018-02-20 Thread Johannes Schindelin
Hi Stefan (and other submodule wizards),

Igor Melnichenko reported a submodule problem in a Git for Windows ticket:

https://github.com/git-for-windows/git/issues/1469

Part of it is due to Windows' behavior where you cannot read the same file
in one process while you write it in another process.

The other part is how our submodule code works in parallel. In particular,
we seem to write the `.git` file maybe even every time a submodule is
fetched, unconditionally, not even testing whether the .git file is
already there and contains the correct contents?

For some reason, the bug reporter saw a "Permission denied" on the `.git`
file when we try to write it (and I am pretty certain that I tracked it
down correctly to the `connect_work_tree_and_git_dir()` function). The
intermittent "Permission denied" error seems to suggest that *another*
process is accessing this file while we are writing it.

It also seems that this problem becomes worse if the firewall is turned
on, in which case a couple of network operations become a little slower
(which I could imagine to be the factor making the problems more likely).

A plausible workaround would be to write the `.git` file only when needed
(which also would fix a bug on macOS/Linux, although the window is
substantially smaller: the reader could potentially read a
partially-written file).

But maybe we are simply holding onto an open handle to the `.git` file too
long?

I tried to put together a bit more information here:

https://github.com/git-for-windows/git/issues/1469#issuecomment-366932746

Do you think there is an easy solution for this? You're much deeper in the
submodule code than I ever want to be...

Thanks,
Dscho


Re: Stackdump from stash save on Windows 10 64-bit

2018-02-20 Thread Johannes Schindelin
Hi Tim,

On Tue, 20 Feb 2018, Tim Mayo wrote:

> As of yesterday, stash save stopped working on my Windows 10 box - I get:
> 
>> git stash save
>Cannot save the current worktree state
> 
> and a stackdump file (see below).  This is with the 64-bit version of 2.16.1. 
>  Switching to the 32-bit version resolved the problem for me.

Can you please test with v2.16.2 (you can use the 64-bit Portable Git if
you want to keep your current installation), and if the problem still
persists please open a ticket here:

https://github.com/git-for-windows/git/issues/new

Ciao,
Johannes


git-gui, BUG: minimize/maximize buttons missing when repo is opened in gui

2018-02-20 Thread Birger Skogeng Pedersen
System: Ubuntu 17.10 Gnome

When opening git-gui from a directory which is a repository, minimize
and maximize buttons are showing and functional in git-gui.

However, if I open git-gui in a non-repo directory, git-gui opens a
dialog where I can "Create New Repository", "Clone Existing
Repository", etc. When a repository is opened from this dialog, the
git-gui window does not have minimize and maximize buttons visible.

Steps to reproduce:
1. Open git-gui in a non-repository directory
2. Click "Open Existing Repository"
3. Click "Browse" and select an existing repository directory
4. The repository is opened with git-gui, but without
minimize/maximize buttons visible


Duplicate safecrlf warning for racily clean index entry

2018-02-20 Thread Matt McCutchen
I noticed that if a file subject to a safecrlf warning is added to the
index in the same second that it is created, resulting in a "racily
clean" index entry, then a subsequent "git add" command prints another
safecrlf warning.  I reproduced this on the current "next"
(499d7c4f91).  The procedure:

$ git init
$ git config core.autocrlf true
$ echo foo >file1 && git add file1 && git add file1
warning: LF will be replaced by CRLF in file1.
The file will have its original line endings in your working directory.
warning: LF will be replaced by CRLF in file1.
The file will have its original line endings in your working directory.
$ echo bar >file2 && sleep 1 && git add file2 && git add file2
warning: LF will be replaced by CRLF in file2.
The file will have its original line endings in your working directory.

This came up when I ran the test suite for Braid on Windows
(https://github.com/cristibalan/braid/issues/77).

The phenomenon actually seems to be more general: touching the file
causes the next "git add" to print a safecrlf warning, suggesting that
the warning occurs whenever the index entry is dirty.  One could argue
that a new warning is reasonable after touching the file, but it seems
clear that "racy cleanliness" is an implementation detail that
shouldn't have user-visible nondeterministic effects.

In either case, if "git update-index --refresh" (or "git status") is
run before "git add", then "git add" does not print the warning.  On
the other hand, if line endings in the working tree file are changed,
then git shows the file as having an unstaged change, even though the
content that would be added to the index after CRLF conversion is
identical.  So it seems that git remembers the pre-conversion file
content and uses it for "git update-index --refresh" and would just
need to use it for "git add" as well.

Thoughts about the proposed change?  Does someone want to work on it or
give me a pointer to where to get started?

Thanks,
Matt


git-gui, feature request: add hotkeys to focus different widgets

2018-02-20 Thread Birger Skogeng Pedersen
To fully use git-gui with keyboard-only, a few more hotkeys are
needed. I am missing hotkeys to change focus between the "Unstanged
Changes", "Staged Changes", diff viewer and "Commit Message" widgets.
I would like to be able to stage, browse, unstage and commit in
git-gui, all without using the mouse.

I propose that CTRL+(number) could be used as hotkeys to change the
focus between the four widgets I've mentioned.


Best regards,
Birger Skogeng Pedersen


About connection resuming

2018-02-20 Thread chenzero

Hello,
First, I am a user of git for about 2 years, I really appreciated you 
all to create this great useful software!

My encountered problem is:
sometimes, the repo is big and because my networking is not stable(or my 
network proxy has

some limitations), so, the clone will always fail.
I tried following ways to solve this, however, not much success.
1. clone depth 1
git clone --depth=1 https:///repo.git
however, some repo even depth=1 might fail.(It seemed that my network 
proxy limit tcp connection.

if transfer over 50M, it will break)
2. download bundle file.
but no all git repo provide bundle files to download.

After some investment on the code, I think, perhaps,
if enhance the git-http-backend to support http header: Range, or 
Content-Range,

maybe it will enable connection resuming.
the problem of this way is: it needs to upgrade the current deployed 
"git-http-backend",

and maybe much code need to change including git-remote-http etc.

This is the very basic thought, and whether I should try other way ?
Thanks a lot!



Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-02-20 Thread Sergey Organov
Hi Igor,

Igor Djordjevic  writes:

> Hi Sergey,
>

[...]

>
> Even though this behavior is known and documented, it still left some 
> to be desired.
>
> Might be I`m missing something, but so far I like how described 
> approach just "feels right" (to me, for now), being really simple, 
> yet robust :)
>
> As my humble contribution to the cause and discussion itself, I`m 
> providing possibly naive, yet simple demonstration script[1], and 
> clearly showing where current `git rebase --preserve-merges` is 
> lacking (in my opinion, even though being expected and documented), 
> and how your proposal seems to improve the situation here.

Thanks for the test-case, -- it's nice to see this thingy working!

>
> Thanks for your thoughts, and hoping to see this going somewhere :)

So do I. Even if nobody volunteers to adopt it, I hope I'll be able to
implement something myself, not very soon though.

-- Sergey


[ANNOUNCE] Git for Windows 2.16.2

2018-02-20 Thread Johannes Schindelin
Dear Git users,

It is my pleasure to announce that Git for Windows 2.16.2 is available from:

https://gitforwindows.org/

Changes since Git for Windows v2.16.1(4) (February 7th 2018)

New Features

  * Comes with Git v2.16.2.
  * For every new Git for Windows version, .zip archives containing
.pdb files for some of Git for Windows' components are now
published alongside the new version.
  * Comes with MSYS2 runtime (Git for Windows flavor) based on Cygwin
2.10.0; This required rebuilding OpenSSH, Perl (and some Perl
modules) and Subversion.
  * Comes with Bash v4.4 patchlevel 019 .

Bug Fixes

  * The Perl upgrade in Git for Windows v2.16.1(4) broke interactive
authentication of git svn, which was fixed.
  * When configuring HTTPS transport to use Secure Channel, we now
refrain from configuring http.sslCAInfo. This also helps Git LFS
(which uses Git for Windows' private http.sslCAInfo setting) to use
the same credentials as git fetch and git push.

Filename | SHA-256
 | ---
Git-2.16.2-64-bit.exe | 
07e82be14e29d4f0c08ab0fbc06fcc8eb0fe01aa75dee7bf6afa37784e0ecb3c
Git-2.16.2-32-bit.exe | 
c7fa0a06385114b695b604490b444133899e4322bd416bd035007fab122128c4
PortableGit-2.16.2-64-bit.7z.exe | 
f51853689ad8a9e30759fb263a31bcd59753b3a97272b0e76a4210528a8631a1
PortableGit-2.16.2-32-bit.7z.exe | 
b5a91978956ac14f59ccdbc70c9c8d2f105e730e81ae8316a59791d33f3d6a87
MinGit-2.16.2-64-bit.zip | 
f4ac4e7d53d599d515e905824240cc2b82f3e2c294a872bb650e44b7e89cae8c
MinGit-2.16.2-32-bit.zip | 
322c727e482aa97522c64a5ac68bdda3780111e8670bcfb532beac8e11ece5da
MinGit-2.16.2-busybox-64-bit.zip | 
de698bd1f6ca240be8f27b63f027674cf1df20e168b03359af7a57e47b85537a
MinGit-2.16.2-busybox-32-bit.zip | 
08f469810ed5ecc497fc46fade03cbdf4d6539b702fd0560396b4c34c9efffde
Git-2.16.2-64-bit.tar.bz2 | 
23a493244ffc5facb264a32c0cc6cfa98378fb2ba61ec2928f0b2fa7ec156b06
Git-2.16.2-32-bit.tar.bz2 | 
7dcb4f29937ca2d744e9de53ad9f5ed4cbddf973901c756b92c7874473991bf0
pdbs-for-git-64-bit-2.16.2.1.e1848984d1-1.zip | 
8835317db3f63fdbdd636fde179b411574b3a6018607ad53322525e23f2b2463
pdbs-for-git-32-bit-2.16.2.1.e1848984d1-1.zip | 
add00cdd87a9d73ac5f45042272fc32f507addf6ef0e72cd299190412c35f690

Ciao,
Johannes


Re: Git should preserve modification times at least on request

2018-02-20 Thread Peter Backes
Hello Johannes,

On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:
> Oh, sorry. I understood your mail as if you had told the core Git
> developers that they should implement the feature you desire. I did not
> understand that you hinted at a discussion first, and that you would then
> go and implement the feature you asked for.

Well, sorry for being misunderstandable. It was my impression from the 
FAQ that the reason for why this feature doesn't exist was a strong 
opinion that it would cause technical problems. The FAQ doesn't mention 
anything like a lack of manpower. As I stated it was my 
impression that this feature would not be too hard to implement.

Because of this my email presupposed it was not manpower that prevented 
this feature.

My statement "Please provide options" was thus targeted at reviewing 
and discussing the perceived technical reasons for not implementing 
this feature at least as an option. It wasn't supposed to demand free 
lunch from anyone.

Of course I can offer to do some work to the best of my abilitites if 
that's the issue. That should go without saying for Free Software 
projects. Perhaps even my employer would be happy to pay me for 
implementing the feature during workign hours. This shouldn't be the 
issue. The issue is the seemingly dogmatic reply in the FAQ which makes 
me reluctant to put work into this in fear that a patch submission 
would be met with strong rejection.

> You will not be able to convince the core Git developers to make this the
> default, I don't think.

I have stressed very clearly in my mail that I am not asking the 
defaults about mtime restoring to be changed. I agree that those 
defaults are reasonable and in line with the principle of least 
astonishment.

What bugs me is my impression from the FAQ that even as an option, the 
feature might be unwelcome.

Best wishes
Peter
-- 
Peter Backes, r...@helen.plasma.xg8.de


[PATCH v2] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread marmot1123
In the 2nd and 4th paragraph of DESCRIPTION, there ware misconversions 
`submodule’s`.
It seems non-ASCII apostrophes, so I rewrite ASCII apostrophes.

Signed-off-by: Motoki Seki 
---
 Documentation/gitsubmodules.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
index 46cf120f666df..0d59ab4cdfb1c 100644
--- a/Documentation/gitsubmodules.txt
+++ b/Documentation/gitsubmodules.txt
@@ -24,7 +24,7 @@ On the filesystem, a submodule usually (but not always - see 
FORMS below)
 consists of (i) a Git directory located under the `$GIT_DIR/modules/`
 directory of its superproject, (ii) a working directory inside the
 superproject's working directory, and a `.git` file at the root of
-the submodule’s working directory pointing to (i).
+the submodule's working directory pointing to (i).
 
 Assuming the submodule has a Git directory at `$GIT_DIR/modules/foo/`
 and a working directory at `path/to/bar/`, the superproject tracks the
@@ -33,7 +33,7 @@ in its `.gitmodules` file (see linkgit:gitmodules[5]) of the 
form
 `submodule.foo.path = path/to/bar`.
 
 The `gitlink` entry contains the object name of the commit that the
-superproject expects the submodule’s working directory to be at.
+superproject expects the submodule's working directory to be at.
 
 The section `submodule.foo.*` in the `.gitmodules` file gives additional
 hints to Gits porcelain layer such as where to obtain the submodule via

--
https://github.com/git/git/pull/459


Re: Is there any way to "interrupt" a rebase?

2018-02-20 Thread Johannes Schindelin
Hi Hilco,

On Tue, 20 Feb 2018, Johannes Schindelin wrote:

> When I am particularly tired and overworked (and therefore know that my
> working memory is less useful than usual), I therefore resort to my
> second-favorite strategy: U use the `done` file.
> 
> I literally copy parts of $GIT_DIR/rebase-merge/done to the beginning of
> $GIT_DIR/rebase-merge/git-rebase-todo (the most convenient way to open the
> latter is `git rebase --edit-todo`). In your case, those would be the
> `pick` lines cherry-picking D and E. Then, as before, `git reset --hard
> ` (where I look up the `` using an aliased version of `git
> log --graph --oneline --left-right --boundary`), amend the commit, and
> then `git rebase --continue`.
> 
> It might be even possible to design a new subcommand for the interactive
> rebase to facilitate a variation of this strategy (possibly even making
> use of the fact that the interactive rebase accumulates mappings between
> the original commits and the rewritten ones in
> $GIT_DIR/rebase-merge/rewritten-list, intended for use in the post-rewrite
> hook).

This feature might look somewhat like this:

git rebase --replay-latest-commits 3

and it would not even have to look at the `rewritten-list`. All it would
do is to put back the latest `pick` from the `done` file (in case of merge
conflicts) into the `git-rebase-todo` file, then insert `pick lines for
HEAD~3.. at the beginning of that todo file, and then `git reset --hard
HEAD~3`.

By not using the original lines from the `done` file (i.e. *different*
from what I described as my second-favorite strategy), you would also get
the resolved merge conflicts rather than having to re-resolve them.

(This all would of course only work properly without --preserve-merges and
without the upcoming --recreate-merges.)

Ciao,
Johannes


Re: [PATCH] Fix misconversion of gitsubmodule.txt

2018-02-20 Thread brian m. carlson
On Tue, Feb 20, 2018 at 08:51:46AM +, marmot1123 wrote:
> In the 2nd and 4th paragraph of DESCRIPTION, there ware misconversions 
> `submodule’s`.
> It seems non-ASCII apostrophes, so I rewrite ASCII apostrophes.

While I agree consistency is good, does this have any effect on the
output?  I think AsciiDoc (and Asciidoctor) produce the non-ASCII
format for output, so this should be a no-op for all known output
formats.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Stackdump from stash save on Windows 10 64-bit

2018-02-20 Thread Tim Mayo
Hi

As of yesterday, stash save stopped working on my Windows 10 box - I get:

   > git stash save
   Cannot save the current worktree state

and a stackdump file (see below).  This is with the 64-bit version of 2.16.1.  
Switching to the 32-bit version resolved the problem for me.

Cheers
Tim

PS. I think the problem was provoked by Windows security update KB4074588. 
Unfortunately, I am unable to uninstall this update to verify.


Exception: STATUS_STACK_OVERFLOW at rip=7FFB793F4B97
rax=0010 rbx= rcx=0010
rdx=000C rsi=AAC0 rdi=7FFB7164D4F0
r8 =A9F8 r9 =AA30 r10=A000
r11=FFE03E50 r12=B500 r13=027AD3E0
r14=014C r15=
rbp=AA30 rsp=A9E8
program=C:\Program Files\Git\usr\bin\sh.exe, pid 9752, thread unknown (0xB24)
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
FrameFunctionArgs
000AA30  7FFB793F4B97 (104, 000AAA8, 7FFB793D2178, 000)
000AA30  7FFB793D2277 (7FFB7164D4F0, 7FFB7164D4F0, 001, 0230022)
000AAA8  7FFB761ED1F4 (7FFB7935, 7FFB793D21F0, 7FFB793BFAE0, 
7FFE)
000AC70  7FFB761ED071 (9E0900060001, 310F3245BC40, 023, 
000AD98)
000AC70  7FFB761ECA44 (7FFB7164D700, 000, 000, 7FFB7164D700)
000AC70  7FFB715B03A0 (000AC70, 000AD98, 2744620, 0161140)
000AC70  7FFB715837DB (000B5A0, 000, 000B950, 2744620)
000B3D9  7FFB7155E4D0 (002, 000B7C8, 00E, 00180010018)
00180010018  7FFB7155CFDD (000, 000B700, 000, 000B7A0)
000B700  7FFB7143 (000B870, 1B0, 000, 00180271780)
420  7FFB761ED7F6 (001, 000, 000B950, 0010001)
420  7FFB779CE4E3 (020, 000, 000BAD0, 001)
420  001800AB022 (000BA70, 000, 000, 001803007C0)
000BAF0  001800ABBD5 (00100410FBB, 000, 00600089010, 001004E9740)
000BCF0  0018011C93B (00100410FBB, 000, 00600089010, 001004E9740)
000BCF0  2DEE458 (00100410FBB, 000, 00600089010, 001004E9740)
End of stack trace (more stack frames may be present)

--
Dr Tim Mayo
Software Architect

Ubisense Ltd
St Andrew's House
St Andrew's Road, Chesterton
Cambridge, CB4 1DL
+44 1223 53 5170



  1   2   >